2017-05-11 332 views

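R plm lag - what is the equivalent to L1.x in Stata?

Using the plm package in R to fit a fixed-effects model, what is the correct syntax to add a lagged variable to the model? Something similar to the 'L1.variable' command in Stata.

Here is my attempt at adding a lagged variable (this is a test model and it might not make sense):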

library(plm) 
library(foreign) 
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta") 
pnlswork <- plm.data(nlswork, c('idcode', 'year')) 
ffe <- plm(ln_wage ~ ttl_exp+lag(wks_work,1) 
      , model = 'within' 
      , data = nlswork) 
summary(ffe) 

R output:

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = nlswork, 
    model = "within") 

Unbalanced Panel: n=3911, T=1-14, N=19619 

Residuals : 
    Min.  1st Qu.   Median  3rd Qu.     Max. 
-1.77000 -0.10100  0.00293  0.11000  2.90000 

Coefficients : 
                Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp       0.02341057 0.00073832 31.7078 < 2.2e-16 ***
lag(wks_work) 0.00081576 0.00010628  7.6755 1.744e-14 ***
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 1296.9 
Residual Sum of Squares: 1126.9 
R-Squared:  0.13105 
Adj. R-Squared: -0.085379 
F-statistic: 1184.39 on 2 and 15706 DF, p-value: < 2.22e-16 

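However, I get different results compared to what Stata produces.

In my actual model, I would like to instrument an endogenous variable with its lagged value.

Thanks!

For reference, here is the Stata code: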

webuse nlswork.dta 
xtset idcode year 
xtreg ln_wage ttl_exp L1.wks_work, fe 

Stata output:

Fixed-effects (within) regression               Number of obs     =     10,680
Group variable: idcode                          Number of groups  =      3,671

R-sq:                                           Obs per group:
     within  = 0.1492                                         min =          1
     between = 0.2063                                         avg =        2.9
     overall = 0.1483                                         max =          8

                                                F(2,7007)         =     614.60
corr(u_i, Xb)  = 0.1329                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .0192578   .0012233    15.74   0.000     .0168597    .0216558
             |
    wks_work |
         L1. |   .0015891   .0001957     8.12   0.000     .0012054    .0019728
             |
       _cons |   1.502879   .0075431   199.24   0.000     1.488092    1.517666
-------------+----------------------------------------------------------------
     sigma_u |  .40678942
     sigma_e |  .28124886
         rho |  .67658275   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3670, 7007) = 4.71                  Prob > F = 0.0000

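Answer

lag() as it is in plm lags the observations row-wise without "looking" at the time variable, i.e. it simply shifts the variable by one row (per individual). If there are gaps in the time dimension, you probably want to take the value of the time variable into account. There is the (currently) unexported function plm:::lagt.pseries, which takes the time variable into account and hence handles gaps in the data as you might expect.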

library(plm) 
library(foreign) 
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta") 
pnlswork <- pdata.frame(nlswork, c('idcode', 'year')) 
ffe <- plm(ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work,1) 
      , model = 'within' 
      , data = pnlswork) 
summary(ffe) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work, 
    1), data = nlswork, model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients : 
                                  Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp                         0.01925775 0.00122330 15.7425 < 2.2e-16 ***
plm:::lagt.pseries(wks_work, 1) 0.00158907 0.00019573  8.1186 5.525e-16 ***
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 
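
To make the difference between row-wise and time-aware lagging concrete, here is a minimal base-R sketch on a toy panel with a gap. It does not use plm and is not how plm implements its lag functions; the data frame `toy` and the helpers `lag_row()` and `lag_time()` are hypothetical names made up for this illustration.

```r
# Toy panel (hypothetical data): one individual observed in 1968, 1969 and 1971,
# so 1970 is a gap in the time dimension.
toy <- data.frame(id   = c(1, 1, 1),
                  year = c(1968, 1969, 1971),
                  x    = c(10, 20, 30))

# Row-wise lag within individual: shift by one row and ignore the year values.
lag_row <- function(x, id) {
  ave(x, id, FUN = function(v) c(NA, head(v, -1)))
}

# Time-aware lag: look up x at (id, year - 1) and return NA if that period is absent.
lag_time <- function(x, id, time) {
  key  <- paste(id, time)
  prev <- paste(id, time - 1)
  x[match(prev, key)]
}

toy$x_lag_row  <- lag_row(toy$x, toy$id)
toy$x_lag_time <- lag_time(toy$x, toy$id, toy$year)
print(toy)
```

For the 1971 row, the row-wise lag returns the 1969 value (20), while the time-aware lag returns NA because 1970 is not observed. The latter is what Stata's L1. does, which is why the numbers of observations differ between the two estimations in the question.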

Btw1: Better use pdata.frame() instead of plm.data().

Btw2: You can check for gaps in your data with plm's is.pconsecutive():

is.pconsecutive(pnlswork) 
all(is.pconsecutive(pnlswork)) 
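
As a rough idea of what such a check amounts to, the following base-R sketch tests, per individual, whether the observed years are consecutive. It is an illustration under assumptions, not plm's implementation of is.pconsecutive(); the toy data and the name `consecutive` are hypothetical.

```r
# Toy panel (hypothetical data): id 1 has a gap (1970 missing), id 2 has none.
toy <- data.frame(id   = c(1, 1, 1, 2, 2),
                  year = c(1968, 1969, 1971, 1970, 1971))

# For each individual: are all differences between sorted time periods equal to 1?
consecutive <- vapply(split(toy$year, toy$id),
                      function(t) all(diff(sort(t)) == 1),
                      logical(1))

print(consecutive)       # one logical value per individual
print(all(consecutive))  # TRUE only if no individual has a gap
```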

You could also make the data consecutive first and then use lag(), like this:

pnlswork2 <- make.pconsecutive(pnlswork) 
pnlswork2$wks_work_lag <- lag(pnlswork2$wks_work) 
ffe2 <- plm(ln_wage ~ ttl_exp + wks_work_lag 
      , model = 'within' 
      , data = pnlswork2) 
summary(ffe2) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + wks_work_lag, data = pnlswork2, 
    model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients : 
               Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp      0.01925775 0.00122330 15.7425 < 2.2e-16 ***
wks_work_lag 0.00158907 0.00019573  8.1186 5.525e-16 ***
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 

Or simply:

ffe3 <- plm(ln_wage ~ ttl_exp + lag(wks_work) 
      , model = 'within' 
      , data = pnlswork2) # note: it is the consecutive panel data set here 
summary(ffe3) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = pnlswork2, 
    model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients : 
                Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp       0.01925775 0.00122330 15.7425 < 2.2e-16 ***
lag(wks_work) 0.00158907 0.00019573  8.1186 5.525e-16 ***
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 
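
The reason the row-wise lag gives the Stata result once the data are consecutive can be sketched in base R: if every missing period is filled with an NA row, shifting by one row is the same as shifting by one time period. This is only an illustration of the idea, not plm's make.pconsecutive(); the toy data and the names `full_grid` and `filled` are hypothetical.

```r
# Toy panel (hypothetical data): id 1 has no observation for 1970.
toy <- data.frame(id   = c(1, 1, 1, 2, 2),
                  year = c(1968, 1969, 1971, 1970, 1971),
                  x    = c(10, 20, 30, 5, 6))

# For each individual, list every year between its first and last observation.
full_grid <- do.call(rbind, lapply(split(toy, toy$id), function(d) {
  data.frame(id = d$id[1], year = seq(min(d$year), max(d$year)))
}))

# Left-join the observed data onto the grid; the inserted periods get x = NA.
filled <- merge(full_grid, toy, by = c("id", "year"), all.x = TRUE)
filled <- filled[order(filled$id, filled$year), ]

# On the gap-free panel, a plain row-wise lag within individual is time-correct.
filled$x_lag <- ave(filled$x, filled$id, FUN = function(v) c(NA, head(v, -1)))
print(filled)
```

In the filled panel, the 1971 row of id 1 gets the (missing) 1970 value as its lag rather than the 1969 value, so that row drops out of the estimation, just as with a time-aware lag.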
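Thanks, Helix123! Is there a way to get Stata's 'sigma_u' using plm/lfe? I read your comment: https://stats.stackexchange.com/a/228806/ but could not replicate the results; I got this error: 'non-conformable arguments'. Also, what about the R-sq: within, between, overall? –

These are a lot of questions at once... To replicate my answer in the link for the within model, you need to remove the intercept from the data, i.e. insert X <- X[, -1] after the first line. – Helix123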