2015-02-24 74 views
0

我有一个数据集,如下所示,包含23个日期的数据(如下所示为1日期和第二天的部分数据 - 注意:标题是偏移的) 。我想通过FrontBack和日期(即每个日期2 lm,前1个,后1个)运行y〜x(lm(y〜x))的线性模型。然后,我想总结一个矩阵中的输出,其中每个列的斜率,截距和误差列在单独的列中。这应该给我一个有46行(23日期×2前/后级别)的矩阵。循环计算线性模型并将结果写入矩阵

Date FrontBack  y   x  
20140916 Back 2234.580 2.253175  
20140916 Back 2267.631 7.725422  
20140916 Back 2246.668 14.414951  
20140916 Back 2216.307 17.837861 
20140916 Back 2214.225 15.484364 
20140916 Front 2245.522 90.062102 
20140916 Back 2219.565 12.326474 
20140916 Front 2267.427 63.396137 
20140916 Back 2213.286 7.861758 
20140916 Front 2264.902 61.661650 
20140916 Front 2256.183 70.124702 
20140916 Back 2202.254 7.400539 
20140916 Front 2241.997 44.826769 
20140916 Back 2204.868 5.739663 
20140916 Back 2209.424 2.165606 
20140916 Front 2266.947 1.068334 
20140917 Back 2237.199 2.190785 
20140917 Back 2248.541 4.415886 
20140917 Back 2260.041 8.724817 
20140917 Back 2277.407 13.420694 
20140917 Back 2278.414 14.789667 
20140917 Front 2346.622 29.878672 
20140917 Back 2268.111 15.120095 
20140917 Front 2496.946 60.30390 

回答

0

因为您只有两个类别,前部和后部部分是否可以分组?如果是的话,下面的代码应该给你你想要的功能:

fake.y=runif(15*5,min=2200,max=2500) #make some fake data for testing 
fake.FrontBack<-sample(c("Front","Back"),15*5,replace=T) 
fake.Date<-sort(sample(seq(20140916,20140920),15*5,replace=T)) 
fake.data<-data.frame(Date=fake.Date,FrontBack=fake.FrontBack,y=fake.y,x=fake.y/120+runif(15*5)+1*(fake.FrontBack=="Back")) 

# starting here use your data instead of fake.data 
f<-subset(fake.data,FrontBack=="Front") # use subset to select front/back 
b<-subset(fake.data,FrontBack=="Back") 
f.fit<-lm(y~x,data=f)# regress front values 
f.resid<-data.frame(Date=f$Date,resid=f.fit$residuals,int=f.fit$coefficients[1],slope=f.fit$coefficients[2],FrontBack="Front") # make dataframe of residuals 
b.fit<-lm(y~x,data=b)# regress back values 
b.resid<-data.frame(Date=b$Date,resid=b.fit$residuals,int=b.fit$coefficients[1],slope=b.fit$coefficients[2],FrontBack="Back") # make dataframe of residuals 
all.resid<-rbind(f.resid,b.resid) # stick 'em together 
head(all.resid) # should be what you wanted, residual error for all entires 

# could be what you wanted, aggregate mean error for each date, front and back 
ag<-aggregate(resid~.,data=all.resid,mean) 
print(ag) 

代码会吐出这样的事情,以显示剩余的每一行的模型误差在原始数据:

 Date  resid  int slope FrontBack 
1 20140916 -47.91676 393.5386 97.52173  Front 
3 20140916 52.01027 393.5386 97.52173  Front 
4 20140916 52.58631 393.5386 97.52173  Front 
5 20140916 -17.56038 393.5386 97.52173  Front 
6 20140916 -21.85633 393.5386 97.52173  Front 
9 20140916 44.97382 393.5386 97.52173  Front 

如果你想每一天的平均误差,汇总数据将是:

 Date  int  slope FrontBack  resid 
1 20140916 393.5386 97.52173  Front 10.372821 
2 20140917 393.5386 97.52173  Front -3.840699 
3 20140918 393.5386 97.52173  Front -10.876092 
4 20140919 393.5386 97.52173  Front -2.159973 
5 20140920 393.5386 97.52173  Front 8.878526 
6 20140916 212.6598 101.60367  Back 2.862525 
7 20140917 212.6598 101.60367  Back -14.476662 
8 20140918 212.6598 101.60367  Back 10.822712 
9 20140919 212.6598 101.60367  Back -10.072473 
10 20140920 212.6598 101.60367  Back 10.472516 

设有独立的正面和背面的条目每个日期,该子集和回归应该工作在同样为您的数据,在这个例子中

0

例如:

# break data frame to list of data frames 
df_list <- split(df1, f = paste(df1$Date, df1$FrontBack)) 

# run lm models on each data frame 
models <- lapply(df_list, function(dat) { 
    lm(y ~ x, data = dat) 
}) 

# format result 
res <- 
    do.call(
    rbind, 
    lapply(models, function(x) { 
     coefs <- coef(summary(x)) 
     c(intercept = coefs["(Intercept)", "Estimate"], 
     slope = coefs["x", "Estimate"], 
     p_value = coefs["x","Pr(>|t|)"], 
     InterceptP = coefs["(Intercept)","Pr(>|t|)"], 
     StdErrorX = coefs["x","Std. Error"], 
     StdErrorI = coefs["(Intercept)","Std. Error"], 
     r2 = x$r.squared, 
     AIC = AIC(x) 
     ) 
    }) 
) 
+0

非常感谢 - 此代码似乎工作分割每个数据帧上的数据和运行LM模型,但格式结果我的时候得到以下错误: coefs [“x”,“Estimate”]中的错误:下标越界 – Rebecca 2015-02-25 19:18:36

+0

查看'coef(summary(models [[1]]))''的结构。如果您的变量不是“x”,则使用其他名称。对于斜坡你也可以使用'coefs [2,1]' – bergant 2015-02-25 19:34:24

+0

完美 - 谢谢! – Rebecca 2015-02-26 17:50:15