2016-11-26

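reshape2 and a wide (inferred) time variable

I know that reshape from base R can convert wide data to long format, inferring the time variable from the suffix of the stub variable names A and B. For example: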

wide = data.frame(A.2010 = c('a', 'b', 'c'), 
        A.2011 = c('f', 'g', 'd'), 
        B.2010 = c('A', 'B', 'C'), 
        B.2011 = c('G', 'G', 'H'), 
        z = runif(3), 
        x = runif(3)) 

wide 
# A.2010 A.2011 B.2010 B.2011   z   x 
#1  a  f  A  G 0.3626823 0.67212468 
#2  b  g  B  G 0.3726911 0.09663248 
#3  c  d  C  H 0.9807237 0.31259394 

becomes:

reshape(wide, direction = 'long', sep = '.', 
     varying = c('A.2010', 'A.2011', 'B.2010', 'B.2011')) 
#    z   x time A B id 
#1.2010 0.3626823 0.67212468 2010 a A 1 
#2.2010 0.3726911 0.09663248 2010 b B 2 
#3.2010 0.9807237 0.31259394 2010 c C 3 
#1.2011 0.3626823 0.67212468 2011 f G 1 
#2.2011 0.3726911 0.09663248 2011 g G 2 
#3.2011 0.9807237 0.31259394 2011 d H 3 

Can I do the same thing with reshape2::melt?

Answer

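It looks like base R's reshape is the best tool for this, because the melt function in the reshape2 package has no comparable feature. However, something similar can be achieved with the patterns function in melt.data.table: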

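library(data.table) 

# convert the data.frame from the question to a data.table 
wide = as.data.table(wide) 

# melt() dispatches to melt.data.table here; each pattern yields one value column 
long = melt(wide, id.vars = c("z", "x"), measure.vars = patterns("^A", "^B"), 
      value.name = c("A", "B"), variable.name = "time") 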

> long 
      z   x time A B 
1: 0.3421681 0.8432707 1 a A 
2: 0.1243282 0.5096108 1 b B 
3: 0.3650165 0.1441660 1 c C 
4: 0.3421681 0.8432707 2 f G 
5: 0.1243282 0.5096108 2 g G 
6: 0.3650165 0.1441660 2 d H 

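Note that melt recognizes the "time" variable and groups the columns correctly, but it does not use 2010 and 2011 as desired: time is a factor with levels 1 and 2 that index the matched columns in order. The workaround is to recode the levels manually, which should be trivial.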

levels(long$time) = c("2010", "2011") 

> long 
      z   x time A B 
1: 0.3421681 0.8432707 2010 a A 
2: 0.1243282 0.5096108 2010 b B 
3: 0.3650165 0.1441660 2010 c C 
4: 0.3421681 0.8432707 2011 f G 
5: 0.1243282 0.5096108 2011 g G 
6: 0.3650165 0.1441660 2011 d H 
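If the years should not be typed by hand, the labels can also be derived from the column names. The following is only a sketch under a few assumptions: every stub has the same suffixes in the same column order, the suffix is separated by a literal dot, and z and x are filled with fixed illustrative values rather than runif output so the result is reproducible. The helper names a_cols and years are made up for this example.

```r
library(data.table)

# same layout as the question; z and x are fixed illustrative values
wide <- data.table(A.2010 = c("a", "b", "c"),
                   A.2011 = c("f", "g", "d"),
                   B.2010 = c("A", "B", "C"),
                   B.2011 = c("G", "G", "H"),
                   z = c(0.1, 0.2, 0.3),
                   x = c(0.4, 0.5, 0.6))

long <- melt(wide, id.vars = c("z", "x"),
             measure.vars = patterns("^A", "^B"),
             value.name = c("A", "B"), variable.name = "time")

# columns matched by the first pattern, in the order melt sees them
a_cols <- grep("^A\\.", names(wide), value = TRUE)

# strip the stub and the dot, leaving "2010" and "2011"
years <- sub("^A\\.", "", a_cols)

# time is a factor whose integer codes index the matched columns,
# so the codes can be used to look up the corresponding year
long[, time := as.integer(years[as.integer(time)])]

print(long)
```

Unlike the levels() assignment, this turns time into an integer column, which is closer to what base reshape returns. If the suffixes differ between stubs, this lookup would be wrong and the labels should be checked by hand.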

I hope this helps!