2014-09-29 100 views
2

Add a column to each data frame in a list

This is a follow-up to my earlier question, "Matching and transposing data between dataframes in R". I have a list of data frames, for example:

dfs <- structure(list(df1 = structure(list(id = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "A", class = "factor")), .Names = "id", class = "data.frame", row.names = c(NA, 
-12L)), df2 = structure(list(id = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "B", class = "factor")), .Names = "id", class = "data.frame", row.names = c(NA, 
-12L)), df3 = structure(list(id = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "C", class = "factor")), .Names = "id", class = "data.frame", row.names = c(NA, 
-12L))), .Names = c("df1", "df2", "df3")) 

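For each data frame in the list, I want to create a new column called data by matching its id against a fourth data frame, df4, and transposing the matched row: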

df4 <- structure(list(id = structure(1:3, .Label = c("A", "B", "C"), class = "factor"), 
    x1 = c(9L, 4L, 9L), x2 = c(7L, 2L, 8L), x3 = c(7L, 6L, 7L 
    ), x4 = c(9L, 5L, 5L), x5 = c(8L, 8L, 4L), x6 = c(7L, 4L, 
    6L), x7 = c(9L, 8L, 5L), x8 = c(7L, 7L, 8L), x9 = c(5L, 5L, 
    5L), x10 = c(4L, 2L, 8L), x11 = c(9L, 1L, 4L), x12 = c(8L, 
    6L, 5L)), .Names = c("id", "x1", "x2", "x3", "x4", "x5", 
"x6", "x7", "x8", "x9", "x10", "x11", "x12"), class = "data.frame", row.names = c(NA, 
-3L)) 

I can achieve this with a separate line of code for each data frame in the list, e.g.

dfs$df1$data <- t(df4[unique(match(dfs$df1$id, df4$id)), 2:13]) 
dfs$df2$data <- t(df4[unique(match(dfs$df2$id, df4$id)), 2:13]) 
dfs$df3$data <- t(df4[unique(match(dfs$df3$id, df4$id)), 2:13]) 

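but I'm sure there must be a more efficient and shorter way to do this. I'm fairly certain I need to use lapply, but I can't figure out how to make it work. For example, I can use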

lapply(dfs, function(d) t(df4[unique(match(d$id, df4$id)), 2:13])) 

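which gives the results as vectors, but I can't figure out how to insert them into each data frame in the list as a new column called data. Does anyone know how I can do this?

Thanks!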

+1

Maybe `Map(cbind, dfs, lapply(dfs, function(d) t(df4[unique(match(d$id, df4$id)), -1])))`? – 2014-09-29 22:17:20

+1

@smci How many people do you think follow the "insert" tag? – GSee 2014-09-29 22:22:20

+0

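When you run into this kind of pain in R, I'd suggest considering [Python pandas](http://pandas.pydata.org/pandas-docs/stable/). If you're open to it, I'll show you a pandas answer, which would be infinitely more readable and maintainable. – smci 2014-09-29 22:23:52

Answers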

3

Here's an attempt using lapply:

lapply(dfs, function(x) { 
    cbind(
     x, 
     new=unlist(df4[match(x$id[1],df4$id),-1]) 
     ) 
}) 

#$df1 
# id new 
#x1 A 9 
#x2 A 7 
#x3 A 7 
#... 
# 
#$df2 
# id new 
#x1 B 4 
#x2 B 2 
#x3 B 6 
#... 
# 
#$df3 
# id new 
#x1 C 9 
#x2 C 8 
#x3 C 7 
#... 
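
Note that lapply returns a new list rather than modifying dfs, and the snippet above names the column new rather than data. A minimal sketch of one way to store the result back under the name the question asks for; the small dfs and df4 below are made-up stand-ins for the question's objects, and the code assumes each data frame holds a single id and has as many rows as df4 has value columns:

```r
# Hypothetical stand-ins for the question's objects (not the original data):
# one id per data frame, and as many rows as df4 has value columns.
dfs <- list(
  df1 = data.frame(id = rep("A", 3)),
  df2 = data.frame(id = rep("B", 3))
)
df4 <- data.frame(id = c("A", "B"),
                  x1 = c(9L, 4L), x2 = c(7L, 2L), x3 = c(7L, 6L))

# Reassign the result of lapply so the list itself is updated,
# and name the new column `data`.
dfs <- lapply(dfs, function(x) {
  row <- match(x$id[1], df4$id)                       # row of df4 for this id
  x$data <- unlist(df4[row, -1], use.names = FALSE)   # drop id, flatten to a vector
  x                                                   # return the modified data frame
})

dfs$df1$data
# 9 7 7
```

If a data frame could contain more than one id, x$id[1] would no longer be enough and a merge against a reshaped df4 might be a better fit.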
+0

Brilliant, thanks thelatemail! You've saved me a whole lot of code – Thomas 2014-09-29 22:34:34
