2013-05-01 62 views
14

Something like expand.grid on a list of lists

I have three text documents stored as a list of lists called "dlist":

dlist <- structure(list(name = c("a", "b", "c"), text = list(c("the", "quick", "brown"), c("fox", "jumps", "over", "the"), c("lazy", "dog"))), .Names = c("name", "text")) 

In my head, I find it helpful to picture dlist like this:

name text 
1 a  c("the", "quick", "brown") 
2 b  c("fox", "jumps", "over", "the") 
3 c  c("lazy", "dog") 

How can this be manipulated into the form below? The idea is to plot it, so something that can be melted for ggplot2 would be good.

name text 
1 a the 
2 a quick 
3 a brown 
4 b fox 
5 b jumps 
6 b over 
7 b the 
8 c lazy 
9 c dog 

That is one row per word, giving both the word and its parent document.

I have tried:

> expand.grid(dlist) 
    name     text 
1 a  the, quick, brown 
2 b  the, quick, brown 
3 c  the, quick, brown 
4 a fox, jumps, over, the 
5 b fox, jumps, over, the 
6 c fox, jumps, over, the 
7 a    lazy, dog 
8 b    lazy, dog 
9 c    lazy, dog 

> sapply(seq(1,3), function(x) (expand.grid(dlist$name[[x]], dlist$text[[x]]))) 
    [,1]  [,2]  [,3]  
Var1 factor,3 factor,4 factor,2 
Var2 factor,3 factor,4 factor,2 

> unlist(dlist) 
    name1 name2 name3 text1 text2 text3 text4 
    "a"  "b"  "c" "the" "quick" "brown" "fox" 
    text5 text6 text7 text8 text9 
"jumps" "over" "the" "lazy" "dog" 

> sapply(seq(1,3), function(x) (cbind(dlist$name[[x]], dlist$text[[x]]))) 
[[1]] 
    [,1] [,2] 
[1,] "a" "the" 
[2,] "a" "quick" 
[3,] "a" "brown" 

[[2]] 
    [,1] [,2] 
[1,] "b" "fox" 
[2,] "b" "jumps" 
[3,] "b" "over" 
[4,] "b" "the" 

[[3]] 
    [,1] [,2] 
[1,] "c" "lazy" 
[2,] "c" "dog" 

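To be fair, I am confused by the various apply and plyr functions and don't really know where to start. I have never seen anything like the result of the "fluke" attempt above, and I don't understand it.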

+1

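You could format it more closely to what you have in your head, like this: `dlist <- list(a = c("the", "quick", "brown"), ...)`. Doing so would also simplify the answers to this question. – Frank 2013-05-01 21:09:52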

+0

Thanks Frank, Josh's setNames function showed me how to do that. – nacnudus 2013-05-01 21:36:50

Answers

11

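If you convert dlist to a named list (a more suitable structure, in my opinion), you can use stack() to get the two-column data.frame you want.

(The rev() and setNames() calls in the second line are just one of many ways to adjust the column order and names to match the desired output shown in your question.)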

x <- setNames(dlist$text, dlist$name) 
setNames(rev(stack(x)), c("name", "text")) 
# name text 
# 1 a the 
# 2 a quick 
# 3 a brown 
# 4 b fox 
# 5 b jumps 
# 6 b over 
# 7 b the 
# 8 c lazy 
# 9 c dog 
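
To make the two-line answer easier to follow, here is a minimal sketch that runs the same calls one step at a time. The intermediate names `stacked` and `result` are introduced here for illustration only and are not part of the original answer.

```r
dlist <- list(name = c("a", "b", "c"),
              text = list(c("the", "quick", "brown"),
                          c("fox", "jumps", "over", "the"),
                          c("lazy", "dog")))

# Step 1: name each text vector after its document.
x <- setNames(dlist$text, dlist$name)

# Step 2: stack() concatenates the vectors into a data.frame with two
# columns: "values" (the words) and "ind" (a factor holding the list names).
stacked <- stack(x)
names(stacked)

# Step 3: rev() reverses the column order, and setNames() renames the
# columns to match the layout asked for in the question.
result <- setNames(rev(stacked), c("name", "text"))
str(result)
```

Note that the `name` column comes back as a factor, which is usually fine for ggplot2; wrap it in as.character() if plain strings are needed.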
+1

+1 I had *no* idea this is how it worked. Now I understand it, and I like it. – 2013-05-01 21:32:23

+0

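Thanks for three great new functions, especially setNames, which means I can follow Frank's comment after the fact rather than going right back to the beginning. – nacnudus 2013-05-01 21:34:50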

+0

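@SimonO101 - Oh good. At first I actually held off on posting this, because it packs a lot of steps into a few lines. Based on your and nacnudus's comments, though, I'm glad I did. (FWIW, I would probably *really* use `with(dlist, setNames(text, name))` myself.) – 2013-05-01 21:54:25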

0

Josh's answer is sweeter, but I thought I'd throw my hat in the ring.

dlist <- structure(list(name = c("a", "b", "c"), 
    text = list(c("the", "quick", "brown"), 
    c("fox", "jumps", "over", "the"), c("lazy", "dog"))), 
    .Names = c("name", "text")) 

lens <- sapply(unlist(dlist[-1], recursive = FALSE), length) 

data.frame(name = rep(dlist[[1]], lens), text = unlist(dlist[-1]), row.names = NULL) 

## name text 
## 1 a the 
## 2 a quick 
## 3 a brown 
## 4 b fox 
## 5 b jumps 
## 6 b over 
## 7 b the 
## 8 c lazy 
## 9 c dog 

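That said, a list of lists is an awkward storage method. A list of vectors (particularly a named list of vectors) would be easier to deal with.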
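
As a rough illustration of that last point, here is a sketch that assumes the documents are stored as a named list of character vectors. The names `docs` and `long` are hypothetical, and lengths() requires R 3.2.0 or later.

```r
# Hypothetical restructuring: one named character vector per document.
docs <- list(a = c("the", "quick", "brown"),
             b = c("fox", "jumps", "over", "the"),
             c = c("lazy", "dog"))

# Repeat each document name once per word, then flatten the words.
long <- data.frame(name = rep(names(docs), lengths(docs)),
                   text = unlist(docs, use.names = FALSE),
                   stringsAsFactors = FALSE)
long
```

With this structure the document names travel with the data, so no separate `name` element has to be kept in sync with `text`.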

1

Another solution, perhaps more general:

do.call(rbind, do.call(mapply, c(dlist, FUN = data.frame, SIMPLIFY = FALSE))) 

#  name text 
# a.1 a the 
# a.2 a quick 
# a.3 a brown 
# b.1 b fox 
# b.2 b jumps 
# b.3 b over 
# b.4 b the 
# c.1 c lazy 
# c.2 c dog 
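
The nested do.call() can be hard to read, so here is a sketch of the same idea written with Map(), which wraps mapply(..., SIMPLIFY = FALSE). The `year` column and the names `dlist2`, `pieces` and `out` are hypothetical additions, used only to show that extra per-document columns are carried along.

```r
dlist2 <- list(name = c("a", "b", "c"),
               year = c(2011, 2012, 2013),  # hypothetical extra column
               text = list(c("the", "quick", "brown"),
                           c("fox", "jumps", "over", "the"),
                           c("lazy", "dog")))

# Build one small data.frame per document; the scalar name and year are
# recycled to the length of that document's text vector.
pieces <- Map(data.frame,
              name = dlist2$name,
              year = dlist2$year,
              text = dlist2$text)

# Bind the per-document pieces into a single long data.frame.
out <- do.call(rbind, pieces)
out
```

Each per-document data.frame relies on data.frame() recycling length-one values, so this works as long as every non-list column holds exactly one value per document.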
+0

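This is better than Simon O'Hanlon's suggestion, because it allows data frames with multiple columns (such as "name") to be expanded into rows based on a list column! – datamole 2015-09-01 13:26:49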