2017-08-04 41 views
0

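Expanding a table of elements and sub-elements into one big table of all possible combinations

Apologies for the vague title; I'm not sure how to describe what I'm trying to do, but the example should make it clear.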

library(tibble)
library(tidyr)  # provides gather() and spread(), tried further down

# I have a (much larger) table of items that I need to turn into a very large lookup table
ElementMatrix <- tribble(
    ~Category, ~Elements, 
    "Gender", "Male", 
    "Gender", "Female", 
    "Smoking", "Smoker", 
    "Smoking", "Non-Smoker", 
    "Type1", "A", 
    "Type1", "B", 
    "Type1", "C", 
    "Type1", NA 
) 

# into this
BigLookupMatrix <- tribble(
    ~Gender, ~Smoking, ~Type1, 
    "Male", "Smoker", "A", 
    "Male", "Smoker", "B", 
    "Male", "Smoker", "C", 
    "Male", "Smoker", NA, 
    "Male", "Non-Smoker", "A", 
    "Male", "Non-Smoker", "B", 
    "Male", "Non-Smoker", "C", 
    "Male", "Non-Smoker", NA, 
    "Female", "Smoker", "A", 
    "Female", "Smoker", "B", 
    "Female", "Smoker", "C", 
    "Female", "Smoker", NA, 
    "Female", "Non-Smoker", "A", 
    "Female", "Non-Smoker", "B", 
    "Female", "Non-Smoker", "C", 
    "Female", "Non-Smoker", NA 
) 

# I guessed it would be some gather/spread type of thing, but that clearly doesn't work
gather(ElementMatrix, key = Category, value = Elements) # gives me back my original matrix
spread(ElementMatrix, key = Category, value = Elements) # gets angry about duplicate identifiers

Now, I could obviously write some nested loops, but that seems messy. There must be a nice, clean way to do this.

Thanks a lot in advance for the help!

Answers

3

How about a little base R, with unstack and expand.grid:

expand.grid(unstack(ElementMatrix, Elements ~ Category)) 
   Gender    Smoking Type1
1    Male     Smoker     A
2  Female     Smoker     A
3    Male Non-Smoker     A
4  Female Non-Smoker     A
5    Male     Smoker     B
6  Female     Smoker     B
7    Male Non-Smoker     B
8  Female Non-Smoker     B
9    Male     Smoker     C
10 Female     Smoker     C
11   Male Non-Smoker     C
12 Female Non-Smoker     C
13   Male     Smoker  <NA>
14 Female     Smoker  <NA>
15   Male Non-Smoker  <NA>
16 Female Non-Smoker  <NA>

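unstack splits the Elements column by Category; because the groups differ in length, it returns a named list here. That list is passed to expand.grid, which produces a data frame containing every combination of the (Gender, Smoking, Type1) triple.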
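To make the two steps easier to follow, here is a minimal sketch that runs them separately. It rebuilds the question's data as a plain data.frame so that only base R is needed; the names `groups` and `lookup` are made up for illustration, and `stringsAsFactors = FALSE` is my own addition to keep the columns as character rather than factor.

```r
# Same data as in the question, as a plain data.frame (base R only)
ElementMatrix <- data.frame(
  Category = c("Gender", "Gender", "Smoking", "Smoking",
               "Type1", "Type1", "Type1", "Type1"),
  Elements = c("Male", "Female", "Smoker", "Non-Smoker", "A", "B", "C", NA),
  stringsAsFactors = FALSE
)

# Step 1: split Elements by Category.
# The groups have unequal lengths, so unstack() returns a named list.
groups <- unstack(ElementMatrix, Elements ~ Category)
str(groups)

# Step 2: build one row per combination of one element from each group.
# stringsAsFactors = FALSE keeps the columns as character (assumption: you
# do not need factors in the lookup table).
lookup <- expand.grid(groups, stringsAsFactors = FALSE)
nrow(lookup)  # 2 * 2 * 4 = 16 rows
```

The NA in Type1 is kept as an ordinary element, so it takes part in the combinations just like "A", "B" and "C".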

+0

Wow, thanks! Quick newbie question: what does the ~ do? That operator continues to mystify me... – Sylvain

+0

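It is used in many contexts (mainly in model formulas for 'lm' and 'glm'). Here you can read it as "group Elements by Category". As another example, the function 'aggregate' performs group-level aggregation. An example in its help file ('?aggregate') is aggregate(weight ~ feed, data = chickwts, mean), which can be read as "calculate the mean chick weight by feed type". – lmo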
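A small runnable sketch of that reading of ~, using the chickwts data set that ships with base R. The variable names `means` and `by_feed` are only for illustration.

```r
# chickwts is built in: `weight` is numeric, `feed` is a factor
# Read the formula as "weight, grouped by feed"
means <- aggregate(weight ~ feed, data = chickwts, FUN = mean)
means

# unstack() accepts the same kind of formula: values on the left,
# grouping variable on the right
by_feed <- unstack(chickwts, weight ~ feed)
lengths(by_feed)  # number of chicks per feed type
```

In both calls the left-hand side names the values and the right-hand side names the grouping variable; only what the function does with each group differs.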

+0

A very mysterious operator, but I think I'm getting closer to understanding it. Thanks! – Sylvain

1

You can also do this within the tidyverse:

library(tidyverse) 

ElementMatrix %>% 
    group_by(Category) %>% 
    summarise(Elements = list(Elements)) %>% 
    spread(Category, Elements) %>% 
    as.list() %>% 
    transpose() %>% 
    flatten() %>% 
    expand.grid() %>% 
    arrange(Gender, Smoking, Type1)
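
As a possible shorter variant (a sketch, not part of the answer above): if your tidyr is recent enough to have expand_grid() (1.0.0 or later, as far as I know), you can split the column with base split() and pass the resulting list on. The names `groups` and `lookup` are made up for illustration.

```r
library(tibble)
library(tidyr)

# Same data as in the question
ElementMatrix <- tibble(
  Category = c("Gender", "Gender", "Smoking", "Smoking",
               "Type1", "Type1", "Type1", "Type1"),
  Elements = c("Male", "Female", "Smoker", "Non-Smoker", "A", "B", "C", NA)
)

# split() returns a named list with one character vector per Category;
# the NA element of Type1 is kept
groups <- split(ElementMatrix$Elements, ElementMatrix$Category)

# expand_grid() varies the first column slowest, so the rows come out in the
# order the vectors were given (Male before Female, Smoker before Non-Smoker)
lookup <- do.call(expand_grid, groups)
lookup
```

Because expand_grid() varies the first column slowest rather than fastest, the rows should already match the order of BigLookupMatrix in the question, without a separate arrange() step.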