2017-08-04 65 views
5

Say I have two data frames, students and teachers. How do I generate all possible row combinations in R?

students <- data.frame(name = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"), 
        height = c(111, 93, 99, 107, 100, 123, 104, 80, 95), 
        smart = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no")) 
teachers <- data.frame(name = c("Ben", "Craig", "Mindy"), 
        height = c(130, 101, 105), 
        smart = c("yes", "yes", "yes")) 

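I want to generate all possible combinations of students and teachers while keeping the accompanying information, i.e. create every combination of rows from the data frames "students" and "teachers". This is easy to do with a loop and cbind, but for large data frames it takes forever. Help an R newbie out: what is the fastest way to do this?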

Edit: in case this is unclear, the output I want has the following format:

rbind(
    cbind(students[1, ], teachers[1, ]), 
    cbind(students[1, ], teachers[2, ]), 
    # ... one cbind() per student/teacher pair ... 
    cbind(students[nrow(students), ], teachers[nrow(teachers), ])) 

Answers

2

You can combine all of the data as follows:

do.call(cbind.data.frame,Map(expand.grid,teacher=teachers,students=students)) 

    name.teacher name.students height.teacher height.students smart.teacher smart.students 
1   Ben   John   130    111   yes    no 
2   Craig   John   101    111   yes    no 
3   Mindy   John   105    111   yes    no 
4   Ben   Mary   130    93   yes    no 
5   Craig   Mary   101    93   yes    no 
6   Mindy   Mary   105    93   yes    no 
:   :   :    :    :   :    : 
:   :   :    :    :   :    : 
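The same table can also be built from row indices instead of per-column grids. This is only a minimal sketch: the `teacher_`/`student_` column prefixes and the names `idx`, `t_part`, `s_part` and `combos` are choices made for this illustration, not part of the answer above. It has the side benefit of not requiring the two data frames to have matching columns.

```r
# Example data copied from the question.
students <- data.frame(
  name   = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
  height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
  smart  = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(
  name   = c("Ben", "Craig", "Mindy"),
  height = c(130, 101, 105),
  smart  = c("yes", "yes", "yes"))

# Every (teacher row, student row) index pair; the first argument varies fastest.
idx <- expand.grid(teacher = seq_len(nrow(teachers)),
                   student = seq_len(nrow(students)))

# Prefix the column names (a naming choice for this sketch) so the two sets
# of columns stay distinguishable after binding.
t_part <- setNames(teachers[idx$teacher, ], paste0("teacher_", names(teachers)))
s_part <- setNames(students[idx$student, ], paste0("student_", names(students)))

combos <- cbind(t_part, s_part)
rownames(combos) <- NULL  # drop the row names inherited from repeated indexing
head(combos)
```

Because only integer indices are expanded and each data frame is subset once, no loop over rows is needed.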
2

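"... while keeping the accompanying information"

I'd advise against doing that. There is no need to keep everything in a single object.

To just combine the teachers and students, there is
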

res = expand.grid(teacher_name = teachers$name, student_name = students$name) 

To merge in the other data (which, again, I'd advise against doing until it is necessary):

res[, paste("teacher", c("height", "smart"), sep="_")] <- 
    teachers[match(res$teacher_name, teachers$name), c("height","smart")] 

res[, paste("student", c("height", "smart"), sep="_")] <- 
    students[match(res$student_name, students$name), c("height","smart")] 

This gives

head(res) 

    teacher_name student_name teacher_height teacher_smart student_height student_smart 
1   Ben   John   130   yes   111   no 
2  Craig   John   101   yes   111   no 
3  Mindy   John   105   yes   111   no 
4   Ben   Mary   130   yes    93   no 
5  Craig   Mary   101   yes    93   no 
6  Mindy   Mary   105   yes    93   no 
+0

Side note: the 'logical' class is designed for 'yes'/'no' values. See '?logical'. – Frank
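A minimal sketch of that suggestion, reusing the values of the question's 'smart' column (the variable names `smart` and `smart_lgl` are made up for this illustration):

```r
# "yes"/"no" strings as they appear in the students data frame.
smart <- c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no")

# Comparing against "yes" gives a logical vector: TRUE for "yes", FALSE for "no".
smart_lgl <- smart == "yes"

# Logical vectors can be summed or used directly for subsetting.
sum(smart_lgl)
```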

+0

Useful function, but for my current purposes I want to keep everything in one object –

+1

Or use 'CJ' from 'data.table' – akrun

0

You can use this function

expand.grid.df <- function(...) Reduce(function(...) merge(..., by=NULL), list(...)) 

expand.grid.df(students,teachers)
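
This works because merge() with by = NULL returns the Cartesian product of its two arguments. A short usage sketch on the question's data (the name `res` is just for illustration) shows the shape of the result; since both data frames share column names, merge() appends its default suffixes .x and .y:

```r
# Example data copied from the question.
students <- data.frame(
  name   = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
  height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
  smart  = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(
  name   = c("Ben", "Craig", "Mindy"),
  height = c(130, 101, 105),
  smart  = c("yes", "yes", "yes"))

# Cross join any number of data frames by folding merge(by = NULL) over them.
expand.grid.df <- function(...) Reduce(function(...) merge(..., by = NULL), list(...))

res <- expand.grid.df(students, teachers)
dim(res)    # one row per student/teacher pair, columns of both inputs
names(res)  # student columns get ".x", teacher columns get ".y"
```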