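Crosstab of a column containing multiple values in R

I would like to know how many Low, Medium, and High Drama rows I have, and how many Low, Medium, and High Crime rows I have, in my data frame.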

Here is a sample of my data frame:

                             genres class_rentabilite
                       Crime, Drama            Medium
     Action, Crime, Drama, Thriller              High
Action, Adventure, Sci-Fi, Thriller            Medium
                              Drama               Low
                       Crime, Drama              High
                      Comedy, Drama              High

I used table() on another column of my data, and it works:

table(df$language, df$class_rentabilite)

The code above gives this:

             Low Medium High NA
               1      1    0  3
  Aboriginal   0      0    2  0
  Arabic       0      0    1  3
  Aramaic      1      0    0  0
  Bosnian      1      0    0  0
  Cantonese    5      2    1  3

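I would like to apply the same approach to the sample data, but table() does not work because each row of genres contains multiple values. How can I handle this?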
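To illustrate the problem described above, here is a minimal sketch on a hypothetical three-row excerpt of the sample (the name `df_demo` is chosen here for illustration and is not in the original post). It suggests that table() does run, but it treats each whole comma-separated string as a single category instead of counting the individual genres.

```r
# Hypothetical three-row excerpt of the sample data (illustration only)
df_demo <- data.frame(
  genres = c("Crime, Drama", "Drama", "Crime, Drama"),
  class_rentabilite = c("Medium", "Low", "High"),
  stringsAsFactors = FALSE
)

# Each distinct string becomes one row of the table, so
# "Crime, Drama" is counted as a category of its own
raw_tab <- table(df_demo$genres, df_demo$class_rentabilite)
rownames(raw_tab)
```

Here "Crime" never appears as a row on its own, which is why the genres need to be split before tabulating.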

Source

2016-12-12 Y.P

Here is one approach. Use separate_rows() to split genres and create a temporary data frame. Then use table() as you did before.

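library(dplyr)   # provides %>%
library(tidyr)   # provides separate_rows()

# Split genres on ", " so that each genre gets its own row
foo <- mydf %>%
  separate_rows(genres, sep = ", ")

# Cross-tabulate the individual genres against the class
table(foo$genres, foo$class_rentabilite)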

#           High Low Medium
# Action       1   0      1
# Adventure    0   0      1
# Comedy       1   0      0
# Crime        2   0      1
# Drama        3   1      1
# Sci-Fi       0   0      1
# Thriller     1   0      1

DATA

mydf <- structure(list(genres = c("Crime, Drama", "Action, Crime, Drama, Thriller", 
"Action, Adventure, Sci-Fi, Thriller", "Drama", "Crime, Drama", 
"Comedy, Drama"), class_rentabilite = c("Medium", "High", "Medium", 
"Low", "High", "High")), .Names = c("genres", "class_rentabilite" 
), row.names = c(NA, -6L), class = "data.frame")
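As an aside, the same reshaping can be sketched in base R for readers who would rather not load packages. This is an illustration and not part of the original answer: the names `genre_list`, `long`, and `tab` are chosen here, and the data frame simply repeats the DATA above so that the block runs on its own.

```r
# Same sample data as in DATA above, repeated so the block is self-contained
mydf <- data.frame(
  genres = c("Crime, Drama", "Action, Crime, Drama, Thriller",
             "Action, Adventure, Sci-Fi, Thriller", "Drama",
             "Crime, Drama", "Comedy, Drama"),
  class_rentabilite = c("Medium", "High", "Medium", "Low", "High", "High"),
  stringsAsFactors = FALSE
)

# Split each genres string on the literal ", " separator
genre_list <- strsplit(mydf$genres, ", ", fixed = TRUE)

# Build a long data frame: one row per (genre, class) pair,
# repeating each row's class once for every genre it contains
long <- data.frame(
  genres = unlist(genre_list),
  class_rentabilite = rep(mydf$class_rentabilite, lengths(genre_list)),
  stringsAsFactors = FALSE
)

# Cross-tabulate as before
tab <- table(long$genres, long$class_rentabilite)
tab
```

The idea is the same as with separate_rows(): expand the data to one genre per row first, then let table() do the counting.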

Source

2016-12-12 01:01:28 jazzurro

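Thank you very much. Even though there are some errors, such as Sci and Fi being split into different groups, this helps a lot.

@ Y.P I revised the code. I think this is what you want. :) – jazzurro

Nice use of 'separate_rows' – akrun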
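The Sci and Fi split mentioned in the comments is consistent with a separator pattern that treats the hyphen as a delimiter. The thread does not show the earlier version of the code, so this is only an assumption: tidyr documents the default `sep` of separate_rows() as the regular expression "[^[:alnum:].]+", which matches any run of characters that are neither alphanumeric nor a dot. The sketch below uses base R strsplit() to compare that pattern with the literal ", " separator; the variable names are chosen here for illustration.

```r
x <- "Action, Adventure, Sci-Fi, Thriller"

# Pattern documented as the default sep of tidyr::separate_rows();
# the hyphen is non-alphanumeric, so "Sci-Fi" is broken in two
default_like <- strsplit(x, "[^[:alnum:].]+")[[1]]
default_like
# "Action" "Adventure" "Sci" "Fi" "Thriller"

# Literal ", " separator, as in the revised answer; "Sci-Fi" stays intact
explicit <- strsplit(x, ", ", fixed = TRUE)[[1]]
explicit
# "Action" "Adventure" "Sci-Fi" "Thriller"
```

Passing sep = ", " explicitly, as the revised code does, avoids splitting hyphenated genre names.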
