2016-08-19 83 views

R: reshape a data frame using dcast, melt, and concatenation

I have a data frame as follows:

mydf <- data.frame(Term = c('dog','cat','lion','tiger','pigeon','vulture'), Category = c('pet','pet','wild','wild','pet','wild'), 
    Count = c(12,14,19,7,11,10), Rate = c(0.4,0.7,0.3,0.6,0.1,0.8), Brand = c('GS','GS','MN','MN','PG','MN') ) 

This produces the data frame:

     Term Category Count Rate Brand
1     dog      pet    12  0.4    GS
2     cat      pet    14  0.7    GS
3    lion     wild    19  0.3    MN
4   tiger     wild     7  0.6    MN
5  pigeon      pet    11  0.1    PG
6 vulture     wild    10  0.8    MN

I want to transform this data frame into the following resultDF:

Category         pet             wild
Term             dog,cat,pigeon  lion,tiger,vulture
Countlessthan13  dog,pigeon      tiger,vulture
Ratemorethan0.5  cat             tiger,vulture
Brand            GS,PG           MN

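Each row header names an operation. For example, Countlessthan13 means that the condition Count < 13 is applied to the terms, which are then grouped by category. Also note that brand names are listed only once per category, with no repeats.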
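To make the rule concrete, here is how a single cell of the target table (the 'pet' column of the Countlessthan13 row) could be computed in base R. This is only a sketch of the intended logic, not the full reshaping; the variable name pet_low_count is made up for illustration.

```r
# Same sample data as above
mydf <- data.frame(Term = c('dog','cat','lion','tiger','pigeon','vulture'),
                   Category = c('pet','pet','wild','wild','pet','wild'),
                   Count = c(12,14,19,7,11,10),
                   Rate = c(0.4,0.7,0.3,0.6,0.1,0.8),
                   Brand = c('GS','GS','MN','MN','PG','MN'),
                   stringsAsFactors = FALSE)

# Terms in the 'pet' category whose Count is below 13, joined with commas
# (pet_low_count is a hypothetical name used only in this sketch)
pet_low_count <- paste(mydf$Term[mydf$Category == "pet" & mydf$Count < 13],
                       collapse = ",")
pet_low_count
# [1] "dog,pigeon"
```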

I tried dcast and melt, but did not get the desired result.

Answer


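We can do this with data.table. Convert the 'data.frame' to a 'data.table' (setDT(mydf)), group by 'Category', and create summary columns by pasting together the unique values of 'Term', the 'Term' values where 'Count' is less than 13, the 'Term' values where 'Rate' is greater than 0.5, and the unique elements of 'Brand'.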

library(data.table) 
# setDT() converts mydf to a data.table by reference; then summarise per Category
dt <- setDT(mydf)[, .(Term = paste(unique(Term), collapse = ","), 
         Countlessthan13 = paste(unique(Term[Count < 13]), collapse = ","), 
         Ratemorethan0.5 = paste(unique(Term[Rate > 0.5]), collapse = ","), 
         Brand = paste(unique(Brand), collapse = ",")), by = Category] 

From the summarised dataset ('dt'), we melt to 'long' format, specifying 'Category' as the id variable, and then dcast back to 'wide' format.

dcast(melt(dt, id.vars = "Category", variable.name = "category"), 
          category ~ Category, value.var = "value") 
#           category            pet               wild
#1:             Term dog,cat,pigeon lion,tiger,vulture
#2:  Countlessthan13     dog,pigeon      tiger,vulture
#3:  Ratemorethan0.5            cat      tiger,vulture
#4:            Brand          GS,PG                 MN
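
The two steps above can also be run one at a time to inspect the intermediate 'long' table. The following is a minimal sketch, assuming the same sample data. It uses as.data.table() instead of setDT() so that mydf itself is left unchanged, and the names long and resultDF are chosen here for illustration (resultDF borrows the name used in the question).

```r
library(data.table)

mydf <- data.frame(
  Term     = c("dog", "cat", "lion", "tiger", "pigeon", "vulture"),
  Category = c("pet", "pet", "wild", "wild", "pet", "wild"),
  Count    = c(12, 14, 19, 7, 11, 10),
  Rate     = c(0.4, 0.7, 0.3, 0.6, 0.1, 0.8),
  Brand    = c("GS", "GS", "MN", "MN", "PG", "MN"),
  stringsAsFactors = FALSE
)

# as.data.table() works on a copy, so mydf stays a plain data.frame
dt <- as.data.table(mydf)[, .(
  Term            = paste(unique(Term), collapse = ","),
  Countlessthan13 = paste(unique(Term[Count < 13]), collapse = ","),
  Ratemorethan0.5 = paste(unique(Term[Rate > 0.5]), collapse = ","),
  Brand           = paste(unique(Brand), collapse = ",")
), by = Category]

# Long format: one row per (Category, summary column) pair
long <- melt(dt, id.vars = "Category", variable.name = "category")

# Wide format: one column per Category, one row per summary
resultDF <- dcast(long, category ~ Category, value.var = "value")
print(resultDF)
```

Note that if no term in a category satisfies a condition, paste() on the empty selection returns an empty string, so that cell would simply be blank.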

Awesome. Thanks, Akrun. You made my day! – Tarak