Conditional sum on a column in R

-1

I have a data frame like this:

df <- data.frame(a=c(111,111,111,222,222,222,333,333,333), 
       b=c(1,0,1,1,1,1,0,0,1)) 
df 
    a b 
1 111 1 
2 111 0 
3 111 1 
4 222 1 
5 222 1 
6 222 1 
7 333 0 
8 333 0 
9 333 1

I need the sum of column "b" for each value of "a".

What is the fastest way to do this?

Source

2016-12-16 Vitaliy Poletaev

aggregate(df$b, by=list(df$a), FUN=sum)
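As a small sketch of the same idea, aggregate also has a formula interface. The call above returns its columns under the default names Group.1 and x, whereas the formula form keeps the names a and b. The variable name res is only illustrative.

```r
# Same data as in the question
df <- data.frame(a = c(111, 111, 111, 222, 222, 222, 333, 333, 333),
                 b = c(1, 0, 1, 1, 1, 1, 0, 0, 1))

# Formula interface: sum b within each distinct value of a
res <- aggregate(b ~ a, data = df, FUN = sum)
print(res)
```

This uses base R only, so no extra package is needed.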

Source

2016-12-16 23:49:32 G5W

-1

You can use dplyr:

df %>% group_by(a) %>% summarise(.,b = sum(b))

Source

2016-12-16 23:56:58 PhilC

In general, the fastest approach for large data is to use data.table.

install.packages("data.table", type = "source", 
repos = "http://Rdatatable.github.io/data.table") 
library("data.table") 

df <- data.frame(a=c(111,111,111,222,222,222,333,333,333), 
      b=c(1,0,1,1,1,1,0,0,1)) 
df <- as.data.table(df) 
df[, sum(b), by = a]
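Note that an unnamed sum(b) produces a result column called V1. If the summed column should keep the name b, one possible variation is to name it inside .(), which is data.table's alias for list(). The names dt and res below are illustrative only.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
dt <- data.table(a = c(111, 111, 111, 222, 222, 222, 333, 333, 333),
                 b = c(1, 0, 1, 1, 1, 1, 0, 0, 1))

# Naming the element inside .() names the resulting column
res <- dt[, .(b = sum(b)), by = a]
print(res)
```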

Source

2016-12-16 23:58:40

Your last line of code does not produce the output the OP described. This comes very close: 'df[, sum(b), by = a]' – bdemarest

-2

If we use the dplyr package, do we really need code like this (as posted by the other answerer, PhilC)?

df %>% group_by(a) %>% summarise(., b = sum(b))

Wouldn't this work?

df %>% group_by(a) %>% summarize(b = sum(b))
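A quick way to check this is to run both pipelines and compare the results. When a bare dot appears as an argument on the right-hand side, the %>% pipe places the left-hand side there instead of adding it as the first argument, so the explicit dot should be redundant. summarize and summarise are two spellings of the same dplyr function. The names with_dot and without_dot are illustrative only.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(a = c(111, 111, 111, 222, 222, 222, 333, 333, 333),
                 b = c(1, 0, 1, 1, 1, 1, 0, 0, 1))

# Explicit dot: the piped data is placed where the dot is
with_dot <- df %>% group_by(a) %>% summarise(., b = sum(b))

# No dot: the piped data is passed as the first argument
without_dot <- df %>% group_by(a) %>% summarise(b = sum(b))

# Compare the two results as plain data frames
print(all.equal(as.data.frame(with_dot), as.data.frame(without_dot)))
```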

Source

2016-12-17 00:25:25
