2016-07-05 62 views
1

I have a data frame with two factor columns, and I want to calculate the difference between subsets of it.

Eyecolour Haircolour Points 
    <fctr> <fctr> <dbl> 
1 brown blond   4 
2 brown brunette  -8 
3 blue blond   2 
4 blue brunette  3 
5 green blond   -5 
6 green brunette  9 

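I would like the difference in Points between blond and brunette for each Eyecolour, or, put simply, to subtract brunette from blond within each Eyecolour.

I have tried the dplyr package, but I am struggling to get the code right. Also, diff() does not seem to like the negative values.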

Answers

2

Using your data

# Read the example data from an inline string; the first non-blank line is the header
df <- read.table(text = "
Eyecolour Haircolour Points
brown blond   4
brown brunette  -8
blue blond   2
blue brunette  3
green blond   -5
green brunette  9", header = TRUE)

you can try

library(dplyr) 
library(tidyr) 
df %>% 
    tidyr::spread(Haircolour, Points) %>% 
    dplyr::mutate(diff = blond - brunette) 

Result

Eyecolour blond brunette diff 
1  blue  2  3 -1 
2  brown  4  -8 12 
3  green -5  9 -14 
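
As a side note, recent tidyr releases (1.0.0 and later) mark spread() as superseded in favour of pivot_wider(). The sketch below shows the same reshape-then-subtract idea with that function. It assumes a tidyr version that provides pivot_wider(), and it rebuilds the example data by hand so that it runs on its own. The name wide is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  Eyecolour  = rep(c("brown", "blue", "green"), each = 2),
  Haircolour = rep(c("blond", "brunette"), times = 3),
  Points     = c(4, -8, 2, 3, -5, 9)
)

# One column per hair colour, then subtract brunette from blond
wide <- df %>%
  pivot_wider(names_from = Haircolour, values_from = Points) %>%
  mutate(diff = blond - brunette)

wide
```

The rows may come out in a different order than with spread(), so it is safer to look values up by Eyecolour than by row position.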
+1

Works great, and handy too! Cheers –

2

We can use

library(dplyr) 
df %>% 
    mutate(Haircolour = as.character(Haircolour)) %>% 
    group_by(Eyecolour) %>% 
    summarise(Diff = Points[Haircolour=="blond"] - Points[Haircolour =="brunette"]) 
# Eyecolour Diff 
#  <fctr> <int> 
#1  blue -1 
#2  brown 12 
#3  green -14 

Or use data.table

library(data.table) 
dcast(setDT(df), Eyecolour~Haircolour, value.var="Points")[, Diff:= blond-brunette][] 
# Eyecolour blond brunette Diff 
#1:  blue  2  3 -1 
#2:  brown  4  -8 12 
#3:  green -5  9 -14 
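
On the diff() remark in the question: diff() handles negative values without trouble. It returns each element minus the one before it, so the sign of the result depends on row order. Below is a rough base R sketch, assuming every eye colour has exactly one blond and one brunette row. The names ord and res are only illustrative.

```r
# Example data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  Eyecolour  = rep(c("brown", "blue", "green"), each = 2),
  Haircolour = rep(c("blond", "brunette"), times = 3),
  Points     = c(4, -8, 2, 3, -5, 9)
)

# Sort so that, within each eye colour, brunette comes first and blond second
# (FALSE sorts before TRUE); diff() then yields blond - brunette
ord <- df[order(df$Eyecolour, df$Haircolour == "blond"), ]

# Apply diff() to the Points of each eye colour group
res <- tapply(ord$Points, ord$Eyecolour, diff)

res
```

If a group had more or fewer than two rows, diff() would return a vector of a different length, so this only fits data shaped like the example.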
+0

@ZheyuanLi Yes, when I got your reply, you were not doing it. – akrun