2016-10-10 78 views
3

How can I complete missing factor levels in a data frame?

Suppose I have something like this:

df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5))
# Column names are case-sensitive: the column is PERSON, not Person
df$PERSON <- as.factor(df$PERSON)
# "Coconut" is declared as a level even though no row uses it
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

str(df): 'data.frame': 5 obs. of 4 variables: 
$ PERSON: Factor w/ 3 levels "Lisa","Marcel",..: 3 3 2 1 1 
$ FRUIT : Factor w/ 3 levels "Apple","Peach",..: 1 2 1 1 2 
$ A  : num 100 200 100 200 300 
$ B  : num 1 2 3 4 5 

I want every person to have every level of FRUIT, so that the data frame is expanded to the following result:

PERSON FRUIT A B 
1 Peter Apple 100 1 
2 Peter Peach 200 2 
3 Peter Coconut 0 0 
4 Marcel Apple 100 3 
5 Marcel Peach 0 0 
6 Marcel Coconut 0 0 
7 Lisa Apple 200 4 
8 Lisa Peach 300 5 
9 Lisa Coconut 0 0 

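Missing values in A and B should be filled with 0.

I tried tidyr::complete(df$FRUIT, 0), but it looks like I am using this function incorrectly.

Thanks in advance.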


Please specify that you are using the `complete` function from the 'tidyr' package. – agenis

Answers

9

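complete takes the data frame as its first argument, followed by the columns to expand. By default the fill value is NA, but we can change it to 0 by specifying it per column in a list passed to fill.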

library(tidyr)
complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))
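To make it clearer what the expansion does, here is a rough base R sketch of the same idea: build every PERSON x FRUIT combination, left-join the original data onto it, and replace the resulting NA values with 0. This is only an illustration of the logic, not how tidyr implements complete; the names grid, full and value_cols are made up for this example.

```r
# Same data as in the question; FRUIT carries the unused level "Coconut"
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = factor(c("Apple", "Peach", "Apple", "Apple", "Peach"),
                  levels = c("Apple", "Peach", "Coconut")),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Every combination of person and fruit level, including unused levels
grid <- expand.grid(
  PERSON = unique(df$PERSON),
  FRUIT  = factor(levels(df$FRUIT), levels = levels(df$FRUIT)),
  stringsAsFactors = FALSE
)

# Left join: combinations that are absent from df get NA in A and B
full <- merge(grid, df, by = c("PERSON", "FRUIT"), all.x = TRUE)

# Replace NA with 0 in the value columns only
value_cols <- c("A", "B")
full[value_cols] <- lapply(full[value_cols], function(x) replace(x, is.na(x), 0))
```

Note that merge sorts by the key columns, so the row order will differ from the order shown in the question.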

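It works, thank you. Is it also possible to build the list from the column names? In the real world the number of columns to fill with 0 is 20, so typing them all out is a lot. – barracuda317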


@barracuda317 In that case, try `complete_`, i.e. `library(dplyr); complete_(df, names(df)[1:2]) %>% mutate_each(funs(replace(., is.na(.), 0)), A:B)` – akrun
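Another possible way to avoid typing 20 column names is to build the fill list from the column names and pass it to complete. The sketch below assumes that every column other than PERSON and FRUIT is a numeric value column that should be filled with 0; value_cols and fill_list are names chosen for this example.

```r
library(tidyr)

df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = factor(c("Apple", "Peach", "Apple", "Apple", "Peach"),
                  levels = c("Apple", "Peach", "Coconut")),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)

# Assumption: all columns except the two key columns are value columns
value_cols <- setdiff(names(df), c("PERSON", "FRUIT"))

# Named list of zeros, equivalent to list(A = 0, B = 0)
fill_list <- setNames(as.list(rep(0, length(value_cols))), value_cols)

result <- complete(df, PERSON, FRUIT, fill = fill_list)
```

With 20 value columns only the setdiff line would need to know which columns are keys; the rest stays the same.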