2015-10-19 174 views
0
> str(store) 
'data.frame': 1115 obs. of 10 variables: 
$ Store     : int 1 2 3 4 5 6 7 8 9 10 ... 
$ StoreType    : Factor w/ 4 levels "a","b","c","d": 3 1 1 3 1 1 1 1 1 1 ... 
$ Assortment    : Factor w/ 3 levels "a","b","c": 1 1 1 3 1 1 3 1 3 1 ... 
$ CompetitionDistance  : int 1270 570 14130 620 29910 310 24000 7520 2030 3160 ... 
$ CompetitionOpenSinceMonth: int 9 11 12 9 4 12 4 10 8 9 ... 
$ CompetitionOpenSinceYear : int 2008 2007 2006 2009 2015 2013 2013 2014 2000 2009 ... 
$ Promo2     : int 0 1 1 0 0 0 0 0 0 0 ... 
$ Promo2SinceWeek   : int NA 13 14 NA NA NA NA NA NA NA ... 
$ Promo2SinceYear   : int NA 2010 2011 NA NA NA NA NA NA NA ... 
$ PromoInterval   : Factor w/ 4 levels "","Feb,May,Aug,Nov",..: 1 3 3 1 1 1 1 1 1 1 ... 

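I am trying to replace the NA values depending on the value of Promo2: where Promo2 is 0, the NA should become 0; where Promo2 is 1, the NA should be replaced with the column mean.

I don't understand why my code does not modify the store data.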

for (i in 1:nrow(store)){ 
    if(is.na(store[i,])== TRUE & store$Promo2[i] ==0){ 
    store[i,] <- ifelse(is.na(store[i,]),0,store[i,]) 
    } 
    else if (is.na(store[i,])== TRUE & store$Promo2[i] ==1){ 
    for(j in 1:ncol(store)){ 
     store[is.na(store[i,j]), j] <- mean(store[,j], na.rm = TRUE) 
    } 
    } 
} 

You need to learn some basic R.

Answers

3

For the Promo2SinceWeek column:

store$Promo2SinceWeek[store$Promo2==0 & is.na(store$Promo2SinceWeek)] <- 0 
store$Promo2SinceWeek[store$Promo2==1 & is.na(store$Promo2SinceWeek)] <- mean(store$Promo2SinceWeek, na.rm=TRUE) 

Use the same approach for the other columns. Vectorization is a very useful feature of R.
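As a minimal sketch of how the same two assignments could be applied to every numeric column at once: the helper name `fill_na_by_promo` and the small `demo` data frame below are made up for illustration and are not part of the question's data. The sketch assumes the column mean should be computed from the observed values only, so it is taken before any zeros are written.

```r
# Hypothetical helper: fill NAs in every numeric column, conditional on Promo2.
fill_na_by_promo <- function(df) {
  # Numeric columns other than the Promo2 flag itself
  num_cols <- setdiff(names(df)[sapply(df, is.numeric)], "Promo2")
  for (col in num_cols) {
    missing <- is.na(df[[col]])
    # Assumption: take the mean before filling, so the zeros do not distort it
    col_mean <- mean(df[[col]], na.rm = TRUE)
    df[[col]][missing & df$Promo2 == 0] <- 0
    df[[col]][missing & df$Promo2 == 1] <- col_mean
  }
  df
}

# Made-up data in the shape of the question's columns
demo <- data.frame(
  Promo2          = c(0, 1, 1, 0, 1),
  Promo2SinceWeek = c(NA, 13, 14, NA, NA),
  Promo2SinceYear = c(NA, 2010, 2011, NA, NA)
)
filled <- fill_na_by_promo(demo)
filled
```

The loop here runs over columns, not rows, so each assignment is still a single vectorized operation on a whole column.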

0

To fix the for loop:

for(i in 1:nrow(store)) { 
    col <- which(is.na(store[i,])) 
    store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0 
} 

Or, if you don't want any if statements:

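for (i in 1:nrow(store)) { 

    # Rows with Promo2 == 0: NAs become 0 
    store[i,][is.na(store[i,]) & store$Promo2[i] ==0] <- 0 

    # Rows with Promo2 == 1: NAs become the column mean. 
    # drop = FALSE keeps a data frame even when only one column is selected, 
    # which colMeans() requires. 
    store[i,][is.na(store[i,]) & store$Promo2[i] ==1] <- 
     colMeans(store[,is.na(store[i,]) & store$Promo2[i] ==1, drop = FALSE], na.rm = TRUE) 

} 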

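Your loop does not work because an `if` statement expects a single logical value from its test. Your loop passes it `if(is.na(store[i,])== TRUE & store$Promo2[i] ==0)`, but that condition has many values: `TRUE FALSE FALSE FALSE TRUE ...`. It is a series of TRUEs and FALSEs when it should be exactly one value, either a single TRUE or a single FALSE. When you give it several, `if` uses only the first value (with a warning); since R 4.2.0 it raises an error instead.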
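To illustrate the point about `if` needing a single value, here is a small sketch on a made-up named vector (`row_vals` and `promo2` are hypothetical stand-ins for one row of the data). It shows the length of the condition that the original loop builds, and one way to collapse it to a single value with base R's `anyNA()` and the scalar operator `&&`.

```r
row_vals <- c(a = 1, b = NA, c = 3, d = NA)  # made-up stand-in for one row
promo2 <- 0                                  # made-up Promo2 value for that row

# The element-wise test yields one logical value per column
cond <- is.na(row_vals) & promo2 == 0
length(cond)

# Collapse to a single TRUE/FALSE before handing it to if()
if (anyNA(row_vals) && promo2 == 0) {
  row_vals[is.na(row_vals)] <- 0
}
row_vals
```

The replacement inside the `if` body still uses the full logical vector `is.na(row_vals)` as an index, which is where a multi-valued condition belongs.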

Reproducible example:

store 
#     Promo2 gear carb 
#Mazda RX4    1 NA NA 
#Mazda RX4 Wag   1 4 4 
#Datsun 710    1 4 1 
#Hornet 4 Drive   0 3 1 
#Hornet Sportabout  0 3 NA 
#Valiant    0 3 1 

    for(i in 1:nrow(store)) { 
     col <- which(is.na(store[i,])) 
     store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0 
    } 

store 
#     Promo2 gear carb 
#Mazda RX4    1 3.4 1.75 
#Mazda RX4 Wag   1 4.0 4.00 
#Datsun 710    1 4.0 1.00 
#Hornet 4 Drive   0 3.0 1.00 
#Hornet Sportabout  0 3.0 0.00 
#Valiant    0 3.0 1.00 

Data

store <- head(mtcars) 
store <- store[-(1:8)] 
names(store)[1] <- "Promo2" 
store[1,2] <- NA 
store[5,3] <- NA 
store[1,3] <- NA 
store