2014-11-06 67 views
-1

Drop values if they do not match another column

I have a data frame like this:

Family Component x1 m_x1 x2 m_x2 x3 m_x3 y1 m_y1 y2 m_y2 y3 m_y3 
a1  1   1 100 2 300 0 0  2 250 0 0  0 0 
a1  2   1 100 2 300 0 0  2 250 0 0  0 0 
a1  3   1 100 2 300 0 0  2 250 0 0  0 0 
a2  1   2 150 0 0  0 0  0 0  0 0  0 0 
a2  2   2 150 0 0  0 0  0 0  0 0  0 0 
a3  1   1 4000 3 150 4 130 2 150 3 400 0 0 
a3  2   1 4000 3 150 4 130 2 150 3 400 0 0 
a3  3   1 4000 3 150 4 130 2 150 3 400 0 0 
a3  4   1 4000 3 150 4 130 2 150 3 400 0 0 

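Family is the grouping variable. What I want is this: if the "Component" value (within each Family) does not match the value in one of x1, x2, x3, y1, y2, y3, then that variable's value and the one next to it (m_x1 for x1, m_x2 for x2, ...) should be dropped, that is, set to 0. The result I expect is: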

Family Component x1 m_x1 x2 m_x2 x3 m_x3 y1 m_y1 y2 m_y2 y3 m_y3 
a1  1   1 100 0 0  0 0  0 0  0 0  0 0 
a1  2   0 0  2 300 0 0  2 250 0 0  0 0 
a1  3   0 0  0 0  0 0  0 0  0 0  0 0 
a2  1   0 0  0 0  0 0  0 0  0 0  0 0 
a2  2   2 150 0 0  0 0  0 0  0 0  0 0 
a3  1   1 4000 0 0  0 0  0 0  0 0  0 0 
a3  2   0 0  0 0  0 0  2 150 0 0  0 0 
a3  3   0 0  3 150 0 0  0 0  3 400 0 0 
a3  4   0 0  0 0  4 130 0 0  0 0  0 0 

What function should I use? I tried merge, but could not get it to work.

+0

Using 'melt' on this data frame would make this much simpler. – 2014-11-06 20:19:23

+0

@akrun Yes, thank you. – 2014-11-06 20:26:40

Answers

2

Here is a simple approach:

# find nonmatching entries 
idx <- dat[-(1:2)][c(TRUE, FALSE)] != dat$Component 

# full index 
idx_full <- idx[ , rep(seq(ncol(idx)), each = 2)] 

# replace values with 0 
dat[-(1:2)][idx_full] <- 0 

dat 
# Family Component x1 m_x1 x2 m_x2 x3 m_x3 y1 m_y1 y2 m_y2 y3 m_y3 
# 1  a1   1 1 100 0 0 0 0 0 0 0 0 0 0 
# 2  a1   2 0 0 2 300 0 0 2 250 0 0 0 0 
# 3  a1   3 0 0 0 0 0 0 0 0 0 0 0 0 
# 4  a2   1 0 0 0 0 0 0 0 0 0 0 0 0 
# 5  a2   2 2 150 0 0 0 0 0 0 0 0 0 0 
# 6  a3   1 1 4000 0 0 0 0 0 0 0 0 0 0 
# 7  a3   2 0 0 0 0 0 0 2 150 0 0 0 0 
# 8  a3   3 0 0 3 150 0 0 0 0 3 400 0 0 
# 9  a3   4 0 0 0 0 4 130 0 0 0 0 0 0 

where dat is the name of your data frame.
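To make the steps above easier to follow, here is a self-contained sketch that rebuilds the question's data with read.table and runs the same indexing idea step by step. The intermediate names vals, keys and result are introduced here only for illustration, and the sketch assumes the columns always come as value/m_value pairs after Family and Component.

```r
# Rebuild the data from the question (assumed layout: Family, Component,
# then alternating value / m_value columns).
dat <- read.table(header = TRUE, text = "
Family Component x1 m_x1 x2 m_x2 x3 m_x3 y1 m_y1 y2 m_y2 y3 m_y3
a1 1 1 100 2 300 0 0 2 250 0 0 0 0
a1 2 1 100 2 300 0 0 2 250 0 0 0 0
a1 3 1 100 2 300 0 0 2 250 0 0 0 0
a2 1 2 150 0 0 0 0 0 0 0 0 0 0
a2 2 2 150 0 0 0 0 0 0 0 0 0 0
a3 1 1 4000 3 150 4 130 2 150 3 400 0 0
a3 2 1 4000 3 150 4 130 2 150 3 400 0 0
a3 3 1 4000 3 150 4 130 2 150 3 400 0 0
a3 4 1 4000 3 150 4 130 2 150 3 400 0 0
")

vals <- dat[-(1:2)]            # everything except Family and Component
keys <- vals[c(TRUE, FALSE)]   # every other column: x1, x2, x3, y1, y2, y3

# Logical matrix: TRUE where a key column differs from Component in that row
idx <- keys != dat$Component

# Repeat each column once so the mask also covers the m_ partner column
idx_full <- idx[, rep(seq_len(ncol(idx)), each = 2)]

vals[idx_full] <- 0            # zero out non-matching pairs
result <- cbind(dat[1:2], vals)
result
```

Working on a copy (vals) rather than on dat itself keeps the original data available for comparison; the masking logic is otherwise the same as in the answer.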

1

You could try:

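# key columns in order: x1, x2, x3, y1, y2, y3 
cols <- as.vector(t(outer(c("x","y"), 1:3, 
        function(...) paste(...,sep="")))) 
# multiply each key/m_ pair by the match indicator (FALSE acts as 0) 
df[, 3:ncol(df)] <- do.call(cbind, lapply(cols, function(x) df[, 
           c(x,paste(sep="","m_",x))]*(df[[x]]==df$Component))) 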
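The multiplication trick in this answer may not be obvious at first sight, so here is a minimal sketch on a hypothetical three-row data frame (toy is made up for illustration and is not the question's data). It shows that multiplying a key/m_ pair by a logical vector keeps the rows where the comparison is TRUE and turns the others into 0.

```r
# Hypothetical toy data, not taken from the question
toy <- data.frame(Component = c(1, 2, 3),
                  x1        = c(1, 1, 1),
                  m_x1      = c(100, 100, 100))

# TRUE where the key column matches Component
keep <- toy$x1 == toy$Component

# In arithmetic, TRUE is treated as 1 and FALSE as 0, so non-matching
# rows of both columns become 0
pair <- toy[, c("x1", "m_x1")] * keep
pair
```

Note that this relies on all value columns being numeric; a non-numeric column would need a different masking step.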

1

If the columns are not always in the same order, you could also do:

n1 <- unique(gsub(".+\\_", "", colnames(df1)[-(1:2)])) 

df1[,-(1:2)] <- do.call(cbind,lapply(n1, function(x) { 
         indx <- grep(x, names(df1)) 
         m1 <- as.matrix(df1[indx]) 
         m1[m1[,1]!=df1$Component] <- 0 
         as.data.frame(m1) })) 
df1 
# Family Component x1 m_x1 x2 m_x2 x3 m_x3 y1 m_y1 y2 m_y2 y3 m_y3 
#1  a1   1 1 100 0 0 0 0 0 0 0 0 0 0 
#2  a1   2 0 0 2 300 0 0 2 250 0 0 0 0 
#3  a1   3 0 0 0 0 0 0 0 0 0 0 0 0 
#4  a2   1 0 0 0 0 0 0 0 0 0 0 0 0 
#5  a2   2 2 150 0 0 0 0 0 0 0 0 0 0 
#6  a3   1 1 4000 0 0 0 0 0 0 0 0 0 0 
#7  a3   2 0 0 0 0 0 0 2 150 0 0 0 0 
#8  a3   3 0 0 3 150 0 0 0 0 3 400 0 0 
#9  a3   4 0 0 0 0 4 130 0 0 0 0 0 0 
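As a variation on the same idea, the key/m_ pairs can also be looked up purely by name and modified in place, which may be more robust when the column order is arbitrary. This is only a sketch: zero_unmatched, its arguments key_col and prefix, and the shuffled toy data are hypothetical and not part of the original answer.

```r
# Hypothetical helper: for every column named <prefix><v> that has a partner
# column <v>, zero both where <v> differs from the key column.
zero_unmatched <- function(df, key_col = "Component", prefix = "m_") {
  m_cols <- grep(paste0("^", prefix), names(df), value = TRUE)
  for (m in m_cols) {
    v <- sub(paste0("^", prefix), "", m)   # e.g. "m_x1" -> "x1"
    if (!v %in% names(df)) next            # skip m_ columns without a partner
    miss <- df[[v]] != df[[key_col]]       # rows where the key does not match
    df[miss, c(v, m)] <- 0
  }
  df
}

# Toy data with deliberately shuffled columns (not the question's data)
df1 <- data.frame(Family    = c("a1", "a1"),
                  m_x1      = c(100, 100),
                  Component = c(1, 2),
                  x2        = c(2, 2),
                  x1        = c(1, 1),
                  m_x2      = c(300, 300))

out <- zero_unmatched(df1)
out
```

Because the assignment is done by column name, the result keeps the original column order and leaves unrelated columns untouched.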