2014-09-23 69 views

R: Compare and filter a data.frame based on a condition

I have two data.frames. data_qual looks like this:

data_qual <- structure(list(NAME = structure(1:3, .Label = c("NAME1", "NAME2", "NAME3"), class = "factor"), ID = c(56L, 47L, 77L), YEAR = c(1990L, 2007L, 1899L), VALUE = structure(c(2L, 1L, 1L), .Label = c("ST", "X"), class = "factor")), .Names = c("NAME", "ID", "YEAR", "VALUE"), class = "data.frame", row.names = c(NA, -3L)) 

   NAME ID YEAR VALUE
1 NAME1 56 1990     X
2 NAME2 47 2007    ST
3 NAME3 77 1899    ST

I want to filter out values from data_qual by comparing it against another data frame, dat:

dat <- structure(list(NAME = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("NAME1","NAME2"), class = "factor"), ID = c(56L, 56L, 56L, 47L, 47L, 47L, 47L), YEAR = c(1988L, 1989L, 1991L, 2005L, 2006L, 2007L, 2008L), VALUE = c(45L, 28L, 28L, -12L, 14L, 23L, 32L)), .Names = c("NAME", "ID", "YEAR", "VALUE"), class = "data.frame", row.names = c(NA, -7L)) 

   NAME ID YEAR VALUE
1 NAME1 56 1988    45
2 NAME1 56 1989    28
3 NAME1 56 1991    28
4 NAME2 47 2005   -12
5 NAME2 47 2006    14
6 NAME2 47 2007    23
7 NAME2 47 2008    32

How can I write to a new data.frame only those rows of data_qual whose ID has a match in dat? This first filtering step, on column ID, should give:

   NAME ID YEAR VALUE
1 NAME1 56 1990     X
2 NAME2 47 2007    ST

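After that, I am looking for a way to write out, from the resulting data.frame, only those rows whose YEAR does not also appear in dat for the same group (as defined by ID):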

   NAME ID YEAR VALUE
1 NAME1 56 1990     X

Any help is greatly appreciated.

Are you interested in the intermediate steps, or only in the final result? – Arun 2014-09-23 20:02:19

The intermediate steps would be good to see, so that I have better control over the data. – kurdtc 2014-09-23 20:15:45

Answer

For the first part:

dat2 <- data_qual[data_qual$ID %in% dat$ID, ] 
dat2 
   NAME ID YEAR VALUE
1 NAME1 56 1990     X
2 NAME2 47 2007    ST

Then, for the second part:

good_rows <- lapply(paste(dat2$ID, dat2$YEAR, sep = ":"), grepl, x = paste(dat$ID, dat$YEAR, sep = ":")) 
dat3 <- dat2[!unlist(lapply(good_rows, any)), ] 

Or, if that is too messy for you, a for loop:

good_rows <- vector(length = nrow(dat2)) 
for (i in seq_len(nrow(dat2))) { 
    good_rows[i] <- !any(grepl(dat2$YEAR[i], dat[dat$ID == dat2$ID[i], "YEAR"])) 
} 
dat3 <- dat2[good_rows, ] 
dat3 
   NAME ID YEAR VALUE
1 NAME1 56 1990     X
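
The code above relies on grepl, which does regular-expression substring matching, so a pattern such as 199 would also count as a match for 1990. As a hedged alternative, here is a minimal sketch of the same two steps using exact matching with %in% on a combined ID:YEAR key. It assumes the data_qual and dat objects from the question (rebuilt here with the same values, using character columns instead of factors); the names dat2_exact, dat3_exact, key_qual and key_dat are introduced only for illustration.

```r
# Rebuild the two example data frames from the question (same values).
data_qual <- data.frame(
  NAME  = c("NAME1", "NAME2", "NAME3"),
  ID    = c(56L, 47L, 77L),
  YEAR  = c(1990L, 2007L, 1899L),
  VALUE = c("X", "ST", "ST")
)
dat <- data.frame(
  NAME  = c("NAME1", "NAME1", "NAME1", "NAME2", "NAME2", "NAME2", "NAME2"),
  ID    = c(56L, 56L, 56L, 47L, 47L, 47L, 47L),
  YEAR  = c(1988L, 1989L, 1991L, 2005L, 2006L, 2007L, 2008L),
  VALUE = c(45L, 28L, 28L, -12L, 14L, 23L, 32L)
)

# Step 1: keep only the rows of data_qual whose ID also occurs in dat.
dat2_exact <- data_qual[data_qual$ID %in% dat$ID, ]

# Step 2: drop the rows whose exact (ID, YEAR) pair occurs in dat.
# The keys are illustrative helper vectors, not part of the original answer.
key_qual <- paste(dat2_exact$ID, dat2_exact$YEAR, sep = ":")
key_dat  <- paste(dat$ID, dat$YEAR, sep = ":")
dat3_exact <- dat2_exact[!(key_qual %in% key_dat), ]

dat3_exact
```

For the example data this leaves only the NAME1 row (ID 56, YEAR 1990), the same result as the loop. Because %in% compares whole strings, it avoids accidental partial matches and needs no explicit loop.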