2016-08-01

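Matching two lists of unequal length

I want to match the values in two lists, but only where the variable names are the same in both lists. I would like the result to be a list the length of the longer list, filled with the total number of matches.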

jac <- structure(list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5), 
       .Names = c("s1", "s2", "s3")) 

larger <- structure(list(s1 = structure(c(1L, 1L, 1L), .Label = "a", class = "factor"), 
      s2 = structure(c(2L, 1L, 3L), .Label = c("b", "c", "d"), class = "factor"), 
      s3 = c(1, 2, 7)), .Names = c("s1", "s2", "s3"), row.names = c(NA, -3L), class = "data.frame") 

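I used mapply(FUN = pmatch, jac, larger), which gives me the correct totals, but not in the format I would like.

However, I don't think pmatch will guarantee that the names match in every case, so I wrote a function, which I am still having problems with: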

prodMatch <- function(jac, larger) {
  for (i in 1:nrow(larger)) {
    if (names(jac)[i] %in% names(larger[i])) {
      r[i] <- jac %in% larger[i]
      r
    }
  }
}

Can anyone help?

Another dataset, where one is not a multiple of the other:

larger2 <- 
    structure(list(s1 = structure(c(1L, 1L, 1L), class = "factor", .Label = "a"), 
     s2 = structure(c(1L, 1L, 1L), class = "factor", .Label = "c"), 
     s3 = c(1, 2, 7), s4 = c(8, 9, 10)), .Names = c("s1", "s2", 
    "s3", "s4"), row.names = c(NA, -3L), class = "data.frame") 

Answer

mapply returns a list of match indices, which you can convert to a data frame simply by using as.data.frame:

as.data.frame(mapply(match, jac, larger)) 
#   s1 s2 s3
# 1  1  2 NA
# 2  1  1 NA
# 3  1  3 NA
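
To see why as.data.frame is enough here, it can help to look at the intermediate value. The following is a minimal sketch that rebuilds the question's jac and larger with list() and data.frame(); the names idx and res are only for illustration.

```r
# Same data as in the question, rebuilt in a shorter form
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger <- data.frame(
  s1 = factor(c("a", "a", "a")),
  s2 = factor(c("c", "b", "d"), levels = c("b", "c", "d")),
  s3 = c(1, 2, 7)
)

# mapply pairs jac[[i]] with larger[[i]] by position and keeps jac's names
idx <- mapply(match, jac, larger)

# The per-column results have different lengths (1, 3, 1), so mapply
# cannot simplify them to a matrix and returns a named list instead
str(idx)

# as.data.frame recycles the length-1 elements to the 3 rows of s2
res <- as.data.frame(idx)
print(res)
```

Note that the recycling only works when the element lengths are compatible; lengths such as 2 and 3 would make as.data.frame fail.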

Combining the result with larger using cbind gives what you expected:

cbind(larger,
      setNames(as.data.frame(mapply(match, jac, larger)),
               paste(names(jac), "result", sep = "")))

#   s1 s2 s3 s1result s2result s3result
# 1  a  c  1        1        2       NA
# 2  a  b  2        1        1       NA
# 3  a  d  7        1        3       NA

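Update: to handle the case where the names of the two lists do not match, we can loop over larger together with its names and extract the corresponding element from jac, as follows: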

as.data.frame(
  mapply(function(col, name) {
    m <- match(jac[[name]], col)
    if (length(m) == 0) NA else m  # if the name doesn't exist in jac, return NA as well
  }, larger, names(larger)))

#   s1 s2 s3
# 1  1  2 NA
# 2  1  1 NA
# 3  1  3 NA
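
The same idea can be tried on the larger2 data from the question, which has a column s4 that jac lacks. This is only a sketch: match_by_name is a hypothetical helper name, not part of the answer, and larger2 is rebuilt here with data.frame() for brevity.

```r
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger2 <- data.frame(
  s1 = factor(c("a", "a", "a")),
  s2 = factor(c("c", "c", "c")),
  s3 = c(1, 2, 7),
  s4 = c(8, 9, 10)
)

# Hypothetical helper: look each column of `big` up in `small` by name
match_by_name <- function(small, big) {
  res <- lapply(names(big), function(name) {
    # small[[name]] is NULL when the name is absent, so match() gives integer(0)
    m <- match(small[[name]], big[[name]])
    if (length(m) == 0) NA_integer_ else m
  })
  as.data.frame(setNames(res, names(big)))
}

out <- match_by_name(jac, larger2)
print(out)
```

Here s4 comes back as NA because jac has no element of that name, and s2 is NA wherever a value of jac$s2 does not occur in larger2$s2.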

I will be dealing with a lot of rows and would like to use data.table if possible. Would your suggestion work the same way with data.table? – user3067851

You can use 'as.data.table' to convert the result to a 'data.table'. – Psidom

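When using 'match', matching indices will be found even if the column names do not match, correct? If I have matching values in columns with different names, that could be a problem, no? – user3067851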
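
One way to check this concern: mapply pairs its arguments by position and ignores their names, so reordering jac changes which columns get compared. The sketch below uses illustrative names (jac_shuffled, common, by_position, by_name); restricting both inputs to intersect(names(...)) is one possible guard, not something the answer prescribes.

```r
jac <- list(s1 = "a", s2 = c("b", "c", "d"), s3 = 5)
larger <- data.frame(
  s1 = factor(c("a", "a", "a")),
  s2 = factor(c("c", "b", "d"), levels = c("b", "c", "d")),
  s3 = c(1, 2, 7)
)

# Same content as jac, but with s2 moved to the first position
jac_shuffled <- jac[c("s2", "s1", "s3")]

# Positional pairing: jac's s2 is compared against larger's s1
by_position <- mapply(match, jac_shuffled, larger)

# Name-based pairing: index both inputs by their shared names first
common <- intersect(names(jac_shuffled), names(larger))
by_name <- mapply(match, jac_shuffled[common], larger[common])

str(by_position)
str(by_name)
```

With the shuffled input, the positional version finds no matches for s2, while the name-based version recovers the same indices as before.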