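Merging two data frames of different sizes by matching columns

I am trying to "merge" column V of one data frame into another data frame wherever columns X and Y are equal. The match has to work in both directions: dOne.X == dTwo.X & dOne.Y == dTwo.Y, as well as dOne.X == dTwo.Y & dOne.Y == dTwo.X. I solved this with a for loop, but it is slow when the data frame dOne is large (on my machine it takes 25 minutes when length(dOne.X) == 500000). Is there a faster, "vectorized" way to solve this?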
Data Frame ONE
X Y V
a b 2
a c 3
a d 0
a e 0
b c 2
b d 3
b e 0
c d 2
c e 0
d e 0
Data Frame TWO
X Y V
a b 1
a c 1
a d 1
b c 1
b d 1
c d 1
e d 1
Expected Data Frame after the columns are merged
X Y V V2
a b 2 1
a c 3 1
a d 0 1
a e 0 0
b c 2 1
b d 3 1
b e 0 0
c d 2 1
c e 0 0
d e 0 1
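The above is an example of what I want to do. This is the code I have been using so far; it is slow for large inputs (hundreds of thousands of rows):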
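copyadjlistValueColumn <- function(dOne, dTwo) {
  # Default for pairs that have no match in dTwo
  dOne$V2 <- 0
  # Put X and Y of both data frames on a common set of factor levels
  lv <- union(levels(dOne$Y), levels(dOne$X))
  dTwo$X <- factor(dTwo$X, levels = lv)
  dTwo$Y <- factor(dTwo$Y, levels = lv)
  dOne$X <- factor(dOne$X, levels = lv)
  dOne$Y <- factor(dOne$Y, levels = lv)
  # seq_len() avoids iterating over c(1, 0) when dTwo has no rows
  for (i in seq_len(nrow(dTwo))) {
    row <- dTwo[i, ]
    # Match in both directions: (X, Y) and (Y, X)
    dOne$V2[dOne$X == row$X & dOne$Y == row$Y] <- row$V
    dOne$V2[dOne$X == row$Y & dOne$Y == row$X] <- row$V
  }
  dOne
}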
Here is a test case covering what I expect (using the data frames above):
test_that("Copy V column to another Data Frame", {
  dfOne <- data.frame(X=c("a", "a", "a", "a", "b", "b", "b", "c", "c", "d"),
                      Y=c("b", "c", "d", "e", "c", "d", "e", "d", "e", "e"),
                      V=c(2, 3, 0, 0, 2, 3, 0, 2, 0, 0))
  dfTwo <- data.frame(X=c("a", "a", "a", "b", "b", "c", "e"),
                      Y=c("b", "c", "d", "c", "d", "d", "d"),
                      V=c(1, 1, 1, 1, 1, 1, 1))
  lv <- union(levels(dfTwo$Y), levels(dfTwo$X))
  dfExpected <- data.frame(X=c("a", "a", "a", "a", "b", "b", "b", "c", "c", "d"),
                           Y=c("b", "c", "d", "e", "c", "d", "e", "d", "e", "e"),
                           V=c(2, 3, 0, 0, 2, 3, 0, 2, 0, 0),
                           V2=c(1, 1, 1, 0, 1, 1, 0, 1, 0, 1))
  dfExpected$X <- factor(dfExpected$X, levels = lv)
  dfExpected$Y <- factor(dfExpected$Y, levels = lv)
  dfMerged <- copyadjlistValueColumn(dfOne, dfTwo)
  expect_identical(dfMerged, dfExpected)
})
Any suggestions?
Thanks a lot :)
Possible duplicate of [How to join data frames in R (inner, outer, left, right)?](http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right) – 2014-11-24 13:17:01
`merge(dOne, dTwo, by = c("X", "Y"), all.x = TRUE)`? Although for some reason it does not quite give your desired output – 2014-11-24 13:19:12
Hey David, I think that is because I have to match it "both ways": `dOne.X == dTwo.X & dOne.Y == dTwo.Y` and `dOne.X == dTwo.Y & dOne.Y == dTwo.X` – alfakini 2014-11-24 13:21:26
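One possible way to vectorize the "both ways" match described in the comment above is to give each row an order-independent key, so that the pair (e, d) and the pair (d, e) produce the same key, and then look the keys up with a single match() call. This is only a sketch under some assumptions, not a drop-in replacement for the loop: pair_key and copy_value_column are hypothetical names that do not appear in the original post, the labels are assumed not to contain a tab character, and each unordered pair is assumed to occur at most once in dTwo (the loop keeps the last duplicate, match() keeps the first). Unlike the loop version, it also leaves X and Y as they are instead of converting them to factors.

```r
# Hypothetical helper (not from the original post): builds a key that is the
# same for (x, y) and (y, x) by always putting the smaller label first.
# Assumes the labels never contain a tab character.
pair_key <- function(x, y) {
  x <- as.character(x)
  y <- as.character(y)
  paste(pmin(x, y), pmax(x, y), sep = "\t")
}

# Hypothetical vectorized variant of the loop: one match() call instead of
# one pass over dOne per row of dTwo.
copy_value_column <- function(dOne, dTwo) {
  # Position of each dOne pair in dTwo, NA where there is no match
  idx <- match(pair_key(dOne$X, dOne$Y), pair_key(dTwo$X, dTwo$Y))
  # Unmatched pairs get 0, matching the default used in the loop version
  dOne$V2 <- ifelse(is.na(idx), 0, dTwo$V[idx])
  dOne
}

# Same data as in the question
dfOne <- data.frame(X = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "d"),
                    Y = c("b", "c", "d", "e", "c", "d", "e", "d", "e", "e"),
                    V = c(2, 3, 0, 0, 2, 3, 0, 2, 0, 0))
dfTwo <- data.frame(X = c("a", "a", "a", "b", "b", "c", "e"),
                    Y = c("b", "c", "d", "c", "d", "d", "d"),
                    V = c(1, 1, 1, 1, 1, 1, 1))

merged <- copy_value_column(dfOne, dfTwo)
print(merged$V2)
```

On the example data this should fill V2 with 1 for the (d, e) row of dfOne even though dfTwo stores that pair as (e, d). The same keys could also be passed to merge() as a single join column if a merge-based solution is preferred.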