2013-03-02 81 views
13

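In the data.table FAQ, the nomatch = NA parameter is said to be akin to an outer join. However, I haven't been able to get data.table to do a full outer join, only a right outer join. How can I do a full join with data.table?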

For example:

library(data.table)
a <- data.table("dog" = c(8:12), "cat" = c(15:19))

    dog cat 
1: 8 15 
2: 9 16 
3: 10 17 
4: 11 18 
5: 12 19 

b <- data.table("dog" = 1:10, "bullfrog" = 11:20) 

    dog bullfrog 
1: 1  11 
2: 2  12 
3: 3  13 
4: 4  14 
5: 5  15 
6: 6  16 
7: 7  17 
8: 8  18 
9: 9  19 
10: 10  20 

setkey(a, dog) 
setkey(b, dog) 

a[b, nomatch = NA] 

    dog cat bullfrog 
1: 1 NA  11 
2: 2 NA  12 
3: 3 NA  13 
4: 4 NA  14 
5: 5 NA  15 
6: 6 NA  16 
7: 7 NA  17 
8: 8 15  18 
9: 9 16  19 
10: 10 17  20 

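So nomatch = NA (which is the default) produces a right outer join: every row of b is kept, and rows of a without a match are dropped. What should I do if I need a full join? For example: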

merge(a, b, by = "dog", all = TRUE) 
# Or with plyr: 
join(a, b, by = "dog", type = "full") 

    dog cat bullfrog 
1: 1 NA  11 
2: 2 NA  12 
3: 3 NA  13 
4: 4 NA  14 
5: 5 NA  15 
6: 6 NA  16 
7: 7 NA  17 
8: 8 15  18 
9: 9 16  19 
10: 10 17  20 
11: 11 18  NA 
12: 12 19  NA 

Is this possible with data.table?

For the various kinds of joins with data.table, see the last answer in this post: http://stackoverflow.com/questions/14076065/data-table-inner-outer-join-with-na-in-join-column-of-type-double-bug?rq=1 – statquant 2013-03-03 22:45:58

Answers

19

You are actually already there. Use merge.data.table, which is exactly what you are doing when you call

merge(a, b, by = "dog", all = TRUE) 

Because a is a data.table, merge(a, b, ...) dispatches to merge.data.table(a, b, ...).
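
A minimal sketch of that dispatch, reusing the a and b tables from the question. The all.x, all.y and inner variants are included only for comparison and are not part of the original answer.

```r
library(data.table)

a <- data.table(dog = 8:12, cat = 15:19)
b <- data.table(dog = 1:10, bullfrog = 11:20)

# merge() is an S3 generic; because `a` is a data.table,
# each call below dispatches to merge.data.table().
full  <- merge(a, b, by = "dog", all = TRUE)    # full outer join
left  <- merge(a, b, by = "dog", all.x = TRUE)  # keep every row of a
right <- merge(a, b, by = "dog", all.y = TRUE)  # keep every row of b
inner <- merge(a, b, by = "dog")                # only dogs present in both

print(full)
```

The full result has one row per distinct dog value from either table, with NA wherever a table had no matching row.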

Ah, of course. I should have known that. Thanks. – 2013-03-03 05:40:34

0
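library(data.table)

x <- data.table(a = 1:5, b = 11:15)
y <- data.table(a = c(1:4, 6), c = c(101:104, 106))

# Key both tables on the join column.
setkey(x, a)
setkey(y, a)

# Collect every key value that appears in either table.
unique_keys <- unique(c(x[, a], y[, a]))
# Look up all keys in x first, then join that result against y.
y[x[.(unique_keys), on = "a"]] # Full outer join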
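
The answer above keys both tables first. As a hedged variant, the same two-step lookup can be sketched with on= alone, without calling setkey. The name all_keys is introduced here purely for illustration, and the integer literals (6L, 106L) are an assumption made so that the column types match across both tables.

```r
library(data.table)

x <- data.table(a = 1:5, b = 11:15)
y <- data.table(a = c(1:4, 6L), c = c(101:104, 106L))

# Every key value that appears in either table, as a one-column table.
all_keys <- data.table(a = union(x$a, y$a))

# Step 1: look up each key in x (missing keys get b = NA).
# Step 2: look up the same keys in y (missing keys get c = NA).
full <- y[x[all_keys, on = "a"], on = "a"]

print(full)
```

Each distinct value of a appears once in the result, which is what merge(x, y, by = "a", all = TRUE) would also return, apart from column order and the key attribute.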