2017-06-19 46 views
0

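Insert rows into groups when joining

I want to join two data.tables in R. I am joining them by name, and I want to insert the rows from one table into the name groups of the other. Table A has "name" and "amount"; table B has "name" and "address" (with more than one address per name). I want a single table that contains each name, its corresponding addresses, and one "amount" per name group.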

I tried `left_join` from dplyr, but the amount column gets repeated on every "address" row.
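The repetition described above is what any one-to-many join does, not something specific to dplyr. A minimal base R sketch, assuming small made-up tables (`amounts` and `addresses` are hypothetical names, not from the question):

```r
# Hypothetical sample data with the shape described in the question
amounts <- data.frame(name = c("x", "y"), amount = c(100, 200),
                      stringsAsFactors = FALSE)
addresses <- data.frame(name = c("x", "x", "y"), address = c("A", "B", "C"),
                        stringsAsFactors = FALSE)

# all.x = TRUE keeps every address row, comparable to a left join
# with the address table on the left
joined <- merge(addresses, amounts, by = "name", all.x = TRUE)

# Each address row for "x" carries the same amount, so it appears twice
print(joined)
```

So the join itself is doing the right thing; the remaining step is to blank the amount on all but one row per name.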

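Does anyone have any ideas? Thanks.

Example image (joining tables 1 and 2 to create table 3):

Or even like this:

Edit: added two datasets as a reproducible example of the input, along with the desired output.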

table_one <- data.frame(name=c("x","y","z"), amount=c("$100","200","300")) 
table_two <- data.frame(name=c("x","x","y","z","z","z"), address=c("A","B","C","D","E","F")) 

output <- data.frame(name=c("x","x","y","z","z","z"), 
        address=c("A","B","C","D","E","F"), amount=c("$100","","$200","$300","","")) 
+2

It's best to include a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data in a form we can copy/paste. Pictures of data are not helpful. – MrFlick

+0

Table 3 looks more like a row bind than a join to me. Maybe `bind_rows`? – aosmith

Answers

1

Using dplyr:

library(dplyr) 

left_join(table_two, table_one, by = 'name') %>% 
    mutate(amount = replace(amount, duplicated(name), NA)) 
#   name address amount
# 1    x       A   $100
# 2    x       B   <NA>
# 3    y       C    200
# 4    z       D    300
# 5    z       E   <NA>
# 6    z       F   <NA>
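The key step in this answer is `duplicated(name)`, which is `TRUE` for every occurrence of a name after its first. A small base R sketch of just that step, using made-up vectors rather than the joined table:

```r
# Hypothetical vectors standing in for the joined columns
name   <- c("x", "x", "y", "z", "z", "z")
amount <- c("$100", "$100", "$200", "$300", "$300", "$300")

# TRUE marks the second and later occurrences of each name
dup <- duplicated(name)

# replace() overwrites only the flagged positions with NA
blanked <- replace(amount, dup, NA)
print(data.frame(name, amount = blanked))
```

Because `duplicated()` looks at all earlier values, not only the previous row, the amount is kept on the first row of each name even if rows for the same name are not adjacent.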
0

Here you go.

table_one <- data.frame(name=c("x","y","z"), amount=c("$100","$200","$300")) 
table_two <- data.frame(name=c("x","x","y","z","z","z"), address=c("A","B","C","D","E","F")) 

output <- data.frame(name=c("x","x","y","z","z","z"), 
        address=c("A","B","C","D","E","F"), amount=c("$100","","$200","$300","","")) 


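# merge() joins on name; the result columns are name, amount, address
test <- merge(table_one, table_two, by = 'name')
# Convert to character so the column can hold empty strings
test$amount <- as.character(test$amount)
# Blank the amount on every row after the first within each name
test$amount[duplicated(test$name)] <- ""
test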
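Since the question mentions data.tables, the same join-then-blank idea can also be sketched with the data.table package. This is only an illustration under the assumption that data.table is installed; the object names `amounts`, `addresses`, and `joined` are made up here:

```r
library(data.table)

# Hypothetical copies of the two input tables
amounts <- data.table(name = c("x", "y", "z"),
                      amount = c("$100", "$200", "$300"))
addresses <- data.table(name = c("x", "x", "y", "z", "z", "z"),
                        address = c("A", "B", "C", "D", "E", "F"))

# X[Y, on = ] returns one row per row of Y, in Y's row order
joined <- amounts[addresses, on = "name"]

# Blank the amount, by reference, on repeated names
joined[duplicated(name), amount := ""]

# Put the columns in the order shown in the desired output
setcolorder(joined, c("name", "address", "amount"))
print(joined)
```

Unlike `merge()`, this form keeps the row order of the address table, which may matter if the addresses are not already sorted by name.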
0

We can use `match`:

i1 <- with(table_one, match(name, table_two$name)) 
table_two$amount <- "" 
table_two$amount[i1] <- as.character(table_one$amount) 
table_two 
#   name address amount
# 1    x       A   $100
# 2    x       B
# 3    y       C    200
# 4    z       D    300
# 5    z       E
# 6    z       F
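This works because `match()` returns the position of the first occurrence of each lookup value, so each amount lands on the first row of its name group. A base R sketch with made-up vectors, including a guard for a name that has no address (the extra name "w" and the variable names are assumptions for illustration):

```r
# Hypothetical lookup names and amounts; "w" has no address row
lookup <- c("x", "y", "z", "w")
values <- c("$100", "$200", "$300", "$400")
target <- c("x", "x", "y", "z", "z", "z")

# First position of each lookup name in target; NA when absent
first_pos <- match(lookup, target)

# Start with blanks, then fill only the positions that were found
amount <- rep("", length(target))
found <- !is.na(first_pos)
amount[first_pos[found]] <- values[found]
print(data.frame(name = target, amount))
```

The `found` filter is a precaution: assigning several values through an index that contains `NA` is an error in R, which could happen if one table has a name the other lacks.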