2014-09-25 68 views
-2

Multiplying data frames in R

I have two data frames. One is called data, like this:

data <- data.frame(ID = c(1, 1, 2, 2), 
      Number = c(1,2, 1, 2), 
      Answer = c(1, 2, 3, 2) 
      ) 

The other is called weights, like this:

weights <- data.frame (Number=c(1,2), 
      weight1=c(0.5,1), 
      weight2=c(1, 1) 
     ) 

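What I want is to multiply data$Answer by the weights in weights, matched on Number across the two data frames. The final result should look like this: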

  ID Number Answer Answer*Weights1 Answer*Weights2
1  1      1      1           1*0.5             1*1
2  1      2      2             2*1             2*1
3  2      1      3           3*0.5             3*1
4  2      2      2             2*1             2*1

How can I achieve this? Any input would be greatly appreciated. Thanks.

Answers

3
data <- merge(data, weights, by = "Number")
data <- transform(data,
                  A1 = Answer * weight1,
                  A2 = Answer * weight2)
#   Number ID Answer weight1 weight2  A1 A2
# 1      1  1      1     0.5       1 0.5  1
# 2      1  2      3     0.5       1 1.5  3
# 3      2  1      2     1.0       1 2.0  2
# 4      2  2      2     1.0       1 2.0  2
+1

Read some tutorials on how to sort a data.frame. It really is useful. – Roland 2014-09-25 15:27:56

+0

Yes. It works. Thank you very much. – lucyh 2014-09-25 15:49:19
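Since the merged result in the answer above comes back ordered by Number, with Number as the first column, the following is a minimal base R sketch of putting it back into the layout shown in the question. The names merged and result are illustrative only and do not come from the thread.

```r
data <- data.frame(ID = c(1, 1, 2, 2),
                   Number = c(1, 2, 1, 2),
                   Answer = c(1, 2, 3, 2))
weights <- data.frame(Number = c(1, 2),
                      weight1 = c(0.5, 1),
                      weight2 = c(1, 1))

# Same merge-and-multiply step as in the answer
merged <- merge(data, weights, by = "Number")
merged <- transform(merged,
                    A1 = Answer * weight1,
                    A2 = Answer * weight2)

# Sort rows by ID, then Number, and keep the columns in the question's order
result <- merged[order(merged$ID, merged$Number),
                 c("ID", "Number", "Answer", "A1", "A2")]
rownames(result) <- NULL  # reset row names after reordering
print(result)
```

order() returns the row permutation, so the same indexing call can reorder the rows and select the columns in one step.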

0

In case you want the entries in the Answer*Weights1 and Answer*Weights2 columns to be strings rather than actual products, as in your original post:

# data[, 3] is Answer; weights[, 2] and weights[, 3] are weight1 and weight2
data <- cbind(data,
              paste(data[, 3], weights[, 2], sep = "*"),
              paste(data[, 3], weights[, 3], sep = "*"))
names(data)[4:5] <- c("Answer*Weights1", "Answer*Weights2")
#   ID Number Answer Answer*Weights1 Answer*Weights2
# 1  1      1      1           1*0.5             1*1
# 2  1      2      2             2*1             2*1
# 3  2      1      3           3*0.5             3*1
# 4  2      2      2             2*1             2*1

Or, if you want numbers rather than strings:

data[, 4] <- data[, 3] * weights[, 2]
data[, 5] <- data[, 3] * weights[, 3]
names(data)[4:5] <- c("Answer*Weights1", "Answer*Weights2")
#   ID Number Answer Answer*Weights1 Answer*Weights2
# 1  1      1      1             0.5               1
# 2  1      2      2             2.0               2
# 3  2      1      3             1.5               3
# 4  2      2      2             2.0               2
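The positional multiplication above lines up only because the Number values in data cycle through 1, 2 in the same order as the rows of weights, so R's recycling of the shorter vector happens to pick the right weight. A sketch that looks each weight up by Number instead, assuming every Number in data appears in weights, might look like this (idx is an illustrative name):

```r
data <- data.frame(ID = c(1, 1, 2, 2),
                   Number = c(1, 2, 1, 2),
                   Answer = c(1, 2, 3, 2))
weights <- data.frame(Number = c(1, 2),
                      weight1 = c(0.5, 1),
                      weight2 = c(1, 1))

# For each row of `data`, the index of the row in `weights` with the same Number
idx <- match(data$Number, weights$Number)

# Multiply Answer by the weight looked up for that row
data[["Answer*Weights1"]] <- data$Answer * weights$weight1[idx]
data[["Answer*Weights2"]] <- data$Answer * weights$weight2[idx]
print(data)
```

With match(), the result does not depend on the row order of either data frame. A Number missing from weights would yield NA rather than a silently wrong product.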
1

You could also do:

library(dplyr)
left_join(data, weights, by = "Number") %>%
  select(ID:Answer, Answer_weight1 = weight1, Answer_weight2 = weight2) %>%
  mutate_each(funs(Answer * .), contains("weight"))
#   ID Number Answer Answer_weight1 Answer_weight2
# 1  1      1      1            0.5              1
# 2  1      2      2            2.0              2
# 3  2      1      3            1.5              3
# 4  2      2      2            2.0              2
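mutate_each() and funs() have since been deprecated in dplyr. A rough equivalent using across(), assuming dplyr 1.0.0 or later is installed, could be sketched as follows (result is an illustrative name):

```r
library(dplyr)

data <- data.frame(ID = c(1, 1, 2, 2),
                   Number = c(1, 2, 1, 2),
                   Answer = c(1, 2, 3, 2))
weights <- data.frame(Number = c(1, 2),
                      weight1 = c(0.5, 1),
                      weight2 = c(1, 1))

result <- left_join(data, weights, by = "Number") %>%
  # Multiply Answer by each weight column, writing to new Answer_<column> columns
  mutate(across(starts_with("weight"), ~ Answer * .x,
                .names = "Answer_{.col}")) %>%
  # Drop the raw weight columns once the products exist
  select(-weight1, -weight2)
print(result)
```

The .names argument keeps the original weight columns untouched during the mutate step, which is why they are removed explicitly afterwards.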
1

Here is how you could do this using data.table:

require(data.table)   ## 1.9.2
setDT(data)           ## convert data.frame to data.table by reference
setDT(weights)

setkey(data, Number)  ## set the key columns to join by
data[weights, c("Answer1", "Answer2") :=
       list(Answer * weight1, Answer * weight2)]

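We perform a join, but create the required columns directly, without materialising the intermediate columns (weight1, weight2), so it is quite memory efficient. It modifies data in place.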
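As a self-contained sketch of the same update join, the version below uses the on= argument instead of setkey(), which assumes data.table 1.9.6 or later rather than the 1.9.2 noted above. The i. prefix is optional here and only makes explicit that the weight columns come from weights.

```r
library(data.table)

data <- data.table(ID = c(1, 1, 2, 2),
                   Number = c(1, 2, 1, 2),
                   Answer = c(1, 2, 3, 2))
weights <- data.table(Number = c(1, 2),
                      weight1 = c(0.5, 1),
                      weight2 = c(1, 1))

# Update join: match rows on Number and add the two product columns by reference
data[weights, on = "Number",
     c("Answer1", "Answer2") := list(Answer * i.weight1, Answer * i.weight2)]

# weight1 and weight2 were used in the computation but never added to `data`
print(names(data))
print(data)
```

Because no key is set, the rows of data keep their original order, and only the two new columns are added.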