2017-03-08 98 views

Apologies for the unclear title! How can I match dates against multiple columns and assign the column name in R?

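My problem is that I have a dataset with a column named real.date. Each id has several real dates. I want to replace each real.date with the name of the column it matches among the three columns date.1, date.2 and date.3. The rule is that a difference of ±5 days is allowed.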

id = c(1,1,1,2,2,2,3,3,3) 
real.date = c('21-06-16','29-08-16','21-11-16','20-06-16','28-08-16','20-11-16','22-06-16','30-08-16','22-11-16') 
date.1 = c('21-06-16','21-06-16','21-06-16','20-06-16','20-06-16','20-06-16','22-06-16','22-06-16','22-06-16') 
date.2 = c('29-08-16','29-08-16','29-08-16','28-08-16','28-08-16','28-08-16','30-08-16','30-08-16','30-08-16') 
date.3 = c('21-11-16','21-11-16','21-11-16','20-11-16','20-11-16','20-11-16','19-11-16','19-11-16','19-11-16') 
df = cbind(id,real.date,date.1,date.2,date.3) 

df 
     id real.date date.1  date.2  date.3  
[1,] "1" "21-06-16" "21-06-16" "29-08-16" "21-11-16" 
[2,] "1" "29-08-16" "21-06-16" "29-08-16" "21-11-16" 
[3,] "1" "21-11-16" "21-06-16" "29-08-16" "21-11-16" 
[4,] "2" "20-06-16" "20-06-16" "28-08-16" "20-11-16" 
[5,] "2" "28-08-16" "20-06-16" "28-08-16" "20-11-16" 
[6,] "2" "20-11-16" "20-06-16" "28-08-16" "20-11-16" 
[7,] "3" "22-06-16" "22-06-16" "30-08-16" "19-11-16" 
[8,] "3" "30-08-16" "22-06-16" "30-08-16" "19-11-16" 
[9,] "3" "22-11-16" "22-06-16" "30-08-16" "19-11-16" 

I would like to get a result like this:

id real.date 
1 date.1 
1 date.2 
1 date.3 
2 date.1 
2 date.2 
2 date.3 
3 date.1 
3 date.2 
3 date.3 

Any help is much appreciated!

Thanks

Answers


This can be done by converting the strings to dates with as.Date:

fmt <- "%d-%m-%y" # assumed format: day-month-two-digit-year
day.diff <- as.Date(df[, 3:5], format = fmt) - as.Date(df[, 2], format = fmt)
day.diff <- matrix(as.numeric(day.diff), nrow = nrow(df))

x <- apply(day.diff, 1, function(x){
    res <- which(abs(x) <= 5)
    if(length(res) > 1){ # more than one column meets the requirement: keep the first
     res <- res[1]
    }else if(length(res) == 0){ # none of the columns meets the requirement
     res <- NA
    }
    res
})

new.df <- data.frame(id = df[, 1], real.date = colnames(df)[3:5][x])
# id real.date
# 1 1 date.1
# 2 1 date.2
# 3 1 date.3
# 4 2 date.1
# 5 2 date.2
# 6 2 date.3
# 7 3 date.1
# 8 3 date.2
# 9 3 date.3

Note: pass the format explicitly. Without it, as.Date reads '22-11-16' as year 22, November 16, so the last row appears to be years away from date.3 and comes out as <NA> instead of date.3.
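
Whether the last row falls within the 5-day window depends on how the strings are parsed. The following is a minimal check, assuming the dates are written day-month-year (the question does not state the format, so `"%d-%m-%y"` is an assumption). The vector `s` just holds the two values from the last row of the example.

```r
# The two strings from the last row: real.date and date.3.
s <- c("22-11-16", "19-11-16")

# Without a format, as.Date tries "%Y-%m-%d" first,
# so the leading "22" and "19" are taken as the year.
no.fmt <- as.Date(s)

# Assumption: day-month-two-digit-year.
with.fmt <- as.Date(s, format = "%d-%m-%y")

# Difference in days under each interpretation.
diff.no.fmt   <- as.numeric(no.fmt[1] - no.fmt[2])     # roughly three years
diff.with.fmt <- as.numeric(with.fmt[1] - with.fmt[2]) # 3

print(c(no.format = diff.no.fmt, with.format = diff.with.fmt))
```

Under the day-month-year reading the gap is 3 days, which is inside the ±5-day rule. Under the default reading it is about three years.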


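If df is a data.frame, the day.diff matrix can be built like this:

day.diff <- sapply(3:5, function(i) as.Date(df[, i], format = "%d-%m-%y") - as.Date(df[, 2], format = "%d-%m-%y"))
# this also works if df is a matrix
# the format string assumes day-month-two-digit-year, as in the question's data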
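
For the data.frame case, here is a self-contained sketch of the whole matching step. It uses a hypothetical two-row data.frame `df2` (the first and last rows of the example) and again assumes the day-month-year format. Extracting each column as a plain vector with `[[` keeps the subtraction between two Date vectors rather than between a Date and a data.frame.

```r
# Hypothetical data.frame holding the first and last rows of the example.
df2 <- data.frame(
  id        = c(1, 3),
  real.date = c("21-06-16", "22-11-16"),
  date.1    = c("21-06-16", "22-06-16"),
  date.2    = c("29-08-16", "30-08-16"),
  date.3    = c("21-11-16", "19-11-16"),
  stringsAsFactors = FALSE
)

fmt <- "%d-%m-%y"  # assumption: day-month-two-digit-year
date.cols <- c("date.1", "date.2", "date.3")

real <- as.Date(df2$real.date, format = fmt)

# One column of day differences per candidate date column.
day.diff <- sapply(date.cols, function(col) {
  as.numeric(as.Date(df2[[col]], format = fmt) - real)
})

# Index of the first column within 5 days, or NA if none qualifies.
hit <- apply(abs(day.diff) <= 5, 1, function(ok) which(ok)[1])

new.df2 <- data.frame(id = df2$id, real.date = date.cols[hit])
print(new.df2)
```

Taking the first hit mirrors the `res[1]` choice in the answer above. If two date columns could both fall within 5 days, picking the column with the smallest absolute difference may be a better rule.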
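
Thank you very much for your answer. I followed this and got: non-numeric argument to binary operator. In addition: Warning message: Incompatible methods ("Ops.data.frame", "-.Date") for "-" – PNY

Sorry, I can't format my text as code >.< – PNY

I also tried as.POSIXct, but it doesn't seem to work – PNY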