R: Find data frame indices of non-unique/duplicate values

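I want to extract some values from a vector, modify them, and put them back at their original positions.
I have searched a lot and tried different approaches to this problem. I'm afraid the solution may be very simple, but I haven't seen it yet.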

Create a vector and convert it to a data frame. Also create an empty data frame for the results.

hight <- c(5,6,1,3) 
hight_df <- data.frame("ID"=1:length(hight), "hight"=hight) 
hight_min_df <- data.frame()

For each pair of adjacent values, extract the smaller value together with its ID.

for(i in 1:(length(hight_df[,2])-1)) 
{ 
    hight_min_df[i,1] <- which(grepl(min(hight_df[,2][i:(i+1)]), hight_df[,2])) 
    hight_min_df[i,2] <- min(hight_df[,2][i:(i+1)]) 
}

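Modify the extracted values and aggregate rows with the same ID by keeping the higher value. Finally, write the modified values back.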

hight_min_df[,2] <- hight_min_df[,2]+20 
adj_hight <- aggregate(x=hight_min_df[,2],by=list(hight_min_df[,1]), FUN=max) 
hight[adj_hight[,1]] <- adj_hight[,2]

This works perfectly as long as hight contains only unique values. How can I run this script with a vector like this: hight <- c(5,6,1,3,5)?
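A minimal sketch of why the repeated value breaks the loop (not part of the original post): grepl() matches the minimum as a pattern against every element, so with two 5s in the vector which() returns two positions, and two values cannot be assigned into a single data frame cell. Computing the position directly from the pair, here with which.min(), avoids the lookup. The names dup_idx and pos are introduced only for illustration.

```r
hight <- c(5, 6, 1, 3, 5)

# grepl() matches the pattern "5" against every element, so both 5s are found
dup_idx <- which(grepl(min(hight[1:2]), hight))
print(dup_idx)  # two positions instead of one

# Position of the smaller value within the pair i:(i+1), computed directly
# from the pair itself, so duplicates elsewhere in the vector do not matter
i <- 1
pos <- i - 1 + which.min(hight[i:(i + 1)])
print(pos)
```

Because grepl() does pattern matching rather than an equality test, it would also match a 1 inside values such as 10 or 21, which is another reason to prefer a positional lookup.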

Source

2017-08-25 Jack M

If hight <- c(5,6,1,3,5), what is the expected output? – BLT

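OK, there is a lot to unpack here. I would suggest using dplyr with the pipe operator instead of a loop. Read the dplyr vignette; it is an excellent resource, and dplyr is a nice way to do data manipulation in R.

So, using dplyr, we can rewrite your code like this: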

library(dplyr) 
hight <- c(5,6,1,3,5) #skip straight to the test case 
hight_df <- data.frame("ID"=1:length(hight), "hight"=hight) 

adj_hight <- hight_df %>% 
    # Logic in pseudocode: going from the first row to the last, 
    # if the previous row's hight (via lag()) is greater than the current 
    # row's hight, take the current row's ID and value; otherwise take 
    # the previous row's ID and value. 
    mutate(subst.id = ifelse(lag(hight) > hight, ID, lag(ID)), 
     subst.val = ifelse(lag(hight) > hight, hight, lag(hight)) + 20) %>% 
    filter(!is.na(subst.val)) %>% #remove extra rows 
    select(subst.id, subst.val) %>% #take just the columns we want 
    #grouping - rewrite of your use of aggregate 
    group_by(subst.id) %>% 
    summarise(subst.val = max(subst.val)) %>% 
    data.frame(.) 

#tying back in 
hight[adj_hight[,1]] <- adj_hight[,2] 
print(hight)

which gives:

[1] 25 6 21 23 5
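For readers who prefer to avoid the dplyr dependency, the same idea can be sketched in base R. This is an illustrative alternative, not the answer's own method: it assumes the vector has at least two elements, sends ties to the left element of each pair (as the lag() comparison above does), and the names left, right, min_id, min_val and adj are hypothetical.

```r
hight <- c(5, 6, 1, 3, 5)
n <- length(hight)

left  <- hight[-n]   # first element of each adjacent pair
right <- hight[-1]   # second element of each adjacent pair

# Position of the smaller value in each pair; ties go to the left element
min_id  <- ifelse(right < left, seq_len(n - 1) + 1, seq_len(n - 1))
min_val <- pmin(left, right) + 20

# Largest adjusted value per position, then write the values back
adj <- tapply(min_val, min_id, max)
hight[as.integer(names(adj))] <- as.numeric(adj)
print(hight)
```

For the test vector this reproduces the result of the dplyr version, because each position is identified by its index within the pair and never by looking up its value.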

Source

2017-08-25 16:33:18 Zach
