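Update column values and delete rows in R

2017-07-03

I am reading a CSV file and trying to update a column named 'added'. The rule is: when two consecutive rows have the same bug_id and bug_when, and the 'added' value in row i is "RESOLVED", set the 'added' value in row i + 1 to the concatenation of the 'added' values from rows i and i + 1, then delete row i. I tried, but it does not work correctly. The file contains the following data: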

bug_id bug_when   field  added 
1141327 2015-03-09 16:21:30 Status  RESOLVED 
1141327 2015-03-09 16:21:30 Resolution DUPLICATE 
1142623 2015-03-24 18:15:22 Status  RESOLVED 
1142623 2015-03-24 18:15:22 Resolution FIXED 
1143179 2015-07-30 09:37:56 Status  RESOLVED 
1143179 2015-07-30 09:37:56 Resolution FIXED 

Here is my code:

dataframe <- read.csv("prototype.csv", header = TRUE) 
start <- 1 
end <- nrow(dataframe)-1 

for(i in start:end) 
{ 
    if(dataframe$bug_id[i]==dataframe$bug_id[i+1] & dataframe$bug_when[i]==dataframe$bug_when[i+1]) 
    { 
    if(dataframe$added[i]=="RESOLVED") 
    { 
     df <- paste(dataframe$added[i],"-",dataframe$added[i+1]) 
     dataframe$added[i+1] <- df 
     dataframe <- dataframe[!(dataframe[i,])] 
    } 

    } 

} 

Any suggestions would be highly appreciated. Desired result:

bug_id bug_when   field  added 
1141327 2015-03-09 16:21:30 Resolution RESOLVED-DUPLICATE 
1142623 2015-03-24 18:15:22 Resolution RESOLVED-FIXED 
1143179 2015-07-30 09:37:56 Resolution RESOLVED-FIXED 

Can you add the result you want for the example data you provided?

@PLapointe I have added the desired result. – user2293224

Answers

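Here is how to do this with dplyr. Basically, within each bug_id/bug_when group, whenever the previous row (t-1) has added equal to "RESOLVED", that value and the current row's added value are concatenated with paste. Then filter keeps only the rows whose field is "Resolution".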

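library(dplyr) 

df %>% 
    # rows that share bug_id and bug_when form one group 
    group_by(bug_id, bug_when) %>% 
    # prefix `added` with the previous row's value when that value is "RESOLVED" 
    mutate(added = ifelse(!is.na(lag(added)) & lag(added) == "RESOLVED", 
                          paste(lag(added), added, sep = "-"), 
                          added)) %>% 
    # keep only the merged rows 
    filter(field == "Resolution") 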

    bug_id   bug_when  field    added 
    <int>    <chr>  <chr>    <chr> 
1 1141327 2015-03-09 16:21:30 Resolution RESOLVED-DUPLICATE 
2 1142623 2015-03-24 18:15:22 Resolution  RESOLVED-FIXED 
3 1143179 2015-07-30 09:37:56 Resolution  RESOLVED-FIXED 

Data

df <- read.table(text="bug_id bug_when   field  added 
1141327 '2015-03-09 16:21:30' Status  RESOLVED 
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE 
1142623 '2015-03-24 18:15:22' Status  RESOLVED 
1142623 '2015-03-24 18:15:22' Resolution FIXED 
1143179 '2015-07-30 09:37:56' Status  RESOLVED 
1143179 '2015-07-30 09:37:56' Resolution FIXED", 
       header=TRUE,stringsAsFactors=FALSE) 

I think you want to combine aggregate and paste, like this:

df <- read.table(text="bug_id bug_when   field  added 
1141327 '2015-03-09 16:21:30' Status  RESOLVED 
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE 
1142623 '2015-03-24 18:15:22' Status  RESOLVED 
1142623 '2015-03-24 18:15:22' Resolution FIXED 
1143179 '2015-07-30 09:37:56' Status  RESOLVED 
1143179 '2015-07-30 09:37:56' Resolution FIXED",stringsAsFactors = FALSE,header=TRUE) 

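# collapse `added` within each bug_id/bug_when pair, joining the values with "-" 
df2 <- aggregate(added ~ bug_id + bug_when, data = df, FUN = paste, collapse = "-") 
# every merged row is a Resolution row in this data, so the field is set by hand 
df2$field <- "Resolution" 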

# bug_id   bug_when    added  field 
# 1 1141327 2015-03-09 16:21:30 RESOLVED-DUPLICATE Resolution 
# 2 1142623 2015-03-24 18:15:22  RESOLVED-FIXED Resolution 
# 3 1143179 2015-07-30 09:37:56  RESOLVED-FIXED Resolution
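
For comparison, the rule from the question (same bug_id and bug_when in rows i and i + 1, and "RESOLVED" in row i) can also be applied directly in base R without a loop. This is only a minimal sketch, not a definitive implementation: the names `hit`, `first` and `merged` are made up for illustration, and it assumes the matching pairs do not overlap (a "RESOLVED" row is never itself the second row of another matching pair).

```r
# Sample data copied from the question; stringsAsFactors = FALSE keeps
# `added` as a character column so new strings can be assigned to it.
df <- read.table(text = "bug_id bug_when field added
1141327 '2015-03-09 16:21:30' Status RESOLVED
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE
1142623 '2015-03-24 18:15:22' Status RESOLVED
1142623 '2015-03-24 18:15:22' Resolution FIXED
1143179 '2015-07-30 09:37:56' Status RESOLVED
1143179 '2015-07-30 09:37:56' Resolution FIXED",
                 header = TRUE, stringsAsFactors = FALSE)

# Indices of every row that has a successor (empty when there are 0 or 1 rows).
i <- seq_len(max(nrow(df) - 1, 0))

# TRUE where row i and row i + 1 share both keys and row i is "RESOLVED".
hit <- df$bug_id[i] == df$bug_id[i + 1] &
  df$bug_when[i] == df$bug_when[i + 1] &
  df$added[i] == "RESOLVED"

# Positions of the "RESOLVED" rows to merge into the next row; which() drops NAs.
first <- which(hit)

# Work on a copy (hypothetical name) so the original data stays available.
merged <- df
merged$added[first + 1] <- paste(df$added[first], df$added[first + 1], sep = "-")

# Drop the "RESOLVED" rows only if there are any; merged[-integer(0), ]
# would return zero rows instead of leaving the data unchanged.
if (length(first) > 0) merged <- merged[-first, ]
rownames(merged) <- NULL

merged
```

Unlike the aggregate version, this keeps the original field value of row i + 1 and leaves rows that do not match the rule untouched. Removing all the rows once at the end also avoids the index shifting that happens when rows are deleted inside a for loop.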