Replace values in a data frame column only when two other column values match

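Suppose the following data is only a small part of the very large data set I am working with. I want to replace values in one column only when the values in two other columns match given conditions.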

mydf<-data.frame(Date=as.Date(c("2015-01-01","2015-01-10","2015-01-27","2015-02-27","2015-03-15","2015-04-17","2015-04-18")),Expense=c(1566,5646,3456,6546,5313,6466,5456),Details=c('item101 xsda','fuel asa','item102a','fuel asa','fuel sda','fuel','item102a'),Vehicle=c('Car','Bike','Car','Car','Bike','Bike','Bike'),Person=c('John','Smith','Robin',rep(NA,3),'Robin')) 

Date   Expense  Details  Vehicle Person 
1 2015-01-01 1566  item101 xsda Car  John 
2 2015-01-10 5646  fuel asa  Bike  Smith 
3 2015-01-27 3456  item102a  Car  Robin 
4 2015-02-27 6546  fuel asa  Car  <NA> 
5 2015-03-15 5313  fuel sda  Bike  <NA> 
6 2015-04-17 6466  fuel   Bike  <NA> 
7 2015-04-18 5456  item102a  Bike  Robin

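There are two rules to apply:

1) When the Vehicle is "Car" and "fuel" was purchased, the Person is John.

2) When the Vehicle is "Bike" and "fuel" was purchased, the Person is Smith.

My desired output is: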

 Date  Expense Details  Vehicle Person 
1 2015-01-01 1566 item101 xsda  Car  John 
2 2015-01-10 5646 fuel    Bike  Smith 
3 2015-01-27 3456 item102a   Car  Robin 
4 2015-02-27 6546 fuel    Car  John 
5 2015-03-15 5313 fuel sda   Bike  Smith 
6 2015-04-17 6466 fuel    Bike  Smith 
7 2015-04-18 5456 item102a   Bike  Robin

Please tell me how to solve this. I used the steps below and got halfway to a solution:

mydf$Details<-as.character(mydf$Details) 
mydf$Details[grepl('fuel',mydf$Details,ignore.case=TRUE)]<-'Fuel'

This gives mydf:

Date  Expense  Details  Vehicle Person 
1 2015-01-01 1566  item101 xsda Car  John 
2 2015-01-10 5646  Fuel   Bike  Smith 
3 2015-01-27 3456  item102a  Car  Robin 
4 2015-02-27 6546  Fuel   Car  <NA> 
5 2015-03-15 5313  Fuel   Bike  <NA> 
6 2015-04-17 6466  Fuel   Bike  <NA> 
7 2015-04-18 5456  item102a  Bike  Robin

Note: this only gets me halfway. If possible, please avoid loops. If there is a better and faster way to do this, please share it.

Source

2016-03-28 learner

You are halfway there, as you said. After your step that sets Details to 'Fuel', try these two lines:

mydf$Person[mydf$Details=='Fuel' & mydf$Vehicle=='Car'] <- 'John' 
mydf$Person[mydf$Details=='Fuel' & mydf$Vehicle=='Bike'] <- 'Smith'
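
The two assignments above rely on Details having already been overwritten with 'Fuel'. As a minimal sketch (not part of the original answer), the same logical-index replacement can be done without touching Details by matching "fuel" with grepl. The helper vector is_fuel is a name introduced here for illustration, and stringsAsFactors = FALSE is an assumption made so that Person is a plain character column.

```r
# Rebuild the sample data from the question (character columns, not factors).
mydf <- data.frame(
  Date = as.Date(c("2015-01-01", "2015-01-10", "2015-01-27", "2015-02-27",
                   "2015-03-15", "2015-04-17", "2015-04-18")),
  Expense = c(1566, 5646, 3456, 6546, 5313, 6466, 5456),
  Details = c("item101 xsda", "fuel asa", "item102a", "fuel asa",
              "fuel sda", "fuel", "item102a"),
  Vehicle = c("Car", "Bike", "Car", "Car", "Bike", "Bike", "Bike"),
  Person = c("John", "Smith", "Robin", rep(NA, 3), "Robin"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose Details mention fuel, regardless of case.
is_fuel <- grepl("fuel", mydf$Details, ignore.case = TRUE)

# Vectorised replacement: no loop, Details is left unchanged.
mydf$Person[is_fuel & mydf$Vehicle == "Car"] <- "John"
mydf$Person[is_fuel & mydf$Vehicle == "Bike"] <- "Smith"
```

Both assignments are vectorised, so no explicit loop is needed.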

Source

2016-03-28 14:38:58

With data.table, you can do it in a few lines:

library(data.table) 

setDT(mydf) 

mydf[is.na(Person) & Details %like% "fuel" & Vehicle == "Car", Person := "John"] 
mydf[is.na(Person) & Details %like% "fuel" & Vehicle == "Bike", Person := "Smith"] 

mydf 
#>   Date Expense  Details Vehicle Person 
#> 1: 2015-01-01 1566 item101 xsda  Car John 
#> 2: 2015-01-10 5646  fuel asa Bike Smith 
#> 3: 2015-01-27 3456  item102a  Car Robin 
#> 4: 2015-02-27 6546  fuel asa  Car John 
#> 5: 2015-03-15 5313  fuel sda Bike Smith 
#> 6: 2015-04-17 6466   fuel Bike Smith 
#> 7: 2015-04-18 5456  item102a Bike Robin

With dplyr, you can also do a conditional mutate, but the code is longer. I use the stringr package for the string matching:

library(dplyr) 
library(stringr) 
mydf %>% 
  mutate(
    Person = ifelse(
      is.na(Person) & str_detect(Details, "fuel") & Vehicle == "Car", 
      "John", 
      ifelse(
        is.na(Person) & str_detect(Details, "fuel") & Vehicle == "Bike", 
        "Smith", 
        as.character(Person)
      )
    )
  ) 
#>   Date Expense  Details Vehicle Person 
#> 1 2015-01-01 1566 item101 xsda  Car John 
#> 2 2015-01-10 5646  fuel asa Bike Smith 
#> 3 2015-01-27 3456  item102a  Car Robin 
#> 4 2015-02-27 6546  fuel asa  Car John 
#> 5 2015-03-15 5313  fuel sda Bike Smith 
#> 6 2015-04-17 6466   fuel Bike Smith 
#> 7 2015-04-18 5456  item102a Bike Robin
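
The nested ifelse calls above can also be written with dplyr's case_when, which lists each condition next to its replacement. This is a sketch rather than the answerer's code: it assumes a dplyr version that provides case_when, assumes Person is a character column (hence stringsAsFactors = FALSE), and uses base grepl instead of stringr. The name result is introduced here for illustration.

```r
library(dplyr)

# Sample data from the question, with character columns.
mydf <- data.frame(
  Date = as.Date(c("2015-01-01", "2015-01-10", "2015-01-27", "2015-02-27",
                   "2015-03-15", "2015-04-17", "2015-04-18")),
  Expense = c(1566, 5646, 3456, 6546, 5313, 6466, 5456),
  Details = c("item101 xsda", "fuel asa", "item102a", "fuel asa",
              "fuel sda", "fuel", "item102a"),
  Vehicle = c("Car", "Bike", "Car", "Car", "Bike", "Bike", "Bike"),
  Person = c("John", "Smith", "Robin", rep(NA, 3), "Robin"),
  stringsAsFactors = FALSE
)

result <- mydf %>%
  mutate(
    # Computed once so both rules share the same fuel test.
    is_fuel = grepl("fuel", Details, ignore.case = TRUE),
    Person = case_when(
      is.na(Person) & is_fuel & Vehicle == "Car"  ~ "John",
      is.na(Person) & is_fuel & Vehicle == "Bike" ~ "Smith",
      TRUE ~ Person  # every other row keeps its current value
    )
  ) %>%
  select(-is_fuel)  # drop the helper column again
```

Conditions are checked in order, so the final TRUE branch acts as the fallback for rows that match neither rule.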

Source

2016-03-28 14:39:23 cderv

With *data.table*, a join + update would be more appropriate. – Arun

I'm not sure how to do that, then... – cderv
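
The comment does not show what the join + update would look like. One possible reading, offered only as a sketch: put the rules in a small lookup table and update Person by reference through a join. The names rules, kind and new_person are hypothetical and introduced here for illustration. As in the question's two rules, this version assigns the person for every matching row, not only rows where Person is NA.

```r
library(data.table)

# Sample data from the question as a data.table.
mydf <- data.table(
  Date = as.Date(c("2015-01-01", "2015-01-10", "2015-01-27", "2015-02-27",
                   "2015-03-15", "2015-04-17", "2015-04-18")),
  Expense = c(1566, 5646, 3456, 6546, 5313, 6466, 5456),
  Details = c("item101 xsda", "fuel asa", "item102a", "fuel asa",
              "fuel sda", "fuel", "item102a"),
  Vehicle = c("Car", "Bike", "Car", "Car", "Bike", "Bike", "Bike"),
  Person = c("John", "Smith", "Robin", rep(NA, 3), "Robin")
)

# Hypothetical lookup table: one row per (Vehicle, kind) rule.
rules <- data.table(
  Vehicle = c("Car", "Bike"),
  kind = "fuel",
  new_person = c("John", "Smith")
)

# Temporary join key: classify each row as "fuel" or "other".
mydf[, kind := ifelse(grepl("fuel", Details, ignore.case = TRUE), "fuel", "other")]

# Update join: rows of mydf that match a rule get that rule's person.
mydf[rules, on = c("Vehicle", "kind"), Person := i.new_person]

# Remove the temporary key.
mydf[, kind := NULL]
```

Adding another rule then means adding a row to rules rather than another line of code.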
