2017-08-31 61 views
0

How do I extract all rows if the levels in one column contain all levels of another column in R?

I have the following data:

ID      INDUSTRY      PRODUCT
625109  PersonalCare  Neolone Preservatives
199672  PersonalCare  Neolone Preservatives
227047  Pharma        Optiphen
186117  Food          Sasol BHT
625109  PersonalCare  Optiphen
227047  Food          Neolone Preservatives

I want to extract the rows for every ID that has both of the products Neolone Preservatives and Optiphen.

Expected result

ID      INDUSTRY      PRODUCT
625109  PersonalCare  Neolone Preservatives
227047  Pharma        Optiphen
625109  PersonalCare  Optiphen
227047  Food          Neolone Preservatives

IDs 625109 and 227047 each have both products, so their rows are extracted. How can I do this in R?
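To make the question reproducible, the table above can be rebuilt as a data frame roughly as follows. This is a minimal sketch: the name `df` matches the answers below, while `stringsAsFactors = FALSE` and the helper name `products_by_id` are assumptions added for illustration. Splitting PRODUCT by ID shows which products each ID carries.

```r
# Rebuild the question's table (base R only)
df <- data.frame(
  ID = c(625109, 199672, 227047, 186117, 625109, 227047),
  INDUSTRY = c("PersonalCare", "PersonalCare", "Pharma",
               "Food", "PersonalCare", "Food"),
  PRODUCT = c("Neolone Preservatives", "Neolone Preservatives", "Optiphen",
              "Sasol BHT", "Optiphen", "Neolone Preservatives"),
  stringsAsFactors = FALSE  # assumption: keep text columns as character
)

# List the products recorded for each ID (hypothetical helper name)
products_by_id <- split(df$PRODUCT, df$ID)
print(products_by_id)
```

An ID qualifies when its entry in `products_by_id` contains both product names.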

Answers

2

There are several ways to do this:

With dplyr:

library(dplyr)

df %>%
    group_by(ID) %>%
    filter(all(c("Neolone Preservatives", "Optiphen") %in% PRODUCT))


#  ID  INDUSTRY    PRODUCT 
# <int>  <chr>     <chr> 
#1 625109 PersonalCare Neolone Preservatives 
#2 227047  Pharma    Optiphen 
#3 625109 PersonalCare    Optiphen 
#4 227047   Food Neolone Preservatives 

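In base R:

# ave() writes the per-ID logical back into a character vector,
# so the result is compared against the string "TRUE"
df[ave(as.character(df$PRODUCT), df$ID, FUN = function(x)
       all(c("Neolone Preservatives", "Optiphen") %in% x)) == "TRUE", ]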
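If comparing against the string "TRUE" feels fragile, the same per-ID test can be sketched with `tapply`, which keeps the result logical. This is an alternative sketch, not the answer's own method: the names `wanted`, `has_all`, `keep_ids`, and `result` are hypothetical, and the data frame is rebuilt from the question so the block runs on its own.

```r
# Data from the question, rebuilt so the sketch is self-contained
df <- data.frame(
  ID = c(625109, 199672, 227047, 186117, 625109, 227047),
  INDUSTRY = c("PersonalCare", "PersonalCare", "Pharma",
               "Food", "PersonalCare", "Food"),
  PRODUCT = c("Neolone Preservatives", "Neolone Preservatives", "Optiphen",
              "Sasol BHT", "Optiphen", "Neolone Preservatives"),
  stringsAsFactors = FALSE
)

wanted <- c("Neolone Preservatives", "Optiphen")  # products every kept ID must have

# One logical value per ID: TRUE when that ID's products cover all of `wanted`
has_all <- tapply(df$PRODUCT, df$ID, function(x) all(wanted %in% x))

# tapply names its result by ID (as text), so match on the character form
keep_ids <- names(has_all)[has_all]
result <- df[as.character(df$ID) %in% keep_ids, ]
print(result)
```

Changing `wanted` is enough to test for a different set of products.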

Thank you very much. – Rini

1

This should work:

library(dplyr) 

df <- data.frame(ID = c(62, 19, 22, 18, 62, 22), 
       INDUSTRY = c("PC", "PC", "P", "F", "PC", "F"), 
       PRODUCT = c("NP", "NP", "O", "SB", "O", "NP")) 

df %>% 
    group_by(ID) %>% 
    filter(any(PRODUCT %in% c("NP")) & any(PRODUCT %in% c("O")))

# A tibble: 4 x 3 
# Groups: ID [2] 
#      ID INDUSTRY PRODUCT
#   <dbl>   <fctr>  <fctr>
# 1    62       PC      NP
# 2    22        P       O
# 3    62       PC       O
# 4    22        F      NP
0

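You can do this with the dplyr library:

# Keeps rows where INDUSTRY is 'PersonalCare' and PRODUCT is 'Optiphen'
filteredData <- data %>%
    filter(INDUSTRY == 'PersonalCare', PRODUCT == 'Optiphen')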
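For comparison, here is a base-R sketch of the same row-wise condition next to a per-ID check, run on the question's table. It assumes `data` holds that table; the names `row_wise`, `wanted`, `ids`, `ok`, and `group_wise` are hypothetical. A row-wise condition only looks at one row at a time, whereas the question asks about all products recorded for an ID.

```r
# Assumption: `data` is the table from the question
data <- data.frame(
  ID = c(625109, 199672, 227047, 186117, 625109, 227047),
  INDUSTRY = c("PersonalCare", "PersonalCare", "Pharma",
               "Food", "PersonalCare", "Food"),
  PRODUCT = c("Neolone Preservatives", "Neolone Preservatives", "Optiphen",
              "Sasol BHT", "Optiphen", "Neolone Preservatives"),
  stringsAsFactors = FALSE
)

# Row-wise: both conditions must hold on the same row
row_wise <- data[data$INDUSTRY == "PersonalCare" & data$PRODUCT == "Optiphen", ]

# Per-ID: keep every row of an ID whose products include all of `wanted`
wanted <- c("Neolone Preservatives", "Optiphen")
ids <- unique(data$ID)
ok <- vapply(ids,
             function(id) all(wanted %in% data$PRODUCT[data$ID == id]),
             logical(1))
group_wise <- data[data$ID %in% ids[ok], ]

print(nrow(row_wise))
print(nrow(group_wise))
```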