2017-02-04 89 views
0

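dplyr: merging two datasets, finished-goods sales and bill of materials

I am analysing my company's raw material requirements. My approach is to combine the sales records of our finished goods with the bill of materials for each finished good. The problem is that every finished good is made of several components, and many finished goods share common components. I want to keep every individual sales record for each finished good and multiply FG_UnitsSold by the per-unit quantity of each component to get the raw material demand. Here is some sample code: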

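library(dplyr)

# Sales records: one row per order of a finished good
fg_Sales <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 2),
                   Order_Date    = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold  = c(100, 200, 300, 400, 500, 600))

# Bill of materials: four components per finished good, with a random quantity per unit
bill_materials <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 4),
                         Components    = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
                         Qty           = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)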

I am familiar with dplyr's left_join, but it does not seem to work here: it always gives me only the first component of each finished good.

Can anyone help? Thanks.

Answers

0

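Maybe I have misunderstood the question, but if you group both data frames by FG_PartNumber and build a summary (pivot) table of the quantity you are interested in for each, you can get the totals you are looking for: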

# Create data (dplyr must be loaded before tibble() and %>% are used)
library(dplyr)
set.seed(1)

fg_Sales <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 2),
                   Order_Date    = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold  = c(100, 200, 300, 400, 500, 600))

bill_materials <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 4),
                         Components    = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
                         Qty           = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)

# Make pivot (summary) tables for sales and quantity
tot_sales <- fg_Sales %>%
  group_by(FG_PartNumber) %>%
  summarise(tot_sales = sum(FG_UnitsSold))

tot_materials <- bill_materials %>%
  group_by(FG_PartNumber) %>%
  summarise(tot_qty = sum(Qty))

# Join the pivot tables together on the shared key
df <- left_join(tot_sales, tot_materials, by = "FG_PartNumber")

> df
# A tibble: 3 × 3
  FG_PartNumber tot_sales  tot_qty
          <chr>     <dbl>    <dbl>
1             A       500 13.15087
2             B       700 14.76326
3             C       900 11.30953
0

I think inner_join from dplyr is the best option here:

library(dplyr)
fg_Sales_ext <- inner_join(x = fg_Sales,
                           y = bill_materials,
                           by = "FG_PartNumber")

From the inner_join documentation: "If there are multiple matches between x and y, all combination of the matches are returned."
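To make the "all combinations" behaviour concrete, here is a minimal sketch on toy data. The tables `sales` and `bom` and their values are made up for illustration, and the `relationship` argument assumes dplyr 1.1.0 or later (on older versions it can simply be dropped). Note that left_join pairs rows the same way, so in this sketch it returns the same number of rows.

```r
library(dplyr)

# Toy data (hypothetical values): part "A" was sold twice and has two components
sales <- tibble(FG_PartNumber = c("A", "A", "B"),
                FG_UnitsSold  = c(100, 400, 200))
bom   <- tibble(FG_PartNumber = c("A", "A", "B"),
                Components    = c("C1", "C4", "C2"),
                Qty           = c(2, 1, 3))

# Every sales row is paired with every component row of the same part.
# `relationship` assumes dplyr >= 1.1.0; it silences the many-to-many warning.
joined <- inner_join(sales, bom, by = "FG_PartNumber",
                     relationship = "many-to-many")

# left_join keeps all matches as well, not just the first one
joined_left <- left_join(sales, bom, by = "FG_PartNumber",
                         relationship = "many-to-many")

nrow(joined)       # 2 sales of A x 2 components + 1 sale of B x 1 component = 5
nrow(joined_left)  # also 5, because every sales row has at least one match
```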

With fg_Sales_ext you can now use group_by and summarise to perform any kind of analysis.
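As one possible follow-up, the sketch below multiplies units sold by the per-unit component quantity and then totals the demand per component. The toy tables and the names `Units_Required`, `Total_Required` and `component_demand` are hypothetical, chosen only for this example, and `relationship` again assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy data (hypothetical values): component "C7" is shared by parts A and B
sales <- tibble(FG_PartNumber = c("A", "B", "A"),
                FG_UnitsSold  = c(100, 200, 400))
bom   <- tibble(FG_PartNumber = c("A", "A", "B", "B"),
                Components    = c("C1", "C7", "C2", "C7"),
                Qty           = c(2, 1, 3, 0.5))

component_demand <- sales %>%
  # keep every sales record, one row per (order, component) pair
  inner_join(bom, by = "FG_PartNumber", relationship = "many-to-many") %>%
  # demand generated by each order for each component
  mutate(Units_Required = FG_UnitsSold * Qty) %>%
  # total demand per component across all finished goods
  group_by(Components) %>%
  summarise(Total_Required = sum(Units_Required))

component_demand
```

Grouping by Components rather than FG_PartNumber is what lets a shared component such as C7 accumulate demand from several finished goods; adding Order_Date (or a period derived from it) to group_by would give demand over time instead.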

+0

Hi evgeniC, this is exactly what I needed. Thanks for your help!