2017-10-11 69 views

R: group values into intervals and average the result for each interval

I have two tables.

Table 1:

Dates_only <- data.frame(ID=c('1118','1118','1118','1118','1118', 
           '1118','1118','1118','1119','1119', 
           '1119','1119','1119','1119','1119', 
           '1119','13PP','13PP','13PP','13PP', 
           '13PP','13PP','13PP','13PP'), 
          Quart_y=c('2017Q3','2017Q4','2018Q1','2018Q2', 
             '2018Q3','2018Q4','2019Q1','2019Q2', 
             '2017Q3','2017Q4','2018Q1','2018Q2', 
             '2018Q3','2018Q4','2019Q1','2019Q2', 
             '2017Q3','2017Q4','2018Q1','2018Q2', 
             '2018Q3','2018Q4','2019Q1','2019Q2'), 
          Quart=c(0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00, 
            0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00, 
            0.25,0.50,0.75,1.00,1.25,1.50,1.75,2.00)) 

and Table 2:

Values <- data.frame(ID=c('1118','1119','13PP','1118','1119','13PP', 
          '1118','1119','13PP','1118','1119','13PP', 
          '1118','1119','13PP','1118','1119','13PP', 
          '1118','1119','13PP','1118','1119','13PP', 
          '1118','1119','13PP','1118','1119','13PP'), 
        Day=c(0,0,0,0.14,0.13,0.13,0.2,0.23,0.24,0.27,0.28, 
          0.32,0.32,0.32,0.44,0.47,0.49,0.49,0.59,0.64, 
          0.61,0.72,0.71,0.73,0.95,0.86,0.78,1.1,0.93,1.15), 
        Value=c(7.6,6.2,6.8,7.1,6.2,5.9,6.8,5.8,4.6,6.5,5.4, 
          4.2,6.3,4.8,4,6,4.3,3.8,5.9,4,3.6,5.6,3.8, 
          3.4,5.4,3.2,3,5,2.9,2.9)) 

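What I am trying to do is find a way to change the values in Values$Day based on Dates_only$Quart. Specifically, Dates_only$Quart represents the quarters as numbers (2017Q3 - 0.25, 2017Q4 - 0.50, ..., 2018Q4 - 1.50), and Values$Day represents the days as numbers on the same scale. I want Values$Day expressed in quarters instead, for example: for 0 <= Values$Day <= 0.25, Values$Day == 0.25; for 0.25 < Values$Day <= 0.50, Values$Day == 0.50.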

I tried the approach below, but it produces an error message:

unique_quarters <- unique(Dates_only$Quart) 
unique_quarters <- append(unique_quarters, 0, after=0) 
df3 <- transform(Dates_only, 
       Transf_Day=Values$Quart[findInterval(Values$Day, unique_quarters)]) 

I think the problem is that findInterval(Values$Day, unique_quarters) returns

1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 5 4 5 

while Values$Quart has the values

0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00 
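
A note on the attempt above: Values is defined with only the columns ID, Day and Value, so Values$Quart is NULL, and with the default left-closed intervals a Day of exactly 0.25 would land in the second bin instead of the first. A minimal sketch of one way around both points, assuming the rule stated in the question (each Day maps to the upper bound of its quarter). The names `day`, `quarters` and `quart` are illustrative, and `left.open` needs R 3.3.0 or later:

```r
# Illustrative values on the same scale as Values$Day
day <- c(0, 0.14, 0.25, 0.27, 0.5, 0.95, 1.1, 1.15)

# Quarter upper bounds, as in unique(Dates_only$Quart); no 0 is prepended
quarters <- seq(0.25, 2, by = 0.25)

# With left.open = TRUE the bins are (q[i], q[i + 1]], and anything
# <= q[1] returns 0, so adding 1 gives the position of the upper bound
idx <- findInterval(day, quarters, left.open = TRUE) + 1

# Days above the last quarter index past the end and become NA
quart <- quarters[idx]
print(quart)
```

With the real data, `day` would be Values$Day and `quarters` would be unique(Dates_only$Quart); the result has one entry per row of Values, so it belongs as a new column of Values rather than of Dates_only.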

Try `cut(Values$Day, seq(0,3,0.25), include.lowest = T)` – Jimbou

Thanks, but that does not really help, because I want to extract numbers rather than intervals. Thank you for the effort! – Jespar
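
On getting numbers rather than intervals out of cut(): a small sketch, assuming the same breaks as in the comment above. With `labels = FALSE`, cut() returns the integer position of each bin instead of a factor of interval labels, and that position can be used to look up the upper break. The names `day`, `breaks`, `bin` and `quart` are illustrative:

```r
# Illustrative values on the same scale as Values$Day
day <- c(0, 0.13, 0.24, 0.27, 0.49, 0.78, 1.15)

breaks <- seq(0, 3, by = 0.25)

# labels = FALSE yields integer bin codes: 1 for [0, 0.25], 2 for (0.25, 0.5], ...
bin <- cut(day, breaks, include.lowest = TRUE, labels = FALSE)

# The upper bound of bin i is breaks[i + 1]
quart <- breaks[bin + 1]
print(quart)
```

Values outside the range of `breaks` get a bin code of NA, so they stay NA in `quart` as well.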

Answer

Try this:

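library(tidyverse)
# Bin Day into quarter-wide intervals, then relabel each bin by its upper bound
as_tibble(Values) %>%
    mutate(Int = cut(Day, seq(0, 3, 0.25), include.lowest = TRUE)) %>%
    # factor() drops the unused levels, leaving the 5 bins that occur in this data
    mutate(Int2 = factor(Int, labels = seq(0.25, 1.25, 0.25)))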
# A tibble: 30 x 5
       ID   Day Value        Int   Int2
   <fctr> <dbl> <dbl>     <fctr> <fctr>
 1   1118  0.00   7.6   [0,0.25]   0.25
 2   1119  0.00   6.2   [0,0.25]   0.25
 3   13PP  0.00   6.8   [0,0.25]   0.25
 4   1118  0.14   7.1   [0,0.25]   0.25
 5   1119  0.13   6.2   [0,0.25]   0.25
 6   13PP  0.13   5.9   [0,0.25]   0.25
 7   1118  0.20   6.8   [0,0.25]   0.25
 8   1119  0.23   5.8   [0,0.25]   0.25
 9   13PP  0.24   4.6   [0,0.25]   0.25
10   1118  0.27   6.5 (0.25,0.5]    0.5
# ... with 20 more rows
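
Int2 above is a factor, not a number, and the title of the question also asks for an average per interval. A hedged sketch of how both could be handled, written in base R rather than the tidyverse, on a small subset of the question's rows. The names `values`, `lookup`, `means` and `result` are illustrative, and passing all 12 labels to cut() is an assumption made here so that the relabelling does not depend on which bins happen to occur:

```r
# First five rows of IDs 1118 and 1119 from the question's Values table
values <- data.frame(
  ID    = rep(c("1118", "1119"), each = 5),
  Day   = c(0, 0.14, 0.2, 0.27, 0.32, 0, 0.13, 0.23, 0.28, 0.32),
  Value = c(7.6, 7.1, 6.8, 6.5, 6.3, 6.2, 6.2, 5.8, 5.4, 4.8)
)

# One label per bin (12 bins for breaks 0, 0.25, ..., 3)
bins <- cut(values$Day, breaks = seq(0, 3, 0.25), include.lowest = TRUE,
            labels = seq(0.25, 3, 0.25))

# A factor converts to its labels via as.character(), not as.numeric() alone
values$Quart <- as.numeric(as.character(bins))

# Mean Value per ID and quarter
means <- aggregate(Value ~ ID + Quart, data = values, FUN = mean)

# Quarter names for the two quarters present here, as in Dates_only
lookup <- data.frame(Quart = c(0.25, 0.5), Quart_y = c("2017Q3", "2017Q4"))
result <- merge(means, lookup, by = "Quart")
print(result)
```

With the full tables, `lookup` could be replaced by unique(Dates_only[, c("Quart", "Quart_y")]).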