2017-08-07 61 views
0

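I have a data frame like the one below. How can I split each deviceid group by date, and then sort the rows by time within each group?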

deviceid  date       speed 
325   2016/09/12 07:55:40    50 
325   2016/09/12 08:55:40    90 
325   2016/09/13 06:55:40    40 
325   2016/09/13 09:55:40    90 
325   2016/09/13 08:55:40    69 
325   2016/09/14 08:55:40    99 
5525   2016/09/12 09:55:40    60 
5525   2016/09/12 06:55:40    90 
5525   2016/09/15 03:55:40    63 
4325   2016/09/12 08:55:40    99 
4325   2016/09/12 07:55:40    30 
4325   2016/09/14 10:55:40    70 

I want to change it to look like this:

deviceid    date      speed 
325_12   2016/09/12 07:55:40    50 
325_12   2016/09/12 08:55:40    90 
325_13   2016/09/13 06:55:40    40 
325_13   2016/09/13 08:55:40    69 
325_13   2016/09/13 09:55:40    90 
325_14   2016/09/14 08:55:40    99 
5525_12   2016/09/12 06:55:40    90 
5525_12   2016/09/12 09:55:40    60 
5525_15   2016/09/15 03:55:40    63 
4325_12   2016/09/12 07:55:40    30 
4325_12   2016/09/12 08:55:40    99 
4325_14   2016/09/14 10:55:40    70 

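The main reason for doing this is that afterwards I want to sort by time within each group, separately for each date. So the output should look like the table above.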

Answers

2

You can do this with paste and gsub:

df$deviceid = paste(df$deviceid,gsub("\\d+/\\d+/(\\d+).*","\\1",df$date),sep="_") 
    deviceid    date speed 
1 325_12 2016/09/12 07:55:40 50 
2 325_12 2016/09/12 08:55:40 90 
3 325_13 2016/09/13 06:55:40 40 
4 325_13 2016/09/13 09:55:40 90 
5 325_13 2016/09/13 08:55:40 69 
6 325_14 2016/09/14 08:55:40 99 
7 5525_12 2016/09/12 09:55:40 60 
8 5525_12 2016/09/12 06:55:40 90 
9 5525_15 2016/09/15 03:55:40 63 
10 4325_12 2016/09/12 08:55:40 99 
11 4325_12 2016/09/12 07:55:40 30 
12 4325_14 2016/09/14 10:55:40 70 
+2

You can use 'library(lubridate); library(dplyr); df %>% mutate(deviceid = sprintf('%d_%d', deviceid, day(ymd_hms(date))))'. For date-time columns, it is better to avoid regular expressions. – akrun

3

We can extract the day of the month with format and paste it onto deviceid:

paste(df$deviceid, format(as.POSIXct(df$date), "%d"), sep = "_") 

#[1] "325_12" "325_12" "325_13" "325_13" "325_13" "325_14" "5525_12" 
#[8] "5525_12" "5525_15" "4325_12" "4325_12" "4325_14" 
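
The vector above only builds the new key; the question also asks for the rows to be ordered by time within each device and day. A minimal base R sketch of that step follows. The explicit format string, tz = "UTC", and the helper names stamp, device_rank, key and idx are assumptions made for illustration, not part of the original answer.

```r
# Sample data copied from the question
df <- data.frame(
  deviceid = c(325, 325, 325, 325, 325, 325, 5525, 5525, 5525, 4325, 4325, 4325),
  date = c("2016/09/12 07:55:40", "2016/09/12 08:55:40", "2016/09/13 06:55:40",
           "2016/09/13 09:55:40", "2016/09/13 08:55:40", "2016/09/14 08:55:40",
           "2016/09/12 09:55:40", "2016/09/12 06:55:40", "2016/09/15 03:55:40",
           "2016/09/12 08:55:40", "2016/09/12 07:55:40", "2016/09/14 10:55:40"),
  speed = c(50, 90, 40, 90, 69, 99, 60, 90, 63, 99, 30, 70),
  stringsAsFactors = FALSE
)

# Parse the timestamps once (format and time zone are assumptions)
stamp <- as.POSIXct(df$date, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")

# Rank devices by first appearance so 325, 5525, 4325 keep their original order
device_rank <- match(df$deviceid, unique(df$deviceid))

# Build the "<deviceid>_<day>" key from the parsed timestamps
key <- paste(df$deviceid, format(stamp, "%d"), sep = "_")

# Order by device, then chronologically (this also orders the days)
idx <- order(device_rank, stamp)
sorted <- df[idx, ]
sorted$deviceid <- key[idx]
rownames(sorted) <- NULL

print(sorted)
```

Ranking by first appearance is a design choice: sorting on the pasted key alone would order the groups alphabetically as text rather than in the order the devices appear in the data.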
0

The same result, written with pipes to fit into your workflow:

library(lubridate)
library(tidyverse)

df <- data.frame(
  deviceid = c(325, 325, 325, 325, 325, 325, 5525, 5525, 5525, 4325, 4325, 4325),
  date = c("2016/09/12 07:55:40", "2016/09/12 08:55:40",
           "2016/09/13 06:55:40", "2016/09/13 09:55:40",
           "2016/09/13 08:55:40", "2016/09/14 08:55:40", "2016/09/12 09:55:40",
           "2016/09/12 06:55:40", "2016/09/15 03:55:40",
           "2016/09/12 08:55:40", "2016/09/12 07:55:40", "2016/09/14 10:55:40"),
  speed = c(50, 90, 40, 90, 69, 99, 60, 90, 63, 99, 30, 70)
)

df$date <- ymd_hms(df$date) # parse the strings into date-times with lubridate

df %>%
  # append the two-digit day of the month ("%d"), not the year, to deviceid
  mutate(deviceid = paste(deviceid, format(date, "%d"), sep = "_"))
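
The question also asks for the rows to be sorted by time within each device and day. The following is a sketch of a complete pipeline, assuming dplyr and lubridate are installed. The column device_rank is a hypothetical helper, introduced here only to keep the devices in their order of first appearance, and is dropped at the end.

```r
library(dplyr)
library(lubridate)

# Sample data copied from the question
df <- data.frame(
  deviceid = c(325, 325, 325, 325, 325, 325, 5525, 5525, 5525, 4325, 4325, 4325),
  date = c("2016/09/12 07:55:40", "2016/09/12 08:55:40", "2016/09/13 06:55:40",
           "2016/09/13 09:55:40", "2016/09/13 08:55:40", "2016/09/14 08:55:40",
           "2016/09/12 09:55:40", "2016/09/12 06:55:40", "2016/09/15 03:55:40",
           "2016/09/12 08:55:40", "2016/09/12 07:55:40", "2016/09/14 10:55:40"),
  speed = c(50, 90, 40, 90, 69, 99, 60, 90, 63, 99, 30, 70),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    date = ymd_hms(date),                             # parse to date-time (UTC by default)
    device_rank = match(deviceid, unique(deviceid)),  # hypothetical helper: first-appearance order
    deviceid = sprintf("%d_%02d", as.integer(deviceid), day(date))  # e.g. "325_12"
  ) %>%
  arrange(device_rank, date) %>%                      # device first, then chronological
  select(-device_rank)

print(result)
```

Because the rows are arranged on the parsed date-time rather than on the text key, days and times sort chronologically within each device.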
