2017-02-17 27 views

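In Google BigQuery, I am trying to pair each event's start time with an end time, where the end time is defined as the latest datetime whose event type differs from the start row's event type. How can I match dates based on a secondary field?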

Here is an example to illustrate my problem:

Original dataset:

Name  Event    Event Type  Datetime
****  *******  **********  ****************
Bob   Tennis   Start       2017-02-17 8:00
Bob   Tennis   Playing     2017-02-17 8:10
Bob   Tennis   Playing     2017-02-17 8:20
Bob   Tennis   Playing     2017-02-17 8:30
Bob   Tennis   Playing     2017-02-17 8:50
Bob   Tennis   Start       2017-02-17 10:00
Bob   Tennis   Playing     2017-02-17 10:30
Bob   Bowling  Start       2017-02-18 2:15
Bob   Bowling  Playing     2017-02-18 2:18

Desired table:

Name  Event    Start Datetime    End Datetime
****  *******  ****************  ****************
Bob   Tennis   2017-02-17 8:00   2017-02-17 8:50
Bob   Tennis   2017-02-17 10:00  2017-02-17 10:30
Bob   Bowling  2017-02-18 2:15   2017-02-18 2:18

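I know the solution must involve PARTITION BY and MAX, but I'm not sure how to find the maximum datetime among rows whose event type does not match that of the row in question.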

Answer


Try the following; it should give you the idea:

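#standardSQL
-- Outer query: one row per (Name, Event, grp), spanning first to last DateTime
SELECT Name, Event, MIN(DateTime) AS StartDateTime, MAX(DateTime) AS EndDateTime
FROM (
    SELECT Name, Event, EventType, DateTime,
    -- Running count of 'Start' rows: every row up to the next 'Start' shares one grp
    COUNTIF(EventType = 'Start') OVER(PARTITION BY Name, Event ORDER BY DateTime) AS grp
    FROM yourTable
)
GROUP BY Name, Event, grp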

You can test it with the dummy data below:

#standardSQL
WITH yourTable AS (
    SELECT 'Bob' AS Name, 'Tennis' AS Event, 'Start' AS EventType, '2017-02-17 08:00' AS DateTime UNION ALL
    SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:10' UNION ALL
    SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:20' UNION ALL
    SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:30' UNION ALL
    SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 08:50' UNION ALL
    SELECT 'Bob', 'Tennis', 'Start', '2017-02-17 10:00' UNION ALL
    SELECT 'Bob', 'Tennis', 'Playing', '2017-02-17 10:30' UNION ALL
    SELECT 'Bob', 'Bowling', 'Start', '2017-02-18 02:15' UNION ALL
    SELECT 'Bob', 'Bowling', 'Playing', '2017-02-18 02:18'
)
SELECT Name, Event, MIN(DateTime) AS StartDateTime, MAX(DateTime) AS EndDateTime
FROM (
    SELECT Name, Event, EventType, DateTime,
    COUNTIF(EventType = 'Start') OVER(PARTITION BY Name, Event ORDER BY DateTime) AS grp
    FROM yourTable
)
GROUP BY Name, Event, grp
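To see why the running count of 'Start' rows works as a group key, the same logic can be traced outside BigQuery. The following is only a rough Python sketch of the idea, not a replacement for the query: the function name `sessions` and the tuple layout are made up for illustration, and it assumes the datetimes are zero-padded strings (as in the dummy data), so that string order matches chronological order.

```python
# Dummy rows mirroring the test data: (Name, Event, EventType, DateTime)
rows = [
    ("Bob", "Tennis", "Start", "2017-02-17 08:00"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:10"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:20"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:30"),
    ("Bob", "Tennis", "Playing", "2017-02-17 08:50"),
    ("Bob", "Tennis", "Start", "2017-02-17 10:00"),
    ("Bob", "Tennis", "Playing", "2017-02-17 10:30"),
    ("Bob", "Bowling", "Start", "2017-02-18 02:15"),
    ("Bob", "Bowling", "Playing", "2017-02-18 02:18"),
]


def sessions(rows):
    """Hypothetical helper: group rows by a running count of 'Start' events."""
    starts_seen = {}  # (name, event) -> number of 'Start' rows seen so far
    spans = {}        # (name, event, grp) -> [earliest, latest] datetime
    # Sorting by partition key, then datetime, plays the role of
    # PARTITION BY Name, Event ORDER BY DateTime.
    for name, event, event_type, dt in sorted(rows, key=lambda r: (r[0], r[1], r[3])):
        key = (name, event)
        # Equivalent of COUNTIF(EventType = 'Start') over the rows so far
        starts_seen[key] = starts_seen.get(key, 0) + (event_type == "Start")
        group = key + (starts_seen[key],)
        if group not in spans:
            spans[group] = [dt, dt]
        else:
            # Equivalent of MIN(DateTime) / MAX(DateTime) per group
            spans[group][0] = min(spans[group][0], dt)
            spans[group][1] = max(spans[group][1], dt)
    return [(n, e, lo, hi) for (n, e, _), (lo, hi) in spans.items()]


result = sessions(rows)
for row in result:
    print(row)
```

Each 'Start' row bumps the counter, so all following 'Playing' rows inherit the same counter value until the next 'Start' appears; grouping on that value is what separates the two Tennis sessions.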

Very clever solution! –


Incredible! All hail the SQL master Mikhail! Thank you very much! – dnaeye