Self-joining a table with a condition

Table dummy:

e_n  t_s  item
a    t1   c
a    t2   c
a    t3   c
a    t4   c
b    p1   c
b    p2   c
b    p3   c
b    p4   c

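t1, t2, t3, t4, p1, p2, p3, p4 are timestamps in ascending order. t1, t2, t3, t4 are the ascending timestamps of event_name 'a'. p1, p2, p3, p4 are the ascending timestamps of event_name 'b'.

c is the item_number for which the events 'a' and 'b' occurred.

I am trying to write a query whose result should look like this: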

e_n1  e_n2  item  t_s_1  t_s_2
a     b     c     t1     p1
a     b     c     t2     p2
a     b     c     t3     p3
a     b     c     t4     p4

I have tried the following code:

select l.e_n as e_n_1, m.e_n as e_n_2, l.item, l.t_s as t_s_a,
       m.t_s as t_s_b
from (select * from dummy where e_n = 'a') l
join (select * from dummy where e_n = 'b') m
  on l.item = m.item and l.t_s < m.t_s

The join on l.item = m.item is needed because there are many other items (c1, c2, c3) with the same structure.

The result is:

e_n1  e_n2  item  t_s_a  t_s_b
a     b     c     t1     p1
a     b     c     t1     p2
a     b     c     t1     p3
a     b     c     t1     p4
a     b     c     t2     p1
a     b     c     t2     p2
a     b     c     t2     p3

and so on.
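To see why the inequality join returns so many rows, here is a minimal Python sketch of the same join condition. The integers 1..4 and 5..8 standing in for t1..t4 and p1..p4 are assumptions made for illustration; they are not values from the question.

```python
# Hypothetical data: 1..4 stand in for t1..t4, 5..8 for p1..p4.
rows = [("a", t, "c") for t in (1, 2, 3, 4)] + [("b", p, "c") for p in (5, 6, 7, 8)]

a_rows = [r for r in rows if r[0] == "a"]
b_rows = [r for r in rows if r[0] == "b"]

# Same condition as the SQL join: l.item = m.item and l.t_s < m.t_s
joined = [
    (a[0], b[0], a[2], a[1], b[1])
    for a in a_rows
    for b in b_rows
    if a[2] == b[2] and a[1] < b[1]
]

print(len(joined))  # 16
```

Because every 'a' timestamp is earlier than every 'b' timestamp, the condition is true for all 4 x 4 combinations, so the join yields 16 rows instead of the 4 wanted pairs.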

How can I get the result I want in an efficient way?

Source

2017-03-09 SpaceOddity

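Does your apache-spark-sql support ROW_NUMBER() OVER (ORDER BY t_s) rn? If so, simply full outer join tables 'l' and 'm' using 'l.rn = m.rn'.

Is this specifically for Amazon Redshift? Or Spark? Could you clarify your tags accordingly?

This is for apache-spark-sql – SpaceOddity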

select  min (case when e_n = 'a' then 'a' end) as e_n1 
      ,min (case when e_n = 'b' then 'b' end) as e_n2 
      ,item 
      ,min (case when e_n = 'a' then t_s end) as t_s_1 
      ,min (case when e_n = 'b' then t_s end) as t_s_2 

from  (select  d.* 
         ,row_number() over (partition by item,e_n order by t_s) as rn 

      from  dummy as d 
      ) d 

group by item 
      ,rn

+------+------+------+-------+-------+
| e_n1 | e_n2 | item | t_s_1 | t_s_2 |
+------+------+------+-------+-------+
| a    | b    | c    | t1    | p1    |
| a    | b    | c    | t2    | p2    |
| a    | b    | c    | t3    | p3    |
| a    | b    | c    | t4    | p4    |
+------+------+------+-------+-------+
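As a rough illustration of what the row_number / group by query computes, the following Python sketch pairs the n-th 'a' event with the n-th 'b' event of each item. The function name `pair_events` and the integer timestamps are hypothetical; this is a sketch of the logic, not how Spark executes the query.

```python
from collections import defaultdict
from itertools import zip_longest


def pair_events(rows):
    """Pair the n-th 'a' event with the n-th 'b' event of each item."""
    # Collect timestamps per (item, e_n), like PARTITION BY item, e_n.
    groups = defaultdict(list)
    for e_n, t_s, item in rows:
        groups[(item, e_n)].append(t_s)

    result = []
    for item in sorted({item for item, _ in groups}):
        a_ts = sorted(groups.get((item, "a"), []))  # ORDER BY t_s
        b_ts = sorted(groups.get((item, "b"), []))
        # Equal positions correspond to equal row numbers (GROUP BY item, rn).
        # An unmatched event is kept with None, like a NULL from min(case ...).
        for t_a, t_b in zip_longest(a_ts, b_ts):
            result.append((
                "a" if t_a is not None else None,
                "b" if t_b is not None else None,
                item, t_a, t_b,
            ))
    return result


# Hypothetical data: 1..4 stand in for t1..t4, 5..8 for p1..p4.
rows = [("a", t, "c") for t in (1, 2, 3, 4)] + [("b", p, "c") for p in (5, 6, 7, 8)]
for row in pair_events(rows):
    print(row)
```

Each item is handled once and each event is touched once after sorting, which is why this approach avoids the blow-up of the inequality join.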

Source

2017-03-09 08:11:44

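A kind reminder to accept the answer (by marking the V sign to its left).

First, sort the timestamps of each event, then join the sorted tables on their row numbers.

Please try the code below.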

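-- coalesce replaces the two-argument isnull, which Spark SQL does not provide
select l.e_n as e_n_1, m.e_n as e_n_2, coalesce(l.item, m.item) as item, l.t_s as t_s_a,
    m.t_s as t_s_b from
    -- number the rows within each item so that rn lines up per item
    (select *, (row_number() over (partition by item order by t_s)) as rn from dummy where e_n = 'a') l
    full join
    (select *, (row_number() over (partition by item order by t_s)) as rn from dummy where e_n = 'b') m
    on l.item = m.item and l.rn = m.rn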
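The row numbers only line up when they are counted within each item. The Python sketch below compares global numbering with per-item numbering for a full join on (item, rn). The items c1 and c2, the integer timestamps, and the helper names `number_rows` and `full_join` are all hypothetical, chosen only to make the effect visible.

```python
def number_rows(rows, per_item):
    """Return {(item, rn): t_s}, numbering rows by ascending t_s."""
    counters = {}
    numbered = {}
    for t_s, item in sorted(rows):
        key = item if per_item else None  # one counter per item, or one overall
        counters[key] = counters.get(key, 0) + 1
        numbered[(item, counters[key])] = t_s
    return numbered


def full_join(a_rows, b_rows, per_item):
    """Full outer join of 'a' and 'b' rows on (item, rn)."""
    a = number_rows(a_rows, per_item)
    b = number_rows(b_rows, per_item)
    keys = sorted(set(a) | set(b))
    return [(item, a.get((item, rn)), b.get((item, rn))) for item, rn in keys]


# Hypothetical (t_s, item) rows for two items with interleaved timestamps.
a_rows = [(1, "c1"), (2, "c2"), (3, "c1"), (4, "c2")]
b_rows = [(5, "c2"), (6, "c1"), (7, "c2"), (8, "c1")]

print(full_join(a_rows, b_rows, per_item=True))
print(full_join(a_rows, b_rows, per_item=False))
```

With per-item numbering each item yields complete pairs. With global numbering the row numbers of the two sides no longer agree for the same item, so in this example every output row has one side missing.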

Source

2017-03-09 08:45:26
