2017-07-21 860 views

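Using a filter condition on 'PARTITION BY'

I have two tables: a Reference table, which holds the sort priorities, and a Customer table. The Reference table assigns a priority to each source for every column of the Customer table, so that the individual columns of a single customer can each be resolved in a different order.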

Reference table

-----------------------------------
| Priority | Attribute | sourceID |
-----------------------------------
| 1        | EMAIL     | 1        |
| 2        | EMAIL     | 2        |
| 3        | EMAIL     | 3        |
| 2        | NAME      | 1        |
| 1        | NAME      | 2        |
| 3        | NAME      | 3        |
-----------------------------------

Customer table

-----------------------------------------------------------------
| CustomerID | Name | Email             | SourceID | Date       |
-----------------------------------------------------------------
| 1          | John | NULL              | 1        | 03/01/2017 |
| 1          | NULL | [email protected] | 3        | 01/01/2017 |
| 1          | J    | [email protected] | 2        | 02/01/2017 |
-----------------------------------------------------------------

Result

-----------------------------------------
| CustomerID | Name | Email             |
-----------------------------------------
| 1          | John | [email protected] |
-----------------------------------------

At the moment I am using the following query to do this:

SELECT DISTINCT
     FIRST_VALUE(c.Name IGNORE NULLS)
      OVER (PARTITION BY c.customerID
       ORDER BY r.PRIORITY, c.DATE
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS NAME,
     FIRST_VALUE(c.Email IGNORE NULLS)
      OVER (PARTITION BY c.customerID
       ORDER BY r.PRIORITY, c.DATE
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EMAIL
FROM Customer c
    JOIN reference r ON c.sourceID = r.sourceID;

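However, this query still needs to take the different attribute for each column into account. The join to Reference matches on sourceID only, so every customer row is paired with both the EMAIL and the NAME priorities, and both windows sort on the same mix. I need to add some kind of filter to the PARTITION BY section of each window.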

Can anyone help me with how to go about this?

Answer


One approach is to put each customer's attributes into a single column and then recombine them:

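SELECT DISTINCT ca.customerId,
     -- CASE blanks out the other attribute's rows; IGNORE NULLS then skips
     -- those rows as well as genuinely missing values
     first_value(CASE WHEN ca.attribute = 'NAME' THEN ca.val END IGNORE NULLS) OVER
      (PARTITION BY ca.customerId ORDER BY r.priority, ca.date
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS name,
     first_value(CASE WHEN ca.attribute = 'EMAIL' THEN ca.val END IGNORE NULLS) OVER
      (PARTITION BY ca.customerId ORDER BY r.priority, ca.date
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS email
FROM ((SELECT customerId, 'NAME' AS attribute, name AS val, sourceId, date
     FROM customer c
    ) UNION ALL
     (SELECT customerId, 'EMAIL' AS attribute, email AS val, sourceId, date
     FROM customer c
    )
    ) ca JOIN
    reference r
    -- matching on attribute as well gives each value its own column's priority
    ON r.sourceId = ca.sourceId AND r.attribute = ca.attribute;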

Note that this uses SELECT DISTINCT rather than GROUP BY. I don't think Netezza offers first_value() as an aggregate function, so this construct works around that.
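If you would rather have a plain GROUP BY, one possible variant is to rank the unpivoted rows with ROW_NUMBER() per customer and attribute, keep only rank 1, and pivot back with conditional aggregation. The sketch below is only a way to check that logic locally: it runs the query in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), not on Netezza. The table layout follows the question, but the sample rows, the priorities, the ISO date strings and the example.com addresses are made up for illustration and are not the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reference (priority INTEGER, attribute TEXT, sourceId INTEGER);
CREATE TABLE customer  (customerId INTEGER, name TEXT, email TEXT,
                        sourceId INTEGER, date TEXT);

-- Hypothetical priorities: each attribute ranks the sources differently.
INSERT INTO reference VALUES
  (1, 'NAME', 3),  (2, 'NAME', 1),  (3, 'NAME', 2),
  (1, 'EMAIL', 1), (2, 'EMAIL', 3), (3, 'EMAIL', 2);

-- Hypothetical customer rows; the top-priority source for each
-- attribute deliberately holds NULL so the NULL filter is exercised.
INSERT INTO customer VALUES
  (1, 'John', NULL,            1, '2017-01-03'),
  (1, NULL,   'c@example.com', 3, '2017-01-01'),
  (1, 'J',    'b@example.com', 2, '2017-01-02');
""")

QUERY = """
SELECT customerId,
       MAX(CASE WHEN attribute = 'NAME'  THEN val END) AS name,
       MAX(CASE WHEN attribute = 'EMAIL' THEN val END) AS email
FROM (
    -- Rank the non-NULL values of each attribute by that attribute's priority.
    SELECT ca.customerId, ca.attribute, ca.val,
           ROW_NUMBER() OVER (PARTITION BY ca.customerId, ca.attribute
                              ORDER BY r.priority, ca.date) AS rn
    FROM (
        SELECT customerId, 'NAME' AS attribute, name AS val, sourceId, date
        FROM customer
        UNION ALL
        SELECT customerId, 'EMAIL' AS attribute, email AS val, sourceId, date
        FROM customer
    ) ca
    JOIN reference r
      ON r.sourceId = ca.sourceId AND r.attribute = ca.attribute
    WHERE ca.val IS NOT NULL
) ranked
WHERE rn = 1            -- keep the best value per customer and attribute
GROUP BY customerId     -- pivot the two surviving rows back into columns
ORDER BY customerId
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Filtering out NULL values before ranking plays the role that IGNORE NULLS plays in the window-function version, and joining on attribute as well as sourceId is what gives each column its own ordering.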