2016-03-04 37 views
2

How to aggregate multiple rows into one in BigQuery?

Suppose you have a denormalized schema with multiple rows per entity, as shown below:

uuid | property   | value
---------------------------
abc  | first_name | John
abc  | last_name  | Connor
abc  | age        | 26
...

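All properties of the same group share a uuid, and the rows are not necessarily sorted. How can I create a table like the following using BigQuery alone (i.e., without a client)?

Table user_properties: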

uuid | first_name | last_name | age
-----------------------------------
abc  | John       | Connor    | 26

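In traditional SQL there is the "STUFF" keyword for this purpose.

It would be easier if I could at least get the results ordered by uuid, so that the client would not need to load the whole table (4GB) to sort it: each entity could then be hydrated by sequentially scanning the rows that share a uuid. However, a query like this: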

SELECT * FROM user_properties ORDER BY uuid; 

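exceeds the resources available in BigQuery (using allowLargeResults disallows ORDER BY). It almost seems that I cannot sort a large table (4GB) in BigQuery unless I subscribe to a high-end machine. Any ideas?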
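The client-side idea described in the question (scan rows that are already ordered by uuid and hydrate one entity per run of equal uuids) could be sketched roughly as follows. This is only an illustration in Python: the function name `hydrate_entities` and the sample rows are hypothetical, and it assumes the rows arrive as (uuid, property, value) tuples already sorted by uuid.

```python
from itertools import groupby
from operator import itemgetter


def hydrate_entities(sorted_rows):
    """Yield one dict per uuid from (uuid, property, value) rows.

    Assumes the rows are already ordered by uuid, so only one
    entity has to be held in memory at a time.
    """
    for uuid, group in groupby(sorted_rows, key=itemgetter(0)):
        entity = {"uuid": uuid}
        for _, prop, value in group:
            entity[prop] = value
        yield entity


# Hypothetical sample data; properties within a uuid may come in any order.
rows = [
    ("abc", "last_name", "Connor"),
    ("abc", "age", "26"),
    ("abc", "first_name", "John"),
    ("xyz", "first_name", "Sarah"),
]

entities = list(hydrate_entities(rows))
```

Because `groupby` only merges adjacent rows, this works only if the input really is ordered by uuid, which is exactly the ordering the ORDER BY query fails to deliver at this table size.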

Answers

6
SELECT 
    uuid, 
    MAX(IF(property = 'first_name', value, NULL)) AS first_name, 
    MAX(IF(property = 'last_name', value, NULL)) AS last_name, 
    MAX(IF(property = 'age', value, NULL)) AS age 
FROM user_properties 
GROUP BY uuid 
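The conditional-aggregation query above can be tried out locally before running it on BigQuery. The sketch below is an assumption-laden stand-in, not BigQuery itself: it uses Python's built-in `sqlite3` module with an in-memory table, writes `CASE WHEN ... END` where the answer uses `IF(..., value, NULL)`, and adds a hypothetical second uuid to show what happens when properties are missing.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_properties (uuid TEXT, property TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO user_properties VALUES (?, ?, ?)",
    [
        ("abc", "first_name", "John"),
        ("abc", "last_name", "Connor"),
        ("abc", "age", "26"),
        ("xyz", "first_name", "Sarah"),  # hypothetical uuid with missing properties
    ],
)

# CASE WHEN ... END yields NULL for non-matching rows, like IF(cond, value, NULL).
# MAX ignores NULLs, so each column picks up the one matching value per uuid.
pivot_sql = """
SELECT
    uuid,
    MAX(CASE WHEN property = 'first_name' THEN value END) AS first_name,
    MAX(CASE WHEN property = 'last_name' THEN value END) AS last_name,
    MAX(CASE WHEN property = 'age' THEN value END) AS age
FROM user_properties
GROUP BY uuid
ORDER BY uuid
"""
pivoted = conn.execute(pivot_sql).fetchall()
```

In this sketch a uuid that lacks a property simply gets NULL (Python `None`) in that column, and the order of the input rows does not matter.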

Another option, with no GROUP'ing involved:

SELECT uuid, first_name, last_name, age 
FROM (
    SELECT 
        uuid, 
        LEAD(value, 1) OVER(PARTITION BY uuid ORDER BY property) AS first_name, 
        LEAD(value, 2) OVER(PARTITION BY uuid ORDER BY property) AS last_name, 
        value AS age, 
        property = 'age' AS anchor 
    FROM user_properties 
) 
HAVING anchor
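The window-function variant depends on the alphabetical order of the property names ('age' sorts before 'first_name', which sorts before 'last_name'), so the 'age' row is the anchor and the next two rows supply the names. A rough local check, again using Python's `sqlite3` as a stand-in (assuming SQLite 3.25 or newer for window functions, and `WHERE anchor` on the outer query in place of `HAVING anchor`), might look like this. The second uuid is hypothetical and is there only to show the behaviour when a property is missing.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_properties (uuid TEXT, property TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO user_properties VALUES (?, ?, ?)",
    [
        ("abc", "last_name", "Connor"),
        ("abc", "first_name", "John"),
        ("abc", "age", "26"),
        ("xyz", "last_name", "Reese"),  # hypothetical uuid without a first_name row
        ("xyz", "age", "30"),
    ],
)

# Within each uuid, rows are ordered age, first_name, last_name, so on the
# 'age' row LEAD(value, 1) and LEAD(value, 2) read the following two rows.
lead_sql = """
SELECT uuid, first_name, last_name, age
FROM (
    SELECT
        uuid,
        LEAD(value, 1) OVER (PARTITION BY uuid ORDER BY property) AS first_name,
        LEAD(value, 2) OVER (PARTITION BY uuid ORDER BY property) AS last_name,
        value AS age,
        property = 'age' AS anchor
    FROM user_properties
)
WHERE anchor
ORDER BY uuid
"""
lead_rows = conn.execute(lead_sql).fetchall()
```

For 'abc' this reproduces the expected row. For the hypothetical 'xyz', the last name lands in the first_name column because the offsets are positional, so this approach seems safe only when every uuid has exactly the expected set of properties.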