2017-10-19 81 views
0

Cannot unnest some fields with Google BigQuery (standard SQL)

I have a nested table, and I cannot access all of its fields using Google BigQuery standard SQL.

For example, this query fails:

SELECT * 
FROM 
    (
    SELECT 
      rev_info.user.id as player_id, 
      rev_info.purchase.total.currency as currency, 
      rev_info.purchase.total.amount as REV 
      ,rev_info.purchase.virtual_items.items.sku  as sku 
    FROM `gcs.rev` 
    ) 
WHERE currency = 'USD' 

with the error:

"Error: Cannot access field sku on a value with type ARRAY> at [9:59]"

However,

SELECT * 
FROM 
    (
    SELECT 
      rev_info.user.id as player_id, 
      rev_info.purchase.total.currency as currency, 
      rev_info.purchase.total.amount as REV 
      --,rev_info.purchase.virtual_items.items.sku as sku 
    FROM `gcs.rev` 
    ) 
WHERE currency = 'USD' 

this query works fine.

Note also that

SELECT 
     rev_info.purchase.virtual_items.items.sku  as sku 
FROM `gcs.rev` 

fails with the same error as above.

+0

What do you mean by "unable to unnest..."? You haven't even tried! At least that's how it looks in your question!

+0

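Hello, and welcome to Stack Overflow! If an answer you received helped you in any way or solved your problem, please consider accepting it and voting, as this matters on this site: https://stackoverflow.com/help/someone-answers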

Answers

1

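Expanding on Elliott's answer: I think UNNEST is needed here first, but you will most likely also need to aggregate your SKUs back together. Otherwise you end up with quite redundant (flattened) output, with one row per item and the player, currency, and amount repeated on each.

Below is what I think you probably need. It is BigQuery standard SQL: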

#standardSQL 
SELECT 
    player_id, 
    currency, 
    REV, 
    STRING_AGG(sku) SKUs 
FROM (
    SELECT 
    rev_info.user.id AS player_id, 
    rev_info.purchase.total.currency AS currency, 
    rev_info.purchase.total.amount AS REV, 
    item.sku AS sku 
    FROM `gcs.rev` t, 
    UNNEST(t.rev_info.purchase.virtual_items.items) item 
) 
WHERE currency = 'USD' 
GROUP BY 1, 2, 3 

This way, all SKUs are returned as a single list per player_id, along with the amount and currency.
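To illustrate the shape of that result, here is a minimal sketch on made-up inline rows. The `flattened` CTE and its values are hypothetical; they stand in for what the inner query (the one with UNNEST) would produce.

```sql
#standardSQL
-- Hypothetical rows, one per (purchase, item), as the UNNEST step would produce them.
WITH flattened AS (
  SELECT 1 AS player_id, 'USD' AS currency, 9.99 AS REV, 'sword' AS sku UNION ALL
  SELECT 1, 'USD', 9.99, 'shield' UNION ALL
  SELECT 2, 'USD', 4.99, 'potion'
)
SELECT
  player_id,
  currency,
  REV,
  STRING_AGG(sku) AS SKUs  -- comma-separated by default; element order is not guaranteed
FROM flattened
GROUP BY player_id, currency, REV
```

This collapses the three flattened rows into two, one per player. One thing to keep in mind with this GROUP BY: two separate purchases by the same player with the same currency and amount would be merged into a single row.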

Added, following Elliott's comment/suggestion:

#standardSQL 
SELECT 
    rev_info.user.id AS player_id, 
    rev_info.purchase.total.currency AS currency, 
    rev_info.purchase.total.amount AS REV, 
    (SELECT STRING_AGG(item.sku)
     FROM UNNEST(t.rev_info.purchase.virtual_items.items) item
    ) AS SKUs
FROM `gcs.rev` t
WHERE rev_info.purchase.total.currency = 'USD'
+0

Or 'ARRAY(SELECT sku FROM UNNEST(t.rev_info.purchase.virtual_items.items)) AS sku' to avoid the aggregation (you could use 'STRING_AGG' instead).
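For reference, a sketch of how that ARRAY subquery might look in the full query. The table and field names are taken from the question; the output column name `skus` is my own choice.

```sql
#standardSQL
SELECT
  t.rev_info.user.id AS player_id,
  t.rev_info.purchase.total.currency AS currency,
  t.rev_info.purchase.total.amount AS REV,
  -- Correlated subquery: builds one ARRAY of sku values per input row,
  -- so no GROUP BY is needed and the row count stays the same.
  ARRAY(
    SELECT item.sku
    FROM UNNEST(t.rev_info.purchase.virtual_items.items) AS item
  ) AS skus
FROM `gcs.rev` AS t
WHERE t.rev_info.purchase.total.currency = 'USD'
```

Note that BigQuery does not allow NULL elements in an array that is returned in the query result, so if some items can have a NULL sku you would probably want a `WHERE item.sku IS NOT NULL` inside the subquery.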

+0

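Totally agree. If it were my code, I would most likely use something like '(SELECT STRING_AGG(item.sku) FROM UNNEST(...) item) AS SKUs', with no 'GROUP BY' and no 'SELECT *', etc. What I have learned from answering on SO daily over the past two years is that OPs often try to "simplify"/obfuscate their code and in doing so leave out parts that seem "small" to them but are actually very important; they usually do not change the structure of the query, though. So in this case the 'SELECT *' looked a bit suspicious, and I tried not to change the inner query.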

1

If your goal is to get one row per element of the items array, you can use the comma (join) operator between the table and rev_info.purchase.virtual_items.items. For example,

SELECT * 
FROM (
    SELECT 
    rev_info.user.id as player_id, 
    rev_info.purchase.total.currency as currency, 
    rev_info.purchase.total.amount as REV, 
    item.sku as sku 
    FROM `gcs.rev` t, 
    t.rev_info.purchase.virtual_items.items item 
) 
WHERE currency = 'USD' 
+0

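Thanks, that works! However, I'm confused about why this is necessary for sku but not for REV or currency. In particular, I don't understand why automatic flattening doesn't handle this directly in standard SQL. – user2998362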

+0

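It is only needed for 'sku' because the field that contains it ('items') is an array. For the other field paths, such as 'currency' and 'amount', there is no array anywhere along the path. With standard SQL there is no "automatic flattening"; you have to express your intent explicitly (as with the comma operator in this case).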
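A self-contained sketch of that distinction, for anyone who wants to try it without access to the original table. The inline `rev` row below is hypothetical and only mimics the field paths used in the question; the real schema of `gcs.rev` may differ.

```sql
#standardSQL
-- Hypothetical single row standing in for `gcs.rev`.
WITH rev AS (
  SELECT STRUCT(
    STRUCT(1 AS id) AS user,
    STRUCT(
      STRUCT('USD' AS currency, 9.99 AS amount) AS total,
      STRUCT([STRUCT('sword' AS sku), STRUCT('shield' AS sku)] AS items) AS virtual_items
    ) AS purchase
  ) AS rev_info
)
SELECT
  t.rev_info.user.id AS player_id,                 -- only STRUCTs on this path: dot access works
  t.rev_info.purchase.total.currency AS currency,  -- same here
  item.sku AS sku                                  -- items is an ARRAY, so it is unnested first
FROM rev AS t,
  UNNEST(t.rev_info.purchase.virtual_items.items) AS item
```

This returns one row per array element (two rows here, one for each sku). Writing `t.rev_info.purchase.virtual_items.items.sku` directly in the SELECT list instead should reproduce the "Cannot access field sku" error from the question.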