2017-08-25 57 views

Is there a way to compare two columns in BigQuery? String matching between two columns

I tried the following:

SELECT 
    T1.id, 
    CASE 
    WHEN REGEXP_CONTAINS(geo, countries) THEN TRUE 
    ELSE FALSE 
    END AS geo_match 
FROM 
    T1 
LEFT JOIN 
    T2 
ON 
    T1.id = T2.id 

and got the following error:

No matching signature for function REGEXP_CONTAINS for argument types: STRING, ARRAY<STRING>. Supported signatures: REGEXP_CONTAINS(STRING, STRING); REGEXP_CONTAINS(BYTES, BYTES) at [4:10] 

I also tried the LIKE operator. It did not work either.

Answers

The following is for BigQuery Standard SQL.

Based on the error message, I assume geo is a STRING and countries is a repeated string (ARRAY<STRING>):

#standardSQL 
SELECT 
    T1.id, 
    (SELECT COUNT(1) FROM UNNEST(countries) AS country WHERE geo = country) > 0 AS geo_match 
FROM T1 LEFT JOIN T2 
ON T1.id = T2.id 
ORDER BY id 

Depending on your needs, you can use any comparison logic (LIKE, REGEXP_CONTAINS, etc.) instead of the simple

WHERE geo = country 
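
For instance, a minimal sketch of swapping in REGEXP_CONTAINS or LIKE inside the same subquery might look like the following. The sample rows and the output column names regex_match and like_match are made up for illustration. Note that each array element is treated as a regular expression in the first case, so an element that is not a valid pattern would make the query fail.

```sql
#standardSQL
WITH T1 AS (
    -- hypothetical sample rows, not from the original question
    SELECT 1 AS id, 'US-East' AS geo UNION ALL
    SELECT 2, 'UK-South'
),
T2 AS (
    SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL
    SELECT 2, ['MX', 'CA']
)
SELECT
    T1.id,
    -- TRUE when geo contains a match for at least one array element used as a regex
    (SELECT COUNT(1) FROM UNNEST(countries) AS country
     WHERE REGEXP_CONTAINS(geo, country)) > 0 AS regex_match,
    -- TRUE when at least one array element occurs in geo as a plain substring
    (SELECT COUNT(1) FROM UNNEST(countries) AS country
     WHERE geo LIKE CONCAT('%', country, '%')) > 0 AS like_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id
```

Passing the array element to REGEXP_CONTAINS one at a time is what avoids the original signature error, since both arguments are then STRING.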

You can play with / test this using dummy data as below:

#standardSQL 
WITH T1 AS (
    SELECT 1 AS id, 'US' AS geo UNION ALL 
    SELECT 2, 'UK' UNION ALL 
    SELECT 3, 'MX' UNION ALL 
    SELECT 4, 'CA' 
), 
T2 AS (
    SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL 
    SELECT 2, ['MX', 'CA'] UNION ALL 
    SELECT 3, ['MX', 'CA'] 
) 
SELECT 
    T1.id, 
    (SELECT COUNT(1) FROM UNNEST(countries) AS country WHERE geo = country) > 0 AS geo_match 
FROM T1 LEFT JOIN T2 
ON T1.id = T2.id 
ORDER BY id 
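
If only an exact match is needed, the membership test can probably be written more compactly with IN UNNEST. This is a sketch rather than a drop-in replacement: the sample rows are shortened for illustration, and NULL handling may differ from the COUNT version, which always yields TRUE or FALSE, whereas IN can return NULL (for example when geo is NULL).

```sql
#standardSQL
WITH T1 AS (
    SELECT 1 AS id, 'US' AS geo UNION ALL
    SELECT 2, 'UK'
),
T2 AS (
    SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL
    SELECT 2, ['MX', 'CA']
)
SELECT
    T1.id,
    -- exact-match membership test against the array elements
    geo IN UNNEST(countries) AS geo_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id
```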

Perfect! Thank you! – hamsy