2016-09-29 1201 views

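Hive: how to get elements from an array when the elements themselves are arrays

I have a database table with a column that stores a JSON-formatted string. The string itself contains multiple elements, like an array. Each element contains multiple key-value pairs. Some values may themselves contain multiple key-value pairs, such as the "address" property below.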

People table: 
    Col1  Col2 ..... info 
    aaa  bbb   see below 

The "info" column contains a JSON-formatted string like the following:

[{"name":"abc", 
    "address":{"street":"str1", "city":"c1"}, 
    "phone":"1234567" 
}, 
{"name":"def", 
    "address":{"street":"str2", "city":"c1", "county":"ct"}, 
    "phone":"7145895" 
} 
] 

I need to get the individual value of each field in the JSON string. By calling explode(), I can do this for every field except "address", as shown below:

SELECT 
    get_json_object(person, '$.name') AS name, 
    get_json_object(person, '$.phone') AS phone, 
    get_json_object(person, '$.address') AS addr 
FROM people lateral view explode(split(regexp_replace(
     regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]',''), '\\\\n')) 
     p as person; 

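My question is how to get each field inside the "address" field. The "address" field can contain any number of key-value pairs, and I cannot use JSONSerDe. I was thinking of using another explode() call, but I could not get it to work. Could someone please help? Thank you very much.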

Answer


You can call get_json_object directly with nested paths:

SELECT 
    get_json_object(person, '$.name') AS name, 
    get_json_object(person, '$.phone') AS phone, 
    get_json_object(person, '$.address.street') AS street, 
    get_json_object(person, '$.address.city') AS city, 
    get_json_object(person, '$.address.county') AS county 
FROM people lateral view explode(split(regexp_replace(
    regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]',''), '\\\\n')) 
    p as person; 
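
Nested paths only help when the key names are known in advance. Since the question says "address" can hold any number of key-value pairs, one possible approach is to turn the address object into a Hive map with str_to_map and explode that map in a second lateral view. The following is only a sketch, not part of the original answer. It assumes the address values are flat strings that contain no commas, colons, braces, or double quotes, and that get_json_object returns the object as compact JSON with no spaces around the delimiters. The aliases a, addr_key, and addr_value are made up for illustration.

```sql
-- Sketch: one output row per (person, address key, address value).
-- Assumes address values contain no ',', ':', '{', '}' or '"' characters.
SELECT
    get_json_object(person, '$.name')  AS name,
    get_json_object(person, '$.phone') AS phone,
    a.addr_key,
    a.addr_value
FROM people
-- First lateral view: split the JSON array into one string per person
-- (same expression as in the question).
LATERAL VIEW explode(split(regexp_replace(
    regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]', ''), '\\\\n'))
    p AS person
-- Second lateral view: strip braces and quotes from the address object,
-- parse the remaining "key:value,key:value" text into a map, and explode it.
LATERAL VIEW explode(str_to_map(
    regexp_replace(get_json_object(person, '$.address'), '[{}"]', ''),
    ',', ':'))
    a AS addr_key, addr_value;
```

With the sample data this should give two rows for "abc" (street, city) and three for "def" (street, city, county). A person whose "address" is missing would drop out of the result. Using LATERAL VIEW OUTER for the second lateral view would keep that row with NULL key and value.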