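I have a Hive query that uses XPath to return a set of arrays from XML. I want to insert the elements of these arrays into a Hive table. How can I insert data from the arrays returned by XPath into a Hive table?
The XML content in the hivexml table is: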
<tag><row Id="1" TagName=".net" Count="244006" ExcerptPostId="3624959" WikiPostId="3607476" /><row Id="2" TagName="html" Count="602809" ExcerptPostId="3673183" WikiPostId="3673182" /><row Id="3" TagName="javascript" Count="1274350" ExcerptPostId="3624960" WikiPostId="3607052" /><row Id="4" TagName="css" Count="434937" ExcerptPostId="3644670" WikiPostId="3644669" /><row Id="5" TagName="php" Count="1009113" ExcerptPostId="3624936" WikiPostId="3607050" /><row Id="8" TagName="c" Count="236386" ExcerptPostId="3624961" WikiPostId="3607013" /></tag>
This query returns a set of arrays:
select xpath(str,'/tag/row/@Id'), xpath(str,'/tag/row/@TagName'), xpath(str,'/tag/row/@Count'), xpath(str,'/tag/row/@ExcerptPostId'), xpath(str,'/tag/row/@WikiPostId') from hivexml;
The output of the above query (a set of arrays) is:
["1","2","3","4","5"] [".net","html","css","php","c"] ["244006","602809","434937","1009113","236386"] ["3624959","3673183","3644670","3624936","3624961"] ["3607476","3673182","3644669","3607050","3607013"]
I want to insert these values into a Hive table in this format:
1 .net 244006 3624959 3607476
2 html 602809 3673183 3673182
3 css 434937 3644670 3644669
4 php 1009113 3624936 3607050
5 c 236386 3624961 3607013
If I run an insert using the select query above:
insert into newhivexml select xpath(str,'/tag/row/@Id'), xpath(str,'/tag/row/@TagName'), xpath(str,'/tag/row/@Count'), xpath(str,'/tag/row/@ExcerptPostId'), xpath(str,'/tag/row/@WikiPostId') from hivexml;
then I get an error:
NoMatchingMethodException No matching method for class org.apache.hadoop.hive.ql.udf.UDFToInteger with (array). Possible choices: FUNC(bigint) FUNC(boolean) FUNC(decimal(38,18)) FUNC(double) FUNC(float) FUNC(smallint) FUNC(string) FUNC(struct) FUNC(timestamp) FUNC(tinyint) FUNC(void)
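I think we cannot insert directly like this, and I am missing something here. Can anyone tell me how to do this, that is, how to insert the values in these arrays into the table?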
Just to make sure: the XML is stored in a single column, not as the whole data, right?
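One possible approach, offered as a sketch rather than a confirmed fix: the error seems to arise because xpath() returns an array of strings, while the first column of newhivexml appears to be an int (this is inferred from UDFToInteger in the message), and Hive cannot cast a whole array to an int. The arrays can instead be turned into rows by exploding one of them together with its position and using that position to index the others. The query below assumes that newhivexml has five columns of types (int, string, int, int, int), that every row element carries all five attributes so the arrays line up by position, and that posexplode is available in your Hive version (0.13 or later). The aliases x, p, i and id are made up for illustration.

```sql
-- Sketch only: assumes newhivexml(int, string, int, int, int) and
-- that all five arrays have the same length and ordering.
INSERT INTO TABLE newhivexml
SELECT
  CAST(x.ids[p.i]      AS INT),   -- Id
  x.names[p.i],                   -- TagName
  CAST(x.counts[p.i]   AS INT),   -- Count
  CAST(x.excerpts[p.i] AS INT),   -- ExcerptPostId
  CAST(x.wikis[p.i]    AS INT)    -- WikiPostId
FROM (
  -- One row per XML document, each column holding an array of strings.
  SELECT
    xpath(str, '/tag/row/@Id')            AS ids,
    xpath(str, '/tag/row/@TagName')       AS names,
    xpath(str, '/tag/row/@Count')         AS counts,
    xpath(str, '/tag/row/@ExcerptPostId') AS excerpts,
    xpath(str, '/tag/row/@WikiPostId')    AS wikis
  FROM hivexml
) x
-- posexplode emits one row per array element: its position (i) and value (id).
LATERAL VIEW posexplode(x.ids) p AS i, id;
```

If some row elements are missing an attribute, the arrays will have different lengths and the positions will no longer match, so it is worth checking the array sizes with size() before relying on this.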