2015-03-31 46 views

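Hive CLI does not populate table data (from a CREATE TABLE AS SELECT query), while Hive Beeswax works fine

When I run a CREATE TABLE AS SELECT query from the Hive CLI, the table is created but no data is populated. However, when I run the same query in Hive Beeswax, the target table is created with the data populated.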

Here is the query:

hive -e '
    create table table_validation as
    select listing_id, city, area, expected_amount_inr, property_id, house_type,
        case when area_builtup_sqft is NULL or area_builtup_sqft = 0 or area_builtup_sqft = " "
             then plot_area else area_builtup_sqft end as area_sqft,
        case when area_builtup_sqft is NULL or area_builtup_sqft = 0 or area_builtup_sqft = " "
             then expected_amount_inr/plot_area else expected_amount_inr/area_builtup_sqft end as price_sqft,
        listing_state,
        case when house_type like "apartment" then "apartment"
             when house_type like "plot" then "plot"
             else "others" end as property_type,
        case when house_type like "plot" then "NA"
             when num_bedrooms between 1 and 1.9 then 1
             when num_bedrooms between 2 and 2.9 then 2
             when num_bedrooms between 3 and 3.9 then 3
             when num_bedrooms >= 4 then 4
             else num_bedrooms end as number_bedrooms
    from realestate_listing_main
    where listing_type LIKE "rent"
      and added_on between "2015-02-01" and "2015-03-31"
' --database default;

When I run this query, I get the following output:

running hive query 
    0 2015-03-31 18:40:41,025 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive 
    2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize 
    2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize 
    2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack 
    2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node 
    2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 
    2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 
    2015-03-31 18:40:41,336 WARN [main] conf.HiveConf (HiveConf.java:initialize(1155)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. 

    Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties 
    OK 
    Time taken: 0.621 seconds 
    Total MapReduce jobs = 3 
    Launching Job 1 out of 3 
    Number of reduce tasks is set to 0 since there's no reduce operator 
    Starting Job = job_1427789583342_0014, Tracking URL = http://ip-10-172-133-249.ap-southeast-1.compute.internal:8088/proxy/application_1427789583342_0014/ 
    Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1427789583342_0014 
    Hadoop job information for Stage-1: number of mappers: 10; number of reducers: 0 
    2015-03-31 18:40:59,849 Stage-1 map = 0%, reduce = 0% 
    2015-03-31 18:41:10,188 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:11,219 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:12,252 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:13,289 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:14,321 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:15,357 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec 
    2015-03-31 18:41:16,393 Stage-1 map = 35%, reduce = 0%, Cumulative CPU 39.78 sec 
    2015-03-31 18:41:17,428 Stage-1 map = 40%, reduce = 0%, Cumulative CPU 41.17 sec 
    2015-03-31 18:41:18,460 Stage-1 map = 45%, reduce = 0%, Cumulative CPU 43.26 sec 
    2015-03-31 18:41:19,499 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 49.68 sec 
    2015-03-31 18:41:20,536 Stage-1 map = 70%, reduce = 0%, Cumulative CPU 50.49 sec 
    2015-03-31 18:41:21,569 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:22,598 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:23,627 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:24,655 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:25,684 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:26,714 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:27,743 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:28,773 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec 
    2015-03-31 18:41:29,803 Stage-1 map = 85%, reduce = 0%, Cumulative CPU 61.88 sec 
    2015-03-31 18:41:30,840 Stage-1 map = 90%, reduce = 0%, Cumulative CPU 63.8 sec 
    2015-03-31 18:41:31,872 Stage-1 map = 90%, reduce = 0%, Cumulative CPU 63.8 sec 
    2015-03-31 18:41:32,905 Stage-1 map = 95%, reduce = 0%, Cumulative CPU 69.86 sec 
    2015-03-31 18:41:33,935 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.58 sec 
    2015-03-31 18:41:34,964 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.58 sec 
    MapReduce Total cumulative CPU time: 1 minutes 11 seconds 580 msec 
    Ended Job = job_1427789583342_0014 
    Stage-4 is selected by condition resolver. 
    Stage-3 is filtered out by condition resolver. 
    Stage-5 is filtered out by condition resolver. 
    Moving data to: hdfs://ip-10-172-133-249.ap-southeast-1.compute.internal:8020/tmp/hive-root/hive_2015-03-31_18-40-42_689_38529489390850959-1/-ext-10001 
    Moving data to: hdfs://ip-10-172-133-249.ap-southeast-1.compute.internal:8020/user/hive/warehouse/default.db/table_validation 
    Table default.table_validation stats: [num_partitions: 0, num_files: 10, num_rows: 0, total_size: 0, raw_data_size: 0] 
    MapReduce Jobs Launched: 
    Job 0: Map: 10 Cumulative CPU: 71.58 sec HDFS Read: 2635527679 HDFS Write: 0 SUCCESS 
    Total MapReduce CPU Time Spent: 1 minutes 11 seconds 580 msec 
    OK 
    Time taken: 52.896 seconds 

It does not run the second and third jobs. However, when I run the query in Hive Beeswax, all jobs are executed and the table is created with data.

Please let me know what I am missing. I have been stuck on this for the past 3 days.

CREATE TABLE AS SELECT in Hive has a few restrictions. Please check them; they are listed at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL – Ilango 2015-04-01 10:24:21

Answer

Got the answer. The SerDe jar needs to be added before running the query, because Hive cannot interpret the data without this jar.
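
For anyone hitting the same thing, here is a rough sketch of what registering the jar in the same CLI session might look like. The jar path `/path/to/serde.jar` is a placeholder (the real jar name and location are not given above), the `build_query` helper is just an illustrative name, and the CTAS is shortened to two columns for readability. It relies on Hive's `ADD JAR` command being issued before the query in the same `hive -e` string.

```shell
#!/bin/sh
# Placeholder path (assumption): replace with the actual SerDe jar on your cluster.
SERDE_JAR="/path/to/serde.jar"

# Build the HiveQL text: register the jar first, then run the CTAS.
# The CTAS here is an abbreviated stand-in for the full query in the question.
build_query() {
    printf 'ADD JAR %s;\n' "$SERDE_JAR"
    printf '%s\n' 'create table table_validation as select listing_id, city from realestate_listing_main where listing_type like "rent";'
}

QUERY="$(build_query)"

# Run through the Hive CLI when it is installed; otherwise just print the statements.
if command -v hive >/dev/null 2>&1; then
    hive --database default -e "$QUERY"
else
    printf '%s\n' "$QUERY"
fi
```

Both statements go into one `hive -e` string so that the jar is registered in the same session that runs the query. Beeswax may already have the jar on its classpath through its own configuration, which would be consistent with the same query working there.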