2016-03-06 74 views

Oozie workflow: Hive table does not exist, but its directory is created in HDFS

I am trying to run a Hive action using an Oozie workflow. Here is the Hive script:

create table abc (a INT);

I can find the table directory in HDFS (the directory abc gets created under /user/hive/warehouse), but when I run SHOW TABLES from the hive> prompt, I cannot see the table.

Here is the workflow.xml file:

<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf"> 
<start to="hiveac"/> 
<action name="hiveac"> 
    <hive xmlns="uri:oozie:hive-action:0.2"> 
      <job-tracker>${jobTracker}</job-tracker> 
      <name-node>${nameNode}</name-node> 
      <!-- <prepare> <delete path="${nameNode}/user/${wf:user()}/case1/out"/> </prepare> --> 
     <!-- <job-xml>hive-default.xml</job-xml>--> 
      <configuration> 
       <property> 
        <name>oozie.hive.defaults</name> 
        <value>hive-default.xml</value> 
       </property> 
       <property> 
        <name>mapred.job.queue.name</name> 
        <value>${queueName}</value> 
       </property> 
      </configuration> 
      <script>script.q</script> 
      <!-- <param>INPUT=/user/${wf:user()}/case1/sales_history_temp4</param> 
      <param>OUTPUT=/user/${wf:user()}/case1/out</param> --> 
     </hive> 
    <ok to="end"/> 
    <error to="fail"/> 
</action> 
    <kill name="fail"> 
    <message>Pig Script failed!!!</message> 
    </kill> 
    <end name="end"/> 
</workflow-app> 

Here is the hive-default.xml file:

<configuration> 
<property> 
    <name>javax.jdo.option.ConnectionURL</name> 
    <value>jdbc:mysql://localhost/metastore</value> 
    <description>JDBC connect string for a JDBC metastore</description> 
</property> 

<property> 
    <name>javax.jdo.option.ConnectionDriverName</name> 
    <value>org.apache.derby.jdbc.EmbeddedDriver</value> 
    <description>Driver class name for a JDBC metastore</description> 
</property> 

<property> 
    <name>javax.jdo.option.ConnectionUserName</name> 
    <value>hiveuser</value> 
</property> 
<property> 
    <name>javax.jdo.option.ConnectionPassword</name> 
    <value>password</value> 
</property> 
<property> 
    <name>datanucleus.autoCreateSchema</name> 
    <value>false</value> 
</property> 
<property> 
    <name>datanucleus.fixedDatastore</name> 
    <value>true</value> 
</property> 
<property> 
    <name>hive.stats.autogather</name> 
    <value>false</value> 
</property> 
</configuration> 

Here is the job.properties file:

nameNode=hdfs://localhost:8020 
jobTracker=localhost:8021 
queueName=default 
oozie.libpath=/user/oozie/shared/lib 
#oozie.use.system.libpath=true 
oozie.wf.application.path=${nameNode}/user/my/jobhive 

The logs do not show any errors:

stderr logs 

Logging initialized using configuration in jar:file:/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/distcache/3179985539753819871_-620577179_884768063/localhost/user/oozie/shared/lib/hive-common-0.9.0-cdh4.1.1.jar!/hive-log4j.properties 
Hive history file=/tmp/mapred/hive_job_log_mapred_201603060735_17840386.txt 
OK 
Time taken: 9.322 seconds 
Log file: /var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/training/jobcache/job_201603060455_0012/attempt_201603060455_0012_m_000000_0/work/hive-oozie-job_201603060455_0012.log not present. Therefore no Hadoop jobids found 

I came across a similar thread: Tables created by oozie hive action cannot be found from hive client but can find them in HDFS

But it did not solve my problem. Please let me know how to resolve this.


It looks like you are instructing Hive to instantiate a **"sandbox" metastore** with Derby, instead of connecting to the real metastore (MySQL?), so the CREATE TABLE gets written to a temporary Derby DB... and then lost. –
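The mismatch the comment describes is visible in the posted hive-default.xml: the JDBC URL points at MySQL, while the driver class is Derby's embedded driver. A consistent MySQL pair would look like this (a sketch only; it assumes the MySQL Connector/J jar, which provides com.mysql.jdbc.Driver, is available on the classpath):

```xml
<!-- Sketch: matching URL and driver for a MySQL-backed metastore.
     Assumes MySQL Connector/J (com.mysql.jdbc.Driver) is on the classpath. -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/metastore</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
```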


By the way, why are you using the ancient 'workflow:0.2'? Which version of Oozie do you have installed? –


I believe 'oozie.hive.defaults' is legacy stuff; forget about it – unless you have a veeeeeeeeeery old version. –

Answer


I have not used Oozie for a few months (and did not keep an archive, for legal reasons); anyway it was v4.x, so some of this is guesswork...

  1. Upload your valid hive-site.xml somewhere in HDFS
  2. Tell Oozie to inject all of these properties into the launcher Configuration before running the Hive class, so that it inherits all of them, with <job-xml>/some/hdfs/path/hive-site.xml</job-xml>
  3. Remove any reference to oozie.hive.defaults
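Putting steps 2 and 3 together, the revised Hive action would drop the oozie.hive.defaults property and declare a <job-xml> element instead – a minimal sketch, with a hypothetical HDFS path standing in for wherever you upload hive-site.xml:

```xml
<!-- Sketch of the revised action: the HDFS path below is hypothetical -->
<action name="hiveac">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- inject the real Hive client configuration into the launcher -->
        <job-xml>/user/oozie/conf/hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>
```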

Warning: all of this assumes that your sandbox cluster has a persistent metastore – i.e. that your hive-site.xml does not point to an embedded Derby DB that gets wiped every time!


I tried changing 'oozie.hive.defaults' to '' but got the error 'Unable to connect to the metastore', so I reverted it back to 'oozie.hive.defaults'. Please point out what is wrong in the 'hive-default.xml' posted above, and what changes I need to make so that it points to the correct persistent metastore – user182944