2016-09-22 179 views

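Pig fails to read a local CSV, resulting in a failed job

I'm relatively new to the Pig/Hadoop ecosystem and have hit a frustrating problem while trying to run a simple DUMP. I'm trying to run the Pig script below (the file is local, not on HDFS, so I start the Pig shell with pig -x local).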

REGISTER utils.py USING jython AS utils; 
events = LOAD '../test/events.csv' USING PigStorage(',') AS (patientid:int, eventid:chararray, eventdesc:chararray, timestamp:chararray, value:float); 
events = FOREACH events GENERATE patientid, eventid, ToDate(timestamp, 'yyyy-MM-dd') AS etimestamp, value; 
DUMP events; 

However, when I do so, I get the following error message (failed job summary first, full Pig stack trace at the bottom):

Input(s): Failed to read data from "file:///bootcamp/test/events.csv" 
Output(s): Failed to produce result in "file/tmp/temp/305054006/tmp-908064458" 

Pig stack trace:

ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.pig.PigServer.openIterator(PigServer.java:925) 
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:746) 
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205) 
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66) 
at org.apache.pig.Main.run(Main.java:558) 
at org.apache.pig.Main.main(Main.java:170) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:822) 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452) 
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280) 
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390) 
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375) 
at org.apache.pig.PigServer.storeEx(PigServer.java:1034) 
at org.apache.pig.PigServer.store(PigServer.java:997) 
at org.apache.pig.PigServer.openIterator(PigServer.java:910) 
... 13 more 
Caused by: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294) 
at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:540) 
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235) 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:801) 
...20 more 

I have seen questions about similar failed jobs, but unfortunately I haven't managed to find a solution so far.

Edit: I should mention that I ran into the same problem when following the Pig tutorial at the link below.

http://www.sunlab.org/teaching/cse8803/fall2016/lab/hadoop-pig/

See the answer; I accidentally posted it as a comment. – mongolol

Answer

So, I found that I was able to "dump" the file as follows:

tmp = LIMIT events 100000; -- any int larger than the number of rows
dump tmp; 

I had also seen a similar problem reported here, and was able to resolve it by running as root.
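
Since running as root made the problem go away, a file-permission issue is one plausible cause. That is an assumption on my part, not something the logs above confirm. A minimal Python sketch for checking whether the current (non-root) user can read the input file and write to the scratch directory follows. The helper name `check_access` is made up for illustration, the input path is copied from the job summary, and `/tmp` as Pig's scratch location is only a guess.

```python
import os


def check_access(input_path, scratch_dir):
    """Report whether the current user can read input_path and write to scratch_dir."""
    return {
        "input_exists": os.path.isfile(input_path),
        "input_readable": os.access(input_path, os.R_OK),
        # Creating files in a directory needs both write and execute permission on it.
        "scratch_writable": os.path.isdir(scratch_dir)
        and os.access(scratch_dir, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Both paths are assumptions: the input path comes from the job summary,
    # and /tmp is only a guess at where Pig writes its temporary output.
    for name, ok in check_access("/bootcamp/test/events.csv", "/tmp").items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

If any check fails for the normal user but passes under root, fixing ownership or permissions on that path is probably a cleaner fix than running Pig as root.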
