2014-10-19

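I am programming in Pig and have run into an error that I have not been able to resolve. ERROR 1200: unexpected symbol?

Here is the code I am trying to run: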

--Load files into relations 
month1 = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt' USING PigStorage(','); 
month2 = LOAD 'hdfs:/data/big/data/weather/201202hourly.txt' USING PigStorage(','); 
month3 = LOAD 'hdfs:/data/big/data/weather/201203hourly.txt' USING PigStorage(','); 
month4 = LOAD 'hdfs:/data/big/data/weather/201204hourly.txt' USING PigStorage(','); 
month5 = LOAD 'hdfs:/data/big/data/weather/201205hourly.txt' USING PigStorage(','); 
month6 = LOAD 'hdfs:/data/big/data/weather/201206hourly.txt' USING PigStorage(','); 

--Combine relations 
months = UNION month1, month2, month3, month4, month5, month6; 

/* Splitting relations 
SPLIT months INTO 
     splitMonth1 IF SUBSTRING(date, 4, 6) == '01', 
     splitMonth2 IF SUBSTRING(date, 4, 6) == '02', 
     splitMonth3 IF SUBSTRING(date, 4, 6) == '03', 
     splitRest IF (SUBSTRING(date, 4, 6) == '04' OR SUBSTRING(date, 4, 6) == '04'); 
*/ 

/* Joining relations 

stations = LOAD 'hdfs:/data/big/data/QCLCD201211/stations.txt' USING PigStorage() AS (id:int, name:chararray) 

JOIN months BY wban, stations by id; 

*/ 

--filter out unwanted data 
clearWeather = FILTER months BY SkyCondition == 'CLR'; 

--Transform and shape relation 
shapedWeather = FOREACH clearWeather GENERATE date, SUBSTRING(date, 0, 4) as year, SUBSTRING(date, 4, 6) as month, SUBSTRING(date, 6, 8) as day, skyCondition, dryTemp; 

--Group relation specifying number of reducers 
groupedMonthDay = GROUP shapedWeather BY month, day PARALLEL 10; 

--Aggregate relation 
aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10; 

--Sort relation 
sortedResults = SORT aggedResults BY $1 DESC; 

--Store results in HDFS 
STORE SortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':'); 

This is what I get back when I run the code:

Pig Stack Trace 
--------------- 
ERROR 1200: <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1691) 
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) 
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) 
    at org.apache.pig.PigServer.executeBatch(PigServer.java:369) 
    at org.apache.pig.PigServer.executeBatch(PigServer.java:355) 
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) 
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) 
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) 
    at org.apache.pig.Main.run(Main.java:607) 
    at org.apache.pig.Main.main(Main.java:156) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160) 
Caused by: Failed to parse: <file /home/eduardo/Documentos/pig/weather.pig, line 35, column 52> Syntax error, unexpected symbol at or near 'PARALLEL' 
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:241) 
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179) 
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) 
    ... 15 more 
================================================================================ 

Answer


If you are grouping by multiple columns, you must enclose them in parentheses:

groupedMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10; 
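To show how the parenthesised key behaves further down the pipeline, here is a minimal sketch. The four-column schema in the LOAD is purely an assumption for illustration (the posted script declares no schema, and the real weather files may be laid out differently), and the aliases `weather`, `shaped`, `byMonthDay`, `stats` and `sorted` are made up for this example. Note that the script as posted also groups into `groupedMonthDay` but then reads `groupedByMonthDay`, stores `SortedResults` after defining `sortedResults`, and uses `SORT`; aliases in Pig are case-sensitive and the sorting operator is `ORDER ... BY`, so the sketch uses consistent names and `ORDER`.

```pig
-- Sketch only: this four-column schema is hypothetical, not the real file layout.
weather = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt' USING PigStorage(',')
          AS (wban:chararray, date:chararray, skyCondition:chararray, dryTemp:double);

-- Pull the month and day out of a yyyymmdd date string.
shaped = FOREACH weather GENERATE
             SUBSTRING(date, 4, 6) AS month,
             SUBSTRING(date, 6, 8) AS day,
             dryTemp;

-- The parentheses make (month, day) a single tuple-valued grouping key.
byMonthDay = GROUP shaped BY (month, day) PARALLEL 10;

-- FLATTEN turns the tuple key back into two ordinary fields.
stats = FOREACH byMonthDay GENERATE
            FLATTEN(group) AS (month, day),
            AVG(shaped.dryTemp) AS avgTemp,
            MIN(shaped.dryTemp) AS minTemp,
            MAX(shaped.dryTemp) AS maxTemp,
            COUNT(shaped) AS readings;

-- ORDER ... BY is the sorting operator; the alias must match exactly.
sorted = ORDER stats BY avgTemp DESC;

DESCRIBE sorted;
```

As far as I know, PARALLEL only has an effect on operators that start a reduce phase (GROUP, JOIN, ORDER BY and so on), which is why the sketch leaves it off the FOREACH.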

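Another point: you can avoid the multiple LOAD statements and the UNION by using the command below, which loads every file whose name matches the pattern.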

allMonths = LOAD 'hdfs:/data/big/data/weather/[0-9]*hourly.txt' USING PigStorage(','); 

In case you want to load only the above six files out of a larger set of files, you can load them like this:

allMonths = LOAD 'hdfs:/data/big/data/weather/20120[1-6]*hourly.txt' USING PigStorage(','); 
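A couple of variations on the same idea, as a rough sketch. Pig passes the path to Hadoop's glob matching, which (as far as I am aware) also supports brace alternation in addition to `*` and character classes. The aliases `firstHalf` and `picked` and the choice of months in the second statement are only examples.

```pig
-- [1-6] already matches exactly one digit, so the '*' can be dropped
-- when the file names are exactly 201201hourly.txt ... 201206hourly.txt.
firstHalf = LOAD 'hdfs:/data/big/data/weather/20120[1-6]hourly.txt' USING PigStorage(',');

-- Brace alternation picks out months that do not form a contiguous range.
picked = LOAD 'hdfs:/data/big/data/weather/2012{01,03,06}hourly.txt' USING PigStorage(',');
```

The tighter pattern avoids accidentally picking up other files that happen to start with 201201 through 201206 and end in hourly.txt.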

Thank you for your help, I solved the problem. – 2014-10-21 00:01:16