Pig error: Unable to open iterator for alias
Here is my Pig script:
json = LOAD '/tmp/events/*/*/flume-.*' USING JsonLoader('state:chararray, city:chararray, promotionType:chararray, promotionPlace: chararray, purchase:int');
grouped = FOREACH (group json BY (state, city, promotionType, promotionPlace)) GENERATE group, SUM(json.purchase) as purchase;
grpd = GROUP grouped BY group.city;
top1 = FOREACH grpd {
    sorted = ORDER grouped BY purchase DESC;
    top = LIMIT sorted 1;
    GENERATE group, FLATTEN(top);
};
DUMP top1;
It works for a few files, but with many files (3K) it fails with the error "Unable to open iterator for alias top1". Any ideas on how to fix this?
It's hard to say. Maybe one of your 3K files is corrupted, or it has a different schema? You could try to load and dump a union of the data. – AntonyBrd
It is the same schema.
For people who found this post while looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution), here is a [generic solution](http://stackoverflow.com/a/34495086/983722).