2014-12-04 72 views

I've just started writing some scripts in Pig Latin, and I'm trying to sum an int column using SUM(). My script looks like this:

DATA = LOAD 'SomeFile' as (fingerPrint, size, str1, str2); 
groupedChunks = GROUP DATA BY fingerPrint; 


uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1; 
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size; 

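Now I have a relation with only one column, size. If I call DESCRIBE on it, I get this output: sizes: {size: int}

This is the step I need help with: how do I get the sum of all the sizes in this column?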

Answers

1

Can you try this?

result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes); 
DUMP result; 

UPDATE: full code

input.txt:

a  1  b  c 
d  2  e  f 

PigScript:

DATA = LOAD 'input.txt' as (fingerPrint, size, str1, str2); 
groupedChunks = GROUP DATA BY fingerPrint; 
uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1; 
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size; 
result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes); 
DUMP result; 

Output:

(3.0) 
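
The result above comes back as a double because the fields were loaded without types. Here is a minimal sketch of the same script with an explicit schema. The field types and the PigStorage('\t') delimiter are my assumptions, based on the DESCRIBE output in the question and the tab-separated sample; adjust them to your real file. The aliases allSizes and total are just names I picked for illustration.

```pig
-- Assumption: tab-separated input, with size stored as an integer.
DATA = LOAD 'input.txt' USING PigStorage('\t')
       AS (fingerPrint:chararray, size:int, str1:chararray, str2:chararray);

-- Keep only the fingerprints that occur exactly once.
groupedChunks = GROUP DATA BY fingerPrint;
uniqueChunks = FILTER groupedChunks BY COUNT(DATA) == 1;

-- Each remaining group holds a single row, so MAX just extracts its size.
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) AS size;

-- GROUP ... ALL puts every row into one bag so SUM can run over the whole column.
allSizes = GROUP sizes ALL;
result = FOREACH allSizes GENERATE SUM(sizes.size) AS total;
DUMP result;
```

With size declared as int, SUM should return a long rather than a double, which would also explain an implicit-cast-to-long warning. As far as I know that warning is informational and not an error in itself.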

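I'm still getting the error "Unable to open iterator for alias result". Is it related to casting? I also get this warning for size: Encountered Warning IMPLICIT_CAST_TO_LONG – Bafla13 2014-12-05 08:45:14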

It works fine for me. Can you paste your sample input? – 2014-12-05 12:29:02

a\t1\tb\tc d\t2\te\tf (by \t, I guess you know I mean a tab) – Bafla13 2014-12-05 12:31:37

0

V = GROUP DATA ALL; 
result = FOREACH V GENERATE SUM(DATA.size); 