2016-11-18 92 views
a = load '/user/home/samp.txt' using PigStorage(','); 
dump a; 
(2008-Jan-12,12.1,13.1,36.0) 
(2008-Jan-13,13.1,14.1,45.00) 
(2008-Jan-15,14.2,15.2,47.00) 
(2008-Jan-16,16.1,17.1,47.5) 
(2008-Jan-12,8.5,17,50,12.0) 
(2008-Jan-12,n#/a,n#/a,n#/a) 
(2008-Jan-19,n#/a,n#/a,n#/a) 
(2008-Jan-12,n#/a,n#/a,27) 
(2008-Jan-12,n#/a,13.00,n#/a) 
b = filter a by ($1!='n#/a' OR $2!='n#/a' OR $3!='n#/a'); 
dump b; 
(2008-Jan-12,12.1,13.1,36.0) 
(2008-Jan-13,13.1,14.1,45.00) 
(2008-Jan-15,14.2,15.2,47.00) 
(2008-Jan-16,16.1,17.1,47.5) 
(2008-Jan-12,8.5,17,50,12.0) 
(2008-Jan-12,n#/a,n#/a,27) 
(2008-Jan-12,n#/a,13.00,n#/a) 

Why am I still getting 'n#/a' in b? Pig filter with the OR operator.

Answer


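The result is as expected, because you are using != together with OR. A row that contains 'n#/a' still passes the filter as long as at least one of the three conditions is true, which is the case for these two rows:
(2008-Jan-12,n#/a,n#/a,27)
(2008-Jan-12,n#/a,13.00,n#/a)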
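To see why those rows survive, the OR condition can be evaluated by hand. The following is a small Python sketch (not Pig) that mimics the filter on a few of the sample rows from the question. The names rows, NA and keep_or are hypothetical and introduced only for illustration, and the sketch ignores Pig's null handling.

```python
# A few sample rows from the question, as tuples of strings (fields $0..$3).
rows = [
    ("2008-Jan-12", "12.1", "13.1", "36.0"),
    ("2008-Jan-12", "n#/a", "n#/a", "n#/a"),
    ("2008-Jan-12", "n#/a", "n#/a", "27"),
    ("2008-Jan-12", "n#/a", "13.00", "n#/a"),
]

NA = "n#/a"


def keep_or(row):
    # Mirrors: $1 != 'n#/a' OR $2 != 'n#/a' OR $3 != 'n#/a'
    # The row is kept as soon as any single field holds a real value.
    return row[1] != NA or row[2] != NA or row[3] != NA


kept = [r for r in rows if keep_or(r)]
for r in kept:
    print(r)
```

Only the row in which all three fields are 'n#/a' is rejected; every row with at least one real value is kept, including the partially filled ones.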

To keep only the rows in which none of the fields is 'n#/a', use AND:

B = FILTER A BY (($1 != 'n#/a') AND ($2 != 'n#/a') AND ($3 != 'n#/a')); 

If you would rather use OR, combine the equality checks with OR and then negate the result:

B = FILTER A BY NOT($1 == 'n#/a' OR $2 == 'n#/a' OR $3 == 'n#/a'); 

Or, using matches:

B = FILTER A BY NOT($1 matches 'n#/a' OR $2 matches 'n#/a' OR $3 matches 'n#/a'); 
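The AND form and the negated OR form are two spellings of the same condition (De Morgan's law). As a rough check, the Python sketch below (again not Pig) compares both predicates over every combination of a missing and a present value. The helper names keep_and and keep_not_or and the sample value "13.00" are assumptions made for illustration, and Pig's null semantics are not modelled.

```python
from itertools import product

NA = "n#/a"


def keep_and(f1, f2, f3):
    # Mirrors: ($1 != 'n#/a') AND ($2 != 'n#/a') AND ($3 != 'n#/a')
    return f1 != NA and f2 != NA and f3 != NA


def keep_not_or(f1, f2, f3):
    # Mirrors: NOT($1 == 'n#/a' OR $2 == 'n#/a' OR $3 == 'n#/a')
    return not (f1 == NA or f2 == NA or f3 == NA)


# Every combination of "missing" and "present" across the three fields.
values = [NA, "13.00"]
combos = list(product(values, repeat=3))

# True when both predicates agree on all eight combinations.
agree = all(keep_and(*c) == keep_not_or(*c) for c in combos)
print(agree)
```

Both predicates accept a row only when all three fields hold real values, so either form can be used in the FILTER statement.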
