2011-04-22 41 views

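Using awk to create sums for nested categories

I have a file containing expenses. The categories form a tree: a category can have several subcategories, each of which can have subcategories of its own, and so on.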

2011-02-01,-4.00,entertainment/itunes 
2011-02-02,-5.00,entertainment/food/dinner 
2011-02-03,-6.00,entertainment/food/take-away/thai 
2011-02-04,-7.00,entertainment/food/take-away/indian 
2011-02-05,-8.00,entertainment/books/kindle 
2011-02-05,-8.00,entertainment/books/kindle 
2011-02-06,-9.00,entertainment/books/real 

I would like to use awk to create a report that totals every node in the category tree.

For example:

entertainment:-47.00 
entertainment/books:-25.00 
entertainment/books/kindle:-16.00 
entertainment/books/real:-9.00 
entertainment/food:-18.00 
entertainment/food/take-away:-13.00 
entertainment/food/take-away/indian:-7.00 
entertainment/food/take-away/thai:-6.00 

Any help would be much appreciated.


An interesting question: shall we see whether we end up dropping the iTunes expense? – sehe 2011-04-22 19:37:14

Answers

How about this?

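awk -F, '{
    # Add the amount to the full category path.
    tots["/"$3]+=$2
    # Then add it to each ancestor, i.e. every proper prefix of the path.
    n=split($3, tmpT, "/")
    key="/"
    for (i=1;i<n;i++) {
        key = (key == "/") ? key tmpT[i] : key "/" tmpT[i]
        tots[key]+=$2
    }
}
END{
    for (t in tots) print t "\t" tots[t]
}' testData.txt | sort -u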

**Output** 
/entertainment -47  
/entertainment/books -25 
/entertainment/books/kindle  -16 
/entertainment/books/real  -9 
/entertainment/food  -18 
/entertainment/food/dinner  -5 
/entertainment/food/take-away -13 
/entertainment/food/take-away/indian -7 
/entertainment/food/take-away/thai  -6 
/entertainment/itunes -4 

It subtotals every child node as well.

I'm not sure whether that is a key requirement.

I hope this helps.
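For reference, here is a minimal sketch of the same prefix-summing idea that prints the `path:total` format shown in the question. The function name `category_totals`, reading from standard input, the two-decimal formatting, and the stripping of trailing whitespace are all assumptions of this sketch rather than part of the answer above. It also assumes category names contain no `:` or `,`.

```shell
# Hypothetical helper: reads "date,amount,category/path" lines on stdin
# and prints one "path:total" line per node of the category tree.
category_totals() {
    awk -F, '
    {
        # Assumption: drop trailing blanks or a carriage return from the path.
        sub(/[ \t\r]+$/, "", $3)
        n = split($3, parts, "/")
        key = ""
        # Add the amount to every prefix of the path, including the full path.
        for (i = 1; i <= n; i++) {
            key = (i == 1) ? parts[i] : key "/" parts[i]
            totals[key] += $2
        }
    }
    END {
        for (k in totals) printf "%s:%.2f\n", k, totals[k]
    }' | LC_ALL=C sort -t: -k1,1   # byte-wise sort on the path: parents first
}
```

It could be run as `category_totals < testData.txt`. Sorting on the path alone in the C locale is what places each parent directly before its children.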


Perfect, exactly what I was looking for. – Ben 2011-04-24 22:10:19

OK, I don't do awk, but maybe this will help:

cat input | 
    perl -e 'while (<>) 
    { chomp; (undef, $bal, $cat) = split /,/; $tot{$cat} += 1.0 * $bal + 0.0; } 
    map { print "$_: $tot{$_}\n" } keys %tot; ' 

Ha, that took a while (life goes on), but I don't see that anyone has beaten me to it?! I really can't justify calling this a one-liner any more:

#!/usr/bin/perl 
use strict; 
use warnings; 

my %tot; 

while (<>) 
{ 
    chomp; 
    my (undef, $bal, $cat) = split /,/; 
    my @subs = split (qr(/), $cat); 

    $tot{$_} += ($bal+0.0) 
     foreach (map { join('/', @subs[0..$_]) } (0 .. $#subs)) 
} 

map { print "$_: $tot{$_}\n" } sort keys %tot; 

Doing it on the command line: no script needed. This keeps it simple and easy to understand.

awk -F",|/" '{a[$3]+=$2; 
       b[$3"/"$4]+=$2; 
       c[$3"/"$4"/"$5]+=$2; 
       d[$3"/"$4"/"$5"/"$6]+=$2} 
      END{for (i in a) print i","a[i]; 
       for (j in b) print j","b[j]; 
       for (k in c) print k","c[k]; 
       for (l in d) print l","d[l];}' file.txt|grep -v '/,' 

Output:

entertainment,-47 
entertainment/books,-25 
entertainment/itunes,-4 
entertainment/food,-18 
entertainment/books/kindle,-16 
entertainment/books/real,-9 
entertainment/food/take-away,-13 
entertainment/food/dinner,-5 
entertainment/food/take-away/thai,-6 
entertainment/food/take-away/indian,-7
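
The command above builds its keys from fields `$3` to `$6` only, so it covers at most four levels of nesting. A quick way to check whether a given file fits within that limit is sketched below. The function name `max_category_depth` and reading from standard input are assumptions made for illustration.

```shell
# Hypothetical helper: prints the deepest category nesting found on stdin,
# given "date,amount,category/path" lines.
max_category_depth() {
    awk -F, '
    {
        # split() returns the number of "/"-separated components.
        depth = split($3, parts, "/")
        if (depth > max) max = depth
    }
    END { print max + 0 }'   # "+ 0" prints 0 rather than an empty line on empty input
}
```

Running `max_category_depth < file.txt` and getting a value above 4 would mean the deeper levels are not reported correctly by the fixed-depth command.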