2012-07-17 95 views
2

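Extracting the count of specific characters from a text file using awk

I have a text file that looks like this. I want to extract the total number of "A" and "E" characters.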

>pr1 
FSVSQNNPAE 
>pr2 
MAKERAHSQ 
>pr3 
RRRDKINNWIVQL 

I would like to get output like this:

>pr1 
Total number of A - 1 
Total number of E - 1 

>pr2 
Total number of A - 2
Total number of E - 1 

>pr3 
Total number of A - 0
Total number of E - 0

How can I produce this output using awk?

Answers

4

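One way: when a line starting with > is found, read the next line into the str variable and count each letter by the number of substitutions gsub makes on it.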

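awk '
    # A header line starts with ">"; the sequence is on the line after it.
    $1 ~ /^>/ {
        getline str                      # read the next line into str ($0 is unchanged)
        num_a = gsub(/A/, "", str)       # gsub returns the number of substitutions made
        num_e = gsub(/E/, "", str)
        printf "%s\nTotal number of A - %d\nTotal number of E - %d\n\n", $0, num_a, num_e
    }
' infile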

Output:

>pr1
Total number of A - 1
Total number of E - 1

>pr2
Total number of A - 2
Total number of E - 1

>pr3
Total number of A - 0
Total number of E - 0
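
One thing the command above does not check is whether getline actually read a line. The following is only a sketch of the same idea with that check added; the function name count_ae and the temporary sample file are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical wrapper around the answer's awk program; $1 is the input file.
count_ae() {
    awk '
        /^>/ {
            # getline returns 1 on success, 0 at end of file, -1 on error
            if ((getline str) <= 0)
                next                     # header with no sequence line: skip it
            num_a = gsub(/A/, "", str)
            num_e = gsub(/E/, "", str)
            printf "%s\nTotal number of A - %d\nTotal number of E - %d\n\n", $0, num_a, num_e
        }
    ' "$1"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '>pr1\nFSVSQNNPAE\n>pr2\nMAKERAHSQ\n>pr3\nRRRDKINNWIVQL\n' > "$sample"
count_ae "$sample"
rm -f "$sample"
```

With the sample input this should print the same three blocks as above. A trailing header without a sequence is silently dropped rather than reported with stale counts.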
+0

More or less the same idea, but without the getline ... – 2012-07-17 12:26:56

+0

@Birei Thank you very much. – user1510708 2012-07-17 14:40:10

3

UPDATE: This works by changing FS (the field separator) on the fly:

{
    if ($0 ~ /^>/)
        printf("\n%s\n", $0);
    else
    {
        nl = $0;          # save the sequence line

        FS = "A"
        $0 = nl;          # reassigning $0 re-splits the record with the new FS
        print "Total number of A -", NF-1;

        FS = "E"
        $0 = nl;
        print "Total number of E -", NF-1;
    }
}

Gives:

>pr1 
Total number of A - 1 
Total number of E - 1 

>pr2 
Total number of A - 2 
Total number of E - 1 

>pr3 
Total number of A - 0 
Total number of E - 0 
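
The reason NF-1 works as a count: when a line is split on a single letter, n occurrences of that letter cut the line into n+1 fields. A minimal sketch of just that step, assuming a hypothetical helper named count_char and a separator that is an uppercase letter (as in this question):

```shell
# Hypothetical helper: count occurrences of the single letter $1 in the string $2.
count_char() {
    # Splitting on the letter yields one more field than there are occurrences.
    # Note: an empty input line has NF = 0, so this would print -1 for it.
    printf '%s\n' "$2" | awk -F "$1" '{ print NF - 1 }'
}

count_char A MAKERAHSQ       # prints 2
count_char E MAKERAHSQ       # prints 1
count_char A RRRDKINNWIVQL   # prints 0
```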

Previous solution

{
    if ($1 ~ /^>/)
        printf("\n%s\n", $0)
    else
    {
        # gsub returns the number of replacements; replacing a letter with itself leaves $0 unchanged
        print "Total number of A -", gsub(/A/,"A")
        print "Total number of E -", gsub(/E/,"E")
    }
}

Similar to @Birei's
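
If more letters than A and E are needed, the gsub approach can be looped over a list of letters. This is only a sketch: the function name count_letters and the letters variable are assumptions for illustration, and it expects plain letters, since each one is used as a regular expression.

```shell
# Hypothetical helper: $1 is the set of letters to count (e.g. AE), $2 is the input file.
count_letters() {
    awk -v letters="$1" '
        /^>/ { printf "\n%s\n", $0; next }    # header line: print it as-is
        {
            for (i = 1; i <= length(letters); i++) {
                c = substr(letters, i, 1)
                seq = $0                      # gsub modifies its target, so count on a copy
                printf "Total number of %s - %d\n", c, gsub(c, "", seq)
            }
        }
    ' "$2"
}
```

It would be called as count_letters AE infile, where infile is the file from the question.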

+0

What is the idea behind changing FS? – 2012-07-17 12:50:22

+0

@KarlNordström Working solution posted .. I *knew* this was possible with 'FS' .. it was just a matter of getting the sequence right. – Levon 2012-07-17 13:16:07

+0

@user1510708 Working solution posted (update) – Levon 2012-07-17 13:17:43
