2010-09-23 81 views
2

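Creating a tidy shell script from an ugly command-line pipeline

I wrote a piped shell command with several pipes in it, and it works well. I now want to turn it into a (tidy) shell script. Here is the script: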

#!/bin/bash 
for number in `cat xmlEventLog_2010-03-23T* | sed -nr "/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p};/payloadType/{h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p};/sender/{/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p};/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p};/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}"| tee checkThis.txt| awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}' | sort | uniq -c| egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" | awk '{print $1}'`; do fil="$fil+$number" 
done 
echo "fil is $fil" 

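I want to tidy it up so that it is readable. The for loop over the sed and awk pipeline looks ugly. Does anyone have suggestions for tidying up this pipeline monster? Do the pipes prevent me from breaking it up over separate lines?

Thanks

A

If you copy the line above into Notepad you will see what I mean by ugly (but functional).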

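OK folks, this is the final cleaned-up version.

Someone mentioned that the event_structure function could be done entirely in awk. Could anyone show me an example of how to do that? The record separator would be set to /event, which would separate the events, but it is the structure in events.txt (see below) that I am interested in. The numeric result is not important.

The core of the code is the event_structure function. I want to parse out the data and put it all into a data structure that can be examined later. The following works fine. On the line starting with payloadType I need to parse out 2 values, or set any missing value to Unknown. Is this entirely awkable, or is the sed/awk combination I have here the best way to do it?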

#!/bin/bash 

event_structure() { 
     sed -nr "/<event eventTimestamp/,/<\/event>/ { 
      /event /{s/^.*$/\n/; p} 
      /payloadType/{h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p} 
      /sender/{/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p} 
      /result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p} 
      /filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}" xmlEventLog_2010-03-23T* | 
     tee events.txt| 
     awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} 
     $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}' 
} 

numbers=$(event_structure | sort | uniq -c | egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" | awk '{print $1}') 
addition=`echo $numbers | tr -s ' \n\t' '+' | sed -e '1s/^/fil is /' -e '$s/+$//'` 
for number in $numbers 
do 
     fil="$fil+$number" 
done 
echo $addition=$(($fil)) 

Here is part of the resulting events.txt file:

result: Blocked 
sccpAddress: 353869000000 
protocol: SMS 
payload: COPS 
type: SERVICE_BLACK_LIST 
result: Blocked 


result: Blocked 
sccpAddress: 353869000000 
protocol: SMS 
payload: COPS 
type: SERVICE_BLACK_LIST 
result: Blocked 

result: Modified 
sccpAddress: Unknown 
protocol: IM 
payload: IM 
type: NUMBER_BLACKLIST 
result: Modified 

result: Allowed 
sccpAddress: Unknown 
protocol: MM1 
payload: MM1 

Here is the output:

$ ./bashShell.sh 
fil is 2+372+1+1+214+73+1+20=684 

Here is the output of the function call alone:

$ ./bashShell.sh | head -10 
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: TEXT_ANALYSIS;result: Blocked 
result: Blocked;sccpAddress: 353201000002;protocol: SMS;payload: SMS-MT-FSM;type: TEXT_ANALYSIS;result: Blocked 
result: Blocked;sccpAddress: 353201000005;protocol: SMS;payload: SMS-MO-FSM;type: SERVICE_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353201000021;protocol: SMS;payload: SMS-MT-FSM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353201000033;protocol: IM;payload: IM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353401009001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353201000005;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353401000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked 
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked 

P.S. I named the script bashShell.sh for no particular reason.


Answers

3

The pipes don't stop you, but use $(...) rather than backticks. Something like this should work:

#!/bin/bash 

for number in $(
    cat xmlEventLog_2010-03-23T* | 
    sed -nr "/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p};/payloadType/{h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p};/sender/{/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p};/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p};/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}"| 
    tee checkThis.txt | 
    awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}' | 
    sort | 
    uniq -c | 
    egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" | 
    awk '{print $1}' 
); do fil="$fil+$number" 
done 
echo "fil is $fil" 

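Of course, the bigger job is to split the awk and sed scripts over multiple lines as well...

But I believe that even after that, the result will still be fairly unreadable.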

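I would suggest rewriting the script completely in Perl, Ruby, or any other scripting language that is more readable than Bash. This is only a suggestion based on personal experience: every time I start with a shell script, I end up rewriting it in Ruby. I like Bash, but it doesn't seem to scale.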
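On the $(...) versus backticks point, one practical difference is nesting. The sketch below is only an illustration; the function names `nested_dollar` and `nested_backtick` are hypothetical and not part of the original script.

```bash
#!/bin/bash
# Hypothetical demo functions, not taken from the script in the question.

# With $(...), an inner substitution nests without any extra escaping.
nested_dollar() {
    echo "$(echo "outer $(echo inner)")"
}

# With backticks, the inner pair has to be backslash-escaped.
nested_backtick() {
    echo `echo outer \`echo inner\``
}

nested_dollar
nested_backtick
```

Both functions print the same text. The $(...) form also makes it easier to see where a long, multi-line pipeline starts and ends.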

+0

I have also written it in Python (much prettier), but I would like to get the bash version cleaned up. Thanks – amadain 2010-09-23 13:57:29

+0

What is wrong with backticks? – 2010-09-23 17:28:05

+0

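I'm with Rene. You need to use something suited to processing XML files; you shouldn't try to do it with regular expressions. Short of that, the 'sed' part can be rewritten in AWK, and the 'egrep' and 'uniq' can be done in AWK as well. If you have 'gawk', you can also do the 'sort' there. But in the end, you should use a Python or Perl XML module. – 2010-09-23 17:46:27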
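As a rough sketch of the "egrep and uniq in AWK" idea from the comment above: an associative array can filter and count in one pass. The helper name `count_matching` and the sample records are made up for illustration. Unlike `sort | uniq -c`, awk's `for (line in count)` has no defined order, so this sketch sorts the counts numerically; that is harmless if the counts are only summed afterwards.

```bash
#!/bin/bash
# Hypothetical helper (not from the original script): reads records on stdin
# and prints how often each distinct matching line occurs.
count_matching() {
    local types='NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist'
    awk -v types="$types" '
        $0 ~ types { count[$0]++ }                      # filter and count in one pass
        END { for (line in count) print count[line] }   # iteration order is unspecified
    ' | sort -n
}

# Made-up sample records, shaped like the function output in the question.
printf '%s\n' \
    'result: Blocked;type: NUMBER_BLACKLIST' \
    'result: Blocked;type: TEXT_ANALYSIS' \
    'result: Blocked;type: NUMBER_BLACKLIST' \
    'result: Modified;type: USER_BLACKLIST' |
    count_matching
```

With the sample input, the TEXT_ANALYSIS record is dropped and the two distinct matching lines are counted once and twice respectively.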

2

Two small remarks:

Put the "list" in a separate function:

number_list() { 
    # complete pipe command list 
    # divided over multiple lines 
} 

for number in `number_list` 
do 
    # ... 
done 

Try to combine some commands: the cat is not needed, and the final egrep and awk can be merged.
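A minimal sketch of merging the final egrep and awk, assuming input in the shape that `uniq -c` produces (a count followed by the line). The function names and the sample data are hypothetical, and `grep -E` stands in for `egrep`.

```bash
#!/bin/bash
# Two-step form, as in the question: filter first, then print field 1.
two_step() {
    grep -E 'NUMBER_BLACKLIST|USER_BLACKLIST' | awk '{print $1}'
}

# Merged form: a single awk does both the filtering and the printing.
merged() {
    awk '/NUMBER_BLACKLIST|USER_BLACKLIST/ {print $1}'
}

# Made-up sample in the shape of `uniq -c` output.
sample='      2 result: Blocked;type: NUMBER_BLACKLIST
    372 result: Blocked;type: TEXT_ANALYSIS
      1 result: Modified;type: USER_BLACKLIST'

printf '%s\n' "$sample" | two_step
printf '%s\n' "$sample" | merged
```

Both forms print the counts of the matching lines only; the merged one simply saves a process and a pipe.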

+0

Why is the cat not needed? – amadain 2010-09-23 14:20:21

+1

@amadain: 'sed' also accepts multiple file arguments, so you can replace 'cat file ... | sed [pat]' with 'sed [pat] file ...'. – schot 2010-09-23 14:30:16

1

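You can join the separate tokens using tr and prepend "fil is" using sed:

pipeline | tr -s ' \n\t' '+' | sed -e '1s/^/fil is /' -e '$s/+$//'

1

A shell pipeline can actually be written like this:

first-command \
    | second-command \
    | third-command \
    ...
    | last-command

Splitting the pipeline over multiple lines is the easy part. The sed script is the scary bit. This script could be improved with here documents, but see the comment: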

#!/bin/bash 

seds=/tmp/seds.$$ 
awks=/tmp/awks.$$ 
gres=/tmp/gres.$$ 

trap "rm -f $seds $awks $gres" 0 1 2 3 15 

# this is a noble and hairy attempt to parse xml with sed 
# it is extremely fragile and strongly dependent upon 
# the form of the source file never changing 
# I'm alternately proud or disgusted that I've been able 
# to get away with this 

cat > $seds <<'EOF' 
/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p}; 
/payloadType/{h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; 
/type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; 
/type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p}; 
/sender/{/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; 
/sccpAddress/! {s/.*/sccpAddress: Unknown/}; p}; 
/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p}; 
/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};} 
EOF 

cat > $awks <<'EOF' 
BEGIN {FS="\n"; RS=""; OFS=";"; ORS="\n"} 
$1~/result: Blocked|Modified/ && \ 
$2~/sccpAddress: 353201000001/ && \ 
$4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print} 
EOF 

cat > $gres <<EOF 
NUMBER_BLACKLIST 
USER_BLACKLIST 
NUMBER_WALLEDGARDEN 
USER_WALLED_GARDEN 
SERVICE_RESTRICTION 
BLOCK_VOICE_TO_SMS 
PEP_Blacklist_Whitelist 
EOF 

cat xmlEventLog_2010-03-23T* | \ 
sed -nr -f $seds | \ 
tee checkThis.txt | \ 
awk -f $awks | \ 
sort | uniq -c | \ 
fgrep -f $gres | \ 
awk '{print $1}'
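One possible variation on the temp-file approach, sketched under the assumption that `mktemp` is available (it is on common Linux and macOS systems): let `mktemp` pick the file name instead of building it from `$$`, and quote the variable everywhere. The sample input below is made up, and only the pattern file is shown.

```bash
#!/bin/bash
# Sketch only: same idea as the script above, with mktemp choosing the name.
gres=$(mktemp) || exit 1
trap 'rm -f "$gres"' EXIT    # remove the temp file when the script exits

# Quoted 'EOF' keeps the here document literal (no expansion inside it).
cat > "$gres" <<'EOF'
NUMBER_BLACKLIST
USER_BLACKLIST
EOF

# Made-up sample input; grep -F -f reads fixed strings, one per line, from the file.
matches=$(printf '%s\n' \
    '2 type: NUMBER_BLACKLIST' \
    '5 type: TEXT_ANALYSIS' \
    '1 type: USER_BLACKLIST' |
    grep -F -f "$gres" | awk '{print $1}')

echo "$matches"
```

Note that the pattern lines and the closing EOF must not carry trailing spaces, otherwise the fixed-string match or the here document terminator will not behave as intended.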