Maybe this will help:
[jaypal:~/Temp] cat tmp
2   118610455   P2_PM_2_5034    T   <DUP:TANDEM>    40  .   END=118610566;SVLEN=110;SVTYPE=TDUP;CIPOS=-100,55;CIEND=-56,100;IMPRECISE;DBVARID=esv7540;VALIDATED;VALMETHOD=CGH;SVMETHOD=RP
[jaypal:~/Temp] var=$(awk -v FS="[ ;=]" '{print $1,$4,$24}' tmp)
[jaypal:~/Temp] echo $var
2 118610455 118610566
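FS is awk's built-in field separator variable. By default it is a single space, which makes awk split on runs of spaces and tabs. Since your line has more than one kind of delimiter, setting FS to a character class splits the line at each of them. The character class defined here matches a space, a semicolon, or an equals sign. Note that every separator character counts on its own, so a run of several spaces produces empty fields; that is why the end position shows up as $24.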
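If counting through the empty fields gets confusing, one variation is to add a + to the character class so that a whole run of separators counts as one. This is only a sketch: the sample line is a shortened copy of yours, the spacing in it is my assumption, and the field number ($9) holds only as long as the columns before END stay the same.

```shell
# Shortened sample line; the runs of spaces between columns are assumed.
line='2   118610455   P2_PM_2_5034    T   <DUP:TANDEM>    40  .   END=118610566;SVLEN=110;SVTYPE=TDUP'

# "[ ;=]+" treats each run of spaces, semicolons or equals signs as ONE separator,
# so no empty fields appear and the value after END= becomes field 9.
var=$(printf '%s\n' "$line" | awk -v FS='[ ;=]+' '{print $1, $2, $9}')
echo "$var"
```

With this separator the result no longer depends on how many spaces sit between the columns.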
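This may look a little odd, but whenever I parse a line with more than one delimiter, I use the following as a debugging tool to identify the columns. Here is what I got from your line: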
[jaypal:~/Temp] awk -v FS="[ ;=]" '{for(i=1;i<=NF;i++) print "$"i" is "$i}' tmp
$1 is 2
$2 is
$3 is
$4 is 118610455
$5 is
$6 is
$7 is P2_PM_2_5034
$8 is
$9 is
$10 is
$11 is T
$12 is
$13 is
$14 is <DUP:TANDEM>
$15 is
$16 is
$17 is
$18 is 40
$19 is
$20 is .
$21 is
$22 is
$23 is END
$24 is 118610566
$25 is SVLEN
$26 is 110
$27 is SVTYPE
$28 is TDUP
$29 is CIPOS
$30 is -100,55
$31 is CIEND
$32 is -56,100
$33 is IMPRECISE
$34 is DBVARID
$35 is esv7540
$36 is VALIDATED
$37 is VALMETHOD
$38 is CGH
$39 is SVMETHOD
$40 is RP
You can also use awk's simple built-in substr function:
[jaypal:~/Temp] awk '{print $1,$2,substr($8,5,9)}' tmp
2 118610455 118610566
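The substr call above relies on END= being the first item in column 8 and on the value being exactly 9 characters long. If either might vary, a rough alternative is to look the key up by name. This is a sketch under my own assumptions: the sample line is a shortened copy of yours, and kv is just an array name I picked.

```shell
# Shortened sample line; default whitespace splitting puts the KEY=VALUE list in $8.
line='2 118610455 P2_PM_2_5034 T <DUP:TANDEM> 40 . END=118610566;SVLEN=110;SVTYPE=TDUP'

out=$(printf '%s\n' "$line" | awk '{
    n = split($8, kv, ";")                  # break column 8 into KEY=VALUE items
    for (i = 1; i <= n; i++)
        if (kv[i] ~ /^END=/)                # find the item whose key is END
            print $1, $2, substr(kv[i], 5)  # drop the 4-character "END=" prefix
}')
echo "$out"
```

Leaving out the third argument to substr returns everything from position 5 to the end of the item, so the length of the number no longer matters.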
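Thanks, very nice... but could you explain sub(/;.*/,"",$8)? I understand that it truncates everything after the ;, right? But I don't understand what .* means here. – user815408 2011-12-19 22:07:47
Added an explanation. – Kevin 2011-12-19 22:21:22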