awk/gsub: substituting special characters and extracting strings
I have a file containing many lines like the ones below:
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">XX:The quick brown fox jumped over the lazy </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">YY:Jack and Jill went up the hill </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">ZZ: Mary had a little lamb </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
I want to extract the following strings and discard everything else.
XX: The quick brown fox jumped over the lazy
YY: Jack and Jill went up the hill
ZZ: Mary had a little lamb
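So far I have tried the following awk command, but it only works for XX; both the pattern and the replacement string would have to be changed for YY and ZZ.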
awk '{gsub(/^.*XX:/,"XX:"); gsub(/[<\a>].*$/,"[</a>].");print}'
Can anyone suggest a way to do this with any other standard Linux tool? Thanks.
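How generic are XX/YY/ZZ? If they follow a fixed pattern, you can match them with '[XYZ]{2}' in most regex engines. – stevesliva
@stevesliva, I think the problem is more (or also) that the OP has to change the replacement string as well as which letters the regex matches. – jas
Hi, jas is correct: the string before the ':' varies, so handling that is a requirement. Thanks for your reply. – niknak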
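
Since the prefix before the ':' varies, one possible approach is to stop matching on XX/YY/ZZ and anchor on the surrounding <a ...> and </a> tags instead. The following is only a sketch, under these assumptions: each line holds exactly one <a ...>...</a> pair, the link text contains no nested tags, and a single space after the first ':' is wanted (as in the desired output above). The function name extract_link_text and the file name file.html are made up for illustration.

```shell
# Hypothetical helper: print only the text between <a ...> and </a>.
# Assumes one anchor per line and no nested tags inside the link text.
extract_link_text() {
    awk '
    match($0, /<a [^>]*>[^<]*<\/a>/) {
        s = substr($0, RSTART, RLENGTH)   # the whole <a ...>text</a> element
        sub(/^<a [^>]*>/, "", s)          # drop the opening tag
        sub(/[ \t]*<\/a>$/, "", s)        # drop the closing tag and trailing blanks
        sub(/:[ \t]*/, ": ", s)           # assumption: exactly one space after the first colon
        print s
    }' "$@"
}
```

It could be run as `extract_link_text file.html`, or fed from a pipe. Lines without an <a ...>...</a> pair print nothing. Only match, substr and sub are used, so this should work in any POSIX awk rather than gawk alone.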