2016-08-04 49 views

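I have a block of content. I have split the paragraph onto separate lines so that I can explain it clearly. I need a regular expression for the words that contain tags in between.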

<p>The most 
i<del class="del" editid="6">m</del>por<ins class="ins">sss</ins>t<del class="del>a</del>n<ins class="ins">sss</ins>t 
reso<del class="del">ur</del>ce 
for all develo<ins class="ins">vvv</ins>pers 
working with , 
integratin<del class="del">g i</del>t 
with their 
<ins class="ins">ssss</ins>w<ins class="ins">ss</ins><del class="del">e</del><ins class="ins">ss</ins>bsi<del class="del">te</del>s 
and applications, 
an<ins class="ins">sss</ins>d<del class="del"> </del>customizing 
to their needs. You can start from here. 

In this content there are words that contain <del></del> and <ins></ins> tags. Each word can contain any number of <del></del> and <ins></ins> tags.

I want to write a regular expression that identifies the words made up with these <del></del> and <ins></ins> tags.

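Note that the regular expression should match only the words that contain <del></del> and <ins></ins> tags. A word can start with a letter, a <del> tag, or an <ins> tag, and it can also end with a letter, a <del> tag, or an <ins> tag.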

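Here a word means text that starts after a space and continues until the next space, where that closing space is not inside a <del> or <ins> element. In other words, a space between the words inside the <del></del> and <ins></ins> tags does not end the word.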

For example, the words to match in the content above are:

    i<del class="del" editid="6">m</del>por<ins class="ins">sss</ins>t<del class="del>a</del>n<ins class="ins">sss</ins>t 

    reso<del class="del">ur</del>ce 

    integratin<del class="del">g i</del>t 

    <ins class="ins">ssss</ins>w<ins class="ins">ss</ins><del class="del">e</del><ins class="ins">ss</ins>bsi<del class="del">te</del>s 

    an<ins class="ins">sss</ins>d<del class="del"> </del>customizing 

How can I write a regular expression that identifies words meeting these conditions? Please help.

Have you tried anything? – Rao

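@Rao I am very new to regular expressions. I tried something, which I guess is not correct: https://regex101.com/r/wJ9rL3/1. It does not identify all the cases, especially those with multiple tags. – chai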

Answer


Regex101

const text = ['<p>The most ' 
    ,' i<del class="del" editid="6">m</del>por<ins class="ins">sss</ins>t<del class="del>a</del>n<ins class="ins">sss</ins>t ' 
    ,' reso<del class="del">ur</del>ce ' 
    ,' for all develo<ins class="ins">vvv</ins>pers ' 
    ,' working with , ' 
    ,' integratin<del class="del">g i</del>t' 
    ,' with their ' 
    ,' <ins class="ins">ssss</ins>w<ins class="ins">ss</ins><del class="del">e</del><ins class="ins">ss</ins>bsi<del class="del">te</del>s ' 
    ,' and applications, ' 
    ,' an<ins class="ins">sss</ins>d<del class="del"> </del>customizing' 
    ,' to their needs. You can start from here.' 
].join('\n'); 
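// Match each whitespace-delimited word that contains at least one <del> or <ins> element.
// Without the `s` flag, `.` does not cross newlines, so each line is matched separately.
text.match(/(\s|^)(\S{0,}<(del|ins).*>(.*)<\/(del|ins)>\S{0,})(\s|$)/g); 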

Result:

Array 
0 " i<del class="del" editid="6">m</del>por<ins class="ins">sss</ins>t<del class="del>a</del>n<ins class="ins">sss</ins>t " 
1 " reso<del class="del">ur</del>ce " 
2 " develo<ins class="ins">vvv</ins>pers " 
3 " integratin<del class="del">g i</del>t " 
4 " <ins class="ins">ssss</ins>w<ins class="ins">ss</ins><del class="del">e</del><ins class="ins">ss</ins>bsi<del class="del">te</del>s " 
5 " an<ins class="ins">sss</ins>d<del class="del"> </del>customizing " 
length 6 
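
One caveat about the pattern above: the `.*` parts are greedy, so the result shown depends on every tagged word sitting on its own line. As a small check (the single-line input below is made up from two of the sample lines, not taken from the question), putting two tagged words on one line gives a single match that also swallows the plain words in between:

```javascript
// Same pattern as in the answer, applied to one line that holds two tagged words.
const pattern = /(\s|^)(\S{0,}<(del|ins).*>(.*)<\/(del|ins)>\S{0,})(\s|$)/g;

// Hypothetical input: two sample lines joined with spaces instead of newlines.
const line = 'reso<del class="del">ur</del>ce for all develo<ins class="ins">vvv</ins>pers';

const found = line.match(pattern) || [];
console.log(found.length);      // 1
console.log(found[0] === line); // true: the whole line came back as one "word"
```

So the pattern is fine for input that is already split line by line, but it cannot separate several tagged words that share a line.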
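This works well for the tags, but in this case, https://regex101.com/r/cE4mE3/2, it must return 3 matches, as you can see here: https://regex101.com/r/cE4mE3/3. There I have split the content so that you can understand it better. – chai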
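
For the case raised in the comment, where several tagged words sit on the same line, one possible approach is to drop `.*` and treat each `<del>`/`<ins>` element as a single unit of the word, so that a space inside an element does not end the word but a space outside does. This is only a sketch under some assumptions: the names `ELEMENT`, `UNIT`, `TAGGED_WORD` and `findTaggedWords` are made up for illustration, elements are assumed not to be nested or to contain other tags, the closing tag is not checked against the opening one, and the engine must support lookbehind (ES2018).

```javascript
// One complete <del ...>...</del> or <ins ...>...</ins> element.
// Assumption: no nested tags inside, so the content is simply [^<]*.
const ELEMENT = String.raw`<(?:del|ins)\b[^>]*>[^<]*<\/(?:del|ins)>`;

// A unit of a word: one ordinary non-space character, or one whole element.
const UNIT = String.raw`(?:[^\s<]|${ELEMENT})`;

// A word: units with at least one element among them, bounded by
// whitespace or the string edges on both sides.
const TAGGED_WORD = new RegExp(
  String.raw`(?<!\S)${UNIT}*${ELEMENT}${UNIT}*(?!\S)`,
  'g'
);

// Hypothetical helper: returns every word that contains a <del>/<ins> element.
function findTaggedWords(html) {
  return html.match(TAGGED_WORD) || [];
}

const sample =
  'reso<del class="del">ur</del>ce for all develo<ins class="ins">vvv</ins>pers ' +
  'and an<ins class="ins">sss</ins>d<del class="del"> </del>customizing';

const words = findTaggedWords(sample);
console.log(words.length); // 3
console.log(words);
// [ 'reso<del class="del">ur</del>ce',
//   'develo<ins class="ins">vvv</ins>pers',
//   'an<ins class="ins">sss</ins>d<del class="del"> </del>customizing' ]
```

Because a character and an element can never start at the same position (one is `<`, the other is not), the alternation stays unambiguous and plain words such as "for" or "all" are rejected quickly. For arbitrary HTML, a real parser would probably be more reliable than any regex.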