2014-11-04 61 views
0

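Optimizing a regular expression in C# to search for multi-line expressions in text

I am trying to find expressions of the following form in a text file:

&lt;[some text][newline][some text]&gt;

The catch is that there can be many newlines before we reach the closing &gt;.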

I tried the following regular expression:

&lt;(.*?\n.*?)&gt; 

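It works perfectly for expressions split across two lines by a single newline, but I also need to find expressions split across several lines.

I also tried the following expression: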

&lt;(.*?\n.*?)*&gt;

But searching with it results in a timeout. Please help.

Sample text to search:

<p class=3DMsoNormal style=3D'margin-top:12.0pt;margin-right:0cm;margin-bot= 
tom: 
0cm;margin-left:148.85pt;margin-bottom:.0001pt;text-indent:-148.85pt; 
tab-stops:148.85pt right 16.0cm'><b style=3D'mso-bidi-font-weight:normal'><= 
span 
style=3D'font-family:"Calibri","sans-serif"'>RISK DETAILS<span style=3D'mso= 
-tab-count: 
1'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nb= 
sp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span></b><span 
style=3D'font-family:"Calibri","sans-serif"'>Your home is described as 
&lt;q_1&gt;<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>The 
construction of your home is &lt;q_2&gt;<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>The 
main roof material is &lt;q_3&gt;<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>Your 
home was built in &lt;q_4&gt;<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>Your 
<span class=3DGramE>home &lt;q_5&gt; double</span> keyed deadlocks to all 
external doors<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>Your 
home &lt;q_6&gt; keyed locks or grilles on all windows<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>Your 
home has &lt;q_7&gt; alarm installed<o:p></o:p></span></p> 

<p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
:0cm; 
margin-left:148.85pt;margin-bottom:.0001pt'><span style=3D'font-family:"Cal= 
ibri","sans-serif"'>Your 
home &lt;q_8&gt; connected to mains water supply<o:p></o:p></span></p> 

Some examples. Example 1, text to search:

<span 
     style=3D'color:blue'><o:p></o:p></span></span></p> 
     </td> 
     <td width=3D103 valign=3Dtop style=3D'width:77.5pt;padding:0cm 5.4pt 0cm = 
    0cm'> 
     <p class=3DMsoNormal align=3Dright style=3D'margin-top:3.0pt;margin-right= 
    :0cm; 
     margin-bottom:0cm;margin-left:0cm;margin-bottom:.0001pt;text-align:right; 
     tab-stops:155.95pt'><span style=3D'font-family:"Calibri","sans-serif"'>&lt;= 
    <span 
     class=3DSpellE>spec_contents_value</span>&gt;<span style=3D'color:blue'><= 
    o:p></o:p></span></span></p> 
     </td> 
    </tr> 
    </table> 

    <p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
    :0cm; 
    margin-left:148.85pt;margin-bottom:.0001pt;text-indent:-148.85pt;tab-stops: 
    148.85pt right 453.55pt'><span style=3D'font-family:"Calibri","sans-serif"'= 
    ><o:p>&nbsp;</o:p></span></p> 

    <p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
    :0cm; 
    margin-left:148.85pt;margin-bottom:.0001pt;text-indent:-148.85pt;tab-stops: 
    148.85pt right 453.55pt'><span style=3D'font-family:"Calibri","sans-serif"'= 
    >Unspecified 
    Valuables<b style=3D'mso-bidi-font-weight:normal'><span style=3D'mso-tab-co= 
    unt: 
    1'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </= 
    span></b>&lt;valuables&gt;<o:p></o:p></span></p> 

    <p class=3DMsoNormal style=3D'margin-top:0cm;margin-right:0cm;margin-bottom= 
    :0cm; 
    margin-left:148.85pt;margin-bottom:.0001pt;text-indent:-148.85pt;tab-stops: 
    148.85pt right 453.55pt'><span style=3D'font-family:"Calibri","sans-serif"'= 
    >Specified 
    Valuables<b style=3D'mso-bidi-font-weight:normal'><span style=3D'mso-tab-co= 
    unt: 
    1'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nb= 
    sp;&nbsp;&nbsp;&nbsp;&nbsp; </span></b>&lt;<spanclass=3DSpellE>spec_valuables_ni</span>&gt;= 
    <o:p></o:p></span></p> 

I want my Regex.Match pattern to find:

&lt;= 
<span 
    class=3DSpellE>spec_contents_value</span>&gt; 

or any &lt; ... &gt; pattern that spans multiple lines, but not one that sits entirely on a single line.

+0

Please format the code so that we can provide the regex you want. – 2014-11-04 08:59:37

+0

Thanks @nu11p01n73R for the edit :-) – mohits00691 2014-11-04 09:07:27

+0

@mohits00691 Is it &lt; or < in your original code? – nu11p01n73R 2014-11-04 09:08:08

Answers

1

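Use the DOTALL modifier (?s), which corresponds to RegexOptions.Singleline in .NET, so that the dot also matches newline characters. The pattern below requires at least one \n between &lt; and &gt;, and the negative lookahead keeps a match from running past the next &lt; or &gt;.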

(?s)&lt;(?:(?!&[gl]t;).)*?\n(?:(?!&[gl]t;).)*?&gt; 
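
As a rough illustration of how this pattern might be applied from C#, here is a minimal sketch. The sample string, the class name, and the two-second match timeout are assumptions made for the example and are not part of the original answer; the timeout is only a precaution against the kind of runaway backtracking described in the question.

```csharp
using System;
using System.Text.RegularExpressions;

class MultiLinePlaceholderDemo
{
    static void Main()
    {
        // Hypothetical input modeled on the sample text in the question:
        // one single-line placeholder and one that spans several lines.
        string text =
            "Your home is described as &lt;q_1&gt;<o:p></o:p>\n" +
            "&lt;=\n<span\n    class=3DSpellE>spec_contents_value</span>&gt;<span>";

        // The pattern from the answer, as a verbatim string.
        // (?s) lets '.' match '\n'; the lookahead stops each run
        // before the next &lt; or &gt;, and the literal \n in the
        // middle requires at least one line break inside the match.
        string pattern = @"(?s)&lt;(?:(?!&[gl]t;).)*?\n(?:(?!&[gl]t;).)*?&gt;";

        // The timeout value is an assumption; a RegexMatchTimeoutException
        // is thrown if a single match attempt takes longer than this.
        var regex = new Regex(pattern, RegexOptions.None, TimeSpan.FromSeconds(2));

        foreach (Match m in regex.Matches(text))
        {
            Console.WriteLine("Match at index {0}, length {1}", m.Index, m.Length);
        }
    }
}
```

The constructor overload that takes a TimeSpan is available from .NET Framework 4.5 onward; on older versions the pattern can be used without it.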


+0

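Hi @Avunash Raj, the first expression works fine, but it also matches text that has no embedded newline. Is it possible to add something so that it requires at least one newline? – mohits00691 2014-11-04 09:10:32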

+0

Try my update. – 2014-11-04 09:12:02

+0

It is now skipping a few occurrences of >. I have attached sample text to my question for searching. – mohits00691 2014-11-04 09:20:00

1

How about the regex

&lt;[^&]*&gt; 

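For example, see http://regex101.com/r/iV9lS4/3

  • &lt; matches the literal &lt;

  • [^&]* matches any run of characters other than &, including newlines

  • &gt; matches the literal &gt;

You can also use . to match anything, newlines included, by supplying the DOTALL option (?s).

For the input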

&lt;= 
<span 
    class=3DSpellE>spec_contents_value</span>&gt; 

this will match, as shown at http://regex101.com/r/iV9lS4/4
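
Since this pattern also matches placeholders that sit on one line, such as &lt;q_1&gt;, one possible way to keep only the multi-line ones in C# is to filter the matches afterwards. The sketch below assumes a made-up sample string and class name; the post-filter is an illustration and not something the answer itself proposes.

```csharp
using System;
using System.Text.RegularExpressions;

class NegatedClassDemo
{
    static void Main()
    {
        // Hypothetical input: one single-line and one multi-line placeholder.
        string text =
            "Your home is described as &lt;q_1&gt; and &lt;=\n" +
            "<span\n    class=3DSpellE>spec_contents_value</span>&gt;";

        // [^&]* cannot cross an '&', so each match ends at the first &gt;
        // and the engine has very little to backtrack over.
        var regex = new Regex(@"&lt;[^&]*&gt;");

        foreach (Match m in regex.Matches(text))
        {
            // Keep only matches that contain at least one line break.
            bool spansLines = m.Value.IndexOf('\n') >= 0;
            if (spansLines)
            {
                Console.WriteLine("Multi-line match at index {0}, length {1}",
                                  m.Index, m.Length);
            }
        }
    }
}
```

A variant that should fold the same check into the pattern is &lt;[^&\n]*\n[^&]*&gt;, which demands one \n before allowing further characters. Either form stops at any & inside the placeholder, so text containing entities such as &nbsp; between &lt; and &gt; would not be matched.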