Regex filter &quot; with < > tags included

I have a problem with some regex code. Can anyone help?

I have the following data string:

abcd &quot; something code &quot; nothing &quot;f &lt;b&gt; cannot find this section &lt;/b&gt; &quot;

I want to find the sections between the &quot; marks.

I can get this to work fine using the following regex:

foreach (Match match in Regex.Matches(sourceLine, @"&quot;((\\&quot;)|[^&quot;(\\&quot;)])+&quot;"))

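However, if the section between the quote marks contains <>, the section is not found. I'm not sure what to do to include the <> tags in the regex.

Thanks for your time.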

2010-09-27 Chris

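A character class […] describes a set of allowed characters, and a negated character class [^…] describes a set of disallowed characters. So [^&quot;(\\&quot;)] means any character except &, q, u, o, t, ;, (, \ and ). It certainly does not mean "anything except &quot; or \\&quot;".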
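To make the point concrete, here is a minimal sketch that tests single characters against the negated class from the question. The class name `CharClassDemo` and the helper `IsAllowed` are made up for illustration; only `Regex.IsMatch` comes from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // The negated class from the question. It lists single characters,
    // not the sequence &quot; as a whole.
    public const string NegatedClass = @"[^&quot;(\\&quot;)]";

    // Returns true if the single character c is accepted by the negated class.
    public static bool IsAllowed(char c)
    {
        return Regex.IsMatch(c.ToString(), "^" + NegatedClass + "$");
    }

    public static void Main()
    {
        // Letters that happen to occur in "&quot;" are rejected; < and > are not.
        foreach (char c in "ab<>oqtu&;")
        {
            Console.WriteLine("'{0}' allowed: {1}", c, IsAllowed(c));
        }
    }
}
```

With this class, `<` and `>` are accepted, while ordinary letters such as `o`, `t` and `u` are rejected, so any quoted text containing those letters stops the match early.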

Try this instead:

&quot;(.*?)&quot;

This uses the ungreedy quantifier *?, which matches as little as possible, instead of the greedy quantifier *, which matches as much as possible.
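A small sketch of the difference, run against the data string from the question. The class `LazyQuantifierDemo` and the helper `Captures` are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LazyQuantifierDemo
{
    // Collects capture group 1 of every match of pattern in input.
    public static List<string> Captures(string input, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        string sourceLine =
            "abcd &quot; something code &quot; nothing &quot;f &lt;b&gt; cannot find this section &lt;/b&gt; &quot;";

        // Ungreedy: each match stops at the nearest closing &quot;.
        foreach (string s in Captures(sourceLine, "&quot;(.*?)&quot;"))
        {
            Console.WriteLine("lazy:   [{0}]", s);
        }

        // Greedy: a single match runs from the first &quot; to the last one.
        foreach (string s in Captures(sourceLine, "&quot;(.*)&quot;"))
        {
            Console.WriteLine("greedy: [{0}]", s);
        }
    }
}
```

Because `Regex.Matches` returns non-overlapping matches, the ungreedy pattern yields the two quoted sections and skips the text between them (` nothing `), whereas the greedy pattern returns one long capture.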

2010-09-27 09:56:39 Gumbo

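Cheers for your reply, the detail is very helpful. – Chris 2010-09-27 10:05:10

IMHO the problem is that you will miss \\&quot; with just this, while it looks like user459320 wants to catch it. – PierrOz 2010-09-27 10:10:01

And in the example given above, wouldn't it capture the string "nothing", even though it comes after the end of the first section ("something code") and before the start of the second one ("f <b> cannot find this section </b>")? – PierrOz 2010-09-27 10:36:17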

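using System.Collections.Generic;
using System.Text.RegularExpressions;

public List<string> Parse(string input)
{
    List<string> results = new List<string>();
    bool startSection = true;
    int startIndex = 0;
    // (?<!\\) skips any &quot; that is preceded by a backslash; unlike
    // (^|[^\\]) it does not consume the preceding character, so m.Index
    // always points at the '&' and adjacent delimiters are still found
    foreach (Match m in Regex.Matches(input, @"(?<!\\)&quot;"))
    {
        if (startSection)
        {
            // opening delimiter: the section starts right after it
            startSection = false;
            startIndex = m.Index + m.Length;
        }
        else
        {
            // closing delimiter: capture the section; the next match opens a new one
            startSection = true;
            results.Add(input.Substring(startIndex, m.Index - startIndex));
        }
    }
    return results;
}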

2010-09-27 10:03:42 PierrOz

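The way I read it, there are no literal backslashes in the text. The OP introduced some misleading cruft by trying to use an existing regex for quoted strings (e.g. "[^"\\]*(?:\\"[^"\\]*)*"), but with the &quot; escape sequence in place of the quotation marks. – 2010-09-27 13:48:47

You can use HttpUtility.HtmlDecode to convert this text into normal characters. After that, extracting the text between double quotes with a regex is simple.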
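A rough sketch of that approach. It uses `System.Net.WebUtility.HtmlDecode` rather than `HttpUtility.HtmlDecode` so that no `System.Web` reference is needed; that swap is my choice, not part of the answer. The class `DecodeThenMatchDemo` and the method `QuotedSections` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class DecodeThenMatchDemo
{
    // Decodes HTML entities first, then returns the text between
    // each pair of plain double quotes.
    public static List<string> QuotedSections(string encoded)
    {
        // &quot; becomes ", &lt; becomes <, &gt; becomes >
        string decoded = WebUtility.HtmlDecode(encoded);

        var results = new List<string>();
        foreach (Match m in Regex.Matches(decoded, "\"(.*?)\""))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        string sourceLine =
            "abcd &quot; something code &quot; nothing &quot;f &lt;b&gt; cannot find this section &lt;/b&gt; &quot;";

        foreach (string section in QuotedSections(sourceLine))
        {
            Console.WriteLine("[{0}]", section);
        }
    }
}
```

One thing to keep in mind: after decoding, a quote that was written as `&quot;` can no longer be told apart from a quote that was already a literal `"` in the text, so this only fits if that distinction does not matter for your data.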

2013-10-18 20:55:21 Jerry
