2012-03-13

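Regex with an optional class attribute

I am trying to capture the attributes of a <pre> tag, along with an optional class attribute. I would like to capture the value of the class attribute in a single regex, rather than capturing all the attributes and then searching them for a class attribute afterwards. Since the class attribute is optional, I tried adding a ?, but then the regex below matches using only the last capture group: the class is not captured, and neither are the attributes before it.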

// Works, but class isn't optional 
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])(.*?)\3)([^\>]*)>' 

// Fails to match class, the whole set of attributes are matched by last group 
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])?(.*?)\3)([^\>]*)>' 

e.g. <pre style="..." class="some-class" title="stuff"> 

Edit:

I ended up using this:

$wp_content = preg_replace_callback('#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi', 'CrayonWP::pre_tag', $wp_content); 

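It allows whitespace inside the tags, allows other attributes before and after the class attribute, and also captures the full attribute list.

The callback then puts the pieces in place: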

public static function pre_tag($matches) { 
    $pre_class = $matches[1]; 
    $quotes = $matches[2]; 
    $class = $matches[3]; 
    $post_class = $matches[4]; 
    $atts = $matches[5]; 
    $content = $matches[6]; 
    if (!empty($class)) { 
        // Allow hyphenated "setting-value" style settings in the class attribute 
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1='.$quotes.'$2'.$quotes, $class); 
        return "[crayon $pre_class $class $post_class] $content [/crayon]"; 
    } else { 
        return "[crayon $atts] $content [/crayon]"; 
    } 
} 
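
To see what the pattern and callback produce together, here is a minimal sketch that can be run on its own. It assumes a standalone function named pre_tag_sketch (a hypothetical stand-in for CrayonWP::pre_tag, whose surrounding class is not shown here) and two made-up input strings.

```php
<?php
// Hypothetical standalone copy of the callback above, for illustration only.
function pre_tag_sketch(array $matches): string {
    // Index 0 is the whole match; 1-6 are the capture groups.
    list(, $pre_class, $quotes, $class, $post_class, $atts, $content) = $matches;
    if (!empty($class)) {
        // Turn "setting-value" into setting="value", reusing the original quote character.
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1='.$quotes.'$2'.$quotes, $class);
        return "[crayon $pre_class $class $post_class] $content [/crayon]";
    }
    return "[crayon $atts] $content [/crayon]";
}

// Same pattern as in the edit above.
$pattern = '#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi';

// Made-up inputs: one tag with a class attribute, one without.
$with_class    = '<pre class="lang-php" title="demo">echo 1;</pre>';
$without_class = '<pre title="demo">echo 1;</pre>';

$out_with    = preg_replace_callback($pattern, 'pre_tag_sketch', $with_class);
$out_without = preg_replace_callback($pattern, 'pre_tag_sketch', $without_class);

echo $out_with, "\n";
echo $out_without, "\n";
```

In the first case the class value lang-php is rewritten to lang="php" and the title attribute is carried over; in the second case the class branch is skipped and the whole attribute string from group 5 is used instead.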

Answers

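You can put the capturing group for the class attribute inside a lookahead assertion and make the contents of that lookahead optional: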

'(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>' 

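Now $2 will contain the value of the class attribute, if one is present. Because the lookahead consumes no input, the final group still captures the full attribute list either way.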

(?<!\$)    # Assert no preceding $ (why?) 
<pre     # Match <pre 
(?=     # Assert that the following can be matched: 
(?:     # Try to match this: 
    [^>]*    # any text except > 
    \bclass\s*=\s*  # class = 
    (["\'])   # opening quote 
    (.*?)    # any text, lazy --> capture this in group no. 2 
    \1     # corresponding closing quote 
)?     # but make the whole thing optional. 
)      # End of lookahead 
([^\>]*)>    # Match the entire contents of the tag and the closing > 
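
As a quick check of the behaviour described above, the following sketch runs the pattern with preg_match against a tag with a class attribute, a tag without one, and a tag preceded by $. The # delimiters and the sample strings are assumptions added for the example.

```php
<?php
// The pattern from this answer, wrapped in # delimiters (an arbitrary choice).
$pattern = '#(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>#';

$with_class    = '<pre style="..." class="some-class" title="stuff">';
$without_class = '<pre style="..." title="stuff">';
$escaped       = '$<pre class="x">';

// Group 1: quote character, group 2: class value, group 3: all attributes.
$found_with = preg_match($pattern, $with_class, $m);
var_dump($m[2]);
var_dump($m[3]);

// Without a class attribute the lookahead matches nothing,
// so groups 1 and 2 come back as empty strings here.
$found_without = preg_match($pattern, $without_class, $n);
var_dump($n[2]);
var_dump($n[3]);

// A tag directly preceded by $ is rejected by the lookbehind.
$found_escaped = preg_match($pattern, $escaped);
var_dump($found_escaped);
```

One caveat worth keeping in mind: \b also matches between a hyphen and a letter, so an attribute such as data-class="x" would be picked up as if it were class. If that matters, a stricter prefix such as \s before class may be a better fit.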

Brilliant, thanks! – 2012-03-13 11:49:50