2009-10-04 128 views
3

In stock PHP5, what is a good preg_replace expression for this transformation: replace newlines with BR tags, but only inside PRE tags

Replace newlines with <br />, but only within <pre> blocks.

(For example, we can assume that each tag sits on a single line, not something pathological.)

Input text:

<div><pre class='some class'>1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>

Output:

<div><pre class='some class'>1<br />2<br />3<br /></pre>
<pre>line 1<br />line 2<br />line 3<br /></pre>
</div>

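(Motivating background: I'm trying to close out bug 20760 in the Wikimedia SyntaxHighlight_GeSHI extension, and finding that my PHP skills (I mostly do Python) aren't up to snuff.)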

I'm open to solutions other than regexen, but small is preferred (for example, building HTML-parsing machinery is overkill).

Answers

0

Based on what SilentGhost said (which isn't showing up here for some reason):

<?php
$str = "<div><pre class='some class' >1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>";

$out = "<div><pre class='some class' >1<br />2<br />3<br /></pre>
<pre>line 1<br />line 2<br />line 3<br /></pre>
</div>";

function protect_newlines($str) {
    // \n -> <br />, but only if it's in a pre block
    // protects newlines from Parser::doBlockLevels()
    /* split on <pre ... /pre>, basically. probably good enough */
    $str = " ".$str; // pad the front; the captured <pre> blocks land at odd indexes
    $parts = preg_split("/(< \s* pre .* \/ \s* pre \s* >)/Umsxu", $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $idx => $part) {
        if ($idx % 2) { // odd index: a captured <pre>...</pre> block
            $parts[$idx] = preg_replace("/\n/", "<br />", $part);
        }
    }
    $str = implode('', $parts);
    /* chop off the first space, that we had added */
    return substr($str, 1);
}

assert(protect_newlines($str) === $out);
?>
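
For comparison, the same split-and-replace idea can probably be written more compactly with preg_replace_callback. This is only a sketch, not what the answer above uses: the function name pre_newlines_to_br is made up for illustration, the anonymous function assumes PHP 5.3 or later, and it leans on the question's simplifying assumption that a <pre> tag never spans lines (a newline inside the opening tag itself would be replaced too).

```php
<?php
// Hypothetical helper (name invented for this sketch).
// Replaces newlines with <br /> only inside <pre>...</pre> blocks.
function pre_newlines_to_br($html) {
    return preg_replace_callback(
        // A whole <pre ...>...</pre> block: lazy match, case-insensitive,
        // and "." also matches newlines because of the s modifier.
        '~<pre\b[^>]*>.*?</pre\s*>~is',
        function ($m) {
            // $m[0] is the matched block; handle \r\n before bare \n.
            return str_replace(array("\r\n", "\n"), "<br />", $m[0]);
        },
        $html
    );
}

$in = "<div><pre class='some class'>1\n2\n3\n</pre>\n<pre>line 1\nline 2\nline 3\n</pre>\n</div>";
echo pre_newlines_to_br($in);
```

Text outside the <pre> blocks, including the newline between the two blocks, is left untouched. Nested <pre> elements are not handled, which seems acceptable given the corner cases the question says to ignore.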
6

Something like this?

<?php

$content = "<div><pre class='some class'>1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>
";

// Despite its name, this serializes the first element inside <body>
// (the outer <div> here) through a fresh document.
function getInnerHTML($Node)
{
    $Body = $Node->ownerDocument->documentElement->firstChild->firstChild;
    $Document = new DOMDocument();
    $Document->appendChild($Document->importNode($Body, true));
    return $Document->saveHTML();
}

$dom = new DOMDocument();
$dom->loadHTML($content);
$preElements = $dom->getElementsByTagName('pre');

// use ->length: count() on a DOMNodeList does not give the node count on PHP 5
if ($preElements->length > 0) {
    foreach ($preElements as $pre) {
        $value = preg_replace('/\r?\n/', '<br/>', $pre->nodeValue);
        // stored as text and escaped on output, hence html_entity_decode() below
        $pre->nodeValue = $value;
    }

    echo html_entity_decode(getInnerHTML($dom->documentElement));
}

Updated the answer with 'html_entity_decode'; remove it if you don't need it. – 2009-10-04 19:21:16

I just threw in a quick regex for the newlines; let me know if you see any problems with it, you Perl regex wizards :) – 2009-10-04 19:23:51

This fails for my purposes, because html_entity_decode adds newlines between elements. Don't blame me, blame Wikimedia's Parser class :) – 2009-10-05 16:08:42