Replacing text while ignoring HTML tags

I have a simple piece of text that contains HTML tags, for example:

Once <u>the</u> activity <a href="#">reaches</a> the resumed state, you can freely add and remove fragments to the activity. Thus, <i>only</i> while the activity is in the resumed state can the <b>lifecycle</b> of a <hr/> fragment change independently.

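I need to replace certain parts of this text while ignoring its HTML tags. For example, I need to replace the string Thus, only while with my own string Hello, its only while. Both the text to be replaced and the replacement string are dynamic. I need your help with my preg_replace pattern: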

$text = '<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text'; 

$arrayKeys= array('Some html' => 'My html', 'and there' => 'is there', 'in this text' => 'in this code'); 

foreach ($arrayKeys as $key => $value) 
    $text = preg_replace('...$key...', '...$value...', $text); 

echo $text; // output should be: <b>My html</b> tags with <u>is</u> there are a lot of tags <i>in</i> this code

Please help me find a solution. Thanks.

Source

2012-02-22 pleerock

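Given the examples provided, I don't believe a regular expression can do what you want, because you don't have a concrete set of rules; your requirements seem to change with every example you give. – qJake 2012-02-22 23:01:55

OK, so regex can't do it... maybe there is some other tool? The problem is that the data to be replaced is entered by users (site administrators), so the array is dynamic. – pleerock 2012-02-22 23:10:44

This probably can't be done unless you clarify the rules. Suppose the string is "hello world again" and I want to replace "hello world" with "hello from earth". What should the output be? – iWantSimpleLife 2012-02-23 01:13:55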

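Here we go. This code should work, provided you respect just two constraints:

The pattern and the replacement must have the same number of words. (Logical, since you want to keep the positions.)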
You can't split a word around a tag. (Hel<b>lo</b> World won't work.)

But if both constraints are respected, this should work just fine!
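The first constraint can be checked mechanically before any replacement is attempted. The following is a minimal sketch, not part of the answer's code: the function name `sameWordCount` is hypothetical, and it assumes words are separated by whitespace and PHP 7 or later (for the scalar type hints). It does not check the second constraint.

```php
<?php
// Hypothetical helper (not from the answer): verifies that a pattern and its
// replacement contain the same number of whitespace-separated words.
function sameWordCount(string $pattern, string $replacement): bool
{
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY ignores leading/trailing blanks.
    $patternWords = preg_split('/\s+/', $pattern, -1, PREG_SPLIT_NO_EMPTY);
    $replacementWords = preg_split('/\s+/', $replacement, -1, PREG_SPLIT_NO_EMPTY);

    return count($patternWords) === count($replacementWords);
}

var_dump(sameWordCount('Hey you', 'Bye me'));         // bool(true)
var_dump(sameWordCount('have to', 'really need to')); // bool(false)
```

Pairs that fail the check could be skipped or reported to the administrator who entered them, rather than producing misplaced tags.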

<?php 
    // Splits a string in parts delimited with the sequence. 
    // '<b>Hey</b> you' becomes '~-=<b>~-=Hey~-=</b>~-= you', which gives us 
    // array ("<b>", "Hey", "</b>", " you") 
    function getTextArray ($text, $special) { 
     $text = preg_replace ('#(<.*>)#isU', $special . '$1' . $special, $text); // Wrap every tag in the separator so the split below isolates it. 

     return preg_split ('#' . $special . '#', $text, -1, PREG_SPLIT_NO_EMPTY); 
    } 
     $text = " 
    <html> 
     <div> 
      <p> 
       <b>Hey</b> you ! No, you don't have <em>to</em> go! 
      </p> 
     </div> 
    </html>"; 

    $replacement = array (
     "Hey you" => "Bye me", 
     "have to" => "need to", 
     "to go" => "to run"); 

    // This is a special sequence that you must be sure to find nowhere in your code. It is used to split sequences, and will disappear. 
    $special = '~-='; 

    $text_array = getTextArray ($text, $special); 

    // $restore is the array that will finally contain the result. 
    // For now we only store the tags in it. 
    // We'll store the text later. 
    // 
    // $clean_text is the text without the tags, but with the special sequence instead. 
    $restore = array(); 
    $clean_text = ''; // Initialise before appending to avoid an undefined-variable notice. 
    for ($i = 0; $i < sizeof ($text_array); $i++) { 
     $str = $text_array[$i]; 

     if (preg_match('#<.+>#', $str)) { 
      $restore[$i] = $str; 
      $clean_text .= $special; 
     } 

     else { 
      $clean_text .= $str; 
     } 
    } 

    // Here comes the tricky part. 
    // We want to keep the position of each part of the text so the tags don't 
    // move afterwards. 
    // So we build a regex that captures whatever sits between two words 
    // (whitespace and/or ~-= placeholders), e.g. for "Hey you": 
    // ((?:~-=)*)Hey((?:\s|~-=)+)you((?:~-=)*) 
    // and a replacement like ${1}Bye${2}me${3}, 
    // so that the separators stay in the right place. 
    foreach ($replacement as $regex => $newstr) { 
     $sep = preg_quote ($special, '#'); 
     $regex_array = explode (' ', $regex); 
     $quoted_array = array(); 
     foreach ($regex_array as $word) { 
      $quoted_array[] = preg_quote ($word, '#'); // Escape regex metacharacters in user input. 
     } 
     $regex = '((?:' . $sep . ')*)' . implode ('((?:\s|' . $sep . ')+)', $quoted_array) . '((?:' . $sep . ')*)'; 

     $newstr_array = explode (' ', $newstr); 
     $newstr = '${1}'; 

     for ($i = 0; $i < count ($regex_array) - 1; $i++) { 
      $newstr .= $newstr_array[$i] . '${' . ($i + 2) . '}'; 
     } 
     $newstr .= $newstr_array[count ($regex_array) - 1] . '${' . (count ($regex_array) + 1) . '}'; 

     $clean_text = preg_replace ('#' . $regex . '#is', $newstr, $clean_text); 
    } 

    // Here we re-split one last time. 
    $clean_text_array = preg_split ('#' . $special . '#', $clean_text, -1, PREG_SPLIT_NO_EMPTY); 

    // And we merge with $restore. 
    for ($i = 0, $j = 0; $i < count ($text_array); $i++) { 
     if (!isset($restore[$i])) { 
      $restore[$i] = $clean_text_array[$j]; 
      $j++; 
     } 
    } 

    // Now we reorder everything, and make it go back to a string. 
    ksort ($restore); 
    $result = implode ($restore); 

    echo $result; 
?>

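This will output Bye me ! No, you don't need to run! with the tags left in place.

[Edit] It now supports a custom separator sequence, which avoids adding useless spaces.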
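Because the separator must not occur anywhere in the input, one option is to generate it instead of hard-coding it. The sketch below is an assumption layered on the answer, not part of it: `pickSeparator` is a hypothetical name, the `%%…%%` shape is arbitrary (chosen because none of its characters are regex metacharacters), and `random_bytes` requires PHP 7 or later.

```php
<?php
// Hypothetical helper: returns a separator that does not occur in $text.
function pickSeparator(string $text): string
{
    do {
        // 8 random hex characters between fixed markers.
        $separator = '%%' . bin2hex(random_bytes(4)) . '%%';
    } while (strpos($text, $separator) !== false); // retry on the (unlikely) collision

    return $separator;
}

$text = '<b>Hey</b> you ! The sequence ~-= appears in this text.';
$special = pickSeparator($text);
var_dump(strpos($text, $special) === false); // bool(true)
```

The returned value could then be passed wherever the answer uses `$special`.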

Source

2013-05-28 07:26:25 Jerska

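I see globals and regex applied to HTML. Hence my downvote. Regex for HTML can almost always be broken, and this is no exception. – 2013-05-28 08:58:03

Hmm, what are the question's tags? Just because something is not a recommended habit doesn't mean it can't be done. – Jerska 2013-05-28 09:02:18

Since we're having this debate: PHP is in many ways a terrible language, but some of its features make me like it. By your logic, should I quit programming in PHP? – Jerska 2013-05-28 09:04:51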

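Basically, we use regular expressions to build dynamic arrays of patterns and replacements. This code only handles what was originally asked for, but the way I have spelled it out should show you how to adapt it. We capture an opening or closing tag, together with any whitespace, as a pass-through group and replace the text around it. It is set up for two-word and three-word combinations.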
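The same pass-through idea can be extended beyond two and three words. What follows is a rough sketch rather than the answer's implementation: `replaceAcrossTags` is a hypothetical name, it swaps in `preg_replace_callback` for the generated replacement strings, and it assumes PHP 7 or later, search words made of word characters only, and equal word counts on both sides. Like any regex over HTML it can also match text inside tag attributes.

```php
<?php
// Hypothetical generalisation of the pass-through idea to any number of words.
function replaceAcrossTags(string $text, string $search, string $replace): string
{
    $searchWords = preg_split('/\s+/', $search, -1, PREG_SPLIT_NO_EMPTY);
    $replaceWords = preg_split('/\s+/', $replace, -1, PREG_SPLIT_NO_EMPTY);
    if (count($searchWords) === 0 || count($searchWords) !== count($replaceWords)) {
        return $text; // word counts must match, otherwise leave the text alone
    }

    // Between two words, allow whitespace and/or complete tags, and capture them.
    $gap = '((?:\s|<[^>]*>)+)';
    $quoted = array_map(function ($word) {
        return preg_quote($word, '~');
    }, $searchWords);
    $pattern = '~\b' . implode($gap, $quoted) . '\b~i';

    return preg_replace_callback($pattern, function (array $m) use ($replaceWords) {
        $out = '';
        foreach ($replaceWords as $i => $word) {
            $out .= $word;
            if (isset($m[$i + 1])) {
                $out .= $m[$i + 1]; // put the captured gap back untouched
            }
        }
        return $out;
    }, $text);
}

$text = '<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text';
$pairs = array('Some html' => 'My html', 'and there' => 'is there', 'in this text' => 'in this code');
foreach ($pairs as $search => $replace) {
    $text = replaceAcrossTags($text, $search, $replace);
}
echo $text . "\n";
// <b>My html</b> tags with <u>is</u> there are a lot of tags <i>in</i> this code
```

Using a callback avoids building `$n` references by hand, so replacement words that happen to contain `$` or a backslash are inserted literally.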

<?php 

    $text = '<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text'; 

    $arrayKeys= array(
    'Some html' => 'My html', 
    'and there' => 'is there', 
    'in this text' =>'in this code'); 


    function make_pattern($string){ 
     $patterns = array(
         '!(\w+)!i', 
         '#^#', 
         '! !', 
         '#$#'); 
     $replacements = array(
         "($1)", 
         '!', 
       //This next line is where we capture the possible tag or 
       //whitespace so we can ignore it and pass it through. 
         '(\s?<?/?[^>]*>?\s?)', 
         '!i'); 
     $new_string = preg_replace($patterns,$replacements,$string); 
     return $new_string; 
    } 

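    // Turns a replacement such as 'My html' into 'My$2html'. The escaped \$2 
    // survives this first preg_replace as a literal $2, which the final 
    // preg_replace then resolves to the captured tag or whitespace. 
    function make_replacement($replacement){ 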
     $patterns = array(
         '!^(\w+)(\s+)(\w+)(\s+)(\w+)$!', 
         '!^(\w+)(\s+)(\w+)$!'); 
     $replacements = array(
         '$1\$2$3\$4$5', 
         '$1\$2$3'); 
     $new_replacement = preg_replace($patterns,$replacements,$replacement); 
     return $new_replacement; 
    } 


    foreach ($arrayKeys as $key => $value){ 
     $new_Patterns[] = make_pattern($key); 
     $new_Replacements[] = make_replacement($value); 
    } 

    //For debugging 
    //print_r($new_Patterns); 
    //print_r($new_Replacements); 

    $new_text = preg_replace($new_Patterns,$new_Replacements,$text); 

    echo $new_text."\n"; 
    echo $text; 


?>

Output

<b>My html</b> tags with <u>is</u> there are a lot of tags <i>in</i> this code 
<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text

Source

2013-05-28 08:52:30
