2012-01-28 47 views
2

Shorten tweet-like text without cutting inside links

I have a string like this:

I love @kevinrose 's new website <a href="http://kevinrose.com">Link</a> 

And I have a function like this:

function short($string, $max = 255) {
    if (strlen($string) >= $max) {
        $string = mb_substr($string, 0, $max - 5, 'utf-8') . '...';
    }
    return $string;
}

If I cut it at 50, it ends up as:

I love @kevinrose 's new website <a href="http://kevinr... 

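This of course kills the HTML.

Is there a simple way to avoid cutting inside an href tag (cutting before it or, preferably, after it) without breaking the HTML?

I need to keep my tags, of course.

Thanks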


Possible duplicate of "PHP: Truncate HTML, ignoring tags" (http://stackoverflow.com/questions/1193500/php-truncate-html-ignoring-tags) – user 2014-03-18 03:18:36

Answers

5

See this: "PHP: Truncate string while preserving HTML tags and whole words" by Alan Whipple -> http://alanwhipple.com/2011/05/25/php-truncate-string-preserving-html-tags-words/

<?php 
/** 
* truncateHtml can truncate a string up to a number of characters while preserving whole words and HTML tags 
* 
* @param string $text String to truncate. 
* @param integer $length Length of returned string, including ellipsis. 
* @param string $ending Ending to be appended to the trimmed string. 
* @param boolean $exact If false, $text will not be cut mid-word 
* @param boolean $considerHtml If true, HTML tags would be handled correctly 
* 
* @return string Trimmed string. 
*/ 
function truncateHtml($text, $length = 100, $ending = '...', $exact = false, $considerHtml = true) { 
    if ($considerHtml) { 
     // if the plain text is shorter than the maximum length, return the whole text 
     if (strlen(preg_replace('/<.*?>/', '', $text)) <= $length) { 
      return $text; 
     } 
     // splits all html-tags to scanable lines 
     preg_match_all('/(<.+?>)?([^<>]*)/s', $text, $lines, PREG_SET_ORDER); 
     $total_length = strlen($ending); 
     $open_tags = array(); 
     $truncate = ''; 
     foreach ($lines as $line_matchings) { 
      // if there is any html-tag in this line, handle it and add it (uncounted) to the output 
      if (!empty($line_matchings[1])) { 
       // if it's an "empty element" with or without xhtml-conform closing slash 
       if (preg_match('/^<(\s*.+?\/\s*|\s*(img|br|input|hr|area|base|basefont|col|frame|isindex|link|meta|param)(\s.+?)?)>$/is', $line_matchings[1])) { 
        // do nothing 
       // if tag is a closing tag 
       } else if (preg_match('/^<\s*\/([^\s]+?)\s*>$/s', $line_matchings[1], $tag_matchings)) { 
        // delete tag from $open_tags list (tag names are stored lowercased) 
        $pos = array_search(strtolower($tag_matchings[1]), $open_tags); 
        if ($pos !== false) { 
         unset($open_tags[$pos]); 
        } 
       // if tag is an opening tag 
       } else if (preg_match('/^<\s*([^\s>!]+).*?>$/s', $line_matchings[1], $tag_matchings)) { 
        // add tag to the beginning of $open_tags list 
        array_unshift($open_tags, strtolower($tag_matchings[1])); 
       } 
       // add html-tag to $truncate'd text 
       $truncate .= $line_matchings[1]; 
      } 
      // calculate the length of the plain text part of the line; handle entities as one character 
      // the third alternative matches hexadecimal entities such as &#x20AC;
      $content_length = strlen(preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', ' ', $line_matchings[2])); 
      if ($total_length+$content_length> $length) { 
       // the number of characters which are left 
       $left = $length - $total_length; 
       $entities_length = 0; 
       // search for html entities 
       if (preg_match_all('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', $line_matchings[2], $entities, PREG_OFFSET_CAPTURE)) { 
        // calculate the real length of all entities in the legal range 
        foreach ($entities[0] as $entity) { 
         if ($entity[1]+1-$entities_length <= $left) { 
          $left--; 
          $entities_length += strlen($entity[0]); 
         } else { 
          // no more characters left 
          break; 
         } 
        } 
       } 
       $truncate .= substr($line_matchings[2], 0, $left+$entities_length); 
       // maximum length is reached, so get off the loop 
       break; 
      } else { 
       $truncate .= $line_matchings[2]; 
       $total_length += $content_length; 
      } 
      // if the maximum length is reached, get off the loop 
      if($total_length>= $length) { 
       break; 
      } 
     } 
    } else { 
     if (strlen($text) <= $length) { 
      return $text; 
     } else { 
      $truncate = substr($text, 0, $length - strlen($ending)); 
     } 
    } 
    // if the words shouldn't be cut in the middle... 
    if (!$exact) { 
     // ...search the last occurrence of a space... 
     $spacepos = strrpos($truncate, ' '); 
     // strrpos() returns false when no space is found; isset() would still pass 
     if ($spacepos !== false) { 
      // ...and cut the text in this position 
      $truncate = substr($truncate, 0, $spacepos); 
     } 
    } 
    // add the defined ending to the text 
    $truncate .= $ending; 
    if($considerHtml) { 
     // close all unclosed html-tags 
     foreach ($open_tags as $tag) { 
      $truncate .= '</' . $tag . '>'; 
     } 
    } 
    return $truncate; 
} 

?> 

See also here.
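
To see how the function above keeps tags out of the character count, here is a minimal sketch of just its tokenising step. It reuses the same regular expression as truncateHtml; the sample string and the names $html, $pairs and $visible are made up for illustration and are not part of the original code.

```php
<?php
// Sample input (an assumption for this sketch).
$html = 'I love <a href="http://kevinrose.com">Link</a> now';

// Same pattern as in truncateHtml: an optional tag (group 1)
// followed by the text up to the next tag (group 2).
preg_match_all('/(<.+?>)?([^<>]*)/s', $html, $lines, PREG_SET_ORDER);

$pairs = array();
$visible = 0;
foreach ($lines as $m) {
    if ($m[1] === '' && $m[2] === '') {
        continue; // the pattern also matches the empty string at the end
    }
    $pairs[] = array($m[1], $m[2]);
    $visible += strlen($m[2]); // only the text part counts toward the limit
}

foreach ($pairs as $pair) {
    printf("tag: %-35s text: %s\n", $pair[0], $pair[1]);
}
echo "visible characters: $visible\n";
```

Because the tags are appended to the output without being counted, a 50-character limit applies to the visible text only, which is why the link in the question survives intact.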


Thanks! Works great! – xtrimsky 2012-01-28 05:13:10


This is super. I was tired of this problem. This code works perfectly. – Pramod 2014-03-25 09:31:35

1

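Here is a somewhat shorter approach. It doesn't walk the DOM tree, but it works in almost all cases.

The method first strips all HTML tags from the content (so the tags don't count toward the string length). Then, if the string needs to be truncated, it truncates it and re-inserts all the HTML tags.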

<?php 
function short($string, $max = 255) { 
    // Save every tag together with its byte offset in the original string 
    preg_match_all('/<[^>]+>/', $string, $tags, PREG_OFFSET_CAPTURE); 
    $stripped = preg_replace('/<[^>]+>/', '', $string); // Strip html tags 

    // Truncate the string if needed 
    if (mb_strlen($stripped, 'utf-8') > $max) { 
     $truncated = mb_substr($stripped, 0, $max, 'utf-8'); 
     $limit = strlen($truncated); // Cut point in bytes 
     $removed = 0;    // Total byte length of the tags handled so far 

     // Re-insert the tags in their original order 
     foreach ($tags[0] as $match) { 
      list($tag, $offset) = $match; 
      // Offset in the stripped text; tags past the cut point are appended at the end 
      $pos = min($offset - $removed, $limit) + $removed; 
      $truncated = substr_replace($truncated, $tag, $pos, 0); // Insert the tag 
      $removed += strlen($tag); 
     } 

     $string = $truncated . '&hellip;'; 
    } 

    return $string; 
} 

echo short('I love @kevinrose\'s new website <a href="http://kevinrose.com">Link</a>. Here is a bit of additional text after the link.<a></a>', 50); 

This doesn't work with more HTML tags – Francesco 2015-11-10 16:29:43
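
For markup with many or nested tags, walking the DOM tree (the route the answer above deliberately skips) may be more robust. The following is only a rough sketch under some assumptions: truncateDom and truncateNode are hypothetical names, not functions from either answer; the `<?xml encoding="UTF-8">` prefix is a commonly used workaround to make DOMDocument::loadHTML() treat the input as UTF-8; and the early length check uses strip_tags(), so entities are counted by their source length.

```php
<?php
// Hypothetical helper: trims text nodes in place and removes every node
// that comes after the cut point.
function truncateNode(DOMNode $node, int &$remaining, string $ending): void {
    // Copy the child list first, because nodes are removed while iterating.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($remaining <= 0) {
            $node->removeChild($child); // everything after the cut is dropped
        } elseif ($child instanceof DOMText) {
            $length = mb_strlen($child->data, 'UTF-8');
            if ($length >= $remaining) {
                // The budget runs out in this text node: cut it and add the ending
                $child->data = mb_substr($child->data, 0, $remaining, 'UTF-8') . $ending;
                $remaining = 0;
            } else {
                $remaining -= $length;
            }
        } else {
            truncateNode($child, $remaining, $ending);
        }
    }
}

// Hypothetical wrapper: truncates the visible text of an HTML fragment to $max characters.
function truncateDom(string $html, int $max, string $ending = '...'): string {
    if (mb_strlen(strip_tags($html), 'UTF-8') <= $max) {
        return $html; // short enough, return unchanged
    }
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc = new DOMDocument();
    // The wrapper <div> gives the fragment a single root element to walk.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    $remaining = $max;
    truncateNode($root, $remaining, $ending);

    // Serialise only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo truncateDom('I love <a href="http://kevinrose.com">Link</a> and more text', 9), "\n";
```

Since the serialiser closes whatever elements are still open, the ending lands inside the last surviving element and no tag is left dangling. The trade-off is that the markup is re-serialised, so the output may differ cosmetically from the input.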