2016-10-10

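I use several functions in PHP to count words and characters and to estimate reading time. They all share one small "bug": they count everything, including the bbCode (and the smileys). I don't want that! The bbCode should not be included in the reading time or in the word/character counters.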

function calculate_readingtime($string) { 
    $word = str_word_count(strip_tags($string)); 
    $m = floor($word/200); 
    $s = floor($word % 200/(200/60)); 

    $minutes = ($m != 0 ? $m.' min.' : ''); 
    $seconds = (($m != 0 AND $s != 0) ? ' ' : '') . $s.' sec.'; 

    return $minutes . $seconds; 
} 

$content = 'This is some text with [b]bbCode[/b]! Oh, so pretty :D And here\'s is a link too: [url="https://example.com/"]das linkish[/url]. What about an image? That\'s pretty to, you know. [img src="https://example.com/image.jpg" size="128" height="128" width="128"] And another one: [img src="https://example.com/image.jpg" height="128"]'; 
$reading_time = calculate_readingtime($content); 
$count_words = str_word_count($content, 1, 'àáãâçêéíîóõôúÀÁÃÂÇÊÉÍÎÓÕÔÚÅåÄäÖö'); 
$count_chars_with_spaces = mb_strlen($content); 

echo 'Reading time: '.$reading_time.'<br>'; 
echo 'Words: '.count($count_words).'<br>'; 
echo 'Characters with spaces: '.$count_chars_with_spaces; 

# OUTPUT 
Reading time: 16 sec. 
Words: 55 
Characters with spaces: 326 

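I want the counters (including the reading time) to be more accurate by leaving out the bbCode tags while still counting the text inside them (for example, count the text bbCode from [b]bbCode[/b]).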

How can I accomplish this?

Answer


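Stripping BBCode tags from a string with preg_replace is relatively easy, especially in a language like PHP that supports the PCRE library. Making a few assumptions about your BBCode syntax (attribute values are double-quoted and tag names consist of word characters), this is the shortest way: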

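// Removes opening tags (with optional quoted attributes) and closing tags, keeping the text between them.
$stripped = preg_replace('@\[(?:\w+(?:="(?>.*?"))?(?: \w+="(?>.*?"))*|/\w+)]@s', '', $content);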

Demo on Regex101
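
To tie this back to the counters in the question, here is a minimal sketch that strips the tags first and then counts on the result. The helper name `strip_bbcode` and the variable `$plain` are hypothetical and not part of the original answer, and the sample string is shortened for illustration. Note that smileys such as `:D` are not BBCode tags, so this pattern leaves them in place.

```php
<?php
// Hypothetical helper: removes BBCode tags but keeps the text between them.
function strip_bbcode(string $text): string {
    $pattern = '@\[(?:\w+(?:="(?>.*?"))?(?: \w+="(?>.*?"))*|/\w+)]@s';
    // preg_replace returns null on failure; fall back to the input in that case.
    return preg_replace($pattern, '', $text) ?? $text;
}

$content = 'Some [b]bold[/b] text and a [url="https://example.com/"]link[/url].';
$plain = strip_bbcode($content);

echo $plain, "\n";                  // prints: Some bold text and a link.
echo str_word_count($plain), "\n";  // prints: 6
echo mb_strlen($plain), "\n";       // prints: 26 (requires the mbstring extension)
```

The same `$plain` string can then be passed to `calculate_readingtime()` from the question, so all three numbers are computed from the text without tags.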

A better approach is to be more precise about closing tags and nesting:

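function parse($str) { 
    // Match an opening tag, its optional quoted attributes and, if present, its content up to the matching closing tag.
    return preg_replace_callback('@\[(\w+)(?:="(?>.*?"))?(?: \w+="(?>.*?"))*](?:(.*?)\[/\1])?@s', 
     // Group 2 is missing for tags without a closing tag (e.g. [img ...]); otherwise recurse into the content.
     function($matches) { return isset($matches[2]) ? parse($matches[2]) : ''; }, 
     $str 
    ); 
} 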

Demo on Ideone
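
As an illustration of the nested variant, the sketch below runs the same idea on a string with one tag nested inside another and one tag that has no closing tag. The function name `strip_bbcode_nested` and the sample string are assumptions made for this example, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical name; same idea as parse(), with an explicit check for the optional group.
function strip_bbcode_nested(string $str): string {
    $pattern = '@\[(\w+)(?:="(?>.*?"))?(?: \w+="(?>.*?"))*](?:(.*?)\[/\1])?@s';
    $result = preg_replace_callback(
        $pattern,
        // Group 2 is absent for tags without a closing tag, e.g. [img ...].
        fn(array $m): string => isset($m[2]) ? strip_bbcode_nested($m[2]) : '',
        $str
    );
    // preg_replace_callback returns null on failure; fall back to the input.
    return $result ?? $str;
}

$sample = 'A [b]bold [i]nested[/i][/b] word and [img src="https://example.com/image.jpg" height="128"] an image.';
echo strip_bbcode_nested($sample), "\n";
// prints: A bold nested word and  an image.
```

One trade-off compared with the shorter pattern: because this version only matches from an opening tag, a stray closing tag such as `[/b]` with no opening partner is left in the text.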