2011-09-03 59 views

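How to remove all tags from a WordPress post except one, using the DOM

I am trying to strip everything from the following string except the object tag and its child tags: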

<p>If a post is marked video, and there is text BEFORE the video, the video player does not appear! We only see the actual text for the url…</p> 
<p>&nbsp;</p> 
<p><object width="584" height="463"><param value="http://www.youtube.com/v/Clp9AeBdgL0?version=3" name="movie"><param value="true" name="allowFullScreen"><param value="always" name="allowscriptaccess"><embed width="584" height="463" allowfullscreen="true" allowscriptaccess="always" type="application/x-shockwave-flash" src="http://www.youtube.com/v/Clp9AeBdgL0?version=3"></object></p> 
<p>Of course, you might even have a paragraph AFTER the video. Could be lots and lots of meaningless text &ndash; we should definitely limit this. Lorem ipsum</p> 

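As you can see above, the third 'p' tag contains an 'object' tag. I want to get rid of everything except the 'object' tag and its contents. In other words, I want to traverse the DOM and remove everything except: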

<object width="584" height="463"><param value="http://www.youtube.com/v/Clp9AeBdgL0?version=3" name="movie"><param value="true" name="allowFullScreen"><param value="always" name="allowscriptaccess"><embed width="584" height="463" allowfullscreen="true" allowscriptaccess="always" type="application/x-shockwave-flash" src="http://www.youtube.com/v/Clp9AeBdgL0?version=3"></object> 

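I can write a function that removes any specific tag (p, img, div, etc.) from the content string by traversing the DOM, but I cannot figure out how to preserve the contents of a child tag in this case. Can anyone help?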

Answer

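Rather than traversing the DOM with an XML parser object (which sounds like what you are doing; apologies if I have that wrong), I suggest simply running a regular expression search over your string.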

PHP supports PCRE (Perl-compatible regular expressions) through its preg_* functions.

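Edit: It looks like '/<object .*<\/object>/' works. You can test PHP regular expressions with an online tester; I used the preg_match() function. Also, if a page contains more than one <object>, make sure the match is not "greedy", otherwise a single match will run from the first opening tag to the last closing tag. Finally, this will not work with nested objects, but I do not expect you will have them.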
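To illustrate the greedy-matching caveat, here is a minimal sketch of a lazy variant that collects every <object> in a string. The sample input and the variable names ($subject, $objects) are assumptions made for the example; the `s` and `i` modifiers are additions that let `.` span newlines and ignore tag case.

```php
<?php
// Hypothetical input: two <object> elements separated by other markup.
$subject = '<p>Intro</p><p><object width="1"><param name="movie" value="a"></object></p>'
         . '<p>Middle</p><object width="2"><embed src="b"></object><p>End</p>';

// ".*?" is lazy, so each match stops at the nearest closing tag.
// "s" lets "." match newlines; "i" makes the tag name case-insensitive.
$pattern = '/<object\b.*?<\/object>/is';

$objects = array();
if (preg_match_all($pattern, $subject, $matches)) {
    // $matches[0] holds the full text of every match.
    $objects = $matches[0];
}

echo implode("\n", $objects), "\n";
```

With the greedy `.*` the same input would yield one match spanning both objects and the "Middle" paragraph between them. As noted above, nested objects are still not handled by either form.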

So the whole snippet, for a string containing a single <object>, might be:

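// Greedy pattern: suitable when the string holds a single <object>.
$pattern = '/<object .*<\/object>/';
$subject = '...'; // replace with your string containing the HTML
$matches = array();

// preg_match() returns 1 on a match and fills $matches[0] with the matched text.
if (preg_match($pattern, $subject, $matches))
{
    echo $matches[0];
}
else
{
    echo "No match found.";
}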
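If a parser-based approach is preferred after all, as the question originally intended, the following is a rough sketch using PHP's built-in DOMDocument class rather than a regular expression. The sample markup and the names $html and $kept are assumptions for illustration; this is not code from the original post.

```php
<?php
// Hypothetical post content; in practice this would be the post body.
$html = '<p>Text before the video.</p>'
      . '<p><object width="584" height="463">'
      . '<param value="http://www.youtube.com/v/Clp9AeBdgL0?version=3" name="movie">'
      . '</object></p>'
      . '<p>Text after the video.</p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);  // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$kept = array();
foreach ($doc->getElementsByTagName('object') as $node) {
    // saveHTML($node) serializes only this element and its children.
    $kept[] = $doc->saveHTML($node);
}

echo implode("\n", $kept), "\n";
```

Instead of deleting every unwanted tag, this keeps only the elements of interest and discards the rest. Note that getElementsByTagName() also returns nested <object> elements, so nested objects would appear twice unless they are filtered out.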