I am using XElement (LINQ to XML) to parse some RSS feeds. How can I make LINQ to XML ignore HTML code?
Example RSS item:
<item>
<title>Waterfront Ice Skating</title>
<link>http://www.eventfinder.co.nz/2011/sep/wellington/wellington-waterfront-ice-skating?utm_medium=rss</link>
<description><p>An ice skating rink in Wellington for a limited time only!
Enjoy the magic of the New Zealand winter at an outdoor skating experience with all the fun and atmosphere of New York&#039;s Rockefeller Centre or Central Park, ...</p><p>Wellington | Friday, 30 September 2011 - Sunday, 30 October 2011</p></description>
<content:encoded><![CDATA[Today, Wellington Waterfront<br/>Wellington]]></content:encoded>
<guid isPermalink="false">108703</guid>
<pubDate>2011-09-30T10:00:00Z</pubDate>
<enclosure url="http://s1.eventfinder.co.nz/uploads/events/transformed/190501-108703-13.jpg" length="5000" type="image/jpeg"></enclosure>
</item>
Everything works fine, but the description element contains a lot of HTML markup that I need to remove.
The description:
<description><p>An ice skating rink in Wellington for a limited time only!
Enjoy the magic of the New Zealand winter at an outdoor skating experience with all the fun and atmosphere of New York&#039;s Rockefeller Centre or Central Park, ...</p><p>Wellington | Friday, 30 September 2011 - Sunday, 30 October 2011</p></description>
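Can anyone help?
What do you mean by "ignore HTML code"? Do you want to extract the text? – adatapost
@AVD Yes, I only want to extract the text and ignore the markup. – Rhys
Take a look at this link: http://www.dotnetperls.com/remove-html-tags – adatapost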
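For what it's worth, here is a minimal sketch of the regex-based tag stripping that the linked page describes, applied to the description element. It assumes the feed delivers the HTML as escaped text inside <description> (so that XElement's string value still contains the tags), and the names RssText, StripHtml and DescriptionText are hypothetical helpers, not part of LINQ to XML. A regex is only a rough tool here; it is not a real HTML parser and can be fooled by unusual markup.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper class, named here only for illustration.
public static class RssText
{
    // Naive pattern: anything that looks like <...> is treated as a tag.
    private static readonly Regex TagPattern = new Regex("<[^>]+>");

    // Removes tags, decodes HTML entities, and collapses whitespace.
    public static string StripHtml(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        // Replace each tag with a space so adjacent paragraphs do not run together.
        string withoutTags = TagPattern.Replace(html, " ");

        // Turn entities such as &amp; into their plain characters.
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Collapse runs of whitespace (including newlines) into single spaces.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }

    // Reads the <description> child of an <item> and returns its plain text.
    // The explicit string conversion yields null if the element is missing,
    // which StripHtml turns into an empty string.
    public static string DescriptionText(XElement item)
    {
        return StripHtml((string)item.Element("description"));
    }
}
```

With an item such as XElement.Parse("<item><description>&lt;p&gt;An ice skating rink&lt;/p&gt;&lt;p&gt;Wellington&lt;/p&gt;</description></item>"), calling RssText.DescriptionText(item) should give "An ice skating rink Wellington". If the feed instead contains the p tags as real XML child elements, the string value of the element is already the concatenated text without tags, and the same helper still tidies up the whitespace.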