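HTML Agility Pack: get text from paragraph tags
I am trying to get the text of the paragraph tags inside a div using HtmlAgilityPack 2.28 in a Windows Phone 8.1 app.
The structure of the div is: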
<div id="55">
<p> </p>
<p><span class="dropcap">W
</span><span class="zw-portion"><strong>ith the start of festive season in India</strong>, we
will also witness the f<strong>irst London Derby</strong> of the season
between the newly London rivals <strong>Chelsea and Arsenal</strong>. It will be a great chance
for Arsene Wenger to get rid of his <strong>1000</strong></span>
<strong><span class="zw-portion">th</span><span class="zw-portion"> managed </span>
<span class="zw-portion">6-0 </spa>
<span class="zw-portion">massacre</span></strong>
<span class="zw-portion"> in March,</span>
<span class="zw-portion"> </span>
<span class="zw-portion">while the Special One will be eager to continue his winning rampage
</span>
<span class="zw-portion"> </span>
<span class="zw- portion">over his “<strong>Specialist in Failure</strong>” counterpart. Although
both clubs can boast of being unbeaten this season and both clubs can take this opportunity
</span>
<span class="zw-portion"> to bring down their rival</span><span class="zw-portion">.</span></p>
<p> </p>
<p><iframe width="640" height="360" src="https://www.youtube.com/embed/zFBN8M1pCxo?
feature=oembed" frameborder="0" allowfullscreen=""></iframe></p>
<p class="zw-paragraph" data-textformat="
{"type":"text","td":"none"}"></p>
<p class="zw-paragraph" data-textformat=
{"type":"text","td":"none"}">
<span class="zw-portion">The rivalry between Chelsea and Arsenal was not as a primary London
Derby, until Chelsea rose to top of Premier League in 2000’s, when they consistently competed
against each other. The rivalry between the two clubs rose higher as compared to their
traditional rivals. Both the clubs rivalry are now not only limited to their pitch but has also
been to the fans. In 2009 survey by Football Fans Census, Arsenal fans named Chelsea as the
<strong>most disliked club</strong> </span>
<span class="zw-portion"> ahead of their traditional rivals <strong>Manchest</strong></span>
<strong> <span class="zw-portion">er United and Tottenham Hotspur</span></strong>
<span class="zw-portion">. However the report of the other camp doesn’t differ much as Chelsea
fans ranks Arsenal as their <strong>second most-disliked club</strong></span>
<strong><span class="zw-portion">.
</span></strong></p>
</div>
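I want to extract only the text contained in the paragraph elements inside the div. So far I have written the following code, where feedurl holds the address of the page to extract data from (the address is retrieved correctly). After that, I try to get a reference to the div using its id, which is always 55.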
var feedurl = GetValue("feedurl");
string htmlPage = "asdsad";
HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(feedurl);
htmldoc.OptionUseIdAttribute=true;
HtmlNode div = htmldoc.GetElementbyId("55");
if (div != null)
{
    htmlPage += "done";
}
_content = htmlPage;
return _content;
htmldoc.GetElementbyId("55"); is returning a null reference. I have read about and tried htmldoc.DocumentNode.SelectNodes([arguments]), but no SelectNodes method is available to me. I am lost as to how to proceed. Please help.
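For readers in the same position, here is a minimal sketch of one way to reach the paragraphs without SelectNodes, on the assumption that the Windows Phone build of HtmlAgilityPack leaves out the XPath methods but still offers the LINQ-friendly Descendants method. The class name ParagraphExtractor and the method GetParagraphTexts are hypothetical names chosen for illustration, and the html argument is assumed to already contain the page markup rather than its address.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class ParagraphExtractor
{
    // Returns the text of every non-empty <p> inside the element whose id
    // equals divId, or an empty list when no such div exists.
    public static List<string> GetParagraphTexts(string html, string divId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // expects HTML markup, not a URL

        // Find the div by comparing its id attribute (no XPath needed).
        HtmlNode div = doc.DocumentNode
            .Descendants("div")
            .FirstOrDefault(n => n.GetAttributeValue("id", "") == divId);

        if (div == null)
            return new List<string>();

        // InnerText drops the nested <span>/<strong> tags;
        // DeEntitize turns entities such as &amp; back into characters.
        return div.Descendants("p")
            .Select(p => HtmlEntity.DeEntitize(p.InnerText).Trim())
            .Where(text => text.Length > 0)
            .ToList();
    }
}
```

A call such as ParagraphExtractor.GetParagraphTexts(pageHtml, "55") would then give one string per paragraph, skipping the empty paragraphs and the one that only wraps the iframe.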
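I will have to use System.Linq, right? If after this I use 'if (div55 != null) { do something }', it does nothing. If I use 'div55.InnerText', I get a NullReferenceException. – user3263192 2014-10-06 09:14:28
Make sure you have loaded the HTML into the 'HtmlDocument' correctly (you can check the 'DocumentNode.OuterHtml' property to see whether it contains the expected HTML markup) – har07 2014-10-06 10:28:58
'DocumentNode.OuterHtml' returns the page URL stored in the feedurl variable. Is that right? Forgive me for asking such a silly question; I am a newbie and could not find the answer online. – user3263192 2014-10-06 10:46:43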
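The last comment suggests the document was parsed from the URL string itself, since LoadHtml treats its argument as markup and does not download anything. A hedged sketch of one possible fix follows, assuming System.Net.Http.HttpClient is available to the app and that feedurl is an absolute http(s) address. PageLoader and LoadFromUrlAsync are hypothetical names used only for this example.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class PageLoader
{
    // Downloads the page at url and parses the returned markup.
    public static async Task<HtmlDocument> LoadFromUrlAsync(string url)
    {
        using (var client = new HttpClient())
        {
            // Fetch the page source first; LoadHtml does not do this itself.
            string html = await client.GetStringAsync(url);

            var doc = new HtmlDocument();
            doc.LoadHtml(html); // now receives real HTML markup
            return doc;
        }
    }
}
```

With this in place, something like var doc = await PageLoader.LoadFromUrlAsync(feedurl); followed by doc.GetElementbyId("55") should have actual markup to search, and 'DocumentNode.OuterHtml' can be inspected to confirm that it holds the page source rather than the address.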