2011-12-20 398 views
1

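HtmlAgilityPack - Getting data from an HTML table

My program uses HtmlAgilityPack to grab an HTML web page and store it in a variable. I'm trying to get two tables from that HTML, both of which sit under div tags with a specific class (boardcontainer). My current code searches the whole page for every table and displays them, but when a cell is empty it throws an exception: "NullReferenceException was unhandled - Object reference not set to an instance of an object".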

HTML snippet (in this case I searched the site for "Microsoft"):

<div class="boardcontainer"> 
<table cellpadding="4" cellspacing="1" border="0" width="100%"> 
<tr><td colspan="6" class="catbg" height="18" >Main Database</td></tr> 
<tr> 
    <td class="windowbg" width="28%" align="center">Company Name</td> 
    <td class="windowbg" width="12%" align="center">0870/0871</td> 
    <td class="windowbg" width="12%" align="center">0844/0845</td> 
    <td class="windowbg" width="12%" align="center">01/02/03</td> 
    <td class="windowbg" width="12%" align="center">Freephone</td> 
    <td class="windowbg" width="24%" align="center">Other Information</td> 
</tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.websitename.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 01954 713950</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b><br><i>Straight to agent (no menu)</i><br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.websitename.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0118 909 7800</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Main UK Switchboard</b><br><i>Ask to be put through to required department</i><br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 

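This is my current code. It only grabs the tables and displays their rows and cells, then throws an exception when a row has no cells.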

string html = myRequest.GetResponse();
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//table"))
{
    Console.WriteLine("Found: " + table.Id);
    foreach (HtmlNode row in table.SelectNodes("tr"))
    {
        Console.WriteLine("row");
        foreach (HtmlNode cell in row.SelectNodes("th|td")) //Exception is thrown here
        {
            Console.WriteLine("cell: " + cell.InnerText);
        }
    }
}

How can I change this to search for the specific div class and extract the tables from it?

Thank you for reading.

FULL HTML:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> 
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> 
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> 

<form method="post" action="sdfsd.php"> 
<html> 
<head><title>SAYNOTO0870.COM - Non-Geographical Alternative Telephone Numbers</title> 
<meta name='copyright' content='SAYNOTO0870.COM - 1999-2010'> 
<META name="y_key" content="5a00e35b9f1986b0" > 
</head> 

<body> 
<BODY bgColor=#ffffe6> 
<table border="0" width="100%" id="headertable1"> 
    <tr> 

     <td width="335" valign="top"> 
<font face='Tahoma' size='2'> 
     <center><b> 
     <font size="6">SAY<font color="#FF0000">NO</font>TO<font color="#FF0000">0870</font>.COM</font> 
</b></center> 
</font> 
<font face='Tahoma' size='4'> 
<center><b><font size="2">Non-Geographical Alternative Telephone Numbers</font></b><font size="3"></font><font face='Tahoma' size='2'></font><br> 

<span style="font-weight: 700"><font size="1">Awarded Website Of The Day by 
BBC Radio 2, and featured<br> 
on the BBC Radio 2's 
Jeremy Vine show and The Guardian.</font></span></center> 
     </td> 
     <td width="403" rowspan="2" align="center"> 

<a href="http://energy.saynoto0870.com" target="_blank"><img src="/banners/energyheader.gif" alt="Save Money on your Gas and Electricity" width="420" height="60" border="0" align="middle"></a>   
     </td> </tr></table> 
<table width="92%" cellspacing="1" cellpadding="0" border="0" align="CENTER"> 
    <tr> 
    <td align="center"> 
     <table bgcolor="#AFC6DB" width="100%" cellspacing="0" cellpadding="0" align="center"> 

     <tr> 
      <td width="100%" align="center"> 
      <table border="0" width="100%" cellpadding="3" cellspacing="0" bgcolor="#AFC6DB" align="center"> 
       <tr> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="180"> 
<font face='Tahoma' size='2'> 
       <b> 
<a href="/"> 
       <img src="/images/home.gif" alt="Home" border="0">Home</a></td> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="143"> 
       <b> 

<font face='Tahoma' size='2'> 

       <a href="/cgi-bin/forum/YaBB.cgi"> 
       <img src="/images/forum.gif" alt="Discussion Forum" border="0">Discussion Forum</a></td> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="134"> 
       <font face="Tahoma" size="2"> 
       <b> 
       <a href="/links.php"> 
       <img src="/images/links.gif" alt="Links" border="0">Links</a></td> 

       <td valign="middle" bgcolor="#CCFFCC" align="center" width="103"> 
       <font face="Tahoma" size="2"> 
       <b> 
       <a href="/help.php"> 
       <img src="/images/help.gif" alt="Help" border="0">Help</a></td> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="114"> 
       <font face="Tahoma" size="2"> 
       <b> 

       <a href="/contact"> 
       <img src="/images/contact.gif" alt="Contact Us" border="0">Contact Us</a> 
       </td> 
       </tr> 
       <tr> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="321" colspan="2"> 
<font face='Tahoma' size='2'> 
       <a href="/search.php"> 
       <font face="Tahoma"> 

       <b> 
       <font size="2"> 
       <img src="/images/search.gif" alt="Search" border="0"></font></b></font><font size="2"><b>Search 
       to find an alternative number</b></font></a></td> 
       <td valign="middle" bgcolor="#CCFFCC" align="center" width="365" colspan="3"> 
<font face='Tahoma' size='2'> 
       <a href="/add.php"> 
       <font face="Tahoma"> 
       <b> 
       <font size="2"> 

       <img src="/images/addno.gif" alt="Add A New Number" border="0"></font></b></font><font size="2"><b>Click 
       here to add a new alternative number</b></font></a></td> 
       </tr> 
      </table> 
      </td> 
     </tr> 
     </table> 
    </td> 
    </tr> 

</table> 

<br> 
<center> 
<script type="text/javascript"><!-- 
google_ad_client = "pub-9959843696187618"; 
google_ad_width = 468; 
google_ad_height = 60; 
google_ad_format = "468x60_as"; 
google_ad_type = "text_image"; 
//2007-06-07: SAYNOTO0870-Header 
google_ad_channel = "6422558175"; 
google_color_border = "ffffe6"; 
google_color_bg = "ffffe6"; 
google_color_link = "32527A"; 
google_color_text = "000000"; 
google_color_url = "2D8930"; 
//--> 
</script> 
<script type="text/javascript" 
    src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> 
</script> 
</center> 
<BR><input type=hidden name="search_name" value="Microsoft"> 
</form> 
<link rel="stylesheet" href="search.css" type="text/css" /> 

    <table width="100%" align="center" border="0"> 
    <tr> 

    <td><font size="2"> 

<div class="seperator"></div> 

<div class="boardcontainer"> 
<table cellpadding="4" cellspacing="1" border="0" width="100%"> 
<tr><td colspan="6" class="catbg" height="18" >Main Database</td></tr> 

<tr> 
    <td class="windowbg" width="28%" align="center">Company Name</td> 
    <td class="windowbg" width="12%" align="center">0870/0871</td> 

    <td class="windowbg" width="12%" align="center">0844/0845</td> 
    <td class="windowbg" width="12%" align="center">01/02/03</td> 
    <td class="windowbg" width="12%" align="center">Freephone</td> 
    <td class="windowbg" width="24%" align="center">Other Information</td> 
</tr> 


    <tr> 

<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 01954 713950</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b><br><i>Straight to agent (no menu)</i><br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0118 909 7800</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Main UK Switchboard</b><br><i>Ask to be put through to required department</i><br><font size=1>Also for 0870 6010200</font></td></tr> 

    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35314502113</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b><br><i>Answers as Microsoft Ireland with same options as UK 08 numbers</i><br>Reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 241 1963</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 3147 4930</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0188354</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Product Activation</b><br><i>Home & Business (Volume Licensing)</i><br><font size=1>Also: 0800 018 8364 & +800 2284 8283<br>Also for 0870 6010100 & 0870 6010200</font></td></tr> 

    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 241 1963</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 9179016</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Volume Licensing</b></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 3027 6039</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 7318457</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Online Services Support</b><br><i>MSN, Hotmail, Live, Messenger etc</i><br><font size=1>Also: 0800 587 2920</font></td></tr> 

    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 607 0700</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 6006</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35317065353</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Ask Partner Hotline</b><br><i>Answers with same options</i><br>Reduce cost using 1899 (or similar)</td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 607 0700</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 6006</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 9173128</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Partner Network Regional Service Centre</b><br><i>Help with membership questions and tools, benefits and resource queries</i></td></tr> 

    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0324479</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Direct Services</b><br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk/msdn target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35318831002</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0517215</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>MSDN (Microsoft Developers Network)</b><br>When calling +353 reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr> 

    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk/technet target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35318831002</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 281221</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Microsoft Technet</b><br>When calling +353 reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr> 
    <tr> 
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.xbox.co.uk target="_blank">Microsoft XBOX</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 7365 9792</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 5871102</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b></td></tr> 

    <tr> 

</tr> 
</table> 
</div><br />  

<table width="100%" align="center" border="0"> 
    <tr><td><font size="2"> 
<div class="seperator"></div> 

<div class="boardcontainer"> 
<table cellpadding="4" cellspacing="1" border="0" width="100%"> 

<tr><td colspan="6" class="catbg" height="18" >Unverified Numbers Database</td></tr> 

<tr> 
    <td class="windowbg" width="28%" align="center">Company Name</td> 
    <td class="windowbg" width="12%" align="center">0870/0871</td> 
    <td class="windowbg" width="12%" align="center">0844/0845</td> 
    <td class="windowbg" width="12%" align="center">01/02/03</td> 
    <td class="windowbg" width="12%" align="center">Freephone</td> 
    <td class="windowbg" width="24%" align="center">Other Information</td> 

</tr> 

<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF> Microsoft</td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0870 501 0800</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0844 800 8338</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0118 909 7994</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> <b>Premier Support</b></td></tr> 
    <tr> 
<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF>Microsoft AskPartner (Licensing)</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0870 607 0700</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 020 8784 1000</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> Switchboard of Sitel UK in Kingston where the AskPartner team is based. Ask for Microsoft Team. 0800 - 1800.</td></tr> 

    <tr> 
<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF> Microsoft Office Live Meeting</td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 020 3024 9260</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0800 0854811</a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> EMC Conferencing on Meeting Place</td></tr> 

</tr> 
</table> 
</div><br /> 

<center> 
<a href="http://homephone.consumerchoices.co.uk/?partner=saynoto0870" target="_blank"> 

<img src="/banners/consumerchoices.png" border="0" alt="ConsumerChoices" align="middle"></img></a> 
<BR><BR> 
</center> 

<div class="seperator"> 
<table cellpadding="4" cellspacing="1" border="0" width="100%"> 
<tr> 
    <td class="titlebg" align="center" colspan="2"> 
     Info Centre 
    </td> 
</tr> 

    <td class="windowbg2"> 
     <div style="float: left; width: 59%; text-align: left;"> 

     <span class="small">Please use the Contact Us option, to report any incorrect numbers that you notice on the site. Thanks for your support.</span><br /> 
     </div> 
     <div style="float: left; width: 40%; text-align: left;"> 
     <div class="small" style="float: left; width: 49%;"><span style="color: red;"><b>lllll</b></span> Main Database - A number that has been checked and at the time it was checked worked correctly. Please let us know of any numbers that no longer work as expected.</div><div class="small" style="float: left; width: 49%;"><span style="color: #CCFFFF;"><b>lllll</b></span> Unverified Number - A number that has been added by a visitor to the website, and hasn't yet been verified as correct. Please use the Contact Us link at the top of the page to let us know if these work (or don't work) for you.</div> 
     </div> 

    </td> 
</tr> 
</table> 

</div> 
    </font></td> 
    </tr> 
</table> 


<br> 

<head> 
<style> 
<!--.smallfont{ font: 11px verdana, geneva, lucida, 'lucida grande', arial, helvetica, sans-serif;}--> 

</style> 
</head> 
<b> 
<center> 
<font color='red'> 
</center> 
</b> 
</font> 
<BR> 
<center> 

<script type="text/javascript"><!-- 
google_ad_client = "pub-9959843696187618"; 
google_ad_width = 728; 
google_ad_height = 90; 
google_ad_format = "728x90_as"; 
google_ad_type = "text_image"; 
//2007-06-07: SAYNOTO0870-Footer 
google_ad_channel = "7459969292"; 
google_color_border = "FFFFE6"; 
google_color_bg = "FFFFE6"; 
google_color_link = "32527A"; 
google_color_text = "000000"; 
google_color_url = "2D8930"; 
//--> 
</script> 
<script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></script> 
<BR></center> 
<BR><center><B> 

<font face="Tahoma" size="2"> 
Website and Content © 1999-2011 SAYNOTO0870.COM.&nbsp; All Rights Reserved</b>. 
<br><b>Written permission is required to duplicate any of the content within this site. </b></center></font> 
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script> 
<script type="text/javascript">_uacct = "UA-194609-1";urchinTracker();</script> 
</body></html> 

Answers

2

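The following XPath lets you search the HTML document for a specific DIV (one with the class 'boardcontainer') and select the tables directly inside it: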

//div[@class='boardcontainer']/table 
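
Note that `@class='boardcontainer'` matches only when the attribute value is exactly that string, which is the case in the HTML posted here. As a hedged sketch, assuming the div could one day carry additional class names, the common XPath 1.0 token-match idiom is shown below. The class `BoardTableXPath` and the method `CountTables` are hypothetical names used only for illustration; they are not part of HtmlAgilityPack.

```csharp
using HtmlAgilityPack;

public static class BoardTableXPath
{
    // Matches 'boardcontainer' as a whole token inside a space-separated
    // class attribute, so "boardcontainer wide" matches but "boardcontainerx" does not.
    public const string TablesInBoardContainer =
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' boardcontainer ')]/table";

    // Hypothetical helper: counts the tables selected by the XPath above.
    public static int CountTables(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes(TablesInBoardContainer);

        // A null result means nothing matched.
        return tables == null ? 0 : tables.Count;
    }
}
```

For the page in the question the simpler `@class='boardcontainer'` test is sufficient; the longer form only matters if the markup changes.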

To handle empty rows, simply check whether the returned HtmlNodeCollection is null before iterating over it.

Here is a complete example:

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
{
    Console.WriteLine("Found: " + table.Id);

    foreach (HtmlNode row in table.SelectNodes("tr"))
    {
        Console.WriteLine("row");

        HtmlNodeCollection cells = row.SelectNodes("th|td");

        if (cells == null)
        {
            continue;
        }

        foreach (HtmlNode cell in cells)
        {
            Console.WriteLine("cell: " + cell.InnerText);
        }
    }
}

You should also check whether any tables were found at all and, for each table found, whether it contains any rows.
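
The extra checks mentioned above could look roughly like the sketch below. It assumes, as the example in this answer does, that `SelectNodes` returns null rather than an empty collection when nothing matches. The class `BoardTableReader` and the method `ReadRows` are hypothetical names introduced for illustration, not HtmlAgilityPack APIs.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class BoardTableReader
{
    // Hypothetical helper: returns the trimmed cell texts of every non-empty row
    // in every table directly under a div with class 'boardcontainer'.
    public static List<List<string>> ReadRows(string html)
    {
        var result = new List<List<string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Guard 1: no matching tables on the page.
        HtmlNodeCollection tables =
            doc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table");
        if (tables == null)
        {
            return result;
        }

        foreach (HtmlNode table in tables)
        {
            // Guard 2: a table without any direct tr children.
            HtmlNodeCollection rows = table.SelectNodes("tr");
            if (rows == null)
            {
                continue;
            }

            foreach (HtmlNode row in rows)
            {
                // Guard 3: a row without any th or td cells.
                HtmlNodeCollection cells = row.SelectNodes("th|td");
                if (cells == null)
                {
                    continue;
                }

                var texts = new List<string>();
                foreach (HtmlNode cell in cells)
                {
                    texts.Add(cell.InnerText.Trim());
                }
                result.Add(texts);
            }
        }

        return result;
    }
}
```

Returning a list instead of writing to the console is a design choice for this sketch only; it keeps the null handling in one place and lets the caller decide how to display the cells.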

+0

Thank you very much, @Hans! – 2011-12-20 20:39:56

+0

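Your approach works, @Hans, but for some reason when it runs it only grabs the data from the first table and 1 cell from the second table. The second table sits in the same kind of div as the first, so I'm not sure why it isn't grabbing that one as well? – 2011-12-21 00:49:41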

+0

@JoeChatterton: Can you post the full HTML? – Hans 2011-12-21 17:48:23