2011-05-11

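For example, I have a series of <tr> tags that I want to collect. I need to split each of these tags out into a separate element so that it is easier to parse. Can I use HtmlAgilityPack to split an HTML document on a particular tag?

Is this possible?

An example of the markup: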

<tr class="first-in-year"> 
    <td class="year">2011</td> 

    <td class="img"><a href="/battlefield-3/61-27006/"><img src= 
    "http://media.giantbomb.com/uploads/6/63038/1700748-bf3_thumb.jpg" alt=""></a></td> 

    <td class="title"> 
    <a href="/battlefield-3/61-27006/">Battlefield 3</a> 

    <p class="deck">Battlefield 3 is DICE's next installment in the franchise and 
    will be on PC, PS3 and Xbox 360. The game will feature jets, prone, a 
    single-player and co-op campaign, and 64-player multiplayer (on PC). It's due out 
    in Fall of 2011.</p> 
    </td> 

    <td class="date">Expected: Q4 2011</td> 

    <td><a href="/pc/60-94/" class="PC">PC</a>, <a href="/xbox-360/60-20/" class= 
    "X360">X360</a>, <a href="/playstation-3/60-35/" class="PS3">PS3</a></td> 
</tr> 

<tr> 
    <td class="year"></td> 

    <td class="img"><a href="/forza-motorsport-4/61-33400/"><img src= 
    "http://media.giantbomb.com/uploads/0/1992/1654849-forza4_thumb.jpg" alt= 
    ""></a></td> 

    <td class="title"> 
    <a href="/forza-motorsport-4/61-33400/">Forza Motorsport 4</a> 

    <p class="deck">The next installment of Turn 10's racing franchise slated for 
    release in Fall 2011. It is set to feature 16 player online races, dynamic race 
    conditions, cars from over 80 manufacturers, and compatibility with Kinect, both 
    on and off the racetrack.</p> 
    </td> 

    <td class="date">Expected: Oct 2011</td> 

    <td><a href="/xbox-360/60-20/" class="X360">X360</a></td> 
</tr> 

<tr> 
    <td class="year"></td> 

    <td class="img"><a href="/max-payne-3/61-23398/"><img src= 
    "http://media.giantbomb.com/uploads/0/1400/938434-custom_1237811317319_mp3_poster_thumb.jpg" 
    alt=""></a></td> 

    <td class="title"> 
    <a href="/max-payne-3/61-23398/">Max Payne 3</a> 

    <p class="deck">The long awaited third instalment in Remedy's beloved series, in 
    which an aging Max Payne faces one final chance to redeem himself.</p> 
    </td> 

    <td class="date">Expected: 2011</td> 

    <td><a href="/pc/60-94/" class="PC">PC</a>, <a href="/playstation-3/60-35/" class= 
    "PS3">PS3</a>, <a href="/xbox-360/60-20/" class="X360">X360</a></td> 
</tr> 

So in this example I would end up with three elements. :)

Answer


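You can't split it into multiple HTML documents on a tag, if that's what you mean. You can, however, select the individual TD elements and parse each of them separately.

The XPath selector //td selects all of those elements, and you can then pass each one to your parsing method.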

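// LoadHtmlHowever() is a placeholder for however you load the document.
HtmlAgilityPack.HtmlDocument doc = LoadHtmlHowever();
// Keep the result; SelectNodes returns null when nothing matches.
HtmlAgilityPack.HtmlNodeCollection cells = doc.DocumentNode.SelectNodes("//td");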
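
Since the question asks about whole <tr> rows rather than individual cells, the same idea can be applied with the selector //tr. The following is only a minimal sketch, not a definitive implementation: the class name RowSplitter, the method SplitRows, and the shortened sample markup are assumptions made for illustration, and the class attribute values ("title", "date") are taken from the markup in the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RowSplitter
{
    // Hypothetical helper: returns every <tr> in the markup as its own node.
    public static IList<HtmlNode> SplitRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//tr");
        return rows == null ? new List<HtmlNode>() : new List<HtmlNode>(rows);
    }

    public static void Main()
    {
        // Shortened version of the markup from the question (assumed sample data).
        string html =
            "<table>" +
            "<tr class=\"first-in-year\">" +
            "<td class=\"title\"><a href=\"/battlefield-3/61-27006/\">Battlefield 3</a></td>" +
            "<td class=\"date\">Expected: Q4 2011</td>" +
            "</tr>" +
            "<tr>" +
            "<td class=\"title\"><a href=\"/forza-motorsport-4/61-33400/\">Forza Motorsport 4</a></td>" +
            "<td class=\"date\">Expected: Oct 2011</td>" +
            "</tr>" +
            "</table>";

        foreach (HtmlNode row in SplitRows(html))
        {
            // The leading "." keeps each query relative to the current row
            // instead of searching the whole document again.
            HtmlNode link = row.SelectSingleNode(".//td[@class='title']/a");
            HtmlNode date = row.SelectSingleNode(".//td[@class='date']");

            string title = link != null ? link.InnerText.Trim() : "(no title)";
            string href = link != null ? link.GetAttributeValue("href", "") : "";
            string when = date != null ? date.InnerText.Trim() : "(no date)";

            Console.WriteLine(title + " | " + href + " | " + when);
        }
    }
}
```

Each returned node still belongs to the original document, so nothing is actually split into separate documents. If a standalone string per row is needed, row.OuterHtml gives the markup of that row, which could be loaded into a new HtmlDocument with LoadHtml.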