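How can I scrape values from a web page using Html Agility Pack?
I need some values from a web page, so I built a scraper with Html Agility Pack.
Below are the page's HTML and my C# code.
HTML of the page: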
<div class="box-overflow">
<div class="box-overflow__in">
<table class="table-main js-tablebanner-t js-tablebanner-ntb">
<tr>
<th class="h-text-left" colspan="2">17. Round</th>
<th class="h-text-center">1</th>
<th class="h-text-center">X</th>
<th class="h-text-center">2</th>
<th> </th>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/lechia-gdansk-leczna/Kjnscb6D/" class=
"in-match"><span>Lechia Gdansk</span> - <span>Leczna</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/lechia-gdansk-leczna/Kjnscb6D/">3:0</a></td>
<td class="table-matches__odds colored"></td>
<td class="table-matches__odds" data-odd="4.04"></td>
<td class="table-matches__odds" data-odd="6.29"></td>
<td class="h-text-right h-text-no-wrap">28.11.2016</td>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/plock-piast-gliwice/KrhILsqE/" class=
"in-match"><span>Plock</span> - <span>Piast Gliwice</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/plock-piast-gliwice/KrhILsqE/">0:0</a></td>
<td class="table-matches__odds" data-odd="2.05"></td>
<td class="table-matches__odds colored"></td>
<td class="table-matches__odds" data-odd="3.50"></td>
<td class="h-text-right h-text-no-wrap">27.11.2016</td>
</tr>
<tr>
<td class="h-text-left"><a href=
"/soccer/poland/ekstraklasa/slask-wroclaw-legia/bZjMK1bK/" class=
"in-match"><span>Slask Wroclaw</span> - <span>Legia</span></a></td>
<td class="h-text-center"><a href=
"/soccer/poland/ekstraklasa/slask-wroclaw-legia/bZjMK1bK/">0:4</a></td>
<td class="table-matches__odds" data-odd="4.53"></td>
<td class="table-matches__odds" data-odd="3.64"></td>
<td class="table-matches__odds colored"></td>
<td class="h-text-right h-text-no-wrap">27.11.2016</td>
</tr>
</table>
</div>
</div>
My C#:
var url = "http://www.betexplorer.com/soccer/poland/ekstraklasa/results/";
var web = new HtmlWeb();
var doc = web.Load(url);
Bets = new List<Bet>();
// Read the rows
var Rows = doc.DocumentNode.SelectNodes("//table");
foreach (var row in Rows)
{
    if (!row.GetAttributeValue("class", "").Contains("table-main js-tablebanner-t js-tablebanner-ntb"))
    {
        if (string.IsNullOrEmpty(row.InnerText))
            continue;
        var rowBet = new Bet();
        foreach (var node in row.ChildNodes)
        {
            var data_odd = node.GetAttributeValue("data-odd", "");
            if (string.IsNullOrEmpty(data_odd))
            {
                if (node.GetAttributeValue("class", "").Contains("in-match"))
                {
                    rowBet.Match = node.InnerText.Trim();
                    var matchTeam = rowBet.Match.Split(new[] { " - " }, StringSplitOptions.RemoveEmptyEntries);
                    rowBet.Home = matchTeam[0];
                    rowBet.Host = matchTeam[1];
                }
                if (node.GetAttributeValue("class", "").Contains("h-text-center"))
                {
                    rowBet.Result = node.InnerText.Trim();
                    var matchPoints = rowBet.Result.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
                    int help;
                    if (int.TryParse(matchPoints[0], out help))
                    {
                        rowBet.HomePoints = help;
                    }
                    if (matchPoints.Length == 2 && int.TryParse(matchPoints[1], out help))
                    {
                        rowBet.HostPoints = help;
                    }
                }
                if (node.GetAttributeValue("class", "").Contains("h-text-right h-text-no-wrap"))
                    rowBet.Date = node.InnerText.Trim();
            }
            else
            {
                rowBet.Odds.Add(data_odd);
            }
        }
        if (!string.IsNullOrEmpty(rowBet.Match))
            Bets.Add(rowBet);
    }
}
Here is some more detail. I need to get:
the team names (e.g. Lechia Gdansk - Leczna),
the result (e.g. 3:0),
the data-odd values (e.g. 1.49, 4.04, 6.29),
and the match date (e.g. 28.11.2016).
If anyone needs more information, just ask what you want to know. Thanks.
`if (!row.GetAttributeValue("class", "").Contains("table-main js-tablebanner-t js-tablebanner-ntb"))` - those classes are declared on the table itself, not on the rows. – stuartd
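Following up on the comment above, here is a minimal sketch of one way to restructure the loop: filter on the table's class in the XPath, iterate its `<tr>` rows, and read each value from the cells of that row. It only assumes the HTML posted in the question. The `Bet` class shown here is hypothetical (the original one is not in the question), as are the names `BetScraper` and `Parse`. In the posted HTML the `colored` cell carries no `data-odd` attribute, so this sketch simply skips cells without one; the live page may look different.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical container; the question does not show the real Bet class.
public class Bet
{
    public string Home { get; set; }
    public string Host { get; set; }
    public string Result { get; set; }
    public string Date { get; set; }
    public List<string> Odds { get; } = new List<string>();
}

public static class BetScraper
{
    public static List<Bet> Parse(HtmlDocument doc)
    {
        var bets = new List<Bet>();

        // Match the table by its class, then keep only rows that have <td> cells.
        // The header row uses <th>, so it is skipped automatically.
        var rows = doc.DocumentNode.SelectNodes(
            "//table[contains(@class,'table-main')]//tr[td]");
        if (rows == null)
            return bets; // SelectNodes returns null when nothing matches

        foreach (var row in rows)
        {
            // The two team names are the <span> children of the "in-match" link.
            var teams = row.SelectNodes(".//a[contains(@class,'in-match')]/span");
            if (teams == null || teams.Count < 2)
                continue;

            var bet = new Bet
            {
                Home = HtmlEntity.DeEntitize(teams[0].InnerText).Trim(),
                Host = HtmlEntity.DeEntitize(teams[1].InnerText).Trim()
            };

            // The result (e.g. "3:0") is the link text inside the centered cell.
            var result = row.SelectSingleNode("./td[contains(@class,'h-text-center')]/a");
            bet.Result = result?.InnerText.Trim() ?? "";

            // Odds live in the data-odd attribute of the odds cells.
            var oddsCells = row.SelectNodes("./td[contains(@class,'table-matches__odds')]");
            if (oddsCells != null)
            {
                foreach (var cell in oddsCells)
                {
                    var odd = cell.GetAttributeValue("data-odd", "");
                    if (!string.IsNullOrEmpty(odd))
                        bet.Odds.Add(odd);
                }
            }

            // The date is the text of the right-aligned, no-wrap cell.
            var date = row.SelectSingleNode("./td[contains(@class,'h-text-no-wrap')]");
            bet.Date = date?.InnerText.Trim() ?? "";

            bets.Add(bet);
        }

        return bets;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same URL as in the question; the page layout may have changed since.
        var doc = new HtmlWeb().Load(
            "http://www.betexplorer.com/soccer/poland/ekstraklasa/results/");

        foreach (var bet in BetScraper.Parse(doc))
        {
            Console.WriteLine("{0} - {1} | {2} | {3} | {4}",
                bet.Home, bet.Host, bet.Result,
                string.Join(", ", bet.Odds), bet.Date);
        }
    }
}
```

Selecting relative to each row (`./td[...]`, `.//a[...]`) keeps the values of one match together, which the original loop over the table's direct child nodes could not do because the attributes it checks sit on the `<td>` and `<a>` elements further down. To try it against the snippet from the question instead of the live site, the document can be built with `new HtmlDocument()` followed by `LoadHtml(html)`.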