2016-03-06

Extracting data from the web using R

I am trying to extract data from the following page:

http://www.covers.com/sports/NCAAB/matchups?selectedDate=2015-02-28

I am using the code below:

library(XML) 
library(RCurl) 

url1<-"http://www.covers.com/sports/NCAAB/matchups?selectedDate=2015-02-28" 
data1<-htmlTreeParse(url1) 
competype<-xpathSApply(xmlRoot(data1),"//div[@class = 'data-competition-type']") 

However, competype comes back as an empty list.

Part of data1 looks like this:

<div class="cmg_matchup_game_box" data-home-score="54" data-away-score="51" data-event-id="888836" data-index="147" data-following="false" data-last-update="2015-03-01T03:12:09.0000000" data-link="/Sports/NCAAB/Matchups/888836" data-handicap-difference="0.5" data-game-odd="-3.5" data-game-total="128" data-line-moves="7" data-sdi-event-id="/sport/basketball/competition:888836" data-game-date="2015-02-28 23:59:00" data-top-25="false" data-competition-type="Regular Season" data-conference="Big West" data-home-conference="Big West" data-away-conference="Big West"> 

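I want to extract the "game competition type", i.e. the value of the data-competition-type attribute. How can I do this in R? I would appreciate any help. Thanks a lot.

Answers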


This should work:

nodes <- getNodeSet(xmlRoot(data1),"//div[@class = 'cmg_matchup_game_box']") 
sapply(nodes, xmlGetAttr, "data-competition-type")
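
For context, the XPath in the question looks for a div whose class equals 'data-competition-type', but in the HTML shown that name is an attribute of the div, not a class value, so nothing matches. Below is a minimal, self-contained sketch of the same idea that runs without hitting the site. The inline HTML string is an assumption: it is a trimmed copy of the div from the question plus one hypothetical div that lacks the attribute, and the variable names html and doc are mine.

```r
library(XML)

# Assumed sample input: the first div is trimmed from the snippet in the
# question; the second div is hypothetical and only shows that elements
# without the attribute are skipped.
html <- '<html><body>
<div class="cmg_matchup_game_box" data-event-id="888836" data-competition-type="Regular Season" data-conference="Big West"></div>
<div class="some_other_box"></div>
</body></html>'

# htmlParse() returns an internal document that XPath functions can query directly.
doc <- htmlParse(html, asText = TRUE)

# Select every div that carries the attribute, then read the attribute value.
competype <- xpathSApply(doc, "//div[@data-competition-type]",
                         xmlGetAttr, "data-competition-type")

print(competype)
```

Selecting on the attribute itself, rather than on the class name, is just one option; matching on class = 'cmg_matchup_game_box' as in the answer above should give the same values as long as every game box carries the attribute.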