Scraping links from a JavaScript-driven web page in R
I am trying to scrape the URLs of individual providers from http://www.childrenshospital.org/directory?state=%7B%22showLandingContent%22%3Afalse%2C%22model%22%3A%7B%22search_specialist%22%3Afalse%2C%22search_type%22%3A%5B%22directoryphysician%22%2C%22directorynurse%22%5D%7D%2C%22customModel%22%3A%7B%22nurses%22%3Atrue%7D%7D.
I looked at the page source and identified the URLs of interest. For example, I want to scrape "http://www.childrenshospital.org/doctors/mirna-aeschlimann" from the following node:
<a data-layer-event="searchClick" data-bind="attr: {href: model.Url}" href="http://www.childrenshospital.org/doctors/mirna-aeschlimann"><!--ko text: model.FirstName-->Mirna<!--/ko--><!--ko text: ' ' + model.LastName--> Aeschlimann<!--/ko--><!--ko if: model.Suffix-->, <!--ko text: model.Suffix-->MD<!--/ko--><!--/ko--></a>
I tried the code below, but for some reason it does not return the node above.
library(XML)  # provides htmlTreeParse
base_html <- "http://www.childrenshospital.org/directory?state=%7B%22showLandingContent%22%3Afalse%2C%22model%22%3A%7B%22search_specialist%22%3Afalse%2C%22search_type%22%3A%5B%22directoryphysician%22%2C%22directorynurse%22%5D%7D%2C%22customModel%22%3A%7B%22nurses%22%3Atrue%7D%7D"
# Parse the page as served; useInternalNodes = TRUE returns a document that can be queried with XPath
doc <- htmlTreeParse(base_html, useInternalNodes = TRUE)
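To make the goal concrete, here is a minimal sketch of the extraction, run against the anchor node quoted above rather than the live page. The `data-bind` attribute and the `ko` comments suggest the links may be filled in client-side by Knockout, in which case the raw response would not contain them; that is an assumption and has not been verified here. The names `rendered_html`, `doc_rendered` and `provider_urls` are hypothetical, and the string stands in for whatever markup is available once the page has been rendered.

```r
library(XML)

# Assumption: `rendered_html` stands in for the page markup after its scripts
# have run. Here it is just the anchor node from the question, with the
# Knockout comments trimmed out for readability.
rendered_html <- paste0(
  '<html><body>',
  '<a data-layer-event="searchClick" data-bind="attr: {href: model.Url}" ',
  'href="http://www.childrenshospital.org/doctors/mirna-aeschlimann">',
  'Mirna Aeschlimann, MD</a>',
  '</body></html>'
)

# asText = TRUE parses the string itself instead of treating it as a file or URL
doc_rendered <- htmlParse(rendered_html, asText = TRUE)

# Select every search-result anchor and pull out its href attribute
provider_urls <- xpathSApply(
  doc_rendered,
  "//a[@data-layer-event='searchClick']",
  xmlGetAttr, "href"
)
print(provider_urls)
```

The XPath filters on `data-layer-event='searchClick'` because that attribute is what distinguishes the provider link in the quoted node; if the same query against `doc` comes back empty, the anchors are probably not present in the HTML that was downloaded.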
Any help would be greatly appreciated. Please let me know if I should provide more information.
Take a look at the 'rvest' package [here](https://github.com/hadley/rvest). – Jeff
So after calling 'htmlTreeParse', you found that you did not get the target node? – hrbrmstr