2017-05-18

htmlParse: failed to load external entity

I am trying to extract data from this URL:

url <- ("http://angel.co/companies?locations[]=1647-India") 

Code to extract the data:

library(XML) 
my <- htmlParse(url) 

Error: failed to load external entity from url

Attempt 2

library(XML) 
library(httr) 
qw <- GET(url) 
my <- readHTMLTable(rawToChar(qw$content)) 

Error in qw$content : $ operator is invalid for atomic vectors

Attempt 3

qw <- getURL(url) 
my <- readHTMLTable(url, stringsAsFactors = F) 

Error: could not find function "getURL"

Error: failed to load external entity from url

Answer


The URL returns a 301 status because the site only allows SSL connections. Try this (the essential difference is using https instead of http).

library(XML) 
library(RCurl) 
url <- ("https://angel.co/companies?locations[]=1647-India") 
htmlContent <- getURL(url) 
htmlTree <- htmlTreeParse(htmlContent) 
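
As a variation on the same idea, here is a minimal sketch that rewrites the scheme to https before fetching, then parses the response body. It assumes the httr and XML packages are installed; `to_https` and `fetch_html` are hypothetical helper names introduced only for illustration and do not come from either package.

```r
# Hypothetical helper: rewrite a leading http:// to https://.
# URLs that already use https are returned unchanged.
to_https <- function(u) sub("^http://", "https://", u)

# Hypothetical helper: fetch a page over https and parse it.
# Assumes the httr and XML packages are installed.
fetch_html <- function(u) {
  resp <- httr::GET(to_https(u))
  # Raise an R error if the server answers with an HTTP error status.
  httr::stop_for_status(resp)
  # Parse the body as text instead of handing a URL to the XML parser.
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  XML::htmlParse(body, asText = TRUE)
}

# Usage (needs network access, so it is left commented out):
# doc <- fetch_html("http://angel.co/companies?locations[]=1647-India")
# tables <- XML::readHTMLTable(doc, stringsAsFactors = FALSE)
```

Passing the downloaded text with `asText = TRUE` keeps the download and the parsing as separate steps, so a connection problem shows up as an HTTP error rather than as "failed to load external entity".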