I am using Splash 2.0.2 + Scrapy 1.0.5 + Scrapyjs 0.1.1, and I am still not able to render JavaScript with a click. Here is an example URL: https://olx.pt/anuncio/loja-nova-com-250m2-garagem-em-box-fechada-para-arrumos-IDyTzAT.html#c49d3d94cf
I am still getting the page without the phone number rendered:
import scrapy


class OlxSpider(scrapy.Spider):
    name = "olx"
    rotate_user_agent = True
    allowed_domains = ["olx.pt"]
    start_urls = [
        "https://olx.pt/imoveis/"
    ]

    def parse(self, response):
        # Lua script run by Splash: load the page, click the phone span,
        # wait briefly, then return the rendered HTML.
        script = """
        function main(splash)
            splash:go(splash.args.url)
            splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
            splash:wait(0.5)
            return splash:html()
        end
        """
        # Follow every listing link through the Splash 'execute' endpoint.
        for href in response.css('.link.linkWithHash.detailsLink::attr(href)'):
            url = response.urljoin(href.extract())
            yield scrapy.Request(url, callback=self.parse_house_contents, meta={
                'splash': {
                    'args': {'lua_source': script},
                    'endpoint': 'execute',
                }
            })

        # Follow pagination links with a plain (non-Splash) request.
        for next_page in response.css('.pager .br3.brc8::attr(href)'):
            url = response.urljoin(next_page.extract())
            yield scrapy.Request(url, self.parse)

    def parse_house_contents(self, response):
        # Debug breakpoint used to inspect the rendered response.
        import ipdb; ipdb.set_trace()
How can I get this to work?
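I really need this to work, because I will be moving on to more complex JS sites with date pickers, calendars and so on. – psychok7
@psychok7 Are you sure scrapyjs will be enough for your complex dynamic websites? Maybe switching to `selenium` would make things faster and simpler. – alecxe
I gave it a try. I don't know whether it is possible or not, but I will consider Selenium as well, thanks. – psychok7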
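One thing that might be worth trying with the spider in the question: the Lua script clicks right after `splash:go` and then waits only 0.5 seconds, so the click may fire before the page's own JavaScript has built the contact box, or the HTML may be returned before the phone number arrives. The following is a minimal sketch, not a confirmed fix. The helper name `build_splash_meta`, the argument names `load_wait` and `click_wait`, and the default durations are all assumptions made for illustration; it also assumes that extra entries in `args` are visible to the script through `splash.args`.

```python
# Sketch only: builds the 'splash' meta dict for a Scrapy request.
# The helper name, the argument names and the wait durations are
# illustrative assumptions, not values taken from the question.

LUA_CLICK_SCRIPT = """
function main(splash)
    -- Stop with an error if the page could not be loaded.
    assert(splash:go(splash.args.url))
    -- Give the page's own JavaScript time to build the contact box
    -- before clicking inside it (the duration is a guess).
    splash:wait(splash.args.load_wait)
    splash:runjs('document.getElementById("contact_methods").getElementsByTagName("span")[1].click();')
    -- Wait again so whatever the click triggers has time to finish.
    splash:wait(splash.args.click_wait)
    return splash:html()
end
"""


def build_splash_meta(load_wait=2.0, click_wait=1.0):
    """Return a meta dict suitable for scrapy.Request(..., meta=...)."""
    return {
        'splash': {
            'endpoint': 'execute',
            'args': {
                'lua_source': LUA_CLICK_SCRIPT,
                # Extra args, read in the script via splash.args.
                'load_wait': load_wait,
                'click_wait': click_wait,
            },
        }
    }
```

In the spider, this could be used as `yield scrapy.Request(url, callback=self.parse_house_contents, meta=build_splash_meta())`. Keeping the waits as arguments makes it easy to try longer values per request without editing the Lua source.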