CrawlSpider does not go to the next page

-1

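I am scraping the details of all products at http://www.ulta.com/makeup-eyes-eyebrows?N=26yi. My rules are copied below. I only get data from the first page; the spider never goes on to the next page.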

rules = (
    Rule(
        LinkExtractor(restrict_xpaths='//*[@id="canada"]/div[4]/div[2]/div[3]/div[3]/div[2]/ul/li[3]/a'),
        callback='parse',
        follow=True,
    ),
)

Can anyone help me?

Source

2017-07-03 Zhuoyang Li

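Use CrawlSpider, as mentioned in the following question: https://stackoverflow.com/questions/32624033/scrapy-crawl-with-next-page –

I think my code follows the CrawlSpider example in the link above exactly, but it does not work. –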

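Use CrawlSpider, which automatically follows the links matched by its rules to the other pages. With a plain Spider, you have to request the next-page links yourself. Use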

class Scrapy1Spider(CrawlSpider):

instead of

class Scrapy1Spider(scrapy.Spider):

See: Scrapy crawl with next page

Source

2017-07-03 12:01:20

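I am using CrawlSpider rather than Spider, and restrict_xpaths is the XPath of the Next button, but it still only scrapes the first page. –

Check whether the other links fall within the domains listed in allowed_domains. Also, why not add an allow pattern to the LinkExtractor? –

Problem solved. A product on the first page was raising an error while being scraped. –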
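A quick way to act on the allowed_domains suggestion is to test a next-page URL by hand. The helper below is only a rough standard-library approximation of that kind of domain check, not Scrapy's own code, and the name `is_allowed` is hypothetical.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Return True if url's host equals, or is a subdomain of, an allowed domain."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain.lower() or host.endswith("." + domain.lower())
        for domain in allowed_domains
    )


# The listing URL from the question, checked against a bare domain.
print(is_allowed("http://www.ulta.com/makeup-eyes-eyebrows?N=26yi", ["ulta.com"]))
```

If a next-page link points to a host outside the list, a check like this returns False, which would suggest the link is being filtered as offsite.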
