Cloudflare scraping: finding an element

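I've been playing with the cfscrape module, which lets you bypass the Cloudflare captcha protection on sites. I can access the page's contents, but I can't get my code to work: the entire HTML is printed instead. I only want what is inside <span class="availability">.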

import urllib2 
import cfscrape 
from bs4 import BeautifulSoup 
import requests 
from lxml import etree 
import smtplib 
import urllib2, sys 
scraper = cfscrape.CloudflareScraper() 
url = "http://www.sneakersnstuff.com/en/product/25698/adidas-stan-smith-gtx" 
req = scraper.get(url).content 


try: 
    page = urllib2.urlopen(req) 
except urllib2.HTTPError, e: 
    print("hi") 
    content = e.fp.read() 


soup = BeautifulSoup(content, "lxml") 
result = soup.find_all("span", {"class":"availability"})

I've omitted the code that searches for keywords.

Asked 2016-12-31 by ColeWorld

try:
    page = urllib2.urlopen(req)
    content = page.read()  # read the HTML from the response object
except urllib2.HTTPError, e:
    print("hi")

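You're slightly off: you should read the object returned by urlopen, which contains the HTML code.

You should assign the content variable inside the try block, before the except clause.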
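
For what it's worth, scraper.get(url).content already holds the page's HTML, so it can probably go straight to BeautifulSoup without a urlopen call at all. Here is a minimal sketch under that assumption, written for Python 3. The helper names availability_texts and fetch_availability and the sample markup are made up for illustration, and html.parser is swapped in for lxml only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup


def availability_texts(html):
    """Return the stripped text of every <span class="availability"> in html."""
    soup = BeautifulSoup(html, "html.parser")
    spans = soup.find_all("span", {"class": "availability"})
    return [span.get_text(strip=True) for span in spans]


def fetch_availability(url):
    """Fetch url through cfscrape and extract the availability text."""
    import cfscrape  # imported here so the parsing helper works without it

    scraper = cfscrape.CloudflareScraper()
    html = scraper.get(url).content  # already the page's HTML, as bytes
    return availability_texts(html)


# Hypothetical sample markup standing in for the real product page.
sample = (
    '<div><span class="availability"> In stock </span>'
    '<span class="price">99</span></div>'
)
print(availability_texts(sample))
```

Calling fetch_availability(url) with the product URL from the question should then return only the span text rather than the whole page, assuming the request gets through Cloudflare.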

Answered 2016-12-31 09:08:41

Are you familiar with the ConnectionError: ('Connection aborted.', BadStatusLine' error? I'm not sure why I'm getting it. – ColeWorld

@ColeWorld You should post that as a separate question rather than raising it in the comments. Accept this answer to close this one.
