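Iterating over a BeautifulSoup result set in Python
I am scraping data from a website (http://sports.yahoo.com/nfl/players/8800/), and for this I am using urllib2 and BeautifulSoup. My code currently looks like this: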
import re
import urllib2

from bs4 import BeautifulSoup

site = 'http://sports.yahoo.com/nfl/players/8800/'
response = urllib2.urlopen(site)
html = response.read()
soup = BeautifulSoup(html)
rushing = []
passing = []
receiving = []
# here is where my problem arises
for elem in soup.find_all('th', text=re.compile('2008')):
    passing = elem.parent.find_all('td', class_=re.compile('10'))
    rushing = elem.parent.find_all('td', class_=re.compile('20'))
    receiving = elem.parent.find_all('td', class_=re.compile('30'))
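There are three places on this page where soup.find_all('th', text=re.compile('2008')) finds a match, one in each stats section, and each of those sections is printed separately. However, when I run this for loop, it only seems to run once. How can I make sure the loop runs three times?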
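One possible explanation, offered as an assumption rather than a confirmed diagnosis: the loop may already run three times, but each pass rebinds passing, rushing and receiving with plain assignment, so only the cells from the last matching row are left once the loop ends. The sketch below uses a small made-up HTML sample (the table layout and the class names stat-10, stat-20 and stat-30 are hypothetical and not taken from the real Yahoo page) to show how to count the matches and accumulate results with extend instead of overwriting them.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: three separate
# sections, each with its own "2008" header row.
html = """
<table><tr><th>2008</th><td class="stat-10">3000</td><td class="stat-10">20</td></tr></table>
<table><tr><th>2008</th><td class="stat-20">150</td></tr></table>
<table><tr><th>2008</th><td class="stat-30">40</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

passing, rushing, receiving = [], [], []

# `string=` is the newer name for the `text=` argument used in the question.
headers = soup.find_all("th", string=re.compile("2008"))
print(len(headers))  # number of matching header cells, i.e. loop iterations

for elem in headers:
    row = elem.parent
    # extend() appends to the existing lists; plain assignment would
    # replace them on every pass and keep only the last row's cells.
    passing.extend(td.get_text() for td in row.find_all("td", class_=re.compile("10")))
    rushing.extend(td.get_text() for td in row.find_all("td", class_=re.compile("20")))
    receiving.extend(td.get_text() for td in row.find_all("td", class_=re.compile("30")))

print(passing, rushing, receiving)
```

If len(headers) reports 3 on the real page, the loop itself is fine and the overwriting is the likely culprit. If it reports 1, the other two headers probably do not match the pattern, for example because the th contains nested tags and therefore has no single string.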