2016-11-28 61 views
1

返回的结果不是按顺序排列,我需要按顺序返回结果。BeautifulSoup循环不返回序列

试图记录排名。

def parse(self, response): 
    sourceHtml = BeautifulSoup(response.body) 
    soup = sourceHtml.find("dl", {"id": "resultList"}) 
    for link in soup.find_all('dd'): 
     print(link.get('code')) 

回答

1

如果你想在列表中有印有“码”,只需使用一个"list comprehension"

def parse(self, response): 
    sourceHtml = BeautifulSoup(response.body) 
    soup = sourceHtml.find("dl", {"id": "resultList"}) 
    return [link.get('code') for link in soup.find_all('dd')] 

您还可以改善你的定位元素的方式,并使用CSS selector

def parse(self, response): 
    soup = BeautifulSoup(response.body) 
    return [link.get('code') for link in soup.select('dl#resultList dd')] 

这也是一个好主意,provide an underlying parser explicitly

soup = BeautifulSoup(response.body, "html.parser") 
# or soup = BeautifulSoup(response.body, "html5lib") 
# or soup = BeautifulSoup(response.body, "lxml")