Web scraping - Getting text from a class with BeautifulSoup and Python?

I want to scrape a piece of text ("Showing 650 results") from a website.

The result I'm expecting is:

Result : Showing 650 results

Here is the HTML code:

<div class="jobs-search-results__count-sort pt3"> 
      <div class="jobs-search-results__count-string results-count-string Sans-15px-black-55% pb0 pl5 pr4"> 
       Showing 650 results 
      </div>

Python code:

response = requests.get(index_url) 
    soup = BeautifulSoup(response.text, 'html.parser') 
    text = {} 
    link = "jobs-search-results__count-string results-count-string Sans-15px-black-55% pb0 pl5 pr4" 
    for div in soup.find_all('div',attrs={"class" : link}): 
     text[div.text] 
    text

So far, it looks like my code isn't working.


2017-08-01 David

Your code is syntactically incorrect. Why would it work? – DyZ

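You don't need soup.find_all if you're only looking for a single element; soup.find works just as well.
You can access the inner text via tag.string, tag.contents, or tag.text (note that tag.contents returns a list of child nodes rather than a single string).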

div = soup.find('div', {"class" : link}) 
text = div.string
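
As a minimal sketch of the approach above, the snippet below parses the HTML fragment from the question directly instead of fetching `index_url`. Looking the element up by a single class name (`results-count-string`) rather than by the whole class string, and using `get_text(strip=True)` to trim the surrounding whitespace, are assumptions made here for illustration, not something the answer requires.

```python
from bs4 import BeautifulSoup

# HTML fragment copied from the question (outer div closed for completeness).
html = """
<div class="jobs-search-results__count-sort pt3">
  <div class="jobs-search-results__count-string results-count-string Sans-15px-black-55% pb0 pl5 pr4">
    Showing 650 results
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches if the element carries this class among its classes.
div = soup.find("div", class_="results-count-string")

# get_text(strip=True) drops the leading/trailing whitespace around the text.
text = div.get_text(strip=True)
print("Result :", text)
```

Against a live page the same lookup only succeeds if the element is actually present in the HTML that `requests` receives.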


2017-08-02 00:14:29

Or even 'tag.text'! PS: although that is an older way of calling '.string', I guess it always returns the same thing. ([*Actually, it depends*](https://stackoverflow.com/questions/25327693/difference-between-string-and-text-beautifulsoup)) =)
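
To illustrate the "it depends" part, here is a small sketch of how `.string` and `.text` can differ. The two HTML fragments are made up for the example: `.string` only returns a value when the tag has a single child, while `.text` concatenates all of the text nested inside the tag.

```python
from bs4 import BeautifulSoup

# Made-up fragments: one tag with a single string child, one with mixed children.
soup = BeautifulSoup(
    "<p>plain</p><div>Showing <b>650</b> results</div>", "html.parser"
)
p = soup.find("p")
div = soup.find("div")

single_string = p.string   # the only child is a string, so it is returned
single_text = p.text       # same content here

mixed_string = div.string  # several children, so .string is None
mixed_text = div.text      # all nested text joined together

print(repr(single_string), repr(single_text))
print(repr(mixed_string), repr(mixed_text))
```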

@ViníciusAguiar Thanks :]

I get the following error: 'NoneType' object has no attribute 'text' – David
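
That error means `soup.find` returned `None`, i.e. no matching element was found in the HTML that was parsed. One possible cause (an assumption, since the thread does not confirm it) is that the page builds this element with JavaScript, so it never appears in the response that `requests` receives. A rough sketch of guarding against the missing element, using a hypothetical helper named `extract_count`:

```python
from bs4 import BeautifulSoup


def extract_count(html):
    """Return the results-count text, or None if the element is absent.

    Hypothetical helper for illustration; the class name is taken from
    the HTML in the question.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="results-count-string")
    if div is None:
        # find() found nothing, so there is no .text to read.
        return None
    return div.get_text(strip=True)


found = extract_count(
    '<div class="results-count-string pb0"> Showing 650 results </div>'
)
missing = extract_count('<div class="other">nothing here</div>')

print(found)
print(missing)
```

If the helper returns `None` for the real page, printing `response.text` and searching it for the class name is a quick way to check whether the element is in the downloaded HTML at all.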
