2016-02-26 65 views

I want to scrape definitions from the Merriam-Webster dictionary, e.g. http://www.merriam-webster.com/dictionary/abandon, by selecting elements by their class.

This is the HTML snippet I want to scrape.

<div class="definition-block def-text">
    <ul class="definition-list no-count">
        <li>
            <p class="definition-inner-item">
                <span><span class="intro-colon">:</span> to leave and never return to (someone who needs protection or help)</span>
            </p>
        </li>
        <li>
            <p class="definition-inner-item">
                <span><span class="intro-colon">:</span> to leave and never return to (something)</span>
            </p>
        </li>
        <li>
            <p class="definition-inner-item">
                <span><span class="intro-colon">:</span> to leave (a place) because of danger</span>
            </p>
        </li>
    </ul>
</div>

Here is my code:

for element in soup.find(class_="definition-list no-count"):
    if soup.find("li"):
        print element

The output is:

<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave and never return to (someone who needs protection or help)</span> 
</p> 
</li> 


<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave and never return to (something)</span> 
</p> 
</li> 


<li> 
<p class="definition-inner-item"> 
<span><span class="intro-colon">:</span> to leave (a place) because of danger</span> 
</p> 
</li> 

But I only want the definition text inside the <span>. If I use the get_text() method, I get a TypeError.

for element in soup.find(class_="definition-list no-count"):
    if soup.find("li"):
        print soup.get_text(element)

Output:

Traceback (most recent call last): 
    File "scrape.py", line 18, in <module> 
    print soup.get_text(element) 
    File "/usr/lib/python2.7/dist-packages/bs4/element.py", line 852, in get_text 
    strip, types=types)]) 
TypeError: 'NoneType' object is not callable 

And your code? – dnit13

Answers

Have you considered using BeautifulSoup for this task? I'm sure you could do it other ways, but with BeautifulSoup it is trivial:

from bs4 import BeautifulSoup 
import urllib 
r = urllib.urlopen('http://www.merriam-webster.com/dictionary/abandon').read() 
soup = BeautifulSoup(r) 
definitions = soup.find_all("p", class_="definition-inner-statement") 

Then you can do whatever you need with the definitions.
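As a rough sketch of the idea, run against the HTML from the question rather than the live page (whose markup may have changed since): the snippet in the question uses the class definition-inner-item on each <p>, so searching for that class should find the definitions. The html variable, the use of Python 3 and the "html.parser" backend are assumptions made for this illustration.

```python
from bs4 import BeautifulSoup

# HTML copied from the question (two entries); in practice this would be
# the downloaded page source.
html = """
<div class="definition-block def-text">
  <ul class="definition-list no-count">
    <li>
      <p class="definition-inner-item">
        <span><span class="intro-colon">:</span> to leave and never return to (something)</span>
      </p>
    </li>
    <li>
      <p class="definition-inner-item">
        <span><span class="intro-colon">:</span> to leave (a place) because of danger</span>
      </p>
    </li>
  </ul>
</div>
"""

# Naming the parser explicitly avoids depending on whichever one is installed.
soup = BeautifulSoup(html, "html.parser")

definitions = []
for p in soup.find_all("p", class_="definition-inner-item"):
    # strip=True trims each text fragment, giving ":to leave ..."
    text = p.get_text(strip=True)
    # Drop the leading colon that comes from the intro-colon span.
    definitions.append(text.lstrip(":").strip())

# A class name that is not present in the markup matches nothing.
no_match = soup.find_all("p", class_="definition-inner-statement")

print(definitions)
print(len(no_match))
```

A search for a class that does not occur in the markup simply returns an empty list, so an empty result is usually a sign that the class name should be checked against the page source.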

That did not work in my case. It returns an empty list. – dhiraj

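I didn't intend to give you exact code, so it may have a bug somewhere, but the general point I was making is to use BeautifulSoup. It is too presumptuous to expect people on the internet to just write out your whole program for you. – ubadub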
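On the TypeError from the question, a likely explanation (my reading of how bs4 behaves, not something confirmed in this thread): the first argument of get_text() is a separator string, so soup.get_text(element) hands the tag over as the separator. Looking up .join on a tag is treated as a search for a <join> child, which yields None, and calling None gives "'NoneType' object is not callable". Calling get_text() on the element itself avoids this. A minimal sketch, assuming Python 3 and a shortened copy of the question's HTML:

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup, used only for illustration.
html = """<ul class="definition-list no-count">
<li><p class="definition-inner-item"><span><span class="intro-colon">:</span> to leave (a place) because of danger</span></p></li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")
ul = soup.find(class_="definition-list no-count")

texts = []
# find_all("li") returns only <li> tags; iterating over the <ul> directly
# would also yield the whitespace strings between them.
for li in ul.find_all("li"):
    outer_span = li.find("span")  # first <span>, the one wrapping the definition
    # get_text() is called on the element, with no element passed as argument.
    texts.append(outer_span.get_text().strip())

print(texts)
```

This keeps the leading colon from the intro-colon span; it can be removed afterwards if only the wording of the definition is wanted.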