2016-11-10 126 views

BeautifulSoup: parsing special characters

I am extracting link text with BeautifulSoup, like this:

from BeautifulSoup import BeautifulSoup 
import urllib2 
response = urllib2.urlopen(link) 
html = response.read() 
soup = BeautifulSoup(html) 

#print(soup) 
for a in soup.findAll('a',attrs={"class":"link"}): 
    print(a.text) 

But for some characters I get "&#8211" instead of a simple "-". How can I get these characters in readable form?

Answers


Try the following:

for a in soup.findAll('a',attrs={"class":"link"}): 
    print(a.get_text())
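
If the extracted text still contains raw references such as "&#8211;", one option is to decode them after extraction. The following is only a minimal sketch under stated assumptions: it targets Python 3 and uses the standard library's html.unescape, whereas the question's code is Python 2 with BeautifulSoup 3. The helper name readable_text and the sample string are hypothetical and used purely for illustration.

```python
# Sketch for Python 3 (assumption: the question itself uses Python 2).
from html import unescape


def readable_text(raw_text):
    """Hypothetical helper: decode HTML character references in raw_text."""
    return unescape(raw_text)


# Made-up sample link text containing the reference from the question.
sample = "Part 1 &#8211; Introduction"
decoded = readable_text(sample)

# The numeric reference &#8211; is decoded to the en dash character (U+2013).
print(decoded)
```

In the loop from the answer this would be applied as readable_text(a.get_text()). Note that "&#8211" is an en dash rather than a plain hyphen, so the decoded character is not ASCII.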