2017-04-18 417 views
2
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
print(soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text())

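The code above works fine, but the code below does not. It raises the attribute error shown here. Can anyone tell me why? AttributeError: 'NoneType' object has no attribute 'parent'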

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
price = soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
print(price)

Thanks! :)

+0

Both versions give me $15.00 – Serge

+0

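I wish I could say the same. I have tried restarting and everything, but I still get the same error. I will go through the code once more. Thanks – Xexus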

Answer

0

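If you compare the first and second versions, you will notice:

First: soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()

  • Note: "src"

Second: soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()

  • Note: "src="

The second version raises AttributeError: 'NoneType' object has no attribute 'parent' because the soup contains no img tag with an attribute named src= whose value is "../img/gifts/img1.jpg". find therefore returns None, and accessing .parent on None fails.

So if you remove the = from the attribute name in the second version, it should work.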
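To make the difference concrete, here is a minimal sketch that runs both lookups against a small inline document instead of the live page. The SAMPLE_HTML markup and the price_for helper are hypothetical stand-ins written for this illustration, and "html.parser" is used only because it ships with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; this is NOT the real page3.html.
SAMPLE_HTML = """
<table>
  <tr>
    <td>$15.00</td><td><img src="../img/gifts/img1.jpg"></td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Correct attribute name: the tag is found.
good = soup.find("img", {"src": "../img/gifts/img1.jpg"})

# Attribute name with a stray "=": no tag has an attribute called "src=",
# so find() returns None.
bad = soup.find("img", {"src=": "../img/gifts/img1.jpg"})


def price_for(img):
    """Hypothetical helper: return the text of the cell before the image's
    cell, or None when the image was not found."""
    if img is None:
        return None
    return img.parent.previous_sibling.get_text()


print(price_for(good))
print(price_for(bad))
```

Checking for None before chaining .parent is one way to turn the AttributeError into a result you can handle explicitly.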


By the way, you should explicitly specify which parser you want to use; otherwise bs4 emits the following warning:

UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change code that looks like this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

So, as the warning message says, you only need to change soup = BeautifulSoup(html.read()) to soup = BeautifulSoup(html.read(), 'lxml'), for example.
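As a small sketch of the same idea, the snippet below names the parser explicitly and checks that no UserWarning is recorded. It assumes the built-in "html.parser" rather than "lxml" so that it runs without installing anything extra; the markup string is a made-up example.

```python
import warnings

from bs4 import BeautifulSoup

# Made-up markup, used only for illustration.
markup = "<p>Hello, <b>world</b></p>"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Passing the parser name as the second argument avoids the
    # "No parser was explicitly specified" warning.
    soup = BeautifulSoup(markup, "html.parser")

# Collect any user-level warnings raised while building the soup.
parser_warnings = [w for w in caught if issubclass(w.category, UserWarning)]

print(soup.b.get_text())
print(len(parser_warnings))
```

If lxml is installed, swapping "html.parser" for "lxml" should work the same way, although the two parsers can build slightly different trees from malformed HTML.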

+0

I am very new to all this. Thank you so much!! – Xexus
