Bytes to str conversion fails in Python 3

The code is self-explanatory:
$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:18)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request as req
>>> url = 'http://bangladeshbrands.com/342560550782-44083.html'
>>> res = req.urlopen(url)
>>> html = res.read()
>>> type(html)
<class 'bytes'>
>>> html = html.decode('utf-8') # bytes -> str
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 66081: invalid start byte
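Why not use a module that knows how to handle HTML over HTTP properly? – 2014-10-30 05:08:57
@IgnacioVazquez-Abrams, could you explain that? The read() method works for most URLs. – Dewsworld 2014-10-30 05:10:20
The read() method doesn't tell you anything about the charset the server declares for the HTML. – 2014-10-30 05:10:59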
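To illustrate the point in the last comment, here is a minimal sketch that asks the response for its declared charset before decoding, instead of assuming UTF-8. The helper names `decode_body` and `fetch_text` are hypothetical, and the `cp1252` fallback is only an assumption: byte 0x92 is a right single quotation mark in Windows-1252, which is a common cause of this error, but the page in the question may use a different encoding.

```python
import urllib.request as req


def decode_body(raw, declared=None, fallback='cp1252'):
    """Decode bytes with the declared charset, then UTF-8, then a fallback.

    `declared` is whatever the server put in the Content-Type header
    (may be None). `fallback` is an assumption, not something the
    server told us.
    """
    for charset in (declared, 'utf-8'):
        if not charset:
            continue
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # Last resort: never raise, mark undecodable bytes with U+FFFD.
    return raw.decode(fallback, errors='replace')


def fetch_text(url):
    """Fetch a URL and return its body as str."""
    with req.urlopen(url) as res:
        # Charset from the Content-Type header, or None if absent.
        declared = res.headers.get_content_charset()
        return decode_body(res.read(), declared)
```

Calling `fetch_text(url)` with the URL from the question would then return a `str` instead of raising `UnicodeDecodeError`. Whether the text comes out right still depends on the server declaring its charset correctly, or on the fallback guess matching the page.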