Python encoding

Using mechanize, I retrieved the source of a web page that contains some non-ASCII characters, such as Chinese characters.

The code is below:

# Using Python 2.6
from mechanize import Browser

br = Browser()
br.open("http://www.example.html")

src = br.response().read()  # retrieve the source of the web page

print src  # print the source

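Questions:

1. From the page's source I can see that its charset=gb2312, yet when I print src, all of the content displays correctly; I mean there is no gibberish. Why? Does print know the encoding of src?

2. Should I explicitly decode or encode src?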

2011-09-26 Alcott

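print encodes the text for you according to the console's encoding. If you want to write the result to a file, you need to encode it yourself. – xiaohan2012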
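As a minimal sketch of the file case mentioned in this comment: the text is encoded explicitly when it is written, here by passing an encoding to io.open. The sample text, the temporary file name page.txt, and the choice of UTF-8 are assumptions made for illustration; none of them come from the question.

```python
import io
import os
import tempfile

# Sample unicode text (the two characters for "Chinese"), written with
# escapes so the snippet does not depend on the source file's encoding.
text = u"\u4e2d\u6587"

# Hypothetical output file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "page.txt")

# Opening in text mode with an explicit encoding makes the file object
# encode the unicode text to UTF-8 bytes on write.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading the file back in binary mode shows the bytes that were stored.
with io.open(path, "rb") as f:
    raw = f.read()

# Reading it back in text mode with the same encoding recovers the text.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

print(repr(raw))
print(round_trip == text)
```

io.open is available from Python 2.6 onward and behaves like the built-in open in Python 3, so the same sketch should run on both.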

src is unicode; it has no encoding. print (or, more precisely, sys.stdout.write()) figures out which encoding to use on output.
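The idea that the encoding is chosen at output time can be sketched as follows. The helper names console_encoding and encode_for are hypothetical, and the "replace" error handler is an assumption chosen so the sketch never raises; a real output stream would typically raise UnicodeEncodeError for characters it cannot represent.

```python
import sys


def console_encoding(stream=None):
    """Return the encoding the stream reports, falling back to ASCII."""
    stream = sys.stdout if stream is None else stream
    # The attribute can be missing or None, for example when output is piped.
    return getattr(stream, "encoding", None) or "ascii"


def encode_for(text, encoding):
    """Encode unicode text to bytes, replacing unencodable characters."""
    return text.encode(encoding, "replace")


text = u"\u4e2d\u6587"  # sample unicode text, not taken from the page

print(console_encoding())                # whatever this console reports
print(repr(encode_for(text, "gb2312")))  # bytes a gb2312 console would receive
print(repr(encode_for(text, "ascii")))   # an ASCII console cannot show them
```

The same unicode text produces different bytes depending on where it is sent, which is why the text itself is said to have no encoding.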

2011-09-26 07:11:45

No encoding? But isn't unicode (utf-8?) an encoding? – Alcott

[Unicode is not UTF-8.](http://www.joelonsoftware.com/articles/Unicode.html)
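A short sketch of the distinction, using sample text that is an assumption and not taken from the page: one unicode string can be encoded into several different byte sequences, and UTF-8 is only one of those encodings.

```python
# Sample unicode text: abstract code points, with no byte representation yet.
text = u"\u4e2d\u6587"

# Two different encodings of the same text.
gb_bytes = text.encode("gb2312")
utf8_bytes = text.encode("utf-8")

print(repr(gb_bytes))
print(repr(utf8_bytes))

# The byte strings differ, but each decodes back to the same text
# when the matching codec is used.
same_text = gb_bytes.decode("gb2312") == utf8_bytes.decode("utf-8") == text
print(same_text)

# Converting gb2312 bytes to UTF-8 bytes goes through unicode:
# decode first, then encode.
converted = gb_bytes.decode("gb2312").encode("utf-8")
print(converted == utf8_bytes)
```

If the data fetched from a page were a gb2312 byte string, the last step shows the explicit decode and encode the second question asks about.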
