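Python problem in Google App Engine - UTF-8 and ASCII
For the past few days I have been trying to learn Python on App Engine. However, I have run into a number of problems with ASCII and UTF encoding. The latest one is as follows:
I have the following code for a simple chat room, taken from the book "Code in the Cloud":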
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime
# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
          <html>
          <head>
            <title>MarkCC's AppEngine Chat Room</title>
          </head>
          <body>
            <h1>Welcome to MarkCC's AppEngine Chat Room</h1>
            <p>(Current time is %s)</p>
        """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
          <form action="" method="post">
            <div><b>Name:</b>
            <textarea name="name" rows="1" cols="20"></textarea></div>
            <p><b>Message</b></p>
            <div><textarea name="message" rows="5" cols="60"></textarea></div>
            <div><input type="submit" value="Send ChatMessage"></input></div>
          </form>
          </body>
          </html>
        """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get("name")
        msg = self.request.get("message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler
# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])

def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame
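It works fine in English. However, the moment I add some non-standard characters, all sorts of problems start.
First of all, so that the page can actually display the characters in HTML, I added a meta tag with charset=UTF-8.
Curiously, if I type non-standard letters into the chat, the program processes them nicely and displays them without any issue. But if I put any non-ASCII letters into the page layout inside the script, it fails to load. I found out that adding a UTF-8 encoding line should help, so I added (# -*- coding: utf-8 -*-). That was not enough, because of course I had forgotten to save the file in UTF-8 format. Once I did that, the program started running.
That would have been a good ending to the story, alas...
It doesn't work.
Long story short, here is the code: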
# -*- coding: utf-8 -*-
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime
# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
          <html>
          <head>
            <title>Witaj w pokoju czatu MarkCC w App Engine</title>
            <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
          </head>
          <body>
            <h1>Witaj w pokoju czatu MarkCC w App Engine</h1>
            <p>(Dokladny czas Twojego logowania to: %s)</p>
        """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
          <form action="" method="post">
            <div><b>Twój Nick:</b>
            <textarea name="name" rows="1" cols="20"></textarea></div>
            <p><b>Twoja Wiadomość</b></p>
            <div><textarea name="message" rows="5" cols="60"></textarea></div>
            <div><input type="submit" value="Send ChatMessage"></input></div>
          </form>
          </body>
          </html>
        """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get(u"name")
        msg = self.request.get(u"message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler
# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])

def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame
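It fails to process anything I type while the chat application is running. The page loads, but the moment I submit a message (even one that uses only standard characters) I get the following error:
  File "D:\Python25\lib\StringIO.py", line 270, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 64: ordinal not in range(128)
In other words, if I want to be able to use any characters inside the application, I cannot put non-English characters in my interface. Or, the other way round, I can use non-English characters inside the app only if I do not save the file as UTF-8. How do I make the two work together?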
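The traceback can be reproduced outside App Engine with a rough sketch. It assumes the page template is held as UTF-8 bytes (which is what a plain string literal in a UTF-8 source file gives you in Python 2) and that the submitted form value arrives as unicode text. Python 2 decodes the bytes as ASCII implicitly when the two are joined. The explicit `.decode("ascii")` below stands in for that step so the sketch also runs on Python 3. The names `template_bytes`, `posted_text` and `join_like_python2` are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-ins, not part of the chat application.
template_bytes = u"<div><b>Tw\u00f3j Nick:</b>".encode("utf-8")  # bytes stored in a UTF-8 file
posted_text = u"hello"  # assumed: form values arrive as unicode text


def join_like_python2(chunks):
    """Mimic a Python 2 ''.join on mixed input: byte strings get decoded as ASCII."""
    parts = []
    for chunk in chunks:
        if isinstance(chunk, bytes):
            chunk = chunk.decode("ascii")  # the implicit step that fails
        parts.append(chunk)
    return u"".join(parts)


try:
    join_like_python2([template_bytes, posted_text])
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The message itself is plain ASCII here. The failure comes from the non-ASCII byte in the template once any unicode value has to be joined with it, which would fit the observation that even standard characters trigger the error.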
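If you haven't come across it yet, go through the unicode bootcamp: http://www.joelonsoftware.com/articles/Unicode.html. It is essential for understanding what is actually happening. Then look at the warning about unicode in the StringIO documentation: http://www.joelonsoftware.com/articles/Unicode.html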
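@Thomas K. I see your point, and I understand the need for and the use of different encodings. As you can see in the second code example, I accounted for different character sets by adding lines such as # -*- coding: utf-8 -*- and the HTML charset meta tag. What I don't understand is how Python handles all of this. Why does Python require me to keep encoding and decoding things myself? And how do I do it in this example? I have been playing around with various approaches, including unicode(s, "utf-8") and .encode("utf-8"), but with little success. And yes, I am quite inexperienced. – Mathias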
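For reference, here is a small sketch of what the two calls mentioned above do, under the assumption that the data is UTF-8. `.encode("utf-8")` turns unicode text into bytes, and `.decode("utf-8")` turns bytes back into text. In Python 2, `unicode(s, "utf-8")` should be equivalent to `s.decode("utf-8")`. The sketch uses `.decode` so that it also runs on Python 3. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
text = u"Twoja Wiadomo\u015b\u0107"  # unicode text: a sequence of characters
data = text.encode("utf-8")          # bytes: what ends up in a file or on the wire

# Python 2 spelling of the next line: unicode(data, "utf-8")
decoded = data.decode("utf-8")

print(len(text), len(data))  # number of characters vs. number of bytes
print(decoded == text)       # the round trip gives back the original text
```

The two lengths differ because each of the two accented letters takes two bytes in UTF-8. This is why bytes and text cannot be mixed freely: a byte string only becomes text once you say which encoding it is in.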
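I don't know exactly what is going on in your application, but on lines 21 and 35, try making your strings start with u""" so that they are unicode strings. The problem is that you are trying to write out a mix of encoded byte strings and unicode.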
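A minimal sketch of that suggestion, not a drop-in fix for the handler: keep every piece of the page as a unicode string (note the `u` prefixes) and encode to UTF-8 once, at the point of output. `render_messages` is a hypothetical helper and is not part of webapp. Whether `self.response.out.write` accepts unicode directly is not checked here, so the sketch encodes explicitly.

```python
# -*- coding: utf-8 -*-


def render_messages(messages):
    """Build the page body as unicode text, then encode it once for output.

    Hypothetical helper for illustration; not part of the webapp framework.
    """
    parts = [u"<h1>Witaj w pokoju czatu</h1>"]
    for user, message in messages:
        parts.append(u"<p>%s: %s</p>" % (user, message))
    parts.append(u"<div><b>Twój Nick:</b></div>")
    page = u"".join(parts)       # everything is unicode, so no implicit ASCII decode
    return page.encode("utf-8")  # matches the charset=UTF-8 meta tag


body = render_messages([(u"Mathias", u"Cześć")])
print(body.decode("utf-8"))
```

The point of this layout is that bytes exist in only one place, the final `encode` call, so unicode form input and non-ASCII interface text never meet as mismatched types.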