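This is an old question by now, but I'd like to add an answer on how to do this in Python 3, since I've written an implementation.
The allowed characters are documented here: https://docs.python.org/3/reference/lexical_analysis.html#identifiers. They include quite a lot of special characters: connector punctuation, the underscore, and a whole range of non-ASCII letters. Luckily, the unicodedata module can help. Below is my implementation, which directly follows what the Python documentation says: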
import unicodedata

def is_valid_name(name):
    if not name:  # An empty string is not an identifier, and name[0] would raise IndexError.
        return False
    if not _is_id_start(name[0]):
        return False
    for character in name[1:]:
        if not _is_id_continue(character):
            return False
    return True  # All characters are allowed.

# Unicode general categories and extra characters listed in the language reference.
_allowed_id_continue_categories = {"Ll", "Lm", "Lo", "Lt", "Lu", "Mc", "Mn", "Nd", "Nl", "Pc"}
_allowed_id_continue_characters = {"_", "\u00B7", "\u0387", "\u1369", "\u136A", "\u136B", "\u136C", "\u136D", "\u136E", "\u136F", "\u1370", "\u1371", "\u19DA", "\u2118", "\u212E", "\u309B", "\u309C"}
_allowed_id_start_categories = {"Ll", "Lm", "Lo", "Lt", "Lu", "Nl"}
_allowed_id_start_characters = {"_", "\u2118", "\u212E", "\u309B", "\u309C"}

# NFKC can expand one character into several, and unicodedata.category() only
# accepts a single character, so the category check is guarded by a length test.
def _is_id_start(character):
    normalized = unicodedata.normalize("NFKC", character)
    return (unicodedata.category(character) in _allowed_id_start_categories
            or character in _allowed_id_start_characters
            or (len(normalized) == 1 and unicodedata.category(normalized) in _allowed_id_start_categories)
            or normalized in _allowed_id_start_characters)

def _is_id_continue(character):
    normalized = unicodedata.normalize("NFKC", character)
    return (unicodedata.category(character) in _allowed_id_continue_categories
            or character in _allowed_id_continue_characters
            or (len(normalized) == 1 and unicodedata.category(normalized) in _allowed_id_continue_categories)
            or normalized in _allowed_id_continue_characters)
This code is adapted from here, under CC0: https://github.com/Ghostkeeper/Luna/blob/d69624cd0dd5648aec2139054fae4d45b634da7e/plugins/data/enumerated/enumerated_type.py#L91. It has been well tested.
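For comparison, Python 3 also ships a built-in check, `str.isidentifier()`, and the standard `keyword` module can tell whether a name is reserved. The sketch below is not part of the code linked above. It combines the two in a hypothetical helper, `is_usable_name`, which is my own name for illustration, and it assumes you also want to reject keywords such as `None` or `class`, which the character-level check does not do.

```python
import keyword


def is_usable_name(name):
    """Hypothetical helper: True if `name` is a syntactically valid identifier
    and is not a reserved keyword."""
    # str.isidentifier() applies the language's identifier rules to the string.
    # keyword.iskeyword() rejects reserved words such as "None" or "class".
    return name.isidentifier() and not keyword.iskeyword(name)


if __name__ == "__main__":
    for candidate in ("spam_1", "1spam", "None", ""):
        print(repr(candidate), is_usable_name(candidate))
```

Note that `__debug__` is not a keyword, so a check like this still accepts it even though assigning to it is a `SyntaxError`. Handle such special names separately if they matter to you.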
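And the reason for the downvote is...? This is a basic question, but still a valid one: +1. – EOL 2012-04-12 08:54:17
What does trying to create a class named 'None' or '__debug__' do? Based on the following documentation, I'd expect it to raise a 'SyntaxError': https://docs.python.org/2/library/constants.html – ArtOfWarfare 2015-02-23 17:27:36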