2012-04-30 42 views
3

Python TypeError: expected a character buffer object, personal misunderstanding

I was stuck on this error for a long time:

TypeError: expected a character buffer object 

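I have just understood what I got wrong: it has to do with the difference between a unicode string and a 'simple' string. I was calling the code below with a "normal" string, whereas I had to pass a unicode one. So forgetting the simple "u" before the string literal was breaking execution :/ !!!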

By the way, the TypeError was very unclear to me, and it still is.

Could someone please explain what I am missing, and why a 'simple' string is not a "character buffer object"?

It can be reproduced with the following code (extracted, and (c), from here):

def maketransU(s1, s2, todel=u""): 
    """Build translation table for use with unicode.translate(). 

    :param s1: string of characters to replace. 
    :type s1: unicode 
    :param s2: string of replacement characters (same order as in s1). 
    :type s2: unicode 
    :param todel: string of characters to remove. 
    :type todel: unicode 
    :return: translation table with character code -> character code. 
    :rtype: dict 
    """ 
    # We go unicode internally - ensure callers are ok with that. 
    assert (isinstance(s1,unicode)) 
    assert (isinstance(s2,unicode)) 
    trans_tab = dict(zip(map(ord, s1), map(ord, s2))) 
    trans_tab.update((ord(c),None) for c in todel) 
    return trans_tab 

#BlankToSpace_table = string.maketrans (u"\r\n\t\v\f",u"     ")
BlankToSpace_table = maketransU (u"\r\n\t\v\f",u"     ")
def BlankToSpace(text):
    """Replace blank characters with real spaces.

    May be good to prepare for regular expressions & Co based on whitespaces.

    :param text: the text to clean from blanks.
    :type text: string
    :return: the text with blank characters replaced by spaces.
    :rtype: string
    """
    print text, type(text), len(text)
    try:
        out = text.translate(BlankToSpace_table)
    except TypeError:
        raise
    return out

# for SO : the code below is just to reproduce what i did not understand 
dummy = "Hello,\n, this is a \t dummy test!" 
for s in (unicode(dummy), dummy): 
    print repr(s) 
    print repr(BlankToSpace(s)) 

This produces:

u'Hello,\n, this is a \t dummy test!' 
Hello, 
, this is a  dummy test! <type 'unicode'> 32 
u'Hello, , this is a   dummy test!'
'Hello,\n, this is a \t dummy test!' 
Hello, 
, this is a  dummy test! <type 'str'> 32 

Traceback (most recent call last):
  File "C:/treetaggerwrapper.error.py", line 44, in <module>
    print repr(BlankToSpace(s))
  File "C:/treetaggerwrapper.error.py", line 36, in BlankToSpace
    out = text.translate(BlankToSpace_table)
TypeError: expected a character buffer object 

Answers

11

The problem is that the translate method of a byte string is different from the translate method of a unicode string. Here is the docstring of the non-unicode version:

S.translate(table [,deletechars]) -> string

Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.

And here is the unicode version:

S.translate(table) -> unicode

Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.

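You can see that the non-unicode version expects "a string of length 256", whereas the unicode version expects a "mapping" (i.e. a dict). So the problem is not that your unicode string is a buffer object and the non-unicode string is not (both are buffers, of course), but that one translate method expects such a buffer object as its argument and the other does not.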
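To make the difference concrete, here is a minimal sketch that builds one table of each kind and picks the right one for the string type it receives. The names `blank_to_space`, `unicode_table`, `byte_table` and `dict_table_rejected_by_bytes` are made up for this illustration and are not part of the question's code. The sketch is written to run on both Python 2.7 and Python 3, on the assumption that a byte string (`str` in Python 2, `bytes` in Python 3) wants a 256-byte table while a unicode string wants a mapping of ordinals.

```python
# Illustrative sketch only; the helper names are hypothetical.
try:
    from string import maketrans as make_byte_table  # Python 2
except ImportError:
    make_byte_table = bytes.maketrans                 # Python 3

BLANKS = u"\r\n\t\v\f"

# Unicode strings: translate() takes a mapping of ordinals to ordinals.
unicode_table = dict((ord(c), ord(u" ")) for c in BLANKS)

# Byte strings: translate() takes a table that is a string of length 256.
byte_table = make_byte_table(b"\r\n\t\v\f", b"     ")


def blank_to_space(text):
    """Replace blank characters with spaces in either string type."""
    if isinstance(text, bytes):  # `bytes` is an alias of `str` on Python 2
        return text.translate(byte_table)
    return text.translate(unicode_table)


def dict_table_rejected_by_bytes():
    """Return True if a byte string refuses the dict table with a TypeError."""
    try:
        b"a\tb".translate(unicode_table)
    except TypeError:
        return True
    return False
```

Calling `blank_to_space` with `u"a\tb"` or with `b"a\tb"` should work in both cases, while `dict_table_rejected_by_bytes()` reproduces the situation from the question: a dict handed to the byte-string `translate`. Another option, if the input encoding is known, would be to decode byte strings to unicode first and keep only the dict table.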

+0

Thanks! I knew very little about this, because I had not noticed that it is not the same object :/ – user1340802
