2017-06-19

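Python pandas: special character as a separator

I have a text file that uses the special character [˛] as its delimiter. I copied and pasted this character as the separator in my read_csv command, and I get the following error: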

ParserWarning: Falling back to the 'python' engine because the 
separator encoded in utf-8 is > 1 char long, and the 'c' engine does 
not support such separators; you can avoid this warning by specifying 
engine='python'. 
    """Entry point for launching an IPython kernel. 

How can I use a special character as the separator when reading a text file?

Answer

You are only getting a warning, and getting rid of it is easy: add engine='python'. The pandas documentation explains why:

Specifying the parser engine

Under the hood pandas uses a fast and efficient parser implemented in C as well as a python implementation which is currently more feature-complete. Where possible pandas uses the C parser (specified as engine='c'), but may fall back to python if C-unsupported options are specified. Currently, C-unsupported options include:

  • sep other than a single character (e.g. regex separators)
  • skipfooter
  • sep=None with delim_whitespace=False

Specifying any of the above options will produce a ParserWarning unless the python engine is selected explicitly using engine='python'.

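import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions

# sample data using the special character as the delimiter
temp = """a˛b˛c
1˛3˛5
7˛8˛1
"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep="˛", engine='python')
print(df)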
    a b c 
0 1 3 5 
1 7 8 1
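
The warning says the separator "encoded in utf-8 is > 1 char long", which can look odd for something that is visibly one character. As a rough illustration (not taken from the original answer, and using only the standard library), the sketch below compares the separator's length as a Python string with its length once encoded as UTF-8. The variable names are made up for this example.

```python
# The separator character copied from the question.
sep = "˛"

# As a Python string it is a single character...
n_chars = len(sep)

# ...but encoding it as UTF-8 yields more than one byte, which is what
# the phrase "encoded in utf-8 is > 1 char long" in the warning refers to.
n_bytes = len(sep.encode("utf-8"))

# For comparison, an ASCII separator such as "," encodes to a single byte.
comma_bytes = len(",".encode("utf-8"))

print("characters:", n_chars)
print("UTF-8 bytes:", n_bytes)
print("UTF-8 bytes for ',':", comma_bytes)
```

This is why a plain comma or tab works with the default C engine, while a non-ASCII separator like this one makes pandas fall back to the Python engine unless engine='python' is given explicitly.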