Why doesn't pandas read_csv support multiple comment characters (#, @, ...)?

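I have found that the pandas read_csv method is faster than numpy loadtxt. Unfortunately, I now find myself having to go back to numpy, because loadtxt lets me set comments=['#','@']. As far as I can tell from the documentation, the pandas read_csv method accepts only a single comment string, such as comment='#'. Are there any suggestions or workarounds that would make my life easier and keep me from going back to numpy? Also, why doesn't pandas support multiple comment indicators?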

# save this in test.dat 
@ bla 
# bla 
1 2 3 4

A small example:

import numpy as np
import pandas as pd

# works, but only one type of comment is accounted for
df = pd.read_csv('test.dat', index_col=0, header=None, comment='#')

# does not work (not surprising after reading the help): comment takes a single character
# df = pd.read_csv('test.dat', index_col=0, header=None, comment=['#', '@'])

# works, but is slow (note that loadtxt returns a numpy array, not a DataFrame)
arr = np.loadtxt('test.dat', comments=['#', '@'])


2016-11-17 Asking Questions

[MCVE](http://stackoverflow.com/help/mcve) – Kartik

Please include some test data as well. Also, none of the code uses pandas... – darthbith

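The short answer is that nobody has implemented it in pandas. From a quick look through the issues on their GitHub, it seems someone else has already raised this and the maintainers are open to a patch that implements it: https://github.com/pandas-dev/pandas/issues/13948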

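This could be a good opportunity to contribute back to the pandas project if you are comfortable doing so; otherwise, keep an eye on that issue in case someone else does. The part of the codebase that handles comments appears to be _check_comments, here: https://github.com/pandas-dev/pandas/blob/master/pandas/io/parsers.py#L2348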
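In the meantime, one possible workaround is to strip the comments yourself and hand the cleaned text to read_csv. The sketch below is only an illustration, not something pandas provides: the helper name read_csv_multi_comment and its comment_chars parameter are made up for this example. It assumes the file is small enough to hold in memory and that the comment markers never appear inside quoted fields.

```python
import io
import re

import pandas as pd


def read_csv_multi_comment(path, comment_chars=("#", "@"), **kwargs):
    """Hypothetical helper: remove comments for several markers, then call pd.read_csv."""
    # Match the first occurrence of any marker and everything after it on the line.
    alternatives = "|".join(re.escape(marker) for marker in comment_chars)
    pattern = re.compile("(?:" + alternatives + ").*")
    with open(path) as handle:
        stripped = (pattern.sub("", line).rstrip() for line in handle)
        # Drop lines that are empty once their comment has been removed.
        cleaned = "\n".join(line for line in stripped if line)
    # All remaining keyword arguments are passed through to pandas unchanged.
    return pd.read_csv(io.StringIO(cleaned), **kwargs)
```

For the test.dat file from the question, a call such as read_csv_multi_comment('test.dat', sep=r'\s+', header=None, index_col=0) should skip both the # and @ lines. The filtering runs in plain Python, so it gives up part of the speed advantage of read_csv. Whether it still beats loadtxt on your data is something to measure.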


2016-11-17 16:59:41
