2016-07-25 93 views
8

Converting column names from int to string in pandas

I have a pandas DataFrame with mixed column names:

1, 2, 3, 4, 5, 'Class'

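When I save this DataFrame to an HDF5 file, pandas warns that performance will suffer because of the mixed types. How can I convert the integer column names to strings in pandas?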

Answers

15

You can simply use df.columns = df.columns.astype(str)

In [26]: df = pd.DataFrame(np.random.random((3,6)), columns=[1,2,3,4,5,'Class']) 

In [27]: df 
Out[27]: 
          1         2         3         4         5     Class
0  0.773423  0.865091  0.614956  0.219458  0.837748  0.862177
1  0.544805  0.535341  0.323215  0.929041  0.042705  0.759294
2  0.215638  0.251063  0.648350  0.353999  0.986773  0.483313

In [28]: df.columns.map(type) 
Out[28]: 
array([<class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, 
     <class 'int'>, <class 'str'>], dtype=object) 

In [29]: df.to_hdf("out.h5", "d1") 
C:\Anaconda3\lib\site-packages\pandas\io\pytables.py:260: PerformanceWarning: 
your performance may suffer as PyTables will pickle object types that it cannot 
map directly to c-types [inferred_type->mixed-integer,key->axis0] [items->None] 

    f(store) 
C:\Anaconda3\lib\site-packages\pandas\io\pytables.py:260: PerformanceWarning: 
your performance may suffer as PyTables will pickle object types that it cannot 
map directly to c-types [inferred_type->mixed-integer,key->block0_items] [items->None] 

    f(store) 

In [30]: df.columns = df.columns.astype(str) 

In [31]: df.columns.map(type) 
Out[31]: 
array([<class 'str'>, <class 'str'>, <class 'str'>, <class 'str'>, 
     <class 'str'>, <class 'str'>], dtype=object) 

In [32]: df.to_hdf("out.h5", "d1") 

0

You can simply use df.columns = df.columns.map(str)

DSM's first answer, df.columns = df.columns.astype(str), did not work for my DataFrame. (I got TypeError: Setting dtype to anything other than float64 or object is not supported)
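
For illustration, here is a minimal sketch of the map(str) approach. The two small frames and their values are made up for this example; the second one assumes float column labels, which is only a guess at the kind of columns that might trigger the TypeError above (whether astype(str) fails there likely depends on the pandas version).

```python
import pandas as pd

# Hypothetical frame with mixed int/str column labels, as in the question.
df = pd.DataFrame([[0.1, 0.2, 0.3]], columns=[1, 2, "Class"])

# Index.map calls str() on each label one by one, so it does not rely on
# which dtype conversions the index itself supports.
df.columns = df.columns.map(str)
mixed_labels = list(df.columns)

# Hypothetical frame with float column labels (an assumed trigger for the
# TypeError mentioned above; not confirmed by the original post).
df_float = pd.DataFrame([[1, 2]], columns=[1.0, 2.5])
df_float.columns = df_float.columns.map(str)
float_labels = list(df_float.columns)

print(mixed_labels)
print(float_labels)
```

After this, every column label is a plain Python str, so the labels are no longer a mix of integers and strings.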