2017-03-08

Pandas: casting a column to string does not work

I have a DataFrame resultstatsDF:

import pandas as pd

resultstatsDF = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
resultstatsDF['file'] = 'asdf'
resultstatsDF.dtypes
a        int64
file    object
dtype: object

The file column has dtype object, and I want to convert it to string.

I tried:

resultstatsDF = resultstatsDF.astype({'file': str}) 
resultstatsDF['file'] = resultstatsDF['file'].astype(str) 
resultstatsDF['file'] = resultstatsDF['file'].to_string 
resultstatsDF['file'] = resultstatsDF.file.apply(str) 
resultstatsDF['file'] = resultstatsDF['file'].apply(str) 

But no matter what I do, when I check with

resultstatsDF.dtypes 

file stays as dtype object.

Answers


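The dtype of a column that holds strings, dicts, or lists is always object. To test the actual type, you need to select a value from the column, for example with iat: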

type(resultstatsDF['file'].iat[0]) 

Sample:

resultstatsDF = pd.DataFrame({'file':['a','d','f']}) 
print (resultstatsDF) 
    file 
0 a 
1 d 
2 f 

print (type(resultstatsDF['file'].iloc[0])) 
<class 'str'> 

print (resultstatsDF['file'].apply(type)) 
0 <class 'str'> 
1 <class 'str'> 
2 <class 'str'> 
Name: file, dtype: object 
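
To tie this back to the question, here is a minimal sketch (the integer column and the a_str name are made up for illustration) suggesting that astype(str) does convert the values even when the reported dtype does not read "str". The values are checked directly, because the dtype label that gets printed depends on the pandas version: object in the versions this answer was written for, possibly a dedicated string dtype in newer releases.

```python
import pandas as pd

# Hypothetical data: an integer column that we want as strings.
df = pd.DataFrame({'a': [1, 2, 3]})
df['a_str'] = df['a'].astype(str)

# The label pandas reports for the column (version dependent).
print(df['a_str'].dtype)

# The individual values are real Python strings.
all_str = all(isinstance(v, str) for v in df['a_str'])
print(all_str)

# String methods work on the converted column.
padded = df['a_str'].str.zfill(3).tolist()
print(padded)
```

So the conversion in the question most likely succeeded; only the dtype label is misleading.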

Another sample, with several kinds of Python objects:

df = pd.DataFrame({'strings':['a','d','f'], 
        'dicts':[{'a':4}, {'c':8}, {'e':9}], 
        'lists':[[4,8],[7,8],[3]], 
        'tuples':[(4,8),(7,8),(3,)], 
        'sets':[set([1,8]), set([7,3]), set([0,1])] }) 

print (df) 
      dicts   lists    sets strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}       a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}       d  (7, 8)
2  {'e': 9}     [3]  {0, 1}       f    (3,)

All columns have the same dtype:

print (df.dtypes) 
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object 

The Python types are different, though. If you need to check them, loop over the columns:

for col in df: 
    print (df[col].apply(type)) 

0 <class 'dict'> 
1 <class 'dict'> 
2 <class 'dict'> 
Name: dicts, dtype: object 
0 <class 'list'> 
1 <class 'list'> 
2 <class 'list'> 
Name: lists, dtype: object 
0 <class 'set'> 
1 <class 'set'> 
2 <class 'set'> 
Name: sets, dtype: object 
0 <class 'str'> 
1 <class 'str'> 
2 <class 'str'> 
Name: strings, dtype: object 
0 <class 'tuple'> 
1 <class 'tuple'> 
2 <class 'tuple'> 
Name: tuples, dtype: object 
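
If the per-row printout above is too verbose, the same check can be condensed into one summary per column. This is only a sketch: python_types_per_column is a hypothetical helper name, not a pandas function, and it relies on plain Python iteration over each column.

```python
import pandas as pd

def python_types_per_column(frame):
    """Map each column name to the sorted names of the Python types it holds."""
    return {col: sorted({type(v).__name__ for v in frame[col]}) for col in frame}

df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]]})

summary = python_types_per_column(df)
print(summary)
```

A column whose list contains more than one type name is a mixed column.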

Or check only the first value of each column:

print (type(df['strings'].iat[0])) 
<class 'str'> 

print (type(df['dicts'].iat[0])) 
<class 'dict'> 

print (type(df['lists'].iat[0])) 
<class 'list'> 

print (type(df['tuples'].iat[0])) 
<class 'tuple'> 

print (type(df['sets'].iat[0])) 
<class 'set'> 
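
Checking only the first value can be misleading when a column holds more than one type. As a possible complement (not part of the original answer), pandas.api.types.infer_dtype inspects all values and returns a descriptive label. A small sketch with made-up data:

```python
import pandas as pd
from pandas.api.types import infer_dtype

strings = pd.Series(['a', 'd', 'f'])
mixed = pd.Series(['3', 5, 9, '2'])   # strings and integers together

kind_strings = infer_dtype(strings)
kind_mixed = infer_dtype(mixed)

print(kind_strings)   # label for a column that holds only strings
print(kind_mixed)     # label for a column that mixes strings and integers
```

The exact labels are defined by pandas; the point is only that the two columns get different labels although their first values are both str.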

If a column may contain mixed types (in which case some pandas functions can fail), you can filter by type with boolean indexing:

df = pd.DataFrame({'mixed':['3', 5, 9,'2']}) 
print (df) 
    mixed 
0  3 
1  5 
2  9 
3  2 

print (df.dtypes) 
mixed object 
dtype: object 

for col in df: 
    print (df[col].apply(type)) 
0 <class 'str'> 
1 <class 'int'> 
2 <class 'int'> 
3 <class 'str'> 
Name: mixed, dtype: object 

# Python 3: use str
# Python 2: use basestring
mask = df['mixed'].apply(lambda x: isinstance(x,str)) 
print (mask) 
0  True 
1 False 
2 False 
3  True 
Name: mixed, dtype: bool 

df = df[mask] 
print (df) 
    mixed 
0  3 
3  2 
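
The filtering step can be wrapped in a small reusable function. This is a sketch under the same assumptions as the sample above; keep_type is a hypothetical helper name, not a pandas API.

```python
import pandas as pd

def keep_type(frame, column, py_type):
    """Return only the rows whose value in `column` is an instance of `py_type`."""
    mask = frame[column].apply(lambda x: isinstance(x, py_type))
    return frame[mask]

df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})

only_str = keep_type(df, 'mixed', str)
only_int = keep_type(df, 'mixed', int)

print(only_str['mixed'].tolist())
print(only_int['mixed'].tolist())
```

Note that isinstance(True, int) is True in Python, so filtering on int would also keep bool values if the column contained any.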

So why do I get a TypeError? http://stackoverflow.com/questions/42671168/dfply-mutating-string-column-typeerror – Make42

I don't know R, so I can't tell what the problem is. – jezrael

This is Python, not R. – Make42