2017-07-19 41 views
0
for x in range(1, 17):
    df.loc[df[x] == 'n', x] = 0.0
    df.loc[df[x] == 'y', x] = 1.0
    df.loc[df[x] == '?', x] = np.nan

df.dtypes 

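This returns object for every column. Why, when I explicitly set each item to the float 0 or 1, or to NaN? Basically, I can't compute column means on this dataframe: I set every value to a float, yet pandas still reports object.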

+0

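I don't need to reset the index, though. This is about all the columns holding object values instead of float64. I found a workaround, but it still doesn't answer why the code above yields object for every column. – a1letterword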

Answers

1
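import numpy as np
import pandas as pd

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/voting-records/house-votes-84.data'
# Use column 0 (the party label) as the index so only vote columns are compared
df = pd.read_csv(url, header=None, index_col=0)
df[df.eq('?')] = np.nan  # '?' marks a missing vote
df[df.eq('y')] = 1.0
df[df.eq('n')] = 0.0
df = df.reset_index()  # bring the party label back as a regular column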

Result:

In [67]: df 
Out[67]: 
      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 
0 republican 0 1 0 1 1 1 0 0 0 1 NaN 1 1 1 0 1 
1 republican 0 1 0 1 1 1 0 0 0 0 0 1 1 1 0 NaN 
2  democrat NaN 1 1 NaN 1 1 0 0 0 0 1 0 1 1 0 0 
3  democrat 0 1 1 0 NaN 1 0 0 0 0 1 0 1 0 0 1 
4  democrat 1 1 1 0 1 1 0 0 0 0 1 NaN 1 1 1 1 
5  democrat 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1 
6  democrat 0 1 0 1 1 1 0 0 0 0 0 0 NaN 1 1 1 
7 republican 0 1 0 1 1 1 0 0 0 0 0 0 1 1 NaN 1 
8 republican 0 1 0 1 1 1 0 0 0 0 0 1 1 1 0 1 
9  democrat 1 1 1 0 0 0 1 1 1 0 0 0 0 0 NaN NaN 
..   ... ... ... ... ... ... .. ... ... ... ... ... ... ... ... ... ... 
425 democrat 0 0 1 0 0 0 1 1 0 1 1 0 0 0 1 NaN 
426 democrat 1 0 1 0 0 0 1 1 1 1 0 0 0 0 1 1 
427 republican 0 0 0 1 1 1 1 1 0 1 0 1 1 1 0 1 
428 democrat NaN NaN NaN 0 0 0 1 1 1 1 0 0 1 0 1 1 
429 democrat 1 0 1 0 NaN 0 1 1 1 1 0 1 0 NaN 1 1 
430 republican 0 0 1 1 1 1 0 0 1 1 0 1 1 1 0 1 
431 democrat 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 1 
432 republican 0 NaN 0 1 1 1 0 0 0 0 1 1 1 1 0 1 
433 republican 0 0 0 1 1 1 NaN NaN NaN NaN 0 1 1 1 0 1 
434 republican 0 1 0 1 1 1 0 0 0 1 0 1 1 1 NaN 0 

[435 rows x 17 columns] 
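On the "why" part of the question: as far as I can tell, the columns are read in as strings, so pandas stores them with dtype object, and assigning floats into selected cells changes the values but does not make pandas re-infer the column dtype. An explicit cast is one way around that. Below is a minimal sketch, assuming a tiny made-up frame in place of the real voting data; the names `vote_cols`, `mapping`, and `converted` are mine, not from the question or the answer.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the voting data: a party label plus 'y'/'n'/'?' votes.
df = pd.DataFrame({
    0: ["republican", "democrat", "democrat"],
    1: ["n", "?", "y"],
    2: ["y", "y", "n"],
})

vote_cols = [1, 2]  # hypothetical: the columns holding votes
mapping = {"n": 0.0, "y": 1.0, "?": np.nan}

# Map every string to its numeric value, then cast explicitly to float.
converted = df[vote_cols].apply(lambda col: col.map(mapping)).astype(float)
df[vote_cols] = converted

print(converted.dtypes)  # dtype of each vote column after the cast
print(converted.mean())  # column means, skipping NaN by default
```

With the real data, `vote_cols` would presumably be `list(range(1, 17))`, matching the loop in the question.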
+1

Missing an '='? – Wen

+0

@Wen, thanks! – MaxU

+1

Such a quick solution, +1 – Wen