for x in range(1, 17):
    df.loc[df[x] == 'n', x] = 0.0
    df.loc[df[x] == 'y', x] = 1.0
    df.loc[df[x] == '?', x] = np.nan
df.dtypes
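This returns object for every column. Why, when I specifically set every item to the float 0 or 1, or to NaN? Basically, I can't compute column means on this DataFrame. In short: I set every value to a float, but pandas still reports object.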
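import numpy as np
import pandas as pd

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/voting-records/house-votes-84.data'
# The file has no header row; use the party column as the index while replacing values.
df = pd.read_csv(url, header=None, index_col=0)
# Replace the vote markers through boolean masks.
df[df.eq('?')] = np.nan
df[df.eq('y')] = 1.0
df[df.eq('n')] = 0.0
df = df.reset_index()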
Result:
In [67]: df
Out[67]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
0 republican 0 1 0 1 1 1 0 0 0 1 NaN 1 1 1 0 1
1 republican 0 1 0 1 1 1 0 0 0 0 0 1 1 1 0 NaN
2 democrat NaN 1 1 NaN 1 1 0 0 0 0 1 0 1 1 0 0
3 democrat 0 1 1 0 NaN 1 0 0 0 0 1 0 1 0 0 1
4 democrat 1 1 1 0 1 1 0 0 0 0 1 NaN 1 1 1 1
5 democrat 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1
6 democrat 0 1 0 1 1 1 0 0 0 0 0 0 NaN 1 1 1
7 republican 0 1 0 1 1 1 0 0 0 0 0 0 1 1 NaN 1
8 republican 0 1 0 1 1 1 0 0 0 0 0 1 1 1 0 1
9 democrat 1 1 1 0 0 0 1 1 1 0 0 0 0 0 NaN NaN
.. ... ... ... ... ... ... .. ... ... ... ... ... ... ... ... ... ...
425 democrat 0 0 1 0 0 0 1 1 0 1 1 0 0 0 1 NaN
426 democrat 1 0 1 0 0 0 1 1 1 1 0 0 0 0 1 1
427 republican 0 0 0 1 1 1 1 1 0 1 0 1 1 1 0 1
428 democrat NaN NaN NaN 0 0 0 1 1 1 1 0 0 1 0 1 1
429 democrat 1 0 1 0 NaN 0 1 1 1 1 0 1 0 NaN 1 1
430 republican 0 0 1 1 1 1 0 0 1 1 0 1 1 1 0 1
431 democrat 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 1
432 republican 0 NaN 0 1 1 1 0 0 0 0 1 1 1 1 0 1
433 republican 0 0 0 1 1 1 NaN NaN NaN NaN 0 1 1 1 0 1
434 republican 0 1 0 1 1 1 0 0 0 1 0 1 1 1 NaN 0
[435 rows x 17 columns]
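If the goal is float64 columns rather than just the right-looking values, one option is to map the markers instead of assigning through masks. This is only a sketch on a hypothetical three-row miniature of the table (the small `df`, `mapping`, and `vote_cols` names are made up for illustration), not the method used above:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the voting table: column 0 is the party,
# columns 1 and 2 stand in for the sixteen vote columns.
df = pd.DataFrame({0: ["republican", "democrat", "democrat"],
                   1: ["n", "?", "y"],
                   2: ["y", "y", "n"]})

# Translate each marker to a float; '?' becomes a missing value.
mapping = {"y": 1.0, "n": 0.0, "?": np.nan}
vote_cols = df.columns[1:]

# Series.map builds a new Series per column, so the result can take a float dtype.
df[vote_cols] = df[vote_cols].apply(lambda s: s.map(mapping))

# Column-wise statistics now work on the vote columns.
column_means = df[vote_cols].mean()
```

Because each column is rebuilt instead of edited cell by cell, pandas is free to choose a numeric dtype for it, and `mean()` skips the NaN entries by default.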
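I don't need to reset the index, though. This is about all the columns taking object values instead of float64. I found a workaround, but it still doesn't answer why the above outputs all objects. – a1letterword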
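As for the "why": as far as I understand it, the columns are read in as strings, so their dtype is object, and assigning individual cells through a mask writes the floats into that existing object column without changing the column's dtype. An explicit cast is what changes it. A minimal sketch on a made-up single column (the toy data is an assumption, not the real file):

```python
import numpy as np
import pandas as pd

# Toy stand-in for one vote column (hypothetical data).
df = pd.DataFrame({1: ["y", "n", "?", "y"]})

# Same pattern as in the question: overwrite cells selected by a mask.
df.loc[df[1] == "n", 1] = 0.0
df.loc[df[1] == "y", 1] = 1.0
df.loc[df[1] == "?", 1] = np.nan

# The cells now hold floats, but the column itself is expected to stay object.
dtype_after_assignment = df[1].dtype

# Casting the whole column produces a real float64 column.
df[1] = df[1].astype(float)
dtype_after_cast = df[1].dtype
mean_vote = df[1].mean()
```

So the values were floats all along; it is the column container that needed converting before `mean()` and similar column-wise operations behave as expected.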