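Select all rows based on a column value in pandas

Hi, I need to select all rows based on a column value and either store them in a new variable or create a new data frame, then save it to a CSV file without a header.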

import pandas as pd 
import numpy as np 

print(df) 
#  0  1 2 3 
# 0 Gm# one 0 0 
# 1 922 one 1 2 
# 2 933 two 2 4 
# 3 952 three 3 6 
# 4 Gm# two 4 8 
# 5 960 two 5 10 
# 6 963 one 6 12 
# 7 999 three 7 14

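So I want a new data frame based on a condition on the first column. I only want to grab the rows whose value is in the range >= 900 & <= 999, and I want to store the result in a CSV without the index. Desired output: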

print (df2) 
    922 one 1 2 
    933 two 2 4 
    952 three 3 6 
    960 two 5 10 
    963 one 6 12 
    999 three 7 14

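I tried the following. The problem is that I can't figure out how to convert the whole column into integers, or maybe there is a simpler way to do this by reference rather than checking the whole data frame. I have gone through various Stack Overflow posts and YouTube videos but just can't get it right. Any ideas would be much appreciated.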

#df[x]= data[x][(data[x]['0'].astype(np.int64))] need to find a way to convert column [0] into integers so it can be evaluated 
#df2 = data[i]([(data['0'] >= 900) & (data['0'] <= 999)])

Source

2016-05-12 herrington

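You can select the first column by position with iloc and convert it with to_numeric, then add the condition (data['0'].notnull()), because non-numeric values are converted to NaN. Finally, use to_csv with index=False to drop the index and, as requested in the comments, header=None to drop the header: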

import pandas as pd 

# Keys are listed in column order so that position 0 is column '0'
# (recent pandas versions keep the dict insertion order).
data = pd.DataFrame(
{'0': {0: 'Gm', 1: '922', 2: '933', 3: '952', 4: 'Gm', 5: '960', 6: '963', 7: '999'}, 
'1': {0: 'one', 1: 'one', 2: 'two', 3: 'three', 4: 'two', 5: 'two', 6: 'one', 7: 'three'}, 
'2': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7}, 
'3': {0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14}}) 

print(data) 

    0  1 2 3 
0 Gm one 0 0 
1 922 one 1 2 
2 933 two 2 4 
3 952 three 3 6 
4 Gm two 4 8 
5 960 two 5 10 
6 963 one 6 12 
7 999 three 7 14

data.iloc[:, 0] = pd.to_numeric(data.iloc[:, 0], errors='coerce') 
print(data) 
     0  1 2 3 
0 NaN one 0 0 
1 922.0 one 1 2 
2 933.0 two 2 4 
3 952.0 three 3 6 
4 NaN two 4 8 
5 960.0 two 5 10 
6 963.0 one 6 12 
7 999.0 three 7 14 


df1 = data[(data['0'] >= 900) & (data['0'] <= 999) & (data['0'].notnull())] 
print(df1) 
     0  1 2 3 
1 922.0 one 1 2 
2 933.0 two 2 4 
3 952.0 three 3 6 
5 960.0 two 5 10 
6 963.0 one 6 12 
7 999.0 three 7 14 


df1.to_csv('file', index=False, header=None)
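
As a compact variant of the steps above, here is a minimal sketch that assumes the same sample data. It swaps the two comparisons for Series.between, which is inclusive on both ends by default and never matches NaN, so no separate notnull() check is needed. Because the coerced column is float (922.0), it also casts the kept values back to integers to match the desired output. The in-memory buffer and the names first, df2 and csv_text are only for illustration; writing to a file path works the same way.

```python
import io

import pandas as pd

data = pd.DataFrame({
    '0': ['Gm', '922', '933', '952', 'Gm', '960', '963', '999'],
    '1': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    '2': [0, 1, 2, 3, 4, 5, 6, 7],
    '3': [0, 2, 4, 6, 8, 10, 12, 14],
})

# Non-numeric entries such as 'Gm' become NaN.
first = pd.to_numeric(data['0'], errors='coerce')

# between() is inclusive on both ends by default; NaN rows are not selected.
df2 = data[first.between(900, 999)].copy()

# No NaN is left after filtering, so the values can be stored as integers.
df2['0'] = first[df2.index].astype(int)

# Write without index and without header (a buffer stands in for a file).
buf = io.StringIO()
df2.to_csv(buf, index=False, header=False)
csv_text = buf.getvalue()
print(csv_text)
```

Here header=False is the documented boolean form of the header argument; it is used in this sketch instead of header=None.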

Edit:

You can try:

for df in tables: 
    # replace the '½' character so values such as '950½' parse as 950.5 
    df.replace(regex=True, inplace=True, to_replace='½', value='.5') 
    # select the first column by position and coerce non-numeric values to NaN 
    df.iloc[:, 0] = pd.to_numeric(df.iloc[:, 0], errors='coerce') 
    first = df.iloc[:, 0] 
    df1 = df[(first >= 900) & (first <= 999) & (first.notnull())] 
    print(df1)
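
The loop can also be wrapped in a small helper that is applied to each frame. This is only a sketch under stated assumptions: the tables list below is made-up stand-in data (the real tables is not shown in the question), select_range is a hypothetical name, and the '½' replacement is limited to the first column. The helper always selects the first column by position, so it does not depend on that column being labelled '0', and it leaves the input frames unchanged.

```python
import pandas as pd


def select_range(df, low=900, high=999):
    """Return the rows whose first column, read as a number, lies in [low, high]."""
    # Work on a string copy of the first column so '½' can be replaced safely.
    first = df.iloc[:, 0].astype(str).str.replace('½', '.5', regex=False)
    # Anything that is still not a number becomes NaN.
    numeric = pd.to_numeric(first, errors='coerce')
    # Comparisons with NaN are False, so those rows are dropped.
    mask = (numeric >= low) & (numeric <= high)
    return df[mask]


# Stand-in data for illustration only.
tables = [
    pd.DataFrame({'code': ['Gm', '922', '950½', '1200'],
                  'name': ['a', 'b', 'c', 'd']}),
    pd.DataFrame({'code': ['899', '999', 'x'],
                  'name': ['e', 'f', 'g']}),
]

results = [select_range(df) for df in tables]
for df1 in results:
    print(df1)
```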

Source

2016-05-12 13:23:41 jezrael

Hey, I only changed what you gave me (I'm using 'tables[i]' in place of 'data') and got this error: tables[i].iloc[:, 0] = tables[i].to_numeric(tables[i].iloc[:, 0], errors='coerce') File "C:\Python35\lib\site-packages\pandas\core\generic.py", line 2669, in __getattr__ return object.__getattribute__(self, name) AttributeError: 'DataFrame' object has no attribute 'to_numeric' – herrington

But use 'pd.to_numeric', not 'tables[i].to_numeric'. – jezrael

Please check my solution, I added a test DataFrame. – jezrael
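
The AttributeError in the first comment comes from calling to_numeric on a DataFrame; it is a top-level pandas function, not a DataFrame method. A minimal sketch of the difference, using a made-up one-column frame since the asker's tables is not shown:

```python
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({'0': ['Gm', '922', '999']})

# to_numeric lives on the pandas module, not on DataFrame objects,
# so df.to_numeric(...) raises AttributeError.
has_method = hasattr(df, 'to_numeric')

# Call it as pd.to_numeric and pass the column as the argument.
converted = pd.to_numeric(df.iloc[:, 0], errors='coerce')

print(has_method)
print(converted.tolist())
```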
