2017-02-09 150 views

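Extracting rows from a pandas DataFrame based on column values

How can I select the rows where a column matches a specific value, in a DataFrame created from an Excel file?

Here are a few rows of the DataFrame: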

              Food    Men  Women
0      Total fruit  86.20  88.26
1    Apples, Total  89.01  89.66
2  Apples as fruit  89.18  90.42
3      Apple juice  88.78  88.42
4          Bananas  95.42  94.18
5          Berries  84.21  81.73
6           Grapes  88.79  88.13

This is the code I use to read the Excel file, select the columns I need, and rename them appropriately:

data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K') 
data1.columns = ['Food', 'Men', 'Women'] 

# Try 1: data1 = data1[data1['Food'].isin(['Total fruit']) == True] works 
# Try 2: data1 = data1[data1['Food'].isin(['Apple, Total']) == True] doesn't work 
# Try 3: data1 = data1.iloc[[1]] returns Apples, Total but not appropriate to use integer index 
# Try 4: data1[data1['Food'] == 'Berries'] doesn't work 

So far, based on several other answers, I can only return the first row, where Food == "Total fruit". When I try the other approaches above, I get back only the column names:

Food Men Women 

I'm new to pandas and can't see where I'm going wrong. Why can I extract the first row, where Food == "Total fruit", but nothing else?

Answers


It works fine for me. Maybe the problem is stray whitespace in some of the values; you can remove it with str.strip:

print (data1.Food.tolist()) 
['Total fruit', 'Apples, Total ', 'Apples as fruit', 
'Apple juice', 'Bananas', ' Berries', 'Grapes'] 

data1['Food'] = data1['Food'].str.strip() 

print (data1.Food.tolist()) 
['Total fruit', 'Apples, Total', 'Apples as fruit', 
'Apple juice', 'Bananas', 'Berries', 'Grapes'] 

data2 = data1[data1['Food'].isin(['Total fruit'])] 
print (data2) 
      Food Men Women 
0 Total fruit 86.2 88.26 

data3 = data1[data1['Food'].isin(['Apples, Total'])] 
print (data3) 
      Food Men Women 
1 Apples, Total 89.01 89.66 

data3 = data1[data1['Food'].isin(['Berries'])] 
print (data3) 
     Food Men Women 
5 Berries 84.21 81.73 
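
To make the effect of the stray spaces concrete, here is a minimal sketch that does not need the Excel file. The small `df` below is a hypothetical stand-in built by hand, with the padded labels copied from the `tolist()` output above; the names `df`, `before`, and `after` are illustrative assumptions, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data, including the stray spaces
# seen in the tolist() output ('Apples, Total ' and ' Berries').
df = pd.DataFrame({
    "Food": ["Total fruit", "Apples, Total ", " Berries", "Grapes"],
    "Men": [86.20, 89.01, 84.21, 88.79],
    "Women": [88.26, 89.66, 81.73, 88.13],
})

# Before stripping, an exact comparison misses the padded label.
before = df[df["Food"] == "Berries"]

# Strip leading and trailing whitespace, then compare again.
df["Food"] = df["Food"].str.strip()
after = df[df["Food"] == "Berries"]

print(len(before), len(after))
```

Printing `data1['Food'].tolist()` (or `repr` of a single value) is a quick way to check for this kind of padding, because the default DataFrame display tends to hide it.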

Use this code:

data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K') 
list_of_strings_to_match = ['Total fruit', 'Berries', 'Grape'] 
# Print every row whose Food value exactly equals one of the strings 
for index, row in data1.iterrows(): 
    if row['Food'] in list_of_strings_to_match: 
        print(row) 

This returns no rows for Berries or Grapes. – dreamin
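
A likely reason for the empty result is that the `in` test in the loop is an exact string comparison: ' Berries' (with the leading space shown in the first answer) is not equal to 'Berries', and 'Grape' is not equal to 'Grapes'. As a hedged sketch, the loop can be replaced by a vectorized filter that strips whitespace before matching. The DataFrame `df` and the names `wanted`, `mask`, and `matches` are assumptions made for illustration, not the original data.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data, with the padded labels
# reported in the first answer.
df = pd.DataFrame({
    "Food": ["Total fruit", "Apples, Total ", " Berries", "Grapes"],
    "Men": [86.20, 89.01, 84.21, 88.79],
    "Women": [88.26, 89.66, 81.73, 88.13],
})

# Full labels to match; note 'Grapes', not 'Grape'.
wanted = ["Total fruit", "Berries", "Grapes"]

# Strip whitespace on the fly and build a boolean mask with isin.
mask = df["Food"].str.strip().isin(wanted)
matches = df[mask]

print(matches)
```

This leaves the original column untouched and avoids iterating row by row; assigning the stripped values back to `df["Food"]` first works just as well if the cleaned labels are wanted afterwards.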