2017-02-18 69 views
2

Problem replacing values in a pandas DataFrame?

I have the following pandas DataFrame:

In:

import pandas as pd

df = pd.DataFrame({'Fruits': ['this should be a pinneapple',
                              'this should be an apple',
                              'this should be a tomato',
                              'this should 3 grapes',
                              'this should be an orange',
                              'this should be an 01',
                              'this should be an 02']})

df 

Out:

Fruits 
0 this should be a pinneapple 
1 this should be an apple 
2 this should be a tomato 
3 this should 3 grapes 
4 this should be an orange 
5 this should be an 01 
6 this should be an 02 

I would like to replace every fruit with an id (e.g. 01nn). To do this, I tried the pandas replace function:

df['Fruits'] = df['Fruits'].replace(['pinneapple', 'apple', 'tomato', 'grapes', 'orange'],
                                    ['01', '02', '03', '04', '05'])

However, when I run the assignment above, nothing changes in the column I want to modify. How can I replace each word with its predefined number?

Answers

3

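You can use Series.replace with the parameter regex=True, so that the fruit names are matched as substrings rather than as whole cell values: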

df['Fruits'] = df['Fruits'].replace(['pinneapple', 'apple', 'tomato', 'grapes', 'orange'],
                                    ['01', '02', '03', '04', '05'], regex=True)
print(df)
       Fruits 
0 this should be a 01 
1 this should be an 02 
2 this should be a 03 
3  this should 3 04 
4 this should be an 05 
5 this should be an 01 
6 this should be an 02 
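
To see why the original call did nothing, here is a minimal sketch contrasting the default whole-cell matching with regex=True. The two-row Series and the names exact and partial are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Illustrative data: one sentence containing 'apple', one cell that is exactly 'apple'
s = pd.Series(['this should be an apple', 'apple'])

# Default (regex=False): only cells whose whole value equals 'apple' are replaced
exact = s.replace(['apple'], ['02'])

# regex=True: each search string is treated as a pattern and matched
# anywhere inside the cell, so substrings are replaced as well
partial = s.replace(['apple'], ['02'], regex=True)

print(exact.tolist())
print(partial.tolist())
```

With the default setting the sentence is left untouched and only the exact-match cell changes, which matches the behavior described in the question.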

You can also generate codes with a list comprehension:

fruits = ['pinneapple', 'apple', 'tomato', 'grapes', 'orange']
# Zero-padded ids '01', '02', ... in the same order as fruits
codes = [str(i + 1).zfill(2) for i in range(len(fruits))]
print(codes)
['01', '02', '03', '04', '05']

df['Fruits'] = df['Fruits'].replace(fruits, codes, regex=True)
print(df)

       Fruits 
0 this should be a 01 
1 this should be an 02 
2 this should be a 03 
3  this should 3 04 
4 this should be an 05 
5 this should be an 01 
6 this should be an 02 
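
One thing to watch with substring patterns is that 'apple' also occurs inside 'pinneapple', so the outcome can depend on how overlapping patterns are handled. As a possible alternative, the sketch below combines the names into a single pattern with word boundaries and looks each match up in a dict through Series.str.replace with a callable. The names mapping and pattern are hypothetical, introduced only for this example, and the DataFrame is a shortened copy of the question's data.

```python
import re
import pandas as pd

# Shortened copy of the question's data
df = pd.DataFrame({'Fruits': ['this should be a pinneapple',
                              'this should be an apple',
                              'this should 3 grapes']})

# Hypothetical lookup table: fruit name -> id
mapping = {'pinneapple': '01', 'apple': '02', 'grapes': '04'}

# \b anchors every alternative to whole words, so 'apple' cannot match
# inside 'pinneapple'; re.escape guards against special characters
pattern = r'\b(?:' + '|'.join(re.escape(k) for k in mapping) + r')\b'

# The callable receives each regex match and returns its replacement id
df['Fruits'] = df['Fruits'].str.replace(
    pattern, lambda m: mapping[m.group(0)], regex=True)
print(df)
```

Keeping the names and ids together in one dict also avoids having to keep two parallel lists in the same order.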

Thanks for the help, jez! – tumbleweed

1

Try resetting the values as follows:

df['Fruits'] = pd.DataFrame() 

Then assign the new values again.