2017-04-19

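Set pandas values based on a dictionary

I want to replace values in a DataFrame with values from a dictionary: if a value in Column C matches a dictionary key, replace the corresponding value in Column D with the dictionary value for that key.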

import pandas as pd 
import numpy as np 
dfp = pd.DataFrame({'A' : [np.nan,np.nan,3,4,5,5,3,1,5,np.nan],
        'B' : [1,0,3,5,0,0,np.nan,9,0,0],
        'C' : ['AA1233445','A9875', 'rmacy','Idaho Rx','Ab123455','TV192837','RX','Ohio Drugs','RX12345','USA Pharma'],
        'D' : [123456,123456,1234567,12345678,12345,12345,12345678,123456789,1234567,np.nan],
        'E' : ['Assign','Unassign','Assign','Ugly','Appreciate','Undo','Assign','Unicycle','Assign','Unicorn',]})
print(dfp) 

z = {'rmacy': 999} 
dfp.loc[dfp['C'].isin(z.keys()), 'D' ] = z.values() # <--- code to change 

Output:
     A    B           C            D           E
0  NaN  1.0   AA1233445       123456      Assign
1  NaN  0.0       A9875       123456    Unassign
2  3.0  3.0       rmacy        (999)      Assign  # <--- worked, but with parentheses
3  4.0  5.0    Idaho Rx  1.23457e+07        Ugly
4  5.0  0.0    Ab123455        12345  Appreciate
5  5.0  0.0    TV192837        12345        Undo
6  3.0  NaN          RX  1.23457e+07      Assign
7  1.0  9.0  Ohio Drugs  1.23457e+08    Unicycle
8  5.0  0.0     RX12345  1.23457e+06      Assign
9  NaN  0.0  USA Pharma          NaN     Unicorn

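The code above works, except that the value ends up wrapped in parentheses. But if the dictionary has more than one key, it puts both values into Column D, because there are two matches in the column: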

     A    B           C            D           E
0  NaN  1.0   AA1233445       123456      Assign
1  NaN  0.0       A9875       123456    Unassign
2  3.0  3.0       rmacy   (999, 333)      Assign
3  4.0  5.0    Idaho Rx  1.23457e+07        Ugly
4  5.0  0.0    Ab123455        12345  Appreciate
5  5.0  0.0    TV192837        12345        Undo
6  3.0  NaN          RX   (999, 333)      Assign
7  1.0  9.0  Ohio Drugs  1.23457e+08    Unicycle
8  5.0  0.0     RX12345  1.23457e+06      Assign
9  NaN  0.0  USA Pharma          NaN     Unicorn

How would one solve this?

Answer


Use map and fillna:

dfp.assign(D=dfp.C.map(z).fillna(dfp.D)) 

     A    B           C            D           E
0  NaN  1.0   AA1233445     123456.0      Assign
1  NaN  0.0       A9875     123456.0    Unassign
2  3.0  3.0       rmacy        999.0      Assign
3  4.0  5.0    Idaho Rx   12345678.0        Ugly
4  5.0  0.0    Ab123455      12345.0  Appreciate
5  5.0  0.0    TV192837      12345.0        Undo
6  3.0  NaN          RX   12345678.0      Assign
7  1.0  9.0  Ohio Drugs  123456789.0    Unicycle
8  5.0  0.0     RX12345    1234567.0      Assign
9  NaN  0.0  USA Pharma          NaN     Unicorn
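
Note that assign returns a new DataFrame and leaves dfp untouched, so the result has to be bound to a name to be kept. The sketch below shows this with a two-key dictionary. The second key ('RX': 333) is an assumption inferred from the question's second output, the four-row frame is a cut-down stand-in for dfp, and the names result, in_place and mask are illustrative only. It also shows a .loc variant that stays closer to the asker's original line.

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for dfp (illustrative, not the full frame from the question).
dfp = pd.DataFrame({
    'C': ['rmacy', 'Idaho Rx', 'RX', 'USA Pharma'],
    'D': [1234567, 12345678, 12345678, np.nan],
})

# Assumed two-key dictionary; the question only shows {'rmacy': 999}.
z = {'rmacy': 999, 'RX': 333}

# Option 1: assign returns a copy, so bind it to a name to keep the result.
result = dfp.assign(D=dfp['C'].map(z).fillna(dfp['D']))

# Option 2: update only the matching rows, closer to the original .loc line.
in_place = dfp.copy()
mask = in_place['C'].isin(list(z))    # rows whose C value is a dictionary key
in_place.loc[mask, 'D'] = in_place.loc[mask, 'C'].map(z)

print(result)
print(in_place)
```

In both options each matching row receives only the value for its own key, because map looks values up row by row instead of assigning the whole collection of dictionary values at once.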

You always give the right answer! Haha. Could you walk me through this approach? How using 'fillna()' produces the correct result is eluding me. – MattR


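@MattR 'map' returns the value from the dictionary if there is one, otherwise 'NaN'. I then fill those 'NaN' values with 'fillna(df.D)'. Finally, I overwrite the current 'df.D' with these new values. – piRSquared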
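
The three steps described in this comment can be traced one at a time. The following is a minimal sketch on a three-row stand-in frame; the values and the intermediate names mapped and filled are assumptions for illustration and do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Small stand-in for the C and D columns (illustrative values).
df = pd.DataFrame({
    'C': ['A9875', 'rmacy', 'USA Pharma'],
    'D': [123456, 1234567, np.nan],
})
z = {'rmacy': 999}

# Step 1: look up each value of C in z; rows without a matching key become NaN.
mapped = df['C'].map(z)

# Step 2: fill those NaN slots from D, aligned on the row index.
filled = mapped.fillna(df['D'])

# Step 3: overwrite D with the combined column.
df['D'] = filled

print(mapped.tolist())   # [nan, 999.0, nan]
print(filled.tolist())   # [123456.0, 999.0, nan]
```

A row that has no matching key and whose D was already NaN stays NaN, which matches the last row of the answer's output. The column comes out as float because NaN forces a floating-point dtype, which is why the answer shows 123456.0 rather than 123456.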


Simply brilliant... I hope to be as fluent as you someday. Cheers! – MattR