2017-08-04 91 views
3

Python/pandas: calculation based on a cell's value

I have a DataFrame like this:

   A  B  C  D  E
0  2  3  4  8  7
1  4  7  5  9  4
2  3  4  5  7  2
3  8  9  1  3  7

I need to do something like this:

if 'value in column A' == 2: 
    'value for this row in new column' = 'value from column B' + 'value from column C' 
elif 'value in column A' == 4: 
    'value for this row in new column' = 'value from column B' + 'value from column D' 
elif 'value in column A' == 8: 
    'value for this row in new column' = 'value from column B' + 'value from column E' 
else: 
    'value for this row in new column' = 0 

I have tried several ways of doing this, for example:

1. 
df['sum'][df['A'] == 2] = df['B'] + df['C'] 
df['sum'][df['A'] == 4] = df['B'] + df['D'] 
df['sum'][df['A'] == 8] = df['B'] + df['E'] 

2. 
df.loc[df['A'] == 2, 'sum'] = df['B'] + df['C'] 
df.loc[df['A'] == 4, 'sum'] = df['B'] + df['D'] 
df.loc[df['A'] == 8, 'sum'] = df['B'] + df['E'] 

But I get empty cells in the result.

+1

Your solution does not handle the else case, i.e. you may need to add a call to [fillna](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html), e.g. 'df.fillna(0, axis=1)', after handling the first three cases. – Quickbeam2k1
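
A minimal sketch of what this comment suggests, assuming the sample data from the question and its second attempt (the `.loc` assignments). Filling only the new `sum` column, rather than the whole frame, is a choice made for this example:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': [2, 4, 3, 8],
    'B': [3, 7, 4, 9],
    'C': [4, 5, 5, 1],
    'D': [8, 9, 7, 3],
    'E': [7, 4, 2, 7],
})

# The three conditional assignments from attempt 2 in the question.
df.loc[df['A'] == 2, 'sum'] = df['B'] + df['C']
df.loc[df['A'] == 4, 'sum'] = df['B'] + df['D']
df.loc[df['A'] == 8, 'sum'] = df['B'] + df['E']

# Rows that matched no condition are still NaN; fill only the new column.
df['sum'] = df['sum'].fillna(0)
print(df)
```

The row with A == 3 matches none of the three masks, so it is the only one that `fillna` changes.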

Answers

5

Another simple way to do this is to use a dictionary and lookup to get the sum, i.e.

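import numpy as np

colons = {2: 'C', 4: 'D', 8: 'E'}  # value in A -> column to add to B
df['sum'] = np.nan  # placeholder column, looked up for rows with no mapping
# Note: DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0
df['sum'] = df['B'] + df.lookup(df['A'].index, df['A'].map(colons).fillna('sum'))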

Output:

 
    A B C D E sum 
0 2 3 4 8 7 7.0 
1 4 7 5 9 4 16.0 
2 3 4 5 7 2 NaN 
3 8 9 1 3 7 16.0 

You can fill the NaN values with 0 using df.fillna(0).
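
`DataFrame.lookup` was deprecated in pandas 1.2 and removed in 2.0, so the snippet in this answer only runs on older versions. A rough sketch of the same dictionary idea using NumPy integer indexing instead; the helper names `picked`, `has_match`, `row_pos` and `col_pos` are made up for this example and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': [2, 4, 3, 8],
    'B': [3, 7, 4, 9],
    'C': [4, 5, 5, 1],
    'D': [8, 9, 7, 3],
    'E': [7, 4, 2, 7],
})

colons = {2: 'C', 4: 'D', 8: 'E'}  # same mapping as in the answer

# Column label to pick for each row; NaN where A has no mapping.
picked = df['A'].map(colons)
has_match = picked.notna().to_numpy()

# Positional row and column indices for the matched rows only.
row_pos = np.flatnonzero(has_match)
col_pos = df.columns.get_indexer(picked[has_match])

# Pull one cell per matched row out of the underlying array.
picked_vals = df.to_numpy()[row_pos, col_pos]

df['sum'] = np.nan
df.loc[has_match, 'sum'] = df['B'].to_numpy()[row_pos] + picked_vals
print(df)
```

As in the original answer, the unmatched row stays NaN and can be filled afterwards.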

+2

I was thinking of this. Good answer. – piRSquared

+0

Thank you, sir! – Dark

1

Here is one way to do it:

def f1(x):
    if x['A'] == 2:
        return x['B'] + x['C']
    elif x['A'] == 4:
        return x['B'] + x['D']
    elif x['A'] == 8:
        return x['B'] + x['E']
    else:
        return 0

df['sum'] = df.apply(f1, axis=1)
df.head()

Output:

A B C D E sum 
2 3 4 8 7 7 
4 7 5 9 4 16 
3 4 5 7 2 0 
8 9 1 3 7 16 
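
The same if/elif/else logic can also be written without a row-wise `apply`. A minimal sketch using `numpy.select`, which this answer does not use, swapped in here as a vectorized alternative on the question's sample data:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': [2, 4, 3, 8],
    'B': [3, 7, 4, 9],
    'C': [4, 5, 5, 1],
    'D': [8, 9, 7, 3],
    'E': [7, 4, 2, 7],
})

# One boolean mask per branch, in the same order as the if/elif chain.
conditions = [df['A'] == 2, df['A'] == 4, df['A'] == 8]
# The value to use when the corresponding mask is True.
choices = [df['B'] + df['C'], df['B'] + df['D'], df['B'] + df['E']]

# default=0 plays the role of the final else branch.
df['sum'] = np.select(conditions, choices, default=0)
print(df)
```

Each entry in `choices` is computed for every row up front, which is fine for plain column sums like these.
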
0

You get NA because the case df.A == 3 is not covered. Use df.loc[:,'sum'] = 0 # or any other starting value to avoid this:

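import pandas as pd

A = [2, 4, 3, 8]
B = [3, 7, 4, 9]
C = [4, 5, 5, 1]
D = [8, 9, 7, 3]
E = [7, 4, 2, 7]

# Each list holds one column, so build the frame from a dict of columns
df = pd.DataFrame({'A': A, 'B': B, 'C': C, 'D': D, 'E': E})

df.loc[:, 'sum'] = 0  # starting value for rows that match no condition
# Select rows and column in a single .loc call to avoid chained assignment
df.loc[df['A'] == 2, 'sum'] = df['B'] + df['C']
df.loc[df['A'] == 4, 'sum'] = df['B'] + df['D']
df.loc[df['A'] == 8, 'sum'] = df['B'] + df['E']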

>>> df 
    A B C D E sum 
0 2 3 4 8 7 7 
1 4 7 5 9 4 16 
2 3 4 5 7 2 0 
3 8 9 1 3 7 16