2017-05-28 100 views

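Python Pandas: matching two indexes and column values

I want to match the row indexes of two different DataFrames. Where an index value appears in both, I want to go to the second df and iterate over its columns; wherever a column's value is 'V', I want to go back to the first df and write the second df's name as the value of a corresponding new column.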

For example:

MAIN DF:

names col1 col2 col3 total 
bbb  V  V  X  2 
ccc  V  X  X  1 

DF2:

names col1 col2 col3 total 
bbb  V  V  X  2 
zzz  X  X  V  1 

Afterwards, MAIN DF should be:

names col1 col2 col3 total total_col1 total_col2 total_col3 
bbb  V  V  X  2   DF2   DF2   NULL 
ccc  V  X  X  1   NULL   NULL  NULL 

Answer


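You can first make the names column the index with set_index, then replace the values using a dict, and finally rename the columns with add_prefix.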

Then join the result to the original:

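import numpy as np

cols = ['col1','col2','col3'] 
# Index DF2 by names, map 'V' to the label 'DF2' and 'X' to NaN, then prefix the column names
DF2 = DF2.set_index('names')[cols].replace({'V':'DF2', 'X':np.nan}).add_prefix('total_') 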
print (DF2) 
     total_col1 total_col2 total_col3 
names         
bbb   DF2  DF2  NaN 
zzz   NaN  NaN  DF2 

# df is MAIN DF; joining on its names column keeps only its rows (left join)
df = df.join(DF2, on='names') 
print (df) 
    names col1 col2 col3 total total_col1 total_col2 total_col3 
0 bbb V V X  2  DF2  DF2  NaN 
1 ccc V X X  1  NaN  NaN  NaN
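
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample frames are rebuilt from the tables in the question, and the variable names main, df2, lookup and result are only illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question (variable names are illustrative).
main = pd.DataFrame({
    'names': ['bbb', 'ccc'],
    'col1': ['V', 'V'],
    'col2': ['V', 'X'],
    'col3': ['X', 'X'],
    'total': [2, 1],
})
df2 = pd.DataFrame({
    'names': ['bbb', 'zzz'],
    'col1': ['V', 'X'],
    'col2': ['V', 'X'],
    'col3': ['X', 'V'],
    'total': [2, 1],
})

cols = ['col1', 'col2', 'col3']

# Turn df2 into a lookup table keyed by names:
# 'V' becomes the label 'DF2', 'X' becomes NaN, and columns get a 'total_' prefix.
lookup = (
    df2.set_index('names')[cols]
       .replace({'V': 'DF2', 'X': np.nan})
       .add_prefix('total_')
)

# Left join on main's names column: rows without a match in df2 get NaN.
result = main.join(lookup, on='names')
print(result)
```

Because join defaults to a left join, the zzz row that exists only in DF2 does not appear in the result, and ccc, which has no match in DF2, gets NaN in all three new columns.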