2017-08-14 232 views

In pandas, I want to build a dictionary from multiple columns of the following DataFrame df1:

sentence   A  B  C  D  F  G 
dizzy   1  1  0  0  k  1 
Head   0  0  1  0  l  1 
nausea   0  0  0  1  fd  1 
zap    1  0  1  0  g  1 
dizziness  0  0  0  1  V  1  

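I need to create a dictionary from the column sentence together with columns A, B, C and D.

Next, using that dictionary, I need to map the sentences column of DataFrame df2 to the values of A, B, C and D. The output should look like this: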

sentences   A  B  C  D    
    dizzy   1  1  0  0 
    happy    
    Head   0  0  1  0    
    nausea   0  0  0  1 
    fill out   
    zap    1  0  1  0    
    dizziness  0  0  0  1  

This is my code, but it only handles one column; I do not know how to do it for several columns:


import numpy as np
equiv = df1.set_index('sentence')['A'].to_dict()
df2['A'] = df2['sentences'].apply(lambda x: equiv.get(x, np.nan))

Thanks.

Answer


IIUC:

Setup:

In [164]: df1 
Out[164]: 
    sentence A B C D F G 
0  dizzy 1 1 0 0 k 1 
1  Head 0 0 1 0 l 1 
2  nausea 0 0 0 1 fd 1 
3  zap 1 0 1 0 g 1 
4 dizziness 0 0 0 1 V 1 

In [165]: df2 
Out[165]: 
    sentences 
0  dizzy 
1  happy 
2  Head 
3  nausea 
4 fill out 
5  zap 
6 dizziness 

Solution:

In [174]: df2[['sentences']].merge(df1[['sentence','A','B','C','D']], 
            left_on='sentences', 
            right_on='sentence', 
            how='outer') 
Out[174]: 
    sentences sentence A B C D 
0  dizzy  dizzy 1.0 1.0 0.0 0.0 
1  happy  NaN NaN NaN NaN NaN 
2  Head  Head 0.0 0.0 1.0 0.0 
3  nausea  nausea 0.0 0.0 0.0 1.0 
4 fill out  NaN NaN NaN NaN NaN 
5  zap  zap 1.0 0.0 1.0 0.0 
6 dizziness dizziness 0.0 0.0 0.0 1.0
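
If a dictionary is really what is wanted, the one-column approach from the question can probably be extended along these lines. This is only a sketch: the frames are rebuilt from the tables above, and the names cols, lookup and merged are mine, not from the original post. The merge at the end uses how='left' instead of the how='outer' shown above (my substitution, to keep exactly the rows of df2) and drops the duplicate key column.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "sentence": ["dizzy", "Head", "nausea", "zap", "dizziness"],
    "A": [1, 0, 0, 1, 0],
    "B": [1, 0, 0, 0, 0],
    "C": [0, 1, 0, 1, 0],
    "D": [0, 0, 1, 0, 1],
    "F": ["k", "l", "fd", "g", "V"],
    "G": [1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    "sentences": ["dizzy", "happy", "Head", "nausea",
                  "fill out", "zap", "dizziness"]
})

cols = ["A", "B", "C", "D"]  # hypothetical helper name

# One dictionary keyed by sentence; each value is a dict of the four columns.
lookup = df1.set_index("sentence")[cols].to_dict(orient="index")

# Fill each column from the same dictionary; unmatched sentences get NaN.
for col in cols:
    df2[col] = df2["sentences"].map(
        lambda s: lookup.get(s, {}).get(col, np.nan)
    )

# Merge-based alternative: a left join keeps the rows and order of df2.
merged = (
    df2[["sentences"]]
    .merge(df1[["sentence"] + cols],
           left_on="sentences", right_on="sentence", how="left")
    .drop(columns="sentence")
)
```

Both df2 and merged should then hold the A to D values for the matching sentences and NaN for "happy" and "fill out". The dictionary version assumes the values in sentence are unique, since to_dict(orient="index") needs a unique index.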