2017-05-25

Extract rows by comparing keys

I have two pandas DataFrames:

df1.loc[1:5]

    Keys ColName 
1 LOSTP LOSTP1 
2 LOSTP LOSTP2 
3 LOSTP LOSTP3 
4 GIDEO GIDEOasdun 
5 sdfff sdfffvrf 


df2.loc[1:5]

    Keys ColName 
2 LOSTQ LOSTQ2 
3 LOSTR LOSTR3 
5 sdfff sdfffvrf 

I want to extract the following rows from df1:

 Keys ColName 
1 LOSTP LOSTP1 
2 LOSTP LOSTP2 
3 LOSTP LOSTP3 
4 GIDEO GIDEOasdun 

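That is, I want the difference between df1['Keys'] and df2['Keys']: the rows of df1 whose key does not appear in df2, selected by comparing on Keys.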

Answer

# Use apply to check, row by row, whether the row's ColName appears among the df2 ColName values that share the row's key. Then use the resulting boolean mask to select rows.

df1[df1.apply(lambda x: x.ColName not in df2[df2.Keys==x.Keys]['ColName'].tolist(), axis=1)] 
Out[272]: 
    Keys  ColName 
1 LOSTP  LOSTP1 
2 LOSTP  LOSTP2 
3 LOSTP  LOSTP3 
4 GIDEO GIDEOasdun 
Find the elements of df1 that are not in df2
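
If the comparison only needs to look at Keys, as the question describes, a vectorized mask built with `Series.isin` is a possible alternative to the row-wise apply. The sketch below is a minimal example, not the answerer's method; the two frames are rebuilt by hand from the rows displayed in the question, so the full original data is an assumption.

```python
import pandas as pd

# Sample frames rebuilt from the rows shown in the question
# (assumption: only the displayed slice is reproduced here).
df1 = pd.DataFrame(
    {
        "Keys": ["LOSTP", "LOSTP", "LOSTP", "GIDEO", "sdfff"],
        "ColName": ["LOSTP1", "LOSTP2", "LOSTP3", "GIDEOasdun", "sdfffvrf"],
    },
    index=[1, 2, 3, 4, 5],
)
df2 = pd.DataFrame(
    {
        "Keys": ["LOSTQ", "LOSTR", "sdfff"],
        "ColName": ["LOSTQ2", "LOSTR3", "sdfffvrf"],
    },
    index=[2, 3, 5],
)

# Boolean mask: True where the df1 key never occurs in df2["Keys"].
mask = ~df1["Keys"].isin(df2["Keys"])

# Keep only the rows of df1 whose key is absent from df2.
result = df1[mask]
print(result)
```

Note that this compares Keys alone, whereas the apply version above compares ColName within each key. On the sample data both select the rows with index 1 through 4, but they can differ when a key is shared and the ColName values are not.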