2017-06-29 75 views
1

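I have a dataframe1 containing 500,000 rows. I want to fill in its Configuration column by looking up each model number in dataframe2, which holds the configurations. How do I add a new column to a pandas DataFrame based on the output of a groupby function?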

Dataframe1:

Model     Date  Status Configuration 
A4     10/2014 Inop  
A4     11/2014 Op    
A4     11/2014 Op          
G5     10/2014 Inop         
G5     11/2014 Inop         
G5     11/2014 Op          
G8     10/2014 Op          
G8     11/2014 Op          
G8     11/2014 Op          
G8     10/2014 Inop         
Z2     11/2014 Op          
Z2     11/2014 Op          

Dataframe2:

Model    Configuration 
A4     ICS 
G5     PCS 
G8     ICS  
Z2     1/2 ICS 

The code I am currently running:

for Model, group in dataframe1.groupby('Model'): 
    # get the configuration for this model from dataframe2 
    config = get_configuration(Model) 
    # attempt to assign the configuration to all rows with that model number in dataframe1 
    dataframe1['Config'] = config 

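This code groups dataframe1 by Model and successfully gets the configuration for each group, but I can't work out how to apply that configuration to a new column in dataframe1. The result I want is: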

Model     Date  Status Configuration 
A4     10/2014 Inop  ICS 
A4     11/2014 Op  ICS  
A4     11/2014 Op  ICS  
G5     10/2014 Inop  PCS 
G5     11/2014 Inop  PCS 
G5     11/2014 Op  PCS 
G8     10/2014 Op  ICS 
G8     11/2014 Op  ICS  
G8     11/2014 Op  ICS  
G8     10/2014 Inop  ICS  
Z2     11/2014 Op  1/2 ICS 
Z2     11/2014 Op  1/2 ICS 

Try this link: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html – Wen

Answers

3

Use map

Dataframe1['Config'] = Dataframe1['Model'].map(Dataframe2.set_index('Model')['Configuration']) 
Dataframe1 

    Model  Date Status Config 
0  A4 10/2014 Inop  ICS 
1  A4 11/2014  Op  ICS 
2  A4 11/2014  Op  ICS 
3  G5 10/2014 Inop  PCS 
4  G5 11/2014 Inop  PCS 
5  G5 11/2014  Op  PCS 
6  G8 10/2014  Op  ICS 
7  G8 11/2014  Op  ICS 
8  G8 11/2014  Op  ICS 
9  G8 10/2014 Inop  ICS 
10 Z2 11/2014  Op 1/2 ICS 
11 Z2 11/2014  Op 1/2 ICS 
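
For reference, here is a self-contained sketch of the map approach, with the question's sample frames rebuilt by hand. It assumes dataframe2 has exactly one row per Model (map generally needs a unique index on the lookup Series), and the name `lookup` is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question.
dataframe1 = pd.DataFrame({
    "Model": ["A4", "A4", "A4", "G5", "G5", "G5",
              "G8", "G8", "G8", "G8", "Z2", "Z2"],
    "Date": ["10/2014", "11/2014", "11/2014", "10/2014", "11/2014", "11/2014",
             "10/2014", "11/2014", "11/2014", "10/2014", "11/2014", "11/2014"],
    "Status": ["Inop", "Op", "Op", "Inop", "Inop", "Op",
               "Op", "Op", "Op", "Inop", "Op", "Op"],
})
dataframe2 = pd.DataFrame({
    "Model": ["A4", "G5", "G8", "Z2"],
    "Configuration": ["ICS", "PCS", "ICS", "1/2 ICS"],
})

# Build a Series indexed by Model, so each model maps to its configuration.
# `lookup` is a hypothetical name, not part of the original answer.
lookup = dataframe2.set_index("Model")["Configuration"]

# Look up every row's Model in that Series; no groupby or loop needed.
dataframe1["Configuration"] = dataframe1["Model"].map(lookup)

print(dataframe1)
```

Any Model that is missing from dataframe2 would end up as NaN in the new column rather than raising an error.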
1

Try pd.merge

Dataframe1.merge(Dataframe2,left_on='Model',right_on='Model',how='left')   

This is a good solution too :-) ... You don't need 'right_on' or 'left_on' if the column names are the same. You can use 'on' – piRSquared

@piRSquared In terms of efficiency, yours is better~ – Wen
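
The merge variant with 'on', as suggested in the comments, might look like the sketch below. The frames are trimmed-down stand-ins for the question's data, and the model "X9" is a made-up entry added only to show what a left join does with a model that has no configuration.

```python
import pandas as pd

# Reduced stand-in for dataframe1; "X9" is a hypothetical model
# that does not appear in the lookup table.
dataframe1 = pd.DataFrame({
    "Model": ["A4", "G5", "G8", "Z2", "X9"],
    "Status": ["Inop", "Op", "Op", "Op", "Inop"],
})
dataframe2 = pd.DataFrame({
    "Model": ["A4", "G5", "G8", "Z2"],
    "Configuration": ["ICS", "PCS", "ICS", "1/2 ICS"],
})

# Both frames share the key column name, so `on` replaces
# the left_on/right_on pair. how="left" keeps every row of dataframe1.
merged = dataframe1.merge(dataframe2, on="Model", how="left")

print(merged)
```

If dataframe1 already carries an empty Configuration column, as in the question, dropping it before the merge avoids ending up with suffixed Configuration_x / Configuration_y columns.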