2016-07-29 81 views
pandas - different aggregations for one field

I have a transactions table with these columns: customer_id, transaction_id, month

I want to write a query equivalent to the following SQL:

SELECT min(month) as first_month, max(month) as last_month 
FROM transactions 
GROUP BY customer_id 

In pandas, it seems I can aggregate each column only once, because the following query returns only one Month column:

transactions.groupby('customer_id').aggregate({ 'Month' : 'min', 'Month' : 'max'}) 

Any idea how I can do this?

Answer

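The dict in the question repeats the key 'Month', so Python keeps only the last entry ('max'). Pass a list of functions for the column instead: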

transactions.groupby('customer_id').aggregate({ 'Month' : ['min', 'max']}) 

Sample:

import pandas as pd

transactions = pd.DataFrame({'Month':       [4, 5, 6, 1, 1, 3],
                             'customer_id': [1, 2, 3, 1, 2, 1]})

print(transactions)
   Month  customer_id
0      4            1
1      5            2
2      6            3
3      1            1
4      1            2
5      3            1

df = transactions.groupby('customer_id').aggregate({'Month': ['min', 'max']})
print(df)
            Month
              min max
customer_id
1               1   4
2               1   5
3               6   6
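
The result above has two-level column labels (Month / min, max), whereas the SQL in the question asks for flat names, first_month and last_month. A minimal sketch of one way to get those, assuming pandas 0.25 or newer, where "named aggregation" is available (this feature is newer than the answer above):

```python
import pandas as pd

transactions = pd.DataFrame({'Month':       [4, 5, 6, 1, 1, 3],
                             'customer_id': [1, 2, 3, 1, 2, 1]})

# Named aggregation: output_column=(input_column, function).
# The output names mirror the aliases used in the SQL query.
result = transactions.groupby('customer_id').agg(
    first_month=('Month', 'min'),
    last_month=('Month', 'max'),
)
print(result)
```

Each keyword becomes one flat output column, so the same input column can be aggregated as many times as needed.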

A faster solution is:

g = transactions.groupby('customer_id')['Month'] 
print (pd.concat([g.min(), g.max()], axis=1, keys=['min','max'])) 
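
Whether the concat version is really faster likely depends on the pandas version and the size of the data, so it is worth measuring. The sketch below checks that both approaches give the same numbers and times them with timeit. The helper names with_agg and with_concat and the data sizes are assumptions made up for this illustration.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data; the sizes are arbitrary assumptions for illustration.
rng = np.random.default_rng(0)
transactions = pd.DataFrame({
    'customer_id': rng.integers(0, 1_000, size=100_000),
    'Month': rng.integers(1, 13, size=100_000),
})

def with_agg(df):
    # One aggregate call with a list of functions for the column.
    return df.groupby('customer_id').aggregate({'Month': ['min', 'max']})

def with_concat(df):
    # Separate min/max on the grouped column, joined side by side.
    g = df.groupby('customer_id')['Month']
    return pd.concat([g.min(), g.max()], axis=1, keys=['min', 'max'])

a = with_agg(transactions)['Month']  # drop the outer 'Month' column level
b = with_concat(transactions)
same = bool((a.to_numpy() == b.to_numpy()).all()) and a.index.equals(b.index)
print('same result:', same)

# Total seconds for 10 runs of each approach; numbers vary by machine.
print('agg   :', timeit.timeit(lambda: with_agg(transactions), number=10))
print('concat:', timeit.timeit(lambda: with_concat(transactions), number=10))
```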

Many thanks, man! – Shgidi