Python pandas: calculating column averages by group
Using pandas 0.18.1 and Python 2.7.6, suppose we have the following table:
ID,FROM_YEAR,FROM_MONTH,YEARMONTH,AREA,AREA2
1,2015,1,201501,200,100
1,2015,2,201502,200,100
1,2015,3,201503,200,100
1,2015,4,201504,200,100
1,2015,5,201505,200,100
1,2015,6,201506,200,100
1,2015,7,201507,200,100
1,2015,8,201508,200,100
1,2015,9,201509,200,100
1,2015,10,201510,200,100
1,2015,11,201511,200,100
1,2015,12,201512,200,100
1,2016,1,201601,100,200
1,2016,2,201602,100,200
1,2016,3,201603,100,200
1,2016,4,201604,100,200
1,2016,5,201605,100,200
1,2016,6,201606,100,200
1,2016,7,201607,100,200
1,2016,8,201608,100,200
1,2016,9,201609,100,200
1,2016,10,201610,100,200
1,2016,11,201611,100,200
1,2016,12,201612,100,200
Is there a way to do the same thing in pandas as the following MySQL query? (A merge might work, but is there a way to avoid an expensive merge/join in pandas?)
SELECT
ID,
FROM_YEAR,
'A' AS TYPE,
AVG(AREA) AS AREA,
AVG(AREA2) AS AREA2
FROM table GROUP BY ID,FROM_YEAR
UNION ALL
SELECT
ID,
FROM_YEAR,
'B' AS TYPE,
AVG(AREA) AS AREA,
AVG(AREA2) AS AREA2
FROM table GROUP BY ID,FROM_YEAR;
The goal is to get the yearly averages of the AREA and AREA2 columns in the following format:
ID,FROM_YEAR,TYPE,AREA,AREA2
1,2015,A,200,100
1,2016,A,100,200
1,2015,B,200,100
1,2016,B,100,200
Could any guru advise?
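For reference, here is a minimal sketch of one way the UNION ALL query above might be reproduced without a merge: aggregate once with groupby, then stack one tagged copy per TYPE value with pd.concat. The variable names (rows, yearly, result) are only illustrative, the sample frame is rebuilt inline from the table above, and the sketch has not been checked against pandas 0.18.1 specifically.

```python
import pandas as pd

# Rebuild the sample table from the question: 12 months per year for ID 1.
rows = [
    (1, year, month, year * 100 + month, area, area2)
    for year, area, area2 in [(2015, 200, 100), (2016, 100, 200)]
    for month in range(1, 13)
]
df = pd.DataFrame(
    rows,
    columns=["ID", "FROM_YEAR", "FROM_MONTH", "YEARMONTH", "AREA", "AREA2"],
)

# Equivalent of AVG(AREA), AVG(AREA2) ... GROUP BY ID, FROM_YEAR.
yearly = df.groupby(["ID", "FROM_YEAR"], as_index=False)[["AREA", "AREA2"]].mean()

# Equivalent of UNION ALL: one copy of the aggregate per TYPE label.
result = pd.concat(
    [yearly.assign(TYPE=t) for t in ["A", "B"]],
    ignore_index=True,
)

# Reorder the columns to match the desired output.
result = result[["ID", "FROM_YEAR", "TYPE", "AREA", "AREA2"]]
print(result)
```

Because the aggregation runs only once and the copies are stacked row-wise, no join on keys is involved; the averages come back as floats (200.0 rather than 200).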
================================= A follow-up question =================
Thanks for the answers! I just ran into another problem, this time with a rolling 12-month average:
Desired output:
ID,FROM_YEAR,FROM_MONTH,YEARMONTH,AREA,AREA2
1,2015,1,201501,NULL,NULL
1,2015,2,201502,NULL,NULL
1,2015,3,201503,NULL,NULL
1,2015,4,201504,NULL,NULL
1,2015,5,201505,NULL,NULL
1,2015,6,201506,NULL,NULL
1,2015,7,201507,NULL,NULL
1,2015,8,201508,NULL,NULL
1,2015,9,201509,NULL,NULL
1,2015,10,201510,NULL,NULL
1,2015,11,201511,NULL,NULL
1,2015,12,201512,200,100
The following code
agg=df.groupby(['ID','FROM_YEAR'])[['AREA','AREA2']].rolling(window=12).mean()
only produces the result below, in which FROM_MONTH and YEARMONTH are missing:
ID,FROM_YEAR,AREA,AREA2
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,NULL,NULL
1,2015,200,100
Can anyone enlighten me? Thanks!
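One possible approach, sketched here under the assumption that the rolling mean should simply overwrite AREA and AREA2 while every other column stays as it is: compute the rolling mean per group with transform, which returns values aligned to the original row index, and write them back into a copy of the frame. The names out and grouped are illustrative, and this swaps the groupby(...).rolling(...) call above for a transform-based variant.

```python
import pandas as pd

# Rebuild the sample table from the question: 12 months per year for ID 1.
rows = [
    (1, year, month, year * 100 + month, area, area2)
    for year, area, area2 in [(2015, 200, 100), (2016, 100, 200)]
    for month in range(1, 13)
]
df = pd.DataFrame(
    rows,
    columns=["ID", "FROM_YEAR", "FROM_MONTH", "YEARMONTH", "AREA", "AREA2"],
)

out = df.copy()
grouped = df.groupby(["ID", "FROM_YEAR"])

# transform keeps the original index, so FROM_MONTH and YEARMONTH survive.
for col in ["AREA", "AREA2"]:
    out[col] = grouped[col].transform(lambda s: s.rolling(window=12).mean())

print(out.head(12))
```

The first 11 rows of each (ID, FROM_YEAR) group come out as NaN because the 12-row window is not yet full; only the December row carries a value.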
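Nice use of a list comprehension to get the type column here! +1 – pansen
@pansen Thanks! Appreciate the comment. – Psidom
Thanks for your elegant answer, Psidom! I have one more question about how to add another column, and I have updated the question. Could you enlighten me? – Chubaka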