2016-03-31

I want to use pandas in Python 2.7 to compute a rolling mean of m_tax, in order to analyze the time series data from this web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm).

datum m_ta m_tax  m_taxd m_tan  m_tand 
------- ----- ----- ---------- ----- ---------- 
1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10 
1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15 
1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01 
1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23 
1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05 
1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17 
1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04 
1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29 
.... 

Here is the code I tried:

pd.rolling_mean(df.resample("1M", fill_method="ffill"), window=60, min_periods=1, center=True).mean() 

and this is the result I got:

m_ta   11.029173 
m_tax   17.104283 
m_tan   4.848637 
month   6.499500 
monthly_mean 11.030405 
monthly_std  1.836159 
m_tax%   0.083348 
m_tan%   0.023627 
dtype: float64 

I also tried it another way:

s = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/1900', periods=1000)) 
s = s.cumsum() 
r = s.rolling(window=60) 
r.mean() 

and this is my result:

1900-01-01   NaN 
1900-01-02   NaN 
1900-01-03   NaN 
1900-01-04   NaN 
1900-01-05   NaN 
1900-01-06   NaN 
1900-01-07   NaN 
1900-01-08   NaN 
... 

So I am confused here. Which one should I use? Can someone give me an idea? Thanks!

Answer

As of version 0.18.0, rolling() and resample() behave similarly to groupby(), and the old function-style usage (for example pd.rolling_mean()) is deprecated.

What's new in pandas version 0.18.0

rolling()/expanding() in pandas version 0.18.0

resample() in pandas version 0.18.0
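
As a rough illustration of the method-style API described in those notes, the sketch below rebuilds the eight rows shown in the question and chains resample() and rolling() as methods. The frequency alias "MS" (month start) and the window of 3 are assumptions chosen for this small sample, not something the question specifies; the question's own window was 60, and its "1M" alias labels rows at month end instead.

```python
import pandas as pd

# First eight rows of the numeric columns from the question's table.
df = pd.DataFrame(
    {
        "m_ta": [-4.7, -2.1, 5.8, 11.6, 16.8, 21.0, 22.4, 20.7],
        "m_tax": [5.0, 3.5, 13.5, 18.2, 22.5, 24.8, 27.4, 25.9],
        "m_tan": [-12.2, -7.9, 0.6, 7.4, 12.2, 14.6, 16.9, 14.7],
    },
    # Assumption: each month is labelled by its first day ("MS").
    index=pd.date_range("1901-01-01", periods=8, freq="MS"),
)
df.index.name = "datum"

# resample() is deferred, so aggregate explicitly before rolling.
monthly = df.resample("MS").mean()

# Centered 3-month rolling mean; min_periods=1 keeps the edge rows.
smoothed = monthly.rolling(window=3, center=True, min_periods=1).mean()

print(smoothed["m_tax"])
```

Calling an aggregation such as .mean() right after resample() is what the deferred-operation design expects; rolling() is then applied to an ordinary DataFrame.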

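I can't tell exactly what result you want, but maybe something like this is what you are after? (You can see a warning message below, although I'm not sure what triggers it here.)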

>>> df 

      m_ta m_tax  m_taxd m_tan  m_tand 
datum             
1901-01-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10 
1901-02-01 -2.1 3.5 1901-02-06 -7.9 1901-02-15 
1901-03-01 5.8 13.5 1901-03-20 0.6 1901-03-01 
1901-04-01 11.6 18.2 1901-04-10 7.4 1901-04-23 
1901-05-01 16.8 22.5 1901-05-31 12.2 1901-05-05 
1901-06-01 21.0 24.8 1901-06-03 14.6 1901-06-17 
1901-07-01 22.4 27.4 1901-07-30 16.9 1901-07-04 
1901-08-01 20.7 25.9 1901-08-01 14.7 1901-08-29 

>>> df.resample("1M").rolling(3,center=True,min_periods=1).mean() 

/Users/john/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: FutureWarning: .resample() is now a deferred operation 
use .resample(...).mean() instead of .resample(...) 
    if __name__ == '__main__': 

       m_ta  m_tax  m_tan 
datum          
1901-01-31 -3.400000 4.250000 -10.050000 
1901-02-28 -0.333333 7.333333 -6.500000 
1901-03-31 5.100000 11.733333 0.033333 
1901-04-30 11.400000 18.066667 6.733333 
1901-05-31 16.466667 21.833333 11.400000 
1901-06-30 20.066667 24.900000 14.566667 
1901-07-31 21.366667 26.033333 15.400000 
1901-08-31 21.550000 26.650000 15.800000
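
On the two results in the question, a small sketch may help show where the difference comes from. It assumes the same kind of synthetic daily series as in the question (seeded here only to make runs repeatable) and compares the default min_periods with min_periods=1. It also shows that a trailing .mean() collapses the rolled series into a single number, which appears to be why the first attempt printed one value per column.

```python
import numpy as np
import pandas as pd

# Synthetic daily series like the one in the question (seed is an assumption).
rng = np.random.RandomState(0)
s = pd.Series(
    rng.randn(1000),
    index=pd.date_range("1900-01-01", periods=1000),
).cumsum()

# With an integer window, min_periods defaults to the window size,
# so the first 59 positions have too few observations and are NaN.
strict = s.rolling(window=60).mean()

# min_periods=1 averages whatever is available, so no leading NaN.
lenient = s.rolling(window=60, min_periods=1).mean()

# A trailing .mean() reduces the whole rolled series to one number.
overall = lenient.mean()

print(strict.isna().sum(), lenient.isna().sum())
```

So the NaN rows are expected at the start of a strict window, and dropping the final .mean() from the first attempt should give a full rolling series instead of one summary value per column.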