2016-04-14 164 views
3

Question

As this small example shows, I am trying to resample a pandas DataFrame weekly, but weekly resampling on a timedelta index fails:

import datetime 
import pandas as pd 

df = pd.DataFrame([{ 
    'A' : datetime.datetime.now() - datetime.datetime.now(), 
    'B' : 2 
},{ 
    'A' : datetime.datetime.now() - datetime.datetime.now(), 
    'B' : 3 
}]) 

df = df.set_index('A') 

df.resample('W', how="mean") 

This raises an AttributeError:

AttributeError: 'Week' object has no attribute 'nanos' 

(Note: the problem does not occur if I resample by "D".)

If I instead cast the index to datetime:

df.index = pd.to_datetime(df.index.values) 
df.resample('W', how="mean") 

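the weekly resampling works as well.
Question: is there a pandas timedelta type that does not depend on nanoseconds?
Or: is there a more elegant way than casting the timedelta index to datetime?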


Full traceback:

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/Library/Python/2.7/site-packages/pandas/core/generic.py", line 3266, in resample 
    return sampler.resample(self).__finalize__(self) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 98, in resample 
    rs = self._resample_timestamps(kind='timedelta') 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 272, in _resample_timestamps 
    self._get_binner_for_resample(kind=kind) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 122, in _get_binner_for_resample 
    self.binner, bins, binlabels = self._get_time_delta_bins(ax) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/resample.py", line 236, in _get_time_delta_bins 
    name=ax.name) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 167, in __new__ 
    closed=closed) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 235, in _generate 
    index = _generate_regular_range(start, end, periods, offset) 
    File "/Library/Python/2.7/site-packages/pandas/tseries/tdi.py", line 895, in _generate_regular_range 
    stride = offset.nanos 
AttributeError: 'Week' object has no attribute 'nanos' 
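
The traceback passes through tdi.py, which suggests the index is being treated as a timedelta index rather than a datetime index. A minimal sketch to check this, assuming fixed dates in place of datetime.datetime.now() so the result is reproducible (the dates and values are made up for illustration):

```python
import datetime
import pandas as pd

# Subtracting two datetime objects yields a datetime.timedelta, not a datetime.
delta_a = datetime.datetime(2016, 4, 14) - datetime.datetime(2016, 4, 1)
delta_b = datetime.datetime(2016, 4, 14) - datetime.datetime(2016, 3, 1)

df = pd.DataFrame({'A': [delta_a, delta_b], 'B': [2, 3]}).set_index('A')

# pandas stores these values in a TimedeltaIndex, so resample takes the
# timedelta code path seen in the traceback.
print(type(df.index).__name__)
```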

Versions

>>> pd.__version__ 
'0.16.2' 
>>> np.__version__ 
'1.10.1' 

Answers

0

I believe the difference is that pandas uses numpy's datetime64, which is not the same thing as Python's datetime class. When you call

df.index = pd.to_datetime(df.index.values) 

you are casting the datetime.datetime objects you created into the numpy.datetime64 objects that resample needs as its argument.

+0

Sure, but how does this answer my question? –

+0

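Well, your question was: "Question: is there a pandas timedelta type that does not depend on nanoseconds?" To which I answered: yes, numpy.datetime64. You also asked: "Or: is there a more elegant way than casting the timedelta index to datetime?" The answer is: no, because numpy.datetime64 has no equivalent of datetime.datetime.now(). See http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html – kingledion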
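
For what it's worth, one way to stay on the timedelta index without casting may be to pass a fixed seven-day width instead of the calendar offset 'W'. The following is only a sketch under assumptions: the sample values are made up, it uses the newer `.resample(...).mean()` form rather than the `how=` keyword from the question, and it has not been checked against pandas 0.16.2.

```python
import pandas as pd

# Hypothetical sample data: three observations at 0, 3 and 8 days.
df = pd.DataFrame(
    {'B': [2, 3, 5]},
    index=pd.to_timedelta([0, 3, 8], unit='D'),
)

# A fixed seven-day Timedelta has a well-defined length in nanoseconds,
# unlike the calendar-based 'W' offset, so it can be used as the bin width.
weekly = df.resample(pd.Timedelta(days=7)).mean()
print(weekly)
```

Note that the bins here are measured from the smallest timedelta in the index, so they are seven-day windows rather than calendar weeks.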