2016-03-07 59 views

Subtract years from a pandas DataFrame and add them to a matrix

I have a pandas DataFrame like this:

             maturity  coupon  freq
0 2018-06-01 00:00:00       3     1
1 2017-10-01 00:00:00       2     1

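What I want is a matrix whose first column contains these dates together with the dates 1, 2, ... years before them, and whose second column contains the number of days from 2016-03-04 to each date.

Like this: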

date                 number of days remaining
2016-06-01 00:00:00   89
2016-10-01 00:00:00  211
2017-06-01 00:00:00  454
2017-10-01 00:00:00  576
2018-06-01 00:00:00  819

Please help!

Answers


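You can try building a new DataFrame: subtract a DateOffset from the maturity column, append each resulting Series to the list dfs, and then concat them. Finally, subtract the reference date from each datetime and convert the timedelta to an integer by dividing by np.timedelta64: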

import numpy as np
import pandas as pd

# reference date; df is the DataFrame from the question
d = pd.to_datetime("2016-03-04")

# subtract 0 to 4 years from column maturity with DateOffset
dfs = []
for i in range(5):
    years_before = df['maturity'] - pd.DateOffset(years=i)
    # keep only the dates after d
    dfs.append(years_before.loc[years_before > d])
df = pd.DataFrame(pd.concat(dfs, ignore_index=True))
print(df)
#     maturity
# 0 2018-06-01
# 1 2017-10-01
# 2 2017-06-01
# 3 2016-10-01
# 4 2016-06-01

# days from d to each date, converted from timedelta to integer
df['remain'] = ((df['maturity'] - d) / np.timedelta64(1, 'D')).astype(int)
# sort values by column maturity
df = df.sort_values('maturity')
print(df)
#     maturity  remain
# 4 2016-06-01      89
# 3 2016-10-01     211
# 2 2017-06-01     454
# 1 2017-10-01     576
# 0 2018-06-01     819
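
For convenience, the steps above can be collected into one self-contained sketch that also rebuilds the sample data from the question. The function name remaining_days and its max_years parameter are hypothetical, introduced only for this illustration, and it uses .dt.days rather than division by np.timedelta64 to get whole days:

```python
import pandas as pd


def remaining_days(maturities, valuation_date, max_years=5):
    """Shift each maturity back by 0..max_years-1 years and count the days left.

    Hypothetical helper: only dates strictly after valuation_date are kept.
    """
    start = pd.Timestamp(valuation_date)
    # one shifted copy of the maturity column per year offset
    shifted = [maturities - pd.DateOffset(years=i) for i in range(max_years)]
    dates = pd.concat(shifted, ignore_index=True)
    # drop dates on or before the valuation date, then sort ascending
    dates = dates[dates > start].sort_values().reset_index(drop=True)
    return pd.DataFrame({"date": dates, "remain": (dates - start).dt.days})


# sample data copied from the question
df = pd.DataFrame({
    "maturity": pd.to_datetime(["2018-06-01", "2017-10-01"]),
    "coupon": [3, 2],
    "freq": [1, 1],
})

result = remaining_days(df["maturity"], "2016-03-04")
print(result)
```

For the sample data the remain column comes out as 89, 211, 454, 576 and 819, matching the table in the question.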

I would estimate the maximum number of loop iterations like this (not thoroughly tested):

# estimate the maximum number of years => number of loops
maxYears = (df['maturity'].max() - pd.to_datetime(d)) / np.timedelta64(1, 'D') / 365.25
print(maxYears)
# approximately 2.24229979466

# int() truncates (2.999 => 2), so one year must be added;
# add one more on top, because the year length is only estimated (leap years)
maxYears = int(maxYears) + 2
print(maxYears)
# 4
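
As an alternative to the 365.25-day estimate, the loop count could be derived from calendar years. This is not the method used in the answer, only a sketch under the assumption that range(n) drives the loop: subtracting more years than the difference between the latest maturity year and the reference year always lands in an earlier calendar year, so those iterations can never produce a date after the reference date. The name loops_needed is hypothetical.

```python
import pandas as pd


def loops_needed(maturities, valuation_date):
    """Hypothetical helper: number of iterations for range(n) in the loop.

    Offsets 0..(year difference) can still land after the valuation date;
    any larger offset falls in an earlier calendar year.
    """
    start = pd.Timestamp(valuation_date)
    return max(int(maturities.max().year - start.year) + 1, 0)


# maturity column copied from the question
maturities = pd.Series(pd.to_datetime(["2018-06-01", "2017-10-01"]), name="maturity")
n = loops_needed(maturities, "2016-03-04")
print(n)

# check: the loop with n iterations finds every date after the reference date
start = pd.Timestamp("2016-03-04")
parts = [maturities - pd.DateOffset(years=i) for i in range(n)]
found = pd.concat(parts, ignore_index=True)
found = found[found > start]
print(len(found))
```

For the sample data this gives 3 iterations instead of 4, and the loop still finds all five dates from the question.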