2015-11-03 124 views

Merging two pandas DataFrames on timestamps

I have two DataFrames, df1 and df2.

df1 is

time     status 
2/2/2015 8.00 am  on time 
2/2/2015 9.00 am  canceled 
2/2/2015 10.30 am  on time 
2/2/2015 12.45 pm  on time 

df2 is

w_time     temp 
2/2/2015 8.00 am  45 
2/2/2015 8.50 am  46 
2/2/2015 9.40 am  47 
2/2/2015 10.15 am  47 
2/2/2015 10.35 am  48 
2/2/2015 12.00 pm  48 
2/2/2015 1.00 pm  49 

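Now I want to merge the two DataFrames so that the second timestamp (w_time) is always the one closest to, or equal to, the first timestamp (time).

The result should be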

time    status  w_time    temp 
2/2/2015 8.00 am on time 2/2/2015 8.00 am  45 
2/2/2015 9.00 am canceled 2/2/2015 8.50 am  46 
2/2/2015 10.30 am on time 2/2/2015 10.35 am 48 
2/2/2015 12.45 pm on time 2/2/2015 1.00 pm 49 

Please post the code you have tried so far. http://stackoverflow.com/help/on-topic – WoodChopper

Answer


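First, make sure the date columns are datetime64 columns.

# regex=False makes "." a literal dot rather than a regex wildcard
df1['time'] = pd.to_datetime(df1['time'].str.replace(".", ":", regex=False)) 
df2['w_time'] = pd.to_datetime(df2['w_time'].str.replace(".", ":", regex=False)) 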
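As a self-contained check of this conversion step, here is a minimal sketch that rebuilds df1 from the question's sample data. The explicit `format` string is an assumption that is not in the original answer: it treats the dates as month/day/year with a 12-hour clock, which the sample data (2/2/2015) cannot confirm either way.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "time": ["2/2/2015 8.00 am", "2/2/2015 9.00 am",
             "2/2/2015 10.30 am", "2/2/2015 12.45 pm"],
    "status": ["on time", "canceled", "on time", "on time"],
})

# Turn "8.00 am" into "8:00 am" (literal replacement, not a regex),
# then parse. The format is an assumption: month/day/year, 12-hour clock.
df1["time"] = pd.to_datetime(
    df1["time"].str.replace(".", ":", regex=False),
    format="%m/%d/%Y %I:%M %p",
)

print(df1.dtypes)
```

The same call applies to df2['w_time']. If the dates are actually day-first, the format string would need to change accordingly.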

If you set these as the DatetimeIndex, you can then use reindex with the 'nearest' method:

In [11]: df1 = df1.set_index("time") 

In [12]: df2 = df2.set_index("w_time", drop=False) 

In [13]: df1 
Out[13]: 
         status 
time 
2015-02-02 08:00:00 on time 
2015-02-02 09:00:00 canceled 
2015-02-02 10:30:00 on time 
2015-02-02 12:45:00 on time 

In [14]: df2 
Out[14]: 
        temp    w_time 
w_time 
2015-02-02 08:00:00 45 2015-02-02 08:00:00 
2015-02-02 08:50:00 46 2015-02-02 08:50:00 
2015-02-02 09:40:00 47 2015-02-02 09:40:00 
2015-02-02 10:15:00 47 2015-02-02 10:15:00 
2015-02-02 10:35:00 48 2015-02-02 10:35:00 
2015-02-02 12:00:00 48 2015-02-02 12:00:00 
2015-02-02 13:00:00 49 2015-02-02 13:00:00 

Then reindex df2 on df1's index:

In [15]: df2.reindex(df1.index, method='nearest') 
Out[15]: 
        temp    w_time 
time 
2015-02-02 08:00:00 45 2015-02-02 08:00:00 
2015-02-02 09:00:00 46 2015-02-02 08:50:00 
2015-02-02 10:30:00 48 2015-02-02 10:35:00 
2015-02-02 12:45:00 49 2015-02-02 13:00:00 

Then add these columns to df1, or join the result back to df1.
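The final join step could look like the sketch below, which rebuilds both frames from the question's data with the timestamps already parsed. The names `nearest`, `result`, and `alt` are illustrative only. The last part uses `pd.merge_asof` with `direction="nearest"`, which is not part of the answer above; it is shown as an alternative that should be available in pandas 0.20 and later.

```python
import pandas as pd

# df1 indexed by "time", as in the answer
df1 = pd.DataFrame({
    "time": pd.to_datetime(["2015-02-02 08:00", "2015-02-02 09:00",
                            "2015-02-02 10:30", "2015-02-02 12:45"]),
    "status": ["on time", "canceled", "on time", "on time"],
}).set_index("time")

# df2 indexed by "w_time", keeping "w_time" as a column too (drop=False)
df2 = pd.DataFrame({
    "w_time": pd.to_datetime(["2015-02-02 08:00", "2015-02-02 08:50",
                              "2015-02-02 09:40", "2015-02-02 10:15",
                              "2015-02-02 10:35", "2015-02-02 12:00",
                              "2015-02-02 13:00"]),
    "temp": [45, 46, 47, 47, 48, 48, 49],
}).set_index("w_time", drop=False)

# For each df1 timestamp, pick the df2 row with the nearest w_time
nearest = df2.reindex(df1.index, method="nearest")

# Join the matched columns back onto df1 (both share the "time" index)
result = df1.join(nearest).reset_index()
print(result)

# Alternative, not from the answer: merge_asof on sorted key columns
alt = pd.merge_asof(
    df1.reset_index(),
    df2.reset_index(drop=True),
    left_on="time",
    right_on="w_time",
    direction="nearest",
)
print(alt)
```

Both approaches need the timestamps sorted in ascending order. `reindex` with `method='nearest'` requires a monotonic index, and `merge_asof` requires both key columns to be sorted.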