2016-04-28 64 views

Pandas scatter plot with time of day?

I would like to make a scatter plot of duration(mins) against start time (that is, the time of day, regardless of the date).

The CSV file commute.csv looks like this:

date, prediction, start, stop, duration, duration(mins), Day of week 
14/08/2015, , 08:02:00, 08:22:00, 00:20:00, 20, Fri 
25/08/2015, , 18:16:00, 18:27:00, 00:11:00, 11, Tue 
26/08/2015, , 08:26:00, 08:46:00, 00:20:00, 20, Wed 
26/08/2015, , 18:28:00, 18:46:00, 00:18:00, 18, Wed 

I can import the CSV file as follows:

import pandas as pd 
times = pd.read_csv('commute.csv', parse_dates=[[0, 2], [0, 3]], dayfirst=True) 
times.head() 

Output:

date_start date_stop prediction duration duration(mins) Day of week 
0 2015-08-14 08:02:00 2015-08-14 08:22:00 NaN 00:20:00 20 Fri 
1 2015-08-25 18:16:00 2015-08-25 18:27:00 NaN 00:11:00 11 Tue 
2 2015-08-26 08:26:00 2015-08-26 08:46:00 NaN 00:20:00 20 Wed 
3 2015-08-26 18:28:00 2015-08-26 18:46:00 NaN 00:18:00 18 Wed 
4 2015-08-28 08:37:00 2015-08-28 08:52:00 NaN 00:15:00 15 Fri 
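A note on the import step: the nested-list form of parse_dates used in the question is, as far as I know, deprecated in recent pandas releases, so it may warn or fail depending on the installed version. A minimal sketch of an alternative, assuming the file has exactly the layout shown in the question (the inline csv_text stands in for commute.csv, and skipinitialspace is my addition to cope with the spaces after the commas):

```python
import io

import pandas as pd

# Stand-in for commute.csv, copied from the sample rows in the question.
csv_text = """date, prediction, start, stop, duration, duration(mins), Day of week
14/08/2015, , 08:02:00, 08:22:00, 00:20:00, 20, Fri
25/08/2015, , 18:16:00, 18:27:00, 00:11:00, 11, Tue
26/08/2015, , 08:26:00, 08:46:00, 00:20:00, 20, Wed
26/08/2015, , 18:28:00, 18:46:00, 00:18:00, 18, Wed
"""

# skipinitialspace drops the blank after each comma, so the column
# names come out as "start", "stop", ... rather than " start", " stop".
times = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Build date_start / date_stop by joining the date and time strings
# and parsing them with an explicit day-first format.
for col in ("start", "stop"):
    times["date_" + col] = pd.to_datetime(
        times["date"] + " " + times[col], format="%d/%m/%Y %H:%M:%S"
    )
```

With a real file, io.StringIO(csv_text) would be replaced by the path 'commute.csv'. Unlike the call in the question, this keeps the original date, start and stop columns alongside the combined ones.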

I am now struggling to plot duration(mins) against the start time (without the date). Please help!

@jezrael has been a great help... One of the comments on issue 8113 suggests using a variant of df.plot(x=x, y=y, style='.'). I tried it:

times.plot(x='start', y='duration(mins)', style='.') 

However, it does not produce the plot I intended. The output is wrong because the x-axis is stretched so that every data point is the same distance apart in x.

Is there a way to plot against time?

Answer


I think the problem is with using time in a scatter graph (see issue 8113).

But you can use the hour:

df['hours'] = df.date_start.dt.hour 
print(df)
      date_start   date_stop prediction duration \ 
0 2015-08-14 08:02:00 2015-08-14 08:22:00   NaN 00:20:00 
1 2015-08-25 18:16:00 2015-08-25 18:27:00   NaN 00:11:00 
2 2015-08-26 08:26:00 2015-08-26 08:46:00   NaN 00:20:00 
3 2015-08-26 18:28:00 2015-08-26 18:46:00   NaN 00:18:00 

    duration(mins) Dayofweek hours 
0    20  Fri  8 
1    11  Tue  18 
2    20  Wed  8 
3    18  Wed  18 

df.plot.scatter(x='hours', y='duration(mins)') 
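
One limitation of the hour-only column is that 08:02 and 08:26 both land on x = 8. A minimal sketch that keeps the minutes as a fraction of an hour, assuming the date_start column from the question (the hour_frac column and the plot_hour_frac helper are names made up for this illustration):

```python
import pandas as pd

# Sample rows taken from the output shown in the question.
df = pd.DataFrame({
    "date_start": pd.to_datetime([
        "2015-08-14 08:02:00",
        "2015-08-25 18:16:00",
        "2015-08-26 08:26:00",
        "2015-08-26 18:28:00",
    ]),
    "duration(mins)": [20, 11, 20, 18],
})

# Hour of day as a float, so the minutes are not discarded.
df["hour_frac"] = df["date_start"].dt.hour + df["date_start"].dt.minute / 60


def plot_hour_frac(frame):
    """Scatter duration against fractional hour of day (needs matplotlib)."""
    ax = frame.plot.scatter(x="hour_frac", y="duration(mins)")
    ax.set_xlim(0, 24)  # show the whole day on the x-axis
    return ax
```

Calling plot_hour_frac(df) draws the scatter plot; the x-axis is still in plain hours, just no longer rounded down.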


Another solution is to count the time in minutes:

df['time'] = df.date_start.dt.hour * 60 + df.date_start.dt.minute 
print(df)
      date_start   date_stop prediction duration \ 
0 2015-08-14 08:02:00 2015-08-14 08:22:00   NaN 00:20:00 
1 2015-08-25 18:16:00 2015-08-25 18:27:00   NaN 00:11:00 
2 2015-08-26 08:26:00 2015-08-26 08:46:00   NaN 00:20:00 
3 2015-08-26 18:28:00 2015-08-26 18:46:00   NaN 00:18:00 

    duration(mins) Dayofweek time 
0    20  Fri 482 
1    11  Tue 1096 
2    20  Wed 506 
3    18  Wed 1108 

df.plot.scatter(x='time', y='duration(mins)') 
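
With minutes since midnight on the x-axis, tick values such as 482 or 1096 are hard to read. A possible refinement, sketched here under the assumption that matplotlib is the plotting backend, relabels the ticks as HH:MM (minutes_to_label and plot_by_time_of_day are hypothetical helper names, not part of pandas or matplotlib):

```python
import pandas as pd

# Sample rows taken from the output shown in the question.
df = pd.DataFrame({
    "date_start": pd.to_datetime([
        "2015-08-14 08:02:00",
        "2015-08-25 18:16:00",
        "2015-08-26 08:26:00",
        "2015-08-26 18:28:00",
    ]),
    "duration(mins)": [20, 11, 20, 18],
})

# Minutes since midnight, exactly as in the answer.
df["time"] = df["date_start"].dt.hour * 60 + df["date_start"].dt.minute


def minutes_to_label(minutes, pos=None):
    """Format minutes since midnight as HH:MM.

    The (value, pos) signature is what matplotlib's FuncFormatter expects.
    """
    minutes = int(round(minutes))
    return "{:02d}:{:02d}".format(minutes // 60, minutes % 60)


def plot_by_time_of_day(frame):
    """Scatter duration against time of day with HH:MM tick labels."""
    from matplotlib.ticker import FuncFormatter, MultipleLocator

    ax = frame.plot.scatter(x="time", y="duration(mins)")
    ax.set_xlim(0, 24 * 60)                            # one full day
    ax.xaxis.set_major_locator(MultipleLocator(180))   # a tick every 3 hours
    ax.xaxis.set_major_formatter(FuncFormatter(minutes_to_label))
    ax.set_xlabel("start time")
    return ax
```

Calling plot_by_time_of_day(df) gives the same points as the plot above; only the axis labelling changes, so the numeric spacing between points still reflects the real time of day.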
