2016-11-30 54 views
0

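Python SumIfs for a list of lists of dates

I have a list of lists made up of dates in Excel float format (one per minute since July 5, 1996), each paired with an integer value, like this: [[datetime, integer], ...]. I need to create a new list containing every date (without hours or minutes) together with the sum of the values of all datetimes that fall within that date. In other words, for each day d = math.floor(listolists[x][0]), what is the sum of the values over all entries where listolists[x][0] >= d and listolists[x][0] < d + 1? Thanks.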
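
To make the request concrete, here is a minimal sketch on made-up data that applies the condition literally. The name `sample` and its values are invented for illustration; the only assumption is that the integer part of each Excel serial number identifies the day.

```python
import math

# Hypothetical sample rows: [excel_serial_datetime, integer_value]
sample = [
    [35251.0, 3],     # day 35251, 00:00
    [35251.5, 4],     # day 35251, 12:00
    [35252.25, 10],   # day 35252, 06:00
]

# Every distinct day present in the data
days = sorted({math.floor(t) for t, _ in sample})

# For each day d, sum the values whose datetime t satisfies d <= t < d + 1
expected = [[d, sum(v for t, v in sample if d <= t < d + 1)] for d in days]

print(expected)  # [[35251, 7], [35252, 10]]
```

This rescans the whole list once per day, so it only illustrates the desired output; it is not an efficient solution for years of by-minute data.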

+2

What do you mean by "datetimes within that date"? – rassar

+0

I think she wants to sum all of the by-minute values for a given day. – blacksite

Answers

0

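Since you didn't provide any actual data (only the data structure you're using, a nested list), I've created some dummy data below to demonstrate how to solve a SUMIFS-type problem in Python.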

from datetime import datetime 
import numpy as np 
import pandas as pd 

dates_list = [] 

# just take one month as an example of how to group by day 
year = 2015 
month = 12 

# generate similar data to what you might have 
for day in range(1, 32):
    for hour in range(1, 24):
        for minute in range(1, 60):
            dates_list.append([datetime(year, month, day, hour, minute), np.random.randint(20)])

# unpack these nested list pairs so we have all of the dates in 
# one list, and all of the values in the other 
# this makes it easier for pandas later 
dates, values = zip(*dates_list) 

# to eventually group by day, we need to forget about all intra-day data, e.g. 
# different hours and minutes. we only care about the data for a given day, 
# not the by-minute observations. So, let's set all of the intra-day values to 
# some constant for easier rolling-up of these dates. 
new_dates = [] 

for d in dates: 
    new_d = d.replace(hour=0, minute=0, second=0, microsecond=0)
    new_dates.append(new_d) 

# throw the new dates and values into a pandas.DataFrame object 
df = pd.DataFrame({'new_dates': new_dates, 'values': values}) 

# here's the SUMIFS function you're looking for 
grouped = df.groupby('new_dates')['values'].sum() 

Let's look at the result:

>>> print(grouped.head()) 
new_dates 
2015-12-01 12762 
2015-12-02 13292 
2015-12-03 12857 
2015-12-04 12762 
2015-12-05 12561 
Name: values, dtype: int64 

Edit: if you want the grouped data back in nested-list format, just do this:

new_list = [[date, value] for date, value in zip(grouped.index, grouped)] 
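
The example above works on `datetime` objects, whereas the question describes Excel serial floats. One possible way to bridge the two is sketched below. The helper `excel_serial_to_datetime` and the sample rows are hypothetical, and the sketch assumes the workbook uses Excel's default 1900 date system, for which 1899-12-30 is the conventional day zero for serials after February 1900.

```python
from datetime import datetime, timedelta

# Conventional day zero for Excel's default (1900) date system.
# Assumption: the workbook does NOT use the 1904 date system.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Hypothetical helper: convert an Excel serial number to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


# Made-up rows in the [[excel_float, integer], ...] layout from the question
listolists = [[35251.5, 4], [35252.25, 10]]

# Same shape as dates_list in the answer: [[datetime, value], ...]
dates_list = [[excel_serial_to_datetime(s), v] for s, v in listolists]

print(dates_list[0][0])  # 1996-07-05 12:00:00
```

Real by-minute fractions are not exactly representable as floats, so the converted timestamps may be off by a few microseconds. That does not matter for grouping by day unless a value sits right on midnight, in which case rounding to the nearest minute first may be safer.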

0

Thanks, everyone. This is the simplest code I could come up with that doesn't need pandas:

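import math

# Truncate each Excel serial datetime to its whole-day number.
# Note: this modifies listolist in place.
for row in listolist:
    for k in (0, 1):
        row[k] = math.floor(float(row[k]))

# Collect the values observed on each day.
date = {}
for d, v in listolist:
    if d in date:
        date[d].append(v)
    else:
        date[d] = [v]

# Sum the collected values per day.
result = [(d, sum(v)) for d, v in date.items()]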
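
Because the data is described as one row per minute in chronological order, a single-pass variant with `itertools.groupby` is also possible. This is a sketch under that assumption, not the poster's code: `groupby` only merges adjacent rows with the same key, so unsorted input would need sorting first. The name `rows_sorted` and its values are made up for illustration.

```python
import math
from itertools import groupby

# Hypothetical input, already sorted by Excel serial datetime
rows_sorted = [[35251.0, 3], [35251.5, 4], [35252.25, 10], [35253.75, 1]]

# Group consecutive rows by whole-day number and sum each group's values
result = [
    (day, sum(value for _, value in rows))
    for day, rows in groupby(rows_sorted, key=lambda row: math.floor(row[0]))
]

print(result)  # [(35251, 7), (35252, 10), (35253, 1)]
```

Unlike the dictionary version, this leaves the input list unmodified and returns the days in their original order.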