Search a CSV for matching fields and use the initial date

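I am trying to search a CSV file for rows with duplicate device names. The output should record the date of the first matching row and the date of the last matching row. I need some help with the logic for removing duplicate device names from the CSV file while also recording when each device was first and last seen.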

import time as epoch 

# AlertTime, DeviceName, Status 
Input = [['14/08/2016 13:00', 'device-A', 'UP'], ['14/08/2016 13:15', 'device-B', 'DOWN'], ['15/08/2016 17:30', 'device-A', 'UP']] 

# FirstSeen, LastSeen, DeviceName, Status 
Output = [] 

# Last 48 hours 
now = epoch.time() 
cutoff = now - (172800) 

for i in Input: 
    AlertTime = epoch.mktime(epoch.strptime(i[0], '%d/%m/%Y %H:%M')) 
    if AlertTime > cutoff: 
        Result = [i[0], i[0], i[1], i[2]]
        Output.append(Result)

print(Output)

Output (3 items):

[['14/08/2016 13:00', '14/08/2016 13:00', 'device-A', 'UP'], ['14/08/2016 13:15', '14/08/2016 13:15', 'device-B', 'DOWN'], ['15/08/2016 17:30', '15/08/2016 17:30', 'device-A', 'UP']]

Desired output (2 items):

[['14/08/2016 13:15', '14/08/2016 13:15', 'device-B', 'DOWN'], ['14/08/2016 13:00', '15/08/2016 17:30', 'device-A', 'UP']]

Source

2016-08-15 zeepi

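Use a dictionary with 'device' as the key and '(FirstSeen, LastSeen, DeviceName, Status)' as the value. –

@VedangMehta Maybe you could omit the 'DeviceName' field, since it is already the key? Otherwise, I fully agree. – bdvll

@bdvll You are absolutely right. –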

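You can use an OrderedDict to preserve the order in which devices are first seen in the CSV file. Because each device name is a dictionary key, duplicates are removed automatically.

The following works by trying to update an existing dictionary entry. If the entry does not exist yet, Python raises a KeyError exception, and in that case a new entry is added with the same first and last alert time. When an entry is updated, the existing first_seen is kept and the entry takes the most recent alert_time and status. This assumes the rows are in chronological order. Finally, the dictionary is converted into the required output format: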

from collections import OrderedDict 

# AlertTime, DeviceName, Status 
input_data = [['14/08/2016 13:00', 'device-A', 'UP'], ['14/08/2016 13:15', 'device-B', 'DOWN'], ['15/08/2016 17:30', 'device-A', 'UP']] 

entries = OrderedDict() 

for alert_time, device_name, status in input_data: 
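    try:
        # Known device: keep first_seen, refresh last_seen and status
        entries[device_name] = [entries[device_name][0], alert_time, status]
    except KeyError:
        # New device: first_seen and last_seen start out identical
        entries[device_name] = [alert_time, alert_time, status]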

# Convert the dictionary of entries into the required format   
output_data = [[device_name, first_seen, last_seen, status] for device_name, [first_seen, last_seen, status] in entries.items()] 

print(output_data)

This gives you the following output:

[['device-A', '14/08/2016 13:00', '15/08/2016 17:30', 'UP'], ['device-B', '14/08/2016 13:15', '14/08/2016 13:15', 'DOWN']]
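
For reference, here is a minimal sketch of the same idea applied to actual CSV text and emitting the column order the question asked for (FirstSeen, LastSeen, DeviceName, Status). The `csv_text` sample and the `summarise` helper are hypothetical names introduced for illustration. It assumes Python 3.7+, where a plain dict keeps insertion order, and that the rows are already in chronological order.

```python
import csv
import io

# Hypothetical CSV content; the columns are AlertTime, DeviceName, Status.
csv_text = """14/08/2016 13:00,device-A,UP
14/08/2016 13:15,device-B,DOWN
15/08/2016 17:30,device-A,UP
"""


def summarise(rows):
    """Return [first_seen, last_seen, device_name, status] per device."""
    entries = {}  # plain dicts keep insertion order on Python 3.7+
    for alert_time, device_name, status in rows:
        if device_name in entries:
            # Seen before: keep first_seen, refresh last_seen and status.
            entries[device_name][1] = alert_time
            entries[device_name][2] = status
        else:
            # First sighting: first_seen and last_seen are the same.
            entries[device_name] = [alert_time, alert_time, status]
    return [[first, last, name, status]
            for name, (first, last, status) in entries.items()]


# csv.reader accepts any iterable of lines; an open file object works too.
output_data = summarise(csv.reader(io.StringIO(csv_text)))
print(output_data)
```

To read from disk instead, the `io.StringIO(csv_text)` argument could be swapped for a file opened with `open(path, newline='')`.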

Source

2016-08-15 10:50:19

Thanks Martin, I appreciate your help. I ended up using your approach. – zeepi

As Vedang Mehta said in the comments, you can use a dictionary to store the data.

# Reuses Input, epoch and cutoff from the question's code
my_dict = {}
for i in Input:
    AlertTime = epoch.mktime(epoch.strptime(i[0], '%d/%m/%Y %H:%M'))
    if AlertTime > cutoff:
        # if we have seen this device before, update it
        if i[1] in my_dict:
            my_dict[i[1]] = (my_dict[i[1]][0], i[0], i[2])
        # if we haven't seen it, add it
        else:
            my_dict[i[1]] = (i[0], i[0], i[2])

After this, every device is stored in my_dict as a (first_seen, last_seen, status) tuple.
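
The approaches in this thread assume the rows arrive in chronological order. If that is not guaranteed, one option is to parse the timestamps and compare them, as in the sketch below. The `first_last_seen` function, its `now` and `window` parameters, and the shuffled sample rows are hypothetical and only illustrate the idea. A fixed `now` is passed in so the 48-hour cutoff is reproducible.

```python
from datetime import datetime, timedelta

FMT = '%d/%m/%Y %H:%M'


def first_last_seen(rows, now, window=timedelta(hours=48)):
    """Return [first_seen, last_seen, device_name, status] per device,
    considering only alerts newer than now - window."""
    cutoff = now - window
    seen = {}
    for alert_time, device_name, status in rows:
        when = datetime.strptime(alert_time, FMT)
        if when <= cutoff:
            continue  # older than the window, ignore
        if device_name not in seen:
            seen[device_name] = [when, when, status]
            continue
        entry = seen[device_name]
        if when < entry[0]:
            entry[0] = when  # earlier sighting than any so far
        if when >= entry[1]:
            entry[1] = when  # later sighting: also take its status
            entry[2] = status
    return [[first.strftime(FMT), last.strftime(FMT), name, status]
            for name, (first, last, status) in seen.items()]


# Same data as the question, deliberately out of order.
rows = [['15/08/2016 17:30', 'device-A', 'UP'],
        ['14/08/2016 13:15', 'device-B', 'DOWN'],
        ['14/08/2016 13:00', 'device-A', 'UP']]

result = first_last_seen(rows, now=datetime(2016, 8, 16, 9, 0))
print(result)
```

Comparing `datetime` objects rather than the raw strings matters here, because strings in day/month/year format do not sort chronologically.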

Source

2016-08-15 08:53:32 bdvll

Thanks bdvll for your effort – zeepi
