我建议你坚持下去,一旦分类保存在你喜欢的任何可用格式的数据,但使用sortedtcontainersSortedDict将允许你这样做的对比中为log N时间,一旦你有一个排序的集合,由开始整理IP:
import csv
from sortedcontainers import sorteddict
with open("ips.csv") as f:
ips = ["192.168.43.102", "10.10.145.100", "192.168.1.1", "192.168.43.99","127.0.0.1"]
reader = csv.reader(f)
# Use start ip as the key, creating tuple or using netaddr to turn into an int
sorted_dict = sorteddict.SortedDict((tuple(map(int, sip.split("."))),(eip, rnge)) for sip, eip, rnge in reader)
for ip in ips:
# do the same for the ip you want to search for
ip = tuple(map(int, ip.split(".")))
# bisect to see where the ip would land
ind = sorted_dict.bisect(ip) - 1
start_ip = sorted_dict.iloc[ind]
end_ip = tuple(map(int, sorted_dict[sorted_dict.iloc[ind]][0].split(".")))
print(start_ip, ip, end_ip)
print(start_ip <= ip <= end_ip)
如果我们运行一个测试文件的代码:
In [5]: !cat ips.csv
192.168.43.100,192.168.43.130,foo
192.168.27.1,192.168.27.12,foobar
192.168.1.1,192.168.1.98,bar
192.168.43.131,192.168.43.140,bar
10.10.131.10,10.10.131.15,barf
10.10.145.10,10.10.145.100,foob
In [6]: import csv
In [7]: from sortedcontainers import sorteddict
In [8]: with open("ips.csv") as f:
...: ips = ["192.168.43.102", "10.10.145.100", "192.168.1.1", "192.168.43.99","127.0.0.1"]
...: reader = csv.reader(f)
...: sorted_dict = sorteddict.SortedDict((tuple(map(int, sip.split("."))),(eip, rnge)) for sip, eip, rnge in reader)
...: for ip in ips:
...: ip = tuple(map(int, ip.split(".")))
...: ind = sorted_dict.bisect(ip) - 1
...: start_ip = sorted_dict.iloc[ind]
...: end_ip = tuple(map(int, sorted_dict[sorted_dict.iloc[ind]][0].split(".")))
...: print(start_ip,ip, end_ip)
...: print(start_ip <= ip <= end_ip)
...:
(192, 168, 43, 100) (192, 168, 43, 102) (192, 168, 43, 130)
True
(10, 10, 145, 10) (10, 10, 145, 100) (10, 10, 145, 100)
True
(192, 168, 1, 1) (192, 168, 1, 1) (192, 168, 1, 98)
True
(192, 168, 27, 1) (192, 168, 43, 99) (192, 168, 27, 12)
False
(10, 10, 145, 10) (127, 0, 0, 1) (10, 10, 145, 100)
False
你也可以修改bisect_right只有合作nsider的第一要素,并使用常用的Python列表:
def bisect_right(a, x, lo=0, hi=None):
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi) // 2
if x < a[mid][0]:
hi = mid
else:
lo = mid + 1
return lo
with open("ips.csv") as f:
ips = ["192.168.43.102", "10.10.145.100", "192.168.1.1", "192.168.43.99", "127.0.0.1"]
reader = csv.reader(f)
sorted_data = sorted(((tuple(map(int, sip.split("."))), eip, rnge) for sip, eip, rnge in reader))
for ip in ips:
ip = tuple(map(int, ip.split(".")))
ind = bisect_right(sorted_data, ip) - 1
ip_sub = sorted_data[ind]
start_ip, end_ip, _ = sorted_data[ind]
end_ip = tuple(map(int, end_ip.split(".")))
print(start_ip, ip, end_ip)
print(start_ip <= ip <= end_ip)
结果将是相同的,我会想象使用SortedDict几乎肯定是更快平分线在C级完成。
那么如果ips是为了您可以使用bisect'O(log n)'查找,请使用start ip来查看ip将在哪里登陆并查看它是否在开始和结束ip的范围内,你可能会发现一个数据库有用的地方,你使用start ip作为主索引将被索引 –
如何将IP地址转换为整数? –
@ cricket_007'int(ip.replace(“。”,“”))',推定ipv4 –