2014-09-27 75 views
0

我已经实现了从图像构建图像的方法。该方法基于KNN。使用python和igraph构造图像的knn图形

基本上,每个像素代表一个顶点,并且有必要将每个像素与k个最近邻居连接起来。

脚本很简单,但速度很慢。我试图优化欧几里德距离的计算和添加边缘的步骤。

有人有任何建议来优化我的代码?

最慢的一步是计算欧几里德距离。由于计算所有顶点的距离,因此该距离为n^2。例如,图像大小600x375具有225000个vértices。

执行:

python file.py -f image.jpg -k 10 

enter image description here

代码:

import Image 
import math 
from optparse import OptionParser 
import igraph 

def euclidian_distance(x1,y1,r1,g1,b1,x2,y2,r2,g2,b2): 
    return math.sqrt(
     (x1 - x2) ** 2 + 
     (y1 - y2) ** 2 + 
     (r1 - r2) ** 2 + 
     (g1 - g2) ** 2 + 
     (b1 - b2) ** 2 
    ) 

def _plot_xy(g): 
    visual_style = {} 
    visual_style["vertex_shape"] = "circle" 
    visual_style["label_color"] = "white" 
    visual_style["edge_color"] = "black" 
    visual_style["edge_width"] = 0.2 
    visual_style["vertex_size"] = 0.5 

    layout = [] 
    for vertex in g.vs(): 
     layout.append((vertex["x"],vertex["y"])) 

    visual_style["layout"] = layout 
    visual_style["bbox"] = (200, 200) 
    visual_style["margin"] = 10 
    igraph.plot(g, **visual_style) 

if __name__ == '__main__': 

    parser = OptionParser() 

    usage = "usage: python %prog [options] args ..." 
    description = """Description""" 
    parser.add_option("-f", "--file", dest="filename", help="read FILE", metavar="FILE") 
    parser.add_option("-k", "--knn", dest="k", help="knn") 

    (options, args) = parser.parse_args() 
    filename = options.filename 
    k = int(options.k) 

    if filename is None: 
     parser.error("required -f [filename] arg.") 

    g = igraph.Graph() 
    im = Image.open(filename) 
    pix = im.load() 
    for j in range(0,im.size[1]): 
     for i in range(0,im.size[0]): 
      g.add_vertex() 
      vertex = g.vs[g.vcount()-1] 
      vertex["name"] = vertex.index 
      vertex["x"] = i 
      vertex["y"] = j 
      vertex["r"] = pix[i,j][0] 
      vertex["g"] = pix[i,j][1] 
      vertex["b"] = pix[i,j][2] 

    // --> This step is very slow 
    for v in g.vs(): 
     set_distance = dict() 
     for n in g.vs(): 
      distance = euclidian_distance(v["x"],v["y"],v["r"],v["g"],v["b"],n["x"],n["y"],n["r"],n["g"],n["b"]) 
      set_distance[n.index] = distance 
     sorted_set_distance = sorted(set_distance.items(), key=lambda set_distance: set_distance[1]) 
     v["distance"] = sorted_set_distance[:k] 

    edges = [] 
    weight = [] 
    for v in g.vs(): 
     for n in v["distance"]: 
      edges += [(v.index, n[0])] 
      weight.append(n[1]) 

    g.add_edges(edges) 
    g.es["weight"] = weight 

    _plot_xy(g) 

    g.write(filename.split('.')[0]+".edgelist", format='ncol') 
+0

简介它,看看什么是慢。 – 2014-09-28 02:34:02

+0

我已编辑帖子 – 2014-09-28 14:36:02

+1

这里是一个提示:不要取平方根,使用平方距离。取平方根是一个非常缓慢的操作。 – 2014-09-28 18:47:40

回答

1

代替计算将所有对节点的欧式距离,建立从节点kd-tree,然后简单地取使用kd-tree的最近邻居;这将大大减少距离计算的次数。 SciPy包含kd-trees的efficient implementation,所以不需要重新发明轮子。

+0

这个实现使用欧几里得距离吗? kd-tree中的最近邻居是什么意思? – 2014-09-29 13:12:58

+0

我认为这可以解决eucliadiana的距离http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.query.html – 2014-09-29 13:57:58

0

基于Tamas答案,我修改了原始代码。比原始代码快:

import math 
from optparse import OptionParser 
import igraph 
from scipy import spatial 
import numpy as np 

if __name__ == '__main__': 

    parser = OptionParser() 

    usage = "usage: python %prog [options] args ..." 
    description = """Description""" 
    parser.add_option("-f", "--file", dest="filename", help="read FILE", metavar="FILE") 
    parser.add_option("-k", "--knn", dest="k", help="knn") 

    (options, args) = parser.parse_args() 
    filename = options.filename 
    k = int(options.k) 

    if filename is None: 
     parser.error("required -f [filename] arg.") 

    graph = igraph.Graph() 
    im = Image.open(filename) 
    pix = im.load() 
    x, y, r, g, b = [], [], [], [], [] 
    for j in range(0,im.size[1]): 
     for i in range(0,im.size[0]): 
      graph.add_vertex() 
      vertex = graph.vs[graph.vcount()-1] 
      vertex["name"] = vertex.index 
      vertex["x"] = i 
      vertex["y"] = j 
      vertex["r"] = pix[i,j][0] 
      vertex["g"] = pix[i,j][1] 
      vertex["b"] = pix[i,j][2] 
      x.append(i) 
      y.append(j) 
      r.append(pix[i,j][0]) 
      g.append(pix[i,j][1]) 
      b.append(pix[i,j][2]) 

    x = np.array(x) 
    y = np.array(y) 
    r = np.array(r) 
    g = np.array(g) 
    b = np.array(b) 
    tree = spatial.KDTree(zip(x.ravel(), y.ravel(), r.ravel(), g.ravel(), b.ravel())) 

    edges = [] 
    weight = [] 
    for v in graph.vs(): 
     pts = np.array([[v["x"], v["y"], v["r"], v["g"], v["b"]]]) 
     list_nn = tree.query(pts, k=k); 
     for idx, nn in enumerate(list_nn[1][0]): 
      edges += [(v.index, nn)] 
      weight.append(1/(1+list_nn[0][0][idx])) 

    graph.add_edges(edges) 
    graph.es["weight"] = weight 

    graph.write(filename.split('.')[0]+".edgelist", format='ncol')