Python multiprocessing/threading a loop

What I want to do is check which kind of multiprocessing works best for my data. I tried to multiprocess this loop:

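import math
import numpy as np

def __pure_calc(args):
    # Unpack the argument list: query points, reference points, output list, KD-tree
    j = args[0]
    point_array = args[1]
    empty = args[2]
    tree = args[3]

    for i in j:
        # p[0] is the distance to the nearest neighbour, p[1] is its index in point_array
        p = tree.query(i)

        euc_dist = math.sqrt(np.sum((point_array[p[1]] - i) ** 2))

        # add one row at a time to the empty list
        empty.append([i[0], i[1], i[2], euc_dist, point_array[p[1]][0], point_array[p[1]][1], point_array[p[1]][2]])

    return empty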

The pure function on its own takes 6.52 seconds.

My first approach was multiprocessing's Pool.map:

from multiprocessing import Pool 

def __multiprocess(las_point_array, point_array, empty, tree): 

    pool = Pool(os.cpu_count()) 

    for j in las_point_array: 
     args=[j, point_array, empty, tree] 
     results = pool.map(__pure_calc, args) 

    #close the pool and wait for the work to finish 
    pool.close() 
    pool.join() 

    return results

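When I checked other answers, multiprocessing a function looked like it should be as easy as: map(function, inputs), done. But for some reason my multiprocessing version does not accept my inputs and raises an error saying that the scipy.spatial.ckdtree.cKDTree object is not subscriptable.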

So I tried apply_async instead:

from multiprocessing.pool import ThreadPool 

def __multiprocess(arSegment, wires_point_array, ptList, tree): 

    pool = ThreadPool(os.cpu_count()) 

    args=[arSegment, point_array, empty, tree] 

    result = pool.apply_async(__pure_calc, [args]) 

    results = result.get()

It runs without problems. For my test data I managed to compute the result in 6.42 seconds.

Why does apply_async accept the cKDTree without any problem while pool.map does not? What do I need to change to make it run?

Source

2017-04-27 Losbaltica

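pool.map(function, iterable) has basically the same signature as the built-in map. Each item from the iterable is passed, one at a time, as the args of your __pure_calc function. In your loop the iterable is the list [j, point_array, empty, tree], so __pure_calc is called once with j, once with point_array, once with empty and once with the tree itself, and args[0] then fails on the tree. apply_async(__pure_calc, [args]) calls the function only once, with the whole list as its single argument, which is why it works.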
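
The difference between the two calls can be reproduced in a small sketch. It uses ThreadPool (as in the question) to stay self-contained; FakeTree and first_arg are hypothetical stand-ins for the cKDTree and __pure_calc, not part of the original code.

```python
from multiprocessing.pool import ThreadPool


class FakeTree:
    """Hypothetical stand-in for a cKDTree: it does not support indexing."""


def first_arg(args):
    # Mimics the first line of __pure_calc: index into whatever was passed in
    return args[0]


args = [[1.0, 2.0, 3.0], FakeTree()]

with ThreadPool(2) as pool:
    # apply_async: a single call, first_arg(args), with the whole list
    whole = pool.apply_async(first_arg, [args]).get()

    # map: one call per element, first_arg([1.0, 2.0, 3.0]) and first_arg(FakeTree())
    try:
        pool.map(first_arg, args)
        error = None
    except TypeError as exc:
        error = str(exc)

print(whole)
print(error)
```

The apply_async call returns the first element of the list, while the map call fails as soon as the function receives the tree stand-in on its own.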

In this case, I guess you could change your code to this:

def __multiprocess(las_point_array, point_array, empty, tree): 

    pool = Pool(os.cpu_count()) 

    args_list = [ 
     [j, point_array, empty, tree] 
     for j in las_point_array 
    ] 

    results = pool.map(__pure_calc, args_list) 

    #close the pool and wait for the work to finish 
    pool.close() 
    pool.join() 

    return results
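
With one args entry per chunk, pool.map returns one result per entry, so the results come back as a list of per-chunk lists. A minimal sketch of that shape, assuming a hypothetical brute-force nearest() in place of tree.query and a fresh list per call instead of a shared empty list; it uses ThreadPool to stay self-contained, whereas with multiprocessing.Pool every argument must be picklable and the pool is normally created under if __name__ == "__main__":.

```python
import math
from itertools import chain
from multiprocessing.pool import ThreadPool


def nearest(point, candidates):
    # Hypothetical brute-force stand-in for tree.query(point)
    return min(candidates, key=lambda c: math.dist(point, c))


def calc_chunk(args):
    chunk, candidates = args      # one args entry per chunk
    rows = []                     # fresh list per call, nothing shared
    for p in chunk:
        q = nearest(p, candidates)
        rows.append([*p, math.dist(p, q), *q])
    return rows


chunks = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], [(5.0, 5.0, 5.0)]]
candidates = [(0.0, 0.0, 1.0), (4.0, 5.0, 5.0)]

# Build the iterable first, then hand it to map in a single call
args_list = [(chunk, candidates) for chunk in chunks]

with ThreadPool(2) as pool:
    per_chunk = pool.map(calc_chunk, args_list)

# Flatten the per-chunk lists into one list of rows
rows = list(chain.from_iterable(per_chunk))
```

Here per_chunk has two entries, one for each chunk, and rows holds the three output rows in input order.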

Source

2017-04-27 12:07:35
