Speeding up interpolation in Python (2011-11-16)

I have an image-processing problem that I'm currently solving in Python with numpy and scipy. Briefly, I have an image to which I want to apply many local contractions. My prototype code works, and the final image looks great. However, processing time has become a serious bottleneck in our application. Can you help me speed up my image-processing code?

I've tried to boil our code down to the 'cartoon' version below. Profiling shows I'm spending most of my time on interpolation. Is there an obvious way to speed up execution?

import cProfile, pstats
import numpy
from scipy.ndimage import interpolation

def get_centered_subimage(
    center_point, window_size, image):
    x, y = numpy.round(center_point).astype(int)
    xSl = slice(max(x-window_size-1, 0), x+window_size+2)
    ySl = slice(max(y-window_size-1, 0), y+window_size+2)
    subimage = image[xSl, ySl]
    interpolation.shift(
        subimage, shift=(x, y)-center_point, output=subimage)
    return subimage[1:-1, 1:-1]

"""In real life, this is experimental data"""
im = numpy.zeros((1000, 1000), dtype=float)
"""In real life, this mask is a non-zero pattern"""
window_radius = 10
mask = numpy.zeros((2*window_radius+1, 2*window_radius+1), dtype=float)
"""The x, y coordinates in the output image"""
new_grid_x = numpy.linspace(0, im.shape[0]-1, 2*im.shape[0])
new_grid_y = numpy.linspace(0, im.shape[1]-1, 2*im.shape[1])


"""The grid we'll end up interpolating onto"""
grid_step_x = new_grid_x[1] - new_grid_x[0]
grid_step_y = new_grid_y[1] - new_grid_y[0]
subgrid_radius = numpy.floor(
    (-1 + window_radius * 0.5/grid_step_x,
     -1 + window_radius * 0.5/grid_step_y)).astype(int)
subgrid = (
    window_radius + 2 * grid_step_x * numpy.arange(
        -subgrid_radius[0], subgrid_radius[0] + 1),
    window_radius + 2 * grid_step_y * numpy.arange(
        -subgrid_radius[1], subgrid_radius[1] + 1))
subgrid_points = ((2*subgrid_radius[0] + 1) *
                  (2*subgrid_radius[1] + 1))

"""The coordinates of the set of spots we want to contract. In real
life, this set is non-random:"""
numpy.random.seed(0)
num_points = 10000
center_points = numpy.random.random(2*num_points).reshape(num_points, 2)
center_points[:, 0] *= im.shape[0]
center_points[:, 1] *= im.shape[1]

"""The output image"""
final_image = numpy.zeros(
    (new_grid_x.shape[0], new_grid_y.shape[0]), dtype=float)

def profile_me():
    for m, cp in enumerate(center_points):
        """Take an image centered on each illumination point"""
        spot_image = get_centered_subimage(
            center_point=cp, window_size=window_radius, image=im)
        if spot_image.shape != (2*window_radius+1, 2*window_radius+1):
            continue  # Skip to the next spot
        """Mask the image"""
        masked_image = mask * spot_image
        """Resample the image"""
        nearest_grid_index = numpy.round(
            (cp - (new_grid_x[0], new_grid_y[0])) /
            (grid_step_x, grid_step_y)).astype(int)
        nearest_grid_point = (
            (new_grid_x[0], new_grid_y[0]) +
            (grid_step_x, grid_step_y) * nearest_grid_index)
        new_coordinates = numpy.meshgrid(
            subgrid[0] + 2 * (nearest_grid_point[0] - cp[0]),
            subgrid[1] + 2 * (nearest_grid_point[1] - cp[1]))
        resampled_image = interpolation.map_coordinates(
            masked_image,
            (new_coordinates[0].reshape(subgrid_points),
             new_coordinates[1].reshape(subgrid_points))
            ).reshape(2*subgrid_radius[1]+1,
                      2*subgrid_radius[0]+1).T
        """Add the recentered image back to the scan grid"""
        final_image[
            nearest_grid_index[0]-subgrid_radius[0]:
            nearest_grid_index[0]+subgrid_radius[0]+1,
            nearest_grid_index[1]-subgrid_radius[1]:
            nearest_grid_index[1]+subgrid_radius[1]+1,
            ] += resampled_image

cProfile.run('profile_me()', 'profile_results')
p = pstats.Stats('profile_results')
p.strip_dirs().sort_stats('cumulative').print_stats(10)

A vague explanation of what the code does:

We start with a pixelated 2D image and a set of arbitrary (x, y) points in the image that don't generally fall on an integer grid. For each (x, y) point, I want to multiply the image by a small mask centered precisely on that point. Next, we contract/expand the masked region by a finite amount, and finally add this processed subimage to a final image, which may not have the same pixel size as the original image. (Not my best explanation, I admit.)
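The core sub-pixel operation described above, shifting an image by a fractional offset via interpolation, can be seen in isolation with a minimal 1-D sketch (toy data, not the question's images):

```python
import numpy
from scipy.ndimage import shift

# A 1-D "image" with a single bright pixel.
signal = numpy.array([0.0, 1.0, 0.0])

# Shift it half a pixel to the right with linear interpolation
# (order=1); the unit of intensity is split between two pixels.
shifted = shift(signal, 0.5, order=1)
print(shifted)  # the spike spreads over indices 1 and 2
```

This is the same kind of call the question's `get_centered_subimage` makes in 2D to recenter each spot on its exact (x, y) coordinate.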


As a first cut, you might want to [give this a try](http://stackoverflow.com/questions/4295799/how-to-improve-performance-of-this-code/4299378#4299378).

Answers


I'm pretty sure that, as you suspected, most of the computation time happens in `interpolation.map_coordinates(...)`, which is called once per iteration over `center_points`, here 10,000 times. Generally, when working with the numpy/scipy stack, you want repetitive tasks over large arrays to happen in native Numpy/Scipy functions, i.e. in a C loop over homogeneous data, rather than explicitly in Python.
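As a toy illustration of that principle (not tied to the question's code), compare an explicit Python loop with the same reduction done by a single native NumPy call:

```python
import time
import numpy

a = numpy.random.random(1_000_000)

# Explicit Python loop: one interpreted iteration per element.
t0 = time.perf_counter()
total_loop = 0.0
for x in a:
    total_loop += x * x
t_loop = time.perf_counter() - t0

# The same sum of squares as one native call: a single C loop
# over homogeneous data.
t0 = time.perf_counter()
total_numpy = numpy.dot(a, a)
t_numpy = time.perf_counter() - t0

print(f"python loop: {t_loop:.3f} s, numpy.dot: {t_numpy:.5f} s")
```

The two results agree to floating-point accumulation error, but the native version typically runs orders of magnitude faster, which is why moving the per-spot work out of the Python loop matters here.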

One strategy that might speed up the interpolation, though it will also increase the amount of memory used, is:

  • First, gather all the subimages (named masked_image here) into a three-dimensional array (window_radius x window_radius x center_points.size).
  • Using numpy.frompyfunc, write a ufunc (worth reading up on) that wraps the work that has to be done on each subimage; it should return another three-dimensional array (subgrid_radius[0] x subgrid_radius[1] x center_points.size). In short, this creates a vectorized version of the Python function that can be broadcast element-wise over arrays.
  • Build the final image by summing over the third dimension.

Hope that gets you closer to your goal!
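One concrete way to realize the "one native call instead of 10,000" idea is to batch the coordinates themselves: since every spot samples the same local grid shape, all per-spot coordinate arrays can be concatenated and passed to `map_coordinates` once. This is a hedged sketch, not the answerer's exact recipe, and it assumes every spot samples the same array (in the question each spot samples its own masked subimage, so the masking step would need batching too); the names (`n_spots`, `base_rows`, etc.) are illustrative:

```python
import numpy
from scipy.ndimage import map_coordinates

rng = numpy.random.default_rng(0)
image = rng.random((100, 100))

# Hypothetical setup: the same 5x5 local sampling grid for every
# spot, centered on non-integer spot positions.
n_spots = 50
base_rows, base_cols = numpy.meshgrid(
    numpy.arange(5), numpy.arange(5), indexing='ij')
centers = rng.uniform(10, 90, size=(n_spots, 2))

# Stack all spots' sample coordinates into one flat pair of arrays
# and interpolate in a single native call.
rows = (centers[:, 0, None] + base_rows.ravel() - 2).ravel()
cols = (centers[:, 1, None] + base_cols.ravel() - 2).ravel()
batched = map_coordinates(image, (rows, cols), order=1)
per_spot = batched.reshape(n_spots, 5, 5)

# Sanity check against the one-call-per-spot loop.
looped = numpy.array([
    map_coordinates(
        image,
        (c[0] + base_rows.ravel() - 2, c[1] + base_cols.ravel() - 2),
        order=1).reshape(5, 5)
    for c in centers])
print(numpy.allclose(per_spot, looped))  # True
```

The batched call produces identical values to the loop, but pays the Python-call overhead once instead of `n_spots` times.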