2017-07-17

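Chaining boolean indexing multiple times in numpy

I want to modify a multidimensional numpy array (say mydata) based on several boolean conditions, applied in cascade, one after another.

This works: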

mydata[condition] = something 

This does not:

mydata[condition1][condition2] = something 

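All conditions are boolean arrays of compatible (broadcastable) shape. Is there a reason why this does not work, and what would be a good solution? Right now I work around it by assigning back to the original array, as follows: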

tempdata = mydata[condition1] 
tempdata[condition2] = something 
mydata[condition1] = tempdata 

Did the posted solution work for you? – Divakar

Answers

To handle cases like this, use chained (cascaded) integer indexing:

idx1 = np.flatnonzero(condition1) 
idx2 = np.flatnonzero(condition2) 
mydata[idx1[idx2]] = something 
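
As background for why the chained form in the question has no effect: boolean indexing is a form of advanced indexing, which returns a copy rather than a view, so mydata[condition1][condition2] = something assigns into a temporary array that is then discarded. The following is a minimal sketch of that behaviour, reusing the array values from the sample run; the names selected and before are introduced here only for illustration.

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])
something = -1

# Boolean (advanced) indexing produces a new array, not a view of mydata.
selected = mydata[condition1]
print(np.shares_memory(selected, mydata))   # False

# The chained assignment therefore writes into a temporary copy.
before = mydata.copy()
mydata[condition1][condition2] = something
print(np.array_equal(mydata, before))       # True: mydata is unchanged

# Integer indices compose: idx1 maps positions in the selection back to
# positions in mydata, and idx2 picks positions within the selection.
idx1 = np.flatnonzero(condition1)
idx2 = np.flatnonzero(condition2)
mydata[idx1[idx2]] = something              # a single indexed assignment on mydata
print(mydata.tolist())                      # [2, 6, -1, 0, -1, 3, 1, -1]
```

Because the final line is one indexed assignment on mydata itself, no intermediate copy sits between the index and the target.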

Sample run:

In [42]: mydata = np.array([2,6,8,0,9,3,1,4]) 
    ...: mydata_copy = mydata.copy() # make copy for verification 
    ...: condition1 = np.array([True,False,True,True,True,False,False,True]) 
    ...: condition2 = np.array([False,True,False,True,True]) 
    ...: something = -1 
    ...: 

# Working solution from question  
In [43]: tempdata = mydata[condition1] 
    ...: tempdata[condition2] = something 
    ...: mydata[condition1] = tempdata 
    ...: 

In [44]: mydata # Check changed values 
Out[44]: array([ 2, 6, -1, 0, -1, 3, 1, -1]) 

# Proposed solution 
In [45]: idx1 = np.flatnonzero(condition1) 
    ...: idx2 = np.flatnonzero(condition2) 
    ...: mydata_copy[idx1[idx2]] = something 
    ...: 

In [46]: mydata_copy # Verify changed values in copy 
Out[46]: array([ 2, 6, -1, 0, -1, 3, 1, -1]) 
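
The sample run uses a 1-D array, while the question mentions multidimensional data. One possible extension, sketched here under the assumption that condition1 has the same shape as mydata and condition2 is 1-D with one entry per True in condition1, is to assign through the flat iterator, since np.flatnonzero returns positions in row-major (flattened) order. The example data and the name expected are illustrative only.

```python
import numpy as np

mydata = np.arange(12).reshape(3, 4)
condition1 = mydata % 2 == 0             # 2-D mask, same shape as mydata
condition2 = mydata[condition1] > 4      # 1-D mask over the selected elements
something = -1

# Reference result: the reassignment approach from the question.
expected = mydata.copy()
tempdata = expected[condition1]
tempdata[condition2] = something
expected[condition1] = tempdata

# Chained integer indexing on flattened positions.
idx1 = np.flatnonzero(condition1)        # row-major positions where condition1 holds
idx2 = np.flatnonzero(condition2)        # positions within that selection
mydata.flat[idx1[idx2]] = something      # write through the flat iterator

print(np.array_equal(mydata, expected))  # True
```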

Alternative method: if you don't mind editing condition1, you could do:

condition1[idx1] = condition2 

and then use mydata[condition1] = something as the final step.
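
The alternative can be sketched end to end as follows, using the same values as the sample run. To leave condition1 untouched, this sketch works on a copy named combined, which is a name chosen here for illustration and not part of the original answer.

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])
something = -1

combined = condition1.copy()         # copy only so condition1 stays intact
idx1 = np.flatnonzero(combined)      # positions where condition1 is True
combined[idx1] = condition2          # keep only those that also pass condition2
mydata[combined] = something         # one boolean-mask assignment

print(mydata.tolist())               # [2, 6, -1, 0, -1, 3, 1, -1]
```

The merged mask has the same length as mydata, so it can be reused for further assignments or combined with other masks.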


Performance benefits

Let's time the proposed approach and see whether it offers any benefit over the one in the question.

Approaches:

# Original approach 
def org_app(mydata,condition1,condition2): 
    tempdata = mydata[condition1] 
    tempdata[condition2] = something 
    mydata[condition1] = tempdata 
    return mydata 

# Proposed one 
def proposed_app(mydata,condition1,condition2): 
    idx1 = np.flatnonzero(condition1) 
    idx2 = np.flatnonzero(condition2) 
    mydata[idx1[idx2]] = something 
    return mydata 

Timings:

In [58]: mydata = np.random.rand(1000000) 
    ...: mydata_copy = mydata.copy() 
    ...: condition1 = np.random.rand(mydata.size)>0.5 
    ...: condition2 = np.random.rand(condition1.sum())>0.5 
    ...: something = -1 
    ...: 

In [59]: %timeit org_app(mydata,condition1,condition2) 
100 loops, best of 3: 14.1 ms per loop 

In [61]: %timeit proposed_app(mydata_copy,condition1,condition2) 
100 loops, best of 3: 7.44 ms per loop 

Combining this with the alternative method should bring a further performance boost.
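
To check that expectation outside IPython, a rough harness along the following lines could be used. It is only a sketch: the function names reassign, chained_idx and merged_mask are made up here, each timed call includes the cost of copying the input array, and the numbers will vary by machine and NumPy version, so no timings are quoted.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
base = rng.random(n)
condition1 = rng.random(n) > 0.5
condition2 = rng.random(int(condition1.sum())) > 0.5
something = -1

def reassign(data):
    # Approach from the question: copy out, modify, write back.
    tempdata = data[condition1]
    tempdata[condition2] = something
    data[condition1] = tempdata
    return data

def chained_idx(data):
    # Chained integer indexing.
    idx1 = np.flatnonzero(condition1)
    idx2 = np.flatnonzero(condition2)
    data[idx1[idx2]] = something
    return data

def merged_mask(data):
    # Alternative method: fold condition2 into a copy of condition1.
    mask = condition1.copy()
    mask[np.flatnonzero(mask)] = condition2
    data[mask] = something
    return data

funcs = (reassign, chained_idx, merged_mask)

# All three variants should produce identical results.
results = [func(base.copy()) for func in funcs]
all_equal = all(np.array_equal(results[0], r) for r in results[1:])
print("results agree:", all_equal)

for func in funcs:
    # Best of 3 repeats, 5 calls each; the input copy is part of each call.
    best = min(timeit.repeat(lambda: func(base.copy()), number=5, repeat=3)) / 5
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```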