This is pretty fast using np.where:
>>> a
array([[0, 0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 1, 1],
       [0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0]])
>>> np.where(a>0)
(array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 4, 5]), array([3, 4, 5, 6, 3, 4, 5, 6, 4, 5, 6, 6, 6, 6]))
This gives you a tuple with the coordinates of the values greater than 0.
You can also use np.where to test each sub-array:
def first_true1(a):
    """ return a dict of row: index with value in row > 0 """
    di={}
    for i in range(len(a)):
        idx=np.where(a[i]>0)
        try:
            di[i]=idx[0][0]
        except IndexError:
            di[i]=None
    return di
Prints:
{0: 3, 1: 3, 2: 4, 3: 6, 4: 6, 5: 6, 6: None}
i.e., row 0: the first value > 0 is at index 3; row 2: at index 4; row 6: no value greater than 0.
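To see why first_true1 needs the try/except, here is a small sketch of what np.where returns for a single 1-D row. The two rows are made-up values for illustration, not taken from the array above.

```python
import numpy as np

# Illustrative rows (hypothetical values, not from the question).
row_with_hit = np.array([0, 0, 5, 0, 2])
row_all_zero = np.array([0, 0, 0, 0, 0])

# For a 1-D input, np.where(condition) returns a 1-tuple of index arrays.
hits = np.where(row_with_hit > 0)
first_hit = hits[0][0]          # first index whose value is > 0

# An all-zero row yields an empty index array, so [0] raises IndexError.
empty = np.where(row_all_zero > 0)
try:
    empty[0][0]
    raised = False
except IndexError:
    raised = True

print(first_hit, empty[0].size, raised)
```

Checking `empty[0].size` before indexing would be an alternative to catching the exception.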
As you suspected, argmax may be faster:
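def first_true2():
    di={}
    for i in range(len(a)):
        idx=np.argmax(a[i])
        # argmax also returns 0 for an all-zero row, so test the value itself
        # (assumes non-negative entries such as the 0/1 array above)
        if a[i][idx]>0:
            di[i]=idx
        else:
            di[i]=None
    return di
# same dict is returned...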
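One caveat with argmax is worth spelling out: it returns the position of the first maximum, so an index of 0 is ambiguous, and on non-binary data it finds the largest value rather than the first positive one. The sketch below uses made-up rows to show both cases. Taking argmax of the boolean mask `row > 0` is one way around the second.

```python
import numpy as np

# Illustrative rows (hypothetical values).
all_zero = np.array([0, 0, 0, 0])
hit_at_0 = np.array([1, 0, 0, 1])
mixed = np.array([0, 2, 0, 7])

# Both rows give index 0, so the value at that index has to be checked
# to tell "first element is > 0" apart from "nothing is > 0".
idx_zero = int(np.argmax(all_zero))
idx_hit0 = int(np.argmax(hit_at_0))

# On non-binary data argmax points at the largest value ...
idx_largest = int(np.argmax(mixed))
# ... while argmax of the boolean mask points at the first value > 0.
idx_first_pos = int(np.argmax(mixed > 0))

print(idx_zero, idx_hit0, idx_largest, idx_first_pos)
```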
If your logic can cope with rows of all naughts having no entry at all (rather than None), this is faster still:
def first_true3():
    di={}
    for i, j in zip(*np.where(a>0)):
        # keep only the first (lowest) column index seen for each row
        if i not in di:
            di[i]=j
    return di
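The same "first coordinate per row" idea can probably be done without the Python-level loop by using np.unique with return_index=True. This is a sketch that was not part of the timings here; the name first_true_unique is made up. It relies on np.where returning coordinates in row-major order, so the first occurrence of each row number carries the smallest column index.

```python
import numpy as np

a = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 0, 0, 1, 1, 1, 1],
              [0, 0, 0, 0, 1, 1, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

def first_true_unique(a):
    """Hypothetical helper: dict of row -> first column with value > 0.

    Rows without any value > 0 are simply absent from the result.
    """
    rows, cols = np.where(a > 0)
    # return_index gives the position of the first occurrence of each row number
    uniq_rows, first_pos = np.unique(rows, return_index=True)
    return dict(zip(uniq_rows.tolist(), cols[first_pos].tolist()))

result = first_true_unique(a)
print(result)
```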
Here is a version that uses the axis argument of argmax (as suggested in your comments):
def first_true4():
    di={}
    for i, ele in enumerate(np.argmax(a,axis=1)):
        if ele==0 and a[i][0]==0:
            di[i]=None
        else:
            di[i]=ele
    return di
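If an array result is acceptable instead of a dict, the remaining Python loop can be dropped as well. This is a minimal sketch that was not benchmarked here. The function name and the `missing` sentinel are assumptions, since an integer array cannot hold None.

```python
import numpy as np

a = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 0, 0, 1, 1, 1, 1],
              [0, 0, 0, 0, 1, 1, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

def first_true_vectorized(a, missing=-1):
    """Hypothetical helper: first column index with value > 0 for each row.

    Rows with no value > 0 get the `missing` sentinel.
    """
    mask = a > 0
    idx = mask.argmax(axis=1)      # first True per row (0 if there is none)
    has_hit = mask.any(axis=1)     # distinguishes "hit at 0" from "no hit"
    return np.where(has_hit, idx, missing)

first_cols = first_true_vectorized(a)
print(first_cols)
```

Working on the boolean mask rather than on `a` itself means this should also hold up for arrays that are not just 0s and 1s.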
For a speed comparison (on your example array), I get this:
             rate/sec  usec/pass  first_true1  first_true2  first_true3  first_true4
first_true1    23,818     41.986           --       -34.5%       -63.1%       -70.0%
first_true2    36,377     27.490        52.7%           --       -43.6%       -54.1%
first_true3    64,528     15.497       170.9%        77.4%           --       -18.6%
first_true4    79,287     12.612       232.9%       118.0%        22.9%           --
If I scale that up to a 2000 x 2000 np array, here is what I get:
             rate/sec   usec/pass  first_true3  first_true1  first_true2  first_true4
first_true3         3  354380.107           --        -0.3%       -74.7%       -87.8%
first_true1         3  353327.084         0.3%           --       -74.6%       -87.7%
first_true2        11   89754.200       294.8%       293.7%           --       -51.7%
first_true4        23   43306.494       718.3%       715.9%       107.3%           --
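To reproduce this kind of comparison on your own data, something along these lines with the standard timeit module should do. It is a rough sketch, not the harness used for the tables above. The array size, the density, the `bench` helper and the two function names are all made up for illustration.

```python
import timeit
import numpy as np

# Hypothetical test data: sparse random 0/1 array, kept small so it runs quickly.
rng = np.random.default_rng(0)
a = (rng.random((200, 200)) > 0.9).astype(int)

def per_row_where(a):
    """np.where on each row, in the spirit of first_true1."""
    out = {}
    for i in range(len(a)):
        hits = np.where(a[i] > 0)[0]
        out[i] = int(hits[0]) if hits.size else None
    return out

def axis_argmax(a):
    """argmax along axis 1, in the spirit of first_true4."""
    mask = a > 0
    idx = mask.argmax(axis=1)
    has_hit = mask.any(axis=1)
    return {i: (int(j) if ok else None)
            for i, (j, ok) in enumerate(zip(idx, has_hit))}

def bench(funcs, arr, number=20):
    """Return {function name: microseconds per call}, averaged over `number` calls."""
    results = {}
    for f in funcs:
        total = timeit.timeit(lambda: f(arr), number=number)
        results[f.__name__] = 1e6 * total / number
    return results

results = bench([per_row_where, axis_argmax], a)
# Slowest first, with the call rate per second next to the time per pass.
for name, usec in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:14s} {1e6 / usec:12,.0f} /sec {usec:12.3f} usec/pass")
```

Absolute numbers will vary with the machine and NumPy version, so only the relative ordering is worth reading into.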
I would very much expect the argmax function to be faster. If performance is critical, you could try writing an extension in C – SudoNhim 2012-07-31 01:46:59