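Another way to do this, especially if you favour readability, is to use broadcasting.
So you can build a 3D array from the 1D array and the stack of 2D arrays, then sum over the appropriate axis: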
>>> Ms = np.random.randn(4, 2, 3) # 4 arrays of size 2x3
>>> As = np.random.randn(4)
>>> np.sum(As[:, np.newaxis, np.newaxis] * Ms, axis=0)
array([[-1.40199248, -0.40337845, -0.69986566],
       [ 3.52724279,  0.19547118,  2.1485559 ]])
>>> sum(a*M for a, M in zip(As, Ms))
array([[-1.40199248, -0.40337845, -0.69986566],
       [ 3.52724279,  0.19547118,  2.1485559 ]])
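To see why the broadcasting version works, it can help to look at the shapes at each step. The following is only a minimal sketch: the seed and the intermediate names `weights`, `product`, `result` and `loop_result` are illustrative and not part of the original session.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, only for reproducibility
Ms = rng.standard_normal((4, 2, 3))   # 4 arrays of size 2x3
As = rng.standard_normal(4)           # one scalar weight per array

# Reshape As from (4,) to (4, 1, 1) so it broadcasts against (4, 2, 3).
weights = As[:, np.newaxis, np.newaxis]
product = weights * Ms                # shape (4, 2, 3): Ms[i] scaled by As[i]
result = product.sum(axis=0)          # collapse the first axis, shape (2, 3)

# The explicit Python loop computes the same weighted sum.
loop_result = sum(a * M for a, M in zip(As, Ms))

print(weights.shape, product.shape, result.shape)
print(np.allclose(result, loop_result))
```

Without `axis=0` the sum would run over every element and return a single scalar rather than a 2x3 array.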
It is worth noting, however, that np.einsum and np.tensordot are usually more efficient:
>>> %timeit np.sum(As[:, np.newaxis, np.newaxis] * Ms, axis=0)
The slowest run took 7.38 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.58 µs per loop
>>> %timeit np.einsum('i,ijk->jk', As, Ms)
The slowest run took 19.16 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.44 µs per loop
This also holds for larger arrays:
>>> Ms = np.random.randn(100, 200, 300)
>>> As = np.random.randn(100)
>>> %timeit np.einsum('i,ijk->jk', As, Ms)
100 loops, best of 3: 5.03 ms per loop
>>> %timeit np.sum(As[:, np.newaxis, np.newaxis] * Ms, axis=0)
100 loops, best of 3: 14.8 ms per loop
>>> %timeit np.tensordot(As, Ms, axes=(0, 0))
100 loops, best of 3: 2.79 ms per loop
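So np.tensordot works best in this case.
The only good reason to use np.sum with broadcasting is to make the code a little more readable (which helps when you have small matrices).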
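Before switching between these approaches, it may be worth checking that they agree on your data. The sketch below assumes smaller arrays than the timing runs and an arbitrary seed; the `weighted_sum_*` helper names are hypothetical, introduced only for this comparison.

```python
import numpy as np

def weighted_sum_broadcast(As, Ms):
    # Broadcast As to (n, 1, 1), multiply, then sum over the first axis.
    return np.sum(As[:, np.newaxis, np.newaxis] * Ms, axis=0)

def weighted_sum_einsum(As, Ms):
    # Contract the shared index i; keep j and k.
    return np.einsum('i,ijk->jk', As, Ms)

def weighted_sum_tensordot(As, Ms):
    # Contract axis 0 of As with axis 0 of Ms.
    return np.tensordot(As, Ms, axes=(0, 0))

rng = np.random.default_rng(42)       # arbitrary seed
Ms = rng.standard_normal((10, 20, 30))
As = rng.standard_normal(10)

ref = weighted_sum_broadcast(As, Ms)
same_einsum = np.allclose(ref, weighted_sum_einsum(As, Ms))
same_tensordot = np.allclose(ref, weighted_sum_tensordot(As, Ms))
print(ref.shape, same_einsum, same_tensordot)
```

Relative timings depend on array sizes, the NumPy version and the underlying BLAS library, so it is safer to re-run `%timeit` on your own shapes than to rely on the numbers above.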