2017-06-12 57 views

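Python numpy sum function with axis behavior

I'm new to Python and numpy, so I'm just running sample code and tweaking it to understand it. I came across some code that calls numpy.sum with the axis parameter, but I couldn't get it to run. After a while (reading the scipy docs, experimenting), I got it running by using axis=(1,2,3) instead of axis=1.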

The thing is, everywhere I search, people only write axis=1 to make it work.

I'm using Python 3.5.3 and numpy 1.12.1. Is there a numpy/Python version whose behavior differs this much? Or have I just configured something wrong?

import numpy as np
from past.builtins import xrange  # needs the `future` package; on Python 3 the built-in range also works here


# sample data 
X = np.arange(1, 4*4*3*5+1).reshape(5, 4, 4, 3) 
Y = np.arange(5, 4*4*3*8+5).reshape(8, 4, 4, 3) 
Xlen = X.shape[0] 
Ylen = Y.shape[0] 

# allocate some space for whatever calculation 
rs = np.zeros((Xlen, Ylen)) 
rs1 = np.zeros((Xlen, Ylen)) 

# calculate the result with 2 loops 
for i in xrange(Xlen): 
    for j in xrange(Ylen): 
        rs[i, j] = np.sum(X[i] + Y[j])

# calculate the result with one loop only 
for i in xrange(Xlen): 
    rs1[i, :] = np.sum(Y + X[i], axis=(1,2,3)) 

print(rs1 == rs) # same result 

# also with one loop, as everywhere on the internet: 
for i in xrange(Xlen): 
    rs1[i, :] = np.sum(Y + X[i], axis=1) 
    # ValueError: could not broadcast input array from shape (8,4,3) into shape (8) 

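`axis=1` sums over the one dimension you specify. The result is an array with `ndim-1` dimensions; in your case it has three dimensions and shape `(8,4,3)`. That is incompatible with the assignment target `rs1[i,:]`, which is one-dimensional with shape `(8,)`. – MaxPowers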

Answers

0
axis : None or int or tuple of ints, optional 
    ... 
    If axis is a tuple of ints, a sum is performed on all of the axes 
    specified in the tuple instead of a single axis or all the axes as 
    before. 

The ability to use a tuple is an addition (v1.7, 2013). I haven't used it much; when I needed this in MATLAB, I used repeated sums, e.g.

In [149]: arr = np.arange(24).reshape(2,3,4) 
In [150]: arr.sum(axis=(1,2)) 
Out[150]: array([ 66, 210]) 
In [151]: arr.sum(axis=2).sum(axis=1) 
Out[151]: array([ 66, 210]) 

When doing that sequence you need to remember that the number of axes changes with each sum (unless you use keepdims, itself a newish parameter).
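To make the axis-renumbering point concrete, here is a small sketch reusing the `arr` from the session above. The variable names (`step1`, `kept`, and so on) are just illustrative, not from the original answer.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Summing over axis 2 removes it, so the remaining axes are 0 and 1.
step1 = arr.sum(axis=2)      # shape (2, 3)
step2 = step1.sum(axis=1)    # shape (2,)

# With keepdims=True each reduced axis stays as a length-1 axis,
# so the original axis numbers remain valid from step to step.
kept = arr.sum(axis=2, keepdims=True).sum(axis=1, keepdims=True)  # shape (2, 1, 1)

# The tuple form does both reductions in one call.
tuple_sum = arr.sum(axis=(1, 2))  # shape (2,)

print(step1.shape, step2.shape, kept.shape, tuple_sum.shape)
print(np.array_equal(step2, tuple_sum))
print(np.array_equal(kept.ravel(), tuple_sum))
```

Note that if you sum the lower axis first (`arr.sum(axis=1)`), the old axis 2 becomes axis 1 in the result, which is the bookkeeping the tuple form spares you.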


Your X, Y sums:

In [160]: rs = np.zeros((Xlen, Ylen),int) 
    ...: rs1 = np.zeros((Xlen, Ylen),int) 
    ...: 
    ...: # calculate the result with 2 loops 
    ...: for i in range(Xlen): 
    ...:     for j in range(Ylen):
    ...:         rs[i,j] = np.sum(X[i] + Y[j])
    ...: 
In [161]: rs 
Out[161]: 
array([[ 2544, 4848, 7152, 9456, 11760, 14064, 16368, 18672], 
     [ 4848, 7152, 9456, 11760, 14064, 16368, 18672, 20976], 
     [ 7152, 9456, 11760, 14064, 16368, 18672, 20976, 23280], 
     [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584], 
     [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]]) 

can be replicated without loops:

In [162]: X.sum((1,2,3)) 
Out[162]: array([ 1176, 3480, 5784, 8088, 10392]) 
In [163]: Y.sum((1,2,3)) 
Out[163]: array([ 1368, 3672, 5976, 8280, 10584, 12888, 15192, 17496]) 
In [164]: X.sum((1,2,3))[:,None] + Y.sum((1,2,3)) 
Out[164]: 
array([[ 2544, 4848, 7152, 9456, 11760, 14064, 16368, 18672], 
     [ 4848, 7152, 9456, 11760, 14064, 16368, 18672, 20976], 
     [ 7152, 9456, 11760, 14064, 16368, 18672, 20976, 23280], 
     [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584], 
     [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]]) 

np.sum(X[i] + Y[j]) => np.sum(X[i]) + np.sum(Y[j]). sum(X[i]) totals all elements of X[i] (axis=None). That is the same as summing over every axis except the first and then indexing: X.sum(axis=(1,2,3))[i].
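As a sanity check on that identity, the sketch below compares three ways of building the same table: the double loop from the question, the sum-then-add form used above, and a fully broadcast `X[i] + Y[j]` reduced over the trailing axes. The names `looped`, `from_sums`, and `from_broadcast` are mine, chosen for illustration.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)

# Reference: the double loop from the question.
looped = np.zeros((X.shape[0], Y.shape[0]), dtype=int)
for i in range(X.shape[0]):
    for j in range(Y.shape[0]):
        looped[i, j] = np.sum(X[i] + Y[j])

# Sum each block first, then add the per-block totals with broadcasting.
from_sums = X.sum(axis=(1, 2, 3))[:, None] + Y.sum(axis=(1, 2, 3))[None, :]

# Or form every X[i] + Y[j] at once, shape (5, 8, 4, 4, 3), and reduce the last three axes.
from_broadcast = (X[:, None] + Y[None, :]).sum(axis=(2, 3, 4))

print(np.array_equal(looped, from_sums))
print(np.array_equal(looped, from_broadcast))
```

The sum-then-add form avoids the large intermediate array, so it is usually the cheaper of the two loop-free variants.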

In [165]: X[0].sum() 
Out[165]: 1176 
In [166]: X.sum((1,2,3))[0] 
Out[166]: 1176 
In [167]: X.sum(1).sum(1).sum(1)[0] 
Out[167]: 1176 

As for the broadcasting error, look at the pieces:

In [168]: rs1[i,:] 
Out[168]: array([0, 0, 0, 0, 0, 0, 0, 0]) # shape (8,) 
In [169]: (Y+X[i]).shape # (8,4,4,3) + (4,4,3) 
Out[169]: (8, 4, 4, 3) 
In [170]: (Y+X[i]).sum(1).shape # sums axis 1, ie one of the 4's 
Out[170]: (8, 4, 3) 
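The same mismatch can be reproduced in a few lines. This is only a sketch using the question's sample arrays; the exact wording of the error message may differ between numpy versions, so it is captured rather than hard-coded.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)
rs1 = np.zeros((X.shape[0], Y.shape[0]))

# axis=1 removes only one of the length-4 axes.
reduced = np.sum(Y + X[0], axis=1)
print(reduced.shape)  # (8, 4, 3)

raised = False
try:
    rs1[0, :] = reduced  # target slice has shape (8,), so this cannot broadcast
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Reducing all three trailing axes gives shape (8,), which fits the row.
rs1[0, :] = np.sum(Y + X[0], axis=(1, 2, 3))
print(rs1[0])
```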

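I ended up using repeated sums at first, but then found the tuple easier to read in my case. Still, I don't understand why so many people get it working with only axis=1, which produces a different shape. – Nin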

I expect the main justification for the `tuple` option is readability. – hpaulj

0

To get the same result while writing only axis=1, we can use the trick of reshaping the datasets beforehand.

X = np.reshape(X, (X.shape[0], -1)) 
Y = np.reshape(Y, (Y.shape[0], -1)) 

for i in xrange(Xlen): 
    rs[i, :] = np.sum(Y + X[i], axis=1) 
print(rs) 

The result is:

[[ 2544. 4848. 7152. 9456. 11760. 14064. 16368. 18672.] 
[ 4848. 7152. 9456. 11760. 14064. 16368. 18672. 20976.] 
[ 7152. 9456. 11760. 14064. 16368. 18672. 20976. 23280.] 
[ 9456. 11760. 14064. 16368. 18672. 20976. 23280. 25584.] 
[ 11760. 14064. 16368. 18672. 20976. 23280. 25584. 27888.]]
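This may also explain why examples found online get away with axis=1: if each sample has already been flattened to a row, there is only one axis left to sum. A quick check, assuming the question's sample arrays (`X2` and `Y2` are illustrative names for the flattened copies):

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)

# Flatten every sample to one row: (5, 4, 4, 3) -> (5, 48) and (8, 4, 4, 3) -> (8, 48).
X2 = X.reshape(X.shape[0], -1)
Y2 = Y.reshape(Y.shape[0], -1)

all_match = True
for i in range(X.shape[0]):
    flat = np.sum(Y2 + X2[i], axis=1)           # 2-D data, one axis to reduce
    full = np.sum(Y + X[i], axis=(1, 2, 3))     # 4-D data, three axes to reduce
    all_match = all_match and np.array_equal(flat, full)

print(X2.shape, Y2.shape, all_match)
```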