Gradient descent: should the delta value be a scalar or a vector?


When computing the δ values for a neural network after running backpropagation:

The value of delta(1) comes out as a scalar. Shouldn't it be a vector?

Update:

From http://www.holehouse.org/mlclass/09_Neural_Networks_Learning.html

Specifically:

Source

2016-05-12 blue-sky

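Is there a reference for the formula? – greeness

@greeness Please see the update –

First, as you probably know, at each layer we have n x m parameters (or weights) to learn, so they form a 2-D matrix.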

n is the number of nodes in the current layer plus 1 (for bias) 
m is the number of nodes in the previous layer.

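We have n x m parameters because there is one connection between every pair of nodes in the previous and the current layer.

I am pretty sure that the Delta (capital delta) at layer L is used to accumulate the partial-derivative terms for every parameter at layer L. So you have a 2-D matrix of Delta at each layer as well. To update the i-th row (the i-th node in the current layer) and the j-th column (the j-th node in the previous layer) of the matrix: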

D_(i,j) = D_(i,j) + a_j * delta_i 
note a_j is the activation from the j-th node in previous layer, 
    delta_i is the error of the i-th node of the current layer 
so we accumulate the error proportional to their activation weight.

So, to answer your question, Delta should be a matrix.
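The accumulation rule above can be sketched in plain Python. This is only an illustration: the function name `accumulate` and the numbers are made up, and the bias unit is left out to keep the shapes easy to read.

```python
def accumulate(D, delta, a):
    """Add a_j * delta_i to D[i][j] for every (i, j), in place."""
    for i, delta_i in enumerate(delta):      # i-th node of the current layer
        for j, a_j in enumerate(a):          # j-th node of the previous layer
            D[i][j] += a_j * delta_i
    return D

# Hypothetical values: 2 nodes in the current layer, 3 in the previous one.
delta = [0.5, -1.0]        # error of each node in the current layer
a = [1.0, 2.0, 3.0]        # activation of each node in the previous layer

# Delta starts as an all-zero matrix with one row per current-layer node.
D = [[0.0] * len(a) for _ in delta]
accumulate(D, delta, a)

print(len(D), "x", len(D[0]))   # 2 x 3
print(D)
```

Each call adds one training example's contribution, so D keeps its matrix shape no matter how many examples are accumulated.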

Source

2016-05-12 20:31:58 greeness

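Thanks, but my question is why a scalar is output instead of a matrix, since error * transpose(a) is a scalar. Maybe the link I pointed to is incorrect? –

The error is n x 1 and the transpose of a is 1 x m, so the product is n x m. You probably computed (1 x n) x (n x 1) instead, which is why it came out as a scalar. – greeness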
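A quick way to check the shape argument, using plain Python lists as matrices. The helpers `matmul`, `transpose` and `shape` are hypothetical names written for this sketch, and both vectors are given length 3 only so that both multiplication orders are defined.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    assert len(A[0]) == len(B), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

def shape(M):
    """Return (rows, columns)."""
    return (len(M), len(M[0]))

# Hypothetical column vectors.
error = [[1.0], [2.0], [3.0]]   # n x 1
a = [[4.0], [5.0], [6.0]]       # m x 1

outer = matmul(error, transpose(a))   # (n x 1)(1 x m): a matrix
inner = matmul(transpose(error), a)   # (1 x n)(n x 1): a single number

print(shape(outer))   # (3, 3)
print(shape(inner))   # (1, 1)
print(inner)          # [[32.0]]
```

If delta comes out as a scalar, the transpose has most likely been applied to the wrong operand, giving the second product rather than the first.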
