
Answer


For me, the final piece of the puzzle came from computing the sum of outer products.

Here is what I came up with:

% X is a {# of training examples} x {# of features} matrix 
% Y is a {# of training examples} x {# of output neurons} matrix 
% Theta is a cell array containing Theta{1}...Theta{n} 

% Number of training examples 
m = size(X, 1); 

% Get h(X) and z (non-activated output of all neurons in network) 
[hX, z, activation] = predict(Theta, X); 

% Get error of output layer 
layers = 1 + length(Theta); 
d{layers} = hX - Y; 

% Propagate errors backwards through hidden layers 
for layer = layers-1 : -1 : 2 
    d{layer} = d{layer+1} * Theta{layer}; 
    d{layer} = d{layer}(:, 2:end); % Remove "error" for constant bias term 
    d{layer} .*= sigmoidGradient(z{layer}); 
end 

% Calculate Theta gradients 
for l = 1:layers-1 
    Theta_grad{l} = zeros(size(Theta{l})); 

    % Sum of outer products 
    Theta_grad{l} += d{l+1}' * [ones(m,1) activation{l}]; 

    % Add regularisation term 
    Theta_grad{l}(:, 2:end) += lambda * Theta{l}(:, 2:end); 
    Theta_grad{l} /= m; 
end
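
For reference, the vectorised line Theta_grad{l} += d{l+1}' * [ones(m,1) activation{l}] is just the sum, over all m training examples, of the outer product of each example's error vector with its bias-augmented activation vector. A minimal sketch of the equivalent per-example loop for one layer l (assuming the same d, activation, Theta and m as in the code above) looks like this:

% Loop form of the sum of outer products (slower, but easier to reason about).
% Assumes d{l+1} is m x units(l+1) and activation{l} is m x units(l),
% exactly as produced by the code above.
acc = zeros(size(Theta{l})); 
a_bias = [ones(m, 1) activation{l}];   % prepend the bias unit to each example 
for i = 1:m 
    % Outer product: (units(l+1) x 1) * (1 x (units(l)+1)) 
    acc += d{l+1}(i, :)' * a_bias(i, :); 
end 
% acc now equals d{l+1}' * [ones(m,1) activation{l}] 

Replacing this loop with the single matrix product is what makes the gradient computation above fully vectorised.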