What is wrong with my gradient descent algorithm?

My starting point for the algorithm is w = (u,v) = (2,2). The learning rate is eta = 0.01 and bound = 10^-14. Here is my MATLAB code:

function [resultTable, boundIter] = gradientDescent(w, iters, bound, eta) 
% FUNCTION [resultTable, boundIter] = gradientDescent(w, iters, bound, eta) 
% 
% DESCRIPTION: 
% - This function will do gradient descent error minimization for the 
% function E(u,v) = (u*exp(v) - 2*v*exp(-u))^2. 
% 
% INPUTS: 
% 'w' a 1-by-2 vector indicating initial weights w = [u,v] 
% 'iters' a positive integer indicating the number of gradient descent 
% iterations 
% 'bound' a real number indicating an error lower bound 
% 'eta' a positive real number indicating the learning rate of GD algorithm 
% 
% OUTPUTS: 
% 'resultTable' an (iters+1)-by-6 table indicating the error, partial 
% derivatives and weights for each GD iteration 
% 'boundIter' a non-negative integer specifying the GD iteration when the 
% error function got below the given error bound 'bound' (0 if the bound 
% was never reached) 
% 


% The error function 
E = @(u,v) (u*exp(v) - 2*v*exp(-u))^2; 

% Partial derivative of E with respect to u 
pEpu = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(exp(v) + 2*v*exp(-u)); 
% Partial derivative of E with respect to v 
pEpv = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(u*exp(v) - 2*exp(-u)); 

% Initialize boundIter 
boundIter = 0; 
% Create a table for holding the results 
resultTable = zeros(iters+1, 6); 
% Iteration number 
resultTable(1, 1) = 0; 
% Error at iteration i 
resultTable(1, 2) = E(w(1), w(2)); 
% The value of pEpu at initial w = (u,v) 
resultTable(1, 3) = pEpu(w(1), w(2)); 
% The value of pEpv at initial w = (u,v) 
resultTable(1, 4) = pEpv(w(1), w(2)); 
% Initial u 
resultTable(1, 5) = w(1); 
% Initial v 
resultTable(1, 6) = w(2); 

% Loop all the iterations 
for i = 2:iters+1 

    % Save the iteration number 
    resultTable(i, 1) = i-1; 
    % Update the weights 
    temp1 = w(1) - eta*(pEpu(w(1), w(2))); 
    temp2 = w(2) - eta*(pEpv(w(1), w(2))); 
    w(1) = temp1; 
    w(2) = temp2; 
    % Evaluate the error function at new weights 
    resultTable(i, 2) = E(w(1), w(2)); 
    % Evaluate pEpu at the new point 
    resultTable(i, 3) = pEpu(w(1), w(2)); 
    % Evaluate pEpv at the new point 
    resultTable(i, 4) = pEpv(w(1), w(2)); 
    % Save the new weights 
    resultTable(i, 5) = w(1); 
    resultTable(i, 6) = w(2); 
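    % If the error function is below the specified bound, save the index 
    % of the first iteration at which this happens (boundIter is still 0 
    % until then, so later iterations do not overwrite it) 
    if boundIter == 0 && E(w(1), w(2)) < bound 
     boundIter = i-1; 
    end 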

end

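This is an exercise in my machine learning course, but for some reason my results are all wrong. There must be an error in the code. I have debugged it over and over and have not found anything wrong. Can someone identify my problem here? In other words, can you check whether the code is a valid gradient descent algorithm for the given function?

Please let me know if my question is too unclear or if you need more information :)

Thank you for your effort and help! =)

Here are my results for five iterations, along with what others got:

Parameters: w = [2,2], eta = 0.01, bound = 10^-14, iters = 5

[Image not preserved: results for five iterations]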

2014-10-31 jjepsuomi

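Do you have input data and results? – 2014-10-31 12:14:20

@AnderBiguri Hi, there is no input data for this problem. The point is to minimize the given function E(u,v) with gradient descent. The starting point is w = (u,v) = (2,2), eta = 0.01, bound = 10^-14. The 'iters' parameter can be chosen freely, e.g. iters = 50. I will post my results for five iterations, along with the corresponding results that others posted on my course discussion forum. – jjepsuomi 2014-10-31 12:18:18

Haha, there is input data, you just gave it to me! Thanks, I'll check. – 2014-10-31 12:20:04

As discussed in the comments under the question: I would say the others are wrong. Your minimization leads to a smaller value of E(u,v). Check: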

E(1.4,1.6) = 37.8 >> 3.6 = E(0.63, -1.67)
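The comparison above can be reproduced directly. This is a small sketch that only re-evaluates the error function from the question at the two points quoted; the variable names `E_others` and `E_mine` are made up for illustration.

```matlab
% Error function from the question
E = @(u,v) (u*exp(v) - 2*v*exp(-u))^2;

% Point reported by the others vs. the asker's point (both as quoted above)
E_others = E(1.4, 1.6);
E_mine   = E(0.63, -1.67);

fprintf('E(1.4, 1.6)    = %.1f\n', E_others);
fprintf('E(0.63, -1.67) = %.1f\n', E_mine);
```

The smaller value belongs to the asker's point, which is what a descent method started from the same w should produce.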

2014-10-31 13:19:16 matheburg

+1 Thanks :) – jjepsuomi 2014-10-31 13:19:39

You're welcome to accept whichever answer you prefer ;) – matheburg 2014-10-31 13:19:52

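(I apologize for not just commenting, but I am new to SO and cannot comment yet.)

It appears that your algorithm is doing the right thing. What you want to be sure of is that the energy decreases at every step (which it does). There are several reasons why your data points may not agree with those of the others in your class: someone could be wrong (you or the others), they may have started from a different point, or they may have used a different step size (what you are calling eta).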
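A minimal sketch of the "energy decreases at every step" check, using the function, starting point, eta and five iterations from the question. The loop is a compact restatement of the asker's update rather than the original function, and `errs` and `isDecreasing` are names introduced here.

```matlab
% Error function and its partial derivatives, as in the question
E    = @(u,v) (u*exp(v) - 2*v*exp(-u))^2;
pEpu = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(exp(v) + 2*v*exp(-u));
pEpv = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(u*exp(v) - 2*exp(-u));

w = [2, 2];      % starting point from the question
eta = 0.01;      % learning rate from the question
iters = 5;       % same number of iterations as the posted results

errs = zeros(iters+1, 1);
errs(1) = E(w(1), w(2));
for k = 1:iters
    % Evaluate both partials at the old point, then update simultaneously
    g = [pEpu(w(1), w(2)), pEpv(w(1), w(2))];
    w = w - eta*g;
    errs(k+1) = E(w(1), w(2));
end

% True if the error strictly decreased at every step
isDecreasing = all(diff(errs) < 0);
fprintf('final w = (%.2f, %.2f), error decreasing: %d\n', w(1), w(2), isDecreasing);
```

With a fixed step size a decrease is not guaranteed in general, so a check like this is worth running whenever eta or the starting point changes.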

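Ideally, you do not want to hard-code the number of iterations. You want to keep going until you reach a local minimum (which is hopefully the global minimum). To check this, you want both partial derivatives to be zero (or very close to it). In addition, to make sure you are at a local minimum (not a local maximum or a saddle point), you should check the sign of E_uu * E_vv - E_uv^2 and the sign of E_uu; see http://en.wikipedia.org/wiki/Second_partial_derivative_test for details (the second derivative test, at the top). If you find yourself at a local maximum or a saddle point, your gradient will tell you not to move (since the partial derivatives are 0). Since you know this is not optimal, you have to perturb your solution (an idea also used in methods such as simulated annealing).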
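A rough sketch of both ideas for the function in the question: stop on a small gradient norm instead of a fixed iteration count, then evaluate the second-derivative test quantities at the stopping point. The tolerance `tol` and the cap `maxIters` are assumed values, not part of the exercise, and the second derivatives are worked out here by the chain rule from E = f^2.

```matlab
% E = f^2 with f and its derivatives written out explicitly
f   = @(u,v) u*exp(v) - 2*v*exp(-u);
fu  = @(u,v) exp(v) + 2*v*exp(-u);     % df/du
fv  = @(u,v) u*exp(v) - 2*exp(-u);     % df/dv
fuu = @(u,v) -2*v*exp(-u);             % d2f/du2
fvv = @(u,v) u*exp(v);                 % d2f/dv2
fuv = @(u,v) exp(v) + 2*exp(-u);       % d2f/dudv

E     = @(u,v) f(u,v)^2;
gradE = @(u,v) 2*f(u,v)*[fu(u,v), fv(u,v)];

% Second partials of E and the determinant used by the test
Euu = @(u,v) 2*fu(u,v)^2 + 2*f(u,v)*fuu(u,v);
Evv = @(u,v) 2*fv(u,v)^2 + 2*f(u,v)*fvv(u,v);
Euv = @(u,v) 2*fu(u,v)*fv(u,v) + 2*f(u,v)*fuv(u,v);
D   = @(u,v) Euu(u,v)*Evv(u,v) - Euv(u,v)^2;

w = [2, 2];
eta = 0.01;
tol = 1e-6;        % assumed gradient-norm tolerance
maxIters = 10000;  % assumed safety cap so the loop always terminates

k = 0;
g = gradE(w(1), w(2));
while norm(g) > tol && k < maxIters
    w = w - eta*g;
    g = gradE(w(1), w(2));
    k = k + 1;
end

fprintf('stopped after %d iterations at (%.4f, %.4f)\n', k, w(1), w(2));
fprintf('E = %.3g, |grad| = %.3g\n', E(w(1), w(2)), norm(g));
fprintf('D = %.3g, E_uu = %.3g\n', D(w(1), w(2)), Euu(w(1), w(2)));
```

One caveat for this particular E: it is a square, so E >= 0 everywhere and any point with E = 0 is already a global minimum. At such a point the Hessian is 2 * grad(f) * grad(f)', which has rank one, so D is exactly 0 and the determinant test is inconclusive. Looking at the value of E itself is the more informative check there.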

Hope this helps.

2014-10-31 13:20:04 TravisJ

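+1 Thanks @TravisJ, excellent answer :) You are right too. I did have the correct answer. It looks like the others were indeed wrong :) – jjepsuomi 2014-10-31 13:21:14

Not a complete answer, but let's go for it:

I added a plotting part to your code so you can see what is going on.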

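% Element-wise version of E so that it accepts vectors and matrices 
% (the E in the question uses * and ^, which only work for scalars) 
E = @(u,v) (u.*exp(v) - 2*v.*exp(-u)).^2; 
u1 = resultTable(:,5);   % u at each iteration 
v1 = resultTable(:,6);   % v at each iteration 
E1 = E(u1,v1);           % error along the descent path 
E1(E1<bound) = NaN;      % do not plot points below the error bound 
[x,y] = meshgrid(-1:0.1:5,-5:0.1:2); 
Z = E(x,y); 
surf(x,y,Z)              % error surface 

hold on 
plot3(u1,v1,E1,'r')      % descent path as a line 
plot3(u1,v1,E1,'r*')     % individual iterates as markers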

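[Image not preserved: surface plot of E with the descent path]

The result shows that your algorithm is doing the right thing for that function. So, as others have said, either everyone else is wrong, or you are not using the right equation to begin with.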
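To rule out a slip in the hand-derived partial derivatives, one option is to compare them with central finite differences of E. This is only a sketch: the step `h` and the test points in `pts` are arbitrary choices made here, and it checks the derivatives against the E in the question, not whether that E matches the exercise sheet.

```matlab
% Error function and analytic partial derivatives, as in the question
E    = @(u,v) (u*exp(v) - 2*v*exp(-u))^2;
pEpu = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(exp(v) + 2*v*exp(-u));
pEpv = @(u,v) 2*(u*exp(v) - 2*v*exp(-u))*(u*exp(v) - 2*exp(-u));

% Central finite-difference approximations of the same derivatives
h = 1e-6;   % assumed step size
numU = @(u,v) (E(u+h, v) - E(u-h, v)) / (2*h);
numV = @(u,v) (E(u, v+h) - E(u, v-h)) / (2*h);

% A few arbitrary test points (one row per point: [u v])
pts = [2 2; 0.63 -1.67; -0.26 -2.13];

maxRelErr = 0;
for k = 1:size(pts, 1)
    u = pts(k, 1);
    v = pts(k, 2);
    analytic = [pEpu(u, v), pEpv(u, v)];
    numeric  = [numU(u, v), numV(u, v)];
    relErr = norm(analytic - numeric) / max(norm(analytic), eps);
    maxRelErr = max(maxRelErr, relErr);
    fprintf('(%5.2f, %5.2f): relative error %.2e\n', u, v, relErr);
end
```

A relative error many orders of magnitude below 1 at every point suggests the analytic gradient is consistent with E.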

2014-10-31 13:23:22

+1 Very nice, thank you @AnderBiguri, I really appreciate your effort! :) – jjepsuomi 2014-10-31 13:24:51
