
Gradient descent in Python: I want to run gradient descent on a logarithmic decline curve represented by:

y = y0 - a * ln(b + x)

In my example, y0 = 800.

I tried to do this using the partial derivatives with respect to a and b, but while this should clearly reduce the squared error, it does not converge. I know this isn't vectorized, and I may be taking the wrong approach entirely. Am I making a simple mistake, or am I off on this problem completely?

import numpy as np 

# constants my gradient descent model should find: 
a = 4 
b = 4 

# function to fit on! 
def function(x, a, b): 
    y0 = 800 
    return y0 - a * np.log(b + x) 

# Generates data 
def gen_data(numpoints): 
    a = 4 
    b = 4 
    x = np.array(range(0, numpoints)) 
    y = function(x, a, b) 
    return x, y 
x, y = gen_data(600) 

def grad_model(x, y, iterations): 
    converged = False 

    # length of dataset 
    m = len(x) 

    # guess a , b 
    theta = [0.1, 0.1] 
    alpha = 0.001 

    # initial error 
    e = np.sum((np.square(function(x, theta[0], theta[1])) - y)) 

    for iteration in range(iterations): 
     hypothesis = function(x, theta[0], theta[1]) 
     loss = hypothesis - y 

     # compute partial deritaves to find slope to "fall" into 
     theta0_grad = (np.mean(np.sum(-np.log(x + y))))/(m) 
     theta1_grad = (np.mean((((np.log(theta[1] + x))/theta[0]) - (x*(np.log(theta[1] + x))/theta[0]))))/(2*m) 

     theta0 = theta[0] - (alpha * theta0_grad) 
     theta1 = theta[1] - (alpha * theta1_grad) 

     theta[1] = theta1 
     theta[0] = theta0 

     new_e = np.sum(np.square((function(x, theta[0], theta[1])) - y)) 
     if new_e > e: 
      print "AHHHH!" 
      print "Iteration: "+ str(iteration) 
      break 
     print theta 
    return theta[0], theta[1] 

Yes, I run into trouble whenever I go beyond standard linear gradient descent, and I'm not quite sure how to approach this problem.


I haven't really read the code yet, but what do you mean by "it doesn't converge"? Is the error getting bigger and bigger, so it diverges? Or does it just take too long to converge? Assuming you did code the derivatives correctly, it may be that you picked the wrong `alpha`, or that the sign of the gradient direction is flipped ('+' instead of '-').


I put a break in the code in case the error diverges. I believe the partial derivative for my theta[0] (a) variable is correct, but not the one for my theta[1] (b) variable. It seems to converge correctly, but only for theta[0].

Answer


I found a few bugs in your code. The line

e = np.sum((np.square(function(x, theta[0], theta[1])) - y)) 

is incorrect and should be replaced with

e = np.sum((np.square(function(x, theta[0], theta[1]) - y))) 

The formula for new_e contains the same bug.

Also, the gradient formulas are wrong. Your loss function is $L(a,b) = \sum_{i=1}^N (y_0 - a \log(b + x_i) - y_i)^2$, so you have to compute the partial derivatives of $L$ with respect to $a$ and $b$. (Does LaTeX really not work on stackoverflow?) A final remark is that gradient descent has a step-size limitation, so the step size cannot be too large. Here is a version of the code that works better:
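
For reference, the partial derivatives of $L$ that the theta0_grad and theta1_grad lines below implement work out to:

$\frac{\partial L}{\partial a} = \sum_{i=1}^N 2\,(y_0 - a \log(b + x_i) - y_i)\,\bigl(-\log(b + x_i)\bigr)$

$\frac{\partial L}{\partial b} = \sum_{i=1}^N 2\,(y_0 - a \log(b + x_i) - y_i)\,\Bigl(-\frac{a}{b + x_i}\Bigr)$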

import numpy as np 
import matplotlib.pyplot as plt 

# constants my gradient descent model should find: 
a = 4.0 
b = 4.0 
y0 = 800.0 

# function to fit on! 
def function(x, a, b): 
    # y0 = 800 
    return y0 - a * np.log(b + x) 

# Generates data 
def gen_data(numpoints): 
    # a = 4 
    # b = 4 
    x = np.array(range(0, numpoints)) 
    y = function(x, a, b) 
    return x, y 
x, y = gen_data(600) 

def grad_model(x, y, iterations): 
    converged = False 

    # length of dataset 
    m = len(x) 

    # guess a , b 
    theta = [0.1, 0.1] 
    alpha = 0.00001 

    # initial error 
    # e = np.sum((np.square(function(x, theta[0], theta[1])) - y)) # This was a bug 
    e = np.sum((np.square(function(x, theta[0], theta[1]) - y))) 

    costs = np.zeros(iterations) 

    for iteration in range(iterations): 
     hypothesis = function(x, theta[0], theta[1]) 
     loss = hypothesis - y 

     # compute partial deritaves to find slope to "fall" into 
     # theta0_grad = (np.mean(np.sum(-np.log(x + y))))/(m) 
     # theta1_grad = (np.mean((((np.log(theta[1] + x))/theta[0]) - (x*(np.log(theta[1] + x))/theta[0]))))/(2*m) 
     theta0_grad = 2*np.sum((y0 - theta[0]*np.log(theta[1] + x) - y)*(-np.log(theta[1] + x))) 
     theta1_grad = 2*np.sum((y0 - theta[0]*np.log(theta[1] + x) - y)*(-theta[0]/(theta[1] + x)))  # use the current estimate theta[1], not the true b

     theta0 = theta[0] - (alpha * theta0_grad) 
     theta1 = theta[1] - (alpha * theta1_grad) 

     theta[1] = theta1 
     theta[0] = theta0 

     # new_e = np.sum(np.square((function(x, theta[0], theta[1])) - y)) # This was a bug 
     new_e = np.sum(np.square((function(x, theta[0], theta[1]) - y))) 
     costs[iteration] = new_e 
     if new_e > e: 
      print "AHHHH!" 
      print "Iteration: "+ str(iteration) 
      # break 
     print theta 
    return theta[0], theta[1], costs 

(theta0,theta1,costs) = grad_model(x,y,100000) 
plt.semilogy(costs) 
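
As a quick sanity check (an added sketch, not part of the original answer), you can print the recovered parameters and compare them with the true values a = 4.0 and b = 4.0 used to generate the data, and label and show the cost plot:

print(theta0, theta1)  # compare with the true a = 4.0, b = 4.0
plt.xlabel("iteration")
plt.ylabel("sum of squared errors")
plt.show()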

Thanks! Works like a charm! Is there any standard procedure to follow for finding the right step size?
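
One standard recipe (a general sketch, not something from the answer above) is a backtracking (Armijo) line search: start from a fairly large step size and keep shrinking it until the step produces a sufficient decrease in the cost. A minimal sketch, assuming hypothetical helpers cost(theta), which returns the sum of squared errors for a parameter vector theta, and grad(theta), which returns its gradient as a NumPy array:

import numpy as np

def backtracking_alpha(cost, grad, theta, alpha0=1.0, beta=0.5, c=1e-4):
    # Armijo rule: accept the step once it decreases the cost by at least
    # c * alpha * ||grad||^2; otherwise shrink alpha by the factor beta.
    g = grad(theta)
    alpha = alpha0
    while cost(theta - alpha * g) > cost(theta) - c * alpha * np.dot(g, g):
        alpha *= beta
    return alpha

# e.g. alpha = backtracking_alpha(cost, grad, np.array([0.1, 0.1]))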