I am reviewing material I worked through in Andrew Ng's ML course and trying to implement it in TensorFlow. Using scipy's optimize function I can reach a cost of 0.213, but in TensorFlow it gets stuck at 0.622, not far from the initial loss of 0.693 obtained with the weights initialized to zero. The TensorFlow GradientDescentOptimizer is not converging to the expected cost.
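For reference, the scipy version I am comparing against looks roughly like this (reconstructed from memory, so treat it as a sketch; cost_function is my own helper and get_data() is the same loader used in the TensorFlow code below):

import numpy as np
from scipy import optimize

def cost_function(theta, X, y):
    # Vectorized logistic-regression cost:
    # J = (1/m) * sum(-y*log(h) - (1-y)*log(1-h))
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-X.dot(theta)))
    eps = 1e-5  # same clamping idea as in the TensorFlow version
    return (-y.dot(np.log(np.maximum(h, eps)))
            - (1 - y).dot(np.log(np.maximum(1 - h, eps)))) / m

X_val, y_val = get_data()
result = optimize.minimize(cost_function, np.zeros(3),
                           args=(X_val, y_val.ravel()), method='TNC')
print(result.fun)  # this is where I see ~0.213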
After reviewing here, I added tf.maximum calls to my loss function to prevent NaNs. I don't believe this is the right approach and suspect there is a better way. I also tried using tf.clip_by_value instead, but it gives the same non-optimizing cost. (A sketch of the alternative loss formulation I'm considering appears after the code below.)
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

iterations = 1500

with tf.Session() as sess:
    X = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    theta = tf.Variable(tf.zeros([3, 1], dtype=tf.float32))
    training_rows = tf.placeholder(tf.float32)

    # Hypothesis: h(x) = sigmoid(X * theta)
    z = tf.matmul(X, theta)
    h_x = 1.0 / (1.0 + tf.exp(-z))

    # Cross-entropy cost, with tf.maximum clamping the log argument to avoid NaNs
    lhs = tf.matmul(tf.transpose(-y), tf.log(tf.maximum(1e-5, h_x)))
    rhs = tf.matmul(tf.transpose(1 - y), tf.log(tf.maximum(1e-5, 1 - h_x)))
    loss = tf.reduce_sum(lhs - rhs) / training_rows

    alpha = 0.001
    optimizer = tf.train.GradientDescentOptimizer(alpha)
    train = optimizer.minimize(loss)

    # Run the session
    X_val, y_val = get_data()
    rows = X_val.shape[0]
    kwargs = {X: X_val, y: y_val, training_rows: rows}

    sess.run(tf.global_variables_initializer())
    # (Re-)initialize theta to zeros, matching the course exercise
    sess.run(tf.assign(theta, np.zeros((3, 1), dtype=np.float32)))
    print("Original cost before optimization is: {}".format(sess.run(loss, kwargs)))
    print("Optimizing loss function")

    costs = []
    for i in range(iterations):
        sess.run(train, kwargs)
        costs.append(sess.run(loss, kwargs))

    optimal_theta, final_loss = sess.run([theta, loss], kwargs)
    print("Optimal value for theta is: {} with a loss of: {}".format(optimal_theta, final_loss))

    plt.plot(costs)
    plt.show()
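For what it's worth, I understand the usual way to avoid the NaNs is to compute the cross-entropy directly from the logits rather than clamping log(h_x). A minimal sketch of what I think the replacement would look like, with z and y as defined in the code above:

# Fold the sigmoid into the loss so log(0) can never occur; this would replace
# the lhs/rhs construction and the tf.maximum clamping above.
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=z)
loss = tf.reduce_mean(cross_entropy)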
I also noticed that any learning rate larger than 0.001 makes the optimizer bounce the loss back and forth rather than decrease it. Is this normal? Finally, when I increased the number of iterations to 25,000, the cost eventually came down to 0.53, but I expected it to converge in far fewer iterations.
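One thing I still want to rule out is feature scaling: if get_data() returns the raw exam scores from the exercise (values up to around 100), plain gradient descent would need a tiny learning rate and many iterations, which would match what I'm seeing. This is the standardization I have in mind, assuming column 0 of X_val is the bias column of ones:

X_val, y_val = get_data()
# Standardize each feature column to zero mean and unit variance,
# leaving the (assumed) bias column of ones untouched.
mu = X_val[:, 1:].mean(axis=0)
sigma = X_val[:, 1:].std(axis=0)
X_val[:, 1:] = (X_val[:, 1:] - mu) / sigma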