
I am trying to implement a simple RNN in numpy (based on this article) and I am training it to do binary addition: it adds two 8-bit unsigned integers one bit at a time (starting from the least significant bit), learning to "carry the one" when needed. However, it does not seem to be learning. For training I pick two random numbers, forward propagate for 8 steps with one bit of a and one bit of b as the input at each time step, storing the output and the hidden layer values at every step, and then backpropagate for 8 steps, where I compute the hidden layer error as (output_error.dot(weights_hidden_to_output.T) * sigmoid_to_derivative(hidden)) + future_hidden_error.dot(weights_hidden_to_hidden.T) and compute the update for each weight matrix by matrix-multiplying the parent layer's values with the child layer's error. Is this the right approach, and why isn't my RNN learning?

Here is my code, in case it makes things clearer. I have noticed that for some reason the weights suddenly start growing every time I train, and they cause an overflow in the sigmoid function, which makes training fail. Any idea what could be causing this? (A note on guarding against the overflow itself follows the code below.)

import numpy as np 
np.random.seed(0) 

def sigmoid(x): 
    return np.atleast_2d(1/(1+np.exp(-x))) 
    #return np.atleast_2d(np.max(x, 0.01)) 
def sig_deriv(x): 
    return x*(1-x) 
def add_bias(x): 
    return np.hstack([np.ones((len(x), 1)), x]) 
def dec_to_bin(dec): 
    return np.array(map(int, list(format(dec, '#010b'))[2:])) 
def bin_to_dec(b): 
    out = 0 
    for bit in b: 
        out = (out << 1) | bit
    return out 


batch_size = 8 
learning_rate = .1 

input_size = 2 
hidden_size = 16 
output_size = 1 

weights_xh = 2 * np.random.random((input_size+1, hidden_size)) - 1 
weights_hh = 2 * np.random.random((hidden_size+1, hidden_size)) - 1 
weights_hy = 2 * np.random.random((hidden_size+1, output_size)) - 1 

xh_update = np.zeros_like(weights_xh) 
hh_update = np.zeros_like(weights_hh) 
hy_update = np.zeros_like(weights_hy) 

for i in xrange(10000): 
    a = np.random.randint(0, 2**batch_size/2) 
    b = np.random.randint(0, 2**batch_size/2) 
    sum_ = a+b 
    X = add_bias(np.hstack([np.atleast_2d(dec_to_bin(a)).T, np.atleast_2d(dec_to_bin(b)).T])) 
    y = np.atleast_2d(dec_to_bin(sum_)).T 

    error = 0 

    output_errors = [] 
    outputs = [] 
    hiddens = [add_bias(np.zeros((1, hidden_size)))] 
    #forward propagation through time 
    for j in xrange(batch_size): 
        hidden = sigmoid(X[-j-1].dot(weights_xh) + hiddens[-1].dot(weights_hh))
        hidden = add_bias(hidden)
        hiddens.append(hidden)
        output = sigmoid(hidden.dot(weights_hy))
        outputs.append(output[0][0])
        output_error = (y[-j-1] - output)
        error += np.abs(output_error[0])
        output_errors.append((output_error * sig_deriv(output)))

    future_hidden_error = np.zeros((1,hidden_size)) 
    #backward propagation through time
    for j in xrange(batch_size): 
        output_error = output_errors[-j-1]
        hidden = hiddens[-j-1]
        prev_hidden = hiddens[-j-2]

        hidden_error = (output_error.dot(weights_hy.T) * sig_deriv(hidden)) + future_hidden_error.dot(weights_hh.T)
        hidden_error = np.delete(hidden_error, 0, 1) #delete bias error

        xh_update += np.atleast_2d(X[j]).T.dot(hidden_error)
        hh_update += prev_hidden.T.dot(hidden_error)
        hy_update += hidden.T.dot(output_error)

        future_hidden_error = hidden_error

    weights_xh += (xh_update * learning_rate)/batch_size 
    weights_hh += (hh_update * learning_rate)/batch_size 
    weights_hy += (hy_update * learning_rate)/batch_size 

    xh_update *= 0 
    hh_update *= 0 
    hy_update *= 0 

    if i%1000==0: 
        guess = map(int, map(round, outputs[::-1]))
        print "Iteration {}".format(i)
        print "Error: {}".format(error)
        print "Problem: {} + {} = {}".format(a, b, sum_)
        print "a:  {}".format(list(dec_to_bin(a)))
        print "+ b:  {}".format(list(dec_to_bin(b)))
        print "Solution: {}".format(map(int, y))
        print "Guess: {} ({})".format(guess, bin_to_dec(guess))
        print
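
As an aside, the overflow warnings from np.exp can be guarded against by clipping the pre-activation before exponentiating. This is only a numerical safeguard and does not explain why the weights blow up in the first place; a minimal sketch:

def sigmoid(x):
    #clip the pre-activation so np.exp never overflows;
    #this hides the symptom, not the cause of the growing weights
    return np.atleast_2d(1/(1+np.exp(-np.clip(x, -60, 60))))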

Answer

I figured it out. In case anyone is wondering why it wasn't working: it was because I was only multiplying one part of the hidden error (the part coming from the output error) by the derivative of the hidden layer activation. Now it easily learns the addition problem within a few thousand iterations.
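
In terms of the code above, the fix amounts to multiplying the entire hidden error (the term coming from the output plus the term coming from the future hidden state) by the derivative of the hidden activation, roughly:

hidden_error = (output_error.dot(weights_hy.T) + future_hidden_error.dot(weights_hh.T)) * sig_deriv(hidden)
hidden_error = np.delete(hidden_error, 0, 1) #delete bias error

With the derivative applied to the full sum, the recurrent error term is no longer passed back unscaled at every step, which is presumably what was driving the weights (and the sigmoid inputs) to blow up.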
