2017-07-03

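Outputting a sequence in a TensorFlow RNN

I've written a simple TensorFlow program that tries to predict the next character from the previous 3 characters of a body of text.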

A single input might look like this:

np.array(['t','h','i']) 

with the corresponding target being something like:

np.array(['s']) 
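
Input/target pairs like the one above can be produced with a sliding window over the text. The following is only a sketch: `make_pairs`, `n_in` and `n_out` are hypothetical names that do not appear in the original program, and the real `input_fn` may work differently. Setting `n_out` above 1 yields several target characters per input.

```python
def make_pairs(text, n_in=3, n_out=1):
    """Slide a window over text and return (input chars, target chars) pairs.

    Hypothetical helper for illustration; not part of the original program.
    """
    pairs = []
    for i in range(len(text) - n_in - n_out + 1):
        x = list(text[i:i + n_in])                  # the n_in context characters
        y = list(text[i + n_in:i + n_in + n_out])   # the n_out characters that follow
        pairs.append((x, y))
    return pairs


single = make_pairs("this is", n_in=3, n_out=1)  # one target character per input
multi = make_pairs("this is", n_in=3, n_out=4)   # four target characters per input
```

The characters would still need to be mapped to integer ids before being fed to the placeholders.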

I'd like to extend this to output the next 4 characters rather than just the next one. To do that, I tried feeding a longer array to y:

np.array(['s',' ','i'])

and also changing y to:

y = tf.placeholder(dtype=tf.int32, shape=[None, n_steps]) 

However, this produces the following error:

Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
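
The message reflects a shape rule: sparse cross-entropy expects one integer class id per prediction, so the labels must have exactly one dimension fewer than the logits. The sketch below mirrors that rule on plain shape lists. `check_label_rank` is a hypothetical helper, and the sizes are illustrative values, not taken from the program.

```python
def check_label_rank(labels_shape, logits_shape):
    """Return True when labels have exactly one dimension fewer than logits.

    Hypothetical helper that only mirrors the rank rule quoted in the error.
    """
    return len(labels_shape) == len(logits_shape) - 1


batch, n_steps, vocab = 32, 3, 50  # illustrative sizes only

# Original setup: logits come from the final state, one label per example.
ok_original = check_label_rank([batch], [batch, vocab])
# Changing only y: labels gain a time axis but logits do not, hence the error.
ok_changed_y = check_label_rank([batch, n_steps], [batch, vocab])
# Per-step logits (one score vector per time step) match per-step labels.
ok_per_step = check_label_rank([batch, n_steps], [batch, n_steps, vocab])
```

In other words, giving y a time axis only works if the logits also get one.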

Here is the full code:

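import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# vocab_size, n_steps, input_fn, w_to_id and id_to_w are defined elsewhere (not shown)
embedding_size = 40
n_neurons = 200
n_output = vocab_size
learning_rate = 0.001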

with tf.Graph().as_default(): 
    x = tf.placeholder(dtype=tf.int32, shape=[None, n_steps]) 
    y = tf.placeholder(dtype=tf.int32, shape=[None]) 
    seq_length = tf.placeholder(tf.int32, [None]) 

    # Set up the embedding that maps character ids to vectors
    embeddings = tf.Variable(tf.random_uniform(shape=[vocab_size, embedding_size], minval=-1, maxval=1)) 
    train_input = tf.nn.embedding_lookup(embeddings, x) 

    basic_cell = tf.nn.rnn_cell.GRUCell(num_units=n_neurons) 
    outputs, states = tf.nn.dynamic_rnn(basic_cell, train_input, sequence_length=seq_length, dtype=tf.float32) 

    logits = tf.layers.dense(states, units=vocab_size, activation=None) 
    predictions = tf.nn.softmax(logits) 
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=y,
        logits=logits)
    loss = tf.reduce_mean(xentropy) 
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate) 
    training_op = optimizer.minimize(loss) 

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for r in range(1000):
            x_batch, y_batch, seq_length_batch = input_fn()
            feed_dict = {x: x_batch, y: y_batch, seq_length: seq_length_batch}
            _, loss_out = sess.run([training_op, loss], feed_dict=feed_dict)
            if r % 1000 == 0:
                print("loss_out", loss_out)

        sample_text = "for th"
        sample_text_ids = np.expand_dims(np.array([w_to_id[c] for c in sample_text]+[0, 0], dtype=np.int32), 0)
        prediction_out = sess.run(predictions, feed_dict={x: sample_text_ids, seq_length: np.array([len(sample_text)])})
        print("Result:", id_to_w[np.argmax(prediction_out)])

Answer

For a many-to-many RNN, you should use tf.contrib.seq2seq.sequence_loss to compute the loss at each time step. Your code should look something like this:

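...
# Project every time step: outputs has shape [batch, n_steps, n_neurons],
# so logits has shape [batch, n_steps, vocab_size]; y must be [batch, n_steps].
logits = tf.layers.dense(outputs, units=vocab_size, activation=None)
# Float mask that zeroes the loss on padded time steps
weights = tf.sequence_mask(seq_length, n_steps, dtype=tf.float32)
xentropy = tf.contrib.seq2seq.sequence_loss(logits, y, weights)
...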

For more details on tf.contrib.seq2seq.sequence_loss, see the TensorFlow API documentation.
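
To make the idea of a masked per-time-step loss concrete, here is a small pure-Python sketch of what such a loss computes: the cross-entropy at every (batch, time) position, multiplied by a 0/1 weight, then divided by the sum of the weights. This is an illustration under my own assumptions about the default averaging, not the library's implementation. The function names are hypothetical and the toy logits are made up.

```python
import math


def sequence_mask(lengths, maxlen):
    """Row i holds 1.0 for the first lengths[i] steps and 0.0 afterwards."""
    return [[1.0 if t < n else 0.0 for t in range(maxlen)] for n in lengths]


def log_softmax(scores):
    """Numerically stable log-softmax over one list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def masked_sequence_loss(logits, targets, weights):
    """Weighted mean cross-entropy over all (batch, time) positions.

    logits:  [batch][time][vocab] raw scores
    targets: [batch][time] integer class ids
    weights: [batch][time] floats, 0.0 for padded steps
    """
    total, norm = 0.0, 0.0
    for seq_logits, seq_targets, seq_weights in zip(logits, targets, weights):
        for step_logits, target, w in zip(seq_logits, seq_targets, seq_weights):
            total += -log_softmax(step_logits)[target] * w
            norm += w
    return total / (norm + 1e-12)  # small epsilon guards against an all-zero mask


# Toy batch: 2 sequences, 3 time steps, vocabulary of 2 (made-up numbers).
logits = [[[0.0, 0.0], [0.0, 0.0], [5.0, -5.0]],
          [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]]
targets = [[0, 1, 1], [1, 0, 0]]
weights = sequence_mask([2, 3], maxlen=3)  # first sequence has one padded step
loss = masked_sequence_loss(logits, targets, weights)
```

The third step of the first sequence would contribute a large loss, but its weight is 0.0, so only the five unpadded positions count toward the average.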