
I'm building an encoder-decoder model with the legacy sequence-to-sequence framework in TensorFlow 1.0.1. Everything works fine when I have one layer of LSTMs in the encoder and decoder. However, when I try to use more than one layer of LSTMs wrapped in a MultiRNNCell, I get an error when calling tf.contrib.legacy_seq2seq.rnn_decoder. TensorFlow throws the error only when MultiRNNCell is used.

The full error is at the end of this post, but in brief, it is caused by the TensorFlow line

(c_prev, m_prev) = state 

throwing TypeError: 'Tensor' object is not iterable. I'm confused, because the initial state I'm passing to rnn_decoder is indeed a tuple, as it should be. As far as I can tell, the only difference between using 1 layer and more than 1 layer is that the latter involves MultiRNNCell. Are there some API quirks I should know about when using it?

Here is my code (based on the example in this GitHub repo). Apologies for its length; it is as minimal as I could make it while keeping it complete and verifiable.

import tensorflow as tf 
import tensorflow.contrib.legacy_seq2seq as seq2seq 
import tensorflow.contrib.rnn as rnn 

seq_len = 50 
input_dim = 300 
output_dim = 12 
num_layers = 2 
hidden_units = 100 

sess = tf.Session() 

encoder_inputs = [] 
decoder_inputs = [] 

for i in range(seq_len): 
    encoder_inputs.append(tf.placeholder(tf.float32, shape=(None, input_dim), 
             name="encoder_{0}".format(i))) 

for i in range(seq_len + 1): 
    decoder_inputs.append(tf.placeholder(tf.float32, shape=(None, output_dim), 
             name="decoder_{0}".format(i))) 

if num_layers > 1: 
    # Encoder cells (bidirectional) 
    # Forward 
    enc_cells_fw = [rnn.LSTMCell(hidden_units) 
        for _ in range(num_layers)] 
    enc_cell_fw = rnn.MultiRNNCell(enc_cells_fw) 
    # Backward 
    enc_cells_bw = [rnn.LSTMCell(hidden_units) 
        for _ in range(num_layers)] 
    enc_cell_bw = rnn.MultiRNNCell(enc_cells_bw) 
    # Decoder cell 
    dec_cells = [rnn.LSTMCell(2*hidden_units) 
       for _ in range(num_layers)] 
    dec_cell = rnn.MultiRNNCell(dec_cells) 
else: 
    # Encoder 
    enc_cell_fw = rnn.LSTMCell(hidden_units) 
    enc_cell_bw = rnn.LSTMCell(hidden_units) 
    # Decoder 
    dec_cell = rnn.LSTMCell(2*hidden_units) 

# Make sure input and output are the correct dimensions 
enc_cell_fw = rnn.InputProjectionWrapper(enc_cell_fw, input_dim) 
enc_cell_bw = rnn.InputProjectionWrapper(enc_cell_bw, input_dim) 
dec_cell = rnn.OutputProjectionWrapper(dec_cell, output_dim) 

_, final_fw_state, final_bw_state = \ 
    rnn.static_bidirectional_rnn(enc_cell_fw, 
            enc_cell_bw, 
            encoder_inputs, 
            dtype=tf.float32) 

# Concatenate forward and backward cell states 
# (The state is a tuple of previous output and cell state) 
if num_layers == 1: 
    initial_dec_state = tuple([tf.concat([final_fw_state[i],
                                          final_bw_state[i]], 1)
                               for i in range(2)])
else:
    initial_dec_state = tuple([tf.concat([final_fw_state[-1][i],
                                          final_bw_state[-1][i]], 1)
                               for i in range(2)])

decoder = seq2seq.rnn_decoder(decoder_inputs, initial_dec_state, dec_cell) 

tf.global_variables_initializer().run(session=sess) 

Here is the error:

Traceback (most recent call last):
  File "example.py", line 67, in <module>
    decoder = seq2seq.rnn_decoder(decoder_inputs, initial_dec_state, dec_cell)
  File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 150, in rnn_decoder
    output, state = cell(inp, state)
  File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 426, in __call__
    output, res_state = self._cell(inputs, state)
  File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 655, in __call__
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 321, in __call__
    (c_prev, m_prev) = state
  File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 502, in __iter__
    raise TypeError("'Tensor' object is not iterable.")
TypeError: 'Tensor' object is not iterable.

Thanks!

Answer


The problem is the format of the initial state (initial_dec_state) passed to seq2seq.rnn_decoder.

When you use rnn.MultiRNNCell, you are building a multilayer recurrent network, so you need to provide an initial state for each of those layers.

Therefore, the initial state you provide should be a list, where each element of the list is the previous state of the corresponding layer of the recurrent network.

So your initial_dec_state, initialized like this:

initial_dec_state = tuple([tf.concat([final_fw_state[-1][i],
                                      final_bw_state[-1][i]], 1)
                           for i in range(2)])

should instead be something like this:

initial_dec_state = [
    tuple([tf.concat([final_fw_state[j][i], final_bw_state[j][i]], 1)
           for i in range(2)])
    for j in range(len(final_fw_state))
]

which creates a list of tuples in the format:

[(state_c1, state_m1), (state_c2, state_m2) ...] 
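
Equivalently (just a sketch, not required by the fix above), each per-layer state can be built as a tf.contrib.rnn.LSTMStateTuple, which is the same structure that the cell's zero_state method returns; final_fw_state and final_bw_state below are the per-layer LSTMStateTuples from the code in the question:

# Sketch: one LSTMStateTuple (c = cell state, h = output) per decoder layer,
# built by concatenating the forward and backward encoder states of that layer.
initial_dec_state = tuple(
    rnn.LSTMStateTuple(
        c=tf.concat([final_fw_state[j].c, final_bw_state[j].c], 1),
        h=tf.concat([final_fw_state[j].h, final_bw_state[j].h], 1))
    for j in range(len(final_fw_state)))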

In more detail, the 'Tensor' object is not iterable. error happens because seq2seq.rnn_decoder internally calls your rnn.MultiRNNCell (dec_cell), passing the initial state (initial_dec_state) to it.

rnn.MultiRNNCell.__call__ iterates over the list of initial states, and for each of them it extracts the tuple (c_prev, m_prev) (in the statement (c_prev, m_prev) = state).

So if you pass just a single tuple, rnn.MultiRNNCell.__call__ iterates over it, and as soon as it reaches (c_prev, m_prev) = state it finds a Tensor (which should instead be a tuple) as state, and throws the 'Tensor' object is not iterable. error.

A good way to see which initial-state format seq2seq.rnn_decoder expects is to call dec_cell.zero_state(batch_size, dtype=tf.float32). This method returns zero-filled state tensors in the exact format needed to initialize the recurrent module you are using.
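
For example, with the decoder cell from the question (a minimal sketch; the batch size of 32 is just a placeholder for inspection):

# Sketch: inspect the state structure that dec_cell expects.
# With num_layers == 2 this prints a tuple of two LSTMStateTuple(c, h) pairs,
# one per layer of the MultiRNNCell.
batch_size = 32  # placeholder value, only used to build the zero state
zero_state = dec_cell.zero_state(batch_size, dtype=tf.float32)
print(zero_state)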


Great answer! Informative and helpful. – AlVaz