
Tensorflow 1.1 MultiRNNCell shape error (init_state related)

Update: I am fairly convinced the error is tied to init_state, which is created and fed into tf.nn.dynamic_rnn(...) as an argument. So the question becomes: what is the correct form, or the correct way to construct, the initial state of a stacked RNN?

I am trying to get a MultiRNNCell definition working in TensorFlow 1.1.

The graph definition (with a helper function to define the GRU cells) is shown below. The basic idea is that the placeholder x holds a long, flat sequence of numeric data samples. It is reshaped into equal-length frames, and one frame is presented at each time step. I then want to run this through two (for now) stacked GRU cells.

def gru_cell(state_size): 
    cell = tf.contrib.rnn.GRUCell(state_size) 
    return cell 

graph = tf.Graph() 
with graph.as_default(): 

    x = tf.placeholder(tf.float32, [batch_size, num_samples], name="Input_Placeholder") 
    y = tf.placeholder(tf.int32, [batch_size, num_frames], name="Labels_Placeholder") 

    init_state = tf.zeros([batch_size, state_size], name="Initial_State_Placeholder") 

    rnn_inputs = tf.reshape(x, (batch_size, num_frames, frame_length)) 
    cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(2)], state_is_tuple=False) 
    rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state) 

The graph definition continues from there with a loss function, optimizer, and so on, but this is where it breaks, with the following lengthy error.

It will become relevant that batch_size is 10, and that frame_length and state_size are both 80.
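
For completeness, a minimal set of dimension definitions consistent with those values (num_frames is not stated in the question and is assumed here only so the snippet above is self-contained):

batch_size = 10                           # sequences per batch 
frame_length = 80                         # samples per frame (one frame per time step) 
state_size = 80                           # GRU hidden state size 
num_frames = 100                          # assumed; not given in the question 
num_samples = num_frames * frame_length   # length of each flat row fed to x 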

ValueError        Traceback (most recent call last) 
<ipython-input-30-4c48b596e055> in <module>() 
    14  print(rnn_inputs) 
    15  cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(2)], state_is_tuple=False) 
---> 16  rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state) 
    17 
    18  with tf.variable_scope('softmax'): 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/rnn.pyc in dynamic_rnn(cell, inputs, sequence_length, initial_state, dtype, parallel_iterations, swap_memory, time_major, scope) 
    551   swap_memory=swap_memory, 
    552   sequence_length=sequence_length, 
--> 553   dtype=dtype) 
    554 
    555  # Outputs of _dynamic_rnn_loop are always shaped [time, batch, depth]. 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/rnn.pyc in _dynamic_rnn_loop(cell, inputs, initial_state, parallel_iterations, swap_memory, sequence_length, dtype) 
    718  loop_vars=(time, output_ta, state), 
    719  parallel_iterations=parallel_iterations, 
--> 720  swap_memory=swap_memory) 
    721 
    722 # Unpack final output if not using output tuples. 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name) 
    2621  context = WhileContext(parallel_iterations, back_prop, swap_memory, name) 
    2622  ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, context) 
-> 2623  result = context.BuildLoop(cond, body, loop_vars, shape_invariants) 
    2624  return result 
    2625 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in BuildLoop(self, pred, body, loop_vars, shape_invariants) 
    2454  self.Enter() 
    2455  original_body_result, exit_vars = self._BuildLoop(
-> 2456   pred, body, original_loop_vars, loop_vars, shape_invariants) 
    2457  finally: 
    2458  self.Exit() 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants) 
    2435  for m_var, n_var in zip(merge_vars, next_vars): 
    2436  if isinstance(m_var, ops.Tensor): 
-> 2437   _EnforceShapeInvariant(m_var, n_var) 
    2438 
    2439  # Exit the loop. 

/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in _EnforceShapeInvariant(merge_var, next_var) 
    565   "Provide shape invariants using either the `shape_invariants` " 
    566   "argument of tf.while_loop or set_shape() on the loop variables." 
--> 567   % (merge_var.name, m_shape, n_shape)) 
    568 else: 
    569  if not isinstance(var, (ops.IndexedSlices, sparse_tensor.SparseTensor)): 

ValueError: The shape for rnn/while/Merge_2:0 is not an invariant for the loop. It enters the loop with shape (10, 80), but has shape (10, 160) after one iteration. Provide shape invariants using either the `shape_invariants` argument of tf.while_loop or set_shape() on the loop variables. 

The last part of the error almost makes it look as if the network starts out as a 2-layer stack of 80s and is somehow being turned into a single stack of 160. Any help resolving this? Am I misunderstanding how MultiRNNCell is used?


Shouldn't that be `init_state = tf.zeros([batch_size, 2 * state_size], ...)`? – Allen Lavoie

Answer


Based on Allen Lavoie's comment above, the corrected code is:

def gru_cell(state_size): 
    cell = tf.contrib.rnn.GRUCell(state_size) 
    return cell 

num_layers = 2 # <--------- 
graph = tf.Graph() 
with graph.as_default(): 

    x = tf.placeholder(tf.float32, [batch_size, num_samples], name="Input_Placeholder") 
    y = tf.placeholder(tf.int32, [batch_size, num_frames], name="Labels_Placeholder") 

    init_state = tf.zeros([batch_size, num_layers * state_size], name="Initial_State_Placeholder") # <--------- 

    rnn_inputs = tf.reshape(x, (batch_size, num_frames, frame_length)) 
    cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(num_layers)], state_is_tuple=False) # <--------- 
    rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state) 

Note the three changes marked above. Also note that these changes have to ripple out to anywhere init_state flows, in particular if you feed a value for it through feed_dict.
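
To make that last point concrete, here is a rough sketch of a training loop that carries the state across batches and feeds it back in through feed_dict. The names train_step, batches, x_batch, and y_batch are hypothetical (the loss/optimizer part of the graph is not shown above); the key point is that the fed array must now have the widened [batch_size, num_layers * state_size] shape:

import numpy as np 

with tf.Session(graph=graph) as sess: 
    sess.run(tf.global_variables_initializer()) 
    # With state_is_tuple=False the layers' states are concatenated, 
    # so the carried state is [batch_size, num_layers * state_size]. 
    state = np.zeros((batch_size, num_layers * state_size), dtype=np.float32) 
    for x_batch, y_batch in batches:                         # hypothetical batch iterator 
        feed = {x: x_batch, y: y_batch, init_state: state}   # feeding overrides the tf.zeros default 
        state, _ = sess.run([final_state, train_step], feed_dict=feed) 

Alternatively, leaving state_is_tuple=True (the default) and passing cell.zero_state(batch_size, tf.float32) as initial_state sidesteps computing the concatenated width by hand; the trade-off is that final_state then comes back as a per-layer tuple rather than a single tensor.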