Why does the dimension of an LSTMCell's input have to be num_units?

When I read the LSTMCell implementation in TensorFlow's rnn_cell.py, I saw the following:

def __call__(self, inputs, state, scope=None):
    """Run one step of LSTM.

    Args:
        inputs: input Tensor, 2D, batch x num_units.
        state: if `state_is_tuple` is False, this must be a state Tensor,
            `2-D, batch x state_size`. If `state_is_tuple` is True, this must be a
            tuple of state Tensors, both `2-D`, with column sizes `c_state` and
            `m_state`.
        scope: VariableScope for the created subgraph; defaults to "LSTMCell".

    Returns:
        A tuple containing:
        - A `2-D, [batch x output_dim]`, Tensor representing the output of the
          LSTM after reading `inputs` when previous state was `state`.
          Here output_dim is:
              num_proj if num_proj was set,
              num_units otherwise.
        - Tensor(s) representing the new state of LSTM after reading `inputs` when
          the previous state was `state`. Same type and shape(s) as `state`.

    Raises:
        ValueError: If input size cannot be inferred from inputs via
            static shape inference.
    """
    num_proj = self._num_units if self._num_proj is None else self._num_proj

    if self._state_is_tuple:
        (c_prev, m_prev) = state
    else:

I would like to know why the dimension of inputs has to match the LSTM's number of units (num_units). I expected them to be completely unrelated, but somehow they are not.

Does anyone know why?

Answer

It does not need to match the cell's number of units (i.e., the hidden size).
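As a quick sanity check, here is a minimal sketch (assuming the TF 1.x `tf.nn.rnn_cell.LSTMCell` API; in some 1.x releases the class lives in `tf.contrib.rnn` instead) where the input depth is deliberately different from num_units; the sizes are arbitrary illustration values:

import tensorflow as tf

batch_size, input_depth, num_units = 32, 50, 128  # input depth != num_units

inputs = tf.placeholder(tf.float32, [batch_size, input_depth])
cell = tf.nn.rnn_cell.LSTMCell(num_units, state_is_tuple=True)
state = cell.zero_state(batch_size, tf.float32)

output, new_state = cell(inputs, state)
print(output.get_shape())  # (32, 128) -> batch x num_units

The cell infers the input depth from the static shape of `inputs` (hence the ValueError mentioned in the docstring above); only the output and state widths are tied to num_units.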

First:

num_proj: (optional) int, the output dimensionality for the projection matrices. If None, no projection is performed.

That is to say, num_proj is the dimensionality of the cell's output, and it does not have to match num_units (the hidden dimension). Usually, when decoding, we want the output to have the same dimensionality as the vocabulary (not the hidden dimension / num_units).

if self._num_proj is not None:
    with vs.variable_scope("projection") as proj_scope:
        if self._num_proj_shards is not None:
            proj_scope.set_partitioner(
                partitioned_variables.fixed_size_partitioner(
                    self._num_proj_shards))
        m = _linear(m, self._num_proj, bias=False)

As you can see above, the output (m) is simply projected/transformed through _linear so that it has num_proj dimensions. If num_proj is None, the output keeps the hidden dimension (num_units) by default.
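From the caller's side, the effect of num_proj can be seen in a small hedged sketch (same TF 1.x API assumption as above): the projected output and the m part of the state get num_proj columns, while the internal cell state c keeps num_units:

import tensorflow as tf

cell = tf.nn.rnn_cell.LSTMCell(num_units=128, num_proj=64, state_is_tuple=True)
inputs = tf.placeholder(tf.float32, [32, 50])
state = cell.zero_state(32, tf.float32)

output, (c, m) = cell(inputs, state)
print(output.get_shape())  # (32, 64)  -> batch x num_proj
print(c.get_shape())       # (32, 128) -> batch x num_units
print(m.get_shape())       # (32, 64)  -> batch x num_proj

The _linear helper used for that projection is defined as follows: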

def _linear(args, output_size, bias, bias_start=0.0, scope=None):
    """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

    Args:
        args: a 2D Tensor or a list of 2D, batch x n, Tensors.
        output_size: int, second dimension of W[i].
        bias: boolean, whether to add a bias term or not.
        bias_start: starting value to initialize the bias; 0 by default.
        scope: VariableScope for the created subgraph; defaults to "Linear".

    Returns:
        A 2D Tensor with shape [batch x output_size] equal to
        sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

    Raises:
        ValueError: if some of the arguments has unspecified or wrong shape.
    """
    if args is None or (isinstance(args, (list, tuple)) and not args):
        raise ValueError("`args` must be specified")
    if not isinstance(args, (list, tuple)):
        args = [args]

    # Calculate the total size of arguments on dimension 1.
    total_arg_size = 0
    shapes = [a.get_shape().as_list() for a in args]
    for shape in shapes:
        if len(shape) != 2:
            raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
        if not shape[1]:
            raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
        else:
            total_arg_size += shape[1]

    # Now the computation.
    with tf.variable_scope(scope or "Linear"):
        matrix = tf.get_variable("Matrix", [total_arg_size, output_size])
        if len(args) == 1:
            res = tf.matmul(args[0], matrix)
        else:
            res = tf.matmul(tf.concat(axis=1, values=args), matrix)
        if not bias:
            return res
        bias_term = tf.get_variable(
            "Bias", [output_size],
            initializer=tf.constant_initializer(bias_start))
    return res + bias_term
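To tie this back to the original question: inside the cell, _linear is applied to the list [inputs, m_prev], so the input depth and the (projected) state width are concatenated and summed into the weight matrix's first dimension rather than being forced to be equal. A rough NumPy sketch of that shape arithmetic (illustrative shapes, ignoring num_proj):

import numpy as np

batch_size, input_depth, num_units = 32, 50, 128

inputs = np.random.randn(batch_size, input_depth)
m_prev = np.random.randn(batch_size, num_units)  # previous output, no projection

# Plays the role of _linear([inputs, m_prev], 4 * num_units, bias=True):
weights = np.random.randn(input_depth + num_units, 4 * num_units)
bias = np.zeros(4 * num_units)

gates = np.concatenate([inputs, m_prev], axis=1).dot(weights) + bias
print(gates.shape)  # (32, 512) -> batch x 4*num_units, later split into the i, j, f, o gates

So the only place the input size enters is total_arg_size above; the "batch x num_units" in the docstring appears to describe the common case rather than a hard requirement.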