How do I access the weights of a recurrent cell in TensorFlow?

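One way to improve the stability of deep Q-learning tasks is to maintain a set of target weights for the network, which are updated slowly and used to compute the Q-value targets. As a result, two different sets of weights are used in the forward pass at different times in the learning procedure. For a normal DQN this is not hard to implement, because the weights are TensorFlow variables that can be set in the feed_dict, i.e.: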

import tensorflow as tf

sess = tf.Session()
input = tf.placeholder(tf.float32, shape=[None, 5])
weights = tf.Variable(tf.random_normal(shape=[5, 4], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[4]))
output = tf.matmul(input, weights) + bias
target = tf.placeholder(tf.float32, [None, 4])
loss = ...

...

# Here we explicitly set weights to be the slowly updated target weights
sess.run(output, feed_dict={input: states, weights: target_weights, bias: target_bias})

# Targets for the learning procedure are computed using this output.

....

# Now we run the learning procedure, using the most up to date weights,
# as well as the previously computed targets
sess.run(loss, feed_dict={input: states, target: targets})
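
The snippet above takes target_weights and target_bias as given. One way they might be maintained is sketched below in plain Python. The soft-update rule and the names soft_update and tau are assumptions made for illustration, since nothing above says how the target weights are refreshed; copying the online weights every N steps would be an equally reasonable choice.

```python
def soft_update(target, online, tau=0.01):
    """Move each target weight a fraction tau toward the online weight.

    Both arguments are flat lists of floats standing in for weight arrays.
    """
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


# Hypothetical values: the online weights have moved, the target lags behind.
online_weights = [1.0, -2.0, 0.5]
target_weights = [0.0, 0.0, 0.0]

# Each call closes part of the remaining gap between target and online weights.
for _ in range(3):
    target_weights = soft_update(target_weights, online_weights, tau=0.5)
```

The resulting values are what would be passed as target_weights in the feed_dict when computing the targets.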

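I would like to use this target-network technique in a recurrent version of DQN, but I don't know how to access and set the weights used inside a recurrent cell. Specifically, I am using tf.nn.rnn_cell.BasicLSTMCell, but I would like to know how to do this for any type of recurrent cell.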

Source

2016-11-27 John H

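BasicLSTMCell does not expose its variables as part of its public API. I recommend that you look up the names these variables have in your graph and feed those names (the names are unlikely to change, since they are stored in checkpoints and changing them would break checkpoint compatibility).

Alternatively, you can make a copy of BasicLSTMCell that does expose the variables. I think this is the cleanest approach.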

Source

2016-11-28 18:01:18

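This works, thanks Alexandre. For anyone who wants more detail: the weight and bias variables are created when you feed the recurrent cell into tf.nn.dynamic_rnn(). After running tf.initialize_all_variables() in the session, two new trainable tensors show up if you run tf.trainable_variables(). In my case they were named "RNN/BasicLSTMCell/Linear/Matrix:0" and "RNN/BasicLSTMCell/Linear/Bias:0".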
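
The lookup described in this comment can be sketched as a small helper. It is written against any object with a .name attribute so that it runs without TensorFlow; FakeVariable, find_variable and the placeholder target values are hypothetical stand-ins, and in a real graph the list would come from tf.trainable_variables().

```python
from collections import namedtuple

# Stand-in for tf.Variable: only the .name attribute matters here.
FakeVariable = namedtuple("FakeVariable", ["name"])


def find_variable(variables, suffix):
    """Return the single variable whose name ends with `suffix`."""
    matches = [v for v in variables if v.name.endswith(suffix)]
    if len(matches) != 1:
        raise ValueError("expected one match for %r, got %d" % (suffix, len(matches)))
    return matches[0]


# In a real graph: variables = tf.trainable_variables()
variables = [
    FakeVariable("RNN/BasicLSTMCell/Linear/Matrix:0"),
    FakeVariable("RNN/BasicLSTMCell/Linear/Bias:0"),
]

lstm_matrix = find_variable(variables, "Linear/Matrix:0")
lstm_bias = find_variable(variables, "Linear/Bias:0")

# Placeholder target values; with TensorFlow these would be NumPy arrays.
target_matrix, target_bias = "target matrix values", "target bias values"
feed_dict = {lstm_matrix: target_matrix, lstm_bias: target_bias}
```

Variable names can differ between TensorFlow versions and variable scopes, so it is probably safer to print the names from tf.trainable_variables() first rather than rely on the ones quoted here.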
