2017-07-17

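Problem saving and restoring a TensorFlow model (LSTM)

I am working on an LSTM that generates text, and I am having trouble reusing a previously trained model. I used the TensorFlow website as a resource, and I have broken my code down below.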

Here I build all of my variables:

graph = tf.Graph() 

with graph.as_default(): 
    global_step = tf.Variable(0) 

    data = tf.placeholder(tf.float32, [batch_size, len_section, char_size]) 
    labels = tf.placeholder(tf.float32, [batch_size, char_size]) 

    ..... 

    #Reset at the beginning of each test 
    reset_test_state = tf.group(test_output.assign(tf.zeros([1, hidden_nodes])), 
           test_state.assign(tf.zeros([1, hidden_nodes]))) 

    #LSTM 
    test_output, test_state = lstm(test_data, test_output, test_state) 
    test_prediction = tf.nn.softmax(tf.matmul(test_output, w) + b) 

    saver = tf.train.Saver() 

Here I train my model and save it at 30 iterations:

with tf.Session(graph = graph) as sess: 
    tf.global_variables_initializer().run() 
    offset = 0 

    for step in range(10000):

        offset = offset % len(X)

        if offset <= (len(X) - batch_size):

            batch_data = X[offset: offset + batch_size]
            batch_labels = y[offset:offset+batch_size]
            offset += batch_size

        else:
            to_add = batch_size - (len(X) - offset)
            batch_data = np.concatenate((X[offset: len(X)], X[0: to_add]))
            batch_labels = np.concatenate((y[offset: len(X)], y[0: to_add]))
            offset = to_add

        _, training_loss = sess.run([optimizer, loss], feed_dict = {data : batch_data, labels : batch_labels})

        if step % 10 == 0:
            print('training loss at step %d: %.2f (%s)' % (step, training_loss, datetime.datetime.now()))

        if step % save_every == 0:
            saver.save(sess, checkpoint_directory + '/model.ckpt', global_step=step)

        if step == 30:
            break

I can see a checkpoint in that directory, and the following files were created:

[image: listing of the checkpoint files]

Here I am supposed to restore my trained model and test it:

with tf.Session(graph=graph) as sess: 
    #standard init step 
    offset = 0 
    saver = tf.train.Saver() 
    saver.restore(sess, "/ckpt/model-150.meta") 
    tf.global_variables_initializer().run() 

    test_start = "I plan to make this world a better place " 
    test_generated = test_start 

.... 

After doing this, I get the following error:

DataLossError (see above for traceback): Unable to open table file /ckpt/model.ckpt-30.meta: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator? 

I am not sure what I am doing wrong. The tutorial looks very simple, but I am clearly missing something. Any feedback would be much appreciated.

Answer

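First, note that if you initialize all variables after restoring from a checkpoint, you get their random initial values instead of the trained ones. In your restore code, tf.global_variables_initializer().run() comes after saver.restore(...), so it overwrites the weights you just loaded.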
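To illustrate the ordering issue, here is a minimal sketch. It assumes TensorFlow 1.14+ or 2.x with the `tensorflow.compat.v1` API (on older 1.x, plain `import tensorflow as tf` should behave the same way). The variable `w` and the temporary directory are hypothetical stand-ins, not the LSTM from the question, and `w` starts at zeros rather than random values so the effect is easy to see.

```python
import os
import tempfile

# Assumption: TF 1.14+ or 2.x, using the 1.x-style graph/session API.
import tensorflow.compat.v1 as tf

graph = tf.Graph()
with graph.as_default():
    # Hypothetical stand-in for the model's trained weights.
    w = tf.Variable(tf.zeros([2]), name="w")
    train_step = w.assign(tf.ones([2]))  # pretend "training" sets w to ones
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()

checkpoint_directory = tempfile.mkdtemp()  # hypothetical location

# Train and save.
with tf.Session(graph=graph) as sess:
    sess.run(init)
    sess.run(train_step)
    # save() returns the checkpoint prefix, e.g. ".../model.ckpt-30".
    prefix = saver.save(sess,
                        os.path.join(checkpoint_directory, "model.ckpt"),
                        global_step=30)

# Restore in a fresh session.
with tf.Session(graph=graph) as sess:
    # restore() assigns every saved variable, so no initializer is needed.
    saver.restore(sess, prefix)
    restored = sess.run(w)       # the trained values

    # Running the initializer afterwards throws the restored values away.
    sess.run(init)
    reinitialized = sess.run(w)  # back to the initial values
```

Note that `saver.restore` is given the prefix returned by `saver.save`, not the name of any single file on disk.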

Second, it is easier to get saving and restoring right if you use tf.estimator.Estimator instead of implementing it yourself.

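Third, I do not understand how you are restoring from model-150.meta but seeing an error about model-30.meta. In any case, I believe you should pass just model-30 (without the .meta suffix).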
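As a sketch of the suffix point: a V2-format checkpoint is stored as several files that share one prefix (`.meta`, `.index`, and `.data-NNNNN-of-NNNNN`), and `saver.restore` expects that shared prefix. The helper below is hypothetical, not part of TensorFlow. It only does string handling to turn a file name such as the one in the error message into the prefix.

```python
import re

# Suffixes that a V2-format checkpoint appends to the shared prefix.
_CHECKPOINT_SUFFIX = re.compile(r"\.(meta|index|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Return the prefix to pass to saver.restore for a checkpoint file path.

    Hypothetical helper: strips a trailing .meta / .index / .data-* suffix
    and leaves a path that is already a prefix unchanged.
    """
    return _CHECKPOINT_SUFFIX.sub("", path)


print(checkpoint_prefix("/ckpt/model.ckpt-30.meta"))  # /ckpt/model.ckpt-30
```

With the save call from the question (`'/model.ckpt'` plus `global_step=step`), the prefix would be of the form `model.ckpt-30`. Alternatively, `tf.train.latest_checkpoint(checkpoint_directory)` returns the most recent prefix recorded in that directory, which avoids typing the name by hand.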