
I am using Theano's LogisticRegression sample code from the DeepLearningTutorials package. I haven't modified the given code at all, and I am using the same data. How can I read the predicted values from Theano's logistic regression class inside the DeepLearningTutorials package?

I need to read the predicted values stored in the (self.y_pred) field of the LogisticRegression class, as well as the predicted probability values in the self.p_y_given_x field of the same class.

They are TensorType/TensorVariable objects, and I don't know how to read or print them. I need them for some post-processing, but I cannot access the values. The values should be read after training, around the line marked with stars.

while (epoch < n_epochs) and (not done_looping):
    epoch = epoch + 1
    for minibatch_index in xrange(n_train_batches):

        minibatch_avg_cost = train_model(minibatch_index)
        # iteration number
        iter = (epoch - 1) * n_train_batches + minibatch_index

        if (iter + 1) % validation_frequency == 0:
            # compute zero-one loss on validation set
            validation_losses = [validate_model(i)
                                 for i in xrange(n_valid_batches)]
            this_validation_loss = numpy.mean(validation_losses)

            print('epoch %i, minibatch %i/%i, validation error %f %%' %
                  (epoch, minibatch_index + 1, n_train_batches,
                   this_validation_loss * 100.))

            # if we got the best validation score until now
            if this_validation_loss < best_validation_loss:
                # improve patience if loss improvement is good enough
                if this_validation_loss < best_validation_loss * \
                        improvement_threshold:
                    patience = max(patience, iter * patience_increase)

                best_validation_loss = this_validation_loss

                # test it on the test set
                test_losses = [test_model(i)
                               for i in xrange(n_test_batches)]
                test_score = numpy.mean(test_losses)

                print(('    epoch %i, minibatch %i/%i, test error of best'
                       ' model %f %%') %
                      (epoch, minibatch_index + 1, n_train_batches,
                       test_score * 100.))

        if patience <= iter:
            done_looping = True
            break

end_time = time.clock()
print(('Optimization complete with best validation score of %f %%,'
       ' with test performance %f %%') %
      (best_validation_loss * 100., test_score * 100.))
print 'The code ran for %d epochs, with %f epochs/sec' % (
    epoch, 1. * epoch / (end_time - start_time))
print >> sys.stderr, ('The code for file ' +
                      os.path.split(__file__)[1] +
                      ' ran for %.1fs' % (end_time - start_time))
# read the values here and print them
# **********************************

if __name__ == '__main__':
    sgd_optimization_mnist()

Answers

You need to compile a function that returns the predictions.

This code may not work exactly as written, but this is the idea:

import numpy as np
import theano
import theano.tensor as T

# Create some data with 100 samples, 10 features; cast to floatX so the
# dtype matches the symbolic input
X = np.random.randn(100, 10).astype(theano.config.floatX)

# `x` must be the same symbolic matrix the classifier was built on
# (classifier = LogisticRegression(input=x, ...)); y_pred is a node in
# that graph, so self.y_pred from inside the class is classifier.y_pred
# from outside
predict_function = theano.function(inputs=[x], outputs=classifier.y_pred)

# See the actual predictions
print(predict_function(X))
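
The question also asks for the class-membership probabilities in self.p_y_given_x; the same pattern works for those. A minimal sketch, assuming the same `x` and `classifier` as above:

predict_proba = theano.function(inputs=[x], outputs=classifier.p_y_given_x)

# one row per sample, one column per class; each row sums to 1
print(predict_proba(X))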

Thanks Kyle, it works as expected. – Ash 2014-09-27 00:30:37


Here is the code, based on Kyle's answer, that worked for me. It returns the values of the predicted classes and prints a report from them.

classifier = LogisticRegression(input=x,
                                n_in=train_set_x.get_value(borrow=True).shape[1],
                                n_out=25)

# the cost we minimize during training is the negative log likelihood of
# the model in symbolic format
cost = classifier.negative_log_likelihood(y)

# compiling a Theano function that computes the mistakes that are made by
# the model on a minibatch
test_model = theano.function(inputs=[index],
                             outputs=classifier.errors(y),
                             givens={
                                 x: test_set_x[index * batch_size:(index + 1) * batch_size],
                                 y: test_set_y[index * batch_size:(index + 1) * batch_size]})

validate_model = theano.function(inputs=[index],
                                 outputs=classifier.errors(y),
                                 givens={
                                     x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                                     y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

# compiling a Theano function that returns the predicted classes for the
# entire test set; `givens` substitutes the test data for the symbolic x,
# so the compiled function takes no arguments
predict = theano.function(inputs=[],
                          outputs=classifier.y_pred,
                          givens={
                              x: test_set_x})

# compute the gradient of cost with respect to theta = (W, b)
g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)

# specify how to update the parameters of the model as a list of
# (variable, update expression) pairs
updates = [(classifier.W, classifier.W - learning_rate * g_W),
           (classifier.b, classifier.b - learning_rate * g_b)]

# compiling a Theano function `train_model` that returns the cost and at
# the same time updates the parameters of the model based on the rules
# defined in `updates`
train_model = theano.function(inputs=[index],
                              outputs=cost,
                              updates=updates,
                              givens={
                                  x: train_set_x[index * batch_size:(index + 1) * batch_size],
                                  y: train_set_y[index * batch_size:(index + 1) * batch_size]})

###############
# TRAIN MODEL #
###############
print '... training the model'
# early-stopping parameters
patience = 50000  # look at this many examples regardless
patience_increase = 2  # wait this much longer when a new best is found
improvement_threshold = 0.995  # a relative improvement of this much is
                               # considered significant
validation_frequency = min(n_train_batches, patience / 2)
                               # go through this many minibatches before
                               # checking the network on the validation set;
                               # in this case we check every epoch

best_params = None
best_validation_loss = numpy.inf
test_score = 0.
start_time = time.clock()

done_looping = False
epoch = 0
while (epoch < n_epochs) and (not done_looping):
    epoch = epoch + 1
    # ******** here I call the function and report based on the returned
    # class predictions
    report(predict())
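
Note that `report` is the asker's own post-processing routine and is not shown in the thread. A minimal hypothetical sketch of what it might do with the returned array of predicted classes (the name and the per-class tally are assumptions, not part of the tutorial):

def report(predictions):
    # `predictions` is a 1-D numpy array of predicted class labels,
    # as returned by the compiled `predict` function above
    counts = numpy.bincount(predictions)
    for label, count in enumerate(counts):
        print 'class %d: predicted %d times' % (label, count)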