
TensorFlow - training accuracy not improving on MNIST data

I wrote a TensorFlow program for Kaggle's Digit Recognizer problem. The program runs without errors, but the training accuracy stays low, around 10%, as shown below:

step 0, training accuracy 0.11 
step 100, training accuracy 0.13 
step 200, training accuracy 0.21 
step 300, training accuracy 0.12 
step 400, training accuracy 0.07 
step 500, training accuracy 0.08 
step 600, training accuracy 0.15 
step 700, training accuracy 0.05 
step 800, training accuracy 0.08 
step 900, training accuracy 0.12 
step 1000, training accuracy 0.05 
step 1100, training accuracy 0.09 
step 1200, training accuracy 0.12 
step 1300, training accuracy 0.1 
step 1400, training accuracy 0.08 
step 1500, training accuracy 0.11 
step 1600, training accuracy 0.17 
step 1700, training accuracy 0.13 
step 1800, training accuracy 0.11 
step 1900, training accuracy 0.13 
step 2000, training accuracy 0.07 
…… 

Here is my code:

import numpy as np
import pandas as pd
import tensorflow as tf

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape): 
    initial = tf.constant(0.1, shape=shape) 
    return tf.Variable(initial) 

def conv2d(x, w): 
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME') 

def max_pool_2x2(x): 
    # ksize = [batch, height, width, channels], strides = [batch, stride, stride, channels]
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') 

# note: images are fed directly as 28x28x1 tensors via x_image,
# so no flat 784-pixel placeholder is needed
y_ = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)

x_image = tf.placeholder(tf.float32, [None, 28, 28, 1])

w_conv1 = weight_variable([5, 5, 1, 32])  
b_conv1 = bias_variable([32]) 
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1) 
h_pool1 = max_pool_2x2(h_conv1) 

w_conv2 = weight_variable([5, 5, 32, 64])  
b_conv2 = bias_variable([64]) 
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2) 
h_pool2 = max_pool_2x2(h_conv2) 

w_fc1 = weight_variable([7 * 7 * 64, 1024]) 
b_fc1 = bias_variable([1024]) 
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1) 
# dropout (keep_prob placeholder was defined above)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) 
# softmax 
w_fc2 = weight_variable([1024, 10]) 
b_fc2 = bias_variable([10]) 
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2) 

# cross-entropy computed by hand from the softmax output
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
# note: 10e-4 is 1e-3, not 1e-4
train_step = tf.train.AdamOptimizer(10e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1)) 
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 

def get_batch(i, size, train, label):
    # cycle through the 42000 training examples in order
    startIndex = (i * size) % 42000
    endIndex = startIndex + size 
    batch_X = train[startIndex : endIndex] 
    batch_Y = label[startIndex : endIndex] 
    return batch_X, batch_Y 


data = pd.read_csv('train.csv') 
train_data = data.drop(['label'], axis=1) 
train_data = train_data.values.astype(dtype=np.float32) 
train_data = train_data.reshape(42000, 28, 28, 1) 

label_data = data['label'].tolist()
# one-hot encode the labels; a throwaway session evaluates the op eagerly
label_data = tf.one_hot(label_data, depth=10)
label_data = tf.Session().run(label_data).astype(dtype=np.float64)


batch_size = 100
sess = tf.InteractiveSession()  # needed so .run()/.eval() below have a default session
tf.global_variables_initializer().run()

for i in range(20000): 
    batch_x, batch_y = get_batch(i, batch_size, train_data, label_data) 
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x_image: batch_x, y_: batch_y, keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x_image: batch_x, y_: batch_y, keep_prob: 0.9})

I can't figure out where my program goes wrong.


If the answer solved your problem, please accept it - see [What should I do when someone answers my question?]. Thanks – desertnaut

Answer


I would suggest changing your bias_variable function - I am not sure how tf.Variable(tf.constant) behaves, plus we usually initialize biases at zero rather than 0.1:

def bias_variable(shape):
    # wrap in tf.Variable so the zero-initialized biases stay trainable
    return tf.Variable(tf.zeros(shape, dtype=tf.float32))

If that doesn't help, try initializing the weights with stddev=0.01.
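
For reference, a minimal sketch of the suggested weight initializer, assuming the same TensorFlow 1.x API used in the question:

def weight_variable(shape):
    # a smaller stddev keeps the initial pre-activations close to zero,
    # making early saturation of the ReLU units less likely
    initial = tf.truncated_normal(shape, stddev=0.01)
    return tf.Variable(initial)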


Thank you, that helped a lot! I changed the stddev of the weights and biases to 0.01, and the program trained normally until about step 3000, with accuracy around 0.99. But after step 3000 the accuracy suddenly dropped to around 0.1. Very strange. – Lixudong


@Lixudong That is a different issue - please accept this answer and open a new post – desertnaut


...with your updated code and results – desertnaut