Keras MNIST gradient descent stuck / learning extremely slowly

I'm training a simple MLP in Keras to classify MNIST digits. The problem: no matter which optimizer or learning rate I use, the model never learns/descends, and my accuracy stays no better than random guessing.
The code:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adagrad

model2 = Sequential()
model2.add(Dense(output_dim=512, input_dim=784, activation='relu', name='dense1', kernel_initializer='random_uniform'))
model2.add(Dropout(0.2, name='dropout1'))
model2.add(Dense(output_dim=512, input_dim=512, activation='relu', name='dense2', kernel_initializer='random_uniform'))
model2.add(Dropout(0.2, name='dropout2'))
model2.add(Dense(output_dim=10, input_dim=512, activation='softmax', name='dense3', kernel_initializer='random_uniform'))
model2.compile(optimizer=Adagrad(), loss='categorical_crossentropy', metrics=['accuracy'])
model2.summary()
model2.fit(image_train.as_matrix(), img_keras_lb, batch_size=128, epochs=100)
And the output:
Epoch 1/100
33600/33600 [==============================] - 5s - loss: 14.6704 - acc: 0.0894
Epoch 2/100
33600/33600 [==============================] - 4s - loss: 14.6809 - acc: 0.0892
Epoch 3/100
33600/33600 [==============================] - 4s - loss: 14.6809 - acc: 0.0892
[... epochs 4 through 22 all print the identical line: loss: 14.6809 - acc: 0.0892 ...]
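One thing I noticed: the stuck loss value looks consistent with the softmax outputs being completely saturated. If I understand correctly, Keras clips predicted probabilities to about 1e-7 (its default backend epsilon), so a misclassified sample predicted with near-zero probability for the true class contributes roughly -log(1e-7) to the categorical cross-entropy. A quick sanity check with my stuck accuracy:

```python
import numpy as np

eps = 1e-7    # Keras's backend clipping epsilon (assumed default)
acc = 0.0892  # the stuck training accuracy from the logs above
# mean loss if every misclassified sample gets (clipped) probability ~0
loss = (1 - acc) * -np.log(eps)
print(round(loss, 2))  # ~14.68, very close to the stuck loss of 14.6809
```

That match makes me suspect the network is emitting hard one-hot predictions from the very first epoch, i.e. something is saturating immediately.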
As you can see, the model isn't learning anything. I've also tried SGD, Adam, and RMSprop, and lowered the batch size to 32, 16, and so on.
Any pointers as to why this is happening would be much appreciated!