The annoying MNIST - Day 3.

Published: January 12, 2018 by Tonny Pham

Categories:
deep-learning 4

The previous work

At the end of day 2, we end up with an error when running the script build and test the model. This is the error

$ python mnist.py
...
mnist.py:29: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(32, (3, 3), activation="relu", input_shape=(1, 28, 28...)`
  model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1,28,28)))
mnist.py:30: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(32, (3, 3), activation="relu")`
  model.add(Convolution2D(32, 3, 3, activation='relu'))
Traceback (most recent call last):
  File "mnist.py", line 35, in <module>
    model.add(Dense(128, activation='relu'))
...

Solution

The error caused by the fact that we use Keras 2.0, while the script is for 1.0 version. Let's switch to 2.0 version by replace the script with:

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

The only noticeable change in here is, Convolution2D has been changed to Conv2D, and the syntax is also different.

With the new file, running the script generates this result.

Warning: Take time, be patience

$ python mnist.py
Using Theano backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 182s 3ms/step - loss: 0.3410 - acc: 0.8954 - val_loss: 0.0859 - val_acc: 0.9731
Epoch 2/12
60000/60000 [==============================] - 198s 3ms/step - loss: 0.1123 - acc: 0.9668 - val_loss: 0.0528 - val_acc: 0.9832
Epoch 3/12
60000/60000 [==============================] - 209s 3ms/step - loss: 0.0869 - acc: 0.9747 - val_loss: 0.0427 - val_acc: 0.9844
Epoch 4/12
60000/60000 [==============================] - 222s 4ms/step - loss: 0.0735 - acc: 0.9785 - val_loss: 0.0369 - val_acc: 0.9866
Epoch 5/12
60000/60000 [==============================] - 194s 3ms/step - loss: 0.0628 - acc: 0.9810 - val_loss: 0.0371 - val_acc: 0.9877
Epoch 6/12
60000/60000 [==============================] - 175s 3ms/step - loss: 0.0570 - acc: 0.9834 - val_loss: 0.0334 - val_acc: 0.9896
Epoch 7/12
60000/60000 [==============================] - 177s 3ms/step - loss: 0.0498 - acc: 0.9845 - val_loss: 0.0327 - val_acc: 0.9887
Epoch 8/12
60000/60000 [==============================] - 194s 3ms/step - loss: 0.0486 - acc: 0.9859 - val_loss: 0.0307 - val_acc: 0.9896
Epoch 9/12
60000/60000 [==============================] - 191s 3ms/step - loss: 0.0445 - acc: 0.9866 - val_loss: 0.0281 - val_acc: 0.9908
Epoch 10/12
60000/60000 [==============================] - 190s 3ms/step - loss: 0.0419 - acc: 0.9877 - val_loss: 0.0302 - val_acc: 0.9898
Epoch 11/12
60000/60000 [==============================] - 185s 3ms/step - loss: 0.0404 - acc: 0.9878 - val_loss: 0.0281 - val_acc: 0.9911
Epoch 12/12
60000/60000 [==============================] - 192s 3ms/step - loss: 0.0371 - acc: 0.9887 - val_loss: 0.0327 - val_acc: 0.9893
Test loss: 0.0326723895625
Test accuracy: 0.9893

So after 12 runs, it reaches the accuracy ~99%

That is it

So this is how the training run, I got the idea behind that by looking at the code. Correct me if I am wrong:

mnist.load_data shuffled and split the samples into a training set and a testing set at 6:1
We create the model by adding layers provided by Keras, and compile it
We keep running with the training set and confirm the accuracy of the test set.

Nothing special got discovered at this moment.

It has been more than 1 hour, and I am out of time for today. Let dig into the model next time