DLHUB: MNIST Handwriting Recognition Using a Convolutional Neural Network

Oct 21, 2023

Introduction

Many tutorials on MNIST handwriting recognition already exist on the internet. However, most of them rely on deep learning platforms such as TensorFlow, CNTK, MXNet, PyTorch, or Caffe. To design a proper deep learning model with these platforms, a user needs not only solid knowledge of the field but also Python programming skills and familiarity with the application programming interface (API) of at least one platform. The purpose of this article is to introduce a way to design a deep learning model that requires neither Python programming skills, nor an understanding of complicated deep learning APIs, nor deep knowledge of deep learning architectures. DLHUB is designed to reduce this design process to just a few clicks.

Data preparation

You can download the MNIST dataset directly from http://yann.lecun.com/exdb/mnist. The dataset consists of 60,000 training samples and 10,000 test samples. It is a subset of a larger dataset available from the National Institute of Standards and Technology (NIST).

However, there is an easier way to get the MNIST dataset directly from DLHUB, as DLHUB is equipped with ready-to-use examples to allow you to explore its features.

When you first launch DLHUB, it shows the load training data interface. To get the DLHUB examples, click the ... button (1) to open the Training Data File Help dialog, which lets you download the DLHUB examples (2). This dialog also explains the correct format for different types of datasets.

After the DLHUB dataset is downloaded, extract the folder and place it in the DLHUBData location, as shown below:

Each MNIST sample is a 28x28x1 handwritten image representing a digit from 0 to 9, so there are 10 labels, one per digit. The 28x28x1 image can be flattened into an array of 784 features, where each feature is one pixel value of the image. To encode the output digit (0 to 9), we can use a 10-element binary code. For example, 0 0 0 0 0 0 0 0 0 1 corresponds to the digit 0, and 1 0 0 0 0 0 0 0 0 0 corresponds to the digit 9. Therefore, the MNIST dataset looks as follows:
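The flattening and label encoding described above can be sketched in a few lines of plain Python. This is only an illustration of the encoding as stated in this article; the helper names `flatten_image` and `encode_label` are hypothetical, and the exact file layout DLHUB expects (delimiters, column order) is not covered here.

```python
def flatten_image(image):
    """Flatten a rows x cols x channels nested list into one flat feature list."""
    return [value for row in image for pixel in row for value in pixel]


def encode_label(digit, num_classes=10):
    """Encode a digit with the layout described in the text:
    digit 0 -> 0 0 0 0 0 0 0 0 0 1 and digit 9 -> 1 0 0 0 0 0 0 0 0 0."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be between 0 and num_classes - 1")
    code = [0] * num_classes
    code[num_classes - 1 - digit] = 1  # the 1 moves left as the digit grows
    return code


# Placeholder all-black 28x28x1 image (a real sample would hold pixel values).
image = [[[0] for _ in range(28)] for _ in range(28)]
features = flatten_image(image)

print(len(features))     # 784
print(encode_label(0))   # [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(encode_label(9))   # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Note that this ordering is the reverse of what `keras.utils.to_categorical` produces, where the digit 0 maps to a 1 in the first position.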

Design a neural network using DLHUB

Step 1: Load MNIST training set

Once the MNIST data is in the right format, it is ready to be loaded into DLHUB on the loading data page. On that page, you can browse to the correct MNIST dataset location (1), confirm the dataset input shape (28x28x1) (2), and click next (3) to go to the deep learning design page.

Step 2: Configure Convolution Neural Network

Option 1: Using Keras Python code

To show by comparison how easy it is to design a deep learning model in DLHUB, this article uses a well-known Keras Python example (https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py).

Because the Python APIs of deep learning platforms (TensorFlow, CNTK, MXNet, etc.) are complex, designing a deep learning model directly on them is not easy, so Keras (https://keras.io/) was designed to provide a high-level wrapper around popular deep learning APIs in Python.

LeNet, originally introduced by LeCun in 1998, is one of the best-known convolutional neural networks for image recognition. The LeNet structure is shown below:

In LeNet, an image is filtered through a few trainable convolution and pooling layers before being flattened and fed to fully connected layers. With this structure, image features are extracted by the convolution and pooling layers. As a result, classification accuracy is significantly improved compared with a traditional machine learning model that contains only fully connected layers. For more information, please refer to LeCun's publication.
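The convolution, pooling, and flattening pipeline can be illustrated with a toy example in plain Python. This is a minimal sketch and not LeNet itself: the 6x6 image and the edge-detecting kernel are hand-picked for illustration, whereas in a real network the kernel weights are learned during training. The function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1
    (cross-correlation, which is what deep learning libraries compute)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    out_h, out_w = len(feature_map) // 2, len(feature_map[0]) // 2
    return [[max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                 feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
             for c in range(out_w)]
            for r in range(out_h)]


def flatten(feature_map):
    """Turn a 2D feature map into the flat vector a dense layer expects."""
    return [value for row in feature_map for value in row]


# Toy 6x6 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-picked vertical-edge kernel; a trained network learns such weights.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)   # 4x4, responds where the edge is
pooled = max_pool_2x2(feature_map)          # 2x2, downsampled response
features = flatten(pooled)                  # 4 values for a dense layer

print(feature_map[0])  # [0, 3, 3, 0]
print(pooled)          # [[3, 3], [3, 3]]
print(features)        # [3, 3, 3, 3]
```

The convolution output is nonzero only around the edge, which is the sense in which these layers reveal image features before classification.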

Since LeNet was introduced, there have been several variations in its structure to improve classification accuracy. In this article, we use the architecture introduced in Keras for comparison purposes. The Keras Python code to construct the LeNet model is shown as follows:

'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
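
To see how the image shape changes through the Keras model above, and where its trainable parameters sit, the layer arithmetic can be worked out by hand. The sketch below assumes the Keras defaults used by that code (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helper functions are illustrative and not part of Keras. Dropout layers are omitted because they change neither the shape nor the parameter count.

```python
def conv2d(shape, filters, k):
    """Output shape and parameter count of a k x k convolution, no padding, stride 1."""
    h, w, c = shape
    params = (k * k * c + 1) * filters  # weights plus one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def max_pool(shape, p):
    """Non-overlapping p x p max pooling has no trainable parameters."""
    h, w, c = shape
    return (h // p, w // p, c), 0


def flatten(shape):
    h, w, c = shape
    return (h * w * c,), 0


def dense(shape, units):
    (n,) = shape
    return (units,), (n + 1) * units  # weights plus one bias per unit


shape, total = (28, 28, 1), 0
steps = [
    ("Conv2D(32, 3x3)", lambda s: conv2d(s, 32, 3)),
    ("Conv2D(64, 3x3)", lambda s: conv2d(s, 64, 3)),
    ("MaxPooling2D(2x2)", lambda s: max_pool(s, 2)),
    ("Flatten", flatten),
    ("Dense(128)", lambda s: dense(s, 128)),
    ("Dense(10)", lambda s: dense(s, 10)),
]
for name, layer in steps:
    shape, params = layer(shape)
    total += params
    print(name, shape, params)

print("total parameters:", total)  # total parameters: 1199882
```

Almost all of the parameters (1,179,776 of 1,199,882) belong to the first dense layer, which receives the 9,216 flattened features.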

Option 2: Using DLHUB

The same LeNet architecture can be constructed in DLHUB with just a few clicks. First, create a new file to store the LeNet structure (1), then select the appropriate layer from the Select Functions palette (2) and configure the parameters of the selected layer in (3). Repeat this process until the complete LeNet model is constructed (red rectangular area). Once the model is constructed, use the verification button (4) to check that it is built correctly before proceeding to the next step (6). The LeNet structure can be saved to a file at any time during the design process.

Here is the detailed information on the layer settings.

Step 3: Train Neural Network

Configuring training parameters in DLHUB is simple thanks to its graphical user interface. In this article, we use the same training algorithm and parameters as the Keras example. First, we select the training algorithm, loss function, accuracy metric type, and training parameters (1); then we specify when to stop the training process (2) before starting it (3). If a GPU is detected in the host PC, it will be used to speed up training.

The training process finishes after 12 epochs and takes only 72 seconds on this large dataset on an Alienware 15 R3 laptop.
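As a rough sanity check, the reported training time can be turned into per-epoch and per-sample figures. This is a back-of-the-envelope sketch: the batch size of 128 comes from the Keras example and is assumed, not confirmed, to match the DLHUB run.

```python
import math

train_samples = 60000   # MNIST training set size
batch_size = 128        # from the Keras example; assumed to match the DLHUB run
epochs = 12             # number of epochs reported above
total_seconds = 72      # training time reported above

steps_per_epoch = math.ceil(train_samples / batch_size)
seconds_per_epoch = total_seconds / epochs
samples_per_second = train_samples * epochs / total_seconds

print(steps_per_epoch)      # 469
print(seconds_per_epoch)    # 6.0
print(samples_per_second)   # 10000.0
```

The Keras example's docstring quotes 16 seconds per epoch on a GRID K520 GPU, but the hardware differs, so the two figures are not a like-for-like comparison.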

Step 4: Evaluate the Neural Network

To test the performance of the trained model, DLHUB includes an evaluation process. It can easily be done by importing a test dataset (10,000 unseen samples) from a correctly formatted file (1) and running the evaluation on that test dataset (2). The accuracy on this 10k dataset is around 99%, depending on the learning rate the user specified in step 3.
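For reference, classification accuracy is conventionally computed by comparing the most probable predicted class with the labeled class for each sample. The sketch below shows that usual definition in plain Python with made-up toy values; how DLHUB computes the metric internally is not described in this article, and the function names are hypothetical.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of samples whose most probable class matches the label."""
    correct = sum(argmax(p) == argmax(y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example with 4 samples and 3 classes (illustrative values, not MNIST results).
predictions = [
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],   # wrong: the label below says class 1
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]

print(accuracy(predictions, labels))  # 0.75
```

On the 10,000-sample test set, an accuracy of about 99% therefore corresponds to roughly 100 misclassified images.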

Step 5: Test the Neural Network

Users can also test how a trained model works right inside DLHUB before deciding to use it in actual production or deployment applications. This is easily done by loading a new data folder (1), selecting a data file to evaluate (2), visualizing what the new data looks like if it is an image (3), and confirming the prediction result (4).

Export trained neural network model to weight file

When the trained neural network has been evaluated and tested with a new test data set, it can be exported using the export function in DLHUB.

LabVIEW Deployment

The DLHUB-LV API provides a simple way to load a frozen deep learning model into the LabVIEW environment and perform prediction tasks on a given input using Predict.vi.