Introduction
Deep learning has become a powerful technique for pattern recognition and regression problems, and the Convolutional Neural Network (CNN) has proven to be the most effective neural network structure for image recognition. However, deep learning often requires a very large dataset to achieve reasonable accuracy, and because deep learning architectures contain many layers, the training process is often very time-consuming. In a deep learning architecture, the last few layers typically act as the main classifier, while the preceding layers act as a feature extractor that distills important information from the dataset into features fed to the classifier.
Traditionally, in machine learning applications, feature extraction and classification are separated into two processes. Feature extraction is usually the difficult part, as it depends on the structure of the dataset, and the feature extractor's parameters are fixed. In a deep learning architecture, the two processes are combined in a single model, and the feature extractor's parameters are adjusted during training.
This architecture is what makes the transfer learning technique possible, which addresses both the small-dataset and training-time problems. The idea is to take an existing deep learning model that has been trained on a similar problem and reuse all layers from the input layer up to the last few layers as a feature extractor. New output layers are then added on top of this feature extractor for the classification task at hand, and these output layers form a classifier that is trained from scratch. Note that the feature extractor itself is not retrained; its parameters stay fixed. For more information on the transfer learning technique, please visit https://cntk.ai/pythondocs/CNTK_301_Image_Recognition_with_Deep_Transfer_Learning.html.
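To make the idea concrete, here is a minimal sketch of transfer learning in Keras (the article's DLHUB workflow is built on this same principle, though its internal implementation may differ; the choice of ResNet50 here is an illustrative assumption):

```python
import tensorflow as tf

# Load a pre-trained ImageNet model without its original output layer;
# ResNet50 is just one possible choice of pre-trained network.
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # freeze the feature extractor; it is not retrained

# Add a new output layer for the target task; only this layer is trained.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(4),  # 4 output nodes for 4 target classes
])
```

Because only the small output layer is trained, a few dozen images and a few seconds of training can be enough, whereas training the whole network from scratch would need far more of both.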
The purpose of this article is to demonstrate how easy it is to design a deep learning model with the transfer learning technique for a fruit recognition application.
Data preparation
In this article, we are going to design a deep learning neural network that can classify 4 different types of fruit: Apple, Orange, Kiwi, and Grape. First, we need to prepare datasets for the training and testing process. You can download fruit images from the internet, but we have prepared a small fruit dataset that contains only 10 images per fruit type for both the training and testing data. This dataset can be downloaded here.
Design a neural network using DLHUB
Step 1: Load training set
The fruit dataset structure in this example is simply a folder structure with sub-folders that act as fruit categories. Each sub-folder contains the images for that category, as follows:
DLHUB supports this simple data structure, and loading the dataset into DLHUB is straightforward: first, browse to the correct folder (1) and select the current folder that includes all sub-folders (2), then select the pre-trained model to be used (3) before proceeding to the model configuration page.
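For comparison, the same folder-per-class layout can be loaded directly in Keras. The sketch below is illustrative only: the "fruits/train" path, image size, and batch size are assumptions, not values taken from DLHUB:

```python
import tensorflow as tf

# Each sub-folder name automatically becomes a class label.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "fruits/train",           # hypothetical path to the dataset root
    image_size=(224, 224),    # resize to the pre-trained model's input size
    batch_size=32,
    label_mode="categorical", # one-hot labels for the 4 fruit classes
)
print(train_ds.class_names)   # e.g. ['Apple', 'Grape', 'Kiwi', 'Orange']
```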
In image classification applications, the image augmentation technique is often used during training to improve accuracy, and DLHUB supports it: simply tick the Image Augmentation option (3). For more information on how image augmentation works, please visit this website.
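The article does not specify which transforms DLHUB applies, but a typical augmentation pipeline in Keras looks like the following sketch, continuing from the loading sketch above (random flips, rotations, and zooms are common, illustrative choices):

```python
import tensorflow as tf

# Random transforms applied to each training batch, so every epoch
# sees slightly different versions of the same 10 images per class.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # up to +/-10% of a full turn
    tf.keras.layers.RandomZoom(0.1),      # zoom in/out by up to 10%
])

# Apply augmentation to the training pipeline only, not to test data.
train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```

Augmentation is especially valuable here because the training set is tiny; it artificially enlarges the dataset and makes the classifier less sensitive to orientation and scale.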
Step 2: Configure Neural Network
Configuring a deep learning neural network structure in DLHUB is simple because all the hard work is done inside the DLHUB engine. First, a TLModel layer is selected in the Select Functions palette (1) and configured (2). This pre-trained model is frozen, which means its parameters are preserved during the training process, and its original output layer is removed. A new output layer is then added to the pre-trained model to form a new deep learning model for this fruit application. In this example, we use only 1 fully connected layer (or Dense layer) with 4 output nodes to classify the 4 fruit types. A linear activation function is configured for this Dense layer. The resulting deep learning model is shown in (3). The model can then be verified by hitting the verify model button (4) before proceeding to the training page.
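In Keras terms, this step corresponds roughly to the sketch below. The wrapped pre-trained network is an assumption (DLHUB's TLModel layer uses whichever model was selected in step 1), and model-specific input preprocessing is omitted for brevity:

```python
import tensorflow as tf

# Frozen pre-trained feature extractor (ResNet50 is an assumed example).
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # parameters are preserved during training

model = tf.keras.Sequential([
    base,
    # One Dense layer, 4 output nodes, linear activation: the new
    # output layer described above, producing raw class scores (logits).
    tf.keras.layers.Dense(4, activation="linear"),
])
model.summary()  # roughly the code equivalent of "verify model"
```

The linear activation means the layer outputs raw scores rather than probabilities, so the loss function used in the next step should apply the softmax itself.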
Step 3: Train Neural Network
Configuring the training parameters in DLHUB is simple, thanks to an easy-to-use graphical user interface. In this article, we use the standard training algorithm and parameters from the Keras examples. First, we select the training algorithm, loss function, accuracy metric type, and training parameters (1), then we specify when to stop the training process (2) before starting it (3). If a GPU is detected in the host PC, it will be used to speed up training.
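A hedged Keras equivalent of this configuration is sketched below; the optimizer, patience, and epoch count are stock assumptions, not values read from DLHUB, and model and train_ds come from the earlier sketches:

```python
import tensorflow as tf

# The Dense output layer uses a linear activation, so the cross-entropy
# loss is computed from raw logits.
model.compile(
    optimizer="adam",  # a common default; the article does not name one
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Stop automatically once the training loss plateaus.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=3)

model.fit(train_ds, epochs=20, callbacks=[early_stop])
```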
Step 4: Evaluate the Neural Network
After the deep learning model is trained, we can verify its generalization (its ability to correctly classify new data) by evaluating it on the test dataset, which was not used during training. The evaluation is done simply by browsing to the test dataset (1), performing the evaluation (2), and checking the accuracy result (3).
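In code, this step amounts to loading the held-out images and calling evaluate; the "fruits/test" path below is an illustrative assumption mirroring the training layout:

```python
import tensorflow as tf

# Held-out test images in the same folder-per-class layout.
test_ds = tf.keras.utils.image_dataset_from_directory(
    "fruits/test", image_size=(224, 224),
    batch_size=32, label_mode="categorical")

loss, accuracy = model.evaluate(test_ds)  # model trained in step 3
print(f"Test accuracy: {accuracy:.1%}")
```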
Step 5: Test the Neural Network
Before deciding to use a trained model in production, the user can test how it performs on new data, which gives a feel for how the trained model behaves. This can easily be done in the built-in test interface by choosing the test dataset folder (1), selecting an image to be evaluated (2), visualizing the selected image (3), and confirming the classification result (4).
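Programmatically, the same spot-check looks like the sketch below; the file path is a made-up example, and the class list follows the alphabetical sub-folder order that Keras uses:

```python
import numpy as np
import tensorflow as tf

class_names = ["Apple", "Grape", "Kiwi", "Orange"]  # alphabetical order

# Load a single image and resize it to the model's input size.
img = tf.keras.utils.load_img("fruits/test/Apple/apple_01.jpg",
                              target_size=(224, 224))
x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]  # add batch axis

logits = model.predict(x)  # raw scores from the linear output layer
print("Predicted:", class_names[int(np.argmax(logits))])
```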
Export the trained neural network model to a weight file
Once the trained model has been evaluated, tested, and verified, it can be exported directly to the supported programming languages for real-time applications and deployment.
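The exact export format DLHUB produces is not described in this article; in plain Keras, the equivalent would be saving the trained model or its weights to a file and reloading it in the deployment application:

```python
import tensorflow as tf

# Save the full model (architecture + weights) or the weights alone.
model.save("fruit_classifier.keras")
model.save_weights("fruit_classifier.weights.h5")

# Reload the saved model later for inference in production.
reloaded = tf.keras.models.load_model("fruit_classifier.keras")
```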
Conclusion
In this article, we have demonstrated the capability of the DLHUB software for a fruit recognition application using the transfer learning technique. All the heavy lifting is done inside the DLHUB engine to simplify the deep learning design process. With just a few clicks, a proper deep learning model using transfer learning was constructed on a very small training dataset, and high accuracy was achieved on the test dataset (97.5%). Training was also very fast, taking less than 7 seconds.