Deep Learning with MATLAB
If you are using MATLAB on your desktop computer, make sure you have the Deep Learning Toolbox and Deep Learning Toolbox Model for AlexNet Network installed. You can go to the Add-On Explorer to install these packages.
Using the Sample Dataset
To use the images in the sample dataset, first unzip the folder and add the folder and subfolders to your path. This will make the files visible to MATLAB.
You can then change directory into the DeepLearning folder. This is where we will be working for the remainder of this tutorial.
unzip('DeepLearning.zip')
addpath(genpath('DeepLearning'))
cd DeepLearning
Example 1. Using a Pretrained Network
AlexNet
AlexNet is a convolutional neural network developed by Alex Krizhevsky at the University of Toronto in 2012. It was trained for about a week on more than a million images from 1000 different categories.
In this example we will load AlexNet into MATLAB and use it to classify some images.
1. Load AlexNet
Load the pretrained network AlexNet into your MATLAB workspace as a variable net.
net = alexnet;
2. Load the image
Load the first sample image into the workspace as a variable img.
img = imread('file1.jpg');
Optionally, you can also view the image.
imshow(img)
3. Resize the image
AlexNet was trained on images that are 227 x 227 pixels in size. This means any images we want to classify with AlexNet must also be this size.
% See the size of the image
imgSize = size(img)
% Resize the image
img = imresize(img, [227 227]);
imshow(img)
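Rather than hard-coding 227, you can also read the required size directly from the network's input layer. A small sketch using the InputSize property:
% Query the input size from the network itself ([227 227 3] for AlexNet)
inputSize = net.Layers(1).InputSize;
img = imresize(img, inputSize(1:2));   % match height and width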
4. Classify the image
The classify function takes a neural network and an image as inputs and returns a categorical prediction as an output.
pred = classify(net, img);
Try classifying the other images in the SampleImages folder. What results do you get?
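As a starting point, here is a minimal sketch that classifies every JPEG in the folder, assuming the sample images live in a SampleImages folder and follow the file1.jpg naming used above:
% Classify each image in the (assumed) SampleImages folder
files = dir(fullfile('SampleImages', '*.jpg'));
for k = 1:numel(files)
    img = imread(fullfile('SampleImages', files(k).name));
    img = imresize(img, [227 227]);   % AlexNet expects 227 x 227 inputs
    pred = classify(net, img);
    fprintf('%s -> %s\n', files(k).name, char(pred));
end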
Example 2. Perform Transfer Learning
Transfer Learning
Transfer Learning is the process of modifying a pretrained neural network so it can perform a new, related task. In this example, we will adapt AlexNet to classify images of flowers.
Datastores
Datastores are repositories for collections of data that are too large to fit in memory. Instead of storing all the pixel data in memory, datastores allow us to store just the filepaths and to read the image data into memory as needed.
1. Create an image datastore
Create an image datastore from the flowers folder. We can label the images based on the folders in which they are organized.
imds = imageDatastore('flowers', 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
We can preview the first image in our datastore imds.
imshow(preview(imds))
We can also inspect the labels of our images by extracting the Labels property.
imds.Labels
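If you want a per-class tally rather than the raw label list, the countEachLabel function summarizes the datastore in a table:
% Count how many images belong to each label
countEachLabel(imds)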
2. Split the data
When training on new data, we generally want to reserve some of the data for testing. These data will not be used in training so that we don't overfit the network; that is, so that the network isn't just good at classifying images it's already seen before. The test images will be used to evaluate the network's performance.
Typically we split our dataset into two subsets, train and test, using 80% of the data for training and the remaining 20% for testing.
The splitEachLabel function allows us to divide the data proportionally within each folder/label. By default, splitEachLabel splits the images based on alphabetical order, so we use the 'randomized' option to randomly assign images to the training and test sets.
[train, test] = splitEachLabel(imds, 0.8, 'randomized');
3. Modify layers of AlexNet
AlexNet is made of 25 distinct layers. We can inspect these layers by looking at the Layers property of net (the variable in which we loaded AlexNet).
layers = net.Layers
Layer 1 is the input layer, which is where we feed our images.
Layers 2-22 are mostly Convolution, Rectified Linear Unit (ReLU), and Max Pooling layers. This is where feature extraction occurs.
Layer 23 is a Fully Connected Layer containing 1000 neurons. This maps the extracted features to each of the 1000 output classes.
Layer 24 is a Softmax Layer. This is where a probability is assigned to the input image for each output class.
Layer 25 returns the most likely output class of the input image.
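If you prefer a visual overview, recent MATLAB releases include the analyzeNetwork function, which opens an interactive view of the architecture:
% Open an interactive visualization of the network's layers
analyzeNetwork(net)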
When starting with a pretrained network, we typically want to modify just the last few layers to suit our particular problem. The feature extraction layers will adjust themselves based on the images we are training on; no need to modify them ourselves!
First, we want to create a new Fully Connected layer fc with 5 neurons, one for each of our flower labels. We will then replace the Fully Connected layer in layers with fc.
fc = fullyConnectedLayer(5);
layers(23) = fc;
We also want to replace the last layer with a new classification layer.
layers(end) = classificationLayer;
4. Set the training options
Now we want to train the network with our training data and new layers. Before we begin training, we want to set our training options, or hyperparameters.
More documentation about the different options can be found on the trainingOptions page of the MATLAB documentation.
| Training Option | Description |
|---|---|
| Solver Name | The solver for the training network. MATLAB allows us to use different optimizers: Stochastic Gradient Descent with Momentum (sgdm), RMSProp (rmsprop), and Adam (adam). |
| MiniBatchSize | Size of the mini-batch used for each training iteration. Rather than train the network on the whole training set for each iteration, we can train on mini-batches, or subsets of the data. |
| MaxEpochs | Number of times the training algorithm passes over the entire training set. |
| Shuffle | Optional shuffling of the training data. Shuffling the training data allows you to train over different mini-batches for each epoch. |
| InitialLearnRate | This controls how quickly the network adapts. Larger learning rates mean the network makes bigger adjustments after each iteration. A rate that is too large can cause the network to converge at a suboptimal solution, while a rate that is too small can make the network learn too slowly. |
| Verbose | Set to true if you want progress printed to the Command Window. |
| Plots | Display training progress plots with the 'training-progress' option. |
We can set our desired training options in a variable called options using the trainingOptions function.
options = trainingOptions('sgdm', ...
'MiniBatchSize', 10, ...
'MaxEpochs', 2, ...
'InitialLearnRate', 3e-4, ...
'Shuffle', 'every-epoch', ...
'Verbose', false, ...
'Plots', 'training-progress');
5. Train the network
Now that we have our options set, we can begin training the network on our new dataset. We will call our new neural network flwrnet.
We will use the trainNetwork function to train the network. As inputs, we will use our training dataset train, our modified layers layers, and our training options options.
flwrnet = trainNetwork(train, layers, options);
It will take several minutes to train the network.
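Training is much faster on a supported GPU, and trainNetwork uses one automatically when available. If you need to control this explicitly, trainingOptions accepts an 'ExecutionEnvironment' setting; for example:
% Force CPU-only training ('gpu' and 'auto' are the other common choices)
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 10, ...
    'MaxEpochs', 2, ...
    'InitialLearnRate', 3e-4, ...
    'ExecutionEnvironment', 'cpu');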
6. Evaluate performance
After training has completed, we can evaluate the performance of the network flwrnet using the reserved test dataset test.
% Classify our test dataset
preds = classify(flwrnet, test);
% Extract the actual labels of the test dataset
actual = test.Labels;
% Count the number of predictions that match the actual label
numCorrect = nnz(preds == actual);
% Determine the fraction of correct predictions
fracCorrect = numCorrect/length(actual)
We can also create a Confusion Matrix Chart, which shows us the number of correct predictions for each output class. The confusion matrix also shows us the breakdown of how incorrect predictions were classified.
confusionchart(actual,preds)
7. Improve performance
With our initial training options, our resulting network has so-so performance. We can try improving the performance by adjusting the training options.
MaxEpochs: We can increase the number of epochs over which we train the network. Generally, the longer we train the network, the more performance improves.
InitialLearnRate: If we set our initial learning rate too high, we can cause the network to converge at a suboptimal solution. To improve performance, you can try dividing your initial learning rate by 10 and retraining the network.
MiniBatchSize: You can try adjusting the mini-batch size. Smaller values typically mean faster convergence but more noise in the training process. Larger batch sizes mean more training time but generally less noise.
There are other training options that you can try adjusting that are dependent on the solver you chose. These options can be explored in the MATLAB documentation.
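For example, here is a sketch that retrains flwrnet with more epochs and a learning rate reduced by a factor of 10 (these exact values are just a starting point to experiment with):
% Retrain with more epochs and a smaller initial learning rate
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 10, ...
    'MaxEpochs', 8, ...
    'InitialLearnRate', 3e-5, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', false, ...
    'Plots', 'training-progress');
flwrnet = trainNetwork(train, layers, options);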
You can also improve performance by testing your network as you are training it; this process is called validation. In addition to setting aside some data for testing after training is complete, we also set aside a validation set. Every few iterations of training, we will classify the images of our validation set and assess the accuracy of the network. This allows us to see how prediction accuracy improves not only on our training data, but also on data the network hasn't seen before. The validation data isn't used to modify any of our network layers; it's just a check to see how training is coming along.
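A minimal sketch of what validation could look like here, assuming we carve the original datastore into three subsets instead of two (the 70/15/15 split is an arbitrary choice):
% Split into training, validation, and test sets (the remaining 15% goes to test)
[train, val, test] = splitEachLabel(imds, 0.7, 0.15, 'randomized');
% Check accuracy on the validation set every 10 iterations during training
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 10, ...
    'MaxEpochs', 2, ...
    'InitialLearnRate', 3e-4, ...
    'ValidationData', val, ...
    'ValidationFrequency', 10, ...
    'Plots', 'training-progress');
flwrnet = trainNetwork(train, layers, options);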
Example 3. Build a neural network
In some cases it may make more sense to train a network from scratch. This is particularly true if your dataset is very different from those that were used to train other networks.
In this example we will train a neural network to classify images of numerical digits. This uses images built into the MATLAB Deep Learning Toolbox.
1. Create an image datastore
First we will create a datastore containing our images.
% Retrieve the path to the demo dataset
digitDatasetPath = fullfile(matlabroot, 'toolbox','nnet','nndemos','nndatasets','DigitDataset');
% Create image datastore
imds = imageDatastore(digitDatasetPath, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
2. Split the data into training and test datasets
[train, test] = splitEachLabel(imds, 0.8, 'randomized');
3. Define the layers of your network (the network architecture).
layers = [ ...
    imageInputLayer([28 28 1])        % 28 x 28 grayscale input images
    convolution2dLayer(5,20)          % 20 convolutional filters of size 5 x 5
    reluLayer                         % nonlinear activation
    maxPooling2dLayer(2,'Stride',2)   % downsample with 2 x 2 max pooling
    fullyConnectedLayer(10)           % one neuron per digit class (0-9)
    softmaxLayer                      % convert scores to class probabilities
    classificationLayer];             % output the most likely class
4. Set your training options.
options = trainingOptions('sgdm', ...
'MaxEpochs', 20, ...
'InitialLearnRate', 1e-4, ...
'Verbose', false, ...
'Plots', 'training-progress');
5. Train the network
net = trainNetwork(train, layers, options);
6. Evaluate performance
% Classify the test dataset
preds = classify(net, test);
% Extract the actual labels
actual = test.Labels;
% Count the number of correct predictions
numCorrect = nnz(preds == actual);
% Determine the fraction of correct predictions
fracCorrect = numCorrect/length(actual)
confusionchart(actual, preds)