Introduction to Deep Learning with PyTorch
Chapter 7: Building and Training a Simple Classification Model
Improving our Classifier
Right now, our module `MNISTClassifier` has only one linear layer. Let's make it more complex by:

- adding 2 additional linear operations; we consider that the hidden layers are of size 64;
- adding an activation function just after each of those linear operations.

PyTorch provides several activation functions, such as ReLU (`torch.relu`), sigmoid (`torch.sigmoid`), and tanh (`torch.tanh`); we will use ReLU here.
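To get a feel for what these functions do, here is a small sketch applying each of them to the same hand-picked values (the input tensor below is only an illustrative example, not part of our classifier):

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

print(torch.relu(x))     # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000]): negatives clamped to 0
print(torch.sigmoid(x))  # values squashed into (0, 1)
print(torch.tanh(x))     # values squashed into (-1, 1)
```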
As before, we just need to define our two methods: `__init__` and `forward`.
__init__
Here we just need to declare our 2 additional linear operations, `linear_1` and `linear_2`, along with the final one:

- `linear_1` takes the flattened images as input and outputs 64 features;
- `linear_2` takes those 64 features as input and outputs 64 features;
- the final operation `linear_final` takes those 64 features as input and outputs the 10-dimensional score vector.
```python
import torch

class MNISTClassifierComplex(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_1 = torch.nn.Linear(in_features=1 * 28 * 28, out_features=64)
        self.linear_2 = torch.nn.Linear(in_features=64, out_features=64)      # notice that linear_2.in_features == linear_1.out_features
        self.linear_final = torch.nn.Linear(in_features=64, out_features=10)  # notice that linear_final.in_features == linear_2.out_features

    ...  # forward
```
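As a quick sanity check (this snippet is an addition, not part of the lesson's code), you can instantiate the class and verify that consecutive layers agree on their feature sizes:

```python
model = MNISTClassifierComplex()

# Each layer's input size must match the previous layer's output size.
assert model.linear_1.out_features == model.linear_2.in_features      # 64 == 64
assert model.linear_2.out_features == model.linear_final.in_features  # 64 == 64
```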
forward
We just need to call the linear operations declared above one by one, passing the resulting tensor through a `torch.relu` activation after each hidden layer (but not after the final one).
```python
class MNISTClassifierComplex(torch.nn.Module):
    ...  # __init__ (defined above)

    def forward(self, tensor_images):
        """
        Args:
            tensor_images: tensor of shape (N_batch, 1, 28, 28)
        """
        # Flatten each image into a 784-dimensional vector
        outcome_scores = tensor_images.view(-1, 1 * 28 * 28)

        # First linear layer, followed by its activation
        outcome_scores = self.linear_1(outcome_scores)
        outcome_scores = torch.relu(outcome_scores)

        # Second linear layer, followed by its activation
        outcome_scores = self.linear_2(outcome_scores)
        outcome_scores = torch.relu(outcome_scores)

        # Last linear layer: no activation on the final layer (for this example)
        outcome_scores = self.linear_final(outcome_scores)
        return outcome_scores
```
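To convince ourselves that the shapes line up, we can feed a batch of random images through the model. This is a small sketch; the batch size of 8 is an arbitrary choice:

```python
model = MNISTClassifierComplex()

# A fake batch of 8 grayscale 28x28 images, only for shape checking.
fake_images = torch.randn(8, 1, 28, 28)
scores = model(fake_images)
print(scores.shape)  # torch.Size([8, 10]): one 10-dimensional score vector per image
```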
Exercise:
Try replacing the `MNISTClassifier` used so far with our fresh `MNISTClassifierComplex` (see the sketch below). Do you get better results on the training set?
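If you want a starting point, here is a minimal sketch of the swap inside a standard training loop. The `train_loader` name, the loss, and the optimizer settings below are assumptions for illustration; reuse whatever you built in the previous sections:

```python
model = MNISTClassifierComplex()  # previously: model = MNISTClassifier()

# Hypothetical setup: adapt it to your own training loop from earlier chapters.
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for tensor_images, labels in train_loader:  # train_loader: your MNIST DataLoader (assumed)
    optimizer.zero_grad()
    scores = model(tensor_images)           # forward pass: (N_batch, 10) score vectors
    loss = criterion(scores, labels)
    loss.backward()
    optimizer.step()
```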
Note: To be thorough, we should compare the performance of these two classifiers on a test dataset. But this is not the focus of this lesson… ☹️