
Chapter 7: Building and Training a Simple Classification Model

Improving our Classifier

Luca Grillotti

Right now, our module MNISTClassifier has only one linear layer.
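As a reminder, that single-layer classifier looks roughly like the sketch below (the exact code from the previous section may differ slightly):

class MNISTClassifier(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # one linear operation mapping a flattened 28x28 image to 10 class scores
        self.linear = torch.nn.Linear(in_features=1 * 28 * 28, out_features=10)

    def forward(self, tensor_images):
        tensor_images = tensor_images.view(-1, 1 * 28 * 28)
        return self.linear(tensor_images)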

Let’s make it more complex by:

  • adding 2 additional linear operations; we will use hidden layers of size 64.
  • adding an activation function just after each of those linear operations. PyTorch provides several activation functions, such as torch.relu, torch.sigmoid and torch.tanh (briefly illustrated below); here we will use relu.
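As a quick illustration, each of these activation functions is applied element-wise to a tensor (the values below are just an example):

import torch

x = torch.tensor([-2.0, 0.0, 3.0])
print(torch.relu(x))     # tensor([0., 0., 3.])  -- negative values are clamped to 0
print(torch.sigmoid(x))  # values squashed into (0, 1)
print(torch.tanh(x))     # values squashed into (-1, 1)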

As before, we just need to define our two methods, __init__ and forward.

__init__

Here we just need to declare our linear operations: the two new ones, linear_1 and linear_2, and the final one, linear_final.

  1. linear_1 takes the flattened images as input and outputs 64 features.
  2. linear_2 takes those 64 features as input and outputs 64 features.
  3. the final operation linear_final takes those 64 features as input and outputs the 10-dimensional score vector.
class MNISTClassifierComplex(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_1 = torch.nn.Linear(in_features=1 * 28 * 28, out_features=64)
        self.linear_2 = torch.nn.Linear(in_features=64, out_features=64)  #  notice that linear_2.in_features == linear_1.out_features
        self.linear_final = torch.nn.Linear(in_features=64, out_features=10)  #  notice that linear_final.in_features == linear_2.out_features

    ... # forward
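To check that the layer sizes line up, you can instantiate the module and print it (a quick sanity check that works even before forward is defined; the output below is approximately what PyTorch displays):

classifier = MNISTClassifierComplex()
print(classifier)
# MNISTClassifierComplex(
#   (linear_1): Linear(in_features=784, out_features=64, bias=True)
#   (linear_2): Linear(in_features=64, out_features=64, bias=True)
#   (linear_final): Linear(in_features=64, out_features=10, bias=True)
# )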

forward

We just need to call the linear operations declared above one by one, passing the resulting tensor through a relu activation after each hidden layer (but not after the final one).

class MNISTClassifierComplex(torch.nn.Module):
    ... # __init__

    def forward(self, tensor_images):
        """
        Args:
            tensor_images: tensor of shape (N_batch, 1, 28, 28)
        """
        tensor_images = tensor_images.view(-1, 1 * 28 * 28)  # flatten each 1x28x28 image into a vector of 784 values

        outcome_scores = tensor_images

        # First linear layer
        outcome_scores = self.linear_1(outcome_scores)  # 1st Linear operation
        outcome_scores = torch.relu(outcome_scores)  # Activation operation (after every hidden linear layer! But not at the end)

        # Second linear layer
        outcome_scores = self.linear_2(outcome_scores)  # 2nd Linear operation
        outcome_scores = torch.relu(outcome_scores)  # Activation operation (after every hidden linear layer! But not at the end)

        # Last linear layer:
        outcome_scores = self.linear_final(outcome_scores)
        # No activation operation on final layer (for this example)

        return outcome_scores
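You can quickly verify the output shape by passing a dummy batch through the module (a small sanity check, not part of the classifier itself):

classifier = MNISTClassifierComplex()
dummy_batch = torch.zeros(32, 1, 28, 28)  # 32 fake images with the same shape as MNIST inputs
scores = classifier(dummy_batch)
print(scores.shape)  # torch.Size([32, 10]) -- one 10-dimensional score vector per image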

Exercise:

Try to replace the MNISTClassifier used so far with our fresh MNISTClassifierComplex.
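For reference, here is a minimal sketch of what the swap could look like. The dataset, loss and optimiser setup below is illustrative (the hyper-parameters and variable names are assumptions, not necessarily the ones used earlier in the chapter):

import torch
import torchvision

transform = torchvision.transforms.ToTensor()
train_dataset = torchvision.datasets.MNIST(root="data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)

classifier = MNISTClassifierComplex()  # <-- instead of MNISTClassifier()
loss_function = torch.nn.CrossEntropyLoss()
optimiser = torch.optim.SGD(classifier.parameters(), lr=0.01)

for tensor_images, labels in train_loader:
    optimiser.zero_grad()
    outcome_scores = classifier(tensor_images)
    loss = loss_function(outcome_scores, labels)
    loss.backward()
    optimiser.step()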

Do you get better results on the training set?

Note: To be complete, we should compare the performance of those two classifiers on a testing dataset. But this is not the focus of this lesson… ☹️