
Chapter 7: Building and Training a Simple Classification Model

Training our Classifier

Luca Grillotti

Remember the code we had for training our linear model?

def train_parameters_linear_regression(tensor_number_tasks,
                                       tensor_number_questions,
                                       learning_rate=0.02,
                                       number_training_steps=200):
    """
    Instantiates the ModelNumberQuestions model and the MSE loss, and optimises the parameters of the model,
    given the dataset of tensor_number_tasks and tensor_number_questions.

    Args:
        tensor_number_tasks (torch.Tensor): of size (n, 1) where n is the number of tasks (it is also the number of questions)
        tensor_number_questions (torch.Tensor): of size (n, 1) where n is the number of questions (it is also the number of tasks)
        learning_rate (float):
        number_training_steps (int):

    Returns:
        trained network (ModelNumberQuestions)
    """
    net = ModelNumberQuestions()  # model
    loss = torch.nn.MSELoss()  # loss module

    optimiser = torch.optim.SGD(net.parameters(), lr=learning_rate)

    for _ in range(number_training_steps):
        optimiser.zero_grad()

        # Compute Loss
        estimator_number_questions = net.forward(tensor_number_tasks)
        mse_loss = loss.forward(input=estimator_number_questions,
                                target=tensor_number_questions)

        mse_loss.backward()
        optimiser.step()
        print("loss:", mse_loss.item())

    print("Final Parameters:\n", list(net.named_parameters()))
    return net

Well, the code we have for training our classifier is super similar! But we will make some slight adjustments:

Loss

Previously, we said that we would use a cross-entropy loss.

cross_entropy_loss = torch.nn.CrossEntropyLoss()
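As a reminder of how this loss is used, here is a minimal, self-contained sketch with made-up tensors (not part of the course model): torch.nn.CrossEntropyLoss expects the raw, unnormalised scores (logits) produced by the model as input, and integer class labels as target.

import torch

cross_entropy_loss = torch.nn.CrossEntropyLoss()

# Hypothetical example: a batch of 3 images and 10 possible digit classes.
fake_logits = torch.randn(3, 10)       # raw scores from a classifier, shape (batch_size, number_classes)
fake_labels = torch.tensor([2, 7, 0])  # integer class labels, shape (batch_size,)

value_loss = cross_entropy_loss(input=fake_logits, target=fake_labels)
print(value_loss.item())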

Which Optimiser should we use?

So far, we have mostly relied on the Stochastic Gradient Descent (SGD) optimiser.

Here, we use the Adam optimiser, which is much more commonly used in practice:

optimiser = torch.optim.Adam(net.parameters())
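Note that we did not specify a learning rate here: torch.optim.Adam uses a default learning rate of 1e-3 and adapts the effective step size for each parameter during training. If you want to, you can still set it explicitly (a minimal sketch, with the value chosen for illustration):

optimiser = torch.optim.Adam(net.parameters(), lr=1e-3)  # lr=1e-3 is the default value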

Training Loop

In the end, our training code looks like this (we have refactored it slightly):

def train_classifier(mnist_classifier, loss, optimiser, dataset_images, dataset_labels, number_training_steps):
    for _ in range(number_training_steps):
        optimiser.zero_grad()

        # Compute Loss
        predicted_logits = mnist_classifier(dataset_images)
        value_loss = loss.forward(input=predicted_logits,
                                  target=dataset_labels)

        value_loss.backward()
        optimiser.step()
        print("loss:", value_loss.item())

And we declare our model, loss, and optimiser outside the function:

mnist_classifier = MNISTClassifier()
loss = torch.nn.CrossEntropyLoss()
optimiser = torch.optim.Adam(mnist_classifier.parameters())

Then, we just need a simple call to train_classifier and the work is done! :D

train_classifier(mnist_classifier,
                 loss,
                 optimiser,
                 dataset_training_images,
                 dataset_training_labels,
                 number_training_steps=500
                 )

However, passing the entire dataset through the model at every training step may be:

  • computationally expensive, as each step performs a forward and backward pass over the whole dataset to compute a single gradient update.
  • memory inefficient, as we need to load the whole dataset into memory at once.

That is why it is more common to divide our dataset into batches.
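For example, here is a minimal sketch of how the training loop above could be adapted to iterate over mini-batches using torch.utils.data.TensorDataset and DataLoader (the function name, the batch_size value and the use of epochs are illustrative choices, not the course's final implementation):

import torch
from torch.utils.data import TensorDataset, DataLoader


def train_classifier_with_batches(mnist_classifier, loss, optimiser,
                                  dataset_images, dataset_labels,
                                  number_epochs, batch_size=64):
    # Wrap the tensors in a dataset and let the DataLoader split it into shuffled batches.
    dataset = TensorDataset(dataset_images, dataset_labels)
    data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

    for _ in range(number_epochs):
        for batch_images, batch_labels in data_loader:
            optimiser.zero_grad()

            # Compute the loss on the current batch only
            predicted_logits = mnist_classifier(batch_images)
            value_loss = loss.forward(input=predicted_logits,
                                      target=batch_labels)

            value_loss.backward()
            optimiser.step()

        print("loss:", value_loss.item())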