
Chapter 5: Training a Linear Model with PyTorch

Loss Functions as Modules

Luca Grillotti

Using a Module to implement our loss function

Remember the vectorised computations we used to compute the loss function?

difference_tensor = estimator_number_questions - tensor_number_questions  # element-wise errors
squared_difference_tensor = difference_tensor * difference_tensor  # element-wise squared errors
loss_tensor = squared_difference_tensor.mean()  # mean squared error

All these computations operate on torch tensors. As a consequence, nothing prevents us from creating a module MeanSquaredError that performs the computations above.

import torch


class MeanSquaredError(torch.nn.Module):
    def __init__(self):
        super().__init__()  # initialise the parent torch.nn.Module

    def forward(self, estimator_number_questions, tensor_number_questions):
        # Same vectorised computations as before, wrapped in a module.
        difference_tensor = estimator_number_questions - tensor_number_questions
        squared_difference_tensor = difference_tensor * difference_tensor
        loss_tensor = squared_difference_tensor.mean()
        return loss_tensor
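
As a quick sanity check (the numbers below are made up purely for illustration), calling the module on two small tensors should return the mean of the squared differences:

mse = MeanSquaredError()
prediction = torch.tensor([[2.0], [4.0]])
target = torch.tensor([[1.0], [5.0]])
print(mse(prediction, target))  # tensor(1.) since ((2 - 1)**2 + (4 - 5)**2) / 2 = 1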

Then the loss can be easily computed using the modules we implemented:

net = ModelNumberQuestions()
loss = MeanSquaredError()
...
estimator_number_questions = net(tensor_number_tasks)  # calling the module runs its forward method
loss_tensor = loss(estimator_number_questions, tensor_number_questions)
loss_tensor.backward() # If you want to compute gradients for parameters.
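
After loss_tensor.backward(), every parameter of net stores its gradient in its .grad attribute. A minimal way to inspect them (assuming ModelNumberQuestions registers its parameters like any torch.nn.Module):

for name, parameter in net.named_parameters():
    print(name, parameter.grad)  # gradients populated by loss_tensor.backward()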

Using Loss functions already implemented in PyTorch

Usually, we don’t need to implement our own loss modules: the most common ones are already provided in PyTorch.

For example, the MeanSquaredError module implemented above already exists in PyTorch under the name torch.nn.MSELoss.

net = ModelNumberQuestions()
loss = torch.nn.MSELoss() # Now we're using the MSELoss() provided in PyTorch
...
estimator_number_questions = net(tensor_number_tasks)
loss_tensor = loss(estimator_number_questions, tensor_number_questions)
loss_tensor.backward() # If you want to compute gradients for parameters.
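
If you want to convince yourself that the two modules agree, a quick check on random tensors (purely illustrative) should print the same value twice. Note that torch.nn.MSELoss averages over all elements by default (reduction='mean'), which matches the .mean() call in our own module:

prediction = torch.rand(5, 1)
target = torch.rand(5, 1)
print(MeanSquaredError()(prediction, target))  # our hand-written module
print(torch.nn.MSELoss()(prediction, target))  # PyTorch's built-in module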

Exercise

Modify your implementation of train_parameters_linear_regression to use torch.nn.MSELoss.

We don’t need compute_loss anymore!

def train_parameters_linear_regression(tensor_number_tasks,
                                       tensor_number_questions,
                                       learning_rate=0.02,
                                       number_training_steps=200):
    """
    Instantiate ModelNumberQuestions model and Loss, and optimises the parameters of the model, given the dataset
    of tensor_number_tasks and tensor_number_tasks.

    Args:
        tensor_number_tasks (torch.Tensor): of size (n, 1) where n is the number of questions (it is also the number of tasks)
        tensor_number_questions (torch.Tensor): of size (n, 1) where n is the number of questions (it is also the number of tasks)
        learning_rate (float):
        number_training_steps (int):

    Returns:
        trained network (ModelNumberQuestions)
    """
    net = ModelNumberQuestions()  # model
    loss = torch.nn.MSELoss()  # loss module

    optimiser = torch.optim.SGD(net.parameters(), lr=learning_rate)

    for _ in range(number_training_steps):
        optimiser.zero_grad()

        # Compute Loss
        estimator_number_questions = net(tensor_number_tasks)
        mse_loss = loss(input=estimator_number_questions,
                        target=tensor_number_questions)

        mse_loss.backward()
        optimiser.step()
        print("loss:", mse_loss.item())

    print("Final Parameters:\n", list(net.named_parameters()))
    return net
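
As a usage sketch (with a made-up dataset, and assuming ModelNumberQuestions is the linear model defined earlier in the chapter):

tensor_number_tasks = torch.arange(1., 11.).reshape(-1, 1)  # 10 made-up samples
tensor_number_questions = 3 * tensor_number_tasks  # pretend each task yields 3 questions
trained_net = train_parameters_linear_regression(tensor_number_tasks,
                                                 tensor_number_questions)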