Introduction to Deep Learning with PyTorch
Chapter 5: Training a Linear Model with PyTorch
PyTorch Modules
It is much more common to create a Module to implement a PyTorch model. Let’s do that for the example we had before!
We first create a class ModelNumberQuestions, inheriting from torch.nn.Module. We have to define two methods for our ModelNumberQuestions (a bare skeleton is sketched below):
- __init__(...), where all our parameters will be declared.
- forward(...), which returns the output of the model.
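Put together, such a module has the following skeleton (just a sketch; the two method bodies are filled in in the next sections):

import torch

class ModelNumberQuestions(torch.nn.Module):
    def __init__(self):
        super().__init__()  # always call the parent constructor first
        # declare all the parameters of the model as attributes here

    def forward(self, tensor_number_tasks):
        # compute and return the output of the model
        ...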
__init__(...)
First, we define all the parameters of our module. All the module parameters should be defined as attributes in the __init__:
import torch

class ModelNumberQuestions(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Initial values of the two parameters of our linear model.
        initial_theta_0 = torch.Tensor([1])
        initial_theta_1 = torch.Tensor([2])
        # Wrapping a tensor in torch.nn.Parameter registers it as a
        # parameter of the module.
        self.theta_0 = torch.nn.Parameter(initial_theta_0)
        self.theta_1 = torch.nn.Parameter(initial_theta_1)
    ...
Now, you can use the parameters() (or named_parameters()) method to directly see the module parameters:
net = ModelNumberQuestions()
print(list(net.parameters()))
print(list(net.named_parameters()))
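Running this should print something like the following (the exact formatting may vary between PyTorch versions):

[Parameter containing:
tensor([1.], requires_grad=True), Parameter containing:
tensor([2.], requires_grad=True)]
[('theta_0', Parameter containing:
tensor([1.], requires_grad=True)), ('theta_1', Parameter containing:
tensor([2.], requires_grad=True))]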
forward(...)
The forward method is used to compute our estimator \widehat{n_Q} given its input (here: the number of tasks n_T). Thus, forward simply implements the operation \widehat{n_Q} = \theta_1 \, n_T + \theta_0.
import torch

class ModelNumberQuestions(torch.nn.Module):
    def __init__(self):
        super().__init__()
        initial_theta_0 = torch.Tensor([1])
        initial_theta_1 = torch.Tensor([2])
        self.theta_0 = torch.nn.Parameter(initial_theta_0)
        self.theta_1 = torch.nn.Parameter(initial_theta_1)

    def forward(self, tensor_number_tasks):
        # Linear model: theta_1 * n_T + theta_0
        return self.theta_1 * tensor_number_tasks + self.theta_0
Now, if you want to compute the estimator output by the model, you can simply call the forward method:
net = ModelNumberQuestions()
tensor_number_tasks = torch.Tensor([3])
net.forward(tensor_number_tasks)
Alternatively, instead of net.forward(tensor_number_tasks), you may directly use:
net(tensor_number_tasks)
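Calling the module directly is the preferred form: it goes through torch.nn.Module.__call__, which also runs any registered hooks before invoking forward. If you want the prediction as a plain Python number, you can use .item(), as in this small sketch:

prediction = net(tensor_number_tasks)  # tensor([7.], grad_fn=<AddBackward0>)
print(prediction.item())               # 7.0, since 2 * 3 + 1 = 7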
Exercise
Modify the optimisation procedure you implemented before to use the torch module ModelNumberQuestions given above.
Hint: To apply the optimiser to all the parameters of the module, you can replace the list_parameters we’ve been using so far by net.parameters():
optimiser = torch.optim.SGD(params=net.parameters(), lr=learning_rate)
First, modify our function compute_loss, replacing theta_0 and theta_1 with our module:
def compute_loss(list_number_tasks, list_number_questions, model_number_questions):
    mse_loss = torch.Tensor([0])
    for number_tasks, number_questions in zip(list_number_tasks, list_number_questions):
        # Computing the squared error for a single data sample (number_tasks, number_questions).
        tensor_number_tasks = torch.Tensor([number_tasks])
        estimator_number_questions = model_number_questions(tensor_number_tasks)
        error = estimator_number_questions - number_questions
        squared_error = error * error
        # Adding the computed squared error to the loss.
        mse_loss += squared_error
    # Computing the mean squared error.
    mse_loss /= len(list_number_tasks)
    return mse_loss
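As an aside, the same loss can be computed without the Python loop by batching the whole dataset into single tensors. A minimal sketch (the name compute_loss_vectorised is ours, and it assumes the dataset fits in memory):

def compute_loss_vectorised(list_number_tasks, list_number_questions, model_number_questions):
    # Stack the dataset into 1-D tensors of size n.
    tensor_number_tasks = torch.Tensor(list_number_tasks)
    tensor_number_questions = torch.Tensor(list_number_questions)
    # The linear model broadcasts over the batch dimension.
    estimator_number_questions = model_number_questions(tensor_number_tasks)
    # Mean squared error over all n samples.
    return torch.nn.functional.mse_loss(estimator_number_questions, tensor_number_questions)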
Then, we can use that function in train_parameters_linear_regression to optimise our module parameters:
def train_parameters_linear_regression(list_number_tasks, list_number_questions, learning_rate=0.02, number_training_steps=200):
    """
    Instantiates a ModelNumberQuestions model and optimises its parameters, given the dataset
    of list_number_tasks and list_number_questions.

    Args:
        list_number_tasks (List[float]): of size n, where n is the number of data samples
        list_number_questions (List[float]): of size n, where n is the number of data samples
        learning_rate (float): learning rate used by the SGD optimiser
        number_training_steps (int): number of gradient-descent steps to perform

    Returns:
        trained network (ModelNumberQuestions)
    """
    net = ModelNumberQuestions()
    optimiser = torch.optim.SGD(net.parameters(), lr=learning_rate)
    for _ in range(number_training_steps):
        optimiser.zero_grad()  # Reset the gradients accumulated in the previous step.
        mse_loss = compute_loss(list_number_tasks, list_number_questions, model_number_questions=net)
        mse_loss.backward()  # Compute gradients.
        optimiser.step()  # Perform one gradient-descent step.
        print("loss:", mse_loss.item())
    print("Final Parameters:\n", list(net.named_parameters()))
    return net
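To try it out, you can call the function on a small dataset. The numbers below are made up for illustration; with them, the printed loss should decrease and the parameters should move towards the least-squares fit (roughly theta_1 ≈ 1.95 and theta_0 ≈ 1.15):

list_number_tasks = [1.0, 2.0, 3.0, 4.0, 5.0]
list_number_questions = [3.2, 4.9, 7.1, 8.8, 11.0]
net = train_parameters_linear_regression(list_number_tasks, list_number_questions)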