Introduction to Deep Learning with PyTorch
Chapter 5: Training a Linear Model with PyTorch
Vectorising your computations
When manipulating tensors, for loops are usually very time-consuming.
Instead, we prefer vectorising our operations as much as possible.
PyTorch and other Machine Learning libraries are optimised to make vectorised operations much faster. In particular, vectorised operations can benefit from the hardware acceleration provided by GPUs and TPUs.
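To make the difference concrete, here is a minimal sketch comparing a Python for loop with the equivalent vectorised operation (the tensor size and variable names are just for illustration, not part of the course code):

import time

import torch

x = torch.rand(100_000)
y = torch.rand(100_000)

# Element-wise addition with a Python for loop: one interpreter iteration per element.
start = time.perf_counter()
result_loop = torch.empty(100_000)
for i in range(100_000):
    result_loop[i] = x[i] + y[i]
print("for loop:  ", time.perf_counter() - start, "seconds")

# The same addition, vectorised: a single call that runs in PyTorch's optimised backend.
start = time.perf_counter()
result_vectorised = x + y
print("vectorised:", time.perf_counter() - start, "seconds")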
To start vectorising our operations, let’s first put all our data in the same tensors:
import torch
tensor_number_tasks = torch.Tensor([[1.],
                                    [2.],
                                    [4.],
                                    [4.],
                                    [5.],
                                    [6.],
                                    [6.],
                                    [6.],
                                    [8.],
                                    [8.],
                                    [9.],
                                    [10.]])
tensor_number_questions = torch.Tensor([[5.],
                                        [11.],
                                        [21.],
                                        [22.],
                                        [26.],
                                        [31.],
                                        [32.],
                                        [31.],
                                        [41.],
                                        [42.],
                                        [48.],
                                        [52.]])
Here is a simpler way to initialise the tensors above:
tensor_number_tasks = torch.Tensor([1, 2, 4, 4, 5, 6, 6, 6, 8, 8, 9, 10]).view(-1, 1)
tensor_number_questions = torch.Tensor([5, 11, 21, 22, 26, 31, 32, 31, 41, 42, 48, 52]).view(-1, 1)
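If you are unsure what view(-1, 1) does: it reshapes the 1-D tensor into a column with one value per row, and the -1 lets PyTorch infer that dimension from the number of elements. A quick check of the shapes:

flat = torch.Tensor([1, 2, 4, 4, 5, 6, 6, 6, 8, 8, 9, 10])
print(flat.shape)              # torch.Size([12])
print(flat.view(-1, 1).shape)  # torch.Size([12, 1])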
You can then calculate the estimated number of questions, estimator_number_questions, for all the values in tensor_number_tasks at the same time!
estimator_number_questions = net.forward(tensor_number_tasks)
And following the formula of the loss provided before, \(L(\theta) = \dfrac{1}{12} \sum_{i=0}^{11} \big(\widehat{n_Q^{(i)}} - n_Q^{(i)}\big)^2\):
difference_tensor = (estimator_number_questions - tensor_number_questions)
squared_difference_tensor = difference_tensor * difference_tensor
loss_tensor = squared_difference_tensor.mean()
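As a sanity check (this is an aside, not part of the course code), PyTorch also provides this exact reduction as a built-in, torch.nn.functional.mse_loss, which should return the same value as loss_tensor:

import torch.nn.functional as F

builtin_loss = F.mse_loss(estimator_number_questions, tensor_number_questions)
print(loss_tensor.item(), builtin_loss.item())  # the two values should match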
Exercise
Try to vectorise the code you were using for training ModelNumberQuestions.
The signature of your train_parameters_linear_regression function should become:
def train_parameters_linear_regression(tensor_number_tasks, tensor_number_questions, learning_rate=0.02, number_training_steps=200):
    """
    Instantiates a ModelNumberQuestions model and optimises its parameters, given the dataset
    of tensor_number_tasks and tensor_number_questions.
    Args:
        tensor_number_tasks (torch.Tensor): of size (n, 1), where n is the number of data samples
        tensor_number_questions (torch.Tensor): of size (n, 1), where n is the number of data samples
        learning_rate (float): step size used for gradient descent
        number_training_steps (int): number of gradient-descent steps
    Returns:
        trained network (ModelNumberQuestions)
    """
Our function compute_loss can be vectorised!
def compute_loss(tensor_number_tasks, tensor_number_questions, model_number_questions):
    # computing estimator number questions for all data samples present in tensor_number_tasks
    estimator_number_questions = model_number_questions(tensor_number_tasks)
    # computing squared error for all data samples
    error = estimator_number_questions - tensor_number_questions
    squared_error = error * error
    # computing mean squared error
    mse_loss = torch.mean(squared_error)
    return mse_loss
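ModelNumberQuestions itself was defined earlier in the chapter. If you do not have that definition at hand, here is a minimal sketch consistent with the code above (an assumption for illustration, not necessarily the original definition): a single nn.Linear layer mapping one input feature to one output.

import torch.nn as nn

class ModelNumberQuestions(nn.Module):
    # Assumed minimal definition: predicted number of questions = w * number of tasks + b.
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(in_features=1, out_features=1)

    def forward(self, tensor_number_tasks):
        return self.linear(tensor_number_tasks)

# Quick usage check of the vectorised compute_loss:
model = ModelNumberQuestions()
print(compute_loss(tensor_number_tasks, tensor_number_questions, model).item())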
Similarly, we also vectorise train_parameters_linear_regression:
def train_parameters_linear_regression(tensor_number_tasks, tensor_number_questions, learning_rate=0.02, number_training_steps=200):
    """
    Instantiates a ModelNumberQuestions model and optimises its parameters, given the dataset
    of tensor_number_tasks and tensor_number_questions.
    Args:
        tensor_number_tasks (torch.Tensor): of size (n, 1), where n is the number of data samples
        tensor_number_questions (torch.Tensor): of size (n, 1), where n is the number of data samples
        learning_rate (float): step size used for gradient descent
        number_training_steps (int): number of gradient-descent steps
    Returns:
        trained network (ModelNumberQuestions)
    """
    net = ModelNumberQuestions()
    optimiser = torch.optim.SGD(net.parameters(), lr=learning_rate)
    for _ in range(number_training_steps):
        optimiser.zero_grad()
        mse_loss = compute_loss(tensor_number_tasks, tensor_number_questions, model_number_questions=net)
        mse_loss.backward()  # Compute gradients
        optimiser.step()  # Perform one gradient-descent step
        print("loss:", mse_loss.item())
    print("Final Parameters:\n", list(net.named_parameters()))
    return net
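For completeness, here is how you might call the vectorised training function on the dataset built at the start of this section (the hyperparameter values below are just the function's defaults, not tuned values):

trained_net = train_parameters_linear_regression(tensor_number_tasks,
                                                 tensor_number_questions,
                                                 learning_rate=0.02,
                                                 number_training_steps=200)

# Predict the number of questions for a new number of tasks, e.g. 7 tasks.
print(trained_net(torch.Tensor([[7.]])).item())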