
Chapter 4: PyTorch for Automatic Gradient Descent

1-dimensional Gradient Descent with torch Tensors

Luca Grillotti

Remember the gradient_descent function you implemented before?

def gradient_descent(initial_theta, learning_rate, number_steps):
    """
    Args:
        initial_theta (float): Initial value of theta
        learning_rate (float)
        number_steps (int): number of 1-step gradient descents to perform.

    Returns:
        final_theta (float): Final value of theta after multiple 1-step gradient descents
    """

Let’s make it torch-compliant!

Exercise: Write a function gradient_descent_torch that does the same thing as gradient_descent, but with tensor variables instead of floats. We consider the same function as before, L(\theta) = \theta^2, where \theta is a tensor of shape (1,).
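As a reminder, each 1-step gradient descent replaces \theta with \theta - \alpha \nabla_\theta L(\theta), where \alpha is the learning rate. Since L(\theta) = \theta^2, the gradient is simply \nabla_\theta L(\theta) = 2\theta.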

def gradient_descent_torch(initial_theta, learning_rate, number_steps):
    """
    Args:
        initial_theta (torch.Tensor): Initial value of theta
        learning_rate (float)
        number_steps (int): number of 1-step gradient descents to perform.

    Returns:
        final_theta (torch.Tensor): Final value of theta after multiple 1-step gradient descents
    """

Solution:
import torch

def get_gradient_tensor(tensor):
    # gradient of L(theta) = theta^2 is 2 * theta
    return tensor * 2

def gradient_descent_torch(initial_theta, learning_rate, number_steps):
    """
    Args:
        initial_theta (torch.Tensor): Initial value of theta
        learning_rate (float)
        number_steps (int): number of 1-step gradient descents to perform.

    Returns:
        final_theta (torch.Tensor): Final value of theta after multiple 1-step gradient descents
    """
    tensor = initial_theta
    for _ in range(number_steps):
        # one gradient-descent step: theta <- theta - learning_rate * grad L(theta)
        tensor = tensor - learning_rate * get_gradient_tensor(tensor)
        print(tensor)

    return tensor

initial_theta = torch.Tensor([1])  # theta as a tensor of shape (1,)
gradient_descent_torch(initial_theta,
                       learning_rate=0.2,
                       number_steps=20)
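Since each update multiplies \theta by (1 - 2 \times 0.2) = 0.6, after 20 steps we expect \theta \approx 0.6^{20} \approx 3.7 \times 10^{-5}: the printed values shrink geometrically towards 0.

Note that get_gradient_tensor hard-codes the derivative of \theta^2 by hand. Given the chapter's theme, the natural next step is to let torch compute the gradient automatically. Below is a minimal sketch of the same loop using torch's autograd machinery (requires_grad_, backward, no_grad); the function name gradient_descent_autograd is ours, purely for illustration:

import torch

def gradient_descent_autograd(initial_theta, learning_rate, number_steps):
    # ask torch to track operations on theta, so it can compute dL/dtheta itself
    theta = initial_theta.clone().requires_grad_(True)
    for _ in range(number_steps):
        loss = theta ** 2            # L(theta) = theta^2
        loss.backward()              # fills theta.grad with dL/dtheta = 2 * theta
        with torch.no_grad():        # perform the update without recording it
            theta -= learning_rate * theta.grad
        theta.grad.zero_()           # gradients accumulate, so reset them each step
        print(theta)
    return theta

gradient_descent_autograd(torch.Tensor([1]),
                          learning_rate=0.2,
                          number_steps=20)

Both versions perform exactly the same updates; the autograd version simply no longer needs the gradient to be derived by hand.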