This is an archived version of the course and is no longer updated. Please find the latest version of the course on the main webpage.

How to design a Deep Learning model [Theory]

Before implementing a Deep Learning model, you need to define precisely the specifications it has to satisfy.

In particular, you will need to think about:

  • what type of data do you want to consider? This determines how the input of your neural network has to be processed.
  • what do you intend to do? This determines the output of your neural network and the loss function that you will minimise.
  • how will you adjust the hidden structure of the network (e.g. the number of neurons, the number of layers…) to the complexity of the problem?
  • how do you intend to train your model?

The input data

First, you will need to consider which type of data you want your model to process. Depending on the problem considered, the input data may be:

  • images
  • sequences
  • videos
  • audio files

You may also need to pre-process your dataset (e.g. by normalising it, by projecting it onto a lower-dimensional space, or by adding random noise to the images).
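
As an illustration of two of the pre-processing steps mentioned above, here is a minimal sketch using only the Python standard library. The function names `normalise` and `add_gaussian_noise`, the noise level `sigma` and the pixel values are assumptions made for this example; in practice you would most likely rely on the utilities of your Deep Learning library.

```python
import random
import statistics


def normalise(values):
    """Rescale a list of numbers to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical: centre them and avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def add_gaussian_noise(values, sigma=0.1, seed=None):
    """Return a copy of the values with independent Gaussian noise added."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical pixel intensities of a tiny image (illustration only).
pixels = [0.0, 64.0, 128.0, 255.0]
normalised = normalise(pixels)
noisy = add_gaussian_noise(normalised, sigma=0.05, seed=0)
```

Normalising first and adding noise afterwards keeps the noise level `sigma` expressed in the same units for every feature.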

Output and loss

Once you have defined precisely how to process the input, you will have to decide on the purpose of your model. In other words, you will have to ask yourself:

  • What are the outputs of your neural network?
  • Which loss function \(L(\cdot)\) should be minimised to achieve the intended purpose? How will the outputs be processed by that loss function?
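
To make the link between outputs and loss concrete, here is a minimal sketch for one possible purpose, classification. It assumes that the network outputs one raw score per class and that these scores are processed by a softmax followed by a cross-entropy loss. This is only one common choice; other purposes call for other outputs and other losses. The function names and the example scores are illustrative.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, target_index):
    """Loss for one example: negative log-probability of the true class."""
    probabilities = softmax(scores)
    return -math.log(probabilities[target_index])


# Hypothetical network outputs for a 3-class problem.
scores = [2.0, 0.5, -1.0]
probabilities = softmax(scores)
loss_if_class_0 = cross_entropy(scores, 0)  # true class has the highest score
loss_if_class_2 = cross_entropy(scores, 2)  # true class has the lowest score
```

The loss is small when the network gives a high score to the true class and large otherwise, which is the behaviour you want a loss function to have.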

Hidden structure

Once you have defined the input and output of your neural network, you will have to consider the internal structure of your model.

That internal structure determines the overall complexity of your model.

Here are some things you may need to think about while you design it:

  • What type of layer do you need? (Fully Connected? Convolutional?)
  • Which activation function \(\phi(\cdot)\) do you intend to use?
  • How many layers do you want?
  • How many neurons do you want in each layer?
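
The design choices listed above can be seen as the parameters of a small program. The sketch below assumes fully connected layers and a ReLU activation function \(\phi(\cdot)\); the list `layer_sizes` fixes both the number of layers and the number of neurons in each of them. All names and sizes are chosen for this example only.

```python
import math
import random


def relu(x):
    """One possible activation function."""
    return max(0.0, x)


def init_network(layer_sizes, seed=0):
    """Create random weights and zero biases for each fully connected layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x, activation=relu):
    """Apply each layer in turn; the activation is skipped on the last layer."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [activation(v) for v in x]
    return x


def count_parameters(layers):
    """Number of trainable weights and biases, a rough measure of complexity."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


# 4 inputs, two hidden layers of 8 neurons each, 3 outputs.
layer_sizes = [4, 8, 8, 3]
network = init_network(layer_sizes)
output = forward(network, [0.1, -0.2, 0.3, 0.4])
n_parameters = count_parameters(network)
```

Changing `layer_sizes` or passing another `activation` changes the hidden structure without touching the rest of the code, and `count_parameters` shows how quickly the number of weights grows with the width and depth of the network.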

A validation dataset can be used to evaluate each candidate model and to tune these hyper-parameters.

Training and testing

Optimiser

You will have to choose an optimiser that updates the weights of the neural network in order to minimise the loss \(L(\cdot)\). Most Deep Learning libraries provide several gradient-based optimisers, such as:

  • Stochastic Gradient Descent (SGD)
  • RMSProp
  • Adam

The way these optimisers work is beyond the scope of this course.

Training

The training phase usually consists of the following steps:

  1. Take a batch \(B_{\mathcal{D}}\) of data.
  2. Compute the output \(o\) of the neural network given the batch \(B_{\mathcal{D}}\) as input.
  3. Compute the loss function \(L(\cdot)\) from the output \(o\).
  4. Compute the gradient of the loss \(\dfrac{\partial L}{\partial\cdot}\) with respect to the weights of the network, using the backpropagation algorithm.
  5. Perform one optimisation step using the chosen optimiser.
  6. Start again at step 1.
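
The steps above can be sketched on a deliberately tiny problem. The example below assumes a model with a single linear neuron, a mean squared error loss, plain SGD as the optimiser and a synthetic dataset generated from \(y = 2x + 1\). Because the model is so small, the gradient is written out by hand with the chain rule, whereas a Deep Learning library would obtain it through backpropagation. The dataset, learning rate, batch size and number of steps are assumptions made for this illustration.

```python
import random

rng = random.Random(0)

# Synthetic dataset (illustration only): targets follow y = 2x + 1.
inputs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
data = [(x, 2.0 * x + 1.0) for x in inputs]

# Weights of the model o = w * x + b, initialised to zero.
w, b = 0.0, 0.0
learning_rate = 0.1
batch_size = 16


def mean_squared_error(outputs, targets):
    """Average squared difference between outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


for step in range(500):
    # 1. Take a batch of data.
    batch = rng.sample(data, batch_size)
    xs = [x for x, _ in batch]
    ys = [y for _, y in batch]

    # 2. Compute the output of the model for this batch.
    outputs = [w * x + b for x in xs]

    # 3. Compute the loss from the output.
    loss = mean_squared_error(outputs, ys)

    # 4. Compute the gradient of the loss with respect to w and b
    #    (chain rule applied by hand to the mean squared error).
    grad_w = sum(2.0 * (o - y) * x for o, y, x in zip(outputs, ys, xs)) / batch_size
    grad_b = sum(2.0 * (o - y) for o, y in zip(outputs, ys)) / batch_size

    # 5. Perform one optimisation step (plain SGD).
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

    # 6. The loop starts again at step 1 with a new batch.
```

After the loop, `w` and `b` should be close to the values 2 and 1 used to generate the data. In a real project, steps 4 and 5 are typically delegated to the library and the optimiser you have chosen.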

Testing

After the training phase, the accuracy of the trained model can be evaluated on a test dataset, i.e. on data that was not used during training.
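
For a classification problem, this evaluation can be as simple as counting correct predictions. The sketch below assumes that the model's predictions and the true labels of the test dataset are available as two lists of class indices; the function name `accuracy` and the example values are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of test examples for which the prediction equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("the test dataset must not be empty")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical true labels and model predictions on five test examples.
test_labels = [0, 1, 1, 2, 0]
test_predictions = [0, 1, 2, 2, 0]
test_accuracy = accuracy(test_predictions, test_labels)  # 4 of the 5 match
```

Accuracy is only meaningful for classification; for other purposes you would report another metric, often the loss itself computed on the test dataset.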