Introduction to Scikit-learn > Pipeline in scikit-learn | Python Programming (70053 Autumn Term 2022/2023) | Department of Computing | Imperial College London

This is an archived version of the course. Please find the latest version of the course on the main webpage.

Introduction to Scikit-learn

Chapter 2: Classification pipeline

Pipeline in scikit-learn

Josiah Wang

In scikit-learn, the classification pipeline is exactly the same:

1. Arrange data into $\mathbf{X}$ and $\mathbf{y}$
2. Choose your model
3. Initialise your model with some hyperparameters
4. Fit your model to $\mathbf{X}$ and $\mathbf{y}$
5. Predict labels $\hat{\mathbf{y}}^{test}$ for $\mathbf{X}^{test}$
6. Evaluate the model performance by comparing $\hat{\mathbf{y}}^{test}$ against $\mathbf{y}^{test}$
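
As a rough preview, the steps above might look like the following in code. This is only a minimal sketch under assumptions that are not part of the course text: the Iris dataset, a k-nearest neighbours classifier with `n_neighbors=5`, an 80/20 train/test split, and accuracy as the evaluation metric are all illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Arrange data into X (features) and y (labels).
# Assumption: Iris is used purely as a convenient built-in example dataset.
X, y = load_iris(return_X_y=True)

# Hold out a test set so that X^test and y^test exist for the later steps.
# Assumption: the 80/20 split and the fixed random seed are illustrative.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Choose a model and initialise it with some hyperparameters.
# Assumption: k-nearest neighbours with k=5 is an arbitrary choice.
model = KNeighborsClassifier(n_neighbors=5)

# Fit the model to the training X and y.
model.fit(X_train, y_train)

# Predict labels for the test instances.
y_pred = model.predict(X_test)

# Evaluate by comparing the predicted labels against the true test labels.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

Most scikit-learn classifiers expose the same `fit` and `predict` methods, so swapping in a different model typically only changes the line that constructs `model`.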

I will take you through the whole pipeline, showing step by step how to apply scikit-learn to a classification problem.