This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 4: Understanding your features

Josiah Wang

The Iris dataset already comes with pre-processed numerical features rather than raw data. Therefore, there is no need for an explicit feature encoding step. We can use the pre-processed features directly.
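As a quick sanity check, here is a minimal sketch that confirms the features are already numeric. It assumes the dataset was loaded with scikit-learn's load_iris() and stored in a variable called dataset, as in the snippets on this page; the variable name X is only for illustration.

```python
from sklearn.datasets import load_iris
import numpy as np

# Assumption: the dataset is loaded with load_iris(), as elsewhere in this course
dataset = load_iris()

# X is an illustrative name for the feature matrix
X = dataset.data

print(X.shape)  # (150, 4): 150 samples, 4 features
print(X.dtype)  # float64

# Every value is already a floating-point number, so no encoding is needed
print(np.issubdtype(X.dtype, np.floating))  # True
```

Since every column is already a floating-point measurement, the array can be passed to a classifier as it is.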

Now that we have examined the categories, let us take a closer look at the features (or attributes) themselves.

So far, we have figured out that there are four features. But…

Question 1: What does each feature represent?

Scikit-learn gives you that information, with an attribute aptly called .feature_names.

>>> feature_names = dataset.feature_names
>>> print(feature_names)
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

You should get four features:

  • sepal length (cm)
  • sepal width (cm)
  • petal length (cm)
  • petal width (cm)
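To see how these names relate to the data, here is a small sketch that pairs each feature name with the corresponding value of the first sample. It assumes dataset comes from load_iris(); first_sample is just an illustrative variable name.

```python
from sklearn.datasets import load_iris

# Assumption: the dataset is loaded with load_iris(), as elsewhere in this course
dataset = load_iris()
feature_names = dataset.feature_names

# The i-th feature name describes the i-th column of dataset.data
first_sample = dataset.data[0]
for name, value in zip(feature_names, first_sample):
    print(f"{name}: {value}")
```

Each feature name labels one column of dataset.data, so the order of the names matches the order of the values in every row.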

If you are botanically challenged like me, then here is a diagram of what sepals and petals are:

[Figure: Petals and Sepals]