This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 3: Array operations

Exercise

Josiah Wang

Let’s do a quick exercise to get you practising NumPy! You will remember it better by doing it than by just reading about it!

Say you are given a dataset of 10 data points, each represented as a 2D vector (a, b). The dataset is therefore stored as a 10 × 2 np.array.

Each data point is also associated with a label, which is either 1 or 2.

import numpy as np

data = np.array([
       [5.46, 3.06],
       [2.08, 1.57],
       [7.28, 3.06],
       [3.35, 4.74],
       [2.51, 2.43],
       [3.91, 4.55],
       [2.72, 3.11],
       [6.05, 2.72],
       [0.84, 3.57],
       [1.14, 6.31]])

label = np.array([1, 2, 1, 2, 2, 2, 1, 1, 2, 2])

Task

Divide data into two subsets data_1 and data_2.

data_1 should contain only data points with the label 1 (there should be 4 data points)

data_2 should contain only data points with the label 2 (there should be 6 data points).

Make use of NumPy boolean indexing for this!
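As a hint, here is a minimal sketch of boolean indexing on a small made-up array. The names points and flags are hypothetical and are not part of the exercise. Comparing an array with a value produces a boolean mask, and indexing with that mask keeps only the rows where the mask is True.

```python
import numpy as np

# Toy data (hypothetical, unrelated to the exercise dataset)
points = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])
flags = np.array([0, 1, 0])

mask = flags == 0        # elementwise comparison gives a boolean array
selected = points[mask]  # keeps only the rows where mask is True (rows 0 and 2)

print(mask)
print(selected.shape)    # (2, 2)
```

The mask must have one entry per row of the array being indexed, which is why a label array of length 10 lines up with a 10 × 2 dataset.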

While we are at it, can you also transpose both data_1 and data_2 so that each is a 2 × N array rather than N × 2? Try to find a function in the official documentation, or just guess!

data_1 = ??????
data_2 = ??????

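# == on arrays compares element by element, so use np.array_equal to get a single True/False
assert np.array_equal(data_1, np.array([[5.46, 7.28, 2.72, 6.05],
                                        [3.06, 3.06, 3.11, 2.72]]))

assert np.array_equal(data_2, np.array([[2.08, 3.35, 2.51, 3.91, 0.84, 1.14],
                                        [1.57, 4.74, 2.43, 4.55, 3.57, 6.31]]))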
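When checking a result against an expected array, keep in mind that == on NumPy arrays compares element by element and returns a boolean array, and that using an array with more than one element directly as a truth value raises a ValueError. The sketch below shows a few ways to reduce the comparison to a single True/False, using small made-up arrays a and b (hypothetical names).

```python
import numpy as np

# Toy arrays (hypothetical, for illustration only)
a = np.array([[1.5, 2.5],
              [3.5, 4.5]])
b = a.copy()

elementwise = a == b               # boolean array with the same shape as a and b
all_match = elementwise.all()      # True only if every element matches
same_exact = np.array_equal(a, b)  # True only if shapes and all elements match
same_close = np.allclose(a, b)     # tolerant comparison, handy after float arithmetic

assert all_match
assert same_exact
assert same_close
```

Exact comparison is sufficient in this exercise because indexing and transposing only rearrange values and do not perform any arithmetic on them.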

Definitely try this yourself first before peeking at the solutions! It’s only one line each!

The solutions below demonstrate two equivalent ways to transpose an np.array: the .transpose() method and the .T attribute.

data_1 = data[label == 1].transpose()
data_2 = data[label == 2].T
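To see why these one-liners work, the intermediate steps can be sketched as follows. The names mask and subset are introduced here for illustration only. The block also uses the function form np.transpose, a third spelling that should behave the same as the two shown above.

```python
import numpy as np

data = np.array([
       [5.46, 3.06],
       [2.08, 1.57],
       [7.28, 3.06],
       [3.35, 4.74],
       [2.51, 2.43],
       [3.91, 4.55],
       [2.72, 3.11],
       [6.05, 2.72],
       [0.84, 3.57],
       [1.14, 6.31]])

label = np.array([1, 2, 1, 2, 2, 2, 1, 1, 2, 2])

mask = label == 1    # boolean array of length 10, True where the label is 1
subset = data[mask]  # the 4 selected rows, shape (4, 2)
data_1 = subset.T    # transposed, shape (2, 4)

# Function form of the transpose, applied to the label-2 rows
data_2 = np.transpose(data[label == 2])  # shape (2, 6)

print(mask.sum())    # number of points with label 1: 4
print(data_1.shape)  # (2, 4)
print(data_2.shape)  # (2, 6)
```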