Introduction to NumPy and Matplotlib
Chapter 3: Array operations
Exercise
Let’s do a quick exercise to get you practising NumPy! You will remember it better by doing it than by just reading about it.
Say you are given a dataset of 10 data points, each represented as a 2D vector (a, b). Your dataset is therefore represented as a 10 \times 2 np.array.
Each data point is also associated with a label, which is either 1 or 2.
import numpy as np

data = np.array([
[5.46, 3.06],
[2.08, 1.57],
[7.28, 3.06],
[3.35, 4.74],
[2.51, 2.43],
[3.91, 4.55],
[2.72, 3.11],
[6.05, 2.72],
[0.84, 3.57],
[1.14, 6.31]])
label = np.array([1, 2, 1, 2, 2, 2, 1, 1, 2, 2])
Task
Divide data into two subsets, data_1 and data_2.
data_1 should contain only the data points with the label 1 (there should be 4 data points).
data_2 should contain only the data points with the label 2 (there should be 6 data points).
Make use of NumPy boolean indexing for this!
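As a reminder of how boolean indexing behaves, here is a minimal sketch on a toy array. The names values, flags, mask and selected are made up for illustration and are not part of the exercise data.

```python
import numpy as np

# Hypothetical toy data, unrelated to the exercise dataset.
values = np.array([10, 20, 30, 40, 50])
flags = np.array([0, 1, 0, 1, 1])

# Comparing an array with a scalar gives a boolean array of the same shape.
mask = flags == 1

# Indexing with that boolean array keeps only the entries where mask is True.
selected = values[mask]

print(selected)  # [20 40 50]
```

The same idea carries over to a 2D array: a 1D boolean mask with one entry per row selects whole rows.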
While we are at it, can you also transpose both data_1 and data_2 so that each is a 2 \times N array rather than N \times 2? Try to find a suitable function in the official documentation, or just guess!
data_1 = ??????
data_2 = ??????
assert np.array_equal(data_1, np.array([[5.46, 7.28, 2.72, 6.05],
                                        [3.06, 3.06, 3.11, 2.72]]))
assert np.array_equal(data_2, np.array([[2.08, 3.35, 2.51, 3.91, 0.84, 1.14],
                                        [1.57, 4.74, 2.43, 4.55, 3.57, 6.31]]))
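A note on checking arrays: == between two NumPy arrays compares element by element and returns a boolean array, so a check on whole arrays needs to reduce that result to a single True or False. The sketch below illustrates this on two small, made-up arrays a and b.

```python
import numpy as np

# Hypothetical arrays used only to illustrate the comparison.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[1.0, 2.0], [3.0, 4.0]])

# Element-wise comparison: a boolean array with the same shape as a and b.
elementwise = a == b

# Using that array directly as a condition is ambiguous, so NumPy raises.
raised = False
try:
    assert a == b
except ValueError:
    raised = True

# Either of these reduces the comparison to a single boolean.
same_exact = np.array_equal(a, b)
same_all = bool((a == b).all())
```

For values that come out of floating-point arithmetic, np.allclose is usually the safer choice. Exact comparison is enough here because the values are only selected and rearranged, never recomputed.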
Definitely try this yourself first before peeking at the solutions! It’s only one line each!
I am demonstrating two different ways to transpose an np.array in these solutions.
data_1 = data[label == 1].transpose()
data_2 = data[label == 2].T
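To see the whole thing run end to end, the sketch below repeats the dataset and the two solution lines, then checks the shapes and contents. The expected_1 and expected_2 names are introduced here just for the check.

```python
import numpy as np

data = np.array([
    [5.46, 3.06],
    [2.08, 1.57],
    [7.28, 3.06],
    [3.35, 4.74],
    [2.51, 2.43],
    [3.91, 4.55],
    [2.72, 3.11],
    [6.05, 2.72],
    [0.84, 3.57],
    [1.14, 6.31]])
label = np.array([1, 2, 1, 2, 2, 2, 1, 1, 2, 2])

# label == 1 is a boolean mask with one entry per row of data.
data_1 = data[label == 1].transpose()  # method form
data_2 = data[label == 2].T            # attribute shorthand for the same operation

print(data_1.shape)  # (2, 4)
print(data_2.shape)  # (2, 6)

expected_1 = np.array([[5.46, 7.28, 2.72, 6.05],
                       [3.06, 3.06, 3.11, 2.72]])
expected_2 = np.array([[2.08, 3.35, 2.51, 3.91, 0.84, 1.14],
                       [1.57, 4.74, 2.43, 4.55, 3.57, 6.31]])
assert np.array_equal(data_1, expected_1)
assert np.array_equal(data_2, expected_2)
```

np.transpose(data[label == 1]) is a third, equivalent spelling if you prefer a function call.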