This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 4: Accessing DataFrame rows and columns

Row and column statistics

Josiah Wang

I introduced the .describe() method for DataFrames earlier. You can also use it to obtain summary statistics for a Series object.

For numeric data, the method returns a Series/DataFrame whose index includes count, mean, std, min, max and the 25%, 50% and 75% percentiles.

>>> df["Speed"].describe()
count    800.000000
mean      68.277500
std       29.060474
min        5.000000
25%       45.000000
50%       65.000000
75%       90.000000
max      180.000000
Name: Speed, dtype: float64
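
Because the result of .describe() is itself a Series, individual statistics can be looked up by their label. Here is a minimal sketch using a small made-up Series in place of the real "Speed" column (the values are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical toy data standing in for the "Speed" column.
speed = pd.Series([45, 60, 80, 100, 65], name="Speed")

# describe() returns a Series indexed by the statistic names.
summary = speed.describe()
print(summary)

# Look up individual statistics by label.
mean_speed = summary["mean"]
median_speed = summary["50%"]  # the 50% percentile is the median
print(mean_speed, median_speed)
```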

For object data such as strings, the resulting Series/DataFrame instead includes count, unique, top (the most common value) and freq (how often that value occurs).

>>> df["Type 1"].describe()
count       800
unique       18
top       Water
freq        112
Name: Type 1, dtype: object
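
The same label lookup works for the summary of string data. The sketch below uses a small made-up Series in place of the real "Type 1" column (the values are hypothetical):

```python
import pandas as pd

# Hypothetical toy data standing in for the "Type 1" column.
types = pd.Series(["Water", "Fire", "Water", "Grass"], dtype="object", name="Type 1")

summary = types.describe()
print(summary)

most_common = summary["top"]    # the most frequent value
occurrences = summary["freq"]   # how many times that value appears
print(most_common, occurrences)
```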

Frequency counts

You can use the .value_counts() method (for Series/DataFrame) to get the frequency counts for each unique element in the data.

>>> df["Type 1"].value_counts()
Water       112
Normal       98
Grass        70
Bug          69
Psychic      57
Fire         52
Electric     44
Rock         44
Dragon       32
Ground       32
Ghost        32
Dark         31
Poison       28
Steel        27
Fighting     27
Ice          24
Fairy        17
Flying        4
Name: Type 1, dtype: int64
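
The result of .value_counts() is also a Series, sorted from the most to the least frequent value, so you can index it by label. Passing normalize=True gives proportions instead of raw counts. A minimal sketch on made-up data (the values are hypothetical, not taken from the real dataset):

```python
import pandas as pd

# Hypothetical toy data standing in for the "Type 1" column.
types = pd.Series(["Water", "Fire", "Water", "Grass", "Water", "Fire"], name="Type 1")

# Raw frequency counts, most frequent first.
counts = types.value_counts()
print(counts)

# Proportions (fractions that sum to 1) instead of counts.
proportions = types.value_counts(normalize=True)

water_count = int(counts["Water"])          # count for a single value
water_share = float(proportions["Water"])   # fraction of rows that are "Water"
most_common = counts.idxmax()               # label with the highest count
print(water_count, water_share, most_common)
```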

Quick exercise

How many Legendary Pokemon are there? Use .value_counts() to discover this!

If your answer is 65, then either your code is correct or you are obsessed with Pokemon!
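
If you are stuck, here is a rough sketch of the idea on a tiny made-up DataFrame. It assumes a boolean column named "Legendary"; check the column name and type in your own DataFrame, as both are assumptions here.

```python
import pandas as pd

# Hypothetical toy DataFrame; the boolean "Legendary" column is an assumption.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Legendary": [False, True, False, False, True],
})

# Count how many rows are True and how many are False.
counts = toy["Legendary"].value_counts()
print(counts)

# Convert to a plain dict so the True/False labels can be looked up directly.
n_legendary = counts.to_dict().get(True, 0)
print(n_legendary)
```

Applying the same call to the corresponding column of the real df should give you the answer to the exercise.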