Introduction to Pandas
Chapter 4: Accessing DataFrame rows and columns
Row and column statistics
I introduced the .describe() method earlier for DataFrames. You can also use it to obtain summary statistics for a Series object.
For numeric data, the method returns a Series/DataFrame whose index includes count, mean, std, min, max and the 25%, 50% and 75% percentiles.
>>> df["Speed"].describe()
count 800.000000
mean 68.277500
std 29.060474
min 5.000000
25% 45.000000
50% 65.000000
75% 90.000000
max 180.000000
Name: Speed, dtype: float64
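Because the result of .describe() is itself a Series, individual statistics can be looked up by their index label. The following is a minimal sketch on a small made-up Series (the values are hypothetical and are not taken from the dataset used in this chapter):

```python
import pandas as pd

# Hypothetical toy data standing in for the "Speed" column
speed = pd.Series([45, 60, 80, 100, 65], name="Speed")

# describe() returns a Series indexed by the name of each statistic
summary = speed.describe()
print(summary)

# Look up single statistics by label
print(summary["mean"])  # arithmetic mean of the five values
print(summary["50%"])   # the median
```

The same lookup works for any label in the output, for example summary["max"] or summary["25%"].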
For object data such as strings, the resulting Series/DataFrame includes count, unique, top (the most common value) and freq (the number of times the top value occurs).
>>> df["Type 1"].describe()
count 800
unique 18
top Water
freq 112
Name: Type 1, dtype: object
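To see where each of these four values comes from, here is a small sketch on made-up data (the column names mirror the chapter, but the rows are hypothetical). It also shows that calling .describe() on a mixed DataFrame summarises only the numeric columns by default:

```python
import pandas as pd

# Hypothetical toy rows, not the chapter's dataset
df_toy = pd.DataFrame({
    "Type 1": ["Water", "Fire", "Water", "Grass", "Water", "Fire"],
    "Speed": [43, 65, 58, 45, 78, 100],
})

# String column: count, unique, top and freq
type_summary = df_toy["Type 1"].describe()
print(type_summary)

# On the whole DataFrame, only numeric columns are described by default
numeric_summary = df_toy.describe()
print(numeric_summary.columns.tolist())
```

Here "Water" appears three times out of six values, so it is reported as top with a freq of 3.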
Frequency counts
You can use the .value_counts() method (available for both Series and DataFrame) to get the frequency count of each unique element in the data.
>>> df["Type 1"].value_counts()
Water 112
Normal 98
Grass 70
Bug 69
Psychic 57
Fire 52
Electric 44
Rock 44
Dragon 32
Ground 32
Ghost 32
Dark 31
Poison 28
Steel 27
Fighting 27
Ice 24
Fairy 17
Flying 4
Name: Type 1, dtype: int64
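The sketch below, again on hypothetical toy data, shows two variations that may be useful: passing normalize=True to get proportions instead of raw counts, and calling .value_counts() on a DataFrame, which counts unique rows (this assumes pandas 1.1 or later, where the DataFrame version is available):

```python
import pandas as pd

# Hypothetical toy rows, not the chapter's dataset
df_toy = pd.DataFrame({
    "Type 1": ["Water", "Fire", "Water", "Grass", "Water", "Fire"],
    "Legendary": [False, False, False, False, True, False],
})

# Raw counts, sorted from most to least frequent
counts = df_toy["Type 1"].value_counts()
print(counts)

# Proportions instead of counts (the values sum to 1)
shares = df_toy["Type 1"].value_counts(normalize=True)
print(shares)

# On a DataFrame, each unique combination of column values is counted
pair_counts = df_toy.value_counts()
print(pair_counts)
```

The result is sorted in descending order of frequency, which is why Water comes first in the chapter's output as well.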
Quick exercise
How many Legendary Pokemon are there? Use .value_counts() to find out!
If your answer is 65, then either your code is correct or you are obsessed with Pokemon!
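If you are unsure how .value_counts() behaves on a True/False column, the following sketch on a made-up boolean Series may help (the values are hypothetical, so it does not reveal anything about the real dataset):

```python
import pandas as pd

# Hypothetical boolean column, not the chapter's dataset
legendary = pd.Series([False, True, False, False, True], name="Legendary")

# One row per distinct value: False and True
counts = legendary.value_counts()
print(counts)

# The number of True values can be read off by label
n_true = counts[True]
print(n_true)

# Cross-check: True counts as 1 when summed
print(legendary.sum())
```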