This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 5: DataFrame operations

Missing values

Josiah Wang

If you examined your DataFrame earlier with df.info() (you did do this, did you not?), you may have noticed that Type 2 has only 414 non-null values.

>>> df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 800 entries, Bulbasaur to Volcanion
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   #           800 non-null    int64
 1   Type 1      800 non-null    object
 2   Type 2      414 non-null    object
 3   Total       800 non-null    int64
 4   HP          800 non-null    int64
 5   Attack      800 non-null    int64
 6   Defense     800 non-null    int64
 7   Sp. Atk     800 non-null    int64
 8   Sp. Def     800 non-null    int64
 9   Speed       800 non-null    int64
 10  Generation  800 non-null    int64
 11  Legendary   800 non-null    bool
dtypes: bool(1), int64(9), object(2)
memory usage: 75.8+ KB

This means that the remaining 386 values for Type 2 (800 - 414) are null, or NA: they are missing. In pandas, a missing value is typically represented as np.nan or None.
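As a small illustration of the point above, the sketch below builds a hypothetical three-row stand-in for the Type 2 column (not the course dataset) that holds one real value, one np.nan and one None. It only assumes that .count(), like the Non-Null Count column of df.info(), counts non-null entries.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the "Type 2" column: one real value
# and two different spellings of "missing".
type2 = pd.Series(
    ["Poison", np.nan, None],
    index=["Bulbasaur", "Charmander", "Squirtle"],
    name="Type 2",
)

# count() ignores missing values, just like the Non-Null Count in df.info().
non_null = type2.count()

# Whatever is left over is missing, whether it was np.nan or None.
missing = len(type2) - non_null

print(non_null, missing)
```

Both np.nan and None end up in the missing tally, which is the same subtraction that gives 800 - 414 = 386 for the full dataset.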

You can use the .isna() method (or its alias, .isnull()) to flag which values are null. Combine that with .sum() to count the missing values in each column.

>>> df.isna().sum()
#               0
Type 1          0
Type 2        386
Total           0
HP              0
Attack          0
Defense         0
Sp. Atk         0
Sp. Def         0
Speed           0
Generation      0
Legendary       0
dtype: int64
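To see why .isna().sum() gives a count, here is a minimal sketch on a tiny hand-built table. The name df_small and its four rows are assumptions made for illustration, not the course data. .isna() returns a DataFrame of booleans, and .sum() adds them up column by column, treating True as 1.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row table standing in for the real DataFrame.
df_small = pd.DataFrame(
    {
        "Type 1": ["Grass", "Fire", "Water", "Bug"],
        "Type 2": ["Poison", np.nan, np.nan, "Flying"],
        "HP": [45, 39, 44, 60],
    },
    index=pd.Index(
        ["Bulbasaur", "Charmander", "Squirtle", "Butterfree"], name="Name"
    ),
)

# Boolean table: True wherever a value is missing.
na_mask = df_small.isna()

# Summing booleans counts the True values in each column.
na_counts = na_mask.sum()

# .isnull() is an alias of .isna(), so the counts are identical.
same_as_isnull = bool((df_small.isnull().sum() == na_counts).all())

# The same numbers can be derived from the non-null counts.
from_counts = len(df_small) - df_small.count()

print(na_counts)
```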

You can also use .isna() on a single column as a boolean mask to select the rows that contain NA values in Type 2.

>>> null_type2 = df[df["Type 2"].isna()]
>>> print(null_type2)
              #  Type 1 Type 2  ...  Speed  Generation  Legendary
Name                            ...
Charmander    4    Fire    NaN  ...     65           1      False
Charmeleon    5    Fire    NaN  ...     80           1      False
Squirtle      7   Water    NaN  ...     43           1      False
Wartortle     8   Water    NaN  ...     58           1      False
Blastoise     9   Water    NaN  ...     78           1      False
...         ...     ...    ...  ...    ...         ...        ...
Sliggoo     705  Dragon    NaN  ...     60           6      False
Goodra      706  Dragon    NaN  ...     80           6      False
Bergmite    712     Ice    NaN  ...     28           6      False
Avalugg     713     Ice    NaN  ...     28           6      False
Xerneas     716   Fairy    NaN  ...     99           6       True

[386 rows x 12 columns]
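The selection above is ordinary boolean indexing, so the mask can also be inverted to get the complementary rows. The sketch below shows this on a hypothetical four-row table (df_small is an assumed name, not the course dataset), using ~ and the .notna() method.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row table standing in for the real DataFrame.
df_small = pd.DataFrame(
    {
        "Type 1": ["Grass", "Fire", "Water", "Bug"],
        "Type 2": ["Poison", np.nan, np.nan, "Flying"],
    },
    index=pd.Index(
        ["Bulbasaur", "Charmander", "Squirtle", "Butterfree"], name="Name"
    ),
)

# True for rows whose "Type 2" is missing.
mask = df_small["Type 2"].isna()

# Rows with a missing "Type 2".
single_type = df_small[mask]

# Rows with a "Type 2" present: invert the mask with ~ ...
dual_type = df_small[~mask]

# ... or ask for the non-missing values directly with .notna().
dual_type_alt = df_small[df_small["Type 2"].notna()]

print(single_type)
print(dual_type)
```

The two subsets never overlap and together cover every row, which mirrors the 386 + 414 = 800 split in the full dataset.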