This is an archived version of the course. Please find the latest version of the course on the main webpage.

Chapter 4: Accessing DataFrame rows and columns

Accessing columns

Josiah Wang

You have already seen the column names of a DataFrame through its .columns attribute. The .columns attribute gives you an Index instance holding the column labels.

>>> print(df.columns)
Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
       'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary'],
      dtype='object')
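Since an Index behaves much like an immutable sequence, you can turn it into a list, index into it, or check whether a column exists. Here is a minimal sketch of that; the tiny DataFrame below is a hypothetical stand-in for the course dataset, and its values are for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the course's DataFrame (illustrative values only).
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "HP": [45, 39],
})

column_names = df.columns.tolist()  # plain Python list of the column labels
has_hp = "HP" in df.columns         # membership test against the Index
first_column = df.columns[0]        # an Index supports positional indexing

print(column_names)
print(has_hp)
print(first_column)
```

Checking membership first is a handy way to avoid a KeyError when you are not sure a column is present.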

If you do not like any of the column names, just rename them! Let’s say that you think “Sp. Atk” and “Sp. Def” are too cryptic and want to rename these.

>>> df.rename(columns={"Sp. Atk": "Special Attack", 
                       "Sp. Def": "Special Defense"},
              inplace=True)
>>> print(df.columns)
Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
       'Special Attack', 'Special Defense', 'Speed', 'Generation',
       'Legendary'],
      dtype='object')
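The call above uses inplace=True, which modifies df directly. If you would rather keep the original untouched, you can leave inplace out and rename() returns a new DataFrame instead. A minimal sketch, assuming a small hypothetical DataFrame with illustrative values:

```python
import pandas as pd

# Hypothetical one-row DataFrame (illustrative values only).
df = pd.DataFrame({"Name": ["Bulbasaur"], "Sp. Atk": [65], "Sp. Def": [65]})

# Without inplace=True, rename() returns a renamed copy and leaves df alone.
renamed = df.rename(columns={"Sp. Atk": "Special Attack",
                             "Sp. Def": "Special Defense"})

print(list(df.columns))       # original labels are unchanged
print(list(renamed.columns))  # new labels live on the returned DataFrame
```

Which style you pick is mostly a matter of taste; assigning the result to a new name makes it easier to compare the before and after.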

Accessing individual DataFrame columns

You can access a single column by indexing the DataFrame with the column name in square brackets. This will return a Series object (if you remember, a Series is how pandas represents a single column!)

>>> name_column = df["Name"]
>>> print(type(name_column))
<class 'pandas.core.series.Series'>
>>> name_column.head()
0                Bulbasaur
1                  Ivysaur
2                 Venusaur
3    VenusaurMega Venusaur
4               Charmander
Name: Name, dtype: object
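The Series you get back remembers which column it came from. The sketch below, using a small hypothetical DataFrame with illustrative values, shows that and also shows attribute-style access (df.Name), which only works when the column name is a valid Python identifier. A name like "Type 1" contains a space, so it needs the square brackets.

```python
import pandas as pd

# Hypothetical stand-in for the course's DataFrame (illustrative values only).
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
})

name_column = df["Name"]    # bracket access works for any column name
type_column = df["Type 1"]  # the space in the name rules out df.Type 1

print(name_column.name)      # the Series keeps the column label as its name
print(name_column.tolist())  # the values as a plain Python list

# Attribute access gives the same Series when the name is a valid identifier.
same_column = df.Name
print(same_column.equals(name_column))
```

Bracket access is the safer habit, since it works for every column name and cannot clash with DataFrame methods.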

Accessing multiple DataFrame columns

You can also access one or more columns as a sub-DataFrame by passing a list of column names.

>>> columns_df = df[["Name", "Type 1", "Type 2"]]
>>> print(type(columns_df))
<class 'pandas.core.frame.DataFrame'>
>>> columns_df.head()
                    Name Type 1  Type 2
0              Bulbasaur  Grass  Poison
1                Ivysaur  Grass  Poison
2               Venusaur  Grass  Poison
3  VenusaurMega Venusaur  Grass  Poison
4             Charmander   Fire     NaN
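Note the double square brackets: the outer pair indexes the DataFrame, and the inner pair is the list of column names. A list with a single name still gives you a DataFrame rather than a Series, and the columns come back in the order you list them. A minimal sketch, assuming a small hypothetical DataFrame with illustrative values:

```python
import pandas as pd

# Hypothetical stand-in for the course's DataFrame (illustrative values only).
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "Type 2": ["Poison", None],
})

as_series = df["Name"]              # a string key gives a Series
as_frame = df[["Name"]]             # a one-item list gives a DataFrame
reordered = df[["Type 1", "Name"]]  # columns follow the order of the list

print(type(as_series).__name__, type(as_frame).__name__)
print(as_frame.shape)
print(list(reordered.columns))
```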