This is an archived version of the course and is no longer updated. Please find the latest version of the course on the main webpage.

More DataFrame methods

.append()

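You can use .append(other) to return a new object with the rows from other appended to the end. other can be a DataFrame, a Series, a dict-like object, or a list of these. Note that DataFrame.append() was deprecated in pandas 1.4.0 and removed in pandas 2.0.0; on current versions, use pd.concat() instead.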

See the official documentation for more details and examples.

temp_df = df.append(df)
print(temp_df.shape)  ## (2000, 11)
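
A minimal sketch of the same row-stacking using pd.concat(), which works on both older and current pandas versions. The small DataFrame below (its titles, columns, and values) is a hypothetical stand-in, not the course dataset:

```python
import pandas as pd

# Hypothetical stand-in for the course's movie DataFrame
small_df = pd.DataFrame(
    {"Rank": [1, 2], "Rating": [8.0, 7.0]},
    index=pd.Index(["Movie A", "Movie B"], name="Title"),
)

# Stack the rows of small_df on top of themselves;
# pd.concat takes a list of objects and returns a new DataFrame
stacked = pd.concat([small_df, small_df])
print(stacked.shape)  ## (4, 2)
```

As with .append(), the original DataFrame is left unchanged and the index labels are repeated rather than renumbered.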

.drop_duplicates()

This method returns a new DataFrame with duplicate rows removed. By default, the first occurrence of each duplicated row is kept.

See the official documentation to see the options available.

temp_df = temp_df.drop_duplicates()
print(temp_df.shape)  ## (1000, 11)

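temp_df = df.append(df)
# inplace=True modifies temp_df directly and returns None,
# so the result is not assigned back to a variable
temp_df.drop_duplicates(inplace=True)
print(temp_df.shape)  ## (1000, 11)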
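
Two of the available options are subset, which restricts the comparison to certain columns, and keep, which controls which of the duplicated rows survives. A short sketch on hypothetical data (the column values below are made up for illustration):

```python
import pandas as pd

# Hypothetical data: the first two rows share the same Director
movies = pd.DataFrame(
    {
        "Title": ["A", "B", "C"],
        "Director": ["X", "X", "Y"],
        "Rating": [8.0, 7.0, 6.5],
    }
)

# Compare only the Director column; keep the first row per director (default)
first_per_director = movies.drop_duplicates(subset=["Director"])
print(first_per_director["Title"].tolist())  ## ['A', 'C']

# keep="last" keeps the last occurrence instead
last_per_director = movies.drop_duplicates(subset=["Director"], keep="last")
print(last_per_director["Title"].tolist())  ## ['B', 'C']

# keep=False drops every row that has a duplicate
unique_only = movies.drop_duplicates(subset=["Director"], keep=False)
print(unique_only["Title"].tolist())  ## ['C']
```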

.apply(func, axis=0)

This method applies a function along an axis of a DataFrame, or to each value of a Series.

You can specify the axis along which to apply the function: 0 (the default) applies it to each column, and 1 applies it to each row.

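This is usually more concise, and often faster, than iterating over the DataFrame or Series with an explicit Python loop, although built-in vectorized operations are faster still when they are available.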

It is useful, for example, when you want to create a new column whose values are computed from an existing one.

def rotten_tomatoes_style(rating):
    if rating >= 8.0:
        return "fresh"
    else:
        return "rotten"

df["RT"] = df["Rating"].apply(rotten_tomatoes_style)
print(df.head(2))
##                          Rank  ...      RT
## Title                          ...
## Guardians of the Galaxy     1  ...   fresh
## Prometheus                  2  ...  rotten
##
## [2 rows x 12 columns]
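
The example above calls .apply() on a single column. To illustrate the axis argument on a whole DataFrame, here is a rough sketch using hypothetical data; the titles, the values, and the combined_score helper are made up for illustration and are not part of the course dataset:

```python
import pandas as pd

# Hypothetical stand-in data
scores = pd.DataFrame(
    {"Rating": [8.0, 7.0], "Metascore": [76.0, 65.0]},
    index=pd.Index(["Movie A", "Movie B"], name="Title"),
)

# axis=0 (default): the function receives each column as a Series
col_max = scores.apply(lambda col: col.max(), axis=0)
print(col_max["Rating"], col_max["Metascore"])  ## 8.0 76.0

# axis=1: the function receives each row as a Series
def combined_score(row):
    # Scale Rating by 10 so both values are on the same range, then average
    return (row["Rating"] * 10 + row["Metascore"]) / 2

scores["Combined"] = scores.apply(combined_score, axis=1)
print(scores["Combined"].tolist())  ## [78.0, 67.5]
```

With axis=1 the function can read several columns of the same row at once, which a column-wise .apply() on a single Series cannot do.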