Area Plot

%%capture
!pip3 install xlrd openpyxl
import matplotlib.pyplot as plt
import pandas as pd
df_can = pd.read_excel(
    './data/ibm/canada.xlsx',
    sheet_name='Canada by Citizenship',
    skiprows=range(20),
    skipfooter=2
)
df_can.columns = list(map(str, df_can.columns))
drops = [
    'AREA', 
    'REG', 
    'DEV', 
    'Type', 
    'Coverage'
]
df_can.drop(columns=drops, inplace=True)
columns = {
    'OdName': 'Country',
    'AreaName': 'Continent',
    'RegName': 'Region'
}

df_can.rename(columns=columns, inplace=True)
df_can.set_index('Country', inplace=True)
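# Sum only the numeric (year) columns; text columns such as Continent and Region are skipped.
df_can['Total'] = df_can.sum(axis=1, numeric_only=True)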
years = list(map(str, range(1980, 2014)))
df_can.sort_values('Total', ascending=False, axis=0, inplace=True)

df_top5 = df_can.head()
df_top5 = df_top5[years].transpose() 

df_top5.head()
Country  India  China  United Kingdom of Great Britain and Northern Ireland  Philippines  Pakistan
1980      8880   5123                                                 22045         6051       978
1981      8670   6682                                                 24796         5921       972
1982      8147   3308                                                 20620         5249      1201
1983      7338   1863                                                 10015         4562       900
1984      5704   1527                                                 10170         3801       668
df_top5.index = df_top5.index.map(int)
df_top5.plot(
    kind='area', 
    stacked=False,
    figsize=(20, 10)
)

plt.title('Immigration Trend of Top 5 Countries')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')

plt.show()

An unstacked area plot has a default transparency (alpha value) of 0.5. We can change this value by passing the alpha parameter.

df_top5.plot(
    kind='area', 
    alpha=0.25,
    stacked=False,
    figsize=(20, 10),
)

plt.title('Immigration Trend of Top 5 Countries')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')

plt.show()

Two types of plotting

There are two styles of plotting with Matplotlib: plotting using the Artist layer and plotting using the scripting layer.

Option 1: Scripting layer (procedural method) - using matplotlib.pyplot as plt

With the scripting layer, you call functions on plt (that is, matplotlib.pyplot) one after another to add elements to the current plot; for example, plt.title(...) adds a title and plt.xlabel(...) adds a label to the x-axis.

# option 1: this is what we have been using so far
df_top5.plot(kind='area', alpha=0.35, figsize=(20, 10))

plt.title('Immigration trend of top 5 countries')
plt.ylabel('Number of immigrants')
plt.xlabel('Years')
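
The plt calls above work because pyplot keeps track of a "current" Axes and applies each call to it. The following is a minimal sketch of that behaviour, assuming a small made-up DataFrame (the name toy and its values are illustrative only, not part of the immigration dataset) and the non-interactive Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Toy numbers for illustration only; not taken from the immigration dataset.
toy = pd.DataFrame({"A": [1, 3, 2], "B": [2, 1, 4]}, index=[2000, 2001, 2002])

# pandas creates a new figure and returns the Axes it drew on.
ax = toy.plot(kind="area", stacked=False)

# pyplot functions act on the current Axes, which is the one pandas just created.
plt.title("Toy area plot")
plt.xlabel("Year")

same_axes = plt.gca() is ax      # plt.gca() returns the current Axes
title_text = ax.get_title()      # the title set through plt is stored on that Axes
xlabel_text = ax.get_xlabel()
print(same_axes, title_text, xlabel_text)
```

In other words, the scripting layer is a convenience wrapper: each plt call is forwarded to the Axes object that is current at that moment.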

Option 2: Artist layer (object-oriented method) - using an Axes instance from Matplotlib (preferred)

Alternatively, you can store the Axes instance of your current plot in a variable (e.g. ax) and add elements by calling its methods. The method names differ slightly from the plt functions: most gain a set_ prefix. For example, use ax.set_title() instead of plt.title() to add a title, or ax.set_xlabel() instead of plt.xlabel() to label the x-axis.

This option is often more explicit and flexible, which helps with advanced plots (for example, figures with several subplots).

# option 2: preferred option with more flexibility
ax = df_top5.plot(kind='area', alpha=0.35, figsize=(20, 10))

ax.set_title('Immigration Trend of Top 5 Countries')
ax.set_ylabel('Number of Immigrants')
ax.set_xlabel('Years')
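
One place where the Artist layer's flexibility shows is a figure with more than one Axes, where each plot has to be addressed explicitly. The sketch below is one possible illustration, again assuming a made-up DataFrame named toy and the Agg backend; it draws a stacked and an unstacked area plot side by side by passing each Axes to pandas through the ax parameter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Toy numbers for illustration only; not taken from the immigration dataset.
toy = pd.DataFrame({"A": [1, 3, 2], "B": [2, 1, 4]}, index=[2000, 2001, 2002])

# One figure with two Axes side by side, sharing the y-axis.
fig, (ax_stacked, ax_overlaid) = plt.subplots(1, 2, figsize=(12, 4), sharey=True)

# Tell pandas which Axes to draw on, then label each Axes directly.
toy.plot(kind="area", stacked=True, ax=ax_stacked)
ax_stacked.set_title("Stacked")

toy.plot(kind="area", stacked=False, alpha=0.25, ax=ax_overlaid)
ax_overlaid.set_title("Unstacked, alpha=0.25")

for ax in (ax_stacked, ax_overlaid):
    ax.set_xlabel("Year")
ax_stacked.set_ylabel("Value")

fig.tight_layout()
```

Because every call names the Axes it modifies, there is no ambiguity about which subplot receives a title or label, which is harder to guarantee when relying on pyplot's current Axes.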