Monday, 15 April 2019

Data Visualization using Distribution Plot (Seaborn Library)

Let's visualize our data with the Distribution Plot (distplot) from the Seaborn library. By default, a Distribution Plot draws a histogram with a KDE (Kernel Density Estimate) curve on top of it. We can specify the number of histogram bins as required. Note that a Distribution Plot is univariate: it shows the distribution of a single variable.
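To make the two ingredients concrete, here is a minimal NumPy-only sketch of what a density histogram and a Gaussian KDE compute. It is not Seaborn's implementation: the helper name gaussian_kde, the seed, and the rule-of-thumb bandwidth are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
sample = rng.standard_normal(150)

# Histogram normalized so that the bar areas sum to 1 (a density, not raw counts).
density, edges = np.histogram(sample, bins=20, density=True)
bar_area = float(np.sum(density * np.diff(edges)))


def gaussian_kde(data, grid):
    """Hypothetical helper: average of Gaussian bumps centered on each data point."""
    n = data.size
    # Rule-of-thumb bandwidth for 1-D data (an assumption, not Seaborn's exact choice).
    bandwidth = data.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - data[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


grid = np.linspace(sample.min() - 3, sample.max() + 3, 400)
kde = gaussian_kde(sample, grid)

# Riemann-sum approximation of the area under the KDE curve.
kde_area = float(np.sum(kde) * (grid[1] - grid[0]))

print(bar_area, kde_area)  # both should be close to 1
```

Both the histogram bars and the smooth curve are estimates of the same underlying density, which is why they can share one axis in the plot.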

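We can pass various parameters to distplot, such as bins, hist, kde, rug, vertical and color. Note that seaborn 0.11 and later deprecate distplot in favour of histplot, kdeplot and displot, so the calls in this post may warn or fail on newer versions.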

Let's explore the Distribution Plot by generating 150 random numbers.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Generate 150 random numbers

num = np.random.randn(150)
num
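Because np.random.randn returns different numbers on every run, your plots will not match anyone else's exactly. If you want reproducible figures, one option is NumPy's Generator API with a fixed seed. This is a sketch; the seed value 42 is an arbitrary choice and not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(42)   # 42 is an arbitrary seed picked for this sketch
num = rng.standard_normal(150)    # 150 draws from N(0, 1), like np.random.randn(150)

print(num.shape)                  # (150,)
print(num.mean(), num.std())      # sample mean and standard deviation
```

The mean should be near 0 and the standard deviation near 1, but with only 150 points neither will be exact.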

Step 3: Explore data using Distribution Plot

sns.distplot(num)

Specify number of bins

sns.distplot(num, bins=20)

Remove histogram from distribution plot

sns.distplot(num, hist=False)

Remove KDE from distribution plot

sns.distplot(num, kde=False)

Add rug parameter to distribution plot

sns.distplot(num, hist=False, rug=True)

Add label to distribution plot

label_dist = pd.Series(num, name="variable x")
sns.distplot(label_dist)

Change orientation of distribution plot

sns.distplot(label_dist, vertical=True)

Add cosmetic parameter: color

sns.distplot(label_dist, color='red')

You can download my Jupyter notebook from here. I also recommend trying the above code with the Tips and Iris datasets.
