Monday, 15 April 2019

Data Visualization using Distribution Plot (Seaborn Library)

Let's visualize our data with the distribution plot (distplot) from the Seaborn library. By default, a distribution plot draws a histogram with a KDE (kernel density estimate) curve overlaid on it. We can set the number of histogram bins as required. Note that a distribution plot is univariate: it shows the distribution of a single variable.
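To make the two layers concrete, here is a minimal NumPy-only sketch of what a histogram and a Gaussian KDE compute. It is not Seaborn's implementation: the bandwidth uses a normal-reference rule of thumb chosen for illustration, and the names sample, bandwidth and grid are my own.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
sample = rng.standard_normal(150)

# Histogram normalised to a density: the bar areas sum to 1.
heights, edges = np.histogram(sample, bins=20, density=True)
hist_area = np.sum(heights * np.diff(edges))

# Gaussian KDE: the average of one normal "bump" centred on each observation.
# Bandwidth from a normal-reference rule of thumb (an assumption for this sketch).
bandwidth = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)
grid = np.linspace(sample.min() - 3 * bandwidth, sample.max() + 3 * bandwidth, 400)
z = (grid[:, None] - sample[None, :]) / bandwidth
kde = np.exp(-0.5 * z ** 2).sum(axis=1) / (sample.size * bandwidth * np.sqrt(2 * np.pi))

# Trapezoid rule: the KDE curve also encloses an area close to 1.
kde_area = np.sum((kde[1:] + kde[:-1]) / 2 * np.diff(grid))

print(hist_area, kde_area)
```

Both layers are density estimates of the same variable, which is why they can share one axis: the histogram is piecewise constant, the KDE is a smooth curve.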

We can pass various parameters to distplot, such as bins, hist, kde, rug, vertical and color.

Let's explore the distribution plot by generating 150 random numbers.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Generate 150 random numbers

# 150 draws from the standard normal distribution (mean 0, standard deviation 1)
num = np.random.randn(150)
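Because the numbers are random, every run of the notebook produces slightly different plots. If you want the same 150 numbers each time, one option is NumPy's seeded Generator. This is an optional variation on the step above, and the seed value 42 is arbitrary:

```python
import numpy as np

# A seeded generator returns the same sequence on every run.
rng = np.random.default_rng(42)  # 42 is an arbitrary seed chosen for illustration
num = rng.standard_normal(150)   # same distribution as np.random.randn(150)

print(num.shape)
```

The rest of the steps work unchanged with this num.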

Step 3: Explore data using Distribution Plot


Specify number of bins

sns.distplot(num, bins=20)

Remove histogram from distribution plot

sns.distplot(num, hist=False)

Remove KDE from distribution plot

sns.distplot(num, kde=False)

Add a rug (one small tick per observation) to distribution plot

sns.distplot(num, hist=False, rug=True)

Add label to distribution plot

# The Series name is used as the axis label
label_dist = pd.Series(num, name="variable x")
sns.distplot(label_dist)

Change orientation of distribution plot

sns.distplot(label_dist, vertical=True)

Add cosmetic parameter: color

sns.distplot(label_dist, color='red')

You can download my Jupyter notebook from here. I also recommend trying the above code with the Tips and Iris datasets.
