Monday, 15 April 2019

Data Visualization using Box Plot (Seaborn Library)

Let's visualize our data with Box Plot, which is available in the Seaborn library. Box Plots are very useful for finding outliers in a variable. We can also combine a Box Plot with a Swarm Plot.
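To make the outlier idea concrete, here is a minimal sketch of the usual 1.5 x IQR rule, which matches the default whisker length used by matplotlib and Seaborn box plots. The function name iqr_outliers and the bill amounts are hypothetical and not taken from the Tips dataset. The exact quartile values can differ slightly depending on the interpolation method used.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    # 'inclusive' interpolates linearly between the observed data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Values outside the fences are the points a box plot draws individually
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

# Hypothetical bill amounts, for illustration only
bills = [10.0, 12.5, 13.0, 14.0, 15.5, 16.0, 18.0, 50.0]
lower, upper, outliers = iqr_outliers(bills)
print(lower, upper, outliers)
```

Any value outside the two fences is plotted as a separate marker beyond the whiskers.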

We can pass various parameters to boxplot, such as hue, order, orient, palette and color.

Let's explore Box Plot using the Tips dataset.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Load Tips dataset

tips = sns.load_dataset('tips')

Step 3: Explore data using Box Plot

A Box Plot can be univariate or bivariate. Let's analyze it first with one variable and then with two variables.

Visualizing one variable using Box Plot



sns.boxplot(x='total_bill', data=tips)

Visualizing two variables using Box Plot

sns.boxplot(x='sex', y='total_bill', data=tips)

sns.boxplot(x='day', y='total_bill', data=tips)

Add hue parameter

sns.boxplot(x='day', y='total_bill', data=tips, hue='sex')

sns.boxplot(x='day', y='total_bill', data=tips, hue='sex', palette='husl')

sns.boxplot(x='day', y='total_bill', data=tips, hue='smoker', palette='coolwarm')

sns.boxplot(x='day', y='total_bill', data=tips, hue='time', palette='coolwarm') 

Note: If you run the above line, you will find that no "Lunch" box is drawn for "Sat" and "Sun", because the dataset has no "Lunch" records for those two days.
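One way to check which day and time combinations actually have data is a cross-tabulation. The sketch below uses a tiny hypothetical stand-in for the Tips dataset (same column names, made-up rows) so that it runs on its own. The names sample, counts and no_lunch_days are illustrative.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Tips dataset (same column names)
sample = pd.DataFrame({
    "day":  ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun"],
    "time": ["Lunch", "Lunch", "Lunch", "Dinner", "Dinner", "Dinner", "Dinner"],
})

# Count the rows for every (day, time) combination
counts = pd.crosstab(sample["day"], sample["time"])
print(counts)

# Days with zero "Lunch" rows get no "Lunch" box in the plot
no_lunch_days = counts.index[counts["Lunch"] == 0].tolist()
print(no_lunch_days)
```

With the real data, the same check would be pd.crosstab(tips['day'], tips['time']).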

Add order parameter

sns.boxplot(x='day', y='total_bill', data=tips, order=['Sat', 'Sun', 'Thur', 'Fri'])

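Change orientation of box plot

When only data is passed, with no x or y, boxplot draws one box for each numeric column (total_bill, tip and size).
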
sns.boxplot(data=tips, orient='horizontal')
sns.boxplot(data=tips, orient='h')

sns.boxplot(data=tips, orient='vertical')
sns.boxplot(data=tips, orient='v')

Combining Box Plot and Swarm Plot

sns.boxplot(x='day', y='total_bill', data=tips, palette='husl')
sns.swarmplot(x='day', y='total_bill', data=tips, color='black')

sns.boxplot(x='day', y='total_bill', data=tips, palette='husl')
sns.swarmplot(x='day', y='total_bill', data=tips, color='0.35')

You can download my Jupyter notebook from here. I also recommend trying the above code with the Iris dataset.


