Friday, 19 April 2019

Data Visualization using Pair Grid and Pair Plot (Seaborn Library)

Let's visualize our data with Pair Grid and Pair Plot, both of which are part of the Seaborn library. With Pair Grid we can draw various plots (such as scatter plots, histograms and KDE plots) in the grid cells. By default, Pair Plot shows histograms on the diagonal and scatter plots in the rest of the grid cells.

We can pass various parameters to PairGrid, such as hue, hue_kws, vars, x_vars, y_vars and palette. Markers (diamond, plus sign, circle, square etc.) are set through hue_kws.

We can pass various parameters to pairplot, such as kind, diag_kind, hue, vars, x_vars, y_vars and height.

Let's explore Pair Grid and Pair Plot using the Iris dataset.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Load the Iris dataset

iris = sns.load_dataset('iris')
iris.head()

Step 3: Explore data using Pair Grid

Draw scatter plots on all grid cells

x = sns.PairGrid(iris)
x = x.map(plt.scatter)

Draw histograms on diagonals and scatter plots on rest of the grid cells

x = sns.PairGrid(iris)
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)

Draw histograms on the diagonal, scatter plots in the upper triangle and KDE plots in the lower triangle

x = sns.PairGrid(iris)
x = x.map_diag(plt.hist) 
x = x.map_upper(plt.scatter)  
x = x.map_lower(sns.kdeplot)
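The map methods are not limited to ready-made plotting functions: they call whatever function we pass once per cell with that cell's x and y data. The sketch below illustrates this with a hypothetical helper, `annotate_corr`, which writes the Pearson correlation into each upper-triangle cell. The helper and the synthetic `toy` data are assumptions made for this example, not part of Seaborn or of the original notebook.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data: "b" depends on "a", "c" is independent noise.
rng = np.random.default_rng(1)
a = rng.normal(size=80)
toy = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.5, size=80),
    "c": rng.normal(size=80),
})

def annotate_corr(x, y, **kwargs):
    """Write the Pearson correlation of x and y in the middle of the current axes."""
    r = np.corrcoef(x, y)[0, 1]
    # Seaborn makes the target cell the current axes before calling a non-seaborn function.
    plt.gca().annotate(f"r = {r:.2f}", xy=(0.5, 0.5),
                       xycoords="axes fraction", ha="center", va="center")

g = sns.PairGrid(toy)
g.map_diag(plt.hist)
g.map_lower(plt.scatter)
g.map_upper(annotate_corr)  # called once per upper-triangle cell

# Count the text annotations that ended up on the grid.
n_annotations = sum(len(ax.texts) for ax in g.axes.flat)
plt.close("all")
```

With three variables there are three cells above the diagonal, so the helper runs three times. The `**kwargs` catch-all absorbs extras such as `color` and `label` that Seaborn may pass along.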

Add hue and legend

x = sns.PairGrid(iris, hue='species')
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)

x = sns.PairGrid(iris, hue='species')
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)
x = x.add_legend()

x = sns.PairGrid(iris, hue='species')
x = x.map_diag(plt.hist)
x = x.map_upper(plt.scatter)
x = x.map_lower(sns.kdeplot)
x = x.add_legend()

x = sns.PairGrid(iris, hue='species', palette='Blues_d')
x = x.map_diag(plt.hist, histtype='step', linewidth=2, edgecolor='black')
x = x.map_offdiag(plt.scatter, edgecolor='black')
x = x.add_legend()

x = sns.PairGrid(iris, hue='species', hue_kws={'marker' : ['D', 's', '+']})
x = x.map(plt.scatter, s=30, edgecolor='black')
x = x.add_legend()

Add specific variables

x = sns.PairGrid(iris, vars=['petal_length', 'petal_width'])
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)

x = sns.PairGrid(iris, x_vars=['petal_length', 'petal_width'], y_vars=['sepal_length', 'sepal_width'])
x = x.map(plt.scatter)
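When x_vars and y_vars differ, the grid no longer has to be square: it gets one row per y variable and one column per x variable. A minimal sketch, assuming a synthetic frame with made-up column names (`p`, `q`, `r`, `s`):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data with four numeric columns.
rng = np.random.default_rng(2)
toy = pd.DataFrame(rng.normal(size=(50, 4)), columns=["p", "q", "r", "s"])

x_cols = ["p", "q", "r"]  # one grid column per entry
y_cols = ["s"]            # one grid row per entry

g = sns.PairGrid(toy, x_vars=x_cols, y_vars=y_cols)
g.map(plt.scatter)

grid_shape = g.axes.shape  # (rows, columns) = (len(y_cols), len(x_cols))
bottom_labels = [ax.get_xlabel() for ax in g.axes[-1, :]]
plt.close("all")
```

Because the two lists share no variable here, the grid has no diagonal cells, which is why a plain map call is the natural choice.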

Step 4: Explore data using Pair Plot

sns.pairplot(iris)

Add regression line to scatter plot

sns.pairplot(iris, kind='reg')

Change the diagonal to KDE (by default it is a histogram)

sns.pairplot(iris, diag_kind='kde')

Add hue parameter

sns.pairplot(iris, hue='species')

sns.pairplot(iris, hue='species', kind='reg')

sns.pairplot(iris, hue='species', kind='reg', diag_kind='kde')

sns.pairplot(iris, hue='species', kind='reg', diag_kind='hist')

Add specific variables

sns.pairplot(iris, vars=['petal_length', 'petal_width'], height=4)

sns.pairplot(iris, x_vars=['petal_length', 'petal_width'], y_vars=['sepal_length', 'sepal_width'])

You can download my Jupyter notebook from here. I also recommend trying the above code with the Tips dataset.
