Thursday, 18 April 2019

Data Visualization using Regression Plot (Seaborn Library)

Let's visualize our data with a regression plot (regplot) from the Seaborn library. By default, regplot draws a scatter plot of the data points and fits a linear regression line (the best-fit line) through them, with a shaded confidence band around the line.

We can pass various parameters to regplot, such as the confidence interval (ci), an estimator (x_estimator, e.g. mean or median), jitter (x_jitter, y_jitter), color, and marker (diamond, plus sign, circle, square). We can also style the scatter points and the regression line separately by passing dictionaries of Matplotlib keyword arguments (for example, linewidth) through the scatter_kws and line_kws parameters.

Let's explore the regression plot using the Tips dataset.

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Load the Tips dataset

tips=sns.load_dataset('tips')
tips.head()

Step 3: Explore data using Regression Plot

sns.regplot(x='total_bill', y='tip', data=tips)
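
To see what the best-fit line actually is, here is a minimal sketch that fits a straight line by ordinary least squares with np.polyfit and compares its slope with the line regplot draws. The demo DataFrame is a synthetic stand-in for tips (an assumption, so the snippet runs without downloading anything); swap in tips to try it on the real data.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the tips data (assumption: same column names).
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.8, size=200)
demo = pd.DataFrame({"total_bill": total_bill, "tip": tip})

# Fit a degree-1 polynomial (ordinary least squares) by hand.
slope, intercept = np.polyfit(demo["total_bill"], demo["tip"], deg=1)

# Draw the regression plot and read the fitted line back from the Axes.
# ci=None skips the bootstrap band, so the only line on the Axes is the fit.
ax = sns.regplot(x="total_bill", y="tip", data=demo, ci=None)
line_x, line_y = ax.lines[0].get_data()
plotted_slope = (line_y[-1] - line_y[0]) / (line_x[-1] - line_x[0])

print(f"polyfit slope: {slope:.4f}, regplot slope: {plotted_slope:.4f}")
```

The two slopes should agree to within floating-point error, since the default regplot fit is a plain linear least-squares regression.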

Specify the confidence interval (ci is a percentage; 95 is the default, and ci=None turns the band off)

sns.regplot(x='total_bill', y='tip', data=tips, ci=95)
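
The band is estimated by bootstrapping, so a narrower interval or no band at all can be useful on larger datasets. The sketch below compares a 68% band with ci=None side by side. The demo DataFrame and the has_band helper are illustrative assumptions, not part of Seaborn.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.collections import PolyCollection

# Synthetic stand-in for the tips data (assumption: same column names).
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.8, size=200)
demo = pd.DataFrame({"total_bill": total_bill, "tip": tip})

fig, (ax_band, ax_plain) = plt.subplots(1, 2, figsize=(10, 4))

# Left: a 68% confidence band computed from 500 bootstrap resamples.
sns.regplot(x="total_bill", y="tip", data=demo, ci=68, n_boot=500, ax=ax_band)

# Right: no confidence band at all.
sns.regplot(x="total_bill", y="tip", data=demo, ci=None, ax=ax_plain)


def has_band(ax):
    """Hypothetical helper: True if the Axes holds a filled band."""
    return any(isinstance(c, PolyCollection) for c in ax.collections)


print(has_band(ax_band), has_band(ax_plain))
```

Lowering n_boot makes the plot faster to draw at the cost of a noisier band.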

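Specify an estimator such as the mean or median (x_estimator is applied to the y values at each unique x value, so it is most useful when x is discrete or binned with x_bins)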

sns.regplot(x='total_bill', y='tip', data=tips, x_estimator=np.mean)
sns.regplot(x='total_bill', y='tip', data=tips, x_estimator=np.median)
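
Because total_bill is continuous, almost every x value is unique and the estimator has little to summarize. A sketch of two alternatives follows: a discrete x column (size) with x_estimator, and a continuous x column grouped with x_bins. The demo DataFrame is a synthetic stand-in for tips (an assumption), built only so the snippet runs offline.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in for the tips data (assumption: same column names).
rng = np.random.default_rng(1)
size = rng.integers(1, 7, size=200)  # party sizes 1..6
total_bill = 5 + 8.0 * size + rng.normal(0, 4, size=200)
tip = 0.15 * total_bill + rng.normal(0, 0.5, size=200)
demo = pd.DataFrame({"size": size, "total_bill": total_bill, "tip": tip})

fig, (ax_est, ax_bin) = plt.subplots(1, 2, figsize=(10, 4))

# Discrete x: one point (the mean tip) per unique party size.
sns.regplot(x="size", y="tip", data=demo, x_estimator=np.mean, ax=ax_est)

# Continuous x: group total_bill into 5 bins and plot the mean tip per bin.
sns.regplot(x="total_bill", y="tip", data=demo, x_bins=5, ax=ax_bin)

# The first collection on each Axes is the scatter layer.
n_estimated = len(ax_est.collections[0].get_offsets())
n_binned = len(ax_bin.collections[0].get_offsets())
print(n_estimated, n_binned)
```

In both cases only the scatter layer is summarized; the regression line is still fitted to the original observations.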

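Specify the jitter parameter (x_jitter adds uniform random noise to the x values of the scatter points only; the regression fit is unaffected)

sns.regplot(x='size', y='total_bill', data=tips)  # no jitter: points stack in vertical columns
sns.regplot(x='size', y='total_bill', data=tips, x_jitter=True)  # True is treated as 1, i.e. noise of up to ±1
sns.regplot(x='size', y='total_bill', data=tips, x_jitter=0.3)  # noise of up to ±0.3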
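
To check that jitter moves only the plotted points and not the fit, here is a small sketch that compares the fitted slope and the scatter positions with and without x_jitter. The demo DataFrame and the fitted_slope helper are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in for the tips data (assumption: same column names).
rng = np.random.default_rng(2)
size = rng.integers(1, 7, size=150)  # party sizes 1..6
total_bill = 5 + 8.0 * size + rng.normal(0, 4, size=150)
demo = pd.DataFrame({"size": size, "total_bill": total_bill})

fig, (ax_raw, ax_jit) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
sns.regplot(x="size", y="total_bill", data=demo, ci=None, ax=ax_raw)
sns.regplot(x="size", y="total_bill", data=demo, ci=None,
            x_jitter=0.3, ax=ax_jit)


def fitted_slope(ax):
    """Hypothetical helper: slope of the regression line drawn on ax."""
    x, y = ax.lines[0].get_data()
    return (y[-1] - y[0]) / (x[-1] - x[0])


# x positions of the scatter points, without and with jitter.
raw_x = np.asarray(ax_raw.collections[0].get_offsets())[:, 0]
jit_x = np.asarray(ax_jit.collections[0].get_offsets())[:, 0]
max_shift = np.abs(jit_x - raw_x).max()

print(fitted_slope(ax_raw), fitted_slope(ax_jit), max_shift)
```

The two slopes should match, while the points shift horizontally by no more than the jitter amount.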

Cosmetic parameters such as color, marker, and line width

sns.regplot(x='total_bill', y='tip', data=tips, color='purple')

sns.regplot(x='total_bill', y='tip', data=tips, marker='D')  # diamond
sns.regplot(x='total_bill', y='tip', data=tips, marker='+')  # plus sign
sns.regplot(x='total_bill', y='tip', data=tips, marker='o')  # circle
sns.regplot(x='total_bill', y='tip', data=tips, marker='s')  # square

sns.regplot(x='total_bill', y='tip', data=tips, marker='D',
            scatter_kws={'color': 'blue'},
            line_kws={'color': 'red', 'linewidth': 3.1})
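
The keys in scatter_kws and line_kws are ordinary Matplotlib keyword arguments, so anything accepted by plt.scatter and plt.plot can be used. A short sketch with a few other keys (alpha, s, linestyle) follows, again on a synthetic stand-in for tips, which is an assumption made so the snippet runs offline.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the tips data (assumption: same column names).
rng = np.random.default_rng(3)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.8, size=200)
demo = pd.DataFrame({"total_bill": total_bill, "tip": tip})

ax = sns.regplot(
    x="total_bill", y="tip", data=demo, ci=None,
    scatter_kws={"alpha": 0.4, "s": 25},            # forwarded to plt.scatter
    line_kws={"linestyle": "--", "linewidth": 2},   # forwarded to plt.plot
)

# Read the styles back from the Matplotlib artists to confirm they applied.
line = ax.lines[0]
points = ax.collections[0]
print(line.get_linestyle(), line.get_linewidth(), points.get_alpha())
```

Lowering alpha is a simple way to reveal overlapping points without adding jitter.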

You can download my Jupyter notebook from here. I also recommend trying the code above with the Iris dataset.
