Friday, 19 April 2019

Data Visualization using Pair Grid and Pair Plot (Seaborn Library)

Let's visualize our data with Pair Grid and Pair Plot, both of which are part of the Seaborn library. We will use the Iris dataset. PairGrid accepts parameters such as hue, palette, hue_kws, vars, x_vars, y_vars and height; pairplot shares most of these and adds kind and diag_kind. Styling options such as color, marker (diamond, plus sign, circle, square), linewidth and edgecolor are passed on to the plotting function through the map methods. Let's explore Pair Grid and Pair Plot in detail:

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
#%matplotlib inline  (uncomment this magic command when running in a Jupyter notebook)

Step 2: Load Iris dataset

iris=sns.load_dataset('iris')
iris.head()

Step 3: Explore data using Pair Grid

Please note that I am not displaying the resulting plots in this post. Please explore them yourself in your Jupyter notebook.

x = sns.PairGrid(iris)
x = x.map(plt.scatter)  #draw scatter plot on pair grid

x = sns.PairGrid(iris)
x = x.map_diag(plt.hist)  #draw histogram on diagonal
x = x.map_offdiag(plt.scatter) #draw scatter plot on rest of the grid

x = sns.PairGrid(iris)
x = x.map_diag(plt.hist)  #draw histogram on diagonal
x = x.map_upper(plt.scatter)  #draw scatter plot on upper grid w.r.t diagonal
x = x.map_lower(sns.kdeplot)  #draw kde plot on lower grid w.r.t diagonal

x = sns.PairGrid(iris, hue='species', palette='Blues_d')
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)
x = x.add_legend()

x = sns.PairGrid(iris, hue='species')
x = x.map_diag(plt.hist, histtype='step', linewidth=2, edgecolor='black')
x = x.map_offdiag(plt.scatter, edgecolor='black')
x = x.add_legend()

x = sns.PairGrid(iris, vars=['petal_length', 'petal_width'])
x = x.map_diag(plt.hist)
x = x.map_offdiag(plt.scatter)

x = sns.PairGrid(iris, x_vars=['petal_length', 'petal_width'], y_vars=['sepal_length', 'sepal_width'])
x = x.map(plt.scatter)

x = sns.PairGrid(iris, hue='species')
x = x.map_diag(plt.hist)
x = x.map_upper(plt.scatter)
x = x.map_lower(sns.kdeplot)
x = x.add_legend()

x = sns.PairGrid(iris, hue='species', hue_kws={'marker' : ['D', 's', '+']})
x = x.map(plt.scatter, s=30, edgecolor='black')
x = x.add_legend()
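
To make the effect of vars, x_vars and y_vars concrete, here is a minimal sketch that checks the shape of the grid that PairGrid builds. It uses a small random stand-in for the Iris data (the values, the name demo and the Agg backend line are assumptions made so the snippet runs without downloading anything or opening a window); with the real dataset the shapes should be the same.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend; drop this line in Jupyter
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# hypothetical stand-in for iris: four numeric columns plus a species label
rng = np.random.default_rng(0)
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
demo = pd.DataFrame(rng.normal(size=(60, 4)), columns=cols)
demo['species'] = np.repeat(['setosa', 'versicolor', 'virginica'], 20)

# all numeric columns against each other: one row and one column per variable
full = sns.PairGrid(demo, hue='species')
full_shape = full.axes.shape

# different variables on rows and columns: rows follow y_vars, columns follow x_vars
rect = sns.PairGrid(demo, x_vars=cols[2:], y_vars=cols[:3])
rect_shape = rect.axes.shape

print(full_shape, rect_shape)
plt.close('all')  # free the figures
```

The hue column is not numeric, so it is left out of the grid and only used for coloring.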

Step 4: Explore data using Pair Plot

sns.pairplot(iris)

sns.pairplot(iris, kind='reg', diag_kind='kde')

sns.pairplot(iris, kind='reg', diag_kind='kde', hue='species')

sns.pairplot(iris, vars=['petal_length', 'petal_width'], height=4)

sns.pairplot(iris, x_vars=['petal_length', 'petal_width'], y_vars=['sepal_length', 'sepal_width'])

Thursday, 18 April 2019

Data Visualization using Regression Plot (Seaborn Library)

Let's visualize our data with Regression Plot (regplot), which is part of the Seaborn library. We will use the Tips dataset. We can pass various parameters to regplot such as color, marker (diamond, plus sign, circle), ci, x_jitter and x_estimator. We can also style the scatter points and the regression line separately using the scatter_kws and line_kws parameters, for example to set the linewidth of the line. Let's explore regplot in detail:

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
#%matplotlib inline  (uncomment this magic command when running in a Jupyter notebook)

Step 2: Load Tips dataset

tips=sns.load_dataset('tips')
tips.head()

Step 3: Explore data using Regression Plot

Please note that I am not displaying the resulting plots in this post. Please explore them yourself in your Jupyter notebook.

sns.regplot(x='total_bill', y='tip', data=tips)

sns.regplot(x='total_bill', y='tip', data=tips, color='purple')

sns.regplot(x='total_bill', y='tip', data=tips, marker='D')  #diamond
sns.regplot(x='total_bill', y='tip', data=tips, marker='+')  #plus sign
sns.regplot(x='total_bill', y='tip', data=tips, marker='o')  #circle

sns.regplot(x='total_bill', y='tip', data=tips, marker='D',
            scatter_kws={'color' : 'blue'},
            line_kws={'color' : 'red', 'linewidth' : 3.1})

sns.regplot(x='total_bill', y='tip', data=tips, ci=64)  #64% confidence interval (default is 95%)
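
By default regplot fits a straight least-squares line through the points. As a rough sketch of what that line represents, the snippet below fits the same kind of line with np.polyfit on made-up bill and tip values (the data, the 15% tip rate and the variable names are assumptions, not the Tips dataset):

```python
import numpy as np

# hypothetical data: tips of roughly 1.00 plus 15% of the bill, with some noise
rng = np.random.default_rng(42)
total_bill = np.linspace(5, 50, 40)
tip = 1.0 + 0.15 * total_bill + rng.normal(0, 0.5, size=40)

# degree-1 polynomial fit = ordinary least-squares straight line
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# the fitted line can then be used to estimate the tip for a given bill
predicted_tip = intercept + slope * 30
print(round(slope, 3), round(intercept, 3), round(predicted_tip, 2))
```

The slope recovered from the noisy points should land close to the 0.15 used to generate them.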

sns.regplot(x='size', y='total_bill', data=tips)
sns.regplot(x='size', y='total_bill', data=tips, x_jitter=0.3)

sns.regplot(x='size', y='total_bill', data=tips, x_estimator=np.mean)
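
With x_estimator=np.mean, regplot applies the function to the observations at each distinct x value and plots the resulting estimate instead of every single point. The grouping behind those points can be sketched with pandas; the tiny table below is hypothetical, not the Tips data:

```python
import pandas as pd

# hypothetical party sizes and bills
demo = pd.DataFrame({
    'size': [1, 2, 2, 3, 3, 3],
    'total_bill': [8.0, 15.0, 17.0, 22.0, 25.0, 28.0],
})

# one mean per distinct x value: these are the points the estimator version draws
bill_by_size = demo.groupby('size')['total_bill'].mean()
print(bill_by_size)
```

Here the six observations collapse into three points: 8.0 for size 1, 16.0 for size 2 and 25.0 for size 3.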

Data Visualization using FacetGrid (Seaborn Library)

Let's visualize our data with FacetGrid, which is part of the Seaborn library. We will use the Tips dataset. FacetGrid can be combined with histograms, scatter plots, regression plots, box plots etc. We can pass various parameters to FacetGrid such as height, aspect, hue, palette and col_order. To add a legend to a FacetGrid, call its add_legend() method. Let's explore FacetGrid in detail:

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
#%matplotlib inline  (uncomment this magic command when running in a Jupyter notebook)

Step 2: Load Tips dataset

tips=sns.load_dataset('tips')
tips.head()

Step 3: Explore data using Facet Grid

Please note that I am not displaying the resulting plots in this post. Please explore them yourself in your Jupyter notebook.

#Facet Grid with Histogram

x = sns.FacetGrid(tips, row='smoker', col='time')
x = x.map(plt.hist, 'total_bill')

x = sns.FacetGrid(tips, row='smoker', col='time')
x = x.map(plt.hist, 'total_bill', color='green', bins=15)

#Facet Grid with Scatter Plot

x = sns.FacetGrid(tips, row='smoker', col='time')
x = x.map(plt.scatter, 'total_bill', 'tip')

#Facet Grid with Regression Plot

x = sns.FacetGrid(tips, row='smoker', col='time', height=6, aspect=0.7)
x = x.map(sns.regplot, 'total_bill', 'tip')

x = sns.FacetGrid(tips, col='time', hue='smoker', palette='husl')
x = x.map(sns.regplot, 'total_bill', 'tip')

x = sns.FacetGrid(tips, col='time', hue='smoker')
x = x.map(sns.regplot, 'total_bill', 'tip').add_legend()  #add legend

#Facet Grid with Box Plot

x = sns.FacetGrid(tips, col='day', height=10, aspect=0.2)
x = x.map(sns.boxplot, 'time', 'total_bill', order=['Lunch', 'Dinner'])  #fix the category order so every facet uses the same x positions

x = sns.FacetGrid(tips, col='day', height=10, aspect=0.2, col_order=['Sat', 'Sun', 'Thur', 'Fri'])
x = x.map(sns.boxplot, 'time', 'total_bill', order=['Lunch', 'Dinner'], color='red')

Wednesday, 17 April 2019

Data Visualization using Violin Plot (Seaborn Library)

Let's visualize our data with Violin Plot, which is part of the Seaborn library. We will use the Tips dataset. We can pass various parameters to violinplot such as hue, split, palette, order, inner (quartile, stick), scale, scale_hue and bandwidth (bw). Let's explore violinplot in detail:

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
#%matplotlib inline  (uncomment this magic command when running in a Jupyter notebook)

Step 2: Load Tips dataset

tips=sns.load_dataset('tips')
tips.head()

Step 3: Explore data using Violin Plot

Please note that I am not displaying the resulting plots in this post. Please explore them yourself in your Jupyter notebook.

sns.violinplot(x=tips['tip'])

sns.violinplot(x='day', y='total_bill', data=tips)

sns.violinplot(x='day', y='total_bill', data=tips, hue='sex')

sns.violinplot(x='day', y='total_bill', data=tips, hue='sex', palette='RdBu')

sns.violinplot(x='day', y='total_bill', data=tips, hue='sex', split=True)

sns.violinplot(x='day', y='total_bill', data=tips, hue='sex', order=['Sat', 'Sun', 'Thur', 'Fri'])

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='quartile')

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='quartile', split=True)

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='quartile', split=True, scale='count')

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='stick', split=True, scale='count')

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='stick', split=True, scale='count', scale_hue=False)

sns.violinplot(x='day', y='total_bill', data=tips, hue='smoker', inner='stick', split=True, scale='count', scale_hue=False, bw=0.1)
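
Each violin is a kernel density estimate drawn symmetrically around an axis, and bw controls how strongly that estimate is smoothed. The sketch below hand-rolls a Gaussian KDE in NumPy on made-up bill values to show the effect. Treating bw as a factor that multiplies the sample standard deviation is an assumption about how a scalar value is interpreted, and gaussian_kde_curve is a hypothetical helper, not a Seaborn function:

```python
import numpy as np

def gaussian_kde_curve(values, grid, bw_factor):
    # kernel width = bw_factor * sample standard deviation (assumed convention)
    h = bw_factor * values.std(ddof=1)
    z = (grid[:, None] - values[None, :]) / h
    kernels = np.exp(-0.5 * z ** 2) / (h * np.sqrt(2 * np.pi))
    # average one Gaussian bump per observation
    return kernels.mean(axis=1)

# hypothetical bills with two clusters
rng = np.random.default_rng(7)
bills = np.concatenate([rng.normal(15, 2, 80), rng.normal(30, 3, 40)])

sd = bills.std(ddof=1)
grid = np.linspace(bills.min() - 4 * sd, bills.max() + 4 * sd, 2000)
step = grid[1] - grid[0]

smooth = gaussian_kde_curve(bills, grid, bw_factor=1.0)  # heavy smoothing
wiggly = gaussian_kde_curve(bills, grid, bw_factor=0.1)  # follows the data closely

# both curves are densities, so the area under each is approximately 1
area_smooth = smooth.sum() * step
area_wiggly = wiggly.sum() * step
print(round(area_smooth, 3), round(area_wiggly, 3))
print(round(smooth.max(), 3), round(wiggly.max(), 3))
```

A small bandwidth gives a taller, bumpier outline that tracks the two clusters, while a large one flattens them into a single broad shape.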

Data Visualization using Heatmap (Seaborn Library)

Let's visualize our data with Heatmap, which is part of the Seaborn library. A heatmap draws each value of a matrix as a colored cell. Which colors stand for high and low values depends on the color map in use, and the color bar next to the plot shows the mapping. We will use the Flights dataset and analyze it through a heatmap. We can pass various parameters to heatmap such as annot, fmt, vmin, vmax, cbar, cmap, linewidths and center. Let's explore heatmap in detail:

Step 1: Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
#%matplotlib inline  (uncomment this magic command when running in a Jupyter notebook)

Step 2: Load Flights dataset

flights = sns.load_dataset('flights')
flights.head()
flights.tail()

Step 3: Explore data using Heat Map

Please note that I am not displaying the resulting maps in this post. Please explore them yourself in your Jupyter notebook.

Before exploring the Flights dataset with Heatmap, let's first analyze some random numbers using Heatmap:

numbers = np.random.randn(12, 15)
numbers

sns.heatmap(numbers)

sns.heatmap(numbers, annot=True)  #to show actual values in the heatmap

sns.heatmap(numbers, annot=True, vmin=0, vmax=2)  #set the limits of the color scale; by default they are inferred from the data

sns.heatmap(numbers, cbar=False)  #to hide the color bar
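
vmin and vmax anchor the two ends of the color scale: values at or below vmin get the color at one end, values at or above vmax get the color at the other end, and everything in between is placed proportionally. A minimal NumPy sketch of that mapping (the sample values are made up, and this mirrors the idea rather than Seaborn's internal code):

```python
import numpy as np

# hypothetical cell values, some outside the chosen limits
values = np.array([-1.0, 0.0, 1.0, 2.0, 3.0])
vmin, vmax = 0, 2

# position of each value along the color scale, from 0 (one end) to 1 (other end)
position = (np.clip(values, vmin, vmax) - vmin) / (vmax - vmin)
print(position)
```

Here -1 and 0 both land at position 0, the value 1 lands at 0.5, and 2 and 3 both land at position 1.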

Now, let's move on to our Flights dataset. Let's pivot this dataset so that we have "year" on the x-axis and "month" on the y-axis.

flights = flights.pivot(index='month', columns='year', values='passengers')  #keyword arguments; newer pandas versions no longer accept these positionally
flights
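
The pivot step turns a long table (one row per month and year) into a wide one (one row per month, one column per year), which is the matrix shape that heatmap expects. A sketch on a tiny made-up table with the same column names (the passenger counts are invented):

```python
import pandas as pd

# hypothetical long-form data: one row per (year, month) pair
long_form = pd.DataFrame({
    'year': [1949, 1949, 1950, 1950],
    'month': ['Jan', 'Feb', 'Jan', 'Feb'],
    'passengers': [100, 110, 120, 130],
})

# months become the row index, years become the columns
wide = long_form.pivot(index='month', columns='year', values='passengers')
print(wide)

# a single cell is addressed as (row label, column label)
jan_1950 = wide.loc['Jan', 1950]
```

Because the months here are plain strings, the rows of this small example come out in alphabetical order.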

sns.heatmap(flights)

sns.heatmap(flights, annot=True)

sns.heatmap(flights, annot=True, fmt='d')  #format the annotations as integers

sns.heatmap(flights, annot=True, fmt='d', linewidths=0.9)  #add linewidth to heatmap

sns.heatmap(flights, annot=True, fmt='d', linewidths=0.9, cmap='RdBu')  #add color map to heatmap to change the color

sns.heatmap(flights, annot=True, fmt='d', linewidths=0.9, cmap='summer')

sns.heatmap(flights, annot=True, fmt='d', linewidths=0.9, cmap='winter_r')

sns.heatmap(flights, annot=True, fmt='d', linewidths=0.9, cmap='coolwarm')

sns.heatmap(flights, annot=True, fmt='d', center=flights.loc['Jun', 1954])  #center the color map on a particular cell (use 'June' if your seaborn version keeps full month names)
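
One way to picture center: the color scale is stretched symmetrically around the chosen value, so that value sits in the middle of the color map and every cell is colored by how far above or below it lies. A rough NumPy sketch of that idea on invented passenger counts; it illustrates the concept and is not Seaborn's actual implementation:

```python
import numpy as np

# hypothetical passenger counts and a chosen center value
counts = np.array([104, 180, 264, 302, 622])
center = 264

# widest distance from the center, used on both sides to keep the scale symmetric
half_range = max(counts.max() - center, center - counts.min())

# position along the color map: 0.5 is the middle, 0 and 1 are the two ends
position = (counts - center) / (2 * half_range) + 0.5
print(position.round(3))
```

The center value maps to 0.5, the largest count reaches 1.0, and the smallest count stays well above 0 because it is closer to the center than the largest one is.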