Using Seaborn for exploratory data analysis (EDA) of continuous variables

Creating histograms and scatter plots using the pairplot() function

May 20, 2025

When you first analyze a dataset of continuous variables, I encourage you to start by creating a matrix of histograms and pairwise scatter plots. Let’s use the “Iris” dataset as an example; you can access it via the Seaborn package in Python. Here is the code to obtain the first 5 rows:

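import seaborn as sb

# Load the Iris dataset as a pandas DataFrame (downloaded on first use)
iris = sb.load_dataset('iris')

# display() is available in notebooks such as Google Colab; use print() in a plain script
display(iris.head())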

Here is what the output looks like in Google Colab:

Here is the code to produce the matrix of histograms and pairwise scatter plots for the 4 continuous variables:

import matplotlib.pyplot as plt
import seaborn as sb

iris = sb.load_dataset("iris")

# pairplot() returns a PairGrid: histograms on the diagonal, scatter plots elsewhere
correlation_plots = sb.pairplot(iris)
plt.show()

Here is the output:

In the diagonal entries from top left to bottom right, you will find the histogram for each of the 4 continuous variables. These histograms show each variable's shape (for example, skewness or multiple peaks), spread, and central tendency.
In the off-diagonal entries, you will find the scatter plot for each pair of continuous variables. Each pair appears twice, once above and once below the diagonal, with the axes swapped. These plots allow you to quickly assess the direction, strength, and form (linear or curved) of the relationship between each pair of variables.

Using this information in your exploratory data analysis (EDA), you can move on to deeper analyses, such as correlation, regression, variable selection, outlier detection, and multicollinearity assessment.
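
As one possible next step, you could put numbers on what the scatter plots suggest by computing the pairwise correlations. The sketch below is just one way to do this, assuming Pearson correlation (the pandas default) is appropriate for your data; the function name ranked_correlations is hypothetical:

```python
import numpy as np
import pandas as pd


def ranked_correlations(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: list each pair of numeric columns once, strongest correlation first."""
    numeric = df.select_dtypes(include="number")  # ignore categorical columns
    corr = numeric.corr()                         # Pearson correlation by default

    # Keep only the entries above the diagonal so that each pair is counted once
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().dropna()

    # Sort by absolute value so that strong negative correlations rank highly too
    return pairs.sort_values(key=abs, ascending=False)
```

For the iris DataFrame, ranked_correlations(iris) would return 6 values, one for each pair of the 4 continuous variables.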
