Box and Violin Plots with Python’s Seaborn
A quick look at two powerful tools for exploring variability
In data analytics, Exploratory Data Analysis (EDA) is considered an essential step of every project. That is to say, you need to explore your data before drawing conclusions or wasting time investigating the wrong relationships.
The essence of EDA is to collect valuable information that helps describe the characteristics of the data and understand the story behind it.
That process usually starts with finding estimates of Location and Variability.
- Location: Measures a central point to describe the data (Mean, Mode, Median, and others).
- Variability: Measures the spread of the data (Range, Standard Deviation, Percentiles, Variance, and others).
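To make those estimates concrete, here is a minimal sketch that computes a few of them with Python's built-in `statistics` module. The `values` list is made up purely for illustration, and the quartiles use the module's default interpolation method, so other libraries may return slightly different numbers.

```python
import statistics

# Made-up sample values, used only for illustration.
values = [2, 4, 4, 5, 7, 9, 10, 12, 15, 21]

# Estimates of location
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Estimates of variability
value_range = max(values) - min(values)
variance = statistics.variance(values)  # sample variance
std_dev = statistics.stdev(values)      # sample standard deviation
# Three cut points that split the data into quarters:
# the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(values, n=4)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance:.2f}, std={std_dev:.2f}")
print(f"quartiles: {q1}, {q2}, {q3}")
```

The 50th percentile is the median, which is why percentiles show up on both sides of the Location and Variability split.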
In this article, I’ll explore Box plots and Violin plots, both convenient ways of visualizing the distribution of a variable.
Box plots
The objective of a box plot is to describe the variability in a set of values. It does so by using Percentiles, most commonly the 25th and 75th, also known as the first and third Quartiles. It also uses an estimate of Location, usually the Median, and the Range of the values.
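As a rough sketch of the pieces described above, the snippet below computes the quartiles, median, and range with NumPy and then draws the same values with Seaborn. The `values` array and the `boxplot.png` file name are made up for illustration. One assumption worth flagging: by default Seaborn's whiskers stop at 1.5 times the interquartile range and anything beyond is drawn as an outlier, so I pass `whis=(0, 100)` here to make the whiskers span the full range.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up sample values, used only for illustration.
values = np.array([2, 4, 4, 5, 7, 9, 10, 12, 15, 21])

# The numbers a box plot encodes
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1                                 # width of the box
value_range = values.max() - values.min()     # whisker to whisker, given whis=(0, 100)

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}, range={value_range}")

# Box from Q1 to Q3, a line at the median, whiskers at the minimum and maximum
fig, ax = plt.subplots()
sns.boxplot(x=values, whis=(0, 100), ax=ax)
ax.set_xlabel("value")
fig.savefig("boxplot.png")  # hypothetical output file name
```

Dropping the `whis` argument gives Seaborn's default behavior, which is usually what you want when the goal is to spot outliers rather than show the full range.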