A great option for representing different stages of a process

Funnel chart made with — Image by the author

Funnel charts are mostly used for representing a sequential process, allowing the viewers to compare and see how the numbers change through the stages.

In this article, we’ll explore how to build a funnel chart from scratch using Matplotlib, and then we’ll have a look at an easier implementation with Plotly.

Matplotlib

There is no method for instantly creating funnel charts in Matplotlib, so let’s start with a simple horizontal bar chart and build from there.

import matplotlib.pyplot as plt

y = [5,4,3,2,1]
x = [80,73,58,42,23]
plt.barh(y, x)
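One possible next step, sketched here as an assumption rather than as the route the article itself takes, is to center every bar on the same midpoint so the bar chart takes on a funnel shape. The helper `centered_lefts` is a hypothetical name; the `left` argument of `barh` is a standard Matplotlib parameter.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

y = [5, 4, 3, 2, 1]
x = [80, 73, 58, 42, 23]


def centered_lefts(values):
    """Return the left offset of each bar so all bars share one midpoint."""
    widest = max(values)
    return [(widest - v) / 2 for v in values]


lefts = centered_lefts(x)

fig, ax = plt.subplots()
ax.barh(y, x, left=lefts)  # shift each bar right by half of its gap to the widest bar
ax.set_axis_off()          # a funnel usually reads better without axes
```

The widest stage keeps an offset of zero, and each narrower stage is pushed right by half the difference, which is what produces the symmetric taper.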


Use this intuitive tool to simplify mapping

In this article, we’ll explore Kepler.gl, an open-source solution for geospatial data visualization and exploration. Kepler was developed by Uber to make it easier for users of all levels to design meaningful maps that also look good. The tool can handle large amounts of data and has a friendly, intuitive interface that allows users to build effective maps in an instant.

Available for all to use since 2018, it’s about time we get a closer look at how the tool fits into the data visualization landscape. …


When and how to use texts in your data visualizations

Random line chart — Image by the author

Data visualization is all about reducing complexity; we use graphical representations to make difficult concepts and insights more comfortable to understand.

Titles, subtitles, notes, annotations, and labels serve an essential function in this process. They guide our audience through the story we’re trying to tell, much like a narrator.

In this article, we’ll explore the functions of titles, subtitles and labels, get a look at how to add annotations to our charts and check how to use custom fonts in Matplotlib.

Titles, Subtitles, Captions, and Labels

Let’s start with a simple line chart.

import matplotlib.pyplot as plt

# data
spam = [263.12, 302.99, 291.23, 320.68, 312.17, 316.39,
347.73, 344.66, 291.67, 242.42, 210.54, …
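As a rough illustration of the text elements mentioned above, here is a minimal sketch that adds a title, axis labels, and one annotation to a line chart. It uses only the first six values visible in the truncated snippet, and every label string is a placeholder assumption, not the article's wording.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Only the first six values shown in the snippet above; the rest is elided there.
spam = [263.12, 302.99, 291.23, 320.68, 312.17, 316.39]

fig, ax = plt.subplots()
ax.plot(spam)

# Placeholder texts (assumptions for illustration)
ax.set_title("Spam over time")
ax.set_xlabel("Observation")
ax.set_ylabel("Value")

# Point the reader at the highest value with an arrow and a short note
peak = max(spam)
peak_idx = spam.index(peak)
ax.annotate(
    "Peak",
    xy=(peak_idx, peak),                # the data point being described
    xytext=(peak_idx - 2, peak - 5),    # where the note itself is placed
    arrowprops=dict(arrowstyle="->"),
)
```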


How to improve the visualization of your cluster analysis

Clustering sure isn’t something new. MacQueen developed the k-means algorithm in 1967, and since then, many other implementations and algorithms have been developed to perform the task of grouping data.

Scatter Plots — Image by the author

In this article, we’ll explore how to improve our cluster’s visualization with scatter plots.

Scatter Plots

Let’s start by loading and preparing our data. I’ll use a .

import pandas as pd

df = pd.read_csv('data/Pokemon.csv')

# prepare data
types = df['Type 1'].isin(['Grass', 'Fire', 'Water'])
drop_cols = ['Type 1', 'Type 2', 'Generation', 'Legendary', '#']
df = df[types].drop(columns = drop_cols)
df.head()


Exploring the basics of this fundamental task with dplyr

Data by itself can be quite interesting, but even if you’re dealing with a small dataset, the chances are that you’ll have to summarize or aggregate it in some way. That’s where we’ll need groups.

Image from pixabay —

Sure, it’s nice to know the total amount of sales. But it’s often more interesting to know the total amount of sales by salesperson, or by month.

Grouping data is undeniably essential for data analysis, and in this article, I’ll investigate some of the methods for doing so with R, Tidyverse and dplyr.

The dataset I’ll use for the next examples comes from Kaggle and contains Spotify’s top songs from 2010 to 2019.


How to use this unique solution for displaying the composition of three variables

There are plenty of ways to display three variables in a single visualization. Heatmaps and Colormaps rely on encoding the third variable in the color. Bubble charts are Scatter Plots with the third variable encoded in size, and other solutions may introduce a Z-axis and rely on 3-dimensional representations.

Ternary plot — Image by the author

Ternary plots are a lesser-known solution that doesn’t require our user to compare colors, circle sizes, or 3D distances.

They’re a two-dimensional representation in which all three variables are encoded by position along three connected axes that form a triangle.

This article will go through the basics of how to draw ternary scatter plots using Plotly Express. …
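The triangle encoding described above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function name `ternary_to_xy` is hypothetical, and the triangle is assumed to be equilateral with corners at (0, 0), (1, 0), and (0.5, √3/2). A plotting library would normally do this projection for you.

```python
import math


def ternary_to_xy(a, b, c):
    """Project a three-part composition onto 2D triangle coordinates.

    Assumed layout: `a` is the top corner, `b` the bottom-left corner,
    and `c` the bottom-right corner of an equilateral triangle.
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("components must sum to a positive value")
    # Normalize so the three shares sum to 1
    a, b, c = a / total, b / total, c / total
    # Weighted average of the corners; b's corner is the origin,
    # so it contributes nothing to either coordinate.
    x = 0.5 * a + 1.0 * c
    y = (math.sqrt(3) / 2) * a
    return x, y


corner_a = ternary_to_xy(1, 0, 0)   # pure `a` lands on the top corner
center = ternary_to_xy(1, 1, 1)     # equal shares land in the middle
```

Because the three shares always sum to one, two coordinates are enough to place a point, which is why the plot stays two-dimensional. In Plotly Express, `px.scatter_ternary` takes the three columns through its `a`, `b`, and `c` arguments and handles the projection internally.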


What I learned from creating the same visualization with Python and R

I’ve been playing around a lot with R’s ggplot and decided to compare it with Python’s Matplotlib.

In some ways they feel very similar, and in others not at all. So I decided to build a scatter plot with R and replicate it with Python to check their advantages and disadvantages.

Image by the author — Drawn with R


Exploring options for one of the hardest tasks in data visualization

A prevalent task in any data analysis is comparing multiple sets of something. You may have lists of IPs for each landing page of your website, clients who bought certain items from your store, multiple answers from a survey, and so many others.

Venn Diagram — Image by the author

This article will use Python to explore ways to visualize overlaps and intersections of sets, the possibilities, and their advantages and disadvantages.

For the next examples, I’ll use a from the .

I’m using the survey because it has many different types of questions, some of which are multiple-choice questions with multiple answers, like the one below. …
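Before drawing anything, it can help to compute the size of every overlap. The sketch below uses Python's built-in sets on made-up respondent IDs (the `answers` data and the `exclusive_regions` helper are hypothetical, not taken from the survey) and returns the elements that belong to exactly each combination of sets, which correspond to the regions of a Venn diagram.

```python
from itertools import combinations

# Hypothetical data: respondent IDs grouped by the option they selected
answers = {
    "Python": {1, 2, 3, 4, 5},
    "R": {3, 4, 5, 6},
    "SQL": {1, 4, 5, 7},
}


def exclusive_regions(sets):
    """Map each combination of set names to the elements found in
    exactly those sets and in no other."""
    names = list(sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(sets[n] for n in combo))
            outside = set().union(*(sets[n] for n in names if n not in combo))
            regions[combo] = inside - outside
    return regions


regions = exclusive_regions(answers)
sizes = {combo: len(members) for combo, members in regions.items()}
```

The regions are disjoint by construction, so their sizes add up to the number of distinct respondents, which is a handy sanity check before plotting.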


How to easily create gifs to animate your visualizations

— Image by the author

There are plenty of ways to build animations in Matplotlib. They even have an with functions and methods to support this task.

But I often find those methods over-complicated, and many times I want to get something together without too much complexity.

In this article, I’ll go through the basics of creating charts, saving them as images, and using Imageio to create a GIF.


An exploration of pie charts and their place in data visualization

Pie charts — Image by the author

You’ve probably heard lots of reasons not to use pie charts. Lack of precision, hard-to-read angles, scalability limitations, and a poor data-ink ratio are some of the most frequently mentioned.

According to John W. Tukey, a famous statistician known for developing the and .

“There is no data that can be displayed in a pie chart, that cannot be displayed better in some other type of chart.” —

Alright, I get it — Pie charts bad.

But still, if you google ‘data visualization’ and go to images, I bet you’ll find lots of pies there. …

About

Thiago Carvalho

Data visualization enthusiast
