A workshop on data visualization in Python with notebooks and exercises for following along.

Overview

Beyond the Basics: Data Visualization in Python

Binder Nbviewer View slides in browser

The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python.

While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine tune the resulting visualizations (e.g., add annotations, animate, etc.). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood.

Workshop Outline

This is a workshop on data visualization in Python first delivered at ODSC West 2021 and subsequently at ODSC East 2022 and PyCon Italia 2022. It's divided into the following sections:

Section 1: Getting Started With Matplotlib

We will begin by familiarizing ourselves with Matplotlib. Moving beyond the default options, we will explore how to customize various aspects of our visualizations. By the end of this section, you will be able to generate plots using the Matplotlib API directly, as well as customize the plots that libraries like pandas and Seaborn create for you.

Section 2: Moving Beyond Static Visualizations

Static visualizations are limited in how much information they can show. To move beyond these limitations, we can create animated and/or interactive visualizations. Animations make it possible for our visualizations to tell a story through movement of the plot components (e.g., bars, points, lines). Interactivity makes it possible to explore the data visually by hiding and displaying information based on user interest. In this section, we will focus on creating animated visualizations using Matplotlib before moving on to create interactive visualizations in the next section.

Section 3: Building Interactive Visualizations for Data Exploration

When exploring our data, interactive visualizations can provide the most value. Without having to create multiple iterations of the same plot, we can use mouse actions (e.g., click, hover, zoom, etc.) to explore different aspects and subsets of the data. In this section, we will learn how to use a few of the libraries in the HoloViz ecosystem to create interactive visualizations for exploring our data utilizing the Bokeh backend.


Prerequisites

You should have basic knowledge of Python and be comfortable working in Jupyter Notebooks. Check out this notebook for a crash course in Python or work through the official Python tutorial for a more formal introduction. The environment we will use for this workshop comes with JupyterLab, which is pretty intuitive, but be sure to familiarize yourself using notebooks in JupyterLab and additional functionality in JupyterLab. In addition, a basic understanding of pandas will be beneficial, but is not required; reviewing the first section of my pandas workshop will be sufficient.


Setup Instructions

  1. Install Anaconda/Miniconda. Note that you can use this Binder environment instead if you don't want to install anything on your machine.

  2. Fork this repository:

    location of fork button in GitHub

  3. Clone your forked repository:

    location of clone button in GitHub

  4. Create and activate a conda virtual environment (on Windows, these commands should be run in Anaconda Prompt):

    $ cd python-data-viz-workshop
    ~/python-data-viz-workshop$ conda install mamba -n base -c conda-forge
    ~/python-data-viz-workshop$ mamba env create --file environment.yml
    ~/python-data-viz-workshop$ conda activate data_viz_workshop
    (data_viz_workshop) ~/python-data-viz-workshop$
  5. Launch JupyterLab:

    (data_viz_workshop) ~/python-data-viz-workshop$ jupyter lab
  6. Navigate to the 0-check_your_env.ipynb notebook in the notebooks/ folder:

    open 0-check_your_env.ipynb

  7. Run the notebook to confirm everything is set up properly:

    check env


About the Author

Stefanie Molin (@stefmolin) is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of Hands-On Data Analysis with Pandas, which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science. She is currently pursuing a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

Related Content

All examples herein were developed exclusively for this workshop. Hands-On Data Analysis with Pandas contains additional examples and exercises, as does this blog post and this workshop on pandas.

Owner
Stefanie Molin
Developer | Data Scientist | Author of "Hands-On Data Analysis with Pandas" | occasional hacker
Stefanie Molin
Easily convert matplotlib plots from Python into interactive Leaflet web maps.

mplleaflet mplleaflet is a Python library that converts a matplotlib plot into a webpage containing a pannable, zoomable Leaflet map. It can also embe

Jacob Wasserman 502 Dec 28, 2022
:small_red_triangle: Ternary plotting library for python with matplotlib

python-ternary This is a plotting library for use with matplotlib to make ternary plots plots in the two dimensional simplex projected onto a two dime

Marc 611 Dec 29, 2022
Material for dataviz course at university of Bordeaux

Material for dataviz course at university of Bordeaux

Nicolas P. Rougier 50 Jul 17, 2022
NorthPitch is a python soccer plotting library that sits on top of Matplotlib

NorthPitch is a python soccer plotting library that sits on top of Matplotlib.

Devin Pleuler 30 Feb 22, 2022
A workshop on data visualization in Python with notebooks and exercises for following along.

Beyond the Basics: Data Visualization in Python The human brain excels at finding patterns in visual representations, which is why data visualizations

Stefanie Molin 162 Dec 05, 2022
Decision Border Visualizer for Classification Algorithms

dbv Decision Border Visualizer for Classification Algorithms Project description A python package for Machine Learning Engineers who want to visualize

Sven Eschlbeck 1 Nov 01, 2021
Runtime analysis of code with plotting

Runtime analysis of code with plotting A quick comparison among Python, Cython, and the C languages A Programming Assignment regarding the Programming

Cena Ashoori 2 Dec 24, 2021
Generate a 3D Skyline in STL format and a OpenSCAD file from Gitlab contributions

Your Gitlab's contributions in a 3D Skyline gitlab-skyline is a Python command to generate a skyline figure from Gitlab contributions as Github did at

Félix Gómez 70 Dec 22, 2022
Some useful extensions for Matplotlib.

mplx Some useful extensions for Matplotlib. Contour plots for functions with discontinuities plt.contour mplx.contour(max_jump=1.0) Matplotlib has pro

Nico Schlömer 519 Dec 30, 2022
Example scripts for generating plots of Bohemian matrices

Bohemian Eigenvalue Plotting Examples This repository contains examples of generating plots of Bohemian eigenvalues. The examples in this repository a

Bohemian Matrices 5 Nov 12, 2022
Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI

Data-Visualization-Projects Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI Indigenous-Brands-Social-Movements Pyt

Jinwoo(Roy) Yoon 1 Feb 05, 2022
A small script written in Python3 that generates a visual representation of the Mandelbrot set.

Mandelbrot Set Generator A small script written in Python3 that generates a visual representation of the Mandelbrot set. Abstract The colors in the ou

1 Dec 28, 2021
Implementation of SOMs (Self-Organizing Maps) with neighborhood-based map topologies.

py-self-organizing-maps Simple implementation of self-organizing maps (SOMs) A SOM is an unsupervised method for learning a mapping from a discrete ne

Jonas Grebe 6 Nov 22, 2022
📊 Charts with pure python

A zero-dependency python package that prints basic charts to a Jupyter output Charts supported: Bar graphs Scatter plots Histograms 🍑 📊 👏 Examples

Max Humber 54 Oct 04, 2022
Sci palettes for matplotlib/seaborn

sci palettes for matplotlib/seaborn Installation python3 -m pip install sci-palettes Usage import seaborn as sns import matplotlib.pyplot as plt impor

Qingdong Su 2 Jun 07, 2022
coordinate to draw the nimbus logo on the graffitiwall

This is a community effort to draw the nimbus logo on beaconcha.in's graffitiwall. get started clone repo with git clone https://github.com/tennisbowl

4 Apr 04, 2022
A Python library for plotting hockey rinks with Matplotlib.

Hockey Rink A Python library for plotting hockey rinks with Matplotlib. Installation pip install hockey_rink Current Rinks The following shows the cus

24 Jan 02, 2023
Here are my graphs for hw_02

Let's Have A Look At Some Graphs! Graph 1: State Mentions in Congressperson's Tweets on 10/01/2017 The graph below uses this data set to demonstrate h

7 Sep 02, 2022
Altair extension for saving charts in a variety of formats.

Altair Saver This packge provides extensions to Altair for saving charts to a variety of output types. Supported output formats are: .json/.vl.json: V

Altair 85 Dec 09, 2022
A guide for using Bootstrap 5 classes in Dash Bootstrap Components V1

dash-bootstrap-cheatsheet This handy interactive cheatsheet makes it easy to use the Bootstrap 5 classes with your Dash app made with the latest versi

10 Dec 22, 2022