Interactive plotting for Pandas using Vega-Lite

Overview

pdvega: Vega-Lite plotting for Pandas Dataframes

build status Binder

pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas dataframes, using an API that is nearly identical to Pandas' built-in visualization tools, and designed for easy use within the Jupyter notebook.

Pandas currently has some basic plotting capabilities based on matplotlib. So, for example, you can create a scatter plot this way:

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.random.randn(100), 'y': np.random.randn(100)})
df.plot.scatter(x='x', y='y')

matplotlib scatter output

The goal of pdvega is that any time you use dataframe.plot, you'll be able to replace it with dataframe.vgplot and instead get a similar (but prettier and more interactive) visualization output in Vega-Lite that you can easily export to share or customize:

import pdvega  # import adds vgplot attribute to pandas

df.vgplot.scatter(x='x', y='y')

vega-lite scatter output

The above image is a static screenshot of the interactive output; please see the Documentation for a full set of live usage examples.

Installation

You can get started with pdvega using pip:

$ pip install jupyter pdvega
$ jupyter nbextension install --sys-prefix --py vega3

The first line installs pdvega and its dependencies; the second installs the Jupyter extensions that allows plots to be displayed in the Jupyter notebook. For more information on installation and dependencies, see the Installation docs.

Why Vega-Lite?

When working with data, one of the biggest challenges is ensuring reproducibility of results. When you create a figure and export it to PNG or PDF, the data become baked-in to the rendering in a way that is difficult or impossible for others to extract. Vega and Vega-Lite change this: instead of packaging a figure by encoding its pixel values, they package a figure by describing, in a declarative manner, the relationship between data values and visual encodings through a JSON specification.

This means that the Vega-Lite figures produced by pdvega are portable: you can send someone the resulting JSON specification and they can choose whether to render it interactively online, convert it to a PNG or EPS for static publication, or even enhance and extend the figure to learn more about the data.

pdvega is a step in bringing this vision of figure portability and reproducibility to the Python world.

Relationship to Altair

Altair is a project that seeks to design an intuitive declarative API for generating Vega-Lite and Vega visualizations, using Pandas dataframes as data sources.

By contrast, pdvega seeks not to design new visualization APIs, but to use the existing DataFrame.plot visualization api and output visualizations with Vega/Vega-Lite rather than with matplotlib.

In this respect, pdvega is quite similar in spirit to the now-defunct mpld3 project, though the scope is smaller and (hopefully) much more manageable.

Owner
Altair
Declarative visualization in Python
Altair
Library for exploring and validating machine learning data

TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig

688 Jan 03, 2023
A python package for animating plots build on matplotlib.

animatplot A python package for making interactive as well as animated plots with matplotlib. Requires Python = 3.5 Matplotlib = 2.2 (because slider

Tyler Makaro 394 Dec 18, 2022
Movies-chart - A CLI app gets the top 250 movies of all time from imdb.com and the top 100 movies from rottentomatoes.com

movies-chart This CLI app gets the top 250 movies of all time from imdb.com and

3 Feb 17, 2022
These data visualizations were created as homework for my CS40 class. I hope you enjoy!

Data Visualizations These data visualizations were created as homework for my CS40 class. I hope you enjoy! Nobel Laureates by their Country of Birth

9 Sep 02, 2022
ICS-Visualizer is an interactive Industrial Control Systems (ICS) network graph that contains up-to-date ICS metadata

ICS-Visualizer is an interactive Industrial Control Systems (ICS) network graph that contains up-to-date ICS metadata (Name, company, port, user manua

QeeqBox 2 Dec 13, 2021
Pretty Confusion Matrix

Pretty Confusion Matrix Why pretty confusion matrix? We can make confusion matrix by using matplotlib. However it is not so pretty. I want to make con

Junseo Ko 5 Nov 22, 2022
Some useful extensions for Matplotlib.

mplx Some useful extensions for Matplotlib. Contour plots for functions with discontinuities plt.contour mplx.contour(max_jump=1.0) Matplotlib has pro

Nico Schlömer 519 Dec 30, 2022
Fast visualization of radar_scenes based on oleschum/radar_scenes

RadarScenes Tools About This python package provides fast visualization for the RadarScenes dataset. The Open GL based visualizer is smoother than ole

Henrik Söderlund 2 Dec 09, 2021
Create SVG drawings from vector geodata files (SHP, geojson, etc).

SVGIS Create SVG drawings from vector geodata files (SHP, geojson, etc). SVGIS is great for: creating small multiples, combining lots of datasets in a

Neil Freeman 78 Dec 09, 2022
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

100 pandas puzzles Puzzles notebook Solutions notebook Inspired by 100 Numpy exerises, here are 100* short puzzles for testing your knowledge of panda

Alex Riley 1.9k Jan 08, 2023
A Graph Learning library for Humans

A Graph Learning library for Humans These novel algorithms include but are not limited to: A graph construction and graph searching class can be found

Richard Tjörnhammar 1 Feb 08, 2022
Fast scatter density plots for Matplotlib

About Plotting millions of points can be slow. Real slow... 😴 So why not use density maps? ⚡ The mpl-scatter-density mini-package provides functional

Thomas Robitaille 473 Dec 12, 2022
Tweets your monthly GitHub Contributions as Wordle grid

Tweets your monthly GitHub Contributions as Wordle grid

Venu Vardhan Reddy Tekula 5 Feb 16, 2022
clock_plot provides a simple way to visualize timeseries data, mapping 24 hours onto the 360 degrees of a polar plot

clock_plot clock_plot provides a simple way to visualize timeseries data mapping 24 hours onto the 360 degrees of a polar plot. For usage, please see

12 Aug 24, 2022
PanGraphViewer -- show panenome graph in an easy way

PanGraphViewer -- show panenome graph in an easy way Table of Contents Versions and dependences Desktop-based panGraphViewer Library installation for

16 Dec 17, 2022
mysql relation charts

sqlcharts 自动生成数据库关联关系图 复制settings.py.example 重命名为settings.py 将数据库配置信息填入settings.DATABASE,目前支持mysql和postgresql 执行 python build.py -b,-b是读取数据库表结构,如果只更新匹

6 Aug 22, 2022
Streaming pivot visualization via WebAssembly

Perspective is an interactive visualization component for large, real-time datasets. Originally developed for J.P. Morgan's trading business, Perspect

The Fintech Open Source Foundation (www.finos.org) 5.1k Dec 27, 2022
Chem: collection of mostly python code for molecular visualization, QM/MM, FEP, etc

chem: collection of mostly python code for molecular visualization, QM/MM, FEP,

5 Sep 02, 2022
649 Pokémon palettes as CSVs, with a Python lib to turn names/IDs into palettes, or MatPlotLib compatible ListedColormaps.

PokePalette 649 Pokémon, broken down into CSVs of their RGB colour palettes. Complete with a Python library to convert names or Pokédex IDs into eithe

11 Dec 05, 2022
Resources for teaching & learning practical data visualization with python.

Practical Data Visualization with Python Overview All views expressed on this site are my own and do not represent the opinions of any entity with whi

Paul Jeffries 98 Sep 24, 2022