A small timeseries transformation API built on Flask and Pandas

Overview

#Mcflyin

###A timeseries transformation API built on Pandas and Flask

This is a small demo of an API to do timeseries transformations built on Flask and Pandas.

Concept

The idea is that you can make a POST request to the API with a simple list/array of timestamps, from any language, and get back some interesting transformations of that data.

Why?

Partly to show how straightforward it is to build such a thing. Python is great because it has very powerful, intuitive, quick-to-learn tools for both building web applications and doing data analysis/statistics.

That puts Python in kind of a unique position: powerful web tools, powerful scientific/numerical/statistical data tools. This API is a very simple example of how you can take advantage of both. Go read the source code- it's short and easy to grok. Bug fixes and pull requests welcome.

Getting Started

First we need to find some data. We're going to use some data that Wes McKinney provided in a recent blog post, with some statistics on Python posts on Stack Overflow. This is something of a contrived example: I'm manipulating the data in Python, sending to a Python backend, and then getting a response to manipulate in Python. Just know that all you need is an array of timestamp strings, no matter your language.

import pandas as pd

data = pd.read_csv('AllPandas.csv')
data = data['CreationDate'].tolist()

A simple array of timestamps:

>>>data[:10]
['2011-04-01 14:50:44',
 '2012-01-18 19:41:27',
 '2012-01-23 03:21:00',
 '2012-01-24 17:59:53',
 '2012-03-04 16:58:45',
 '2012-03-09 22:36:52',
 '2012-03-10 15:35:26',
 '2012-03-18 12:53:06',
 '2012-03-30 13:58:29',
 '2012-04-04 23:17:23']

With the McFlyin application running on localhost, lets make a request to resample the data on an daily basis, to get the number of posts per day:

import requests
import json

freq = {'D': 'Daily'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data)}
r = requests.post('http://127.0.0.1:5000/resample', data=sends)
response = r.json

The response is simple JSON:

{'Monthly': {'data': [1.0, 2.0, 1.0, 1.0,...
             'time': ['2011-03-31T00:00:00', '2011-04-30T00:00:00', '2011-05-31T00:00:00', '2011-06-30T00:00:00', '2011-07-31T00:00:00',...

Here's the distribution of daily questions on Stack Overflow for Pandas (monthly probably would have been a little more informative):

Daily

Let's call Mcflyin for a rolling sum on a seven-day window. It will resample to the given freq, then apply the window to the result:

freq = {'D': 'Weekly Rolling'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data), 'window': 7}
r = requests.post('http://127.0.0.1:5000/rolling_sum', data=sends)
response = r.json

Rolling

Let's look at the total questions asked by day:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

dailysum

and daily means:

sends = {'data': json.dumps(data), 'how': json.dumps('mean')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

dailymean

The same for hourly:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/hourly', data=sends)
response = r.json

dailymean

Finally, we can look at hourly by day-of-week:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily_hours', data=sends)
response = r.json

hourdow

Live demo here

Dependencies

Pandas, Numpy, Requests, Flask

How did you make those colorful graphs?

Vincent and Bearcart

Status

Lots of stuff that could be better- error handling on the requests, probably better handling of weird timestamps, etc. This is just a small demo of how powerful Python can be for building a statistics backend with relatively few lines of code.

If I want to write a front-end in a different language, can I put it in the examples folder?

Yes! PR's welcome.

Owner
Rob Story
Rob Story
Plot toolbox based on Matplotlib, simple and elegant.

Elegant-Plot Plot toolbox based on Matplotlib, simple and elegant. 绘制效果 绘制过程 数据准备 每种图标类型的目录下有data.csv文件,依据样例数据填入自己的数据。

3 Jul 15, 2022
Data visualization electromagnetic spectrum

Datenvisualisierung-Elektromagnetischen-Spektrum Anhand des Moduls matplotlib sollen die Daten des elektromagnetischen Spektrums dargestellt werden. D

Pulsar 1 Sep 01, 2022
Visualization Data Drug in thailand during 2014 to 2020

Visualization Data Drug in thailand during 2014 to 2020 Data sorce from ข้อมูลเปิดภาครัฐ สำนักงาน ป.ป.ส Inttroducing program Using tkinter module for

Narongkorn 1 Jan 05, 2022
Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly

Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly Problem: 2 peloton users were looking for a way to track their metri

9 Jul 22, 2022
Python Package for CanvasXpress JS Visualization Tools

CanvasXpress Python Library About CanvasXpress for Python CanvasXpress was developed as the core visualization component for bioinformatics and system

Dr. Todd C. Brett 5 Nov 07, 2022
The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based metabolomics.

MINT (Metabolomics Integrator) The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based m

Sören Wacker 0 May 04, 2022
GDSHelpers is an open-source package for automatized pattern generation for nano-structuring.

GDSHelpers GDSHelpers in an open-source package for automatized pattern generation for nano-structuring. It allows exporting the pattern in the GDSII-

Helge Gehring 76 Dec 16, 2022
Jupyter notebook and datasets from the pandas Q&A video series

Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note

Kevin Markham 2k Jan 05, 2023
A tool for creating Toontown-style nametags in Panda3D

Toontown-Nametag Toontown-Nametag is a tool for creating Toontown Online/Toontown Rewritten-style nametags in Panda3D. It contains a function, createN

BoggoTV 2 Dec 23, 2021
A data visualization curriculum of interactive notebooks.

A data visualization curriculum of interactive notebooks, using Vega-Lite and Altair. This repository contains a series of Python-based Jupyter notebooks.

UW Interactive Data Lab 1.2k Dec 30, 2022
This is a sorting visualizer made with Tkinter.

Sorting-Visualizer This is a sorting visualizer made with Tkinter. Make sure you've installed tkinter in your system to use this visualizer pip instal

Vishal Choubey 7 Jul 06, 2022
A site that displays up to date COVID-19 stats, powered by fastpages.

https://covid19dashboards.com This project was built with fastpages Background This project showcases how you can use fastpages to create a static das

GitHub 1.6k Jan 07, 2023
Dimensionality reduction in very large datasets using Siamese Networks

ivis Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets. Ivis

beringresearch 284 Jan 01, 2023
Tools for exploratory data analysis in Python

Dora Exploratory data analysis toolkit for Python. Contents Summary Setup Usage Reading Data & Configuration Cleaning Feature Selection & Extraction V

Nathan Epstein 599 Dec 25, 2022
Python histogram library - histograms as updateable, fully semantic objects with visualization tools. [P]ython [HYST]ograms.

physt P(i/y)thon h(i/y)stograms. Inspired (and based on) numpy.histogram, but designed for humans(TM) on steroids(TM). The goal is to unify different

Jan Pipek 120 Dec 08, 2022
Fractals plotted on MatPlotLib in Python.

About The Project Learning more about fractals through the process of visualization. Built With Matplotlib Numpy License This project is licensed unde

Akeel Ather Medina 2 Aug 30, 2022
Browse Dash docsets inside emacs

Helm Dash What's it This package uses Dash docsets inside emacs to browse documentation. Here's an article explaining the basic usage of it. It doesn'

504 Dec 15, 2022
The visual framework is designed on the idea of module and implemented by mixin method

Visual Framework The visual framework is designed on the idea of module and implemented by mixin method. Its biggest feature is the mixins module whic

LEFTeyes 9 Sep 19, 2022
Small project to recursively calculate and plot each successive order of the Hilbert Curve

hilbert-curve Small project to recursively calculate and plot each successive order of the Hilbert Curve. After watching 3Blue1Brown's video on Hilber

Stefan Mejlgaard 2 Nov 15, 2021
EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs.

EPViz (EEG Prediction Visualizer) EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs. A lig

Jeff 2 Oct 19, 2022