A Graph Learning library for Humans

Overview

A Graph Learning library for Humans

These novel algorithms include but are not limited to:

  • A graph construction and graph searching class can be found here (NodeGraph). It was developed and invented as a faster alternative for hierarchical DAG construction and searching.
  • A fast DBSCAN method utilizing my connectivity code as invented during my PhD.
  • A NLP pattern matching algorithm useful for sequence alignment clustering.
  • High dimensional alignment code for aligning models to data.
  • An SVD based variant of the Distance Geometry algorithm. For going from relative to absolute coordinates.

License DOI Downloads

Visit the active code via : https://github.com/richardtjornhammar/graphtastic

Pip installation with :

pip install graphtastic

Version controlled installation of the Graphtastic library

The Graphtastic library

In order to run these code snippets we recommend that you download the nix package manager. Nix package manager links from Februari 2022:

https://nixos.org/download.html

$ curl -L https://nixos.org/nix/install | sh

If you cannot install it using your Wintendo then please consider installing Windows Subsystem for Linux first:

https://docs.microsoft.com/en-us/windows/wsl/install-win10

In order to run the code in this notebook you must enter a sensible working environment. Don't worry! We have created one for you. It's version controlled against python3.9 (and experimental python3.10 support) and you can get the file here:

https://github.com/richardtjornhammar/graphtastic/blob/master/env/env39.nix

Since you have installed Nix as well as WSL, or use a Linux (NixOS) or bsd like system, you should be able to execute the following command in a termnial:

$ nix-shell env39.nix

Now you should be able to start your jupyter notebook locally:

$ jupyter-notebook graphhaxxor.ipynb

and that's it.

EXAMPLE 0

Running

import graphtastic.graphs as gg
import graphtastic.clustering as gl
import graphtastic.fit as gf
import graphtastic.convert as gc

Should work if the install was succesful

Example 1 : Absolute and relative coordinates

In this example, we will use the SVD based distance geometry method to go between absolute coordinates, relative coordinate distances and back to ordered absolute coordinates. Absolute coordinates are float values describing the position of something in space. If you have several of these then the same information can be conveyed via the pairwise distance graph. Going from absolute coordinates to pairwise distances is simple and only requires you to calculate all the pairwise distances between your absolute coordinates. Going back to mutually orthogonal ordered coordinates from the pariwise distances is trickier, but a solved problem. The distance geometry can be obtained with SVD and it is implemented in the graphtastic.fit module under the name distance_matrix_to_absolute_coordinates. We start by defining coordinates afterwhich we can calculate the pair distance matrix and transforming it back by using the code below

import numpy as np

coordinates = np.array([[-23.7100 ,  24.1000 ,  85.4400],
  [-22.5600 ,  23.7600 ,  85.6500],
  [-21.5500 ,  24.6200 ,  85.3800],
  [-22.2600 ,  22.4200 ,  86.1900],
  [-23.2900 ,  21.5300 ,  86.4800],
  [-20.9300 ,  22.0300 ,  86.4300],
  [-20.7100 ,  20.7600 ,  86.9400],
  [-21.7900 ,  19.9300 ,  87.1900],
  [-23.0300 ,  20.3300 ,  86.9600],
  [-24.1300 ,  19.4200 ,  87.2500],
  [-23.7400 ,  18.0500 ,  87.0000],
  [-24.4900 ,  19.4600 ,  88.7500],
  [-23.3700 ,  19.8900 ,  89.5200],
  [-24.8500 ,  18.0000 ,  89.0900],
  [-23.9600 ,  17.4800 ,  90.0800],
  [-24.6600 ,  17.2400 ,  87.7500],
  [-24.0800 ,  15.8500 ,  88.0100],
  [-23.9600 ,  15.1600 ,  86.7600],
  [-23.3400 ,  13.7100 ,  87.1000],
  [-21.9600 ,  13.8700 ,  87.6300],
  [-24.1800 ,  13.0300 ,  88.1100],
  [-23.2900 ,  12.8200 ,  85.7600],
  [-23.1900 ,  11.2800 ,  86.2200],
  [-21.8100 ,  11.0000 ,  86.7000],
  [-24.1500 ,  11.0300 ,  87.3200],
  [-23.5300 ,  10.3200 ,  84.9800],
  [-23.5400 ,   8.9800 ,  85.4800],
  [-23.8600 ,   8.0100 ,  84.3400],
  [-23.9800 ,   6.5760 ,  84.8900],
  [-23.2800 ,   6.4460 ,  86.1300],
  [-23.3000 ,   5.7330 ,  83.7800],
  [-22.7300 ,   4.5360 ,  84.3100],
  [-22.2000 ,   6.7130 ,  83.3000],
  [-22.7900 ,   8.0170 ,  83.3800],
  [-21.8100 ,   6.4120 ,  81.9200],
  [-20.8500 ,   5.5220 ,  81.5200],
  [-20.8300 ,   5.5670 ,  80.1200],
  [-21.7700 ,   6.4720 ,  79.7400],
  [-22.3400 ,   6.9680 ,  80.8000],
  [-20.0100 ,   4.6970 ,  82.1500],
  [-19.1800 ,   3.9390 ,  81.4700] ]);

if __name__=='__main__':

    import graphtastic.fit as gf

    distance_matrix = gf.absolute_coordinates_to_distance_matrix( coordinates )
    ordered_coordinates = gf.distance_matrix_to_absolute_coordinates( distance_matrix , n_dimensions=3 )

    print ( ordered_coordinates )

You will notice that the largest variation is now aligned with the X axis, the second most variation aligned with the Y axis and the third most, aligned with the Z axis while the graph topology remained unchanged.

Example 2 : Deterministic DBSCAN

DBSCAN is a clustering algorithm that can be seen as a way of rejecting points, from any cluster, that are positioned in low dense regions of a point cloud. This introduces holes and may result in a larger segment, that would otherwise be connected via a non dense link to become disconnected and form two segments, or clusters. The rejection criterion is simple. The central concern is to evaluate a distance matrix with an applied cutoff this turns the distances into true or false values depending on if a pair distance between point i and j is within the distance cutoff. This new binary Neighbour matrix tells you wether or not two points are neighbours (including itself). The DBSCAN criterion states that a point is not part of any cluster if it has fewer than minPts neighbors. Once you've calculated the distance matrix you can immediately evaluate the number of neighbors each point has and the rejection criterion, via . If the rejection vector R value of a point is True then all the pairwise distances in the distance matrix of that point is set to a value larger than epsilon. This ensures that a distance matrix search will reject those points as neighbours of any other for the choosen epsilon. By tracing out all points that are neighbors and assessing the connectivity (search for connectivity) you can find all the clusters.

import numpy as np
from graphtastic.clustering import dbscan, reformat_dbscan_results
from graphtastic.fit import absolute_coordinates_to_distance_matrix

N   = 100
N05 = int ( np.floor(0.5*N) )
R   = 0.25*np.random.randn(N).reshape(N05,2) + 1.5
P   = 0.50*np.random.randn(N).reshape(N05,2)

coordinates = np.array([*P,*R])

results = dbscan ( distance_matrix = absolute_coordinates_to_distance_matrix(coordinates,bInvPow=True) , eps=0.45 , minPts=4 )
clusters = reformat_dbscan_results(results)
print ( clusters )

Example 3 : NodeGraph, distance matrix to DAG

Here we demonstrate how to convert the graph coordinates into a hierarchy. The leaf nodes will correspond to the coordinate positions.

import numpy as np

coordinates = np.array([[-23.7100 ,  24.1000 ,  85.4400],
  [-22.5600 ,  23.7600 ,  85.6500],
  [-21.5500 ,  24.6200 ,  85.3800],
  [-22.2600 ,  22.4200 ,  86.1900],
  [-23.2900 ,  21.5300 ,  86.4800],
  [-20.9300 ,  22.0300 ,  86.4300],
  [-20.7100 ,  20.7600 ,  86.9400],
  [-21.7900 ,  19.9300 ,  87.1900],
  [-23.0300 ,  20.3300 ,  86.9600],
  [-24.1300 ,  19.4200 ,  87.2500],
  [-23.7400 ,  18.0500 ,  87.0000],
  [-24.4900 ,  19.4600 ,  88.7500],
  [-23.3700 ,  19.8900 ,  89.5200],
  [-24.8500 ,  18.0000 ,  89.0900],
  [-23.9600 ,  17.4800 ,  90.0800],
  [-24.6600 ,  17.2400 ,  87.7500],
  [-24.0800 ,  15.8500 ,  88.0100],
  [-23.9600 ,  15.1600 ,  86.7600],
  [-23.3400 ,  13.7100 ,  87.1000],
  [-21.9600 ,  13.8700 ,  87.6300],
  [-24.1800 ,  13.0300 ,  88.1100],
  [-23.2900 ,  12.8200 ,  85.7600],
  [-23.1900 ,  11.2800 ,  86.2200],
  [-21.8100 ,  11.0000 ,  86.7000],
  [-24.1500 ,  11.0300 ,  87.3200],
  [-23.5300 ,  10.3200 ,  84.9800],
  [-23.5400 ,   8.9800 ,  85.4800],
  [-23.8600 ,   8.0100 ,  84.3400],
  [-23.9800 ,   6.5760 ,  84.8900],
  [-23.2800 ,   6.4460 ,  86.1300],
  [-23.3000 ,   5.7330 ,  83.7800],
  [-22.7300 ,   4.5360 ,  84.3100],
  [-22.2000 ,   6.7130 ,  83.3000],
  [-22.7900 ,   8.0170 ,  83.3800],
  [-21.8100 ,   6.4120 ,  81.9200],
  [-20.8500 ,   5.5220 ,  81.5200],
  [-20.8300 ,   5.5670 ,  80.1200],
  [-21.7700 ,   6.4720 ,  79.7400],
  [-22.3400 ,   6.9680 ,  80.8000],
  [-20.0100 ,   4.6970 ,  82.1500],
  [-19.1800 ,   3.9390 ,  81.4700] ]);


if __name__=='__main__':

    import graphtastic.graphs as gg
    import graphtastic.fit as gf
    GN = gg.NodeGraph()
    #
    # bInvPow refers to the distance type. If True then R distances are returned
    # instead of R2 (R**2) distances. That is also computing the square root if True
    #
    distm = gf.absolute_coordinates_to_distance_matrix( coordinates , bInvPow=True )
    #
    # Now a Graph DAG is constructed from the pairwise distances
    GN.distance_matrix_to_graph_dag( distm )
    #
    # And write it to a json file so that we may employ JS visualisations
    # such as D3 or other nice packages to view our hierarchy
    GN.write_json( jsonfile='./graph_hierarchy.json' )

Manually updated code backups for this library :

GitLab | https://gitlab.com/richardtjornhammar/graphtastic

CSDN | https://codechina.csdn.net/m0_52121311/graphtastic

You might also like...
Fastest Gephi's ForceAtlas2 graph layout algorithm implemented for Python and NetworkX
Fastest Gephi's ForceAtlas2 graph layout algorithm implemented for Python and NetworkX

ForceAtlas2 for Python A port of Gephi's Force Atlas 2 layout algorithm to Python 2 and Python 3 (with a wrapper for NetworkX and igraph). This is the

🐍PyNode Next allows you to easily create beautiful graph visualisations and animations
🐍PyNode Next allows you to easily create beautiful graph visualisations and animations

PyNode Next A complete rewrite of PyNode for the modern era. Up to five times faster than the original PyNode. PyNode Next allows you to easily create

LabGraph is a a Python-first framework used to build sophisticated research systems with real-time streaming, graph API, and parallelism.
LabGraph is a a Python-first framework used to build sophisticated research systems with real-time streaming, graph API, and parallelism.

LabGraph is a a Python-first framework used to build sophisticated research systems with real-time streaming, graph API, and parallelism.

Automatization of BoxPlot graph usin Python MatPlotLib and Excel

BoxPlotGraphAutomation Automatization of BoxPlot graph usin Python / Excel. This file is an automation of BoxPlot-Graph using python graph library mat

Library for exploring and validating machine learning data

TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig

Library for exploring and validating machine learning data

TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig

Declarative statistical visualization library for Python
Declarative statistical visualization library for Python

Altair http://altair-viz.github.io Altair is a declarative statistical visualization library for Python. With Altair, you can spend more time understa

Plotting library for IPython/Jupyter notebooks
Plotting library for IPython/Jupyter notebooks

bqplot 2-D plotting library for Project Jupyter Introduction bqplot is a 2-D visualization system for Jupyter, based on the constructs of the Grammar

Cartopy - a cartographic python library with matplotlib support
Cartopy - a cartographic python library with matplotlib support

Cartopy is a Python package designed to make drawing maps for data analysis and visualisation easy. Table of contents Overview Get in touch License an

Releases(v0.12.0)
Owner
Richard Tjörnhammar
PhD in Biological physics https://richardtjornhammar.github.io
Richard Tjörnhammar
Extract and visualize information from Gurobi log files

GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp

Gurobi Optimization 56 Nov 17, 2022
A Bokeh project developed for learning and teaching Bokeh interactive plotting!

Bokeh-Python-Visualization A Bokeh project developed for learning and teaching Bokeh interactive plotting! See my medium blog posts about making bokeh

Will Koehrsen 350 Dec 05, 2022
Simple plotting for Python. Python wrapper for D3xter - render charts in the browser with simple Python syntax.

PyDexter Simple plotting for Python. Python wrapper for D3xter - render charts in the browser with simple Python syntax. Setup $ pip install PyDexter

D3xter 31 Mar 06, 2021
Python Data Structures for Humans™.

Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar

Schematics 2.5k Dec 28, 2022
2021 grafana arbitrary file read

2021_grafana_arbitrary_file_read base on pocsuite3 try 40 default plugins of grafana alertlist annolist barchart cloudwatch dashlist elasticsearch gra

ATpiu 5 Nov 09, 2022
DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

1 Jan 03, 2022
OpenStats is a library built on top of streamlit that extracts data from the Github API and shows the main KPIs

Open Stats Discover and share the KPIs of your OpenSource project. OpenStats is a library built on top of streamlit that extracts data from the Github

Pere Miquel Brull 4 Apr 03, 2022
AB-test-analyzer - Python class to perform AB test analysis

AB-test-analyzer Python class to perform AB test analysis Overview This repo con

13 Jul 16, 2022
A tool for creating Toontown-style nametags in Panda3D

Toontown-Nametag Toontown-Nametag is a tool for creating Toontown Online/Toontown Rewritten-style nametags in Panda3D. It contains a function, createN

BoggoTV 2 Dec 23, 2021
Python implementation of the Density Line Chart by Moritz & Fisher.

PyDLC - Density Line Charts with Python Python implementation of the Density Line Chart (Moritz & Fisher, 2018) to visualize large collections of time

Charles L. Bérubé 10 Jan 06, 2023
Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

Spotify 3.2k Jan 01, 2023
Seismic Waveform Inversion Toolbox-1.0

Seismic Waveform Inversion Toolbox (SWIT-1.0)

Haipeng Li 98 Dec 29, 2022
Matplotlib tutorial for beginner

matplotlib is probably the single most used Python package for 2D-graphics. It provides both a very quick way to visualize data from Python and publication-quality figures in many formats. We are goi

Nicolas P. Rougier 2.6k Dec 28, 2022
A curated list of awesome Dash (plotly) resources

Awesome Dash A curated list of awesome Dash (plotly) resources Dash is a productive Python framework for building web applications. Written on top of

Luke Singham 1.7k Dec 26, 2022
A small timeseries transformation API built on Flask and Pandas

#Mcflyin ###A timeseries transformation API built on Pandas and Flask This is a small demo of an API to do timeseries transformations built on Flask a

Rob Story 84 Mar 25, 2022
Numerical methods for ordinary differential equations: Euler, Improved Euler, Runge-Kutta.

Numerical methods Numerical methods for ordinary differential equations are methods used to find numerical approximations to the solutions of ordinary

Aleksey Korshuk 5 Apr 29, 2022
This is a sorting visualizer made with Tkinter.

Sorting-Visualizer This is a sorting visualizer made with Tkinter. Make sure you've installed tkinter in your system to use this visualizer pip instal

Vishal Choubey 7 Jul 06, 2022
Draw tree diagrams from indented text input

Draw tree diagrams This repository contains two very different scripts to produce hierarchical tree diagrams like this one: $ ./classtree.py collectio

Luciano Ramalho 8 Dec 14, 2022
Learn Data Science with focus on adding value with the most efficient tech stack.

DataScienceWithPython Get started with Data Science with Python An engaging journey to become a Data Scientist with Python TL;DR Download all Jupyter

Learn Python with Rune 110 Dec 22, 2022
Cartopy - a cartographic python library with matplotlib support

Cartopy is a Python package designed to make drawing maps for data analysis and visualisation easy. Table of contents Overview Get in touch License an

1.2k Jan 01, 2023