Statistical tests for the sequential locality of graphs

Last update: Nov 23, 2021

Related tags

Overview

Statistical tests for the sequential locality of graphs

You can assess the statistical significance of the sequential locality of an adjacency matrix (graph + vertex sequence) using sequential_locality.py.

This file also includes ORGM.py that generates an instance of the ordered random graph model (ORGM) [1] and spectral.py that yields an optimized vertex sequence based on the spectral ordering algorithms.

Please find Ref. [1] for the details of the statistical tests.

sequential_locality.py

sequential_locality.py executes statistical tests with respect to the sequential locality.

Simple example

import numpy as np
import igraph
import sequential_locality as seq

s = seq.SequentialLocality(
		g = igraph.Graph.Erdos_Renyi(n=20,m=80), 
		sequence = np.arange(20)
		)
s.H1()

{'H1': 1.0375,
 'z1': 0.5123475382979811,
 'H1 p-value (ER/ORGM)': 0.6957960998835012,
 'H1 p-value (random)': 0.7438939644617626,
 'bandwidth_opt': None}

Please find Demo.ipynb for more examples.

SequentialLocality

This is a class to be instantiated to assess the sequential locality.

Input parameters

Either g or edgelist must be provided as an input.

Parameter	Value	Default	Description
g	graph	None	Graph (undirected, unweighted, no self-loops) in igraph or graph-tool.
edgelist	list of tuples	None	Edgelist as a list of tuples.
sequence	1-dim array	None	Array (list or ndarray) indicating the vertex ordering. If provided, the vertex indices in the graph will be replaced based on `sequence` . If `sequence` is `None`, the intrinsic vertex indices in the graph or edgelist will be used as the `sequence` .
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format
simple	Boolean	True	If `True`, the graph is assumed to be a simple graph, otherwise the graph is assumed to be a multigraph.

H1

This is a method that returns H₁ and z₁ test statistics and p-values of the input data.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'analytical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.
in_envelope	Boolean	`False`	If `False`, the p-value based on the ER model will be returned. If `True`, the p-value based on the ORGM will be returned. That is, the matrix elements outside of the bandwidth `r` will be ignored.
r	Integer	None	An integer between `1` and `N-1`. If provided, `r` will be used as the bandwidth when `in_envelope=True`.

Output parameters

Parameter	Description
H1	H₁ test statistic of the input data (graph & vertex sequence)
z1	z₁ test statistic of the input data
H1 p-value (ER/ORGM)	p-value under the null hypothesis of the ER random graph (when `in_envelope=False`) or the ORGM (when `in_envelope=True`).
H1 p-value (random)	p-value under the null hypothesis of random sequences
bandwidth_opt	Maximum likelihood estimate (MLE) of the bandwidth (when `r=None` in the input) or the input bandwidth `r`

HG

This is a method that returns H_G and z_G test statistics and p-values of the input data.

There is no in_envelope option for the test based on H_G.
random_sequence = 'analytical' can be computationally demanding.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'empirical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.

Output parameters

Parameter	Description
HG	H_G test statistic of the input data (graph & vertex sequence)
zG	z_G test statistic of the input data
HG p-value (ER)	p-value under the null hypothesis of the ER random graph.
HG p-value (random)	p-value under the null hypothesis of random sequences

ORGM.py

ORGM.py is a random graph generator. It generates an ORGM [1] instance that has a desired strength of sequentially lcoal structure.

Simple example

import ORGM as orgm

edgelist, valid = orgm.ORGM(
	N=20, M=80, bandwidth=10, epsilon=0.25
	)

Input parameters

Parameter	Value	Default	Description
N	Integer	required input	Number of vertices
M	Integer	required input	Number of edges
bandwidth	Integer	required input	Bandwidth of the ORGM
epsilon	Float (in [0,1])	required input	Density ratio between the adjacency matrix elements inside & outside of the envelope. When `epsilon=1`, the ORGM becomes a uniform model. When `epsilon=0`, the nonzero matrix elements are strictly confined in the envelope.
simple	Boolean	`True`	If `True`, the graph is constrained to be simple. If `False`, the graph is allowed to have multiedges.

spectral.py

spectral.py is an implementation of the spectral ordering [2].

Simple example

import graph_tool.all as gt
import spectral

g_real = gt.collection.ns['karate/77']
inferred_sequence = spectral.spectral_sequence(
	g= g_real, 
	format='graph-tool'
	)

Parameter	Value	Default	Description
g	graph	required input	graph (undirected, unweighted, no self-loops) in igraph or graph-tool
normalized	Boolean	`True`	Normalized Laplacian (`True`) vs unnormalized (combinatorial) Laplacian (`False`)
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format

Citation

Please use Ref. [1] for the citation of the present code.

References

[1] Tatsuro Kawamoto and Teruyoshi Kobayashi, "Sequential locality of graphs and its hypothesis testing," arXiv:2111.11267 (2021).
[2] Chris Ding and Xiaofeng He, "Linearized Cluster Assignment via Spectral Ordering," Proceedings of the Twenty-First International Conference on Machine Learning (ICML) (2004).

Statistical tests for the sequential locality of graphs

Related tags

Overview

Statistical tests for the sequential locality of graphs

sequential_locality.py

Simple example

SequentialLocality

Input parameters

H1

Input parameters

Output parameters

HG

Input parameters

Output parameters

ORGM.py

Simple example

Input parameters

spectral.py

Simple example

Citation

References

Owner

Python version of the Playwright testing and automation library.

Mypy static type checker plugin for Pytest

HTTP traffic mocking and testing made easy in Python

WIP SAT benchmarking tooling, written with only my personal use in mind.

Test scripts etc. for experimental rollup testing

Pytest support for asyncio.

A simple python script that uses selenium(chrome web driver),pyautogui,time and schedule modules to enter google meets automatically

Docker-based integration tests

The source code and slide for my talk about the subject: unittesing in python

The async ready version of the AniManga library created by centipede000.

A configurable set of panels that display various debug information about the current request/response.

Tattoo - System for automating the Gentoo arch testing process

1st Solution to QQ Browser 2021 AIAC Track 2

Run ISP speed tests and save results

Akulaku Create NewProduct Automation using Selenium Python

automate the procedure of 403 response code bypass

Code for "SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism"

BDD library for the py.test runner

Hamcrest matchers for Python

A wrapper for webdriver that is a jumping off point for web automation.