PyTorch implementation HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections

Related tags

Deep LearningHoroPCA
Overview

HoroPCA

This code is the official PyTorch implementation of the ICML 2021 paper:

HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections
Ines Chami*, Albert Gu*, Dat Nguyen*, Christopher Ré
Stanford University
Paper: https://arxiv.org/abs/2106.03306

HoroPCA

Abstract. This paper studies Principal Component Analysis (PCA) for data lying in hyperbolic spaces. Given directions, PCA relies on: (1) a parameterization of subspaces spanned by these directions, (2) a method of projection onto subspaces that preserves information in these directions, and (3) an objective to optimize, namely the variance explained by projections. We generalize each of these concepts to the hyperbolic space and propose HoroPCA, a method for hyperbolic dimensionality reduction. By focusing on the core problem of extracting principal directions, HoroPCA theoretically better preserves information in the original data such as distances, compared to previous generalizations of PCA. Empirically, we validate that HoroPCA outperforms existing dimensionality reduction methods, significantly reducing error in distance preservation. As a data whitening method, it improves downstream classification by up to 3.9% compared to methods that don’t use whitening. Finally, we show that HoroPCA can be used to visualize hyperbolic data in two dimensions.

The code has an implementation of the HoroPCA method, as well as other methods for dimensionality reduction on manifolds, such as Principal Geodesic Analysis and tangent Principal Component Analysis.

Installation

This code was tested on Python3.7 and Pytorch 1.8.1. Start by installing the requirements:

pip install -r requirements.txt

Usage

Main script

Run hyperbolic dimensionality reduction experiments using the main.py script.

python main.py --help

optional arguments:
  -h, --help            show this help message and exit
  --dataset {smalltree,phylo-tree,bio-diseasome,ca-CSphd}
                        which datasets to use
  --model {pca,tpca,pga,bsa,hmds,horopca}
                        which dimensionality reduction method to use
  --metrics METRICS [METRICS ...]
                        which metrics to use
  --dim DIM             input embedding dimension to use
  --n-components N_COMPONENTS
                        number of principal components
  --lr LR               learning rate to use for optimization-based methods
  --n-runs N_RUNS       number of runs for optimization-based methods
  --use-sarkar          use sarkar to embed the graphs
  --sarkar-scale SARKAR_SCALE
                        scale to use for embeddings computed with Sarkar's
                        construction

Examples

1. Run HoroPCA on the smalltree dataset:

python main.py --dataset smalltree --model horopca --dim 10 --n-components 2

Output:

distortion: 	0.19 +- 0.00
frechet_var: 	7.15 +- 0.00

2. Run Euclidean PCA on the smalltree dataset:

python main.py --dataset smalltree --model pca --dim 10 --n-components 2

Output:

distortion: 	0.84 +- 0.00
frechet_var:    0.34 +- 0.00

Datasets

The possible dataset choices in this repo are {smalltree,phylo-tree,bio-diseasome,ca-CSphd}. To add a new dataset, add the corresponding edge list and embedding file in the data/ folder.

Citation

If you use this codebase, or otherwise found our work valuable, please cite:

@article{chami2021horopca,
  title={HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections},
  author={Chami, Ines and Gu, Albert and Nguyen, Dat and R{\'e}, Christopher},
  journal={arXiv preprint arXiv:2106.03306},
  year={2021}
}
Owner
HazyResearch
We are a CS research group led by Prof. Chris Ré.
HazyResearch
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022
This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
Create Data & AI apps in 20 lines of code with Shimoku

Install with: pip install shimoku-api-python Start with: from os import getenv import shimoku_api_python.client as Shimoku

Shimoku 5 Nov 07, 2022
Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning

SkFlow has been moved to Tensorflow. SkFlow has been moved to http://github.com/tensorflow/tensorflow into contrib folder specifically located here. T

3.2k Dec 29, 2022
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
Automatically creates genre collections for your Plex media

Plex Auto Genres Plex Auto Genres is a simple script that will add genre collection tags to your media making it much easier to search for genre speci

Shane Israel 63 Dec 31, 2022
Regularizing Generative Adversarial Networks under Limited Data (CVPR 2021)

Regularizing Generative Adversarial Networks under Limited Data [Project Page][Paper] Implementation for our GAN regularization method. The proposed r

Google 148 Nov 18, 2022
OMAMO: orthology-based model organism selection

OMAMO: orthology-based model organism selection OMAMO is a tool that suggests the best model organism to study a biological process based on orthologo

Dessimoz Lab 5 Apr 22, 2022
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way :chestnut:

Squirrel Core Share, load, and transform data in a collaborative, flexible, and efficient way What is Squirrel? Squirrel is a Python library that enab

Merantix Momentum 249 Dec 07, 2022
A PyTorch implementation of SIN: Superpixel Interpolation Network

SIN: Superpixel Interpolation Network This is is a PyTorch implementation of the superpixel segmentation network introduced in our PRICAI-2021 paper:

6 Sep 28, 2022
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

MixText This repo contains codes for the following paper: Jiaao Chen, Zichao Yang, Diyi Yang: MixText: Linguistically-Informed Interpolation of Hidden

GT-SALT 309 Dec 12, 2022
Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

The Apache Software Foundation 34.7k Jan 04, 2023
MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding

202 Nov 18, 2022
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu

ZJUNLP 68 Dec 28, 2022
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 04, 2023
This codebase proposes modular light python and pytorch implementations of several LiDAR Odometry methods

pyLiDAR-SLAM This codebase proposes modular light python and pytorch implementations of several LiDAR Odometry methods, which can easily be evaluated

Kitware, Inc. 208 Dec 16, 2022
[ICCV'21] Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment

CKDN The official implementation of the ICCV2021 paper "Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment" O

Multimedia Research 50 Dec 13, 2022
PASSL包含 SimCLR,MoCo,BYOL,CLIP等基于对比学习的图像自监督算法以及 Vision-Transformer,Swin-Transformer,BEiT,CVT,T2T,MLP_Mixer等视觉Transformer算法

PASSL Introduction PASSL is a Paddle based vision library for state-of-the-art Self-Supervised Learning research with PaddlePaddle. PASSL aims to acce

186 Dec 29, 2022
Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Adaptive Task-Relational Context (ATRC) This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense

David Brüggemann 35 Dec 05, 2022
Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Reinforcement Learning with Learned Fourier Features State-space Soft Actor-Critic Experiments Move to the state-SAC-LFF repository. cd state-SAC-LFF

Alex Li 10 Nov 11, 2022