ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Related tags

Deep Learningvivit
Overview

[ 👷 🏗 👷 🏗 Coming soon! Official release with improved docs. Stay tuned. 👷 🏗 👷 🏗 ]

ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Python 3.7+ [tests]

ViViT is a collection of numerical tricks to efficiently access curvature from the generalized Gauss-Newton (GGN) matrix based on its low-rank structure. Provided functionality includes computing

  • GGN eigenvalues
  • GGN eigenpairs (eigenvalues + eigenvector)
  • 1ˢᵗ- and 2ⁿᵈ-order directional derivatives along GGN eigenvectors
  • Newton steps

These operations can also further approximate the GGN to reduce cost via sub-sampling, Monte-Carlo approximation, and block-diagonal approximation.

How does it work? ViViT uses and extends BackPACK for PyTorch. The described functionality is realized through a combination of existing and new BackPACK extensions and hooks into its backpropagation.

Installation

👷 🏗 👷 🏗 The PyPI release is coming soon. 👷 🏗 👷 🏗

For now, you need to install from GitHub via

pip install vivit-for-pytorch@git+https://github.com/f-dangel/vivit.git#egg=vivit-for-pytorch

Examples

👷 🏗 👷 🏗 Coming soon! 👷 🏗 👷 🏗

How to cite

If you are using ViViT, consider citing the paper

@misc{dangel2022vivit,
      title={{ViViT}: Curvature access through the generalized Gauss-Newton's low-rank structure},
      author={Felix Dangel and Lukas Tatzel and Philipp Hennig},
      year={2022},
      eprint={2106.02624},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Comments
  • [ADD] Warn about instabilities if eigenvalues are small

    [ADD] Warn about instabilities if eigenvalues are small

    The directional gradient computation and transformation of the Newton step from Gram space into parameter space require division by the square root of the direction's eigenvalue. This is unstable if the eigenvalue is close to zero.

    opened by f-dangel 1
  • [ADD] Clean `DirectionalDampedNewtonComputation`

    [ADD] Clean `DirectionalDampedNewtonComputation`

    Adds directionally damped Newton step computation with cleaned up API.

    • Fixes a bug in the eigenvalue criterion in the tests. It always picked one more eigenvalue than specified.
    opened by f-dangel 1
  • [DOC] Add NTK example

    [DOC] Add NTK example

    Adds an example inspired by the functorch tutorial on NTKs. It demonstrates how to use vivit to compute empirical NTK matrices and makes a comparison with the functorch implementation.

    opened by f-dangel 1
  • [ADD] Simplify `DirectionalDerivatives` API

    [ADD] Simplify `DirectionalDerivatives` API

    Exotic features, like using different GGNs to compute directions and directional curvatures, as well as full control of which intermediate buffers to keep, have been deprecated in favor of a simpler API.

    • Remove Newton step computation for now as it was internally relying on DirectionalDerivatives
    • Remove many utilities and associated tests from the exotic features
    • Forbid duplicate indices in subsampling
    • Always delete intermediate buffers other than the target quantities
    opened by f-dangel 1
  • [DOC] Set up `sphinx` and RTD

    [DOC] Set up `sphinx` and RTD

    This PR adds a scaffold for the doc at https://vivit.readthedocs.io/en/latest/. Code examples are integrated via sphinx-gallery (I added a preliminary logo). Pull requests are built by the CI.

    To build the docs, run make docs. You need to install the dependencies first, for example using pip install -e .[docs].

    opened by f-dangel 1
  • Calculate Parameter Space Values of GGN Eigenvectors

    Calculate Parameter Space Values of GGN Eigenvectors

    The docs show how to calculate the gram matrix eigenvectors and the paper articulates that to translate from 'gram space' to parameter space we just need to multiply by the 'V' matrix.

    What's the easiest way of implementing this?

    question 
    opened by lk-wq 1
  • Detect loss function's `reduction`, error if unsupported

    Detect loss function's `reduction`, error if unsupported

    For now, the library only supports reduction='mean'. We rely on the user to use this reduction and raise awareness about this point in the documentation. It would be better to automatically have the library detect the reduction and error if it is unsupported.

    This can be done via a hook into BackPACK.

    • [ ] Implement hook that determines the loss function reduction during backpropagation
    • [ ] Integrate the above hook into the *Computation and raise an exception if the reduction is not supported
    • [ ] Remove the comments about supported reductions in the documentation
    enhancement 
    opened by f-dangel 0
Releases(1.0.0)
Owner
Felix Dangel
Machine Learning PhD student at the University of Tübingen and the Max Planck Institute for Intelligent Systems.
Felix Dangel
The fastai book, published as Jupyter Notebooks

English / Spanish / Korean / Chinese / Bengali / Indonesian The fastai book These notebooks cover an introduction to deep learning, fastai, and PyTorc

fast.ai 17k Jan 07, 2023
DC3: A Learning Method for Optimization with Hard Constraints

DC3: A learning method for optimization with hard constraints This repository is by Priya L. Donti, David Rolnick, and J. Zico Kolter and contains the

CMU Locus Lab 57 Dec 26, 2022
Implementation of ViViT: A Video Vision Transformer

ViViT: A Video Vision Transformer Unofficial implementation of ViViT: A Video Vision Transformer. Notes: This is in WIP. Model 2 is implemented, Model

Rishikesh (ऋषिकेश) 297 Jan 06, 2023
Deep learning for Engineers - Physics Informed Deep Learning

SciANN: Neural Networks for Scientific Computations SciANN is a Keras wrapper for scientific computations and physics-informed deep learning. New to S

SciANN 195 Jan 03, 2023
Minimal implementation and experiments of "No-Transaction Band Network: A Neural Network Architecture for Efficient Deep Hedging".

No-Transaction Band Network: A Neural Network Architecture for Efficient Deep Hedging Minimal implementation and experiments of "No-Transaction Band N

19 Jan 03, 2023
SMPL-X: A new joint 3D model of the human body, face and hands together

SMPL-X: A new joint 3D model of the human body, face and hands together [Paper Page] [Paper] [Supp. Mat.] Table of Contents License Description News I

Vassilis Choutas 1k Jan 09, 2023
Code for Paper Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning

Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning (c) Tianyu Han and Daniel Truhn, RWTH Aachen University, 20

Tianyu Han 7 Nov 22, 2022
TLDR; Train custom adaptive filter optimizers without hand tuning or extra labels.

AutoDSP TLDR; Train custom adaptive filter optimizers without hand tuning or extra labels. About Adaptive filtering algorithms are commonplace in sign

Jonah Casebeer 48 Sep 19, 2022
Autonomous Movement from Simultaneous Localization and Mapping

Autonomous Movement from Simultaneous Localization and Mapping About us Built by a group of Clarkson University students with the help from Professor

14 Nov 07, 2022
Code for "Causal autoregressive flows" - AISTATS, 2021

Code for "Causal Autoregressive Flow" This repository contains code to run and reproduce experiments presented in Causal Autoregressive Flows, present

Ricardo Pio Monti 35 Dec 16, 2022
DRIFT is a tool for Diachronic Analysis of Scientific Literature.

About DRIFT is a tool for Diachronic Analysis of Scientific Literature. The application offers user-friendly and customizable utilities for two modes:

Rajaswa Patil 108 Dec 12, 2022
The official implementation of the CVPR2021 paper: Decoupled Dynamic Filter Networks

Decoupled Dynamic Filter Networks This repo is the official implementation of CVPR2021 paper: "Decoupled Dynamic Filter Networks". Introduction DDF is

F.S.Fire 180 Dec 30, 2022
🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

Realcat 270 Jan 07, 2023
A mini lib that implements several useful functions binding to PyTorch in C++.

Torch-gather A mini library that implements several useful functions binding to PyTorch in C++. What does gather do? Why do we need it? When dealing w

maxwellzh 8 Sep 07, 2022
Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX.

snc4onnx Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools 1.

Katsuya Hyodo 8 Oct 13, 2022
An air quality monitoring service with a Raspberry Pi and a SDS011 sensor.

Raspberry Pi Air Quality Monitor A simple air quality monitoring service for the Raspberry Pi. Installation Clone the repository and run the following

rydercalmdown 24 Dec 09, 2022
Code for the Paper "Diffusion Models for Handwriting Generation"

Code for the Paper "Diffusion Models for Handwriting Generation"

62 Dec 21, 2022
[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

Junha Lee 10 Dec 02, 2022
Adabelief-Optimizer - Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

AdaBelief Optimizer NeurIPS 2020 Spotlight, trains fast as Adam, generalizes well as SGD, and is stable to train GANs. Release of package We have rele

Juntang Zhuang 998 Dec 29, 2022
YOLOV4运行在嵌入式设备上

在嵌入式设备上实现YOLO V4 tiny 在嵌入式设备上实现YOLO V4 tiny 目录结构 目录结构 |-- YOLO V4 tiny |-- .gitignore |-- LICENSE |-- README.md |-- test.txt |-- t

Liu-Wei 6 Sep 09, 2021