This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Overview

Wide-Networks

This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameterizations of NNs as defined by (Yang & Hu 2021). Although an equivalent description can be given using only ac-parameterizations, we keep the 3 scales (a, b and c) in the code to allow more flexibility depending on how we want to approach the problem of dealing with infinitely wide NNs.

Structure of the code

The BaseModel class

All the code related to neural networks is in the directory pytorch. The different models we have implemented are in this directory along with the base class found in the file base_model.py which implements the generic attributes and methods all our NNs classes will share.

The BaseModel class inherits from the Pytorch Lightning module, and essentially defines the necessary attributes for any NN to work properly, namely the architecture (which is defined in the _build_model() method), the activation function (we consider the same activation function at each layer), the loss function, the optimizer and the initializer for the parameters of the network.

Optionally, the BaseModel class can define attributes for the normalization (e.g. BatchNorm, LayerNorm, etc) and the scheduler, and any of the aforementioned attributes (optional or not) can be customized depending on the needs (see examples for the scheduler of ipllr and the initializer of abc_param).

The ModelConfig class

All the hyper-parameters which define the model (depth, width, activation function name, loss name, optimizer name, etc) have to be passed as argument to _init_() as an object of the class ModelConfig (pytorch/configs/model.py). This class reads from a yaml config file which defines all the necessary objects for a NN (see examples in pytorch/configs). Essentially, the class ModelConfig is here so that one only has to set the yaml config file properly and then the attributes are correctly populated in BaseModel via the class ModelConfig.

abc-parameterizations

The code for abc-parameterizations (Yang & Hu 2021) can be found in pytorch/abc_params. There we define the base class for abc-parameterizations, mainly setting the layer, init and lr scales from the values of a,b,c, as well as defining the initial parameters through Gaussians of appropriate variance depending on the value of b and the activation function.

Everything that is architecture specific (fully-connected, conv, residual, etc) is left out of this base class and has to be implemented in the _build_model() method of the child class (see examples in pytorch/abc_params/fully_connected). We also define there the base classes for the ntk, muP (Yang & Hu 2021), ip and ipllr parameterizations, and there fully-connected implementations in pytorch/abc_params/fully_connected.

Experiment runs

Setup

Before running any experiment, make sure you first install all the necessary packages:

pip3 install -r requirements.txt

You can optionally create a virtual environment through

python3 -m venv your_env_dir

then activate it with

source your_env_dir/bin/activate

and then install the requirements once the environment is activated. Now, if you haven't installed the wide-networks library in site-packages, before running the command for your experiment, make sure you first add the wide-networks library to the PYTHONPATH by running the command

export PYTHONPATH=$PYTHONPATH:"$PWD"

from the root directory (wide-networks/.) of where the wide-networks library is located.

Python jobs

We define python jobs which can be run with arguments from the command line in the directory jobs. Mainly, those jobs launch a training / val / test pipeline for a given model using the Lightning module, and the results are collected in a dictionary which is saved to a pickle file a the end of training for later examination. Additionally, metrics are logged in TensorBoard and can be visualized during training with the command

tensorboard --logdir=`your_experiment_dir`

We have written jobs to launch experiments on MNIST and CIFAR-10 with the fully connected version of different models such as muP (Yang & Hu 2021), IP-LLR, Naive-IP which can be found in jobs/abc_parameterizations. Arguments can be passed to those Python scripts through the command line, but they are optional and the default values will be used if the parameters of the script are not manually set. For example, the command

python3 jobs/abc_parameterizations/fc_muP_run.py --activation="relu" --n_steps=600 --dataset="mnist"

will launch a training / val / test pipeline with ReLU as the activation function, 600 SGD steps and the MNIST dataset. The other parameters of the run (e.g. the base learning rate and batch size) will have their default values. The jobs will automatically create a directory (and potentially subdirectories) for the experiment and save there the python logs, the tensorboard events and the results dictionary saved to a pickle file as well as the checkpoints saved for the network.

Visualizing results

To visualize the results after training for a given experiment, one can launch the notebook experiments-results.ipynb located in pytorch/notebooks/training/abc_parameterizations, and simply change the arguments in the "Set variables" cell to load the results from the corresponding experiment. Then running all the cells will produce (and save) some figures related to the training phase (e.g. loss vs. steps).

Owner
Karl Hajjar
PhD student at Laboratoire de Mathématiques d'Orsay
Karl Hajjar
Implementation of momentum^2 teacher

Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning Requirements All experiments are done with python3.6, torch

jemmy li 121 Sep 26, 2022
Use VITS and Opencpop to develop singing voice synthesis; Maybe it will VISinger.

Init Use VITS and Opencpop to develop singing voice synthesis; Maybe it will VISinger. 本项目基于 https://github.com/jaywalnut310/vits https://github.com/S

AmorTX 107 Dec 23, 2022
In-place Parallel Super Scalar Samplesort (IPS⁴o)

In-place Parallel Super Scalar Samplesort (IPS⁴o) This is the implementation of the algorithm IPS⁴o presented in the paper Engineering In-place (Share

82 Dec 22, 2022
[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

GenForce: May Generative Force Be with You 148 Dec 09, 2022
SEC'21: Sparse Bitmap Compression for Memory-Efficient Training onthe Edge

Training Deep Learning Models on The Edge Training on the Edge enables continuous learning from new data for deployed neural networks on memory-constr

Brown University Scale Lab 4 Nov 18, 2022
Code for HodgeNet: Learning Spectral Geometry on Triangle Meshes, in SIGGRAPH 2021.

HodgeNet | Webpage | Paper | Video HodgeNet: Learning Spectral Geometry on Triangle Meshes Dmitriy Smirnov, Justin Solomon SIGGRAPH 2021 Set-up To ins

Dima Smirnov 61 Nov 27, 2022
A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising (CVPR 2020 Oral & TPAMI 2021)

ELD The implementation of CVPR 2020 (Oral) paper "A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising" and its journal (TPAMI) v

Kaixuan Wei 359 Jan 01, 2023
StellarGraph - Machine Learning on Graphs

StellarGraph Machine Learning Library StellarGraph is a Python library for machine learning on graphs and networks. Table of Contents Introduction Get

S T E L L A R 2.6k Jan 05, 2023
Code for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"

Triple-cooperative Video Shadow Detection Code and dataset for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"[arXiv link] [official l

Zhihao Chen 24 Oct 04, 2022
Learning Representations that Support Robust Transfer of Predictors

Transfer Risk Minimization (TRM) Code for Learning Representations that Support Robust Transfer of Predictors Prepare the Datasets Preprocess the Scen

Yilun Xu 15 Dec 07, 2022
Optimizing Value-at-Risk and Conditional Value-at-Risk of Black Box Functions with Lacing Values (LV)

BayesOpt-LV Optimizing Value-at-Risk and Conditional Value-at-Risk of Black Box Functions with Lacing Values (LV) About This repository contains the s

1 Nov 11, 2021
Remote sensing change detection using PaddlePaddle

Change Detection Laboratory Developing and benchmarking deep learning-based remo

Lin Manhui 15 Sep 23, 2022
Open Source Differentiable Computer Vision Library for PyTorch

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer

kornia 7.6k Jan 04, 2023
EMNLP 2020 - Summarizing Text on Any Aspects

Summarizing Text on Any Aspects This repo contains preliminary code of the following paper: Summarizing Text on Any Aspects: A Knowledge-Informed Weak

Bowen Tan 35 Nov 14, 2022
ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 09, 2022
PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. PantheonRL supports cross-play, fine-tuning, ad-hoc coordination, and more.

Stanford Intelligent and Interactive Autonomous Systems Group 57 Dec 28, 2022
DRIFT is a tool for Diachronic Analysis of Scientific Literature.

About DRIFT is a tool for Diachronic Analysis of Scientific Literature. The application offers user-friendly and customizable utilities for two modes:

Rajaswa Patil 108 Dec 12, 2022
Implementation for Paper "Inverting Generative Adversarial Renderer for Face Reconstruction"

StyleGAR TODO: add arxiv link Implementation of Inverting Generative Adversarial Renderer for Face Reconstruction TODO: for test Currently, some model

155 Oct 27, 2022
Code for the paper "Improved Techniques for Training GANs"

Status: Archive (code is provided as-is, no updates expected) improved-gan code for the paper "Improved Techniques for Training GANs" MNIST, SVHN, CIF

OpenAI 2.2k Jan 01, 2023
Learning Energy-Based Models by Diffusion Recovery Likelihood

Learning Energy-Based Models by Diffusion Recovery Likelihood Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma Paper: https://arxiv.o

Ruiqi Gao 41 Nov 22, 2022