On Anytime Learning At Macroscale

Learning from sequential data dumps

(key) Requirements

Python 3.7
PyTorch 1.9.0
Hydra 1.1.0 (pip install hydra-core hydra-submitit-launcher)

Structure

├── crlapi           
  ├── benchmark.py    # Creates the data stream, feeds it to the model and evaluates it
  ├── core.py         # Abstract classes for 
  ├── logger.py   
  ├── sl
    ├── architectures
      ├── ...         # NN architectures used in this project
    ├── clmodels
      ├── ...         # Models (e.g. Single, gEns, ..., )
    ├── streams
      ├── ...         # CIFAR and MNIST stream implementations

Running Experiments

To run an experiment, call the dataset-specific run file and pass it the configuration for the run. The configurations are in the parent directory (../configs), organized as follows:

    ├── configs
        ├── mnist
           ├── run.py                 # run file
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ├── test_finetune_mlp.yaml # This is the "Single Model"
           ... 
        ├── cifar
           ├── run.py                 # run file
           ├── test_finetune_vgg.yaml # This is the "Single Model"
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ...

For example, to launch an MNIST gMoE run, use the following command from the directory just above this one (cd ..):

PYTHONPATH=./ python configs/mnist/run.py -cn test_usage_gmoe n_megabatches=2 replay=1 clmodel.max_epochs=200
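
The command above follows the usual Hydra pattern: -cn (short for --config-name) selects the YAML file, and each key=value pair after it overrides one config entry. When launching several runs from a script, the argument list can be assembled programmatically. The following is a minimal sketch; build_run_command is a hypothetical helper written for this illustration and is not part of crlapi.

```python
import shlex
from typing import Dict, List, Union


def build_run_command(
    dataset: str,
    config_name: str,
    overrides: Dict[str, Union[int, str]],
) -> List[str]:
    """Assemble the argument list for a dataset-specific run file.

    Hypothetical helper: it only mirrors the layout of the command shown
    in this README (configs/<dataset>/run.py -cn <config> key=value ...).
    """
    cmd = ["python", f"configs/{dataset}/run.py", "-cn", config_name]
    # Hydra overrides are plain key=value tokens appended to the command line.
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_run_command(
    "mnist",
    "test_usage_gmoe",
    {"n_megabatches": 2, "replay": 1, "clmodel.max_epochs": 200},
)
# Print the command in a form that can be pasted into a shell.
print("PYTHONPATH=./ " + " ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to subprocess.run together with an environment in which PYTHONPATH is set to ./, which avoids shell quoting issues.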

Important arguments

n_megabatches : number of megabatches the data stream is split into. n_megabatches=1 corresponds to regular training on the full dataset
replay : whether to use replay or not
clmodel.init_from_scratch : whether to reinitialize the model at every megabatch (MB). Should only be used when replay=1
device : cuda or cpu, depending on your hardware
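
To make the interaction between n_megabatches and replay concrete, here is a minimal sketch of which samples a model would train on at each step. It rests on two assumptions that this README does not confirm: that megabatches are contiguous, near-equal slices of the dataset, and that replay means training on every megabatch seen so far. The function names are hypothetical and do not reflect the actual crlapi stream implementation.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_megabatches(data: Sequence[T], n_megabatches: int) -> List[List[T]]:
    """Split data into n_megabatches contiguous chunks of near-equal size."""
    if n_megabatches < 1:
        raise ValueError("n_megabatches must be >= 1")
    base, extra = divmod(len(data), n_megabatches)
    chunks: List[List[T]] = []
    start = 0
    for i in range(n_megabatches):
        # The first `extra` chunks receive one additional sample.
        size = base + (1 if i < extra else 0)
        chunks.append(list(data[start:start + size]))
        start += size
    return chunks


def training_sets(data: Sequence[T], n_megabatches: int, replay: bool) -> Iterator[List[T]]:
    """Yield the samples available for training at each megabatch step."""
    seen: List[T] = []
    for megabatch in split_into_megabatches(data, n_megabatches):
        seen.extend(megabatch)
        # Assumption: with replay, all data seen so far is reused;
        # without replay, only the newest megabatch is available.
        yield list(seen) if replay else megabatch


data = list(range(6))
print(list(training_sets(data, n_megabatches=3, replay=False)))
print(list(training_sets(data, n_megabatches=3, replay=True)))
```

Under these assumptions, reinitializing the model at every megabatch without replay would leave it trained on the newest slice only, which is one plausible reading of why clmodel.init_from_scratch is recommended only together with replay=1.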

License

alma is released under the MIT license. See LICENSE for additional details about it. See also our Terms of Use and Privacy Policy.

Owner

Meta Research
