FLSim is a flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API

Overview

Federated Learning Simulator (FLSim)

Federated Learning Simulator (FLSim) is a flexible, standalone library written in PyTorch that simulates FL settings with a minimal, easy-to-use API. FLSim is domain-agnostic and accommodates many use cases, such as computer vision and natural language text. Currently, FLSim supports cross-device FL, where millions of client devices (e.g. phones) train a model collaboratively.

FLSim is scalable and fast. It supports differential privacy (DP), secure aggregation (secAgg), and a variety of compression techniques.

In FL, a model is trained collaboratively by multiple clients that each have their own local data, while a central server moderates training, e.g. by aggregating model updates from multiple clients.
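
To make the aggregation step concrete, here is a minimal, generic sketch of FedAvg-style weighted averaging of client updates. It is illustrative only and not FLSim's internal implementation; the function name and update format are assumptions.

import torch

def fedavg_aggregate(client_updates, client_weights):
    # client_updates: one dict per client, mapping parameter names to
    # update tensors; client_weights: e.g. each client's example count.
    total = sum(client_weights)
    return {
        name: sum((w / total) * update[name]
                  for update, w in zip(client_updates, client_weights))
        for name in client_updates[0]
    }

# Two toy clients with a single parameter "w", weighted by example count.
updates = [{"w": torch.tensor([1.0])}, {"w": torch.tensor([3.0])}]
print(fedavg_aggregate(updates, [1, 3]))  # {'w': tensor([2.5000])}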

In FLSim, developers only need to define a dataset, model, and metrics reporter. All other aspects of FL training are handled internally by the FLSim core library.


Library Structure

FLSim's core components follow the same semantics as FedAvg. At a high level, the server comprises three main components: a selector, an aggregator, and an optimizer. The selector chooses clients for training, and the aggregator accumulates client updates until a round is complete. Then, the optimizer updates the server model based on the aggregated gradients. The server communicates with the clients via a channel, which can compress the messages exchanged between the server and the clients. Locally, each client consists of a dataset and a local optimizer; this local optimizer can be SGD, FedProx, or a custom PyTorch optimizer.
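
As a schematic illustration of how these components interact in one round, consider the sketch below; the class and method names are hypothetical placeholders for exposition, not FLSim's actual API.

def run_round(server_model, selector, aggregator, server_optimizer,
              channel, clients, users_per_round):
    # 1. The selector picks this round's cohort of clients.
    cohort = selector.select(clients, users_per_round)
    for client in cohort:
        # 2. The channel carries (and may compress) the server model.
        local_model = channel.server_to_client(server_model)
        # 3. The client trains locally with its own optimizer
        #    (SGD, FedProx, or a custom PyTorch optimizer).
        update = client.train_locally(local_model)
        # 4. The aggregator accumulates updates until the round completes.
        aggregator.collect(channel.client_to_server(update))
    # 5. The server optimizer updates the server model from the
    #    aggregated gradients.
    server_optimizer.step(server_model, aggregator.aggregate())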

Installation

The latest release of FLSim can be installed via pip:

pip install flsim

You can also install directly from source for the latest features (along with their quirks and the occasional bug):

git clone https://github.com/facebookresearch/FLSim.git
cd FLSim
pip install -e .

Getting started

To implement a central training loop in the FL setting using FLSim, a developer simply performs the following steps (a minimal sketch of the config step follows the list):

  1. Build their own data pipeline to assign individual rows of training data to client devices (to simulate how data is distributed across client devices).
  2. Create a corresponding nn.Module model and wrap it in an FL model.
  3. Define a custom metrics reporter that computes and collects metrics of interest (e.g., accuracy) throughout training.
  4. Set the desired hyperparameters in a config.
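
For the config step, here is a minimal sketch using FLSim's JSON config helper; `import flsim.configs` and `fl_config_from_json` also appear in the issues section below. The keys inside "trainer" are illustrative and depend on the trainer you choose.

import flsim.configs  # must be imported so the config schema is registered
from flsim.utils.config_utils import fl_config_from_json

json_config = {
    "trainer": {
        # trainer hyperparameters go here
    }
}
cfg = fl_config_from_json(json_config)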

Usage Example

Tutorials

For details, please refer to the tutorials we have prepared.

Examples

We have prepared runnable examples for two of the tutorials above:

Contributing

See the CONTRIBUTING file for how to contribute to this library.

License

This code is released under Apache 2.0, as found in the LICENSE file.

Comments
  • Bug Fix#36: fix imports in tests.

    Types of changes

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Docs change / refactoring / dependency upgrade

    Motivation and Context / Related issue

    Bug Fix#36: fix imports in tests.

    How Has This Been Tested (if it applies)

    pytest -ra is able to discover all tests now.

    Checklist

    • [x] The documentation is up-to-date with the changes I made.
    • [x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
    • [x] All tests passed, and additional code has been covered with new tests.
    CLA Signed 
    opened by ghaccount 8
  • Vr

    Types of changes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] Docs change / refactoring / dependency upgrade

    Motivation and Context / Related issue

    How Has This Been Tested (if it applies)

    Checklist

    • [ ] The documentation is up-to-date with the changes I made.
    • [ ] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
    • [ ] All tests passed, and additional code has been covered with new tests.
    CLA Signed 
    opened by JohnlNguyen 6
  • Move optimizer_test_utils to optimizers directory

    Summary: optimizer_test_utils.py is currently located in the top-level tests directory. However, the top-level tests directory does not really make sense, as each component is organized into its own dedicated directory; in that sense, optimizer_test_utils.py belongs in the optimizers directory. In this diff, we move the file to the optimizers directory and fix the references.

    Differential Revision: D32241821

    CLA Signed fb-exported Merged 
    opened by jessemin 3
  • Does the backend handle Federated learning asynchronously?

    I found this repo from this blog: https://ai.facebook.com/blog/asynchronous-federated-learning/. However, I cannot find any mention of it in this repo, nor can I decipher from the code examples whether this is the synchronous or the asynchronous version of federated learning. Can you please clarify this for me? And if this is the asynchronous version, how can I dive deeper into the library and look at the implementation of the async handling mechanism?

    Thank you

    opened by 111Kaushal 2
  • Remove test_pytorch_local_dataset_factory

    Summary: This test had been very flaky for over a year and has never been revived since then. Deleting it from the codebase.

    Differential Revision: D32415979

    CLA Signed fb-exported Merged 
    opened by jessemin 2
  • FedSGD with virtual batching

    🚀 Feature

    Motivation

    Create a memory-efficient client to run FedSGD. If a client has many examples, running FedSGD (taking the gradient of the model based on all of the client's data) can lead to OOM. In this PR, we fix this problem by still calling optimizer.step once at the end of local training to simulate the effect of FedSGD, as sketched below.
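
    The idea is standard gradient accumulation: scale each batch loss so that a single optimizer.step() call at the end matches a full-batch FedSGD step. A generic PyTorch sketch (illustrative, not this PR's actual code):

    import torch

    def fedsgd_local_step(model, loss_fn, optimizer, batches, num_examples):
        # Accumulate gradients over small batches; step once at the end,
        # as if the gradient were computed on all local data at once.
        optimizer.zero_grad()
        for x, y in batches:
            loss = loss_fn(model(x), y)
            # Scaling assumes loss_fn averages over the batch, so the
            # accumulated gradient equals the full-batch gradient.
            (loss * (len(x) / num_examples)).backward()
        optimizer.step()  # a single step simulates FedSGD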

    opened by JohnlNguyen 0
  • Add Fednova as a benchmark

    Summary:

    What?

    Adding FedNova as a benchmark

    Why?

    FedNova is a well-known paper that fixes the objective inconsistency problem.

    Differential Revision: D34668291

    CLA Signed fb-exported 
    opened by JohnlNguyen 1
  • Having to `import flsim.configs` before creating config from json is unintuitive

    🚀 Feature

    This code works

    import flsim.configs  # <-- required
    from flsim.utils.config_utils import fl_config_from_json
    
    json_config = {
        "trainer": {
        }
    }
    cfg = fl_config_from_json(json_config)
    

    This code doesn't work

    from flsim.utils.config_utils import fl_config_from_json
    
    json_config = {
        "trainer": {
        }
    }
    cfg = fl_config_from_json(json_config)
    

    Motivation

    Having to import flsim.configs is unintuitive and not clear from the user's perspective.

    Pitch

    Alternatives

    Additional context

    opened by JohnlNguyen 0
  • Fix sent140 example

    Summary:

    What?

    Fix the tutorial's word embedding to resolve the poor accuracy problem.

    Why?

    https://github.com/facebookresearch/FLSim/issues/34

    Differential Revision: D34149392

    CLA Signed fb-exported 
    opened by JohnlNguyen 1
  • low test accuracy in Sentiment classification with LEAF's Sent140 tutorial?

    ❓ Questions and Help

    Until we move the questions to another medium, feel free to use this as your question:

    Question

    I tried this tutorial: https://github.com/facebookresearch/FLSim/blob/main/tutorials/sent140_tutorial.ipynb, and the accuracy is less than a random guess (50%)!

    Any suggestions or approaches to improve accuracy for this tutorial?

    From the tutorial:

    Running (epoch = 1, round = 1, global round = 1) for Test
    (epoch = 1, round = 1, global round = 1), Loss/Test: 0.8683878255035598
    (epoch = 1, round = 1, global round = 1), Accuracy/Test: 49.61439588688946
    {'Accuracy': 49.61439588688946}

    opened by ghaccount 0
Releases (v0.1.0)
  • v0.0.1 (Dec 9, 2021)

    We are excited to announce the release of FLSim 0.0.1.

    Introduction

    How does one train a machine learning model without access to user data? Federated Learning (FL) is the technology that answers this question. In a nutshell, FL is a way for many users to collaboratively learn a machine learning model without sharing data. There are two scenarios for FL: cross-silo and cross-device. Cross-silo FL enables collaborative learning between a few large organizations with massive silo datasets. Cross-device FL enables collaborative learning between many small user devices with small local datasets. Cross-device FL, where millions or even billions of users cooperate on learning a model, is a much more complex problem and has attracted less attention from the research community. We designed FLSim to address the cross-device FL use case.

    Federated Learning at Scale

    Large-scale cross-device Federated Learning (FL) is a federated learning paradigm with several challenges that differentiate it from cross-silo FL: millions of clients coordinating with a central server, and training instability due to the significant cohort problem. With these challenges in mind, we built FLSim to be scalable yet easy to use; FLSim can scale to thousands of clients per round using only 1 GPU. We hope FLSim will equip researchers to tackle problems in federated learning at scale.


    Library Structure

    FLSim's core components follow the same semantics as FedAvg. At a high level, the server comprises three main components: a selector, an aggregator, and an optimizer. The selector chooses clients for training, and the aggregator accumulates client updates until a round is complete. Then, the optimizer updates the server model based on the aggregated gradients. The server communicates with the clients via a channel, which can compress the messages exchanged between the server and the clients. Locally, each client consists of a dataset and a local optimizer; this local optimizer can be SGD, FedProx, or a custom PyTorch optimizer.

    Included Datasets

    Currently, FLSim supports all datasets from LEAF, including FEMNIST, Shakespeare, Sent140, CelebA, Synthetic, and Reddit. Additionally, we support MNIST and CIFAR-10.

    Included Algorithms

    FLSim supports standard FedAvg as well as other federated learning methods such as FedAdam, FedProx, FedAvgM, FedBuff, FedLARS, and FedLAMB.

    What’s next?

    We hope FLSim will foster large-scale cross-device FL research. We plan to add support for personalization in early 2022. Throughout 2022, we plan to gather feedback, improve usability, and continue to grow our collection of algorithms, datasets, and models.

    Source code (tar.gz)
    Source code (zip)
Owner
Meta Research