Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020)

This repository implements all experiments in the paper Personalized Federated Learning with Moreau Envelopes.

Authors: Canh T. Dinh, Nguyen H. Tran, Tuan Dung Nguyen

Full paper: https://arxiv.org/pdf/2006.08848.pdf https://proceedings.neurips.cc/paper/2020/file/f4f1f13c8289ac1b1ee0ff176b56fc60-Paper.pdf

Paper has been accepted by NeurIPS 2020.

This repository does not only implement pFedMe but also FedAvg, and Per-FedAvg algorithms. (Federated Learning using Pytorch)

Software requirements:

numpy, scipy, torch, Pillow, matplotlib.
To download the dependencies: pip3 install -r requirements.txt

Dataset: We use 2 datasets: MNIST and Synthetic

To generate non-idd MNIST Data:
- Access data/Mnist and run: "python3 generate_niid_20users.py"
- We can change the number of user and number of labels for each user using 2 variable NUM_USERS = 20 and NUM_LABELS = 2
To generate idd MNIST Data (we do not use iid data in the paper):
- Access data/Mnist and run: "python3 generate_iid_20users.py"
To generate niid Synthetic:
- Access data/Synthetic and run: "python3 generate_synthetic_05_05.py". Similar to MNIST data, the Synthetic data is configurable with the number of users and the numbers of labels for each user.
The datasets also are available to download at: https://drive.google.com/drive/folders/1-Z3FCZYoisqnIoLLxOljMPmP70t2TGwB?usp=sharing

Produce experiments and figures

There is a main file "main.py" which allows running all experiments.

Using same parameters

To produce the comparison experiments for pFedMe using MNIST dataset:

Strongly Convex Case, run below commands:


python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.005 --personal_learning_rate 0.1 --beta 1 --lamda 15 --num_global_iters 800 --local_epochs 20 --algorithm pFedMe --numusers 5 --times 10
python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.005 --num_global_iters 800 --local_epochs 20 --algorithm FedAvg --numusers 5  --times 10
python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.005 --beta 0.001  --num_global_iters 800 --local_epochs 20 --algorithm PerAvg --numusers 5  --times 10

It is noted that each algorithm should be run at least 10 times and then the results are averaged.
All the train loss, testing accuracy, and training accuracy will be stored as h5py file in the folder "results". It is noted that we store the data for persionalized model and global of pFedMe in 2 separate files following format: DATASET_pFedMe_p_x_x_xu_xb_x_avg.h5 and DATASET_pFedMe_x_x_xu_xb_x_avg.h5 respectively (pFedMe for global model, pFedMe_p for personalized model of pFedMe, PerAvg_p is for personalized model of PerAvg).

In order to plot the figure for convex case, set parameters in file main_plot.py similar to parameters run from previous experiments. It is noted that each experiment with different parameters will have different results, the configuration in the plot function should be modified for each specific case. For example. To plot the comparision in convex case for the above experiments, in the main_plot.py set:


  numusers = 5
  num_glob_iters = 800
  dataset = "Mnist"
  local_ep = [20,20,20,20]
  lamda = [15,15,15,15]
  learning_rate = [0.005, 0.005, 0.005, 0.005]
  beta =  [1.0, 1.0, 0.001, 1.0]
  batch_size = [20,20,20,20]
  K = [5,5,5,5]
  personal_learning_rate = [0.1,0.1,0.1,0.1]
  algorithms = [ "pFedMe_p","pFedMe","PerAvg_p","FedAvg"]
  plot_summary_one_figure_mnist_Compare(num_users=numusers, loc_ep1=local_ep, Numb_Glob_Iters=num_glob_iters, lamb=lamda,
                             learning_rate=learning_rate, beta = beta, algorithms_list=algorithms, batch_size=batch_size, dataset=dataset, k = K, personal_learning_rate = personal_learning_rate)

NonConvex case:


python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.005 --personal_learning_rate 0.09 --beta 1 --lamda 15 --num_global_iters 800 --local_epochs 20 --algorithm pFedMe --numusers 5 --times 10
python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.005 --num_global_iters 800 --local_epochs 20 --algorithm FedAvg --numusers 5 --times 10
python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.005 --beta 0.001  --num_global_iters 800 --local_epochs 20 --algorithm PerAvg --numusers 5 --times 10

To plot the figure for non-convex case, we do similar to convex case, also need to change the parameters in main_plot.py.

To produce the comparision experiment for pFedMe using Synthetic dataset:

Strongly Convex Case:


python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.005 --personal_learning_rate 0.01 --beta 1 --lamda 20 --num_global_iters 600 --local_epochs 20 --algorithm pFedMe --numusers 10 --times 10
python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.005 --num_global_iters 600 --local_epochs 20 --algorithm FedAvg --numusers 10 --times 10
python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.005 --beta 0.001  --num_global_iters 600 --local_epochs 20 --algorithm PerAvg --numusers 10 --times 10

NonConvex case:


python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.005 --personal_learning_rate 0.01 --beta 1 --lamda 20 --num_global_iters 600 --local_epochs 20 --algorithm pFedMe --numusers 10 --times 10
python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.005 --num_global_iters 600 --local_epochs 20 --algorithm FedAvg --numusers 10 --times 10
python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.005 --beta 0.001  --num_global_iters 600 --local_epochs 20 --algorithm PerAvg --numusers 10 --times 10

Fine-tuned Parameters:

To produce results in the table of fine-tune parameter:

MNIST:

Strongly Convex Case:


python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.01 --personal_learning_rate 0.1 --beta 2 --lamda 15 --num_global_iters 800 --local_epochs 20 --algorithm pFedMe --numusers 5 --times 10
python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.02 --num_global_iters 800 --local_epochs 20 --algorithm FedAvg --numusers 5 --times 10
python3 main.py --dataset Mnist --model mclr --batch_size 20 --learning_rate 0.03 --beta 0.003  --num_global_iters 800 --local_epochs 20 --algorithm PerAvg --numusers 5 --times 10

NonConvex Case:


python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.01 --personal_learning_rate 0.05 --beta 2 --lamda 30 --num_global_iters 800 --local_epochs 20 --algorithm pFedMe --numusers 5 --times 10
python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.02 --num_global_iters 800 --local_epochs 20 --algorithm FedAvg --numusers 5 --times 10
python3 main.py --dataset Mnist --model dnn --batch_size 20 --learning_rate 0.02 --beta 0.001  --num_global_iters 800 --local_epochs 20 --algorithm PerAvg --numusers 5 --times 10

Sythetic:

Strongly Convex Case:


python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.01 --personal_learning_rate 0.01 --beta 2 --lamda 20 --num_global_iters 600 --local_epochs 20 --algorithm pFedMe --numusers 10 --times 10
python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.02 --num_global_iters 600 --local_epochs 20 --algorithm FedAvg --numusers 10 --times 10
python3 main.py --dataset Synthetic --model mclr --batch_size 20 --learning_rate 0.02 --beta 0.002  --num_global_iters 600 --local_epochs 20 --algorithm PerAvg --numusers 10 --times 10

NonConvex Case:


python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.01 --personal_learning_rate 0.01 --beta 2 --lamda 30 --num_global_iters 600 --local_epochs 20 --algorithm pFedMe --numusers 10 --times 10
python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.03 --num_global_iters 600 --local_epochs 20 --algorithm FedAvg --numusers 10 --times 10
python3 main.py --dataset Synthetic --model dnn --batch_size 20 --learning_rate 0.01 --beta 0.001  --num_global_iters 600 --local_epochs 20 --algorithm PerAvg --numusers 10 --times 10

Effect of hyper-parameters:

For all the figures for effect of hyper-parameters, we use Mnist dataset and fix the learning_rate == 0.005 and personal_learning_rate == 0.09 for all experiments. Other parameters are changed according to the experiments. Only in the experiments for the effects of $\beta$, in case $\beta = 4$, we use learning_rate == 0.003 to stable the algorithm.

CIFAR-10 dataset:

The implementation of Cifar10 has been finished. However, we haven't fine-tuned the parameters for all algorithms on Cifar10. Below is the comment to run cifar10 on pFedMe.


python3 main.py --dataset Cifar10 --model cnn --batch_size 20 --learning_rate 0.01 --personal_learning_rate 0.01 --beta 1 --lamda 15 --num_global_iters 800 --local_epochs 20 --algorithm pFedMe --numusers 5

Personalized Federated Learning using Pytorch (pFedMe)

Related tags

Overview

Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020)

Software requirements:

Dataset: We use 2 datasets: MNIST and Synthetic

Produce experiments and figures

Using same parameters

Fine-tuned Parameters:

Effect of hyper-parameters:

CIFAR-10 dataset:

Owner

Charlie Dinh

This is the code for our paper "Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text"

A curated list of automated deep learning (including neural architecture search and hyper-parameter optimization) resources.

A fast MoE impl for PyTorch

Video lie detector using xgboost - A video lie detector using OpenFace and xgboost

Algorithmic trading with deep learning experiments

This implements one of result networks from Large-scale evolution of image classifiers

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

Classify bird species based on their songs using SIamese Networks and 1D dilated convolutions.

[TOG 2021] PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling.

Memoized coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers

Start-to-finish tutorial for interactive music co-creation in PyTorch and Tensorflow.js

RADIal is available now! Check the download section

Ensembling Off-the-shelf Models for GAN Training

Fuzzy Overclustering (FOC)

SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

Project page for End-to-end Recovery of Human Shape and Pose

Unified file system operation experience for different backend

[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving