[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

Overview

G-PATE

This is the official code base for our NeurIPS 2021 paper:

"G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators."

Yunhui Long*, Boxin Wang*, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li

Citation

@article{long2021gpate,
  title={G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators},
  author={Long, Yunhui and Wang, Boxin and Yang, Zhuolin and Kailkhura, Bhavya and Zhang, Aston and Gunter, Carl A. and Li, Bo},
  journal={NeurIPS 2021},
  year={2021}
}

Usage

Prepare your environment

Install the required packages

pip install -r requirements.txt

Prepare your data

Please store the training data in $data_dir. By default, $data_dir is set to ../../data.

We provide a script to download the MNIST and Fashion-MNIST datasets.

python download.py [dataset_name]

For MNIST, you can run

python download.py mnist

For Fashion-MNIST, you can run

python download.py fashion_mnist

For the CelebA dataset, please refer to its official website for downloading.

Training

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name] --train

Examples of our best commands on MNIST:

Given eps=1,

python main.py --checkpoint_dir mnist_teacher_4000_z_dim_50_c_1e-4/ --teachers_batch 40 --batch_teachers 100 --dataset mnist --train --sigma_thresh 3000 --sigma 1000 --step_size 1e-4 --max_eps 1 --nopretrain --z_dim 50 --batch_size 64

By default, once training reaches the maximum epsilon of 1, the script generates 100,000 DP samples and saves them as eps-1.00.data.pkl in checkpoint_dir.
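
If you want to inspect the generated file directly, here is a minimal sketch (not part of the repo). It assumes eps-1.00.data.pkl is an ordinary pickle holding either a NumPy array of flattened images or an (images, labels) pair; adapt it to the structure you actually find.

# Minimal sketch for inspecting the generated DP samples (assumptions noted above).
import pickle
import numpy as np

with open("mnist_teacher_4000_z_dim_50_c_1e-4/eps-1.00.data.pkl", "rb") as f:
    data = pickle.load(f)

if isinstance(data, (tuple, list)) and len(data) == 2:
    images, labels = data
    print("images:", np.asarray(images).shape, "labels:", np.asarray(labels).shape)
else:
    print("samples:", np.asarray(data).shape)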

Given eps=10,

python main.py --checkpoint_dir mnist_teacher_2000_z_dim_100_eps_10/ --teachers_batch 40 --batch_teachers 50 --dataset mnist --train --sigma_thresh 600 --sigma 100 --step_size 1e-4 --max_eps 10 --nopretrain --z_dim 100 --batch_size 64

By default, once training reaches the maximum epsilon of 10, the script generates 100,000 DP samples and saves them as eps-9.9x.data.pkl in checkpoint_dir.

Generating synthetic samples

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name]

Evaluate the synthetic records

We follow the standard evaluation protocol: train a classifier on the synthetic samples and test it on real samples.

For MNIST,

python evaluation/train-classifier-mnist.py --data [DP_data_dir]

For Fashion-MNIST,

python evaluation/train-classifier-fmnist.py --data [DP_data_dir]

For CelebA-Gender,

python evaluation/train-classifier-celebA.py --data [DP_data_dir]

For CelebA-Gender (Small),

python evaluation/train-classifier-small-celebA.py --data [DP_data_dir]

For CelebA-Hair,

python evaluation/train-classifier-hair.py --data [DP_data_dir]

Here, [DP_data_dir] is the directory where your generated DP samples are stored.

In the MNIST example above, the DP samples were generated at $checkpoint_dir/eps-1.00.data, so you should run the evaluation with DP_data_dir=$checkpoint_dir/eps-1.00.data:

python evaluation/train-classifier-mnist.py --data $checkpoint_dir/eps-1.00.data
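
For reference, the protocol the evaluation scripts follow can be summarized by the sketch below. It is only an illustration, not the repo's implementation: it assumes the generated pickle stores an (images, labels) pair with images flattened to 784 dimensions in [0, 1], and it substitutes a simple scikit-learn classifier for the CNNs used in evaluation/.

# Illustration of "train on synthetic, test on real" (assumptions noted above).
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression
from tensorflow.keras.datasets import mnist

with open("mnist_teacher_4000_z_dim_50_c_1e-4/eps-1.00.data.pkl", "rb") as f:
    syn_images, syn_labels = pickle.load(f)           # assumed (images, labels) pair
syn_images = np.asarray(syn_images).reshape(len(syn_images), -1)
syn_labels = np.asarray(syn_labels)
if syn_labels.ndim > 1:                               # handle one-hot labels if present
    syn_labels = syn_labels.argmax(axis=-1)

(_, _), (x_test, y_test) = mnist.load_data()          # real MNIST test split
x_test = x_test.reshape(len(x_test), -1) / 255.0

clf = LogisticRegression(max_iter=200)                # stand-in for the CNN classifiers
clf.fit(syn_images, syn_labels)
print("Accuracy on real MNIST test set:", clf.score(x_test, y_test))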