Training data extraction on GPT-2

Overview

Training data extraction from GPT-2

This repository contains code for extracting training data from GPT-2, following the approach outlined in the following paper:

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel
USENIX Security Symposium, 2021
https://arxiv.org/abs/2012.07805

WARNING: The experiments in our paper relied on different non-public codebases, and also involved a large amount of manual labor. The code in this repository is thus not meant to exactly reproduce the paper's results, but instead to illustrate the paper's approach and to help others perform similar experiments.
The code in this repository has not been tested at the scale considered in the paper (600,000 generated samples) and might find memorized content at a lower (or higher) rate!

Installation

You will need transformers, pytorch and tqdm. The code was tested with transformers v3.0.2 and torch v1.5.1.

Extracting Data

Simply run

python3 extraction.py --N 1000 --batch-size 10

to generate 1000 samples with GPT-2 (XL). The samples are generated with top-k sampling (k=40) and an empty prompt.

The generated samples are ranked according to four membership inference metrics introduced in our paper:

  • The log perplexity of the GPT-2 (XL) model.
  • The ratio of the log perplexities of the GPT-2 (XL) model and the GPT-2 (S) model.
  • The ratio of the log perplexities for the generated sample and the same sample in lower-case letters.
  • The ratio of the log perplexity of GPT-2 (XL) and the sample's entropy estimated by Zlib.

The top 10 samples according to each metric are printed out. These samples are likely to contain verbatim text from the GPT-2 training data.

Conditioning on Internet text

In our paper, we found that prompting GPT-2 with small snippets of text taken from the Web increased the chance of the model generating memorized content.

To reproduce this attack, first download a slice of the Common Crawl dataset:

./download_cc.sh

This will download a sample of the Crawl from May 2021 (~350 MB) to a file called commoncrawl.warc.wet.

Then, we can run the extraction attack with Internet prompts:

python3 extraction.py --N 1000 --internet-sampling --wet-file commoncrawl.warc.wet

Sample outputs

Some interesting data that we extracted from GPT-2 can be found here.

Note that these were found among 600,000 generated samples. If you generate a much smaller number of samples (10,000 for example), you will be less likely to find memorized content.

Citation

If this code is useful in your research, you are encouraged to cite our academic paper:

@inproceedings{carlini21extracting,
  author = {Carlini, Nicholas and Tramer, Florian and Wallace, Eric and Jagielski, Matthew and Herbert-Voss, Ariel and Lee, Katherine and Roberts, Adam and Brown, Tom and Song, Dawn and Erlingsson, Ulfar and Oprea, Alina and Raffel, Colin},
  title = {Extracting Training Data from Large Language Models},
  booktitle = {USENIX Security Symposium},
  year = {2021},
  howpublished = {arXiv preprint arXiv:2012.07805},
  url = {https://arxiv.org/abs/2012.07805}
}
Owner
Florian Tramer
Florian Tramer
Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple 385 Dec 28, 2022
MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

Soroush Omranpour 1 Jan 01, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
Extracts data from the database for a graph-node and stores it in parquet files

subgraph-extractor Extracts data from the database for a graph-node and stores it in parquet files Installation For developing, it's recommended to us

Cardstack 0 Jan 10, 2022
covid question answering datasets and fine tuned models

Covid-QA Fine tuned models for question answering on Covid-19 data. Hosted Inference This model has been contributed to huggingface.Click here to see

Abhijith Neil Abraham 19 Sep 09, 2021
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed+Megatron trained the world's most powerful language model: MT-530B DeepSpeed is hiring, come join us! DeepSpeed is a deep learning optimizat

Microsoft 8.4k Dec 28, 2022
Attentive Implicit Representation Networks (AIR-Nets)

Attentive Implicit Representation Networks (AIR-Nets) Preprint | Supplementary | Accepted at the International Conference on 3D Vision (3DV) teaser.mo

29 Dec 07, 2022
This code is part of the reproducibility package for the SANER 2022 paper "Generating Clarifying Questions for Query Refinement in Source Code Search".

Clarifying Questions for Query Refinement in Source Code Search This code is part of the reproducibility package for the SANER 2022 paper "Generating

Zachary Eberhart 0 Dec 04, 2021
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
An Api for Emotion recognition.

PLAYEMO Playemo was built from the ground-up with Flask, a python tool that makes it easy for developers to build APIs. Use Cases Is Python your langu

greek geek 2 Jul 16, 2022
SAS: Self-Augmentation Strategy for Language Model Pre-training

SAS: Self-Augmentation Strategy for Language Model Pre-training This repository

Alibaba 5 Nov 02, 2022
Simple object detection app with streamlit

object-detection-app Simple object detection app with streamlit. Upload an image and perform object detection. Adjust the confidence threshold to see

Robin Cole 68 Jan 02, 2023
Deep learning models for change detection of remote sensing images

Change Detection Models (Remote Sensing) Python library with Neural Networks for Change Detection based on PyTorch. ⚡ ⚡ ⚡ I am trying to build this pr

Kaiyu Li 176 Dec 24, 2022
Official implementation of the ICML2021 paper "Elastic Graph Neural Networks"

ElasticGNN This repository includes the official implementation of ElasticGNN in the paper "Elastic Graph Neural Networks" [ICML 2021]. Xiaorui Liu, W

liuxiaorui 34 Dec 04, 2022
(ICCV 2021 Oral) Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation.

DARS Code release for the paper "Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation", ICCV 2021

CVMI Lab 58 Jan 01, 2023
Official implementation of ETH-XGaze dataset baseline

ETH-XGaze baseline Official implementation of ETH-XGaze dataset baseline. ETH-XGaze dataset ETH-XGaze dataset is a gaze estimation dataset consisting

Xucong Zhang 134 Jan 03, 2023
Benchmark for the generalization of 3D machine learning models across different remeshing/samplings of a surface.

Discretization Robust Correspondence Benchmark One challenge of machine learning on 3D surfaces is that there are many different representations/sampl

Nicholas Sharp 10 Sep 30, 2022
ICON: Implicit Clothed humans Obtained from Normals

ICON: Implicit Clothed humans Obtained from Normals arXiv, December 2021. Yuliang Xiu · Jinlong Yang · Dimitrios Tzionas · Michael J. Black Table of C

Yuliang Xiu 1.1k Dec 30, 2022
Official codebase for "B-Pref: Benchmarking Preference-BasedReinforcement Learning" contains scripts to reproduce experiments.

B-Pref Official codebase for B-Pref: Benchmarking Preference-BasedReinforcement Learning contains scripts to reproduce experiments. Install conda env

48 Dec 20, 2022
Sequence to Sequence Models with PyTorch

Sequence to Sequence models with PyTorch This repository contains implementations of Sequence to Sequence (Seq2Seq) models in PyTorch At present it ha

Sandeep Subramanian 708 Dec 19, 2022