PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

Related tags

Deep Learningd2c
Overview

D2C: Diffuison-Decoding Models for Few-shot Conditional Generation

Project | Paper

Open In Collab

PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

Abhishek Sinha*, Jiaming Song*, Chenlin Meng, Stefano Ermon

Stanford University

Overview

Conditional generative models of high-dimensional images have many applications, but supervision signals from conditions to images can be expensive to acquire. This paper describes Diffusion-Decoding models with Contrastive representations (D2C), a paradigm for training unconditional variational autoencoders (VAEs) for few-shot conditional image generation. By learning from as few as 100 labeled examples, D2C can be used to generate images with a certain label or manipulate an existing image to contain a certain label. Compared with state-of-the-art StyleGAN2 methods, D2C is able to manipulate certain attributes efficiently while keeping the other details intact.

Here are some example for image manipulation. You can see more results here.

Attribute Original D2C StyleGAN2 NVAE DDIM
Blond
Red Lipstick
Beard

Getting started

The code has been tested on PyTorch 1.9.1 (CUDA 10.2).

To use the checkpoints, download the checkpoints from this link, under the checkpoints/ directory.

# Requires gdown >= 4.2.0, install with pip
gdown https://drive.google.com/drive/u/1/folders/1DvApt-uO3uMRhFM3eIqPJH-HkiEZC1Ru -O ./ --folder

Examples

The main.py file provides some basic scripts to perform inference on the checkpoints.

We will release training code soon on a separate repo, as the GPU memory becomes a bottleneck if we train the model jointly.

Example to perform image manipulation:

  • Red lipstick
python main.py ffhq_256 manipulation --d2c_path checkpoints/ffhq_256/model.ckpt --boundary_path checkpoints/ffhq_256/red_lipstick.ckpt --step 10 --image_dir images/red_lipstick --save_location results/red_lipstick
  • Beard
python main.py ffhq_256 manipulation --d2c_path checkpoints/ffhq_256/model.ckpt --boundary_path checkpoints/ffhq_256/beard.ckpt --step 20 --image_dir images/beard --save_location results/beard
  • Blond
python main.py ffhq_256 manipulation --d2c_path checkpoints/ffhq_256/model.ckpt --boundary_path checkpoints/ffhq_256/blond.ckpt --step -15 --image_dir images/blond --save_location results/blond

Example to perform unconditional image generation:

python main.py ffhq_256 sample_uncond --d2c_path checkpoints/ffhq_256/model.ckpt --skip 100 --save_location results/uncond_samples

Extensions

We implement a D2C class here that contains an autoencoder and a diffusion latent model. See code structure here.

Useful functions include: image_to_latent, latent_to_image, sample_latent, manipulate_latent, postprocess_latent, which are also called in main.py.

Todo

  • Release checkpoints and models for other datasets.
  • Release code for conditional generation.
  • Release training code and procedure to convert into inference model.
  • Train on higher resolution images.

References and Acknowledgements

If you find this repository useful for your research, please cite our work.

@inproceedings{sinha2021d2c,
  title={D2C: Diffusion-Denoising Models for Few-shot Conditional Generation},
  author={Sinha*, Abhishek and Song*, Jiaming and Meng, Chenlin and Ermon, Stefano},
  year={2021},
  month={December},
  abbr={NeurIPS 2021},
  url={https://arxiv.org/abs/2106.06819},
  booktitle={Neural Information Processing Systems},
  html={https://d2c-model.github.io}
}

This implementation is based on:

Owner
Jiaming Song
PhD @ Stanford CS. My Chinese name is Jiaming Song (宋佳铭). I also go by the name Tony.
Jiaming Song
Scaling Vision with Sparse Mixture of Experts

Scaling Vision with Sparse Mixture of Experts This repository contains the code for training and fine-tuning Sparse MoE models for vision (V-MoE) on I

Google Research 290 Dec 25, 2022
Discord-Protect is a simple discord bot allowing you to have some security on your discord server by ordering a captcha to the user who joins your server.

Discord-Protect Discord-Protect is a simple discord bot allowing you to have some security on your discord server by ordering a captcha to the user wh

Tir Omar 2 Oct 28, 2021
CVPR 2021

Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-image Translation [Paper] | [Poster] | [Codes] Yahui Liu1,3, Enver Sangineto1,

Yahui Liu 37 Sep 12, 2022
Aws-machine-learning-university-accelerated-tab - Machine Learning University: Accelerated Tabular Data Class

Machine Learning University: Accelerated Tabular Data Class This repository contains slides, notebooks, and datasets for the Machine Learning Universi

AWS Samples 916 Dec 23, 2022
Videocaptioning.pytorch - A simple implementation of video captioning

pytorch implementation of video captioning recommend installing pytorch and pyth

Yiyu Wang 2 Jan 01, 2022
QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

Introduction QRec is a Python framework for recommender systems (Supported by Python 3.7.4 and Tensorflow 1.14+) in which a number of influential and

Yu 1.4k Dec 30, 2022
PyTorch implementation of SQN based on CloserLook3D's encoder

SQN_pytorch This repo is an implementation of Semantic Query Network (SQN) using CloserLook3D's encoder in Pytorch. For TensorFlow implementation, che

PointCloudYC 1 Oct 21, 2021
Tool cek opsi checkpoint facebook!

tool apa ini? cek_opsi_facebook adalah sebuah tool yang mengecek opsi checkpoint akun facebook yang terkena checkpoint! tujuan dibuatnya tool ini? too

Muhammad Latif Harkat 2 Jul 17, 2022
Graph-total-spanning-trees - A Python script to get total number of Spanning Trees in a Graph

Total number of Spanning Trees in a Graph This is a python script just written f

Mehdi I. 0 Jul 18, 2022
PyTorch-centric library for evaluating and enhancing the robustness of AI technologies

Responsible AI Toolbox A library that provides high-quality, PyTorch-centric tools for evaluating and enhancing both the robustness and the explainabi

24 Dec 22, 2022
Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

Music Source Separation with Channel-wise Subband Phase Aware ResUnet (CWS-PResUNet) Introduction This repo contains the pretrained Music Source Separ

Lau 100 Dec 25, 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

Edresson Casanova 390 Dec 29, 2022
Open-Domain Question-Answering for COVID-19 and Other Emergent Domains

Open-Domain Question-Answering for COVID-19 and Other Emergent Domains This repository contains the source code for an end-to-end open-domain question

7 Sep 27, 2022
Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure

Geometric Vector Perceptron Code to accompany Learning from Protein Structure with Geometric Vector Perceptrons by B Jing, S Eismann, P Suriana, RJL T

Dror Lab 85 Dec 29, 2022
LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image.

This project is based on ultralytics/yolov3. LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image. The related paper is avai

26 Dec 13, 2022
SplineConv implementation for Paddle.

SplineConv implementation for Paddle This module implements the SplineConv operators from Matthias Fey, Jan Eric Lenssen, Frank Weichert, Heinrich Mül

北海若 3 Dec 29, 2021
PyDeepFakeDet is an integrated and scalable tool for Deepfake detection.

PyDeepFakeDet An integrated and scalable library for Deepfake detection research. Introduction PyDeepFakeDet is an integrated and scalable Deepfake de

Junke, Wang 49 Dec 11, 2022
Simulator for FRC 2022 challenge: Rapid React

rrsim Simulator for FRC 2022 challenge: Rapid React out-1.mp4 Usage In order to run the simulator use the following: python3 rrsim.py [config_path] wh

1 Jan 18, 2022
Project ArXiv Citation Network

Project ArXiv Citation Network Overview This project involved the analysis of the ArXiv citation network. Usage The complete code of this project is i

Dennis Núñez-Fernández 5 Oct 20, 2022
A graph-to-sequence model for one-step retrosynthesis and reaction outcome prediction.

Graph2SMILES A graph-to-sequence model for one-step retrosynthesis and reaction outcome prediction. 1. Environmental setup System requirements Ubuntu:

29 Nov 18, 2022