Code for Multinomial Diffusion

Abstract

Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural images. This paper introduces two extensions of flows and diffusion for categorical data such as language or image segmentation: Argmax Flows and Multinomial Diffusion. Argmax Flows are defined by a composition of a continuous distribution (such as a normalizing flow), and an argmax function. To optimize this model, we learn a probabilistic inverse for the argmax that lifts the categorical data to a continuous space. Multinomial Diffusion gradually adds categorical noise in a diffusion process, for which the generative denoising process is learned. We demonstrate that our method outperforms existing dequantization approaches on text modelling and modelling on image segmentation maps in log-likelihood.
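To make the "gradually adds categorical noise" idea concrete, here is a minimal sketch of a single forward noising step. It assumes the transition described in the paper, q(x_t | x_{t-1}) = Cat((1 - beta_t) * x_{t-1} + beta_t / K), where K is the number of categories. The function names and the constant beta schedule are illustrative assumptions and are not taken from this repository, which implements the model in PyTorch.

```python
import random


def forward_step_probs(x_prev_onehot, beta):
    """Category probabilities of q(x_t | x_{t-1}).

    With probability (1 - beta) the previous category is kept; the remaining
    mass beta is spread uniformly over all K categories.
    """
    k = len(x_prev_onehot)
    return [(1.0 - beta) * p + beta / k for p in x_prev_onehot]


def sample_category(probs, rng):
    """Draw one category index from a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def noise_sequence(x0, num_classes, betas, rng):
    """Apply the forward step once per beta to every position of x0."""
    x = list(x0)
    for beta in betas:
        for i, c in enumerate(x):
            onehot = [1.0 if j == c else 0.0 for j in range(num_classes)]
            x[i] = sample_category(forward_step_probs(onehot, beta), rng)
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One step from category 1 out of K = 4 with beta = 0.2.
probs = forward_step_probs([0.0, 1.0, 0.0, 0.0], beta=0.2)

# Hypothetical schedule: 50 steps with constant beta (an arbitrary choice).
x_T = noise_sequence([1, 1, 2, 3], num_classes=4, betas=[0.2] * 50, rng=rng)
```

After enough steps the samples approach a uniform distribution over the categories, which is the starting point for the learned denoising process.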

Link: https://arxiv.org/abs/2102.05379

Instructions

In the folder containing setup.py, run

pip install --user -e .

The --user option installs the library for your user only. The -e option installs it in editable mode, so changes to the source take effect without reinstalling.

You should now be able to use it.

Running experiments

Go to the experiment of interest (folder segmentation_diffusion or text_diffusion) and follow the readme instructions there.

Acknowledgements

The Robert Bosch GmbH is acknowledged for financial support.
