Learning to Draw: Emergent Communication through Sketching

This is the official code for the paper "Learning to Draw: Emergent Communication through Sketching".

About

We demonstrate that a communication channel based on line drawing can emerge between agents playing a visual referential communication game. Furthermore, we show that with a simple additional self-supervised loss, the drawings the agent produces are interpretable by humans.

Getting started

You'll need to install the required dependencies listed in requirements.txt. This includes installing the differentiable rasteriser from the DifferentiableSketching repository, and the source version of torchbearer (https://github.com/pytorchbearer/torchbearer):

pip install git+https://github.com/jonhare/DifferentiableSketching.git
pip install git+https://github.com/pytorchbearer/torchbearer.git
pip install -r requirements.txt

Once the dependencies are installed, you can run the commgame.py script to train and test models:

python commgame.py train [args]
python commgame.py test [args]

For example, to train a pair of agents on the original game using the STL10 dataset (which will be downloaded if required), you would run:

python commgame.py train --dataset STL10 --output stl10-original-model --sigma2 5e-4 --nlines 20 --learning-rate 0.0001 --imagenet-weights --freeze-vgg --imagenet-norm --epochs 250 --invert --batch-size 100

The options --sigma2 and --nlines control the thickness and number of lines respectively. --imagenet-weights uses the standard pretrained ImageNet VGG16 weights (use --sin-weights for Stylized ImageNet weights). Finally, --freeze-vgg freezes the backbone CNN, --imagenet-norm applies ImageNet normalisation to the images (this should be used with either ImageNet or Stylized ImageNet weights), and --invert draws black strokes on a white canvas.
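To give a feel for how --sigma2 relates to stroke thickness, here is a minimal pure-Python sketch. It assumes each pixel's intensity falls off as exp(-d^2 / sigma2) with its distance d to the nearest line segment, in a coordinate system spanning [-1, 1]. The function names are hypothetical and this is not the DifferentiableSketching implementation, only an illustration of the idea.

```python
import math


def squared_distance_to_segment(px, py, x0, y0, x1, y1):
    """Squared Euclidean distance from point (px, py) to the segment (x0, y0)-(x1, y1)."""
    dx, dy = x1 - x0, y1 - y0
    length2 = dx * dx + dy * dy
    if length2 == 0.0:
        t = 0.0  # degenerate segment: treat it as a single point
    else:
        # Project the point onto the line and clamp to the segment's extent.
        t = max(0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / length2))
    cx, cy = x0 + t * dx, y0 + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def rasterise(lines, size, sigma2):
    """Render a non-empty list of segments onto a size x size grid over [-1, 1]^2."""
    image = []
    for row in range(size):
        py = -1.0 + 2.0 * row / (size - 1)
        out_row = []
        for col in range(size):
            px = -1.0 + 2.0 * col / (size - 1)
            # Assumed falloff: exp(-d^2 / sigma2); keep the strongest stroke per pixel.
            out_row.append(max(
                math.exp(-squared_distance_to_segment(px, py, *line) / sigma2)
                for line in lines
            ))
        image.append(out_row)
    return image


lines = [(-0.5, 0.0, 0.5, 0.0)]  # one horizontal stroke through the centre
thin = rasterise(lines, size=9, sigma2=5e-4)
thick = rasterise(lines, size=9, sigma2=5e-2)
print(thin[3][4], thick[3][4])  # intensity one pixel above the stroke
```

Under this assumption, a larger sigma2 spreads intensity further from the stroke, which reads as a thicker line.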

The training scripts compute a running communication rate in addition to loss and this is displayed as training progresses. After each epoch a validation pass is performed and images of the sketches and sender inputs and receiver targets are saved to the output directory along with a model snapshot. The output directory also contains a log file with the training and validation statistics per epoch.
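As an illustration of what a running communication rate tracks, the sketch below assumes the rate is the fraction of games in which the receiver picks the correct image. The class `RunningCommRate` is a hypothetical helper and is not taken from commgame.py.

```python
class RunningCommRate:
    """Accumulates the fraction of games won across batches (hypothetical helper)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, targets):
        # predictions[i] is the index the receiver chose in game i;
        # targets[i] is the index of the correct image in game i.
        if len(predictions) != len(targets):
            raise ValueError("predictions and targets must have the same length")
        self.correct += sum(int(p == t) for p, t in zip(predictions, targets))
        self.total += len(targets)

    def value(self):
        # Rate over everything seen so far; 0.0 before any game is played.
        return self.correct / self.total if self.total else 0.0


rate = RunningCommRate()
rate.update([0, 1, 2, 3], [0, 1, 0, 3])  # 3 of 4 games correct
rate.update([1, 1], [1, 0])              # 1 of 2 games correct
print(f"running communication rate: {rate.value():.3f}")
```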

Example commands to run the experiments in the paper are given in commands.md.

Further details on command-line arguments are given below.

Game setups

All the setups involve a referential game where the receiver tries to select the "correct" image from a pool on the basis of a "sketch" provided by the sender. The primary measure of success is the communication rate. The command-line arguments that control each game variant are listed in the following subsections:

Havrylov and Titov's Original Game Setup

Sender sees one image; Receiver sees many, where one is exactly the same as the sender's.

The number of receiver images (target + distractors) is controlled by the batch size. The number of sender images per iteration can also be controlled for completeness, but defaults to the batch size (i.e. each forward pass with a batch plays all possible game combinations, using each of the images as a target).

arguments:
--batch-size
[--sender-images-per-iter]
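The "all possible game combinations" idea can be sketched as follows: if row i of a score matrix holds the receiver's similarity between sketch i and every image in the batch, the correct answer for game i lies on the diagonal. This sketch assumes dot-product scoring on plain Python lists; the function names are hypothetical and the scoring actually used by the code may differ.

```python
def score_matrix(sketch_feats, image_feats):
    """scores[i][j] = dot product of sketch i's features with image j's features."""
    return [
        [sum(s * m for s, m in zip(sketch, image)) for image in image_feats]
        for sketch in sketch_feats
    ]


def play_batch(sketch_feats, image_feats):
    """Play one game per sketch; game i is won if image i receives the highest score."""
    scores = score_matrix(sketch_feats, image_feats)
    predictions = [max(range(len(row)), key=row.__getitem__) for row in scores]
    wins = sum(int(pred == i) for i, pred in enumerate(predictions))
    return predictions, wins / len(predictions)


# Toy features: three images, and one sketch per image.
images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sketches = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.6, 0.0, 0.4]]  # the last sketch is misleading

predictions, comm_rate = play_batch(sketches, images)
print(predictions, comm_rate)
```

Every image in the batch acts as the target for its own sketch and as a distractor for all the others, so a batch of N images yields N games.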

Object-oriented Game Setup (same)

Sender sees one image; Receiver sees many, where one is exactly the same as the sender's and the others are all of different classes.

arguments:
--object-oriented same
[--num-targets]
[--sender-images-per-iter]

Object-oriented Game Setup (different)

Sender sees one image; Receiver sees many, each of a different class; one of the images is of the same class as the sender's, but is a completely different image.

arguments:
--object-oriented different 
[--num-targets]
[--sender-images-per-iter]
[--random-transform-sender]
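The two object-oriented variants differ only in which image serves as the target. The sketch below builds a receiver pool for either variant from a list of (image_id, class_label) pairs. It assumes each distractor comes from a distinct class other than the sender's; `build_pool` and its `pool_size` parameter are hypothetical and do not correspond to any function or flag in this repository.

```python
import random


def build_pool(dataset, sender_index, pool_size, variant, rng):
    """Build a receiver pool for one game; returns (pool, target_position).

    variant "same": the target is the sender's own image.
    variant "different": the target is another image of the sender's class.
    """
    if variant not in ("same", "different"):
        raise ValueError("variant must be 'same' or 'different'")
    sender_image, sender_class = dataset[sender_index]

    # Group every image except the sender's own by class label.
    by_class = {}
    for index, (image_id, label) in enumerate(dataset):
        if index != sender_index:
            by_class.setdefault(label, []).append(image_id)

    if variant == "same":
        target = sender_image
    else:
        if sender_class not in by_class:
            raise ValueError("no other image of the sender's class is available")
        target = rng.choice(by_class[sender_class])

    # Assumption: one distractor per class, never from the sender's class.
    other_classes = sorted(label for label in by_class if label != sender_class)
    if len(other_classes) < pool_size - 1:
        raise ValueError("not enough other classes to fill the pool")
    pool = [rng.choice(by_class[label])
            for label in rng.sample(other_classes, pool_size - 1)]

    # Place the target at a random position so its index carries no information.
    target_position = rng.randrange(pool_size)
    pool.insert(target_position, target)
    return pool, target_position


dataset = [("img0", "cat"), ("img1", "cat"), ("img2", "dog"),
           ("img3", "ship"), ("img4", "dog")]
same_pool, same_pos = build_pool(dataset, 0, 3, "same", random.Random(0))
diff_pool, diff_pos = build_pool(dataset, 0, 3, "different", random.Random(0))
print(same_pool, same_pos)
print(diff_pool, diff_pos)
```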

Model setup

Sender

The "sender" consists of a backbone VGG16 CNN which translates the input image into a latent vector and a "decoder" with an MLP that projects the latent representation from the backbone to a set of drawing commands that are differentiably rendered into an image which is sent to the "receiver".

The backbone can optionally be initialised with pretrained weights and optionally frozen (except for the final linear projection). The backbone, including the linear projection, can be shared between sender and receiver (default) or kept separate (--separate_encoders).

arguments:
[--freeze-vgg]
[--imagenet-weights --imagenet-norm] 
[--sin-weights --imagenet-norm] 
[--separate_encoders]

Receiver

The "receiver" consists of a backbone CNN which is used to convert visual inputs (both the images in the pool and the sketch) into a latent vector which is then transformed into a different latent representation by an MLP. These projected latent vectors are used for prediction and in the loss as described below.

The actual backbone CNN model architecture will be the same as the sender's. The backbone can optionally share parameters with the "sender" agent. Alternatively it can be initialised with pre-trained weights, and also optionally frozen.

arguments:
[--freeze-vgg]
[--imagenet-weights --imagenet-norm]
[--separate_encoders]

Datasets

  • MNIST
  • CIFAR-10 / CIFAR-100
  • TinyImageNet
  • CelebA (--image-size to control size; default 64px)
  • STL-10
  • Caltech101 (training data is balanced by supersampling with augmentation)

Datasets will be downloaded to the dataset root directory (default ./data) as required.

arguments: 
--dataset {CIFAR10,CelebA,MNIST,STL10,TinyImageNet,Caltech101}  
[--dataset-root]
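The Caltech101 entry above mentions balancing the training data by supersampling. A minimal sketch of that idea is to repeat indices of smaller classes, drawn with replacement, until every class matches the largest one. The function name is hypothetical, the augmentation step is omitted, and the repository's actual balancing code may work differently.

```python
import random
from collections import Counter, defaultdict


def supersample_indices(labels, rng):
    """Return dataset indices in which every class appears as often as the largest class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    largest = max(len(indices) for indices in by_class.values())

    balanced = []
    for label in sorted(by_class):
        indices = by_class[label]
        balanced.extend(indices)
        # Draw extra samples with replacement to reach the size of the largest class.
        balanced.extend(rng.choices(indices, k=largest - len(indices)))
    return balanced


labels = ["airplane", "airplane", "airplane", "airplane", "ant", "anchor", "anchor"]
balanced = supersample_indices(labels, random.Random(0))
print(Counter(labels[i] for i in balanced))
```

In practice each repeated index would pass through a random augmentation so that the duplicated samples are not pixel-identical.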

Citation

If you find this repository useful for your research, please cite our paper using the following.

  @inproceedings{
  mihai2021learning,
  title={Learning to Draw: Emergent Communication through Sketching},
  author={Daniela Mihai and Jonathon Hare},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021},
  url={https://openreview.net/forum?id=YIyYkoJX2eA}
  }