The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022

Overview

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is designed for modeling long sequences with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.
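Before launching an experiment, it can help to confirm that a GPU is actually visible on the machine. The snippet below is a minimal sketch that is not part of this repo: the helper name `nvidia_gpu_visible` is hypothetical, and it assumes the NVIDIA driver's `nvidia-smi` tool is on the `PATH`. It is only a heuristic, since a visible GPU does not guarantee that the installed PyTorch build has CUDA support.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Heuristic check (hypothetical helper, not from the repo).

    Returns True if `nvidia-smi -L` runs successfully and lists at least one GPU.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Driver tools are not installed or not on PATH.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible:", nvidia_gpu_visible())
```

Once the requirements are installed, `torch.cuda.is_available()` is the more direct check from inside Python.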

Setup

Run the following:

pip install -r requirements.txt

Feedback Transformer

Introduced in "Addressing Some Limitations of Transformers with Feedback Memory".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are bits per character (lower is better).

bash experiments/feedback/enwik8.sh
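For readers comparing the bits-per-character numbers above with a raw training loss, the following sketch shows the standard unit conversion. It assumes the loss is a mean per-character cross-entropy measured in nats (natural logarithm), which is common but not confirmed for this repo's logs. The function names are hypothetical and do not come from the codebase.

```python
import math


def nats_to_bits_per_char(loss_nats: float) -> float:
    """Convert a mean per-character cross-entropy in nats to bits per character."""
    return loss_nats / math.log(2)


def bits_per_char_to_nats(bpc: float) -> float:
    """Inverse conversion: bits per character back to nats per character."""
    return bpc * math.log(2)


if __name__ == "__main__":
    # Example: express the reported test result of 0.962 BPC in nats.
    print(f"{bits_per_char_to_nats(0.962):.4f} nats per character")
```

The conversion is a division by ln 2, so a loss of ln 2 nats per character corresponds to exactly 1 bit per character.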

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are test accuracy (%).

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in "Not All Memories are Created Equal: Learning to Expire".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are bits per character (lower is better).

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under the CC-BY-NC license. See the LICENSE file for more details.

Owner

Facebook Research
