Clockwork Variational Autoencoders (CW-VAE)

Vaibhav Saxena, Jimmy Ba, Danijar Hafner

If you find this code useful, please cite it in your paper:

@article{saxena2021clockworkvae,
  title={Clockwork Variational Autoencoders},
  author={Saxena, Vaibhav and Ba, Jimmy and Hafner, Danijar},
  journal={arXiv preprint arXiv:2102.09532},
  year={2021},
}

Method

Clockwork VAEs are deep generative models that learn long-term dependencies in video by leveraging hierarchies of representations that progress at different clock speeds. In contrast to prior video prediction methods, which typically focus on predicting sharp but short future sequences, Clockwork VAEs can accurately predict high-level content, such as object positions and identities, for 1000 frames.
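To make "different clock speeds" concrete, here is a minimal sketch of a tick schedule. It assumes each level of the hierarchy updates a fixed factor less often than the level below it. The function name, the number of levels, and the factor are hypothetical choices for illustration and are not taken from this repository.

```python
def active_levels(t, num_levels=3, factor=2):
    """Return the levels (0 = fastest) whose state updates at timestep t.

    Assumption for illustration: level l ticks once every factor**l steps.
    """
    return [level for level in range(num_levels) if t % (factor ** level) == 0]


if __name__ == "__main__":
    # Print which levels tick at each of the first eight timesteps.
    for t in range(8):
        print(t, active_levels(t))
```

Under these assumptions the bottom level updates at every frame, while higher levels hold their state for longer stretches, which is what lets them carry slowly changing content.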

Clockwork VAEs build upon the Recurrent State Space Model (RSSM), so each state contains a deterministic component for long-term memory and a stochastic component for sampling diverse plausible futures. Clockwork VAEs are trained end-to-end to optimize the evidence lower bound (ELBO) that consists of a reconstruction term for each image and a KL regularizer for each stochastic variable in the model.
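The objective described above can be sketched as follows. The notation is an assumption made here for illustration: x_t is the image at step t, s_t^l is the stochastic variable of level l at step t, T_l is the set of steps at which level l is active, and the conditioning of the posterior q and prior p is abbreviated with a dot.

```latex
\ln p(x_{1:T}) \;\ge\;
\sum_{t=1}^{T} \mathbb{E}_{q}\!\left[ \ln p\!\left(x_t \mid s_t^{1}\right) \right]
\;-\;
\sum_{l=1}^{L} \sum_{t \in \mathcal{T}_l}
\mathbb{E}_{q}\!\left[
  \mathrm{KL}\!\left[\, q\!\left(s_t^{l} \mid \cdot\right) \,\middle\|\, p\!\left(s_t^{l} \mid \cdot\right) \right]
\right]
```

The first sum is the reconstruction term for each image; the second sum is the KL regularizer for each stochastic variable in the model.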

Instructions

This repository contains the code for training the Clockwork VAE model on the datasets minerl, mazes, and mmnist.

The datasets will automatically be downloaded into the --datadir directory.

python3 train.py --logdir /path/to/logdir --datadir /path/to/datasets --config configs/<dataset>.yml

The evaluation script writes open-loop video predictions in both PNG and NPZ format and plots of PSNR and SSIM to the data directory.

python3 eval.py --logdir /path/to/logdir
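For readers unfamiliar with the PSNR metric mentioned above, the following is a minimal sketch of its standard definition for two flattened frames. It is not the repository's evaluation code; the function name and the assumption that pixel values lie in [0, max_value] are illustrative.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes pixel values lie in [0, max_value]; higher PSNR means a closer match.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([0.0, 0.5, 1.0], [0.1, 0.5, 0.9]))
```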
