MoCoGAN: Decomposing Motion and Content for Video Generation

Last update: Dec 18, 2022

Overview

This repository contains an implementation and further details of MoCoGAN: Decomposing Motion and Content for Video Generation by Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz.

CVPR Poster:

Representation

MoCoGAN is a generative model that produces videos from random inputs. It keeps separate representations for motion and content, which gives control over what is generated. For example, MoCoGAN can generate the same object performing different actions, as well as the same action performed by different objects.
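The separation of motion and content can be illustrated with a small latent-code sketch. This is not the repository's code: the function names, dimensions, and the decayed random walk used for the motion trajectory are assumptions made for illustration, and the generator network that would turn each latent vector into a frame is omitted. The sketch only shows the idea that one content code is shared by every frame of a video while the motion code changes from frame to frame, so either part can be held fixed while the other varies.

```python
import random

CONTENT_DIM = 4   # hypothetical size of the content code
MOTION_DIM = 2    # hypothetical size of the per-frame motion code
NUM_FRAMES = 5    # hypothetical video length


def sample_content_code(dim, rng):
    """Draw one content code; it is reused for every frame of a video."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_motion_codes(num_frames, dim, rng):
    """Draw a sequence of motion codes, one per frame.

    Assumption: a decayed random walk stands in for a learned
    recurrent model, so each code depends on the previous one.
    """
    state = [0.0] * dim
    codes = []
    for _ in range(num_frames):
        state = [0.9 * s + rng.gauss(0.0, 1.0) for s in state]
        codes.append(list(state))
    return codes


def video_latents(content, motion_codes):
    """Concatenate the fixed content code with each frame's motion code."""
    return [content + motion for motion in motion_codes]


rng = random.Random(0)
contents = [sample_content_code(CONTENT_DIM, rng) for _ in range(3)]
motions = [sample_motion_codes(NUM_FRAMES, MOTION_DIM, rng) for _ in range(2)]

# grid[row][col]: every row shares a motion sequence (same action),
# every column shares a content code (same identity).
grid = [[video_latents(c, m) for c in contents] for m in motions]
```

Feeding each latent vector in `grid[row][col]` to a trained frame generator would yield one video per cell. Videos in the same column would share an identity, and videos in the same row would share an action.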

Examples of generated videos

We trained MoCoGAN on the MUG Facial Expression Database to generate facial expressions. With the content code fixed and the motion code varied, it generated the same person performing different expressions. With the motion code fixed and the content code varied, it generated different people performing the same expression. In the figure below, each column has a fixed identity and each row shows the same action:

We trained MoCoGAN on a human action dataset in which the content is the performer, who executes several actions. With the content code fixed and the motion code varied, it generated the same person performing different actions. With the motion code fixed and the content code varied, it generated different people performing the same action. Each pair of images shows the same action executed by different people:

We have collected a large-scale TaiChi dataset including 4.5K videos of TaiChi performers. Below are videos generated by MoCoGAN.

Training MoCoGAN

Please refer to the wiki page.

Citation

If you use MoCoGAN in your research, please cite our paper:

Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz, "MoCoGAN: Decomposing Motion and Content for Video Generation"

@inproceedings{Tulyakov:2018:MoCoGAN,
 title={{MoCoGAN}: Decomposing motion and content for video generation},
 author={Tulyakov, Sergey and Liu, Ming-Yu and Yang, Xiaodong and Kautz, Jan},
 booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 pages = {1526--1535},
 year={2018}
}
