This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

Overview

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

Python Pytorch

Project Page | YouTube | Paper

This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

Environment

conda install pytorch torchvision cudatoolkit=<your cuda version>
conda install pyyaml scikit-image scikit-learn opencv
pip install -r requirements.txt

Data

Mixamo

Mixamo is a synthesized 3D character animation dataset.

  1. Download mixamo data here.
  2. Extract under data/mixamo

For directions for downloading 3D Mixamo data please refer to this link.

SoloDance

SoloDance is a collection of dancing videos on youtube. We use DensePose to extract skeleton sequences from these videos for training.

  1. Download the extracted skeleton sequences here.
  2. Extract under data/solo_dance

The original videos can be downloaded here.

Preprocessing

run sh scripts/preprocess.sh to preprocess the two datasets above.

Pretrained model

Download the pretrained models here.

Inference

  1. For Skeleton Extraction, please consider using a pose estimation library such as Detectron2. We require the input skeleton sequences to be in the format of a numpy .npy file:

    • The file should contain an array with shape 15 x 2 x length.
    • The first dimension (15) corresponds the 15 body joint defined here.
    • The second dimension (2) corresponds to x and y coordinates.
    • The third dimension (length) is the temporal dimension.
  2. For Motion Retargeting Network, we provide the sample command for inference:

python infer_pair.py 
--config configs/transmomo.yaml 
--checkpoint transmomo_mixamo_36_800_24/checkpoints/autoencoder_00200000.pt # replace with actual path
--source a.npy  # replace with actual path
--target b.npy  # replace with actual path
--source_width 1280 --source_height 720 
--target_height 1920 --target_width 1080
  1. For Skeleton-to-Video Rendering, please refer to Everybody Dance Now.

Training

To train the Motion Retargeting Network, run

python train.py --config configs/transmomo.yaml

To train on the SoloDance dataest, run

python train.py --config configs/transmomo_solo_dance.yaml

Testing

For testing motion retargeting MSE, first generate the motion-retargeted motions with

python test.py
--config configs/transmomo.yaml # replace with the actual config used for training
--checkpoint transmomo_mixamo_36_800_24/checkpoints/autoencoder_00200000.pt
--out_dir transmomo_mixamo_36_800_24_results # replace actual path to output directory

And then compute MSE by

python scripts/compute_mse.py 
--in_dir transmomo_mixamo_36_800_24_results # replace with the previous output directory

Project Structure

transmomo.pytorch
├── configs - configuration files
├── data - place for storing data
├── docs - documentations
├── lib
│   ├── data.py - datasets and dataLoaders
│   ├── networks - encoders, decoders, discriminators, etc.
│   ├── trainer.py - training pipeline
│   ├── loss.py - loss functions
│   ├── operation.py - operations, e.g. rotation, projection, etc.
│   └── util - utility functions
├── out - place for storing output
├── infer_pair.py - perform motion retargeting
├── render_interpolate.py - perform motion and body interpolation
├── scripts - scripts for data processing and experiments
├── test.py - test MSE
└── train.py - main entrance for training

TODOs

  • Detailed documentation

  • Add example files

  • Release in-the-wild dancing video dataset (unannotated)

  • Tool for visualizing Mixamo test error

  • Tool for converting keypoint formats

Citation

Z. Yang*, W. Zhu*, W. Wu*, C. Qian, Q. Zhou, B. Zhou, C. C. Loy. "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. (* indicates equal contribution.)

BibTeX:

@inproceedings{transmomo2020,
  title={TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting},
  author={Yang, Zhuoqian and Zhu, Wentao and Wu, Wayne and Qian, Chen and Zhou, Qiang and Zhou, Bolei and Loy, Chen Change},
  booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

Acknowledgement

This repository is partly based on Rundi Wu's Learning Character-Agnostic Motion for Motion Retargeting in 2D and Xun Huang's MUNIT: Multimodal UNsupervised Image-to-image Translation. The skeleton-to-rendering part is based on Everybody Dance Now. We sincerely thank them for their inspiration and contribution to the community.

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng Prerequisites We have tested the code on Ubun

41 Dec 12, 2022
“袋鼯麻麻——智能购物平台”能够精准地定位识别每一个商品

“袋鼯麻麻——智能购物平台”能够精准地定位识别每一个商品,并且能够返回完整地购物清单及顾客应付的实际商品总价格,极大地降低零售行业实际运营过程中巨大的人力成本,提升零售行业无人化、自动化、智能化水平。

thomas-yanxin 192 Jan 05, 2023
SlotRefine: A Fast Non-Autoregressive Model forJoint Intent Detection and Slot Filling

SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling Reference Main paper to be cited (Di Wu et al., 2020) @article

Moore 34 Nov 03, 2022
Solving reinforcement learning tasks which require language and vision

Multimodal Reinforcement Learning JAX implementations of the following multimodal reinforcement learning approaches. Dual-coding Episodic Memory from

Henry Prior 31 Feb 26, 2022
Temporal Segment Networks (TSN) in PyTorch

TSN-Pytorch We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation for TSN as well as oth

1k Jan 03, 2023
The official code for paper "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling".

R2D2 This is the official code for paper titled "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Mode

Alipay 49 Dec 17, 2022
TensorFlow implementation of AlexNet and its training and testing on ImageNet ILSVRC 2012 dataset

AlexNet training on ImageNet LSVRC 2012 This repository contains an implementation of AlexNet convolutional neural network and its training and testin

Matteo Dunnhofer 161 Nov 25, 2022
PG2Net: Personalized and Group PreferenceGuided Network for Next Place Prediction

PG2Net PG2Net:Personalized and Group Preference Guided Network for Next Place Prediction Datasets Experiment results on two Foursquare check-in datase

Urban Mobility 5 Dec 20, 2022
Code for "Continuous-Time Meta-Learning with Forward Mode Differentiation" (ICLR 2022)

Continuous-Time Meta-Learning with Forward Mode Differentiation ICLR 2022 (Spotlight) - Installation - Example - Citation This repository contains the

Tristan Deleu 25 Oct 20, 2022
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

SimMIM By Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai and Han Hu*. This repo is the official implementation of

Microsoft 674 Dec 26, 2022
A Broad Study on the Transferability of Visual Representations with Contrastive Learning

A Broad Study on the Transferability of Visual Representations with Contrastive Learning This repository contains code for the paper: A Broad Study on

Ashraful Islam 29 Nov 09, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 04, 2022
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

train-CLIP 📎 A PyTorch Lightning solution to training CLIP from scratch. Goal ⚽ Our aim is to create an easy to use Lightning implementation of OpenA

Cade Gordon 396 Dec 30, 2022
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Vikash Sehwag 65 Dec 19, 2022
Author's PyTorch implementation of TD3 for OpenAI gym tasks

Addressing Function Approximation Error in Actor-Critic Methods PyTorch implementation of Twin Delayed Deep Deterministic Policy Gradients (TD3). If y

Scott Fujimoto 1.3k Dec 25, 2022
MATLAB codes of the book "Digital Image Processing Fourth Edition" converted to Python

Digital Image Processing Python MATLAB codes of the book "Digital Image Processing Fourth Edition" converted to Python TO-DO: Refactor scripts, curren

Merve Noyan 24 Oct 16, 2022
Locally cache assets that are normally streamed in POPULATION: ONE

Population One Localizer This is no longer needed as of the build shipped on 03/03/22, thank you bigbox :) Locally cache assets that are normally stre

Ahman Woods 2 Mar 04, 2022
Markov Attention Models

Introduction This repo contains code for reproducing the results in the paper Graphical Models with Attention for Context-Specific Independence and an

Vicarious 0 Dec 09, 2021
(AAAI2022) Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Semantic Segmentation

SM-PPM This is a Pytorch implementation of our paper "Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Seman

W-zx-Y 10 Dec 07, 2022
Official Repository for Machine Learning class - Physics Without Frontiers 2021

PWF 2021 Física Sin Fronteras es un proyecto del Centro Internacional de Física Teórica (ICTP) en Trieste Italia. El ICTP es un centro dedicado a fome

36 Aug 06, 2022