ASFormer: Transformer for Action Segmentation

This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segmentation.

Enviroment

Pytorch == 1.1.0, torchvision == 0.3.0, python == 3.6, CUDA=10.1

Reproduce our results

1. Download the dataset data.zip at (https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8) or (https://zenodo.org/record/3625992#.Xiv9jGhKhPY). 
2. Unzip the data.zip file to the current folder. There are three datasets in the ./data folder, i.e. ./data/breakfast, ./data/50salads, ./data/gtea
3. Download the pre-trained models at (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). There are pretrained models for three datasets, i.e. ./models/50salads, ./models/breakfast, ./models/gtea
4. Run python main.py --action=predict --dataset=50salads/gtea/breakfast --split=1/2/3/4/5 to generate predicted results for each split.
5. Run python eval.py --dataset=50salads/gtea/breakfast --split=0/1/2/3/4/5 to evaluate the performance. **NOTE**: split=0 will evaulate the average results for all splits, It needs to be done after you complete all split predictions.

Train your own model

Also, you can retrain the model by yourself with following command.

python main.py --action=train --dataset=50salads/gtea/breakfast --split=1/2/3/4/5

The training process is very stable in our experiments. It convergences very fast and is not sensitive to the number of training epochs.

Demo for using ASFormer as your backbone

In our paper, we replace the original TCN-based backbone model MS-TCN in ASRF with our ASFormer. The new model achieves even higher results on the 50salads dataset than the original ASRF. Code is Here.

If you find our repo useful, please give us a star and cite

@inproceedings{chinayi_ASformer,  
	author={Fangqiu Yi and Hongyu Wen and Tingting Jiang}, 
	booktitle={The British Machine Vision Conference (BMVC)},   
	title={ASFormer: Transformer for Action Segmentation},
	year={2021},  
}

Feel free to raise a issue if you got trouble with our code.

Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

Related tags

Overview

ASFormer: Transformer for Action Segmentation

Enviroment

Reproduce our results

Train your own model

Demo for using ASFormer as your backbone

Owner

A Fast Monotone Rotating Shallow Water model

VLGrammar: Grounded Grammar Induction of Vision and Language

Code needed to reproduce the examples found in "The Temporal Robustness of Stochastic Signals"

Scaling and Benchmarking Self-Supervised Visual Representation Learning

Implementation of StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation in PyTorch

Autolfads-tf2 - A TensorFlow 2.0 implementation of Latent Factor Analysis via Dynamical Systems (LFADS) and AutoLFADS

Styled Augmented Translation

BiSeNet based on pytorch

Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition"

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)

Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition - NeurIPS2021

Gym Threat Defense

Source codes for "Structure-Aware Abstractive Conversation Summarization via Discourse and Action Graphs"

This repo is about to create the Streamlit application for given ML model.

Direct Multi-view Multi-person 3D Human Pose Estimation

Real Time Object Detection and Classification using Yolo Algorithm.

Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks

An Efficient Training Approach for Very Large Scale Face Recognition or F²C for simplicity.