Learning Representational Invariances for Data-Efficient Action Recognition

Last update: Nov 22, 2022

Overview

Learning Representational Invariances for Data-Efficient Action Recognition

Official PyTorch implementation for Learning Representational Invariances for Data-Efficient Action Recognition. We follow the code structure of MMAction2.

See the project page for more details.

Installation

We use PyTorch-1.6.0 with CUDA-10.2 and Torchvision-0.7.0.

Please refer to install.md for installation.

Data Preparation

First, please download human detection results and put them in the corresponding folder under data: UCF-101, HMDB-51, Kinetics-100.

Second, please refer to data_preparation.md to prepare raw frames of UCF-101 and HMDB-51. (Instructions of extracting frames from Kinetics-100 will be available soon.)

(Optional) You can download the pre-extracted ImageNet scores: UCF-101, HMDB-51.

Training

We use 8 RTX2080 Ti GPUs to run our experiments. You would need to adjust your training schedule accordingly if you have less GPUs. Please refer to here.

Supervised learning

PORT=${PORT:-29500}

python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=$PORT \
tools/train.py \
$CONFIG \
--launcher pytorch ${@:3} \
--validate

You need to replace $CONFIG with the actual config file:

For supervised baseline, please use config files in configs/recognition/r2plus1d.
For strongly-augmented supervised learning, please use config files in configs/supervised_aug.

Semi-supervised learning

PORT=${PORT:-29500}

python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=$PORT \
tools/train_semi.py \
$CONFIG \
--launcher pytorch ${@:3} \
--validate

You need to replace $CONFIG with the actual config file:

For single dataset semi-supervised learning, please use config files in configs/semi.
For cross-dataset semi-supervised learning, please use config files in configs/semi_both.

Testing

# Multi-GPU testing
./tools/dist_test.sh $CONFIG ${path_to_your_ckpt} ${num_of_gpus} --eval top_k_accuracy

# Single-GPU testing
python tools/test.py $CONFIG ${path_to_your_ckpt} --eval top_k_accuracy

NOTE: Do not use multi-GPU testing if you are currently using multi-GPU training.

Other details

Please see getting_started.md for the basic usage of MMAction2.

Acknowledgement

Codes are built upon MMAction2.

Learning Representational Invariances for Data-Efficient Action Recognition

Related tags

Overview

Learning Representational Invariances for Data-Efficient Action Recognition

Installation

Data Preparation

Training

Supervised learning

Semi-supervised learning

Testing

Other details

Acknowledgement

Owner

Virginia Tech Vision and Learning Lab

PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our paper

Official PyTorch implementation and pretrained models of the paper Self-Supervised Classification Network

NeurIPS'21 Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows

PyQt6 configuration in yaml format providing the most simple script.

Temporal Segment Networks (TSN) in PyTorch

Pytorch GUI(demo) for iVOS(interactive VOS) and GIS (Guided iVOS)

Sequence to Sequence Models with PyTorch

ISBI 2022: Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image.

MVP Benchmark for Multi-View Partial Point Cloud Completion and Registration

STEAL - Learning Semantic Boundaries from Noisy Annotations (CVPR 2019)

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Contrastive Language-Image Pretraining

EfficientNetv2 TensorRT int8

Tools for computational pathology

Identifying Stroke Indicators Using Rough Sets

DeepStruc is a Conditional Variational Autoencoder which can predict the mono-metallic nanoparticle from a Pair Distribution Function.

The source code of the paper "SHGNN: Structure-Aware Heterogeneous Graph Neural Network"

[PAMI 2020] Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

Detecting Potentially Harmful and Protective Suicide-related Content on Twitter