A PyTorch implementation of SlowFast based on the ICCV 2019 paper "SlowFast Networks for Video Recognition"

Last update: Dec 23, 2022

Overview

SlowFast

A PyTorch implementation of SlowFast based on the ICCV 2019 paper SlowFast Networks for Video Recognition.

Requirements

conda install pytorch=1.9.1 torchvision cudatoolkit -c pytorch

PyTorchVideo

pip install pytorchvideo

Dataset

The Kinetics-400 dataset is used in this repo; you can download it from the official website. The data directory structure is as follows:

data
├── train
│   ├── abseiling
│   │   ├── _4YTwq0-73Y_000044_000054.mp4
│   │   └── ...
│   ├── archery
│   │   └── (same structure as abseiling)
│   └── ...
└── test
    └── (same structure as train)
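
Because each class name is a subdirectory of the split, a labelled file list can be built by walking the tree. The following is a minimal sketch under that assumption, not the repo's own data loader; the helper name `list_videos` and the choice to assign class indices by sorting class names are hypothetical.

```python
from pathlib import Path


def list_videos(data_root, split):
    """Return (video_path, class_name, class_index) tuples for one split.

    Hypothetical helper: assumes the layout data_root/split/<class_name>/*.mp4
    and assigns class indices by sorting the class directory names.
    """
    split_dir = Path(data_root) / split
    classes = sorted(d.name for d in split_dir.iterdir() if d.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}

    samples = []
    for name in classes:
        # Only .mp4 files are collected; anything else in the folder is ignored.
        for video in sorted((split_dir / name).glob("*.mp4")):
            samples.append((video, name, class_to_idx[name]))
    return samples
```

Calling `list_videos("data", "train")` would then yield one tuple per training clip, which could be handed to whatever dataset class is used for training.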

Usage

Train Model

python train.py --batch_size 16
optional arguments:
--data_root                   Datasets root path [default value is 'data']
--batch_size                  Number of videos in each mini-batch [default value is 8]
--epochs                      Number of epochs to train the model [default value is 10]
--save_root                   Result saved root path [default value is 'result']
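
For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the names and defaults listed above. It is an illustration only: the function name `build_train_parser` is hypothetical, and the actual `train.py` may define its arguments differently.

```python
import argparse


def build_train_parser():
    """Hypothetical parser mirroring the documented train.py options."""
    parser = argparse.ArgumentParser(description="Train Model")
    parser.add_argument("--data_root", type=str, default="data",
                        help="Datasets root path")
    parser.add_argument("--batch_size", type=int, default=8,
                        help="Number of videos in each mini-batch")
    parser.add_argument("--epochs", type=int, default=10,
                        help="Number of epochs to train the model")
    parser.add_argument("--save_root", type=str, default="result",
                        help="Result saved root path")
    return parser


# Equivalent to: python train.py --batch_size 16
args = build_train_parser().parse_args(["--batch_size", "16"])
print(args.data_root, args.batch_size, args.epochs, args.save_root)
# prints: data 16 10 result
```

Only `--batch_size` is overridden here, so the other three options fall back to their documented defaults.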

Test Model

python test.py --video_path data/test/beatboxing/5s_gFWie1Ys_000069_000079.mp4
optional arguments:
--model_path                  Model path [default value is 'result/slow_fast.pth']
--video_path                  Video path [default value is 'data/test/applauding/_V-dzjftmCQ_000023_000033.mp4']
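
The example paths above place each clip under a class directory and end the file name with two six-digit fields. The sketch below splits such a path into its parts, which can be handy for comparing a prediction against the folder label. It assumes the two numeric fields are the clip's start and end times in seconds within the source video; the repo does not state this, and the helper name `describe_clip` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: <video_id>_<start:6 digits>_<end:6 digits>
CLIP_NAME = re.compile(r"^(?P<video_id>.+)_(?P<start>\d{6})_(?P<end>\d{6})$")


def describe_clip(video_path):
    """Split a clip path into label, video id, and start/end fields."""
    path = PurePosixPath(video_path)
    match = CLIP_NAME.match(path.stem)
    if match is None:
        raise ValueError(f"unexpected clip name: {path.name}")
    return {
        "label": path.parent.name,       # class directory, e.g. 'beatboxing'
        "video_id": match["video_id"],   # everything before the numeric fields
        "start": int(match["start"]),
        "end": int(match["end"]),
    }


info = describe_clip("data/test/beatboxing/5s_gFWie1Ys_000069_000079.mp4")
print(info["label"], info["video_id"], info["start"], info["end"])
# prints: beatboxing 5s_gFWie1Ys 69 79
```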

Owner

Hao Ren
