Code release for Convolutional Two-Stream Network Fusion for Video Action Recognition

Last update: Dec 31, 2022

================================================================================

Convolutional Two-Stream Network Fusion for Video Action Recognition

This repository contains the code for our CVPR 2016 paper:

Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
"Convolutional Two-Stream Network Fusion for Video Action Recognition"
in Proc. CVPR 2016

If you find the code useful for your research, please cite our paper:

    @inproceedings{feichtenhofer2016convolutional,
      title={Convolutional Two-Stream Network Fusion for Video Action Recognition},
      author={Feichtenhofer, Christoph and Pinz, Axel and Zisserman, Andrew},
      booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2016}
    }

Requirements

The code was tested on Ubuntu 14.04 and Windows 10 using MATLAB R2015b and NVIDIA Titan X or Z GPUs.

If you have questions regarding the implementation, please contact:

Christoph Feichtenhofer

================================================================================

Setup

1. Download the code: git clone --recursive https://github.com/feichtenhofer/twostreamfusion
2. Compile the code by running compile.m.
   - This also compiles a modified (and older) version of the MatConvNet toolbox. If you run into any issues, please follow the installation instructions on the MatConvNet homepage.
3. Edit the file cnn_setup_environment.m to adjust the model and data paths.
4. Download the pretrained model files and the datasets linked below, and unpack them into your models/data directory.
   - Optionally, you can pretrain your own two-stream models by running:
     1. cnn_ucf101_spatial(); to train the appearance network stream.
     2. cnn_ucf101_temporal(); to train the optical flow network stream.
5. Run cnn_ucf101_fusion(); this uses the downloaded models and demonstrates training of our final architecture on UCF101/HMDB51.
   - To train on the CPU, clear the variable opts.train.gpus.
   - If you encounter memory issues on your GPU, consider decreasing the cudnnWorkspaceLimit (512MB is default).
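
The download and compile steps above can be scripted. The following is a minimal sketch for a Linux shell, assuming that git and matlab are on the PATH and that compile.m sits in the repository root. The helper name setup_twostreamfusion is hypothetical and not part of this repository.

```shell
# Sketch only: clone the repository and run compile.m without opening the
# MATLAB desktop.
# ASSUMPTIONS: `git` and `matlab` are on PATH, and compile.m is in the
# repository root. setup_twostreamfusion is a hypothetical helper name.
setup_twostreamfusion() {
  repo_url="https://github.com/feichtenhofer/twostreamfusion"
  target_dir=${1:-twostreamfusion}

  # --recursive also fetches any git submodules of the repository.
  git clone --recursive "$repo_url" "$target_dir" || return 1

  # Run compile.m non-interactively. The try/catch makes MATLAB exit with a
  # non-zero status instead of waiting at the prompt if compilation fails.
  (
    cd "$target_dir" &&
    matlab -nodisplay -nosplash -r \
      "try, compile; catch e, disp(getReport(e)); exit(1); end; exit(0)"
  ) || return 1
}
```

Calling setup_twostreamfusion with no argument clones into ./twostreamfusion; pass a directory name to clone elsewhere. Editing cnn_setup_environment.m and running the training functions still happen inside MATLAB.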

Pretrained models

Download our baseline networks trained on UCF101 here:

Data

Pre-computed optical flow images and resized RGB frames for the UCF101 and HMDB51 datasets:

UCF101 RGB: part1 part2 part3
UCF101 Flow: part1 part2 part3
HMDB51 RGB: part1
HMDB51 Flow: part1
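
Some of the downloads above come in several parts. This README does not say how the parts are packaged, so the sketch below rests on an assumption: that the parts are byte-wise splits of a single archive and share a common file name prefix. The function name join_parts and all file names are hypothetical.

```shell
# Sketch only: reassemble a multi-part download into one archive.
# ASSUMPTION: the parts are byte-wise splits of a single archive, share a
# common prefix, and sort into the right order by name.

# join_parts PREFIX OUTPUT
# Concatenates every file matching PREFIX* (in glob order) into OUTPUT.
join_parts() {
  prefix=$1
  output=$2

  # Replace the function's arguments with the list of matching part files.
  set -- "$prefix"*

  # An unmatched glob is left as a literal string, so check the first entry.
  if [ ! -e "$1" ]; then
    echo "no parts found for prefix: $prefix" >&2
    return 1
  fi

  cat "$@" > "$output"
}
```

With hypothetical part names such as rgb.zip.001, rgb.zip.002 and rgb.zip.003, running join_parts rgb.zip. rgb.zip followed by unzip rgb.zip -d on your data directory would unpack the frames. If the parts turn out to be independent archives, unpack each one separately instead.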

Use it on your own dataset

Our optical flow extraction tool provides OpenCV wrappers for optical flow extraction on a GPU.
