FAC-Net

Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization
Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)

Paper: arXiv, ICCV

Overview

We argue that existing methods for weakly supervised temporal action localization cannot guarantee foreground-action consistency, that is, that the foreground and the actions are mutually inclusive. To address this issue, we propose a novel method named Foreground-Action Consistency Network (FAC-Net). Experimental results on THUMOS14 are shown below.

| Method \ mAP(%) | @0.1 | @0.2 | @0.3 | @0.4 | @0.5 | @0.6 | @0.7 | AVG |
|:----------------|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:---:|
| UntrimmedNet | 44.4 | 37.7 | 28.2 | 21.1 | 13.7 | - | - | - |
| STPN | 52.0 | 44.7 | 35.5 | 25.8 | 16.9 | 9.9 | 4.3 | 27.0 |
| W-TALC | 55.2 | 49.6 | 40.1 | 31.1 | 22.8 | - | 7.6 | - |
| AutoLoc | - | - | 35.8 | 29.0 | 21.2 | 13.4 | 5.8 | - |
| CleanNet | - | - | 37.0 | 30.9 | 23.9 | 13.9 | 7.1 | - |
| MAAN | 59.8 | 50.8 | 41.1 | 30.6 | 20.3 | 12.0 | 6.9 | 31.6 |
| CMCS | 57.4 | 50.8 | 41.2 | 32.1 | 23.1 | 15.0 | 7.0 | 32.4 |
| BM | 60.4 | 56.0 | 46.6 | 37.5 | 26.8 | 17.6 | 9.0 | 36.3 |
| RPN | 62.3 | 57.0 | 48.2 | 37.2 | 27.9 | 16.7 | 8.1 | 36.8 |
| DGAM | 60.0 | 54.2 | 46.8 | 38.2 | 28.8 | 19.8 | 11.4 | 37.0 |
| TSCN | 63.4 | 57.6 | 47.8 | 37.7 | 28.7 | 19.4 | 10.2 | 37.8 |
| EM-MIL | 59.1 | 52.7 | 45.5 | 36.8 | 30.5 | 22.7 | 16.4 | 37.7 |
| BaS-Net | 58.2 | 52.3 | 44.6 | 36.0 | 27.0 | 18.6 | 10.4 | 35.3 |
| A2CL-PT | 61.2 | 56.1 | 48.1 | 39.0 | 30.1 | 19.2 | 10.6 | 37.8 |
| ACM-BANet | 64.6 | 57.7 | 48.9 | 40.9 | 32.3 | 21.9 | 13.5 | 39.9 |
| HAM-Net | 65.4 | 59.0 | 50.3 | 41.1 | 31.0 | 20.7 | 11.1 | 39.8 |
| UM | 67.5 | 61.2 | 52.3 | 43.4 | 33.7 | 22.9 | 12.1 | 41.9 |
| FAC-Net (Ours) | 67.6 | 62.1 | 52.6 | 44.3 | 33.4 | 22.5 | 12.7 | 42.2 |

Prerequisites

Recommended Environment

  • Python 3.6
  • PyTorch 1.2
  • Tensorboard Logger
  • CUDA 10.0
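
As a minimal setup sketch (assuming conda and pip are used; the environment name and exact package versions below are illustrative, not prescribed by the authors), the recommended environment could be created as follows.

$ conda create -n fac-net python=3.6 -y        # environment name is an assumption
$ conda activate fac-net
$ pip install torch==1.2.0 torchvision==0.4.0  # PyTorch 1.2 wheels built against CUDA 10.0
$ pip install tensorboard_logger               # assumed package for the Tensorboard Logger dependency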

Data Preparation

  1. Prepare THUMOS'14 dataset.

    • We recommend using features and annotations provided by this repo.
  2. Place the features and annotations inside a dataset/Thumos14reduced/ folder.
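
For reference, a possible layout of the resulting folder is sketched below; the names in angle brackets are placeholders, not the exact file names shipped with the linked repo.

dataset/
└── Thumos14reduced/
    ├── <I3D-features>.npy      # pre-extracted RGB/flow features
    └── <annotations>/          # THUMOS'14 annotation files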

Usage

Training

You can easily train the model by running the provided script.

  • Refer to train_options.py and set the dataset-root argument to the path of your dataset folder.

  • Run the command below.

$ python train_main.py --run-type 0 --model-id 1   # rgb stream
$ python train_main.py --run-type 1 --model-id 2   # flow stream

Make sure to use a different model-id for the RGB and optical-flow streams. Models are saved in ./ckpt/dataset_name/model_id/.
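
As a quick sanity check (assuming dataset_name resolves to Thumos14reduced, matching the folder from Data Preparation), you can list the two checkpoint folders after training:

$ ls ./ckpt/Thumos14reduced/1/   # checkpoints of the rgb stream (model-id 1)
$ ls ./ckpt/Thumos14reduced/2/   # checkpoints of the flow stream (model-id 2)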

Evaluation

The trained models can be found here. Please rename each file to xxx.pkl (e.g., 100.pkl) and place it in ./ckpt/dataset_name/model_id/. You can then evaluate the models following the two stream evaluation process described below.

Single stream evaluation

  • Run the command below.
$ python train_main.py --pretrained --run-type 2 --model-id 1 --load-epoch 100  # rgb stream
$ python train_main.py --pretrained --run-type 3 --model-id 2 --load-epoch 100  # flow stream

load-epoch refers to the epoch of the best model. The best model does not always occur at epoch 100; please refer to the log in the same folder as the saved models to find the best epoch and set load-epoch accordingly. Make sure the model-id matches the one used during training.
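
For example (the epoch number 85 is illustrative, and the checkpoint path again assumes dataset_name is Thumos14reduced), once the log indicates the best epoch you would evaluate that checkpoint directly:

$ ls ./ckpt/Thumos14reduced/1/                                                   # inspect the log and saved checkpoints
$ python train_main.py --pretrained --run-type 2 --model-id 1 --load-epoch 85    # rgb stream, best epoch taken from the log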

Two stream evaluation

  • Run the command below using our provided models.
$ python test_main.py --rgb-model-id 1 --flow-model-id 2 --rgb-load-epoch 100 --flow-load-epoch 100

References

We referenced the repos below for the code.

If you find this code useful, please cite our paper.

@InProceedings{Huang_2021_ICCV,
    author    = {Huang, Linjiang and Wang, Liang and Li, Hongsheng},
    title     = {Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {8002-8011}
}

Contact

If you have any questions or comments, please contact the first author of the paper, Linjiang Huang ([email protected]).
