PyTorch implementation of the paper: "DR.VIC: Decomposition and Reasoning for Video Individual Counting, CVPR, 2022"


DRNet for Video Individual Counting (CVPR 2022)

Introduction

This is the official PyTorch implementation of the paper DR.VIC: Decomposition and Reasoning for Video Individual Counting. Unlike single-image counting methods, it counts the total number of pedestrians in a video sequence, with each person counted only once even when they appear in several frames. DRNet decomposes this new task into two parts: estimating the initial crowd number in the first frame, and integrating the differential crowd numbers over a set of following image pairs (each pair being the current frame and the preceding frame).
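The decomposition can be illustrated with a minimal sketch. It assumes that the differential count of each image pair is the estimated number of people who newly enter the scene between the preceding and the current frame; the function and argument names are hypothetical and are not part of the repository's API.

```python
from typing import Sequence


def video_individual_count(initial_count: float, inflows: Sequence[float]) -> float:
    """Combine a first-frame count with per-pair inflow estimates.

    initial_count: estimated number of people in the first frame.
    inflows: assumed per-pair estimates of newly appearing people,
             one value per (preceding frame, current frame) pair.
    """
    return initial_count + sum(inflows)


# Hypothetical numbers, for illustration only.
total = video_individual_count(initial_count=25.0, inflows=[3.0, 0.0, 4.5])
print(total)
```

People who leave the scene do not reduce the total in this sketch, because the task counts every individual once over the whole sequence.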

Catalog

  • Testing Code (2022.3.19)
  • PyTorch pretrained models (2022.3.19)
  • Training Code
    • HT21
    • SenseCrowd

Getting started

Preparation

  • Clone this repo in the directory (Root/DRNet):

  • Install dependencies. We use Python 3.7 and PyTorch >= 1.6.0 (http://pytorch.org).

    conda create -n DRNet python=3.7
    conda activate DRNet
    conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch
    cd ${DRNet}
    pip install -r requirements.txt
  • PreciseRoIPooling for extracting the feature descriptors

    Note: the PreciseRoIPooling [1] module is included in the repo, but you may run into some problems when running the code:

    1. If you are prompted to install ninja, the following commands will help you.
      wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
      sudo unzip ninja-linux.zip -d /usr/local/bin/
      sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force 
    2. If you encounter errors when compiling the PreciseRoIPooling, you can look up the original repo's issues for help.
  • Datasets

    • HT21 dataset: Download the CroHD dataset from this link. Unzip HT21.zip and place HT21 into the folder (Root/dataset/).
    • SenseCrowd dataset: To be updated when it is released.
    • Download the lists of the train/val/test sets from the link (dataset) and place them in the corresponding dataset folders.
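Before training, it can help to confirm that the folders described above are in place. The following is a small sketch that checks only the two directories named in this section (Root/DRNet and Root/dataset/HT21); the helper name `missing_paths` is hypothetical, and the file names of the split lists are not checked because they are not specified here.

```python
from pathlib import Path
from typing import List


def missing_paths(root: Path, dataset: str = "HT21") -> List[Path]:
    """Return the expected directories that do not exist under `root`."""
    expected = [
        root / "DRNet",              # the cloned repository
        root / "dataset" / dataset,  # the unzipped dataset folder
    ]
    return [path for path in expected if not path.is_dir()]
```

For example, `missing_paths(Path("/path/to/Root"))` returns an empty list when both folders exist.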

Training

Check the following parameters in config.py before training:

  • Use __C.DATASET = 'HT21' to set the dataset (default: HT21).
  • Use __C.GPU_ID = '0' to set the GPU.
  • Use __C.MAX_EPOCH = 20 to set the number of training epochs (default: 20).
  • Use __C.EXP_PATH = os.path.join('./exp', __C.DATASET) to set the directory for saving the code, weights, and resume point.
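Put together, the four settings above might look like the sketch below. It uses a plain `SimpleNamespace` named `cfg` as a stand-in for the `__C` object, which is an assumption of this example; the repository's config.py may build its configuration object differently and contains more options.

```python
import os
from types import SimpleNamespace

# `cfg` stands in for `__C` in config.py (assumed container type).
cfg = SimpleNamespace()
cfg.DATASET = "HT21"    # dataset name
cfg.GPU_ID = "0"        # GPU to use
cfg.MAX_EPOCH = 20      # number of training epochs
# Directory for saved code, weights, and the resume point.
cfg.EXP_PATH = os.path.join("./exp", cfg.DATASET)

print(cfg.EXP_PATH)
```

Because EXP_PATH is derived from DATASET, switching the dataset also switches the output directory.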

Check the other parameters (TRAIN_BATCH_SIZE, TRAIN_SIZE, etc.) in Root/DRNet/datasets/setting in case your GPU does not have enough memory for the default settings.

  • Run python train.py.

Tip: Training takes ~10 hours on the HT21 dataset with one TITAN RTX (24 GB memory).

Testing

To reproduce the performance, download the pre-trained models and place the pretrained_models folder in Root/DRNet/model/.

  • For HT21:
    • Run python test_HT21.py.
  • For SenseCrowd:
    • Run python test_SENSE.py. The output file (*_SENSE_cnt.py) will then be generated.

Performance

The results on HT21 and SenseCrowd.

  • HT21 dataset
    Method                                   CroHD11~CroHD15                 MAE/MSE/MRAE(%)
    Paper: VGG+FPN [2,3]                     164.6/1075.5/752.8/784.5/382.3  141.1/192.3/27.4
    This Repo's Reproduction: VGG+FPN [2,3]  138.4/1017.5/623.9/659.8/348.5  160.7/217.3/25.1
  • SenseCrowd dataset
    Method                                   MAE/MSE/MRAE(%)  MIAE/MOAE  D0~D4 (for MAE)
    Paper: VGG+FPN [2,3]                     12.3/24.7/12.7   1.98/2.01  4.1/8.0/23.3/50.0/77.0
    This Repo's Reproduction: VGG+FPN [2,3]  11.7/24.6/11.7   1.99/1.88  3.6/6.8/22.4/42.6/85.2
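For readers unfamiliar with the MAE/MSE/MRAE columns, here is a minimal sketch of how such per-video counting errors are commonly computed. It assumes the usual crowd-counting convention in which "MSE" is reported as the root of the mean squared error and MRAE is the mean relative absolute error in percent; the repository's evaluation code may differ, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def counting_errors(pred: Sequence[float], gt: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MAE, MSE, MRAE in %) over per-video counts.

    Assumes every ground-truth count is positive.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    abs_err = [abs(p - g) for p, g in zip(pred, gt)]
    n = len(abs_err)
    mae = sum(abs_err) / n
    mse = math.sqrt(sum(e * e for e in abs_err) / n)  # root of mean squared error
    mrae = 100.0 * sum(e / g for e, g in zip(abs_err, gt)) / n
    return mae, mse, mrae


# Toy counts for illustration only (not taken from the tables above).
print(counting_errors(pred=[10.0, 20.0], gt=[12.0, 16.0]))
```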

Video Demo

Please visit bilibili or YouTube to watch the video demonstration.

References

  1. Acquisition of Localization Confidence for Accurate Object Detection, ECCV, 2018.
  2. Very Deep Convolutional Networks for Large-scale Image Recognition, arXiv, 2014.
  3. Feature Pyramid Networks for Object Detection, CVPR, 2017.

Citation

If you find this project useful for your research, please cite:

@inproceedings{han2022drvic,
  title={DR.VIC: Decomposition and Reasoning for Video Individual Counting},
  author={Han, Tao and Bai, Lei and Gao, Junyu and Wang, Qi and Ouyang, Wanli},
  booktitle={CVPR},
  year={2022}
}

Acknowledgement

The released PyTorch training script borrows some code from the C^3 Framework and SuperGlue repositories. If you find this repo helpful for your research, please consider citing them.

Owner
tao han