Implementation of the CVPR 2021 paper "Online Multiple Object Tracking with Cross-Task Synergy"


Online Multiple Object Tracking with Cross-Task Synergy

This repository is the implementation of the CVPR 2021 paper "Online Multiple Object Tracking with Cross-Task Synergy".

[Figure: Structure of TADAM]

Installation

Tested on python=3.8 with torch=1.8.1 and torchvision=0.9.1.

It should also be compatible with python>=3.6, torch>=1.4.0 and torchvision>=0.4.0. Not tested on lower versions.

1. Clone the repository

git clone https://github.com/songguocode/TADAM.git

2. Create conda env and activate

conda create -n TADAM python=3.8
conda activate TADAM

3. Install required packages

pip install torch torchvision scipy opencv-python yacs

All models are configured to run on a GPU, so make sure the graphics card driver and CUDA are properly installed.

To check if torch is running with CUDA, run in python:

import torch
torch.cuda.is_available()

It is working if True is returned.
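For a slightly more informative check, the same call can be wrapped in a small report. This is only a sketch: the function name cuda_report is hypothetical and not part of the repository, and it relies only on torch.__version__, torch.cuda.is_available() and torch.cuda.device_count().

```python
def cuda_report():
    """Return a small dict describing the local torch/CUDA setup."""
    try:
        # Imported here so that a missing install is reported instead of raised.
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "gpu_count": 0}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": available,
        # Only query the device count when CUDA is usable.
        "gpu_count": torch.cuda.device_count() if available else 0,
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available is False, training and testing as configured in this repository are not expected to run.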

See the official PyTorch site if torch is not installed or not working properly.

4. Clone MOTChallenge benchmark evaluation code

git clone https://github.com/JonathonLuiten/TrackEval.git

By now there should be two folders, TADAM and TrackEval.

Refer to MOTChallenge-Official for instructions.

Download the provided data.zip, unzip it as a folder named data, and copy it inside TrackEval as TrackEval/data.

Move into the TADAM folder:

cd TADAM

5. Prepare MOTChallenge data

Download MOT16, MOT17, MOT17Det, and MOT20 and place them inside a datasets folder.

Two options to provide datasets location for training/testing:

  • a. Add a symbolic link inside TADAM folder by ln -s path_of_datasets datasets
  • b. In TADAM/configs/config.py, assign __C.PATHS.DATASET_ROOT with path_of_datasets
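Option a can also be done from Python, which may be convenient in a setup script. The sketch below is an illustration only: link_datasets is a hypothetical helper, not part of the repository, and it assumes a platform where symbolic links are permitted.

```python
from pathlib import Path


def link_datasets(datasets_path, repo_path="."):
    """Create <repo_path>/datasets as a symlink to datasets_path (equivalent to ln -s)."""
    target = Path(datasets_path).resolve()
    if not target.is_dir():
        raise FileNotFoundError(f"datasets folder not found: {target}")

    link = Path(repo_path) / "datasets"
    # Leave an existing link or folder untouched rather than overwriting it.
    if not (link.is_symlink() or link.exists()):
        link.symlink_to(target, target_is_directory=True)
    return link


if __name__ == "__main__":
    # Replace path_of_datasets with the real location, as in option a.
    print(link_datasets("path_of_datasets"))
```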

6. Download Models

The training base of TADAM is a detector pretrained on COCO. The base model coco_checkpoint.pth is provided on Google Drive.

Trained models are also provided for reference:

  • TADAM_MOT16.pth
  • TADAM_MOT17.pth
  • TADAM_MOT20.pth

Create a folder output/models and place all models inside.
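A quick way to confirm that the folder is set up as described is to list which of the files named above are present. This is a minimal sketch; check_models is a hypothetical helper, and the file names are taken from the list above.

```python
from pathlib import Path

# File names as listed in this section.
MODEL_FILES = (
    "coco_checkpoint.pth",
    "TADAM_MOT16.pth",
    "TADAM_MOT17.pth",
    "TADAM_MOT20.pth",
)


def check_models(model_dir="output/models", names=MODEL_FILES):
    """Create model_dir if needed and report which model files it contains."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    return {name: (model_dir / name).is_file() for name in names}


if __name__ == "__main__":
    for name, present in check_models().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Only the models that are actually needed have to be present, for example coco_checkpoint.pth for training or one of the trained models for testing.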

Train

  1. Training on single GPU, for MOT17 as an example
python -m lib.training.train TADAM_MOT17 --config TADAM_MOT17

The first TADAM_MOT17 specifies the output name of the trained model, which can be changed as preferred.

The second TADAM_MOT17 refers to the config file lib/configs/TADAM_MOT17.yaml, which loads the training parameters. Switch the config to train on the respective dataset. Config files are located in lib/configs.

  2. Training on multiple GPUs with Distributed Data Parallel
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=2 --use_env -m lib.training.train TADAM_MOT17 --config TADAM_MOT17

The --nproc_per_node=2 argument specifies how many GPUs are used for training. Here, 2 cards are used.

The trained model will be stored inside output/models under the specified output name.
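The two training commands above differ only in the launcher prefix, so they can be generated from one place when scripting several runs. The helper below is a hypothetical sketch, not part of the repository; it only reassembles the flags shown above.

```python
def build_train_command(output_name, config, num_gpus=1):
    """Return (extra_env, argv) for single-GPU or multi-GPU training."""
    train_args = ["-m", "lib.training.train", output_name, "--config", config]
    if num_gpus <= 1:
        return {}, ["python"] + train_args

    # Multi-GPU: prefix with the distributed launcher, as in the command above.
    launcher = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}", "--use_env",
    ]
    return {"OMP_NUM_THREADS": "1"}, launcher + train_args


if __name__ == "__main__":
    env, argv = build_train_command("TADAM_MOT17", "TADAM_MOT17", num_gpus=2)
    prefix = " ".join(f"{k}={v}" for k, v in env.items())
    print((prefix + " " + " ".join(argv)).strip())
```

To actually start training from Python, the returned values could be passed to subprocess.run(argv, env={**os.environ, **env}, check=True) from inside the TADAM folder.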

Evaluate

python -m lib.tracking.test_tracker --result-name xxx --config TADAM_MOT17 --evaluation

Change xxx to the preferred result name. The --evaluation flag runs evaluation right after the tracking results are obtained; remove it to produce results without evaluation. Evaluation requires result files for all sequences of the specified dataset.

Either run evaluation after training, or download and test the provided trained models.

Note that if the output name of the trained model is changed, the new name must also be set in the corresponding .yaml config file, i.e. replace the value in the line MODEL: TADAM_MOT17.pth with the expected model file name.

Code from TrackEval is used for evaluation, and it is set to run on multiple cores (8 cores) by default.

To run an evaluation after obtaining tracking results (with sequences result files), run:

python -m lib.utils.official_benchmark --result-name xxx --config TADAM_MOT17

Replace xxx with the result name, and choose config accordingly.

Tracking results can be found in output/results under the respective dataset name folders. Detailed results are stored in an xxx_detailed.csv file, while the summary is given in an xxx_summary.txt file.
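To compare several runs, it can help to load the summary into a dictionary. The sketch below assumes the summary file consists of one line of whitespace-separated metric names followed by one line of values; that layout is an assumption and should be checked against an actual xxx_summary.txt. parse_summary is a hypothetical helper.

```python
def parse_summary(text):
    """Parse a two-line summary (names, then values) into {name: float}.

    Assumes whitespace-separated columns; adjust if the real file differs.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a value line")

    names, values = lines[0].split(), lines[1].split()
    if len(names) != len(values):
        raise ValueError("header and value counts differ")
    return {name: float(value) for name, value in zip(names, values)}


if __name__ == "__main__":
    # Illustrative excerpt only, using three of the MOT16 values listed below.
    sample = "MOTA MOTP IDF1\n63.7 91.6 68.0\n"
    print(parse_summary(sample))
```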

Results for reference

The evaluation results on the train sets are given here for reference. See the paper for the reported test-set results.

  • MOT16
MOTA	MOTP	MODA	CLR_Re	CLR_Pr	MTR	PTR	MLR	CLR_TP	CLR_FN
63.7	91.6	63.9	64.5	99.0	35.6	40.8	23.6	71242	39165
CLR_FP	IDSW	MT	PT	ML	Frag	sMOTA	IDF1	IDR	IDP
689	186	184	211	122	316	58.3	68.0	56.2	86.2
IDTP	IDFN	IDFP	Dets	GT_Dets	IDs	GT_IDs
62013	48394	9918	71931	110407	446	517
  • MOT17
MOTA	MOTP	MODA	CLR_Re	CLR_Pr	MTR	PTR	MLR	CLR_TP	CLR_FN
68.0	91.3	68.2	69.0	98.8	43.5	37.5	19.0	232600	104291
CLR_FP	IDSW	MT	PT	ML	Frag	sMOTA	IDF1	IDR	IDP
2845	742	712	615	311	1182	62.0	71.6	60.8	87.0
IDTP	IDFN	IDFP	Dets	GT_Dets	IDs	GT_IDs
204819	132072	30626	235445	336891	1455	1638
  • MOT20
MOTA	MOTP	MODA	CLR_Re	CLR_Pr	MTR	PTR	MLR	CLR_TP	CLR_FN
80.2	87.0	80.4	82.2	97.9	64.0	28.8	7.18	932899	201715
CLR_FP	IDSW	MT	PT	ML	Frag	sMOTA	IDF1	IDR	IDP
20355	2275	1418	638	159	2737	69.5	72.3	66.5	79.2
IDTP	IDFN	IDFP	Dets	GT_Dets	IDs	GT_IDs
754621	379993	198633	953254	1134614	2953	2215

Results could differ slightly, and small variations should be acceptable.
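The percentage columns in the tables follow from the raw counts. As a sanity check, the sketch below recomputes several MOT16 metrics from the counts listed above, using the standard CLEAR MOT and identity-metric definitions (for example, MOTA = 1 - (FN + FP + IDSW) / GT_Dets). The function name is hypothetical; the numbers are copied from the MOT16 table.

```python
# Raw counts copied from the MOT16 table above.
MOT16 = {
    "CLR_TP": 71242, "CLR_FN": 39165, "CLR_FP": 689, "IDSW": 186,
    "IDTP": 62013, "IDFN": 48394, "IDFP": 9918, "GT_Dets": 110407,
}


def summary_metrics(c):
    """Recompute percentage metrics from raw counts, rounded to one decimal."""
    gt = c["GT_Dets"]
    metrics = {
        "MOTA": 1 - (c["CLR_FN"] + c["CLR_FP"] + c["IDSW"]) / gt,
        "MODA": 1 - (c["CLR_FN"] + c["CLR_FP"]) / gt,
        "CLR_Re": c["CLR_TP"] / (c["CLR_TP"] + c["CLR_FN"]),
        "CLR_Pr": c["CLR_TP"] / (c["CLR_TP"] + c["CLR_FP"]),
        "IDF1": 2 * c["IDTP"] / (2 * c["IDTP"] + c["IDFP"] + c["IDFN"]),
        "IDR": c["IDTP"] / (c["IDTP"] + c["IDFN"]),
        "IDP": c["IDTP"] / (c["IDTP"] + c["IDFP"]),
    }
    return {name: round(100 * value, 1) for name, value in metrics.items()}


if __name__ == "__main__":
    print(summary_metrics(MOT16))
```

The recomputed values agree with the MOT16 row above (MOTA 63.7, MODA 63.9, CLR_Re 64.5, CLR_Pr 99.0, IDF1 68.0, IDR 56.2, IDP 86.2), which is a convenient way to check that a reproduced result file is internally consistent.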

Visualization

A visualization tool is provided to preview datasets' ground-truths, provided detections, and generated tracking results.

python -m lib.utils.visualization --config TADAM_MOT17 --which-set train --sequence 02 --public-detection FRCNN --result xxx --start-frame 1 --scale 0.8

Specify config files, train/test split, and sequence with --config, --which-set, --sequence respectively. --public-detection should only be specified for MOT17.

Replace xxx in --result xxx with the name of the tracking results. --start-frame 1 starts viewing from frame 1, while --scale 0.8 resizes the viewing window by the given ratio.

Commands in visualization window:

  • "<": previous frame
  • ">": next frame
  • "t": toggle between viewing ground_truths, provided detections, and tracking results
  • "s": save current frame with all rendered elements
  • "h": hide frame information on window's top-left corner
  • "i": hide identity index on bounding boxes' top-left corner
  • "Esc" or "q": exit program
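The key bindings above amount to a small state machine over the current frame, the view mode, and two display toggles. The sketch below illustrates that idea in plain Python; it is not the repository's implementation. The state layout, the name handle_key, the order of the view cycle, and the clamping at the first and last frame are all assumptions.

```python
# Assumed cycle order for the "t" key.
VIEWS = ("ground_truths", "detections", "results")


def handle_key(key, state, num_frames):
    """Return a new viewer state after one key press (hypothetical sketch)."""
    state = dict(state)  # do not mutate the caller's state
    if key == "<":
        state["frame"] = max(1, state["frame"] - 1)
    elif key == ">":
        state["frame"] = min(num_frames, state["frame"] + 1)
    elif key == "t":
        index = VIEWS.index(state["view"])
        state["view"] = VIEWS[(index + 1) % len(VIEWS)]
    elif key == "h":
        state["show_info"] = not state["show_info"]
    elif key == "i":
        state["show_ids"] = not state["show_ids"]
    elif key in ("q", "Esc"):
        state["running"] = False
    # "s" (save frame) would be handled by the rendering code, omitted here.
    return state


if __name__ == "__main__":
    state = {"frame": 1, "view": "ground_truths", "show_info": True,
             "show_ids": True, "running": True}
    for key in (">", ">", "t", "i"):
        state = handle_key(key, state, num_frames=600)
    print(state)
```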

Pretrain detector on COCO

The base detector is pretrained on the COCO dataset before training on MOT. A Faster-RCNN FPN with a ResNet101 backbone is adopted in this code; it can be replaced by other similar detectors with code modifications.

Refer to Object detection reference training scripts on how to train a PyTorch-based detector.

See Tracking without bells and whistles for a hands-on Jupyter notebook, which is also based on the aforementioned reference code.
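For orientation, a Faster-RCNN FPN with a ResNet101 backbone like the one described above can be assembled from torchvision building blocks. This is a sketch under assumptions, not the repository's model definition: build_detector is a hypothetical name, num_classes=91 follows torchvision's COCO convention, and no pretrained weights are loaded.

```python
def build_detector(num_classes=91):
    """Build an untrained Faster R-CNN with a ResNet101-FPN backbone."""
    # Imported lazily so this module can be loaded without torchvision installed.
    from torchvision.models.detection import FasterRCNN
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    # pretrained=False: start from random weights, nothing is downloaded.
    # Recent torchvision releases may emit a deprecation warning for this argument.
    backbone = resnet_fpn_backbone(backbone_name="resnet101", pretrained=False)
    return FasterRCNN(backbone, num_classes=num_classes)


if __name__ == "__main__":
    model = build_detector()
    print(type(model).__name__)
```

Training such a model on COCO would then follow the reference training scripts mentioned above.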

Publication

If you use the code in your research, please cite:

@InProceedings{TADAM_2021_CVPR,
    author = {Guo, Song and Wang, Jingya and Wang, Xinchao and Tao, Dacheng},
    title = {Online Multiple Object Tracking With Cross-Task Synergy},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021},
}