Multiple-Object Tracking with Transformer

Last update: Jan 04, 2023

Overview

TransTrack: Multiple-Object Tracking with Transformer

Introduction

TransTrack: Multiple-Object Tracking with Transformer

Models

Training data	Training time	Validation MOTA	download
crowdhuman, mot_half	36h + 1h	65.4	model
crowdhuman	36h	53.8	model
mot_half	8h	61.6	model

Models are also available on Baidu Drive with access code m4iv.
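
The Validation MOTA column reports Multiple Object Tracking Accuracy. As a reminder of what that number measures, here is a minimal sketch of the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT. The function name and the counts in the example are illustrative only; they are not taken from the repository or from its results.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Return MOTA as a fraction (1.0 is perfect; the value can be negative)."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Made-up counts, chosen only to show the arithmetic.
score = mota(false_negatives=200, false_positives=100,
             id_switches=50, num_gt=1000)
print(f"MOTA = {100 * score:.1f}")
```

Because all three error types are summed over the whole sequence set, a change of a few hundred boxes in either direction is enough to move the score by a point.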

Notes

The CrowdHuman-trained model and the MOT-trained model are evaluated with different commands; see Steps.
We observe about 1 point of MOTA noise.
If the MOTA of your self-trained model is lower than expected, adjusting --track_thresh sometimes gives better performance.
The training time is measured on 8 NVIDIA V100 GPUs with a batch size of 16.
We use models pre-trained on ImageNet.
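
The --track_thresh note above concerns a confidence threshold. The sketch below only illustrates the general idea of such a threshold: detections scoring below it are discarded before they can start or extend a track. The data layout and the function name filter_by_score are hypothetical, and the repository's own handling of --track_thresh may differ.

```python
from typing import List, Tuple

# Hypothetical detection layout: ((x, y, w, h), confidence score).
Detection = Tuple[Tuple[float, float, float, float], float]


def filter_by_score(detections: List[Detection], thresh: float) -> List[Detection]:
    """Keep only detections whose score is at least `thresh`."""
    return [det for det in detections if det[1] >= thresh]


# Made-up detections for a single frame.
dets: List[Detection] = [
    ((10.0, 20.0, 50.0, 80.0), 0.92),
    ((200.0, 40.0, 30.0, 60.0), 0.41),
    ((90.0, 15.0, 45.0, 70.0), 0.28),
]

for thresh in (0.3, 0.4, 0.5):
    kept = filter_by_score(dets, thresh)
    print(f"thresh={thresh}: kept {len(kept)} of {len(dets)}")
```

A higher threshold removes more false positives but also drops more true objects, so the best value depends on how confident a particular trained model is.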

Demo

Installation

The codebase is built on top of Deformable DETR and CenterTrack.

Requirements

Linux, CUDA>=9.2, GCC>=5.4
Python>=3.7
PyTorch>=1.5 and a torchvision version that matches the PyTorch installation. Installing both together from pytorch.org ensures that they match.
OpenCV is optional; it is needed only for the demo and visualization.
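
The Python-side requirements above can be checked before building anything. This is a minimal sketch, not part of the repository: the helper names parse_version and check_requirements are made up for illustration, and the script does not check the GCC version or the installed CUDA toolkit version.

```python
import importlib.util
import re
import sys
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.5.0+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements() -> Dict[str, bool]:
    """Report which of the listed Python-side requirements are satisfied."""
    report = {"python>=3.7": sys.version_info[:2] >= (3, 7)}
    try:
        import torch
    except ImportError:
        report["torch>=1.5"] = False
    else:
        report["torch>=1.5"] = parse_version(torch.__version__) >= (1, 5)
        report["cuda visible to torch"] = torch.cuda.is_available()
    # OpenCV is only needed for the demo and visualization.
    report["opencv (optional)"] = importlib.util.find_spec("cv2") is not None
    return report


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'not satisfied'}")
```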

Steps

Install and build libs

git clone https://github.com/PeizeSun/TransTrack.git
cd TransTrack
cd models/ops
python setup.py build install
cd ../..
pip install -r requirements.txt
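
After the build step, it can be useful to confirm that the compiled extension is importable. The sketch below assumes the extension is installed under the module name MultiScaleDeformableAttention, which is the name used by Deformable DETR's models/ops; that name is an assumption here and should be checked against models/ops/setup.py.

```python
import importlib.util

# Assumption: module name taken from Deformable DETR's ops package.
OPS_MODULE = "MultiScaleDeformableAttention"


def ops_installed(module_name: str = OPS_MODULE) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if ops_installed():
    print("compiled ops found")
else:
    print("compiled ops not found; re-run the build in models/ops")
```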

Prepare dataset

mkdir -p crowdhuman/annotations
cp -r /path_to_crowdhuman_dataset/annotations/CrowdHuman_val.json crowdhuman/annotations/CrowdHuman_val.json
cp -r /path_to_crowdhuman_dataset/annotations/CrowdHuman_train.json crowdhuman/annotations/CrowdHuman_train.json
cp -r /path_to_crowdhuman_dataset/CrowdHuman_train crowdhuman/CrowdHuman_train
cp -r /path_to_crowdhuman_dataset/CrowdHuman_val crowdhuman/CrowdHuman_val
mkdir mot
cp -r /path_to_mot_dataset/train mot/train
cp -r /path_to_mot_dataset/test mot/test
python track_tools/convert_mot_to_coco.py
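
The directory layout created by the commands above can be verified with a short script. This is a sketch rather than part of the repository; the path list simply mirrors the copy destinations in the commands, and the name missing_dataset_paths is made up for illustration.

```python
from pathlib import Path
from typing import List

# Destinations of the mkdir/cp commands, relative to the TransTrack root.
EXPECTED_PATHS = [
    "crowdhuman/annotations/CrowdHuman_val.json",
    "crowdhuman/annotations/CrowdHuman_train.json",
    "crowdhuman/CrowdHuman_train",
    "crowdhuman/CrowdHuman_val",
    "mot/train",
    "mot/test",
]


def missing_dataset_paths(root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (base / rel).exists()]


if __name__ == "__main__":
    missing = missing_dataset_paths()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("dataset layout looks complete")
```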

The CrowdHuman dataset is available from CrowdHuman. We provide annotations in JSON format.

The MOT dataset is available from MOT.
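
To give a sense of what converting MOT annotations to COCO format involves, here is a minimal sketch for a single ground-truth line. It assumes the MOTChallenge ground-truth field order (frame, id, left, top, width, height, ...) and is not the repository's convert_mot_to_coco.py; the function name, the category_id of 1, and the track_id key are illustrative assumptions.

```python
from typing import Dict


def mot_line_to_coco(line: str, ann_id: int, image_id: int) -> Dict:
    """Turn one MOT ground-truth line into a COCO-style annotation dict."""
    fields = line.strip().split(",")
    track_id = int(fields[1])
    x, y, w, h = (float(value) for value in fields[2:6])
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,        # assumption: a single "person" category
        "track_id": track_id,    # assumption: identity kept as an extra key
        "bbox": [x, y, w, h],    # COCO boxes are [left, top, width, height]
        "area": w * h,
        "iscrowd": 0,
    }


# Made-up ground-truth line in MOTChallenge layout.
example = "1,3,794.2,47.5,71.2,174.8,1,1,0.8"
print(mot_line_to_coco(example, ann_id=1, image_id=1))
```

Both formats store boxes as top-left corner plus size, so the box itself needs no coordinate conversion; most of the work is bookkeeping of image and annotation ids.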

Pre-train on crowdhuman

sh track_exps/crowdhuman_train.sh
python track_tools/crowdhuman_model_to_mot.py

The pre-trained model is available as crowdhuman_final.pth.

Train TransTrack

sh track_exps/crowdhuman_mot_trainhalf.sh
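
The script name and the mot_half entry in the Models table suggest that only half of each MOT training sequence is used for training, with the remainder held out for validation. The sketch below shows one such split under the assumption that the first half of the frames goes to training and the second half to validation; the exact boundaries used by the repository's conversion script may differ, and split_half is a hypothetical helper.

```python
from typing import List, Tuple


def split_half(frame_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Split a sequence's frame ids into (first half, second half)."""
    middle = len(frame_ids) // 2
    return frame_ids[:middle], frame_ids[middle:]


# A made-up 10-frame sequence with 1-based frame ids.
frames = list(range(1, 11))
train_frames, val_frames = split_half(frames)
print("train:", train_frames)
print("val:  ", val_frames)
```

Splitting within each sequence, rather than across sequences, keeps every scene represented in both halves.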

Evaluate TransTrack

sh track_exps/mot_val.sh
sh track_exps/mot_eval.sh

Visualize TransTrack

python track_tools/txt2video.py
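
The visualization step turns tracking results stored as text into a video. The sketch below covers only the first half of that idea: grouping boxes by frame so they can be drawn. It assumes result lines in MOTChallenge layout (frame, id, left, top, width, height, score, ...); the format actually read by txt2video.py is not stated here, and group_tracks_by_frame is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Box = Tuple[int, float, float, float, float]  # (track_id, x, y, w, h)


def group_tracks_by_frame(lines: Iterable[str]) -> Dict[int, List[Box]]:
    """Map each frame number to the tracked boxes reported for it."""
    frames: Dict[int, List[Box]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        frame, track_id = int(fields[0]), int(fields[1])
        x, y, w, h = (float(value) for value in fields[2:6])
        frames[frame].append((track_id, x, y, w, h))
    return dict(frames)


# Made-up result lines.
results = [
    "1,1,100,50,40,90,0.9,-1,-1,-1",
    "1,2,300,60,35,85,0.8,-1,-1,-1",
    "2,1,104,51,40,90,0.9,-1,-1,-1",
]
by_frame = group_tracks_by_frame(results)
for frame in sorted(by_frame):
    print(frame, by_frame[frame])
```

Drawing each frame's boxes with a color derived from the track id then makes identity switches easy to spot.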

Notes

Evaluate pre-trained CrowdHuman model on MOT

sh track_exps/det_val.sh
sh track_exps/mot_eval.sh

License

TransTrack is released under the MIT License.

Citing

If you use TransTrack in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:

@article{transtrack,
  title   =  {TransTrack: Multiple-Object Tracking with Transformer},
  author  =  {Peize Sun and Yi Jiang and Rufeng Zhang and Enze Xie and Jinkun Cao and Xinting Hu and Tao Kong and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2012.15460},
  year    =  {2020}
}

Owner

Peize Sun
