SwinTrack: A Simple and Strong Baseline for Transformer Tracking

Overview

SwinTrack

This is the official repo for SwinTrack.

banner

A Simple and Strong Baseline

performance

Prerequisites

Environment

conda (recommended)

conda create -y -n SwinTrack
conda activate SwinTrack
conda install -y anaconda
conda install -y pytorch torchvision cudatoolkit -c pytorch
conda install -y -c fvcore -c iopath -c conda-forge fvcore
pip install wandb
pip install timm

pip

pip install -r requirements.txt

Dataset

Download

Unzip

The paths should be organized as following:

lasot
├── airplane
├── basketball
...
├── training_set.txt
└── testing_set.txt

lasot_extension
├── atv
├── badminton
...
└── wingsuit

got-10k
├── train
│   ├── GOT-10k_Train_000001
│   ...
├── val
│   ├── GOT-10k_Val_000001
│   ...
└── test
    ├── GOT-10k_Test_000001
    ...
    
trackingnet
├── TEST
├── TRAIN_0
...
└── TRAIN_11

coco2017
├── annotations
│   ├── instances_train2017.json
│   └── instances_val2017.json
└── images
    ├── train2017
    │   ├── 000000000009.jpg
    │   ├── 000000000025.jpg
    │   ...
    └── val2017
        ├── 000000000139.jpg
        ├── 000000000285.jpg
        ...

Prepare path.yaml

Copy path.template.yaml as path.yaml and fill in the paths.

LaSOT_PATH: '/path/to/lasot'
LaSOT_Extension_PATH: '/path/to/lasot_ext'
GOT10k_PATH: '/path/to/got10k'
TrackingNet_PATH: '/path/to/trackingnet'
COCO_2017_PATH: '/path/to/coco2017'

Prepare dataset metadata cache (optional)

Download the metadata cache from google drive, and unzip it in datasets/cache/

datasets
└── cache
    ├── SingleObjectTrackingDataset_MemoryMapped
    │   └── filtered
    │       ├── got-10k-got10k_vot_train_split-train-3c1ffeb0c530522f0345d088b2f72168.np
    │       ...
    └── DetectionDataset_MemoryMapped
        └── filtered
            └── coco2017-nocrowd-train-bcd5bf68d4b87619ab451fe293098401.np

Login to wandb

Register an account at wandb, then login with command:

wandb login

Training & Evaluation

Train and evaluate on a single GPU

# Tiny
python main.py SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers

# Base
python main.py SwinTrack Base --output_dir /path/to/output -W $num_dataloader_workers

# Base-384
python main.py SwinTrack Base-384 --output_dir /path/to/output -W $num_dataloader_workers

--output_dir is optional, -W defaults to 4.

note: our code performs evaluation automatically when training is done, output is saved in /path/to/output/test_metrics.

Train and evaluate on multiple GPUs using DDP

# Tiny
python main.py SwinTrack Tiny --distributed_nproc_per_node $num_gpus --distributed_do_spawn_workers --output_dir /path/to/output -W $num_dataloader_workers

Train and evaluate on multiple nodes with multiple GPUs using DDP

# Tiny
python main.py SwinTrack Tiny --master_address $master_address --distributed_node_rank $node_rank distributed_nnodes $num_nodes --distributed_nproc_per_node $num_gpus --distributed_do_spawn_workers --output_dir /path/to/output -W $num_dataloader_workers 

Train and evaluate with run.sh helper script

# Train and evaluate on all GPUs
./run.sh SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers
# Train and evaluate on multiple nodes
NODE_RANK=$NODE_INDEX NUM_NODES=$NUM_NODES MASTER_ADDRESS=$MASTER_ADDRESS DATE_WITH_TIME=$DATE_WITH_TIME ./run.sh SwinTrack Tiny --output_dir /path/to/output -W $num_dataloader_workers 

Ablation study

The ablation study can be done by applying a small patch to the main config file.

Take the ResNet 50 backbone as the example, the rest parameters are the same as the above.

# Train and evaluate with resnet50 backbone
python main.py SwinTrack Tiny --mixin_config resnet.yaml
# or with run.sh
./run.sh SwinTrack Tiny --mixin resnet.yaml

All available config patches are listed in config/SwinTrack/Tiny/mixin.

Train and evaluate with GOT-10k dataset

python main.py SwinTrack Tiny --mixin_config got10k.yaml

Submit $output_dir/test_metrics/got10k/submit/*.zip to the GOT-10k evaluation server to get the result of GOT-10k test split.

Evaluate Existing Model

Download the pretrained model from google drive, then type:

python main.py SwinTrack Tiny --weight_path /path/to/weigth_file.pth --mixin_config evaluation.yaml --output_dir /path/to/output

Our code can evaluate the model on multiple GPUs in parallel, so all parameters above are also available.

Tracking results

Touch here google drive

Citation

@misc{lin2021swintrack,
      title={SwinTrack: A Simple and Strong Baseline for Transformer Tracking}, 
      author={Liting Lin and Heng Fan and Yong Xu and Haibin Ling},
      year={2021},
      eprint={2112.00995},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
LitingLin
LitingLin
A framework for Quantification written in Python

QuaPy QuaPy is an open source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. QuaPy

41 Dec 14, 2022
Github for the conference paper GLOD-Gaussian Likelihood OOD detector

FOOD - Fast OOD Detector Pytorch implamentation of the confernce peper FOOD arxiv link. Abstract Deep neural networks (DNNs) perform well at classifyi

17 Jun 19, 2022
YOLOv2 in PyTorch

YOLOv2 in PyTorch NOTE: This project is no longer maintained and may not compatible with the newest pytorch (after 0.4.0). This is a PyTorch implement

Long Chen 1.5k Jan 02, 2023
Change Detection in SAR Images Based on Multiscale Capsule Network

SAR_CD_MS_CapsNet Code for the paper "Change Detection in SAR Images Based on Multiscale Capsule Network" , IEEE Geoscience and Remote Sensing Letters

Feng Gao 21 Nov 29, 2022
Linear image-to-image translation

Linear (Un)supervised Image-to-Image Translation Examples for linear orthogonal transformations in PCA domain, learned without pairing supervision. Tr

Eitan Richardson 40 Aug 31, 2022
Style-based Neural Drum Synthesis with GAN inversion

Style-based Drum Synthesis with GAN Inversion Demo TensorFlow implementation of a style-based version of the adversarial drum synth (ADS) from the pap

Sound and Music Analysis (SoMA) Group 29 Nov 19, 2022
auto-tuning momentum SGD optimizer

YellowFin YellowFin is an auto-tuning optimizer based on momentum SGD which requires no manual specification of learning rate and momentum. It measure

Jian Zhang 288 Nov 19, 2022
Code base for "On-the-Fly Test-time Adaptation for Medical Image Segmentation"

On-the-Fly Adaptation Official Pytorch Code base for On-the-Fly Test-time Adaptation for Medical Image Segmentation Paper Introduction One major probl

Jeya Maria Jose 17 Nov 10, 2022
YOLOX-CondInst - Implement CondInst which is a instances segmentation method on YOLOX

YOLOX CondInst -- YOLOX 实例分割 前言 本项目是自己学习实例分割时,复现的代码. 通过自己编程,让自己对实例分割有更进一步的了解。 若想

DDGRCF 16 Nov 18, 2022
Implementation of self-attention mechanisms for general purpose. Focused on computer vision modules. Ongoing repository.

Self-attention building blocks for computer vision applications in PyTorch Implementation of self attention mechanisms for computer vision in PyTorch

AI Summer 962 Dec 23, 2022
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

torch-imle Concise and self-contained PyTorch library implementing the I-MLE gradient estimator proposed in our NeurIPS 2021 paper Implicit MLE: Backp

UCL Natural Language Processing 249 Jan 03, 2023
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect vi

MUGEN 11 Oct 22, 2022
A Broader Picture of Random-walk Based Graph Embedding

Random-walk Embedding Framework This repository is a reference implementation of the random-walk embedding framework as described in the paper: A Broa

Zexi Huang 23 Dec 13, 2022
✂️ EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video.

EyeLipCropper EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video. The whole process consists of three parts: frame extracti

Zi-Han Liu 9 Oct 25, 2022
A Fast Monotone Rotating Shallow Water model

pyRSW A Fast Monotone Rotating Shallow Water model How fast? As fast as a sustained 2 Gflop/s per core on a 2.5 GHz cpu (or 2048 Gflop/s with 1024 cor

Guillaume Roullet 13 Sep 28, 2022
[NeurIPS 2021] Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods Large Scale Learning on Non-Homophilous Graphs: New Benchmark

60 Jan 03, 2023
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

Hieu Duong 7 Jan 12, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1).

M1-tensorflow-benchmark TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1). I was initially testing if Tens

particle 2 Jan 05, 2022
Towards Debiasing NLU Models from Unknown Biases

Towards Debiasing NLU Models from Unknown Biases Abstract: NLU models often exploit biased features to achieve high dataset-specific performance witho

Ubiquitous Knowledge Processing Lab 22 Jun 14, 2022