Code and models for ICCV2021 paper "Robust Object Detection via Instance-Level Temporal Cycle Confusion".

Overview

Robust Object Detection via Instance-Level Temporal Cycle Confusion

This repo contains the implementation of the ICCV 2021 paper, Robust Object Detection via Instance-Level Temporal Cycle Confusion.

Screenshot

Building reliable object detectors that are robust to domain shifts, such as various changes in context, viewpoint, and object appearances, is critical for real world applications. In this work, we study the effectiveness of auxiliary self-supervised tasks to improve out-of-distribution generalization of object detectors. Inspired by the principle of maximum entropy, we introduce a novel self-supervised task, instance-level cycle confusion (CycConf), which operates on the region features of the object detectors. For each object, the task is to find the most different object proposals in the adjacent frame in a video and then cycle back to itself for self-supervision. CycConf encourages the object detector to explore invariant structures across instances under various motion, which leads to improved model robustness in unseen domains at test time. We observe consistent out-of-domain performance improvements when training object detectors in tandem with self-supervised tasks on various domain adaptation benchmarks with static images (Cityscapes, Foggy Cityscapes, Sim10K) and large-scale video datasets (BDD100K and Waymo open data).

Installation

Environment

  • CUDA 10.2
  • Python >= 3.7
  • Pytorch >= 1.6
  • THe Detectron2 version matches Pytorch and CUDA versions.

Dependencies

  1. Create a virtual env.
  • python3 -m pip install --user virtualenv
  • python3 -m venv cyc-conf
  • source cyc-conf/bin/activate
  1. Install dependencies.
  • pip install -r requirements.txt

  • Install Pytorch 1.9

pip3 install torch torchvision

Check out the previous Pytorch versions here.

  • Install Detectron2 Build Detectron2 from Source (gcc & g++ >= 5.4) python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Or, you can install Pre-built detectron2 (example for CUDA 10.2, Pytorch 1.9)

python -m pip install detectron2 -f \ https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.9/index.html

More details can be found here.

Data Preparation

BDD100K

  1. Download the BDD100K MOT 2020 dataset (MOT 2020 Images and MOT 2020 Labels) and the detection labels (Detection 2020 Labels) here and the detailed description is available here. Put the BDD100K data under datasets/ in this repo. After downloading the data, the folder structure should be like below:
├── datasets
│   ├── bdd100k
│   │   ├── images
│   │   │    └── track
│   │   │        ├── train
│   │   │        ├── val
│   │   │        └── test
│   │   └── labels
│   │        ├── box_track_20
│   │        │   ├── train
│   │        │   └── val
│   │        └── det_20
│   │            ├── det_train.json
│   │            └── det_val.json
│   ├── waymo

Convert the labels of the MOT 2020 data (train & val sets) into COCO format by running:

python3 datasets/bdd100k2coco.py -i datasets/bdd100k/labels/box_track_20/val/ -o datasets/bdd100k/labels/track/bdd100k_mot_val_coco.json -m track
python3 datasets/bdd100k2coco.py -i datasets/bdd100k/labels/box_track_20/train/ -o datasets/bdd100k/labels/track/bdd100k_mot_train_coco.json -m track
  1. Split the original videos into different domains (time of day). Run the following command:
python3 -m datasets.domain_splits_bdd100k

This script will first extract the domain attributes from the BDD100K detection set and then map them to the tracking set sequences. After the processing steps, you would see two additional folders domain_splits and per_seq under the datasets/bdd100k/labels/box_track_20. The domain splits of all attributes in BDD100K detection set can be found at datasets/bdd100k/labels/domain_splits.

Waymo

  1. Download the Waymo dataset here. Put the Waymo raw data under datasets/ in this repo. After downloading the data, the folder structure should be like below:
├── datasets
│   ├── bdd100k
│   ├── waymo
│   │   └── raw

Convert the raw TFRecord data files into COCO format by running:

python3 -m datasets.waymo2coco

Note that this script takes a long time to run, be prepared to keep it running for over a day.

  1. Convert the BDD100K dataset labels into 3 classes (originally 8). This needs to be done in order to match the 3 classes of the Waymo dataset. Run the following command:
python3 -m datasets.convert_bdd_3cls

Get Started

For joint training,

python3 -m tools.train_net --config-file [config_file] --num-gpus 8

For evaluation,

python3 -m tools.train_net --config-file [config_file] --num-gpus [num] --eval-only

This command will load the latest checkpoint in the folder. If you want to specify a different checkpoint or evaluate the pretrained checkpoints, you can run

python3 -m tools.train_net --config-file [config_file] --num-gpus [num] --eval-only MODEL.WEIGHTS [PATH_TO_CHECKPOINT]

Benchmark Results

Dataset Statistics

Dataset Split Seq frames/seq. boxes classes
BDD100K Daytime train 757 204 1.82M 8
val 108 204 287K 8
BDD100K Night train 564 204 895K 8
val 71 204 137K 8
Waymo Open Data train 798 199 3.64M 3
val 202 199 886K 3

Out of Domain Evaluation

BDD100K Daytime to Night. The base detector is Faster R-CNN with ResNet-50.

Model AP AP50 AP75 APs APm APl Config Checkpoint
Faster R-CNN 17.84 31.35 17.68 4.92 16.15 35.56 link link
+ Rotation 18.58 32.95 18.15 5.16 16.93 36.00 link link
+ Jigsaw 17.47 31.22 16.81 5.08 15.80 33.84 link link
+ Cycle Consistency 18.35 32.44 18.07 5.04 17.07 34.85 link link
+ Cycle Confusion 19.09 33.58 19.14 5.70 17.68 35.86 link link

BDD100K Night to Daytime.

Model AP AP50 AP75 APs APm APl Config Checkpoint
Faster R-CNN 19.14 33.04 19.16 5.38 21.42 40.34 link link
+ Rotation 19.07 33.25 18.83 5.53 21.32 40.06 link link
+ Jigsaw 19.22 33.87 18.71 5.67 22.35 38.57 link link
+ Cycle Consistency 18.89 33.50 18.31 5.82 21.01 39.13 link link
+ Cycle Confusion 19.57 34.34 19.26 6.06 22.55 38.95 link link

Waymo Front Left to BDD100K Night.

Model AP AP50 AP75 APs APm APl Config Checkpoint
Faster R-CNN 10.07 19.62 9.05 2.67 10.81 18.62 link link
+ Rotation 11.34 23.12 9.65 3.53 11.73 21.60 link link
+ Jigsaw 9.86 19.93 8.40 2.77 10.53 18.82 link link
+ Cycle Consistency 11.55 23.44 10.00 2.96 12.19 21.99 link link
+ Cycle Confusion 12.27 26.01 10.24 3.44 12.22 23.56 link link

Waymo Front Right to BDD100K Night.

Model AP AP50 AP75 APs APm APl Config Checkpoint
Faster R-CNN 8.65 17.26 7.49 1.76 8.29 19.99 link link
+ Rotation 9.25 18.48 8.08 1.85 8.71 21.08 link link
+ Jigsaw 8.34 16.58 7.26 1.61 8.01 18.09 link link
+ Cycle Consistency 9.11 17.92 7.98 1.78 9.36 19.18 link link
+ Cycle Confusion 9.99 20.58 8.30 2.18 10.25 20.54 link link

Citation

If you find this repository useful for your publications, please consider citing our paper.

@article{wang2021robust,
  title={Robust Object Detection via Instance-Level Temporal Cycle Confusion},
  author={Wang, Xin and Huang, Thomas E and Liu, Benlin and Yu, Fisher and Wang, Xiaolong and Gonzalez, Joseph E and Darrell, Trevor},
  journal={International Conference on Computer Vision (ICCV)},
  year={2021}
}
Owner
Xin Wang
Researcher from Microsoft Research. Prev. Ph.D. student at UC Berkeley.
Xin Wang
Source code for Fathony, Sahu, Willmott, & Kolter, "Multiplicative Filter Networks", ICLR 2021.

Multiplicative Filter Networks This repository contains a PyTorch MFN implementation and code to perform & reproduce experiments from the ICLR 2021 pa

Bosch Research 66 Jan 04, 2023
NeuralCompression is a Python repository dedicated to research of neural networks that compress data

NeuralCompression is a Python repository dedicated to research of neural networks that compress data. The repository includes tools such as JAX-based entropy coders, image compression models, video c

Facebook Research 297 Jan 06, 2023
Use unsupervised and supervised learning to predict stocks

AIAlpha: Multilayer neural network architecture for stock return prediction This project is meant to be an advanced implementation of stacked neural n

Vivek Palaniappan 1.5k Jan 06, 2023
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis

OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis Overview OpenABC-D is a large-scale labeled dataset generate

NYU Machine-Learning guided Design Automation (MLDA) 31 Nov 22, 2022
A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022)

DFC2022 Baseline A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022) This repository uses TorchGeo, PyTorch Lightning, and Segmenta

isaac 24 Nov 28, 2022
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) Citation Please cite as: @inproceedings{liu2020understan

Sunbow Liu 22 Nov 25, 2022
Tracking Progress in Question Answering over Knowledge Graphs

Tracking Progress in Question Answering over Knowledge Graphs Table of contents Question Answering Systems with Descriptions The QA Systems Table cont

Knowledge Graph Question Answering 47 Jan 02, 2023
Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.

lacmus The program for searching through photos from the air of lost people in the forest using Retina Net neural nwtwork. The project is being develo

Lacmus Foundation 168 Dec 27, 2022
Extracts essential Mediapipe face landmarks and arranges them in a sequenced order.

simplified_mediapipe_face_landmarks Extracts essential Mediapipe face landmarks and arranges them in a sequenced order. The default 478 Mediapipe face

Irfan 13 Oct 04, 2022
Code for ViTAS_Vision Transformer Architecture Search

Vision Transformer Architecture Search This repository open source the code for ViTAS: Vision Transformer Architecture Search. ViTAS aims to search fo

46 Dec 17, 2022
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"

Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX

Qiming Hu 30 Jan 01, 2023
A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

jedibobo 3 Dec 28, 2022
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers

Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat

DreamSoul 3 Sep 12, 2022
Improving Compound Activity Classification via Deep Transfer and Representation Learning

Improving Compound Activity Classification via Deep Transfer and Representation Learning This repository is the official implementation of Improving C

NingLab 2 Nov 24, 2021
Beginner-friendly repository for Hacktober Fest 2021. Start your contribution to open source through baby steps. 💜

Hacktober Fest 2021 🎉 Open source is changing the world – one contribution at a time! 🎉 This repository is made for beginners who are unfamiliar wit

Abhilash M Nair 32 Dec 11, 2022
X-VLM: Multi-Grained Vision Language Pre-Training

X-VLM: learning multi-grained vision language alignments Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts. Yan Zeng, Xi

Yan Zeng 286 Dec 23, 2022
Static Features Classifier - A static features classifier for Point-Could clusters using an Attention-RNN model

Static Features Classifier This is a static features classifier for Point-Could

ABDALKARIM MOHTASIB 1 Jan 25, 2022
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
JupyterLite demo deployed to GitHub Pages 🚀

JupyterLite Demo JupyterLite deployed as a static site to GitHub Pages, for demo purposes. ✨ Try it in your browser ✨ ➡️ https://jupyterlite.github.io

JupyterLite 223 Jan 04, 2023
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022