Official PyTorch Implementation of Rank & Sort Loss [ICCV2021]

Overview

Rank & Sort Loss for Object Detection and Instance Segmentation

The official implementation of Rank & Sort Loss. Our implementation is based on mmdetection.

Rank & Sort Loss for Object Detection and Instance Segmentation,
Kemal Oksuz, Baris Can Cam, Emre Akbas, Sinan Kalkan, ICCV 2021 (Oral Presentation). (arXiv pre-print)

Summary

What is Rank & Sort (RS) Loss? Rank & Sort (RS) Loss supervises object detectors and instance segmentation methods to (i) rank the scores of the positive anchors above those of negative anchors, and at the same time (ii) sort the scores of the positive anchors with respect to their localisation qualities.

Benefits of RS Loss on Simplification of Training. With RS Loss, we significantly simplify training: (i) Thanks to our sorting objective, the positives are prioritized by the classifier without an additional auxiliary head (e.g. for centerness, IoU, mask-IoU), (ii) due to its ranking-based nature, RS Loss is robust to class imbalance, and thus, no sampling heuristic is required, and (iii) we address the multi-task nature of visual detectors using tuning-free task-balancing coefficients.

Benefits of RS Loss on Improving Performance. Using RS Loss, we train seven diverse visual detectors only by tuning the learning rate, and show that it consistently outperforms baselines: e.g. our RS Loss improves (i) Faster R-CNN by ~3 box AP and aLRP Loss (ranking-based baseline) by ~2 box AP on COCO dataset, (ii) Mask R-CNN with repeat factor sampling by 3.5 mask AP (~7 AP for rare classes) on LVIS dataset.

How to Cite

Please cite the paper if you benefit from our paper or the repository:

@inproceedings{RSLoss,
       title = {Rank & Sort Loss for Object Detection and Instance Segmentation},
       author = {Kemal Oksuz and Baris Can Cam and Emre Akbas and Sinan Kalkan},
       booktitle = {International Conference on Computer Vision (ICCV)},
       year = {2021}
}

Specification of Dependencies and Preparation

  • Please see get_started.md for requirements and installation of mmdetection.
  • Please refer to introduction.md for dataset preparation and basic usage of mmdetection.

Trained Models

Here, we report minival results in terms of AP and oLRP.

Multi-stage Object Detection

RS-R-CNN

Backbone Epoch Carafe MS train box AP box oLRP Log Config Model
ResNet-50 12 39.6 67.9 log config model
ResNet-50 12 + 40.8 66.9 log config model
ResNet-101-DCN 36 [480,960] 47.6 61.1 log config model
ResNet-101-DCN 36 + [480,960] 47.7 60.9 log config model

RS-Cascade R-CNN

Backbone Epoch box AP box oLRP Log Config Model
ResNet-50 12 41.3 66.6 Coming soon

One-stage Object Detection

Method Backbone Epoch box AP box oLRP Log Config Model
RS-ATSS ResNet-50 12 39.9 67.9 log config model
RS-PAA ResNet-50 12 41.0 67.3 log config model

Multi-stage Instance Segmentation

RS-Mask R-CNN on COCO Dataset

Backbone Epoch Carafe MS train mask AP box AP mask oLRP box oLRP Log Config Model
ResNet-50 12 36.4 40.0 70.1 67.5 log config model
ResNet-50 12 + 37.3 41.1 69.4 66.6 log config model
ResNet-101 36 [640,800] 40.3 44.7 66.9 63.7 log config model
ResNet-101 36 + [480,960] 41.5 46.2 65.9 62.6 log config model
ResNet-101-DCN 36 + [480,960] 43.6 48.8 64.0 60.2 log config model
ResNeXt-101-DCN 36 + [480,960] 44.4 49.9 63.1 59.1 Coming Soon config model

RS-Mask R-CNN on LVIS Dataset

Backbone Epoch MS train mask AP box AP mask oLRP box oLRP Log Config Model
ResNet-50 12 [640,800] 25.2 25.9 Coming Soon Coming Soon Coming Soon Coming soon Coming soon

One-stage Instance Segmentation

RS-YOLACT

Backbone Epoch mask AP box AP mask oLRP box oLRP Log Config Model
ResNet-50 55 29.9 33.8 74.7 71.8 log config model

RS-SOLOv2

Backbone Epoch mask AP mask oLRP Log Config Model
ResNet-34 36 32.6 72.7 Coming soon Coming soon Coming soon
ResNet-101 36 39.7 66.9 Coming soon Coming soon Coming soon

Running the Code

Training Code

The configuration files of all models listed above can be found in the configs/ranksort_loss folder. You can follow get_started.md for training code. As an example, to train Faster R-CNN with our RS Loss on 4 GPUs as we did, use the following command:

./tools/dist_train.sh configs/ranksort_loss/ranksort_faster_rcnn_r50_fpn_1x_coco.py 4

Test Code

The configuration files of all models listed above can be found in the configs/ranksort_loss folder. You can follow get_started.md for test code. As an example, first download a trained model using the links provided in the tables below or you train a model, then run the following command to test an object detection model on multiple GPUs:

./tools/dist_test.sh configs/ranksort_loss/ranksort_faster_rcnn_r50_fpn_1x_coco.py ${CHECKPOINT_FILE} 4 --eval bbox 

and use the following command to test an instance segmentation model on multiple GPUs:

./tools/dist_test.sh configs/ranksort_loss/ranksort_mask_rcnn_r50_fpn_1x_coco.py ${CHECKPOINT_FILE} 4 --eval bbox segm 

You can also test a model on a single GPU with the following example command:

python tools/test.py configs/ranksort_loss/ranksort_faster_rcnn_r50_fpn_1x_coco.py ${CHECKPOINT_FILE} 4 --eval bbox 

Details for Rank & Sort Loss Implementation

Below is the links to the files that can be useful to check out the details of the implementation:

Owner
Kemal Oksuz
Kemal Oksuz
This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Koltun"

Learning to propose objects This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Ko

Philipp Krähenbühl 90 Sep 10, 2021
The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

Rootroo Ltd 2 Dec 25, 2021
FS2KToolbox FS2K Dataset Towards the translation between Face

FS2KToolbox FS2K Dataset Towards the translation between Face -- Sketch. Download (photo+sketch+annotation): Google-drive, Baidu-disk, pw: FS2K. For

Deng-Ping Fan 5 Jan 03, 2023
Generating Images with Recurrent Adversarial Networks

Generating Images with Recurrent Adversarial Networks Python (Theano) implementation of Generating Images with Recurrent Adversarial Networks code pro

Daniel Jiwoong Im 121 Sep 08, 2022
Extracting and filtering paraphrases by bridging natural language inference and paraphrasing

nli2paraphrases Source code repository accompanying the preprint Extracting and filtering paraphrases by bridging natural language inference and parap

Matej Klemen 1 Mar 09, 2022
Code for paper "Vocabulary Learning via Optimal Transport for Neural Machine Translation"

**Codebase and data are uploaded in progress. ** VOLT(-py) is a vocabulary learning codebase that allows researchers and developers to automaticaly ge

416 Jan 09, 2023
[NeurIPS 2021] "Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems"

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems Introduction Multi-agent control i

VITA 6 May 05, 2022
To prepare an image processing model to classify the type of disaster based on the image dataset

Disaster Classificiation using CNNs bunnysaini/Disaster-Classificiation Goal To prepare an image processing model to classify the type of disaster bas

Bunny Saini 1 Jan 24, 2022
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond

CRF - Conditional Random Fields A library for dense conditional random fields (CRFs). This is the official accompanying code for the paper Regularized

Đ.Khuê Lê-Huu 21 Nov 26, 2022
Official implementation of the article "Unsupervised JPEG Domain Adaptation For Practical Digital Forensics"

Unsupervised JPEG Domain Adaptation for Practical Digital Image Forensics @WIFS2021 (Montpellier, France) Rony Abecidan, Vincent Itier, Jeremie Boulan

Rony Abecidan 6 Jan 06, 2023
A machine learning package for streaming data in Python. The other ancestor of River.

scikit-multiflow is a machine learning package for streaming data in Python. creme and scikit-multiflow are merging into a new project called River. W

670 Dec 30, 2022
Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

Edward Hu 37 Dec 14, 2022
Point cloud processing tool library.

Point Cloud ToolBox This point cloud processing tool library can be used to process point clouds, 3d meshes, and voxels. Environment python 3.7.5 Dep

ZhangXinyun 40 Dec 09, 2022
A Python package to process & model ChEMBL data.

insilico: A Python package to process & model ChEMBL data. ChEMBL is a manually curated chemical database of bioactive molecules with drug-like proper

Steven Newton 0 Dec 09, 2021
DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)

DECA: Detailed Expression Capture and Animation (SIGGRAPH2021) input image, aligned reconstruction, animation with various poses & expressions This is

Yao Feng 1.5k Jan 02, 2023
GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications

GPOEO GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications. We also implement ODPP [1] as a comparison. [1]

瑞雪轻飏 8 Sep 10, 2022
PyTorch implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The official PyTorch implementation of Neural View S

Angtian Wang 20 Oct 09, 2022
A real world application of a Recurrent Neural Network on a binary classification of time series data

What is this This is a real world application of a Recurrent Neural Network on a binary classification of time series data. This project includes data

Josep Maria Salvia Hornos 2 Jan 30, 2022