The official implementation of Equalization Loss v1 & v2 (CVPR 2020, 2021) based on MMDetection.

Related tags

Deep Learningeqlv2
Overview

The Equalization Losses for Long-tailed Object Detection and Instance Segmentation

This repo is official implementation CVPR 2021 paper: Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection and CVPR 2020 paper: Equalization loss for long-tailed object recognition

Besides the equalization losses, this repo also includes some other algorithms:

  • BAGS (Balance GroupSoftmax)
  • cRT (classifier re-training)
  • LWS (Learnable Weight Scaling)

Requirements

We test our codes on MMDetection V2.3, other versions should also be ok.

Prepare LVIS Dataset

for images

LVIS uses same images as COCO's, so you need to donwload COCO dataset at folder ($COCO), and link those train, val under folder lvis($LVIS).

mkdir -p data/lvis
ln -s $COCO/train $LVIS
ln -s $COCO/val $LVIS
ln -s $COCO/test $LVIS

for annotations

Download the annotations from lvis webset

cd $LVIS
mkdir annotations

then places the annotations at folder ($LVIS/annotations)

Finally you will have the file structure like below:

data
  ├── lvis
  |   ├── annotations
  │   │   │   ├── lvis_v1_val.json
  │   │   │   ├── lvis_v1_train.json
  │   ├── train2017
  │   │   ├── 000000004134.png
  │   │   ├── 000000031817.png
  │   │   ├── ......
  │   ├── val2017
  │   ├── test2017

for API

The official lvis-api and mmlvis can lead to some bugs of multiprocess. See issue

So you can install this LVIS API from my modified repo.

pip install git+https://github.com/tztztztztz/lvis-api.git

Testing with pretrain_models

# ./tools/dist_test.sh ${CONFIG} ${CHECKPOINT} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
./tools/dist_test.sh configs/eqlv2/eql_r50_8x2_1x.py data/pretrain_models/eql_r50_8x2_1x.pth 8 --out results.pkl --eval bbox segm

Training

# ./tools/dist_train.sh ${CONFIG} ${GPU_NUM}
./tools/dist_train.sh ./configs/end2end/eql_r50_8x2_1x.py 8 

Once you finished the training, you will get the evaluation metric like this:

bbox AP

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.242
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=300 catIds=all] = 0.401
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=300 catIds=all] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.181
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.317
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.367
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.135
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.225
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.308
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.331
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.417
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.497
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.197
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.308
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.415

mask AP

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.237
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=300 catIds=all] = 0.372
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=300 catIds=all] = 0.251
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.169
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.316
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.370
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.149
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.228
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.286
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.326
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.210
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.415
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.495
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.213
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.313
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.389

We place ours configs file in ./configs/

  • ./configs/end2end: eqlv2 and other end2end methods
  • ./configs/decouple decoupled-based methods

How to train decouple training methods.

  1. Train the baseline model (or EQL v2).
  2. Prepare the pretrained checkpoint
  # suppose you've trained baseline model
  cd r50_1x
  python ../tools/ckpt_surgery.py --ckpt-path epoch_12.pth --method remove
  # if you want to train LWS, you should choose method 'reset'
  1. Start training with configs
  # ./tools/dist_train.sh ./configs/decouple/bags_r50_8x2_1x.py 8
  # ./tools/dist_train.sh ./configs/decouple/lws_r50_8x2_1x.py 8
  ./tools/dist_train.sh ./configs/decouple/crt_r50_8x2_1x.py 8

Pretrained Models on LVIS

Methods end2end AP APr APc APf pretrained_model
Baseline 16.1 0.0 12.0 27.4 model
EQL 18.6 2.1 17.4 27.2 model
RFS 22.2 11.5 21.2 28.0 model
LWS × 17.0 2.0 13.5 27.4 model
cRT × 22.1 11.9 20.2 29.0 model
BAGS × 23.1 13.1 22.5 28.2 model
EQLv2 23.7 14.9 22.8 28.6 model

How to train EQLv2 on OpenImages

1. Download the data

Download openimages v5 images from link, The folder will be

openimages
    ├── train
    ├── validation
    ├── test

Download the annotations for Challenge 2019 from link, The folder will be

annotations
    ├── challenge-2019-classes-description-500.csv
    ├── challenge-2019-train-detection-human-imagelabels.csv
    ├── challenge-2019-train-detection-bbox.csv
    ├── challenge-2019-validation-detection-bbox.csv
    ├── challenge-2019-validation-detection-human-imagelabels.csv
    ├── ...

2. Convert the .csv to coco-like .json file.

cd tools/openimages2coco/
python convert_annotations.py -p PATH_TO_OPENIMAGES --version challenge_2019 --task bbox 

You may need to donwload the data directory from https://github.com/bethgelab/openimages2coco/tree/master/data and place it at $project_dir/tools/openimages2coco/

3. Train models

  ./tools/dist_train.sh ./configs/openimages/eqlv2_r50_fpn_8x2_2x.py 8

Other configs can be found at ./configs/openimages/

4. Inference and output the json results file

./tools/dist_test.sh ./configs/openimages/eqlv2_r50_fpn_8x2_2x.py openimage_eqlv2_2x/epoch_1.pth 8 --format-only --options "jsonfile_prefix=openimage_eqlv2_2x/results"" 

Then you will get results.bbox.json under folder openimage_eqlv2

5. Convert coco-like json result file to openimage-like csv results file

cd $project_dir/tools/openimages2coco/
python convert_predictions.py -p ../../openimage_eqlv2/results.bbox.json --subset validation

Then you will get results.bbox.csv under folder openimage_eqlv2

6. Evaluate results file using official API

Please refer this link

After this, you will see something like this.

OpenImagesDetectionChallenge_Precision/[email protected],0.5263230244227198                                                                                                                     OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/061hd_',0.4198356678732905                                                                                             OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/06m11',0.40262261023434986                                                                                             OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/03120',0.5694096972722996                                                                                              OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/01kb5b',0.20532245532245533                                                                                            OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/0120dh',0.7934685035604202                                                                                             OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/0dv5r',0.7029194449221794                                                                                              OpenImagesDetectionChallenge_PerformanceByCategory/[email protected]/b'/m/0jbk',0.5959245714028935

7. Parse the AP file and output the grouped AP

cd $project_dir

PYTHONPATH=./:$PYTHONPATH python tools/parse_openimage_metric.py --file openimage_eqlv2_2x/metric

And you will get:

mAP 0.5263230244227198
mAP0: 0.4857693606436219
mAP1: 0.52047262478471
mAP2: 0.5304580597832517
mAP3: 0.5348747991854581
mAP4: 0.5588236678031849

Main Results on OpenImages

Methods AP AP1 AP2 AP3 AP4 AP5
Faster-R50 43.1 26.3 42.5 45.2 48.2 52.6
EQL 45.3 32.7 44.6 47.3 48.3 53.1
EQLv2 52.6 48.6 52.0 53.0 53.4 55.8
Faster-R101 46.0 29.2 45.5 49.3 50.9 54.7
EQL 48.0 36.1 47.2 50.5 51.0 55.0
EQLv2 55.1 51.0 55.2 56.6 55.6 57.5

Citation

If you use the equalization losses, please cite our papers.

@article{tan2020eqlv2,
  title={Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection},
  author={Tan, Jingru and Lu, Xin and Zhang, Gang and Yin, Changqing and Li, Quanquan},
  journal={arXiv preprint arXiv:2012.08548},
  year={2020}
}
@inproceedings{tan2020equalization,
  title={Equalization loss for long-tailed object recognition},
  author={Tan, Jingru and Wang, Changbao and Li, Buyu and Li, Quanquan and Ouyang, Wanli and Yin, Changqing and Yan, Junjie},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11662--11671},
  year={2020}
}

Credits

The code for converting openimage to LVIS is from this repo.

Owner
Jingru Tan
Jingru Tan
Explicable Reward Design for Reinforcement Learning Agents [NeurIPS'21]

Explicable Reward Design for Reinforcement Learning Agents [NeurIPS'21]

3 May 12, 2022
An Implementation of SiameseRPN with Feature Pyramid Networks

SiameseRPN with FPN This project is mainly based on HelloRicky123/Siamese-RPN. What I've done is just add a Feature Pyramid Network method to the orig

3 Apr 16, 2022
An official implementation of MobileStyleGAN in PyTorch

MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis Official PyTorch Implementation The accompanying videos c

Sergei Belousov 602 Jan 07, 2023
Learning Representational Invariances for Data-Efficient Action Recognition

Learning Representational Invariances for Data-Efficient Action Recognition Official PyTorch implementation for Learning Representational Invariances

Virginia Tech Vision and Learning Lab 27 Nov 22, 2022
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 03, 2023
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

Hao Luo 116 Jan 04, 2023
Learning a mapping from images to psychological similarity spaces with neural networks.

LearningPsychologicalSpaces v0.1: v1.1: v1.2: v1.3: v1.4: v1.5: The code in this repository explores learning a mapping from images to psychological s

Lucas Bechberger 8 Dec 12, 2022
Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks".

Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks This repository contains the official code for the

Linus Ericsson 11 Dec 16, 2022
Credit fraud detection in Python using a Jupyter Notebook

Credit-Fraud-Detection - Credit fraud detection in Python using a Jupyter Notebook , using three classification models (Random Forest, Gaussian Naive Bayes, Logistic Regression) from the sklearn libr

Ali Akram 4 Dec 28, 2021
A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation.

TiSASRec.paddle A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation. Introduction 论文:Time Interval Aware Sel

Paddorch 2 Nov 28, 2021
Post-training Quantization for Neural Networks with Provable Guarantees

Post-training Quantization for Neural Networks with Provable Guarantees Authors: Jinjie Zhang ( Yixuan Zhou 2 Nov 29, 2022

SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022
A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network"

DGC-Net: Dense Geometric Correspondence Network This is a PyTorch implementation of our work "DGC-Net: Dense Geometric Correspondence Network" TL;DR A

191 Dec 16, 2022
QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

Introduction QRec is a Python framework for recommender systems (Supported by Python 3.7.4 and Tensorflow 1.14+) in which a number of influential and

Yu 1.4k Dec 30, 2022
STRIVE: Scene Text Replacement In Videos

STRIVE: Scene Text Replacement In Videos Dataset Types: RoboText SynthText RealWorld videos RoboText : Videos of texts collected using navigation robo

15 Jul 11, 2022
Hydra Lightning Template for Structured Configs

Hydra Lightning Template for Structured Configs Template for creating projects with pytorch-lightning and hydra. How to use this template? Create your

Model-driven Machine Learning 4 Jul 19, 2022
Implementation of "Fast and Flexible Temporal Point Processes with Triangular Maps" (Oral @ NeurIPS 2020)

Fast and Flexible Temporal Point Processes with Triangular Maps This repository includes a reference implementation of the algorithms described in "Fa

Oleksandr Shchur 20 Dec 02, 2022
Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21. We optimized wind turbine placement in a wind farm, subject to wake effects, using Q-learni

Manasi Sharma 2 Sep 27, 2022
Automatic labeling, conversion of different data set formats, sample size statistics, model cascade

Simple Gadget Collection for Object Detection Tasks Automatic image annotation Conversion between different annotation formats Obtain statistical info

llt 4 Aug 24, 2022
Json2Xml tool will help you convert from json COCO format to VOC xml format in Object Detection Problem.

JSON 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Json2Xml t

Nguyễn Trường Lâu 6 Aug 22, 2022