R3Det based on mmdet 2.19.0

Overview

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

License arXiv

Installation

# install mmdetection first if you haven't installed it yet. (Refer to mmdetection for details.)
pip install mmdet==2.19.0

# install r3det (Compiling rotated ops is a little time-consuming.)
pip install -r requirements.txt
pip install -v -e .
  • It is best to use opencv-python greater than 4.5.1 because its angle representation has been changed in 4.5.1. The following experiments are all run with 4.5.3.

Quick Start

Please change path in configs to your data path.

# train
CUDA_VISIBLE_DEVICES=0 PORT=29500 \
./tools/dist_train.sh configs/rretinanet/rretinanet_obb_r50_fpn_1x_dota_v3.py 1

# submission
CUDA_VISIBLE_DEVICES=0 PORT=29500 \
./tools/dist_test.sh configs/rretinanet/rretinanet_obb_r50_fpn_1x_dota_v3.py \
        work_dirs/rretinanet_obb_r50_fpn_1x_dota_v3/epoch_12.pth 1 --format-only\
        --eval-options submission_dir=work_dirs/rretinanet_obb_r50_fpn_1x_dota_v3/Task1_results

For DOTA dataset, please crop the original images into 1024×1024 patches with an overlap of 200 by run

python tools/split/img_split.py --base_json \
       tools/split/split_configs/split_configs/dota1_0/ss_trainval.json

python tools/split/img_split.py --base_json \
       tools/split/split_configs/dota1_0/ss_test.json

Please change path in ss_trainval.json, ss_test.json to your path. (Forked from BboxToolkit, which is faster then DOTA_Devkit.)

Angle Representations

Three angle representations are built-in, which can freely switch in the config.

  • v1 (from R3Det): [-PI/2, 0)
  • v2 (from S2ANet): [-Pi/4, 3PI/4)
  • v3 (from OBBDetection): [-PI/2, PI/2)

The differences of the three angle representations are reflected in poly2obb, obb2poly, obb2xyxy, obb2hbb, hbb2obb, etc. [More], And according to the above three papers, the coders of them are different.

  • DeltaXYWHAOBBoxCoder
    • v1:None
    • v2:Constrained angle + Projection of dx and dy + Normalized with PI
    • v3:Constrained angle and length&width + Projection of dx and dy
  • DeltaXYWHAHBBoxCoder
    • v1:None
    • v2:Constrained angle + Normalized with PI
    • v3:Constrained angle and length&width + Normalized with 2PI

We believe that different coders are the key reason for the different baselines in different papers. The good news is that all the above coders can be freely switched in R3Det. In addition, R3Det also provide 4 NMS ops and 3 IoU_Calculators for rotation detection as follows:

  • nms.type
    • v1:v1
    • v2:v2
    • v3:v3
    • mmcv: mmcv
  • iou_calculator
    • v1:RBboxOverlaps2D_v1
    • v2:RBboxOverlaps2D_v2
    • v3:RBboxOverlaps2D_v3

Performance

DOTA1.0 (Task1)
Model Backbone Lr schd MS RR Angle box AP Official Download
RRetinaNet HBB R50-FPN 1x - - v1 65.19 65.73 Baidu:0518/Google
RRetinaNet OBB R50-FPN 1x - - v3 68.20 69.40 Baidu:0518/Google
RRetinaNet OBB R50-FPN 1x - - v2 68.64 68.40 Baidu:0518/Google
R3Det R50-FPN 1x - - v1 70.41 70.66 Baidu:0518/Google
R3Det* R50-FPN 1x - - v1 70.86 - Baidu:0518/Google
  • MS means multiple scale image split.
  • RR means random rotation.

Citation

@inproceedings{yang2021r3det,
    title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
    author={Yang, Xue and Yan, Junchi and Feng, Ziming and He, Tao},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={35},
    number={4},
    pages={3163--3171},
    year={2021}
}

Owner
SJTU-Thinklab-Det
SJTU-Thinklab-Det
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

Applied Research Center (ARC), Tencent PCG 84 Jan 04, 2023
Constructing interpretable quadratic accuracy predictors to serve as an objective function for an IQCQP problem that represents NAS under latency constraints and solve it with efficient algorithms.

IQNAS: Interpretable Integer Quadratic programming Neural Architecture Search Realistic use of neural networks often requires adhering to multiple con

0 Oct 24, 2021
A clear, concise, simple yet powerful and efficient API for deep learning.

The Gluon API Specification The Gluon API specification is an effort to improve speed, flexibility, and accessibility of deep learning technology for

Gluon API 2.3k Dec 17, 2022
PyTorch Implementation of Sparse DETR

Sparse DETR By Byungseok Roh*, Jaewoong Shin*, Wuhyun Shin*, and Saehoon Kim at Kakao Brain. (*: Equal contribution) This repository is an official im

Kakao Brain 113 Dec 28, 2022
tensorrt int8 量化yolov5 4.0 onnx模型

onnx模型转换为 int8 tensorrt引擎

123 Dec 28, 2022
Code release for ICCV 2021 paper "Anticipative Video Transformer"

Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page

Facebook Research 123 Dec 13, 2022
A curated (most recent) list of resources for Learning with Noisy Labels

A curated (most recent) list of resources for Learning with Noisy Labels

Jiaheng Wei 321 Jan 09, 2023
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
Network Enhancement implementation in pytorch

network_enahncement_pytorch Network Enhancement implementation in pytorch Research paper Network Enhancement: a general method to denoise weighted bio

Yen 1 Nov 12, 2021
Pyeventbus: a publish/subscribe event bus

pyeventbus pyeventbus is a publish/subscribe event bus for Python 2.7. simplifies the communication between python classes decouples event senders and

15 Apr 21, 2022
Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat

Rush Kapoor 2 Nov 21, 2022
TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights.

TensorFlow-Image-Models Introduction Usage Models Profiling License Introduction TensorfFlow-Image-Models (tfimm) is a collection of image models with

Martins Bruveris 227 Dec 20, 2022
Torch implementation of SegNet and deconvolutional network

Torch implementation of SegNet and deconvolutional network

Fedor Chervinskii 5 Jul 17, 2020
An implementation of the WHATWG URL Standard in JavaScript

whatwg-url whatwg-url is a full implementation of the WHATWG URL Standard. It can be used standalone, but it also exposes a lot of the internal algori

314 Dec 28, 2022
CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

CS5242_2021 Neural Networks and Deep Learning, NUS CS5242, 2021 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : https:/

Xavier Bresson 165 Oct 25, 2022
Implementation of "Glancing Transformer for Non-Autoregressive Neural Machine Translation"

GLAT Implementation for the ACL2021 paper "Glancing Transformer for Non-Autoregressive Neural Machine Translation" Requirements Python = 3.7 Pytorch

117 Jan 09, 2023
Garbage Detection system which will detect objects based on whether it is plastic waste or plastics or just garbage.

Garbage Detection using Yolov5 on Jetson Nano 2gb Developer Kit. Garbage detection system which will detect objects based on whether it is plastic was

Rishikesh A. Bondade 2 May 13, 2022
Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom

Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom Sample on-line plotting while training(avg loss)/testing(writ

Jingwei Zhang 269 Nov 15, 2022
Contrastive Language-Image Pretraining

CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair

OpenAI 11.5k Jan 08, 2023
Boundary-preserving Mask R-CNN (ECCV 2020)

BMaskR-CNN This code is developed on Detectron2 Boundary-preserving Mask R-CNN ECCV 2020 Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu Video

Hust Visual Learning Team 178 Nov 28, 2022