FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Last update: Nov 29, 2022

Overview

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection
arXiv preprint (arXiv:2111.10780).

This implement is modified from mmdetection. We also refer to the codes of ReDet, PIoU, and ProbIoU.

In the process of implementation, we find that only Python code processing will produce huge memory overhead on Nvidia devices. Therefore, we directly write the label assignment module proposed in this paper in the form of CUDA extension of Pytorch. The program could not work effectively when we migrate it to cuda 11 (only support cuda10). By applying CUDA expansion, the memory utilization is improved and a lot of unnecessary calculations are reduced. We also try to train FCOSR-M on 2080ti (4 images per device), which can basically fill memory of graphics card.

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We add a multiprocess version DOTA2COCO into DOTA_devkit package, you could switch USE_MULTI_PROCESS to control the function in prepare_dota.py

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

Details (Test device: nvidia RTX 2080ti)

Methods	backbone	FPS	mAP(%)
ReDet	ReR50	8.8	76.25
S²ANet	Mobilenet v2	18.9	67.46
S²ANet	R50	14.4	74.14
R³Det	R50	9.2	71.9
Oriented-RCNN	Mobilenet v2	21.2	72.72
Oriented-RCNN	R50	13.8	75.87
Oriented-RCNN	R101	11.3	76.28
RetinaNet-O	Mobilenet v2	22.4	67.95
RetinaNet-O	R50	16.5	72.7
RetinaNet-O	R101	13.3	73.7
Faster-RCNN-O	Mobilenet v2	23	67.41
Faster-RCNN-O	R50	14.4	72.29
Faster-RCNN-O	R101	11.4	72.65
FCOSR-S	Mobilenet v2	23.7	74.05
FCOSR-M	Rx50	14.6	77.15
FCOSR-L	Rx101	7.9	77.39

The password of baiduPan is ABCD

FCOSR serise DOTA 1.0 result.FPS(2080ti) Detail

Model	backbone	MS	Sched.	Param.	Input	GFLOPs	FPS	mAP	download
FCOSR-S	Mobilenet v2	-	3x	7.32M	1024×1024	101.42	23.7	74.05	model/cfg
FCOSR-S	Mobilenet v2	✓	3x	7.32M	1024×1024	101.42	23.7	76.11	model/cfg
FCOSR-M	ResNext50-32x4	-	3x	31.4M	1024×1024	210.01	14.6	77.15	model/cfg
FCOSR-M	ResNext50-32x4	✓	3x	31.4M	1024×1024	210.01	14.6	79.25	model/cfg
FCOSR-L	ResNext101-64x4	-	3x	89.64M	1024×1024	445.75	7.9	77.39	model/cfg
FCOSR-L	ResNext101-64x4	✓	3x	89.64M	1024×1024	445.75	7.9	78.80	model/cfg

FCOSR serise DOTA 1.5 result. FPS(2080ti) Detail

Model	backbone	MS	Sched.	Param.	Input	GFLOPs	FPS	mAP	download
FCOSR-S	Mobilenet v2	-	3x	7.32M	1024×1024	101.42	23.7	66.37	model/cfg
FCOSR-S	Mobilenet v2	✓	3x	7.32M	1024×1024	101.42	23.7	73.14	model/cfg
FCOSR-M	ResNext50-32x4	-	3x	31.4M	1024×1024	210.01	14.6	68.74	model/cfg
FCOSR-M	ResNext50-32x4	✓	3x	31.4M	1024×1024	210.01	14.6	73.79	model/cfg
FCOSR-L	ResNext101-64x4	-	3x	89.64M	1024×1024	445.75	7.9	69.96	model/cfg
FCOSR-L	ResNext101-64x4	✓	3x	89.64M	1024×1024	445.75	7.9	75.41	model/cfg

FCOSR serise HRSC2016 result. FPS(2080ti)

Model	backbone	Rot.	Sched.	Param.	Input	GFLOPs	FPS	AP50(07)	AP75(07)	AP50(12)	AP75(12)	download
FCOSR-S	Mobilenet v2	✓	40k iters	7.29M	800×800	61.57	35.3	90.08	76.75	92.67	75.73	model/cfg
FCOSR-M	ResNext50-32x4	✓	40k iters	31.37M	800×800	127.87	26.9	90.15	78.58	94.84	81.38	model/cfg
FCOSR-L	ResNext101-64x4	✓	40k iters	89.61M	800×800	271.75	15.1	90.14	77.98	95.74	80.94	model/cfg

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

Model	backbone	Head channels	Sched.	Param	Size	Input	GFLOPs	FPS	mAP	onnx	TensorRT
FCOSR-lite	Mobilenet v2	256	3x	6.9M	51.63MB	1024×1024	101.25	7.64	74.30	onnx	trt
FCOSR-tiny	Mobilenet v2	128	3x	3.52M	23.2MB	1024×1024	35.89	10.68	73.93	onnx	trt

Lightweight FCOSR test result on Jetson AGX Xavier (DOTA 1.0 single-scale).

A part of Dota1.0 dataset (whole image mode) Code

name	size	patch size	gap	patches	det objects	det time(s)
P0031.png	5343×3795	1024	200	35	1197	2.75
P0051.png	4672×5430	1024	200	42	309	2.38
P0112.png	6989×4516	1024	200	54	184	3.02
P0137.png	5276×4308	1024	200	35	66	1.95
P1004.png	7001×3907	1024	200	45	183	2.52
P1125.png	7582×4333	1024	200	54	28	2.95
P1129.png	4093×6529	1024	200	40	70	2.23
P1146.png	5231×4616	1024	200	42	64	2.29
P1157.png	7278×5286	1024	200	63	184	3.47
P1378.png	5445×4561	1024	200	42	83	2.32
P1379.png	4426×4182	1024	200	30	686	1.78
P1393.png	6072×6540	1024	200	64	893	3.63
P1400.png	6471×4479	1024	200	48	348	2.63
P1402.png	4112×4793	1024	200	30	293	1.68
P1406.png	6531×4182	1024	200	40	19	2.19
P1415.png	4894x4898	1024	200	36	190	1.99
P1436.png	5136×5156	1024	200	42	39	2.31
P1448.png	7242×5678	1024	200	63	51	3.41
P1457.png	5193×4658	1024	200	42	382	2.33
P1461.png	6661×6308	1024	200	64	27	3.45
P1494.png	4782×6677	1024	200	48	70	2.61
P1500.png	4769×4386	1024	200	36	92	1.96
P1772.png	5963×5553	1024	200	49	28	2.70
P1774.png	5352×4281	1024	200	35	291	1.95
P1796.png	5870×5822	1024	200	49	308	2.74
P1870.png	5942×6059	1024	200	56	135	3.04
P2043.png	4165×3438	1024	200	20	1479	1.49
P2329.png	7950×4334	1024	200	60	83	3.26
P2641.png	7574×5625	1024	200	63	269	3.41
P2642.png	7039×5551	1024	200	63	451	3.50
P2643.png	7568×5619	1024	200	63	249	3.40
P2645.png	4605×3442	1024	200	24	357	1.42
P2762.png	8074×4359	1024	200	60	127	3.23
P2795.png	4495×3981	1024	200	30	65	1.64

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Related tags

Overview

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Install

Getting Started

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

FCOSR serise DOTA 1.0 result.FPS(2080ti) Detail

FCOSR serise DOTA 1.5 result. FPS(2080ti) Detail

FCOSR serise HRSC2016 result. FPS(2080ti)

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

Lightweight FCOSR test result on Jetson AGX Xavier (DOTA 1.0 single-scale).

Owner

基于DouZero定制AI实战欢乐斗地主

An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities.

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

Code and models for "Pano3D: A Holistic Benchmark and a Solid Baseline for 360 Depth Estimation", OmniCV Workshop @ CVPR21.

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

Generative Models as a Data Source for Multiview Representation Learning

Implementation of Uformer, Attention-based Unet, in Pytorch

📚 Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

L-Verse: Bidirectional Generation Between Image and Text

RATCHET is a Medical Transformer for Chest X-ray Diagnosis and Reporting

WTTE-RNN a framework for churn and time to event prediction

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

Unofficial implementation of Proxy Anchor Loss for Deep Metric Learning

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

A pre-trained language model for social media text in Spanish

A Flow-based Generative Network for Speech Synthesis

This repository contains all source code, pre-trained models related to the paper "An Empirical Study on GANs with Margin Cosine Loss and Relativistic Discriminator"

DGCNN - Dynamic Graph CNN for Learning on Point Clouds

Implements Gradient Centralization and allows it to use as a Python package in TensorFlow