The code repository for "RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection" (ACM MM'21)

Related tags

Deep LearningRCNet
Overview

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection (ACM MM'21)

By Zhuofan Zong, Qianggang Cao, Biao Leng

Introduction

Feature pyramid networks (FPN) are widely exploited for multi-scale feature fusion in existing advanced object detection frameworks. Numerous previous works have developed various structures for bidirectional feature fusion, all of which are shown to improve the detection performance effectively. We observe that these complicated network structures require feature pyramids to be stacked in a fixed order, which introduces longer pipelines and reduces the inference speed. Moreover, semantics from non-adjacent levels are diluted in the feature pyramid since only features at adjacent pyramid levels are merged by the local fusion operation in a sequence manner. To address these issues, we propose a novel architecture named RCNet, which consists of Reverse Feature Pyramid (RevFP) and Cross-scale Shift Network (CSN). RevFP utilizes local bidirectional feature fusion to simplify the bidirectional pyramid inference pipeline. CSN directly propagates representations to both adjacent and non-adjacent levels to enable multi-scale features more correlative. Extensive experiments on the MS COCO dataset demonstrate RCNet can consistently bring significant improvements over both one-stage and two-stage detectors with subtle extra computational overhead. In particular, RetinaNet is boosted to 40.2 AP, which is 3.7 points higher than baseline, by replacing FPN with our proposed model. On COCO test-dev, RCNet can achieve very competitive performance with a single-model single-scale 50.5 AP.

Models

Pretrained models will be available.

Training and Testing

This project is based on mmdetection. Please follow mmdetection on how to install and use this repo. Config files can be found in configs/rcnet/.

Results on MS COCO

Detector Backbone Neck Lr schd mAP(val) mAP(test)
RetinaNet R50 RCNet 1x 40.2 -
ATSS R50 RCNet 1x 42.6 -
GFL R50 RCNet 1x 43.1 -
GFL R101 RCNet 2x 47.1 47.4
GFL X101-64x4d RCNet 2x 48.9 49.2
GFL X101-64x4d-DCN RCNet 2x 50.2 50.5

Citations

If you find RCNet useful in your research, please consider citing:

@inproceedings{zong2021rcnet,
author = {Zong, Zhuofan and Cao, Qianggang and Leng, Biao},
title = {RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection},
booktitle = {ACM MM},
pages = {5637–5645},
year = {2021}
}

License

This project is released under the Apache 2.0 license

Owner
TempleX
TempleX
Embeddinghub is a database built for machine learning embeddings.

Embeddinghub is a database built for machine learning embeddings.

Featureform 1.2k Jan 01, 2023
Customised to detect objects automatically by a given model file(onnx)

LabelImg LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML

Heeone Lee 1 Jun 07, 2022
A bunch of random PyTorch models using PyTorch's C++ frontend

PyTorch Deep Learning Models using the C++ frontend Gettting started Clone the repo 1. https://github.com/mrdvince/pytorchcpp 2. cd fashionmnist or

Vince 0 Jul 13, 2021
A unified framework for machine learning with time series

Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible

The Alan Turing Institute 6k Jan 08, 2023
Credit fraud detection in Python using a Jupyter Notebook

Credit-Fraud-Detection - Credit fraud detection in Python using a Jupyter Notebook , using three classification models (Random Forest, Gaussian Naive Bayes, Logistic Regression) from the sklearn libr

Ali Akram 4 Dec 28, 2021
Segment axon and myelin from microscopy data using deep learning

Segment axon and myelin from microscopy data using deep learning. Written in Python. Using the TensorFlow framework. Based on a convolutional neural network architecture. Pixels are classified as eit

NeuroPoly 103 Nov 29, 2022
Official implementation of Self-supervised Image-to-text and Text-to-image Synthesis

Self-supervised Image-to-text and Text-to-image Synthesis This is the official implementation of Self-supervised Image-to-text and Text-to-image Synth

6 Jul 31, 2022
Data Consistency for Magnetic Resonance Imaging

Data Consistency for Magnetic Resonance Imaging Data Consistency (DC) is crucial for generalization in multi-modal MRI data and robustness in detectin

Dimitris Karkalousos 19 Dec 12, 2022
Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease

Heart_Disease_Classification Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease Dataset

Ashish 1 Jan 30, 2022
Code for "The Box Size Confidence Bias Harms Your Object Detector"

The Box Size Confidence Bias Harms Your Object Detector - Code Disclaimer: This repository is for research purposes only. It is designed to maintain r

Johannes G. 24 Dec 07, 2022
Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w

OpenAI 2.9k Jan 04, 2023
HNECV: Heterogeneous Network Embedding via Cloud model and Variational inference

HNECV This repository provides a reference implementation of HNECV as described in the paper: HNECV: Heterogeneous Network Embedding via Cloud model a

4 Jun 28, 2022
A collection of Jupyter notebooks to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

StyleGAN3 CLIP-based guidance StyleGAN3 + CLIP StyleGAN3 + inversion + CLIP This repo is a collection of Jupyter notebooks made to easily play with St

Eugenio Herrera 176 Dec 30, 2022
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing"

A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf 2021). Abstract In this work we propose Pathfind

Benedek Rozemberczki 49 Dec 01, 2022
Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

Keras 57k Jan 09, 2023
Repository for the paper "Online Domain Adaptation for Occupancy Mapping", RSS 2020

RSS 2020 - Online Domain Adaptation for Occupancy Mapping Repository for the paper "Online Domain Adaptation for Occupancy Mapping", Robotics: Science

Anthony 26 Sep 22, 2022
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation

Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf

156 Dec 27, 2022
Synthesizing and manipulating 2048x1024 images with conditional GANs

pix2pixHD Project | Youtube | Paper Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translatio

NVIDIA Corporation 6k Dec 27, 2022
Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition"

Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition", accepted at ACL 2021. For details of the model and experiments, please see our paper.

tricktreat 87 Dec 16, 2022
PINN Burgers - 1D Burgers equation simulated by PINN

PINN(s): Physics-Informed Neural Network(s) for Burgers equation This is an impl

ShotaDEGUCHI 1 Feb 12, 2022