PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Introduction

This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Updates!!

  • 【2021/10/10】 We release the official PyTorch implementation of DeFRCN.
  • 【2021/08/20】 We have uploaded our paper (long version with supplementary material) to arXiv; see it for more details.

Quick Start

1. Check Requirements

  • Linux with Python >= 3.6
  • PyTorch >= 1.6 & torchvision that matches the PyTorch version.
  • CUDA 10.1, 10.2
  • GCC >= 4.9
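
An environment can be compared against the requirements above with a small script like the one below. This is only a sketch: the helper names (`parse_version`, `check_environment`, `installed_torch_version`) are hypothetical and not part of DeFRCN, and only the Python and PyTorch versions are checked (CUDA and GCC are not).

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.6.0+cu101' into a tuple like (1, 6, 0)."""
    # Drop the local build tag ('+cu101') before splitting on dots.
    public = text.split("+")[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(python_version, torch_version):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(python_version[:2]) < (3, 6):
        problems.append("Python >= 3.6 is required")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version)[:2] < (1, 6):
        problems.append("PyTorch >= 1.6 is required")
    return problems


def installed_torch_version():
    """Return torch.__version__, or None when PyTorch cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__


if __name__ == "__main__":
    issues = check_environment(sys.version_info, installed_torch_version())
    print("\n".join(issues) if issues else "Python and PyTorch versions look fine")
```

Passing the versions in as arguments keeps `check_environment` easy to try with other values, for example `check_environment((3, 8), "1.6.0+cu101")`.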

2. Build DeFRCN

  • Clone Code
    git clone https://github.com/er-muyue/DeFRCN.git
    cd DeFRCN
    
  • Create a virtual environment (optional)
    virtualenv defrcn
    cd /path/to/venv/defrcn
    source ./bin/activate
    
  • Install PyTorch 1.6.0 with CUDA 10.1
    pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
  • Install Detectron2
    python3 -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
    
    • If you use another version of PyTorch/CUDA, check the latest version of Detectron2 on this page: Detectron2.
    • I have not had time to test other versions; if you run into problems with them, please let me know.
  • Install other requirements.
    python3 -m pip install -r requirements.txt
    

3. Prepare Data and Weights

  • Data Preparation
    • We evaluate our models on two datasets for both FSOD and G-FSOD settings:

      Dataset    Size  GoogleDrive  BaiduYun  Note
      VOC2007    0.8G  download     download  -
      VOC2012    3.5G  download     download  -
      vocsplit   <1M   download     download  taken from TFA
      COCO       ~19G  -            -         download from the official site
      cocosplit  174M  download     download  taken from TFA
    • Unzip the downloaded data-source to datasets and put it into your project directory:

        ...
        datasets
          | -- coco (trainval2014/*.jpg, val2014/*.jpg, annotations/*.json)
          | -- cocosplit
          | -- VOC2007
          | -- VOC2012
          | -- vocsplit
        defrcn
        tools
        ...
      
  • Weights Preparation
    • We use ImageNet pre-trained weights to initialize our model. Download them from here: GoogleDrive BaiduYun
    • The extraction code for all BaiduYun links is 0000
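
Before training, it can help to confirm that the datasets directory matches the layout shown above. The following is a minimal sketch under that assumption; `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical names, not part of the repo, and only the top-level sub-directories are checked, not their contents.

```python
from pathlib import Path

# Sub-directories of datasets/ named in the layout above.
EXPECTED_DIRS = ("coco", "cocosplit", "VOC2007", "VOC2012", "vocsplit")


def missing_dataset_dirs(project_root):
    """Return the expected sub-directories of <project_root>/datasets that are absent."""
    datasets = Path(project_root) / "datasets"
    return [name for name in EXPECTED_DIRS if not (datasets / name).is_dir()]


if __name__ == "__main__":
    # Assumes the script is started from the project directory.
    missing = missing_dataset_dirs(".")
    if missing:
        print("Missing under datasets/:", ", ".join(missing))
    else:
        print("All expected dataset directories are present")
```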

4. Training and Evaluation

For ease of training and evaluation over multiple runs, we integrate the whole few-shot object detection pipeline into a single script per dataset (run_*.sh), covering base pre-training and novel fine-tuning (for both FSOD and G-FSOD).

  • To reproduce the results on VOC, EXP_NAME can be any string (e.g., defrcn) and SPLIT_ID must be 1, 2 or 3 (we use 3 random splits, like other papers).
    bash run_voc.sh EXP_NAME SPLIT_ID    # SPLIT_ID: 1, 2 or 3
    
  • To reproduce the results on COCO, EXP_NAME can be any string (e.g., defrcn).
    bash run_coco.sh EXP_NAME
    
  • Please read run_*.sh for the details of the few-shot object detection pipeline; you need to change IMAGENET_PRETRAIN* to your own paths.
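
Because VOC results are reported over three splits, the per-split invocations above can be generated in a loop. The following is a minimal sketch, not part of the repo: `voc_commands` and `run_all` are hypothetical helper names, and it assumes it is started from the directory that contains run_voc.sh. It defaults to a dry run that only prints the commands.

```python
import subprocess


def voc_commands(exp_name, split_ids=(1, 2, 3)):
    """Build one `bash run_voc.sh EXP_NAME SPLIT_ID` command per split."""
    commands = []
    for split_id in split_ids:
        if split_id not in (1, 2, 3):
            raise ValueError("SPLIT_ID must be 1, 2 or 3")
        commands.append(["bash", "run_voc.sh", exp_name, str(split_id)])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; also execute it unless dry_run is set."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing run.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(voc_commands("defrcn"), dry_run=True)
```

Calling `run_all(voc_commands("defrcn"), dry_run=False)` would launch the three runs one after another.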

Results on COCO Benchmark

  • Few-shot Object Detection

    Method              mAPnovel
    Shot                1     2     3     5     10    30
    FRCN-ft             1.0*  1.8*  2.8*  4.0*  6.5   11.1
    FSRW                -     -     -     -     5.6   9.1
    MetaDet             -     -     -     -     7.1   11.3
    MetaR-CNN           -     -     -     -     8.7   12.4
    TFA                 4.4*  5.4*  6.0*  7.7*  10.0  13.7
    MPSR                5.1*  6.7*  7.4*  8.7*  9.8   14.1
    FSDetView           4.5   6.6   7.2   10.7  12.5  14.7
    DeFRCN (Our Paper)  9.3   12.9  14.8  16.1  18.5  22.6
    DeFRCN (This Repo)  9.7   13.1  14.5  15.6  18.4  22.6
  • Generalized Few-shot Object Detection

    Method              mAPnovel
    Shot                1     2     3     5     10    30
    FRCN-ft             1.7   3.1   3.7   4.6   5.5   7.4
    TFA                 1.9   3.9   5.1   7     9.1   12.1
    FSDetView           3.2   4.9   6.7   8.1   10.7  15.9
    DeFRCN (Our Paper)  4.8   8.5   10.7  13.6  16.8  21.2
    DeFRCN (This Repo)  4.8   8.5   10.7  13.5  16.7  21.0
  • * indicates results that we reproduced with the authors' source code.
  • It is normal to observe noise of about -0.3 to +0.3 AP between your results and those in this repo.
  • The results of mAPbase and mAPall for G-FSOD are listed here: GoogleDrive, BaiduYun.
  • If you have any problems with the results above, you can download the configs and training logs from GoogleDrive, BaiduYun.
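
The noise band of -0.3 to +0.3 AP mentioned above can be applied mechanically when comparing a local run with this repo's numbers. The sketch below is illustrative only: `outside_noise` is a hypothetical helper, the reference values are the "DeFRCN (This Repo)" FSOD row of the COCO table, and the `my_run` numbers are invented for the example.

```python
# mAPnovel of "DeFRCN (This Repo)" for FSOD on COCO, keyed by shot.
REPO_FSOD_COCO = {1: 9.7, 2: 13.1, 3: 14.5, 5: 15.6, 10: 18.4, 30: 22.6}


def outside_noise(mine, reference=REPO_FSOD_COCO, tolerance=0.3):
    """Return {shot: difference} for shots whose AP differs by more than `tolerance`."""
    flagged = {}
    for shot, value in mine.items():
        # Round to one decimal so float error does not push 0.3 over the limit.
        diff = round(value - reference[shot], 1)
        if abs(diff) > tolerance:
            flagged[shot] = diff
    return flagged


if __name__ == "__main__":
    # Hypothetical numbers from a local run, for illustration only.
    my_run = {1: 9.5, 2: 13.4, 3: 14.0, 5: 15.6, 10: 18.6, 30: 22.3}
    print(outside_noise(my_run))  # prints {3: -0.5}
```

Shots flagged this way are the ones where comparing against the released configs and training logs is most likely to be worthwhile.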

Results on VOC Benchmark

  • Few-shot Object Detection

    Method              Split-1                       Split-2                       Split-3
    Shot                1     2     3     5     10    1     2     3     5     10    1     2     3     5     10
    YOLO-ft             6.6   10.7  12.5  24.8  38.6  12.5  4.2   11.6  16.1  33.9  13.0  15.9  15.0  32.2  38.4
    FRCN-ft             13.8  19.6  32.8  41.5  45.6  7.9   15.3  26.2  31.6  39.1  9.8   11.3  19.1  35.0  45.1
    FSRW                14.8  15.5  26.7  33.9  47.2  15.7  15.2  22.7  30.1  40.5  21.3  25.6  28.4  42.8  45.9
    MetaDet             18.9  20.6  30.2  36.8  49.6  21.8  23.1  27.8  31.7  43.0  20.6  23.9  29.4  43.9  44.1
    MetaR-CNN           19.9  25.5  35.0  45.7  51.5  10.4  19.4  29.6  34.8  45.4  14.3  18.2  27.5  41.2  48.1
    TFA                 39.8  36.1  44.7  55.7  56.0  23.5  26.9  34.1  35.1  39.1  30.8  34.8  42.8  49.5  49.8
    MPSR                41.7  -     51.4  55.2  61.8  24.4  -     39.2  39.9  47.8  35.6  -     42.3  48.0  49.7
    DeFRCN (Our Paper)  53.6  57.5  61.5  64.1  60.8  30.1  38.1  47.0  53.3  47.9  48.4  50.9  52.3  54.9  57.4
    DeFRCN (This Repo)  55.1  57.4  61.1  64.6  61.5  32.1  40.5  47.9  52.9  47.5  48.9  51.9  52.3  55.7  59.0
  • Generalized Few-shot Object Detection

    Method              Split-1                       Split-2                       Split-3
    Shot                1     2     3     5     10    1     2     3     5     10    1     2     3     5     10
    FRCN-ft             9.9   15.6  21.6  28.0  52.0  9.4   13.8  17.4  21.9  39.7  8.1   13.9  19    23.9  44.6
    FSRW                14.2  23.6  29.8  36.5  35.6  12.3  19.6  25.1  31.4  29.8  12.5  21.3  26.8  33.8  31.0
    TFA                 25.3  36.4  42.1  47.9  52.8  18.3  27.5  30.9  34.1  39.5  17.9  27.2  34.3  40.8  45.6
    FSDetView           24.2  35.3  42.2  49.1  57.4  21.6  24.6  31.9  37.0  45.7  21.2  30.0  37.2  43.8  49.6
    DeFRCN (Our Paper)  40.2  53.6  58.2  63.6  66.5  29.5  39.7  43.4  48.1  52.8  35.0  38.3  52.9  57.7  60.8
    DeFRCN (This Repo)  43.8  57.5  61.4  65.3  67.0  31.5  40.9  45.6  50.1  52.9  38.2  50.9  54.1  59.2  61.9
  • Note that we change λGDL-RCNN for VOC to 0.001 (the paper uses 0.01), which gives better performance; check the configs for more details.

  • The results of mAPbase and mAPall for G-FSOD are listed here: GoogleDrive, BaiduYun.

  • If you have any problems with the results above, you can download the configs and logs from GoogleDrive, BaiduYun.

Acknowledgement

This repo is developed based on TFA and Detectron2. Please check them for more details and features.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:

@inproceedings{qiao2021defrcn,
  title={DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection},
  author={Qiao, Limeng and Zhao, Yuxuan and Li, Zhiyuan and Qiu, Xi and Wu, Jianan and Zhang, Chi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={8681--8690},
  year={2021}
}