Rethinking Transformer-based Set Prediction for Object Detection

Last update: Dec 03, 2022

This is the code for the ICCV 2021 paper "Rethinking Transformer-Based Set Prediction for Object Detection". The code is adapted from Detectron2 and AdelaiDet.

All models are trained on 4 V100 GPUs.

Prerequisites

Modify the environment name and prefix in environment.yml, then create the environment:

conda env create -f environment.yml

Then install Detectron2 at the pinned commit:

git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
git reset --hard b88c6c06563e4db1139aafbd6d8d97d1fa7a57e4
pip install -e .
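
After these steps, it can help to confirm that the new environment actually resolves the expected packages before launching training. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the module names `torch` and `detectron2` are assumptions about what the environment should provide.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust them to match what environment.yml installs.
    required = ["torch", "detectron2"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it with the conda environment activated. A non-empty "Missing modules" line suggests that one of the installation steps above did not complete.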

Reproducing Results

For TSP-FCOS,

bash tsp_fcos.sh

For TSP-RCNN,

bash tsp_rcnn.sh

Citation

@InProceedings{Sun_2021_ICCV,
    author    = {Sun, Zhiqing and Cao, Shengcao and Yang, Yiming and Kitani, Kris M.},
    title     = {Rethinking Transformer-Based Set Prediction for Object Detection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3611-3620}
}

Owner

Zhiqing Sun

Third-year Ph.D. student at LTI, CMU

