Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

Related tags

Deep LearningIC-Conv
Overview

IC-Conv

This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search.

Getting Started

Download ImageNet pre-trained checkpoints.

Extract the file to get the following directory tree

|-- README.md
|-- ckpt
|   |-- detection
|   |-- human_pose
|   |-- segmentation
|-- config
|-- model
|-- pattern_zoo

Easy Use

The current implementation is coupled to specific downstream tasks. OpenMMLab users can quickly use IC-Conv in the following simple ways.

from models import IC_ResNet
import torch
net = IC_ResNet(depth=50,pattern_path='pattern_zoo/detection/ic_r50_k9.json')
net.eval()
inputs = torch.rand(1, 3, 32, 32)
outputs = net.forward(inputs)

For 2d Human Pose Estimation using MMPose

  1. Copying the config files to the config path of mmpose, such as
cp config/human_pose/ic_res50_k13_coco_640x640.py your_mmpose_path/mmpose/configs/bottom_up/resnet/coco/ic_res50_k13_coco_640x640.py
  1. Copying the inception conv files to the model path of mmpose,
cp model/ic_conv2d.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_conv2d.py
cp model/ic_resnet.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_resnet.py
  1. Running it directly like MMPose.

Model Zoo

We provided the pre-trained weights of IC-ResNet-50, IC-ResNet-101and IC-ResNeXt-101 (32x4d) on ImageNet and the weights trained on specific tasks.

For users with limited computing power, you can directly reuse our provided IC-Conv and ImageNet pre-training weights for detection, segmentation, and 2d human pose estimation tasks on other datasets.

Attentions: The links in the tables below are relative paths. Therefore, you should clone the repository and download checkpoints.

Object Detection

Detector Backbone Lr AP dilation_pattern checkpoint
Faster-RCNN-FPN IC-R50 1x 38.9 pattern ckpt/imagenet_retrain_ckpt
Faster-RCNN-FPN IC-R101 1x 41.9 pattern ckpt/imagenet_retrain_ckpt
Faster-RCNN-FPN IC-X101-32x4d 1x 42.1 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-R50 1x 42.4 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-R101 1x 45.0 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-X101-32x4d 1x 45.7 pattern ckpt/imagenet_retrain_ckpt

Instance Segmentation

Detector Backbone Lr box AP mask AP dilation_pattern checkpoint
Mask-RCNN-FPN IC-R50 1x 40.0 35.9 pattern ckpt/imagenet_retrain_ckpt
Mask-RCNN-FPN IC-R101 1x 42.6 37.9 pattern ckpt/imagenet_retrain_ckpt
Mask-RCNN-FPN IC-X101-32x4d 1x 43.4 38.4 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-R50 1x 43.4 36.8 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-R101 1x 45.7 38.7 pattern ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN IC-X101-32x4d 1x 46.4 39.1 pattern ckpt/imagenet_retrain_ckpt

2d Human Pose Estimation

We adjust the learning rate of resnet backbone in MMPose and get better baseline results. Please see the specific config files in config/human_pose/.

Results on COCO val2017 without multi-scale test
Backbone Input Size AP dilation_pattern checkpoint
R50(mmpose) 640x640 47.9 ~ ~
R50 640x640 51.0 ~ ~
IC-R50 640x640 62.2 pattern ckpt/imagenet_retrain_ckpt
R101 640x640 55.5 ~ ~
IC-R101 640x640 63.3 pattern ckpt/imagenet_retrain_ckpt
Results on COCO val2017 with multi-scale test. 3 default scales ([2, 1, 0.5]) are used
Backbone Input Size AP
R50(mmpose) 640x640 52.5
R50 640x640 55.8
IC-R50 640x640 65.8
R101 640x640 60.2
IC-R101 640x640 68.5

Acknowledgement

The human pose estimation experiments are built upon MMPose.

Citation

If our paper helps your research, please cite it in your publications:

@article{liu2020inception,
 title={Inception Convolution with Efficient Dilation Search},
 author={Liu, Jie and Li, Chuming and Liang, Feng and Lin, Chen and Sun, Ming and Yan, Junjie and Ouyang, Wanli and Xu, Dong},
 journal={arXiv preprint arXiv:2012.13587},
 year={2020}
}
Owner
Jie Liu
Jie Liu
It's a implement of this paper:Relation extraction via Multi-Level attention CNNs

Relation Classification via Multi-Level Attention CNNs It's a implement of this paper:Relation Classification via Multi-Level Attention CNNs. Training

Aybss 2 Nov 04, 2022
This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters.

openmc-plasma-source This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters. The OpenMC sources a

Fusion Energy 10 Oct 18, 2022
Code for the upcoming CVPR 2021 paper

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel J. Brostow and Michael

Niantic Labs 496 Dec 30, 2022
Apollo optimizer in tensorflow

Apollo Optimizer in Tensorflow 2.x Notes: Warmup is important with Apollo optimizer, so be sure to pass in a learning rate schedule vs. a constant lea

Evan Walters 1 Nov 09, 2021
CLEAR algorithm for multi-view data association

CLEAR: Consistent Lifting, Embedding, and Alignment Rectification Algorithm The Matlab, Python, and C++ implementation of the CLEAR algorithm, as desc

MIT Aerospace Controls Laboratory 30 Jan 02, 2023
Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat

Rush Kapoor 2 Nov 21, 2022
PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 Oral paper PiCO; also see our Project

王皓波 147 Jan 07, 2023
A study project using the AA-RMVSNet to reconstruct buildings from multiple images

3d-building-reconstruction This is part of a study project using the AA-RMVSNet to reconstruct buildings from multiple images. Introduction It is exci

17 Oct 17, 2022
Group-Free 3D Object Detection via Transformers

Group-Free 3D Object Detection via Transformers By Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong. This repo is the official implementation of "Group-

Ze Liu 213 Dec 07, 2022
A trashy useless Latin programming language written in python.

Codigum! The first programming langage in latin! (please keep your eyes closed when if you read the source code) It is pretty useless though. Document

Bic 2 Oct 25, 2021
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5)

YOLOv5-GUI 🎉 YOLOv5算法(ver.6及ver.5)的Qt-GUI实现 🎉 Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5). 基于YOLOv5的v5版本和v6版本及Javacr大佬的UI逻辑进行编写

EricFang 12 Dec 28, 2022
Code for "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" paper

Myo Keylogging This is the source code for our paper My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack by Matthias Ga

Secure Mobile Networking Lab 7 Jan 03, 2023
Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks

This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.

Thomas Roddick 219 Dec 20, 2022
Deep Learning with PyTorch made easy 🚀 !

Deep Learning with PyTorch made easy 🚀 ! Carefree? carefree-learn aims to provide CAREFREE usages for both users and developers. It also provides a c

381 Dec 22, 2022
Revisting Open World Object Detection

Revisting Open World Object Detection Installation See INSTALL.md. Dataset Our n

58 Dec 23, 2022
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Stephen James 51 Dec 27, 2022
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research

MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research.The pipeline is based on nn-UNet an

QIMP team 30 Jan 01, 2023
This is the workbook I created while I was studying for the Qiskit Associate Developer exam. I hope this becomes useful to others as it was for me :)

A Workbook for the Qiskit Developer Certification Exam Hello everyone! This is Bartu, a fellow Qiskitter. I have recently taken the Certification exam

Bartu Bisgin 66 Dec 10, 2022