MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Octave Convolution

ImageNet

Ablation

  • Loss: Softmax
  • Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4)
  • MXNet API: Symbol API
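The learning-rate setting above (cosine schedule, 5 warm-up epochs, peak lr 0.4) can be sketched in plain Python as follows. This is only an illustration, not the schedule code used in this repository: the linear shape of the warm-up, the decay to zero, and `total_epochs=100` are assumptions, since the total training length is not stated here.

```python
import math

def cosine_lr(epoch, base_lr=0.4, warmup_epochs=5, total_epochs=100):
    """Learning rate at a (possibly fractional) epoch.

    base_lr and warmup_epochs follow the settings listed above;
    total_epochs=100 is a placeholder, not a value from this README.
    """
    if epoch < warmup_epochs:
        # Assumed: linear warm-up from 0 to base_lr.
        return base_lr * epoch / warmup_epochs
    # Cosine decay from base_lr to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    progress = min(1.0, progress)  # stay at 0 after the last epoch
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

if __name__ == "__main__":
    for e in (0, 2.5, 5, 50, 100):
        print(e, round(cosine_lr(e), 4))
```

The function can be evaluated per epoch or per iteration (by passing a fractional epoch) and the result written into the optimizer before each update.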

| Model | baseline | alpha = 0.125 | alpha = 0.25 | alpha = 0.5 | alpha = 0.75 |
|---|---|---|---|---|---|
| DenseNet-121 | 75.4 / 92.7 | 76.1 / 93.0 | 75.9 / 93.1 | -- | -- |
| ResNet-26 | 73.2 / 91.3 | 75.8 / 92.6 | 76.1 / 92.6 | 75.5 / 92.5 | 74.6 / 92.1 |
| ResNet-50 | 77.0 / 93.4 | 78.2 / 93.9 | 78.0 / 93.8 | 77.4 / 93.6 | 76.7 / 93.0 |
| SE-ResNet-50 | 77.6 / 93.6 | 78.7 / 94.1 | 78.4 / 94.0 | 77.9 / 93.8 | 77.4 / 93.5 |
| ResNeXt-50 | 78.4 / 94.0 | -- | 78.8 / 94.2 | 78.4 / 94.0 | 77.5 / 93.6 |
| ResNet-101 | 78.5 / 94.1 | 79.2 / 94.4 | 79.2 / 94.4 | 78.7 / 94.1 | -- |
| ResNeXt-101 | 79.4 / 94.6 | -- | 79.6 / 94.5 | 78.9 / 94.4 | -- |
| ResNet-200 | 79.6 / 94.7 | 80.0 / 94.9 | 79.8 / 94.8 | 79.5 / 94.7 | -- |

Note:

  • The table reports Top-1 / Top-5 accuracy with a single center crop. (testing script)
  • All residual networks in the ablation study use the pre-activation version[1] for convenience.
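For readers unfamiliar with the metric, Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The sketch below is a minimal pure-Python illustration with made-up scores; it is not the repository's testing script, and `topk_accuracy` is a hypothetical helper name.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)

# Toy scores for 3 samples over 4 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # best class 1
    [0.5, 0.1, 0.3, 0.1],  # best class 0, runner-up class 2
    [0.3, 0.1, 0.1, 0.5],  # best class 3, runner-up class 0
]
labels = [1, 2, 0]

top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)

if __name__ == "__main__":
    print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

The toy example uses k=2 because it has only four classes; the tables here use k=1 and k=5 over the ImageNet classes.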

Others

  • Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4)
  • MXNet API: Gluon API

| Model | alpha | label smoothing[2] | mixup[3] | #Params | #FLOPs | Top1 / Top5 |
|---|---|---|---|---|---|---|
| 0.75 MobileNet (v1) | .375 | | | 2.6 M | 213 M | 70.5 / 89.5 |
| 1.0 MobileNet (v1) | .5 | | | 4.2 M | 321 M | 72.5 / 90.6 |
| 1.0 MobileNet (v2) | .375 | Yes | | 3.5 M | 256 M | 72.0 / 90.7 |
| 1.125 MobileNet (v2) | .5 | Yes | | 4.2 M | 295 M | 73.0 / 91.2 |
| Oct-ResNet-152 | .125 | Yes | Yes | 60.2 M | 10.9 G | 81.4 / 95.4 |
| Oct-ResNet-152 + SE | .125 | Yes | Yes | 66.8 M | 10.9 G | 81.6 / 95.7 |
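The two training tricks named in the table header, label smoothing[2] and mixup[3], can be sketched as below. This is a minimal pure-Python illustration and not the Gluon training code behind these numbers. The function names, `eps=0.1` and `mix_alpha=0.2` are illustrative assumptions (the values used for these models are not stated here), and `mix_alpha` is the Beta-distribution parameter of mixup, unrelated to the OctConv `alpha` column.

```python
import random

def smooth_labels(label, num_classes, eps=0.1):
    """One-hot target with eps of the mass spread uniformly over all classes."""
    off = eps / num_classes
    return [1.0 - eps + off if c == label else off for c in range(num_classes)]

def mixup(x1, y1, x2, y2, mix_alpha=0.2, rng=random):
    """Blend two samples and their soft targets with lam ~ Beta(mix_alpha, mix_alpha)."""
    lam = rng.betavariate(mix_alpha, mix_alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

if __name__ == "__main__":
    ya = smooth_labels(0, 4)  # soft target for class 0
    yb = smooth_labels(3, 4)  # soft target for class 3
    xa, xb = [1.0, 0.0], [0.0, 1.0]  # stand-ins for two flattened images
    x, y, lam = mixup(xa, ya, xb, yb, rng=random.Random(0))
    print(lam, x, y)
```

Both functions return soft targets that still sum to 1, so they can be fed to a cross-entropy loss that accepts dense (non-sparse) labels.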

Citation

@article{chen2019drop,
  title={Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution},
  author={Chen, Yunpeng and Fan, Haoqi and Xu, Bing and Yan, Zhicheng and Kalantidis, Yannis and Rohrbach, Marcus and Yan, Shuicheng and Feng, Jiashi},
  journal={Proceedings of the IEEE International Conference on Computer Vision},
  year={2019}
}

Third-party Implementations

Acknowledgement

  • Thanks MXNet, Gluon-CV and TVM!
  • Thanks @Ldpe2G for sharing the code for calculating the #FLOPs (link)
  • Thanks Min Lin (Mila), Xin Zhao (Qihoo Inc.), Tao Wang (NUS) for helpful discussions on the code development.

Reference

[1] He K, et al. "Identity Mappings in Deep Residual Networks."

[2] Szegedy C, et al. "Rethinking the Inception Architecture for Computer Vision."

[3] Zhang H, et al. "mixup: Beyond Empirical Risk Minimization."

License

The code and the models are MIT licensed, as found in the LICENSE file.
