Official Code for "Non-deep Networks"

Overview

Non-deep Networks
arXiv:2110.07641
Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun

Overview: Depth is the hallmark of DNNs. But more depth means more sequential computation and higher latency. This begs the question -- is it possible to build high-performing ``non-deep" neural networks? We show that it is. We show, for the first time, that a network with a depth of just 12 can achieve top-1 accuracy over 80% on ImageNet, 96% on CIFAR10, and 81% on CIFAR100. We also show that a network with a low-depth (12) backbone can achieve an AP of 48% on MS-COCO.

If you find our work useful, please consider citing it:

@article{goyal2021nondeep,
  title={Non-deep Networks},
  author={Goyal, Ankit and Bochkovskiy, Alexey and Deng, Jia and Koltun, Vladlen},
  journal={arXiv:2110.07641},
  year={2021}
}

Code Coming Soon!

Comments
  • when will the code of the model be released?

    when will the code of the model be released?

    I am very interested in your research, when will the code of the model be released? I saw on October 23rd that you said it would be released in 4 weeks

    opened by Dr-Goopher 6
  • When will the code be released?

    When will the code be released?

    I am very interested in your work and would like to further study. I hope you can release the code as soon as possible in your busy schedule. Thank you!

    opened by SenShu96 5
  • what is the meaning of 'Shuffle' of fusion block in Fig. A1?

    what is the meaning of 'Shuffle' of fusion block in Fig. A1?

    Hello. Thank you for your great study. I wonder the meaning of 'Shuffle' of fusion block in Fig. A1. Is it pixel shuffle layer? Please let me know the meaning of that.

    Thank you.

    opened by jhcha08 3
  • Question about SSE module

    Question about SSE module

    Hi. Figure 2b shows that there's one 1x1conv in a branch of SSE, how to match the channel of output by 1x1conv with the channel of input after shortcut? If I set the output channel of 1x1conv the same as input, the channels of the outputs by RepVGG block and SSE will not match.

    opened by Tsianmy 2
  • Really faster than ResNet? I am very confused

    Really faster than ResNet? I am very confused

    Hello, my friend, appreciate for your great work! I have tested the code on https://github.com/Pritam-N/ParNet by Pritam-N and change the ResNet code in my model by using your ParNet , but the actual time is quite slow than the paper said. My block size is [64, 128, 256, 512, 2048], and the time of "forward()" is more than 5s average while the Resnet is 0.02s in my device. I have use the time function for every line in the forward(), find that the encode stuff is the main reason. I continue write time.perf_counter() in the encode stuff, find that the "self.stream2_fusion" and "self.stream3_fusion" is the most time user. Do you know why ?

    opened by StonepageVan 1
  •  fusion module, accuracy about cifar100

    fusion module, accuracy about cifar100

    1. what is your shuffle code in your fusion module?
    2. what is your model architecture in cifar-100? I just changed front two downsample modules based on the ParNet for Imagenet in the paper. But the accuracy is lower. And How do you set the LR, MILESTONES and NUM_EPOCH to meet high accuracy?
    opened by qq769852576 2
Owner
Ankit Goyal
Phd Candidate @Princeton | Works in CV and AI
Ankit Goyal
Pytorch implementation of MLP-Mixer with loading pre-trained models.

MLP-Mixer-Pytorch PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained p

Qiushi Yang 2 Sep 29, 2022
Import Python modules from dicts and JSON formatted documents.

Paker Paker is module for importing Python packages/modules from dictionaries and JSON formatted documents. It was inspired by httpimporter. Important

Wojciech Wentland 1 Sep 07, 2022
Code for LIGA-Stereo Detector, ICCV'21

LIGA-Stereo Introduction This is the official implementation of the paper LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based

Xiaoyang Guo 75 Dec 09, 2022
2021 CCF BDCI 全国信息检索挑战杯(CCIR-Cup)智能人机交互自然语言理解赛道第二名参赛解决方案

2021 CCF BDCI 全国信息检索挑战杯(CCIR-Cup) 智能人机交互自然语言理解赛道第二名解决方案 比赛网址: CCIR-Cup-智能人机交互自然语言理解 1.依赖环境: python==3.8 torch==1.7.1+cu110 numpy==1.19.2 transformers=

JinXiang 22 Oct 29, 2022
A project for developing transformer-based models for clinical relation extraction

Clinical Relation Extration with Transformers Aim This package is developed for researchers easily to use state-of-the-art transformers models for ext

uf-hobi-informatics-lab 101 Dec 19, 2022
Trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI

Introduction This script trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI. In order to run this

Momin Haider 0 Jan 02, 2022
Automatic library of congress classification, using word embeddings from book titles and synopses.

Automatic Library of Congress Classification The Library of Congress Classification (LCC) is a comprehensive classification system that was first deve

Ahmad Pourihosseini 3 Oct 01, 2022
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

111 Dec 27, 2022
Some methods for comparing network representations in deep learning and neuroscience.

Generalized Shape Metrics on Neural Representations In neuroscience and in deep learning, quantifying the (dis)similarity of neural representations ac

Alex Williams 45 Dec 27, 2022
TensorFlow implementation of the algorithm in the paper "Decoupled Low-light Image Enhancement"

Decoupled Low-light Image Enhancement Shijie Hao1,2*, Xu Han1,2, Yanrong Guo1,2 & Meng Wang1,2 1Key Laboratory of Knowledge Engineering with Big Data

17 Apr 25, 2022
EfficientNetV2 implementation using PyTorch

EfficientNetV2-S implementation using PyTorch Train Steps Configure imagenet path by changing data_dir in train.py python main.py --benchmark for mode

Jahongir Yunusov 86 Dec 29, 2022
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
Using Hotel Data to predict High Value And Potential VIP Guests

Description Using hotel data and AI to predict high value guests and potential VIP guests. Hotel can leverage on prediction resutls to run more effect

HCG 12 Feb 14, 2022
Rohit Ingole 2 Mar 24, 2022
[AAAI 2022] Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation

A paper Introduction This is an official release of the paper Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation wit

Jiacheng Wang 14 Dec 08, 2022
Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly

Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly Code for this paper Ultra-Data-Efficient GAN Tra

VITA 77 Oct 05, 2022
Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr

Zoey Li 87 Dec 26, 2022
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour

Gordo Building thousands of models with timeseries data to monitor systems. Table of content About Examples Install Uninstall Developer manual How to

Equinor 26 Dec 27, 2022
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intel ISL (Intel Intelligent Systems Lab) 1.3k Dec 28, 2022