Deep Residual Networks with 1K Layers

By Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

Microsoft Research Asia (MSRA).

Introduction
Notes
Usage

Introduction

This repository contains re-implemented code for the paper "Identity Mappings in Deep Residual Networks" (http://arxiv.org/abs/1603.05027). This work enables training quality 1k-layer neural networks in a super simple way.

Acknowledgement: This code is re-implemented by Xiang Ming from Xi'an Jiaotong Univeristy for the ease of release.

Seel Also: Re-implementations of ResNet-200 [a] on ImageNet from Facebook AI Research (FAIR): https://github.com/facebook/fb.resnet.torch/tree/master/pretrained

Notes

This code is based on the implementation of Torch ResNets (https://github.com/facebook/fb.resnet.torch).
The experiments in the paper were conducted in Caffe, whereas this code is re-implemented in Torch. We observed similar results within reasonable statistical variations.
To fit the 1k-layer models into memory without modifying much code, we simply reduced the mini-batch size to 64, noting that results in the paper were obtained with a mini-batch size of 128. Less expectedly, the results with the mini-batch size of 64 are slightly better:

mini-batch CIFAR-10 test error (%): (median (mean+/-std))

128 (as in [a]) 4.92 (4.89+/-0.14)

64 (as in this code) 4.62 (4.69+/-0.20)
Curves obtained by running this code with a mini-batch size of 64 (training loss: y-axis on the left; test error: y-axis on the right):

mini-batch	CIFAR-10 test error (%): (median (mean+/-std))
128 (as in [a])	4.92 (4.89+/-0.14)
64 (as in this code)	4.62 (4.69+/-0.20)

Usage

Install Torch ResNets (https://github.com/facebook/fb.resnet.torch) following instructions therein.
Add the file resnet-pre-act.lua from this repository to ./models.
To train ResNet-1001 as of the form in [a]:

th main.lua -netType resnet-pre-act -depth 1001 -batchSize 64 -nGPU 2 -nThreads 4 -dataset cifar10 -nEpochs 200 -shareGradInput false

Note: ``shareGradInput=true'' is not valid for this model yet.

Deep Residual Networks with 1K Layers

Related tags

Overview

Deep Residual Networks with 1K Layers

Table of Contents

Introduction

Notes

Usage

Owner

Kaiming He

Imagededup - 😎 Finding duplicate images made easy

ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Code corresponding to The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents

ISBI 2022: Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image.

FLVIS: Feedback Loop Based Visual Initial SLAM

Single Image Deraining Using Bilateral Recurrent Network (TIP 2020)

CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods."

PyTorch reimplementation of minimal-hand (CVPR2020)

Pointer networks Tensorflow2

Using VapourSynth with super resolution models and speeding them up with TensorRT.

COVID-Net Open Source Initiative

Experiments and examples converting Transformers to ONNX

Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval.

Code for “ACE-HGNN: Adaptive Curvature ExplorationHyperbolic Graph Neural Network”

Implementation of Monocular Direct Sparse Localization in a Prior 3D Surfel Map (DSL)

Revisiting Self-Training for Few-Shot Learning of Language Model.

All the essential resources and template code needed to understand and practice data structures and algorithms in python with few small projects to demonstrate their practical application.