Pre-trained NFNets with 99% of the accuracy of the official paper

Overview

NFNet PyTorch Implementation

This repo contains pretrained NFNet models F0-F6 with high ImageNet accuracy from the paper High-Performance Large-Scale Image Recognition Without Normalization. The small models are as accurate as an EfficientNet-B7, but train 8.7 times faster. The large models set a new SOTA top-1 accuracy on ImageNet.

NFNet                                F0     F1     F2     F3     F4     F5     F6+SAM
Top-1 accuracy Brock et al.          83.6   84.7   85.1   85.7   85.9   86.0   86.5
Top-1 accuracy this implementation   82.82  84.63  84.90  85.46  85.66  85.62  TBD

All credits go to the authors of the original paper. This repo is heavily inspired by their JAX implementation in the official repository; visit their repo for citation information.

Get started

git clone https://github.com/benjs/nfnets_pytorch.git
cd nfnets_pytorch
pip3 install -r requirements.txt

Download pretrained weights from the official repository and place them in the pretrained folder.

from pretrained import pretrained_nfnet
model_F0 = pretrained_nfnet('pretrained/F0_haiku.npz')
model_F1 = pretrained_nfnet('pretrained/F1_haiku.npz')
# ...

The model variant is automatically derived from the parameter count in the pretrained weights file.
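For a quick sanity check, you can run a single image through a loaded model. The snippet below is a minimal sketch: the 224×224 crop, the ImageNet normalization constants and the image path are illustrative assumptions, not values taken from this repo — see eval.py for the exact preprocessing behind the reported accuracies.

import torch
from PIL import Image
from torchvision import transforms
from pretrained import pretrained_nfnet

model = pretrained_nfnet('pretrained/F0_haiku.npz').eval()

# Hypothetical preprocessing - check eval.py for the pipeline actually used
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open('cat.jpg').convert('RGB')).unsqueeze(0)  # hypothetical image path
with torch.no_grad():
    logits = model(img)
print('Predicted class index:', logits.argmax(dim=1).item())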

Validate yourself

python3 eval.py --pretrained pretrained/F0_haiku.npz --dataset path/to/imagenet/valset/

You can download the ImageNet validation set from the ILSVRC2012 challenge site after requesting access, for instance with your .edu mail address.

Scaled weight standardization convolutions in your own model

Simply replace all your nn.Conv2d with WSConv2D and all your nn.ReLU with VPReLU or VPGELU (variance preserving ReLU/GELU).

import torch.nn as nn
from model import WSConv2D, VPReLU, VPGELU

# Simply replace your nn.Conv2d layers
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
 
        self.activation = VPReLU(inplace=True) # or VPGELU
        self.conv0 = WSConv2D(in_channels=128, out_channels=256, kernel_size=1, ...)
        # ...

    def forward(self, x):
        out = self.activation(self.conv0(x))
        # ...
        return out
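The "variance preserving" activations rescale their output so that a unit-variance input keeps roughly unit variance, which is what allows NFNets to drop batch normalization. A quick empirical check (a sketch, assuming VPReLU can be constructed with default arguments):

import torch
from model import VPReLU

x = torch.randn(1_000_000)               # unit-variance Gaussian input
out = VPReLU()(x)
print(x.std().item(), out.std().item())  # both should be close to 1.0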

SGD with adaptive gradient clipping in your own model

Simply replace your SGD optimizer with SGD_AGC.

from optim import SGD_AGC

optimizer = SGD_AGC(
        named_params=model.named_parameters(), # Pass named parameters
        lr=1e-3,
        momentum=0.9,
        clipping=0.1, # New clipping parameter
        weight_decay=2e-5, 
        nesterov=True)
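For reference, adaptive gradient clipping rescales a gradient G whenever the ratio of its norm to the corresponding parameter norm exceeds the clipping threshold λ: if ‖G‖ / max(‖W‖, ε) > λ, then G ← λ · max(‖W‖, ε) / ‖G‖ · G. The helper below is a simplified per-tensor sketch of that rule, not the repo's implementation — SGD_AGC in optim.py applies it unit-wise inside the optimizer step:

import torch

def agc_clip(param: torch.Tensor, grad: torch.Tensor,
             clipping: float = 0.1, eps: float = 1e-3) -> torch.Tensor:
    """Rescale grad so that ||grad|| / max(||param||, eps) <= clipping."""
    max_norm = param.norm().clamp(min=eps) * clipping
    grad_norm = grad.norm()
    if grad_norm > max_norm:
        grad = grad * (max_norm / grad_norm)
    return grad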

It is important to exclude certain layers from clipping or weight decay. The authors recommend excluding the last fully connected layer from clipping and the bias/gain parameters from weight decay:

import re

for group in optimizer.param_groups:
    name = group['name'] 
    
    # Exclude from weight decay
    if len(re.findall('stem.*(bias|gain)|conv.*(bias|gain)|skip_gain', name)) > 0:
        group['weight_decay'] = 0

    # Exclude from clipping
    if name.startswith('linear'):
        group['clipping'] = None
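With the exclusions in place, SGD_AGC is used like any other PyTorch optimizer. A minimal sketch of one training step, with a dummy batch standing in for real ImageNet data (the 192×192 resolution is an assumption; default_config.yaml defines the actual training settings):

import torch
import torch.nn.functional as F

images = torch.randn(8, 3, 192, 192)    # dummy batch
targets = torch.randint(0, 1000, (8,))  # dummy labels

optimizer.zero_grad()
loss = F.cross_entropy(model(images), targets)
loss.backward()
optimizer.step()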

Train your own NFNet

Adjust your desired parameters in default_config.yaml and start training.

python3 train.py --dataset /path/to/imagenet/

Some parts are still missing for complete training from scratch:

  • Multi-GPU training
  • Data augmentations
  • FP16 activations and gradients

Contribute

The implementation is still at an early stage in terms of usability and testing. If you have an idea to improve this repo, open an issue, start a discussion, or submit a pull request.

Development status

  • Pre-trained NFNet Models
    • F0-F5
    • F6+SAM
    • Scaled weight standardization
    • Squeeze and excite
    • Stochastic depth
    • FP16 activations
  • SGD with unit adaptive gradient clipping (SGD-AGC)
    • Exclude certain layers from weight-decay, clipping
    • FP16 gradients
  • PyPI package
  • PyTorch hub submission
  • Label smoothing loss from Szegedy et al.
  • Training on ImageNet
  • Pre-trained weights
  • Tensorboard support
  • General usability improvements
  • Multi-GPU support
  • Data augmentation
  • Signal propagation plots (from first paper)
Comments
  • ModuleNotFoundError: No module named 'haiku'

    When I run "python3 eval.py --pretrained pretrained/F0_haiku.npz --dataset ***" I get this error. Have you met this error before, and how can I fix it?

    opened by Rianusr 2
  • Trained without data augmentation?

    Thanks for the great work on the PyTorch implementation of NFNet! The accuracies achieved by this implementation are also pretty impressive, and I am wondering whether these training results came from the training script alone, that is, without data augmentation.

    opened by nandi-zhang 2
  • from_pretrained_haiku

    https://github.com/benjs/nfnets_pytorch/blob/7b4d1cc701c7de4ee273ded01ce21cbdb1e60c48/nfnets/pretrained.py#L90

    model = from_pretrained_haiku(args.pretrained)

    Where is the 'from_pretrained_haiku' method?

    opened by vkmavani 0
  • About WSConv2d

    Looking at the author's code, I see that his WSConv2d uses pad_mode 'same'. PyTorch's conv2d doesn't have a pad_mode argument, and I think your padding should be greater than 0, but I find your padding is always 0. I want to know why.

    Also, the learning rate in your train.py is constant. Why? Thank you!

    opened by fancyshun 3
  • AveragePool

    Hi, I noticed that the AveragePool ('pool' layer) is not used in the forward function; instead, forward uses torch.mean. Removing the layer doesn't change the pooling behavior. I tried using this model as a feature extractor and was a bit confused for a moment.

    opened by bogdankjastrzebski 1
Releases (v0.0.1)

Owner
Benjamin Schmidt, Engineering Student