A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization

Last update: Dec 28, 2022

Related tags

Overview

sam.pytorch

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization ( Foret+2020) Paper, Official implementation .

Requirements

Python>=3.8
PyTorch>=1.7.1

To run the example, you further need

homura by pip install -U homura-core==2020.12.0
chika by pip install -U chika

Example

python cifar10.py [--optim.name {sam,sgd}] [--model {renst20, wrn28_2}] [--optim.rho 0.05]

Results: Test Accuracy (CIFAR-10)

Model	SAM	SGD
ResNet-20	93.5	93.2
WRN28-2	95.8	95.4
ResNeXT29	96.4	95.8

SAM needs double forward passes per each update, thus training with SAM is slower than training with SGD. In case of ResNet-20 training, 80 mins vs 50 mins on my environment. Additional options --use_amp --jit_model may slightly accelerates the training.

Usage

SAMSGD can be used as a drop-in replacement of PyTorch optimizers with closures. Also, it is compatible with lr_scheduler and has state_dict and load_state_dict.

from sam import SAMSGD

optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)

for input, target in dataset:
    def closure():
        optimizer.zero_grad()
        output = model(input)
        loss = loss_f(output, target)
        loss.backward()
        return loss


    loss = optimizer.step(closure)

Citation

@ARTICLE{2020arXiv201001412F,
    author = {{Foret}, Pierre and {Kleiner}, Ariel and {Mobahi}, Hossein and {Neyshabur}, Behnam},
    title = "{Sharpness-Aware Minimization for Efficiently Improving Generalization}",
    year = 2020,
    eid = {arXiv:2010.01412},
    eprint = {2010.01412},
}

@software{sampytorch
    author = {Ryuichiro Hataya},
    titile = {sam.pytorch},
    url    = {https://github.com/moskomule/sam.pytorch},
    year   = {2020}
}

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization

Related tags

Overview

sam.pytorch

Requirements

Example

Results: Test Accuracy (CIFAR-10)

Usage

Citation

Owner

Ryuichiro Hataya

An Active Automata Learning Library Written in Python

A library for optimization on Riemannian manifolds

Mememoji - A facial expression classification system that recognizes 6 basic emotions: happy, sad, surprise, fear, anger and neutral.

[AAAI22] Reliable Propagation-Correction Modulation for Video Object Segmentation

Indoor Panorama Planar 3D Reconstruction via Divide and Conquer

Data, model training, and evaluation code for "PubTables-1M: Towards a universal dataset and metrics for training and evaluating table extraction models".

Low Complexity Channel estimation with Neural Network Solutions

Source code for From Stars to Subgraphs

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

3D2Unet: 3D Deformable Unet for Low-Light Video Enhancement (PRCV2021)

🔥 Real-time Super Resolution enhancement (4x) with content loss and relativistic adversarial optimization 🔥

Hide screen when boss is approaching.

A PyTorch implementation of PointRend: Image Segmentation as Rendering

Open-AI's DALL-E for large scale training in mesh-tensorflow.

Highway networks implemented in PyTorch.

Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)

classification task on dataset-CIFAR10,by using Tensorflow/keras

3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

TDmatch is a Python library developed to perform matching tasks in three categories:

Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition