Learning Features with Parameter-Free Layers (ICLR 2022)

Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo | Paper

NAVER AI Lab, NAVER CLOVA

Updates

  • 02.11.2022 Code has been uploaded
  • 02.06.2022 Initial update

Abstract

Trainable layers such as convolutional building blocks are the standard network design choices by learning parameters to capture the global context through successive spatial operations. When designing an efficient network, trainable layers such as the depthwise convolution are the source of efficiency in the number of parameters and FLOPs, but there was little improvement to the model speed in practice. This paper argues that simple built-in parameter-free operations can be a favorable alternative to the efficient trainable layers replacing spatial operations in a network architecture. We aim to break the stereotype of organizing the spatial operations of building blocks into trainable layers. Extensive experimental analyses based on layer-level studies with fully-trained models and neural architecture searches are provided to investigate whether parameter-free operations such as the max-pool are functional. The studies eventually give us a simple yet effective idea for redesigning network architectures, where the parameter-free operations are heavily used as the main building block without sacrificing the model accuracy as much. Experimental results on the ImageNet dataset demonstrate that the network architectures with parameter-free operations could enjoy the advantages of further efficiency in terms of model speed, the number of the parameters, and FLOPs.

Some Analyses in The Paper

1. Depthwise convolution is replaceable with a parameter-free operation.

2. Parameter-free operations are frequently selected by NAS in normal building blocks.

3. R50-hybrid (with the eff-bottlenecks) yields localizable features (see the Grad-CAM visualizations).

Our Proposed Models

1. Schematic illustration of our models

  • Here, we provide example models where the parameter-free operations (i.e., eff-layer) are mainly used;

  • Parameter-free operations such as the max-pool2d and avg-pool2d can replace the spatial operations (conv and SA).

2. Brief model descriptions

resnet_pf.py: resnet50_max(), resnet50_hybrid() - R50-max and R50-hybrid, models with the efficient bottlenecks

vit_pf.py: vit_s_max() - ViT with the efficient transformers

pit_pf.py: pit_s_max() - PiT with the efficient transformers
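
The parameter savings behind these models can be approximated with a little arithmetic. The sketch below counts the weights of a ResNet-style bottleneck (1x1 conv, spatial operation, 1x1 conv) when the spatial operation is a regular 3x3 convolution, a depthwise 3x3 convolution, or a max-pool. It assumes the efficient bottleneck simply swaps the spatial convolution for pooling, and it ignores BatchNorm and bias terms; the helper names and the channel sizes (256, 64, 256) are illustrative and are not taken from resnet_pf.py.

```python
def conv_params(c_in, c_out, kernel_size, groups=1):
    """Weight count of a bias-free 2D convolution."""
    return (c_in // groups) * c_out * kernel_size * kernel_size


def bottleneck_params(c_in, c_mid, c_out, spatial="conv3x3"):
    """Weights in a 1x1 -> spatial op -> 1x1 bottleneck (BatchNorm ignored)."""
    spatial_params = {
        "conv3x3": conv_params(c_mid, c_mid, 3),
        "dwconv3x3": conv_params(c_mid, c_mid, 3, groups=c_mid),
        "maxpool3x3": 0,  # pooling has no learnable weights
    }[spatial]
    reduce_params = conv_params(c_in, c_mid, 1)   # first 1x1 conv
    expand_params = conv_params(c_mid, c_out, 1)  # last 1x1 conv
    return reduce_params + spatial_params + expand_params


if __name__ == "__main__":
    # Assumed channel sizes, chosen only for illustration.
    for op in ("conv3x3", "dwconv3x3", "maxpool3x3"):
        print(op, bottleneck_params(256, 64, 256, spatial=op))
```

With these assumed channel sizes, the 3x3 convolution accounts for more than half of the block's weights, while the depthwise variant adds only 576 weights on top of the two 1x1 convolutions.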

Usage

Requirements

pytorch >= 1.6.0
torchvision >= 0.7.0
timm >= 0.3.4
apex == 0.1.0

Pretrained models

Network      Img size   Params. (M)   FLOPs (G)   GPU (ms)   Top-1 (%)   Top-5 (%)
R50          224x224    25.6          4.1         8.7        76.2        93.8
R50-max      224x224    14.2          2.2         6.8        74.3        92.0
R50-hybrid   224x224    17.3          2.6         7.3        77.1        93.1

Network      Img size   Throughputs   Vanilla   +CutMix   +DeiT
R50          224x224    962 / 112     76.2      77.6      78.8
ViT-S-max    224x224    763 / 96      74.2      77.3      79.8
PiT-S-max    224x224    1000 / 92     75.7      78.1      80.1
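
The ResNet results in the first table can be read as relative changes against the R50 baseline. The snippet below is only a convenience calculation on the numbers listed above; the names RESULTS, relative_change, and compare are hypothetical and not part of this repository.

```python
# (Params in M, FLOPs in G, GPU latency in ms, Top-1 in %), copied from the table.
RESULTS = {
    "R50": (25.6, 4.1, 8.7, 76.2),
    "R50-max": (14.2, 2.2, 6.8, 74.3),
    "R50-hybrid": (17.3, 2.6, 7.3, 77.1),
}


def relative_change(baseline, value):
    """Percentage change of `value` with respect to `baseline`."""
    return 100.0 * (value - baseline) / baseline


def compare(name, baseline="R50"):
    """Summarize one model against the baseline row."""
    base, cur = RESULTS[baseline], RESULTS[name]
    return {
        "params_pct": round(relative_change(base[0], cur[0]), 1),
        "flops_pct": round(relative_change(base[1], cur[1]), 1),
        "gpu_ms_pct": round(relative_change(base[2], cur[2]), 1),
        "top1_delta": round(cur[3] - base[3], 1),  # percentage points
    }


if __name__ == "__main__":
    for model in ("R50-max", "R50-hybrid"):
        print(model, compare(model))
```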

Model load & evaluation

Example code of loading resnet50_hybrid without timm:

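import torch
from resnet_pf import resnet50_hybrid

# Build the model and load the pretrained weights onto the CPU.
model = resnet50_hybrid()
model.load_state_dict(torch.load('./weight/checkpoint.pth', map_location='cpu'))
model.eval()

# Run a dummy forward pass to check the output.
with torch.no_grad():
    print(model(torch.randn(1, 3, 224, 224)))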

Example code of loading pit_s_max with timm:

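import torch
import timm
import pit_pf  # imported so that 'pit_s_max' is available to timm.create_model

# Build the model through timm and load the pretrained weights onto the CPU.
model = timm.create_model('pit_s_max', pretrained=False)
model.load_state_dict(torch.load('./weight/checkpoint.pth', map_location='cpu'))
model.eval()

# Run a dummy forward pass to check the output.
with torch.no_grad():
    print(model(torch.randn(1, 3, 224, 224)))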

Running each model file directly verifies a single forward and backward iteration of the model.

Training

Our ResNet-based models can be trained with any PyTorch training code; we recommend timm. We provide a sample command for training R50-hybrid with the standard 90-epoch training setup:

  python3 -m torch.distributed.launch --nproc_per_node=4 train.py ./ImageNet_dataset/ --model resnet50_hybrid --opt sgd --amp \
  --lr 0.2 --weight-decay 1e-4 --batch-size 256 --sched step --epochs 90 --decay-epochs 30 --warmup-epochs 3 --smoothing 0
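
For orientation, the learning-rate schedule implied by these flags can be sketched in a few lines of plain Python. This is a rough approximation of a step schedule with linear warmup, not timm's actual scheduler code. The command does not set a decay factor or a warmup starting value, so decay_rate=0.1 and warmup_lr=1e-4 are assumptions here; check the defaults of the training script you use.

```python
def step_lr(epoch, base_lr=0.2, decay_epochs=30, warmup_epochs=3,
            decay_rate=0.1, warmup_lr=1e-4):
    """Approximate learning rate at the start of a 0-indexed epoch.

    base_lr, decay_epochs, and warmup_epochs mirror the flags in the command
    above; decay_rate and warmup_lr are assumed values.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr.
        return warmup_lr + epoch * (base_lr - warmup_lr) / warmup_epochs
    # Multiply by decay_rate once every decay_epochs epochs.
    return base_lr * decay_rate ** (epoch // decay_epochs)


if __name__ == "__main__":
    for epoch in (0, 1, 3, 29, 30, 60, 89):
        print(epoch, step_lr(epoch))
```

Under these assumptions the rate ramps up over the first 3 epochs, stays at 0.2 until epoch 29, and is then divided by 10 at epochs 30 and 60.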

Vision transformer (ViT and PiT) models can also be trained with timm, but we recommend the DeiT training code. We provide a sample training command that uses the default training setup of that package:

  python3 -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model vit_s_max --batch-size 256 --data-path ./ImageNet_dataset/

License

Copyright 2022-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

How to cite

@inproceedings{han2022learning,
    title={Learning Features with Parameter-Free Layers},
    author={Dongyoon Han and YoungJoon Yoo and Beomyoung Kim and Byeongho Heo},
    year={2022},
    booktitle={International Conference on Learning Representations (ICLR)},
}