Facilitates implementing deep neural-network backbones, data augmentations

Last update: Dec 29, 2022

Overview

Introduction

Today, the training of deep learning models is fragmented rather than unified. When AI engineers face a specific task, the common approach is to find a repository and reimplement it. This makes it hard to move quickly on a large project that requires a continuous trial-and-error process to find the best model. general_backbone was launched to facilitate the implementation of deep neural-network backbones, data augmentations, optimizers, and learning-rate schedulers, all in one package, so that you can get quick wins in the training process. The areas supported in the current version are:

backbones
loss functions
augmentation styles
optimizers
schedulers
data types
visualizations

Installation

Refer to docs/installation.md for installation of the general_backbone package.

Model backbone

Currently, general_backbone supports more than 70 types of ResNet models, such as resnet18, resnet34, resnet50, resnet101, resnet152, and resnext50.

All supported models can be listed with the general_backbone.list_models() function:

import general_backbone
general_backbone.list_models()

Results

{'resnet': ['resnet18', 'resnet18d', 'resnet34', 'resnet34d', 'resnet26', 'resnet26d', 'resnet26t', 'resnet50', 'resnet50d', 'resnet50t', 'resnet101', 'resnet101d', 'resnet152', 'resnet152d', 'resnet200', 'resnet200d', 'tv_resnet34', 'tv_resnet50', 'tv_resnet101', 'tv_resnet152', 'wide_resnet50_2', 'wide_resnet101_2', 'resnext50_32x4d', 'resnext50d_32x4d', 'resnext101_32x4d', 'resnext101_32x8d', 'resnext101_64x4d', 'tv_resnext50_32x4d', 'ig_resnext101_32x8d', 'ig_resnext101_32x16d', 'ig_resnext101_32x32d', 'ig_resnext101_32x48d', 'ssl_resnet18', 'ssl_resnet50', 'ssl_resnext50_32x4d', 'ssl_resnext101_32x4d', 'ssl_resnext101_32x8d', 'ssl_resnext101_32x16d', 'swsl_resnet18', 'swsl_resnet50', 'swsl_resnext50_32x4d', 'swsl_resnext101_32x4d', 'swsl_resnext101_32x8d', 'swsl_resnext101_32x16d', 'seresnet18', 'seresnet34', 'seresnet50', 'seresnet50t', 'seresnet101', 'seresnet152', 'seresnet152d', 'seresnet200d', 'seresnet269d', 'seresnext26d_32x4d', 'seresnext26t_32x4d', 'seresnext50_32x4d', 'seresnext101_32x4d', 'seresnext101_32x8d', 'senet154', 'ecaresnet26t', 'ecaresnetlight', 'ecaresnet50d', 'ecaresnet50d_pruned', 'ecaresnet50t', 'ecaresnet101d', 'ecaresnet101d_pruned', 'ecaresnet200d', 'ecaresnet269d', 'ecaresnext26t_32x4d', 'ecaresnext50t_32x4d', 'resnetblur18', 'resnetblur50', 'resnetrs50', 'resnetrs101', 'resnetrs152', 'resnetrs200', 'resnetrs270', 'resnetrs350', 'resnetrs420']}

To select your backbone type, set model=resnet50 in train_config of your config file. An example config file is general_backbone/configs/image_clf_config.py.
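A minimal sketch of that selection, assuming `train_config` is a plain `dict` in the same style as the other blocks of the config file (the real file may hold more keys). The `supported` dictionary is a hand-copied subset of the `list_models()` output above, so the snippet runs without the package installed, and the helper `pick_backbone` is hypothetical, not part of general_backbone.

```python
# Hand-copied subset of the list_models() output shown above.
supported = {
    'resnet': ['resnet18', 'resnet34', 'resnet50', 'resnet101',
               'resnet152', 'resnext50_32x4d', 'seresnet50'],
}


def pick_backbone(name, families):
    """Return `name` if it appears in any family list, else raise ValueError."""
    for models in families.values():
        if name in models:
            return name
    raise ValueError(f'unknown backbone: {name}')


# Assumed layout: train_config is a plain dict inside the config file.
train_config = dict(
    model=pick_backbone('resnet50', supported),
)
print(train_config)
```

Checking the name before training starts catches typos such as `resnet51` early, instead of failing when the model is built.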

Dataset

A toy dataset is provided at toydata for test training. It is organized as follows:

toydata/
└── image_classification
    ├── test
    │   ├── cat
    │   └── dog
    └── train
        ├── cat
        └── dog

The cat and dog folders contain the images. To add a new class, create a new folder named after the label inside both the train and test folders.
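The folder-per-class convention can be illustrated with the standard library alone. The sketch below builds the layout in a temporary directory, adds a hypothetical `bird` class, and derives a label mapping from the folder names. Both helpers are illustrative, and the alphabetical index order is an assumption; the package may assign indices differently (the dataset class accepts a `class_2_idx` argument for that purpose).

```python
import tempfile
from pathlib import Path


def make_toy_layout(root, classes, splits=('train', 'test')):
    """Create <root>/image_classification/<split>/<class> folders."""
    for split in splits:
        for cls in classes:
            folder = Path(root) / 'image_classification' / split / cls
            folder.mkdir(parents=True, exist_ok=True)


def class_to_index(split_dir):
    """Map each class folder name to an index (alphabetical order, assumed)."""
    names = sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(names)}


with tempfile.TemporaryDirectory() as tmp:
    make_toy_layout(tmp, ['cat', 'dog'])
    # Adding a class means adding one folder to every split.
    make_toy_layout(tmp, ['bird'])
    mapping = class_to_index(Path(tmp) / 'image_classification' / 'train')

print(mapping)
```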

Data Augmentation

The general_backbone package supports many augmentation styles for training. Augmentation is an efficient and important way to improve model accuracy. Some common augmentations are listed below:

Augmentation Style	Parameters	Description
Pixel-level transforms
Blur	`{'blur_limit':7, 'always_apply':False, 'p':0.5}`	Blur the input image using a random-sized kernel
GaussNoise	`{'var_limit':(10.0, 50.0), 'mean':0, 'per_channel':True, 'always_apply':False, 'p':0.5}`	Apply gaussian noise to the input image
GaussianBlur	`{'blur_limit':(3, 7), 'sigma_limit':0, 'always_apply':False, 'p':0.5}`	Blur the input image using a Gaussian filter with a random kernel size
GlassBlur	`{'sigma': 0.7, 'max_delta':4, 'iterations':2, 'always_apply':False, 'mode':'fast', 'p':0.5}`	Apply glass noise to the input image
HueSaturationValue	`{'hue_shift_limit':20, 'sat_shift_limit':30, 'val_shift_limit':20, 'always_apply':False, 'p':0.5}`	Randomly change hue, saturation and value of the input image
MedianBlur	`{'blur_limit':7, 'always_apply':False, 'p':0.5}`	Blur the input image using a median filter with a random aperture linear size
RGBShift	`{'r_shift_limit': 15, 'g_shift_limit': 15, 'b_shift_limit': 15, 'p': 0.5}`	Randomly shift values for each channel of the input RGB image.
Normalize	`{'mean':(0.485, 0.456, 0.406), 'std':(0.229, 0.224, 0.225)}`	Normalization is applied by the formula: `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`
Spatial-level transforms
RandomCrop	`{'height':128, 'width':128}`	Crop a random part of the input
VerticalFlip	`{'p': 0.5}`	Flip the input vertically around the x-axis
ShiftScaleRotate	`{'shift_limit':0.05, 'scale_limit':0.05, 'rotate_limit':15, 'p':0.5}`	Randomly apply affine transforms: translate, scale and rotate the input
RandomBrightnessContrast	`{'brightness_limit':0.2, 'contrast_limit':0.2, 'brightness_by_max':True, 'always_apply':False,'p': 0.5}`	Randomly change brightness and contrast of the input image
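The Normalize formula from the table can be checked by hand with NumPy. This is a sketch of the arithmetic only, not the library's implementation; it assumes 8-bit input, so `max_pixel_value` is 255, and the function name `normalize` is illustrative.

```python
import numpy as np


def normalize(img, mean, std, max_pixel_value=255.0):
    """Apply img = (img - mean * max_pixel_value) / (std * max_pixel_value)."""
    mean = np.asarray(mean, dtype=np.float32) * max_pixel_value
    std = np.asarray(std, dtype=np.float32) * max_pixel_value
    # Broadcasting applies the per-channel mean and std along the last axis.
    return (img.astype(np.float32) - mean) / std


mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)

# A 2x2 all-white RGB image: every channel value is 255.
white = np.full((2, 2, 3), 255, dtype=np.uint8)
out = normalize(white, mean, std)
print(out[0, 0])
```

For a white pixel the formula reduces to `(1 - mean) / std` per channel, which makes the result easy to verify.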

Augmentation is configured in the configuration file general_backbone/configs/image_clf_config.py:

data_conf = dict(
    dict_transform = dict(
        SmallestMaxSize={'max_size': 160},
        ShiftScaleRotate={'shift_limit':0.05, 'scale_limit':0.05, 'rotate_limit':15, 'p':0.5},
        RandomCrop={'height':128, 'width':128},
        RGBShift={'r_shift_limit': 15, 'g_shift_limit': 15, 'b_shift_limit': 15, 'p': 0.5},
        RandomBrightnessContrast={'p': 0.5},
        Normalize={'mean':(0.485, 0.456, 0.406), 'std':(0.229, 0.224, 0.225)},
        ToTensorV2={'always_apply':True}
    )
)
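Since the steps are applied in the order they are listed, the position of a new entry matters. The sketch below inserts `VerticalFlip` (from the table above) before `Normalize` in a shortened copy of the config. The helper `insert_before` is hypothetical; it relies only on Python dictionaries preserving insertion order (Python 3.7+).

```python
# Shortened copy of the dict_transform block shown above.
base_transform = dict(
    SmallestMaxSize={'max_size': 160},
    RandomCrop={'height': 128, 'width': 128},
    Normalize={'mean': (0.485, 0.456, 0.406), 'std': (0.229, 0.224, 0.225)},
    ToTensorV2={'always_apply': True},
)


def insert_before(transforms, anchor, name, params):
    """Return a new dict with `name` placed immediately before `anchor`."""
    out = {}
    for key, value in transforms.items():
        if key == anchor:
            out[name] = params
        out[key] = value
    if name not in out:
        raise KeyError(anchor)
    return out


data_conf = dict(
    dict_transform=insert_before(
        base_transform, 'Normalize', 'VerticalFlip', {'p': 0.5}),
)
print(list(data_conf['dict_transform']))
```

Keeping geometric and colour steps ahead of `Normalize` and `ToTensorV2` means they still operate on ordinary image arrays.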

You can add a new transformation step to data_conf['dict_transform']; the steps are applied in order from top to bottom. You can also debug your transformations by setting debug=True:

from general_backbone.data import AugmentationDataset

augdataset = AugmentationDataset(
    data_dir='toydata/image_classification',
    name_split='train',
    config_file='general_backbone/configs/image_clf_config.py',
    dict_transform=None,
    input_size=(256, 256),
    debug=True,  # save the augmented images for inspection
    dir_debug='tmp/alb_img_debug',
    class_2_idx=None,
)

# Fetch the first 50 samples; each access runs the augmentation pipeline.
for i in range(50):
    img, label = augdataset[i]

By default, the augmented images are saved in tmp/alb_img_debug so you can review them before training your models. The code that tests image augmentation is available in debug/transform_debug.py:

conda activate gen_backbone
python debug/transform_debug.py

Train model

To train a model, run tools/train.py. There is a variety of configuration options for training, such as --model, --batch_size, --opt, --loss, and --sched. A standard configuration file can be passed through --config; general_backbone/configs/image_clf_config.py is the one for the image classification task. You can change values inside this file or add new parameters as needed, but do not change the name or structure of the file.

python3 tools/train.py --config general_backbone/configs/image_clf_config.py

Results:

Model resnet50 created, param count:25557032
Train: 0 [   0/33 (  0%)]  Loss: 8.863 (8.86)  Time: 1.663s,    9.62/s  (1.663s,    9.62/s)  LR: 5.000e-04  Data: 0.460 (0.460)
Train: 0 [  32/33 (100%)]  Loss: 1.336 (4.00)  Time: 0.934s,    8.57/s  (0.218s,   36.68/s)  LR: 5.000e-04  Data: 0.000 (0.014)
Test: [   0/29]  Time: 0.560 (0.560)  Loss:  0.6912 (0.6912)  Acc@1: 87.5000 (87.5000)  Acc@5: 100.0000 (100.0000)
Test: [  29/29]  Time: 0.041 (0.064)  Loss:  0.5951 (0.5882)  Acc@1: 81.2500 (87.5000)  Acc@5: 100.0000 (99.3750)
Train: 1 [   0/33 (  0%)]  Loss: 0.5741 (0.574)  Time: 0.645s,   24.82/s  (0.645s,   24.82/s)  LR: 5.000e-04  Data: 0.477 (0.477)
Train: 1 [  32/33 (100%)]  Loss: 0.5411 (0.313)  Time: 0.089s,   90.32/s  (0.166s,   48.17/s)  LR: 5.000e-04  Data: 0.000 (0.016)
Test: [   0/29]  Time: 0.537 (0.537)  Loss:  0.3071 (0.3071)  Acc@1: 87.5000 (87.5000)  Acc@5: 100.0000 (100.0000)
Test: [  29/29]  Time: 0.043 (0.066)  Loss:  0.1036 (0.1876)  Acc@1: 100.0000 (93.9583)  Acc@5: 100.0000 (100.0000)
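If you want to track the loss outside TensorBoard, the Train lines of this log can be parsed with a regular expression. The sketch below is an illustration based only on the sample output above; the pattern and the helper `parse_train_lines` are not part of the package, and other versions may print a different format.

```python
import re

# Matches e.g. "Train: 0 [  32/33 (100%)]  Loss: 1.336 (4.00)  Time: ..."
TRAIN_LINE = re.compile(
    r'^Train:\s+(?P<epoch>\d+)\s+\[\s*(?P<batch>\d+)/(?P<last>\d+).*?'
    r'Loss:\s+(?P<loss>[\d.]+)\s+\((?P<avg>[\d.]+)\)'
)


def parse_train_lines(log_text):
    """Return one dict per Train line: epoch, batch, last, loss, avg_loss."""
    rows = []
    for line in log_text.splitlines():
        match = TRAIN_LINE.match(line.strip())
        if match:
            rows.append({
                'epoch': int(match['epoch']),
                'batch': int(match['batch']),
                'last': int(match['last']),
                'loss': float(match['loss']),
                'avg_loss': float(match['avg']),
            })
    return rows


sample = """\
Model resnet50 created, param count:25557032
Train: 0 [   0/33 (  0%)]  Loss: 8.863 (8.86)  Time: 1.663s,    9.62/s  (1.663s,    9.62/s)  LR: 5.000e-04  Data: 0.460 (0.460)
Train: 0 [  32/33 (100%)]  Loss: 1.336 (4.00)  Time: 0.934s,    8.57/s  (0.218s,   36.68/s)  LR: 5.000e-04  Data: 0.000 (0.014)
Train: 1 [   0/33 (  0%)]  Loss: 0.5741 (0.574)  Time: 0.645s,   24.82/s  (0.645s,   24.82/s)  LR: 5.000e-04  Data: 0.477 (0.477)
Train: 1 [  32/33 (100%)]  Loss: 0.5411 (0.313)  Time: 0.089s,   90.32/s  (0.166s,   48.17/s)  LR: 5.000e-04  Data: 0.000 (0.016)
"""

rows = parse_train_lines(sample)
# Keep the running-average loss reported at the end of each epoch.
epoch_avg = {r['epoch']: r['avg_loss'] for r in rows if r['batch'] + 1 == r['last']}
print(epoch_avg)
```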

A table of config parameters is given in the training documentation.

Your model checkpoint and log are saved in the --output directory. A TensorBoard visualization is created to make it easier to manage and monitor the training process. By default, the TensorBoard folder is runs, inside --output. The loss, accuracy, learning rate, and batch time for both train and test are logged:

tensorboard --logdir checkpoint/resnet50/20211023-092651-resnet50-224/runs/
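When several runs accumulate, it helps to locate the newest `runs` folder automatically. The sketch below assumes the layout suggested by the path above, `<output>/<model>/<timestamp>-<model>-<size>/runs`, which is inferred from that single example and may not hold for every configuration; the helper `latest_tensorboard_dir` is hypothetical.

```python
import tempfile
from pathlib import Path


def latest_tensorboard_dir(output_root, model):
    """Return the runs/ folder of the most recent run for `model`.

    Assumes run folders start with a YYYYMMDD-HHMMSS timestamp, so
    lexicographic order matches chronological order.
    """
    run_dirs = sorted(p for p in (Path(output_root) / model).iterdir() if p.is_dir())
    if not run_dirs:
        raise FileNotFoundError(f'no runs found for {model}')
    return run_dirs[-1] / 'runs'


with tempfile.TemporaryDirectory() as tmp:
    # Simulate two training runs of the same model.
    for stamp in ('20211023-092651', '20211024-101500'):
        (Path(tmp) / 'resnet50' / f'{stamp}-resnet50-224' / 'runs').mkdir(parents=True)
    logdir = latest_tensorboard_dir(tmp, 'resnet50')
    relative = logdir.relative_to(tmp).as_posix()

print(relative)
```

The returned path can then be passed to `tensorboard --logdir`.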

Inference

To run inference with a model, pass the relevant values to --img, --config, and --initial-checkpoint.

python tools/inference.py --img demo/cat0.jpg --config general_backbone/configs/image_clf_config.py --initial-checkpoint checkpoint.pth.tar

TODO

code setup.py
conda virtual environment setup
Introduce group of CNN models support
Visualization training results
[] Table ranking model performances.
Support new type of Datasets: You can change the augmentation styles:
- references: https://albumentations.ai/docs/examples/pytorch_classification/
[] New loss function:
- Focal Loss function; KL divergence.
- references:
  - https://github.com/pytorch/pytorch/blob/3097755e7a88333c945a14ee44fda055ba276138/torch/nn/modules/loss.py
  - https://pytorch.org/docs/stable/nn.html#loss-functions

Packages reference:

There are many open-source packages we referred to when building general_backbone:

timm: PyTorch Image Models (timm) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aim to pull together a wide variety of SOTA models with ability to reproduce ImageNet training results.
albumentations: a Python library for image augmentation.
mmcv: MMCV is a foundational library for computer vision research and supports many research projects.

Citation

If you find this project useful in your research, kindly consider citing:

@article{genearal_backbone,
    title={GeneralBackbone:  A handy package for implementing Deep Learning Backbone},
    author={khanhphamdinh},
    email= {[email protected]},
    year={2021}
}

Releases(v0.2.1)

v0.2.1(Oct 26, 2021)
Features:

Augmentation according to Albumentation.

Visualization Tensorboard loss, accuracy, learning rate, and batch time execution.

v.0.2.0(Oct 26, 2021)
New features:

Support backbone in resnet group for the image classification task.

