Pytorch Performace Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

Overview

Plant Pathology 2020 FGVC7

Introduction

A deep learning model pipeline for training, experimentaiton and deployment for the Kaggle Competition, Plant Pathology 2020, utilising:

  • PyTorch: A Deep Learning Framework for high-performance AI research
  • Weights and Biases: tool for experiment tracking, dataset versioning, and model management
  • Apex: A Library to Accelerate Deep Learning Training using AMP, Fused Optimizer, and Multi-GPU
  • TensorRT: high-performance neural network inference optimizer and runtime engine for production deployment
  • Triton Inference Server: inference serving software that simplifies the deployment of AI models at scale
  • Streamlit: framework to quickly build highly interactive web applications for machine learning models

For a quick tutorial about all these modules, check out tutorials folder. Exploratory data analysis for the same can also be found in the notebooks folder.

Structure

├── app                 # Interactive Streamlit app scripts
├── data                # Datasets
├── examples            # assignment on pytorch amp and ddp
├── model               # Directory to save models for triton
├── notebooks           # EDA, Training, Model conversion, Inferencing and other utility notebooks
├── tutorials           # Tutorials on the modules used
└── requirements.txt    # Basic requirements

Usage

EDA: Data Evaluation

Data can be explored with various visualization techniques provided in eda.ipyb notebooks folder

Training the model

To run the pytorch resnet50 model use pytorch_train.ipynb.

The code is inspired by Pytorch Performance Tuning Guide

Once the model is trained, you can even run model explainabilty using the shap library. The tutorial notebook for the same can be found in the notebooks folder.

Model Conversion and Inferencing

Once you've trained the model, you will need to convert it to different formats in order to have a faster inference time as well as easily deploy them. You can convert the model to ONNX, TensorRT FP32 and TensorRT FP16 formats which are optimised to run faster inference. You will also need to convert the PyTorch model to TorchScript. Procedure for converting and benchmarking all the different formats of the model can be found in notebooks folder.

Model Deployment and Benchmarking

Now your models are ready to be deployed. For deployment, we utilise the Triton Inference Server. It provides an inferencing solution for deep learning models to be easily deployed and integrated with various functionalities. It supports HTTP and gRPC protocol that allows clients to request for inferencing, utilising any model of choice being managed by the server. The process of deployment can be found in Triton Inference Server.md.

Once your inferencing server is up and running, the next step it to understand as well as optimise the model performance. For this purpose, you can utilise tools like perf_analyzer which helps you measure changes in performance as you experiment with different parameters.

Interactive Web App

To run the Streamlit app:

cd app/
streamlit app.py

This will create a local server on which you can view the web application. This app contains the client side for the Triton Inference Server, along with an easy to use GUI.

Acknowledgement

This repository is built with references and code snippets from the NN Template by Luca Moschella.

Owner
Bharat Giddwani
B.Tech Graduate || Deep learning/ machine learning enthusiast. A passionate/avid learner.
Bharat Giddwani
Demonstration of transfer of knowledge and generalization with distillation

Distilling-the-Knowledge-in-a-Neural-Network This is an implementation of a part of the paper "Distilling the Knowledge in a Neural Network" (https://

26 Nov 25, 2022
DM-ACME compatible implementation of the Arm26 environment from Mujoco

ACME-compatible implementation of Arm26 from Mujoco This repository contains a customized implementation of Mujoco's Arm26 model, that can be used wit

1 Dec 24, 2021
Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization 0. Environment Environment: python 3.6 and cuda 10

Haitao Yang 62 Dec 30, 2022
ICCV2021 - A New Journey from SDRTV to HDRTV.

ICCV2021 - A New Journey from SDRTV to HDRTV.

XyChen 82 Dec 27, 2022
This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021.

PyTorch implementation of DAQ This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021. For more informatio

CV Lab @ Yonsei University 36 Nov 04, 2022
CoRe: Contrastive Recurrent State-Space Models

CoRe: Contrastive Recurrent State-Space Models This code implements the CoRe model and reproduces experimental results found in Robust Robotic Control

Apple 21 Aug 11, 2022
Repository for training material for the 2022 SDSC HPC/CI User Training Course

hpc-training-2022 Repository for training material for the 2022 SDSC HPC/CI Training Series HPC/CI Training Series home https://www.sdsc.edu/event_ite

sdsc-hpc-training-org 21 Jul 27, 2022
Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints + Swin Transformer/ReResNet”. Introduction Based

96 Dec 13, 2022
Implementation of Memory-Efficient Neural Networks with Multi-Level Generation, ICCV 2021

Memory-Efficient Multi-Level In-Situ Generation (MLG) By Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen and David Z. Pan

Jiaqi Gu 2 Jan 04, 2022
DeepFaceLab fork which provides IPython Notebook to use DFL with Google Colab

DFL-Colab — DeepFaceLab fork for Google Colab This project provides you IPython Notebook to use DeepFaceLab with Google Colaboratory. You can create y

779 Jan 05, 2023
A Pytorch Implementation of Source Data-free Domain Adaptation for a Faster R-CNN

A Pytorch Implementation of Source Data-free Domain Adaptation for a Faster R-CNN Please follow Faster R-CNN and DAF to complete the environment confi

2 Jan 12, 2022
Mini-hmc-jax - A simple implementation of Hamiltonian Monte Carlo in JAX

mini-hmc-jax This is a simple implementation of Hamiltonian Monte Carlo in JAX t

Martin Marek 6 Mar 03, 2022
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs SMORE is a a versatile framework that scales multi-hop query emb

Google Research 135 Dec 27, 2022
An index of recommendation algorithms that are based on Graph Neural Networks.

An index of recommendation algorithms that are based on Graph Neural Networks.

FIB LAB, Tsinghua University 564 Jan 07, 2023
Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

ZLH 406 Dec 23, 2022
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022
On the Adversarial Robustness of Visual Transformer

On the Adversarial Robustness of Visual Transformer Code for our paper "On the Adversarial Robustness of Visual Transformers"

Rulin Shao 35 Dec 14, 2022
Some useful blender add-ons for SMPL skeleton's poses and global translation.

Blender add-ons for SMPL skeleton's poses and trans There are two blender add-ons for SMPL skeleton's poses and trans.The first is for making an offli

犹在镜中 154 Jan 04, 2023
Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The original code is written in keras.

CasRel-pytorch-reimplement Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The o

longlongman 170 Dec 01, 2022