利用yolov5和TensorRT从0到1实现目标检测的模型训练到模型部署全过程

Overview

写在前面

利用TensorRT加速推理速度是以时间换取精度的做法,意味着在推理速度上升的同时将会有精度的下降,不过不用太担心,精度下降微乎其微。此外,要有NVIDIA显卡,经测试,CUDA10.2可以支持20系列显卡及以下,30系列显卡需要CUDA11.x的支持,并且目前有bug。

默认你已经完成了 yolov5的训练过程并得到了.pt模型权值文件。

本文目的仅是带着走通流程。

注意要对应yolov5和tensorrtx的版本。

  • ./yolov5包含yolov5训练以及模型初转化阶段的代码
  • ./model_process是将.wts模型转化为.engine模型的代码
  • ./detector是利用.engine模型进行前向推理阶段的代码

我的运行环境(注意OpenCV要选择适合你的visual studio的版本等问题):

win10

Visual Studio 2019

NVIDIA GeForce RTX 2060

opencv-3.4.3-vc14_vc15

cuda_10.2.89_441.22_win10

cudnn-10.2-windows10-x64-v7.6.5.32

TensorRT-7.0.0.11.Windows10.x86_64.cuda-10.2.cudnn7.6

cmake-3.21.2-windows-x86_64

上述环境的百度云(测试10、20系列可用):

链接:https://pan.baidu.com/s/1AyaloTzLap8X2hsJBvyeBw
提取码:dwr7

其他版本下载地址:

CUDA cudnn TensorRT CMake OpenCV

环境安装:

1、安装OpenCV并配置好环境变量

2、安装CUDA

一路默认。一般的安装路径为:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2

3、安装cudnn和TensorRT

cudnn和TensorRT的安装仅是将下载的对应版本的压缩包解压并复制*.h、*.lib、*.dll到CUDA的安装路径。

1 将cuDNN压缩包解压

2 将cuda\bin中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin

3 将cuda\include中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include

4 将cuda\lib中的文件复制到 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib

另外,

1 将TensorRT压缩包解压

2 将 TensorRT-7.0.0.11\include中头文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include

3 将TensorRT-7.0.0.11\lib中所有lib文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64

4 将TensorRT-7.0.0.11\lib中所有dll文件复制到C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin

4、安装CMake软件备用

一、将训练阶段得到的.pt模型转化为.wts中间模型

把tensorrtx里面的yolov5\gen_wts.py加入到yolov5里面,执行

python gen_wts.py -w [.pt权值文件路径] 

runs\train\exp\weights\best.pt为训练过程生成的.pt模型,生成的best.wts会保存到同目录下,此best.wts待会会用到。

cuda版本每个电脑不一样

配置好的tensorrtx,包括Cmakelist.txt的设定以及dirent.h的配置。

若使用原作者的请参照tensorrtx源码https://github.com/wang-xinyu/tensorrtx ,配置过程中会遇到一些问题,挨个解决,问题不大。

1、在yolov5目录下新建build文件夹

2、修改CMakelist.txt

add_definitions(-DAPI_EXPORTS)

3、打开CMake

​​ generate后关闭

4、yolov5/include/dirent.h

​​ 也可使用我的配置好的

二、利用Cmake软件创建VS工程

修改CMakeLists.txt中此处为你的opencv安装路径。

配置好上方两个目录之后,点击Configure,根据你的环境选择配置,

点击Gnerate,警告可忽视,

现在关闭Cmake即可。

三、wts转化为engine

VS打开刚刚在bulid目录下创建的工程。

build处vs打开,生成

问题:我的模型只识别一个类,需要更改


cd {tensorrtx}/yolov5/

// update CLASS_NUM in yololayer.h if your model is trained on custom dataset

为1

生成项目。

把之前生成的best.wts复制到build\release目录里面

cmd里面运行:

.\test.exe -s .\best.wts best.engine s

运行成功在同文件夹下面会得到best.engine转换后的文件。之后的推理过程使用的都是这个文件。

测试:

.\yolov5.exe -d best.engine .\samples

至此,流程走完。

如果想要进一步封装,可以按照我的示例。

注释掉yolov5.cpp,并取消 几个文件的注释。重新生成项目。按照你的需求更改。

Owner
Helium
Helium
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor

Dave Z. Chen 79 Nov 07, 2022
A simple baseline for 3d human pose estimation in PyTorch.

3d_pose_baseline_pytorch A PyTorch implementation of a simple baseline for 3d human pose estimation. You can check the original Tensorflow implementat

weigq 312 Jan 06, 2023
Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs

Perceiver IO Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs Usage import torch from src.perceiver.

Timur Ganiev 111 Nov 15, 2022
An implementation of Deep Forest 2021.2.1.

Deep Forest (DF) 21 DF21 is an implementation of Deep Forest 2021.2.1. It is designed to have the following advantages: Powerful: Better accuracy than

LAMDA Group, Nanjing University 795 Jan 03, 2023
Emblaze - Interactive Embedding Comparison

Emblaze - Interactive Embedding Comparison Emblaze is a Jupyter notebook widget for visually comparing embeddings using animated scatter plots. It bun

CMU Data Interaction Group 77 Nov 24, 2022
Official repository for the paper F, B, Alpha Matting

FBA Matting Official repository for the paper F, B, Alpha Matting. This paper and project is under heavy revision for peer reviewed publication, and s

Marco Forte 404 Jan 05, 2023
Code for "LoRA: Low-Rank Adaptation of Large Language Models"

LoRA: Low-Rank Adaptation of Large Language Models This repo contains the implementation of LoRA in GPT-2 and steps to replicate the results in our re

Microsoft 394 Jan 08, 2023
This project aim to create multi-label classification annotation tool to boost annotation speed and make it more easier.

This project aim to create multi-label classification annotation tool to boost annotation speed and make it more easier.

4 Aug 02, 2022
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures

SfM disambiguation with COLMAP About Structure-from-Motion generally fails when the scene exhibits symmetries and duplicated structures. In this repos

Computer Vision and Geometry Lab 193 Dec 26, 2022
Source code for the Paper: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints}

CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints Installation Run pipenv install (at your own risk with --skip-lo

Autonomous Learning Group 65 Dec 27, 2022
Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation

SUCP Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation () Direct Friends (i.e., users who follow each o

Kosar 8 Nov 26, 2022
Codebase for ECCV18 "The Sound of Pixels"

Sound-of-Pixels Codebase for ECCV18 "The Sound of Pixels". *This repository is under construction, but the core parts are already there. Environment T

Hang Zhao 318 Dec 20, 2022
A new test set for ImageNet

ImageNetV2 The ImageNetV2 dataset contains new test data for the ImageNet benchmark. This repository provides associated code for assembling and worki

186 Dec 18, 2022
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Realtime Multi-Person Pose Estimation By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh. Introduction Code repo for winning 2016 MSCOCO Keypoints Cha

Zhe Cao 4.9k Dec 31, 2022
SimpleDepthEstimation - An unified codebase for NN-based monocular depth estimation methods

SimpleDepthEstimation Introduction This is an unified codebase for NN-based monocular depth estimation methods, the framework is based on detectron2 (

8 Dec 13, 2022
"Learning Free Gait Transition for Quadruped Robots vis Phase-Guided Controller"

PhaseGuidedControl The current version is developed based on the old version of RaiSim series, and possibly requires further modification. It will be

X-Mechanics 12 Oct 21, 2022
A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

MADGRAD Optimization Algorithm For Tensorflow This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized,

20 Aug 18, 2022
Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

V-MPO Simple code to demonstrate Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) in Pyt

Nugroho Dewantoro 9 Jun 06, 2022
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022