YOLOX-RMPOLY

Overview

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。

基于旷视科技YOLOX,实现对不规则四边形的目标检测

TODO 修改onnx推理模型

更改/添加标注:

1.yolox/models/yolox_polyhead.py:
    1.1继承yolox/models/yolo_head.py YOLOXHead类,修改代码使其输出变为四点。
        1.1.1修改构造函数
        1.1.2修改get_output_and_grid函数,使其grid变为4对xy坐标的形式
        1.1.3修改forward函数
        1.1.4修改get_loses
        1.1.5把自带的l1损失函数改成smoothl1,注意它自带的是算的xywh,要改成xyxyxyxy
             正样本匹配策略还是依靠dynamic-k,用的是不规则四边形的最小外接矩形的iou

2.yolox/models/losses.py:(弃用)
新增PolyIOULoss类,iou是四边形的最小外接矩形iou,并新增四个坐标点的smoothl1_loss(弃用)

3.yolox/utils/boxes.py:
    3.1增加order_corners函数,用于给不规则四边形的四个点排序
    3.2增加minimum_outer_rect函数,用于求解四边形的最小外接矩形
    3.3增加poly_adjust_box_anns函数

4.新增exps/yolox_s_rmpoly.py配置文件
    

5.新增yolox/exp/yolox_poly_base.py配置文件基类

6.新增yolox/data/datasets/rmpoly.py
    6.1新增RMPOLYDataset类
        6.1.1修改数据集读取方式,读取八点
        6.1.2修改pull_item
        6.1.3修改load_anno

7.yolox/data/data_augment.py
    7.1新增PolyTrainTransform类,对四点数据进行数据增强(未完待续)
    7.2poly_random_affine
    7.2poly_apply_affine_to_bboxes

8.yolox/data/datasets/mosaicdetection.py
    8.1新增PolyMosaicDetection(未完待续)
    8.2_polymirror

9.yolox/models/yolox.py
    9.1 YOLOX类:
    为了适应yolox/models/yolox_polyhead.py中YOLOXPolyHead类的get_losses函数返回字典,修改forward函数中训练时返回值。(弃用)


可以试着把求解回归损失的smoothl1_loss改成归一化后的坐标再求损失。

Introduction

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our report on Arxiv.

This repo is an implementation of PyTorch version YOLOX, there is also a MegEngine implementation.

Updates!!

  • 【2021/08/19】 We optimize the training process with 2x faster training and ~1% higher performance! See notes for more details.
  • 【2021/08/05】 We release MegEngine version YOLOX.
  • 【2021/07/28】 We fix the fatal error of memory leak
  • 【2021/07/26】 We now support MegEngine deployment.
  • 【2021/07/20】 We have released our technical report on Arxiv.

Comming soon

  • YOLOX-P6 and larger model.
  • Objects365 pretrain.
  • Transformer modules.
  • More features in need.

Benchmark

Standard Models.

Model size mAPval
0.5:0.95
mAPtest
0.5:0.95
Speed V100
(ms)
Params
(M)
FLOPs
(G)
weights
YOLOX-s 640 40.5 40.5 9.8 9.0 26.8 github
YOLOX-m 640 46.9 47.2 12.3 25.3 73.8 github
YOLOX-l 640 49.7 50.1 14.5 54.2 155.6 github
YOLOX-x 640 51.1 51.5 17.3 99.1 281.9 github
YOLOX-Darknet53 640 47.7 48.0 11.1 63.7 185.3 github
Legacy models
Model size mAPtest
0.5:0.95
Speed V100
(ms)
Params
(M)
FLOPs
(G)
weights
YOLOX-s 640 39.6 9.8 9.0 26.8 onedrive/github
YOLOX-m 640 46.4 12.3 25.3 73.8 onedrive/github
YOLOX-l 640 50.0 14.5 54.2 155.6 onedrive/github
YOLOX-x 640 51.2 17.3 99.1 281.9 onedrive/github
YOLOX-Darknet53 640 47.4 11.1 63.7 185.3 onedrive/github

Light Models.

Model size mAPval
0.5:0.95
Params
(M)
FLOPs
(G)
weights
YOLOX-Nano 416 25.8 0.91 1.08 github
YOLOX-Tiny 416 32.8 5.06 6.45 github
Legacy models
Model size mAPval
0.5:0.95
Params
(M)
FLOPs
(G)
weights
YOLOX-Nano 416 25.3 0.91 1.08 github
YOLOX-Tiny 416 32.8 5.06 6.45 github

Quick Start

Installation

Step1. Install YOLOX.

git clone [email protected]:Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -U pip && pip3 install -r requirements.txt
pip3 install -v -e .  # or  python3 setup.py develop

Step2. Install pycocotools.

pip3 install cython; pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
Demo

Step1. Download a pretrained model from the benchmark table.

Step2. Use either -n or -f to specify your detector's config. For example:

python tools/demo.py image -n yolox-s -c /path/to/your/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]

or

python tools/demo.py image -f exps/default/yolox_s.py -c /path/to/your/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]

Demo for video:

python tools/demo.py video -n yolox-s -c /path/to/your/yolox_s.pth --path /path/to/your/video --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]
Reproduce our results on COCO

Step1. Prepare COCO dataset

cd <YOLOX_HOME>
ln -s /path/to/your/COCO ./datasets/COCO

Step2. Reproduce our results on COCO by specifying -n:

python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o [--cache]
                         yolox-m
                         yolox-l
                         yolox-x
  • -d: number of gpu devices
  • -b: total batch size, the recommended number for -b is num-gpu * 8
  • --fp16: mixed precision training
  • --cache: caching imgs into RAM to accelarate training, which need large system RAM.

When using -f, the above commands are equivalent to:

python tools/train.py -f exps/default/yolox_s.py -d 8 -b 64 --fp16 -o [--cache]
                         exps/default/yolox_m.py
                         exps/default/yolox_l.py
                         exps/default/yolox_x.py

Multi Machine Training

We also support multi-nodes training. Just add the following args:

  • --num_machines: num of your total training nodes
  • --machine_rank: specify the rank of each node

Suppose you want to train YOLOX on 2 machines, and your master machines's IP is 123.123.123.123, use port 12312 and TCP.
On master machine, run

python tools/train.py -n yolox-s -b 128 --dist-url tcp://123.123.123.123:12312 --num-machines 2 --machine-rank 0

On the second machine, run

python tools/train.py -n yolox-s -b 128 --dist-url tcp://123.123.123.123:12312 --num-machines 2 --machine-rank 1
Evaluation

We support batch testing for fast evaluation:

python tools/eval.py -n  yolox-s -c yolox_s.pth -b 64 -d 8 --conf 0.001 [--fp16] [--fuse]
                         yolox-m
                         yolox-l
                         yolox-x
  • --fuse: fuse conv and bn
  • -d: number of GPUs used for evaluation. DEFAULT: All GPUs available will be used.
  • -b: total batch size across on all GPUs

To reproduce speed test, we use the following command:

python tools/eval.py -n  yolox-s -c yolox_s.pth -b 1 -d 1 --conf 0.001 --fp16 --fuse
                         yolox-m
                         yolox-l
                         yolox-x
Tutorials

Deployment

  1. MegEngine in C++ and Python
  2. ONNX export and an ONNXRuntime
  3. TensorRT in C++ and Python
  4. ncnn in C++ and Java
  5. OpenVINO in C++ and Python

Third-party resources

Cite YOLOX

If you use YOLOX in your research, please cite our work by using the following BibTeX entry:

 @article{yolox2021,
  title={YOLOX: Exceeding YOLO Series in 2021},
  author={Ge, Zheng and Liu, Songtao and Wang, Feng and Li, Zeming and Sun, Jian},
  journal={arXiv preprint arXiv:2107.08430},
  year={2021}
}
Easy to use Audio Tagging in PyTorch

Audio Classification, Tagging & Sound Event Detection in PyTorch Progress: Fine-tune on audio classification Fine-tune on audio tagging Fine-tune on s

sithu3 15 Dec 22, 2022
Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

The official code for the NeurIPS 2021 paper Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

13 Dec 22, 2022
[NeurIPS-2020] Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID.

Self-paced Contrastive Learning (SpCL) The official repository for Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID

Yixiao Ge 286 Dec 21, 2022
Our solution for SSN Invente 2021's Hackathon

Our solution for SSN Invente 2021's Hackathon. To help maitain godowns in a pristine and safe condition using raspberry pi.

1 Jan 12, 2022
TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain Gait Recognition.

TraND This is the code for the paper "Jinkai Zheng, Xinchen Liu, Chenggang Yan, Jiyong Zhang, Wu Liu, Xiaoping Zhang and Tao Mei: TraND: Transferable

Jinkai Zheng 32 Apr 04, 2022
Implementation of PersonaGPT Dialog Model

PersonaGPT An open-domain conversational agent with many personalities PersonaGPT is an open-domain conversational agent cpable of decoding personaliz

ILLIDAN Lab 42 Jan 01, 2023
Machine learning and Deep learning models, deploy on telegram (the best social media)

Semi Intelligent BOT The project involves : Classifying fake news Classifying objects such as aeroplane, automobile, bird, cat, deer, dog, frog, horse

MohammadReza Norouzi 5 Mar 06, 2022
The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.

FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu

9 Nov 27, 2022
The Generic Manipulation Driver Package - Implements a ROS Interface over the robotics toolbox for Python

Armer Driver Armer aims to provide an interface layer between the hardware drivers of a robotic arm giving the user control in several ways: Joint vel

QUT Centre for Robotics (QCR) 13 Nov 26, 2022
[NeurIPS 2021] Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods Large Scale Learning on Non-Homophilous Graphs: New Benchmark

60 Jan 03, 2023
Face Recognition plus identification simply and fast | Python

PyFaceDetection Face Recognition plus identification simply and fast Ubuntu Setup sudo pip3 install numpy sudo pip3 install cmake sudo pip3 install dl

Peyman Majidi Moein 16 Sep 22, 2022
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

involution Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVP

Duo Li 1.3k Dec 28, 2022
Probabilistic Tensor Decomposition of Neural Population Spiking Activity

Probabilistic Tensor Decomposition of Neural Population Spiking Activity Matlab (recommended) and Python (in developement) implementations of Soulat e

Hugo Soulat 6 Nov 30, 2022
Contains a bunch of different python programm tasks

py_tasks Contains a bunch of different python programm tasks Armstrong.py - calculate Armsrong numbers in range from 0 to n with / without cache and c

Dmitry Chmerenko 1 Dec 17, 2021
GT China coal model

GT China coal model The full version of a China coal transport model with a very high spatial reslution. What it does The code works in a few steps: T

0 Dec 13, 2021
Image-to-image translation with conditional adversarial nets

pix2pix Project | Arxiv | PyTorch Torch implementation for learning a mapping from input images to output images, for example: Image-to-Image Translat

Phillip Isola 9.3k Jan 08, 2023
TCPNet - Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition

Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition This is an implementation of TCPNet. Introduction For video recognition task, a g

Zilin Gao 21 Dec 08, 2022
A Broader Picture of Random-walk Based Graph Embedding

Random-walk Embedding Framework This repository is a reference implementation of the random-walk embedding framework as described in the paper: A Broa

Zexi Huang 23 Dec 13, 2022
Official repository for Hierarchical Opacity Propagation for Image Matting

HOP-Matting Official repository for Hierarchical Opacity Propagation for Image Matting 🚧 🚧 🚧 Under Construction 🚧 🚧 🚧 🚧 🚧 🚧   Coming Soon   

Li Yaoyi 54 Dec 30, 2021
Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

ChongjianGE 89 Dec 02, 2022