torchlm aims to build a high-level pipeline for face landmarks detection. It supports training, evaluating, exporting, inference (Python/C++) and 100+ data augmentations.

Last update: Dec 25, 2022

Overview

English | Chinese Docs | Zhihu Column | Download Statistics

🤗 Introduction

torchlm aims to build a high-level pipeline for face landmarks detection. It supports training, evaluating, exporting, inference (Python/C++) and 100+ data augmentations, and can be installed easily with pip.

❤️ Star 🌟 👆🏻 this repo to support me if it helps you, thanks ~

👋 Core Features

High level pipeline for training and inference.
Provides 30+ native landmarks data augmentations.
Can bind 80+ transforms from torchvision and albumentations with one line of code.
Supports PIPNet, YOLOX, ResNet, MobileNet and ShuffleNet for face landmarks detection.

🆕 What's New

[2022/03/08]: Add PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021
[2022/02/13]: Add 30+ transforms and bind 80+ transforms from torchvision and albumentations.

✅ Supported Models Matrix

✅ = known to work and officially supported, ❔ = planned, but not coming soon.

Face Detection

FaceBoxesV2	YOLO5Face	SCRFD	RetinaFace	...
✅	❔	❔	❔	❔

Face Landmarks Detection

PIPNet	YOLOX	YOLOv5	NanoDet	ResNet	MobileNet	ShuffleNet	VIT	...
✅	❔	❔	❔	❔	❔	❔	❔	❔

🔥 🔥 Performance(@NME)

Model	Backbone	Head	300W	COFW	AFLW	WFLW	Download
PIPNet	MobileNetV2	Heatmap+Regression+NRM	3.40	3.43	1.52	4.79	link
PIPNet	ResNet18	Heatmap+Regression+NRM	3.36	3.31	1.48	4.47	link
PIPNet	ResNet50	Heatmap+Regression+NRM	3.34	3.18	1.44	4.48	link
PIPNet	ResNet101	Heatmap+Regression+NRM	3.19	3.08	1.42	4.31	link

🛠️ Installation

You can install torchlm directly from PyPI.

pip install "torchlm>=0.1.6.10" # or install the latest PyPI version with `pip install torchlm`
pip install "torchlm>=0.1.6.10" -i https://pypi.org/simple/ # or install from a specific PyPI mirror with '-i'

Or install from source in editable mode with -e if you want the latest torchlm.

git clone --depth=1 https://github.com/DefTruth/torchlm.git 
cd torchlm && pip install -e .

🌟 🌟 Data Augmentation

torchlm provides 30+ native data augmentations for landmarks and can bind with 80+ transforms from torchvision and albumentations. The layout format of landmarks is xy with shape (N, 2).

Use the 30+ native transforms from torchlm directly

import torchlm
transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
])

Also, a user-friendly API build_default_transform is available to build a default transform pipeline.

transform = torchlm.build_default_transform(
    input_size=(input_size, input_size),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    force_norm_before_mean_std=True,  # img/=255. first
    rotate=30,
    keep_aspect=False,
    to_tensor=True  # array -> Tensor & HWC -> CHW
)

See transforms.md for the supported transform sets; more examples can be found at test/transforms.py.

💡 More details about transforms in torchlm

torchlm provides 30+ native data augmentations for landmarks and can bind 80+ transforms from torchvision and albumentations through the torchlm.bind method. The layout format of landmarks is xy with shape (N, 2), where N denotes the number of input landmarks. Further, torchlm.bind provides a prob param at bind-level to force any transform or callable to be a random-style augmentation. The data augmentations in torchlm are designed to be safe and simple: any transform that pushes landmarks outside the image at runtime is automatically dropped, so the number of landmarks stays unchanged. It is also fine to pass a Tensor to an np.ndarray-like transform: torchlm converts between data types automatically and wraps the result back to the original type through an autodtype wrapper.
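The two behaviours described above (a bind-level probability and discarding results that move landmarks out of the image) can be illustrated with a small NumPy-only sketch. This is not torchlm's implementation: `with_prob_and_safety` and `shift_right` are hypothetical names introduced here, and the in-bounds check is only one plausible way to decide that a result is unsafe.

```python
import random
from typing import Callable, Optional, Tuple

import numpy as np

Transform = Callable[[np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]


def with_prob_and_safety(fn: Transform, prob: float = 0.5,
                         rng: Optional[random.Random] = None) -> Transform:
    """Hypothetical helper: run `fn` with probability `prob` and discard its
    result whenever a landmark would end up outside the image."""
    rng = rng or random.Random()

    def wrapped(img: np.ndarray, landmarks: np.ndarray):
        if rng.random() >= prob:
            return img, landmarks  # skipped by the random draw
        new_img, new_lms = fn(img.copy(), landmarks.copy())
        h, w = new_img.shape[:2]
        inside = (
            (new_lms[:, 0] >= 0) & (new_lms[:, 0] < w)
            & (new_lms[:, 1] >= 0) & (new_lms[:, 1] < h)
        )
        if new_lms.shape != landmarks.shape or not inside.all():
            return img, landmarks  # unsafe result: keep the original sample
        return new_img, new_lms

    return wrapped


def shift_right(img: np.ndarray, landmarks: np.ndarray, dx: float = 10.0):
    """Toy transform: move landmarks `dx` pixels to the right (image untouched)."""
    shifted = landmarks.copy()
    shifted[:, 0] += dx
    return img, shifted


img = np.zeros((32, 32, 3), dtype=np.uint8)
lms = np.array([[5.0, 5.0], [25.0, 20.0]], dtype=np.float32)  # shape (N, 2), xy

always = with_prob_and_safety(shift_right, prob=1.0)
_, rejected = always(img, lms)  # second point would move to x=35, outside width 32

safe = with_prob_and_safety(lambda i, l: shift_right(i, l, dx=2.0), prob=1.0)
_, accepted = safe(img, lms)    # both points stay inside, so the shift is kept
```

Returning the untouched input on failure keeps the sample usable for training, which matches the "Execution Flag: False" behaviour shown in the debug logs further down.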

Bind 80+ torchvision and albumentations transforms

NOTE: Please install albumentations first if you want to bind albumentations transforms. If you hit a conflict between different installed versions of opencv (opencv-python and opencv-python-headless; albumentations needs opencv-python-headless), uninstall both opencv-python and opencv-python-headless first, and then reinstall albumentations. See albumentations#1140 for more details.

# first uninstall the conflicting opencv packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall albumentations  # if you have installed albumentations
pip install albumentations # then reinstall albumentations, will also install deps, e.g opencv

Then, check if albumentations is available.

import albumentations
import torchvision
import torchlm

torchlm.albumentations_is_available()  # True or False

transform = torchlm.LandmarksCompose([
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),  
    torchlm.bind(albumentations.ColorJitter(p=0.5))
])

Bind custom callable array or Tensor transform functions

from typing import Tuple

import numpy as np
import torchlm
from torch import Tensor

# First, define your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]: # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]: # do some transform here ...
    return img, landmarks

# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
        torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
        torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
])

Some global debug settings for torchlm's transforms

Setting the logging mode to True globally might help you figure out the runtime details.

# some global settings
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)

Some detailed information will then be shown at each run; the output might look like:

LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False

Execution Flag: True means the current transform was executed successfully; False means it was not executed because of the random probability or a runtime exception (torchlm will show the error info if debug mode is True).
AutoDtype Info:
- Array_InOut means the current transform needs a np.ndarray as input and outputs a np.ndarray.
- Tensor_InOut means the current transform needs a torch Tensor as input and outputs a torch Tensor.
- Array_In means the current transform needs a np.ndarray as input and outputs a torch Tensor.
- Tensor_In means the current transform needs a torch Tensor as input and outputs a np.ndarray.
It is fine to pass a Tensor to an np.ndarray-like transform: torchlm converts between data types automatically and wraps the result back to the original type through an autodtype wrapper.

🎉 🎉 Training

In torchlm, each model has two high-level, user-friendly APIs for training, named apply_training and apply_freezing. apply_training handles the training process, and apply_freezing decides whether to freeze the backbone for fine-tuning.

Quick Start 👇

Here is an example of PIPNet. You can freeze backbone before fine-tuning through apply_freezing.

from torchlm.models import pipnet
# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
model.apply_freezing(backbone=True)
model.apply_training(
    annotation_path="../data/WFLW/converted/train.txt",  # or fine-tuning your custom data
    num_epochs=10,
    learning_rate=0.0001,
    save_dir="./save/pipnet",
    save_prefix="pipnet-wflw-resnet18",
    save_interval=1,
    logging_interval=1,
    device="cuda",
    coordinates_already_normalized=True,
    batch_size=16,
    num_workers=4,
    shuffle=True
)

For detailed documentation of the apply_training API of each model defined in torchlm, please jump to the entry point of the function, e.g. pipnet/_impls.py#L166. You might see logs like these while training is running:

Parameters for DataLoader:  {'batch_size': 16, 'num_workers': 4, 'shuffle': True}
Built _PIPTrainDataset: train count is 7500 !
Epoch 0/9
----------
[Epoch 0/9, Batch 1/468] <Total loss: 0.372885> <cls loss: 0.063186> <x loss: 0.078508> <y loss: 0.071679> <nbx loss: 0.086480> <nby loss: 0.073031>
[Epoch 0/9, Batch 2/468] <Total loss: 0.354169> <cls loss: 0.051672> <x loss: 0.075350> <y loss: 0.071229> <nbx loss: 0.083785> <nby loss: 0.072132>
[Epoch 0/9, Batch 3/468] <Total loss: 0.367538> <cls loss: 0.056038> <x loss: 0.078029> <y loss: 0.076432> <nbx loss: 0.083546> <nby loss: 0.073492>
[Epoch 0/9, Batch 4/468] <Total loss: 0.339656> <cls loss: 0.053631> <x loss: 0.073036> <y loss: 0.066723> <nbx loss: 0.080007> <nby loss: 0.066258>
[Epoch 0/9, Batch 5/468] <Total loss: 0.364556> <cls loss: 0.051094> <x loss: 0.077378> <y loss: 0.071951> <nbx loss: 0.086363> <nby loss: 0.077770>
[Epoch 0/9, Batch 6/468] <Total loss: 0.371356> <cls loss: 0.049117> <x loss: 0.079237> <y loss: 0.075729> <nbx loss: 0.086213> <nby loss: 0.081060>
...
[Epoch 0/9, Batch 33/468] <Total loss: 0.298983> <cls loss: 0.041368> <x loss: 0.069912> <y loss: 0.057667> <nbx loss: 0.072996> <nby loss: 0.057040>

Dataset Format 👇

The annotation_path parameter denotes the path to a custom annotation file. The format must be:

"img0_path x0 y0 x1 y1 ... xn-1,yn-1"
"img1_path x0 y0 x1 y1 ... xn-1,yn-1"
"img2_path x0 y0 x1 y1 ... xn-1,yn-1"
"img3_path x0 y0 x1 y1 ... xn-1,yn-1"
...

If the labels in annotation_path are already normalized by image size, please set coordinates_already_normalized to True in the apply_training API.

"img0_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img1_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img2_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img3_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
...
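To make the annotation layout concrete, here is a minimal parsing sketch. It is not torchlm's loader: `parse_annotation_line` is a hypothetical helper, and it assumes that image paths contain no spaces and that coordinates may be separated by whitespace or commas (the format lines above show both).

```python
import re
from typing import Optional, Tuple

import numpy as np


def parse_annotation_line(line: str,
                          img_size: Optional[Tuple[int, int]] = None,
                          already_normalized: bool = False) -> Tuple[str, np.ndarray]:
    """Hypothetical helper: split one annotation line into (img_path, landmarks).

    landmarks has shape (N, 2) in xy order. If the coordinates are absolute and
    img_size=(w, h) is given, they are divided by the image size.
    """
    tokens = [t for t in re.split(r"[\s,]+", line.strip().strip('"')) if t]
    img_path, values = tokens[0], tokens[1:]
    if not values or len(values) % 2 != 0:
        raise ValueError(f"expected an even number of coordinates, got {len(values)}")
    landmarks = np.array([float(v) for v in values], dtype=np.float32).reshape(-1, 2)
    if not already_normalized and img_size is not None:
        w, h = img_size
        landmarks = landmarks / np.array([w, h], dtype=np.float32)
    return img_path, landmarks


# Absolute pixel coordinates, normalized by a 100x200 (w x h) image.
path_a, lms_a = parse_annotation_line('"a.jpg 10 20 30 40"', img_size=(100, 200))

# Coordinates that are already normalized are returned unchanged.
path_b, lms_b = parse_annotation_line("b.jpg 0.1 0.2 0.3,0.4", already_normalized=True)
```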

Here is an example of WFLW that shows how to prepare the dataset; also see test/data.py.

Additional Custom Settings 👋

Some models in torchlm support additional custom settings beyond the num_lms of your custom dataset. For example, PIPNet also needs a custom meanface generated from your custom dataset. Please see the source code of each model defined in torchlm for details about additional custom settings, which give more flexibility in training or fine-tuning. Here is an example of how to train PIPNet on your own dataset with a custom meanface setting.

Set up your custom meanface and nearest-neighbor landmarks through the pipnet.set_custom_meanface method. This method calculates the Euclidean distance between the landmarks in the meanface and automatically sets up the nearest neighbors for each landmark. NOTE: PIPNet will reshape the detection heads if the number of landmarks in the custom dataset is not equal to the num_lms you initialized with.

def set_custom_meanface(custom_meanface_file_or_string: str) -> bool:
    """
    :param custom_meanface_file_or_string: a long string or a file containing normalized
    or un-normalized meanface coords, in the format "x0,y0,x1,y1,x2,y2,...,xn-1,yn-1".
    :return: status, True if successful.
    """
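The nearest-neighbor step described above can be sketched in a few lines of NumPy. This only illustrates the idea (pairwise Euclidean distances, then the closest landmarks per row); `nearest_neighbors_from_meanface` is a hypothetical name, and excluding each landmark from its own neighbor list is an assumption rather than something the text above states.

```python
import numpy as np


def nearest_neighbors_from_meanface(meanface_string: str, num_nb: int) -> np.ndarray:
    """Hypothetical helper: for every meanface landmark, return the indices of
    its `num_nb` closest other landmarks, as an array of shape (N, num_nb)."""
    coords = np.array([float(v) for v in meanface_string.split(",")], dtype=np.float64)
    meanface = coords.reshape(-1, 2)                       # (N, 2), xy
    diff = meanface[:, None, :] - meanface[None, :, :]     # (N, N, 2)
    dist = np.sqrt((diff ** 2).sum(axis=-1))               # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                         # assumption: a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :num_nb]


# Four landmarks on a line at x = 0, 1, 5, 6.
neighbors = nearest_neighbors_from_meanface("0,0,1,0,5,0,6,0", num_nb=1)
```

With num_nb=10 and 98 landmarks, as in the WFLW examples in this document, the result would have shape (98, 10).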

Also, a generate_meanface API is available in torchlm to help you get the meanface of a custom dataset.

import cv2
import torchlm

# generate your custom meanface.
custom_meanface, custom_meanface_string = torchlm.data.annotools.generate_meanface(
  annotation_path="../data/WFLW/converted/train.txt",
  coordinates_already_normalized=True)
# check your generated meanface.
rendered_meanface = torchlm.data.annotools.draw_meanface(
  meanface=custom_meanface, coordinates_already_normalized=True)
cv2.imwrite("./logs/wflw_meanface.jpg", rendered_meanface)
# setting up your custom meanface
model.set_custom_meanface(custom_meanface_file_or_string=custom_meanface_string)

Benchmarks Dataset Converters 👇

In torchlm, pre-defined dataset converters are available for commonly used benchmark datasets, such as 300W, COFW, WFLW and AFLW. These converters help you convert a commonly used dataset to the standard annotation format that torchlm needs. Here is an example of WFLW.

from torchlm.data import LandmarksWFLWConverter
# setup your path to the original downloaded dataset from official 
converter = LandmarksWFLWConverter(
    data_dir="../data/WFLW", save_dir="../data/WFLW/converted",
    extend=0.2, rebuild=True, target_size=256, keep_aspect=False,
    force_normalize=True, force_absolute_path=True
)
converter.convert()
converter.show(count=30)  # show you some converted images with landmarks for debugging

Then, the output layout in ../data/WFLW/converted would look like:

├── image
│   ├── test
│   └── train
├── show
│   ├── 16--Award_Ceremony_16_Award_Ceremony_Awards_Ceremony_16_589x456y91.jpg
│   ├── 20--Family_Group_20_Family_Group_Family_Group_20_118x458y58.jpg
...
├── test.txt
└── train.txt

🛸 🚵‍️ Inference

C++ APIs 👀

The ONNXRuntime (CPU/GPU), MNN, NCNN and TNN C++ inference of torchlm will be released in lite.ai.toolkit. Here is an example of 1000 Facial Landmarks Detection using FaceLandmarks1000. Download the model from the Model-Zoo.

#include "lite/lite.h"

static void test_default()
{
  std::string onnx_path = "../../../hub/onnx/cv/FaceLandmark1000.onnx";
  std::string test_img_path = "../../../examples/lite/resources/test_lite_face_landmarks_0.png";
  std::string save_img_path = "../../../logs/test_lite_face_landmarks_1000.jpg";
    
  auto *face_landmarks_1000 = new lite::cv::face::align::FaceLandmark1000(onnx_path);

  lite::types::Landmarks landmarks;
  cv::Mat img_bgr = cv::imread(test_img_path);
  face_landmarks_1000->detect(img_bgr, landmarks);
  lite::utils::draw_landmarks_inplace(img_bgr, landmarks);
  cv::imwrite(save_img_path, img_bgr);
  
  delete face_landmarks_1000;
}

More classes for face alignment (68 points, 98 points, 106 points, 1000 points)

auto *align = new lite::cv::face::align::PFLD(onnx_path);  // 106 landmarks, 1.0Mb only!
auto *align = new lite::cv::face::align::PFLD98(onnx_path);  // 98 landmarks, 4.8Mb only!
auto *align = new lite::cv::face::align::PFLD68(onnx_path);  // 68 landmarks, 2.8Mb only!
auto *align = new lite::cv::face::align::MobileNetV268(onnx_path);  // 68 landmarks, 9.4Mb only!
auto *align = new lite::cv::face::align::MobileNetV2SE68(onnx_path);  // 68 landmarks, 11Mb only!
auto *align = new lite::cv::face::align::FaceLandmark1000(onnx_path);  // 1000 landmarks, 2.0Mb only!
auto *align = new lite::cv::face::align::PIPNet98(onnx_path);  // 98 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet68(onnx_path);  // 68 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet29(onnx_path);  // 29 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet19(onnx_path);  // 19 landmarks, CVPR2021!

For more details of the C++ APIs, please check lite.ai.toolkit.

Python APIs 👇

In torchlm, we provide pipelines for deploying models with PyTorch and ONNXRuntime. A high-level API named runtime.bind can bind face detection and landmarks models together; then you can run the runtime.forward API to get the output landmarks and bboxes. Here is an example of PIPNet. Pretrained weights of PIPNet: Download.

Inference on PyTorch Backend

import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2(device="cpu"))  # set device="cuda" if you want to run with CUDA
# set map_location="cuda" if you want to run with CUDA
torchlm.runtime.bind(
  pipnet(backbone="resnet18", pretrained=True,  
         num_nb=10, num_lms=98, net_stride=32, input_size=256,
         meanface_type="wflw", map_location="cpu", checkpoint=None) 
) # will auto download pretrained weights from latest release if pretrained=True
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)

Inference on ONNXRuntime Backend

import torchlm
from torchlm.runtime import faceboxesv2_ort, pipnet_ort

torchlm.runtime.bind(faceboxesv2_ort())
torchlm.runtime.bind(
  pipnet_ort(onnx_path="pipnet_resnet18.onnx",num_nb=10,
             num_lms=98, net_stride=32,input_size=256, meanface_type="wflw")
)
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)

🤠 🎯 Evaluating

In torchlm, each model has a high-level, user-friendly API named apply_evaluating for evaluation. This method calculates the NME, FR and AUC for the eval dataset. Here is an example of PIPNet.

from torchlm.models import pipnet
# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
NME, FR, AUC = model.apply_evaluating(
    annotation_path="../data/WFLW/converted/test.txt",
    norm_indices=[60, 72],  # the indexes of two eyeballs.
    coordinates_already_normalized=True, 
    eval_normalized_coordinates=False
)
print(f"NME: {NME}, FR: {FR}, AUC: {AUC}")

Then, you will get the Performance(@NME@FR@AUC) results.

Built _PIPEvalDataset: eval count is 2500 !
Evaluating PIPNet: 100%|██████████| 2500/2500 [02:53<00:00, 14.45it/s]
NME: 0.04453323229181989, FR: 0.04200000000000004, AUC: 0.5732673333333334
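For readers unfamiliar with the three metrics, the sketch below shows how NME, FR and AUC are commonly computed for landmark detection. It is an illustration, not torchlm's apply_evaluating code: the function names are hypothetical, and the 0.1 failure threshold and the 1000-step error curve are assumptions chosen for the example.

```python
import numpy as np


def nme_per_image(pred: np.ndarray, gt: np.ndarray, norm_indices) -> float:
    """Mean point-to-point error, divided by the distance between the two
    ground-truth landmarks selected by norm_indices (e.g. the two eyes)."""
    i, j = norm_indices
    norm = np.linalg.norm(gt[i] - gt[j])
    return float(np.linalg.norm(pred - gt, axis=1).mean() / norm)


def failure_rate(nmes, threshold: float = 0.1) -> float:
    """Fraction of images whose NME exceeds the threshold (assumed 0.1)."""
    return float((np.asarray(nmes) > threshold).mean())


def auc(nmes, threshold: float = 0.1, steps: int = 1000) -> float:
    """Area under the cumulative error distribution on [0, threshold],
    scaled to [0, 1] and integrated with the trapezoidal rule."""
    nmes = np.asarray(nmes)
    xs = np.linspace(0.0, threshold, steps)
    ced = np.array([(nmes <= x).mean() for x in xs])
    area = np.sum((ced[1:] + ced[:-1]) * np.diff(xs)) / 2.0
    return float(area / threshold)


gt = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
pred = gt + np.array([0.1, 0.0])        # every point is off by 0.1 in x
example_nme = nme_per_image(pred, gt, norm_indices=[0, 1])
example_fr = failure_rate([0.05, 0.2])  # one of the two images fails
example_auc = auc([0.0, 0.0])           # perfect predictions
```

Lower NME and FR are better, while higher AUC is better.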

⚙️ ⚔️ Exporting

In torchlm, each model has a high-level, user-friendly API named apply_exporting for ONNX export. Here is an example of PIPNet.

from torchlm.models import pipnet
# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
model.apply_exporting(
    onnx_path="./save/pipnet/pipnet_resnet18.onnx",
    opset=12, simplify=True, output_names=None  # use default output names.
)

Then, you will get a static ONNX model file once the exporting process is done.

  ...
  %195 = Add(%259, %189)
  %196 = Relu(%195)
  %outputs_cls = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %cls_layer.weight, %cls_layer.bias)
  %outputs_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %x_layer.weight, %x_layer.bias)
  %outputs_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %y_layer.weight, %y_layer.bias)
  %outputs_nb_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_x_layer.weight, %nb_x_layer.bias)
  %outputs_nb_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_y_layer.weight, %nb_y_layer.bias)
  return %outputs_cls, %outputs_x, %outputs_y, %outputs_nb_x, %outputs_nb_y
}
Checking 0/3...
Checking 1/3...
Checking 2/3...

📖 Documentation

Data Augmentation's API

🎓 License

The code of torchlm is released under the MIT License.

❤️ Contribution

Please consider ⭐ this repo if you like it, as it is the simplest way to support me.

👋 Acknowledgement

The implementation of torchlm's transforms borrows code from Paperspace.
PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021

Comments

Can't use cuda in pipnet

I want to use CUDA with pipnet, so I ran the following code:

import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet
import cv2
image_path = '../rgb/image0/1.png'
image = cv2.imread(image_path)
torchlm.runtime.bind(faceboxesv2())
torchlm.runtime.bind(
    pipnet(backbone="resnet18", pretrained=True,
        num_nb=10, num_lms=98, net_stride=32, input_size=256,
        meanface_type="wflw", checkpoint=None, map_location="cuda")
)
torchlm.runtime.forward(image)

Then I get an error that says:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

I know this is because my image data still lives on the CPU instead of the GPU, so I need to move my data to the GPU. I added a line like the following:

image = cv2.imread(image_path)
image = torch.tensor(image).cuda()

But now I get another error:

  File "C:\Home\Development\Anaconda\envs\DeepLearning\lib\site-packages\torchlm\tools\_faceboxesv2.py", line 305, in apply_detecting
    image_scale = cv2.resize(
cv2.error: OpenCV(4.5.5) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src is not a numpy array, neither a scalar
>  - Expected Ptr<cv::UMat> for argument 'src'

This means I need to pass an ndarray instead of a torch tensor. But if I pass an ndarray, its data stays on the CPU, and I get the first error again.

What should I do? Has anyone else hit the same error?

bug

opened by try-agaaain 15

Missing mobilenet_v2 model file

It's mentioned in the readme, but the releases page contains only pipnet with resnet18 and resnet101 backbones. @DefTruth it would be great if you could share a pipnet with a mobilenet_v2 backbone, because even resnet18 is too heavy for a mobile application (44mb). Thank you for the great work, very useful!
question

opened by gordinmitya 7

Inference on PyTorch Backend not work

Hi, I tested the inference code for predicting landmarks using the 300W pretrained weights, like this:

checkpoint = "pipnet_resnet101_10x68x32x256_300w.pth"
torchlm.runtime.bind(faceboxesv2())
torchlm.runtime.bind(
  pipnet(backbone="resnet101", pretrained=True,  
         num_nb=10, num_lms=68, net_stride=32, input_size=256,
         meanface_type="300w", map_location="cpu",
            backbone_pretrained=True, checkpoint=checkpoint)
) # will auto download pretrained weights from latest release if pretrained=True

when I run the forward code:

 landmarks, bboxes = torchlm.runtime.forward(frame_bgr)

I got the error of

File ~/miniconda3/envs/torchlm/lib/python3.9/site-packages/torchlm/runtime/_wrappers.py:120, in forward(image, extend, swapRB_before_face, swapRB_before_landmarks, **kwargs)
    105 def forward(
    106         image: np.ndarray,
    107         extend: float = 0.2,
   (...)
    110         **kwargs: Any  # params for face_det & landmarks_det
    111 ) -> Tuple[_Landmarks, _BBoxes]:
    112     """
    113     :param image: original input image, HWC, BGR/RGB
    114     :param extend: extend ratio for face cropping (1.+extend) before landmarks detection.
   (...)
    118     :return: landmarks (n,m,2) -> x,y; bboxes (n,5) -> x1,y1,x2,y2,score
    119     """
--> 120     return RuntimeWrapper.forward(
    121         image=image,
    122         extend=extend,
    123         swapRB_before_face=swapRB_before_face,
    124         swapRB_before_landmarks=swapRB_before_landmarks,
    125         **kwargs
    126     )

File ~/miniconda3/envs/torchlm/lib/python3.9/site-packages/torchlm/runtime/_wrappers.py:50, in RuntimeWrapper.forward(cls, image, extend, swapRB_before_face, swapRB_before_landmarks, **kwargs)
     48     bboxes = cls.face_base.apply_detecting(image_swapRB, **kwargs)  # (n,5) x1,y1,x2,y2,score
     49 else:
---> 50     bboxes = cls.face_base.apply_detecting(image, **kwargs)  # (n,5) x1,y1,x2,y2,score
     52 det_num = bboxes.shape[0]
     53 landmarks = []

File ~/miniconda3/envs/torchlm/lib/python3.9/site-packages/torch/autograd/grad_mode.py:28, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
     25 @functools.wraps(func)
     26 def decorate_context(*args, **kwargs):
     27     with self.__class__():
---> 28         return func(*args, **kwargs)

File ~/miniconda3/envs/torchlm/lib/python3.9/site-packages/torchlm/tools/_faceboxesv2.py:340, in FaceBoxesV2.apply_detecting(self, image, thresh, im_scale, top_k)
    338 # keep top-K before NMS
    339 order = scores.argsort()[::-1][:top_k * 3]
--> 340 boxes = boxes[order]
    341 scores = scores[order]
    343 # nms

IndexError: index 1 is out of bounds for axis 0 with size 1

Could you help me with this, thx!

opened by tongnews 5

is PIPNet the SOTA ?

Hi, thanks for the work! I am from industry, and I plan to choose one algorithm and train it with custom datasets.

I would like to ask which algorithm is the SOTA choice for the 2D facial landmarks task. By SOTA I mean the video stability and accuracy reported in the paper or on GitHub.

Also, any suggestions for 3D facial landmarks?

opened by IQ17 4

HorizontalFlip with Albumentations

Great project and thanks for sharing it! For horizontal flip, PIPNet has a point-flip augmentation such as:

def random_flip(image, target, points_flip):
    if random.random() > 0.5:
        image = image.transpose(Image.FLIP_LEFT_RIGHT)
        target = np.array(target).reshape(-1, 2)
        target = target[points_flip, :]
        target[:,0] = 1-target[:,0]
        target = target.flatten()
        return image, target
    else:
        return image, target

How do you implement the above function with Albumentations? I found HorizontalFlip, but it has no points_flip option.
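One possible direction, sketched here without Albumentations, is to write the flip as a plain array callable with the same (img, landmarks) signature as the custom callables in the README; such a function could presumably then be registered with torchlm.bind using bind_type=torchlm.BindEnum.Callable_Array. `make_hflip` is a hypothetical helper, the three-point `points_flip` below is invented for the example, and the sketch assumes landmarks in absolute pixel coordinates (for normalized coordinates the x update would be 1 - x, as in the function above).

```python
from typing import Callable, Sequence, Tuple

import numpy as np


def make_hflip(points_flip: Sequence[int]) -> Callable[[np.ndarray, np.ndarray],
                                                      Tuple[np.ndarray, np.ndarray]]:
    """Hypothetical helper: build a horizontal flip that also swaps mirrored
    landmark indices. points_flip[k] is the index of the landmark that takes
    slot k after flipping (e.g. left eye <-> right eye)."""
    index = np.asarray(points_flip, dtype=np.int64)

    def hflip(img: np.ndarray, landmarks: np.ndarray):
        w = img.shape[1]
        flipped_img = np.ascontiguousarray(img[:, ::-1])  # mirror the columns
        flipped = landmarks[index].copy()                 # reorder mirrored points
        flipped[:, 0] = (w - 1) - flipped[:, 0]           # mirror x in pixel coords
        return flipped_img, flipped

    return hflip


# Toy example: index 0 = left eye, 1 = right eye, 2 = nose (maps to itself).
img = np.arange(4 * 6, dtype=np.uint8).reshape(4, 6)
landmarks = np.array([[0.0, 1.0], [5.0, 2.0], [2.0, 3.0]], dtype=np.float32)

hflip = make_hflip(points_flip=[1, 0, 2])
flipped_img, flipped_lms = hflip(img, landmarks)
```

A random version can be obtained by wrapping the callable with a probability, which is what the prob param of torchlm.bind is described as doing.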

opened by John1231983 3

Wrong results with Faceboxesv2 (ONNX) and there isn't YoloV5face implemented yet.

Hi, first of all, thanks for the amazing work. I found an error with the Faceboxesv2 (ONNX) model. When I run inference on my image, I get nothing. I tracked the problem down to the model itself: if I change the model's threshold from 0.3 to 0.003, I do get detections, but as you can see the model is not working well, even though the focus and brightness conditions are very good. So I was thinking that maybe the model only works with the test image, right?

So I tried to use another model (yolov5face) and saw that it is not implemented yet. Do you plan to incorporate it?

Thanks!

opened by PaulCahuana 0

Will PipNet allow gradients to flow through?

I will just freeze the weights in the PipNet model itself, but I'd like to get gradients (to adjust aspects of the input image) from a loss created by the Landmarks vs some target Landmarks.

It looks like the code 'flows through' from input image to Landmark predictions - does this sound right? Or is there some strange discretisation step that would make the gradients nonsense?

opened by mdda 0

Training with 7 landmarks

Hi, can I train on my own dataset that has 7 landmarks per image, say feet landmarks? I tried doing that, but it gives this error: RuntimeError: Can not found any meanface landmarks settings !Please check and setup meanface carefully beforerunning PIPNet

Originally generated by this following warning: UserWarning: meanface_lms != self.num_lms, 98 != 7So, we will skip this setup for PIPNet meanface

opened by KaleemW 0

Error on Converting 300W dataset in class Landmarks300WConverter with message UFuncTypeError: Cannot cast ufunc 'subtract' output...

Example code

from torchlm.data import LandmarksWFLWConverter, Landmarks300WConverter
# setup your path to the original downloaded dataset from official 
converter = Landmarks300WConverter(
    data_dir="/data/extended/landmark/ibug_300W_dlib/", save_dir="/data/extended/landmark/ibug_300W_dlib/converted",
    extend=0.2, rebuild=True, target_size=256, keep_aspect=False,
    force_normalize=True, force_absolute_path=True
)
converter.convert()
converter.show(count=30)  # show you some converted images with landmarks for debugging

Error Message

Converting 300W Train Annotations:   0%|                                                                                                                                          | 0/3148 [00:00<?, ?it/s]
---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
Input In [1], in <cell line: 8>()
      2 # setup your path to the original downloaded dataset from official 
      3 converter = Landmarks300WConverter(
      4     data_dir="/data/extended/landmark/ibug_300W_dlib//", save_dir="/data/extended/landmark/ibug_300W_dlib/converted",
      5     extend=0.2, rebuild=True, target_size=256, keep_aspect=False,
      6     force_normalize=True, force_absolute_path=True
      7 )
----> 8 converter.convert()
      9 converter.show(count=30)

File /opt/anaconda/envs/facial/lib/python3.9/site-packages/torchlm/data/_converters.py:329, in Landmarks300WConverter.convert(self)
    322 test_anno_file = open(self.save_test_annotation_path, "w")
    324 for annotation in tqdm.tqdm(
    325         self.train_annotations,
    326         colour="GREEN",
    327         desc="Converting 300W Train Annotations"
    328 ):
--> 329     crop, landmarks, new_img_name = self._process_annotation(annotation=annotation)
    330     if crop is None or landmarks is None:
    331         continue

File /opt/anaconda/envs/facial/lib/python3.9/site-packages/torchlm/data/_converters.py:493, in Landmarks300WConverter._process_annotation(self, annotation)
    491 crop = image[int(ymin):int(ymax), int(xmin):int(xmax), :]
    492 # adjust according to left-top corner
--> 493 landmarks[:, 0] -= float(xmin)
    494 landmarks[:, 1] -= float(ymin)
    496 if self.target_size is not None and self.resize_op is not None:

UFuncTypeError: Cannot cast ufunc 'subtract' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Solution

Change these lines

# adjust according to left-top corner
landmarks[:, 0] -= float(xmin)
landmarks[:, 1] -= float(ymin)

Use these lines instead

# adjust according to left-top corner
landmarks[:, 0] = landmarks[:, 0] - float(xmin)
landmarks[:, 1] = landmarks[:, 1] - float(ymin)
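
One caveat with this workaround, sketched here with a hypothetical toy array rather than the converter's real data: when `landmarks` has an integer dtype, slice assignment avoids the exception, but NumPy casts the float result back into the integer array, so any fractional part of the offset is truncated. Casting the array to float once before subtracting is an alternative that keeps sub-pixel precision. The values of `landmarks`, `xmin` and `ymin` below are made up for illustration.

```python
import numpy as np

# Hypothetical integer landmarks (x, y) and a fractional crop origin.
landmarks = np.array([[30, 40], [35, 45]], dtype=np.int64)
xmin, ymin = 10.5, 20.25

# 1) In-place subtraction: NumPy refuses to cast float64 back to int64.
inplace_failed = False
try:
    broken = landmarks.copy()
    broken[:, 0] -= float(xmin)
except TypeError:  # UFuncTypeError is a subclass of TypeError
    inplace_failed = True

# 2) Slice assignment: no exception, but the fractions are silently dropped.
truncated = landmarks.copy()
truncated[:, 0] = truncated[:, 0] - float(xmin)
truncated[:, 1] = truncated[:, 1] - float(ymin)

# 3) Cast to float first: in-place subtraction works and keeps precision.
shifted = landmarks.astype(np.float64)
shifted[:, 0] -= float(xmin)
shifted[:, 1] -= float(ymin)

print(inplace_failed)       # True
print(truncated.tolist())   # [[19, 19], [24, 24]]
print(shifted.tolist())     # [[19.5, 19.75], [24.5, 24.75]]
```

Whether the truncation matters depends on how the landmarks are used afterwards; if they are later normalized to floats, casting before the crop adjustment avoids losing up to one pixel per coordinate.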

Environment

conda 4.10.3
Python 3.9.12 (main, Jun 1 2022, 11:38:51)
numpy version 1.23.1
torch version 1.12.1
Operating system Ubuntu 20.04.3 LTS

opened by nunenuh 0

Wrong results while running on GPU

Hi,

Thanks for the wonderful work. I am using torchlm for landmark detection. When running on CPU locally, the results look fine. When I later ran it on the GPU server, it gave nonsense results.

I don't know if anyone has encountered this issue before. I set up the environment the same way both locally and remotely, but on the remote server it keeps giving nonsense results on both CPU and GPU.

opened by Neo1024 0

Releases(torchlm-0.1.6-alpha)

torchlm-0.1.6-alpha(Mar 7, 2022)
[2022/03/08]: Add PIPNet: Towards Efficient Facial Landmark Detection in the Wild, CVPR2021

Source code(tar.gz)
Source code(zip)
faceboxesv2-640x640.onnx(3.91 MB)
faceboxesv2.pth(3.96 MB)
pipnet_resnet101_10x19x32x256_aflw.pth(166.05 MB)
pipnet_resnet101_10x29x32x256_cofw.pth(167.85 MB)
pipnet_resnet101_10x68x32x256_300w.pth(174.86 MB)
pipnet_resnet101_10x98x32x256_wflw.pth(180.25 MB)
pipnet_resnet18.onnx(47.03 MB)
pipnet_resnet18_10x19x32x256_aflw.pth(43.54 MB)
pipnet_resnet18_10x29x32x256_cofw.pth(43.99 MB)
pipnet_resnet18_10x68x32x256_300w.pth(45.75 MB)
pipnet_resnet18_10x98x32x256_wflw.pth(47.10 MB)
torchlm-0.1.6-py3-none-any.whl(8.42 MB)
torchlm-0.1.6.tar.gz(8.42 MB)
torchlm-v0.1.5(Feb 19, 2022)
v0.1.5 release notes

add data augmentations api docs

some bugs fixed

add more examples

Source code(tar.gz)
Source code(zip)
torchlm-0.1.5-py3-none-any.whl(4.74 MB)
torchlm-0.1.5.tar.gz(4.73 MB)
torchlm-v0.1.4(Feb 19, 2022)
v0.1.4 release notes

add _transforms_api_assert

fixed apply_background

add MANIFEST.in

add MixUp

package transforms assets into pypi for MixUp

Source code(tar.gz)
Source code(zip)
torchlm-0.1.4-py3-none-any.whl(4.74 MB)
torchlm-0.1.4.tar.gz(4.73 MB)
torchlm-v0.1.3(Feb 18, 2022)
v0.1.3 release notes

fixed wrong flag setting when the landmarks are outside

fixed docs typos

init metrics module

Source code(tar.gz)
Source code(zip)
torchlm-0.1.3-py3-none-any.whl(25.55 KB)
torchlm-0.1.3.tar.gz(24.88 KB)
torchlm-v0.1.2(Feb 14, 2022)
v0.1.2 release notes

add safe check and make sure the landmarks keep the same number after any transforms

fixed docs typos

add internal transform logging api

add prob param at bind-level

Source code(tar.gz)
Source code(zip)
torchlm-0.1.2-py3-none-any.whl(24.93 KB)
torchlm-0.1.2.tar.gz(24.77 KB)
torchlm-v0.1.1(Feb 13, 2022)
🆕 What's New

[2022/02/13]: Add 30+ native data augmentations and bind 80+ transforms from torchvision and albumentations.

Source code(tar.gz)
Source code(zip)
torchlm-0.1.1-py3-none-any.whl(24.15 KB)
torchlm-0.1.1.tar.gz(23.80 KB)
torchlm-v0.1.0(Feb 13, 2022)
🆕 What's New

[2022/02/13]: Add 30+ native data augmentations and bind 80+ transforms from torchvision and albumentations.

Source code(tar.gz)
Source code(zip)
torchlm-0.1.0-py3-none-any.whl(21.83 KB)
torchlm-0.1.0.tar.gz(18.50 KB)

Owner

DefTruth

He is still a youth ~ (stay young)

GitHub Repository https://github.com/DefTruth/torchlm

