Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

Overview

alt text

The Face Synthetics dataset

Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

It was introduced in our paper Fake It Till You Make It: Face analysis in the wild using synthetic data alone.

Our dataset contains:

  • 100,000 images of faces at 512 x 512 pixel resolution
  • 70 standard facial landmark annotations
  • per-pixel semantic class anotations

It can be used to train machine learning systems for face-related tasks such as landmark localization and face parsing, showing that synthetic data can both match real data in accuracy as well as open up new approaches where manual labelling would be impossible.

Some images also include hands and off-center distractor faces in addition to primary faces centered in the image.

The Face Synthetics dataset can be used for non-commercial research, and is licensed under the license found in LICENSE.txt.

Downloading the dataset

A sample dataset with 100 images (34MB) can be downloaded from here

A sample dataset with 1000 images (320MB) can be downloaded from here

A full dataset of 100,000 images (32GB) can be downloaded from here

Dataset layout

The Face Synthetics dataset is a single .zip file containing color images, segmentation images, and 2D landmark coordinates in a text file.

dataset.zip
├── {frame_id}.png        # Rendered image of a face
├── {frame_id}_seg.png    # Segmentation image, where each pixel has an integer value mapping to the categories below
├── {frame_id}_ldmks.txt  # Landmark annotations for 70 facial landmarks (x, y) coordinates for every row

Our landmark annotations follow the 68 landmark scheme from iBUG with two additional points for the pupil centers. Please note that our 2D landmarks are projections of 3D points and do not follow the outline of the face/lips/eyebrows in the way that is common from manually annotated landmarks. They can be thought of as an "x-ray" version of 2D landmarks.

Each pixel in the segmentation image will belong to one of the following classes:

BACKGROUND = 0
SKIN = 1
NOSE = 2
RIGHT_EYE = 3
LEFT_EYE = 4
RIGHT_BROW = 5
LEFT_BROW = 6
RIGHT_EAR = 7
LEFT_EAR = 8
MOUTH_INTERIOR = 9
TOP_LIP = 10
BOTTOM_LIP = 11
NECK = 12
HAIR = 13
BEARD = 14
CLOTHING = 15
GLASSES = 16
HEADWEAR = 17
FACEWEAR = 18
IGNORE = 255

Pixels marked as IGNORE should be ignored during training.

Notes:

  • Opaque eyeglass lenses are labeled as GLASSES, while transparent lenses as the class behind them.
  • For bushy eyebrows, a few eyebrow pixels may extend beyond the boundary of the face. These pixels are labelled as IGNORE.

Disclaimer

Some of our rendered faces may be close in appearance to the faces of real people. Any such similarity is naturally unintentional, as it would be in a dataset of real images, where people may appear similar to others unknown to them.

Generalization to real data

For best results, we suggest you follow the methodology described in our paper (citation below). Especially note the need for 1) data augmentation; 2) use of a translation layer if evaluating on real data benchmarks that contain different types of annotations.

Our dataset strives to be as diverse as possible and generalizes to real test data as described in the paper. However, you may encounter situations that it does not cover and/or where generalization is less successful. We recommend that machine learning practitioners always test models on real data that is representative of the target deployment scenario.

Citation

If you use the Face Synthetics Dataset your research, please cite the following paper:

@misc{wood2021fake,
    title={Fake It Till You Make It: Face analysis in the wild using synthetic data alone},
    author={Erroll Wood and Tadas Baltru\v{s}aitis and Charlie Hewitt and Sebastian Dziadzio and Matthew Johnson and Virginia Estellers and Thomas J. Cashman and Jamie Shotton},
    year={2021},
    eprint={2109.15102},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Texformer: 3D Human Texture Estimation from a Single Image with Transformers This is the official implementation of "3D Human Texture Estimation from

XiangyuXu 193 Dec 05, 2022
Open Source Light Field Toolbox for Super-Resolution

BasicLFSR BasicLFSR is an open-source and easy-to-use Light Field (LF) image Super-Ressolution (SR) toolbox based on PyTorch, including a collection o

Squidward 50 Nov 18, 2022
Temporal-Relational CrossTransformers

Temporal-Relational Cross-Transformers (TRX) This repo contains code for the method introduced in the paper: Temporal-Relational CrossTransformers for

83 Dec 12, 2022
School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

F-Principle This is an exercise problem of the digital signal processing (DSP) course at School of Artificial Intelligence at the Nanjing University (

Thyrix 5 Nov 23, 2022
[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)

Are Transformers More Robust Than CNNs? Pytorch implementation for NeurIPS 2021 Paper: Are Transformers More Robust Than CNNs? Our implementation is b

Yutong Bai 145 Dec 01, 2022
Constrained Logistic Regression - How to apply specific constraints to logistic regression's coefficients

Constrained Logistic Regression Sample implementation of constructing a logistic regression with given ranges on each of the feature's coefficients (v

1 Dec 29, 2021
TensorFlow implementation of Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction)

Barlow-Twins-TF This repository implements Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction) in TensorFlow and demonstrat

Sayak Paul 36 Sep 14, 2022
ViSD4SA, a Vietnamese Span Detection for Aspect-based sentiment analysis dataset

UIT-ViSD4SA PACLIC 35 General Introduction This repository contains the data of the paper: Span Detection for Vietnamese Aspect-Based Sentiment Analys

Nguyễn Thị Thanh Kim 5 Nov 13, 2022
The code for paper Efficiently Solve the Max-cut Problem via a Quantum Qubit Rotation Algorithm

Quantum Qubit Rotation Algorithm Single qubit rotation gates $$ U(\Theta)=\bigotimes_{i=1}^n R_x (\phi_i) $$ QQRA for the max-cut problem This code wa

SheffieldWang 0 Oct 18, 2021
A universal memory dumper using Frida

Fridump Fridump (v0.1) is an open source memory dumping tool, primarily aimed to penetration testers and developers. Fridump is using the Frida framew

551 Jan 07, 2023
Physical Anomalous Trajectory or Motion (PHANTOM) Dataset

Physical Anomalous Trajectory or Motion (PHANTOM) Dataset Description This dataset contains the six different classes as described in our paper[]. The

0 Dec 16, 2021
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

Mario Sessa 8 Dec 12, 2022
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022
Code for Environment Dynamics Decomposition (ED2).

ED2 Code for Environment Dynamics Decomposition (ED2). Installation Follow the installation in MBPO and Dreamer. Usage First follow the SD2 method for

0 Aug 10, 2021
Project NII pytorch scripts

project-NII-pytorch-scripts By Xin Wang, National Institute of Informatics, since 2021 I am a new pytorch user. If you have any suggestions or questio

Yamagishi and Echizen Laboratories, National Institute of Informatics 184 Dec 23, 2022
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Sun Yi 201 Nov 21, 2022
CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energy Management, 2020, PikaPika team

Citylearn Challenge This is the PyTorch implementation for PikaPika team, CityLearn Challenge Multi-Agent Reinforcement Learning for Intelligent Energ

bigAIdream projects 10 Oct 10, 2022
Annotated, understandable, and visually interpretable PyTorch implementations of: VAE, BIRVAE, NSGAN, MMGAN, WGAN, WGANGP, LSGAN, DRAGAN, BEGAN, RaGAN, InfoGAN, fGAN, FisherGAN

Overview PyTorch 0.4.1 | Python 3.6.5 Annotated implementations with comparative introductions for minimax, non-saturating, wasserstein, wasserstein g

Shayne O'Brien 471 Dec 16, 2022
PyTorch META-DATASET (Few-shot classification benchmark)

PyTorch META-DATASET (Few-shot classification benchmark) This repo contains a PyTorch implementation of meta-dataset and a unified implementation of s

Malik Boudiaf 39 Oct 31, 2022
[CVPR'21] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

MonoRUn MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. CVPR 2021. [paper] Hansheng Chen, Yuyao Huang, Wei Tian*

同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University) 96 Dec 10, 2022