Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

Last update: Dec 19, 2022

Abstract: We introduce a method that automatically segments images into semantically meaningful regions without human supervision. The derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets. In cases where semantic regions might be hard for humans to define and label consistently, our method is still able to find meaningful and consistent semantic classes. In our work, we use a pretrained StyleGAN2 generative model: clustering in the feature space of the generative model makes it possible to discover semantic classes. Once the classes are discovered, a synthetic dataset of generated images and corresponding segmentation masks can be created. A segmentation model is then trained on the synthetic dataset and is able to generalize to real images. Additionally, by using CLIP we are able to use natural-language prompts to discover specific desired semantic classes. We test our method on publicly available datasets and show state-of-the-art results.
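
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the repository's implementation: the toy feature map below stands in for StyleGAN2 activations, the function names `kmeans` and `segment` are hypothetical, and plain k-means is an assumption chosen for illustration (the abstract does not name the clustering algorithm). It uses only the Python standard library; a real pipeline would most likely operate on PyTorch tensors.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def segment(feature_map, k, seed=0):
    """Cluster an H x W grid of per-pixel feature vectors into an H x W label mask."""
    h, w = len(feature_map), len(feature_map[0])
    flat = [vec for row in feature_map for vec in row]
    labels, _ = kmeans(flat, k, seed=seed)
    return [labels[r * w:(r + 1) * w] for r in range(h)]


# Toy stand-in for generator activations (an assumption, not real StyleGAN2 data):
# the left and right halves of a 4 x 4 map carry clearly different features.
toy = [[[0.0, 0.1] if x < 2 else [5.0, 5.1] for x in range(4)] for _ in range(4)]
mask = segment(toy, k=2)
for row in mask:
    print(row)
```

Each cluster index plays the role of a discovered semantic class. Pairing every generated image with the mask obtained this way is what would yield the synthetic training set the abstract mentions.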

This repository contains the official PyTorch implementation of the following paper:

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
Daniil Pakhomov, Sanchit Hira, Narayani Wagle, Kemar E. Green, Nassir Navab
https://arxiv.org/abs/2107.12518

Owner: Daniil Pakhomov
