This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Last update: Dec 30, 2022

Overview

Prompt-Based Multi-Modal Image Segmentation

The system allows segmentation models to be created without training, based on either:

An arbitrary text query, or
An image with a mask highlighting stuff or an object.

Quick Start

The Quickstart.ipynb notebook provides the code for using a pre-trained CLIPSeg model. It can also be run interactively on MyBinder (please note that the VM does not have a GPU, so inference takes a few seconds).

Dependencies

This code base depends on PyTorch, torchvision and CLIP (pip install git+https://github.com/openai/CLIP.git). Additional dependencies are hidden for double-blind review.
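Before opening the notebook, it can help to confirm that the three packages named above are importable. The following is a minimal sketch, not part of the repository: the helper name missing_modules is hypothetical, and the import names torch, torchvision and clip are assumed to match the installed packages.

```python
import importlib.util

# Import names assumed for the three packages listed under Dependencies.
REQUIRED_MODULES = ("torch", "torchvision", "clip")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on the Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the modules up and does not import them, so it stays fast even when PyTorch is installed.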

Datasets

PhraseCut and PhraseCutPlus: Referring expression dataset
PFEPascalWrapper: Wrapper class for PFENet's Pascal-5i implementation
PascalZeroShot: Wrapper class for PascalZeroShot
COCOWrapper: Wrapper class for COCO.

Models

CLIPDensePredT: CLIPSeg model with transformer-based decoder.
ViTDensePredT: CLIPSeg model with transformer-based decoder.

Third Party Dependencies

For some of the datasets third party dependencies are required. Run the following commands in the third_party folder.

git clone https://github.com/cvlab-yonsei/JoEm
git clone https://github.com/Jia-Research-Lab/PFENet.git
git clone https://github.com/ChenyunWu/PhraseCutDataset.git
git clone https://github.com/juhongm999/hsnet.git
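The four clone commands above could also be scripted so that repositories already present are skipped. This is only a sketch under stated assumptions: the helper names (repo_dir_name, clone_commands, clone_all) are hypothetical, git is assumed to be on the PATH, and the script is assumed to be run from the repository root.

```python
import subprocess
from pathlib import Path

# Repository URLs copied from the commands listed above.
REPOS = [
    "https://github.com/cvlab-yonsei/JoEm",
    "https://github.com/Jia-Research-Lab/PFENet.git",
    "https://github.com/ChenyunWu/PhraseCutDataset.git",
    "https://github.com/juhongm999/hsnet.git",
]


def repo_dir_name(url):
    """Return the directory name git creates by default for this URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_commands(target="third_party", repos=REPOS):
    """Build one 'git clone' command per repository not yet present in target."""
    target = Path(target)
    return [
        ["git", "clone", url, str(target / repo_dir_name(url))]
        for url in repos
        if not (target / repo_dir_name(url)).exists()
    ]


def clone_all(target="third_party"):
    """Create the target folder if needed and clone every missing repository."""
    Path(target).mkdir(parents=True, exist_ok=True)
    for command in clone_commands(target):
        subprocess.run(command, check=True)
```

Calling clone_all() performs the clones; clone_commands() alone can be used to preview what would be run.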

Weights

CLIPSeg-D64 (4.1MB, without CLIP weights)
CLIPSeg-D16 (1.1MB, without CLIP weights)

Training

See the experiment folder for the YAML definitions of the training configurations. The training code is in experiment_setup.py.

Usage of PFENet Wrappers

To use the dataset and model wrappers for PFENet, the PFENet repository needs to be cloned to the root folder:

git clone https://github.com/Jia-Research-Lab/PFENet.git

Citation

@article{lueddecke21,
    title={Prompt-Based Multi-Modal Image Segmentation},
    author={Timo Lüddecke and Alexander Ecker},
    journal={arXiv preprint arXiv:2112.10003},
    year={2021}
}

Owner

Timo Lüddecke
