A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation

Last update: Aug 14, 2022

Paper

Khoi Nguyen, Sinisa Todorovic, "A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation", accepted to ICCV 2021.

Our code is mainly based on the code released with the paper: Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, Chen Change Loy, "Self-Supervised Scene De-occlusion".

Requirements

PyTorch >= 0.4.1. Install the Python dependencies listed in requirements.txt:
```
pip install -r requirements.txt
```
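To confirm that an installed PyTorch satisfies the `>=0.4.1` requirement, a small check along these lines may help. This is only a sketch: the helper names `parse_version` and `meets_minimum` are hypothetical and not part of this repository, and the parsing keeps just the leading numeric components of the version string.

```python
import re


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1).

    Hypothetical helper: drops any local build suffix after '+' and
    stops at the first component that does not start with a digit.
    """
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="0.4.1"):
    """Return True if `installed` is at least `minimum` (tuple comparison)."""
    return parse_version(installed) >= parse_version(minimum)
```

With PyTorch installed, calling `meets_minimum(torch.__version__)` after `import torch` should indicate whether the environment is recent enough.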

Data Preparation

COCOA dataset proposed in Semantic Amodal Segmentation.

Download COCO2014 train and val images from here and unzip.
Download COCOA annotations from here and untar.

Ensure the COCOA folder looks like:

COCOA/
  |-- train2014/
  |-- val2014/
  |-- annotations/
    |-- COCO_amodal_train2014.json
    |-- COCO_amodal_val2014.json
    |-- COCO_amodal_test2014.json
    |-- ...
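Before creating the symbolic link, it can be useful to check that the folder matches the listing above. The following is a minimal sketch, not part of the repository: `COCOA_LAYOUT` and `missing_entries` are hypothetical names, and only the entries shown explicitly in the listing are checked.

```python
from pathlib import Path

# Entries copied from the folder listing above, relative to the COCOA root.
# Hypothetical constant; extend it if your copy contains further files.
COCOA_LAYOUT = [
    "train2014",
    "val2014",
    "annotations/COCO_amodal_train2014.json",
    "annotations/COCO_amodal_val2014.json",
    "annotations/COCO_amodal_test2014.json",
]


def missing_entries(root, expected):
    """Return the relative paths from `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_entries("/path/to/COCOA", COCOA_LAYOUT)` returns an empty list when every listed folder and file is present.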

Create a symbolic link:

cd deocclusion
mkdir data
cd data
ln -s /path/to/COCOA

KINS dataset proposed in Amodal Instance Segmentation with KINS Dataset.

Download left color images of object data in KITTI dataset from here and unzip.
Download KINS annotations from here corresponding to this commit.

Ensure the KINS folder looks like:

KINS/
  |-- training/image_2/
  |-- testing/image_2/
  |-- instances_train.json
  |-- instances_val.json
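To sanity-check a downloaded annotation file such as `instances_train.json`, one option is to load it and look at its top-level structure. The sketch below assumes only that the file is a JSON object at the top level; it does not assume any particular keys, and the function names are hypothetical.

```python
import json


def summarize_annotation(data):
    """Map each top-level key to the size of its value.

    Lists and dicts are reported by length; any other value maps to None.
    """
    summary = {}
    for key, value in data.items():
        summary[key] = len(value) if isinstance(value, (list, dict)) else None
    return summary


def summarize_annotation_file(path):
    """Load a JSON annotation file and summarize its top-level keys."""
    with open(path, "r") as handle:
        return summarize_annotation(json.load(handle))
```

Calling `summarize_annotation_file("/path/to/KINS/instances_train.json")` gives a quick view of what the file contains without printing the whole annotation set.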

Create a symbolic link:

cd deocclusion/data
ln -s /path/to/KINS

Train

Train PCNet-M

Run the training script (using COCOA as an example):
```
./train_pcnet_m_std_no_rgb_gaussian.sh
```
Monitor training status and visual results with TensorBoard:
```
sh tensorboard.sh $PORT
```

Evaluate

Run the evaluation script:
```
./test_pcnet_m.sh
```

BibTeX

@InProceedings{Nguyen_2021_ICCV,
    author    = {Nguyen, Khoi and Todorovic, Sinisa},
    title     = {A Weakly Supervised Amodal Segmenter With Boundary Uncertainty Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {7396-7405}
}

Acknowledgement

We developed our approach based on the code from https://github.com/XiaohangZhan/deocclusion/
We used the code and models of GCA-Matting in our demo.
We modified some code from pytorch-inpainting-with-partial-conv to train PCNet-C.
