Instance-wise Occlusion and Depth Orders in Natural Scenes (CVPR 2022)

Overview

Official source code for the paper, which appears at CVPR 2022.

This repository provides a new dataset, InstaOrder, for understanding the geometric relationships between instances in an image. The dataset consists of 2.9M annotations of geometric orderings for class-labeled instances in 101K natural scenes. The scenes were annotated by 3,659 crowd-workers with (1) an occlusion order that identifies the occluder and occludee and (2) a depth order that describes ordinal relations based on relative distance from the camera. This repository also introduces a geometric order prediction network, InstaOrderNet, which the paper reports to outperform state-of-the-art approaches.

Installation

This code was developed with Anaconda (Python 3.6), PyTorch 1.7.1, torchvision 0.8.2, and CUDA 10.1. Set up the environment as follows:

# build conda environment
conda create --name order python=3.6
conda activate order

# install requirements
pip install -r requirements.txt

# install COCO API
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
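
After installing, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the names `EXPECTED`, `base_version`, `version_mismatches`, and `installed_versions` are illustrative, and the check assumes each package exposes a `__version__` attribute (as torch and torchvision do).

```python
import importlib

# Versions named in this README; the dict itself is an illustrative helper.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def base_version(version):
    """Drop a local build suffix such as '+cu101' from a version string."""
    return version.split("+")[0]


def version_mismatches(installed, expected=EXPECTED):
    """Return {package: (installed, expected)} for packages whose versions differ.

    A package that is missing from `installed` is reported with None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Import each package if possible and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.import_module(name).__version__
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(EXPECTED)
    for name, (have, wanted) in version_mismatches(found).items():
        print(f"{name}: found {have}, README lists {wanted}")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the environment the code was developed under.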

Visualization

Open InstaOrder_vis.ipynb to visualize the InstaOrder dataset, including instance masks, occlusion order, and depth order.

Training

The experiments folder contains the training and test scripts for the experiments reported in the paper.

To train {MODEL} with {DATASET},

  1. Download {DATASET} as described in the Datasets section below.
  2. Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml
  3. (Optional) To train InstaDepthNet, download the MiDaS-v2.1 model (model-f6b98070.pt) to ${base_dir}/data/out/InstaOrder_ckpt
  4. Run the script file as follows:
    sh experiments/{DATASET}/{MODEL}/train.sh
    
    # Example: training InstaOrderNet^o (Table 3 in the main paper) from scratch
    sh experiments/InstaOrder/InstaOrderNet_o/train.sh
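
The script path follows the pattern experiments/{DATASET}/{MODEL}/train.sh, so it can be built programmatically when launching several experiments. The sketch below is an illustration only: `experiment_script` and `run_experiment` are hypothetical helper names, and the code assumes it is run from a machine where `sh` is available and `repo_root` points at the repository checkout.

```python
import subprocess
from pathlib import PurePosixPath


def experiment_script(dataset, model, mode="train"):
    """Return the relative path experiments/{DATASET}/{MODEL}/{mode}.sh."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    return str(PurePosixPath("experiments") / dataset / model / f"{mode}.sh")


def run_experiment(dataset, model, mode="train", repo_root="."):
    """Run the script with sh from the repository root.

    Raises subprocess.CalledProcessError if the script exits non-zero.
    """
    script = experiment_script(dataset, model, mode)
    subprocess.run(["sh", script], cwd=repo_root, check=True)
```

For example, `run_experiment("InstaOrder", "InstaOrderNet_o")` would invoke the same script as the shell example above.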

Inference

  1. Download the pretrained models InstaOrder_ckpt.zip (3.5G) and unzip them into the structure below. Pretrained models are named {DATASET}_{MODEL}.pth.tar.

    ${base_dir}
    |--data
    |    |--out
    |    |    |--InstaOrder_ckpt
    |    |    |    |--COCOA_InstaOrderNet_o.pth.tar
    |    |    |    |--COCOA_OrderNet.pth.tar
    |    |    |    |--COCOA_pcnet_m.pth.tar
    |    |    |    |--InstaOrder_InstaDepthNet_d.pth.tar
    |    |    |    |--InstaOrder_InstaDepthNet_od.pth.tar
    |    |    |    |--InstaOrder_InstaOrderNet_d.pth.tar
    |    |    |    |--InstaOrder_InstaOrderNet_o.pth.tar
    |    |    |    |--InstaOrder_InstaOrderNet_od.pth.tar
    |    |    |    |--InstaOrder_OrderNet.pth.tar
    |    |    |    |--InstaOrder_OrderNet_ext.pth.tar  
    |    |    |    |--InstaOrder_pcnet_m.pth.tar
    |    |    |    |--KINS_InstaOrderNet_o.pth.tar
    |    |    |    |--KINS_OrderNet.pth.tar
    |    |    |    |--KINS_pcnet_m.pth.tar
    
  2. (Optional) To test InstaDepthNet, download the MiDaS-v2.1 model (model-f6b98070.pt) to ${base_dir}/data/out/InstaOrder_ckpt

  3. Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml

  4. To test {MODEL} with {DATASET}, run the script file as follows:

    sh experiments/{DATASET}/{MODEL}/test.sh
    
    # Example: reproducing the accuracy of InstaOrderNet^o (Table 3 in the main paper)
    sh experiments/InstaOrder/InstaOrderNet_o/test.sh
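
Before running a test script, it may be worth confirming that the expected checkpoint is in place. The sketch below derives the checkpoint location from the {DATASET}_{MODEL}.pth.tar naming and the directory structure in step 1; `checkpoint_path` and `missing_checkpoints` are hypothetical helpers, not functions provided by the repository.

```python
from pathlib import Path


def checkpoint_path(base_dir, dataset, model):
    """Build ${base_dir}/data/out/InstaOrder_ckpt/{DATASET}_{MODEL}.pth.tar."""
    return Path(base_dir) / "data" / "out" / "InstaOrder_ckpt" / f"{dataset}_{model}.pth.tar"


def missing_checkpoints(base_dir, pairs):
    """Return the checkpoint paths that do not exist for (dataset, model) pairs."""
    paths = [checkpoint_path(base_dir, dataset, model) for dataset, model in pairs]
    return [path for path in paths if not path.is_file()]
```

For instance, `missing_checkpoints(base_dir, [("InstaOrder", "InstaOrderNet_o")])` returns an empty list when the corresponding file from the listing above is present.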
    

Datasets

InstaOrder dataset

To use InstaOrder, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--COCO
|    |    |--train2017/
|    |    |--val2017/
|    |    |--annotations/
|    |    |    |--instances_train2017.json
|    |    |    |--instances_val2017.json
|    |    |    |--InstaOrder_train2017.json
|    |    |    |--InstaOrder_val2017.json    
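
A quick way to confirm the layout is to check each expected entry under ${base_dir}. The following is a minimal sketch under the assumption that the tree above is complete; `INSTAORDER_LAYOUT` and `missing_entries` are illustrative names, and the same function could be reused for the other datasets by passing a different list of relative paths.

```python
from pathlib import Path

# Relative paths taken from the InstaOrder tree above (illustrative constant).
INSTAORDER_LAYOUT = [
    "data/COCO/train2017",
    "data/COCO/val2017",
    "data/COCO/annotations/instances_train2017.json",
    "data/COCO/annotations/instances_val2017.json",
    "data/COCO/annotations/InstaOrder_train2017.json",
    "data/COCO/annotations/InstaOrder_val2017.json",
]


def missing_entries(base_dir, layout=INSTAORDER_LAYOUT):
    """Return the relative paths from `layout` that do not exist under base_dir."""
    base = Path(base_dir)
    return [relative for relative in layout if not (base / relative).exists()]


if __name__ == "__main__":
    # "." is a placeholder; pass the same ${base_dir} used in config.yaml.
    for relative in missing_entries("."):
        print(f"missing: {relative}")
```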

COCOA dataset

To use COCOA, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--COCO
|    |    |--train2014/
|    |    |--val2014/
|    |    |--annotations/
|    |    |    |--COCO_amodal_train2014.json 
|    |    |    |--COCO_amodal_val2014.json

KINS dataset

To use KINS, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--KINS
|    |    |--training/
|    |    |--testing/
|    |    |--instances_val.json
|    |    |--instances_train.json
  

DIW dataset

To use DIW, download the files and arrange them in the structure below:

${base_dir}
|--data
|    |--DIW
|    |    |--DIW_test/
|    |    |--DIW_Annotations
|    |    |    |--DIW_test.csv   

Citing InstaOrder

If you find this code or data useful in your research, please cite our paper:

@inproceedings{lee2022instaorder,
  title={{Instance-wise Occlusion and Depth Orders in Natural Scenes}},
  author={Hyunmin Lee and Jaesik Park},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

We have referred to and borrowed implementations from Xiaohang Zhan.
