A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

Related tags

Deep LearningPNG
Overview

❇️   ❇️     Please visit our Project Page to learn more about Panoptic Narrative Grounding.    ❇️   ❇️

Panoptic Narrative Grounding

This repository provides a PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral). Panoptic Narrative Grounding is a spatially fine and general formulation of the natural language visual grounding problem. We establish an experimental framework for the study of this new task, including new ground truth and metrics, and we propose a strong baseline method to serve as stepping stone for future work. We exploit the intrinsic semantic richness in an image by including panoptic categories, and we approach visual grounding at a fine-grained level by using segmentations. In terms of ground truth, we propose an algorithm to automatically transfer Localized Narratives annotations to specific regions in the panoptic segmentations of the MS COCO dataset. The proposed baseline achieves a performance of 55.4 absolute Average Recall points. This result is a suitable foundation to push the envelope further in the development of methods for Panoptic Narrative Grounding.

Paper

Panoptic Narrative Grounding,
Cristina González1, Nicolás Ayobi1, Isabela Hernández1, José Hernández 1, Jordi Pont-Tuset2, Pablo Arbeláez1
ICCV 2021 Oral.

1 Center for Research and Formation in Artificial Intelligence (CINFONIA) , Universidad de Los Andes.
2 Google Research, Switzerland.

Installation

Requirements

  • Python
  • Numpy
  • Pytorch 1.7.1
  • Tqdm 4.56.0
  • Scipy 1.5.3

Cloning the repository

$ git clone [email protected]:BCV-Uniandes/PNG.git
$ cd PNG

Dataset Preparation

Panoptic Marrative Grounding Benchmark

  1. Download the 2017 MSCOCO Dataset from its official webpage. You will need the train and validation splits' images1 and panoptic segmentations annotations.

  2. Download the Panoptic Narrative Grounding Benchmark and pre-computed features from our project webpage with the following folders structure:

panoptic_narrative_grounding
|_ images
|  |_ train2017
|  |_ val2017
|_ features
|  |_ train2017
|  |  |_ mask_features
|  |  |_ sem_seg_features
|  |  |_ panoptic_seg_predictions
|  |_ val2017
|     |_ mask_features
|     |_ sem_seg_features
|     |_ panoptic_seg_predictions
|_ annotations
   |_ png_coco_train2017.json
   |_ png_coco_val2017.json
   |_ panoptic_segmentation
      |_ train2017
      |_ val2017

Train setup:

Modify the routes in train_net.sh according to your local paths.

python main --init_method "tcp://localhost:8080" NUM_GPUS 1 DATA.PATH_TO_DATA_DIR path_to_your_data_dir DATA.PATH_TO_FEATURES_DIR path_to_your_features_dir OUTPUT_DIR output_dir

Test setup:

Modify the routes in test_net.sh according to your local paths.

python main --init_method "tcp://localhost:8080" NUM_GPUS 1 DATA.PATH_TO_DATA_DIR path_to_your_data_dir DATA.PATH_TO_FEATURES_DIR path_to_your_features_dir OUTPUT_DIR output_dir TRAIN.ENABLE "False"

Pretrained model

To reproduce all our results as reported bellow, you can use our pretrained model and our source code.

Method things + stuff things stuff
Oracle 64.4 67.3 60.4
Ours 55.4 56.2 54.3
MCN - 48.2 -
Method singulars + plurals singulars plurals
Oracle 64.4 64.8 60.7
Ours 55.4 56.2 48.8

Citation

If you find Panoptic Narrative Grounding useful in your research, please use the following BibTeX entry for citation:

@inproceedings{gonzalez2021png,
  title={Panoptic Narrative Grounding},
  author={Gonz{\'a}lez, Cristina and Ayobi, Nicol{'\a}s and Hern{\'a}ndez, Isabela and Hern{\'a}ndez, Jose and Pont-Tuset, Jordi and Arbel{\'a}ez, Pablo},
  booktitle={ICCV},
  year={2021}
}
Owner
Biomedical Computer Vision @ Uniandes
Our field of research is computer vision, the area of artificial intelligence seeking automated understanding of visual information
Biomedical Computer Vision @ Uniandes
Code of paper "CDFI: Compression-Driven Network Design for Frame Interpolation", CVPR 2021

CDFI (Compression-Driven-Frame-Interpolation) [Paper] (Coming soon...) | [arXiv] Tianyu Ding*, Luming Liang*, Zhihui Zhu, Ilya Zharkov IEEE Conference

Tianyu Ding 95 Dec 04, 2022
GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

GalaXC GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification @InProceedings{Saini21, author = {Saini, D. and Jain,

Extreme Classification 28 Dec 05, 2022
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 04, 2023
Informal Persian Universal Dependency Treebank

Informal Persian Universal Dependency Treebank (iPerUDT) Informal Persian Universal Dependency Treebank, consisting of 3000 sentences and 54,904 token

Roya Kabiri 0 Jan 05, 2022
Create Own QR code with Python

Create-Own-QR-code Create Own QR code with Python SO guys in here, you have to install pyqrcode 2. open CMD and type python -m pip install pyqrcode

JehanKandy 10 Jul 13, 2022
This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"

Word-Level Coreference Resolution This is a repository with the code to reproduce the experiments described in the paper of the same name, which was a

79 Dec 27, 2022
A Loss Function for Generative Neural Networks Based on Watson’s Perceptual Model

This repository contains the similarity metrics designed and evaluated in the paper, and instructions and code to re-run the experiments. Implementation in the deep-learning framework PyTorch

Steffen 86 Dec 27, 2022
Overview of architecture and implementation of TEDS-Net, as described in MICCAI 2021: "TEDS-Net: Enforcing Diffeomorphisms in Spatial Transformers to Guarantee TopologyPreservation in Segmentations"

TEDS-Net Overview of architecture and implementation of TEDS-Net, as described in MICCAI 2021: "TEDS-Net: Enforcing Diffeomorphisms in Spatial Transfo

Madeleine K Wyburd 14 Jan 04, 2023
[WWW 2021] Source code for "Graph Contrastive Learning with Adaptive Augmentation"

GCA Source code for Graph Contrastive Learning with Adaptive Augmentation (WWW 2021) For example, to run GCA-Degree under WikiCS, execute: python trai

Big Data and Multi-modal Computing Group, CRIPAC 97 Jan 07, 2023
Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

Martin.w-e 3 Dec 07, 2022
High-fidelity 3D Model Compression based on Key Spheres

High-fidelity 3D Model Compression based on Key Spheres This repository contains the implementation of the paper: Yuanzhan Li, Yuqi Liu, Yujie Lu, Siy

5 Oct 11, 2022
A library for differentiable nonlinear optimization.

Theseus A library for differentiable nonlinear optimization built on PyTorch to support constructing various problems in robotics and vision as end-to

Meta Research 1.1k Dec 30, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
Person Re-identification

Person Re-identification Final project of Computer Vision Table of content Person Re-identification Table of content Students: Proposed method Dataset

Nguyễn Hoàng Quân 4 Jun 17, 2021
Pytorch implementation of the unsupervised object discovery method LOST.

LOST Pytorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper: Localizing Objects with Self-Sup

Valeo.ai 189 Dec 25, 2022
Dark Finix: All in one hacking framework with almost 100 tools

Dark Finix - Hacking Framework. Dark Finix is a all in one hacking framework wit

Md. Nur habib 2 Feb 18, 2022
Code for the paper "Query Embedding on Hyper-relational Knowledge Graphs"

Query Embedding on Hyper-Relational Knowledge Graphs This repository contains the code used for the experiments in the paper Query Embedding on Hyper-

DimitrisAlivas 19 Jul 26, 2022
Torchlight2 lan game server tool - A message forwarding tool for Torchlight 2 lan game

Torchlight 2 Lan Game Server Tool A message forwarding tool for Torchlight 2 lan

Huaijun Jiang 3 Nov 01, 2022
A Broader Picture of Random-walk Based Graph Embedding

Random-walk Embedding Framework This repository is a reference implementation of the random-walk embedding framework as described in the paper: A Broa

Zexi Huang 23 Dec 13, 2022
Code for intrusion detection system (IDS) development using CNN models and transfer learning

Intrusion-Detection-System-Using-CNN-and-Transfer-Learning This is the code for the paper entitled "A Transfer Learning and Optimized CNN Based Intrus

Western OC2 Lab 38 Dec 12, 2022