💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

Related tags

Deep LearningVALSE
Overview

VALSE 💃

💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena. https://arxiv.org/abs/2112.07566

Data Instructions

Please find the data in the data folder. The dataset is in json format and contains the following relevant fields:

  • A reference to the image in the original dataset: dataset and image_file.
  • The valid sentence, the caption for VALSE: caption.
  • The altered caption, the foil.
  • The annotator's votes (3 annotators per sample): mturk.
    • The subentry caption counts the number of annotators who chose the caption, but/and not the foil, to be the one describing the image.
    • The subentry foil counts how many of the three annotators chose the foil to be (also) describing the image.
    • For more information, see subsec. 4.4 and App. E of the paper.

‼️ Please be aware that the jsons are containing both valid (meaning: validated by annotators) and non-validated samples. In order to work only with the valid set, please consider filtering them:

We consider a valid foil to mean: at least two out of three annotators identified the caption, but not the foil, as the text which accurately describes the image.

This means that the valid samples of the dataset are the ones where sample["mturk"]["caption"] >= 2.

Example instance:

{
    "actions_test_0": {
        "dataset": "SWiG",
        "original_split": "test",                 # the split of the original dataset in which the sample belonged to
        "dataset_idx": "exercising_255.jpg",      # the sample id in the original dataset
        "linguistic_phenomena": "actions",        # the linguistic phenomenon targeted
        "image_file": "exercising_255.jpg",
        "caption": "A man exercises his torso.",
        "classes": "man",                         # the word of the caption that was replaced
        "classes_foil": "torso",                  # the foil word / phrase
        "mturk": {
            "foil": 0,
            "caption": 3,
            "other": 0
        },
        "foil": "A torso exercises for a man."
    }, ...
}

Images

For the images, please follow the downloading instructions of the respective original dataset. The provenance of the original images is mentioned in the json files in the field dataset.

Reference

Please cite our 💃 VALSE paper if you are using this dataset.

@misc{parcalabescu2021valse,
      title={VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena}, 
      author={Letitia Parcalabescu and Michele Cafagna and Lilitta Muradjan and Anette Frank and Iacer Calixto and Albert Gatt},
      year={2021},
      eprint={2112.07566},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
Owner
Heidelberg-NLP
Heidelberg Natural Language Processing Group
Heidelberg-NLP
PyTorch implementation for paper Neural Marching Cubes.

NMC PyTorch implementation for paper Neural Marching Cubes, Zhiqin Chen, Hao Zhang. Paper | Supplementary Material (to be updated) Citation If you fin

Zhiqin Chen 109 Dec 27, 2022
A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku.

Automatic_Background_Remover A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku. 👉 https:

Gaurav 16 Oct 29, 2022
Graph Convolutional Networks in PyTorch

Graph Convolutional Networks in PyTorch PyTorch implementation of Graph Convolutional Networks (GCNs) for semi-supervised classification [1]. For a hi

Thomas Kipf 4.5k Dec 31, 2022
Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)

rethink-audio-fsl This repo contains the source code for the paper "Who calls the shots? Rethinking Few-Shot Learning for Audio." (WASPAA 2021) Table

Yu Wang 34 Dec 24, 2022
A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation

Paper Khoi Nguyen, Sinisa Todorovic "A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation", accepted to ICCV 2021 Our code is mai

Khoi Nguyen 5 Aug 14, 2022
A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

3d-pose-baseline This is the code for the paper Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3

Julieta Martinez 1.3k Jan 03, 2023
A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL

🌟 HNSW + PostgreSQL Indexer HNSWPostgreSQLIndexer Jina is a production-ready, scalable Indexer for the Jina neural search framework. It combines the

Jina AI 25 Oct 14, 2022
ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Voice2Series-Reprogramming Voice2Series: Reprogramming Acoustic Models for Time Series Classification International Conference on Machine Learning (IC

49 Jan 03, 2023
Create animations for the optimization trajectory of neural nets

Animating the Optimization Trajectory of Neural Nets loss-landscape-anim lets you create animated optimization path in a 2D slice of the loss landscap

Logan Yang 81 Dec 25, 2022
Ontologysim: a Owlready2 library for applied production simulation

Ontologysim: a Owlready2 library for applied production simulation Ontologysim is an open-source deep production simulation framework, with an emphasi

10 Nov 30, 2022
Addition of pseudotorsion caclulation eta, theta, eta', and theta' to barnaba package

Addition to Original Barnaba Code: This is modified version of Barnaba package to calculate RNA pseudotorsion angles eta, theta, eta', and theta'. Ple

Mandar Kulkarni 1 Jan 11, 2022
The code is an implementation of Feedback Convolutional Neural Network for Visual Localization and Segmentation.

Feedback Convolutional Neural Network for Visual Localization and Segmentation The code is an implementation of Feedback Convolutional Neural Network

19 Dec 04, 2022
PyTorch implementations of the beta divergence loss.

Beta Divergence Loss - PyTorch Implementation This repository contains code for a PyTorch implementation of the beta divergence loss. Dependencies Thi

Billy Carson 7 Nov 09, 2022
TLoL (Python Module) - League of Legends Deep Learning AI (Research and Development)

TLoL-py - League of Legends Deep Learning Library TLoL-py is the Python component of the TLoL League of Legends deep learning library. It provides a s

7 Nov 29, 2022
ROS-UGV-Control-Interface - Control interface which can be used in any UGV

ROS-UGV-Control-Interface Cam Closed: Cam Opened:

Ahmet Fatih Akcan 1 Nov 04, 2022
Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.

Deformable Butterfly: A Highly Structured and Sparse Linear Transform DeBut Advantages DeBut generalizes the square power of two butterfly factor matr

Rui LIN 8 Jun 10, 2022
Keras Implementation of Neural Style Transfer from the paper "A Neural Algorithm of Artistic Style"

Neural Style Transfer & Neural Doodles Implementation of Neural Style Transfer from the paper A Neural Algorithm of Artistic Style in Keras 2.0+ INetw

Somshubra Majumdar 2.2k Dec 31, 2022
The source codes for ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'

BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data This repository provides the implementation details for

124 Dec 27, 2022
Dynamic Capacity Networks using Tensorflow

Dynamic Capacity Networks using Tensorflow Dynamic Capacity Networks (DCN; http://arxiv.org/abs/1511.07838) implementation using Tensorflow. DCN reduc

Taeksoo Kim 8 Feb 23, 2021
LONG-TERM SERIES FORECASTING WITH QUERYSELECTOR – EFFICIENT MODEL OF SPARSEATTENTION

Query Selector Here you can find code and data loaders for the paper https://arxiv.org/pdf/2107.08687v1.pdf . Query Selector is a novel approach to sp

MORAI 62 Dec 17, 2022