This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

Related tags

Deep LearningxGQA
Overview

xGQA

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

xGQA builds on the original work of Hudson et al. 2019: GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering. The training data can be downloaded here.

Overview

The repository is structured as follows:

  • data/zero_shot/ contains the xGQA test-dev files for all 8 languages
  • data/few_shot/ contains the new standard splits for few shot learning. The number in the file name indicates how many distinct images the split includes. i.e. train_10.json implies that this subset contains questions about 10 distinct images.

Training Data

Please download the English training data of GQA (Hudson et al. 2019) here.

Zero-Shot Results

Zero-shot transfer results on xGQA when transferring from English GQA. Average accuracy is reported. Mean scores are not averaged over the source language (English).

model en de pt ru id bn ko zh mean
M3P 58.43 23.93 24.37 20.37 22.57 15.83 16.90 18.60 20.37
OSCAR+Emb 62.23 17.35 19.25 10.52 18.26 14.93 17.10 16.41 16.26
OSCAR+Ada 60.30 18.91 27.02 17.50 18.77 15.42 15.28 14.96 18.27
mBERTAda 56.25 29.76 30.37 24.42 19.15 15.12 19.09 24.86 23.25

Few-Shot

Few-shot dataset sizes. The GQA test-dev set is split into new development, test sets, and training splits of different sizes. We maintain the distribution of structural types in each split.

Set Test Dev Train
#Images 300 50 1 5 10 20 25 48
#Questions 9666 1422 27 155 317 594 704 1490

Citation

If you find this repository helpful, please cite our paper "xGQA: Cross-lingual Visual Question Answering":

@article{pfeiffer-etal-2021-xGQA,
    title={{xGQA: Cross-Lingual Visual Question Answering}},
    author={ Jonas Pfeiffer and Gregor Geigle and Aishwarya Kamath and Jan-Martin O. Steitz and Stefan Roth and Ivan Vuli{\'{c}} and Iryna Gurevych},
    journal = "arXiv preprint", 
    year = "2021",  
    url = "https://arxiv.org/pdf/2109.06082.pdf"
}

Shield: CC BY 4.0

This work is licensed under a Creative Commons Attribution 4.0 International License.

CC BY 4.0

Owner
AdapterHub
AdapterHub
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
Official implementation of Few-Shot and Continual Learning with Attentive Independent Mechanisms

Few-Shot and Continual Learning with Attentive Independent Mechanisms This repository is the official implementation of Few-Shot and Continual Learnin

Chikan_Huang 25 Dec 08, 2022
An University Project of Quera Web Crawling.

WebCrawlerProject An University Project of Quera Web Crawling. خزشگر اینستاگرام در این پروژه شما باید با استفاده از کتابخانه های زیر یک خزشگر اینستاگر

Mahdi 3 Aug 12, 2022
A little software to generate and save Julia or Mandelbrot's Fractals.

Julia-Mandelbrot-s-Fractals A little software to generate and save Julia or Mandelbrot's Fractals. Dependencies : Python 3.7 or more. (Also possible t

Olivier 0 Jul 09, 2022
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
Certis - Certis, A High-Quality Backtesting Engine

Certis - Backtesting For y'all Certis is a powerful, lightweight, simple backtes

Yeachan-Heo 46 Oct 30, 2022
🇰🇷 Text to Image in Korean

KoDALLE Utilizing pretrained language model’s token embedding layer and position embedding layer as DALLE’s text encoder. Background Training DALLE mo

HappyFace 74 Sep 22, 2022
A vision library for performing sliced inference on large images/small objects

SAHI: Slicing Aided Hyper Inference A vision library for performing sliced inference on large images/small objects Overview Object detection and insta

Open Business Software Solutions 2.3k Jan 04, 2023
Offline Reinforcement Learning with Implicit Q-Learning

Offline Reinforcement Learning with Implicit Q-Learning This repository contains the official implementation of Offline Reinforcement Learning with Im

Ilya Kostrikov 126 Jan 06, 2023
This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021.

MCGC Description This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021. Datasets Results ACM DBLP IMDB Amazon photos Amazon co

31 Nov 14, 2022
NovelD: A Simple yet Effective Exploration Criterion

NovelD: A Simple yet Effective Exploration Criterion Intro This is an implementation of the method proposed in NovelD: A Simple yet Effective Explorat

29 Dec 05, 2022
Tools for investing in Python

InvestOps Original repository on GitHub Original author is Magnus Erik Hvass Pedersen Introduction This is a Python package with simple and effective

24 Nov 26, 2022
最新版本yolov5+deepsort目标检测和追踪,支持5.0版本可训练自己数据集

使用YOLOv5+Deepsort实现车辆行人追踪和计数,代码封装成一个Detector类,更容易嵌入到自己的项目中。

422 Dec 30, 2022
Official PyTorch implemention of our paper "Learning to Rectify for Robust Learning with Noisy Labels".

WarPI The official PyTorch implemention of our paper "Learning to Rectify for Robust Learning with Noisy Labels". Run python main.py --corruption_type

Haoliang Sun 3 Sep 03, 2022
Pytorch implementation of BRECQ, ICLR 2021

BRECQ Pytorch implementation of BRECQ, ICLR 2021 @inproceedings{ li&gong2021brecq, title={BRECQ: Pushing the Limit of Post-Training Quantization by Bl

Yuhang Li 148 Dec 28, 2022
Code for "Steerable Pyramid Transform Enables Robust Left Ventricle Quantification"

Code for "Steerable Pyramid Transform Enables Robust Left Ventricle Quantification" This is an end-to-end framework for accurate and robust left ventr

2 Jul 09, 2022
Tensorflow 2.x implementation of Vision-Transformer model

Vision Transformer Unofficial Tensorflow 2.x implementation of the Transformer based Image Classification model proposed by the paper AN IMAGE IS WORT

Soumik Rakshit 16 Jul 20, 2022
Pytorch implementation of forward and inverse Haar Wavelets 2D

Pytorch implementation of forward and inverse Haar Wavelets 2D

Sergei Belousov 9 Oct 30, 2022
Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition (NeurIPS 2019)

MLCR This is the source code for paper Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition. Xuesong Niu, Hu Han, Shiguang

Edson-Niu 60 Nov 29, 2022
Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022