A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

Related tags

Deep LearningIconQA
Overview

IconQA

License: CC BY-SA 4.0

About

IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and comprehensive cognitive reasoning in real-world problems.

iconqa examples

There are three different sub-tasks in IconQA:

  • 57,672 image choice MC questions
  • 31,578 text chioce MC questions
  • 18,189 fill-in-the-blank questions
Sub-Tasks Train Validation Test Total
Multi-image-choice 34,603 11,535 11,535 57,672
Multi-text-choice 18,946 6,316 6,316 31,578
Filling-in-the-blank 10,913 3,638 3,638 18,189

In addition to IconQA, we also present Icon645, a large-scale dataset of icons that cover a wide range of objects:

  • 645,687 colored icons
  • 377 different icon classes

icon_examples

For more details, you can find our website here and our paper here.

Download

Our dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please read the license before you use, change, or share our dataset.

You can download IconQA here. Or run the commands by:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip
unzip iconqa.zip

You can download Icon645 here. Or run the commands by:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/icon645.zip
unzip icon645.zip

File structures for the IconQA dataset:

IconQA
|   LICENSE.md
|   metadata.json
|   pid2skills.json
|   pid_splits.json
|   problems.json
|   skills.json
└───test
│   │
│   └───choose_img
│   |   |
│   |   └───question_id
│   |   |   |   image.png
|   |   |   |   data.json
|   |   |   |   choice_0.png
|   |   |   |   choice_1.png
|   |   |   |   ...
|   |   |
|   |   └───question_id
|   |   |   ...
|   |   
|   └───choose_txt
|   |   |  
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   | 
|   |   └───question_id
|   |   |   ...
|   |
|   └───fill_in_blank
|       |  
|       └───question_id
|       |   |   image.png
|       |   |   data.json
|       | 
|       └───question_id
|       |   ...
|   
└───train
|   |   same as test
|   
└───val
    |   same as test

File structures for the Icon645 dataset:

Icon645
|   LICENCE.md
|   metadata.json
└───colored_icons_final
    |
    └───acorn
    |   |   image_id1.png
    |   |   image_id2.png
    |   |   ...
    |   
    └───airplane
    |   |   image_id3.png
    |   |   ...
    |      
    |   ...

Citation

If the paper or the dataset inspires you, please cite us:

@inproceedings{lu2021iconqa,
  title = {IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning},
  author = {Lu, Pan and Qiu, Liang and Chen, Jiaqi and Xia, Tony and Zhao, Yizhou and Zhang, Wei and Yu, Zhou and Liang, Xiaodan and Zhu, Song-Chun},
  booktitle = {Submitted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks},
  year = {2021}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

CC BY-NC-SA 4.0

Owner
Pan Lu
Computer Science
Pan Lu
LibFewShot: A Comprehensive Library for Few-shot Learning.

LibFewShot Make few-shot learning easy. Supported Methods Meta MAML(ICML'17) ANIL(ICLR'20) R2D2(ICLR'19) Versa(NeurIPS'18) LEO(ICLR'19) MTL(CVPR'19) M

<a href=[email protected]&L"> 603 Jan 05, 2023
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Simple-Image-Classification Simple Image Classification Code (PyTorch) Yechan Kim This repository contains: Python3 / Pytorch code for multi-class ima

Yechan Kim 8 Oct 29, 2022
Simple Tensorflow implementation of Toward Spatially Unbiased Generative Models (ICCV 2021)

Spatial unbiased GANs — Simple TensorFlow Implementation [Paper] : Toward Spatially Unbiased Generative Models (ICCV 2021) Abstract Recent image gener

Junho Kim 16 Apr 15, 2022
FeTaQA: Free-form Table Question Answering

FeTaQA: Free-form Table Question Answering FeTaQA is a Free-form Table Question Answering dataset with 10K Wikipedia-based {table, question, free-form

Language, Information, and Learning at Yale 40 Dec 13, 2022
This repository contains the code used to quantitatively evaluate counterfactual examples in the associated paper.

On Quantitative Evaluations of Counterfactuals Install To install required packages with conda, run the following command: conda env create -f requi

Frederik Hvilshøj 1 Jan 16, 2022
Repository to run object detection on a model trained on an autonomous driving dataset.

Autonomous Driving Object Detection on the Raspberry Pi 4 Description of Repository This repository contains code and instructions to configure the ne

Ethan 51 Nov 17, 2022
Multimodal Descriptions of Social Concepts: Automatic Modeling and Detection of (Highly Abstract) Social Concepts evoked by Art Images

MUSCO - Multimodal Descriptions of Social Concepts Automatic Modeling of (Highly Abstract) Social Concepts evoked by Art Images This project aims to i

0 Aug 22, 2021
Keyword-BERT: Keyword-Attentive Deep Semantic Matching

project discription An implementation of the Keyword-BERT model mentioned in my paper Keyword-Attentive Deep Semantic Matching (Plz cite this github r

1 Nov 14, 2021
Deepfake Scanner by Deepware.

Deepware Scanner (CLI) This repository contains the command-line deepfake scanner tool with the pre-trained models that are currently used at deepware

deepware 110 Jan 02, 2023
Code for "Diffusion is All You Need for Learning on Surfaces"

Source code for "Diffusion is All You Need for Learning on Surfaces", by Nicholas Sharp Souhaib Attaiki Keenan Crane Maks Ovsjanikov NOTE: the linked

Nick Sharp 247 Dec 28, 2022
Fermi Problems: A New Reasoning Challenge for AI

Fermi Problems: A New Reasoning Challenge for AI Fermi Problems are questions whose answer is a number that can only be reasonably estimated as a prec

AI2 15 May 28, 2022
Structured Edge Detection Toolbox

################################################################### # # # Structure

Piotr Dollar 779 Jan 02, 2023
Select, weight and analyze complex sample data

Sample Analytics In large-scale surveys, often complex random mechanisms are used to select samples. Estimates derived from such samples must reflect

samplics 37 Dec 15, 2022
Convert dog pictures into various painting styles. Try LimnPet

LimnPet Cartoon stylization service project Try our service » Home page · Team notion · Members 목차 프로젝트 소개 프로젝트 목표 사용한 기술스택과 수행도구 팀원 구현 기능 주요 기능 추가 기능

LiJell 7 Jul 14, 2022
Reverse engineering Rosetta 2 in M1 Mac

Project Champollion About this project Rosetta 2 is an emulation mechanism to run the x86_64 applications on Arm-based Apple Silicon with Ahead-Of-Tim

FFRI Security, Inc. 258 Jan 07, 2023
Datasets, tools, and benchmarks for representation learning of code.

The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided

GitHub 1.8k Dec 25, 2022
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

Food Drinks and groceries Images Multi Lingual (FooDI-ML) dataset.

41 Jan 04, 2023
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

Antoine Yang 108 Dec 27, 2022