PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Last update: Dec 11, 2022

Overview

2017 VQA Challenge Winner (CVPR'17 Workshop)

PyTorch implementation of "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" by Teney et al.

Prerequisites

python 3.6+
numpy
pytorch 0.4
tqdm
nltk
pandas
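
Before running the scripts, it can help to confirm that the packages above are importable. The snippet below is a minimal sketch, not part of this repo: the function names are hypothetical, and the import names are assumed (pytorch is imported as `torch`).

```python
import importlib.util
import sys

# Import names assumed for the packages listed above
# (pytorch is imported as "torch").
REQUIRED_MODULES = ["numpy", "torch", "tqdm", "nltk", "pandas"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(min_version=(3, 6)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


if __name__ == "__main__":
    if not python_ok():
        print("python 3.6+ is required")
    for name in missing_modules():
        print("missing package:", name)
```

This only checks that the modules can be found; it does not check package versions.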

Data

Preparation

To download and extract the VQA v2 data, GloVe embeddings, and pretrained visual features:
```
bash scripts/download_extract.sh
```
To prepare data for training:
```
python scripts/preproc.py
```

The structure of data/ directory should look like this:

- data/
  - zips/
    - v2_XXX...zip
    - ...
    - glove...zip
    - trainval_36.zip
  - glove/
    - glove...txt
    - ...
  - v2_XXX.json
  - ...
  - trainval_resnet...tsv
  (The above are files created after executing scripts/download_extract.sh)
  - tokenizers/
    - ...
  - dict_ans.pkl
  - dict_q.pkl
  - glove_pretrained_300.npy
  - train_qa.pkl
  - val_qa.pkl
  - train_vfeats.pkl
  - val_vfeats.pkl
  (The above are files created after executing scripts/preproc.py)
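
To check that preprocessing produced the files listed above, something like the following could be used. It is a sketch rather than a script shipped with this repo: `missing_outputs` is a hypothetical helper, and it only checks the entries whose names are given in full in the listing (not tokenizers/ or the elided names).

```python
from pathlib import Path

# File names copied from the directory listing above; entries written
# with "..." in the listing are deliberately not checked.
PREPROC_OUTPUTS = [
    "dict_ans.pkl",
    "dict_q.pkl",
    "glove_pretrained_300.npy",
    "train_qa.pkl",
    "val_qa.pkl",
    "train_vfeats.pkl",
    "val_vfeats.pkl",
]


def missing_outputs(data_dir="data"):
    """Return the expected preprocessing outputs that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in PREPROC_OUTPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("missing files:", ", ".join(missing))
    else:
        print("all listed preprocessing outputs are present")
```

An empty result means every listed file exists; it says nothing about whether the contents are valid.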

Train

Use default parameters:

bash scripts/train.sh

Notes

Major refactor (especially of data preprocessing), tested with pytorch 0.4.1 and python 3.6.
Training for 20 epochs reaches around 50% training accuracy (the model seems to be buggy in my implementation).
After all the preprocessing, the data/ directory may take up 38G or more.
Parts of preproc.py and utils.py are based on this repo.
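
Given the size of data/ noted above, a quick free-space check before downloading may save time. This is a minimal sketch: `has_enough_space` is a hypothetical helper, and reading "38G" as 38 GiB is an assumption.

```python
import shutil

# Taken from the note above; interpreting "38G" as GiB is an assumption,
# and the note says the real footprint may be larger.
REQUIRED_GIB = 38


def has_enough_space(path=".", required_gib=REQUIRED_GIB):
    """Return True if the filesystem holding path has at least required_gib free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * 1024 ** 3


if __name__ == "__main__":
    if not has_enough_space():
        print("less than", REQUIRED_GIB, "GiB free; data/ may not fit")
```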


Owner

Mark Dong
