PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Last update: Dec 11, 2022

Overview

2017 VQA Challenge Winner (CVPR'17 Workshop)

PyTorch implementation of "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" by Teney et al.

Prerequisites

Python 3.6+
numpy
PyTorch 0.4
tqdm
nltk
pandas
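
A quick way to confirm that the packages above are installed is to probe for them before running any script. This is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes the import names are `numpy`, `torch`, `tqdm`, `nltk`, and `pandas`. It does not check versions.

```python
import importlib.util

# Import names for the prerequisites listed above (PyTorch is imported as "torch").
REQUIRED = ["numpy", "torch", "tqdm", "nltk", "pandas"]


def missing_packages(names=REQUIRED):
    """Return the entries of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

An empty list means every listed package can be located; anything returned still needs to be installed.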

Data

Preparation

To download and extract VQA v2, GloVe, and the pretrained visual features:
```
bash scripts/download_extract.sh
```
To prepare data for training:
```
python scripts/preproc.py
```

The data/ directory should then look like this:

- data/
  - zips/
    - v2_XXX...zip
    - ...
    - glove...zip
    - trainval_36.zip
  - glove/
    - glove...txt
    - ...
  - v2_XXX.json
  - ...
  - trainval_resnet...tsv
  (The above are files created after executing scripts/download_extract.sh)
  - tokenizers/
    - ...
  - dict_ans.pkl
  - dict_q.pkl
  - glove_pretrained_300.npy
  - train_qa.pkl
  - val_qa.pkl
  - train_vfeats.pkl
  - val_vfeats.pkl
  (The above are files created after executing scripts/preproc.py)
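
Since preprocessing takes a while, it can help to check that the expected outputs exist before starting training. The sketch below is an illustration only: the helper `missing_outputs` is hypothetical, the file names are copied from the listing above, and it assumes the outputs sit directly under data/. It tests for presence only and does not validate file contents.

```python
from pathlib import Path

# File names taken from the listing above (outputs of scripts/preproc.py).
PREPROC_OUTPUTS = [
    "dict_ans.pkl",
    "dict_q.pkl",
    "glove_pretrained_300.npy",
    "train_qa.pkl",
    "val_qa.pkl",
    "train_vfeats.pkl",
    "val_vfeats.pkl",
]


def missing_outputs(data_dir="data"):
    """Return the expected preprocessing outputs that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in PREPROC_OUTPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All preprocessing outputs found.")
```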

Train

Use default parameters:

bash scripts/train.sh

Notes

Huge refactor (especially of data preprocessing), tested with PyTorch 0.4.1 and Python 3.6.
Training for 20 epochs reaches around 50% training accuracy (the model seems buggy in my implementation).
After all the preprocessing, the data/ directory may take up 38 GB or more.
Parts of preproc.py and utils.py are based on this repo
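
To see how much space data/ actually occupies on a given machine, the directory can be measured directly. This is a rough sketch with a hypothetical helper name, `dir_size_gib`; it reports binary gibibytes, which may differ slightly from the unit meant by the figure in the note above.

```python
import os


def dir_size_gib(path="data"):
    """Sum the sizes of all regular files under `path`, in GiB (2**30 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symbolic links so linked files are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total / 2**30


if __name__ == "__main__":
    print(f"data/ uses {dir_size_gib('data'):.1f} GiB")
```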


Owner

Mark Dong
