Code for the paper "Adaptively Aligned Image Captioning via Adaptive Attention Time"

Last update: Aug 27, 2022

Overview

Adaptively Aligned Image Captioning via Adaptive Attention Time

This repository contains the implementation of Adaptively Aligned Image Captioning via Adaptive Attention Time (AAT).

Requirements

Python 3.6
Java 1.8.0
PyTorch 1.0
cider
coco-caption
tensorboardX

Training AAT

Prepare data (with Python 2)

See details in data/README.md.

Note: set word_count_threshold in scripts/prepro_labels.py to 4 to generate a vocabulary of size 10,369.
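
As a rough illustration of how a word-count threshold determines the vocabulary size, here is a minimal sketch. It is not the code in scripts/prepro_labels.py: the function names, the UNK token, and the strict "greater than" comparison are assumptions made for the example.

```python
from collections import Counter

UNK = "UNK"  # hypothetical placeholder token for rare words (assumption)


def build_vocab(captions, count_threshold):
    """Keep words seen more than count_threshold times; add UNK if any word was dropped."""
    counts = Counter(word for caption in captions for word in caption)
    # Assumption: a word is kept only if its count is strictly above the threshold.
    vocab = sorted(w for w, n in counts.items() if n > count_threshold)
    if len(vocab) < len(counts):
        vocab.append(UNK)
    return vocab


def encode(caption, vocab):
    """Replace out-of-vocabulary words with UNK."""
    known = set(vocab)
    return [w if w in known else UNK for w in caption]


# Toy tokenized captions, not taken from the COCO dataset.
captions = [
    ["a", "dog", "runs"],
    ["a", "dog", "sits"],
    ["a", "cat", "sits"],
]
vocab = build_vocab(captions, count_threshold=1)
print(vocab)
print(encode(["a", "cat", "runs"], vocab))
```

Raising the threshold drops more rare words and shrinks the vocabulary; lowering it does the opposite.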

You should also preprocess the dataset to build the n-gram cache used to compute the CIDEr score for self-critical sequence training (SCST):

$ python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train
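
CIDEr weights n-grams by how rare they are across the reference captions, so the cache built by the command above typically holds n-gram document frequencies for the training split. The sketch below shows that idea only. It is not the code in scripts/prepro_ngrams.py, and the function names and toy data are assumptions.

```python
from collections import Counter


def ngrams(tokens, max_n=4):
    """Return all n-grams of length 1..max_n as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def document_frequency(refs_per_image, max_n=4):
    """Count, for each n-gram, the number of images whose references contain it."""
    df = Counter()
    for refs in refs_per_image:
        seen = set()
        for ref in refs:
            seen.update(ngrams(ref, max_n))
        # Each image contributes at most 1 per n-gram, however many references repeat it.
        df.update(seen)
    return df


# Toy data: two images, each with a list of tokenized reference captions.
refs_per_image = [
    [["a", "dog", "runs"], ["the", "dog", "runs"]],
    [["a", "cat", "sits"]],
]
df = document_frequency(refs_per_image, max_n=2)
print(df[("a",)], df[("dog", "runs")])
```

Computing these counts once and storing them avoids rescanning every reference caption each time a sampled caption is scored during SCST.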

Training

$ sh train-aat.sh

See opts.py for the options.

Evaluation

$ CUDA_VISIBLE_DEVICES=0 python eval.py --model log/log_aat_rl/model.pth --infos_path log/log_aat_rl/infos_aat.pkl --dump_images 0 --dump_json 1 --num_images -1 --language_eval 1 --beam_size 2 --batch_size 100 --split test

Reference

If you find this repo helpful, please consider citing:

@inproceedings{huang2019adaptively,
  title = {Adaptively Aligned Image Captioning via Adaptive Attention Time},
  author = {Huang, Lun and Wang, Wenmin and Xia, Yaxian and Chen, Jie},
  booktitle = {Advances in Neural Information Processing Systems 32},
  year = {2019}
}

Acknowledgements

This repository is based on Ruotian Luo's self-critical.pytorch.


