Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020

Related tags

Testingshow-edit-tell
Overview

Show, Edit and Tell: A Framework for Editing Image Captions | arXiv

This contains the source code for Show, Edit and Tell: A Framework for Editing Image Captions, to appear at CVPR 2020

Requirements

  • Python 3.6 or 3.7
  • PyTorch 1.2

For evaluation, you also need:

Argument Parser is currently not supported. We will add support to it soon.

Pretrained Models

You can download the pretrained models from here. Place them in eval folder.

Download and Prepare Features

In this work, we use 36 fixed bottom-up features. If you wish to use the adaptive features (10-100), please refer to adaptive_features folder in this repository and follow the instructions.

First, download the fixed features from here and unzip the file. Place the unzipped folder in bottom-up_features folder.

Next type this command:

python bottom-up_features/tsv.py

This command will create the following files:

  • An HDF5 file containing the bottom up image features for train and val splits, 36 per image for each split, in an (I, 36, 2048) tensor where I is the number of images in the split.
  • PKL files that contain training and validation image IDs mapping to index in HDF5 dataset created above.

Download/Prepare Caption Data

You can either download all the related caption data files from here or create them yourself. The folder contains the following:

  • WORDMAP_coco: maps the words to indices
  • CAPUTIL: stores the information about the existing captions in a dictionary organized as follows: {"COCO_image_name": {"caption": "existing caption to be edited", "encoded_previous_caption": an encoded list of the words, "previous_caption_length": a list contaning the length of the caption, "image_ids": the COCO image id}
  • CAPTIONS the encoded ground-truth captions (a list with number_images x 5 lists. Example: we have 113,287 training images in Karpathy Split, thereofre there is 566,435 lists for the training split)
  • CAPLENS: the length of the ground-truth captions (a list with number_images x 5 vallues)
  • NAMES: the COCO image name in the same order as the CAPTIONS
  • GENOME_DETS: the splits and image ids for loading the images in accordance to the features file created above

If you'd like to create the caption data yourself, download Karpathy's Split training, validation, and test splits. This zip file contains the captions. Place the file in caption data folder. You should also have the pkl files created from the 'Download Features' section: train36_imgid2idx.pkl and val36_imgid2idx.pkl.

Next, run:

python preprocess_caps.py

This will dump all the files to the folder caption data.

Next, download the existing captios to be edited, and organize them in a list containing dictionaries with each dictionary in the following format: {"image_id": COCO_image_id, "caption": "caption to be edited", "file_name": "split\\COCO_image_name"}. For example: {"image_id": 522418, "caption": "a woman cutting a cake with a knife", "file_name": "val2014\\COCO_val2014_000000522418.jpg"}. In our work, we use the captions produced by AoANet.

Next, run:

python preprocess_existing_caps.py

This will dump all the existing caption files to the folder caption data.

Prepare/Download Sequence-Level Training Data

Download the RL-data for sequence-level training used for computing metric scores from here.

Alternitavely, you may prepare the data yourself:

Run the following command:

python preprocess_rl.py

This will dump two files in the data folder used for computing metric scores.

Training and Validation

XE training stage:

For training DCNet, run:

python dcnet.py

For optimizing DCNet with MSE, run:

python dcnet_with_mse.py

For training editnet:

python editnet.py
Cider-D Optimization stage:

For training DCNet, run:

python dcnet_rl.py

For training editnet:

python editnet_rl.py

Evaluation

Refer to eval folder for instructions. All the generated captions and scores from our model can be found in the outputs folder.

BLEU-1 BLEU-4 CIDEr SPICE
Cross-Entropy Loss 77.9 38.0 1.200 21.2
CIDEr Optimization 80.6 39.2 1.289 22.6

Citation

@InProceedings{Sammani_2020_CVPR,
author = {Sammani, Fawaz and Melas-Kyriazi, Luke},
title = {Show, Edit and Tell: A Framework for Editing Image Captions},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

References

Our code is mainly based on self-critical and show attend and tell. We thank both authors.

Owner
Fawaz Sammani
The human brain is a miracle every human has, and mathematically modelling that brain is an overwhelming matter! I like teaching machines vision-language
Fawaz Sammani
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.

mitmproxy mitmproxy is an interactive, SSL/TLS-capable intercepting proxy with a console interface for HTTP/1, HTTP/2, and WebSockets. mitmdump is the

mitmproxy 29.7k Jan 02, 2023
A modern API testing tool for web applications built with Open API and GraphQL specifications.

Schemathesis Schemathesis is a modern API testing tool for web applications built with Open API and GraphQL specifications. It reads the application s

Schemathesis.io 1.6k Dec 30, 2022
Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020

Show, Edit and Tell: A Framework for Editing Image Captions | arXiv This contains the source code for Show, Edit and Tell: A Framework for Editing Ima

Fawaz Sammani 76 Nov 25, 2022
Data App Performance Tests

Data App Performance Tests My hypothesis is that The different architectures of

Marc Skov Madsen 6 Dec 14, 2022
The Good Old Days. | Testing Out A New Module-

The-Good-Old-Days. The Good Old Days. | Testing Out A New Module- Installation Asciimatics supports Python versions 2 & 3. For the precise list of tes

Syntax. 2 Jun 08, 2022
Mypy static type checker plugin for Pytest

pytest-mypy Mypy static type checker plugin for pytest Features Runs the mypy static type checker on your source files as part of your pytest test run

Dan Bader 218 Jan 03, 2023
splinter - python test framework for web applications

splinter - python tool for testing web applications splinter is an open source tool for testing web applications using Python. It lets you automate br

Cobra Team 2.6k Dec 27, 2022
PyAutoEasy is a extension / wrapper around the famous PyAutoGUI, a cross-platform GUI automation tool to replace your boooring repetitive tasks.

PyAutoEasy PyAutoEasy is a extension / wrapper around the famous PyAutoGUI, a cross-platform GUI automation tool to replace your boooring repetitive t

Dingu Sagar 7 Oct 27, 2022
This is a Python script for Github Bot which uses Selenium to Automate things.

github-follow-unfollow-bot This is a Python script for Github Bot which uses Selenium to Automate things. Pre-requisites :- Python A Github Account Re

Chaudhary Hamdan 10 Jul 01, 2022
A Python Selenium library inspired by the Testing Library

Selenium Testing Library Slenium Testing Library (STL) is a Python library for Selenium inspired by Testing-Library. Dependencies Python 3.6, 3.7, 3.8

Anže Pečar 12 Dec 26, 2022
Statistical tests for the sequential locality of graphs

Statistical tests for the sequential locality of graphs You can assess the statistical significance of the sequential locality of an adjacency matrix

2 Nov 23, 2021
Python tools for penetration testing

pyTools_PT python tools for penetration testing Please don't use these tool for illegal purposes. These tools is meant for penetration testing for leg

Gourab 1 Dec 01, 2021
A collection of testing examples using pytest and many other libreris

Effective testing with Python This project was created for PyConEs 2021 Check out the test samples at tests Check out the slides at slides (markdown o

Héctor Canto 10 Oct 23, 2022
Fully functioning price detector built with selenium and python

Fully functioning price detector built with selenium and python

mark sikaundi 4 Mar 30, 2022
The definitive testing tool for Python. Born under the banner of Behavior Driven Development (BDD).

mamba: the definitive test runner for Python mamba is the definitive test runner for Python. Born under the banner of behavior-driven development. Ins

Néstor Salceda 502 Dec 30, 2022
The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

DUE Evaluator The repository contains the evaluator covering all of the metrics required by tasks within the DUE Benchmark, i.e., set-based F1 (for KI

DUE Benchmark 4 Jan 21, 2022
Pyramid debug toolbar

pyramid_debugtoolbar pyramid_debugtoolbar provides a debug toolbar useful while you're developing your Pyramid application. Note that pyramid_debugtoo

Pylons Project 95 Sep 17, 2022
A configurable set of panels that display various debug information about the current request/response.

Django Debug Toolbar The Django Debug Toolbar is a configurable set of panels that display various debug information about the current request/respons

Jazzband 7.3k Jan 02, 2023
Screenplay pattern base for Python automated UI test suites.

ScreenPy TITLE CARD: "ScreenPy" TITLE DISAPPEARS.

Perry Goy 39 Nov 15, 2022
Selects tests affected by changed files. Continous test runner when used with pytest-watch.

This is a pytest plug-in which automatically selects and re-executes only tests affected by recent changes. How is this possible in dynamic language l

Tibor Arpas 614 Dec 30, 2022