ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.

Last update: Dec 08, 2022

Overview

This repo contains part of the code, and the data, for the paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards.

Special Note:

The dataset released here is much larger than the one used for ECCV 2020. It has almost 1M images, while the one used for the paper contains only about half as many (even though the paper reports 993K).
The evaluation code is now adapted from self-critical.pytorch.
For these two reasons, the CIDEr scores should now be better, but the other scores might be lower. We will try to update the scores soon.

Codes:

This repo currently contains code only for SAT, BUTD and CNN-C, as described in the paper.

The evalcap folder can be downloaded from here.

To train, run sh train.sh. To test, run sh test.sh.

I keep getting bad results for the CNN-C model: every generated caption in the val set is the same. I had the same issue when I tried adapting code from self-critical.pytorch. This never happened when I ran the experiments for the ECCV paper. I would really appreciate it if anyone could find the reason why this happens.
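To check whether a run has hit this problem, a small diversity check over the generated captions can help. The sketch below is not part of this repo. The function name caption_diversity and the input format (a plain list of caption strings) are assumptions made for illustration.

```python
from collections import Counter


def caption_diversity(captions):
    """Summarize how varied a list of generated captions is.

    Returns (unique_ratio, top_caption, top_share), where captions are
    compared after stripping whitespace and lowercasing.
    NOTE: hypothetical helper, not part of the released code.
    """
    if not captions:
        raise ValueError("captions is empty")
    counts = Counter(c.strip().lower() for c in captions)
    top_caption, top_count = counts.most_common(1)[0]
    unique_ratio = len(counts) / len(captions)
    top_share = top_count / len(captions)
    return unique_ratio, top_caption, top_share


if __name__ == "__main__":
    # Made-up captions, only to show the call.
    generated = ["a black dress", "a black dress", "A black dress "]
    ratio, top, share = caption_diversity(generated)
    print(f"unique ratio={ratio:.2f}, most common={top!r}, share={share:.2f}")
```

A top_share close to 1.0 on the val set means the model is producing the same caption for nearly every image, which matches the symptom described above.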

Dataset:

To get the preprocessed data, use this, or email Xuewen Yang @ [email protected] if you need the raw data.

For other issues, please create an issue on this repo.

If you want to download the original dataset (some data might be missing), you can:

First, download the JSON file from here.
Then use wget or another download script to fetch the images. Remember to drop anything after .jpeg in the URL to get high-resolution images; otherwise, very low-resolution images are downloaded. For example: wget https://n.nordstrommedia.com/id/sr3/58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg
Sometimes the description is no longer available; in that case, retrieve it from the 'detail_info' part.
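The URL trimming and the description fallback described above can be sketched in Python as follows. This is a minimal illustration, not code from this repo. The function names, the 'description' key, and the query string in the example are assumptions; only 'detail_info' and the image URL come from the text above, and the JSON layout is not documented here.

```python
def full_resolution_url(url: str) -> str:
    """Cut the URL right after the first '.jpeg' so the full-size image is requested.

    URLs without '.jpeg' are returned unchanged.
    """
    marker = ".jpeg"
    idx = url.lower().find(marker)
    if idx == -1:
        return url
    return url[: idx + len(marker)]


def pick_description(item: dict) -> str:
    """Return the item's description, falling back to 'detail_info'.

    ASSUMPTION: the description is stored under a 'description' key;
    check the downloaded JSON file for the real field name.
    """
    text = item.get("description")
    if text:
        return text
    detail = item.get("detail_info", "")
    return detail if isinstance(detail, str) else str(detail)


if __name__ == "__main__":
    # The '?h=365&w=240' suffix is a made-up example of trailing parameters.
    raw = ("https://n.nordstrommedia.com/id/sr3/"
           "58d1a13f-b6b6-4e68-b2ff-3a3af47c422e.jpeg?h=365&w=240")
    print(full_resolution_url(raw))
    print(pick_description({"description": "", "detail_info": "Cotton blend."}))
```

The trimmed URL can then be passed to wget or any other downloader.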

License:

The dataset is released under the license in the LICENSE file.
No commercial use.

Citation:

If you use this data, please cite:

@inproceedings{XuewenECCV20Fashion,
Author = {Xuewen Yang and Heming Zhang and Di Jin and Yingru Liu and Chi-Hao Wu and Jianchao Tan and Dongliang Xie and Jue Wang and Xin Wang},
Title = {Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards},
booktitle = {ECCV},
Year = {2020}
}
