NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Last update: Nov 15, 2022

Related tags

Overview

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

This repository contains code and pre-trained models for our NAACL-2022 paper MCSE: Multimodal Contrastive Learning of Sentence Embeddings. If you find this reposity useful, please consider citing our paper.

Contact: Miaoran Zhang ([email protected])

Pre-trained Models & Results

Model	Avg. STS
flickr-mcse-bert-base-uncased [Google Drive]	77.70
flickr-mcse-roberta-base [Google Drive]	78.44
coco-mcse-bert-base-uncased [Google Drive]	77.08
coco-mcse-roberta-base [Google Drive]	78.17

Note: flickr indicates that models are trained on wiki+flickr, and coco indicates that models are trained on wiki+coco.

Quickstart

Setup

Python 3.9.5
Pytorch 1.7.1
Install other packages:

pip install -r requirements.txt

Data Preparation

Please organize the data directory as following:

REPO ROOT
|
|--data    
|  |--wiki1m_for_simcse.txt  
|  |--flickr_random_captions.txt    
|  |--flickr_resnet.hdf5    
|  |--coco_random_captions.txt    
|  |--coco_resnet.hdf5

Wiki1M

wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt

Flickr30k & MS-COCO
You can either download the preprocessed data we used:
(annotation sources: flickr30k-entities and coco).

Or preprocess the data by yourself (take Flickr30k as an example):

Download the flickr30k-entities.
Request access to the flickr-images from here. Note that the use of the images much abide by the Flickr Terms of Use.

Run script:

unzip ${path_to_flickr-entities}/annotations.zip

python preprocess/prepare_flickr.py \
    --flickr_entities_dir ${path_to_flickr-entities}  \  
    --flickr_images_dir ${path_to_flickr-images} \
    --output_dir data/
    --batch_size 32

Train & Evaluation

Prepare the senteval datasets for evaluation:

cd SentEval/data/downstream/
bash download_dataset.sh

Run scripts:
```
# For example:  (more examples are given in scripts/.)
sh scripts/run_wiki_flickr.sh
```
Note: In the paper we run experiments with 5 seeds (0,1,2,3,4). You can find the detailed parameter settings in Appendix.

Acknowledgements

The extremely clear and well organized codebase: SimCSE
SentEval toolkit

NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Related tags

Overview

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Pre-trained Models & Results

Quickstart

Setup

Data Preparation

Train & Evaluation

Acknowledgements

Owner

Saarland University Spoken Language Systems Group

AEC_DeepModel - Deep learning based acoustic echo cancellation baseline code

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

Under the hood working of transformers, fine-tuning GPT-3 models, DeBERTa, vision models, and the start of Metaverse, using a variety of NLP platforms: Hugging Face, OpenAI API, Trax, and AllenNLP

Random-Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

SAINT PyTorch implementation

Pytorch implementation of winner from VQA Chllange Workshop in CVPR'17

Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

تولید اسم های رندوم فینگیلیش

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

Natural Language Processing for Adverse Drug Reaction (ADR) Detection

Translators - is a library which aims to bring free, multiple, enjoyable translation to individuals and students in Python

Longformer: The Long-Document Transformer

A python wrapper around the ZPar parser for English.

Optimal Transport Tools (OTT), A toolbox for all things Wasserstein.

Rank-One Model Editing for Locating and Editing Factual Knowledge in GPT

ADCS cert template modification and ACL enumeration

PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing

Training RNNs as Fast as CNNs

Python wrapper for Stanford CoreNLP tools v3.4.1

Jarvis is a simple Chatbot with a GUI capable of chatting and retrieving information and daily news from the internet for it's user.