Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Last update: Jan 04, 2023

Overview

This repository contains code for the following two papers:

VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv), with a short version titled What Does BERT with Vision Look At? published at ACL 2020.

Under the folder visualbert is the code for the original VisualBERT, where we pre-train a Transformer for vision-and-language (V&L) tasks on image-caption data.

Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions, published at NAACL 2021.

Under the folder unsupervised_visualbert is the code for Unsupervised VisualBERT, where we pre-train a V&L Transformer without aligned image-caption pairs. Instead, we pre-train using only unaligned images and text, and achieve performance competitive with many models supervised with aligned data.
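As a rough illustration of what a single Transformer for V&L tasks consumes in the original VisualBERT setting, the sketch below lays out a caption and a set of image regions as one joint input sequence. It is not taken from this repository: the function name build_joint_input, the JointInput container, the placeholder labels such as <region_0>, and the segment-id values are all hypothetical, and real region slots would carry detector features rather than string labels.

```python
from dataclasses import dataclass
from typing import List

TEXT_SEGMENT = 0   # assumed segment id for caption tokens (illustrative)
IMAGE_SEGMENT = 1  # assumed segment id for image-region slots (illustrative)


@dataclass
class JointInput:
    slots: List[str]            # human-readable label for each position
    segment_ids: List[int]      # TEXT_SEGMENT or IMAGE_SEGMENT per position
    attention_mask: List[int]   # 1 = real slot, 0 = padding


def build_joint_input(caption_tokens: List[str], num_regions: int, max_len: int) -> JointInput:
    """Place caption tokens first, then image-region slots, then padding."""
    # Text part: special tokens around the caption.
    slots = ["[CLS]"] + list(caption_tokens) + ["[SEP]"]
    segment_ids = [TEXT_SEGMENT] * len(slots)

    # Visual part: one slot per detected region (placeholder labels only).
    slots += [f"<region_{i}>" for i in range(num_regions)]
    segment_ids += [IMAGE_SEGMENT] * num_regions

    if len(slots) > max_len:
        raise ValueError("sequence is longer than max_len")

    # Pad to a fixed length and mark padding as not attended.
    attention_mask = [1] * len(slots)
    pad = max_len - len(slots)
    slots += ["[PAD]"] * pad
    segment_ids += [TEXT_SEGMENT] * pad
    attention_mask += [0] * pad
    return JointInput(slots, segment_ids, attention_mask)


if __name__ == "__main__":
    example = build_joint_input(["a", "dog", "on", "grass"], num_regions=3, max_len=12)
    for slot, seg, mask in zip(example.slots, example.segment_ids, example.attention_mask):
        print(f"{slot:12s} segment={seg} mask={mask}")
```

In the unsupervised setting described above, the caption and the regions would not come from an aligned pair, so a layout like this would be filled from text-only or image-only data; how the repository actually handles that is not shown here.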

VisualBERT has also been integrated into several libraries, such as Hugging Face Transformers (many thanks to Gunjan Chhablani, who made it work) and Facebook MMF.

Thanks~

Owner

Natural Language Processing @UCLA

NLP Text Classification

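Multi-label text classification task. In recent years, with the development of deep learning, the number of model parameters has grown rapidly. Training these parameters requires larger datasets to avoid overfitting. However, for most NLP tasks, building large-scale labeled datasets is very difficult (the cost is too high), especially for tasks involving syntax and semantics. In contrast, building large-scale unlabeled corpora is relatively easy. To make use of this data, we can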

1 Nov 11, 2021

Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks

Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks, which modifies the input text with a textual template and directly uses PLMs to conduct pre

2.3k Jan 08, 2023

Text to speech for Vietnamese, easy to use, easy to update

Hello everyone, this is an open project that aims to make reading easier. Many thanks to the Zalo team for providing the infrastructure that let me build this app

32 Jul 29, 2022

Yomichad - a Japanese pop-up dictionary that can display readings and English definitions of Japanese words

Yomichad is a Japanese pop-up dictionary that can display readings and English definitions of Japanese words, kanji, and optionally named entities. It is similar to yomichan, 10ten, and rikaikun in s

7 Nov 07, 2022

Shirt Bot is a discord bot which uses GPT-3 to generate text

SHIRT BOT · Shirt Bot is a discord bot which uses GPT-3 to generate text. Made by Cyclcrclicly#3420 (474183744685604865) on Discord. Support Server EX

31 Oct 31, 2022

Semi-automated vocabulary generation from semantic vector models

vec2word Semi-automated vocabulary generation from semantic vector models This script generates a list of potential conlang word forms along with asso

9 Nov 25, 2022

PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer

Cross-Covariance Image Transformer (XCiT) PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer L

605 Jan 02, 2023

Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃

This repository provides a library for efficient training of masked language models (MLM), built with fairseq. We fork fairseq to give researchers mor

92 Dec 27, 2022

Ecommerce product title recognition package

revizor This package solves the task of splitting a product title string into components, like type, brand, model and article (or SKU or product code or you

16 Mar 03, 2022

Repository for Graph2Pix: A Graph-Based Image to Image Translation Framework

Graph2Pix: A Graph-Based Image to Image Translation Framework Installation Install the dependencies in env.yml $ conda env create -f env.yml $ conda a

18 Nov 17, 2022

Transformers Wav2Vec2 + Parlance's CTCDecode

🤗 Transformers Wav2Vec2 + Parlance's CTCDecode Introduction This repo shows how 🤗 Transformers can be used in combination with Parlance's ctcdecode

9 Jul 21, 2022

Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.

Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag

211 Dec 28, 2022

Simple Annotated implementation of GPT-NeoX in PyTorch

Trained T5 and T5-large model for creating keywords from text

KR-FinBert And KR-FinBert-SC

The code from the whylogs workshop in DataTalks.Club on 29 March 2022

Understand Text Summarization and create your own summarizer in python

DensePhrases provides answers to your natural language questions from the entire Wikipedia in real-time

Maix Speech AI lib, including ASR, chat, TTS etc.

Open solution to the Toxic Comment Classification Challenge