PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer

Last update: Dec 20, 2022

VidLanKD

Implementation of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer by Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal.

Setup

# Create python environment (optional)
conda create -n vidlankd python=3.7

# Install python dependencies
pip install -r requirements.txt

To speed up the training, we use mixed precision with Apex.

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
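
After the build finishes, a quick check like the one below can confirm that Apex is importable from the active environment. This is a minimal sketch using only the Python standard library; the helper names and the precision labels are hypothetical and not part of this repository, and whether the training scripts fall back to full precision without Apex is not stated here.

```python
import importlib.util


def apex_available() -> bool:
    """Return True if the `apex` package can be found in this environment."""
    return importlib.util.find_spec("apex") is not None


def pick_precision_mode() -> str:
    """Pick a descriptive label; the labels are illustrative, not repo options."""
    return "mixed (apex.amp)" if apex_available() else "full (fp32)"


if __name__ == "__main__":
    print("Apex importable:", apex_available())
    print("Precision mode:", pick_precision_mode())
```

If the check reports that Apex is not importable, the usual cause is that the `pip install` above ran in a different environment than the one used for training.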

Dataset Preparation

Text Dataset

We provide scripts to obtain datasets "wiki103" and "wiki".

Wiki103, a selected subset of English Wikipedia.

bash data/wiki103/get_data_cased.bash

English Wikipedia. The scripts are modified from XLM.

bash data/wiki/get_data_cased.bash en

Video Dataset

Howto100m, where you can download the official captions and video features.

Video Features Extraction Code

To be updated.

We extracted our 2D-level video features with ResNet152 from torchvision.
We extracted our 3D-level video features with 3D-ResNeXt.

Downstream tasks

GLUE dataset

Download dataset

python download_glue_data.py --data_dir data/glue --tasks all

Training

Teacher model pre-training

# bash scripts/small_vlm_howto100m.bash $GPUS $TEACHER_SNAP_PATH
bash scripts/small_vlm_howto100m.bash 0,1,2,3 howto100m_bert_small_vokenhinge
# bash scripts/base_vlm_howto100m.bash $GPUS $TEACHER_SNAP_PATH
bash scripts/base_vlm_howto100m.bash 0,1,2,3 howto100m_bert_base_vokenhinge

Knowledge transfer to student model

# bash scripts/small_vlm_wiki103.bash $GPUS $TEACHER_SNAP_PATH $STUDENT_SNAP_PATH
bash scripts/small_vlm_wiki103.bash 0,1,2,3 howto100m_bert_small_vokenhinge/checkpoint-epoch0019 wiki103_bert_small_vokenmmd
# bash scripts/base_vlm_wiki.bash $GPUS $TEACHER_SNAP_PATH $STUDENT_SNAP_PATH
bash scripts/base_vlm_wiki.bash 0,1,2,3 howto100m_bert_base_vokenhinge/checkpoint-epoch0019 wiki_bert_base_vokenmmd
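
The two stages are linked by the teacher snapshot name: the name passed to the pre-training script reappears, with a checkpoint suffix, as the teacher argument of the transfer script. The sketch below builds both commands from one set of names so they stay consistent. The helper functions are hypothetical, and two details are inferred from the examples above rather than from the scripts themselves: the zero-padded `checkpoint-epochNNNN` suffix and the pairing of the small model with wiki103 and the base model with wiki.

```python
import shlex
from typing import List


def teacher_cmd(size: str, gpus: str, teacher_snap: str) -> List[str]:
    """Command for teacher pre-training on HowTo100M (size: 'small' or 'base')."""
    return ["bash", f"scripts/{size}_vlm_howto100m.bash", gpus, teacher_snap]


def student_cmd(size: str, gpus: str, teacher_snap: str, student_snap: str,
                teacher_epoch: int = 19) -> List[str]:
    """Command for knowledge transfer from a teacher checkpoint to a student."""
    # Corpus pairing as shown in the examples: small -> wiki103, base -> wiki.
    corpus = {"small": "wiki103", "base": "wiki"}[size]
    # Four-digit epoch suffix, inferred from "checkpoint-epoch0019" (assumption).
    teacher_ckpt = f"{teacher_snap}/checkpoint-epoch{teacher_epoch:04d}"
    return ["bash", f"scripts/{size}_vlm_{corpus}.bash", gpus, teacher_ckpt, student_snap]


if __name__ == "__main__":
    gpus = "0,1,2,3"
    teacher = "howto100m_bert_small_vokenhinge"
    student = "wiki103_bert_small_vokenmmd"
    for cmd in (teacher_cmd("small", gpus, teacher),
                student_cmd("small", gpus, teacher, student)):
        # Printed only; pass `cmd` to subprocess.run(cmd, check=True) to execute.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the commands as argument lists avoids shell quoting issues if a snapshot name ever contains spaces or special characters.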

Finetuning on GLUE tasks

# bash scripts/run_glue_at_epoch.bash $GPUS $NumTrainEpochs $SNAP_PATH
bash scripts/run_glue_at_epoch.bash 0,1,2,3 3 snap/vlm/wiki103_bert_small_vokenmmd/checkpoint-epoch0019
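
When it is unclear which epoch a run reached, the checkpoint can be looked up on disk instead of typed by hand. The following is a minimal sketch, assuming checkpoints are saved under the snapshot directory with names of the form `checkpoint-epochNNNN`, as in the example above. `latest_checkpoint` and `glue_cmd` are hypothetical helpers, not functions from this repository.

```python
import re
from pathlib import Path
from typing import List, Optional

# Matches names such as "checkpoint-epoch0019" and captures the epoch number.
_CKPT_RE = re.compile(r"^checkpoint-epoch(\d+)$")


def latest_checkpoint(snap_dir: str) -> Optional[Path]:
    """Return the entry with the highest epoch number, or None if none exist."""
    candidates = []
    for path in Path(snap_dir).glob("checkpoint-epoch*"):
        match = _CKPT_RE.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def glue_cmd(gpus: str, num_train_epochs: int, checkpoint: Path) -> List[str]:
    """Argument list mirroring the run_glue_at_epoch.bash usage line."""
    return ["bash", "scripts/run_glue_at_epoch.bash", gpus,
            str(num_train_epochs), checkpoint.as_posix()]


if __name__ == "__main__":
    ckpt = latest_checkpoint("snap/vlm/wiki103_bert_small_vokenmmd")
    if ckpt is None:
        print("No checkpoint found; run the knowledge-transfer stage first.")
    else:
        print(" ".join(glue_cmd("0,1,2,3", 3, ckpt)))
```

Comparing the parsed epoch numbers rather than the raw names keeps the ordering correct even if the padding width changes between runs.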

Acknowledgements

Parts of the code are built on vokenization, Hugging Face Transformers, and Facebook's Faiss.


Owner

Zineng Tang
