This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

Overview

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression

This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

Requirements

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Download checkpoints

Download the vocabulary file of BERT-base (uncased) from HERE, and put it into ./pretrained_ckpt/.
Download the pre-trained checkpoint of BERT-base (uncased) from HERE, and put it into ./pretrained_ckpt/.
Download the 2nd general distillation checkpoint of TinyBERT from HERE, and extract them into ./pretrained_ckpt/.

Prepare dataset

Download the GLUE dataset (containing MNLI) using the script in HERE, and put the files into ./dataset/glue/. Download the Amazon Reviews dataset from HERE, and extract it into ./dataset/amazon_review/

Train the teacher model (BERT$_{\rm B}$-single) from single-domain

bash train_domain.sh

Distill the student model (BERT$_{\rm S}$) with TinyBERT-KD from single-domain

bash finetune_domain.sh

Train the teacher model (HRKD-teacher) from multi-domain

bash train_multi_domain.sh

And then put the checkpoints to the specified directories (see the beginning of finetune_multi_domain.py for more details).

Distill the student model (BERT$_{\rm S}$) with our HRKD from multi-domain

bash finetune_multi_domain.sh

Reference

If you find this code helpful for your research, please cite the following paper.

@inproceedings{dong2021hrkd,
  title     = {{HRKD}: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression},
  author    = {Chenhe Dong and Yaliang Li and Ying Shen and Minghui Qiu},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2021}
}
Owner
Chenhe Dong
Chenhe Dong
This project deploys a yolo fastest model in the form of tflite on raspberry 3b+. The model is from another repository of mine called -Trash-Classification-Car

Deploy-yolo-fastest-tflite-on-raspberry 觉得有用的话可以顺手点个star嗷 这个项目将垃圾分类小车中的tflite模型移植到了树莓派3b+上面。 该项目主要是为了记录在树莓派部署yolo fastest tflite的流程 (之后有时间会尝试用C++部署来提升

7 Aug 16, 2022
Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

xcfeng 55 Dec 27, 2022
Learning an Adaptive Meta Model-Generator for Incrementally Updating Recommender Systems

Learning an Adaptive Meta Model-Generator for Incrementally Updating Recommender Systems This is our experimental code for RecSys 2021 paper "Learning

11 Jul 28, 2022
Public implementation of the Convolutional Motif Kernel Network (CMKN) architecture

CMKN Implementation of the convolutional motif kernel network (CMKN) introduced in Ditz et al., "Convolutional Motif Kernel Network", 2021. Testing Yo

1 Nov 17, 2021
A unet implementation for Image semantic segmentation

Unet-pytorch a unet implementation for Image semantic segmentation 参考网上的Unet做分割的代码,做了一个针对kaggle地盐识别的,请去以下地址获取数据集: https://www.kaggle.com/c/tgs-salt-id

Rabbit 3 Jun 29, 2022
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking We revisit and address issues with Oxford 5k and Paris 6k image retrieval benchm

Filip Radenovic 188 Dec 17, 2022
Hierarchical Attentive Recurrent Tracking

Hierarchical Attentive Recurrent Tracking This is an official Tensorflow implementation of single object tracking in videos by using hierarchical atte

Adam Kosiorek 147 Aug 07, 2021
Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

LMMNN Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks This is the working dire

Giora Simchoni 10 Nov 02, 2022
High-quality implementations of standard and SOTA methods on a variety of tasks.

Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo

Google 1.1k Dec 30, 2022
Computer Vision and Pattern Recognition, NUS CS4243, 2022

CS4243_2022 Computer Vision and Pattern Recognition, NUS CS4243, 2022 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : h

Xavier Bresson 142 Dec 15, 2022
🏖 Keras Implementation of Painting outside the box

Keras implementation of Image OutPainting This is an implementation of Painting Outside the Box: Image Outpainting paper from Standford University. So

Bendang 1.1k Dec 10, 2022
Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation

FCN.tensorflow Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation (FCNs). The implementation is largely based on the

Sarath Shekkizhar 1.3k Dec 25, 2022
Outlier Exposure with Confidence Control for Out-of-Distribution Detection

OOD-detection-using-OECC This repository contains the essential code for the paper Outlier Exposure with Confidence Control for Out-of-Distribution De

Nazim Shaikh 64 Nov 02, 2022
[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

MDCA Calibration 21 Dec 22, 2022
Image Matching Evaluation

Image Matching Evaluation (IME) IME provides to test any feature matching algorithm on datasets containing ground-truth homographies. Also, one can re

32 Nov 17, 2022
Reproduces the results of the paper "Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations".

Finite basis physics-informed neural networks (FBPINNs) This repository reproduces the results of the paper Finite Basis Physics-Informed Neural Netwo

Ben Moseley 65 Dec 28, 2022
Azion the best solution of Edge Computing in the world.

Azion Edge Function docker action Create or update an Edge Functions on Azion Edge Nodes. The domain name is the key for decision to a create or updat

8 Jul 16, 2022
MaRS - a recursive filtering framework that allows for truly modular multi-sensor integration

The Modular and Robust State-Estimation Framework, or short, MaRS, is a recursive filtering framework that allows for truly modular multi-sensor integration

Control of Networked Systems - University of Klagenfurt 143 Dec 29, 2022
Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer

1 Feb 03, 2022
Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

Multi-speaker DGP This repository provides official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch. O

sarulab-speech 24 Sep 07, 2022