PyTorch implementation of the ACL 2021 paper Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.

Last update: Dec 28, 2022

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks

This repo contains the PyTorch implementation of the ACL 2021 paper Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.

Installation

python setup.py install

How to run the models

We provide example scripts for each model in the hyperformer/scripts/ folder, with their config files in hyperformer/configs. To run the models, first cd hyperformer, then use one of the commands below.

To run the hyperformer++ model (this model generates the task-specific adapters using a single hypernetwork that is shared across both the tasks and the layers of the transformer):
```
bash scripts/hyperformer++.sh
```
To run the hyperformer model (this model generates the task-specific adapters using a hypernetwork that is shared across the tasks but is specific to each layer of the transformer, which makes it less parameter-efficient than hyperformer++):
```
bash scripts/hyperformer.sh
```
To run the adapter† (adapter-dagger) model (this model shares the layer normalization between adapters across the tasks and trains the adapters in a multi-task setting):
```
bash scripts/adapters_dagger.sh
```
To run the adapter model (this model trains a single adapter per task, in a single-task learning setup):
```
bash scripts/adapters.sh
```
To run the T5 fine-tuning model in a multi-task learning setup:
```
bash scripts/finetune.sh
```
To run the T5 fine-tuning model in a single-task learning setup:
```
bash scripts/finetune_single_task.sh
```
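
To make the efficiency difference between hyperformer and hyperformer++ more concrete, here is a rough parameter-count sketch. It assumes a deliberately simplified setup that is not taken from this repo: each adapter is a down-projection plus an up-projection without biases, and each hypernetwork is a single linear map from an embedding to the flattened adapter weights. All function names and sizes are illustrative only.

```python
def adapter_params(d_model: int, bottleneck: int) -> int:
    """Weights in one bottleneck adapter (down- and up-projection, biases ignored)."""
    return 2 * d_model * bottleneck


def per_task_adapters(num_tasks: int, num_layers: int, d_model: int, bottleneck: int) -> int:
    """Separate adapters for every task and every layer, nothing shared."""
    return num_tasks * num_layers * adapter_params(d_model, bottleneck)


def per_layer_hypernetwork(num_tasks: int, num_layers: int, d_model: int,
                           bottleneck: int, emb_dim: int) -> int:
    """One linear hypernetwork per layer, shared across tasks (hyperformer-like)."""
    task_embeddings = num_tasks * emb_dim
    hypernetworks = num_layers * emb_dim * adapter_params(d_model, bottleneck)
    return task_embeddings + hypernetworks


def shared_hypernetwork(num_tasks: int, num_layers: int, d_model: int,
                        bottleneck: int, emb_dim: int) -> int:
    """One linear hypernetwork shared across tasks and layers (hyperformer++-like).

    Assumes one embedding per task and one per layer as the conditioning input.
    """
    embeddings = (num_tasks + num_layers) * emb_dim
    hypernetwork = emb_dim * adapter_params(d_model, bottleneck)
    return embeddings + hypernetwork


if __name__ == "__main__":
    # Illustrative sizes only; they are not the settings used in the paper.
    sizes = dict(num_tasks=8, num_layers=12, d_model=512, bottleneck=64)
    print("per-task adapters:      ", per_task_adapters(**sizes))
    print("per-layer hypernetwork: ", per_layer_hypernetwork(emb_dim=64, **sizes))
    print("shared hypernetwork:    ", shared_hypernetwork(emb_dim=64, **sizes))
```

Under these assumptions, the per-layer variant grows with the number of layers, while the fully shared variant only adds one small embedding per layer, which is the intuition behind hyperformer++ being the more efficient of the two.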

We ran all the models on 4 GPUs, but this is not required; the models can also be run on a single GPU. To run on one GPU, remove the -m torch.distributed.launch --nproc_per_node=4 part from every script.
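
If you prefer not to edit the scripts by hand, the launcher flags can be stripped from a command line programmatically. The helper below is a minimal sketch; the function name and the example command (including train.py and configs/example.json) are hypothetical and do not come from this repo's scripts.

```python
import shlex


def strip_distributed_launcher(command: str) -> str:
    """Remove '-m torch.distributed.launch' and '--nproc_per_node[=N]' from a command."""
    tokens = shlex.split(command)
    kept = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # Drop the two-token module invocation of the distributed launcher.
        if token == "-m" and i + 1 < len(tokens) and tokens[i + 1] == "torch.distributed.launch":
            i += 2
            continue
        # Drop the process-count flag in both '--flag=N' and '--flag N' forms.
        if token.startswith("--nproc_per_node"):
            i += 1 if "=" in token else 2
            continue
        kept.append(token)
        i += 1
    return shlex.join(kept)


if __name__ == "__main__":
    # Hypothetical command line, used only to illustrate the transformation.
    multi_gpu = "python -m torch.distributed.launch --nproc_per_node=4 train.py configs/example.json"
    print(strip_distributed_launcher(multi_gpu))
```

The function leaves every other argument untouched, so the script name and config path are passed through as they were.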

Bibliography

If you find this repo useful, please cite our paper.

@inproceedings{karimi2021parameterefficient,
  title={Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks},
  author={Karimi Mahabadi, Rabeeh and Ruder, Sebastian and Dehghani, Mostafa and Henderson, James},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={2021}
}

Final words

I hope this repo is useful for your research. For any questions, please create an issue or email [email protected], and I will get back to you as soon as possible.

Owner

Rabeeh Karimi Mahabadi
