TransPrompt - Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification

Overview


This is the implementation of our EMNLP 2021 paper, TransPrompt: Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification.

Our proposed TransPrompt is motivated by combining prompt-tuning with cross-task transfer learning. The aim is to explore and exploit transferable knowledge from similar tasks in the few-shot scenario, making the Pre-trained Language Model (PLM) a better few-shot transfer learner. The paper was accepted to the EMNLP 2021 main conference (long paper track). This code is the multi-GPU version by default. The following sections explain how to use it.

P.S.: The same code has also been committed to Alibaba EasyTransfer.

1. Data Preparation

We follow PET and use the same datasets. Please run the script below to download the data:

sh data/download_data.sh

or manually download the dataset from https://nlp.cs.princeton.edu/projects/lm-bff/datasets.tar.
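If you download manually, unpack the archive into data/ yourself. A minimal sketch, assuming the archive extracts to an original/ folder as the download script above expects:

cd data
wget https://nlp.cs.princeton.edu/projects/lm-bff/datasets.tar
tar -xf datasets.tar
cd ..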

Then you will obtain a new directory data/original.

Our work covers two scenarios: single-task and cross-task. Each scenario has its own data split. By default, we generate few-shot learning examples; you can also generate the full data by changing the parameter (--scene full). We only demonstrate few-shot data generation here.
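For example, to generate the full data instead of the few-shot split (a sketch, assuming the generation script accepts the same flag spelling as in the few-shot commands below):

python3 data_utils/generate_k_shot_data.py --scene full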

1.1 Single-task Few-shot

Please run the script below to obtain the single-task few-shot examples:

python3 data_utils/generate_k_shot_data.py --scene few-shot --k 16

Then you will obtain a new folder data/k-shot-single.

1.2 Cross-task Few-shot

Run the script

python3 data_utils/generate_k_shot_cross_task_data.py --scene few-shot --k 16

and you will obtain a new folder data/k-shot-cross.

After generation, similar tasks are grouped together. We have three groups:

  • Group1 (Sentiment Analysis): SST-2, MR, CR
  • Group2 (Natural Language Inference): MNLI, SNLI
  • Group3 (Paraphrasing): MRPC, QQP
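The cross-task data should then be organized by group, roughly as sketched below (an illustrative layout only; the actual folder names produced by the script may differ):

data/k-shot-cross/
├── group1/   (SST-2, MR, CR)
├── group2/   (MNLI, SNLI)
└── group3/   (MRPC, QQP)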

2. Training

As described in our paper, we conduct the following experiments:

  • Single-task few-shot learning: As in LM-BFF and P-tuning, we prompt-tune the PLM on one task only.
  • Cross-task few-shot learning: We mix the similar tasks within a group. First, we prompt-tune the PLM on the cross-task data; then we prompt-tune it on each individual task again. Two cross-task methods are supported:
      • (Cross-)Task Adaptation: Within one group, we prompt-tune on all the tasks and then evaluate on each task, both in the few-shot scenario.
      • (Cross-)Task Generalization: Within one group, we randomly choose one task for few-shot evaluation (it is not used during training); the remaining tasks are used for prompt-tuning.

2.1 Single-task Few-shot Learning

Taking MRPC as an example, please run:

CUDA_VISIBLE_DEVICES=0 sh scripts/run_single_task.sh
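Since the code is the multi-GPU version by default, you can expose more than one device as well; a usage sketch (the device IDs are illustrative):

CUDA_VISIBLE_DEVICES=0,1 sh scripts/run_single_task.sh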

(Screenshot: figure1.png)

2.2 Cross-task Few-shot Learning (Task Adaptation)

Taking Group1 as an example, please run the script:

CUDA_VISIBLE_DEVICES=0 sh scripts/run_cross_task_adaptation.sh

(Screenshot: figure2.png)

2.3 Cross-task Few-shot Learning (Task Generalization)

Again taking Group1 as an example (here the unseen task is SST-2), please run the script:

CUDA_VISIBLE_DEVICES=0 sh scripts/run_cross_task_generalization.sh

(Screenshot: figure3.png)

Citation

If you find our work useful, please cite our paper:

@inproceedings{DBLP:conf/emnlp/0001WQH021,
  author    = {Chengyu Wang and
               Jianing Wang and
               Minghui Qiu and
               Jun Huang and
               Ming Gao},
  editor    = {Marie{-}Francine Moens and
               Xuanjing Huang and
               Lucia Specia and
               Scott Wen{-}tau Yih},
  title     = {TransPrompt: Towards an Automatic Transferable Prompting Framework
               for Few-shot Text Classification},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural
               Language Processing, {EMNLP} 2021, Virtual Event / Punta Cana, Dominican
               Republic, 7-11 November, 2021},
  pages     = {2792--2802},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://aclanthology.org/2021.emnlp-main.221},
  timestamp = {Tue, 09 Nov 2021 13:51:50 +0100},
  biburl    = {https://dblp.org/rec/conf/emnlp/0001WQH021.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Acknowledgement

The code is developed based on PET. We appreciate all the authors who made their code public, which greatly facilitates this project. This repository will be continuously updated.
