Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

Last update: Dec 30, 2022

PTR

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented in PyTorch. The minimum package versions are listed below.

numpy>=1.18.0
scikit-learn>=0.22.1
scipy>=1.4.1
torch>=1.3.0
tqdm>=4.41.1
transformers>=4.0.0
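
The version constraints above can be captured in a requirements file for pip. The following is a minimal sketch: the file name `requirements.txt` is an assumption rather than something this README specifies, and the install command is left as a comment so that the snippet itself only writes the file.

```shell
# Write the minimum versions listed above to a requirements file.
# NOTE: the file name requirements.txt is an assumption for illustration.
cat > requirements.txt <<'EOF'
numpy>=1.18.0
scikit-learn>=0.22.1
scipy>=1.4.1
torch>=1.3.0
tqdm>=4.41.1
transformers>=4.0.0
EOF

# Then install into the active Python environment:
#   pip install -r requirements.txt
```

Installing inside a fresh virtual environment keeps these constraints from clashing with other projects.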

Baselines

Some baselines, especially those using entity markers, come from the RE_improved_baseline project.

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir -p results/tacred/train results/tacred/val results/tacred/test
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir -p results/tacrev/train results/tacrev/val results/tacrev/test
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir -p results/retacred/train results/retacred/val results/retacred/test
cd code_script
bash run_large_retacred.sh
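
The three setups share the same directory layout, so the result folders can be created in one pass. This is a minimal sketch, assuming it is run from the repository root and that the run scripts expect the `results/<dataset>/<split>` layout used in the commands above; it does not launch any training.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create results/<dataset>/<split> for every dataset named in this README.
# Assumption: the current directory is the repository root.
for dataset in tacred tacrev retacred; do
  for split in train val test; do
    # -p creates missing parents and does not fail if the directory exists.
    mkdir -p "results/${dataset}/${split}"
  done
done
```

After this, each experiment only needs `cd code_script` followed by the matching `bash run_large_tacred.sh`, `bash run_large_tacrev.sh`, or `bash run_large_retacred.sh`.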

Owner

THUNLP
