Code for our SIGIR 2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning


P3 Ranker

Implementation for our SIGIR 2022 accepted paper:

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

Project Structure

├── commands
│   ├── bert.sh
│   ├── p3ranker.sh
│   ├── prop_ft.sh
│   ├── roberta.sh
│   └── t5v11.sh
├── Prefinetune
│   ├── mnli_dataloader.py
│   ├── mnli_dataset.py
│   ├── mnli_model.py
│   ├── README.md
│   ├── train_mnli.sh
│   ├── train_nq.sh
│   ├── train.py
│   └── utils.py
├── src
│   ├── data
│   │    ├── datasets
│   │    │   ├── __init__.py
│   │    │   ├── bert_dataset.py
│   │    │   ├── bertmaxp_dataset.py
│   │    │   ├── dataset.py
│   │    │   ├── edrm_dataset.py
│   │    │   ├── roberta_dataset.py
│   │    │   └── t5_dataset.py
│   │    └── tokenizers
│   │        ├── __init__.py
│   │        ├── tokenizer.py
│   │        └── word_tokenizer.py
│   ├── extractors
│   │    ├── __init__.py
│   │    └── classic_extractor.py
│   ├── metrics
│   │    ├── __init__.py
│   │    └── metric.py
│   ├── models
│   │    ├── __init__.py
│   │    ├── bert_maxp.py
│   │    ├── bert_prompt_.py
│   │    ├── bert.py
│   │    ├── conv_knrm.py
│   │    ├── edrm.py
│   │    ├── knrm.py
│   │    ├── t5.py
│   │    └── tk.py
│   ├── modules
│   │    ├── attentons
│   │    │   ├── __init__.py
│   │    │   ├── multi_head_attention.py
│   │    │   └── scaled_dot_product_attention.py
│   │    ├── embedders
│   │    │   ├── __init__.py
│   │    │   └── embedder.py
│   │    ├── encoders
│   │    │   ├── __init__.py
│   │    │   ├── cnn_encoder.py
│   │    │   ├── feed_forward_encoder.py
│   │    │   ├── positional_encoder.py
│   │    │   └── transformer_encoder.py
│   │    └── matchers
│   │        ├── __init__.py
│   │        └── kernel_matcher.py
│   ├── __init__.py
│   └── utils.py
├── README.md
├── requirements.txt
├── train.py
└── utils.py 

Prerequisites

Install dependencies:

git clone https://github.com/NEUIR/P3Ranker.git
cd P3Ranker
pip install -r requirements.txt

Data Preparation

We will release our few-shot dataset soon.

Prompt Generation

Details about discrete prompt generation can be found at https://github.com/princeton-nlp/LM-BFF and in our paper.

Prefinetune

cd Prefinetune

The README.md in that directory explains how to run pre-finetuning.

Reproduce our results

Running the scripts stored in './commands' reproduces our results. One example is shown below:

bash commands/bert.sh 5

The command above reproduces the results for the 5-q few-shot scenario described in our paper.
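
As a rough sketch of how the same invocation pattern might be applied to the other scripts listed under commands/, the Python snippet below builds one command line per script and prints it. It assumes that every script takes the few-shot size as its single positional argument, as commands/bert.sh does in the example; the README does not confirm this for the other scripts. The helper name build_commands is hypothetical and is not part of this repository.

```python
import shlex
from pathlib import PurePosixPath

# Script names taken from the project tree above.
SCRIPTS = ["bert.sh", "p3ranker.sh", "prop_ft.sh", "roberta.sh", "t5v11.sh"]


def build_commands(n_queries, scripts=SCRIPTS, commands_dir="commands"):
    """Return one argv list per script, e.g. ['bash', 'commands/bert.sh', '5'].

    Assumption: each script accepts the few-shot size as its only positional
    argument, mirroring `bash commands/bert.sh 5`.
    """
    if n_queries <= 0:
        raise ValueError("n_queries must be a positive integer")
    return [
        ["bash", str(PurePosixPath(commands_dir) / name), str(n_queries)]
        for name in scripts
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so nothing is launched by accident.
    for argv in build_commands(5):
        print(shlex.join(argv))
```

To actually launch one of these commands, its argv list can be passed to subprocess.run(argv, check=True) from the repository root.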

Contact

Please send email to [email protected].
