Paradigm Shift in NLP - "Paradigm Shift in Natural Language Processing".

Overview

Paradigm Shift in NLP

Welcome to the webpage for "Paradigm Shift in Natural Language Processing". Some resources of the paper are constantly maintained here, such as a full list of papers of paradigm shift, an interactive Sankey diagram to depict the trend of paradigm shift, etc.

What is paradigm shift?

First of all, what is paradigm, and what is paradigm shift?

Paradigm is the general framework to model a class of tasks. For example, sequence labeling (SeqLab) is a popular paradigm to solve named entity recognition (NER). We summarize the mainstream paradigms that are widely used for common NLP tasks as: Class, Matching, SeqLab, MRC, Seq2Seq, Seq2ASeq, (M)LM.

Paradigm shift is a phenomena of solving a task that is usually solved with some paradigm with another paradigm. For example, Li et al. (2020) uses the MRC paradigm to solve NER, which is previously solved with SeqLab, then we can say that the paradigm of NER shifted from SeqLab to MRC.

The figure below shows the observed shift (or transfer) of the seven paradigms in recent years.

Paradigm shift in NLP tasks

We collect the papers of paradigm shift in the table below, which is an extension of the Table 1 in our original paper. This table will be constantly updated.

Task Class Matching SeqLab MRC Seq2Seq Seq2ASeq (M)LM
TC Kim 2014;
Liu et al. 2016;
Devlin et al. 2019
Chai et al. 2020;
Yin et al. 2020;
Wang et al. 2021;
Yang et al. 2018 Brown et al. 2020;
Schick&Schutze 2021;
Schick&Schutze 2021;
Gao et al. 2021
NLI Devlin et al. 2019 Chen et al. 2017 McCann et al. 2018 Schick&Schutze 2021;
Schick&Schutze 2021;
Gao et al. 2021
NER Xia et al. 2019;
Fisher&Vlachos 2019;
Yu et al. 2020;
Fu et al. 2021
Ma&Hovy 2016;
Lample 2016
Li et al. 2020 Yan et al. 2021 Lample et al. 2016;
Dai et al. 2020
Ma et al. 2021
ABSA Wang et al. 2016 Sun et al. 2019 Mao et al. 2021
Chen et al. 2021
Yan et al. 2021;
Zhang et al. 2021
Li et al. 2021
RE Zeng et al. 2014 Levy et al. 2017;
Li et al. 2019;
Zhao et al. 2020
Han et al. 2021
Summ Zhong et al. 2020 Cheng&Lapata 2016 McCann et al. 2018 Aghajanyan et al. 2021
Parsing Rodríguez&Vilares 2018;
Strzyz et al. 2019;
Vilares&Rodríguez 2020;
Vacareanu et al. 2020;
Gan et al. 2021 Vinyals et al. 2015;
Li et al. 2018;
Rongali et al. 2020
Chen et al. 2014;
Dyer et al. 2015;
Choe&Charniak 2016

Trends

To intuitively depict the trend of paradigm shift in NLP, we also draw an interactive Sankey diagram, which is an extension of the Figure 2 in our original paper. Also, this diagram is constantly updated as the table above changed.

Contributing

This line of research is difficult to be comprehensively surveyed, so welcome any additions, modifications, and suggestions! Please feel free to submit pull request or directly contact me.

Citation

If you find this webpage or the paper helpful to your research, please cite our paper:

@article{sun2021paradigmshift,
  title={Paradigm Shift in Natural Language Processing}, 
  author={Tianxiang Sun and Xiangyang Liu and Xipeng Qiu and Xuanjing Huang},
  journal={arXiv preprint arXiv:2109.12575},
  year={2021}
}
Owner
Tianxiang Sun
@FudanNLP
Tianxiang Sun
DaCy: The State of the Art Danish NLP pipeline using SpaCy

DaCy: A SpaCy NLP Pipeline for Danish DaCy is a Danish preprocessing pipeline trained in SpaCy. At the time of writing it has achieved State-of-the-Ar

Kenneth Enevoldsen 71 Jan 06, 2023
Natural Language Processing Tasks and Examples.

Natural Language Processing Tasks and Examples With the advancement of A.I. technology in recent years, natural language processing technology has bee

Soohwan Kim 53 Dec 20, 2022
Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Realistic Few-Shot Relation Extraction This repository contains code to reproduce the results in the paper "Towards Realistic Few-Shot Relation Extrac

Bloomberg 8 Nov 09, 2022
BERT score for text generation

BERTScore Automatic Evaluation Metric described in the paper BERTScore: Evaluating Text Generation with BERT (ICLR 2020). News: Features to appear in

Tianyi 1k Jan 08, 2023
This program do translate english words to portuguese

Python-Dictionary This program is used to translate english words to portuguese. Web-Scraping This program use BeautifulSoap to make web scraping, so

João Assalim 1 Oct 10, 2022
PyTorch implementation of the paper: Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding This repository contains the official PyTorch implementation of th

Xiao Xu 26 Dec 14, 2022
多语言降噪预训练模型MBart的中文生成任务

mbart-chinese 基于mbart-large-cc25 的中文生成任务 Input source input: text + /s + lang_code target input: lang_code + text + /s Usage token_ids_mapping.jso

11 Sep 19, 2022
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

GPT2-Pytorch with Text-Generator Better Language Models and Their Implications Our model, called GPT-2 (a successor to GPT), was trained simply to pre

Tae-Hwan Jung 775 Jan 08, 2023
This is a GUI program that will generate a word search puzzle image

Word Search Puzzle Generator Table of Contents About The Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing Cont

11 Feb 22, 2022
Natural Language Processing Specialization

Natural Language Processing Specialization In this folder, Natural Language Processing Specialization projects and notes can be found. WHAT I LEARNED

Kaan BOKE 3 Oct 06, 2022
Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset

KoSimCSE Korean Simple Contrastive Learning of Sentence Embeddings implementation using pytorch SimCSE Installation git clone https://github.com/BM-K/

34 Nov 24, 2022
Sentello is python script that simulates the anti-evasion and anti-analysis techniques used by malware.

sentello Sentello is a python script that simulates the anti-evasion and anti-analysis techniques used by malware. For techniques that are difficult t

Malwation 62 Oct 02, 2022
Clone a voice in 5 seconds to generate arbitrary speech in real-time

This repository is forked from Real-Time-Voice-Cloning which only support English. English | 中文 Features 🌍 Chinese supported mandarin and tested with

Weijia Chen 25.6k Jan 06, 2023
A Plover python dictionary allowing for consistent symbol input with specification of attachment and capitalisation in one stroke.

Emily's Symbol Dictionary Design This dictionary was created with the following goals in mind: Have a consistent method to type (pretty much) every sy

Emily 68 Jan 07, 2023
Reproduction process of BERT on SST2 dataset

BERT-SST2-Prod Reproduction process of BERT on SST2 dataset 安装说明 下载代码库 git clone https://github.com/JunnYu/BERT-SST2-Prod 进入文件夹,安装requirements pip ins

yujun 1 Nov 18, 2021
Automated question generation and question answering from Turkish texts using text-to-text transformers

Turkish Question Generation Offical source code for "Automated question generation & question answering from Turkish texts using text-to-text transfor

Open Business Software Solutions 29 Dec 14, 2022
Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

41 Jan 03, 2023
[EMNLP 2021] Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels.

[EMNLP 2021] Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels.

Cambridge Language Technology Lab 61 Dec 10, 2022
Opal-lang - A WIP programming language based on Python

thanks to aphitorite for the beautiful logo! opal opal is a WIP transcompiled pr

3 Nov 04, 2022
Code for the paper PermuteFormer

PermuteFormer This repo includes codes for the paper PermuteFormer: Efficient Relative Position Encoding for Long Sequences. Directory long_range_aren

Peng Chen 42 Mar 16, 2022