One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Last update: Dec 11, 2022

Related tags

Deep Learning DMRST_Parser

Overview

Introduction

One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".
Users can apply it to parse the input text from scratch, and get the EDU segmentations and the parsed tree structure.
The model supports both sentence-level and document-level RST discourse parsing.
This repo and the pre-trained model is only for research use.

Package Requirements

pytorch==1.7.1
transformers==4.8.2

Supported Languages

We trained and evaluated the model with the multilingual collection of RST discourse treebanks, and it natively supports 6 languages: English, Portuguese, Spanish, German, Dutch, Basque. Interested users can also try other languages.

Data Format

[Input] InputSentence: The input document/sentence, and the raw text will be tokenizaed and encoded by the xlm-roberta-base language backbone. '|| ' denotes the EDU boundary positions.
- Although the report, || which has released || before the stock market opened, || didn't trigger the 190.58 point drop in the Dow Jones Industrial Average, || analysts said || it did play a role in the market's decline. ||
[Output] EDU_Breaks: The indices of the EDU boundary tokens, including the last word of the sentence.
- [2, 5, 10, 22, 24, 33]
[Output] tree_parsing_output: The model outputs of the discourse parsing tree follow this format.
- (1:Satellite=Contrast:4,5:Nucleus=span:6) (1:Nucleus=Same-Unit:3,4:Nucleus=Same-Unite:4) (5:Satellite=Attribution:5,6:Nucleus=span:6) (1:Satellite=span:1,2:Nucleus=Elaboration:3) (2:Nucleus=span:2,3:Satellite=Temporal:3)

How to use it for parsing

Put the text paragraph to the file ./data/text_for_inference.txt.
Run the script MUL_main_Infer.py to obtain the RST parsing result. See the script for detailed model output.
We recommend users to run the parser on a GPU-equipped environment.

Citation

@article{liu2021dmrst,
  title={DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy F},
  journal={arXiv preprint arXiv:2110.04518},
  year={2021}
}

@inproceedings{liu2020multilingual,
  title={Multilingual Neural RST Discourse Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={6730--6738},
  year={2020}
}

One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Related tags

Overview

Introduction

Package Requirements

Supported Languages

Data Format

How to use it for parsing

Citation

Owner

seq-to-mind

Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.

The VeriNet toolkit for verification of neural networks

TensorFlow (Python API) implementation of Neural Style

Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch

PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP

Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection"

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects (CVPR 2021)

(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.

Learning Versatile Neural Architectures by Propagating Network Codes

Official implementation of the paper ``Unifying Nonlocal Blocks for Neural Networks'' (ICCV'21)

DecoupledNet is semantic segmentation system which using heterogeneous annotations

HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records

Keras Implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation by (Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio)

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

Orange Chicken: Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation

Project for music generation system based on object tracking and CGAN

Get started with Machine Learning with Python - An introduction with Python programming examples

Unoffical implementation about Image Super-Resolution via Iterative Refinement by Pytorch