This repository includes the code for the sequence-to-sequence model for discontinuous constituent parsing described in the paper Discontinuous Grammar as a Foreign Language.

Last update: Apr 07, 2022

Overview

Discontinuous Grammar as a Foreign Language

This repository includes the code for the sequence-to-sequence model for discontinuous constituent parsing described in the paper Discontinuous Grammar as a Foreign Language. In particular, it uses the in-order+SWAP linearization to handle discontinuities and achieves 95.47 F1 on the English Discontinuous Penn Treebank (DPTB). This implementation is based on the system by Fernandez Astudillo et al. (2020) and reuses part of its code.
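To give a rough idea of what a shift-reduce linearization looks like, the following is a minimal sketch of a plain in-order traversal for a continuous tree. It is an illustration only and is not taken from this repository: the action labels (`SHIFT`, `NT-X`, `REDUCE`) and the tuple-based tree encoding are assumptions made for the example, and the SWAP action that the paper adds to handle discontinuities is deliberately left out.

```python
def in_order_actions(tree):
    """Linearize a continuous constituent tree with an in-order traversal.

    `tree` is either a word (str) or a tuple (label, child, child, ...).
    This encoding and the action names are hypothetical, chosen for the sketch.
    """
    if isinstance(tree, str):
        # A word is moved from the buffer to the stack.
        return ["SHIFT"]
    label, first, *rest = tree
    # In-order: visit the first child, then announce the non-terminal,
    # then visit the remaining children, then close the constituent.
    actions = in_order_actions(first)
    actions.append("NT-" + label)
    for child in rest:
        actions.extend(in_order_actions(child))
    actions.append("REDUCE")
    return actions


example = ("S", ("NP", "The", "cat"), ("VP", "sat"))
print(" ".join(in_order_actions(example)))
```

A sequence-to-sequence parser is trained to emit such an action sequence given the input sentence. For discontinuous trees, the words additionally have to be reordered, which this sketch does not attempt.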

Requirements

This implementation was tested on Python 3.6.9, PyTorch 1.1.0 and CUDA 9.0.176. To install the dependencies, run the following commands:

    cd Disco-Seq2seq-Parser
    pip install -r requirements.txt
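After installing, it can help to compare the local environment with the versions listed above. The snippet below is a small sketch that is not part of this repository; the function names are made up for the example, and it only reports differences without judging whether another version would work.

```python
import sys

# Versions this README reports as tested.
TESTED = {"python": "3.6.9", "torch": "1.1.0", "cuda": "9.0.176"}


def installed_versions():
    """Return the versions found locally (None where unavailable)."""
    found = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return found
    found["torch"] = str(torch.__version__)
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


def compare(found, tested=TESTED):
    """Map each component to (installed, tested, exact_match)."""
    return {name: (found.get(name), want, found.get(name) == want)
            for name, want in tested.items()}


if __name__ == "__main__":
    for name, (have, want, same) in compare(installed_versions()).items():
        print(f"{name}: installed={have} tested={want} "
              f"{'match' if same else 'differs'}")
```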

For evaluation, disco-dop must also be installed by following the steps described in https://github.com/andreasvc/disco-dop.
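As background on the reported metric, the F1 score for discontinuous parsing is commonly computed over labeled constituents, where each constituent is identified by its label and the set of word positions it covers. The following is a simplified sketch of that idea, written for illustration. It is not disco-dop's implementation, and it ignores the evaluation parameter files and any special treatment of punctuation or root nodes that a real evaluation may apply.

```python
from collections import Counter


def labeled_f1(gold, pred):
    """F1 over labeled constituents.

    `gold` and `pred` are iterables of (label, frozenset_of_word_indices).
    Using index sets instead of (start, end) spans lets a constituent
    cover non-adjacent words, as discontinuous constituents do.
    """
    gold_counts, pred_counts = Counter(gold), Counter(pred)
    matched = sum((gold_counts & pred_counts).values())
    n_gold, n_pred = sum(gold_counts.values()), sum(pred_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [("S", frozenset({0, 1, 2, 3})), ("VP", frozenset({0, 2})), ("NP", frozenset({1}))]
pred = [("S", frozenset({0, 1, 2, 3})), ("VP", frozenset({0, 1})), ("NP", frozenset({1}))]
print(round(labeled_f1(gold, pred), 4))
```

In this toy example the gold VP covers words 0 and 2 (a discontinuous constituent) while the predicted VP covers words 0 and 1, so two of the three constituents match.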

Data

To obtain shift-reduce linearizations from a discontinuous constituent treebank (for instance, the DPTB), place the train, dev and test splits, in discbracket format, in the disco_data folder and name them train.discbracket, dev.discbracket and test.discbracket. Then run the following script:

    ./linearization/generate.sh DPTB
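Before running the script, a quick check of the expected file layout can save a failed run. The sketch below is not part of this repository and its helper names are invented for the example. It assumes the usual discbracket convention in which each leaf is written as `index=word`, and it uses a simple regular expression rather than a full parser.

```python
import re
from pathlib import Path

SPLITS = ("train", "dev", "test")
# Assumed leaf shape in discbracket: "<index>=<word>", e.g. "2=rich".
LEAF = re.compile(r"(\d+)=([^\s()]+)")


def missing_splits(data_dir="disco_data"):
    """List the expected split files that are absent from `data_dir`."""
    return [f"{split}.discbracket" for split in SPLITS
            if not (Path(data_dir) / f"{split}.discbracket").is_file()]


def leaves(tree_line):
    """Return (index, word) pairs in the order they appear in the bracketing."""
    return [(int(index), word) for index, word in LEAF.findall(tree_line)]


def has_complete_indices(tree_line):
    """Check that the leaf indices are exactly 0..n-1, each used once."""
    indices = sorted(index for index, _ in leaves(tree_line))
    return indices == list(range(len(indices)))


if __name__ == "__main__":
    print("missing:", missing_splits())
    sample = "(S (VP (VB 0=is) (JJ 2=rich)) (NP 1=John) (? 3=?))"
    print(leaves(sample), has_complete_indices(sample))
```

In the sample tree the VP covers words 0 and 2 while word 1 belongs to the NP, which is what makes the tree discontinuous.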

Experiments

To train a model for the DPTB treebank, just execute the following script:

    ./scripts/stack-transformer/con_experiment.sh configs/ptb_roberta.large.sh

To test the trained model on the test split, please run the following command:

    ./scripts/stack-transformer/con_test-test.sh \
        configs/test_roberta_large.sh \
        DATA/dep-parsing/models/DPTB_RoBERTa-large_stnp6x6-seed44/checkpoint_top3-average.pt \
        DATA/dep-parsing/models/DPTB_RoBERTa-large_stnp6x6-seed44/epoch-tests-test/dec-checkpoint-top3-average

Citation

    @misc{fernándezgonzález2021discontinuous,
      title={Discontinuous Grammar as a Foreign Language},
      author={Daniel Fernández-González and Carlos Gómez-Rodríguez},
      year={2021},
      eprint={2110.10431},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }

Acknowledgments

We acknowledge the European Research Council (ERC), which has funded this research under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150), MINECO (ANSWER-ASAP, TIN2017-85160-C2-1-R), MICINN (SCANNER, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de Investigación de Galicia "CITIC", funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014-2020 Program), by grant ED431G 2019/01.

Owner

Daniel Fernández-González
