code for modular summarization work published in ACL2021 by Krishna et al

Overview

This repository contains the code for running modular summarization pipelines as described in the publication Krishna K, Khosla K, Bigham J, Lipton ZC. Generating SOAP Notes from Doctor-Patient Conversations." ACL 2021.

Although we can not release models trained on the confidential medical data, we have released models trained on the publicly available AMI dataset.

Download the ami_models folder from the link given below and put it at the root of the repository: https://drive.google.com/drive/folders/12Fkv_JvhJotvDTk2Z5PZYs4POIkqmwBZ?usp=sharing

To reproduce the results, you need to follow the following 3 steps.

# step1: downloads and preprocesses AMI dataset
./prepare_data.sh

 # step2: runs the summarization pipelines on the data and computes rouge scores
 # (before running this command, you need to download the models as shown above  )
./predict_ami.sh

# step3: print the results
python show_results.py
Owner
Kundan Krishna
Kundan Krishna
A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

420 Dec 28, 2022
Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

2 Dec 29, 2022
ACL'22: Structured Pruning Learns Compact and Accurate Models

☕ CoFiPruning: Structured Pruning Learns Compact and Accurate Models This repository contains the code and pruned models for our ACL'22 paper Structur

Princeton Natural Language Processing 130 Jan 04, 2023
Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022)

SyntaxGen Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022) In this repo, we upload all the scripts for this work. Due to siz

Zhuosheng Zhang 3 Jun 13, 2022
this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain

1 Nov 02, 2021
An implementation of WaveNet with fast generation

pytorch-wavenet This is an implementation of the WaveNet architecture, as described in the original paper. Features Automatic creation of a dataset (t

Vincent Herrmann 858 Dec 27, 2022
To classify the News into Real/Fake using Features from the Text Content of the article

Hoax-Detector Authenticity of news has now become a major problem. The Idea is to classify the News into Real/Fake using Features from the Text Conten

Aravindhan 1 Feb 09, 2022
CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus CVSS is a massively multilingual-to-English speech-to-speech translation corpus, co

Google Research Datasets 118 Jan 06, 2023
This repository is home to the Optimus data transformation plugins for various data processing needs.

Transformers Optimus's transformation plugins are implementations of Task and Hook interfaces that allows execution of arbitrary jobs in optimus. To i

Open Data Platform 37 Dec 14, 2022
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

simple_diarizer Simplified diarization pipeline using some pretrained models. Made to be a simple as possible to go from an input audio file to diariz

Chau 65 Dec 30, 2022
Write Python in Urdu - اردو میں کوڈ لکھیں

UrduPython Write simple Python in Urdu. How to Use Write Urdu code in سامپل۔پے The mappings are as following: "۔": ".", "،":

Saad A. Bazaz 26 Nov 27, 2022
Smart discord chatbot integrated with Dialogflow to manage different classrooms and assist in teaching!

smart-school-chatbot Smart discord chatbot integrated with Dialogflow to interact with students naturally and manage different classes in a school. De

Tom Huynh 5 Oct 24, 2022
DataCLUE: 国内首个以数据为中心的AI测评(含模型分析报告)

DataCLUE 以数据为中心的AI测评(DataCLUE) DataCLUE: A Chinese Data-centric Language Evaluation Benchmark 内容导引 章节 描述 简介 介绍以数据为中心的AI测评(DataCLUE)的背景 任务描述 任务描述 实验结果

CLUE benchmark 135 Dec 22, 2022
A cross platform OCR Library based on PaddleOCR & OnnxRuntime

A cross platform OCR Library based on PaddleOCR & OnnxRuntime

RapidOCR Team 767 Jan 09, 2023
keras implement of transformers for humans

keras implement of transformers for humans

苏剑林(Jianlin Su) 4.8k Jan 03, 2023
Auto translate textbox from Japanese to English or Indonesia

priconne-auto-translate Auto translate textbox from Japanese to English or Indonesia How to use Install python first, Anaconda is recommended Install

Aji Priyo Wibowo 5 Aug 25, 2022
Pipelines de datos, 2021.

Este repo ilustra un proceso sencillo de automatización de transformación y modelado de datos, a través de un pipeline utilizando Luigi. Stack princip

Rodolfo Ferro 8 May 19, 2022
Official code for "Parser-Free Virtual Try-on via Distilling Appearance Flows", CVPR 2021

Parser-Free Virtual Try-on via Distilling Appearance Flows, CVPR 2021 Official code for CVPR 2021 paper 'Parser-Free Virtual Try-on via Distilling App

395 Jan 03, 2023
Toward Model Interpretability in Medical NLP

Toward Model Interpretability in Medical NLP LING380: Topics in Computational Linguistics Final Project James Cross ( 1 Mar 04, 2022