Grading tools for Advanced NLP (11-711)

Installation

You'll need Docker and unzip to use this repo. For Docker, follow the official guide to get started. For unzip, you can install it on Ubuntu with sudo apt-get install unzip.

Install the Python package with:
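Before installing, it can help to confirm that both tools are on your PATH. The snippet below is a minimal sketch and not part of the toolkit; check_tool is a hypothetical helper name introduced here for illustration.

```shell
# check_tool is a hypothetical helper, not part of anlp-grading-tools.
# It reports whether a command-line tool can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two prerequisites named in this README.
missing=0
for tool in docker unzip; do
    check_tool "$tool" || missing=1
done

if [ "$missing" -eq 0 ]; then
    echo "all prerequisites found"
else
    echo "install the missing tools before continuing"
fi
```

The check only looks for the executables; it does not verify that the Docker daemon is running or that your user is allowed to talk to it.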

git clone https://github.com/ProKil/anlp-grading-tools
cd anlp-grading-tools
pip install -e .

Usage

To evaluate your code, first set the following environment variables in test.sh.

ANLP_TMP_DIR: create a new folder, e.g. mkdir tmp, and set this variable to the absolute path of that folder.

SUBMISSION_DIR: this should point to the folder containing your submission zip file. Note that the toolkit will automatically evaluate all zip files in the folder.

SCORES_DIR: this should point to an empty folder. Your score will be logged in a text file there.

DATA_DIR: this should point to the data folder of minnn-assignment. Please also copy the original minnn-assignment/classifier.py to minnn-assignment/data/classifier_orig.py, so that the toolkit can test whether your code runs with the original classifier.

Example code to prepare the folders:

mkdir tmp
mkdir scores
cp -r path/to/minnn-assignment/data ./
cp path/to/minnn-assignment/classifier.py data/classifier_orig.py
mkdir submission
cp your/submission.zip submission
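With the folders prepared as in the example, the four variables could be set as sketched below. This assumes you run it from the directory that contains tmp, scores, data, and submission. How test.sh actually declares these variables is not shown in this README, so treat the export lines as an illustration of the values rather than the file's exact syntax. Only ANLP_TMP_DIR is stated to need an absolute path; using absolute paths for all four is a choice made here for consistency.

```shell
# Assumption: the current directory holds the tmp, scores, data, and
# submission folders created in the preparation example.
base_dir="$PWD"

export ANLP_TMP_DIR="$base_dir/tmp"          # scratch folder (absolute path)
export SUBMISSION_DIR="$base_dir/submission" # folder holding the submission zip files
export SCORES_DIR="$base_dir/scores"         # empty folder that receives the score files
export DATA_DIR="$base_dir/data"             # copy of minnn-assignment/data

# Print the values so they can be compared against test.sh.
printf '%s\n' "$ANLP_TMP_DIR" "$SUBMISSION_DIR" "$SCORES_DIR" "$DATA_DIR"
```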

Now you can evaluate your code by running bash test.sh. Afterwards, your scores are in SCORES_DIR/andrewid. It is normal to get 0s for the last two scores, because the correct labels for the IMDb test set are not available, but you should get reasonable accuracies for the first two (around 40).
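Because the toolkit evaluates every zip file in SUBMISSION_DIR, SCORES_DIR may end up holding more than one score file. The following sketch prints each one under a small header. show_scores is a hypothetical helper introduced here, and the fallback to ./scores is an assumption based on the folder preparation example; the internal format of the score files is not described in this README, so the files are printed verbatim.

```shell
# show_scores is a hypothetical helper, not part of anlp-grading-tools.
# It prints every regular file in the given folder, preceded by its name.
show_scores() {
    for f in "$1"/*; do
        # Skip the unexpanded glob (empty folder) and anything that is not a file.
        [ -f "$f" ] || continue
        echo "== $(basename "$f") =="
        cat "$f"
    done
}

# Assumption: fall back to ./scores when SCORES_DIR is not set in this shell.
show_scores "${SCORES_DIR:-./scores}"
```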

Troubleshooting

You may find that writing files inside ANLP_TMP_DIR and SCORES_DIR requires elevated permissions. You can either use sudo, or open a shell inside the container with docker run -v FOLDER_TO_WRITE:/mnt -it --entrypoint /bin/bash anlp and then cd /mnt to write those files.
You may also run into other permission issues with Docker. Please refer to this page to use Docker without sudo.
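A quick way to see whether sudo-free Docker is already set up is to check your group membership. The sketch below assumes the page referred to here describes the usual approach from Docker's Linux post-installation steps, namely adding your user to the docker group; in_group is a hypothetical helper introduced for illustration. The snippet only prints a suggestion and changes nothing on your system.

```shell
# in_group is a hypothetical helper, not part of anlp-grading-tools.
# It succeeds if the space-separated group list in $2 contains group $1.
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG lists the names of all groups the current user belongs to.
if in_group docker "$(id -nG)"; then
    echo "current user is in the docker group"
else
    echo "current user is not in the docker group; consider running:"
    echo "  sudo usermod -aG docker \"\$USER\"   # then log out and back in"
fi
```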

Owner

Hao Zhu