Sequence-tagging using deep learning

Last update: Dec 20, 2022

Overview

Classification using Deep Learning

Requirements

PyTorch version >= 1.9.1+cu111
Python version >= 3.8.10
PyTorch-Lightning version >= 1.4.9
Huggingface Transformers version >= 4.11.3
Tensorboard version >= 2.6.0
Pandas >= 1.3.4
Scikit-learn: numpy>=1.14.6, scipy>=1.1.0, threadpoolctl>=2.0.0, joblib>=0.11

Installation

pip3 install transformers
pip3 install pytorch-lightning
pip3 install tensorboard
pip3 install pandas
pip3 install scikit-learn
git clone https://github.com/vineetk1/clss.git
cd clss

Note that the default directory is clss. Unless otherwise stated, all commands from the Command-Line-Interface must be delivered from the default directory.

Download the dataset

Create a data directory.

mkdir data

Download a dataset in the data directory.

Saving all informtion and results of an experiment

All information about the experiment is stored in a unique directory whose path starts with tensorboard_logs and ends with a unique version-number. Its contents consist of hparams.yaml, hyperperameters_used.yaml, test-results.txt, events.* files, and a checkpoints directory that has one or more checkpoint-files.

Train, validate, and test a model

Following command trains a model, saves the last checkpoint plus checkpoints that have the lowest validation loss, runs the test dataset on the checkpointed model with the lowest validation loss, and outputs the results of the test:

python3 Main.py input_param_files/bert_seq_class

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class. An explanation on the contents of this file is at input_param_files/README.md. A list of all the hyper-parameters is in the PyTorch-Lightning documentation, and any hyper-parameter can be used.
To assist in Training, the two parameters auto_lr_find and auto_scale_batch_size in the file input_param_files/bert_seq_class enable the software to automatically find an initial Learning-Rate and a Batch-Size respectively.
As training progresses, graphs of "training-loss vs. epoch #", "validation-loss vs. epoch #", and "learning-rate vs. batch #" are plotted in real-time on the TensorBoard. Training is stopped by typing, at the Command-Line-Interface, the keystroke ctrl-c. The current training information is checkpointed, and training stops. Training can be resumed, at some future time, from the checkpointed file.
Dueing testing, the results are sent to the standard-output, and also saved in the *test-results.txt" file that include the following: general information about the dataset and the classes, confusion matrix, precision, recall, f1, average f1, and weighted f1.

Resume training, validation, and testing a model with same hyper-parameters

Resume training a checkpoint model with the same model- and training-states by using the following command:

python3 Main.py input_param_files/bert_seq_class-res_from_chkpt

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-res_from_chkpt. An explanation on the contents of this file is at input_param_files/README.md.

Change hyper-parameters and continue training, validation, and testing a model

Continue training a checkpoint model with the same model-state but different hyperparameters for the training-state by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt

The user-settable hyper-parameters are in the file input_param_filesbert_seq_class-ld_chkpt. An explanation on the contents of this file is at input_param_files/README.md.

Further test a checkpoint model with a new dataset

Test a checkpoint model by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt_and_test

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-ld_chkpt_and_test. An explanation on the contents of this file is at input_param_files/README.md.

Sequence-tagging using deep learning

Related tags

Overview

Classification using Deep Learning

Requirements

Installation

Download the dataset

Saving all informtion and results of an experiment

Train, validate, and test a model

Resume training, validation, and testing a model with same hyper-parameters

Change hyper-parameters and continue training, validation, and testing a model

Further test a checkpoint model with a new dataset

Owner

Vineet Kumar

The code for 'Deep Residual Fourier Transformation for Single Image Deblurring'

Minimal fastai code needed for working with pytorch

Internship Assessment Task for BaggageAI.

A novel Engagement Detection with Multi-Task Training (ED-MTT) system

AttGAN: Facial Attribute Editing by Only Changing What You Want (IEEE TIP 2019)

This repo provides the base code for pytorch-lightning and weight and biases simultaneous integration.

Cluttered MNIST Dataset

Transformer in Vision

Realtime segmentation with ENet, the fast and accurate segmentation net.

Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

Project page for the paper Semi-Supervised Raw-to-Raw Mapping 2021.

Automated Attendance Project Using Face Recognition

Code for reproducing experiments in "Improved Training of Wasserstein GANs"

CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

[CVPR 2021] Generative Hierarchical Features from Synthesizing Images

Generate saved_model, tfjs, tf-trt, EdgeTPU, CoreML, quantized tflite and .pb from .tflite.

The repo for reproducing Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study

Given a 2D triangle mesh, we could randomly generate cloud points that fill in the triangle mesh

Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .