Natural Language Processing Specialization

Overview



WHAT I LEARNED


  • Use logistic regression, naïve Bayes, and word vectors to implement sentiment analysis, complete analogies & translate words (a word-vector analogy sketch follows this list).

  • Use dynamic programming, hidden Markov models, and word embeddings to implement autocorrect, autocomplete & identify part-of-speech tags for words.

  • Use recurrent neural networks, LSTMs, GRUs & Siamese networks in Trax for sentiment analysis, text generation & named entity recognition.

  • Use encoder-decoder, causal, & self-attention to machine translate complete sentences, summarize text, and build chatbots & question-answering systems.
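
The analogy-completion task in the first bullet can be illustrated with a minimal NumPy sketch. The embeddings below are made-up toy vectors, not the pre-trained embeddings used in the courses, and `complete_analogy` is a hypothetical helper shown only to demonstrate the vector-arithmetic idea (b - a + c, nearest neighbor by cosine similarity):

```python
import numpy as np

# Toy embeddings for illustration only; the course uses pre-trained word vectors.
embeddings = {
    "king":  np.array([0.8, 0.65, 0.1]),
    "queen": np.array([0.8, 0.70, 0.9]),
    "man":   np.array([0.3, 0.20, 0.1]),
    "woman": np.array([0.3, 0.25, 0.9]),
}

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def complete_analogy(a, b, c, embeddings):
    """Solve 'a is to b as c is to ?' via vector arithmetic: b - a + c."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine_similarity(embeddings[w], target))

print(complete_analogy("man", "king", "woman", embeddings))  # expected: 'queen'
```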

There are 4 Courses in this Specialization


Course 1 - Natural Language Processing with Classification and Vector Spaces

In the first course of the Natural Language Processing Specialization:

  • I performed sentiment analysis of tweets using logistic regression and then naïve Bayes (a minimal logistic-regression sketch follows this list),

  • I used vector space models to discover relationships between words and used PCA to reduce the dimensionality of the vector space and visualize those relationships, and

  • I wrote a simple English to French translation algorithm using pre-computed word embeddings and locality-sensitive hashing to relate words via approximate k-nearest neighbor search.
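
A minimal sketch of the tweet sentiment-analysis step above, using scikit-learn's LogisticRegression on toy bag-of-words features rather than the from-scratch model and frequency features built in the course assignment; the tweets and labels are placeholder assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data; the course works with the NLTK twitter_samples corpus.
tweets = ["I love this movie :)", "great day, so happy",
          "this is terrible :(", "worst experience ever"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# Bag-of-words features (the course instead uses per-tweet positive/negative frequency counts).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(tweets)

model = LogisticRegression()
model.fit(X, labels)

print(model.predict(vectorizer.transform(["what a happy great movie"])))  # expected: [1]
```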

Projects


Course 2 - Natural Language Processing with Probabilistic Models

In the second course of the Natural Language Processing Specialization:

  • I wrote a simple auto-correct algorithm using minimum edit distance and dynamic programming (an edit-distance sketch follows this list),

  • I applied the Viterbi Algorithm for part-of-speech (POS) tagging, which is vital for computational linguistics,

  • I wrote a better auto-complete algorithm using an N-gram language model, and

  • I wrote my own Word2Vec model that uses a neural network to compute word embeddings using a continuous bag-of-words model.
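
A minimal sketch of the minimum-edit-distance step above. The dynamic-programming table is standard; the cost constants (insert 1, delete 1, replace 2) follow a common course convention and are assumptions you can change:

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, rep_cost=2):
    """Minimum edit distance via dynamic programming."""
    m, n = len(source), len(target)
    # D[i][j] = cost of converting source[:i] into target[:j]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + del_cost
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            r = 0 if source[i - 1] == target[j - 1] else rep_cost
            D[i][j] = min(D[i - 1][j] + del_cost,   # delete from source
                          D[i][j - 1] + ins_cost,   # insert into source
                          D[i - 1][j - 1] + r)      # replace (or match for free)
    return D[m][n]

print(min_edit_distance("play", "stay"))  # expected: 4 (two replacements at cost 2 each)
```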

Projects


Course 3 - Natural Language Processing with Sequence Models

In the third course of the Natural Language Processing Specialization:

  • I trained a neural network with GloVe word embeddings to perform sentiment analysis of tweets,

  • I generated synthetic Shakespeare text using a Gated Recurrent Unit (GRU) language model (a GRU sketch follows this list),

  • I trained a recurrent neural network to perform named entity recognition (NER) using LSTMs with linear layers, and

  • I used so-called ‘Siamese’ LSTM models to compare questions in a corpus and identify those that are worded differently but have the same meaning.
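
A minimal sketch of the GRU language-model idea from the text-generation bullet, written in PyTorch instead of the Trax library the courses use; layer sizes and names are illustrative assumptions, not the course architecture:

```python
import torch
import torch.nn as nn

class GRULanguageModel(nn.Module):
    """Character-level language model: embed -> GRU -> logits over the vocabulary."""
    def __init__(self, vocab_size, d_model=256, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.gru = nn.GRU(d_model, d_model, num_layers=n_layers, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, x, hidden=None):
        emb = self.embed(x)                   # (batch, seq_len, d_model)
        output, hidden = self.gru(emb, hidden)
        return self.out(output), hidden       # logits: (batch, seq_len, vocab_size)

# Toy usage: predict the next character at every position.
vocab_size = 128
model = GRULanguageModel(vocab_size)
tokens = torch.randint(0, vocab_size, (2, 17))      # batch of 2 toy sequences
inputs, targets = tokens[:, :-1], tokens[:, 1:]     # shift by one for next-char prediction
logits, _ = model(inputs)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())
```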

Projects


Course 4 - Natural Language Processing with Attention Models
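
This course covers the attention-based models summarized in the "WHAT I LEARNED" section: encoder-decoder, causal, and self-attention for machine translation, text summarization, question answering, and chatbots. As a hedged illustration of the shared building block, here is a minimal NumPy sketch of scaled dot-product attention with an optional causal mask; shapes and values are toy assumptions, not the course implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)    # (..., seq_q, seq_k)
    if mask is not None:                               # e.g. a causal mask for decoder self-attention
        scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy self-attention: 4 tokens with 8-dimensional queries/keys/values.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
causal_mask = np.tril(np.ones((4, 4), dtype=bool))     # each position attends only to itself and earlier ones
out = scaled_dot_product_attention(x, x, x, mask=causal_mask)
print(out.shape)  # (4, 8)
```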

Projects


Disclaimer


  • DeepLearning.AI makes course notes available for educational purposes.
  • Project solutions are provided for educational purposes only. I highly recommend trying to solve the project/programming assignments on your own.

All the best 🤘

Owner
Kaan BOKE
Data Scientist