Create a semantic search engine with a neural network (e.g. BERT) whose knowledge base can be updated

Last update: Mar 20, 2022

Overview

Neural Search

Description: Create a semantic search engine with a neural network (e.g. BERT) whose knowledge base can be updated. The engine can later be used for downstream NLP tasks such as Q&A, summarization, generation, and natural language understanding (NLU).

bert_search

Status: WIP
Source: https://towardsdatascience.com/building-a-search-engine-with-bert-and-tensorflow-c6fdc0186c8a
Description: Use a pre-trained BERT model checkpoint to build a general-purpose text feature extractor, which will be applied to a task of nearest neighbor search.
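The nearest neighbor step can be illustrated without BERT itself. The following is a minimal sketch, assuming the feature extractor has already turned each text into a fixed-length vector; the `VectorIndex` class, its method names, and the toy 3-dimensional vectors are hypothetical and are not taken from the referenced article. It uses brute-force cosine similarity, which is only practical for small collections.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class VectorIndex:
    """Hypothetical brute-force index; not part of the referenced article."""

    def __init__(self) -> None:
        self._texts: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, text: str, vector: Sequence[float]) -> None:
        """Append one document; this is how the knowledge base gets updated."""
        self._texts.append(text)
        self._vectors.append(_normalize(vector))

    def search(self, query: Sequence[float], k: int = 3) -> List[Tuple[str, float]]:
        """Return up to k stored texts, most similar to the query first."""
        q = _normalize(query)
        scored = [
            (text, sum(a * b for a, b in zip(q, vec)))
            for text, vec in zip(self._texts, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Toy 3-dimensional vectors stand in for real BERT features (assumption).
index = VectorIndex()
index.add("cats purr", [1.0, 0.1, 0.0])
index.add("stock markets fell", [0.0, 0.2, 1.0])
top_before = index.search([0.9, 0.3, 0.0], k=1)

# Adding a document later changes the result without rebuilding anything.
index.add("kittens meow", [0.9, 0.3, 0.1])
top_after = index.search([0.9, 0.3, 0.0], k=1)
```

Storing unit-length vectors keeps each comparison to a single dot product. For a large knowledge base, an approximate nearest neighbor library would likely replace the linear scan.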

bert_tfhub

Status: Completed
Source: https://www.tensorflow.org/hub/tutorials/bert_experts
Description: Use a matching preprocessing model to tokenize raw text and convert it to IDs, generate the pooled and sequence outputs from the token input IDs using the loaded BERT model, and compare the semantic similarity of the pooled outputs of different sentences.
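Comparing pooled outputs usually comes down to a pairwise similarity matrix. Below is a minimal sketch, assuming cosine similarity as the measure and using made-up 3-dimensional vectors in place of real pooled outputs; the function names `cosine_similarity` and `similarity_matrix` are illustrative and do not come from the tutorial.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Entry [i][j] is the cosine similarity of vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Placeholder vectors (assumption): the first two point in the same
# direction, the third is orthogonal to both.
pooled_outputs = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 0.0],
    [0.0, 0.0, 3.0],
]
matrix = similarity_matrix(pooled_outputs)
```

Each row of `matrix` shows how close one sentence is to every other sentence, with values near 1 on the diagonal. Because cosine similarity ignores vector length, the first two placeholder vectors count as identical in direction.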

finetune_bert

Status: Completed
Source: https://www.tensorflow.org/text/tutorials/fine_tune_bert
Description: Work through fine-tuning a BERT model using the tensorflow-models pip package. The pretrained BERT model is available on TensorFlow Hub.

text_summarization_encoderdecoder

Status: Abandoned (No source code to reference)
Source: https://towardsdatascience.com/text-summarization-from-scratch-using-encoder-decoder-network-with-attention-in-keras-5fa80d12710e
Description: Summarize text from news articles to generate meaningful headlines using an Encoder-Decoder with Attention in Keras.

Owner

Diego
