Quick insights from Zoom meeting transcripts using Graph + NLP

Last update: Sep 17, 2022

Overview

Transcript Analysis - Graph + NLP

This program extracts insights from Zoom meeting transcripts (.vtt) using TigerGraph and NLTK.

To run the program, edit the auth.ini file with your graph solution credentials and file paths, then run main.py. A sample transcript is provided, but feel free to add your own to the \a_raw_transcripts directory!
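As a rough illustration of the configuration step, the sketch below parses an auth.ini-style file with Python's standard configparser module. The section names (`tigergraph`, `paths`), every key, and the placeholder values are assumptions made for this example; check the auth.ini shipped with the repository for the real layout.

```python
import configparser

# Hypothetical auth.ini contents: the real file's sections and keys may differ.
SAMPLE_INI = """
[tigergraph]
host = https://your-solution.i.tgcloud.io
graphname = your_graph
username = your_username
password = your_password

[paths]
raw_dir = a_raw_transcripts
compact_dir = b_cmt_transcripts
"""


def load_settings(ini_text):
    """Parse INI text and return the settings as a nested dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_settings(SAMPLE_INI)
print(settings["paths"]["raw_dir"])
```

To read a file on disk instead of a string, `parser.read("auth.ini")` can replace the `read_string` call.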

The program currently performs the following tasks:

- Convert .vtt into a compact version (stored in \b_cmt_transcripts)
- NLP analysis of the compact transcript (using NLTK)
  - Sentiment analysis
  - Trigrams (collocations)
  - Frequency of words (plotted)
  - Meaningful words (shown as a word cloud)
  - Number of speakers, names of speakers
  - Who spoke the longest and the least, and the average speaking time
- Graph analysis of the compact transcript (using TigerGraph)
  - Analyze relationships between speakers
  - Who asked the most/fewest questions
  - Pair with the most back-and-forth
  - (TODO): Linking topics in a semantic graph
  - (TODO): Named-Entity Recognition
- Visual output of all determined insights
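The conversion and speaker-statistics steps can be sketched with the standard library alone. This is not the repository's code: the compact format used here (consecutive cues from the same speaker merged into one turn), the function names, and the sample transcript are all assumptions. It also assumes each cue's text starts with a `Speaker: ` label, as Zoom transcripts typically do.

```python
import re
from collections import defaultdict

# Made-up sample in the usual Zoom .vtt layout.
SAMPLE_VTT = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Alice: Hello everyone.

2
00:00:04.000 --> 00:00:06.000
Alice: Shall we get started?

3
00:00:06.000 --> 00:00:09.000
Bob: Sure, what is first on the agenda?
"""

CUE_TIMES = re.compile(r"(\d+):(\d+):(\d+)\.(\d+) --> (\d+):(\d+):(\d+)\.(\d+)")


def to_seconds(h, m, s, ms):
    """Convert timestamp fields (with 3-digit milliseconds) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(text):
    """Yield (speaker, duration_seconds, utterance) for each labeled cue."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        match = CUE_TIMES.match(line.strip())
        if not match or i + 1 >= len(lines):
            continue
        fields = match.groups()
        duration = to_seconds(*fields[4:]) - to_seconds(*fields[:4])
        speaker, sep, utterance = lines[i + 1].partition(": ")
        if sep:  # skip cues that carry no speaker label
            yield speaker.strip(), duration, utterance.strip()


def compact(cues):
    """Merge consecutive cues from the same speaker into one turn."""
    turns = []
    for speaker, duration, utterance in cues:
        if turns and turns[-1][0] == speaker:
            _, prev_duration, prev_text = turns[-1]
            turns[-1] = (speaker, prev_duration + duration, prev_text + " " + utterance)
        else:
            turns.append((speaker, duration, utterance))
    return turns


def speaking_time(turns):
    """Total speaking time in seconds per speaker."""
    totals = defaultdict(float)
    for speaker, duration, _ in turns:
        totals[speaker] += duration
    return dict(totals)


turns = compact(parse_vtt(SAMPLE_VTT))
totals = speaking_time(turns)
longest = max(totals, key=totals.get)
shortest = min(totals, key=totals.get)
average = sum(totals.values()) / len(totals)
print(len(totals), "speakers:", sorted(totals))
print("longest:", longest, "shortest:", shortest, "average seconds:", average)
```

Speaking time is measured from cue timestamps here; counting words per speaker would be an equally plausible definition of "spoke the longest".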

Usage

A TigerGraph Cloud Portal solution (https://tgcloud.io/) is required to run this program.

The GraphStudio link is: https://transcript-analysis.i.tgcloud.io/

The graph uses the following schema:

- Vertex: speaker
  - (PRIMARY ID) name - STRING
- Edge: asked_question
  - text - STRING
- Edge: answered_question
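To make the schema concrete, here is a pure-Python stand-in that builds the same vertex and edge types in memory and answers two of the graph questions listed in the overview. It does not use TigerGraph. The rule for creating edges (a turn containing "?" is a question directed at whoever speaks next, and that next turn counts as the answer) is an assumption for this sketch and may not match how the repository links speakers.

```python
from collections import Counter

# Made-up (speaker, utterance) turns standing in for a compact transcript.
TURNS = [
    ("Alice", "Shall we get started?"),
    ("Bob", "Sure. What is first on the agenda?"),
    ("Alice", "The release plan. Does that work for you?"),
    ("Bob", "Yes."),
    ("Carol", "Can I add one item?"),
    ("Alice", "Of course."),
]


def build_edges(turns):
    """Derive asked_question and answered_question edges from consecutive turns.

    asked edges are (asker, addressee, text), mirroring the `text` attribute;
    answered edges are (answerer, asker) and carry no attributes.
    """
    asked, answered = [], []
    for (speaker, text), (next_speaker, _) in zip(turns, turns[1:]):
        if "?" in text and speaker != next_speaker:
            asked.append((speaker, next_speaker, text))
            answered.append((next_speaker, speaker))
    return asked, answered


asked, answered = build_edges(TURNS)

# Speaker vertices are keyed by name, the primary ID in the schema.
speakers = {name for name, _ in TURNS}

# Who asked the most questions: count outgoing asked_question edges.
questions_per_speaker = Counter(src for src, _, _ in asked)
top_asker, _ = questions_per_speaker.most_common(1)[0]

# Pair with the most back-and-forth: count edges of either type per unordered pair.
pair_counts = Counter(frozenset(edge[:2]) for edge in asked + answered)
top_pair, _ = pair_counts.most_common(1)[0]

print("top asker:", top_asker)
print("most back-and-forth:", sorted(top_pair))
```

In the actual program these edges live in the TigerGraph solution, so the counts would come from graph queries rather than in-memory counters.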

Here is an example of the graph populated with the sample transcript provided:

Analysis

Here is a screenshot of the command-line output produced:

Here is a frequency chart of meaningful words generated:

Here is a word cloud that visualizes common, key terms:

More features are coming soon! In the meantime, feel free to keep creating and adding new insights 😁

Owner

Advit Deepak
