Conversational text Analysis using various NLP techniques

Last update: Jan 06, 2023

PyConverse

Let me try first

Installation

pip install pyconverse

Usage

To get started, try the basic usage notebook, which demonstrates the core functionality.

Introduction

Conversation analytics plays an increasingly important role in shaping customer experiences in industries such as finance and contact centres, primarily by helping organisations understand their customers and serve their needs better. PyConverse is an attempt to provide tools and methods for understanding conversations from multiple perspectives using various NLP techniques.

Why PyConverse?

For the past year or so I have been doing what can be called conversational text NLP, primarily with contact centre data from domains such as financial services, banking and insurance. I have not come across any interesting open-source tools that help in understanding conversational text, so I decided to create this library. It provides tools and methods to analyse calls and to answer the questions and compute the metrics that people usually want from conversations in contact centre data analysis.

Where can I use PyConverse?

The primary use case is contact centre call analytics, but most of the tools that PyConverse provides can be used elsewhere as well.

There are a lot of insights hidden in every call. PyConverse enables you to extract those insights and compute various KPIs covering operational efficiency, agent effectiveness and customer experience monitoring.

If you are looking to answer questions like these:

What was the overall sentiment exhibited by the speakers in the conversation?
Were there periods of dead air (silence) between the agent and the customer? If so, how much?
Was the agent empathetic towards the customer?
What was the average agent response time or average hold time?
What was being said on calls?

and more, PyConverse might be of some help.
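To make the dead-air and response-time questions concrete, here is a minimal sketch in plain Python. It does not use PyConverse's API. The `Utterance` record, the speaker labels, the 2-second silence threshold and both function names are assumptions made for illustration, and the sketch assumes utterances do not overlap in time.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Utterance:
    """Hypothetical transcript record; times are seconds from the start of the call."""
    speaker: str
    start: float
    end: float
    text: str


def dead_air(utterances, min_gap=2.0):
    """Return (gap_start, gap_end) pairs where nobody spoke for at least min_gap seconds."""
    ordered = sorted(utterances, key=lambda u: u.start)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start - prev.end >= min_gap:
            gaps.append((prev.end, cur.start))
    return gaps


def average_response_time(utterances, responder="agent"):
    """Mean delay between another speaker finishing and `responder` starting to reply."""
    ordered = sorted(utterances, key=lambda u: u.start)
    delays = [
        max(0.0, cur.start - prev.end)
        for prev, cur in zip(ordered, ordered[1:])
        if cur.speaker == responder and prev.speaker != responder
    ]
    return mean(delays) if delays else 0.0


call = [
    Utterance("customer", 0.0, 4.0, "Hi, I have a question about my bill."),
    Utterance("agent", 5.0, 9.0, "Sure, let me pull up your account."),
    Utterance("agent", 15.0, 18.0, "Thanks for holding, I can see it now."),
    Utterance("customer", 18.5, 21.0, "Great, thank you."),
    Utterance("agent", 22.5, 25.0, "You are welcome."),
]

print(dead_air(call))               # [(9.0, 15.0)]
print(average_response_time(call))  # 1.25
```

Real call data usually needs extra care, for example overlapping speech and hold music, which this sketch ignores.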

What can PyConverse do?

At the moment PyConverse can do a few things, which broadly fall into these categories:

Emotion identification
Empathetic statement identification
Call Segmentation
Topic identification from call segments
Compute various types of Speaker attributes:
1. Linguistic attributes such as word counts, number of words per utterance, negations etc.
2. Identify periods of silence & interruptions.
3. Question identification
4. Backchannel identification
Assess the overall nature of the speaker via linguistic attributes and tell if the Speaker is:
1. Talkative, verbally fluent
2. Informal/Personal/social
3. Goal-oriented or Forward/future-looking/focused on past
4. Identify inhibitions
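As a rough illustration of what per-speaker linguistic attributes can look like, the sketch below counts words, negations, questions and backchannels with simple rules. It is not how PyConverse computes them. The word lists, the function names and the `(speaker, text)` input format are all assumptions, and the question and backchannel checks are deliberately crude heuristics.

```python
import re

# Small illustrative word lists (assumptions, not taken from the library).
NEGATIONS = {"no", "not", "never", "none", "nothing", "cannot"}
QUESTION_STARTERS = {"what", "why", "how", "when", "where", "who", "which"}
BACKCHANNELS = {"uh huh", "yeah", "okay", "ok", "right", "i see", "mm hmm"}


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def is_question(text):
    """Heuristic: ends with '?' or starts with a wh-word."""
    tokens = tokenize(text)
    return text.strip().endswith("?") or (bool(tokens) and tokens[0] in QUESTION_STARTERS)


def is_backchannel(text):
    """Heuristic: the whole utterance is a short acknowledgement."""
    return " ".join(tokenize(text)) in BACKCHANNELS


def speaker_attributes(turns):
    """Aggregate simple counts per speaker from (speaker, text) pairs."""
    stats = {}
    for speaker, text in turns:
        tokens = tokenize(text)
        s = stats.setdefault(speaker, {
            "utterances": 0, "words": 0, "negations": 0,
            "questions": 0, "backchannels": 0,
        })
        s["utterances"] += 1
        s["words"] += len(tokens)
        s["negations"] += sum(t in NEGATIONS or t.endswith("n't") for t in tokens)
        s["questions"] += int(is_question(text))
        s["backchannels"] += int(is_backchannel(text))
    for s in stats.values():
        s["words_per_utterance"] = s["words"] / s["utterances"]
    return stats


turns = [
    ("agent", "How can I help you today?"),
    ("customer", "I can't log in and I did not get a reset email."),
    ("agent", "Uh huh."),
    ("customer", "Yeah."),
    ("agent", "Let me check that for you."),
]

stats = speaker_attributes(turns)
print(stats["customer"]["negations"])            # 2
print(stats["customer"]["words_per_utterance"])  # 6.5
```

Rule-based checks like these are easy to inspect but miss a lot; model-based classifiers are the usual alternative when labelled data or pretrained models are available.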

What Next?

Improve documentation.
Add more use case notebooks/examples.
Improve some of the functionality and make it more streamlined.

Built with:

Transformers, spaCy, PyTorch

Credits:

Note: The backchannel utterance classification method is inspired by Facebook's paper "Unsupervised Topic Segmentation of Meetings with BERT Embeddings" (arXiv:2106.12978 [cs.LG]).


Comments

SemanticTextSegmentation NaN With All Stop Words

When running semantic text segmentation, I found that if the input utterance consists only of stop words (e.g. "Bye. Uh huh. Yeah."), SemanticTextSegmentation._get_similarity fails with ValueError: Input contains NaN.

I found that adding a NaN check on both embeddings solves this problem.

def _get_similarity(self, text1, text2):
    # nlp, model, np and cosine_similarity are the names already used by the library module
    sentence_1 = [i.text.strip()
                  for i in nlp(text1).sents if len(i.text.split(' ')) > 1]
    sentence_2 = [i.text.strip()
                  for i in nlp(text2).sents if len(i.text.split(' ')) > 2]
    embeding_1 = model.encode(sentence_1)
    embeding_2 = model.encode(sentence_2)
    # Average the sentence embeddings into a single (1, dim) vector per text
    embeding_1 = np.mean(embeding_1, axis=0).reshape(1, -1)
    embeding_2 = np.mean(embeding_2, axis=0).reshape(1, -1)

    # Proposed fix: if either mean is NaN, treat the two texts as fully similar
    if np.any(np.isnan(embeding_1)) or np.any(np.isnan(embeding_2)):
        return 1

    sim = cosine_similarity(embeding_1, embeding_2)
    return sim

I would like someone else to look at this, because I don't want to assume that utterances made up only of stop words should belong to the same segment.
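One way to avoid baking in that assumption, sketched here with plain Python lists instead of the library's embeddings, is to return None when either side has no usable sentences and let the caller decide how to segment. The NaN presumably appears because the length filters can discard every sentence, leaving nothing to average. The names `mean_vector` and `safe_cosine_similarity` are hypothetical and not part of PyConverse.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors, or None if the list is empty."""
    if not vectors:
        return None
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def safe_cosine_similarity(vectors_1, vectors_2):
    """Cosine similarity of the two mean vectors, or None if it is undefined."""
    a, b = mean_vector(vectors_1), mean_vector(vectors_2)
    if a is None or b is None:
        return None  # no sentences survived filtering; the caller picks a policy
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else None


print(safe_cosine_similarity([], [[1.0, 0.0]]))            # None
print(safe_cosine_similarity([[1.0, 0.0]], [[2.0, 0.0]]))  # 1.0
```

Returning a sentinel keeps the "same segment or not" decision out of the similarity function, at the cost of every caller having to handle None.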

opened by Haowjy 1

Updated lru_cache decorator.

After installing pyconverse on Python 3.7 or below, the import statement itself fails. I went through the utils file and saw that the lru_cache decorator is written in the bare form, "@lru_cache", which is only supported from Python 3.8 onwards; on older versions it raises an error (a NoneType error in my case). Older versions need the call form with parentheses, "@lru_cache()", so I made that change. The change does not cause any error on newer versions.
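For reference, a minimal sketch of the portable call form. The `normalise` function is a made-up example and is not taken from the pyconverse utils file.

```python
from functools import lru_cache


@lru_cache()  # call form with parentheses: accepted on Python 3.7 and on 3.8+
def normalise(text):
    """Collapse whitespace and lowercase the text (illustrative helper)."""
    return " ".join(text.lower().split())


normalise("Hello   World")
normalise("Hello   World")          # second call is served from the cache
print(normalise.cache_info().hits)  # 1
```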

opened by AkashKhamkar 0

Error in importing Callyzer, SpeakerStats

When I try to load the model, it shows the error below. Is the library still under development?

KeyError: "[E002] Can't find factory for 'tok2vec'. This usually happens when spaCy calls nlp.create_pipe with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to Language.factories['tok2vec'] or remove it from the model meta and add it via nlp.add_pipe instead."

opened by kalpa277 0

Releases(v0.2.0)

v0.2.0 (Nov 21, 2021)
First release of the PyConverse library.

Conversational Transcript Analysis using various NLP techniques.

Emotion identification
Empathetic statement identification
Call Segmentation
Topic identification from call segments
Compute various types of Speaker attributes:
1. Linguistic attributes such as word counts, number of words per utterance, negations etc.
2. Identify periods of silence & interruptions.
3. Question identification
4. Backchannel identification
Assess the overall nature of the speaker via linguistic attributes and tell if the Speaker is:
1. Talkative, verbally fluent
2. Informal/Personal/social
3. Goal-oriented or Forward/future-looking/focused on past
4. Identify inhibitions

Owner

Rita Anjana

ML engineer
