While working on audio and video sentiment recognition, I found that a lot of code was duplicated, and I often spent a long time debugging the same function again on different occasions. To address this, I want to gather all of my previous work into an open source library that can be developed iteratively, for my own use and for others.

Last update: Oct 27, 2022

FastAudioVisual

The project is developed here. The target completion date is March 1, 2021.

What is FastAudioVisual?

FastAudioVisual is a tool for developing and analysing research in the audiovisual domain.

The project has five main parts, plus a Test section and an Others section. The details of each part are as follows.

DataRegular: Different studies use different file structures, which causes many problems. In this part, we develop a series of functions that bring your database into a regular layout for the next steps. All of these functions are tested and standardized against RAVDESS, a large database for multimodal emotion recognition.
FeatureExtract: Feature extraction is important for model training, and many features can be extracted as input. For audio, MFCC, FBank, zero-crossing rate and so on can be used. For visual data, grayscale, RGB and optical flow images can be used. In this part, we will build APIs to extract these features.
SampleModel: With the development of hardware, deep learning has made significant progress in every area, and many areas have been transformed by it. Therefore, we collect some classical models for basic research, so that you have enough baselines for evaluation and experiments. (In the beginning, I struggled to choose between PyTorch and fastai.)
ModelDesign: In this part, we focus on audiovisual fusion methods and model design for other audiovisual domains (including losses, frameworks and other tricks). It collects some research work and code. The models from SampleModel can also be plugged into this part to improve the results.
Analysis: Based on the parts above, we will use some tools to analyse the experimental results, such as confusion matrices, CAM and feature distributions.
Test: Some demos showing how to use this project.
Others: Papers and blog posts related to this area.
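
As a rough illustration of the kind of audio feature the FeatureExtract part mentions, here is a minimal sketch of a frame-wise zero-crossing rate computed with NumPy. The function name `zero_crossing_rate` and the parameters `frame_length` and `hop_length` are assumptions made for this example and are not taken from the FastAudioVisual API.

```python
import numpy as np


def zero_crossing_rate(signal, frame_length=400, hop_length=160):
    """Return the fraction of sign changes in each frame of a 1-D signal.

    Hypothetical helper for illustration only; not part of FastAudioVisual.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_length:
        raise ValueError("signal is shorter than one frame")
    rates = []
    # Slide a fixed-size window over the signal; trailing samples that do
    # not fill a whole frame are ignored.
    for start in range(0, signal.size - frame_length + 1, hop_length):
        frame = signal[start:start + frame_length]
        # Map every sample to +1 or -1 (zero counts as positive).
        signs = np.where(frame >= 0, 1, -1)
        crossings = np.count_nonzero(signs[1:] != signs[:-1])
        rates.append(crossings / (frame_length - 1))
    return np.array(rates)


# Example: one second of a 100 Hz sine tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 100 * t)
zcr = zero_crossing_rate(tone)
```

A higher-frequency or noisier signal changes sign more often, so its per-frame rate is larger. This is why the feature is often used as a cheap cue alongside MFCC or FBank features.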

In general, all of these designs are meant to help you develop your audiovisual research quickly with this tool!
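
To make the Analysis part described above more concrete, the following is a minimal sketch of a confusion matrix and the per-class recall derived from it. The function names and the integer label encoding are assumptions for this example, not the project's own interface.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true labels, columns are predicted labels."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label, predicted_label] += 1
    return matrix


def per_class_recall(matrix):
    """Return the diagonal divided by each row sum (0.0 for empty classes)."""
    totals = matrix.sum(axis=1)
    return np.divide(
        np.diag(matrix),
        totals,
        out=np.zeros(len(totals)),
        where=totals > 0,
    )


# Hypothetical labels for a three-class emotion task.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0]
matrix = confusion_matrix(y_true, y_pred, num_classes=3)
recall = per_class_recall(matrix)
```

Reading the matrix row by row shows which emotions are confused with each other, which a single accuracy number hides.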

Development and Iteration

3. Features and details

4. Maintenance and iteration

Installation

You can install, upgrade and uninstall FastAudioVisual with these commands (without the $):

$ pip install FastAudioVisual
$ pip install --upgrade FastAudioVisual
$ pip uninstall FastAudioVisual

Help

usage: line.py [-h] [-s SUFFIX | -f FILTER] [-d]

count the amount of lines and files under the current directory

optional arguments:
  -h, --help            show this help message and exit
  -s SUFFIX, --suffix SUFFIX
                        count by suffix file name, format: .suffix1.suffix2...
                        e.g: .cpp.py (without space)
  -f FILTER, --filter FILTER
                        count without filter name, format: .suffix1.suffix2...
                        e.g: .cpp.py (without space)
  -d, --detail          show detail results
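
The usage line above can be reproduced with Python's standard `argparse` module. The sketch below is an assumption about how such a parser could be built and is not the tool's actual source. In particular, the mutually exclusive group is inferred from the `[-s SUFFIX | -f FILTER]` notation.

```python
import argparse


def build_parser():
    """Build a parser matching the usage text shown above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="line.py",
        description="count the amount of lines and files under the current directory",
    )
    # -s and -f cannot be combined, as the "|" in the usage line suggests.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "-s", "--suffix",
        help="count by suffix file name, format: .suffix1.suffix2...",
    )
    group.add_argument(
        "-f", "--filter",
        help="count without filter name, format: .suffix1.suffix2...",
    )
    parser.add_argument(
        "-d", "--detail", action="store_true", help="show detail results"
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-s", ".cpp.py", "-d"])
```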

Examples

Count all files under the current directory:

$ line
Search in /Users/macbook/Desktop/Examples1/
file count: 4
line count: 373

Count all files under the current directory with detail results:

$ line -d
Search in /Users/macbook/Desktop/Examples2/

		========================================
		文件后缀名	文件数		总行数
		

		   .py		5		397
		

		   .cpp		240		11346
		

		总文件数: 245	总行数: 11743
		========================================

To count only specific files under the current directory, pass the suffixes with -s. If there is more than one suffix, join them without spaces. For example, to count C++ and Python files:

$ line -s .cpp.py
Search in /Users/macbook/Desktop/Examples3/
file count: 3
line count: 243
$ line -s .cpp.py -d
Search in /Users/macbook/Desktop/Examples3/

		========================================
		文件后缀名	文件数		总行数
		

		   .py		5		397
		

		   .cpp		240		11346
		

		总文件数: 245	总行数: 11743
		========================================
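
The suffix format used by -s and -f (for example `.cpp.py`) can be split and applied as in the following sketch. It only illustrates the idea under the assumption that files are matched by their extension. The helper names `split_suffixes` and `count_lines` are hypothetical and do not come from the tool.

```python
import os


def split_suffixes(spec):
    """Turn a string such as '.cpp.py' into ['.cpp', '.py']."""
    return ["." + part for part in spec.split(".") if part]


def count_lines(root, suffixes=None, excluded=None):
    """Return {suffix: (file_count, line_count)} for the files under root.

    suffixes: if given, only files with these extensions are counted.
    excluded: if given, files with these extensions are skipped.
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            suffix = os.path.splitext(name)[1]
            if suffixes is not None and suffix not in suffixes:
                continue
            if excluded is not None and suffix in excluded:
                continue
            path = os.path.join(dirpath, name)
            # Read in binary mode so that unusual encodings do not fail.
            with open(path, "rb") as handle:
                lines = sum(1 for _ in handle)
            files, total = results.get(suffix, (0, 0))
            results[suffix] = (files + 1, total + lines)
    return results
```

Calling `count_lines(".", suffixes=split_suffixes(".cpp.py"))` would correspond roughly to `line -s .cpp.py -d`, and passing the same list as `excluded` would correspond to the -f option.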

Count files under the current directory while excluding the suffixes passed with -f:

$ line -f .py -d
Search in /Users/macbook/Desktop/Examples4/

		========================================
		文件后缀名	文件数		总行数
		

		   .cpp		240		11346
		

		总文件数: 240	总行数: 11528
		========================================
$ line -d
Search in /Users/macbook/Desktop/Examples4/

		========================================
		文件后缀名	文件数		总行数
		

		   .py		5		397
		

		   .cpp		240		11346
		

		总文件数: 245	总行数: 11743
		========================================
