Animal Sound Classification (Cats Vrs Dogs Audio Sentiment Classification)

Overview

Animal Sound Classification (Cats Vrs Dogs Audio Sentiment Classification)

This is a simple audio classification api build to classify the sound of an audio, weather it is the cat or dog sound.

alt

Response

Given a .wav audio the model will classify what does the sound the audio belongs to either cat or dog.

{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}

Starting the server

To start server and start audio classification first you need to make sure you are in the server folder and run the following commands:

  1. creating a virtual environment
virtualenv venv && .\venv\Scripts\activate.bat
  1. installing packages
pip install -r requirements.txt
  1. Starting the server
python api/app.py

The server will start on a default port of 3001 and you will be able to make api request to the server to do audio classification.

Model Metrics

The following table shows all the metrics summary we get after training the model for few 15 epochs.

model name model description test accuracy validation accuracy train accuracy test loss validation loss train loss
cats-dogs-sound-cnn.pt audio sentiment classification for dogs and cats CNN. 90.7% 90.7% 93.5% 0.621 0.218 0.209

Classification report

The following is the classification report for the model on the test dataset.

# precision recall f1-score support
accuracy - - 90% 2305
macro avg 91% 90% 90% 2305
weighted avg 92% 89% 90% 2305

Confusion matrix

The following figure shows a confusion matrix for the classification model.

Audio Sentiment classification

If you hit the server at http://localhost:3001/classify you will be able to get the following expected response that is if the request method is POST and you provide the file expected by the server.

Expected Response

The expected response at http://localhost:3001/classify with a file audio of the right format will yield the following json response to the client.

{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}

Using curl

Make sure that you have the audio named cat.wav in the current folder that you are running your cmd otherwise you have to provide an absolute or relative path to the audio.

To make a curl POST request at http://localhost:3001/classify with the file cat.wav we run the following command.

# for cat
curl -X POST -F [email protected] http://127.0.0.1:3001/classify

# for dog
curl -X POST -F [email protected] http://127.0.0.1:3001/classify

Using Postman client

To make this request with postman we do it as follows:

  1. Change the request method to POST at http://127.0.0.1:3001/classify
  2. Click on form-data
  3. Select type to be file on the KEY attribute
  4. For the KEY type audio and select the audio you want to predict under value
  5. Click send

If everything went well you will get the following response depending on the face you have selected:

{
  "predictions": { "class": "dog", "label": 1, "probability": 1.0 },
  "success": true
}

Using JavaScript fetch api.

  1. First you need to get the input from html
  2. Create a formData object
  3. make a POST requests
res.json()) .then((data) => console.log(data));">
const input = document.getElementById("input").files[0];
let formData = new FormData();
formData.append("audio", input);
fetch("http://127.0.0.1:3001/classify", {
  method: "POST",
  body: formData,
})
  .then((res) => res.json())
  .then((data) => console.log(data));

If everything went well you will be able to get expected response.

{
  "predictions": { "class": "dog", "label": 1, "probability": 1.0 },
  "success": true
}

Notebooks

  • All notebooks for training and saving the models are found in the notebooks folder of this repository.
Owner
crispengari
ai || software development. (creator of initialiseur)
crispengari
Second Order Optimization and Curvature Estimation with K-FAC in JAX.

KFAC-JAX - Second Order Optimization with Approximate Curvature in JAX Installation | Quickstart | Documentation | Examples | Citing KFAC-JAX KFAC-JAX

DeepMind 90 Dec 22, 2022
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit

BBC 16 Oct 27, 2022
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs SMORE is a a versatile framework that scales multi-hop query emb

Google Research 135 Dec 27, 2022
basic tutorial on pytorch

Quick Tutorial on PyTorch PyTorch Basics Linear Regression Logistic Regression Artificial Neural Networks Convolutional Neural Networks Recurrent Neur

7 Sep 15, 2022
Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data recorded in NumPy array

shindo.py Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data stored in NumPy array Introduction Japa

RR_Inyo 3 Sep 23, 2022
an implementation of softmax splatting for differentiable forward warping using PyTorch

softmax-splatting This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame I

Simon Niklaus 338 Dec 28, 2022
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"

Diverse Structure Inpainting ArXiv | Papar | Supplementary Material | BibTex This repository is for the CVPR 2021 paper, "Generating Diverse Structure

152 Nov 04, 2022
Code for Paper "Evidential Softmax for Sparse MultimodalDistributions in Deep Generative Models"

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models Abstract Many applications of generative models rely on the marginali

Stanford Intelligent Systems Laboratory 9 Jun 06, 2022
BlockUnexpectedPackets - Preventing BungeeCord CPU overload due to Layer 7 DDoS attacks by scanning BungeeCord's logs

BlockUnexpectedPackets This script automatically blocks DDoS attacks that are sp

SparklyPower 3 Mar 31, 2022
Homepage of paper: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction, ICCV 2021.

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction [Paper] [PaddlePaddle Implementation] Homepage of paper: Paint Transformer: Fee

442 Dec 16, 2022
An example to implement a new backbone with OpenMMLab framework.

Backbone example on OpenMMLab framework English | 简体中文 Introduction This is an template repo about how to use OpenMMLab framework to develop a new bac

Ma Zerun 22 Dec 29, 2022
Pytorch implementation of Hinton's Dynamic Routing Between Capsules

pytorch-capsule A Pytorch implementation of Hinton's "Dynamic Routing Between Capsules". https://arxiv.org/pdf/1710.09829.pdf Thanks to @naturomics fo

Tim Omernick 625 Oct 27, 2022
PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

SimTrans-Weak-Shot-Classification This repository contains the official PyTorch implementation of the following paper: Weak-shot Fine-grained Classifi

BCMI 60 Dec 02, 2022
CS50x-AI - Artificial Intelligence with Python from Harvard University

CS50x-AI Artificial Intelligence with Python from Harvard University 📖 Table of

Hosein Damavandi 6 Aug 22, 2022
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

22 Dec 12, 2022
[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Panoptic Segmentation Forecasting Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021 [Link to paper] We propose

Niantic Labs 44 Nov 29, 2022
Exponential Graph is Provably Efficient for Decentralized Deep Training

Exponential Graph is Provably Efficient for Decentralized Deep Training This code repository is for the paper Exponential Graph is Provably Efficient

3 Apr 20, 2022
The Most Efficient Temporal Difference Learning Framework for 2048

moporgic/TDL2048+ TDL2048+ is a highly optimized temporal difference (TD) learning framework for 2048. Features Many common methods related to 2048 ar

Hung Guei 5 Nov 23, 2022
Transfer Learning Shootout for PyTorch's model zoo (torchvision)

pytorch-retraining Transfer Learning shootout for PyTorch's model zoo (torchvision). Load any pretrained model with custom final layer (num_classes) f

Alexander Hirner 169 Jun 29, 2022
TinyML Cookbook, published by Packt

TinyML Cookbook This is the code repository for TinyML Cookbook, published by Packt. Author: Gian Marco Iodice Publisher: Packt About the book This bo

Packt 93 Dec 29, 2022