A custom DeepStack model for detecting 16 human actions.

Overview

DeepStack_ActionNET

This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for detecting 16 human actions present in the ActionNET Dataset dataset. Also included in this repository is that dataset with the YOLO annotations.

>> Watch Video Demo

  • Download DeepStack Model and Dataset
  • Create API and Detect Objects
  • Discover more Custom Models
  • Train your own Model

Download DeepStack Model and Dataset

You can download the pre-trained DeepStack_ActionNET model and the annotated dataset via the links below.

Create API and Detect Actions

The Trained Model can detect the following actions in images and videos.

  • calling
  • clapping
  • cycling
  • dancing
  • drinking
  • eating
  • fighting
  • hugging
  • kissing
  • laughing
  • listening-to-music
  • running
  • sitting
  • sleeping
  • texting
  • using-laptop

To start detecting, follow the steps below

  • Install DeepStack: Install DeepStack AI Server with instructions on DeepStack's documentation via https://docs.deepstack.cc

  • Download Custom Model: Download the trained custom model actionnetv2.pt from this GitHub release. Create a folder on your machine and move the downloaded model to this folder.

    E.g A path on Windows Machine C\Users\MyUser\Documents\DeepStack-Models, which will make your model file path C\Users\MyUser\Documents\DeepStack-Models\actionnet.pt

  • Run DeepStack: To run DeepStack AI Server with the custom ActionNET model, run the command that applies to your machine as detailed on DeepStack's documentation linked here.

    E.g

    For a Windows version, you run the command below

    deepstack --MODELSTORE-DETECTION "C\Users\MyUser\Documents\DeepStack-Models" --PORT 80

    For a Linux machine

    sudo docker run -v /home/MyUser/Documents/DeepStack-Models -p 80:5000 deepquestai/deepstack

    Once DeepStack runs, you will see a log like the one below in your Terminal/Console

    That means DeepStack is running your custom actionnet.pt model and now ready to start detecting actions images via the API endpoint http://localhost:80/v1/vision/custom/actionnet or http://your_machine_ip:80/v1/vision/custom/actionnet

  • Detect actions in image: You can detect objects in an image by sending a POST request to the url mentioned above with the paramater image set to an image using any proggramming language or with a tool like POSTMAN. For the purpose of this repository, we have provided a sample Python code below.

    • A sample image can be found in images/test.jpg of this repository

    • Install Python and install the DeepStack Python SDK via the command below

      pip install deepstack_sdk
    • Run the Python file detect.py in this repository.

      python detect.py
    • After the code runs, you will find a new image in images/test_detected.jpg with the detection visualized, with the following results printed in the Terminal/Console.

      Name: dancing
      Confidence: 0.91482425
      x_min: 270
      x_max: 516
      y_min: 18
      y_max: 480
      -----------------------
      

    • You can try running action detection for other images.

Discover more Custom Models

For more custom DeepStack models that has been trained and ready to use, visit the Custom Models sample page on DeepStack's documentation https://docs.deepstack.cc/custom-models-samples/ .

Train your own Model

If you will like to train a custom model yourself, follow the instructions below.

  • Prepare and Annotate: Collect images on and annotate object(s) you plan to detect as detailed here
  • Train your Model: Train the model as detailed here
You might also like...
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions
An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions

Agar.io_Q-Learning_AI An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available act

Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.
Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.

Tensorflow-Mobile-Generic-Object-Localizer Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label. Ori

Python TFLite scripts for detecting objects of any class in an image without knowing their label.
Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples / ICLR 2018

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples This project is for the paper "Training Confidence-Calibrated Clas

CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis"

Beyond the Spectrum Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis" by Yang He, Ning Yu, Margret Keu

Comments
  • How to download a Custom Model action net v2.pt in Deepstack Server Docker?

    How to download a Custom Model action net v2.pt in Deepstack Server Docker?

    Tell me how to load a custom action network model correctly v2.pt in the Deepstack server docker? Did I do the right thing?

    DeepStack: Version 2021.09.01 I created the /model store/detection folders and threw the action net file there v2.pt image

    After the reboot, I got a v1/vision/custom/action net v2 entry in the logs. Did I do the right thing? It just confuses me that there is a v1/vision/custom/action net v2 entry in the logs, and the rest are written like this.

    /v1/vision/face
    /v1/vision/face/recognize
    ....
    

    image

    Is it necessary to enter here as in the case of face and object recognition? image image

    opened by DivanX10 0
Releases(v2)
  • v2(Aug 26, 2021)

    Version 2 of the DeepStack Custom Model for object detection API to detect human actions in images and videos. It detects the following actions

    • calling
    • clapping
    • cycling
    • dancing
    • drinking
    • eating
    • fighting
    • hugging
    • kissing
    • laughing
    • listening-to-music
    • running
    • sitting
    • sleeping
    • texting
    • using-laptop

    Download the model actionnetv2.pt from the Assets section (below) in this release.

    This Model is a YOLOv5x DeepStack custom model and that was trained for 150 epochs, generating a best model with the following evaluation result.

    [email protected]: 0.995 [email protected]: 0.913

    Source code(tar.gz)
    Source code(zip)
    actionnetv2.pt(169.41 MB)
  • v1(Aug 14, 2021)

    A DeepStack Custom Model for object detection API to detect human actions in images and videos. It detects the following actions

    • calling
    • clapping
    • cycling
    • dancing
    • drinking
    • eating
    • fighting
    • hugging
    • kissing
    • laughing
    • listening-to-music
    • running
    • sitting
    • sleeping
    • texting
    • using-laptop

    Download the model actionnet.pt from the Assets section (below) in this release.

    This Model is a YOLOv5x DeepStack custom model and that was trained for 150 epochs, generating a best model with the following evaluation result.

    [email protected]: 0.9858 [email protected]: 0.8051

    Source code(tar.gz)
    Source code(zip)
    actionnet.pt(169.41 MB)
Owner
MOSES OLAFENWA
Software Engineer @Microsoft , A self-Taught computer programmer, Deep Learning, Computer Vision Researcher and Developer. Creator of ImageAI.
MOSES OLAFENWA
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, ICCV 2021 Update: 2021/03/11: update our new results. Now our T2T-ViT-14 w

YITUTech 1k Dec 31, 2022
Official code for "Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021".

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021. Introduction We proposed a novel model training paradi

Lucas 103 Dec 14, 2022
(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework

(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework Background: Outlier detection (OD) is a key data mining task for identify

Yue Zhao 127 Jan 05, 2023
MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

Main repo for ECCV 2020 paper MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images. visual.cs.brown.edu/matryodshka

Brown University Visual Computing Group 75 Dec 13, 2022
Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020)

Causality In Traffic Accident (Under Construction) Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020) Overview Data Prepa

Tackgeun 21 Nov 20, 2022
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

47 Dec 28, 2022
Autotype on websites that have copy-paste disabled like Moodle, HackerEarth contest etc.

Autotype A quick and small python script that helps you autotype on websites that have copy paste disabled like Moodle, HackerEarth contests etc as it

Tushar 32 Nov 03, 2022
Codes and pretrained weights for winning submission of 2021 Brain Tumor Segmentation (BraTS) Challenge

Winning submission to the 2021 Brain Tumor Segmentation Challenge This repo contains the codes and pretrained weights for the winning submission to th

94 Dec 28, 2022
Bag of Tricks for Natural Policy Gradient Reinforcement Learning

Bag of Tricks for Natural Policy Gradient Reinforcement Learning [ArXiv] Setup Python 3.8.0 pip install -r req.txt Mujoco 200 license Main Files main.

Brennan Gebotys 1 Oct 10, 2022
This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis, accepted at ACMMM 2021.

Ziqi Yuan 10 Sep 30, 2022
This repository is all about spending some time the with the original problem posed by Minsky and Papert

This repository is all about spending some time the with the original problem posed by Minsky and Papert. Working through this problem is a great way to begin learning computer vision.

Jaissruti Nanthakumar 1 Jan 23, 2022
SoGCN: Second-Order Graph Convolutional Networks

SoGCN: Second-Order Graph Convolutional Networks This is the authors' implementation of paper "SoGCN: Second-Order Graph Convolutional Networks" in Py

Yuehao 7 Aug 16, 2022
Propose a principled and practically effective framework for unsupervised accuracy estimation and error detection tasks with theoretical analysis and state-of-the-art performance.

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles This project is for the paper: Detecting Errors and Estimating

Jiefeng Chen 13 Nov 21, 2022
Exploring Simple 3D Multi-Object Tracking for Autonomous Driving (ICCV 2021)

Exploring Simple 3D Multi-Object Tracking for Autonomous Driving Chenxu Luo, Xiaodong Yang, Alan Yuille Exploring Simple 3D Multi-Object Tracking for

QCraft 141 Nov 21, 2022
"Projelerle Yapay Zeka Ve Bilgisayarlı Görü" Kitabımın projeleri

"Projelerle Yapay Zeka Ve Bilgisayarlı Görü" Kitabımın projeleri Bu Github Reposundaki tüm projeler; kaleme almış olduğum "Projelerle Yapay Zekâ ve Bi

Ümit Aksoylu 4 Aug 03, 2022
Koç University deep learning framework.

Knet Knet (pronounced "kay-net") is the Koç University deep learning framework implemented in Julia by Deniz Yuret and collaborators. It supports GPU

1.4k Dec 31, 2022
Hashformers is a framework for hashtag segmentation with transformers.

Hashtag segmentation is the task of automatically inserting the missing spaces between the words in a hashtag. Hashformers applies Transformer models

Ruan Chaves 41 Nov 09, 2022
Attentive Implicit Representation Networks (AIR-Nets)

Attentive Implicit Representation Networks (AIR-Nets) Preprint | Supplementary | Accepted at the International Conference on 3D Vision (3DV) teaser.mo

29 Dec 07, 2022
Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022