👐OpenHands : Making Sign Language Recognition Accessible (WiP 🚧👷‍♂️🏗)

Overview

👐 OpenHands: Sign Language Recognition Library

Making Sign Language Recognition Accessible

Check the documentation on how to use the library:
ReadTheDocs: 👐 OpenHands

License

This project is released under the Apache 2.0 license.

Citation

If you find our work useful in your research, please consider citing us:

@misc{2021_openhands_slr_preprint,
      title={OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages}, 
      author={Prem Selvaraj and Gokul NC and Pratyush Kumar and Mitesh Khapra},
      year={2021},
      eprint={2110.05877},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
Comments
  • Question about GSL dataset

    Question about GSL dataset

    I have no idea how to get the Isolated gloss sign language recognition (GSL isol.) data (xxx_signerx_repx_glosses), while I only find the continuous sign language recognition data (xxx_signerx_repx_sentences) from https://zenodo.org/record/3941811.

    Thank you very much for any information about this.

    opened by snorlaxse 6
  • Question about 'Config-based training'

    Question about 'Config-based training'

    I try the code from Config-based training as below.

    import omegaconf
    from openhands.apis.classification_model import ClassificationModel
    from openhands.core.exp_utils import get_trainer
    import os 
    
    os.environ["CUDA_VISIBLE_DEVICES"]="2,3"
    cfg = omegaconf.OmegaConf.load("examples/configs/lsa64/decoupled_gcn.yaml")
    trainer = get_trainer(cfg)
    
    
    model = ClassificationModel(cfg=cfg, trainer=trainer)
    model.init_from_checkpoint_if_available()
    model.fit()
    
    /raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:747: UserWarning: You requested multiple GPUs but did not specify a backend, e.g. `Trainer(accelerator="dp"|"ddp"|"ddp2")`. Setting `accelerator="ddp_spawn"` for you.
      "You requested multiple GPUs but did not specify a backend, e.g."
    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    /raid/xxx/OpenHands/openhands/apis/inference.py:21: LightningDeprecationWarning: The `LightningModule.datamodule` property is deprecated in v1.3 and will be removed in v1.5. Access the datamodule through using `self.trainer.datamodule` instead.
      self.datamodule.setup(stage=stage)
    Found 64 classes in train splits
    Found 64 classes in test splits
    Train set size: 2560
    Valid set size: 320
    /raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py:424: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
      f"DataModule.{name} has already been called, so it will not be called again. "
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [2,3]
    Traceback (most recent call last):
      File "study_train.py", line 15, in <module>
        model.fit()
      File "/raid/xxx/OpenHands/openhands/apis/classification_model.py", line 104, in fit
        self.trainer.fit(self, self.datamodule)
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
        self._run(model)
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 917, in _run
        self._dispatch()
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 985, in _dispatch
        self.accelerator.start_training(self)
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
        self.training_type_plugin.start_training(trainer)
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 158, in start_training
        mp.spawn(self.new_process, **self.mp_spawn_kwargs)
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
        return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
      File "/raid/xxx/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 148, in start_processes
        process.start()
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/process.py", line 112, in start
        self._popen = self._Popen(self)
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
        return Popen(process_obj)
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
        super().__init__(process_obj)
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
        self._launch(process_obj)
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
        reduction.dump(process_obj, fp)
      File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    AttributeError: Can't pickle local object 'DecoupledGCN_TCN_unit.__init__.<locals>.<lambda>'
    (base) 
    
    opened by snorlaxse 4
  • installation issue

    installation issue

    Hello, thank you for providing such a great framework, but there was an error when I import the module. Could you please offer me a help? code:

    import omegaconf
    from openhands.apis.classification_model import ClassificationModel
    from openhands.core.exp_utils import get_trainer
    
    cfg = omegaconf.OmegaConf.load("1.yaml")
    trainer = get_trainer(cfg)
    
    model = ClassificationModel(cfg=cfg, trainer=trainer)
    model.init_from_checkpoint_if_available()
    model.fit()
    

    ERROR: Traceback (most recent call last): File "/home/hxz/project/pose_SLR/main.py", line 3, in from openhands.apis.classification_model import ClassificationModel ModuleNotFoundError: No module named 'openhands.apis'

    opened by Xiaolong-han 4
  • visibility object

    visibility object

    https://github.com/narVidhai/SLR/blob/2f26455c7cb530265618949203859b953224d0aa/scripts/mediapipe_extract.py#L48

    Doesn't this object contain visibility value as well. If so, we could add some logic for conditioning and merge it with the above function

    enhancement 
    opened by grohith327 3
  • About the wrong st_gcn checkpoints files provided on GSL

    About the wrong st_gcn checkpoints files provided on GSL

    import omegaconf
    from openhands.apis.inference import InferenceModel
    
    cfg = omegaconf.OmegaConf.load("GSL/gsl/st_gcn/config.yaml")
    model = InferenceModel(cfg=cfg)
    model.init_from_checkpoint_if_available()
    if cfg.data.test_pipeline.dataset.inference_mode:
        model.test_inference()
    else:
        model.compute_test_accuracy()
    
    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    /tmp/ipykernel_6585/2983784194.py in <module>
          4 cfg = omegaconf.OmegaConf.load("GSL/gsl/st_gcn/config.yaml")
          5 model = InferenceModel(cfg=cfg)
    ----> 6 model.init_from_checkpoint_if_available()
          7 if cfg.data.test_pipeline.dataset.inference_mode:
          8     model.test_inference()
    
    ~/OpenHands/openhands/apis/inference.py in init_from_checkpoint_if_available(self, map_location)
         47         print(f"Loading checkpoint from: {ckpt_path}")
         48         ckpt = torch.load(ckpt_path, map_location=map_location)
    ---> 49         self.load_state_dict(ckpt["state_dict"], strict=False)
         50         del ckpt
         51 
    
    ~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
       1050         if len(error_msgs) > 0:
       1051             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    -> 1052                                self.__class__.__name__, "\n\t".join(error_msgs)))
       1053         return _IncompatibleKeys(missing_keys, unexpected_keys)
       1054 
    
    RuntimeError: Error(s) in loading state_dict for InferenceModel:
    	size mismatch for model.encoder.A: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.st_gcn_networks.0.gcn.conv.weight: copying a param with shape torch.Size([128, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 2, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.0.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
    	size mismatch for model.encoder.st_gcn_networks.1.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.1.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
    	size mismatch for model.encoder.st_gcn_networks.2.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.2.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
    	size mismatch for model.encoder.st_gcn_networks.3.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.3.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
    	size mismatch for model.encoder.st_gcn_networks.4.gcn.conv.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 64, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.4.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
    	size mismatch for model.encoder.st_gcn_networks.5.gcn.conv.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.5.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
    	size mismatch for model.encoder.st_gcn_networks.6.gcn.conv.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.6.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
    	size mismatch for model.encoder.st_gcn_networks.7.gcn.conv.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 128, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.7.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
    	size mismatch for model.encoder.st_gcn_networks.8.gcn.conv.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.8.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
    	size mismatch for model.encoder.st_gcn_networks.9.gcn.conv.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
    	size mismatch for model.encoder.st_gcn_networks.9.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
    	size mismatch for model.encoder.edge_importance.0: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.1: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.2: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.3: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.4: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.5: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.6: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.7: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.8: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    	size mismatch for model.encoder.edge_importance.9: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
    
    opened by snorlaxse 2
  • Refactoring code to remove bugs, code smells

    Refactoring code to remove bugs, code smells

    Following changes were made as part of the PR

    • Refactored data loading component of the package
    • Created/Renamed files under slr.datasets.isolated to allow for modularization
    • added __init__.py files for importing
    opened by grohith327 1
  • ST-GCN does not work for mediapipe

    ST-GCN does not work for mediapipe

    Currently the openpose layout seems to be hardcoded in graph_utils.py's Graph class. Should we also add a layout for mediapipe, or pass the joints via yml?

    bug 
    opened by GokulNC 1
  • Scale normalization for pose

    Scale normalization for pose

    For example, if the signer is moving forward or backward in the video, this augmentation will help normalize the scale throughout the video: https://github.com/AmitMY/pose-format#data-normalization

    Will involve explicitly specifying the joint (edge) based on which scaling has to be performed.

    enhancement 
    opened by GokulNC 1
  • Function not called

    Function not called

    https://github.com/narVidhai/SLR/blob/2f26455c7cb530265618949203859b953224d0aa/scripts/mediapipe_extract.py#L129

    Is this function not called anywhere?

    question 
    opened by grohith327 1
  • Support for GCN + BERT model

    Support for GCN + BERT model

    Add the model proposed in

    https://openaccess.thecvf.com/content/WACV2021W/HBU/papers/Tunga_Pose-Based_Sign_Language_Recognition_Using_GCN_and_BERT_WACVW_2021_paper.pdf

    enhancement 
    opened by Prem-kumar27 0
  • Sinusoidal Train/Val Accuracy

    Sinusoidal Train/Val Accuracy

    I'm noticing that the transformer and the SL-GCN architectures, while learning on WLASL2000, have an accuracy curve that resembles a sine curve with period of about 20 epochs and amplitude of about 5-10%. I am using the example config provided in the repo, and verified that the batches are being shuffled. I have also played around with logging on_step=True in case this is an artifact of torch.nn.log, but that didn't help either. Any ideas why this is happening?

    opened by leekezar 1
  • Lower accuracy when inferring a single video

    Lower accuracy when inferring a single video

    Hello,

    When I supply the inference model with multiple videos, the model predicts all of them right. But if I supply only one video then the prediction is wrong. I am curious about the cause of this? Can anyone please explain?

    Thank you!

    opened by burakkaraceylan 1
  • Using `pose-format` for consistent `.pose` files

    Using `pose-format` for consistent `.pose` files

    Seems like for pose data you are using pkl and h5. Also, that you have a custom mediapipe holistic script

    Personally I believe it would be more shareable, and faster, to use a binary format like https://github.com/AmitMY/pose-format Every pose file also declares its content, so you can transfer them between projects, or convert them to different formats with relative is.

    Besides the fact that it has a holistic loading script and multiple formats of OpenPose, it is a binary format which is faster to load, allows loading to numpy, torch and tensorflow, and can perform several operations on poses.

    It also allows the visualization of pose files, separately or on top of videos, and while admittedly this repository is not perfect, in my opinion it is better than having json or pkl files.

    opened by AmitMY 9
  • Consistent Dataset Handling

    Consistent Dataset Handling

    Very nice repo and documentation!

    I think this repository can benefit from using https://github.com/sign-language-processing/datasets as data loaders.

    It is fast, consistent across datasets, and allows loading videos / poses from multiple datasets. If a dataset you are using is not there, you can ask for it or add it yourself, it is a breeze.

    The repo supports many datasets, multiple pose estimation formats, binary pose files, fps and resolution manipulations, and dataset disk mapping.

    Finally, this would make this repo less complex. This repo does pre-training and fine-tuning, the other repo does datasets, and they could be used together.

    Please consider :)

    opened by AmitMY 5
  • Resume training, but load only parameters

    Resume training, but load only parameters

    Not the entire state stored by Lightning.

    Use an option called pretrained to achieve it, like this: https://github.com/AI4Bharat/OpenHands/blob/26c17ed0fca2ac786950d1f4edfa5a88419d06e6/examples/configs/include/decoupled_gcn.yaml#L1

    important feature 
    opened by GokulNC 1
Owner
AI4Bhārat
Building open-source AI solutions for India!
AI4Bhārat
Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

arXiv technical report soon available. we are updating the readme to be as comprehensive as possible Please ask any questions in Issues, thanks. Intro

Zhihan 31 Dec 30, 2022
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

Tao Ruijie 277 Dec 31, 2022
Azion the best solution of Edge Computing in the world.

Azion Edge Function docker action Create or update an Edge Functions on Azion Edge Nodes. The domain name is the key for decision to a create or updat

8 Jul 16, 2022
Neural Koopman Lyapunov Control

Neural-Koopman-Lyapunov-Control Code for our paper: Neural Koopman Lyapunov Control Requirements dReal4: v4.19.02.1 PyTorch: 1.2.0 The learning framew

Vrushabh Zinage 6 Dec 24, 2022
Fuzzer for Linux Kernel Drivers

difuze: Fuzzer for Linux Kernel Drivers This repo contains all the sources (including setup scripts), you need to get difuze up and running. Tested on

seclab 344 Dec 27, 2022
Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Music Trees Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Ins

Hugo Flores García 32 Nov 22, 2022
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
torchsummaryDynamic: support real FLOPs calculation of dynamic network or user-custom PyTorch ops

torchsummaryDynamic Improved tool of torchsummaryX. torchsummaryDynamic support real FLOPs calculation of dynamic network or user-custom PyTorch ops.

Bohong Chen 1 Jan 07, 2022
Recurrent Scale Approximation (RSA) for Object Detection

Recurrent Scale Approximation (RSA) for Object Detection Codebase for Recurrent Scale Approximation for Object Detection in CNN published at ICCV 2017

Yu Liu (Louis) 239 Dec 28, 2022
A no-BS, dead-simple training visualizer for tf-keras

A no-BS, dead-simple training visualizer for tf-keras TrainingDashboard Plot inter-epoch and intra-epoch loss and metrics within a jupyter notebook wi

Vibhu Agrawal 3 May 28, 2021
METER: Multimodal End-to-end TransformER

METER Code and pre-trained models will be publicized soon. Citation @article{dou2021meter, title={An Empirical Study of Training End-to-End Vision-a

Zi-Yi Dou 257 Jan 06, 2023
Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors

-IEEE-TIM-2021-1-Shallow-CNN-for-HAR [IEEE TIM 2021-1] Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors All

Wenbo Huang 1 May 17, 2022
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If

tsc 0 Jan 11, 2022
NLP made easy

GluonNLP: Your Choice of Deep Learning for NLP GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that helps you l

Distributed (Deep) Machine Learning Community 2.5k Jan 04, 2023
Self-training with Weak Supervision (NAACL 2021)

This repo holds the code for our weak supervision framework, ASTRA, described in our NAACL 2021 paper: "Self-Training with Weak Supervision"

Microsoft 148 Nov 20, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attr

Google Research Datasets 89 Jan 08, 2023
This repository is all about spending some time the with the original problem posed by Minsky and Papert

This repository is all about spending some time the with the original problem posed by Minsky and Papert. Working through this problem is a great way to begin learning computer vision.

Jaissruti Nanthakumar 1 Jan 23, 2022
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。

QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可

QPT Family 545 Dec 28, 2022
Gym environment for FLIPIT: The Game of "Stealthy Takeover"

gym-flipit Gym environment for FLIPIT: The Game of "Stealthy Takeover" invented by Marten van Dijk, Ari Juels, Alina Oprea, and Ronald L. Rivest. Desi

Lisa Oakley 2 Dec 15, 2021