Diverse Object-Scene Compositions For Zero-Shot Action Recognition

This repository contains the source code for zero-shot action recognition using object-scene compositions.

This repository includes:

object and scene predictions for UCF-101, UCF-Sports, J-HMDB
script to retrieve object and scene predictions for Kinetics
scripts to obtain word and sentence embeddings for all datasets used and for object-scene compositions
script to obtain action predictions from any given action dataset, given the object and scene predictions and the respective action labels
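
To illustrate the last item, here is a minimal sketch of how action scores might be derived from object predictions, scene predictions, and label similarities. It is not the repository's implementation: the function name `score_actions`, the `top_k` parameter, and the scoring rule (summing similarity-weighted composition probabilities over the most similar compositions) are all assumptions made for this example.

```python
import numpy as np

def score_actions(object_probs, scene_probs, similarities, top_k=2):
    """Score each action for one video (hypothetical scoring rule).

    object_probs: shape (num_objects,), object prediction probabilities.
    scene_probs:  shape (num_scenes,), scene prediction probabilities.
    similarities: shape (num_actions, num_objects, num_scenes), similarity
                  between each action label and each object-scene composition.
    """
    # Probability of each composition, assuming objects and scenes are independent.
    comp_probs = np.outer(object_probs, scene_probs).ravel()
    scores = np.zeros(similarities.shape[0])
    for action_idx, sim in enumerate(similarities):
        flat = sim.ravel()
        # Keep only the top_k compositions most similar to this action label.
        top = np.argsort(flat)[-top_k:]
        scores[action_idx] = np.sum(flat[top] * comp_probs[top])
    return scores

# Toy example: 2 actions, 2 objects, 2 scenes.
object_probs = np.array([0.9, 0.1])
scene_probs = np.array([0.8, 0.2])
similarities = np.array([
    [[0.9, 0.1], [0.2, 0.0]],  # action 0
    [[0.1, 0.2], [0.8, 0.9]],  # action 1
])
scores = score_actions(object_probs, scene_probs, similarities)
predicted_action = int(np.argmax(scores))
```

In this toy input, action 0 is most similar to the composition that is also most probable in the video, so it receives the highest score.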

Software used

python 3.8.8
pytorch 1.7.1
numpy 1.19.2
fasttext 0.9.2
sentence-transformers 1.2.0
scikit-learn 0.24.1
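
A quick way to compare an environment against the versions listed above is sketched below. It uses only the standard library. The PyPI distribution names (for example `torch` for pytorch) are assumptions, and the helper `check_versions` is hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions listed in this README; the distribution names are assumed
# to match PyPI (e.g. "torch" for pytorch).
EXPECTED = {
    "torch": "1.7.1",
    "numpy": "1.19.2",
    "fasttext": "0.9.2",
    "sentence-transformers": "1.2.0",
    "scikit-learn": "0.24.1",
}

def check_versions(expected=EXPECTED):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        report[name] = (wanted, installed)
    return report

if __name__ == "__main__":
    for name, (wanted, installed) in check_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, found {installed} ({status})")
```

Other versions may work as well; the list only records what the code was run with.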

Downloading the object and scene predictions for Kinetics

While the action labels and video annotations for Kinetics are already present in the repo, the object and scene predictions need to be retrieved using:

bash kineticsdownload.sh

Obtaining word and sentence embeddings for all datasets

To compute the word and sentence embeddings for all the video and image datasets, run:

python getfasttextembs.py; python getbertembs.py

This will additionally compute the embeddings for all object-scene compositions and the similarities between all action labels and object-scene compositions.
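
The similarity step can be pictured with the small sketch below. It is not the code in `getfasttextembs.py` or `getbertembs.py`. It assumes that a composition is embedded as the mean of its object and scene embeddings and that similarity is cosine similarity; both choices, and the function names, are illustrative only.

```python
import numpy as np

def composition_embeddings(object_embs, scene_embs):
    """Embed every (object, scene) pair as the mean of its two embeddings.

    Assumption: averaging is used here for illustration only.
    Returns an array of shape (num_objects * num_scenes, dim).
    """
    pairs = (object_embs[:, None, :] + scene_embs[None, :, :]) / 2.0
    return pairs.reshape(-1, object_embs.shape[1])

def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy 2-dimensional embeddings standing in for word or sentence vectors.
action_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
object_embs = np.array([[1.0, 0.0], [1.0, 1.0]])
scene_embs = np.array([[0.0, 1.0]])

comp_embs = composition_embeddings(object_embs, scene_embs)
# sims[i, j] is the similarity of action i to composition j.
sims = cosine_similarity_matrix(action_embs, comp_embs)
```

With real data, the toy arrays would be replaced by the fastText or sentence-embedding vectors of the action, object, and scene labels.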

Using the main script

The main script can be run using the default arguments as follows:

python zero-shot-actions.py

There are several flags that can be used. Descriptions for these can be shown by running:

python zero-shot-actions.py --help

Lastly, a helper script to compute results for different datasets and different flag values is available:

python make_results.py
