Kinetics-Data-Preprocessing

Overview

Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects such as SlowFast and PyTorchVideo. However, the dataset preparation instructions in those projects are brief. This project therefore provides more detailed instructions for preprocessing Kinetics-400 and Kinetics-600.

Download the raw videos

There are multiple ways to download the raw videos of Kinetics-400 and Kinetics-600. Here are two common choices:

  1. Download the videos via the official scripts. I found this option very slow, so I personally recommend the second choice.

  2. Download the compressed videos from the Common Visual Data Foundation servers by following their repository. This is much faster because they organized the 650,000 independent video clips into several compressed files.
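
If you choose the second option, the downloaded archives still have to be unpacked before resizing. The following is a minimal sketch of that step, not part of this repository. It assumes the archives are `.tar.gz` files sitting in one folder; the function name `extract_archives` and both folder arguments are hypothetical, so adjust the glob pattern to whatever the download actually produces.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_archives(archive_dir: Path, output_dir: Path) -> List[Path]:
    """Extract every .tar.gz archive in archive_dir into output_dir.

    Assumption: archives use the .tar.gz extension. Returns the list of
    archives that were extracted, in sorted order.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(archive_dir.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions support extraction filters that
                # refuse unsafe members (absolute paths, links outside
                # the destination, and so on).
                tar.extractall(output_dir, filter="data")
            else:
                tar.extractall(output_dir)
        extracted.append(archive)
    return extracted
```

A call such as `extract_archives(Path("downloads"), Path("kinetics-train"))` would then unpack everything into one video folder, where both paths are placeholders.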

Resize the videos

The common preprocessing for Kinetics resizes every video so that its short edge is 256 pixels. I use the moviepy package to do this. It can be installed with the following command:

pip install moviepy

Then, you can use resize_video.py to resize all the videos in a given folder with the following command:

python resize_video.py --size 256 --path YOUR_VIDEO_CONTAINER

IMPORTANT! Note that resize_video.py replaces the original mp4 files. If you want to keep the original files, please make copies before resizing.
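
To make "short edge of 256" concrete, here is a minimal sketch of the size computation and of a per-file resize. It is an illustration, not the code in resize_video.py. The names `short_edge_size` and `resize_video` are hypothetical, rounding to even dimensions is my own assumption (many H.264 encoder setups reject odd sizes), and the moviepy calls assume the 1.x API (`moviepy.editor.VideoFileClip`, `clip.resize`); moviepy 2.x renamed these.

```python
from pathlib import Path
from typing import Tuple


def short_edge_size(width: int, height: int, short_edge: int = 256) -> Tuple[int, int]:
    """Return (new_width, new_height) so the shorter side is about short_edge.

    The aspect ratio is preserved up to rounding. Both sides are rounded
    to the nearest even number (an assumption of this sketch).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = short_edge / min(width, height)
    new_width = max(2, round(width * scale / 2) * 2)
    new_height = max(2, round(height * scale / 2) * 2)
    return new_width, new_height


def resize_video(src: Path, dst: Path, short_edge: int = 256) -> None:
    """Resize one video and write the result to dst.

    Assumption: moviepy 1.x is installed. The import is done lazily so
    that short_edge_size can be used without moviepy.
    """
    from moviepy.editor import VideoFileClip

    clip = VideoFileClip(str(src))
    try:
        new_size = short_edge_size(clip.w, clip.h, short_edge)
        clip.resize(new_size).write_videofile(str(dst))
    finally:
        clip.close()
```

For example, a 1280x720 video maps to 456x256 and a 720x1280 video maps to 256x456. Unlike resize_video.py, this sketch writes to a separate destination path, so the original file is left untouched.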

Prepare the csv annotation files

Following SlowFast, we also need to prepare csv annotation files for the training, validation, and testing sets, named train.csv, val.csv, and test.csv. Each line of a csv file contains a video path and its label, separated by a space:

path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
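
A quick way to check a generated file against this format is to parse it back. The sketch below is an illustration under two assumptions: labels are integer class ids, and the separator is a single space. The function name `parse_annotation_lines` is hypothetical.

```python
from typing import List, Tuple


def parse_annotation_lines(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse 'path label' lines into (path, label) pairs.

    Blank lines are skipped. The label is taken as the text after the
    last space and must be an integer (an assumption of this sketch).
    """
    samples = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        path, sep, label = line.rpartition(" ")
        if not sep or not path:
            raise ValueError(f"line {line_no}: expected 'path label', got {line!r}")
        samples.append((path, int(label)))
    return samples
```

Splitting on the last space lets this check accept paths that contain spaces, but a data loader that splits on every space would not, so it is safer to avoid spaces in video paths altogether.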

The original annotations can be found on the Kinetics website, or you can directly use the download links for the Kinetics-400 annotations and Kinetics-600 annotations. The official annotations come in two file types, csv and json, but neither matches the format above. I therefore also provide a Python script, kinetics_annotation.py, that converts the json files into csv files in the correct format. It takes the folders that contain the videos (one per split in the example below) and the folder that contains the official json annotation files. The output annotations are named 'output_XXX.csv' and saved in the same folder. The label-to-id mapping dictionary is saved as 'label2id.json'. The following command is my example:

python kinetics_annotation.py --train_path /home/kaihua/datasets/kinetics-train/ \
    --test_path /home/kaihua/datasets/kinetics-test/ \
    --val_path /home/kaihua/datasets/kinetics-val/ \
    --anno_path /home/kaihua/datasets/kinetics400-anno/
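
For readers who want to see the shape of such a conversion, here is a minimal sketch. It is not the code in kinetics_annotation.py. It assumes that each official json file maps a YouTube id to an entry with an `annotations.label` field, and that each video file name starts with the 11-character YouTube id; check both against your own files. The function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_label2id(annotations: Dict[str, dict]) -> Dict[str, int]:
    """Assign ids 0..N-1 to the class names in alphabetical order."""
    labels = sorted({entry["annotations"]["label"] for entry in annotations.values()})
    return {label: index for index, label in enumerate(labels)}


def make_rows(video_paths: List[str], annotations: Dict[str, dict],
              label2id: Dict[str, int]) -> List[str]:
    """Build 'path label_id' rows; videos without an annotation are skipped."""
    rows = []
    for path in video_paths:
        # Assumption: the file name begins with the 11-character YouTube id.
        youtube_id = Path(path).stem[:11]
        entry = annotations.get(youtube_id)
        if entry is None:
            continue
        rows.append(f"{path} {label2id[entry['annotations']['label']]}")
    return rows


def write_csv(video_dir: Path, json_path: Path,
              label2id: Dict[str, int], output_path: Path) -> int:
    """Write one annotation file for one split and return the row count."""
    annotations = json.loads(json_path.read_text())
    paths = sorted(str(p) for p in video_dir.rglob("*.mp4"))
    rows = make_rows(paths, annotations, label2id)
    output_path.write_text("\n".join(rows) + "\n")
    return len(rows)
```

In this sketch the mapping would be built once from the training annotations and reused for the other splits, so that the same class always receives the same id.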

Owner

Kaihua Tang (kaihuatang.github.io)