STRIVE: Scene Text Replacement In Videos

Dataset Types:

RoboText
SynthText
RealWorld videos

RoboText: Videos of text collected with a navigation robot in an indoor environment. The total duration of these videos is over 10 hours. Each text's background can be extracted from the bottom rectangle of its text rectangle. The original unprocessed data is stored as RoboText-OriginalZip.7z. Around 200 preprocessed videos are stored as RoboTextZip1.7z.
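The background extraction described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes boxes are given as (x, y, w, h) in pixels with a top-left origin, and it interprets "bottom rectangle" as a same-sized rectangle directly below the text box. The function name background_rect is hypothetical.

```python
def background_rect(text_box, frame_height):
    """Return the (x, y, w, h) rectangle directly below a text box.

    Assumption: the background patch has the same width and height as the
    text box and sits immediately beneath it. The result is clipped to the
    frame; None is returned when no rows remain below the text box.
    """
    x, y, w, h = text_box
    bg_y = y + h                          # first row below the text box
    bg_h = min(h, frame_height - bg_y)    # clip to the bottom of the frame
    if bg_h <= 0:
        return None
    return (x, bg_y, w, bg_h)
```

With a frame loaded as a NumPy-style array, the patch would then be frame[bg_y:bg_y + bg_h, x:x + w].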

SynthText: Using Unity, we created paired videos from synthetic scenes. These videos are stored on the drive with a similar naming convention. File name: SynthText7Zip.7z

Note: Unity bounding boxes (bboxes) are recorded as mirrored values, so the bbox extraction process differs from that of the other two video types.
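As a rough sketch of what undoing the mirroring could look like: the note does not say which axis is mirrored, so the axis is left as a parameter here. Boxes are assumed to be (x, y, w, h) in pixels, and the function name unmirror_bbox is hypothetical. Check the SynthText preprocessing file for the convention actually used.

```python
def unmirror_bbox(box, frame_width, frame_height, axis="horizontal"):
    """Reflect an (x, y, w, h) box back across the frame.

    Assumption: the recorded box is a reflection of the true box about the
    frame's vertical center line ("horizontal") or horizontal center line
    ("vertical"). Width and height are unchanged by a reflection.
    """
    x, y, w, h = box
    if axis == "horizontal":
        return (frame_width - (x + w), y, w, h)
    if axis == "vertical":
        return (x, frame_height - (y + h), w, h)
    raise ValueError("axis must be 'horizontal' or 'vertical'")
```

Applying the same reflection twice returns the original box, which is a quick sanity check when verifying the chosen axis against a few frames.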

Real World videos: We collected videos with a high-resolution mobile camera to capture text under different lighting conditions and with motion blur. File name: RealWorld.7z

Preparing data

We extracted text bounding boxes from the RoboText and Real World videos using the AWS Rekognition API. The code is available in the runAWS.py file. Bounding boxes for the synthetic videos are recorded in the Unity environment.
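For orientation, Rekognition's text detection returns bounding boxes as ratios of the image width and height, so they need to be scaled to pixels before cropping. The sketch below shows one way to do that conversion. It is not taken from runAWS.py; the function name rekognition_boxes_to_pixels and the choice to filter by detection type are assumptions made for illustration.

```python
def rekognition_boxes_to_pixels(response, frame_width, frame_height, level="WORD"):
    """Convert a Rekognition detect_text response to pixel boxes.

    `response` is the dict returned by, for example,
    boto3.client("rekognition").detect_text(Image={"Bytes": image_bytes}).
    Each detection's BoundingBox holds Left, Top, Width and Height as
    fractions of the image size. Returns a list of (text, (x, y, w, h)).
    """
    boxes = []
    for det in response.get("TextDetections", []):
        if det.get("Type") != level:      # "WORD" or "LINE"
            continue
        bb = det["Geometry"]["BoundingBox"]
        x = int(round(bb["Left"] * frame_width))
        y = int(round(bb["Top"] * frame_height))
        w = int(round(bb["Width"] * frame_width))
        h = int(round(bb["Height"] * frame_height))
        boxes.append((det["DetectedText"], (x, y, w, h)))
    return boxes
```

Filtering on "WORD" versus "LINE" is a design choice: word-level boxes are tighter, while line-level boxes keep multi-word text together.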

Data Preprocessing

Refer to the preprocessing Python file for each dataset type to get cropped images of text.

Data download

Data can be downloaded from here

Please contact Jeyasri Subramanian ([email protected]) for any data queries.
