Character Grounding and Re-Identification in Story of Videos and Text Descriptions

Last update: Dec 09, 2022

Overview

Character in Story Identification Network (CiSIN)

This project hosts the code for our paper.

Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung and Gunhee Kim. Character Grounding and Re-Identification in Story of Videos and Text Descriptions. In ECCV (spotlight), 2020.

This project is a winning solution in the LSMDC 2019 "Fill-in the Characters" task. For more information about the challenge, see the Large Scale Movie Description Challenge (LSMDC) 2019.

Reference

If you use this code as part of any published research, please cite the following paper:

@inproceedings{yu:2020:ECCV,
    title="{Character Grounding and Re-Identification in Story of Videos and Text Descriptions}",
    author={Yu, Youngjae and Kim, Jongseok and Yun, Heeseung and Chung, Jiwan and Kim, Gunhee},
    booktitle={ECCV},
    year=2020
}

System Requirements

The following dependencies should be installed:

Python 3.6
PyTorch 1.4.0
torchvision 0.5.0
A GPU that supports CUDA 10.0, with at least 12 GB of memory
See requirements.txt for more details
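
Before training, it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `check_environment` and `gpu_memory_gib` are hypothetical, and the expected versions are copied from this list rather than from requirements.txt, so adjust them if the two differ.

```python
import importlib
import sys

# Versions copied from the list above (an assumption: requirements.txt may pin more).
EXPECTED = {"torch": "1.4.0", "torchvision": "0.5.0"}


def check_environment(expected=EXPECTED, python_version=(3, 6)):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    found_python = tuple(sys.version_info[:2])
    if found_python != tuple(python_version):
        problems.append(
            "Python {}.{} found (expected {}.{})".format(*found_python, *python_version)
        )
    for package, wanted in expected.items():
        try:
            module = importlib.import_module(package)
        except ImportError:
            problems.append(f"{package} is not installed (expected {wanted})")
            continue
        # Builds often carry a local suffix such as "1.4.0+cu100"; compare the base only.
        found = str(getattr(module, "__version__", "unknown")).split("+")[0]
        if found != wanted:
            problems.append(f"{package} {found} found (expected {wanted})")
    return problems


def gpu_memory_gib():
    """Return the memory of GPU 0 in GiB, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024 ** 3


if __name__ == "__main__":
    for line in check_environment() or ["Versions match the listed requirements."]:
        print(line)
    memory = gpu_memory_gib()
    print("No CUDA GPU detected." if memory is None else f"GPU 0 memory: {memory:.1f} GiB")
```

The script only reports mismatches and does not abort, since nearby versions may still work. A card sold as 12 GB typically reports slightly less than 12 GiB, so the memory figure is printed rather than compared against a hard threshold.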

Data Setup

Coming soon.

CiSIN

To train our model, run:

python train.py

Acknowledgement

We thank SNUVL lab members for helpful comments. This research was supported by Seoul National University, Brain Research Program by National Research Foundation of Korea (NRF) (2017M3C7A1047860), and AIR Lab (AI Research Lab) in Hyundai Motor Company through HMC-SNU AI Consortium Fund.

License

See LICENSE.md.
