FairyTailor: Multimodal Generative Framework for Storytelling

Overview

FairyTailor: Multimodal Generative Framework for Storytelling

Human-in-the-loop visual story co-creation.

Users can create a cohesive children's story by weaving generated texts and retrieved images with their input. With co-creation, writers contribute their creative thinking, while generative models contribute to their constant workflow. FairyTailor adds another modality and modifies the text generation process to help producing a coherent and creative story.

Architecture

Set-up (development)

After cloning the repository:

Client (Vue 2.6)

Install and check that the client compiles:

cd client
npm i
npm run build

Backend (FASTAPI)

Install and activate the environment (conda provided):

conda env create -f environment.yml
conda activate MultiModalStory

Install environment globally in the directory:

pip install -e .
pip install git+https://github.com/openai/CLIP.git

After installation run:

python -m spacy download en_core_web_sm

In python terminal:

nltk.download('wordnet')
nltk.download('sentiwordnet')
nltk.download('averaged_perceptron_tagger')

Large Data Management (dvc)

Our large data files are stored on IBM's Cloud Object Storage, and to pull data files from that platform you will use a special, read-only .dvc/config file.

dvc pull -f

Which will pull:

  • backend/outputs (five preset stories)
  • backend/story_generator/downloaded (transformers)
  • client/public/unsplash25k (styled images)

Running the framework during developemnt

Client:

cd client
npm run devw

Backend (with server auto reload):

uvicorn backend.server:app --reload --reload-dir backend

Open the uvicorn server localhost:8000 in your web browser

Modifications Ideas:

New huggingface transformer

  • Place the transformer in backend/story_generator/downloaded directory.
  • Update the current model path by changing the constant FINETUNED_GPT2_PATH in backend/story_generator/constants.py.

New images folder

  • Replace the folder client/public/unsplash25k/sketch_images1024 with yours.
  • Update the current path by changing the constant IMAGE_PATH in client/src/components/Constants.js.

API functionalities

  • Add functions to the backend endpoint at backend/server/main.py.
  • Update client/src/js/api/mainApi.js to call the backend endpoint from the client.
  • Update the corresponding user components in client/src/components.
Experiment about Deep Person Re-identification with EfficientNet-v2

We evaluated the baseline with Resnet50 and Efficienet-v2 without using pretrained models. Also Resnet50-IBN-A and Efficientnet-v2 using pretrained on ImageNet. We used two datasets: Market-1501 and

lan.nguyen2k 77 Jan 03, 2023
A general-purpose programming language, focused on simplicity, safety and stability.

The Rivet programming language A general-purpose programming language, focused on simplicity, safety and stability. Rivet's goal is to be a very power

The Rivet programming language 17 Dec 29, 2022
This repository contains part of the code used to make the images visible in the article "How does an AI Imagine the Universe?" published on Towards Data Science.

Generative Adversarial Network - Generating Universe This repository contains part of the code used to make the images visible in the article "How doe

Davide Coccomini 9 Dec 18, 2022
3 Apr 20, 2022
Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis Requirements python 3.7 pytorch-gpu 1.7 numpy 1.19.4 pytorch_

12 Oct 29, 2022
An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

This repository contains the SystemVerilog RTL, C++, HLS (Intel FPGA OpenCL to wrap RTL code) and Python needed to reproduce the numerical results in

Facebook Research 373 Dec 31, 2022
Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296

Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions This repo contains the dataset and code for the paper Benchmarking Ro

Jiachen Sun 168 Dec 29, 2022
RLBot Python bindings for the Rust crate rl_ball_sym

RLBot Python bindings for rl_ball_sym 0.6 Prerequisites: Rust & Cargo Build Tools for Visual Studio RLBot - Verify that the file %localappdata%\RLBotG

Eric Veilleux 2 Nov 25, 2022
A geometric deep learning pipeline for predicting protein interface contacts.

A geometric deep learning pipeline for predicting protein interface contacts.

44 Dec 30, 2022
Loopy belief propagation for factor graphs on discrete variables, in JAX!

PGMax implements general factor graphs for discrete probabilistic graphical models (PGMs), and hardware-accelerated differentiable loopy belief propagation (LBP) in JAX.

Vicarious 62 Dec 23, 2022
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation

PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation Winner method of the ICCV-2021 SemKITTI-DVPS Challenge. [arxiv] [

Yuan Haobo 38 Jan 03, 2023
List of awesome things around semantic segmentation 🎉

Awesome Semantic Segmentation List of awesome things around semantic segmentation 🎉 Semantic segmentation is a computer vision task in which we label

Dam Minh Tien 18 Nov 26, 2022
Epidemiology analysis package

zEpid zEpid is an epidemiology analysis package, providing easy to use tools for epidemiologists coding in Python 3.5+. The purpose of this library is

Paul Zivich 111 Jan 08, 2023
Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP

mmc installation git clone https://github.com/dmarx/Multi-Modal-Comparators cd 'Multi-Modal-Comparators' pip install poetry poetry build pip install d

David Marx 37 Nov 25, 2022
Zero-shot Learning by Generating Task-specific Adapters

Code for "Zero-shot Learning by Generating Task-specific Adapters" This is the repository containing code for "Zero-shot Learning by Generating Task-s

INK Lab @ USC 11 Dec 17, 2021
PyTorch implementation of popular datasets and models in remote sensing

PyTorch Remote Sensing (torchrs) (WIP) PyTorch implementation of popular datasets and models in remote sensing tasks (Change Detection, Image Super Re

isaac 222 Dec 28, 2022
Explainable Zero-Shot Topic Extraction

Zero-Shot Topic Extraction with Common-Sense Knowledge Graph This repository contains the code for reproducing the results reported in the paper "Expl

D2K Lab 56 Dec 14, 2022
Implementation of the paper "Shapley Explanation Networks"

Shapley Explanation Networks Implementation of the paper "Shapley Explanation Networks" at ICLR 2021. Note that this repo heavily uses the experimenta

68 Dec 27, 2022
Training DiffWave using variational method from Variational Diffusion Models.

Variational DiffWave Training DiffWave using variational method from Variational Diffusion Models. Quick Start python train_distributed.py discrete_10

Chin-Yun Yu 26 Dec 13, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022