An architecture that makes any doodle realistic, in any specified style, using VQGAN, CLIP and some basic embedding arithmetic.

Last update: Dec 18, 2022

Sketch Simulator

See the final cell output of the Colab below for some examples with and without subtracting the average sketch embeddings.

WARNING: This Colab is messy and is a precursor of the code in this repo, but it works.

Architecture Overview

Setup

Run ./setup.sh in your environment. This installs the required libraries and downloads the model weights.

Usage

To process a single doodle in your desired style (see train.py for all available modifiers), run:
- train.py --start_image "path/to/your/doodle" --prompts "a painting in the style of ... | Trending on artstation"
Prompts are split on "|", and a specific weight can be assigned to each prompt using {prompt1}:{weight1}|{prompt2}:{weight2}.
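
The prompt format described above can be illustrated with a small parser. This is only a sketch and not necessarily how train.py handles it: the function name `parse_prompts` and the default weight of 1.0 for prompts without an explicit weight are assumptions.

```python
from typing import List, Tuple


def parse_prompts(spec: str, default_weight: float = 1.0) -> List[Tuple[str, float]]:
    """Split a prompt string on "|" and read an optional ":weight" suffix.

    Hypothetical helper; the default weight of 1.0 is an assumption.
    """
    parsed = []
    for chunk in spec.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty segments such as a trailing "|"
        text, sep, tail = chunk.rpartition(":")
        if sep:
            try:
                parsed.append((text.strip(), float(tail)))
                continue
            except ValueError:
                pass  # the colon belonged to the prompt text, not to a weight
        parsed.append((chunk, default_weight))
    return parsed


print(parse_prompts("a painting:1.5 | Trending on artstation"))
```

Splitting on the last colon keeps prompts that contain a colon of their own (for example a ratio such as 16:9) intact when no numeric weight follows.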
To explore the hyperparameter space, or to process large numbers of doodles and/or prompts, using Weights & Biases:
- Create a sweep config with your desired parameters, your_sweep.yaml, in sweep_configs/ (see sweep_configs/* for examples)
- Start the sweep:
  - wandb sweep -p Sketch-sim "path/to/your_sweep.yaml" (this returns the sweep_ID, to be used in the next command)
  - wandb agent janzuiderveld/Sketch-sim/sweep_ID
- Alternatively, when working in SLURM environments, you can use `SLURM_scripts/sweeper.sh` (make sure to edit the paths appropriately):
  - sbatch SLURM_scripts/sweeper.sh "path/to/your_sweep.yaml"
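
To get a feel for what a grid sweep over doodles and prompts enumerates, the sketch below mirrors the general layout of a W&B sweep config (program, method, parameters) as a Python dict and lists the resulting runs. The image paths and prompt values are placeholders, `grid_runs` is a hypothetical helper, and the files in sweep_configs/ remain the authoritative examples; W&B itself does the actual scheduling.

```python
from itertools import product
from typing import Any, Dict, Iterator

# Hypothetical sweep description; paths and prompt values are placeholders.
sweep_config: Dict[str, Any] = {
    "program": "train.py",
    "method": "grid",
    "parameters": {
        "start_image": {"values": ["doodles/cat.png", "doodles/house.png"]},
        "prompts": {"values": ["an oil painting:1|a sketch:0",
                               "a photograph:1|a sketch:0"]},
    },
}


def grid_runs(config: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one parameter assignment per point of the grid."""
    params = config["parameters"]
    names = list(params)
    # The Cartesian product of all value lists gives every combination once.
    for combo in product(*(params[name]["values"] for name in names)):
        yield dict(zip(names, combo))


for run in grid_runs(sweep_config):
    print(run)
```

With two doodles and two prompt strings this yields four runs, so the number of runs grows multiplicatively with each added parameter.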

All outputs are saved in outputs/{args.experiment_name}/step_{i}.png

Calculate Average Sketch Embedding

The provided results/ovl_mean_sketch.pth was calculated from 1000 (padded) items per class across all 350 Quickdraw classes. To (re)calculate the average sketch embeddings, run:
- extract_sketch_emb.py --items_per_class 1000 --save_root "path/to/repo/root" --pad_images 6
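
The idea behind the average sketch embedding can be sketched in a few lines. This is one plausible reading of "subtracting sketch embedding averages", not the repo's code: the function names are hypothetical, the re-normalization step is an assumption, and plain Python lists stand in for the CLIP embedding tensors.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_embedding(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized embedding vectors."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def normalize(vec: Sequence[float]) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def remove_sketch_component(doodle: Sequence[float],
                            sketch_mean: Sequence[float]) -> Vector:
    """Subtract the average sketch embedding, then re-normalize (assumed step)."""
    return normalize([d - m for d, m in zip(doodle, sketch_mean)])


# Toy 3-dimensional vectors; real CLIP embeddings have hundreds of dimensions.
sketches = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
sketch_mean = mean_embedding(sketches)
target = remove_sketch_component([2.0, 3.0, 6.0], sketch_mean)
print(sketch_mean, target)
```

Subtracting the mean is meant to remove what all sketches have in common (the "sketchiness"), leaving the part of the embedding that describes the doodle's content.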

Notes

One step of synthesizing and embedding 400x400 images takes about 0.3 seconds on a single 1080 GPU; 20-30 steps (roughly 6-9 seconds) are usually enough for good results.
Prompts can be used as metrics in large hyperparameter sweeps by giving them a weight of 0; their scores are still logged automatically.
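
The effect of a zero weight can be illustrated as follows. This is a minimal sketch under assumptions: the function name is hypothetical, the scores are made-up numbers, and the way train.py actually combines and logs prompt scores may differ.

```python
from typing import Dict, List, Tuple


def combine_prompt_scores(
    scored: List[Tuple[str, float, float]]
) -> Tuple[float, Dict[str, float]]:
    """Return (weighted objective, per-prompt log) from (prompt, weight, score) triples."""
    # Zero-weight prompts add nothing to the objective being optimized...
    objective = sum(weight * score for _, weight, score in scored)
    # ...but every prompt's score is still recorded.
    log = {prompt: score for prompt, _, score in scored}
    return objective, log


objective, log = combine_prompt_scores([
    ("an oil painting", 1.0, 0.25),
    ("a sketch", 0.0, 0.75),  # weight 0: tracked as a metric only
])
print(objective, log)
```

Here only "an oil painting" steers the optimization, while the "a sketch" score is available afterwards for comparing runs in a sweep.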

TODO

Add server / client scripts to circumvent startup times
Add CLIP-based classifier for testing conceptual embedding accuracy on Quickdraw classification
