Navigating StyleGAN2 w latent space using CLIP

An attempt to build something with the official StyleGAN2-ADA PyTorch implementation, loosely inspired by "Generating Images from Prompts using CLIP and StyleGAN" and based on the original projector.py.

Things learned:

Generating the initial w from a well-converged sample works better than starting from a random or median w.
Optimizing w together with the noise inputs works better than optimizing w alone.
The defaults of 0.02 for LR and noise work fine with portraits.
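To make the second point concrete, here is a minimal sketch of optimizing w and a noise input jointly against a text embedding. It is not the code from approach.py: the three linear layers are hypothetical stand-ins for the StyleGAN2 synthesis network and the CLIP image encoder, the cosine-distance loss is an assumption about how the CLIP guidance is scored, and all names and dimensions are made up for illustration. Only the learning rate of 0.02 comes from the notes above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: the real script would use the StyleGAN2 synthesis
# network and the CLIP image encoder instead of these frozen linear layers.
W_DIM, NOISE_DIM, IMG_DIM, EMB_DIM = 16, 8, 32, 12
synth_w = torch.nn.Linear(W_DIM, IMG_DIM)
synth_noise = torch.nn.Linear(NOISE_DIM, IMG_DIM)
encoder = torch.nn.Linear(IMG_DIM, EMB_DIM)
for module in (synth_w, synth_noise, encoder):
    for p in module.parameters():
        p.requires_grad_(False)  # only w and noise are optimized

# Stand-in for the CLIP embedding of the text prompt.
text_emb = torch.randn(1, EMB_DIM)


def clip_style_loss(w, noise):
    """Cosine distance between the 'image' embedding and the text embedding."""
    image = synth_w(w) + synth_noise(noise)
    image_emb = encoder(image)
    return 1.0 - F.cosine_similarity(image_emb, text_emb, dim=-1).mean()


# Both w and the noise input are leaf tensors handed to the optimizer.
w = torch.randn(1, W_DIM, requires_grad=True)
noise = torch.randn(1, NOISE_DIM, requires_grad=True)
optimizer = torch.optim.Adam([w, noise], lr=0.02)

initial_loss = clip_style_loss(w, noise).item()
for step in range(100):
    optimizer.zero_grad()
    loss = clip_style_loss(w, noise)
    loss.backward()
    optimizer.step()
final_loss = clip_style_loss(w, noise).item()

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Dropping noise from the optimizer's parameter list gives the "w alone" variant for comparison.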

Quick start

Clone the StyleGAN2-ADA PyTorch repo, copy the clip directory from the CLIP repo, and install PyTorch 1.7.1 plus the other dependencies.
Pick a suitable StyleGAN2 PKL (e.g. FFHQ).
Pick a seed.
Run:
python3 approach.py --network network-snapshot-ffhq.pkl --outdir project --num-steps 100 --text 'an image of a girl with a face resembling Paul Krugman' --psi 0.8 --seed 12345
Alternatively, start from a w vector stored as .npz:
python3 approach.py --network network-snapshot-ffhq.pkl --outdir project --num-steps 100 --text 'an image of a girl with a face resembling Paul Krugman' --w w-7660ca0b7e95428cac94c89459b5cebd8a7acbd4.npz
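Since picking a seed is mostly trial and error, it can help to generate the commands for several seeds at once. The sketch below only assembles argument lists from the flags shown above (--network, --outdir, --num-steps, --text, --psi, --seed, --w). The helper name build_command, the per-seed output directories, and the rule that exactly one of seed or w is given are assumptions of this example, not behavior confirmed by approach.py.

```python
import shlex


def build_command(network, outdir, text, num_steps=100, psi=None, seed=None, w=None):
    """Return the argument list for one approach.py run (hypothetical helper)."""
    # Assumption of this sketch: a run starts either from a seed or from a
    # stored w vector, never both.
    if (seed is None) == (w is None):
        raise ValueError("pass exactly one of seed or w")
    cmd = [
        "python3", "approach.py",
        "--network", network,
        "--outdir", outdir,
        "--num-steps", str(num_steps),
        "--text", text,
    ]
    if psi is not None:
        cmd += ["--psi", str(psi)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    else:
        cmd += ["--w", w]
    return cmd


prompt = "an image of a girl with a face resembling Paul Krugman"
commands = [
    build_command(
        "network-snapshot-ffhq.pkl",
        f"project/seed{seed}",  # hypothetical layout: one output dir per seed
        prompt,
        psi=0.8,
        seed=seed,
    )
    for seed in (12345, 32)
]

for cmd in commands:
    # shlex.join quotes the prompt so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

To actually launch the runs, pass each list to subprocess.run(cmd, check=True) from the repo directory.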

FFHQ test

python3 approach.py --network stylegan2-ffhq-config-f.pkl --outdir ffhq --num-steps 100 --text 'an image of an Instagram influencer girl' --psi 0.7 --seed 32
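The --psi flag in the commands above presumably controls the usual StyleGAN truncation trick, which pulls a sampled w toward the average w. How approach.py applies it has not been checked, so treat the following as a sketch of the standard formula only. The zero-valued w_avg is a placeholder, since the real average is stored in the network.

```python
import torch


def truncate(w, w_avg, psi):
    """Truncation trick: w_avg + psi * (w - w_avg)."""
    # psi = 1 leaves w unchanged, psi = 0 collapses it onto w_avg.
    return w_avg.lerp(w, psi)


torch.manual_seed(0)
w_avg = torch.zeros(1, 512)  # placeholder; the real average comes from the network
w = torch.randn(1, 512)      # stand-in for a w mapped from a seed

w_07 = truncate(w, w_avg, 0.7)
print(w.norm().item(), w_07.norm().item())
```

Lower psi values trade variety for samples closer to the average face, which may be why 0.7 to 0.8 is a comfortable starting range for portraits.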

Owner

Mike K.
