Official implementation of "Implicit Neural Representations with Periodic Activation Functions"

Implicit Neural Representations with Periodic Activation Functions

Project Page | Paper | Data

Explore Siren in Colab

Vincent Sitzmann*, Julien N. P. Martel*, Alexander W. Bergman, David B. Lindell, Gordon Wetzstein
Stanford University, *denotes equal contribution

This is the official implementation of the paper "Implicit Neural Representations with Periodic Activation Functions".

Google Colab

If you want to experiment with Siren, we have written a Colab. It's quite comprehensive and comes with a no-frills, drop-in implementation of SIREN. It doesn't require installing anything, and goes through the following experiments / SIREN properties:

  • Fitting an image
  • Fitting an audio signal
  • Solving Poisson's equation
  • Initialization scheme & distribution of activations
  • Distribution of activations is shift-invariant
  • Periodicity & behavior outside of the training range.
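To make the initialization scheme and periodic activation concrete, here is a minimal, dependency-free sketch of a sine layer. It is not the code from the Colab or from modules.py. It assumes the scheme described in the paper: first-layer weights drawn from U(-1/n, 1/n), later layers from U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0), with omega_0 = 30. The helper names (init_bound, make_sine_layer, sine_layer) are hypothetical, and the zero biases are a simplification.

```python
import math
import random

OMEGA_0 = 30.0  # frequency scale; 30 is assumed here as the paper's default


def init_bound(fan_in, is_first, omega_0=OMEGA_0):
    """Half-width of the uniform weight distribution for one sine layer."""
    if is_first:
        return 1.0 / fan_in
    return math.sqrt(6.0 / fan_in) / omega_0


def make_sine_layer(fan_in, fan_out, is_first, rng, omega_0=OMEGA_0):
    """Draw a (weights, biases) pair; biases are zero for simplicity."""
    bound = init_bound(fan_in, is_first, omega_0)
    weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
               for _ in range(fan_out)]
    biases = [0.0] * fan_out
    return weights, biases


def sine_layer(x, weights, biases, omega_0=OMEGA_0):
    """Compute sin(omega_0 * (W x + b)) for a single input vector x."""
    return [math.sin(omega_0 * (sum(w * xi for w, xi in zip(row, x)) + b))
            for row, b in zip(weights, biases)]


rng = random.Random(0)
first = make_sine_layer(2, 16, True, rng)     # 2-D pixel coordinate in
hidden = make_sine_layer(16, 16, False, rng)  # one hidden sine layer
features = sine_layer(sine_layer([0.25, -0.5], *first), *hidden)
```

Because every output passes through a sine, all activations stay within [-1, 1] regardless of the input coordinate, which is part of why the network remains well-behaved (and periodic) outside the training range.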

Tensorflow Playground

You can also play around with a tiny SIREN interactively, directly in the browser, via the Tensorflow Playground here. Thanks to David Cato for implementing this!

Get started

If you want to reproduce all the results (including the baselines) shown in the paper, the videos, point clouds, and audio files can be found here.

You can then set up a conda environment with all dependencies like so:

conda env create -f environment.yml
conda activate siren

High-Level structure

The code is organized as follows:

  • dataio.py loads training and testing data.
  • training.py contains a generic training routine.
  • modules.py contains layers and full neural network modules.
  • meta_modules.py contains hypernetwork code.
  • utils.py contains utility functions, most prominently related to the writing of Tensorboard summaries.
  • diff_operators.py contains implementations of differential operators.
  • loss_functions.py contains loss functions for the different experiments.
  • make_figures.py contains helper functions to create the convergence videos shown in the video.
  • ./experiment_scripts/ contains scripts to reproduce experiments in the paper.

Reproducing experiments

The directory experiment_scripts contains one script per experiment in the paper.

To monitor progress, the training code writes tensorboard summaries into a "summaries" subdirectory in the logging_root.

Image experiments

The image experiment can be reproduced with

python experiment_scripts/train_img.py --model_type=sine

The figures in the paper were made by extracting images from the tensorboard summaries. Example code showing how to do this can be found in the make_figures.py script.

Audio experiments

This GitHub repository comes with both the "counting" and "bach" audio clips under ./data.

They can be trained with

python experiment_scripts/train_audio.py --model_type=sine --wav_path=<path_to_audio_file>

Video experiments

The "bikes" video sequence comes with scikit-video and need not be downloaded. The cat video can be downloaded with the link above.

To fit a model to a video, run

python experiment_scripts/train_video.py --model_type=sine --experiment_name bikes_video

Poisson experiments

For the Poisson experiments, there are three separate scripts: one for reconstructing an image from its gradients (train_poisson_grad_img.py), one for reconstructing it from its Laplacian (train_poisson_lapl_image.py), and one for combining two images (train_poisson_gradcomp_img.py).

Some of the experiments were run using the BSD500 dataset, which you can download here.

SDF Experiments

To fit a Signed Distance Function (SDF) with SIREN, you first need a pointcloud in .xyz format that includes surface normals. If you only have a mesh / ply file, this can be accomplished with the open-source tool Meshlab.
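As a sanity check before training, the sketch below parses a point cloud with normals from text. It assumes that each non-empty line of the .xyz file holds six whitespace-separated numbers (x y z nx ny nz). That layout is a common convention for Meshlab exports, but it is an assumption here and should be verified against your own file. The function name parse_xyz_with_normals is hypothetical and not part of this repository.

```python
def parse_xyz_with_normals(text):
    """Split .xyz text into parallel lists of points and surface normals.

    Assumes one sample per line: x y z nx ny nz (whitespace-separated).
    """
    points, normals = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        values = [float(v) for v in line.split()]
        if len(values) < 6:
            raise ValueError(
                f"line {line_no}: expected 6 values (point + normal), "
                f"got {len(values)}"
            )
        points.append(tuple(values[:3]))
        normals.append(tuple(values[3:6]))
    return points, normals


sample = """\
0.0 0.0 1.0 0.0 0.0 1.0
1.0 0.0 0.0 1.0 0.0 0.0
"""
points, normals = parse_xyz_with_normals(sample)
```

A file that contains only three values per line has no normals and raises a ValueError here, which is a cheap way to catch an export that skipped the normals.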

To reproduce our results, we provide both models of the Thai Statue from the 3D Stanford model repository and the living room used in our paper for download here.

To start training a SIREN, run:

python experiment_scripts/train_single_sdf.py --model_type=sine --point_cloud_path=<path_to_the_model_in_xyz_format> --batch_size=250000 --experiment_name=experiment_1

This will regularly save checkpoints in the directory specified by the rootpath in the script, in a subdirectory "experiment_1". The batch_size is typically chosen so that a batch fills the memory of your GPU. In our experiments, a SIREN with 3 hidden layers of 256 units each allows a batch size between 230,000 and 250,000 on an NVIDIA GPU with 12 GB of memory.

To inspect an SDF fitted to a 3D point cloud, we now need to create a mesh from the zero-level set of the SDF. This is done by another script, which uses a marching cubes algorithm (adapted from the DeepSDF GitHub repo) and saves the resulting mesh in the .ply file format. It can be called with:

python experiment_scripts/test_single_sdf.py --checkpoint_path=<path_to_the_checkpoint_of_the_trained_model> --experiment_name=experiment_1_rec

This will save the .ply file as "reconstruction.ply" in "experiment_1_rec" (be patient, the marching cubes meshing step takes some time ;) ). If the machine you use for the reconstruction does not have enough RAM, running the test_single_sdf.py script will likely freeze. In that case, add the option --resolution=512 to the command line above (the default is 1600) to reconstruct the mesh at a lower spatial resolution.

The .ply file can be visualized using software such as Meshlab (a cross-platform visualizer and editor for 3D models).
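To see why the resolution matters so much, the rough estimate below assumes the SDF is evaluated on a dense resolution × resolution × resolution grid stored as 32-bit floats. This is an assumption about the meshing step, not a measurement of the script, and the real footprint may be higher once intermediate buffers are counted. The helper dense_grid_gib is hypothetical.

```python
def dense_grid_gib(resolution, bytes_per_value=4):
    """Memory in GiB for a dense cubic grid of SDF samples."""
    return resolution ** 3 * bytes_per_value / 2 ** 30


for resolution in (1600, 512):
    print(f"resolution={resolution}: ~{dense_grid_gib(resolution):.2f} GiB")
```

Under these assumptions the grid alone needs about 15.26 GiB at the default resolution and 0.5 GiB at 512, a reduction of roughly 30x, because memory grows with the cube of the resolution.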

Helmholtz and wave equation experiments

The Helmholtz and wave equation experiments can be reproduced with the train_helmholtz.py and train_wave_equation.py scripts, respectively.

Torchmeta

We're using the excellent torchmeta to implement hypernetworks. We realized that there is a technical report, which we forgot to cite - it'll make it into the camera-ready version!

Citation

If you find our work useful in your research, please cite:

@inproceedings{sitzmann2019siren,
    author = {Sitzmann, Vincent
              and Martel, Julien N.P.
              and Bergman, Alexander W.
              and Lindell, David B.
              and Wetzstein, Gordon},
    title = {Implicit Neural Representations
              with Periodic Activation Functions},
    booktitle = {arXiv},
    year={2020}
}

Contact

If you have any questions, please feel free to email the authors.
