GRF: Learning a General Radiance Field for 3D Representation and Rendering

Related tags

Deep LearningGRF
Overview

GRF: Learning a General Radiance Field for 3D Representation and Rendering

[Paper] [Video]

GRF: Learning a General Radiance Field for 3D Representation and Rendering
Alex Trevithick1,2 and Bo Yang2,3
1Williams College, 2University of Oxford, 3The Hong Kong Polytechnic University in ICCV 2021

This is the codebase which is currently a work in progress.

Overview of GRF

GRF is a powerful implicit neural function that can represent and render arbitrarily complex 3D scenes in a single network only from 2D observations. GRF takes a set of posed 2D images as input, constructs an internal representation for each 3D point of the scene, and renders the corresponding appearance and geometry of any 3D point viewing from an arbitrary angle. The key to our approach is to explicitly integrate the principle of multi-view geometry to obtain features representative of an entire ray from a given viewpoint. Thus, in a single forward pass to render a scene from a novel view, GRF takes some views of that scene as input, computes per-pixel pose-aware features for each ray from the given viewpoints through the image plane at that pixel, and then uses those features to predict the volumetric density and rgb values of points in 3D space. Volumetric rendering is then applied.

Setting Up the Environment

Use conda to setup an environment as follows:

conda env create -f environment.yml
conda activate grf

Data

  • SRN cars and chairs datasets can be downloaded from the paper's drive link
  • NeRF-Synthetic and LLFF datasets can be downloaded from the NeRF drive link
  • MultiShapenet dataset can be downloaded from the DISN drive link

Training and Rendering from the Model

To train and render from the model, use the run.py script

python run.py --data_root [path to directory with dataset] ] \
    --expname [experiment name]
    --basedir [where to store ckpts and logs]
    --datadir [input data directory]
    --netdepth [layers in network]
    --netwidth [channels per layer]
    --netdepth_fine [layers in fine network]
    --netwidth_fine [channels per layer in fine network]
    --N_rand [batch size (number of random rays per gradient step)]
    --lrate [learning rate]
    --lrate_decay [exponential learning rate decay (in 1000s)]
    --chunk [number of rays processed in parallel, decrease if running out of memory]
    --netchunk [number of pts sent through network in parallel, decrease if running out of memory]
    --no_batching [only take random rays from 1 image at a time]
    --no_reload [do not reload weights from saved ckpt]
    --ft_path [specific weights npy file to reload for coarse network]
    --random_seed [fix random seed for repeatability]
    --precrop_iters [number of steps to train on central crops]
    --precrop_frac [fraction of img taken for central crops]
    --N_samples [number of coarse samples per ray]
    --N_importance [number of additional fine samples per ray]
    --perturb [set to 0. for no jitter, 1. for jitter]
    --use_viewdirs [use full 5D input instead of 3D]
    --i_embed [set 0 for default positional encoding, -1 for none]
    --multires [log2 of max freq for positional encoding (3D location)]
    --multires_views [log2 of max freq for positional encoding (2D direction)]
    --raw_noise_std [std dev of noise added to regularize sigma_a output, 1e0 recommended]
    --render_only [do not optimize, reload weights and render out render_poses path]
    --dataset_type [options: llff / blender / shapenet / multishapenet]
    --testskip [will load 1/N images from test/val sets, useful for large datasets like deepvoxels]
    --white_bkgd [set to render synthetic data on a white bkgd (always use for dvoxels)]
    --half_res [load blender synthetic data at 400x400 instead of 800x800]
    --no_ndc [do not use normalized device coordinates (set for non-forward facing scenes)]
    --lindisp [sampling linearly in disparity rather than depth]
    --spherify [set for spherical 360 scenes]
    --llffhold [will take every 1/N images as LLFF test set, paper uses 8]
    --i_print [frequency of console printout and metric loggin]
    --i_img [frequency of tensorboard image logging]
    --i_weights [frequency of weight ckpt saving]
    --i_testset [frequency of testset saving]
    --i_video [frequency of render_poses video saving]
    --attention_direction_multires [frequency of embedding for value]
    --attention_view_multires [frequency of embedding for direction]
    --training_recon [whether to render images from the test set or not during final evaluation]
    --use_quaternion [append input pose as quaternion to input to unet]
    --no_globl [don't use global vector in middle of unet]
    --no_render_pose [append render pose to input to unet]
    --use_attsets [use attsets, otherwise use slot attention]

In particular, note that to render and test from a trained model, set render_only to True in the config.

Configs

The current configs are for the blender, LLFF, and shapenet datasets, which can be found in configs.

After setting the parameters of the model, to run it,

python run.py --configs/config_DATATYPE

Practical Concerns

The models were tested on 32gb GPUs, and higher resolution images require very large amounts of memory. The shapenet experiments should run on 16gb GPUs.

Acknowledgements

The code is built upon the original NeRF implementation. Thanks to LucidRains for the torch implementation of slot attention on which the current version is based.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{grf2020,
  title={GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering},
  author={Trevithick, Alex and Yang, Bo},
  booktitle={arXiv:2010.04595},
  year={2020}
}
Owner
Alex Trevithick
ML + CV👍
Alex Trevithick
Unified learning approach for egocentric hand gesture recognition and fingertip detection

Unified Gesture Recognition and Fingertip Detection A unified convolutional neural network (CNN) algorithm for both hand gesture recognition and finge

Mohammad 227 Dec 25, 2022
Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)

GraspNet Baseline Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020). [paper] [dataset] [API] [do

GraspNet 209 Dec 29, 2022
Processed, version controlled history of Minecraft's generated data and assets

mcmeta Processed, version controlled history of Minecraft's generated data and assets Repository structure Each of the following branches has a commit

Misode 75 Dec 28, 2022
This is my codes that can visualize the psnr image in testing videos.

CVPR2018-Baseline-PSNRplot This is my codes that can visualize the psnr image in testing videos. Future Frame Prediction for Anomaly Detection – A New

Wenhao Yang 12 May 29, 2021
Pretrained models for Jax/Haiku; MobileNet, ResNet, VGG, Xception.

Pre-trained image classification models for Jax/Haiku Jax/Haiku Applications are deep learning models that are made available alongside pre-trained we

Alper Baris CELIK 14 Dec 20, 2022
GNEE - GAT Neural Event Embeddings

GNEE - GAT Neural Event Embeddings This repository contains source code for the GNEE (GAT Neural Event Embeddings) method introduced in the paper: "Se

João Pedro Rodrigues Mattos 0 Sep 15, 2021
PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll

7 Feb 10, 2022
The challenge for Quantum Coalition Hackathon 2021

Qchack 2021 Google Challenge This is a challenge for the brave 2021 qchack.io participants. Instructions Hello, intrepid qchacker, welcome to the G|o

quantumlib 18 May 04, 2022
PyTorch implementation of VAGAN: Visual Feature Attribution Using Wasserstein GANs

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 93 Aug 17, 2022
A TensorFlow implementation of FCN-8s

FCN-8s implementation in TensorFlow Contents Overview Examples and demo video Dependencies How to use it Download pre-trained VGG-16 Overview This is

Pierluigi Ferrari 50 Aug 08, 2022
Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

Chonghan_Lee 15 Nov 02, 2022
[AAAI 2022] Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation

A paper Introduction This is an official release of the paper Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation wit

Jiacheng Wang 14 Dec 08, 2022
Complex-Valued Neural Networks (CVNN)Complex-Valued Neural Networks (CVNN)

Complex-Valued Neural Networks (CVNN) Done by @NEGU93 - J. Agustin Barrachina Using this library, the only difference with a Tensorflow code is that y

youceF 1 Nov 12, 2021
The coda and data for "Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach" (ACL '21)

We propose a hierarchical core-fringe learning framework to measure fine-grained domain relevance of terms – the degree that a term is relevant to a broad (e.g., computer science) or narrow (e.g., de

Jie Huang 14 Oct 21, 2022
PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

SimTrans-Weak-Shot-Classification This repository contains the official PyTorch implementation of the following paper: Weak-shot Fine-grained Classifi

BCMI 60 Dec 02, 2022
Experiments on Flood Segmentation on Sentinel-1 SAR Imagery with Cyclical Pseudo Labeling and Noisy Student Training

Flood Detection Challenge This repository contains code for our submission to the ETCI 2021 Competition on Flood Detection (Winning Solution #2). Acco

Siddha Ganju 108 Dec 28, 2022
The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

Wenhao Wang 89 Jan 02, 2023
Code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge.

Open Sesame This repository contains the code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge. Credits We built the project on t

9 Jul 24, 2022
Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019)

Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019) Introduction Official implementation of Adaptive Pyramid Context Network

21 Nov 09, 2022
Example of semantic segmentation in Keras

keras-semantic-segmentation-example Example of semantic segmentation in Keras Single class example: Generated data: random ellipse with random color o

53 Mar 23, 2022