PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner [Li et al., 2020].

Overview

VGPL-Visual-Prior

PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner (VGPL). Given visual obseravtions, the visual prior proposes their corresponding particle representations, in the form of particle positions and groupings. Please see the following paper for more details.

Visual Grounding of Learned Physical Models

Yunzhu Li, Toru Lin*, Kexin Yi*, Daniel M. Bear, Daniel L. K. Yamins, Jiajun Wu, Joshua B. Tenenbaum, and Antonio Torralba

ICML 2020 [website] [paper] [video]

Demo

Input RGB videos and predictions from our learned model

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required packages in requirements.txt

Code overview

Helper files

config.py contains all configurations used for model training, model evaluation and output generation.

dataset.py contains helper functions for loading and standardizing data and related variables. Note that paths to data directories is specified in the _DATA_DIR variable in this file, not in config.py.

loss.py contains helper functions for calculating Chamfer loss in different settings (e.g. in a single frame, across a time sequence, etc.).

model.py implements the neural network model used for prediction.

Main files

The following files can be run directly; see "Training and evaluation" section for more details.

train.py trains a model that could convert input observations into their particle representations.

eval.py evaluates a trained model by visualizing its predictions, and/or stores the output predictions in .h5 format.

Training and evaluation

Download the training and evaluation data from the following links, and put them in data folder. Optionally, download our trained model checkpoints and put them in dump folder.

To train a model:

python train.py --set loss_type l2 dataset RigidFall

To debug (by overfitting model on small batch of data):

python train.py --set loss_type l2 dataset RigidFall debug True

To evaluate a trained model and generate outputs using our provided checkpoints:

python eval.py --set loss_type l2 dataset RigidFall n_frames 4 n_frames_eval 30 load_path dump/rigid_fall_4frame_l2.pth
python eval.py --set loss_type l2 dataset MassRope n_frames 4 n_frames_eval 30 load_path dump/mass_rope_4frame_l2.pth

See config.py for more details on customizable configurations.

Citing VGPL

If you find this codebase useful in your research, please consider citing:

@inproceedings{li2020visual,
    Title={Visual Grounding of Learned Physical Models},
    Author={Li, Yunzhu and Lin, Toru and Yi, Kexin and Bear, Daniel and Yamins, Daniel L.K. and Wu, Jiajun and Tenenbaum, Joshua B. and Torralba, Antonio},
    Booktitle={ICML},
    Year={2020}
}

@inproceedings{li2019learning,
    Title={Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids},
    Author={Li, Yunzhu and Wu, Jiajun and Tedrake, Russ and Tenenbaum, Joshua B and Torralba, Antonio},
    Booktitle={ICLR},
    Year={2019}
}
Owner
Toru
Toru
Capstone-Project-2 - A game program written in the Python language

Capstone-Project-2 My Pygame Game Information: Description This Pygame project i

Nhlakanipho Khulekani Hlophe 1 Jan 04, 2022
[3DV 2021] Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation

Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation This is the official implementation for the method described in Ch

Jiaxing Yan 27 Dec 30, 2022
Official PyTorch Implementation of Rank & Sort Loss [ICCV2021]

Rank & Sort Loss for Object Detection and Instance Segmentation The official implementation of Rank & Sort Loss. Our implementation is based on mmdete

Kemal Oksuz 229 Dec 20, 2022
Official implementation for paper Render In-between: Motion Guided Video Synthesis for Action Interpolation

Render In-between: Motion Guided Video Synthesis for Action Interpolation [Paper] [Supp] [arXiv] [4min Video] This is the official Pytorch implementat

8 Oct 27, 2022
Pretrained models for Jax/Haiku; MobileNet, ResNet, VGG, Xception.

Pre-trained image classification models for Jax/Haiku Jax/Haiku Applications are deep learning models that are made available alongside pre-trained we

Alper Baris CELIK 14 Dec 20, 2022
IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation Independent Encoder for Deep

30 Nov 05, 2022
This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

Gautam Singh 66 Dec 26, 2022
Yolov5-opencv-cpp-python - Example of using ultralytics YOLO V5 with OpenCV 4.5.4, C++ and Python

yolov5-opencv-cpp-python Example of performing inference with ultralytics YOLO V

183 Jan 09, 2023
Point Cloud Registration using Representative Overlapping Points.

Point Cloud Registration using Representative Overlapping Points (ROPNet) Abstract 3D point cloud registration is a fundamental task in robotics and c

ZhuLifa 36 Dec 16, 2022
Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

Pyjcsx 328 Dec 17, 2022
A large-image collection explorer and fast classification tool

IMAX: Interactive Multi-image Analysis eXplorer This is an interactive tool for visualize and classify multiple images at a time. It written in Python

Matias Carrasco Kind 23 Dec 16, 2022
Self-Supervised Learning of Event-based Optical Flow with Spiking Neural Networks

Self-Supervised Learning of Event-based Optical Flow with Spiking Neural Networks Work accepted at NeurIPS'21 [paper, video]. If you use this code in

TU Delft 43 Dec 07, 2022
This project is based on RIFE and aims to make RIFE more practical for users by adding various features and design new models

CPM 项目描述 CPM(Chinese Pretrained Models)模型是北京智源人工智能研究院和清华大学发布的中文大规模预训练模型。官方发布了三种规模的模型,参数量分别为109M、334M、2.6B,用户需申请与通过审核,方可下载。 由于原项目需要考虑大模型的训练和使用,需要安装较为复杂

hzwer 190 Jan 08, 2023
Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296

Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions This repo contains the dataset and code for the paper Benchmarking Ro

Jiachen Sun 168 Dec 29, 2022
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

7 Jan 08, 2023
[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)

Are Transformers More Robust Than CNNs? Pytorch implementation for NeurIPS 2021 Paper: Are Transformers More Robust Than CNNs? Our implementation is b

Yutong Bai 145 Dec 01, 2022
Multi-label classification of retinal disorders

Multi-label classification of retinal disorders This is a deep learning course project. The goal is to develop a solution, using computer vision techn

Sundeep Bhimireddy 1 Jan 29, 2022
EfficientNetV2 implementation using PyTorch

EfficientNetV2-S implementation using PyTorch Train Steps Configure imagenet path by changing data_dir in train.py python main.py --benchmark for mode

Jahongir Yunusov 86 Dec 29, 2022
Code repo for "RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network" (Machine Learning and the Physical Sciences workshop in NeurIPS 2021).

RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network An official PyTorch implementation of the RBSRICNN network as desc

Rao Muhammad Umer 6 Nov 14, 2022
Multistream CNN for Robust Acoustic Modeling

Multistream Convolutional Neural Network (CNN) A multistream CNN is a novel neural network architecture for robust acoustic modeling in speech recogni

ASAPP Research 37 Sep 21, 2022