An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

Last update: Dec 14, 2022

Overview

GLOM - Pytorch (wip)

An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns) for emergent part-whole hierarchies from data.

Citations

@misc{hinton2021represent,
    title   = {How to represent part-whole hierarchies in a neural network}, 
    author  = {Geoffrey Hinton},
    year    = {2021},
    eprint  = {2102.12627},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Comments

help

Hello, when I tried to reproduce your model, I got this error. I'm not sure how to correct it. Can you help me?

Traceback (most recent call last):
  File "main.py", line 172, in <module>
    outputs = custom_model(images,iters = 12)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/class/glom_pytorch/glom_pytorch.py", line 109, in forward
    consensus = self.attention(levels)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/class/glom_pytorch/glom_pytorch.py", line 49, in forward
    sim.masked_fill(self_mask, TOKEN_ATTEND_SELF_VALUE)
RuntimeError: Expected object of scalar type Bool but got scalar type Float for argument #2 'mask' in call to th_masked_fill_bool
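The error message says that `masked_fill` received a float tensor where it expects a boolean mask. Below is a minimal sketch of one likely cause and a workaround, assuming `self_mask` was built as a float identity matrix. The tensor shapes and the value given to `TOKEN_ATTEND_SELF_VALUE` are illustrative assumptions, not taken from the repository.

```python
import torch

# Illustrative value only; the constant in the repository may differ.
TOKEN_ATTEND_SELF_VALUE = -5e-4

n = 4                               # hypothetical number of patches
sim = torch.randn(1, n, n)          # hypothetical similarity matrix

# torch.eye returns a float tensor by default, which masked_fill rejects as a mask.
float_mask = torch.eye(n)

# Casting to bool gives masked_fill the dtype it expects.
bool_mask = float_mask.bool()

# True entries (the diagonal) are overwritten; all other entries are kept.
masked = sim.masked_fill(bool_mask, TOKEN_ATTEND_SELF_VALUE)
```

Note that `masked_fill` returns a new tensor, so the result has to be assigned for it to have any effect.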

opened by DDxk369 1

Levels token

Hello, thank you for your good work. I was trying to implement the idea you shared in this todo:

https://github.com/lucidrains/glom-pytorch/projects/1#card-56284841

The text reads: allow each level to be represented by a list of tokens, updated with attention, similar to https://github.com/lucidrains/transformer-in-transformer

I was going to implement it with a simple token at each level, but I was wondering if you had any suggestion on how to implement it correctly. Thank you.

opened by zenos4mbu 0

Implementing geometric mean for consensus opinion/levels_mean

Hi, I'm trying to implement the consensus opinion (levels_mean) as a geometric mean of the top-down predictions, bottom-up predictions, attention-weighted average of same-level embeddings, and embeddings of the previous time step as described by the original paper. Any ideas on how the weights should be set?

At first I thought this could be a learnable parameter, but section 9.1 reads

For interpreting a static image with no temporal context, the weights used for this weighted geometric mean need to change during the iterations that occur after a new fixation.

which leads me to believe that these weights might need to be computed on the fly, a la vanilla attention, rather than stored as fixed learned parameters. Maybe an MLP that takes in the four source embeddings and outputs four scalars as weights?
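The MLP idea above can be sketched as follows. This is only one possible reading, not the paper's or the repository's method. It assumes the four source embeddings are treated as log-domain quantities, so that a weighted geometric mean becomes a weighted arithmetic mean in log space. The names `DynamicConsensusWeights` and `weighted_geometric_mean`, the hidden size, and the tensor shapes are all hypothetical.

```python
import torch
from torch import nn

class DynamicConsensusWeights(nn.Module):
    """Hypothetical module: predicts four mixing weights from the four sources."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4 * dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, 4),
        )

    def forward(self, sources):
        # sources: (..., 4, dim), stacked as
        # [bottom_up, top_down, attention, previous_step]
        flat = sources.flatten(start_dim=-2)       # (..., 4 * dim)
        return self.mlp(flat).softmax(dim=-1)      # (..., 4), sums to 1

def weighted_geometric_mean(log_sources, weights):
    # log(prod_i x_i ** w_i) == sum_i w_i * log(x_i), so in log space the
    # weighted geometric mean is a weighted sum over the source axis.
    return (weights.unsqueeze(-1) * log_sources).sum(dim=-2)

# Assumed shapes: batch 2, 49 columns, 6 levels, embedding dimension 32.
sources = torch.randn(2, 49, 6, 4, 32)
weights = DynamicConsensusWeights(dim=32)(sources)      # (2, 49, 6, 4)
levels_mean = weighted_geometric_mean(sources, weights) # (2, 49, 6, 32)
```

Because the weights are recomputed from the current embeddings, they change from one iteration to the next even though the MLP parameters themselves are learned.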

opened by ryan-caesar-ramos 0

Classification

Hi @lucidrains! Do you have any insight on how to supervise classification (for example, MNIST digit classification) after training GLOM in an unsupervised way as a denoising autoencoder? In the paper that seems to be the final goal. However, it's not clear to me which columns and/or levels should be used for the classification. Also, since GLOM deals with patches, how can individual black patches vote for a certain digit?

In other words, after training GLOM as a denoising autoencoder on MNIST, what we have is:

p X p columns, where p is the number of patches per dimension (e.g. 7X7=49 patches)

6 levels for each column, where the top-most levels should in theory represent higher-level entities, so it seems natural to search for the digit information in these layers

6*2=12 iterations, to allow for information to be passed by both top-down and bottom-up networks

Applying dimensionality reduction to the top-most level at different iterations does not seem to be enough to make the digit clusters emerge. So I'm wondering if you (or anybody else) have some insights on this. Cheers!
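One option that could be tried for the setup described above is a linear probe on the top-level embeddings, averaged over all columns so that no single patch has to carry the label on its own. This is a sketch under assumptions and is not described in the paper or the repository. `TopLevelProbe` is a hypothetical name, the random tensor stands in for the outputs of a frozen GLOM model, and the last index of the level axis is assumed to be the top-most level.

```python
import torch
from torch import nn

class TopLevelProbe(nn.Module):
    """Hypothetical linear probe over mean-pooled top-level embeddings."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, num_classes))

    def forward(self, levels):
        # levels: (batch, columns, num_levels, dim)
        top = levels[:, :, -1]        # assumed top-most level of every column
        pooled = top.mean(dim=1)      # average over the p x p columns
        return self.head(pooled)      # (batch, num_classes)

# Stand-in for frozen GLOM outputs: 8 images, 7 x 7 = 49 columns, 6 levels.
levels = torch.randn(8, 49, 6, 32)
labels = torch.randint(0, 10, (8,))

probe = TopLevelProbe(dim=32, num_classes=10)
logits = probe(levels.detach())       # detach keeps the backbone frozen
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                       # gradients reach only the probe
```

If the top level has settled into islands of agreement, the pooled vector should be dominated by the whole-object embedding. Whether that happens after denoising pre-training is exactly the open question raised here.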
opened by A7ocin 1

Bug in forward?

Hello, thank you for making this code available! I think there could be a potential bug in the first line of the forward function:

b, h, w, _, device = *img.shape, img.device

but the input image has the shape b c h w, so it could be fixed by replacing it with

b, _, h, w, device = *img.shape, img.device

Am I wrong?
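The mismatch can be checked directly on a dummy tensor. This is a small sketch assuming a `b c h w` input; the concrete sizes are made up for illustration.

```python
import torch

img = torch.zeros(2, 3, 32, 48)     # hypothetical batch in b c h w layout

# Unpacking as quoted in the issue: the channel count ends up in `h`.
b, h, w, _, device = *img.shape, img.device
wrong = (b, h, w)

# Proposed fix: skip the channel dimension instead of the last one.
b, _, h, w, device = *img.shape, img.device
fixed = (b, h, w)

print(wrong)   # (2, 3, 32)
print(fixed)   # (2, 32, 48)
```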

opened by A7ocin 9

Releases(0.0.14)

0.0.14 (Mar 27, 2021)
0.0.12 (Mar 6, 2021)
0.0.11 (Mar 5, 2021)
0.0.10a (Mar 5, 2021)
0.0.9a (Mar 5, 2021)
0.0.8 (Mar 5, 2021)
0.0.7 (Mar 5, 2021)
0.0.6 (Mar 5, 2021)
0.0.5 (Mar 5, 2021)
0.0.4 (Mar 5, 2021)
0.0.3 (Mar 5, 2021)
0.0.2 (Mar 5, 2021)

Owner

Phil Wang

Working with Attention. It's all we need.

