An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

Overview

GLOM - Pytorch (wip)

An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns) for emergent part-whole heirarchies from data.

Citations

@misc{hinton2021represent,
    title   = {How to represent part-whole hierarchies in a neural network}, 
    author  = {Geoffrey Hinton},
    year    = {2021},
    eprint  = {2102.12627},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
Comments
  • help

    help

    Hello, when I tried to reproduce your model, I got this error. I'm not sure how to correct it, can y help me?

    Traceback (most recent call last): File "main.py", line 172, in outputs = custom_model(images,iters = 12) File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/class/glom_pytorch/glom_pytorch.py", line 109, in forward consensus = self.attention(levels) File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/root/class/glom_pytorch/glom_pytorch.py", line 49, in forward sim.masked_fill(self_mask, TOKEN_ATTEND_SELF_VALUE) RuntimeError: Expected object of scalar type Bool but got scalar type Float for argument #2 'mask' in call to th_masked_fill_bool

    opened by DDxk369 1
  • Levels token

    Levels token

    Hello, thank you for your good work. I was trying to implement the idea you shared in this todo:

    https://github.com/lucidrains/glom-pytorch/projects/1#card-56284841

    The text reads: allow each level to be represented by a list of tokens, updated with attention, simliar to https://github.com/lucidrains/transformer-in-transformer

    I was going to implement it with a simple token at each level, but I was wondering if you had any suggestion on how to implement it correctly. Thank you.

    opened by zenos4mbu 0
  • Implementing geometric mean for consensus opinion/levels_mean

    Implementing geometric mean for consensus opinion/levels_mean

    Hi, I'm trying to implement the consensus opinion (levels_mean) as a geometric mean of the top-down predictions, bottom-up predictions, attention-weighted average of same-level embeddings, and embeddings of the previous time step as described by the original paper. Any ideas on how the weights should be set?

    At first I thought this could be a learnable parameter, but section 9.1 reads

    For interpreting a static image with no temporal context, the weights used for this weighted geometric mean need to change during the iterations that occur after a new fixation.

    which leads me to believe that these might need to be outputted on the fly a la vanilla attention as opposed to being learned. Maybe an MLP that takes in the four source embeddings and outputs four scalars as weights?

    opened by ryan-caesar-ramos 0
  • Classification

    Classification

    Hi @lucidrains ! Do you have any idea/insight on how to supervise classification (let's say, for example, MNIST digits classification) after having trained GLOM in an unsupervised way as a denoising autoencoder? In the paper that seems to be the final goal. However, it's not clear to me which columns and/or levels should be used for the classification. Also, since GLOM it's dealing with patches, how can single black patches vote towards a certain digit?

    In other words, after training GLOM as a denoising autoencoder on MNIST, what we have is:

    • p X p columns, where p is the number of patches per dimension (e.g. 7X7=49 patches)
    • 6 levels for each column, where the top-most levels should in theory represent higher-level entities, so it seems natural to search for the digit information in these layers
    • 6*2=12 iterations, to allow for information to be passed by both top-down and bottom-up networks

    Just by applying dimensionality reduction on the top-most level at different iterations does not seem enough to make the digit clusters emerge. So I'm wondering if you (or anybody else) have some insights on this. Cheers!

    opened by A7ocin 1
  • Bug in forward?

    Bug in forward?

    Hello, thank you for making this code available! I think there could be a potential bug in the first line of the forward function:

    b, h, w, _, device = *img.shape, img.device

    but the input image shape is of kind b c h w, so it could be fixed by replacing it with

    b, _, h, w, device = *img.shape, img.device

    Am I wrong?

    opened by A7ocin 9
Owner
Phil Wang
Working with Attention. It's all we need.
Phil Wang
Rethinking the U-Net architecture for multimodal biomedical image segmentation

MultiResUNet Rethinking the U-Net architecture for multimodal biomedical image segmentation This repository contains the original implementation of "M

Nabil Ibtehaz 308 Jan 05, 2023
Easy-to-use,Modular and Extendible package of deep-learning based CTR models .

DeepCTR DeepCTR is a Easy-to-use,Modular and Extendible package of deep-learning based CTR models along with lots of core components layers which can

浅梦 6.6k Jan 08, 2023
CenterFace(size of 7.3MB) is a practical anchor-free face detection and alignment method for edge devices.

CenterFace Introduce CenterFace(size of 7.3MB) is a practical anchor-free face detection and alignment method for edge devices. Recent Update 2019.09.

StarClouds 1.2k Dec 21, 2022
Predict bus arrival time using VertexAI and Nvidia's Jetson Nano

bus_prediction predict bus arrival time using VertexAI and Nvidia's Jetson Nano imagenet the command for imagenet.py look like this python3 /path/to/i

10 Dec 22, 2022
SphereFace: Deep Hypersphere Embedding for Face Recognition

SphereFace: Deep Hypersphere Embedding for Face Recognition By Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj and Le Song License SphereFa

Weiyang Liu 1.5k Dec 29, 2022
Bringing Computer Vision and Flutter together , to build an awesome app !!

Bringing Computer Vision and Flutter together , to build an awesome app !! Explore the Directories Flutter · Machine Learning Table of Contents About

Padmanabha Banerjee 14 Apr 07, 2022
Starter kit for getting started in the Music Demixing Challenge.

Music Demixing Challenge - Starter Kit 👉 Challenge page This repository is the Music Demixing Challenge Submission template and Starter kit! Clone th

AIcrowd 106 Dec 20, 2022
BankNote-Net: Open dataset and encoder model for assistive currency recognition

BankNote-Net: Open Dataset for Assistive Currency Recognition Millions of people around the world have low or no vision. Assistive software applicatio

Microsoft 13 Oct 28, 2022
A tensorflow=1.13 implementation of Deconvolutional Networks on Graph Data (NeurIPS 2021)

GDN A tensorflow=1.13 implementation of Deconvolutional Networks on Graph Data (NeurIPS 2021) Abstract In this paper, we consider an inverse problem i

4 Sep 13, 2022
Rule based classification A hotel s customers dataset

Rule-based-classification-A-hotel-s-customers-dataset- Aim: Categorize new customers by segment and predict how much revenue they can generate This re

Şebnem 4 Jan 02, 2022
Evaluating saliency methods on artificial data with different background types

Evaluating saliency methods on artificial data with different background types This repository contains the relevant code for the MedNeurips 2021 subm

2 Jul 05, 2022
f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation [Paper] [PyTorch] [MXNet] [Video] This repository provides code for training

Visual Understanding Lab @ Samsung AI Center Moscow 516 Dec 21, 2022
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology Self-Supervised Vision Transformers Learn Visual Concepts in Histopatholog

Richard Chen 95 Dec 24, 2022
GNEE - GAT Neural Event Embeddings

GNEE - GAT Neural Event Embeddings This repository contains source code for the GNEE (GAT Neural Event Embeddings) method introduced in the paper: "Se

João Pedro Rodrigues Mattos 0 Sep 15, 2021
Learning to Prompt for Vision-Language Models.

CoOp Paper: Learning to Prompt for Vision-Language Models Authors: Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu CoOp (Context Optimization)

Kaiyang 679 Jan 04, 2023
Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification This repository is the official implementation of [Dealing With Misspeci

0 Oct 25, 2021
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing"

A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf 2021). Abstract In this work we propose Pathfind

Benedek Rozemberczki 49 Dec 01, 2022
Nightmare-Writeup - Writeup for the Nightmare CTF Challenge from 2022 DiceCTF

Nightmare: One Byte to ROP // Alternate Solution TLDR: One byte write, no leak.

1 Feb 17, 2022
Official NumPy Implementation of Deep Networks from the Principle of Rate Reduction (2021)

Deep Networks from the Principle of Rate Reduction This repository is the official NumPy implementation of the paper Deep Networks from the Principle

Ryan Chan 49 Dec 16, 2022
FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction

FaceExtraction FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction Occlusions often occur in face images in the wild, tr

16 Dec 14, 2022