Compact Bilinear Pooling for PyTorch

Overview

Compact Bilinear Pooling for PyTorch.

This repository has a pure Python implementation of Compact Bilinear Pooling and Count Sketch for PyTorch.

This version relies on the FFT implementation provided with PyTorch 0.4.0 onward. For older versions of PyTorch, use the tag v0.3.0.

Installation

Run the setup.py, for instance:

python setup.py install

Usage

class compact_bilinear_pooling.CompactBilinearPooling(input1_size, input2_size, output_size, h1 = None, s1 = None, h2 = None, s2 = None)

Basic usage:

from compact_bilinear_pooling import CountSketch, CompactBilinearPooling

input_size = 2048
output_size = 16000
mcb = CompactBilinearPooling(input_size, input_size, output_size).cuda()
x = torch.rand(4,input_size).cuda()
y = torch.rand(4,input_size).cuda()

z = mcb(x,y)

Test

A couple of test of the implementation of Compact Bilinear Pooling and its gradient can be run using:

python test.py

References

Comments
  • The value in ComplexMultiply_backward function

    The value in ComplexMultiply_backward function

    Hi @gdlg, thanks for this nice work. I'm confused about the backward procedure of complex multiplication. So I hope you can help me to figure it out.

    In forward,

    Z = XY = (Rx + i * Ix)(Ry + i * Iy) = (RxRy - IxIy) + i * (IxRy + RxIy) = Rz + i * Iz
    

    In backward, according the chain rule, it will has

    grad_(L/X) = grad_(L/Z) * grad(Z/X)
               = grad_Z * Y
               = (R_gz + i * I_gz)(Ry + i * Iy)
               = (R_gzRy - I_gzIy) + i * (I_gzRy + R_gzIy)
    

    So, why is this line implemented by using the value = 1 for real part and value = -1 for image part?

    Is there something wrong in my thoughts? Thanks.

    opened by KaiyuYue 8
  • The miss of Rfft

    The miss of Rfft

    When I run the test module, it indicates that the module of pytorch_fft of fft in autograd does not have attribute of Rfft. What version of pytorch_fft should I install to fit this code?

    opened by PeiqinZhuang 8
  • Save the model - TypeError: can't pickle Rfft objects

    Save the model - TypeError: can't pickle Rfft objects

    How do you save and load the model, I'm using torch.save, which cause the following error:

    File "x/anaconda3/lib/python3.6/site-packages/tor                                                                                                                               ch/serialization.py", line 135, in save
       return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickl                                                                                                                               e_protocol))
     File "x/anaconda3/lib/python3.6/site-packages/tor                                                                                                                               ch/serialization.py", line 117, in _with_file_like
       return body(f)
     File "xanaconda3/lib/python3.6/site-packages/tor                                                                                                                               ch/serialization.py", line 135, in <lambda>
       return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickl                                                                                                                               e_protocol))
     File "x/anaconda3/lib/python3.6/site-packages/tor                                                                                                                               ch/serialization.py", line 198, in _save
       pickler.dump(obj)
    TypeError: can't pickle Rfft objects
    
    
    opened by idansc 3
  • Multi GPU support

    Multi GPU support

    I modify

    class CompactBilinearPooling(nn.Module):   
         def forward(self, x, y):    
                return CompactBilinearPoolingFn.apply(self.sketch1.h, self.sketch1.s, self.sketch2.h, self.sketch2.s, self.output_size, x, y)
    

    to

    def forward(self, x):    
        x = x.permute(0, 2, 3, 1) #NCHW to NHWC   
        y = Variable(x.data.clone())    
        out = (CompactBilinearPoolingFn.apply(self.sketch1.h, self.sketch1.s, self.sketch2.h, self.sketch2.s, self.output_size, x, y)).permute(0,3,1,2) #to NCHW    
        out = nn.functional.adaptive_avg_pool2d(out, 1) # N,C,1,1   
        #add an element-wise signed square root layer and an instance-wise l2 normalization    
        out = (torch.sqrt(nn.functional.relu(out)) - torch.sqrt(nn.functional.relu(-out)))/torch.norm(out,2,1,True)   
        return out 
    

    This makes the compact pooling layer can be plugged to PyTorch CNNs more easily:

    model.avgpool = CompactBilinearPooling(input_C, input_C, bilinear['dim'])
    model.fc = nn.Linear(int(model.fc.in_features/input_C*bilinear['dim']), num_classes)

    However, when I run this using multiple GPUs, I got the following error:

    Traceback (most recent call last): File "train3_bilinear_pooling.py", line 400, in run() File "train3_bilinear_pooling.py", line 219, in run train(train_loader, model, criterion, optimizer, epoch) File "train3_bilinear_pooling.py", line 326, in train return _each_epoch('train', train_loader, model, criterion, optimizer, epoch) File "train3_bilinear_pooling.py", line 270, in _each_epoch output = model(input_var) File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 319, in call result = self.forward(*input, **kwargs) File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 67, in forward replicas = self.replicate(self.module, self.device_ids[:len(inputs)]) File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 72, in replicate return replicate(module, device_ids) File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/replicate.py", line 19, in replicate buffer_copies = comm.broadcast_coalesced(buffers, devices) File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/cuda/comm.py", line 55, in broadcast_coalesced for chunk in _take_tensors(tensors, buffer_size): File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/_utils.py", line 232, in _take_tensors if tensor.is_sparse: File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/autograd/variable.py", line 68, in getattr return object.getattribute(self, name) AttributeError: 'Variable' object has no attribute 'is_sparse'

    Do you have any ideas?

    opened by YanWang2014 3
  • AssertionError: False is not true

    AssertionError: False is not true

    Hi, I am back again. When running the test.py, I got the following error File "test.py", line 69, in test_gradients self.assertTrue(torch.autograd.gradcheck(cbp, (x,y), eps=1)) AssertionError: False is not true

    What does this mean?

    opened by YanWang2014 2
  • Support for Pytorch 1.11?

    Support for Pytorch 1.11?

    Hi, torch.fft() and torch.irfft() are no more functions, those are modules. And there appears to be a lof of modification in the parameters. I am currently trying to combine the two types of features with compact bilinear pooling, do you know how to port this code to pytorch 1.11?

    opened by bhosalems 1
  • Training does not converge after joining compact bilinear layer

    Training does not converge after joining compact bilinear layer

    Source code: x = self.features(x) #[4,512,28,28] batch_size = x.size(0) x = (torch.bmm(x, torch.transpose(x, 1, 2)) / 28 ** 2).view(batch_size, -1) x = torch.nn.functional.normalize(torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-10)) x = self.classifiers(x) return x my code: x = self.features(x) #[4,512,28,28] x = x.view(x.shape[0], x.shape[1], -1) #[4,512,784] x = x.permute(0, 2, 1) #[4,784,512] x = self.mcb(x,x) #[4,784,512] batch_size = x.size(0) x = x.sum(1) #对于二维来说,dim=0,对列求和;dim=1对行求和;在这里是三维所以是对列求和 x = torch.nn.functional.normalize(torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-10)) x = self.classifiers(x) return x

    The training does not converge after modification. Why? Is it a problem with my code?

    opened by roseif 3
Releases(v0.4.0)
Owner
Grégoire Payen de La Garanderie
Grégoire Payen de La Garanderie
UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus General info This is

71 Oct 25, 2022
MEND: Model Editing Networks using Gradient Decomposition

MEND: Model Editing Networks using Gradient Decomposition Setup Environment This codebase uses Python 3.7.9. Other versions may work as well. Create a

Eric Mitchell 141 Dec 02, 2022
Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces(ICML 2021)

Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces(ICML 2021) This repository contains the code

149 Dec 15, 2022
Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
A smaller subset of 10 easily classified classes from Imagenet, and a little more French

Imagenette 🎶 Imagenette, gentille imagenette, Imagenette, je te plumerai. 🎶 (Imagenette theme song thanks to Samuel Finlayson) NB: Versions of Image

fast.ai 718 Jan 01, 2023
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.2k Jan 09, 2023
Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires captur

Peter Lin 6.1k Jan 03, 2023
Benchmarks for Object Detection in Aerial Images

Benchmarks for Object Detection in Aerial Images

Jian Ding 691 Dec 30, 2022
Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift

This repository contains the official code of OSTAR in "Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift" (ICLR 2022).

Matthieu Kirchmeyer 5 Dec 06, 2022
OpenMMLab Image Classification Toolbox and Benchmark

Introduction English | 简体中文 MMClassification is an open source image classification toolbox based on PyTorch. It is a part of the OpenMMLab project. D

OpenMMLab 1.8k Jan 03, 2023
Pretrained Pytorch face detection (MTCNN) and recognition (InceptionResnet) models

Face Recognition Using Pytorch Python 3.7 3.6 3.5 Status This is a repository for Inception Resnet (V1) models in pytorch, pretrained on VGGFace2 and

Tim Esler 3.3k Jan 04, 2023
Direct design of biquad filter cascades with deep learning by sampling random polynomials.

IIRNet Direct design of biquad filter cascades with deep learning by sampling random polynomials. Usage git clone https://github.com/csteinmetz1/IIRNe

Christian J. Steinmetz 55 Nov 02, 2022
[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delu

Lue Tao 29 Sep 20, 2022
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

NL-Augmenter 🦎 → 🐍 The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformat

684 Jan 09, 2023
A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022) https://arxiv.org/abs/2203.09388 Jianqi Ma, Zheto

MA Jianqi, shiki 104 Jan 05, 2023
Find-Lane-Line - Use openCV library and Python to detect the road-lane-line

Find-Lane-Line This project is to use openCV library and Python to detect the road-lane-line. Data Pipeline Step one : Color Selection Step two : Cann

Kenny Cheng 3 Aug 17, 2022
This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

CORA This is the official implementation of the following paper: Akari Asai, Xinyan Yu, Jungo Kasai and Hannaneh Hajishirzi. One Question Answering Mo

Akari Asai 59 Dec 28, 2022
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction

FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction. It uses a customized encoder decoder architecture with spatio-temporal convolutions and channel ga

Tarun K 280 Dec 23, 2022
A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.

torchsynth The fastest synth in the universe. Introduction torchsynth is based upon traditional modular synthesis written in pytorch. It is GPU-option

torchsynth 229 Jan 02, 2023
Python scripts to detect faces in Python with the BlazeFace Tensorflow Lite models

Python scripts to detect faces using Python with the BlazeFace Tensorflow Lite models. Tested on Windows 10, Tensorflow 2.4.0 (Python 3.8).

Ibai Gorordo 46 Nov 17, 2022