Dilated Convolution with Learnable Spacings PyTorch

Overview

Ismail Khalfaoui Hassani

Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a novel convolution method based on gradient descent and interpolation. It can be seen as an improvement on the well-known dilated convolution, which has been widely explored in deep convolutional neural networks and which inflates the convolutional kernel by inserting spaces between the kernel elements.

In DCLS, the positions of the weights within the convolutional kernel are learned in a gradient-based manner, and the inherent problem of non-differentiability due to the integer nature of the positions in the kernel is solved by taking advantage of an interpolation method.
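The role of interpolation can be illustrated with a minimal, dependency-free sketch. It assumes bilinear interpolation, which the text above does not specify, and the helper name `place_weight` is hypothetical rather than part of the DCLS API. A weight at a real-valued position is spread over its four integer neighbours, so the resulting dense kernel varies smoothly with the position:

```python
import math


def place_weight(weight, p_h, p_w, height, width):
    """Spread one weight located at the real-valued position (p_h, p_w)
    over a dense height x width kernel using bilinear interpolation.

    Hypothetical helper for illustration; assumes height >= 2 and width >= 2.
    """
    kernel = [[0.0] * width for _ in range(height)]
    # Clamp the position so that all four neighbours lie inside the kernel.
    p_h = min(max(p_h, 0.0), height - 1.0)
    p_w = min(max(p_w, 0.0), width - 1.0)
    h0 = min(math.floor(p_h), height - 2)
    w0 = min(math.floor(p_w), width - 2)
    f_h, f_w = p_h - h0, p_w - w0  # fractional parts, each in [0, 1]
    # Each neighbour receives a share that depends continuously on the position.
    kernel[h0][w0] += weight * (1 - f_h) * (1 - f_w)
    kernel[h0][w0 + 1] += weight * (1 - f_h) * f_w
    kernel[h0 + 1][w0] += weight * f_h * (1 - f_w)
    kernel[h0 + 1][w0 + 1] += weight * f_h * f_w
    return kernel


kernel = place_weight(2.0, p_h=1.25, p_w=2.5, height=4, width=4)
for row in kernel:
    print(row)
```

Because the four shares are polynomial in the fractional parts of the position, gradients with respect to `p_h` and `p_w` exist almost everywhere, which is what makes the positions learnable by gradient descent.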

For now, the code has only been implemented in PyTorch, using PyTorch's C++ API and custom CUDA extensions.

Installation

DCLS is based on PyTorch and CUDA. Please make sure that you have installed all the requirements before you install DCLS.

Install the latest stable version from PyPI:

coming soon

Install the latest development version from source:

From GitHub:

git clone https://github.com/K-H-Ismail/Dilated-Convolution-with-Learnable-Spacings-PyTorch.git
cd Dilated-Convolution-with-Learnable-Spacings-PyTorch
python ./setup.py install 

To avoid installing into the wrong directory or an incorrect PYTHONPATH, use:

export PYTHONPATH=path/to/your/Python-Ver/lib/pythonVer/site-packages/
python ./setup.py install --prefix=path/to/your/Python-Ver/

Usage

DCLS modules can easily be used as a substitute for PyTorch's classical nn.ConvNd convolution modules:

import torch
from DCLS.modules.Dcls import Dcls2d

# With square kernels, equal stride and dilation
m = Dcls2d(16, 33, 3, dilation=4, stride=2)
# With non-square kernels, unequal stride, padding and equal dilation
m = Dcls2d(16, 33, (3, 5), dilation=4, stride=(2, 1), padding=(4, 2))
# With non-square kernels, unequal stride, padding and unequal dilation
m = Dcls2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 2))
# With square kernels, equal stride and dilation, and a scaling gain for the positions
m = Dcls2d(16, 33, 3, dilation=4, stride=2, gain=10)
# DCLS currently supports CUDA devices only, so move the module and the input to the GPU
m = m.cuda()
input = torch.randn(20, 16, 50, 100).cuda()
output = m(input)

Note: using Dcls2d with a dilation argument of 1 basically amounts to using nn.Conv2d; therefore, DCLS should always be used with a dilation > 1.

Construct and Im2col methods

The constructive DCLS method suffers from a performance problem at larger dilations (greater than 7). The constructed kernel is largely sparse (its sparsity is 1 - 1/(d1 * d2)), yet the zeros are still processed during the convolution. This leads to large losses in time and memory, and the losses grow with the dilation.
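As a quick worked example of the sparsity figure, the short sketch below evaluates 1 - 1/(d1 * d2) for a few dilations. The function name is hypothetical, and d1 and d2 are taken to be the dilations along the two kernel dimensions:

```python
def constructed_kernel_sparsity(d1, d2):
    """Fraction of zeros in the constructed kernel, 1 - 1/(d1 * d2)."""
    return 1.0 - 1.0 / (d1 * d2)


for d in (2, 4, 8, 16):
    sparsity = constructed_kernel_sparsity(d, d)
    print(f"dilation {d} x {d}: {sparsity:.2%} of the kernel entries are zeros")
```

With a dilation of 4 in each dimension, for instance, 15 out of every 16 kernel entries are zeros.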

This is why we implemented an alternative method that adapts the im2col algorithm, which speeds up the convolution by unrolling the input into a Toeplitz matrix and then performing a matrix multiplication.
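For readers unfamiliar with im2col, here is a minimal pure-Python sketch of the plain algorithm for a single-channel input, with stride 1 and no padding. It is not the DCLS adaptation, and all function names are hypothetical. Each input patch is unrolled into one row, so the whole convolution (cross-correlation, as in deep learning frameworks) becomes a single matrix-vector product:

```python
def im2col(image, k_h, k_w):
    """Unroll every k_h x k_w patch of a 2D image (list of lists) into one row.

    Assumes stride 1 and no padding.
    """
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    rows = [
        [image[i + a][j + b] for a in range(k_h) for b in range(k_w)]
        for i in range(out_h)
        for j in range(out_w)
    ]
    return rows, out_h, out_w


def conv2d_im2col(image, kernel):
    """Cross-correlate image with kernel as one matrix-vector product."""
    k_h, k_w = len(kernel), len(kernel[0])
    rows, out_h, out_w = im2col(image, k_h, k_w)
    flat_kernel = [w for kernel_row in kernel for w in kernel_row]
    flat_out = [sum(x * w for x, w in zip(row, flat_kernel)) for row in rows]
    # Fold the flat result back into an out_h x out_w grid.
    return [flat_out[i * out_w:(i + 1) * out_w] for i in range(out_h)]


def conv2d_direct(image, kernel):
    """Reference implementation with explicit sliding windows."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k_h) for b in range(k_w))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[4 * r + c for c in range(4)] for r in range(4)]
kernel = [[1, 0], [0, -1]]
result = conv2d_im2col(image, kernel)
assert result == conv2d_direct(image, kernel)
print(result)
```

In practice the unrolled matrix is multiplied by all output filters at once with an optimized matrix multiplication routine, which is where the speed-up comes from.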

You can use both methods by importing the suitable modules as follows:

import torch
from DCLS.construct.modules.Dcls import Dcls2d as cDcls2d

# Will construct three (33, 16, (3x4), (3x4)) tensors for the weight, P_h positions and P_w positions
m = cDcls2d(16, 33, 3, dilation=4, stride=2, gain=10).cuda()
input = torch.randn(20, 16, 50, 100).cuda()
output = m(input)

from DCLS.modules.Dcls import Dcls2d

# Will not construct kernels and will perform the im2col algorithm instead
m = Dcls2d(16, 33, 3, dilation=4, stride=2, gain=10).cuda()
input = torch.randn(20, 16, 50, 100).cuda()
output = m(input)

Note: in the im2col Dcls method, the two extra learnable parameters P_h and P_w are of size channels_in // group x kernel_h x kernel_w, while in the construct method they are of size channels_out x channels_in // group x kernel_h x kernel_w.
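To make the difference in parameter count concrete, the sketch below computes both shapes exactly as stated in the note. The helper `position_param_shape` is hypothetical and does not query the DCLS modules; it only applies the sizes given above:

```python
def position_param_shape(method, channels_out, channels_in, groups, kernel_h, kernel_w):
    """Shape of each position parameter (P_h or P_w), following the note above."""
    if method == "im2col":
        return (channels_in // groups, kernel_h, kernel_w)
    if method == "construct":
        return (channels_out, channels_in // groups, kernel_h, kernel_w)
    raise ValueError(f"unknown method: {method!r}")


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


# Same layer sizes as in the examples: 16 input channels, 33 output channels, 3 x 3 kernel.
for method in ("im2col", "construct"):
    shape = position_param_shape(method, channels_out=33, channels_in=16,
                                 groups=1, kernel_h=3, kernel_w=3)
    print(method, shape, numel(shape))
```

Under these sizes, the construct method stores channels_out times as many position values as the im2col method.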

Device Support

DCLS only supports Nvidia CUDA GPU devices for the moment. The CPU version has not been implemented yet.

  • Nvidia GPU (supported)
  • CPU (not yet implemented)

Make sure your data and model are on a CUDA GPU. DCLS-im2col doesn't support mixed-precision operations for now. By default, every tensor is converted to float32 precision.

Publications and Citation

If you use DCLS in your work, please consider citing it as follows:

@misc{Dilated-Convolution-with-Learnable-Spacings,
	title = {Dilated Convolution with Learnable Spacings},
	author = {Ismail Khalfaoui Hassani},
	year = {2021},
	howpublished = {\url{https://github.com/K-H-Ismail/Dilated-Convolution-with-Learnable-Spacings-PyTorch}},
	note = {Accessed: YYYY-MM-DD},
}

Contribution

This project is open source, so all contributions are welcome, whether reporting issues, finding and fixing bugs, requesting new features, or sending pull requests.
