PyTorch code for the "Deep Neural Networks with Box Convolutions" paper

Last update: Dec 18, 2022

Related tags

Deep Learning box-convolutions

Overview

Box Convolution Layer for ConvNets

Single-box-conv network (from `examples/mnist.py`) learns patterns on MNIST

What This Is

This is a PyTorch implementation of the box convolution layer as introduced in the 2018 NeurIPS paper:

Burkov, E., & Lempitsky, V. (2018) Deep Neural Networks with Box Convolutions. Advances in Neural Information Processing Systems 31, 6214-6224.

How to Use

Installing

python3 -m pip install git+https://github.com/shrubb/box-convolutions.git
python3 -m box_convolution.test # if throws errors, please open a GitHub issue

To uninstall:

python3 -m pip uninstall box_convolution

Tested on Ubuntu 18.04.2, Python 3.6, PyTorch 1.0.0, GCC {4.9, 5.5, 6.5, 7.3}, CUDA 9.2. Other versions (e.g. macOS or Python 2.7 or CUDA 8 or CUDA 10) should work too, but I haven't checked. If something doesn't build, please open a Github issue.

Known issues (see this chat):

CUDA 9/9.1 + GCC 6 isn't supported due to a bug in NVCC.

You can specify a different compiler with CC environment variable:

CC=g++-7 python3 -m pip install git+https://github.com/shrubb/box-convolutions.git

Using

import torch
from box_convolution import BoxConv2d

box_conv = BoxConv2d(16, 8, 240, 320)
help(BoxConv2d)

Also, there are usage examples in examples/.

Quick Tour of Box convolutions

You may want to see our poster.

Why reinvent the old convolution?

3×3 convolutions are too small ⮕ receptive field grows too slow ⮕ ConvNets have to be very deep.

This is especially undesirable in dense prediction tasks (segmentation, depth estimation, object detection, ...).

Today people solve this by

dilated/deformable convolutions (can bring artifacts or degrade to 1×1 conv; almost always filter high-frequency);
"global" spatial pooling layers (usually too constrained, fixed size, not "fully convolutional").

How does it work?

Box convolution layer is a basic depthwise convolution (i.e. Conv2d with groups=in_channels) but with special kernels called box kernels.

A box kernel is a rectangular averaging filter. That is, filter values are fixed and unit! Instead, we learn four parameters per rectangle − its size and offset:

Any success stories?

One example: there is an efficient semantic segmentation model ENet. It's a classical hourglass architecture stacked of dozens ResNet-like blocks (left image).

Let's replace some of these blocks by our "box convolution block" (right image).

First we replaced every second block with a box convolution block (BoxENet in the paper). The model became

more accurate,
faster,
lighter
without dilated convolutions.

Then, we replaced every residual block (except the down- and up-sampling ones)! The result, BoxOnlyENet, is

a ConvNet almost without (traditional learnable weight) convolutions,
2 times less operations,
3 times less parameters,
still more accurate than ENet!

Comments

Build problem!

Hi! Can't compile pls see log https://drive.google.com/open?id=1U_0axWSgQGsvvdMWv5FclS1hHHihqx9M

Command "/home/alex/anaconda3/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-req-build-n1eyvbz3/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-p0dv1roq/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-req-build-n1eyvbz3/

opened by aidonchuk 63

Implementation in VGG

Hey,

I am trying to implement box convolution for HED (Holistically-Nested Edge Detection) which uses VGG architecture. Here's the architecture with box convolution layer:

class HED(nn.Module):
    def __init__(self):
        super(HED, self).__init__()

        # conv1
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            BoxConv2d(1, 64, 5, 5),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1),
            #BoxConv2d(1, 64, 28, 28),
            nn.ReLU(inplace=True),
        )

        # conv2
        self.conv2 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/2
            nn.Conv2d(64, 128, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv3
        self.conv3 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/4
            nn.Conv2d(128, 256, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv4
        self.conv4 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/8
            nn.Conv2d(256, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv5
        self.conv5 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/16
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        self.dsn1 = nn.Conv2d(64, 1, 1)
        self.dsn2 = nn.Conv2d(128, 1, 1)
        self.dsn3 = nn.Conv2d(256, 1, 1)
        self.dsn4 = nn.Conv2d(512, 1, 1)
        self.dsn5 = nn.Conv2d(512, 1, 1)
        self.fuse = nn.Conv2d(5, 1, 1)

    def forward(self, x):
        h = x.size(2)
        w = x.size(3)

        conv1 = self.conv1(x)
        conv2 = self.conv2(conv1)
        conv3 = self.conv3(conv2)
        conv4 = self.conv4(conv3)
        conv5 = self.conv5(conv4)

        ## side output
        d1 = self.dsn1(conv1)
        d2 = F.upsample_bilinear(self.dsn2(conv2), size=(h,w))
        d3 = F.upsample_bilinear(self.dsn3(conv3), size=(h,w))
        d4 = F.upsample_bilinear(self.dsn4(conv4), size=(h,w))
        d5 = F.upsample_bilinear(self.dsn5(conv5), size=(h,w))

        # dsn fusion output
        fuse = self.fuse(torch.cat((d1, d2, d3, d4, d5), 1))

        d1 = F.sigmoid(d1)
        d2 = F.sigmoid(d2)
        d3 = F.sigmoid(d3)
        d4 = F.sigmoid(d4)
        d5 = F.sigmoid(d5)
        fuse = F.sigmoid(fuse)

        return d1, d2, d3, d4, d5, fuse

I get the following error: RuntimeError: BoxConv2d: all parameters must have as many rows as there are input channels (box_convolution_forward at src/box_convolution_interface.cpp:30)

Can you help me with this?

opened by Flock1 10

YOLO architecture

Hi,

I want to know if there's some way I can create an architecture that'll work with YOLO. I've read a lot of implementations with pytorch but I don't know how should I modify the cfg file so that I can add box convolution layer.

Let me know.

opened by Flock1 9

Build Problem Windows 10 CUDA10.1 Python Bindings?

Hi, I'm trying to compile the box-convolutions using Windows 10 with CUDA 10.1. This results in the following error:

\python\python36\lib\site-packages\torch\lib\include\pybind11\cast.h(1439): error: expression must be a pointer to a complete object type

  1 error detected in the compilation of "C:/Users/CHRIST~1/AppData/Local/Temp/tmpxft_000010ec_00000000-8_integral_image_cuda.cpp4.ii".
  integral_image_cuda.cu
  error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1\\bin\\nvcc.exe' failed with exit status 2

  ----------------------------------------
Failed building wheel for box-convolution
Running setup.py clean for box-convolution
Failed to build box-convolution

Any ideas? Thanks in advance

opened by tom23141 6

Getting a cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.ci:250

Hello,

I've been trying to implement your box convolution layer on a ResNet model by just substituting your BottleneckBoxConv layers for a typical ResNet Bottleneck layer.

I was getting this error

THCudaCheck FAIL file=src/box_convolution_cuda_forward.cu line=250 error=9 : invalid configuration argument
Traceback (most recent call last):
  File "half_box_train.py", line 178, in <module>
    main()
  File "half_box_train.py", line 107, in main
    scores = res_net(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 331, in forward
     x = self.layer3(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
       input = module(input)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 66, in forward
    return F.relu(x + self.main_branch(x), inplace=True)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_module.py", line 222, in forward
    self.reparametrization_h, self.reparametrization_w, self.normalize, self.exact)
  File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_function.py", line 46, in forward
    input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:250

Thanks so much!

opened by dkang9503 5

Speed and Efficiency of Depthwise separable operation?

As far as modern libraries are concerned, there is not much support for depth-wise separable operations, i.e. we cannot write custom operations that can be done depthwise. Only convolutions are supported.

How did you apply M box convolutions to each of the N input filters, to generate NM output filters? How is the different than using a for loop over the N input filters, applying M box convs on each one, and concatenating all the results?

opened by kennyseb 5

Import Error

Success build with ubuntu 16.04, cuda 10 and gcc 7.4. But import error encountered:

In [1]: import torch

In [2]: from box_convolution import BoxConv2d

ImportError                               Traceback (most recent call last)
<ipython-input-2-2424917dbf01> in <module>()
----> 1 from box_convolution import BoxConv2d

~/Software/pkgs/box-convolutions/box_convolution/__init__.py in <module>()
----> 1 from .box_convolution_module import BoxConv2d

~/Software/pkgs/box-convolutions/box_convolution/box_convolution_module.py in <module>()
      2 import random
      3 
----> 4 from .box_convolution_function import BoxConvolutionFunction, reparametrize
      5 import box_convolution_cpp_cuda as cpp_cuda
      6 

~/Software/pkgs/box-convolutions/box_convolution/box_convolution_function.py in <module>()
      1 import torch
      2 
----> 3 import box_convolution_cpp_cuda as cpp_cuda
      4 
      5 def reparametrize(

ImportError: /usr/Software/anaconda3/lib/python3.6/site-packages/box_convolution_cpp_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration

@shrubb

opened by the-butterfly 5

Error during forward pass

     44         input_integrated = cpp_cuda.integral_image(input)
     45         output = cpp_cuda.box_convolution_forward(
---> 46             input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
     47 
     48         ctx.save_for_backward(

RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:249```

opened by belskikh 5

Test script failed

Test script assertion failed:

Random seed is 1546545757 Testing for device 'cpu' Running test_integral_image()... 100%|| 50/50 [00:00<00:00, 1491.15it/s] OK Running test_box_convolution_module()... 0%| python3: /pytorch/third_party/ideep/mkl-dnn/src/cpu/jit_avx2_conv_kernel_f32.cpp:567: static mkldnn::impl::status_t mkldnn::impl::cpu::jit_avx2_conv_fwd_kernel_f32::init_conf(mkldnn::impl::cpu::jit_conv_conf_t&, const convolution_desc_t&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const primitive_attr_t&): Assertion `jcp.ur_w * (jcp.nb_oc_blocking + 1) <= num_avail_regs' failed. Aborted (core dumped)

Configuration: Ubuntu 16.04 LTS, CUDA 9.2, PyTorch 1.1.0, GCC 5.4.0.

opened by vtereshkov 4
how box convolution works

Hi,

It is a nice work. In the first figure on your poster, you compared the 3x3 convolution layer and your box convolution layer. I am not clear how the box convolution works. Is it right that for each position (p,q) on the image, you use a box filter which has a relative position x, y to (p,q) and size w,h to calculate the value for (p,q) on the output? You learn the 4 parameters x, y, w, h for each box filter. For example, in the figure, the value for the red anchor pixel position on the output should be the sum of the values in the box. Is it correct? Thanks.

opened by jiaozizhao 4

Multi-GPU Training: distributed error encountered

I am using https://github.com/facebookresearch/maskrcnn-benchmark for object detection, I want to use box convolutions, when I add a box convolution after some layer, training with 1 GPU is OK, while training with multiple GPUs in distributed mode failed, the error is very similar to this issue, I do not know how to fix, have some ideas? @shrubb

2019-02-18 16:09:15,187 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/train_net.py", line 172, in <module>
    main()
  File "tools/train_net.py", line 165, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 74, in train
    arguments,
  File "/srv/data0/hzxubinbin/projects/maskrcnn_benchmark/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 79, in do_train
    losses.backward()
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 445, in distributed_data_parallel_hook
    self._queue_reduction(bucket_idx)
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 475, in _queue_reduction
    self.device_ids)
TypeError: _queue_reduction(): incompatible function arguments. The following argument types are supported:
    1. (process_group: torch.distributed.ProcessGroup, grads_batch: List[List[at::Tensor]], devices: List[int]) -> Tuple[torch.distributed.Work, at::Tensor]

Invoked with: <torch.distributed.ProcessGroupNCCL object at 0x7f0d95248148>, [[tensor([[[[0.]],

1 GPU is too slow, I want to use multiple GPUs

opened by freesouls 4

How can I run Cityscapes example on a test set?

Hello, collegues! I've trained BoxERFNet, and now I wanna run this model on a test set to evaluate it. I checked the source code(train.py) and established 'test' in place of 'test' in 80th string. But there was falure, the evaluated metrics were incorrect(e.g. 0.0 and 0.0). Can you explain me, what I need to do to evaluate model on a test set? I guess that problem is on 'validate' function(241th string), because confusion_matrix_update(268th string) tensors are really different in test and val sets.

opened by mikhailkonyk 3

Releases(v1.0.0)

v1.0.0(Feb 15, 2019)

Source code(tar.gz)
Source code(zip)

Owner

Egor Burkov

GitHub Repository

This repository is for the preprint "A generative nonparametric Bayesian model for whole genomes"

BEAR Overview This repository contains code associated with the preprint A generative nonparametric Bayesian model for whole genomes (2021), which pro

10 Sep 18, 2022

Plaything for Autistic Children (demo for PaddlePaddle/Wechaty/Mixlab project)

星星的孩子 - 一款为孤独症孩子设计的聊天机器人游戏孤独症儿童是目前常常被忽视的一类群体。他们有着类似性格内向的特征，实际却受着广泛性发育障碍的折磨。项目背景这类儿童在与人交往时存在着沟通障碍，其特点表现在：社交交流差，互动障碍明显认知能力有限，被动认知兴趣狭窄，重复刻板，缺乏变化和想象

35 Nov 24, 2022

This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec

48 Aug 19, 2022

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Figure: Shape-Accurate 3D-Aware Image Synthesis. A Shading-Guid

115 Dec 18, 2022

A python3 tool to take a 360 degree survey of the RF spectrum (hamlib + rotctld + RTL-SDR/HackRF)

RF Light House (rflh) A python script to use a rotor and a SDR device (RTL-SDR or HackRF One) to measure the RF level around and get a data set and be

11 Dec 13, 2022

Repo for paper "Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information"

Repo for paper "Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information" Notes I probabl

0 Jul 01, 2021

AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

214 Jan 03, 2023

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue. This

290 Dec 29, 2022

A PyTorch Implementation of ViT (Vision Transformer)

ViT - Vision Transformer This is an implementation of ViT - Vision Transformer by Google Research Team through the paper "An Image is Worth 16x16 Word

7 May 11, 2022

Plug-n-Play Reinforcement Learning in Python with OpenAI Gym and JAX

coax is built on top of JAX, but it doesn't have an explicit dependence on the jax python package. The reason is that your version of jaxlib will depend on your CUDA version.

128 Dec 27, 2022

Exploration & Research into cross-domain MEV. Initial focus on ETH/POLYGON.

xMEV, an apt exploration This is a small exploration on the xMEV opportunities between Polygon and Ethereum. It's a data analysis exercise on a few pa

7 Oct 18, 2022

Sibur challange 2021 competition - 6 place

sibur challange 2021 Решение на 6 место: https://sibur.ai-community.com/competitions/5/tasks/13 Скор 1.4066/1.4159 public/private. Архитектура - однос

5 Jan 11, 2022

Implementation for "Conditional entropy minimization principle for learning domain invariant representation features"

Implementation for "Conditional entropy minimization principle for learning domain invariant representation features". The code is reproduced from thi

1 Nov 02, 2022

A decent AI that solves daily Wordle puzzles. Works with different websites with similar wordlists,.

Wordle-AI A decent AI that solves daily "Wordle" puzzles. Works with different websites with similar wordlists. When prompted with "Word:" enter the w

1 Feb 10, 2022

Cmsc11 arcade - Final Project for CMSC11

cmsc11_arcade Final Project for CMSC11 Developers: Limson, Mark Vincent Peñafiel

1 Jan 18, 2022

A Simplied Framework of GAN Inversion

Framework of GAN Inversion Introcuction You can implement your own inversion idea using our repo. We offer a full range of tuning settings (in hparams

13 Sep 27, 2022

Official release of MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer axriv: http://arxiv.org/abs/2112.13513

MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis This is the official page of the MSHT with its experimental script and records. We de

53 Dec 27, 2022

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation. Training python train.py --c

55 Dec 26, 2022

Reproduces the results of the paper "Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations".

Finite basis physics-informed neural networks (FBPINNs) This repository reproduces the results of the paper Finite Basis Physics-Informed Neural Netwo

65 Dec 28, 2022

Unofficial implementation of the Involution operation from CVPR 2021

involution_pytorch Unofficial PyTorch implementation of "Involution: Inverting the Inherence of Convolution for Visual Recognition" by Li et al. prese

46 Dec 07, 2022