A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

Last update: Nov 30, 2021

CNN from scratch

The most interesting part is the file neural_networks/layers.py: code for a convolutional neural network written using only NumPy (no PyTorch or TensorFlow). It is therefore very foundational and illustrates how CNNs work mathematically.

The CNN is compatible with colour images (3-channel RGB), includes pooling layers (class Pool2D) and works with any given (valid) stride.
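To make "valid stride" concrete, here is a minimal sketch of how the output size along one spatial axis can be computed. It assumes the standard formula for strided convolution and pooling; the helper name `conv_output_size` and the choice to reject strides that do not tile the padded input evenly are assumptions, not code from this repository.

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    """Output length along one spatial axis (hypothetical helper).

    Raises ValueError when the stride does not tile the padded input
    evenly, which is one reasonable reading of a "valid" stride.
    """
    span = in_size + 2 * pad - kernel_size
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    if span % stride != 0:
        raise ValueError("stride does not fit the padded input evenly")
    return span // stride + 1
```

For example, a 7-pixel axis with a 3-pixel kernel and stride 2 gives 3 output positions, while an 8-pixel axis with the same kernel and stride is rejected by this sketch.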

neural_networks/activations.py contains basic activation functions, such as ReLU and SoftMax, with the forward and backward implementations (including the Jacobian computations) needed for backpropagation.
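The forward/backward pattern for these activations might look roughly like the sketch below. The function names and signatures are hypothetical and are not taken from activations.py. Inputs are assumed to have shape (m, k), with one sample per row.

```python
import numpy as np

def relu_forward(Z):
    return np.maximum(Z, 0)

def relu_backward(Z, dY):
    # ReLU acts elementwise, so its Jacobian is diagonal and a mask suffices.
    return dY * (Z > 0)

def softmax_forward(Z):
    # Subtract the row-wise maximum for numerical stability.
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def softmax_backward(Z, dY):
    S = softmax_forward(Z)  # shape (m, k)
    k = S.shape[1]
    # Per-sample Jacobian J[i] = diag(S[i]) - outer(S[i], S[i]), shape (m, k, k).
    J = S[:, :, None] * np.eye(k)[None] - S[:, :, None] * S[:, None, :]
    # The softmax Jacobian is symmetric, so no transpose is needed here.
    return np.einsum("mij,mj->mi", J, dY)
```

Building the full (m, k, k) Jacobian is simple to read but memory-hungry for large k; it is used here only because it mirrors the mathematics directly.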

Many functions rely heavily on slicing to speed up training significantly. See, for example, Conv2D.forward:

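for x in range(out_rows):
    for y in range(out_cols):
        # Window of the padded input at output position (x, y), taken for the whole batch
        window = X_pad[:, x*s:x*s+kernel_height, y*s:y*s+kernel_width, :]
        # Broadcast against W, then sum over the kernel and input-channel axes
        out[:, x, y, :] = np.apply_over_axes(np.sum, W[None] * window[..., None], [1, 2, 3])[:, 0, 0, 0, :]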

which is the sliced version of a depth-6 nested for loop and is therefore much faster (on my computer, more than 20x faster for the given training data).
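The equivalence between the sliced version and an explicit nested loop can be checked with a sketch like the following. Both functions are hypothetical stand-ins, not the repository's Conv2D.forward: they assume `X_pad` is already padded with shape (batch, rows, cols, in_channels), `W` has shape (kernel_height, kernel_width, in_channels, out_channels), and there is no bias. How the six loop levels are counted (here the input-channel sum is a dot product) is also an assumption.

```python
import numpy as np

def conv_forward_naive(X_pad, W, s):
    n, rows, cols, _ = X_pad.shape
    kh, kw, _, c_out = W.shape
    out_rows = (rows - kh) // s + 1
    out_cols = (cols - kw) // s + 1
    out = np.zeros((n, out_rows, out_cols, c_out))
    for i in range(n):                      # batch
        for x in range(out_rows):           # output row
            for y in range(out_cols):       # output column
                for f in range(c_out):      # output channel
                    for a in range(kh):     # kernel row
                        for b in range(kw): # kernel column
                            # Dot product over the input channels
                            out[i, x, y, f] += X_pad[i, x*s + a, y*s + b, :] @ W[a, b, :, f]
    return out

def conv_forward_sliced(X_pad, W, s):
    n, rows, cols, _ = X_pad.shape
    kh, kw, _, c_out = W.shape
    out_rows = (rows - kh) // s + 1
    out_cols = (cols - kw) // s + 1
    out = np.zeros((n, out_rows, out_cols, c_out))
    for x in range(out_rows):
        for y in range(out_cols):
            window = X_pad[:, x*s:x*s + kh, y*s:y*s + kw, :]
            prod = W[None] * window[..., None]  # (n, kh, kw, c_in, c_out)
            out[:, x, y, :] = np.apply_over_axes(np.sum, prod, [1, 2, 3])[:, 0, 0, 0, :]
    return out

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2, 7, 7, 3))
W_demo = rng.normal(size=(3, 3, 3, 4))
assert np.allclose(conv_forward_naive(X_demo, W_demo, 2),
                   conv_forward_sliced(X_demo, W_demo, 2))
```

Only the two loops over output positions remain in Python; the batch, kernel and channel dimensions are handled by NumPy broadcasting, which is where the speedup comes from.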

In losses.py, CrossEntropy is the most important function. To speed it up, we simplified the expressions mathematically as far as possible, yielding

loss = -1.0 / m * np.trace(np.matmul(Y, np.log(Y_hat.T)))

for the forward pass and

-1 / m * np.divide(Y, Y_hat)

for the backward pass.
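A short sketch can confirm that the two expressions are consistent with each other. It assumes `Y` (one-hot labels) and `Y_hat` (predicted probabilities) both have shape (m, k) and that `m` is the number of samples; the source does not define these. The function names are hypothetical wrappers around the expressions quoted above.

```python
import numpy as np

def cross_entropy_forward(Y, Y_hat):
    m = Y.shape[0]  # assumed: number of samples
    return -1.0 / m * np.trace(np.matmul(Y, np.log(Y_hat.T)))

def cross_entropy_forward_sum(Y, Y_hat):
    # Same value written as an elementwise sum: trace(A @ B.T) == sum(A * B)
    m = Y.shape[0]
    return -1.0 / m * np.sum(Y * np.log(Y_hat))

def cross_entropy_backward(Y, Y_hat):
    # Gradient of the loss with respect to Y_hat
    m = Y.shape[0]
    return -1.0 / m * np.divide(Y, Y_hat)

rng = np.random.default_rng(0)
probs = rng.random((4, 3)) + 0.1
probs /= probs.sum(axis=1, keepdims=True)
labels = np.eye(3)[[0, 2, 1, 2]]
assert np.isclose(cross_entropy_forward(labels, probs),
                  cross_entropy_forward_sum(labels, probs))
```

The trace form and the elementwise form give the same loss. The elementwise form avoids building the intermediate (m, m) matrix product, which may matter for large batches.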

This is based on a project for CS289 at UC Berkeley.
