SAMO: Streaming Architecture Mapping Optimiser

Overview

The SAMO framework provides a method for optimising the mapping of a Convolutional Neural Network (CNN) model onto an FPGA platform for streaming architecture frameworks. Both a simulated annealing and a brute-force optimiser are implemented. The FINN and HLS4ML frameworks are currently supported, and their usage is described in the sections below.

Installation

You can install this package using:

python -m pip install samo

Usage

The general usage of the SAMO tool can be seen by running python -m samo --help.

Example platform configurations are given in the platforms directory, and example CNN models can be generated by running python scripts/generate_networks.py.
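For illustration, a platform configuration is a small JSON file describing the target device. The exact schema is defined by SAMO, so the sketch below is only a minimal example under assumed field names: apart from the part field (used later in the HLS4ML example), the keys and the output filename are hypothetical, and platforms/zedboard.json should be treated as the reference.

import json

# Minimal, illustrative platform description. Only "part" is known to be used
# later in this README; the remaining keys are assumptions, not the SAMO schema.
platform = {
    "name": "zedboard",
    "part": "xc7z020clg484-1",  # Zynq-7020 device on the ZedBoard
    "resources": {"BRAM": 280, "DSP": 220, "LUT": 53200, "FF": 106400},
}

# Write the example to a hypothetical file alongside the provided configurations.
with open("platforms/example_platform.json", "w") as f:
    json.dump(platform, f, indent=2)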

FINN

To run the optimiser with the FINN toolflow, the first step is to clone the following fork:

git clone https://github.com/Yu-Zhewen/finn.git
cd finn
git checkout 4cc0b6fdae2f5c06f0b5bcc6fa45fba4d8b69111

As FINN requires Docker, set SAMO_DIR to the path of the SAMO repository in run_docker.sh before entering the Docker container:

bash run_docker.sh

Within the Docker container, generate the FINN-ONNX model through the following steps:

cd ../samo
cp models/${network}.onnx outputs/saved/finn/${network}.onnx
cp ../finn/notebooks/samo/config/${network}.json ../finn/notebooks/samo/config.json
jupyter nbconvert --to notebook --execute ../finn/notebooks/samo/pre_optimiser_steps.ipynb
mv ../finn/notebooks/samo/pre_optimiser_steps.nbconvert.ipynb outputs/saved/finn/${network}_pre_optimiser_steps.nbconvert.ipynb

To optimise the CNN model in the FINN-ONNX format, run:

python -m samo --optimiser annealing --model outputs/saved/finn/${network}_pre_optimiser.onnx  \
    --backend finn --platform platforms/zedboard.json \
    --output-path outputs/saved/finn/${network}_post_optimiser.onnx
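As an optional sanity check (not part of the SAMO flow), the optimised model can be loaded back with the onnx package to confirm that the output file is readable; the network name below is a hypothetical placeholder for whichever model you optimised.

import onnx

# Hypothetical network name -- substitute the model you optimised above.
network = "simple"

# Load the post-optimiser FINN-ONNX model and print a basic summary.
model = onnx.load(f"outputs/saved/finn/{network}_post_optimiser.onnx")
print(f"{network}: {len(model.graph.node)} nodes in the optimised graph")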

Finally, the following command is used to generate the hardware:

jupyter nbconvert --to notebook --execute ../finn/notebooks/samo/post_optimiser_steps.ipynb

HLS4ML

This tool can also be used to generate optimised designs for the HLS4ML framework. SAMO tunes the reuse factor for each layer of the CNN model and generates a resource-driven design.
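To make the reuse-factor tuning concrete, the generated configuration follows the usual hls4ml layout of model-level and per-layer settings. The sketch below is purely illustrative: the layer names, precision, and reuse factors are hypothetical, and the real file is produced by SAMO in the step that follows.

import json

# Illustrative hls4ml-style configuration with per-layer reuse factors.
# Layer names and values are hypothetical; SAMO generates the real file below.
config = {
    "Model": {"Precision": "ap_fixed<16,6>", "ReuseFactor": 1, "Strategy": "Resource"},
    "LayerName": {
        "conv2d_1": {"ReuseFactor": 8},
        "dense_1": {"ReuseFactor": 32},
    },
}

with open("outputs/model_hls4ml_example.json", "w") as f:
    json.dump(config, f, indent=2)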

To optimise a Keras model for a given platform, run the following:

python -m samo --optimiser annealing --model models/model.keras \
    --backend hls4ml --platform platforms/zedboard.json \
    --output-path outputs/model_hls4ml.json

The previous command generates a configuration file (outputs/model_hls4ml.json), which can be used by HLS4ML to generate hardware. To do this, use the HLS4ML API to convert the configuration file into an HLS project:

import json

import hls4ml
from tensorflow import keras

# load the configuration
with open("outputs/model_hls4ml.json", "r") as f:
    config = json.load(f)

# load the platform
with open("platforms/zedboard.json", "r") as f:
    platform = json.load(f)

# load the keras model
model = keras.models.load_model("models/model.keras")

# create the hls model
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir="outputs/hls4ml_prj",
    io_type="io_stream", fpga_part=platform["part"])

# build the HLS project
hls_model.build(csim=True, cosim=True)
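Once the build finishes, the synthesis reports under outputs/hls4ml_prj can be inspected. Assuming a Vivado HLS based project (as produced by the flow above), hls4ml's report reader gives a quick summary:

# Summarise the Vivado HLS reports generated by hls_model.build() above.
from hls4ml.report import read_vivado_report

read_vivado_report("outputs/hls4ml_prj")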

Feel free to post an issue if you have any questions or problems!
