DaCeML - Machine learning powered by data-centric parallel programming.

Last update: Dec 12, 2022

This project adds PyTorch and ONNX model loading support to DaCe, and adds ONNX operator library nodes to the SDFG IR. With access to DaCe's rich transformation library and productive development environment, DaCeML can generate highly efficient implementations that can be executed on CPUs, GPUs and FPGAs.

The white-box approach exposes computation at every level of granularity: from coarse operators, to kernel implementations, and even down to individual scalar operations and memory accesses.

Read more: Library Nodes

Integration

Converting PyTorch modules is as easy as adding a decorator...

import torch.nn as nn
import torch.nn.functional as F
from daceml.pytorch import dace_module

@dace_module
class Model(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size)
        self.conv2 = nn.Conv2d(4, 4, kernel_size)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

... and ONNX models can also be directly imported using the model loader:

import onnx
from daceml.onnx import ONNXModel
model = onnx.load(model_path)
dace_model = ONNXModel("mymodel", model)

Read more: PyTorch Integration and Importing ONNX models.

Training

DaCeML modules support training using a symbolic automatic differentiation engine:

import torch
import torch.nn as nn
import torch.nn.functional as F
from daceml.pytorch import dace_module

@dace_module(backward=True)
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 120)
        self.fc2 = nn.Linear(120, 32)
        self.fc3 = nn.Linear(32, 10)
        self.ls = nn.LogSoftmax(dim=-1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        x = self.ls(x)
        return x

x = torch.randn(8, 784)
y = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7], dtype=torch.long)

model = Net()

criterion = nn.NLLLoss()
prediction = model(x)
loss = criterion(prediction, y)
# gradients can flow through model!
loss.backward()

Read more: Automatic Differentiation.
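
To make concrete what `loss.backward()` has to produce at the output of the network above, here is a minimal pure-Python sketch for a single sample. It is not DaCeML's differentiation engine. It only illustrates the well-known result that the gradient of the negative log-likelihood of a log-softmax with respect to the logits is `softmax(logits) - one_hot(target)`, and checks it against finite differences. All function names are hypothetical helpers introduced for this illustration.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll_loss(logits, target):
    # Negative log-likelihood of the target class for one sample.
    return -log_softmax(logits)[target]

def nll_grad(logits, target):
    # Analytic gradient w.r.t. the logits: softmax(logits) - one_hot(target).
    probs = [math.exp(v) for v in log_softmax(logits)]
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

def numeric_grad(logits, target, eps=1e-6):
    # Central finite differences, used only to cross-check the analytic form.
    grad = []
    for i in range(len(logits)):
        up = list(logits)
        up[i] += eps
        down = list(logits)
        down[i] -= eps
        grad.append((nll_loss(up, target) - nll_loss(down, target)) / (2 * eps))
    return grad

logits = [0.5, -1.0, 2.0]
target = 2
analytic = nll_grad(logits, target)
numeric = numeric_grad(logits, target)
print(analytic)
print(numeric)
```

The two printed lists should agree to several decimal places. Note that `nn.NLLLoss` averages over the batch by default, so per-sample gradients in the batched example are additionally scaled by one over the batch size.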

Library Nodes

DaCeML extends the DaCe IR with machine learning operators. The added nodes perform computation as specified by the ONNX specification. DaCeML leverages high-performance kernels from ONNXRuntime, as well as pure SDFG implementations that are introspectable and transformable with data-centric transformations.

The nodes can be used from the DaCe Python frontend:

import dace
import daceml.onnx as donnx
import numpy as np

@dace.program
def conv_program(X_arr: dace.float32[5, 3, 10, 10],
                 W_arr: dace.float32[16, 3, 3, 3]):
    output = dace.define_local([5, 16, 4, 4], dace.float32)
    donnx.ONNXConv(X=X_arr, W=W_arr, Y=output, strides=[2, 2])
    return output

X = np.random.rand(5, 3, 10, 10).astype(np.float32)
W = np.random.rand(16, 3, 3, 3).astype(np.float32)

result = conv_program(X_arr=X, W_arr=W)
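
The output buffer in the example above is declared as `[5, 16, 4, 4]`. The following sketch shows where that shape comes from, assuming the standard convolution output-size formula with explicit symmetric padding (no `auto_pad`). The helper names are hypothetical and are not part of DaCeML or ONNX.

```python
def conv_output_size(in_size, kernel, stride=1, pad=0, dilation=1):
    # Output length along one spatial axis for a convolution with
    # symmetric padding `pad` on both sides.
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1

def conv_output_shape(x_shape, w_shape, strides=(1, 1), pads=(0, 0)):
    # x_shape is [N, C, H, W]; w_shape is [M, C, kH, kW].
    n, _, h, w = x_shape
    m, _, kh, kw = w_shape
    return [
        n,
        m,
        conv_output_size(h, kh, strides[0], pads[0]),
        conv_output_size(w, kw, strides[1], pads[1]),
    ]

shape = conv_output_shape([5, 3, 10, 10], [16, 3, 3, 3], strides=(2, 2))
print(shape)  # [5, 16, 4, 4]
```

With a 10x10 input, a 3x3 kernel, stride 2, and no padding, each spatial axis yields `(10 - 3) // 2 + 1 = 4` positions, which matches the shape passed to `dace.define_local`.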

Setup

The easiest way to get started is to run

make install

This will set up DaCeML in a newly created virtual environment.

For more detailed instructions, including ONNXRuntime installation, see Installation.

Development

Common development tasks are automated using the Makefile. See Development for more information.

Owner

SPCL
