Machine learning framework for both deep learning and traditional algorithms

Overview

NeoML

NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for computer vision and natural language processing tasks, including image preprocessing, classification, document layout analysis, OCR, and data extraction from structured and unstructured documents.

Key features:

  • Neural networks with support for over 100 layer types
  • Traditional machine learning: 20+ algorithms (classification, regression, clustering, etc.)
  • CPU and GPU support, fast inference
  • ONNX support
  • Languages: C++, Java, Objective-C
  • Cross-platform: the same code can be run on Windows, Linux, macOS, iOS, and Android

Build and install

Supported platforms

The full C++ library version has been tested on the following platforms:

Target OS                                 Compiler          Architecture
Windows 7+ (CPU and GPU)                  MSVC 2015+        x86, x86_64
Ubuntu 14+ (CPU)                          gcc 5.4+          x86_64
MacOS 10.11+ (CPU)                        Apple clang 11+   x86_64
iOS 11+ (CPU, GPU)                        Apple clang 11+   arm64-v8a, x86_64
Android 5.0+ (CPU), Android 7.0+ (GPU)    clang 7+          armeabi-v7a, arm64-v8a, x86, x86_64

The inference Java and Objective-C library versions have been tested on the following platforms:

Target OS                                 Compiler          Architecture
iOS 11+ (CPU, GPU)                        Apple clang 11+   arm64-v8a, x86_64
Android 5.0+ (CPU), Android 7.0+ (GPU)    clang 7+          armeabi-v7a, arm64-v8a, x86, x86_64

Third party

The library is built with CMake (recommended versions 3.11 and later).

For best CPU performance on Windows, Linux and macOS we use Intel MKL.

When processing on a GPU, you can optionally use CUDA (version 10.2) on Windows or Linux and Vulkan (version 1.1.130 and later) on Windows, Linux or Android.

We also use Google Test for testing and Google Protocol Buffers for working with ONNX model format.

Build fully functional C++ version

See here for instructions on building the C++ library version for different platforms.

Build inference versions for Java and Objective-C

See here for instructions on building the Java and Objective-C versions that would only run the trained neural networks.

Getting started

Several tutorials with sample code will help you start working with the library:

API description

Basic principles

The library was developed with these principles in mind:

Platform independence

The user interface is completely separated from the low-level calculations implemented by a math engine.

The only thing you have to do is to specify at the start the type of the math engine that will be used for calculations. You can also choose to select the math engine automatically, based on the device configuration detected.

The rest of your machine learning code will be the same regardless of the math engine you choose.
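
As a rough illustration of this separation, the sketch below picks an engine once and keeps the remaining code engine-agnostic. It does not use NeoML's actual API: `IEngine`, `CpuEngine`, `EngineType`, `CreateEngine` and `AddBias` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical low-level interface (an assumption for this sketch);
// NeoML's real math engine interface is different and much larger.
class IEngine {
public:
    virtual ~IEngine() = default;
    virtual std::string Name() const = 0;
    // Returns a vector with result[i] = a[i] + b[i]
    virtual std::vector<float> Add( const std::vector<float>& a,
        const std::vector<float>& b ) const = 0;
};

// The only backend in this sketch: plain loops on the CPU
class CpuEngine : public IEngine {
public:
    std::string Name() const override { return "cpu"; }
    std::vector<float> Add( const std::vector<float>& a,
        const std::vector<float>& b ) const override
    {
        assert( a.size() == b.size() );
        std::vector<float> result( a.size() );
        for( std::size_t i = 0; i < a.size(); ++i ) {
            result[i] = a[i] + b[i];
        }
        return result;
    }
};

enum class EngineType { Auto, Cpu };

// The single place where the engine type is chosen.
// Only a CPU backend exists here, so Auto resolves to it as well;
// a real implementation would probe the device configuration instead.
std::unique_ptr<IEngine> CreateEngine( EngineType type )
{
    assert( type == EngineType::Auto || type == EngineType::Cpu );
    (void)type;
    return std::unique_ptr<IEngine>( new CpuEngine() );
}

// User-level code: depends only on the interface, never on a concrete backend
std::vector<float> AddBias( const IEngine& engine, const std::vector<float>& x, float bias )
{
    return engine.Add( x, std::vector<float>( x.size(), bias ) );
}
```

With this layout, calling `CreateEngine( EngineType::Auto )` once at startup and passing the result to `AddBias` is all the user code needs to know about the backend.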

Math engines independence

Each network works with one math engine instance, and all its layers should have been created with the same math engine. If you have chosen a GPU math engine, it will perform all calculations. This means you may not choose to use a CPU for "light" calculations like adding vectors and a GPU for "heavy" calculations like multiplying matrices. We have introduced this restriction to avoid unnecessary synchronizations and data exchange between devices.
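
The one-engine-per-network rule can be pictured with the following minimal sketch. All types here (`Engine`, `Layer`, `Network`) are hypothetical stand-ins, and throwing an exception on a mismatch is an assumption of the example; the source does not say how NeoML itself reports this situation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a math engine instance
struct Engine {
    std::string name;
};

// A layer remembers the engine it was created with
class Layer {
public:
    explicit Layer( const Engine& engine ) : engine( &engine ) {}
    const Engine& GetEngine() const { return *engine; }
private:
    const Engine* engine;
};

// A network is bound to exactly one engine instance
class Network {
public:
    explicit Network( const Engine& engine ) : engine( &engine ) {}

    // Rejects layers created with another engine instance, so the network
    // never needs to move data between devices in the middle of a run
    void AddLayer( const Layer& layer )
    {
        if( &layer.GetEngine() != engine ) {
            throw std::invalid_argument( "layer was created with a different engine" );
        }
        layers.push_back( layer );
    }

    std::size_t LayerCount() const { return layers.size(); }
private:
    const Engine* engine;
    std::vector<Layer> layers;
};
```

Engines are compared by address rather than by name, because the restriction concerns the engine instance, not just its type.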

Multi-threading support

The math engine interface is thread-safe; the same instance may be used in different networks and different threads.

Note that this may entail some synchronization overhead.

However, the neural network implementation is not thread-safe; the network may run only in one thread.
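
A possible way to respect both rules is to share one engine between threads while giving every thread its own network. The sketch below only models that arrangement with standard C++ threads; `SharedEngine`, `TinyNetwork` and `RunInParallel` are hypothetical names and none of this is NeoML code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical engine that may be called from several threads at once
class SharedEngine {
public:
    float Dot( const std::vector<float>& a, const std::vector<float>& b )
    {
        assert( a.size() == b.size() );
        float sum = 0;
        for( std::size_t i = 0; i < a.size(); ++i ) {
            sum += a[i] * b[i];
        }
        // Shared state is guarded by a lock; this is the kind of
        // synchronization overhead mentioned above
        std::lock_guard<std::mutex> lock( mutex );
        ++callCount;
        return sum;
    }

    int CallCount()
    {
        std::lock_guard<std::mutex> lock( mutex );
        return callCount;
    }
private:
    std::mutex mutex;
    int callCount = 0;
};

// Hypothetical network: keeps mutable state without any locking,
// so each instance must be used from one thread only
class TinyNetwork {
public:
    TinyNetwork( SharedEngine& engine, std::vector<float> weights ) :
        engine( engine ), weights( weights ) {}

    float Run( const std::vector<float>& input )
    {
        lastOutput = engine.Dot( weights, input );
        return lastOutput;
    }
private:
    SharedEngine& engine;
    std::vector<float> weights;
    float lastOutput = 0;
};

// Creates one network per thread on top of a single shared engine
std::vector<float> RunInParallel( SharedEngine& engine, int threadCount )
{
    std::vector<float> outputs( threadCount );
    std::vector<std::thread> threads;
    for( int i = 0; i < threadCount; ++i ) {
        threads.emplace_back( [&engine, &outputs, i]() {
            TinyNetwork net( engine, { 1.0f, 2.0f } ); // lives in this thread only
            outputs[i] = net.Run( { static_cast<float>( i ), 1.0f } );
        } );
    }
    for( std::thread& t : threads ) {
        t.join();
    }
    return outputs;
}
```

Each thread writes to its own slot of `outputs`, so the result vector itself needs no extra locking.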

ONNX support

The NeoML library also works with models created by other frameworks, as long as they support the ONNX format. See the description of the import API. However, you cannot export a NeoML-trained model into the ONNX format.

Serialization format

The library uses its own binary format (implemented by CArchive, CArchiveFile) to save and load the trained models.
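
To give a feeling for what saving and loading a model in a custom binary format involves, here is a minimal sketch built on standard streams. It is not the CArchive format and does not use CArchive or CArchiveFile; the layout (a 64-bit element count followed by raw floats) and the names `SaveWeights` and `LoadWeights` are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Writes the element count followed by the raw float values.
// The host byte order is used, so the file is not portable between
// machines with different endianness.
void SaveWeights( std::ostream& out, const std::vector<float>& weights )
{
    const std::uint64_t count = weights.size();
    out.write( reinterpret_cast<const char*>( &count ), sizeof( count ) );
    out.write( reinterpret_cast<const char*>( weights.data() ),
        static_cast<std::streamsize>( count * sizeof( float ) ) );
}

// Reads back what SaveWeights wrote and fails loudly on truncated input.
// The count is trusted as is, which is acceptable only for a sketch.
std::vector<float> LoadWeights( std::istream& in )
{
    std::uint64_t count = 0;
    in.read( reinterpret_cast<char*>( &count ), sizeof( count ) );
    if( !in ) {
        throw std::runtime_error( "truncated header" );
    }
    std::vector<float> weights( static_cast<std::size_t>( count ) );
    in.read( reinterpret_cast<char*>( weights.data() ),
        static_cast<std::streamsize>( count * sizeof( float ) ) );
    if( !in ) {
        throw std::runtime_error( "truncated data" );
    }
    return weights;
}
```

Because both functions take generic streams, the same code works with a `std::stringstream` in tests and with a `std::fstream` opened in binary mode for real files.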

GPU support

Processing on GPU often helps significantly improve performance of mathematical operations. The NeoML library uses GPU both for training and running the models. This is an optional setting and depends on the hardware and software capabilities of your system.

To work on GPU, the library requires:

  • Windows: NVIDIA® GPU card with CUDA® 10.2 support.
  • iOS: Apple GPU A7+.
  • Android: devices with Vulkan 1.0 support.
  • Linux/macOS: no support for GPU processing as yet.

FineObj

The NeoML library originates in ABBYY's internal infrastructure. For various reasons ABBYY uses a cross-platform framework called FineObj. Because of this, the open library version uses some of this framework's primitives. See the common classes description.

C++ interface

NeoML contains two C++ libraries:

Algorithms library NeoML

The library provides C++ objects that implement various high-level algorithms. It consists of several parts:

NeoMathEngine

The math engine used for calculations is a separate module that implements the low-level mathematical functions used in the algorithms library. The user can also call these functions but usually never needs to.

This module has different implementations for different platforms. In particular, there is an implementation that uses a GPU for calculations.

The math engine is also a set of C++ interfaces described here.

Java interface

To work with the inference version of the library in Java and Kotlin we provide a Java interface.

Objective-C interface

To work with the inference version of the library in Swift and Objective-C we provide an Objective-C interface.

License

Copyright © 2016-2020 ABBYY Production LLC. Licensed under the Apache License, Version 2.0. See the license file.

Comments
  • Build error in Fedora (missing: Protobuf_LIBRARIES)

    Complete log:

    $ cmake -G Ninja ../NeoML -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=test
    -- Found OpenMP: TRUE (found version "4.5")  
    -- No CUDA support.
    -- Found OpenMP: TRUE (found version "4.5")  
    CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
      Could NOT find Protobuf (missing: Protobuf_LIBRARIES) (found version
      "3.19.4")
    Call Stack (most recent call first):
      /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
      /usr/share/cmake/Modules/FindProtobuf.cmake:650 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
      /home/<redacted>/Desktop/neoml/NeoOnnx/src/CMakeLists.txt:166 (find_package)
    
    
    -- Configuring incomplete, errors occurred!
    See also "/home/<redacted>/Desktop/neoml/Build/CMakeFiles/CMakeOutput.log".
    See also "/home/<redacted>/Desktop/neoml/Build/CMakeFiles/CMakeError.log".
    

    What can be done to build this ML library with the newer version of Protobuf? It would be inconvenient for me to build an older version of Protobuf.

    opened by ghost 11
  • Cannot open NeoMathEngine.x64.Debug.lib

    NeoML showing linker error. In Qt it looks like this:

    :-1: error: LNK1104: cannot open file 'NeoMathEngine.x64.Debug.lib'

    NeoMathEngine.x64.Debug.lib is nowhere to be found among the built libraries. When I try with Visual Studio, I get the following output:

    Severity	Code	Description	Project	File	Line	Suppression State
    Error	C3861	'NOT_FOUND': identifier not found	NeoMLTest2	K:\NeoML\build64\debug\include\NeoML\TraditionalML\LdGraph.h	438	
    

    Code snippet where error is located seems to be some kind of assertion, but I can't understand what's wrong:

    inline void CLdGraph<Arc>::DetachArc( Arc* arc )
    {
    	// Delete the arcs from the starting node
    	CLdGraphVertex* initial = vertices[arc->InitialCoord() - begin];
    	NeoPresume( initial != 0 );
    	int i = initial->OutgoingArcs.Find(arc);
    	NeoAssert( i != NOT_FOUND ); //     PROBLEM IS HERE
    	initial->OutgoingArcs.DeleteAt(i);
    	// Delete the hanging node
    	if( initial->OutgoingArcs.Size() == 0
    		&&  initial->IncomingArcs.Size() == 0 )
    	{
    		delete initial;
    		vertices[arc->InitialCoord() - begin] = 0;
    	}
    

    I have 4 of those errors.

    opened by goodle06 8
  • Code style

    Why do you use both the 'virtual' and 'override' keywords in method declarations? It's redundant and hard to read, because the semantics of these keywords are different.

    opened by lordnn 7
  • Speed issue of precision-recall quality control layer

    Please implement calculation of positivesCorrect, positivesTotal, negativesCorrect, negativesTotal on device side.

    https://github.com/neoml-lib/neoml/blob/7b7653645c777f70acb461202032e118c2f1b470/NeoML/src/Dnn/Layers/PrecisionRecallLayer.cpp#L65-L91

    opened by MaratKhabibullin 5
  • Wrap input names in PyLayer

    Currently the DNN graph structure is inaccessible from NeoML/Python: you can get the list of layers, but you cannot read how these layers are connected with each other. With an input names getter it is possible to traverse the whole graph starting from the sinks and retrieve its structure.

    It could be used in theory to plot the model, similar to torchviz or maybe even in a plugin for netron.

    Also fixes incorrect assertTrue calls that were supposed to be assertEqual.

    opened by zimka 4
  • Clamp proposal

    Since you don't accept PRs, I'll file this as an issue. Add something like this to the NeoML namespace and stop writing multi-level comparisons:

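    #include <algorithm>   // std::clamp (C++17)
    #include <functional>  // std::less

    // Note: MSVC reports __cplusplus as 199711L unless /Zc:__cplusplus is set
    #if __cplusplus >= 201703L
    using std::clamp;
    #else
    // C++11/14 fallback with the same semantics as std::clamp
    template<typename T, typename Compare = std::less<T>>
    constexpr const T& clamp(const T &v, const T &lo, const T &hi, Compare comp = Compare()) {
        return comp(v, lo) ? lo : comp(hi, v) ? hi : v;
    }
    #endif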
    
    opened by lordnn 4
  • Mismatch '_ITERATOR_DEBUG_LEVEL'

    I get a bunch of MKL-related linker errors when compiling NeoMathEngine. Could someone help me and tell me what this is all about, since I can't look inside the MKL DLL? NeoML was built with the FineObjects option turned off (if it makes any difference).

    Severity Code Description Project File Line Suppression State
    Error LNK2038 mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx512_jit_destroy.obj) 1
    Error LNK2038 mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MDd_DynamicDebug' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx512_jit_destroy.obj) 1
    Error LNK2038 mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx2_jit_destroy.obj) 1
    Error LNK2038 mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MDd_DynamicDebug' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx2_jit_destroy.obj) 1
    Error LNK2038 mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx_jit_destroy.obj) 1
    Error LNK2038 mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MDd_DynamicDebug' in common.obj NeoMathEngine K:\NeoML\build\NeoMathEngine\src\mkl_core.lib(_avx_jit_destroy.obj) 1

    opened by goodle06 4
  • Simple NN example issue

    Hello! I'm trying to recreate simple net example, but I'm stuck here:

    for( int epoch = 1; epoch < 15; ++epoch ) {
        float epochLoss = 0; // total loss for the epoch
        for( int iter = 0; iter < iterationPerEpoch; ++iter ) {
            // trainData methods are used to transmit the data into the blob
            trainData.GetSamples( iter * batchSize, dataBlob );
            trainData.GetLabels( iter * batchSize, labelBlob );
    
            net.RunAndLearnOnce(); // run the learning iteration
            epochLoss += loss->GetLastLoss(); // add the loss value on the last step
        }
    
        ::printf( "Epoch #%02d    avg loss: %f\n", epoch, epochLoss / iterationPerEpoch );
        trainData.ReShuffle( random ); // reshuffle the data
    }
    

    I don't understand what class is trainData (and testData) and can't find GetSamples and GetLabels functions on my own. Please help.

    opened by goodle06 3
  • protobuf errors on Astra Linux on master

    /usr/bin/ld: /usr/lib/x86_64-linux-gnu/libprotobuf.a(arena.o): relocation R_X86_64_TPOFF32 against symbol `_ZN6google8protobuf5Arena13thread_cache_E' can not be used when making a shared object; recompile with -fPIC
    /usr/bin/ld: final link failed: Nonrepresentable section on output
    collect2: error: ld returned 1 exit status
    NeoOnnx/src/CMakeFiles/NeoOnnx.dir/build.make:581: recipe for target 'NeoOnnx/src/libNeoOnnx.so' failed
    make[2]: *** [NeoOnnx/src/libNeoOnnx.so] Error 1
    CMakeFiles/Makefile2:266: recipe for target 'NeoOnnx/src/CMakeFiles/NeoOnnx.dir/all' failed
    make[1]: *** [NeoOnnx/src/CMakeFiles/NeoOnnx.dir/all] Error 2
    Makefile:160: recipe for target 'all' failed
    make: *** [all] Error 2

    opened by rmk177 3
  • Add counters reference-getters for PrecisionRecallLayer

    There are many ML problems where TP/FP/FN/TN counters make sense, however they differ in terms of how these counters should be calculated. It would be nice to re-use PrecisionRecallLayer through inheritance for these ML problems.

    Currently PrecisionRecallLayer counters are private, so they are inaccessible in the inherited classes, and one can not change the way these counters are calculated.

    This PR adds virtual reference-getters and uses them instead of changing member variables directly. Inherited classes now have access to the counters, and if this access is not sufficient, it is possible to override the getters to use some other member variables.

    opened by zimka 2
  • error LNK2019 when building the library NeoOnnx.dll

    Hello. I am building the library on Windows 10 with MSVC 2019 x64 and protobuf 3.18.1. I get two errors:

    onnx.pb.obj : error LNK2019: unresolved external symbol "class google::protobuf::internal::ExplicitlyConstructed<class std::basic_string<char,struct std::char_traits,class std::allocator > > google::protobuf::internal::fixed_address_empty_string" ([email protected]@[email protected]@@[email protected][email protected][email protected]@std@@[email protected]@2@@std@@@[email protected]) referenced in function "public: virtual class onnx::ValueInfoProto* __cdecl onnx::ValueInfoProto::New(void)const " ([email protected]@onnx@@UEBAPEAV1[email protected]). [H:\libs\neoml2.0.22.0\Build\NeoOnnx\src\NeoOnnx.vcxproj]

    onnx.pb.obj : error LNK2019: unresolved external symbol "struct std::atomic google::protobuf::internal::init_protobuf_defaults_state" ([email protected]@[email protected]@@3U?$atomic@[email protected]@@A) referenced in function "public: virtual unsigned char * __cdecl onnx::TensorShapeProto_Dimension::_InternalSerialize(unsigned char *,class google::protobuf::io::EpsCopyOutputStream *)const " ([email protected][email protected]@@[email protected]@[email protected]@@@Z). [H:\libs\neoml2.0.22.0\Build\NeoOnnx\src\NeoOnnx.vcxproj]

    H:\libs\neoml2.0.22.0\Build\NeoOnnx\src\Release\NeoOnnx.dll : fatal error LNK1120: 2 unresolved externals [ H:\libs\neoml2.0.22.0\Build\NeoOnnx\src\NeoOnnx.vcxproj]

    How can I fix it? Thank you!

    opened by Vovkin81 2
  • UB on comparison

    Comparison by relation < is undefined for boolean data type: https://github.com/neoml-lib/neoml/blob/5f0e8928c0150723a6504c6d05cc48643ed23ecd/NeoML/src/TraditionalML/GradientBoostQSEnsemble.cpp#L662-L665

    opened by lordnn 0
  • Wrap NeoOnnx into NeoML/Python

    In most cases .onnx models are produced from a Python environment (e.g. from tensorflow + tf2onnx or torch + torch.onnx.export). Currently, to convert such exported ONNX models into NeoML you can use either the executable app (NeoOnnx/Onnx2NeoML) or the C++ API (NeoOnnx::LoadFromOnnx).

    It would be nice to wrap NeoOnnx API into python bindings. With such Python API it would be possible to run the whole conversion pipeline (third-party-framework export, neoml import, models comparison and checks) in one environment.

    opened by zimka 1
  • Set CUDA_ARCHITECTURES to native when undefined for NeoML/Python

    Whenever I try to build NeoML/Python with python setup.py install, I get the following error:

    ...
    -- Configuring done
    CMake Error in CMakeLists.txt:
      CUDA_ARCHITECTURES is empty for target "PythonWrapper".
    ...
    CMake Generate step failed.  Build files cannot be regenerated correctly.
    

    As far as I understand, this is because I do not specify CMAKE_CUDA_ARCHITECTURES variable. This PR sets it to native when variable is empty. Somehow works for my CMake 3.18.

    opened by zimka 2
Releases
  • NeoML-master_2.0.5.0(Jun 22, 2021)

    We are glad to present to you the second release of NeoML! NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models.

    Major Features

    • Neural networks with support for over 100 layer types
    • Traditional machine learning: 20+ algorithms (classification, regression, clustering, etc.)
    • CPU and GPU support, fast inference
    • ONNX support
    • Languages: C++, Python, Java, Objective-C
    • Cross-platform: the same code can be run on Windows, Linux, macOS, iOS, and Android

  • NeoML-master_1.0.1.0(Jun 14, 2020)

    We are glad to present to you the first release of NeoML! NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models.

    Major Features

    • Neural networks with support for over 100 layer types
    • Traditional machine learning: 20+ algorithms (classification, regression, clustering, etc.)
    • CPU and GPU support, fast inference
    • ONNX support
    • Languages: C++, Java, Objective-C
    • Cross-platform: the same code can be run on Windows, Linux, macOS, iOS, and Android
Owner
NeoML
Cross-platform machine learning framework. Supports both deep learning and traditional ML algorithms.