Deep Learning GPU Training System

Overview

DIGITS


DIGITS (the Deep Learning GPU Training System) is a webapp for training deep learning models. The currently supported frameworks are Caffe, Torch, and TensorFlow.

Feedback

In addition to submitting pull requests, feel free to submit and vote on feature requests via our ideas portal.

Documentation

The current and most up-to-date documentation is available at NVIDIA Accelerated Computing, Deep Learning Documentation, NVIDIA DIGITS.

Installation

| Installation method | Supported platform[s] | Available versions | Instructions |
| --- | --- | --- | --- |
| Source | Ubuntu 14.04, 16.04 | GitHub tags | docs/BuildDigits.md |

The official DIGITS container is available from nvcr.io via the docker pull command.

Usage

Once you have installed DIGITS, visit docs/GettingStarted.md for an introductory walkthrough.

Then, take a look at some of the other documentation in docs/ and examples/.

Get help

Installation issues

  • First, check out the instructions above
  • Then, ask questions on our user group

Usage questions

Bugs and feature requests

Notice on security

Users shall understand that DIGITS is not designed to be run as an exposed external web service.

Comments
  • Torch Data Augmentation


    Data augmentation needs little introduction, I reckon. It counters overfitting and makes your model generalize better, yielding better validation accuracies; alternatively, it allows you to use smaller datasets with similar performance.

    In the zoo that is the internet, I see many implementations of different augmentations, of which few are proper and nicely portable. Apart from DIGITS offering a great UI, ease of use, and a turn-key deep learning solution, I strongly feel we can expand on the functional side as well to make this a deep learning killer app.

    For Torch, I have implemented augmentation in the Lua preprocessing, from frontend to backend, to enable DIGITS to do so. In #330 there was already an attempt at augmentation, which happened on the dataset-creation side; something I am strongly against. Resizing and cropping I would consider a transformation, while I consider augmenting the data in its container an augmentation. I therefore think it's fine to resize during dataset loading (and squashing/filling/etc.), but I would probably leave it at that.

    Anyway, I set up a more dynamic structure to pass these options around on the Torch side; instead of adding a dozen arguments to each function, I am just adding a table.

    Implements the following (screenshot): image

    I have iterated through many augmentation types but these were the most useful. Almost done, now running elaborate tests.

    Progress

    The code is already functional, though see progress below. See code, shoot!

    Features

    • [x] Make UI data transforms only visible for the Torch framework (invisible for Caffe)
    • [x] ~~Implement UI option for normalization (scales the [0 255] to [0 1])~~
    • [x] Data Augmentation UI
    • [x] Flips (mirrors)
    • [x] Quadrilateral rotations
    • [x] Arbitrary rotations
    • [x] Arbitrary scales
    • [x] Augmenting in HSV space
    • [x] Augmenting with noise (Thoughts?)
    • [x] [Travis] Tests
    • [x] Use Data Augmentation Template: data_augmentation.html

    Testing

    • [x] No augmentation
    • [x] Flips (mirrors)
    • [x] Quadrilateral rotations
    • [x] Arbitrary rotations
    • [x] Arbitrary scales
    • [x] Arbitrary rotations & arbitrary scales
    • [x] Augmenting in HSV space
    • [x] Augmenting with noise
    • [x] All Augmentations & benchmark speed; identify bottlenecks
    • [x] Verify models reporting a slower learning/less overfitting trade-off : more generalization.
    enhancement torch 
    opened by TimZaman 46
  • running on multiple GPU is very slow


    I am trying to train a 50-layer residual network on 4 K40m GPUs and it is very slow (with the same batch_size of 16 as on a single GPU): one epoch takes 6 hours. However, if I run it on 1 GPU the speed is normal.

    System: CentOS, DIGITS v3, nvcaffe-0.14

    BTW, I tried GoogLeNet and it was fine on 4 GPUs.

    Any suggestions or potential issues?

    duplicate 
    opened by 201power 37
  • ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?

    Hi,

    I'm running on Ubuntu 14.04 LTS.

    ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?

    ~/digits$ pip install -r requirements.txt
    You are using pip version 7.0.3, however version 7.1.0 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    Requirement already satisfied (use --upgrade to upgrade): Pillow>=2.3.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 1))
    Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 2))
    Requirement already satisfied (use --upgrade to upgrade): scipy>=0.13.3 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 3))
    Collecting protobuf>=2.5.0 (from -r requirements.txt (line 4))
      Downloading protobuf-2.6.1.tar.gz (188kB)
        100% |████████████████████████████████| 188kB 2.3MB/s 
    Collecting pydot>=1.0.2 (from -r requirements.txt (line 5))
      Downloading pydot-1.0.2.tar.gz
    Requirement already satisfied (use --upgrade to upgrade): six>=1.5.2 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 6))
    Requirement already satisfied (use --upgrade to upgrade): requests>=2.2.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 7))
    Requirement already satisfied (use --upgrade to upgrade): gevent>=1.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 8))
    Requirement already satisfied (use --upgrade to upgrade): Flask>=0.10.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 9))
    Collecting Flask-WTF>=0.11 (from -r requirements.txt (line 10))
      Downloading Flask_WTF-0.12-py2-none-any.whl
    Collecting Flask-SocketIO (from -r requirements.txt (line 11))
      Downloading Flask-SocketIO-0.6.0.tar.gz
    Collecting lmdb (from -r requirements.txt (line 12))
      Downloading lmdb-0.86.tar.gz (144kB)
        100% |████████████████████████████████| 147kB 2.9MB/s 
    Requirement already satisfied (use --upgrade to upgrade): nose>=1.3.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 13))
    Requirement already satisfied (use --upgrade to upgrade): mock>=1.0.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 14))
    Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4>=4.2.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 15))
    Requirement already satisfied (use --upgrade to upgrade): selenium>=2.25.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 16))
    Collecting gunicorn (from -r requirements.txt (line 17))
      Downloading gunicorn-19.3.0-py2.py3-none-any.whl (110kB)
        100% |████████████████████████████████| 110kB 3.8MB/s 
    Requirement already satisfied (use --upgrade to upgrade): setuptools in /home/ubuntu/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg (from protobuf>=2.5.0->-r requirements.txt (line 4))
    Requirement already satisfied (use --upgrade to upgrade): pyparsing in /home/ubuntu/anaconda/lib/python2.7/site-packages (from pydot>=1.0.2->-r requirements.txt (line 5))
    Requirement already satisfied (use --upgrade to upgrade): Werkzeug in /home/ubuntu/anaconda/lib/python2.7/site-packages (from Flask-WTF>=0.11->-r requirements.txt (line 10))
    Collecting WTForms (from Flask-WTF>=0.11->-r requirements.txt (line 10))
      Downloading WTForms-2.0.2-py27-none-any.whl (128kB)
        100% |████████████████████████████████| 131kB 3.3MB/s 
    Collecting gevent-socketio>=0.3.6 (from Flask-SocketIO->-r requirements.txt (line 11))
      Downloading gevent_socketio-0.3.6-py27-none-any.whl
    Requirement already satisfied (use --upgrade to upgrade): gevent-websocket in /home/ubuntu/anaconda/lib/python2.7/site-packages (from gevent-socketio>=0.3.6->Flask-SocketIO->-r requirements.txt (line 11))
    Installing collected packages: protobuf, pydot, WTForms, Flask-WTF, gevent-socketio, Flask-SocketIO, lmdb, gunicorn
      Running setup.py install for protobuf
      Running setup.py install for pydot
      Running setup.py install for Flask-SocketIO
      Running setup.py install for lmdb
    Successfully installed Flask-SocketIO-0.6.0 Flask-WTF-0.12 WTForms-2.0.2 gevent-socketio-0.3.6 gunicorn-19.3.0 lmdb-0.86 protobuf-2.6.1 pydot-1.0.2
    ~/digits$ sudo apt-get install graphviz
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    graphviz is already the newest version.
    The following packages were automatically installed and are no longer required:
      linux-headers-3.13.0-49 linux-headers-3.13.0-49-generic
      linux-image-3.13.0-49-generic linux-image-extra-3.13.0-49-generic
    Use 'apt-get autoremove' to remove them.
    0 upgraded, 0 newly installed, 0 to remove and 267 not upgraded.
    ~/digits$ ./digits-devserver
      ___ ___ ___ ___ _____ ___
     |   \_ _/ __|_ _|_   _/ __|
     | |) | | (_ || |  | | \__ \
     |___/___\___|___| |_| |___/
    
    Welcome to the DIGITS config module.
    
    Where is caffe installed?
        (enter "SYS" if installed system-wide)
        [default is SYS]
    (q to quit) >>> SYS
    ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?
    
    (q to quit) >>> 
    
    caffe 
    opened by dbl001 35
  • Accuracy & confusion matrix


    See #17

    Adds a new kind of job for evaluating the performance of trained classifiers. It is now possible to visualize:

    • accuracy / recall curve
    • confusion matrix

    Accuracy and the confusion matrix are computed against a chosen snapshot of a training task, and against both the validation set and testing set (if it exists). An "evaluate performance" button has been added on the training view. This is currently the only way to run an evaluation job. The results are stored in the job directory in the form of two pickle files.

    button

    Accuracy / recall curve

    accuracy recall curve

    Confusion matrix

    I chose a very simple representation of the confusion matrix (not in the form of a matrix!), because it is better suited to datasets with lots of classes. For each class, the 10 most represented classes are displayed, with their respective percentages.

    confusion matrix

    Related jobs

    I added a "Related jobs" section to each job's show view. It displays the jobs that depend on the current job, for example models trained on a specific dataset or evaluations run on a specific model.

    Related jobs

    Let me know what you think, critiques and comments are more than welcome.

    opened by groar 29
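
The per-class view described in the pull request above can be sketched in a few lines. This is a minimal illustration, not the code from the pull request: the function name `top_confusions`, the nested-list matrix layout (rows are true classes, columns are predicted classes), and the sample counts are all assumptions.

```python
def top_confusions(matrix, labels, k=10):
    """For each true class, list the k most frequently predicted classes
    with their share of that class's samples, in percent.

    matrix[i][j] is the number of samples of true class i predicted as j
    (a hypothetical layout chosen for this sketch).
    """
    result = {}
    for i, row in enumerate(matrix):
        total = sum(row)
        # Rank predicted classes by count, highest first.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        result[labels[i]] = [
            (labels[j], 100.0 * row[j] / total if total else 0.0)
            for j in ranked[:k]
        ]
    return result


# Made-up counts for three classes.
labels = ["cat", "dog", "bird"]
matrix = [
    [8, 2, 0],
    [1, 9, 0],
    [3, 1, 6],
]
summary = top_confusions(matrix, labels, k=2)
for name, top in summary.items():
    print(name, top)
```

Listing only the top entries per class keeps the output readable when the number of classes is large, which is the motivation given above for not drawing a full matrix.
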
  • Windows Compatibility


    On my machine, image serving (e.g. of mean.jpg) does not work. The browser (tested with IE and Chrome) cannot interpret the image, probably because the content type is missing. The send_file function takes care of all that.

    windows 
    opened by crohkohl 27
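
The missing content type mentioned above can be illustrated with the standard library alone. When no MIME type is given, Flask's `send_file` tries to detect it from the file name; the sketch below shows the same kind of lookup using `mimetypes`. The helper name `content_type_for` and the fallback value are assumptions for illustration, not DIGITS code.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Guess the Content-Type header value for a file from its name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


print(content_type_for("mean.jpg"))
print(content_type_for("snapshot.unknownext"))
```

A response sent without such a header leaves the browser to guess what the bytes are, which matches the symptom described in the report.
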
  • Add support for HDF5 datasets


    Closes #224

    TODO before merge

    • [x] Create models from HDF5 datasets using HDF5Data layers
    • [x] Expose backend and compression information in REST API
    • [x] Shard HDF5 files into acceptable dataset sizes - https://github.com/BVLC/caffe/issues/2953#issuecomment-137274066

    TODO after merge

    • Allow non-image data (see #197)
    • Analyze prebuilt HDF5 datasets in "generic" path
    enhancement 
    opened by lukeyeager 26
  • Set map_size for LMDB


    @crohkohl, @danst18, I'm breaking the discussion in #203 out into a new issue.

    Here's the situation as I understand it. Please correct me if any of this is wrong.

    | map_size | Linux | OSX & Windows |
    | --- | --- | --- |
    | lower than size of dataset | LMDB runs out of memory | ? |
    | higher than system memory | No problem | LMDB can't allocate enough memory |

    On Linux, you can just set it as high as you like and never see a problem. But that strategy blows up on other platforms.

    Should [map_size] be made configurable? https://github.com/NVIDIA/DIGITS/pull/203#issuecomment-128859465

    This is a sufficient but lazy solution. I would like to understand whether this can be avoided programmatically somehow before making a decision. My googling skills are failing me.

    question 
    opened by lukeyeager 26
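
One way to avoid a fixed, very large map_size is to derive it from the expected dataset size. The sketch below is only an estimate under stated assumptions: images are stored unencoded as 8-bit arrays, and the function name, the per-record overhead, and the safety factor are hypothetical values chosen for illustration. The result would be passed as the `map_size` argument of `lmdb.open`.

```python
def estimate_map_size(num_images, height, width, channels,
                      per_record_overhead=512, safety_factor=2.0):
    """Estimate an LMDB map_size in bytes for unencoded 8-bit images.

    per_record_overhead and safety_factor are guesses made for this sketch,
    not values taken from LMDB or DIGITS.
    """
    bytes_per_image = height * width * channels + per_record_overhead
    return int(num_images * bytes_per_image * safety_factor)


# Example: 60,000 grayscale 28x28 images (the size of the MNIST training set).
size = estimate_map_size(60000, 28, 28, 1)
print(size, "bytes")
# With the lmdb package installed, this could be used as:
#   env = lmdb.open("/path/to/db", map_size=size)
```

An estimate like this stays above the dataset size (the first row of the table) while remaining far below system memory for typical datasets (the second row), but it does not help when the final dataset size is unknown in advance.
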
  • can't find hdf5.h when build caffe


    I want to install DIGITS on Debian Jessie.
    When I build Caffe (NVIDIA's fork), I get errors complaining that hdf5.h cannot be found.

    I'm sure I have installed libhdf5-serial-dev and libhdf5-dev, and I found the header file in /usr/include/hdf5/serial and the libraries in /usr/lib/x86_64-linux-gnu.

    So, what's wrong? Could someone help me?

    The build error messages are shown below:

    (venv)➜  caffe  make all --jobs=4
    CXX src/caffe/layer_factory.cpp
    CXX src/caffe/util/insert_splits.cpp
    CXX src/caffe/util/db.cpp
    CXX src/caffe/util/upgrade_proto.cpp
    In file included from src/caffe/util/upgrade_proto.cpp:10:0:
    ./include/caffe/util/io.hpp:8:18: fatal error: hdf5.h: no such file or directory
     #include "hdf5.h"
                      ^
    compilation terminated.
    Makefile:512: recipe for target '.build_release/src/caffe/util/upgrade_proto.o' failed
    make: *** [.build_release/src/caffe/util/upgrade_proto.o] Error 1
    make: *** Waiting for unfinished jobs....
    In file included from ./include/caffe/common_layers.hpp:10:0,
                     from ./include/caffe/vision_layers.hpp:10,
                     from src/caffe/layer_factory.cpp:6:
    ./include/caffe/data_layers.hpp:9:18: fatal error: hdf5.h: no such file or directory
     #include "hdf5.h"
                      ^
    compilation terminated.
    Makefile:512: recipe for target '.build_release/src/caffe/layer_factory.o' failed
    make: *** [.build_release/src/caffe/layer_factory.o] Error 1
    
    question caffe platform 
    opened by tangshi 26
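
The report above says the header lives in /usr/include/hdf5/serial, while the compiler output shows it is not being found. A small diagnostic like the following can confirm where hdf5.h actually is before the build configuration is adjusted. The function name and the list of candidate directories are assumptions for illustration.

```python
from pathlib import Path


def find_header(name, candidate_dirs):
    """Return the candidate directories that contain the given header file."""
    return [str(d) for d in map(Path, candidate_dirs) if (d / name).is_file()]


candidates = [
    "/usr/include",
    "/usr/include/hdf5/serial",  # location reported in the issue
    "/usr/local/include",
]
print(find_header("hdf5.h", candidates))
```

Any directory it prints that the compiler does not search by default would need to be added to the include paths of the Caffe build, for example through INCLUDE_DIRS in Makefile.config, with the matching library directory added to LIBRARY_DIRS.
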
  • mAP always zero


    I can't figure out why my model training mAP (val) doesn't get above zero. I'm trying to use the same approach and the SpaceNet_DetectNet_Train_Val.prototxt from this article.

    My label files 000n.txt look like this: p 0.0 0 0.0 0 0 24 118 0 0 0 0 0 0 0 0

    My images are 1280x1280, and I'm using these custom classes: dontcare,p

    image

    Where am I going wrong?

    object-detection 
    opened by DarylWM 25
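
DIGITS object-detection datasets use KITTI-style label files, so a quick way to sanity-check a label line like the one above is to split it into named fields. The parser below is a sketch: the field order follows the KITTI object-detection label layout, the function name is hypothetical, and it does not by itself explain why mAP stays at zero.

```python
def parse_kitti_label(line):
    """Split one KITTI-style label line into named fields."""
    parts = line.split()
    if len(parts) not in (15, 16):
        raise ValueError("expected 15 or 16 fields, got %d" % len(parts))
    values = [float(p) for p in parts[1:]]
    return {
        "type": parts[0],
        "truncated": values[0],
        "occluded": int(values[1]),
        "alpha": values[2],
        "bbox": tuple(values[3:7]),         # left, top, right, bottom (pixels)
        "dimensions": tuple(values[7:10]),  # height, width, length
        "location": tuple(values[10:13]),   # x, y, z
        "rotation_y": values[13],
        "score": values[14] if len(values) > 14 else None,
    }


label = parse_kitti_label("p 0.0 0 0.0 0 0 24 118 0 0 0 0 0 0 0 0")
left, top, right, bottom = label["bbox"]
print(label["type"], "box size:", right - left, "x", bottom - top)
```

Read this way, the example line describes a single object of class p whose box spans 24 by 118 pixels starting at the top-left corner of the 1280x1280 image.
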
  • CUDNN_STATUS_BAD_PARAM


    Ubuntu 14.04 LTS, clean install, NVIDIA dpkg install

    $ sudo apt-get install cuda
    $ sudo apt-get install digits
    
    $ gedit .bashrc
    # add the following lines at the end of the file
    
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    
    $ sudo reboot
    
    $ nvidia-smi
    Tue May 31 13:32:37 2016       
    +------------------------------------------------------+                       
    | NVIDIA-SMI 352.93     Driver Version: 352.93         |                       
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 960     Off  | 0000:01:00.0      On |                  N/A |
    | 20%   37C    P8    10W / 160W |    289MiB /  4095MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 960     Off  | 0000:02:00.0     Off |                  N/A |
    | 20%   43C    P8     9W / 160W |     13MiB /  4095MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    
    $ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2015 NVIDIA Corporation
    Built on Tue_Aug_11_14:27:32_CDT_2015
    Cuda compilation tools, release 7.5, V7.5.17
    

    Then I started DIGITS and created a dataset:

    MNIST, image size 28x28, image type GRAYSCALE

    I created an Image Classification Model,

    selected Caffe and LeNet,

    ran it, and got the following error:

    ERROR: Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

    bug 
    opened by shinfo001 25
  • Error: status == CUDNN_STATUS_SUCCESS (8 vs. 0)  CUDNN_STATUS_EXECUTION_FAILED


    I am getting this error when trying to run training with my custom network.

    status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED

    I found this post that refers to this error: https://github.com/BVLC/caffe/issues/1700#issuecomment-133476490

    But it doesn't say where or how to fix it. I am also not sure whether the issues are related or completely different. Note that this custom network works perfectly fine when I run it in my local Caffe install, and I can see all the nodes when I hit the visualize button. Training starts and fails after the first epoch.


    bug 
    opened by alfredox10 24
  • Fix TypeError


    File "/opt/digits/digits/extensions/data/imageSegmentation/data.py", line 225, in split_image_list
      random.shuffle(self.random_indices)
    File "/usr/lib/python3.8/random.py", line 307, in shuffle
      x[i], x[j] = x[j], x[i]
    TypeError: 'range' object does not support item assignment

    opened by vertexodessa 0
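
The traceback above is what Python 3 produces when `random.shuffle` is given a `range` object, which is immutable. A minimal sketch of the usual remedy is to materialize the range as a list first; the variable names here are illustrative and not taken from the DIGITS source.

```python
import random

n = 10

# In Python 3, range() returns an immutable sequence, so in-place shuffling fails.
try:
    random.shuffle(range(n))
    shuffle_failed = False
except TypeError:
    shuffle_failed = True

# Converting to a list gives random.shuffle something it can reorder in place.
random_indices = list(range(n))
random.shuffle(random_indices)

print(shuffle_failed, sorted(random_indices) == list(range(n)))
```

Under Python 2 `range` returned a list, which is why code written for it can hit this error only after a move to Python 3.
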
  • DIGITS DOCKER CONTAINER INSTALLING SUNNY PLUGIN

    I'm sorry, I'm trying to install the Sunnybrook plugin for the segmentation example in the Docker container, as I want to run it with the TensorFlow backend (not Caffe). I tried to repeat the install procedure from inside the container using docker exec -it XXXXX bash, XXXXX being the container ID, then downloading the plugin from https://github.com/NVIDIA/DIGITS/tree/master/plugins/data and following the install procedure, but it does not work. Is there any official way to do this? I ran pip install --ignore-installed setuptools (no error appears):

    Installing collected packages: setuptools Successfully installed setuptools-44.1.1

    Then I ran git clone https://github.com/NVIDIA/DIGITS.git, changed to /DIGITS/plugins/data/sunnybrook with cd, and finally ran pip install . No error appears, but after restarting Docker, creating a Sunnybrook dataset fails (see the error in the following post, which I've posted separately for clarity).

    Can you help please? Kind regards

    opened by crmuinos 1
  • I'm confused between which version of DIGITS to install


    Apologies in advance since I'm new to all this, but I'm confused about which version of DIGITS to install. I'm beginning a fresh install of the latest Ubuntu version and so far, after hours of scouring the internet, I have found DIGITS versions that work standalone and versions that work in Docker; then there's the official DIGITS GitHub page, which has DIGITS up to version 6, and on NGC there's DIGITS 20.03???

    What is going on I'm so confused. I was excited to get DIGITS up and running on my local machine just as soon as I had completed the Nvidia DLI's course and now I'm just stumped as to where to start. Would also like to know how different is DIGITS running for Tensorflow from the Caffe DIGITS.

    Please help.

    opened by RazaZaidi2802 0
  • cannot see detectnet bounding boxes using Caffe model on Nano


    We have trained and deployed a custom model on the nano using a caffe detectnet model. We trained in digits, and it works well when conducting inference in DIGITS, but it will not show bounding boxes when running on the nano. Is there a patch for this issue?

    opened by eanmikale 0
  • Module Creation errors

    So I am trying to train with DIGITS as specified in Hello AI World.

    this is the run code

    inception_5b/relu_pool_proj ← inception_5b/pool_proj inception_5b/relu_pool_proj → inception_5b/pool_proj (in-place) Setting up inception_5b/relu_pool_proj TRAIN Top shape for layer 158 ‘inception_5b/relu_pool_proj’ 5 128 40 40 (1024000) Creating layer ‘inception_5b/output’ of type ‘Concat’ Layer’s types are Ftype:FLOAT Btype:FLOAT Fmath:FLOAT Bmath:FLOAT Created Layer inception_5b/output (159) inception_5b/output ← inception_5b/1x1 inception_5b/output ← inception_5b/3x3 inception_5b/output ← inception_5b/5x5 inception_5b/output ← inception_5b/pool_proj inception_5b/output → inception_5b/output Setting up inception_5b/output TRAIN Top shape for layer 159 ‘inception_5b/output’ 5 1024 40 40 (8192000) Creating layer ‘pool5/drop_s1’ of type ‘Dropout’ Layer’s types are Ftype:FLOAT Btype:FLOAT Fmath:FLOAT Bmath:FLOAT Created Layer pool5/drop_s1 (160) pool5/drop_s1 ← inception_5b/output pool5/drop_s1 → pool5/drop_s1 Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED, device 0

    I am using a 2070 super

    Server: 9dca63a42e15
    DIGITS version: 6.1.1
    Caffe version: 0.17.0
    Caffe flavor: NVIDIA

    My brain is soup at this point; please help me out. caffe_output.log

    I have not been able to create a single model yet.

    I am also unable to install DIGITS from source without crashing Ubuntu. Today is May 11 and I have been trying to get it working since the 7th; could you please help me out? I am really excited about this tool.

    opened by cespedesk 0
Releases(v6.1.1)
  • v6.1.1(Apr 10, 2018)

    Since 6.1.0

    Bugfixes

    • Update for new TF API (#2014)
    • Update CI scripts to add some new deps to Caffe build (#1993)
    • Update import and API for pydicom 1.0
    • Fix label distribution and its view page (#1916)
  • v6.1.0(Dec 12, 2017)

    Since 6.0

    New Features

    • Added functionality to integrate DIGITS with S3 Endpoints (#1868)
    • Added publish to inference server on classification workflow (#1906)

    Bugfixes

    • Fix frozen graph issue (#1907)
    • Fix 404 error for /datasets/inference-form/... from #1888 (#1889)
    • Remove timeout assertion (#1859)

    Changes

    • Various documentation updates

    Known Issues

    • Out of memory error in the semantic-segmentation example when training the FCN AlexNet model on Tesla P100.
  • v6.0.0(Aug 30, 2017)

    See release notes for the 6.0 release candidate.

    Since 6.0 RC1

    New Features

    • Added support for URL prefix (#1803)

    Bugfixes

    • Fixed loading/saving tensorflow models (#1794)

    Changes

    • Various documentation updates

    Known Issues

    • Visualization for Caffe models does not currently work. (#1738)
  • v6.0.0-rc.1(Jul 25, 2017)

    New Features

    • Added TensorFlow backend for DIGITS as an alternative to Caffe and Torch (#1714)
    • Added examples and support for GANs (#1714)
    • Added support for text classification (#1025)
    • Added more viewing options for image segmentation (#1188)

    Changes

    • HTML embedding now defaults to PNG (#1270)
    • Images that cause exceptions will now show the file name (#1636)

    Bugfixes

    • Fixed softmax visualization issue with scaled images (#1647)
    • Documentation was changed for model store with official pictures (#1650)
    • Fixed Caffe search path in Windows (#1244)
    • Fixed image file entry in Sunnybrook inference form (#1237)
    • Fixed bugs when visiting nested image folder (#1477)

    Known Issues

    • Visualization for Caffe models does not currently work. (#1738)
  • v5.0.0(Feb 2, 2017)

    See release notes for the 5.0 release candidate.

    New since 5.0 RC

    • Enable the DIGITS Model Store (https://github.com/NVIDIA/DIGITS/pull/1308)
    • Fix calculations related to batch accumulation for Caffe (https://github.com/NVIDIA/DIGITS/pull/1307)
    • Various documentation updates
  • v5.0.0-rc.1(Oct 15, 2016)

    279 commits since v4.0.0

    New Features

    • Import pretrained models from a model "store" (#896, #1077, #1161)
    • Support for image segmentation workflows (#830, #961, #1131)
    • Online data augmentation with Torch (#777)
    • Show CPU and system memory utilization during training (#800)
    • Improved bounding-box visualizations for object detection models (#869)
    • Create groups of jobs for easier display on the home page (#734)
    • Reuse data extensions for inference (#1024)
    • Support for plugin extensions (#1093, #927, #947)
    • Add documentation for the REST API (#964)

    Changes

    • Use environment variables for configuration instead of a file (#1091)
    • Remove digits-server and dependency on gunicorn (#1127)
    • digits-devserver is now just a small shell script instead of a Python script (#1121)
    • New design for Torch multi-GPU training (#828)
    • Add Ubuntu 16.04 support by updating dependency versions (#965)
    • Allow testing of only Caffe or only Torch with the testsuite (#1143)
    • Return more info when downloading a model tarball or json (#891)

    Bugfixes

    • Fix bug with Torch and CUDA_VISIBLE_DEVICES (#1130)
    • Fix issues with browsers returning incorrectly cached css and js files (#904)

    Known Issues

    • Training goes on longer than required when using batch accumulation (#1240)
  • v4.0.0(Jul 19, 2016)

    529 commits since v3.0.0

    New Features

    • Add support for object-detection networks like DetectNet (#735) with documentation (#803)
    • Parameter sweep over batch size and learning rate (#708)
    • Show accuracy confusion matrix for "Classify Many" (#608)
    • Test a model with an LMDB (#638)
    • Add basic login functionality (#463)

    Changes

    • Major revamp of home page (#728, #790)
    • Allow use of BVLC/caffe (#769)
    • Run inference jobs in separate processes (#573)

    Bugfixes

    • Made device_query compatible with CUDA 8.0 (#890)

    For more information, see the release notes for v3.1, v3.2, v3.3, and the 4.0 RC.

  • v4.0.0-rc.2(Jul 19, 2016)

    211 commits since v3.3.0

    New Features

    • Add support for object-detection networks like DetectNet (#735) with documentation (#803)
    • Parameter sweep over batch size and learning rate (#708)
    • Add plugin systems for data formats (#731) and inference visualizations (#756)
    • Expose Caffe's iter_size solver option (#744)
    • Add syntax highlighting when editing custom networks (#751)
    • View list of related jobs (#767)
    • Explore generic datasets (#822)
    • Add example for doing text classification with Torch (#684)

    Changes

    • Major revamp of home page (#728, #790)
    • Allow use of BVLC/caffe (#769)
    • New Torch multi-GPU programming model (#732)
    • Make small improvements to standard networks (#733, #749)
    • Set weight_decay to lr / 100 (#792)
    • Make major improvements to TravisCI build system (#766, #788)
  • v3.3.0(Apr 25, 2016)

    New Features

    • Show accuracy confusion matrix for "Classify Many" (#608)
    • Test a model with an LMDB (#638)
    • Use layer stages in network descriptions for full control over train/val/deploy networks (#628)
    • Option to limit number of images to use for "Classify/Test Many" (#592)
    • Better in-app documentation for Python layers (#651)

    Changes

    • Run inference jobs in separate processes (#573)
    • Path autocompletion returns sorted list (#621)

    Bugfixes

    • Fixed UI bugs when using Safari (#702)
    • Fixed file serving for files with absolute paths (#586)
    • Fixed some UI bugs related to permissions (#594, #596)
    • Various torch-related bugfixes (#661, #663, #681, #686, #699)
    • Windows compatibility fixes (#698)
  • v3.2.0(Feb 18, 2016)

    New Features

    • Add support for new solvers - RMSprop, AdaDelta and Adam (#564)
    • AlexNet for Torch now works for multiple GPUs (#539)
    • New documentation for installing CUDA toolkit, drivers, etc. (#558)

    Changes

    • Only look in one location for config files (#541)
    • Re-use weights when retraining a model on the same dataset (#538)
    • Functional improvements and documentation changes for examples/classification (#559, #557, #579, #582)
    • Better error-checking for caffe networks referencing invalid layer "bottoms" (#576)

    Bugfixes

    • Fixes for multistep learning rate (#549, #550)
  • v3.1.0(Jan 22, 2016)

    New Features

    • Enable multi-GPU for Torch (#480)
    • Add basic login functionality (#463)
    • Allow Torch to fine-tune pretrained models (#499)
    • Allow Caffe to fine-tune from multiple pretrained models (#498)
    • New tutorials
      • Fine-tuning (#500)
      • Siamese networks (#453)
      • Weight initialization (#522)
    • Allow optional specification of image folder during multiple inference (#526)

    Changes

    • Torch performance improvements (#368, #390, #441, #339)
    • Disable colormap for "Top N" feature (#481)
    • Better real-time updates for dataset creation (#473)
    • Better display for device_query tool (#497)
    • Display the job directory for all job types (#469)
    • Use Flask "Blueprints" to cleanup routing code (#507)
    • Cleanup and alphabetize imports throughout the project (#501)
    • Removed docs/API.md and docs/FlaskRoutes.md (a05356ebfe0fe462f20143625ec8c942847348de)

    Bugfixes

    • Enable importing of LMDBs created with Caffe's convert_imageset tool (#517)
  • v3.0.0(Jan 22, 2016)

    See release notes for v3.0 RC.

    New since 3.0 RC

    • Fix handling of unencoded LMDBs in Torch (#475)
    • Significant performance enhancement for creating datasets (#491)
    • Various documentation fixes / updates
  • v3.0.0-rc.3(Dec 10, 2015)

    New Features

    • Add Torch7 as an alternative backend to Caffe (#324, #345)
    • Make using python layers easier by [optionally] attaching a python file to each model (#329)
    • Add the ability to clone previous jobs with a click (#334)
    • Update the homepage to show job updates in real-time (#240)
    • Enable mean subtraction by subtracting the mean file as well as subtracting the mean pixel (#321)
    • Support NVcaffe v0.14 (#341, #336)
    • Display the job directory size for each DatasetJob and ModelJob (#309)
    • Add a backend badge (LMDB/HDF5) to DatasetJobs on the homepage (#323)
    • Explore images in LMDB datasets (#331)

    Changes

    • Use port 34448 for the digits-server instead of port 8080 (#392)
    • Remove digits-walkthrough (#352)
    • Enforce standard UI for file input fields across different browsers (#325)

    Bugfixes

    • Fix PicklingError issues on all platforms (#307)
    • Fix issue when running inference on many images at once (#361)

    Known Issues

    • Large inference requests (i.e. "Classify many") may cause timeouts or even crashes (#479)
    • Incorrect handling of unencoded LMDB in Torch wrapper (#477)
  • v2.2.1 (Sep 17, 2015)

  • v2.2.0 (Sep 16, 2015)

    New Features

    • Add [initial] support for HDF5 datasets (#226)
    • Zoom in on weight/activation visualizations (#267)
    • Add a new page for comparing training results (#195)
    • Add notes to jobs (#283)

    Changes

    • Open inference results in a new browser tab (#244)
    • Various improvements for using prebuilt LMDBs (#268)
    • Sort subfolders when parsing a folder of images (#296)
    • Use input_shape instead of input_dim for deploy network prototxt (#231)

    Known Issues

    • Using a snapshot from a previous network doesn't work unless the network is on the first page (#285)
    • Parameter counting fails for some layer types (like PReLU) (#317)
  • v2.1.0 (Sep 14, 2015)

    New Features

    • Add support for "Generic Inference" (i.e. non-classification) networks (#189)
    • Display number of learned parameters in a model (#221)
    • Show ground truth in "Classify Many" if provided (#110)
    • Zoom in on a selection of the loss/accuracy graph (#113)
    • Add autocomplete for server-side path input fields (#183)
    • Select max/min images per class when parsing a folder of images (#161)
    • Allow user to download log from CreateDb tasks (#221)
    • Show number of available GPUs on home page (#207)
    • Allow local file upload for image lists (#106)
    • Display DIGITS version in top right of page header (#153) and in the console output (c181797cdf3ce27bf65a22fd39fbc61b95ecaab6)

    Changes

    • Double the LMDB map_size when running out of memory instead of setting to 1TB (#209)
      • requires py-lmdb 0.87
    • Rename default GoogLeNet layers and tops (9ff246eed47ec04461956b133495260855168e2e)
    • Add pagination to Previous Networks list (c181797cdf3ce27bf65a22fd39fbc61b95ecaab6)
    • Various changes that help with Windows compatibility (#199)
    • Major refactoring of tests (#192)
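
The map_size change above (#209) describes a grow-on-demand strategy: start with a modest map size and double it whenever a write fails for lack of space, rather than reserving 1TB up front. The sketch below illustrates that retry loop with a hypothetical in-memory stand-in (`FakeStore`, `put_with_growth`); it is not the DIGITS code and does not call py-lmdb. With py-lmdb, the analogous pieces are presumably `lmdb.MapFullError` and `Environment.set_mapsize()`, but that mapping is an assumption rather than something stated in these notes.

```python
class FakeStore:
    """Hypothetical stand-in for a size-limited database (NOT the py-lmdb API)."""

    def __init__(self, map_size):
        self.map_size = map_size  # maximum number of bytes the store may hold
        self.used = 0             # bytes written so far

    def put(self, n_bytes):
        # Mimic a "map full" failure when the write would exceed map_size.
        if self.used + n_bytes > self.map_size:
            raise MemoryError("map full")
        self.used += n_bytes


def put_with_growth(store, n_bytes, max_size=1 << 40):
    """Write n_bytes, doubling store.map_size each time the store is full.

    Returns the map size in effect when the write finally succeeded.
    Re-raises the failure if doubling would exceed max_size (1 TiB here).
    """
    if store.map_size <= 0:
        raise ValueError("map_size must be positive")
    while True:
        try:
            store.put(n_bytes)
            return store.map_size
        except MemoryError:
            new_size = store.map_size * 2
            if new_size > max_size:
                raise
            store.map_size = new_size


store = FakeStore(map_size=1024)
final_size = put_with_growth(store, 5000)  # grows 1024 -> 2048 -> 4096 -> 8192
```

Doubling keeps the number of retries logarithmic in the final size while avoiding a large up-front reservation.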

    Known Issues

    • Parameter counting fails for some layer types (like PReLU) (#317)
  • v2.0.0 (Sep 3, 2015)

    New Features

    • Enabled support for multi-GPU Caffe (#92)
      • Select multiple and/or specific GPUs for training (#92, #104)
    • Created new routes for JSON REST API (#134, #136)
    • Started using GPU for inference (#66)
    • Added NVML info about GPU memory/utilization (#93)
    • Enabled ADAGRAD and NESTEROV as alternative solver types (@drozdvadym in #102)
    • Added scripts to download standard datasets MNIST and CIFAR
    • Added option to set server name (#111)
    • Added support for PPM images (#123)
    • Enabled path autocompletion while setting values in the configuration (#96)

    Changes

    • Added a python classification example (#147)
    • Subtract mean pixel during training (#169)
    • Added TravisCI integration to run tests (#28)
    • Added Coveralls integration for test coverage
    • Added Landscape integration to inspect code
    • Added auto-generated documentation of the webapp’s HTTP routes
    • Switched to loading config files from new, more logical locations (#96)
    • Started suppressing most of Caffe’s raw output (b382e99b8a143c9bbbf659ba74e67bf2ef12718e, 019bc6ca750601396a502ad0fd2b0d47b239f0d7)
    • Added a CLA

    Bugfixes

    • Fixed various OSX platform-specific issues (#32, @trivedigaurav in #94)

    Known Issues

    • Some motherboards cause P2P bandwidth issues (https://github.com/NVIDIA/caffe/issues/10)
  • v2.0.0-rc3 (Jul 31, 2015)

    See release notes for v2.0.0-preview.

    New since 2.0 Preview

    • Recommend NVIDIA/Caffe v0.13 (https://github.com/NVIDIA/DIGITS/commit/5dc0f8e646d28587c07ff6fe9bcd1990820b41c2)
      • Requires cuDNN v3
    • Subtract mean pixel during training (#169)
    • Fixes regarding deployment of digits-server (c9a9dce2fcf7bb12363e6cccc44a6dd0a26a8271, e7bbc63213a10bbea516ee51adc5ffcf160494e8)
  • v2.0.0-preview (Jul 7, 2015)

    New Features

    • Enabled support for multi-GPU Caffe (#92)
      • Select multiple and/or specific GPUs for training (#92, #104)
    • Created new routes for JSON REST API (#134, #136)
    • Started using GPU for inference (#66)
    • Added NVML info about GPU memory/utilization (#93)
    • Enabled ADAGRAD and NESTEROV as alternative solver types (@drozdvadym in #102)
    • Added scripts to download standard datasets MNIST and CIFAR
    • Added option to set server name (#111)
    • Added support for PPM images (#123)
    • Enabled path autocompletion while setting values in the configuration (#96)

    Changes

    Bugfixes

    • Fixed various OSX platform-specific issues (#32, @trivedigaurav in #94)

    Known Issues

    • Some motherboards cause P2P bandwidth issues (https://github.com/NVIDIA/caffe/issues/10)
  • v1.1.2 (Jun 26, 2015)

  • v1.1.0 (Apr 24, 2015)

    New Features

    • Add GoogLeNet as a default network (#11)
    • "Classify Many Images" shows classification results of many images at once (#61)
    • Show statistics (mean, standard deviation, histogram of values) for each layer of the network at inference time (#67)
    • Allow saving images in database with PNG encoding (#73)
    • Optionally turn off shuffling when creating a dataset (#72)
    • Optionally provide a random seed to caffe (73fe257)

    Changes

    • Upgrade to NVIDIA/caffe version 0.11.0 (e2bcb27)
    • Update pip requirements list to match packages available on Ubuntu 14.04 where possible (4162db4, 133213d)
    • Use C3.js instead of Google Charts to enable DIGITS to run without an internet connection (#34)
    • Change default image resize mode from HALF_CROP to SQUASH (b4f3261)

    Bugfixes

    • Save images in BGR order instead of RGB because caffe uses OpenCV to read encoded images (#59)
    • Scale the LeNet standard network by the standard deviation of MNIST (~80) during train, val and test phases (5a38aa5, 23c1a78)
    • Use a white background when removing transparency from images (#85)
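
The BGR fix above (#59) comes down to channel order: OpenCV decodes images as blue-green-red, so data stored as RGB would have its red and blue channels swapped when read back. A tiny sketch of the reordering follows, using a hypothetical helper name and plain tuples instead of image arrays; it illustrates the idea only and is not taken from the DIGITS source.

```python
def rgb_to_bgr(pixels):
    """Reverse the channel order of each (R, G, B) pixel, giving (B, G, R)."""
    return [tuple(reversed(p)) for p in pixels]


rgb = [(255, 0, 0), (10, 20, 30)]  # a pure red pixel and an arbitrary one
bgr = rgb_to_bgr(rgb)              # red now sits in the last channel
round_trip = rgb_to_bgr(bgr)       # reversing twice restores the input
```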

    Known Issues

    • The GoogLeNet standard network is not behaving correctly when trained on the full ImageNet dataset (#82)
    • "Classify Many Images" may time out if too many images are uploaded and the server takes too long to respond (#70)