Model Training as a CI/CD System

This project demonstrates machine learning model training as a CI/CD system on Google Cloud Platform (GCP). The sections below describe the workflow in detail, but the core idea is to rebuild (continuous integration) and redeploy (continuous deployment) the currently deployed machine learning pipeline whenever the codebase changes. Such changes could happen in the training data, data preprocessing logic, model architecture and training code, custom pipeline components, and so on.

Workflow #1

1. We create the initial code, or we change the existing codebase for the pipeline.
2. Based on the changes in step 1, a GitHub Action is triggered to initiate a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If all tests pass, there are two common sub-workflows from this point:
   - Cloud Build containerizes the current codebase. This step is optional: if no custom components have changed, it can be skipped.
     - Cloud Build compiles a new pipeline, builds an updated Docker image, and uploads the new image to GCR.
   - If any code changed in the data preprocessing, modeling, or training steps, we only have to upload those source files to the designated GCS bucket.
5. As its final step, Cloud Build executes a pipeline run on Vertex AI.
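The branching in the steps above can be sketched as a small planning function. This is only an illustrative sketch, not the project's actual build configuration: the directory names (`pipeline`, `modules`), the step labels, and the function `plan_build_steps` are all hypothetical and stand in for whatever the Cloud Build configuration really does.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout (assumption, not taken from the project).
PIPELINE_DIR = "pipeline"   # pipeline definition and custom components
MODULE_DIR = "modules"      # data preprocessing, modeling, and training code


def _is_under(path, directory):
    """Return True if `path` lives somewhere below `directory`."""
    return PurePosixPath(directory) in PurePosixPath(path).parents


def plan_build_steps(changed_files):
    """Decide which build steps a set of changed files would require."""
    steps = ["run_unit_tests"]

    # Changes to the pipeline or its custom components need a new image.
    if any(_is_under(f, PIPELINE_DIR) for f in changed_files):
        steps += ["compile_pipeline", "build_docker_image", "push_image_to_gcr"]

    # Changes to module code only need the source files uploaded.
    if any(_is_under(f, MODULE_DIR) for f in changed_files):
        steps.append("upload_modules_to_gcs")

    # Every successful build ends with a pipeline run.
    steps.append("run_pipeline_on_vertex_ai")
    return steps


print(plan_build_steps(["modules/train.py"]))
```

Separating the two branches this way reflects why the container step is optional: a change limited to module code skips the image rebuild entirely.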

Workflow #2

Workflow in a nutshell

1. We create the initial code, or we change the existing codebase for the modules.
2. Based on the changes in step 1, a GitHub Action is triggered to initiate a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If all tests pass, Cloud Build continues as follows:
   - If any code changed in the data preprocessing or model modules, we only have to upload those source files to the designated GCS bucket.
5. As its final step, Cloud Build executes a pipeline run on Vertex AI. The Trainer and Transform TFX components then look up the changed modules accordingly.
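The upload step above amounts to mapping each changed module file to a location in the bucket that the pipeline components later read from. The sketch below only computes those destination URIs and does not perform any upload. The bucket name, the `modules` prefix, and the function `module_destinations` are assumptions made for illustration and are not taken from the project.

```python
import posixpath


def module_destinations(changed_files, bucket, prefix="modules"):
    """Map each changed Python module to the GCS URI it would be uploaded to.

    `bucket` and `prefix` are hypothetical; the real project may use a
    different bucket layout.
    """
    return {
        path: f"gs://{bucket}/{prefix}/{posixpath.basename(path)}"
        for path in changed_files
        if path.endswith(".py")  # only source modules are uploaded
    }


destinations = module_destinations(
    ["modules/train.py", "modules/preprocessing.py", "README.md"],
    bucket="my-bucket",
)
for local_path, uri in destinations.items():
    print(f"{local_path} -> {uri}")
```

Because the destination for a given file name stays the same across builds, a component configured with such a URI would pick up the new code on the next pipeline run without the container image being rebuilt.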

Acknowledgements

Thanks to the ML-GDE program for providing GCP credits.

Owner

Chansung Park
