Backend for the Autocomplete platform. An AI assisted coding platform.

Overview

Introduction

A custom predictor allows you to deploy your own prediction implementation, useful when the existing serving implementations don't fit your needs. If migrating from Cortex, the custom predictor work exactly the same way as PythonPredictor does in Cortex. Most PythonPredictors can be converted to custom predictor by copy pasting the code and renaming some variables.

The custom predictor is packaged as a Docker container. It is recommended, but not required, to keep large model files outside of the container image itself and to load them from a storage volume. This example follows that pattern. You will need somewhere to publish your Docker image once built. This example leverages Docker Hub, where storing public images are free and private images are cheap. Google Container Registry and other registries can also be used.

Make sure you use a GPU enabled Docker image as a base, and that you enable GPU support when loading the model.

Getting Started

After installing kubectl and adding your CoreWeave Cloud access credentials, the following steps will deploy the Inference Service. Clone this repository and folder, and execute all commands in there. We'll be using all the files.

Sign up for a Docker Hub account, or use a different container registry if you already have one. The free plan works perfectly fine, but your container images will be accessible by anyone. This guide assumes a private registry, requiring authentication. Once signed up, create a new repository. For the rest of the guide, we'll assume that the name of the new repository is gpt-6b.

Build the Docker image

  1. Enter the custom-predictor directory. Build and push the Docker image. No modifications are needed to any of the files to follow along. The default Docker tag is latest. We strongly discourage you to use this, as containers are cached on the nodes and in other parts of the CoreWeave stack. Once you have pushed to a tag, do not push to that tag again. Below, we use simple versioning by using tag 1 for the first iteration of the image.
    export DOCKER_USER=thotailtd
    docker build -t $DOCKER_USER/gpt-6b:v1alpha1 .
    docker push $DOCKER_USER/gpt-6b:v1alpha1

Set up repository access

  1. Create a Secret with the Docker Hub credentials. The secret will be named docker-hub. This will be used by nodes to pull your private image. Refer to the Kubernetes Documentation for more details.

    kubectl create secret docker-registry docker-hub --docker-server=https://index.docker.io/v1/ --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
  2. Tell Kubernetes to use the newly created Secret by patching the ServiceAccount for your namespace to reference this Secret.

    kubectl patch serviceaccounts default --patch "$(cat image-secrets-serviceaccount.patch.yaml)"

Download the model

As we don't want to bundle the model in the Docker image for performance reasons, a storage volume needs to be set up and the pre-trained model downloaded to it. Storage volumes are allocated using a Kubernetes PersistentVolumeClaim. We'll also deploy a simple container that we can use to copy files to our newly created volume.

  1. Apply the PersistentVolumeClaim and the manifest for the sleep container.

    $ kubectl apply -f model-storage-pvc.yaml
    persistentvolumeclaim/model-storage created
    $ kubectl apply -f sleep-deployment.yaml
    deployment.apps/sleep created
  2. The volume is mounted to /models inside the sleep container. Download the pre-trained model locally, create a directory for it in the shared volume and upload it there. The name of the sleep Pod is assigned to a variable using kubectl. You can also get the name with kubectl get pods.

    The model will be loaded to Amazon S3 soon. Now I directly uploaded it to CoreWeave
    
    export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it $SLEEP_POD -- sh -c 'mkdir /models/sentiment'
    kubectl cp ./sleep_383500 $SLEEP_POD:/models/sentiment/
  3. (Optional) Instead of copying the model from the local filesystem, the model can be downloaded from Amazon S3. The Amazon CLI utilities already exist in the sleep container.

    $ export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    $ kubectl exec -it $SLEEP_POD -- sh
    $# aws configure
    $# mkdir /models/sentiment
    $# aws s3 sync --recursive s3://thot-ai-models /models/sentiment/

Deploy the model

  1. Modify sentiment-inferenceservice.yaml to reference your docker image.

  2. Apply the resources. This can be used to both create and update existing manifests.

     $ kubectl apply -f sentiment-inferenceservice.yaml
     inferenceservice.serving.kubeflow.org/sentiment configured
  3. List pods to see that the Predictor has launched successfully. This can take a minute, wait for Ready to indicate 2/2.

    $ kubectl get pods
    NAME                                                           READY   STATUS    RESTARTS   AGE
    sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk  2/2     Running   0          34s

    If the predictor fails to init, look in the logs for clues kubectl logs sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk kfserving-container.

  4. Once all the Pods are running, we can get the API endpoint for our model. The API endpoints follow the Tensorflow V1 HTTP API.

    $ kubectl get inferenceservices
    NAME        URL                                                                          READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    sentiment   http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment   True    100                                23h

    The URL in the output is the public API URL for your newly deployed model. A HTTPs endpoint is also available, however this one bypasses any canary deployments. Retrieve this one with kubectl get ksvc.

  5. Run a test prediction on the URL from above. Remember to add the :predict postfix.

     $ curl -d @sample.json http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment:predict
    {"predictions": ["positive"]}
  6. Remove the InferenceService. This will delete all the associated resources, except for your model storage and sleep Deployment.

    $ kubectl delete inferenceservices sentiment
    inferenceservice.serving.kubeflow.org "sentiment" deleted
    ```# thot.ai-Back-End
Owner
Tatenda Christopher Chinyamakobvu
Tatenda Christopher Chinyamakobvu
Translation to python of Chris Sims' optimization function

pycsminwel This is a locol minimization algorithm. Uses a quasi-Newton method with BFGS update of the estimated inverse hessian. It is robust against

Gustavo Amarante 1 Mar 21, 2022
基于百度的语音识别,用python实现,pyaudio+pyqt

Speech-recognition 基于百度的语音识别,python3.8(conda)+pyaudio+pyqt+baidu-aip 百度有面向python

J-L 1 Jan 03, 2022
Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

41 Jan 03, 2023
Various Algorithms for Short Text Mining

Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te

Kwan-Yuet 466 Dec 06, 2022
Finetune gpt-2 in google colab

gpt-2-colab finetune gpt-2 in google colab sample result (117M) from retraining on A Tale of Two Cities by Charles Di

212 Jan 02, 2023
Implementation of "Adversarial purification with Score-based generative models", ICML 2021

Adversarial Purification with Score-based Generative Models by Jongmin Yoon, Sung Ju Hwang, Juho Lee This repository includes the official PyTorch imp

15 Dec 15, 2022
Download videos from YouTube/Twitch/Twitter right in the Windows Explorer, without installing any shady shareware apps

youtube-dl and ffmpeg Windows Explorer Integration Download videos from YouTube/Twitch/Twitter and more (any platform that is supported by youtube-dl)

Wolfgang 226 Dec 30, 2022
Python implementation of TextRank for phrase extraction and summarization of text documents

PyTextRank PyTextRank is a Python implementation of TextRank as a spaCy pipeline extension, used to: extract the top-ranked phrases from text document

derwen.ai 1.9k Jan 06, 2023
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Welcome to Spokestack Python! This library is intended for developing voice interfaces in Python. This can include anything from Raspberry Pi applicat

Spokestack 133 Sep 20, 2022
The code from the whylogs workshop in DataTalks.Club on 29 March 2022

whylogs Workshop The code from the whylogs workshop in DataTalks.Club on 29 March 2022 whylogs - The open source standard for data logging (Don't forg

DataTalksClub 12 Sep 05, 2022
A Plover python dictionary allowing for consistent symbol input with specification of attachment and capitalisation in one stroke.

Emily's Symbol Dictionary Design This dictionary was created with the following goals in mind: Have a consistent method to type (pretty much) every sy

Emily 68 Jan 07, 2023
A 30000+ Chinese MRC dataset - Delta Reading Comprehension Dataset

Delta Reading Comprehension Dataset 台達閱讀理解資料集 Delta Reading Comprehension Dataset (DRCD) 屬於通用領域繁體中文機器閱讀理解資料集。 本資料集期望成為適用於遷移學習之標準中文閱讀理解資料集。 本資料集從2,108篇

272 Dec 15, 2022
NLP - Machine learning

Flipkart-product-reviews NLP - Machine learning About Product reviews is an essential part of an online store like Flipkart’s branding and marketing.

Harshith VH 1 Oct 29, 2021
Wrapper to display a script output or a text file content on the desktop in sway or other wlroots-based compositors

nwg-wrapper This program is a part of the nwg-shell project. This program is a GTK3-based wrapper to display a script output, or a text file content o

Piotr Miller 94 Dec 27, 2022
A python project made to generate code using either OpenAI's codex or GPT-J (Although not as good as codex)

CodeJ A python project made to generate code using either OpenAI's codex or GPT-J (Although not as good as codex) Install requirements pip install -r

TheProtagonist 1 Dec 06, 2021
Contains descriptions and code of the mini-projects developed in various programming languages

TexttoSpeechAndLanguageTranslator-project introduction A pleasant application where the client will be given buttons like play,reset and exit. The cli

Adarsh Reddy 1 Dec 22, 2021
An example project using OpenPrompt under pytorch-lightning for prompt-based SST2 sentiment analysis model

pl_prompt_sst An example project using OpenPrompt under the framework of pytorch-lightning for a training prompt-based text classification model on SS

Zhiling Zhang 5 Oct 21, 2022
KR-FinBert And KR-FinBert-SC

KR-FinBert & KR-FinBert-SC Much progress has been made in the NLP (Natural Language Processing) field, with numerous studies showing that domain adapt

5 Jul 29, 2022
Multilingual text (NLP) processing toolkit

polyglot Polyglot is a natural language pipeline that supports massive multilingual applications. Free software: GPLv3 license Documentation: http://p

RAMI ALRFOU 2.1k Jan 07, 2023