A containerized REST API around OpenAI's CLIP model.

Overview

OpenAI's CLIP — REST API

This is a container wrapping OpenAI's CLIP model in a RESTful interface.

Running the container locally

First, build the container:

docker build -t clip-container:latest .

Then, you can run it:

docker run -it -p 8080:8080 --name "clip-container" --rm clip-container:latest /opt/ml/code/serve

Sending requests:

The container exposes two different endpoints:

  • GET /ping: Returns 200 status if the container is working properly.
  • POST /invocations: Processes a list of images and returns the list of labels with their corresponding probabilities.

Here is an example request assuming the container is listening in port 8080:

curl --location --request POST 'http://localhost:8080/invocations' \
--header 'Content-Type: application/json' \
--data-raw '{
    "images": [
        "https://images.unsplash.com/photo-1597308680537-1ba44407ffc0?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1834&q=80",
        "https://images.unsplash.com/photo-1589270216117-7972b3082c7d?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1834&q=80"],
    "classes": ["person", "bag", "person with a bag", "woman riding a horse", "woman with a bag", "woman with black shirt and a bag"]
}'

The response looks like this:

[
    {
        "url": "https://images.unsplash.com/photo-1597308680537-1ba44407ffc0?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1834&q=80", 
        "labels": [
            "woman with black shirt and a bag", 
            "woman with a bag", 
            "person with a bag", 
            "bag", "person"
        ], 
        "probs": [1.0, 1.7488513970320696e-09, 1.1663764917350243e-19, 4.179975909038141e-30, 3.77612043676229e-30]
    }, 
    {
        "url": "https://images.unsplash.com/photo-1589270216117-7972b3082c7d?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1834&q=80", 
        "labels": [
            "person with a bag", 
            "woman with black shirt and a bag", 
            "bag", 
            "woman with a bag", 
            "person"
        ], 
        "probs": [1.0, 2.4879632576357835e-08, 2.065714813830402e-13, 7.658033346455602e-15, 1.1307645811408335e-23]
    }
]

SageMaker Integration

This container is compatible with SageMaker so you should be able to host it as a SageMaker endpoint with no modifications. The code supports GPU and CPU instances.

Owner
Santiago Valdarrama
I care a whole lot about building software. More specifically, I lose sleep over Machine Learning and Computer Vision.
Santiago Valdarrama
🔥3D-RecGAN in Tensorflow (ICCV Workshops 2017)

3D Object Reconstruction from a Single Depth View with Adversarial Learning Bo Yang, Hongkai Wen, Sen Wang, Ronald Clark, Andrew Markham, Niki Trigoni

Bo Yang 125 Nov 26, 2022
Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning

structshot Code and data for paper "Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning", Yi Yang and Arz

ASAPP Research 47 Dec 27, 2022
A mini library for Policy Gradients with Parameter-based Exploration, with reference implementation of the ClipUp optimizer from NNAISENSE.

PGPElib A mini library for Policy Gradients with Parameter-based Exploration [1] and friends. This library serves as a clean re-implementation of the

NNAISENSE 56 Jan 01, 2023
Alphabetical Letter Recognition

BayeesNetworks-Image-Classification Alphabetical Letter Recognition In these demo we are using "Bayees Networks" Our database is composed by Learning

Mohammed Firass 4 Nov 30, 2021
Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP 2021.

The Stem Cell Hypothesis Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP

Emory NLP 5 Jul 08, 2022
ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Ibai Gorordo 14 Dec 09, 2022
Server files for UltimateLabeling

UltimateLabeling server files Server files for UltimateLabeling. git clone https://github.com/alexandre01/UltimateLabeling_server.git cd UltimateLabel

Alexandre Carlier 4 Oct 10, 2022
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

187 Dec 26, 2022
RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

RoBERTa base model for Marathi Language (मराठी भाषा) Pretrained model on Marathi language using a masked language modeling (MLM) objective. RoBERTa wa

Nipun Sadvilkar 23 Oct 19, 2022
A tensorflow implementation of GCN-LPA

GCN-LPA This repository is the implementation of GCN-LPA (arXiv): Unifying Graph Convolutional Neural Networks and Label Propagation Hongwei Wang, Jur

Hongwei Wang 83 Nov 28, 2022
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation [Project] [Paper] [arXiv] [Home] Official implementation of FastFCN:

Wu Huikai 815 Dec 29, 2022
Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)

This is a playground for pytorch beginners, which contains predefined models on popular dataset. Currently we support mnist, svhn cifar10, cifar100 st

Aaron Chen 2.4k Dec 28, 2022
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Optimum Transformers Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX runtime. Installatio

Aleksey Korshuk 115 Dec 16, 2022
Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer

1 Feb 03, 2022
A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022) https://arxiv.org/abs/2203.09388 Jianqi Ma, Zheto

MA Jianqi, shiki 104 Jan 05, 2023
Github project for Attention-guided Temporal Coherent Video Object Matting.

Attention-guided Temporal Coherent Video Object Matting This is the Github project for our paper Attention-guided Temporal Coherent Video Object Matti

71 Dec 19, 2022
Experiments for Neural Flows paper

Neural Flows: Efficient Alternative to Neural ODEs [arxiv] TL;DR: We directly model the neural ODE solutions with neural flows, which is much faster a

54 Dec 07, 2022
Hard cater examples from Hopper ICLR paper

CATER-h Honglu Zhou*, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf (*Contact: honglu.zhou

NECLA ML Group 6 May 11, 2021
Implementation of "A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement" by pytorch

This repository is used to suspend the results of our paper "A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement"

ScorpioMiku 19 Sep 30, 2022
Ensembling Off-the-shelf Models for GAN Training

Vision-aided GAN video (3m) | website | paper Can the collective knowledge from a large bank of pretrained vision models be leveraged to improve GAN t

345 Dec 28, 2022