This is a modification of the OpenAI-CLIP repository of moein-shariatnia

Overview

INTRODUCTION


This is a modification of the OpenAI-CLIP repo of moein-shariatnia(https://github.com/moein-shariatnia/OpenAI-CLIP).

The current training dataset supports flicker-8k or flicker-30k, and the image encoder supports Resnet50 or ViT(vit_base_patch16_384).

Text encoder supports only DistilBert like moein-shariatnia.

ENVIRONTMENT SETTING


$ virtualenv .venv --python=python3.6
$ source .venv/bin/activate
$ pip install -r requirements.txt

EXECUTTION


  • Pretrain
$ python3 pretrain.py
  • Inference
$ python3 inference.py --qeury={YOUR QUERY}

CAUTION


You must set(or check) some options in config.py before pretrain & inference

ex1) dataset("8k" or "30k"): Train dataset(flicker-8k or flicker-30k)

ex2) model_name("resnet50" or "vit_base_patch16_384"): Type of image encoder

ex3) pretrained(True or False): Decide whether to learn by loading pretrain versions of text encoder(DistilBert) and image encoder(resnet50 or ViT)

ex4) batch_size: Set according to the capacity of the machine

Owner
Sangwon Beak
Sangwon Beak
official ( API ) for the zAmericanEnglish app in [ Google play ] and [ App store ]

official ( API ) for the zAmericanEnglish app in [ Google play ] and [ App store ]

Plugin 3 Jan 12, 2022
Understanding the Difficulty of Training Transformers

Admin Understanding the Difficulty of Training Transformers Guided by our analyses, we propose Adaptive Model Initialization (Admin), which successful

Liyuan Liu 300 Dec 29, 2022
Spooky Skelly For Python

_____ _ _____ _ _ _ | __| ___ ___ ___ | |_ _ _ | __|| |_ ___ | || | _ _ |__ || . || . || . || '

Kur0R1uka 1 Dec 23, 2021
A Python script that compares files in directories

compare-files A Python script that compares files in different directories, this is similar to the command filecmp.cmp(f1, f2). I made this script in

Colvin 1 Oct 15, 2021
A Japanese tokenizer based on recurrent neural networks

Nagisa is a python module for Japanese word segmentation/POS-tagging. It is designed to be a simple and easy-to-use tool. This tool has the following

325 Jan 05, 2023
๐Ÿ’ซ Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc

Explosion 24.9k Jan 02, 2023
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

fluz 11 Nov 16, 2022
Blackstone is a spaCy model and library for processing long-form, unstructured legal text

Blackstone Blackstone is a spaCy model and library for processing long-form, unstructured legal text. Blackstone is an experimental research project f

ICLR&D 579 Jan 08, 2023
A toolkit for document-level event extraction, containing some SOTA model implementations

Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker Source code for ACL-IJCNLP 2021 Long paper: Document-le

84 Dec 15, 2022
Sorce code and datasets for "K-BERT: Enabling Language Representation with Knowledge Graph",

K-BERT Sorce code and datasets for "K-BERT: Enabling Language Representation with Knowledge Graph", which is implemented based on the UER framework. R

Weijie Liu 834 Jan 09, 2023
๋‰ด์Šค ๋„๋ฉ”์ธ ์งˆ์˜์‘๋‹ต ์‹œ์Šคํ…œ (21-1ํ•™๊ธฐ ์กธ์—… ํ”„๋กœ์ ํŠธ)

๋‰ด์Šค ๋„๋ฉ”์ธ ์งˆ์˜์‘๋‹ต ์‹œ์Šคํ…œ ๋ณธ ํ”„๋กœ์ ํŠธ๋Š” ๋‰ด์Šค๊ธฐ์‚ฌ์— ๋Œ€ํ•œ ์งˆ์˜์‘๋‹ต ์„œ๋น„์Šค ๋ฅผ ์ œ๊ณตํ•˜๊ธฐ ์œ„ํ•ด์„œ ์ง„ํ–‰ํ•œ ํ”„๋กœ์ ํŠธ์ž…๋‹ˆ๋‹ค. ์•ฝ 3๊ฐœ์›”๊ฐ„ ( 21. 03 ~ 21. 05 ) ์ง„ํ–‰ํ•˜์˜€์œผ๋ฉฐ Transformer ์•„ํ‚คํ…์ณ ๊ธฐ๋ฐ˜์˜ Encoder๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•œ๊ตญ์–ด ์งˆ์˜์‘๋‹ต ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ

TaegyeongEo 4 Jul 08, 2022
Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together

SpeechMix Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together. Introduction For the same input: from datas

Eric Lam 31 Nov 07, 2022
Codes for coreference-aware machine reading comprehension

Data and code for the paper "Tracing Origins: Coreference-aware Machine Reading Comprehension" at ACL2022. Dataset There are three folders for our thr

11 Sep 29, 2022
Extracting Summary Knowledge Graphs from Long Documents

GraphSum This repo contains the data and code for the G2G model in the paper: Extracting Summary Knowledge Graphs from Long Documents. The other basel

Zeqiu (Ellen) Wu 10 Oct 21, 2022
Unofficial Python library for using the Polish Wordnet (plWordNet / Sล‚owosieฤ‡)

Polish Wordnet Python library Simple, easy-to-use and reasonably fast library for using the Sล‚owosieฤ‡ (also known as PlWordNet) - a lexico-semantic da

Max Adamski 12 Dec 23, 2022
Textpipe: clean and extract metadata from text

textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata

Textpipe 298 Nov 21, 2022
American Sign Language (ASL) to Text Converter

Signterpreter American Sign Language (ASL) to Text Converter Recommendations Although there is grayscale and gaussian blur, we recommend that you use

0 Feb 20, 2022
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language This repository contains UA-GEC data and an accompanying Python lib

Grammarly 227 Jan 02, 2023
DVC-NLP-Simple-usecase

dvc-NLP-simple-usecase DVC NLP project Reference repository: official reference repo DVC STUDIO MY View Bag of Words- Krish Naik TF-IDF- Krish Naik ST

SUNNY BHAVEEN CHANDRA 2 Oct 02, 2022
Problem: Given a nepali news find the category of the news

Classification of category of nepali news catorgory using different algorithms Problem: Multiclass Classification Approaches: TFIDF for vectorization

pudasainishushant 2 Jan 09, 2022