The first open-source library that detects the font of a text in a image.

Overview

Typefont

Typefont is an experimental library that detects the font of a text in a image.

Usage

Import the main function and invoke it like in the following script.

import { Typefont } from "./src/index.js";

Typefont("image.png").then((result) => console.log(result));

or

import { Typefont } from "./src/index.js";

async function getFontFromImage (source) {
    const fonts = await Typefont(source);
    
    return fonts[0]; // Return the most similar font.
}

The first argument of the function can be the path or the base64 of the image. The function returns a Promise that when is resolved returns an array containing each font ordered in descending order (considering the similarity percentage).

Preview

Text on the cover of a book (the language is different because I live in Italy).

Text on the cover of another book.

Version 2

I'm working on a new version which gets the fonts directly from .ttf files and the Google Fonts database. The comparison is made using the Hausdorff Distance and the Shape Context. If you are interested in a collaboration contact me ([email protected]). It's difficult to progress since I work and I have many other projects.

Options

You can pass an object with options to the function as second argument.

Option Type Description Default
progress Function A function that is called every time the comparison with a font is completed. undefined
minSymbolConfidence Number The minimum confidence that a symbol must have to be accepted in the comparison queue (the confidence value is assigned by the OCR engine). 15
analyticComparisonThreshold Number The threshold of the analytic comparison. 0.5
analyticComparisonScaleToSameSize Boolean Scale the symbols to the same size before the analytic comparison? false
analyticComparisonSize Number Used as dimension when resizing the images to the same size during the analytic comparison. 128
perceptualComparisonSize Number Used as dimension when resizing the images to the same size during the perceptual comparison. 64
fontsDirectory String The URL of the directory containing the fonts. storage/fonts/
fontsData String The name of the file containing the JSON data of a font. data.json
fontsIndex String The URL of the fonts index JSON file. storage/index.json
fontRequestTimeout Number Font request timeout [ms]. 2000
textRecognitionTimeout Number Text recognition timeout [s]. 60
textRecognitionBinarization Boolean Binarize the image before the recognition? true

Example

Example with options.

Typefont("restaurant-logo.jpg", {
    minSymbolConfidence: 50,
    analyticComparisonScaleToSameSize: true,
    analyticComparisonSize: 256
}).then(res => console.log(res));

Todo

  • Store and load fonts directly from .ttf files.
  • Implement the Shape Context algorithm to improve comparison results.
  • Implement the Hausdorff distance algorithm to improve the comparison results.
  • Import the Google Fonts database.

How it works?

Short summary: the input image is passed to the optical character recognition after some filters based on its brightness. The symbols (letters) are extracted from the input image and compared with the symbols of the fonts in the database using a perceptual comparison and a pixel based comparison in order to obtain a percentage of similarity.

How to add a font

The fonts stored in this database are just a JSON structure with letters as keys and the base64 of the image of the letter of the font as value. If you want to add a new font you must follow this structure.

{
    "meta": {
        "name": "name",
        "author": "author",
        "uri": "uri",
        "license": "license",
        "key": "value",
        ...
    },
    "alpha": {
        "a": "base64",
        "b": "base64",
        "c": "base64",
        ...
    }
}

Then you have to include your font in the index of fonts by adding the font name to the array.

License

MIT License.

Credits

Author: Vasile Pește ([email protected]).

You might also like...
keras复现场景文本检测网络CPTN: 《Detecting Text in Natural Image with Connectionist Text Proposal Network》;欢迎试用,关注,并反馈问题...
keras复现场景文本检测网络CPTN: 《Detecting Text in Natural Image with Connectionist Text Proposal Network》;欢迎试用,关注,并反馈问题...

keras-ctpn [TOC] 说明 预测 训练 例子 4.1 ICDAR2015 4.1.1 带侧边细化 4.1.2 不带带侧边细化 4.1.3 做数据增广-水平翻转 4.2 ICDAR2017 4.3 其它数据集 toDoList 总结 说明 本工程是keras实现的CPTN: Detecti

Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)

Detecting Text in Natural Image with Connectionist Text Proposal Network The codes are used for implementing CTPN for scene text detection, described

 AdvancedEAST is an algorithm used for Scene image text detect, which is primarily based on EAST, and the significant improvement was also made, which make long text predictions more accurate.https://github.com/huoyijie/raspberrypi-car WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

This pyhton script converts a pdf to Image then using tesseract as OCR engine converts Image to Text

Script_Convertir_PDF_IMG_TXT Este script de pyhton convierte un pdf en Imagen luego utilizando tesseract como motor OCR convierte la Imagen a Texto. p

Fast image augmentation library and easy to use wrapper around other libraries. Documentation:  https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

The Open Source Framework for Machine Vision

SimpleCV Quick Links: About Installation [Docker] (#docker) Ubuntu Virtual Environment Arch Linux Fedora MacOS Windows Raspberry Pi SimpleCV Shell Vid

A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.

LAREX LAREX is a semi-automatic open-source tool for layout analysis on early printed books. It uses a rule based connected components approach which

Tesseract Open Source OCR Engine (main repository)

Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM

Releases(v0.1-beta.0)
Owner
Vasile Pește
Specializing in all aspects of designing and implementing software systems.
Vasile Pește
Detect handwritten words in a text-line (classic image processing method).

Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro

Harald Scheidl 190 Jan 03, 2023
📷 Face Recognition using Haar-Cascade Classifier, OpenCV, and Python

Face-Recognition-System Face Recognition using Haar-Cascade Classifier, OpenCV and Python. This project is based on face detection and face recognitio

1 Jan 10, 2022
MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI.

MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI. It is an open-source and easy-to-install ecosystem that can run locally on a machine with one

Project MONAI 344 Dec 23, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 47k Jan 07, 2023
Links to awesome OCR projects

Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution

Konstantin Baierer 2.2k Jan 02, 2023
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
This repository contains the code for the paper "SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks"

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks (CVPR 2021 Oral) This repository contains the official PyTorch implementation

Shunsuke Saito 235 Dec 18, 2022
Satoshi is a discord bot template in python using discord.py that allow you to track some live crypto prices with your own discord bot.

Satoshi ~ DiscordCryptoBot Satoshi is a simple python discord bot using discord.py that allow you to track your favorites cryptos prices with your own

Théo 2 Sep 15, 2022
Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

Maciej Budyś 327 Jan 01, 2023
Repository collecting all the submodules for the new PyTorch-based OCR System.

OCRopus3 is being replaced by OCRopus4, which is a rewrite using PyTorch 1.7; release should be soonish. Please check github.com/tmbdev/ocropus for up

NVIDIA Research Projects 138 Dec 09, 2022
SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

NVIDIA Research Projects 31 Nov 22, 2022
Generic framework for historical document processing

dhSegment dhSegment is a tool for Historical Document Processing. Its generic approach allows to segment regions and extract content from different ty

Digital Humanities Laboratory 343 Dec 24, 2022
Computer vision applications project (Flask and OpenCV)

Computer Vision Applications Project This project is at it's initial phase. This is all about the implementation of different computer vision techniqu

Suryam Thapa 1 Jan 26, 2022
Dirty, ugly, and hopefully useful OCR of Facebook Papers docs released by Gizmodo

Quick and Dirty OCR of Facebook Papers Gizmodo has been working through the Facebook Papers and releasing the docs that they process and review. As lu

Bill Fitzgerald 2 Oct 28, 2021
huoyijie 1.2k Dec 29, 2022
Convert Text-to Handwriting Using Python

Convert Text-to Handwriting Using Python Description In this project we'll use python library that's "pywhatkit" for converting text to handwriting. t

8 Nov 19, 2022
Ddddocr - 通用验证码识别OCR pypi版

带带弟弟OCR通用验证码识别SDK免费开源版 今天ddddocr又更新啦! 当前版本为1.3.1 想必很多做验证码的新手,一定头疼碰到点选类型的图像,做样本费时

Sml2h3 4.4k Dec 31, 2022
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector. Only RBOX part is implemented. Using dice loss

365 Dec 20, 2022
This is a implementation of CRAFT OCR method

This is a implementation of CRAFT OCR method

Esaka 0 Nov 01, 2021
Autonomous Driving project for Euro Truck Simulator 2

hope-autonomous-driving Autonomous Driving project for Euro Truck Simulator 2 Video: How is it working ? In this video, the program processes the imag

Umut Görkem Kocabaş 36 Nov 06, 2022