DocuMiner
A production-ready pipeline for text mining and subject indexing
Want to Contribute?
More code and documentation coming soon.
Authors
Open Source Club
A production-ready pipeline for text mining and subject indexing
More code and documentation coming soon.
Open Source Club
python-sqlparse - Parse SQL statements sqlparse is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting S
auto_localize Auto translate Localizable.strings for multiple languages in Xcode Usage put your origin Localizable.strings file in folder pip3 install
title emoji colorFrom colorTo sdk app_file pinned decontextualizer 📤 green gray streamlit main.py false Decontextualizer As a second step in improvin
Rater Scoring Modeling Tool Introduction Automated scoring of written and spoken test responses is a growing field in educational natural language pro
colormate Python script text formatting package What is colormate? colormate is a python library that lets you add text formatting to your scripts, it
Skype export archive to text converter This software utility extracts chat logs
WeKws Production First and Production Ready End-to-End Keyword Spotting Toolkit. The goal of this toolkit it to... Small footprint keyword spotting (K
Adventura Adventura is an open source Python Text Adventure Engine, Not yet uplo
afasi Fuzz a language by mixing up only few words. Status Beta. Note: The default branch is default. Use Examples Version General Help Translate Help
seed_signer_ontimes WARNING This project has mainly been used for local development, and creation should be ran on a air-gapped machine. A minimal pyt
Credit Simple python program to auto credit your code, text, book, whatever! Setup First change credit_text to whatever text you would like to credit
Extract knowledge from raw text This repository is a nearly copy-paste of "From Text to Knowledge: The Information Extraction Pipeline" with some cosm
Anti-money laundering Dedect relationship between A and E by tracing through payments with similar amounts and identifying payment chains. For example
phonenumbers Python Library This is a Python port of Google's libphonenumber library It supports Python 2.5-2.7 and Python 3.x (in the same codebase,
MUSE stands for Multilingual Universal Sentence Encoder - multilingual extension (supports 16 languages) of Universal Sentence Encoder (USE).
deepl4subtitle Deeplを使って字幕ファイル(.sbv)を翻訳します。タイムスタンプも含めて出力しますが、翻訳時はタイムスタンプは文の一部とは切り離されるので、.sbvファイルをそのまま翻訳機に突っ込むよりも高精度な翻訳ができるはずです。 つかいかた 入力する.sbvファイルの前処理
Bash RegExp This project aims to test check if your RegExp are being matched by grep. It's a local server that starts on the port 8080. It runs the se
🛒 Caixa Registradora V2 ❓ - Como usar? Execute o caixa-registradora.py, nele vai ter um menu interativo, você pode cadastrar diversos produtos em um
Installation: Windows python -m pip install text2ipa macOS sudo pip3 install text2ipa Linux pip install text2ipa Features Convert English text to I
Name Parser A simple Python (3.2+ & 2.6+) module for parsing human names into their individual components. hn.title hn.first hn.middle hn.last hn.suff