The speech synthesis engine of VOICEVOX, a free, medium-quality text-to-speech application

Last update: Jul 05, 2022

Overview

VOICEVOX ENGINE

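The speech synthesis engine of VOICEVOX. It is actually an HTTP server, so you can synthesize speech from text by sending requests to it.

API documentation

With the VOICEVOX software running, open http://localhost:50021/docs in a browser to view the documentation.
The article "Integration with the VOICEVOX speech synthesis engine" (VOICEVOX 音声合成エンジンとの連携) may also be a useful reference.

Sample code for speech synthesis via HTTP requests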

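# Text containing spaces or non-ASCII characters must be URL-encoded first
text="ABCDEFG"

# Step 1: create a synthesis query for the text and save it as query.json
curl -s \
    -X POST \
    "localhost:50021/audio_query?text=$text&speaker=1" \
    > query.json

# Step 2: send the query back to synthesize the audio and save it as audio.wav
curl -s \
    -H "Content-Type: application/json" \
    -X POST \
    -d @query.json \
    "localhost:50021/synthesis?speaker=1" \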
    > audio.wav
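
The same two-step flow can be sketched in Python using only the standard library. This is a minimal sketch, not part of the engine: the helper names `build_url` and `synthesize` are hypothetical, and it assumes the engine is listening on localhost:50021 as in the curl example. Unlike the curl version, the query string is percent-encoded, so text with spaces or Japanese characters should also work.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the engine listens on the address used in the curl example.
BASE_URL = "http://localhost:50021"


def build_url(path, base_url=BASE_URL, **params):
    """Return base_url + path with params percent-encoded as a query string.

    Hypothetical helper for this sketch; it is not part of the engine.
    """
    query = urllib.parse.urlencode(params)
    return f"{base_url}{path}?{query}"


def synthesize(text, speaker=1, base_url=BASE_URL):
    """Run the two-step flow (POST /audio_query, then POST /synthesis).

    Returns the WAV data as bytes. Requires a running engine.
    """
    # Step 1: obtain the synthesis query (the JSON that curl saves as query.json).
    query_req = urllib.request.Request(
        build_url("/audio_query", base_url, text=text, speaker=speaker),
        method="POST",
    )
    with urllib.request.urlopen(query_req) as resp:
        query = json.load(resp)

    # Step 2: send that JSON back as the request body to receive the audio.
    synth_req = urllib.request.Request(
        build_url("/synthesis", base_url, speaker=speaker),
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(synth_req) as resp:
        return resp.read()
```

With the engine running, writing the bytes returned by `synthesize("ABCDEFG")` to a file opened in binary mode should produce the same audio.wav as the curl example.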

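For contributors

When creating a pull request that resolves an issue, either say on the issue that you have started working on it or open a Draft pull request first, so that no one else ends up working on the same issue.

Environment setup

# Install the libraries needed for development
pip install -r requirements-test.txt

# If you just want to run the engine, use this instead
pip install -r requirements.txt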

Running

# Start the server using the product version of VOICEVOX
VOICEVOX_DIR="C:/path/to/voicevox" # path to the product-version VOICEVOX directory
python run.py --voicevox_dir=$VOICEVOX_DIR

# Start the server with the mock
python run.py
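
When scripting against a freshly started server, it can help to wait until it accepts connections before sending requests. The following is a minimal sketch under stated assumptions: `wait_for_port` is a hypothetical helper, not part of the engine, and the default port 50021 is taken from the examples above. It checks only that the TCP port is open, not that the engine has finished loading.

```python
import socket
import time


def wait_for_port(host="localhost", port=50021, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds
    have passed. Hypothetical helper for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A script could call `wait_for_port()` right after launching `python run.py` in the background and send requests only if it returns True.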

Code formatting

Formats the code. Run this before sending a pull request.

pysen run format lint

Build

Build Tools for Visual Studio 2019 is required.

pip install -r requirements-dev.txt

python -m nuitka \
    --standalone \
    --plugin-enable=numpy \
    --follow-import-to=numpy \
    --follow-import-to=aiofiles \
    --include-package=uvicorn \
    --include-package-data=pyopenjtalk \
    --include-data-file=VERSION.txt=./ \
    --include-data-file=speakers.json=./ \
    --include-data-file=C:/path/to/voice-library/Release/*.dll=./ \
    --include-data-file=C:/path/to/voice-library/*.bin=./ \
    --include-data-dir=.venv/Lib/site-packages/_soundfile_data=./_soundfile_data \
    --msvc=14.2 \
    --follow-imports \
    --no-prefer-source-code \
    run.py

License

The engine is dual-licensed under LGPL v3 and a separate license that does not require publishing source code. To obtain the separate license, contact Hiho (Twitter: @hiho_karuta).

Releases(check-code-sign-8)

check-code-sign-8(Jul 10, 2022)

Owner

Hiroshiba
