mirror of
https://github.com/PR0M3TH3AN/Piper-TTS-Script.git
synced 2025-09-07 06:28:42 +00:00
# Piper-TTS Audiobook Script

This project is a **one-command solution** for converting large `.txt` documents into high-quality **MP3 audiobooks** with [Piper](https://github.com/rhasspy/piper), a fast, local, open-source text-to-speech engine. After the first run, which needs an internet connection to install dependencies and download the voice model, conversion works entirely **offline**.

The script:

- Installs all prerequisites (`pipx`, `ffmpeg`, `piper-tts`)
- Downloads a chosen Piper voice model (default: **female UK English, `en_GB-cori-high`**)
- Splits large text into manageable chunks for Piper
- Synthesizes each chunk into audio
- Stitches the chunks together into a single **MP3** file

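
The splitting step listed above could look roughly like this. It is a minimal sketch, not the code from `make_audiobook.sh`: the function name `split_into_chunks`, the `chunk_0001.txt` naming scheme, the default of 3000 characters, and the choice to split on blank-line-separated paragraphs are all assumptions made for illustration. A single paragraph longer than the limit still ends up in a chunk of its own, so the limit is approximate.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from make_audiobook.sh):
# pack blank-line-separated paragraphs into numbered chunk files,
# starting a new file whenever the next paragraph would exceed max_chars.
# Example: split_into_chunks mybook.txt ./chunks 3000
split_into_chunks() {
  local input="$1" outdir="$2" max_chars="${3:-3000}"
  mkdir -p "$outdir"
  awk -v max="$max_chars" -v dir="$outdir" '
    BEGIN {
      RS = ""                                   # paragraph mode: one record per paragraph
      n = 1; used = 0
      file = sprintf("%s/chunk_%04d.txt", dir, n)
    }
    {
      len = length($0)
      if (used > 0 && used + len > max) {       # current chunk is full
        close(file)
        n++; used = 0
        file = sprintf("%s/chunk_%04d.txt", dir, n)
      }
      print $0 "\n" > file                      # paragraph plus a blank separator line
      used += len + 2
    }
  ' "$input"
}
```
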
---

## 📂 Project Structure

```
Piper-TTS-Script/
├── make_audiobook.sh   # Main script
└── README.md           # This file
```
---

## ⚙️ Requirements

- **Linux Mint / Ubuntu** (or other Debian-based distro)
- Internet connection for the first run (to install packages and download the voice model)
- A `.txt` file containing your book or text to be converted

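
To see which prerequisites are already present before the first run, a check along these lines may help. It is only a sketch: the helper name `missing_tools` is made up for this example, and it assumes the `piper-tts` package exposes an executable called `piper`. The script installs missing prerequisites itself, so this check is optional.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of every given command
# that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools pipx ffmpeg piper)"
if [ -n "$missing" ]; then
  # Join the newline-separated names with spaces for a one-line message.
  echo "Not installed yet: ${missing//$'\n'/ }"
else
  echo "All prerequisites found."
fi
```
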
---

## 📥 Installation

1. **Clone or download this repo**:

   ```bash
   mkdir -p ~/Documents/GitHub   # create the parent folder if it does not exist yet
   cd ~/Documents/GitHub
   git clone https://github.com/PR0M3TH3AN/Piper-TTS-Script.git
   ```

   *(Or just manually create the folder and place `make_audiobook.sh` inside)*

2. **Make the script executable**:

   ```bash
   chmod +x ~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh
   ```
---

## 🗂 Preparing Your Text File

Place your text file somewhere accessible, e.g.:

```
~/Documents/audiobooks/mybook.txt
```

If you have a PDF or EPUB, convert it to plain text first:

```bash
# PDF to TXT (pdftotext is part of the poppler-utils package)
pdftotext mybook.pdf mybook.txt

# EPUB to TXT (requires calibre)
ebook-convert mybook.epub mybook.txt
```
---

## 🚀 Usage

Basic usage:

```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt
```

* **First run**: Installs all dependencies and downloads the default voice model.
* **Output**: An MP3 file in the same directory as your text file, with the same base name:

  ```
  ~/Documents/audiobooks/mybook.mp3
  ```

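
The mapping from input path to output path shown above can be expressed with standard shell tools. The function name `output_path_for` is hypothetical, and the real script may derive the name differently; this sketch only reproduces the `mybook.txt` to `mybook.mp3` behaviour described here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep the directory and base name of the input
# file, and swap its last extension for .mp3.
output_path_for() {
  local input="$1" dir base
  dir="$(dirname -- "$input")"
  base="$(basename -- "$input")"
  printf '%s/%s.mp3\n' "$dir" "${base%.*}"   # ${base%.*} drops the final ".ext"
}

output_path_for "$HOME/Documents/audiobooks/mybook.txt"
```
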
---

## 🔧 Configuration

At the top of `make_audiobook.sh` there is a **CONFIG** section where you can adjust:

| Setting                             | Description                                            |
| ----------------------------------- | ------------------------------------------------------ |
| `MODEL_NAME`                        | Name of the Piper voice model                          |
| `MODEL_FILE`                        | Path to your `.onnx` model file                        |
| `MODEL_ONNX_URL` / `MODEL_JSON_URL` | Download URLs for the model and metadata               |
| `LENGTH_SCALE`                      | Speech speed (1.0 = normal, <1 faster, >1 slower)      |
| `NOISE_SCALE`                       | Expressiveness (lower = flatter, higher = more varied) |
| `MAX_CHARS`                         | Max characters per chunk sent to Piper                 |
| `SILENCE_MS`                        | Silence gap between chunks in milliseconds             |
| `MP3_Q`                             | MP3 quality (VBR: 0=best, 2=high, 5=medium)            |

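
To make the settings in the table more concrete, the sketch below prints, without running anything, the kind of commands that such values typically feed into. Everything here is an assumption for illustration: the values are examples rather than the script's actual defaults, `show_commands` is a made-up name, and the flag spellings (`--length_scale`, `--noise_scale`, `--output_file` for Piper, `-q:a` with `libmp3lame` for ffmpeg) should be checked against `piper --help` and the ffmpeg documentation for the installed versions.

```bash
#!/usr/bin/env bash
# Example values only; the real defaults live in the CONFIG section
# of make_audiobook.sh.
MODEL_FILE="$HOME/.local/share/piper/voices/en_GB-cori-high.onnx"
LENGTH_SCALE="1.0"    # 1.0 = normal speed
NOISE_SCALE="0.667"   # variability of the generated speech
MP3_Q="2"             # LAME VBR quality, lower is better

# Dry run: print the commands one text chunk would need instead of executing them.
show_commands() {
  local chunk_txt="$1" chunk_wav="${1%.txt}.wav"
  echo "piper --model $MODEL_FILE --length_scale $LENGTH_SCALE --noise_scale $NOISE_SCALE --output_file $chunk_wav < $chunk_txt"
  echo "ffmpeg -i $chunk_wav -codec:a libmp3lame -q:a $MP3_Q ${chunk_wav%.wav}.mp3"
}

show_commands chunk_0001.txt
```
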
---

## 🎤 Changing the Voice

This script defaults to:

* **Female UK English**: `en_GB-cori-high`

To change the voice:

1. Visit the official Piper voice list:
   [https://github.com/rhasspy/piper/blob/master/VOICES.md](https://github.com/rhasspy/piper/blob/master/VOICES.md)

2. Copy the `.onnx` and `.onnx.json` URLs for your chosen voice.

3. Edit the **CONFIG** section in `make_audiobook.sh` with the new:

   * `MODEL_NAME`
   * `MODEL_ONNX_URL`
   * `MODEL_JSON_URL`

Example for **Female US English (Lessac)**:

```bash
MODEL_NAME="en_US-lessac-high"
MODEL_ONNX_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/high/en_US-lessac-high.onnx"
MODEL_JSON_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/high/en_US-lessac-high.onnx.json"
```

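
The Lessac URLs above follow a visible pattern: language, locale, voice, and quality appear as path segments, followed by the full model name. Assuming other voices use the same layout (worth verifying against VOICES.md before relying on it), the three settings can be generated from the model name alone. The helper name `voice_urls` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the CONFIG lines for a voice name of the
# form <locale>-<voice>-<quality>, e.g. en_US-lessac-high.
# Assumes the URL layout seen in the Lessac example holds for the voice.
voice_urls() {
  local name="$1" locale voice quality lang base
  IFS='-' read -r locale voice quality <<< "$name"   # en_US / lessac / high
  lang="${locale%%_*}"                               # en_US -> en
  base="https://huggingface.co/rhasspy/piper-voices/resolve/main/$lang/$locale/$voice/$quality"
  printf 'MODEL_NAME="%s"\n' "$name"
  printf 'MODEL_ONNX_URL="%s/%s.onnx"\n' "$base" "$name"
  printf 'MODEL_JSON_URL="%s/%s.onnx.json"\n' "$base" "$name"
}

voice_urls "en_GB-cori-high"
```
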
---

## 🛠 Advanced Usage

Specify a custom voice model:

```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt \
  --model ~/.local/share/piper/voices/en_US-lessac-high.onnx
```

Name the output file:

```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt \
  --name my_custom_audiobook.mp3
```

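
For readers curious how options such as `--model` and `--name` are commonly handled in a Bash script, here is a minimal sketch of an argument loop. It is not the parser from `make_audiobook.sh`: the function `parse_args` and the variable names `INPUT`, `MODEL_OVERRIDE`, and `OUTPUT_NAME` are assumptions chosen for this example.

```bash
#!/usr/bin/env bash
# Hypothetical parser for: make_audiobook.sh INPUT.txt [--model FILE] [--name OUT.mp3]
parse_args() {
  INPUT="" MODEL_OVERRIDE="" OUTPUT_NAME=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --model)
        [ "$#" -ge 2 ] || { echo "--model needs a value" >&2; return 2; }
        MODEL_OVERRIDE="$2"; shift 2 ;;
      --name)
        [ "$#" -ge 2 ] || { echo "--name needs a value" >&2; return 2; }
        OUTPUT_NAME="$2"; shift 2 ;;
      -*)
        echo "Unknown option: $1" >&2; return 2 ;;
      *)
        INPUT="$1"; shift ;;          # first non-option argument is the text file
    esac
  done
  [ -n "$INPUT" ] || { echo "Usage: make_audiobook.sh INPUT.txt [--model FILE] [--name OUT.mp3]" >&2; return 2; }
}

parse_args mybook.txt --name my_custom_audiobook.mp3 && echo "$INPUT -> $OUTPUT_NAME"
```
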
---

## ⚠️ Limitations

* For **very large books** (>200k characters), processing can still take a while even with chunking, because every chunk has to be synthesized in turn.
* If the voice model URLs change, you’ll need to update them from [VOICES.md](https://github.com/rhasspy/piper/blob/master/VOICES.md).
* Piper’s pronunciation is very good but not perfect; some technical or foreign words may be read oddly.

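
A quick way to judge how much work a text involves is to count its characters and divide by the chunk size. The helper below is a sketch with a made-up name, `estimate_chunks`, and an assumed chunk size of 3000 characters; the actual `MAX_CHARS` value is set in the script's CONFIG section. With these assumptions, a 200,000-character text works out to 67 chunks.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate how many chunks a text file will produce.
# Usage: estimate_chunks FILE [MAX_CHARS]
estimate_chunks() {
  local file="$1" max_chars="${2:-3000}" chars
  chars="$(wc -m < "$file")"                          # character count of the file
  echo $(( (chars + max_chars - 1) / max_chars ))     # ceiling division
}
```
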
---

## 📄 License

This script is released under the MIT License. Piper itself ([rhasspy/piper](https://github.com/rhasspy/piper)) is also released under the MIT License.

---

## 🙋‍♂️ Credits

* [Piper TTS](https://github.com/rhasspy/piper): Open-source text-to-speech engine
* [ffmpeg](https://ffmpeg.org): Audio conversion
* Script by **Your Name**