Mirror of https://github.com/PR0M3TH3AN/Piper-TTS-Script.git (synced 2025-09-05 05:28:43 +00:00)
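# Piper-TTS Audiobook Script
This project is a one-command solution for converting large `.txt` documents into high-quality MP3 audiobooks with Piper, a fast, local, open-source text-to-speech engine. The first run needs an internet connection to install dependencies and download a voice model; after that, conversion runs entirely offline.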
The script:
* Installs all prerequisites (`pipx`, `ffmpeg`, `piper-tts`)
* Downloads a chosen Piper voice model (default: female UK English, `en_GB-cori-high`)
* Splits large text into manageable chunks for Piper
* Synthesizes each chunk into audio
* Stitches the chunks together into a single MP3 file
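The chunking step can be illustrated with standard tools. The sketch below is not taken from `make_audiobook.sh`; `chunk_text` is a hypothetical helper that assumes breaking at word boundaries is good enough, whereas the real script may split on sentences or paragraphs instead.

```bash
# Hypothetical helper, not copied from make_audiobook.sh.
# Reads text on stdin and prints one chunk per line, each at most
# max_chars columns wide, breaking at spaces where possible.
chunk_text() {
  local max_chars="$1"
  # Collapse every run of whitespace (including newlines) into one space,
  # append a final newline, then let fold wrap the text at word boundaries.
  { tr -s '[:space:]' ' '; echo; } | fold -s -w "$max_chars"
}

# Example: count the chunks a book would produce with a 1000-character limit
# chunk_text 1000 < ~/Documents/audiobooks/mybook.txt | wc -l
```

Many `fold` implementations count bytes rather than characters, so chunks of non-ASCII text may come out shorter than the limit, and a single word longer than the limit is split mid-word.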
## 📂 Project Structure
```
Piper-TTS-Script/
├── make_audiobook.sh   # Main script
└── README.md           # This file
```
## ⚙️ Requirements
* Linux Mint / Ubuntu (or another Debian-based distro)
* Internet connection for the first run (to install packages and download the voice model)
* A `.txt` file containing your book or text to be converted
## 📥 Installation
1. **Clone or download this repo**:
   ```bash
   cd ~/Documents/GitHub
   git clone https://github.com/yourusername/Piper-TTS-Script.git
   ```
   *(Or just manually create the folder and place `make_audiobook.sh` inside)*
2. **Make the script executable**:
   ```bash
   chmod +x ~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh
   ```
---
## 🗂 Preparing Your Text File
Place your text file somewhere accessible, e.g.:
```
~/Documents/audiobooks/mybook.txt
```
If you have a PDF or EPUB, convert it to plain text first:
```bash
# PDF to TXT (requires poppler-utils)
pdftotext mybook.pdf mybook.txt
# EPUB to TXT (requires calibre)
ebook-convert mybook.epub mybook.txt
```
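Text converted from a PDF often carries page-break characters and bare page numbers that would otherwise be read aloud. The filter below is a minimal sketch of one way to strip them; `clean_text` is a hypothetical helper, not part of the script, and its digits-only rule is a heuristic that can also remove legitimate lines such as numbered chapter headings.

```bash
# Hypothetical cleanup filter for text produced by pdftotext.
clean_text() {
  # 1. Delete form feeds, which pdftotext emits at page breaks.
  # 2. Drop lines containing only digits (typically bare page numbers).
  tr -d '\f' | grep -v -E '^[[:space:]]*[0-9]+[[:space:]]*$'
}

# Example:
# clean_text < mybook.txt > mybook.clean.txt
```

Skim the cleaned file before converting it, since the right cleanup rules depend on how the source PDF was laid out.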
---
## 🚀 Usage
Basic usage:
```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt
```
* **First run**: Installs all dependencies and downloads the default voice model.
* **Output**: An MP3 file in the same directory as your text:
```
~/Documents/audiobooks/mybook.mp3
```
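Because the MP3 lands next to the text file, converting a whole folder is a short loop. This is a sketch under assumptions: `convert_all` and the `MAKE_AUDIOBOOK` variable are hypothetical names introduced here, and the loop assumes the script takes the text file as its only required argument, as in the basic usage above.

```bash
# Path to the script; override MAKE_AUDIOBOOK to point elsewhere.
MAKE_AUDIOBOOK="${MAKE_AUDIOBOOK:-$HOME/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh}"

# Hypothetical wrapper: convert every .txt file in a directory,
# skipping books whose .mp3 already exists next to the text file.
convert_all() {
  local dir="$1" txt mp3
  for txt in "$dir"/*.txt; do
    [ -e "$txt" ] || continue          # the glob matched no .txt files
    mp3="${txt%.txt}.mp3"
    if [ -e "$mp3" ]; then
      echo "skip: $mp3 already exists" >&2
      continue
    fi
    "$MAKE_AUDIOBOOK" "$txt"
  done
}

# Example:
# convert_all ~/Documents/audiobooks
```

Skipping existing MP3 files makes the loop safe to rerun after an interrupted batch.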
---
## 🔧 Configuration
At the top of `make_audiobook.sh` there is a **CONFIG** section where you can adjust:
| Setting | Description |
| ----------------------------------- | ------------------------------------------------------ |
| `MODEL_NAME` | Name of the Piper voice model |
| `MODEL_FILE` | Path to your `.onnx` model file |
| `MODEL_ONNX_URL` / `MODEL_JSON_URL` | Download URLs for the model and metadata |
| `LENGTH_SCALE` | Speech speed (1.0 = normal, <1 faster, >1 slower) |
| `NOISE_SCALE` | Expressiveness (lower = flatter, higher = more varied) |
| `MAX_CHARS` | Max characters per chunk sent to Piper |
| `SILENCE_MS` | Silence gap between chunks in milliseconds |
| `MP3_Q` | MP3 quality (VBR: 0=best, 2=high, 5=medium) |
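To make the settings more concrete, the sketch below prints the kind of `piper` and `ffmpeg` commands they would plausibly feed. The values, the file names, the 22050 Hz sample rate, and the helpers `ms_to_seconds` and `show_commands` are all assumptions for illustration; the actual defaults and commands in `make_audiobook.sh` may differ.

```bash
# Illustrative values only; the defaults in make_audiobook.sh may differ.
MODEL_FILE="$HOME/.local/share/piper/voices/en_GB-cori-high.onnx"
LENGTH_SCALE="1.0"
NOISE_SCALE="0.667"
SILENCE_MS="300"
MP3_Q="2"

# Convert milliseconds to seconds with three decimals (300 -> 0.300).
ms_to_seconds() {
  awk -v ms="$1" 'BEGIN { printf "%.3f\n", ms / 1000 }'
}

# Print (do not run) commands that these settings could map onto.
show_commands() {
  # Synthesis: LENGTH_SCALE and NOISE_SCALE are passed straight to Piper.
  echo "piper --model $MODEL_FILE --length_scale $LENGTH_SCALE --noise_scale $NOISE_SCALE --output_file chunk_0001.wav < chunk_0001.txt"
  # Gap between chunks: a silent WAV of SILENCE_MS milliseconds (assumed 22050 Hz mono).
  echo "ffmpeg -f lavfi -i anullsrc=r=22050:cl=mono -t $(ms_to_seconds "$SILENCE_MS") silence.wav"
  # Final encode: MP3_Q is the LAME VBR quality level.
  echo "ffmpeg -i book.wav -codec:a libmp3lame -q:a $MP3_Q book.mp3"
}

show_commands
```

Printing the commands instead of running them keeps the example safe to try on a machine without Piper installed.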
---
## 🎤 Changing the Voice
This script defaults to:
* **Female UK English** — `en_GB-cori-high`
To change:
1. Visit the official Piper voice list:
[https://github.com/rhasspy/piper/blob/master/VOICES.md](https://github.com/rhasspy/piper/blob/master/VOICES.md)
2. Copy the `.onnx` and `.onnx.json` URLs for your chosen voice.
3. Edit the **CONFIG** section in `make_audiobook.sh` with the new:
* `MODEL_NAME`
* `MODEL_ONNX_URL`
* `MODEL_JSON_URL`
Example for **Female US English (Lessac)**:
```bash
MODEL_NAME="en_US-lessac-high"
MODEL_ONNX_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/high/en_US-lessac-high.onnx"
MODEL_JSON_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/high/en_US-lessac-high.onnx.json"
```
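The two URLs above follow a regular pattern, so they can be derived from the voice name. The helpers below are a sketch based only on that example: `voice_url` and `download_voice` are hypothetical names, and the assumption that every voice follows the `family/locale/voice/quality` layout should be checked against VOICES.md before relying on it.

```bash
# Hypothetical helper: build the .onnx URL from a name like en_US-lessac-high.
voice_url() {
  local name="$1"
  local locale="${name%%-*}"     # en_US
  local quality="${name##*-}"    # high
  local voice="${name#*-}"       # lessac-high
  voice="${voice%-*}"            # lessac
  local family="${locale%%_*}"   # en
  printf 'https://huggingface.co/rhasspy/piper-voices/resolve/main/%s/%s/%s/%s/%s.onnx\n' \
    "$family" "$locale" "$voice" "$quality" "$name"
}

# Hypothetical helper: fetch the model and its JSON metadata.
# The default destination is the directory used in this README's examples.
download_voice() {
  local name="$1"
  local dest="${2:-$HOME/.local/share/piper/voices}"
  local url
  url="$(voice_url "$name")"
  mkdir -p "$dest"
  curl -fL -o "$dest/$name.onnx" "$url"
  curl -fL -o "$dest/$name.onnx.json" "$url.json"
}

# Example:
# download_voice en_US-lessac-high
```

The `-f` flag makes `curl` fail on an HTTP error instead of saving an error page as the model file.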
---
## 🛠 Advanced Usage
Specify a custom voice model:
```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt \
--model ~/.local/share/piper/voices/en_US-lessac-high.onnx
```
Name the output file:
```bash
~/Documents/GitHub/Piper-TTS-Script/make_audiobook.sh ~/Documents/audiobooks/mybook.txt \
--name my_custom_audiobook.mp3
```
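For readers curious how options like these are usually handled in a shell script, here is a minimal sketch of the common `while`/`case` parsing pattern. It is not the script's actual code: `parse_args` and the variables `INPUT` and `OUT_NAME` are hypothetical, and this README does not state whether `--model` and `--name` can be combined in one call.

```bash
# Hypothetical option parsing; the real script's internals may differ.
parse_args() {
  INPUT="" MODEL_FILE="" OUT_NAME=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --model)
        [ "$#" -ge 2 ] || { echo "--model needs a path" >&2; return 1; }
        MODEL_FILE="$2"; shift 2 ;;
      --name)
        [ "$#" -ge 2 ] || { echo "--name needs a file name" >&2; return 1; }
        OUT_NAME="$2"; shift 2 ;;
      -*)
        echo "unknown option: $1" >&2; return 1 ;;
      *)
        INPUT="$1"; shift ;;
    esac
  done
  [ -n "$INPUT" ] || { echo "usage: make_audiobook.sh FILE.txt [--model PATH] [--name FILE.mp3]" >&2; return 1; }
  # Default output name: the input's base name with .mp3 instead of .txt.
  [ -n "$OUT_NAME" ] || OUT_NAME="$(basename "${INPUT%.txt}").mp3"
}

# Example:
# parse_args ~/Documents/audiobooks/mybook.txt --name my_custom_audiobook.mp3
# echo "$INPUT -> $OUT_NAME"
```

Rejecting unknown options early avoids silently treating a mistyped flag as the input file.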
---
## ⚠️ Limitations
* For **extremely large books** (>200k characters), processing can still take a while even with chunking.
* If the voice model URLs change, you’ll need to update them from [VOICES.md](https://github.com/rhasspy/piper/blob/master/VOICES.md).
* Piper’s pronunciation is very good but not perfect; some technical or foreign words may be read oddly.
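To gauge how much work a book represents before starting, the character count and the chunk size give a quick estimate. `estimate_chunks` below is a hypothetical helper, and the result is a rough lower bound, since chunks cut at word or sentence boundaries are usually a little shorter than `MAX_CHARS`.

```bash
# Hypothetical helper: estimate how many chunks a text file will produce.
estimate_chunks() {
  local file="$1" max_chars="$2" chars
  chars="$(wc -m < "$file")"
  # Ceiling division: a partial final chunk still counts as one chunk.
  echo $(( (chars + max_chars - 1) / max_chars ))
}

# Example:
# estimate_chunks ~/Documents/audiobooks/mybook.txt 1000
```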
---
## 📄 License
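This script is released under the MIT License. Piper is licensed separately: the original rhasspy/piper repository uses the MIT License, while its successor, OHF-Voice/piper1-gpl, uses GPL-3.0, so check the license of the version you install.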
---
## 🙋‍♂️ Credits
* [Piper TTS](https://github.com/rhasspy/piper) — Open-source text-to-speech engine
* [ffmpeg](https://ffmpeg.org) — Audio conversion
* Script by **Your Name**