Simple TTS
A simple machine learning text-to-speech program powered by kokoro.
Warning
This has currently only been tested on Ubuntu-based distros, because of the required system packages.
Setup
Clone repo and go into the directory
$ git clone https://git.ayo.run/ayo/simple-tts
$ cd simple-tts
Create a new environment. Here I use conda.
$ conda create -n tts
### for Intel XPU specific device usage:
$ conda create -n tts --clone llm-pt26
Important
To use Intel XPUs, you need to set up an ipex-llm environment with PyTorch 2.6. Also see the "Intel XPU environmental variables" section below.
Activate the environment and install the dependencies
$ conda activate tts
$ python -m pip install -r requirements.txt
Required packages
The following packages are required in addition to the Python dependencies. espeak-ng is used by kokoro under the hood for English, and libvlc is used as the default audio player for the generated audio.
$ sudo apt update
$ sudo apt install vlc espeak-ng
Note
Installing vlc via flatpak or snap will not work, as the code needs access to libvlc.
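To confirm both system packages are reachable before running the program, a quick check along these lines may help. This is a minimal sketch and not part of tts.py: the function name check_system_packages is hypothetical, and it assumes espeak-ng is exposed as a binary on PATH and libvlc as a shared library the dynamic loader can find.

```python
import ctypes.util
import shutil


def check_system_packages():
    """Report whether espeak-ng and libvlc look reachable from this Python process."""
    return {
        # espeak-ng ships a command-line binary, so look for it on PATH
        "espeak-ng": shutil.which("espeak-ng") is not None,
        # libvlc is a shared library, so ask the dynamic loader for it
        "libvlc": ctypes.util.find_library("vlc") is not None,
    }


if __name__ == "__main__":
    for name, found in check_system_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If libvlc is reported missing even though VLC is installed, the flatpak or snap packaging mentioned above is a likely cause.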
Intel XPU environmental variables
For XPUs, we need to set some environment variables. I have added an env.sh script which activates the conda environment tts and sets those variables.
$ . env.sh
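One way to check that PyTorch can actually see the XPU after sourcing env.sh is sketched below. It is an illustration only: pick_device is a hypothetical helper, not something tts.py is known to contain, and it assumes a PyTorch build that exposes torch.xpu (it falls back to "cpu" otherwise).

```python
def pick_device():
    """Return "xpu" when PyTorch can see an Intel XPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment
    xpu = getattr(torch, "xpu", None)  # older PyTorch builds have no torch.xpu
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


if __name__ == "__main__":
    print(f"device: {pick_device()}")
```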
Usage
Go into the directory and activate the environment:
$ cd simple-tts
$ conda activate tts
If using Intel XPUs, set the environment variables:
$ . env.sh
Running the program without arguments will use the demo text tongue-twister.txt with the default voice.
$ python tts.py # will use default arguments
Providing text inputs
You can pass a string as first argument:
$ python tts.py "Hello world!" # will be read by the default voice
To run the program with an input file, use the --input_file flag.
$ python tts.py --input_file demo/tongue-twister.txt
# or shorter...
$ python tts.py -i demo/tongue-twister.txt
You can also use the text stored in your clipboard (i.e., copied text). Select text from anywhere (e.g., your web browser), copy it with Ctrl+C or the context menu, then use the --clipboard flag:
$ python tts.py --clipboard
# or shorter...
$ python tts.py -c
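The three input options above could be wired up with argparse roughly as follows. This is a sketch under assumptions, not the actual tts.py code: the flag names come from this README, but build_parser, resolve_text, the read_clipboard callable, and the precedence order (string, then clipboard, then file, then demo text) are all hypothetical.

```python
import argparse
from pathlib import Path

DEFAULT_INPUT = Path("demo/tongue-twister.txt")  # demo text named in this README


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the text-input options")
    parser.add_argument("text", nargs="?", help="text to read aloud")
    parser.add_argument("-i", "--input_file", type=Path, help="read text from a file")
    parser.add_argument("-c", "--clipboard", action="store_true",
                        help="read text from the clipboard")
    return parser


def resolve_text(args, read_clipboard):
    """Pick the text source. The precedence order here is an assumption."""
    if args.text:
        return args.text
    if args.clipboard:
        return read_clipboard()  # hypothetical callable supplied by the caller
    path = args.input_file or DEFAULT_INPUT
    return path.read_text(encoding="utf-8")
```

Passing the clipboard reader in as a callable keeps the sketch independent of whichever clipboard library the program really uses.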
Voices
Optionally, you can choose the voice you want to use with the --voice flag. See all voices available.
$ python tts.py --voice am_michael
# or shorter...
$ python tts.py -v am_michael
There are four shortcuts to the best-trained voices: pro, hot, asmr, and brit. pro is the default if no value is given.
$ python tts.py "Hello there!" --voice pro # af_heart
$ python tts.py "Hello there!" --voice hot # af_bella
$ python tts.py "Hello there!" --voice asmr # af_nicole
$ python tts.py "Hello there!" --voice brit # bf_emma
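The shortcut table above amounts to a small lookup with a pass-through for full voice names. A minimal sketch, assuming a hypothetical resolve_voice helper (the mapping itself is taken from the commands above; how tts.py implements it is not shown here):

```python
# Shortcut -> kokoro voice name, as listed in this README
VOICE_SHORTCUTS = {
    "pro": "af_heart",
    "hot": "af_bella",
    "asmr": "af_nicole",
    "brit": "bf_emma",
}
DEFAULT_VOICE = "pro"


def resolve_voice(name=None):
    """Expand a shortcut to its voice name; pass any other name through unchanged."""
    name = name or DEFAULT_VOICE
    return VOICE_SHORTCUTS.get(name, name)
```

For example, resolve_voice("brit") gives "bf_emma", while a full name such as "am_michael" is returned as-is.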
Disable audio player
You can disable the built-in audio player with --skip_play if you prefer to play the generated audio files with your own player.
$ python tts.py "Hello there!" --voice asmr --skip_play
# or shorter...
$ python tts.py "Hello there!" --voice asmr -s
Demo Outputs
Voice: pro (af_heart)
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_heart-0.wav
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_heart-1.wav
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_heart-2.wav
Voice: asmr (af_nicole)
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_nicole-0.wav
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_nicole-1.wav
https://git.ayo.run/ayo/simple-tts/src/branch/main/demo/tongue-twister-af_nicole-2.wav
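The demo file names suggest a pattern of input file stem, voice name, and segment index. The sketch below reproduces that pattern; it is inferred from the file names alone, output_name is a hypothetical helper, and how tts.py names files for string or clipboard input is not covered.

```python
from pathlib import Path


def output_name(input_path, voice, index):
    """Build '<input stem>-<voice>-<index>.wav', matching the demo file names."""
    return f"{Path(input_path).stem}-{voice}-{index}.wav"


if __name__ == "__main__":
    for i in range(3):
        print(output_name("demo/tongue-twister.txt", "af_heart", i))
```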