DNN-TTS-ContVoc
Deep Neural Network-based Text-To-Speech Synthesis using a Continuous Vocoder
This repository contains a TTS system based on the Continuous vocoder developed at the Speech Technology and Smart Interactions Laboratory (SmartLab), Budapest University of Technology and Economics. Unlike other traditional statistical parametric vocoders, the continuous model focuses on extracting continuous parameters:
- Fundamental Frequency (F0)
- Maximum Voiced Frequency (MVF)
- Mel-Generalized Cepstrum (MGC).
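To make the three parameter streams concrete, here is a minimal sketch of how one analysis frame could be held in memory and flattened into a feature vector. The `ContinuousFrame` class, the `stack_frames` helper, the MGC length of 4, and all numeric values are assumptions made for illustration; this is not the repository's actual data layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContinuousFrame:
    """One analysis frame of the three parameter streams (hypothetical container)."""
    f0_hz: float       # Fundamental Frequency (F0)
    mvf_hz: float      # Maximum Voiced Frequency (MVF)
    mgc: List[float]   # Mel-Generalized Cepstrum (MGC) coefficients

    def to_vector(self) -> List[float]:
        """Concatenate the streams into one flat vector: [F0, MVF, MGC...]."""
        return [self.f0_hz, self.mvf_hz, *self.mgc]


def stack_frames(frames: List[ContinuousFrame]) -> List[List[float]]:
    """Turn a sequence of frames into a (num_frames x feature_dim) nested list."""
    orders = {len(frame.mgc) for frame in frames}
    if len(orders) > 1:
        raise ValueError("all frames must use the same MGC order")
    return [frame.to_vector() for frame in frames]


# Illustrative values only; the MGC length of 4 is an assumption for brevity.
frames = [
    ContinuousFrame(f0_hz=120.0, mvf_hz=4000.0, mgc=[0.1, -0.2, 0.05, 0.0]),
    ContinuousFrame(f0_hz=122.5, mvf_hz=3800.0, mgc=[0.1, -0.1, 0.04, 0.01]),
]
features = stack_frames(frames)
print(len(features), len(features[0]))
```

Each row of `features` holds two scalar parameters followed by the MGC coefficients, so the feature dimension is the MGC length plus two.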
Install & Run: Make sure you have installed the Python dependencies (pip install -r requirements.txt) and compiled the tools (bash tools/compile_tools.sh).
Additional information: For a given English input sentence, users can select one of the two voices in the current set (male or female) to build their custom voice model. Users can also specify the neural network topology to be trained (LSTM, BLSTM, or GRU) and the number of hidden layers. The speech synthesis model in this system has few parameters and is computationally lightweight, which makes it suitable for real-time operation.
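The user-facing choices described above (voice, topology, number of hidden layers) could be captured in a small validated settings object. The sketch below is hypothetical: `TrainingChoice`, its field names, and the validation rules are assumptions for illustration and do not reflect the repository's actual configuration format.

```python
from dataclasses import dataclass

# Options named in the description above.
VOICES = ("male", "female")
TOPOLOGIES = ("LSTM", "BLSTM", "GRU")


@dataclass(frozen=True)
class TrainingChoice:
    """Hypothetical container for the options a user selects before training."""
    voice: str
    topology: str
    hidden_layers: int

    def __post_init__(self):
        # Reject values outside the documented options.
        if self.voice not in VOICES:
            raise ValueError(f"voice must be one of {VOICES}, got {self.voice!r}")
        if self.topology not in TOPOLOGIES:
            raise ValueError(f"topology must be one of {TOPOLOGIES}, got {self.topology!r}")
        if self.hidden_layers < 1:
            raise ValueError("hidden_layers must be at least 1")


# Example selection; the layer count of 4 is an arbitrary illustrative value.
choice = TrainingChoice(voice="female", topology="BLSTM", hidden_layers=4)
print(choice)
```

Validating at construction time means an unsupported topology or voice is caught before any training starts, rather than partway through a run.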
External links:
- GitHub repository: https://github.com/malradhi/conTTS