Charles University
A 4-hour tutorial presented in October 2021 at ACAI 2021 on Multimodal Perception and Interaction with Transformers
Speech recognition translates spoken information into digital text in real time
We present the training process of an end-to-end neural network-based TTS model and its optimization
AI Drummer that responds in real time to the playing of a human pianist.
Deep Neural Network-based Text-To-Speech Synthesis using a Continuous Vocoder