Text-to-speech (TTS) is a form of speech synthesis: a system that converts text into spoken voice output. TTS systems were first used in reading systems for the blind, in which the system reads text from a book aloud by converting it into speech. Other TTS applications include voice-enabled e-mail and spoken prompts in voice response systems. TTS is often used together with voice recognition programs.
A TTS system is built by creating a database of recorded speech, with units ranging from whole sentences down to syllables. The recordings are stored, sorted, labeled, and segmented by phones, syllables, morphemes, words, phrases, and sentences. To reproduce the words of a text, the system first carries out linguistic analysis and natural language processing to understand the structure of each sentence and to determine the context of each word, which governs its pronunciation. It then matches the text against the database of speech units to produce speech that fits the text input.
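The matching step can be illustrated with a minimal sketch. It assumes a toy inventory keyed by text (the names UNIT_DB, normalize, and select_units, and the clip file names, are hypothetical and not taken from any real TTS system) and uses a greedy longest-match strategy, preferring a recorded phrase over its individual words. A real system would also segment down to syllables and phones and would weigh pronunciation context, which this sketch omits.

```python
import re

# Hypothetical unit inventory: text unit -> label of a recorded clip.
# A real database would also hold syllable- and phone-level units.
UNIT_DB = {
    "good morning": "phrase_001.wav",
    "good": "word_010.wav",
    "morning": "word_011.wav",
    "every": "word_012.wav",
    "one": "word_013.wav",
}


def normalize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def select_units(text, db=UNIT_DB):
    """Greedily cover the text with the longest units found in the database."""
    words = normalize(text)
    selected = []
    i = 0
    while i < len(words):
        # Try the longest remaining span of words first, then shorter ones.
        for j in range(len(words), i, -1):
            key = " ".join(words[i:j])
            if key in db:
                selected.append(db[key])
                i = j
                break
        else:
            # No recorded unit covers this word; flag it instead of guessing.
            selected.append(f"<missing:{words[i]}>")
            i += 1
    return selected


print(select_units("Good morning, everyone!"))
print(select_units("Good one"))
```

In the first call the phrase-level recording is chosen over the two word-level clips, and the word with no recording is flagged as missing. Concatenating the selected clips in order would yield the spoken output.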
The Main Principles of Text-to-Speech Synthesis System
U.R. Aida–Zade, C. Ardil and A.M. Sharifova