Patent attributes
A multilingual text-to-speech synthesis method and system are disclosed. The method includes receiving an articulatory feature of a speaker regarding a first language, receiving an input text of a second language, and generating output speech data for the input text of the second language that simulates the speaker's speech by inputting the input text of the second language and the articulatory feature of the speaker regarding the first language to a single artificial neural network multilingual text-to-speech synthesis model. The single artificial neural network multilingual text-to-speech synthesis model is generated by learning similarity information between phonemes of the first language and phonemes of the second language based on a first learning data of the first language and a second learning data of the second language.