US Patent 12087275 Neural-network-based text-to-speech model for novel speaker generation

Systems and methods for text-to-speech with novel speakers can take text data as input and produce audio data as output. The text data may be accompanied by one or more speaker preferences, which can include speaker characteristics. A machine-learned model conditioned on a learned prior distribution can process the speaker preferences to determine a speaker embedding. The speaker embedding can then be processed together with the text data to generate audio data descriptive of the text spoken by a novel speaker.
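The flow described in the abstract (speaker preferences, then a speaker embedding drawn from a prior, then audio generated from text plus embedding) can be illustrated with a toy sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `SpeakerPreferences` fields, the diagonal-Gaussian form of the prior, the seeded stand-in for learned parameters, and the sine-tone `synthesize` placeholder that takes the place of a neural decoder.

```python
import math
import random
from dataclasses import dataclass

EMBED_DIM = 4  # toy size, chosen only for illustration


@dataclass(frozen=True)
class SpeakerPreferences:
    # Hypothetical characteristics; the abstract does not enumerate them.
    gender: str
    locale: str


class ConditionalPrior:
    """Toy stand-in for a model conditioned on a learned prior.

    Maps preferences to the mean and standard deviation of a diagonal
    Gaussian over speaker embeddings. The parameters come from a seeded
    RNG here; in a real system they would be learned from data.
    """

    def __init__(self, embed_dim=EMBED_DIM):
        self.embed_dim = embed_dim

    def params(self, prefs):
        # Deterministic pseudo-parameters keyed by the preferences.
        rng = random.Random(f"{prefs.gender}|{prefs.locale}")
        mean = [rng.uniform(-1.0, 1.0) for _ in range(self.embed_dim)]
        std = [rng.uniform(0.1, 0.5) for _ in range(self.embed_dim)]
        return mean, std

    def sample(self, prefs, rng):
        # Each draw is a different point in embedding space, i.e. a
        # speaker that need not match any speaker seen in training.
        mean, std = self.params(prefs)
        return [rng.gauss(m, s) for m, s in zip(mean, std)]


def synthesize(text, embedding, sample_rate=8000, samples_per_char=400):
    """Placeholder decoder: one sine tone per character.

    The first embedding coordinate shifts the base pitch, so different
    embeddings give audibly different "voices" for the same text.
    """
    base_hz = 120.0 + 40.0 * embedding[0]
    audio = []
    for ch in text:
        freq = base_hz + (ord(ch) % 32) * 5.0
        audio.extend(
            math.sin(2.0 * math.pi * freq * t / sample_rate)
            for t in range(samples_per_char)
        )
    return audio


prefs = SpeakerPreferences(gender="female", locale="en-US")
prior = ConditionalPrior()
voice_a = prior.sample(prefs, random.Random(0))
voice_b = prior.sample(prefs, random.Random(1))
audio = synthesize("hello", voice_a)
```

In this sketch, sampling twice with the same preferences yields two different embeddings from the same conditional distribution, which is one plausible reading of how a prior lets a system produce speakers that share requested characteristics without copying a known voice.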

