DeepMind's WaveNet is a convolutional neural network (CNN), a type of feedforward neural network composed of layers of interconnected nodes. It takes a raw audio signal as input and synthesizes an output waveform one sample at a time. The trained network generates new speech-like waveforms at 16,000 samples per second, and the output includes realistic breaths and lip smacks.
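The WaveNet paper builds its network from stacked dilated causal convolutions, in which each output sample depends only on the current and earlier input samples. The following is a minimal pure-Python sketch of that idea, not DeepMind's implementation: the function names, the two-tap all-ones filter, and the dilation schedule 1, 2, 4, 8 are assumptions chosen for illustration, and a real model learns its weights and adds gated activations and residual connections.

```python
def causal_dilated_conv(signal, weights, dilation):
    """1-D causal convolution: out[t] uses signal[t], signal[t - dilation], ..."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start are treated as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Count of input samples (current plus past) that can affect one output."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


SAMPLE_RATE = 16_000          # samples per second, as stated in the text
DILATIONS = [1, 2, 4, 8]      # illustrative schedule (assumption)
WEIGHTS = [1.0, 1.0]          # illustrative fixed filter (assumption)


def run_stack(signal):
    """Pass a signal through one causal layer per dilation."""
    for d in DILATIONS:
        signal = causal_dilated_conv(signal, WEIGHTS, d)
    return signal


# A unit impulse at t = 5 shows which later samples it can influence.
impulse = [0.0] * 32
impulse[5] = 1.0
response = run_stack(impulse)

field = receptive_field(len(WEIGHTS), DILATIONS)
field_ms = 1000.0 * field / SAMPLE_RATE
print(f"receptive field: {field} samples ({field_ms} ms at {SAMPLE_RATE} Hz)")
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth while the number of layers grows only linearly, which is what makes modelling audio at 16,000 samples per second tractable for a convolutional network.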
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, Mohammad Norouzi
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu