Patent attributes
A method, system, and computer program product are provided for classifying spoken audio content with a cognitive audio classifier. A set of distorted audio resources is passed through a set of speech-to-text models STTi (STT1 . . . STTn) to obtain, from the transcript produced by each speech-to-text model STTi, a set of interference coherence scores. From these scores, a measured baseline Mi (M1 . . . Mn) is generated, along with a practical baseline Pi (P1 . . . Pn) that is associated with a coherence matrix for the audio effects AEj (AE1 . . . AEk) that were used to generate the distorted audio resources. The result serves as training data for the cognitive audio classifier, which classifies input spoken audio content to measure the quality of vocabulary elements detected in that content under the set of audio distortion effects for each speech-to-text model STTi.
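The coherence matrix described above, with one row per speech-to-text model STTi and one column per audio effect AEj, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the "audio" is a toy list of words rather than a waveform, the two effects and two models are hypothetical stand-ins, and `coherence_score` is a simple word-overlap proxy, not the interference coherence score of the disclosure. The baselines Mi and Pi are not defined in this abstract, so the sketch leaves them out.

```python
from typing import Callable, Dict, List

# Toy stand-in (assumption): "audio" is just the list of words that were spoken.
Audio = List[str]
Effect = Callable[[Audio], Audio]       # hypothetical audio effect AE_j
SttModel = Callable[[Audio], List[str]]  # hypothetical speech-to-text model STT_i


def coherence_score(reference: List[str], transcript: List[str]) -> float:
    """Hypothetical proxy: fraction of reference words found in the transcript."""
    if not reference:
        return 0.0
    heard = set(transcript)
    return sum(1 for word in reference if word in heard) / len(reference)


def build_coherence_matrix(
    audio: Audio,
    reference: List[str],
    stt_models: Dict[str, SttModel],
    audio_effects: Dict[str, Effect],
) -> Dict[str, Dict[str, float]]:
    """Score every (STT_i, AE_j) pair: distort, transcribe, compare to reference."""
    matrix: Dict[str, Dict[str, float]] = {}
    for stt_name, stt in stt_models.items():
        matrix[stt_name] = {}
        for ae_name, effect in audio_effects.items():
            distorted = effect(audio)       # distorted audio resource
            transcript = stt(distorted)     # transcript from STT_i
            matrix[stt_name][ae_name] = coherence_score(reference, transcript)
    return matrix


# Hypothetical effects: one leaves the audio intact, one drops every other word.
def no_effect(audio: Audio) -> Audio:
    return list(audio)


def drop_every_other(audio: Audio) -> Audio:
    return audio[::2]


# Hypothetical models: one transcribes perfectly, one misses short words.
def stt_exact(audio: Audio) -> List[str]:
    return list(audio)


def stt_misses_short_words(audio: Audio) -> List[str]:
    return [word for word in audio if len(word) > 3]


reference = "please turn the kitchen lights off".split()
matrix = build_coherence_matrix(
    audio=reference,
    reference=reference,
    stt_models={"STT1": stt_exact, "STT2": stt_misses_short_words},
    audio_effects={"AE1": no_effect, "AE2": drop_every_other},
)

# Flatten the matrix into (model, effect, score) rows usable as training data.
training_rows = [
    (stt_name, ae_name, score)
    for stt_name, row in matrix.items()
    for ae_name, score in row.items()
]
```

Each entry `matrix[STTi][AEj]` records how well one model's transcript survives one distortion, and the flattened rows are one plausible shape for the training data that a downstream classifier would consume.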