Embodiments described herein provide for end-to-end joint determination of degradation parameter scores for certain types of degradation. Degradation parameters include degradation describing additive noise and multiplicative noise such as Signal-to-Noise Ratio (SNR), reverberation time (T60), and Direct-to-Reverberant Ratio (DRR). Various neural network architectures are described such that the inherent interplay between the degradation parameters is considered in both the degradation parameter score and degradation score determination. The neural network architectures are trained according to computer generated audio datasets.