1 Department of Electronic Systems, The Technical Faculty of IT and Design, Aalborg University; 2 The Faculty of Engineering and Science (TECH), Aalborg University; 3 Signal and Information Processing, The Technical Faculty of IT and Design, Aalborg University; 4 Leiden University Medical Center
This paper deals with the problem of predicting the average intelligibility of noisy and potentially processed speech signals, as observed by a group of normal-hearing listeners. We propose a model that performs this prediction based on the hypothesis that intelligibility is monotonically related to the mutual information between the critical-band amplitude envelopes of the clean signal and those of the corresponding noisy/processed signal. The resulting intelligibility predictor turns out to be a simple function of the mean-square error (MSE) that arises when estimating a clean critical-band amplitude using a minimum mean-square error (MMSE) estimator based on the noisy/processed amplitude. The proposed model predicts that speech intelligibility cannot be improved by any processing of noisy critical-band amplitudes. Furthermore, the proposed intelligibility predictor performs well (ρ > 0.95) in predicting the intelligibility of speech signals contaminated by additive noise and potentially non-linearly processed using time-frequency weighting.
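The link between estimation error and mutual information described in the abstract can be illustrated with a minimal sketch. It is not the paper's predictor: it assumes the clean and noisy amplitudes are jointly Gaussian, in which case the MMSE estimator is linear, the normalized MSE equals 1 − r² (r being the correlation between the two amplitudes), and the mutual information is −½ log₂(1 − r²) bits. The function names `gaussian_mi_from_mse` and `demo`, the synthetic Gaussian data, and the SNR values are all hypothetical and chosen for illustration only.

```python
import math
import random


def gaussian_mi_from_mse(clean, noisy):
    """Return (normalized_mse, mi_bits) for two equal-length sequences.

    Assumption (not from the paper): the samples are jointly Gaussian,
    so the MMSE estimator is linear and its normalized MSE is 1 - r^2.
    """
    n = len(clean)
    mean_x = sum(clean) / n
    mean_y = sum(noisy) / n
    var_x = sum((x - mean_x) ** 2 for x in clean) / n
    var_y = sum((y - mean_y) ** 2 for y in noisy) / n
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(clean, noisy)) / n

    # Squared correlation coefficient between clean and noisy samples.
    r_sq = cov_xy ** 2 / (var_x * var_y)

    # MSE of the linear MMSE estimator, normalized by the clean variance.
    normalized_mse = 1.0 - r_sq
    if normalized_mse <= 0.0:
        # Perfectly correlated inputs: mutual information is unbounded.
        return 0.0, math.inf

    # Gaussian mutual information as a function of the normalized MSE.
    mi_bits = -0.5 * math.log2(normalized_mse)
    return normalized_mse, mi_bits


def demo(snr_db, n=20000, seed=0):
    """Simulate unit-variance 'clean' samples plus additive Gaussian noise."""
    rng = random.Random(seed)
    noise_std = 10 ** (-snr_db / 20)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noisy = [x + noise_std * rng.gauss(0.0, 1.0) for x in clean]
    return gaussian_mi_from_mse(clean, noisy)


if __name__ == "__main__":
    for snr_db in (-10, 0, 10):
        mse, mi = demo(snr_db)
        print(f"SNR {snr_db:+d} dB: normalized MSE {mse:.3f}, MI {mi:.3f} bits")
```

Under this Gaussian assumption, a lower normalized MSE always corresponds to a higher mutual information, which mirrors the monotonic relation hypothesized in the abstract.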
IEEE Transactions on Audio, Speech, and Language Processing, 2014, Vol. 22, Issue 2, pp. 430-440