This paper deals with the problem of predicting the average intelligibility of noisy and potentially processed speech signals, as observed by a group of normal hearing listeners. We propose a model which performs this prediction based on the hypothesis that intelligibility is monotonically related to the mutual information between critical-band amplitude envelopes of the clean signal and the corresponding noisy/processed signal. The resulting intelligibility predictor turns out to be a simple function of the mean-square error (mse) that arises when estimating a clean critical-band amplitude using a minimum mean-square error (mmse) estimator based on the noisy/processed amplitude. The proposed model predicts that speech intelligibility cannot be improved by any processing of noisy critical-band amplitudes. Furthermore, the proposed intelligibility predictor performs well ( ρ > 0.95) in predicting the intelligibility of speech signals contaminated by additive noise and potentially non-linearly processed using time-frequency weighting.
I E E E Transactions on Audio, Speech and Language Processing, 2014, Vol 22, Issue 2, p. 430-440