1 The Faculty of Engineering and Science (TECH), Aalborg University, VBN
2 Department of Architecture, Design and Media Technology, The Faculty of Engineering and Science (ENG), Aalborg University, VBN
3 Audio Analysis Lab, The Faculty of Engineering and Science (ENG), Aalborg University, VBN
4 Northwestern Polytechnical University Xian
5 Northwestern Polytechnical University Xian
This paper studies and analyzes maximum signal-to-noise ratio (SNR) filters for noise reduction in both the time and short-time Fourier transform (STFT) domains, with a single microphone and with multiple microphones. In the time domain, we show that the maximum SNR filters can significantly increase the SNR, but at the expense of tremendous speech distortion. As a consequence, the speech quality improvement, measured by the perceptual evaluation of speech quality (PESQ) algorithm, is marginal, if any, regardless of the number of microphones used. In the STFT domain, the maximum SNR filters are formulated by considering the interframe information in every frequency band. It is found that these filters not only improve the SNR but also improve the speech quality significantly. As the number of input channels increases, so do the gain in SNR and the speech quality. This demonstrates that the maximum SNR filters, particularly the multichannel ones, in the STFT domain may be of great practical value.
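The abstract does not spell out how a maximum SNR filter is computed. A common textbook formulation, assumed here rather than taken from the paper, picks the filter h that maximizes the ratio h^T R_x h / h^T R_v h, where R_x and R_v are the speech and noise correlation matrices; the maximizer is the generalized eigenvector associated with the largest generalized eigenvalue. The sketch below illustrates this in the time domain with NumPy. The function names (`max_snr_filter`, `output_snr`) and the synthetic matrices are hypothetical and serve only as a toy illustration.

```python
import numpy as np

def max_snr_filter(R_x, R_v):
    """Return (h, lam): a filter maximizing h^T R_x h / h^T R_v h
    and the output SNR it achieves (assumed textbook formulation)."""
    # Factor the noise correlation matrix: R_v = L L^T (R_v must be positive definite).
    L = np.linalg.cholesky(R_v)
    L_inv = np.linalg.inv(L)
    # Whitened speech correlation matrix; symmetrize to suppress round-off.
    A = L_inv @ R_x @ L_inv.T
    A = 0.5 * (A + A.T)
    # eigh returns eigenvalues in ascending order, so the last one is the largest.
    eigvals, eigvecs = np.linalg.eigh(A)
    u = eigvecs[:, -1]
    # Map the eigenvector back to the original (unwhitened) coordinates.
    h = L_inv.T @ u
    return h, eigvals[-1]

def output_snr(h, R_x, R_v):
    """Output SNR of filter h for the given speech and noise statistics."""
    return (h @ R_x @ h) / (h @ R_v @ h)

# Synthetic statistics (hypothetical, not data from the paper).
rng = np.random.default_rng(0)
n = 8                                   # filter length
B = rng.standard_normal((n, 2))
R_x = B @ B.T                           # low-rank "speech" correlation matrix
C = rng.standard_normal((n, n))
R_v = C @ C.T / n + 0.1 * np.eye(n)     # full-rank "noise" correlation matrix

h, lam = max_snr_filter(R_x, R_v)
input_snr = np.trace(R_x) / np.trace(R_v)
print("input SNR :", input_snr)
print("output SNR:", output_snr(h, R_x, R_v))
```

The filter is determined only up to a scale factor, and nothing in this criterion constrains how the desired signal itself is altered, which is consistent with the speech distortion the abstract reports for the time-domain case.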
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, Vol. 22, Issue 12, p. 2034-2047