1 Sektion Aalborg, The Faculty of Engineering and Science (ENG), Aalborg University
2 Media Technology, The Faculty of Engineering and Science (ENG), Aalborg University
3 Audio Analysis Lab, The Faculty of Engineering and Science (ENG), Aalborg University
4 Department of Architecture, Design and Media Technology, The Faculty of Engineering and Science (ENG), Aalborg University
5 The Faculty of Engineering and Science (TECH), Aalborg University
Speech enhancement and separation algorithms sometimes employ a two-stage processing scheme, wherein the signal is first mapped to an intermediate low-dimensional parametric description, after which the parameters are mapped to vectors in codebooks trained on, for example, individual noise-free sources using a vector quantizer. To obtain accurate parameters, one must employ a good estimator, such as a maximum likelihood estimator, to find the parameters of the intermediate representation. This leaves some questions unanswered, however, such as which metrics to use in the subsequent vector quantization process and how to derive them systematically. This paper aims to answer these questions. Metrics for this purpose are presented and derived, and their use is exemplified on a number of different signal models by deriving closed-form expressions. In the vector quantization process, the metrics essentially take into account that some parameters may have been estimated more accurately than others and that there may be dependencies between the estimation errors.
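As a rough illustration of why the choice of metric matters in the quantization stage, the sketch below compares a plain Euclidean distance with a quadratic-form distance weighted by a precision matrix (the inverse of the covariance of the estimation errors). This weighting is an assumption chosen for illustration, not the metrics derived in the paper, and the function names, codebook, and numbers are hypothetical.

```python
def weighted_sq_distance(theta_hat, codeword, precision):
    """Quadratic form (theta_hat - c)^T W (theta_hat - c) for a weight matrix W."""
    diff = [t - c for t, c in zip(theta_hat, codeword)]
    n = len(diff)
    return sum(diff[i] * precision[i][j] * diff[j]
               for i in range(n) for j in range(n))


def quantize(theta_hat, codebook, precision):
    """Return the index of the codeword with the smallest weighted distance."""
    distances = [weighted_sq_distance(theta_hat, c, precision) for c in codebook]
    return min(range(len(codebook)), key=distances.__getitem__)


# Hypothetical two-parameter estimate: the first parameter is assumed to be
# estimated accurately (error variance 0.01), the second poorly (variance 1.0).
theta_hat = [1.0, 0.0]
codebook = [[1.3, 0.0], [1.0, 1.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]      # plain Euclidean metric
precision = [[100.0, 0.0], [0.0, 1.0]]   # inverse of diag(0.01, 1.0)

euclidean_choice = quantize(theta_hat, codebook, identity)
weighted_choice = quantize(theta_hat, codebook, precision)
print(euclidean_choice, weighted_choice)
```

With the Euclidean metric the first codeword is nearest, whereas the weighted metric picks the second, because a deviation in the accurately estimated parameter is penalized more heavily. Nonzero off-diagonal entries in the precision matrix would likewise model dependencies between the estimation errors.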
Journal of the Acoustical Society of America, 2013, Vol. 133, Issue 5, pp. 3062-3071