Time delay estimation (TDE) is a fundamental component of speaker localization and tracking algorithms. Most of the existing systems are based on the generalized cross-correlation method assuming gaussianity of the source. It has been shown that the distribution of speech, captured with far-field microphones, is highly varying, depending on the noise and reverberation conditions. Thus the performance of TDE is expected to fluctuate depending on the underlying assumption for the speech distribution, being also subject to multi-path reflections and competitive background noise. This paper investigates the effect upon TDE when modeling the source signal with different speech-based distributions. An information theoretical TDE method indirectly encapsulating higher order statistics (HOS) formed the basis of this work. The underlying assumption of Gaussian distributed source has been replaced by that of generalized Gaussian distribution that allows evaluating the problem under a larger set of speech-shaped distributions, ranging from Gaussian to Laplacian and Gamma. Closed forms of the univariate and multivariate entropy expressions of the generalized Gaussian distribution are derived to evaluate the TDE. The results indicate that TDE based on the specific criterion is independent of the underlying assumption for the distribution of the source, for the same covariance matrix.
Journal of the Acoustical Society of America, 2013, Vol 133, Issue 3, p. 1515-1524