1 Department of Electrical Engineering, Technical University of Denmark
2 Hearing Systems, Department of Electrical Engineering, Technical University of Denmark
3 Centre for Applied Hearing Research, Center, Technical University of Denmark
4 Department of Information Technology, Technical University of Denmark
The main idea of the project is to build a largely speaker-independent, biologically motivated automatic speech recognition (ASR) system. Our approach differs from current state-of-the-art ASR systems in two main ways: i) the features are based on the responses of neuron-like spectro-temporal receptive fields to auditory spectrogram input, motivated by the human auditory pathway, and ii) the adaptation or learning algorithms involved are biologically inspired. This contrasts with the state-of-the-art combination of Mel-frequency cepstral coefficients and Hidden Markov Model-based adaptation procedures. Two databases are used: TI46 for discrete speech, and a subset of the TIMIT database collected from speakers belonging to the New York dialect region. In the TIMIT subset, each of 10 selected sentences is uttered once by each of 35 speakers. The major differences between the two data sets motivate the development and comparison of two distinct ASR systems within the project, which are presented in the following. Using a reduced sampling frequency and signal bandwidth, the ASR algorithm matches and exceeds the recognition performance known from human listeners.