Many different short-time features, computed over time windows of 10-30 ms, have been proposed for music segmentation, retrieval, and genre classification. However, the time frame of music available for making the actual decision or comparison (the decision time horizon) is often in the range of seconds rather than milliseconds. The problem of constructing new features on this larger time scale from the short-time features (feature integration) has received little attention. This paper investigates different methods for feature integration and late information fusion for music genre classification. A new feature integration technique, the autoregressive (AR) model, is proposed and seemingly outperforms the commonly used mean-variance features.
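To make the idea of feature integration concrete, the following is a minimal sketch, not the paper's implementation. It assumes the short-time features for one decision window are stacked in a (T, D) array, and it contrasts the mean-variance summary with a simple AR summary. The function names, the default model order, the least-squares fit, and the choice to model each feature dimension independently are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def mean_variance_features(frames):
    """Summarize a (T, D) block of short-time features by per-dimension mean and variance."""
    frames = np.asarray(frames, dtype=float)
    return np.concatenate([frames.mean(axis=0), frames.var(axis=0)])


def ar_features(frames, order=3):
    """Summarize a (T, D) block by fitting an independent AR(order) model per dimension.

    Hypothetical layout of the output, per dimension:
    [mean, a_1, ..., a_order, residual variance].
    The coefficients are estimated by ordinary least squares (an assumption;
    other estimators could be used instead).
    """
    frames = np.asarray(frames, dtype=float)
    T, D = frames.shape
    if T <= 2 * order:
        raise ValueError("need more frames than 2 * order to fit the AR model")
    out = []
    for d in range(D):
        x = frames[:, d]
        mu = x.mean()
        xc = x - mu  # remove the mean before fitting the dynamics
        # Lagged design matrix: the row for time t holds xc[t-1], ..., xc[t-order]
        X = np.column_stack([xc[order - k:T - k] for k in range(1, order + 1)])
        y = xc[order:]
        coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coefs
        out.append(np.concatenate([[mu], coefs, [resid.var()]]))
    return np.concatenate(out)
```

Unlike the mean-variance summary, which discards the ordering of the frames, the AR coefficients capture how each short-time feature evolves over the decision window. For example, roughly 1 s of frames at a 10 ms hop would give T = 100 rows as input to either function.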
IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, p. 497-500
Audio classification; Early/late information fusion; Feature integration
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005)