On the Evaluation of Music Genre Recognition Systems
A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically different state-of-the-art MGR systems, that classification accuracy does not necessarily reflect the capacity of a system to recognize genre in musical signals. We argue that a more comprehensive analysis of behavior at the level of the music is needed to address the problem of MGR, and that measuring classification accuracy obscures the aim of MGR: to select labels indistinguishable from those a person would choose.
Journal of Intelligent Information Systems, 2013, Vol. 41, Issue 3, pp. 371-406