1 Audio Analysis Lab, The Technical Faculty of IT and Design, Aalborg University
2 Department of Architecture, Design and Media Technology, The Technical Faculty of IT and Design, Aalborg University
3 Sektion København, The Technical Faculty of IT and Design, Aalborg University
4 The Faculty of Engineering and Science (TECH), Aalborg University
Most research in automatic music genre recognition has used the dataset assembled by Tzanetakis et al. in 2001. The composition and integrity of this dataset, however, have never been formally analyzed. For the first time, we provide an analysis of its composition and create a machine-readable index of artists and song titles, identifying nearly all excerpts. We also catalog numerous problems with its integrity, including replications, mislabelings, and distortion.
International Journal of Social Welfare, 2012, p. 7-12