Classification accuracy is not enough |
| |
Authors: | Bob L. Sturm |
| |
Affiliation: | 1. Audio Analysis Lab, AD:MT, Aalborg University Copenhagen, A.C. Meyers Vænge 15, 2450, Copenhagen, SV, Denmark
|
| |
Abstract: | We argue that an evaluation of system behavior at the level of the music is required to usefully address the fundamental problems of music genre recognition (MGR), and indeed other tasks of music information retrieval, such as autotagging. A recent review of works in MGR since 1995 shows that most (82 %) measure the capacity of a system to recognize genre by its classification accuracy. After reviewing evaluation in MGR, we show that neither classification accuracy, nor recall and precision, nor confusion tables, necessarily reflect the capacity of a system to recognize genre in musical signals. Hence, such figures of merit cannot be used to reliably rank, promote or discount the genre recognition performance of MGR systems if genre recognition (rather than identification by irrelevant confounding factors) is the objective. This motivates the development of a richer experimental toolbox for evaluating any system designed to intelligently extract information from music signals. |
| |
This article is indexed in SpringerLink and other databases. |
|