Download
About the MUSTER metrics
The MUsic Score Transcription Error Rate (MUSTER) metrics are edit-distance-based metrics, similar to the word error rate (WER) used for evaluating automatic speech recognition systems. Each of the six metrics evaluates a specific aspect of a musical score. These metrics are error rates; a lower value means a greater similarity between the estimated score and the ground truth.
Updates
You can download the old versions from the GitHub repository.
(2022/Jan/27) Some internal modules were updated.
(2022/Jan/18) Added an output file with details of error analysis. Some internal modules were modified.
(2021/Dec/17) Fixed some bugs.
References
The edit-distance-based metrics were first introduced in Ref. [1]. The metrics for voices are defined in Ref. [2].
[1] Eita Nakamura, Emmanouil Benetos, Kazuyoshi Yoshii, Simon Dixon, “Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization,” Proc. 43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 101-105, 2018.
[2] Yuki Hiramatsu, Eita Nakamura, Kazuyoshi Yoshii, “Joint Estimation of Note Values and Voices for Audio-to-Score Piano Transcription,” Proc. 22nd International Society for Music Information Retrieval Conference (ISMIR), 2021.