Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-28719
Publication type: Article in scientific journal
Type of review: Open peer review
Title: Real world music object recognition
Authors: Tuggener, Lukas
Emberger, Raphael
Ghosh, Adhiraj
Sager, Pascal
Satyawan, Yvan Putra
Montoya, Javier
Goldschagg, Simon
Seibold, Florian
Gut, Urs
Ackermann, Philipp
Schmidhuber, Jürgen
Stadelmann, Thilo
et al.: No
DOI: 10.5334/tismir.157
10.21256/zhaw-28719
Published in: Transactions of the International Society for Music Information Retrieval
Volume(Issue): 7
Issue: 1
Page(s): 1
Pages to: 14
Publication date: 2024
Publisher / Ed. Institution: Ubiquity Press
ISSN: 2514-3298
Language: English
Keywords: Optical music recognition; Deep learning; Data augmentation; Adversarial training; Model ensemble; Open data
Subject (DDC): 006: Special computer methods
Abstract: We present solutions to two of the most pressing issues in contemporary optical music recognition (OMR). We improve recognition accuracy on low-quality, real-world (i.e. containing ageing, lighting, or dirt artefacts, among others) input data and provide confidence-rated model outputs to enable efficient human post-processing. Specifically, we present (i) a sophisticated input augmentation scheme that can reduce the gap between sanitised benchmarks and realistic tasks through a combination of synthetic data and noisy perturbations of real-world documents; (ii) an adversarial discriminative domain adaptation method that can be employed to improve the performance of OMR systems on low-quality data; (iii) a combination of model ensembles and prediction fusion, which generates trustworthy confidence ratings for each prediction. We evaluate our contributions on a newly created test set consisting of manually annotated pages of varying real-world quality, sourced from the International Music Score Library Project (IMSLP) / the Petrucci Music Library. With the presented data augmentation scheme, we achieve a doubling in detection performance from 36.0% to 73.3% on noisy real-world data compared to state-of-the-art training. This result is then combined with robust confidence ratings, paving the way for OMR to be deployed in the real world. Additionally, we show the merits of unsupervised adversarial domain adaptation for OMR, raising the 36.0% baseline to 48.9%. All our code and data are freely available at: https://github.com/raember/s2anet/tree/TISMIR_publication.
URI: https://digitalcollection.zhaw.ch/handle/11475/28719
Related research data: https://github.com/raember/s2anet/tree/TISMIR_publication
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: School of Engineering
Organisational unit: Centre for Artificial Intelligence (CAI)
Institute of Computer Science (InIT)
Published as part of the ZHAW project: RealScore – Scanning of Real-World Sheet Music for a Digital Music Stand
Appears in collections: Publications School of Engineering
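
Illustrative sketch (not taken from the publication): contribution (iii) in the abstract combines model ensembles with prediction fusion to attach a confidence rating to each prediction. The Python below shows one simple way such a fusion could work, under loud assumptions: axis-aligned boxes, greedy matching by intersection over union (IoU), and a confidence equal to the summed member scores divided by the ensemble size. The names Detection, iou, fuse and the iou_threshold parameter are hypothetical; the authors' actual method is in the linked repository.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned (assumption)


@dataclass
class Detection:
    label: str    # symbol class, e.g. "notehead"
    box: Box
    score: float  # per-model score in [0, 1]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(ensemble_outputs: List[List[Detection]], iou_threshold: float = 0.5) -> List[Detection]:
    """Greedily cluster same-class detections across models and rate each cluster."""
    n_models = len(ensemble_outputs)
    clusters: List[List[Tuple[int, Detection]]] = []
    for model_idx, detections in enumerate(ensemble_outputs):
        for det in detections:
            for cluster in clusters:
                ref = cluster[0][1]
                # Each model may contribute at most one detection per cluster.
                seen = any(idx == model_idx for idx, _ in cluster)
                if not seen and ref.label == det.label and iou(ref.box, det.box) >= iou_threshold:
                    cluster.append((model_idx, det))
                    break
            else:
                clusters.append([(model_idx, det)])

    fused = []
    for cluster in clusters:
        members = [d for _, d in cluster]
        k = len(members)
        # Fused box: coordinate-wise mean of the cluster members.
        box = tuple(sum(d.box[i] for d in members) / k for i in range(4))
        # Models that did not fire count as score 0, so disagreement lowers confidence.
        confidence = sum(d.score for d in members) / n_models
        fused.append(Detection(members[0].label, box, confidence))
    return fused
```

In this sketch, a detection reported by only one of three models can never exceed a confidence of one third, which is the kind of signal a human post-processor could use to decide which symbols to review first.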

Files in this item:
File | Description | Size | Format
2024_Tuggener-etal_Real-world-music-object-recognition.pdf | Published Version | 2.13 MB | Adobe PDF
2023_Tuggener-etal_Real-world-music-object-recognition_TISMIR.pdf | Accepted Version | 1.07 MB | Adobe PDF
Tuggener, L., Emberger, R., Ghosh, A., Sager, P., Satyawan, Y. P., Montoya, J., Goldschagg, S., Seibold, F., Gut, U., Ackermann, P., Schmidhuber, J., & Stadelmann, T. (2024). Real world music object recognition. Transactions of the International Society for Music Information Retrieval, 7(1), 1–14. https://doi.org/10.5334/tismir.157
Tuggener, L. et al. (2024) ‘Real world music object recognition’, Transactions of the International Society for Music Information Retrieval, 7(1), pp. 1–14. Available at: https://doi.org/10.5334/tismir.157.
L. Tuggener et al., “Real world music object recognition,” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, pp. 1–14, 2024, doi: 10.5334/tismir.157.
TUGGENER, Lukas, Raphael EMBERGER, Adhiraj GHOSH, Pascal SAGER, Yvan Putra SATYAWAN, Javier MONTOYA, Simon GOLDSCHAGG, Florian SEIBOLD, Urs GUT, Philipp ACKERMANN, Jürgen SCHMIDHUBER and Thilo STADELMANN, 2024. Real world music object recognition. Transactions of the International Society for Music Information Retrieval. 2024. Vol. 7, no. 1, pp. 1–14. DOI 10.5334/tismir.157
Tuggener, Lukas, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, et al. 2024. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval 7 (1): 1–14. https://doi.org/10.5334/tismir.157.
Tuggener, Lukas, et al. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, 2024, pp. 1–14, https://doi.org/10.5334/tismir.157.


All items in this repository are protected by copyright unless otherwise indicated.