Please use this identifier to cite or link to this resource:
https://doi.org/10.21256/zhaw-28719
Publication type: | Article in scientific journal |
Type of review: | Open peer review |
Title: | Real world music object recognition |
Authors: | Tuggener, Lukas; Emberger, Raphael; Ghosh, Adhiraj; Sager, Pascal; Satyawan, Yvan Putra; Montoya, Javier; Goldschagg, Simon; Seibold, Florian; Gut, Urs; Ackermann, Philipp; Schmidhuber, Jürgen; Stadelmann, Thilo |
et al.: | No |
DOI: | 10.5334/tismir.157; 10.21256/zhaw-28719 |
Published in: | Transactions of the International Society for Music Information Retrieval |
Volume(Issue): | 7 |
Issue: | 1 |
Page(s): | 1 |
Pages to: | 14 |
Issue date: | 2024 |
Publisher / Ed. Institution: | Ubiquity Press |
ISSN: | 2514-3298 |
Language: | English |
Keywords: | Optical music recognition; Deep learning; Data augmentation; Adversarial training; Model ensemble; Open data |
Subject (DDC): | 006: Special computer methods |
Abstract: | We present solutions to two of the most pressing issues in contemporary optical music recognition (OMR). We improve recognition accuracy on low-quality, real-world (i.e. containing ageing, lighting, or dirt artefacts among others) input data and provide confidence-rated model outputs to enable efficient human post-processing. Specifically, we present (i) a sophisticated input augmentation scheme that can reduce the gap between sanitised benchmarks and realistic tasks through a combination of synthetic data and noisy perturbations of real-world documents; (ii) an adversarial discriminative domain adaptation method that can be employed to improve the performance of OMR systems on low-quality data; (iii) a combination of model ensembles and prediction fusion, which generates trustworthy confidence ratings for each prediction. We evaluate our contributions on a newly created test set consisting of manually annotated pages of varying real-world quality, sourced from the International Music Score Library Project (IMSLP) / the Petrucci Music Library. With the presented data augmentation scheme, we achieve a doubling in detection performance from 36.0% to 73.3% on noisy real-world data compared to state-of-the-art training. This result is then combined with robust confidence ratings, paving the way for OMR to be deployed in the real world. Additionally, we show the merits of unsupervised adversarial domain adaptation for OMR, raising the 36.0% baseline to 48.9%. All our code and data are freely available at: https://github.com/raember/s2anet/tree/TISMIR_publication. |
URI: | https://digitalcollection.zhaw.ch/handle/11475/28719 |
Related research data: | https://github.com/raember/s2anet/tree/TISMIR_publication |
Fulltext version: | Published version |
License (according to publishing contract): | CC BY 4.0: Attribution 4.0 International |
Department: | School of Engineering |
Organisational unit: | Centre for Artificial Intelligence (CAI); Institut für Informatik (InIT) |
Published as part of the ZHAW project: | RealScore – Scanning of Real-World Sheet Music for a Digital Music Stand |
Appears in collections: | Publications School of Engineering |
Files in this resource:
File | Description | Size | Format |
---|---|---|---|
2024_Tuggener-etal_Real-world-music-object-recognition.pdf | Published Version | 2.13 MB | Adobe PDF |
2023_Tuggener-etal_Real-world-music-object-recognition_TISMIR.pdf | Accepted Version | 1.07 MB | Adobe PDF |
Tuggener, L., Emberger, R., Ghosh, A., Sager, P., Satyawan, Y. P., Montoya, J., Goldschagg, S., Seibold, F., Gut, U., Ackermann, P., Schmidhuber, J., & Stadelmann, T. (2024). Real world music object recognition. Transactions of the International Society for Music Information Retrieval, 7(1), 1–14. https://doi.org/10.5334/tismir.157
Tuggener, L. et al. (2024) ‘Real world music object recognition’, Transactions of the International Society for Music Information Retrieval, 7(1), pp. 1–14. Available at: https://doi.org/10.5334/tismir.157.
L. Tuggener et al., “Real world music object recognition,” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, pp. 1–14, 2024, doi: 10.5334/tismir.157.
TUGGENER, Lukas, Raphael EMBERGER, Adhiraj GHOSH, Pascal SAGER, Yvan Putra SATYAWAN, Javier MONTOYA, Simon GOLDSCHAGG, Florian SEIBOLD, Urs GUT, Philipp ACKERMANN, Jürgen SCHMIDHUBER and Thilo STADELMANN, 2024. Real world music object recognition. Transactions of the International Society for Music Information Retrieval. 2024. Vol. 7, no. 1, pp. 1–14. DOI 10.5334/tismir.157
Tuggener, Lukas, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, et al. 2024. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval 7 (1): 1–14. https://doi.org/10.5334/tismir.157.
Tuggener, Lukas, et al. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, 2024, pp. 1–14, https://doi.org/10.5334/tismir.157.
All resources in this repository are protected by copyright, unless otherwise indicated.