Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-27042
Publication type: Conference paper
Type of review: Not specified
Title: On the effectiveness of automated metrics for text generation systems
Authors: von Däniken, Pius
Deriu, Jan Milan
Tuggener, Don
Cieliebak, Mark
et al.: No
DOI: 10.21256/zhaw-27042
Proceedings: Findings of the Association for Computational Linguistics: EMNLP 2022
First page: 1503
Last page: 1522
Conference details: Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates, 7-11 December 2022
Publication date: 2022
Publisher / Editing institution: Association for Computational Linguistics
Language: English
Keywords: Text Generation; Artificial Intelligence (AI)
Subject (DDC): 410.285: Computational linguistics
Abstract: A major challenge in the field of Text Generation is evaluation, because we lack a sound theory that can be leveraged to extract guidelines for evaluation campaigns. In this work, we propose a first step towards such a theory that incorporates different sources of uncertainty, such as imperfect automated metrics and insufficiently sized test sets. The theory has practical applications, such as determining the number of samples needed to reliably distinguish the performance of a set of Text Generation systems in a given setting. We showcase the application of the theory on the WMT 21 and Spot-The-Bot evaluation data and outline how it can be leveraged to improve the evaluation protocol regarding the reliability, robustness, and significance of the evaluation outcome.
URI: https://aclanthology.org/2022.findings-emnlp.108/
https://digitalcollection.zhaw.ch/handle/11475/27042
Full-text version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: School of Engineering
Organisational unit: Centre for Artificial Intelligence (CAI)
Appears in collections: Publications School of Engineering

Files in this resource:
File: 2022_vonDaeniken-etal_Effectiveness-of-automated-metrics-for-text-generation-systems.pdf
Size: 498.19 kB
Format: Adobe PDF
von Däniken, P., Deriu, J. M., Tuggener, D., & Cieliebak, M. (2022). On the effectiveness of automated metrics for text generation systems [Conference paper]. Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–1522. https://doi.org/10.21256/zhaw-27042
von Däniken, P. et al. (2022) ‘On the effectiveness of automated metrics for text generation systems’, in Findings of the Association for Computational Linguistics: EMNLP 2022. Association for Computational Linguistics, pp. 1503–1522. Available at: https://doi.org/10.21256/zhaw-27042.
P. von Däniken, J. M. Deriu, D. Tuggener, and M. Cieliebak, “On the effectiveness of automated metrics for text generation systems,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 1503–1522. doi: 10.21256/zhaw-27042.
VON DÄNIKEN, Pius, Jan Milan DERIU, Don TUGGENER and Mark CIELIEBAK, 2022. On the effectiveness of automated metrics for text generation systems. In: Findings of the Association for Computational Linguistics: EMNLP 2022 [online]. Conference paper. Association for Computational Linguistics. 2022. pp. 1503–1522. Available at: https://aclanthology.org/2022.findings-emnlp.108/
von Däniken, Pius, Jan Milan Deriu, Don Tuggener, and Mark Cieliebak. 2022. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Conference paper. In Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–22. Association for Computational Linguistics. https://doi.org/10.21256/zhaw-27042.
von Däniken, Pius, et al. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Findings of the Association for Computational Linguistics: EMNLP 2022, Association for Computational Linguistics, 2022, pp. 1503–22, https://doi.org/10.21256/zhaw-27042.


All resources in this repository are protected by copyright unless otherwise indicated.