On the effectiveness of automated metrics for text generation systems

von Däniken, Pius; Deriu, Jan Milan; Tuggener, Don; Cieliebak, Mark

doi:10.21256/zhaw-27042

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-27042

Full metadata record

DC Field	Value	Language
dc.contributor.author	von Däniken, Pius	-
dc.contributor.author	Deriu, Jan Milan	-
dc.contributor.author	Tuggener, Don	-
dc.contributor.author	Cieliebak, Mark	-
dc.date.accessioned	2023-02-17T09:23:27Z	-
dc.date.available	2023-02-17T09:23:27Z	-
dc.date.issued	2022	-
dc.identifier.uri	https://aclanthology.org/2022.findings-emnlp.108/	de_CH
dc.identifier.uri	https://digitalcollection.zhaw.ch/handle/11475/27042	-
dc.description.abstract	A major challenge in the field of Text Generation is evaluation, because we lack a sound theory that can be leveraged to extract guidelines for evaluation campaigns. In this work, we propose a first step towards such a theory that incorporates different sources of uncertainty, such as imperfect automated metrics and insufficiently sized test sets. The theory has practical applications, such as determining the number of samples needed to reliably distinguish the performance of a set of Text Generation systems in a given setting. We showcase the application of the theory on the WMT 21 and Spot-The-Bot evaluation data and outline how it can be leveraged to improve the evaluation protocol regarding the reliability, robustness, and significance of the evaluation outcome.	de_CH
dc.language.iso	en	de_CH
dc.publisher	Association for Computational Linguistics	de_CH
dc.rights	https://creativecommons.org/licenses/by/4.0/	de_CH
dc.subject	Text Generation	de_CH
dc.subject	Artificial Intelligence (AI)	de_CH
dc.subject.ddc	410.285: Computational linguistics	de_CH
dc.title	On the effectiveness of automated metrics for text generation systems	de_CH
dc.type	Conference: Paper	de_CH
dcterms.type	Text	de_CH
zhaw.departement	School of Engineering	de_CH
zhaw.organisationalunit	Centre for Artificial Intelligence (CAI)	de_CH
dc.identifier.doi	10.21256/zhaw-27042	-
zhaw.conference.details	Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates, 7-11 December 2022	de_CH
zhaw.funding.eu	No	de_CH
zhaw.originated.zhaw	Yes	de_CH
zhaw.pages.end	1522	de_CH
zhaw.pages.start	1503	de_CH
zhaw.publication.status	publishedVersion	de_CH
zhaw.publication.review	Not specified	de_CH
zhaw.title.proceedings	Findings of the Association for Computational Linguistics: EMNLP 2022	de_CH
zhaw.author.additional	No	de_CH
zhaw.display.portrait	Yes	de_CH
Appears in collections:	Publications School of Engineering

Files in This Item:

File	Description	Size	Format
2022_vonDaeniken-etal_Effectiveness-of-automated-metrics-for-text-generation-systems.pdf		498.19 kB	Adobe PDF

von Däniken, P., Deriu, J. M., Tuggener, D., & Cieliebak, M. (2022). On the effectiveness of automated metrics for text generation systems [Conference paper]. Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–1522. https://doi.org/10.21256/zhaw-27042

von Däniken, P. et al. (2022) ‘On the effectiveness of automated metrics for text generation systems’, in Findings of the Association for Computational Linguistics: EMNLP 2022. Association for Computational Linguistics, pp. 1503–1522. Available at: https://doi.org/10.21256/zhaw-27042.

P. von Däniken, J. M. Deriu, D. Tuggener, and M. Cieliebak, “On the effectiveness of automated metrics for text generation systems,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 1503–1522. doi: 10.21256/zhaw-27042.

VON DÄNIKEN, Pius, Jan Milan DERIU, Don TUGGENER and Mark CIELIEBAK, 2022. On the effectiveness of automated metrics for text generation systems. In: Findings of the Association for Computational Linguistics: EMNLP 2022 [online]. Conference paper. Association for Computational Linguistics. 2022. pp. 1503–1522. Available at: https://aclanthology.org/2022.findings-emnlp.108/

von Däniken, Pius, Jan Milan Deriu, Don Tuggener, and Mark Cieliebak. 2022. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Conference paper. In Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–22. Association for Computational Linguistics. https://doi.org/10.21256/zhaw-27042.

von Däniken, Pius, et al. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Findings of the Association for Computational Linguistics: EMNLP 2022, Association for Computational Linguistics, 2022, pp. 1503–22, https://doi.org/10.21256/zhaw-27042.