Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-27042
Full metadata record
DC Field | Value | Language
dc.contributor.author | von Däniken, Pius | -
dc.contributor.author | Deriu, Jan Milan | -
dc.contributor.author | Tuggener, Don | -
dc.contributor.author | Cieliebak, Mark | -
dc.date.accessioned | 2023-02-17T09:23:27Z | -
dc.date.available | 2023-02-17T09:23:27Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | https://aclanthology.org/2022.findings-emnlp.108/ | de_CH
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/27042 | -
dc.description.abstract | A major challenge in the field of Text Generation is evaluation, because we lack a sound theory that can be leveraged to extract guidelines for evaluation campaigns. In this work, we propose a first step towards such a theory that incorporates different sources of uncertainty, such as imperfect automated metrics and insufficiently sized test sets. The theory has practical applications, such as determining the number of samples needed to reliably distinguish the performance of a set of Text Generation systems in a given setting. We showcase the application of the theory on the WMT 21 and Spot-The-Bot evaluation data and outline how it can be leveraged to improve the evaluation protocol regarding the reliability, robustness, and significance of the evaluation outcome. | de_CH
dc.language.iso | en | de_CH
dc.publisher | Association for Computational Linguistics | de_CH
dc.rights | http://creativecommons.org/licenses/by/4.0/ | de_CH
dc.subject | Text Generation | de_CH
dc.subject | Artificial Intelligence (AI) | de_CH
dc.subject.ddc | 410.285: Computational linguistics | de_CH
dc.title | On the effectiveness of automated metrics for text generation systems | de_CH
dc.type | Conference: Paper | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Centre for Artificial Intelligence (CAI) | de_CH
dc.identifier.doi | 10.21256/zhaw-27042 | -
zhaw.conference.details | Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates, 7-11 December 2022 | de_CH
zhaw.funding.eu | No | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.pages.end | 1522 | de_CH
zhaw.pages.start | 1503 | de_CH
zhaw.publication.status | publishedVersion | de_CH
zhaw.publication.review | Not specified | de_CH
zhaw.title.proceedings | Findings of the Association for Computational Linguistics: EMNLP 2022 | de_CH
zhaw.author.additional | No | de_CH
zhaw.display.portrait | Yes | de_CH
Appears in collections: Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
2022_vonDaeniken-etal_Effectiveness-of-automated-metrics-for-text-generation-systems.pdf | | 498.19 kB | Adobe PDF
von Däniken, P., Deriu, J. M., Tuggener, D., & Cieliebak, M. (2022). On the effectiveness of automated metrics for text generation systems [Conference paper]. Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–1522. https://doi.org/10.21256/zhaw-27042
von Däniken, P. et al. (2022) ‘On the effectiveness of automated metrics for text generation systems’, in Findings of the Association for Computational Linguistics: EMNLP 2022. Association for Computational Linguistics, pp. 1503–1522. Available at: https://doi.org/10.21256/zhaw-27042.
P. von Däniken, J. M. Deriu, D. Tuggener, and M. Cieliebak, “On the effectiveness of automated metrics for text generation systems,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 1503–1522. doi: 10.21256/zhaw-27042.
VON DÄNIKEN, Pius, Jan Milan DERIU, Don TUGGENER and Mark CIELIEBAK, 2022. On the effectiveness of automated metrics for text generation systems. In: Findings of the Association for Computational Linguistics: EMNLP 2022 [online]. Conference paper. Association for Computational Linguistics. 2022. pp. 1503–1522. Available at: https://aclanthology.org/2022.findings-emnlp.108/
von Däniken, Pius, Jan Milan Deriu, Don Tuggener, and Mark Cieliebak. 2022. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Conference paper. In Findings of the Association for Computational Linguistics: EMNLP 2022, 1503–22. Association for Computational Linguistics. https://doi.org/10.21256/zhaw-27042.
von Däniken, Pius, et al. “On the Effectiveness of Automated Metrics for Text Generation Systems.” Findings of the Association for Computational Linguistics: EMNLP 2022, Association for Computational Linguistics, 2022, pp. 1503–22, https://doi.org/10.21256/zhaw-27042.

