Reproducing a comparative evaluation of German text-to-speech systems

Hürlimann, Manuela; Cieliebak, Mark

doi:10.21256/zhaw-29262

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-29262

Publication type:	Conference paper
Type of review:	Editorial review
Title:	Reproducing a comparative evaluation of German text-to-speech systems
Authors:	Hürlimann, Manuela; Cieliebak, Mark
et al.:	No
DOI:	10.21256/zhaw-29262
Proceedings:	Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems
Editors of the parent work:	Belz, Anya; Popovic, Maja; Reiter, Ehud; Thomson, Craig; Sedoc, João
Page(s):	136–144
Conference details:	14th International Conference Recent Advances in Natural Language Processing (RANLP), Varna, Bulgaria, 4-8 September 2023
Issue Date:	2023
Publisher / Ed. Institution:	INCOMA, Shoumen
ISBN:	978-954-452-088-5
Language:	English
Subjects:	Human evaluation; Reproducibility
Subject (DDC):	410.285: Computational linguistics
Abstract:	This paper describes the reproduction of a human evaluation in Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features reported in Lux and Vu (2022). It is a contribution to the ReproNLP 2023 Shared Task on Reproducibility of Evaluations in NLP. The original evaluation assessed the naturalness of audio generated by different Text-to-Speech (TTS) systems for German, and our goal was to repeat the experiment with a different set of evaluators. We reproduced the evaluation based on data and instructions provided by the original authors, with some uncertainty concerning the randomisation of question order. Evaluators were recruited via email to relevant mailing lists and we received 157 responses over the course of three weeks. Our initial results show low reproducibility, but when we assume that the systems of the original and repeat evaluation experiment have been transposed, the reproducibility assessment improves markedly. We do not know if and at what point such a transposition happened; however, an initial analysis of our audio and video files provides some evidence that the system assignment in our repeat experiment is correct.
URI:	https://aclanthology.org/2023.humeval-1.12 https://digitalcollection.zhaw.ch/handle/11475/29262
Fulltext version:	Published version
License (according to publishing contract):	CC BY 4.0: Attribution 4.0 International
Departement:	School of Engineering
Organisational Unit:	Centre for Artificial Intelligence (CAI)
Appears in collections:	Publikationen School of Engineering

Files in This Item:

File	Description	Size	Format
2023_Huerlimann-Cieliebak_Comparative-evaluation-of-German-text-to-speech-systems.pdf		469.58 kB	Adobe PDF

Hürlimann, M., & Cieliebak, M. (2023). Reproducing a comparative evaluation of German text-to-speech systems [Conference paper]. In A. Belz, M. Popovic, E. Reiter, C. Thomson, & J. Sedoc (Eds.), Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems (pp. 136–144). INCOMA. https://doi.org/10.21256/zhaw-29262

Hürlimann, M. and Cieliebak, M. (2023) ‘Reproducing a comparative evaluation of German text-to-speech systems’, in A. Belz et al. (eds) Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems. Shoumen: INCOMA, pp. 136–144. Available at: https://doi.org/10.21256/zhaw-29262.

M. Hürlimann and M. Cieliebak, “Reproducing a comparative evaluation of German text-to-speech systems,” in Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, 2023, pp. 136–144. doi: 10.21256/zhaw-29262.

HÜRLIMANN, Manuela and Mark CIELIEBAK, 2023. Reproducing a comparative evaluation of German text-to-speech systems. In: Anya BELZ, Maja POPOVIC, Ehud REITER, Craig THOMSON and João SEDOC (eds.), Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems [online]. Conference paper. Shoumen: INCOMA. 2023. pp. 136–144. ISBN 978-954-452-088-5. Available at: https://aclanthology.org/2023.humeval-1.12

Hürlimann, Manuela, and Mark Cieliebak. 2023. “Reproducing a Comparative Evaluation of German Text-to-Speech Systems.” Conference paper. In Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, edited by Anya Belz, Maja Popovic, Ehud Reiter, Craig Thomson, and João Sedoc, 136–44. Shoumen: INCOMA. https://doi.org/10.21256/zhaw-29262.

Hürlimann, Manuela, and Mark Cieliebak. “Reproducing a Comparative Evaluation of German Text-to-Speech Systems.” Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, edited by Anya Belz et al., INCOMA, 2023, pp. 136–44, https://doi.org/10.21256/zhaw-29262.