Survey on evaluation methods for dialogue systems

Deriu, Jan Milan; Rodrigo, Alvaro; Otegi, Arantxa; Echegoyen, Guillermo; Rosset, Sophie; Agirre, Eneko; Cieliebak, Mark

doi:10.1007/s10462-020-09866-x

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-20318

Publication type:	Article in scientific journal
Type of review:	Editorial review
Title:	Survey on evaluation methods for dialogue systems
Authors:	Deriu, Jan Milan; Rodrigo, Alvaro; Otegi, Arantxa; Echegoyen, Guillermo; Rosset, Sophie; Agirre, Eneko; Cieliebak, Mark
et al.:	No
DOI:	10.1007/s10462-020-09866-x; 10.21256/zhaw-20318
Published in:	Artificial Intelligence Review
Volume(Issue):	54
Issue:	1
Pages:	755–810
Issue Date:	2020
Publisher / Ed. Institution:	Springer
ISSN:	0269-2821; 1573-7462
Language:	English
Subjects:	Dialogue systems; Artificial intelligence; Evaluation; Deep learning
Subject (DDC):	006: Special computer methods
Abstract:	In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation, in and of itself, is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods which allow a reduction in involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented, conversational, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then present the evaluation methods regarding that class.
URI:	https://digitalcollection.zhaw.ch/handle/11475/20318
Fulltext version:	Published version
License (according to publishing contract):	CC BY 4.0: Attribution 4.0 International
Department:	School of Engineering
Organisational Unit:	Institute of Computer Science (InIT)
Published as part of the ZHAW project:	LIHLITH - Learning to Interact with Humans by Lifelong Interaction with Humans
Appears in collections:	Publikationen School of Engineering

Files in This Item:

File	Description	Size	Format
2020_Deriu-etal_Survey-on-evaluation-methods-for-dialogue-systems.pdf		1.94 MB	Adobe PDF

Deriu, J. M., Rodrigo, A., Otegi, A., Echegoyen, G., Rosset, S., Agirre, E., & Cieliebak, M. (2020). Survey on evaluation methods for dialogue systems. Artificial Intelligence Review, 54(1), 755–810. https://doi.org/10.1007/s10462-020-09866-x

Deriu, J.M. et al. (2020) ‘Survey on evaluation methods for dialogue systems’, Artificial Intelligence Review, 54(1), pp. 755–810. Available at: https://doi.org/10.1007/s10462-020-09866-x.

J. M. Deriu et al., “Survey on evaluation methods for dialogue systems,” Artificial Intelligence Review, vol. 54, no. 1, pp. 755–810, 2020, doi: 10.1007/s10462-020-09866-x.

DERIU, Jan Milan, Alvaro RODRIGO, Arantxa OTEGI, Guillermo ECHEGOYEN, Sophie ROSSET, Eneko AGIRRE and Mark CIELIEBAK, 2020. Survey on evaluation methods for dialogue systems. Artificial Intelligence Review. 2020. Vol. 54, no. 1, pp. 755–810. DOI 10.1007/s10462-020-09866-x

Deriu, Jan Milan, Alvaro Rodrigo, Arantxa Otegi, Guillermo Echegoyen, Sophie Rosset, Eneko Agirre, and Mark Cieliebak. 2020. “Survey on Evaluation Methods for Dialogue Systems.” Artificial Intelligence Review 54 (1): 755–810. https://doi.org/10.1007/s10462-020-09866-x.

Deriu, Jan Milan, et al. “Survey on Evaluation Methods for Dialogue Systems.” Artificial Intelligence Review, vol. 54, no. 1, 2020, pp. 755–810, https://doi.org/10.1007/s10462-020-09866-x.