Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-20320
Publication type: | Conference paper |
Type of review: | Peer review (publication) |
Title: | DoQA : accessing domain-specific FAQs via conversational QA |
Authors: | Campos, Jon Ander; Otegi, Arantxa; Soroa, Aitor; Deriu, Jan Milan; Cieliebak, Mark; Agirre, Eneko |
et al.: | No |
DOI: | 10.18653/v1/2020.acl-main.652; 10.21256/zhaw-20320 |
Proceedings: | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics |
Pages: | 7302 |
Pages to: | 7314 |
Conference details: | ACL 2020, Virtual, 5-10 July 2020 |
Issue Date: | 2020 |
Publisher / Ed. Institution: | Association for Computational Linguistics |
Language: | English |
Subjects: | Question answering; Deep learning; Natural language processing |
Subject (DDC): | 004: Computer science; 400: Language, linguistics |
Abstract: | The goal of this work is to build conversational Question Answering (QA) interfaces for the large body of domain-specific information available in FAQ sites. We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs. The dialogues are collected from three Stack Exchange sites using the Wizard of Oz method with crowdsourcing. Compared to previous work, DoQA comprises well-defined information needs, leading to more coherent and natural conversations with fewer factoid questions, and is multi-domain. In addition, we introduce a more realistic information retrieval (IR) scenario where the system needs to find the answer in any of the FAQ documents. The results of an existing, strong system show that, thanks to transfer learning from a Wikipedia QA dataset and fine-tuning on a single FAQ domain, it is possible to build high-quality conversational QA systems for FAQs without in-domain training data. The good results carry over into the more challenging IR scenario. In both cases, there is still ample room for improvement, as indicated by the higher human upper bound. |
URI: | https://digitalcollection.zhaw.ch/handle/11475/20320 |
Fulltext version: | Published version |
License (according to publishing contract): | CC BY 4.0: Attribution 4.0 International |
Department: | School of Engineering |
Organisational Unit: | Institute of Applied Information Technology (InIT) |
Published as part of the ZHAW project: | LIHLITH - Learning to Interact with Humans by Lifelong Interaction with Humans |
Appears in collections: | Publikationen School of Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2020_Campos-etal_DoQA_ACL.pdf | | 590.12 kB | Adobe PDF |