Reinforced active learning for low-resource, domain-specific, multi-label text classification

Wertz, Lukas; Bogojeska, Jasmina; Mirylenka, Katsiaryna; Kuhn, Jonas

doi:10.18653/v1/2023.findings-acl.697

Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-29509

Publication type:	Conference: Paper
Type of review:	Peer review (publication)
Title:	Reinforced active learning for low-resource, domain-specific, multi-label text classification
Authors:	Wertz, Lukas; Bogojeska, Jasmina; Mirylenka, Katsiaryna; Kuhn, Jonas
et al.:	No
DOI:	10.18653/v1/2023.findings-acl.697; 10.21256/zhaw-29509
Proceedings:	Findings of the Association for Computational Linguistics: ACL 2023
Conference details:	61st Annual Meeting of the Association for Computational Linguistics (ACL), Toronto, Canada, 9-14 July 2023
Issue date:	July 2023
Publisher / Ed. Institution:	Association for Computational Linguistics (ACL)
Place of publication:	Stroudsburg, PA
ISBN:	978-1-959429-62-3
Language:	English
Keywords:	Reinforcement learning; Active learning; Multi-label text classification; Digitalization
Subject (DDC):	006: Special computer methods
Abstract:	Text classification datasets from specialised or technical domains are in high demand, especially in industrial applications. However, due to the high cost of annotation, such datasets are usually expensive to create. While Active Learning (AL) can reduce the labeling cost, required AL strategies are often only tested on general knowledge domains and tend to use information sources that are not consistent across tasks. We propose Reinforced Active Learning (RAL) to train a Reinforcement Learning policy that utilizes many different aspects of the data and the task in order to select the most informative unlabeled subset dynamically over the course of the AL procedure. We demonstrate the superior performance of the proposed RAL framework compared to strong AL baselines across four intricate multi-class, multi-label text classification datasets taken from specialised domains. In addition, we experiment with a unique data augmentation approach to further reduce the number of samples RAL needs to annotate.
URI:	https://digitalcollection.zhaw.ch/handle/11475/29509
Fulltext version:	Published version
License (according to publishing contract):	CC BY 4.0: Attribution 4.0 International
Department:	School of Engineering
Organisational unit:	Centre for Artificial Intelligence (CAI)
Appears in collections:	Publikationen School of Engineering

Files in this resource:

File	Description	Size	Format
2023_Wertz-etal_Reinforced-active-learning-multi-label-text-classification_ACL.pdf		688.64 kB	Adobe PDF

Wertz, L., Bogojeska, J., Mirylenka, K., & Kuhn, J. (2023, July). Reinforced active learning for low-resource, domain-specific, multi-label text classification. Findings of the Association for Computational Linguistics: ACL 2023. https://doi.org/10.18653/v1/2023.findings-acl.697

Wertz, L. et al. (2023) ‘Reinforced active learning for low-resource, domain-specific, multi-label text classification’, in Findings of the Association for Computational Linguistics: ACL 2023. Stroudsburg, PA: Association for Computational Linguistics (ACL). Available at: https://doi.org/10.18653/v1/2023.findings-acl.697.

L. Wertz, J. Bogojeska, K. Mirylenka, and J. Kuhn, “Reinforced active learning for low-resource, domain-specific, multi-label text classification,” in Findings of the Association for Computational Linguistics: ACL 2023, Jul. 2023. doi: 10.18653/v1/2023.findings-acl.697.

WERTZ, Lukas, Jasmina BOGOJESKA, Katsiaryna MIRYLENKA and Jonas KUHN, 2023. Reinforced active learning for low-resource, domain-specific, multi-label text classification. In: Findings of the Association for Computational Linguistics: ACL 2023. Conference paper. Stroudsburg, PA: Association for Computational Linguistics (ACL). July 2023. ISBN 978-1-959429-62-3

Wertz, Lukas, Jasmina Bogojeska, Katsiaryna Mirylenka, and Jonas Kuhn. 2023. “Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification.” Conference paper. In Findings of the Association for Computational Linguistics: ACL 2023. Stroudsburg, PA: Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.697.

Wertz, Lukas, et al. “Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification.” Findings of the Association for Computational Linguistics: ACL 2023, Association for Computational Linguistics (ACL), 2023, https://doi.org/10.18653/v1/2023.findings-acl.697.

All resources in this repository are protected by copyright, unless otherwise indicated.