Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-26577
Publication type: Conference paper
Type of review: Peer review (publication)
Title: Evaluating pre-trained Sentence-BERT with class embeddings in active learning for multi-label text classification
Authors: Wertz, Lukas
Bogojeska, Jasmina
Mirylenka, Katsiaryna
Kuhn, Jonas
et al.: No
DOI: 10.21256/zhaw-26577
Proceedings: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Page(s): 366
Pages to: 372
Conference details: 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP), online, 20-23 November 2022
Issue Date: Nov-2022
Publisher / Ed. Institution: Association for Computational Linguistics
Language: English
Subjects: Multi-label text classification; Active learning; Transformer
Subject (DDC): 410.285: Computational linguistics
Abstract: The Transformer Language Model is a powerful tool that has been shown to excel at various NLP tasks and has become the de-facto standard solution thanks to its versatility. In this study, we employ pre-trained document embeddings in an Active Learning task to group samples with the same labels in the embedding space on a legal document corpus. We find that the calculated class embeddings are not close to the respective samples and consequently do not partition the embedding space in a meaningful way. In addition, we explore using the class embeddings as an Active Learning strategy with dramatically reduced results compared to all baselines.
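The abstract describes computing class embeddings from pre-trained document embeddings and testing whether samples cluster around their classes, then using distance to these class embeddings as an Active Learning selection heuristic. A minimal sketch of that general idea (not the paper's exact method; the centroid construction, cosine distance, and farthest-first ranking are all assumptions) could look like this, using plain NumPy in place of actual Sentence-BERT vectors:

```python
import numpy as np

def class_embeddings(doc_embs, label_sets, n_classes):
    """Assumed construction: each class embedding is the mean of the
    embeddings of all labeled documents carrying that class label
    (multi-label: a document counts toward each of its labels)."""
    cls = np.zeros((n_classes, doc_embs.shape[1]))
    for c in range(n_classes):
        members = [i for i, ls in enumerate(label_sets) if c in ls]
        cls[c] = doc_embs[members].mean(axis=0)
    return cls

def rank_by_class_distance(pool_embs, cls_embs):
    """Assumed AL heuristic: rank unlabeled documents by cosine
    similarity to their nearest class embedding, farthest first."""
    a = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    b = cls_embs / np.linalg.norm(cls_embs, axis=1, keepdims=True)
    sims = a @ b.T                # cosine similarity to every class
    nearest = sims.max(axis=1)    # similarity to the closest class
    return np.argsort(nearest)    # ascending: farthest-from-any-class first

# Toy usage: three labeled documents, two classes, two pool documents.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
cls = class_embeddings(docs, [{0}, {1}, {0, 1}], n_classes=2)
pool = np.array([[1.0, 0.0], [-1.0, 0.0]])
order = rank_by_class_distance(pool, cls)  # pool doc farthest from all classes comes first
```

The paper's finding is that, on a legal document corpus, such class embeddings do not end up close to their member samples, so a selection strategy built on them underperforms the baselines.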
URI: https://aclanthology.org/2022.aacl-short.45
https://digitalcollection.zhaw.ch/handle/11475/26577
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Departement: School of Engineering
Organisational Unit: Centre for Artificial Intelligence (CAI)
Appears in collections: Publikationen School of Engineering

Files in This Item:
2022_Wertz-etal_Sentence-BERT-with-class-embeddings-multi-label-text-classification.pdf (Adobe PDF, 213.12 kB)
Wertz, L., Bogojeska, J., Mirylenka, K., & Kuhn, J. (2022). Evaluating pre-trained Sentence-BERT with class embeddings in active learning for multi-label text classification [Conference paper]. Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 366–372. https://doi.org/10.21256/zhaw-26577