Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-26577
Publication type: Conference paper
Type of review: Peer review (publication)
Title: Evaluating pre-trained Sentence-BERT with class embeddings in active learning for multi-label text classification
Authors: Wertz, Lukas; Bogojeska, Jasmina; Mirylenka, Katsiaryna; Kuhn, Jonas
et al.: No
DOI: 10.21256/zhaw-26577
Proceedings: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Pages: 366–372
Conference details: 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP), online, 20–23 November 2022
Issue Date: Nov-2022
Publisher / Ed. Institution: Association for Computational Linguistics
Language: English
Subjects: Multi-label text classification; Active learning; Transformer
Subject (DDC): 410.285: Computational linguistics
Abstract: The Transformer Language Model is a powerful tool that has been shown to excel at various NLP tasks and has become the de-facto standard solution thanks to its versatility. In this study, we employ pre-trained document embeddings in an Active Learning task to group samples with the same labels in the embedding space on a legal document corpus. We find that the calculated class embeddings are not close to the respective samples and consequently do not partition the embedding space in a meaningful way. In addition, we explore using the class embeddings as an Active Learning strategy with dramatically reduced results compared to all baselines.
URI: https://aclanthology.org/2022.aacl-short.45 ; https://digitalcollection.zhaw.ch/handle/11475/26577
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: School of Engineering
Organisational Unit: Centre for Artificial Intelligence (CAI)
Appears in collections: Publikationen School of Engineering
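The abstract above describes building class embeddings in a pre-trained sentence-embedding space and testing them as an Active Learning selection signal. As a minimal, hypothetical sketch of one plausible construction (class centroids as the mean of labeled sample embeddings, cosine similarity for ranking; the paper's exact method may differ, and all names below are illustrative):

```python
import numpy as np

def class_embeddings(doc_embs, labels, n_classes):
    """One plausible class-embedding construction: the centroid (mean)
    of the embeddings of all labeled documents carrying each class.
    doc_embs: (n_docs, dim) array; labels: list of label-id lists."""
    centroids = np.zeros((n_classes, doc_embs.shape[1]))
    for c in range(n_classes):
        mask = np.array([c in doc_labels for doc_labels in labels])
        centroids[c] = doc_embs[mask].mean(axis=0)
    return centroids

def rank_by_class_distance(doc_embs, centroids):
    """Rank documents by cosine similarity to their *nearest* class
    embedding, least similar first -- a simple acquisition heuristic
    treating distance from all classes as uncertainty."""
    a = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    b = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sims = a @ b.T                 # (n_docs, n_classes) cosine similarities
    nearest = sims.max(axis=1)     # similarity to closest class
    return np.argsort(nearest)     # least similar (most "uncertain") first
```

In practice the document embeddings would come from a pre-trained Sentence-BERT model (e.g. via the `sentence-transformers` library); the arrays above stand in for those vectors.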
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| 2022_Wertz-etal_Sentence-BERT-with-class-embeddings-multi-label-text-classification.pdf | | 213.12 kB | Adobe PDF |
Wertz, L., Bogojeska, J., Mirylenka, K., & Kuhn, J. (2022). Evaluating pre-trained Sentence-BERT with class embeddings in active learning for multi-label text classification [Conference paper]. Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 366–372. https://doi.org/10.21256/zhaw-26577