Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-19637
Publication type: | Conference paper |
Type of review: | Peer review (publication) |
Titel: | Entity matching with transformer architectures - a step forward in data integration |
Author(s): | Brunner, Ursin; Stockinger, Kurt |
et al.: | No |
DOI: | 10.5441/002/edbt.2020.58; 10.21256/zhaw-19637 |
Published in: | Proceedings of EDBT 2020 |
Page(s): | 463 |
Pages to: | 473 |
Conference details: | 23rd International Conference on Extending Database Technology, Copenhagen, 30 March - 2 April 2020 |
Publication date: | Mar-2020 |
Publisher / Ed. Institution: | OpenProceedings |
ISBN: | 978-3-89318-083-7 |
Language: | English |
Subjects: | Entity matching; Data integration; Machine learning; Neural networks; Transformers; BERT |
Subject (DDC): | 006: Special computer methods |
Abstract: | Transformer architectures have proven to be very effective and provide state-of-the-art results on many natural language tasks. The attention-based architecture, combined with pre-training on large amounts of text, led to the recent breakthrough and to a variety of slightly different implementations. In this paper we analyze how well four of the most recent attention-based transformer architectures (BERT, XLNet, RoBERTa and DistilBERT) perform on the task of entity matching - a crucial part of data integration. Entity matching (EM) is the task of finding data instances that refer to the same real-world entity. It is a challenging task if the data instances consist of long textual data or if the data instances are "dirty" due to misplaced values. To evaluate the capability of transformer architectures and transfer learning on the task of EM, we empirically compare the four approaches on inherently difficult data sets. We show that transformer architectures outperform classical deep learning methods in EM by an average margin of 27.5%. |
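As the abstract notes, EM is framed as deciding whether two records refer to the same real-world entity. A common way to feed such a record pair to a BERT-style cross-encoder is to flatten each record's attribute/value pairs into text and join the two sides with a separator token. The sketch below illustrates that serialization scheme only; the helper names and the exact "attr: value" format are illustrative assumptions, not the paper's specific pipeline.

```python
# Sketch of record-pair serialization for a transformer-based matcher.
# The "attr: value [SEP] attr: value" layout is a common convention and
# an assumption here, not necessarily the serialization used in the paper.

def serialize_record(record: dict) -> str:
    """Flatten a record's attribute/value pairs into a single string."""
    return " ".join(f"{attr}: {val}" for attr, val in record.items())

def serialize_pair(left: dict, right: dict, sep: str = " [SEP] ") -> str:
    """Join both serialized records with a separator token, ready for
    tokenization by a BERT-style cross-encoder."""
    return serialize_record(left) + sep + serialize_record(right)

left = {"title": "iPhone 11 64GB black", "brand": "Apple"}
right = {"title": "Apple iPhone 11 (64 GB) - black", "brand": "Apple"}
print(serialize_pair(left, right))
```

The resulting string would then be tokenized and classified as match/no-match, with the transformer's attention attending across both records in one pass.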
URI: | https://digitalcollection.zhaw.ch/handle/11475/19637 |
Fulltext version: | Published version |
License (according to publishing contract): | CC BY-NC-ND 4.0: Attribution - NonCommercial - NoDerivatives 4.0 International |
Departement: | School of Engineering |
Organisational unit: | Institut für Informatik (InIT) |
Appears in collections: | Publications School of Engineering |
Files in this item:
| File | Description | Size | Format |
|---|---|---|---|
| Entity_Machting_with_Transformers_edbt_2020__Camera_Ready.pdf | Entity Matching with Transformers EDBT 2020 | 1.12 MB | Adobe PDF |
Brunner, U., & Stockinger, K. (2020). Entity matching with transformer architectures - a step forward in data integration [Conference paper]. Proceedings of EDBT 2020, 463–473. https://doi.org/10.5441/002/edbt.2020.58
Brunner, U. and Stockinger, K. (2020) ‘Entity matching with transformer architectures - a step forward in data integration’, in Proceedings of EDBT 2020. OpenProceedings, pp. 463–473. Available at: https://doi.org/10.5441/002/edbt.2020.58.
U. Brunner and K. Stockinger, “Entity matching with transformer architectures - a step forward in data integration,” in Proceedings of EDBT 2020, Mar. 2020, pp. 463–473. doi: 10.5441/002/edbt.2020.58.
BRUNNER, Ursin and Kurt STOCKINGER, 2020. Entity matching with transformer architectures - a step forward in data integration. In: Proceedings of EDBT 2020. Conference paper. OpenProceedings. March 2020. pp. 463–473. ISBN 978-3-89318-083-7
Brunner, Ursin, and Kurt Stockinger. 2020. “Entity Matching with Transformer Architectures - a Step Forward in Data Integration.” Conference paper. In Proceedings of EDBT 2020, 463–73. OpenProceedings. https://doi.org/10.5441/002/edbt.2020.58.
Brunner, Ursin, and Kurt Stockinger. “Entity Matching with Transformer Architectures - a Step Forward in Data Integration.” Proceedings of EDBT 2020, OpenProceedings, 2020, pp. 463–73, https://doi.org/10.5441/002/edbt.2020.58.
All items in this repository are protected by copyright, unless otherwise indicated.