Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-4974
Title: German compound splitting using the compound productivity of morphemes
Authors : Sugisaki, Kyoko
Tuggener, Don
Published in : 14th Conference on Natural Language Processing - KONVENS 2018
Pages : 141-147
Conference details: 14th Conference on Natural Language Processing, Vienna, Austria, 19-21 September 2018
Editors of the parent work: Barbaresi, Adrien
Biber, Hanno
Neubarth, Friedrich
Osswald, Rainer
Publisher / Ed. Institution : Austrian Academy of Sciences Press
Issue Date: 2018
License (according to publishing contract) : Licence according to publishing contract
Type of review: Peer review (Publication)
Language : English
Subjects : Compound splitting
Subject (DDC) : 410.285: Computational linguistics
Abstract: In this work, we present a novel compound splitting method for German that captures the compound productivity of morphemes. We use a giga web corpus to create a lexicon and decompose noun compounds by computing the probabilities of compound elements as bound and free morphemes. Furthermore, we provide a uniform evaluation of several unsupervised approaches and morphological analysers for the task. Our method achieved an F1 score of 0.92, which is comparable to state-of-the-art methods.
Department: School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Publication type: Conference Paper
DOI : 10.21256/zhaw-4974
URI: https://digitalcollection.zhaw.ch/handle/11475/14372
Other identifiers : 0xc1aa5576 0x003a2438
Appears in Collections: Publikationen School of Engineering

Files in This Item:
File: 2018_Sugisaki_German_compound_splitting_using_the_compound.pdf
Size: 177.39 kB
Format: Adobe PDF

