Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-21263
Publication type: Article in scientific journal
Type of review: Peer review (publication)
Title: Querying knowledge graphs in natural language
Authors: Liang, Shiqi
Stockinger, Kurt
de Farias, Tarcisio Mendes
Anisimova, Maria
Gil, Manuel
et al.: No
DOI: 10.1186/s40537-020-00383-w
10.21256/zhaw-21263
Published in: Journal of Big Data
Volume(Issue): 8
Issue: 3
Issue Date: 6-Jan-2021
Publisher / Ed. Institution: Springer
ISSN: 2196-1115
Language: English
Subjects: Natural language processing; Knowledge graphs; Query processing; SPARQL
Subject (DDC): 006: Special computer methods
410.285: Computational linguistics
Abstract: Knowledge graphs are a powerful concept for querying large amounts of data. These knowledge graphs are typically enormous and are often not easily accessible to end-users because they require specialized knowledge of query languages such as SPARQL. Moreover, end-users need a deep understanding of the structure of the underlying data models, which are often based on the Resource Description Framework (RDF). This drawback has led to the development of Question-Answering (QA) systems that enable end-users to express their information needs in natural language. While existing systems simplify user access, there is still room for improvement in the accuracy of these systems. In this paper we propose a new QA system for translating natural language questions into SPARQL queries. The key idea is to break up the translation process into five smaller, more manageable sub-tasks and to use ensemble machine learning methods as well as Tree-LSTM-based neural network models to automatically learn and translate a natural language question into a SPARQL query. The performance of our proposed QA system is empirically evaluated using two renowned benchmarks: the 7th Question Answering over Linked Data Challenge (QALD-7) and the Large-Scale Complex Question Answering Dataset (LC-QuAD). Experimental results show that our QA system outperforms the state-of-the-art systems by 15% on the QALD-7 dataset and by 48% on the LC-QuAD dataset, respectively. In addition, we make our source code available.
URI: https://digitalcollection.zhaw.ch/handle/11475/21263
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: Life Sciences and Facility Management
School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Institute of Applied Simulation (IAS)
Published as part of the ZHAW project: SNF NRP 75 "Big Data": Bio-SODA - Enabling Complex, Semantic Queries to Bioinformatics Databases through Intuitive Searching over Data
Appears in collections: Publikationen School of Engineering

Files in This Item:
File: 2021_Liang-Stockinger_etal_Querying-knowledge-graphs_Big-Data.pdf
Size: 2.05 MB
Format: Adobe PDF

