Semantic integration and enrichment of heterogeneous biological databases

Sima, Ana-Claudia; Stockinger, Kurt; de Farias, Tarcisio Mendes; Gil, Manuel

doi:10.1007/978-1-4939-9074-0_22

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-3138

Publication type:	Book part
Type of review:	Editorial review
Title:	Semantic integration and enrichment of heterogeneous biological databases
Authors:	Sima, Ana-Claudia; Stockinger, Kurt; de Farias, Tarcisio Mendes; Gil, Manuel
et al.:	No
DOI:	10.1007/978-1-4939-9074-0_22; 10.21256/zhaw-3138
Published in:	Evolutionary genomics : statistical and computational methods
Editors of the parent work:	Anisimova, Maria
Page(s):	655
Pages to:	690
Issue Date:	2019
Series:	Methods in Molecular Biology
Series volume:	1910
Publisher / Ed. Institution:	Springer
Publisher / Ed. Institution:	New York
ISBN:	978-1-4939-9073-3; 978-1-4939-9074-0
Language:	English
Subjects:	Data integration; Keyword search; Knowledge representation; Ontology-based data access; Query processing; RDF store; Relational database
Subject (DDC):	005: Computer programming, programs and data
Abstract:	Biological databases are growing at an exponential rate and are currently among the major producers of Big Data, almost on par with commercial generators such as YouTube or Twitter. Traditionally, biological databases evolved as independent silos, each purpose-built by a different research group to answer specific research questions. More recently, significant efforts have been made toward integrating these heterogeneous sources into unified data access systems or interoperable systems using the FAIR principles of data sharing. Semantic Web technologies have been key enablers in this process, opening the path for new insights into the unified data that were not visible at the level of each independent database. In this chapter, we first provide an introduction to two of the most widely used database models for biological data: relational databases and RDF stores. Next, we discuss ontology-based data integration, which serves to unify and enrich heterogeneous data sources. We present an extensive timeline of milestones in data integration based on Semantic Web technologies in the field of life sciences. Finally, we discuss some of the remaining challenges in making ontology-based data access (OBDA) systems easily accessible to a larger audience. In particular, we introduce natural language search interfaces, which alleviate the need for database users to be familiar with technical query languages. We illustrate the main theoretical concepts of data integration through concrete examples, using two well-known biological databases: a gene expression database, Bgee, and an orthology database, OMA.
URI:	https://digitalcollection.zhaw.ch/handle/11475/17721
Fulltext version:	Published version
License (according to publishing contract):	CC BY 4.0: Attribution 4.0 International
Department:	School of Engineering; Life Sciences and Facility Management
Organisational Unit:	Institute of Computer Science (InIT); Institute of Computational Life Sciences (ICLS)
Published as part of the ZHAW project:	Bio-SODA: Enabling Complex, Semantic Queries to Bioinformatics Databases through Intuitive Searching over Data
Appears in collections:	Publikationen School of Engineering

Files in This Item:

File	Description	Size	Format
2019_Semantic Integration and Enrichment of Heterogeneous_Sima_Stockinger_etal_EvolutionaryGenomics.pdf		841.13 kB	Adobe PDF

Sima, A.-C., Stockinger, K., de Farias, T. M., & Gil, M. (2019). Semantic integration and enrichment of heterogeneous biological databases. In M. Anisimova (Ed.), Evolutionary genomics : statistical and computational methods (pp. 655–690). Springer. https://doi.org/10.1007/978-1-4939-9074-0_22

Sima, A.-C. et al. (2019) ‘Semantic integration and enrichment of heterogeneous biological databases’, in M. Anisimova (ed.) Evolutionary genomics : statistical and computational methods. New York: Springer, pp. 655–690. Available at: https://doi.org/10.1007/978-1-4939-9074-0_22.

A.-C. Sima, K. Stockinger, T. M. de Farias, and M. Gil, “Semantic integration and enrichment of heterogeneous biological databases,” in Evolutionary genomics : statistical and computational methods, M. Anisimova, Ed. New York: Springer, 2019, pp. 655–690. doi: 10.1007/978-1-4939-9074-0_22.

SIMA, Ana-Claudia, Kurt STOCKINGER, Tarcisio Mendes DE FARIAS and Manuel GIL, 2019. Semantic integration and enrichment of heterogeneous biological databases. In: Maria ANISIMOVA (ed.), Evolutionary genomics : statistical and computational methods. New York: Springer. pp. 655–690. ISBN 978-1-4939-9073-3

Sima, Ana-Claudia, Kurt Stockinger, Tarcisio Mendes de Farias, and Manuel Gil. 2019. “Semantic Integration and Enrichment of Heterogeneous Biological Databases.” In Evolutionary Genomics : Statistical and Computational Methods, edited by Maria Anisimova, 655–90. New York: Springer. https://doi.org/10.1007/978-1-4939-9074-0_22.

Sima, Ana-Claudia, et al. “Semantic Integration and Enrichment of Heterogeneous Biological Databases.” Evolutionary Genomics : Statistical and Computational Methods, edited by Maria Anisimova, Springer, 2019, pp. 655–90, https://doi.org/10.1007/978-1-4939-9074-0_22.