Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-20794
Publication type: Article in scientific journal
Type of review: Peer review (publication)
Title: Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform
Authors: Maiolo, Massimo
Ulzega, Simone
Gil, Manuel
Anisimova, Maria
et. al: No
DOI: 10.1093/nargab/lqaa092
10.21256/zhaw-20794
Published in: NAR Genomics and Bioinformatics
Volume(Issue): 2
Issue: 4
Pages: lqaa092
Issue Date: 6-Nov-2020
Publisher / Ed. Institution: Oxford University Press
ISSN: 2631-9268
Language: English
Subject (DDC): 510: Mathematics
572: Biochemistry
Abstract: Recently we presented a frequentist dynamic programming (DP) approach for multiple sequence alignment based on the explicit model of indel evolution Poisson Indel Process (PIP). This phylogeny-aware approach produces evolutionary meaningful gap patterns and is robust to the ‘over-alignment’ bias. Despite linear time complexity for the computation of marginal likelihoods, the overall method’s complexity is cubic in sequence length. Inspired by the popular aligner MAFFT, we propose a new technique to accelerate the evolutionary indel based alignment. Amino acid sequences are converted to sequences representing their physicochemical properties, and homologous blocks are identified by multi-scale short-time Fourier transform. Three three-dimensional DP matrices are then created under PIP, with homologous blocks defining sparse structures where most cells are excluded from the calculations. The homologous blocks are connected through intermediate ‘linking blocks’. The homologous and linking blocks are aligned under PIP as independent DP sub-matrices and their tracebacks merged to yield the final alignment. The new algorithm can largely profit from parallel computing, yielding a theoretical speed-up estimated to be proportional to the cubic power of the number of sub-blocks in the DP matrices. We compare the new method to the original PIP approach and demonstrate it on real data.
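Illustrative sketch (not part of the record): the abstract's first step, converting amino acids to a numeric property signal and locating homologous regions in the Fourier domain, can be illustrated with a plain FFT cross-correlation in the spirit of MAFFT. This is a simplified stand-in, not the authors' multi-scale short-time Fourier transform, and it does not touch the PIP dynamic programming. The choice of the Kyte-Doolittle hydropathy scale as the property, and the names `KD`, `fft` and `best_offset`, are assumptions made for this example only.

```python
import cmath

# Assumed property scale for illustration: Kyte-Doolittle hydropathy.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
      "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
      "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
      "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def fft(x, invert=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    With invert=True the conjugate transform is computed WITHOUT the
    1/n scaling, so the caller has to divide by n.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def best_offset(seq_a, seq_b):
    """Return the lag k maximising sum_i a[i + k] * b[i].

    A positive k means seq_b matches seq_a starting at position k of seq_a;
    a negative k means seq_a matches seq_b starting at position -k of seq_b.
    """
    a = [KD[c] for c in seq_a]
    b = [KD[c] for c in seq_b]
    # Zero-pad to a power of two >= len(a) + len(b) so that the circular
    # correlation does not mix positive and negative lags.
    n = 1
    while n < len(a) + len(b):
        n *= 2
    fa = fft(a + [0.0] * (n - len(a)))
    fb = fft(b + [0.0] * (n - len(b)))
    # Correlation theorem: corr = IFFT(FFT(a) * conj(FFT(b))).
    prod = [x * y.conjugate() for x, y in zip(fa, fb)]
    corr = [v.real / n for v in fft(prod, invert=True)]
    k = max(range(n), key=lambda i: corr[i])
    # Indices beyond len(a) - 1 encode negative lags (wrapped around).
    return k if k < len(a) else k - n


if __name__ == "__main__":
    motif = "LIVKDEFW"
    print(best_offset("GSGSG" + motif + "GSGS", motif))
```

In this toy setting the correlation peak gives the offset at which a shared motif lines up. A block-based aligner in the sense of the abstract would then restrict the expensive DP to a neighbourhood of such matches instead of filling the full matrices.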
URI: https://digitalcollection.zhaw.ch/handle/11475/20794
Fulltext version: Published version
License (according to publishing contract): CC BY-NC 4.0: Attribution - Non commercial 4.0 International
Departement: Life Sciences and Facility Management
Organisational Unit: Institute of Applied Simulation (IAS)
Published as part of the ZHAW project: Fast joint estimation of alignment and phylogeny from genomics sequences in a frequentist framework
Appears in collections: Publikationen Life Sciences und Facility Management

Files in This Item:
File: 2020_Maiolo-etal_Accelerating-phylogeny-aware-alignment-indel-evolution.pdf
Size: 1.83 MB
Format: Adobe PDF