Tomas Farkas, Jozef Sitarcik, Brona Brejova, Maria Lucka. SWSPM: A Novel Alignment-Free DNA Comparison Method Based on Signal Processing Approaches. Evolutionary Bioinformatics, 15:1176934319849071. 2019.
Download from publisher: http://dx.doi.org/10.1177/1176934319849071
Computing similarity between 2 nucleotide sequences is one of the fundamental problems in bioinformatics. Current methods are based mainly on 2 major approaches: (1) sequence alignment, which is computationally expensive, and (2) faster, but less accurate, alignment-free methods based on various statistical summaries, for example, short word counts. We propose a new distance measure based on mathematical transforms from the domain of signal processing. To tolerate large-scale rearrangements in the sequences, the transform is computed across sliding windows. We compare our method with current state-of-the-art alignment-free methods on several data sets. Our method compares favorably in terms of accuracy and outperforms other methods in running time and memory requirements. In addition, it scales to dozens of processing units without loss of performance due to communication overhead. Source files and sample data are available at https://bitbucket.org/fiitstubioinfo/swspm/src.
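The general idea of a windowed, transform-based distance can be illustrated with a minimal sketch. This is not the SWSPM algorithm: the abstract does not specify the transform, the nucleotide encoding, or how windows are compared, so everything below is an assumption made for illustration. The sketch uses a plain discrete Fourier transform, a hypothetical numeric encoding (ENCODING), and hypothetical function names and default window parameters.

```python
import cmath
import math

# Hypothetical numeric encoding of nucleotides (an assumption, not from the paper).
ENCODING = {"A": 1.0, "C": -1.0, "G": 2.0, "T": -2.0}


def power_spectrum(window):
    """Return the DFT power spectrum (non-negative frequencies) of one window."""
    signal = [ENCODING.get(base, 0.0) for base in window]  # unknown symbols map to 0
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def window_spectra(seq, size, step):
    """Compute one power spectrum per sliding window of the sequence."""
    return [power_spectrum(seq[i:i + size])
            for i in range(0, len(seq) - size + 1, step)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def windowed_distance(seq_a, seq_b, size=8, step=4):
    """Symmetrised average distance from each window to its closest
    counterpart in the other sequence, regardless of position."""
    spec_a = window_spectra(seq_a, size, step)
    spec_b = window_spectra(seq_b, size, step)
    if not spec_a or not spec_b:
        raise ValueError("both sequences must be at least one window long")

    def one_way(xs, ys):
        return sum(min(euclidean(x, y) for y in ys) for x in xs) / len(xs)

    return (one_way(spec_a, spec_b) + one_way(spec_b, spec_a)) / 2
```

Because each window is matched to its closest counterpart anywhere in the other sequence, swapping two blocks leaves the distance at zero when the window boundaries align with the blocks, for example windowed_distance("ACGTTGCA" + "GGATCCTA", "GGATCCTA" + "ACGTTGCA", size=8, step=8). The direct DFT here is quadratic in the window size; a practical implementation would use a fast transform.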