Martina Višňovská, Tomáš Vinař, Broňa Brejová.
DNA Sequence Segmentation Based on Local Similarity.
In Tomáš Vinař, ed.,
Information Technologies - Applications and Theory (ITAT),
volume 1003 of CEUR-WS,
pp. 36-43, 2013.
DNA sequences evolve by local changes affecting one or several adjacent symbols, as well as by large-scale rearrangements and duplications. This results in mosaic sequences with various degrees of similarity between regions within a single genome or in genomes of related organisms. Our goal is to segment DNA into regions and to assign these regions to classes so that regions within a single class are similar and there is little or no similarity between regions of different classes. We provide a formal definition of the segmentation problem, prove its NP-hardness, and give a practical heuristic algorithm. We have implemented the algorithm and evaluated it on simulated data. Segments found by our algorithm can be used as markers in a wide range of evolutionary studies.