Martina Višňovská, Tomáš Vinař, Broňa Brejová. DNA Sequence Segmentation Based on Local Similarity. In Tomáš Vinař, ed., Information Technologies - Applications and Theory (ITAT), volume 1003 of CEUR-WS, pp. 36-43, 2013.
Download preprint: not available
Download from publisher: http://ceur-ws.org/Vol-1003/36.pdf
Related web page: not available
Bibliography entry: BibTeX
Abstract:
DNA sequences evolve by local changes affecting one or several adjacent symbols, as well as by large-scale rearrangements and duplications. This results in mosaic sequences with various degrees of similarity between regions within a single genome or in genomes of related organisms. Our goal is to segment DNA into regions and to assign these regions to classes so that regions within a single class are similar and there is little or no similarity between regions of different classes. We provide a formal definition of the segmentation problem, prove its NP-hardness, and give a practical heuristic algorithm. We have implemented the algorithm and evaluated it on simulated data. Segments found by our algorithm can be used as markers in a wide range of evolutionary studies.