Andrej Baláž, Alessia Petescia. Prefix-free Graphs and Suffix Array Construction in Sublinear Space. In Proceedings of the 23rd Conference Information Technologies – Applications and Theory (ITAT 2023), volume 3498 of CEUR Workshop Proceedings, pp. 209-216, Tatranske Matliare, 2023.
Download preprint: not available
Download from publisher: https://ceur-ws.org/Vol-3498/paper27.pdf
Related web page: not available
Bibliography entry: BibTeX
See also: early version
A recent paradigm shift in bioinformatics from a single reference genome to a pangenome has brought with it several graph structures. These structures must support operations such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem on sequential data, where data structures such as the suffix array and the Burrows-Wheeler transform allow for efficient computation. Attempts to achieve comparable performance on graphs bring many complications, since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.
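For readers unfamiliar with the two string data structures named in the abstract, the following is a minimal illustrative sketch of a suffix array and the Burrows-Wheeler transform derived from it. It is not the construction from the paper: it sorts suffixes naively, uses space linear in the input, and works on a single string rather than a graph. The function names and the `$` terminator are assumptions chosen for illustration.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive approach for illustration only: compares whole suffixes,
    so it is far from the space and time bounds of practical algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(text: str, sa: list[int]) -> str:
    """Build the Burrows-Wheeler transform from a suffix array.

    For each suffix in sorted order, take the character that precedes it
    in `text`; index -1 wraps around to the last character (the terminator).
    """
    return "".join(text[i - 1] for i in sa)


# Assumption: the text ends with a unique terminator '$' that sorts
# before every other character (true for ASCII letters).
text = "banana$"
sa = suffix_array(text)
bwt = bwt_from_suffix_array(text, sa)
print(sa)   # [6, 5, 3, 1, 0, 4, 2]
print(bwt)  # annb$aa
```

The suffix array lists where each sorted suffix begins, and the transform groups characters that share the same following context, which is the property that read-mapping indexes exploit.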