Andrej Baláž, Alessia Petescia. Prefix-free graphs and suffix array construction in sublinear space. Technical Report 2306.14689v1, arXiv, 2023.
Download from publisher: https://arxiv.org/pdf/2306.14689.pdf
A recent paradigm shift in bioinformatics from a single reference genome to a pangenome has brought with it several graph structures. These graph structures must support operations such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem on sequential data, where data structures such as the suffix array and the Burrows-Wheeler transform allow for efficient computation. Attempts to achieve comparable performance on graphs bring many complications, since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.
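For readers unfamiliar with the two string data structures named in the abstract, the following is a minimal sketch of a suffix array and the Burrows-Wheeler transform on a single plain string. It is a naive illustration only: it sorts full suffixes, uses linear space, and is not the sublinear-space, graph-based construction that the report describes. The function names `suffix_array` and `bwt` and the `$` end-of-string sentinel are assumptions chosen for this example.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive approach: sorts the suffixes directly, so it is slow on long inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sa: list[int]) -> str:
    """Return the Burrows-Wheeler transform of text given its suffix array.

    For each suffix in sorted order, take the character that precedes it.
    For the suffix starting at 0, text[-1] wraps around to the sentinel.
    """
    return "".join(text[i - 1] for i in sa)


# Assumed convention: "$" terminates the string and sorts before all letters.
text = "banana$"
sa = suffix_array(text)
print(sa)             # [6, 5, 3, 1, 0, 4, 2]
print(bwt(text, sa))  # annb$aa
```

The transform groups equal characters together (`nn`, `aa`), which is what makes it useful for compressed indexes that support pattern matching such as read mapping.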