Andrej Baláž, Alessia Petescia. Prefix-free Graphs and Suffix Array Construction in Sublinear Space. In Proceedings of the 23rd Conference Information Technologies – Applications and Theory (ITAT 2023), volume 3498 of CEUR Workshop Proceedings, pp. 209-216, Tatranske Matliare, 2023.

Download preprint: not available

Download from publisher: https://ceur-ws.org/Vol-3498/paper27.pdf

Related web page: not available

Bibliography entry: BibTeX

See also: early version

Abstract:

A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it
several graph structures. These graph structures must support operations such as efficient construction
from multiple genomes and read mapping. Read mapping is a well-studied problem on sequential data,
where data structures such as the suffix array and the Burrows-Wheeler transform allow for efficient
computation. Attempts to achieve comparably high performance on graphs bring many complications
since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce
prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to
obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations
on pangenomes.
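
For readers unfamiliar with the two string data structures named in the abstract, the following is a minimal illustrative sketch of a suffix array and the Burrows-Wheeler transform derived from it. It is not the construction from the paper: it sorts whole suffixes naively, so it uses far more than sublinear space, and the function names and the `$` sentinel are assumptions chosen for illustration only.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive approach: materializes each suffix as a sort key, so time and
    space are far from optimal. Illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(text: str, sa: list[int]) -> str:
    """Build the Burrows-Wheeler transform from a suffix array.

    Assumes `text` ends with a unique sentinel that is smaller than every
    other character (here "$"). For each sorted suffix, the BWT takes the
    character just before it; index -1 wraps around to the sentinel.
    """
    return "".join(text[i - 1] for i in sa)


text = "banana$"  # hypothetical example input
sa = suffix_array(text)
bwt = bwt_from_suffix_array(text, sa)
print(sa)   # [6, 5, 3, 1, 0, 4, 2]
print(bwt)  # annb$aa
```

The suffix array lists suffixes in sorted order, and the BWT groups together characters that precede similar contexts, which is what makes both structures useful for pattern matching tasks such as read mapping.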