2-AIN-506 a 2-AIN-252: Seminár z bioinformatiky (2) a (4)
Leto 2021
Abstrakt

Travis Gagie, Giovanni Manzini, Jouni Siren. Wheeler graphs: A framework for BWT-based data structures. Theoretical Computer Science, 698:67-78.

Download preprint: not available

Download from publisher: https://doi.org/10.1016/j.tcs.2017.06.016

Related web page: not available

Bibliography entry: BibTeX

Abstract:

The famous Burrows–Wheeler Transform (BWT) was originally defined for a 
single string but variations have been developed for sets of strings, 
labeled trees, de Bruijn graphs, etc. In this paper we propose a framework 
that includes many of these variations and that we hope will simplify the 
search for more.

We first define Wheeler graphs and show they have a property we call path 
coherence. We show that if the state diagram of a finite-state automaton 
is a Wheeler graph then, by its path coherence, we can order the nodes 
such that, for any string, the nodes reachable from the initial state or 
states by processing that string are consecutive. This means that even if 
the automaton is non-deterministic, we can still store it compactly and 
process strings with it quickly.

We then rederive several variations of the BWT by designing 
straightforward finite-state automata for the relevant problems and 
showing that their state diagrams are Wheeler graphs.