Dátové štruktúry pre externú pamäť

Zdroje:

Prednaska L07 Erika Demaina z MIT: http://courses.csail.mit.edu/6.851/spring12/lectures/
Kapitola o B-trees z Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. (ja spom pouzila 3. vydanie)
onlie algoritmy a paging http://courses.csail.mit.edu/6.854/03/scribe/scribe19.ps

Osnova:

uvod do externej pamati z prednasky Erika Demaina
- strucne opakovanie B-trees: struktura, hladanie, vkladanie, odhad vysky
- strucne opakovanie mergesortu v externej pamati
uvod do cache-oblivious modelu z prednasky Erika Demaina
- strucnu uvod do online algoritmov, kompetitivnost, paging
staticke vyhldavacie stromy (vEB layout) z prednasky Erika Demaina
nacrt ako zhruba vyzeraju dynamicke stromy (cache oblivious B-trees), nesli sme do vsetkych detailov

Obsah

Example: scanning through n elements (e.g. to compute their sum): Theta(n/B) memory transfers

I/O cost of binary search tree?
from CLRS 3rd edition
parameter T related to block size B (so that each node fits into a constant number of blocks)
each node v has some number v.n of keys
- if it is an internal node, it has v.n+1 children
- keys in a node are sorted
- each subtree contains only values between two successive keys in the parent
all leaves are in the same depth
each node except root has at least T-1 keys and each internal node except root has at least T children
each node has at most 2T-1 keys and at most 2T children
height is O(log_T n)

Search:

Insert:

find a leaf where the new key belongs
if leaf is full (2T-1 keys), split into two equal parts of size T-1 and insert median to the parent
- new element can be inserted to one of the new leaves
we continue splitting ancestors until we find a node which is not full or until we reach the root. If root is full, we create a new root with 2 children (increasing the height of the tree)
O(log_T n) block reads and writes

mergesort
- I/O cost of ordinary implementation?
- create sorted runs of size M using O(N/B) disk reads and writes
- repeatedly merge M/B-1 runs into 1 - 1 pass uses N/B disk reads and writes
- log_{M/B-1} (N/M) passes

B-trees can do search tree operations (insert, delete, search/predecessor) in Theta(log_{B+1} n) = Theta(log n / log (B+1)) memeory transfers
- number of memory transfers compared to usual search tree running time: factor of 1/log B
search optimal in comparison model
- proof outline through information theory
- goal of the seach is to tell us between which two elements is query located
- there are Theta(n) possible answers, thus Theta(logg n) bits of information
- in normal comparison model one comparison of query to an elements of the set gives us at most 1 bit of information
- in I/O model reading a block of B elements and comparing query to them gives us in the beste case its position within these B ele,ents, i.e. Theta(log B) bits
- to get Theta(log n) bits of infomration for the answer, we need Theta(log n/log B) block reads
- this can be transformed to a more formal and precise proof
sorting O((N/B) log_{M/B} (N/B)) memory trasfers
- number of memory transfers compared to usual sorting running time: factor of more than 1/B

in the I/O model the algorithm explicitly requests block transfers, knows B, controls memory allocation
in the cache oblivious model algorithm does not know B or M, memory operates as a cache
M/B slots, each holding one block from disk
algorithm requests reading a word from disk
- if the block containing this word in cache, no transfer
- replace one slot with block holding requested item, write original block if needed (1 or 2 transfers)
- which one to replace: classical on-line problem of paging

cache has size k = M/B slots, each holds one page (above called block)
sequence of page requests, if the requested page not in memory not in memory, bring it in and remove some other page (page fault)
- goal is to minimize the number of page faults
optimal offline algorithm (knows the whole sequence of page requests)
- At a page fault remove the page that will not be used for the longest time
example of an on-line algoritm: FIFO:
- at a page fault remove page which is in the memory longest time
- uses at most k times as many page faults as the optimal algorithm (k-competitive)
- no deterministic alg. can be better than k-competitive
- it is conservative: in a segment of requests containing at most k distinct pages it does at most k page faults
- every conservative alg. is k-competitive
Compare a conservative paging alg. (e.g. FIFO) on memory with k blocks to optimum offline alg. on a smaller memory of size h - competitive ratio k/(k-h)
- if h = k/2, we get competitive ratio 2
- divide input sequence into maximal blocks, each containing k distinct elements (first element of the next block is distinct from the k elements of the previous block)
- FIFO uses at most k page faults in each block
- optimum has at most h sets of a block in memory at the beginning - at least k-h pages will cause a fault
- in fact we can prove k/(k-h+1), works even for k=h

we analyze algorithms under the assumptiom that paging algorithm uses off-line optimum
instead it could use e.g. FIFO in memory 2M and increase the number of trasfers by a constant factor
advantages of cache-oblivious model:
- may adapt to changing M, B
- good for a whole memory hierarchy (several levels of cache, disk, network,...)