Vybrané partie z dátových štruktúr
2-INF-237, LS 2016/17

Úvod · Pravidlá · Prednášky · Prezentácia · Ako poznámkovať · Moodle
Táto stránka sa týka školského roku 2016/17. V školskom roku 2017/18 predmet vyučuje Jakub Kováč, stránku predmetu je https://people.ksp.sk/~kuko/ds


Dátové štruktúry pre externú pamäť

Z VPDS
Prejsť na: navigácia, hľadanie

External memory model, I/O model

Introduction

  • big and slow disk, fast memory of a limited size M words
  • disk reads/writes in blocks of size B words
  • when analyzing algorithms, count how many disk reads/writes (memory transfers) are done

Example: scanning through n elements (e.g. to compute their sum): Theta(ceiling of n/B) memory transfers

B-trees

  • I/O cost of binary search tree?
  • parameter T related to block size B (so that each node fits into a constant number of blocks)
  • each node v has some number v.n of keys
    • if it is an internal node, it has v.n+1 children
    • keys in a node are sorted
    • each subtree contains only values between two successive keys in the parent
  • all leaves are in the same depth
  • each node except root has at least T-1 keys and each internal node except root has at least T children
  • each node has at most 2T-1 keys and at most 2T children
  • height is O(log_T n)

Search:

  • O(log_T n) block reads

Insert:

  • find a leaf where the new key belongs
  • if leaf is full (2T-1 keys), split into two equal parts of size T-1 and insert median to the parent (recursively)
    • new element is inserted to one of the new leaves
    • we continue splitting ancestors until we find a node which is not full or until we reach the root. If root is full, we create a new root with 2 children (increasing the height of the tree)
  • O(log_T n) transfers

Sorting

external mergesort

  • I/O cost of ordinary mergesort implementation?
  • create sorted runs of size M using O(N/B) transfers
  • repeatedly merge M/B-1 runs into one in 1 pass, this uses N/B disk reads and writes
    • read one block from each run, use one block for output
    • when output block gets full, write it to disk
    • when some input gets exhausted, read next block from the run (if any)
  • log_{M/B-1} (N/M) passes

Summary

  • B-trees can do search tree operations (insert, delete, search/predecessor) in Theta(log_{B+1} n) = Theta(log n / log (B+1)) memory transfers
    • number of transfers compared to usual search tree running time: factor of 1/log B
  • search optimal in comparison model
    • proof outline through information theory
    • goal of the seach is to tell us between which two elements is query located
    • there are Theta(n) possible answers, thus Theta(log n) bits of information
    • in normal comparison model one comparison of query to an elements of the set gives us at most 1 bit of information
    • in I/O model reading a block of B elements and comparing query to them gives us in the beste case its position within these B ele,ents, i.e. Theta(log B) bits
    • to get Theta(log n) bits of infomration for the answer, we need Theta(log n/log B) block reads
    • this can be transformed to a more formal and precise proof
  • sorting O((N/B) log_{M/B} (N/B)) memory trasfers
    • number of memory transfers compared to usual sorting running time: factor of more than 1/B
    • also much better than sorting using B-trees, which would take O(N * log_B n)

Cache oblivious model

Introduction

  • in the I/O model the algorithm explicitly requests block transfers, knows B, controls memory allocation
  • in the cache oblivious model algorithm does not know B or M, memory operates as a cache
  • M/B slots, each holding one block from disk
  • algorithm requests reading a word from disk
    • if the block containing this word in cache, no transfer
    • replace one slot with block holding requested item, write original block if needed (1 or 2 transfers)
    • which one to replace: classical on-line problem of paging

Paging

  • cache has size k = M/B slots, each holds one page (above called block)
  • sequence of page requests, if the requested page not in memory not in memory, bring it in and remove some other page (page fault)
    • goal is to minimize the number of page faults
  • optimal offline algorithm (knows the whole sequence of page requests)
    • At a page fault remove the page that will not be used for the longest time
  • example of an on-line algoritm: FIFO:
    • at a page fault remove page which is in the memory longest time
    • uses at most k times as many page faults as the optimal algorithm (k-competitive)
    • no deterministic alg. can be better than k-competitive
    • it is conservative: in a segment of requests containing at most k distinct pages it does at most k page faults
    • every conservative alg. is k-competitive
  • Compare a conservative paging alg. (e.g. FIFO) on memory with k blocks to optimum offline alg. on a smaller memory of size h - competitive ratio k/(k-h)
    • if h = k/2, we get competitive ratio 2
    • divide input sequence into maximal blocks, each containing k distinct elements (first element of the next block is distinct from the k elements of the previous block)
    • FIFO uses at most k page faults in each block
    • optimum has at most h pages from a block in memory at the beginning - at least k-h pages will cause a fault
    • in fact we can prove k/(k-h+1), works even for k=h

Back to cache-oblivious model

  • we analyze algorithms under the assumption that paging algorithm uses off-line optimum
  • instead it could use e.g. FIFO in memory 2M and increase the number of transfers by a constant factor
  • advantages of cache-oblivious model:
    • may adapt to changing M, B
    • good for a whole memory hierarchy (several levels of cache, disk, network,...)
  • scanning still Theta(ceiling of N/B)
  • search trees still Theta(log_{B+1} N)
  • sorting Theta(n/B log_{M/B}(N/B)) but requires big enough M i.e. M=Omega(B^{1+epsilon})

Static cache-oblivious search trees

  • first published by Harald Prokop, Master thesis, MIT 1999 [1]
  • alternative to a binary search
    • exercise: how many transfers for binary search?
  • use perfectly balanced binary search tree
  • search algorithm as usual (follows path from root down as usual)
  • nodes are stored on disk in some order which we can choose
    • the goal is to choose it so that for any B, M and any search we use only few blocks
    • for example, what would happen if we store nodes in pre-order or level-order?
    • instead of simple orders, we will use van Emde Boas order (later in the course we will see van Emde Boas trees for integer keys)

van Emde Boas order

  • split tree of height lg n into top and bottom, each of height (1/2) lg n
  • top part is a small tree with about sqrt(n) vertices
  • bottom part consists of about sqrt(n) small trees, each size about sqrt(n)
  • each of these small trees is processed recursively and the results are concatenated
  • for example a tree with 4 levels is split into 5 trees with 2 levels, resulting in the following ordering:
          1
     2           3
  4     7    10    13
5  6  8  9 11 12 14 15
  • if B = (1/2) lg n, we need only 2 trasfers
  • but we will show next that for any B<=M/2 and for any path from the root we need only Theta(log_{B+1} n) block transfers

Analysis

  • find level of recursion where each smaller tree has height k such that size of these trees is <=B and the next higher level has tree size >B
  • (1/2) lg B <= k <= lg B
  • length of path lg N, visits lg N / k trees of height k, which is Theta(lg N / lg B)
  • each tree of height k occupies at most B cells in memory (assuming each node fits int a word, otherwise everything is multipled by a constant factor...)
  • traversing the tree might need up to 2 memory transfers if there is a block boundary inside interval for the tree (whole tree fits in memeory)

Dynamic cache oblivious trees

  • only rough sketch

uses "ordered file maintenance"

  • maintain n items in an array of size O(n) with gaps of size O(1)
  • updates: delete item, insert item between given two items (similar to linked list)
  • update rewrites interval of size at most O(log^2 n) amortized in O(1) scans
  • done by keeping appropriate density in a hierarchy of intervals
  • we will not cover details

simpler version of data structure

  • our keep elements in "ordered file" in a sorted order
  • build a full binary tree on top of array (segment tree)
  • each node stores maximum in its subtree
  • tree stored in vEB order
  • when array gets too full, double the size, rebuild everything

search

  • search checks max in left child and decides to move left or right
  • basically follows a path from root, uses O(log_B n) transfers

update

  • search to find the leaf
  • update "ordered file", resulting in changes in interval of size L (O(log^2 n) amortized)
  • then update all ancestors of these values in the tree by postorder traversal

analysis of update

  • again let k be height of vEM subtrees which have size <=B but next higher level has size >B
  • (1) first consider bottom 2 levels: O(1+L/B) trees of size k need updating (+1 from ceilings)
  • each tree stored in <=2 blocks
  • postorder traversals revisits parent block between child blocks, but if M>=4B, stays in memory
  • overall O(log^2 n /B) transfers for bottom two layers
  • (2) part of the tree up to lca of all changed leaves: O(L/B) nodes, even if one trasfer per each, we are fine
  • (3) path from lca to root O(log_B n) transfers as in search
  • (1)+(2)+(3): O(log^2 n / B + log_B n) - too big for small values of B

improvement of update

  • split sorted data into blocks, each block of size between (1/4) lg n and lg n
  • each block stored sequentially on disk,
  • minimum element from each block used as a leaf in the above data structure, now contains only O(n/log n) elements
  • update/search inside block: 1 scan, O(log n/B)
  • if a block gets too large, split into two blocks (similarly to B-trees)
  • if a block gets too small, connect with adjacent block, maybe resplit
  • spliting/connecting propagates as update to the structure for n/log n items
  • but this happens only once very log n updates, so amortized time for update divided by log n
  • however, O(log_B n) term still necessary for search

Sources