Vybrané partie z dátových štruktúr
2-INF-237, LS 2016/17

This page concerns the academic year 2016/17. In 2017/18 the course is taught by Jakub Kováč; the course page is https://people.ksp.sk/~kuko/ds


Data structures for external memory

Current revision as of 16:16, 11 April 2017

External memory model, I/O model

Introduction

  • big and slow disk, fast memory of a limited size M words
  • disk reads/writes in blocks of size B words
  • when analyzing algorithms, count how many disk reads/writes (memory transfers) are done

Example: scanning through n elements (e.g. to compute their sum): Theta(ceiling of n/B) memory transfers
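With concrete (made-up) numbers, e.g. B = 512 words per block:

```python
import math

def scan_transfers(n, B=512):
    """Blocks touched by a block-aligned scan of n words; an unaligned
    scan may touch one extra block."""
    return math.ceil(n / B)

print(scan_transfers(100), scan_transfers(512), scan_transfers(1_000_000))
# prints: 1 1 1954
```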

B-trees

  • I/O cost of a binary search tree with arbitrary node placement?
  • parameter T related to block size B (so that each node fits into a constant number of blocks)
  • each node v has some number v.n of keys
    • if it is an internal node, it has v.n+1 children
    • keys in a node are sorted
    • each subtree contains only values between two successive keys in the parent
  • all leaves are at the same depth
  • each node except root has at least T-1 keys and each internal node except root has at least T children
  • each node has at most 2T-1 keys and at most 2T children
  • height is O(log_T n)

Search:

  • O(log_T n) block reads

Insert:

  • find a leaf where the new key belongs
  • if leaf is full (2T-1 keys), split into two equal parts of size T-1 and insert median to the parent (recursively)
    • new element is inserted to one of the new leaves
    • we continue splitting ancestors until we find a node which is not full or until we reach the root. If root is full, we create a new root with 2 children (increasing the height of the tree)
  • O(log_T n) transfers
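The insert procedure above can be sketched in code. This is a toy in-memory model (the node layout, names, and the choice T = 2 are mine, not from CLRS; a real external-memory implementation would read and write nodes as disk blocks and pick T so that a node fills about one block):

```python
class Node:
    def __init__(self, leaf):
        self.keys = []      # sorted keys; between T-1 and 2T-1 (root may have fewer)
        self.children = []  # len(keys) + 1 children if internal
        self.leaf = leaf

T = 2  # minimum degree; toy value

def split_child(parent, i):
    """Split the full child parent.children[i] (2T-1 keys) into two nodes
    of T-1 keys each and move the median key up into the parent."""
    full = parent.children[i]
    right = Node(full.leaf)
    median = full.keys[T - 1]
    right.keys = full.keys[T:]
    full.keys = full.keys[:T - 1]
    if not full.leaf:
        right.children = full.children[T:]
        full.children = full.children[:T]
    parent.keys.insert(i, median)
    parent.children.insert(i + 1, right)

def insert(root, key):
    """Insert key, returning the (possibly new) root."""
    if len(root.keys) == 2 * T - 1:          # full root: grow the tree upward
        new_root = Node(leaf=False)
        new_root.children = [root]
        split_child(new_root, 0)
        root = new_root
    node = root
    while not node.leaf:
        i = sum(k < key for k in node.keys)  # index of child to descend into
        if len(node.children[i].keys) == 2 * T - 1:
            split_child(node, i)             # split full nodes on the way down
            if key > node.keys[i]:
                i += 1
        node = node.children[i]
    node.keys.insert(sum(k < key for k in node.keys), key)
    return root
```

Splitting full nodes on the way down (the CLRS scheme) guarantees that the leaf we reach is never full, so a single downward pass suffices.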

Sorting

external mergesort

  • I/O cost of ordinary mergesort implementation?
  • create sorted runs of size M using O(n/B) transfers
  • repeatedly merge M/B-1 runs into one in 1 pass, this uses O(n/B) disk reads and writes per pass
    • read one block from each run, use one block for output
    • when output block gets full, write it to disk
    • when some input gets exhausted, read next block from the run (if any)
  • log_{M/B-1} (n/M) passes
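The two phases above can be sketched with Python lists standing in for disk runs; `heapq.merge` plays the role of the (M/B - 1)-way merge that keeps one block per input run plus one output block in memory (toy sizes; assumes M/B >= 3 so the fan-in is at least 2):

```python
import heapq

def external_mergesort(data, M, B):
    # phase 1: sorted runs of size M, one memory-load each (O(n/B) transfers)
    runs = [sorted(data[i:i + M]) for i in range(0, len(data), M)]
    fan_in = M // B - 1   # one block per input run + one block for output
    # phase 2: repeatedly merge fan_in runs into one; each pass costs O(n/B)
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
    return runs[0] if runs else []
```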

Summary

  • B-trees can do search tree operations (insert, delete, search/predecessor) in Theta(log_{B+1} n) = Theta(log n / log (B+1)) memory transfers
    • number of transfers compared to usual search tree running time: factor of 1/log B
  • search optimal in comparison model
    • proof outline through information theory
    • the goal of the search is to tell us between which two consecutive elements the query is located
    • there are Theta(n) possible answers, thus Theta(log n) bits of information
    • in the normal comparison model, one comparison of the query to an element of the set gives us at most 1 bit of information
    • in I/O model reading a block of B elements and comparing query to them gives us in the best case its position within these B elements, i.e. Theta(log B) bits
    • to get Theta(log n) bits of information for the answer, we need Theta(log n/log B) block reads
    • this can be transformed to a more formal and precise proof
  • sorting O((n/B) log_{M/B} (n/M)) memory transfers
    • number of memory transfers compared to usual sorting running time: factor of more than 1/B
    • also much better than sorting using B-trees, which would take O(n log_B n)

Cache oblivious model

Introduction

  • in the I/O model the algorithm explicitly requests block transfers, knows B, controls memory allocation
  • in the cache-oblivious model the algorithm does not know B or M; memory operates as a cache
  • M/B slots, each holding one block from disk
  • algorithm requests reading a word from disk
    • if the block containing this word is already in cache, no transfer is needed
    • otherwise, replace one slot with the block holding the requested item, writing the evicted block back if needed (1 or 2 transfers)
    • which one to replace: classical on-line problem of paging

Paging

  • cache has size k = M/B slots, each holds one page (above called block)
  • sequence of page requests; if the requested page is not in memory, bring it in and remove some other page (a page fault)
    • goal is to minimize the number of page faults
  • optimal offline algorithm (knows the whole sequence of page requests)
    • At a page fault remove the page that will not be used for the longest time
  • example of an on-line algorithm: FIFO:
    • at a page fault, remove the page that has been in memory the longest
    • uses at most k times as many page faults as the optimal algorithm (k-competitive)
    • no deterministic alg. can be better than k-competitive
    • it is conservative: in a segment of requests containing at most k distinct pages it does at most k page faults
      • proof by induction on segment length, hypothesis: each element which was inserted within segment is never removed within segment
    • every conservative alg. is k-competitive
  • Compare a conservative paging alg. (e.g. FIFO) on memory with k blocks to optimum offline alg. on a smaller memory of size h - competitive ratio k/(k-h)
    • if h = k/2, we get competitive ratio 2
    • divide input sequence into maximal blocks, each containing k distinct elements (first element of the next block is distinct from the k elements of the previous block)
    • FIFO uses at most k page faults in each block
    • optimum has at most h pages from a block in memory at the beginning - at least k-h pages will cause a fault
    • in fact we can prove k/(k-h+1), works even for k=h
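A small simulation (a toy model of my own; pages are just integers) lets one compare FIFO against the offline optimum described above and check k-competitiveness on concrete request sequences:

```python
from collections import deque

def fifo_faults(requests, k):
    """Page faults of FIFO with k cache slots."""
    cache, order, faults = set(), deque(), 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                cache.remove(order.popleft())  # evict the oldest page
            cache.add(p)
            order.append(p)
    return faults

def opt_faults(requests, k):
    """Page faults of the offline optimum: evict the page whose next use
    is furthest in the future (or never used again)."""
    cache, faults = set(), 0
    for i, p in enumerate(requests):
        if p not in cache:
            faults += 1
            if len(cache) == k:
                def next_use(q):
                    try:
                        return requests.index(q, i + 1)
                    except ValueError:
                        return float('inf')   # never used again
                cache.remove(max(cache, key=next_use))
            cache.add(p)
    return faults
```

On the classic worst case [1, 2, 3, 1, 2, 3] with k = 2, FIFO faults on every request (6 faults) while the optimum needs only 4.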

Back to cache-oblivious model

  • we analyze algorithms under the assumption that paging algorithm uses off-line optimum
  • instead it could use e.g. FIFO in memory 2M and increase the number of transfers by a constant factor
  • advantages of cache-oblivious model:
    • may adapt to changing M, B
    • good for a whole memory hierarchy (several levels of cache, disk, network,...)
  • scanning still Theta(ceiling of n/B)
  • search trees still Theta(log_{B+1} n)
  • sorting Theta((n/B) log_{M/B} (n/B)), but requires big enough M, i.e. M = Omega(B^{1+epsilon})

Static cache-oblivious search trees

  • first published by Harald Prokop, Master thesis, MIT 1999: http://supertech.csail.mit.edu/papers/Prokop99.pdf
  • an alternative to binary search
    • exercise: how many transfers for binary search?
  • use perfectly balanced binary search tree
  • search algorithm as usual (follows path from root down as usual)
  • nodes are stored on disk in some order which we can choose
    • the goal is to choose it so that for any B, M and any search we use only a few blocks
    • for example, what would happen if we store nodes in pre-order or level-order?
    • what would be a good order for fixed known B?
      • e.g. store the top lg B levels together in one block, then split the rest into subtrees and again store the top lg B levels of each subtree in a block, etc.
    • we do not know B, we will use van Emde Boas order which does this recursively for several values of "B" (later in the course we will see van Emde Boas trees for integer keys)

van Emde Boas order

  • split tree of height lg n into top and bottom, each of height (1/2) lg n
  • top part is a small tree with about sqrt(n) vertices
  • bottom part consists of about sqrt(n) small trees, each of size about sqrt(n)
  • each of these small trees is processed recursively and the results are concatenated
  • for example a tree with 4 levels is split into 5 trees with 2 levels, resulting in the following ordering:
          1
     2           3
  4     7    10    13
5  6  8  9 11 12 14 15
  • if B = sqrt n, we need only 2 transfers
  • but we will show next that for any B<=M/2 and for any path from the root we need only Theta(log_{B+1} n) block transfers
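The recursive layout above can be computed directly. In this sketch nodes are identified by 1-based heap (BFS) indices, so node x has children 2x and 2x+1; splitting the height into a top of floor(h/2) and bottoms of ceil(h/2) is one possible rounding convention (others work equally well for the analysis):

```python
def veb_layout(height, root=1):
    """Heap (BFS) indices of a perfect binary tree of the given height,
    rooted at `root`, listed in van Emde Boas order."""
    if height == 1:
        return [root]
    top_h = height // 2
    bot_h = height - top_h

    def to_global(rel):
        # map a heap index relative to `root` to a global heap index
        d = rel.bit_length() - 1
        return (root << d) | (rel - (1 << d))

    out = [to_global(t) for t in veb_layout(top_h)]   # top tree, vEB-ordered
    for leaf in range(1 << (top_h - 1), 1 << top_h):  # leaves of the top tree
        g = to_global(leaf)
        out += veb_layout(bot_h, 2 * g)               # left bottom tree
        out += veb_layout(bot_h, 2 * g + 1)           # right bottom tree
    return out
```

For height 4 this reproduces the ordering in the figure above: node 1 is stored first, its children next, and each of the four bottom trees is stored contiguously.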

Analysis

  • k: the height of the trees at the level of recursion where these trees have size <= B but the trees one level higher have size > B
  • (1/2) lg B <= k <= lg B
  • careful: if the size of these trees is x (about 2^k), we have sqrt(B) <= x <= B
  • a root-to-leaf path has length lg n and visits lg n / k trees of height k, which is Theta(lg n / lg B)
  • each tree of height k occupies at most B cells in memory (assuming each node fits in a word, otherwise everything is multiplied by a constant factor...)
  • traversing one such tree might need up to 2 memory transfers if there is a block boundary inside the interval storing the tree (the whole tree fits in memory)

Dynamic cache oblivious trees

  • only rough sketch

uses "ordered file maintenance"

  • maintain n items in an array of size O(n) with gaps of size O(1)
  • updates: delete item, insert item between given two items (similar to linked list)
  • an update rewrites an interval of size at most O(log^2 n) amortized, using O(1) scans
  • done by keeping appropriate density in a hierarchy of intervals
  • we will not cover details

simpler version of data structure

  • we keep the elements in an "ordered file" in sorted order
  • build a full binary tree on top of array (segment tree)
  • each node stores maximum in its subtree
  • tree stored in vEB order
  • when array gets too full, double the size, rebuild everything

search

  • search checks max in left child and decides to move left or right
  • basically follows a path from root, uses O(log_B n) transfers
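A sketch of this search, with the tree stored in plain heap order instead of vEB order for brevity (this changes the number of transfers, not the search logic); `None` marks a gap in the ordered file, and the hypothetical `successor()` returns the smallest stored value >= key:

```python
def build(leaves):
    """Full binary tree over the ordered-file cells (len must be a power
    of two); internal node i stores the max of its subtree, None = empty."""
    n = len(leaves)
    tree = [None] * n + list(leaves)   # heap order: children of i are 2i, 2i+1
    def mx(a, b):
        return b if a is None else a if b is None else max(a, b)
    for i in range(n - 1, 0, -1):
        tree[i] = mx(tree[2 * i], tree[2 * i + 1])
    return tree

def successor(tree, key):
    """Smallest stored value >= key, or None if there is none."""
    if tree[1] is None or tree[1] < key:
        return None
    i = 1
    while 2 * i < len(tree):           # descend to a leaf
        left_max = tree[2 * i]
        if left_max is not None and left_max >= key:
            i = 2 * i                  # left subtree has a value >= key
        else:
            i = 2 * i + 1              # answer must be in the right subtree
    return tree[i]
```

The invariant is that the current subtree always contains the answer: since the leaves are sorted left to right, the answer is in the left subtree exactly when the left child's maximum is >= key.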

update

  • search to find the leaf
  • update "ordered file", resulting in changes in interval of size L (L=O(log^2 n) amortized)
  • then update all ancestors of the changed cells in the tree by a postorder traversal

analysis of update

  • let the size of the updated interval in the ordered file be L (amortized O(log^2 n))
  • again let k be the height of the vEB subtrees which have size <=B but the next higher level has size >B
  • size of these trees is at least sqrt(B)
  • (1) ordered file needs O(L/B+1) transfers
  • (2) consider the bottom 2 levels: O(1+L/sqrt(B)) trees of height k need updating (+1 from ceilings)
  • each tree stored in <=2 blocks
  • the postorder traversal revisits the parent block between the child blocks, but if M>=4B it stays in memory
  • overall O(L/sqrt(B)) transfers for bottom two layers
  • (3) above the bottom 2 levels, at most O(L/sqrt(B)+1) nodes changed, and all their ancestors also need updating (call this set of nodes X)
    • some vertices of X have 2 children in X, some have only one
    • the number of vertices with 2 children is O(L/sqrt(B)+1), which is fine even if each causes a separate transfer
    • vertices with one child are arranged in at most 2 root-to-leaf paths; each path requires at most O(log_B n) transfers, as in search
      • consider the lca of all updated leaves: from the root to the lca, X forms a single path, which then splits into 2. Each node on the left path has the rightmost point of its interval inside the updated interval. There are two possibilities: (A) its right subtree is completely inside the updated interval, which means all nodes of that subtree have 2 children in X, and the path continues left; (B) its right subtree is only partially inside the updated interval; then its left subtree is outside and the path continues right
  • (1)+(2)+(3): amortized O(log^2 n / sqrt(B) + log_B n) - too big for small values of B

improvement of update

  • split sorted data into buckets, each bucket of size between (1/4) lg n and lg n
  • each bucket stored sequentially on disk
  • minimum element from each bucket used as a leaf in the above data structure, now contains only O(n/log n) elements
  • update/search inside a bucket: 1 scan, O((log n)/B + 1) transfers
  • if a bucket gets too large, split into two buckets (similarly to B-trees)
  • if a bucket gets too small, connect with adjacent bucket, maybe split again more evenly
    • for example, if the adjacent bucket has more than (1/2) lg n elements, split again to get two buckets of size at least (3/8) lg n; for a small adjacent bucket we get a combined bucket of size at most (3/4) lg n; in either case the size is at least (1/8) lg n from the ends of the admissible interval of bucket sizes, (1/4) lg n and lg n
  • splitting/connecting propagates as update to the structure for n/log n items
  • but this happens at most once every (1/8) lg n updates of this particular bucket, so the amortized time for an update is divided by log n
  • however, O(log_B n) term still necessary for search

Sources

  • Lecture L07 by Erik Demaine at MIT: http://courses.csail.mit.edu/6.851/spring12/lectures/
  • The chapter on B-trees from Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein: Introduction to Algorithms (3rd edition)
  • Online algorithms and paging: http://courses.csail.mit.edu/6.854/03/scribe/scribe19.ps