Selected Topics in Data Structures (Vybrané partie z dátových štruktúr)
2-INF-237, summer semester 2016/17

This page concerns the 2016/17 school year. In 2017/18 the course is taught by Jakub Kováč; the course page is https://people.ksp.sk/~kuko/ds


Priority queues


Below are working notes that supplement the information in the presentation; they can be extended and complemented with the presentation itself.

Introduction to priority queues

Review

Operation              Bin. heap       Fib. heap
                       (worst case)    (amortized)
MakeHeap()             O(1)            O(1)
Insert(H, x)           O(log n)        O(1)
Minimum(H)             O(1)            O(1)
ExtractMin(H)          O(log n)        O(log n)
Union(H1, H2)          O(n)            O(1)
DecreaseKey(H, x, k)   O(log n)        O(1)
Delete(H, x)           O(log n)        O(log n)
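
A minimal sketch of the basic interface using Python's heapq module (a binary min-heap); heapq has no DecreaseKey or Delete by handle, so only the first four operations are shown.

 import heapq

 h = []                       # MakeHeap: an empty list is an empty heap
 for key in [5, 1, 4, 2, 3]:
     heapq.heappush(h, key)   # Insert: O(log n) worst case
 print(h[0])                  # Minimum: the smallest key sits at index 0, O(1)
 while h:
     print(heapq.heappop(h))  # ExtractMin: O(log n) worst case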

Applications of priority queues

  • job scheduling on a server (store jobs in a priority queue ordered by priority and always select the one with the highest priority - needs insert and extract_min/extract_max, possibly also delete, which needs a handle to the element)
  • event-driven simulation (store events, each with the time at which it is supposed to happen; repeatedly extract the earliest event, process it and as a result potentially create new events - needs insert and extract_min; see the sketch after this list. Examples: a queue in a bank or a store with random customer arrival times and random service times per customer)
  • finding intersections of line segments
    • Bentley–Ottmann algorithm - sweepline: a binary search tree of the segments currently crossing the sweep line, plus a priority queue of events (start of a segment, end of a segment, crossing of two segments)
    • for n segments and k intersections we do 2n+k inserts and 2n+k extractMins, overall O((n+k) log n) time
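
As an illustration of the event-driven simulation idea, here is a minimal sketch of the bank/store queue with a single server; the exponential arrival and service time distributions and their parameters are made up for the example.

 import heapq
 import random

 # Events live in a min-heap keyed by time; we repeatedly extract the earliest
 # event, process it, and possibly insert new events.
 random.seed(0)
 events = [(random.expovariate(1.0), 'arrival', i) for i in range(5)]
 heapq.heapify(events)

 server_free_at = 0.0
 while events:
     time, kind, cust = heapq.heappop(events)                  # extract_min
     if kind == 'arrival':
         start = max(time, server_free_at)
         server_free_at = start + random.expovariate(2.0)      # random service time
         heapq.heappush(events, (server_free_at, 'departure', cust))  # insert a new event
         print(f"customer {cust} arrives at {time:.2f}, service starts at {start:.2f}")
     else:
         print(f"customer {cust} leaves at {time:.2f}")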


  • heapsort (builds a heap from all elements, or uses n times insert, then n times extract_min - this implies a lower bound: insert + extract_min together must take Omega(log n) time in the comparison model)
  • aside: partial sorting - given n elements, print the k smallest in sorted order: selection in O(n), then sort those k elements in O(k log k), together O(n + k log k)
  • incremental sorting: given n elements, build a data structure, then support extract_min called k times, where k is not known in advance; running time is a function of k and n (see the sketch after this list)
    • calling selection each time: O(kn)
    • a heap (binary, Fibonacci, etc.): O(n + k log n) = O(n + k log k), because if k >= sqrt n then log n = Theta(log k), and if k < sqrt n the O(n) term dominates
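
A small sketch of the heap-based incremental sorting: heapify all n elements in O(n), then answer each extract_min in O(log n); the data is random just for illustration.

 import heapq
 import random

 random.seed(1)
 data = [random.randint(0, 999) for _ in range(100)]
 heapq.heapify(data)        # build the heap in O(n)

 k = 10                     # k is not known in advance; we simply keep popping
 smallest = [heapq.heappop(data) for _ in range(k)]   # k times extract_min, O(k log n)
 print(smallest)            # the k smallest elements, in sorted order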


  • Kruskal's algorithm (we can sort the edges, but a heap works as well: build a heap, then call extract_min up to m times, possibly fewer; also needs a union-find structure) - O(m log n)
    • process edges from the smallest weight upwards; if an edge connects 2 components, add it; finish when we have a tree
  • Prim's algorithm (build a heap, n times extract_min, m times decrease_key - O(m log n) with binary heaps, O(m + n log n) with Fibonacci heaps)
    • build a tree starting from one node; for each node outside the tree keep the cost of the cheapest edge to a tree node as its priority in the heap
  • Dijkstra's algorithm for shortest paths (build a heap, n times extract_min, m times decrease_key = the same as Prim's; see the sketch after this list)
    • keep the nodes that do not yet have a final distance in a heap; set the distance of the minimum node to final and update the distances of its neighbors
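
A compact sketch of Dijkstra's algorithm with a binary heap; Python's heapq has no decrease_key, so this version inserts duplicates and skips stale entries (lazy deletion) instead, which still gives O(m log n). The small example graph is hypothetical.

 import heapq

 def dijkstra(graph, source):
     """graph: dict mapping a node to a list of (neighbor, weight) pairs."""
     dist = {source: 0}
     heap = [(0, source)]                       # (distance, node)
     final = set()
     while heap:
         d, u = heapq.heappop(heap)             # extract_min
         if u in final:
             continue                           # stale entry, skip (lazy deletion)
         final.add(u)                           # the distance of u is now final
         for v, w in graph.get(u, []):
             if v not in dist or d + w < dist[v]:
                 dist[v] = d + w
                 heapq.heappush(heap, (d + w, v))   # stands in for decrease_key
     return dist

 graph = {'a': [('b', 2), ('c', 5)], 'b': [('c', 1), ('d', 4)], 'c': [('d', 1)], 'd': []}
 print(dijkstra(graph, 'a'))                    # {'a': 0, 'b': 2, 'c': 3, 'd': 4}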

Soft heap

  • http://en.wikipedia.org/wiki/Soft_heap
  • Chazelle, B. 2000. The soft heap: an approximate priority queue with optimal error rate. J. ACM 47, 6 (Nov. 2000), 1012-1027.
  • Kaplan, H. and Zwick, U. 2009. A simpler implementation and analysis of Chazelle's soft heaps. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms (New York, New York, January 4-6, 2009). Society for Industrial and Applied Mathematics, Philadelphia, PA, 477-485.
  • provides constant amortized time for insert, union, findMin, delete
    • but it corrupts (increases) the keys of up to epsilon*n elements, where epsilon is a fixed parameter in (0, 1/2) and n is the number of insertions so far
    • useful for computing a minimum spanning tree in O(m alpha(m,n)) time, where alpha is the inverse Ackermann function

Fibonacci heaps

Analysis

Potential function: Phi = t + 2m, where t is the number of trees in the root list and m is the number of marked nodes.

ExtractMin

  • let c be the degree of the extracted minimum node z; c <= Deg(n) = O(log n)
  • adding c children to root list:
    • cost c
    • delta Phi = c, amortized O(log n)

Consolidate:

  • t1 = the number of trees in the root list before consolidation, t2 = after
  • cost: t1 + Deg(n)
  • delta Phi: t2 - t1
  • amortized t2 + Deg(n); since t2 <= Deg(n), this is O(log n) (see the sketch below)
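
A simplified sketch of the Consolidate step, assuming a stripped-down node class with only a key and a child list; the circular root list, parent pointers and marks of a real Fibonacci heap are omitted.

 import math

 class Node:
     """Stripped-down Fibonacci-heap node: just a key and a list of children."""
     def __init__(self, key):
         self.key = key
         self.children = []        # degree of the node = len(self.children)

 def link(a, b):
     """Link two trees of equal degree: the root with the larger key becomes a child of the other."""
     if b.key < a.key:
         a, b = b, a
     a.children.append(b)
     return a

 def consolidate(roots, n):
     """Repeatedly link roots of equal degree until all root degrees are distinct."""
     max_deg = int(math.log(n, (1 + 5 ** 0.5) / 2)) + 2   # Deg(n) = O(log n) slots
     slot = [None] * (max_deg + 1)                        # slot[d] holds a root of degree d
     for x in roots:
         while slot[len(x.children)] is not None:
             d = len(x.children)
             x, slot[d] = link(x, slot[d]), None
         slot[len(x.children)] = x
     return [t for t in slot if t is not None]            # the new root list

 # Hypothetical root list after the minimum node has been removed.
 new_roots = consolidate([Node(7), Node(3), Node(9), Node(5)], 4)
 print([(r.key, len(r.children)) for r in new_roots])     # [(3, 2)]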

DecreaseKey:

  • all except cascading cut:
    • cost O(1)
    • delta Phi <= 1
    • amortized O(1)

CascadingCut:

  • let each recursive call cost 1
  • last recursive call: delta Phi <= 2 (amortized cost O(1))
  • other recursive call: delta Phi = -1 (amortized cost 0)

Overall DecreaseKey O(1) amortized

Maximum degree Deg(n)

Proof of Lemma 3:

  • Linking done only in Consolidate for 2 roots of equal degree.
  • When y_j was linked, y_1, ..., y_{j-1} were already children, so at that time x.degree >= j-1.
  • Therefore at that time y_j.degree >= j-1 (trees are linked only if their roots have equal degree).
  • Since then y_j has lost at most 1 child (otherwise it would have been cut), so y_j.degree >= j-2.

Proof of Lemma 4:

  • Let D(x) be the size of subtree rooted at x
  • let s_k be the minimum size of a subtree whose root has degree k, over all Fibonacci heaps (we will prove s_k >= phi^k)
  • s_k<= s_{k+1}
    • If we have a subtree of degree k+1, we can cut it from its parent and then cut away one child to get a smaller tree of degree k
  • Let x be a node of degree k and size s_k
  • Let y_1, ..., y_k be its children in the order in which they were linked (earliest first)
  • D(x) = s_k = 1 + sum_{j=1}^k D(y_j) >= 1 + D(y_1) + sum_{j=2}^k s_{y_j.degree}
  • >= 2 + sum_{j=2}^k s_{j-2} (D(y_1) >= 1, lemma 3 and monotonicity of s)
  • by induction on k: s_k >= F_{k+2}
    • for k=0, s_0 = 1, F_2 = 1
    • for k=1, s_1 = 2, F_3 = 2
    • assume k>=2, s_i>=F_{i+2} for i<=k-1
    • s_k >= 2+ sum_{j=2}^k s_{j-2} >= 2 + sum_{j=2}^k F_j (IH)
    • = 1 + sum_{j=0}^k F_j = F_{k+2} (lemma 1)
    • >= phi^k

Proof of Corollary:

  • let x be a node of degree k in a heap with n nodes
  • by lemma 4, n >= D(x) >= s_k >= phi^k
  • therefore k <= log_phi(n) (a small numeric check of the bound follows below)
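
A small numeric check (not part of the proof) that the recurrence s_0 = 1, s_1 = 2, s_k = 2 + sum_{j=2}^k s_{j-2} from the proof of lemma 4 indeed gives F_{k+2} and stays above phi^k.

 phi = (1 + 5 ** 0.5) / 2

 fib = [0, 1]                                  # F_0, F_1
 for _ in range(20):
     fib.append(fib[-1] + fib[-2])

 s = [1, 2]                                    # s_0 = 1, s_1 = 2
 for k in range(2, 15):
     s.append(2 + sum(s[j - 2] for j in range(2, k + 1)))

 for k in range(15):
     assert s[k] == fib[k + 2]                 # s_k = F_{k+2}
     assert s[k] >= phi ** k - 1e-9            # F_{k+2} >= phi^k
 print("s_k = F_{k+2} >= phi^k holds for k = 0..14")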

Exercise

  • What sequence of operations will create a tree of depth n?
  • Alternative delete (see slides)

Pairing heaps

  • Simple and fast in practice
  • hard to analyze amortized time
  • Fredman, Michael L.; Sedgewick, Robert; Sleator, Daniel D.; Tarjan, Robert E. (1986), "The pairing heap: a new form of self-adjusting heap", Algorithmica 1 (1): 111–129
  • Stasko, John T.; Vitter, Jeffrey S. (1987), "Pairing heaps: experiments and analysis", Communications of the ACM 30 (3): 234–249
  • Moret, Bernard M. E.; Shapiro, Henry D. (1991), "An empirical analysis of algorithms for constructing a minimum spanning tree", Proc. 2nd Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science 519, Springer-Verlag, pp. 400–411
  • Pettie, Seth (2005), "Towards a final analysis of pairing heaps", Proc. 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 174–183
  • a tree with arbitrary degrees
  • each node has a pointer to its first child and to its next sibling (for delete and decreaseKey also a pointer to the previous sibling, or to the parent if it is the first child)
  • each child has key >= the key of its parent
  • linking two trees H1 and H2 such that H1 has the smaller key:
    • make H2 the first child of H1 (the original first child of H1 becomes the next sibling of H2)
    • the subtrees of a node are thus ordered by the age of linking, from youngest to oldest
  • extractMin is the most complex operation; the rest are quite lazy (see the sketch after this list)
  • insert: make a new tree of size 1, link it with H
  • union: link H1 and H2
  • decreaseKey: cut x from its parent, decrease its key and link it to H
  • delete: cut x from its parent, do extractMin on this tree, then link the result to H
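
A minimal Python sketch of the lazy pairing-heap operations, using a plain child list instead of the first-child/next-sibling pointers described above; the pointers needed to find a node's parent are not modelled, so decrease_key below takes the parent as an explicit argument.

 class PNode:
     """Pairing-heap node; children are kept in a plain list, youngest link first."""
     def __init__(self, key):
         self.key = key
         self.children = []

 def link(a, b):
     """Link two trees: the root with the larger key becomes the first child of the other."""
     if a is None:
         return b
     if b is None:
         return a
     if b.key < a.key:
         a, b = b, a
     a.children.insert(0, b)       # the new child becomes the first child
     return a

 def insert(heap, key):
     """insert: make a one-node tree and link it with the heap; returns (new root, new node)."""
     node = PNode(key)
     return link(heap, node), node

 def union(h1, h2):
     """union: just link the two roots."""
     return link(h1, h2)

 def decrease_key(heap, parent, node, new_key):
     """decreaseKey: cut the node from its parent, lower its key, link it back to the root."""
     node.key = new_key
     if parent is not None:
         parent.children.remove(node)
         return link(heap, node)
     return heap                   # the node was already the root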

ExtractMin

  • remove the root, combine the resulting trees into a new tree (actual cost up to n)
  • start by linking pairs (1st with 2nd, 3rd with 4th, ...); the last tree may remain unpaired
  • finally we link each remaining tree to the last one, starting from the next-to-last and going back to the first (i.e. in the opposite order); see the sketch below
    • other variants of linking were studied as well
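
A sketch of the two-pass combining in extractMin, reusing PNode, link and insert from the sketch above.

 def extract_min(heap):
     """Remove the root and combine its children by the two-pass pairing rule.
     Returns (minimum key, new root)."""
     if heap is None:
         return None, None
     kids = heap.children

     # First pass: link neighbouring pairs (1st with 2nd, 3rd with 4th, ...).
     paired = []
     for i in range(0, len(kids) - 1, 2):
         paired.append(link(kids[i], kids[i + 1]))
     if len(kids) % 2 == 1:
         paired.append(kids[-1])          # the last tree may remain unpaired

     # Second pass: link the remaining trees to the last one, right to left.
     new_root = None
     for tree in reversed(paired):
         new_root = link(tree, new_root)
     return heap.key, new_root

 # Tiny usage example with the helpers above.
 h = None
 for key in [4, 1, 3, 2]:
     h, _ = insert(h, key)
 print(extract_min(h)[0])                 # prints 1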

O(log n) amortized time can be proved by an analysis similar to the one for splay trees

  • imagine the heap as a binary tree - rename first child to left child and next sibling to right child
  • D(x): the number of descendants of node x, including x (the size of the node)
  • r(x) = lg D(x): the rank of the node
  • potential of a tree - the sum of the ranks of all its nodes
  • details in Fredman et al 1986
  • Better bounds in Pettie 2005: amortized O(log n) ExtractMin; O(2^{2 sqrt(log log n)}) insert, union, decreaseKey; but decreaseKey is not constant as in Fibonacci heaps