Vybrané partie z dátových štruktúr
2-INF-237, LS 2016/17
RMQ a LCA: Rozdiel medzi revíziami
Z VPDS
(→Print small) |
(→RMQ via LCA) |
||
4 medziľahlé revízie od jedného používateľa nie sú zobrazené. | |||
Riadok 2: | Riadok 2: | ||
==Introduction to LCA, RMQ== | ==Introduction to LCA, RMQ== | ||
− | + | ||
[[Image:lca1.png|250px]] | [[Image:lca1.png|250px]] | ||
+ | |||
+ | * ''v'' is ancestor of ''u'' if it is on the path from ''u'' to the root | ||
+ | * lca(''u'',''v''): node of greatest depth in the intersection of the set of ancestors of ''u'' and the set of ancestors of ''v'' | ||
+ | * note that a node is ancestor of itself and if u is ancestor of v, then lca(u,v)=u | ||
+ | |||
+ | |||
+ | '''Task:''' preprocess tree T in O(n), answer lca(u,v) in O(1) | ||
+ | |||
+ | '''History''': Harel and Tarjan 1984, Schieber a Vishkin 1988 (Gusfield book), | ||
+ | Bender and Farach-Colton 2000 (this lecture) | ||
==LCA via RMQ== | ==LCA via RMQ== | ||
* zname algoritmy a triv. algoritmy na slajde | * zname algoritmy a triv. algoritmy na slajde | ||
* definicia poli V, D, R + ich vytvorenie slajdy | * definicia poli V, D, R + ich vytvorenie slajdy | ||
+ | ** V -- visited nodes | ||
+ | ** D -- their depths | ||
+ | ** R -- first occurrence of node in V | ||
* assume that LCA(u,v) = x (fig.) and that R[u]<=R[v] | * assume that LCA(u,v) = x (fig.) and that R[u]<=R[v] | ||
** tree traversal visits x, then subtree with u, then again x, then subtree with v, then again x, and only after that it can get above x | ** tree traversal visits x, then subtree with u, then again x, then subtree with v, then again x, and only after that it can get above x | ||
− | ** D[x] = min D[R[u]..R[v]] | + | ** D[x] = min D[R[u]..R[v]] |
− | ** this holds even if LCA(u,v)=u | + | ** this holds even if LCA(u,v)=u (LCA(u,v) cannot be v if R[u]<R[v]) |
** if we can quickly find k = arg min_k in {i..j} D[k], the result is V[k] | ** if we can quickly find k = arg min_k in {i..j} D[k], the result is V[k] | ||
* we have trasformed LCA to a new problem | * we have trasformed LCA to a new problem | ||
Riadok 35: | Riadok 48: | ||
==+-1RMQ== | ==+-1RMQ== | ||
− | * | + | * Adjacent elements of array D in LCA differ by exactly 1, i.e. 1 D[i]-D[i+1] in {-1,1} |
− | * | + | * We will show a better RMQ data structure for arrays with this property. This problem is called +-1RMQ |
− | * | + | * Split A into blocks of size m = lg(n)/2 |
− | * | + | * Find minimum in each block, these minima store in array A'[0..n'-1] and position of these minima into array M'[0..n'-1] where n' = n/m = 2n/lg n |
− | ** | + | ** Preprocess array A' using algorithm 3 in O(n'log n') = O(n) time |
− | * | + | * RMQ(i,j) query considers two cases (figure): |
− | ** i | + | ** i and j are in different blocks bi and bj |
− | *** | + | *** compute RMQ in A'[bi+1..bj-1] in O(1) |
− | *** | + | *** compute minimum in block bi from i to end of block (see below) |
− | *** | + | *** compute minimum in block bj from start to j (see below) |
− | *** | + | *** find minimum of these three numbers |
− | ** i | + | ** i and j are in the same block |
− | *** | + | *** compute minimum between two indices in the same block |
− | * | + | * remains to show: how to compute +-1RMQ within blocksv ramci blokov |
− | ** | + | **representation of a block: the first element + differences between adjacent elements, e.g. 2,3,2,1 -> 2,+1,-1,-1 |
− | **na urcenie polohy minima nepotrebujeme vediet prvy | + | **na urcenie polohy minima nepotrebujeme vediet prvy prvok, staci iba m-1 rozdielov +1,-1,-1 |
**kazdy blok sa teda da zapisat ako binarne cislo s m-1 bitmi | **kazdy blok sa teda da zapisat ako binarne cislo s m-1 bitmi | ||
**najviac 2^{m-1}<=sqrt(n) roznych typov blokov | **najviac 2^{m-1}<=sqrt(n) roznych typov blokov | ||
Riadok 84: | Riadok 97: | ||
** v lavom podstrome rekurzive definovany strom pre cast pola a[0..i-1] | ** v lavom podstrome rekurzive definovany strom pre cast pola a[0..i-1] | ||
** v pravom podstrome rekurzivne definovany strom pre cast pola a[i+1..n] | ** v pravom podstrome rekurzivne definovany strom pre cast pola a[i+1..n] | ||
+ | |||
+ | * RMQ(i,j) = LCA(i,j) | ||
+ | ** vezmime si uplne minimum v poli na poz. k: pre ktore dvojice bude vyslekodm RMQ(i,j)? | ||
+ | ** pre tie, kde i<=k, j>=k, pre tieto je k aj LCA | ||
+ | ** ostatne dvojice su bude uplne nalavo od k alebo uplne napravo, tam je strom definovany rekurzivne a teda tiez funguje | ||
+ | |||
* strom sa da vytvorit v O(n): | * strom sa da vytvorit v O(n): | ||
** ak uz mame strom pre a[0..i-1] a pridavame a[i], a[i] bude niekde na najpravejsej ceste | ** ak uz mame strom pre a[0..i-1] a pridavame a[i], a[i] bude niekde na najpravejsej ceste | ||
Riadok 90: | Riadok 109: | ||
** kazdy vrchol raz pridame do cesty a raz uberieme (vsetky vrcholy, ktore sme presli pri hladani j uz nie su na pravej ceste), teda celkovo O(n) | ** kazdy vrchol raz pridame do cesty a raz uberieme (vsetky vrcholy, ktore sme presli pri hladani j uz nie su na pravej ceste), teda celkovo O(n) | ||
** potential function: number of vertices on the rightmost path | ** potential function: number of vertices on the rightmost path | ||
− | |||
− | |||
− | |||
− | |||
==Segment trees== | ==Segment trees== | ||
Riadok 100: | Riadok 115: | ||
* if node corresponds to [i,j) | * if node corresponds to [i,j) | ||
** left child corresponds to [i,k), right child to [k,j) where k = floor((i+j)/2) | ** left child corresponds to [i,k), right child to [k,j) where k = floor((i+j)/2) | ||
− | * overall number of nodes roughly 2n, height ceiling(lg n) | + | * overall the number of nodes roughly 2n, height ceiling(lg n) |
− | * each interval contains result for its range ( | + | * each interval contains the result for its range (minimum, sum,...) |
Decompose query interval [x,y) to a set of disjoint tree intervals (canonical decomposition): | Decompose query interval [x,y) to a set of disjoint tree intervals (canonical decomposition): | ||
Riadok 111: | Riadok 126: | ||
Tree of recursive calls: | Tree of recursive calls: | ||
+ | * sometimes recursion on only one child, sometimes both (split) | ||
* first follow a single path from the root, each time moving to the left or right child | * first follow a single path from the root, each time moving to the left or right child | ||
− | * first | + | * last node before first split: smallest interval in the tree which contains whole [x,y), lca of leaves [x,x+1), [y,y+1) |
− | * after that in the left branch [x,y) always covers right endpoint, therefore even if we | + | * first split to both left and right: spliting point ''k'' inside [x,y) |
− | * similarly in the right branch [x,y) | + | * after that in the left branch [x,y) always covers right endpoint, therefore even if we split, right recursive calls will terminate in the next step with case 1 |
− | * together at most 2 traversals from root to leaf plus a short | + | * similarly in the right branch [x,y) covers left endpoint, therefore left calls will terminate |
+ | * together at most 2 traversals from root to leaf plus a short call in each step, which means at most 4 ceil(lg n) vertices visited | ||
Segment trees can also support dynamic case where we add operation set(i,x) which sets A[i] = x | Segment trees can also support dynamic case where we add operation set(i,x) which sets A[i] = x | ||
* this would require large changes in prefix sum or O(n log n) RMQ data structures | * this would require large changes in prefix sum or O(n log n) RMQ data structures | ||
− | * but here only O(log n) vertices on | + | * but here only O(log n) vertices on the path from the changed leaf to the root are affected |
==Print small== | ==Print small== |
Aktuálna revízia z 08:49, 1. marec 2016
Predbežná verzia poznámok, nebojte sa upravovať: pripojiť informácie z prezentácie, pridať viac vysvetľujúceho textu a pod.
Obsah
Introduction to LCA, RMQ
- v is ancestor of u if it is on the path from u to the root
- lca(u,v): node of greatest depth in the intersection of the set of ancestors of u and the set of ancestors of v
- note that a node is ancestor of itself and if u is ancestor of v, then lca(u,v)=u
Task: preprocess tree T in O(n), answer lca(u,v) in O(1)
History: Harel and Tarjan 1984, Schieber a Vishkin 1988 (Gusfield book), Bender and Farach-Colton 2000 (this lecture)
LCA via RMQ
- zname algoritmy a triv. algoritmy na slajde
- definicia poli V, D, R + ich vytvorenie slajdy
- V -- visited nodes
- D -- their depths
- R -- first occurrence of node in V
- assume that LCA(u,v) = x (fig.) and that R[u]<=R[v]
- tree traversal visits x, then subtree with u, then again x, then subtree with v, then again x, and only after that it can get above x
- D[x] = min D[R[u]..R[v]]
- this holds even if LCA(u,v)=u (LCA(u,v) cannot be v if R[u]<R[v])
- if we can quickly find k = arg min_k in {i..j} D[k], the result is V[k]
- we have trasformed LCA to a new problem
Range minimum query RMQ
- goal: preprocess array A so that we can any two indices i and j quickly find index of minimum in A[i..j]
- Alg 1: no preprocessing, O(n) query: find minimum in interval
- Alg 2: O(n^2) preprocessing, O(1) query: precompute for all (i,j) in a table
- Alg 3: O(n log n) preprocessing, O(1) query: store minimum only for intervals of length 2^k
- M[i,k]: index of minimum in A[i..i+2^k-1] for k=1,2,...,floor(log n)
i 0 1 2 3 4 5 6 7 8 9 10 11 12 A[i] 0 1 2 1 2 3 2 1 0 1 0 1 0 -------------------------------------------- k=1 0 1 3 3 4 6 7 8 0 10 10 12 - k=2 0 1 3 3 7 8 8 8 8 10 - - - k=3 0 8 8 8 8 8 - - - - - - -
- Preprocessing M[i,k]: if A[M[i,k-1]]<A[M[i+2^{k-1},k-1]] then M[i,k]=M[i,k-1], else M[i,k]=M[i+2^{k-1},k-1]
- RMQ(i,j):
- let k=floor(log_2(j-i+1)) (max size of a block inside A[i..j]), floor(log x) can be precomputed for i=1..n
- if A[M[i,k]]<A[M[j-2^k+1,k]] return M[i,k] else return M[j-2^k+1,k] (figure of overlapping intervals)
+-1RMQ
- Adjacent elements of array D in LCA differ by exactly 1, i.e. 1 D[i]-D[i+1] in {-1,1}
- We will show a better RMQ data structure for arrays with this property. This problem is called +-1RMQ
- Split A into blocks of size m = lg(n)/2
- Find minimum in each block, these minima store in array A'[0..n'-1] and position of these minima into array M'[0..n'-1] where n' = n/m = 2n/lg n
- Preprocess array A' using algorithm 3 in O(n'log n') = O(n) time
- RMQ(i,j) query considers two cases (figure):
- i and j are in different blocks bi and bj
- compute RMQ in A'[bi+1..bj-1] in O(1)
- compute minimum in block bi from i to end of block (see below)
- compute minimum in block bj from start to j (see below)
- find minimum of these three numbers
- i and j are in the same block
- compute minimum between two indices in the same block
- i and j are in different blocks bi and bj
- remains to show: how to compute +-1RMQ within blocksv ramci blokov
- representation of a block: the first element + differences between adjacent elements, e.g. 2,3,2,1 -> 2,+1,-1,-1
- na urcenie polohy minima nepotrebujeme vediet prvy prvok, staci iba m-1 rozdielov +1,-1,-1
- kazdy blok sa teda da zapisat ako binarne cislo s m-1 bitmi
- najviac 2^{m-1}<=sqrt(n) roznych typov blokov
- pre kazdy typ predpocitaj vsetky odpovede (alg. 2) v case O(m^2)
- spolu teda O(2^{m-1} m^2) = O(sqrt(n)*log(n)^2) = o(n)
+-1RMQ predpracovanie:
- spocitaj typ kazdeho bloku, uloz do pola O(n)
- pre kazdy typ spocitaj odpovede o(n) (v priemere sa jeden typ opakuje cca sqrt(n) krat, takze usetrime)
- spocitaj pole A' O(n)
- predspracuj pole A' O(n)
spolu O(n)
RMQ(i,j):
- 2x minimum v bloku O(1)
- 1x minimum v A' O(1)
spolu O(1)
LCA predspracovanie:
- prehladanie stromu a vytvorenie poli O(n)
- +-1 RMQ predspracovanie O(n)
LCA(u,v) 2 pristupy do pola R, +-1RMQ na poli D, pristup do pola V O(1)
RMQ via LCA
- vieme teda dobre riesit +-1RMQ a pomocou neho aj LCA
- vseobecne RMQ vieme riesit pomocou LCA
- z pola A vytvorime karteziansky strom:
- v koreni ma polohu i najmensieho cisla
- v lavom podstrome rekurzive definovany strom pre cast pola a[0..i-1]
- v pravom podstrome rekurzivne definovany strom pre cast pola a[i+1..n]
- RMQ(i,j) = LCA(i,j)
- vezmime si uplne minimum v poli na poz. k: pre ktore dvojice bude vyslekodm RMQ(i,j)?
- pre tie, kde i<=k, j>=k, pre tieto je k aj LCA
- ostatne dvojice su bude uplne nalavo od k alebo uplne napravo, tam je strom definovany rekurzivne a teda tiez funguje
- strom sa da vytvorit v O(n):
- ak uz mame strom pre a[0..i-1] a pridavame a[i], a[i] bude niekde na najpravejsej ceste
- ideme zo spodu po tejto ceste a najdeme prvy vrchol j, pre ktory a[j]<a[i]
- i dame ako prave dieta j, stare prave dieta j dame ako lave dieta i
- kazdy vrchol raz pridame do cesty a raz uberieme (vsetky vrcholy, ktore sme presli pri hladani j uz nie su na pravej ceste), teda celkovo O(n)
- potential function: number of vertices on the rightmost path
Segment trees
- root correspods to interval [0,n)
- leafs correspond to intervals of length 1, [i,i+1)
- if node corresponds to [i,j)
- left child corresponds to [i,k), right child to [k,j) where k = floor((i+j)/2)
- overall the number of nodes roughly 2n, height ceiling(lg n)
- each interval contains the result for its range (minimum, sum,...)
Decompose query interval [x,y) to a set of disjoint tree intervals (canonical decomposition):
- let current node has interval [i,j) and its left child interval [i,k)
- invariant: [i,j) overlaps with [x,y)
- if [i,j) is a subset of [x,y), include [i,j) in the decomposition and stop
- if [i,k) overlaps with [x,y), recurse on left child
- if [k,j) overlaps with [x,y), recurse on right child
Tree of recursive calls:
- sometimes recursion on only one child, sometimes both (split)
- first follow a single path from the root, each time moving to the left or right child
- last node before first split: smallest interval in the tree which contains whole [x,y), lca of leaves [x,x+1), [y,y+1)
- first split to both left and right: spliting point k inside [x,y)
- after that in the left branch [x,y) always covers right endpoint, therefore even if we split, right recursive calls will terminate in the next step with case 1
- similarly in the right branch [x,y) covers left endpoint, therefore left calls will terminate
- together at most 2 traversals from root to leaf plus a short call in each step, which means at most 4 ceil(lg n) vertices visited
Segment trees can also support dynamic case where we add operation set(i,x) which sets A[i] = x
- this would require large changes in prefix sum or O(n log n) RMQ data structures
- but here only O(log n) vertices on the path from the changed leaf to the root are affected
Print small
- mame pole cisel A, predspracovane pre RMQ
- chceme vypisat vsetky indexy k in {i..j} take ze A[i]<=x
void small(i,j,x) { if(j<i) return; k = rmq(i,j); if(a[k]<=x) { print k; small(i,k-1); small(k+1,j); } }
- analyzujte cas vypoctu vzhladom na p, kde p je pocet vypisanych indexov