In computer science, an AVL tree is a self-balancing binary search tree. It was the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations. The
AVL trees are often compared with red–black trees because both support the same set of operations and take O(log n) time for the basic operations. For lookup-intensive applications, AVL trees are faster than red–black trees because they are more strictly balanced. Similar to red–black trees, AVL trees are height-balanced. Both are, in general, neither weight-balanced nor μ-balanced for any μ ≤ 1⁄2; that is, sibling nodes can have hugely differing numbers of descendants.

DEFINITION

BALANCE FACTOR

In a binary tree the balance factor of a node N is defined to be the height difference

    BalanceFactor(N) := Height(RightSubtree(N)) - Height(LeftSubtree(N))

of its two child subtrees. A binary tree is defined to be an AVL tree if the invariant

    BalanceFactor(N) ∈ {–1, 0, +1}

holds for every node N in the tree. A node N with BalanceFactor(N) < 0 is called "left-heavy", one with BalanceFactor(N) > 0 is called "right-heavy", and one with BalanceFactor(N) = 0 is sometimes simply called "balanced".

Remark: In the sequel, because there is a one-to-one correspondence between nodes and the subtrees rooted by them, we sometimes leave it to the context whether the name of an object stands for the node or the subtree.

PROPERTIES

Balance factors can be kept up-to-date by knowing the previous balance factors and the change in height; it is not necessary to know the absolute height. For holding the AVL balance information, two bits per node are sufficient. The height h of an
with the golden ratio $\varphi := \tfrac{1+\sqrt{5}}{2} \approx 1.618$, $c := \tfrac{1}{\log_2 \varphi} \approx 1.44$,
and $b := \tfrac{c}{2}\log_2 5 - 2 \approx -0.328$. This is because
an
OPERATIONS

Read-only operations of an
SEARCHING

Searching for a specific key in an
TRAVERSAL

Once a node has been found in an AVL tree, the next or previous node
can be accessed in amortized constant time. Some instances of
exploring these "nearby" nodes require traversing up to h ∝ log(n)
links (particularly when navigating from the rightmost leaf of the
root’s left subtree to the root or from the root to the leftmost
leaf of the root’s right subtree; in the
INSERT

When inserting an element into an AVL tree, one initially follows the same process as inserting into a binary search tree. More explicitly: in case a preceding search has not been successful, the search routine either returns the tree itself with the indication EMPTY, in which case the new node is inserted as the root, or, if the tree is not empty, returns a node and a direction (left or right) in which the returned node does not have a child. In the latter case the node to be inserted is made the child of the returned node in the returned direction.

After this insertion it is necessary to check each of the node's ancestors for consistency with the invariants of AVL trees: this is called "retracing". This is achieved by considering the balance factor of each node. Since with a single insertion the height of an AVL subtree cannot increase by more than one, the temporary balance factor of a node after an insertion will be in the range from –2 to +2. For each node checked, if the temporary balance factor remains in the range from –1 to +1 then only an update of the balance factor and no rotation is necessary. However, if the temporary balance factor becomes less than –1 or greater than +1, the subtree rooted at this node is AVL unbalanced, and a rotation is needed. The various cases of rotations are described in section Rebalancing.

In figure 1, by inserting the new node Z as a child of node X the height of that subtree Z increases from 0 to 1.
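The attachment step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the article: the names `Node`, `find_insertion_point` and `attach` are hypothetical, keys are assumed to be integers, and no rebalancing is performed at this stage.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    key: int
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def find_insertion_point(root: Optional[Node], key: int) -> Tuple[Optional[Node], Optional[str]]:
    """Return (node, direction) describing where a new key would be attached.

    (None, None) means the tree is empty, so the new node becomes the root.
    (node, None) means the key is already stored at `node`.
    """
    if root is None:
        return None, None
    node = root
    while True:
        if key == node.key:
            return node, None
        direction = "left" if key < node.key else "right"
        child = node.left if direction == "left" else node.right
        if child is None:
            return node, direction  # the returned node has no child in this direction
        node = child


def attach(root: Optional[Node], key: int) -> Tuple[Node, Node]:
    """Plain binary-search-tree attachment; returns (root, node holding key)."""
    parent, direction = find_insertion_point(root, key)
    if parent is None:
        new_node = Node(key)
        return new_node, new_node  # empty tree: the new node is the root
    if direction is None:
        return root, parent  # duplicate key: nothing is inserted
    new_node = Node(key)
    setattr(parent, direction, new_node)
    return root, new_node
```

Retracing would start at the node returned by `attach` and walk towards the root.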
Invariant of the retracing loop for an insertion

The height of the subtree rooted by Z has increased by 1. It is already in AVL shape.

    for (X = parent(Z); X != null; X = parent(Z)) { // Loop (possibly up to the root)
        // BalanceFactor(X) has to be updated:
        if (Z == right_child(X)) { // The right subtree increases
            if (BalanceFactor(X) > 0) { // X is right-heavy
                // ===> the temporary BalanceFactor(X) == +2
                // ===> rebalancing is required.
                G = parent(X); // Save parent of X around rotations
                if (BalanceFactor(Z) < 0)      // Right Left Case
                    N = rotate_RightLeft(X,Z); // Double rotation: Right(Z) then Left(X)
                else                           // Right Right Case
                    N = rotate_Left(X,Z);      // Single rotation Left(X)
                // After rotation adapt parent link
            } else {
                if (BalanceFactor(X) < 0) {
                    BalanceFactor(X) = 0; // Z's height increase is absorbed at X.
                    break; // Leave the loop
                }
                BalanceFactor(X) = +1;
                Z = X; // Height(Z) increases by 1
                continue;
            }
        } else { // Z == left_child(X): the left subtree increases
            if (BalanceFactor(X) < 0) { // X is left-heavy
                // ===> the temporary BalanceFactor(X) == -2
                // ===> rebalancing is required.
                G = parent(X); // Save parent of X around rotations
                if (BalanceFactor(Z) > 0)      // Left Right Case
                    N = rotate_LeftRight(X,Z); // Double rotation: Left(Z) then Right(X)
                else                           // Left Left Case
                    N = rotate_Right(X,Z);     // Single rotation Right(X)
                // After rotation adapt parent link
            } else {
                if (BalanceFactor(X) > 0) {
                    BalanceFactor(X) = 0; // Z's height increase is absorbed at X.
                    break; // Leave the loop
                }
                BalanceFactor(X) = -1;
                Z = X; // Height(Z) increases by 1
                continue;
            }
        }
        // After a rotation adapt parent link:
        // N is the new root of the rotated subtree
        // Height does not change: Height(N) == old Height(X)
        parent(N) = G;
        if (G != null) {
            if (X == left_child(G))
                left_child(G) = N;
            else
                right_child(G) = N;
            break;
        } else {
            tree->root = N; // N is the new root of the total tree
            break;
        }
        // There is no fall thru, only break; or continue;
    }
    // Unless loop is left via break, the height of the total tree increases by 1.

In order to update the balance factors of all nodes, first observe that all nodes requiring correction lie from child to parent along the path of the inserted leaf. If the above procedure is applied to nodes along this path, starting from the leaf, then every node in the tree will again have a balance factor of −1, 0, or 1.

The retracing can stop if the balance factor becomes 0, implying that the height of that subtree remains unchanged. If the balance factor becomes ±1 then the height of the subtree increases by one and the retracing needs to continue.
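As a runnable counterpart to the retracing idea, the following Python sketch inserts keys and rebalances on the way back up the recursion. It is a simplification under stated assumptions rather than a transcription of the loop above: each node stores its subtree height instead of a two-bit balance factor, recursion takes the place of parent links, and all names (`Node`, `insert`, `is_avl`, and so on) are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height counted in nodes; an empty subtree has height 0


def height(n):
    return n.height if n else 0


def balance_factor(n):
    # Same sign convention as the article: right height minus left height.
    return height(n.right) - height(n.left)


def _update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def rotate_left(x):
    z = x.right
    x.right, z.left = z.left, x  # inner child of z moves under x, x moves under z
    _update(x)
    _update(z)
    return z


def rotate_right(x):
    z = x.left
    x.left, z.right = z.right, x
    _update(x)
    _update(z)
    return z


def insert(n, key):
    """Insert key into the subtree rooted at n and return the new subtree root."""
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key)
    elif key > n.key:
        n.right = insert(n.right, key)
    else:
        return n  # duplicate keys are ignored in this sketch
    _update(n)
    bf = balance_factor(n)
    if bf > 1:  # temporary balance factor +2
        if balance_factor(n.right) < 0:  # Right Left case
            n.right = rotate_right(n.right)
        return rotate_left(n)  # Right Right case, or second half of Right Left
    if bf < -1:  # temporary balance factor -2
        if balance_factor(n.left) > 0:  # Left Right case
            n.left = rotate_left(n.left)
        return rotate_right(n)
    return n


def is_avl(n):
    """Check the AVL invariant at every node, recomputing heights from scratch."""
    def check(m):
        if m is None:
            return 0
        hl, hr = check(m.left), check(m.right)
        if hl < 0 or hr < 0 or abs(hr - hl) > 1:
            return -1
        return 1 + max(hl, hr)
    return check(n) >= 0
```

Inserting keys in ascending order is the classic worst case for an unbalanced binary search tree; with the rebalancing above, `is_avl` still holds after every insertion.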
If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation after which the subtree has the same height as before (and its root the balance factor 0).

The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.

DELETE

The preliminary steps for deleting a node are described in section
Starting at this subtree, it is necessary to check each of the ancestors for consistency with the invariants of AVL trees. This is called "retracing".

Since with a single deletion the height of an AVL subtree cannot decrease by more than one, the temporary balance factor of a node will be in the range from −2 to +2. If the balance factor remains in the range from −1 to +1 it can be adjusted in accord with the AVL rules. If it becomes ±2 then the subtree is unbalanced and needs to be rotated. The various cases of rotations are described in section Rebalancing.

Invariant of the retracing loop for a deletion

The height of the subtree rooted by N has decreased by 1. It is already in AVL shape.

    for (X = parent(N); X != null; X = G) { // Loop (possibly up to the root)
        G = parent(X); // Save parent of X around rotations
        // BalanceFactor(X) has not yet been updated!
        if (N == left_child(X)) { // the left subtree decreases
            if (BalanceFactor(X) > 0) { // X is right-heavy
                // ===> the temporary BalanceFactor(X) == +2
                // ===> rebalancing is required.
                Z = right_child(X); // Sibling of N (higher by 2)
                b = BalanceFactor(Z);
                if (b < 0)                     // Right Left Case
                    N = rotate_RightLeft(X,Z); // Double rotation: Right(Z) then Left(X)
                else                           // Right Right Case
                    N = rotate_Left(X,Z);      // Single rotation Left(X)
                // After rotation adapt parent link
            } else {
                if (BalanceFactor(X) == 0) {
                    BalanceFactor(X) = +1; // N's height decrease is absorbed at X.
                    break; // Leave the loop
                }
                N = X;
                BalanceFactor(N) = 0; // Height(N) decreases by 1
                continue;
            }
        } else { // N == right_child(X): the right subtree decreases
            if (BalanceFactor(X) < 0) { // X is left-heavy
                // ===> the temporary BalanceFactor(X) == -2
                // ===> rebalancing is required.
                Z = left_child(X); // Sibling of N (higher by 2)
                b = BalanceFactor(Z);
                if (b > 0)                     // Left Right Case
                    N = rotate_LeftRight(X,Z); // Double rotation: Left(Z) then Right(X)
                else                           // Left Left Case
                    N = rotate_Right(X,Z);     // Single rotation Right(X)
                // After rotation adapt parent link
            } else {
                if (BalanceFactor(X) == 0) {
                    BalanceFactor(X) = -1; // N's height decrease is absorbed at X.
                    break; // Leave the loop
                }
                N = X;
                BalanceFactor(N) = 0; // Height(N) decreases by 1
                continue;
            }
        }
        // After a rotation adapt parent link:
        // N is the new root of the rotated subtree
        parent(N) = G;
        if (G != null) {
            if (X == left_child(G))
                left_child(G) = N;
            else
                right_child(G) = N;
            if (b == 0)
                break; // Height does not change: Leave the loop
        } else {
            tree->root = N; // N is the new root of the total tree
            continue;
        }
        // Height(N) decreases by 1 (== old Height(X)-1)
    }
    // Unless loop is left via break, the height of the total tree decreases by 1.

The retracing can stop if the balance factor becomes ±1, meaning that the height of that subtree remains unchanged. If the balance factor becomes 0 then the height of the subtree decreases by one and the retracing needs to continue.

If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation. It depends on the balance factor of the sibling Z (the higher child tree) whether the height of the subtree decreases by one or does not change (the latter, if Z has the balance factor 0).

The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.

SET OPERATIONS AND BULK OPERATIONS

In addition to the single-element insert, delete and lookup operations, several set operations have been defined on AVL trees: union, intersection and set difference. Fast bulk operations on insertions or deletions can then be implemented based on these set functions. These set operations rely on two helper operations, Split and Join. With the new operations, the implementation of AVL trees can be more efficient and highly parallelizable.

* Join: The function Join is on two AVL trees t1 and t2 and a key k
and returns a tree containing all elements in t1 and t2 as well as k. It
requires k to be greater than all keys in t1 and smaller than all keys
in t2. If the heights of the two trees differ by at most one, Join simply
creates a new node with left subtree t1, root k and right subtree t2.
Otherwise, suppose that t1 is higher than t2 by more than one (the
other case is symmetric). Join follows the right spine of t1 until it
reaches a node c which is balanced with t2. At this point a new node
with left child c, root k and right child t2 is created to replace c.
The new node satisfies the AVL invariant, and its height is one greater
than c. The increase in height can increase the height of its ancestors,
possibly invalidating the AVL invariant of those nodes. This can be
fixed either with a double rotation if invalid at the parent or a
single left rotation if invalid higher in the tree, in both cases
restoring the height for any further ancestor nodes. Join will
therefore require at most two rotations. The cost of this function is
proportional to the difference of the heights of the two input trees.
* Split: To split an
The union of two AVL trees t1 and t2 representing sets A and B is an AVL tree t that represents A ∪ B. The following recursive function computes this union:

    FUNCTION union(t1, t2):
        IF t1 = nil:
            RETURN t2
        IF t2 = nil:
            RETURN t1
        (t_less, t_greater) ← split t2 on t1.root
        RETURN join(t1.root, union(left(t1), t_less), union(right(t1), t_greater))

Here, Split is presumed to return two trees: one holding the keys less than its input key, one holding the greater keys. (The algorithm is non-destructive, but an in-place destructive version exists as well.)

The algorithm for intersection or difference is similar, but requires the Join2 helper routine that is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the AVL tree. Since Split calls Join but does not deal with the balancing criteria of AVL trees directly, such an implementation is usually called the "join-based" implementation.

The complexity of each of union, intersection and difference is $O\left(m \log\left(\tfrac{n}{m}+1\right)\right)$ for AVL trees of sizes $m$ and $n$ $(\geq m)$. More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth $O(\log m \log n)$. When $m = 1$, the join-based implementation has the same computational DAG as single-element insertion and deletion.

REBALANCING

If during a modifying operation (e.g. insert, delete) a (temporary) height difference of more than one arises between two child subtrees, the parent subtree has to be "rebalanced". The given repair tools are the so-called tree rotations, because they move the keys only "vertically", so that the ("horizontal") in-order sequence of the keys is fully preserved (which is essential for a binary search tree).

Let Z be the child higher by 2 (see figures 4 and 5).
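The claim that a rotation moves keys only "vertically" can be checked on a tiny example. The Python sketch below is an illustration under assumptions, not part of the article's code: trees are modelled as nested tuples `(left, key, right)` with `None` for an empty subtree, and the helper names are hypothetical. The variable names `t1`, `t23` and `t4` follow the subtree labels used for figure 4.

```python
def inorder(t):
    """In-order key sequence of a tree given as nested tuples (left, key, right)."""
    if t is None:
        return []
    left, key, right = t
    return inorder(left) + [key] + inorder(right)


def height(t):
    """Height counted in nodes; an empty tree has height 0."""
    return 0 if t is None else 1 + max(height(t[0]), height(t[2]))


def rotate_left(x):
    """Left rotation around x: its right child z becomes the new subtree root."""
    t1, x_key, z = x
    t23, z_key, t4 = z
    return ((t1, x_key, t23), z_key, t4)


# X (key 10) has an empty left subtree and a right child Z (key 20) of height 2,
# so the temporary balance factor of X is +2.
x = (None, 10, (None, 20, (None, 30, None)))
rotated = rotate_left(x)
```

Here `inorder(x)` and `inorder(rotated)` yield the same key sequence, while the height drops from 3 to 2.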
Two flavors of rotations are required: simple and double. Rebalancing can be accomplished by a simple rotation (see figure 4) if the inner child of Z, that is the child with a child direction opposite to that of Z (t23 in figure 4, Y in figure 5), is not higher than its sibling, the outer child t4 in both figures. This situation is called "Right Right" or "Left Left" in the literature.

On the other hand, if the inner child (t23 in figure 4, Y in figure 5) of Z is higher than t4 then rebalancing can be accomplished by a double rotation (see figure 5). This situation is called "Right Left" because X is right-heavy and Z is left-heavy (or "Left Right" if X is left-heavy and Z is right-heavy). From a mere graph-theoretic point of view, the two rotations of a double rotation are just single rotations. But they encounter and have to maintain other configurations of balance factors. So, in effect, it is simpler – and more efficient – to specialize, just as in the original paper, where the double rotation is called Большое вращение (lit. big turn) as opposed to the simple rotation which is called Малое вращение (lit. little turn). But there are alternatives: one could e.g. update all the balance factors in a separate walk from leaf to root.

The cost of a rotation, both simple and double, is constant.

For both flavors of rotations a mirrored version, i.e. rotate_Right or rotate_LeftRight, respectively, is required as well.

SIMPLE ROTATION

Figure 4 shows a Right Right situation. In its upper half, node X has two child trees with a balance factor of +2. Moreover, the inner child t23 of Z is not higher than its sibling t4. This can happen by a height increase of subtree t4 or by a height decrease of subtree t1. In the latter case, also the pale situation where t23 has the same height as t4 may occur.

The result of the left rotation is shown in the lower half of the figure. Three links (thick edges in figure 4) and two balance factors are to be updated.
As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2, where it remains if t23 and t4 were of the same height. Otherwise the leaf layer reaches level h+1, so that the height of the rotated tree decreases.

Fig. 4: Simple rotation rotate_Left(X,Z)

Code snippet of a simple left rotation

Input: X = root of subtree to be rotated left
       Z = its right child, not left-heavy, with height == Height(LeftSubtree(X))+2
Result: new root of rebalanced subtree

    node* rotate_Left(node* X, node* Z) {
        // Z is by 2 higher than its sibling
        t23 = left_child(Z); // Inner child of Z
        right_child(X) = t23;
        if (t23 != null)
            parent(t23) = X;
        left_child(Z) = X;
        parent(X) = Z;
        // 1st case, BalanceFactor(Z) == 0, only happens with deletion, not insertion:
        if (BalanceFactor(Z) == 0) { // t23 has been of same height as t4
            BalanceFactor(X) = +1;   // t23 now higher
            BalanceFactor(Z) = -1;   // t4 now lower than X
        } else { // 2nd case happens with insertion or deletion:
            BalanceFactor(X) = 0;
            BalanceFactor(Z) = 0;
        }
        return Z; // return new root of rotated subtree
    }

DOUBLE ROTATION

Figure 5 shows a Right Left situation. In its upper third, node X has two child trees with a balance factor of +2. But unlike figure 4, the inner child Y of Z is higher than its sibling t4. This can happen by a height increase of subtree t2 or t3 (with the consequence that they are of different height) or by a height decrease of subtree t1. In the latter case, it may also occur that t2 and t3 are of the same height.

The result of the first, the right, rotation is shown in the middle third of the figure. (With respect to the balance factors, this rotation is not of the same kind as the other AVL single rotations, because the height difference between Y and t4 is only 1.) The result of the final left rotation is shown in the lower third of the figure.
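The decomposition of a Right Left double rotation into a right rotation around Z followed by a left rotation around X can be traced on a three-key example. The Python sketch below is illustrative only and rests on assumptions not found in the article: trees are nested tuples `(left, key, right)`, balance factors are not stored, and the function names are hypothetical.

```python
def inorder(t):
    """In-order key sequence of a tree given as nested tuples (left, key, right)."""
    if t is None:
        return []
    left, key, right = t
    return inorder(left) + [key] + inorder(right)


def height(t):
    """Height counted in nodes; an empty tree has height 0."""
    return 0 if t is None else 1 + max(height(t[0]), height(t[2]))


def rotate_left(x):
    """Left rotation around x: its right child becomes the new subtree root."""
    t1, x_key, z = x
    inner, z_key, outer = z
    return ((t1, x_key, inner), z_key, outer)


def rotate_right(z):
    """Right rotation around z: its left child y becomes the new subtree root."""
    y, z_key, t4 = z
    t2, y_key, t3 = y
    return (t2, y_key, (t3, z_key, t4))


def rotate_right_left(x):
    """Double rotation: right rotation around Z (right child of X), then left around X."""
    t1, x_key, z = x
    return rotate_left((t1, x_key, rotate_right(z)))


# X = 10 is right-heavy by 2; its right child Z = 30 is left-heavy with inner child Y = 20.
x = (None, 10, ((None, 20, None), 30, None))
single = rotate_left(x)        # a simple rotation alone leaves the height at 3
double = rotate_right_left(x)  # the double rotation reduces the height to 2
```

In this configuration the simple left rotation merely mirrors the imbalance, which is why the inner child Y has to be lifted first.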
Five links (thick edges in figure 5) and three balance factors are to be updated.

As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the double rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2 and after the double rotation it is at level h+1, so that the height of the rotated tree decreases.

Fig. 5: Double rotation rotate_RightLeft(X,Z) = rotate_Right around Z followed by rotate_Left around X

Code snippet of a right-left double rotation

Input: X = root of subtree to be rotated
       Z = its right child, left-heavy, with height == Height(LeftSubtree(X))+2
Result: new root of rebalanced subtree

    node* rotate_RightLeft(node* X, node* Z) {
        // Z is by 2 higher than its sibling
        Y = left_child(Z); // Inner child of Z
        // Y is by 1 higher than sibling
        t3 = right_child(Y);
        left_child(Z) = t3;
        if (t3 != null)
            parent(t3) = Z;
        right_child(Y) = Z;
        parent(Z) = Y;
        t2 = left_child(Y);
        right_child(X) = t2;
        if (t2 != null)
            parent(t2) = X;
        left_child(Y) = X;
        parent(X) = Y;
        // 1st case, BalanceFactor(Y) > 0, happens with insertion or deletion:
        if (BalanceFactor(Y) > 0) { // t3 was higher
            BalanceFactor(X) = -1;  // t1 now higher
            BalanceFactor(Z) = 0;
        } else
        // 2nd case, BalanceFactor(Y) == 0, only happens with deletion, not insertion:
        if (BalanceFactor(Y) == 0) {
            BalanceFactor(X) = 0;
            BalanceFactor(Z) = 0;
        } else
        // 3rd case happens with insertion or deletion:
        { // t2 was higher
            BalanceFactor(X) = 0;
            BalanceFactor(Z) = +1;  // t4 now higher
        }
        BalanceFactor(Y) = 0;
        return Y; // return new root of rotated subtree
    }

COMPARISON TO OTHER STRUCTURES

Both AVL trees and red–black (RB) trees are self-balancing binary search trees and they are related mathematically. Indeed, every AVL tree can be colored red–black, but there are RB trees which are not AVL balanced. Rotations play an important role in maintaining the invariants of both AVL and RB trees.
In the worst case, even without rotations, AVL or RB insertions or deletions require O(log n) inspections and/or updates to AVL balance factors or RB colors, respectively. RB insertions and deletions and AVL insertions require from zero to three tail-recursive rotations and run in amortized O(1) time, thus equally constant on average. AVL deletions requiring O(log n) rotations in the worst case are also O(1) on average. RB trees require storing one bit of information (the color) in each node, while AVL trees mostly use two bits for the balance factor, although, when stored at the children, one bit with meaning «lower than sibling» suffices. The bigger difference between the two data structures is their height limit.

For a tree of size n ≥ 1

* an AVL tree's height is at most $h \leq c \log_2(n + d) + b$ where $\varphi := \tfrac{1+\sqrt{5}}{2} \approx 1.618$ is the golden ratio, $c := \tfrac{1}{\log_2 \varphi} \approx 1.440$, $b := \tfrac{c}{2}\log_2 5 - 2 \approx -0.328$, and $d := 1 + \tfrac{1}{\varphi^4 \sqrt{5}} \approx 1.065$.
* an RB tree's height is at most $h \leq 2 \log_2(n + 1)$.

AVL trees are more rigidly balanced than RB trees with an asymptotic relation AVL⁄RB ≈ 0.720 of the maximal heights. For insertions and deletions, Ben Pfaff shows in 79 measurements a relation of AVL⁄RB between 0.677 and 1.077 with median ≈ 0.947 and geometric mean ≈ 0.910.

SEE ALSO

* Trees
REFERENCES

* Eric Alexander. "AVL Trees".
* Robert Sedgewick, Algorithms, Addison-Wesley, 1983, ISBN 0-201-06672-6, page 199, chapter 15: Balanced Trees.
* Georgy Adelson-Velsky, G.;
* AVL trees are not weight-balanced? (meaning: AVL trees are not μ-balanced?) Thereby: A binary tree is called $\mu$-balanced, with $0 \leq \mu \leq \tfrac{1}{2}$, if for every node $N$, the inequality $\tfrac{1}{2} - \mu \leq \tfrac{|N_l|}{|N|+1} \leq \tfrac{1}{2} + \mu$ holds and $\mu$ is minimal with this property. $|N|$ is the number of nodes below the tree with $N$ as root (including the root) and $N_l$ is the left child node of $N$.
* Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston: Addison-Wesley. p. 459. ISBN 0-201-89685-0.
* More precisely: if the AVL balance information is kept in the child nodes, with meaning "when going upward there is an additional increment in height", this can be done with one bit. Nevertheless, the modifying operations can be programmed more efficiently if the balance information can be checked with one test.
* Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston: Addison-Wesley. p. 460. ISBN 0-201-89685-0.
* Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston: Addison-Wesley. pp. 458–481. ISBN 0-201-89685-0.
* Pfaff, Ben (2004). An Introduction to Binary Search Trees and Balanced Trees. Free Software Foundation, Inc. pp. 107–138.
* Blelloch, Guy E.; Ferizovic, Daniel; Sun, Yihan (2016), "Just Join for Parallel Ordered Sets", Proc. 28th ACM Symp. Parallel Algorithms and Architectures (SPAA 2016), ACM, pp. 253–264, ISBN 978-1-4503-4210-0, doi:10.1145/2935764.2935768.
* Paul E. Black (2015-04-13). "AVL tree". Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Retrieved 2016-07-02.
* Mehlhorn & Sanders 2008, pp. 165, 158
* Dinesh P. Mehta, Sartaj Sahni (Ed.) Handbook of Data Structures and Applications 10.4.2