The Info List - Binary Search Tree


In computer science, binary search trees (BST), sometimes called ordered or sorted binary trees, are a particular type of container: data structures that store "items" (such as numbers, names, etc.) in memory. They allow fast lookup, addition and removal of items, and can be used to implement either dynamic sets of items or lookup tables that allow finding an item by its key (e.g., finding the phone number of a person by name).

Binary search trees keep their keys in sorted order, so that lookup and other operations can use the principle of binary search: when looking for a key in a tree (or a place to insert a new key), they traverse the tree from root to leaf, making comparisons to keys stored in the nodes of the tree and deciding, based on the comparison, to continue searching in the left or right subtrees. On average, this means that each comparison allows the operations to skip about half of the tree, so that each lookup, insertion or deletion takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an (unsorted) array, but slower than the corresponding operations on hash tables.

Several variants of the binary search tree have been studied in computer science; this article deals primarily with the basic type, making references to more advanced types when appropriate.


* 1 Definition
  * 1.1 Order relation
* 2 Operations
  * 2.1 Searching
  * 2.2 Insertion
  * 2.3 Deletion
  * 2.4 Traversal
  * 2.5 Verification
* 3 Examples of applications
  * 3.1 Sort
  * 3.2 Priority queue operations
* 4 Types
  * 4.1 Performance comparisons
  * 4.2 Optimal binary search trees
* 5 See also
* 6 Notes
* 7 References
* 8 Further reading
* 9 External links


A binary search tree is a rooted binary tree whose internal nodes each store a key (and optionally, an associated value) and each have two distinguished sub-trees, commonly denoted left and right. The tree additionally satisfies the binary search tree property, which states that the key in each node must be greater than or equal to any key stored in the left sub-tree, and less than or equal to any key stored in the right sub-tree (Cormen et al. 2009, p. 287). The leaves (final nodes) of the tree contain no key and have no structure to distinguish them from one another. Leaves are commonly represented by a special leaf or nil symbol, a NULL pointer, etc.

Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.
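As a rough illustration of the definition above, a node can be sketched in Python as follows. The class name `Node` and its field names are assumptions made for this example; `None` stands in for the nil leaf.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """One internal node: a key, an optional associated record, two subtrees."""
    key: Any
    value: Any = None
    left: Optional["Node"] = None    # None plays the role of the nil leaf
    right: Optional["Node"] = None


# Nodes are ordered by key only; the associated record is never compared.
root = Node(8, "record for 8",
            left=Node(3, "record for 3"),
            right=Node(10, "record for 10"))
```

Here the key 3 in the left subtree is not greater than the root key 8, and the key 10 in the right subtree is not smaller, so this small tree satisfies the binary search tree property.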

The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient; they are also easy to code.

Binary search trees are a fundamental data structure used to construct more abstract data structures such as sets, multisets, and associative arrays. Some of their disadvantages are as follows:

* The shape of the binary search tree depends entirely on the order of insertions and deletions, and can become degenerate.
* When inserting or searching for an element in a binary search tree, the key of each visited node has to be compared with the key of the element to be inserted or found.
* The keys in the binary search tree may be long and the run time may increase.
* After a long intermixed sequence of random insertion and deletion, the expected height of the tree approaches the square root of the number of keys, √n, which grows much faster than log n.


Binary search requires an order relation by which every element (item) can be compared with every other element in the sense of a total preorder. The part of the element that effectively takes part in the comparison is called its key. Whether duplicates, i.e. different elements with the same key, are allowed in the tree does not depend on the order relation, but only on the application.

In the context of binary search trees, a total preorder is realized most flexibly by means of a three-way comparison subroutine.
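A three-way comparison subroutine might look like the following Python sketch. The function names `compare` and `compare_by_key`, and the convention that a record is a tuple whose first field is the key, are assumptions made for illustration only.

```python
def compare(a, b):
    """Three-way comparison: -1 if a < b, 0 if a == b, 1 if a > b."""
    return (a > b) - (a < b)


def compare_by_key(record_a, record_b, key=lambda record: record[0]):
    """Compare two records by their keys only.

    Records with equal keys compare as 0 even when the records differ,
    which is what makes the relation a total preorder on records.
    """
    return compare(key(record_a), key(record_b))
```

For instance, `compare_by_key(("bob", 1), ("bob", 2))` returns 0: the two records are different elements with the same key.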


Binary search trees support three main operations: insertion of elements, deletion of elements, and lookup (checking whether a key is present).


Searching a binary search tree for a specific key can be programmed recursively or iteratively.

We begin by examining the root node. If the tree is null, the key we are searching for does not exist in the tree. Otherwise, if the key equals that of the root, the search is successful and we return the node. If the key is less than that of the root, we search the left subtree. Similarly, if the key is greater than that of the root, we search the right subtree. This process is repeated until the key is found or the remaining subtree is null. If the searched key is not found after a null subtree is reached, then the key is not present in the tree. This is easily expressed as a recursive algorithm (implemented in Python):

    def search_recursively(key, node):
        if node is None or node.key == key:
            return node
        if key < node.key:
            return search_recursively(key, node.left)
        # key > node.key
        return search_recursively(key, node.right)

The same algorithm can be implemented iteratively:

    def search_iteratively(key, node):
        current_node = node
        while current_node is not None:
            if key == current_node.key:
                return current_node
            elif key < current_node.key:
                current_node = current_node.left
            else:  # key > current_node.key
                current_node = current_node.right
        return None

These two examples rely on the order relation being a total order.

If the order relation is only a total preorder, a reasonable extension of the functionality is the following: in case of equality, too, continue searching down to the leaves in a direction specifiable by the user. A binary tree sort equipped with such a comparison function becomes stable.
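One possible reading of this extension is sketched below in Python. The function name `search_with_direction` and its `direction` parameter are hypothetical, and the `Node` class is a minimal stand-in. On equality the search remembers the match and keeps descending, so it returns the first or last node with the given key in in-order sequence.

```python
class Node:
    def __init__(self, key, value=None, left=None, right=None):
        self.key, self.value = key, value
        self.left, self.right = left, right


def search_with_direction(key, node, direction="left"):
    """Return the in-order first ('left') or last ('right') node whose key
    equals `key`, or None if no such node exists."""
    found = None
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            found = node  # remember the match, then keep descending
            node = node.left if direction == "left" else node.right
    return found


# In-order sequence of this tree: 3, 5b, 5a, 5c, 7
tree = Node(5, "a",
            left=Node(3, right=Node(5, "b")),
            right=Node(7, left=Node(5, "c")))
```

With this tree, searching for key 5 with `direction="left"` reaches the node holding "b", and with `direction="right"` the node holding "c".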

Because in the worst case this algorithm must search from the root of the tree to the leaf farthest from the root, the search operation takes time proportional to the tree's height (see tree terminology). On average, binary search trees with n nodes have O(log n) height. However, in the worst case, binary search trees can have O(n) height, when the unbalanced tree resembles a linked list (degenerate tree).
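The gap between the average and the worst case can be observed with a small experiment, sketched here in Python. All names (`Node`, `insert`, `height`) are assumptions for this example, and height is counted as the number of nodes on the longest root-to-leaf path.

```python
import random


class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Insert key without rebalancing (iteratively) and return the root."""
    if root is None:
        return Node(key)
    current = root
    while True:
        side = "left" if key < current.key else "right"
        child = getattr(current, side)
        if child is None:
            setattr(current, side, Node(key))
            return root
        current = child


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 if empty)."""
    levels = 0
    frontier = [root] if root is not None else []
    while frontier:
        levels += 1
        frontier = [child for node in frontier
                    for child in (node.left, node.right) if child is not None]
    return levels


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


keys = list(range(200))
sorted_height = height(build(keys))      # degenerate: one node per level

shuffled = keys[:]
random.Random(0).shuffle(shuffled)       # fixed seed, for repeatability
shuffled_height = height(build(shuffled))
```

Inserting the 200 keys in sorted order produces a chain of height 200, whereas the shuffled order gives a much shallower tree.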


Insertion begins as a search would begin; if the key is not equal to that of the root, we search the left or right subtrees as before. Eventually, we will reach an external node and add the new key-value pair (here encoded as a record 'newNode') as its right or left child, depending on the node's key. In other words, we examine the root and recursively insert the new node into the left subtree if its key is less than that of the root, or into the right subtree if its key is greater than or equal to that of the root.
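The walk described above can also be written as a loop. The following Python sketch is only an illustration; `Node` and `insert_iteratively` are assumed names, and keys equal to an existing key are sent to the right subtree, as in the description.

```python
class Node:
    def __init__(self, key, value=None):
        self.key, self.value = key, value
        self.left, self.right = None, None


def insert_iteratively(root, key, value=None):
    """Attach a new node at the external position the search ends at.

    Returns the root of the tree (the new node if the tree was empty).
    """
    new_node = Node(key, value)
    if root is None:
        return new_node
    current = root
    while True:
        if key < current.key:
            if current.left is None:
                current.left = new_node
                return root
            current = current.left
        else:  # key >= current.key, so duplicates go to the right
            if current.right is None:
                current.right = new_node
                return root
            current = current.right


root = None
for k in [8, 3, 10, 1, 6]:
    root = insert_iteratively(root, k)
```

After these five insertions, 8 is the root, 3 and 10 are its children, and 1 and 6 are the children of 3.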

Here is how a typical binary search tree insertion might be performed in C++:

    Node* insert(Node* root, int key, int value) {
        if (!root)
            root = new Node(key, value);
        else if (key < root->key)
            root->left = insert(root->left, key, value);
        else  // key >= root->key
            root->right = insert(root->right, key, value);
        return root;
    }

The above destructive procedural variant modifies the tree in place. It uses only constant heap space (and the iterative version uses constant stack space as well), but the prior version of the tree is lost. Alternatively, as in the following Python example, we can reconstruct all ancestors of the inserted node; any reference to the original tree root remains valid, making the tree a persistent data structure:

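    def binary_tree_insert(node, key, value):
        if node is None:
            return NodeTree(None, key, value, None)
        if key == node.key:
            return NodeTree(node.left, key, value, node.right)
        if key < node.key:
            return NodeTree(binary_tree_insert(node.left, key, value), node.key, node.value, node.right)
        return NodeTree(node.left, node.key, node.value, binary_tree_insert(node.right, key, value))

To verify that a binary tree satisfies the binary search tree property, a recursive check can pass down the range of keys permitted at each node:

    bool isBST(Node* node, int minKey, int maxKey) {
        if (node == NULL) return true;
        if (node->key < minKey || node->key > maxKey) return false;
        return isBST(node->left, minKey, node->key-1) && isBST(node->right, node->key+1, maxKey);
    }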

The bounds node->key+1 and node->key-1 ensure that only distinct elements are allowed in the BST.

If equal elements should also be allowed, node->key can be used in both places instead.

The initial call to this function can be something like this:

    if (isBST(root, INT_MIN, INT_MAX)) {
        puts("This is a BST.");
    } else {
        puts("This is NOT a BST!");
    }

Essentially, we keep creating a valid range (starting from [INT_MIN, INT_MAX]) and keep shrinking it for each node as we go down recursively.

As pointed out in the Traversal section, an in-order traversal of a binary search tree returns the nodes in sorted order. Thus we only need to keep the last visited node while traversing the tree and check whether its key is smaller (or smaller or equal, if duplicates are to be allowed in the tree) than the current key.
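The in-order variant of the check can be sketched in Python as follows. The names `Node`, `in_order` and `is_bst_in_order`, and the `allow_duplicates` flag, are assumptions made for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def in_order(node):
    """Yield the nodes of the tree in in-order sequence."""
    if node is not None:
        yield from in_order(node.left)
        yield node
        yield from in_order(node.right)


def is_bst_in_order(root, allow_duplicates=False):
    """Check the BST property by comparing each node with the previous one."""
    previous = None
    for node in in_order(root):
        if previous is not None:
            if node.key < previous.key:
                return False
            if node.key == previous.key and not allow_duplicates:
                return False
        previous = node  # only the last visited node is kept
    return True


valid = Node(8, Node(3, Node(1), Node(6)), Node(10))
invalid = Node(8, Node(3, Node(1), Node(9)), Node(10))  # 9 sits left of 8
```

The second tree looks fine locally (9 is greater than its parent 3), but the in-order sequence 1, 3, 9, 8, 10 is not sorted, so the check rejects it.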


Some examples illustrate the use of the basic building blocks above.


Main article: Tree sort

A binary search tree can be used to implement a simple sorting algorithm. Similar to heapsort, we insert all the values we wish to sort into a new ordered data structure (in this case a binary search tree) and then traverse it in order.

The worst-case time of build_binary_tree is O(n²): if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree applied to the sorted values 1, 2, 3, 4, 5 yields the tree (1 (2 (3 (4 (5))))).
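A minimal Python sketch of this sort follows. The name `build_binary_tree` is taken from the text above, but its body here is an assumption, as are the helpers `insert`, `in_order_keys` and `tree_sort`; loops are used instead of recursion so that a degenerate tree does not exhaust the call stack.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Insert key without rebalancing; equal keys go to the right."""
    if root is None:
        return Node(key)
    current = root
    while True:
        if key < current.key:
            if current.left is None:
                current.left = Node(key)
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = Node(key)
                return root
            current = current.right


def build_binary_tree(values):
    root = None
    for value in values:
        root = insert(root, value)
    return root


def in_order_keys(root):
    """Collect the keys in sorted order using an explicit stack."""
    keys, stack, node = [], [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        keys.append(node.key)
        node = node.right
    return keys


def tree_sort(values):
    return in_order_keys(build_binary_tree(values))
```

Calling `tree_sort([5, 2, 8, 2, 1])` returns `[1, 2, 2, 5, 8]`, and `build_binary_tree([1, 2, 3, 4, 5])` produces the right-leaning chain described above.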

There are several schemes for overcoming this flaw with simple binary trees; the most common is the self-balancing binary search tree. If this same procedure is done using such a tree, the overall worst-case time is O(n log n), which is asymptotically optimal for a comparison sort. In practice, the added overhead in time and space for a tree-based sort (particularly for node allocation) makes it inferior to other asymptotically optimal sorts such as heapsort for static list sorting. On the other hand, it is one of the most efficient methods of incremental sorting, adding items to a list over time while keeping the list sorted at all times.


Binary search trees can serve as priority queues: structures that allow insertion of arbitrary keys as well as lookup and deletion of the minimum (or maximum) key. Insertion works as previously explained. Find-min walks the tree, following left pointers as far as it can without hitting a leaf:

    // Precondition: T is not a leaf
    FUNCTION find-min(T):
        WHILE hasLeft(T):
            T ← left(T)
        RETURN key(T)

Find-max is analogous: follow right pointers as far as possible. Delete-min (max) can simply look up the minimum (maximum), then delete it. This way, insertion and deletion both take logarithmic time, just as they do in a binary heap, but unlike a binary heap and most other priority queue implementations, a single tree can support all of find-min, find-max, delete-min and delete-max at the same time, making binary search trees suitable as double-ended priority queues (Mehlhorn & Sanders 2008, p. 156).
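The four operations can be sketched in Python as shown below. This is an illustration on a plain, unbalanced tree, so the logarithmic bounds hold only while the tree stays reasonably balanced. All function names are assumptions, and the delete functions splice the extreme node out directly rather than calling a general deletion routine.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(root, key):
    """Insert key (equal keys go right) and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def find_min(root):
    """Precondition: root is not None."""
    while root.left is not None:
        root = root.left
    return root.key


def find_max(root):
    """Precondition: root is not None."""
    while root.right is not None:
        root = root.right
    return root.key


def delete_min(root):
    """Remove the smallest key; return (key, new_root)."""
    if root.left is None:              # the root itself is the minimum
        return root.key, root.right
    parent = root
    while parent.left.left is not None:
        parent = parent.left
    smallest = parent.left
    parent.left = smallest.right       # the minimum has no left child
    return smallest.key, root


def delete_max(root):
    """Remove the largest key; return (key, new_root)."""
    if root.right is None:             # the root itself is the maximum
        return root.key, root.left
    parent = root
    while parent.right.right is not None:
        parent = parent.right
    largest = parent.right
    parent.right = largest.left        # the maximum has no right child
    return largest.key, root
```

Because both ends are reachable from the same root, one tree built with `insert` can serve `delete_min` and `delete_max` calls in any interleaving.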


There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap (tree heap), each node also holds a (randomly chosen) priority and the parent node has higher priority than its children. Tango trees are trees optimized for fast searches. T-trees are binary search trees optimized to reduce storage space overhead, widely used for in-memory databases.

A degenerate tree is a tree where for each parent node, there is only one associated child node. It is unbalanced and, in the worst case, performance degrades to that of a linked list. If your add node function does not handle re-balancing, then you can easily construct a degenerate tree by feeding it with data that is already sorted. What this means is that in a performance measurement, the tree will essentially behave like a linked list data structure.


D. A. Heger (2004) presented a performance comparison of binary search trees. Treap was found to have the best average performance, while red-black tree was found to have the smallest amount of performance variations.


Main article: Optimal binary search tree

Tree rotations are very common internal operations in binary trees to keep perfect, or near-to-perfect, internal balance in the tree.
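For readers unfamiliar with rotations, a single rotation can be sketched in Python as follows. The names `Node`, `rotate_right` and `rotate_left` are assumptions for this example; a rotation changes the shape of the tree while preserving the in-order sequence of keys.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(y):
    """Rotate the subtree rooted at y to the right; return its new root.

          y            x
         / \          / \
        x   C  -->   A   y
       / \              / \
      A   B            B   C
    """
    x = y.left
    y.left = x.right   # subtree B moves under y
    x.right = y
    return x


def rotate_left(x):
    """Mirror image of rotate_right; return the new subtree root."""
    y = x.right
    x.right = y.left   # subtree B moves under x
    y.left = x
    return y


# A left-leaning chain 3 -> 2 -> 1 becomes balanced after one rotation.
chain = Node(3, Node(2, Node(1)))
balanced = rotate_right(chain)
```

After the rotation, 2 is the root with children 1 and 3, so the height drops from three nodes to two.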

If we do not plan on modifying a search tree, and we know exactly how often each item will be accessed, we can construct an optimal binary search tree, which is a search tree where the average cost of looking up an item (the expected search cost) is minimized.
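One classic way to find such a tree is dynamic programming over ranges of the sorted keys. The sketch below is a simplified O(n³) version, assuming that only successful lookups occur and that looking up a key at depth d costs d + 1 comparisons; the function name `optimal_bst_cost` is hypothetical. It is not necessarily the fastest known method.

```python
def optimal_bst_cost(frequencies):
    """Return (minimal total weighted lookup cost, index of the best root).

    frequencies[i] is the access count of the i-th smallest key.
    """
    n = len(frequencies)
    prefix = [0]
    for f in frequencies:
        prefix.append(prefix[-1] + f)

    # cost[i][j]: minimal cost of a tree over keys i .. j-1 (0 when empty)
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[None] * (n + 1) for _ in range(n + 1)]

    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Every key in the range sits one level deeper than in its own
            # subtree, which adds the total weight of the range once.
            weight = prefix[j] - prefix[i]
            best, best_root = min(
                (cost[i][r] + cost[r + 1][j], r) for r in range(i, j)
            )
            cost[i][j] = best + weight
            root[i][j] = best_root

    return cost[0][n], root[0][n]
```

For three keys with access counts 34, 8 and 50, the function returns a cost of 142 with the third key (index 2) as root: placing the most frequently accessed key at the root outweighs keeping the tree balanced.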

Even if we only have estimates of the search costs, such a system can considerably speed up lookups on average. For example, if you have a BST of English words used in a spell checker, you might balance the tree based on word frequency in text corpora, placing words like "the" near the root and words like "agerasia" near the leaves. Such a tree might be compared with Huffman trees, which similarly seek to place frequently used items near the root in order to produce a dense information encoding; however, Huffman trees store data elements only in leaves, and these elements need not be ordered.

If we do not know in advance the sequence in which the elements in the tree will be accessed, we can use splay trees, which are asymptotically as good as any static search tree we can construct for any particular sequence of lookup operations.

Alphabetic trees are Huffman trees with the additional constraint on order, or, equivalently, search trees with the modification that all elements are stored in the leaves. Faster algorithms exist for optimal alphabetic binary trees (OABTs).


* Binary search algorithm
* Search tree
* Self-balancing binary search tree
* AVL tree
* Red–black tree
* Randomized binary search tree
* Tango tree


* The notion of an average BST is made precise as follows. Let a random BST be one built using only insertions out of a sequence of unique elements in random order (all permutations equally likely); then the expected height of the tree is O(log n). If deletions are allowed as well as insertions, "little is known about the average height of a binary search tree" (Cormen et al. 2009, p. 300).
* Of course, a generic software package has to work the other way around: it has to leave the user data untouched and to furnish E with all the BST links to and from D.


* Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009). Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.
* Sedgewick, Robert; Wayne, Kevin (2011). Algorithms (4th ed.). Pearson Education. ISBN 978-0-321-57351-3. p. 410.
* Mehlhorn, Kurt; Sanders, Peter (2008). Algorithms and Data Structures: The Basic Toolbox (PDF). Springer.
* Heger, Dominique A. (2004). "A Disquisition on The Performance Behavior of Binary Search Tree Data Structures" (PDF). European Journal for the Informatics Professional. 5 (5): 67–75.
* Gonnet, Gaston. "Optimal Binary Search Trees". Scientific Computation. ETH Zürich. Archived from the original on 12 October 2014. Retrieved 1 December 2013.


* This article incorporates public domain material from the NIST document: Black, Paul E. "Binary Search Tree". Dictionary of Algorithms and Data Structures.
* Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "12: Binary search trees, 15.5: Optimal binary search trees". Introduction to Algorithms (2nd ed.). MIT Press & McGraw-Hill. pp. 253–272, 356–363. ISBN 0-262-03293-7.
* Jarc, Duane J. (3 December 2005). "Binary Tree Traversals". Interactive Data Structure Visualizations. University of Maryland.
* Knuth, Donald (1997). "6.2.2: Binary Tree Searching". The Art of Computer Programming. 3: "Sorting and Searching" (3rd ed.). Addison-Wesley. pp. 426–458. ISBN 0-201-89685-0.
* Long, Sean. "Binary Search Tree" (PPT). Data Structures and Algorithms Visualization-A PowerPoint Slides Based Approach. SUNY Oneonta.
* Parlante, Nick (2001). "Binary Trees". CS Education Library. Stanford University.