A B+ tree is an
m-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves.
The root may be either a leaf or a node with two or more children.
A B+ tree can be viewed as a
B-tree in which each node contains only keys (not key–value pairs), and to which an additional level is added at the bottom with linked leaves.
The primary value of a B+ tree is in storing data for efficient retrieval in a
block-oriented storage context — in particular,
filesystems. This is primarily because unlike
binary search trees, B+ trees have very high fanout (number of pointers to child nodes in a node,
typically on the order of 100 or more), which reduces the number of I/O operations required to find an element in the tree.
History
There is no single paper introducing the B+ tree concept. Instead, the notion of maintaining all data in leaf nodes is repeatedly brought up as an interesting variant.
Douglas Comer notes in an early survey of B-trees (which also covers B+ trees) that the B+ tree was used in IBM's
VSAM data access software, and refers to an IBM published article from 1973.
Structure
As with other trees, B+ trees can be represented as a collection of three types of nodes: ''root'', ''internal'', and ''leaf''. These node types have the following properties:
* Individual nodes will have either ''keys'' or ''children'', but never both at the same time: this is the main distinction from a
B-tree.
* The ''order'' or ''branching factor'' of a B+ tree measures the capacity of internal nodes, i.e. their maximum allowed number of direct child nodes. This value is constant over the entire tree.
* Internal nodes have no keys, but will always have nonzero children. The ''actual'' number of children for a given internal node is constrained such that
. Each child is then referred to as
for