The B+ Tree is an efficient and self-balancing tree data structure that is commonly used in databases and file systems to manage large datasets. It is an extension of the B-tree, with the key distinction that all actual data (or records) is stored in the leaf nodes, while the internal nodes only store keys to guide searches.
Before diving into the insertion process, it’s important to understand the basic structure and properties of a B+ Tree:
m-1
keys.Find the Correct Position:
Insert in the Leaf Node:
m-1
), the key is inserted directly.Node Split if Necessary:
m-1
keys), it needs to be split into two nodes.Parent Node Update:
Root Node Split:
Let’s walk through an example of inserting keys into a B+ tree with a minimum degree t = 3
, meaning each node can hold up to 5 keys.
2 * t - 1 = 5
), split the node and promote the middle key to the parent node.Below is a Python implementation that demonstrates how to insert a key into a B+ Tree.
class BPlusTreeNode:
def __init__(self, t, leaf=False):
self.t = t # Minimum degree (defines range for the number of keys)
self.leaf = leaf # True if the node is a leaf node
self.keys = [] # List of keys in the node
self.children = [] # List of children nodes
self.next = None # For linking leaf nodes in a linked list
class BPlusTree:
def __init__(self, t):
self.t = t # Minimum degree
self.root = BPlusTreeNode(t, True) # Root node is initially a leaf node
def insert(self, key):
root = self.root
if len(root.keys) == (2 * self.t) - 1:
# If the root is full, split it
new_node = BPlusTreeNode(self.t, False)
new_node.children.append(self.root)
self.split(new_node, 0)
self.root = new_node
# Insert the key in the non-full root
self.insert_non_full(self.root, key)
def insert_non_full(self, node, key):
if node.leaf:
# Find the correct position and insert the key in the leaf node
i = len(node.keys) - 1
while i >= 0 and key < node.keys[i]:
i -= 1
node.keys.insert(i + 1, key)
else:
# Traverse to the correct child node
i = len(node.keys) - 1
while i >= 0 and key < node.keys[i]:
i -= 1
i += 1
child = node.children[i]
if len(child.keys) == (2 * self.t) - 1:
# If the child is full, split it
self.split(node, i)
if key > node.keys[i]:
i += 1
self.insert_non_full(node.children[i], key)
def split(self, parent, index):
# Split the full child node into two nodes
full_node = parent.children[index]
new_node = BPlusTreeNode(self.t, full_node.leaf)
mid = self.t - 1
# Promote the middle key to the parent
parent.keys.insert(index, full_node.keys[mid])
parent.children.insert(index + 1, new_node)
# Move the keys and children to the new node
new_node.keys = full_node.keys[mid + 1:]
full_node.keys = full_node.keys[:mid]
if not full_node.leaf:
new_node.children = full_node.children[mid + 1:]
full_node.children = full_node.children[:mid + 1]
def inorder_traversal(self):
# Perform an in-order traversal to print keys in sorted order
self.inorder_traversal_helper(self.root)
def inorder_traversal_helper(self, node):
if node:
i = 0
while i < len(node.keys):
if not node.leaf:
self.inorder_traversal_helper(node.children[i])
print(node.keys[i], end=" ")
i += 1
if not node.leaf:
self.inorder_traversal_helper(node.children[i])
# Example Usage
btree = BPlusTree(3) # Create a B+ tree with minimum degree 3
keys = [10, 20, 5, 6, 12, 30, 7, 17]
# Insert keys into the B+ Tree
for key in keys:
btree.insert(key)
# In-order Traversal (prints keys in sorted order)
print("In-order Traversal of B+ Tree after insertion:")
btree.inorder_traversal()
BPlusTreeNode Class:
keys
and children
(pointers to child nodes).next
pointer is used for linking leaf nodes in a doubly linked list to make range queries more efficient.BPlusTree Class: