Insertion on a B+ Tree


The B+ Tree is an efficient and self-balancing tree data structure that is commonly used in databases and file systems to manage large datasets. It is an extension of the B-tree, with the key distinction that all actual data (or records) is stored in the leaf nodes, while the internal nodes only store keys to guide searches.


What is a B+ Tree?

Before diving into the insertion process, it’s important to understand the basic structure and properties of a B+ Tree:

  • Internal Nodes: Store only keys to direct the search for the correct leaf node.
  • Leaf Nodes: Store actual data (or records) and are linked in a doubly linked list to facilitate sequential access.
  • Balanced Tree: All leaf nodes are at the same level, ensuring efficient searching and balancing of the tree.
  • Order of the Tree: The order (m) of the tree defines the maximum number of children a node can have, with each node holding up to m-1 keys.

Insertion in a B+ Tree: Step-by-Step Process

Steps for Insertion in a B+ Tree

  1. Find the Correct Position:

    • Start at the root and search for the correct position to insert the new key. Traverse down the tree, comparing the key with the internal nodes.
    • In each node, choose the correct child pointer based on whether the key is smaller or greater than the node's keys.
  2. Insert in the Leaf Node:

    • Once the correct leaf node is found, insert the new key into that node.
    • If the node has enough space (i.e., the number of keys is less than m-1), the key is inserted directly.
  3. Node Split if Necessary:

    • If the leaf node is full (i.e., it has m-1 keys), it needs to be split into two nodes.
    • The middle key is promoted to the parent node, and the keys are split evenly between the two nodes.
  4. Parent Node Update:

    • If the parent node has space, the new key is inserted.
    • If the parent node is full, it will also need to be split, and the process is repeated up the tree.
  5. Root Node Split:

    • If the root node splits, a new root is created. The height of the tree increases, and the new root will point to the two child nodes.

Detailed Example: Insertion in a B+ Tree

Let’s walk through an example of inserting keys into a B+ tree with a minimum degree t = 3, meaning each node can hold up to 5 keys.

Steps Involved in Insertion:

  1. Start at the root and find the appropriate leaf node for insertion.
  2. Insert the key into the leaf node.
  3. If the leaf node exceeds the maximum key limit (2 * t - 1 = 5), split the node and promote the middle key to the parent node.
  4. Repeat the process: If the parent node exceeds the key limit, it will be split and the middle key will be promoted to its parent, and so on, until the root is reached.

Python Code Example: Insertion in a B+ Tree

Below is a Python implementation that demonstrates how to insert a key into a B+ Tree.

class BPlusTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t  # Minimum degree (defines range for the number of keys)
        self.leaf = leaf  # True if the node is a leaf node
        self.keys = []  # List of keys in the node
        self.children = []  # List of children nodes
        self.next = None  # For linking leaf nodes in a linked list

class BPlusTree:
    def __init__(self, t):
        self.t = t  # Minimum degree
        self.root = BPlusTreeNode(t, True)  # Root node is initially a leaf node

    def insert(self, key):
        root = self.root
        if len(root.keys) == (2 * self.t) - 1:
            # If the root is full, split it
            new_node = BPlusTreeNode(self.t, False)
            new_node.children.append(self.root)
            self.split(new_node, 0)
            self.root = new_node

        # Insert the key in the non-full root
        self.insert_non_full(self.root, key)

    def insert_non_full(self, node, key):
        if node.leaf:
            # Find the correct position and insert the key in the leaf node
            i = len(node.keys) - 1
            while i >= 0 and key < node.keys[i]:
                i -= 1
            node.keys.insert(i + 1, key)
        else:
            # Traverse to the correct child node
            i = len(node.keys) - 1
            while i >= 0 and key < node.keys[i]:
                i -= 1
            i += 1
            child = node.children[i]
            if len(child.keys) == (2 * self.t) - 1:
                # If the child is full, split it
                self.split(node, i)
                if key > node.keys[i]:
                    i += 1
            self.insert_non_full(node.children[i], key)

    def split(self, parent, index):
        # Split the full child node into two nodes
        full_node = parent.children[index]
        new_node = BPlusTreeNode(self.t, full_node.leaf)
        mid = self.t - 1

        # Promote the middle key to the parent
        parent.keys.insert(index, full_node.keys[mid])
        parent.children.insert(index + 1, new_node)

        # Move the keys and children to the new node
        new_node.keys = full_node.keys[mid + 1:]
        full_node.keys = full_node.keys[:mid]
        if not full_node.leaf:
            new_node.children = full_node.children[mid + 1:]
            full_node.children = full_node.children[:mid + 1]

    def inorder_traversal(self):
        # Perform an in-order traversal to print keys in sorted order
        self.inorder_traversal_helper(self.root)

    def inorder_traversal_helper(self, node):
        if node:
            i = 0
            while i < len(node.keys):
                if not node.leaf:
                    self.inorder_traversal_helper(node.children[i])
                print(node.keys[i], end=" ")
                i += 1
            if not node.leaf:
                self.inorder_traversal_helper(node.children[i])

# Example Usage
btree = BPlusTree(3)  # Create a B+ tree with minimum degree 3
keys = [10, 20, 5, 6, 12, 30, 7, 17]

# Insert keys into the B+ Tree
for key in keys:
    btree.insert(key)

# In-order Traversal (prints keys in sorted order)
print("In-order Traversal of B+ Tree after insertion:")
btree.inorder_traversal()

Explanation of the Code

  1. BPlusTreeNode Class:

    • This class represents a node in the B+ Tree.
    • Each node holds a list of keys and children (pointers to child nodes).
    • The next pointer is used for linking leaf nodes in a doubly linked list to make range queries more efficient.
  2. BPlusTree Class:

    • This class manages the B+ Tree structure. It includes methods to insert keys, handle node splitting, and perform in-order traversal.
    • insert(): This is the main method that handles insertion into the B+ Tree.
    • insert_non_full(): This method is called to insert a key into a node that is not full.
    • split(): When a node becomes full, this method splits the node and promotes the middle key to the parent node.
    • inorder_traversal(): This method performs an in-order traversal of the tree, printing the keys in sorted order.