Deletion from a B-tree


A B-tree is a balanced tree data structure that is commonly used in databases and file systems for efficient storage and retrieval of data. While insertion into a B-tree is a straightforward process, deletion from a B-tree is more complex and involves multiple steps to maintain the tree's balance and properties.


What is a B-tree?

Before diving into the deletion process, let’s quickly review the properties of a B-tree of order m:

  • A B-tree is a self-balancing tree that maintains sorted data.
  • Each node can contain between ⌈m/2⌉ and m-1 keys, and the number of children in each node is always one more than the number of keys.
  • All leaf nodes are at the same level, ensuring the tree remains balanced.
  • The root node must have at least one key, unless the tree is empty.

The main goal of B-tree deletion is to maintain these properties after removing a key from the tree.


Deletion in a B-tree: Step-by-step Process

Steps to Delete a Key from a B-tree

The deletion process can be broken down into the following steps:

  1. Search for the Key:

    • First, find the key you want to delete in the B-tree using the standard search procedure. If the key is not found, the operation terminates.
  2. Delete the Key from a Leaf Node:

    • If the key to be deleted is found in a leaf node, simply remove it. However, removing a key might cause the node to have fewer than the minimum number of keys, so you need to handle this.
  3. Delete the Key from an Internal Node:

    • If the key to be deleted is found in an internal node, you need to replace it with either the predecessor (the largest key in the left subtree) or the successor (the smallest key in the right subtree).
    • After replacement, recursively delete the predecessor or successor key from the appropriate child.
  4. Rebalancing the Tree:

    • After deleting a key, if a node has fewer than the minimum number of keys, it must be rebalanced. This can happen in one of three ways:
      • Borrowing a key from a sibling (left or right).
      • Merging with a sibling if borrowing isn’t possible.

Detailed Cases in B-tree Deletion

There are three main cases to consider when deleting a key from a B-tree:

Case 1: Deletion from a Leaf Node

  • If the key is found in a leaf node and there are enough keys in the node (at least ⌈m/2⌉), simply remove the key.
  • If the node now has fewer than the required number of keys, move to case 2 or 3 for rebalancing.

Case 2: Deletion from an Internal Node (with Predecessor or Successor)

  • If the key is found in an internal node, it must be replaced with either the predecessor or the successor key. After the replacement, recursively delete the predecessor or successor from the appropriate child node.
  • This could lead to a rebalancing step if the child node becomes too small.

Case 3: Borrowing or Merging Nodes

  • If a node is underfull (fewer than ⌈m/2⌉ keys) after a deletion, try to borrow a key from a sibling.
  • If borrowing is not possible, merge the underfull node with a sibling, and move the parent’s separating key down to the merged node.

Python Code Example: Deletion in a B-tree

Below is a Python implementation of the B-tree deletion process, including the search, deletion, and rebalancing operations.

class BTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t  # Minimum degree (defines the range for the number of keys)
        self.leaf = leaf  # True if the node is a leaf
        self.keys = []  # List of keys in the node
        self.children = []  # List of child nodes

class BTree:
    def __init__(self, t):
        self.root = BTreeNode(t, True)  # Create a root node that is a leaf
        self.t = t  # Minimum degree of the B-tree

    # Function to delete a key from the B-tree
    def delete(self, key):
        self.delete_from_node(self.root, key)

    # Function to delete a key from a node
    def delete_from_node(self, node, key):
        # If the node is a leaf, simply delete the key if it exists
        if node.leaf:
            if key in node.keys:
                node.keys.remove(key)
            return

        # If the node is internal, search for the key and handle accordingly
        index = self.find_key(node, key)

        if index < len(node.keys) and node.keys[index] == key:
            # Case 1: Key is in an internal node
            if len(node.children[index].keys) >= self.t:
                # Replace with predecessor (max value from the left subtree)
                predecessor = self.get_predecessor(node, index)
                node.keys[index] = predecessor
                self.delete_from_node(node.children[index], predecessor)
            elif len(node.children[index + 1].keys) >= self.t:
                # Replace with successor (min value from the right subtree)
                successor = self.get_successor(node, index)
                node.keys[index] = successor
                self.delete_from_node(node.children[index + 1], successor)
            else:
                # Case 2: Merge with child
                self.merge(node, index)
                self.delete_from_node(node.children[index], key)
        else:
            # Case 3: Key is not in the current node, recursively delete from child
            child = node.children[index]
            if len(child.keys) < self.t:
                self.fill(node, index)

            if index < len(node.keys) and node.keys[index] < key:
                self.delete_from_node(node.children[index + 1], key)
            else:
                self.delete_from_node(node.children[index], key)

    # Helper function to find the index of the key
    def find_key(self, node, key):
        index = 0
        while index < len(node.keys) and node.keys[index] < key:
            index += 1
        return index

    # Helper function to get the predecessor (max key from the left child)
    def get_predecessor(self, node, index):
        current_node = node.children[index]
        while not current_node.leaf:
            current_node = current_node.children[-1]
        return current_node.keys[-1]

    # Helper function to get the successor (min key from the right child)
    def get_successor(self, node, index):
        current_node = node.children[index + 1]
        while not current_node.leaf:
            current_node = current_node.children[0]
        return current_node.keys[0]

    # Helper function to merge child with sibling
    def merge(self, node, index):
        child = node.children[index]
        sibling = node.children[index + 1]
        child.keys.append(node.keys[index])
        child.keys.extend(sibling.keys)
        if not child.leaf:
            child.children.extend(sibling.children)

        node.keys.pop(index)
        node.children.pop(index + 1)

    # Helper function to fill a node if it has fewer than t keys
    def fill(self, node, index):
        if index > 0 and len(node.children[index - 1].keys) >= self.t:
            self.borrow_from_left(node, index)
        elif index < len(node.children) - 1 and len(node.children[index + 1].keys) >= self.t:
            self.borrow_from_right(node, index)
        else:
            if index == len(node.children) - 1:
                self.merge(node, index - 1)
            else:
                self.merge(node, index)

    # Helper function to borrow a key from the left sibling
    def borrow_from_left(self, node, index):
        child = node.children[index]
        sibling = node.children[index - 1]
        child.keys.insert(0, node.keys[index - 1])
        node.keys[index - 1] = sibling.keys.pop()
        if not sibling.leaf:
            child.children.insert(0, sibling.children.pop())

    # Helper function to borrow a key from the right sibling
    def borrow_from_right(self, node, index):
        child = node.children[index]
        sibling = node.children[index + 1]
        child.keys.append(node.keys[index])
        node.keys[index] = sibling.keys.pop(0)
        if not sibling.leaf:
            child.children.append(sibling.children.pop(0))

    # In-order traversal of the B-tree
    def inorder_traversal(self, node):
        if node:
            i = 0
            while i < len(node.keys):
                if not node.leaf:
                    self.inorder_traversal(node.children[i])
                print(node.keys[i], end=" ")
                i += 1
            if not node.leaf:
                self.inorder_traversal(node.children[i])

# Example Usage
btree = BTree(3)  # Create a B-tree with minimum degree 3
keys = [10, 20, 5, 6, 12, 30, 7, 17]

for key in keys:
    btree.delete(key)

# In-order Traversal (prints keys in sorted order)
print("In-order Traversal of B Tree after deletion:")
btree.inorder_traversal(btree.root)