Heap Data Structure


A heap is a specialized tree-based data structure, most commonly a binary tree, that satisfies the heap property. Heaps are widely used in priority queues, heapsort, and graph algorithms such as Dijkstra's shortest path.

The heap can be implemented as either a max-heap or a min-heap, depending on whether the largest or smallest element is given priority. Because that element always sits at the root, it can be read in constant time and removed in O(log n) time, which makes heaps well suited to tasks that need frequent access to the highest or lowest value.


What is a Heap?

A heap is a complete binary tree, which means:

  • Every level of the tree is fully filled, except possibly the last level.
  • All nodes in the last level are as far left as possible.

The heap also satisfies the heap property, which differs depending on whether it is a min-heap or max-heap:

  • In a max-heap, the value of each node is greater than or equal to the values of its children.
  • In a min-heap, the value of each node is less than or equal to the values of its children.

The root of the heap is always either the maximum (in a max-heap) or the minimum (in a min-heap).
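
Because a complete binary tree has no gaps, it is usually stored in a flat array. Assuming 0-based indexing, the children of index i sit at 2i + 1 and 2i + 2. Under that assumption, the max-heap property can be checked with a short sketch like the one below (the name is_max_heap is illustrative, not a library function):

```python
def is_max_heap(values):
    """Return True if the list satisfies the max-heap property.

    Assumes the usual 0-based array layout of a complete binary tree:
    the children of index i are at 2*i + 1 and 2*i + 2.
    """
    n = len(values)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and values[child] > values[i]:
                return False
    return True


print(is_max_heap([30, 20, 5, 10]))  # True
print(is_max_heap([10, 20, 5]))      # False: child 20 exceeds parent 10
```

Flipping the comparison to `<` would give the matching check for a min-heap.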


Types of Heaps

  1. Max-Heap: In a max-heap, the key of each node is greater than or equal to the keys of its children. This ensures that the largest element is always at the root of the heap.

  2. Min-Heap: In a min-heap, the key of each node is less than or equal to the keys of its children. This ensures that the smallest element is always at the root of the heap.
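
In Python, the standard-library heapq module provides a min-heap on top of a plain list. One common way to get max-heap behavior from it, shown here as one option rather than the only one, is to store negated keys. This sketch assumes numeric values, since negation does not apply to arbitrary objects:

```python
import heapq

values = [10, 20, 5, 30]

# Min-heap: heapq keeps the smallest item at index 0.
min_heap = []
for v in values:
    heapq.heappush(min_heap, v)
print(min_heap[0])  # 5

# Max-heap by negation: push -v, negate again when reading.
max_heap = []
for v in values:
    heapq.heappush(max_heap, -v)
print(-max_heap[0])  # 30
```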


Operations on Heaps

There are several fundamental operations that can be performed on a heap:

  1. Insertion: Insert a new element into the heap while maintaining the heap property.
  2. Deletion: Remove the root element (maximum or minimum) and reorganize the heap to maintain the heap property.
  3. Heapify: Convert an unordered array into a valid heap.
  4. Peek: Retrieve the root element of the heap (either maximum or minimum) without removing it.
  5. Heap Sort: Sort an array by repeatedly extracting the root element and reheapifying the remaining elements.

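These operations are efficient, but their costs differ: peek takes O(1) time, insertion and deletion take O(log n), building a heap from an array of n elements takes O(n), and heap sort takes O(n log n).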


1. Insertion in Heap

Inserting a new element into a heap involves:

  • Adding the element to the next available position in the last level (maintaining the complete tree property).
  • "Bubbling up" the element to restore the heap property if necessary (i.e., comparing it with its parent and swapping if needed).
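
To make the two steps concrete, here is a small sketch for a max-heap stored in a list. The helper insert_with_trace is hypothetical and exists only to record the list after the append and after each swap:

```python
def insert_with_trace(heap, value):
    """Insert value into a max-heap list and return the states after each step.

    Illustrative helper: it mutates `heap` in place and records a copy of
    the list after the append and after every swap.
    """
    heap.append(value)          # next free slot in the last level
    states = [list(heap)]
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break               # heap property already holds
        heap[parent], heap[i] = heap[i], heap[parent]
        states.append(list(heap))
        i = parent
    return states


heap = [20, 10, 5]
for state in insert_with_trace(heap, 30):
    print(state)
# [20, 10, 5, 30]
# [20, 30, 5, 10]
# [30, 20, 5, 10]
```

The new element climbs at most one level per swap, which is where the O(log n) bound for insertion comes from.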

2. Deletion in Heap

Deleting the root element (the maximum in a max-heap or the minimum in a min-heap) involves:

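  • Replacing the root with the last element in the heap and removing that last position.
  • "Bubbling down" this element to restore the heap property (i.e., comparing it with its children and swapping it with the larger child in a max-heap, or the smaller child in a min-heap, until neither child violates the property).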
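
A minimal iterative sketch of root deletion for a max-heap stored in a list might look as follows. The function name extract_max and the choice to raise IndexError on an empty heap are assumptions made for this example:

```python
def extract_max(heap):
    """Remove and return the largest item from a max-heap list.

    Illustrative helper; raises IndexError on an empty heap.
    """
    if not heap:
        raise IndexError("extract_max from an empty heap")
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last          # move the last element to the root
        i, n = 0, len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and heap[left] > heap[largest]:
                largest = left
            if right < n and heap[right] > heap[largest]:
                largest = right
            if largest == i:
                break           # neither child is larger
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top


heap = [30, 20, 5, 10]
print(extract_max(heap))  # 30
print(heap)               # [20, 10, 5]
```

Swapping with the larger of the two children matters: swapping with the smaller one could leave the other child larger than its new parent.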

3. Heapify

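The heapify operation is used to rearrange an array into a valid heap. It’s a bottom-up process where we start from the last non-leaf node and move upward toward the root, applying the "bubbling down" operation to each node so that the heap property is satisfied. Built this way, a heap of n elements takes O(n) time, which is faster than inserting the elements one at a time.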
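
The bottom-up process can be sketched as follows for a max-heap. The name build_max_heap is illustrative, and the sketch assumes the 0-based array layout, in which the last non-leaf node is at index n // 2 - 1:

```python
def build_max_heap(values):
    """Rearrange the list in place into a max-heap (bottom-up) and return it."""
    n = len(values)
    # Walk from the last non-leaf node back to the root.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and values[left] > values[largest]:
                largest = left
            if right < n and values[right] > values[largest]:
                largest = right
            if largest == i:
                break           # subtree rooted at i is a valid heap
            values[i], values[largest] = values[largest], values[i]
            i = largest
    return values


data = [5, 10, 30, 20, 15]
build_max_heap(data)
print(data)  # [30, 20, 5, 10, 15]
```

Leaves are skipped because a single node is already a valid heap.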


4. Heap Sort

Heap Sort is a comparison-based sorting algorithm that works by:

  • Building a max-heap (or min-heap) from the unsorted data.
  • Repeatedly extracting the root (maximum or minimum) and placing it at the end of the array.
  • After each extraction, reheapifying the remaining heap.

Heap sort runs in O(n log n) time in the worst case and, when performed in place on the array, needs only O(1) extra space, making it efficient for large datasets.
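
The steps above can be sketched as an in-place ascending sort built on a max-heap. The names heap_sort and sift_down are chosen for this example only:

```python
def heap_sort(values):
    """Sort the list in place in ascending order using a max-heap."""

    def sift_down(i, size):
        # Restore the max-heap property for the subtree rooted at i,
        # looking only at the first `size` items.
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < size and values[left] > values[largest]:
                largest = left
            if right < size and values[right] > values[largest]:
                largest = right
            if largest == i:
                return
            values[i], values[largest] = values[largest], values[i]
            i = largest

    n = len(values)
    for i in range(n // 2 - 1, -1, -1):    # build the heap bottom-up
        sift_down(i, n)
    for end in range(n - 1, 0, -1):
        values[0], values[end] = values[end], values[0]  # max goes to the end
        sift_down(0, end)                  # heap now covers values[:end]
    return values


print(heap_sort([10, 20, 5, 30, 15]))  # [5, 10, 15, 20, 30]
```

Using a max-heap yields ascending order because each extracted maximum is parked just past the shrinking heap region.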


Python Code Example: Max-Heap Implementation

Here’s a Python implementation of a max-heap with basic operations: insertion, deletion of the maximum, and peek, built on the heapify_up and heapify_down helpers:

class MaxHeap:
    def __init__(self):
        self.heap = []

    def parent(self, i):
        return (i - 1) // 2

    def left_child(self, i):
        return 2 * i + 1

    def right_child(self, i):
        return 2 * i + 2

    def heapify_up(self, i):
        # Swap the node at index i with its parent while it is larger.
        while i > 0 and self.heap[self.parent(i)] < self.heap[i]:
            self.heap[self.parent(i)], self.heap[i] = self.heap[i], self.heap[self.parent(i)]
            i = self.parent(i)

    def heapify_down(self, i):
        # Find the largest among the node at index i and its children.
        largest = i
        left = self.left_child(i)
        right = self.right_child(i)

        if left < len(self.heap) and self.heap[left] > self.heap[largest]:
            largest = left

        if right < len(self.heap) and self.heap[right] > self.heap[largest]:
            largest = right

        # If a child is larger, swap with it and continue from that child.
        if largest != i:
            self.heap[i], self.heap[largest] = self.heap[largest], self.heap[i]
            self.heapify_down(largest)

    def insert(self, value):
        self.heap.append(value)
        self.heapify_up(len(self.heap) - 1)

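    def delete_max(self):
        # Remove and return the root; return None if the heap is empty.
        if not self.heap:
            return None
        max_value = self.heap[0]
        last = self.heap.pop()
        if self.heap:
            # Move the last element to the root, then bubble it down.
            self.heap[0] = last
            self.heapify_down(0)
        return max_value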

    def peek(self):
        return self.heap[0] if self.heap else None

    def display(self):
        print(self.heap)

# Example usage
heap = MaxHeap()
heap.insert(10)
heap.insert(20)
heap.insert(5)
heap.insert(30)

print("Heap after insertions:")
heap.display()  # Output: [30, 20, 5, 10]

print("Max value:", heap.delete_max())  # Output: 30
heap.display()  # Output: [20, 10, 5]

Explanation of the Code:

  1. MaxHeap Class: This class defines the max-heap with basic operations.
  2. parent, left_child, right_child: These methods return the indices of the parent and child nodes for a given index.
  3. heapify_up: This method ensures the heap property is maintained by "bubbling up" a node after insertion.
  4. heapify_down: This method restores the heap property by "bubbling down" a node after deletion.
  5. insert: This method inserts a value into the heap and restores the heap property.
  6. delete_max: This method deletes the maximum value (root) from the heap and restores the heap property.
  7. peek: This method returns the maximum value without removing it from the heap.
  8. display: This method prints the current state of the heap.

Real-World Applications of Heaps

  • Priority Queues: Heaps are used to implement priority queues, where elements are processed based on their priority (e.g., operating system task scheduling).
  • Heap Sort: As a sorting algorithm, heap sort is used for efficient, in-place sorting.
  • Graph Algorithms: Heaps are used in algorithms like Dijkstra’s shortest path and Prim’s minimum spanning tree, where a priority queue repeatedly selects the next node with the smallest tentative distance (Dijkstra) or the cheapest connecting edge (Prim).
  • Kth Largest/Smallest Element: Heaps are useful in problems like finding the kth largest or smallest element in a stream of data.
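
Two of these applications can be sketched with Python's standard heapq module. The function names, the adjacency-list graph format, and the sample graph below are assumptions made for illustration. The Dijkstra sketch assumes non-negative edge weights and skips stale queue entries instead of updating priorities in place:

```python
import heapq


def kth_largest(stream, k):
    """Return the k-th largest item in an iterable, or None if it has fewer than k items."""
    if k < 1:
        raise ValueError("k must be at least 1")
    top_k = []                      # min-heap holding the k largest items so far
    for x in stream:
        if len(top_k) < k:
            heapq.heappush(top_k, x)
        elif x > top_k[0]:
            heapq.heapreplace(top_k, x)   # drop the smallest, add x
    return top_k[0] if len(top_k) == k else None


def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> [(neighbor, weight), ...].

    Assumes non-negative edge weights.
    """
    dist = {source: 0}
    queue = [(0, source)]           # min-heap of (distance, node)
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue                # stale entry; a shorter path was found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist


print(kth_largest([10, 20, 5, 30, 15], 2))  # 20

graph = {"A": [("B", 4), ("C", 1)], "C": [("B", 2)], "B": []}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 3, 'C': 1}
```

Keeping only k items in the heap means the stream never has to be stored in full, which is why a min-heap is the usual choice for the kth largest element.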