Kruskal's Algorithm


Kruskal's Algorithm is a popular algorithm in computer science used to find the Minimum Spanning Tree (MST) of a connected, weighted graph. The Minimum Spanning Tree is a subset of the edges of a graph that connects all the vertices together, without any cycles, and with the minimum possible total edge weight.

Kruskal’s algorithm is a greedy algorithm, meaning it builds the solution incrementally by choosing the smallest available option at each step. It was developed by Joseph Kruskal in 1956 and is widely used in network design, circuit design, and other optimization problems.

What is a Spanning Tree?

A spanning tree of a graph is a subgraph that:

  • Includes all the vertices of the original graph.
  • Contains no cycles (i.e., it’s a tree).
  • Has exactly V1 edges, where V is the number of vertices.

The Minimum Spanning Tree (MST) is a spanning tree where the sum of the edge weights is minimized.

How Kruskal's Algorithm Works

Kruskal's algorithm works by sorting all the edges of the graph in non-decreasing order of their weight and adding them to the MST, provided they don’t form a cycle. The algorithm follows these steps:

  1. Sort all edges in non-decreasing order of their weights.
  2. Initialize the MST as an empty set of edges.
  3. Iterate over the sorted edges and add each edge to the MST if it does not form a cycle. This is checked using a Disjoint-Set (Union-Find) data structure.
  4. Repeat the above step until the MST contains exactly V1 edges (where V is the number of vertices).

Steps of Kruskal’s Algorithm:

  1. Sort all the edges of the graph in increasing order of weight.
  2. Create a Disjoint Set data structure to keep track of which vertices are in the same connected component.
  3. Initialize the MST as an empty set.
  4. Pick the smallest edge. If it connects two different components, add it to the MST and union the two components.
  5. Repeat the above step until you have added V1 edges to the MST.

Kruskal’s Algorithm Example

Consider the following graph with 4 nodes and 5 edges:

     10          15
A --------- B --------- C
|           |           |
| 5         | 7         | 8
|           |           |
D --------- E --------- F
     9           6
  • Nodes: A, B, C, D, E, F
  • Edges: (A, B, 10), (A, D, 5), (B, C, 15), (B, E, 7), (C, F, 8), (D, E, 9), (E, F, 6)

The Minimum Spanning Tree will be formed by choosing the smallest weight edges that connect all vertices without forming a cycle.


Kruskal's Algorithm Implementation

Now, let’s take a look at how to implement Kruskal’s algorithm using Python and C++.

1. Kruskal's Algorithm in Python

class DisjointSet:
    def __init__(self, n):
        self.parent = [i for i in range(n)]
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rootX = self.find(x)
        rootY = self.find(y)
        
        if rootX != rootY:
            # Union by rank
            if self.rank[rootX] > self.rank[rootY]:
                self.parent[rootY] = rootX
            elif self.rank[rootX] < self.rank[rootY]:
                self.parent[rootX] = rootY
            else:
                self.parent[rootY] = rootX
                self.rank[rootX] += 1


def kruskal(n, edges):
    disjoint_set = DisjointSet(n)
    mst = []
    edges.sort(key=lambda x: x[2])  # Sort edges by weight

    for u, v, weight in edges:
        if disjoint_set.find(u) != disjoint_set.find(v):
            disjoint_set.union(u, v)
            mst.append((u, v, weight))

    return mst


# Sample usage
edges = [
    (0, 1, 10),  # (Node A, Node B, weight)
    (0, 3, 5),   # (Node A, Node D, weight)
    (1, 2, 15),  # (Node B, Node C, weight)
    (1, 4, 7),   # (Node B, Node E, weight)
    (2, 5, 8),   # (Node C, Node F, weight)
    (3, 4, 9),   # (Node D, Node E, weight)
    (4, 5, 6)    # (Node E, Node F, weight)
]

n = 6  # Number of nodes
mst = kruskal(n, edges)
print("Edges in the Minimum Spanning Tree:", mst)

Explanation:

  • The DisjointSet class implements the Union-Find data structure to efficiently track and merge sets of connected nodes.
  • The kruskal function sorts the edges by weight and iterates through them to add edges to the MST if they don’t form a cycle (checked using Union-Find).
  • The edges in the MST are returned as the result.

2. Kruskal's Algorithm in C++

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

class DisjointSet {
public:
    vector<int> parent, rank;

    DisjointSet(int n) {
        parent.resize(n);
        rank.resize(n, 0);
        for (int i = 0; i < n; ++i) {
            parent[i] = i;
        }
    }

    int find(int x) {
        if (parent[x] != x)
            parent[x] = find(parent[x]);
        return parent[x];
    }

    void unionSets(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX != rootY) {
            if (rank[rootX] > rank[rootY])
                parent[rootY] = rootX;
            else if (rank[rootX] < rank[rootY])
                parent[rootX] = rootY;
            else {
                parent[rootY] = rootX;
                rank[rootX]++;
            }
        }
    }
};

struct Edge {
    int u, v, weight;
    bool operator<(const Edge& other) {
        return weight < other.weight;
    }
};

vector<Edge> kruskal(int n, vector<Edge>& edges) {
    DisjointSet ds(n);
    vector<Edge> mst;
    sort(edges.begin(), edges.end());  // Sort edges by weight

    for (const auto& edge : edges) {
        if (ds.find(edge.u) != ds.find(edge.v)) {
            ds.unionSets(edge.u, edge.v);
            mst.push_back(edge);
        }
    }

    return mst;
}

int main() {
    vector<Edge> edges = {
        {0, 1, 10},  // (Node A, Node B, weight)
        {0, 3, 5},   // (Node A, Node D, weight)
        {1, 2, 15},  // (Node B, Node C, weight)
        {1, 4, 7},   // (Node B, Node E, weight)
        {2, 5, 8},   // (Node C, Node F, weight)
        {3, 4, 9},   // (Node D, Node E, weight)
        {4, 5, 6}    // (Node E, Node F, weight)
    };

    int n = 6;  // Number of nodes
    vector<Edge> mst = kruskal(n, edges);

    cout << "Edges in the Minimum Spanning Tree:\n";
    for (const auto& edge : mst) {
        cout << "(" << edge.u << ", " << edge.v << ") with weight " << edge.weight << "\n";
    }

    return 0;
}

Explanation:

  • The DisjointSet class is the implementation of the Union-Find data structure, which is used to track and merge components.
  • The kruskal function sorts the edges and adds edges to the MST if they don’t form a cycle.
  • The result is printed in the form of edges included in the MST.

Performance Analysis of Kruskal's Algorithm

  • Time Complexity:

    • Sorting the edges takes O(E log E), where E is the number of edges.
    • Each union and find operation takes O(α(V)), where V is the number of vertices and α is the inverse Ackermann function (very slow-growing, almost constant for practical input sizes). So, the total time complexity is O(E log E + E α(V)).
  • Space Complexity: O(V + E), where V is the number of vertices and E is the number of edges, as we need to store the graph and the Disjoint Set.


Applications of Kruskal’s Algorithm

Kruskal’s algorithm has several real-world applications, including:

  1. Network Design: Used in designing the most efficient network by connecting all points with the least cost.
  2. Cluster Analysis: In data mining, Kruskal’s algorithm is used for hierarchical clustering.
  3. Circuit Design: Helps in designing circuits by minimizing the total wire length.
  4. Image Segmentation: Used in computer vision for segmenting images based on minimum connections.