Kruskals's Algorithm
Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component). Kruskal's algorithm is an example of a greedy algorithm.
Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component). Kruskal's algorithm is an example of a greedy algorithm.
Each node is initially its own component. When an edge (u,v) is added, the components of u,v merge. Need a data structure to maintain a collection of disjoint sets.
Operations:
-makeset(x)
make a set consisting only of x
-makeset(x)
make a set consisting only of x
-find(x)
which set does x belong to?
-union(x,y)
merge the sets containing x and y
which set does x belong to?
-union(x,y)
merge the sets containing x and y
| Time: O(E log E) + (V-1) x union + 2E x find |
Union Find Data Structure
Operations:
-makeset(x)
make a set consisting only of x
-find(x)
which set does x belong to?
-union(x,y)
merge the sets containing x and y
Operations:
-makeset(x)
make a set consisting only of x
-find(x)
which set does x belong to?
-union(x,y)
merge the sets containing x and y
Each set is stored as a directed tree. eg. two sets {b,e} and {a,c,d,f,g,h}:
Union by Rank
Parent pointers: p(b) = e, p(g) = d, etc. The root of each set is its representative
(or name).
(or name).
Each node also has a numeric rank.
Union by Rank
Two choices for the union operation:
Choice 1:
root = e
root = e
Each node has a numeric rank. The rank of a root node is related to the height of that tree. Make the root with smaller rank point to the root with larger rank.
Choice 2:
root = h
root = h
Example:
makeset(a), makeset(b), ..., makeset(g)
union(c,h), union(g,d), union(a,f),
union(b,e), union(g,c), union(c,a)
union(c,h), union(g,d), union(a,f),
union(b,e), union(g,c), union(c,a)
Notice:
(1) rank[x] = height of subtree rooted at x
(2) subtree rooted at x has 2rank[x] nodes
(1) rank[x] = height of subtree rooted at x
(2) subtree rooted at x has 2rank[x] nodes
Properties of the Data Structure
We’ve shown:
rank[x] = height of subtree rooted at x subtree rooted at x has >= 2rank[x] nodes
rank[x] = height of subtree rooted at x subtree rooted at x has >= 2rank[x] nodes
Therefore: if there are n elements total, each tree has height at most log n. Time for find, union is O(log n).
Time for Kruskal’s algorithm
= O(E log E) + V x union + 2E x find
= O((V + E) log V)
= O(E log E) + V x union + 2E x find
= O((V + E) log V)
A Further Improvement
Path compression: a little extra computation performed during find, that
makes the tree shorter.
makes the tree shorter.