Spanning trees

As we’ve already discussed, graphs do not necessarily have Hamiltonian paths. Connectedness is a necessary condition for the existence of a Hamiltonian path, but it’s not a sufficient condition. Connected undirected graphs without self-loops do, however, always have spanning tree, i.e. a subgraph which is a tree and has every vertex of the original graph as a vertex.

The largest connected component of the EU border graph, for example, has the spanning tree shown in the accompanying diagram. Edges which belong to the spanning tree are shown in bold, while edges which belong to the original graph but not the spanning tree are dotted.

An undirected graph

To show that every connected graph has a spanning tree we use an argument similar to the one used earlier for the existence of Eulerian trails. For the rest of this paragraph whenever I refer to a tree I will mean a subgraph of the original graph which is a tree. The number of vertices in a tree is at most one less than the number of vertices in the graph, so there must be a tree with a maximal number of vertices. If there were a vertex in the graph which was not in the tree then we could obtain a larger tree as follows. Pick a vertex in the tree and one not in the tree. The graph is connected so there is a walk from the first vertex to the second. Consider the last vertex of the tree which belongs to this walk. The next edge connects that vertex to a vertex not in the tree. Adjoining that edge to the tree gives a graph which is still connected and has no loops and so is a tree. It would therefore be a larger tree, but we chose our tree to be maximal, so this can’t happen. In other words, there is no vertex in the tree but not in the graph, so the tree is a spanning tree.

There are, in general, many possible spanning trees. It’s common in applications that there is a cost function on the edges of the graph and that one wants to minimise the cost of the tree, i.e. the sum of the costs of the edges in the tree, among all the spanning trees of the graph. Such a cost-minimising spanning tree is called a minimal spanning tree. The existence of a minimal spanning tree is easy to prove. Spanning trees are determined by their edges, which are a subset of the edges of the original graph. There are only finitely many edges in the original graph and so only finitely many spanning trees. The total cost determines an ordering of spanning trees and we’ve already seen that orderings of finite sets have minimal elements. If you actually want to find a minimal spanning tree then finding all spanning trees, computing their total costs, and then choosing one with the lowest cost is not an efficient algorithm. A number of efficient algorithms are known though