1 00:00:00,000 --> 00:00:00,870 2 00:00:00,870 --> 00:00:03,490 One of the multiple definitions of trees that we saw 3 00:00:03,490 --> 00:00:08,570 is that it's a minimum edge simple graph that connects up 4 00:00:08,570 --> 00:00:09,950 a bunch of vertices. 5 00:00:09,950 --> 00:00:14,510 And that leads to the idea of finding a spanning tree 6 00:00:14,510 --> 00:00:17,520 within a simple graph that maintains the same connections. 7 00:00:17,520 --> 00:00:19,920 So let's begin with a precise technical definition. 8 00:00:19,920 --> 00:00:22,530 A spanning subgraph of a graph, G, 9 00:00:22,530 --> 00:00:25,740 is simply a subgraph that has all the vertices of G. 10 00:00:25,740 --> 00:00:28,040 So, again, a subgraph of a graph means 11 00:00:28,040 --> 00:00:30,050 it's got a subset of the vertices, 12 00:00:30,050 --> 00:00:31,870 and a subset of the edges. 13 00:00:31,870 --> 00:00:35,590 Spanning subgraph is that has all of the vertices, 14 00:00:35,590 --> 00:00:37,560 but a subset of the edges. 15 00:00:37,560 --> 00:00:39,940 And the definition of a spanning tree 16 00:00:39,940 --> 00:00:43,290 is a spanning subgraph that is a tree. 17 00:00:43,290 --> 00:00:47,750 Now, not all graphs are going to have a spanning tree, 18 00:00:47,750 --> 00:00:49,670 because the tree has to be connected. 19 00:00:49,670 --> 00:00:51,350 If the original graph is not connected, 20 00:00:51,350 --> 00:00:54,120 there's no way you can find a spanning tree using 21 00:00:54,120 --> 00:00:56,010 only the edges that are there already. 22 00:00:56,010 --> 00:00:58,470 But it's going to turn out that if the graph is connected, 23 00:00:58,470 --> 00:01:00,450 it's guaranteed to have a spanning tree. 24 00:01:00,450 --> 00:01:01,930 Let's look at an example. 25 00:01:01,930 --> 00:01:03,480 Here's a simple graph. 26 00:01:03,480 --> 00:01:06,380 And what I want, then, is a spanning tree, a selection 27 00:01:06,380 --> 00:01:10,120 of edges, that connect up all the vertices such that we're 28 00:01:10,120 --> 00:01:14,810 only using edges in the original graph, and they form a tree. 29 00:01:14,810 --> 00:01:16,340 There it is. 30 00:01:16,340 --> 00:01:19,760 So if you check on these magenta edges that I've highlighted, 31 00:01:19,760 --> 00:01:20,990 they define a tree. 32 00:01:20,990 --> 00:01:24,680 I haven't used three of the edges in the original graph. 33 00:01:24,680 --> 00:01:26,675 Now this particular choice of spanning tree 34 00:01:26,675 --> 00:01:27,550 is kind of arbitrary. 35 00:01:27,550 --> 00:01:29,341 In general, there's lots of spanning trees. 36 00:01:29,341 --> 00:01:31,670 Here's another one, this time with green edges. 37 00:01:31,670 --> 00:01:34,360 Again, I'm using only edges from the original graph-- 38 00:01:34,360 --> 00:01:36,210 I've left out three different ones, 39 00:01:36,210 --> 00:01:39,510 and used a different set of edges to form the tree. 40 00:01:39,510 --> 00:01:40,710 But there it is. 41 00:01:40,710 --> 00:01:44,630 It's got no cycles, and it spans the graph 42 00:01:44,630 --> 00:01:47,480 because every vertex in the graph is part of it. 43 00:01:47,480 --> 00:01:51,337 And of course, it's connected since it's a tree. 44 00:01:51,337 --> 00:01:53,420 There's actually some lovely combinatorial theory, 45 00:01:53,420 --> 00:01:56,930 which enables you to calculate the number of spanning trees 46 00:01:56,930 --> 00:01:59,600 in a simple graph without too much difficulty, 47 00:01:59,600 --> 00:02:01,530 just given its adjacency matrix. 48 00:02:01,530 --> 00:02:03,600 But we're not going to go into that for now. 49 00:02:03,600 --> 00:02:06,130 50 00:02:06,130 --> 00:02:08,729 First remark is that every connected graph 51 00:02:08,729 --> 00:02:10,250 is going to have a spanning tree, 52 00:02:10,250 --> 00:02:13,820 and the reason is, you just pick a minimal edge tree-- 53 00:02:13,820 --> 00:02:18,145 a minimal edge connected spanning subgraph, rather. 54 00:02:18,145 --> 00:02:20,700 So G, itself, if it's connected, is 55 00:02:20,700 --> 00:02:23,330 by definition, a spanning graph of itself, 56 00:02:23,330 --> 00:02:25,160 because it's got all its own vertices. 57 00:02:25,160 --> 00:02:26,920 That means by the well-ordering principle, 58 00:02:26,920 --> 00:02:30,120 there's going to be a connected spanning subgraph 59 00:02:30,120 --> 00:02:32,230 with a minimum number of edges. 60 00:02:32,230 --> 00:02:35,290 And that one, given that it has a minimum number of edges, 61 00:02:35,290 --> 00:02:40,930 it's guaranteed to be a spanning tree. 62 00:02:40,930 --> 00:02:43,389 Now the problem gets more interesting 63 00:02:43,389 --> 00:02:44,930 when it has a little more structure-- 64 00:02:44,930 --> 00:02:49,530 instead of just trying to find a spanning tree that 65 00:02:49,530 --> 00:02:52,280 has a minimum number of edges, it's 66 00:02:52,280 --> 00:02:55,260 quite typical in applications that the edges have weights, 67 00:02:55,260 --> 00:02:57,900 and we want to find a minimum weight spanning tree. 68 00:02:57,900 --> 00:02:59,410 So here's an example where we have 69 00:02:59,410 --> 00:03:03,610 a simple graph with a bunch of edges, and a bunch of vertices. 70 00:03:03,610 --> 00:03:05,960 And the edges all are assigned, in this case, an integer 71 00:03:05,960 --> 00:03:06,750 weight. 72 00:03:06,750 --> 00:03:08,720 Now the motivation for this kind of graph, 73 00:03:08,720 --> 00:03:12,010 as you could think of, these weights on the graph 74 00:03:12,010 --> 00:03:16,210 as indicating the cost of transporting some quantity 75 00:03:16,210 --> 00:03:18,410 commodity from this vertex to that vertex, 76 00:03:18,410 --> 00:03:19,580 directly by a road. 77 00:03:19,580 --> 00:03:25,760 Or the time it takes to transmit a signal over this channel. 78 00:03:25,760 --> 00:03:28,080 Lots of ways that simple graphs are 79 00:03:28,080 --> 00:03:32,500 used to model issues of communication 80 00:03:32,500 --> 00:03:35,470 among various locations. 81 00:03:35,470 --> 00:03:39,480 And it's quite typical that the channels and connections 82 00:03:39,480 --> 00:03:41,650 between them have different costs. 83 00:03:41,650 --> 00:03:43,500 And it's a natural question to say, OK, 84 00:03:43,500 --> 00:03:48,526 what's the minimum cost overall tree structure that will enable 85 00:03:48,526 --> 00:03:50,400 me to have everything connected to everything 86 00:03:50,400 --> 00:03:54,270 else in the same way, but that I can tolerate 87 00:03:54,270 --> 00:03:56,379 some of my edges going down? 88 00:03:56,379 --> 00:03:58,920 And I still would like to have the cheapest kind of tree that 89 00:03:58,920 --> 00:04:00,830 spans them all. 90 00:04:00,830 --> 00:04:03,340 Well, there's a fairly simple way 91 00:04:03,340 --> 00:04:06,020 to construct such a minimum weight spanning tree, 92 00:04:06,020 --> 00:04:08,390 and that's what we're going to talk about now. 93 00:04:08,390 --> 00:04:10,360 How do you find it? 94 00:04:10,360 --> 00:04:14,530 Well, the idea is to build it using grey edges. 95 00:04:14,530 --> 00:04:20,190 So what that means is that starting off with the vertices, 96 00:04:20,190 --> 00:04:22,730 we're going to start building a tree. 97 00:04:22,730 --> 00:04:25,630 And at any point, we will have a bunch 98 00:04:25,630 --> 00:04:29,030 of edges that are going to be part of our spanning tree-- 99 00:04:29,030 --> 00:04:31,930 that means that the edges don't have any cycles among them, 100 00:04:31,930 --> 00:04:35,000 they're a so-called forest, but they're not yet connected. 101 00:04:35,000 --> 00:04:37,120 And at each stage in this procedure, 102 00:04:37,120 --> 00:04:40,030 we're going to look at the connected components 103 00:04:40,030 --> 00:04:42,430 of the graph that we have at this moment, 104 00:04:42,430 --> 00:04:47,030 and color them black or white. 105 00:04:47,030 --> 00:04:49,910 And then look at the gray edges. 106 00:04:49,910 --> 00:04:53,880 So a grey edge is defined to be an edge where one end point is 107 00:04:53,880 --> 00:04:58,200 black, and the other end point is white. 108 00:04:58,200 --> 00:05:02,600 And what I'm going to do, at any stage in the procedure 109 00:05:02,600 --> 00:05:05,220 as I'm growing my minimum weight spanning tree, 110 00:05:05,220 --> 00:05:07,620 is I'm going to look at all the gray edges 111 00:05:07,620 --> 00:05:10,410 and pick a minimum weight gray edge. 112 00:05:10,410 --> 00:05:13,330 Well, let's do an example to get this clear. 113 00:05:13,330 --> 00:05:15,710 So to begin with, I don't have any edges. 114 00:05:15,710 --> 00:05:18,370 All I have are the isolated vertices. 115 00:05:18,370 --> 00:05:21,930 So it means that I have six connected components, 116 00:05:21,930 --> 00:05:25,990 each of which is a single vertex with no edges. 117 00:05:25,990 --> 00:05:28,710 That says that I'm allowed to color them black and white 118 00:05:28,710 --> 00:05:31,762 in any way I choose, and I will do that. 119 00:05:31,762 --> 00:05:33,220 The only constraint on the coloring 120 00:05:33,220 --> 00:05:36,080 is there has to be at least one black component, and one 121 00:05:36,080 --> 00:05:37,580 white component. 122 00:05:37,580 --> 00:05:39,230 So there's an arbitrary coloring-- 123 00:05:39,230 --> 00:05:41,950 I've colored two of the vertices white, and the other four 124 00:05:41,950 --> 00:05:42,840 black. 125 00:05:42,840 --> 00:05:44,350 Now, in this particular coloring-- 126 00:05:44,350 --> 00:05:46,939 I could've chosen any one, but I chose this one-- 127 00:05:46,939 --> 00:05:47,980 where are the gray edges? 128 00:05:47,980 --> 00:05:51,310 Well, I've highlighted them by thickening them. 129 00:05:51,310 --> 00:05:54,230 So this is a gray edge, because it's black and white. 130 00:05:54,230 --> 00:05:57,030 This is a gray edge because it's black and white-- 131 00:05:57,030 --> 00:05:58,990 black and white, black and white. 132 00:05:58,990 --> 00:06:01,330 This is not a gray edge, because it's white and white. 133 00:06:01,330 --> 00:06:03,580 This is not a gray edge, because it's black and black. 134 00:06:03,580 --> 00:06:05,530 So that's a simple enough idea. 135 00:06:05,530 --> 00:06:08,240 Now what I'm supposed to do is among my gray edges, 136 00:06:08,240 --> 00:06:09,885 pick the one with the minimum weight. 137 00:06:09,885 --> 00:06:12,010 Well, if you look at the weights of the gray edges, 138 00:06:12,010 --> 00:06:15,512 I got a four, a four, a nine, a seven, and a two. 139 00:06:15,512 --> 00:06:19,100 The two is the minimum weight gray edge. 140 00:06:19,100 --> 00:06:22,920 I'm going to choose that to start building my tree. 141 00:06:22,920 --> 00:06:26,020 So at this moment, once I've committed to that magenta 142 00:06:26,020 --> 00:06:30,620 edge, what I now have is a graph with five components-- 143 00:06:30,620 --> 00:06:34,310 namely the component defined by this edge, with two vertices. 144 00:06:34,310 --> 00:06:37,000 And the other four isolated vertices, 145 00:06:37,000 --> 00:06:40,170 which still don't have any edges connecting them 146 00:06:40,170 --> 00:06:42,960 in the structure of magenta edges 147 00:06:42,960 --> 00:06:45,690 that I'm building to be my minimum spanning tree. 148 00:06:45,690 --> 00:06:48,110 So according to the rules now, with these five components, 149 00:06:48,110 --> 00:06:49,240 I can recolor them. 150 00:06:49,240 --> 00:06:51,300 And as long as I recolor them in a way 151 00:06:51,300 --> 00:06:53,670 that this component gets the same color-- 152 00:06:53,670 --> 00:06:54,880 there's a recoloring. 153 00:06:54,880 --> 00:06:58,450 I've made both of those vertices in this component black, 154 00:06:58,450 --> 00:07:02,120 and the other four vertices-- which are isolated components-- 155 00:07:02,120 --> 00:07:03,600 I can color arbitrarily. 156 00:07:03,600 --> 00:07:05,720 So here's my new coloring. 157 00:07:05,720 --> 00:07:07,440 Now, again, once I have this coloring, 158 00:07:07,440 --> 00:07:10,910 I can proceed to identify the gray edges. 159 00:07:10,910 --> 00:07:11,780 There they are. 160 00:07:11,780 --> 00:07:13,880 And this time there are only two gray edges, 161 00:07:13,880 --> 00:07:16,347 because I chose to have only one white vertex. 162 00:07:16,347 --> 00:07:18,180 There's a gray edge and there's a gray edge. 163 00:07:18,180 --> 00:07:20,790 And of course, the minimum weight among the two gray edges 164 00:07:20,790 --> 00:07:21,750 is three. 165 00:07:21,750 --> 00:07:27,000 So that's going to be my next edge in my minimum weight 166 00:07:27,000 --> 00:07:28,330 spanning tree that I'm growing. 167 00:07:28,330 --> 00:07:29,380 What's next? 168 00:07:29,380 --> 00:07:33,060 Well now, I have four components left. 169 00:07:33,060 --> 00:07:35,680 Here's one component defined by that edge, here's 170 00:07:35,680 --> 00:07:38,150 another connected component defined by that edge. 171 00:07:38,150 --> 00:07:40,820 And these two vertices are isolated, still, 172 00:07:40,820 --> 00:07:42,830 so they're components all by themselves. 173 00:07:42,830 --> 00:07:45,860 And the rule is, recolor in such a way 174 00:07:45,860 --> 00:07:48,290 that both of these vertices in that component 175 00:07:48,290 --> 00:07:49,630 have the same color. 176 00:07:49,630 --> 00:07:51,210 All the vertices in this component 177 00:07:51,210 --> 00:07:53,060 have the same color-- I could switch them 178 00:07:53,060 --> 00:07:54,630 from black to white, in fact I will-- 179 00:07:54,630 --> 00:07:56,800 and those can be colored arbitrarily. 180 00:07:56,800 --> 00:07:58,190 Let's do that. 181 00:07:58,190 --> 00:08:00,200 There's a recoloring. 182 00:08:00,200 --> 00:08:04,150 Now this component is all white, that component is all white. 183 00:08:04,150 --> 00:08:07,065 These two separate components happen both to be black. 184 00:08:07,065 --> 00:08:09,327 185 00:08:09,327 --> 00:08:11,910 I could have colored one of them white, and one of them black. 186 00:08:11,910 --> 00:08:18,050 I need to have one black, given the other commitment to colors. 187 00:08:18,050 --> 00:08:21,632 So now, again, we could find a minimum weight edge, 188 00:08:21,632 --> 00:08:24,520 a grey edge I guess it would be. 189 00:08:24,520 --> 00:08:27,780 There are two ties for minimum, both of those ones. 190 00:08:27,780 --> 00:08:30,540 And I proceed in this way, and I wind up 191 00:08:30,540 --> 00:08:33,090 with this minimum weight spanning tree. 192 00:08:33,090 --> 00:08:34,309 That's the procedure. 193 00:08:34,309 --> 00:08:36,289 Now I haven't discussed why it works yet, 194 00:08:36,289 --> 00:08:38,659 and that is explained in the notes. 195 00:08:38,659 --> 00:08:40,250 But we're going to hold off on that 196 00:08:40,250 --> 00:08:45,580 and just examine applying this algorithm. 197 00:08:45,580 --> 00:08:47,380 So there are a bunch of ways, now, 198 00:08:47,380 --> 00:08:50,430 to grow a minimum weight spanning tree. 199 00:08:50,430 --> 00:08:54,190 One way is to start at any vertex, 200 00:08:54,190 --> 00:08:57,840 and then keep building around that vertex. 201 00:08:57,840 --> 00:09:00,840 So you start with that vertex and color it black, 202 00:09:00,840 --> 00:09:03,080 and everything else white. 203 00:09:03,080 --> 00:09:04,930 That means that all the gray edges are going 204 00:09:04,930 --> 00:09:06,800 to be connected to that vertex. 205 00:09:06,800 --> 00:09:08,750 So pick a minimum weight. 206 00:09:08,750 --> 00:09:13,060 Now you have a component with two vertices. 207 00:09:13,060 --> 00:09:15,790 Color it black and everything else white, and in that way, 208 00:09:15,790 --> 00:09:18,442 you keep working on one component 209 00:09:18,442 --> 00:09:19,900 that you're going to grow by always 210 00:09:19,900 --> 00:09:21,540 coloring it one color, and everything 211 00:09:21,540 --> 00:09:22,640 else the other color. 212 00:09:22,640 --> 00:09:24,910 This is a method known as Prim's algorithm 213 00:09:24,910 --> 00:09:27,230 for growing a minimum weight spanning tree. 214 00:09:27,230 --> 00:09:31,780 Another one is to globally, among all 215 00:09:31,780 --> 00:09:34,780 the different connected components, 216 00:09:34,780 --> 00:09:37,300 find a minimum weight edge among them. 217 00:09:37,300 --> 00:09:40,070 So what that means is that you find the minimum weight 218 00:09:40,070 --> 00:09:43,769 edge among all the connected components, 219 00:09:43,769 --> 00:09:46,310 and then having identified where that minimum weight edge is, 220 00:09:46,310 --> 00:09:48,090 you can color one of its components black, 221 00:09:48,090 --> 00:09:49,730 and the other one white, and that 222 00:09:49,730 --> 00:09:51,850 will have to conform to our procedure 223 00:09:51,850 --> 00:09:54,720 for picking a minimum weight edge between different colored 224 00:09:54,720 --> 00:09:55,250 components. 225 00:09:55,250 --> 00:09:57,270 That's Kruskal's algorithm. 226 00:09:57,270 --> 00:10:02,170 And finally, you can grow the trees in parallel. 227 00:10:02,170 --> 00:10:04,840 You can just start choosing the minimum weight 228 00:10:04,840 --> 00:10:06,850 edge around each connected component, 229 00:10:06,850 --> 00:10:09,210 because you can always take a connected component, 230 00:10:09,210 --> 00:10:12,630 color it one color, and color all the other edges 231 00:10:12,630 --> 00:10:14,490 another color. 232 00:10:14,490 --> 00:10:17,337 And so all of the edges touching a given component 233 00:10:17,337 --> 00:10:19,920 will be gray in that color, and you can choose the minimum one 234 00:10:19,920 --> 00:10:21,272 and grow that component. 235 00:10:21,272 --> 00:10:23,230 And if they're not too close to each other-- so 236 00:10:23,230 --> 00:10:25,450 that your choice of edges doesn't conflict-- 237 00:10:25,450 --> 00:10:27,370 you can grow these trees in parallel. 238 00:10:27,370 --> 00:10:30,830 So I call that, jokingly, Myer's procedure. 239 00:10:30,830 --> 00:10:34,150 And that is the application of this coloring approach 240 00:10:34,150 --> 00:10:37,440 to finding minimum weight spanning trees. 241 00:10:37,440 --> 00:10:38,176