1 00:00:00,500 --> 00:00:02,830 The following content is provided under a Creative 2 00:00:02,830 --> 00:00:04,340 Commons license. 3 00:00:04,340 --> 00:00:06,680 Your support will help MIT OpenCourseWare 4 00:00:06,680 --> 00:00:11,050 continue to offer high quality educational resources for free. 5 00:00:11,050 --> 00:00:13,670 To make a donation or view additional materials 6 00:00:13,670 --> 00:00:17,558 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,558 --> 00:00:18,183 at ocw.mit.edu. 8 00:00:23,780 --> 00:00:26,300 PROFESSOR: So welcome to this week. 9 00:00:26,300 --> 00:00:29,190 We're going to talk about trees mainly. 10 00:00:29,190 --> 00:00:31,640 We're going to talk about some very special ones, 11 00:00:31,640 --> 00:00:33,770 spanning trees. 12 00:00:33,770 --> 00:00:35,770 But before we do this, you will go 13 00:00:35,770 --> 00:00:37,690 through a whole bunch of definitions 14 00:00:37,690 --> 00:00:41,070 and look at examples to explain how it works. 15 00:00:45,910 --> 00:00:48,365 So let's talk about walks and paths. 16 00:00:54,290 --> 00:00:58,450 So you've already seen a bit of graph theory. 17 00:00:58,450 --> 00:01:03,000 We've talked about graph colorings and so on. 18 00:01:03,000 --> 00:01:06,560 But now we're going to talk about special types of graphs, 19 00:01:06,560 --> 00:01:08,960 and special structures within such graphs. 20 00:01:12,438 --> 00:01:15,510 Well, let's start with the first definition. 21 00:01:15,510 --> 00:01:16,797 How do we define a walk? 22 00:01:16,797 --> 00:01:18,380 Well, a walk is something very simple. 23 00:01:18,380 --> 00:01:23,250 You walk from one vertex in the graph over an edge 24 00:01:23,250 --> 00:01:26,740 to a next vertex, to a next vertex, and so on. 25 00:01:26,740 --> 00:01:29,370 So how do we define this? 26 00:01:29,370 --> 00:01:47,920 A walk is a sequence of vertices that are connected by edges. 
27 00:01:47,920 --> 00:01:50,910 So for example, we may have something 28 00:01:50,910 --> 00:01:53,470 that looks like a first vertex. 29 00:01:53,470 --> 00:01:55,520 That we call the start. 30 00:01:55,520 --> 00:02:00,760 And an edge that goes to a second vertex. 31 00:02:00,760 --> 00:02:02,860 We continue like this until the end, 32 00:02:02,860 --> 00:02:07,520 say, at vk, which we call the end of the walk. 33 00:02:07,520 --> 00:02:11,370 And if you look at this, we say this 34 00:02:11,370 --> 00:02:13,900 is a walk from v0 all the way to vk. 35 00:02:13,900 --> 00:02:16,250 It has exactly k edges. 36 00:02:16,250 --> 00:02:23,130 And we say that the length is actually equal to k. 37 00:02:23,130 --> 00:02:26,570 So this is a very basic notion. 38 00:02:26,570 --> 00:02:28,660 And what we are really interested in 39 00:02:28,660 --> 00:02:31,770 is actually paths. 40 00:02:31,770 --> 00:02:35,620 And paths are special types of walks 41 00:02:35,620 --> 00:02:38,700 in which all those vertices are actually 42 00:02:38,700 --> 00:02:40,700 different from one another. 43 00:02:40,700 --> 00:02:44,530 But let's first give an example of a graph 44 00:02:44,530 --> 00:02:47,030 that we will study throughout the whole lecture, especially 45 00:02:47,030 --> 00:02:48,390 when we come to spanning trees. 46 00:02:50,860 --> 00:02:55,070 So let's have a look at the following graph. 47 00:03:14,290 --> 00:03:15,840 So let this be a graph. 48 00:03:15,840 --> 00:03:18,400 And for example, we can have a walk that 49 00:03:18,400 --> 00:03:23,890 goes from this particular vertex. 50 00:03:23,890 --> 00:03:27,060 Say we call this v0. 51 00:03:27,060 --> 00:03:29,640 It goes over this edge over here. 52 00:03:29,640 --> 00:03:32,610 Suppose I want to end over here. 53 00:03:32,610 --> 00:03:35,030 Well, I can go many ways in this particular one. 54 00:03:35,030 --> 00:03:36,435 I can go over this edge. 55 00:03:36,435 --> 00:03:38,670 I may go over here. 
56 00:03:38,670 --> 00:03:41,040 I may go all the way to this part over here. 57 00:03:41,040 --> 00:03:42,760 I may return if I want to. 58 00:03:42,760 --> 00:03:47,010 And finally, I take this edge for example back to-- 59 00:03:47,010 --> 00:03:51,490 over this edge, I will end in the last vertex vk. 60 00:03:51,490 --> 00:03:53,566 So what we notice here is that we 61 00:03:53,566 --> 00:03:58,690 have a walk that actually gets to this particular vertex, 62 00:03:58,690 --> 00:04:01,370 and then returns to this vertex. 63 00:04:01,370 --> 00:04:05,470 So over a few other edges, and then it goes all the way to vk. 64 00:04:05,470 --> 00:04:08,370 So if we talk about a path, then we really 65 00:04:08,370 --> 00:04:12,920 want all the different vertices not to occur twice. 66 00:04:12,920 --> 00:04:17,610 So let's define this. So the definition is 67 00:04:17,610 --> 00:04:20,500 that a path is actually a walk. 68 00:04:24,580 --> 00:04:33,380 Is a walk where all edges, where all vertices, 69 00:04:33,380 --> 00:04:35,505 vi's, are different. 70 00:04:40,790 --> 00:04:43,490 In this particular case-- well for example, 71 00:04:43,490 --> 00:04:45,810 if you strip off this part over here, 72 00:04:45,810 --> 00:04:49,950 you will get a path from v0 to here, to here, to here. 73 00:04:49,950 --> 00:04:54,450 And all these vertices on the path on this walk 74 00:04:54,450 --> 00:04:55,220 are different. 75 00:04:55,220 --> 00:04:58,130 And therefore it is a path. 76 00:04:58,130 --> 00:05:00,830 This also shows us something. 77 00:05:00,830 --> 00:05:05,140 It is possible to construct from a walk a path. 78 00:05:05,140 --> 00:05:08,560 As you can see, we started off with this particular walk. 79 00:05:08,560 --> 00:05:12,560 And we just deleted, essentially, 80 00:05:12,560 --> 00:05:17,630 a part where we sort of came back to the same vertex again. 
81 00:05:17,630 --> 00:05:21,230 And when we delete all those kinds of parts, 82 00:05:21,230 --> 00:05:22,530 you will end up with a path. 83 00:05:22,530 --> 00:05:25,660 So can we formalize this a little bit better? 84 00:05:25,660 --> 00:05:27,350 Then we get the next lemma. 85 00:05:31,060 --> 00:05:34,630 I call this lemma one. 86 00:05:34,630 --> 00:05:36,700 And let's see whether we can prove this. 87 00:05:36,700 --> 00:05:42,970 So we want to show that if there exists a walk from, say, 88 00:05:42,970 --> 00:05:49,900 a vertex u to-- well, maybe I should not use arrows. 89 00:05:49,900 --> 00:05:57,210 That's confusing-- from u to v. Then there 90 00:05:57,210 --> 00:06:06,140 also exists a path from u to v. 91 00:06:06,140 --> 00:06:07,410 So how can we prove this? 92 00:06:07,410 --> 00:06:09,590 Do you have any ideas? 93 00:06:09,590 --> 00:06:15,190 So what kind of proof techniques can we use here? 94 00:06:15,190 --> 00:06:21,396 And maybe one of the principles that we have seen? 95 00:06:21,396 --> 00:06:26,808 AUDIENCE: [INAUDIBLE] but I have an idea 96 00:06:26,808 --> 00:06:28,284 about how you could show this. 97 00:06:28,284 --> 00:06:32,600 It's basically if you visit one vertex-- let's 98 00:06:32,600 --> 00:06:38,650 say you go 1, 2, 3, 4, 5, 3, 6. 99 00:06:38,650 --> 00:06:42,132 You can walk from 1 to 6. 100 00:06:42,132 --> 00:06:46,732 Then we just take out the 4, 5, 3, and you just go 1, 2, 3, 6. 101 00:06:46,732 --> 00:06:47,440 PROFESSOR: Right. 102 00:06:47,440 --> 00:06:48,520 So what did we do there? 103 00:06:48,520 --> 00:06:49,760 We essentially had a walk. 104 00:06:49,760 --> 00:06:52,250 And then we cut out a smaller part here 105 00:06:52,250 --> 00:06:54,480 that recurs in a sense. 106 00:06:54,480 --> 00:06:59,060 And we have shortened the path, we have shortened the walk 107 00:06:59,060 --> 00:07:00,950 to a smaller path. 
108 00:07:00,950 --> 00:07:06,720 So what we've used here is we can sort of take out 109 00:07:06,720 --> 00:07:09,850 parts of the walk just like we did over here 110 00:07:09,850 --> 00:07:13,440 until the walk gets shorter, and shorter, and shorter. 111 00:07:13,440 --> 00:07:16,420 So maybe one of the things that we may consider 112 00:07:16,420 --> 00:07:19,425 is a walk of shortest length. 113 00:07:22,980 --> 00:07:27,860 And we know that this is possible. 114 00:07:27,860 --> 00:07:28,991 Oops, sorry about that. 115 00:07:34,770 --> 00:07:36,470 So let's prove this. 116 00:07:36,470 --> 00:07:42,930 So for example, suppose there exists such a walk from u 117 00:07:42,930 --> 00:07:47,030 to v. Then we know by the well ordering principle 118 00:07:47,030 --> 00:07:49,900 that there exists a walk of minimum length. 119 00:07:49,900 --> 00:07:51,960 So that's what we're going to use here. 120 00:07:51,960 --> 00:07:56,000 And then we're going to essentially go 121 00:07:56,000 --> 00:07:58,580 in the same direction as what you were talking about 122 00:07:58,580 --> 00:08:02,510 because we're going to show that it's not possible to delete 123 00:08:02,510 --> 00:08:04,804 anything more. 124 00:08:04,804 --> 00:08:06,470 Because otherwise, if you could do that, 125 00:08:06,470 --> 00:08:07,594 the walk would get shorter. 126 00:08:07,594 --> 00:08:13,010 And that's not possible, because by the well ordering principle, 127 00:08:13,010 --> 00:08:22,580 we simply consider that there exists a walk of minimal length 128 00:08:22,580 --> 00:08:23,990 that we are interested in. 129 00:08:30,440 --> 00:08:32,940 We will show that this particular walk is actually 130 00:08:32,940 --> 00:08:34,630 going to be a path. 131 00:08:34,630 --> 00:08:36,950 So let's see how we can do this. 132 00:08:36,950 --> 00:08:41,770 So let the walk be equal to-- well, it starts in v0. 133 00:08:41,770 --> 00:08:43,450 It's equal to u. 
134 00:08:43,450 --> 00:08:45,200 There's an edge to, say, v1. 135 00:08:45,200 --> 00:08:53,160 It goes all the way up to vk, which is the last vertex, v. 136 00:08:53,160 --> 00:08:54,410 So what are we going to prove? 137 00:08:54,410 --> 00:08:57,140 We're going to prove that this is actually a path. 138 00:08:57,140 --> 00:09:00,790 And since this is a walk of minimal length, 139 00:09:00,790 --> 00:09:04,200 well suppose it is not a path. 140 00:09:04,200 --> 00:09:05,980 So we are going to use contradiction. 141 00:09:08,795 --> 00:09:13,030 Well if it's not a path, then by our definitions over here, 142 00:09:13,030 --> 00:09:14,950 there must be two vertices that are actually 143 00:09:14,950 --> 00:09:17,170 the same, just like in here. 144 00:09:17,170 --> 00:09:22,270 So we go from v0 to, say, this particular vertex, or this one. 145 00:09:22,270 --> 00:09:23,900 And then we come back to this one, 146 00:09:23,900 --> 00:09:25,220 and then all the way to vk. 147 00:09:25,220 --> 00:09:29,950 So if it is not a path, then two of those 148 00:09:29,950 --> 00:09:32,150 must be equal to one another. 149 00:09:32,150 --> 00:09:34,520 And then we are going to use this trick. 150 00:09:34,520 --> 00:09:36,000 We're going to take out this part. 151 00:09:36,000 --> 00:09:38,070 We get the shorter walk. 152 00:09:38,070 --> 00:09:42,660 But that contradicts that our walk is minimal length. 153 00:09:42,660 --> 00:09:45,470 So that's sort of the proof technique that we're doing. 154 00:09:45,470 --> 00:09:45,970 OK. 155 00:09:45,970 --> 00:09:46,970 Let's do this. 156 00:09:52,140 --> 00:09:57,640 So first, let's consider a few cases. 157 00:09:57,640 --> 00:10:01,400 If k equals 0, then that's pretty obvious. 158 00:10:01,400 --> 00:10:05,640 We have only a single vertex u that is connected by a 0 length 159 00:10:05,640 --> 00:10:07,500 path to itself. 160 00:10:07,500 --> 00:10:09,230 So this is fine. 
161 00:10:09,230 --> 00:10:13,020 For k equals 1, we have just a single edge. 162 00:10:13,020 --> 00:10:14,020 And then we're done too. 163 00:10:14,020 --> 00:10:15,830 That's a path as well. 164 00:10:15,830 --> 00:10:22,230 So let's consider the case where k is at least two. 165 00:10:22,230 --> 00:10:23,874 Well, suppose it's a walk. 166 00:10:23,874 --> 00:10:24,540 It's not a path. 167 00:10:24,540 --> 00:10:28,695 So we are going to assume the opposite. 168 00:10:34,310 --> 00:10:38,340 If it's not a path, then we concluded 169 00:10:38,340 --> 00:10:42,330 that two of those vi's must be equal to one another. 170 00:10:42,330 --> 00:10:51,208 So there exists an i unequal to j such that vertex vi equals vj. 171 00:10:51,208 --> 00:10:55,800 Ah, but now we have a walk that goes 172 00:10:55,800 --> 00:11:02,280 from v0 all the way to, say, vi, which is equal to vj. 173 00:11:02,280 --> 00:11:06,480 And then we go all the way up to vk which is equal to v. 174 00:11:06,480 --> 00:11:09,420 So essentially, we have created a shorter walk. 175 00:11:09,420 --> 00:11:11,930 We have taken out this little triangle over here. 176 00:11:17,260 --> 00:11:20,920 Now this is a shorter walk. 177 00:11:20,920 --> 00:11:27,170 And this contradicts our assumption of minimality. 178 00:11:30,250 --> 00:11:33,650 So that means that our assumption over here 179 00:11:33,650 --> 00:11:35,740 is not true. 180 00:11:35,740 --> 00:11:38,350 So the walk is actually a path. 181 00:11:38,350 --> 00:11:40,010 So this concludes the proof. 
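The cutting step used in this proof can be tried out on the walk 1, 2, 3, 4, 5, 3, 6 suggested from the audience. What follows is only a minimal sketch in Python, not the well ordering argument itself: it removes loops one by one instead of starting from a walk of minimal length. The function name walk_to_path and the representation of a walk as a list of vertex labels are assumptions made for this illustration.

```python
def walk_to_path(walk):
    """Cut every loop out of a walk so that no vertex occurs twice.

    `walk` is a list of vertices v0, ..., vk (a hypothetical representation);
    the result starts and ends at the same vertices as the walk.
    """
    path = []
    position = {}  # vertex -> its index in `path`
    for v in walk:
        if v in position:
            # v occurred before: drop everything walked since its first visit
            cut = position[v] + 1
            for dropped in path[cut:]:
                del position[dropped]
            path = path[:cut]
        else:
            position[v] = len(path)
            path.append(v)
    return path


print(walk_to_path([1, 2, 3, 4, 5, 3, 6]))  # [1, 2, 3, 6]
```

Every pair of neighbouring vertices in the result was already joined by an edge of the original walk, so the result is again a walk, and since no vertex repeats it is a path.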
182 00:11:44,470 --> 00:11:46,300 So even though this lemma is pretty 183 00:11:46,300 --> 00:11:48,010 straightforward in the sense that we all 184 00:11:48,010 --> 00:11:52,010 feel that if you have a walk, then you can create a path, 185 00:11:52,010 --> 00:11:54,750 if you really want to write it down with all the principles 186 00:11:54,750 --> 00:11:57,594 that we have been taught, you can see 187 00:11:57,594 --> 00:11:58,760 a few things appearing here. 188 00:11:58,760 --> 00:12:01,170 So we first started with the well ordering principle. 189 00:12:01,170 --> 00:12:07,620 And also we used the method of contradiction in our proof. 190 00:12:07,620 --> 00:12:09,910 So let's talk about the next definition. 191 00:12:09,910 --> 00:12:12,180 So we talked about walks and paths. 192 00:12:12,180 --> 00:12:13,430 Let's talk about connectivity. 193 00:12:21,460 --> 00:12:25,430 We actually define two vertices, u and v 194 00:12:25,430 --> 00:12:29,710 to be connected if there is a path that 195 00:12:29,710 --> 00:12:39,160 connects u to v. So u and v are connected 196 00:12:39,160 --> 00:12:49,680 if there is a path from u to v. 197 00:12:49,680 --> 00:12:54,540 And now we can talk about the graphs that are connected. 198 00:12:54,540 --> 00:12:57,090 That's a very important concept. 199 00:12:57,090 --> 00:13:14,480 A graph is connected if every pair of vertices is connected. 200 00:13:14,480 --> 00:13:25,490 So when every pair of vertices in the graph are connected. 201 00:13:25,490 --> 00:13:32,130 So an example is the particular graph on the blackboard. 202 00:13:32,130 --> 00:13:34,720 You can see here that there's always a path 203 00:13:34,720 --> 00:13:38,210 to be found from this vertex, to this one, or to that one, 204 00:13:38,210 --> 00:13:42,140 to this, to that, to this, or that. 205 00:13:42,140 --> 00:13:45,420 You can do this for any two vertices. 206 00:13:45,420 --> 00:13:47,130 So this is a connected graph. 
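The definition of a connected graph can be checked mechanically on small examples. The sketch below is one possible way to do it in Python, under the assumption that a graph is stored as a dictionary mapping each vertex to the set of its neighbours; the name is_connected and both example graphs are made up for this illustration and are not the graphs on the blackboard. It explores everything reachable from one vertex by walks, which is enough because of lemma one: whenever there is a walk between two vertices, there is also a path.

```python
from collections import deque


def is_connected(adj):
    """Return True if every pair of vertices is joined by a path.

    `adj` maps each vertex to the set of its neighbours (undirected graph).
    """
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    # Connected exactly when the search reached every vertex.
    return len(seen) == len(adj)


# Hypothetical examples: a square, and a single edge next to a triangle.
square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
edge_plus_triangle = {1: {2}, 2: {1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}

print(is_connected(square))              # True
print(is_connected(edge_plus_triangle))  # False
```

Starting from a single vertex suffices: if every vertex has a path to the start vertex, then any two vertices are joined by a walk through the start vertex, and hence by a path.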
207 00:13:47,130 --> 00:13:50,500 A graph that is not connected looks something 208 00:13:50,500 --> 00:13:54,560 like this for example. 209 00:13:54,560 --> 00:13:58,980 You may have a single edge plus, say, 210 00:13:58,980 --> 00:14:02,532 another triangle, maybe with another edge 211 00:14:02,532 --> 00:14:03,490 or something like that. 212 00:14:03,490 --> 00:14:06,780 So they are disconnected from this vertex. 213 00:14:06,780 --> 00:14:08,690 I do not have a path that connects 214 00:14:08,690 --> 00:14:12,080 to this vertex over here. 215 00:14:12,080 --> 00:14:15,640 So this is about connectivity. 216 00:14:15,640 --> 00:14:22,730 And now we can start talking about cycles and closed walks. 217 00:14:30,170 --> 00:14:34,686 Together, these concepts will help us to define trees. 218 00:14:34,686 --> 00:14:36,560 And then we can start talking about this very 219 00:14:36,560 --> 00:14:38,320 special concept, spanning trees. 220 00:14:40,980 --> 00:14:42,280 So let's see how that works. 221 00:14:45,930 --> 00:14:48,300 So cycles and closed walks. 222 00:14:55,110 --> 00:14:58,980 We first define a closed walk. 223 00:14:58,980 --> 00:15:04,210 And once we have done this, we are able to get into cycles. 224 00:15:04,210 --> 00:15:06,120 So what's a closed walk? 225 00:15:06,120 --> 00:15:08,830 Well, it's actually a walk for which the start and the end 226 00:15:08,830 --> 00:15:09,830 vertex are the same. 227 00:15:15,840 --> 00:15:23,980 So it starts and ends at exactly the same vertex. 228 00:15:31,770 --> 00:15:33,830 So this is all pretty straightforward. 229 00:15:33,830 --> 00:15:39,360 So we have v0 connected to, say, v1, all the way up to vk. 230 00:15:39,360 --> 00:15:42,650 And vk itself is equal to v0. 231 00:15:42,650 --> 00:15:44,790 So you have a walk that goes all the way 232 00:15:44,790 --> 00:15:47,420 from v0 all the way back to vk. 
233 00:15:47,420 --> 00:15:53,020 For example, in this graph, you could have walked all the way 234 00:15:53,020 --> 00:15:54,920 back to v0. 235 00:15:54,920 --> 00:15:56,820 You would have a closed walk. 236 00:15:56,820 --> 00:15:58,160 So what's a cycle? 237 00:15:58,160 --> 00:16:01,000 A cycle is a little bit more special 238 00:16:01,000 --> 00:16:03,900 in that all these vertices over here 239 00:16:03,900 --> 00:16:06,600 are actually different from one another. 240 00:16:06,600 --> 00:16:11,240 So if k is at least three-- so the walk 241 00:16:11,240 --> 00:16:14,940 is not just a simple edge, or just a single vertex. 242 00:16:17,790 --> 00:16:29,340 And if all v0, v1, all the way up to vk minus 1 are different, 243 00:16:29,340 --> 00:16:36,900 then we are saying that we have a cycle. 244 00:16:36,900 --> 00:16:39,060 Then it is called a cycle. 245 00:16:45,250 --> 00:16:48,240 So these two concepts, connectivity and cycles, 246 00:16:48,240 --> 00:16:49,485 will help us to define trees. 247 00:16:55,410 --> 00:16:57,630 So what are trees? 248 00:16:57,630 --> 00:16:59,580 A simple example is something that 249 00:16:59,580 --> 00:17:02,570 looks like this, for example. 250 00:17:09,980 --> 00:17:13,530 The idea is that a tree is connected 251 00:17:13,530 --> 00:17:15,630 and it has no cycles. 252 00:17:15,630 --> 00:17:19,760 So in this case, we can see that every vertex is connected 253 00:17:19,760 --> 00:17:21,310 to every other vertex. 254 00:17:21,310 --> 00:17:24,359 But there is no cycle like a triangle or something 255 00:17:24,359 --> 00:17:27,220 that is connected all around. 256 00:17:27,220 --> 00:17:30,920 So let's define this properly. 257 00:17:30,920 --> 00:17:43,690 The definition is that a connected and acyclic graph 258 00:17:43,690 --> 00:17:44,720 is called a tree. 259 00:17:51,230 --> 00:17:53,870 And within trees, we also have something very special 260 00:17:53,870 --> 00:17:55,520 that we call leaves. 
261 00:17:55,520 --> 00:17:59,600 And leaves are exactly those nodes that have degree one. 262 00:17:59,600 --> 00:18:01,740 So for example, this node has degree one. 263 00:18:01,740 --> 00:18:04,290 It has only this edge connected to it. 264 00:18:04,290 --> 00:18:06,840 So this is called a leaf. 265 00:18:06,840 --> 00:18:10,964 This one also is a leaf because it has just one edge 266 00:18:10,964 --> 00:18:11,630 connected to it. 267 00:18:11,630 --> 00:18:14,290 But this node over here has three edges. 268 00:18:14,290 --> 00:18:16,140 So it has degree three. 269 00:18:16,140 --> 00:18:17,830 This one has degree four. 270 00:18:17,830 --> 00:18:18,850 This is a leaf again. 271 00:18:18,850 --> 00:18:23,130 This one is a leaf, this one, and this one. 272 00:18:23,130 --> 00:18:25,230 So this is a very important concept. 273 00:18:25,230 --> 00:18:27,920 We often use leaves in our proofs. 274 00:18:27,920 --> 00:18:30,510 And we will get to one. 275 00:18:30,510 --> 00:18:45,620 So a leaf is a node with degree one in a tree. 276 00:18:49,810 --> 00:18:50,640 OK great. 277 00:18:54,020 --> 00:18:56,470 So now let's have a look at trees. 278 00:18:56,470 --> 00:19:00,410 Suppose I'm looking at a subgraph of a tree. 279 00:19:00,410 --> 00:19:02,947 What can we say about a subgraph? 280 00:19:02,947 --> 00:19:04,530 What kind of structure does this have? 281 00:19:04,530 --> 00:19:10,310 So suppose for example I want to take some connected subgraph. 282 00:19:10,310 --> 00:19:12,060 So that's what I'm actually interested in. 283 00:19:12,060 --> 00:19:15,400 So for example, I take this-- suppose 284 00:19:15,400 --> 00:19:18,170 I take this connected subgraph. 285 00:19:18,170 --> 00:19:21,150 Then what does it look like? 286 00:19:21,150 --> 00:19:23,460 Do we see some structure in this? 287 00:19:27,890 --> 00:19:29,420 It's-- no? 288 00:19:29,420 --> 00:19:30,620 AUDIENCE: It's a tree. 289 00:19:30,620 --> 00:19:31,990 PROFESSOR: Yeah, it's a tree. 
290 00:19:31,990 --> 00:19:34,710 So are we able to prove this also? 291 00:19:37,140 --> 00:19:42,060 So this lemma you actually use later on. 292 00:19:42,060 --> 00:19:54,940 So any connected subgraph of a tree is a tree. 293 00:19:57,770 --> 00:20:02,440 And to prove-- well, how can we start out a proof? 294 00:20:02,440 --> 00:20:04,750 Any suggestions? 295 00:20:04,750 --> 00:20:07,280 So let's have a look here. 296 00:20:07,280 --> 00:20:08,960 So how may I think about this? 297 00:20:08,960 --> 00:20:12,300 Well, I may take a connected subgraph, 298 00:20:12,300 --> 00:20:14,596 and I want to show-- of a tree-- and I 299 00:20:14,596 --> 00:20:16,880 want to show that it is a tree. 300 00:20:16,880 --> 00:20:21,144 A general method could be, for example, to-- 301 00:20:21,144 --> 00:20:23,736 AUDIENCE: [INAUDIBLE] You know, one 302 00:20:23,736 --> 00:20:25,518 of the things, one of the conditions 303 00:20:25,518 --> 00:20:28,434 is it has to be acyclical. 304 00:20:28,434 --> 00:20:31,836 So since you're not adding nodes and you're not 305 00:20:31,836 --> 00:20:35,794 starting within cycles, you can't create a new cycle. 306 00:20:35,794 --> 00:20:36,460 PROFESSOR: Yeah. 307 00:20:36,460 --> 00:20:38,840 I think you're saying something like, 308 00:20:38,840 --> 00:20:41,480 if I have a connected subgraph, which 309 00:20:41,480 --> 00:20:44,979 is like a smaller part of the tree-- 310 00:20:44,979 --> 00:20:46,520 now suppose that would not be a tree. 311 00:20:46,520 --> 00:20:48,450 Say it would have a cycle. 312 00:20:48,450 --> 00:20:51,840 Then that cycle would also be present in the tree itself. 313 00:20:51,840 --> 00:20:56,020 And that's not really possible because a tree has no cycles. 314 00:20:56,020 --> 00:20:59,440 So that's sort of indeed how the proof goes. 315 00:20:59,440 --> 00:21:02,550 So we essentially use contradiction. 
316 00:21:07,630 --> 00:21:13,830 So let's suppose that this connected subgraph is actually 317 00:21:13,830 --> 00:21:14,470 not a tree. 318 00:21:24,160 --> 00:21:25,980 So suppose it's a connected subgraph. 319 00:21:25,980 --> 00:21:28,040 It's not a tree. 320 00:21:28,040 --> 00:21:31,690 Well then, in that case, it must have a cycle, 321 00:21:31,690 --> 00:21:35,110 because a tree is something that is both connected 322 00:21:35,110 --> 00:21:38,850 and has no cycle-- it is acyclic. 323 00:21:38,850 --> 00:21:41,590 We have a connected subgraph, which is not a tree. 324 00:21:41,590 --> 00:21:42,870 So it must have a cycle. 325 00:21:45,380 --> 00:21:48,480 So it has a cycle. 326 00:21:48,480 --> 00:21:52,400 But now since it is a subgraph of the bigger graph, 327 00:21:52,400 --> 00:21:55,780 we know that the cycle must also be in the whole graph. 328 00:21:55,780 --> 00:22:07,190 So the whole graph has this particular cycle. 329 00:22:07,190 --> 00:22:09,620 Wait a minute. 330 00:22:09,620 --> 00:22:13,420 The whole graph is the tree. 331 00:22:13,420 --> 00:22:15,130 And a tree is acyclic. 332 00:22:15,130 --> 00:22:17,280 So we get a contradiction. 333 00:22:17,280 --> 00:22:20,610 So that means that our original assumption that it's not a tree 334 00:22:20,610 --> 00:22:21,890 is wrong. 335 00:22:21,890 --> 00:22:23,580 So it is a tree. 336 00:22:23,580 --> 00:22:27,725 So this is, again, such a general kind of proof method 337 00:22:27,725 --> 00:22:30,520 that we repeatedly use here. 338 00:22:30,520 --> 00:22:35,380 So this lemma, which we will call lemma two, 339 00:22:35,380 --> 00:22:37,820 we will use in the next proof where 340 00:22:37,820 --> 00:22:40,970 we're going to talk about the relationship between vertices 341 00:22:40,970 --> 00:22:44,040 and edges within trees. 342 00:22:55,980 --> 00:22:59,450 It's a very beautiful relationship. 
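The definitions of a tree and a leaf, together with lemma two, can be tried out on a small example. The following is a rough Python sketch under assumptions of its own: graphs are simple, undirected, non-empty, and stored as a dictionary from each vertex to the set of its neighbours, and the names is_tree, leaves and remove_vertex as well as the example graphs are hypothetical, not taken from the blackboard. The check explores the graph from one vertex and reports a cycle as soon as some vertex is reached a second time over an edge other than the one it was first reached by.

```python
def is_tree(adj):
    """Return True if the graph is connected and has no cycle.

    `adj` maps each vertex to the set of its neighbours; the graph is
    assumed to be simple, undirected and non-empty.
    """
    start = next(iter(adj))
    parent = {start: None}  # vertex -> vertex it was first reached from
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w == parent[u]:
                continue      # the edge we arrived by
            if w in parent:
                return False  # a second route to w closes a cycle
            parent[w] = u
            stack.append(w)
    # Acyclic so far; it is a tree exactly when every vertex was reached.
    return len(parent) == len(adj)


def leaves(adj):
    """Vertices of degree one, in sorted order."""
    return sorted(v for v in adj if len(adj[v]) == 1)


def remove_vertex(adj, v):
    """Copy of the graph without v and without the edges touching v."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}


tree = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2, 5}, 5: {4}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

print(is_tree(tree))                    # True
print(is_tree(triangle))                # False
print(leaves(tree))                     # [1, 3, 5]
print(is_tree(remove_vertex(tree, 5)))  # True
print(is_tree(remove_vertex(tree, 2)))  # False
```

Removing the leaf 5 leaves a connected subgraph, which is again a tree as lemma two says. Removing vertex 2, which has degree three, leaves a subgraph that is not connected, so lemma two does not apply to it.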
343 00:22:59,450 --> 00:23:05,150 The lemma states that if you have a tree that has n vertices, 344 00:23:05,150 --> 00:23:07,840 then it must have n minus 1 edges. 345 00:23:07,840 --> 00:23:11,110 So this is a very tight relationship here. 346 00:23:11,110 --> 00:23:23,070 So a tree with n vertices has n minus 1 edges. 347 00:23:23,070 --> 00:23:23,860 So let's see. 348 00:23:23,860 --> 00:23:26,930 How can we prove this one? 349 00:23:26,930 --> 00:23:27,770 Any suggestions? 350 00:23:27,770 --> 00:23:32,600 So what did we use at the beginning of the course here? 351 00:23:32,600 --> 00:23:35,040 So we have n vertices. 352 00:23:35,040 --> 00:23:39,020 And we want to show that it has n minus 1 edges. 353 00:23:42,980 --> 00:23:45,670 We can use induction, right? 354 00:23:45,670 --> 00:23:50,350 So let's use induction 355 00:23:50,350 --> 00:23:51,720 on n. 356 00:23:51,720 --> 00:23:54,280 So how do we proceed if we do this? 357 00:23:54,280 --> 00:23:58,920 Well, we always start out with an induction hypothesis. 358 00:23:58,920 --> 00:24:00,640 So the statement here would be something 359 00:24:00,640 --> 00:24:15,380 like, well, there are n minus 1 edges in any n vertex tree. 360 00:24:18,480 --> 00:24:20,950 And if you start with induction, you always 361 00:24:20,950 --> 00:24:22,580 start with the base case, right? 362 00:24:22,580 --> 00:24:25,930 So we have the base case. 363 00:24:25,930 --> 00:24:28,540 So how does this work? 364 00:24:28,540 --> 00:24:33,090 So for example, we want to consider P1. 365 00:24:33,090 --> 00:24:37,069 So there are zero edges in any one vertex tree. 366 00:24:37,069 --> 00:24:38,110 Well, that's true, right? 367 00:24:38,110 --> 00:24:41,390 If I have just one vertex, there's no edge. 368 00:24:41,390 --> 00:24:45,630 So this is definitely correct. 369 00:24:45,630 --> 00:24:46,930 So that's easy. 370 00:24:46,930 --> 00:24:53,900 So let's prove the other part, which is the inductive step. 
371 00:24:53,900 --> 00:25:02,306 And the inductive step is-- 372 00:25:04,770 --> 00:25:11,270 So to do the inductive step, we are going to always assume Pn. 373 00:25:11,270 --> 00:25:14,530 And then we want to prove Pn plus 1. 374 00:25:14,530 --> 00:25:16,060 So let's do this. 375 00:25:16,060 --> 00:25:17,390 So we have the inductive step. 376 00:25:22,740 --> 00:25:24,780 And we always start out the same way. 377 00:25:24,780 --> 00:25:32,850 So we suppose P of n. 378 00:25:32,850 --> 00:25:34,960 And now we want to prove Pn plus 1. 379 00:25:34,960 --> 00:25:37,350 So how do we do this? 380 00:25:37,350 --> 00:25:40,910 Well, take a tree that has n plus 1 vertices. 381 00:25:40,910 --> 00:25:44,390 And we want to show that it has n edges. 382 00:25:44,390 --> 00:25:47,680 So take any such tree. 383 00:25:47,680 --> 00:25:56,660 So let T be a tree that has n plus 1 vertices. 384 00:26:02,170 --> 00:26:06,880 So how can we use the induction hypothesis here? 385 00:26:06,880 --> 00:26:09,360 So do you have any ideas over here? 386 00:26:09,360 --> 00:26:13,100 So I have a tree with n plus 1 vertices. 387 00:26:13,100 --> 00:26:16,960 And if I want to be able to use 388 00:26:16,960 --> 00:26:19,740 this induction hypothesis, then I 389 00:26:19,740 --> 00:26:24,125 can only apply it on trees that have n vertices. 390 00:26:24,125 --> 00:26:26,960 So somehow I have to delete one vertex. 391 00:26:26,960 --> 00:26:28,440 Right? 392 00:26:28,440 --> 00:26:32,120 So what kind of vertex can I delete in such a way 393 00:26:32,120 --> 00:26:34,480 that we still have a tree? 394 00:26:38,730 --> 00:26:44,550 And I need that because this induction hypothesis can only 395 00:26:44,550 --> 00:26:47,660 be applied to trees. 396 00:26:47,660 --> 00:26:53,060 So what type of vertex can I remove from a tree and it still 397 00:26:53,060 --> 00:26:54,560 stays a tree? 
398 00:26:54,560 --> 00:26:59,630 So for example, if you look at this example up here, 399 00:26:59,630 --> 00:27:01,540 what kind of vertex can I remove here? 400 00:27:01,540 --> 00:27:02,326 AUDIENCE: A leaf. 401 00:27:02,326 --> 00:27:04,230 PROFESSOR: Yeah, exactly. 402 00:27:04,230 --> 00:27:06,594 I can remove a leaf-- for example, this one. 403 00:27:06,594 --> 00:27:07,260 And why is that? 404 00:27:07,260 --> 00:27:09,790 Because there's only one edge connected to the rest 405 00:27:09,790 --> 00:27:15,420 of the tree-- so if I delete it, I will actually keep a tree. 406 00:27:15,420 --> 00:27:18,070 So that's how the proof will continue. 407 00:27:18,070 --> 00:27:21,170 So we take out one vertex. 408 00:27:21,170 --> 00:27:25,850 So let v be a leaf of the tree. 409 00:27:30,920 --> 00:27:35,970 And let's remove this particular vertex. 410 00:27:39,230 --> 00:27:41,250 So what do we get? 411 00:27:41,250 --> 00:27:47,600 Well, we again have a connected subgraph. 412 00:27:47,600 --> 00:27:56,540 So this creates a connected subgraph. 413 00:27:56,540 --> 00:28:00,740 And we also know that this is a tree. 414 00:28:00,740 --> 00:28:02,640 So how can we conclude that? 415 00:28:02,640 --> 00:28:05,060 Well, we had another lemma somewhere 416 00:28:05,060 --> 00:28:09,060 that's behind this blackboard. 417 00:28:09,060 --> 00:28:11,555 It says that if you have a connected subgraph, 418 00:28:11,555 --> 00:28:13,896 then we still have a tree. 419 00:28:13,896 --> 00:28:15,040 So this is great. 420 00:28:15,040 --> 00:28:19,380 So this is also a tree. 421 00:28:22,760 --> 00:28:28,290 So now we can apply Pn because we have n vertices, 422 00:28:28,290 --> 00:28:30,140 because we deleted one. 423 00:28:30,140 --> 00:28:36,630 So by Pn, we can conclude that this particular subgraph 424 00:28:36,630 --> 00:28:40,510 has n minus 1 edges. 425 00:28:40,510 --> 00:28:43,070 You simply apply our statement over here. 
426 00:28:46,560 --> 00:28:49,960 But we want to say something about the original tree. 427 00:28:49,960 --> 00:28:51,680 So how do we do this? 428 00:28:51,680 --> 00:28:53,840 Well, we have to somehow reconstruct this tree. 429 00:28:53,840 --> 00:28:58,972 Well, we have deleted v. So we have to reattach v again. 430 00:28:58,972 --> 00:28:59,680 So let's do this. 431 00:29:02,630 --> 00:29:08,730 So we reattach v. And if I do this, 432 00:29:08,730 --> 00:29:11,310 well, let's look at this example over here. 433 00:29:11,310 --> 00:29:12,950 I reattach the leaf. 434 00:29:12,950 --> 00:29:16,260 I will add one more edge. 435 00:29:16,260 --> 00:29:21,100 So if I reattach this particular leaf, 436 00:29:21,100 --> 00:29:27,100 I will get back the original T over here. 437 00:29:27,100 --> 00:29:30,600 And how many edges does it have? 438 00:29:30,600 --> 00:29:35,650 Well, it has these n minus 1 edges 439 00:29:35,650 --> 00:29:41,890 plus the one edge that is needed to reattach the leaf. 440 00:29:41,890 --> 00:29:46,420 So that's all together exactly n edges. 441 00:29:46,420 --> 00:29:54,430 So now we have shown that for every n plus 1 vertex tree, 442 00:29:54,430 --> 00:29:57,440 every such tree has n edges. 443 00:29:57,440 --> 00:30:02,740 So now we're done, because this shows P n plus 1. 444 00:30:02,740 --> 00:30:05,105 So this is how the lemma is proved. 445 00:30:05,105 --> 00:30:08,980 And in the homework, you actually use this. 446 00:30:12,300 --> 00:30:17,895 So let's now talk about very special trees, spanning trees. 447 00:30:20,290 --> 00:30:22,760 I think this is a very exciting topic. 448 00:30:22,760 --> 00:30:27,060 So let's look at this particular graph over here. 449 00:30:33,330 --> 00:30:35,380 Let's define spanning trees. 450 00:30:35,380 --> 00:30:36,640 So what's a spanning tree? 
451 00:30:36,640 --> 00:30:42,420 A spanning tree is a subgraph of a graph that somehow spans all 452 00:30:42,420 --> 00:30:44,680 the vertices within this graph. 453 00:30:44,680 --> 00:30:49,830 So it's a tree that touches every single vertex. 454 00:30:49,830 --> 00:30:54,718 So for example, we may have-- 455 00:30:56,030 --> 00:30:59,690 Maybe this is not such a bright color. 456 00:30:59,690 --> 00:31:02,460 Maybe this one. 457 00:31:02,460 --> 00:31:07,590 So for example, we may have a tree that looks like this. 458 00:31:16,210 --> 00:31:19,450 So in this particular graph, we can 459 00:31:19,450 --> 00:31:24,840 create a subgraph with the thickened green edges 460 00:31:24,840 --> 00:31:28,480 over here, which is a tree, and also covers 461 00:31:28,480 --> 00:31:32,972 all the different vertices of the original graph. 462 00:31:32,972 --> 00:31:35,180 This is actually pretty amazing that you can do this, 463 00:31:35,180 --> 00:31:40,030 you can actually create such a kind of tree. 464 00:31:40,030 --> 00:31:42,950 And what you want to prove, first of all, 465 00:31:42,950 --> 00:31:45,510 is that for every connected graph, 466 00:31:45,510 --> 00:31:47,870 we can construct such a spanning tree. 467 00:31:47,870 --> 00:31:49,660 So let's first define this. 468 00:31:49,660 --> 00:31:52,610 And then we will prove this theorem. 469 00:31:56,530 --> 00:32:00,660 And then after this, we're going to talk about weighted graphs. 470 00:32:00,660 --> 00:32:04,772 That will lead to minimum weight spanning trees. 471 00:32:04,772 --> 00:32:06,230 And then it gets really interesting 472 00:32:06,230 --> 00:32:08,510 because you want to figure out how we can actually 473 00:32:08,510 --> 00:32:10,710 construct those for any graph. 474 00:32:15,830 --> 00:32:19,930 So let's define a spanning tree. 
475 00:32:19,930 --> 00:32:23,725 A spanning tree-- 476 00:32:27,140 --> 00:32:40,230 And we abbreviate this as ST-- of a connected graph 477 00:32:40,230 --> 00:32:51,820 is actually a subgraph that is a tree. 478 00:32:51,820 --> 00:32:53,900 So that's the first condition. 479 00:32:53,900 --> 00:32:55,940 And it also covers all the other vertices. 480 00:32:55,940 --> 00:33:06,725 So it has the same vertices as the graph. 481 00:33:13,520 --> 00:33:16,970 And over here, we have this example. 482 00:33:16,970 --> 00:33:19,960 So now what you want to show is the following theorem 483 00:33:19,960 --> 00:33:21,940 that any connected graph actually 484 00:33:21,940 --> 00:33:24,810 has such a spanning tree. 485 00:33:24,810 --> 00:33:43,110 So the theorem is every connected graph 486 00:33:43,110 --> 00:33:45,760 has a spanning tree. 487 00:33:45,760 --> 00:33:52,090 And let's think about a proof together. 488 00:33:52,090 --> 00:33:59,020 So for example, we may start with the idea 489 00:33:59,020 --> 00:34:01,510 to use a contradiction. 490 00:34:01,510 --> 00:34:04,810 Suppose there exists a connected graph that 491 00:34:04,810 --> 00:34:06,450 has no spanning tree. 492 00:34:06,450 --> 00:34:07,540 So how would we go ahead? 493 00:34:15,850 --> 00:34:21,644 So how can we prove that we have a contradiction here? 494 00:34:21,644 --> 00:34:24,830 So the proof that we're going to propose here 495 00:34:24,830 --> 00:34:25,820 is by contradiction. 496 00:34:33,320 --> 00:34:35,620 That means that we're going to assume the opposite. 497 00:34:35,620 --> 00:34:42,040 So assume that we have a connected graph, 498 00:34:42,040 --> 00:34:47,014 a connected graph G. 499 00:34:54,570 --> 00:34:58,600 That has no spanning tree. 500 00:34:58,600 --> 00:35:01,450 So how can we use this? 501 00:35:01,450 --> 00:35:03,300 How could this be strange? 502 00:35:03,300 --> 00:35:05,520 Or what can we do here?
503 00:35:13,420 --> 00:35:17,390 So one of the nice things about trees over here is-- 504 00:35:17,390 --> 00:35:22,970 so that maybe gives us some insight is that if we have 505 00:35:22,970 --> 00:35:29,395 a tree-- well, it seems like such a kind of spanning tree is 506 00:35:29,395 --> 00:35:31,739 like-- 507 00:35:32,238 --> 00:35:36,000 It's a solution to a subgraph that is connected 508 00:35:36,000 --> 00:35:37,555 with a minimum number of edges. 509 00:35:40,140 --> 00:35:46,970 Like if I have any other type of subgraph-- for example, 510 00:35:46,970 --> 00:35:49,900 if it has cycles, I can always remove an edge of course. 511 00:35:49,900 --> 00:36:00,260 So we know a subgraph of G that touches all the different vertices 512 00:36:00,260 --> 00:36:04,980 and has a cycle can get shortened 513 00:36:04,980 --> 00:36:08,240 by eliminating one edge of a cycle that's 514 00:36:08,240 --> 00:36:09,370 within such a subgraph. 515 00:36:09,370 --> 00:36:11,910 So maybe we have an idea here. 516 00:36:11,910 --> 00:36:14,340 And maybe we can use this. 517 00:36:14,340 --> 00:36:23,370 So maybe we can say, let's consider a graph T. So 518 00:36:23,370 --> 00:36:30,730 let T be a connected subgraph of G, but with a property 519 00:36:30,730 --> 00:36:34,570 that it has a minimum number of edges. 520 00:36:34,570 --> 00:36:41,790 So let this be a connected subgraph of G. 521 00:36:41,790 --> 00:36:47,290 And we assume that it has the same vertices as G of course. 522 00:36:55,050 --> 00:37:00,200 In addition, it has the smallest number of edges possible. 523 00:37:17,350 --> 00:37:18,720 So what do we know? 524 00:37:18,720 --> 00:37:21,410 Well, let's look at our assumption here. 525 00:37:21,410 --> 00:37:25,070 We said that a connected graph, G-- that's 526 00:37:25,070 --> 00:37:28,140 what we consider here-- for which 527 00:37:28,140 --> 00:37:29,379 there is no spanning tree. 528 00:37:29,379 --> 00:37:30,170 So what do we know?
529 00:37:30,170 --> 00:37:32,622 We know that this particular T over here 530 00:37:32,622 --> 00:37:34,830 can definitely not be a spanning tree, because that's 531 00:37:34,830 --> 00:37:35,790 what we assume here. 532 00:37:39,890 --> 00:37:42,920 Can we use this? 533 00:37:42,920 --> 00:37:44,230 So let's think about this. 534 00:37:49,780 --> 00:37:55,510 So if it is not a spanning tree, then what must it have? 535 00:37:55,510 --> 00:38:01,380 So we already know that T is a connected subgraph of G 536 00:38:01,380 --> 00:38:06,180 with the same vertices as G. So the only difference 537 00:38:06,180 --> 00:38:11,035 between a spanning tree and T is that the spanning tree is 538 00:38:11,035 --> 00:38:15,550 a tree, and T is not a tree. 539 00:38:15,550 --> 00:38:17,870 So therefore, T must have a cycle. 540 00:38:17,870 --> 00:38:22,790 So we use our assumption over here. 541 00:38:22,790 --> 00:38:27,730 We know that T is not a spanning tree. 542 00:38:27,730 --> 00:38:31,060 And now we can conclude that it must have a cycle. 543 00:38:36,960 --> 00:38:42,530 So now we can start thinking about this cycle. 544 00:38:42,530 --> 00:38:44,820 And let's see-- 545 00:38:46,540 --> 00:38:47,830 What can we do here? 546 00:38:47,830 --> 00:38:52,445 So we constructed T as the subgraph that 547 00:38:52,445 --> 00:38:55,730 has the minimum number of edges possible. 548 00:38:55,730 --> 00:38:57,480 So now if you have a cycle, the whole idea 549 00:38:57,480 --> 00:39:01,750 is that we're going to delete one of the edges of the cycle. 550 00:39:01,750 --> 00:39:06,710 And if you do that, we will be able to construct a smaller 551 00:39:06,710 --> 00:39:11,420 subgraph than T, essentially, with a smaller number of edges. 552 00:39:11,420 --> 00:39:15,370 And that contradicts this particular statement over here. 553 00:39:15,370 --> 00:39:17,900 So that's sort of how we go ahead. 554 00:39:17,900 --> 00:39:21,370 So if it has a cycle, let's write it out.
555 00:39:21,370 --> 00:39:26,195 For example, here we have a cycle like this. 556 00:39:28,870 --> 00:39:36,820 Well, let's prove that if we remove one edge of the cycle, 557 00:39:36,820 --> 00:39:41,280 that we still have a connected subgraph. 558 00:39:41,280 --> 00:39:44,180 So the whole graph T is much bigger, right? 559 00:39:44,180 --> 00:39:49,480 It has lots of connections and so on. 560 00:39:49,480 --> 00:39:53,470 Suppose we will remove this particular edge over here. 561 00:39:53,470 --> 00:39:55,100 So we remove this one. 562 00:39:57,650 --> 00:40:03,060 Do we still have a connected subgraph of G 563 00:40:03,060 --> 00:40:06,060 that covers all the vertices of G? 564 00:40:06,060 --> 00:40:08,010 Well, of course it covers all the vertices of G 565 00:40:08,010 --> 00:40:09,980 because we only removed an edge over here. 566 00:40:09,980 --> 00:40:11,630 So that's fine. 567 00:40:11,630 --> 00:40:13,900 All these parts are still connected to the rest. 568 00:40:16,910 --> 00:40:23,080 If you can show that it is connected, then well we just-- 569 00:40:23,080 --> 00:40:24,380 well, we removed one edge. 570 00:40:24,380 --> 00:40:27,210 So we actually created this smaller graph. 571 00:40:27,210 --> 00:40:28,510 And then we get a contradiction. 572 00:40:28,510 --> 00:40:30,800 And then we're done with our proof. 573 00:40:30,800 --> 00:40:35,330 So let's see whether we can prove that if we remove 574 00:40:35,330 --> 00:40:37,350 this particular edge over here, that we still 575 00:40:37,350 --> 00:40:39,110 have a connected subgraph. 576 00:40:39,110 --> 00:40:40,055 So how can we do this? 577 00:40:42,760 --> 00:40:51,382 Well, let's take-- so the first case would be if a vertex 578 00:40:51,382 --> 00:40:56,020 x is connected to y. 579 00:40:56,020 --> 00:41:00,000 So I want to take any pair x, y of vertices.
580 00:41:00,000 --> 00:41:04,210 And suppose that the path that connects x and y 581 00:41:04,210 --> 00:41:09,030 does not contain this particular edge e that I remove here. 582 00:41:09,030 --> 00:41:18,000 So suppose this does not contain e. 583 00:41:18,000 --> 00:41:22,770 So I look at the graph T. I know T is connected. 584 00:41:22,770 --> 00:41:25,110 So I take a pair of vertices x and y. 585 00:41:25,110 --> 00:41:31,020 I know there is a path between x and y in T. 586 00:41:31,020 --> 00:41:35,360 If this path does not contain the edge e, 587 00:41:35,360 --> 00:41:39,410 then I know that the same path still exists in the same graph 588 00:41:39,410 --> 00:41:41,620 where I've removed this particular edge. 589 00:41:41,620 --> 00:41:43,960 So that's great, right? 590 00:41:43,960 --> 00:41:46,110 I have shown that x and y are still 591 00:41:46,110 --> 00:41:51,250 connected in the new graph where I've removed this e. 592 00:41:51,250 --> 00:41:52,870 So now let's look at another case. 593 00:41:52,870 --> 00:41:56,440 For example, x is over here. 594 00:41:56,440 --> 00:42:01,500 And it is connected all the way through here 595 00:42:01,500 --> 00:42:06,770 over this particular edge all the way to, say, y over here. 596 00:42:06,770 --> 00:42:08,360 So this is case number two. 597 00:42:15,720 --> 00:42:19,660 Well, are x and y still connected? 598 00:42:19,660 --> 00:42:20,650 Yes they are, right? 599 00:42:20,650 --> 00:42:25,610 Because I have removed e from a cycle. 600 00:42:25,610 --> 00:42:30,400 So I can still go into the other direction of the cycle 601 00:42:30,400 --> 00:42:34,540 and connect back towards y. 602 00:42:34,540 --> 00:42:38,710 So what we see here is that x can go all the way to here. 603 00:42:38,710 --> 00:42:41,110 But it can still go over the cycle back 604 00:42:41,110 --> 00:42:45,090 to here, and then to y. 605 00:42:45,090 --> 00:42:48,410 So we are still connected.
606 00:42:48,410 --> 00:42:51,120 So this shows that in both of the cases, 607 00:42:51,120 --> 00:42:55,260 for any pair of vertices, even after removal of e, 608 00:42:55,260 --> 00:42:59,050 the graph T minus this edge e is still connected. 609 00:43:01,650 --> 00:43:02,990 So let's write this out. 610 00:43:02,990 --> 00:43:20,540 So all vertices in G are still connected 611 00:43:20,540 --> 00:43:31,150 after removing e from T. 612 00:43:31,150 --> 00:43:32,730 So now we are done. 613 00:43:32,730 --> 00:43:35,320 So what was that again? 614 00:43:35,320 --> 00:43:37,290 We now constructed a smaller graph-- 615 00:43:37,290 --> 00:43:42,540 smaller in the sense that it has one less edge-- that is still 616 00:43:42,540 --> 00:43:46,890 connected, and with the same vertices as G. 617 00:43:46,890 --> 00:43:49,470 But I assumed that T was already the smallest 618 00:43:49,470 --> 00:43:52,450 such graph, such subgraph. 619 00:43:52,450 --> 00:43:54,370 So we have a contradiction. 620 00:43:54,370 --> 00:43:56,190 So what does this mean? 621 00:43:56,190 --> 00:43:58,870 This means that our original assumption that we have over 622 00:43:58,870 --> 00:44:03,170 here cannot be true because we started off with this 623 00:44:03,170 --> 00:44:04,530 assumption. 624 00:44:04,530 --> 00:44:07,460 Because of this, we could start to construct this. 625 00:44:07,460 --> 00:44:11,705 And then you could derive this whole argumentation. 626 00:44:11,705 --> 00:44:13,330 And in the end, we get a contradiction. 627 00:44:13,330 --> 00:44:15,750 So our original assumption is wrong. 628 00:44:15,750 --> 00:44:20,040 So actually, this is not possible. 629 00:44:20,040 --> 00:44:23,760 There does not exist a connected graph G that has no ST. 630 00:44:23,760 --> 00:44:27,120 So the theorem is true. 631 00:44:27,120 --> 00:44:30,590 So this is an important theorem. 
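The argument above suggests a constructive reading: starting from a connected graph, keep deleting edges that lie on a cycle until none remain, and what is left is a spanning tree. The sketch below illustrates that idea in Python. It is not part of the lecture; the function names and the edge-list representation are assumptions made for this example, and it assumes a simple graph with no repeated edges.

```python
from collections import defaultdict


def is_connected(vertices, edges):
    """Return True if every vertex can be reached from the first vertex."""
    vertices = list(vertices)
    if not vertices:
        return True
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(vertices)


def spanning_tree_by_edge_removal(vertices, edges):
    """Shrink a connected simple graph to a spanning tree by deleting cycle edges."""
    if not is_connected(vertices, edges):
        raise ValueError("graph must be connected")
    kept = list(edges)
    for edge in list(edges):
        trial = [e for e in kept if e != edge]
        # Removing an edge keeps the graph connected exactly when
        # that edge lies on a cycle, as in the two cases of the proof.
        if is_connected(vertices, trial):
            kept = trial
    return kept


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3.
example_vertices = [0, 1, 2, 3]
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
example_tree = spanning_tree_by_edge_removal(example_vertices, example_edges)
```

One pass over the edges is enough here: an edge that survives was needed for connectivity when it was tested, and deleting further edges afterwards cannot make it less necessary.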
632 00:44:30,590 --> 00:44:34,416 So now we can start talking about weighted minimum spanning 633 00:44:34,416 --> 00:44:34,915 trees. 634 00:44:40,620 --> 00:44:48,020 So let's use this picture over here and assign some weights 635 00:44:48,020 --> 00:44:49,260 to this graph. 636 00:44:52,000 --> 00:45:08,800 So for example, we have 1, 2, 2, 3, 3, 3, a 1 over here, 637 00:45:08,800 --> 00:45:15,130 another 1, 1, 4, and 7. 638 00:45:15,130 --> 00:45:20,950 And let's construct a spanning tree. 639 00:45:20,950 --> 00:45:23,110 So let me give an example over here. 640 00:45:23,110 --> 00:45:25,980 Actually, this is an example that we have right here 641 00:45:25,980 --> 00:45:29,580 in the thick lines in green. 642 00:45:29,580 --> 00:45:32,430 So what is the weight of the spanning tree 643 00:45:32,430 --> 00:45:33,580 that we have here? 644 00:45:33,580 --> 00:45:35,920 So we simply add the weights of all the edges. 645 00:45:35,920 --> 00:45:38,020 So we have 7, 3. 646 00:45:38,020 --> 00:45:38,960 That makes 10. 647 00:45:38,960 --> 00:45:44,800 11, 14, another 2, 16, 17, 18, 19. 648 00:45:44,800 --> 00:45:53,650 So the weight of this particular spanning tree is 19. 649 00:45:53,650 --> 00:45:55,700 So now the problem is-- and that's 650 00:45:55,700 --> 00:45:57,970 the rest of the lecture-- how can you 651 00:45:57,970 --> 00:45:59,790 construct a minimum weight spanning 652 00:45:59,790 --> 00:46:05,870 tree, one that has the minimum weight possible? 653 00:46:05,870 --> 00:46:09,380 So we see another example. 654 00:46:09,380 --> 00:46:11,280 So could I swap some edges around 655 00:46:11,280 --> 00:46:15,560 or something like that such that I can get a lesser weight? 656 00:46:15,560 --> 00:46:20,940 So for example, over here, I have-- for example, 657 00:46:20,940 --> 00:46:27,290 this node over here is connected by an edge that has weight 3.
658 00:46:27,290 --> 00:46:29,590 But I could also have connected it 659 00:46:29,590 --> 00:46:33,530 to this particular one over here, which is only weight 1. 660 00:46:33,530 --> 00:46:39,220 If I do this, I actually improve the spanning tree construction. 661 00:46:39,220 --> 00:46:44,035 So I will get something that looks like-- 662 00:46:47,480 --> 00:46:51,095 Let's see-- that looks like this. 663 00:46:54,980 --> 00:46:58,590 And now we replaced 3 by 1. 664 00:46:58,590 --> 00:47:02,260 So it gets 17. 665 00:47:02,260 --> 00:47:06,060 So can we do anything less than this? 666 00:47:06,060 --> 00:47:08,580 That's maybe-- it's still a tree, right? 667 00:47:08,580 --> 00:47:09,850 Because there's no cycle. 668 00:47:09,850 --> 00:47:14,300 Here we have this, this, this, this part over here. 669 00:47:14,300 --> 00:47:16,230 If you want to have a connected subgraph, 670 00:47:16,230 --> 00:47:18,040 you always need the 7 in here. 671 00:47:21,490 --> 00:47:25,170 Well, it seems like you cannot really replace anything 672 00:47:25,170 --> 00:47:27,020 by something cheaper. 673 00:47:27,020 --> 00:47:34,467 Like any edge in here seems to be pretty necessary. 674 00:47:34,467 --> 00:47:36,300 Of course, what I could do is I could create 675 00:47:36,300 --> 00:47:38,000 other minimum spanning trees. 676 00:47:38,000 --> 00:47:41,060 So actually it turns out that 17 is the minimum weight. 677 00:47:41,060 --> 00:47:43,640 For example, I could, instead of this edge, 678 00:47:43,640 --> 00:47:45,270 I could use this edge. 679 00:47:45,270 --> 00:47:46,930 That's another solution. 680 00:47:46,930 --> 00:47:50,070 Or instead of this edge, I could use this edge. 681 00:47:57,960 --> 00:48:02,080 So how do we construct such a minimum weight spanning tree? 682 00:48:02,080 --> 00:48:06,330 Can we somehow find an algorithm that creates this? 683 00:48:06,330 --> 00:48:09,220 And that's the big problem of today.
684 00:48:09,220 --> 00:48:11,860 So let me write out one spanning tree 685 00:48:11,860 --> 00:48:16,890 that I will use for the rest of this class. 686 00:48:16,890 --> 00:48:23,604 And maybe I will use a little bit different color. 687 00:48:33,930 --> 00:48:36,720 Let's use red over here. 688 00:48:39,892 --> 00:48:40,850 I hope this is visible. 689 00:48:43,280 --> 00:48:49,828 So we have red, this one, and this one, 690 00:48:49,828 --> 00:48:54,300 and this one, and this one, and this one. 691 00:48:54,300 --> 00:48:56,480 That's going to be-- this is going 692 00:48:56,480 --> 00:48:59,440 to be the spanning tree that I'm going to look at. 693 00:48:59,440 --> 00:49:00,730 This is also weight 17. 694 00:49:08,750 --> 00:49:11,780 So let's first define what a minimum weight spanning tree 695 00:49:11,780 --> 00:49:15,180 is, just to be very precise. 696 00:49:15,180 --> 00:49:17,610 And then let's have an algorithm. 697 00:49:17,610 --> 00:49:22,790 Maybe you have already ideas of how this could be done. 698 00:49:22,790 --> 00:49:24,610 It turns out actually that if you just 699 00:49:24,610 --> 00:49:26,890 start with a greedy approach, where you just 700 00:49:26,890 --> 00:49:31,420 keep adding the edges of minimum weight 701 00:49:31,420 --> 00:49:36,630 that you can still add without creating any cycles, 702 00:49:36,630 --> 00:49:38,300 you will get an algorithm that is 703 00:49:38,300 --> 00:49:41,030 going to do the trick for you. 704 00:49:41,030 --> 00:49:43,830 So that's kind of surprising. 705 00:49:43,830 --> 00:49:46,555 So let's talk about the definition. 706 00:49:50,560 --> 00:50:01,830 So the minimum spanning tree of an edge weighted graph 707 00:50:01,830 --> 00:50:06,030 is defined as-- 708 00:50:12,410 --> 00:50:17,270 It's defined as the spanning tree of G such 709 00:50:17,270 --> 00:50:31,850 that it has the smallest possible sum of edge weights.
710 00:50:40,290 --> 00:50:42,810 Well the algorithm that I'm thinking about 711 00:50:42,810 --> 00:50:46,530 here is very straightforward. 712 00:50:46,530 --> 00:50:49,610 Actually I'll use this blackboard 713 00:50:49,610 --> 00:50:52,160 because this algorithm will be the focus 714 00:50:52,160 --> 00:50:53,930 of the rest of the lecture. 715 00:51:08,820 --> 00:51:09,870 So what's the algorithm? 716 00:51:13,950 --> 00:51:20,430 We actually grow a subgraph step by step. 717 00:51:20,430 --> 00:51:35,360 So one edge at a time such that at each step 718 00:51:35,360 --> 00:51:55,850 we want to add the minimum weight edge that 719 00:51:55,850 --> 00:51:57,900 keeps the subgraph acyclic. 720 00:52:13,320 --> 00:52:14,940 So this is our algorithm. 721 00:52:14,940 --> 00:52:18,010 Let's have a look how this works in this particular graph 722 00:52:18,010 --> 00:52:18,733 over here. 723 00:52:24,660 --> 00:52:28,530 So suppose I would start off with-- well, 724 00:52:28,530 --> 00:52:31,080 if I start with this algorithm. 725 00:52:31,080 --> 00:52:34,340 I take, say, any minimum weight edge. 726 00:52:34,340 --> 00:52:36,120 Say I take this particular one. 727 00:52:36,120 --> 00:52:38,160 So this could be my first edge that I choose. 728 00:52:40,710 --> 00:52:43,980 Then I may want to choose another minimum weight 729 00:52:43,980 --> 00:52:47,199 edge such that I do not create a cycle. 730 00:52:47,199 --> 00:52:48,490 Well, which one could I choose? 731 00:52:48,490 --> 00:52:49,770 I could choose this one. 732 00:52:49,770 --> 00:52:52,494 I could also choose this one. 733 00:52:52,494 --> 00:52:54,160 With this one, I could choose this part. 734 00:52:54,160 --> 00:52:55,826 With this one, I could choose that part. 735 00:52:55,826 --> 00:52:57,430 I could also choose this one. 736 00:52:57,430 --> 00:53:00,780 So let's choose this one actually. 
737 00:53:00,780 --> 00:53:04,540 This would be our second step and our second edge 738 00:53:04,540 --> 00:53:06,490 that we select in our algorithm. 739 00:53:06,490 --> 00:53:08,130 There are two disjoint edges. 740 00:53:08,130 --> 00:53:09,820 So there's definitely no cycle. 741 00:53:12,600 --> 00:53:15,810 Well, maybe now our third step, I 742 00:53:15,810 --> 00:53:21,390 can choose one of these edges, which also has just weight 743 00:53:21,390 --> 00:53:23,030 1, which is minimum weight. 744 00:53:23,030 --> 00:53:25,323 So this could be my third possibility. 745 00:53:27,960 --> 00:53:29,570 Then what's the fourth possibility? 746 00:53:29,570 --> 00:53:35,890 Well, I have to choose one of those actually. 747 00:53:35,890 --> 00:53:40,350 I cannot choose this weight 1 edge. 748 00:53:40,350 --> 00:53:40,850 Why not? 749 00:53:40,850 --> 00:53:43,400 Because that would create a cycle. 750 00:53:43,400 --> 00:53:45,520 So I'm not allowed to choose this one anymore. 751 00:53:45,520 --> 00:53:47,340 So I choose one of these two. 752 00:53:47,340 --> 00:53:50,860 So for example, I choose this one over here. 753 00:53:50,860 --> 00:53:55,410 This is the fourth edge that I add to the whole picture. 754 00:53:55,410 --> 00:53:57,200 So note that in the meantime, 755 00:53:57,200 --> 00:54:00,370 I have two disconnected components. 756 00:54:00,370 --> 00:54:07,300 One over here, which is this arc, and another arc over here. 757 00:54:07,300 --> 00:54:11,860 So which edge do I attach now in the algorithm? 758 00:54:11,860 --> 00:54:15,050 Well, I can still add this minimum weight edge 759 00:54:15,050 --> 00:54:20,140 over here of weight two, because it does not create any cycle. 760 00:54:20,140 --> 00:54:22,860 So this would be the fifth one 761 00:54:22,860 --> 00:54:25,500 that I add to the whole picture. 762 00:54:25,500 --> 00:54:28,750 Well, now I can choose either one of those.
763 00:54:28,750 --> 00:54:31,310 So for example, this would be my sixth step. 764 00:54:31,310 --> 00:54:35,420 And then finally, I can choose none of these 765 00:54:35,420 --> 00:54:37,410 because those would create cycles. 766 00:54:37,410 --> 00:54:39,350 But I can choose this one. 767 00:54:39,350 --> 00:54:42,990 So this is going to be the seventh one that I 768 00:54:42,990 --> 00:54:47,270 add to the whole graph that I construct here 769 00:54:47,270 --> 00:54:49,060 and grow in this algorithm. 770 00:54:49,060 --> 00:54:51,510 It turns out that this algorithm actually creates for you 771 00:54:51,510 --> 00:54:53,360 a minimum spanning tree. 772 00:54:53,360 --> 00:54:55,700 And the proof is much more complex 773 00:54:55,700 --> 00:54:58,120 than what you have seen before. 774 00:54:58,120 --> 00:55:00,970 So that's what we're going to do right now. 775 00:55:03,650 --> 00:55:06,050 So when we think about this algorithm, 776 00:55:06,050 --> 00:55:10,220 then somehow I want to guarantee that at every single step, 777 00:55:10,220 --> 00:55:14,090 I will be able to grow into a minimum spanning tree. 778 00:55:14,090 --> 00:55:18,090 And that's the basic intuition that we have. 779 00:55:18,090 --> 00:55:21,520 And that's the lemma that we want to prove first. 780 00:55:21,520 --> 00:55:29,227 So formalizing, we get the following. 781 00:55:29,227 --> 00:55:30,935 We want to prove the following statement. 782 00:55:39,040 --> 00:55:46,870 So the lemma that will help us is let S be the set of edges 783 00:55:46,870 --> 00:55:53,100 that I have been selecting up to now in the algorithm. 784 00:55:53,100 --> 00:56:00,175 So let S consist, for example, of the first m edges. 785 00:56:10,250 --> 00:56:17,760 What we want to show is that we can still 786 00:56:17,760 --> 00:56:20,960 extend this set of edges into a minimum spanning tree. 
787 00:56:20,960 --> 00:56:23,190 We can still grow within the algorithm 788 00:56:23,190 --> 00:56:24,950 into a minimum spanning tree. 789 00:56:24,950 --> 00:56:28,280 That is what you want to show. 790 00:56:28,280 --> 00:56:34,300 So we want to show that there exists a minimum spanning tree 791 00:56:34,300 --> 00:56:41,942 T that has the vertex set V and an edge set E. 792 00:56:43,270 --> 00:56:46,220 So this is the minimum spanning tree for the graph G 793 00:56:46,220 --> 00:56:57,950 such that S is actually a subset of the edges 794 00:56:57,950 --> 00:56:59,200 in this minimum spanning tree. 795 00:56:59,200 --> 00:57:02,780 So this is a nice mathematical formulation 796 00:57:02,780 --> 00:57:08,836 that really precisely states that we can still 797 00:57:08,836 --> 00:57:10,710 keep on growing into a minimum spanning tree. 798 00:57:10,710 --> 00:57:13,560 So how would this lemma help us in proving the theorem 799 00:57:13,560 --> 00:57:16,980 that we want to show? 800 00:57:16,980 --> 00:57:20,110 Actually, where is the theorem? 801 00:57:20,110 --> 00:57:23,080 I did not write down the theorem so far. 802 00:57:23,080 --> 00:57:25,180 I can put this over here. 803 00:57:25,180 --> 00:57:26,910 So the theorem that we want to show 804 00:57:26,910 --> 00:57:43,990 is that for any connected weighted graph G, 805 00:57:43,990 --> 00:57:46,180 the algorithm creates a minimum spanning tree. 806 00:57:51,990 --> 00:57:59,960 The algorithm produces a minimum spanning tree. 807 00:57:59,960 --> 00:58:02,580 So how can we prove this theorem? 808 00:58:02,580 --> 00:58:04,840 Suppose we know that this lemma is true, 809 00:58:04,840 --> 00:58:07,050 which we will prove later. 810 00:58:07,050 --> 00:58:10,010 Then how do we know that this theorem holds? 811 00:58:16,020 --> 00:58:18,630 Actually, I'm wiping out this particular theorem over here. 812 00:58:18,630 --> 00:58:20,530 We will use this in a moment.
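The greedy rule from the blackboard (at each step, add the minimum weight edge that keeps the subgraph acyclic) is commonly known as Kruskal's algorithm. The Python sketch below is one way to implement it and is not taken from the lecture: the function name, the `(weight, u, v)` edge format, and the union-find bookkeeping used to detect cycles are all choices made for this example, and the sample graph is hypothetical rather than the one on the board.

```python
def greedy_minimum_spanning_tree(num_vertices, weighted_edges):
    """Pick edges in increasing weight order, skipping any that would close a cycle.

    Vertices are assumed to be the integers 0 .. num_vertices - 1 and
    weighted_edges is a list of (weight, u, v) tuples.
    """
    # parent[] encodes which component each vertex currently belongs to.
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the component representative,
        # shortening the chain along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        # Endpoints in different components: the edge cannot create a cycle.
        if root_u != root_v:
            parent[root_u] = root_v
            chosen.append((weight, u, v))
    return chosen


# Hypothetical 4-vertex example, not the graph drawn in the lecture.
sample_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (7, 2, 3), (4, 1, 3)]
sample_tree = greedy_minimum_spanning_tree(4, sample_edges)
sample_weight = sum(weight for weight, _, _ in sample_tree)
```

In this sample the edges of weight 3 and 7 are skipped because each would close a cycle, which mirrors the step in the walkthrough where a weight 1 edge had to be passed over.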
813 00:58:20,530 --> 00:58:23,990 So we know that every connected graph has a spanning tree. 814 00:58:23,990 --> 00:58:26,740 We already know that. 815 00:58:26,740 --> 00:58:28,880 But now we want to show that we can even 816 00:58:28,880 --> 00:58:31,480 construct a minimum spanning tree for a weighted graph. 817 00:58:34,070 --> 00:58:36,140 So how are we going ahead? 818 00:58:38,850 --> 00:58:45,820 The proof of the theorem is as follows. 819 00:58:45,820 --> 00:58:47,940 I will suppose that the number of edges-- 820 00:58:47,940 --> 00:58:52,320 the number of vertices-- is actually equal to n. 821 00:58:52,320 --> 00:58:57,750 And if we want to show that we get to a minimum spanning tree, 822 00:58:57,750 --> 00:59:00,820 we want to first of all show that the algorithm 823 00:59:00,820 --> 00:59:05,840 is able to choose at least n minus 1 edges, right? 824 00:59:05,840 --> 00:59:08,100 Because a tree has n minus 1 edges. 825 00:59:08,100 --> 00:59:09,395 We have shown this before. 826 00:59:13,220 --> 00:59:15,030 The algorithm should not get stuck. 827 00:59:15,030 --> 00:59:17,280 So let's first prove that. 828 00:59:17,280 --> 00:59:23,160 So suppose I have chosen less than n minus 1 edges. 829 00:59:26,310 --> 00:59:31,920 So suppose we have less than n minus 1 edges picked. 830 00:59:31,920 --> 00:59:35,220 Well then what do we know? 831 00:59:35,220 --> 00:59:44,460 Well, there exists an edge in E minus S-- 832 00:59:44,460 --> 00:59:48,510 well, let me first write it out-- 833 00:59:48,510 --> 00:59:58,440 that can be added without creating a cycle. 834 01:00:02,120 --> 01:00:03,760 So why is this? 835 01:00:03,760 --> 01:00:08,220 Well, I know that E is the edge set 836 01:00:08,220 --> 01:00:10,760 of my minimum spanning tree. 837 01:00:10,760 --> 01:00:18,390 I know that S is a part of-- is a subset of all the edges. 838 01:00:18,390 --> 01:00:21,490 I know that a minimum spanning tree is a tree.
839 01:00:21,490 --> 01:00:24,570 It's a tree on n vertices. 840 01:00:24,570 --> 01:00:29,870 We know that a tree on n vertices has n minus 1 edges. 841 01:00:29,870 --> 01:00:35,070 I know that less than n minus 1 edges have been chosen so far. 842 01:00:35,070 --> 01:00:40,720 So that means that S has less than n minus 1 edges. 843 01:00:40,720 --> 01:00:42,650 E has exactly n minus 1 edges. 844 01:00:42,650 --> 01:00:45,915 So there's at least one edge that is contained, 845 01:00:45,915 --> 01:00:52,345 that is an element in E, but not in S. 846 01:00:52,345 --> 01:00:55,610 And for that particular edge-- well, 847 01:00:55,610 --> 01:00:58,390 since it's part of this tree, together 848 01:00:58,390 --> 01:01:05,340 with S, which is also part of all the edges within the tree-- 849 01:01:05,340 --> 01:01:08,770 well, we know that that edge, for that reason, 850 01:01:08,770 --> 01:01:11,990 will not create a cycle. 851 01:01:11,990 --> 01:01:14,830 Because otherwise, the original tree would have a cycle. 852 01:01:17,690 --> 01:01:20,340 So we know that the algorithm can always 853 01:01:20,340 --> 01:01:25,300 select an edge such that no cycle is created. 854 01:01:25,300 --> 01:01:28,960 And that means that we can keep on going in the algorithm 855 01:01:28,960 --> 01:01:33,410 until at least n minus 1 edges are selected. 856 01:01:33,410 --> 01:01:36,695 So what happens if we have n minus 1 edges? 857 01:01:40,020 --> 01:01:49,960 So once n minus 1 edges have been chosen, 858 01:01:49,960 --> 01:01:54,500 well, then S has exactly n minus 1 edges. 859 01:01:54,500 --> 01:01:58,450 E has exactly n minus 1 edges, because it's a tree. 860 01:01:58,450 --> 01:01:59,930 S is a subset of E. 861 01:01:59,930 --> 01:02:02,330 So they must be equal to one another.
862 01:02:02,330 --> 01:02:04,160 So essentially what we have here is 863 01:02:04,160 --> 01:02:10,040 that S exactly defines the minimum spanning tree 864 01:02:10,040 --> 01:02:11,730 that we are looking for. 865 01:02:11,730 --> 01:02:15,880 So the algorithm at the very end, 866 01:02:15,880 --> 01:02:18,010 when it has chosen n minus 1 edges, 867 01:02:18,010 --> 01:02:21,540 produces a minimum spanning tree. 868 01:02:21,540 --> 01:02:34,130 So we know that S defines the edges of a minimum weight 869 01:02:34,130 --> 01:02:35,970 spanning tree. 870 01:02:35,970 --> 01:02:38,610 So if this lemma really holds true, 871 01:02:38,610 --> 01:02:41,226 then this theorem can be shown. 872 01:02:41,226 --> 01:02:42,850 So now let's have a look at this lemma. 873 01:02:42,850 --> 01:02:44,830 And we're going to use that picture over there 874 01:02:44,830 --> 01:02:47,610 to see how we can prove this. 875 01:02:47,610 --> 01:02:51,610 So this is a little bit tricky. 876 01:02:51,610 --> 01:03:00,340 And the way to go about this is actually to use induction. 877 01:03:00,340 --> 01:03:03,640 So let's see whether we can do this together. 878 01:03:11,800 --> 01:03:13,740 So the proof of the lemma is as follows. 879 01:03:20,620 --> 01:03:23,940 We want to use induction. 880 01:03:23,940 --> 01:03:27,160 On what do you think we are going to induct here? 881 01:03:27,160 --> 01:03:29,560 Well, we want to prove this particular lemma, 882 01:03:29,560 --> 01:03:31,490 and we have this particular variable m. 883 01:03:31,490 --> 01:03:34,790 So that seems to be a good way to continue. 884 01:03:34,790 --> 01:03:36,840 So we'll use induction on m. 885 01:03:41,740 --> 01:03:47,030 Let's state the induction hypothesis. 886 01:03:47,030 --> 01:03:49,000 Well, it's exactly the same as in the lemma.
887 01:03:49,000 --> 01:03:54,780 So we say for all G and for all sets 888 01:03:54,780 --> 01:04:12,730 S that consists of the first m selected edges, 889 01:04:12,730 --> 01:04:22,120 there exists a minimum spanning tree of G, with edge set E, such 890 01:04:22,120 --> 01:04:28,780 that S is a subset of E. 891 01:04:28,780 --> 01:04:31,020 So this is our induction hypothesis. 892 01:04:31,020 --> 01:04:34,720 And as always, we start with the base case. 893 01:04:34,720 --> 01:04:36,980 And the base case is going to be pretty 894 01:04:36,980 --> 01:04:38,900 straightforward in this particular case. 895 01:04:42,750 --> 01:04:44,690 We have to do a little bit of work for it. 896 01:04:44,690 --> 01:04:46,660 So let's see how that works. 897 01:04:46,660 --> 01:04:50,620 So the base case is m equals 0. 898 01:04:50,620 --> 01:04:54,880 So let's reread this statement where m is equal to 0. 899 01:04:54,880 --> 01:05:03,780 Well, if we have zero selected edges, then S is the empty set. 900 01:05:03,780 --> 01:05:05,900 So I have to prove that for all G, 901 01:05:05,900 --> 01:05:09,240 there exists a minimum spanning tree such 902 01:05:09,240 --> 01:05:13,010 that the empty set is a subset of E. Well, the empty set is 903 01:05:13,010 --> 01:05:14,691 always a subset of E, right? 904 01:05:14,691 --> 01:05:15,565 So that's no problem. 905 01:05:18,370 --> 01:05:20,570 So what I need to prove is that for all G, 906 01:05:20,570 --> 01:05:25,780 there exists a minimum spanning tree of G. 907 01:05:25,780 --> 01:05:28,610 And this is where our previous theorem comes in, 908 01:05:28,610 --> 01:05:32,010 because we showed that for all graphs G, 909 01:05:32,010 --> 01:05:33,450 there exists a spanning tree. 910 01:05:33,450 --> 01:05:34,950 And if there exists a spanning tree, 911 01:05:34,950 --> 01:05:38,540 there exists a minimum weight spanning tree. 912 01:05:38,540 --> 01:05:39,660 So that's the base case.
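Written out in symbols, the induction hypothesis and the base case read roughly as follows. The notation P(m), T = (V, E), and MST is an assumed shorthand for what is said on the board, not a quotation from it.

```latex
\[
P(m):\quad \forall G\ \ \forall S \text{ consisting of the first } m \text{ selected edges},\ \
\exists \text{ an MST } T = (V, E) \text{ of } G \text{ such that } S \subseteq E .
\]
\[
\text{Base case } (m = 0):\quad S = \emptyset \subseteq E
\ \text{ for every MST } T = (V, E) \text{ of } G,
\ \text{ and an MST of } G \text{ exists by the previous theorem.}
\]
```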
913 01:05:39,660 --> 01:05:46,380 So m equals 0 implies that S is equal to the empty set. 914 01:05:46,380 --> 01:05:48,530 So that means that S is definitely 915 01:05:48,530 --> 01:05:58,420 a subset of any set of edges of a minimum weight spanning tree. 916 01:06:02,300 --> 01:06:07,030 And now we're going to use our theorem to show that there also 917 01:06:07,030 --> 01:06:08,570 exists a minimum spanning tree. 918 01:06:08,570 --> 01:06:10,950 So this in itself is not yet sufficient, right? 919 01:06:10,950 --> 01:06:13,970 You see that. 920 01:06:13,970 --> 01:06:17,450 I know that if S is the empty set, 921 01:06:17,450 --> 01:06:19,120 then I know that this always holds. 922 01:06:19,120 --> 01:06:21,230 But the statement is a little bit bigger. 923 01:06:21,230 --> 01:06:23,989 For all G, I still need to prove there 924 01:06:23,989 --> 01:06:25,280 exists a minimum spanning tree. 925 01:06:25,280 --> 01:06:27,950 And that's what our previous theorem told us. 926 01:06:27,950 --> 01:06:29,800 So we'll now write that part out. 927 01:06:29,800 --> 01:06:31,090 So we have the base case. 928 01:06:35,130 --> 01:06:36,130 Let's see. 929 01:06:42,680 --> 01:06:49,020 I think I will go and prove this over here. 930 01:06:49,020 --> 01:06:52,106 So let's look at the-- 931 01:06:53,840 --> 01:06:55,850 That's the inductive step. 932 01:06:55,850 --> 01:06:59,750 And then hopefully we can play a little bit with this picture 933 01:06:59,750 --> 01:07:01,916 up here to get some insight. 934 01:07:06,550 --> 01:07:09,050 So how do we do this? 935 01:07:09,050 --> 01:07:13,480 Well, with an inductive step, we start as usual. 936 01:07:17,880 --> 01:07:23,765 We assume that P of m holds. 937 01:07:26,390 --> 01:07:27,880 Now how do we go ahead? 938 01:07:27,880 --> 01:07:34,160 So I want to prove P of m plus 1, which is stated over here.
939 01:07:34,160 --> 01:07:38,150 So I want to consider the set that 940 01:07:38,150 --> 01:07:40,960 consists of the first m plus 1 selected edges. 941 01:07:40,960 --> 01:07:46,140 So well, essentially what I'm interested in 942 01:07:46,140 --> 01:07:49,940 is what happens in the m plus 1 step. 943 01:07:49,940 --> 01:08:03,484 So let e denote the edge that I added in the m plus 1 step. 944 01:08:10,720 --> 01:08:16,319 And let S be the first m selected edges. 945 01:08:16,319 --> 01:08:19,500 And we know that we can apply P of m for S, right? 946 01:08:19,500 --> 01:08:22,960 Because we assume this in our inductive step. 947 01:08:22,960 --> 01:08:25,140 So we can apply something for S. We 948 01:08:25,140 --> 01:08:27,554 would like to show P of m plus 1. 949 01:08:27,554 --> 01:08:28,970 So we would like to show something 950 01:08:28,970 --> 01:08:30,710 for the union of those two. 951 01:08:30,710 --> 01:08:33,870 So let me repeat that. 952 01:08:33,870 --> 01:08:40,800 Let S denote the first m selected edges. 953 01:08:44,330 --> 01:08:54,430 So by our induction hypothesis, we can-- well, let's apply it. 954 01:08:54,430 --> 01:08:57,810 We know that there exists a minimum spanning 955 01:08:57,810 --> 01:09:01,700 tree such that S is a subset of the edges. 956 01:09:01,700 --> 01:09:06,050 So let's just pick one such minimum spanning tree. 957 01:09:06,050 --> 01:09:11,490 So let T star be such a minimum spanning tree. 958 01:09:11,490 --> 01:09:13,490 It has edges E star. 959 01:09:23,710 --> 01:09:27,640 And we know that S is actually a subset of E star. 960 01:09:34,890 --> 01:09:37,507 Let's look at the very first case. 961 01:09:37,507 --> 01:09:39,840 So what are the kind of cases that we are interested in? 962 01:09:39,840 --> 01:09:41,700 So let's think again. 963 01:09:41,700 --> 01:09:42,990 What do we need to prove? 964 01:09:42,990 --> 01:09:45,420 We need to prove P of m plus 1.
965 01:09:45,420 --> 01:09:48,670 So we need to prove something about the union of S and e. 966 01:09:48,670 --> 01:09:52,729 We want to show that there is a minimum spanning tree such 967 01:09:52,729 --> 01:09:56,220 that the edge set of this minimum spanning tree 968 01:09:56,220 --> 01:10:01,190 contains both S and also e. 969 01:10:01,190 --> 01:10:05,100 So what kind of cases can we handle here? 970 01:10:05,100 --> 01:10:09,590 Well, what will be a really good first case? 971 01:10:09,590 --> 01:10:13,260 The first case could be that e is actually already 972 01:10:13,260 --> 01:10:15,790 part of E star. 973 01:10:15,790 --> 01:10:20,980 Well, if that's true, then of course, S together with e 974 01:10:20,980 --> 01:10:24,700 is also a subset of E star. 975 01:10:24,700 --> 01:10:32,450 And that means that we are done, in this particular case. 976 01:10:32,450 --> 01:10:33,570 And why is this? 977 01:10:33,570 --> 01:10:38,220 Because this is an example of a minimum spanning tree that 978 01:10:38,220 --> 01:10:41,025 contains both S and e. So it contains the m 979 01:10:41,025 --> 01:10:43,430 plus 1 first selected edges. 980 01:10:43,430 --> 01:10:45,830 And so this is P of m plus 1. 981 01:10:45,830 --> 01:10:47,890 Now the next case is the difficult part. 982 01:10:52,350 --> 01:10:59,400 OK, let's wipe out-- actually, we 983 01:10:59,400 --> 01:11:03,150 do not really need the proof of the theorem anymore. 984 01:11:03,150 --> 01:11:05,930 So let's take this off the blackboard. 985 01:11:12,910 --> 01:11:18,910 So the way to do this is to sort of see 986 01:11:18,910 --> 01:11:26,040 how we can use the tree, T star, and somehow transform it 987 01:11:26,040 --> 01:11:29,680 into another tree for this particular case 988 01:11:29,680 --> 01:11:33,565 where we assume that e is not in E star. 989 01:11:39,950 --> 01:11:44,070 So let's have a look at that graph up there. 990 01:11:44,070 --> 01:11:45,730 Let me rewrite it a little bit.
991 01:11:45,730 --> 01:11:50,020 And let's look at what happens in the algorithm 992 01:11:50,020 --> 01:11:51,200 after two steps. 993 01:11:51,200 --> 01:11:54,980 So I will redraw the graph. 994 01:11:54,980 --> 01:12:03,920 So we have, say, these edges. 995 01:12:13,970 --> 01:12:19,370 This one-- let me see. 996 01:12:19,370 --> 01:12:24,100 I want to use some different colors to make this as clear 997 01:12:24,100 --> 01:12:26,630 as possible. 998 01:12:26,630 --> 01:12:28,623 We have this one. 999 01:12:28,623 --> 01:12:30,516 We have this one. 1000 01:12:30,516 --> 01:12:33,170 We have this one. 1001 01:12:33,170 --> 01:12:35,140 And we also have this one actually. 1002 01:12:35,140 --> 01:12:36,440 Sorry. 1003 01:12:36,440 --> 01:12:41,740 All right, so let's define the different edges. 1004 01:12:41,740 --> 01:12:46,290 The straight ones form the set S. 1005 01:12:46,290 --> 01:12:49,990 So after two steps over there, we have selected this edge 1006 01:12:49,990 --> 01:12:52,630 first, and then this edge was the second edge 1007 01:12:52,630 --> 01:12:55,150 that we selected in our algorithm. 1008 01:12:55,150 --> 01:12:59,860 So these two together form the set S. 1009 01:12:59,860 --> 01:13:02,100 So what is T star? 1010 01:13:02,100 --> 01:13:05,270 T star could be, for example, the spanning tree 1011 01:13:05,270 --> 01:13:14,155 that contains both these edges plus the dotted ones. 1012 01:13:16,770 --> 01:13:18,300 So let's have a look here. 1013 01:13:18,300 --> 01:13:21,390 This is indeed a tree, right? 1014 01:13:21,390 --> 01:13:26,580 And if we look into this picture over here, 1015 01:13:26,580 --> 01:13:30,710 you can see that the weight of this particular tree is 17. 1016 01:13:30,710 --> 01:13:33,180 It's also a minimum weight spanning tree. 1017 01:13:35,890 --> 01:13:37,240 What is G?
1018 01:13:37,240 --> 01:13:41,500 G is the complete graph and contains not only these two 1019 01:13:41,500 --> 01:13:46,245 types of edges, but also these sort of more orange colored 1020 01:13:46,245 --> 01:13:46,744 ones. 1021 01:13:50,910 --> 01:13:54,490 So this is sort of the situation that we are in. 1022 01:13:54,490 --> 01:13:58,870 Now in our algorithm, we want to choose 1023 01:13:58,870 --> 01:14:02,350 this particular third edge. 1024 01:14:02,350 --> 01:14:09,140 So this is going to be our e, this particular edge. 1025 01:14:09,140 --> 01:14:13,320 And as you can see, e is not part of T star. 1026 01:14:13,320 --> 01:14:16,190 It's not one of those two kinds of edges. 1027 01:14:16,190 --> 01:14:20,230 It's actually one of the orange edges. 1028 01:14:20,230 --> 01:14:22,400 So what can we do here? 1029 01:14:22,400 --> 01:14:29,640 Well, the idea is that we want to create, like, a new 1030 01:14:29,640 --> 01:14:35,530 tree in which we somehow exchange e-- exchange one 1031 01:14:35,530 --> 01:14:39,330 of the edges in T star with e, and still 1032 01:14:39,330 --> 01:14:42,740 sort of maintain the minimum spanning tree property. 1033 01:14:42,740 --> 01:14:43,970 So how can you prove this? 1034 01:14:52,590 --> 01:14:57,190 The way to do this is to figure out 1035 01:14:57,190 --> 01:15:00,610 what the algorithm teaches us about e, 1036 01:15:00,610 --> 01:15:04,771 and also what the tree property tells us. 1037 01:15:08,040 --> 01:15:10,410 So let's combine all our knowledge. 1038 01:15:10,410 --> 01:15:13,470 So first of all, we know that the algorithm, 1039 01:15:13,470 --> 01:15:18,470 from its definition, implies that S together 1040 01:15:18,470 --> 01:15:24,370 with e, actually has no cycle. 1041 01:15:24,370 --> 01:15:28,750 That's how we select the m plus 1 edge. 1042 01:15:28,750 --> 01:15:32,710 There's no-- all selected edges together 1043 01:15:32,710 --> 01:15:34,020 should not contain a cycle.
1044 01:15:34,020 --> 01:15:36,040 That's the definition of the algorithm. 1045 01:15:36,040 --> 01:15:39,430 We know that T star is a tree. 1046 01:15:39,430 --> 01:15:45,750 And this implies that if you look at the graph that 1047 01:15:45,750 --> 01:15:49,710 has the vertices V together 1048 01:15:49,710 --> 01:15:56,250 with the edges E star and the edge e, then this must contain a cycle. 1049 01:15:59,930 --> 01:16:02,626 Now this property I will not prove right now, 1050 01:16:02,626 --> 01:16:04,000 but it is pretty straightforward. 1051 01:16:04,000 --> 01:16:05,840 It's a one line proof actually in the book. 1052 01:16:05,840 --> 01:16:07,080 So you can look it up. 1053 01:16:07,080 --> 01:16:10,050 So it simply states that if you have a tree, if you 1054 01:16:10,050 --> 01:16:15,500 add another edge, an edge that is not contained in E star, 1055 01:16:15,500 --> 01:16:19,110 then we will create a cycle. 1056 01:16:19,110 --> 01:16:22,780 So now we can combine these two statements together 1057 01:16:22,780 --> 01:16:30,500 and conclude that, well, if S with e has no cycle, 1058 01:16:30,500 --> 01:16:36,940 but E star with e has a cycle, and S is contained in E star, 1059 01:16:36,940 --> 01:16:39,180 well there's only one possibility. 1060 01:16:39,180 --> 01:16:52,420 And that is that this cycle must have an edge, e prime, that 1061 01:16:52,420 --> 01:16:56,210 is in E star minus S. 1062 01:16:56,210 --> 01:16:59,630 Because if it did not have such an edge, 1063 01:16:59,630 --> 01:17:08,440 then all the edges of the cycle must be located in S-- in S 1064 01:17:08,440 --> 01:17:11,432 together with e. But S together with e has no cycle. 1065 01:17:11,432 --> 01:17:12,390 So that's not possible. 1066 01:17:12,390 --> 01:17:14,460 So this cycle must have an edge e 1067 01:17:14,460 --> 01:17:18,850 prime that is outside S, but still in E star. 1068 01:17:18,850 --> 01:17:22,000 So we can find one over here.
1069 01:17:22,000 --> 01:17:24,800 e prime can be this particular edge. 1070 01:17:24,800 --> 01:17:28,820 So e prime is in E star. 1071 01:17:28,820 --> 01:17:33,980 But it is not already selected in S. 1072 01:17:33,980 --> 01:17:36,085 So now we are going to do a trick. 1073 01:17:38,830 --> 01:17:43,560 The idea is to swap e and e prime. 1074 01:17:43,560 --> 01:17:45,580 So that's the main idea. 1075 01:17:45,580 --> 01:17:51,620 So let's first write down what we know about the weight of e 1076 01:17:51,620 --> 01:17:53,770 prime and e. 1077 01:17:53,770 --> 01:18:04,370 So since the algorithm, well, could have selected e 1078 01:18:04,370 --> 01:18:07,300 or e prime-- it did select e, right? 1079 01:18:07,300 --> 01:18:09,670 But it could have also selected e prime, 1080 01:18:09,670 --> 01:18:15,150 because e prime does not create a cycle with S. 1081 01:18:15,150 --> 01:18:18,760 That's what we have seen here. 1082 01:18:18,760 --> 01:18:27,920 So this really implies therefore that the weight of e 1083 01:18:27,920 --> 01:18:35,520 is at most the weight of e prime. 1084 01:18:35,520 --> 01:18:36,210 So why is this? 1085 01:18:36,210 --> 01:18:39,540 Because the algorithm always chooses the minimum weight. 1086 01:18:39,540 --> 01:18:43,515 And it can choose between e or e prime, but it chose e. 1087 01:18:43,515 --> 01:18:47,320 So the weight of e must be at most the weight of e prime. 1088 01:18:47,320 --> 01:18:49,620 So that's what we know also. 1089 01:18:49,620 --> 01:18:54,030 So now let's swap e and e prime in T star. 1090 01:18:54,030 --> 01:18:55,900 And then we're almost done with this proof, 1091 01:18:55,900 --> 01:19:02,295 because that will be the tree of which we will prove that it 1092 01:19:02,295 --> 01:19:04,386 is a minimum spanning tree. 1093 01:19:10,870 --> 01:19:22,510 So the key idea is to swap e and e prime in T star. How do we do it?
1094 01:19:22,510 --> 01:19:27,390 Well, let T double star be equal to V 1095 01:19:27,390 --> 01:19:32,080 with the edge set E double star where 1096 01:19:32,080 --> 01:19:38,530 E double star is really equal to, well, the original set E 1097 01:19:38,530 --> 01:19:39,260 star. 1098 01:19:39,260 --> 01:19:43,000 But we take out of this e prime. 1099 01:19:43,000 --> 01:19:46,330 So this one is taken out of the minimum spanning tree. 1100 01:19:46,330 --> 01:19:48,614 And we add e in here. 1101 01:19:52,310 --> 01:19:54,440 Now we want to prove that this particular T 1102 01:19:54,440 --> 01:19:57,100 double star is a minimum spanning tree that 1103 01:19:57,100 --> 01:19:59,810 fits our lemma. 1104 01:19:59,810 --> 01:20:03,180 So what do we know? 1105 01:20:03,180 --> 01:20:05,366 We know that T double star is acyclic. 1106 01:20:10,500 --> 01:20:12,030 And why is this? 1107 01:20:12,030 --> 01:20:25,500 Well, we actually removed e prime from the only cycle-- so 1108 01:20:25,500 --> 01:20:33,000 look up here also-- the only cycle in E star 1109 01:20:33,000 --> 01:20:35,720 together with e. 1110 01:20:35,720 --> 01:20:39,110 So it's acyclic. 1111 01:20:39,110 --> 01:20:41,130 T double star is also connected. 1112 01:20:43,660 --> 01:20:45,620 And why is that? 1113 01:20:45,620 --> 01:20:52,440 Well, e prime is removed-- but it was on a cycle. 1114 01:20:52,440 --> 01:20:53,988 So it remains connected. 1115 01:20:58,680 --> 01:21:04,300 And finally, we also know that T double star actually 1116 01:21:04,300 --> 01:21:15,530 contains all the vertices in G. So all these three together 1117 01:21:15,530 --> 01:21:22,830 prove that T double star is a spanning tree of G, 1118 01:21:22,830 --> 01:21:26,700 because it's acyclic, connected, contains all the vertices. 1119 01:21:26,700 --> 01:21:29,780 Is it a minimum weight spanning tree?
1120 01:21:29,780 --> 01:21:34,840 Now we are really almost done, because-- let me write it out 1121 01:21:34,840 --> 01:21:35,980 here. 1122 01:21:35,980 --> 01:21:42,980 We know that the weight of T double star 1123 01:21:42,980 --> 01:21:48,270 is at most the weight of T star. 1124 01:21:48,270 --> 01:21:49,210 Why is this? 1125 01:21:49,210 --> 01:21:53,570 Well, we showed just now that the weight of e 1126 01:21:53,570 --> 01:21:55,350 is at most the weight of e prime. 1127 01:21:55,350 --> 01:21:58,360 So we have exchanged e and e prime. 1128 01:21:58,360 --> 01:22:00,750 So that means that the weight of T double star 1129 01:22:00,750 --> 01:22:04,110 is at most the weight of T star. 1130 01:22:04,110 --> 01:22:12,750 We also know that T star is a minimum spanning tree. 1131 01:22:12,750 --> 01:22:17,600 So if you combine these two together, 1132 01:22:17,600 --> 01:22:20,290 then we know that the weight of T double star cannot get less 1133 01:22:20,290 --> 01:22:22,010 than the minimum weight possibility. 1134 01:22:22,010 --> 01:22:25,080 So it also has a minimum weight. 1135 01:22:25,080 --> 01:22:30,940 So T double star is a minimum spanning tree. 1136 01:22:30,940 --> 01:22:33,250 So now we're done because in this case, 1137 01:22:33,250 --> 01:22:37,720 we have constructed a T double star such 1138 01:22:37,720 --> 01:22:44,720 that S together with e is a subset of E double star. 1139 01:22:44,720 --> 01:22:50,630 And that means that we have shown that our induction 1140 01:22:50,630 --> 01:22:52,110 step is true. 1141 01:22:52,110 --> 01:22:54,800 We have shown that P of m plus 1 holds. 1142 01:22:54,800 --> 01:22:58,040 So now we finally have figured out that the lemma is true. 1143 01:22:58,040 --> 01:23:05,080 And now we showed already how to use the lemma in our proof 1144 01:23:05,080 --> 01:23:07,570 for this particular theorem. 1145 01:23:07,570 --> 01:23:11,030 So this is the end of this lecture.
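The exchange step of the proof can also be tried out on a small example. The following is only a sketch under stated assumptions: edges are (weight, u, v) tuples, the tree is given as a list of such edges, and the names tree_path and exchange as well as the example graph are hypothetical, chosen for illustration rather than taken from the lecture.

```python
from collections import defaultdict


def tree_path(tree_edges, start, goal):
    """Return the edges on the unique path from start to goal in a tree."""
    adj = defaultdict(list)
    for w, u, v in tree_edges:
        adj[u].append((v, (w, u, v)))
        adj[v].append((u, (w, u, v)))
    stack = [(start, None, [])]  # (current vertex, previous vertex, path so far)
    while stack:
        node, prev, path = stack.pop()
        if node == goal:
            return path
        for nxt, edge in adj[node]:
            if nxt != prev:  # in a tree, avoiding the previous vertex avoids loops
                stack.append((nxt, node, path + [edge]))
    return None  # only reached if the edges do not connect start and goal


def exchange(tree_edges, selected, e):
    """Swap e into the tree: remove an edge e' on the cycle with e' not in selected."""
    _, u, v = e
    # The tree path between the endpoints of e, together with e, is the cycle.
    cycle_without_e = tree_path(tree_edges, u, v)
    e_prime = next(edge for edge in cycle_without_e if edge not in selected)
    new_tree = [edge for edge in tree_edges if edge != e_prime] + [e]
    return e_prime, new_tree


# Hypothetical example: S holds the first selected edge, T* is one minimum
# spanning tree containing S, and e is the next selected edge, not in T*.
S = [(1, 0, 1)]
T_star = [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
e = (2, 0, 2)
e_prime, T_double_star = exchange(T_star, S, e)
```

In this example the weight of e equals the weight of e prime, so the new tree has the same total weight as T star and contains S together with e, as in the proof.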
1146 01:23:11,030 --> 01:23:14,620 And tomorrow, you will actually go again over this proof 1147 01:23:14,620 --> 01:23:16,830 because it's rather complex. 1148 01:23:16,830 --> 01:23:19,740 And then hopefully you really get into all 1149 01:23:19,740 --> 01:23:21,110 the different proof techniques. 1150 01:23:21,110 --> 01:23:21,610 OK. 1151 01:23:21,610 --> 01:23:23,510 Thank you.