1 00:00:07,000 --> 00:00:10,000 -- shortest paths. This is the finale. 2 00:00:10,000 --> 00:00:13,000 Hopefully it was worth waiting for. 3 00:00:13,000 --> 00:00:17,000 Remind you there's a quiz coming up soon, 4 00:00:17,000 --> 00:00:23,000 you should be studying for it. There's no problem set due at 5 00:00:23,000 --> 00:00:28,000 the same time as the quiz because you should be studying 6 00:00:28,000 --> 00:00:32,000 now. It's a take-home exam. 7 00:00:32,000 --> 00:00:37,000 It's required that you come to class on Monday. 8 00:00:37,000 --> 00:00:43,000 Of course, you'll all come, but everyone watching at home 9 00:00:43,000 --> 00:00:47,000 should also come next Monday to get the quiz. 10 00:00:47,000 --> 00:00:53,000 It's the required lecture. So, we need a bit of a recap in 11 00:00:53,000 --> 00:00:58,000 the trilogy so far. So, the last two lectures, 12 00:00:58,000 --> 00:01:04,000 the last two episodes, were about single source shortest 13 00:01:04,000 --> 00:01:08,000 paths. So, we wanted to find the 14 00:01:08,000 --> 00:01:13,000 shortest path from a source vertex to every other vertex. 15 00:01:13,000 --> 00:01:17,000 And, we saw a few algorithms for this. 16 00:01:17,000 --> 00:01:21,000 Here's some recap. We saw in the unweighted case, 17 00:01:21,000 --> 00:01:27,000 that was sort of the easiest where all the edge weights were 18 00:01:27,000 --> 00:01:30,000 one. Then we could use breadth first 19 00:01:30,000 --> 00:01:34,000 search. And this costs what we call 20 00:01:34,000 --> 00:01:41,000 linear time in the graph world, the number of vertices plus the 21 00:01:41,000 --> 00:01:46,000 number of edges. The next simplest case, 22 00:01:46,000 --> 00:01:50,000 perhaps, is nonnegative edge weights. 23 00:01:50,000 --> 00:01:54,000 And in that case, what algorithm do we use? 24 00:01:54,000 --> 00:02:00,000 Dijkstra, all right, everyone's awake. 25 00:02:00,000 --> 00:02:04,000 Several answers at once, great.
26 00:02:04,000 --> 00:02:11,000 So this takes almost linear time if you use a good heap 27 00:02:11,000 --> 00:02:15,000 structure, so, V log V plus E. 28 00:02:15,000 --> 00:02:21,000 And, in the general case, general weights, 29 00:02:21,000 --> 00:02:26,000 we would use Bellman-Ford which you saw. 30 00:02:26,000 --> 00:02:33,000 And that costs VE, good, OK, which is quite a bit 31 00:02:33,000 --> 00:02:38,000 worse. This is ignoring log factors. 32 00:02:38,000 --> 00:02:42,000 Dijkstra is basically linear time, Bellman-Ford is 33 00:02:42,000 --> 00:02:45,000 quadratic if you have a connected graph. 34 00:02:45,000 --> 00:02:49,000 So, in the sparse case, when E is order V, 35 00:02:49,000 --> 00:02:52,000 this is about linear. This is about quadratic. 36 00:02:52,000 --> 00:02:56,000 In the dense case, when E is about V^2, 37 00:02:56,000 --> 00:03:00,000 this is quadratic, and this is cubic. 38 00:03:00,000 --> 00:03:06,000 So, Dijkstra and Bellman-Ford are separated by about an order 39 00:03:06,000 --> 00:03:09,000 of V factor, which is pretty bad. 40 00:03:09,000 --> 00:03:15,000 OK, but that's the best we know how to do for single source 41 00:03:15,000 --> 00:03:19,000 shortest paths, negative edge weights, 42 00:03:19,000 --> 00:03:24,000 Bellman-Ford is the best. We also saw in recitation the 43 00:03:24,000 --> 00:03:30,000 case of a DAG. And there, what do you do? 44 00:03:30,000 --> 00:03:32,000 Topological sort, yeah. 45 00:03:32,000 --> 00:03:39,000 So, you can do a topological sort to get an ordering on the 46 00:03:39,000 --> 00:03:42,000 vertices. Then you run Bellman-Ford, 47 00:03:42,000 --> 00:03:47,000 one round. This is one way to think of 48 00:03:47,000 --> 00:03:51,000 what's going on. You run Bellman-Ford in the 49 00:03:51,000 --> 00:03:57,000 order given by the topological sort, which is once, 50 00:03:57,000 --> 00:04:03,000 and you get a linear time algorithm.
51 00:04:03,000 --> 00:04:06,000 So, DAG is another case where we know how to do well even with 52 00:04:06,000 --> 00:04:08,000 weights. Unweighted, we can also do 53 00:04:08,000 --> 00:04:10,000 linear time. But most of the time, 54 00:04:10,000 --> 00:04:13,000 though, will be, so you should keep these in 55 00:04:13,000 --> 00:04:15,000 mind in the quiz. When you get a shortest path 56 00:04:15,000 --> 00:04:19,000 problem, or what you end up determining is the shortest path 57 00:04:19,000 --> 00:04:22,000 problem, think about what's the best algorithm you can use in 58 00:04:22,000 --> 00:04:24,000 that case? OK, so that's single source 59 00:04:24,000 --> 00:04:27,000 shortest paths. And so, in our evolution of the 60 00:04:27,000 --> 00:04:30,000 Death Star, initially it was just nonnegative edge weights. 61 00:04:30,000 --> 00:04:34,000 Then we got negative edge weights. 62 00:04:34,000 --> 00:04:37,000 Today, the Death Star challenges us with all pair 63 00:04:37,000 --> 00:04:40,000 shortest paths, where we want to know the 64 00:04:40,000 --> 00:04:44,000 shortest path weight between every pair of vertices. 65 00:04:59,000 --> 00:05:03,000 OK, so let's get some quick results. 66 00:05:03,000 --> 00:05:07,000 What could we do with this case? 67 00:05:07,000 --> 00:05:13,000 So, for example, suppose I have an unweighted 68 00:05:13,000 --> 00:05:18,000 graph. Any suggestions of how I should 69 00:05:18,000 --> 00:05:26,000 compute all pair shortest paths? Between every pair of vertices, 70 00:05:26,000 --> 00:05:32,000 I want to know the shortest path weight. 71 00:05:32,000 --> 00:05:37,000 BFS, a couple more words? Yeah? 72 00:05:37,000 --> 00:05:44,000 Right, BFS V times. OK, I'll say V times BFS, 73 00:05:44,000 --> 00:05:49,000 OK? So, the running time would be 74 00:05:49,000 --> 00:05:57,000 V^2 plus V times E, yeah, which is assuming your 75 00:05:57,000 --> 00:06:03,000 graph is connected, V times E. 76 00:06:03,000 --> 00:06:05,000 OK, good. 
That's probably about the best 77 00:06:05,000 --> 00:06:07,000 algorithm we know for unweighted graphs. 78 00:06:07,000 --> 00:06:11,000 So, a lot of these are going to sort of be the obvious answer. 79 00:06:11,000 --> 00:06:15,000 You take your single source algorithm, you run it V times. 80 00:06:15,000 --> 00:06:18,000 That's the best you can do, OK, or the best we know how to 81 00:06:18,000 --> 00:06:19,000 do. This is not so bad. 82 00:06:19,000 --> 00:06:22,000 This is like one iteration of Bellman-Ford, 83 00:06:22,000 --> 00:06:25,000 for comparison. We definitely need at least, 84 00:06:25,000 --> 00:06:27,000 like, V^2 time, because the size of the output 85 00:06:27,000 --> 00:06:32,000 is V^2, the shortest path weights we have to compute. 86 00:06:32,000 --> 00:06:37,000 So, this is not perfect, but pretty good. 87 00:06:37,000 --> 00:06:41,000 And we are not going to improve on that. 88 00:06:41,000 --> 00:06:49,000 So, nonnegative edge weights: the natural thing to do is to 89 00:06:49,000 --> 00:06:54,000 run Dijkstra V times, OK, no big surprise. 90 00:06:54,000 --> 00:07:01,000 And the running time of that is, well, V times E again, 91 00:07:01,000 --> 00:07:08,000 plus V^2 log V, which is also not too bad. 92 00:07:08,000 --> 00:07:10,000 I mean, it's basically the same as running BFS. 93 00:07:10,000 --> 00:07:12,000 And then, there's the log factor. 94 00:07:12,000 --> 00:07:16,000 If you ignore the log factor, this is the dominant term. 95 00:07:16,000 --> 00:07:18,000 And, I mean, this had an additive V^2 as 96 00:07:18,000 --> 00:07:20,000 well. So, these are both pretty good. 97 00:07:20,000 --> 00:07:22,000 I mean, this is kind of neat. Essentially, 98 00:07:22,000 --> 00:07:26,000 the time it takes to run one Bellman-Ford plus a log factor, 99 00:07:26,000 --> 00:07:29,000 you can compute all pair shortest paths if you have 100 00:07:29,000 --> 00:07:35,000 nonnegative edge weights.
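To make the "run your single source algorithm V times" idea concrete, here is a minimal sketch for the nonnegative-weight case. The adjacency-list layout (`adj[u]` is a list of `(v, w)` pairs, vertices numbered from 0) and the function names are assumptions made for illustration, not something from the lecture. It also uses Python's binary heap, so each run costs roughly (V + E) log V rather than the V log V + E bound quoted for a good heap structure.

```python
import heapq

INF = float("inf")

def dijkstra(adj, source):
    """Single-source shortest path weights, assuming nonnegative edge weights.

    adj[u] is a list of (v, w) pairs (a hypothetical layout for this sketch).
    """
    dist = [INF] * len(adj)
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; u was already settled with a smaller weight
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def all_pairs_by_dijkstra(adj):
    """Run Dijkstra once from every vertex; row s holds the weights from s."""
    return [dijkstra(adj, s) for s in range(len(adj))]
```

Replacing `dijkstra` with a BFS gives the unweighted version in the same way.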
So, I mean, comparing all pairs 101 00:07:35,000 --> 00:07:39,000 to single source, this seems a lot better, 102 00:07:39,000 --> 00:07:45,000 except we can only handle nonnegative edge weights. 103 00:07:45,000 --> 00:07:49,000 OK, so now let's think about the general case. 104 00:07:49,000 --> 00:07:55,000 Well, this is the focus of today, and here's where we can 105 00:07:55,000 --> 00:08:02,000 actually make an improvement. So the obvious thing is V times 106 00:08:02,000 --> 00:08:08,000 Bellman-Ford, which would cost V^2 times E. 107 00:08:08,000 --> 00:08:11,000 And that's pretty pitiful, and we're going to try to 108 00:08:11,000 --> 00:08:15,000 improve that to something closer to that nonnegative edge weight 109 00:08:15,000 --> 00:08:17,000 bound. So it turns out, 110 00:08:17,000 --> 00:08:21,000 here, we can actually make an improvement whereas in these 111 00:08:21,000 --> 00:08:24,000 special cases, we really can't do much better. 112 00:08:24,000 --> 00:08:26,000 OK, I don't have a good intuition why, 113 00:08:26,000 --> 00:08:30,000 but it's the case. So, we'll cover something like 114 00:08:30,000 --> 00:08:34,000 three algorithms today for this problem. 115 00:08:34,000 --> 00:08:37,000 The last one will be the best, but along the way we'll see 116 00:08:37,000 --> 00:08:40,000 some nice connections between shortest paths and dynamic 117 00:08:40,000 --> 00:08:42,000 programming, which we haven't really seen yet. 118 00:08:42,000 --> 00:08:46,000 We've seen shortest paths, and applying greedy algorithms 119 00:08:46,000 --> 00:08:49,000 to them, but today we'll actually do dynamic programming. 120 00:08:49,000 --> 00:08:51,000 The intuition is that with all pair shortest paths, 121 00:08:51,000 --> 00:08:54,000 there's more potential subproblem reuse. 122 00:08:54,000 --> 00:08:57,000 We've got to compute the shortest path from x to y for 123 00:08:57,000 --> 00:08:59,000 all x and y.
Maybe we can reuse those 124 00:08:59,000 --> 00:09:03,000 shortest paths in computing other shortest paths. 125 00:09:03,000 --> 00:09:07,000 OK, there's a bit more reusability, let's say. 126 00:09:07,000 --> 00:09:12,000 OK, let me quickly define all pair shortest paths formally, 127 00:09:12,000 --> 00:09:17,000 because we're going to change our notation slightly. 128 00:09:17,000 --> 00:09:20,000 It's because we care about all pairs. 129 00:09:20,000 --> 00:09:24,000 So, as usual, the input is a directed graph, 130 00:09:24,000 --> 00:09:29,000 so, vertices and edges. We're going to say that the 131 00:09:29,000 --> 00:09:35,000 vertices are labeled one to n for convenience because with all 132 00:09:35,000 --> 00:09:42,000 pairs, we're going to think of things more as an n by n matrix 133 00:09:42,000 --> 00:09:48,000 instead of edges in some sense because it doesn't help to think 134 00:09:48,000 --> 00:09:51,000 any more in terms of adjacency lists. 135 00:09:51,000 --> 00:09:55,000 And, you have edge weights as usual. 136 00:09:55,000 --> 00:10:00,000 This is what makes it interesting. 137 00:10:00,000 --> 00:10:05,000 Some of them are going to be negative. 138 00:10:05,000 --> 00:10:13,000 So, w maps every edge to a real number, and the target output is 139 00:10:13,000 --> 00:10:20,000 a shortest path matrix. So, this is now an n by n 140 00:10:20,000 --> 00:10:25,000 matrix (n is just the number of 141 00:10:25,000 --> 00:10:32,000 vertices) of shortest path weights. 142 00:10:32,000 --> 00:10:37,000 So, delta of i, j is the shortest path weight 143 00:10:37,000 --> 00:10:42,000 from i to j for all pairs of vertices. 144 00:10:42,000 --> 00:10:50,000 So this, you could represent as an n by n matrix in particular. 145 00:10:50,000 --> 00:10:57,000 OK, so now let's start doing algorithms.
146 00:10:57,000 --> 00:11:02,000 So, we have this very simple algorithm, V times Bellman-Ford, 147 00:11:02,000 --> 00:11:06,000 V^2 times E, and just for comparison's sake, 148 00:11:06,000 --> 00:11:09,000 I'm going to say, let me rewrite that, 149 00:11:09,000 --> 00:11:14,000 V times Bellman-Ford gives us this running time of V^2 E, 150 00:11:14,000 --> 00:11:18,000 and I'm going to think about the case where, 151 00:11:18,000 --> 00:11:23,000 let's just say the graph is dense, meaning that the number 152 00:11:23,000 --> 00:11:29,000 of edges is quadratic in the number of vertices. 153 00:11:29,000 --> 00:11:33,000 So in that case, this will take V^4 time, 154 00:11:33,000 --> 00:11:37,000 which is pretty slow. We'd like to do better. 155 00:11:37,000 --> 00:11:43,000 So, first goal would just be to beat V^4, V hypercubed, 156 00:11:43,000 --> 00:11:46,000 I guess. OK, and we are going to use 157 00:11:46,000 --> 00:11:52,000 dynamic programming to do that. Or at least that's where the 158 00:11:52,000 --> 00:11:58,000 motivation will come from. It will take us a while before 159 00:11:58,000 --> 00:12:03,000 we can even beat V^4, which is maybe a bit pathetic, 160 00:12:03,000 --> 00:12:10,000 but it takes some clever insights, let's say. 161 00:12:10,000 --> 00:12:19,000 OK, so I'm going to introduce a bit more notation for this 162 00:12:19,000 --> 00:12:25,000 graph. So, I'm going to think about 163 00:12:25,000 --> 00:12:33,000 the weighted adjacency matrix. So, I don't think we've really 164 00:12:33,000 --> 00:12:37,000 seen this in lecture before, although I think it's in the 165 00:12:37,000 --> 00:12:39,000 appendix. What that means, 166 00:12:39,000 --> 00:12:44,000 so normally an adjacency matrix is like one if there's an edge, 167 00:12:44,000 --> 00:12:47,000 and zero if there isn't. And this is in a digraph, 168 00:12:47,000 --> 00:12:50,000 so you have to be a little bit careful.
169 00:12:50,000 --> 00:12:54,000 Here, these values, the entries in the matrix, 170 00:12:54,000 --> 00:12:57,000 are going to be the weights of the edges. 171 00:12:57,000 --> 00:13:01,000 OK, a_ij is the edge weight if ij is an edge. 172 00:13:01,000 --> 00:13:04,000 So, it's the weight if ij is an edge in the graph, and it's going to be 173 00:13:04,000 --> 00:13:08,000 infinity if there is no edge. OK, in terms of shortest paths, 174 00:13:08,000 --> 00:13:12,000 this is a more useful way to represent the graph. 175 00:13:12,000 --> 00:13:16,000 All right, and so this includes everything that we need from 176 00:13:16,000 --> 00:13:18,000 here. And now we just have to think 177 00:13:18,000 --> 00:13:21,000 about it as a matrix. Matrices will be a useful tool 178 00:13:21,000 --> 00:13:25,000 in a little while. OK, so now I'm going to define 179 00:13:25,000 --> 00:13:28,000 some subproblems. And, there's different ways 180 00:13:28,000 --> 00:13:32,000 that you could define what's going on in the shortest paths 181 00:13:32,000 --> 00:13:35,000 problem. OK, the natural thing is I want 182 00:13:35,000 --> 00:13:39,000 to go from vertex i to vertex j. What's the shortest path? 183 00:13:39,000 --> 00:13:42,000 OK, we need to refine the subproblems a little bit more than 184 00:13:42,000 --> 00:13:43,000 that. Not surprising. 185 00:13:43,000 --> 00:13:46,000 And if you think about my analogy to Bellman-Ford, 186 00:13:46,000 --> 00:13:50,000 what Bellman-Ford does is it tries to build longer and longer 187 00:13:50,000 --> 00:13:52,000 shortest paths. But here, length is in terms of 188 00:13:52,000 --> 00:13:55,000 the number of edges. So, first, it builds shortest 189 00:13:55,000 --> 00:13:58,000 paths of length one. We've proven the first round it 190 00:13:58,000 --> 00:14:01,000 does that. The second round, 191 00:14:01,000 --> 00:14:06,000 it provides all shortest paths of length two, 192 00:14:06,000 --> 00:14:08,000 of count two, and so on.
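As a small illustration of the weighted adjacency matrix described above, here is a sketch that builds it from an edge list. The `(i, j, w)` edge format, the 0-based vertex labels (the lecture labels vertices 1 to n), and the function name are assumptions for this example. The zero diagonal is also an assumption at this point in the lecture, though it matches the later remark that there is always a zero weight edge from a vertex to itself.

```python
INF = float("inf")

def weighted_adjacency_matrix(n, edges):
    """Build the n-by-n matrix A with A[i][j] = w(i, j), or infinity if no edge.

    edges is an iterable of (i, j, w) triples with 0-based vertex labels
    (a hypothetical input format chosen for this sketch).
    """
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0  # assumed convention: zero weight "edge" to yourself
    for i, j, w in edges:
        A[i][j] = min(A[i][j], w)  # the graph is directed; keep the lightest parallel edge
    return A
```

Because the graph is directed, the matrix is generally not symmetric: an edge from i to j fills in `A[i][j]` only.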
193 00:14:08,000 --> 00:14:14,000 We'd like to do that sort of analogously, and try to reuse 194 00:14:14,000 --> 00:14:20,000 things a little bit more. So, I'm going to say d_ij^(m) 195 00:14:20,000 --> 00:14:26,000 is the weight of the shortest path from i to j with some 196 00:14:26,000 --> 00:14:33,000 restriction involving m. So: shortest path from i to j 197 00:14:33,000 --> 00:14:36,000 using at most m edges. OK, for example, 198 00:14:36,000 --> 00:14:41,000 if m is zero, then we don't have to really 199 00:14:41,000 --> 00:14:47,000 think very hard to find all shortest paths of length zero. 200 00:14:47,000 --> 00:14:50,000 OK, they use zero edges, I should say. 201 00:14:50,000 --> 00:14:57,000 So, Bellman-Ford sort of tells us how to go from m to m plus 202 00:14:57,000 --> 00:15:02,000 one. So, let's just figure that out. 203 00:15:02,000 --> 00:15:05,000 So one thing we know from the Bellman-Ford analysis is if we 204 00:15:05,000 --> 00:15:08,000 look at d_ij^(n-1), we know that in some sense the 205 00:15:08,000 --> 00:15:12,000 longest shortest path of relevance, unless you have a 206 00:15:12,000 --> 00:15:15,000 negative weight cycle, the longest shortest path of 207 00:15:15,000 --> 00:15:19,000 relevance is when m equals n minus one because that's the 208 00:15:19,000 --> 00:15:21,000 longest simple path you can have. 209 00:15:21,000 --> 00:15:24,000 So, this should be a shortest path weight from i to j, 210 00:15:24,000 --> 00:15:28,000 and it would be no matter what larger value you put in the 211 00:15:28,000 --> 00:15:32,000 superscript. This should be delta of i comma 212 00:15:32,000 --> 00:15:35,000 j if there's no negative weight cycles. 213 00:15:35,000 --> 00:15:38,000 OK, so this feels good for dynamic programming. 214 00:15:38,000 --> 00:15:43,000 This will give us the answer if we can compute this for all m. 215 00:15:43,000 --> 00:15:47,000 Then we'll have the shortest path weights in particular.
216 00:15:47,000 --> 00:15:50,000 We need a way to detect negative weight cycles, 217 00:15:50,000 --> 00:15:54,000 but let's not worry about that too much for now. 218 00:15:54,000 --> 00:15:58,000 There are negative weights, but let's just assume for now 219 00:15:58,000 --> 00:16:02,000 there's no negative weight cycles. 220 00:16:02,000 --> 00:16:06,000 OK, and we get a recurrence. 221 00:16:06,000 --> 00:16:10,000 And the base case is when m equals zero. 222 00:16:10,000 --> 00:16:16,000 This is pretty easy. If i and j are the same vertex, 223 00:16:16,000 --> 00:16:22,000 the weight is zero, and otherwise it's infinity. 224 00:16:22,000 --> 00:16:28,000 OK, and then the actual recursion is for m. 225 00:16:57,000 --> 00:17:00,000 OK, if I got this right, this is a pretty easy, 226 00:17:00,000 --> 00:17:05,000 intuitive recursion: d_ij^(m) is a min of smaller 227 00:17:05,000 --> 00:17:10,000 things in terms of m minus one. I'll just show the picture, 228 00:17:10,000 --> 00:17:14,000 and then the proof of that claim should be obvious. 229 00:17:14,000 --> 00:17:19,000 So, this is proof by picture. So, we have on the one hand, 230 00:17:19,000 --> 00:17:22,000 i over here, and j over here. 231 00:17:22,000 --> 00:17:25,000 We want to know the shortest path from i to j. 232 00:17:25,000 --> 00:17:30,000 And, we want to use, at most, m edges. 233 00:17:30,000 --> 00:17:34,000 So, the idea is, well, you could use m minus one 234 00:17:34,000 --> 00:17:39,000 edges to get somewhere. So this is, at most, 235 00:17:39,000 --> 00:17:42,000 m minus one edges, some other place, 236 00:17:42,000 --> 00:17:48,000 and we'll call it k. So this is a candidate for k. 237 00:17:48,000 --> 00:17:53,000 And then you could take the edge directly from k to j. 238 00:17:53,000 --> 00:18:00,000 So, this costs a_kj, and this costs d_ik^(m-1).
239 00:18:00,000 --> 00:18:02,000 OK, and that's a candidate path that uses, 240 00:18:02,000 --> 00:18:06,000 at most, m edges from i to j. And this is essentially just 241 00:18:06,000 --> 00:18:08,000 considering all of them. OK, so there's sort of many 242 00:18:08,000 --> 00:18:11,000 paths we are considering. All of these are candidate 243 00:18:11,000 --> 00:18:14,000 values of k. We are taking the min over all 244 00:18:14,000 --> 00:18:16,000 k as intermediate nodes, whatever. 245 00:18:16,000 --> 00:18:18,000 So there they are. We take the best such path. 246 00:18:18,000 --> 00:18:20,000 That should encompass all shortest paths. 247 00:18:20,000 --> 00:18:24,000 And this is essentially sort of what Bellman-Ford is doing, 248 00:18:24,000 --> 00:18:26,000 although not exactly. We also sort of want to think 249 00:18:26,000 --> 00:18:29,000 about, well, what if I just go directly with, 250 00:18:29,000 --> 00:18:34,000 say, m minus one edges? What if there is no edge here 251 00:18:34,000 --> 00:18:36,000 that I want to use, in some sense? 252 00:18:36,000 --> 00:18:40,000 Well, we always think about there being, and the way the A's 253 00:18:40,000 --> 00:18:45,000 are defined, there's always this zero weight edge to yourself. 254 00:18:45,000 --> 00:18:48,000 So, you could just take a path that's shorter, 255 00:18:48,000 --> 00:18:51,000 go from i to j, and j is a particular value of 256 00:18:51,000 --> 00:18:55,000 k that we might consider, and then take a zero weight 257 00:18:55,000 --> 00:19:00,000 edge at the end, a_jj. OK, so this really encompasses 258 00:19:00,000 --> 00:19:03,000 everything. So that's a pretty trivial 259 00:19:03,000 --> 00:19:06,000 claim. OK, now once we have such a 260 00:19:06,000 --> 00:19:08,000 recursion, we get a dynamic program. 261 00:19:08,000 --> 00:19:11,000 I mean, there, this is it in some sense. 262 00:19:11,000 --> 00:19:15,000 It's written recursively. You can write it bottom up.
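The recurrence itself was written on the board and does not appear in the spoken text. Based on the base case, the proof by picture, and the definitions of a and d above, it can be reconstructed as follows; the exact layout and index ranges are a reconstruction, not a quote.

```latex
\[
d_{ij}^{(0)} =
\begin{cases}
0 & \text{if } i = j,\\
\infty & \text{if } i \neq j,
\end{cases}
\qquad
d_{ij}^{(m)} = \min_{1 \le k \le n} \left( d_{ik}^{(m-1)} + a_{kj} \right)
\quad \text{for } m = 1, 2, \ldots, n-1,
\]
\[
\delta(i, j) = d_{ij}^{(n-1)} = d_{ij}^{(n)} = d_{ij}^{(n+1)} = \cdots
\quad \text{if there are no negative weight cycles.}
\]
```

Taking k = j with a_jj = 0 recovers the option of keeping the path that already used at most m - 1 edges.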
263 00:19:15,000 --> 00:19:19,000 And I would like to write it bottom up a little bit because 264 00:19:19,000 --> 00:19:23,000 while it doesn't look like it, this is a relaxation. 265 00:19:23,000 --> 00:19:26,000 This is yet another relaxation algorithm. 266 00:19:26,000 --> 00:19:29,000 So, I'll give you, so, this is sort of the 267 00:19:29,000 --> 00:19:31,000 algorithm. This is not a very interesting 268 00:19:31,000 --> 00:19:35,000 algorithm. So, you don't have to write it 269 00:19:35,000 --> 00:19:38,000 all down if you don't feel like it. 270 00:19:38,000 --> 00:19:40,000 It's probably not even in the book. 271 00:19:40,000 --> 00:19:42,000 This is just an intermediate step. 272 00:19:42,000 --> 00:19:45,000 So, we loop over all m. That's sort of the outermost 273 00:19:45,000 --> 00:19:48,000 thing to do. I want to build longer and 274 00:19:48,000 --> 00:19:51,000 longer paths, and this vaguely corresponds to 275 00:19:51,000 --> 00:19:53,000 Bellman-Ford, although it's actually worse 276 00:19:53,000 --> 00:19:56,000 than Bellman-Ford. But hey, what the heck? 277 00:19:56,000 --> 00:20:03,000 It's a stepping stone. OK, then for all i and j, 278 00:20:03,000 --> 00:20:10,000 and then we want to compute this min. 279 00:20:10,000 --> 00:20:17,000 So, we'll just loop over all k, and relax. 280 00:20:17,000 --> 00:20:26,000 And, here's where we're actually computing the min. 281 00:20:26,000 --> 00:20:35,000 And, it's a relaxation, is the point. 282 00:20:35,000 --> 00:20:38,000 This is our good friend, the relaxation step, 283 00:20:38,000 --> 00:20:40,000 relaxing edge. Well, it's not, 284 00:20:40,000 --> 00:20:42,000 yeah. I guess we're relaxing edge kj, 285 00:20:42,000 --> 00:20:45,000 or something, except we don't have the same 286 00:20:45,000 --> 00:20:48,000 clear notion. I mean, it's a particular thing 287 00:20:48,000 --> 00:20:52,000 that we're relaxing.
It's not just a single edge 288 00:20:52,000 --> 00:20:55,000 because we don't have a single source anymore. 289 00:20:55,000 --> 00:20:59,000 It's now relative to source i, we are relaxing the edge kj, 290 00:20:59,000 --> 00:21:03,000 something like that. But this is clearly a 291 00:21:03,000 --> 00:21:05,000 relaxation. We are just making the triangle 292 00:21:05,000 --> 00:21:08,000 inequality true if it wasn't before. 293 00:21:08,000 --> 00:21:11,000 The triangle inequality has got to hold between all pairs. 294 00:21:11,000 --> 00:21:14,000 And that's just implementing this min, right? 295 00:21:14,000 --> 00:21:17,000 You're taking d_ij. You take the min of what it was 296 00:21:17,000 --> 00:21:19,000 before in some sense. That was one of the 297 00:21:19,000 --> 00:21:23,000 possibilities we considered when we looked at the zero weight 298 00:21:23,000 --> 00:21:24,000 edge. We say, well, 299 00:21:24,000 --> 00:21:28,000 or you could go from i to some k in some way that we knew how 300 00:21:28,000 --> 00:21:32,000 to before, and then add on the edge, and check whether that's 301 00:21:32,000 --> 00:21:35,000 better. If it's better, set our current estimate to 302 00:21:35,000 --> 00:21:38,000 that. And, you do this for all k. 303 00:21:38,000 --> 00:21:40,000 In particular, you might actually compute 304 00:21:40,000 --> 00:21:43,000 something smaller than this min because I didn't put 305 00:21:43,000 --> 00:21:46,000 superscripts up here. But that's just making paths 306 00:21:46,000 --> 00:21:49,000 even better. OK, so you have to argue that 307 00:21:49,000 --> 00:21:51,000 relaxation is always a good thing to do. 308 00:21:51,000 --> 00:21:53,000 So, by not putting superscripts, 309 00:21:53,000 --> 00:21:56,000 maybe I do some more relaxation, but more relaxation 310 00:21:56,000 --> 00:21:59,000 never hurts us. You can still argue correctness 311 00:21:59,000 --> 00:22:03,000 using this claim.
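The bottom-up loop described above (for each m, for all i and j, for all k, relax) can be sketched roughly as follows. This is an illustration, not the board pseudocode: the function name is made up, vertices are 0-based, the matrix `A` is assumed to be a weighted adjacency matrix with a zero diagonal, and the graph is assumed to have no negative weight cycles. As in the lecture, `d` is updated in place with no superscripts, so some entries may improve earlier than the recurrence strictly requires.

```python
INF = float("inf")

def slow_all_pairs(A):
    """All pair shortest path weights by repeated relaxation, about n^4 time.

    A is an n-by-n weighted adjacency matrix (infinity where there is no edge,
    assumed zero on the diagonal). Assumes no negative weight cycles.
    """
    n = len(A)
    # m = 0: zero weight from a vertex to itself, infinity otherwise.
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for _ in range(1, n):              # m = 1, ..., n - 1
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    # Relaxation: relative to source i, relax edge (k, j).
                    if d[i][j] > d[i][k] + A[k][j]:
                        d[i][j] = d[i][k] + A[k][j]
    return d
```

The four nested loops each run about n times, which is where the n^4 running time comes from.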
So, it's not quite the direct 312 00:22:03,000 --> 00:22:05,000 implementation, but there you go, 313 00:22:05,000 --> 00:22:10,000 dynamic programming algorithm. The main reason I'll write it 314 00:22:10,000 --> 00:22:14,000 down: so you see that it's a relaxation, and you see the 315 00:22:14,000 --> 00:22:18,000 running time is n^4, OK, which is certainly no 316 00:22:18,000 --> 00:22:22,000 better than Bellman-Ford. Bellman-Ford was n^4 even in 317 00:22:22,000 --> 00:22:26,000 the dense case, and it's a little better in the 318 00:22:26,000 --> 00:22:30,000 sparse case. So: not doing so great. 319 00:22:30,000 --> 00:22:34,000 But it's a start. OK, it gets our dynamic 320 00:22:34,000 --> 00:22:41,000 programming minds thinking. And, we'll get a better dynamic 321 00:22:41,000 --> 00:22:47,000 program in a moment. But first, there's actually 322 00:22:47,000 --> 00:22:52,000 something useful we can do with this formulation, 323 00:22:52,000 --> 00:22:59,000 and I guess I'll ask, but I'll be really impressed if 324 00:22:59,000 --> 00:23:04,000 anyone can see. Does this formula look like 325 00:23:04,000 --> 00:23:09,000 anything else that you've seen in any context, 326 00:23:09,000 --> 00:23:15,000 mathematical or algorithmic? Have you seen that recurrence 327 00:23:15,000 --> 00:23:20,000 anywhere else? OK, not exactly as stated, 328 00:23:20,000 --> 00:23:24,000 but similar. I'm sure if you thought about 329 00:23:24,000 --> 00:23:30,000 it for awhile, you could come up with it. 330 00:23:30,000 --> 00:23:33,000 Any answers? I didn't think you would be 331 00:23:33,000 --> 00:23:36,000 very intuitive, but the answer is matrix 332 00:23:36,000 --> 00:23:39,000 multiplication. And it may now be obvious to 333 00:23:39,000 --> 00:23:43,000 you, or it may not. You have to think with the 334 00:23:43,000 --> 00:23:47,000 right quirky mind. Then it's obvious that it's 335 00:23:47,000 --> 00:23:50,000 matrix multiplication. 
Remember, matrix 336 00:23:50,000 --> 00:23:52,000 multiplication, we have A, B, 337 00:23:52,000 --> 00:23:55,000 and C. They're all n by n matrices. 338 00:23:55,000 --> 00:24:00,000 And, we want to compute C equals A times B. 339 00:24:00,000 --> 00:24:04,000 And what that meant was, well, c_ij was a sum over all k 340 00:24:04,000 --> 00:24:08,000 of a_ik times b_kj. All right, that was our 341 00:24:08,000 --> 00:24:11,000 definition of matrix multiplication. 342 00:24:11,000 --> 00:24:15,000 And that formula looks kind of like this one. 343 00:24:15,000 --> 00:24:19,000 I mean, notice the subscripts: ik and kj. 344 00:24:19,000 --> 00:24:22,000 Now, the operators are a little different. 345 00:24:22,000 --> 00:24:27,000 Here, we're multiplying the inside things and adding them 346 00:24:27,000 --> 00:24:34,000 all together. There, we're adding the inside 347 00:24:34,000 --> 00:24:41,000 things and taking the min. But other than that, 348 00:24:41,000 --> 00:24:47,000 it's the same. OK, weird, but here we go. 349 00:24:47,000 --> 00:24:55,000 So, the connection to shortest paths is you replace these 350 00:24:55,000 --> 00:25:00,000 operators. So, let's take matrix 351 00:25:00,000 --> 00:25:05,000 multiplication and replace, what should I do first, 352 00:25:05,000 --> 00:25:10,000 plus with min. So, why not just change the 353 00:25:10,000 --> 00:25:13,000 operators, replace dot with plus? 354 00:25:13,000 --> 00:25:18,000 This is just a different algebra to work in, 355 00:25:18,000 --> 00:25:23,000 where plus actually means min, and dot actually means plus. 356 00:25:23,000 --> 00:25:29,000 So, you have to check that things sort of work out in that 357 00:25:29,000 --> 00:25:35,000 context, but if we do that, then we get that c_ij is the 358 00:25:35,000 --> 00:25:39,000 min over all k of a_ik plus, a bit messy here, 359 00:25:39,000 --> 00:25:44,000 b_kj.
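The modified product, with plus replaced by min and times replaced by plus, can be sketched directly from the formula c_ij = min over k of (a_ik + b_kj). The function name and the list-of-lists matrix layout are assumptions for illustration.

```python
INF = float("inf")

def min_plus_product(A, B):
    """'Funny' matrix product: c_ij = min over k of (a_ik + b_kj).

    A and B are n-by-n lists of lists; infinity marks a missing entry.
    """
    n = len(A)
    return [
        [min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
```

It has the same triple-loop shape as the standard product, so it also takes about n^3 time.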
And that looks like what we 360 00:25:44,000 --> 00:25:49,000 actually want to compute, here, for one value of m, 361 00:25:49,000 --> 00:25:52,000 you have to sort of do this m times. 362 00:25:52,000 --> 00:25:56,000 But this conceptually is d_ij^(m), and this is 363 00:25:56,000 --> 00:25:59,000 d_ik^(m-1). So, this is looking like a 364 00:25:59,000 --> 00:26:04,000 matrix product, which is kind of cool. 365 00:26:04,000 --> 00:26:11,000 So, if we sort of plug in this claim, then, and think about 366 00:26:11,000 --> 00:26:17,000 things as matrices, the recurrence gives us, 367 00:26:17,000 --> 00:26:25,000 and I'll just write this now in matrix form, that d^(m) is 368 00:26:25,000 --> 00:26:30,000 d^(m-1), funny product, A. 369 00:26:30,000 --> 00:26:32,000 All right, so these are the weights. 370 00:26:32,000 --> 00:26:34,000 This was the weighted adjacency matrix. 371 00:26:34,000 --> 00:26:38,000 This was the previous d value. This is the new d value. 372 00:26:38,000 --> 00:26:41,000 So, I'll just rewrite that in matrix form with capital 373 00:26:41,000 --> 00:26:43,000 letters. OK, I have to circle things 374 00:26:43,000 --> 00:26:47,000 that are using this funny algebra, so, in particular, 375 00:26:47,000 --> 00:26:49,000 circled product. OK, so that's kind of nifty. 376 00:26:49,000 --> 00:26:52,000 We know something about computing matrix 377 00:26:52,000 --> 00:26:54,000 multiplications. We can do it in n^3 time. 378 00:26:54,000 --> 00:26:57,000 If we were a bit fancier, maybe we could do it in 379 00:26:57,000 --> 00:27:02,000 sub-cubic time. So, we could try to sort of use 380 00:27:02,000 --> 00:27:07,000 this connection. And, well, think about what we 381 00:27:07,000 --> 00:27:10,000 are computing here. We are saying, 382 00:27:10,000 --> 00:27:14,000 well, d to the m is the previous one times A. 383 00:27:14,000 --> 00:27:19,000 So, what is d^(m)?
Is that some other algebraic 384 00:27:19,000 --> 00:27:23,000 notion that we know? Yeah, it's the exponent. 385 00:27:23,000 --> 00:27:27,000 We're taking A, and we want to raise it to the 386 00:27:27,000 --> 00:27:33,000 power, m, with this funny notion of product. 387 00:27:33,000 --> 00:27:36,000 So, in other words, d to the m is really just A to 388 00:27:36,000 --> 00:27:40,000 the m in a funny way. So, I'll circle it, 389 00:27:40,000 --> 00:27:41,000 OK? So, that sounds good. 390 00:27:41,000 --> 00:27:46,000 We also know how to compute powers of things relatively 391 00:27:46,000 --> 00:27:50,000 quickly, if you remember how. OK, for this notion, 392 00:27:50,000 --> 00:27:52,000 this power notion, to make sense, 393 00:27:52,000 --> 00:27:55,000 I should say what A to the zero means. 394 00:27:55,000 --> 00:28:00,000 And so, I need some kind of identity matrix. 395 00:28:00,000 --> 00:28:02,000 And for here, the identity matrix is this 396 00:28:02,000 --> 00:28:06,000 one, if I get it right. So, it has zeros along the 397 00:28:06,000 --> 00:28:09,000 diagonal, and infinities everywhere else. 398 00:28:09,000 --> 00:28:12,000 OK, that's sort of just to match this definition. 399 00:28:12,000 --> 00:28:16,000 d_ij^(0) should be zeros on the diagonal and infinity 400 00:28:16,000 --> 00:28:19,000 everywhere else. But you can check this is 401 00:28:19,000 --> 00:28:23,000 actually an identity. If you multiply it with this 402 00:28:23,000 --> 00:28:26,000 funny multiplication against any other matrix, 403 00:28:26,000 --> 00:28:31,000 you get the matrix back. Nothing changes. 404 00:28:31,000 --> 00:28:34,000 This really is a valid identity matrix. 405 00:28:34,000 --> 00:28:40,000 And, I should mention that for A to the m to make sense, 406 00:28:40,000 --> 00:28:44,000 you really need that your product operation is 407 00:28:44,000 --> 00:28:48,000 associative.
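The claim that the matrix with zeros on the diagonal and infinities elsewhere acts as an identity under the funny product can be checked numerically. The sketch below uses hypothetical helper names and a list-of-lists layout; it only spot-checks one matrix and is not a proof.

```python
INF = float("inf")

def min_plus_product(A, B):
    """c_ij = min over k of (a_ik + b_kj)."""
    n = len(A)
    return [
        [min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def min_plus_identity(n):
    """Zeros along the diagonal, infinities everywhere else."""
    return [[0 if i == j else INF for j in range(n)] for i in range(n)]

def is_identity_for(A):
    """Check that multiplying A by the identity on either side gives A back."""
    I = min_plus_identity(len(A))
    return min_plus_product(I, A) == A and min_plus_product(A, I) == A
```

For example, `is_identity_for([[0, 3, 8], [INF, 0, -4], [2, INF, 0]])` evaluates to `True`: every term that goes through an infinity entry loses the min, so only the zero on the diagonal contributes.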
So, actually A to the m circled 408 00:28:48,000 --> 00:28:54,000 makes sense because circled multiplication is associative, 409 00:28:54,000 --> 00:28:58,000 and you can check that; not hard because, 410 00:28:58,000 --> 00:29:03,000 I mean, min is associative, and addition is associative, 411 00:29:03,000 --> 00:29:10,000 and all sorts of good stuff. And, you have some kind of 412 00:29:10,000 --> 00:29:14,000 distributivity property. And, this is, 413 00:29:14,000 --> 00:29:18,000 in turn, because the real numbers with, 414 00:29:18,000 --> 00:29:23,000 let me get the right order here, with min as your addition 415 00:29:23,000 --> 00:29:29,000 operation, and plus as your multiplication operation, is a 416 00:29:29,000 --> 00:29:34,000 closed semi-ring. So, if ever you want to know 417 00:29:34,000 --> 00:29:37,000 when powers make sense, this is a good rule. 418 00:29:37,000 --> 00:29:42,000 If you have a closed semi-ring, then matrix products on that 419 00:29:42,000 --> 00:29:46,000 semi-ring will give you an associative operator, 420 00:29:46,000 --> 00:29:49,000 and then, good, you can take powers. 421 00:29:49,000 --> 00:29:54,000 OK, that's just some formalism. So now, we have some intuition. 422 00:29:54,000 --> 00:29:57,000 The question is, what's the right 423 00:29:57,000 --> 00:30:00,000 algorithm? There are many possible 424 00:30:00,000 --> 00:30:06,000 answers, some of which are right, some of which are not. 425 00:30:06,000 --> 00:30:09,000 So, we have this connection to matrix products, 426 00:30:09,000 --> 00:30:13,000 and we have a connection to matrix powers. 427 00:30:13,000 --> 00:30:15,000 And, we have algorithms for both. 428 00:30:15,000 --> 00:30:18,000 The question is, what should we do? 429 00:30:18,000 --> 00:30:23,000 So, all we need to do now is to compute A to the funny power, 430 00:30:23,000 --> 00:30:26,000 n minus one.
n minus one is when we get 431 00:30:26,000 --> 00:30:29,000 shortest paths, assuming we have no negative 432 00:30:29,000 --> 00:30:34,000 weight cycles. In fact, we could compute a 433 00:30:34,000 --> 00:30:39,000 larger power than n minus one. Once you get beyond n minus 434 00:30:39,000 --> 00:30:43,000 one, multiplying by A doesn't change anything anymore. 435 00:30:43,000 --> 00:30:47,000 So, how should we do it? OK, you're not giving any smart 436 00:30:47,000 --> 00:30:50,000 answers. I'll give the stupid answer. 437 00:30:50,000 --> 00:30:53,000 You could say, well, I take A. 438 00:30:53,000 --> 00:30:56,000 I multiply it by A. Then I multiply it by A, 439 00:30:56,000 --> 00:31:00,000 and I multiply it by A, and I use normal, 440 00:31:00,000 --> 00:31:04,000 boring matrix multiplication. 441 00:31:04,000 --> 00:31:07,000 So, I do, like, n minus two, 442 00:31:07,000 --> 00:31:13,000 standard matrix multiplies. So, standard multiply costs, 443 00:31:13,000 --> 00:31:17,000 like, n^3. And I'm doing n of them. 444 00:31:17,000 --> 00:31:23,000 So, this gives me an n^4 algorithm, and I compute all the 445 00:31:23,000 --> 00:31:26,000 shortest path weights in n^4. Woohoo! 446 00:31:26,000 --> 00:31:31,000 OK, no improvement. So, how can I do better? 447 00:31:31,000 --> 00:31:36,000 Right, a natural thing to try, which sadly does not work, 448 00:31:36,000 --> 00:31:40,000 is to use the sub-cubic matrix multiply algorithm. 449 00:31:40,000 --> 00:31:44,000 We will, in some sense, get there in a moment with a 450 00:31:44,000 --> 00:31:48,000 somewhat simpler problem. But, it's actually not known 451 00:31:48,000 --> 00:31:53,000 how to compute shortest paths using fast matrix multiplication 452 00:31:53,000 --> 00:31:55,000 like Strassen's algorithm. 453 00:31:55,000 --> 00:32:00,000 But, good suggestion. OK, you have to think about why 454 00:32:00,000 --> 00:32:04,000 it doesn't work, and I'll tell you.
455 00:32:04,000 --> 00:32:07,000 It's not obvious, so it's a perfectly reasonable 456 00:32:07,000 --> 00:32:10,000 suggestion. But in this context it doesn't 457 00:32:10,000 --> 00:32:12,000 quite work. It will come up in a few 458 00:32:12,000 --> 00:32:14,000 moments. The problem is, 459 00:32:14,000 --> 00:32:17,000 Strassen requires the notion of subtraction. 460 00:32:17,000 --> 00:32:21,000 And here, addition is min. And, there's no inverse to min. 461 00:32:21,000 --> 00:32:25,000 Once you take the arguments, you can't sort of undo a min. 462 00:32:25,000 --> 00:32:28,000 OK, so there's no notion of subtraction, so it's not known 463 00:32:28,000 --> 00:32:32,000 how to pull that off, sadly. 464 00:32:32,000 --> 00:32:35,000 So, what other tricks do we have up our sleeve? 465 00:32:35,000 --> 00:32:37,000 Yeah? Divide and conquer, 466 00:32:37,000 --> 00:32:41,000 log n powering, yeah, repeated squaring. 467 00:32:41,000 --> 00:32:44,000 That works. Good, we had a fancy way. 468 00:32:44,000 --> 00:32:47,000 If you had a number n, you sort of looked at the 469 00:32:47,000 --> 00:32:52,000 binary number representation of n, and you either squared the 470 00:32:52,000 --> 00:32:57,000 number or squared it and added another factor of A. 471 00:32:57,000 --> 00:33:02,000 Here, we don't even have to be smart about it. 472 00:33:02,000 --> 00:33:07,000 OK, we can just compute, we really only have to think 473 00:33:07,000 --> 00:33:11,000 about powers of two. What we want to know, 474 00:33:11,000 --> 00:33:17,000 and I'm going to need a bigger font here because there's 475 00:33:17,000 --> 00:33:22,000 multiple levels of subscripts, A to the circled power, 476 00:33:22,000 --> 00:33:28,000 two to the ceiling of log n. Actually, n minus one would be 477 00:33:28,000 --> 00:33:32,000 enough. But there you go. 
478 00:33:32,000 --> 00:33:35,000 You can write n if you didn't leave yourself enough space like 479 00:33:35,000 --> 00:33:37,000 me, n the ceiling, n the circle. 480 00:33:37,000 --> 00:33:41,000 This just means the next power of two after n minus one, 481 00:33:41,000 --> 00:33:44,000 two to the ceiling log. So, we don't have to go 482 00:33:44,000 --> 00:33:47,000 directly to n minus one. We can go further because 483 00:33:47,000 --> 00:33:51,000 anything farther than n minus one is still just the shortest 484 00:33:51,000 --> 00:33:53,000 path weights. If you look at the definition, 485 00:33:53,000 --> 00:33:57,000 and you know that your paths are simple, which is true if you 486 00:33:57,000 --> 00:34:02,000 have no negative weight cycles, then fine, just go farther. 487 00:34:02,000 --> 00:34:04,000 Why not? And so, to compute this, 488 00:34:04,000 --> 00:34:09,000 we just do ceiling of log n minus one products, 489 00:34:09,000 --> 00:34:13,000 just take A squared, and then take the result and 490 00:34:13,000 --> 00:34:17,000 square it; take the result and square it. 491 00:34:17,000 --> 00:34:20,000 So, this is order log n squarings. 492 00:34:20,000 --> 00:34:25,000 And, we don't know how to use Strassen, but we can use the 493 00:34:25,000 --> 00:34:30,000 boring, standard multiply of n^3, and that gives us n^3 log n 494 00:34:30,000 --> 00:34:34,000 running time, OK, which finally is something 495 00:34:34,000 --> 00:34:40,000 that beats Bellman-Ford in the dense case. 496 00:34:40,000 --> 00:34:43,000 OK, in the dense case, Bellman-Ford was n^4. 497 00:34:43,000 --> 00:34:46,000 Here we get n^3 log n, finally something better. 498 00:34:46,000 --> 00:34:49,000 In the sparse case, it's about the same, 499 00:34:49,000 --> 00:34:52,000 maybe a little worse. E is order V. 500 00:34:52,000 --> 00:34:55,000 Then we're going to get, like, V^3 for Bellman-Ford. 501 00:34:55,000 --> 00:34:59,000 Here, we get n^3 log n.
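The repeated-squaring idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lecture's own pseudocode: the names `apsp_by_squaring` and `min_plus_product` are invented here, the graph is assumed to have no negative-weight cycles, and the 5-vertex path graph is a made-up example.

```python
import math

INF = math.inf

def min_plus_product(X, Y):
    """(min, +) matrix product, n^3 time."""
    n = len(X)
    return [[min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apsp_by_squaring(A):
    """All-pairs shortest path weights via O(log n) min-plus squarings.

    A is the weighted adjacency matrix (0 on the diagonal, INF for no edge).
    Assumes there are no negative-weight cycles.
    """
    n = len(A)
    D = [row[:] for row in A]
    m = 1  # invariant: D holds shortest path weights using at most m edges
    while m < n - 1:
        D = min_plus_product(D, D)  # at most m edges -> at most 2m edges
        m *= 2
    return D

# Hypothetical example: a directed path 0 -> 1 -> 2 -> 3 -> 4.
n = 5
A = [[0 if i == j else INF for j in range(n)] for i in range(n)]
A[0][1], A[1][2], A[2][3], A[3][4] = 3, -1, 2, 4

D = apsp_by_squaring(A)
```

Overshooting n minus one is harmless here for the reason given above: without negative-weight cycles, extra edges never produce a shorter path.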
OK, after log factors, 502 00:34:59,000 --> 00:35:03,000 this is an improvement some of the time. 503 00:35:03,000 --> 00:35:05,000 OK, it's about the same the other times. 504 00:35:05,000 --> 00:35:09,000 Another nifty thing that you get for free out of this, 505 00:35:09,000 --> 00:35:13,000 is you can detect negative weight cycles. 506 00:35:13,000 --> 00:35:16,000 So, here's a bit of a puzzle. How would I detect, 507 00:35:16,000 --> 00:35:21,000 after I compute this product, A to the power two to the ceiling of log n 508 00:35:21,000 --> 00:35:25,000 minus one, how would I know if I found a negative weight cycle? 509 00:35:25,000 --> 00:35:30,000 What would that mean in this matrix of all the shortest 510 00:35:30,000 --> 00:35:34,000 paths of, at most, a certain length? 511 00:35:34,000 --> 00:35:36,000 If I found a cycle, what would have to be in that 512 00:35:36,000 --> 00:35:37,000 matrix? Yeah? 513 00:35:37,000 --> 00:35:39,000 Right, so I could, for example, 514 00:35:39,000 --> 00:35:41,000 take this thing, multiply it by A, 515 00:35:41,000 --> 00:35:43,000 see if the matrix changed at all. 516 00:35:43,000 --> 00:35:45,000 Right, that works fine. That's what we do in 517 00:35:45,000 --> 00:35:48,000 Bellman-Ford. There's an even simpler thing. 518 00:35:48,000 --> 00:35:51,000 It's already there. You don't have to multiply. 519 00:35:51,000 --> 00:35:52,000 But that's the same running time. 520 00:35:52,000 --> 00:35:55,000 That's a good answer. The diagonal would have a 521 00:35:55,000 --> 00:35:56,000 negative value, yeah. 522 00:35:56,000 --> 00:36:04,000 So, this is just a cute thing. Both approaches would work: you 523 00:36:04,000 --> 00:36:15,000 can detect a negative weight cycle just by looking at the 524 00:36:15,000 --> 00:36:24,000 diagonal of the matrix. You just look for a negative 525 00:36:24,000 --> 00:36:30,000 value in the diagonal. OK. 526 00:36:30,000 --> 00:36:32,000 So, that's algorithm one, let's say.
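Here is a small sketch of the diagonal test. One detail is an assumption added for this example and not something stated in the lecture: the loop keeps squaring until the power is at least n rather than n minus one, so that a negative cycle passing through all n vertices is certain to show up on the diagonal. The function names and the two test graphs are hypothetical.

```python
import math

INF = math.inf

def min_plus_product(X, Y):
    """(min, +) matrix product, n^3 time."""
    n = len(X)
    return [[min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def has_negative_cycle(A):
    """Report whether some diagonal entry of a high enough min-plus power is negative."""
    n = len(A)
    D = [row[:] for row in A]
    m = 1  # D covers paths with at most m edges
    while m < n:  # assumption: go to at least n edges so every simple cycle fits
        D = min_plus_product(D, D)
        m *= 2
    # A negative entry D[i][i] means a negative-weight walk from i back to i.
    return any(D[i][i] < 0 for i in range(n))

# Hypothetical examples.
negative = [[0, 1], [-3, 0]]          # cycle 0 -> 1 -> 0 has weight -2
positive = [[0, 4, INF],
            [INF, 0, -2],
            [1, INF, 0]]              # cycle 0 -> 1 -> 2 -> 0 has weight 3

found = has_negative_cycle(negative)
not_found = has_negative_cycle(positive)
```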
527 00:36:32,000 --> 00:36:37,000 I mean, we've seen several that are all bad, but I'll call this 528 00:36:37,000 --> 00:36:39,000 number one. OK, we'll see two more. 529 00:36:39,000 --> 00:36:44,000 This is the only one that will, well, I shouldn't say that. 530 00:36:44,000 --> 00:36:47,000 Fine, there we go. So, this is one dynamic program 531 00:36:47,000 --> 00:36:51,000 that wasn't so helpful, except it showed us a 532 00:36:51,000 --> 00:36:53,000 connection to matrix multiplication, 533 00:36:53,000 --> 00:36:57,000 which is interesting. We'll see why it's useful a 534 00:36:57,000 --> 00:37:02,000 little bit more. But, it led to these nasty four 535 00:37:02,000 --> 00:37:04,000 nested loops. And, using this trick, 536 00:37:04,000 --> 00:37:08,000 we got down to n^3 log n. Let's try for just n^3. 537 00:37:08,000 --> 00:37:11,000 OK, just get rid of that log. It's annoying. 538 00:37:11,000 --> 00:37:15,000 It makes you a little bit worse than Bellman-Ford 539 00:37:15,000 --> 00:37:18,000 in the sparse case. So, let's just erase one of 540 00:37:18,000 --> 00:37:21,000 these nested loops. OK, I want to do that. 541 00:37:21,000 --> 00:37:25,000 OK, obviously that algorithm doesn't work because it 542 00:37:25,000 --> 00:37:28,000 refers to k, and k is not defined, but, 543 00:37:28,000 --> 00:37:31,000 you know, I've got enough variables. 544 00:37:31,000 --> 00:37:35,000 Why don't I just define k to be m? 545 00:37:35,000 --> 00:37:39,000 OK, it turns out that works. I'll do it from scratch, 546 00:37:39,000 --> 00:37:42,000 but why not? I don't know if that's how 547 00:37:42,000 --> 00:37:47,000 Floyd and Warshall came up with their algorithm, 548 00:37:47,000 --> 00:37:50,000 but here you go. Here's Floyd-Warshall.
549 00:37:50,000 --> 00:37:55,000 The idea is to define the subproblems a little bit more 550 00:37:55,000 --> 00:37:59,000 cleverly so that to compute one of these values, 551 00:37:59,000 --> 00:38:04,000 you don't have to take the min of n things. 552 00:38:04,000 --> 00:38:06,000 I just want to take the min of two things. 553 00:38:06,000 --> 00:38:09,000 If I could do that, and I still only have n^3 554 00:38:09,000 --> 00:38:12,000 subproblems, then I would have n^3 time. 555 00:38:12,000 --> 00:38:14,000 So, all right, the running time of a dynamic 556 00:38:14,000 --> 00:38:18,000 program is the number of subproblems times the time to compute the 557 00:38:18,000 --> 00:38:22,000 recurrence for one subproblem. So, here's linear times n^3, 558 00:38:22,000 --> 00:38:26,000 and we want n^3 times constant. That would be good. 559 00:38:26,000 --> 00:38:29,000 So that's Floyd-Warshall. So, here's the way we're going 560 00:38:29,000 --> 00:38:35,000 to redefine c_ij. Or I guess, there it was called 561 00:38:35,000 --> 00:38:39,000 d_ij. Good, so we're going to define 562 00:38:39,000 --> 00:38:43,000 something new. So, c_ij superscript k is now 563 00:38:43,000 --> 00:38:50,000 going to be the weight of the shortest path from i to j as 564 00:38:50,000 --> 00:38:54,000 before. Notice I used the superscript k 565 00:38:54,000 --> 00:39:00,000 instead of m because I want k and m to be the same thing. 566 00:39:00,000 --> 00:39:03,000 Deep. OK, now, here's the new 567 00:39:03,000 --> 00:39:05,000 constraint. I want all intermediate 568 00:39:05,000 --> 00:39:09,000 vertices along the path, meaning all vertices except for 569 00:39:09,000 --> 00:39:13,000 i and j at the beginning and the end, to have a small label. 570 00:39:13,000 --> 00:39:17,000 So, they should be in the set from one up to k. 571 00:39:17,000 --> 00:39:21,000 And this is where we are really using that our vertices are 572 00:39:21,000 --> 00:39:24,000 labeled one up to n.
So, I'm going to say, 573 00:39:24,000 --> 00:39:28,000 well, first think about the shortest paths that don't use 574 00:39:28,000 --> 00:39:32,000 any other vertices. That's when k is zero. 575 00:39:32,000 --> 00:39:35,000 Then think about all the shortest paths that maybe 576 00:39:35,000 --> 00:39:38,000 use vertex one. And then think about the 577 00:39:38,000 --> 00:39:41,000 shortest paths that maybe use vertex one or vertex two. 578 00:39:41,000 --> 00:39:43,000 Why not? You could define it in this 579 00:39:43,000 --> 00:39:44,000 way. It turns out 580 00:39:44,000 --> 00:39:48,000 that when you increase k, you only have to think about 581 00:39:48,000 --> 00:39:51,000 one new vertex. Here, we had to take min over 582 00:39:51,000 --> 00:39:53,000 all k. Now we know which k to look at. 583 00:39:53,000 --> 00:39:57,000 OK, maybe that made sense. Maybe it's not quite obvious 584 00:39:57,000 --> 00:39:59,000 yet. But I'm going to redo this 585 00:39:59,000 --> 00:40:04,000 claim, redo a recurrence. So, maybe first I should say 586 00:40:04,000 --> 00:40:07,000 some obvious things. So, if I want delta of i, j, 587 00:40:07,000 --> 00:40:10,000 the shortest path weight, well, just take all the 588 00:40:10,000 --> 00:40:13,000 vertices. So, take c_ij superscript n. 589 00:40:13,000 --> 00:40:15,000 That's everything. And this even works, 590 00:40:15,000 --> 00:40:19,000 this is true even if you have a negative weight cycle. 591 00:40:19,000 --> 00:40:22,000 Although, again, we're going to sort of ignore 592 00:40:22,000 --> 00:40:26,000 negative weight cycles as long as we can detect them. 593 00:40:26,000 --> 00:40:29,000 And, another simple case is if you have, well, 594 00:40:29,000 --> 00:40:35,000 c_ij superscript zero. Let me put that in the claim to 595 00:40:35,000 --> 00:40:40,000 be a little bit more consistent here. 596 00:40:40,000 --> 00:40:47,000 So, here's the new claim.
If we want to compute c_ij 597 00:40:47,000 --> 00:40:50,000 superscript zero, what is it? 598 00:40:50,000 --> 00:40:58,000 Superscript zero means I really shouldn't use any intermediate 599 00:40:58,000 --> 00:41:03,000 vertices. So, this has a very simple 600 00:41:03,000 --> 00:41:09,000 answer, a three letter answer. So, it's not zero. 601 00:41:09,000 --> 00:41:12,000 It's four letters. What's that? 602 00:41:12,000 --> 00:41:15,000 Nil. No, not working yet. 603 00:41:15,000 --> 00:41:18,000 It has some subscripts, too. 604 00:41:18,000 --> 00:41:25,000 So, the definition would be, what's the shortest path weight 605 00:41:25,000 --> 00:41:31,000 from i to j when you're not allowed to use any intermediate 606 00:41:31,000 --> 00:41:34,000 vertices? Sorry? 607 00:41:34,000 --> 00:41:38,000 So, yeah, it has a very simple name. 608 00:41:38,000 --> 00:41:43,000 That's the tricky part. All right, so if i equals j, 609 00:41:43,000 --> 00:41:48,000 [LAUGHTER] you're clever, right, open bracket i equals j 610 00:41:48,000 --> 00:41:50,000 means one, well, OK. 611 00:41:50,000 --> 00:41:54,000 It sort of works, but it's not quite right. 612 00:41:54,000 --> 00:41:59,000 In fact, I want infinity if i does not equal j. 613 00:41:59,000 --> 00:42:05,000 And I want zero if i equals j. a_ij, good. 614 00:42:05,000 --> 00:42:07,000 I think it's a_ij. It should be, 615 00:42:07,000 --> 00:42:09,000 right? Maybe I'm wrong. 616 00:42:09,000 --> 00:42:12,000 Right, a_ij. So it's essentially not what I 617 00:42:12,000 --> 00:42:13,000 said. That's the point. 618 00:42:13,000 --> 00:42:17,000 If i does not equal j, you still have to think about a 619 00:42:17,000 --> 00:42:20,000 single edge connecting i to j, right? 620 00:42:20,000 --> 00:42:23,000 OK, so that's a bit of a subtlety. 621 00:42:23,000 --> 00:42:27,000 This is only intermediate vertices, so you could still go 622 00:42:27,000 --> 00:42:32,000 from i to j via a single edge. That will cost a_ij.
623 00:42:32,000 --> 00:42:34,000 If there is an edge, that's the edge weight; 624 00:42:34,000 --> 00:42:37,000 if there isn't one, infinity. That is a_ij. 625 00:42:37,000 --> 00:42:42,000 So, OK, that gets us started. And then, we want a recurrence. 626 00:42:42,000 --> 00:42:46,000 And, the recurrence is, well, maybe you get away with 627 00:42:46,000 --> 00:42:49,000 all the vertices that you had before. 628 00:42:49,000 --> 00:42:52,000 So, if you want to know paths that you had before, 629 00:42:52,000 --> 00:42:56,000 so if you want to know paths that use one up to k, 630 00:42:56,000 --> 00:43:01,000 maybe I just use one up to k minus one. 631 00:43:01,000 --> 00:43:04,000 You could try that. Or, you could try using k. 632 00:43:04,000 --> 00:43:07,000 So, either you use k or you don't. 633 00:43:07,000 --> 00:43:09,000 If you don't, it's got to be this. 634 00:43:09,000 --> 00:43:12,000 If you do, then you've got to go to k. 635 00:43:12,000 --> 00:43:17,000 So why not go to k at the end? So, you go from i to k using 636 00:43:17,000 --> 00:43:21,000 the previous vertices. Obviously, you don't want to 637 00:43:21,000 --> 00:43:24,000 repeat k in there. And then, you go from k to j 638 00:43:24,000 --> 00:43:29,000 somehow using vertices that are not k. 639 00:43:29,000 --> 00:43:31,000 This should be pretty intuitive. 640 00:43:31,000 --> 00:43:35,000 Again, I can draw a picture. So, either you never go to k, 641 00:43:35,000 --> 00:43:40,000 and that's this wiggly line. You go from i to j using things 642 00:43:40,000 --> 00:43:43,000 only one up to k minus one. In other words, 643 00:43:43,000 --> 00:43:45,000 here we have to use one up to k. 644 00:43:45,000 --> 00:43:48,000 So, this just means don't use k. 645 00:43:48,000 --> 00:43:52,000 So, that's this thing. Or, you use k somewhere in the 646 00:43:52,000 --> 00:43:55,000 middle there. OK, it's got to be one of the 647 00:43:55,000 --> 00:43:57,000 two.
And in this case, 648 00:43:57,000 --> 00:44:00,000 you go from i to k using only smaller vertices, 649 00:44:00,000 --> 00:44:05,000 because you don't want to repeat k. 650 00:44:05,000 --> 00:44:10,000 And here, you go from k to j using only smaller labeled 651 00:44:10,000 --> 00:44:14,000 vertices. So, every path is one of the 652 00:44:14,000 --> 00:44:18,000 two. So, we take the shortest of 653 00:44:18,000 --> 00:44:22,000 these two subproblems. That's the answer. 654 00:44:22,000 --> 00:44:26,000 So, now we have a min of two things. 655 00:44:26,000 --> 00:44:29,000 It takes constant time to compute. 656 00:44:29,000 --> 00:44:36,000 So, we get a cubic algorithm. So, let me write it down. 657 00:44:36,000 --> 00:44:41,000 So, this is the Floyd-Warshall algorithm. 658 00:44:41,000 --> 00:44:46,000 I'll write the name again. You give it a matrix A. 659 00:44:46,000 --> 00:44:50,000 That's all it really needs to know. 660 00:44:50,000 --> 00:44:54,000 It encodes everything. You copy A to C. 661 00:44:54,000 --> 00:44:58,000 That's the warm up. Right at time zero, 662 00:44:58,000 --> 00:45:03,000 C equals A. And then you just have these 663 00:45:03,000 --> 00:45:07,000 three loops for every value of k, for every value of i, 664 00:45:07,000 --> 00:45:10,000 and for every value of j. You compute that min. 665 00:45:10,000 --> 00:45:15,000 And if you think about it a little bit, that min is a 666 00:45:15,000 --> 00:45:18,000 relaxation. Surprise, surprise. 667 00:45:47,000 --> 00:45:51,000 So, that is the Floyd-Warshall algorithm. 668 00:45:51,000 --> 00:45:58,000 And, the running time is clearly n^3, three nested loops, 669 00:45:58,000 --> 00:46:02,000 constant time inside. So, we're finally getting 670 00:46:02,000 --> 00:46:05,000 something that is never worse than Bellman-Ford. 671 00:46:05,000 --> 00:46:06,000 In the sparse case, it's the same.
672 00:46:06,000 --> 00:46:09,000 And anything denser, where the number of edges is super 673 00:46:09,000 --> 00:46:11,000 linear, this is strictly better than 674 00:46:11,000 --> 00:46:13,000 Bellman-Ford. And, it's better than 675 00:46:13,000 --> 00:46:16,000 everything we've seen so far for all-pairs shortest paths. 676 00:46:16,000 --> 00:46:19,000 And, this handles negative weights; very simple algorithm, 677 00:46:19,000 --> 00:46:21,000 even simpler than the one before. 678 00:46:21,000 --> 00:46:23,000 It's just relaxation within three loops. 679 00:46:23,000 --> 00:46:27,000 What more could you ask for? And we need to check that this 680 00:46:27,000 --> 00:46:29,000 is indeed the min we're computing here, 681 00:46:29,000 --> 00:46:33,000 except that the superscripts are omitted. 682 00:46:33,000 --> 00:46:35,000 That's, again, a bit of hand waving. 683 00:46:35,000 --> 00:46:39,000 It's OK to omit superscripts because that can only mean that 684 00:46:39,000 --> 00:46:42,000 you're doing more relaxations than you should be. 685 00:46:42,000 --> 00:46:45,000 Doing more relaxations can never hurt you. 686 00:46:45,000 --> 00:46:48,000 In particular, we do all the ones that we have 687 00:46:48,000 --> 00:46:50,000 to. Therefore, we find the shortest 688 00:46:50,000 --> 00:46:52,000 path weights. And, again, here, 689 00:46:52,000 --> 00:46:55,000 we're assuming that there are no negative weight cycles. 690 00:46:55,000 --> 00:46:59,000 It shouldn't be hard to find them, but you have to think 691 00:46:59,000 --> 00:47:04,000 about that a little bit. OK, you could run another round 692 00:47:04,000 --> 00:47:07,000 of Bellman-Ford, see if it relaxes any 693 00:47:07,000 --> 00:47:09,000 edges again, for example. 694 00:47:09,000 --> 00:47:13,000 I think there's no nifty trick for that version. 695 00:47:13,000 --> 00:47:17,000 And, that's our second algorithm for 696 00:47:17,000 --> 00:47:21,000 all pairs shortest paths.
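For reference, here is a minimal Python sketch of the three nested loops just described. It follows the recurrence from the lecture, c_ij^(k) = min(c_ij^(k-1), c_ik^(k-1) + c_kj^(k-1)), with the superscripts dropped so that a single matrix is updated in place. Vertices are numbered from 0 rather than 1, and the example graph is made up; it assumes no negative-weight cycles.

```python
import math

INF = math.inf

def floyd_warshall(A):
    """All-pairs shortest path weights in n^3 time.

    A is the weighted adjacency matrix: A[i][i] = 0 and A[i][j] = INF
    when there is no edge from i to j.
    """
    n = len(A)
    C = [row[:] for row in A]  # C starts out as a copy of A
    for k in range(n):         # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # Relaxation: is going through k shorter than what we have?
                if C[i][k] + C[k][j] < C[i][j]:
                    C[i][j] = C[i][k] + C[k][j]
    return C

# Hypothetical 3-vertex graph: edges 0->1 (4), 1->2 (-2), 2->0 (1).
A = [
    [0,   4,   INF],
    [INF, 0,   -2],
    [1,   INF, 0],
]

C = floyd_warshall(A)
```

The order of the loops matters: k has to be the outermost loop, since each round assumes every pair has already been settled for intermediate vertices below k.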
Before we go on to the third 697 00:47:21,000 --> 00:47:26,000 algorithm, which is going to be the cleverest of them all, 698 00:47:26,000 --> 00:47:30,000 the one Ring to rule them all, to switch trilogies, 699 00:47:30,000 --> 00:47:33,000 we're going to take a little bit of a diversion, 700 00:47:33,000 --> 00:47:37,000 side story, whatever, and talk about transitive 701 00:47:37,000 --> 00:47:42,000 closure briefly. This is just a good thing to 702 00:47:42,000 --> 00:47:45,000 know about. And, it relates to the 703 00:47:45,000 --> 00:47:51,000 algorithms we've seen so far. So, here's the transitive closure 704 00:47:51,000 --> 00:47:54,000 problem. I give you a directed graph, 705 00:47:54,000 --> 00:47:59,000 and for all pairs of vertices, i and j, I want to compute this 706 00:47:59,000 --> 00:48:03,000 number. It's one if there's a path from 707 00:48:03,000 --> 00:48:06,000 i to j. From i to j, 708 00:48:06,000 --> 00:48:14,000 OK, and then zero otherwise. OK, this is sort of like a 709 00:48:14,000 --> 00:48:22,000 boring adjacency matrix with no weights, except it's about paths 710 00:48:22,000 --> 00:48:32,000 instead of being about edges. OK, so how can I compute this? 711 00:48:32,000 --> 00:48:39,000 That's very simple. How should I compute this? 712 00:48:39,000 --> 00:48:45,000 This gives me a graph in some sense. 713 00:48:45,000 --> 00:48:54,000 This is the adjacency matrix of a new graph called the transitive 714 00:48:54,000 --> 00:49:01,000 closure of my input graph. So, breadth first search, 715 00:49:01,000 --> 00:49:05,000 yeah, good. So, all I need to do is find 716 00:49:05,000 --> 00:49:08,000 shortest paths, and if the weights come out 717 00:49:08,000 --> 00:49:12,000 infinity, then there's no path. If it's less than infinity, 718 00:49:12,000 --> 00:49:15,000 then there's a path.
And so here, 719 00:49:15,000 --> 00:49:19,000 so you are saying maybe I don't care about the weights, 720 00:49:19,000 --> 00:49:22,000 so I can run breadth first search n times, 721 00:49:22,000 --> 00:49:27,000 and that will work indeed. So, if we do V times BFS, 722 00:49:27,000 --> 00:49:31,000 so it's maybe weird that I'm covering this here in the middle, 723 00:49:31,000 --> 00:49:36,000 but it's just an interlude. So, we have, 724 00:49:36,000 --> 00:49:42,000 then, something like V times E. OK, you can run any of these 725 00:49:42,000 --> 00:49:46,000 algorithms. You could take Floyd-Warshall 726 00:49:46,000 --> 00:49:48,000 for example. Why not? 727 00:49:48,000 --> 00:49:54,000 OK, then it would just be V^3. I mean, you could run any of 728 00:49:54,000 --> 00:50:00,000 these algorithms with weights of one or zero, and just check 729 00:50:00,000 --> 00:50:06,000 whether the values are infinity or not. 730 00:50:06,000 --> 00:50:10,000 So, I mean, t_ij equals zero, if and only if the shortest 731 00:50:10,000 --> 00:50:12,000 path weight from i to j is infinity. 732 00:50:12,000 --> 00:50:16,000 So, just solve this. This is an easier problem than 733 00:50:16,000 --> 00:50:18,000 shortest paths. It is, in fact, 734 00:50:18,000 --> 00:50:22,000 strictly easier in a certain sense, because what's going on 735 00:50:22,000 --> 00:50:26,000 with transitive closure, and I just want to mention this 736 00:50:26,000 --> 00:50:30,000 out of interest because transitive closure is a useful 737 00:50:30,000 --> 00:50:33,000 thing to know about. Essentially, 738 00:50:33,000 --> 00:50:36,000 what we are doing, let me get this right, 739 00:50:36,000 --> 00:50:39,000 is using a different set of operators.
740 00:50:39,000 --> 00:50:43,000 We're using or and and, a logical or and and instead of 741 00:50:43,000 --> 00:50:46,000 min and plus, OK, because we want to know, 742 00:50:46,000 --> 00:50:49,000 if you think about a relaxation, in some sense, 743 00:50:49,000 --> 00:50:53,000 maybe I should think about it in terms of this min. 744 00:50:53,000 --> 00:50:56,000 So, if I want to know, is there a path from i to j 745 00:50:56,000 --> 00:51:02,000 that uses vertices labeled one through k in the middle? 746 00:51:02,000 --> 00:51:05,000 Well, either there is a path that doesn't use the vertex k, 747 00:51:05,000 --> 00:51:09,000 or there is a path that uses k, and then it would have to look 748 00:51:09,000 --> 00:51:12,000 like that. OK, so there would have to be a 749 00:51:12,000 --> 00:51:15,000 path here, and there would have to be a path there. 750 00:51:15,000 --> 00:51:18,000 So, the min and plus get replaced with or and and. 751 00:51:18,000 --> 00:51:21,000 And if you remember, this used to be plus, 752 00:51:21,000 --> 00:51:24,000 and this used to be product in the matrix world. 753 00:51:24,000 --> 00:51:28,000 So, plus is now like or. And, multiply is now like and, 754 00:51:28,000 --> 00:51:31,000 which sounds very good, right? 755 00:51:31,000 --> 00:51:35,000 Plus does feel like or, and multiply does feel like and 756 00:51:35,000 --> 00:51:40,000 if you live in a zero-one world. So, in fact, 757 00:51:40,000 --> 00:51:45,000 this is not quite the field Z mod two, but this is a good, 758 00:51:45,000 --> 00:51:49,000 nice, field to work in. This is the Boolean world. 759 00:51:49,000 --> 00:51:55,000 So, I'll just write Boole. Good old Boole knows all about 760 00:51:55,000 --> 00:51:58,000 this. It's like his master's thesis, 761 00:51:58,000 --> 00:52:03,000 I think, talking about Boolean algebra. 762 00:52:03,000 --> 00:52:06,000 And, this actually means that you can use fast matrix 763 00:52:06,000 --> 00:52:09,000 multiply.
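To make the swap of operators concrete, here is a sketch of the cubic version of transitive closure: the same three loops as Floyd-Warshall, with min replaced by or and plus replaced by and. The function name and the example graph are hypothetical, and treating every vertex as reachable from itself is an assumption chosen to match the convention that the shortest path weight from a vertex to itself is zero, not infinity. This sketch does not attempt the fast matrix multiply route.

```python
def transitive_closure(adj):
    """Return T where T[i][j] is True iff there is a path from i to j.

    adj is a 0/1 (or boolean) adjacency matrix of a directed graph.
    """
    n = len(adj)
    # Assumption: each vertex reaches itself via the empty path.
    T = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # "or" plays the role of min, "and" plays the role of plus.
                T[i][j] = T[i][j] or (T[i][k] and T[k][j])
    return T

# Hypothetical graph: 0 -> 1 -> 2, and vertex 3 is isolated.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

T = transitive_closure(adj)
```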
You can use Strassen's 764 00:52:09,000 --> 00:52:13,000 algorithm, and the fancier algorithms, and you can compute 765 00:52:13,000 --> 00:52:16,000 the transitive closure in subcubic time. 766 00:52:16,000 --> 00:52:19,000 So, this is sub cubic if the edges are sparse. 767 00:52:19,000 --> 00:52:24,000 But, it's cubic in the worst case if there are lots of edges. 768 00:52:24,000 --> 00:52:27,000 This is cubic. You can actually do better 769 00:52:27,000 --> 00:52:30,000 using Strassen. So, I'll just say you can do 770 00:52:30,000 --> 00:52:33,000 it. No details here. 771 00:52:33,000 --> 00:52:37,000 I think it should be, so in fact, there is a theorem. 772 00:52:37,000 --> 00:52:41,000 This is probably not in the textbook, but there's a theorem 773 00:52:41,000 --> 00:52:45,000 that says transitive closure is just as hard as matrix multiply. 774 00:52:45,000 --> 00:52:49,000 OK, they are equivalent. Their running times are the 775 00:52:49,000 --> 00:52:52,000 same. We don't know how long it takes 776 00:52:52,000 --> 00:52:55,000 to do a matrix multiply over a field. 777 00:52:55,000 --> 00:52:57,000 It's somewhere between n^2 and n^2.3. 778 00:52:57,000 --> 00:53:03,000 But, whatever the answer is: same for transitive closure. 779 00:53:03,000 --> 00:53:09,000 OK, there's the interlude. And that's where we actually 780 00:53:09,000 --> 00:53:16,000 get to use Strassen and friends. Remember, Strassen was n to the 781 00:53:16,000 --> 00:53:22,000 log base two of seven algorithm. Remember that, 782 00:53:22,000 --> 00:53:28,000 especially on the final. Those are things you should 783 00:53:28,000 --> 00:53:35,000 have at the tip of your tongue. OK, the last algorithm we're 784 00:53:35,000 --> 00:53:39,000 going to cover is really going to build on what we saw last 785 00:53:39,000 --> 00:53:43,000 time: Johnson's algorithm. And, I've lost some of the 786 00:53:43,000 --> 00:53:46,000 running times here. 
But, when we had unweighted 787 00:53:46,000 --> 00:53:50,000 graphs, we could do all pairs really fast, just as fast as a 788 00:53:50,000 --> 00:53:54,000 single source Bellman-Ford. That's kind of nifty. 789 00:53:54,000 --> 00:53:58,000 We don't know how to improve Bellman-Ford in the single 790 00:53:58,000 --> 00:54:02,000 source case. So, we can't really hope to get 791 00:54:02,000 --> 00:54:07,000 anything better than V times E. And, if you remember running V 792 00:54:07,000 --> 00:54:11,000 times Dijkstra, V times Dijkstra was about the 793 00:54:11,000 --> 00:54:14,000 same. So, just put this in the recall 794 00:54:14,000 --> 00:54:19,000 bubble here: V times Dijkstra would give us V times E plus V^2 795 00:54:19,000 --> 00:54:21,000 log V. And, if you ignore that log 796 00:54:21,000 --> 00:54:25,000 factor, this is just VE. OK, so this was really good. 797 00:54:25,000 --> 00:54:29,000 Dijkstra was great. And this was for nonnegative 798 00:54:29,000 --> 00:54:34,000 edge weights. So, with negative edge weights, 799 00:54:34,000 --> 00:54:38,000 somehow we'd like to get the same running time. 800 00:54:38,000 --> 00:54:41,000 Now, how might I get the same running time? 801 00:54:41,000 --> 00:54:45,000 Well, it would be really nice if I could use Dijkstra. 802 00:54:45,000 --> 00:54:49,000 Of course, Dijkstra doesn't work with negative weights. 803 00:54:49,000 --> 00:54:53,000 So what could I do? What would I hope to do? 804 00:54:53,000 --> 00:54:56,000 What could I hope to? Suppose I want, 805 00:54:56,000 --> 00:55:02,000 in the middle of the algorithm, it says run Dijkstra n times. 806 00:55:02,000 --> 00:55:05,000 Then, what should I do to prepare for that? 807 00:55:05,000 --> 00:55:09,000 Make all the weights positive, or nonnegative. 808 00:55:09,000 --> 00:55:13,000 Why not, right? We're doing wishful thinking. 809 00:55:13,000 --> 00:55:17,000 That's what we'll do. So, this is called graph 810 00:55:17,000 --> 00:55:21,000 re-weighting.
And, what's cool is we actually 811 00:55:21,000 --> 00:55:26,000 already know how to do it. We just don't know that we know 812 00:55:26,000 --> 00:55:30,000 how to do it. But I know that we know that we 813 00:55:30,000 --> 00:55:34,000 know how to do it. You don't yet know that we know 814 00:55:34,000 --> 00:55:39,000 that I know that we know how to do it. 815 00:55:39,000 --> 00:55:41,000 So, it turns out you can re-weight the vertices. 816 00:55:41,000 --> 00:55:44,000 So, at the end of the last class someone asked me, 817 00:55:44,000 --> 00:55:46,000 can you just, like, add the same weight to 818 00:55:46,000 --> 00:55:48,000 all the edges? That doesn't work. 819 00:55:48,000 --> 00:55:51,000 Not quite, because different paths have different numbers of 820 00:55:51,000 --> 00:55:53,000 edges. What we are going to do is add 821 00:55:53,000 --> 00:55:55,000 a particular weight to each vertex. 822 00:55:55,000 --> 00:55:58,000 What does that mean? Well, because we really only 823 00:55:58,000 --> 00:56:02,000 have weights on the edges, here's what we'll do. 824 00:56:02,000 --> 00:56:06,000 We'll re-weight each edge, so, (u,v), let's say, 825 00:56:06,000 --> 00:56:12,000 going to go back into graph speak instead of matrix speak, 826 00:56:12,000 --> 00:56:17,000 (u,v) instead of i and j, and we'll call this modified 827 00:56:17,000 --> 00:56:20,000 weight w_h. h is our function. 828 00:56:20,000 --> 00:56:24,000 It gives us a number for every vertex. 829 00:56:24,000 --> 00:56:30,000 And, it's just going to be the old weight of that edge plus the 830 00:56:30,000 --> 00:56:36,000 weight of the start vertex minus the weight of the terminating 831 00:56:36,000 --> 00:56:40,000 vertex. I'm sure these have good names. 832 00:56:40,000 --> 00:56:43,000 One of these is the head, and the other is the tail, 833 00:56:43,000 --> 00:56:47,000 but I can never remember which. OK, so we have a directed edge 834 00:56:47,000 --> 00:56:48,000 (u,v). 
Just add one of them; 835 00:56:48,000 --> 00:56:51,000 subtract the other. And, it's a directed edge, 836 00:56:51,000 --> 00:56:53,000 so that's a consistent definition. 837 00:56:53,000 --> 00:56:55,000 OK, so that's called re-weighting. 838 00:56:55,000 --> 00:56:58,000 Now, this is actually a theorem. 839 00:56:58,000 --> 00:57:03,000 If you do this, then, let's say, 840 00:57:03,000 --> 00:57:10,000 for any vertices, u and v in the graph, 841 00:57:10,000 --> 00:57:18,000 for any two vertices, all paths from u to v have the 842 00:57:18,000 --> 00:57:27,000 same weight as they did before, well, not quite. 843 00:57:27,000 --> 00:57:34,000 They have the same re-weighting. 844 00:57:34,000 --> 00:57:37,000 So, if you look at all the different paths and you say, 845 00:57:37,000 --> 00:57:39,000 well, what's the difference between, well, 846 00:57:39,000 --> 00:57:42,000 sorry, let's say delta, which is the old shortest 847 00:57:42,000 --> 00:57:45,000 path weights, and delta sub h, which is the shortest path 848 00:57:45,000 --> 00:57:48,000 weights according to this new weight function, 849 00:57:48,000 --> 00:57:50,000 then that difference is the same. 850 00:57:50,000 --> 00:57:53,000 So, we'll say that all these paths are re-weighted by the 851 00:57:53,000 --> 00:57:55,000 same amount. OK, this is actually a 852 00:57:55,000 --> 00:58:00,000 statement about all paths, not just shortest paths. 853 00:58:00,000 --> 00:58:05,000 There we go. OK, to how many people is this 854 00:58:05,000 --> 00:58:08,000 obvious already? A few, yeah, 855 00:58:08,000 --> 00:58:12,000 it is. And what's the one word? 856 00:58:12,000 --> 00:58:16,000 OK, it's maybe not that obvious. 857 00:58:16,000 --> 00:58:23,000 All right, shout out the word when you figure it out. 858 00:58:23,000 --> 00:58:29,000 Meanwhile, I'll write out this rather verbose proof. 859 00:58:29,000 --> 00:58:36,000 There's a one word proof, still waiting. 
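As a small illustration of the two ideas just mentioned (adding the same constant to every edge versus re-weighting by a per-vertex function h), here is a sketch on a made-up three-vertex graph. The graph, the helper names `path_weight` and `reweight`, and the particular values of h are assumptions chosen for the example; they are not from the lecture.

```python
# Hypothetical example graph: edge weights stored as {(u, v): weight}.
w = {("a", "b"): -2, ("b", "c"): -2, ("a", "c"): -3}

def path_weight(weights, path):
    """Sum the edge weights along a path given as a list of vertices."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

def reweight(weights, h):
    """Return w_h, where w_h(u, v) = w(u, v) + h(u) - h(v) for every edge."""
    return {(u, v): wt + h[u] - h[v] for (u, v), wt in weights.items()}

two_edge, one_edge = ["a", "b", "c"], ["a", "c"]

# Originally the two-edge path (weight -4) beats the direct edge (weight -3).
# Adding 3 to every edge charges the two-edge path twice, so the order flips.
shifted = {edge: wt + 3 for edge, wt in w.items()}

# Vertex re-weighting shifts every a-to-c path by the same h(a) - h(c),
# so the two-edge path stays the shorter one.
h = {"a": 0, "b": -2, "c": -4}
w_h = reweight(w, h)
```

With this h, both paths from a to c change by h(a) - h(c) = 4, which is the behavior the theorem claims for any choice of h.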
860 00:58:36,000 --> 00:58:41,000 So, let's just take one of these paths that starts at u and 861 00:58:41,000 --> 00:58:43,000 ends at v. Take any path. 862 00:58:43,000 --> 00:58:49,000 We're just going to see what its new weight is relative to 863 00:58:49,000 --> 00:58:53,000 its old weight. And so, let's just write out 864 00:58:53,000 --> 00:58:57,000 w_h of the path, which we define in the usual 865 00:58:57,000 --> 00:59:03,000 way as the sum over all edges of the new weight of the edge from 866 00:59:03,000 --> 00:59:09,000 v_i to v_i plus one. Do you have the word? 867 00:59:09,000 --> 00:59:11,000 No? Tough puzzle then, 868 00:59:11,000 --> 00:59:15,000 OK. So that's the definition of the 869 00:59:15,000 --> 00:59:20,000 weight of a path. And, then we know this thing is 870 00:59:20,000 --> 00:59:23,000 just w of v_i, v_i plus one. 871 00:59:23,000 --> 00:59:27,000 I'll get it right, plus the weight of the first 872 00:59:27,000 --> 00:59:32,000 vertex, plus, sorry, the re-weighting of v_i 873 00:59:32,000 --> 00:59:38,000 minus the re-weighting of v_i plus one. 874 00:59:38,000 --> 00:59:42,000 This is all in parentheses, summed over i. 875 00:59:42,000 --> 00:59:46,000 Now I need the magic word. Telescopes, good. 876 00:59:46,000 --> 00:59:51,000 Now this is obvious: each of these telescopes with 877 00:59:51,000 --> 00:59:55,000 the next or previous term, except the very beginning and 878 00:59:55,000 --> 00:59:59,000 the very end. So, this is the sum of these 879 00:59:59,000 --> 01:00:03,817 weights of edges, but then outside the sum, 880 01:00:03,817 --> 01:00:09,000 we have plus h of v_1, and minus h of v_k. 881 01:00:09,000 --> 01:00:11,933 OK, those guys don't quite cancel. 882 01:00:11,933 --> 01:00:15,577 We're not looking at a cycle, just a path. 883 01:00:15,577 --> 01:00:20,822 And, this thing is just w of the path, as this is the normal 884 01:00:20,822 --> 01:00:24,111 weight of the path. 
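For reference, the telescoping sum just described can be written out as follows. This assumes the path is p = v_1, v_2, ..., v_k with v_1 = u and v_k = v, matching the h of v_1 and h of v_k spoken above; the symbol p for the path is a labeling choice.

```latex
\begin{align*}
w_h(p) &= \sum_{i=1}^{k-1} w_h(v_i, v_{i+1}) \\
       &= \sum_{i=1}^{k-1} \bigl( w(v_i, v_{i+1}) + h(v_i) - h(v_{i+1}) \bigr) \\
       &= \sum_{i=1}^{k-1} w(v_i, v_{i+1}) + h(v_1) - h(v_k) \\
       &= w(p) + h(v_1) - h(v_k).
\end{align*}
```

Every intermediate h(v_i) appears once with a plus sign and once with a minus sign, which is why only the two endpoint terms survive.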
And so the change, 885 01:00:24,111 --> 01:00:29,088 the difference between w_h of P and w of P is this thing, 886 01:00:29,088 --> 01:00:33,000 which is just h of u minus h of v. 887 01:00:33,000 --> 01:00:36,744 And, the point is that's the same as long as you fix the 888 01:00:36,744 --> 01:00:39,468 endpoints, u and v, of the shortest path, 889 01:00:39,468 --> 01:00:43,348 you're changing this path weight by the same thing for all 890 01:00:43,348 --> 01:00:45,800 paths. This is for any path from u to 891 01:00:45,800 --> 01:00:49,612 v, and that proves the theorem. So, the one word here was 892 01:00:49,612 --> 01:00:51,927 telescopes. These change in weights 893 01:00:51,927 --> 01:00:55,536 telescope over any path. Therefore, if we want to find 894 01:00:55,536 --> 01:00:58,327 shortest paths, you just find the shortest 895 01:00:58,327 --> 01:01:01,800 paths in this re-weighted version, and then you just 896 01:01:01,800 --> 01:01:06,848 change it by this one amount. You subtract off this amount 897 01:01:06,848 --> 01:01:10,281 instead of adding it. That will give you the shortest 898 01:01:10,281 --> 01:01:12,591 path weight in the original weights. 899 01:01:12,591 --> 01:01:15,694 OK, so this is a tool. We now know how to change 900 01:01:15,694 --> 01:01:18,995 weights in the graph. But what we really want is to 901 01:01:18,995 --> 01:01:22,889 change weights in the graph so that the weights all come out 902 01:01:22,889 --> 01:01:25,134 nonnegative. OK, how do we do that? 903 01:01:25,134 --> 01:01:28,105 Why in the world would there be a function, h, 904 01:01:28,105 --> 01:01:32,000 that makes all the edge weights nonnegative? 905 01:01:32,000 --> 01:01:42,851 It doesn't make sense. It turns out we already know. 906 01:01:42,851 --> 01:01:52,000 So, I should write down this consequence. 907 01:02:12,000 --> 01:02:14,193 Let me get this in the right order. 
908 01:02:14,193 --> 01:02:17,096 So in particular, the shortest path changes by 909 01:02:17,096 --> 01:02:19,677 this amount. And if you want to know this 910 01:02:19,677 --> 01:02:22,774 value, you just move the stuff to the other side. 911 01:02:22,774 --> 01:02:26,193 So, we compute delta sub h, then we can compute delta. 912 01:02:26,193 --> 01:02:29,935 That's the consequence here. How many people here pronounce 913 01:02:29,935 --> 01:02:33,981 this word corollary? OK, and how many people 914 01:02:33,981 --> 01:02:37,599 pronounce it corollary? Yeah, we are alone. 915 01:02:37,599 --> 01:02:42,596 Usually get at least one other student, and they're usually 916 01:02:42,596 --> 01:02:45,353 Canadian or British or something. 917 01:02:45,353 --> 01:02:50,006 I think that's the accent. So, I always avoid pronouncing 918 01:02:50,006 --> 01:02:53,969 this word unless I really think, it's corollary, 919 01:02:53,969 --> 01:02:57,587 and get it right. I at least say Z not Zed. 920 01:02:57,587 --> 01:03:03,428 OK, here we go. So, what we want to do is find 921 01:03:03,428 --> 01:03:09,371 one of these functions. I mean, let's just write down 922 01:03:09,371 --> 01:03:15,771 what we could hope to have. We want to find a re-weighting 923 01:03:15,771 --> 01:03:22,971 function, h, that assigns a weight to each vertex such that w_h of 924 01:03:22,971 --> 01:03:28,457 (u,v) is nonnegative. That would be great for all 925 01:03:28,457 --> 01:03:34,735 edges, all (u,v) in E. OK, then we could run Dijkstra. 926 01:03:34,735 --> 01:03:38,264 We could run Dijkstra, get the delta h's, 927 01:03:38,264 --> 01:03:41,352 and then just undo the re-weighting, 928 01:03:41,352 --> 01:03:45,147 and get what we want. And, that is Johnson's 929 01:03:45,147 --> 01:03:48,235 algorithm. The claim is that this is 930 01:03:48,235 --> 01:03:52,029 always possible. OK, why should it always be 931 01:03:52,029 --> 01:03:54,941 possible? 
Well, let's look at this 932 01:03:54,941 --> 01:03:57,764 constraint. w_h of (u,v) is that. 933 01:03:57,764 --> 01:04:02,441 So, it's w of (u,v) plus h of u minus h of v should be 934 01:04:02,441 --> 01:04:09,691 nonnegative. Let me rewrite this a little 935 01:04:09,691 --> 01:04:14,886 bit. I'm going to put these guys 936 01:04:14,886 --> 01:04:21,589 over here. That would be the right thing, 937 01:04:21,589 --> 01:04:30,805 h of v minus h of u is less than or equal to w of (u,v). 938 01:04:30,805 --> 01:04:39,068 Does that look familiar? Did I get it right? 939 01:04:39,068 --> 01:04:46,496 It should be right. Anyone seen that inequality 940 01:04:46,496 --> 01:04:51,826 before? Yeah, yes, correct answer. 941 01:04:51,826 --> 01:04:56,993 OK, where? In a previous lecture? 942 01:04:56,993 --> 01:05:06,000 In the previous lecture. What is this called if I 943 01:05:06,000 --> 01:05:11,166 replace h with x? Charles knows. 944 01:05:11,166 --> 01:05:20,833 Good, anyone else remember all the way back to episode two? 945 01:05:20,833 --> 01:05:31,000 I know there was a weekend. What's this operator called? 946 01:05:31,000 --> 01:05:34,058 Not subtraction but, I think I heard it, 947 01:05:34,058 --> 01:05:36,568 oh man. All right, I'll tell you. 948 01:05:36,568 --> 01:05:39,627 It's a difference constraint, all right? 949 01:05:39,627 --> 01:05:42,058 This is the difference operator. 950 01:05:42,058 --> 01:05:45,745 OK, it's our good friend difference constraints. 951 01:05:45,745 --> 01:05:48,490 So, this is what we want to satisfy. 952 01:05:48,490 --> 01:05:51,784 We have a system of difference constraints. 953 01:05:51,784 --> 01:05:55,862 h of v minus h of u should be, we want to find these. 954 01:05:55,862 --> 01:05:59,941 These are our unknowns. Subject to these constraints, 955 01:05:59,941 --> 01:06:05,845 we are given the w's. Now, we know when these 956 01:06:05,845 --> 01:06:10,995 difference constraints are satisfiable. 
957 01:06:10,995 --> 01:06:18,855 Can someone tell me when these constraints are satisfiable? 958 01:06:18,855 --> 01:06:26,714 We know exactly when for any set of difference constraints. 959 01:06:26,714 --> 01:06:32,000 You've got to remember the math. 960 01:06:32,000 --> 01:06:37,649 Terminology, I can understand. 961 01:06:37,649 --> 01:06:47,779 It's hard to remember words unless you're a linguist, 962 01:06:47,779 --> 01:06:54,207 perhaps. So, when is the system of 963 01:06:54,207 --> 01:07:02,000 difference constraints satisfiable? 964 01:07:02,000 --> 01:07:08,341 All right, you should definitely, very good. 965 01:07:08,341 --> 01:07:12,027 [LAUGHTER] Yes, very good. 966 01:07:12,027 --> 01:07:21,023 Someone brought their lecture notes: when the constraint graph 967 01:07:21,023 --> 01:07:27,806 has no negative weight cycles. Good, thank you. 968 01:07:27,806 --> 01:07:34,000 Now, what is the constraint graph? 969 01:07:34,000 --> 01:07:37,726 OK, this has a one letter answer more or less. 970 01:07:37,726 --> 01:07:40,458 I'll accept the one letter answer. 971 01:07:40,458 --> 01:07:41,038 What? A? 972 01:07:41,038 --> 01:07:41,949 A: close. G. 973 01:07:41,949 --> 01:07:43,936 Yeah, I mean, same thing. 974 01:07:43,936 --> 01:07:47,745 Yeah, so the constraint graph is essentially G. 975 01:07:47,745 --> 01:07:51,388 Actually, it is G. The constraint graph is G, 976 01:07:51,388 --> 01:07:54,286 good. And, we proved this by adding a 977 01:07:54,286 --> 01:07:57,764 new source vertex, and connecting that to 978 01:07:57,764 --> 01:08:01,766 everyone. But that's sort of beside the 979 01:08:01,766 --> 01:08:03,898 point. That was in order to actually 980 01:08:03,898 --> 01:08:05,604 satisfy them. But this is our 981 01:08:05,604 --> 01:08:08,527 characterization. 
So, if we assume that there are 982 01:08:08,527 --> 01:08:12,243 no negative weight cycles in our graph, which we've been doing 983 01:08:12,243 --> 01:08:14,923 all the time, then we know that this thing is 984 01:08:14,923 --> 01:08:16,993 satisfiable. Therefore, there is an 985 01:08:16,993 --> 01:08:20,100 assignment of these h's. There is a re-weighting that 986 01:08:20,100 --> 01:08:22,111 makes all the weights nonnegative. 987 01:08:22,111 --> 01:08:24,548 Then we can run Dijkstra. OK, we're done. 988 01:08:24,548 --> 01:08:27,167 Isn't that cool? And how do we satisfy these 989 01:08:27,167 --> 01:08:29,786 constraints? We know how to do that with one 990 01:08:29,786 --> 01:08:32,283 run of Bellman-Ford, which costs order VE, 991 01:08:32,283 --> 01:08:36,000 which is less than V times Dijkstra. 992 01:08:36,000 --> 01:08:39,750 So, that's it, write down the details 993 01:08:39,750 --> 01:08:41,000 somewhere. 994 01:09:00,000 --> 01:09:03,902 So, this is Johnson's algorithm. 995 01:09:03,902 --> 01:09:07,931 This is the fanciest of them all. 996 01:09:07,931 --> 01:09:13,723 It will be our fastest all-pairs shortest path 997 01:09:13,723 --> 01:09:17,122 algorithm. So, the claim is, 998 01:09:17,122 --> 01:09:23,542 we can find a function, h, from V to R such that the 999 01:09:23,542 --> 01:09:30,970 modified weight of every edge is nonnegative for every edge, 1000 01:09:30,970 --> 01:09:37,366 (u,v), in our graph. And, we do that using 1001 01:09:37,366 --> 01:09:43,000 Bellman-Ford to solve the difference constraints. 1002 01:09:57,000 --> 01:10:01,075 These are exactly the difference constraints that we were born to 1003 01:10:01,075 --> 01:10:03,663 solve, that we learned to solve last time. 1004 01:10:03,663 --> 01:10:06,704 The graphs here correspond exactly if you 1005 01:10:06,704 --> 01:10:10,391 look back at the definition. Or, Bellman-Ford will tell us 1006 01:10:10,391 --> 01:10:12,785 that there is a negative weight cycle. 
1007 01:10:12,785 --> 01:10:16,796 OK, great, so it's not that we really have to assume that there 1008 01:10:16,796 --> 01:10:19,772 is no negative weight cycle. We'll get to know. 1009 01:10:19,772 --> 01:10:22,942 And if you're fancy, you can actually figure out the 1010 01:10:22,942 --> 01:10:25,918 minus infinities from this. But, at this point, 1011 01:10:25,918 --> 01:10:29,865 I just want to think about the case where there is no negative 1012 01:10:29,865 --> 01:10:33,696 weight cycle. But if there is, 1013 01:10:33,696 --> 01:10:39,954 we can find out that it exists, and then just tell the user. 1014 01:10:39,954 --> 01:10:45,257 OK, then we'd stop. Otherwise, there is no negative 1015 01:10:45,257 --> 01:10:48,969 weight cycle. Therefore, there is an 1016 01:10:48,969 --> 01:10:54,166 assignment that gives us nonnegative edge weights. 1017 01:10:54,166 --> 01:11:00,000 So, we just use it. We use it to run Dijkstra. 1018 01:11:00,000 --> 01:11:02,744 So, step two is, oh, I should say the running 1019 01:11:02,744 --> 01:11:05,987 time of all this is V times E. So, we're just running 1020 01:11:05,987 --> 01:11:08,419 Bellman-Ford on exactly the input graph. 1021 01:11:08,419 --> 01:11:10,665 Plus, we add a source, if you recall, 1022 01:11:10,665 --> 01:11:13,160 to solve a set of difference constraints. 1023 01:11:13,160 --> 01:11:16,340 You add a source vertex, S, connected to everyone at 1024 01:11:16,340 --> 01:11:20,145 weight zero, run Bellman-Ford from there because we don't have 1025 01:11:20,145 --> 01:11:22,327 a source here. We just have a graph. 1026 01:11:22,327 --> 01:11:25,758 We want to know all pairs. So, this, you can use to find 1027 01:11:25,758 --> 01:11:30,000 whether there is a negative weight cycle anywhere. 1028 01:11:30,000 --> 01:11:33,428 Or, we get this magic assignment. 1029 01:11:33,428 --> 01:11:39,535 So now, w_h is nonnegative, so we can run Dijkstra on w_h. 
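Step one can be sketched roughly as follows. This is a minimal illustration, not the lecture's own code: the function name `potentials`, the (u, v, weight) edge triples, and the choice to keep the added source implicit (every vertex simply starts at distance 0, which is what the weight-zero source edges would produce) are all assumptions made for the example.

```python
def potentials(vertices, edges):
    """Solve h(v) - h(u) <= w(u, v) for all edges using Bellman-Ford.

    edges is a list of (u, v, weight) triples. Returns a dict h, or None
    if a negative-weight cycle makes the constraints unsatisfiable.
    """
    # The added source reaches every vertex by a weight-0 edge, so each
    # vertex starts at distance 0 and the source needs no explicit node.
    h = {v: 0 for v in vertices}

    # Relax every edge |V| - 1 times.
    for _ in range(len(vertices) - 1):
        for u, v, wt in edges:
            if h[u] + wt < h[v]:
                h[v] = h[u] + wt

    # If any edge can still be relaxed, there is a negative-weight cycle.
    for u, v, wt in edges:
        if h[u] + wt < h[v]:
            return None
    return h
```

When a dict comes back, every edge satisfies w(u,v) + h(u) - h(v) >= 0, which is the condition needed before running Dijkstra.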
1030 01:11:39,535 --> 01:11:43,821 We'll say, using w_h, so you compute w_h. 1031 01:11:43,821 --> 01:11:49,392 That takes linear time. And, we run Dijkstra for each 1032 01:11:49,392 --> 01:11:54,428 possible source. I'll write this out explicitly. 1033 01:11:54,428 --> 01:12:00,000 We've had this in our minds several times. 1034 01:12:00,000 --> 01:12:05,368 But, when we said n times Dijkstra or n times BFS, 1035 01:12:05,368 --> 01:12:09,684 here it is. We want to compute delta sub h 1036 01:12:09,684 --> 01:12:15,263 now, of (u,v) for all v, and we do this separately for 1037 01:12:15,263 --> 01:12:18,947 all u. And so, the running time here 1038 01:12:18,947 --> 01:12:23,684 is VE plus V^2 log V. This is just V times the 1039 01:12:23,684 --> 01:12:30,000 running time of Dijkstra, which is E plus V log V. 1040 01:12:30,000 --> 01:12:35,084 OK, it happens that this term is the same as this one, 1041 01:12:35,084 --> 01:12:39,017 which is nice, because that means step one 1042 01:12:39,017 --> 01:12:43,334 costs us nothing asymptotically. OK, and then, 1043 01:12:43,334 --> 01:12:47,075 last step is, well, now we know delta h. 1044 01:12:47,075 --> 01:12:52,831 We just need to compute delta. So, for each pair of vertices, 1045 01:12:52,831 --> 01:12:57,052 we'll call it (u,v), we just compute what the 1046 01:12:57,052 --> 01:13:03,000 original weight would be, so what delta (u,v) is. 1047 01:13:03,000 --> 01:13:07,471 And we can do that using this corollary. 1048 01:13:07,471 --> 01:13:13,777 It's just delta sub h of (u,v) minus h of u plus h of v. 1049 01:13:13,777 --> 01:13:19,624 I got the signs right. Yeah, so this takes V^2 time, 1050 01:13:19,624 --> 01:13:24,668 also dwarfed by the running time of Dijkstra. 1051 01:13:24,668 --> 01:13:31,777 So, the overall running time of Johnson's algorithm is just the 1052 01:13:31,777 --> 01:13:39,000 running time of step two, running Dijkstra n times -- 1053 01:13:51,000 --> 01:13:54,951 -- which is pretty cool. 
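Putting the three steps together, a compact sketch of Johnson's algorithm might look like the following. The function name `johnson`, the (u, v, weight) edge triples, and the nested-dict result are assumptions for illustration. Vertices are assumed hashable and mutually comparable, since they appear in heap entries. The sketch uses Python's binary heap from `heapq`, so each Dijkstra run costs about E log V rather than the E plus V log V quoted above, which assumes a fancier heap.

```python
import heapq

def johnson(vertices, edges):
    """All-pairs shortest paths; edges is a list of (u, v, weight).

    Returns {u: {v: delta(u, v)}} with float('inf') for unreachable
    pairs, or None if the graph contains a negative-weight cycle.
    """
    # Step 1: Bellman-Ford from an implicit source joined to every
    # vertex by a weight-0 edge, so every vertex starts at distance 0.
    h = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        for u, v, wt in edges:
            if h[u] + wt < h[v]:
                h[v] = h[u] + wt
    if any(h[u] + wt < h[v] for u, v, wt in edges):
        return None  # negative-weight cycle detected

    # Re-weight: w_h(u, v) = w(u, v) + h(u) - h(v), which is >= 0.
    adj = {v: [] for v in vertices}
    for u, v, wt in edges:
        adj[u].append((v, wt + h[u] - h[v]))

    # Step 2: Dijkstra from every source on the re-weighted graph.
    delta = {}
    for s in vertices:
        dist = {v: float("inf") for v in vertices}
        dist[s] = 0
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry, already improved
            for v, wt in adj[u]:
                if d + wt < dist[v]:
                    dist[v] = d + wt
                    heapq.heappush(heap, (dist[v], v))
        # Step 3: undo the re-weighting, delta = delta_h - h(u) + h(v).
        delta[s] = {v: dist[v] - h[s] + h[v] for v in vertices}
    return delta
```

For example, calling `johnson(["a", "b", "c", "d"], [("a", "b", 3), ("b", "c", -2), ("a", "c", 2), ("c", "d", 1), ("d", "b", 2)])` handles the negative edge from b to c even though Dijkstra alone could not be trusted on it.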
When it comes to single source 1054 01:13:54,951 --> 01:13:58,243 shortest paths, Bellman-Ford is the best thing 1055 01:13:58,243 --> 01:14:01,990 for general weights. Dijkstra is the best thing for 1056 01:14:01,990 --> 01:14:04,976 nonnegative weights. But for all pair shortest 1057 01:14:04,976 --> 01:14:08,890 paths, we can skirt the whole negative weight issue by using 1058 01:14:08,890 --> 01:14:11,213 this magic we saw from Bellman-Ford. 1059 01:14:11,213 --> 01:14:14,995 But now, running Dijkstra n times, which is still the best 1060 01:14:14,995 --> 01:14:17,383 thing we know how to do, pretty much, 1061 01:14:17,383 --> 01:14:21,232 for the all pairs nonnegative weights, now we can do it for 1062 01:14:21,232 --> 01:14:24,018 general weights too, which is a pretty nice 1063 01:14:24,018 --> 01:14:28,000 combination of all the techniques we've seen. 1064 01:14:28,000 --> 01:14:30,217 In the trilogy, and along the way, 1065 01:14:30,217 --> 01:14:33,577 we saw lots of dynamic programming, which is always 1066 01:14:33,577 --> 01:14:35,459 good practice. Any questions? 1067 01:14:35,459 --> 01:14:38,954 This is the last new content lecture before the quiz. 1068 01:14:38,954 --> 01:14:42,852 On Wednesday it will be quiz review, if I recall correctly. 1069 01:14:42,852 --> 01:14:46,347 And then it's Thanksgiving, so there's no recitation. 1070 01:14:46,347 --> 01:14:48,632 And then the quiz starts on Monday. 1071 01:14:48,632 --> 01:14:51,000 So, study up. See you then.