The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, welcome back. This is the second half of our two lectures on distributed algorithms this week. I could turn that on, OK.

All right, I'll start with a quick preview. This week, we're just looking at synchronous and asynchronous distributed algorithms, not worrying about interesting stuff like failures. Last time we looked at leader election, maximal independent set, breadth-first spanning trees, and we started looking at shortest paths trees. We'll finish that today, and then we'll move on to the main topic for today, which is asynchronous distributed algorithms, where things start getting much more complicated because not everything is going on in synchronous rounds. And we'll revisit the same two problems, breadth-first and shortest paths spanning trees.
Quick review: all of our algorithms are using a model that's based on an undirected graph. We use this notation, gamma, for the neighbors of a vertex, and we'll talk about the degree of a vertex. We have a process associated with every vertex in the graph, and we associate communication channels, in both directions, with every edge.

Last time we looked at synchronous distributed algorithms. We have the processes at the graph vertices, and they communicate using messages. The processes have local ports that they know by some kind of local name, and the ports connect to their communication channels. The algorithm executes in synchronous rounds where, in every round, every process decides what to send on all of its ports, and the messages then get put into the channels and delivered to the processes at the other end. And everybody looks at all the messages they get at that round all at once, and they compute a new state based on all of those arriving messages.

We started with leader election.
I won't repeat the problem definition, but here are the results that we got last time. We looked at a special case of a graph that's just a simple clique. And if the processes don't have any distinguishing information, like unique identifiers, and they're deterministic, then there's no way to break the symmetry, and you can prove an impossibility result: in the deterministic, indistinguishable case, you can't guarantee to elect a leader in this kind of graph.

Just as an aside, I should say that distributed computing theory is just filled with impossibility results, and they're all based on this limitation of distributed computing where each node only knows what's happening to itself and in its neighborhood. Nobody knows what's happening in the entire system. That's a very strong limitation, and as you might expect, that makes a lot of things impossible or difficult.

Then we went on and got two positive results. The first is an algorithm that is deterministic, but where the processes have unique identifiers, and then you can elect a leader quickly.
Or, if you don't have unique identifiers but you have randomness, so you can make random choices, you can essentially choose an identifier, and then it works almost as well with those chosen identifiers.

Then we looked at maximal independent set. Remember what it means: no two neighbors should both be in the set, but you shouldn't be able to add any more nodes while keeping the set independent. In other words, every node is either in the set or has a neighbor in the set.

And we gave this algorithm-- I'm just including it here as a reminder-- Luby's algorithm, which basically goes through a number of phases. In each phase, some processes decide to join the MIS and their neighbors decide not to join, and you just repeat this. You do this based on choosing random IDs again. And we proved that the algorithm correctly computes an MIS, and that with good probability, all the nodes decide within only logarithmically many phases, where n is the number of nodes.

All right, then we went on to breadth-first spanning trees. Here we now have a graph that already has a leader: it has a distinguished vertex.
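The phase structure of Luby's algorithm described a moment ago can be sketched as a centralized simulation. This is only an illustration, not the distributed code itself: the real algorithm runs one process per vertex, and I'm assuming the common variant where fresh random draws stand in for the random IDs.

```python
import random

def luby_mis(adj):
    """One sketch of Luby's algorithm on an undirected graph.

    adj maps each node to the set of its neighbors.  In each phase every
    still-active node draws a random ID; a node joins the MIS when its ID
    beats all active neighbors' IDs, and then it and its neighbors drop out.
    """
    active = set(adj)
    mis = set()
    while active:
        ids = {v: random.random() for v in active}
        winners = {v for v in active
                   if all(ids[v] > ids[u] for u in adj[v] if u in active)}
        mis |= winners
        # Winners and their neighbors leave the computation.
        active -= winners
        for v in winners:
            active -= adj[v]
    return mis
```

In each phase the node holding the globally largest draw always joins, so the active set shrinks and the loop terminates; ties between real-valued draws happen with probability zero.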
And the process that's sitting there knows that it's the leader. The processes now are going to produce a breadth-first spanning tree rooted at that vertex. For the rest of the time, we'll assume unique identifiers, but the processes don't know anything about the graph except their own ID and their neighbors' IDs. And we want the processes to eventually output the ID of their parent.

And here, just to repeat, is the simple algorithm that we used. Basically, processes just mark themselves as they get included in the tree. It starts out with just the root being marked. It sends a special search message out to its neighbors, and as soon as they get it, they get marked and pass it on. Everybody gets marked in a number of rounds that corresponds to their depth in-- their distance from the root of the tree [INAUDIBLE].

We talked about correctness in terms of invariants. What this algorithm guarantees is that, at the end of any number r of rounds, exactly the processes at distance up to r are marked, and the processes that are marked have their parents defined.
And if your parent is defined, it's the UID of a process that has distance d minus 1 from the root. If you're at distance d, your parent should be somebody who has the correct distance, d minus 1.

We analyzed the complexity. Time is counting the number of rounds, and that's going to be, at worst, the diameter of the network-- really the distance from the particular vertex v0. And for the message complexity, there's only one message sent in each direction on each edge, so that's going to be only order of the number of edges.

And we talked about how you can get child pointers. This just gives you parents, but if you want to also find out your children, then every search message should get a response saying either you're my parent or you're not my parent. And we can detect termination using convergecast, and there are some applications we talked about as well.

All right, and then at the end of the hour, we started talking about a generalization of breadth-first spanning trees, which adds weights, so you have shortest paths trees. Now you have weights on the edges.
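Before getting into the weighted case, the basic search-message flooding just reviewed can be sketched round by round. Again a centralized simulation, not the per-process code; one loop iteration plays the role of one synchronous round, and ties between simultaneous search messages are broken arbitrarily.

```python
def bfs_spanning_tree(adj, root):
    """Simulate the synchronous flooding algorithm, round by round.

    adj maps each vertex to the set of its neighbors.  Returns the parent
    pointers of the resulting breadth-first spanning tree (root -> None).
    """
    parent = {root: None}      # marked processes and their chosen parents
    frontier = {root}          # processes sending `search` this round
    while frontier:
        next_frontier = set()
        for v in frontier:
            for u in adj[v]:
                if u not in parent:      # unmarked: mark it, record parent
                    parent[u] = v        # tie-break: an arbitrary sender wins
                    next_frontier.add(u)
        frontier = next_frontier
    return parent
```

A process at distance d gets marked in round d, matching the invariant on the slide.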
Processes still have unique identifiers. They don't know anything about the graph, except their immediate neighborhood information. And they have to produce a shortest paths spanning tree rooted at vertex v0. You know what a spanning tree is, and the shortest paths are in terms of the total weight of the path. Now we want each node, each process, to output its parent in the shortest paths tree, and also its distance from the original vertex v0.

At the end of the hour, we looked at a version of Bellman-Ford, which you've already seen as a sequential algorithm. But as a distributed algorithm, everybody keeps track of their best current guess of the distance from the initial vertex. They keep track of their parent on some path that gave them this distance estimate, and they keep their unique identifier.

The complete algorithm now is that everybody's going to keep sending their distance around-- I mean, we could optimize that, but this is simple. At every round, everybody sends their current distance estimate to all their neighbors.
They collect the distance estimates from all their neighbors, and then they do a relaxation step. If they get anything that's better than what they had before, they take the best new estimate they could get. They take the minimum of their old estimate-- stop shaking, good-- and the minimum of all of the estimates they would get from the incoming information, adding the weight of the edge between the sender and the node itself. This way you may get a better estimate, and if you do, you reset your parent to be the sender of this improved information. And if there's a tie, you can pick any of the nodes that sent information leading to the best new guess-- you can set your parent to any of those. And then this just repeats.

At the very end of the hour, we showed an animation, which I'm not going to repeat now, which basically shows how you can get lots of corrections. You can have many paths that are set up that look good after just one round, but then they get corrected as the rounds go on.
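One synchronous round of this relaxation, in the same centralized-simulation style, might look like the following sketch. Here `adj_w` maps a vertex to a dictionary of its neighbors' edge weights, and `dist` and `parent` hold each node's current guesses; these names are mine, not the lecture's.

```python
import math

def bellman_ford_round(dist, parent, adj_w):
    """One synchronous round of the distributed Bellman-Ford relaxation.

    Every node "sends" its estimate as of the round's start; every node
    then relaxes against all incoming estimates plus the edge weights.
    Returns True when some estimate improved this round.
    """
    sent = dict(dist)                      # snapshot: what got sent this round
    changed = False
    for v, nbrs in adj_w.items():
        best, best_parent = dist[v], parent[v]
        for u, w in nbrs.items():
            if sent[u] + w < best:         # relaxation step
                best, best_parent = sent[u] + w, u
        if best < dist[v]:
            dist[v], parent[v] = best, best_parent
            changed = True
    return changed
```

Initializing `dist` to 0 at the root and infinity elsewhere and running this n minus 1 times gives every node its true distance and a parent on some shortest path.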
You can get much lower weight paths by having a roundabout, multi-hop path to a node-- you can get a better total cost. Here's where we got to last time.

Now, why does this work? Well, what we need is that eventually every process should get the correct distance, and the parent should be its predecessor on some shortest path. In order to prove that, you always look for an invariant-- something that's true at intermediate steps of the algorithm, that you can show by induction will hold-- and that will imply the result that you want at the end.

Here, what's the key invariant? At the end of some number r of rounds, what do the processes have? After r rounds have passed, in this kind of algorithm, what do the estimates look like?

Well, after one round, everybody's got the best estimate that could result from a single-hop path from v0. After two rounds, you also get the best guesses from two-hop paths, and after r rounds in general, you have your distance and parent corresponding to a shortest path chosen from among those that have at most r hops.
Yes? That makes sense? No? Yeah? OK.

All right, and if there's no path of r hops or fewer to get to a node, it's still going to have its distance estimate be infinity after r rounds. This is not a complete proof, but it's the key invariant that makes this work. You can see that after enough rounds-- corresponding to the number of nodes, for example-- everybody will have the correct shortest path of any length. OK?

The number of rounds until all the estimates stabilize is going to be n minus 1. All right? Because the longest simple path to any node has n minus 1 edges, if it goes through all the nodes of the network before reaching it. Makes sense? If you want to make sure you have the best estimate, you have to wait n minus 1 rounds to make sure the information has reached you.
The message complexity-- well, since there are all these repeated estimates, it's no longer just proportional to the number of edges; you have to take into account that there can be many new estimates sent on each edge. In fact, the way I've written this, you just keep sending your distance at every round, so it's going to be the number of edges times the number of rounds. You can do somewhat better than this, but it's worse than the simple BFS case. This is more expensive, because BFS just had diameter-many rounds and this now has n minus 1 rounds, and BFS had just one message ever sent on each edge, and now we have to send many.

Comments? Questions? Is it clear that the time bound really does depend on n and not on the diameter? For breadth-first search, it was enough to have enough rounds to correspond to the actual distance in hops to each node, but now you need enough rounds to take care of these indirect paths that might go through many nodes but still end up with a smaller total weight.
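To see concretely that stabilization depends on n rather than on the diameter, here is a small self-contained check in the spirit of last time's animation. The graph is my own toy example, not the one from the slides: a heavy direct edge from the source, plus a long cheap path whose better estimates only creep forward one hop per round.

```python
import math

def rounds_to_stabilize(adj_w, root):
    """Run synchronous Bellman-Ford rounds until no estimate improves;
    return the number of improving rounds and the final distances."""
    dist = {v: math.inf for v in adj_w}
    dist[root] = 0
    rounds = 0
    while True:
        sent = dict(dist)                  # everybody "sends" its estimate
        for v, nbrs in adj_w.items():
            for u, w in nbrs.items():
                if sent[u] + w < dist[v]:
                    dist[v] = sent[u] + w  # relaxation
        if dist == sent:
            return rounds, dist
        rounds += 1

# Ring of 11 nodes: weight-1 edges 0-1-...-10, plus a weight-100 edge 0-10.
k = 10
adj_w = {i: {} for i in range(k + 1)}
for i in range(k):
    adj_w[i][i + 1] = adj_w[i + 1][i] = 1
adj_w[0][k] = adj_w[k][0] = 100
```

The hop diameter of this 11-node ring is only 5, but node 10's round-1 estimate of 100 keeps getting corrected as the cheap path creeps around, so it takes 10 improving rounds before everything stabilizes at the true distances.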
Is it clear to everybody why the bound depends on n? Actually, the animation that I had last time showed how there are lots of corrections, and you had rounds that depended on the total number of nodes. Yes? OK. Yeah?

AUDIENCE: But could you keep track of each round if the values-- if any of [INAUDIBLE]? If everything stops changing after less than n rounds, then you might not have to--

PROFESSOR: OK, so you're asking about termination?

AUDIENCE: Yeah.

PROFESSOR: Ah, OK, well, that's probably the next slide. First, let's deal with the child pointers, and then we'll come back to termination.

First, this just gives you your parents. If you want your children, you can do the same thing that we did last time. When a process gets a message, and the message doesn't make the sender its parent-- it's not giving it an improved distance-- then the node can just respond, non-parent.
But if a process receives a message that does improve its distance, it says, OK, you are my parent. But it might have another parent-- another node that previously it thought was its parent. It has to do more work, in this case, to correct the erroneous parent information: it has to send its previous parent a non-parent message to correct the previous parent message. Things are getting a little bit trickier here.

On the other end, if somebody is keeping track of its children, it has to make adjustments too, because things can change during the algorithm. Let's say a process keeps track of its children in some set-- it has a set, children. If it gets a non-parent message from a neighbor, that neighbor might be in its children set, and this could be a correction: the process has to take that neighbor out of the set of children.

And suppose the process improves its own distance. Well, now it's kind of starting over.
It's going to send that distance to all of its neighbors again and collect new information about whether they're children or not. The simple thing to do here is just to empty out your children set and start over: you send your new messages to your neighbors and wait for them to respond again. There's tricky bookkeeping to do to handle corrections as the structure of this tree changes, so getting child pointers is a little more complicated than before. Make sense?

All right, so now back to your question: termination. How do all the processes know when the tree is complete? In fact, we have a worse problem. We hit this problem for breadth-first search, but now we have an even worse problem. What is that? Yeah?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah, before, each individual node, once it figured out who its parent was, could just output that. And now you can figure out your parent, but it's just a guess, and you don't know when you can output it. How does an individual process even figure out its own parent and distance?
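The correction rules just described-- parent/non-parent replies, retracting an old parent claim, and clearing the children set on improvement-- can be collected into one per-process message handler. This is only a sketch under my own naming: `send` stands in for putting a message on a channel, and `"all-neighbors"` is shorthand for broadcasting on every port.

```python
import math

class Node:
    """Sketch of the per-process bookkeeping for child pointers."""

    def __init__(self, uid):
        self.uid = uid
        self.dist = math.inf
        self.parent = None
        self.children = set()

    def on_distance(self, sender, sender_dist, edge_weight, send):
        """Handle a distance estimate arriving from a neighbor.

        Follows the slide's rule: a non-improving message gets a non-parent
        reply (assuming a node only re-sends when its estimate improves).
        """
        if sender_dist + edge_weight < self.dist:
            if self.parent is not None and self.parent != sender:
                send(self.parent, "non-parent")   # retract old parent claim
            self.dist = sender_dist + edge_weight
            self.parent = sender
            send(sender, "parent")
            self.children.clear()                 # old child info may be stale
            send("all-neighbors", ("distance", self.dist))
        else:
            send(sender, "non-parent")

    def on_reply(self, sender, kind):
        """Handle a parent / non-parent response to our own distance message."""
        if kind == "parent":
            self.children.add(sender)
        else:
            self.children.discard(sender)
```

On an improvement, the node retracts its claim to its old parent, adopts the sender, and starts collecting children responses from scratch.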
There are two aspects to termination here. One is, how do you know the whole thing is finished? But the other one is, how do you even know when you're done with your own parent and distance? Well, if you knew something about the graph, like an upper bound on the total number of nodes, then you could just wait that number of rounds and be done. But what if you don't have that information about the graph? What might you do? Yeah?

AUDIENCE: You want to BFS in parallel and filter down the information when you've reached the size of the graph.

PROFESSOR: Maybe. I think-- what is the strategy we used for termination for BFS? Let's start with that one; it's a little easier. You did the subtree thing where we convergecast the information. When we had a leaf, he knew he was a leaf, and he could send his done information up to his parent, and that got convergecast up to the top of the tree.

Can we convergecast in this setting? It turns out we can, but since things are changing, you're going to be sending done messages and then something might change.
You might be participating in the convergecast many times. Since the tree structure is changing, the main idea is that anybody can send a done message to the node it currently believes is its parent, provided it has received responses to all of its distance messages-- so it thinks it knows who all its children are-- and they have all sent it done messages. For all your current children-- your current belief about who your children are-- if they've all sent you done messages, then you can send a done message up the tree.

But this can get a little more complicated than it sounds, because you can change who your children are. What this means is that the same process can be involved several times in the convergecast, based on improving estimates.

Here's an example of the kind of thing that can happen. Let's say you start out with these huge weights, and then you have a long path with small weights. i0 starts out and sends its distance information on its three edges, to its three neighbors.
And this guy at the bottom now has a distance estimate of 100, and it's going to decide it's a leaf. Why? Because when it sends messages to its neighbors, they're not able to improve their estimates based on the new information that it's sending. This guy decides that he's a leaf right away, and he sends that information back to node i0.

On the other hand, this other guy has the same estimate of 100, and he sends out his messages to try to find out if he's a leaf. But he finds out, when he sends a message this way, that he's actually able to improve that neighbor's estimate, because it was infinity till then. So he doesn't think he's a leaf. We have this one guy who thinks he's a leaf and responds, so i0 is sitting there-- he has to wait to hear from his other children. OK so far?

All right, in the meantime, the messages are going to eventually creep around, and this node is going to get a smaller estimate based on the length of that long, many-hop path. Then he's going to decide he is not a child of i0. He's going to tell i0, I'm really not your child, which means that i0 stops waiting for him.
423 00:22:01,750 --> 00:22:05,870 But also, this guy decides he's not a child as well. 424 00:22:05,870 --> 00:22:09,530 He becomes a child of the node right above him. 425 00:22:09,530 --> 00:22:14,750 So i 0 now will realize he only has one child, 426 00:22:14,750 --> 00:22:20,920 but this guy believes he's a leaf again after trying again 427 00:22:20,920 --> 00:22:22,200 to find children. 428 00:22:22,200 --> 00:22:25,180 And now the done information propagates all the way 429 00:22:25,180 --> 00:22:28,870 up the tree, which is now just this one long path. 430 00:22:28,870 --> 00:22:31,840 They start trying to convergecast, 431 00:22:31,840 --> 00:22:33,790 but then, oops, they were wrong. 432 00:22:33,790 --> 00:22:36,490 They have to make a correction, and they 433 00:22:36,490 --> 00:22:38,379 are forming a new tree. 434 00:22:38,379 --> 00:22:40,170 Eventually, the tree is going to stabilize, 435 00:22:40,170 --> 00:22:41,730 and eventually the done information 436 00:22:41,730 --> 00:22:43,990 will get all the way up to the top, 437 00:22:43,990 --> 00:22:47,494 but there could be lots of false starts in the mean time. 438 00:22:47,494 --> 00:22:48,910 It's sort of confusing, but that's 439 00:22:48,910 --> 00:22:50,344 the kind of thing that happens. 440 00:22:50,344 --> 00:22:52,514 Yeah? 441 00:22:52,514 --> 00:22:56,213 AUDIENCE: There may be a process in which [INAUDIBLE] 442 00:22:56,213 --> 00:23:00,629 and then it just switches its mind at the very end of it. 443 00:23:00,629 --> 00:23:02,974 How do you make sure that, that propagation [INAUDIBLE]? 444 00:23:06,617 --> 00:23:08,200 PROFESSOR: Yes, so the root node isn't 445 00:23:08,200 --> 00:23:11,350 going to terminate until it hears from everybody. 446 00:23:11,350 --> 00:23:14,100 You kind of have to close out the whole process. 447 00:23:14,100 --> 00:23:15,940 It's always pending, waiting for somebody, 448 00:23:15,940 --> 00:23:17,370 if it hasn't heard from someone. 
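The done-message rule described above-- wait for replies to all your distance messages, then wait for done reports from everyone you currently believe is your child-- can be sketched in Python. This is a minimal illustration of the local bookkeeping only; the class and field names here are invented, not from the lecture, and the actual message plumbing is omitted.

```python
# Illustrative sketch (invented names) of the bookkeeping one node might
# keep for the asynchronous convergecast. A node may report "done" to its
# current parent only when every distance message it sent has been
# answered AND every node it currently believes is its child has itself
# reported done. A later, better estimate clears the stale done reports.

class ConvergecastNode:
    def __init__(self, uid, neighbors):
        self.uid = uid
        self.neighbors = set(neighbors)
        self.parent = None        # current best-guess parent
        self.awaiting = set()     # neighbors that haven't answered our last distance message
        self.children = set()     # neighbors currently claiming us as parent
        self.done_from = set()    # children that have reported done

    def send_distances(self):
        # after improving our estimate, re-send to everyone and wait again
        self.awaiting = set(self.neighbors)
        self.done_from.clear()    # old done reports may now be stale

    def on_reply(self, nbr, is_child):
        self.awaiting.discard(nbr)
        if is_child:
            self.children.add(nbr)
        else:
            self.children.discard(nbr)   # "I'm really not your child"
            self.done_from.discard(nbr)

    def on_done(self, child):
        self.done_from.add(child)

    def can_report_done(self):
        return not self.awaiting and self.children <= self.done_from
```

With this rule, a node like i 0 in the example keeps waiting as long as any believed child has neither reported done nor disclaimed being a child.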
449 00:23:17,370 --> 00:23:20,610 If things switch, they'll join in another part of the tree. 450 00:23:20,610 --> 00:23:22,290 I think the best thing to do here 451 00:23:22,290 --> 00:23:25,292 is sort of construct some little examples by hand. 452 00:23:25,292 --> 00:23:27,000 I mean we're not going to get into how 453 00:23:27,000 --> 00:23:29,430 you do formal proofs of things like this. 454 00:23:29,430 --> 00:23:32,120 We don't even have right now a simulator 455 00:23:32,120 --> 00:23:34,050 you can use to play with these algorithms. 456 00:23:34,050 --> 00:23:35,840 Although if anybody's interested, 457 00:23:35,840 --> 00:23:38,534 some students in my class last term, 458 00:23:38,534 --> 00:23:40,700 actually, wrote a simulator that might be available. 459 00:23:44,230 --> 00:23:50,070 OK, all right, so that's what you 460 00:23:50,070 --> 00:23:53,500 get for a synchronous-- some example synchronous distributed 461 00:23:53,500 --> 00:23:55,030 algorithms. 462 00:23:55,030 --> 00:23:57,560 Now let's look at something more complicated 463 00:23:57,560 --> 00:24:02,530 when you get into asynchronous algorithms. 464 00:24:02,530 --> 00:24:03,620 OK. 465 00:24:03,620 --> 00:24:06,820 So far, in the complications that you've seen 466 00:24:06,820 --> 00:24:09,530 over the rest of this course, you 467 00:24:09,530 --> 00:24:12,140 have had processes that are acting concurrently. 468 00:24:12,140 --> 00:24:14,560 And we had a little bit of non-determinism, 469 00:24:14,560 --> 00:24:16,900 nothing important, but now things 470 00:24:16,900 --> 00:24:19,310 are about to get much worse. 471 00:24:19,310 --> 00:24:22,110 We don't have rounds anymore. 472 00:24:22,110 --> 00:24:24,500 Now we're going to have processes taking steps, 473 00:24:24,500 --> 00:24:28,260 messages getting delivered at absolutely arbitrary times, 474 00:24:28,260 --> 00:24:31,240 in arbitrary orders. 
475 00:24:31,240 --> 00:24:34,740 The processes can get completely out of sync, 476 00:24:34,740 --> 00:24:37,740 and so you have lots and lots more non-determinism 477 00:24:37,740 --> 00:24:38,580 in the algorithm. 478 00:24:38,580 --> 00:24:40,440 The non-determinism has to do with who's 479 00:24:40,440 --> 00:24:41,520 doing what in what order. 480 00:24:44,990 --> 00:24:47,230 Understanding that type of algorithm 481 00:24:47,230 --> 00:24:49,820 is really different from understanding the algorithms 482 00:24:49,820 --> 00:24:53,780 that you've seen all term and even synchronous distributed 483 00:24:53,780 --> 00:24:56,510 algorithms, because there isn't just one way 484 00:24:56,510 --> 00:25:00,450 the algorithm is going to execute. 485 00:25:00,450 --> 00:25:04,230 The execution can proceed in many different ways, 486 00:25:04,230 --> 00:25:07,870 just depending on the order of the steps. 487 00:25:07,870 --> 00:25:09,980 You can't ever hope to understand 488 00:25:09,980 --> 00:25:14,344 exactly how this kind of algorithm is executing. 489 00:25:14,344 --> 00:25:15,010 What can you do? 490 00:25:15,010 --> 00:25:17,710 Well, you can play with it, but in the end, 491 00:25:17,710 --> 00:25:20,170 what you have to understand is some abstract properties-- 492 00:25:20,170 --> 00:25:22,300 some properties of the executions, rather 493 00:25:22,300 --> 00:25:26,230 than exactly what happens at every step, and that's a jump. 494 00:25:26,230 --> 00:25:29,580 It's a new way of thinking. 495 00:25:29,580 --> 00:25:31,230 We can look at asynchronous stuff, 496 00:25:31,230 --> 00:25:33,260 if you want, from my book. 497 00:25:33,260 --> 00:25:37,810 Now we still have processes at the nodes of a graph. 498 00:25:37,810 --> 00:25:40,250 And now we have communication channels 499 00:25:40,250 --> 00:25:42,610 associated with the edges. 
500 00:25:42,610 --> 00:25:45,800 Now, the processes are going to be some kind of automata, 501 00:25:45,800 --> 00:25:48,540 but the channels will also be some kind of automata. 502 00:25:48,540 --> 00:25:51,100 We'll have all these components, and we'll 503 00:25:51,100 --> 00:25:55,040 be modeling all of them. 504 00:25:55,040 --> 00:25:57,690 Processes still have their ports. 505 00:25:57,690 --> 00:26:02,620 They need not, in general, have identifiers. 506 00:26:02,620 --> 00:26:04,970 What's a channel? 507 00:26:04,970 --> 00:26:08,170 A channel is-- it's a kind of an automaton, infinite state 508 00:26:08,170 --> 00:26:12,300 automaton, that has inputs and some outputs. 509 00:26:12,300 --> 00:26:19,620 Here, this is just a picture of a channel from node u 510 00:26:19,620 --> 00:26:24,450 to node v. Channel uv is just this cloud 511 00:26:24,450 --> 00:26:27,320 thing that delivers messages. 512 00:26:27,320 --> 00:26:28,620 It has inputs. 513 00:26:28,620 --> 00:26:32,220 The inputs are-- messages get sent on the channel. 514 00:26:32,220 --> 00:26:36,680 You can have one process at one end sending a message m 515 00:26:36,680 --> 00:26:39,510 and the outputs at the other end are the deliveries, 516 00:26:39,510 --> 00:26:48,450 let's say receive message m, at the other end, node v. 517 00:26:48,450 --> 00:26:50,640 To model this, the best thing is actually 518 00:26:50,640 --> 00:26:52,510 to-- instead of just describing what 519 00:26:52,510 --> 00:26:55,510 it does, to give an explicit model of its state 520 00:26:55,510 --> 00:26:59,767 and what happens when the inputs and outputs occur. 521 00:26:59,767 --> 00:27:01,850 If you want these messages to be delivered-- let's 522 00:27:01,850 --> 00:27:03,960 say in FIFO order, fine. 523 00:27:03,960 --> 00:27:07,250 You can make the state of the channel be an actual q. 524 00:27:07,250 --> 00:27:10,660 mq would just be a FIFO queue of messages. 
525 00:27:10,660 --> 00:27:13,070 Starts out empty, and when messages get sent, 526 00:27:13,070 --> 00:27:15,200 they get added to the end, and when they get delivered, they 527 00:27:15,200 --> 00:27:18,260 get removed from the beginning. 528 00:27:18,260 --> 00:27:20,070 All that we need to describe-- this 529 00:27:20,070 --> 00:27:21,900 is a sort of a pseudo code. 530 00:27:21,900 --> 00:27:24,530 All we need to describe-- to write in order 531 00:27:24,530 --> 00:27:26,760 to describe what this channel does, 532 00:27:26,760 --> 00:27:30,110 is what happens when a send occurs, 533 00:27:30,110 --> 00:27:34,280 and when can this channel deliver a message, and what 534 00:27:34,280 --> 00:27:36,340 happens when it does that. 535 00:27:36,340 --> 00:27:39,970 A send, which can just come in at any time, and the effect 536 00:27:39,970 --> 00:27:44,630 is just to add this message to the end of the queue. 537 00:27:44,630 --> 00:27:50,710 The receive-- stop moving. 538 00:27:50,710 --> 00:27:57,420 The receive-- that cannot be construction. 539 00:27:57,420 --> 00:27:59,186 We'd hear noise. 540 00:27:59,186 --> 00:28:02,320 It's gremlins. 541 00:28:02,320 --> 00:28:07,730 We have-- a receive can occur only 542 00:28:07,730 --> 00:28:11,010 when this message is at the head of the queue, 543 00:28:11,010 --> 00:28:14,420 and when it occurs, it gets removed from the queue. 544 00:28:14,420 --> 00:28:15,990 Does this make sense as a description 545 00:28:15,990 --> 00:28:18,100 of what a channel does? 546 00:28:18,100 --> 00:28:20,090 Messages come in, get added to it, 547 00:28:20,090 --> 00:28:21,980 and then messages can get delivered 548 00:28:21,980 --> 00:28:26,610 in certain situations, and they get removed. 549 00:28:26,610 --> 00:28:28,950 That's a description of the channel 550 00:28:28,950 --> 00:28:31,450 we're going to be using in an asynchronous system. 
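The channel pseudocode just described can be written out as a small Python sketch, using mq for the FIFO queue as in the lecture. The class name and method signatures here are illustrative assumptions, not the lecture's notation: send is the input action (enqueue at the tail), and receive is the output action, enabled only for the message at the head.

```python
from collections import deque

# Sketch of the FIFO channel automaton from the lecture.
# State: mq, a FIFO queue of in-flight messages, initially empty.

class Channel:
    def __init__(self):
        self.mq = deque()           # the FIFO queue of messages

    def send(self, m):              # input action: can occur at any time
        self.mq.append(m)           # effect: add m to the end of the queue

    def receive_enabled(self, m):   # receive(m) is enabled only when m is at the head
        return bool(self.mq) and self.mq[0] == m

    def receive(self):              # output action: deliver and remove the head
        return self.mq.popleft()
```

A quick run: after send("a") then send("b"), only receive("a") is enabled, and receiving twice delivers "a" then "b" in order.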
551 00:28:31,450 --> 00:28:33,760 A process-- the rest of the system 552 00:28:33,760 --> 00:28:38,340 consists of processes associated with the graph vertices. 553 00:28:38,340 --> 00:28:42,890 Let's say pu is a process that's associated with vertex u. 554 00:28:42,890 --> 00:28:45,680 But I'm writing that just sort of as a shorthand, because u 555 00:28:45,680 --> 00:28:49,130 is the vertex in the graph, and the process doesn't actually 556 00:28:49,130 --> 00:28:51,550 know the name of the vertex. 557 00:28:51,550 --> 00:28:54,080 It just has its own unique ID or something, 558 00:28:54,080 --> 00:28:58,020 but I'm going to be a little sloppy about that now. 559 00:28:58,020 --> 00:29:02,770 The process that at vertex u can perform 560 00:29:02,770 --> 00:29:05,480 send outputs to put messages on the channel, 561 00:29:05,480 --> 00:29:08,970 and it will receive inputs for messages 562 00:29:08,970 --> 00:29:10,940 to come in on the channel. 563 00:29:10,940 --> 00:29:16,320 But the processes might also have some external interface 564 00:29:16,320 --> 00:29:20,210 where somebody is submitting some inputs to the process, 565 00:29:20,210 --> 00:29:22,717 and the process has to produce some output at the end. 566 00:29:22,717 --> 00:29:24,300 There can be other inputs and outputs. 567 00:29:28,900 --> 00:29:32,180 And we'll model it with state variables. 568 00:29:32,180 --> 00:29:34,530 Process is supposed to keep taking steps. 569 00:29:34,530 --> 00:29:38,710 The channel is supposed to keep delivering messages. 570 00:29:38,710 --> 00:29:40,380 It's a property called liveness. 571 00:29:40,380 --> 00:29:42,880 You want to make sure that your components in your system 572 00:29:42,880 --> 00:29:44,160 all keep doing things. 573 00:29:44,160 --> 00:29:48,450 They don't just do some steps and then stop. 574 00:29:48,450 --> 00:29:49,980 Here's a simple example. 575 00:29:49,980 --> 00:29:54,650 A process that's remembering the maximum number it's ever seen. 
576 00:29:54,650 --> 00:29:56,390 There's a max process automaton. 577 00:29:56,390 --> 00:30:01,160 It receives some messages m, some value, 578 00:30:01,160 --> 00:30:02,860 and it will send it out. 579 00:30:02,860 --> 00:30:07,180 It keeps track of the max-- that's the max state variable. 580 00:30:07,180 --> 00:30:10,170 It starts out with its own initial value, 581 00:30:10,170 --> 00:30:14,610 so x for u. x of u is its initial value. 582 00:30:14,610 --> 00:30:17,820 And then it has-- for every neighbor, 583 00:30:17,820 --> 00:30:22,850 it has a little queue-- here, it's just a Boolean-- 584 00:30:22,850 --> 00:30:25,950 asking whether it's supposed to send to that neighbor. 585 00:30:29,800 --> 00:30:33,020 This is the pseudocode for that max process. 586 00:30:33,020 --> 00:30:34,900 What does it do when it receives a message? 587 00:30:34,900 --> 00:30:38,940 Well, it sees if that value is bigger than what it had before, 588 00:30:38,940 --> 00:30:41,480 and if so, it resets the max. 589 00:30:41,480 --> 00:30:44,870 And it also makes a note that it's supposed to send this out 590 00:30:44,870 --> 00:30:45,900 to all its neighbors. 591 00:30:45,900 --> 00:30:47,870 Whenever it gets new information, 592 00:30:47,870 --> 00:30:50,914 it will want to propagate it to its neighbors. 593 00:30:50,914 --> 00:30:52,080 You see how this is written? 594 00:30:52,080 --> 00:30:56,050 You just say, reset the max and then get ready 595 00:30:56,050 --> 00:31:00,030 to send to all your neighbors, and the last part 596 00:31:00,030 --> 00:31:01,380 is just sort of trivial code. 597 00:31:01,380 --> 00:31:04,800 It says, if you are ready to send, then you can send, 598 00:31:04,800 --> 00:31:10,519 and then you're done and you can set the send flags to false. 599 00:31:10,519 --> 00:31:11,018 Yeah? 600 00:31:11,018 --> 00:31:13,360 AUDIENCE: What's the w? 601 00:31:13,360 --> 00:31:16,160 PROFESSOR: For every neighbor. 
602 00:31:16,160 --> 00:31:19,550 Oh, I wrote neighbor v before, and then I-- yeah. 603 00:31:19,550 --> 00:31:22,760 I wrote neighbor-- oh, I know why I wrote w. 604 00:31:22,760 --> 00:31:25,180 Here, I'm talking about if you receive a message 605 00:31:25,180 --> 00:31:28,010 from a particular neighbor v, then 606 00:31:28,010 --> 00:31:30,300 you have to send it to all your neighbors. 607 00:31:30,300 --> 00:31:33,590 Before, I used v to denote a generic neighbor, 608 00:31:33,590 --> 00:31:35,870 but now I can't do that anymore, because v 609 00:31:35,870 --> 00:31:38,070 is the sender of the message. 610 00:31:38,070 --> 00:31:40,310 Just technical-- OK? 611 00:31:43,610 --> 00:31:45,370 We have these process automata. 612 00:31:45,370 --> 00:31:46,710 We have this channel automata. 613 00:31:46,710 --> 00:31:48,188 Now, we want to build the system. 614 00:31:48,188 --> 00:31:49,146 We paste them together. 615 00:31:52,690 --> 00:31:54,350 How we paste them is just we have 616 00:31:54,350 --> 00:31:57,730 outputs from processes that can match up with inputs 617 00:31:57,730 --> 00:32:00,200 to channels and vice versa. 618 00:32:00,200 --> 00:32:02,900 If a process has a send output, let's 619 00:32:02,900 --> 00:32:07,780 say send from u to v, that will match up with the channel that 620 00:32:07,780 --> 00:32:11,310 has send uv as an input. 621 00:32:11,310 --> 00:32:13,980 And the receive from the channel matches up 622 00:32:13,980 --> 00:32:17,570 with the process that has that receive as an input. 623 00:32:17,570 --> 00:32:21,830 All I'm doing is matching up these components. 624 00:32:21,830 --> 00:32:24,480 I'm hooking together these components 625 00:32:24,480 --> 00:32:26,530 by matching up their action names. 626 00:32:26,530 --> 00:32:29,030 Does that make sense? 
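The max process pseudocode described a moment ago might look like this as a Python sketch. The names here (MaxProcess, step_send) are invented for illustration, and starting with all send flags true-- so each process propagates its own initial value before hearing anything-- is an assumption inferred from the behavior discussed below.

```python
# Sketch of the max process automaton: state is max (initially the
# process's own value x_u) plus one Boolean send flag per neighbor.

class MaxProcess:
    def __init__(self, x_u, neighbors):
        self.max = x_u
        # assumption: initially ready to send its own value to everyone
        self.send = {w: True for w in neighbors}

    def on_receive(self, m):
        # input action: if the value is bigger than what we had,
        # reset the max and get ready to tell all neighbors
        if m > self.max:
            self.max = m
            for w in self.send:
                self.send[w] = True

    def step_send(self, w):
        # output action: enabled when the flag for w is set; send the
        # current max to neighbor w and clear the flag
        if self.send[w]:
            self.send[w] = False
            return self.max
        return None
```

A smaller incoming value changes nothing and enables no sends; a larger one resets the max and re-enables a send on every port, which is exactly the propagation behavior the animation shows.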
627 00:32:29,030 --> 00:32:33,110 I'm saying how I build a system out of all these components, 628 00:32:33,110 --> 00:32:38,310 and I just have a syntactic way of saying what actions match up 629 00:32:38,310 --> 00:32:39,850 in different components. 630 00:32:39,850 --> 00:32:40,664 Questions? 631 00:32:40,664 --> 00:32:41,330 This is all new. 632 00:32:45,920 --> 00:32:49,130 When this system takes a step, well, 633 00:32:49,130 --> 00:32:53,030 if somebody's performing an action, 634 00:32:53,030 --> 00:32:54,900 and someone else has that same action-- 635 00:32:54,900 --> 00:32:56,550 let's say a process is doing a send. 636 00:32:56,550 --> 00:32:58,030 The channel has a send. 637 00:32:58,030 --> 00:33:00,850 They both take their transitions at the same time. 638 00:33:00,850 --> 00:33:02,690 The process sends a message, it gets 639 00:33:02,690 --> 00:33:06,900 put into the channel at the end of its queue. 640 00:33:06,900 --> 00:33:09,291 Make sense? 641 00:33:09,291 --> 00:33:11,780 OK. 642 00:33:11,780 --> 00:33:13,050 How does this thing execute? 643 00:33:13,050 --> 00:33:15,490 Well, there's no synchronous rounds, 644 00:33:15,490 --> 00:33:19,170 so the system just operates by the processes 645 00:33:19,170 --> 00:33:22,500 and the channels just perform their steps in any order. 646 00:33:22,500 --> 00:33:25,470 One at a time, but it can be in any order, 647 00:33:25,470 --> 00:33:26,560 so it's a sequence model. 648 00:33:26,560 --> 00:33:30,100 You just have a sequence of individual steps. 649 00:33:30,100 --> 00:33:31,430 There's no concurrency here. 650 00:33:31,430 --> 00:33:34,600 In the synchronous model, we had everybody taking their steps 651 00:33:34,600 --> 00:33:35,910 in one big block. 652 00:33:35,910 --> 00:33:38,860 And here, it's just they take steps one at a time, 653 00:33:38,860 --> 00:33:40,112 but it could be in any order. 654 00:33:44,390 --> 00:33:46,990 And we have to make sure that everybody keeps taking steps. 
655 00:33:46,990 --> 00:33:50,490 That every channel continues to deliver messages, 656 00:33:50,490 --> 00:33:53,450 and every process always performs 657 00:33:53,450 --> 00:33:54,830 some step that's enabled. 658 00:33:58,420 --> 00:34:02,450 For the max processes, well, we can just 659 00:34:02,450 --> 00:34:04,930 have a bunch of processes, each one now starting 660 00:34:04,930 --> 00:34:07,570 with its initial value. 661 00:34:07,570 --> 00:34:09,940 And what happens when we plug them together with all 662 00:34:09,940 --> 00:34:13,647 their channels between them? 663 00:34:13,647 --> 00:34:15,480 Corresponding to whatever graph they are in. 664 00:34:15,480 --> 00:34:18,000 They just have channels on the edges. 665 00:34:18,000 --> 00:34:20,639 What's the behavior of this? 666 00:34:20,639 --> 00:34:25,750 If all the processes are like the ones I just showed you, 667 00:34:25,750 --> 00:34:27,760 they wait till they hear some new max 668 00:34:27,760 --> 00:34:28,860 and then they send it out. 669 00:34:32,270 --> 00:34:34,452 Yeah? 670 00:34:34,452 --> 00:34:35,946 AUDIENCE: All processes eventually 671 00:34:35,946 --> 00:34:37,440 have a globally maximum value. 672 00:34:37,440 --> 00:34:40,600 PROFESSOR: Yeah, they'll all eventually get the global max. 673 00:34:40,600 --> 00:34:43,260 They'll keep propagating until everybody receives it. 674 00:34:43,260 --> 00:34:46,534 Here's an animation if everybody starts with the values that 675 00:34:46,534 --> 00:34:47,742 are written in these circles. 676 00:34:50,500 --> 00:34:52,139 Now, remember, before, they were all 677 00:34:52,139 --> 00:34:54,076 sending at once; now, no more. 678 00:34:54,076 --> 00:34:55,659 Let's say the first thing that happens 679 00:34:55,659 --> 00:34:58,710 is the process that started with five sends its message out 680 00:34:58,710 --> 00:35:02,942 on one of its channels, so the five goes out. 
681 00:35:02,942 --> 00:35:05,150 The next thing that might happen is the other process 682 00:35:05,150 --> 00:35:07,360 with the seven might send the seven out 683 00:35:07,360 --> 00:35:10,110 on one of its channels. 684 00:35:10,110 --> 00:35:11,930 And these are three more steps. 685 00:35:11,930 --> 00:35:12,860 Somebody sends a 10. 686 00:35:12,860 --> 00:35:14,670 Somebody sends a seven. 687 00:35:14,670 --> 00:35:21,150 Somebody received a message and updated its value as a result, 688 00:35:21,150 --> 00:35:22,160 and we continue. 689 00:35:22,160 --> 00:35:24,074 I'm depicting several steps at once, 690 00:35:24,074 --> 00:35:26,240 because it's boring to really do them one at a time, 691 00:35:26,240 --> 00:35:29,990 but the model really says that they take steps in some order. 692 00:35:29,990 --> 00:35:32,980 Everybody is propagating the largest thing it's seen, 693 00:35:32,980 --> 00:35:38,060 and eventually, you wind up with everybody having 694 00:35:38,060 --> 00:35:39,543 the maximum value, the 10. 695 00:35:43,590 --> 00:35:48,180 All right, that's how an asynchronous system operates. 696 00:35:48,180 --> 00:35:51,280 We can analyze the message complexity of this. 697 00:35:51,280 --> 00:35:53,350 The total number of messages sent 698 00:35:53,350 --> 00:35:57,470 during the entire execution: at worst, on every edge, 699 00:35:57,470 --> 00:36:01,300 you can send the successively improved estimates, 700 00:36:01,300 --> 00:36:05,510 so that's, again, order n times the number of edges. 701 00:36:05,510 --> 00:36:09,390 Time complexity is an issue. 702 00:36:09,390 --> 00:36:11,840 When we had synchronous algorithms, 703 00:36:11,840 --> 00:36:15,120 we just counted the number of rounds and that was easy. 704 00:36:15,120 --> 00:36:17,090 But what do we measure now? 705 00:36:17,090 --> 00:36:20,760 How do we count the time when you have all these processes 706 00:36:20,760 --> 00:36:24,140 and channels taking steps whenever they want? 
707 00:36:24,140 --> 00:36:26,060 Yeah, so this really isn't obvious. 708 00:36:26,060 --> 00:36:31,060 There's a solution that's commonly used, which is-- OK, 709 00:36:31,060 --> 00:36:33,800 we're going to use real time. 710 00:36:33,800 --> 00:36:35,920 And we're going to make some assumptions 711 00:36:35,920 --> 00:36:40,234 about certain basic steps taking, at most, 712 00:36:40,234 --> 00:36:41,275 a certain amount of time. 713 00:36:43,890 --> 00:36:47,140 Let's say that local computational-- time 714 00:36:47,140 --> 00:36:51,620 for a process to perform its next step, is little l. 715 00:36:51,620 --> 00:36:54,200 You just give a local time bound. 716 00:36:54,200 --> 00:36:57,940 And then you have d for a channel to deliver one message. 717 00:36:57,940 --> 00:37:01,720 The first message that's currently in the channel. 718 00:37:01,720 --> 00:37:03,300 If you have assumptions like that, 719 00:37:03,300 --> 00:37:07,870 you could use those to infer a real time upper bound 720 00:37:07,870 --> 00:37:09,230 for solving the whole problem. 721 00:37:09,230 --> 00:37:12,180 I mean if you know it takes no longer than d to deliver one 722 00:37:12,180 --> 00:37:15,000 message, then you can bound how long 723 00:37:15,000 --> 00:37:18,940 it takes to deliver-- to empty out a queue, a channel, and how 724 00:37:18,940 --> 00:37:21,520 long it takes for messages to propagate through the network. 725 00:37:27,080 --> 00:37:29,550 It's tricky, but this is about the only thing 726 00:37:29,550 --> 00:37:32,450 you can do in a setting where you don't actually 727 00:37:32,450 --> 00:37:34,515 have rounds to measure. 728 00:37:37,500 --> 00:37:40,420 Then for the max system, how long does it take? 729 00:37:40,420 --> 00:37:42,800 Well, let's just ignore the local processing time. 730 00:37:42,800 --> 00:37:44,930 Usually that's assumed to be very small, 731 00:37:44,930 --> 00:37:47,550 so let's say it's 0. 
732 00:37:47,550 --> 00:37:50,760 We can get a very simple, pessimistic really, 733 00:37:50,760 --> 00:37:54,470 upper bound that says the real time for finishing 734 00:37:54,470 --> 00:37:57,770 the whole thing is of the order of the diameter of the network 735 00:37:57,770 --> 00:38:01,430 times the number of nodes times little 736 00:38:01,430 --> 00:38:04,160 d, where little d is the amount of time for a message queue 737 00:38:04,160 --> 00:38:05,694 to deliver its first message. 738 00:38:10,180 --> 00:38:13,880 As a naive way of analyzing this, 739 00:38:13,880 --> 00:38:16,250 you just consider how long it takes for the max 740 00:38:16,250 --> 00:38:20,720 to reach some particular vertex u along the shortest path. 741 00:38:20,720 --> 00:38:24,510 Well, it has to go through all the hops on the path, 742 00:38:24,510 --> 00:38:27,900 so that would be the diameter. 743 00:38:27,900 --> 00:38:30,810 And how long does it have to wait in each channel 744 00:38:30,810 --> 00:38:33,370 before it gets to move another hop? 745 00:38:33,370 --> 00:38:36,010 Well, it might be at the end of a queue. 746 00:38:36,010 --> 00:38:37,170 How big could the queue be? 747 00:38:37,170 --> 00:38:40,920 Well, at worst n, for the improved estimates. 748 00:38:40,920 --> 00:38:44,640 Let's say it's n times the delivery time on the channel, 749 00:38:44,640 --> 00:38:47,160 just to traverse one channel. 750 00:38:47,160 --> 00:38:49,610 What I'm doing is modeling possible congestion 751 00:38:49,610 --> 00:38:52,820 on the queue to see how long it takes for a message 752 00:38:52,820 --> 00:38:54,500 to traverse one channel. 753 00:38:54,500 --> 00:38:55,180 Yeah? 754 00:38:55,180 --> 00:38:58,076 AUDIENCE: Are we just assuming that processes process things, 755 00:38:58,076 --> 00:39:00,367 and messages are delivered as soon as they can possibly 756 00:39:00,367 --> 00:39:02,904 have them? 757 00:39:02,904 --> 00:39:03,570 PROFESSOR: Good. 
758 00:39:03,570 --> 00:39:07,360 Yeah, we normally have-- I'm sort of skipping over 759 00:39:07,360 --> 00:39:11,810 some things in the model-- but you have a liveness assumption 760 00:39:11,810 --> 00:39:14,470 that says that the process keeps taking steps as long 761 00:39:14,470 --> 00:39:16,700 as it has anything to do, and so we 762 00:39:16,700 --> 00:39:18,780 would be putting time bounds on how long 763 00:39:18,780 --> 00:39:20,720 it takes between those steps. 764 00:39:20,720 --> 00:39:23,220 That'll be the local processing time, here. 765 00:39:23,220 --> 00:39:24,265 I'm saying that's 0. 766 00:39:24,265 --> 00:39:25,765 AUDIENCE: You do that-- wouldn't you 767 00:39:25,765 --> 00:39:29,115 be able to get some amount of information 768 00:39:29,115 --> 00:39:30,490 about what were things happening? 769 00:39:30,490 --> 00:39:32,070 PROFESSOR: Ah, OK. 770 00:39:32,070 --> 00:39:34,440 Here is-- this is a very subtle point. 771 00:39:34,440 --> 00:39:36,806 What do you think is that if I'm making all these timing 772 00:39:36,806 --> 00:39:38,180 assumptions, the processes should 773 00:39:38,180 --> 00:39:39,980 be able to figure that out. 774 00:39:39,980 --> 00:39:43,780 But actually, you can't figure anything out just 775 00:39:43,780 --> 00:39:48,540 based on these assumptions about these upper bounds. 776 00:39:48,540 --> 00:39:51,380 Putting upper bounds on the time between steps 777 00:39:51,380 --> 00:39:54,460 does not in any way restrict the orderings. 778 00:39:54,460 --> 00:39:57,970 You still have all the same possible orderings of steps. 779 00:39:57,970 --> 00:39:59,710 Nobody can see anything different. 780 00:39:59,710 --> 00:40:01,770 These times are not visible in any sense. 781 00:40:01,770 --> 00:40:05,226 They're not marked anywhere for the processes to read. 782 00:40:05,226 --> 00:40:06,600 They're just something that we're 783 00:40:06,600 --> 00:40:12,730 using to evaluate the cost, the time cost, of the execution. 
784 00:40:12,730 --> 00:40:17,240 And you're not restricting the execution 785 00:40:17,240 --> 00:40:20,090 by putting just upper bounds on the time. 786 00:40:20,090 --> 00:40:21,800 If you were restricting the execution, 787 00:40:21,800 --> 00:40:24,830 like you had upper bounds and lower bounds on the time, then 788 00:40:24,830 --> 00:40:26,780 the processes might know a lot more. 789 00:40:26,780 --> 00:40:33,030 They might, in fact, be acting more like a synchronous system. 790 00:40:33,030 --> 00:40:36,300 These are times that are just used for analyzing complexity. 791 00:40:36,300 --> 00:40:40,310 They're not anything known to the processes in the system. 792 00:40:40,310 --> 00:40:41,860 OK? 793 00:40:41,860 --> 00:40:42,980 All right. 794 00:40:42,980 --> 00:40:45,260 Now let's revisit the breadth-first spanning tree 795 00:40:45,260 --> 00:40:47,750 problem. 796 00:40:47,750 --> 00:40:50,130 We want to now compute a breadth-first spanning tree 797 00:40:50,130 --> 00:40:53,710 in an asynchronous network. 798 00:40:53,710 --> 00:40:55,640 Connected graph, distinguished root vertex, 799 00:40:55,640 --> 00:40:57,900 processes have no knowledge of the graph. 800 00:40:57,900 --> 00:40:59,420 They have UIDs. 801 00:40:59,420 --> 00:41:03,320 All this is the same as before in the synchronous case. 802 00:41:03,320 --> 00:41:06,900 Everybody's supposed to output its parent information 803 00:41:06,900 --> 00:41:07,826 when it's done. 804 00:41:11,480 --> 00:41:14,270 Here's an idea. 805 00:41:14,270 --> 00:41:17,820 Suppose we just take that nice simple synchronous algorithm 806 00:41:17,820 --> 00:41:20,870 that I reviewed at the beginning of the hour where everybody 807 00:41:20,870 --> 00:41:23,100 just sends a search message as soon as they get it 808 00:41:23,100 --> 00:41:26,550 and they just adopt the first parent they see. 809 00:41:26,550 --> 00:41:28,300 What happens if I run that asynchronously? 
810 00:41:31,940 --> 00:41:34,372 I just send it when I get a search message. 811 00:41:34,372 --> 00:41:36,080 Then I send it out to my neighbors whenever, 812 00:41:36,080 --> 00:41:38,956 and everybody's doing this. 813 00:41:38,956 --> 00:41:40,830 Whenever they get their first search message, 814 00:41:40,830 --> 00:41:42,980 they decide the sender is their parent. 815 00:41:42,980 --> 00:41:43,980 Yeah? 816 00:41:43,980 --> 00:41:46,980 AUDIENCE: Could you possibly have a frontier 817 00:41:46,980 --> 00:41:51,382 that doesn't keep expanding-- not obeying the defining 818 00:41:51,382 --> 00:41:53,196 property of the BFS? 819 00:41:53,196 --> 00:41:54,570 PROFESSOR: Yeah, it could be that, 820 00:41:54,570 --> 00:41:56,910 because we don't have any restriction on how 821 00:41:56,910 --> 00:42:01,580 fast the messages might be sent and the order of steps, 822 00:42:01,580 --> 00:42:03,800 it could be that some messages get sent 823 00:42:03,800 --> 00:42:06,340 very quickly on some long path. 824 00:42:06,340 --> 00:42:08,940 Someone sitting at the far end of the network might 825 00:42:08,940 --> 00:42:10,820 get a message first on a long path 826 00:42:10,820 --> 00:42:13,230 and later on a short path, and the first one 827 00:42:13,230 --> 00:42:17,740 it gets determines its parent, and then it's stuck. 828 00:42:17,740 --> 00:42:22,170 This is not an algorithm that makes corrections. 829 00:42:22,170 --> 00:42:24,740 OK? 830 00:42:24,740 --> 00:42:28,110 All right, well, before we had a little bit of non-determinism 831 00:42:28,110 --> 00:42:31,170 when we had this algorithm in the synchronous case 832 00:42:31,170 --> 00:42:34,610 because we could have two messages arriving at once, 833 00:42:34,610 --> 00:42:36,782 and you have to pick one to be your parent. 
834 00:42:36,782 --> 00:42:38,490 Now that doesn't happen, because you only 835 00:42:38,490 --> 00:42:42,010 get one message at a time, but you have a lot more 836 00:42:42,010 --> 00:42:44,260 non-determinism now. 837 00:42:44,260 --> 00:42:46,470 Now you have all this non-determinism 838 00:42:46,470 --> 00:42:50,060 from the order in which the messages get sent and processes 839 00:42:50,060 --> 00:42:51,844 take their steps. 840 00:42:51,844 --> 00:42:53,260 There's plenty of non-determinism, 841 00:42:53,260 --> 00:42:56,670 and remember, the way we treat non-determinism for distributed 842 00:42:56,670 --> 00:42:58,530 algorithms is it's supposed to work 843 00:42:58,530 --> 00:43:01,990 regardless of how those non-deterministic choices get 844 00:43:01,990 --> 00:43:03,930 made. 845 00:43:03,930 --> 00:43:05,260 How would we describe? 846 00:43:05,260 --> 00:43:07,160 I'll just write some pseudo code here 847 00:43:07,160 --> 00:43:10,270 for a process that's just mimicking 848 00:43:10,270 --> 00:43:11,910 the trivial algorithm. 849 00:43:11,910 --> 00:43:14,340 It can receive a search message, that's the inputs, 850 00:43:14,340 --> 00:43:18,560 and it can send a search message as its output. 851 00:43:18,560 --> 00:43:23,500 It can also output parent when it's ready to do that. 852 00:43:23,500 --> 00:43:25,020 What does it have to keep track of? 853 00:43:25,020 --> 00:43:28,800 Well, it keeps track of its parent, 854 00:43:28,800 --> 00:43:31,850 keeps track of whether it's reported its parent, 855 00:43:31,850 --> 00:43:36,140 and it has some send buffers with messages 856 00:43:36,140 --> 00:43:38,580 it's ready to send to its neighbors, 857 00:43:38,580 --> 00:43:42,400 could be search messages or nothing. 858 00:43:42,400 --> 00:43:45,870 This bottom symbol is just a placeholder symbol. 859 00:43:50,230 --> 00:43:54,060 What happens when the process receives a search message, just 860 00:43:54,060 --> 00:43:56,770 following the simple algorithm? 
861 00:43:56,770 --> 00:44:00,890 Well, if it doesn't have a parent yet, 862 00:44:00,890 --> 00:44:03,850 then it sets its parent to be the sender of that search 863 00:44:03,850 --> 00:44:07,650 message, and it gets ready to send the search message 864 00:44:07,650 --> 00:44:10,320 to all of its neighbors. 865 00:44:10,320 --> 00:44:13,310 That's really the heart of the algorithm 866 00:44:13,310 --> 00:44:16,010 that you saw for the simple algorithm 867 00:44:16,010 --> 00:44:19,580 in the synchronous case, so that's the same code. 868 00:44:19,580 --> 00:44:22,410 The rest of the code is it just sends the search messages out 869 00:44:22,410 --> 00:44:25,750 when it's got them in the send buffers, 870 00:44:25,750 --> 00:44:29,290 and it can announce its parent once the parent is set, 871 00:44:29,290 --> 00:44:33,140 and then it doesn't-- this flag just means it doesn't keep 872 00:44:33,140 --> 00:44:36,300 announcing it over and over again. 873 00:44:36,300 --> 00:44:37,930 It makes sense that this describes 874 00:44:37,930 --> 00:44:40,380 that simple algorithm. 875 00:44:40,380 --> 00:44:42,220 It's pretty concise, just what does 876 00:44:42,220 --> 00:44:46,320 it do in all these different steps. 877 00:44:46,320 --> 00:44:48,850 OK? 878 00:44:48,850 --> 00:44:50,970 Now, if you run this asynchronously, 879 00:44:50,970 --> 00:44:53,120 as you already noted, it isn't going 880 00:44:53,120 --> 00:44:56,010 to necessarily work right. 881 00:44:56,010 --> 00:44:59,080 You can have this guy sending search messages, 882 00:44:59,080 --> 00:45:03,420 but some are going to arrive faster than others. 883 00:45:03,420 --> 00:45:07,530 And you see you can have the search messages creeping around 884 00:45:07,530 --> 00:45:10,530 in an indirect path, which causes 885 00:45:10,530 --> 00:45:13,090 a spanning tree like this one to be created, 886 00:45:13,090 --> 00:45:15,600 which is definitely not a breadth-first spanning tree.
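The trivial-algorithm process just described can be mirrored in runnable form. This is only a sketch: the class name, method names, and the string "search" standing in for the message are my own, not from the lecture, which gives this as pseudocode with a bottom placeholder for empty send slots.

```python
class SearchProcess:
    """Sketch of the trivial asynchronous search process (names are mine)."""

    def __init__(self, uid, neighbors, is_root=False):
        self.uid = uid
        self.is_root = is_root
        self.parent = None       # set by the first search message received
        self.reported = False    # so the parent is announced only once
        # one send slot per neighbor: "search" when ready, None (bottom) otherwise
        self.send_buf = {v: ("search" if is_root else None) for v in neighbors}

    def receive_search(self, sender):
        # first search message wins; later ones are ignored (no corrections)
        if not self.is_root and self.parent is None:
            self.parent = sender
            for v in self.send_buf:
                self.send_buf[v] = "search"

    def output_parent(self):
        # announce the parent once it is set, then stop announcing
        if self.parent is not None and not self.reported:
            self.reported = True
            return self.parent
        return None
```

Whichever search message happens to arrive first fixes the parent for good, which is exactly why an asynchronous schedule can produce a non-BFS tree.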
887 00:45:15,600 --> 00:45:20,830 The breadth-first tree is this one-- a breadth-first tree. 888 00:45:20,830 --> 00:45:23,150 You have these roundabout paths. 889 00:45:23,150 --> 00:45:25,340 This doesn't work. 890 00:45:25,340 --> 00:45:27,350 What do we do? 891 00:45:27,350 --> 00:45:28,322 Yeah? 892 00:45:28,322 --> 00:45:30,810 AUDIENCE: You could have a child [INAUDIBLE]. 893 00:45:40,063 --> 00:45:42,916 PROFESSOR: You're going to try to synchronize them, basically. 894 00:45:45,642 --> 00:45:47,442 AUDIENCE: [INAUDIBLE]. 895 00:45:47,442 --> 00:45:50,190 PROFESSOR: That is related to a homework question this week. 896 00:45:50,190 --> 00:45:51,850 Very good. 897 00:45:51,850 --> 00:45:53,600 We're going to do something different. 898 00:45:53,600 --> 00:45:55,360 Other suggestions? 899 00:45:55,360 --> 00:45:56,037 Yeah? 900 00:45:56,037 --> 00:46:00,510 AUDIENCE: We could keep a variable in each process 901 00:46:00,510 --> 00:46:03,971 that-- you can do something like Bellman-Ford. 902 00:46:03,971 --> 00:46:06,470 PROFESSOR: Yeah, so you can do what we did for Bellman-Ford, 903 00:46:06,470 --> 00:46:08,480 but now the setting is completely different. 904 00:46:08,480 --> 00:46:10,880 We did it for Bellman-Ford when it was synchronous 905 00:46:10,880 --> 00:46:12,160 and we had weights. 906 00:46:12,160 --> 00:46:15,050 Now, it's asynchronous and there are no weights. 907 00:46:15,050 --> 00:46:17,300 OK? 908 00:46:17,300 --> 00:46:20,810 Oh, there's some remarks on a couple of slides here. 909 00:46:20,810 --> 00:46:22,710 This is just belaboring the point, 910 00:46:22,710 --> 00:46:24,810 that the paths that you get by this algorithm 911 00:46:24,810 --> 00:46:27,168 can be longer than the shortest paths, 912 00:46:27,168 --> 00:46:32,085 and-- yeah, you can analyze the message and time complexity. 913 00:46:40,740 --> 00:46:44,570 The complexity here is order of the diameter times the message 914 00:46:44,570 --> 00:46:48,120 delay on one link. 
915 00:46:48,120 --> 00:46:53,600 And why is it the diameter even though some paths 916 00:46:53,600 --> 00:46:54,500 may be very long? 917 00:46:57,280 --> 00:47:00,670 This is a real time upper bound that 918 00:47:00,670 --> 00:47:03,960 depends on the actual diameter of the graph, 919 00:47:03,960 --> 00:47:06,764 not on the total number of nodes. 920 00:47:09,370 --> 00:47:13,540 Why would an upper bound on the running time for the simple 921 00:47:13,540 --> 00:47:15,782 algorithm depend on the diameter? 922 00:47:21,420 --> 00:47:21,920 Yeah? 923 00:47:21,920 --> 00:47:23,935 AUDIENCE: Because you only have a longer path 924 00:47:23,935 --> 00:47:24,909 if it's faster than-- 925 00:47:24,909 --> 00:47:25,700 PROFESSOR: Exactly. 926 00:47:25,700 --> 00:47:27,540 The way we're modeling it-- it's a little strange, 927 00:47:27,540 --> 00:47:29,360 maybe-- but we're saying that something 928 00:47:29,360 --> 00:47:32,910 can travel on a long path only if it's going very fast. 929 00:47:32,910 --> 00:47:37,770 But the actual shortest paths still move along, at worst, 930 00:47:37,770 --> 00:47:41,750 at d time per hop. 931 00:47:41,750 --> 00:47:43,970 At worst, the shortest path information 932 00:47:43,970 --> 00:47:46,300 would get there within time d. 933 00:47:46,300 --> 00:47:48,350 Something is going to get there within time d, 934 00:47:48,350 --> 00:47:51,430 even though something else might get there faster. 935 00:47:51,430 --> 00:47:52,490 OK? 936 00:47:52,490 --> 00:47:55,050 All right. 937 00:47:55,050 --> 00:47:56,950 Yes, we can set up child pointers, 938 00:47:56,950 --> 00:48:01,540 and we can do termination using convergecast. 939 00:48:01,540 --> 00:48:03,910 It's just one tree. 940 00:48:03,910 --> 00:48:06,930 There's nothing changing here. 941 00:48:06,930 --> 00:48:09,860 And applications, same as before.
942 00:48:09,860 --> 00:48:12,040 But that didn't work, so we're going 943 00:48:12,040 --> 00:48:13,980 to-- back to the point we were talking 944 00:48:13,980 --> 00:48:16,284 about a minute ago-- we're going to use a relaxation 945 00:48:16,284 --> 00:48:17,450 algorithm like Bellman-Ford. 946 00:48:19,960 --> 00:48:22,330 In the synchronous case, we corrected 947 00:48:22,330 --> 00:48:26,190 for paths that had many hops but low weight, 948 00:48:26,190 --> 00:48:29,760 but now we're going to correct for the asynchrony errors. 949 00:48:29,760 --> 00:48:32,590 All the errors that you get because of things traveling 950 00:48:32,590 --> 00:48:35,850 fast on long paths, we're going to correct for those 951 00:48:35,850 --> 00:48:39,960 using the same strategy. 952 00:48:39,960 --> 00:48:43,220 Everybody is going to keep track of the hop distance. 953 00:48:43,220 --> 00:48:45,680 No weights now just the hop distance, 954 00:48:45,680 --> 00:48:47,180 and they will change the parent when 955 00:48:47,180 --> 00:48:49,715 they learn of a shorter path. 956 00:48:49,715 --> 00:48:51,840 And then they will propagate the improved distance, 957 00:48:51,840 --> 00:48:54,800 so it's exactly like Bellman-Ford. 958 00:48:54,800 --> 00:48:56,430 And eventually, this will stabilize 959 00:48:56,430 --> 00:49:00,030 to an actual breadth-first spanning tree. 960 00:49:00,030 --> 00:49:03,140 Here's a description of this new algorithm. 961 00:49:03,140 --> 00:49:05,620 Everybody keeps track of their parent, 962 00:49:05,620 --> 00:49:09,220 and now they keep track of their hop distance, 963 00:49:09,220 --> 00:49:11,240 and they have their channels. 964 00:49:11,240 --> 00:49:14,670 Here's the key, when you get new information-- you receive 965 00:49:14,670 --> 00:49:16,790 some new information, this m is a number 966 00:49:16,790 --> 00:49:20,630 of hops that your neighbor is telling you. 
967 00:49:20,630 --> 00:49:22,870 Well, if m plus 1, which would be 968 00:49:22,870 --> 00:49:24,410 your new estimate for the number of 969 00:49:24,410 --> 00:49:27,940 hops-- if that's better than what you already have, 970 00:49:27,940 --> 00:49:30,000 then you just replace your estimate 971 00:49:30,000 --> 00:49:33,340 by this new number of hops. 972 00:49:33,340 --> 00:49:36,690 And you set your-- to a new parent. 973 00:49:36,690 --> 00:49:39,910 You set your parent pointer to the sender, 974 00:49:39,910 --> 00:49:42,420 and you propagate this information. 975 00:49:42,420 --> 00:49:45,140 It's exactly the same as what we had before, 976 00:49:45,140 --> 00:49:49,810 but now we're correcting for the asynchrony. 977 00:49:49,810 --> 00:49:52,430 We get shorter hop paths later. 978 00:49:52,430 --> 00:49:53,082 Makes sense? 979 00:49:55,750 --> 00:49:58,400 And the rest of this is just you send the message. 980 00:49:58,400 --> 00:50:00,640 And notice we don't have any terminating actions. 981 00:50:00,640 --> 00:50:01,140 Why? 982 00:50:01,140 --> 00:50:02,806 Because we have the same problem that we 983 00:50:02,806 --> 00:50:06,750 had before with having processes know when they're done. 984 00:50:06,750 --> 00:50:08,600 If you keep getting corrections, 985 00:50:08,600 --> 00:50:10,141 how do you know when you're finished? 986 00:50:12,440 --> 00:50:14,830 And how do you know this is going to work? 987 00:50:14,830 --> 00:50:16,780 In the synchronous case, we could 988 00:50:16,780 --> 00:50:20,970 get an exact characterization of what exactly 989 00:50:20,970 --> 00:50:24,714 the situation is after any number of rounds. 990 00:50:24,714 --> 00:50:26,380 And we can't do that now, because things 991 00:50:26,380 --> 00:50:29,220 can happen in so many orders.
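The relaxation step just described, with the receive handler doing all the work, might be sketched like this. The identifiers are my own; the lecture presents this only as pseudocode.

```python
import math

class RelaxingBFSProcess:
    """Sketch of the relaxation-based asynchronous BFS process (names mine)."""

    def __init__(self, uid, neighbors, is_root=False):
        self.neighbors = list(neighbors)
        self.parent = None
        self.dist = 0 if is_root else math.inf    # current hop-count estimate
        # messages waiting to be sent, as (neighbor, estimate) pairs
        self.outbox = [(v, 0) for v in self.neighbors] if is_root else []

    def receive(self, m, sender):
        # relaxation step: m + 1 is the hop count via the sender
        if m + 1 < self.dist:
            self.dist = m + 1
            self.parent = sender   # switch to the better parent
            # propagate the improved estimate to every neighbor
            self.outbox.extend((v, self.dist) for v in self.neighbors)
```

A later, smaller estimate simply overwrites an earlier one, which is how the asynchrony errors get corrected.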
992 00:50:29,220 --> 00:50:32,850 We have to, instead, state some higher level 993 00:50:32,850 --> 00:50:35,730 abstract properties, either invariants 994 00:50:35,730 --> 00:50:39,790 or, as you'll see, some other kinds of properties as well. 995 00:50:42,670 --> 00:50:45,310 We could say, for instance, as an invariant, 996 00:50:45,310 --> 00:50:48,076 that all the distance information you get is correct. 997 00:50:50,650 --> 00:50:53,780 If you ever have your distance set to something, 998 00:50:53,780 --> 00:50:55,910 it's the actual distance on some path 999 00:50:55,910 --> 00:51:01,070 and your parent is correctly set to be your predecessor on such 1000 00:51:01,070 --> 00:51:02,720 a path. 1001 00:51:02,720 --> 00:51:05,790 This is just saying whatever you get is correct information. 1002 00:51:05,790 --> 00:51:10,190 This doesn't say that eventually you're going to finish, though. 1003 00:51:10,190 --> 00:51:14,360 It just says what you get is correct. 1004 00:51:14,360 --> 00:51:18,240 If you want to show that, eventually, you 1005 00:51:18,240 --> 00:51:20,350 get the right answer, you have to do something 1006 00:51:20,350 --> 00:51:21,030 with the timing. 1007 00:51:21,030 --> 00:51:27,230 You have to say something like by a certain time that 1008 00:51:27,230 --> 00:51:31,340 depends on the distance. 1009 00:51:31,340 --> 00:51:35,510 If there is an at most r hop path to a node, 1010 00:51:35,510 --> 00:51:39,160 then it will learn about that by a certain time, 1011 00:51:39,160 --> 00:51:44,470 but that depends on the length of the path and the message 1012 00:51:44,470 --> 00:51:45,620 delivery time. 1013 00:51:45,620 --> 00:51:49,280 And on the number of nodes, because there can be congestion.
1014 00:51:49,280 --> 00:51:51,480 You have to not only say things are correct, 1015 00:51:51,480 --> 00:51:55,130 but you have to say, eventually, you get the right result, 1016 00:51:55,130 --> 00:51:57,330 and here it will say by a certain time you get 1017 00:51:57,330 --> 00:52:00,270 the right result. Makes sense? 1018 00:52:00,270 --> 00:52:04,900 This is how you would understand an algorithm like this one. 1019 00:52:04,900 --> 00:52:07,440 Message complexity. 1020 00:52:07,440 --> 00:52:08,940 Since there's all these corrections, 1021 00:52:08,940 --> 00:52:11,770 you're back in number of edges times possibly 1022 00:52:11,770 --> 00:52:13,420 the number of nodes. 1023 00:52:13,420 --> 00:52:18,520 And the time complexity, till all the distances and parent 1024 00:52:18,520 --> 00:52:21,660 values stabilize, could be-- this 1025 00:52:21,660 --> 00:52:23,830 is pessimistic again-- the diameter 1026 00:52:23,830 --> 00:52:27,270 times the number of nodes times d, because there 1027 00:52:27,270 --> 00:52:30,261 can be congestion in each of the links because 1028 00:52:30,261 --> 00:52:31,052 of the corrections. 1029 00:52:35,300 --> 00:52:38,000 How do you know when this is done? 1030 00:52:38,000 --> 00:52:41,205 How can a process know when it can finish? 1031 00:52:41,205 --> 00:52:41,705 Idea? 1032 00:52:47,680 --> 00:52:50,860 Before we had said, well, if you knew n, 1033 00:52:50,860 --> 00:52:52,540 if you knew the number of nodes, you 1034 00:52:52,540 --> 00:52:54,850 could wait that number of rounds. 1035 00:52:54,850 --> 00:52:57,800 That doesn't even help you here. 1036 00:52:57,800 --> 00:53:00,050 Even if you know-- have a good upper bound 1037 00:53:00,050 --> 00:53:02,880 on the number of nodes in the network, 1038 00:53:02,880 --> 00:53:06,390 there's no rounds to count. 1039 00:53:06,390 --> 00:53:08,101 You can't tell.
1040 00:53:08,101 --> 00:53:10,100 Even knowing the number of nodes you can't tell, 1041 00:53:10,100 --> 00:53:13,595 so how might you detect termination? 1042 00:53:13,595 --> 00:53:15,575 Yep? 1043 00:53:15,575 --> 00:53:19,050 AUDIENCE: It could bound on the diameter of [INAUDIBLE]. 1044 00:53:19,050 --> 00:53:20,950 PROFESSOR: Yeah, well, but even if you know 1045 00:53:20,950 --> 00:53:24,324 that, you can't count time. 1046 00:53:24,324 --> 00:53:26,490 See, this is the thing about asynchronous algorithms, 1047 00:53:26,490 --> 00:53:28,810 you don't have-- although we're using time 1048 00:53:28,810 --> 00:53:33,040 to measure how long its termination takes, 1049 00:53:33,040 --> 00:53:36,520 we-- the processes in there don't have that. 1050 00:53:36,520 --> 00:53:39,450 They're just these asynchronous guys who just take their steps. 1051 00:53:39,450 --> 00:53:44,480 They're ignorant of anything to do with time. 1052 00:53:44,480 --> 00:53:45,512 Other ideas? 1053 00:53:45,512 --> 00:53:46,012 Yeah. 1054 00:53:46,012 --> 00:53:47,720 AUDIENCE: Couldn't you use the same converge kind of thing-- 1055 00:53:47,720 --> 00:53:48,730 PROFESSOR: Same thing. 1056 00:53:48,730 --> 00:53:51,660 We're just going to use convergecast again, same idea. 1057 00:53:51,660 --> 00:53:54,940 You just compute and recompute your child pointers. 1058 00:53:54,940 --> 00:53:58,290 You send a done to your current parent, after you've gotten 1059 00:53:58,290 --> 00:54:00,250 responses to all your messages, so you 1060 00:54:00,250 --> 00:54:01,510 think you know your children. 1061 00:54:01,510 --> 00:54:03,780 And they've all told you they're done 1062 00:54:03,780 --> 00:54:06,150 and-- but then you might have to make corrections, 1063 00:54:06,150 --> 00:54:09,650 so as in what we saw before, you can be involved 1064 00:54:09,650 --> 00:54:13,140 in this convergecast several times until it finally reaches 1065 00:54:13,140 --> 00:54:14,373 all the way to the root.
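The convergecast condition, reduced to its core and checked on a tree that has already stabilized, might look like this. The child map and the recursion are my own simplification; the real algorithm interleaves this with corrections and can restart subtrees.

```python
def subtree_done(children, locally_done, node):
    """Convergecast condition: a node reports done to its parent only
    when it is locally done and every child's subtree has reported done.

    children: dict mapping a node to its list of children in the tree.
    locally_done: dict mapping a node to whether it has finished locally.
    """
    return locally_done[node] and all(
        subtree_done(children, locally_done, c)
        for c in children.get(node, [])
    )
```

When this returns True at the root, the root knows the whole tree has reported done.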
1066 00:54:18,160 --> 00:54:20,980 Once you have these, you can use them the same way as before. 1067 00:54:20,980 --> 00:54:23,110 You now have costs that are better 1068 00:54:23,110 --> 00:54:26,990 than this simple tree that didn't have shortest 1069 00:54:26,990 --> 00:54:30,050 paths, because it now takes you less time to use 1070 00:54:30,050 --> 00:54:34,050 the tree for computing functions or disseminating information 1071 00:54:34,050 --> 00:54:35,704 because the tree is shallower. 1072 00:54:39,950 --> 00:54:44,070 Finally, what happens when we want 1073 00:54:44,070 --> 00:54:50,420 to find shortest path trees in an asynchronous setting? 1074 00:54:50,420 --> 00:54:52,510 Now, we're going to add, to the complications 1075 00:54:52,510 --> 00:54:54,950 that we just saw with all the asynchrony, 1076 00:54:54,950 --> 00:54:59,790 the complications of having weights on the edges. 1077 00:54:59,790 --> 00:55:04,380 All right, the problem is to get a shortest path spanning tree, 1078 00:55:04,380 --> 00:55:11,030 now in an asynchronous network, weighted graph, 1079 00:55:11,030 --> 00:55:14,160 processes don't know about the graph again. 1080 00:55:14,160 --> 00:55:20,800 They have UIDs, and everybody's supposed to output its distance 1081 00:55:20,800 --> 00:55:22,102 and parent in the tree. 1082 00:55:27,320 --> 00:55:30,215 We're going to use another relaxation algorithm. 1083 00:55:33,400 --> 00:55:35,380 Now think about what the relaxation 1084 00:55:35,380 --> 00:55:37,052 is going to be doing for you. 1085 00:55:40,790 --> 00:55:45,060 We have two kinds of corrections to make. 1086 00:55:45,060 --> 00:55:48,110 You could have long paths that have small weight. 1087 00:55:48,110 --> 00:55:51,740 That showed up for Bellman-Ford in the synchronous setting, 1088 00:55:51,740 --> 00:55:53,980 so we have to correct for those.
1089 00:55:53,980 --> 00:55:56,590 But you could also have-- because of asynchrony, 1090 00:55:56,590 --> 00:56:01,212 you could have information travelling fast on many hops, 1091 00:56:01,212 --> 00:56:02,920 and you have to correct for that as well. 1092 00:56:02,920 --> 00:56:04,480 There's two kinds of things you're 1093 00:56:04,480 --> 00:56:06,636 going to be correcting for in one algorithm. 1094 00:56:10,000 --> 00:56:13,140 This is going to-- and it's pretty surprising-- it's 1095 00:56:13,140 --> 00:56:17,030 going to lead to ridiculously high complexity, message 1096 00:56:17,030 --> 00:56:18,980 and time complexity. 1097 00:56:18,980 --> 00:56:22,250 If you really have unbridled asynchrony and weights, 1098 00:56:22,250 --> 00:56:25,640 this is going to give you a very costly algorithm. 1099 00:56:25,640 --> 00:56:27,810 You're going to see some exponentials are 1100 00:56:27,810 --> 00:56:28,810 going to creep in there. 1101 00:56:33,200 --> 00:56:37,330 Here's the asynchronous Bellman-Ford 1102 00:56:37,330 --> 00:56:38,300 algorithm. 1103 00:56:38,300 --> 00:56:41,500 Everyone keeps track of their parent, 1104 00:56:41,500 --> 00:56:45,040 their conjectured distance, and they 1105 00:56:45,040 --> 00:56:49,194 have, now, messages that they're going to send to the neighbors. 1106 00:56:49,194 --> 00:56:50,860 Let's say you have a queue because there 1107 00:56:50,860 --> 00:56:54,230 could be successive estimates. 1108 00:56:54,230 --> 00:56:57,500 We'll have a queue there. 1109 00:56:57,500 --> 00:56:59,640 The key step, the relaxation step, 1110 00:56:59,640 --> 00:57:05,440 is what happens when you receive a new estimate of the best 1111 00:57:05,440 --> 00:57:05,940 distance. 1112 00:57:05,940 --> 00:57:08,890 This is weighted distance, now, from a neighbor.
1113 00:57:08,890 --> 00:57:11,250 Well, you look at that distance plus the weight 1114 00:57:11,250 --> 00:57:13,190 of the edge in between, and you see 1115 00:57:13,190 --> 00:57:15,810 if that's better than your current distance, 1116 00:57:15,810 --> 00:57:18,810 just like synchronous Bellman-Ford. 1117 00:57:18,810 --> 00:57:23,810 And if it is, then you improve your distance, 1118 00:57:23,810 --> 00:57:28,630 reset your parent, and send the distance out to your neighbors. 1119 00:57:28,630 --> 00:57:31,890 It's exactly like the synchronous case, 1120 00:57:31,890 --> 00:57:36,460 but we're going to be running this asynchronously. 1121 00:57:36,460 --> 00:57:41,407 And since you're going to be correcting every time you see 1122 00:57:41,407 --> 00:57:42,990 a new estimate, this is actually going 1123 00:57:42,990 --> 00:57:45,170 to handle both kinds of corrections, 1124 00:57:45,170 --> 00:57:50,700 whether it comes because of a many hop path with a smaller 1125 00:57:50,700 --> 00:57:53,800 weight, or whether it just comes because of the asynchrony. 1126 00:57:53,800 --> 00:57:55,440 Whenever you get a better estimate, 1127 00:57:55,440 --> 00:57:58,084 you're going to make the correction. 1128 00:57:58,084 --> 00:57:59,500 Is it clear this is all you really 1129 00:57:59,500 --> 00:58:02,527 need to do in the algorithm, just what's in this code? 1130 00:58:07,409 --> 00:58:09,700 That's the received, and then the rest of the algorithm 1131 00:58:09,700 --> 00:58:12,962 is just you send it out when you're ready to send. 1132 00:58:15,480 --> 00:58:18,610 And then we have the same issue about termination, 1133 00:58:18,610 --> 00:58:23,350 there's no terminating actions. 1134 00:58:23,350 --> 00:58:24,350 We'll come back to that. 1135 00:58:27,580 --> 00:58:30,200 It's really hard to come up with invariants and timing 1136 00:58:30,200 --> 00:58:33,110 properties, now, for this setting. 
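The asynchronous Bellman-Ford relaxation step might be sketched like this. The identifiers are my own; the lecture gives this step only as pseudocode, and the per-neighbor queue reflects the fact that successive estimates can pile up.

```python
import math

class BellmanFordProcess:
    """Sketch of the asynchronous Bellman-Ford process (names are mine)."""

    def __init__(self, uid, weights, is_source=False):
        self.weights = dict(weights)       # neighbor -> edge weight
        self.parent = None
        self.dist = 0 if is_source else math.inf
        # queue of (neighbor, estimate) messages still to be sent;
        # a list, not a single slot, because estimates can pile up
        self.queue = [(v, 0) for v in self.weights] if is_source else []

    def receive(self, m, sender):
        # relaxation step: sender's estimate plus the connecting edge weight
        candidate = m + self.weights[sender]
        if candidate < self.dist:
            self.dist = candidate
            self.parent = sender
            self.queue.extend((v, self.dist) for v in self.weights)
```

The same comparison corrects both kinds of errors: a low-weight many-hop path discovered late, and fast information that arrived over a long path.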
1137 00:58:33,110 --> 00:58:35,170 You can certainly have an invariant like the one 1138 00:58:35,170 --> 00:58:38,180 that we just had for asynchronous breadth-first 1139 00:58:38,180 --> 00:58:39,520 search. 1140 00:58:39,520 --> 00:58:41,870 You can say that at any point, whatever 1141 00:58:41,870 --> 00:58:44,520 distance you have is an actual distance that's 1142 00:58:44,520 --> 00:58:50,010 achievable along some path, and the parent is correct. 1143 00:58:50,010 --> 00:58:51,510 But we'd also like to have something 1144 00:58:51,510 --> 00:58:53,718 that says, eventually, you'll get the right distance. 1145 00:58:56,360 --> 00:58:58,640 You want to state a timing property that says, 1146 00:58:58,640 --> 00:59:02,950 fine, if you have an at most r hop path, by a certain time, 1147 00:59:02,950 --> 00:59:06,650 you'd like to know that your distance is at least as 1148 00:59:06,650 --> 00:59:10,040 good as what you could get on that path. 1149 00:59:10,040 --> 00:59:11,710 The problem is what are you going 1150 00:59:11,710 --> 00:59:14,340 to have here for the amount of time? 1151 00:59:14,340 --> 00:59:16,960 How long would it possibly take in order 1152 00:59:16,960 --> 00:59:21,640 to get the best estimate that you could 1153 00:59:21,640 --> 00:59:24,244 for a path of at most r hops? 1154 00:59:28,846 --> 00:59:29,346 A guess? 1155 00:59:32,360 --> 00:59:34,440 I was able to calculate something reasonable 1156 00:59:34,440 --> 00:59:37,660 for the breadth-first search case, 1157 00:59:37,660 --> 00:59:41,190 but now this is going to be much, much worse. 1158 00:59:41,190 --> 00:59:42,380 It's not obvious at all. 1159 00:59:42,380 --> 00:59:46,870 It's actually going to depend on how many messages could pile up 1160 00:59:46,870 --> 00:59:50,430 in a channel, and that can be an awful lot, 1161 00:59:50,430 --> 00:59:52,835 an exponential number-- exponential in the number 1162 00:59:52,835 --> 00:59:53,835 of nodes in the network. 
1163 01:00:04,340 --> 01:00:10,380 I'm going to produce an execution for you that 1164 01:00:10,380 --> 01:00:13,790 can generate a huge number of messages, which then will take 1165 01:00:13,790 --> 01:00:17,720 a long time to deliver and delay the termination 1166 01:00:17,720 --> 01:00:19,021 of the algorithm. 1167 01:00:19,021 --> 01:00:20,520 First, let's look at an upper bound. 1168 01:00:20,520 --> 01:00:23,460 What can we say for an upper bound? 1169 01:00:23,460 --> 01:00:25,490 Well, there's many different paths 1170 01:00:25,490 --> 01:00:29,960 from v 0 to any other particular node. 1171 01:00:29,960 --> 01:00:36,000 We might have to traverse all the simple paths in the graph, 1172 01:00:36,000 --> 01:00:37,540 perhaps. 1173 01:00:37,540 --> 01:00:40,480 And how many are there? 1174 01:00:40,480 --> 01:00:45,390 Well, as an upper bound, you could say order n factorial 1175 01:00:45,390 --> 01:00:47,100 for the number of different paths 1176 01:00:47,100 --> 01:00:50,130 that you can traverse to get from v 0 1177 01:00:50,130 --> 01:00:53,500 to some particular other node. 1178 01:00:53,500 --> 01:00:56,230 That's exponential in n. 1179 01:00:56,230 --> 01:00:57,830 Certainly, it's order n to the n. 1180 01:01:01,510 --> 01:01:04,100 This says that the number of messages 1181 01:01:04,100 --> 01:01:05,700 that you might send on any channel 1182 01:01:05,700 --> 01:01:09,600 could correspond to doing that many corrections. 1183 01:01:09,600 --> 01:01:14,170 This can blow up your message complexity into n 1184 01:01:14,170 --> 01:01:15,770 to the n times the number of edges, 1185 01:01:15,770 --> 01:01:22,520 and your time complexity into n to the n times n times d, 1186 01:01:22,520 --> 01:01:25,340 because on every edge you might have 1187 01:01:25,340 --> 01:01:29,590 to wait for that many messages, correction messages, sitting 1188 01:01:29,590 --> 01:01:30,462 in front of you. 1189 01:01:33,910 --> 01:01:36,770 That seems pretty awful.
1190 01:01:36,770 --> 01:01:38,020 Does it actually happen? 1191 01:01:38,020 --> 01:01:41,900 Can you actually construct an execution of this algorithm 1192 01:01:41,900 --> 01:01:45,520 where you get exponential bounds like that? 1193 01:01:45,520 --> 01:01:48,410 And we'll see that, yes, you can. 1194 01:01:48,410 --> 01:01:49,700 Any questions so far? 1195 01:01:52,730 --> 01:01:56,680 Here's a bad example. 1196 01:01:56,680 --> 01:02:01,670 This is a network that consists of a sequence of, let's say, 1197 01:02:01,670 --> 01:02:07,760 k plus 1-- k plus 2, I guess, nodes, in a line. 1198 01:02:07,760 --> 01:02:12,820 And I'm going to throw in some little detour nodes 1199 01:02:12,820 --> 01:02:17,380 in between each consecutive pair of nodes in this graph, 1200 01:02:17,380 --> 01:02:20,950 and now let me play with the weights. 1201 01:02:20,950 --> 01:02:25,370 Let's say on this path, this direct path from v 0 1202 01:02:25,370 --> 01:02:28,480 to vk plus 1, all the weights are 0. 1203 01:02:28,480 --> 01:02:33,570 That's going to be the shortest path, the best weight path, 1204 01:02:33,570 --> 01:02:36,650 from v 0 to vk plus 1. 1205 01:02:36,650 --> 01:02:39,930 But now I'm going to have some detours. 1206 01:02:39,930 --> 01:02:45,840 And on the detours I have two edges, one of weight 0 1207 01:02:45,840 --> 01:02:48,710 and the other one of weight that's a power of 2. 1208 01:02:48,710 --> 01:02:50,450 I'm going to start with high powers of 2, 1209 01:02:50,450 --> 01:02:53,810 2 to the k minus 1, and go down to 2 to the k minus 2, 1210 01:02:53,810 --> 01:02:56,986 down to 2 to the 1, and 2 to the 0. 1211 01:02:56,986 --> 01:02:58,900 See what this graph is doing? 1212 01:02:58,900 --> 01:03:01,510 It has a very fast path in the bottom, 1213 01:03:01,510 --> 01:03:04,390 which you'd like to hear about as soon as you can. 1214 01:03:04,390 --> 01:03:07,510 But, actually, there are detours which could give you 1215 01:03:07,510 --> 01:03:08,350 much worse paths.
1216 01:03:14,140 --> 01:03:18,380 Let's see how this might execute to make a lot of messages 1217 01:03:18,380 --> 01:03:21,820 pile up in a channel. 1218 01:03:21,820 --> 01:03:23,680 My claim is that there's an execution 1219 01:03:23,680 --> 01:03:32,153 of that network in which the last node, vk, sends 2 1220 01:03:32,153 --> 01:03:37,390 to the k messages to the next node vk plus 1. 1221 01:03:37,390 --> 01:03:40,400 He's really going to send an exponential number of messages 1222 01:03:40,400 --> 01:03:42,490 corresponding to his corrections. 1223 01:03:42,490 --> 01:03:44,950 He's going to keep making corrections for better 1224 01:03:44,950 --> 01:03:48,100 and better estimates. 1225 01:03:48,100 --> 01:03:51,220 And if all this happens relatively fast, 1226 01:03:51,220 --> 01:03:52,640 that just means you have a channel 1227 01:03:52,640 --> 01:03:55,590 with an exponential number of messages in it, 1228 01:03:55,590 --> 01:03:57,555 and emptying that will take exponential time. 1229 01:04:03,380 --> 01:04:06,690 Do you have an idea how this might happen? 1230 01:04:06,690 --> 01:04:14,260 How could this node, vk, get so many successively improved 1231 01:04:14,260 --> 01:04:15,968 estimates, one after the other? 1232 01:04:21,110 --> 01:04:23,072 Well, what's the biggest estimate it might get? 1233 01:04:28,943 --> 01:04:29,442 Yeah? 1234 01:04:29,442 --> 01:04:30,650 AUDIENCE: 2 to the k minus 1. 1235 01:04:30,650 --> 01:04:33,280 PROFESSOR: It could get 2 to the k minus 1? 1236 01:04:33,280 --> 01:04:34,395 Well, let's see. 1237 01:04:34,395 --> 01:04:35,436 AUDIENCE: Or [INAUDIBLE]. 1238 01:04:35,436 --> 01:04:37,195 PROFESSOR: It could do that. 1239 01:04:37,195 --> 01:04:38,530 AUDIENCE: Then it's 2 to the k. 1240 01:04:38,530 --> 01:04:39,873 PROFESSOR: And it could do that. 1241 01:04:39,873 --> 01:04:40,840 AUDIENCE: Plus 2 to the k.
1242 01:04:40,840 --> 01:04:42,298 PROFESSOR: Plus 2 to the k minus 2, 1243 01:04:42,298 --> 01:04:45,100 plus all the way down to plus 2 to the 0. 1244 01:04:45,100 --> 01:04:49,080 You could be following this really inefficient path, just 1245 01:04:49,080 --> 01:04:53,550 all the detours, before the messages actually arrive 1246 01:04:53,550 --> 01:04:55,780 on the edges on the spine. 1247 01:04:55,780 --> 01:04:57,010 AUDIENCE: [INAUDIBLE]. 1248 01:04:57,010 --> 01:05:02,410 PROFESSOR: Yeah, so you follow all-- oh, you said 2 to the k, 1249 01:05:02,410 --> 01:05:04,040 minus 1, parenthesis. 1250 01:05:04,040 --> 01:05:05,430 Yeah, that's exactly right. 1251 01:05:05,430 --> 01:05:07,860 AUDIENCE: It's all right. [INAUDIBLE]. 1252 01:05:07,860 --> 01:05:11,470 PROFESSOR: We can't parenthesize our speech. 1253 01:05:11,470 --> 01:05:16,810 All right, so the possible estimates that vk can take on 1254 01:05:16,810 --> 01:05:24,110 are actually 2 to the k-- as you said, 2 to the k minus 1, which 1255 01:05:24,110 --> 01:05:26,510 you would get by taking all of the detours 1256 01:05:26,510 --> 01:05:28,000 and adding up the powers of 2. 1257 01:05:30,580 --> 01:05:32,230 But it could also have an estimate, 1258 01:05:32,230 --> 01:05:37,340 which is 2 to the k minus 2 or 2 to the k minus 3. 1259 01:05:37,340 --> 01:05:40,090 All of those are possible. 1260 01:05:40,090 --> 01:05:42,920 Moreover, you can have a single execution 1261 01:05:42,920 --> 01:05:47,480 of this asynchronous algorithm in which node vk actually 1262 01:05:47,480 --> 01:05:53,270 acquires all those estimates in sequence, one at a time. 1263 01:05:53,270 --> 01:05:54,200 How might that work? 1264 01:05:54,200 --> 01:05:57,460 First, the messages travel all the detours. 1265 01:05:57,460 --> 01:06:01,560 Fine, vk gets 2 to the k minus 1. 1266 01:06:01,560 --> 01:06:05,120 Then, there's a message traveling-- well 1267 01:06:05,120 --> 01:06:09,470 this guy sends a message on the lower link.
1268 01:06:09,470 --> 01:06:13,050 This guy has only heard about the messages on the detours up 1269 01:06:13,050 --> 01:06:14,990 to that point. 1270 01:06:14,990 --> 01:06:17,350 But he sends that on the lower link, 1271 01:06:17,350 --> 01:06:21,340 which means you kind of bypass this weight of one. 1272 01:06:21,340 --> 01:06:24,720 vk gets a little bit of an improvement, 1273 01:06:24,720 --> 01:06:28,280 which gives it 2 to the k minus 2 as its estimate. 1274 01:06:31,240 --> 01:06:34,050 What happens next? 1275 01:06:34,050 --> 01:06:39,800 Well, we step one back, and node vk minus 2 1276 01:06:39,800 --> 01:06:44,710 can send a message on the lower link, which 1277 01:06:44,710 --> 01:06:48,740 has weight 0, to vk minus 1. 1278 01:06:48,740 --> 01:06:54,250 But that corrected estimate might then traverse the detour 1279 01:06:54,250 --> 01:06:57,860 to get to node vk. 1280 01:06:57,860 --> 01:07:01,270 If you get the correction for node vk minus 1, 1281 01:07:01,270 --> 01:07:04,200 but then you follow the detour, you haven't improved that much. 1282 01:07:04,200 --> 01:07:07,300 You've just improved by one. 1283 01:07:07,300 --> 01:07:12,370 This way, you get 2 to the k minus 3 as the new estimate. 1284 01:07:12,370 --> 01:07:15,340 But then, again, on the lower link, 1285 01:07:15,340 --> 01:07:17,870 the message eventually arrives, which 1286 01:07:17,870 --> 01:07:21,210 gives you 2 to the k minus 4. 1287 01:07:21,210 --> 01:07:23,170 You see the pattern sort of? 1288 01:07:23,170 --> 01:07:26,000 You're going to be counting down in binary 1289 01:07:26,000 --> 01:07:30,300 by successively having nodes further to the left 1290 01:07:30,300 --> 01:07:32,830 deliver their messages, but then they 1291 01:07:32,830 --> 01:07:37,300 do the worst possible thing of getting the information to vk. 1292 01:07:37,300 --> 01:07:39,847 He has to deal with all those other estimates in between.
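The counting-down-in-binary pattern can be checked directly: each of the k gaps contributes either 0 (the spine edge) or a distinct power of 2 (the detour), so the path weights into vk are exactly the subset sums of {2^0, ..., 2^(k-1)}. A small sketch, with a function name of my own choosing:

```python
from itertools import product

def possible_estimates(k):
    """Enumerate the weights of all simple paths from v0 to vk in the bad
    example: at each of the k gaps, choose the weight-0 spine edge or the
    detour of weight 2**(k-1-i)."""
    detours = [2 ** (k - 1 - i) for i in range(k)]
    return {sum(w for w, taken in zip(detours, choice) if taken)
            for choice in product([False, True], repeat=k)}
```

Every integer from 0 to 2^k - 1 shows up, so an adversarial schedule that delivers them in decreasing order forces 2^k successive corrections at vk.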
1293 01:07:42,740 --> 01:07:44,790 If this happens quickly, what you get 1294 01:07:44,790 --> 01:07:48,950 is a pileup of an exponential number of search messages 1295 01:07:48,950 --> 01:07:50,420 in one channel. 1296 01:07:50,420 --> 01:07:52,065 And then that information has to go 1297 01:07:52,065 --> 01:07:54,440 on to the next node or the rest of the network, whatever. 1298 01:07:54,440 --> 01:07:56,481 It's going to take an exponential amount of time, 1299 01:07:56,481 --> 01:07:58,647 in the worst case, to empty that all out. 1300 01:08:01,340 --> 01:08:08,520 This is pretty bad, but the algorithm is correct. 1301 01:08:08,520 --> 01:08:12,700 And so how do you learn when everything is finished, 1302 01:08:12,700 --> 01:08:14,810 and how does a process know when it 1303 01:08:14,810 --> 01:08:19,345 can output its own correct distance information? 1304 01:08:19,345 --> 01:08:21,428 How can we figure out when this is all stabilized? 1305 01:08:26,689 --> 01:08:31,540 Same thing as before, we can just do a convergecast. 1306 01:08:31,540 --> 01:08:34,180 I mean, there are more corrections here, but it's still 1307 01:08:34,180 --> 01:08:36,300 the same kind of convergecast. 1308 01:08:36,300 --> 01:08:40,220 We can convergecast and, eventually, this 1309 01:08:40,220 --> 01:08:43,729 is going to convergecast all the way up to the root. 1310 01:08:43,729 --> 01:08:45,790 And then the root knows it's done 1311 01:08:45,790 --> 01:08:47,252 and can tell everyone else. 1312 01:08:51,319 --> 01:08:54,279 A moral here-- you've had a quick dose 1313 01:08:54,279 --> 01:08:58,600 of a lot of asynchrony-- yeah, if you don't do anything 1314 01:08:58,600 --> 01:09:01,279 about it and you just use unrestrained asynchrony, 1315 01:09:01,279 --> 01:09:03,890 in the worst case, you're going to have 1316 01:09:03,890 --> 01:09:06,349 some pretty bad performance. 1317 01:09:06,349 --> 01:09:08,140 The question is, what do you do about that?
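[The convergecast idea can be sketched in a few lines. This is a simplified, sequential illustration rather than the actual asynchronous protocol: it assumes the tree is already known and given by parent pointers (my own hypothetical encoding), and each node reports "done" to its parent once all of its children have reported, so the root is the last node to learn that everything below it has stabilized:]

```python
# Hedged sketch of convergecast termination detection on a known tree:
# leaves report "done" immediately; an internal node reports once every
# child has reported; the root's report signals global termination.
from collections import defaultdict

def convergecast_order(parent):
    """Return the order in which nodes report 'done'; the root is last."""
    children = defaultdict(list)
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)

    pending = {node: len(children[node]) for node in parent}
    order = []
    ready = [n for n in parent if pending[n] == 0]  # leaves fire first
    while ready:
        node = ready.pop()
        order.append(node)                  # node reports "done" to parent
        p = parent[node]
        if p is not None:
            pending[p] -= 1
            if pending[p] == 0:
                ready.append(p)             # parent has heard from everyone
    return order

tree = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
assert convergecast_order(tree)[-1] == "r"  # root learns completion last
```

[In the real algorithm a node also has to wait until it has no outstanding corrections before reporting, but the tree-structured "children first, root last" flow is the same.]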
1318 01:09:08,140 --> 01:09:09,348 There are various techniques. 1319 01:09:09,348 --> 01:09:11,479 And if you want to take my course next fall, 1320 01:09:11,479 --> 01:09:14,680 we'll cover some of those. 1321 01:09:14,680 --> 01:09:18,609 I'll say a little bit about the course. 1322 01:09:18,609 --> 01:09:23,960 It's a basic-- it's a TQE level, basic grad course. 1323 01:09:23,960 --> 01:09:27,470 We do synchronous, asynchronous, and some other stuff 1324 01:09:27,470 --> 01:09:32,651 where the nodes really know something about time. 1325 01:09:32,651 --> 01:09:34,359 Here are some of the synchronous problems-- 1326 01:09:34,359 --> 01:09:38,710 some like the ones you've already seen. 1327 01:09:38,710 --> 01:09:44,460 Building many other kinds of structures in graphs, 1328 01:09:44,460 --> 01:09:46,609 and then we get into fault tolerance. 1329 01:09:46,609 --> 01:09:49,250 There are a lot of questions about what 1330 01:09:49,250 --> 01:09:53,170 happens when some of the components can fail, 1331 01:09:53,170 --> 01:09:55,370 or they're even malicious, and you 1332 01:09:55,370 --> 01:09:59,210 have to deal with the effects of malicious processes 1333 01:09:59,210 --> 01:10:01,500 in your system. 1334 01:10:01,500 --> 01:10:05,400 And then for asynchronous algorithms, 1335 01:10:05,400 --> 01:10:07,720 we'll do not only individual problems 1336 01:10:07,720 --> 01:10:10,410 like the ones you've just seen, but some general techniques 1337 01:10:10,410 --> 01:10:14,250 like synchronizers, the notion of logical time-- 1338 01:10:14,250 --> 01:10:17,700 that's Leslie Lamport's first and biggest contribution. 1339 01:10:17,700 --> 01:10:22,260 He won the Turing Award last year-- other techniques, 1340 01:10:22,260 --> 01:10:26,800 like taking global snapshots of the entire system 1341 01:10:26,800 --> 01:10:29,060 while it's running.
1342 01:10:29,060 --> 01:10:31,140 In addition to talking about networks, 1343 01:10:31,140 --> 01:10:32,820 as we've been doing this week, we'll 1344 01:10:32,820 --> 01:10:36,320 talk about shared memory, multiprocessors accessing 1345 01:10:36,320 --> 01:10:38,630 shared memory. 1346 01:10:38,630 --> 01:10:41,760 And solving problems that are of use in multiprocessors, 1347 01:10:41,760 --> 01:10:43,480 like mutual exclusion. 1348 01:10:43,480 --> 01:10:45,170 And again, fault tolerance. 1349 01:10:45,170 --> 01:10:49,570 Fault tolerance then gets us into a study of data objects 1350 01:10:49,570 --> 01:10:51,770 with consistency conditions, which 1351 01:10:51,770 --> 01:10:55,280 is the sort of stuff that's useful in cloud computing. 1352 01:10:55,280 --> 01:10:59,010 If you want to have coherent access to data that's 1353 01:10:59,010 --> 01:11:03,040 stored at many locations, you need to have some interesting 1354 01:11:03,040 --> 01:11:05,440 distributed algorithms. 1355 01:11:05,440 --> 01:11:08,210 Self-stabilization-- if you plunge your system 1356 01:11:08,210 --> 01:11:10,070 into some arbitrary state, and you'd 1357 01:11:10,070 --> 01:11:13,660 like it to converge to a good state, that's the topic of self- 1358 01:11:13,660 --> 01:11:15,860 stabilization. 1359 01:11:15,860 --> 01:11:18,800 And, depending on time, there are things 1360 01:11:18,800 --> 01:11:23,060 that use time in the algorithms, and the newer work 1361 01:11:23,060 --> 01:11:26,489 that we're working on in research is very dynamic. 1362 01:11:26,489 --> 01:11:28,530 You have distributed algorithms where the network 1363 01:11:28,530 --> 01:11:31,420 itself is changing during the execution. 1364 01:11:31,420 --> 01:11:33,526 That comes up in wireless networks, 1365 01:11:33,526 --> 01:11:34,900 and lately we're actually looking 1366 01:11:34,900 --> 01:11:38,060 at insect colony algorithms.
1367 01:11:38,060 --> 01:11:40,110 What distributed algorithms do ants 1368 01:11:40,110 --> 01:11:43,750 use to decide on how to select a new nest when 1369 01:11:43,750 --> 01:11:47,580 the researchers smash their old nest in the laboratory? 1370 01:11:47,580 --> 01:11:48,593 That kind of question. 1371 01:11:51,770 --> 01:11:54,610 That's it for the distributed algorithms week. 1372 01:11:54,610 --> 01:11:59,430 And we'll see you-- next week is security? 1373 01:11:59,430 --> 01:12:02,218 OK, yeah.