1 00:00:00,090 --> 00:00:02,490 The following content is provided under a Creative 2 00:00:02,490 --> 00:00:04,030 Commons license. 3 00:00:04,030 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,720 continue to offer high-quality educational resources for free. 5 00:00:10,720 --> 00:00:13,320 To make a donation, or view additional materials 6 00:00:13,320 --> 00:00:16,200 from hundreds of MIT courses, visit MIT 7 00:00:16,200 --> 00:00:21,280 OpenCourseWare at ocw.mit.edu. 8 00:00:21,280 --> 00:00:23,930 ERIK DEMAINE: Today we continue our theme of dynamic graphs. 9 00:00:23,930 --> 00:00:26,410 This is the second of three lectures. 10 00:00:26,410 --> 00:00:31,390 And this will be the main lecture about upper bounds 11 00:00:31,390 --> 00:00:32,890 for general graphs. 12 00:00:32,890 --> 00:00:34,900 Last class was about link cut trees. 13 00:00:34,900 --> 00:00:36,810 And we saw, essentially, how to solve 14 00:00:36,810 --> 00:00:40,655 a problem called dynamic connectivity, which 15 00:00:40,655 --> 00:00:42,280 is where you can insert and delete edges 16 00:00:42,280 --> 00:00:45,490 and you want to know what's connected to what. 17 00:00:45,490 --> 00:00:46,990 For trees, we solved it. 18 00:00:46,990 --> 00:00:49,600 We're going to solve it again for trees in an easier way. 19 00:00:49,600 --> 00:00:51,790 Because we need a slightly simpler data structure 20 00:00:51,790 --> 00:00:56,260 that we can augment in different ways, called Euler-tour trees. 21 00:00:56,260 --> 00:00:59,800 And then we'll solve trees, yet again, 22 00:00:59,800 --> 00:01:01,900 in a setting where you can only delete edges. 23 00:01:01,900 --> 00:01:05,780 Then we can get from log n time to constant time. 24 00:01:05,780 --> 00:01:08,560 So we'll still talk about trees for quite a while. 25 00:01:08,560 --> 00:01:10,690 But then we'll finally get to general graphs.
26 00:01:10,690 --> 00:01:14,570 And we'll do a log squared n solution for general graphs. 27 00:01:14,570 --> 00:01:17,110 So just another log factor more, you 28 00:01:17,110 --> 00:01:21,484 can generalize from trees to undirected graphs. 29 00:01:21,484 --> 00:01:23,650 And then I'll give you a flavor of what's out there. 30 00:01:23,650 --> 00:01:25,665 Dynamic graphs is a big world. 31 00:01:25,665 --> 00:01:27,299 It has lots of results. 32 00:01:27,299 --> 00:01:29,590 And I'll give you some idea of what other problems have 33 00:01:29,590 --> 00:01:32,770 been studied in dynamic graphs, both undirected and directed. 34 00:01:32,770 --> 00:01:34,960 Though for directed graphs, things get pretty slow. 35 00:01:37,480 --> 00:01:39,640 Yeah, next class will be about lower bounds, which 36 00:01:39,640 --> 00:01:44,830 will prove that this log stuff is actually necessary, 37 00:01:44,830 --> 00:01:45,959 in case you were wondering. 38 00:01:45,959 --> 00:01:47,500 Because a lot of the time, we've been 39 00:01:47,500 --> 00:01:49,620 able to get better than log in this class. 40 00:01:49,620 --> 00:01:51,040 And in dynamic graphs, you can't. 41 00:01:51,040 --> 00:01:53,650 There are log lower bounds. 42 00:01:53,650 --> 00:01:57,280 So let's start out with a definition of the problem. 43 00:01:57,280 --> 00:02:00,305 We'll be focusing, today, mostly on this dynamic connectivity 44 00:02:00,305 --> 00:02:00,805 problem. 45 00:02:08,139 --> 00:02:19,012 Maintain an undirected graph subject to edge 46 00:02:19,012 --> 00:02:19,970 insertion and deletion. 47 00:02:27,090 --> 00:02:30,665 And you can also insert and delete degree-0 vertices. 48 00:02:33,170 --> 00:02:35,870 But we're not really going to worry about vertices. 49 00:02:35,870 --> 00:02:38,110 It's easy to add vertices that have no edges to them. 50 00:02:38,110 --> 00:02:39,525 Also easy to delete them.
51 00:02:39,525 --> 00:02:42,530 It'll be more about adding and removing edges incident 52 00:02:42,530 --> 00:02:45,020 to those vertices. 53 00:02:45,020 --> 00:02:53,195 And then the connectivity query is, given two vertices, 54 00:02:56,100 --> 00:02:59,420 is there a v to w path? 55 00:03:02,000 --> 00:03:04,400 So this is asking are two vertices, v and w, 56 00:03:04,400 --> 00:03:07,310 in the same connected component? 57 00:03:07,310 --> 00:03:09,610 That's a strong query. 58 00:03:09,610 --> 00:03:12,800 In some sense, a weaker query is just to know globally, 59 00:03:12,800 --> 00:03:14,930 is the graph connected? 60 00:03:14,930 --> 00:03:18,260 That's another interpretation of dynamic connectivity. 61 00:03:18,260 --> 00:03:19,850 It turns out, both of these problems 62 00:03:19,850 --> 00:03:25,059 are interesting and useful for solving problems. 63 00:03:25,059 --> 00:03:27,350 But they also turn out to be, more or less, equivalent. 64 00:03:27,350 --> 00:03:28,849 All the upper bounds we know for one 65 00:03:28,849 --> 00:03:30,510 apply to the other and vice versa. 66 00:03:30,510 --> 00:03:33,380 So it doesn't seem to make much difference which kind of query 67 00:03:33,380 --> 00:03:34,130 you want. 68 00:03:34,130 --> 00:03:35,790 So might as well take both. 69 00:03:35,790 --> 00:03:40,100 We'll focus on the v, w query. 70 00:03:40,100 --> 00:03:41,690 OK. 71 00:03:41,690 --> 00:03:45,200 So that's the problem, in general. 72 00:03:45,200 --> 00:03:49,490 And I want to introduce some terminology, 73 00:03:49,490 --> 00:03:53,060 like when we had partial and full persistence, 74 00:03:53,060 --> 00:03:57,530 there's also a partial and full dynamicness. 75 00:03:57,530 --> 00:04:00,800 A fully dynamic data structure is 76 00:04:00,800 --> 00:04:05,190 one where you can do inserts and deletes of edges.
77 00:04:08,400 --> 00:04:15,473 And a partially dynamic data structure 78 00:04:15,473 --> 00:04:17,264 is one where you can do inserts or deletes. 79 00:04:19,790 --> 00:04:25,650 So you can only do inserts, or you can only 80 00:04:25,650 --> 00:04:30,100 do deletes throughout the lifetime of the data structure. 81 00:04:30,100 --> 00:04:31,350 And these have specific names. 82 00:04:31,350 --> 00:04:33,391 Because usually you use different data structures 83 00:04:33,391 --> 00:04:35,000 for just insertions or just deletions. 84 00:04:35,000 --> 00:04:36,972 Just insertions is called incremental. 85 00:04:43,540 --> 00:04:46,093 And just deletions is called decremental. 86 00:04:49,957 --> 00:04:52,040 The idea of incremental algorithms 87 00:04:52,040 --> 00:04:53,900 is definitely not a new one. 88 00:04:53,900 --> 00:04:57,010 But in dynamic graphs, it always makes sense. 89 00:04:57,010 --> 00:04:59,590 So in general, a dynamic graph problem 90 00:04:59,590 --> 00:05:02,320 is defined usually by insertion and deletion of edges, 91 00:05:02,320 --> 00:05:05,500 and either fully dynamic or partially dynamic. 92 00:05:05,500 --> 00:05:07,146 And then the query is what can vary. 93 00:05:07,146 --> 00:05:09,020 We're going to focus on connectivity queries. 94 00:05:09,020 --> 00:05:10,850 But you could do others. 95 00:05:10,850 --> 00:05:13,210 So this explains, for decremental, we're 96 00:05:13,210 --> 00:05:15,370 going to solve trees in constant time. 97 00:05:15,370 --> 00:05:18,040 For fully dynamic, we're going to solve general graphs 98 00:05:18,040 --> 00:05:19,040 in log squared n time. 99 00:05:26,220 --> 00:05:30,095 I guess these three are sort of the meat of the lecture. 100 00:05:30,095 --> 00:05:31,470 But before I go into that, I want 101 00:05:31,470 --> 00:05:34,920 to tell you where these dynamic connectivity results fit 102 00:05:34,920 --> 00:05:39,158 into the literature in general on dynamic connectivity.
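To pin down the interface defined above (edge insertion, edge deletion, and the v-to-w connectivity query), here is a minimal baseline sketch. It is not one of the data structures from the lecture: it stores adjacency sets and runs a breadth-first search per query, so a query costs time linear in the graph size rather than polylogarithmic. The class and method names are assumptions chosen for illustration.

```python
from collections import deque


class NaiveDynamicConnectivity:
    """Fully dynamic connectivity baseline: O(1) updates, O(n + m) queries."""

    def __init__(self):
        self.adj = {}  # vertex -> set of neighbouring vertices

    def insert_vertex(self, v):
        # Adds a degree-0 vertex if it is not already present.
        self.adj.setdefault(v, set())

    def insert_edge(self, v, w):
        self.adj.setdefault(v, set()).add(w)
        self.adj.setdefault(w, set()).add(v)

    def delete_edge(self, v, w):
        self.adj.get(v, set()).discard(w)
        self.adj.get(w, set()).discard(v)

    def connected(self, v, w):
        # Breadth-first search from v, stopping as soon as w is reached.
        if v not in self.adj or w not in self.adj:
            return False
        seen = {v}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            if u == w:
                return True
            for x in self.adj[u]:
                if x not in seen:
                    seen.add(x)
                    queue.append(x)
        return False
```

Using only `insert_edge` gives the incremental setting, using only `delete_edge` gives the decremental one, and mixing both is the fully dynamic setting.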
103 00:05:45,510 --> 00:05:50,046 This is part of our survey, but just on dynamic connectivity. 104 00:05:59,780 --> 00:06:04,000 Well, for trees, this is the best we know. 105 00:06:04,000 --> 00:06:05,680 This is the best possible, really. 106 00:06:05,680 --> 00:06:10,030 You can do log n in general, and then decremental, 107 00:06:14,990 --> 00:06:16,930 you can do constant. 108 00:06:16,930 --> 00:06:20,200 We actually already know how to do trees with link cut trees. 109 00:06:20,200 --> 00:06:23,560 It's not totally obvious, but you can use links 110 00:06:23,560 --> 00:06:27,010 to simulate edge insertions. 111 00:06:27,010 --> 00:06:30,040 And you can use cuts to simulate edge deletions. 112 00:06:30,040 --> 00:06:33,675 It's not totally obvious because the link operation 113 00:06:33,675 --> 00:06:35,050 requires that one of the vertices 114 00:06:35,050 --> 00:06:37,660 is the root of its tree. 115 00:06:37,660 --> 00:06:40,030 So that's not so trivial. 116 00:06:40,030 --> 00:06:43,720 But we'll see an easy way to do that in Euler-tour trees 117 00:06:43,720 --> 00:06:45,610 later on. 118 00:06:45,610 --> 00:06:47,360 So this we kind of already know how to do. 119 00:06:47,360 --> 00:06:50,490 But we're going to see another simpler way. 120 00:06:50,490 --> 00:06:52,930 The constant time amortized decremental 121 00:06:52,930 --> 00:06:55,470 we're going to do today. 122 00:06:55,470 --> 00:06:59,590 So we're also going to do this today. 123 00:06:59,590 --> 00:07:04,726 Incremental, I don't think it's known how to do constant time. 124 00:07:04,726 --> 00:07:06,100 But I'll tell you another result. 125 00:07:06,100 --> 00:07:07,630 You can get almost constant time. 126 00:07:10,310 --> 00:07:12,340 Another special type of graph you could consider 127 00:07:12,340 --> 00:07:13,450 is a plane graph. 128 00:07:13,450 --> 00:07:15,670 This is a graph with a planar embedding.
129 00:07:15,670 --> 00:07:18,670 So as you're inserting edges or adding vertices and whatnot, 130 00:07:18,670 --> 00:07:20,412 you say which face they live in. 131 00:07:20,412 --> 00:07:22,120 And so you have a fixed planar embedding. 132 00:07:22,120 --> 00:07:24,070 It's being constructed as you go along. 133 00:07:24,070 --> 00:07:26,590 These are easy to maintain. 134 00:07:26,590 --> 00:07:28,082 You can get order log n. 135 00:07:28,082 --> 00:07:30,040 I guess, in some sense, that generalizes trees. 136 00:07:30,040 --> 00:07:33,820 Because they also have pretty obvious planar embedding. 137 00:07:33,820 --> 00:07:40,750 So that's something you can do, in log n. 138 00:07:40,750 --> 00:07:47,920 But the big question is, for general graphs, 139 00:07:47,920 --> 00:07:51,580 can you do log n per operation? 140 00:07:51,580 --> 00:07:55,100 That's an open problem, probably the central open problem 141 00:07:55,100 --> 00:07:56,914 in dynamic connectivity. 142 00:08:06,300 --> 00:08:09,070 But we're not so far away from that. 143 00:08:09,070 --> 00:08:11,130 As I said, you can get log squared. 144 00:08:11,130 --> 00:08:13,110 Let me say what the log squared result is. 145 00:08:13,110 --> 00:08:15,250 It's a little bit better than just log squared. 146 00:08:15,250 --> 00:08:18,150 You get log squared update, but query 147 00:08:18,150 --> 00:08:20,130 is actually sub logarithmic. 148 00:08:23,010 --> 00:08:26,400 You got a log n over log log n query. 149 00:08:32,650 --> 00:08:37,000 And another result is you can do a little bit 150 00:08:37,000 --> 00:08:38,049 better than log squared. 151 00:08:38,049 --> 00:08:45,750 You can do log times log log n cube update. 152 00:08:45,750 --> 00:08:50,470 So roughly log n update, but there's some poly log log term. 153 00:08:50,470 --> 00:08:54,280 And query slows down very slightly. 154 00:08:54,280 --> 00:08:59,310 It becomes log n over a log log log n. 
155 00:09:04,700 --> 00:09:08,060 Well, it'll sort of make sense why in a moment. 156 00:09:08,060 --> 00:09:12,580 So the lowest update time known is this one, log times log 157 00:09:12,580 --> 00:09:14,710 log cubed. 158 00:09:14,710 --> 00:09:21,160 Big open question is whether you can get log for update, 159 00:09:21,160 --> 00:09:23,160 and any kind of reasonable query time. 160 00:09:23,160 --> 00:09:26,650 But ideally, log for both. 161 00:09:26,650 --> 00:09:28,210 So that's for fully dynamic. 162 00:09:28,210 --> 00:09:30,490 If I don't say otherwise, fully dynamic 163 00:09:30,490 --> 00:09:34,840 is always the default. That's 164 00:09:34,840 --> 00:09:36,520 the case you usually care about. 165 00:09:39,520 --> 00:09:40,240 Cool. 166 00:09:40,240 --> 00:09:46,990 But if you want to do incremental, 167 00:09:46,990 --> 00:09:52,870 you can achieve an alpha bound. 168 00:09:52,870 --> 00:09:55,930 Because incremental dynamic connectivity is really 169 00:09:55,930 --> 00:09:58,580 the union find problem. 170 00:09:58,580 --> 00:10:01,300 Union find, you have a bunch of sets of elements. 171 00:10:01,300 --> 00:10:03,850 You can take two of them and union them together. 172 00:10:03,850 --> 00:10:06,040 Once they're unioned, they can never be split apart. 173 00:10:06,040 --> 00:10:07,570 And find is tell me what set I'm in. 174 00:10:07,570 --> 00:10:10,390 So if I do find on two vertices, I 175 00:10:10,390 --> 00:10:12,610 can find which connected components they're in 176 00:10:12,610 --> 00:10:15,070 and check whether they're the same. 177 00:10:15,070 --> 00:10:17,200 That will give me a connectivity query. 178 00:10:17,200 --> 00:10:19,790 Insertion corresponds to merging two connected components 179 00:10:19,790 --> 00:10:21,794 unless they were already together. 180 00:10:21,794 --> 00:10:23,960 And it's known you can get an alpha amortized bound. 181 00:10:23,960 --> 00:10:26,782 And that's optimal for that problem.
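The reduction from incremental connectivity to union find described above can be sketched as follows. This is a standard union-by-rank, path-compression structure, the combination usually associated with the inverse-Ackermann (alpha) amortized bound. The class name and the `insert_edge` / `connected` method names are assumptions for illustration, not notation from the lecture.

```python
class IncrementalConnectivity:
    """Incremental connectivity via union find (edge insertions only)."""

    def __init__(self):
        self.parent = {}  # vertex -> parent in the union-find forest
        self.rank = {}    # vertex -> upper bound on the height of its tree

    def find(self, v):
        # Vertices are created lazily the first time they are mentioned.
        if v not in self.parent:
            self.parent[v] = v
            self.rank[v] = 0
        # First pass: locate the representative of v's set.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, pointing every node on the path at root.
        while self.parent[v] != root:
            nxt = self.parent[v]
            self.parent[v] = root
            v = nxt
        return root

    def insert_edge(self, v, w):
        # Inserting an edge merges two components unless they already coincide.
        rv, rw = self.find(v), self.find(w)
        if rv == rw:
            return
        if self.rank[rv] < self.rank[rw]:
            rv, rw = rw, rv
        self.parent[rw] = rv  # union by rank: attach the shallower tree
        if self.rank[rv] == self.rank[rw]:
            self.rank[rv] += 1

    def connected(self, v, w):
        # Two vertices are connected exactly when they share a representative.
        return self.find(v) == self.find(w)
```

There is deliberately no `delete_edge`: once two sets are unioned they can never be split apart, which is why this only covers the incremental case.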
182 00:10:26,782 --> 00:10:28,240 So I guess that's actually a theta. 183 00:10:31,670 --> 00:10:34,100 So that's sort of well-known stuff. 184 00:10:34,100 --> 00:10:39,250 It's a complicated analysis, but already been done. 185 00:10:39,250 --> 00:10:42,160 So in particular, that's, I believe, the best way 186 00:10:42,160 --> 00:10:46,120 we know how to solve incremental connectivity in trees. 187 00:10:46,120 --> 00:10:50,578 Though I'm curious whether you can do constant incremental. 188 00:10:54,790 --> 00:10:59,022 Decremental, there's essentially a log n solution. 189 00:10:59,022 --> 00:11:04,120 Not quite, but for dense graphs, there's a log n solution. 190 00:11:11,830 --> 00:11:14,740 So I'm stating, instead of a bound per operation, 191 00:11:14,740 --> 00:11:17,230 this is a total bound. 192 00:11:17,230 --> 00:11:19,150 Let's say you start with an m-edge graph 193 00:11:19,150 --> 00:11:20,860 and you delete all the edges. 194 00:11:20,860 --> 00:11:24,180 Then you pay log n for each edge. 195 00:11:24,180 --> 00:11:27,580 But you also have to pay this n poly log n cost. 196 00:11:27,580 --> 00:11:31,870 So if m is a bit bigger than n, then this dominates. 197 00:11:31,870 --> 00:11:33,220 And so it's log n per operation. 198 00:11:33,220 --> 00:11:34,960 But if you have a sparse graph, it's 199 00:11:34,960 --> 00:11:37,540 still poly log per operation. 200 00:11:37,540 --> 00:11:43,652 So decremental, we can kind of get log for general graphs. 201 00:11:43,652 --> 00:11:45,610 Another goal you might have is what if I wanted 202 00:11:45,610 --> 00:11:47,110 to get worst-case bounds? 203 00:11:47,110 --> 00:11:49,630 Everything I've said so far is amortized. 204 00:11:49,630 --> 00:11:51,910 If you want to get worst case, not much is known. 205 00:11:51,910 --> 00:11:59,290 Big open question is, can I get a poly log update and query?
206 00:12:05,500 --> 00:12:08,320 So it was actually a pretty big breakthrough to get poly log 207 00:12:08,320 --> 00:12:09,410 whatsoever. 208 00:12:09,410 --> 00:12:13,510 This first result is from 2001 or so, 209 00:12:13,510 --> 00:12:16,060 after people have worked on dynamic graphs for probably 210 00:12:16,060 --> 00:12:16,940 a decade. 211 00:12:16,940 --> 00:12:19,860 So it took quite a while to find any kind of poly log. 212 00:12:19,860 --> 00:12:22,030 That still hasn't been made worst case. 213 00:12:22,030 --> 00:12:26,350 So the best results are the state of the art 214 00:12:26,350 --> 00:12:28,570 from the time before the poly log came out, 215 00:12:28,570 --> 00:12:32,800 which is square root of n update, constant query. 216 00:12:40,806 --> 00:12:42,920 There's one other result for worst case, which 217 00:12:42,920 --> 00:12:45,340 is if you want to do an incremental solution, 218 00:12:45,340 --> 00:12:47,260 so you're just inserting edges. 219 00:12:47,260 --> 00:12:49,640 Then there's a bound. 220 00:12:49,640 --> 00:12:53,950 It's actually a whole trade-off between updates and queries. 221 00:12:53,950 --> 00:12:57,430 If you have updates that take order x time, 222 00:12:57,430 --> 00:13:01,780 then you get queries that take log n over log x time. 223 00:13:11,150 --> 00:13:13,490 So log base x of n. 224 00:13:13,490 --> 00:13:14,450 And that's optimal. 225 00:13:14,450 --> 00:13:15,140 These are theta. 226 00:13:15,140 --> 00:13:18,080 There's matching lower bounds. 227 00:13:18,080 --> 00:13:21,770 So this is again, really union find. 228 00:13:21,770 --> 00:13:24,420 And so union find has been well studied, 229 00:13:24,420 --> 00:13:27,740 even in the worst-case setting. 230 00:13:27,740 --> 00:13:31,570 Can't do as well as alpha, basically. 231 00:13:31,570 --> 00:13:32,810 OK. 232 00:13:32,810 --> 00:13:35,285 And then finally, I want to talk about lower bounds. 233 00:13:43,260 --> 00:13:47,380 So there are two complementary results.
234 00:13:47,380 --> 00:13:50,270 But maybe first I'll summarize by saying you need 235 00:13:50,270 --> 00:13:54,300 omega log n update or query. 236 00:13:54,300 --> 00:13:58,320 One of them has to be at least log n 237 00:13:58,320 --> 00:14:00,303 to do dynamic connectivity. 238 00:14:03,690 --> 00:14:05,740 As you see, queries can be sublogarithmic. 239 00:14:05,740 --> 00:14:08,100 We don't know how to make updates sublogarithmic 240 00:14:08,100 --> 00:14:11,610 and get any reasonable query time. 241 00:14:11,610 --> 00:14:15,240 But one of them has to be at least log. 242 00:14:15,240 --> 00:14:17,580 And here's the specific theorems that imply that. 243 00:14:33,310 --> 00:14:38,467 That's actually similar to this result, but not quite the same. 244 00:14:38,467 --> 00:14:40,300 Because there's a log on the left-hand side. 245 00:14:46,870 --> 00:14:51,670 And these results hold for amortized solutions. 246 00:14:51,670 --> 00:14:52,750 Whoops. 247 00:14:52,750 --> 00:14:54,760 Key word's missing here. 248 00:14:54,760 --> 00:14:56,140 This is an update bound. 249 00:14:56,140 --> 00:14:57,460 And this is a query bound. 250 00:15:02,340 --> 00:15:04,920 So these are basically symmetric results. 251 00:15:04,920 --> 00:15:07,980 Says if you have super logarithmic update, 252 00:15:07,980 --> 00:15:12,120 so you have some x bigger than 1 times log n update, 253 00:15:12,120 --> 00:15:14,940 then you get log n over log x query. 254 00:15:14,940 --> 00:15:16,050 That's a lower bound. 255 00:15:16,050 --> 00:15:18,450 If you have a super logarithmic query, 256 00:15:18,450 --> 00:15:20,790 so let's say it's x times larger than log n, 257 00:15:20,790 --> 00:15:25,440 then you need at least a log n over log x update time. 258 00:15:28,080 --> 00:15:30,870 So this result is the one of relevance. 259 00:15:30,870 --> 00:15:35,630 Because we don't know any sublogarithmic updates. 260 00:15:35,630 --> 00:15:37,240 But we do know sublogarithmic queries. 
261 00:15:37,240 --> 00:15:39,720 So this is the regime that we have, 262 00:15:39,720 --> 00:15:42,510 essentially, matching upper bounds. 263 00:15:42,510 --> 00:15:45,450 Let me show you. 264 00:15:45,450 --> 00:15:50,670 This result, this result, and this result, 265 00:15:50,670 --> 00:15:54,220 even, are, in some sense, tight against this trade-off, 266 00:15:54,220 --> 00:15:56,850 this update query trade-off. 267 00:15:56,850 --> 00:15:59,610 So this update is log n times log n. 268 00:15:59,610 --> 00:16:01,260 So x is log n. 269 00:16:01,260 --> 00:16:05,550 And over here we get log n over log log n. 270 00:16:05,550 --> 00:16:07,590 So that's tight against that. 271 00:16:07,590 --> 00:16:11,570 Here, also, we have here x is log log n cubed. 272 00:16:11,570 --> 00:16:15,100 So you take a log of log log n cubed, you get log log log n. 273 00:16:15,100 --> 00:16:17,950 So that's why there's a log log log n down here. 274 00:16:17,950 --> 00:16:19,680 That's the best you could hope for, 275 00:16:19,680 --> 00:16:22,440 given that update time, that's the best query 276 00:16:22,440 --> 00:16:23,160 time you can get. 277 00:16:23,160 --> 00:16:23,880 Same over here. 278 00:16:23,880 --> 00:16:28,302 Given an update time of root n, well, of course, the best 279 00:16:28,302 --> 00:16:29,760 you can hope for is constant query. 280 00:16:29,760 --> 00:16:30,960 But indeed, it works out. 281 00:16:30,960 --> 00:16:33,330 As you're taking log n divided by 1/2 log n, 282 00:16:33,330 --> 00:16:35,430 you get constant. 283 00:16:35,430 --> 00:16:40,260 So these results, they may sound suboptimal. 284 00:16:40,260 --> 00:16:42,030 But in a certain sense, they're optimal. 285 00:16:42,030 --> 00:16:44,880 Of course, these are just single points on the trade-off curve.
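The arithmetic behind these three tight points can be written out as a short worked calculation. The symbols $t_u$ and $t_q$ for update and query time are notation introduced here for convenience; the trade-off itself is the one stated above, where an update time of $x \log n$ forces a query time of at least order $\log n / \log x$.

```latex
\begin{align*}
\text{Trade-off: } & t_u = x \log n \;\Longrightarrow\; t_q = \Omega\!\left(\frac{\log n}{\log x}\right) \\[4pt]
t_u = \log^2 n: \quad & x = \log n, &
  t_q &= \Omega\!\left(\frac{\log n}{\log\log n}\right) \\[4pt]
t_u = \log n \,(\log\log n)^3: \quad & x = (\log\log n)^3,\ \log x = 3\log\log\log n, &
  t_q &= \Omega\!\left(\frac{\log n}{\log\log\log n}\right) \\[4pt]
t_u = \sqrt{n}: \quad & x = \frac{\sqrt{n}}{\log n},\ \log x = \tfrac12 \log n - \log\log n = \Theta(\log n), &
  t_q &= \Omega(1)
\end{align*}
```

In each row the known upper bound on the query time matches the right-hand side up to constant factors, which is the sense in which those results are tight.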
286 00:16:44,880 --> 00:16:48,535 We still really care about the situation of-- 287 00:16:48,535 --> 00:16:50,280 where's the open problem here-- log n 288 00:16:50,280 --> 00:16:54,630 update and query, that would still be ideal. 289 00:16:54,630 --> 00:16:57,450 But these are not as bad as they look, is the point. 290 00:16:57,450 --> 00:16:59,370 Of course, it'd be better if this wasn't a 3, 291 00:16:59,370 --> 00:17:00,740 it was only a 1. 292 00:17:00,740 --> 00:17:03,210 But they're still, in a certain sense, 293 00:17:03,210 --> 00:17:07,540 tight against this, the query bound that you get. 294 00:17:07,540 --> 00:17:10,050 Another open problem here is whether this, 295 00:17:10,050 --> 00:17:13,230 the reverse situation, is ever relevant. 296 00:17:13,230 --> 00:17:17,520 Can you have a sublogarithmic update time 297 00:17:17,520 --> 00:17:20,716 and get any poly log query time? 298 00:17:20,716 --> 00:17:22,049 So this is another open problem. 299 00:17:26,550 --> 00:17:29,240 Little o of log n query. 300 00:17:29,240 --> 00:17:34,290 Sorry, little o of log n update, and let's say poly log query. 301 00:17:38,525 --> 00:17:40,900 The lower bound kind of predicts that something like this 302 00:17:40,900 --> 00:17:41,720 ought to exist. 303 00:17:41,720 --> 00:17:43,210 But it may be it's just impossible. 304 00:17:43,210 --> 00:17:46,978 And there's a different lower bound that makes it impossible. 305 00:17:52,680 --> 00:17:56,450 Some theta I wanted to add here. 306 00:18:02,600 --> 00:18:04,350 OK. 307 00:18:04,350 --> 00:18:06,390 Another fun thing about this lower bound, 308 00:18:06,390 --> 00:18:08,670 is that it holds even if your graphs are paths. 309 00:18:13,320 --> 00:18:16,840 So in particular, that tells us that this dynamic tree result 310 00:18:16,840 --> 00:18:21,570 that we covered last class, say, you need log n. 
311 00:18:21,570 --> 00:18:23,760 If you're going to do both operations in log n, 312 00:18:23,760 --> 00:18:26,730 that's optimal. 313 00:18:26,730 --> 00:18:28,790 Even if your trees are paths, which 314 00:18:28,790 --> 00:18:30,570 is like the really easy case. 315 00:18:30,570 --> 00:18:33,190 And we're going to prove this theorem next class. 316 00:18:33,190 --> 00:18:35,970 So if you want, that'll be our last class, 317 00:18:35,970 --> 00:18:38,355 our last lower bound. 318 00:18:38,355 --> 00:18:39,540 That's next time. 319 00:18:39,540 --> 00:18:40,380 Tune in next time. 320 00:18:40,380 --> 00:18:43,410 But today we're going to focus on upper bounds, namely 321 00:18:43,410 --> 00:18:46,880 these two and this one. 322 00:18:52,980 --> 00:18:56,532 So let's do them. 323 00:18:56,532 --> 00:18:58,975 The first one is Euler-tour trees, 324 00:18:58,975 --> 00:19:03,180 which is another way to do the log n bound. 325 00:19:03,180 --> 00:19:05,430 It's something we need. 326 00:19:05,430 --> 00:19:09,012 We'll need it for doing the log squared solution, 327 00:19:09,012 --> 00:19:10,470 for a reason we'll see in a moment. 328 00:19:23,730 --> 00:19:27,680 So Euler-tour trees go back to 1995, Henzinger and King. 329 00:19:27,680 --> 00:19:31,530 They're simpler dynamic trees, simpler than link cut, 330 00:19:31,530 --> 00:19:34,560 even though they're newer. 331 00:19:34,560 --> 00:19:39,630 And the key difference compared to link cut trees 332 00:19:39,630 --> 00:19:42,630 is that they let you do stuff with subtrees 333 00:19:42,630 --> 00:19:45,519 of the tree instead of paths of the tree. 334 00:19:45,519 --> 00:19:46,560 Link cut trees are great. 335 00:19:46,560 --> 00:19:48,540 We could compute the min or the max 336 00:19:48,540 --> 00:19:50,910 or the sum of a bunch of weights on any path 337 00:19:50,910 --> 00:19:52,350 that we cared about.
338 00:19:52,350 --> 00:19:55,680 But for various reasons, we want to do the same thing 339 00:19:55,680 --> 00:19:56,410 on subtrees. 340 00:19:56,410 --> 00:19:58,890 And this turns out to be easier. 341 00:19:58,890 --> 00:20:02,640 So what do we do with Euler-tour trees? 342 00:20:02,640 --> 00:20:03,930 We take an Euler-tour. 343 00:20:03,930 --> 00:20:07,845 Remember Euler-tour from-- I can never remember-- lecture 15. 344 00:20:07,845 --> 00:20:10,170 [LAUGHS] 345 00:20:10,170 --> 00:20:13,170 I don't quite remember what that lecture was. 346 00:20:13,170 --> 00:20:14,580 When did we do Euler-tours? 347 00:20:18,164 --> 00:20:19,180 Well, doesn't matter. 348 00:20:19,180 --> 00:20:20,805 You remember the idea of an Euler-tour. 349 00:20:20,805 --> 00:20:23,710 It was just walk around the tree, do a depth-first search, 350 00:20:23,710 --> 00:20:27,551 and keep track of every time you visit a node. 351 00:20:27,551 --> 00:20:29,990 So you visit some nodes multiple times. 352 00:20:29,990 --> 00:20:31,820 In general, the degree of the node 353 00:20:31,820 --> 00:20:34,190 is the number of times you visit it. 354 00:20:34,190 --> 00:20:37,220 Every edge gets visited exactly twice. 355 00:20:37,220 --> 00:20:39,530 So it's a linear number of visits total. 356 00:20:42,300 --> 00:20:48,360 I just want to take that linear order of visits, 357 00:20:48,360 --> 00:20:51,840 pull it out straight, store that in a balanced binary search 358 00:20:51,840 --> 00:20:52,590 tree. 359 00:20:52,590 --> 00:20:53,590 That's Euler-tour trees. 360 00:20:53,590 --> 00:20:54,630 They're very simple. 361 00:20:59,760 --> 00:21:10,915 Store the node visits by the Euler-tour, 362 00:21:10,915 --> 00:21:13,000 in a balanced binary search tree. 363 00:21:17,940 --> 00:21:18,650 What's the order? 364 00:21:18,650 --> 00:21:21,590 We're storing them in order by the order 365 00:21:21,590 --> 00:21:25,800 that the Euler-tour visits, the visits.
366 00:21:25,800 --> 00:21:28,250 So this will be the left-most, the min in the tree. 367 00:21:28,250 --> 00:21:29,750 This will be the next and then next. 368 00:21:29,750 --> 00:21:31,708 So if you do an in-order traversal of the tree, 369 00:21:31,708 --> 00:21:34,370 you should get the visits in this order. 370 00:21:34,370 --> 00:21:37,400 This is, of course, a way to balance a tree, in some sense. 371 00:21:37,400 --> 00:21:39,107 I drew a balanced tree as the thing 372 00:21:39,107 --> 00:21:40,190 we're trying to represent. 373 00:21:40,190 --> 00:21:42,800 But again, it could be a totally unbalanced thing. 374 00:21:42,800 --> 00:21:46,530 But when you do an Euler-tour, you get order n visits. 375 00:21:46,530 --> 00:21:48,530 And you throw them into a balanced binary search 376 00:21:48,530 --> 00:21:51,261 tree, it's balanced, of course. 377 00:21:51,261 --> 00:21:53,510 It's a bit of a weird thing, because you're not really 378 00:21:53,510 --> 00:21:56,530 preserving the structure in an obvious way. 379 00:21:56,530 --> 00:21:59,570 But there's one more thing we store, 380 00:21:59,570 --> 00:22:02,960 which will let us do, essentially, whatever we want. 381 00:22:02,960 --> 00:22:12,730 Each node stores two pointers into this structure 382 00:22:12,730 --> 00:22:15,740 to the first and last visit. 383 00:22:21,230 --> 00:22:24,200 So like this node has a pointer to its first visit 384 00:22:24,200 --> 00:22:27,980 and its last visit along this structure. 385 00:22:27,980 --> 00:22:29,907 It doesn't keep track of any middle visits. 386 00:22:29,907 --> 00:22:31,490 Because there could be a lot of those. 387 00:22:31,490 --> 00:22:33,410 And we just want to have a constant number of pointers. 388 00:22:33,410 --> 00:22:35,000 This is going to be a pointer machine data structure 389 00:22:35,000 --> 00:22:36,150 just like link cut trees. 390 00:22:39,680 --> 00:22:41,090 Let's see. 
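To make the sequence of visits and the two per-node pointers concrete, here is a small sketch that computes the Euler tour of a rooted tree together with the index of each node's first and last visit. It stores the visits in a plain Python list rather than a balanced binary search tree, so it only illustrates what is stored, not how it is balanced. The function name, the `children` dictionary format, and the convention that a node is revisited after each child returns are assumptions for illustration.

```python
def euler_tour(children, root):
    """Return (visits, first, last) for a rooted tree.

    children maps a node to the list of its children (leaves may be omitted).
    visits is the sequence of node visits made by a depth-first walk;
    first[v] and last[v] are the positions of v's first and last visit.
    """
    visits = []
    first, last = {}, {}

    def visit(v):
        first.setdefault(v, len(visits))  # recorded only on the first visit
        last[v] = len(visits)             # overwritten on every later visit
        visits.append(v)

    def dfs(v):
        visit(v)
        for c in children.get(v, []):
            dfs(c)
            visit(v)  # back at v after finishing the child's subtree

    dfs(root)
    return visits, first, last


# Example tree: a has children b and c, and b has child d.
visits, first, last = euler_tour({"a": ["b", "c"], "b": ["d"]}, "a")
```

With this convention each edge is traversed twice, so a tree with n nodes produces 2(n - 1) + 1 visits, and only the first and last positions are kept per node, never the middle ones.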
391 00:22:41,090 --> 00:22:43,760 How do we solve our operations, inserting and deleting 392 00:22:43,760 --> 00:22:45,770 of edges and connectivity? 393 00:22:45,770 --> 00:22:47,690 Actually, I'll start with just phrasing it 394 00:22:47,690 --> 00:22:48,560 like link cut trees. 395 00:22:48,560 --> 00:22:51,380 We can do most of the link cut tree operations. 396 00:22:51,380 --> 00:22:53,899 Remember, there was a find root, link and cut 397 00:22:53,899 --> 00:22:54,940 were our main operations. 398 00:22:54,940 --> 00:22:56,065 Then there was aggregation. 399 00:22:56,065 --> 00:22:57,785 But that will have to change. 400 00:22:57,785 --> 00:22:59,380 But if we want to do find root-- 401 00:23:02,690 --> 00:23:05,910 so we're given one of these balanced binary search trees. 402 00:23:05,910 --> 00:23:10,640 In it is the node v. Well, actually, it's 403 00:23:10,640 --> 00:23:13,560 a little weirder here. 404 00:23:13,560 --> 00:23:17,240 But we have some node v in our tree. 405 00:23:17,240 --> 00:23:19,980 And really, we have our pointers into this. 406 00:23:19,980 --> 00:23:23,222 This is a balanced binary search tree here. 407 00:23:23,222 --> 00:23:25,430 We have pointers to the first and last visit in here. 408 00:23:25,430 --> 00:23:25,971 I don't care. 409 00:23:25,971 --> 00:23:27,640 Just take any visit, anything in here, 410 00:23:27,640 --> 00:23:32,150 just say the first visit of v. Then walk up the tree. 411 00:23:32,150 --> 00:23:36,740 Then walk down the tree on the left spine. 412 00:23:36,740 --> 00:23:38,600 That is the min. 413 00:23:38,600 --> 00:23:43,400 And that's going to be the first visit of the root node. 414 00:23:43,400 --> 00:23:47,570 And so this has a pointer, in turn, to the root node. 415 00:23:47,570 --> 00:23:50,120 Boom, you found the root of your tree. 416 00:23:50,120 --> 00:23:53,070 Again, we're maintaining a forest of trees.
417 00:23:53,070 --> 00:23:55,520 And so we just need to find which tree v is in, 418 00:23:55,520 --> 00:23:57,200 walk up, walk down, that's order log 419 00:23:57,200 --> 00:23:59,450 n, because this thing is balanced, 420 00:23:59,450 --> 00:24:01,220 and we found the root. 421 00:24:01,220 --> 00:24:03,500 So that's easy. 422 00:24:03,500 --> 00:24:06,380 Turns out, the other operations are not too easy 423 00:24:06,380 --> 00:24:10,430 if you can split and concatenate your tree. 424 00:24:10,430 --> 00:24:11,480 So let's start with cut. 425 00:24:15,580 --> 00:24:17,420 We need a bigger diagram. 426 00:24:29,540 --> 00:24:33,760 So here's v. Here's the parent of v. I'll call it w. 427 00:24:33,760 --> 00:24:34,840 Here's the edge. 428 00:24:34,840 --> 00:24:39,790 And our goal, and the cut operation 429 00:24:39,790 --> 00:24:45,890 is to delete that edge, separate this tree from that tree. 430 00:24:45,890 --> 00:24:50,680 So what I need to do is isolate this subtree of v. 431 00:24:50,680 --> 00:24:53,330 But conveniently, if you think of the Euler-tour, 432 00:24:53,330 --> 00:24:56,050 the Euler-tour, it does some stuff here. 433 00:24:56,050 --> 00:24:57,910 Eventually it follows this edge. 434 00:24:57,910 --> 00:25:00,700 Visits v for the first time here. 435 00:25:00,700 --> 00:25:02,890 Then it visits all the stuff down here. 436 00:25:02,890 --> 00:25:05,360 Then it visits v for the last time. 437 00:25:05,360 --> 00:25:07,990 Then it follows this edge and then does other stuff. 438 00:25:07,990 --> 00:25:13,060 So the v subtree is a contiguous interval of the Euler-tour. 439 00:25:13,060 --> 00:25:17,560 So we just cut it open, cut it apart. 440 00:25:17,560 --> 00:25:22,450 So we split the BST at the first and last visits 441 00:25:22,450 --> 00:25:29,110 to v, which is exactly what we know. 442 00:25:29,110 --> 00:25:33,340 We have a pointer from v to its first and last visits 443 00:25:33,340 --> 00:25:34,450 in the Euler-tour. 
444 00:25:34,450 --> 00:25:35,520 So split the tree there. 445 00:25:35,520 --> 00:25:36,430 Split the tree there. 446 00:25:36,430 --> 00:25:38,800 You know split in a binary search tree. 447 00:25:38,800 --> 00:25:43,420 You're given a particular key value, let's say, x. 448 00:25:43,420 --> 00:25:47,050 And you split into everything less than x, 449 00:25:47,050 --> 00:25:49,810 and everything greater than or equal to x, or something 450 00:25:49,810 --> 00:25:51,289 like that. 451 00:25:51,289 --> 00:25:52,330 That's a split operation. 452 00:25:52,330 --> 00:25:54,250 It can be done in log n time, and pick 453 00:25:54,250 --> 00:25:56,350 your favorite balanced binary search tree. 454 00:25:56,350 --> 00:25:58,150 Red-black trees, AVL trees, they can all 455 00:25:58,150 --> 00:26:00,392 do this and maintain balance in these two things. 456 00:26:00,392 --> 00:26:02,350 Even though they might be very different sizes, 457 00:26:02,350 --> 00:26:07,612 they'll both be log height relative to their own size. 458 00:26:07,612 --> 00:26:09,070 This is obviously not quite enough. 459 00:26:09,070 --> 00:26:11,020 Because what we've done is basically 460 00:26:11,020 --> 00:26:12,760 cut here and cut here. 461 00:26:12,760 --> 00:26:14,650 Now we have three trees. 462 00:26:14,650 --> 00:26:18,250 We have the one we want for v, v's subtree. 463 00:26:18,250 --> 00:26:20,950 That will correspond to the Euler-tour 464 00:26:20,950 --> 00:26:22,360 of exactly that thing. 465 00:26:22,360 --> 00:26:24,730 So this is the represented tree I'm talking about. 466 00:26:24,730 --> 00:26:28,990 And this is the balanced binary search tree. 467 00:26:28,990 --> 00:26:31,400 But the rest of the tree is in two parts. 468 00:26:31,400 --> 00:26:33,490 There's the part to the left of v, 469 00:26:33,490 --> 00:26:35,540 before we visited v, the part after we visited 470 00:26:35,540 --> 00:26:37,570 v.
Those we just have to stick back together, 471 00:26:37,570 --> 00:26:38,620 so we concatenate them. 472 00:26:43,451 --> 00:26:43,950 Yeah. 473 00:26:50,420 --> 00:26:54,639 So I'll call them the before-v tree and the after-v tree. 474 00:26:54,639 --> 00:26:56,930 There is actually a tiny thing that has to happen here, 475 00:26:56,930 --> 00:26:58,070 which I just remembered. 476 00:26:58,070 --> 00:27:04,950 Which is we need, somewhere, to delete one occurrence of w, 477 00:27:04,950 --> 00:27:07,130 of the parent of v. 478 00:27:07,130 --> 00:27:10,142 Because it used to be we visit w, then we visit v, 479 00:27:10,142 --> 00:27:10,850 blah, blah, blah. 480 00:27:10,850 --> 00:27:12,080 Then we visit w again. 481 00:27:12,080 --> 00:27:14,030 If we cut out this part, we're going 482 00:27:14,030 --> 00:27:15,620 to have two visits to w in a row. 483 00:27:15,620 --> 00:27:17,100 We really just want one visit. 484 00:27:17,100 --> 00:27:20,980 So this is a minor thing, but we need to do a delete in there. 485 00:27:20,980 --> 00:27:22,990 So each of these is log n time. 486 00:27:22,990 --> 00:27:24,750 So the overall time is log n. 487 00:27:24,750 --> 00:27:25,400 Easy. 488 00:27:25,400 --> 00:27:31,060 This is a cut-and-paste kind of argument, all pretty cheap. 489 00:27:31,060 --> 00:27:33,140 OK, now let's do link. 490 00:27:38,080 --> 00:27:40,140 Concatenate is the reverse of split, by the way. 491 00:27:40,140 --> 00:27:41,790 You have two trees. 492 00:27:41,790 --> 00:27:43,650 All the things in the left are smaller 493 00:27:43,650 --> 00:27:44,858 than all things in the right. 494 00:27:44,858 --> 00:27:47,430 You just want to join them together 495 00:27:47,430 --> 00:27:51,060 like a very restrictive kind of merge. 496 00:27:51,060 --> 00:28:00,450 So join, recall, is you have a node v. And you want to make it 497 00:28:00,450 --> 00:28:02,414 a new child of w. 498 00:28:02,414 --> 00:28:05,350 w might have other children. 
499 00:28:05,350 --> 00:28:09,060 It lives in some bigger tree. 500 00:28:09,060 --> 00:28:11,120 We want to add v as a child of w. 501 00:28:11,120 --> 00:28:13,260 So we're assuming here v is a root, for now. 502 00:28:15,880 --> 00:28:16,710 So what do we do? 503 00:28:20,280 --> 00:28:21,750 Sort of the same thing. 504 00:28:21,750 --> 00:28:25,260 We sort of know we want to put v in here somewhere. 505 00:28:25,260 --> 00:28:27,690 It actually doesn't really matter where we put it. 506 00:28:27,690 --> 00:28:30,750 We're not keeping track of the order of nodes in here. 507 00:28:30,750 --> 00:28:33,060 So I'm going to simplify my life and say, 508 00:28:33,060 --> 00:28:35,550 let's put v at the end. 509 00:28:35,550 --> 00:28:41,370 I want to make v the last child of w, like this. 510 00:28:41,370 --> 00:28:45,009 So we can find the last child, in some sense. 511 00:28:45,009 --> 00:28:46,800 After we visit the last child, we come back 512 00:28:46,800 --> 00:28:49,170 and do the last visit to w. 513 00:28:49,170 --> 00:28:51,120 So I'm going to look at the last visit of w, 514 00:28:51,120 --> 00:28:54,400 cut the tree open there, and stick this part in. 515 00:28:54,400 --> 00:28:56,880 Cut the Euler-tour in half there. 516 00:28:56,880 --> 00:28:57,540 So split. 517 00:29:01,270 --> 00:29:07,550 w's tree at w's last visit. 518 00:29:12,640 --> 00:29:14,140 Right? 519 00:29:14,140 --> 00:29:17,780 And then we do a mega concatenate. 520 00:29:17,780 --> 00:29:23,020 We're going to concatenate, basically, 521 00:29:23,020 --> 00:29:25,960 before w's last visit. 522 00:29:31,370 --> 00:29:35,370 We're going to concatenate a single occurrence of w. 523 00:29:35,370 --> 00:29:37,120 Because there's a new occurrence of w now. 524 00:29:37,120 --> 00:29:39,130 We do it before and after v. 525 00:29:39,130 --> 00:29:40,640 So we add another w.
526 00:29:40,640 --> 00:29:46,660 Then we add v's Euler-tour, v's BST, 527 00:29:46,660 --> 00:29:50,590 and then we do after whatever used to be there 528 00:29:50,590 --> 00:29:54,160 after w's last visit. 529 00:29:54,160 --> 00:30:02,930 Let's say, after and including that w visit. 530 00:30:05,690 --> 00:30:07,730 So just the symmetric of this, here 531 00:30:07,730 --> 00:30:09,620 we had to delete one occurrence of w. 532 00:30:09,620 --> 00:30:11,480 Here we have to add one back in. 533 00:30:11,480 --> 00:30:13,430 But you just cut, paste in the thing you need, 534 00:30:13,430 --> 00:30:14,970 and rejoin everything. 535 00:30:14,970 --> 00:30:16,456 So again, very easy. 536 00:30:16,456 --> 00:30:18,455 It's a constant number of log n time operations, 537 00:30:18,455 --> 00:30:22,830 if you can do split and concatenate in log n time. 538 00:30:22,830 --> 00:30:26,060 So you see this, it's like wow, why do we spend so much time 539 00:30:26,060 --> 00:30:27,470 with link cut trees? 540 00:30:27,470 --> 00:30:29,159 This is really easy. 541 00:30:29,159 --> 00:30:30,950 But they have their different applications. 542 00:30:30,950 --> 00:30:32,630 If you want to compute aggregates on paths 543 00:30:32,630 --> 00:30:33,921 you need to use link cut trees. 544 00:30:33,921 --> 00:30:35,270 It's the best we know. 545 00:30:35,270 --> 00:30:37,430 This structure will not do aggregates on paths. 546 00:30:37,430 --> 00:30:39,920 But it will do aggregates on subtrees. 547 00:30:39,920 --> 00:30:42,680 Because subtrees are represented by these intervals 548 00:30:42,680 --> 00:30:43,730 in the Euler-tour tree. 549 00:30:43,730 --> 00:30:46,180 So if you have a weight on all the nodes in here, 550 00:30:46,180 --> 00:30:49,926 say, you can easily find this interval 551 00:30:49,926 --> 00:30:51,050 and do a range query there. 552 00:30:51,050 --> 00:30:52,716 So you take the min of all those things.
553 00:30:52,716 --> 00:30:55,490 If you have subtree mins, you can 554 00:30:55,490 --> 00:30:59,210 take the min of log n things and in log n time, 555 00:30:59,210 --> 00:31:02,840 compute the min in an interval, or min over 556 00:31:02,840 --> 00:31:06,800 a subtree rooted at a node, or max or sum, or whatever. 557 00:31:06,800 --> 00:31:09,820 So here you can do subtree aggregation in log n time. 558 00:31:09,820 --> 00:31:13,370 And we're going to need that for the log squared solution. 559 00:31:17,120 --> 00:31:20,990 One other thing I wanted to maybe mention, 560 00:31:20,990 --> 00:31:21,890 here we did join. 561 00:31:21,890 --> 00:31:26,300 And join assumed that v was the root of its tree. 562 00:31:26,300 --> 00:31:27,890 What if v is not the root of its tree 563 00:31:27,890 --> 00:31:29,570 and we still want to insert an edge? 564 00:31:29,570 --> 00:31:32,150 Because insert, when we're maintaining 565 00:31:32,150 --> 00:31:35,440 an undirected graph, there is no rootedness. 566 00:31:35,440 --> 00:31:37,377 That rootedness is just a tool. 567 00:31:37,377 --> 00:31:39,710 You could say at the first step here, just root the tree 568 00:31:39,710 --> 00:31:43,520 somewhere so that we can have a well-defined beginning 569 00:31:43,520 --> 00:31:44,774 and end to the Euler-tour. 570 00:31:44,774 --> 00:31:46,940 Of course, Euler-tour, in its most natural form, 571 00:31:46,940 --> 00:31:48,680 is actually a cycle. 572 00:31:48,680 --> 00:31:50,330 And it would only have one visit here. 573 00:31:53,150 --> 00:31:54,800 So you can think of it that way. 574 00:31:54,800 --> 00:31:57,470 I'd still like to have the two visits conceptually, 575 00:31:57,470 --> 00:32:02,240 because that makes even the subtree rooted at the root 576 00:32:02,240 --> 00:32:06,110 an interval between the beginning and the end.
577 00:32:06,110 --> 00:32:07,940 But the nice thing of the cycle view 578 00:32:07,940 --> 00:32:09,900 is if I wanted to change who the root is, 579 00:32:09,900 --> 00:32:14,480 say I want this node to be the root, that's just 580 00:32:14,480 --> 00:32:15,799 a cyclic shift of everything. 581 00:32:15,799 --> 00:32:18,340 It just means I want to start the cycle here and end it here. 582 00:32:18,340 --> 00:32:24,390 So I'd basically duplicate this copy of the vertex, singlify 583 00:32:24,390 --> 00:32:25,244 this one. 584 00:32:25,244 --> 00:32:26,160 There's only one copy. 585 00:32:26,160 --> 00:32:28,490 That's the constant number of inserts and deletes. 586 00:32:28,490 --> 00:32:29,660 And now rotate. 587 00:32:29,660 --> 00:32:31,280 [LAUGHS] 588 00:32:31,280 --> 00:32:32,380 What does rotate mean? 589 00:32:32,380 --> 00:32:34,680 Well, before it was a sequential order, 590 00:32:34,680 --> 00:32:36,950 starting here, ending here. 591 00:32:36,950 --> 00:32:39,150 Now, I want it to be a sequential order, say, 592 00:32:39,150 --> 00:32:44,200 starting here, continuing this way, and then ending back here. 593 00:32:44,200 --> 00:32:46,330 So if you look at what that requires, maybe 594 00:32:46,330 --> 00:32:50,060 use another color, I want this part. 595 00:32:55,120 --> 00:32:58,810 That's an interval of the old tree, of the old sequence. 596 00:32:58,810 --> 00:33:01,970 And then I want this part. 597 00:33:01,970 --> 00:33:04,270 So there's two pieces of the old tree. 598 00:33:04,270 --> 00:33:06,590 And I just want to change their order. 599 00:33:06,590 --> 00:33:07,450 It's a cyclic shift. 600 00:33:07,450 --> 00:33:11,780 Cyclic shift is just a cut and then a rejoin the other way. 601 00:33:11,780 --> 00:33:15,050 So I can reroot a tree in log n time as well. 602 00:33:15,050 --> 00:33:19,280 So this is why, in this world, join is equivalent to insert.
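A minimal sketch of the Euler-tour operations described above (find root, cut, link, reroot), under one loud assumption: each tour is kept in a plain Python list, and list slicing stands in for the split and concatenate of a balanced binary search tree. That makes every operation here O(n) rather than the O(log n) from the lecture. The class and method names (`EulerTourForest`, `insert_edge`, and so on) are hypothetical and chosen only for illustration.

```python
class EulerTourForest:
    """Toy Euler-tour forest: each tree is a list of node visits.

    Assumption: slicing replaces BST split/concatenate, so this is O(n)
    per operation, not O(log n).
    """

    def __init__(self, nodes):
        # node -> the tour (list) of the tree containing it
        self.tour = {v: [v] for v in nodes}

    def _store(self, seq):
        # Point every node of the sequence at the same list object.
        for v in seq:
            self.tour[v] = seq

    def _last(self, t, v):
        return len(t) - 1 - t[::-1].index(v)

    def find_root(self, v):
        return self.tour[v][0]  # leftmost visit is the root's first visit

    def connected(self, v, w):
        return self.tour[v] is self.tour[w]

    def cut(self, v):
        """Delete the edge between non-root v and its parent."""
        t = self.tour[v]
        first, last = t.index(v), self._last(t, v)
        assert first > 0, "v must not be the root"
        self._store(t[first:last + 1])         # v's subtree is one interval
        self._store(t[:first] + t[last + 2:])  # drop one duplicate parent visit

    def link(self, v, w):
        """Make root v the last child of w (different trees)."""
        assert self.find_root(v) == v and not self.connected(v, w)
        tw = self.tour[w]
        k = self._last(tw, w)
        # before w's last visit, a new w, v's tour, then w's last visit onward
        self._store(tw[:k] + [w] + self.tour[v] + tw[k:])

    def reroot(self, v):
        """Cyclic shift of the tour so that it starts and ends at v."""
        t = self.tour[v]
        if t[0] == v:
            return
        cyc = t[:-1]  # cycle view: drop the duplicate root visit
        i = cyc.index(v)
        self._store(cyc[i:] + cyc[:i] + [v])

    def insert_edge(self, v, w):
        """General insert: reroot v's tree at v, then link."""
        self.reroot(v)
        self.link(v, w)


f = EulerTourForest("rabc")
f.link("a", "r")
f.link("c", "a")
f.link("b", "r")
print("".join(f.tour["r"]))  # racarbr
```

In a real implementation the list would be replaced by a balanced BST keyed by position, with each node holding pointers to its first and last visits, which is what brings every step down to O(log n).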
603 00:33:19,280 --> 00:33:20,890 If I want to do a general insert, 604 00:33:20,890 --> 00:33:22,640 first I reroot v's tree. 605 00:33:22,640 --> 00:33:25,720 I do a cyclic shift with one split, one concatenate, 606 00:33:25,720 --> 00:33:26,650 like this. 607 00:33:26,650 --> 00:33:28,120 And I get v to be the root. 608 00:33:28,120 --> 00:33:30,560 Then I can do the join. 609 00:33:30,560 --> 00:33:33,710 So again, constant number of splits and concatenates. 610 00:33:37,140 --> 00:33:42,280 So it's another operation we can do, in log n. 611 00:33:42,280 --> 00:33:44,336 Everything here is log n. 612 00:33:48,070 --> 00:33:49,140 Cool. 613 00:33:49,140 --> 00:33:51,840 Any questions about Euler-tour trees? 614 00:33:51,840 --> 00:33:54,180 That's all I'll say about them. 615 00:33:54,180 --> 00:34:01,596 We will next move on to the next result. 616 00:34:01,596 --> 00:34:03,554 AUDIENCE: Can link cut trees also do a reroot? 617 00:34:03,554 --> 00:34:06,010 ERIK DEMAINE: Can link cut trees also do a reroot? 618 00:34:06,010 --> 00:34:07,270 I haven't thought about that. 619 00:34:10,480 --> 00:34:12,684 I would guess so. 620 00:34:12,684 --> 00:34:14,567 [LAUGHS] 621 00:34:14,567 --> 00:34:16,400 But it's going to require some more thought. 622 00:34:16,400 --> 00:34:21,070 So let's see, link cut tree, we can do a little aside here. 623 00:34:21,070 --> 00:34:24,070 Link cut tree, we do an access on v. 624 00:34:24,070 --> 00:34:27,130 So now we get that v is up here, right at the root 625 00:34:27,130 --> 00:34:29,124 of the tree of aux trees. 626 00:34:29,124 --> 00:34:31,540 Now we've got other trees of [INAUDIBLE] trees hanging off 627 00:34:31,540 --> 00:34:33,159 here. 628 00:34:33,159 --> 00:34:35,664 But in particular, this thing represents a path. 629 00:34:38,560 --> 00:34:41,860 It represents the root-to-v path. 630 00:34:41,860 --> 00:34:43,389 It actually has no right child. 631 00:34:43,389 --> 00:34:48,460 So it's a little bit emptier, like that.
632 00:34:48,460 --> 00:34:51,257 So now we want to flip things around and say v is the root. 633 00:34:51,257 --> 00:34:52,840 We basically want to reverse the order 634 00:34:52,840 --> 00:34:55,090 of all the nodes in this tree. 635 00:34:55,090 --> 00:34:57,760 So reversing the order of nodes in a binary search tree, 636 00:34:57,760 --> 00:34:59,860 you can't regularly do. 637 00:34:59,860 --> 00:35:03,160 But if you augment your data structure to say, 638 00:35:03,160 --> 00:35:06,250 I could basically mark these nodes as all inverted, 639 00:35:06,250 --> 00:35:07,880 treat left as right and right as left. 640 00:35:07,880 --> 00:35:10,180 It's basically the dyslexic approach. 641 00:35:10,180 --> 00:35:13,570 You have a bit saying from here down 642 00:35:13,570 --> 00:35:16,540 to here, mark everything as dyslexic. 643 00:35:16,540 --> 00:35:18,032 And so you invert left and right. 644 00:35:18,032 --> 00:35:19,990 So I think, with this appropriate augmentation, 645 00:35:19,990 --> 00:35:25,200 you can reroot the tree in log n time. 646 00:35:25,200 --> 00:35:28,200 But of course, there's some details to check. 647 00:35:28,200 --> 00:35:30,700 It might even be that's in the original link cut tree paper. 648 00:35:30,700 --> 00:35:32,170 But I don't remember offhand. 649 00:35:32,170 --> 00:35:34,449 So I'm pretty sure it works, and therefore, 650 00:35:34,449 --> 00:35:35,740 gives you dynamic connectivity. 651 00:35:35,740 --> 00:35:37,656 But if it doesn't, Euler-tour trees do it too. 652 00:35:37,656 --> 00:35:41,170 So we're covered either way. 653 00:35:41,170 --> 00:35:42,460 So that's Euler-tour trees. 654 00:35:42,460 --> 00:35:45,782 Next on our list is decremental connectivity in trees. 655 00:35:45,782 --> 00:35:47,740 So we're going to solve a weaker problem, which 656 00:35:47,740 --> 00:35:51,384 is just deletions only instead of inserts and deletes. 657 00:35:51,384 --> 00:35:52,675 We're going to solve it faster.
658 00:35:55,300 --> 00:35:58,450 Constant time per operation. 659 00:35:58,450 --> 00:36:16,650 This is just a fun little result. 660 00:36:16,650 --> 00:36:21,270 It's fun because it uses some techniques we know. 661 00:36:21,270 --> 00:36:27,300 So bound is constant amortized, assuming all edges get deleted. 662 00:36:35,280 --> 00:36:36,780 We need to assume this, essentially, 663 00:36:36,780 --> 00:36:39,460 because we're amortizing against the future. 664 00:36:39,460 --> 00:36:43,710 So we need to get all the way to the end. 665 00:36:43,710 --> 00:36:46,410 Overall, over n operations it's going to take order n time. 666 00:36:49,050 --> 00:36:51,290 This is, again, a series of refinements, 667 00:36:51,290 --> 00:36:53,520 or a bunch of pieces of the solution that get 668 00:36:53,520 --> 00:36:57,540 combined together, not a sequential algorithm. 669 00:36:57,540 --> 00:37:01,200 So the first observation is, of course, we can do order log n. 670 00:37:01,200 --> 00:37:04,350 We know, now, maybe two ways to do it. 671 00:37:04,350 --> 00:37:08,337 But you can use Euler-tour trees and only do cuts, life is easy. 672 00:37:08,337 --> 00:37:10,170 Link cut trees, you could also just do cuts. 673 00:37:10,170 --> 00:37:11,980 You don't even need rerooting. 674 00:37:11,980 --> 00:37:14,790 Rerooting was only for joins. 675 00:37:14,790 --> 00:37:16,060 So that may seem obvious. 676 00:37:16,060 --> 00:37:21,140 But how do we reduce the log n to a constant? 677 00:37:21,140 --> 00:37:23,510 In general, in this class? 678 00:37:23,510 --> 00:37:24,935 AUDIENCE: Lookup tables. 679 00:37:24,935 --> 00:37:27,650 ERIK DEMAINE: Indirection, and maybe lookup tables. 680 00:37:27,650 --> 00:37:28,440 Yep. 681 00:37:28,440 --> 00:37:31,858 How do we do indirection in a tree? 682 00:37:31,858 --> 00:37:33,184 AUDIENCE: Leaf pruning?
683 00:37:33,184 --> 00:37:35,850 ERIK DEMAINE: Leaf pruning, leaf trimming, 684 00:37:35,850 --> 00:37:37,540 whatever you want to call it. 685 00:37:37,540 --> 00:37:41,730 Should call it the Edward Scissorhands approach, 686 00:37:41,730 --> 00:37:43,380 or something. 687 00:37:43,380 --> 00:37:45,420 So leaf trimming, what I want to do 688 00:37:45,420 --> 00:37:51,540 is cut below maximally deep nodes 689 00:37:51,540 --> 00:37:52,890 that are [INAUDIBLE] log n. 690 00:38:00,510 --> 00:38:04,536 Say with greater than log n descendants. 691 00:38:09,400 --> 00:38:12,600 So we have our top tree. 692 00:38:12,600 --> 00:38:14,257 And then hanging off some nodes here, 693 00:38:14,257 --> 00:38:15,840 we're going to have some bottom trees. 694 00:38:23,150 --> 00:38:26,840 These each have, at most, log n nodes. 695 00:38:26,840 --> 00:38:30,227 These have bigger than log n nodes below them. 696 00:38:30,227 --> 00:38:32,060 So we can charge every leaf in the structure 697 00:38:32,060 --> 00:38:34,950 to the log n nodes below it. 698 00:38:34,950 --> 00:38:42,320 And so up here we have at most, sorry, n over log n leaves. 699 00:38:42,320 --> 00:38:43,580 Not nodes. 700 00:38:43,580 --> 00:38:46,610 Nodes could still be linear if the graph is a path. 701 00:38:46,610 --> 00:38:49,850 But at most, n over log n leaves. 702 00:38:49,850 --> 00:38:53,750 So we have, at most, n over log n branching nodes up here. 703 00:38:53,750 --> 00:38:56,300 So what we need is a structure for dealing 704 00:38:56,300 --> 00:38:59,120 with long paths in the top, a structure 705 00:38:59,120 --> 00:39:02,180 for dealing with log n size things down here, 706 00:39:02,180 --> 00:39:06,390 and a structure for combining the paths together. 707 00:39:06,390 --> 00:39:09,620 And that's the easy part. 
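The leaf-trimming step above can be sketched as follows, assuming the tree is given as a hypothetical `children` dictionary (node to list of children) with a known root. A node stays in the top tree when it has more than log n descendants; everything hanging below a top node becomes a bottom tree. The function name `leaf_trim` and the use of a base-2 logarithm are illustrative choices, not something fixed by the lecture.

```python
import math


def leaf_trim(children, root):
    """Split a rooted tree into top-tree nodes and bottom-tree roots."""
    # Preorder walk: every parent appears before its children.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    n = len(order)
    threshold = math.log2(n) if n > 1 else 0

    # Count descendants bottom-up (children are processed before parents).
    desc = {}
    for v in reversed(order):
        desc[v] = sum(desc[c] + 1 for c in children.get(v, []))

    top = {v for v in order if desc[v] > threshold}
    if not top:
        return top, [root]  # the whole tree is one small bottom tree

    # A bottom tree hangs off a top node but is not itself in the top tree.
    bottom_roots = [c for v in order if v in top
                    for c in children.get(v, []) if c not in top]
    return top, bottom_roots


# A path on 16 nodes: n = 16, log n = 4.
path = {i: [i + 1] for i in range(15)}
top, bottoms = leaf_trim(path, 0)
print(sorted(top), bottoms)
```

Every leaf of the resulting top tree has more than log n descendants that belong to no other top leaf, which is the charging argument behind the n over log n bound on top-tree leaves.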
708 00:39:09,620 --> 00:39:12,080 If we treat each long path as a single edge, 709 00:39:12,080 --> 00:39:15,056 basically, we look at the compressed top tree, 710 00:39:15,056 --> 00:39:16,430 in the sense of compressed tries. 711 00:39:16,430 --> 00:39:17,840 But it's now a tree instead of a trie. 712 00:39:17,840 --> 00:39:18,650 I guess it's a trie. 713 00:39:18,650 --> 00:39:20,026 Whatever. 714 00:39:20,026 --> 00:39:21,650 We look at the compressed tree up here. 715 00:39:21,650 --> 00:39:23,780 That will have size, n over log n. 716 00:39:23,780 --> 00:39:27,300 And then we can afford to use structure one. 717 00:39:27,300 --> 00:39:27,800 Why? 718 00:39:27,800 --> 00:39:30,290 Because in some sense, there will only 719 00:39:30,290 --> 00:39:32,762 be n over log n operations performed. 720 00:39:32,762 --> 00:39:34,970 So it's not a space issue there we're trying to save. 721 00:39:34,970 --> 00:39:35,760 It's a time issue. 722 00:39:35,760 --> 00:39:37,260 So a little different from the past. 723 00:39:39,920 --> 00:39:42,110 There are only n over log n times 724 00:39:42,110 --> 00:39:43,640 that you can destroy a path up here. 725 00:39:43,640 --> 00:39:45,667 Because there's only n over log n paths. 726 00:39:45,667 --> 00:39:48,000 Each time you destroy a path, I do an operation up here, 727 00:39:48,000 --> 00:39:49,030 I pay log n. 728 00:39:49,030 --> 00:39:51,950 But if the total number of operations is n over log n, 729 00:39:51,950 --> 00:39:53,120 total cost is linear. 730 00:39:53,120 --> 00:39:56,670 So constant amortized against the future. 731 00:39:56,670 --> 00:40:03,040 So I'm going to use structure one on the compressed top tree. 732 00:40:08,640 --> 00:40:11,820 And what remains is what do I do in the bottom trees? 733 00:40:11,820 --> 00:40:13,050 What do I do on the paths? 734 00:40:13,050 --> 00:40:15,297 And then how do I combine all those results together? 735 00:40:15,297 --> 00:40:17,380 Let's first talk about how to combine the results.
736 00:40:17,380 --> 00:40:19,650 There are a couple of different cases. 737 00:40:19,650 --> 00:40:23,220 So we're doing a query on v, w, 738 00:40:23,220 --> 00:40:24,990 and we want to do a connectivity query. 739 00:40:27,630 --> 00:40:32,940 So it could be that v and w are in the same bottom tree. 740 00:40:32,940 --> 00:40:37,860 In that case, we just need to do a query within a bottom tree. 741 00:40:37,860 --> 00:40:42,540 So as long as we can solve the bottom tree case quickly, 742 00:40:42,540 --> 00:40:43,650 we're happy. 743 00:40:43,650 --> 00:40:45,960 Constant time, I guess. 744 00:40:45,960 --> 00:40:49,700 That will turn out to be pretty easy. 745 00:40:49,700 --> 00:40:51,990 That's the easy case. 746 00:40:51,990 --> 00:40:55,260 Otherwise they could be in different bottom trees. 747 00:40:55,260 --> 00:41:02,430 So it could be v's down here and w's in some other bottom tree. 748 00:41:02,430 --> 00:41:05,080 So then we have three queries we need to do. 749 00:41:05,080 --> 00:41:06,570 One is like this. 750 00:41:06,570 --> 00:41:08,700 One is like this. 751 00:41:08,700 --> 00:41:11,880 We need to test the single edge, but that's trivial. 752 00:41:11,880 --> 00:41:14,660 And then we need to do a query in the top. 753 00:41:14,660 --> 00:41:16,300 Can you get from here to here? 754 00:41:16,300 --> 00:41:18,500 Now, these nodes are a little bit special, 755 00:41:18,500 --> 00:41:19,500 because they are leaves. 756 00:41:19,500 --> 00:41:22,710 They will be at the ends of paths. 757 00:41:22,710 --> 00:41:26,550 So this is really just a query we can do using structure one. 758 00:41:26,550 --> 00:41:32,610 Because we take either an entire path or no path at all. 759 00:41:32,610 --> 00:41:34,650 So we can look at the compressed tree 760 00:41:34,650 --> 00:41:36,390 and that's enough, because the leaves 761 00:41:36,390 --> 00:41:38,180 exist on the compressed tree. 762 00:41:38,180 --> 00:41:39,430 That's not always the case.
763 00:41:39,430 --> 00:41:42,390 A different situation is this. 764 00:41:42,390 --> 00:41:45,860 We might have, for example, v up here, w up here, 765 00:41:45,860 --> 00:41:48,760 and v and w are not down here. 766 00:41:48,760 --> 00:41:50,200 So these are irrelevant. 767 00:41:50,200 --> 00:41:53,304 We just want to know, in the top tree, can I get from v to w? 768 00:41:53,304 --> 00:41:54,720 Now, this is a little more awkward 769 00:41:54,720 --> 00:41:58,380 because v is going to live on some path. 770 00:41:58,380 --> 00:41:59,790 w is going to live on some path. 771 00:41:59,790 --> 00:42:02,220 But it might be in the middle of a path. 772 00:42:02,220 --> 00:42:04,330 And so there's now three queries we need to do. 773 00:42:04,330 --> 00:42:08,160 One is can I get to the top of my path? 774 00:42:08,160 --> 00:42:11,290 And then can I get from one path to the other? 775 00:42:11,290 --> 00:42:12,837 So these are branching nodes here. 776 00:42:12,837 --> 00:42:14,420 And can I get from this branching node 777 00:42:14,420 --> 00:42:15,330 to that branching node? 778 00:42:15,330 --> 00:42:16,500 It might not always be going up. 779 00:42:16,500 --> 00:42:17,750 Sometimes you have to go down. 780 00:42:17,750 --> 00:42:21,590 But point is, a constant number of queries to path structures, 781 00:42:21,590 --> 00:42:22,830 to compressed structures. 782 00:42:22,830 --> 00:42:24,900 This is, again, the one structure. 783 00:42:24,900 --> 00:42:28,596 And a constant number of calls to a bottom structure 784 00:42:28,596 --> 00:42:29,970 will suffice to answer the query. 785 00:42:29,970 --> 00:42:32,220 There's a few more cases that I haven't drawn. 786 00:42:32,220 --> 00:42:35,406 If you could be half of this and half of this. 787 00:42:35,406 --> 00:42:37,260 But it can all be done, as long as we 788 00:42:37,260 --> 00:42:39,300 can solve bottom and paths. 789 00:42:39,300 --> 00:42:42,540 So let's solve paths first, I think. 
790 00:42:42,540 --> 00:42:45,900 Oh no, bottom trees, fine. 791 00:42:45,900 --> 00:42:49,260 So part 3 of the solution is solve a bottom tree. 792 00:42:52,170 --> 00:42:54,210 Here we can do it in constant worst case. 793 00:42:58,530 --> 00:43:01,740 So this is essentially the lookup table thing, 794 00:43:01,740 --> 00:43:03,810 although it's a pretty simple lookup table. 795 00:43:03,810 --> 00:43:08,860 What we do, bottom tree has only log n nodes. 796 00:43:08,860 --> 00:43:12,480 So we can represent 1 bit per node in a single word. 797 00:43:12,480 --> 00:43:18,930 And what we'll do, store bit vector of which 798 00:43:18,930 --> 00:43:23,380 edges have been deleted. 799 00:43:27,750 --> 00:43:28,650 In what order? 800 00:43:28,650 --> 00:43:29,490 I don't really care. 801 00:43:29,490 --> 00:43:33,810 Just pick some fixed order on the edges down here, 802 00:43:33,810 --> 00:43:35,790 and store the bits in that order. 803 00:43:35,790 --> 00:43:37,840 And say every edge knows its sequence. 804 00:43:37,840 --> 00:43:40,760 So you could do a depth first search at the beginning, 805 00:43:40,760 --> 00:43:44,140 to label the edges in some canonical order. 806 00:43:44,140 --> 00:43:46,440 And then every edge knows what its bit position 807 00:43:46,440 --> 00:43:47,230 is in the vector. 808 00:43:47,230 --> 00:43:49,470 So when I go to delete an edge, I just 809 00:43:49,470 --> 00:43:53,740 set that one bit using an or operation. 810 00:43:53,740 --> 00:43:54,450 Cool. 811 00:43:54,450 --> 00:43:55,620 So I can delete edges. 812 00:43:55,620 --> 00:43:59,190 I can actually even insert edges down here that used to exist. 813 00:43:59,190 --> 00:44:00,657 I could undelete an edge. 814 00:44:00,657 --> 00:44:02,490 But that only works in the bottom structure. 815 00:44:02,490 --> 00:44:04,850 So not generally useful. 816 00:44:04,850 --> 00:44:07,510 OK, so it's clear how to do an update, how to do a delete.
817 00:44:07,510 --> 00:44:08,730 I just mark a bit. 818 00:44:08,730 --> 00:44:10,212 How do I do a query? 819 00:44:10,212 --> 00:44:12,420 I want to know, given two nodes, v and w, one of them 820 00:44:12,420 --> 00:44:14,250 could be the root of the tree. 821 00:44:14,250 --> 00:44:16,560 I want to know can I get from v to w? 822 00:44:16,560 --> 00:44:19,340 Or there's only a single path that might exist. 823 00:44:19,340 --> 00:44:24,460 So it's a matter of has any edge along this path been deleted? 824 00:44:24,460 --> 00:44:32,580 So a general way to write that is I have w. 825 00:44:32,580 --> 00:44:34,980 I look at v's path to the root. 826 00:44:34,980 --> 00:44:36,840 I look at w's path to the root. 827 00:44:36,840 --> 00:44:39,450 At some point, we reach the LCA. 828 00:44:39,450 --> 00:44:42,890 What I want to know is, have any of these edges been deleted? 829 00:44:42,890 --> 00:44:44,160 If yes, I can't get there. 830 00:44:44,160 --> 00:44:45,780 If none of them have been deleted, I can get there. 831 00:44:45,780 --> 00:44:46,950 It's an if and only if. 832 00:44:46,950 --> 00:44:49,980 I don't care about these edges up here above the LCA. 833 00:44:49,980 --> 00:44:53,170 Because I only need to go to the LCA and back down. 834 00:44:53,170 --> 00:44:55,170 So what do I do? 835 00:44:55,170 --> 00:45:07,916 Each vertex stores a bit vector of its ancestors, 836 00:45:07,916 --> 00:45:09,750 of the ancestor edges. 837 00:45:17,740 --> 00:45:21,930 Well, ancestor edges is kind of weird. 838 00:45:21,930 --> 00:45:26,790 It stores a bit vector representing its path 839 00:45:26,790 --> 00:45:27,290 to the root. 840 00:45:31,372 --> 00:45:32,700 That's this red stuff. 841 00:45:32,700 --> 00:45:34,890 I want to know, for vertex v, what 842 00:45:34,890 --> 00:45:38,760 are the edges along the path to the root of this bottom tree? 843 00:45:38,760 --> 00:45:39,510 Just preprocessed.
844 00:45:39,510 --> 00:45:41,884 You can build this thing as you do the depth first search 845 00:45:41,884 --> 00:45:42,570 and store it. 846 00:45:42,570 --> 00:45:45,220 Again, it's one word per vertex. 847 00:45:45,220 --> 00:45:48,962 So I can fit it with constant space. 848 00:45:48,962 --> 00:45:50,880 It's constant space overhead. 849 00:45:50,880 --> 00:45:52,920 And now, if I have two vertices, I 850 00:45:52,920 --> 00:45:55,950 take the XOR of those two paths. 851 00:45:55,950 --> 00:45:58,470 That will give me this part. 852 00:45:58,470 --> 00:46:03,250 And then I use that as a mask into the bit vector, say, 853 00:46:03,250 --> 00:46:04,677 are any of these 1? 854 00:46:04,677 --> 00:46:05,385 How do I do that? 855 00:46:05,385 --> 00:46:06,420 I just mask. 856 00:46:06,420 --> 00:46:09,780 And if the result is the 0 word, then I know none of them are 1. 857 00:46:09,780 --> 00:46:14,080 Otherwise, I know some edge was deleted and I'm screwed. 858 00:46:14,080 --> 00:46:21,660 So I take XOR of v's path and w's path. 859 00:46:25,710 --> 00:46:30,090 And then I mask this thing, which stores which edges 860 00:46:30,090 --> 00:46:31,410 were deleted. 861 00:46:31,410 --> 00:46:35,270 And then I check whether that word equals 0. 862 00:46:35,270 --> 00:46:37,564 If it equals 0, yes, I can get there. 863 00:46:37,564 --> 00:46:39,230 If it doesn't equal 0, I can't get there 864 00:46:39,230 --> 00:46:41,230 because some edge was deleted. 865 00:46:41,230 --> 00:46:46,600 So very easy, because we can fit log n bits in a word. 866 00:46:49,800 --> 00:46:51,770 OK, that's the bottom tree structure. 867 00:46:51,770 --> 00:46:54,935 Next we need a path structure. 868 00:47:23,180 --> 00:47:26,960 Here we can do constant amortized with a similar trick. 869 00:47:30,650 --> 00:47:33,560 So here's our path. 870 00:47:33,560 --> 00:47:37,610 We're going to use, essentially, indirection again.
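The bottom-tree structure just described can be sketched as below. This is an illustration under stated assumptions: the tree is given as a hypothetical `parent` dictionary (child to parent), each edge is identified by its child endpoint, bit positions follow dictionary order rather than a depth first search order (the lecture notes any fixed order works), and Python integers stand in for machine words, so the "fits in one word" condition is assumed rather than enforced.

```python
class BottomTree:
    """Decremental connectivity in a small tree using bit vectors."""

    def __init__(self, parent):
        self.parent = parent
        # Edge (child, parent[child]) gets a fixed bit position.
        self.bit = {child: i for i, child in enumerate(parent)}
        # path[v] = mask of the edges on v's path to the root.
        self.path = {}
        for v in set(parent) | set(parent.values()):
            self._path_mask(v)
        self.deleted = 0  # bit i is 1 once edge i has been deleted

    def _path_mask(self, v):
        if v not in self.path:
            if v in self.parent:
                up = self._path_mask(self.parent[v])
                self.path[v] = (1 << self.bit[v]) | up
            else:
                self.path[v] = 0  # the root has an empty path
        return self.path[v]

    def delete(self, child):
        """Delete the edge between `child` and its parent: one OR."""
        self.deleted |= 1 << self.bit[child]

    def connected(self, v, w):
        # XOR cancels the shared edges above the LCA, leaving exactly
        # the edges on the v-to-w path; mask against the deleted edges.
        return ((self.path[v] ^ self.path[w]) & self.deleted) == 0


t = BottomTree({"a": "r", "b": "r", "c": "a", "d": "a"})
t.delete("a")                  # remove edge a-r
print(t.connected("c", "d"))   # True: c-a-d does not use a-r
print(t.connected("c", "b"))   # False: every c-to-b route needs a-r
```

Both `delete` and `connected` are a constant number of word operations, which is where the constant worst-case bound for bottom trees comes from.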
871 00:47:37,610 --> 00:47:40,349 Again, we know how to solve this in log n. 872 00:47:40,349 --> 00:47:42,890 When we have a sequential thing, our usual way of indirection 873 00:47:42,890 --> 00:47:44,030 is to split into chunks. 874 00:47:47,850 --> 00:47:51,410 Each chunk here, I need to be about log n in size. 875 00:47:51,410 --> 00:47:53,840 So I'll just make it exactly log n in size. 876 00:47:53,840 --> 00:47:56,570 So there's n over log n chunks. 877 00:47:56,570 --> 00:47:59,330 So if I store a summary vector, we 878 00:47:59,330 --> 00:48:01,880 can think of these as being 0, 1, 0, 1, 879 00:48:01,880 --> 00:48:04,280 whatever, where the 1's mean this edge has been deleted. 880 00:48:04,280 --> 00:48:06,300 0 means it hasn't been deleted. 881 00:48:06,300 --> 00:48:14,630 All right, let me write n over log n chunks, each log n edges. 882 00:48:19,190 --> 00:48:21,720 And I'm going to store each chunk as a bit vector. 883 00:48:28,180 --> 00:48:30,390 Bit vector has log n bits. 884 00:48:30,390 --> 00:48:31,620 w's at least log n. 885 00:48:31,620 --> 00:48:34,690 So I can do the same sort of things I could do before. 886 00:48:34,690 --> 00:48:39,270 If I delete an edge, I just set a 1 bit in the chunk vector. 887 00:48:42,990 --> 00:48:45,370 Now, that's good for local queries. 888 00:48:45,370 --> 00:48:47,310 If I want to do a long-distance query, 889 00:48:47,310 --> 00:48:51,690 I need to basically summarize, are any of these edges deleted? 890 00:48:51,690 --> 00:48:53,535 If so, I'll put a 1 up here. 891 00:48:53,535 --> 00:48:54,660 Any of these edges deleted? 892 00:48:54,660 --> 00:48:57,000 If not, put a 0. 893 00:48:57,000 --> 00:48:59,340 And now, I want a structure over-- this, 894 00:48:59,340 --> 00:49:02,490 you might call the summary vector, like in [INAUDIBLE]. 895 00:49:02,490 --> 00:49:05,470 Done that before with indirection as well. 896 00:49:05,470 --> 00:49:08,280 Now, this summary vector has size n over log n.
897 00:49:08,280 --> 00:49:13,360 So I can afford to use our log n solution that we started with. 898 00:49:13,360 --> 00:49:26,730 So use 1 on summary vector of chunks. 899 00:49:26,730 --> 00:49:31,230 Because again, the first time I set one of the bits in here, 900 00:49:31,230 --> 00:49:32,700 I have to do an update up here. 901 00:49:32,700 --> 00:49:33,600 But only then. 902 00:49:33,600 --> 00:49:36,409 Once it's been done once, I don't have to update again. 903 00:49:36,409 --> 00:49:38,200 So there will only be n over log n updates. 904 00:49:38,200 --> 00:49:40,510 So I can afford log n per update. 905 00:49:40,510 --> 00:49:43,410 It'll still be linear time total, constant per operation 906 00:49:43,410 --> 00:49:45,300 amortized. 907 00:49:45,300 --> 00:49:46,870 So this is my solution. 908 00:49:46,870 --> 00:49:52,440 If I have a query, let's say I want to know between here 909 00:49:52,440 --> 00:49:57,000 and here, I first check, can I get to the right endpoint 910 00:49:57,000 --> 00:49:58,200 locally? 911 00:49:58,200 --> 00:50:00,720 Which is a query within the chunk. 912 00:50:00,720 --> 00:50:02,660 Which is again, I just mask out these bits. 913 00:50:02,660 --> 00:50:04,494 I say and now do I have the 0 vector? 914 00:50:04,494 --> 00:50:06,660 In which case, nothing I care about has been deleted 915 00:50:06,660 --> 00:50:07,910 and I can get there? 916 00:50:07,910 --> 00:50:11,160 Or if I don't have the 0 vector, I can't get there. 917 00:50:11,160 --> 00:50:13,740 I want to know, can I get to the left from here? 918 00:50:13,740 --> 00:50:16,020 Again it's a mask and a check against 0. 919 00:50:16,020 --> 00:50:17,910 And then I want to know, can I go over 920 00:50:17,910 --> 00:50:20,730 this interval of chunks? 921 00:50:20,730 --> 00:50:23,970 And that, I can use the summary vector for. 922 00:50:23,970 --> 00:50:28,032 And again, that is a mask of a subinterval, 923 00:50:28,032 --> 00:50:29,490 and then checking whether it has 0. 
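Here is a rough sketch of the chunked path structure, under stated assumptions. Edge j joins path vertices j and j+1, the names are invented for the example, and the summary vector is kept as a single Python integer rather than the log n structure the lecture puts on top of it, so this shows the three-part query but not the amortized bound.

```python
class PathDecremental:
    """Decremental connectivity on a path, split into chunks of chunk_bits edges."""

    def __init__(self, num_edges, chunk_bits):
        self.b = chunk_bits
        num_chunks = (num_edges + chunk_bits - 1) // chunk_bits
        self.chunks = [0] * num_chunks  # per chunk: bit set of deleted edges
        self.summary = 0                # bit c set iff chunk c has a deleted edge

    def delete_edge(self, j):
        c, off = divmod(j, self.b)
        if self.chunks[c] == 0:
            # First deletion in this chunk: the only time the summary changes.
            self.summary |= 1 << c
        self.chunks[c] |= 1 << off

    @staticmethod
    def _range_mask(lo, hi):
        # Word with bits lo..hi-1 set (empty when hi <= lo).
        return ((1 << (hi - lo)) - 1) << lo if hi > lo else 0

    def connected(self, u, v):
        if u > v:
            u, v = v, u
        if u == v:
            return True
        cu, ou = divmod(u, self.b)       # first edge on the way is edge u
        cv, ov = divmod(v - 1, self.b)   # last edge on the way is edge v-1
        if cu == cv:
            return (self.chunks[cu] & self._range_mask(ou, ov + 1)) == 0
        left = self.chunks[cu] & self._range_mask(ou, self.b)
        right = self.chunks[cv] & self._range_mask(0, ov + 1)
        middle = self.summary & self._range_mask(cu + 1, cv)
        # The AND of the three results: all three parts must be deletion-free.
        return left == 0 and right == 0 and middle == 0
```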
924 00:50:29,490 --> 00:50:34,320 So I take the and of these three results, and I'm done. 925 00:50:34,320 --> 00:50:36,137 That gives me a query over a path. 926 00:50:36,137 --> 00:50:36,970 So it's kind of fun. 927 00:50:36,970 --> 00:50:38,860 We use two levels of indirection. 928 00:50:38,860 --> 00:50:43,040 One to reduce the number of leaves. 929 00:50:43,040 --> 00:50:44,910 And then these were sort of trivial. 930 00:50:44,910 --> 00:50:48,900 And then within the structure, we could only afford things 931 00:50:48,900 --> 00:50:50,400 on the non-branching part, so we had 932 00:50:50,400 --> 00:50:51,649 to deal with paths separately. 933 00:50:51,649 --> 00:50:54,270 And there we use another level of indirection. 934 00:50:54,270 --> 00:50:56,550 But in the end, we get rid of all of our logs. 935 00:50:56,550 --> 00:51:02,270 And it's constant amortized for deletions in a tree. 936 00:51:02,270 --> 00:51:04,360 Questions about that? 937 00:51:04,360 --> 00:51:09,250 That was our second result. We have one more. 938 00:51:34,929 --> 00:51:36,470 Last result we're going to talk about 939 00:51:36,470 --> 00:51:39,310 is log squared n update, log over log log 940 00:51:39,310 --> 00:51:41,980 query, for general graphs. 941 00:51:41,980 --> 00:51:45,890 Finally, we graduate from trees to general graphs. 942 00:52:24,100 --> 00:52:28,420 This is a result by Holm, de Lichtenberg, and Thorup, 943 00:52:28,420 --> 00:52:32,740 2001 is the journal version. 944 00:52:32,740 --> 00:52:34,540 So we want to solve dynamic connectivity. 945 00:52:34,540 --> 00:52:37,372 We want to understand, in general, 946 00:52:37,372 --> 00:52:39,580 when two vertices are in the same connected component 947 00:52:39,580 --> 00:52:40,780 or not. 948 00:52:40,780 --> 00:52:43,370 That sounds tricky. 949 00:52:43,370 --> 00:52:45,970 We're going to do that in a pretty simple way 950 00:52:45,970 --> 00:52:47,080 at a high level. 
951 00:52:47,080 --> 00:52:52,286 High level is, we want to store a spanning forest. 952 00:52:52,286 --> 00:52:53,660 You know what a spanning tree is. 953 00:52:53,660 --> 00:52:56,786 Spanning forest, well, your graph might be disconnected. 954 00:52:56,786 --> 00:52:58,660 That's the whole point of the data structure. 955 00:52:58,660 --> 00:53:01,344 If it's not disconnected, you answer yes all the time. 956 00:53:01,344 --> 00:53:03,010 When it's disconnected you say, OK, I'll 957 00:53:03,010 --> 00:53:04,640 have a spanning tree for this connected component, 958 00:53:04,640 --> 00:53:06,473 spanning tree for every connected component. 959 00:53:06,473 --> 00:53:08,770 Together we call that a spanning forest. 960 00:53:08,770 --> 00:53:12,010 That's the maximal connectivity you can get, but represented 961 00:53:12,010 --> 00:53:12,940 as a tree. 962 00:53:12,940 --> 00:53:14,740 Now, we have a great way to represent 963 00:53:14,740 --> 00:53:17,710 trees, Euler-tour trees. 964 00:53:17,710 --> 00:53:21,010 And if you somehow connected together two components, 965 00:53:21,010 --> 00:53:24,460 that is an insertion of an edge in an Euler-tour structure. 966 00:53:24,460 --> 00:53:26,020 Great. 967 00:53:26,020 --> 00:53:28,600 So we can maintain all these connected components 968 00:53:28,600 --> 00:53:31,750 and merge them if we have to, using an insert 969 00:53:31,750 --> 00:53:33,790 in Euler-tour structure. 970 00:53:33,790 --> 00:53:35,659 And to do a connectivity query-- 971 00:53:35,659 --> 00:53:38,200 I don't think I mentioned this-- you do find root on v and w, 972 00:53:38,200 --> 00:53:40,050 see whether they're the same root. 973 00:53:40,050 --> 00:53:43,510 Because root is a canonical name for the connected component. 974 00:53:43,510 --> 00:53:46,885 You can solve connectivity using find root in constant time. 975 00:53:46,885 --> 00:53:51,270 Well, constant extra time, log for the find root. 
976 00:53:51,270 --> 00:53:52,900 OK, so everything looks great. 977 00:53:52,900 --> 00:53:53,890 I can do connectivity. 978 00:53:53,890 --> 00:53:55,120 I can do insertion. 979 00:53:55,120 --> 00:53:57,730 What about deletion? 980 00:53:57,730 --> 00:54:02,080 If I delete an edge that is not in my spanning forest, 981 00:54:02,080 --> 00:54:02,852 I'm happy. 982 00:54:02,852 --> 00:54:04,810 I have exactly the same connectivity as before, 983 00:54:04,810 --> 00:54:07,180 as proved by the spanning forest. 984 00:54:07,180 --> 00:54:09,040 The trouble is when I delete an edge that's 985 00:54:09,040 --> 00:54:10,960 in the spanning forest. 986 00:54:10,960 --> 00:54:14,080 Then it's like, uh, maybe. 987 00:54:14,080 --> 00:54:17,420 So here's a spanning tree, whatever. 988 00:54:22,690 --> 00:54:27,280 And now let's say I delete this edge here. 989 00:54:27,280 --> 00:54:29,440 Then there's a couple possibilities. 990 00:54:33,147 --> 00:54:34,480 It was a graph, it's not a tree. 991 00:54:34,480 --> 00:54:35,620 So there could be some other edge 992 00:54:35,620 --> 00:54:36,870 that connects those two trees. 993 00:54:36,870 --> 00:54:38,509 Then I have to find it. 994 00:54:38,509 --> 00:54:39,550 Oh, I'm going to find it. 995 00:54:39,550 --> 00:54:40,990 That's going to be annoying. 996 00:54:40,990 --> 00:54:42,490 Or it could be there's no such edge. 997 00:54:42,490 --> 00:54:43,450 Then I'm fine. 998 00:54:43,450 --> 00:54:45,610 Then it's just a cut. 999 00:54:45,610 --> 00:54:47,050 But distinguishing those two cases 1000 00:54:47,050 --> 00:54:48,258 is going to be our challenge. 1001 00:54:48,258 --> 00:54:50,950 And that's where we're going to lose another log factor, 1002 00:54:50,950 --> 00:54:53,189 but only another log factor. 1003 00:54:53,189 --> 00:54:55,480 To only lose another log factor, what we're going to do 1004 00:54:55,480 --> 00:54:59,560 is not just store one spanning forest.
1005 00:54:59,560 --> 00:55:01,040 We will store the spanning forest, 1006 00:55:01,040 --> 00:55:03,910 but then we're going to hierarchically decompose it 1007 00:55:03,910 --> 00:55:08,110 and say, well, yeah, there's this big tree. 1008 00:55:08,110 --> 00:55:13,600 But some of the edges I'm going to put in the next level down. 1009 00:55:13,600 --> 00:55:14,560 Some I won't. 1010 00:55:14,560 --> 00:55:17,890 So some subset of this forest will be at the next level down. 1011 00:55:17,890 --> 00:55:19,660 And there's going to be log n levels. 1012 00:55:19,660 --> 00:55:21,850 That's where we lose our log factor. 1013 00:55:21,850 --> 00:55:25,060 And the weird thing is, there's no real reason 1014 00:55:25,060 --> 00:55:29,200 to put things down, except we'll use it as a charging scheme. 1015 00:55:29,200 --> 00:55:32,740 We'll prove that an edge can only go down log n levels. 1016 00:55:32,740 --> 00:55:36,190 And then it has to get deleted before it becomes relevant 1017 00:55:36,190 --> 00:55:36,820 again. 1018 00:55:36,820 --> 00:55:42,960 So it will let us charge only log n times per edge. 1019 00:55:42,960 --> 00:55:44,710 OK, that's about as good as I can give you 1020 00:55:44,710 --> 00:55:45,626 a high-level overview. 1021 00:55:45,626 --> 00:55:48,040 Now we have to see the details. 1022 00:55:48,040 --> 00:55:53,740 It's kind of amazing that this works, but it does. 1023 00:55:53,740 --> 00:55:57,320 So we're going to talk about the level of an edge. 1024 00:55:57,320 --> 00:56:01,970 This is not a definition, per se. 1025 00:56:01,970 --> 00:56:04,420 But it's a change in quantity over time. 1026 00:56:04,420 --> 00:56:06,520 As you do edge deletions, some edges 1027 00:56:06,520 --> 00:56:08,590 are going to decrease in level. 1028 00:56:08,590 --> 00:56:13,060 They all start at log n. 1029 00:56:13,060 --> 00:56:14,576 Log n is going to be the top level. 1030 00:56:14,576 --> 00:56:15,950 That's the whole spanning forest. 
1031 00:56:19,390 --> 00:56:23,320 They will only decrease, so it's monotone. 1032 00:56:23,320 --> 00:56:25,840 And they can never get below 0. 1033 00:56:25,840 --> 00:56:27,430 So we start at log n. 1034 00:56:27,430 --> 00:56:28,960 They could go down. 1035 00:56:28,960 --> 00:56:31,720 And the lowest value is 0. 1036 00:56:31,720 --> 00:56:35,350 Now, G is my graph. 1037 00:56:35,350 --> 00:56:40,105 And Gi is going to be the subgraph of lower-level edges. 1038 00:56:47,785 --> 00:56:52,210 It's going to say, level less than or equal to i. 1039 00:56:52,210 --> 00:56:54,730 So in particular, G log n is the whole graph. 1040 00:56:57,830 --> 00:57:00,530 So we always include lower-level edges. 1041 00:57:00,530 --> 00:57:03,490 So level 0 is going to appear in all the Gi's. 1042 00:57:03,490 --> 00:57:08,370 But if we look at G 0, it's only level 0 edges. 1043 00:57:08,370 --> 00:57:10,186 G1 is level 0 and 1. 1044 00:57:10,186 --> 00:57:12,310 And so this is sort of a hierarchical decomposition 1045 00:57:12,310 --> 00:57:12,790 of the graph. 1046 00:57:12,790 --> 00:57:14,498 We have fewer edges at the bottom levels. 1047 00:57:17,530 --> 00:57:19,450 And there's going to be two key invariants 1048 00:57:19,450 --> 00:57:22,530 we have over these structures. 1049 00:57:22,530 --> 00:57:27,740 Invariant one is going to be every connected component 1050 00:57:27,740 --> 00:57:30,135 of Gi is small. 1051 00:57:43,711 --> 00:57:46,240 It's going to be size, at most, 2 to the i. 1052 00:57:51,140 --> 00:57:52,570 This is really what's going to let 1053 00:57:52,570 --> 00:57:54,160 us charge against something. 1054 00:57:54,160 --> 00:57:56,680 Whenever you go down a level, the max size 1055 00:57:56,680 --> 00:57:59,740 of a connected component goes down by a factor of 2. 1056 00:57:59,740 --> 00:58:03,380 So at level 0, all components have size 1. 1057 00:58:03,380 --> 00:58:05,760 There are no edges at level 0.
1058 00:58:05,760 --> 00:58:06,580 So I kind of lied. 1059 00:58:06,580 --> 00:58:09,320 I guess the lowest level is 1. 1060 00:58:09,320 --> 00:58:12,140 At level 1, you can have two vertices. 1061 00:58:12,140 --> 00:58:15,000 So there can be isolated edges in G1. 1062 00:58:15,000 --> 00:58:16,050 But that's it. 1063 00:58:16,050 --> 00:58:20,500 You can't have a path of length 2 in G1, and so on up the tree. 1064 00:58:20,500 --> 00:58:22,420 At G log n, you can have the whole graph. 1065 00:58:22,420 --> 00:58:24,010 Whole thing could be connected. 1066 00:58:24,010 --> 00:58:27,100 But this will let us charge to something 1067 00:58:27,100 --> 00:58:28,510 as we go down in the tree. 1068 00:58:31,240 --> 00:58:33,790 As we go down in levels, I should say. 1069 00:58:47,890 --> 00:58:52,900 So next thing we need is a spanning forest for each Gi. 1070 00:58:52,900 --> 00:58:56,050 Fi is going to be spanning forest of Gi. 1071 00:58:58,690 --> 00:59:03,550 So it's going to maintain the connected components of Gi. 1072 00:59:03,550 --> 00:59:06,300 And we're going to store that using an Euler-tour tree. 1073 00:59:13,355 --> 00:59:14,855 There's this issue of pluralization. 1074 00:59:14,855 --> 00:59:16,820 I'll say trees. 1075 00:59:16,820 --> 00:59:18,710 Because Fi is disconnected. 1076 00:59:18,710 --> 00:59:20,330 Each connected component you use, 1077 00:59:20,330 --> 00:59:23,390 you store using an Euler-tour tree. 1078 00:59:23,390 --> 00:59:27,200 Together, it's an Euler-tour forest, I suppose. 1079 00:59:27,200 --> 00:59:28,850 So that way we can do a query. 1080 00:59:28,850 --> 00:59:30,530 And so given two nodes, we can know 1081 00:59:30,530 --> 00:59:34,040 whether they're in the same connected component in Fi, 1082 00:59:34,040 --> 00:59:38,390 just by saying whether they live under the same root.
1083 00:59:41,090 --> 00:59:43,050 Well, in particular, that means that F 1084 00:59:43,050 --> 00:59:45,940 log n, which is a spanning forest of G log 1085 00:59:45,940 --> 00:59:50,940 n, which is everything, that will let me solve queries. 1086 00:59:50,940 --> 00:59:55,780 This is the desired spanning forest 1087 00:59:55,780 --> 00:59:58,677 that will let me ask connectivity queries. 1088 00:59:58,677 --> 01:00:00,260 So all this infrastructure is in order 1089 01:00:00,260 --> 01:00:02,330 to support deletes efficiently. 1090 01:00:02,330 --> 01:00:03,439 But queries are easy. 1091 01:00:03,439 --> 01:00:04,730 We just look at the top forest. 1092 01:00:04,730 --> 01:00:07,520 That's the one we want. 1093 01:00:07,520 --> 01:00:10,400 OK, now second invariant, this is 1094 01:00:10,400 --> 01:00:12,082 where things get interesting. 1095 01:00:15,938 --> 01:00:16,902 Invariant 2. 1096 01:00:19,552 --> 01:00:21,630 The forests have to nest. 1097 01:00:27,320 --> 01:00:30,245 So F log n, of course, has the most edges. 1098 01:00:30,245 --> 01:00:32,420 F0 is going to have the fewest edges. 1099 01:00:32,420 --> 01:00:35,840 But I want them to be contained in this nested structure. 1100 01:00:35,840 --> 01:00:38,000 All this is saying is that there's really only 1101 01:00:38,000 --> 01:00:40,495 one spanning forest, F log n. 1102 01:00:40,495 --> 01:00:50,030 Fi is just F log n, but restricted to the edges of Gi. 1103 01:00:50,030 --> 01:00:52,730 So really we're trying to represent this forest. 1104 01:00:52,730 --> 01:00:55,130 But then, as we look at lower levels, 1105 01:00:55,130 --> 01:00:57,490 we just forget about the higher-level edges. 1106 01:00:57,490 --> 01:01:00,200 Restrict to the lower edges of level less than or equal to i, 1107 01:01:00,200 --> 01:01:01,791 that's our smaller forest. 1108 01:01:01,791 --> 01:01:04,040 This is, in some sense, the hierarchical decomposition 1109 01:01:04,040 --> 01:01:05,275 of the forest.
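Written out, the level structure and the two invariants look roughly like this. The set notation is a paraphrase of what is on the board, with E the edge set of G and each G_i and F_i viewed as a set of edges:

```latex
\begin{align*}
G_i &= \{\, e \in E : \mathrm{level}(e) \le i \,\}, \qquad 0 \le \mathrm{level}(e) \le \log n,\\
\text{Invariant 1:}\quad & \text{every connected component of } G_i \text{ has at most } 2^i \text{ vertices},\\
\text{Invariant 2:}\quad & F_0 \subseteq F_1 \subseteq \cdots \subseteq F_{\log n},
  \qquad F_i = F_{\log n} \cap G_i .
\end{align*}
```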
1110 01:01:05,275 --> 01:01:07,430 Because there's really only one forest. 1111 01:01:07,430 --> 01:01:10,640 That would make our lives way easier. 1112 01:01:10,640 --> 01:01:15,440 Fun fact is that that forest, then, is actually not 1113 01:01:15,440 --> 01:01:16,850 just any spanning forest. 1114 01:01:16,850 --> 01:01:20,420 It's a minimum spanning forest, with respect to level. 1115 01:01:24,762 --> 01:01:26,720 You've probably heard of minimum spanning tree. 1116 01:01:26,720 --> 01:01:29,180 Minimum spanning forest is just the analog 1117 01:01:29,180 --> 01:01:31,060 for disconnected graphs. 1118 01:01:33,680 --> 01:01:39,000 So we're defining the weight of an edge to be its level. 1119 01:01:39,000 --> 01:01:42,460 And so F log n can't just be any spanning forest. 1120 01:01:42,460 --> 01:01:44,750 It has to prefer lower-level edges. 1121 01:01:44,750 --> 01:01:48,440 Otherwise, this nesting structure won't be true. 1122 01:01:48,440 --> 01:01:52,005 Now, that doesn't uniquely define the forest or anything. 1123 01:01:52,005 --> 01:01:53,630 Because maybe all the levels are log n. 1124 01:01:53,630 --> 01:01:56,840 And then every spanning forest is a minimum spanning forest. 1125 01:01:56,840 --> 01:02:00,890 But it's a constraint on spanning forest. 1126 01:02:00,890 --> 01:02:04,040 Cool. Let's go over here. 1127 01:02:28,120 --> 01:02:31,120 Let me quickly say how to do an insert and how to do a query. 1128 01:02:31,120 --> 01:02:33,340 These are really easy. 1129 01:02:33,340 --> 01:02:34,970 Delete is really where the action is, 1130 01:02:34,970 --> 01:02:40,340 but just to make sure we're on the same page, 1131 01:02:40,340 --> 01:02:42,790 and to introduce a little bit of notation, 1132 01:02:42,790 --> 01:02:45,640 and say a little bit about what we do.
1133 01:02:45,640 --> 01:02:48,820 We are going to store incidence lists, which 1134 01:02:48,820 --> 01:02:53,260 is for every vertex, we have a list of the incident 1135 01:02:53,260 --> 01:02:55,570 edges in a linked list. 1136 01:02:59,050 --> 01:03:00,930 So in constant time, we can add our edge to v 1137 01:03:00,930 --> 01:03:02,110 and w's incidence lists. 1138 01:03:02,110 --> 01:03:06,250 That's not maintaining any order or anything special. 1139 01:03:06,250 --> 01:03:13,540 Then we also set the level of the edge to log n. 1140 01:03:13,540 --> 01:03:16,937 That's what I said, every edge starts at level log n. 1141 01:03:16,937 --> 01:03:18,520 And then there's one more thing, which 1142 01:03:18,520 --> 01:03:31,720 is if v and w are disconnected in F log n, 1143 01:03:31,720 --> 01:03:35,110 we can tell that because we can do a connectivity query 1144 01:03:35,110 --> 01:03:36,232 on F log n. 1145 01:03:36,232 --> 01:03:37,690 We have that as an Euler-tour tree. 1146 01:03:37,690 --> 01:03:40,180 So we can see whether v and w are in different components. 1147 01:03:40,180 --> 01:03:41,800 If they are, we have to merge them. 1148 01:03:41,800 --> 01:03:43,750 So we merge them. 1149 01:03:43,750 --> 01:03:47,849 This is what you call an edge insertion. 1150 01:03:52,340 --> 01:03:55,390 So this is an Euler-tour tree insertion, 1151 01:03:55,390 --> 01:03:57,045 that we know how to do in log n. 1152 01:03:57,045 --> 01:04:00,508 We reroot and we do a join. 1153 01:04:00,508 --> 01:04:02,640 So that's cheap. 1154 01:04:02,640 --> 01:04:04,640 And that's it. 1155 01:04:04,640 --> 01:04:07,930 So insertion is easy to do in log n time, actually. 1156 01:04:07,930 --> 01:04:10,524 It's deletion that's going to be painful. 1157 01:04:10,524 --> 01:04:12,690 Actually, we're going to charge to the inserts. 1158 01:04:12,690 --> 01:04:14,648 So it'll end up being log squared amortized. 1159 01:04:14,648 --> 01:04:17,770 But worst case, it's log n.
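The three steps of insert can be sketched as follows. This is a toy stand-in and not the real structure: the top forest F log n is kept as plain adjacency sets and connectivity is answered by a graph search, where the lecture uses Euler-tour trees and find root in log n time. The incidence lists are Python sets rather than linked lists, and all names are invented for the example.

```python
import math
from collections import defaultdict


class LeveledGraph:
    """Toy bookkeeping for insert: incidence lists, edge levels, top forest."""

    def __init__(self, n):
        self.top = max(1, math.ceil(math.log2(n)))  # the top level, log n
        self.incident = defaultdict(set)  # vertex -> incident edges
        self.level = {}                   # edge -> current level
        self.forest = defaultdict(set)    # adjacency of F_{log n} (stand-in)

    def _component(self, v):
        # Stand-in for find-root: collect v's tree in the top forest.
        seen, stack = {v}, [v]
        while stack:
            x = stack.pop()
            for y in self.forest[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    def connected(self, v, w):
        return w in self._component(v)

    def insert(self, v, w):
        e = frozenset((v, w))
        # 1. Add e to the incidence lists of both endpoints.
        self.incident[v].add(e)
        self.incident[w].add(e)
        # 2. Every edge starts at the top level.
        self.level[e] = self.top
        # 3. If v and w were in different trees, e becomes a forest edge.
        if not self.connected(v, w):
            self.forest[v].add(w)
            self.forest[w].add(v)
            return True
        return False
```

The return value only reports whether the new edge joined the forest, which is convenient when experimenting with the sketch.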
1160 01:04:17,770 --> 01:04:19,244 Great. 1161 01:04:19,244 --> 01:04:21,160 I want to say a little bit about connectivity. 1162 01:04:21,160 --> 01:04:24,100 Now we know how to solve connectivity already. 1163 01:04:24,100 --> 01:04:28,870 We do find root on v and w and in log n time, 1164 01:04:28,870 --> 01:04:31,450 we determine whether they're in the same component. 1165 01:04:31,450 --> 01:04:36,560 But I actually claimed-- it's maybe erased by now. 1166 01:04:36,560 --> 01:04:37,170 Yeah. 1167 01:04:37,170 --> 01:04:38,560 But I claimed not a log n query. 1168 01:04:38,560 --> 01:04:40,660 I claimed a log over log log query. 1169 01:04:40,660 --> 01:04:43,400 Turns out, that's really easy to do. 1170 01:04:43,400 --> 01:04:48,100 You just change the top level minimum spanning forest 1171 01:04:48,100 --> 01:04:49,720 slightly. 1172 01:04:49,720 --> 01:04:57,940 So I want to make F log n equal to a B-tree. 1173 01:04:57,940 --> 01:04:59,880 Or actually, it's a bunch of B-trees, 1174 01:04:59,880 --> 01:05:05,290 one per connected component, of branching factor log n. 1175 01:05:08,475 --> 01:05:10,600 I guess it's a B-tree, so I have to have some slop. 1176 01:05:10,600 --> 01:05:13,600 Theta log n, let's say. 1177 01:05:13,600 --> 01:05:16,720 Usually we said with Euler-tour trees, 1178 01:05:16,720 --> 01:05:19,090 it was a balanced binary search tree. 1179 01:05:19,090 --> 01:05:22,780 I'm going to make this particular forest use a log n 1180 01:05:22,780 --> 01:05:24,700 way B-tree. 1181 01:05:24,700 --> 01:05:26,560 This is going to slow down updates. 1182 01:05:26,560 --> 01:05:29,710 But it actually speeds up queries. 1183 01:05:29,710 --> 01:05:36,280 So to do a find root is now a little bit easier. 1184 01:05:36,280 --> 01:05:42,790 Find root should just be log base log n of n, 1185 01:05:42,790 --> 01:05:45,850 which is log n over log log n.
1186 01:05:45,850 --> 01:05:48,250 Because find root, you just need to go to your parent, 1187 01:05:48,250 --> 01:05:49,660 to your parent, to your parent. 1188 01:05:49,660 --> 01:05:50,390 Go to the top. 1189 01:05:50,390 --> 01:05:52,840 And then you have to go to the leftmost place. 1190 01:05:52,840 --> 01:05:54,924 But it's easy, in a B-tree, to always go 1191 01:05:54,924 --> 01:05:55,840 to the leftmost place. 1192 01:05:55,840 --> 01:05:57,130 If I had to do a search within the node, 1193 01:05:57,130 --> 01:05:58,360 that would be annoying. 1194 01:05:58,360 --> 01:06:00,160 But going to the leftmost, that's easy. 1195 01:06:00,160 --> 01:06:03,190 So I only pay log over log log for query, 1196 01:06:03,190 --> 01:06:06,410 so I got my desired query time. 1197 01:06:06,410 --> 01:06:08,620 And I claim the update is kind of OK. 1198 01:06:08,620 --> 01:06:11,530 Because it's slowed down, but we're already going to pay 1199 01:06:11,530 --> 01:06:13,090 log square eventually. 1200 01:06:13,090 --> 01:06:14,470 And here we were doing log. 1201 01:06:14,470 --> 01:06:20,080 So we don't need to be that fast. 1202 01:06:20,080 --> 01:06:23,160 Update is now going to be-- 1203 01:06:23,160 --> 01:06:29,290 let's see-- the height of the tree 1204 01:06:29,290 --> 01:06:31,120 times the branching factor. 1205 01:06:34,230 --> 01:06:37,870 If we touch nodes in a root-to-leaf path, 1206 01:06:37,870 --> 01:06:39,310 there's log over log log of them. 1207 01:06:39,310 --> 01:06:41,309 For each one, we have to rewrite the whole node. 1208 01:06:41,309 --> 01:06:42,510 So we pay log. 1209 01:06:42,510 --> 01:06:44,660 But this is less than log squared. 1210 01:06:49,970 --> 01:06:51,620 So this won't end up hurting us. 1211 01:06:51,620 --> 01:06:54,205 Because we only do this at the top forest level. 1212 01:06:54,205 --> 01:06:55,580 We don't do it at all the levels. 
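The arithmetic behind the B-tree trade-off, written out for branching factor Theta(log n):

```latex
\begin{align*}
\text{height} &= \log_{\log n} n = \frac{\log n}{\log\log n},\\
\text{find root} &= O\!\left(\frac{\log n}{\log\log n}\right),\\
\text{update} &= O\!\left(\frac{\log n}{\log\log n}\cdot \log n\right)
              = O\!\left(\frac{\log^2 n}{\log\log n}\right) \subseteq O(\log^2 n).
\end{align*}
```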
1213 01:06:55,580 --> 01:06:56,996 If we did it at all the levels, we 1214 01:06:56,996 --> 01:07:00,290 would lose another log factor, log over log log factor. 1215 01:07:00,290 --> 01:07:02,960 But here, we do this once at the top, 1216 01:07:02,960 --> 01:07:04,520 pay log squared over log log. 1217 01:07:04,520 --> 01:07:07,117 Then we have to update log n other levels, each paying log. 1218 01:07:07,117 --> 01:07:09,450 So we're already paying log squared for the lower level. 1219 01:07:09,450 --> 01:07:12,500 So if we increase the top level a bit, not a big deal. 1220 01:07:12,500 --> 01:07:18,230 So that's how you improve connectivity queries slightly. 1221 01:07:18,230 --> 01:07:19,760 And now, finally, we get to deletes. 1222 01:07:19,760 --> 01:07:24,188 This is the moment you've been waiting for. 1223 01:07:24,188 --> 01:07:27,812 How do we do a delete in this structure? 1224 01:07:27,812 --> 01:07:29,270 What are all these levels good for? 1225 01:07:42,860 --> 01:07:44,260 So we're deleting an edge e. 1226 01:07:44,260 --> 01:07:48,260 Its endpoints are v and w. 1227 01:07:48,260 --> 01:07:51,355 First thing we do is remove e from the incidence list. 1228 01:07:59,676 --> 01:08:01,300 If every edge stores a pointer to where 1229 01:08:01,300 --> 01:08:02,758 it lives in the incidence list, you 1230 01:08:02,758 --> 01:08:05,578 can do that deletion in constant time. 1231 01:08:05,578 --> 01:08:06,550 Great. 1232 01:08:06,550 --> 01:08:14,530 So as I said, if e is not in this forest-- 1233 01:08:21,670 --> 01:08:23,930 what if it's in a lower level of forest? 1234 01:08:27,220 --> 01:08:28,504 Oh, I see, right. 1235 01:08:28,504 --> 01:08:29,810 I forgot. 1236 01:08:29,810 --> 01:08:32,930 Invariant two. Invariant two says the forests are nested. 1237 01:08:32,930 --> 01:08:36,040 So if it's in any forest, it's going to be in the last one.
1238 01:08:36,040 --> 01:08:37,540 So if it's not in the last one, that 1239 01:08:37,540 --> 01:08:39,160 means it's not in any of the forests. 1240 01:08:39,160 --> 01:08:40,076 That means we're done. 1241 01:08:40,076 --> 01:08:41,240 We do nothing. 1242 01:08:41,240 --> 01:08:43,492 That's the easy case. 1243 01:08:43,492 --> 01:08:44,950 We didn't destroy any connectivity, 1244 01:08:44,950 --> 01:08:47,300 because the forests represent maximal connectivity. 1245 01:08:47,300 --> 01:08:48,939 They're spanning. 1246 01:08:48,939 --> 01:08:52,120 But if it's in the forest, then something changed. 1247 01:08:52,120 --> 01:08:54,580 Then we need to determine, like in this picture, 1248 01:08:54,580 --> 01:08:56,229 is there a replacement edge? 1249 01:08:56,229 --> 01:08:58,880 Or is there no replacement edge? 1250 01:08:58,880 --> 01:09:01,009 In which case, when there's no replacement edge, 1251 01:09:01,009 --> 01:09:02,300 we basically don't do anything. 1252 01:09:02,300 --> 01:09:06,859 We have to do a bunch of deletes in the forests. 1253 01:09:06,859 --> 01:09:11,500 But yeah, is that what I want to say next? 1254 01:09:11,500 --> 01:09:13,430 Yes, we always do that. 1255 01:09:13,430 --> 01:09:14,830 We're going to delete e. 1256 01:09:14,830 --> 01:09:15,760 We have to delete e. 1257 01:09:15,760 --> 01:09:24,490 So we're going to recursively delete it from f sub e dot 1258 01:09:24,490 --> 01:09:31,814 level up to F log n. e dot level is the earliest 1259 01:09:31,814 --> 01:09:32,689 forest it appears in. 1260 01:09:32,689 --> 01:09:34,355 And we have to delete it from all those. 1261 01:09:34,355 --> 01:09:37,770 Each of those is an Euler-tour tree deletion, or cut. 1262 01:09:37,770 --> 01:09:40,240 And so each of them pays log n total cost log squared n. 1263 01:09:40,240 --> 01:09:42,790 So this is where the log squared's coming from. 1264 01:09:42,790 --> 01:09:43,540 That's great. 
1265 01:09:43,540 --> 01:09:45,700 Now we've successfully deleted the edge. 1266 01:09:45,700 --> 01:09:48,609 But now we need to know, is there a replacement? 1267 01:09:48,609 --> 01:09:50,520 And at what level is there a replacement? 1268 01:09:50,520 --> 01:09:56,380 Now, we know by invariant 2, that there could be 1269 01:09:56,380 --> 01:09:59,380 no replacement of lower level. 1270 01:09:59,380 --> 01:10:03,300 The point is, this is a minimum spanning tree. 1271 01:10:03,300 --> 01:10:05,230 So e was the lowest level edge that 1272 01:10:05,230 --> 01:10:08,710 could connect the two sides, these two connected components 1273 01:10:08,710 --> 01:10:10,720 of the forest. 1274 01:10:10,720 --> 01:10:15,160 So if there's any replacements at e's level or higher, 1275 01:10:15,160 --> 01:10:17,540 there might not be a replacement at e dot level. 1276 01:10:17,540 --> 01:10:18,940 But there might be a replacement at a higher level. 1277 01:10:18,940 --> 01:10:21,023 We want to find the smallest level for which there 1278 01:10:21,023 --> 01:10:21,970 is a replacement. 1279 01:10:21,970 --> 01:10:24,130 That will preserve invariant 2, that we 1280 01:10:24,130 --> 01:10:26,890 have a minimum spanning forest. 1281 01:10:26,890 --> 01:10:29,350 So that's what we're going to do, loop over the levels 1282 01:10:29,350 --> 01:10:32,500 to try to find a replacement edge. 1283 01:10:32,500 --> 01:10:37,090 So we're going to start at e dot level, loop 1284 01:10:37,090 --> 01:10:41,740 potentially up to log n, call it a level i. 1285 01:10:41,740 --> 01:10:46,270 Then I want to identify, at level i, the two 1286 01:10:46,270 --> 01:10:47,320 sides of the edge. 1287 01:10:47,320 --> 01:10:58,810 Let Tv and Tw be the trees of Fi, containing v and w, 1288 01:10:58,810 --> 01:11:00,660 respectively. 1289 01:11:00,660 --> 01:11:03,100 We just deleted the edge connecting those two sides. 1290 01:11:03,100 --> 01:11:07,310 So we know that they're in two different trees, Tv and Tw. 
1291 01:11:07,310 --> 01:11:09,650 And one of them is smaller than the other. 1292 01:11:09,650 --> 01:11:15,100 I'd like to relabel v and w so that the size of Tv 1293 01:11:15,100 --> 01:11:17,410 is less than or equal to the size of Tw, size, 1294 01:11:17,410 --> 01:11:24,240 in terms of number of vertices in the tree. 1295 01:11:24,240 --> 01:11:27,780 So swap v and w if necessary. 1296 01:11:27,780 --> 01:11:28,890 Here's the fun thing. 1297 01:11:28,890 --> 01:11:38,520 If we apply invariant 1, then we learn 1298 01:11:38,520 --> 01:11:41,210 these sizes are not so big. 1299 01:11:41,210 --> 01:11:43,320 Claim, at most, 2 to the i. 1300 01:11:43,320 --> 01:11:45,130 To realize it's at most 2 to the i, 1301 01:11:45,130 --> 01:11:48,050 you have to imagine the moment before this deletion happened, 1302 01:11:48,050 --> 01:11:50,120 before we deleted the edge. 1303 01:11:50,120 --> 01:11:53,240 Because at that point, Tv and Tw were actually one tree. 1304 01:11:53,240 --> 01:11:55,120 They were connected by edge e. 1305 01:11:55,120 --> 01:11:56,120 We've just deleted it. 1306 01:11:56,120 --> 01:11:59,090 But the moment before we deleted it, invariant 1 held. 1307 01:11:59,090 --> 01:12:00,772 And so that was a connected component. 1308 01:12:00,772 --> 01:12:02,480 It should have size, at most, 2 to the i. 1309 01:12:02,480 --> 01:12:03,560 Now, we split it. 1310 01:12:03,560 --> 01:12:04,820 But that's all we've done. 1311 01:12:04,820 --> 01:12:07,560 So the sum of those sizes has to be at most 2 to the i. 1312 01:12:07,560 --> 01:12:10,760 Because that used to be a connected component. 1313 01:12:10,760 --> 01:12:12,050 Cool. 1314 01:12:12,050 --> 01:12:19,190 So that means that size of Tv is 1/2 that, at most 2 to the i 1315 01:12:19,190 --> 01:12:19,990 minus 1. 1316 01:12:19,990 --> 01:12:23,130 Because Tv is less than or equal to Tw. 1317 01:12:23,130 --> 01:12:26,980 So it's, at most, 1/2 of the total.
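The size argument in one line, with sizes counted in vertices and Tv chosen as the smaller side:

```latex
\[
|T_v| \le |T_w| \ \text{ and } \ |T_v| + |T_w| \le 2^{i}
\quad\Longrightarrow\quad
2\,|T_v| \le |T_v| + |T_w| \le 2^{i}
\quad\Longrightarrow\quad
|T_v| \le 2^{\,i-1}.
\]
```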
1318 01:12:26,980 --> 01:12:29,682 So Tv is even smaller. 1319 01:12:29,682 --> 01:12:31,140 In particular, what that tells us-- 1320 01:12:31,140 --> 01:12:33,740 now we have Tv and Tw as kind of separate components 1321 01:12:33,740 --> 01:12:34,920 temporarily-- 1322 01:12:34,920 --> 01:12:38,640 we don't really know that they're separate at level i. 1323 01:12:38,640 --> 01:12:41,159 But we know at level i minus 1 they are separate. 1324 01:12:41,159 --> 01:12:43,200 At level i minus 1, there is no replacement edge, 1325 01:12:43,200 --> 01:12:46,050 by the minimum spanning forest property. 1326 01:12:46,050 --> 01:12:49,470 So what we could do, at this point, 1327 01:12:49,470 --> 01:12:52,740 is take Tv, take all of the edges internal to Tv, 1328 01:12:52,740 --> 01:12:55,860 and push them down to level i minus 1. 1329 01:12:55,860 --> 01:12:57,635 We could afford that. 1330 01:12:57,635 --> 01:12:59,260 What do I mean by we could afford that? 1331 01:12:59,260 --> 01:13:01,166 We wouldn't destroy invariant 1. 1332 01:13:01,166 --> 01:13:02,790 Because we're taking 2 to the i minus 1 1333 01:13:02,790 --> 01:13:05,754 vertices, pushing all those edges down. 1334 01:13:05,754 --> 01:13:07,170 We don't have to push all of them. 1335 01:13:07,170 --> 01:13:09,110 We could push some subset of the edges down. 1336 01:13:09,110 --> 01:13:11,700 Whatever connected component we make at that level 1337 01:13:11,700 --> 01:13:14,325 will be of size at most 2 to the i minus 1, 1338 01:13:14,325 --> 01:13:16,110 and so invariant 1 will be preserved. 1339 01:13:18,730 --> 01:13:19,950 So great. 1340 01:13:19,950 --> 01:13:25,220 We can afford in this certain sense of 1341 01:13:25,220 --> 01:13:40,702 "afford," to push all of Tv's edges to level i minus 1. 1342 01:13:44,480 --> 01:13:48,080 Great. 1343 01:13:48,080 --> 01:13:51,170 We don't actually need to do this. 1344 01:13:51,170 --> 01:13:54,470 But what we're going to do is use it to pay for stuff.
1345 01:14:02,150 --> 01:14:03,740 We have this scary goal, which is 1346 01:14:03,740 --> 01:14:06,686 we want to find, is there a replacement edge? 1347 01:14:06,686 --> 01:14:08,810 I don't know a good way to find a replacement edge, 1348 01:14:08,810 --> 01:14:10,685 except to do, basically, a depth first search 1349 01:14:10,685 --> 01:14:12,680 and look at all the edges. 1350 01:14:12,680 --> 01:14:15,090 That's going to take a lot of time. 1351 01:14:15,090 --> 01:14:17,030 But the good news is, if we search 1352 01:14:17,030 --> 01:14:21,020 from Tv, every edge that's useless, 1353 01:14:21,020 --> 01:14:23,244 we can just decrease its level. 1354 01:14:23,244 --> 01:14:25,160 And whenever we decrease the level of an edge, 1355 01:14:25,160 --> 01:14:28,260 we basically get a free coin to continue. 1356 01:14:28,260 --> 01:14:30,290 So you get a free life every time you 1357 01:14:30,290 --> 01:14:32,750 push an edge down by one level. 1358 01:14:32,750 --> 01:14:34,280 Because overall, the number of pushes 1359 01:14:34,280 --> 01:14:40,550 can be the number of insertions times log n. 1360 01:14:40,550 --> 01:14:43,830 Because every edge can only be pushed log n times. 1361 01:14:43,830 --> 01:14:48,080 So whenever we push down an edge, we get a free bonus life, 1362 01:14:48,080 --> 01:14:50,330 so we can keep doing our search. 1363 01:14:50,330 --> 01:14:52,325 So here's how the search works. 1364 01:14:52,325 --> 01:14:57,335 I'm going to say, for each-- 1365 01:14:57,335 --> 01:15:00,620 this is a little bit tricky to implement. 1366 01:15:00,620 --> 01:15:02,900 I'm going to first tell you what we want to do 1367 01:15:02,900 --> 01:15:04,162 and why that's OK. 1368 01:15:04,162 --> 01:15:05,870 And then we'll see how we actually do it. 1369 01:15:10,730 --> 01:15:14,280 So I want to search from every vertex in Fv. 1370 01:15:14,280 --> 01:15:17,470 I want to look at all the outgoing edges from there 1371 01:15:17,470 --> 01:15:20,612 of level i.
1372 01:15:20,612 --> 01:15:22,070 Never mind how to find those edges. 1373 01:15:22,070 --> 01:15:25,610 Just pretend you could find them in constant time per edge. 1374 01:15:25,610 --> 01:15:27,920 It's going to be log n time per edge. 1375 01:15:27,920 --> 01:15:29,951 But that's OK. 1376 01:15:29,951 --> 01:15:32,060 There are two cases. 1377 01:15:32,060 --> 01:15:36,260 Either y is in Tw. 1378 01:15:36,260 --> 01:15:39,320 Otherwise, y is going to be in Tv. 1379 01:15:42,440 --> 01:15:44,850 Because we have Tv here. 1380 01:15:44,850 --> 01:15:46,700 We have Tw. 1381 01:15:46,700 --> 01:15:50,240 We just deleted this edge, e. 1382 01:15:50,240 --> 01:15:52,640 Sadly, we want to find a replacement edge. 1383 01:15:52,640 --> 01:15:56,210 So if we look at all the edges coming out of a vertex in Tv, 1384 01:15:56,210 --> 01:15:58,310 this is, I guess, x. 1385 01:15:58,310 --> 01:16:00,710 It could be it's an edge that stays within Tv. 1386 01:16:00,710 --> 01:16:02,750 Or it could be it's an edge that goes to Tw. 1387 01:16:02,750 --> 01:16:05,780 There are no other options, because of our invariants. 1388 01:16:05,780 --> 01:16:09,470 If it's in Tv, oh, that's kind of annoying. 1389 01:16:09,470 --> 01:16:11,430 Doesn't help us. 1390 01:16:11,430 --> 01:16:14,960 But then we can just set the level of that edge, 1391 01:16:14,960 --> 01:16:20,072 e-prime, to be i minus 1. 1392 01:16:20,072 --> 01:16:22,880 Because we can afford to push all of these 1393 01:16:22,880 --> 01:16:24,110 edges down by one level. 1394 01:16:24,110 --> 01:16:26,480 So that pays for this round of the loop. 1395 01:16:30,587 --> 01:16:31,670 What about the other case? 1396 01:16:31,670 --> 01:16:33,044 The other case is the good case. 1397 01:16:33,044 --> 01:16:36,020 That means we've found a replacement edge for e. 1398 01:16:36,020 --> 01:16:38,260 If we have an edge from Tv to Tw. 1399 01:16:38,260 --> 01:16:41,880 And at level i, we did these levels in order.
1400 01:16:41,880 --> 01:16:42,690 Did I say that? 1401 01:16:42,690 --> 01:16:44,930 Yeah, we're doing levels in order. 1402 01:16:44,930 --> 01:16:48,670 So if we find one, this is the lowest-level replacement edge, 1403 01:16:48,670 --> 01:16:49,960 so we use it. 1404 01:16:49,960 --> 01:16:58,370 We insert that edge into Fi using an Euler-tour tree 1405 01:16:58,370 --> 01:17:00,100 insertion. 1406 01:17:00,100 --> 01:17:02,500 And then we're done. 1407 01:17:02,500 --> 01:17:05,260 So we stop the algorithm. 1408 01:17:05,260 --> 01:17:07,180 The moment we find a desired edge we're happy. 1409 01:17:07,180 --> 01:17:08,638 Now, maybe there are lots of edges. 1410 01:17:08,638 --> 01:17:10,910 So we've got to stop when we find the first one. 1411 01:17:10,910 --> 01:17:13,810 Then we've restored our minimum spanning forest property. 1412 01:17:13,810 --> 01:17:17,650 We have maximal connectivity, all these great things. 1413 01:17:17,650 --> 01:17:19,300 As long as we don't find an edge, 1414 01:17:19,300 --> 01:17:22,400 we can push things down a level and pay for this round. 1415 01:17:22,400 --> 01:17:25,237 So it's constant time or log n time to do the last step. 1416 01:17:25,237 --> 01:17:27,820 But all the other steps are paid for by the decrease in level. 1417 01:17:31,870 --> 01:17:35,890 So the claim is overall, we pay log squared n. 1418 01:17:35,890 --> 01:17:38,860 Because we had to do log squared n for these deletions. 1419 01:17:38,860 --> 01:17:48,740 Plus log n times the number of level decreases. 1420 01:17:51,820 --> 01:17:53,030 That's for this step. 1421 01:17:53,030 --> 01:17:56,972 I claim that each round of this loop costs log n. 1422 01:17:56,972 --> 01:17:58,430 So that's why we get log n times 1423 01:17:58,430 --> 01:17:59,513 number of level decreases. 1424 01:17:59,513 --> 01:18:02,150 But number of level decreases is, at most, 1425 01:18:02,150 --> 01:18:05,417 number of edge insertions times log n.
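The search loop just described can be sketched in Python for a single level i. This is a simplified illustration, not the lecture's data structure: plain sets and dictionaries stand in for the Euler-tour trees, and the names `add_nontree_edge`, `find_replacement`, `nontree`, and `level` are hypothetical. It assumes Tv has already been chosen as the smaller side.

```python
from collections import defaultdict


def add_nontree_edge(nontree, level, x, y, i):
    """Record non-tree edge {x, y} at level i (stored in both directions)."""
    nontree[i][x].add(y)
    nontree[i][y].add(x)
    level[frozenset((x, y))] = i


def find_replacement(tv, tw, nontree, level, i):
    """Scan level-i non-tree edges leaving the vertices of Tv.

    tv: sequence of the vertices of the smaller side, Tv.
    tw: set of the vertices of the larger side, Tw.
    nontree: nontree[level][vertex] -> set of neighbors via non-tree edges.
    level: maps frozenset({x, y}) -> current level of that edge.
    Returns the first edge (x, y) with y in Tw, or None if there is none.
    Every edge found to stay inside Tv is pushed down to level i - 1,
    which is the "free coin" that pays for that round of the loop.
    """
    for x in tv:
        # Copy the neighbor set, since demotions modify it during the scan.
        for y in list(nontree[i][x]):
            if y in tw:
                return (x, y)  # good case: replacement edge, stop here
            # Useless edge inside Tv: decrease its level by one.
            nontree[i][x].discard(y)
            nontree[i][y].discard(x)
            nontree[i - 1][x].add(y)
            nontree[i - 1][y].add(x)
            level[frozenset((x, y))] = i - 1
    return None
```

In the full algorithm the caller would then link the returned edge into the forest at level i, and would enumerate the vertices of Tv through the augmented Euler-tour tree rather than a plain list.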
1426 01:18:05,417 --> 01:18:07,250 Because each edge can only decrease in level 1427 01:18:07,250 --> 01:18:10,290 log n times, by definition of levels. 1428 01:18:10,290 --> 01:18:12,320 And so we can charge to the insert operations 1429 01:18:12,320 --> 01:18:15,650 at a factor of log n. 1430 01:18:15,650 --> 01:18:17,090 So there's two log n's here. 1431 01:18:17,090 --> 01:18:20,500 So inserts now cost log squared amortized as well. 1432 01:18:20,500 --> 01:18:22,330 Deletes cost log squared amortized. 1433 01:18:22,330 --> 01:18:25,460 And we get log squared overall. 1434 01:18:25,460 --> 01:18:28,760 One last issue though, which is how do we do this step? 1435 01:18:28,760 --> 01:18:31,370 Actually, there's another step that's not so trivial, 1436 01:18:31,370 --> 01:18:32,450 this one. 1437 01:18:32,450 --> 01:18:35,540 We need to know the size of Tv and the size of Tw. 1438 01:18:35,540 --> 01:18:36,290 Well, that's easy. 1439 01:18:36,290 --> 01:18:38,390 I just store subtree sizes at every node. 1440 01:18:38,390 --> 01:18:41,720 So then I look at the root of the BST, 1441 01:18:41,720 --> 01:18:44,270 and it tells me how big that tree is. 1442 01:18:44,270 --> 01:18:47,420 And so I can see which is bigger and switch accordingly. 1443 01:18:47,420 --> 01:18:48,980 But that's trivial. 1444 01:18:48,980 --> 01:18:50,900 This one requires a different augmentation 1445 01:18:50,900 --> 01:18:54,840 for Euler-tour trees. 1446 01:18:54,840 --> 01:18:56,480 And again, in both of these situations, 1447 01:18:56,480 --> 01:18:59,120 we're using the fact that we can augment on subtrees, not 1448 01:18:59,120 --> 01:18:59,700 on paths. 1449 01:18:59,700 --> 01:19:02,910 That's why we need Euler-tour trees, not link cut trees. 1450 01:19:02,910 --> 01:19:06,405 So what we store in F-- 1451 01:19:06,405 --> 01:19:08,690 why did I write Fv? 1452 01:19:08,690 --> 01:19:11,060 I mean Tv here.
1453 01:19:15,000 --> 01:19:19,580 So in Fi-- this is for each i, but for the forest at level i-- 1454 01:19:19,580 --> 01:19:22,970 what I'm going to store is at every node in the Euler-tour 1455 01:19:22,970 --> 01:19:28,830 tree, I want to know in the subtree rooted in that node-- 1456 01:19:28,830 --> 01:19:31,430 so here's an Euler-tour tree. 1457 01:19:31,430 --> 01:19:35,405 Here's a node, v. This is a tree in Fi. 1458 01:19:35,405 --> 01:19:38,990 I want to know, in this subtree, in here, 1459 01:19:38,990 --> 01:19:46,180 is there any node that has a level i edge incident to it? 1460 01:19:46,180 --> 01:19:49,262 So I want to know, are there any level i edges? 1461 01:19:53,366 --> 01:19:54,990 If there aren't any level i edges, then 1462 01:19:54,990 --> 01:19:57,110 for the purposes of this search, I 1463 01:19:57,110 --> 01:19:58,930 should just skip that subtree. 1464 01:19:58,930 --> 01:20:02,310 It's as if the subtree was erased. 1465 01:20:02,310 --> 01:20:03,340 So I have some tree. 1466 01:20:03,340 --> 01:20:04,620 It has height log n overall. 1467 01:20:04,620 --> 01:20:06,270 I erase some of the subtrees according 1468 01:20:06,270 --> 01:20:09,610 to this bit, which is an easy thing to keep track of. 1469 01:20:09,610 --> 01:20:13,670 It's an augmentation, constant factor overhead. 1470 01:20:13,670 --> 01:20:17,310 Then I basically want to do an in-order traversal 1471 01:20:17,310 --> 01:20:20,460 of this tree, but skipping the stuff below. 1472 01:20:20,460 --> 01:20:23,160 So I mean, one way to think of it is you just repeatedly call 1473 01:20:23,160 --> 01:20:26,047 successor, like successor in a binary search tree. 1474 01:20:26,047 --> 01:20:28,380 But whenever I see that there's no level i edges below me, 1475 01:20:28,380 --> 01:20:30,900 I just skip. 1476 01:20:30,900 --> 01:20:33,720 So in log n time you can basically find the next place 1477 01:20:33,720 --> 01:20:35,250 where there's an outgoing edge.
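The skipping traversal can be sketched in Python on a static binary tree. This is only an illustration of the augmentation idea: the `Node` class and `flagged_in_order` are hypothetical names, `flag` stands for "this node has a level-i edge incident to it," and a real balanced BST would have to recompute the aggregated bit on every rotation, split, and concatenate.

```python
class Node:
    """BST node for one Euler-tour position (static tree, no rebalancing)."""

    def __init__(self, key, flag=False, left=None, right=None):
        self.key = key
        self.flag = flag  # does this node have a level-i edge incident to it?
        self.left = left
        self.right = right
        # Aggregated bit: is flag set anywhere in the subtree rooted here?
        self.any_flag = flag or any(
            child is not None and child.any_flag for child in (left, right)
        )


def flagged_in_order(node):
    """Yield keys of flagged nodes in order, skipping unflagged subtrees."""
    if node is None or not node.any_flag:
        return  # nothing useful below: treat the subtree as erased
    yield from flagged_in_order(node.left)
    if node.flag:
        yield node.key
    yield from flagged_in_order(node.right)
```

Because the traversal only descends into subtrees whose bit is set, every node it visits lies on a root-to-flagged-node path, which is what gives roughly log n work per vertex found when the tree is balanced.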
1478 01:20:35,250 --> 01:20:37,080 I guess also, the incidence lists 1479 01:20:37,080 --> 01:20:40,090 have to be stored separately for each level. 1480 01:20:40,090 --> 01:20:42,270 So you can say, what are your outgoing level i 1481 01:20:42,270 --> 01:20:43,387 edges from a vertex? 1482 01:20:43,387 --> 01:20:44,970 This lets you find the vertex quickly, 1483 01:20:44,970 --> 01:20:47,040 but then you have to find the edges quickly. 1484 01:20:47,040 --> 01:20:50,100 So the incidence lists are not just stored as one single linked 1485 01:20:50,100 --> 01:20:52,510 list, but one linked list per level. 1486 01:20:52,510 --> 01:20:53,850 But that's easy to do. 1487 01:20:53,850 --> 01:20:56,450 So then from this vertex, we can find all the outgoing level 1488 01:20:56,450 --> 01:20:57,870 i edges. 1489 01:20:57,870 --> 01:21:01,650 And we can do this for loop efficiently. 1490 01:21:01,650 --> 01:21:05,705 And whenever it's low, we get to charge. 1491 01:21:05,705 --> 01:21:08,220 In the remaining negative two minutes, 1492 01:21:08,220 --> 01:21:11,142 let me tell you briefly about other problems that
1504 01:21:36,600 --> 01:21:40,860 In that case, poly log solutions are 1505 01:21:40,860 --> 01:21:43,752 known for 2 connectivity edge or vertex. 1506 01:21:43,752 --> 01:21:46,210 This is the result, actually, I used in my very first graph 1507 01:21:46,210 --> 01:21:49,380 algorithms papers, where I first learned about dynamic graphs. 1508 01:21:49,380 --> 01:21:52,200 In log to the fourth n time, you can maintain 2 edge 1509 01:21:52,200 --> 01:21:54,480 connectivity. 1510 01:21:54,480 --> 01:22:00,570 It's open for k equals 3, even, whether you 1511 01:22:00,570 --> 01:22:02,130 can achieve that bound. 1512 01:22:02,130 --> 01:22:04,560 So kind of lame. 1513 01:22:04,560 --> 01:22:06,847 Another problem is minimum spanning forest. 1514 01:22:06,847 --> 01:22:09,180 So you don't just want to maintain some spanning forest. 1515 01:22:09,180 --> 01:22:10,680 Maybe you actually have weights on the edges 1516 01:22:10,680 --> 01:22:12,517 and you want the minimum spanning for us. 1517 01:22:12,517 --> 01:22:14,850 That it turns out, you can solve with a couple extra log 1518 01:22:14,850 --> 01:22:15,370 factors. 1519 01:22:15,370 --> 01:22:19,996 So in log to the fourth n time, with an extra log 1520 01:22:19,996 --> 01:22:21,870 squared factor, you can reduce to the problem 1521 01:22:21,870 --> 01:22:22,860 of dynamic connectivity. 1522 01:22:22,860 --> 01:22:24,359 And so log to the forth overall, you 1523 01:22:24,359 --> 01:22:27,409 can maintain minimum spanning forest. 1524 01:22:27,409 --> 01:22:28,950 Another problem is planarity testing. 1525 01:22:28,950 --> 01:22:32,150 You want to insert or delete edges and, at all points, 1526 01:22:32,150 --> 01:22:34,980 know whether your graph is planar. 1527 01:22:34,980 --> 01:22:38,850 And that, there's no great results known for that. 1528 01:22:38,850 --> 01:22:40,810 It's like n to the 2/3. 1529 01:22:40,810 --> 01:22:44,430 Directed graphs, things get really hard. 
1530 01:22:44,430 --> 01:22:46,844 There are two problems I've surveyed here. 1531 01:22:46,844 --> 01:22:47,760 There's probably more. 1532 01:22:47,760 --> 01:22:49,410 But the most fundamental questions 1533 01:22:49,410 --> 01:22:52,260 are the analog of connectivity, which is 1534 01:22:52,260 --> 01:22:55,080 what I call dynamic transitive closure, which 1535 01:22:55,080 --> 01:22:58,680 means given two vertices, v and w, I want to know, 1536 01:22:58,680 --> 01:23:03,180 is there a directed path from v to w instead of undirected? 1537 01:23:03,180 --> 01:23:05,520 At the moment, a bunch of results 1538 01:23:05,520 --> 01:23:10,890 match a trade-off curve which should be update times 1539 01:23:10,890 --> 01:23:15,860 query is roughly m times n, the number of edges times the number 1540 01:23:15,860 --> 01:23:16,830 of vertices. 1541 01:23:16,830 --> 01:23:19,820 This is big. 1542 01:23:19,820 --> 01:23:23,550 But there are a bunch of results that achieve this trade-off. 1543 01:23:23,550 --> 01:23:25,290 I don't think there's a matching lower bound. 1544 01:23:25,290 --> 01:23:28,920 But it's also open, whether this entire trade-off 1545 01:23:28,920 --> 01:23:30,330 curve could be achieved. 1546 01:23:30,330 --> 01:23:32,700 There's various points on the curve where you pay 1547 01:23:32,700 --> 01:23:34,050 square root of n for an update. 1548 01:23:34,050 --> 01:23:36,240 Then you pay like, m times square root 1549 01:23:36,240 --> 01:23:40,330 of n for the query or vice versa, stuff like that. 1550 01:23:40,330 --> 01:23:41,760 And then a harder version of this 1551 01:23:41,760 --> 01:23:43,135 is you have weights on the edges. 1552 01:23:43,135 --> 01:23:45,300 And now the query is not just find a path, 1553 01:23:45,300 --> 01:23:47,460 find the shortest path. 1554 01:23:47,460 --> 01:23:50,470 This is what I call dynamic all pairs shortest paths. 1555 01:23:50,470 --> 01:23:51,960 This is even harder.
1556 01:23:51,960 --> 01:23:56,040 There are some results where you can do like, roughly, n squared 1557 01:23:56,040 --> 01:23:58,110 time updates, constant query. 1558 01:23:58,110 --> 01:24:00,840 So it's maybe similar type of trade-off curves, 1559 01:24:00,840 --> 01:24:03,630 not really known. 1560 01:24:03,630 --> 01:24:05,410 When all the weights of the edges are 1, 1561 01:24:05,410 --> 01:24:07,410 you just want to find the shortest path in terms 1562 01:24:07,410 --> 01:24:11,910 of the number of edges, you can do not really that much better. 1563 01:24:11,910 --> 01:24:13,740 Still, this is the product. 1564 01:24:13,740 --> 01:24:17,160 But at least more of the trade-off curve has been solved. 1565 01:24:17,160 --> 01:24:21,512 So that's a quick overview of dynamic graphs, more generally. 1566 01:24:21,512 --> 01:24:23,720 Dynamic graphs come up all over the place, obviously. 1567 01:24:23,720 --> 01:24:27,302 You have a network where the links are changing, or Facebook, 1568 01:24:27,302 --> 01:24:29,385 you want to maintain connections but people friend 1569 01:24:29,385 --> 01:24:30,900 and de-friend all the time. 1570 01:24:30,900 --> 01:24:35,000 So you want to maintain whatever structure you care about. 1571 01:24:35,000 --> 01:24:37,310 Anyway, that's dynamic graphs upper bounds. 1572 01:24:37,310 --> 01:24:41,500 Next class will be lower bounds, the log n.