1 00:00:00,090 --> 00:00:02,490 The following content is provided under a Creative 2 00:00:02,490 --> 00:00:04,030 Commons license. 3 00:00:04,030 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,720 continue to offer high quality educational resources for free. 5 00:00:10,720 --> 00:00:13,320 To make a donation or view additional materials 6 00:00:13,320 --> 00:00:17,280 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,280 --> 00:00:18,450 at ocw.mit.edu. 8 00:00:22,100 --> 00:00:24,680 ERIK DEMAINE: All right, welcome back to Dynamic Optimality. 9 00:00:24,680 --> 00:00:27,020 This is the second of two lectures. 10 00:00:27,020 --> 00:00:29,770 And today we're going to focus mainly on lower bounds. 11 00:00:29,770 --> 00:00:32,479 So last time we saw this geometric connection 12 00:00:32,479 --> 00:00:34,020 to binary search trees. 13 00:00:34,020 --> 00:00:37,760 So again, this is about is there one best binary search tree. 14 00:00:37,760 --> 00:00:40,070 And we represented binary search trees, or at least 15 00:00:40,070 --> 00:00:42,080 the execution of those algorithms, 16 00:00:42,080 --> 00:00:45,920 as point sets in time space. 17 00:00:45,920 --> 00:00:49,340 And of course a point set corresponded 18 00:00:49,340 --> 00:00:52,410 to a valid execution of a BST tree where each of these points 19 00:00:52,410 --> 00:00:55,970 represented which nodes got touched during an access. 
20 00:00:55,970 --> 00:00:59,300 If and only if the point set was arborally satisfied, 21 00:00:59,300 --> 00:01:01,910 meaning you take any two points in the point set, 22 00:01:01,910 --> 00:01:03,530 if they span a rectangle that is not 23 00:01:03,530 --> 00:01:05,720 just a horizontal or vertical line segment, 24 00:01:05,720 --> 00:01:07,640 there must be a third point somewhere 25 00:01:07,640 --> 00:01:09,410 inside that rectangle, which in the end 26 00:01:09,410 --> 00:01:14,310 implies that there's a monotone path between those two points. 27 00:01:14,310 --> 00:01:16,190 And then we saw, on the upper bound side, 28 00:01:16,190 --> 00:01:19,100 we saw a greedy algorithm, which was the obvious offline thing 29 00:01:19,100 --> 00:01:22,010 to do, which is as these points come along, 30 00:01:22,010 --> 00:01:24,170 as you do the accesses, the white dots, 31 00:01:24,170 --> 00:01:26,540 you add the necessary red dots in order 32 00:01:26,540 --> 00:01:30,867 to make it arborally satisfied row by row. 33 00:01:30,867 --> 00:01:33,200 And so that seemed like the obvious offline thing to do. 34 00:01:33,200 --> 00:01:36,140 Turns out it could be done online up to constant factors. 35 00:01:36,140 --> 00:01:37,790 I sketched that last time. 36 00:01:37,790 --> 00:01:40,100 And this is conjectured to be within a constant factor 37 00:01:40,100 --> 00:01:41,060 of optimal. 38 00:01:41,060 --> 00:01:41,810 We can't prove it. 39 00:01:41,810 --> 00:01:43,851 What I'm going to show today is our best attempts 40 00:01:43,851 --> 00:01:45,166 at proving this is optimal. 41 00:01:45,166 --> 00:01:46,790 In particular, there's something called 42 00:01:46,790 --> 00:01:49,730 the signed greedy algorithm, which is almost the same as 43 00:01:49,730 --> 00:01:52,100 greedy, but it's a lower bound. 44 00:01:52,100 --> 00:01:54,007 And greedy is an upper bound. 
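The arborally satisfied condition recalled above can be checked by brute force. The following is a minimal sketch, not code from the lecture: it assumes points are `(key, time)` pairs, that "inside" includes the rectangle's boundary, and the function name `is_arborally_satisfied` is made up for illustration.

```python
from itertools import combinations

def is_arborally_satisfied(points):
    """Brute-force check (hypothetical helper): every pair of points that is
    not on a common row or column must have a third point in the closed
    rectangle the pair spans. Points are assumed to be (key, time) pairs."""
    pts = set(points)
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if x1 == x2 or y1 == y2:
            continue  # the "rectangle" is just a horizontal or vertical segment
        lo_x, hi_x = min(x1, x2), max(x1, x2)
        lo_y, hi_y = min(y1, y2), max(y1, y2)
        has_third_point = any(
            lo_x <= x <= hi_x and lo_y <= y <= hi_y
            for (x, y) in pts
            if (x, y) not in ((x1, y1), (x2, y2))
        )
        if not has_third_point:
            return False
    return True
```

For example, two points on a diagonal alone fail the check, and adding one of the two missing corners of their rectangle repairs it.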
45 00:01:54,007 --> 00:01:56,090 So all you need to do is show these two things are 46 00:01:56,090 --> 00:01:59,564 within a constant factor of each other and we're done. 47 00:01:59,564 --> 00:02:01,230 We're not going to get there, obviously, 48 00:02:01,230 --> 00:02:02,830 because we haven't solved that yet. 49 00:02:02,830 --> 00:02:05,900 But along the way, we're going to see 50 00:02:05,900 --> 00:02:10,639 tango trees, which achieve this log log n competitive bound. 51 00:02:10,639 --> 00:02:13,120 So we think greedy is constant competitive. 52 00:02:13,120 --> 00:02:14,840 The best we know right now is log log n. 53 00:02:14,840 --> 00:02:17,330 This is an improvement over red black trees, 54 00:02:17,330 --> 00:02:19,340 which achieve log n. 55 00:02:19,340 --> 00:02:21,710 Any balanced binary search tree is within a log n 56 00:02:21,710 --> 00:02:23,030 factor of optimal. 57 00:02:23,030 --> 00:02:27,350 So it's all between constant and log n that we're trying to do. 58 00:02:27,350 --> 00:02:29,450 Another fun consequence of lower bounds 59 00:02:29,450 --> 00:02:31,450 is a particular sense in which log n 60 00:02:31,450 --> 00:02:34,350 is necessary for some access sequences. 61 00:02:34,350 --> 00:02:37,340 So I argued last time that if you take, like, 62 00:02:37,340 --> 00:02:39,650 a random access sequence, or-- 63 00:02:39,650 --> 00:02:42,487 for example, if you look at a binary search tree 64 00:02:42,487 --> 00:02:44,570 and you say, oh, I'll just access the thing that's 65 00:02:44,570 --> 00:02:46,160 deepest in the tree, there's always 66 00:02:46,160 --> 00:02:48,950 something that's deep in the tree, of depth at least log n. 67 00:02:48,950 --> 00:02:50,450 And so for any binary search tree 68 00:02:50,450 --> 00:02:51,710 there is an access sequence. 
69 00:02:51,710 --> 00:02:53,543 No matter what that binary search tree does, 70 00:02:53,543 --> 00:02:56,300 I can choose the next access to force you 71 00:02:56,300 --> 00:02:58,520 to take log n per operation. 72 00:02:58,520 --> 00:03:01,370 What we're going to see today in these lower bounds 73 00:03:01,370 --> 00:03:05,600 is one access sequence that for all binary search trees, 74 00:03:05,600 --> 00:03:08,780 they must spend log n time. 75 00:03:08,780 --> 00:03:11,240 Just changing the quantifiers around. 76 00:03:11,240 --> 00:03:13,970 So instead of for every binary search tree 77 00:03:13,970 --> 00:03:15,470 there is an access sequence, there's 78 00:03:15,470 --> 00:03:17,480 going to be there is an access sequence 79 00:03:17,480 --> 00:03:19,070 such that for every binary search tree 80 00:03:19,070 --> 00:03:20,330 you need log n time. 81 00:03:20,330 --> 00:03:22,810 That's something we'll get easily out of these lower 82 00:03:22,810 --> 00:03:24,470 bounds. 83 00:03:24,470 --> 00:03:26,660 So let's jump into the lower bounds. 84 00:03:26,660 --> 00:03:29,620 And we're going to cover, you could say, three different lower 85 00:03:29,620 --> 00:03:30,120 bounds. 86 00:03:30,120 --> 00:03:32,930 The independent rectangles is kind of a generic class 87 00:03:32,930 --> 00:03:34,250 of lower bounds. 88 00:03:34,250 --> 00:03:36,260 Then we're going to see two specific choices 89 00:03:36,260 --> 00:03:38,810 of these independent rectangles, which are actually 90 00:03:38,810 --> 00:03:41,480 older than this result. So this is 91 00:03:41,480 --> 00:03:44,450 sort of a modern interpretation of two older results 92 00:03:44,450 --> 00:03:47,060 and a more general result. 93 00:03:47,060 --> 00:03:50,210 Signed greedy is going to turn out to be the best lower bound. 94 00:03:50,210 --> 00:03:52,790 It's better than all the ones that we will cover, 95 00:03:52,790 --> 00:03:56,630 but each of them has their own uses for analysis. 
96 00:03:56,630 --> 00:03:59,300 Each of them is going to let us analyze an algorithm 97 00:03:59,300 --> 00:04:01,340 that we couldn't or that we don't otherwise 98 00:04:01,340 --> 00:04:04,160 know how to analyze. 99 00:04:04,160 --> 00:04:08,600 So let's do the independent rectangle lower bound. 100 00:04:08,600 --> 00:04:09,700 The sort of generic one. 101 00:04:25,540 --> 00:04:28,690 So these lower bounds are all going 102 00:04:28,690 --> 00:04:31,780 to refer to the original point set, 103 00:04:31,780 --> 00:04:34,720 the white dots, the accesses. 104 00:04:34,720 --> 00:04:37,330 The idea is you're given an access sequence, a sequence x 105 00:04:37,330 --> 00:04:41,170 i-- x 1 up to x n, and you want to know some lower bound 106 00:04:41,170 --> 00:04:43,690 that every binary search tree requires 107 00:04:43,690 --> 00:04:46,930 a certain number of accesses, a certain number of node touches 108 00:04:46,930 --> 00:04:47,980 for that access sequence. 109 00:04:47,980 --> 00:04:49,021 You know it's at least n. 110 00:04:49,021 --> 00:04:50,910 You want something bigger than n. 111 00:04:50,910 --> 00:04:54,475 We've got to at least touch the nodes that are being accessed. 112 00:04:58,983 --> 00:05:00,370 I'm going to drop this. 113 00:05:15,520 --> 00:05:19,090 So I want the notion of independent rectangles. 114 00:05:19,090 --> 00:05:24,400 And general idea of dependent rectangles 115 00:05:24,400 --> 00:05:27,340 would be something like this. 116 00:05:30,328 --> 00:05:31,300 Ah, I see. 117 00:05:42,610 --> 00:05:44,170 So these are two rectangles. 118 00:05:44,170 --> 00:05:46,570 I consider them dependent because one of the corners 119 00:05:46,570 --> 00:05:48,900 is inside the other rectangle. 120 00:05:48,900 --> 00:05:51,320 This is true no matter where the points are. 121 00:05:51,320 --> 00:05:59,750 So, for example, if I take two points, they span a rectangle. 
122 00:05:59,750 --> 00:06:04,660 If I take these two points, for example, they span a rectangle. 123 00:06:04,660 --> 00:06:06,180 This corner is inside that one. 124 00:06:06,180 --> 00:06:08,440 So these are considered dependent rectangles 125 00:06:08,440 --> 00:06:11,127 in either case. 126 00:06:11,127 --> 00:06:13,210 So corner here does not necessarily mean a point-- 127 00:06:13,210 --> 00:06:14,782 any of the four corners. 128 00:06:14,782 --> 00:06:16,240 Rectangle is defined by two points, 129 00:06:16,240 --> 00:06:19,960 but it has all four corners. 130 00:06:19,960 --> 00:06:23,380 And so, in particular, independent rectangles-- 131 00:06:23,380 --> 00:06:26,080 for example, they might be completely disjoint. 132 00:06:26,080 --> 00:06:28,700 Those are going to be independent. 133 00:06:28,700 --> 00:06:31,700 Something like that is independent. 134 00:06:31,700 --> 00:06:33,140 But there are some other cases. 135 00:06:33,140 --> 00:06:37,180 You can have rectangles that look like this. 136 00:06:37,180 --> 00:06:38,110 OK? 137 00:06:38,110 --> 00:06:39,901 And it doesn't matter where the points are. 138 00:06:39,901 --> 00:06:42,000 Maybe here, here, here, and here. 139 00:06:42,000 --> 00:06:44,210 Or the other way. 140 00:06:44,210 --> 00:06:46,990 These are independent. 141 00:06:46,990 --> 00:06:52,960 And there's one other kind of special case, which maybe I'll 142 00:06:52,960 --> 00:06:56,680 use color to draw the other one because they're 143 00:06:56,680 --> 00:06:57,970 right on top of each other. 144 00:07:03,405 --> 00:07:10,000 So I've got a point here, a point here, and a point here. 145 00:07:10,000 --> 00:07:13,810 These are two rectangles defined on three points. 146 00:07:13,810 --> 00:07:16,990 So they both use this point. 147 00:07:16,990 --> 00:07:19,930 And if you check, it does satisfy this condition. 148 00:07:19,930 --> 00:07:23,260 So no corner strictly inside the other. 
149 00:07:23,260 --> 00:07:25,660 But we also need that the rectangles are unsatisfied. 150 00:07:25,660 --> 00:07:28,660 So this is saying that there's no other point even 151 00:07:28,660 --> 00:07:30,410 on the boundary of the rectangle. 152 00:07:30,410 --> 00:07:34,360 So this part says, OK, there's nothing strictly inside. 153 00:07:34,360 --> 00:07:36,610 But we also need that on the boundary 154 00:07:36,610 --> 00:07:38,060 there's no other points. 155 00:07:38,060 --> 00:07:40,060 So this is the only sort of situation other 156 00:07:40,060 --> 00:07:44,200 than reflections where you get this working 157 00:07:44,200 --> 00:07:45,840 out as independent. 158 00:07:50,632 --> 00:07:52,090 AUDIENCE: Last case is independent? 159 00:07:52,090 --> 00:07:53,730 ERIK DEMAINE: Last case is independent. 160 00:07:56,730 --> 00:07:58,800 All right? 161 00:07:58,800 --> 00:08:00,470 So this is a definition. 162 00:08:00,470 --> 00:08:02,470 If I give you a set of rectangles, 163 00:08:02,470 --> 00:08:05,070 they're independent. 164 00:08:05,070 --> 00:08:06,930 I mean, I was looking at pairwise. 165 00:08:06,930 --> 00:08:08,640 But if they're pairwise independent, 166 00:08:08,640 --> 00:08:10,710 then they will be independent. 167 00:08:10,710 --> 00:08:12,270 No corner of any rectangle strictly 168 00:08:12,270 --> 00:08:14,590 inside any other rectangle. 169 00:08:14,590 --> 00:08:17,940 And there's no points of those rectangles 170 00:08:17,940 --> 00:08:21,360 that are inside others. 171 00:08:21,360 --> 00:08:22,166 OK. 172 00:08:22,166 --> 00:08:22,665 Cool. 173 00:08:25,390 --> 00:08:27,790 So what? 
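The pairwise definition just given can be sketched in code. This is an illustrative sketch under stated assumptions, not the lecture's own formulation: a rectangle is represented by its two defining points, "unsatisfied" is taken to mean that no other input point lies in the closed rectangle, and all function names are hypothetical.

```python
def corners(rect):
    """All four corners of the rectangle spanned by two defining points."""
    (x1, y1), (x2, y2) = rect
    return [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]

def strictly_inside(pt, rect):
    """True if pt lies in the open interior of rect (boundary excluded)."""
    (x1, y1), (x2, y2) = rect
    x, y = pt
    return min(x1, x2) < x < max(x1, x2) and min(y1, y2) < y < max(y1, y2)

def no_corner_inside(r, s):
    """Corner condition: no corner of r strictly inside s, and vice versa."""
    return (not any(strictly_inside(c, s) for c in corners(r))
            and not any(strictly_inside(c, r) for c in corners(s)))

def is_unsatisfied(rect, points):
    """Assumed meaning: no input point other than the two defining points
    lies in the closed rectangle (interior or boundary)."""
    (x1, y1), (x2, y2) = rect
    lo_x, hi_x = min(x1, x2), max(x1, x2)
    lo_y, hi_y = min(y1, y2), max(y1, y2)
    return not any(
        lo_x <= x <= hi_x and lo_y <= y <= hi_y
        for (x, y) in points
        if (x, y) not in rect
    )

def are_independent(r, s, points):
    """Two rectangles are independent if both are unsatisfied in the input
    and neither has a corner strictly inside the other."""
    return (is_unsatisfied(r, points) and is_unsatisfied(s, points)
            and no_corner_inside(r, s))
```

Under these assumptions, disjoint rectangles and rectangles that cross like a plus sign pass the corner condition, while two rectangles that overlap at a corner do not.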
174 00:08:32,350 --> 00:08:36,640 Lower bound says the optimal offline binary search 175 00:08:36,640 --> 00:08:41,799 tree, or the optimal way to add dots to satisfy your point set, 176 00:08:41,799 --> 00:08:45,490 is going to be at least the size of the input-- 177 00:08:45,490 --> 00:08:48,340 meaning the number of initial points you have-- 178 00:08:48,340 --> 00:08:53,170 plus half the maximum number of independent rectangles. 179 00:09:02,591 --> 00:09:03,090 OK. 180 00:09:03,090 --> 00:09:05,460 So this is a max independent set problem. 181 00:09:05,460 --> 00:09:07,350 In general, that's NP-complete. 182 00:09:07,350 --> 00:09:10,260 Turns out we'll be able to at least approximate the number 183 00:09:10,260 --> 00:09:12,990 of independent rectangles within a constant factor 184 00:09:12,990 --> 00:09:14,560 by the end of class. 185 00:09:14,560 --> 00:09:16,220 That's going to be signed greedy. 186 00:09:16,220 --> 00:09:18,180 So signed greedy is going to be the best 187 00:09:18,180 --> 00:09:21,270 way up to constant factors to choose independent rectangles. 188 00:09:21,270 --> 00:09:24,780 For now, someone magically tells you 189 00:09:24,780 --> 00:09:28,770 what's the best way or you just choose some reasonable-- 190 00:09:28,770 --> 00:09:30,534 any choice of independent rectangles 191 00:09:30,534 --> 00:09:31,450 will be a lower bound. 192 00:09:31,450 --> 00:09:34,600 But you get the best lower bound by choosing the max. 193 00:09:34,600 --> 00:09:35,100 OK? 194 00:09:35,100 --> 00:09:38,400 So we're going to prove this theorem, 195 00:09:38,400 --> 00:09:43,432 and then we're going to see three different ways to choose 196 00:09:43,432 --> 00:09:44,640 those independent rectangles. 197 00:09:44,640 --> 00:09:46,770 And we'll use them for various things. 
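Written as a formula, the theorem stated above reads as follows. The notation is an assumption for illustration: $X$ is the input point set (the accesses), $\mathrm{OPT}(X)$ is the size of the smallest arborally satisfied superset of $X$, and $\mathcal{R}$ ranges over sets of pairwise independent rectangles spanned by points of $X$.

```latex
\[
  \mathrm{OPT}(X) \;\ge\; |X| \;+\; \frac{1}{2}\,
  \max\bigl\{\, |\mathcal{R}| \;:\; \mathcal{R}
  \text{ is a set of independent rectangles of } X \,\bigr\}
\]
```

Any particular independent set $\mathcal{R}$ already gives the weaker bound $\mathrm{OPT}(X) \ge |X| + |\mathcal{R}|/2$; the maximum gives the strongest bound of this form.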
198 00:09:46,770 --> 00:09:48,780 Wilber 1, Wilber 2, and signed greedy 199 00:09:48,780 --> 00:09:51,090 are going to be the three choices 200 00:09:51,090 --> 00:09:53,530 for independent rectangles. 201 00:09:53,530 --> 00:09:54,030 All right. 202 00:09:54,030 --> 00:09:57,660 To prove this theorem, we're going to change it a little bit 203 00:09:57,660 --> 00:09:59,850 first. 204 00:09:59,850 --> 00:10:02,700 And this is kind of the focus of today-- 205 00:10:02,700 --> 00:10:04,890 is the idea of signed rectangles. 206 00:10:09,090 --> 00:10:11,580 If you look at the rectangles in the world spanned 207 00:10:11,580 --> 00:10:14,800 by two points, there are two different kinds. 208 00:10:14,800 --> 00:10:20,460 There's the top right, lower left kind. 209 00:10:20,460 --> 00:10:23,910 And then there's the top left, lower right kind. 210 00:10:23,910 --> 00:10:26,610 These are positive slope or negative slope. 211 00:10:29,340 --> 00:10:31,590 Those are the two kinds of rectangles. 212 00:10:31,590 --> 00:10:36,630 And it's helpful to think about just the positive rectangles 213 00:10:36,630 --> 00:10:39,690 or the slash rectangles and just the backslash rectangles 214 00:10:39,690 --> 00:10:41,380 separately. 215 00:10:41,380 --> 00:10:43,920 So we're going to call a point set-- 216 00:10:48,090 --> 00:10:50,400 it's a little hard to pronounce-- 217 00:10:50,400 --> 00:10:53,820 we used to call this plus satisfied. 218 00:10:53,820 --> 00:10:55,620 So maybe it's easiest to pronounce it 219 00:10:55,620 --> 00:11:00,990 that way, the symbol formerly known as plus satisfied, 220 00:11:00,990 --> 00:11:11,370 if all plus rectangles that are not 221 00:11:11,370 --> 00:11:15,000 on a horizontal or vertical line contain another point. 
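The signed version of the satisfaction condition can be sketched the same way as the unsigned one. This is a hedged illustration: the name `is_sign_satisfied` and the `sign` parameter (`+1` for slash rectangles, `-1` for backslash rectangles) are assumptions, and "contain" is taken to include the boundary.

```python
from itertools import combinations

def is_sign_satisfied(points, sign):
    """Check only rectangles of one sign (hypothetical helper).
    sign=+1: pairs with positive slope (lower left / top right).
    sign=-1: pairs with negative slope (top left / lower right)."""
    pts = set(points)
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if (x2 - x1) * (y2 - y1) * sign <= 0:
            continue  # axis-aligned pair, or a rectangle of the other sign
        lo_x, hi_x = min(x1, x2), max(x1, x2)
        lo_y, hi_y = min(y1, y2), max(y1, y2)
        has_third_point = any(
            lo_x <= x <= hi_x and lo_y <= y <= hi_y
            for (x, y) in pts
            if (x, y) not in ((x1, y1), (x2, y2))
        )
        if not has_third_point:
            return False
    return True
```

A pair of points on a rising diagonal is an empty plus rectangle, so it fails the `+1` check but passes the `-1` check trivially.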
222 00:11:23,360 --> 00:11:27,390 So a point set is arborally satisfied 223 00:11:27,390 --> 00:11:31,310 if and only if it is plus satisfied and minus satisfied-- 224 00:11:31,310 --> 00:11:33,787 just breaking apart that definition into two parts. 225 00:11:33,787 --> 00:11:36,120 But now, we're going to look at point sets that are just 226 00:11:36,120 --> 00:11:39,300 plus satisfied or point sets that are just minus satisfied. 227 00:11:39,300 --> 00:11:42,630 And then we can look at the optimal solution 228 00:11:42,630 --> 00:11:46,960 if you only care about plus rectangles. 229 00:11:46,960 --> 00:12:03,840 So this is the smallest plus satisfied point set 230 00:12:03,840 --> 00:12:08,776 containing all the access points, all the given points. 231 00:12:08,776 --> 00:12:12,470 So we'll call that the input. 232 00:12:12,470 --> 00:12:15,090 OPT was the smallest arborally satisfied. 233 00:12:15,090 --> 00:12:17,340 OPT plus is the-- 234 00:12:17,340 --> 00:12:19,483 you just look at plus rectangles. 235 00:12:24,351 --> 00:12:24,850 OK. 236 00:12:24,850 --> 00:12:26,570 Why are we doing this? 237 00:12:26,570 --> 00:12:29,030 Well, for now, we're going to do it to prove this theorem. 238 00:12:29,030 --> 00:12:35,830 So lemma, which was what we're actually going to prove, 239 00:12:35,830 --> 00:12:38,650 is if you look at this OPT plus thing, 240 00:12:38,650 --> 00:12:41,010 it's got to be at least the size of the input-- 241 00:12:41,010 --> 00:12:44,620 everything has to at least contain the input-- 242 00:12:44,620 --> 00:12:54,460 plus maximum number of independent plus rectangles. 243 00:13:02,260 --> 00:13:04,680 So this is what we're actually going to prove. 
244 00:13:04,680 --> 00:13:06,180 If you want to get plus satisfied 245 00:13:06,180 --> 00:13:09,240 and you've got k independent plus rectangles, 246 00:13:09,240 --> 00:13:11,050 you need to add at least that many points-- 247 00:13:11,050 --> 00:13:14,850 so at least one point per plus rectangle. 248 00:13:14,850 --> 00:13:16,830 If you can prove this, you prove the theorem 249 00:13:16,830 --> 00:13:18,900 because this holds for minus just 250 00:13:18,900 --> 00:13:21,120 as well as plus by symmetry. 251 00:13:21,120 --> 00:13:26,400 And so you take your maximum independent set of rectangles. 252 00:13:26,400 --> 00:13:28,830 At least half of them are plus or at least half of them 253 00:13:28,830 --> 00:13:30,000 are minus. 254 00:13:30,000 --> 00:13:34,290 You apply this bound, and that's where you get the 1/2 here. 255 00:13:34,290 --> 00:13:38,182 So this is stronger, I guess, than the theorem, 256 00:13:38,182 --> 00:13:40,140 and this is what we're actually going to prove. 257 00:13:40,140 --> 00:13:41,850 And so, in this world, we just are 258 00:13:41,850 --> 00:13:44,790 thinking about plus rectangles, which is a little weird. 259 00:13:44,790 --> 00:13:47,340 But it works. 260 00:13:51,120 --> 00:13:54,366 And the proof is going to be in three steps. 261 00:13:54,366 --> 00:13:56,760 I'm first going to give you an overview of the steps, 262 00:13:56,760 --> 00:13:58,470 and then we'll actually do them. 263 00:13:58,470 --> 00:14:03,030 So this is like a two-level proof. 264 00:14:03,030 --> 00:14:07,470 First thing we do, the top level, 265 00:14:07,470 --> 00:14:13,590 is we're going to find a rectangle 266 00:14:13,590 --> 00:14:20,370 in the independent set, and we're 267 00:14:20,370 --> 00:14:27,540 going to find a vertical line that hits only that rectangle. 
268 00:14:33,730 --> 00:14:37,160 So we're going to have some rectangle 269 00:14:37,160 --> 00:14:43,380 in the independent set, and we want a vertical line stabbing 270 00:14:43,380 --> 00:14:46,470 it such that no other rectangle is 271 00:14:46,470 --> 00:14:48,490 stabbed by this vertical line. 272 00:14:48,490 --> 00:14:50,910 So all other rectangles-- that's independence, 273 00:14:50,910 --> 00:14:54,260 so maybe they look something like this-- 274 00:14:54,260 --> 00:14:55,690 but nothing like this. 275 00:14:59,297 --> 00:15:01,380 Not obvious that such a thing exists, but it does. 276 00:15:01,380 --> 00:15:03,240 Actually, not that hard to find. 277 00:15:03,240 --> 00:15:05,510 We just need some rectangle with some line. 278 00:15:09,540 --> 00:15:16,230 Then, using that property, we're going 279 00:15:16,230 --> 00:15:31,200 to be able to find some points in that rectangle that are also 280 00:15:31,200 --> 00:15:43,530 in the optimal plus solution in the rectangle 281 00:15:43,530 --> 00:15:44,490 crossing the line. 282 00:15:49,950 --> 00:15:52,770 Let me get another color. 283 00:15:52,770 --> 00:15:56,490 So we're going to find a point on the left 284 00:15:56,490 --> 00:15:59,964 of the line and a point on the right of the line. 285 00:15:59,964 --> 00:16:01,380 And they're horizontally adjacent, 286 00:16:01,380 --> 00:16:03,790 meaning there's no other point between them. 287 00:16:03,790 --> 00:16:07,164 So we know there's some point in this box. 288 00:16:07,164 --> 00:16:08,580 Because this is a plus box, it has 289 00:16:08,580 --> 00:16:09,750 got to be satisfied somehow. 290 00:16:09,750 --> 00:16:11,375 And I claim there's actually two points 291 00:16:11,375 --> 00:16:12,600 on either side of the line. 292 00:16:12,600 --> 00:16:14,992 One of them could be equal to this or this, but not both 293 00:16:14,992 --> 00:16:16,950 obviously because they're horizontally aligned. 
294 00:16:21,270 --> 00:16:29,230 And then what we're going to do is charge the rectangle 295 00:16:29,230 --> 00:16:29,970 to those points. 296 00:16:34,346 --> 00:16:36,720 And then, basically, we're going to remove that rectangle 297 00:16:36,720 --> 00:16:38,340 and repeat. 298 00:16:38,340 --> 00:16:40,830 And the claim is this charging sort of only happens once 299 00:16:40,830 --> 00:16:41,910 per point. 300 00:16:41,910 --> 00:16:44,700 And therefore, the number of points in the optimal solution 301 00:16:44,700 --> 00:16:48,150 has to be at least the number of rectangles-- number 302 00:16:48,150 --> 00:16:50,940 of plus rectangles in the independent set. 303 00:16:50,940 --> 00:16:53,740 So, basically, this is a way of ordering the rectangles. 304 00:16:53,740 --> 00:16:56,640 We're going to take one that has one of these vertical lines, 305 00:16:56,640 --> 00:17:00,630 find two points that pay for that rectangle, 306 00:17:00,630 --> 00:17:03,360 and therefore argue that OPT has to be at least the number 307 00:17:03,360 --> 00:17:04,394 of rectangles. 308 00:17:04,394 --> 00:17:08,250 So we have to argue that at least one of these points 309 00:17:08,250 --> 00:17:11,470 is not one of the original points. 310 00:17:11,470 --> 00:17:15,644 And that's where we're getting the input plus this. 311 00:17:15,644 --> 00:17:17,310 So there's lots of things to check here. 312 00:17:17,310 --> 00:17:18,476 Let's do them one at a time. 313 00:17:21,560 --> 00:17:23,930 And throughout, I'm going to assume-- 314 00:17:23,930 --> 00:17:27,150 let me write that the bottom-- 315 00:17:27,150 --> 00:17:31,700 assume all x- and y-coordinates are unique. 316 00:17:39,860 --> 00:17:43,490 This is an idea I mentioned last time as well. 317 00:17:43,490 --> 00:17:46,890 If you have lots of accesses to the same key, 318 00:17:46,890 --> 00:17:50,870 imagine them being accesses to slightly different keys. 
319 00:17:50,870 --> 00:17:53,330 Just skew them a little bit, and it doesn't 320 00:17:53,330 --> 00:17:56,420 change any of the bounds much. 321 00:17:56,420 --> 00:18:00,290 I won't argue that here, but at the least think of this 322 00:18:00,290 --> 00:18:03,530 as just a simplifying assumption to make the proofs cleaner. 323 00:18:07,010 --> 00:18:09,230 So how are we going to do step one? 324 00:18:09,230 --> 00:18:12,710 I need to find some rectangle and some vertical line that 325 00:18:12,710 --> 00:18:16,580 only stabs that rectangle. 326 00:18:16,580 --> 00:18:18,560 And the way we're going to do that is just 327 00:18:18,560 --> 00:18:30,190 take the widest rectangle that just has the maximum x extent. 328 00:18:30,190 --> 00:18:34,810 There might be more than one, but just take one of them. 329 00:18:34,810 --> 00:18:38,534 So it's very wide. 330 00:18:38,534 --> 00:18:39,950 What this tells us is that there's 331 00:18:39,950 --> 00:18:42,710 no other rectangle like this. 332 00:18:42,710 --> 00:18:45,110 This would be independent, but it would be wider. 333 00:18:45,110 --> 00:18:46,040 So that's not allowed. 334 00:18:51,365 --> 00:18:53,490 Now, we have to think about all sorts of scenarios. 335 00:18:53,490 --> 00:18:55,990 So we've got a point here and a point here. 336 00:18:55,990 --> 00:18:58,960 It could still be that we have rectangles like this. 337 00:18:58,960 --> 00:19:00,970 They just can't go farther to the right. 338 00:19:00,970 --> 00:19:03,670 It could be we have rectangles that go like this-- 339 00:19:03,670 --> 00:19:07,354 just can't go too far to the left. 340 00:19:07,354 --> 00:19:08,770 These rectangles that are anchored 341 00:19:08,770 --> 00:19:10,330 in the lower left and these rectangles that 342 00:19:10,330 --> 00:19:12,610 are anchored in the upper right can't touch each other 343 00:19:12,610 --> 00:19:16,369 because then one of them would be satisfied. 
344 00:19:16,369 --> 00:19:18,160 This one's going to have a point down here. 345 00:19:18,160 --> 00:19:20,230 This one is going to have a point here. 346 00:19:20,230 --> 00:19:23,200 I guess-- yeah, let's see, how would it 347 00:19:23,200 --> 00:19:24,560 go if they were touching? 348 00:19:31,110 --> 00:19:33,465 We'd have a corner-- 349 00:19:33,465 --> 00:19:34,910 hmm, touching is a little weird. 350 00:19:51,000 --> 00:19:52,000 Ah, I see. 351 00:19:52,000 --> 00:19:52,500 Good. 352 00:19:52,500 --> 00:19:53,958 This can't happen because we assume 353 00:19:53,958 --> 00:19:56,790 the x-coordinates are distinct. 354 00:19:56,790 --> 00:19:59,360 So that's why I did this. 355 00:19:59,360 --> 00:20:01,860 That's the reason. 356 00:20:01,860 --> 00:20:03,580 So this can't happen. 357 00:20:03,580 --> 00:20:07,410 And I also can't have them go like this because then there's 358 00:20:07,410 --> 00:20:12,120 a corner in the strict interior of the other rectangle. 359 00:20:12,120 --> 00:20:12,912 Is that clear? 360 00:20:12,912 --> 00:20:14,370 This rectangle can't come over here 361 00:20:14,370 --> 00:20:18,090 because then that would be not independent. 362 00:20:18,090 --> 00:20:20,160 Rectangle can't come right to the same spot 363 00:20:20,160 --> 00:20:21,460 because there is no same spot. 364 00:20:21,460 --> 00:20:24,190 That would be two points on the same vertical line. 365 00:20:24,190 --> 00:20:26,460 And so what we must have is a picture more 366 00:20:26,460 --> 00:20:30,810 like this where there's an empty region in between 367 00:20:30,810 --> 00:20:34,680 that's not hit by-- there can be many of these rectangles, many 368 00:20:34,680 --> 00:20:35,911 of these rectangles. 369 00:20:35,911 --> 00:20:37,410 They're independent from each other. 370 00:20:37,410 --> 00:20:40,620 That's like this case here. 371 00:20:40,620 --> 00:20:43,560 There can also be some rectangles like this. 
372 00:20:43,560 --> 00:20:47,100 But by the same argument, these guys can't touch each other 373 00:20:47,100 --> 00:20:49,380 and they can't overlap horizontally 374 00:20:49,380 --> 00:20:52,770 because then one of the corners would be inside the other. 375 00:20:52,770 --> 00:20:54,080 Question? 376 00:20:54,080 --> 00:20:56,455 AUDIENCE: For that picture, you drew a rectangle 377 00:20:56,455 --> 00:20:59,282 under the other one. 378 00:20:59,282 --> 00:21:00,740 ERIK DEMAINE: This one or this one? 379 00:21:00,740 --> 00:21:01,531 AUDIENCE: That one. 380 00:21:01,531 --> 00:21:03,480 ERIK DEMAINE: Yeah, this one cannot happen. 381 00:21:03,480 --> 00:21:04,575 That's what we claim-- 382 00:21:04,575 --> 00:21:05,250 haha, right. 383 00:21:05,250 --> 00:21:07,140 So you're right. 384 00:21:07,140 --> 00:21:10,590 So we worry about-- 385 00:21:10,590 --> 00:21:11,612 interesting. 386 00:21:11,612 --> 00:21:13,320 Well, we worry about something like this. 387 00:21:16,194 --> 00:21:19,425 AUDIENCE: Sorry, why can't that happen? 388 00:21:19,425 --> 00:21:20,800 ERIK DEMAINE: Yeah, you're right. 389 00:21:20,800 --> 00:21:22,216 I actually drew the wrong picture. 390 00:21:22,216 --> 00:21:23,215 Sorry. 391 00:21:23,215 --> 00:21:23,715 Kidding. 392 00:21:31,080 --> 00:21:31,580 Yeah. 393 00:21:31,580 --> 00:21:33,250 I really meant line segment here. 394 00:21:33,250 --> 00:21:34,305 I'm sorry. 395 00:21:39,640 --> 00:21:42,020 Poor choice of wording. 396 00:21:42,020 --> 00:21:43,720 So vertical line is actually just going 397 00:21:43,720 --> 00:21:47,770 to go the extent of the rectangle-- 398 00:21:47,770 --> 00:21:49,510 something like this. 399 00:21:49,510 --> 00:21:50,230 Sorry. 400 00:21:50,230 --> 00:21:53,910 We can't forbid rectangles like this. 401 00:21:53,910 --> 00:21:57,260 What we can forbid are rectangles like this 402 00:21:57,260 --> 00:22:01,632 that also try to cross that segment. 
403 00:22:01,632 --> 00:22:03,340 We'll see why this is enough in a moment. 404 00:22:03,340 --> 00:22:04,060 Sorry about that. 405 00:22:08,350 --> 00:22:12,070 I really only care about the interior of this rectangle. 406 00:22:12,070 --> 00:22:15,370 I'm trying to get a vertical line that only stabs 407 00:22:15,370 --> 00:22:17,530 this rectangle, nothing else-- 408 00:22:17,530 --> 00:22:19,600 inside the rectangle. 409 00:22:19,600 --> 00:22:20,740 Sorry, poor wording. 410 00:22:20,740 --> 00:22:22,674 I don't care about these guys outside 411 00:22:22,674 --> 00:22:24,340 because I can't say anything about them. 412 00:22:24,340 --> 00:22:29,275 They could be all over the place in an independent set. 413 00:22:29,275 --> 00:22:31,150 I mean, relative to what hits this rectangle, 414 00:22:31,150 --> 00:22:32,050 there's stuff on the left. 415 00:22:32,050 --> 00:22:33,050 There's stuff on the right. 416 00:22:33,050 --> 00:22:33,924 There are these guys. 417 00:22:33,924 --> 00:22:37,340 There can also be things like this. 418 00:22:37,340 --> 00:22:40,600 But still remaining are these regions 419 00:22:40,600 --> 00:22:42,959 which are not hit by any rectangles, 420 00:22:42,959 --> 00:22:44,500 and that's because of what I was saying. 421 00:22:44,500 --> 00:22:46,050 These guys can't touch each other because then there 422 00:22:46,050 --> 00:22:47,680 would be equal x-coordinates. 423 00:22:47,680 --> 00:22:49,660 They can't overlap because then one of them 424 00:22:49,660 --> 00:22:52,180 would not be independent from the other. 425 00:22:52,180 --> 00:22:58,540 So I get my vertical lines. 426 00:22:58,540 --> 00:23:01,510 I just need one, but it could be any of these. 
427 00:23:01,510 --> 00:23:06,400 In general, for example, if you take all of these lower 428 00:23:06,400 --> 00:23:08,470 left anchored rectangles and take just 429 00:23:08,470 --> 00:23:10,540 to the right of the rightmost one, 430 00:23:10,540 --> 00:23:13,210 that will be a valid choice for your line. 431 00:23:13,210 --> 00:23:17,060 Because you can argue none of these can overlap it. 432 00:23:17,060 --> 00:23:18,550 So that's step one. 433 00:23:18,550 --> 00:23:20,686 We just take a widest rectangle. 434 00:23:20,686 --> 00:23:22,060 The one thing we needed to forbid 435 00:23:22,060 --> 00:23:25,060 was something going like this all the way across. 436 00:23:28,120 --> 00:23:28,870 Step two. 437 00:23:31,822 --> 00:23:33,560 Step two is actually pretty easy. 438 00:23:33,560 --> 00:23:37,490 Once you've identified this red line-- 439 00:23:37,490 --> 00:23:40,370 inside the rectangle, you know there are some points. 440 00:23:40,370 --> 00:23:43,714 And I'm going to take the rightmost. 441 00:23:43,714 --> 00:23:45,380 And then among all the rightmost points, 442 00:23:45,380 --> 00:23:47,560 I'm going to take the topmost point that 443 00:23:47,560 --> 00:23:50,420 is to the left of the line and inside the rectangle. 444 00:23:50,420 --> 00:23:59,340 So let p be the topmost, leftmost point-- 445 00:23:59,340 --> 00:24:14,020 sorry, rightmost-- that is both in the rectangle 446 00:24:14,020 --> 00:24:16,210 and left of the line. 447 00:24:22,240 --> 00:24:28,420 Let me erase this one for a little bit more room. 448 00:24:28,420 --> 00:24:31,150 So I'm looking at all of this region 449 00:24:31,150 --> 00:24:33,370 to the left of the line in the rectangle. 450 00:24:33,370 --> 00:24:39,130 I want to take the rightmost and then topmost 451 00:24:39,130 --> 00:24:42,100 point-- something like this. 452 00:24:42,100 --> 00:24:43,960 How do I know such a point exists? 453 00:24:43,960 --> 00:24:46,640 Because this point is such a point. 
454 00:24:46,640 --> 00:24:49,030 And this point is to the left of the line. 455 00:24:49,030 --> 00:24:52,510 So if there's nothing else in here, that is a valid choice. 456 00:24:52,510 --> 00:24:54,950 But in general, we go to the right as much as possible. 457 00:24:54,950 --> 00:24:56,390 Then we go up as much as possible. 458 00:24:56,390 --> 00:24:59,110 So that's a point, which we will call p. 459 00:24:59,110 --> 00:25:03,534 AUDIENCE: Couldn't it be on the border of the rectangle? 460 00:25:03,534 --> 00:25:05,200 ERIK DEMAINE: It could be on the border. 461 00:25:05,200 --> 00:25:06,550 It could be interior. 462 00:25:06,550 --> 00:25:08,177 We don't know. 463 00:25:08,177 --> 00:25:11,890 AUDIENCE: When you said topmost, what is your topmost? 464 00:25:11,890 --> 00:25:15,448 ERIK DEMAINE: Topmost means of maximum y-coordinate. 465 00:25:15,448 --> 00:25:16,226 AUDIENCE: Oh, OK. 466 00:25:16,226 --> 00:25:16,850 Got it. 467 00:25:16,850 --> 00:25:19,490 ERIK DEMAINE: So it could be up here. 468 00:25:19,490 --> 00:25:21,380 We don't know. 469 00:25:21,380 --> 00:25:22,760 First, we go rightmost. 470 00:25:22,760 --> 00:25:25,019 Then, among all the things in that column, 471 00:25:25,019 --> 00:25:26,060 we go to the topmost one. 472 00:25:26,060 --> 00:25:27,140 So it might be on the top. 473 00:25:27,140 --> 00:25:27,806 It might not be. 474 00:25:30,760 --> 00:25:36,265 These are points-- sorry, this is a point in OPT plus. 475 00:25:40,270 --> 00:25:42,966 And then q is going to be a similar thing. 476 00:25:42,966 --> 00:25:52,780 It's going to be the bottom-most, leftmost point 477 00:25:52,780 --> 00:26:03,970 in OPT plus that is in the rectangle 478 00:26:03,970 --> 00:26:05,230 and right of the line. 479 00:26:08,112 --> 00:26:09,670 Not totally symmetric, though. 480 00:26:09,670 --> 00:26:12,420 We're also going to say and not below p. 481 00:26:16,570 --> 00:26:25,450 So now we're looking at this upper region here.
482 00:26:25,450 --> 00:26:28,290 Among all the things that are not below p-- 483 00:26:28,290 --> 00:26:30,280 should have drawn this more horizontal-- 484 00:26:33,430 --> 00:26:35,960 and to the right of the red line-- 485 00:26:35,960 --> 00:26:38,960 so that's up to here, I guess-- 486 00:26:38,960 --> 00:26:41,662 I want to take the leftmost column that 487 00:26:41,662 --> 00:26:43,370 has any points in it and then among those 488 00:26:43,370 --> 00:26:47,370 take the bottom-most point in the column. 489 00:26:47,370 --> 00:26:50,539 I claim that's actually going to be on this line. 490 00:26:50,539 --> 00:26:52,580 First thing to check is that such a point exists. 491 00:26:52,580 --> 00:26:54,980 Such a point exists because, in particular, this 492 00:26:54,980 --> 00:26:56,689 is such a point. 493 00:26:56,689 --> 00:26:57,980 It is to the right of the line. 494 00:26:57,980 --> 00:27:00,800 It's above the blue line, right of the red line, 495 00:27:00,800 --> 00:27:02,660 in the rectangle. 496 00:27:02,660 --> 00:27:05,350 But I claim that if we take the leftmost, bottom-most one, then 497 00:27:05,350 --> 00:27:06,600 they must actually be aligned. 498 00:27:06,600 --> 00:27:08,010 Why? 499 00:27:08,010 --> 00:27:10,520 So if it was somewhere else, like up here 500 00:27:10,520 --> 00:27:16,590 or like this point, then I claim that is an unsatisfied box. 501 00:27:16,590 --> 00:27:24,180 Let me draw that picture, make a little clearer. 502 00:27:24,180 --> 00:27:27,240 So something like this. 503 00:27:30,760 --> 00:27:39,540 So we've got our red line and we've got this picture. 504 00:27:43,090 --> 00:27:54,000 This is p, and then actually we don't know anything 505 00:27:54,000 --> 00:27:55,095 about down here. 506 00:27:59,770 --> 00:28:01,990 This is q. 507 00:28:01,990 --> 00:28:06,230 I claim that these black regions cannot have any points in them 508 00:28:06,230 --> 00:28:09,010 because, by definition, p was in the rightmost column. 
509 00:28:09,010 --> 00:28:11,677 So there's nothing in this strip and between p and the red line. 510 00:28:11,677 --> 00:28:14,218 And it was the topmost within the column, which means there's 511 00:28:14,218 --> 00:28:15,520 nothing above p in the column. 512 00:28:15,520 --> 00:28:17,620 So that's why all the points are confined 513 00:28:17,620 --> 00:28:19,300 to this region over here. 514 00:28:19,300 --> 00:28:22,360 Similarly, for q, if you look at the things that are above 515 00:28:22,360 --> 00:28:24,640 or on this horizontal line, which 516 00:28:24,640 --> 00:28:28,630 was the blue line over there, then we 517 00:28:28,630 --> 00:28:33,320 know that there's nothing in this strip in between 518 00:28:33,320 --> 00:28:34,330 because q is leftmost. 519 00:28:34,330 --> 00:28:36,163 And then among leftmost, it was bottom-most, 520 00:28:36,163 --> 00:28:37,630 so there's nothing down here. 521 00:28:37,630 --> 00:28:40,360 So that means, if these guys are not horizontally aligned, 522 00:28:40,360 --> 00:28:42,840 there is an unsatisfied box here. 523 00:28:42,840 --> 00:28:43,940 Contradiction. 524 00:28:43,940 --> 00:28:45,170 It's a plus box. 525 00:28:45,170 --> 00:28:48,227 So in OPT plus, there's got to be another point, which 526 00:28:48,227 --> 00:28:49,060 was a contradiction. 527 00:28:49,060 --> 00:28:52,360 So in fact, p and q must be horizontally aligned. 528 00:28:52,360 --> 00:28:54,360 So that was step two. 529 00:28:57,040 --> 00:28:58,300 Finally, we get step three. 530 00:29:03,250 --> 00:29:05,290 So the idea with step three-- 531 00:29:05,290 --> 00:29:07,390 now we're going to do a charging argument. 532 00:29:07,390 --> 00:29:13,060 We want to say, OK, basically, for every independent 533 00:29:13,060 --> 00:29:15,370 rectangle, we want to find a point that's 534 00:29:15,370 --> 00:29:19,720 in OPT that was not in the original input. 
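The choice of p and q in step two can be sketched in code. This is only an illustration under stated assumptions, not something from the lecture: the function name pick_p_and_q, the (x, y) tuple representation, and passing the rectangle as coordinate bounds are all hypothetical. It assumes the rectangle's defining corners are its lower-left and upper-right points, and that the vertical line sits at an x-coordinate strictly between keys.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (x, y) = (key, time); hypothetical representation


def pick_p_and_q(
    points: List[Point],
    x_lo: int, x_hi: int, y_lo: int, y_hi: int,
    line_x: float,
) -> Optional[Tuple[Point, Point]]:
    """Pick p and q as in step two, for one rectangle and one vertical line.

    p: among points in the rectangle left of the line, the rightmost one,
       breaking ties by taking the topmost.
    q: among points in the rectangle right of the line and not below p,
       the leftmost one, breaking ties by taking the bottom-most.
    Returns None if either candidate set is empty.
    """
    inside = [(x, y) for (x, y) in points
              if x_lo <= x <= x_hi and y_lo <= y <= y_hi]

    left = [pt for pt in inside if pt[0] < line_x]
    if not left:
        return None
    # Tuples compare by x first, then y: max gives rightmost, then topmost.
    p = max(left)

    right = [pt for pt in inside if pt[0] > line_x and pt[1] >= p[1]]
    if not right:
        return None
    # min gives leftmost, then bottom-most.
    q = min(right)
    return p, q


if __name__ == "__main__":
    # Corners (0, 0) and (4, 4), plus one extra point that satisfies the box.
    print(pick_p_and_q([(0, 0), (0, 4), (4, 4)], 0, 4, 0, 4, 2))
```

In both small examples used to check this sketch, the chosen p and q share a y-coordinate, which is what the argument above says must happen in a satisfied set.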
535 00:29:19,720 --> 00:29:22,390 And therefore, then OPT plus has to be 536 00:29:22,390 --> 00:29:25,120 at least the size of the input plus 1 per each 537 00:29:25,120 --> 00:29:26,350 of these plus rectangles. 538 00:29:29,510 --> 00:29:32,530 So the idea is the following. 539 00:29:32,530 --> 00:29:33,730 Because of all this set-up-- 540 00:29:33,730 --> 00:29:39,340 because we made pq horizontally aligned-- 541 00:29:39,340 --> 00:29:40,930 they're inside the rectangle. 542 00:29:40,930 --> 00:29:43,030 And furthermore, they're adjacent 543 00:29:43,030 --> 00:29:45,040 and they cross this vertical line. 544 00:29:45,040 --> 00:29:48,880 And that vertical line is not crossed by any other rectangle. 545 00:29:48,880 --> 00:29:51,780 When I say line, I mean line segment. 546 00:29:51,780 --> 00:29:54,880 There's no other rectangle that hits this red thing. 547 00:29:54,880 --> 00:29:56,470 Therefore, these two points are not 548 00:29:56,470 --> 00:30:00,640 going to get charged as a pair ever again. 549 00:30:00,640 --> 00:30:05,260 If you remove this rectangle, repeat this process, 550 00:30:05,260 --> 00:30:09,070 pq is never going to get charged again. 551 00:30:09,070 --> 00:30:10,990 So we charge to pq. 552 00:30:10,990 --> 00:30:26,230 And the pair never charged again, never 553 00:30:26,230 --> 00:30:33,860 be charged by another rectangle because no rectangle 554 00:30:33,860 --> 00:30:39,520 hits the red thing. 555 00:30:46,600 --> 00:30:53,380 So no rectangle contains the segment pq, 556 00:30:53,380 --> 00:30:54,620 the horizontal segment pq. 557 00:30:57,600 --> 00:30:59,130 So this is almost what we want. 558 00:30:59,130 --> 00:31:01,900 We really want a single point which is not in the input. 559 00:31:01,900 --> 00:31:03,360 So we have p and q. 560 00:31:03,360 --> 00:31:04,680 They're horizontally aligned. 
561 00:31:04,680 --> 00:31:07,080 Now, if they're horizontally aligned, 562 00:31:07,080 --> 00:31:12,090 we know that not both of them are in the original input 563 00:31:12,090 --> 00:31:14,564 because all y-coordinates are distinct. 564 00:31:14,564 --> 00:31:16,230 This is usually true because you're only 565 00:31:16,230 --> 00:31:19,800 accessing one point per row, per time step. 566 00:31:19,800 --> 00:31:23,550 So one of these might be in the input, 567 00:31:23,550 --> 00:31:25,170 but the other one is not. 568 00:31:25,170 --> 00:31:26,970 So that's the one I want to hold onto. 569 00:31:26,970 --> 00:31:32,040 And say, OK, that's a point added to OPT plus that pays 570 00:31:32,040 --> 00:31:34,295 for this rectangle. 571 00:31:34,295 --> 00:31:35,670 It's not quite so simple, though, 572 00:31:35,670 --> 00:31:37,230 because we might have a whole bunch 573 00:31:37,230 --> 00:31:42,060 of horizontally-aligned things. 574 00:31:42,060 --> 00:31:44,950 And one rectangle charges to this one. 575 00:31:44,950 --> 00:31:46,980 One rectangle charges to this one. 576 00:31:46,980 --> 00:31:48,960 One rectangle charges to this pair. 577 00:31:51,510 --> 00:31:54,660 That's OK, though, because here we have four points. 578 00:31:54,660 --> 00:31:57,180 Again, one of them could be in the input. 579 00:31:57,180 --> 00:31:59,160 The other three have to be added. 580 00:31:59,160 --> 00:32:01,720 And so you've got three rectangles, three added points, 581 00:32:01,720 --> 00:32:03,190 and we're happy. 582 00:32:03,190 --> 00:32:04,149 Question? 583 00:32:04,149 --> 00:32:05,940 AUDIENCE: Just to make the argument formal, 584 00:32:05,940 --> 00:32:11,667 wouldn't you want to say that only when you're saying assume 585 00:32:11,667 --> 00:32:13,916 that x and y are always distinct-- but then, 586 00:32:13,916 --> 00:32:16,050 if you have the same either x or y-- 587 00:32:16,050 --> 00:32:17,490 ERIK DEMAINE: Ah, good point.
588 00:32:17,490 --> 00:32:21,960 So this is distinct in the input is what I meant. 589 00:32:21,960 --> 00:32:23,832 Obviously, in OPT, any satisfied set 590 00:32:23,832 --> 00:32:25,290 is not going to have this property. 591 00:32:25,290 --> 00:32:26,159 Yeah, good. 592 00:32:26,159 --> 00:32:28,200 So I want to assume x- and y-coordinates are only 593 00:32:28,200 --> 00:32:29,490 distinct in the input. 594 00:32:29,490 --> 00:32:31,020 OPT will not have that property. 595 00:32:31,020 --> 00:32:34,422 And that's why p and q can exist and have the same y-coordinate. 596 00:32:34,422 --> 00:32:35,130 Another question? 597 00:32:39,397 --> 00:32:40,938 AUDIENCE: Does this still [INAUDIBLE] 598 00:32:40,938 --> 00:32:43,116 the special case where your two points are 599 00:32:43,116 --> 00:32:44,932 the points of the rectangle? 600 00:32:44,932 --> 00:32:45,640 ERIK DEMAINE: OK. 601 00:32:45,640 --> 00:32:50,560 So the question is can p and q be the points of the rectangle? 602 00:32:50,560 --> 00:32:52,180 One of them can be. 603 00:32:52,180 --> 00:32:55,360 Like, p could be here, and then another point is over here. 604 00:32:55,360 --> 00:32:58,480 So then that will be the segment that you are using, 605 00:32:58,480 --> 00:33:00,770 between p and q. 606 00:33:00,770 --> 00:33:03,690 Or it could be q is here, and p is over here. 607 00:33:03,690 --> 00:33:04,875 Then that's the segment. 608 00:33:04,875 --> 00:33:07,000 You can't have them both equal because p and q have 609 00:33:07,000 --> 00:33:10,320 to be horizontally aligned and also because there's got 610 00:33:10,320 --> 00:33:13,750 to be another point in there. 611 00:33:13,750 --> 00:33:15,220 Yeah, so that should work. 612 00:33:15,220 --> 00:33:17,710 You have to check that this boundary case is still OK. 613 00:33:17,710 --> 00:33:23,350 But the claim is no other rectangle touches this red line 614 00:33:23,350 --> 00:33:24,825 even on the endpoint. 
615 00:33:24,825 --> 00:33:26,200 And therefore, no other rectangle 616 00:33:26,200 --> 00:33:28,760 will wholly contain p and q. 617 00:33:28,760 --> 00:33:32,470 So that means you're only charging to this pair once. 618 00:33:32,470 --> 00:33:34,280 And then this pair charging is OK 619 00:33:34,280 --> 00:33:38,142 because, luckily, there's three edges here, four vertices. 620 00:33:38,142 --> 00:33:39,850 One of those vertices we can't charge to, 621 00:33:39,850 --> 00:33:43,840 so there's exactly the right number of things for the edges, 622 00:33:43,840 --> 00:33:46,210 and we're OK. 623 00:33:46,210 --> 00:33:48,580 Yeah, this can really happen. 624 00:33:48,580 --> 00:33:54,420 In fact our favorite example of the pinwheel-- 625 00:33:54,420 --> 00:33:58,903 if instead of doing the greedy addition, we do this addition-- 626 00:34:03,740 --> 00:34:07,120 these are supposed to be horizontally aligned. 627 00:34:07,120 --> 00:34:09,770 A little hard without a grid-- 628 00:34:09,770 --> 00:34:12,350 a graph blackboard would be great. 629 00:34:12,350 --> 00:34:14,870 So this is not quite satisfied. 630 00:34:14,870 --> 00:34:19,969 You've got to add some more points here or something. 631 00:34:19,969 --> 00:34:22,380 But it has the feature that-- 632 00:34:22,380 --> 00:34:24,560 here's an independent set of rectangles. 633 00:34:24,560 --> 00:34:33,199 I can do this one, this one, and this one. 634 00:34:36,030 --> 00:34:38,330 So this is three independent rectangles. 635 00:34:38,330 --> 00:34:42,852 As the white points go, they're independent rectangles. 636 00:34:42,852 --> 00:34:44,810 The corners are not strictly inside each other, 637 00:34:44,810 --> 00:34:46,393 and none of the white points satisfies 638 00:34:46,393 --> 00:34:49,590 any of the other rectangles.
639 00:34:49,590 --> 00:34:53,132 And indeed, if you applied this argument, 640 00:34:53,132 --> 00:34:54,590 first you take the widest rectangle 641 00:34:54,590 --> 00:34:57,570 and say, OK, here is my vertical red segment. 642 00:34:57,570 --> 00:35:00,597 I'm going to charge to these two guys, this segment, 643 00:35:00,597 --> 00:35:02,930 and then eventually this guy will charge to this segment 644 00:35:02,930 --> 00:35:05,480 and this guy will charge to this segment. 645 00:35:05,480 --> 00:35:07,460 And luckily, there are three added points 646 00:35:07,460 --> 00:35:11,140 for exactly the three segments for the three rectangles. 647 00:35:11,140 --> 00:35:12,390 There had to be another point. 648 00:35:15,080 --> 00:35:16,140 So that's a lower bound. 649 00:35:21,453 --> 00:35:22,930 A lot of work-- 650 00:35:22,930 --> 00:35:25,060 but in the end, it says, look, just 651 00:35:25,060 --> 00:35:28,930 find an independent set of plus boxes, plus rectangles. 652 00:35:28,930 --> 00:35:30,350 That's a lower bound on OPT. 653 00:35:30,350 --> 00:35:32,620 So now, the question remains, how 654 00:35:32,620 --> 00:35:36,340 do we find a good independent set of plus boxes? 655 00:35:36,340 --> 00:35:38,500 And now we'll go through the three different ways 656 00:35:38,500 --> 00:35:40,230 we know how to do it. 657 00:35:40,230 --> 00:35:42,580 I'll start actually with Wilber 2. 658 00:35:42,580 --> 00:35:45,220 It's called Wilber 2 because it was in a paper by Wilber, 659 00:35:45,220 --> 00:35:49,090 and I think he called it lower bound number 1 and lower bound 660 00:35:49,090 --> 00:35:50,410 number 2. 661 00:35:50,410 --> 00:35:53,620 But for pragmatic reasons, I'm going to start with number 2. 662 00:36:00,670 --> 00:36:03,460 It's from 1989, so it's actually an old paper. 663 00:36:03,460 --> 00:36:06,410 And it was sort of lost for a long time. 664 00:36:06,410 --> 00:36:09,540 I don't think Wilber wrote any other papers. 
665 00:36:09,540 --> 00:36:11,755 It was in SICOMP, a big journal. 666 00:36:14,372 --> 00:36:18,400 so a few years after splay trees and then 667 00:36:18,400 --> 00:36:21,610 sort of rediscovered in the early 2000s 668 00:36:21,610 --> 00:36:25,060 and turns out to be really useful for a lot of theorems. 669 00:36:25,060 --> 00:36:27,970 So here's the lower bound. 670 00:36:27,970 --> 00:36:29,590 Again, we're looking at the input 671 00:36:29,590 --> 00:36:33,970 point set-- no added points, just the original points. 672 00:36:33,970 --> 00:36:37,450 Look at every point, and look at all the points 673 00:36:37,450 --> 00:36:40,970 that you can see from this point downward. 674 00:36:40,970 --> 00:36:41,830 What does see mean? 675 00:36:41,830 --> 00:36:46,210 I'm interested in points below p that when 676 00:36:46,210 --> 00:36:49,910 I draw the rectangle contain no other points. 677 00:36:49,910 --> 00:36:53,740 So this is sort of like a lower envelope. 678 00:36:53,740 --> 00:36:56,770 It's going to look something like this-- 679 00:36:56,770 --> 00:37:02,180 and maybe some points like this. 680 00:37:02,180 --> 00:37:05,950 So all of these rectangles have to be empty. 681 00:37:23,540 --> 00:37:37,670 So these are the downward visible points from p. 682 00:37:41,330 --> 00:37:45,680 And now, among these points, you can sort them by y-coordinate. 683 00:37:45,680 --> 00:37:48,590 And I want to see how many times do 684 00:37:48,590 --> 00:37:51,630 they cross this vertical line. 685 00:37:51,630 --> 00:37:54,560 So if I order them by y-coordinate-- 686 00:37:57,590 --> 00:37:59,780 so I start here, and maybe I go to here. 687 00:37:59,780 --> 00:38:03,470 Then the next one is over here, so that's across. 688 00:38:03,470 --> 00:38:04,850 Then I go over here. 689 00:38:04,850 --> 00:38:05,520 Then I cross. 690 00:38:05,520 --> 00:38:06,710 Then I go here. 691 00:38:06,710 --> 00:38:08,250 Then I cross. 692 00:38:08,250 --> 00:38:10,670 Go here, here, cross. 
693 00:38:10,670 --> 00:38:12,350 So if I visit them in order, I want 694 00:38:12,350 --> 00:38:15,170 to know how many times do I cross this vertical line. 695 00:38:19,320 --> 00:38:23,330 So this is the past of p, all of the accesses before p. 696 00:38:23,330 --> 00:38:25,880 Think of this is how many times you alternate 697 00:38:25,880 --> 00:38:27,985 between accessing on the left of the line 698 00:38:27,985 --> 00:38:29,610 and accessing on the right of the line. 699 00:38:33,594 --> 00:38:44,270 So count number of alternations left or right of p. 700 00:38:44,270 --> 00:38:46,412 And again, if we assume that no key is ever 701 00:38:46,412 --> 00:38:47,870 accessed more than once, then there 702 00:38:47,870 --> 00:38:49,910 will always be left or right, never exactly on. 703 00:38:54,931 --> 00:38:56,555 And then I want to sum over all points. 704 00:38:59,820 --> 00:39:03,950 And I claim this is a lower bound. 705 00:39:03,950 --> 00:39:07,550 Why is it a lower bound? 706 00:39:07,550 --> 00:39:13,580 Essentially, I take each of these red lines that 707 00:39:13,580 --> 00:39:20,005 cross the p vertical line and I turn them into a box. 708 00:39:20,005 --> 00:39:28,360 So there's one there, one there, one there, and one there. 709 00:39:28,360 --> 00:39:33,597 I claim if I do this for all p, all those boxes 710 00:39:33,597 --> 00:39:34,430 will be independent. 711 00:39:34,430 --> 00:39:36,805 All those rectangles will be independent from each other. 712 00:39:36,805 --> 00:39:41,810 I won't prove that formally here, but you can check it. 713 00:39:41,810 --> 00:39:44,240 So it's obvious for one p because each of these 714 00:39:44,240 --> 00:39:46,250 has a different vertical span. 715 00:39:46,250 --> 00:39:49,270 If you do it for all p-- all the points p-- 716 00:39:49,270 --> 00:39:51,980 these won't conflict. 
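The Wilber 2 count described above can be written down directly. The following is a minimal sketch, assuming (as the lecture does here) that no key is accessed more than once; the function name wilber2 and the list-of-keys input format are choices made for this example, not anything from Wilber's paper. For each access it walks backward in time, keeps only the downward-visible earlier accesses (those closer in key than anything seen since), and counts how often consecutive visible points switch sides.

```python
import math
from typing import List


def wilber2(accesses: List[int]) -> int:
    """Sum, over all accesses, of left/right alternations among the
    downward-visible earlier accesses. Assumes all keys are distinct."""
    total = 0
    for i, x in enumerate(accesses):
        nearest_left = -math.inf   # largest key < x seen so far going back in time
        nearest_right = math.inf   # smallest key > x seen so far going back in time
        prev_side = None
        for j in range(i - 1, -1, -1):
            k = accesses[j]
            if nearest_left < k < x:
                # Rectangle between (x, i) and (k, j) is empty: visible on the left.
                nearest_left = k
                side = "L"
            elif x < k < nearest_right:
                # Visible on the right.
                nearest_right = k
                side = "R"
            else:
                continue  # hidden behind a closer, later access
            if prev_side is not None and side != prev_side:
                total += 1  # crossed the vertical line through x
            prev_side = side
    return total


if __name__ == "__main__":
    print(wilber2([0, 1, 2, 3]))              # sequential access: no crossings
    print(wilber2([0, 4, 2, 6, 1, 5, 3, 7]))
```

This runs in quadratic time, which is fine for experimenting with small sequences; it is not meant as an efficient way to evaluate the bound.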
717 00:39:51,980 --> 00:39:55,560 So by the independent rectangle lower bound, 718 00:39:55,560 --> 00:40:02,490 this is a lower bound on OPT up to a factor of 2. 719 00:40:02,490 --> 00:40:04,950 So what? 720 00:40:04,950 --> 00:40:07,080 Wilber 2 is quite interesting. 721 00:40:07,080 --> 00:40:08,520 For a long time, we've conjectured 722 00:40:08,520 --> 00:40:11,060 that it is the right answer. 723 00:40:11,060 --> 00:40:16,620 So conjecture-- I know it's a weird lower 724 00:40:16,620 --> 00:40:18,290 bound to even think of. 725 00:40:18,290 --> 00:40:20,850 It's a very hard paper to read. 726 00:40:20,850 --> 00:40:22,950 Without the geometric view, it's even harder 727 00:40:22,950 --> 00:40:26,800 to imagine the definition of this bound. 728 00:40:26,800 --> 00:40:28,400 It's sort of an algorithm. 729 00:40:28,400 --> 00:40:29,955 It's a way to assign boxes. 730 00:40:29,955 --> 00:40:31,080 It gives you a lower bound. 731 00:40:31,080 --> 00:40:32,810 It's a little weird. 732 00:40:32,810 --> 00:40:34,560 We conjecture that it's proportional 733 00:40:34,560 --> 00:40:37,540 to the optimal solution. 734 00:40:37,540 --> 00:40:38,550 We can't prove it. 735 00:40:38,550 --> 00:40:40,440 We've tried many times. 736 00:40:40,440 --> 00:40:47,630 It's a pain to work with, but it is what it is. 737 00:40:47,630 --> 00:40:49,340 There's one theorem that uses it, 738 00:40:49,340 --> 00:40:51,150 so I want to tell you about that theorem. 739 00:40:51,150 --> 00:40:54,200 But I don't want to go into it in too much detail. 740 00:40:54,200 --> 00:40:56,660 It's a neat theorem. 741 00:40:56,660 --> 00:41:02,900 And it's in a paper by Iacono, 2002. 742 00:41:02,900 --> 00:41:05,900 And it was the first paper to revitalize the Wilber stuff. 743 00:41:05,900 --> 00:41:08,390 So it's like, hey, there's this Wilber 2 bound. 744 00:41:08,390 --> 00:41:10,820 We can use it to solve a new problem, which is called 745 00:41:10,820 --> 00:41:12,905 key independent optimality. 
746 00:41:23,960 --> 00:41:27,100 Briefly, the idea with key independent optimality 747 00:41:27,100 --> 00:41:29,640 is, suppose you've heard about dynamic optimality. 748 00:41:29,640 --> 00:41:32,510 You know, it's really cool because splay trees and whatnot 749 00:41:32,510 --> 00:41:34,940 seem to really adapt to whatever your inputs are. 750 00:41:34,940 --> 00:41:37,250 But suppose your inputs really don't have keys. 751 00:41:37,250 --> 00:41:42,140 They're just arbitrary objects labeled however, just randomly. 752 00:41:42,140 --> 00:41:44,570 In fact, let's assume that they're labeled randomly. 753 00:41:44,570 --> 00:41:46,340 Suppose the keys are generated randomly 754 00:41:46,340 --> 00:41:48,797 because they're meaningless or just arbitrary things. 755 00:41:48,797 --> 00:41:50,630 So you figure, oh, maybe I'll make it better 756 00:41:50,630 --> 00:41:54,560 and just randomize them completely. 757 00:41:54,560 --> 00:42:06,380 If keys are random, then dynamic OPT 758 00:42:06,380 --> 00:42:09,481 is the same thing up to constant factors as the working set 759 00:42:09,481 --> 00:42:09,980 bound. 760 00:42:15,830 --> 00:42:17,390 That's the theorem. 761 00:42:17,390 --> 00:42:20,120 So this is cool because it means splay trees are actually 762 00:42:20,120 --> 00:42:22,070 optimal in the setting where keys are random. 763 00:42:25,580 --> 00:42:30,200 This is in expectation over the randomized keys. 764 00:42:30,200 --> 00:42:34,070 And the way this theorem is proved is basically-- 765 00:42:34,070 --> 00:42:36,890 so what this is saying is, if we take a point set-- 766 00:42:36,890 --> 00:42:39,290 arbitrary point set-- but then we re-randomize 767 00:42:39,290 --> 00:42:43,350 the x-coordinates-- leave the y-coordinates as they are-- 768 00:42:43,350 --> 00:42:48,106 then you can compute how Wilber 2 behaves. 
769 00:42:48,106 --> 00:42:49,730 Because now you have a bunch of points, 770 00:42:49,730 --> 00:42:53,940 and you're randomly shifting their x-coordinate. 771 00:42:53,940 --> 00:42:58,070 So it's like if you're randomly bouncing around an x 772 00:42:58,070 --> 00:43:00,380 and you're interested in this envelope 773 00:43:00,380 --> 00:43:02,720 on the left and the right, you want to know basically 774 00:43:02,720 --> 00:43:03,605 how many times-- 775 00:43:08,030 --> 00:43:11,950 I guess since I last accessed p, which is here. 776 00:43:11,950 --> 00:43:14,330 We didn't do that here, but in the working set bound 777 00:43:14,330 --> 00:43:16,280 that's part of the deal. 778 00:43:20,960 --> 00:43:23,000 If you look on the left side, it's 779 00:43:23,000 --> 00:43:26,060 like how many times does the max change. 780 00:43:26,060 --> 00:43:28,520 And you may know if you have n random numbers 781 00:43:28,520 --> 00:43:33,530 and you want to know how many times does the max change 782 00:43:33,530 --> 00:43:35,390 as I go left to right, as I take larger 783 00:43:35,390 --> 00:43:37,130 and larger prefixes of those n numbers, 784 00:43:37,130 --> 00:43:40,610 the answer is log n in expectation. 785 00:43:40,610 --> 00:43:43,040 Because the more points you have, the less and less likely 786 00:43:43,040 --> 00:43:46,240 it is for the max to change. 787 00:43:46,240 --> 00:43:51,880 So basically, you show the expected Wilber 788 00:43:51,880 --> 00:43:57,200 2 of a point over this randomization 789 00:43:57,200 --> 00:44:03,750 is theta log ti, where ti is the working set bound. 790 00:44:03,750 --> 00:44:08,260 And so, that gives you the theorem. 791 00:44:08,260 --> 00:44:10,664 This gives you a lower bound of the working set bound. 792 00:44:10,664 --> 00:44:12,580 We have upper bounds of the working set bound, 793 00:44:12,580 --> 00:44:15,190 and therefore that's OPT. 794 00:44:15,190 --> 00:44:17,911 So that's just a very quick sketch.
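The fact about the running maximum of n random numbers can be checked by brute force for small n. The sketch below is a side illustration, not part of Iacono's proof: it averages the number of times the prefix maximum changes over all orderings of n distinct values and compares the result with the harmonic number H_n = 1 + 1/2 + ... + 1/n, which grows like ln n. The function names are made up for this example, and the first element is counted as one change.

```python
from fractions import Fraction
from itertools import permutations
from typing import Sequence


def count_max_changes(seq: Sequence[int]) -> int:
    """Number of positions where the prefix maximum changes
    (the first element counts as a change)."""
    changes = 0
    best = None
    for v in seq:
        if best is None or v > best:
            best = v
            changes += 1
    return changes


def expected_max_changes(n: int) -> Fraction:
    """Exact average of count_max_changes over all n! orderings."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_max_changes(p) for p in perms), len(perms))


def harmonic(n: int) -> Fraction:
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, expected_max_changes(n), harmonic(n))
```

The two columns agree because the i-th element is a new maximum with probability 1/i, so the expectation is the sum of 1/i.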
795 00:44:17,911 --> 00:44:19,660 If you're interested, check out the paper. 796 00:44:22,230 --> 00:44:25,865 That's unfortunately all we know how to do with Wilber 2. 797 00:44:25,865 --> 00:44:27,490 But there's this other bound, Wilber 1, 798 00:44:27,490 --> 00:44:33,030 which seems less good yet we can do a lot more with it. 799 00:44:33,030 --> 00:44:35,140 So let me go to that. 800 00:44:54,470 --> 00:44:57,220 It's a lot easier to analyze algorithms with respect 801 00:44:57,220 --> 00:45:00,440 to Wilber 1. 802 00:45:00,440 --> 00:45:01,230 What's Wilber 1? 803 00:45:04,080 --> 00:45:09,870 We're going to fix something called a lower bound tree. 804 00:45:09,870 --> 00:45:12,660 I'm going to call it P because it's basically 805 00:45:12,660 --> 00:45:16,515 going to be a perfect binary tree on my keys. 806 00:45:19,200 --> 00:45:20,740 This tree never changes. 807 00:45:20,740 --> 00:45:22,320 That's why I say fix. 808 00:45:22,320 --> 00:45:25,250 It is not the binary search tree you're looking for. 809 00:45:25,250 --> 00:45:29,040 It is not the binary search tree that you're interested in. 810 00:45:29,040 --> 00:45:30,540 It's just a thing to think about. 811 00:45:33,930 --> 00:45:41,400 Now, for each node of that tree-- 812 00:45:41,400 --> 00:45:44,700 let's look at this node, I'll give the node a name, y. 813 00:45:47,880 --> 00:45:48,720 So here's y. 814 00:45:52,000 --> 00:45:53,960 There's the left subtree of y, and there's 815 00:45:53,960 --> 00:45:56,360 the right subtree of y. 816 00:45:56,360 --> 00:45:57,766 These are a bunch of keys. 817 00:45:57,766 --> 00:45:59,390 There's keys that are to the left of y. 818 00:45:59,390 --> 00:46:01,100 There's keys to the right of y. 819 00:46:01,100 --> 00:46:02,840 There's keys outside the subtree. 820 00:46:02,840 --> 00:46:04,880 We're going to ignore those.
821 00:46:04,880 --> 00:46:08,030 I want to look at the accesses to these keys and accesses 822 00:46:08,030 --> 00:46:10,220 to these keys and see how many times do I 823 00:46:10,220 --> 00:46:13,040 switch between left and right. 824 00:46:13,040 --> 00:46:19,730 So count the number of alternations-- 825 00:46:19,730 --> 00:46:22,887 so very similar in spirit to Wilber 2, 826 00:46:22,887 --> 00:46:24,470 it's just relative to this weird tree, 827 00:46:24,470 --> 00:46:26,045 which is kind of arbitrary-- 828 00:46:29,090 --> 00:46:34,880 in the access sequence-- which is x1 up to xn-- 829 00:46:34,880 --> 00:46:48,020 between left and right subtrees of y. 830 00:46:48,020 --> 00:46:50,240 So we're going to ignore accesses to y itself. 831 00:46:50,240 --> 00:46:53,240 We're going to ignore accesses to keys outside the subtree of y. 832 00:46:53,240 --> 00:46:56,990 Just look at how many times do I switch between left and right. 833 00:46:56,990 --> 00:46:58,322 That's a lower bound. 834 00:46:58,322 --> 00:46:59,030 That's the claim. 835 00:47:09,550 --> 00:47:11,200 It's a lower bound for the same reason 836 00:47:11,200 --> 00:47:14,800 we use the independent rectangle lower bound. 837 00:47:14,800 --> 00:47:17,710 And the claim is, if you look at these alternations, 838 00:47:17,710 --> 00:47:20,370 draw the corresponding rectangles-- 839 00:47:26,170 --> 00:47:27,820 so over here, we had a vertical line 840 00:47:27,820 --> 00:47:30,820 which corresponded to the key, and we see how many times do we 841 00:47:30,820 --> 00:47:32,020 cross the line. 842 00:47:32,020 --> 00:47:38,874 Basically, the same thing over here except now 843 00:47:38,874 --> 00:47:41,290 there's one big vertical line that corresponds to the root 844 00:47:41,290 --> 00:47:43,330 node, then there's some vertical lines that 845 00:47:43,330 --> 00:47:45,670 correspond to this node and this node, 846 00:47:45,670 --> 00:47:48,350 and you're interested in the access sequence.
847 00:47:48,350 --> 00:47:51,591 How many times-- let's do some kind of access sequence like 848 00:47:51,591 --> 00:47:52,090 this-- 849 00:47:55,890 --> 00:47:56,920 these are our points-- 850 00:48:02,590 --> 00:48:06,220 and you just look at what lines are you crossing. 851 00:48:08,830 --> 00:48:11,127 So like this crosses the big line. 852 00:48:11,127 --> 00:48:13,210 So that's going to be one alternation between left 853 00:48:13,210 --> 00:48:14,157 and right here. 854 00:48:14,157 --> 00:48:16,240 Here's another alternation between left and right. 855 00:48:16,240 --> 00:48:18,820 Here is another alternation between left and right. 856 00:48:18,820 --> 00:48:22,750 Here's another alternation between left and right. 857 00:48:22,750 --> 00:48:24,880 And one more. 858 00:48:24,880 --> 00:48:27,280 So for the big line, for the root node, 859 00:48:27,280 --> 00:48:29,530 that's how many times you cross between left and right 860 00:48:29,530 --> 00:48:31,750 relative to the root. 861 00:48:31,750 --> 00:48:34,140 Then, for the left subtree of the root, 862 00:48:34,140 --> 00:48:36,120 there's one crossing here. 863 00:48:36,120 --> 00:48:40,060 There is one crossing here, one crossing here. 864 00:48:42,940 --> 00:48:45,330 These are touching, but they're not satisfied. 865 00:48:45,330 --> 00:48:46,200 So it's OK. 866 00:48:46,200 --> 00:48:48,540 The claim is all these rectangles will be independent. 867 00:48:48,540 --> 00:48:52,010 Again, I won't prove that formally, but it's true. 868 00:48:55,010 --> 00:48:55,940 OK? 869 00:48:55,940 --> 00:48:58,790 Rough sketch. 870 00:48:58,790 --> 00:49:00,250 So that's Wilber 1. 871 00:49:00,250 --> 00:49:02,600 It's, again, an independent rectangle lower bound. 872 00:49:02,600 --> 00:49:04,944 It's a little weird because it depends on this tree. 873 00:49:04,944 --> 00:49:06,860 You could choose it to be a nice perfect tree. 874 00:49:06,860 --> 00:49:07,970 You could choose it to be a different tree.
875 00:49:07,970 --> 00:49:10,344 You'll get a different lower bound each time. 876 00:49:10,344 --> 00:49:12,260 So of course, you take the max over all trees. 877 00:49:12,260 --> 00:49:16,100 That will give you the biggest Wilber 1 lower bound. 878 00:49:16,100 --> 00:49:21,140 We don't know much about that biggest Wilber 1 lower bound. 879 00:49:21,140 --> 00:49:26,960 I guess you could ask the following open question. 880 00:49:26,960 --> 00:49:30,380 Is it true that for every access sequence 881 00:49:30,380 --> 00:49:37,700 there exists a tree p such that Wilber 1 is theta OPT? 882 00:49:37,700 --> 00:49:39,920 Or is theta Wilber 2 or something? 883 00:49:39,920 --> 00:49:41,940 Wilber 2 is a single quantity. 884 00:49:41,940 --> 00:49:43,005 You compute it. 885 00:49:43,005 --> 00:49:44,420 It gives you a bound. 886 00:49:44,420 --> 00:49:46,790 Wilber 1, it depends on this p. 887 00:49:46,790 --> 00:49:48,950 Maybe if you choose the best p for your sequence 888 00:49:48,950 --> 00:49:50,190 you get the right answer. 889 00:49:50,190 --> 00:49:55,550 But it's definitely the case that Wilber 1 for a fixed p 890 00:49:55,550 --> 00:49:58,850 is not the right answer. 891 00:49:58,850 --> 00:50:00,740 I recall that's easy to prove. 892 00:50:06,580 --> 00:50:08,690 Well, maybe we'll come back to that. 893 00:50:08,690 --> 00:50:10,017 Yeah, question? 894 00:50:10,017 --> 00:50:11,686 AUDIENCE: So how do you construct 895 00:50:11,686 --> 00:50:12,879 this lower bound tree? 896 00:50:12,879 --> 00:50:13,932 Like, is it just-- 897 00:50:13,932 --> 00:50:16,140 ERIK DEMAINE: I'll tell you what we're going to use-- 898 00:50:16,140 --> 00:50:17,884 the question is how do we construct p. 899 00:50:17,884 --> 00:50:19,300 You can make it whatever you want. 900 00:50:19,300 --> 00:50:21,870 What we're going to use is the perfect tree, 901 00:50:21,870 --> 00:50:23,730 which is sort of unique. 902 00:50:23,730 --> 00:50:27,076 It's kind of arbitrary, but it works. 
903 00:50:27,076 --> 00:50:28,950 It has the property that its height is log n. 904 00:50:28,950 --> 00:50:30,300 That's all we need. 905 00:50:30,300 --> 00:50:32,430 We're going to use that to get tango trees. 906 00:50:32,430 --> 00:50:34,500 Other questions? 907 00:50:34,500 --> 00:50:36,600 All right. 908 00:50:36,600 --> 00:50:39,270 Let me briefly mention a fun access sequence. 909 00:50:45,315 --> 00:50:48,990 You may recognize this sequence. 910 00:50:48,990 --> 00:50:52,080 This would be in-order traversal in binary. 911 00:50:52,080 --> 00:50:53,760 But if I take these bit sequences 912 00:50:53,760 --> 00:51:09,311 and read them backwards, then I get 0, 4, 2, 6, 1, 5, 3, 7. 913 00:51:09,311 --> 00:51:11,310 These are the numbers 0 through 7 in a funny order. 914 00:51:11,310 --> 00:51:13,890 It's called the bit reversal sequence. 915 00:51:13,890 --> 00:51:24,000 If you access 0, 4, 2, 6, 1, 5, 3, 7 in a perfect binary tree, 916 00:51:24,000 --> 00:51:26,410 it maximizes Wilber 1. 917 00:51:26,410 --> 00:51:36,960 So in-order traversal-- 0, 1, 2, 3, 4, 5, 6. 918 00:51:36,960 --> 00:51:38,010 Ignore 7. 919 00:51:38,010 --> 00:51:39,647 There's no 7 in this tree. 920 00:51:43,135 --> 00:51:46,050 I do 0, 4-- 921 00:51:46,050 --> 00:51:48,640 if you look at the root, it alternates left, 922 00:51:48,640 --> 00:51:53,820 right, left, right, left, right. 923 00:51:53,820 --> 00:51:56,500 Because the high-order bit is switching every time, 924 00:51:56,500 --> 00:51:58,614 and so whether I go to the left of the tree here 925 00:51:58,614 --> 00:52:00,780 or the right of the tree, it's switching every time. 926 00:52:00,780 --> 00:52:02,321 And also, if you look in any subtree, 927 00:52:02,321 --> 00:52:04,710 like when I'm accessing things within the subtree of one, 928 00:52:04,710 --> 00:52:06,530 it alternates, too. 929 00:52:06,530 --> 00:52:09,030 It's too small a tree to really see that happening, 930 00:52:09,030 --> 00:52:10,600 but it's true.
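A minimal sketch of the bit-reversal example, to make the alternation count concrete. The function names bit_reversal and wilber1 are made up for illustration. It also assumes, as a simplification of the lecture's tree p, that the keys sit at the leaves of a perfect tree, so a node covering a key range sends the lower half left and the upper half right.

```python
def bit_reversal(k):
    """Return 0 .. 2**k - 1 with each k-bit binary string read backwards."""
    return [int(format(i, f"0{k}b")[::-1], 2) for i in range(2 ** k)]


def wilber1(accesses, lo, hi):
    """Sum the left/right alternations over every node of a perfect tree.

    Assumption of this sketch: keys lo .. hi-1 sit at the leaves, so the node
    covering [lo, hi) sends keys below the midpoint left and the rest right.
    """
    if hi - lo <= 1:
        return 0  # a single leaf has nothing to alternate between
    mid = (lo + hi) // 2
    # Label each access that falls in this subtree as left (True) or right (False).
    sides = [x < mid for x in accesses if lo <= x < hi]
    alternations = sum(a != b for a, b in zip(sides, sides[1:]))
    return alternations + wilber1(accesses, lo, mid) + wilber1(accesses, mid, hi)


print(bit_reversal(3))                  # the sequence from the lecture
print(wilber1(bit_reversal(3), 0, 8))   # bit reversal alternates at every node
print(wilber1(list(range(8)), 0, 8))    # sorted order barely alternates
```

Under these assumptions the bit-reversal order on 8 keys gives 17 alternations, while accessing the same keys in sorted order gives only 7.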
931 00:52:10,600 --> 00:52:15,380 And so, if you do this for k bits, 932 00:52:15,380 --> 00:52:18,060 n equals 2 to the k roughly. 933 00:52:18,060 --> 00:52:21,680 And Wilber 1, the lower bound, is 934 00:52:21,680 --> 00:52:27,210 log n per [INAUDIBLE] because every access alternates. 935 00:52:27,210 --> 00:52:29,622 So if you look at a subtree, whatever 936 00:52:29,622 --> 00:52:31,080 the size of that subtree is, that's 937 00:52:31,080 --> 00:52:33,060 how many alternations there are. 938 00:52:33,060 --> 00:52:38,070 And so, number of alternations is theta n log n 939 00:52:38,070 --> 00:52:41,780 because it's the sum over all nodes of their subtree sizes. 940 00:52:41,780 --> 00:52:48,300 And so OPT is theta n log n. 941 00:52:48,300 --> 00:52:50,420 We know we can achieve n log n-- 942 00:52:50,420 --> 00:52:53,520 this is to do n accesses-- 943 00:52:53,520 --> 00:52:55,899 we know we can do n log n with a red-black tree or whatever, 944 00:52:55,899 --> 00:52:57,690 but there's actually a lower bound of n log 945 00:52:57,690 --> 00:52:59,730 n, meaning all binary search trees-- 946 00:52:59,730 --> 00:53:01,590 if you're given this access sequence, 947 00:53:01,590 --> 00:53:04,080 doesn't matter what you're doing-- you have to pay 948 00:53:04,080 --> 00:53:04,770 n log n. 949 00:53:04,770 --> 00:53:06,680 It's kind of cool. 950 00:53:06,680 --> 00:53:07,990 A little side effect-- 951 00:53:07,990 --> 00:53:09,840 that's how Wilber's paper ended. 952 00:53:09,840 --> 00:53:13,350 It's like, hey, cool, can find one access sequence that 953 00:53:13,350 --> 00:53:17,340 is bad for everybody. 954 00:53:17,340 --> 00:53:20,480 But now we're going to use Wilber 1 955 00:53:20,480 --> 00:53:22,890 to get one binary search tree that's pretty 956 00:53:22,890 --> 00:53:25,770 good for all access sequences. 957 00:53:25,770 --> 00:53:28,590 Pretty good meaning within a log log n factor of optimal.
958 00:53:51,280 --> 00:53:59,152 And this is tango trees, which would 959 00:53:59,152 --> 00:54:04,400 be log log n competitive online binary search trees. 960 00:54:09,380 --> 00:54:11,270 Why are they called tango trees? 961 00:54:11,270 --> 00:54:14,330 People made up all sorts of reasons, but I can tell you-- 962 00:54:14,330 --> 00:54:16,010 because I was there-- 963 00:54:16,010 --> 00:54:20,810 they were invented mostly on a flight from New York 964 00:54:20,810 --> 00:54:24,650 to Buenos Aires, which is the center of tango. 965 00:54:24,650 --> 00:54:26,900 I bought this T-shirt I think the day after. 966 00:54:26,900 --> 00:54:29,150 And then that week, we wrote the paper, 967 00:54:29,150 --> 00:54:30,610 and that was tango trees. 968 00:54:30,610 --> 00:54:35,630 So no particular reason, but it sounds good. 969 00:54:35,630 --> 00:54:37,130 Always good to have a cool name. 970 00:54:37,130 --> 00:54:39,095 So the secret is revealed. 971 00:54:39,095 --> 00:54:46,100 The true meaning of tango trees is nothing, but we'll see. 972 00:54:46,100 --> 00:54:48,630 So how do they work? 973 00:54:48,630 --> 00:54:49,640 It's very simple. 974 00:54:49,640 --> 00:54:55,040 Basically, we take Wilber 1 and we simulate it. 975 00:54:55,040 --> 00:54:59,310 So let me be more precise. 976 00:54:59,310 --> 00:55:05,870 There's one key idea, which is to look at the preferred 977 00:55:05,870 --> 00:55:10,510 child of a node. 978 00:55:15,870 --> 00:55:19,242 I'm going to say the preferred child is left. 979 00:55:19,242 --> 00:55:23,818 Let's see, node y in p. 980 00:55:23,818 --> 00:55:38,060 It's left if we accessed some node in the left subtree of y 981 00:55:38,060 --> 00:55:38,675 most recently. 982 00:55:43,300 --> 00:55:46,760 It's the right child if we accessed 983 00:55:46,760 --> 00:55:48,960 something in the right subtree most recently.
984 00:55:48,960 --> 00:55:52,850 So we're just looking at left and right subtree accesses, 985 00:55:52,850 --> 00:55:54,240 what was most recent? 986 00:55:54,240 --> 00:55:56,390 There is a special case in the beginning, which 987 00:55:56,390 --> 00:55:58,681 is you don't have a preferred child because you haven't 988 00:55:58,681 --> 00:56:00,710 accessed either left or right yet. 989 00:56:00,710 --> 00:56:12,365 So this is if no access to the left or right yet. 990 00:56:12,365 --> 00:56:14,340 So that just happens in the beginning. 991 00:56:14,340 --> 00:56:17,180 Once you've touched everything, everybody 992 00:56:17,180 --> 00:56:19,970 will have a left or right preferred child. 993 00:56:19,970 --> 00:56:23,780 So this is just what was your most recent child. 994 00:56:23,780 --> 00:56:26,580 This is like a parent with a very short memory. 995 00:56:26,580 --> 00:56:31,400 Just whichever child I most recently talked to, 996 00:56:31,400 --> 00:56:34,292 that is my preferred child at the moment. 997 00:56:34,292 --> 00:56:35,750 It's kind of like I don't know when 998 00:56:35,750 --> 00:56:38,575 you're going to job interviews. 999 00:56:38,575 --> 00:56:40,160 You know, the most recent interview 1000 00:56:40,160 --> 00:56:42,702 is the one you remember most fondly and so, ah, 1001 00:56:42,702 --> 00:56:44,660 you like that one the best independent of which 1002 00:56:44,660 --> 00:56:45,570 is the coolest. 1003 00:56:45,570 --> 00:56:48,620 So let me draw a picture. 1004 00:56:48,620 --> 00:56:52,520 And I guess I'm going to draw a big picture-- 1005 00:56:52,520 --> 00:57:00,121 my favorite-- a perfectly balanced binary search tree 1006 00:57:00,121 --> 00:57:02,810 with eight leaves. 1007 00:57:02,810 --> 00:57:10,840 And so now, suppose that every node has a preferred child. 1008 00:57:10,840 --> 00:57:12,710 Let's say they all do just because it makes 1009 00:57:12,710 --> 00:57:13,834 a more interesting picture. 
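The preferred-child rule can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's code: the tree p is the implicit perfect BST on keys 0 .. n-1 whose root is the middle key, the name access is hypothetical, and accessing a node itself leaves that node's preferred child unchanged (a literal reading of the definition above).

```python
def access(preferred, x, n):
    """Walk from the root of the implicit perfect BST on keys 0 .. n-1 down to x.

    `preferred` maps a node's key to 'left' or 'right'; a missing key means
    no preferred child yet. Returns how many nodes switched preferred child,
    i.e. how many non-preferred edges the search followed.
    """
    lo, hi = 0, n
    switched = 0
    while lo < hi:
        y = (lo + hi) // 2          # key stored at the current node
        if x == y:
            break                   # assumption: reaching y itself changes nothing at y
        side = "left" if x < y else "right"
        if preferred.get(y) not in (None, side):
            switched += 1           # most recent access went the other way
        preferred[y] = side
        lo, hi = (lo, y) if x < y else (y + 1, hi)
    return switched


preferred = {}
for key in [0, 6, 2]:
    print(key, access(preferred, key, 7))
print(preferred)
```

In this sketch the first visit to a node sets its preferred child without counting as a switch, which matches the special case where a node has no preferred child at the beginning.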
1010 00:57:17,240 --> 00:57:20,780 I'm going to draw that with a big fat arrow. 1011 00:57:20,780 --> 00:57:23,390 And now, what does that do? 1012 00:57:23,390 --> 00:57:24,950 It decomposes our tree. 1013 00:57:24,950 --> 00:57:27,860 This is the perfect tree. p is going to be perfectly balanced, 1014 00:57:27,860 --> 00:57:28,700 log n height. 1015 00:57:28,700 --> 00:57:30,290 It could be any log n height tree, 1016 00:57:30,290 --> 00:57:32,990 but we'll make it perfect. 1017 00:57:32,990 --> 00:57:38,840 And it decomposes that tree into paths. 1018 00:57:38,840 --> 00:57:40,120 And there's a path here. 1019 00:57:40,120 --> 00:57:43,880 You just keep following parent pointers, you get a path-- 1020 00:57:43,880 --> 00:57:45,680 not parent pointers, preferred pointers. 1021 00:57:45,680 --> 00:57:47,600 It's also true if you follow parent pointers you get a path, 1022 00:57:47,600 --> 00:57:49,160 but they'll overlap each other. 1023 00:57:49,160 --> 00:57:50,660 You follow preferred child pointers, 1024 00:57:50,660 --> 00:57:52,050 you get non-overlapping paths. 1025 00:57:54,515 --> 00:57:55,140 There they are. 1026 00:57:55,140 --> 00:57:58,950 We also get these singleton paths at the leaves. 1027 00:57:58,950 --> 00:58:01,170 Some of the leaves are in singleton paths. 1028 00:58:01,170 --> 00:58:02,685 These are called preferred paths. 1029 00:58:12,820 --> 00:58:15,635 Why do I care? 1030 00:58:15,635 --> 00:58:19,860 So this tells me the most recently accessed element 1031 00:58:19,860 --> 00:58:21,940 was somebody on this path. 1032 00:58:21,940 --> 00:58:22,911 I don't quite know who. 1033 00:58:22,911 --> 00:58:24,910 It could have been this one, and that would say, 1034 00:58:24,910 --> 00:58:26,410 OK, this is the most recent direction 1035 00:58:26,410 --> 00:58:27,110 we went through all of them. 1036 00:58:27,110 --> 00:58:28,270 Let's say it's that one. 1037 00:58:28,270 --> 00:58:30,280 Now suppose I access this node. 
1038 00:58:30,280 --> 00:58:32,410 What does that tell me? 1039 00:58:32,410 --> 00:58:35,020 Well, if I most recently accessed left here 1040 00:58:35,020 --> 00:58:38,650 and now I'm accessing the right, if you look at this node, 1041 00:58:38,650 --> 00:58:40,840 the Wilber 1 bound goes up by 1. 1042 00:58:40,840 --> 00:58:42,400 Because I just accessed left. 1043 00:58:42,400 --> 00:58:44,170 Now I accessed right. 1044 00:58:44,170 --> 00:58:49,120 Also, if I access this node, this guy, his Wilber 1 bound 1045 00:58:49,120 --> 00:58:51,430 goes up by 1 because now he's going to the right, 1046 00:58:51,430 --> 00:58:53,200 whereas last time he went to the left. 1047 00:58:53,200 --> 00:58:56,170 Also, this node previously went to the right 1048 00:58:56,170 --> 00:58:57,580 and went to the left. 1049 00:58:57,580 --> 00:59:02,200 So Wilber 1 went up because of this edge, 1050 00:59:02,200 --> 00:59:03,920 and it went up because of this edge. 1051 00:59:03,920 --> 00:59:07,240 In general, following non-preferred edges, 1052 00:59:07,240 --> 00:59:10,240 I can pay for because Wilber 1 goes up by 1 1053 00:59:10,240 --> 00:59:12,190 every time I use a non-preferred edge. 1054 00:59:12,190 --> 00:59:15,280 This is another way to state the Wilber 1 bound. 1055 00:59:15,280 --> 00:59:17,320 This is the cool thing. 1056 00:59:17,320 --> 00:59:20,350 As long as I can go through a path quickly-- 1057 00:59:23,419 --> 00:59:25,210 ideally, if I could do it in constant time, 1058 00:59:25,210 --> 00:59:27,460 this would be a dynamically-optimal binary 1059 00:59:27,460 --> 00:59:27,990 search tree. 1060 00:59:27,990 --> 00:59:30,573 If I could instantly transport to where I need to go on a path 1061 00:59:30,573 --> 00:59:32,680 and then jump off the path to the next path, 1062 00:59:32,680 --> 00:59:35,900 that I can pay for-- 1063 00:59:35,900 --> 00:59:38,590 I can spend constant time to do that-- 1064 00:59:38,590 --> 00:59:40,125 then I'd be OK. 
1065 00:59:40,125 --> 00:59:42,250 I'm not going to be able to do it in constant time, 1066 00:59:42,250 --> 00:59:44,935 but I'm going to be able to do it in log log n time. 1067 00:59:44,935 --> 00:59:47,710 I'm going to be able to jump through a path in log log n 1068 00:59:47,710 --> 00:59:49,270 time, and then jump-- 1069 00:59:49,270 --> 00:59:51,730 figure out where I need to diverge from the path 1070 00:59:51,730 --> 00:59:53,364 because maybe I'm accessing this guy. 1071 00:59:53,364 --> 00:59:54,280 Jump to the next path. 1072 00:59:54,280 --> 00:59:56,099 Do that in log log n time. 1073 00:59:56,099 --> 00:59:57,640 I've got to update the path structure 1074 00:59:57,640 --> 00:59:59,620 because now the preferred child is to the right. 1075 00:59:59,620 --> 01:00:00,703 It used to be to the left. 1076 01:00:00,703 --> 01:00:05,886 So I've got to do something that will only cost log log n time. 1077 01:00:05,886 --> 01:00:08,260 If I can do that, the lower bound is the number of non-preferred edges. 1078 01:00:08,260 --> 01:00:11,390 The upper bound is the number of non-preferred edges 1079 01:00:11,390 --> 01:00:13,250 times log log n. 1080 01:00:13,250 --> 01:00:21,190 So we get a lower bound Wilber 1, 1081 01:00:21,190 --> 01:00:22,930 which is going to be equal to the number 1082 01:00:22,930 --> 01:00:25,480 of non-preferred edges. 1083 01:00:29,099 --> 01:00:30,640 And we're going to get an upper bound 1084 01:00:30,640 --> 01:00:35,950 through tango trees, which is going 1085 01:00:35,950 --> 01:00:39,430 to be order number of non-preferred edges times 1086 01:00:39,430 --> 01:00:42,080 log log n. 1087 01:00:42,080 --> 01:00:42,580 OK. 1088 01:00:42,580 --> 01:00:44,230 Why is it log log n? 1089 01:00:44,230 --> 01:00:47,710 Because each of these paths has length only log n. 1090 01:00:47,710 --> 01:00:50,500 So put them in a balanced binary search tree, 1091 01:00:50,500 --> 01:00:53,350 and it has height log log n.
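A small numeric sketch of why a preferred path costs only log log n to search. The helper balanced_height is a hypothetical name, and height here is counted in levels: it folds m path nodes into a balanced binary search tree and reports how many levels that tree has.

```python
def balanced_height(m):
    """Number of levels in a balanced binary search tree holding m nodes."""
    if m == 0:
        return 0
    left = (m - 1) // 2        # the root takes one node, the rest split evenly
    right = m - 1 - left
    return 1 + max(balanced_height(left), balanced_height(right))


# With n = 2**16 keys, a preferred path in a perfect tree has about 16 nodes.
for path_length in [4, 16, 64]:
    print(path_length, balanced_height(path_length))
```

A path of 16 nodes folds into a tree with 5 levels, so the search inside one auxiliary tree grows with the logarithm of the path length, which is itself logarithmic in n.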
1092 01:00:53,350 --> 01:00:57,220 So take these paths, squish them into a tree-- 1093 01:00:57,220 --> 01:01:01,430 it's hard, I don't know which way you're squishing. 1094 01:01:01,430 --> 01:01:02,830 It says log n depth. 1095 01:01:02,830 --> 01:01:03,990 It's a path. 1096 01:01:03,990 --> 01:01:05,680 I'm going to fold it into a tree. 1097 01:01:05,680 --> 01:01:07,450 So it has height only log log n. 1098 01:01:07,450 --> 01:01:09,886 Then I can jump around it in log log n time. 1099 01:01:09,886 --> 01:01:11,260 That's the idea with tango trees. 1100 01:01:11,260 --> 01:01:12,345 You're basically done. 1101 01:01:12,345 --> 01:01:15,550 A few details in how they work. 1102 01:01:15,550 --> 01:01:17,510 I don't want to spend too much time on them, 1103 01:01:17,510 --> 01:01:18,926 but let's go through some of them. 1104 01:01:42,410 --> 01:01:53,930 So we're going to store each preferred path 1105 01:01:53,930 --> 01:01:59,840 as an auxiliary tree, which is just-- 1106 01:01:59,840 --> 01:02:02,650 I don't know-- a red-black tree, say. 1107 01:02:07,800 --> 01:02:10,200 What is the red-black tree sorted by? 1108 01:02:10,200 --> 01:02:11,710 I don't have a choice. 1109 01:02:11,710 --> 01:02:13,500 Whatever I do has to be a binary search 1110 01:02:13,500 --> 01:02:15,610 tree among the original keys. 1111 01:02:15,610 --> 01:02:18,180 So if I take these items and I just throw them 1112 01:02:18,180 --> 01:02:21,854 into a red-black tree, they will be sorted by whatever 1113 01:02:21,854 --> 01:02:22,770 their x-coordinate is. 1114 01:02:22,770 --> 01:02:24,910 So this is the max, this is the min. 1115 01:02:24,910 --> 01:02:26,480 This is somewhere in between. 1116 01:02:26,480 --> 01:02:28,320 This is to the left of that. 1117 01:02:28,320 --> 01:02:29,640 So the order is a little weird. 1118 01:02:29,640 --> 01:02:31,980 I'd really like to store them sorted by depth, 1119 01:02:31,980 --> 01:02:33,390 but I can't do that. 
1120 01:02:33,390 --> 01:02:34,965 They are sorted by their key values. 1121 01:02:38,430 --> 01:02:42,740 Now, what do I need to do with these auxiliary trees? 1122 01:02:42,740 --> 01:02:46,110 I mean, the basic thing I do is a search, right? 1123 01:02:46,110 --> 01:02:47,330 I'm searching for a key. 1124 01:02:47,330 --> 01:02:49,580 It's a binary search tree, so I can still do a search. 1125 01:02:49,580 --> 01:02:52,740 I can figure out this tree gets represented 1126 01:02:52,740 --> 01:02:56,040 as something more like this. 1127 01:02:56,040 --> 01:02:59,390 That would be a nicely balanced version of these four nodes. 1128 01:02:59,390 --> 01:03:05,250 So if I called them, I don't know, a, b, c, d. 1129 01:03:05,250 --> 01:03:06,900 That's their sorted order. 1130 01:03:06,900 --> 01:03:10,290 It's going to be a, b, c, d. 1131 01:03:10,290 --> 01:03:12,502 That's also their sorted order over here. 1132 01:03:12,502 --> 01:03:14,460 So if I search for my key, I'll figure out, oh, 1133 01:03:14,460 --> 01:03:18,090 do I fall off here, here, here, here, or here? 1134 01:03:18,090 --> 01:03:20,400 Now, each of those corresponds to another path 1135 01:03:20,400 --> 01:03:21,420 I need to visit. 1136 01:03:21,420 --> 01:03:23,730 So if I fall off the left side of a, 1137 01:03:23,730 --> 01:03:26,940 then I should have a pointer to this structure. 1138 01:03:26,940 --> 01:03:29,486 If I fall off the-- 1139 01:03:29,486 --> 01:03:32,640 I guess these two are empty. 1140 01:03:32,640 --> 01:03:35,820 Those correspond to these two places. 1141 01:03:35,820 --> 01:03:40,920 If I fall off here, the right side of c, which is now here, 1142 01:03:40,920 --> 01:03:45,240 this is going to be a pointer to my new structure 1143 01:03:45,240 --> 01:03:48,570 which corresponds to this one. 1144 01:03:48,570 --> 01:03:51,652 And then this one is going to correspond to all this stuff-- 1145 01:03:51,652 --> 01:03:52,860 well, in particular this one. 
1146 01:03:55,960 --> 01:03:59,970 It's a little hard to draw this picture, but you get the idea. 1147 01:03:59,970 --> 01:04:02,220 You just rebalance each of these things. 1148 01:04:02,220 --> 01:04:05,160 Keep the pointers between the preferred paths 1149 01:04:05,160 --> 01:04:06,774 just as they were. 1150 01:04:06,774 --> 01:04:08,940 This is uniquely defined how to do this because it's 1151 01:04:08,940 --> 01:04:11,340 a binary search tree. 1152 01:04:11,340 --> 01:04:20,450 So leaves point to other-- 1153 01:04:20,450 --> 01:04:24,775 let's call them child auxiliary trees. 1154 01:04:24,775 --> 01:04:26,640 It uniquely defines which ones they 1155 01:04:26,640 --> 01:04:29,160 have to point to in order to still navigate 1156 01:04:29,160 --> 01:04:30,760 the whole structure. 1157 01:04:30,760 --> 01:04:33,810 So it's a weird way of rebalancing your tree. 1158 01:04:33,810 --> 01:04:36,360 And the point is each of these red-black trees has height log 1159 01:04:36,360 --> 01:04:39,750 log n because the number of nodes in it is only log n. 1160 01:04:39,750 --> 01:04:41,626 And that gives us the bound. 1161 01:04:55,620 --> 01:05:03,330 Now, key thing to think about is what happens when you change-- 1162 01:05:03,330 --> 01:05:05,010 I said I have to be able to achieve 1163 01:05:05,010 --> 01:05:07,990 number of non-preferred edges times log log n. 1164 01:05:07,990 --> 01:05:08,490 So fine. 1165 01:05:08,490 --> 01:05:10,920 I do a log log n search in here. 1166 01:05:10,920 --> 01:05:12,850 Maybe I decide I have to go off here. 1167 01:05:12,850 --> 01:05:14,589 Then I do a log log n search in here. 1168 01:05:14,589 --> 01:05:16,130 And then maybe I have to go this way. 1169 01:05:16,130 --> 01:05:18,360 So number of non-preferred edges was 2. 1170 01:05:18,360 --> 01:05:20,440 I did two, maybe three searches. 1171 01:05:20,440 --> 01:05:21,880 Fine.
1172 01:05:21,880 --> 01:05:23,910 It's going to be number of non-preferred edges 1173 01:05:23,910 --> 01:05:25,410 plus 1 times log log n. 1174 01:05:25,410 --> 01:05:25,980 No big deal. 1175 01:05:33,600 --> 01:05:35,120 Now I have to update. 1176 01:05:35,120 --> 01:05:37,710 Now this is the preferred edge from the root, 1177 01:05:37,710 --> 01:05:41,070 and this is the preferred edge from this node. 1178 01:05:41,070 --> 01:05:43,930 How do I update preferred edges? 1179 01:05:43,930 --> 01:05:45,400 That's something to think about. 1180 01:05:45,400 --> 01:05:49,480 So I've got a path represented by a red-black tree. 1181 01:05:49,480 --> 01:05:54,360 And now I fall off here, and there's another path here. 1182 01:05:54,360 --> 01:06:00,960 I need to convert this into a path that goes like this 1183 01:06:00,960 --> 01:06:02,490 and then does this. 1184 01:06:02,490 --> 01:06:05,040 And separately, a path that does this. 1185 01:06:05,040 --> 01:06:07,110 That's the new version. 1186 01:06:07,110 --> 01:06:08,120 How do I do that? 1187 01:06:08,120 --> 01:06:11,170 Conceptually, it's pretty simple. 1188 01:06:11,170 --> 01:06:18,630 I want to cut the path here and then rejoin along there, 1189 01:06:18,630 --> 01:06:20,490 like that. 1190 01:06:20,490 --> 01:06:23,700 So conceptually, if things were stored by depth, 1191 01:06:23,700 --> 01:06:26,060 this is what we'd call a split and a concatenate. 1192 01:06:26,060 --> 01:06:28,680 You should know this from regular binary search trees. 1193 01:06:28,680 --> 01:06:31,530 This is a standard exercise for red-black trees. 1194 01:06:31,530 --> 01:06:35,670 Given a query, x, you can cut this tree into two halves 1195 01:06:35,670 --> 01:06:39,000 and get two red-black trees, which represent everything 1196 01:06:39,000 --> 01:06:42,960 to the left of x and everything to the right of x.
1197 01:06:42,960 --> 01:06:44,499 Similarly, given two trees that are 1198 01:06:44,499 --> 01:06:46,290 sorted like this where all the elements are 1199 01:06:46,290 --> 01:06:48,112 less than all the elements over here, 1200 01:06:48,112 --> 01:06:50,070 I can concatenate them into one red-black tree. 1201 01:06:50,070 --> 01:06:51,809 And all of these take log n time, 1202 01:06:51,809 --> 01:06:53,100 where n is the number of nodes. 1203 01:06:53,100 --> 01:06:56,790 Here, that would be log log n time. 1204 01:06:56,790 --> 01:06:58,620 In this world, it's not quite so simple 1205 01:06:58,620 --> 01:07:00,600 because things are not sorted by depth. 1206 01:07:00,600 --> 01:07:02,670 They're sorted by key value. 1207 01:07:02,670 --> 01:07:04,860 But it's not so bad. 1208 01:07:04,860 --> 01:07:12,730 Because, if you look at some path and you want to say, 1209 01:07:12,730 --> 01:07:21,150 OK, I want everything that's below this key value 1210 01:07:21,150 --> 01:07:24,300 or something, then that's the same as saying, 1211 01:07:24,300 --> 01:07:27,790 well, take everything that is within this interval of keys. 1212 01:07:27,790 --> 01:07:29,860 So it's strictly between here and here. 1213 01:07:32,900 --> 01:07:34,460 Let me redraw this slightly. 1214 01:07:45,020 --> 01:07:54,312 So if you look at the nodes of depth greater than d, 1215 01:07:54,312 --> 01:07:56,350 I want to cut off everybody that's 1216 01:07:56,350 --> 01:07:58,120 deeper than a particular spot in order 1217 01:07:58,120 --> 01:08:01,760 to do this kind of change. 1218 01:08:01,760 --> 01:08:12,970 These are equal to nodes in subtree of that. 1219 01:08:12,970 --> 01:08:16,850 So let me give it a name. 1220 01:08:16,850 --> 01:08:18,830 Let's say I want to cut here. 1221 01:08:18,830 --> 01:08:21,580 So I'm going to look at this node y. 1222 01:08:21,580 --> 01:08:24,580 This is nodes in the subtree of y. 
1223 01:08:24,580 --> 01:08:26,290 All of the nodes that are below y 1224 01:08:26,290 --> 01:08:30,790 are obviously going to have greater depth than that path. 1225 01:08:30,790 --> 01:08:31,960 This is nodes in a path. 1226 01:08:35,229 --> 01:08:41,920 And nodes in a subtree are equal to nodes 1227 01:08:41,920 --> 01:08:50,109 with keys in the min of that subtree 1228 01:08:50,109 --> 01:08:51,600 to the max of that subtree. 1229 01:08:51,600 --> 01:08:53,899 It's an interval. 1230 01:08:53,899 --> 01:08:55,090 So what do I do? 1231 01:08:55,090 --> 01:08:56,590 I split at min of y. 1232 01:08:56,590 --> 01:08:58,660 I split at max of y. 1233 01:08:58,660 --> 01:09:00,010 That gives me the interval. 1234 01:09:00,010 --> 01:09:01,210 So here's the picture. 1235 01:09:01,210 --> 01:09:02,290 I have a tree. 1236 01:09:02,290 --> 01:09:04,229 I want to cut out this interval of nodes. 1237 01:09:04,229 --> 01:09:07,300 This is like range queries kind of in 1D. 1238 01:09:07,300 --> 01:09:08,229 So I split here. 1239 01:09:08,229 --> 01:09:09,040 I split here. 1240 01:09:09,040 --> 01:09:10,779 What I will have are the things I 1241 01:09:10,779 --> 01:09:13,160 care about, the things to the left of it 1242 01:09:13,160 --> 01:09:15,040 and the things to the right of it. 1243 01:09:15,040 --> 01:09:17,170 What I wanted was this and everything else. 1244 01:09:17,170 --> 01:09:18,160 How do I do that? 1245 01:09:18,160 --> 01:09:24,010 I concatenate-- this is y. 1246 01:09:24,010 --> 01:09:32,380 This is in the interval min of y to max of y. 1247 01:09:32,380 --> 01:09:33,348 So I wanted those guys. 1248 01:09:33,348 --> 01:09:35,139 Those are the nodes that are deeper than d. 1249 01:09:35,139 --> 01:09:37,240 I also want the nodes all together that 1250 01:09:37,240 --> 01:09:38,870 are less deep than d. 1251 01:09:38,870 --> 01:09:40,939 That's these nodes and these nodes.
1252 01:09:40,939 --> 01:09:43,029 So I concatenate these together, get 1253 01:09:43,029 --> 01:09:45,970 one big tree that represents things with depth less than d. 1254 01:09:45,970 --> 01:09:49,479 These are the things of depth greater than d. 1255 01:09:49,479 --> 01:09:49,979 OK? 1256 01:09:49,979 --> 01:09:52,810 So I do two splits, one concatenate, 1257 01:09:52,810 --> 01:09:56,500 and that simulates this kind of cut operation. 1258 01:09:56,500 --> 01:09:58,780 Similarly, if I want to do a join operation, 1259 01:09:58,780 --> 01:10:01,447 it's a constant number of splits and concatenates, and I'm done. 1260 01:10:01,447 --> 01:10:03,988 Just dealing with the fact that things are in the wrong order 1261 01:10:03,988 --> 01:10:05,620 here, but it's not so bad. 1262 01:10:10,510 --> 01:10:15,230 One more thing, which is-- 1263 01:10:15,230 --> 01:10:17,100 I basically described the overall structure 1264 01:10:17,100 --> 01:10:19,350 as a tree of auxiliary trees. 1265 01:10:19,350 --> 01:10:22,680 In reality, we're in the binary search tree model. 1266 01:10:22,680 --> 01:10:25,830 We can only have one tree. 1267 01:10:25,830 --> 01:10:26,910 Not so hard, though. 1268 01:10:26,910 --> 01:10:30,060 I mean, basically, you want one tree that 1269 01:10:30,060 --> 01:10:34,030 represents lots of trees that are kind of pasted together. 1270 01:10:34,030 --> 01:10:36,900 So to do that, you just mark each node 1271 01:10:36,900 --> 01:10:39,780 that transitions from one tree to the next. 1272 01:10:39,780 --> 01:10:42,810 So each node will say, I am the root of a new auxiliary tree 1273 01:10:42,810 --> 01:10:45,720 or just say, no, I'm part of the same auxiliary tree 1274 01:10:45,720 --> 01:10:46,470 as my parent. 1275 01:10:49,530 --> 01:10:51,790 And then you have to define these kinds of split 1276 01:10:51,790 --> 01:10:55,080 and concatenate operations in this setting where you have 1277 01:10:55,080 --> 01:10:56,676 a tree embedded inside a tree.
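The cut operation described above (two splits, one concatenate) can be sketched with sorted Python lists standing in for the red-black trees. Everything named here (split, concatenate, cut, the depth dictionary) is hypothetical. Keys are assumed to be integers, and the interval endpoints are found by a plain scan, where a real auxiliary tree would have to find them within its log log n budget.

```python
from bisect import bisect_left


def split(keys, x):
    """Split a sorted list into (keys < x, keys >= x)."""
    i = bisect_left(keys, x)
    return keys[:i], keys[i:]


def concatenate(a, b):
    """Join two sorted lists where everything in a is below everything in b."""
    assert not a or not b or a[-1] < b[0]
    return a + b


def cut(path_keys, depth, d):
    """Separate a preferred path, stored sorted by key, at depth d.

    Returns (nodes of depth <= d, nodes of depth > d). The deeper nodes
    occupy one contiguous key interval, so two splits isolate them.
    """
    deep = [k for k in path_keys if depth[k] > d]   # scan: a simplification
    if not deep:
        return path_keys, []
    lo, hi = min(deep), max(deep)
    left, rest = split(path_keys, lo)
    middle, right = split(rest, hi + 1)             # assumes integer keys
    return concatenate(left, right), middle


# Path 7 -> 3 -> 5 -> 4 in the perfect tree on keys 0 .. 14.
depth = {7: 0, 3: 1, 5: 2, 4: 3}
print(cut([3, 4, 5, 7], depth, 1))
```

In the example the nodes deeper than 1 are keys 4 and 5, which sit between 3 and 7 in key order, so cutting them out leaves the shallow nodes as two pieces that get concatenated.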
1278 01:10:56,676 --> 01:10:58,050 But you just ignore all the nodes 1279 01:10:58,050 --> 01:10:59,883 that are claimed to be part of another tree. 1280 01:10:59,883 --> 01:11:02,290 Just pretend they weren't there, and it works. 1281 01:11:02,290 --> 01:11:06,819 So a little hand-wavy there, but it's kind of a tedious detail. 1282 01:11:06,819 --> 01:11:08,610 You can stick all these trees into one tree 1283 01:11:08,610 --> 01:11:13,100 just by marking these roots. 1284 01:11:13,100 --> 01:11:14,976 And that's tango trees. 1285 01:11:14,976 --> 01:11:18,970 I already spoiled the climax, which is this log log n thing, 1286 01:11:18,970 --> 01:11:22,220 but it's pretty obvious how to get there. 1287 01:11:22,220 --> 01:11:25,150 It's just a lot of details to actually do it. 1288 01:11:25,150 --> 01:11:27,100 We're just taking the Wilber 1 bound, 1289 01:11:27,100 --> 01:11:30,070 recasting it in terms of this preferred path thing 1290 01:11:30,070 --> 01:11:33,370 where it's just the non-preferred edges. 1291 01:11:33,370 --> 01:11:35,600 Or the non-preferred edges are what Wilber 1 counts, 1292 01:11:35,600 --> 01:11:37,270 and so we can afford to spend log log n 1293 01:11:37,270 --> 01:11:38,860 time for each of them. 1294 01:11:38,860 --> 01:11:41,100 And the paths themselves only have log n nodes, 1295 01:11:41,100 --> 01:11:45,470 so you can search through them in log log n time pretty easy. 1296 01:11:45,470 --> 01:11:47,110 This also shows you why Wilber 1 is not 1297 01:11:47,110 --> 01:11:50,890 a good bound with a fixed tree. 1298 01:11:50,890 --> 01:11:53,730 Because here are log n nodes. 1299 01:11:53,730 --> 01:11:58,180 I can just sit there all day bouncing around all of them 1300 01:11:58,180 --> 01:11:59,322 in random order. 1301 01:11:59,322 --> 01:12:01,780 I'm definitely going to need log log n time to access them, 1302 01:12:01,780 --> 01:12:04,460 but Wilber 1 is not changing at all. 
1303 01:12:04,460 --> 01:12:08,480 So Wilber 1 stays constant, like 0. 1304 01:12:08,480 --> 01:12:11,140 I had to warm it up, but after I touch everything, 1305 01:12:11,140 --> 01:12:14,710 I can just sit there and bounce around these guys randomly. 1306 01:12:14,710 --> 01:12:16,730 I've got to spend log log n time to do that, 1307 01:12:16,730 --> 01:12:19,090 but Wilber 1 doesn't justify it for me. 1308 01:12:19,090 --> 01:12:22,720 Wilber 2 will go up, but Wilber 1 with this tree? 1309 01:12:22,720 --> 01:12:24,389 It's kind of lame. 1310 01:12:24,389 --> 01:12:25,930 So this is the best tango trees could 1311 01:12:25,930 --> 01:12:28,790 hope to do using Wilber 1. 1312 01:12:31,420 --> 01:12:33,790 I would guess that tango trees are a log log 1313 01:12:33,790 --> 01:12:37,400 factor away from optimal, though we don't know that for sure. 1314 01:12:37,400 --> 01:12:40,110 But greedy we're still pretty sure is good. 1315 01:12:40,110 --> 01:12:43,031 It should be a constant factor away from optimal. 1316 01:12:43,031 --> 01:12:44,780 So I want to talk a little bit about that. 1317 01:12:48,170 --> 01:12:50,260 There's one thing on this outline 1318 01:12:50,260 --> 01:12:51,260 we haven't talked about. 1319 01:12:51,260 --> 01:12:52,450 We did independent rectangles. 1320 01:12:52,450 --> 01:12:53,390 We did Wilber 1 and 2. 1321 01:12:53,390 --> 01:12:55,919 We saw applications of them in particular tango trees. 1322 01:12:55,919 --> 01:12:57,710 One thing we haven't done is Signed Greedy. 1323 01:13:01,617 --> 01:13:02,700 So let's do Signed Greedy. 1324 01:13:06,550 --> 01:13:10,260 Still left here is we have two ways to choose 1325 01:13:10,260 --> 01:13:12,330 rectangles, independent rectangles. 1326 01:13:12,330 --> 01:13:13,290 They're different. 1327 01:13:13,290 --> 01:13:15,690 It would be kind of nice to know what the best 1328 01:13:15,690 --> 01:13:17,520 way to choose rectangles is.
1329 01:13:17,520 --> 01:13:19,140 And we actually know that-- 1330 01:13:25,340 --> 01:13:26,880 Signed Greedy. 1331 01:13:26,880 --> 01:13:28,500 So there's two kinds of Signed Greedy. 1332 01:13:28,500 --> 01:13:31,260 There's the plus sign greedy, and there's 1333 01:13:31,260 --> 01:13:32,420 the minus sign greedy. 1334 01:13:36,640 --> 01:13:38,136 How does plus greedy work? 1335 01:13:38,136 --> 01:13:39,510 It's the same as greedy, you just 1336 01:13:39,510 --> 01:13:41,970 only look at plus rectangles. 1337 01:13:41,970 --> 01:13:44,140 Remember plus rectangles and minus rectangles. 1338 01:13:44,140 --> 01:13:49,320 So let's look at our favorite example here. 1339 01:13:49,320 --> 01:13:52,020 With greedy, I would sweep up, and every rectangle that 1340 01:13:52,020 --> 01:13:54,390 was unsatisfied, I would satisfy it. 1341 01:13:54,390 --> 01:13:57,570 Now I'm going to ignore minus rectangles, 1342 01:13:57,570 --> 01:14:00,220 only look at plus rectangles. 1343 01:14:00,220 --> 01:14:02,160 So I see this rectangle, and I say, oh, I 1344 01:14:02,160 --> 01:14:05,220 don't care because that's a minus rectangle. 1345 01:14:05,220 --> 01:14:10,880 Then I see this one and this one. 1346 01:14:10,880 --> 01:14:13,385 I say, oh, those are plus rectangles. 1347 01:14:13,385 --> 01:14:14,760 So I'm going to add a point here. 1348 01:14:14,760 --> 01:14:17,190 I'm going to add a point here. 1349 01:14:17,190 --> 01:14:20,760 Then I go up to here. 1350 01:14:20,760 --> 01:14:23,130 I see this rectangle, which is a plus rectangle. 1351 01:14:23,130 --> 01:14:24,070 That's bad. 1352 01:14:24,070 --> 01:14:25,290 So I've got to add a point. 1353 01:14:25,290 --> 01:14:29,670 I see this minus rectangle I don't care about. 1354 01:14:29,670 --> 01:14:31,590 This is plus greedy. 1355 01:14:31,590 --> 01:14:33,150 It does not satisfy the set. 1356 01:14:33,150 --> 01:14:35,790 This rectangle never got satisfied. 
1357 01:14:35,790 --> 01:14:37,930 But it plus satisfies the set. 1358 01:14:37,930 --> 01:14:42,780 If I do plus greedy, it will be plus satisfied. 1359 01:14:42,780 --> 01:14:45,030 Every rectangle you draw here, if it's a plus rectangle, 1360 01:14:45,030 --> 01:14:47,490 it's got another point in it. 1361 01:14:47,490 --> 01:14:50,430 What's kind of nice, also, is if you actually 1362 01:14:50,430 --> 01:14:54,600 draw the rectangles you are satisfying-- 1363 01:14:54,600 --> 01:14:55,830 maybe I'll use another color. 1364 01:14:58,530 --> 01:15:00,715 There was one rectangle here. 1365 01:15:00,715 --> 01:15:04,650 There was one rectangle here. 1366 01:15:04,650 --> 01:15:08,379 And there was one rectangle here. 1367 01:15:08,379 --> 01:15:10,170 That's a little awkward because they're not 1368 01:15:10,170 --> 01:15:12,690 on the original points. 1369 01:15:12,690 --> 01:15:14,490 So I can change them a little bit, 1370 01:15:14,490 --> 01:15:21,052 maybe move this one down to here and move this one over to here. 1371 01:15:21,052 --> 01:15:22,510 You could say that those rectangles 1372 01:15:22,510 --> 01:15:24,700 came from those points. 1373 01:15:24,700 --> 01:15:27,670 Then this is a set of independent rectangles 1374 01:15:27,670 --> 01:15:30,210 on the original points. 1375 01:15:30,210 --> 01:15:34,900 Maybe not totally obvious, but plus greedy 1376 01:15:34,900 --> 01:15:49,220 always gives an independent set of plus rectangles. 1377 01:15:49,220 --> 01:15:50,629 So it's a lower bound. 1378 01:15:50,629 --> 01:15:53,170 It's not an upper bound because it's not satisfying the point 1379 01:15:53,170 --> 01:15:55,550 set, but it's a lower bound. 1380 01:15:55,550 --> 01:15:58,418 I claim it's a very good lower bound. 1381 01:16:07,420 --> 01:16:09,280 It by itself might not be great, but you 1382 01:16:09,280 --> 01:16:11,686 have to consider both of them.
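The plus greedy sweep described above can be sketched in a few lines of Python. This is a minimal sketch, not code from the lecture. It assumes that time increases upward, that a plus rectangle is one whose two corner points have positive slope (the earlier point is to the left of the later one), and that a minus rectangle is the mirror image. The names signed_greedy and is_sign_satisfied are hypothetical.

```python
def signed_greedy(accesses, sign):
    """Sweep rows bottom-up; sign=+1 fixes only plus rectangles, sign=-1 only minus.

    accesses[t] is the key accessed at time t (assumed convention).
    Returns the set of (key, time) points after the sweep.
    """
    last_touch = {}  # key -> latest row that has a point in that column
    points = set()
    for t, x in enumerate(accesses):
        row = {x}
        newest = last_touch.get(x, -1)
        # Earlier columns on the relevant side of x, nearest to x first.
        side = sorted((k for k in last_touch if (k - x) * sign < 0),
                      reverse=(sign > 0))
        for k in side:
            # Column k needs a point on this row when its latest point is
            # strictly newer than every point in the columns from k to x.
            if last_touch[k] > newest:
                row.add(k)
                newest = last_touch[k]
        for k in row:
            last_touch[k] = t
            points.add((k, t))
    return points


def is_sign_satisfied(points, sign):
    """Brute force: every rectangle of the given sign contains a third point."""
    pts = list(points)
    for (x1, t1) in pts:
        for (x2, t2) in pts:
            if x1 < x2 and (t2 - t1) * sign > 0:
                corners = ((x1, t1), (x2, t2))
                lo, hi = min(t1, t2), max(t1, t2)
                if not any(x1 <= x <= x2 and lo <= t <= hi
                           and (x, t) not in corners
                           for (x, t) in pts):
                    return False
    return True


plus = signed_greedy([2, 1, 3], +1)
minus = signed_greedy([2, 1, 3], -1)
```

On this small input, plus is plus satisfied but still contains an empty minus rectangle, which matches the remark that plus greedy does not satisfy the whole set.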
1383 01:16:11,686 --> 01:16:26,340 So the theorem is if I take the max of plus greedy and minus 1384 01:16:26,340 --> 01:16:26,840 greedy-- 1385 01:16:30,290 --> 01:16:32,210 each of them is a lower bound, so the max 1386 01:16:32,210 --> 01:16:35,090 is a lower bound on optimal-- 1387 01:16:35,090 --> 01:16:37,760 then this is within a constant factor 1388 01:16:37,760 --> 01:16:41,650 of the biggest possible independent rectangle lower 1389 01:16:41,650 --> 01:16:42,150 bound. 1390 01:16:51,860 --> 01:16:53,360 And so this is the way you should 1391 01:16:53,360 --> 01:16:54,680 choose independent rectangles. 1392 01:16:54,680 --> 01:16:55,430 Run plus greedy. 1393 01:16:55,430 --> 01:16:56,150 Run minus greedy. 1394 01:16:56,150 --> 01:16:57,587 Take the best of the two. 1395 01:16:57,587 --> 01:16:59,420 That will always be within a constant factor 1396 01:16:59,420 --> 01:17:04,015 of the best independent set of rectangles, a factor like 4 1397 01:17:04,015 --> 01:17:06,540 or something in the worst case. 1398 01:17:06,540 --> 01:17:08,480 So let me prove this to you. 1399 01:17:12,110 --> 01:17:13,610 It's kind of a weird argument. 1400 01:17:16,307 --> 01:17:17,765 I'm going to define a new quantity. 1401 01:17:20,670 --> 01:17:23,210 Let's call this OPT x, I guess. 1402 01:17:26,300 --> 01:17:31,940 It's sort of like if you consider plus rectangles 1403 01:17:31,940 --> 01:17:33,837 separately from minus rectangles, which 1404 01:17:33,837 --> 01:17:34,670 is what we're doing. 1405 01:17:52,440 --> 01:17:55,220 So I would like a point set-- 1406 01:17:55,220 --> 01:17:57,620 first, I'd like a plus satisfying point set, 1407 01:17:57,620 --> 01:18:01,440 and then I'd also like a minus satisfying point set. 1408 01:18:01,440 --> 01:18:03,020 And then I take their union. 1409 01:18:03,020 --> 01:18:08,340 And I say the cost of that pair of plus satisfying and minus 1410 01:18:08,340 --> 01:18:10,350 satisfying is the size of the union.
1411 01:18:10,350 --> 01:18:13,640 So I get a bonus point if they happen to overlap. 1412 01:18:13,640 --> 01:18:16,040 Not a big deal, just a factor of 2. 1413 01:18:16,040 --> 01:18:18,200 So this is not a core concept, but it turns out 1414 01:18:18,200 --> 01:18:20,510 to be basically what we were doing over here. 1415 01:18:23,300 --> 01:18:27,200 Let me give you a sequence of crazy inequalities. 1416 01:18:27,200 --> 01:18:29,900 First one is that this OPT thing is greater than 1417 01:18:29,900 --> 01:18:33,980 or equal to size of the input. 1418 01:18:33,980 --> 01:18:36,060 Each of these inequalities is totally obvious, 1419 01:18:36,060 --> 01:18:38,056 but the conclusion is kind of crazy. 1420 01:18:43,860 --> 01:18:46,520 The independent rectangle lower bound, which we proved, 1421 01:18:46,520 --> 01:18:49,040 says that if you look at plus satisfying things, that's 1422 01:18:49,040 --> 01:18:51,050 going to be at least size of the input 1423 01:18:51,050 --> 01:18:52,967 plus the max number of independent rectangles. 1424 01:18:52,967 --> 01:18:54,758 If you look at the minus satisfying things, 1425 01:18:54,758 --> 01:18:56,990 that's also going to be at least size of the input 1426 01:18:56,990 --> 01:19:00,320 plus maximum number of minus independent rectangles. 1427 01:19:00,320 --> 01:19:01,520 So we already proved this. 1428 01:19:01,520 --> 01:19:03,344 That, if you look at this union, it's 1429 01:19:03,344 --> 01:19:05,510 going to be at least the size of the input plus half 1430 01:19:05,510 --> 01:19:08,060 the overall max. 1431 01:19:08,060 --> 01:19:12,320 So that's what we proved at the beginning of lecture. 1432 01:19:12,320 --> 01:19:17,840 Now, this is the best way to use independent rectangles. 1433 01:19:17,840 --> 01:19:19,970 This kind of Signed Greedy, which 1434 01:19:19,970 --> 01:19:22,749 is the max of the two signs, is a way 1435 01:19:22,749 --> 01:19:24,040 to find independent rectangles.
1436 01:19:24,040 --> 01:19:25,331 So it's only going to be worse. 1437 01:19:25,331 --> 01:19:27,480 It's going to be smaller. 1438 01:19:27,480 --> 01:19:32,000 So we can say it is greater than or equal to half 1439 01:19:32,000 --> 01:19:37,630 the max of plus greedy and minus greedy. 1440 01:19:42,650 --> 01:19:44,160 This was the max. 1441 01:19:44,160 --> 01:19:46,700 So this is another way to do it, so it must be smaller. 1442 01:19:49,920 --> 01:19:58,710 Now, plus greedy computes a plus satisfying assignment. 1443 01:19:58,710 --> 01:20:02,834 So I could say, well, if you looked at the optimal plus 1444 01:20:02,834 --> 01:20:05,000 satisfying assignment-- this is something we defined 1445 01:20:05,000 --> 01:20:06,440 at the beginning of lecture-- 1446 01:20:06,440 --> 01:20:09,590 and the optimal minus satisfying assignment, that's 1447 01:20:09,590 --> 01:20:11,870 only going to be smaller than greedy because greedy 1448 01:20:11,870 --> 01:20:15,290 is an algorithm for solving OPT plus. 1449 01:20:15,290 --> 01:20:19,302 It can't be better than the optimum. 1450 01:20:19,302 --> 01:20:21,260 Greedy again has to be bigger than the optimum. 1451 01:20:24,890 --> 01:20:29,110 Now I just want to turn this max into a plus 1452 01:20:29,110 --> 01:20:32,550 because the max is always at least the average. 1453 01:20:32,550 --> 01:20:37,670 So if I take the average, which turns it into 1/4 OPT plus 1454 01:20:37,670 --> 01:20:39,995 plus OPT minus. 1455 01:20:42,990 --> 01:20:44,090 Then that holds. 1456 01:20:44,090 --> 01:20:46,120 You turn the max into a plus. 1457 01:20:46,120 --> 01:20:48,440 If I look at the optimal plus satisfying 1458 01:20:48,440 --> 01:20:50,690 plus the optimal minus satisfying, 1459 01:20:50,690 --> 01:20:54,770 that's only going to be bigger than this thing 1460 01:20:54,770 --> 01:20:56,900 because this can only save like a factor of 2 1461 01:20:56,900 --> 01:21:00,436 or whatever over just adding them up.
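The inequalities stated so far chain together as follows. This is a reconstruction from the spoken argument, not the board itself, and the symbols are assumptions: X is the input point set, MaxIR is the largest number of independent rectangles, G_+ and G_- are the total point counts (input points plus added points) of plus greedy and minus greedy, and OPT_+ and OPT_- are the sizes of the smallest plus satisfying and minus satisfying supersets of X.

```latex
\[
\mathrm{OPT}_{\times}
  \;\ge\; |X| + \tfrac{1}{2}\,\mathrm{MaxIR}
  \;\ge\; \tfrac{1}{2}\max\left(G_{+},\, G_{-}\right)
  \;\ge\; \tfrac{1}{2}\max\left(\mathrm{OPT}_{+},\, \mathrm{OPT}_{-}\right)
  \;\ge\; \tfrac{1}{4}\left(\mathrm{OPT}_{+} + \mathrm{OPT}_{-}\right)
  \;\ge\; \tfrac{1}{4}\,\mathrm{OPT}_{\times}
\]
```

Under those assumptions the steps read, left to right: the independent rectangle lower bound applied to the union; signed greedy is one particular way of finding independent rectangles; each signed greedy output is itself a signed satisfying set, so it is at least the signed optimum; the max is at least the average; and the size of a union is at most the sum of the two sizes.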
1462 01:21:00,436 --> 01:21:02,060 I don't even need the factor of 2 thing. 1463 01:21:02,060 --> 01:21:05,810 I just need that if you add them up, 1464 01:21:05,810 --> 01:21:08,270 that's only going to be worse than just counting them 1465 01:21:08,270 --> 01:21:10,350 as the union. 1466 01:21:10,350 --> 01:21:13,100 So we get what I call a sandwich. 1467 01:21:13,100 --> 01:21:15,860 On the one side, we have OPT x. 1468 01:21:15,860 --> 01:21:17,489 On the other side, we have 1/4 OPT x. 1469 01:21:17,489 --> 01:21:19,280 I really don't care about OPT x personally. 1470 01:21:19,280 --> 01:21:21,690 I mean, it's kind of interesting to see that it's here. 1471 01:21:21,690 --> 01:21:23,910 But the point is these are within a constant factor. 1472 01:21:23,910 --> 01:21:25,618 Therefore, all of these things in between 1473 01:21:25,618 --> 01:21:27,690 are within a constant factor of each other. 1474 01:21:27,690 --> 01:21:31,970 So in particular, this thing, max of the two greedys, 1475 01:21:31,970 --> 01:21:34,070 is within a constant factor of this thing. 1476 01:21:34,070 --> 01:21:37,070 This is the independent rectangle lower bound, 1477 01:21:37,070 --> 01:21:38,540 the best one. 1478 01:21:38,540 --> 01:21:40,460 It also tells you that OPT x is basically 1479 01:21:40,460 --> 01:21:43,230 what we're computing here. 1480 01:21:43,230 --> 01:21:44,640 So this is weird. 1481 01:21:44,640 --> 01:21:49,190 I'm going to draw one more picture, which 1482 01:21:49,190 --> 01:21:56,630 is greedy versus Signed Greedy. 1483 01:21:56,630 --> 01:21:58,760 Remember greedy from last lecture. 1484 01:21:58,760 --> 01:22:02,930 Greedy says, look, I'm going to fix plus rectangles, 1485 01:22:02,930 --> 01:22:04,640 and I'm going to fix minus rectangles. 1486 01:22:04,640 --> 01:22:06,740 It does them both at the same time.
1487 01:22:06,740 --> 01:22:08,600 Signed Greedy says, look, I'm going 1488 01:22:08,600 --> 01:22:11,270 to do the plus rectangles separately, 1489 01:22:11,270 --> 01:22:14,510 and then I'm going to do the minus rectangles separately, 1490 01:22:14,510 --> 01:22:17,605 and then add them up or take the union or take the max. 1491 01:22:17,605 --> 01:22:18,370 It doesn't matter. 1492 01:22:18,370 --> 01:22:19,760 It's a constant factor. 1493 01:22:19,760 --> 01:22:22,670 Just add them separately. 1494 01:22:22,670 --> 01:22:24,920 This one is an upper bound. 1495 01:22:24,920 --> 01:22:27,680 It is a binary search tree. 1496 01:22:27,680 --> 01:22:29,030 This thing is a lower bound. 1497 01:22:29,030 --> 01:22:32,780 All binary search trees must take at least this. 1498 01:22:32,780 --> 01:22:35,720 Are they equal up to constant factors? 1499 01:22:35,720 --> 01:22:36,500 We don't know. 1500 01:22:36,500 --> 01:22:37,970 That's the big question. 1501 01:22:37,970 --> 01:22:40,400 They look almost identical. 1502 01:22:40,400 --> 01:22:43,280 But what greedy has to deal with is sort of the interrelations. 1503 01:22:43,280 --> 01:22:45,980 When I fix some plus rectangles, I 1504 01:22:45,980 --> 01:22:49,305 might get new minus rectangles that I have to fix with greedy. 1505 01:22:49,305 --> 01:22:51,180 Signed Greedy doesn't have to deal with that. 1506 01:22:51,180 --> 01:22:52,989 It's just the plus rectangles. 1507 01:22:52,989 --> 01:22:54,530 They might make more plus rectangles, 1508 01:22:54,530 --> 01:22:56,120 but that's all I have to deal with. 1509 01:22:56,120 --> 01:22:57,620 It doesn't deal with the interaction 1510 01:22:57,620 --> 01:22:59,420 between plus and minus rectangles. 1511 01:22:59,420 --> 01:23:03,290 Seems like the interaction kind of fades away 1512 01:23:03,290 --> 01:23:04,207 as a geometric series. 1513 01:23:04,207 --> 01:23:05,873 And therefore, these things are the same 1514 01:23:05,873 --> 01:23:07,040 up to constant factors.
1515 01:23:07,040 --> 01:23:08,900 But we have no way to prove that. 1516 01:23:08,900 --> 01:23:12,380 It could be the interaction blows you out of the water 1517 01:23:12,380 --> 01:23:14,750 somehow. 1518 01:23:14,750 --> 01:23:19,271 That's the best we know for dynamic optimality. 1519 01:23:19,271 --> 01:23:21,770 Maybe next time I teach this class we'll have a final answer 1520 01:23:21,770 --> 01:23:25,730 and it'll be constant, but that's where we are today.
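To experiment with the comparison between greedy and Signed Greedy on small inputs, one rough sketch is a single sweep routine that takes the set of signs to fix on each row. The function name sweep_cost and the sign convention (plus means the earlier point is to the left of the later one, with time increasing upward) are assumptions for illustration, not the lecture's notation.

```python
def sweep_cost(accesses, signs):
    """Count the points placed by a row-by-row sweep.

    signs=(+1, -1) fixes both kinds of rectangles on every row, as greedy does;
    signs=(+1,) or signs=(-1,) is one half of Signed Greedy.
    """
    last_touch = {}  # key -> latest row that has a point in that column
    total = 0
    for t, x in enumerate(accesses):
        row = {x}
        for sign in signs:
            newest = last_touch.get(x, -1)
            # Earlier columns on this side of x, nearest to x first.
            side = sorted((k for k in last_touch if (k - x) * sign < 0),
                          reverse=(sign > 0))
            for k in side:
                if last_touch[k] > newest:
                    row.add(k)
                    newest = last_touch[k]
        # Update columns only after both sides used the pre-row state.
        for k in row:
            last_touch[k] = t
        total += len(row)
    return total


accesses = [2, 1, 3]
greedy_cost = sweep_cost(accesses, (+1, -1))
plus_cost = sweep_cost(accesses, (+1,))
minus_cost = sweep_cost(accesses, (-1,))
```

Tiny inputs like this one say nothing about the open question, which concerns whether the two quantities stay within a constant factor of each other on every access sequence.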