The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

SRINIVAS DEVADAS: All right. Good morning, everyone. Let's get started. A new module today-- we're going to spend a few lectures on randomized algorithms. And so not only will we look at slightly different ways of solving old problems like sorting, we'll also look at how we can analyze this new kind of algorithm that generates random numbers in order to actually make decisions as it's executing. And we'll end up, obviously, with the analysis that gives us the expected run time of the algorithm-- or, for example, whether the algorithm is going to produce a correct result or not, with what probability this algorithm will produce a correct result.

So I'll talk a little bit about why we're interested in randomized algorithms in a couple of minutes, but let me define what a randomized algorithm, or a probabilistic algorithm, is to start things off.

And so a randomized algorithm is something that generates a random number r. Now, this would be a coin flip, but more often than not, you're generating a real number that comes from a certain range. Sometimes you're generating a vector. You'll see a couple of different examples here in today's lecture and in section. And it's going to make decisions based on this value, based on r's actual value.

Now, you can imagine that an algorithm would be recursive, and at every level of recursion, it's going to generate a random r. So when you're executing at a particular level of recursion, you may be doing different things based on r. And not only that, if you re-run the algorithm again on the same input, the execution will be different, because you're assuming a true random number generator as opposed to a pseudorandom one.
And the r's that you're going to get at different levels of recursion, or through the execution of the algorithm, are going to be different from the first time to the second time. So on the same input, on different executions, two things might happen.

The algorithm may run for a different number of steps. So you might get lucky on the first execution, and the algorithm finishes, let's say, at 100 time units. The second time around, it takes a long time. It takes 700 time units. Our goal here is to try and analyze what this probabilistic runtime would be-- to ask for an expectation, to be able to compute an expectation for the runtime. Or-- if you're talking about a different scenario-- different executions could actually produce different outputs.

And in this case, it's possible that one or more of these outputs are incorrect. You actually get the wrong answer. And obviously, that's going to happen with a certain probability. You're going to have to decide or analyze what that probability is. And generally speaking, we won't be happy with a high probability of error, as you can imagine. And we'd like to set up an algorithm such that you can reduce that probability of incorrect output to be something really, really small. And it might take you longer to get to that low level of incorrect output in one case, for a certain set of inputs, versus another case.

So that's the setup here in terms of randomized algorithms. You're going to have algorithms that-- you can think of them as probably correct. So these are algorithms-- you want to think of them as probably correct, and they do have a name. They're called Monte Carlo algorithms. And then you have algorithms that are probably fast.

So-- in the case of probably correct-- you could have a constant probability that they're going to give you the correct answer, 99%. And you could obviously try and parametrize that. In the case of probably fast, you say things like, it runs in expected polynomial time.
And really what that means is that you may have to run it for more iterations. So rather than taking 100 iterations or 100 steps to sort something, it might take you 110. But in the case of probably fast, you do get the sorted result at the end. And when the algorithm has finished execution, you do get that sorted result at the end. So it's correct and probably fast, or probably correct and deterministically fast. OK. And this is Las Vegas. So you have Monte Carlo versus Las Vegas here.

So yesterday, it occurred to me-- and I've taught this class a bunch of times-- but it occurred to me for the first time last night that there should be algorithms that are probably correct and probably fast, which means that they're incorrect and slow some of the time. Right? So what do you think those algorithms are called? Sorry. What?

AUDIENCE: T?

SRINIVAS DEVADAS: The T? Oh. Oh! That deserves a Frisbee. Oh my goodness! [LAUGHS] All right. There you go. There you go. All right. Now, they're not called the T. So we should write that down so everyone knows. Probably correct and probably fast-- which, I guess, means they don't get you anywhere. I don't know what that means-- incorrect and slow in the case of the T. So, the MBTA.

Any guesses? I mean, think about what we have for Monte Carlo, Las Vegas. Extrapolate. These are the kinds of questions you're going to get on your quiz. I guess you guys don't gamble. Go ahead.

AUDIENCE: Atlantic City.

SRINIVAS DEVADAS: Atlantic City. That deserves a Frisbee. Yeah. Absolutely right. It turns out Atlantic City isn't a name that's really caught on, but it was used in this context.
Most of the time, if you do have a probably correct, probably fast algorithm, you can convert it into a Monte Carlo algorithm or a Las Vegas algorithm. There are some prime testing algorithms-- to test whether a particular number is a prime or not-- that run in probabilistic polynomial time, and they may incorrectly tell you that the number is a prime. So that's an example of an Atlantic City algorithm. We won't actually do Atlantic City. What we'll do is we'll take a look at a couple of different algorithms, and both of these will motivate why randomized algorithms are interesting.

The Monte Carlo example is checking matrix multiply. So you've gotten a couple of square matrices. Both of them are n by n matrices, and you multiply them out-- A times B, and you produce C. And so you got the C matrix. And rather than re-multiplying and checking the result, you'd like to do something better. You'd like to verify, with some probability that you can parametrize, that the output matrix is in fact the product of the two input matrices. And so that's a randomized algorithm that's a Monte Carlo, because you're not guaranteeing that that output matrix is in fact the product of the first two matrices, or the operand matrices, but you're getting a good sense of how likely that is. And you can kind of squish that probability of error down to however low you want it to be, except you have to run the algorithm for longer. So that's an example of Monte Carlo.

Now, quicksort. It doesn't make sense to say-- I guess you could-- but it doesn't make too much sense to say that you have an almost sorted array. What does that mean exactly? You have to categorize that. So quicksort is an example where you're guaranteed to get a sorted array at the end of it. So it's correct. You will get a sorted array. That's what you wanted-- descending order, ascending order. But it might not run in order n log n time. That's expected time. Order n log n is expected time.
And so that's what probably fast would correspond to. All right? So that's the setup. You can kind of see why these are interesting, because you could imagine that in practical scenarios, you might want to do some checking in a probabilistic way. And you want to do that without having to redo all the work. Obviously you don't want your checker for matrix multiply to be as slow as multiplying two matrices. Otherwise it makes no sense.

So let's dive into matrix product and our first example of a probably correct algorithm, or Monte Carlo algorithm. So what I want to do here is C equals A times B. And the simple algorithm-- I guess those of us who went to high school, myself included, did my four years-- know of an n cubed algorithm, or learned it back then. It simply corresponds to taking rows and columns, and you get an entry. You have n square entries that you need to compute, corresponding to the output matrix C. And you're going to do order n multiplications and additions, but we're really going to consider multiplications here.

When I talk about n here, it's not the total number of operations. It's the number of multiplications. And the reason for that is-- this may have gone away a little bit, but it's still probably true-- that in computers, it takes longer to multiply two numbers, integers or floating point numbers, than to add numbers. It used to be much more dramatic, the difference between multiply and add in computers. But thanks to pipelining and lots of optimizations, multiplies are actually very fast. But they are, obviously, a more sophisticated operation than addition. So we'll be counting multiplies.

So you've seen Karatsuba divide and conquer for multiply, back in 006. Remember that we were counting multiplications, and we were actually trading off multiplications for additions.
We were trying to shrink that number associated with the complexity of the algorithm when counting the number of multiplies. And we actually counted the number of additions that were going up-- at least from a constant factor standpoint, not necessarily from an asymptotic complexity standpoint. And so that's the simple algorithm.

You've probably heard of Strassen. Some of you might have seen it. Essentially what happens with Strassen is you multiply two two by two matrices using seven multiplications as opposed to eight. Now, if you do that-- and this is similar to the Karatsuba analysis-- you can do this in n raised to log 2 7 time, which is essentially n raised to 2.81 time. And so rather than n cubed, you can go down to n raised to 2.81.

Now it turns out people have obviously not stopped with this. You can go to n raised to 2.79 by doing something of the order of 143,000 multiplications for 70 by 70 matrices. So you can play around. Just like you had Toom-Cook-- I don't know if you remember that or if it got covered-- but Karatsuba could get generalized into this thing called Toom-Cook. And the same thing with Strassen-- you could go off and do a divide and conquer whose base case is not two by two, but 70 by 70, and that improves things.

But it turns out there are other ways, based on arithmetic progressions. And so a famous algorithm that up until 2010 was the best complexity algorithm known is Coppersmith-Winograd, which is 2.376. And then at some point, we had a faculty candidate here who shrunk this from 2.376 to 2.373. And it turns out that there were two different researchers who came up with a 2.373, but this particular candidate won in the sixth decimal place. So she had an eight. The other person had a nine or something. But anyway, all of these are impractical. OK. You don't want to use them. The constant factors associated with these things are much larger than what you have here.
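To make these two concrete, here is a minimal sketch in Python-- an illustration under assumptions, not code from the lecture-- of the high-school n cubed algorithm and of Strassen's seven-multiplication recursion on top of it, assuming square matrices whose size is a power of 2 (the names naive_matmul and strassen are just placeholders):

```python
import numpy as np

def naive_matmul(A, B):
    # High-school algorithm: n^2 output entries, each an inner product
    # costing n multiplications, so order n^3 multiplies in total.
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

def strassen(A, B, cutoff=64):
    # Strassen's recursion: 7 multiplications of half-size blocks instead of 8,
    # which gives the O(n^(log2 7)) ~ O(n^2.81) bound mentioned above.
    n = A.shape[0]
    if n <= cutoff:        # constant factors favor the simple algorithm on small blocks
        return naive_matmul(A, B)
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])
```

The cutoff reflects the point being made here about constant factors: the asymptotically faster methods only pay off once the blocks are large enough to beat their overhead.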
The constant factors here, I guess it's one, right? Makes sense that it would be one, forgetting the additions of course. So if you have large constant factors, then you need a billion by billion matrix in order to win. And if you have billion by billion matrices that you want to multiply, do something else. OK. You don't want to go there. Even in the day of the internet, it's not going to work.

So what we'd like to do now is do something better. So we will try-- given that this is a theoretical computer science class, it makes sense to say that our verification algorithm should be better than n raised to 2.376, or 2.3-whatever. Right? Otherwise, it doesn't feel good. So what we'd like to do-- and we can do this-- is try and get an order n square algorithm-- that's this.

So it's a probably correct Monte Carlo algorithm where, if you have A times B equals C, then the probability that the output equals yes is 1. So in fact, if you got it right, then the verifier is not going to give you a false negative. It's not going to say-- no, you got it wrong-- when you got it right. But it could give you a false positive with some probability, where you have the probability of output equals yes, and that's a false positive. But you can bound that to be less than half. OK.

So it's going to say yes. So obviously, if the verifier kept saying yes all the time, you wouldn't have this. It wouldn't be very interesting. It would be constant time, but it wouldn't be very interesting. What is interesting here is that when they're not equal, you're going to get an incorrect result with some bound on the probability. So you say, about one half seems kind of high-- 50%, flipping a coin. The good news is that these algorithms, you can run them over and over. You can run this checker over and over.
And as long as executions are independent-- and you can certainly ensure that they're independent by ensuring that the randomness from one execution to another-- the flipping of the coins-- is independent. OK. And so that's relatively easy to do, in certainly all of the scenarios we'll be looking at. In 046, it's relatively easy to do. You can now drive this probability down to one quarter with two executions, because you'll just check different things. And then one eighth with three, and so on and so forth. So that's what's cool about it.

And now, if you look at the runtime, you say, well, the runtime is still order n square. That's the beauty of this, because I'm just putting an extra constant factor here, where I have k n square, where k is a constant. And effectively, I have this nice relationship in terms of the probability of error going down to 1 divided by 2 raised to k. And what I have here is a k n square. So that's what's cool about it. And obviously, k n square is still order n square, and polynomial time, but the probably correct aspect of it gets better and better. OK. Any questions so far? All right. Good.

So what we're going to do is-- this algorithm actually works for arbitrary matrices-- the structure at least. We're going to assume that the matrix entries are Boolean. We're going to work in the finite field mod 2. And it's just an easier proof. It's easier to see. So the complexities are all the same. You're still multiplying numbers. They happen to be small. And multiplication costs you one operation. And you need to do n cubed multiplies to actually get the C matrix, and you have to verify it in order n square time. And so the number of multiplies, again, that you want to use in your verification algorithm has to be order n square. We're ignoring the additions.
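As a sketch of this amplification idea-- an illustration, not from the lecture-- here is the outer loop in Python. The parameter single_trial is a hypothetical stand-in for any one-sided check of this kind; a concrete one (the Freivalds check) is sketched after the algorithm is presented below.

```python
def verify_product(A, B, C, k, single_trial):
    # Each trial never answers "no" on a correct product, and answers "yes" on an
    # incorrect product with probability at most 1/2.  Over k independent trials,
    # the false-positive probability drops to (1/2)^k, and the work stays O(k n^2).
    for _ in range(k):
        if not single_trial(A, B, C):   # a single "no" is a certificate that AB != C
            return False
    return True                          # wrong with probability at most 2**(-k)
```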
So that's what we'd like out of our matrix product checker, and the algorithm we're going to look at is called Freivalds' algorithm-- a cute little algorithm-- that does the following. So the algorithm itself is very straightforward, a couple of lines, a minute or so to describe, and the interesting aspect of it is the analysis-- the fact that you can show this. That's the cool part. If you couldn't show that, there's nothing cool about this algorithm.

So we're going to choose a random binary vector. So there you go. Here's your randomness. And this binary vector, every time you run it, as the k increases here, the random binary vector is different from one run to another. That's important. You can't run the same thing again and then expect a different result. That's called insanity. But you are going to assume that, given that we are working in the binary space and this is a binary vector, the probability that r i equals 1 is one half, independently, for i equals 1 through n.

And the algorithm essentially is this-- we're going to do a bunch of matrix vector multiplies. An n by n matrix multiplied by an n by n matrix gives you an n by n matrix, and that's your n cubed. So these are all-- I think I said this, but I should've written this down-- these are all square matrices that are n by n. And that's where you get your n cubed. A matrix vector product would be something where you have-- typically we'd have a column vector here. You're going to get something like that, and you have n square multiplications here. You're going to grab one of these and then multiply it by that and get an entry here. And that obviously is n multiplications, but you only have n elements to produce here in this vector. So you only get n square. That make sense?

And so what we're going to do is, we're going to take this r, and we're going to compute A times Br.
And so the brackets are important, because it says that you're going to compute what's inside the brackets first. Otherwise, it would be a problem, because you'd be multiplying A times B. And obviously, that's order n cubed. Right? You don't want that. So if A times Br equals Cr-- and r, remember, is a column vector, and C is an n by n matrix, as are A and B-- we're going to output yes. Else-- if these two are not equal-- you're going to output no. OK? And so that's it. That's one run of the algorithm-- generate a random r and do the multiplication as you see here.

So let's be clear about complexity, and let's make sure we understand the simpler aspects of the algorithm before we get into the analysis associated with bounding the false positive probability. The hard part is going to be bounding the false positive probability. But the easy part is, first, the complexity. So how many matrix vector products am I doing here? How many matrix vector products am I doing here in this check on one iteration of the algorithm? Yeah.

AUDIENCE: Three.

SRINIVAS DEVADAS: Three. All right. All right. You need to stand up. This is fun. This is the hardest throw I've had to make in 6046. I got to put this down. Warm up a little bit. It's kind of cold. Whoa. Terrible! All right. The person who gets up and gets that owns it, and we're going to do this again. All right. Let's see how long this takes.

AUDIENCE: Is this part of my trial?

SRINIVAS DEVADAS: Yes. Well, the first one failed. False whatever, right?

[LAUGHTER]

I got a few more. [LAUGHS] All right. Let me see. I think I need to go here.
This is good. And I need to be-- all right. Number three. Thank you. Thank you.

[CLAPPING]

So it was three. Three. Perfect. Three matrix vector products, because I got to do this. That's the matrix vector product. Remember, I'm getting a column vector out of this, which is important, and then I'm going to multiply this matrix with a column vector-- matrix vector product number two. And then there's a matrix vector product over here. So then at that point-- remember, I have a vector and a vector. And checking the equivalence of two vectors is simply checking the equivalence of each element in the vector, one by one. So first one, same as the first one. Second one, same as the second one. Et cetera. And so this is order n square. But three is something that is worth thinking about, simply because every once in a while, we're interested in constant factors.

And the other thing that's interesting about this-- make sure I write this-- let me write this over here-- is that if AB equals C, then there's no issue associated with error here. So there's no notion of a false negative, because if AB equals C, then you know, thanks to the associativity of matrix multiplication-- be it whether they're n by n matrices or columns-- you have this relationship here. And I hope you can read it at the back. Essentially what I have here is, if AB equals C-- so if in fact the matrix multiply happened correctly-- I'm in a situation where it is clear that A(Br) equals this, thanks to the associativity of matrix multiply. And that, of course, is exactly the same as Cr. So that should convince you, thanks to associativity of matrix multiply, that you don't have any false negatives in this algorithm. Make sense? So we're all good.
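Here is a minimal sketch of one run of that check over GF(2)-- an illustration, not the lecturer's code-- assuming the matrices are given as lists of lists of 0/1 entries. It does exactly the three matrix vector products just counted, and it can be handed to the verify_product wrapper sketched earlier as its single_trial.

```python
import random

def freivalds_trial(A, B, C):
    n = len(A)
    # Random binary column vector: each r_i is 1 with probability 1/2, independently.
    r = [random.randint(0, 1) for _ in range(n)]

    def matvec(M, v):
        # One matrix-vector product mod 2: n entries, n multiplications each -> O(n^2).
        return [sum(M[i][j] * v[j] for j in range(n)) % 2 for i in range(n)]

    Br = matvec(B, r)
    ABr = matvec(A, Br)      # compute A(Br), never (AB)r, so we stay at O(n^2)
    Cr = matvec(C, r)
    return ABr == Cr         # "yes" iff the two vectors agree; never a false negative
```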
All we have to do, given what we have with respect to Freivalds, is to do this part here, which is going to take a little bit of doing. And the challenge always with simple algorithms is you don't quite know why they work. And then of course, you have sophisticated algorithms, and you don't quite know why they work.

So this will take a few minutes. It's not super complicated, and there's a little insight, as always, with these things, that is not immediately obvious. But we'll have to look at the number of r's. So you have an r vector that you've generated randomly, and it may be a bad vector. It may be a vector that doesn't show you that the product matrix has an incorrect entry. Remember, there's n square entries in this matrix. Exactly one of them may be wrong, and you need to find it. Right? So there may be a lot of entries which are all correct, but you've got to find that one entry that's incorrect. And so you could miss it. A given r vector might miss it. And of course, if you keep generating the r's, you'd like to find it and declare that the matrices weren't multiplied correctly, and that probability is what we have to compute.

So we want to get this result, where we are analyzing the correctness in the other case. You've already analyzed the correctness in the case where AB equals C, but now we have to analyze the correctness in the case where AB is not equal to C. Right?

And so the claim is that if AB is not equal to C, then the probability of ABr not equal to Cr is greater than or equal to half. So this is greater than or equal to. Over there, I'm just talking about the false negative probability, where I'm actually getting an incorrect yes when you have the matrices being multiplied wrongly, incorrectly. And so that's why I get-- this is what I want. I want there to be a greater than one half probability for r to have discovered that, for r to have discovered that. OK? I'll stop for questions in a second, but let me do a little bit more.
I'm going to compute the difference matrix-- and I'm not actually computing this, because obviously this would take a while to compute; it's just for the purpose of analysis. I'm going to look at the difference matrix D equals AB minus C, because you want D to be 0. And then we're going to do some analysis that says-- we are going to try and find these non-zero entries in D, because, clearly, if there's a non-zero entry in D, you've got a problem here. The matrices weren't multiplied properly. So that's why we have D here. Don't think of it as we're actually computing that.

So what we'd like is to, as I said, discover these entries, where our hypothesis now is that D is not equal to 0, because that's the case we're considering. We know that D is not equal to 0 if the matrices were multiplied incorrectly. And when I say D is not equal to 0, it means that there are n square entries in D, and at least one of them is not 0. For D to be 0, they would all have to be identically 0. That's all it means. D not equal to 0 means at least one entry is not 0.

So now what we need to do is we need to show that there are many r's-- r is a binary vector of length n, and you can obviously think about 2 raised to n possibilities with respect to r. And what we really want to show is that there's a large fraction-- more than half of the r's are going to actually discover that the matrices were multiplied incorrectly. OK. So we want to show that there are many r's such that Dr is not equal to 0. Because if Dr is not equal to 0, then you're obviously going to discover that ABr is not equal to Cr. So if ABr is not equal to Cr, that's identical to saying that Dr is not equal to 0. That make sense?

So specifically, if you look at the claim and write it in terms of Dr, you want to say that the probability of Dr not equal to 0 is greater than or equal to half for a randomly chosen r. And so that's it.
That's the setup that we have to show. We have to do a counting argument corresponding to these r vectors that are being generated randomly. So let's do that.

So the general argument we're going to make here is simply that we're going to-- roughly speaking-- look at a bad r. What's a bad r? A bad r is something that doesn't discover the incorrect multiplication. That's what a bad r is. So your D is not equal to 0, but Dr equals 0. OK. That's a bad r, right? It's quite possible that that would be the case. And so you want to try and figure out how many of these bad r's there are, because those are the ones that are causing the false negatives. Right? So that counting argument is the crux of the proof of the claim.

So let's look at that. And what we're going to do is, we're going to pick a bad r, and we're going to say that there are these good r's that are associated with this bad r. And for every bad r, there's a good r. And a good r is something that actually discovers the incorrect multiply. And given that for every bad r there's a good r, half of the r's are good r's. That's it. So I'll write it down. That's the essence of the argument. And I'll go a little more slowly, so hopefully you'll get that.

So let's look at the Dr equals 0 case, because that's the interesting case. That's the case where the r is bad even though we had an incorrect multiply, and you get this-- I should have said you get a false positive. So I'm sorry. I think just before, I said false negative, but I meant false positive. So you have a false positive in this case. And D equals AB minus C not equal to 0 implies there exists an i and j such that Dij is not equal to 0. OK? And there's just one entry at least-- if you say the matrix is not equal to 0, there's got to be an entry that's not equal to 0.
So let's take a look at that entry, and let's just draw it out. That's my D matrix. And there's going to be an i and a j. So that's my ith row and my jth column. And there you go. I have an entry here which is Dij, and I'm just picking that. I don't care what i and j are, but there's got to be an entry that's not equal to 0.

Now I'm going to create a vector, v. So this vector is not r. It's a vector v that is chosen deterministically, given the Dij, where it's got 0's everywhere except at vj. So if this is the jth entry going column-wise, everywhere else you've got 0. And you've just got the one associated with the-- going downward-- the jth entry. OK? So it's a one-hot vector, if you will. It's got one 1.

So now, if you multiply these two things out, you know that you're going to get something, and we're going to call this Dv. So you take D and you multiply it by v-- a matrix multiplied by a vector. You're guaranteed, given that all of these are 0, when I do my this times that, plus this times this, plus this times this, all of these are going to produce 0. This times 1 is going to produce something that's non-zero, and then all of the other ones are going to produce 0. So I'm just adding a bunch of 0's to this non-zero entry multiplied by one. So I'm going to get something that's non-zero. Right? All make sense?

So I'm going to see something here, which is the jth entry, that's not equal to 0. And so that implies that Dv is not equal to 0. And in particular, what I'm saying is Dv of j-- so if I just look at that entry-- that is identically Dij, which is not equal to 0. Because I'm multiplying it by 1, and I'm adding a bunch of 0's to it. That's it. OK. Yeah. A question.

AUDIENCE: Is it Dv of j or of i?

SRINIVAS DEVADAS: So I picked the j here. So I think I'm going j, j, right?
That make sense? If j was 7 and this is 7 down, then it would be the seventh [INAUDIBLE], because this is going to turn into that. OK? Now, either way, if I picked it in the middle, it doesn't really matter. The point is there's going to be one entry. So hang in there. There's going to be one entry that's nonzero, if you didn't quite get that. So Dij is not 0.

And this is one more observation we're going to make in order to do the counting of these bad r's, because this is a bad r that we're looking at if you say that Dr equals 0. You've created a v that has nothing to do with r, but we're going to use the v to go from a bad r, which is our example here, to a good one. That's pretty much it. That's the last step here.

So what we're going to do is, we're going to take any r that can be chosen by our algorithm such that Dr equals 0, because that's the case that we're looking at. And we're going to compute r prime, which is r plus v. And just remember, this is mod 2 arithmetic. You're only going to get 0's and 1's. So if you have 1 plus 1, it gives you 0. Obviously 0 plus 0 gives you 0. And the other cases are clear. And this plus here, remember, is also-- the other thing that's important is this is not only mod 2. These are all vectors. So r is a vector, and you can think of it as a column vector. That's how I drew it. You're adding up a column vector with a v. That's the column vector, the way I drew that. You could do it with rows if you like, but it's just notation. And you're going to compute an r prime here.

What can you say about Dr prime? Someone? Yeah. Go ahead.

AUDIENCE: It's not 0.

SRINIVAS DEVADAS: It's not 0. And I'll give you a Frisbee, but then you can explain-- can you stand up a little? I don't want to take this lady's head off. So can you explain why?
AUDIENCE: Because r prime is r plus v, and Dr prime gives you Dr plus Dv, which is not 0.

SRINIVAS DEVADAS: Absolutely right. So essentially what we have is, this is simply D times r plus v, which is 0 plus Dv, not equal to 0. Do we like yellow, or do we like white? Yellow's fine.

So that's pretty much it. So what's cool about this is this final step, which I think you've gotten, but I'm just going to say it out loud now, which is that r to r prime is 1 to 1, for any given r such that Dr equals 0, given the situation where capital D is not equal to 0 and there's some Dij-- and there could be many Dij's; I just need one. I've constructed, based on that Dij, this v vector, which has the jth entry corresponding to the v vector being a 1, with all of the other entries being a 0. But I can now create an r to r prime that is 1 to 1, in the sense that if r prime equals r plus v and that equals r double prime plus v-- so if I ever have a situation where-- in order to show this is one-to-one, I want to say that it's not many-to-one or even two-to-one. So if I have an r prime that equals r plus v, and you tell me that r prime also equals r double prime plus v, I can make the argument that r and r double prime are exactly the same. So then r equals r double prime.

So what am I saying there? I'm just saying that for any given r that has Dr equals 0, I can twiddle the jth element of that r and go from 0 to 1, or 1 to 0. If you tell me that there's a Dij somewhere in that matrix that is nonzero, and I do that little twiddle-- remember, it's all 0's and 1's, Boolean matrices-- so if I do one twiddle, it's one-to-one. If I do two twiddles and I go 1 to 0, I'm back to 1 again. And that's all this says. Because you have mod 2. That's all that says. So one little tweak-- and I'm going to be able to take a bad r and turn it into a good r, because the good r, the r prime in this case, had Dr prime not equal to 0. And that's it.
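To see the counting argument numerically, here is a small brute-force check in Python-- an illustration, not from the lecture: fix a nonzero D over GF(2), enumerate all 2 raised to n binary vectors r, and count how many have Dr not equal to 0. The r to r plus v pairing above guarantees that this count is at least half of the total.

```python
from itertools import product

def fraction_detecting(D):
    # Fraction of binary vectors r with Dr != 0 (mod 2), found by brute force.
    n = len(D)
    detecting = 0
    for r in product([0, 1], repeat=n):
        Dr = [sum(D[i][j] * r[j] for j in range(n)) % 2 for i in range(n)]
        if any(Dr):              # this r would expose the incorrect product
            detecting += 1
    return detecting / 2 ** n

# Example: D has a single nonzero entry; exactly half of all r's detect it.
D = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
print(fraction_detecting(D))     # prints 0.5
```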
817 00:45:03,780 --> 00:45:06,990 That's my counting argument, and all that remains 818 00:45:06,990 --> 00:45:11,970 is to essentially close this by saying-- just 819 00:45:11,970 --> 00:45:17,230 to write this out to get to the final claim 820 00:45:17,230 --> 00:45:20,080 and get the one half-- the one-to-one essentially 821 00:45:20,080 --> 00:45:21,960 gives you the one half. 822 00:45:21,960 --> 00:45:24,630 At least half of these things are going to be good r's. 823 00:45:33,110 --> 00:45:36,810 If you have a D that's not equal to 0-- 824 00:45:36,810 --> 00:45:40,180 and that's the case that you have here-- then for every r such that Dr equals 0, we're 825 00:45:40,180 --> 00:45:46,050 going to discover an r prime such 826 00:45:46,050 --> 00:45:52,010 that the Dr prime is not equal to 0 and r to r 827 00:45:52,010 --> 00:45:54,350 prime is a one-to-one mapping. 828 00:46:00,170 --> 00:46:09,340 So the number of r prime for which 829 00:46:09,340 --> 00:46:12,990 Dr prime is not equal to 0 is greater than 830 00:46:12,990 --> 00:46:24,570 or equal to the number of r for which Dr equals zero. 831 00:46:24,570 --> 00:46:30,430 And so that implies that the probability of Dr 832 00:46:30,430 --> 00:46:33,190 not equal to zero-- so if you just choose an r, 833 00:46:33,190 --> 00:46:37,560 this is now a randomly chosen r. 834 00:46:37,560 --> 00:46:40,600 Not that others weren't, but I'm treating it 835 00:46:40,600 --> 00:46:41,850 a little bit differently here. 836 00:46:41,850 --> 00:46:46,210 This was a specific r for which Dr was equal to 0. 837 00:46:46,210 --> 00:46:48,100 I made an argument that you can always 838 00:46:48,100 --> 00:46:51,480 get this r prime one-to-one such that Dr prime is not 839 00:46:51,480 --> 00:46:52,480 equal to 0. 840 00:46:52,480 --> 00:46:55,480 And now going back to what I had initially with respect 841 00:46:55,480 --> 00:47:00,870 to the claim here where the r here was a randomly chosen r, 842 00:47:00,870 --> 00:47:04,530 I'm saying, thanks to this little argument-- this line up 843 00:47:04,530 --> 00:47:07,400 top-- I'm going to be able to say this is greater than 844 00:47:07,400 --> 00:47:08,870 or equal to one half. 845 00:47:08,870 --> 00:47:10,780 OK? 846 00:47:10,780 --> 00:47:11,507 Cool. 847 00:47:11,507 --> 00:47:12,090 Any questions? 848 00:47:12,090 --> 00:47:12,994 Yeah. 849 00:47:12,994 --> 00:47:15,958 AUDIENCE: I think the D times r column-- 850 00:47:15,958 --> 00:47:17,770 the last column on the board. 851 00:47:17,770 --> 00:47:18,728 SRINIVAS DEVADAS: Yeah. 852 00:47:18,728 --> 00:47:21,322 AUDIENCE: On the last column, it should be i, not j. 853 00:47:21,322 --> 00:47:22,780 SRINIVAS DEVADAS: This should be i? 854 00:47:22,780 --> 00:47:23,363 AUDIENCE: Yes. 855 00:47:26,402 --> 00:47:28,110 SRINIVAS DEVADAS: People agree with that? 856 00:47:28,110 --> 00:47:30,380 Majority vote. 857 00:47:30,380 --> 00:47:31,140 All right. 858 00:47:31,140 --> 00:47:32,590 I'm good. 859 00:47:32,590 --> 00:47:33,844 Let's make that an i. 860 00:47:33,844 --> 00:47:35,996 AUDIENCE: The notation of that as well, Dv sub j. 861 00:47:35,996 --> 00:47:37,120 SRINIVAS DEVADAS: Oh, yeah. 862 00:47:37,120 --> 00:47:38,650 Of course. 863 00:47:38,650 --> 00:47:39,150 Yeah. 864 00:47:39,150 --> 00:47:42,680 Once you do that, you have to have an i there. 865 00:47:42,680 --> 00:47:44,960 Good. 866 00:47:44,960 --> 00:47:50,180 So you're looking at a particular entry-- 867 00:47:50,180 --> 00:47:53,890 it makes a difference whether you used a column or a row.
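A written-out version of the index bookkeeping being discussed (this just restates the argument in symbols; here e_j denotes the 0/1 vector with a single 1 in position j):

\[ v = e_j \;\Rightarrow\; (Dv)_i = \sum_k D_{ik} v_k = D_{ij} \neq 0 \pmod{2}, \]

so Dv is just the jth column of D, and therefore

\[ Dr' = D(r + v) = Dr + Dv = 0 + Dv \neq 0 \pmod{2}. \]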
868 00:47:53,890 --> 00:47:59,370 If I'd done it-- now that I remember-- if you turn this 869 00:47:59,370 --> 00:48:03,390 into a row vector and this becomes a row vector, 870 00:48:03,390 --> 00:48:05,470 you'll essentially get the Dvj. 871 00:48:05,470 --> 00:48:07,704 So it depends on which way you look at it, but thanks 872 00:48:07,704 --> 00:48:08,620 for pointing that out. 873 00:48:11,000 --> 00:48:15,820 The specifics of i and j weren't particularly 874 00:48:15,820 --> 00:48:19,030 important to the proof itself. 875 00:48:19,030 --> 00:48:23,880 The key thing is you zoom in on a particular entry that is not 876 00:48:23,880 --> 00:48:29,530 equal to 0, and then you tweak that entry corresponding 877 00:48:29,530 --> 00:48:30,210 to the r. 878 00:48:30,210 --> 00:48:33,950 So once you tweak that-- you make that 0 to 1 or 1 879 00:48:33,950 --> 00:48:38,830 to 0-- you can get this result. 880 00:48:38,830 --> 00:48:39,330 I'm sorry. 881 00:48:39,330 --> 00:48:40,621 I'm pointing to the wrong spot. 882 00:48:40,621 --> 00:48:44,970 This result-- and then get your claim. 883 00:48:44,970 --> 00:48:45,850 OK? 884 00:48:45,850 --> 00:48:49,220 So to summarize-- we have a bound. 885 00:48:49,220 --> 00:48:53,700 We run it over and over, and we get it to the point 886 00:48:53,700 --> 00:48:59,090 where we can have a 0.0001 probability that, 887 00:48:59,090 --> 00:49:02,980 if the matrices were multiplied incorrectly, 888 00:49:02,980 --> 00:49:07,780 you wouldn't discover that-- because you ran it 889 00:49:07,780 --> 00:49:10,000 for enough rounds with independently chosen r's 890 00:49:10,000 --> 00:49:13,130 that that probability becomes as low as you like. 891 00:49:13,130 --> 00:49:14,270 OK? 892 00:49:14,270 --> 00:49:18,240 So that was Monte Carlo. 893 00:49:18,240 --> 00:49:20,790 Let's do a Las Vegas algorithm. 894 00:49:20,790 --> 00:49:23,030 And you guys are probably thinking, my goodness. 895 00:49:23,030 --> 00:49:27,000 Another sorting algorithm after, I don't know, 896 00:49:27,000 --> 00:49:28,480 17 different sorting algorithms. 897 00:49:30,200 --> 00:49:34,231 That's all the sorting algorithms that you've ever learned so far. 898 00:49:34,231 --> 00:49:34,730 Right? 899 00:49:34,730 --> 00:49:38,520 So merge sort doesn't work, and the reason 900 00:49:38,520 --> 00:49:41,230 it doesn't work in practice-- if you're really 901 00:49:41,230 --> 00:49:48,240 into performance-- is because of the auxiliary space that 902 00:49:48,240 --> 00:49:50,070 merge sort requires. 903 00:49:50,070 --> 00:49:54,800 So if you recall, there's the notion of in-place sorting. 904 00:49:54,800 --> 00:49:57,340 So let's move on to the next thing 905 00:49:57,340 --> 00:50:03,039 here, which is quicksort, which is a new sorting algorithm. 906 00:50:03,039 --> 00:50:05,330 And I want to motivate it for just a couple of minutes. 907 00:50:08,990 --> 00:50:10,380 And the primary motivation really 908 00:50:10,380 --> 00:50:16,620 is practical performance, not asymptotic complexity. 909 00:50:16,620 --> 00:50:19,920 So I'll be upfront about that. 910 00:50:19,920 --> 00:50:22,150 It's all about practical performance 911 00:50:22,150 --> 00:50:24,590 corresponding to quicksort. 912 00:50:24,590 --> 00:50:29,360 And quicksort is a divide and conquer randomized 913 00:50:29,360 --> 00:50:33,720 algorithm invented in '62. 914 00:50:33,720 --> 00:50:38,980 Unlike merge sort, it's got two interesting properties. 915 00:50:38,980 --> 00:50:42,190 The first is that it's in place, like I just said.
916 00:50:42,190 --> 00:50:44,555 So no auxiliary space. 917 00:50:44,555 --> 00:50:51,570 In merge sort, you can try and get around this. 918 00:50:51,570 --> 00:50:53,490 I should say order n auxiliary space. 919 00:50:53,490 --> 00:50:58,200 You need a little temporary variable 920 00:50:58,200 --> 00:51:00,410 in order to do a swapping. 921 00:51:00,410 --> 00:51:05,170 But you don't have the order n auxiliary space. 922 00:51:05,170 --> 00:51:07,012 So you don't have to constantly allocate. 923 00:51:07,012 --> 00:51:08,303 And remember, n could be large. 924 00:51:08,303 --> 00:51:10,340 It could be in the billions or trillions. 925 00:51:10,340 --> 00:51:12,980 So, from that standpoint, quicksort 926 00:51:12,980 --> 00:51:17,130 ends up winning simply because of relatively mundane things 927 00:51:17,130 --> 00:51:22,450 like memory allocation in your computer. 928 00:51:22,450 --> 00:51:26,430 And the other interesting thing about quicksort in relation 929 00:51:26,430 --> 00:51:33,010 to merge sort is that all the work is in the divide step. 930 00:51:39,180 --> 00:51:42,720 So in merge sort, remember we just split, and we recurse. 931 00:51:42,720 --> 00:51:46,050 And what happens when you come back 932 00:51:46,050 --> 00:51:51,040 is you have to do the two-finger merging 933 00:51:51,040 --> 00:51:54,800 algorithm by looking at the two sorted arrays 934 00:51:54,800 --> 00:51:59,820 and looking at what the new merged array is going to look like. 935 00:51:59,820 --> 00:52:01,830 So the work is in the merge. 936 00:52:01,830 --> 00:52:04,990 But in quicksort, the work is going to be in the divide 937 00:52:04,990 --> 00:52:09,860 because we're going to have to do a bunch of work associated 938 00:52:09,860 --> 00:52:15,070 with figuring out how to keep the partitions balanced-- 939 00:52:15,070 --> 00:52:20,090 a little bit like we had to do when we did median finding back 940 00:52:20,090 --> 00:52:22,150 a couple of weeks ago. 941 00:52:22,150 --> 00:52:24,150 I'm going to talk about three different variants 942 00:52:24,150 --> 00:52:26,110 of quicksort. 943 00:52:26,110 --> 00:52:29,470 The variant that we're going to spend the most time on 944 00:52:29,470 --> 00:52:32,810 is the Las Vegas quicksort where we'd 945 00:52:32,810 --> 00:52:35,430 like to show that it's probably fast 946 00:52:35,430 --> 00:52:40,590 and make a statement about the expected runtime. 947 00:52:40,590 --> 00:52:43,590 But we'll get to that by talking about a couple 948 00:52:43,590 --> 00:52:47,110 of other interesting variants, and this'll 949 00:52:47,110 --> 00:52:53,310 be elaborated on to some extent in section tomorrow. 950 00:52:53,310 --> 00:52:56,230 So before we get to the variants, of course, 951 00:52:56,230 --> 00:52:58,840 let's try and set up the structure corresponding 952 00:52:58,840 --> 00:53:00,500 to quicksort. 953 00:53:00,500 --> 00:53:09,150 And as always, we have an n element array A. 954 00:53:09,150 --> 00:53:22,210 You have divide that corresponds to picking a pivot element, x 955 00:53:22,210 --> 00:53:30,780 in A. And then we're going to partition 956 00:53:30,780 --> 00:53:34,550 the array into sub-arrays. 957 00:53:39,240 --> 00:53:43,240 And what we have here-- this little picture 958 00:53:43,240 --> 00:53:46,080 should make things clearer. 959 00:53:46,080 --> 00:53:49,330 And you kind of saw this in the median finding, 960 00:53:49,330 --> 00:53:50,600 but here we go again. 961 00:53:50,600 --> 00:53:53,410 Let's assume all the array elements are unique.
962 00:53:53,410 --> 00:53:58,840 We have L, E, and G. L is less than, E is equal, 963 00:53:58,840 --> 00:54:00,560 and G is greater than. 964 00:54:00,560 --> 00:54:03,460 And so your pivot element is going to break this array up 965 00:54:03,460 --> 00:54:06,610 into L and G, where you got all the elements that 966 00:54:06,610 --> 00:54:08,770 are less on the left and all the elements that 967 00:54:08,770 --> 00:54:10,100 are greater on the right. 968 00:54:10,100 --> 00:54:15,040 And you're going to recurse on the L and G. 969 00:54:15,040 --> 00:54:30,950 So recursively sort sub-arrays L and G. Combine is trivial-- 970 00:54:30,950 --> 00:54:33,820 or merge is trivial-- because you've already broken things 971 00:54:33,820 --> 00:54:35,330 up thanks to the pivoting. 972 00:54:35,330 --> 00:54:37,650 And you just concatenate those arrays. 973 00:54:37,650 --> 00:54:39,900 And that's why you can do this in place. 974 00:54:39,900 --> 00:54:41,260 There's no issues. 975 00:54:41,260 --> 00:54:44,890 You're really recursively sorting sub-arrays. 976 00:54:44,890 --> 00:54:46,920 You are moving things around a little bit 977 00:54:46,920 --> 00:54:48,380 when you do the partition. 978 00:54:48,380 --> 00:54:51,460 Obviously, the initial array may have all the elements. 979 00:54:51,460 --> 00:54:53,210 You may pick the pivot such that the pivot 980 00:54:53,210 --> 00:54:56,080 is all the way on the right-hand side in the sense 981 00:54:56,080 --> 00:54:57,760 that it's a very large element. 982 00:54:57,760 --> 00:54:59,900 That is not necessarily a good thing. 983 00:54:59,900 --> 00:55:01,250 I will talk about that. 984 00:55:01,250 --> 00:55:05,290 But if you pick an interesting pivot or a good pivot, 985 00:55:05,290 --> 00:55:08,860 you're going to have to move the elements in the array 986 00:55:08,860 --> 00:55:11,610 to the left of the pivot if they're less than the pivot, 987 00:55:11,610 --> 00:55:13,840 and you've got to move the elements to the right 988 00:55:13,840 --> 00:55:15,870 if they're greater than the pivot. 989 00:55:15,870 --> 00:55:18,470 Nontrivial piece of code, not super 990 00:55:18,470 --> 00:55:24,440 complicated, but you can look at the CLRS page 171 991 00:55:24,440 --> 00:55:28,740 to look at in-place partitioning where 992 00:55:28,740 --> 00:55:33,660 you don't have to use another order n space to move 993 00:55:33,660 --> 00:55:37,010 these elements around such that they look like that picture 994 00:55:37,010 --> 00:55:40,710 that I have up there, starting from some random starting 995 00:55:40,710 --> 00:55:41,700 point. 996 00:55:41,700 --> 00:55:45,580 So you want to have the picture that you have here, 997 00:55:45,580 --> 00:55:50,800 and you need to get there from the very same array-- where 998 00:55:50,800 --> 00:55:55,950 x is somewhere here, and you've got x plus 1 here and x minus 1 999 00:55:55,950 --> 00:55:57,300 here, for example. 1000 00:55:57,300 --> 00:55:58,990 And you need to move those things around 1001 00:55:58,990 --> 00:56:04,340 so they look like L, E, and G, and that's something 1002 00:56:04,340 --> 00:56:05,650 that you can do in place. 1003 00:56:05,650 --> 00:56:08,600 And you can look at the code for that in the CLRS. 1004 00:56:08,600 --> 00:56:11,076 I won't cover that here. 1005 00:56:11,076 --> 00:56:13,450 So let's look at a bunch of different variants corresponding 1006 00:56:13,450 --> 00:56:15,170 to quicksort. 1007 00:56:15,170 --> 00:56:18,030 And there's some real simple ones.
1008 00:56:18,030 --> 00:56:23,880 Each of these, we can knock off with respect to complexity 1009 00:56:23,880 --> 00:56:28,450 and runtime fairly easily with the one exception 1010 00:56:28,450 --> 00:56:30,080 that we'll spend some time on, which 1011 00:56:30,080 --> 00:56:32,440 is the Las Vegas quicksort. 1012 00:56:32,440 --> 00:56:35,950 But we'll call these by different names. 1013 00:56:35,950 --> 00:56:38,360 Let's talk about the basic quicksort, 1014 00:56:38,360 --> 00:56:44,220 which is also a useful algorithm that people use in practice. 1015 00:56:44,220 --> 00:56:50,100 And amazingly, this algorithm is simply something that says, 1016 00:56:50,100 --> 00:56:55,060 I'm just going to constantly pivot on either the first entry 1017 00:56:55,060 --> 00:56:55,960 or the last entry. 1018 00:56:55,960 --> 00:56:58,140 So I'm going to pick my pivot to be A1. 1019 00:56:58,140 --> 00:57:01,060 And when I pick my pivot to be A1, 1020 00:57:01,060 --> 00:57:03,020 it's a value that I'm talking about here. 1021 00:57:03,020 --> 00:57:04,470 X is a value. 1022 00:57:04,470 --> 00:57:06,090 It's not an index. 1023 00:57:06,090 --> 00:57:09,110 The A1 value-- maybe that's 75. 1024 00:57:09,110 --> 00:57:12,110 Then I'm going to create my L array corresponding 1025 00:57:12,110 --> 00:57:15,070 to this pivot where all the entries are strictly 1026 00:57:15,070 --> 00:57:20,450 less than 75, and G would be strictly greater than 75. 1027 00:57:20,450 --> 00:57:21,670 And I could do that for A1. 1028 00:57:21,670 --> 00:57:22,962 I could do that for An. 1029 00:57:22,962 --> 00:57:24,545 So remember that the pivot is a value. 1030 00:57:28,140 --> 00:57:31,570 Now, if I look at this, I'm going 1031 00:57:31,570 --> 00:57:39,180 to do the partition, given x, just like you saw there. 1032 00:57:39,180 --> 00:57:42,280 And this is going to be done in order n time. 1033 00:57:42,280 --> 00:57:44,890 It makes sense that you're going to look at every element, 1034 00:57:44,890 --> 00:57:47,390 and you're going to move it to an appropriate location 1035 00:57:47,390 --> 00:57:52,020 to the left of x, which is the L array, or to the right. 1036 00:57:52,020 --> 00:57:53,110 And you'll do that. 1037 00:57:53,110 --> 00:57:55,200 That takes order n time. 1038 00:57:55,200 --> 00:57:58,250 And, as I mentioned, you can look at this 1039 00:57:58,250 --> 00:58:04,280 to see how this is done in place. 1040 00:58:04,280 --> 00:58:10,020 So let's take a look at the analysis of basic quicksort, 1041 00:58:10,020 --> 00:58:12,680 and what I'm interested in, of course, is the worst case 1042 00:58:12,680 --> 00:58:14,010 analysis. 1043 00:58:14,010 --> 00:58:16,900 And I asked this question, I think, 1044 00:58:16,900 --> 00:58:20,910 before when we were doing median finding, 1045 00:58:20,910 --> 00:58:24,930 but what is the worst case complexity 1046 00:58:24,930 --> 00:58:30,800 of the basic quicksort algorithm that chooses the pivot as A1? 1047 00:58:30,800 --> 00:58:32,022 What is the complexity? 1048 00:58:32,022 --> 00:58:33,870 [INTERPOSING VOICES] 1049 00:58:33,870 --> 00:58:35,160 Order n squared. 1050 00:58:35,160 --> 00:58:36,700 It's order n squared. 1051 00:58:36,700 --> 00:58:40,130 And the reason for that is that you 1052 00:58:40,130 --> 00:58:46,680 may have an array that is sorted or reverse sorted-- 1053 00:58:46,680 --> 00:58:50,670 depending on whether you're picking A1 or An.
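Before the worst-case analysis, here is a minimal sketch of the basic variant just described-- pivot on the value of the first entry-- written out-of-place for clarity (the in-place partition the lecture points to is in CLRS):

    def basic_quicksort(A):
        # Basic quicksort: always pivot on the value of the first entry.
        # This sketch builds L, E, and G as new lists, so it is not the
        # in-place version from lecture; it only shows the structure.
        if len(A) <= 1:
            return A
        x = A[0]                          # the pivot is a value, not an index
        L = [a for a in A if a < x]       # strictly less than the pivot
        E = [a for a in A if a == x]      # equal to the pivot
        G = [a for a in A if a > x]       # strictly greater than the pivot
        return basic_quicksort(L) + E + basic_quicksort(G)

On an already sorted input, L is empty at every level and the recursion goes n levels deep, which is exactly the bad case discussed next.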
1054 00:58:50,670 --> 00:58:53,510 You can have a worst case situation where 1055 00:58:53,510 --> 00:59:02,710 one side, L or G, has n minus 1 elements, 1056 00:59:02,710 --> 00:59:06,920 and the other has 0 elements. 1057 00:59:06,920 --> 00:59:10,240 And so if you look at our recurrence associated 1058 00:59:10,240 --> 00:59:14,010 with this, you could have T(n), which is T(0) plus T(n 1059 00:59:14,010 --> 00:59:18,040 minus 1) plus theta n. 1060 00:59:18,040 --> 00:59:20,160 And why do I have a theta n here? 1061 00:59:20,160 --> 00:59:22,800 Well, remember that I still have to do this divide 1062 00:59:22,800 --> 00:59:26,610 step or this partition step in order to compute 1063 00:59:26,610 --> 00:59:28,390 this unbalanced partition. 1064 00:59:28,390 --> 00:59:31,190 So I do have to look at each of these elements 1065 00:59:31,190 --> 00:59:33,190 and do the comparison. 1066 00:59:33,190 --> 00:59:35,800 And maybe I don't actually have to move them, 1067 00:59:35,800 --> 00:59:39,140 but I have to do the comparison with the A1, which 1068 00:59:39,140 --> 00:59:39,790 is the x pivot. 1069 00:59:42,380 --> 00:59:46,010 And in some cases, if I'm doing the wrong thing-- say the input is reverse sorted-- 1070 00:59:46,010 --> 00:59:47,630 I also have to do the move. 1071 00:59:47,630 --> 00:59:50,600 Either way, I have a theta n complexity associated 1072 00:59:50,600 --> 00:59:53,040 with the divide step. 1073 00:59:53,040 --> 00:59:57,890 And so if you go off and you look at what happens with this, 1074 00:59:57,890 --> 01:00:02,850 well, you've got T(n) equals T(n minus 1) 1075 01:00:02,850 --> 01:00:09,300 plus theta n, which ends up with theta n squared complexity. 1076 01:00:09,300 --> 01:00:13,440 So I hand waved a little bit two weeks ago 1077 01:00:13,440 --> 01:00:15,870 for a similar analysis, but you can kind of 1078 01:00:15,870 --> 01:00:18,410 look at it a little more precisely 1079 01:00:18,410 --> 01:00:21,630 here by writing the actual recurrence out. 1080 01:00:21,630 --> 01:00:24,450 And you see that you get the recurrence T(n) equals T(n minus 1) 1081 01:00:24,450 --> 01:00:29,520 plus theta n, which is an n squared result, 1082 01:00:29,520 --> 01:00:32,080 or the solution is n squared. 1083 01:00:32,080 --> 01:00:33,010 OK? 1084 01:00:33,010 --> 01:00:36,100 So basic quicksort looks bad. 1085 01:00:36,100 --> 01:00:38,940 It's got a worst case complexity of theta n squared. 1086 01:00:38,940 --> 01:00:41,130 It works well on random inputs in practice. 1087 01:00:41,130 --> 01:00:45,060 And it turns out that it's such a popular algorithm, 1088 01:00:45,060 --> 01:00:48,000 partly because it's in place and it's easy to code, 1089 01:00:48,000 --> 01:00:52,230 that what people do is they take their inputs, 1090 01:00:52,230 --> 01:00:55,700 and they shuffle them. 1091 01:00:55,700 --> 01:00:59,130 You might get a bad input, and it might take you a long time 1092 01:00:59,130 --> 01:00:59,794 to run. 1093 01:00:59,794 --> 01:01:01,460 But if you take an input and you shuffle 1094 01:01:01,460 --> 01:01:04,325 it and you do that in theta n time, 1095 01:01:04,325 --> 01:01:07,000 you just move things around and randomize the input. 1096 01:01:07,000 --> 01:01:08,910 Then effectively, you have a random input, 1097 01:01:08,910 --> 01:01:12,240 and this thing works pretty well in practice. 1098 01:01:12,240 --> 01:01:13,760 Now, what is pretty well?
1099 01:01:13,760 --> 01:01:15,580 Well, we're going to do an analysis that 1100 01:01:15,580 --> 01:01:19,990 is going to not be exactly the analysis that you'd 1101 01:01:19,990 --> 01:01:24,220 have to do on basic quicksort on random inputs, 1102 01:01:24,220 --> 01:01:27,400 but essentially, you can say that basic quicksort 1103 01:01:27,400 --> 01:01:33,060 on random inputs is going to run in expected theta n log n time. 1104 01:01:33,060 --> 01:01:34,040 OK? 1105 01:01:34,040 --> 01:01:39,670 It's something where you'll see a little bit of how 1106 01:01:39,670 --> 01:01:43,730 to do that today and in section-- perhaps for median finding 1107 01:01:43,730 --> 01:01:45,140 in section tomorrow. 1108 01:01:45,140 --> 01:01:49,390 But that's all I wanted to say about basic quicksort. 1109 01:01:49,390 --> 01:01:51,200 It's a practical algorithm. 1110 01:01:51,200 --> 01:01:54,320 It does require a little bit of shuffling up at the beginning, 1111 01:01:54,320 --> 01:01:57,610 and then you can simply use the pivot A1. 1112 01:01:57,610 --> 01:01:59,610 And because you've done the shuffle, 1113 01:01:59,610 --> 01:02:01,690 generally you get balanced partitions. 1114 01:02:01,690 --> 01:02:05,110 The L and G's look balanced, and you don't end up 1115 01:02:05,110 --> 01:02:06,410 with theta n squared. 1116 01:02:06,410 --> 01:02:09,380 If you have any sort of balance associated with the two 1117 01:02:09,380 --> 01:02:12,660 partitions L and G, you're going to get a nice divide 1118 01:02:12,660 --> 01:02:15,630 and conquer, which is going to give you your theta n log n. 1119 01:02:15,630 --> 01:02:16,990 OK? 1120 01:02:16,990 --> 01:02:20,050 So that's basic quicksort. 1121 01:02:20,050 --> 01:02:23,520 There's another way to do this, and so this 1122 01:02:23,520 --> 01:02:25,060 is a question for you guys. 1123 01:02:25,060 --> 01:02:28,770 Suppose I wanted to use the quicksort strategy 1124 01:02:28,770 --> 01:02:35,420 and get a worst case theta n log n through an intelligent pivot 1125 01:02:35,420 --> 01:02:36,860 selection. 1126 01:02:36,860 --> 01:02:41,370 So I want to do a pivot selection intelligently. 1127 01:02:47,280 --> 01:02:51,680 So how would I get under the structure of quicksort 1128 01:02:51,680 --> 01:02:55,680 that you see up there on the left there? 1129 01:02:55,680 --> 01:03:00,910 How would I select a pivot such that I get worst case theta 1130 01:03:00,910 --> 01:03:02,710 n log n complexity? 1131 01:03:06,574 --> 01:03:08,023 Go ahead. 1132 01:03:08,023 --> 01:03:09,480 AUDIENCE: Linear median finding. 1133 01:03:09,480 --> 01:03:10,700 SRINIVAS DEVADAS: Linear median finding. 1134 01:03:10,700 --> 01:03:11,245 Perfect. 1135 01:03:11,245 --> 01:03:12,120 That's exactly right. 1136 01:03:15,000 --> 01:03:17,375 There's a gentleman at the back who'd raised his hand, 1137 01:03:17,375 --> 01:03:18,625 and I decided I'd chicken out. 1138 01:03:21,280 --> 01:03:25,669 I think one time to the back of the room is enough for a day. 1139 01:03:25,669 --> 01:03:26,710 I'll have a Frisbee left. 1140 01:03:26,710 --> 01:03:30,190 Hopefully you can get one. 1141 01:03:30,190 --> 01:03:33,690 So the intelligent pivot selection algorithm 1142 01:03:33,690 --> 01:03:36,250 is the median finding algorithm because that's 1143 01:03:36,250 --> 01:03:38,890 going to guarantee me that I'm going 1144 01:03:38,890 --> 01:03:40,410 to get balanced partitions.
1145 01:03:40,410 --> 01:03:43,080 If you tell me that A1-- and remember, 1146 01:03:43,080 --> 01:03:45,080 we're talking medians of values-- 1147 01:03:45,080 --> 01:03:46,650 so don't get confused with indices. 1148 01:03:46,650 --> 01:03:48,380 When I say something is a median, 1149 01:03:48,380 --> 01:03:52,860 I'm talking about the value-- that, given its value, 1150 01:03:52,860 --> 01:03:56,270 there are all these other n over 2 values that are less 1151 01:03:56,270 --> 01:03:59,290 than it, roughly speaking, and n over 2 values that 1152 01:03:59,290 --> 01:04:01,630 are greater than it. 1153 01:04:01,630 --> 01:04:06,740 And so A1, I have no idea whether it's large or small. 1154 01:04:06,740 --> 01:04:08,280 So I couldn't say much about it. 1155 01:04:08,280 --> 01:04:09,820 But if I want to be worst case and I 1156 01:04:09,820 --> 01:04:14,150 want to guarantee that I have balanced partitions, 1157 01:04:14,150 --> 01:04:15,570 I can choose the median. 1158 01:04:15,570 --> 01:04:17,500 And if I choose the median every time, 1159 01:04:17,500 --> 01:04:19,797 I'm going to get perfectly balanced partitions. 1160 01:04:19,797 --> 01:04:22,130 They're going to be half on the left and half on the right. 1161 01:04:24,630 --> 01:04:29,180 And we do know a way of getting balanced partitions. 1162 01:04:29,180 --> 01:04:44,470 We can guarantee balanced L and G using median selection 1163 01:04:44,470 --> 01:04:46,800 that runs in theta n time. 1164 01:04:46,800 --> 01:04:48,960 And we showed that a couple of weeks ago. 1165 01:04:51,570 --> 01:04:54,330 Now, that median selection algorithm was nontrivial. 1166 01:04:54,330 --> 01:04:55,710 OK? 1167 01:04:55,710 --> 01:04:58,610 It had this weird thing where you broke things up 1168 01:04:58,610 --> 01:05:03,750 into sub-arrays of size 5, and you 1169 01:05:03,750 --> 01:05:06,370 found a median of medians, et cetera, et cetera. 1170 01:05:06,370 --> 01:05:08,690 But we argued that the whole thing ran in theta n 1171 01:05:08,690 --> 01:05:10,860 time, which is important. 1172 01:05:10,860 --> 01:05:14,550 And so now, if you look at what happens with quicksort 1173 01:05:14,550 --> 01:05:17,330 and if I write the recurrence for quicksort, 1174 01:05:17,330 --> 01:05:20,520 thanks to selecting a median, I effectively 1175 01:05:20,520 --> 01:05:21,850 have balanced partitions. 1176 01:05:21,850 --> 01:05:23,700 So I have 2T(n over 2). 1177 01:05:23,700 --> 01:05:33,860 This is thanks to the median based pivoting. 1178 01:05:33,860 --> 01:05:36,150 That's important. 1179 01:05:36,150 --> 01:05:38,040 Otherwise it won't work. 1180 01:05:38,040 --> 01:05:46,460 And then, just to be very clear here, I got two theta n terms. 1181 01:05:46,460 --> 01:05:47,740 OK? 1182 01:05:47,740 --> 01:05:53,485 The first theta n term is the recursive median selection. 1183 01:05:59,440 --> 01:06:01,910 And then the second theta n term is of course 1184 01:06:01,910 --> 01:06:02,955 the divide, or partition. 1185 01:06:06,860 --> 01:06:09,970 But it's important to realize that now I have 1186 01:06:09,970 --> 01:06:12,390 a lot of work in the divide. 1187 01:06:12,390 --> 01:06:12,999 A lot of work. 1188 01:06:12,999 --> 01:06:14,790 I have to do an intelligent selection using 1189 01:06:14,790 --> 01:06:17,380 this recursive median finding algorithm. 1190 01:06:17,380 --> 01:06:20,760 And I also have to do the comparisons and the moves to generate 1191 01:06:20,760 --> 01:06:22,610 the L and the G arrays. 1192 01:06:22,610 --> 01:06:23,110 OK?
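Written out, the recurrence just described for the median-pivoted quicksort is

\[ T(n) = 2\,T(n/2) + \underbrace{\Theta(n)}_{\text{median selection}} + \underbrace{\Theta(n)}_{\text{partition}} = \Theta(n \log n). \]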
1193 01:06:23,110 --> 01:06:24,401 So those are the two theta n's. 1194 01:06:24,401 --> 01:06:26,770 They're obviously theta n, but I wanted to make it clear 1195 01:06:26,770 --> 01:06:28,830 that there's two things going on here. 1196 01:06:28,830 --> 01:06:31,820 And we all know that that is theta n log n worst case. 1197 01:06:31,820 --> 01:06:32,540 All right? 1198 01:06:32,540 --> 01:06:37,060 So there is a way of using the quicksort structure template 1199 01:06:37,060 --> 01:06:40,640 and getting a theta n log n worst case algorithm, which 1200 01:06:40,640 --> 01:06:45,310 doesn't work in practice because it's just too complicated. 1201 01:06:45,310 --> 01:06:50,400 What's going on here is at every level of recursion, 1202 01:06:50,400 --> 01:06:55,630 you're calling another recursive algorithm to find the median. 1203 01:06:55,630 --> 01:06:58,170 So if you go code this up, it loses 1204 01:06:58,170 --> 01:07:00,170 to merge sort in practice. 1205 01:07:00,170 --> 01:07:02,450 You can do all of this in place, but because 1206 01:07:02,450 --> 01:07:06,750 of all these recursive calls, it doesn't work well in practice. 1207 01:07:06,750 --> 01:07:07,750 But it's good to know. 1208 01:07:07,750 --> 01:07:10,560 And so this is a good example I think, 1209 01:07:10,560 --> 01:07:13,010 which we don't do a lot of in 046, 1210 01:07:13,010 --> 01:07:17,150 but you get a sense of the difference between asymptotic 1211 01:07:17,150 --> 01:07:19,640 complexity and performance. 1212 01:07:19,640 --> 01:07:24,590 So while the median finding algorithm has better asymptotic 1213 01:07:24,590 --> 01:07:28,360 complexity worst case, it really loses in practice 1214 01:07:28,360 --> 01:07:31,390 to the basic quicksort, which essentially 1215 01:07:31,390 --> 01:07:33,720 is a bit of a hack, where you take an input 1216 01:07:33,720 --> 01:07:36,500 and you randomize the input and you 1217 01:07:36,500 --> 01:07:39,830 run it with A1 as the pivot or An as the pivot. 1218 01:07:44,250 --> 01:07:47,830 Is there a different way that you can actually 1219 01:07:47,830 --> 01:07:51,220 get to a Las Vegas algorithm? 1220 01:07:51,220 --> 01:07:53,290 And it turns out randomized quicksort 1221 01:07:53,290 --> 01:07:57,590 is something that you can build and use, 1222 01:07:57,590 --> 01:07:59,710 which is a bit different from basic quicksort 1223 01:07:59,710 --> 01:08:02,744 and certainly different from median finding. 1224 01:08:02,744 --> 01:08:04,910 But it kind of has a little bit in common with them, 1225 01:08:04,910 --> 01:08:08,960 and it's our example of a Las Vegas algorithm. 1226 01:08:08,960 --> 01:08:11,510 So what happens in randomized quicksort? 1227 01:08:11,510 --> 01:08:18,220 An x is chosen at random from the array, A. 1228 01:08:18,220 --> 01:08:21,359 So you're not choosing A1 or An. 1229 01:08:21,359 --> 01:08:25,290 You might just roll-- well, effectively an n-sided 1230 01:08:25,290 --> 01:08:29,080 die-- and pick a particular index, 1231 01:08:29,080 --> 01:08:31,560 and then go grab the pivot corresponding 1232 01:08:31,560 --> 01:08:35,000 to the value at that index. 1233 01:08:35,000 --> 01:08:37,100 You're not going to randomize over values. 1234 01:08:37,100 --> 01:08:38,990 You don't know what these values are, 1235 01:08:38,990 --> 01:08:42,380 but you can pick a random index and then 1236 01:08:42,380 --> 01:08:45,899 grab the pivot based on the value at that index. 1237 01:08:45,899 --> 01:08:48,920 And so at each recursion, a random choice is made.
1238 01:08:48,920 --> 01:08:51,524 And the expected time-- so now we're 1239 01:08:51,524 --> 01:08:52,649 saying something different. 1240 01:08:52,649 --> 01:08:56,740 We're making a stronger theoretical statement 1241 01:08:56,740 --> 01:09:03,510 that the expected time, when you do this, for all input 1242 01:09:03,510 --> 01:09:07,229 arrays A is order n log n. 1243 01:09:07,229 --> 01:09:09,990 And so now, this is not worst case time. 1244 01:09:09,990 --> 01:09:11,649 It's expected time. 1245 01:09:11,649 --> 01:09:14,640 So this is going to be our analysis 1246 01:09:14,640 --> 01:09:19,430 in the last few minutes here to analyze 1247 01:09:19,430 --> 01:09:24,080 not randomized quicksort, but a slight variant 1248 01:09:24,080 --> 01:09:27,210 of randomized quicksort that is going to show you 1249 01:09:27,210 --> 01:09:31,399 that you can run randomized quicksort and this variant 1250 01:09:31,399 --> 01:09:33,640 in expected order n log n time. 1251 01:09:33,640 --> 01:09:37,420 So not quite sure what's going to happen in section tomorrow, 1252 01:09:37,420 --> 01:09:41,580 but the full analysis is in the book. 1253 01:09:41,580 --> 01:09:44,399 You should read it. 1254 01:09:44,399 --> 01:09:47,520 As you can see, it's a couple of pages 1255 01:09:47,520 --> 01:09:50,609 that includes the description of a quicksort 1256 01:09:50,609 --> 01:09:52,340 that I have already. 1257 01:09:52,340 --> 01:09:57,564 But what we're going to do here is analyze a variant of quicksort, 1258 01:09:57,564 --> 01:09:59,230 which is a little bit easier to analyze, 1259 01:09:59,230 --> 01:10:02,410 and it gives you the sense of why 1260 01:10:02,410 --> 01:10:04,900 in fact the randomized quicksort is 1261 01:10:04,900 --> 01:10:06,480 going to run in expected n log n time. 1262 01:10:06,480 --> 01:10:09,790 And this analysis is easy to do in a few minutes. 1263 01:10:09,790 --> 01:10:12,290 So we'll do that. 1264 01:10:12,290 --> 01:10:15,900 And tomorrow, you'll see either a median finding analysis 1265 01:10:15,900 --> 01:10:21,070 that's similar to that analysis in CLRS or precisely 1266 01:10:21,070 --> 01:10:26,080 that analysis, depending on what your TAs want to do. 1267 01:10:26,080 --> 01:10:27,860 So this particular variant, we're 1268 01:10:27,860 --> 01:10:31,740 going to call paranoid quicksort. 1269 01:10:31,740 --> 01:10:36,120 And so this quicksort is paranoid in the sense 1270 01:10:36,120 --> 01:10:42,660 that it's going to be afraid of getting unbalanced partitions, 1271 01:10:42,660 --> 01:10:46,350 and it's going to keep trying to get balanced partitions. 1272 01:10:46,350 --> 01:10:49,690 So it's going to try to get a balanced partition. 1273 01:10:49,690 --> 01:10:51,600 It's going to check, and then if it fails, 1274 01:10:51,600 --> 01:10:52,990 it's going to try again. 1275 01:10:52,990 --> 01:10:56,000 And so at the end of it, there's obviously 1276 01:10:56,000 --> 01:11:00,470 an expectation associated with the number of tries 1277 01:11:00,470 --> 01:11:02,990 that you need in order to get a balanced partition. 1278 01:11:02,990 --> 01:11:05,790 But it just sort of flips the problem on its head and says, 1279 01:11:05,790 --> 01:11:07,540 you know what? 1280 01:11:07,540 --> 01:11:09,890 I'm just going to guarantee a balanced partition 1281 01:11:09,890 --> 01:11:12,150 from a probabilistic standpoint and it 1282 01:11:12,150 --> 01:11:14,870 might take me a little bit longer to get there. 1283 01:11:14,870 --> 01:11:17,750 But that's what Las Vegas algorithms are all about.
1284 01:11:17,750 --> 01:11:19,620 They're probably fast. 1285 01:11:19,620 --> 01:11:22,350 And once I get a balanced partition, I'm in good shape 1286 01:11:22,350 --> 01:11:26,160 because I can go do my recursion, and I get my divide 1287 01:11:26,160 --> 01:11:28,530 and conquer working properly. 1288 01:11:28,530 --> 01:11:30,470 So what is paranoid quicksort? 1289 01:11:30,470 --> 01:11:32,300 Absolutely straightforward. 1290 01:11:32,300 --> 01:11:35,410 You could probably guess given my description. 1291 01:11:35,410 --> 01:11:45,940 Let's just choose a pivot to be a random element 1292 01:11:45,940 --> 01:11:58,050 of A. Perform the partition, and then we will repeat. 1293 01:11:58,050 --> 01:12:04,550 So we're going to go off, and we say repeat until the resulting 1294 01:12:04,550 --> 01:12:11,770 partition is such that the cardinality of L 1295 01:12:11,770 --> 01:12:18,415 is less than or equal to 3/4 of the cardinality of A. 1296 01:12:18,415 --> 01:12:22,720 And the cardinality of G is less than or equal to 3/4 1297 01:12:22,720 --> 01:12:24,570 of the cardinality of A. 1298 01:12:24,570 --> 01:12:28,130 So I'm allowing you a certain amount of imbalance, 1299 01:12:28,130 --> 01:12:29,600 but not a lot. 1300 01:12:29,600 --> 01:12:31,040 Right? 1301 01:12:31,040 --> 01:12:31,970 And that's it. 1302 01:12:31,970 --> 01:12:37,600 That's paranoid quicksort. 1303 01:12:37,600 --> 01:12:41,040 You obviously are doing that in each level of the recursion. 1304 01:12:41,040 --> 01:12:43,085 And at each level of the recursion, 1305 01:12:43,085 --> 01:12:48,360 your L and G are going to be, at most, a factor of three apart. 1306 01:12:48,360 --> 01:12:50,620 So you might get 1/4 and 3/4. 1307 01:12:50,620 --> 01:12:53,350 If you're lucky, you'll get 1/2 and 1/2. 1308 01:12:53,350 --> 01:12:56,020 But the worst case, given that you're 1309 01:12:56,020 --> 01:13:00,080 going to be exiting out of this loop, is 1/4 and 3/4. 1310 01:13:00,080 --> 01:13:01,270 OK? 1311 01:13:01,270 --> 01:13:04,980 So, as always, you have a simple algorithm, 1312 01:13:04,980 --> 01:13:10,720 and it's not completely clear how 1313 01:13:10,720 --> 01:13:14,630 you're going to get to expected n log n time. 1314 01:13:14,630 --> 01:13:18,000 But it's not difficult. 1315 01:13:18,000 --> 01:13:20,900 Basically, what we have to do is try and figure out 1316 01:13:20,900 --> 01:13:22,440 what the probability of a good call 1317 01:13:22,440 --> 01:13:27,160 is-- over here, a good pivot choice-- 1318 01:13:27,160 --> 01:13:31,600 and what the probability of a bad pivot choice is. 1319 01:13:31,600 --> 01:13:36,990 And we have to obviously-- given the potential imbalance, 1320 01:13:36,990 --> 01:13:39,570 we have to write the recurrence associated with that, 1321 01:13:39,570 --> 01:13:44,400 but let's take a look at the pivots here. 1322 01:13:44,400 --> 01:13:51,110 And what can we say about the sizes of L and G 1323 01:13:51,110 --> 01:13:53,650 if you just did a random pivot? 1324 01:13:53,650 --> 01:14:03,320 Well, a bad call is when you get an L or a G 1325 01:14:03,320 --> 01:14:06,010 that has less than 1/4 of the elements. 1326 01:14:06,010 --> 01:14:14,500 And a good call is when you get something in between-- 1327 01:14:14,500 --> 01:14:18,580 roughly between 1/4 and 3/4, if you look at the choice of the pivot. 1328 01:14:18,580 --> 01:14:21,320 So what I have up here is the choice of the pivot.
1329 01:14:21,320 --> 01:14:26,290 If my pivot is out here, I have a very small L, 1330 01:14:26,290 --> 01:14:28,380 and all of the thing on the right 1331 01:14:28,380 --> 01:14:32,520 is G. If the pivot is here, I have a relatively small L 1332 01:14:32,520 --> 01:14:36,100 and a large G. The pivot is over here, I'm good. 1333 01:14:36,100 --> 01:14:39,530 I got 1/4 and 3/4. 1334 01:14:39,530 --> 01:14:41,970 If the pivot is over here, I got 1/2 and 1/2 1335 01:14:41,970 --> 01:14:43,290 and so on and so forth. 1336 01:14:43,290 --> 01:14:46,790 And so this part is bad, this part is bad, 1337 01:14:46,790 --> 01:14:48,470 and the middle part is good. 1338 01:14:48,470 --> 01:14:51,570 So that's all that this picture shows. 1339 01:14:51,570 --> 01:14:59,520 So a call is good with what probability? 1340 01:14:59,520 --> 01:15:03,764 Given that picture, a call is good with what probability? 1341 01:15:03,764 --> 01:15:05,180 It's greater than or equal to 1/2. 1342 01:15:10,350 --> 01:15:18,480 And so what you can now write simply is if Tn is 1343 01:15:18,480 --> 01:15:24,980 the time required to sort the array, essentially you can say 1344 01:15:24,980 --> 01:15:35,010 Tn is T of n divided by 4 plus T of 3n divided by 4 1345 01:15:35,010 --> 01:15:42,800 plus expected number of iterations 1346 01:15:42,800 --> 01:15:48,769 in terms of getting a good partition times C times n. 1347 01:15:48,769 --> 01:15:50,310 And there is a reason why I'm putting 1348 01:15:50,310 --> 01:15:52,240 C in here as opposed to theta. 1349 01:15:52,240 --> 01:15:53,940 That will become clear in just a second. 1350 01:15:53,940 --> 01:15:55,565 Because I can't really apply the master 1351 01:15:55,565 --> 01:15:58,380 theorem to this given what I have with respect 1352 01:15:58,380 --> 01:16:01,612 to Tn over 4 and 3n over 4. 1353 01:16:01,612 --> 01:16:03,570 So what I have here is, I'm looking at the case 1354 01:16:03,570 --> 01:16:08,550 where I could get an imbalanced partition, 1355 01:16:08,550 --> 01:16:10,690 but the imbalance is bounded. 1356 01:16:10,690 --> 01:16:14,150 So I'd have n over 4 on one side and 3n over 4 1357 01:16:14,150 --> 01:16:15,050 on the other side. 1358 01:16:15,050 --> 01:16:18,240 But I'm not going to have n over 5 and 4n over 5 1359 01:16:18,240 --> 01:16:20,190 or what have you. 1360 01:16:20,190 --> 01:16:22,430 And so that's the two recursive calls. 1361 01:16:22,430 --> 01:16:25,170 So that's hopefully easy to see. 1362 01:16:25,170 --> 01:16:29,770 The part that is new here is simply the complexity 1363 01:16:29,770 --> 01:16:34,130 of this code that you see here, which is obviously 1364 01:16:34,130 --> 01:16:35,710 the randomized algorithm. 1365 01:16:35,710 --> 01:16:37,840 That's exactly where the randomness comes in 1366 01:16:37,840 --> 01:16:41,530 because you're picking a random pivot, and you're checking it. 1367 01:16:41,530 --> 01:16:45,090 And so this is going to run a certain number of times. 1368 01:16:45,090 --> 01:16:47,560 And we can figure out what the expectation 1369 01:16:47,560 --> 01:16:49,430 is in just a minute. 1370 01:16:49,430 --> 01:16:53,240 But I have C times n because this is constant time 1371 01:16:53,240 --> 01:16:54,660 to choose a random number. 1372 01:16:54,660 --> 01:16:57,210 We'll assume that performing the partition 1373 01:16:57,210 --> 01:17:02,020 is C times n or theta n, and that's why I have this 1374 01:17:02,020 --> 01:17:02,840 up there. 1375 01:17:02,840 --> 01:17:06,050 So this, we're going to call this Cn. 
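A minimal sketch of the paranoid variant whose recurrence is being set up here-- keep re-drawing the random pivot until neither side has more than 3/4 of the elements; out-of-place again, with illustrative names:

    import random

    def paranoid_quicksort(A):
        # Las Vegas quicksort: retry the random pivot until the partition is
        # balanced enough (|L| <= 3/4 |A| and |G| <= 3/4 |A|), then recurse.
        if len(A) <= 1:
            return A
        while True:
            x = A[random.randrange(len(A))]       # pivot = value at a random index
            L = [a for a in A if a < x]
            G = [a for a in A if a > x]
            if 4 * len(L) <= 3 * len(A) and 4 * len(G) <= 3 * len(A):
                break                             # a good call; expected number of tries <= 2
        E = [a for a in A if a == x]
        return paranoid_quicksort(L) + E + paranoid_quicksort(G)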
1376 01:17:06,050 --> 01:17:09,350 And so expected number of iterations 1377 01:17:09,350 --> 01:17:11,040 given what I have-- what can I say 1378 01:17:11,040 --> 01:17:12,960 about the expected number of iterations 1379 01:17:12,960 --> 01:17:16,980 using simple probability rules? 1380 01:17:16,980 --> 01:17:18,500 What is that? 1381 01:17:18,500 --> 01:17:19,860 2, right? 1382 01:17:19,860 --> 01:17:21,240 1 over p. 1383 01:17:21,240 --> 01:17:22,880 All of them are independent. 1384 01:17:22,880 --> 01:17:25,060 So this is 2. 1385 01:17:25,060 --> 01:17:32,330 So what I have here is something that I think you might have 1386 01:17:32,330 --> 01:17:36,640 seen before, but it's worth drawing the tree out and seeing 1387 01:17:36,640 --> 01:17:43,210 it one more time in case it didn't fully register 1388 01:17:43,210 --> 01:17:47,570 the first time or you didn't actually see it in 006 1389 01:17:47,570 --> 01:17:49,480 or recitation. 1390 01:17:49,480 --> 01:17:53,630 But what I now have is T of n. 1391 01:17:53,630 --> 01:17:57,030 I want to solve T(n) equals T(n over 4) 1392 01:17:57,030 --> 01:18:04,020 plus T(3n over 4) plus 2 times Cn. 1393 01:18:04,020 --> 01:18:07,280 And, again, like I said, I didn't put theta n in here 1394 01:18:07,280 --> 01:18:10,840 because, as you'll see, when I draw this tree out-- 1395 01:18:10,840 --> 01:18:15,760 because it's not a master theorem invocation-- it's worth 1396 01:18:15,760 --> 01:18:18,200 looking at it from a constant factor standpoint 1397 01:18:18,200 --> 01:18:22,580 to really get the sense of how all of this works out. 1398 01:18:22,580 --> 01:18:24,780 And so if I draw that tree of execution 1399 01:18:24,780 --> 01:18:27,400 and I start counting, basically what I have 1400 01:18:27,400 --> 01:18:29,340 is 2Cn up at the top. 1401 01:18:29,340 --> 01:18:32,570 I have 1 over 4 times 2Cn over here. 1402 01:18:32,570 --> 01:18:36,340 I have 3 over 4 times 2Cn over here. 1403 01:18:36,340 --> 01:18:42,390 And then this 1 over 4 might go 1 over 16 times 2Cn over here. 1404 01:18:42,390 --> 01:18:46,910 And this might go 3 over 16 times 2Cn over here. 1405 01:18:46,910 --> 01:18:53,650 And this would go, I guess it would be 3 over 16 times 2Cn. 1406 01:18:53,650 --> 01:18:58,840 And then 9 over 16 times 2Cn et cetera. 1407 01:18:58,840 --> 01:19:02,260 So this is an unbalanced tree because you 1408 01:19:02,260 --> 01:19:06,060 have an unbalanced partition up on top, 1409 01:19:06,060 --> 01:19:12,370 and now you want to count up all the work that this tree does. 1410 01:19:12,370 --> 01:19:14,620 If you collect up all of the operations, 1411 01:19:14,620 --> 01:19:16,700 then that's going to tell you what T of n 1412 01:19:16,700 --> 01:19:18,430 is because that's all the work that you 1413 01:19:18,430 --> 01:19:23,210 have to do in order to finish up the top level of recursion. 1414 01:19:23,210 --> 01:19:26,850 And what you can say is, if you look at this side here, 1415 01:19:26,850 --> 01:19:28,820 all the way to the right-hand side, 1416 01:19:28,820 --> 01:19:37,950 you're going to have log to the base 4 over 3 of n levels. 1417 01:19:37,950 --> 01:19:41,820 So that's just asking, if every time you're multiplying by 3 over 4, 1418 01:19:41,820 --> 01:19:47,670 when do you get down to the number 1-- and that's log to the base 4 over 3 of n. 1419 01:19:47,670 --> 01:19:49,690 And then over here, it's a little bit easier 1420 01:19:49,690 --> 01:19:51,960 to think about because it's a power of 2.
1421 01:19:51,960 --> 01:19:58,530 You're going to have log to the base 4 of n levels. 1422 01:19:58,530 --> 01:20:01,060 And really, it doesn't really matter honestly 1423 01:20:01,060 --> 01:20:02,850 when we go to asymptotics. 1424 01:20:02,850 --> 01:20:04,830 But it's worth seeing, I think, just 1425 01:20:04,830 --> 01:20:09,400 to get a sense of why it all works out, regardless 1426 01:20:09,400 --> 01:20:13,960 of whether it's n over 4 or a different constant here 1427 01:20:13,960 --> 01:20:15,860 or whether it's balanced or unbalanced. 1428 01:20:15,860 --> 01:20:18,130 The tree looks a little bit different. 1429 01:20:18,130 --> 01:20:19,240 It's sort of weird. 1430 01:20:19,240 --> 01:20:24,550 It's got fewer levels here and more levels there. 1431 01:20:24,550 --> 01:20:26,460 So it's sort of tilted this way. 1432 01:20:26,460 --> 01:20:33,870 But eventually, you get down to theta 1 constants down below. 1433 01:20:33,870 --> 01:20:38,650 And basically what you can see-- if you just add it up-- 1434 01:20:38,650 --> 01:20:42,120 is 1 over 4 plus 3 over 4 is 1. 1435 01:20:42,120 --> 01:20:42,930 1 over 16 plus 3 over 16 1436 01:20:42,930 --> 01:20:43,580 plus 3 over 16 plus 9 over 16. 1437 01:20:43,580 --> 01:20:48,100 Obviously, those all end up being 1. 1438 01:20:48,100 --> 01:20:57,190 So you have 2Cn work at each level. 1439 01:20:57,190 --> 01:21:01,510 And if you just go ahead and be pessimistic about it, 1440 01:21:01,510 --> 01:21:09,745 there's a maximum of log to the base 4 over 3 of n levels, times 2Cn work per level. 1441 01:21:12,430 --> 01:21:15,760 And that's pretty much it. 1442 01:21:15,760 --> 01:21:18,390 Obviously, now you can start ignoring the constants. 1443 01:21:18,390 --> 01:21:19,780 You just keep the log here. 1444 01:21:19,780 --> 01:21:21,320 You don't care what the base is. 1445 01:21:21,320 --> 01:21:22,230 You got an n here. 1446 01:21:22,230 --> 01:21:23,130 So drop the 2C. 1447 01:21:23,130 --> 01:21:24,460 Drop the 4 over 3. 1448 01:21:24,460 --> 01:21:25,560 Drop the constants. 1449 01:21:25,560 --> 01:21:27,260 And you get your n log n. 1450 01:21:27,260 --> 01:21:28,070 OK? 1451 01:21:28,070 --> 01:21:30,650 So that's pretty much it. 1452 01:21:30,650 --> 01:21:34,090 I'll stick around here for questions. 1453 01:21:34,090 --> 01:21:38,210 But you got an example of a Monte Carlo algorithm. 1454 01:21:38,210 --> 01:21:40,400 You got an example of a Las Vegas algorithm. 1455 01:21:40,400 --> 01:21:43,020 And tomorrow in section, you'll see a slightly more involved 1456 01:21:43,020 --> 01:21:46,500 analysis for something that looks a lot closer 1457 01:21:46,500 --> 01:21:48,470 to the randomized quicksort. 1458 01:21:48,470 --> 01:21:51,216 So see you next time.
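For reference, the recursion-tree bound just sketched, written out:

\[ T(n) \;\le\; \underbrace{2Cn}_{\text{work per level}} \;\times\; \underbrace{\log_{4/3} n}_{\text{maximum number of levels}} \;=\; O(n \log n). \]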