1 00:00:00,060 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,236 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,236 --> 00:00:17,861 at ocw.mit.edu. 8 00:00:21,720 --> 00:00:24,930 SRINIVAS DEVADAS: So welcome to 6046. 9 00:00:24,930 --> 00:00:27,010 My name is Srinivas Devadas. 10 00:00:27,010 --> 00:00:30,240 I'm a professor of computer science. 11 00:00:30,240 --> 00:00:33,650 This is my 27th year at MIT. 12 00:00:33,650 --> 00:00:37,740 I'm teaching this class with great course staff, 13 00:00:37,740 --> 00:00:42,010 with co-lecturers, Eric Demaine over here 14 00:00:42,010 --> 00:00:45,750 and Nancy Lynch, who's over there, 15 00:00:45,750 --> 00:00:50,250 and a whole bunch of TAs, who you will meet through the term. 16 00:00:50,250 --> 00:00:55,620 We just signed up our last TA yesterday, so at this point, 17 00:00:55,620 --> 00:00:58,230 even I don't know their names. 18 00:00:58,230 --> 00:01:01,840 But we hope to have a great semester. 19 00:01:01,840 --> 00:01:05,470 I'm very excited to be teaching this class with Eric and Nancy. 20 00:01:05,470 --> 00:01:09,830 I recognize some of you folks from 006 from a year ago, 21 00:01:09,830 --> 00:01:13,450 so hello again, and from other classes. 22 00:01:13,450 --> 00:01:16,600 And so let's get started. 23 00:01:16,600 --> 00:01:18,310 I mentioned 006. 
24 00:01:18,310 --> 00:01:21,730 006 is a prerequisite for this class, 25 00:01:21,730 --> 00:01:26,030 so if by chance you've skipped a class-- 26 00:01:26,030 --> 00:01:29,490 MIT or EECS has allowed you to skip that-- 27 00:01:29,490 --> 00:01:34,440 make sure you check in with us to see that you are ready 28 00:01:34,440 --> 00:01:39,633 for 6046 because we will assume that you know the 6006 29 00:01:39,633 --> 00:01:40,340 material. 30 00:01:40,340 --> 00:01:42,950 And by that, I mean basic material 31 00:01:42,950 --> 00:01:46,290 on data structures, classical algorithms 32 00:01:46,290 --> 00:01:52,930 like sorting, algorithms for dynamic programming, 33 00:01:52,930 --> 00:01:56,750 or algorithms that use dynamic programming I should say, 34 00:01:56,750 --> 00:02:00,580 algorithms for shortest paths, et cetera. 35 00:02:00,580 --> 00:02:04,550 6046 itself, we're going to run this course 36 00:02:04,550 --> 00:02:08,039 pretty much off the Stellar website in the sense 37 00:02:08,039 --> 00:02:12,550 that that'll be our one-stop shop for getting everything 38 00:02:12,550 --> 00:02:16,490 including lecture handouts, problem sets-- turning 39 00:02:16,490 --> 00:02:19,590 in your problem sets, et cetera. 40 00:02:19,590 --> 00:02:22,800 And I should mention that this course is 41 00:02:22,800 --> 00:02:27,290 being taped for OpenCourseWare, and while it'll 42 00:02:27,290 --> 00:02:32,750 take a little bit of time for the videos to be put online, 43 00:02:32,750 --> 00:02:39,370 we hope to do that perhaps in clumps before the quizzes 44 00:02:39,370 --> 00:02:46,020 that you will have as we have to have in our class. 45 00:02:46,020 --> 00:02:49,940 So let me just say a couple more things about logistics, 46 00:02:49,940 --> 00:02:53,580 and then we get started with technical content. 47 00:02:53,580 --> 00:02:55,610 As I mentioned, we're going to be running 48 00:02:55,610 --> 00:02:57,610 this course off Stellar. 
49 00:02:57,610 --> 00:03:00,610 Please sign up for a recitation section 50 00:03:00,610 --> 00:03:04,390 by going to the Stellar website and choosing a section that 51 00:03:04,390 --> 00:03:06,270 works for your schedule. 52 00:03:06,270 --> 00:03:11,580 Sections go from 10:00 AM all the way to 3:00 I think, 53 00:03:11,580 --> 00:03:17,390 and we've placed a limit on the number of students per section. 54 00:03:17,390 --> 00:03:20,140 We wanted the sections to be manageable in size, 55 00:03:20,140 --> 00:03:22,600 but there's plenty of room for everybody, 56 00:03:22,600 --> 00:03:24,950 and the schedule flexibility should 57 00:03:24,950 --> 00:03:29,820 allow you to choose a section pretty easily. 58 00:03:29,820 --> 00:03:32,310 We have a course information document and an objectives 59 00:03:32,310 --> 00:03:34,270 document on the website. 60 00:03:34,270 --> 00:03:37,460 That has a lot of details on the grading policy, 61 00:03:37,460 --> 00:03:40,260 the collaboration policy, et cetera. 62 00:03:40,260 --> 00:03:44,700 Please read it very carefully from the first page 63 00:03:44,700 --> 00:03:46,230 all the way to the end. 64 00:03:46,230 --> 00:03:48,530 And I will mention one thing that you 65 00:03:48,530 --> 00:03:55,650 should be careful about, which is that while problem sets are 66 00:03:55,650 --> 00:03:59,380 only 30% of the grade, we do require 67 00:03:59,380 --> 00:04:02,180 you to attempt the problems. 68 00:04:02,180 --> 00:04:04,120 And there's actually a penalty associated 69 00:04:04,120 --> 00:04:07,940 with not attempting problems and not turning problem sets in that 70 00:04:07,940 --> 00:04:11,390 is way more than 30%, so keep that in mind, 71 00:04:11,390 --> 00:04:14,470 and please read the collaboration policy 72 00:04:14,470 --> 00:04:17,130 as well as the grading policy, carefully. 73 00:04:17,130 --> 00:04:19,740 And feel free to ask us questions. 
74 00:04:19,740 --> 00:04:23,600 You can ask us questions anonymously through Piazza, 75 00:04:23,600 --> 00:04:25,520 or you can certainly send us email. 76 00:04:25,520 --> 00:04:27,930 All the information is on Stellar. 77 00:04:27,930 --> 00:04:32,960 So that's all I really had to say about course logistics. 78 00:04:32,960 --> 00:04:35,830 Let me tell you a little bit about how 79 00:04:35,830 --> 00:04:37,800 the content of this course is structured. 80 00:04:40,410 --> 00:04:44,900 We have several modules, and Eric, Nancy, 81 00:04:44,900 --> 00:04:49,810 and I will be in charge of each of these different modules 82 00:04:49,810 --> 00:04:51,670 as the term goes. 83 00:04:51,670 --> 00:04:58,070 Our very first module is going to start really next time. 84 00:04:58,070 --> 00:05:00,400 Today is really an overview lecture. 85 00:05:00,400 --> 00:05:02,890 But it's a module on divide and conquer, 86 00:05:02,890 --> 00:05:06,970 and you learned about this divide and conquer paradigm 87 00:05:06,970 --> 00:05:09,670 in 006 or equivalent classes. 88 00:05:09,670 --> 00:05:12,350 It's breaking of a problem into smaller problems 89 00:05:12,350 --> 00:05:14,690 and getting efficiency that way. 90 00:05:14,690 --> 00:05:16,780 Merge sort is a classic algorithm 91 00:05:16,780 --> 00:05:19,600 that follows the divide and conquer paradigm. 92 00:05:19,600 --> 00:05:22,180 We're going to take it to a new level. 93 00:05:22,180 --> 00:05:25,190 And I guess that's sort of the theme of 046. 94 00:05:25,190 --> 00:05:28,920 Take the material in 006 and raise the stakes a little bit-- 95 00:05:28,920 --> 00:05:31,030 raise the level of sophistication-- 96 00:05:31,030 --> 00:05:34,880 and you'll see things like fast Fourier transform. 97 00:05:34,880 --> 00:05:37,310 Finding an algorithm for a convex hull, 98 00:05:37,310 --> 00:05:38,770 we'll do that next time. 99 00:05:38,770 --> 00:05:42,010 That uses the divide and conquer paradigm. 
100 00:05:42,010 --> 00:05:44,670 We're going to do a ton of optimization. 101 00:05:44,670 --> 00:05:48,790 Divide and conquer can obviously be used for search 102 00:05:48,790 --> 00:05:51,060 and also for optimization. 103 00:05:51,060 --> 00:05:56,580 In particular, we'll look at strategies corresponding 104 00:05:56,580 --> 00:06:02,050 to greedy algorithms, Dijkstra, which hopefully you remember 105 00:06:02,050 --> 00:06:07,480 the shortest path algorithm from 006 is an example of a greedy 106 00:06:07,480 --> 00:06:08,370 algorithm. 107 00:06:08,370 --> 00:06:11,360 We'll see a bunch of other examples, 108 00:06:11,360 --> 00:06:13,260 and we'll look at one today. 109 00:06:13,260 --> 00:06:19,770 And dynamic programming, it's a wonderful algorithmic hammer 110 00:06:19,770 --> 00:06:22,730 that you can apply to a wide variety of problems, 111 00:06:22,730 --> 00:06:25,250 certainly to shortest paths as well. 112 00:06:25,250 --> 00:06:27,700 We'll look at it in many different contexts. 113 00:06:27,700 --> 00:06:33,580 And then really quickly network flow, 114 00:06:33,580 --> 00:06:36,490 which is a problem that's associated with-- here's 115 00:06:36,490 --> 00:06:37,480 a network. 116 00:06:37,480 --> 00:06:40,620 There are capacities associated with the network. 117 00:06:40,620 --> 00:06:44,010 The capacities could correspond to the width 118 00:06:44,010 --> 00:06:48,880 of the roads in a highway system or the number 119 00:06:48,880 --> 00:06:52,170 of lanes, the amount of traffic that can go through. 120 00:06:52,170 --> 00:06:56,230 How do I maximize the set of commodities, 121 00:06:56,230 --> 00:06:58,140 or the amount of commodities that I 122 00:06:58,140 --> 00:06:59,730 can push through the network? 123 00:06:59,730 --> 00:07:06,320 That it turns out is, again, a problem that 124 00:07:06,320 --> 00:07:08,410 has many different applications, so it's really 125 00:07:08,410 --> 00:07:10,606 a collection of problems. 
126 00:07:10,606 --> 00:07:12,480 We're going to spend some time, a little bit 127 00:07:12,480 --> 00:07:17,170 today, but a little more than in 6006, 128 00:07:17,170 --> 00:07:19,670 talking about intractability. 129 00:07:19,670 --> 00:07:22,115 So a lot of algorithms that we're going to talk about 130 00:07:22,115 --> 00:07:25,080 are efficient in the sense that they're 131 00:07:25,080 --> 00:07:27,160 polynomial time solvable. 132 00:07:27,160 --> 00:07:31,560 And first, polynomial time solvable 133 00:07:31,560 --> 00:07:35,801 doesn't imply efficiency in the practical sense, 134 00:07:35,801 --> 00:07:37,550 so if you have an n raised to 8 algorithm, 135 00:07:37,550 --> 00:07:38,930 it's polynomial time. 136 00:07:38,930 --> 00:07:42,250 But really, it's not something that you can use on real world 137 00:07:42,250 --> 00:07:45,220 problems where n is relatively large, 138 00:07:45,220 --> 00:07:49,700 but generally in a theoretical computer science class, 139 00:07:49,700 --> 00:07:52,100 we'll think about tractable problems 140 00:07:52,100 --> 00:07:57,320 as being those that have polynomial time algorithms that 141 00:07:57,320 --> 00:08:00,270 can solve them exactly or optimally. 142 00:08:00,270 --> 00:08:04,690 But intractability then corresponds to problems 143 00:08:04,690 --> 00:08:08,850 that, at the moment, we don't know of a polynomial time 144 00:08:08,850 --> 00:08:11,510 algorithm to solve them, and the best algorithms 145 00:08:11,510 --> 00:08:14,340 we have take worst case exponential time. 146 00:08:14,340 --> 00:08:17,740 And so the question is, what happens with those problems? 147 00:08:17,740 --> 00:08:21,360 And we'll look at things like approximation algorithms 148 00:08:21,360 --> 00:08:28,740 that can get us, in the case of optimization problems, 149 00:08:28,740 --> 00:08:32,929 get us to within a certain fraction of optimal, 150 00:08:32,929 --> 00:08:36,260 guaranteed, and run in polynomial time. 
151 00:08:36,260 --> 00:08:38,750 So you can't get the absolute best, 152 00:08:38,750 --> 00:08:42,470 but you can get within 10% or we can get within a factor of 2. 153 00:08:42,470 --> 00:08:46,450 That may be enough for a particular instance 154 00:08:46,450 --> 00:08:49,660 of a problem or a set of instances of a problem. 155 00:08:49,660 --> 00:08:53,000 And we'll do a bunch of advanced topics. 156 00:08:53,000 --> 00:08:56,050 I think we have distributed algorithms planned. 157 00:08:56,050 --> 00:09:01,450 Nancy works in that area, and we'll also 158 00:09:01,450 --> 00:09:03,380 talk about cryptography. 159 00:09:03,380 --> 00:09:06,820 There's a deep connection between number theory 160 00:09:06,820 --> 00:09:11,190 algorithms and cryptography that towards the end of the lecture, 161 00:09:11,190 --> 00:09:13,340 or, I should say, towards the end of the course, 162 00:09:13,340 --> 00:09:18,110 I will look at a little more closely. 163 00:09:18,110 --> 00:09:22,530 So much for overview, let's get started with today's lecture 164 00:09:22,530 --> 00:09:24,490 for real. 165 00:09:24,490 --> 00:09:27,360 And here's the theme of today's lecture. 166 00:09:30,050 --> 00:09:31,640 I talked a bit about tractability 167 00:09:31,640 --> 00:09:33,470 and intractability. 168 00:09:33,470 --> 00:09:36,110 And what is fascinating about algorithms 169 00:09:36,110 --> 00:09:41,020 is that you might see a problem that 170 00:09:41,020 --> 00:09:46,440 has a fairly obvious polynomial time solution or a linear time 171 00:09:46,440 --> 00:09:50,360 solution, then you change it ever so slightly, 172 00:09:50,360 --> 00:09:53,360 and the linear time algorithm doesn't work. 173 00:09:53,360 --> 00:09:56,370 Maybe you can find a cubic algorithm. 174 00:09:56,370 --> 00:09:59,890 And then you change it a little more, 175 00:09:59,890 --> 00:10:03,800 and you end up with something that you 176 00:10:03,800 --> 00:10:06,130 can't find a polynomial time algorithm for. 
177 00:10:06,130 --> 00:10:10,510 You can't prove that the polynomial time algorithm 178 00:10:10,510 --> 00:10:12,980 or polynomial monomial time algorithm 179 00:10:12,980 --> 00:10:15,690 gives you the optimal solution in all cases. 180 00:10:15,690 --> 00:10:18,700 And then you go off into complexity theory. 181 00:10:18,700 --> 00:10:23,630 You maybe discover that, or show that this problem is 182 00:10:23,630 --> 00:10:27,560 NP-complete, and now you're in the intractability domain. 183 00:10:27,560 --> 00:10:32,540 So very small changes in problem statements 184 00:10:32,540 --> 00:10:39,150 can end up with very different situations 185 00:10:39,150 --> 00:10:41,960 from a standpoint of algorithm complexity. 186 00:10:41,960 --> 00:10:46,480 And so that's really what I want to point out 187 00:10:46,480 --> 00:10:50,585 to you in some detail with a concrete example. 188 00:10:59,130 --> 00:11:02,630 So I want to get a little bit pedantic here 189 00:11:02,630 --> 00:11:06,820 with respect to intractability and tractability. 190 00:11:06,820 --> 00:11:12,580 You've seen, I think, these terms before in the one lecture 191 00:11:12,580 --> 00:11:19,580 in 006, but we'll go over this in some detail today and more 192 00:11:19,580 --> 00:11:21,310 later on in the semester. 193 00:11:21,310 --> 00:11:27,090 But for now, let's recall some basic terminology associated 194 00:11:27,090 --> 00:11:29,990 with tractability and intractability or complexity 195 00:11:29,990 --> 00:11:32,530 theory, broadly speaking. 196 00:11:32,530 --> 00:11:39,700 Capital P is a class of problems solvable in polynomial time. 197 00:11:45,560 --> 00:11:48,600 And think of that as big O, n raised 198 00:11:48,600 --> 00:11:54,686 to k for some constant k. 199 00:11:54,686 --> 00:11:57,210 Now you can have log factors in there, 200 00:11:57,210 --> 00:12:00,550 but once you put a big O in there, you're good. 
201 00:12:00,550 --> 00:12:04,000 You can always say, order n, even 202 00:12:04,000 --> 00:12:08,140 if it's a logarithmic problem, and big O 203 00:12:08,140 --> 00:12:10,790 lets you be sloppy like that. 204 00:12:10,790 --> 00:12:15,410 And there are many examples of polynomial time algorithms, 205 00:12:15,410 --> 00:12:18,760 of course, for interesting problems like shortest paths. 206 00:12:18,760 --> 00:12:22,600 So the shortest path problem is order V square, 207 00:12:22,600 --> 00:12:25,070 where V is the number of vertices in the graph. 208 00:12:25,070 --> 00:12:26,810 There's algorithms for that. 209 00:12:26,810 --> 00:12:32,380 You can do a little bit better if you use fancier data 210 00:12:32,380 --> 00:12:35,660 structure, but that's an example. 211 00:12:35,660 --> 00:12:42,050 NP is another class of problems that's very interesting. 212 00:12:42,050 --> 00:12:48,030 This is the class of problems whose solution 213 00:12:48,030 --> 00:12:51,810 is verifiable in polynomial time. 214 00:12:55,640 --> 00:13:06,060 So an example of a problem in NP that is not known to be in P 215 00:13:06,060 --> 00:13:10,710 is the Hamiltonian cycle problem. 216 00:13:10,710 --> 00:13:13,180 And the Hamiltonian cycle problem 217 00:13:13,180 --> 00:13:31,260 corresponds to given a directed graph, find a simple cycle. 218 00:13:31,260 --> 00:13:37,900 So you can't repeat vertices, but you need the simple cycle 219 00:13:37,900 --> 00:13:48,250 to contain each vertex in V. 220 00:13:48,250 --> 00:13:55,500 And determining whether a given cycle is a Hamiltonian cycle 221 00:13:55,500 --> 00:13:57,740 or not is simple. 222 00:13:57,740 --> 00:14:00,400 You just traverse the cycle. 223 00:14:00,400 --> 00:14:03,780 Make sure that you've touched all the vertices exactly once, 224 00:14:03,780 --> 00:14:04,690 and you're done. 225 00:14:04,690 --> 00:14:07,430 Clearly doable in polynomial time. 
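The verification step just described (traverse the proposed cycle and confirm that every vertex is touched exactly once) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `is_hamiltonian_cycle` and the representation of the graph as a set of directed `(u, v)` pairs are assumptions made for the example, not anything given in the lecture.

```python
def is_hamiltonian_cycle(vertices, edges, cycle):
    """Check a proposed Hamiltonian cycle in a directed graph.

    vertices: iterable of hashable vertex labels (assumed representation)
    edges:    set of directed (u, v) pairs (assumed representation)
    cycle:    proposed vertex order; the edge from the last vertex
              back to the first is implied
    """
    vertex_set = set(vertices)
    # Every vertex must appear exactly once in the proposed cycle.
    if len(cycle) != len(vertex_set) or set(cycle) != vertex_set:
        return False
    # Each consecutive pair, including the wrap-around, must be an edge.
    n = len(cycle)
    return all((cycle[i], cycle[(i + 1) % n]) in edges for i in range(n))
```

With edges stored in a set, the check does one membership test per vertex, so it runs in time linear in the length of the cycle. Checking a given cycle is the easy part; finding one is the part with no known polynomial time algorithm.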
226 00:14:07,430 --> 00:14:10,390 So therefore, Hamiltonian cycle is in NP, 227 00:14:10,390 --> 00:14:16,620 but determining whether a graph has a Hamiltonian cycle 228 00:14:16,620 --> 00:14:19,850 or not is a hard problem. 229 00:14:19,850 --> 00:14:31,000 And in particular, the notion of NP completeness 230 00:14:31,000 --> 00:14:39,940 is something that defines the level of intractability for NP. 231 00:14:39,940 --> 00:14:45,450 The NP complete problems are the hardest problems in NP, 232 00:14:45,450 --> 00:14:47,820 and Hamiltonian cycle is one of them. 233 00:14:47,820 --> 00:14:55,710 If you can solve any NP complete problem in polynomial time, 234 00:14:55,710 --> 00:14:59,710 you can solve all problems in NP in polynomial time. 235 00:14:59,710 --> 00:15:03,670 So that's what I meant by saying that NP complete problems are, 236 00:15:03,670 --> 00:15:06,050 in some sense, the hardest problems in NP 237 00:15:06,050 --> 00:15:10,000 because solving one of them gives you everything. 238 00:15:10,000 --> 00:15:12,900 So the definition of NP completeness 239 00:15:12,900 --> 00:15:19,070 is that the problem is in NP and is 240 00:15:19,070 --> 00:15:23,370 as hard-- an informal definition-- 241 00:15:23,370 --> 00:15:26,280 as any problem in NP. 242 00:15:33,050 --> 00:15:37,110 And so Hamiltonian cycle is an NP complete problem. 243 00:15:37,110 --> 00:15:39,410 Satisfiability is an NP complete problem, 244 00:15:39,410 --> 00:15:41,250 and there's a whole bunch of them. 245 00:15:41,250 --> 00:15:46,130 So going back to our theme here, what I want to show you 246 00:15:46,130 --> 00:15:50,200 is how for an interval scheduling problem, that I'll 247 00:15:50,200 --> 00:15:57,960 define in a couple of minutes, how we move from linear time, 248 00:15:57,960 --> 00:16:01,930 therefore P, to something that's still in P. 
249 00:16:01,930 --> 00:16:03,500 But it's a little more complicated 250 00:16:03,500 --> 00:16:06,360 if I change the constraints of a problem a little bit. 251 00:16:06,360 --> 00:16:10,450 And finally, if I add more constraints to the problem, 252 00:16:10,450 --> 00:16:12,100 generalize it-- and you can think of it 253 00:16:12,100 --> 00:16:14,180 as adding constraints or generalizing 254 00:16:14,180 --> 00:16:19,930 the problem-- you get small changes to something 255 00:16:19,930 --> 00:16:22,250 that becomes NP complete. 256 00:16:22,250 --> 00:16:25,890 So this is something that algorithm designers 257 00:16:25,890 --> 00:16:30,390 have to keep in mind because before you go off and try 258 00:16:30,390 --> 00:16:32,620 to design an algorithm for a problem 259 00:16:32,620 --> 00:16:37,760 you like to know where in the spectrum your problem resides. 260 00:16:37,760 --> 00:16:41,330 And in order to do that, you need 261 00:16:41,330 --> 00:16:44,660 to understand algorithm paradigms obviously and be 262 00:16:44,660 --> 00:16:48,420 able to apply them, but you also have to understand reductions 263 00:16:48,420 --> 00:16:51,430 where you can try and translate one problem to another. 264 00:16:51,430 --> 00:16:55,550 And if you can do that, and the first problem 265 00:16:55,550 --> 00:16:58,660 is known to be hard, then you can make arguments 266 00:16:58,660 --> 00:17:02,060 about the hardness of your problem. 267 00:17:02,060 --> 00:17:06,020 So these are the kinds of things that we'll touch upon today, 268 00:17:06,020 --> 00:17:12,440 the analysis of an algorithm, the design of an algorithm, 269 00:17:12,440 --> 00:17:16,579 and also the complexity analysis of an algorithm, which may not 270 00:17:16,579 --> 00:17:18,900 just be an asymptotic-- well, this 271 00:17:18,900 --> 00:17:21,339 is order n cubed or order n square 272 00:17:21,339 --> 00:17:25,510 but more in the realm of NP completeness as well. 
273 00:17:28,280 --> 00:17:32,000 So so much for context, let's dive 274 00:17:32,000 --> 00:17:39,520 into our interval scheduling problem, which is something 275 00:17:39,520 --> 00:17:44,670 that you can imagine doing for classes, 276 00:17:44,670 --> 00:17:50,110 tasks, a particular schedule during a day, life in general. 277 00:17:50,110 --> 00:17:59,140 And in the general setting, we have resources and requests, 278 00:17:59,140 --> 00:18:03,600 and we're going to have a single resource for our first version 279 00:18:03,600 --> 00:18:05,190 of the problem. 280 00:18:05,190 --> 00:18:10,640 And our requests are going to be 1 through n, 281 00:18:10,640 --> 00:18:12,550 and we can think of these requests 282 00:18:12,550 --> 00:18:17,100 as requiring time corresponding to the resource. 283 00:18:17,100 --> 00:18:19,300 So the request is for the resource, 284 00:18:19,300 --> 00:18:20,790 and you want time on the resource. 285 00:18:20,790 --> 00:18:22,280 Maybe it's computation time. 286 00:18:22,280 --> 00:18:24,420 Maybe it's your time. 287 00:18:24,420 --> 00:18:26,550 It could be anything. 288 00:18:26,550 --> 00:18:32,230 Each of these requests corresponds to an interval of time, 289 00:18:32,230 --> 00:18:34,460 and that's where the name comes from. 290 00:18:34,460 --> 00:18:40,030 si is the start time. 291 00:18:40,030 --> 00:18:46,650 fi is the finish time, and we're going 292 00:18:46,650 --> 00:18:50,660 to say si is strictly less than fi. 293 00:18:50,660 --> 00:18:52,970 So I didn't put less than or equal to there 294 00:18:52,970 --> 00:18:57,790 because I want these requests to be non-null, non-zero, 295 00:18:57,790 --> 00:19:00,810 so otherwise they're uninteresting. 296 00:19:00,810 --> 00:19:02,680 And we're going to have a start time, 297 00:19:02,680 --> 00:19:05,860 and we're going to have an end time, and they're not equal. 
298 00:19:05,860 --> 00:19:11,270 So that's the first part of the specification 299 00:19:11,270 --> 00:19:15,000 of the problem and then the second part, 300 00:19:15,000 --> 00:19:21,370 which is intuitive is that two requests-- we have 301 00:19:21,370 --> 00:19:24,460 a single resource here remember-- i 302 00:19:24,460 --> 00:19:30,390 and j are considered to be compatible, 303 00:19:30,390 --> 00:19:33,810 which means you can satisfy both of these requests. 304 00:19:33,810 --> 00:19:34,760 They're compatible. 305 00:19:34,760 --> 00:19:37,330 Incompatible requests, you can't satisfy 306 00:19:37,330 --> 00:19:41,600 with your single resource simultaneously-- 307 00:19:41,600 --> 00:19:45,345 Provided they don't overlap. 308 00:19:51,850 --> 00:19:58,450 And a non-overlapping condition might be that fi is less than 309 00:19:58,450 --> 00:20:08,430 or equal to sj, or fj less than or equal to si. 310 00:20:08,430 --> 00:20:11,110 So again, I put a less than or equal to here, 311 00:20:11,110 --> 00:20:16,160 which is important to spend a minute on. 312 00:20:16,160 --> 00:20:22,470 What I'm saying here in this context is that I really have 313 00:20:22,470 --> 00:20:26,970 open-ended intervals on the right-hand side corresponding 314 00:20:26,970 --> 00:20:29,420 to the fi's. 315 00:20:29,420 --> 00:20:35,190 So pictorially, you could look at it this way. 316 00:20:35,190 --> 00:20:40,830 Let's say I have intervals like this. 317 00:20:40,830 --> 00:20:42,830 So this is interval number 1. 318 00:20:42,830 --> 00:20:44,640 That's interval number 2. 319 00:20:44,640 --> 00:20:53,360 Right here I have s of 1, f of 1 out here, s of 2 out here, 320 00:20:53,360 --> 00:20:55,150 and f of 2 out here. 321 00:20:55,150 --> 00:21:01,910 So this is f of 1 for that and s of 2 for this. 
322 00:21:01,910 --> 00:21:07,920 I'm allowing s of 2 and f of 1 to be exactly equal, 323 00:21:07,920 --> 00:21:14,960 and I still agree that these two are compatible requests. 
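The compatibility definition above can be written as a one-line predicate. The sketch below assumes each request is a `(start, finish)` tuple with `start < finish`; the tuple layout and the name `compatible` are choices made for illustration only.

```python
def compatible(a, b):
    """Return True if requests a and b can share a single resource.

    Each request is an assumed (start, finish) tuple with start < finish,
    treated as the half-open interval [start, finish).
    """
    s_a, f_a = a
    s_b, f_b = b
    # Compatible exactly when one request finishes no later than the
    # other starts; touching endpoints count as compatible.
    return f_a <= s_b or f_b <= s_a
```

Because the comparison uses less than or equal to, `compatible((0, 2), (2, 5))` holds: a request that starts at the instant another finishes does not conflict with it, which matches the picture with s of 2 equal to f of 1.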
324 00:21:14,960 --> 00:21:17,860 So this is-- I guess it's terminology. 325 00:21:17,860 --> 00:21:22,940 It's our definition of compatibility. 326 00:21:22,940 --> 00:21:29,000 So you can imagine now an optimization problem 327 00:21:29,000 --> 00:21:31,880 that is associated with interval scheduling 328 00:21:31,880 --> 00:21:35,980 where, in a different example, I have 329 00:21:35,980 --> 00:21:40,780 this interval corresponding to s1 and f1. 330 00:21:40,780 --> 00:21:45,200 I might have a different interval here corresponding 331 00:21:45,200 --> 00:21:48,750 to 2, then corresponding to 3. 332 00:21:48,750 --> 00:21:56,990 And then maybe I've got 4 here, 5, and 6. 333 00:21:56,990 --> 00:22:03,090 So those are my six intervals corresponding to my input. 334 00:22:03,090 --> 00:22:04,760 I have a single resource. 335 00:22:04,760 --> 00:22:08,110 I've just drawn it out in a two-dimensional form. 336 00:22:08,110 --> 00:22:11,220 There's six different requests that I have, 337 00:22:11,220 --> 00:22:12,700 the six different intervals. 338 00:22:12,700 --> 00:22:16,260 Intervals and requests are synonyms. 339 00:22:16,260 --> 00:22:20,780 And my goal here-- and it's kind of obvious in this example-- 340 00:22:20,780 --> 00:22:42,540 is to select a compatible subset of requests, or intervals, 341 00:22:42,540 --> 00:22:43,785 that is of maximum size. 342 00:22:49,850 --> 00:22:53,540 And I'd like to do this efficiently. 343 00:22:53,540 --> 00:22:56,480 So we'll always consider efficiency here, 344 00:22:56,480 --> 00:22:59,860 but in terms of the specification of the problem as 345 00:22:59,860 --> 00:23:05,450 opposed to a requirement on the complexity of the algorithm, 346 00:23:05,450 --> 00:23:09,940 I want maximum size for this subset. 
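To make the specification concrete, separately from any efficiency requirement, here is a deliberately naive sketch that tries subsets from largest to smallest and returns the first one whose requests are pairwise compatible. It takes exponential time and is meant only to pin down what "maximum size compatible subset" means. The six intervals in the usage lines are hypothetical numbers invented for this sketch; they are not the intervals drawn on the board.

```python
from itertools import combinations


def compatible(a, b):
    # Requests are assumed (start, finish) tuples; touching endpoints are fine.
    return a[1] <= b[0] or b[1] <= a[0]


def max_compatible_subset_bruteforce(requests):
    """Return indices of a largest pairwise-compatible subset.

    Exhaustive search over subsets, largest first: exponential time,
    useful only as a reference for what the problem asks for.
    """
    n = len(requests)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(compatible(requests[i], requests[j])
                   for i, j in combinations(subset, 2)):
                return list(subset)
    return []


# Hypothetical instance: one long request that overlaps all the others,
# plus five shorter requests in which neighbours overlap.
requests = [(0, 12), (1, 4), (3, 6), (5, 8), (7, 10), (9, 11)]
best = max_compatible_subset_bruteforce(requests)
```

On this made-up instance the largest compatible subset has three requests.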
347 00:23:09,940 --> 00:23:13,570 So as I showed you, or I mentioned earlier, 348 00:23:13,570 --> 00:23:18,150 in this case, it is clear from the drawing 349 00:23:18,150 --> 00:23:21,200 that I put up there that the maximum size for that six 350 00:23:21,200 --> 00:23:25,820 requests example that I have is three. 351 00:23:25,820 --> 00:23:27,900 So that's the set up. 352 00:23:27,900 --> 00:23:33,490 Now we're going to spend the next few minutes 353 00:23:33,490 --> 00:23:37,840 talking about a greedy strategy for solving 354 00:23:37,840 --> 00:23:39,990 this particular problem. 355 00:23:39,990 --> 00:23:43,090 You don't know up front whether the greedy strategy 356 00:23:43,090 --> 00:23:49,540 is going to always produce the maximum size or not. 357 00:23:49,540 --> 00:23:54,750 In fact, it depends on the particular greedy heuristic, 358 00:23:54,750 --> 00:23:58,710 the selection heuristic that a greedy algorithm uses. 359 00:23:58,710 --> 00:24:01,410 So that's going to be important, and we'll take a look-- 360 00:24:01,410 --> 00:24:03,060 and hopefully you can suggest some-- 361 00:24:03,060 --> 00:24:06,970 at a few different greedy heuristics. 362 00:24:06,970 --> 00:24:10,297 But my claim, overall claim, that I'm 363 00:24:10,297 --> 00:24:11,880 going to have to spend a bunch of time 364 00:24:11,880 --> 00:24:15,300 here justifying and eventually proving 365 00:24:15,300 --> 00:24:27,210 is that we can solve this problem using 366 00:24:27,210 --> 00:24:28,160 a greedy algorithm. 367 00:24:30,730 --> 00:24:32,730 Now what is a greedy algorithm? 368 00:24:32,730 --> 00:24:36,190 You've seen some examples. 369 00:24:36,190 --> 00:24:42,240 As the name implies, it's something that's myopic. 370 00:24:42,240 --> 00:24:43,750 It doesn't look ahead. 371 00:24:43,750 --> 00:24:48,700 It looks to maximize the very first thing 372 00:24:48,700 --> 00:24:51,660 that you could maximize. 
373 00:24:51,660 --> 00:24:57,050 It says-- traffic is a good example-- don't let 374 00:24:57,050 --> 00:24:58,840 anybody cut in front of you. 375 00:24:58,840 --> 00:25:00,210 You've got some room up there. 376 00:25:00,210 --> 00:25:02,780 Get up there. 377 00:25:02,780 --> 00:25:07,040 Generally, people are greedy when 378 00:25:07,040 --> 00:25:10,120 it comes to getting to work, trying 379 00:25:10,120 --> 00:25:13,330 to minimize the time and, in this case, 380 00:25:13,330 --> 00:25:15,480 on the time that they spend on the road. 381 00:25:15,480 --> 00:25:18,230 But we've had other examples. 382 00:25:18,230 --> 00:25:22,240 For example, when you look at interval scheduling, 383 00:25:22,240 --> 00:25:29,410 you might say, I'm going to pick the smallest request. 384 00:25:29,410 --> 00:25:32,510 And I'm going to pick the smallest request first, 385 00:25:32,510 --> 00:25:34,490 and I'm going to try and collect together 386 00:25:34,490 --> 00:25:36,270 as many requests as possible. 387 00:25:36,270 --> 00:25:38,020 And if the requests are small in the sense 388 00:25:38,020 --> 00:25:41,620 that si and fi, for the two requests, 389 00:25:41,620 --> 00:25:46,440 are close to each other, then maybe that's the best strategy. 390 00:25:46,440 --> 00:25:50,050 So that's an example of a greedy strategy 391 00:25:50,050 --> 00:25:52,130 for our particular example. 392 00:25:52,130 --> 00:25:59,050 But just to give you a slightly better definition of greedy 393 00:25:59,050 --> 00:26:04,780 than what I've said so far, a greedy algorithm 394 00:26:04,780 --> 00:26:18,270 is a myopic algorithm that does two things. 395 00:26:18,270 --> 00:26:35,121 It processes the input one piece at a time with no apparent look 396 00:26:35,121 --> 00:26:35,620 ahead. 397 00:26:39,820 --> 00:26:42,910 So what happens is that greedy algorithms are typically 398 00:26:42,910 --> 00:26:45,550 quite efficient. 
399 00:26:45,550 --> 00:26:51,340 What you end up doing is looking at a small part of the problem 400 00:26:51,340 --> 00:26:55,040 instance and deciding what to do. 401 00:26:55,040 --> 00:26:57,790 Once you've done that, then you're 402 00:26:57,790 --> 00:27:01,080 in a situation where the problem has gotten a little bit simpler 403 00:27:01,080 --> 00:27:03,780 because you've already solved part of it, 404 00:27:03,780 --> 00:27:05,680 and then you move on. 405 00:27:05,680 --> 00:27:09,180 So what would a template for a greedy algorithm 406 00:27:09,180 --> 00:27:13,420 look like for our interval scheduling problem? 407 00:27:13,420 --> 00:27:17,590 Here's a template that probably puts it all together 408 00:27:17,590 --> 00:27:24,310 and gives you a good sense of what I mean by greedy, at least 409 00:27:24,310 --> 00:27:24,980 in this context. 410 00:27:29,150 --> 00:27:34,400 So before we even get into particulars of selection 411 00:27:34,400 --> 00:27:38,170 strategies, let me give you a template 412 00:27:38,170 --> 00:27:41,905 for greedy interval scheduling. 413 00:27:46,350 --> 00:27:54,910 So step 1, use a simple rule to select a request. 414 00:28:00,630 --> 00:28:08,180 And once you do that, if you selected a particular request-- 415 00:28:08,180 --> 00:28:11,730 let's say you selected 1. 416 00:28:11,730 --> 00:28:16,630 What happens now once you've selected 1? 417 00:28:16,630 --> 00:28:17,850 Well, you're done. 418 00:28:17,850 --> 00:28:18,730 You can't select 2. 419 00:28:18,730 --> 00:28:19,650 You can't select 3. 420 00:28:19,650 --> 00:28:20,599 You can't select 4. 421 00:28:20,599 --> 00:28:21,390 You can't select 5. 422 00:28:21,390 --> 00:28:23,560 You can't select 6. 423 00:28:23,560 --> 00:28:27,880 So if you have selected 1 in this case, you're done, 424 00:28:27,880 --> 00:28:32,260 but we have to codify that in a step here. 
425 00:28:32,260 --> 00:28:34,500 And what that means is that we have 426 00:28:34,500 --> 00:28:44,400 to reject all requests that are incompatible with i. 427 00:28:46,930 --> 00:28:51,190 And at this point, because we've rejected a bunch of requests, 428 00:28:51,190 --> 00:28:54,480 our problem got smaller. 429 00:28:54,480 --> 00:28:59,180 And so you now have a smaller problem, 430 00:28:59,180 --> 00:29:06,400 and you just repeat-- go back to step 1-- until all requests are 431 00:29:06,400 --> 00:29:06,900 processed. 432 00:29:09,620 --> 00:29:12,480 All right, so that's a classical template 433 00:29:12,480 --> 00:29:15,000 for a greedy algorithm. 434 00:29:15,000 --> 00:29:19,350 You just go through these really simple steps. 435 00:29:19,350 --> 00:29:22,410 And the reason this is a template 436 00:29:22,410 --> 00:29:26,430 is because I haven't specified a particular rule, 437 00:29:26,430 --> 00:29:30,200 and so it's not quite an algorithm that you can code yet 438 00:29:30,200 --> 00:29:32,390 because we need a rule. 439 00:29:32,390 --> 00:29:38,230 So with all of that context, let me ask you. 440 00:29:38,230 --> 00:29:44,880 What is a rule that you think would work well 441 00:29:44,880 --> 00:29:47,090 for an interval scheduling problem? 442 00:29:47,090 --> 00:29:48,101 Yeah, go ahead. 443 00:29:48,101 --> 00:29:50,360 AUDIENCE: Select one with the earliest finish time. 444 00:29:50,360 --> 00:29:52,430 SRINIVAS DEVADAS: Select one with the earliest finish time. 445 00:29:52,430 --> 00:29:54,440 All right, well, I did not want that answer. 446 00:29:54,440 --> 00:29:56,640 [LAUGHTER] 447 00:29:56,640 --> 00:29:59,360 But now that you've given me the answer, 448 00:29:59,360 --> 00:30:01,640 I have to do something about this. 449 00:30:01,640 --> 00:30:06,040 So I want a different answer, so we'll go to a different person. 
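[Editor's sketch] The template described above (select a request with a simple rule, reject everything incompatible with it, repeat until all requests are processed) can be written out roughly as follows. This is a minimal illustration, not code from the lecture: the names `Request`, `compatible`, and `greedy_schedule` are hypothetical, and requests are assumed to be (start, finish) pairs that are compatible when one finishes no later than the other starts. The selection rule is left as a parameter, since the template does not fix one.

```python
from typing import Callable, List, Tuple

# Assumption: a request is a (start, finish) pair with start < finish.
Request = Tuple[int, int]


def compatible(a: Request, b: Request) -> bool:
    """Two requests are compatible when they do not overlap."""
    return a[1] <= b[0] or b[1] <= a[0]


def greedy_schedule(requests: List[Request],
                    rule: Callable[[List[Request]], Request]) -> List[Request]:
    """Greedy template: the selection rule is supplied by the caller."""
    remaining = list(requests)
    selected = []
    while remaining:
        # Step 1: use a simple rule to select a request.
        chosen = rule(remaining)
        selected.append(chosen)
        # Step 2: reject the chosen request and everything incompatible with it.
        remaining = [r for r in remaining
                     if r != chosen and compatible(r, chosen)]
        # Step 3: repeat on the smaller problem until nothing is left.
    return selected


# One long request that overlaps three short, mutually compatible ones.
requests = [(0, 10), (1, 3), (4, 6), (7, 9)]

# Rule "take them in the order given" picks the long request and stops.
in_order = greedy_schedule(requests, lambda rs: rs[0])

# Rule "earliest finish time" collects the three short requests.
by_finish = greedy_schedule(requests, lambda rs: min(rs, key=lambda r: r[1]))
```

Passing the rule in as a function mirrors the point made in the lecture: the template only becomes an algorithm once a rule is chosen.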
450 00:30:06,040 --> 00:30:13,480 But before I do that, let me reward you for that answer 451 00:30:13,480 --> 00:30:22,472 I did not want with a limited edition 6046 Frisbee, OK? 452 00:30:22,472 --> 00:30:25,200 [APPLAUSE] 453 00:30:25,200 --> 00:30:26,995 You need to stand up because I don't want 454 00:30:26,995 --> 00:30:28,120 to take people's heads off. 455 00:30:28,120 --> 00:30:28,240 [LAUGHTER] 456 00:30:28,240 --> 00:30:28,770 Yeah, sorry. 457 00:30:28,770 --> 00:30:31,522 All right, so here you go. 458 00:30:31,522 --> 00:30:32,468 All right? 459 00:30:32,468 --> 00:30:32,968 Good. 460 00:30:32,968 --> 00:30:34,420 [APPLAUSE] 461 00:30:34,420 --> 00:30:38,390 So people do cookies and candy. 462 00:30:38,390 --> 00:30:40,340 I think Eric, Nancy, and I are cooler. 463 00:30:40,340 --> 00:30:41,790 [LAUGHTER] 464 00:30:41,790 --> 00:30:45,800 So we do Frisbees. 465 00:30:45,800 --> 00:30:48,930 All right, good, so the fact of the matter 466 00:30:48,930 --> 00:30:54,570 was that this class was scheduled for 9:30 to 11:00 467 00:30:54,570 --> 00:30:56,200 on Tuesdays and Thursdays. 468 00:30:56,200 --> 00:30:59,370 That's when we decided to do Frisbees. 469 00:30:59,370 --> 00:31:01,530 And then it got shifted over to 11:00 to 12:30, 470 00:31:01,530 --> 00:31:05,270 but then we bought all these Frisbees, so we said, whatever. 471 00:31:05,270 --> 00:31:07,424 It's not like we could use all of them. 472 00:31:07,424 --> 00:31:08,090 All right, good. 473 00:31:08,090 --> 00:31:12,887 So I don't like that answer, and I want a different one. 474 00:31:12,887 --> 00:31:13,720 Give me another one. 475 00:31:13,720 --> 00:31:14,485 Yeah, go ahead. 476 00:31:14,485 --> 00:31:16,610 AUDIENCE: Just carry it through in numerical order. 477 00:31:16,610 --> 00:31:17,120 SRINIVAS DEVADAS: I'm sorry? 478 00:31:17,120 --> 00:31:19,340 AUDIENCE: Just carry it through in numerical order. 479 00:31:19,340 --> 00:31:21,590 SRINIVAS DEVADAS: Carry it through in numerical order.
480 00:31:21,590 --> 00:31:22,760 Is that going to work? 481 00:31:22,760 --> 00:31:25,770 And what's an example where it didn't work? 482 00:31:25,770 --> 00:31:27,830 The one right there, right? 483 00:31:27,830 --> 00:31:30,190 Should I get her a Frisbee? 484 00:31:30,190 --> 00:31:31,391 We should. 485 00:31:31,391 --> 00:31:33,140 I'm going to be generous at the beginning. 486 00:31:33,140 --> 00:31:33,723 You can just-- 487 00:31:36,280 --> 00:31:38,080 But that's an answer I liked. 488 00:31:38,080 --> 00:31:42,280 Yeah, there you go. 489 00:31:42,280 --> 00:31:46,370 So going through in numeric order isn't going to work. 490 00:31:46,370 --> 00:31:50,486 This is a great example right there. 491 00:31:50,486 --> 00:31:52,900 Give me another one. 492 00:31:52,900 --> 00:31:55,430 [LAUGHTER] 493 00:31:55,430 --> 00:31:56,820 There are no Frisbees right here. 494 00:31:56,820 --> 00:31:58,320 Over there, yeah? 495 00:31:58,320 --> 00:32:01,136 AUDIENCE: Try the one with the shortest time. 496 00:32:01,136 --> 00:32:03,510 SRINIVAS DEVADAS: Ah, try the one with the shortest time. 497 00:32:03,510 --> 00:32:09,001 OK, so the shortest time in this case might be this one. 498 00:32:09,001 --> 00:32:10,500 The shortest time might be this one, 499 00:32:10,500 --> 00:32:12,083 and, hey, that might work in this case 500 00:32:12,083 --> 00:32:14,600 because you pick this one, which is the shortest, 501 00:32:14,600 --> 00:32:16,860 or maybe it's 5, which is the shortest. 502 00:32:16,860 --> 00:32:20,870 Either way, you could get 2, 5, and 6, which, looking at this picture, 503 00:32:20,870 --> 00:32:22,420 seems to work. 504 00:32:22,420 --> 00:32:27,050 Maybe 4, 5, and 6 if you pick 5 first, et cetera, right? 505 00:32:27,050 --> 00:32:31,010 I'll give you a Frisbee if you can take that same algorithm 506 00:32:31,010 --> 00:32:32,260 and give me a counter example.
507 00:32:39,120 --> 00:32:43,762 AUDIENCE: Let's say you have two requests which don't overlap, 508 00:32:43,762 --> 00:32:44,512 and then there's-- 509 00:32:44,512 --> 00:32:46,678 SRINIVAS DEVADAS: --there's one right in the middle, 510 00:32:46,678 --> 00:32:47,680 exactly right. 511 00:32:47,680 --> 00:32:50,780 Yep, so let's see. 512 00:32:50,780 --> 00:32:51,830 What do I do? 513 00:32:51,830 --> 00:32:52,420 Oh, here. 514 00:32:56,320 --> 00:33:00,020 So pictorially, you can really look at this, 515 00:33:00,020 --> 00:33:04,960 and you can actually figure out whether your heuristic works 516 00:33:04,960 --> 00:33:05,460 or not. 517 00:33:05,460 --> 00:33:07,460 But this, I think, is what you were thinking about. 518 00:33:10,720 --> 00:33:12,950 There you go, right? 519 00:33:12,950 --> 00:33:14,450 So you get one. 520 00:33:17,322 --> 00:33:18,530 So that clearly doesn't work. 521 00:33:18,530 --> 00:33:24,730 So this one was smallest, doesn't work. 522 00:33:24,730 --> 00:33:27,090 The suggestion here was numeric order. 523 00:33:30,350 --> 00:33:31,030 It doesn't work. 524 00:33:34,900 --> 00:33:40,030 Here's one that might actually work. 525 00:33:40,030 --> 00:33:46,550 For each request, find the number 526 00:33:46,550 --> 00:33:48,020 of incompatible requests. 527 00:33:51,880 --> 00:33:52,880 So you've got a request. 528 00:33:52,880 --> 00:33:56,230 You can always intersect the other requests with it 529 00:33:56,230 --> 00:34:00,800 and decide whether the second request is compatible or not, 530 00:34:00,800 --> 00:34:03,670 and you do this for every other request. 531 00:34:03,670 --> 00:34:07,450 And you can collect together numbers associated 532 00:34:07,450 --> 00:34:12,240 with how many incompatible requests a particular request 533 00:34:12,240 --> 00:34:17,760 has, and you say, well, let me use that as a heuristic.
534 00:34:17,760 --> 00:34:21,750 So each request, find number of incompatible requests 535 00:34:21,750 --> 00:34:30,401 and select the one with the minimum number 536 00:34:30,401 --> 00:34:31,109 of incompatibles. 537 00:34:36,110 --> 00:34:40,090 So just to be clear, in this case, 538 00:34:40,090 --> 00:34:43,020 you would not select 1 because clearly 1 is 539 00:34:43,020 --> 00:34:45,659 incompatible with every other request, 540 00:34:45,659 --> 00:34:48,150 so that clearly is not numeric order. 541 00:34:48,150 --> 00:34:50,500 In this case, you would not select this one 542 00:34:50,500 --> 00:34:53,580 because it's incompatible with this one and that one. 543 00:34:53,580 --> 00:34:56,020 So you'd select that one which has the minimum number 544 00:34:56,020 --> 00:34:57,650 of incompatibles. 545 00:34:57,650 --> 00:34:59,630 So you think this is going to produce 546 00:34:59,630 --> 00:35:06,333 the correct answer, the maximum answer, in every possible case? 547 00:35:06,333 --> 00:35:07,260 AUDIENCE: No. 548 00:35:07,260 --> 00:35:10,460 SRINIVAS DEVADAS: No, who said, no? 549 00:35:10,460 --> 00:35:12,290 Well, anybody who said, no, should give me 550 00:35:12,290 --> 00:35:13,420 a counter example. 551 00:35:13,420 --> 00:35:14,450 Yeah, go for it. 552 00:35:14,450 --> 00:35:16,845 AUDIENCE: If the one that it selects 553 00:35:16,845 --> 00:35:21,156 has mutually incompatible collection of intervals 554 00:35:21,156 --> 00:35:23,560 with which it's compatible. 555 00:35:23,560 --> 00:35:27,740 SRINIVAS DEVADAS: Right, so that's a good thought. 556 00:35:27,740 --> 00:35:29,390 We'll have to [INAUDIBLE] that. 557 00:35:29,390 --> 00:35:31,910 And I think this particular example, 558 00:35:31,910 --> 00:35:34,940 that's exactly what you said, which 559 00:35:34,940 --> 00:35:43,290 just instantiates your notion of mutual incompatibility. 560 00:35:43,290 --> 00:35:46,360 So here's an example where I have something. 
561 00:35:46,360 --> 00:35:48,010 It's a little more complicated. 562 00:35:48,010 --> 00:35:50,960 As you can see, this is a pretty good heuristic. 563 00:35:50,960 --> 00:35:55,570 It's not perfect as you can see from this example, where 564 00:35:55,570 --> 00:36:01,805 I have something like this. 565 00:36:13,910 --> 00:36:22,600 So if you look at this, what I have here is 566 00:36:22,600 --> 00:36:25,800 I have just a bunch of requests which 567 00:36:25,800 --> 00:36:29,760 have-- this is incompatible with this and that 568 00:36:29,760 --> 00:36:33,870 and these two, so clearly a lot of incompatibilities for these, 569 00:36:33,870 --> 00:36:36,390 a lot of incompatibilities for these. 570 00:36:36,390 --> 00:36:38,580 Which is the minimum? 571 00:36:38,580 --> 00:36:43,460 The one in here, but what happens if you select that? 572 00:36:43,460 --> 00:36:47,720 Well, clearly you don't get this solution, which is optimal, 573 00:36:47,720 --> 00:36:51,800 the one on top. So this is a bad selection. 574 00:36:56,273 --> 00:36:59,880 And so this doesn't work either, OK? 575 00:36:59,880 --> 00:37:00,900 There you go. 576 00:37:04,120 --> 00:37:07,510 So as it turns out, the reason I didn't 577 00:37:07,510 --> 00:37:10,830 like that first answer was it was correct. 578 00:37:10,830 --> 00:37:12,440 [LAUGHTER] 579 00:37:12,440 --> 00:37:15,070 It's actually a beautiful heuristic. 580 00:37:15,070 --> 00:37:21,194 Earliest finish time is a heuristic that is-- well, 581 00:37:21,194 --> 00:37:22,860 it's not really a heuristic in the sense 582 00:37:22,860 --> 00:37:25,940 that if you use that selection rule, 583 00:37:25,940 --> 00:37:29,700 then it works in every case. 584 00:37:29,700 --> 00:37:35,150 In every case, it's going to get you the maximum number, OK? 585 00:37:35,150 --> 00:37:41,110 Earliest finish time, so what does that mean?
586 00:37:41,110 --> 00:37:46,690 Well, it just means that I'm going to just scan 587 00:37:46,690 --> 00:37:52,670 the f of i's associated with the list of requests that I have, 588 00:37:52,670 --> 00:37:56,680 and I'm going to pick the one that is minimum. 589 00:37:56,680 --> 00:37:59,860 Minimum f of i means earliest finish time. 590 00:37:59,860 --> 00:38:01,730 Now you can just step back, and I'm not 591 00:38:01,730 --> 00:38:06,650 going to do this for every diagram that I have up here, 592 00:38:06,650 --> 00:38:09,510 but look at every example that I've put up. 593 00:38:09,510 --> 00:38:14,250 Apply the selection rule associated with earliest finish 594 00:38:14,250 --> 00:38:17,840 time, and you'll see that it works and gets 595 00:38:17,840 --> 00:38:19,370 you the maximum number. 596 00:38:19,370 --> 00:38:27,630 For example, over here, this has the earliest finish time. 597 00:38:27,630 --> 00:38:29,760 Not this, not this, it's over here. 598 00:38:29,760 --> 00:38:35,170 So you pick that, and then you use the greedy algorithm step 2 599 00:38:35,170 --> 00:38:38,680 to eliminate all of the intervals that are 600 00:38:38,680 --> 00:38:41,500 incompatible, so these go away. 601 00:38:41,500 --> 00:38:45,050 Once this goes away, this one has the earliest finish 602 00:38:45,050 --> 00:38:48,480 time and so on and so forth. 603 00:38:48,480 --> 00:38:56,000 So this is something that you can prove through examples. 604 00:38:56,000 --> 00:38:58,230 That's not really a good notion when 605 00:38:58,230 --> 00:39:01,760 you can prove to yourself using examples. 606 00:39:01,760 --> 00:39:08,880 And this is where, I guess, the essence of 6046, 607 00:39:08,880 --> 00:39:12,780 and to some extent 006, comes into play.
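[Editor's sketch] The earliest-finish-time rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's code: requests are assumed to be (start, finish) pairs with start < finish, the function names are hypothetical, and sorting by finish time once is an implementation choice that stands in for repeatedly scanning for the minimum f(i). A shortest-request-first variant is included so the two rules can be compared on the kind of counterexample suggested from the audience (one short request sitting across two that do not overlap).

```python
def earliest_finish_time(requests):
    """Greedy selection by earliest finish time."""
    selected = []
    last_finish = float("-inf")
    # Visiting requests in order of finish time means the first compatible
    # request we meet is the one with minimum f(i) among those remaining.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:  # compatible with everything selected so far
            selected.append((start, finish))
            last_finish = finish
    return selected


def shortest_first(requests):
    """Greedy selection by shortest request, shown only for comparison."""
    selected = []
    for start, finish in sorted(requests, key=lambda r: r[1] - r[0]):
        if all(finish <= s or f <= start for s, f in selected):
            selected.append((start, finish))
    return selected


# Two non-overlapping requests with a shorter one across the gap.
requests = [(0, 5), (4, 7), (6, 11)]

eft_result = earliest_finish_time(requests)   # keeps the two outer requests
short_result = shortest_first(requests)       # keeps only the middle request
```

On this input the shortest-first rule commits to the middle request and blocks both neighbors, while earliest finish time picks the first request and then the third.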
608 00:39:12,780 --> 00:39:18,310 We will have to prove beyond a shadow of a doubt using 609 00:39:18,310 --> 00:39:23,820 mathematical rigor that the earliest finish time selection 610 00:39:23,820 --> 00:39:30,665 rule always gives us the maximum number of requests, 611 00:39:30,665 --> 00:39:31,790 and we're going to do that. 612 00:39:31,790 --> 00:39:33,750 It's going to take us a little bit of time, 613 00:39:33,750 --> 00:39:38,330 but that's the kind of thing you will be expected to do 614 00:39:38,330 --> 00:39:41,530 and you'll see a lot of in 046. 615 00:39:41,530 --> 00:39:43,590 OK? 616 00:39:43,590 --> 00:39:47,260 So everyone buy earliest finish time? 617 00:39:47,260 --> 00:39:48,462 Yep, go ahead. 618 00:39:48,462 --> 00:39:50,922 AUDIENCE: So what if we consider the simple path 619 00:39:50,922 --> 00:39:54,080 example of there's one request for the whole block, 620 00:39:54,080 --> 00:39:57,577 and there's one small request that it mentioned earlier. 621 00:39:57,577 --> 00:39:59,160 SRINIVAS DEVADAS: Well, you'll get one 622 00:39:59,160 --> 00:40:01,960 for-- if there's any two requests, 623 00:40:01,960 --> 00:40:03,810 your maximum number is 1. 624 00:40:03,810 --> 00:40:05,770 So you pick-- it doesn't matter-- 625 00:40:05,770 --> 00:40:08,840 it's not like you want efficiency of your resource. 626 00:40:08,840 --> 00:40:11,390 In this particular case, we will look at cases 627 00:40:11,390 --> 00:40:15,450 where you might have an extra consideration associated 628 00:40:15,450 --> 00:40:18,770 with your problem which changes the problem that says, 629 00:40:18,770 --> 00:40:22,210 I want my resource to be maximally utilized. 630 00:40:22,210 --> 00:40:24,770 If you do that, then this doesn't work. 631 00:40:24,770 --> 00:40:27,470 And that's exactly-- it's a great question you asked. 
632 00:40:27,470 --> 00:40:32,380 But I did say that we were going to look at the theme here, which 633 00:40:32,380 --> 00:40:38,150 I don't have anymore, but of how problems change algorithms. 634 00:40:38,150 --> 00:40:40,830 And so that's a problem change. 635 00:40:40,830 --> 00:40:41,956 You've got a question. 636 00:40:41,956 --> 00:40:43,414 AUDIENCE: I have a counter example. 637 00:40:43,414 --> 00:40:46,110 You have three intervals that don't 638 00:40:46,110 --> 00:40:48,085 conflict with one another. 639 00:40:48,085 --> 00:40:52,376 You have one interval that conflicts with the first two 640 00:40:52,376 --> 00:40:55,912 and ends earlier than the first one. 641 00:40:55,912 --> 00:40:57,620 SRINIVAS DEVADAS: OK, so are you claiming 642 00:40:57,620 --> 00:40:59,870 that there's going to be a counter example to earliest 643 00:40:59,870 --> 00:41:00,430 finish time? 644 00:41:00,430 --> 00:41:00,770 AUDIENCE: Yes. 645 00:41:00,770 --> 00:41:02,853 SRINIVAS DEVADAS: All right, I would write it down 646 00:41:02,853 --> 00:41:04,680 on a sheet of paper. 647 00:41:04,680 --> 00:41:07,930 And get me a concrete example, and you can just slide it by. 648 00:41:07,930 --> 00:41:13,470 And if you get that before I finish my proof, you win, OK? 649 00:41:13,470 --> 00:41:15,870 [LAUGHTER] 650 00:41:15,870 --> 00:41:18,000 So I would write it down. 651 00:41:18,000 --> 00:41:20,750 Just write it down, so good. 652 00:41:20,750 --> 00:41:22,600 All right, so this is a contest now. 653 00:41:22,600 --> 00:41:25,559 [LAUGHTER] 654 00:41:25,559 --> 00:41:27,600 All right, so we are going to try and prove this. 655 00:41:36,450 --> 00:41:39,780 So there's many ways you could prove things, 656 00:41:39,780 --> 00:41:42,320 and I mean prove things properly.
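[Editor's sketch] The hunt for a counterexample to earliest finish time can also be tried mechanically on small inputs, by comparing the greedy answer with an exhaustive search. This is only an illustration: the function names, the random instance generator, and its parameters (`trials`, `n`, `horizon`, `seed`) are all assumptions made here, and a search that finds nothing is evidence rather than a proof, which is exactly why the lecture goes on to prove the claim.

```python
import random
from itertools import combinations


def earliest_finish_time(requests):
    """Greedy selection by earliest finish time."""
    selected, last_finish = [], float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected


def brute_force_max(requests):
    """Size of the largest mutually compatible subset, by exhaustive search."""
    for size in range(len(requests), 0, -1):
        for subset in combinations(requests, size):
            if all(a[1] <= b[0] or b[1] <= a[0]
                   for a, b in combinations(subset, 2)):
                return size
    return 0


def search_for_counterexample(trials=200, n=6, horizon=12, seed=0):
    """Return a request list where greedy is not maximum, or None if none found."""
    rng = random.Random(seed)
    for _ in range(trials):
        requests = []
        for _ in range(n):
            start = rng.randrange(horizon)
            finish = rng.randrange(start + 1, horizon + 1)  # ensures start < finish
            requests.append((start, finish))
        if len(earliest_finish_time(requests)) != brute_force_max(requests):
            return requests
    return None


counterexample = search_for_counterexample()
```

The exhaustive search is exponential in the number of requests, so it is only usable for tiny instances like these.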
657 00:41:42,320 --> 00:41:46,290 And I don't know if you've read the old 6042 proof 658 00:41:46,290 --> 00:41:48,040 techniques that are invalid, which 659 00:41:48,040 --> 00:41:53,230 is things like proof by intimidation, proof 660 00:41:53,230 --> 00:41:56,890 because the lecturer said so, you know, things like that. 661 00:41:56,890 --> 00:41:59,030 This is going to be a classical proof technique. 662 00:41:59,030 --> 00:42:01,150 It's going to be a proof by induction. 663 00:42:01,150 --> 00:42:04,360 We're going to go into it in some detail. 664 00:42:04,360 --> 00:42:06,360 Later on in the term we are going 665 00:42:06,360 --> 00:42:08,890 to put out sketches of proofs. 666 00:42:08,890 --> 00:42:12,300 We are going to be skipping steps in lecture that 667 00:42:12,300 --> 00:42:16,810 are obvious or maybe not so obvious, 668 00:42:16,810 --> 00:42:19,570 but if you paid attention, then you 669 00:42:19,570 --> 00:42:23,770 can infer the middle step, for example. 670 00:42:23,770 --> 00:42:26,710 And so we'll be doing proof sketches, 671 00:42:26,710 --> 00:42:29,390 and proof sketches are not sketchy proofs. 672 00:42:29,390 --> 00:42:30,870 [LAUGHTER] 673 00:42:30,870 --> 00:42:32,470 So keep that in mind. 674 00:42:32,470 --> 00:42:34,700 But this particular proof that we're going to do, 675 00:42:34,700 --> 00:42:36,170 I'm going to put in all the steps 676 00:42:36,170 --> 00:42:39,210 because it's our first one. 677 00:42:39,210 --> 00:42:46,580 And so what we're going to do here is prove a claim, 678 00:42:46,580 --> 00:42:57,010 and the claim is simply that-- whoops, 679 00:42:57,010 --> 00:42:59,980 this is not writing very well. 680 00:43:03,390 --> 00:43:04,360 What is going on here? 681 00:43:11,200 --> 00:43:13,534 OK. 682 00:43:13,534 --> 00:43:16,450 [LAUGHTER] 683 00:43:16,450 --> 00:43:17,725 Back to the white.
684 00:43:22,480 --> 00:43:37,050 Given a list of intervals l, our greedy algorithm 685 00:43:37,050 --> 00:43:53,960 with earliest finish time produces k star intervals 686 00:43:53,960 --> 00:43:55,340 where k star is minimal. 687 00:44:01,250 --> 00:44:03,430 So that's what we like to prove. 688 00:44:03,430 --> 00:44:05,313 AUDIENCE: [INAUDIBLE]. 689 00:44:05,313 --> 00:44:06,938 SRINIVAS DEVADAS: Sorry, what happened? 690 00:44:06,938 --> 00:44:08,180 AUDIENCE: [INAUDIBLE] 691 00:44:08,180 --> 00:44:10,900 SRINIVAS DEVADAS: Oh, right. 692 00:44:10,900 --> 00:44:13,510 Good point. 693 00:44:13,510 --> 00:44:14,010 Maximum. 694 00:44:21,370 --> 00:44:24,720 What we're going to do is prove this by induction, 695 00:44:24,720 --> 00:44:27,010 and it's going to be induction on k star. 696 00:44:32,070 --> 00:44:40,990 And so the base case is almost always with induction proofs 697 00:44:40,990 --> 00:44:45,110 trivial, and it's similar here as well. 698 00:44:45,110 --> 00:44:49,960 And in the base case, if you have 699 00:44:49,960 --> 00:44:55,360 a single interval in your list, then 700 00:44:55,360 --> 00:44:57,950 obviously that's a trivial example. 701 00:44:57,950 --> 00:45:01,420 But what I'm saying here for the base is slightly different. 702 00:45:01,420 --> 00:45:05,510 It says that the optimal solution has a single interval, 703 00:45:05,510 --> 00:45:06,130 right? 704 00:45:06,130 --> 00:45:11,020 And so now if your problem has one interval or two intervals 705 00:45:11,020 --> 00:45:14,380 or three intervals, you can always pick one, 706 00:45:14,380 --> 00:45:18,205 and it's clearly going to be a valid schedule because you 707 00:45:18,205 --> 00:45:19,860 don't have to check compatibility. 
708 00:45:19,860 --> 00:45:22,490 And so the base case is trivial even 709 00:45:22,490 --> 00:45:27,990 in the case where you're not talking just 710 00:45:27,990 --> 00:45:32,560 of intervals that have cardinality 1, 711 00:45:32,560 --> 00:45:36,230 but the optimal schedule has cardinality 1. 712 00:45:36,230 --> 00:45:38,770 So that's a trivial case. 713 00:45:38,770 --> 00:45:46,750 So the hard work, of course, in the induction proofs is 714 00:45:46,750 --> 00:45:51,930 assuming the hypothesis and proving the n-plus-1, 715 00:45:51,930 --> 00:45:56,510 or in this case, the k-star-plus-1 case. 716 00:45:56,510 --> 00:45:59,330 And that's what we'll have to work on. 717 00:45:59,330 --> 00:46:13,910 So let's say that the claim holds for k star, 718 00:46:13,910 --> 00:46:28,260 and we are given a list of intervals 719 00:46:28,260 --> 00:46:33,860 whose optimal schedule is k star plus 1. 720 00:46:37,110 --> 00:46:45,660 It has k-star-plus-1 intervals in the optimal schedule, 721 00:46:45,660 --> 00:46:48,300 so L may be some large number, capital L, 722 00:46:48,300 --> 00:46:49,650 maybe in the hundreds. 723 00:46:49,650 --> 00:46:53,160 And k star, that may be 10 or what have you. 724 00:46:53,160 --> 00:46:53,910 They're different. 725 00:46:53,910 --> 00:46:56,080 I want to point that out. 726 00:46:56,080 --> 00:47:05,010 So our optimal schedule, we're going to write out as this, 727 00:47:05,010 --> 00:47:05,750 s star. 728 00:47:08,340 --> 00:47:13,550 So usually we use star for optimal in 046, and it's got 729 00:47:13,550 --> 00:47:22,510 k-star-plus-1 entries, and those entries look like sf pairs-- 730 00:47:22,510 --> 00:47:28,080 so I'm going to be using the subscript j1 through j k star 731 00:47:28,080 --> 00:47:35,440 plus 1 to denote these intervals. 732 00:47:35,440 --> 00:47:39,170 So the first one is sj1, fj1.
733 00:47:39,170 --> 00:47:41,670 That's an interval that's been selected 734 00:47:41,670 --> 00:47:44,510 and is part of our optimal solution. 735 00:47:44,510 --> 00:47:51,590 And then you keep going and we have sj k star 736 00:47:51,590 --> 00:47:59,470 plus 1 comma fj k star plus 1. 737 00:47:59,470 --> 00:48:06,240 So no getting away from subscripts here in 046. So 738 00:48:06,240 --> 00:48:15,010 that's what we have in terms of this is what the optimal 739 00:48:15,010 --> 00:48:15,710 schedule is. 740 00:48:15,710 --> 00:48:17,740 It's got size k star plus 1. 741 00:48:17,740 --> 00:48:19,990 Of course, what we have to show is 742 00:48:19,990 --> 00:48:26,154 that the greedy algorithm with the earliest finish time 743 00:48:26,154 --> 00:48:27,570 is going to produce something that 744 00:48:27,570 --> 00:48:30,900 is k star plus one in size. 745 00:48:30,900 --> 00:48:33,300 And so that's the hard part. 746 00:48:33,300 --> 00:48:36,330 We can assume the inductive hypothesis, 747 00:48:36,330 --> 00:48:37,950 and we'll have to do that. 748 00:48:37,950 --> 00:48:41,700 But there's a couple of steps in between. 749 00:48:41,700 --> 00:48:51,490 So let's say that what we have is s1 through k 750 00:48:51,490 --> 00:48:55,020 is what the greedy algorithm produces with the earliest 751 00:48:55,020 --> 00:48:56,126 finish time. 752 00:48:56,126 --> 00:49:10,350 So I'm going to write that down sik fik, 753 00:49:10,350 --> 00:49:17,800 so notice I have k here, and k and k star, at this point, 754 00:49:17,800 --> 00:49:18,640 are not comparable. 755 00:49:21,780 --> 00:49:29,860 I'm just making a statement that I took this particular problem 756 00:49:29,860 --> 00:49:35,800 that has k star plus 1 in terms of its optimal solution size, 757 00:49:35,800 --> 00:49:39,684 and for that problem, I have k intervals 758 00:49:39,684 --> 00:49:41,350 that are produced by the earliest finish 759 00:49:41,350 --> 00:49:43,760 time greedy heuristic.
760 00:49:43,760 --> 00:49:47,630 And so that's why the subscripts here are different. 761 00:49:47,630 --> 00:49:51,570 I have i1 here and ik, and then over here I 762 00:49:51,570 --> 00:49:54,320 have the j's, and so these intervals are different. 763 00:49:56,940 --> 00:50:07,160 If I look at f of i1, and if I look at f of j1, 764 00:50:07,160 --> 00:50:10,200 what can I say about f of i1 and f of j1? 765 00:50:15,720 --> 00:50:17,950 Is there a relationship between f of i1 and f of j1? 766 00:50:21,700 --> 00:50:23,160 They're equal? 767 00:50:23,160 --> 00:50:26,380 Do they have to be equal? 768 00:50:26,380 --> 00:50:26,880 Yeah? 769 00:50:26,880 --> 00:50:27,990 AUDIENCE: Less or equal to. 770 00:50:27,990 --> 00:50:29,365 SRINIVAS DEVADAS: Less than or equal 771 00:50:29,365 --> 00:50:33,740 to, exactly right, so f of i1 is less than or equal to f of j1. 772 00:50:33,740 --> 00:50:37,330 It's possible that you might end up 773 00:50:37,330 --> 00:50:43,130 with a different optimal solution that doesn't 774 00:50:43,130 --> 00:50:44,520 use the earliest finish time. 775 00:50:44,520 --> 00:50:46,940 We think earliest finish time is optimal at this point. 776 00:50:46,940 --> 00:50:49,680 We haven't proven it yet, but it's quite possible 777 00:50:49,680 --> 00:50:53,880 that you may have other solutions that 778 00:50:53,880 --> 00:50:56,650 are optimal that aren't necessarily the ones 779 00:50:56,650 --> 00:50:58,980 that earliest finish time gives you. 780 00:50:58,980 --> 00:51:01,260 So that's really why the less than or equal to 781 00:51:01,260 --> 00:51:04,490 is important here.
782 00:51:04,490 --> 00:51:07,180 Now what I'm going to do is create a schedule, 783 00:51:07,180 --> 00:51:13,270 s star star, that essentially is going to be taking s star 784 00:51:13,270 --> 00:51:18,270 and pulling out the first interval from s star 785 00:51:18,270 --> 00:51:21,000 and substituting it with the first interval 786 00:51:21,000 --> 00:51:24,020 from my greedy algorithm's schedule. 787 00:51:24,020 --> 00:51:25,820 So I'm just going to replace that, 788 00:51:25,820 --> 00:51:32,660 and so s star star is si1 fj1. 789 00:51:36,520 --> 00:51:42,500 And then I'm going to be going back to sj2 fj2 790 00:51:42,500 --> 00:51:46,690 because I'm going back to s star and all the other ones 791 00:51:46,690 --> 00:51:48,930 are coming from s star. 792 00:51:48,930 --> 00:52:01,780 So they're going to be sj k star plus 1 comma fj k star plus 1. 793 00:52:04,350 --> 00:52:08,130 So I just did a little substitution there associated 794 00:52:08,130 --> 00:52:13,950 with the optimal solution, and I stuck 795 00:52:13,950 --> 00:52:16,890 in part of the greedy algorithm solution, 796 00:52:16,890 --> 00:52:18,625 in fact, the very first interval. 797 00:52:18,625 --> 00:52:22,542 AUDIENCE: So the 1 should be i1. 798 00:52:22,542 --> 00:52:25,580 SRINIVAS DEVADAS: Oh, this should be-- i1. 799 00:52:25,580 --> 00:52:26,580 AUDIENCE: Right? 800 00:52:26,580 --> 00:52:28,640 SRINIVAS DEVADAS: i1, thank you. 801 00:52:28,640 --> 00:52:33,650 Yep, good. 802 00:52:33,650 --> 00:52:39,260 So we've got a couple of things to do, a couple of observations 803 00:52:39,260 --> 00:52:46,150 to make, and we're going to be able to prove 804 00:52:46,150 --> 00:52:48,680 some relationship between k and k star 805 00:52:48,680 --> 00:52:51,970 that is going to give us the proof for our claim. 806 00:52:56,570 --> 00:53:02,670 So clearly, s double star is also optimal.
807 00:53:02,670 --> 00:53:05,226 All I've done is taken one interval out 808 00:53:05,226 --> 00:53:06,600 and replaced it with another one. 809 00:53:06,600 --> 00:53:08,610 It hasn't changed the size. 810 00:53:08,610 --> 00:53:13,230 It goes up to k star plus 1, so s double star is also optimal. 811 00:53:13,230 --> 00:53:17,470 s star is optimal. s double star is optimal. 812 00:53:17,470 --> 00:53:29,210 Now I'm going to define L prime as the set of intervals 813 00:53:29,210 --> 00:53:35,720 with s of i greater than or equal to f of i1. 814 00:53:35,720 --> 00:53:37,330 So what is L prime? 815 00:53:37,330 --> 00:53:41,030 Well, L prime is what happens in the second step 816 00:53:41,030 --> 00:53:45,500 of the greedy algorithm, where in the second step 817 00:53:45,500 --> 00:53:48,720 of the greedy algorithm, once I've selected 818 00:53:48,720 --> 00:53:52,000 this particular interval and I've pulled it in, 819 00:53:52,000 --> 00:53:54,720 I have to reject all of the other intervals that 820 00:53:54,720 --> 00:53:57,840 are incompatible with this one. 821 00:53:57,840 --> 00:54:03,880 So I'm going to have to take only those intervals for which 822 00:54:03,880 --> 00:54:08,260 s of i is greater than or equal to f of i1 823 00:54:08,260 --> 00:54:14,140 because those are the ones that are compatible. 824 00:54:14,140 --> 00:54:16,190 So that's what L prime is. 825 00:54:16,190 --> 00:54:22,500 And I'm going to be able to say that since s double 826 00:54:22,500 --> 00:54:38,300 star is optimal for L, s double star 2 to k star plus 1 827 00:54:38,300 --> 00:54:40,615 is optimal for L prime. 828 00:54:44,760 --> 00:54:51,670 So I'm making a statement about this optimal solution.
829 00:54:51,670 --> 00:54:53,660 I know that's optimal, and basically 830 00:54:53,660 --> 00:54:58,730 what I'm saying is subsets of the optimal solution are going 831 00:54:58,730 --> 00:55:01,360 to have to be optimal because if that's not the case, 832 00:55:01,360 --> 00:55:06,850 I could always substitute something better and exceed 833 00:55:06,850 --> 00:55:12,110 the size of the k star plus 1 optimal solution, which 834 00:55:12,110 --> 00:55:14,790 obviously would be a contradiction. 835 00:55:14,790 --> 00:55:20,730 So s double star is optimal for L, 836 00:55:20,730 --> 00:55:23,816 and therefore s double star 2 through k star 837 00:55:23,816 --> 00:55:26,810 plus 1 is optimal for L prime. 838 00:55:26,810 --> 00:55:28,120 Everybody buy that? 839 00:55:28,120 --> 00:55:29,020 Yep? 840 00:55:29,020 --> 00:55:31,720 Good. 841 00:55:31,720 --> 00:55:33,440 And so what this means, of course, 842 00:55:33,440 --> 00:55:49,100 is that the optimal schedule for L prime has k star size. 843 00:55:49,100 --> 00:55:50,290 And I'm starting with 2. 844 00:55:50,290 --> 00:55:51,650 I've taken away 1. 845 00:55:51,650 --> 00:55:55,030 So now I have L prime, which is a smaller problem. 846 00:55:55,030 --> 00:55:58,575 Now you see where the proof is headed, if you didn't already. 847 00:55:58,575 --> 00:56:01,250 I have a smaller problem, which is L prime. 848 00:56:01,250 --> 00:56:03,680 Clearly, it's got fewer requests, 849 00:56:03,680 --> 00:56:08,930 and I have constructed an optimal schedule 850 00:56:08,930 --> 00:56:11,570 for that problem by pulling it out 851 00:56:11,570 --> 00:56:15,910 of the original optimal schedule I was given. 852 00:56:15,910 --> 00:56:21,950 And the size of that optimal schedule is k star.
853 00:56:21,950 --> 00:56:26,140 And now I get to invoke my inductive hypothesis 854 00:56:26,140 --> 00:56:29,150 because my inductive hypothesis says 855 00:56:29,150 --> 00:56:31,960 that this claim that I have up there holds 856 00:56:31,960 --> 00:56:36,160 for any set of problems that have 857 00:56:36,160 --> 00:56:39,430 an optimal schedule of size k star. 858 00:56:39,430 --> 00:56:42,610 That's what the inductive hypothesis gives me. 859 00:56:42,610 --> 00:56:56,340 And so by the inductive hypothesis, 860 00:56:56,340 --> 00:57:09,520 when I run the greedy algorithm on L prime, 861 00:57:09,520 --> 00:57:19,340 I'm going to get a schedule of size k star. 862 00:57:28,920 --> 00:57:33,070 Now can you tell me, based on what you see on the board, 863 00:57:33,070 --> 00:57:37,230 by construction, when I run the greedy algorithm, 864 00:57:37,230 --> 00:57:41,830 what am I getting on L star? 865 00:57:41,830 --> 00:57:46,980 By construction, when I run the greedy algorithm on L prime-- 866 00:57:46,980 --> 00:57:49,840 there's too many superscripts here-- 867 00:57:49,840 --> 00:57:52,980 when I run the greedy algorithm on L prime, what do I get? 868 00:57:55,840 --> 00:57:56,851 Someone? 869 00:57:56,851 --> 00:57:57,350 Yeah? 870 00:57:57,350 --> 00:58:01,244 AUDIENCE: We get s of i sub 2, f of i sub 2 interval. 871 00:58:01,244 --> 00:58:02,660 SRINIVAS DEVADAS: Exactly right, I 872 00:58:02,660 --> 00:58:06,500 get everything from the second thing 873 00:58:06,500 --> 00:58:09,910 here all the way to the end because that's exactly 874 00:58:09,910 --> 00:58:11,410 what the greedy algorithm does. 875 00:58:11,410 --> 00:58:15,020 Remember, the greedy algorithm picked si1 fi1, 876 00:58:15,020 --> 00:58:19,140 and then rejected all requests that are incompatible and then 877 00:58:19,140 --> 00:58:20,090 moved on.
878 00:58:20,090 --> 00:58:23,180 When you rejected all requests that are incompatible here, 879 00:58:23,180 --> 00:58:25,830 you got exactly L prime. 880 00:58:25,830 --> 00:58:29,130 And by construction, the greedy algorithm 881 00:58:29,130 --> 00:58:35,800 should have given me all the way from si2 to sik. 882 00:58:35,800 --> 00:58:37,620 Thank you. 883 00:58:37,620 --> 00:58:54,620 So by construction, the greedy on L prime 884 00:58:54,620 --> 00:59:00,760 gives s2 to k, right? 885 00:59:00,760 --> 00:59:02,985 And what is the size of this? 886 00:59:02,985 --> 00:59:07,990 2 to k gives me a size of k minus 1. 887 00:59:07,990 --> 00:59:08,970 This is k minus 1. 888 00:59:15,910 --> 00:59:21,620 So if I put these two things together, 889 00:59:21,620 --> 00:59:23,125 what is the next step? 890 00:59:23,125 --> 00:59:26,330 I have the inductive hypothesis giving me a fact. 891 00:59:26,330 --> 00:59:29,380 I have the construction giving me something. 892 00:59:29,380 --> 00:59:32,600 Now I can relate k and k star. 893 00:59:32,600 --> 00:59:33,600 What's the relationship? 894 00:59:38,440 --> 00:59:41,720 k star is equal to k minus 1, right? 895 00:59:41,720 --> 00:59:44,710 Do people see that? 896 00:59:44,710 --> 00:59:51,100 So size k star is just k minus 1. 897 00:59:51,100 --> 00:59:57,680 So what that means is given that s2k is of size k 898 00:59:57,680 --> 01:00:05,280 star, it means that s1k is of size k star plus 1, 899 01:00:05,280 --> 01:00:07,940 which is exactly what I want. 900 01:00:07,940 --> 01:00:11,860 That's optimal because I said in the beginning 901 01:00:11,860 --> 01:00:16,550 that we had k star plus 1 in our inductive hypothesis this case 902 01:00:16,550 --> 01:00:18,400 as being the optimal solution. 903 01:00:18,400 --> 01:00:21,980 So this last step here is all you 904 01:00:21,980 --> 01:00:30,220 need to argue now that s of 1k, going back up here, 905 01:00:30,220 --> 01:00:42,140 this is optimal because k equals k star plus 1.
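The structure of this induction can be checked on a small instance. The following is a minimal Python sketch, not code from the lecture: the function name `greedy_schedule`, the `(start, finish)` tuple representation, and the sample requests are all illustrative assumptions, and two requests are treated as compatible when one starts at or after the other finishes. It runs earliest-finish-time-first on L, builds L prime from the requests compatible with the first pick, and confirms that greedy on L prime returns the tail of the original schedule.

```python
def greedy_schedule(requests):
    """Earliest-finish-time-first over (start, finish) pairs.

    Assumption: a request may start exactly when the previous one finishes.
    """
    chosen = []
    last_finish = float("-inf")
    # Consider requests in order of increasing finish time.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen


# Hypothetical instance L.
L = [(0, 3), (2, 5), (4, 7), (6, 9), (8, 10)]
schedule = greedy_schedule(L)

# L prime: requests compatible with the first greedy pick.
first_finish = schedule[0][1]
L_prime = [r for r in L if r[0] >= first_finish]
tail = greedy_schedule(L_prime)
```

On this instance `tail` equals `schedule[1:]`, which is the "by construction" step of the proof: greedy on L prime yields the second through k-th picks, so its size is k minus 1.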
906 01:00:42,140 --> 01:00:48,440 There you go, so that's the kind of argument that you have 907 01:00:48,440 --> 01:00:53,510 to make in order to prove something like this in 046. 908 01:00:53,510 --> 01:00:57,010 And what you'll see in your problem sets, 909 01:00:57,010 --> 01:00:59,380 including the one that's going to come out on Thursday, 910 01:00:59,380 --> 01:01:02,910 is a different problem that you 911 01:01:02,910 --> 01:01:05,907 have to have a proof of a greedy algorithm for. 912 01:01:05,907 --> 01:01:07,490 I forget exactly what technique you'll 913 01:01:07,490 --> 01:01:09,460 have used there, perhaps induction, 914 01:01:09,460 --> 01:01:10,900 perhaps contradiction. 915 01:01:10,900 --> 01:01:12,650 And these are the kinds of things 916 01:01:12,650 --> 01:01:17,110 that get you to the point where you've 917 01:01:17,110 --> 01:01:19,830 analyzed the correctness of algorithms, 918 01:01:19,830 --> 01:01:22,720 not just the fact that you're getting a valid schedule, 919 01:01:22,720 --> 01:01:26,330 but you're getting a valid maximum schedule 920 01:01:26,330 --> 01:01:29,520 in terms of the maximum number of requests. 921 01:01:29,520 --> 01:01:32,920 Any questions about this? 922 01:01:32,920 --> 01:01:35,110 Do people buy the proof? 923 01:01:35,110 --> 01:01:36,308 Yep. 924 01:01:36,308 --> 01:01:37,730 Good. 925 01:01:37,730 --> 01:01:42,080 So that was greedy for a particular problem. 926 01:01:42,080 --> 01:01:43,890 I told you that the theme of our lecture 927 01:01:43,890 --> 01:01:51,200 here was changing the problem and getting 928 01:01:51,200 --> 01:01:58,800 different algorithms that had different complexities. 929 01:01:58,800 --> 01:02:00,050 So let's go ahead and do that.
930 01:02:00,050 --> 01:02:03,130 So the rest of this lecture, we'll 931 01:02:03,130 --> 01:02:05,290 just take a look at different kinds of problems 932 01:02:05,290 --> 01:02:09,720 and talk a little more superficially about what 933 01:02:09,720 --> 01:02:12,160 the problem complexities are. 934 01:02:12,160 --> 01:02:14,480 And so one thing that might come to mind 935 01:02:14,480 --> 01:02:18,890 is that you'd like to do weighted interval scheduling. 936 01:02:23,600 --> 01:02:38,350 And what happens here is each request has weight wi, 937 01:02:38,350 --> 01:02:48,420 and what we want to do is schedule a subset of requests 938 01:02:48,420 --> 01:02:50,180 with maximum weight. 939 01:02:50,180 --> 01:02:52,600 So previously, it was just all weights were 1, 940 01:02:52,600 --> 01:02:57,150 so maximum cardinality was what we wanted. 941 01:02:57,150 --> 01:02:59,850 But now we want to schedule a subset of requests 942 01:02:59,850 --> 01:03:03,060 with maximum weight. 943 01:03:03,060 --> 01:03:10,470 Someone give me an argument as to whether the greedy algorithm 944 01:03:10,470 --> 01:03:15,300 earliest finish time first is optimal for this weighted case, 945 01:03:15,300 --> 01:03:18,320 or give me a counter example. 946 01:03:18,320 --> 01:03:19,960 Yep, go ahead. 947 01:03:19,960 --> 01:03:22,285 AUDIENCE: Oh, well, you know like your first example 948 01:03:22,285 --> 01:03:25,378 you have your first weight of the first interval, 949 01:03:25,378 --> 01:03:26,836 it took the whole time, [INAUDIBLE] 950 01:03:26,836 --> 01:03:28,324 would have three smaller ones? 951 01:03:28,324 --> 01:03:31,862 Well, if the weight of the first one was 20 and then-- 952 01:03:31,862 --> 01:03:33,570 SRINIVAS DEVADAS: Exactly, exactly right. 953 01:03:33,570 --> 01:03:34,830 All right, I owe you one too. 954 01:03:34,830 --> 01:03:38,010 So here you go. 955 01:03:38,010 --> 01:03:41,220 So it's a fairly trivial example. 
956 01:03:41,220 --> 01:03:51,330 All you do is w equals 1, w equals 1, w equals 3, 957 01:03:51,330 --> 01:03:53,210 so there you go. 958 01:03:53,210 --> 01:03:54,800 So clearly, the earliest finish time 959 01:03:54,800 --> 01:03:57,980 would pick this one and then this one, which is fine. 960 01:03:57,980 --> 01:04:00,570 You get two of these, but this was important. 961 01:04:00,570 --> 01:04:04,750 This is, I don't know, sleep party, 6046. 962 01:04:04,750 --> 01:04:06,640 [LAUGHTER] 963 01:04:06,640 --> 01:04:08,580 So there you go. 964 01:04:08,580 --> 01:04:11,155 So the weight it is, we should make that infinity. 965 01:04:15,050 --> 01:04:17,720 Most important thing in the world at least 966 01:04:17,720 --> 01:04:19,240 for the next six months. 967 01:04:23,130 --> 01:04:24,790 So how does this work now? 968 01:04:28,640 --> 01:04:33,410 So it turns out that the greedy strategy, 969 01:04:33,410 --> 01:04:37,950 the template that I had, fails. 970 01:04:37,950 --> 01:04:41,660 There's nothing that exists on this planet 971 01:04:41,660 --> 01:04:47,520 that, at least I know of, where you can have a simple rule 972 01:04:47,520 --> 01:04:52,900 and use that template to get the optimum solution, in this case, 973 01:04:52,900 --> 01:04:57,989 maximum weight solution, for every problem instance, 974 01:04:57,989 --> 01:04:59,155 so that template just fails. 975 01:05:03,730 --> 01:05:05,426 What other programming paradigm do you 976 01:05:05,426 --> 01:05:06,550 think would be useful here? 977 01:05:09,790 --> 01:05:10,909 Yeah, go ahead. 978 01:05:10,909 --> 01:05:11,450 AUDIENCE: DP. 979 01:05:11,450 --> 01:05:12,820 SRINIVAS DEVADAS: DP, right. 980 01:05:12,820 --> 01:05:18,293 So do you want to take a stab at a potential DP solution here? 981 01:05:18,293 --> 01:05:21,524 AUDIENCE: Yeah, so either include it in your [INAUDIBLE] 982 01:05:21,524 --> 01:05:25,480 or discard it and then continue with set of other intervals. 
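The counterexample with weights 1, 1, and 3 can be reproduced with a short Python sketch. The exact start and finish times on the board are not in the transcript, so the intervals below are assumptions chosen to match the description: two short compatible requests of weight 1 and one long request of weight 3 that overlaps both. The function name `earliest_finish_first` is also illustrative.

```python
def earliest_finish_first(requests):
    """Greedy by earliest finish time over (start, finish, weight) triples.

    The weight is ignored when choosing, which is exactly why this rule
    can fail in the weighted case.
    """
    chosen = []
    last_finish = float("-inf")
    for start, finish, weight in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:
            chosen.append((start, finish, weight))
            last_finish = finish
    return chosen


# Assumed intervals: two short requests (w = 1) and one long request (w = 3).
requests = [(0, 2, 1), (3, 5, 1), (1, 6, 3)]

picked = earliest_finish_first(requests)
greedy_weight = sum(weight for _, _, weight in picked)

# Scheduling the long request alone is also a valid schedule.
long_request_weight = 3
```

Here the greedy rule picks the two short requests for a total weight of 2, while the single long request alone has weight 3, so earliest finish time first is not optimal once weights differ.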
983 01:05:25,480 --> 01:05:28,190 SRINIVAS DEVADAS: Yeah, that's a perfect divide and conquer. 984 01:05:28,190 --> 01:05:30,820 And then when you include it, what do you have to do? 985 01:05:30,820 --> 01:05:32,737 AUDIENCE: Eliminate all conflicting intervals. 986 01:05:32,737 --> 01:05:34,611 SRINIVAS DEVADAS: Right, how many subproblems 987 01:05:34,611 --> 01:05:36,220 do you think there are? 988 01:05:36,220 --> 01:05:38,588 I want to make you own your Frisbee, right? 989 01:05:38,588 --> 01:05:42,720 [LAUGHTER] 990 01:05:42,720 --> 01:05:50,070 AUDIENCE: 2 to the power of the number of intervals 991 01:05:50,070 --> 01:05:51,690 you have because-- 992 01:05:51,690 --> 01:05:54,165 SRINIVAS DEVADAS: Well, that's the number of subsets 993 01:05:54,165 --> 01:05:55,686 that you have. 994 01:05:55,686 --> 01:05:57,060 So you have n intervals, then you 995 01:05:57,060 --> 01:05:58,600 have two [INAUDIBLE] subsets. 996 01:05:58,600 --> 01:05:59,590 AUDIENCE: Yeah. 997 01:05:59,590 --> 01:06:01,048 SRINIVAS DEVADAS: But remember, you 998 01:06:01,048 --> 01:06:04,230 want to go-- you want to be smarter than that, right? 999 01:06:04,230 --> 01:06:07,940 You want to be a little bit smarter than that. 1000 01:06:07,940 --> 01:06:10,560 So here, you get a Frisbee anyway. 1001 01:06:10,560 --> 01:06:11,570 [LAUGHTER] 1002 01:06:11,570 --> 01:06:14,150 No, not anyway, here you go. 1003 01:06:14,150 --> 01:06:16,150 Right. 1004 01:06:16,150 --> 01:06:18,760 So anybody else? 1005 01:06:18,760 --> 01:06:21,470 So what I want to use is dynamic programming. 1006 01:06:21,470 --> 01:06:23,110 We've established that. 1007 01:06:23,110 --> 01:06:24,642 I want to use dynamic programming. 1008 01:06:24,642 --> 01:06:27,100 And the dynamic programming-- you have some experience with 1009 01:06:27,100 --> 01:06:32,210 that in 006-- the name of the game is to figure out what 1010 01:06:32,210 --> 01:06:35,320 the subproblems are.
1011 01:06:35,320 --> 01:06:36,960 The subproblems are kind of going 1012 01:06:36,960 --> 01:06:39,880 to look like a collection of requests. 1013 01:06:39,880 --> 01:06:42,180 I mean, there's no two things about it. 1014 01:06:42,180 --> 01:06:45,260 They're going to be a collection of requests, 1015 01:06:45,260 --> 01:06:50,430 and so the challenge here is not to go to the 2 raised to n, 1016 01:06:50,430 --> 01:06:55,960 because 2 raised to n is bad if you want efficiency. 1017 01:06:55,960 --> 01:07:00,500 So we have to have a polynomial number of subproblems. 1018 01:07:00,500 --> 01:07:03,160 So someone who hasn't answered yet, go ahead. 1019 01:07:03,160 --> 01:07:10,920 AUDIENCE: [INAUDIBLE] so [INAUDIBLE] subset [INAUDIBLE] 1020 01:07:10,920 --> 01:07:16,122 So from interval i to interval j [INAUDIBLE]. 1021 01:07:16,122 --> 01:07:17,580 SRINIVAS DEVADAS: So you're looking 1022 01:07:17,580 --> 01:07:22,520 at every pair of i's and j's, and, well, not all of them 1023 01:07:22,520 --> 01:07:23,460 are going to be valid. 1024 01:07:23,460 --> 01:07:26,070 There won't be intervals associated with that, 1025 01:07:26,070 --> 01:07:29,280 but that's a reasonable start. 1026 01:07:29,280 --> 01:07:32,860 Someone else, someone who hasn't answered? 1027 01:07:32,860 --> 01:07:33,963 Yeah, back there. 1028 01:07:33,963 --> 01:07:35,895 AUDIENCE: You could do the best from 1029 01:07:35,895 --> 01:07:39,609 the start to some given point, and so there'd be n of those. 1030 01:07:39,609 --> 01:07:42,150 SRINIVAS DEVADAS: Ah, best from the start to any given point. 1031 01:07:42,150 --> 01:07:45,665 All right, well, you got close, Michael. 1032 01:07:45,665 --> 01:07:47,180 There you go. 1033 01:07:47,180 --> 01:07:50,360 You need to stand up. 1034 01:07:50,360 --> 01:07:52,660 Ew, bad throw. 1035 01:07:52,660 --> 01:07:54,100 That's a bad throw. 1036 01:07:54,100 --> 01:07:56,570 I've got to practice.
1037 01:07:56,570 --> 01:08:00,800 OK, so as you can see with dynamic programming, 1038 01:08:00,800 --> 01:08:03,550 the challenge is to figure out what the subproblems are. 1039 01:08:03,550 --> 01:08:05,710 The fact of the matter is that there's 1040 01:08:05,710 --> 01:08:10,895 going to be many different possible algorithms that 1041 01:08:10,895 --> 01:08:13,630 are all DP for this weighted problem. 1042 01:08:13,630 --> 01:08:15,520 There's at least two interesting ones. 1043 01:08:15,520 --> 01:08:18,350 We're going to do a simple one, which is based on the answer 1044 01:08:18,350 --> 01:08:21,689 that the gentleman here just gave. 1045 01:08:21,689 --> 01:08:24,550 But it turns out you can be a little smarter than that, 1046 01:08:24,550 --> 01:08:29,451 and most likely you'll hear the smarter way in the section, 1047 01:08:29,451 --> 01:08:31,200 but let's do the simple one because that's 1048 01:08:31,200 --> 01:08:33,210 all I have time here for. 1049 01:08:33,210 --> 01:08:36,569 And the key is to define the subproblems, 1050 01:08:36,569 --> 01:08:40,210 and then once you do that, the actual recursion ends up 1051 01:08:40,210 --> 01:08:45,380 being a fairly straightforward and intuitive step. 1052 01:08:45,380 --> 01:08:56,770 So let's look at dynamic programming, one particular way 1053 01:08:56,770 --> 01:09:01,020 of solving this problem, using the DP paradigm. 1054 01:09:01,020 --> 01:09:07,149 So what I'm going to do is define subproblems R star, 1055 01:09:07,149 --> 01:09:10,899 so R is the total number of requests that we have, 1056 01:09:10,899 --> 01:09:13,370 and the subproblems are going to correspond 1057 01:09:13,370 --> 01:09:19,830 to-- I'm going to request j belonging to R such 1058 01:09:19,830 --> 01:09:22,010 that-- oh, I'm sorry. 1059 01:09:22,010 --> 01:09:31,130 This is R of x-- such that sj is greater than or equal to x. 
1060 01:09:31,130 --> 01:09:37,960 So what I'm doing here is, given a particular x, 1061 01:09:37,960 --> 01:09:40,680 I can always shrink the number of requests 1062 01:09:40,680 --> 01:09:45,340 that I have based on this rule. 1063 01:09:45,340 --> 01:09:48,279 And then you might ask, what is x? 1064 01:09:48,279 --> 01:09:59,410 And now you can apply the same subsetting property 1065 01:09:59,410 --> 01:10:07,030 by choosing the x's to be the finishing times of all 1066 01:10:07,030 --> 01:10:08,535 of the other requests. 1067 01:10:08,535 --> 01:10:12,210 All right, so x equals f of i. 1068 01:10:12,210 --> 01:10:17,680 So what this means is-- then I put f of i over here-- 1069 01:10:17,680 --> 01:10:20,360 it means all of the requests that 1070 01:10:20,360 --> 01:10:26,940 come after the i-th request finishes are part of R of fi. 1071 01:10:26,940 --> 01:10:43,350 So R of fi would simply be requests later than f of i. 1072 01:10:43,350 --> 01:10:46,430 And there's something subtle here that I want to point out, 1073 01:10:46,430 --> 01:10:51,790 which is R of fi is not the set of requests 1074 01:10:51,790 --> 01:10:55,910 that are compatible with the i-th request. 1075 01:10:55,910 --> 01:10:57,810 It's not exactly that. 1076 01:10:57,810 --> 01:11:02,324 It's the set of requests that are later than f of i. 1077 01:11:02,324 --> 01:11:04,800 So keep that in mind because what happens here 1078 01:11:04,800 --> 01:11:09,620 is we're going to solve this problem step by step. 1079 01:11:09,620 --> 01:11:13,990 We're going to construct the dynamic programming solution 1080 01:11:13,990 --> 01:11:19,410 essentially by picking a request, 1081 01:11:19,410 --> 01:11:21,570 just like in the greedy case, and then taking 1082 01:11:21,570 --> 01:11:23,310 the request that comes after that.
1083 01:11:23,310 --> 01:11:26,690 So we're going to pick an early request, 1084 01:11:26,690 --> 01:11:28,570 and then we're going to subset the solution, 1085 01:11:28,570 --> 01:11:31,630 pick the next one just like we did with the greedy. 1086 01:11:31,630 --> 01:11:35,110 And so the subproblems that we will actually 1087 01:11:35,110 --> 01:11:38,090 solve potentially bottom up if we are doing recursion 1088 01:11:38,090 --> 01:11:43,290 are going to correspond to a set of requests that come later 1089 01:11:43,290 --> 01:11:48,620 than the particular subset that we're looking at, 1090 01:11:48,620 --> 01:11:51,770 which is defined by a particular interval. 1091 01:11:51,770 --> 01:11:55,036 So requests that are later than f of i, not necessarily 1092 01:11:55,036 --> 01:11:56,660 all of the requests that are compatible 1093 01:11:56,660 --> 01:11:59,043 with the i-th request. 1094 01:11:59,043 --> 01:12:04,230 And so if you do that, then the number of subproblems 1095 01:12:04,230 --> 01:12:10,680 here are small n, where n is the number of requests. 1096 01:12:10,680 --> 01:12:16,640 So if n is the number of requests 1097 01:12:16,640 --> 01:12:27,520 in the original problem, the number of subproblems 1098 01:12:27,520 --> 01:12:32,070 equals n because all I do is plug in an appropriate i, 1099 01:12:32,070 --> 01:12:35,110 find f of i for it, and generate the R of f of i 1100 01:12:35,110 --> 01:12:36,290 for each of those. 1101 01:12:36,290 --> 01:12:39,210 So there's going to be n of those subproblems. 1102 01:12:39,210 --> 01:12:51,406 And we're going to solve each subproblem once and then 1103 01:12:51,406 --> 01:12:51,905 memoize.
1104 01:12:56,140 --> 01:12:59,180 And so the work that we have to do 1105 01:12:59,180 --> 01:13:04,140 is the basic rule corresponding to the complexity of a DP, 1106 01:13:04,140 --> 01:13:16,800 which is number of subproblems times the time 1107 01:13:16,800 --> 01:13:24,860 to solve each subproblem, or a single subproblem, 1108 01:13:24,860 --> 01:13:33,540 and this assumes order 1 for lookups. 1109 01:13:33,540 --> 01:13:36,240 So you can think of the recursive calls 1110 01:13:36,240 --> 01:13:44,260 as being order 1 because you're assuming 1111 01:13:44,260 --> 01:13:46,900 you're doing memoization. 1112 01:13:46,900 --> 01:13:50,450 So I haven't really told you anything here that you 1113 01:13:50,450 --> 01:13:56,150 haven't seen in 006 and likely applied a bunch of times. 1114 01:13:56,150 --> 01:14:00,410 Over here, we've just defined what our subproblems are 1115 01:14:00,410 --> 01:14:04,680 for our particular DP, and we argued 1116 01:14:04,680 --> 01:14:06,450 that the number of subproblems that 1117 01:14:06,450 --> 01:14:09,110 are associated with this particular choice 1118 01:14:09,110 --> 01:14:11,560 of subproblems corresponds to n if you 1119 01:14:11,560 --> 01:14:14,750 have n requests in the original problem instance 1120 01:14:14,750 --> 01:14:16,460 that you've given. 1121 01:14:16,460 --> 01:14:21,310 So the last thing that we have to do here to solve our DP 1122 01:14:21,310 --> 01:14:23,830 is, of course, to write our recursion 1123 01:14:23,830 --> 01:14:25,830 and to convince ourselves that this actually all 1124 01:14:25,830 --> 01:14:28,950 works out, and let's do that. 1125 01:14:35,290 --> 01:14:41,700 And so what we have here is our DP guessing. 1126 01:14:45,520 --> 01:14:51,530 And we're going to try each request 1127 01:14:51,530 --> 01:15:03,060 i as a plausible first request, and so that's where this works.
1128 01:15:03,060 --> 01:15:06,000 You might be thinking, boy, I mean, this R of fi 1129 01:15:06,000 --> 01:15:07,450 looks a little strange. 1130 01:15:07,450 --> 01:15:10,230 Why doesn't it include all of the requests that 1131 01:15:10,230 --> 01:15:15,050 are compatible with the i-th request? 1132 01:15:15,050 --> 01:15:19,590 I mean, I'm somehow shrinking my subsequent problem size 1133 01:15:19,590 --> 01:15:22,165 if I'm ignoring some requests that 1134 01:15:22,165 --> 01:15:25,380 are earlier that really should be part of-- 1135 01:15:25,380 --> 01:15:27,610 or are part of the compatible set, 1136 01:15:27,610 --> 01:15:30,380 but they're not part of the R of fi set. 1137 01:15:30,380 --> 01:15:32,230 And so some of you may be thinking that, 1138 01:15:32,230 --> 01:15:35,700 well, the reason this is going to work out 1139 01:15:35,700 --> 01:15:38,160 is because we are going to construct 1140 01:15:38,160 --> 01:15:42,900 our solution, as I said before, from the beginning to the end. 1141 01:15:42,900 --> 01:15:45,330 So we're going to try each request 1142 01:15:45,330 --> 01:15:48,370 as a plausible first request. 1143 01:15:48,370 --> 01:15:51,840 So even though this request might be in our chart 1144 01:15:51,840 --> 01:15:56,290 all the way to the right, it might have a huge weight, 1145 01:15:56,290 --> 01:16:01,360 and so I'm going to have to try that out as my first selection. 1146 01:16:01,360 --> 01:16:03,780 And when I try that out as my first selection, 1147 01:16:03,780 --> 01:16:05,760 then the definition of my subproblem 1148 01:16:05,760 --> 01:16:07,246 says that this will work. 1149 01:16:07,246 --> 01:16:08,870 I only have to look at the requests that 1150 01:16:08,870 --> 01:16:11,900 come later than that because the ones that came earlier, 1151 01:16:11,900 --> 01:16:14,220 I've tried them out too.
1152 01:16:14,220 --> 01:16:17,750 So that's something that you need to keep in mind in order 1153 01:16:17,750 --> 01:16:21,390 to argue correctness of this recursion 1154 01:16:21,390 --> 01:16:23,760 that I'm going to write out now. 1155 01:16:23,760 --> 01:16:31,270 And so the recursion, and I have opt R, what is the first thing 1156 01:16:31,270 --> 01:16:33,520 that I'm going to have on the right-hand side 1157 01:16:33,520 --> 01:16:36,950 of this recursive formulation? 1158 01:16:36,950 --> 01:16:41,320 What mathematical construct am I going to have to do here? 1159 01:16:41,320 --> 01:16:43,800 And you see something like guessing and seeing something 1160 01:16:43,800 --> 01:16:46,810 like try each request as a possible first, what 1161 01:16:46,810 --> 01:16:50,010 mathematical construct am I going to have to put up here? 1162 01:16:50,010 --> 01:16:50,930 AUDIENCE: Max. 1163 01:16:50,930 --> 01:16:54,470 SRINIVAS DEVADAS: Max, who said max? 1164 01:16:54,470 --> 01:16:57,910 No one wants to take credit for max? 1165 01:16:57,910 --> 01:16:59,450 It's max, right? 1166 01:16:59,450 --> 01:17:06,480 So I'm going to have max 1 less than or equal to i less than 1167 01:17:06,480 --> 01:17:08,740 or equal to n. 1168 01:17:08,740 --> 01:17:12,770 And I'm going to-- does someone want to tell me what 1169 01:17:12,770 --> 01:17:14,210 the rest of this looks like? 1170 01:17:17,780 --> 01:17:19,706 Someone else? 1171 01:17:19,706 --> 01:17:21,460 A couple Frisbees left, guys. 1172 01:17:21,460 --> 01:17:22,565 [LAUGHTER] 1173 01:17:22,565 --> 01:17:24,106 What does the rest of this look like? 1174 01:17:28,170 --> 01:17:28,670 Yep? 1175 01:17:28,670 --> 01:17:31,323 AUDIENCE: 1 plus the optimal R f of-- 1176 01:17:31,323 --> 01:17:34,630 SRINIVAS DEVADAS: Not 1, just what kind 1177 01:17:34,630 --> 01:17:36,670 of problem do we have here? 1178 01:17:36,670 --> 01:17:37,925 It's not 1 anymore.
1179 01:17:37,925 --> 01:17:38,509 AUDIENCE: Oh-- 1180 01:17:38,509 --> 01:17:39,716 SRINIVAS DEVADAS: The weight. 1181 01:17:39,716 --> 01:17:40,520 AUDIENCE: Right. 1182 01:17:40,520 --> 01:17:44,170 SRINIVAS DEVADAS: The weight, yep, so Wi 1183 01:17:44,170 --> 01:17:50,620 plus the optimal R fi. 1184 01:17:53,700 --> 01:18:00,350 OK, so we got Wi plus optimum of R of fi. 1185 01:18:00,350 --> 01:18:03,560 And you said "1," close enough. 1186 01:18:03,560 --> 01:18:05,580 If it was 1, you'd use greedy. 1187 01:18:05,580 --> 01:18:07,840 And so that's why we were in that Wi mode, 1188 01:18:07,840 --> 01:18:09,520 and we end up getting this here. 1189 01:18:09,520 --> 01:18:10,350 So that's it. 1190 01:18:10,350 --> 01:18:13,114 You try every request as a possible first. 1191 01:18:13,114 --> 01:18:14,780 Obviously, you pick that request so it's 1192 01:18:14,780 --> 01:18:20,580 part of your weight in terms of the weight for your solution. 1193 01:18:20,580 --> 01:18:23,370 When you do that, because it was the first request, 1194 01:18:23,370 --> 01:18:25,640 you get to prune the set of requests 1195 01:18:25,640 --> 01:18:31,290 that come later corresponding to R of fi that you see here. 1196 01:18:31,290 --> 01:18:36,060 And then you go ahead and simply find 1197 01:18:36,060 --> 01:18:38,920 the optimum for a smaller problem, 1198 01:18:38,920 --> 01:18:40,610 clearly has fewer requests. 1199 01:18:40,610 --> 01:18:45,920 And as long as you maximize over the set of guesses 1200 01:18:45,920 --> 01:18:49,980 that you've taken, and there's n guesses up at the top level. 1201 01:18:49,980 --> 01:18:51,670 Obviously in the lower levels, you're 1202 01:18:51,670 --> 01:18:55,050 going to have fewer requests in your R of fi's, and you'll 1203 01:18:55,050 --> 01:19:03,270 have fewer iterations of the max, but it's n at the top level.
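The recursion described here, the optimum of R equal to the max over i of Wi plus the optimum of R of fi, can be written as a short memoized Python sketch. This is an illustration under stated assumptions rather than the lecture's own code: the name `max_weight_schedule`, the `(start, finish, weight)` triples, and the sample data are hypothetical; every request is assumed to have finish strictly greater than start so the recursion terminates; and a subproblem is identified by its threshold x, with R of x containing every request whose start is at least x.

```python
from functools import lru_cache


def max_weight_schedule(requests):
    """Maximum total weight of a compatible subset of requests.

    requests: list of (start, finish, weight) with finish > start (assumed).
    """

    @lru_cache(maxsize=None)  # solve each subproblem once, then memoize
    def opt(x):
        # Subproblem R of x: requests j with start_j >= x.
        best = 0  # empty schedule
        for start, finish, weight in requests:
            if start >= x:
                # Guess this request as the first one, then recurse on
                # the requests that start at or after its finish time.
                best = max(best, weight + opt(finish))
        return best

    # The full problem is R of minus infinity.
    return opt(float("-inf"))


# Hypothetical instances.
small = [(0, 2, 1), (3, 5, 1), (1, 6, 3)]
larger = [(0, 3, 2), (2, 5, 4), (4, 7, 4), (6, 9, 7), (8, 10, 2)]
small_best = max_weight_schedule(small)
larger_best = max_weight_schedule(larger)
```

The distinct thresholds that `opt` is called with are the finish times plus the initial minus infinity, so there are about n subproblems, and each one scans all n requests inside the max.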
1204 01:19:03,270 --> 01:19:08,900 So one last question, what is the complexity 1205 01:19:08,900 --> 01:19:12,184 of what we see here? 1206 01:19:12,184 --> 01:19:13,510 AUDIENCE: n square. 1207 01:19:13,510 --> 01:19:16,600 SRINIVAS DEVADAS: n square, and the reason it's n square is you 1208 01:19:16,600 --> 01:19:19,610 simply use-- you can be really mechanical about this-- 1209 01:19:19,610 --> 01:19:25,440 you say, if this was order 1, I'm doing a max over n items. 1210 01:19:25,440 --> 01:19:29,570 And therefore, that's order n time to solve one subproblem. 1211 01:19:29,570 --> 01:19:36,650 And since I have n subproblems, I get n times order n, 1212 01:19:36,650 --> 01:19:40,390 which is order n squared. 1213 01:19:40,390 --> 01:19:45,310 So the last thing I'll do-- and I just have one more minute-- 1214 01:19:45,310 --> 01:19:52,740 is give you a sense of a small change to interval scheduling 1215 01:19:52,740 --> 01:19:56,940 that puts us in that NP complete domain. 1216 01:19:56,940 --> 01:19:59,360 So so far, we've just done two problems. 1217 01:19:59,360 --> 01:20:00,275 There's many others. 1218 01:20:00,275 --> 01:20:01,400 We did interval scheduling. 1219 01:20:01,400 --> 01:20:03,630 There was greedy linear time. 1220 01:20:03,630 --> 01:20:05,880 Weighted interval scheduling is order n 1221 01:20:05,880 --> 01:20:08,880 squared according to this particular DP formulation. 1222 01:20:08,880 --> 01:20:13,620 It turns out there's a smarter DP formulation that 1223 01:20:13,620 --> 01:20:16,900 runs in order n log n time that you'll 1224 01:20:16,900 --> 01:20:22,500 hear about in section on Friday, but it's still polynomial time. 1225 01:20:22,500 --> 01:20:26,880 Let's make one reasonable change to this, 1226 01:20:26,880 --> 01:20:31,520 which is to say that we may have multiple resources, 1227 01:20:31,520 --> 01:20:34,120 and they may be non identical.
1228 01:20:34,120 --> 01:20:38,140 So it turns out everything that we've done kind of extrapolates 1229 01:20:38,140 --> 01:20:43,580 very well to identical machines, even though there's 1230 01:20:43,580 --> 01:20:45,710 many identical machines. 1231 01:20:45,710 --> 01:20:48,180 But if you have non-identical machines, what 1232 01:20:48,180 --> 01:20:53,350 that means is you have resources or machines 1233 01:20:53,350 --> 01:20:55,510 that have different types. 1234 01:20:55,510 --> 01:21:02,190 So maybe your machines are T1 to Tm. 1235 01:21:02,190 --> 01:21:06,670 And it's essentially a situation where 1236 01:21:06,670 --> 01:21:09,660 you say, this particular task can only 1237 01:21:09,660 --> 01:21:12,370 be run on this machine or this other machine, 1238 01:21:12,370 --> 01:21:14,650 some subset of machines. 1239 01:21:14,650 --> 01:21:22,670 So you can still have a weight of 1 for all requests, 1240 01:21:22,670 --> 01:21:30,670 but you have something like Q of i, a subset of T, 1241 01:21:30,670 --> 01:21:36,730 which is the set of machines that i runs on. 1242 01:21:40,280 --> 01:21:42,020 OK, that's it. 1243 01:21:42,020 --> 01:21:43,970 That's the change we make. 1244 01:21:43,970 --> 01:21:49,350 Q of i is going to be specified for each of the i's. 1245 01:21:49,350 --> 01:21:51,470 So you could even have two machines. 1246 01:21:51,470 --> 01:21:53,470 And you could say, here's a set of requests that 1247 01:21:53,470 --> 01:21:55,210 could run on both machines. 1248 01:21:55,210 --> 01:21:57,600 Here's a set that only runs on the first machine, 1249 01:21:57,600 --> 01:22:00,630 and here's another set that runs on the second machine. 1250 01:22:00,630 --> 01:22:04,790 That's a simple example of this generalization. 1251 01:22:04,790 --> 01:22:13,280 If you do this, this problem has been shown to be NP complete. 1252 01:22:13,280 --> 01:22:17,690 And by that I mean, NP complete problems are decision problems.
1253 01:22:17,690 --> 01:22:24,100 And so you say, can some specific number k less 1254 01:22:24,100 --> 01:22:26,875 than n requests be scheduled. 1255 01:22:31,250 --> 01:22:35,669 This decision problem is NP complete. 1256 01:22:35,669 --> 01:22:37,960 And so what happens when you have NP complete problems? 1257 01:22:37,960 --> 01:22:40,418 Well, we're going to have a little module in the class that 1258 01:22:40,418 --> 01:22:41,940 deals with intractability. 1259 01:22:41,940 --> 01:22:44,290 We're going to look at cases where 1260 01:22:44,290 --> 01:22:46,430 we could apply approximation algorithms, 1261 01:22:46,430 --> 01:22:50,610 and maybe in the case of the optimization problem, 1262 01:22:50,610 --> 01:22:54,270 if the optimum for this is k star, 1263 01:22:54,270 --> 01:22:57,800 I will say that we can get within 10% of k star. 1264 01:22:57,800 --> 01:23:00,520 The other way is to just deal with intractability 1265 01:23:00,520 --> 01:23:03,410 by hoping that your exponential time 1266 01:23:03,410 --> 01:23:06,640 algorithm runs in a reasonable amount of time 1267 01:23:06,640 --> 01:23:08,790 for common cases. 1268 01:23:08,790 --> 01:23:13,110 So in the worst case, you might end up taking a long time. 1269 01:23:13,110 --> 01:23:15,400 But you just sort of back off after an hour 1270 01:23:15,400 --> 01:23:19,070 and take what you get from the operative algorithm. 1271 01:23:19,070 --> 01:23:23,840 But in many cases, the algorithm might actually complete, 1272 01:23:23,840 --> 01:23:26,760 and give you the optimum solution. 1273 01:23:26,760 --> 01:23:28,270 So done here. 1274 01:23:28,270 --> 01:23:31,490 Make sure to sign up for a recitation section. 1275 01:23:31,490 --> 01:23:33,616 And see you guys next time.