1 00:00:00,040 --> 00:00:02,480 The following content is provided under a Creative 2 00:00:02,480 --> 00:00:04,010 Commons license. 3 00:00:04,010 --> 00:00:06,340 Your support will help MIT OpenCourseWare 4 00:00:06,340 --> 00:00:10,690 continue to offer high quality educational resources for free. 5 00:00:10,690 --> 00:00:13,320 To make a donation or view additional materials 6 00:00:13,320 --> 00:00:17,035 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,035 --> 00:00:17,660 at ocw.mit.edu. 8 00:00:26,350 --> 00:00:32,340 ERIC DEMAINE: All right, today we do NP-completeness, 9 00:00:32,340 --> 00:00:35,420 an entire field in one lecture. 10 00:00:35,420 --> 00:00:36,010 Should be fun. 11 00:00:36,010 --> 00:00:38,590 I actually taught an entire class about this topic 12 00:00:38,590 --> 00:00:42,620 last semester, but now we're going to do it in 80 minutes. 13 00:00:42,620 --> 00:00:45,610 And we're going to look at lots of different problems, 14 00:00:45,610 --> 00:00:48,440 from Super Mario Brothers to jigsaw puzzles, 15 00:00:48,440 --> 00:00:51,000 and show that they're NP-complete. 16 00:00:51,000 --> 00:00:51,930 This is a fun area. 17 00:00:51,930 --> 00:00:54,499 As Srini mentioned last class, it's all about reductions. 18 00:00:54,499 --> 00:00:56,040 It's all about converting one problem 19 00:00:56,040 --> 00:00:59,380 into another, which is a fun kind of puzzle in itself. 20 00:00:59,380 --> 00:01:01,170 It's an algorithmic challenge. 21 00:01:01,170 --> 00:01:03,410 And we're going to do it a lot. 22 00:01:03,410 --> 00:01:06,600 But first I'm going to remind you of some of the things you 23 00:01:06,600 --> 00:01:10,010 learned from 006, and tell you what we need to do in order 24 00:01:10,010 --> 00:01:14,830 to prove all of these relations, what exactly we need to show 25 00:01:14,830 --> 00:01:18,750 for each of those arrows, and why it's interesting. 
26 00:01:18,750 --> 00:01:23,490 So this is generally around the P versus NP problem. 27 00:01:23,490 --> 00:01:26,630 So remember, P is all the problems we know how 28 00:01:26,630 --> 00:01:27,932 to solve in polynomial time. 29 00:01:27,932 --> 00:01:30,140 Well not just the ones we know how to solve, but also 30 00:01:30,140 --> 00:01:36,430 the ones that can be solved, which 31 00:01:36,430 --> 00:01:44,490 is pretty much-- which is the topic of 6.006, and 6.046 up 32 00:01:44,490 --> 00:01:45,250 till now. 33 00:01:45,250 --> 00:01:47,210 But for now, in the next few lectures, 34 00:01:47,210 --> 00:01:49,480 we'll be talking about problems that are probably not 35 00:01:49,480 --> 00:01:52,940 polynomially solvable, and what to do about them. 36 00:01:52,940 --> 00:01:57,590 Polynomial, as you know, is like n to some constant. 37 00:01:57,590 --> 00:01:59,350 Polynomial good, exponential bad. 38 00:02:03,490 --> 00:02:04,300 What is n? 39 00:02:04,300 --> 00:02:08,630 I guess n is the size of the problem, which 40 00:02:08,630 --> 00:02:11,150 we'll have to be a little bit careful about today. 41 00:02:11,150 --> 00:02:17,240 And then NP is not problems that are not solvable 42 00:02:17,240 --> 00:02:19,480 in polynomial time, but it's problems solvable 43 00:02:19,480 --> 00:02:21,800 in nondeterministic polynomial time. 44 00:02:24,720 --> 00:02:27,060 And in this case we need to focus 45 00:02:27,060 --> 00:02:32,010 on a particular type of problem, which is decision problems. 46 00:02:32,010 --> 00:02:35,520 Decision just means that the answer is either yes or no. 47 00:02:35,520 --> 00:02:37,450 So it's a single bit answer. 48 00:02:43,130 --> 00:02:44,935 We will see why we need to restrict 49 00:02:44,935 --> 00:02:46,435 to that kind of problem in a moment. 50 00:03:11,030 --> 00:03:14,370 So this is problems you can solve in polynomial time. 
51 00:03:14,370 --> 00:03:16,960 Same notion of polynomials, same notion of n, 52 00:03:16,960 --> 00:03:19,640 but in a totally unrealistic model of computation. 53 00:03:19,640 --> 00:03:22,080 Which is a nondeterministic model. 54 00:03:22,080 --> 00:03:24,060 In a nondeterministic model, what 55 00:03:24,060 --> 00:03:29,150 you can do is say instead of computing 56 00:03:29,150 --> 00:03:33,010 something from something you know, you could make a guess. 57 00:03:33,010 --> 00:03:58,840 So you can guess one out of polynomially many options 58 00:03:58,840 --> 00:04:02,090 in constant time. 59 00:04:02,090 --> 00:04:04,680 So normally a constant time operation, in regular models, 60 00:04:04,680 --> 00:04:08,230 like you add two numbers, or you do an if, that sort of thing. 61 00:04:08,230 --> 00:04:10,870 Here we can make a guess. 62 00:04:10,870 --> 00:04:15,290 I give the computer polynomially many options I'm interested in. 63 00:04:15,290 --> 00:04:17,050 Computer's going to give me one of them. 64 00:04:17,050 --> 00:04:20,570 It's going to give me a good guess. 65 00:04:20,570 --> 00:04:21,959 Guess is guaranteed to be good. 66 00:04:21,959 --> 00:04:24,290 And good means here that I want to get 67 00:04:24,290 --> 00:04:27,110 to a yes answer if I can. 68 00:04:27,110 --> 00:04:35,630 So the formal statement is, if any guess 69 00:04:35,630 --> 00:04:46,960 would lead to a yes answer, then we get such a guess. 70 00:04:51,096 --> 00:04:52,580 OK, this is weird. 71 00:04:52,580 --> 00:04:53,610 And it's asymmetric. 72 00:04:53,610 --> 00:04:55,060 It's biased towards yes. 73 00:04:55,060 --> 00:04:58,010 And this is why we can only think about decision problems, 74 00:04:58,010 --> 00:04:59,090 yes or no. 75 00:04:59,090 --> 00:05:00,360 You could bias towards no. 76 00:05:00,360 --> 00:05:02,060 You get something else called coNP. 77 00:05:02,060 --> 00:05:04,560 But we'll focus here just on NP. 
78 00:05:04,560 --> 00:05:08,820 So the idea is I'd really like to find a guess that 79 00:05:08,820 --> 00:05:09,980 leads to a yes answer. 80 00:05:09,980 --> 00:05:12,670 And the machine magically gives me one if there is one. 81 00:05:12,670 --> 00:05:15,170 Which means if I end up saying no, that 82 00:05:15,170 --> 00:05:18,790 means there was absolutely no path that would lead to a yes. 83 00:05:18,790 --> 00:05:21,300 So when you get a no, you get a lot of information. 84 00:05:21,300 --> 00:05:23,250 When you get a yes, you get some information. 85 00:05:23,250 --> 00:05:24,249 But hey, you were lucky. 86 00:05:24,249 --> 00:05:25,700 Hard to complain. 87 00:05:25,700 --> 00:05:30,340 So in 006, I often call this the lucky model of computation. 88 00:05:30,340 --> 00:05:31,960 That's the informal version. 89 00:05:31,960 --> 00:05:35,740 But nondeterminism is what's really going on here. 90 00:05:35,740 --> 00:05:41,820 So maybe it's useful to get an example. 91 00:05:41,820 --> 00:05:47,240 So here's a problem we'll-- this is sort of the granddaddy 92 00:05:47,240 --> 00:05:49,080 of all NP-complete problems. 93 00:05:49,080 --> 00:05:51,560 We'll get to completeness in a moment. 94 00:05:51,560 --> 00:06:01,855 3SAT-- SAT stands for satisfiability. 95 00:06:10,030 --> 00:06:12,590 So in 3SAT, the input to the problem 96 00:06:12,590 --> 00:06:14,470 looks something like this. 97 00:06:14,470 --> 00:06:16,016 I'm just going to give an example. 98 00:06:26,930 --> 00:06:30,870 And in case you've forgotten your weird logic notation, 99 00:06:30,870 --> 00:06:33,520 this is an and. 100 00:06:33,520 --> 00:06:36,580 These are ORs. 101 00:06:36,580 --> 00:06:42,680 And I'm using this for negation, not. 102 00:06:42,680 --> 00:06:48,050 So in other words, I'm given a formula which is and of ORs. 103 00:06:48,050 --> 00:06:52,340 And each or clause only has three things in it. 104 00:06:52,340 --> 00:06:53,715 These things are called literals. 
105 00:07:00,300 --> 00:07:03,990 And a literal is either a variable x sub i, 106 00:07:03,990 --> 00:07:08,020 or it's the negation of a variable, not x sub i. 107 00:07:08,020 --> 00:07:09,500 So this is a typical example. 108 00:07:09,500 --> 00:07:11,380 You could have no negations. 109 00:07:11,380 --> 00:07:13,510 You could here have one negation, two negations, 110 00:07:13,510 --> 00:07:15,950 any number of negations per clause. 111 00:07:15,950 --> 00:07:19,580 These groups of three-- these or of three 112 00:07:19,580 --> 00:07:23,900 things, three literals, are called clauses. 113 00:07:23,900 --> 00:07:26,960 And they're all ANDed together. 114 00:07:26,960 --> 00:07:30,100 And my goal is, this should be a decision question, 115 00:07:30,100 --> 00:07:31,960 so I have a yes or no question. 116 00:07:31,960 --> 00:07:44,030 And that question is, can you set the variables-- 117 00:07:44,030 --> 00:07:52,010 So they're x1 to true or false? 118 00:07:52,010 --> 00:07:56,020 So each variable I get to choose a true or false designation 119 00:07:56,020 --> 00:07:57,860 such that the formula comes out true. 120 00:08:04,870 --> 00:08:07,390 I use T and F for true and false. 121 00:08:07,390 --> 00:08:09,650 So I want to set these variables such 122 00:08:09,650 --> 00:08:12,230 that every clause comes out true, because they're 123 00:08:12,230 --> 00:08:12,970 ANDed together. 124 00:08:12,970 --> 00:08:15,910 So I have to satisfy this clause in one of three ways. 125 00:08:15,910 --> 00:08:17,700 Maybe I satisfy it all three ways. 126 00:08:17,700 --> 00:08:19,700 Doesn't matter, as long as at least one of these 127 00:08:19,700 --> 00:08:22,490 should be true, and at least one of these should be true, 128 00:08:22,490 --> 00:08:25,940 and at least one of each clause should be true. 129 00:08:25,940 --> 00:08:28,300 So that's the 3SAT problem. 130 00:08:28,300 --> 00:08:29,421 This is a hard problem. 
131 00:08:29,421 --> 00:08:31,170 We don't know a polynomial time algorithm. 132 00:08:31,170 --> 00:08:32,520 There probably isn't one. 133 00:08:32,520 --> 00:08:37,669 But there is a polynomial time nondeterministic algorithm. 134 00:08:37,669 --> 00:08:54,140 So this problem is in NP because if I have lucky guesses, 135 00:08:54,140 --> 00:08:56,460 it's kind of designed to solve this kind of problem. 136 00:08:56,460 --> 00:09:03,730 What I'm going to do is guess whether x1 is true or false. 137 00:09:03,730 --> 00:09:05,490 So I have two choices. 138 00:09:05,490 --> 00:09:08,240 And I'm going to ask my machine to make the right choice, 139 00:09:08,240 --> 00:09:10,120 whether it should be true or false. 140 00:09:10,120 --> 00:09:13,760 Then I'll guess x2. 141 00:09:13,760 --> 00:09:17,480 Each of these guess operations takes constant time. 142 00:09:17,480 --> 00:09:19,250 So I do it for every variable. 143 00:09:19,250 --> 00:09:21,000 And then I'm going to check whether I 144 00:09:21,000 --> 00:09:22,670 happen to satisfy the formula. 145 00:09:28,960 --> 00:09:33,140 And if it comes out true, then I'll return yes. 146 00:09:33,140 --> 00:09:35,267 And if it comes out false, I'll return no. 147 00:09:43,720 --> 00:09:47,720 And because NP is biased towards yes answers, 148 00:09:47,720 --> 00:09:50,824 it always finds a yes answer if you can. 149 00:09:50,824 --> 00:09:53,560 If there's some way to satisfy the formula, 150 00:09:53,560 --> 00:09:55,160 then I will get it. 151 00:09:55,160 --> 00:09:57,550 If there's some way to make the formula come out true, 152 00:09:57,550 --> 00:09:59,890 then this algorithm will return yes. 153 00:09:59,890 --> 00:10:03,930 If there's no way to satisfy it, then 154 00:10:03,930 --> 00:10:05,996 this nondeterministic algorithm will return no. 155 00:10:05,996 --> 00:10:07,370 That's just the definition of how 156 00:10:07,370 --> 00:10:09,340 nondeterministic machines work. 
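The guess-then-check algorithm just described can be sketched on an ordinary computer, with the caveat that a real machine has no lucky guesses and must try every guess in turn. This is only an illustration under stated assumptions: the signed-integer encoding of literals (`3` for x3, `-3` for not x3), the function names, and both example formulas are hypothetical and do not come from the lecture.

```python
from itertools import product


def evaluate(formula, assignment):
    """Return True when every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


def simulate_lucky_guessing(formula, num_vars):
    """Deterministic stand-in for the nondeterministic algorithm.

    The nondeterministic machine guesses each variable in constant time.
    Here we enumerate all 2**num_vars guess sequences instead, which is
    exponential, and answer yes if any of them passes the check.
    """
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: value for i, value in enumerate(values)}
        if evaluate(formula, assignment):
            return True   # some guess leads to yes
    return False          # no guess leads to yes


# Hypothetical formula: (x1 or x2 or not x3) and (not x1 or x2 or x3)
#                       and (not x1 or not x2 or not x3)
satisfiable = [(1, 2, -3), (-1, 2, 3), (-1, -2, -3)]

# Hypothetical formula with all eight sign patterns over x1, x2, x3;
# every assignment falsifies exactly one of these clauses.
unsatisfiable = [
    tuple(sign * var for sign, var in zip(signs, (1, 2, 3)))
    for signs in product((1, -1), repeat=3)
]

print(simulate_lucky_guessing(satisfiable, 3))
print(simulate_lucky_guessing(unsatisfiable, 3))
```

The asymmetry shows up in the loop: a yes can be returned as soon as one good guess is found, while a no requires exhausting every guess.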
157 00:10:09,340 --> 00:10:10,410 It's a little weird. 158 00:10:10,410 --> 00:10:13,670 But you can see from this kind of prototype 159 00:10:13,670 --> 00:10:17,800 of a nondeterministic algorithm, you can actually always 160 00:10:17,800 --> 00:10:20,920 arrange for your guessing to be at the beginning. 161 00:10:20,920 --> 00:10:25,830 And then you do some regular polynomial time checking 162 00:10:25,830 --> 00:10:29,010 or deterministic checking. 163 00:10:29,010 --> 00:10:30,440 So when you rewrite your algorithm 164 00:10:30,440 --> 00:10:32,820 like this with guesses up front and then checking, 165 00:10:32,820 --> 00:10:36,360 you can also think of it as a verification algorithm. 166 00:10:36,360 --> 00:10:41,050 So you can say, your friend claims that this 3SAT formula 167 00:10:41,050 --> 00:10:43,420 is satisfiable, meaning there's a way 168 00:10:43,420 --> 00:10:46,140 to set the variable so that it comes out true. 169 00:10:46,140 --> 00:10:48,264 So this is called a satisfying assignment. 170 00:10:50,990 --> 00:10:52,627 Satisfying just means make true. 171 00:10:56,830 --> 00:10:59,355 And you're like, no, I don't believe you. 172 00:10:59,355 --> 00:11:01,940 And your friend says no, no, no, really, it's true. 173 00:11:01,940 --> 00:11:03,460 And here's how I can prove it. 174 00:11:03,460 --> 00:11:04,720 You set x1 to false. 175 00:11:04,720 --> 00:11:05,725 You set x2 to true. 176 00:11:05,725 --> 00:11:09,260 You set x3-- basically they give you the guesses. 177 00:11:09,260 --> 00:11:11,605 And then you don't have to be convinced 178 00:11:11,605 --> 00:11:12,980 that those are the right guesses, 179 00:11:12,980 --> 00:11:14,646 you can check that it's the right guess. 180 00:11:14,646 --> 00:11:17,090 You can compute this formula in linear time, 181 00:11:17,090 --> 00:11:18,510 see what the outcome is. 
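The verification view might look like the sketch below: the friend's claimed assignment is the certificate, and the check is a single pass over the clauses. The encoding (signed integers for literals), the name `verify`, the example formula, and the value chosen for x3 are all assumptions made for illustration, since the lecture's board example is not in the transcript.

```python
def verify(formula, certificate):
    """Check a claimed satisfying assignment in time linear in the formula size.

    formula: list of clauses, each a tuple of signed integers
             (3 means x3, -3 means not x3).
    certificate: dict mapping variable index to True or False.
    """
    for clause in formula:
        # A clause is satisfied if at least one of its literals is true.
        if not any(certificate[abs(lit)] == (lit > 0) for lit in clause):
            return False
    return True


# Hypothetical formula: (x1 or x2 or not x3) and (not x1 or x2 or x3)
#                       and (not x1 or not x2 or not x3)
formula = [(1, 2, -3), (-1, 2, 3), (-1, -2, -3)]

# The friend's claim: x1 = false, x2 = true, x3 = false (x3 is assumed here).
friend_claim = {1: False, 2: True, 3: False}
print(verify(formula, friend_claim))

# A rejected certificate only shows that this particular assignment fails.
# It says nothing about whether the formula is unsatisfiable.
bad_claim = {1: True, 2: True, 3: True}
print(verify(formula, bad_claim))
```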
182 00:11:18,510 --> 00:11:20,420 If someone tells you what the xi's are, 183 00:11:20,420 --> 00:11:21,980 you can very quickly see whether that 184 00:11:21,980 --> 00:11:24,310 was a satisfying assignment. 185 00:11:24,310 --> 00:11:26,440 So you could call this a solution, 186 00:11:26,440 --> 00:11:28,650 and then there's a polynomial time verification 187 00:11:28,650 --> 00:11:32,700 algorithm that checks that solutions are valid. 188 00:11:32,700 --> 00:11:36,170 But, you can only do that for yes answers. 189 00:11:36,170 --> 00:11:39,510 Your friend says no, this is not satisfiable, 190 00:11:39,510 --> 00:11:42,270 they have no way of proving it to you. 191 00:11:42,270 --> 00:11:46,450 I mean, other than checking all the assignments separately, 192 00:11:46,450 --> 00:11:48,290 which would take exponential time, 193 00:11:48,290 --> 00:11:51,140 there's no easy way to confirm that the answer to this problem 194 00:11:51,140 --> 00:11:51,670 is no. 195 00:11:51,670 --> 00:11:53,470 But there is an easy way to check 196 00:11:53,470 --> 00:11:55,830 that the answer is yes, namely I give you the satisfying 197 00:11:55,830 --> 00:11:57,600 assignment. 198 00:11:57,600 --> 00:12:01,130 So this definition of NP is what I'll stick to. 199 00:12:01,130 --> 00:12:05,100 It's this sort of-- I like guessing because it's 200 00:12:05,100 --> 00:12:06,500 like dynamic programming. 201 00:12:06,500 --> 00:12:09,329 With dynamic programming we also guess, 202 00:12:09,329 --> 00:12:11,620 and guessing actually originally comes from this world, 203 00:12:11,620 --> 00:12:13,390 nondeterminism. 204 00:12:13,390 --> 00:12:16,790 In dynamic programming, we don't allow this kind of model. 205 00:12:16,790 --> 00:12:19,030 And so we have to check the guesses separately. 206 00:12:19,030 --> 00:12:20,790 And so we spend lots of time. 207 00:12:20,790 --> 00:12:22,680 Here, magically, you always get the right 208 00:12:22,680 --> 00:12:24,320 guess in only constant time. 
209 00:12:24,320 --> 00:12:26,070 So this is a much more powerful model. 210 00:12:26,070 --> 00:12:30,251 Of course there's no computers that work like this, sadly, 211 00:12:30,251 --> 00:12:32,280 or I guess more interestingly. 212 00:12:32,280 --> 00:12:36,870 So this is more about confirming that your problem is not 213 00:12:36,870 --> 00:12:38,830 totally impossible. 214 00:12:38,830 --> 00:12:42,560 At least you can check the answers in polynomial time. 215 00:12:42,560 --> 00:12:46,830 So that's one thing. 216 00:13:31,110 --> 00:13:33,210 So this is an equivalent definition of NP 217 00:13:33,210 --> 00:13:35,430 because you can take a nondeterministic algorithm 218 00:13:35,430 --> 00:13:37,300 and put the guessing up top. 219 00:13:37,300 --> 00:13:39,200 You can call the results of those guesses 220 00:13:39,200 --> 00:13:42,060 a certificate that an answer is yes. 221 00:13:42,060 --> 00:13:45,940 And then you have a regular old deterministic polynomial time 222 00:13:45,940 --> 00:13:48,430 algorithm that, given that certificate, 223 00:13:48,430 --> 00:13:54,516 will verify that it actually proves that the answer is yes. 224 00:13:57,820 --> 00:14:00,730 It's just that certificate has to be polynomial size. 225 00:14:00,730 --> 00:14:02,960 You can't guess something of exponential size. 226 00:14:02,960 --> 00:14:07,010 You can only guess something of polynomial size in this model. 227 00:14:07,010 --> 00:14:10,410 So seems a little weird. 228 00:14:10,410 --> 00:14:16,610 But we'll see why this is useful in a little bit. 229 00:14:16,610 --> 00:14:18,795 So let me go to NP completeness. 230 00:14:24,740 --> 00:14:37,320 So if I have a problem X, it's NP-complete if X is in NP 231 00:14:37,320 --> 00:14:38,830 and X is NP-hard. 232 00:14:41,760 --> 00:14:44,130 But I haven't told you what NP-hard is. 233 00:14:44,130 --> 00:15:24,860 Maybe you remember from 006, but let me remind you. 234 00:15:24,860 --> 00:15:28,840 So, I need to define reduce. 
235 00:15:28,840 --> 00:15:30,684 So maybe I'll do that as well, then 236 00:15:30,684 --> 00:15:31,850 we can talk about all these. 237 00:16:26,130 --> 00:16:29,110 OK, a lot of definitions. 238 00:16:29,110 --> 00:16:32,970 But the idea of NP hardness is very simple. 239 00:16:32,970 --> 00:16:35,140 If problem X is NP-hard, it means 240 00:16:35,140 --> 00:16:39,890 that it's at least as hard as-- sorry, 241 00:16:39,890 --> 00:16:46,280 that is a Y-- it's at least as hard as all problems in NP. 242 00:16:46,280 --> 00:16:48,140 Intuitively, X means it's at least as 243 00:16:48,140 --> 00:16:50,280 hard as everything in NP. 244 00:16:50,280 --> 00:16:53,310 Whereas being in NP is a positive statement. 245 00:16:53,310 --> 00:16:55,240 That says it's not too hard, at least 246 00:16:55,240 --> 00:16:57,760 there's a polynomial time verification algorithm. 247 00:16:57,760 --> 00:17:00,060 So being in NP is good news. 248 00:17:00,060 --> 00:17:02,650 It says you're no harder than NP. 249 00:17:02,650 --> 00:17:05,920 NP-hard says you're at least as hard as everything in NP. 250 00:17:05,920 --> 00:17:08,230 And so NP-complete is a nice answer 251 00:17:08,230 --> 00:17:11,579 because this says you're exactly as hard as everything in NP-- 252 00:17:11,579 --> 00:17:13,849 no harder, no easier. 253 00:17:13,849 --> 00:17:17,099 If you draw, in this vague sense, 254 00:17:17,099 --> 00:17:23,589 computational difficulty on one axis-- which is not 255 00:17:23,589 --> 00:17:26,380 really accurate, but I like to do it anyway-- 256 00:17:26,380 --> 00:17:31,320 and you have P is all of these easy problems down here. 257 00:17:31,320 --> 00:17:35,460 And NP is some larger set like this. 258 00:17:35,460 --> 00:17:38,139 NP-hard is from here over. 259 00:17:42,540 --> 00:17:45,040 And this point right here is NP-complete. 260 00:17:49,500 --> 00:17:53,350 Being in NP means you're left of this line, or on the line. 
261 00:17:53,350 --> 00:17:55,600 And being NP-hard means you're right of this line, 262 00:17:55,600 --> 00:17:56,480 or on the line. 263 00:17:56,480 --> 00:17:58,610 NP-complete means you're right there. 264 00:17:58,610 --> 00:18:02,510 So that's a very definitive sense of hardness. 265 00:18:02,510 --> 00:18:04,690 Now there is this slight catch, which 266 00:18:04,690 --> 00:18:06,550 is we don't know whether P equals NP. 267 00:18:06,550 --> 00:18:11,450 So maybe this is the same as this, but probably not. 268 00:18:11,450 --> 00:18:14,530 Unless you believe in luck, basically, 269 00:18:14,530 --> 00:18:16,400 unless you imagine that a computer could 270 00:18:16,400 --> 00:18:19,830 engineer luck and always guess the right things 271 00:18:19,830 --> 00:18:23,630 without spending a lot of time, then P does not equal NP. 272 00:18:23,630 --> 00:18:26,680 And in that world, what we get is that if you have 273 00:18:26,680 --> 00:18:29,650 an NP-complete problem, or actually any NP-hard problem, 274 00:18:29,650 --> 00:18:33,590 you know it cannot be in P. 275 00:18:33,590 --> 00:18:38,240 So if you have that X is NP-hard, 276 00:18:38,240 --> 00:18:44,405 then you know that X is not in P unless all of NP 277 00:18:44,405 --> 00:18:47,690 is in P. So unless P equals NP. 278 00:18:47,690 --> 00:18:51,040 And most reasonable people do not believe this. 279 00:18:51,040 --> 00:18:53,160 And so instead they have to believe this, 280 00:18:53,160 --> 00:18:57,580 that your problem is not polynomially solvable. 281 00:18:57,580 --> 00:18:59,740 So why is this true? 282 00:18:59,740 --> 00:19:01,680 Because if your problem is NP-hard, 283 00:19:01,680 --> 00:19:05,880 it is at least as hard as every problem in NP. 
284 00:19:05,880 --> 00:19:08,449 And if you believe that there is some problem in NP-- 285 00:19:08,449 --> 00:19:09,990 we don't necessarily know which one-- 286 00:19:09,990 --> 00:19:14,280 but if there is any problem out there in NP that is not in P, 287 00:19:14,280 --> 00:19:18,710 then X has to be at least as hard as it. 288 00:19:18,710 --> 00:19:22,730 So it also requires nonpolynomial time, something 289 00:19:22,730 --> 00:19:25,060 larger than polynomial time. 290 00:19:25,060 --> 00:19:27,090 What does at least as hard mean though? 291 00:19:27,090 --> 00:19:29,690 We're going to define it in terms of reductions. 292 00:19:29,690 --> 00:19:31,530 Reduction from one problem to another 293 00:19:31,530 --> 00:19:33,910 is just a polynomial time algorithm, 294 00:19:33,910 --> 00:19:36,130 regular deterministic polynomial time, 295 00:19:36,130 --> 00:19:38,440 that converts an input to the problem A 296 00:19:38,440 --> 00:19:40,980 into an equivalent input to problem 297 00:19:40,980 --> 00:19:45,554 B. Equivalent means that it has the same yes or no answer. 298 00:19:48,105 --> 00:19:50,480 And we'll just be thinking about decision problems today. 299 00:19:55,410 --> 00:19:59,610 So why would I care about a reduction? 300 00:19:59,610 --> 00:20:03,960 Because what it tells me is that if I know how to solve problem 301 00:20:03,960 --> 00:20:07,860 B, then I also know how to solve problem A. If I 302 00:20:07,860 --> 00:20:11,180 have a, say, a polynomial time algorithm for solving B 303 00:20:11,180 --> 00:20:13,960 and I want one for A, I just take my A input. 304 00:20:13,960 --> 00:20:16,440 I convert it into the equivalent B input. 305 00:20:16,440 --> 00:20:18,190 Then I run my algorithm for B, and then it 306 00:20:18,190 --> 00:20:19,880 gives me the answer to the A problem 307 00:20:19,880 --> 00:20:22,920 because the answers are the same. 
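The pattern of converting an A input into an equivalent B input and then running the B algorithm can be sketched as follows. The choice of problems is an assumption made purely for illustration: A is Partition (can a list of positive integers be split into two halves of equal sum?) and B is Subset Sum (does some subset add up to a target?). The function names are hypothetical, and the B solver is a brute-force placeholder, so the combined solver is not polynomial. The point is only the shape: whatever running time a B solver has, the A solver inherits it plus the polynomial conversion cost.

```python
from itertools import combinations


def solve_subset_sum(numbers, target):
    """Placeholder solver for problem B: does some subset sum to target?

    Brute force over all subsets, so exponential time.
    """
    return any(
        sum(combo) == target
        for size in range(len(numbers) + 1)
        for combo in combinations(numbers, size)
    )


def reduce_partition_to_subset_sum(numbers):
    """Polynomial-time conversion of a Partition input into a Subset Sum input.

    Assumes positive integers. The two inputs have the same yes/no answer.
    """
    total = sum(numbers)
    if total % 2 == 1:
        # An odd total can never split evenly, so emit a fixed no-instance.
        return [], 1
    return list(numbers), total // 2


def solve_partition(numbers):
    """Solve problem A by converting to problem B and running B's solver."""
    b_numbers, b_target = reduce_partition_to_subset_sum(numbers)
    return solve_subset_sum(b_numbers, b_target)


print(solve_partition([3, 1, 1, 2, 2, 1]))
print(solve_partition([1, 1, 4]))
```

Note the direction: this uses a solver for B to solve A, so it shows that B is at least as hard as A and says nothing about solving B given a solver for A.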
308 00:20:22,920 --> 00:20:27,480 So if you have a reduction like this and if say, 309 00:20:27,480 --> 00:20:29,940 B, has a polynomial time algorithm, 310 00:20:29,940 --> 00:20:35,410 then so does A, because you can just convert A into B, 311 00:20:35,410 --> 00:20:39,870 and then solve B. Also this works 312 00:20:39,870 --> 00:20:42,990 for nondeterministic algorithms. 313 00:20:48,540 --> 00:20:50,288 Not too important. 314 00:20:53,360 --> 00:20:56,710 So what this tells us is that in a certain sense-- 315 00:20:56,710 --> 00:21:00,850 get this right-- well this is saying, 316 00:21:00,850 --> 00:21:06,320 if I can solve B, that I can solve A. 317 00:21:06,320 --> 00:21:12,020 So this is saying that B is at least as 318 00:21:12,020 --> 00:21:18,020 hard as A. I think I got that right, a little tricky. 319 00:21:18,020 --> 00:21:21,210 So if we want to prove the problem is NP hard, 320 00:21:21,210 --> 00:21:24,560 what we do is show that every problem in NP 321 00:21:24,560 --> 00:21:28,150 can be reduced to the problem of X. So now we can go back 322 00:21:28,150 --> 00:21:31,390 and say well, if we believe that there is some problem Y, 323 00:21:31,390 --> 00:21:34,520 that is in NP minus P, if there's something out here that 324 00:21:34,520 --> 00:21:38,780 is not in P, then we can take that problem Y, 325 00:21:38,780 --> 00:21:40,790 and by this definition, we can reduce it 326 00:21:40,790 --> 00:21:43,750 to X, because everything in NP reduces to X. 327 00:21:43,750 --> 00:21:49,250 And so then I can solve my problem 328 00:21:49,250 --> 00:21:51,420 Y, which is in NP minus P, 329 00:21:51,420 --> 00:21:52,810 by converting it to X and solving 330 00:21:52,810 --> 00:21:55,770 X. So that means X better not have a polynomial time 331 00:21:55,770 --> 00:21:59,240 algorithm, because if it did, Y would also 332 00:21:59,240 --> 00:22:00,650 have a polynomial time algorithm. 
333 00:22:00,650 --> 00:22:02,740 And then in general, P would equal 334 00:22:02,740 --> 00:22:05,730 NP, because every problem in NP can be converted to X. 335 00:22:05,730 --> 00:22:07,780 So if X has a polynomial time algorithm, 336 00:22:07,780 --> 00:22:10,470 then every problem Y does. 337 00:22:10,470 --> 00:22:11,667 Question? 338 00:22:11,667 --> 00:22:13,250 AUDIENCE: For the second if statement, 339 00:22:13,250 --> 00:22:17,690 why can't you say that if A is in NP, B is in NP? 340 00:22:17,690 --> 00:22:20,190 ERIC DEMAINE: So you're asking about the reverse question. 341 00:22:20,190 --> 00:22:23,900 If A is in NP, can we conclude that B is in NP? 342 00:22:23,900 --> 00:22:25,314 And the answer is no. 343 00:22:30,890 --> 00:22:33,880 Because this reduction only lets us convert from A to B. 344 00:22:33,880 --> 00:22:37,100 It doesn't let us do anything for converting from B to A. 345 00:22:37,100 --> 00:22:39,580 So if we know how to solve A and we also 346 00:22:39,580 --> 00:22:43,180 know how to convert A into B, it doesn't tell us anything. 347 00:22:43,180 --> 00:22:47,345 It could be B is a much harder problem than A, 348 00:22:47,345 --> 00:22:48,450 in that situation. 349 00:22:53,394 --> 00:22:55,310 That's, I think, as good as I can do for that. 350 00:22:55,310 --> 00:22:55,976 Other questions? 351 00:22:58,891 --> 00:22:59,390 All right. 352 00:22:59,390 --> 00:23:01,860 It is really tricky to get these directions right. 353 00:23:01,860 --> 00:23:09,410 So let me give you a handy guide on how to not make a mistake. 354 00:23:09,410 --> 00:23:11,028 So maybe over here. 355 00:23:20,410 --> 00:23:24,430 What we care about, from an algorithmic perspective, 356 00:23:24,430 --> 00:23:27,780 is proving the problems are NP-complete. 
357 00:23:37,716 --> 00:23:39,590 Because if we prove NP-completeness-- I mean, 358 00:23:39,590 --> 00:23:41,044 really we care about NP-hardness, 359 00:23:41,044 --> 00:23:42,710 but we might as well do NP-completeness. 360 00:23:42,710 --> 00:23:45,250 Most of the problems that we'll see that are NP-hard 361 00:23:45,250 --> 00:23:47,950 are also NP-complete. 362 00:23:47,950 --> 00:23:51,450 So when we prove this, we prove that there is basically 363 00:23:51,450 --> 00:23:53,990 no polynomial time algorithm for that problem. 364 00:23:53,990 --> 00:23:55,810 So that's good to know, because then we 365 00:23:55,810 --> 00:23:59,840 can just give up searching for a polynomial time algorithm. 366 00:23:59,840 --> 00:24:02,300 So all the problems we've seen so far 367 00:24:02,300 --> 00:24:05,050 have polynomial time algorithms, except a couple in your problem 368 00:24:05,050 --> 00:24:07,089 sets, which were actually NP-complete. 369 00:24:07,089 --> 00:24:09,130 And the best you could have done was exponential, 370 00:24:09,130 --> 00:24:11,420 unless P equals NP. 371 00:24:11,420 --> 00:24:15,550 So here's how you can prove this kind of lower bound 372 00:24:15,550 --> 00:24:17,550 to say look, I don't need to look for algorithms 373 00:24:17,550 --> 00:24:19,970 any more because my problem is just too hard. 374 00:24:19,970 --> 00:24:22,580 It's as hard as everything in NP. 375 00:24:22,580 --> 00:24:25,700 So this is just a summary of those definitions. 376 00:24:25,700 --> 00:24:29,820 The first thing you do is prove that X is in NP. 377 00:24:29,820 --> 00:24:33,140 The second thing you do is prove that X is NP-hard. 378 00:24:33,140 --> 00:24:39,460 And to do that, you reduce from some known NP-complete 379 00:24:39,460 --> 00:24:44,430 problem-- or I guess NP-hard, but we'll 380 00:24:44,430 --> 00:24:55,600 use NP-complete-- to your problem 381 00:24:55,600 --> 00:25:06,030 X. Maybe I'll give this a name Y. 
382 00:25:06,030 --> 00:25:08,080 OK, so to prove that X is in NP, you 383 00:25:08,080 --> 00:25:09,970 do something like what we did over here, 384 00:25:09,970 --> 00:25:12,390 which is to give a nondeterministic algorithm. 385 00:25:12,390 --> 00:25:13,970 Or you can think of it as defining 386 00:25:13,970 --> 00:25:19,290 what the certificate is and then giving a polynomial time 387 00:25:19,290 --> 00:25:22,050 verification algorithm. 388 00:25:22,050 --> 00:25:24,812 So sort of two approaches. 389 00:25:24,812 --> 00:25:29,710 You can give a nondeterministic polynomial time algorithm, 390 00:25:29,710 --> 00:25:33,234 or you give a certificate and a verifier. 391 00:25:39,360 --> 00:25:41,230 There's no right or wrong certificate. 392 00:25:41,230 --> 00:25:43,730 I mean, a certificate, you can define however you want, 393 00:25:43,730 --> 00:25:46,190 as long as the verifier can actually check it 394 00:25:46,190 --> 00:25:49,480 and when it says yes, then the answer to the problem was yes. 395 00:25:49,480 --> 00:25:51,900 So it's really the same thing. 396 00:25:51,900 --> 00:25:53,710 Just want to say there's some certificate 397 00:25:53,710 --> 00:25:57,236 that a verifier could actually check. 398 00:25:57,236 --> 00:25:59,110 So that's proving that your problem is in NP. 399 00:25:59,110 --> 00:26:01,660 It's sort of an algorithmic thing. 400 00:26:01,660 --> 00:26:03,490 The second part is all about reductions. 401 00:26:03,490 --> 00:26:05,270 Now the definition says that I should 402 00:26:05,270 --> 00:26:08,776 reduce every problem in NP to my problem X. 403 00:26:08,776 --> 00:26:10,150 That's tedious, because there are 404 00:26:10,150 --> 00:26:11,610 a lot of problems in the world. 405 00:26:11,610 --> 00:26:14,530 So I don't want to do it for every problem in NP. 406 00:26:14,530 --> 00:26:16,030 I'd like to just do it for one. 
407 00:26:16,030 --> 00:26:17,994 Now if I reduce sorting to my problem, 408 00:26:17,994 --> 00:26:19,160 that's not very interesting. 409 00:26:19,160 --> 00:26:22,980 It says my problem is at least as hard as sorting. 410 00:26:22,980 --> 00:26:24,940 But I already know how to solve sorting. 411 00:26:24,940 --> 00:26:28,960 But if I start from an NP-complete problem, 412 00:26:28,960 --> 00:26:32,720 then I know, by the definition, that every problem in NP 413 00:26:32,720 --> 00:26:34,550 can be reduced to that problem. 414 00:26:34,550 --> 00:26:37,770 And if I show how to reduce the NP-complete problem to me, 415 00:26:37,770 --> 00:26:39,830 then I know that I'm NP-complete too. 416 00:26:39,830 --> 00:26:44,180 Because if I have any problem Z in NP, 417 00:26:44,180 --> 00:26:49,510 by the definition of NP-complete of Y I can reduce that to Y. 418 00:26:49,510 --> 00:26:53,330 And then if I can build a reduction from Y to X, 419 00:26:53,330 --> 00:26:55,520 then I get this reduction. 420 00:26:55,520 --> 00:26:58,380 And so that means I can convert any problem in NP 421 00:26:58,380 --> 00:27:01,310 to my problem X, which means X is NP-hard. 422 00:27:01,310 --> 00:27:03,190 That's the definition. 423 00:27:03,190 --> 00:27:06,650 So all this is to say the first time 424 00:27:06,650 --> 00:27:09,740 you prove a problem is NP-complete in the world-- this 425 00:27:09,740 --> 00:27:14,700 happened in the '70s by Cook. 426 00:27:14,700 --> 00:27:16,900 Basically he proved that 3SAT is NP-complete. 427 00:27:19,725 --> 00:27:21,100 That was annoying, because he had 428 00:27:21,100 --> 00:27:23,106 to start from any problem in NP, and he 429 00:27:23,106 --> 00:27:25,230 had to show that you could reduce any problem in NP 430 00:27:25,230 --> 00:27:26,810 to 3SAT. 431 00:27:26,810 --> 00:27:29,820 But now that that hard work is done, our life is much easier. 
432 00:27:29,820 --> 00:27:31,780 And in this class all you need to think about 433 00:27:31,780 --> 00:27:34,620 is picking your favorite NP-complete problem. 434 00:27:34,620 --> 00:27:36,900 3SAT's a good choice for almost anything, 435 00:27:36,900 --> 00:27:40,190 but we'll see a bunch of other problems today from here. 436 00:27:40,190 --> 00:27:44,520 And then reduce from that known problem to your problem 437 00:27:44,520 --> 00:27:46,690 that you're trying to prove is NP-hard. 438 00:27:46,690 --> 00:27:50,220 If you can do that, you know your problem is NP-hard. 439 00:27:50,220 --> 00:27:52,970 So we only need one reduction for each hardness result, which 440 00:27:52,970 --> 00:27:53,470 is nice. 441 00:27:53,470 --> 00:27:56,060 And this picture is a collection of reductions. 442 00:27:56,060 --> 00:27:57,360 We're going to start from 3SAT. 443 00:27:57,360 --> 00:27:59,010 I'm not going to prove that it's NP-complete, 444 00:27:59,010 --> 00:28:01,150 although I'll give you a hint as to why that's true. 445 00:28:01,150 --> 00:28:02,850 We're going to reduce it to Super Mario Brothers. 446 00:28:02,850 --> 00:28:04,890 We're going to reduce it to three dimensional matching. 447 00:28:04,890 --> 00:28:06,290 We're going to reduce three dimensional matching 448 00:28:06,290 --> 00:28:08,490 to subset sum, to partition, to rectangle packing, 449 00:28:08,490 --> 00:28:10,165 to jigsaw puzzles. 450 00:28:10,165 --> 00:28:13,310 And we're going to do all those reductions, hopefully. 451 00:28:13,310 --> 00:28:18,920 And that's proving NP-hardness of all those problems. 452 00:28:18,920 --> 00:28:20,070 They're also all in NP. 453 00:28:24,990 --> 00:28:31,610 So 30 second intuition why 3SAT is NP-hard.
454 00:28:31,610 --> 00:28:35,310 Well, if you have any problem in NP, 455 00:28:35,310 --> 00:28:39,580 that means there is one of these nondeterministic polynomial 456 00:28:39,580 --> 00:28:44,870 time algorithms, or there is some verifier given 457 00:28:44,870 --> 00:28:46,740 a polynomial size certificate. 458 00:28:46,740 --> 00:28:49,250 So that verifier is just some algorithm. 459 00:28:49,250 --> 00:28:52,070 And software and hardware are basically the same thing, 460 00:28:52,070 --> 00:28:52,570 right? 461 00:28:52,570 --> 00:28:54,200 So you can convert that algorithm 462 00:28:54,200 --> 00:28:57,150 into a circuit that implements the algorithm. 463 00:28:57,150 --> 00:28:59,840 And if I have a circuit with like ANDs and ORs and NOTs, 464 00:28:59,840 --> 00:29:02,469 I can convert that into a Boolean formula 465 00:29:02,469 --> 00:29:03,510 with ANDs, ORs, and NOTs. 466 00:29:03,510 --> 00:29:06,030 Circuits and formulas are about the same. 467 00:29:06,030 --> 00:29:08,374 And if I have a formula-- fun fact, 468 00:29:08,374 --> 00:29:10,040 although this is a little less obvious-- 469 00:29:10,040 --> 00:29:17,120 you can convert it into this form, an AND of triple ORs. 470 00:29:17,120 --> 00:29:18,850 And once you've done that, that formula 471 00:29:18,850 --> 00:29:22,120 is equivalent to the original algorithm. 472 00:29:22,120 --> 00:29:25,840 And the inputs to that verification algorithm, 473 00:29:25,840 --> 00:29:29,220 the certificate, are represented by these variables, the xi's. 474 00:29:29,220 --> 00:29:31,040 And so deciding whether there's some way 475 00:29:31,040 --> 00:29:33,160 to set the xi's to make the formula true 476 00:29:33,160 --> 00:29:37,040 is the same thing as saying is there some certificate where 477 00:29:37,040 --> 00:29:40,220 the verifier says yes, which is the same thing as saying 478 00:29:40,220 --> 00:29:44,120 that the problem has answer yes. 
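To make the certificate idea above concrete for 3SAT itself, here is a rough sketch of a polynomial-time verifier: the certificate is an assignment to the xi's, and the verifier checks that every clause has at least one true literal. The encoding (positive integer i for xi, negative integer for its negation) and the function name `verify_3sat` are assumptions made for illustration. The first clause is the lecture's example, x1 or x3 or x6 bar; the second clause is made up.

```python
def verify_3sat(clauses, assignment):
    """Return True if the assignment makes every clause true.

    clauses: list of 3-tuples of nonzero ints; i means x_i, -i means NOT x_i.
    assignment: dict mapping variable index to True or False (the certificate).
    Runs in time linear in the size of the formula.
    """
    for clause in clauses:
        # A literal is true when the variable's value matches the literal's sign.
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False  # this clause has no true literal
    return True


# (x1 OR x3 OR NOT x6) AND (NOT x1 OR x2 OR x6)
example_formula = [(1, 3, -6), (-1, 2, 6)]

good_certificate = {1: True, 2: True, 3: False, 6: False}
bad_certificate = {1: False, 2: False, 3: False, 6: True}
```

The verifier only checks a given certificate. Deciding whether any certificate exists is the hard part.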
479 00:29:44,120 --> 00:29:47,670 So given an NP algorithm, one of these nondeterministic funny 480 00:29:47,670 --> 00:29:50,700 algorithms, we can convert it into a formula satisfaction 481 00:29:50,700 --> 00:29:51,670 problem. 482 00:29:51,670 --> 00:29:53,800 And that's how you prove 3SAT is NP-complete. 483 00:29:53,800 --> 00:29:55,680 But to do that can take many lectures, 484 00:29:55,680 --> 00:29:58,212 so I'm not going to do the details. 485 00:30:01,195 --> 00:30:03,570 The main annoying part is being formal about what exactly 486 00:30:03,570 --> 00:30:08,220 an algorithm is, which we don't do in this class. 487 00:30:08,220 --> 00:30:11,820 If you're interested, take 6.045, 488 00:30:11,820 --> 00:30:13,350 which is some people are actually 489 00:30:13,350 --> 00:30:16,050 in the overlap this semester. 490 00:30:16,050 --> 00:30:16,550 Cool. 491 00:30:16,550 --> 00:30:17,850 Let's do some reductions. 492 00:30:17,850 --> 00:30:19,500 This is where things get fun. 493 00:30:19,500 --> 00:30:23,230 So we're going to start with reducing 3SAT 494 00:30:23,230 --> 00:30:24,490 to Super Mario Brothers. 495 00:30:27,400 --> 00:30:31,320 So how many people have played Super Mario Brothers? 496 00:30:31,320 --> 00:30:31,880 Easy one. 497 00:30:31,880 --> 00:30:33,769 I hope if you haven't played, you've seen it, 498 00:30:33,769 --> 00:30:36,310 because we're going to rely very much on Super Mario Brothers 499 00:30:36,310 --> 00:30:38,720 physics, which I hope is fairly intuitive. 500 00:30:38,720 --> 00:30:42,050 But if you haven't played, you should, obviously. 501 00:30:46,180 --> 00:30:50,610 And we're going to reduce 3SAT to Super Mario Brothers. 502 00:31:04,480 --> 00:31:13,440 Now this is a theorem by a bunch of people, one MIT grad 503 00:31:13,440 --> 00:31:19,550 student, myself, and a couple other collaborators not at MIT. 
504 00:31:19,550 --> 00:31:23,220 And of course this result holds for all versions of Super Mario 505 00:31:23,220 --> 00:31:26,569 Brothers so far released, I think. 506 00:31:26,569 --> 00:31:28,110 The proofs are a little bit different 507 00:31:28,110 --> 00:31:31,200 for each one, especially Mario 2, which is its own universe. 508 00:31:33,710 --> 00:31:36,430 What I'm going to talk about the original Super Mario Brothers, 509 00:31:36,430 --> 00:31:39,380 NES classic which I grew up with. 510 00:31:42,020 --> 00:31:45,240 Now the real Super Mario Brothers is on a 320 511 00:31:45,240 --> 00:31:46,450 by 240 screen. 512 00:31:46,450 --> 00:31:47,480 It's a little bit small. 513 00:31:47,480 --> 00:31:51,110 Once you go right, you can't go back left, except in the maze 514 00:31:51,110 --> 00:31:53,020 levels anyway. 515 00:31:53,020 --> 00:31:55,190 So I need to generalize a little bit. 516 00:31:55,190 --> 00:31:58,540 Because if you assume that the screen size of Super Mario 517 00:31:58,540 --> 00:32:00,270 Brothers is constant, in fact you 518 00:32:00,270 --> 00:32:01,770 can dynamic program your way through 519 00:32:01,770 --> 00:32:04,670 and find the optimal solution in polynomial time. 520 00:32:04,670 --> 00:32:13,080 So I need to generalize a little bit to arbitrary board size, 521 00:32:13,080 --> 00:32:15,640 arbitrary screen size. 522 00:32:15,640 --> 00:32:22,010 So in fact, my entire level will be in one screen, no scrolling. 523 00:32:22,010 --> 00:32:24,890 Never mind this is a side scrolling adventure. 524 00:32:24,890 --> 00:32:27,160 And so that's my generalized problem. 525 00:32:27,160 --> 00:32:28,400 And I claim this is NP-hard. 526 00:32:28,400 --> 00:32:30,820 If I give you a level and I ask you, 527 00:32:30,820 --> 00:32:34,070 can you get to the end of this level? 528 00:32:34,070 --> 00:32:36,890 That problem is NP-hard. 529 00:32:36,890 --> 00:32:38,284 Also no time limit. 
530 00:32:38,284 --> 00:32:40,700 The time limit would be OK, but you have to generalize it. 531 00:32:40,700 --> 00:32:43,240 Instead of 300 seconds or whatever, 532 00:32:43,240 --> 00:32:47,119 it has to be an arbitrary value. 533 00:32:47,119 --> 00:32:48,410 So how are we going to do this? 534 00:32:48,410 --> 00:32:52,100 We're going to reduce from 3SAT to Super Mario Brothers. 535 00:32:52,100 --> 00:32:54,480 So that means I'm given-- I don't get to choose. 536 00:32:54,480 --> 00:32:56,470 I'm given one of these formulas. 537 00:32:56,470 --> 00:32:59,800 And I have to convert it into an equivalent Super Mario Brothers 538 00:32:59,800 --> 00:33:00,550 instance. 539 00:33:00,550 --> 00:33:04,300 So I have to convert it into a level, a hypothetical level 540 00:33:04,300 --> 00:33:05,310 of Super Mario Brothers. 541 00:33:05,310 --> 00:33:08,000 Given a formula, I have to build a level that 542 00:33:08,000 --> 00:33:11,150 implements that formula. 543 00:33:11,150 --> 00:33:13,230 So here's what it's going to look like. 544 00:33:13,230 --> 00:33:15,300 I'm going to start out somewhere. 545 00:33:15,300 --> 00:33:19,180 Here's my drawing of Mario. 546 00:33:19,180 --> 00:33:21,840 Mario-- or you could play Luigi. 547 00:33:21,840 --> 00:33:23,857 It doesn't matter. 548 00:33:23,857 --> 00:33:26,190 First thing it's going to do is enter a little black box 549 00:33:26,190 --> 00:33:29,220 called a variable. 550 00:33:29,220 --> 00:33:33,800 This is supposed to represent, let's call it x1. 551 00:33:33,800 --> 00:33:35,035 And so it's some black box. 552 00:33:35,035 --> 00:33:36,910 I'm going to tell you what it is in a moment. 553 00:33:36,910 --> 00:33:38,180 And it has two outputs. 554 00:33:38,180 --> 00:33:41,020 There's the true output and the false output. 555 00:33:41,020 --> 00:33:44,040 And the idea is that Mario has to choose whether to set x1 556 00:33:44,040 --> 00:33:46,470 to true or false.
557 00:33:46,470 --> 00:33:47,676 Let me show you that gadget. 558 00:33:52,326 --> 00:33:56,880 So here's the-- whoops, upside down-- here 559 00:33:56,880 --> 00:33:59,330 is the variable gadget. 560 00:33:59,330 --> 00:34:00,950 So here's Mario. 561 00:34:00,950 --> 00:34:02,630 Could enter from this way or that way. 562 00:34:02,630 --> 00:34:04,610 We'll need a couple of entrances in a moment. 563 00:34:04,610 --> 00:34:06,510 And then falls down. 564 00:34:06,510 --> 00:34:08,760 Once Mario is down here, if you check the jump height, 565 00:34:08,760 --> 00:34:10,199 you cannot get back up to here. 566 00:34:10,199 --> 00:34:11,451 So this is like a one way. 567 00:34:11,451 --> 00:34:13,159 Once you're down here, you have a choice. 568 00:34:13,159 --> 00:34:15,270 Should I fall to the left or fall to the right? 569 00:34:15,270 --> 00:34:19,150 And if you make these falls large enough, once you fall, 570 00:34:19,150 --> 00:34:21,550 you can't unfall. 571 00:34:21,550 --> 00:34:23,389 So once you make a choice of whether I 572 00:34:23,389 --> 00:34:26,780 leave on the true exit or the false exit, 573 00:34:26,780 --> 00:34:28,230 that's a permanent choice. 574 00:34:28,230 --> 00:34:30,730 So you can't undo it, unless you can come back to here. 575 00:34:30,730 --> 00:34:33,487 But we'll set up so that never happens. 576 00:34:33,487 --> 00:34:35,320 I mean, if you're trying to solve the level, 577 00:34:35,320 --> 00:34:37,100 you don't know which way to go. 578 00:34:37,100 --> 00:34:38,212 You have to guess. 579 00:34:38,212 --> 00:34:40,170 Can I go fall to the left or fall to the right, 580 00:34:40,170 --> 00:34:43,370 or do something. 581 00:34:43,370 --> 00:34:47,100 So the existence of a play through, this level, 582 00:34:47,100 --> 00:34:50,870 is the same as saying there is a choice for the x1 variable. 583 00:34:50,870 --> 00:34:53,573 Now we have to do this for lots of variables. 
584 00:34:53,573 --> 00:34:58,760 So there's x2 variable, x3 variable, and so on. 585 00:34:58,760 --> 00:35:03,950 Each one has a true exit and a false exit. 586 00:35:03,950 --> 00:35:06,890 So the actual level will have n instances of this 587 00:35:06,890 --> 00:35:08,530 if we have n variables. 588 00:35:08,530 --> 00:35:11,750 Now, what do I do once Mario decides 589 00:35:11,750 --> 00:35:14,220 that this is a true thing? 590 00:35:14,220 --> 00:35:15,680 What I'm going to do is have-- this 591 00:35:15,680 --> 00:35:17,150 is called a gadget by the way. 592 00:35:17,150 --> 00:35:19,680 In general, most NP-hardness proofs 593 00:35:19,680 --> 00:35:21,950 use these things called gadgets, which is just saying, 594 00:35:21,950 --> 00:35:24,790 we take various features of the input, 595 00:35:24,790 --> 00:35:26,940 and we convert them into corresponding features 596 00:35:26,940 --> 00:35:27,890 on the output. 597 00:35:27,890 --> 00:35:30,770 So here I'm taking each variable, x1, x2, x3, 598 00:35:30,770 --> 00:35:33,270 and so on, and building this little gadget 599 00:35:33,270 --> 00:35:35,330 for each of those variables. 600 00:35:35,330 --> 00:35:39,260 Now the other main thing you have in 3SAT are the clauses. 601 00:35:39,260 --> 00:35:42,100 We have triples of variables or their negations. 602 00:35:42,100 --> 00:35:45,640 They have to come together and be satisfied. 603 00:35:45,640 --> 00:35:47,320 One of them has to be true. 604 00:35:47,320 --> 00:35:56,220 So down here I'm going to have some clause gadgets, which 605 00:35:56,220 --> 00:35:57,560 I will show you in a moment. 606 00:36:05,630 --> 00:36:08,510 OK, and I think I'll switch colors. 607 00:36:08,510 --> 00:36:10,220 This is about to get messy. 608 00:36:10,220 --> 00:36:16,132 So the idea is that some of the clauses have x1 in them. 609 00:36:16,132 --> 00:36:19,440 The true version of x1, not x1 bar. 610 00:36:19,440 --> 00:36:22,920 So for those clauses, I want to connect. 
611 00:36:22,920 --> 00:36:25,010 I'm going to dip into the clause briefly. 612 00:36:25,010 --> 00:36:27,960 So from this wire going to dip into the clause here. 613 00:36:27,960 --> 00:36:30,950 And then I'm going to go to the next clause that has x1. 614 00:36:30,950 --> 00:36:35,020 Maybe it's this one, and the next one, and so on. 615 00:36:35,020 --> 00:36:38,360 All the clauses that have x1 in it, I dip into. 616 00:36:38,360 --> 00:36:40,240 The other ones I don't. 617 00:36:40,240 --> 00:36:41,640 And then once I'm done, I'm going 618 00:36:41,640 --> 00:36:44,290 to come back and feed into x2. 619 00:36:48,620 --> 00:36:52,180 Next, I look at this false wire for x1. 620 00:36:52,180 --> 00:36:55,260 So all the clauses that have x1 bar in them, 621 00:36:55,260 --> 00:36:56,720 I'm going to connect. 622 00:36:56,720 --> 00:36:58,770 So I don't know which ones they are. 623 00:36:58,770 --> 00:37:05,420 Maybe this one, or this one, something. 624 00:37:05,420 --> 00:37:08,270 And then I come here. 625 00:37:08,270 --> 00:37:11,150 And so the idea is that Mario makes a choice 626 00:37:11,150 --> 00:37:12,540 whether x1 is true or false. 627 00:37:12,540 --> 00:37:17,640 If x1 is true, Mario is going to visit all of the clauses that 628 00:37:17,640 --> 00:37:19,400 have x1 true in them. 629 00:37:19,400 --> 00:37:21,260 And then it's going to go to the x2 choice. 630 00:37:21,260 --> 00:37:23,510 Then it's going to choose whether x2 is true or false, 631 00:37:23,510 --> 00:37:25,620 and repeat. 632 00:37:25,620 --> 00:37:27,900 Or Mario decides x1 should be false. 633 00:37:27,900 --> 00:37:30,420 That will satisfy all the clauses 634 00:37:30,420 --> 00:37:34,550 that have x1 bar in them. 635 00:37:34,550 --> 00:37:37,220 And then again, we feed back into x2. 636 00:37:37,220 --> 00:37:39,950 So this is why we have two inputs into the x2 gadget. 637 00:37:39,950 --> 00:37:42,400 One of them is when the previous variable was true. 
638 00:37:42,400 --> 00:37:45,200 The other is when the previous variable was false. 639 00:37:45,200 --> 00:37:47,780 The choice of x2 doesn't depend on the choice of x1. 640 00:37:47,780 --> 00:37:49,495 So they feed into the same thing. 641 00:37:49,495 --> 00:37:50,870 And you have to make your choice. 642 00:37:53,960 --> 00:37:55,330 So far, so good. 643 00:37:55,330 --> 00:37:58,970 Now the question is, what's happening in these clauses. 644 00:37:58,970 --> 00:38:03,580 And then there's one other aspect, 645 00:38:03,580 --> 00:38:07,740 which is after you've set all of the variables, at the very end, 646 00:38:07,740 --> 00:38:13,350 after this last variable xn, at the very end, 647 00:38:13,350 --> 00:38:19,650 what we're going to do is come and go through all the clauses. 648 00:38:19,650 --> 00:38:21,440 And then this is the flag. 649 00:38:21,440 --> 00:38:22,970 This is where you win the level. 650 00:38:22,970 --> 00:38:24,110 Sorry, I drew it backwards. 651 00:38:24,110 --> 00:38:29,600 But the goal is for Mario to start here and get to here. 652 00:38:29,600 --> 00:38:31,070 In order to do that, you have to be 653 00:38:31,070 --> 00:38:32,694 able to traverse through these clauses. 654 00:38:32,694 --> 00:38:34,750 So what do the clauses look like? 655 00:38:34,750 --> 00:38:37,780 This is a little bit more elaborate. 656 00:38:37,780 --> 00:38:39,250 So here we are. 657 00:38:43,095 --> 00:38:44,970 This is a clause gadget. 658 00:38:44,970 --> 00:38:47,906 So there are three ways to dip into the clause. 659 00:38:47,906 --> 00:38:50,030 It's actually upside down relative to that picture, 660 00:38:50,030 --> 00:38:51,500 but that's not a problem. 661 00:38:54,180 --> 00:38:58,830 So if Mario comes here, then he can hit the question mark 662 00:38:58,830 --> 00:38:59,850 from below. 663 00:38:59,850 --> 00:39:02,742 And inside this question mark is an invincibility star.
664 00:39:02,742 --> 00:39:04,950 And the invincibility star will come up here and just 665 00:39:04,950 --> 00:39:07,710 bounce around forever. 666 00:39:07,710 --> 00:39:08,610 We checked. 667 00:39:08,610 --> 00:39:12,350 The star will just stay there for as long as you let it sit. 668 00:39:12,350 --> 00:39:14,290 Unfortunately, all of these are solid blocks, 669 00:39:14,290 --> 00:39:18,420 so Mario can't actually get up to here to get the star. 670 00:39:18,420 --> 00:39:20,459 But as long as Mario can visit this question 671 00:39:20,459 --> 00:39:22,500 mark or this question mark or this question mark, 672 00:39:22,500 --> 00:39:25,275 then there will be at least one star up here. 673 00:39:25,275 --> 00:39:26,650 So the idea is that each of these 674 00:39:26,650 --> 00:39:30,140 represents one of the literals that's in the clause. 675 00:39:30,140 --> 00:39:34,560 And if we choose-- so let's look at this first clause, x1 or x3 676 00:39:34,560 --> 00:39:35,860 or x6 bar. 677 00:39:35,860 --> 00:39:40,150 So if we choose x1 to be true, then we'll follow the path 678 00:39:40,150 --> 00:39:42,040 and we'll be able to hit the star. 679 00:39:42,040 --> 00:39:46,150 Or if we choose x3 to be true, then we'll come in here 680 00:39:46,150 --> 00:39:47,910 and hit this star. 681 00:39:47,910 --> 00:39:52,199 Or if we choose x6 to be false, then that path 682 00:39:52,199 --> 00:39:54,740 will lead to here and we'll be able to hit this question mark 683 00:39:54,740 --> 00:39:55,800 and get the star up here. 684 00:39:55,800 --> 00:39:58,570 So as long as we satisfy the clause, 685 00:39:58,570 --> 00:39:59,990 there will be at least one star. 686 00:39:59,990 --> 00:40:02,880 Won't help if you have multiple stars. 687 00:40:02,880 --> 00:40:07,220 Then the final traversal part-- so that was this first clause. 688 00:40:07,220 --> 00:40:08,960 And now we're traversing through. 689 00:40:08,960 --> 00:40:10,960 Actually in this picture, it's left to right. 
690 00:40:10,960 --> 00:40:13,420 Just turn your head. 691 00:40:13,420 --> 00:40:16,040 And so now Mario is going to have 692 00:40:16,040 --> 00:40:19,450 to traverse this gadget from left to right on this top part. 693 00:40:19,450 --> 00:40:22,810 And if Mario comes in here and you can barely jump over that. 694 00:40:22,810 --> 00:40:25,120 If there's a star, you can collect the star 695 00:40:25,120 --> 00:40:29,700 and then run through all of these flaming bars of death. 696 00:40:29,700 --> 00:40:32,030 If there's no star, you can't. 697 00:40:32,030 --> 00:40:34,170 You'll die if you try to traverse. 698 00:40:34,170 --> 00:40:38,530 So in order to be able to traverse all these clauses, 699 00:40:38,530 --> 00:40:40,630 they must all be true. 700 00:40:40,630 --> 00:40:44,180 And them all being true is the same as their AND being true. 701 00:40:44,180 --> 00:40:47,300 So you will be able to survive through all these clauses 702 00:40:47,300 --> 00:40:51,404 if and only if this formula has a satisfying assignment. 703 00:40:51,404 --> 00:40:52,820 The satisfying assignment would be 704 00:40:52,820 --> 00:40:57,990 given to you by the level play. 705 00:40:57,990 --> 00:41:00,400 The choices that Mario makes in this gadget 706 00:41:00,400 --> 00:41:02,750 will tell you whether each variable 707 00:41:02,750 --> 00:41:05,310 should be true or false. 708 00:41:05,310 --> 00:41:08,520 So to elaborate just a little bit more 709 00:41:08,520 --> 00:41:10,600 in general, when you have a reduction like this, 710 00:41:10,600 --> 00:41:13,600 to prove that it actually works, you need to check two things. 711 00:41:13,600 --> 00:41:15,690 You need to check that if there is a way 712 00:41:15,690 --> 00:41:17,670 to satisfy this formula, then there 713 00:41:17,670 --> 00:41:19,300 is a way to play this level.
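As a rough sketch of the construction so far, the level can be modeled abstractly, ignoring all the actual geometry: each variable gadget has a true wire and a false wire listing the clause gadgets it dips into, and the final traversal succeeds only if every clause gadget holds a star. The function names and the list-based "level" are hypothetical stand-ins for the real level layout. Literals are encoded as integers, i for xi and -i for xi bar, which is also an assumption.

```python
from itertools import product


def build_level(num_vars, clauses):
    """Abstract level: one (true_wire, false_wire) pair per variable gadget.

    Each wire is the list of clause indices that wire dips into.
    Building this takes polynomial time in the size of the formula.
    """
    level = []
    for v in range(1, num_vars + 1):
        true_wire = [i for i, clause in enumerate(clauses) if v in clause]
        false_wire = [i for i, clause in enumerate(clauses) if -v in clause]
        level.append((true_wire, false_wire))
    return level


def play(level, num_clauses, exits):
    """Simulate one playthrough; exits[i] is True if Mario takes variable i's true exit."""
    has_star = [False] * num_clauses
    for (true_wire, false_wire), took_true in zip(level, exits):
        for clause_index in (true_wire if took_true else false_wire):
            has_star[clause_index] = True  # Mario hits that question block
    # The final traversal needs a star in every clause gadget.
    return all(has_star)


def level_is_winnable(level, num_clauses):
    """Brute force over all exit choices (exponential; only for tiny examples)."""
    return any(
        play(level, num_clauses, exits)
        for exits in product([False, True], repeat=len(level))
    )


# The lecture's clause (x1 OR x3 OR NOT x6), with six variables.
single_clause = [(1, 3, -6)]
single_level = build_level(6, single_clause)

# All eight sign patterns on x1, x2, x3: an unsatisfiable formula.
unsat_clauses = [(a * 1, b * 2, c * 3) for a, b, c in product([1, -1], repeat=3)]
unsat_level = build_level(3, unsat_clauses)
```

In this model a winning sequence of exits is exactly a satisfying assignment, which is the equivalence the reduction needs. The brute-force search is there only to check small cases, not as part of the reduction.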
714 00:41:19,300 --> 00:41:21,716 And then conversely you need to show that if there's a way 715 00:41:21,716 --> 00:41:23,610 to play this level, then the formula 716 00:41:23,610 --> 00:41:25,150 has a satisfying assignment. 717 00:41:25,150 --> 00:41:28,980 So for that latter part, in order 718 00:41:28,980 --> 00:41:31,437 to convert a level play into a satisfying assignment, 719 00:41:31,437 --> 00:41:34,020 you just check which way Mario falls in each of these gadgets, 720 00:41:34,020 --> 00:41:34,820 left or right. 721 00:41:34,820 --> 00:41:36,700 That tells you the variable assignment. 722 00:41:36,700 --> 00:41:38,860 And because of the way the clauses work, 723 00:41:38,860 --> 00:41:41,200 you'll only be able to finish the level if there 724 00:41:41,200 --> 00:41:42,910 was at least one star here. 725 00:41:42,910 --> 00:41:44,780 And stars run out after some time. 726 00:41:44,780 --> 00:41:48,530 So you can barely make it through all the flaming bars 727 00:41:48,530 --> 00:41:49,067 of death. 728 00:41:49,067 --> 00:41:50,400 Then you get to the next clause. 729 00:41:50,400 --> 00:41:53,230 You need another star for each one. 730 00:41:53,230 --> 00:41:56,300 Conversely, if there is a satisfying assignment, 731 00:41:56,300 --> 00:41:58,060 you can actually play through the level, 732 00:41:58,060 --> 00:41:59,934 you just make these choices according to what 733 00:41:59,934 --> 00:42:01,180 the satisfying assignment is. 734 00:42:01,180 --> 00:42:03,330 So either way it's equivalent. 735 00:42:03,330 --> 00:42:07,090 We always get a yes or no answer here whenever 736 00:42:07,090 --> 00:42:11,460 we get a corresponding yes or no answer to the 3SAT problem. 737 00:42:11,460 --> 00:42:15,330 You also need to check that this reduction is polynomial size. 738 00:42:15,330 --> 00:42:17,690 It can be computed in polynomial time. 739 00:42:17,690 --> 00:42:18,530 So there's an issue.
740 00:42:18,530 --> 00:42:21,420 Given this thing, you have to lay this out in a grid 741 00:42:21,420 --> 00:42:23,780 and draw all these wires. 742 00:42:23,780 --> 00:42:28,290 And there's one problem here, which is, 743 00:42:28,290 --> 00:42:31,010 these wires cross each other. 744 00:42:31,010 --> 00:42:33,870 And that's a little awkward, because these wires 745 00:42:33,870 --> 00:42:36,660 are basically just long tunnels for Mario to walk through. 746 00:42:36,660 --> 00:42:38,910 But what does it mean to have a crossing wire? 747 00:42:38,910 --> 00:42:41,400 Really, if Mario's coming this way, 748 00:42:41,400 --> 00:42:44,630 I don't want them to be able to go up here. 749 00:42:44,630 --> 00:42:46,100 He has to go straight. 750 00:42:46,100 --> 00:42:47,900 Otherwise this reduction won't work. 751 00:42:47,900 --> 00:42:50,800 So I need what's called a crossover gadget. 752 00:42:50,800 --> 00:42:56,940 And everywhere here I have a crossing, I have a crossover. 753 00:42:56,940 --> 00:42:59,360 And this gadget has to guarantee that I 754 00:42:59,360 --> 00:43:01,220 can go through one way or the other way, 755 00:43:01,220 --> 00:43:04,860 but there's no leakage from one path to the other path. 756 00:43:04,860 --> 00:43:08,550 Actually, if I first traverse through here, 757 00:43:08,550 --> 00:43:10,950 and then I traverse through here, it's OK if I leak back. 758 00:43:10,950 --> 00:43:14,310 Because once I visit a wire, it's kind of done. 759 00:43:14,310 --> 00:43:17,430 But I can't have leakage if only one of them is traversed. 760 00:43:17,430 --> 00:43:23,570 So this is the last gadget, the most complicated of them all. 761 00:43:23,570 --> 00:43:25,295 So this took a while to construct, 762 00:43:25,295 --> 00:43:26,170 as you might imagine. 763 00:43:28,710 --> 00:43:32,470 So this is what we call a unidirectional crossover. 
764 00:43:32,470 --> 00:43:38,690 You can either go from left to right or from bottom to top, 765 00:43:38,690 --> 00:43:42,270 but you cannot go from bottom to right or bottom to left or left 766 00:43:42,270 --> 00:43:44,480 to bottom, that kind of thing. 767 00:43:44,480 --> 00:43:45,980 So I'm told that Mario is only going 768 00:43:45,980 --> 00:43:48,450 to enter from here to here, because all of these wires, 769 00:43:48,450 --> 00:43:50,010 I can make one way wires. 770 00:43:50,010 --> 00:43:53,440 I only have to think about going in a particular direction. 771 00:43:53,440 --> 00:43:55,620 I can have falls to force Mario to only go 772 00:43:55,620 --> 00:43:57,840 one way along these wires. 773 00:43:57,840 --> 00:44:01,790 And so let me show you the valid traversals. 774 00:44:01,790 --> 00:44:03,900 Maybe the simplest one is from here. 775 00:44:03,900 --> 00:44:05,910 So let's say Mario comes in here, falls. 776 00:44:05,910 --> 00:44:08,480 So I can't backtrack, can jump up here. 777 00:44:08,480 --> 00:44:11,960 And then if Mario's big, he can break this block, 778 00:44:11,960 --> 00:44:12,890 break this block. 779 00:44:12,890 --> 00:44:17,490 But if he's big-- there should be a couple more zig zags here. 780 00:44:17,490 --> 00:44:18,790 Let's try to run. 781 00:44:18,790 --> 00:44:21,729 You can crouch slide through here. 782 00:44:21,729 --> 00:44:23,520 But then you'll sort of lose your momentum, 783 00:44:23,520 --> 00:44:25,144 and you won't be able to go through all 784 00:44:25,144 --> 00:44:27,990 these traversals as big Mario. 785 00:44:27,990 --> 00:44:31,800 So you can break these blocks and then get up to the top 786 00:44:31,800 --> 00:44:32,800 and leave. 787 00:44:32,800 --> 00:44:35,520 Or, if big Mario comes from over this way, 788 00:44:35,520 --> 00:44:38,940 you can first take a damage, become small Mario. 789 00:44:38,940 --> 00:44:42,010 Then you can fit through these wiggly blocks. 
790 00:44:42,010 --> 00:44:45,090 But you cannot break blocks anymore as small Mario. 791 00:44:45,090 --> 00:44:48,230 So once you've committed to going small, 792 00:44:48,230 --> 00:44:50,520 you have to stay small, until you get to here. 793 00:44:50,520 --> 00:44:52,402 And then there's a mushroom in this block. 794 00:44:52,402 --> 00:44:54,860 So you can get big again, and then you can break this block 795 00:44:54,860 --> 00:44:56,020 and leave. 796 00:44:56,020 --> 00:44:57,700 But once you're big, you can't backtrack 797 00:44:57,700 --> 00:45:00,690 because big Mario can't fit through these tiny tubes. 798 00:45:00,690 --> 00:45:03,430 See it clear, right? 799 00:45:03,430 --> 00:45:07,260 So slight detail, which is at the beginning, 800 00:45:07,260 --> 00:45:10,660 we need to make Mario big-- so there's a little mushroom. 801 00:45:10,660 --> 00:45:14,450 I think they have three spots-- at the beginning. 802 00:45:14,450 --> 00:45:16,180 And also at the end, there has to be 803 00:45:16,180 --> 00:45:18,600 something like this that checks that you actually 804 00:45:18,600 --> 00:45:19,640 have a mushroom. 805 00:45:19,640 --> 00:45:22,180 So the only time you're allowed to take damage 806 00:45:22,180 --> 00:45:24,232 is briefly in this gadget you take damage. 807 00:45:24,232 --> 00:45:26,190 If you tried to backtrack, you would get stuck. 808 00:45:26,190 --> 00:45:28,120 There's a long fall here. 809 00:45:28,120 --> 00:45:29,820 And then you have to get the mushroom 810 00:45:29,820 --> 00:45:32,610 so you can escape again. 811 00:45:32,610 --> 00:45:39,250 So at the end there's like a mushroom check. 812 00:45:39,250 --> 00:45:40,950 Make sure you have it. 813 00:45:40,950 --> 00:45:42,790 So most of the time Mario is big. 814 00:45:42,790 --> 00:45:44,260 And just in these little crossovers 815 00:45:44,260 --> 00:45:45,635 you have to make these decisions. 
816 00:45:45,635 --> 00:45:47,740 This would make a giant level, but it 817 00:45:47,740 --> 00:45:54,650 is polynomial size, probably quadratic or something. 818 00:45:54,650 --> 00:45:56,400 Therefore Super Mario Brothers is NP-hard. 819 00:45:59,050 --> 00:46:01,420 So if you want more fun examples like this, 820 00:46:01,420 --> 00:46:04,650 you should check out 6.890, the class I taught last semester, 821 00:46:04,650 --> 00:46:08,100 which has online video lectures, soon to be on OpenCourseWare. 822 00:46:08,100 --> 00:46:12,310 So you can play with that. 823 00:46:12,310 --> 00:46:14,310 Any questions about Mario? 824 00:46:17,340 --> 00:46:18,740 All right, I hope you all play. 825 00:46:27,380 --> 00:46:30,730 So the next topic is a problem you probably 826 00:46:30,730 --> 00:46:33,410 haven't heard about, three dimensional matching. 827 00:46:43,880 --> 00:46:46,980 This is a kind of a graph theory problem. 828 00:46:46,980 --> 00:46:50,590 We're going to call it 3DM for short. 829 00:46:50,590 --> 00:46:53,640 And you've seen matching problems based on flow. 830 00:46:53,640 --> 00:46:56,030 Matching problems are usually about pairs of things. 831 00:46:56,030 --> 00:46:58,220 You're pairing them up, which you might 832 00:46:58,220 --> 00:46:59,470 call two dimensional matching. 833 00:46:59,470 --> 00:47:01,700 That can be solved in polynomial time. 834 00:47:01,700 --> 00:47:05,380 But if you change two to three and you're tripling things up, 835 00:47:05,380 --> 00:47:08,420 then suddenly the problem becomes NP-complete. 836 00:47:08,420 --> 00:47:13,290 So it's a useful starting point, similar to 3SAT. 837 00:47:20,090 --> 00:47:28,250 So you're given a set X of elements, 838 00:47:28,250 --> 00:47:30,810 a set Y of elements, a set Z of elements. 839 00:47:30,810 --> 00:47:32,770 None of them are shared. 840 00:47:32,770 --> 00:47:37,020 But more importantly, you are given a bunch of triples. 
841 00:47:37,020 --> 00:47:40,850 These are the allowable triples. 842 00:47:40,850 --> 00:47:44,790 So we'll call the set of allowable triples T. 843 00:47:44,790 --> 00:47:47,100 And so we're looking at the cross product. 844 00:47:47,100 --> 00:47:50,890 This is the set of all triples x, y, and z, where x is in X, 845 00:47:50,890 --> 00:47:54,620 y is in Y, and z is in Z. But not all triples are allowed. 846 00:47:54,620 --> 00:47:58,210 Only some subset of triples is allowed. 847 00:47:58,210 --> 00:48:04,800 And your goal is to choose among those subsets-- sorry, 848 00:48:04,800 --> 00:48:09,440 among those triples a subset of the triples. 849 00:48:09,440 --> 00:48:11,020 So we're trying to choose a subset 850 00:48:11,020 --> 00:48:26,667 S of T such that every element-- so the things in X, Y, and Z 851 00:48:26,667 --> 00:48:27,500 are called elements. 852 00:48:27,500 --> 00:48:33,240 So I'm just taking somebody in the union XYZ. 853 00:48:33,240 --> 00:48:48,225 It should be in exactly one triple s in big S. 854 00:48:48,225 --> 00:48:50,600 This is a little weird, but you can think of this problem 855 00:48:50,600 --> 00:48:56,140 as you have an alien race with three genders-- 856 00:48:56,140 --> 00:48:58,254 male, female, neuter I guess. 857 00:48:58,254 --> 00:48:59,420 Those are the X, Y, and Z's. 858 00:48:59,420 --> 00:49:02,420 There's an equal number of each. 859 00:49:02,420 --> 00:49:06,010 And every triple reports to you whether that 860 00:49:06,010 --> 00:49:08,270 is a compatible matching. 861 00:49:08,270 --> 00:49:11,130 Who knows what they're doing, all three of them? 862 00:49:11,130 --> 00:49:14,110 So you're told up front-- you take a survey. 863 00:49:14,110 --> 00:49:15,790 There's only n cubed different triples. 864 00:49:15,790 --> 00:49:19,992 For each of them they say, yeah, I'd do that. 865 00:49:19,992 --> 00:49:22,630 So you were given that subset.
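The formal condition stated above, that every element of X, Y, and Z lies in exactly one chosen triple, is easy to check for a proposed S. Here is a minimal sketch of such a check. The function name `is_3d_matching` and the tiny instance are made up for illustration, and the sketch assumes X, Y, and Z are disjoint and that every triple in T is drawn from X, Y, and Z in that order.

```python
from collections import Counter


def is_3d_matching(X, Y, Z, T, S):
    """Check whether S is a valid solution to the 3DM instance (X, Y, Z, T).

    S must use only allowed triples, and every element of X, Y, and Z
    must appear in exactly one triple of S. Runs in polynomial time.
    """
    allowed = set(T)
    if any(triple not in allowed for triple in S):
        return False  # S uses a forbidden triple

    # Count how many chosen triples each element appears in.
    counts = Counter(element for triple in S for element in triple)
    elements = set(X) | set(Y) | set(Z)
    return all(counts[element] == 1 for element in elements)


# A hypothetical instance with two elements per set.
X = {"x1", "x2"}
Y = {"y1", "y2"}
Z = {"z1", "z2"}
T = [("x1", "y1", "z1"), ("x2", "y2", "z2"), ("x1", "y2", "z2")]

valid_choice = [("x1", "y1", "z1"), ("x2", "y2", "z2")]
overlapping_choice = [("x1", "y1", "z1"), ("x1", "y2", "z2")]
forbidden_choice = [("x1", "y1", "z1"), ("x2", "y2", "z1")]
```

Here `overlapping_choice` fails because x1 is used twice and x2 is never covered, and `forbidden_choice` fails because its second triple is not in T.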
866 00:49:22,630 --> 00:49:26,400 And now your goal is to permanently triple up 867 00:49:26,400 --> 00:49:27,380 these guys. 868 00:49:27,380 --> 00:49:30,630 And everybody wants to be in exactly one triple. 869 00:49:30,630 --> 00:49:34,860 So it's a monogamous race, imagine. 870 00:49:34,860 --> 00:49:38,190 So everybody wants to be put in one triple, but only one 871 00:49:38,190 --> 00:49:39,317 triple. 872 00:49:39,317 --> 00:49:40,900 And the question is, is this possible? 873 00:49:40,900 --> 00:49:42,358 This is three dimensional matching. 874 00:49:42,358 --> 00:49:45,436 Certainly not always going to be possible, but sometimes it is. 875 00:49:45,436 --> 00:49:46,810 If it is, you want to answer yes. 876 00:49:46,810 --> 00:49:50,060 If it's not possible, you want to answer no. 877 00:49:50,060 --> 00:49:52,180 This problem is NP-complete. 878 00:49:52,180 --> 00:49:54,150 Why is it in NP? 879 00:49:54,150 --> 00:49:58,140 Because I can basically guess which elements of T are in S. 880 00:49:58,140 --> 00:50:00,730 There's only at most n cubed of them. 881 00:50:00,730 --> 00:50:02,510 So for each one, I just guess yes or no, 882 00:50:02,510 --> 00:50:04,310 is that element of T in S? 883 00:50:04,310 --> 00:50:07,270 And then I check whether this coverage constraint holds. 884 00:50:07,270 --> 00:50:09,617 So it's very easy to prove this is in NP. 885 00:50:12,540 --> 00:50:15,940 The challenge is to prove that it's NP-hard. 886 00:50:15,940 --> 00:50:20,610 And we're going to do that, again, by reducing from 3SAT. 887 00:50:32,510 --> 00:50:36,640 So we're going to make a reduction from 3SAT 888 00:50:36,640 --> 00:50:38,850 to three dimensional matching. 889 00:50:38,850 --> 00:50:39,840 Direction is important. 890 00:50:39,840 --> 00:50:42,030 Always reduce from the thing you know is hard 891 00:50:42,030 --> 00:50:45,230 and reduce to the thing you don't know is hard. 892 00:50:45,230 --> 00:50:47,320 So again, we're given a formula.
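The "guess, then check" argument for membership in NP can be sketched in code. This is only an illustration, not something shown in the lecture: the function name `is_3d_matching` and the representation of the instance as Python lists of hashable elements and 3-tuples are assumptions. The function plays the role of the polynomial-time check of a guessed subset S against the coverage constraint.

```python
from collections import Counter

def is_3d_matching(X, Y, Z, T, S):
    """Check a guessed subset S of the allowed triples T (hypothetical helper)."""
    allowed = set(T)
    # Every chosen triple must be one of the allowed triples.
    if any(triple not in allowed for triple in S):
        return False
    # Count how many chosen triples touch each element.
    # Elements are tagged with their set so X, Y, and Z never collide.
    cover = Counter()
    for x, y, z in S:
        cover[("X", x)] += 1
        cover[("Y", y)] += 1
        cover[("Z", z)] += 1
    elements = ([("X", x) for x in X] + [("Y", y) for y in Y]
                + [("Z", z) for z in Z])
    # Each element must be in exactly one chosen triple.
    return len(cover) == len(elements) and all(cover[e] == 1 for e in elements)
```

The check does one pass over S and one pass over the elements, so it runs in time polynomial in the input size, which is all that membership in NP requires.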
893 00:50:47,320 --> 00:50:49,640 And we want to convert that formula 894 00:50:49,640 --> 00:50:53,670 into an equivalent three dimensional matching input. 895 00:50:53,670 --> 00:50:56,965 So the formula has variables and clauses. 896 00:50:56,965 --> 00:50:58,590 For each variable, we're going to build 897 00:50:58,590 --> 00:51:01,777 a gadget that looks like this. 898 00:51:01,777 --> 00:51:03,860 And for each clause we're going to build a gadget. 899 00:51:03,860 --> 00:51:06,720 So here's what they look like. 900 00:51:06,720 --> 00:51:13,730 If we have a variable x1, we're going to convert that 901 00:51:13,730 --> 00:51:15,650 into this picture. 902 00:51:30,490 --> 00:51:32,053 Stay monochromatic for now. 903 00:51:36,660 --> 00:51:41,176 Looks pretty crazy at the moment, but it's not so crazy. 904 00:51:56,510 --> 00:51:58,490 This is not supposed to be obvious. 905 00:51:58,490 --> 00:52:00,330 You have to think for a while. 906 00:52:00,330 --> 00:52:03,310 It's a puzzle to figure out this kind of thing. 907 00:52:03,310 --> 00:52:08,610 But I call this thing a variable gadget because locally-- 908 00:52:08,610 --> 00:52:11,450 so there's basically a wheel in the center here. 909 00:52:11,450 --> 00:52:13,740 And then there's these extra dots 910 00:52:13,740 --> 00:52:15,810 for every pair of dots, consecutive pairs of dots 911 00:52:15,810 --> 00:52:17,100 in a wheel. 912 00:52:17,100 --> 00:52:19,960 And what I've drawn is the set of triples that are allowed. 913 00:52:19,960 --> 00:52:22,620 There's tons of other triples which are forbidden. 914 00:52:22,620 --> 00:52:24,670 The triples that are in T are the ones 915 00:52:24,670 --> 00:52:27,490 that I draw as little triangles. 916 00:52:27,490 --> 00:52:30,480 And I two-colored them because there are exactly two ways 917 00:52:30,480 --> 00:52:31,850 to solve this gadget locally. 918 00:52:31,850 --> 00:52:35,310 Now these dots are going to be connected to other gadgets.
919 00:52:35,310 --> 00:52:39,407 But these dots only exist in this gadget, which means 920 00:52:39,407 --> 00:52:40,490 they've got to be covered. 921 00:52:40,490 --> 00:52:42,115 They've got to be covered exactly once. 922 00:52:42,115 --> 00:52:45,470 So either you choose the blue triangles, 923 00:52:45,470 --> 00:52:47,050 or you choose the red triangles. 924 00:52:47,050 --> 00:52:51,840 Each of them will exactly cover each of these guys once. 925 00:52:51,840 --> 00:52:53,890 You cannot mix and match red and blue, 926 00:52:53,890 --> 00:52:56,810 because you either get overlap if you choose two guys that 927 00:52:56,810 --> 00:52:59,250 share a point, or you'd miss one. 928 00:52:59,250 --> 00:53:01,770 If I choose like this blue and this red, 929 00:53:01,770 --> 00:53:04,550 then I can't cover this point because both of these 930 00:53:04,550 --> 00:53:05,910 would overlap those two. 931 00:53:05,910 --> 00:53:09,080 And over here you have to choose [INAUDIBLE] triples. 932 00:53:09,080 --> 00:53:10,590 They can't overlap at all. 933 00:53:10,590 --> 00:53:13,270 And everybody has to get covered. 934 00:53:13,270 --> 00:53:15,640 So just given those constraints, locally you 935 00:53:15,640 --> 00:53:18,260 can see you have to choose red or blue. 936 00:53:18,260 --> 00:53:18,920 Guess what? 937 00:53:18,920 --> 00:53:21,800 One of them is true, the other one is false. 938 00:53:21,800 --> 00:53:25,414 Let's say that red is true and blue is false. 939 00:53:25,414 --> 00:53:27,830 In general, when you're trying to build a variable gadget, 940 00:53:27,830 --> 00:53:30,180 you build something that has exactly 941 00:53:30,180 --> 00:53:33,220 two solutions, one representing true, one representing false. 942 00:53:33,220 --> 00:53:36,850 Now how big do I make this wheel? 943 00:53:36,850 --> 00:53:39,800 Big enough. 944 00:53:39,800 --> 00:53:41,970 You could make it as big as the number of clauses. 
945 00:53:41,970 --> 00:53:46,890 I'm going to make it 2 times nx1. 946 00:53:46,890 --> 00:53:59,110 So wheel-- and this number is the number of occurrences 947 00:53:59,110 --> 00:54:03,680 of x1 in the formula. 948 00:54:03,680 --> 00:54:06,900 So this is the number of clauses that contain either xi or xi 949 00:54:06,900 --> 00:54:08,720 bar. 950 00:54:08,720 --> 00:54:09,585 That's nxi. 951 00:54:09,585 --> 00:54:11,580 I'm going to double that. 952 00:54:11,580 --> 00:54:14,960 Because what I get over here is basically xi 953 00:54:14,960 --> 00:54:21,540 being true for those guys. 954 00:54:21,540 --> 00:54:25,020 Actually, yeah, that's actually right. 955 00:54:25,020 --> 00:54:27,220 It looks backwards. 956 00:54:27,220 --> 00:54:28,650 And false for these guys. 957 00:54:31,670 --> 00:54:35,380 One way or the other, we'll figure it out. 958 00:54:35,380 --> 00:54:40,800 So in order for xi to appear in, say, five different clauses, 959 00:54:40,800 --> 00:54:45,590 I want five of the true things and five of the false things. 960 00:54:45,590 --> 00:54:49,854 And so I need to double in order to get-- potentially 961 00:54:49,854 --> 00:54:51,520 I have twice as many as I actually need, 962 00:54:51,520 --> 00:54:53,603 but this way I'm guaranteed to have false or true, 963 00:54:53,603 --> 00:54:55,830 whichever I need. 964 00:54:55,830 --> 00:54:57,560 In reality I have some true occurrences. 965 00:54:57,560 --> 00:55:01,210 I have some false occurrences, some x1's, some x1 bars. 966 00:55:01,210 --> 00:55:06,600 This will guarantee that I have enough of these free points 967 00:55:06,600 --> 00:55:09,720 to connect into my clause gadgets. 968 00:55:09,720 --> 00:55:11,359 How do I do a clause gadget? 969 00:55:11,359 --> 00:55:12,442 It's actually really easy. 970 00:55:16,854 --> 00:55:18,770 So these would be pretty boring by themselves. 971 00:55:25,590 --> 00:55:27,180 So a clause always looks like this.
972 00:55:27,180 --> 00:55:28,388 Maybe there's some negations. 973 00:55:30,860 --> 00:55:34,620 Yeah, let's do something like that. 974 00:55:34,620 --> 00:55:36,795 I'm going to convert it into a very simple picture. 975 00:55:43,810 --> 00:55:50,700 It's going to be xi dot, and xj bar dot, and xk dot. 976 00:55:50,700 --> 00:55:56,363 And then-- well maybe I'll stick to these colors. 977 00:56:02,400 --> 00:56:06,940 Again, these two points only appear in this clause gadget. 978 00:56:06,940 --> 00:56:10,410 These dots are actually these dots. 979 00:56:10,410 --> 00:56:12,570 So there's one of these pictures for x1. 980 00:56:12,570 --> 00:56:14,330 There's another one for x2, x3. 981 00:56:14,330 --> 00:56:16,770 And so xi has one of these wheels. 982 00:56:16,770 --> 00:56:20,190 I want this dot to be one of these dots of the wheel. 983 00:56:20,190 --> 00:56:22,900 And then I want this dot to be one 984 00:56:22,900 --> 00:56:26,300 of the dots in the xj wheel with the false setting, one 985 00:56:26,300 --> 00:56:27,600 of the red dots. 986 00:56:27,600 --> 00:56:34,981 I want this one to be xk true setting in the xk wheel. 987 00:56:34,981 --> 00:56:36,730 So these things are all connected together 988 00:56:36,730 --> 00:56:38,310 in a complicated pattern. 989 00:56:38,310 --> 00:56:40,110 But the point is that within this gadget, 990 00:56:40,110 --> 00:56:42,910 I only have three allowed triples. 991 00:56:42,910 --> 00:56:44,790 And these points only appear in this gadget, 992 00:56:44,790 --> 00:56:47,734 which means they have to be covered in this gadget. 993 00:56:47,734 --> 00:56:49,150 They can be covered by this triple 994 00:56:49,150 --> 00:56:51,200 or this triple or this triple. 995 00:56:51,200 --> 00:56:54,480 But once you choose one, you can't choose the others. 996 00:56:54,480 --> 00:56:59,540 What this means is if I set x1 to be true, 997 00:56:59,540 --> 00:57:03,540 it leaves behind these points marked true. 
998 00:57:03,540 --> 00:57:05,640 If I choose the red things, then it's 999 00:57:05,640 --> 00:57:08,184 the blue points that are left behind. 1000 00:57:08,184 --> 00:57:09,600 Leaving points behind in this case 1001 00:57:09,600 --> 00:57:11,772 is going to be good, because this clause, 1002 00:57:11,772 --> 00:57:13,480 in order to satisfy this clause, in order 1003 00:57:13,480 --> 00:57:17,330 to choose one of these three triples, at least one of these 1004 00:57:17,330 --> 00:57:22,500 must be left behind by the wheel. 1005 00:57:22,500 --> 00:57:24,457 If all of these are covered by their wheels, 1006 00:57:24,457 --> 00:57:25,290 then there's no way. 1007 00:57:25,290 --> 00:57:27,750 I can't choose any of these guys. 1008 00:57:27,750 --> 00:57:30,810 But if at least one of these is left behind by the wheel, 1009 00:57:30,810 --> 00:57:33,780 then I can choose the corresponding triple 1010 00:57:33,780 --> 00:57:34,760 and cover these points. 1011 00:57:34,760 --> 00:57:36,880 So I'll be able to cover these points if and only 1012 00:57:36,880 --> 00:57:39,947 if at least one of these is true. 1013 00:57:39,947 --> 00:57:40,780 And that's a clause. 1014 00:57:40,780 --> 00:57:43,440 That's what a clause is supposed to do in 3SAT. 1015 00:57:43,440 --> 00:57:45,200 If at least one of these is true, 1016 00:57:45,200 --> 00:57:46,590 then the clause is satisfied. 1017 00:57:46,590 --> 00:57:48,380 I need all the clauses to be satisfied 1018 00:57:48,380 --> 00:57:50,720 because I need to cover all of these points for all 1019 00:57:50,720 --> 00:57:53,760 the instances of these clauses. 1020 00:57:53,760 --> 00:57:55,310 And that's how it works. 1021 00:57:55,310 --> 00:57:57,540 Now, slight catch. 1022 00:57:57,540 --> 00:57:59,710 If you do this, not all the points 1023 00:57:59,710 --> 00:58:01,740 will be covered, even so. 1024 00:58:04,650 --> 00:58:05,970 Maybe all of these are true. 1025 00:58:05,970 --> 00:58:07,890 And so they're all left behind.
1026 00:58:07,890 --> 00:58:10,580 And I can only cover one of them with the clause. 1027 00:58:10,580 --> 00:58:12,560 It's a little messy. 1028 00:58:12,560 --> 00:58:17,710 You need another gadget, which is called garbage collection. 1029 00:58:23,910 --> 00:58:26,650 I don't want to spend too much time on it. 1030 00:58:26,650 --> 00:58:30,900 But you have two dots. 1031 00:58:30,900 --> 00:58:42,610 And then you have every single xi-- these dots, 1032 00:58:42,610 --> 00:58:47,490 all true and false dots. 1033 00:58:47,490 --> 00:58:49,410 And you're going to have this triple, 1034 00:58:49,410 --> 00:58:52,690 and this triple, and this triple, and this triple, 1035 00:58:52,690 --> 00:58:53,220 and so on. 1036 00:58:53,220 --> 00:58:56,230 It looks an awful lot like a clause. 1037 00:58:56,230 --> 00:58:59,190 But this is like a clause that's connected to everybody 1038 00:58:59,190 --> 00:59:00,820 in the entire universe. 1039 00:59:00,820 --> 00:59:03,260 And you repeat this the appropriate number 1040 00:59:03,260 --> 00:59:09,380 of times, which is something like sum of nx 1041 00:59:09,380 --> 00:59:10,670 minus the number of clauses. 1042 00:59:14,100 --> 00:59:14,850 OK, why? 1043 00:59:14,850 --> 00:59:16,880 Because if you look at a wheel, it 1044 00:59:16,880 --> 00:59:21,780 has size 2 times nx for a variable x. 1045 00:59:21,780 --> 00:59:26,300 And half of the points will be left uncovered. 1046 00:59:26,300 --> 00:59:29,630 So that means nx of them will be uncovered. 1047 00:59:29,630 --> 00:59:33,110 Then the clause, if everything works out correctly, 1048 00:59:33,110 --> 00:59:37,710 the clause will cover exactly one of those points. 1049 00:59:37,710 --> 00:59:40,090 So for each clause we cover one of the points. 1050 00:59:40,090 --> 00:59:43,270 That means this difference is exactly how 1051 00:59:43,270 --> 00:59:45,470 many points are left uncovered. 1052 00:59:45,470 --> 00:59:48,720 And so we make this gadget exactly that many times. 
1053 00:59:48,720 --> 00:59:50,530 And it's free to cover anybody. 1054 00:59:50,530 --> 00:59:53,860 So whatever is left over, this garbage collector 1055 00:59:53,860 --> 00:59:55,160 will clean up. 1056 00:59:55,160 --> 00:59:57,200 And if we use exactly the right number of them, 1057 00:59:57,200 --> 01:00:01,170 this garbage collector won't run out of things to collect. 1058 01:00:01,170 --> 01:00:03,830 So this makes the proof messy. 1059 01:00:03,830 --> 01:00:07,712 But I want to move on to somewhat simpler proofs 1060 01:00:07,712 --> 01:00:08,670 and for other problems. 1061 01:00:08,670 --> 01:00:09,832 Yeah? 1062 01:00:09,832 --> 01:00:12,262 AUDIENCE: Real quick, what about the t or f 1063 01:00:12,262 --> 01:00:14,692 points that we didn't cover because we didn't actually 1064 01:00:14,692 --> 01:00:15,670 need that many? 1065 01:00:15,670 --> 01:00:16,503 ERIC DEMAINE: Right. 1066 01:00:16,503 --> 01:00:19,210 So this also includes the points that weren't even connected 1067 01:00:19,210 --> 01:00:20,950 to clauses. 1068 01:00:20,950 --> 01:00:23,330 I think this is the right number no matter what, 1069 01:00:23,330 --> 01:00:26,180 because this is counting the total number of uncovered guys, 1070 01:00:26,180 --> 01:00:28,310 whether they're connected to clauses or not. 1071 01:00:28,310 --> 01:00:30,890 Each clause will, in a satisfied situation, 1072 01:00:30,890 --> 01:00:33,341 it will cover exactly one of those points. 1073 01:00:33,341 --> 01:00:35,090 The ones that aren't connected to the clauses 1074 01:00:35,090 --> 01:00:37,010 won't be covered at all, but that will still 1075 01:00:37,010 --> 01:00:38,000 be in this difference. 1076 01:00:38,000 --> 01:00:39,750 So yeah, it's good to check that. 1077 01:00:39,750 --> 01:00:42,250 The first time I wrote this down I forgot about those points 1078 01:00:42,250 --> 01:00:42,958 and got it wrong. 1079 01:00:42,958 --> 01:00:45,930 But I think this is right, hopefully.
1080 01:00:45,930 --> 01:00:47,980 I did not come up with this proof. 1081 01:00:47,980 --> 01:00:49,950 Garey and Johnson I think-- or no. 1082 01:00:49,950 --> 01:00:52,906 This is-- I forgot. 1083 01:00:52,906 --> 01:00:54,710 Yeah, this is a Garey and Johnson proof. 1084 01:00:54,710 --> 01:00:58,610 There's a cool book from the late '70s by Garey and Johnson, 1085 01:00:58,610 --> 01:01:04,320 does a lot of NP-completeness, if you're curious. 1086 01:01:04,320 --> 01:01:06,702 All right, so hopefully you believe 1087 01:01:06,702 --> 01:01:08,160 three dimensional matching is hard. 1088 01:01:08,160 --> 01:01:10,520 Now I'm going to use it to prove that some very different types 1089 01:01:10,520 --> 01:01:11,650 of problems are hard. 1090 01:01:11,650 --> 01:01:14,270 This is a kind of graph theory problem. 1091 01:01:14,270 --> 01:01:18,070 You'll see more graph theory problems in recitation. 1092 01:01:18,070 --> 01:01:21,800 This one, I can erase 3SAT and Mario. 1093 01:01:21,800 --> 01:01:27,660 So in the world, most NP-hardness proofs 1094 01:01:27,660 --> 01:01:31,520 are reductions from 3SAT, or some variation of 3SAT. 1095 01:01:31,520 --> 01:01:34,450 In some sense, you can think of three dimensional matching 1096 01:01:34,450 --> 01:01:37,870 as kind of like a version of 3SAT, 1097 01:01:37,870 --> 01:01:39,840 but it's a little bit more stringent. 1098 01:01:39,840 --> 01:01:44,980 And that stringency helps us to do other reductions. 1099 01:01:44,980 --> 01:01:47,070 So here's another problem where we'll 1100 01:01:47,070 --> 01:01:51,710 reduce from three dimensional matching. 1101 01:01:51,710 --> 01:01:52,940 It's called subset sum. 1102 01:02:00,220 --> 01:02:09,830 So you're given n integers, a1 up to an. 1103 01:02:09,830 --> 01:02:15,090 And you're given a target sum, also an integer. 1104 01:02:15,090 --> 01:02:17,150 Call it t. 
1105 01:02:17,150 --> 01:02:21,160 What you'd like to know is, is there a subset of the integers 1106 01:02:21,160 --> 01:02:22,550 that adds up to that target. 1107 01:02:34,380 --> 01:02:36,230 Can you choose a subset of the integers 1108 01:02:36,230 --> 01:02:39,800 so that-- I'll write it the sum of S. 1109 01:02:39,800 --> 01:02:43,670 But what this means is the sum over the ai's that 1110 01:02:43,670 --> 01:02:47,576 are in S of the value ai. 1111 01:02:47,576 --> 01:02:51,540 I want that to equal t. 1112 01:02:51,540 --> 01:02:52,980 So this is the definition. 1113 01:02:52,980 --> 01:02:55,206 This is the constraint. 1114 01:02:55,206 --> 01:02:56,580 So I give you a bunch of numbers. 1115 01:02:56,580 --> 01:02:58,480 Does any subset of them add up to t? 1116 01:02:58,480 --> 01:02:59,720 That's all this is asking. 1117 01:02:59,720 --> 01:03:02,740 This problem is NP-hard. 1118 01:03:02,740 --> 01:03:06,265 It's NP-complete, in fact, because you can guess which integers 1119 01:03:06,265 --> 01:03:08,140 should go in the subset, and then add them up 1120 01:03:08,140 --> 01:03:09,264 to see if you got it right. 1121 01:03:11,886 --> 01:03:14,320 It is NP-hard, but it's something special 1122 01:03:14,320 --> 01:03:15,960 we call weakly NP-hard. 1123 01:03:22,430 --> 01:03:28,687 And why don't I come back to the definition of that in a moment? 1124 01:03:28,687 --> 01:03:30,020 Let me first show you the proof. 1125 01:03:30,020 --> 01:03:33,510 It's actually really easy now that we have this three 1126 01:03:33,510 --> 01:03:37,120 dimensional matching problem. 1127 01:03:37,120 --> 01:03:38,820 It's pretty cool. 1128 01:03:38,820 --> 01:03:42,530 So these numbers are going to be huge. 1129 01:03:42,530 --> 01:03:48,600 What we're going to say is, let's view-- so again, 1130 01:03:48,600 --> 01:03:51,587 we're given a three dimensional matching instance. 1131 01:03:51,587 --> 01:03:52,670 Get the directions, right?
1132 01:03:52,670 --> 01:03:54,520 We're given a set of triples. 1133 01:03:54,520 --> 01:03:59,520 We want to solve this problem by reducing it to subset sum. 1134 01:03:59,520 --> 01:04:04,688 So we get to construct integers that represent triples. 1135 01:04:04,688 --> 01:04:06,220 That's what we're going to do. 1136 01:04:06,220 --> 01:04:09,477 So here we go. 1137 01:04:09,477 --> 01:04:10,560 We get to choose a number. 1138 01:04:10,560 --> 01:04:16,440 So I'm going to think of them in a particular base, b, 1139 01:04:16,440 --> 01:04:23,420 which is going to be 1 plus the max of the nxi's. 1140 01:04:23,420 --> 01:04:26,310 So again, this is the number of occurrences of variable xi 1141 01:04:26,310 --> 01:04:28,610 in a true or false form. 1142 01:04:28,610 --> 01:04:31,490 So I take the maximum occurrence of any variable, add 1. 1143 01:04:31,490 --> 01:04:32,490 That's my base. 1144 01:04:32,490 --> 01:04:34,170 It just has to be large enough. 1145 01:04:37,010 --> 01:04:43,420 And this is basically the entire reduction, is one line. 1146 01:04:43,420 --> 01:04:48,350 If I have three triples-- if I have a triple xi, xj, 1147 01:04:48,350 --> 01:04:53,530 xk, I'm going to convert that into a number that 1148 01:04:53,530 --> 01:05:03,150 looks like this where the one positions are-- I don't really 1149 01:05:03,150 --> 01:05:06,650 know the order, but they are i, j, and k. 1150 01:05:06,650 --> 01:05:08,190 Everything else is zero. 1151 01:05:08,190 --> 01:05:11,300 And this is in base b, not base 2. 1152 01:05:11,300 --> 01:05:12,460 It's a little weird. 1153 01:05:12,460 --> 01:05:16,120 All my digits are 0 or 1, but I'm in base b. 1154 01:05:16,120 --> 01:05:18,500 And three of the digits are 1. 1155 01:05:18,500 --> 01:05:19,720 And the rest are zero. 1156 01:05:19,720 --> 01:05:20,440 Why? 1157 01:05:20,440 --> 01:05:22,680 Because of my target sum. 1158 01:05:22,680 --> 01:05:28,470 Target sum is going to be 1111111111.
1159 01:05:28,470 --> 01:05:31,580 So this number, in algebra, you'd 1160 01:05:31,580 --> 01:05:36,280 write this as b to the i plus b to the j plus b to the k. 1161 01:05:36,280 --> 01:05:43,437 This you would write as the sum of b to the i for all i. 1162 01:05:43,437 --> 01:05:44,520 Do you see why this works? 1163 01:05:44,520 --> 01:05:45,992 It's actually really simple. 1164 01:05:51,890 --> 01:05:54,100 For this instance, my goal is to choose 1165 01:05:54,100 --> 01:05:57,890 a subset of these numbers that add up to this number. 1166 01:05:57,890 --> 01:05:59,330 How could that possibly happen? 1167 01:05:59,330 --> 01:06:01,810 Well, I've got to choose-- every time I choose 1168 01:06:01,810 --> 01:06:06,930 one of the numbers, those three digits get set to 1 in my sum. 1169 01:06:06,930 --> 01:06:10,790 If I ever have a collision, if I add two 1s together, 1170 01:06:10,790 --> 01:06:12,654 I'm going to get a 2. 1171 01:06:12,654 --> 01:06:14,320 That's not good, because once I get a 2, 1172 01:06:14,320 --> 01:06:16,390 I'll never be able to get back to a 1, 1173 01:06:16,390 --> 01:06:20,150 because my base is really big. 1174 01:06:20,150 --> 01:06:22,140 This base is designed so that the total-- this 1175 01:06:22,140 --> 01:06:26,060 is the total number of colliding 1s. 1176 01:06:26,060 --> 01:06:27,504 So we set it one larger than that, 1177 01:06:27,504 --> 01:06:29,920 which means you'll never get a carry when you're adding up 1178 01:06:29,920 --> 01:06:31,000 in this base. 1179 01:06:31,000 --> 01:06:34,420 That's why I set the base to be something large, not base 2. 1180 01:06:34,420 --> 01:06:39,040 Base 2 might work, but this is much safer. 1181 01:06:39,040 --> 01:06:44,540 So what that means is for each of these 1s in the target sum, 1182 01:06:44,540 --> 01:06:47,140 I've got to find a triple that has those 1s. 1183 01:06:47,140 --> 01:06:48,902 And those triples can't overlap.
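The one-line reduction just described can be sketched as follows. This is a hedged illustration, not the lecture's exact construction: the function name `triples_to_subset_sum` and the tagging of elements by set are assumptions, and the base is chosen here as 1 plus the total number of triples, which is simply one safe choice of "large enough" because no digit position can then ever produce a carry.

```python
def triples_to_subset_sum(X, Y, Z, T):
    """Map a 3DM instance to a subset sum instance (hypothetical helper)."""
    # Give every element of X, Y, and Z its own digit position.
    elements = ([("X", x) for x in X] + [("Y", y) for y in Y]
                + [("Z", z) for z in Z])
    position = {e: i for i, e in enumerate(elements)}
    # Assumption: base = number of triples + 1. At most len(T) ones can
    # ever land in the same digit, so adding numbers never carries.
    base = len(T) + 1
    # Each triple becomes b^i + b^j + b^k: three 1 digits, the rest 0.
    numbers = [
        base ** position[("X", x)]
        + base ** position[("Y", y)]
        + base ** position[("Z", z)]
        for x, y, z in T
    ]
    # The target is 111...1 in base b: every element covered exactly once.
    target = sum(base ** i for i in range(len(elements)))
    return numbers, target
```

Python integers have arbitrary precision, so the exponentially large values are not a problem for the sketch itself; their size is exactly what the weak NP-hardness discussion is about.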
1184 01:06:48,902 --> 01:06:51,360 So that means choosing a set of numbers that add up to this 1185 01:06:51,360 --> 01:06:54,460 is exactly the same as choosing a set of triples that 1186 01:06:54,460 --> 01:06:56,970 covers all of the elements. 1187 01:06:56,970 --> 01:07:03,150 Done, super easy once you have the right problem. 1188 01:07:03,150 --> 01:07:05,690 OK, good. 1189 01:07:05,690 --> 01:07:08,750 Now why do I call this weakly NP-hard? 1190 01:07:08,750 --> 01:07:12,010 Because these numbers are giant. 1191 01:07:12,010 --> 01:07:18,240 If I have n elements in X, Y, Z over there-- 1192 01:07:18,240 --> 01:07:22,130 I guess here they're called xi, yj, zk. 1193 01:07:22,130 --> 01:07:24,490 Sorry, maybe I should've called them that here. 1194 01:07:24,490 --> 01:07:26,980 Doesn't matter. 1195 01:07:26,980 --> 01:07:34,120 If I have n of those elements in X union Y union Z, 1196 01:07:34,120 --> 01:07:38,760 the number of digits here is n. 1197 01:07:38,760 --> 01:07:46,030 So the number of digits is order n. 1198 01:07:46,030 --> 01:07:48,430 This is fine from an NP-completeness standpoint. 1199 01:07:48,430 --> 01:07:49,600 This is polynomial size. 1200 01:07:49,600 --> 01:07:55,170 The number of digits in my numbers is a polynomial. 1201 01:07:55,170 --> 01:07:56,730 And this base is also pretty small. 1202 01:07:56,730 --> 01:07:58,124 So if you wrote it out in binary, 1203 01:07:58,124 --> 01:07:59,290 it would also be polynomial. 1204 01:07:59,290 --> 01:08:01,600 So just lost a log factor. 1205 01:08:01,600 --> 01:08:10,460 But the size of the numbers, the actual values of the numbers, 1206 01:08:10,460 --> 01:08:12,020 is exponential. 1207 01:08:17,350 --> 01:08:20,550 With weak NP-hardness, that's allowed. 1208 01:08:20,550 --> 01:08:23,920 With strong NP-hardness, that's forbidden. 1209 01:08:23,920 --> 01:08:26,460 In strong NP-hardness, you want the values of the numbers 1210 01:08:26,460 --> 01:08:28,370 to be polynomial.
1211 01:08:28,370 --> 01:08:30,950 So in this case, the number of bits is small, 1212 01:08:30,950 --> 01:08:34,220 but the actual values are giant, because you 1213 01:08:34,220 --> 01:08:36,271 have to exponentiate. 1214 01:08:36,271 --> 01:08:37,550 It would be cool. 1215 01:08:37,550 --> 01:08:39,279 And this problem is only weakly NP-hard. 1216 01:08:39,279 --> 01:08:41,700 Maybe you actually know a pseudo-polynomial time 1217 01:08:41,700 --> 01:08:42,660 algorithm for this. 1218 01:08:42,660 --> 01:08:44,500 It's basically a knapsack. 1219 01:08:44,500 --> 01:08:51,660 If these numbers have polynomial value, then you can basically, 1220 01:08:51,660 --> 01:08:53,800 in your subproblems in dynamic programming, 1221 01:08:53,800 --> 01:08:57,609 you can write down the number t and just 1222 01:08:57,609 --> 01:08:59,149 solve it for all values of t. 1223 01:08:59,149 --> 01:09:01,790 And it's easy to solve it in polynomial time, 1224 01:09:01,790 --> 01:09:05,302 polynomial in the integer values. 1225 01:09:05,302 --> 01:09:07,260 So we call that pseudo-polynomial, because it's 1226 01:09:07,260 --> 01:09:08,229 not really polynomial. 1227 01:09:08,229 --> 01:09:10,065 It's not polynomial in the number of digits 1228 01:09:10,065 --> 01:09:11,689 that you have to write down the number. 1229 01:09:11,689 --> 01:09:13,800 It's polynomial in the values. 1230 01:09:13,800 --> 01:09:18,180 Weak NP-hardness goes together with pseudo-polynomial. 1231 01:09:18,180 --> 01:09:20,540 That's kind of a matching result. Say look, 1232 01:09:20,540 --> 01:09:22,890 pseudo-polynomial is the best you can do. 1233 01:09:22,890 --> 01:09:25,520 You can't hope for a polynomial because if you 1234 01:09:25,520 --> 01:09:30,330 let the numbers get huge, then the problem is NP-complete. 1235 01:09:30,330 --> 01:09:34,029 But if you force the numbers to be small, this problem is easy. 1236 01:09:34,029 --> 01:09:37,010 So subset sum is a little funny in that sense.
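The pseudo-polynomial dynamic program mentioned above can be sketched like this. It is a standard knapsack-style table rather than anything written out in the lecture; the function name `subset_sum` is hypothetical, and the sketch assumes the integers are non-negative and that t is at least 0.

```python
def subset_sum(a, t):
    """Decide subset sum in O(n * t) time (hypothetical helper).

    Assumes every value in a is a non-negative integer and t >= 0.
    """
    # reachable[s] is True when some subset of the numbers processed
    # so far adds up to exactly s. The empty subset reaches 0.
    reachable = [True] + [False] * t
    for value in a:
        # Go from high sums to low sums so each number is used at most once.
        for s in range(t, value - 1, -1):
            if reachable[s - value]:
                reachable[s] = True
    return reachable[t]
```

The table has t + 1 entries, so the running time is polynomial in the value of t but exponential in the number of digits needed to write t down. On the numbers produced by the three dimensional matching reduction, t is exponentially large, so this table is of no help there.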
1237 01:09:40,131 --> 01:09:40,630 Cool. 1238 01:10:11,400 --> 01:10:15,327 Let me tell you about another problem, partition. 1239 01:10:19,360 --> 01:10:21,560 So partition is pretty much the same set up. 1240 01:10:21,560 --> 01:10:23,200 I'm given n integers. 1241 01:10:31,880 --> 01:10:33,860 Let's say they're positive. 1242 01:10:33,860 --> 01:10:39,040 And I want to know, is there a subset-- I'm not 1243 01:10:39,040 --> 01:10:41,200 given a target sum t. 1244 01:10:41,200 --> 01:10:42,655 Target sum is basically forced. 1245 01:10:50,240 --> 01:10:53,880 What I would like is the sum of all the values in S 1246 01:10:53,880 --> 01:10:55,950 to equal the sum of all the values 1247 01:10:55,950 --> 01:11:00,490 not in S. That's A minus S, which in other words 1248 01:11:00,490 --> 01:11:04,790 is going to be the sum of all values in A divided by 2. 1249 01:11:04,790 --> 01:11:08,610 So this is called partition because you're taking a set, 1250 01:11:08,610 --> 01:11:11,990 you're splitting it into two halves of equal sum. 1251 01:11:11,990 --> 01:11:14,290 Every element has to go in one of the two halves. 1252 01:11:14,290 --> 01:11:19,510 And they're called S and A minus S, like cuts in the flow stuff. 1253 01:11:19,510 --> 01:11:21,227 And you want those two halves to have 1254 01:11:21,227 --> 01:11:22,810 exactly the same sum, which means they 1255 01:11:22,810 --> 01:11:24,620 will be the sum divided by 2. 1256 01:11:24,620 --> 01:11:26,320 So that better be even, otherwise 1257 01:11:26,320 --> 01:11:27,840 it's not going to be possible. 1258 01:11:27,840 --> 01:11:30,190 So again, you want to decide whether this 1259 01:11:30,190 --> 01:11:34,720 is possible or impossible, yes or no. 1260 01:11:34,720 --> 01:11:38,200 I claim this problem is also weakly NP-complete, 1261 01:11:38,200 --> 01:11:42,934 and we can reduce from subset sum to partition. 
1262 01:11:42,934 --> 01:11:45,350 This is a little interesting because partition is actually 1263 01:11:45,350 --> 01:11:48,560 a special case of subset sum. 1264 01:11:48,560 --> 01:11:53,500 It is the case where t equals this. 1265 01:11:53,500 --> 01:11:56,490 Subset sum, you're trying to solve it no matter what t is. 1266 01:11:56,490 --> 01:11:58,280 t is a given input. 1267 01:11:58,280 --> 01:12:00,750 So there's more instances over here. 1268 01:12:00,750 --> 01:12:03,150 Some of them, some of these instances 1269 01:12:03,150 --> 01:12:05,650 are the case where t equals the sum over 2. 1270 01:12:05,650 --> 01:12:07,010 Those are partition instances. 1271 01:12:07,010 --> 01:12:09,550 So this is like a subset of the possible inputs 1272 01:12:09,550 --> 01:12:11,900 as over there, which means this problem is 1273 01:12:11,900 --> 01:12:16,470 easier than this one-- no harder anyway. 1274 01:12:16,470 --> 01:12:21,580 In other words, I can reduce partition to subset sum. 1275 01:12:21,580 --> 01:12:23,750 I just compute this value and set that to t, 1276 01:12:23,750 --> 01:12:25,880 and then leave the a's alone. 1277 01:12:25,880 --> 01:12:28,350 That will reduce partition to subset sum. 1278 01:12:28,350 --> 01:12:30,130 But that's not the direction I want. 1279 01:12:30,130 --> 01:12:32,690 I want to reduce from subset sum, a problem I can prove 1280 01:12:32,690 --> 01:12:35,360 is NP-complete, to partition, because I 1281 01:12:35,360 --> 01:12:37,400 want to prove that partition is NP-complete. 1282 01:12:37,400 --> 01:12:42,090 So in this case, there's an easy reduction in both directions. 1283 01:12:42,090 --> 01:12:44,750 This direction is a little harder. 1284 01:12:44,750 --> 01:12:51,410 So reduction from subset sum. 1285 01:12:55,440 --> 01:12:57,700 So I'm given a bunch of ai's. 1286 01:12:57,700 --> 01:12:59,380 I'm not going to touch them. 1287 01:12:59,380 --> 01:13:01,010 And I'm given a target sum t. 
1288 01:13:01,010 --> 01:13:04,820 And I basically want to make that target sum into this half. 1289 01:13:04,820 --> 01:13:09,480 To do that, I'm going to add two numbers to my set. 1290 01:13:09,480 --> 01:13:17,660 So I'm going to let sigma be the sum of the given a's. 1291 01:13:17,660 --> 01:13:21,300 And then I'm going to add-- so I'm given a1 through an. 1292 01:13:21,300 --> 01:13:30,720 I'm going to add an plus 1, is going to be sigma plus t. 1293 01:13:30,720 --> 01:13:40,822 And I'm going to add an plus 2 to be 2 sigma minus t. 1294 01:13:45,050 --> 01:13:45,990 Why? 1295 01:13:45,990 --> 01:13:48,040 So these are two basically huge numbers. 1296 01:13:51,100 --> 01:13:52,720 Because sigma is bigger than-- I mean, 1297 01:13:52,720 --> 01:13:54,386 it's the sum of all the numbers, so it's 1298 01:13:54,386 --> 01:13:56,140 bigger than all of them. 1299 01:13:56,140 --> 01:13:58,320 And so imagine for a moment that I 1300 01:13:58,320 --> 01:14:02,020 put these two in the same side of the partition. 1301 01:14:02,020 --> 01:14:04,100 I put them both in S or I put them both out 1302 01:14:04,100 --> 01:14:09,060 of S. Their sum by themselves is 3 sigma. 1303 01:14:09,060 --> 01:14:10,450 The t's cancel. 1304 01:14:10,450 --> 01:14:13,940 Whereas all the other items, their sum is sigma. 1305 01:14:13,940 --> 01:14:15,622 So I'm hosed. 1306 01:14:15,622 --> 01:14:17,830 If I have 3 sigma on one side and sigma on the other, 1307 01:14:17,830 --> 01:14:19,830 I'm not going to make them equal. 1308 01:14:19,830 --> 01:14:23,670 So in fact, these two elements have to be on opposite sides. 1309 01:14:23,670 --> 01:14:25,690 So there's a side that has sigma plus t. 1310 01:14:25,690 --> 01:14:28,877 There's a side that has 2 sigma minus t. 1311 01:14:28,877 --> 01:14:31,210 And then there's all the other n items, and some of them 1312 01:14:31,210 --> 01:14:32,990 are going to go to this side, some of them 1313 01:14:32,990 --> 01:14:35,660 are going to go to this side.
1314 01:14:35,660 --> 01:14:38,310 Their total value is sigma. 1315 01:14:38,310 --> 01:14:40,260 Right now this is close to sigma. 1316 01:14:40,260 --> 01:14:41,540 This is close to 2 sigma. 1317 01:14:41,540 --> 01:14:43,650 So they have to kind of meet in the middle. 1318 01:14:43,650 --> 01:14:54,540 In fact, what you'll have to do is add sigma minus t over here 1319 01:14:54,540 --> 01:15:01,194 and add t over here. 1320 01:15:01,194 --> 01:15:02,360 Think about it for a second. 1321 01:15:02,360 --> 01:15:04,910 If I add sigma minus t, this comes out to 2 sigma. 1322 01:15:04,910 --> 01:15:07,622 If I add t to this, this comes out to 2 sigma. 1323 01:15:07,622 --> 01:15:09,330 That would be good because they're equal. 1324 01:15:09,330 --> 01:15:11,490 And notice that this is sigma minus t. 1325 01:15:11,490 --> 01:15:12,130 This is t. 1326 01:15:12,130 --> 01:15:13,770 Their sum is sigma. 1327 01:15:13,770 --> 01:15:16,870 So in fact, it has to be like this. 1328 01:15:16,870 --> 01:15:20,210 You add something over here, and sigma minus something over here 1329 01:15:20,210 --> 01:15:21,980 for all the other ai's. 1330 01:15:21,980 --> 01:15:24,920 And the something has to be t in order for these two values 1331 01:15:24,920 --> 01:15:26,390 to equalize. 1332 01:15:26,390 --> 01:15:30,420 So in order to solve this slightly larger partition 1333 01:15:30,420 --> 01:15:33,640 problem, you have to actually solve the subset sum problem 1334 01:15:33,640 --> 01:15:37,540 because you have to construct a subset that adds up to t. 1335 01:15:37,540 --> 01:15:40,260 t was an arbitrary given value. 1336 01:15:40,260 --> 01:15:43,820 So this is pretty nifty. 1337 01:15:43,820 --> 01:15:47,990 We're adding some values so that the new target sum is the 50/50 1338 01:15:47,990 --> 01:15:51,330 split when we're given some values that 1339 01:15:51,330 --> 01:15:55,140 have an arbitrary target sum. 1340 01:15:55,140 --> 01:15:59,380 So partition is weakly NP-complete. 
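The reduction from subset sum to partition described above can be sketched in Python. This is a minimal illustration, not code from the lecture: the function names are made up here, the inputs are assumed to be positive integers with 0 <= t <= sigma, and the brute-force checkers take exponential time and exist only to test the equivalence on tiny inputs.

```python
from itertools import combinations


def subset_sum_to_partition(a, t):
    """Build the Partition instance for the Subset Sum instance (a, t).

    Assumes positive integers in a and 0 <= t <= sum(a).
    """
    sigma = sum(a)
    # The two added numbers: sigma + t and 2*sigma - t.
    return list(a) + [sigma + t, 2 * sigma - t]


def has_subset_sum(a, t):
    """Brute force (exponential): does some subset of a sum to t?"""
    return any(
        sum(chosen) == t
        for size in range(len(a) + 1)
        for chosen in combinations(a, size)
    )


def has_partition(b):
    """Brute force: can b be split into two sides of equal sum?"""
    total = sum(b)
    return total % 2 == 0 and has_subset_sum(b, total // 2)
```

For example, with a = [3, 5, 8, 11] and t = 13, sigma is 27 and the new instance is [3, 5, 8, 11, 40, 41]. Each side must total 54, and 41 + 5 + 8 does, so the subset {5, 8} that sums to t is recovered from the partition.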
1341 01:15:59,380 --> 01:16:03,220 Let me go to rectangle packing. 1342 01:16:27,690 --> 01:16:30,810 So rectangle packing-- I'm going to draw a picture. 1343 01:16:30,810 --> 01:16:34,780 I give you a bunch of rectangles of varying sizes. 1344 01:16:34,780 --> 01:16:37,692 And I give you a target rectangle. 1345 01:16:37,692 --> 01:16:42,240 Let's call it T. These are the Ri's. 1346 01:16:42,240 --> 01:16:48,900 I want to put these rectangles into this picture 1347 01:16:48,900 --> 01:16:52,000 without any overlaps. 1348 01:16:52,000 --> 01:16:54,630 Each of these rectangles here corresponds to one 1349 01:16:54,630 --> 01:16:56,510 of the rectangles over here. 1350 01:16:56,510 --> 01:17:01,180 So I'll tell you that the sum of the areas of these rectangles 1351 01:17:01,180 --> 01:17:04,370 is equal to the area of T. And the question is, can you 1352 01:17:04,370 --> 01:17:08,080 pack those rectangles into T without any overlaps, 1353 01:17:08,080 --> 01:17:10,850 and therefore without any gaps, because the areas are exactly 1354 01:17:10,850 --> 01:17:12,090 the same. 1355 01:17:12,090 --> 01:17:16,050 I claim this problem is weakly NP-hard-- I 1356 01:17:16,050 --> 01:17:21,065 guess NP-complete by reduction from partition. 1357 01:17:24,740 --> 01:17:31,800 This will be super easy if you followed what 1358 01:17:31,800 --> 01:17:33,270 the definition of partition is. 1359 01:17:33,270 --> 01:17:36,490 We're given some integers ai. 1360 01:17:36,490 --> 01:17:41,100 And we're going to take each of them and convert them into a, 1361 01:17:41,100 --> 01:17:46,330 let's say, 1 by 3ai rectangle. 1362 01:17:46,330 --> 01:17:49,490 Three is to avoid some rotation we'll see. 1363 01:17:49,490 --> 01:17:51,225 And then we're also given the targets. 1364 01:17:51,225 --> 01:17:53,330 Oh no, target sum is given. 1365 01:17:53,330 --> 01:17:55,210 Target sum is the sum over 2. 
1366 01:17:55,210 --> 01:17:59,520 But anyway, we're going to build our target rectangle 1367 01:17:59,520 --> 01:18:06,970 to be-- it's actually going to be really big. 1368 01:18:06,970 --> 01:18:13,820 It's going to be 2 by 3 times t. 1369 01:18:13,820 --> 01:18:16,020 So this is that thing. 1370 01:18:16,020 --> 01:18:21,580 So this is 3/2 sum of the a's. 1371 01:18:21,580 --> 01:18:24,720 OK, that's about it. 1372 01:18:24,720 --> 01:18:26,600 In order to pack these rectangles into here, 1373 01:18:26,600 --> 01:18:28,627 because each of them is at least three long, 1374 01:18:28,627 --> 01:18:29,960 you cannot pack them vertically. 1375 01:18:29,960 --> 01:18:31,790 They have to be horizontal. 1376 01:18:31,790 --> 01:18:35,320 So in fact what your packing will look like is 1377 01:18:35,320 --> 01:18:38,070 there'll be the top half and the bottom half. 1378 01:18:38,070 --> 01:18:40,620 And the top half, the total length of those rectangles 1379 01:18:40,620 --> 01:18:45,940 has to add up to 3/2 sum of A. Everything was scaled up by 3, 1380 01:18:45,940 --> 01:18:49,690 so that's 1/2 of A on the top and the bottom. 1381 01:18:49,690 --> 01:18:50,790 That's a partition. 1382 01:18:50,790 --> 01:18:52,620 In order to pack the rectangles into here, 1383 01:18:52,620 --> 01:18:55,570 you have to solve the partition problem, and vice versa. 1384 01:18:55,570 --> 01:18:57,580 Easy. 1385 01:18:57,580 --> 01:19:11,815 OK, let me show you one more thing, jigsaw puzzles. 1386 01:19:16,790 --> 01:19:21,360 This is not the jigsaw puzzles you grew up on, somewhat more 1387 01:19:21,360 --> 01:19:23,150 generalized. 1388 01:19:23,150 --> 01:19:27,286 So a piece is going to look something like this. 1389 01:19:30,480 --> 01:19:33,860 I drew them intentionally different. 1390 01:19:33,860 --> 01:19:36,820 So on each, you have a unit square. 1391 01:19:36,820 --> 01:19:39,620 Some of the sides can be flat. 1392 01:19:39,620 --> 01:19:40,840 Some of them can be tabs.
1393 01:19:40,840 --> 01:19:42,500 Some of them can be pockets. 1394 01:19:42,500 --> 01:19:45,000 Each tab and pocket has a shape. 1395 01:19:45,000 --> 01:19:48,110 And they're not in a perfect matching with each other. 1396 01:19:48,110 --> 01:19:50,950 So there could be seven of these tabs 1397 01:19:50,950 --> 01:19:53,550 and seven of these pockets, all the same shape. 1398 01:19:53,550 --> 01:19:55,810 This is what you might call ambiguous jigsaw puzzles. 1399 01:19:55,810 --> 01:19:58,240 Plus, there is no image on the piece, 1400 01:19:58,240 --> 01:20:01,900 so this is like hardcore jigsaw puzzles. 1401 01:20:01,900 --> 01:20:05,100 This is NP-complete. 1402 01:20:05,100 --> 01:20:10,860 And what I'd like to do is to simulate a rectangle 1403 01:20:10,860 --> 01:20:13,110 with a bunch of jigsaw pieces. 1404 01:20:13,110 --> 01:20:15,790 So it would look something like this. 1405 01:20:24,080 --> 01:20:27,350 If I have a 1 by something rectangle, 1406 01:20:27,350 --> 01:20:31,370 I'm going to simulate it with that same something, 1407 01:20:31,370 --> 01:20:34,020 little jigsaw pieces. 1408 01:20:34,020 --> 01:20:38,290 And I'm going to make these shapes only match each other. 1409 01:20:38,290 --> 01:20:41,140 And so for every rectangle, they're 1410 01:20:41,140 --> 01:20:43,920 going to have a different shape. 1411 01:20:43,920 --> 01:20:45,694 This one will be squares. 1412 01:20:45,694 --> 01:20:47,860 At that point I ran out of shapes I can easily draw, 1413 01:20:47,860 --> 01:20:48,970 but you get the idea. 1414 01:20:48,970 --> 01:20:50,980 Each rectangle has a different shape. 1415 01:20:50,980 --> 01:20:52,930 And so these have to match to each other. 1416 01:20:52,930 --> 01:20:55,000 You can't mix the tiles, which means you 1417 01:20:55,000 --> 01:20:56,470 have to build this rectangle. 1418 01:20:56,470 --> 01:20:58,300 You have to build this rectangle.
1419 01:20:58,300 --> 01:21:00,850 And then if the jigsaw problem is, can you 1420 01:21:00,850 --> 01:21:02,830 fit these into a given rectangle, 1421 01:21:02,830 --> 01:21:04,460 then you get rectangle packing. 1422 01:21:04,460 --> 01:21:07,185 But this is not a valid reduction. 1423 01:21:10,610 --> 01:21:15,050 You can't reduce from partition. 1424 01:21:18,550 --> 01:21:20,150 Why? 1425 01:21:20,150 --> 01:21:26,040 Because these numbers are huge. 1426 01:21:26,040 --> 01:21:28,430 Remember, the values of the numbers in my partition 1427 01:21:28,430 --> 01:21:30,550 instance are exponential. 1428 01:21:30,550 --> 01:21:34,970 So if I have a value ai and it's exponential in my problem size, 1429 01:21:34,970 --> 01:21:38,030 and I tried to make ai have little tiles, 1430 01:21:38,030 --> 01:21:40,040 that means the number of jigsaw pieces 1431 01:21:40,040 --> 01:21:42,814 will be exponential in n. 1432 01:21:42,814 --> 01:21:43,480 That's not good. 1433 01:21:43,480 --> 01:21:45,910 That's not allowed. 1434 01:21:45,910 --> 01:21:49,780 This is why weak NP-hardness is annoying. 1435 01:21:49,780 --> 01:21:54,550 So instead, we need a strong NP-hard problem. 1436 01:22:00,020 --> 01:22:01,810 This is a problem that's NP-hard even when 1437 01:22:01,810 --> 01:22:06,090 the numbers are polynomial in value, not just in size. 1438 01:22:06,090 --> 01:22:07,400 And it's called 4-partition. 1439 01:22:11,670 --> 01:22:16,970 4-partition, you're given n integers, as usual. 1440 01:22:20,380 --> 01:22:28,110 Say set is A. And you want to split those integers 1441 01:22:28,110 --> 01:22:39,990 into n over 4 quadruples of the same sum. 1442 01:22:46,860 --> 01:22:51,400 So this would be the sum of A divided by n over four. 1443 01:22:51,400 --> 01:22:53,240 That's your target sum. 1444 01:22:53,240 --> 01:22:55,300 So before we had to split into two parts that 1445 01:22:55,300 --> 01:22:56,500 had the same sum. 1446 01:22:56,500 --> 01:22:57,600 That was partition.
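The 4-partition definition just given can be sketched as a certificate checker in Python. This is an illustrative sketch under assumptions: the function name and the representation of a proposed solution (a list of 4-tuples of indices into the input list) are choices made here, not something stated in the lecture.

```python
def is_valid_4_partition(a, groups):
    """Check a proposed 4-partition of the integer list a.

    groups is assumed to be a list of 4-tuples of indices into a.
    The target sum is sum(a) divided by n/4, as in the definition above.
    """
    n = len(a)
    if n == 0 or n % 4 != 0 or len(groups) != n // 4:
        return False
    total = sum(a)
    if total % (n // 4) != 0:
        return False  # the target sum would not be an integer
    target = total // (n // 4)
    # Every index must be used exactly once across all groups.
    used = sorted(i for group in groups for i in group)
    if used != list(range(n)):
        return False
    # Each group has exactly four integers that add up to the target.
    return all(
        len(group) == 4 and sum(a[i] for i in group) == target
        for group in groups
    )
```

The check runs in polynomial time, which is the sense in which a proposed grouping is easy to verify even though finding one is hard.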
1447 01:22:57,600 --> 01:23:00,330 Now we have to split into n over 4 parts. 1448 01:23:00,330 --> 01:23:04,040 Each part will have exactly four numbers, four integers. 1449 01:23:04,040 --> 01:23:06,840 And they should all have the same sum. 1450 01:23:06,840 --> 01:23:09,660 This problem is hard even when the integers 1451 01:23:09,660 --> 01:23:12,345 have polynomial value. 1452 01:23:12,345 --> 01:23:20,765 So the values are at most some polynomial in n. 1453 01:23:20,765 --> 01:23:22,890 I won't prove it here, but it's in my lecture notes 1454 01:23:22,890 --> 01:23:23,639 if you're curious. 1455 01:23:23,639 --> 01:23:27,290 It's like this proof, but harder. 1456 01:23:27,290 --> 01:23:30,090 You end up, instead of having n digit numbers, 1457 01:23:30,090 --> 01:23:32,080 you have five digit numbers. 1458 01:23:32,080 --> 01:23:36,130 Each digit only has a polynomial in n different values. 1459 01:23:36,130 --> 01:23:40,470 So the total value of the numbers is only polynomial. 1460 01:23:40,470 --> 01:23:43,380 It's like n to the fifth or something. 1461 01:23:43,380 --> 01:23:46,160 Good news is that this reduction I just 1462 01:23:46,160 --> 01:23:56,710 gave you is also a reduction from 4-partition 1463 01:23:56,710 --> 01:23:59,950 because it's the same set up. 1464 01:23:59,950 --> 01:24:01,260 Again, I'm given integers. 1465 01:24:01,260 --> 01:24:05,340 Each integer I'm going to represent by that many tiles. 1466 01:24:05,340 --> 01:24:07,390 Now the number of tiles is only polynomial, 1467 01:24:07,390 --> 01:24:09,560 so this is a valid reduction. 1468 01:24:09,560 --> 01:24:11,650 And again, if I have to pack all of these tiles 1469 01:24:11,650 --> 01:24:14,360 into a rectangular board, that's exactly the same 1470 01:24:14,360 --> 01:24:17,530 as packing these integers. 1471 01:24:17,530 --> 01:24:19,670 Well, I guess I should do rectangle packing again. 1472 01:24:19,670 --> 01:24:22,590 So this is a proof rectangle packing was weakly NP-hard. 
1473 01:24:22,590 --> 01:24:24,786 But in fact it's strongly NP-hard. 1474 01:24:24,786 --> 01:24:26,160 You just change these dimensions. 1475 01:24:26,160 --> 01:24:33,470 You say well, I need whatever, n over 4 different parts, each 1476 01:24:33,470 --> 01:24:36,820 of size the sum over n over 4. 1477 01:24:36,820 --> 01:24:38,260 You need some scale factor here. 1478 01:24:38,260 --> 01:24:39,360 Three doesn't work. 1479 01:24:39,360 --> 01:24:43,950 Use n or something-- n and n. 1480 01:24:43,950 --> 01:24:46,330 That will prove that rectangle packing is actually 1481 01:24:46,330 --> 01:24:49,510 strongly NP-hard because we're reducing from 4-partition 1482 01:24:49,510 --> 01:24:50,612 instead of partition. 1483 01:24:50,612 --> 01:24:52,320 And then you can reduce rectangle packing 1484 01:24:52,320 --> 01:24:55,187 to jigsaw puzzles because you have strong hardness over here. 1485 01:24:55,187 --> 01:24:56,520 Over here we don't have numbers. 1486 01:24:56,520 --> 01:24:59,800 We just have these pieces. 1487 01:24:59,800 --> 01:25:01,710 So whenever you convert from a number problem 1488 01:25:01,710 --> 01:25:05,110 to a non-number problem, if you're representing the numbers 1489 01:25:05,110 --> 01:25:07,060 in unary, which is what's going on here, 1490 01:25:07,060 --> 01:25:09,620 you need strong NP-hardness for it to work. 1491 01:25:09,620 --> 01:25:11,340 Weak NP-hardness isn't enough. 1492 01:25:11,340 --> 01:25:13,970 Then we get jigsaw puzzles, which we know and love, 1493 01:25:13,970 --> 01:25:15,220 are NP-complete. 1494 01:25:15,220 --> 01:25:17,070 That's it.
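The point about unary encodings can be made concrete with a small Python sketch. It builds the rectangle packing instance from the earlier partition reduction (each a_i becomes a 1 by 3*a_i rectangle, and the target is 2 by 3/2 times the sum) and then counts the unit jigsaw pieces needed to simulate those rectangles. The function names and the (height, width) tuple representation are assumptions made for this illustration.

```python
def partition_to_rectangle_packing(a):
    """Return (rectangles, target) as (height, width) pairs.

    Each integer a_i becomes a 1 x 3*a_i rectangle; the target
    rectangle is 2 x (3/2)*sum(a). Assumes positive integers.
    """
    total = sum(a)
    if total % 2 != 0:
        raise ValueError("odd total: no equal split exists")
    rectangles = [(1, 3 * ai) for ai in a]
    target = (2, 3 * total // 2)
    return rectangles, target


def jigsaw_piece_count(rectangles):
    """Unit pieces needed if a 1 x w rectangle is simulated by w pieces."""
    return sum(height * width for height, width in rectangles)
```

With a = [1, 2, 3] the rectangles are 1 by 3, 1 by 6, and 1 by 9, the target is 2 by 9, and 18 pieces are needed. With a = [2**40, 2**40] the input is still only two numbers of about 41 bits each, but the piece count is 3 * 2**41. That blowup is why the jigsaw reduction needs numbers that are polynomial in value, as in 4-partition.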