1 00:00:00,070 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,236 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,236 --> 00:00:17,861 at ocw.mit.edu. 8 00:00:22,740 --> 00:00:24,240 AMARTYA SHANKHA BISWAS: Let's start. 9 00:00:24,240 --> 00:00:28,290 So today we're going to do some NP-hardness reductions. 10 00:00:28,290 --> 00:00:32,000 So let's just do a quick recap of P and NP. 11 00:00:32,000 --> 00:00:37,920 So let's see. 12 00:00:37,920 --> 00:00:51,150 So P is-- so you have a decision problem. 13 00:00:51,150 --> 00:00:55,010 So you have an input x and you have some algorithm, A, 14 00:00:55,010 --> 00:00:58,710 and it spits out an answer, which is either 0 or 1. 15 00:00:58,710 --> 00:00:59,950 So that's a decision problem. 16 00:00:59,950 --> 00:01:02,900 And if a problem is in P, this algorithm A 17 00:01:02,900 --> 00:01:05,485 runs in polynomial time. 18 00:01:05,485 --> 00:01:07,410 So what about NP? 19 00:01:07,410 --> 00:01:11,860 So NP is when the solution is verifiable in polynomial time. 20 00:01:11,860 --> 00:01:15,952 So let's say you have an input x and some oracle, which 21 00:01:15,952 --> 00:01:17,660 is [INAUDIBLE] run exponential time or 22 00:01:17,660 --> 00:01:21,250 has infinite computation time, gives you an answer, 23 00:01:21,250 --> 00:01:25,830 so you get x and you get an answer which is either 0 or 1. 24 00:01:25,830 --> 00:01:27,870 And you also get a certificate. 25 00:01:27,870 --> 00:01:29,250 So let's call that a certificate. 
26 00:01:29,250 --> 00:01:32,160 And given these three values, you 27 00:01:32,160 --> 00:01:34,270 can verify whether the solution was correct or not 28 00:01:34,270 --> 00:01:36,080 in polynomial time. 29 00:01:36,080 --> 00:01:37,690 Make sense? 30 00:01:37,690 --> 00:01:39,400 So clearly if you're going to compute 31 00:01:39,400 --> 00:01:40,540 the answer in polynomial time, you 32 00:01:40,540 --> 00:01:41,998 can also verify in polynomial time. 33 00:01:41,998 --> 00:01:44,260 So P is a subset of NP. 34 00:01:44,260 --> 00:01:46,490 And so that's it. 35 00:01:46,490 --> 00:01:50,580 So now, NP-hard problems are problems 36 00:01:50,580 --> 00:01:53,620 that are at least as hard to solve as any problem in NP. 37 00:01:53,620 --> 00:01:56,290 And so now today we're going to be doing some reductions. 38 00:01:56,290 --> 00:01:58,400 So I think we did some of them in class, 39 00:01:58,400 --> 00:02:00,150 Nintendo games or something. 40 00:02:00,150 --> 00:02:01,120 Is that what we did? 41 00:02:01,120 --> 00:02:04,140 So today we are going to do some less interesting examples, 42 00:02:04,140 --> 00:02:06,960 but let's see. 43 00:02:06,960 --> 00:02:09,900 So how do we do reductions? 44 00:02:09,900 --> 00:02:14,130 So if we know that we have a problem A, 45 00:02:14,130 --> 00:02:16,840 which is a hard problem, and we want 46 00:02:16,840 --> 00:02:18,550 to show that problem B is hard. 47 00:02:25,010 --> 00:02:27,144 So we want to draw this implication. 48 00:02:27,144 --> 00:02:28,810 So if we want to draw this implication-- 49 00:02:28,810 --> 00:02:38,140 so this is equivalent to saying that if B is easy, 50 00:02:38,140 --> 00:02:41,003 then A is easy. 51 00:02:41,003 --> 00:02:41,969 Sorry, other way. 52 00:02:45,350 --> 00:02:46,830 This is the contrapositive. 53 00:02:46,830 --> 00:02:50,550 So assuming that B has a polynomial time solution, 54 00:02:50,550 --> 00:02:52,310 A has a polynomial time solution. 
55 00:02:52,310 --> 00:02:54,080 And that statement is equivalent to saying if A is hard, 56 00:02:54,080 --> 00:02:54,579 B is hard. 57 00:02:54,579 --> 00:02:56,510 So if you know that A is an NP-hard problem, 58 00:02:56,510 --> 00:02:58,870 we can say that B is an NP-hard problem. 59 00:02:58,870 --> 00:03:01,980 So if we want to show that B is an NP-hard problem, 60 00:03:01,980 --> 00:03:04,180 first we show that B is in NP, and then we 61 00:03:04,180 --> 00:03:06,850 show that B is hard. 62 00:03:06,850 --> 00:03:14,250 So the way we do this step is-- so your A looks like this. 63 00:03:14,250 --> 00:03:20,120 You have your algorithm, and it spits out an answer, 0 or 1. 64 00:03:22,900 --> 00:03:25,370 And B looks like this. 65 00:03:25,370 --> 00:03:29,523 Let's say you have an input y and it spits out 66 00:03:29,523 --> 00:03:32,870 an answer, 0 or 1. 67 00:03:32,870 --> 00:03:38,810 And you want to find a function R which takes your x 68 00:03:38,810 --> 00:03:40,980 and sends it to y. 69 00:03:40,980 --> 00:03:45,370 So basically, if you know how to solve B very fast, then 70 00:03:45,370 --> 00:03:54,190 you can take an input to A. So you take x, you transform it, 71 00:03:54,190 --> 00:03:57,390 and then you apply B. And the condition 72 00:03:57,390 --> 00:04:02,760 is that A applied to x is the same as B applied to R of x. 73 00:04:02,760 --> 00:04:04,450 So basically, what you are doing is 74 00:04:04,450 --> 00:04:08,380 you're showing that A is easy by showing that you can 75 00:04:08,380 --> 00:04:09,920 use-- so let's say B is easy. 76 00:04:09,920 --> 00:04:13,040 We can use B to compute A, so then A must be easy. 77 00:04:13,040 --> 00:04:14,680 But since we know that A is hard, 78 00:04:14,680 --> 00:04:17,500 there's something wrong in our logic, so B must be hard. 79 00:04:17,500 --> 00:04:20,765 Does that make sense? 80 00:04:20,765 --> 00:04:22,250 Yes? 81 00:04:22,250 --> 00:04:25,030 Sort of? 
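[Editor's sketch, not part of the recording.] The condition A(x) = B(R(x)) described above can be written as a small Python sketch. The names `solve_via_reduction`, `reduce_r`, and `solve_b` are hypothetical, and the toy problems (A: "is x even?", B: "is y odd?", R: add one) are assumptions chosen only to make the composition visible; they are not from the lecture.

```python
from typing import Callable, TypeVar

X = TypeVar("X")  # type of inputs to problem A
Y = TypeVar("Y")  # type of inputs to problem B


def solve_via_reduction(
    reduce_r: Callable[[X], Y], solve_b: Callable[[Y], bool]
) -> Callable[[X], bool]:
    """Build a solver for A out of a reduction R and a solver for B."""

    def solve_a(x: X) -> bool:
        # Correct whenever A(x) == B(R(x)) holds for every input x.
        return solve_b(reduce_r(x))

    return solve_a


# Toy, assumed problems: A = "is x even?", B = "is y odd?", R(x) = x + 1.
is_odd = lambda y: y % 2 == 1
is_even = solve_via_reduction(lambda x: x + 1, is_odd)
```

If both `reduce_r` and `solve_b` run in polynomial time, so does `solve_a`, which is the "B easy implies A easy" direction the argument relies on.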
82 00:04:25,030 --> 00:04:31,491 So let's move onto an actual problem. 83 00:04:31,491 --> 00:04:33,240 So the first problem we're going to reduce 84 00:04:33,240 --> 00:04:35,710 is the Hamiltonian path. 85 00:04:35,710 --> 00:04:41,990 So a well-known NP-hard problem is the Hamiltonian cycle. 86 00:04:41,990 --> 00:04:52,931 So here our A is-- so it's a Hamiltonian cycle. 87 00:04:52,931 --> 00:04:54,181 So what's a Hamiltonian cycle? 88 00:04:54,181 --> 00:04:55,430 So what's a Hamiltonian cycle? 89 00:04:55,430 --> 00:04:58,110 So a Hamiltonian cycle-- so let's say you have a graph. 90 00:04:58,110 --> 00:04:59,660 So we have this graph. 91 00:04:59,660 --> 00:05:00,813 Let me draw this out. 92 00:05:05,730 --> 00:05:06,230 That's it. 93 00:05:10,525 --> 00:05:12,860 So a Hamiltonian cycle is a cycle 94 00:05:12,860 --> 00:05:15,250 in the graph which starts at some vertex, 95 00:05:15,250 --> 00:05:17,400 visits all the other vertices, and comes back 96 00:05:17,400 --> 00:05:19,090 to the starting vertex. 97 00:05:19,090 --> 00:05:23,820 So in this case, we could do something like go here, 98 00:05:23,820 --> 00:05:29,970 and then take this vertex, take this vertex, take this vertex, 99 00:05:29,970 --> 00:05:31,189 and come back here. 100 00:05:31,189 --> 00:05:32,730 So that is a valid Hamiltonian cycle. 101 00:05:32,730 --> 00:05:34,750 So this graph has a Hamiltonian cycle. 102 00:05:34,750 --> 00:05:38,010 So the decision problem here is that given the graph, 103 00:05:38,010 --> 00:05:39,905 does it have a Hamiltonian cycle? 104 00:05:39,905 --> 00:05:42,030 And that problem is NP-hard, so you can [INAUDIBLE] 105 00:05:42,030 --> 00:05:43,220 polynomial [INAUDIBLE]. 106 00:05:43,220 --> 00:05:46,320 So now the new problem we want to show is NP-hard, 107 00:05:46,320 --> 00:05:49,340 which is B, is Hamiltonian path. 108 00:05:56,960 --> 00:05:59,340 So the Hamiltonian path is a very similar problem. 
109 00:05:59,340 --> 00:06:01,589 Instead of a cycle, you remove the requirement 110 00:06:01,589 --> 00:06:03,630 that you have to come back to the starting point. 111 00:06:03,630 --> 00:06:07,290 You can just start anywhere and [INAUDIBLE] all the vertices 112 00:06:07,290 --> 00:06:08,080 and stop. 113 00:06:08,080 --> 00:06:10,840 So for example, if you remove this edge, 114 00:06:10,840 --> 00:06:13,390 this graph no longer has a Hamiltonian cycle, 115 00:06:13,390 --> 00:06:16,737 but it has a Hamiltonian path, which is just this line. 116 00:06:16,737 --> 00:06:18,050 Simple. 117 00:06:18,050 --> 00:06:20,680 So this is a simple reduction because the problems 118 00:06:20,680 --> 00:06:23,060 are very similar. 119 00:06:23,060 --> 00:06:25,460 So the first step is, of course, showing 120 00:06:25,460 --> 00:06:27,060 that Hamiltonian path is in NP. 121 00:06:31,740 --> 00:06:34,280 So that should be pretty clear because-- so what is 122 00:06:34,280 --> 00:06:35,374 our certificate here? 123 00:06:35,374 --> 00:06:37,790 So if someone says, OK, I have solved the Hamiltonian path 124 00:06:37,790 --> 00:06:40,157 and this is my Hamiltonian path. 125 00:06:40,157 --> 00:06:42,490 And he gives you a certificate which is the actual path. 126 00:06:42,490 --> 00:06:45,440 So you can always look at the certificate, check the path, 127 00:06:45,440 --> 00:06:47,070 and see if it's a valid path. 128 00:06:47,070 --> 00:06:49,636 And then you can verify the answer in polynomial time. 129 00:06:49,636 --> 00:06:53,250 So that's a linear time verification. 130 00:06:53,250 --> 00:06:55,130 So this is true. 131 00:06:55,130 --> 00:07:00,820 So now what we're going to show is B is hard. 132 00:07:03,670 --> 00:07:06,540 So the way we do that is we do a reduction. 133 00:07:06,540 --> 00:07:09,720 So the reduction is given an input to the original problem, 134 00:07:09,720 --> 00:07:13,250 A. So let's say an input to A is in the form of a graph. 
135 00:07:18,240 --> 00:07:23,210 And now you have to transform this graph somehow to G dash. 136 00:07:28,690 --> 00:07:31,800 And then you have to argue that-- so this 137 00:07:31,800 --> 00:07:35,450 is the transformation R-- and you 138 00:07:35,450 --> 00:07:39,409 have to argue that the solution, the answer that this 139 00:07:39,409 --> 00:07:41,200 will spit out, is the same answer that this 140 00:07:41,200 --> 00:07:42,201 would spit out. 141 00:07:42,201 --> 00:07:43,950 So let's look at the transformation first. 142 00:07:43,950 --> 00:07:45,460 Anyone have any ideas? 143 00:07:45,460 --> 00:07:48,690 So you have a graph and you're using 144 00:07:48,690 --> 00:07:50,890 that to solve the Hamiltonian cycle problem. 145 00:07:50,890 --> 00:07:52,710 So how do you transform it in a way 146 00:07:52,710 --> 00:07:55,297 that lets you cast it into the Hamiltonian path formulation? 147 00:07:58,159 --> 00:07:59,026 Yes. 148 00:07:59,026 --> 00:08:00,067 AUDIENCE: Remove an edge? 149 00:08:02,929 --> 00:08:04,680 AMARTYA SHANKHA BISWAS: Not exactly. 150 00:08:04,680 --> 00:08:06,430 So which edge would you remove? 151 00:08:06,430 --> 00:08:08,000 You can't find the Hamiltonian cycle. 152 00:08:08,000 --> 00:08:09,480 OK, important point. 153 00:08:09,480 --> 00:08:11,990 So there's no point doing this reduction 154 00:08:11,990 --> 00:08:14,310 unless this reduction is itself polynomial 155 00:08:14,310 --> 00:08:15,900 because otherwise, this whole strategy 156 00:08:15,900 --> 00:08:19,170 of transforming and then using B to find A doesn't work. 157 00:08:19,170 --> 00:08:21,680 Because if the reduction is exponential time, 158 00:08:21,680 --> 00:08:22,820 that doesn't help you. 159 00:08:22,820 --> 00:08:23,460 So it-- 160 00:08:23,460 --> 00:08:24,876 AUDIENCE: Try removing every edge. 161 00:08:26,904 --> 00:08:29,070 AMARTYA SHANKHA BISWAS: So if you remove some edges, 162 00:08:29,070 --> 00:08:31,660 you'll see that there is still a Hamiltonian cycle. 
163 00:08:31,660 --> 00:08:37,210 But you remove some edges-- you can't tell. 164 00:08:37,210 --> 00:08:39,809 You don't know which edge to remove. 165 00:08:39,809 --> 00:08:41,447 So a better way to do it is this. 166 00:08:41,447 --> 00:08:43,280 So let's say this is the rest of your graph. 167 00:08:43,280 --> 00:08:44,890 And you just look at one vertex. 168 00:08:44,890 --> 00:08:47,439 So look at one vertex, V. And let's say 169 00:08:47,439 --> 00:08:48,480 this is a directed graph. 170 00:08:48,480 --> 00:08:49,430 If it's an undirected graph, you can just 171 00:08:49,430 --> 00:08:52,380 like add one edge there and one edge back for everything. 172 00:08:52,380 --> 00:08:55,640 So now you add a directed edge along this. 173 00:08:55,640 --> 00:08:56,140 I'm sorry. 174 00:08:56,140 --> 00:08:57,670 You look at all the directed edges. 175 00:08:57,670 --> 00:09:00,100 So let's say you have some edges coming in 176 00:09:00,100 --> 00:09:01,540 and you have some edges going out. 177 00:09:04,470 --> 00:09:06,317 So this is just a vertex and you just 178 00:09:06,317 --> 00:09:07,900 look at the rest of the graph and look 179 00:09:07,900 --> 00:09:10,191 at all the edges coming in and all the edges going out. 180 00:09:10,191 --> 00:09:12,460 So this is in A. This is the original problem. 181 00:09:12,460 --> 00:09:20,650 And you transform this into-- you split the vertex into two, 182 00:09:20,650 --> 00:09:23,340 so V dash and V double-dash, let's say. 183 00:09:23,340 --> 00:09:28,940 And in one of them, you keep all the incoming edges 184 00:09:28,940 --> 00:09:33,041 and the other one contains all the outgoing edges. 185 00:09:33,041 --> 00:09:35,040 Does that transformation make sense intuitively? 186 00:09:37,660 --> 00:09:39,640 So what do you have here? 187 00:09:39,640 --> 00:09:45,190 So here you had-- let's say this graph had a Hamiltonian cycle. 
188 00:09:45,190 --> 00:09:47,530 So this graph had some cycle which 189 00:09:47,530 --> 00:09:50,640 went up here, did something, something, and came back. 190 00:09:50,640 --> 00:09:51,730 So it would go like this. 191 00:09:51,730 --> 00:09:53,230 It would do the cycle and come back. 192 00:09:53,230 --> 00:09:54,470 So there was some cycle. 193 00:09:54,470 --> 00:09:57,550 And since the cycle, it contains V, so now what you're doing 194 00:09:57,550 --> 00:10:00,090 is you're splitting apart V and disconnecting them. 195 00:10:00,090 --> 00:10:03,410 So you'll still have-- if you look at the original path, 196 00:10:03,410 --> 00:10:06,460 it's still there, but it's been split up into a path now. 197 00:10:06,460 --> 00:10:09,600 It's no longer a cycle. 198 00:10:09,600 --> 00:10:10,720 Make sense? 199 00:10:10,720 --> 00:10:13,350 So now let's argue this more rigorously. 200 00:10:13,350 --> 00:10:17,540 So what we want to say here is that let's say 201 00:10:17,540 --> 00:10:20,955 there was a cycle here. 202 00:10:20,955 --> 00:10:25,400 If there was a cycle here, then is it clear 203 00:10:25,400 --> 00:10:27,060 that there is a path here? 204 00:10:27,060 --> 00:10:31,012 Because just take the same edges that you had before. 205 00:10:31,012 --> 00:10:32,720 If you take the same edges, they will now 206 00:10:32,720 --> 00:10:34,460 form a path instead of a cycle. 207 00:10:34,460 --> 00:10:37,440 So cycle implies path. 208 00:10:37,440 --> 00:10:38,480 Does that makes sense? 209 00:10:41,960 --> 00:10:44,570 So the other way is a little more tricky. 210 00:10:44,570 --> 00:10:46,200 So let's say you have a path. 211 00:10:51,240 --> 00:10:53,050 So let's say you had a path. 212 00:10:53,050 --> 00:10:59,180 So that means that-- let's redraw this so it's more clear. 213 00:10:59,180 --> 00:11:02,500 So you have this new graph where you have two vertices, V dash 214 00:11:02,500 --> 00:11:03,250 and V double-dash. 
215 00:11:06,120 --> 00:11:08,366 This has a bunch of incoming edges 216 00:11:08,366 --> 00:11:10,130 and this has a bunch of outgoing edges. 217 00:11:13,540 --> 00:11:18,402 So now let's say you have a Hamiltonian path in this graph. 218 00:11:18,402 --> 00:11:19,360 So what does that mean? 219 00:11:19,360 --> 00:11:22,160 So where can the Hamiltonian path start? 220 00:11:22,160 --> 00:11:23,890 Can it start anywhere? 221 00:11:23,890 --> 00:11:25,420 Where can it start? 222 00:11:25,420 --> 00:11:26,420 AUDIENCE: V double-dash. 223 00:11:26,420 --> 00:11:27,910 AMARTYA SHANKHA BISWAS: Right, because V double-dash doesn't 224 00:11:27,910 --> 00:11:28,780 have any incoming edges, so it can't 225 00:11:28,780 --> 00:11:29,988 be in the middle of the path. 226 00:11:29,988 --> 00:11:31,590 So it has to start here. 227 00:11:31,590 --> 00:11:32,660 So it starts. 228 00:11:32,660 --> 00:11:34,055 It does something in there. 229 00:11:34,055 --> 00:11:34,930 And where can it end? 230 00:11:34,930 --> 00:11:36,930 It can only end, similarly, in V dash. 231 00:11:36,930 --> 00:11:38,240 Because V dash doesn't have any outgoing edges, 232 00:11:38,240 --> 00:11:39,948 so it can't be in the middle of the path. 233 00:11:39,948 --> 00:11:41,520 So it has to end in V dash. 234 00:11:41,520 --> 00:11:44,490 So now, if you have a path like that, 235 00:11:44,490 --> 00:11:49,185 and you go back to this graph-- so V dash and V double-dash 236 00:11:49,185 --> 00:11:50,310 are now the same vertex. 237 00:11:50,310 --> 00:11:52,494 And that path just becomes a cycle now. 238 00:11:52,494 --> 00:11:53,410 So path implies cycle. 239 00:11:56,380 --> 00:11:59,000 So now what we have is previously what we had, right? 240 00:11:59,000 --> 00:12:07,270 So now we know that A of G here is equal to B of G dash. 241 00:12:07,270 --> 00:12:09,400 So G dash was the transformation. 
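[Editor's sketch, not part of the recording.] The vertex-splitting transformation discussed above can be sketched in Python as follows. The function names and the labels `v_in` / `v_out` (standing for V dash, which keeps the incoming edges, and V double-dash, which keeps the outgoing edges) are assumptions for illustration. The two brute-force checkers are exponential-time and exist only to sanity-check the claim A(G) = B(G dash) on tiny directed graphs.

```python
from itertools import permutations


def split_vertex(vertices, edges, v, v_in="v_in", v_out="v_out"):
    """Split v into v_in (keeps incoming edges) and v_out (keeps outgoing edges)."""
    new_vertices = [u for u in vertices if u != v] + [v_in, v_out]
    new_edges = [
        (v_out if a == v else a, v_in if b == v else b) for a, b in edges
    ]
    return new_vertices, new_edges


def has_hamiltonian_path(vertices, edges):
    """Brute force: try every ordering of the vertices (tiny graphs only)."""
    edge_set = set(edges)
    return any(
        all((p[i], p[i + 1]) in edge_set for i in range(len(p) - 1))
        for p in permutations(vertices)
    )


def has_hamiltonian_cycle(vertices, edges):
    """Brute force: like the path check, but the last vertex must link back."""
    edge_set = set(edges)
    return any(
        all((p[i], p[(i + 1) % len(p)]) in edge_set for i in range(len(p)))
        for p in permutations(vertices)
    )
```

Note that `split_vertex` itself makes a single pass over the edge list, so the transformation is linear in the size of the graph under this representation.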
242 00:12:09,400 --> 00:12:11,359 Also notice that the transformation was just 243 00:12:11,359 --> 00:12:12,400 splitting apart a vertex. 244 00:12:12,400 --> 00:12:15,389 So depending on your representation of your graph, 245 00:12:15,389 --> 00:12:17,680 it'll take something, like constant time or linear time 246 00:12:17,680 --> 00:12:19,600 or something polynomial, essentially. 247 00:12:19,600 --> 00:12:23,610 So you get a polynomial-time reduction. 248 00:12:23,610 --> 00:12:25,871 After reduction, you show that the answer 249 00:12:25,871 --> 00:12:27,870 to the reduced problem is the same as the answer 250 00:12:27,870 --> 00:12:29,230 to the original problem. 251 00:12:29,230 --> 00:12:32,360 And that means that this is also an NP-hard problem 252 00:12:32,360 --> 00:12:35,850 by the argument given here. 253 00:12:35,850 --> 00:12:37,090 Questions? 254 00:12:37,090 --> 00:12:38,388 Does that make sense? 255 00:12:38,388 --> 00:12:38,888 Yes. 256 00:12:38,888 --> 00:12:40,098 AUDIENCE: So are you creating two vertices 257 00:12:40,098 --> 00:12:41,310 for every vertex in the graph? 258 00:12:41,310 --> 00:12:42,315 AMARTYA SHANKHA BISWAS: No, just this one. 259 00:12:42,315 --> 00:12:44,000 Just pick any vertex-- doesn't matter-- 260 00:12:44,000 --> 00:12:45,360 because you have a cycle. 261 00:12:45,360 --> 00:12:48,896 So if you take any vertex and split it apart, you get a path. 262 00:12:48,896 --> 00:12:51,206 AUDIENCE: Oh, perfect. 263 00:12:51,206 --> 00:12:53,060 AMARTYA SHANKHA BISWAS: Anything else? 264 00:12:53,060 --> 00:12:56,000 OK, let's move on to the next one. 265 00:13:02,120 --> 00:13:03,060 Grab a new board. 266 00:13:06,360 --> 00:13:30,700 So the next problem is-- so given the graph, 267 00:13:30,700 --> 00:13:33,120 is there a k-clique? 268 00:13:33,120 --> 00:13:34,485 Do people know what a clique is? 269 00:13:37,120 --> 00:13:38,060 So a clique is this. 270 00:13:38,060 --> 00:13:40,530 So a clique is a set of vertices. 
271 00:13:40,530 --> 00:13:44,770 So let's say C subset of V for the set of vertices, 272 00:13:44,770 --> 00:13:46,966 such that C is a complete graph. 273 00:13:46,966 --> 00:13:48,090 Let's just draw a diagram. 274 00:13:48,090 --> 00:13:49,050 That's probably easier. 275 00:14:05,200 --> 00:14:05,820 OK. 276 00:14:05,820 --> 00:14:08,896 So in this example, so you have this graph. 277 00:14:08,896 --> 00:14:09,790 This one. 278 00:14:09,790 --> 00:14:11,985 So look at this set of vertices. 279 00:14:14,740 --> 00:14:17,210 Every pair of them is connected to each other. 280 00:14:17,210 --> 00:14:18,700 What that means is that if you just 281 00:14:18,700 --> 00:14:21,100 look at the graph with these vertices, 282 00:14:21,100 --> 00:14:22,430 it's a complete graph. 283 00:14:22,430 --> 00:14:24,850 That's what's called a clique. 284 00:14:24,850 --> 00:14:26,464 And in this case, this is a 4-clique. 285 00:14:26,464 --> 00:14:28,380 So you have four vertices, this is a 4-clique. 286 00:14:33,140 --> 00:14:36,210 So the decision problem is, given the graph, 287 00:14:36,210 --> 00:14:38,247 does there exist a k-clique? 288 00:14:38,247 --> 00:14:40,330 So this is, again, known to be an NP-hard problem. 289 00:14:40,330 --> 00:14:45,350 So now we will use this to show that this problem is NP-hard. 290 00:14:45,350 --> 00:14:50,850 So this problem is independent set. 291 00:14:55,860 --> 00:15:00,940 So again, so given the graph, what is an independent set? 292 00:15:00,940 --> 00:15:04,320 Anyone want to explain? 293 00:15:04,320 --> 00:15:05,900 So what an independent set is is this. 294 00:15:05,900 --> 00:15:07,358 So let's say you have a graph which 295 00:15:07,358 --> 00:15:16,310 looks like-- so kind of complementary 296 00:15:16,310 --> 00:15:18,000 to the definition of a clique. 297 00:15:18,000 --> 00:15:20,500 An independent set is a set of vertices, 298 00:15:20,500 --> 00:15:23,710 such that no pair of them has an edge between them. 
299 00:15:23,710 --> 00:15:27,670 So in this case, if you took this vertex, 300 00:15:27,670 --> 00:15:30,427 you took this vertex, this vertex, and this vertex-- 301 00:15:30,427 --> 00:15:32,010 so you can see, none of these vertices 302 00:15:32,010 --> 00:15:35,385 have an edge between them, so that is an independent set. 303 00:15:35,385 --> 00:15:38,762 So in this case, you're taking a set of vertices 304 00:15:38,762 --> 00:15:40,470 which is a complete graph, so all of them 305 00:15:40,470 --> 00:15:41,900 have edges between them. 306 00:15:41,900 --> 00:15:44,441 And in this case, you're taking vertices which are completely 307 00:15:44,441 --> 00:15:45,560 disconnected. 308 00:15:45,560 --> 00:15:48,180 So now we're going to find a reduction from this problem 309 00:15:48,180 --> 00:15:50,510 to this problem. 310 00:15:50,510 --> 00:15:53,400 So first of all, independent set is in NP. 311 00:15:53,400 --> 00:15:54,650 Is that clear? 312 00:15:54,650 --> 00:15:57,120 How would you show that? 313 00:15:57,120 --> 00:15:59,250 So how would you create a certificate which 314 00:15:59,250 --> 00:16:01,370 would tell you that-- so how would someone 315 00:16:01,370 --> 00:16:03,540 create a certificate which would convince you, 316 00:16:03,540 --> 00:16:08,210 in polynomial-time, that this is correct? 317 00:16:08,210 --> 00:16:11,070 So the certificate would be just give you the independent set. 318 00:16:11,070 --> 00:16:14,800 And you can check if something is an independent set. 319 00:16:14,800 --> 00:16:19,700 So given I-- let's call the independent set, I. So given 320 00:16:19,700 --> 00:16:21,689 the set of vertices, you can verify 321 00:16:21,689 --> 00:16:23,730 that it is an independent set in polynomial-time. 322 00:16:23,730 --> 00:16:25,557 So just look at all pairs and check 323 00:16:25,557 --> 00:16:27,140 if there's an edge-- that's just n squared, 324 00:16:27,140 --> 00:16:29,070 n cubed, whatever, it's polynomial. 
325 00:16:29,070 --> 00:16:30,309 That's important. 326 00:16:30,309 --> 00:16:31,975 So now let's look at our transformation. 327 00:16:35,850 --> 00:16:39,860 So again, as before, you have A, which is given by a graph, 328 00:16:39,860 --> 00:16:42,060 and you want to transform it into something. 329 00:16:42,060 --> 00:16:47,630 So the important note here is that in the clique, 330 00:16:47,630 --> 00:17:03,900 you have-- so for your clique C, all the pairs of vertices 331 00:17:03,900 --> 00:17:05,160 are connected. 332 00:17:05,160 --> 00:17:07,550 In I, no pair of vertices is connected. 333 00:17:07,550 --> 00:17:11,109 So what should be a logical transformation that would map 334 00:17:11,109 --> 00:17:12,400 a clique to an independent set? 335 00:17:16,210 --> 00:17:17,461 Anyone? 336 00:17:17,461 --> 00:17:19,294 What do you think you should do to the graph 337 00:17:19,294 --> 00:17:20,730 so that the clique becomes-- yeah. 338 00:17:20,730 --> 00:17:22,722 AUDIENCE: Invert the existence of edges 339 00:17:22,722 --> 00:17:23,681 so they're [INAUDIBLE]. 340 00:17:23,681 --> 00:17:25,096 AMARTYA SHANKHA BISWAS: Precisely. 341 00:17:25,096 --> 00:17:26,604 AUDIENCE: [INAUDIBLE]. 342 00:17:26,604 --> 00:17:27,770 AMARTYA SHANKHA BISWAS: Yup. 343 00:17:27,770 --> 00:17:28,470 Exactly. 344 00:17:28,470 --> 00:17:28,970 Great. 345 00:17:28,970 --> 00:17:32,170 So all you have to do is if you want 346 00:17:32,170 --> 00:17:35,150 to turn a clique into an independent set, you 347 00:17:35,150 --> 00:17:40,042 just-- what is it-- complement the adjacency matrix. 348 00:17:40,042 --> 00:17:42,500 So every edge that did not exist now exists and every edge that 349 00:17:42,500 --> 00:17:44,450 did exist is gone. 350 00:17:44,450 --> 00:17:50,080 So you create a graph, G dash, with the same [INAUDIBLE] 351 00:17:50,080 --> 00:17:52,750 vertices, except the edges are now complemented. 352 00:17:52,750 --> 00:17:55,780 So the E bar just means the edges that were not in E. 
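[Editor's sketch, not part of the recording.] Complementing the edge set, as described above, can be sketched in Python like this. The function names are hypothetical, and the sketch assumes a simple undirected graph given as a vertex list plus a list of edge pairs. The two checkers are the quadratic "look at all pairs" verifiers mentioned for the certificate.

```python
from itertools import combinations


def complement_graph(vertices, edges):
    """Edges of G dash: every unordered pair of distinct vertices NOT in E."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v)
        for u, v in combinations(vertices, 2)
        if frozenset((u, v)) not in present
    ]


def is_clique(subset, edges):
    """True if every pair of vertices in subset is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(subset, 2))


def is_independent_set(subset, edges):
    """True if no pair of vertices in subset is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) not in present for p in combinations(subset, 2))
```

A vertex set that passes `is_clique` on the original edges should pass `is_independent_set` on the output of `complement_graph`, and vice versa.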
353 00:17:55,780 --> 00:17:57,440 So let's just draw an example. 354 00:17:57,440 --> 00:18:17,300 So let's say you had-- and what does this become? 355 00:18:17,300 --> 00:18:18,765 So let's draw the vertices first. 356 00:18:22,800 --> 00:18:23,780 Let's go one by one. 357 00:18:23,780 --> 00:18:24,863 So let's take this vertex. 358 00:18:28,570 --> 00:18:29,770 Let's take this vertex. 359 00:18:29,770 --> 00:18:31,250 And what edges does it have? 360 00:18:31,250 --> 00:18:32,079 So it goes here. 361 00:18:32,079 --> 00:18:32,620 It goes here. 362 00:18:32,620 --> 00:18:35,040 It doesn't go here, so we draw an edge here. 363 00:18:35,040 --> 00:18:38,030 It doesn't go there, so we draw another edge there. 364 00:18:38,030 --> 00:18:41,582 And it doesn't go there, so we draw another edge there. 365 00:18:41,582 --> 00:18:42,940 Now, this vertex. 366 00:18:42,940 --> 00:18:45,460 It's connected to all of these except this one. 367 00:18:45,460 --> 00:18:48,900 So that means that there's an edge there. 368 00:18:48,900 --> 00:18:50,100 Let's take this vertex. 369 00:18:50,100 --> 00:18:51,455 It's connected to everything. 370 00:18:51,455 --> 00:18:53,160 Oh, it's connected to everything. 371 00:18:53,160 --> 00:18:53,996 Let's take this one. 372 00:18:53,996 --> 00:18:55,870 It's connected to these two and nothing else, 373 00:18:55,870 --> 00:18:59,770 so we need to connect it to this guy. 374 00:18:59,770 --> 00:19:02,120 And I think that's it. 375 00:19:02,120 --> 00:19:03,670 Yeah. 376 00:19:03,670 --> 00:19:05,380 Similarly proceeding, this is connected 377 00:19:05,380 --> 00:19:09,390 to everything except that, so I guess this goes there. 378 00:19:09,390 --> 00:19:12,340 And I think that's it, right? 379 00:19:12,340 --> 00:19:13,920 Or is there more? 380 00:19:13,920 --> 00:19:14,780 Three. 381 00:19:14,780 --> 00:19:15,350 No, OK. 382 00:19:15,350 --> 00:19:17,340 So that is a complementary graph I think. 
383 00:19:17,340 --> 00:19:19,450 So you can probably verify that. 384 00:19:19,450 --> 00:19:22,660 So now let's look at the clique in this graph. 385 00:19:22,660 --> 00:19:23,870 So this is the clique. 386 00:19:23,870 --> 00:19:26,491 This is the largest clique, rather, but this is a clique. 387 00:19:26,491 --> 00:19:28,490 You could have other cliques, like, for example, 388 00:19:28,490 --> 00:19:30,595 these three things are also a clique. 389 00:19:30,595 --> 00:19:32,220 So now look at what this is mapping to. 390 00:19:32,220 --> 00:19:35,400 This is mapping to this vertex, this vertex, this vertex, 391 00:19:35,400 --> 00:19:37,200 and this vertex. 392 00:19:37,200 --> 00:19:40,480 And you can see that's an independent set. 393 00:19:40,480 --> 00:19:43,040 So does that transformation make sense? 394 00:19:43,040 --> 00:19:45,150 So now the proof should be intuitively clear. 395 00:19:45,150 --> 00:19:47,500 So let's just go through it moderately rigorously. 396 00:19:47,500 --> 00:19:49,105 So let's say you have a clique here. 397 00:19:53,020 --> 00:19:58,470 So your clique here, for every pair of vertices in the clique, 398 00:19:58,470 --> 00:20:00,620 there's an edge between them. 399 00:20:00,620 --> 00:20:05,990 And so if that maps to-- so let's say clique C maps 400 00:20:05,990 --> 00:20:19,200 to-- let's say this maps to I. So for all U, V element of C, 401 00:20:19,200 --> 00:20:23,980 you have U, V, element of E. So if this 402 00:20:23,980 --> 00:20:26,320 is a clique, for every pair of vertices, 403 00:20:26,320 --> 00:20:29,240 that edge is in the original graph, which 404 00:20:29,240 --> 00:20:41,400 means that U, V is not an element of E bar for all U, V 405 00:20:41,400 --> 00:20:42,780 element of I. 406 00:20:42,780 --> 00:20:45,710 So for every U, V element of I, we 407 00:20:45,710 --> 00:20:53,520 have this, which means that-- does that make sense? 
408 00:20:53,520 --> 00:20:57,810 So that means that that's the independent set criterion. 409 00:20:57,810 --> 00:21:00,020 So you reduced clique to independent set. 410 00:21:00,020 --> 00:21:02,552 And that means that independent set is now NP-hard. 411 00:21:02,552 --> 00:21:03,599 OK. 412 00:21:03,599 --> 00:21:04,640 How are we doing on time? 413 00:21:04,640 --> 00:21:05,710 OK. 414 00:21:05,710 --> 00:21:08,505 So now let's do a more complicated example. 415 00:21:08,505 --> 00:21:09,630 Any questions on these two? 416 00:21:12,890 --> 00:21:14,210 Make sense? 417 00:21:14,210 --> 00:21:15,300 OK. 418 00:21:15,300 --> 00:21:20,098 So let's try this. 419 00:21:20,098 --> 00:21:22,066 So let's start by erasing something. 420 00:21:48,680 --> 00:21:50,370 So this is the next problem. 421 00:21:50,370 --> 00:21:54,780 So as before, our A is k-clique. 422 00:22:01,280 --> 00:22:04,900 So it says there's a clique-- rather, 423 00:22:04,900 --> 00:22:06,210 let's add this in this way. 424 00:22:06,210 --> 00:22:10,739 Clique size greater than or equal to k. 425 00:22:10,739 --> 00:22:12,030 So that's the decision problem. 426 00:22:12,030 --> 00:22:14,580 Is there a clique of size greater than or equal to k? 427 00:22:14,580 --> 00:22:20,460 And B is, it's called Max-2-SAT. 428 00:22:20,460 --> 00:22:24,570 So what that means is that-- so it is somewhat 429 00:22:24,570 --> 00:22:36,115 like normal 2-SAT, except basically you have some clauses 430 00:22:36,115 --> 00:22:37,240 and you have some literals. 431 00:22:45,200 --> 00:22:48,400 So each of these literals takes the value 1 or 0. 432 00:22:48,400 --> 00:22:55,119 And each of these clauses is something like xi or xj. 433 00:22:55,119 --> 00:22:57,285 And actually, there can be nots in front of these, 434 00:22:57,285 --> 00:23:00,267 so let's say xi or not xj. 435 00:23:00,267 --> 00:23:01,350 Or it can be other things. 
436 00:23:01,350 --> 00:23:07,650 So it's xi or not xj, or not xi or not xj, or xi 437 00:23:07,650 --> 00:23:10,790 or xj, and so on and so forth, so just the normal 2-SAT. 438 00:23:10,790 --> 00:23:14,960 So now the decision problem is to-- does 439 00:23:14,960 --> 00:23:30,090 there exist an assignment, such that greater than or equal to k 440 00:23:30,090 --> 00:23:41,000 clauses are satisfied-- so that's the decision problem. 441 00:23:41,000 --> 00:23:45,150 So is there an assignment to literals such that at least k 442 00:23:45,150 --> 00:23:47,680 of these clauses are satisfied? 443 00:23:47,680 --> 00:23:50,350 And so now we're going to show that it is hard using k-clique. 444 00:23:50,350 --> 00:23:53,070 So again, is it in NP? 445 00:23:53,070 --> 00:23:56,230 So what is the certificate? 446 00:23:56,230 --> 00:23:59,460 How would someone convince you of their solution to this problem? 447 00:24:03,630 --> 00:24:05,580 AUDIENCE: Give you the literals. 448 00:24:05,580 --> 00:24:06,790 AMARTYA SHANKHA BISWAS: Yeah, so give you an assignment 449 00:24:06,790 --> 00:24:07,782 of values to literals. 450 00:24:07,782 --> 00:24:10,240 And then you can go through all the clauses and check them. 451 00:24:10,240 --> 00:24:12,490 So if they give you like x1 equal to 1, x2 452 00:24:12,490 --> 00:24:14,657 equal to 0, x3 equal to 1, and so on and so forth, 453 00:24:14,657 --> 00:24:16,698 you can then go through and check all the clauses 454 00:24:16,698 --> 00:24:18,460 and see if at least k are satisfied. 455 00:24:18,460 --> 00:24:20,280 So this is in NP. 456 00:24:20,280 --> 00:24:22,550 So now let's try the reduction. 457 00:24:22,550 --> 00:24:25,600 So this is how the reduction goes. 458 00:24:25,600 --> 00:24:26,180 Let's see. 459 00:24:26,180 --> 00:24:27,810 So naturally, we have k-clique, or greater 460 00:24:27,810 --> 00:24:29,101 than or equal to k-clique, rather. 
461 00:24:33,530 --> 00:24:37,529 So that means you have a graph with a set of vertices 462 00:24:37,529 --> 00:24:38,320 and a set of edges. 463 00:24:43,700 --> 00:24:48,240 So let's say you have a clique, V dash, subset of V, 464 00:24:48,240 --> 00:24:52,570 and mod of V dash is greater than equal to k. 465 00:24:52,570 --> 00:24:54,070 So now you have to somehow construct 466 00:24:54,070 --> 00:24:56,050 literals, construct clauses, which 467 00:24:56,050 --> 00:24:57,949 will reflect this behavior. 468 00:24:57,949 --> 00:24:59,990 So first of all, this may not be clear right now, 469 00:24:59,990 --> 00:25:01,906 but let's say we take some literals like this. 470 00:25:01,906 --> 00:25:07,412 So let's say we take xi for all i element of V. 471 00:25:07,412 --> 00:25:11,120 So for every vertex in the graph, we take a literal. 472 00:25:11,120 --> 00:25:13,760 So if the number of vertices is n, 473 00:25:13,760 --> 00:25:16,900 we have n literals because it's corresponding to each vertex. 474 00:25:16,900 --> 00:25:19,710 Also, we take a dummy literal. 475 00:25:19,710 --> 00:25:24,200 Let's call it Z. 476 00:25:24,200 --> 00:25:26,750 So now how do we get our clauses? 477 00:25:26,750 --> 00:25:31,680 So the general idea is that if a vertex is in the clique, 478 00:25:31,680 --> 00:25:32,839 you will assign it 1. 479 00:25:32,839 --> 00:25:34,797 If it's not in the clique, we will assign it 0. 480 00:25:40,990 --> 00:25:42,850 Everything outside of the clique is 0. 481 00:25:42,850 --> 00:25:53,130 So this clause, not xi or not xj-- 482 00:25:53,130 --> 00:25:55,630 so what is the value of this clause normally? 483 00:25:55,630 --> 00:26:01,486 So let's say xi and xj are both outside the clique. 484 00:26:01,486 --> 00:26:03,110 So i and j are both outside the clique. 485 00:26:03,110 --> 00:26:06,249 That means that both of them are 0 and so this is true. 
486 00:26:06,249 --> 00:26:08,665 What if one of them is inside the clique and the other one 487 00:26:08,665 --> 00:26:09,420 is outside? 488 00:26:09,420 --> 00:26:11,510 It still is true because one of these nots is 1 489 00:26:11,510 --> 00:26:13,370 and that is still true. 490 00:26:13,370 --> 00:26:16,940 Let's say both i and j are inside the clique. 491 00:26:16,940 --> 00:26:24,110 So in that case, you have not of xi is 0 and not of xj is also 0. 492 00:26:24,110 --> 00:26:26,260 That's the only case that this is false. 493 00:26:26,260 --> 00:26:27,860 So the way we take care of-- so we're 494 00:26:27,860 --> 00:26:30,360 trying to maximize the number of true clauses in some sense. 495 00:26:30,360 --> 00:26:32,240 That's maybe an explanation of why 496 00:26:32,240 --> 00:26:33,730 you're using this clause. 497 00:26:33,730 --> 00:26:36,710 So what we do instead of taking all the xi, xj pairs, 498 00:26:36,710 --> 00:26:42,420 is we just take xi, xj, such that i, 499 00:26:42,420 --> 00:26:49,532 j is not an element of E. 500 00:26:49,532 --> 00:26:50,490 So what does that mean? 501 00:26:50,490 --> 00:26:59,170 So now if you had the graph which looked like this, now 502 00:26:59,170 --> 00:27:02,745 it looks like 1, 2, 3, 4. 503 00:27:02,745 --> 00:27:06,980 So you would take not x1 or not x4. 504 00:27:06,980 --> 00:27:08,980 You would take not x1 or not x3. 505 00:27:08,980 --> 00:27:11,380 But you would not take not x2 or not x3. 506 00:27:11,380 --> 00:27:12,590 So what does that do? 507 00:27:12,590 --> 00:27:15,040 That means that if you follow the assignment according 508 00:27:15,040 --> 00:27:21,130 to the clique rules-- so if i and j are both in the clique, 509 00:27:21,130 --> 00:27:22,884 this clause will not be included. 510 00:27:22,884 --> 00:27:25,300 Does that make sense-- what set of clauses you are taking?
511 00:27:28,266 --> 00:27:30,850 OK, so let's continue and it will hopefully 512 00:27:30,850 --> 00:27:32,390 be a little more clear. 513 00:27:32,390 --> 00:27:37,860 So the other sort of clause we're going to take is xi or z. 514 00:27:37,860 --> 00:27:42,892 And the other one is xi or not z. 515 00:27:42,892 --> 00:27:44,350 So the reason we're taking these is 516 00:27:44,350 --> 00:27:48,020 that if you wanted to do Max-2-SAT on the first type alone, 517 00:27:48,020 --> 00:27:49,800 you can just set everything to 0, 518 00:27:49,800 --> 00:27:51,350 and that would give you a maximum. 519 00:27:51,350 --> 00:27:54,325 So to sort of avoid that-- so to sort of minimize 520 00:27:54,325 --> 00:27:56,700 the number of things set to 0, you're doing this. 521 00:27:56,700 --> 00:28:00,260 So this is just some hand-wavy argument for why you're doing this. 522 00:28:00,260 --> 00:28:03,280 So let's actually try to do some analysis on this. 523 00:28:03,280 --> 00:28:05,050 So let's say you do this transformation. 524 00:28:05,050 --> 00:28:07,320 So do the clauses make sense? 525 00:28:07,320 --> 00:28:10,080 Does the first clause make sense? 526 00:28:10,080 --> 00:28:14,080 So you have not xi or not xj for every i, j which is not an edge in the graph. 527 00:28:16,640 --> 00:28:17,970 So how does this work? 528 00:28:17,970 --> 00:28:30,060 So let's say you have V dash such that size of V dash 529 00:28:30,060 --> 00:28:31,800 is greater than or equal to k. 530 00:28:31,800 --> 00:28:35,120 Actually, let's just make size of V dash equal to k. 531 00:28:35,120 --> 00:28:37,720 So if you have a clique of size greater than or equal to k, 532 00:28:37,720 --> 00:28:39,080 of course, you have a clique of size equal to k. 533 00:28:39,080 --> 00:28:41,270 You can just throw away some of the vertices. 534 00:28:41,270 --> 00:28:44,320 So you take your V dash such that its size is equal to k. 535 00:28:44,320 --> 00:28:48,140 And you set xi equal to 1.
536 00:28:48,140 --> 00:28:59,424 So you set xi equal to 1 if i is an element of V dash, and 0 537 00:28:59,424 --> 00:29:02,610 if i is not an element of V dash. 538 00:29:02,610 --> 00:29:03,110 Make sense? 539 00:29:03,110 --> 00:29:05,190 So you would set everything in your clique to be 1, everything 540 00:29:05,190 --> 00:29:06,340 outside to be 0. 541 00:29:06,340 --> 00:29:08,227 And let z equal 1. 542 00:29:08,227 --> 00:29:09,810 So you're starting with the assumption 543 00:29:09,810 --> 00:29:11,519 that-- so you're showing one direction. 544 00:29:11,519 --> 00:29:13,810 You're given that there's a clique of size 545 00:29:13,810 --> 00:29:15,270 greater than or equal to k. 546 00:29:15,270 --> 00:29:17,690 And now you're going to show that the Max-2-SAT 547 00:29:17,690 --> 00:29:21,028 instance you constructed has a number of satisfied clauses that is at least 548 00:29:21,028 --> 00:29:22,020 its threshold. 549 00:29:22,020 --> 00:29:24,859 And then we'd show the other direction. 550 00:29:24,859 --> 00:29:26,400 So now let's look at how many clauses 551 00:29:26,400 --> 00:29:27,400 we have satisfied. 552 00:29:32,510 --> 00:29:42,050 So the first type of clause was not of xi or not of xj. 553 00:29:44,690 --> 00:29:47,810 So how many of these clauses are being satisfied? 554 00:29:47,810 --> 00:29:51,330 So first case, i and j are both outside V dash. 555 00:29:51,330 --> 00:29:53,380 Is the clause satisfied in that case? 556 00:29:53,380 --> 00:29:55,760 Yes, because by definition, if they're outside V dash, 557 00:29:55,760 --> 00:29:58,810 they're both 0, so their nots are both 1, so the clause is 1. 558 00:29:58,810 --> 00:30:01,090 So if i and j are outside V dash, you're good. 559 00:30:01,090 --> 00:30:04,490 So what about the case when one of them is inside V dash? 560 00:30:04,490 --> 00:30:05,490 Is the clause satisfied?
561 00:30:09,405 --> 00:30:13,712 Yeah, because one of them is 0, which makes its not 1, 562 00:30:13,712 --> 00:30:14,670 and the whole thing is 1. 563 00:30:14,670 --> 00:30:16,440 It's an OR, so it's satisfied. 564 00:30:16,440 --> 00:30:19,100 Let's say both of them are inside V dash. 565 00:30:19,100 --> 00:30:21,040 Let's say both i and j are inside V dash. 566 00:30:21,040 --> 00:30:24,130 Then this clause just doesn't exist because of the condition 567 00:30:24,130 --> 00:30:26,310 that i, j is not an element of E. Because if it's 568 00:30:26,310 --> 00:30:28,480 inside the clique, then that edge obviously 569 00:30:28,480 --> 00:30:30,910 exists, and therefore this clause is not 570 00:30:30,910 --> 00:30:32,790 in the set of clauses we're using. 571 00:30:32,790 --> 00:30:36,017 So essentially, every clause of this form will be satisfied. 572 00:30:36,017 --> 00:30:37,850 And how many clauses of this form are there? 573 00:30:37,850 --> 00:30:39,266 The number of clauses of this form 574 00:30:39,266 --> 00:30:42,310 is just mod of E bar, where E bar is the complementary edge 575 00:30:42,310 --> 00:30:44,650 set of that graph. 576 00:30:44,650 --> 00:30:45,365 That's E bar. 577 00:30:45,365 --> 00:30:46,240 Does that make sense? 578 00:30:50,500 --> 00:30:56,100 Next clause is xi or z. 579 00:30:56,100 --> 00:30:59,700 So since we have set z to be 1, 580 00:30:59,700 --> 00:31:01,220 this clause is always satisfied. 581 00:31:01,220 --> 00:31:05,290 So this just gives us mod of V because this 582 00:31:05,290 --> 00:31:08,630 is for every i-- I should mention that. 583 00:31:08,630 --> 00:31:11,490 For i, this is also for all i. 584 00:31:11,490 --> 00:31:14,450 So it's for every i, and the number of xi's is 585 00:31:14,450 --> 00:31:15,810 mod of V, so that's it. 586 00:31:15,810 --> 00:31:23,400 So the third type of clause is xi or not of z. So not of z, 587 00:31:23,400 --> 00:31:25,060 since z is 1, not of z is 0.
588 00:31:25,060 --> 00:31:27,820 So the only case where this clause is true is where? 589 00:31:27,820 --> 00:31:29,720 Is when xi is 1. 590 00:31:29,720 --> 00:31:32,180 And xi is 1 only inside V dash, 591 00:31:32,180 --> 00:31:33,960 so the number of clauses satisfied here 592 00:31:33,960 --> 00:31:37,027 is mod of V dash. 593 00:31:37,027 --> 00:31:38,110 And mod of V dash is what? 594 00:31:38,110 --> 00:31:40,315 It's just k. 595 00:31:40,315 --> 00:31:40,970 You see this? 596 00:31:46,780 --> 00:31:49,110 All three clauses make sense? 597 00:31:49,110 --> 00:31:52,770 Can you see why the first one is the size of E bar? 598 00:31:52,770 --> 00:31:56,180 Because all the clauses are satisfied. 599 00:31:56,180 --> 00:31:58,950 The second one, also all the clauses 600 00:31:58,950 --> 00:32:01,400 are satisfied because z is 1. 601 00:32:01,400 --> 00:32:04,670 So that's just the number of vertices. 602 00:32:04,670 --> 00:32:06,820 And the last one, it's only satisfied for the cases 603 00:32:06,820 --> 00:32:08,540 where xi is 1. 604 00:32:08,540 --> 00:32:10,420 And xi is 1 only inside the clique, 605 00:32:10,420 --> 00:32:13,150 so that gives you the k things in the clique. 606 00:32:13,150 --> 00:32:17,020 So now we can finish formulating our transformation. 607 00:32:17,020 --> 00:32:23,152 So our transformation was you set-- 608 00:32:23,152 --> 00:32:24,360 where was the transformation? 609 00:32:24,360 --> 00:32:24,860 Yes. 610 00:32:24,860 --> 00:32:28,450 So these are our clauses and now we're going to set the threshold. 611 00:32:28,450 --> 00:32:32,480 So the Max-2-SAT problem came with a threshold k. 612 00:32:32,480 --> 00:32:40,930 So here you'll set it to be mod of E bar plus mod of V plus k. 613 00:32:45,090 --> 00:32:46,090 So does that make sense? 614 00:32:46,090 --> 00:32:51,170 So basically, our Max-2-SAT problem-- so the reduction 615 00:32:51,170 --> 00:32:54,060 to the Max-2-SAT problem-- so we specified the clauses.
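The construction and the forward-direction count can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names `clique_to_max2sat` and `count_satisfied`, the variable names `x1`, `x2`, ... and `z`, and the 4-vertex example graph are all hypothetical choices made here for illustration.

```python
from itertools import combinations


def clique_to_max2sat(vertices, edges, k):
    """Map a k-clique instance to (clauses, threshold) for Max-2-SAT.

    A literal is (variable_name, is_positive); a clause is a pair of literals.
    """
    edge_set = {frozenset(e) for e in edges}
    non_edges = [(i, j) for i, j in combinations(vertices, 2)
                 if frozenset((i, j)) not in edge_set]
    clauses = []
    # Type 1: (not x_i or not x_j) for every pair (i, j) that is not an edge.
    for i, j in non_edges:
        clauses.append(((f"x{i}", False), (f"x{j}", False)))
    # Types 2 and 3: (x_i or z) and (x_i or not z) for every vertex i.
    for i in vertices:
        clauses.append(((f"x{i}", True), ("z", True)))
        clauses.append(((f"x{i}", True), ("z", False)))
    threshold = len(non_edges) + len(vertices) + k
    return clauses, threshold


def count_satisfied(clauses, assignment):
    """Count clauses with at least one true literal under a 0/1 assignment."""
    def value(literal):
        name, is_positive = literal
        return bool(assignment[name]) == is_positive
    return sum(1 for a, b in clauses if value(a) or value(b))


# Hypothetical 4-vertex graph: triangle {2, 3, 4} plus the edge (1, 2),
# so the non-edges are (1, 3) and (1, 4).
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4), (3, 4)]
clauses, threshold = clique_to_max2sat(vertices, edges, k=3)

# Clique assignment: 1 inside the clique {2, 3, 4}, 0 outside, z = 1.
assignment = {"x1": 0, "x2": 1, "x3": 1, "x4": 1, "z": 1}
satisfied = count_satisfied(clauses, assignment)
```

On this example the three clause types contribute 2, 4, and 3 satisfied clauses respectively, which matches the threshold of 2 + 4 + 3.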
616 00:32:54,060 --> 00:32:55,780 We specified the literals. 617 00:32:55,780 --> 00:32:59,060 Literals were xi for all i in V and the dummy z. 618 00:32:59,060 --> 00:33:02,100 We specified the clauses and we specified the threshold 619 00:33:02,100 --> 00:33:04,420 we are trying to achieve. 620 00:33:04,420 --> 00:33:07,450 So those three things completely specify the Max-2-SAT problem. 621 00:33:13,056 --> 00:33:16,120 OK, proceeding. 622 00:33:16,120 --> 00:33:17,480 So we just showed one direction. 623 00:33:17,480 --> 00:33:19,730 We showed that if there's a clique of size 624 00:33:19,730 --> 00:33:21,480 greater than or equal to k, we reduce this 625 00:33:21,480 --> 00:33:24,040 to a size of exactly k. 626 00:33:24,040 --> 00:33:25,690 Then we did this assignment and we 627 00:33:25,690 --> 00:33:28,600 showed that the total number of clauses being satisfied 628 00:33:28,600 --> 00:33:31,460 is mod of E bar plus mod of V plus k. 629 00:33:31,460 --> 00:33:33,792 And that is the first direction. 630 00:33:33,792 --> 00:33:36,000 So we've shown that-- so if you have a solution 631 00:33:36,000 --> 00:33:39,056 to clique, we have a solution to Max-2-SAT. 632 00:33:39,056 --> 00:33:40,430 So now we do the other direction. 633 00:33:40,430 --> 00:33:45,079 So let's say we have a solution to Max-2-SAT. 634 00:33:45,079 --> 00:33:46,370 Let me get this out of the way. 635 00:33:51,940 --> 00:33:52,930 Let's do this. 636 00:34:01,050 --> 00:34:27,900 So let's say-- so this part is a little more tricky. 637 00:34:27,900 --> 00:34:30,639 So we start with: Max-2-SAT has a number of satisfied clauses 638 00:34:30,639 --> 00:34:33,380 which is greater than or equal to mod of E bar plus mod of V plus k. 639 00:34:33,380 --> 00:34:36,719 So we know that the Max-2-SAT instance accepts. 640 00:34:36,719 --> 00:34:40,291 So given that, we define. 641 00:34:40,291 --> 00:34:41,330 We define a V dash.
642 00:34:44,750 --> 00:34:47,620 This V dash is in the graph for clique 643 00:34:47,620 --> 00:34:56,675 as the set of vertices i, such that x of i is equal to 1. 644 00:34:56,675 --> 00:34:59,130 So if our Max-2-SAT has that many satisfied clauses, 645 00:34:59,130 --> 00:35:02,160 then there's some assignment to xi which satisfies that. 646 00:35:02,160 --> 00:35:03,450 And we take that assignment. 647 00:35:03,450 --> 00:35:05,330 And for every value that is 1, we 648 00:35:05,330 --> 00:35:07,830 put that in a-- it's not a clique yet, 649 00:35:07,830 --> 00:35:10,942 but it's, well, it's a clique under construction. 650 00:35:10,942 --> 00:35:12,650 So we make this clique under construction 651 00:35:12,650 --> 00:35:15,160 and we put in it all the vertices 652 00:35:15,160 --> 00:35:17,010 which are currently labeled by 1. 653 00:35:17,010 --> 00:35:18,140 But we don't know that it's a clique, right? 654 00:35:18,140 --> 00:35:18,960 It could not be a clique. 655 00:35:18,960 --> 00:35:20,543 It could be something like, let's say, 656 00:35:20,543 --> 00:35:22,610 so this is the current V dash. 657 00:35:22,610 --> 00:35:25,520 So V dash could be something like you 658 00:35:25,520 --> 00:35:28,720 have all these vertices and let's say some of them 659 00:35:28,720 --> 00:35:29,940 are connected. 660 00:35:29,940 --> 00:35:31,700 But then you have this dangling. 661 00:35:31,700 --> 00:35:32,270 So this is not connected. 662 00:35:32,270 --> 00:35:33,210 This is not connected. 663 00:35:33,210 --> 00:35:34,930 So it's not a clique. 664 00:35:34,930 --> 00:35:39,420 So let's say this is vertex i and this is vertex j, 665 00:35:39,420 --> 00:35:41,700 and this whole thing is V dash. 666 00:35:41,700 --> 00:35:43,850 And let's say somehow you have this anomaly. 667 00:35:43,850 --> 00:35:47,590 You have that this guy is not connected to all the vertices. 668 00:35:47,590 --> 00:35:49,740 So let's take one such pair.
669 00:35:49,740 --> 00:35:54,666 And let's just say we remove xi. 670 00:35:54,666 --> 00:35:56,920 So we just take the xi and remove it from V dash. 671 00:35:56,920 --> 00:36:01,590 What that equates to in the Max-2-SAT is setting xi to 0. 672 00:36:01,590 --> 00:36:04,810 So we take the original satisfying assignment 673 00:36:04,810 --> 00:36:07,660 and change the value of xi. 674 00:36:07,660 --> 00:36:12,500 So let's see what that does. 675 00:36:12,500 --> 00:36:19,600 So we take and set xi equal to 0. 676 00:36:19,600 --> 00:36:21,160 And xi was originally 1. 677 00:36:23,977 --> 00:36:25,185 Let's actually write it down. 678 00:36:25,185 --> 00:36:37,570 So xi, or rather, i, j is not an element of E, 679 00:36:37,570 --> 00:36:50,340 but-- so these i, j were in the supposed clique, 680 00:36:50,340 --> 00:36:52,370 but the i, j is not in the edge set, 681 00:36:52,370 --> 00:36:53,620 so it's not actually a clique. 682 00:36:53,620 --> 00:36:55,740 So the way we resolve that is just we do this. 683 00:36:55,740 --> 00:36:56,240 We say, OK. 684 00:36:56,240 --> 00:36:57,610 Let's just forget about this vertex. 685 00:36:57,610 --> 00:36:58,960 Let's say it's just not in the clique. 686 00:36:58,960 --> 00:37:00,080 So we set our xi to 0. 687 00:37:00,080 --> 00:37:01,250 And what does that do? 688 00:37:01,250 --> 00:37:04,610 Let's look at how that affects the number of satisfied clauses. 689 00:37:04,610 --> 00:37:10,390 So the first one is xi or z. 690 00:37:10,390 --> 00:37:14,410 So now here we have not set z to be 1. 691 00:37:14,410 --> 00:37:16,260 This is something in thing. 692 00:37:16,260 --> 00:37:19,210 If z is equal to 0, we can just replace z by not of z. 693 00:37:19,210 --> 00:37:22,600 So the clauses are symmetrical with respect to z. 694 00:37:22,600 --> 00:37:24,055 So it's xi or z, or xi or not z. 695 00:37:24,055 --> 00:37:26,180 So it doesn't matter which one is set to be 1 or 0.
696 00:37:26,180 --> 00:37:27,763 If one of them is 1, the other is 0. 697 00:37:27,763 --> 00:37:31,170 So let's just say that z is 1 without loss of generality. 698 00:37:31,170 --> 00:37:34,020 So what that does is-- so initially, this clause was 1 699 00:37:34,020 --> 00:37:35,210 because z is 1. 700 00:37:35,210 --> 00:37:38,130 And now it stays 1 and nothing changes. 701 00:37:38,130 --> 00:37:39,600 OK, sounds good. 702 00:37:39,600 --> 00:37:40,790 What about the next one? 703 00:37:40,790 --> 00:37:47,640 So now we have xi or not of z. 704 00:37:47,640 --> 00:37:52,370 So not of z is 0 and xi just went from being 1 to being 0. 705 00:37:52,370 --> 00:37:55,630 So what happens here is that initially, the clause value 706 00:37:55,630 --> 00:37:59,490 was 1 and now it goes to 0. 707 00:37:59,490 --> 00:38:01,880 So that's not good because now we 708 00:38:01,880 --> 00:38:04,360 are possibly no longer meeting the Max-2-SAT threshold 709 00:38:04,360 --> 00:38:06,820 because we had some number of satisfied clauses, 710 00:38:06,820 --> 00:38:08,770 which was above our threshold. 711 00:38:08,770 --> 00:38:10,530 But now we lose a clause and we could 712 00:38:10,530 --> 00:38:12,190 be going below the threshold. 713 00:38:12,190 --> 00:38:14,390 But then we look at the third clause. 714 00:38:14,390 --> 00:38:22,690 And what that does is-- so let's say 715 00:38:22,690 --> 00:38:25,250 we look at this specific clause, the one which 716 00:38:25,250 --> 00:38:27,450 said not xi or not xj. 717 00:38:27,450 --> 00:38:30,732 Note that this clause exists because i, 718 00:38:30,732 --> 00:38:32,636 j is not an edge in E. 719 00:38:32,636 --> 00:38:34,010 And therefore, by that condition, 720 00:38:34,010 --> 00:38:37,020 this clause exists in the set of clauses. 721 00:38:37,020 --> 00:38:40,450 So what was the value of this clause initially before? 722 00:38:40,450 --> 00:38:43,040 In the initial assignment, what was the value of this clause?
723 00:38:45,650 --> 00:38:48,460 Note that i and j are both in V dash. 724 00:38:48,460 --> 00:38:49,910 What was the value of this clause 725 00:38:49,910 --> 00:38:53,890 in the original assignment? 726 00:38:53,890 --> 00:38:56,044 Before we set xi to 0? 727 00:38:56,044 --> 00:38:58,210 What was the value of xi in the original assignment? 728 00:39:01,696 --> 00:39:03,072 AUDIENCE: 1. 729 00:39:03,072 --> 00:39:04,780 AMARTYA SHANKHA BISWAS: Yes, why is it 1? 730 00:39:04,780 --> 00:39:06,700 AUDIENCE: Because xi was in V dash. 731 00:39:06,700 --> 00:39:09,970 AMARTYA SHANKHA BISWAS: Yeah, so xi was in V dash, so it's 1. 732 00:39:09,970 --> 00:39:11,960 xj was also in V dash because that's 733 00:39:11,960 --> 00:39:13,290 the anomaly we saw, right? 734 00:39:13,290 --> 00:39:15,350 There was not an edge in the clique. 735 00:39:15,350 --> 00:39:17,650 So originally, this was 1 and this was 1. 736 00:39:17,650 --> 00:39:18,990 So this not was 0 and this not was 0. 737 00:39:18,990 --> 00:39:22,390 And 0 or 0 is-- this used to be 0. 738 00:39:22,390 --> 00:39:23,850 So what happens now though? 739 00:39:23,850 --> 00:39:26,130 Does it change? 740 00:39:26,130 --> 00:39:28,540 It changes to 1 because xi goes to 0. 741 00:39:28,540 --> 00:39:31,840 Now this thing becomes 1 and so it changed to 1. 742 00:39:31,840 --> 00:39:33,620 So there will be other clauses with xi, 743 00:39:33,620 --> 00:39:37,990 but realize that not of xi-- so if xi is changed to 0, not of xi 744 00:39:37,990 --> 00:39:39,040 is changing to 1. 745 00:39:39,040 --> 00:39:40,990 So whatever happens here, it will only 746 00:39:40,990 --> 00:39:43,740 increase the number of clauses that are being satisfied. 747 00:39:43,740 --> 00:39:47,330 So you lose only 1, but you gain at least 1. 748 00:39:47,330 --> 00:39:51,820 So overall, it does not go down.
749 00:40:12,480 --> 00:40:15,060 It does not decrease the number of satisfied clauses, 750 00:40:15,060 --> 00:40:16,330 so that's important. 751 00:40:16,330 --> 00:40:19,090 So what we did is we started with some satisfying assignment 752 00:40:19,090 --> 00:40:20,580 that we assumed existed. 753 00:40:20,580 --> 00:40:22,130 And then we just changed the variable 754 00:40:22,130 --> 00:40:24,800 and we said that there are still at least as many clauses 755 00:40:24,800 --> 00:40:26,440 being satisfied. 756 00:40:26,440 --> 00:40:27,200 So we did that. 757 00:40:27,200 --> 00:40:29,420 And now we have one less vertex that 758 00:40:29,420 --> 00:40:40,240 is violating-- so now we have one less vertex that is 759 00:40:40,240 --> 00:40:42,040 violating the clique property. 760 00:40:42,040 --> 00:40:45,560 So now we can take this step-- this setting xi to 0. 761 00:40:45,560 --> 00:40:46,560 We can just repeat this. 762 00:40:49,690 --> 00:40:51,790 So how long do we repeat this? 763 00:40:51,790 --> 00:40:55,152 So we repeat this till there are no longer any violations. 764 00:40:55,152 --> 00:40:57,360 So every time we find a violation-- so once we delete 765 00:40:57,360 --> 00:40:59,137 xi-- so let's say this is gone. 766 00:40:59,137 --> 00:41:00,720 So now we look for the next violation. 767 00:41:00,720 --> 00:41:02,909 The next violation is this edge. 768 00:41:02,909 --> 00:41:03,950 So there's no edge there. 769 00:41:03,950 --> 00:41:06,324 So we can either take this vertex-- so let's call this k. 770 00:41:06,324 --> 00:41:08,095 So we can either remove xk or remove xj. 771 00:41:08,095 --> 00:41:09,830 So we remove one of them. 772 00:41:09,830 --> 00:41:14,020 And once you remove that, you will end up with a 3-clique. 773 00:41:14,020 --> 00:41:15,530 So what happens is we repeat 774 00:41:15,530 --> 00:41:26,378 this until V dash is a clique. 775 00:41:29,290 --> 00:41:31,510 Does that make sense why you can do that?
776 00:41:31,510 --> 00:41:33,660 So you just keep deleting vertices 777 00:41:33,660 --> 00:41:34,800 until this is a clique. 778 00:41:34,800 --> 00:41:38,510 And the number-of-clauses condition is still satisfied. 779 00:41:38,510 --> 00:41:41,670 You will still have greater than or equal to that many clauses. 780 00:41:41,670 --> 00:41:44,370 So now let's look at what we have. 781 00:41:44,370 --> 00:41:56,660 We have V dash, which is a clique and xi is equal to-- so 782 00:41:56,660 --> 00:41:59,030 once you have done this process as many times 783 00:41:59,030 --> 00:42:02,656 as you need, when is xi equal to 1? 784 00:42:02,656 --> 00:42:04,780 So remember that-- so V dash is also being updated. 785 00:42:04,780 --> 00:42:07,857 Every time you set xi to 0, i is being removed from V dash. 786 00:42:07,857 --> 00:42:09,440 So that property is always satisfied-- 787 00:42:09,440 --> 00:42:12,560 that V dash is defined as the set of i with xi equal to 1. 788 00:42:12,560 --> 00:42:14,930 That property is an invariant. 789 00:42:14,930 --> 00:42:18,180 That means that even after doing these repetitions, 790 00:42:18,180 --> 00:42:24,180 you still have xi equal to 1, if and only if i 791 00:42:24,180 --> 00:42:30,960 is an element of V dash and 0 otherwise. 792 00:42:30,960 --> 00:42:33,760 So does it make sense why that property holds? 793 00:42:33,760 --> 00:42:37,350 So you have a clique and you have this assignment. 794 00:42:37,350 --> 00:42:38,980 So that should take you back to this. 795 00:42:38,980 --> 00:42:42,940 So remember where we took a clique of size k 796 00:42:42,940 --> 00:42:46,050 and we found that if you go through all the algebra, 797 00:42:46,050 --> 00:42:50,310 you will find something which is like you will go through 798 00:42:50,310 --> 00:42:54,330 and you will get mod of E bar plus mod of V plus k, 799 00:42:54,330 --> 00:42:55,540 right? 800 00:42:55,540 --> 00:42:57,460 So here you have a clique.
801 00:42:57,460 --> 00:42:58,960 So forget about what you did before. 802 00:42:58,960 --> 00:43:01,250 So just consider this is an assignment 803 00:43:01,250 --> 00:43:02,790 according to those rules. 804 00:43:02,790 --> 00:43:04,650 And those rules give you that you 805 00:43:04,650 --> 00:43:14,830 should get number of satisfied clauses 806 00:43:14,830 --> 00:43:22,729 is equal to mod of E bar plus mod of V plus mod of V dash. 807 00:43:22,729 --> 00:43:24,145 Realize that mod of V dash here is not necessarily k. 808 00:43:24,145 --> 00:43:24,860 Does that make sense? 809 00:43:24,860 --> 00:43:27,070 Because V dash is just-- so you started with some set. 810 00:43:27,070 --> 00:43:28,130 You started deleting some elements. 811 00:43:28,130 --> 00:43:30,129 You throw [INAUDIBLE] randomly, but you ended up 812 00:43:30,129 --> 00:43:31,100 with some V dash. 813 00:43:31,100 --> 00:43:34,575 And by this argument, you had mod of E bar plus mod 814 00:43:34,575 --> 00:43:37,150 of V plus mod of V dash-- E bar from this clause, 815 00:43:37,150 --> 00:43:39,560 V from this clause, V dash from this clause. 816 00:43:39,560 --> 00:43:40,970 So you did end up with this. 817 00:43:40,970 --> 00:43:43,800 And you know that, because what we showed here 818 00:43:43,800 --> 00:43:45,700 does not decrease the number of satisfied clauses, 819 00:43:45,700 --> 00:43:48,130 it's still greater than or equal to mod 820 00:43:48,130 --> 00:43:55,649 of E bar plus mod of V plus k. 821 00:43:55,649 --> 00:43:56,440 And we cancel this. 822 00:43:56,440 --> 00:43:58,760 You cancel this. 823 00:43:58,760 --> 00:44:05,320 And you get mod of V dash is greater than or equal to k, 824 00:44:05,320 --> 00:44:10,400 which means that you have a clique of size greater 825 00:44:10,400 --> 00:44:12,940 than or equal to k. 826 00:44:12,940 --> 00:44:13,440 Questions? 827 00:44:18,830 --> 00:44:19,990 Does that make sense? 828 00:44:19,990 --> 00:44:20,490 Really? 829 00:44:20,490 --> 00:44:23,181 All of it?
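The repair step from the reverse direction can be sketched in Python. This is a sketch under assumptions: it takes z = 1 (the without-loss-of-generality case from the lecture), represents an assignment simply as the set `ones` of vertices with xi = 1, assumes integer vertex labels, and uses the hypothetical names `satisfied_count` and `repair_to_clique`.

```python
from itertools import combinations


def satisfied_count(vertices, edges, ones):
    """Satisfied clauses of the reduction when z = 1 and x_i = 1 iff i in ones."""
    edge_set = {frozenset(e) for e in edges}
    non_edges = [p for p in combinations(sorted(vertices), 2)
                 if frozenset(p) not in edge_set]
    # (not x_i or not x_j) fails only when both endpoints are labelled 1.
    type1 = sum(1 for i, j in non_edges if not (i in ones and j in ones))
    type2 = len(vertices)  # (x_i or z) is always true when z = 1
    type3 = len(ones)      # (x_i or not z) is true only when x_i = 1
    return type1 + type2 + type3


def repair_to_clique(edges, ones):
    """Remove vertices from the 1-labelled set until it forms a clique."""
    edge_set = {frozenset(e) for e in edges}
    v_dash = set(ones)
    while True:
        # Look for a pair inside v_dash that is not joined by an edge.
        violation = next(
            ((i, j) for i, j in combinations(sorted(v_dash), 2)
             if frozenset((i, j)) not in edge_set),
            None,
        )
        if violation is None:
            return v_dash
        # Dropping one endpoint corresponds to setting its x_i to 0.
        v_dash.discard(violation[0])


# Hypothetical graph: triangle {2, 3, 4} plus the edge (1, 2).
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4), (3, 4)]
start = {1, 2, 3, 4}  # every x_i set to 1; not a clique
before = satisfied_count(vertices, edges, start)
clique = repair_to_clique(edges, start)
after = satisfied_count(vertices, edges, clique)
```

In this example removing vertex 1 loses one clause of the form (xi or not z) but gains the two non-edge clauses involving vertex 1, so the count goes from 8 to 9.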
830 00:44:23,181 --> 00:44:23,680 OK. 831 00:44:23,680 --> 00:44:24,880 That's good. 832 00:44:24,880 --> 00:44:26,850 In any case. 833 00:44:26,850 --> 00:44:29,650 So let's go back and see what we are doing here. 834 00:44:29,650 --> 00:44:33,690 So the way you are doing NP-hard reductions is 835 00:44:33,690 --> 00:44:36,880 you take a problem that you already know is hard, 836 00:44:36,880 --> 00:44:41,200 you take any arbitrary instance of that problem, 837 00:44:41,200 --> 00:44:45,100 and you transform the input into an input to the problem 838 00:44:45,100 --> 00:44:47,240 that you're trying to show is hard. 839 00:44:47,240 --> 00:44:49,470 So you take problem A, which you know is hard. 840 00:44:49,470 --> 00:44:53,260 You transform the input into an input for problem B. 841 00:44:53,260 --> 00:44:56,150 And then you show that if you can solve [INAUDIBLE] problem 842 00:44:56,150 --> 00:44:59,346 B, you will be able to solve problem A. 843 00:44:59,346 --> 00:45:01,095 But since you assume you can't solve problem 844 00:45:01,095 --> 00:45:03,280 A in polynomial-time, you conclude that you can't solve 845 00:45:03,280 --> 00:45:06,630 problem B in polynomial-time. 846 00:45:06,630 --> 00:45:08,770 And the other important thing to notice 847 00:45:08,770 --> 00:45:11,274 here is that the reduction needs to be polynomial-time. 848 00:45:11,274 --> 00:45:12,940 So look at this reduction, for instance. 849 00:45:12,940 --> 00:45:13,940 What are you doing here? 850 00:45:13,940 --> 00:45:17,060 You're taking every vertex. 851 00:45:17,060 --> 00:45:18,240 You're making a clause. 852 00:45:18,240 --> 00:45:19,670 And how many clauses do you have? 853 00:45:19,670 --> 00:45:22,640 Well, you have about n squared clauses here. 854 00:45:22,640 --> 00:45:25,030 You have order n clauses here and you have order n clauses here. 855 00:45:25,030 --> 00:45:27,840 So time to construct that is like roughly order n squared.
856 00:45:27,840 --> 00:45:30,310 So you're constructing the clause in polynomial-time. 857 00:45:30,310 --> 00:45:33,660 So you have a polynomial-time reduction. 858 00:45:33,660 --> 00:45:36,080 And if you reduce a known NP-hard problem 859 00:45:36,080 --> 00:45:38,210 in polynomial-time to an unknown problem, 860 00:45:38,210 --> 00:45:40,350 you can show that it is NP-hard. 861 00:45:40,350 --> 00:45:40,850 OK. 862 00:45:40,850 --> 00:45:44,880 I don't think we have time to do another problem, so we're done.
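As a rough check of the size bound mentioned above, the clause count can be written out as follows. The symbol n for the number of vertices is an assumption introduced here; the three terms correspond to the three clause types in the reduction.

```latex
\[
\#\text{clauses}
  \;=\; \underbrace{|\bar{E}|}_{\lnot x_i \lor \lnot x_j}
  \;+\; \underbrace{|V|}_{x_i \lor z}
  \;+\; \underbrace{|V|}_{x_i \lor \lnot z}
  \;\le\; \binom{n}{2} + 2n
  \;=\; O(n^2),
  \qquad n = |V|.
\]
```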