1 00:00:00,070 --> 00:00:02,430 The following content is provided under a Creative 2 00:00:02,430 --> 00:00:03,820 Commons license. 3 00:00:03,820 --> 00:00:06,050 Your support will help MIT OpenCourseWare 4 00:00:06,050 --> 00:00:10,150 continue to offer high-quality educational resources for free. 5 00:00:10,150 --> 00:00:12,700 To make a donation or to view additional materials 6 00:00:12,700 --> 00:00:16,600 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,600 --> 00:00:17,263 at ocw.mit.edu. 8 00:00:25,846 --> 00:00:26,720 PROFESSOR: All right. 9 00:00:26,720 --> 00:00:31,300 Today we continue our theme of approximation lower bounds, 10 00:00:31,300 --> 00:00:33,020 inapproximability. 11 00:00:33,020 --> 00:00:35,470 Quick recap of last time. 12 00:00:35,470 --> 00:00:39,530 We talked about lots of different reductions. 13 00:00:39,530 --> 00:00:44,900 We, I guess, in particular talked about PTAS, AP, and L. 14 00:00:44,900 --> 00:00:48,720 And in particular we'll be using L-reductions almost exclusively 15 00:00:48,720 --> 00:00:51,880 today, except the occasional strict reduction, which 16 00:00:51,880 --> 00:00:55,160 is even stronger, in a sense. 17 00:00:55,160 --> 00:00:57,117 So what's an L-reduction? 18 00:00:57,117 --> 00:00:59,450 We're trying to go from one problem A to another problem 19 00:00:59,450 --> 00:01:05,530 B. We're given an instance x of A. We convert it via function f 20 00:01:05,530 --> 00:01:10,790 to an instance x prime of B. Then we imagine that somehow we 21 00:01:10,790 --> 00:01:12,290 obtain a solution. 22 00:01:12,290 --> 00:01:16,030 We don't know anything about it. y prime to x prime. 23 00:01:16,030 --> 00:01:17,540 That's in B space. 
24 00:01:17,540 --> 00:01:19,210 And then, in the reduction, we're 25 00:01:19,210 --> 00:01:21,160 supposed to be able to map any such solution 26 00:01:21,160 --> 00:01:28,040 y prime to x prime via g into solution y of x in A problem-- 27 00:01:28,040 --> 00:01:31,560 so that's given by the function g-- such that two things hold. 28 00:01:31,560 --> 00:01:36,070 The first one is that for f, optimal solution 29 00:01:36,070 --> 00:01:39,520 of x prime should be at most some constant times 30 00:01:39,520 --> 00:01:41,930 the optimal solution to x. 31 00:01:41,930 --> 00:01:44,190 So we don't blow up OPTs too much. 32 00:01:44,190 --> 00:01:47,660 And secondly the absolute difference 33 00:01:47,660 --> 00:01:51,600 between the cost of y versus the optimal solution 34 00:01:51,600 --> 00:01:56,150 for x should be within a constant factor of this kind 35 00:01:56,150 --> 00:01:59,650 of gap-- additive gap between the cost of y 36 00:01:59,650 --> 00:02:04,210 prime versus the optimal solution to x prime, 37 00:02:04,210 --> 00:02:06,430 meaning that if we were given a y prime that's 38 00:02:06,430 --> 00:02:08,430 very close to optimal for x prime, then the y 39 00:02:08,430 --> 00:02:10,949 we produce is very close to optimal for x. 40 00:02:10,949 --> 00:02:14,446 And we want that in an additive sense 41 00:02:14,446 --> 00:02:17,070 that will imply that things are good in a multiplicative sense. 42 00:02:17,070 --> 00:02:20,180 Last time we proved that for the min case, 43 00:02:20,180 --> 00:02:21,490 for minimization problems. 44 00:02:21,490 --> 00:02:24,110 If you're curious, I worked out the details 45 00:02:24,110 --> 00:02:26,840 for maximization problems. 46 00:02:26,840 --> 00:02:29,670 It's a little bit uglier in terms of the arithmetic. 
47 00:02:29,670 --> 00:02:32,920 But you again get that if you had a constant factor 48 00:02:32,920 --> 00:02:35,970 approximation over here, you preserve a constant factor 49 00:02:35,970 --> 00:02:39,630 approximation over here, and you only-- you 50 00:02:39,630 --> 00:02:41,500 lose a reasonable factor. 51 00:02:44,530 --> 00:02:47,119 We also have that if you can get a PTAS over here, 52 00:02:47,119 --> 00:02:49,160 so you can get an arbitrarily good approximation, 53 00:02:49,160 --> 00:02:50,550 you also get a PTAS over here. 54 00:02:50,550 --> 00:02:52,380 That was the PTAS reduction. 55 00:02:52,380 --> 00:02:55,430 And it turns out the constant in the end is roughly 56 00:02:55,430 --> 00:02:59,710 epsilon over alpha beta, where alpha was this constant, 57 00:02:59,710 --> 00:03:02,270 and beta was this constant. 58 00:03:02,270 --> 00:03:03,564 That's what we had before. 59 00:03:03,564 --> 00:03:04,730 It's a little bit different. 60 00:03:04,730 --> 00:03:06,380 For small epsilon, it's about the same. 61 00:03:06,380 --> 00:03:09,310 But for large epsilon, it does make a difference. 62 00:03:09,310 --> 00:03:13,470 And this is why, in case you were confused, 63 00:03:13,470 --> 00:03:17,230 an L-reduction does not imply in the maximization 64 00:03:17,230 --> 00:03:21,030 case an AP-reduction, because you have this non-linear term. 65 00:03:21,030 --> 00:03:23,680 Here, everything was linear in epsilon. 66 00:03:23,680 --> 00:03:26,320 With minimization, that's true. 67 00:03:26,320 --> 00:03:27,780 The L implies AP. 68 00:03:27,780 --> 00:03:30,350 But for maximization it's not quite true. 69 00:03:30,350 --> 00:03:32,560 It's close. 70 00:03:32,560 --> 00:03:34,270 So there's some funny. 71 00:03:34,270 --> 00:03:36,270 What I said didn't quite match this picture. 72 00:03:36,270 --> 00:03:38,920 That's an explanation. 73 00:03:38,920 --> 00:03:42,280 And then we did a few reductions. 
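The two L-reduction conditions recapped above can be written out as follows. This is a sketch in notation chosen here for illustration (cost_A, cost_B, and delta are not symbols used in the lecture); alpha and beta are the two constants mentioned above, and only the minimization case is worked out.

```latex
% L-reduction from A to B: f maps instances, g maps solutions back.
\begin{align*}
x' &= f(x), \qquad y = g(y') \\
\mathrm{OPT}_B(x') &\le \alpha \cdot \mathrm{OPT}_A(x) \\
\bigl|\mathrm{cost}_A(y) - \mathrm{OPT}_A(x)\bigr|
  &\le \beta \cdot \bigl|\mathrm{cost}_B(y') - \mathrm{OPT}_B(x')\bigr|
\end{align*}
% Minimization case: suppose cost_B(y') <= (1 + \delta) OPT_B(x'). Then
\begin{align*}
\mathrm{cost}_A(y)
  &\le \mathrm{OPT}_A(x) + \beta\,\delta\,\mathrm{OPT}_B(x') \\
  &\le (1 + \alpha\beta\delta)\,\mathrm{OPT}_A(x)
\end{align*}
% so choosing \delta = \varepsilon / (\alpha\beta) gives a (1 + \varepsilon)-approximation for A.
```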
74 00:03:42,280 --> 00:03:47,680 I claimed that Max E3SAT-E5, this was exactly 75 00:03:47,680 --> 00:03:49,650 three distinct literals per clause, 76 00:03:49,650 --> 00:03:54,790 exactly five occurrences of each variable in five 77 00:03:54,790 --> 00:03:57,150 different clauses. 78 00:03:57,150 --> 00:03:58,490 I claimed that was APX-complete. 79 00:03:58,490 --> 00:04:00,130 We didn't prove it. 80 00:04:00,130 --> 00:04:02,810 What we did prove is that assuming Max 3SAT 81 00:04:02,810 --> 00:04:06,940 is APX-complete, we reduce that to Max 3SAT3, 82 00:04:06,940 --> 00:04:10,240 which is at most three occurrences, each thing, 83 00:04:10,240 --> 00:04:12,200 first by using expander, and then 84 00:04:12,200 --> 00:04:14,620 splitting the constant size-- constant occurrence 85 00:04:14,620 --> 00:04:18,190 variables-- with the cycle of implications trick. 86 00:04:18,190 --> 00:04:21,019 And then we reduced from that to bounded degree. 87 00:04:21,019 --> 00:04:23,800 I think we did like max degree 4. 88 00:04:23,800 --> 00:04:27,320 But all of these can be done in max degree 3. 89 00:04:27,320 --> 00:04:30,951 Independent set, vertex cover, and dominating set. 90 00:04:30,951 --> 00:04:32,200 Vertex cover we've seen a lot. 91 00:04:32,200 --> 00:04:33,940 You want to cover all the edges by choosing vertices. 92 00:04:33,940 --> 00:04:35,650 Dominating set, you want to cover all the vertices 93 00:04:35,650 --> 00:04:36,630 by choosing vertices. 94 00:04:36,630 --> 00:04:38,610 Each vertex covers its neighbor set. 95 00:04:38,610 --> 00:04:43,556 And independent set, for general graphs this is super hard. 96 00:04:43,556 --> 00:04:45,055 But for bounded degree graphs, there 97 00:04:45,055 --> 00:04:46,950 is a constant factor approximation. 98 00:04:46,950 --> 00:04:50,750 This was choosing vertices that induced no edges. 
99 00:04:50,750 --> 00:04:55,850 So with that in mind, let's do some more APX-reductions, 100 00:04:55,850 --> 00:04:58,030 APX-hardness, using L-reductions. 101 00:05:00,600 --> 00:05:06,450 So the next problem we're going to do is Max 2SAT. 102 00:05:12,110 --> 00:05:15,470 So because we're in the world of optimization, in some sense 103 00:05:15,470 --> 00:05:19,420 the distinction between 2SAT and 3SAT is not so important. 104 00:05:19,420 --> 00:05:22,340 It turns out Max 2SAT will be APX-complete 105 00:05:22,340 --> 00:05:24,230 just like Max 3SAT was. 106 00:05:24,230 --> 00:05:25,877 So when we didn't have Max, of course 107 00:05:25,877 --> 00:05:27,460 the complexities were quite different. 108 00:05:27,460 --> 00:05:29,847 3SAT was hard, 2SAT was easy. 109 00:05:29,847 --> 00:05:31,430 With maximization, they're going to be 110 00:05:31,430 --> 00:05:34,460 equivalent in this perspective. 111 00:05:34,460 --> 00:05:43,150 So I'm going to do an L-reduction 112 00:05:43,150 --> 00:05:52,250 from independent set of, let's say, degree 3. 113 00:05:55,275 --> 00:05:56,900 So it'll work with any constant degree, 114 00:05:56,900 --> 00:05:59,260 but we'll get a different number of occurrences. 115 00:05:59,260 --> 00:06:01,710 And the reduction is the following. 116 00:06:01,710 --> 00:06:05,240 There are two types of gadgets: one for vertices, one for edges. 117 00:06:05,240 --> 00:06:07,280 So I'm given an independent set instance. 118 00:06:07,280 --> 00:06:11,240 For every vertex v, we're going to convert that 119 00:06:11,240 --> 00:06:19,484 into a clause-- namely v. I want v to be true, if possible. 120 00:06:19,484 --> 00:06:21,150 It's a funny way of thinking when you're 121 00:06:21,150 --> 00:06:23,608 maximizing the number of clauses, because a lot of the clauses 122 00:06:23,608 --> 00:06:24,490 won't be satisfied. 123 00:06:24,490 --> 00:06:27,240 But you're going to try to put v in the independent set 124 00:06:27,240 --> 00:06:28,040 if you can. 
125 00:06:28,040 --> 00:06:30,060 That's the meaning of that clause. 126 00:06:30,060 --> 00:06:34,420 Then for every edge-- let's say connecting v 127 00:06:34,420 --> 00:06:38,070 to w-- we're going to convert that into a clause which 128 00:06:38,070 --> 00:06:43,220 is not v or not w. 129 00:06:43,220 --> 00:06:46,200 We don't want them both to be in the independent set. 130 00:06:46,200 --> 00:06:48,794 That's the meaning of-- yeah. 131 00:06:48,794 --> 00:06:50,460 I'm trying to simulate independent sets. 132 00:06:50,460 --> 00:06:52,200 So I don't want these both to be in. 133 00:06:52,200 --> 00:06:55,220 This is a 2SAT clause. 134 00:06:55,220 --> 00:06:56,950 So what's the claim here? 135 00:06:56,950 --> 00:07:00,800 Suppose you have some assignment to the variables. 136 00:07:00,800 --> 00:07:03,360 So there's one variable per vertex over here. 137 00:07:03,360 --> 00:07:07,900 The idea is that variable should indicate whether the vertex is 138 00:07:07,900 --> 00:07:10,450 in the independent set. 139 00:07:10,450 --> 00:07:13,290 And the claim is that we will never 140 00:07:13,290 --> 00:07:18,315 violate an edge constraint, or it's never useful to violate one. 141 00:07:18,315 --> 00:07:21,050 The claim is that there exists an OPT-- optimal 142 00:07:21,050 --> 00:07:29,075 solution-- satisfying all of these edge constraints. 143 00:07:32,021 --> 00:07:33,020 So we're doing Max 2SAT. 144 00:07:33,020 --> 00:07:35,610 So we get a point for every one of these things 145 00:07:35,610 --> 00:07:37,670 that we satisfy. 146 00:07:37,670 --> 00:07:40,790 And so in particular, if you didn't 147 00:07:40,790 --> 00:07:45,530 get this point-- not v or not w-- the negation of this 148 00:07:45,530 --> 00:07:48,740 is that they are both in. 149 00:07:48,740 --> 00:07:50,570 Then the idea is that you instead 150 00:07:50,570 --> 00:07:54,510 take one of those vertices out of the independent set, 151 00:07:54,510 --> 00:07:57,484 and that will be at least as good for you. 
152 00:07:57,484 --> 00:07:59,900 In general, when you put a vertex in the independent set, 153 00:07:59,900 --> 00:08:02,462 it only helps you for one clause. 154 00:08:02,462 --> 00:08:04,170 There's only one occurrence of positive v 155 00:08:04,170 --> 00:08:05,500 in all of these things. 156 00:08:05,500 --> 00:08:08,710 You might have many edges coming into a vertex, 157 00:08:08,710 --> 00:08:12,350 and they all prefer the case that v is false. 158 00:08:12,350 --> 00:08:15,220 So things are going to be easier if you set v to false. 159 00:08:15,220 --> 00:08:18,420 So if you discover a clause like this, which is currently false, 160 00:08:18,420 --> 00:08:20,880 meaning both v and w are true, you're 161 00:08:20,880 --> 00:08:24,170 going to gain a point by setting v to false. 162 00:08:24,170 --> 00:08:26,564 You'll also lose a point, but you'll only lose one point. 163 00:08:26,564 --> 00:08:27,980 Potentially, you gain many points, 164 00:08:27,980 --> 00:08:31,620 but you gain at least one point and lose at most one point 165 00:08:31,620 --> 00:08:36,240 by switching from both v and w true into just one 166 00:08:36,240 --> 00:08:37,490 of them true. 167 00:08:37,490 --> 00:08:41,840 So you can always convert without losing anything in OPT 168 00:08:41,840 --> 00:08:44,900 into a solution that satisfies all edge constraints. 169 00:08:44,900 --> 00:08:48,000 And then we know we have an independent set. 170 00:08:48,000 --> 00:08:50,070 That's what the edge constraints say. 171 00:08:50,070 --> 00:08:53,006 And therefore the remaining problem 172 00:08:53,006 --> 00:08:54,630 is to maximize the number of vertices that 173 00:08:54,630 --> 00:08:55,853 are in the independent set. 174 00:09:04,350 --> 00:09:08,820 So that means if we're given any solution y prime to this Max 175 00:09:08,820 --> 00:09:11,915 2SAT instance, we can convert it back to an independent set. 176 00:09:11,915 --> 00:09:14,970 Now it's not quite of the same value. 
177 00:09:14,970 --> 00:09:20,700 In general, the optimal solution here for the 2SAT instance 178 00:09:20,700 --> 00:09:23,080 is going to be the optimal solution 179 00:09:23,080 --> 00:09:28,197 for the independent set instance plus the total number of edges, 180 00:09:28,197 --> 00:09:30,030 because we're going to satisfy all of these. 181 00:09:30,030 --> 00:09:32,030 That's what we just showed. 182 00:09:32,030 --> 00:09:36,410 So this is where we get a kind of additive behavior, 183 00:09:36,410 --> 00:09:39,060 like in this L-reduction. 184 00:09:39,060 --> 00:09:40,800 The gap is an additive thing. 185 00:09:40,800 --> 00:09:42,310 But here it's a nice fixed thing. 186 00:09:42,310 --> 00:09:46,080 And so these are pretty much the same. 187 00:09:46,080 --> 00:09:48,160 There's just this additive offset. 188 00:09:48,160 --> 00:09:51,110 So that's going to be fine in terms of the second property. 189 00:09:51,110 --> 00:09:54,310 The additive difference between one of these solutions and OPT 190 00:09:54,310 --> 00:09:55,460 will be exactly the same. 191 00:09:55,460 --> 00:09:58,017 The beta here, this constant, will be 1. 192 00:09:58,017 --> 00:10:00,100 But we do have to worry about the first condition. 193 00:10:00,100 --> 00:10:02,641 We need to make sure OPT doesn't blow up too much, because we 194 00:10:02,641 --> 00:10:04,490 did make it bigger. 195 00:10:04,490 --> 00:10:08,790 So for that, all we need is that this is 196 00:10:08,790 --> 00:10:12,200 Omega of the number of vertices. 197 00:10:12,200 --> 00:10:16,840 And that's because we assumed our graph had bounded degree, 198 00:10:16,840 --> 00:10:19,610 and so we can always find an independent set of size 199 00:10:19,610 --> 00:10:23,190 something like n over constant. 200 00:10:23,190 --> 00:10:24,610 So because that's already linear, 201 00:10:24,610 --> 00:10:26,630 we only added another linear thing. 202 00:10:26,630 --> 00:10:32,120 And also this is order the number of vertices. 
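The independent set to Max 2SAT reduction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a literal is encoded as a (variable, sign) pair with sign True for a positive literal; the function names are made up here and are not from the lecture. The repair step mirrors the argument that setting one endpoint of a violated edge to false never loses points.

```python
from itertools import product

def independent_set_to_max2sat(vertices, edges):
    """f: one unit clause (v) per vertex, one clause (not v or not w) per edge."""
    clauses = [((v, True),) for v in vertices]
    clauses += [((v, False), (w, False)) for v, w in edges]
    return clauses

def num_satisfied(clauses, assignment):
    """Count clauses that contain at least one true literal."""
    return sum(
        any(assignment[var] == sign for var, sign in clause)
        for clause in clauses
    )

def assignment_to_independent_set(edges, assignment):
    """g: for every edge with both endpoints true, set one endpoint false."""
    fixed = dict(assignment)
    for v, w in edges:
        if fixed[v] and fixed[w]:
            fixed[v] = False  # loses clause (v), gains clause (not v or not w)
    return {v for v, value in fixed.items() if value}

def brute_force_opt(vertices, clauses):
    """Exhaustive Max 2SAT optimum; only sensible for tiny instances."""
    return max(
        num_satisfied(clauses, dict(zip(vertices, bits)))
        for bits in product([False, True], repeat=len(vertices))
    )
```

On a three-vertex path a-b-c, the largest independent set has size 2 and there are 2 edges, so the brute-force Max 2SAT optimum should come out as 2 + 2, matching the "OPT plus number of edges" relation stated above.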
203 00:10:32,120 --> 00:10:35,200 So we're not adding too much relative to this, 204 00:10:35,200 --> 00:10:37,110 because of bounded degree. 205 00:10:37,110 --> 00:10:37,900 Cool? 206 00:10:37,900 --> 00:10:40,460 So that's Max 2SAT, APX-hardness. 207 00:10:48,110 --> 00:10:53,190 Fun fact which I won't prove. 208 00:10:53,190 --> 00:11:03,050 Max E2SAT-E3 is also APX-complete. 209 00:11:03,050 --> 00:11:07,030 So here we got some bounded number of occurrences. 210 00:11:07,030 --> 00:11:10,680 I guess each variable is going to appear in one 211 00:11:10,680 --> 00:11:13,060 plus three, four clauses. 212 00:11:13,060 --> 00:11:16,090 You can get that down to three clauses per variable. 213 00:11:21,150 --> 00:11:21,650 OK. 214 00:11:27,810 --> 00:11:29,630 Now that we have Max 2SAT, we can 215 00:11:29,630 --> 00:11:36,105 do another one, which is Max not all equal 3SAT. 216 00:11:40,580 --> 00:11:48,060 So from SAT-land, we have 3SAT, not all equal 3SAT, 217 00:11:48,060 --> 00:11:49,710 and 1-in-3SAT. 218 00:11:49,710 --> 00:11:51,230 We're going to get all of those. 219 00:11:51,230 --> 00:11:53,320 Actually, we can even get 1-in-2SAT. 220 00:11:53,320 --> 00:11:55,160 Little bit stronger. 221 00:11:55,160 --> 00:11:58,260 But let's do not all equal 3SAT. 222 00:11:58,260 --> 00:12:02,870 So here we are going to do, I believe, 223 00:12:02,870 --> 00:12:15,920 a strict reduction from Max 2SAT, which we just proved 224 00:12:15,920 --> 00:12:18,510 APX-complete. 225 00:12:18,510 --> 00:12:20,230 Yeah. 226 00:12:20,230 --> 00:12:22,655 It's again in APX, because you can, say, take 227 00:12:22,655 --> 00:12:24,330 a random assignment, and you'll 228 00:12:24,330 --> 00:12:28,100 satisfy some constant fraction of the clauses. 229 00:12:28,100 --> 00:12:30,710 And OK. 230 00:12:30,710 --> 00:12:32,630 So here's the reduction. 231 00:12:32,630 --> 00:12:34,370 Again, very easy. 
232 00:12:34,370 --> 00:12:36,294 Suppose we're starting from Max 2SAT, 233 00:12:36,294 --> 00:12:37,710 so all our clauses look like this. 234 00:12:37,710 --> 00:12:40,380 These may be negated or not. 235 00:12:40,380 --> 00:12:50,700 And we're going to convert it into not all equal of x, y, 236 00:12:50,700 --> 00:12:52,056 and a. 237 00:12:52,056 --> 00:12:57,250 a is a new variable, and it appears in every single clause. 238 00:12:57,250 --> 00:12:57,750 OK? 239 00:12:57,750 --> 00:12:58,791 So this is kind of funny. 240 00:13:01,980 --> 00:13:04,880 So a appears everywhere. 241 00:13:04,880 --> 00:13:06,990 And not all equal has this nice symmetry, right? 242 00:13:06,990 --> 00:13:08,240 There wasn't really a zero or one. 243 00:13:08,240 --> 00:13:09,990 You can think of them as red and blue. 244 00:13:09,990 --> 00:13:12,910 Doesn't matter whether red is true or blue is true. 245 00:13:12,910 --> 00:13:15,130 So in particular, we can use that symmetry 246 00:13:15,130 --> 00:13:18,530 to consider a as false. 247 00:13:18,530 --> 00:13:21,210 So by possibly flipping everything, 248 00:13:21,210 --> 00:13:25,080 we can imagine that a equals zero. 249 00:13:25,080 --> 00:13:29,242 If not, flip all the bits, and you'll still be not all equal. 250 00:13:29,242 --> 00:13:31,700 Or all the things that were not all equal before will still 251 00:13:31,700 --> 00:13:32,408 be not all equal. 252 00:13:32,408 --> 00:13:34,350 You'll preserve OPT. 253 00:13:34,350 --> 00:13:38,540 Now once you think of a as false, then not all equal 254 00:13:38,540 --> 00:13:41,790 is saying that these are not both 0, which is 255 00:13:41,790 --> 00:13:44,010 the same thing as the 2SAT clause. 256 00:13:44,010 --> 00:13:45,890 Duh. 257 00:13:45,890 --> 00:13:46,510 OK. 258 00:13:46,510 --> 00:13:49,370 Again, I mean this is saying OPT is preserved. 
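The clause conversion and the flip-so-that-a-is-false step can be sketched as below. This is an illustrative sketch only: literals are (variable, sign) pairs, the shared fresh variable is given the hypothetical name "_a" (assumed not to clash with any original variable), and every 2SAT clause is assumed to have exactly two literals.

```python
from itertools import product

FRESH = "_a"  # hypothetical name for the shared new variable a

def lit_value(lit, assignment):
    var, sign = lit
    return assignment[var] == sign

def max2sat_to_nae3sat(clauses):
    """f: (l1 or l2) becomes NAE(l1, l2, a), with the same a in every clause."""
    return [(l1, l2, (FRESH, True)) for l1, l2 in clauses]

def sat2_satisfied(clauses, assignment):
    """Number of 2SAT clauses with at least one true literal."""
    return sum(any(lit_value(l, assignment) for l in c) for c in clauses)

def nae_satisfied(nae_clauses, assignment):
    """Number of clauses whose literals take both truth values."""
    return sum(
        len({lit_value(l, assignment) for l in c}) == 2
        for c in nae_clauses
    )

def recover_2sat_assignment(assignment):
    """g: flip every bit if a is true, then drop a."""
    if assignment[FRESH]:
        assignment = {v: not b for v, b in assignment.items()}
    return {v: b for v, b in assignment.items() if v != FRESH}
```

With a forced to false, NAE(l1, l2, a) holds exactly when l1 and l2 are not both false, so the recovered assignment satisfies the same number of clauses as the not-all-equal assignment did.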
259 00:13:49,370 --> 00:13:51,560 But if you take any solution to this problem, 260 00:13:51,560 --> 00:13:54,100 you first possibly flip it so that a is zero, 261 00:13:54,100 --> 00:13:57,910 and then the x's and y's are just exactly the x's and y's over here, 262 00:13:57,910 --> 00:14:00,450 and you'll preserve the size of the solution. 263 00:14:00,450 --> 00:14:02,420 You won't get any scale here, and you also 264 00:14:02,420 --> 00:14:04,840 preserved OPT exactly. 265 00:14:04,840 --> 00:14:06,750 So it's in particular an L-reduction, 266 00:14:06,750 --> 00:14:08,750 but it's even a strict reduction. 267 00:14:08,750 --> 00:14:10,820 Didn't lose anything. 268 00:14:10,820 --> 00:14:14,110 No additive slop or whatever. 269 00:14:14,110 --> 00:14:14,750 OK. 270 00:14:14,750 --> 00:14:16,430 That's nice. 271 00:14:16,430 --> 00:14:21,625 Next is usually called Max-Cut. 272 00:14:24,690 --> 00:14:25,760 You're given a graph. 273 00:14:25,760 --> 00:14:27,550 You want to split it into two parts 274 00:14:27,550 --> 00:14:31,660 to maximize the number of edges between the two parts. 275 00:14:31,660 --> 00:14:43,100 But this is the same thing as max positive 1-in-2SAT, 276 00:14:43,100 --> 00:14:46,360 which is simpler than 1-in-3SAT. 277 00:14:46,360 --> 00:14:51,860 You have, I mean, in a cut, again, you have two sides. 278 00:14:51,860 --> 00:14:54,320 Call them true or false, or red and blue, or whatever. 279 00:14:54,320 --> 00:14:57,800 You would like to assign exactly one endpoint of an edge to be true. 280 00:14:57,800 --> 00:15:00,010 Then that edge will be in the cut. 281 00:15:00,010 --> 00:15:01,630 So it's the same problem. 282 00:15:01,630 --> 00:15:10,440 And you can also think of it as max positive XOR-SAT. 283 00:15:10,440 --> 00:15:12,320 Maybe actually call it 2XOR-SAT. 284 00:15:15,320 --> 00:15:15,890 Same thing. 285 00:15:15,890 --> 00:15:19,100 It's just every constraint is of the form this XOR this. 
286 00:15:19,100 --> 00:15:21,480 You want to maximize the number of those constraints that are satisfied. 287 00:15:21,480 --> 00:15:23,510 So a lot of these problems have different formulations 288 00:15:23,510 --> 00:15:25,551 depending on whether you're thinking about logic, 289 00:15:25,551 --> 00:15:27,970 or thinking about a graph problem. 290 00:15:27,970 --> 00:15:31,890 So we're going to get all four of these with one reduction. 291 00:15:31,890 --> 00:15:35,010 And it's going to be from probably this one. 292 00:15:35,010 --> 00:15:36,430 Yes. 293 00:15:36,430 --> 00:15:38,260 The great chain of reductions here. 294 00:15:47,320 --> 00:15:51,450 So we're going to reduce from Max not all equal 3SAT. 295 00:15:54,060 --> 00:15:58,130 I should mention, all of the reductions we've been seeing, 296 00:15:58,130 --> 00:16:01,260 including this initial batch where we started from 3SAT, 297 00:16:01,260 --> 00:16:02,902 converted into 3SAT-3, converted it 298 00:16:02,902 --> 00:16:04,610 into independent set, to vertex cover, 299 00:16:04,610 --> 00:16:09,460 to dominating set, to Max 2SAT, to Max not all equal 3SAT, 300 00:16:09,460 --> 00:16:12,870 to Max-Cut, are all in this seminal paper by Papadimitriou 301 00:16:12,870 --> 00:16:15,350 and Yannakakis, 1991. 302 00:16:15,350 --> 00:16:18,826 This is before APX was really a thing. 303 00:16:18,826 --> 00:16:20,450 It had a different name at that point-- 304 00:16:20,450 --> 00:16:24,530 Max SNP-- which was later proved to be essentially equal to APX, 305 00:16:24,530 --> 00:16:26,710 or the completeness version is the same. 306 00:16:26,710 --> 00:16:29,019 You don't need to know about that. 307 00:16:29,019 --> 00:16:31,310 It comes from a different world, but all the reductions 308 00:16:31,310 --> 00:16:32,570 apply here. 309 00:16:32,570 --> 00:16:36,230 So here is the reduction for Max-Cut. 310 00:16:36,230 --> 00:16:39,040 So again we're trying to simulate Max 311 00:16:39,040 --> 00:16:40,720 not all equal 3SAT. 
312 00:16:40,720 --> 00:16:44,850 Now we actually saw in the planar lecture, planar 3SAT, 313 00:16:44,850 --> 00:16:49,650 that you can reduce planar not all equal 3SAT to planar 314 00:16:49,650 --> 00:16:53,470 Max-Cut, and that we use that to get a polynomial time algorithm 315 00:16:53,470 --> 00:16:55,340 for planar not all equal 3SAT. 316 00:16:55,340 --> 00:16:57,190 We're just going to do the reverse. 317 00:16:57,190 --> 00:17:00,930 And if you recall, this was the heart of that reduction. 318 00:17:00,930 --> 00:17:03,810 The point is that you can represent 319 00:17:03,810 --> 00:17:08,800 a not all equal clause as a cut, as a Max-Cut problem 320 00:17:08,800 --> 00:17:09,599 on a triangle. 321 00:17:09,599 --> 00:17:11,864 Because in a triangle, either they're all equal, 322 00:17:11,864 --> 00:17:14,849 and then there are no cut edges, or they're not all equal, 323 00:17:14,849 --> 00:17:17,520 and then there are exactly two cut edges. 324 00:17:17,520 --> 00:17:19,079 So that's for a clause of size 3. 325 00:17:19,079 --> 00:17:22,050 We also need to handle the case of a clause of size 2. 326 00:17:22,050 --> 00:17:24,905 But that's a two-gon, I guess, instead of a triangle. 327 00:17:24,905 --> 00:17:26,030 It works the same way here. 328 00:17:26,030 --> 00:17:29,950 You get 1 if they're not all equal, and zero otherwise. 329 00:17:29,950 --> 00:17:32,120 This is shown as the zero case. 330 00:17:32,120 --> 00:17:32,620 OK. 331 00:17:32,620 --> 00:17:35,910 Now the one thing we need, because not all equal 3SAT 332 00:17:35,910 --> 00:17:39,300 here, we need negation. 333 00:17:39,300 --> 00:17:45,480 So we're going to build each variable and its negation 334 00:17:45,480 --> 00:17:46,350 with this gadget. 335 00:17:46,350 --> 00:17:48,220 This is a new gadget, variable gadget. 336 00:17:48,220 --> 00:17:52,790 It's just a whole bunch of edges connecting xi and xi bar. 337 00:17:52,790 --> 00:17:53,900 And you can make this. 
338 00:17:53,900 --> 00:17:56,570 You can avoid the multigraph aspect here. 339 00:17:56,570 --> 00:17:59,530 But let's not worry about it here. 340 00:17:59,530 --> 00:18:03,950 So in general, if there are k occurrences of this variable, 341 00:18:03,950 --> 00:18:07,370 then we're going to have 2k parallel edges, 342 00:18:07,370 --> 00:18:11,480 because the cost over here, the potential benefit here is 2. 343 00:18:11,480 --> 00:18:14,730 Again, we want to argue that if we take an optimal solution, 344 00:18:14,730 --> 00:18:18,550 we can make it another optimal solution where xi and xi 345 00:18:18,550 --> 00:18:21,706 bar are on opposite sides of the cut. 346 00:18:21,706 --> 00:18:23,830 And the reason is, if they're both on the same side 347 00:18:23,830 --> 00:18:27,690 of the cut, you're not getting this benefit. 348 00:18:27,690 --> 00:18:29,930 If you flip one of the sides, you 349 00:18:29,930 --> 00:18:32,150 get this huge benefit, which is 2k. 350 00:18:32,150 --> 00:18:33,960 And you say, well, how much do I lose 351 00:18:33,960 --> 00:18:37,470 if I flip this from one side of the cut to the other. 352 00:18:37,470 --> 00:18:41,590 Well, it appears in at most k different clauses, each of them 353 00:18:41,590 --> 00:18:43,480 gives me at most two points. 354 00:18:43,480 --> 00:18:45,970 So I'm losing, at most, 2k points 355 00:18:45,970 --> 00:18:47,330 by making these opposite. 356 00:18:47,330 --> 00:18:48,410 But I gain 2k points. 357 00:18:48,410 --> 00:18:50,990 So it never hurts me to do that switch. 358 00:18:50,990 --> 00:18:53,560 So I can assume these two guys are on opposite sides, 359 00:18:53,560 --> 00:18:56,540 and therefore I can assume it's sort of validly doing 360 00:18:56,540 --> 00:18:57,740 the negation part. 361 00:18:57,740 --> 00:19:01,810 And then it just reduces to not all equal 3SAT. 
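The clause and variable gadgets can be put together in a short Python sketch. It assumes literals are (variable, sign) pairs that double as graph vertices, keeps the parallel edges as repeated entries in an edge list (the multigraph version mentioned above), and uses a doubled edge for a size-2 clause; function names are illustrative and not from the lecture.

```python
from collections import Counter
from itertools import product

def nae_to_maxcut(clauses):
    """Build the multigraph: clause gadgets plus 2k parallel edges per variable."""
    edges = []
    occurrences = Counter(var for clause in clauses for var, _ in clause)
    for clause in clauses:
        if len(clause) == 3:
            a, b, c = clause
            edges += [(a, b), (b, c), (a, c)]  # triangle: 2 cut edges iff not all equal
        else:
            a, b = clause
            edges += [(a, b), (a, b)]  # two-gon: 2 cut edges iff the literals differ
    for var, k in occurrences.items():
        # variable gadget: 2k parallel edges between x and its negation
        edges += [((var, True), (var, False))] * (2 * k)
    return edges

def cut_value(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(side[u] != side[v] for u, v in edges)

def brute_force_max_cut(edges):
    """Exhaustive Max-Cut; only sensible for tiny graphs."""
    nodes = sorted({u for edge in edges for u in edge})
    return max(
        cut_value(edges, dict(zip(nodes, bits)))
        for bits in product([0, 1], repeat=len(nodes))
    )
```

For the two clauses NAE(x, y, z) and NAE(x, not y), the variable gadgets contribute 4 + 4 + 2 = 10 edges, and both clauses can be satisfied at once, so the maximum cut should be 10 + 2 * 2 = 14 out of 15 edges.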
362 00:19:01,810 --> 00:19:04,770 There's a difference between this one, where we only 363 00:19:04,770 --> 00:19:07,250 get one point, and this one, where we get two points. 364 00:19:07,250 --> 00:19:09,212 AUDIENCE: You get two points. 365 00:19:09,212 --> 00:19:10,670 PROFESSOR: You get two points here? 366 00:19:10,670 --> 00:19:10,910 Oh yeah. 367 00:19:10,910 --> 00:19:11,701 You get two points. 368 00:19:11,701 --> 00:19:14,820 That's why we doubled the edge. 369 00:19:14,820 --> 00:19:16,747 So that's cool. 370 00:19:16,747 --> 00:19:17,830 I think you would be fine. 371 00:19:17,830 --> 00:19:20,121 It'd still be an L-reduction even if you have one edge. 372 00:19:20,121 --> 00:19:21,690 But this is nicer. 373 00:19:21,690 --> 00:19:23,410 And yeah. 374 00:19:23,410 --> 00:19:24,750 That's it. 375 00:19:24,750 --> 00:19:25,250 Cool. 376 00:19:25,250 --> 00:19:28,050 This is Max-Cut. 377 00:19:28,050 --> 00:19:32,020 It will be bounded degree based 378 00:19:32,020 --> 00:19:34,470 on the number of occurrences we got, which was like four. 379 00:19:34,470 --> 00:19:37,600 I mean, we can use three, and then we'll multiply. 380 00:19:37,600 --> 00:19:42,270 In general you can prove Max-Cut remains APX-complete 381 00:19:42,270 --> 00:19:45,440 for degree three graphs. 382 00:19:45,440 --> 00:19:47,630 We're not going to prove it here. 383 00:19:47,630 --> 00:19:51,600 It's another kind of reduction trick to reduce degrees. I'll just 384 00:19:51,600 --> 00:19:55,680 say degree 3 is possible. 385 00:19:55,680 --> 00:20:03,690 So also Max-Cut in degree 3 graphs is APX-complete. 386 00:20:03,690 --> 00:20:10,100 So you could call that max positive 1-in-2SAT, hyphen 3. 387 00:20:10,100 --> 00:20:10,740 Maybe even E3. 388 00:20:13,580 --> 00:20:14,175 All right. 389 00:20:16,690 --> 00:20:18,587 So this gives you a flavor. 390 00:20:18,587 --> 00:20:20,420 This is a fun series of reductions, each one 391 00:20:20,420 --> 00:20:22,150 building on the previous one. 
392 00:20:22,150 --> 00:20:24,730 But it gives you a kind of starting point. 393 00:20:24,730 --> 00:20:27,310 A lot of the problems we're familiar with in NP 394 00:20:27,310 --> 00:20:29,480 completeness land, if you just add "Max" in front, 395 00:20:29,480 --> 00:20:32,930 they become hard. 396 00:20:32,930 --> 00:20:35,960 I mean I guess Max-Cut always had a Max in front. 397 00:20:35,960 --> 00:20:38,850 Max 2SAT for NP completeness, we also had a Max in front. 398 00:20:38,850 --> 00:20:41,341 So those are familiar, and they're APX-complete. 399 00:20:41,341 --> 00:20:42,840 All of the problems I've described, 400 00:20:42,840 --> 00:20:44,298 at least for bounded degree graphs, 401 00:20:44,298 --> 00:20:46,340 have constant factor approximations. 402 00:20:46,340 --> 00:20:47,730 So this is the right level. 403 00:20:47,730 --> 00:20:49,350 They are APX-complete. 404 00:20:49,350 --> 00:20:51,650 And that determines their approximability. 405 00:20:51,650 --> 00:20:52,710 Constant factor, no PTAS. 406 00:20:55,890 --> 00:21:03,060 Now it would be nice to know which problems are hard. 407 00:21:03,060 --> 00:21:06,790 With NP-completeness, and in the SAT universe, 408 00:21:06,790 --> 00:21:09,170 we had Schaefer's dichotomy theorem that 409 00:21:09,170 --> 00:21:12,380 said-- let me cheat and look at my notes from, 410 00:21:12,380 --> 00:21:17,390 I think, lecture four-- that SAT is polynomial if 411 00:21:17,390 --> 00:21:18,980 and only if the clauses that you're 412 00:21:18,980 --> 00:21:21,270 allowed to do-- the operations you're allowed 413 00:21:21,270 --> 00:21:25,491 to do with variables-- either have 414 00:21:25,491 --> 00:21:27,740 the property that when you set all the variables true, 415 00:21:27,740 --> 00:21:28,810 everything's satisfied. 416 00:21:28,810 --> 00:21:31,730 Or you set all the variables false, everything's satisfied. 417 00:21:31,730 --> 00:21:37,080 Or every single clause is a conjunction of Horn clauses. 
418 00:21:37,080 --> 00:21:43,200 Horn clauses were a few variables, and at most one 419 00:21:43,200 --> 00:21:45,200 of them is positive. 420 00:21:45,200 --> 00:21:48,520 Or all the clauses you have are conjunctions of Dual-Horn, 421 00:21:48,520 --> 00:21:54,300 which was, in every clause at most one of them is negated, 422 00:21:54,300 --> 00:21:58,900 or all of the clauses are conjunctions of 2CNF, 423 00:21:58,900 --> 00:22:00,660 only like 2SAT. 424 00:22:00,660 --> 00:22:05,010 Or what I didn't give a name at the time, 425 00:22:05,010 --> 00:22:10,140 but is essentially a slight generalization of XOR-SAT. 426 00:22:10,140 --> 00:22:11,580 Let me give it a name here. 427 00:22:11,580 --> 00:22:13,040 I'm going to call it X(N)OR-SAT. 428 00:22:19,350 --> 00:22:23,190 You can also phrase them as linear equations over Z2. 429 00:22:32,390 --> 00:22:34,290 So this is zero and one. 430 00:22:34,290 --> 00:22:38,120 And it's either XOR, meaning you take the XOR of all 431 00:22:38,120 --> 00:22:40,340 the things-- that's like the summation of all the things-- 432 00:22:40,340 --> 00:22:42,370 or it's XNOR, meaning when you take that sum, 433 00:22:42,370 --> 00:22:44,420 it should equal zero. 434 00:22:44,420 --> 00:22:46,300 And such systems of linear equations 435 00:22:46,300 --> 00:22:52,250 can be solved in polynomial time using Gaussian elimination 436 00:22:52,250 --> 00:22:53,920 over Z2. 437 00:22:53,920 --> 00:22:56,060 And all of the things I just mentioned 438 00:22:56,060 --> 00:22:59,420 are all the situations where SAT is polynomial. 439 00:22:59,420 --> 00:23:03,810 Every other type of clause, SAT is NP-complete-- 440 00:23:03,810 --> 00:23:05,607 or set of clauses. 441 00:23:05,607 --> 00:23:06,690 Now why do I mention this? 442 00:23:06,690 --> 00:23:11,520 Because there is an analogous theorem for-- it's 443 00:23:11,520 --> 00:23:15,690 not quite SAT, because we need something like this Max. 
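The remark that X(N)OR constraints are linear equations over Z2, solvable by Gaussian elimination, can be illustrated with a short sketch. It assumes each constraint is given as a list of variable indices whose XOR must equal a right-hand side bit (1 for an XOR clause, 0 for an XNOR clause); the bitmask encoding and the function name are choices made here, not anything from the lecture.

```python
def solve_xor_system(rows, rhs, num_vars):
    """Gaussian elimination over Z2.

    rows[i] lists the variable indices XORed together in equation i,
    rhs[i] is the required result (0 or 1). Returns a satisfying 0/1
    list of length num_vars, or None if the system is inconsistent.
    """
    var_mask = (1 << num_vars) - 1
    pivot_rows = []  # (pivot column, equation bitmask), in insertion order
    for row, b in zip(rows, rhs):
        eq = b << num_vars  # bit num_vars stores the right-hand side
        for j in row:
            eq ^= 1 << j  # a variable listed twice cancels out
        for col, p in pivot_rows:
            if (eq >> col) & 1:
                eq ^= p  # eliminate an existing pivot variable
        if eq & var_mask == 0:
            if (eq >> num_vars) & 1:
                return None  # reduced to 0 = 1
            continue  # redundant equation
        pivot_rows.append(((eq & var_mask).bit_length() - 1, eq))
    x = [0] * num_vars  # free variables default to 0
    for col, p in reversed(pivot_rows):
        value = (p >> num_vars) & 1
        for j in range(num_vars):
            if j != col and (p >> j) & 1:
                value ^= x[j]
        x[col] = value
    return x
```

For example, the system x0 XOR x1 = 1, x1 XOR x2 = 0, x0 XOR x2 = 1 is consistent, while changing the middle right-hand side to 1 makes the three equations sum to 0 = 1, and the function returns None.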
444 00:23:15,690 --> 00:23:17,690 We need to turn it into an optimization problem. 445 00:23:17,690 --> 00:23:21,050 SAT is not normally an optimization problem by itself. 446 00:23:21,050 --> 00:23:25,270 And characterizing how approximable those problems are. 447 00:23:25,270 --> 00:23:32,750 Now it is a complicated theorem-- so complicated, 448 00:23:32,750 --> 00:23:35,200 that I don't want to write it on the board, 449 00:23:35,200 --> 00:23:36,670 because there's a lot of cases. 450 00:23:36,670 --> 00:23:39,140 But the point is, it's exhaustive. 451 00:23:39,140 --> 00:23:41,166 It will tell you if you have anything 452 00:23:41,166 --> 00:23:42,540 of the type we had with Schaefer, 453 00:23:42,540 --> 00:23:44,515 which was you define a kind of clause function. 454 00:23:44,515 --> 00:23:46,190 It's either satisfied or not. 455 00:23:46,190 --> 00:23:48,120 It applies to some number of variables. 456 00:23:48,120 --> 00:23:51,150 And then, once you've defined that clause type, 457 00:23:51,150 --> 00:23:52,830 you can apply it to any combination 458 00:23:52,830 --> 00:23:54,450 of variables you want. 459 00:23:54,450 --> 00:23:57,400 That family of problems with no other restrictions 460 00:23:57,400 --> 00:23:58,505 is what we get. 461 00:23:58,505 --> 00:24:03,590 And I will just tell you what the problems are. 462 00:24:03,590 --> 00:24:04,547 There's four of them. 463 00:24:04,547 --> 00:24:06,380 This is part of what makes the theorem long, 464 00:24:06,380 --> 00:24:08,750 but also extremely powerful. 465 00:24:08,750 --> 00:24:12,340 The first dichotomy is max versus min. 466 00:24:12,340 --> 00:24:15,580 And then the second dichotomy is they 467 00:24:15,580 --> 00:24:18,162 call it CSP for constraint satisfaction problem. 468 00:24:18,162 --> 00:24:19,620 So you have a bunch of constraints. 469 00:24:19,620 --> 00:24:21,970 You want to satisfy as many as possible.
470 00:24:21,970 --> 00:24:26,750 So this would be the number of satisfied constraints 471 00:24:26,750 --> 00:24:29,940 is your objective, or your cost function. 472 00:24:33,240 --> 00:24:37,870 Or the other version is what's called the ones problem, or max 473 00:24:37,870 --> 00:24:39,530 ones, or min ones. 474 00:24:39,530 --> 00:24:42,560 This is the number of true variables. 475 00:24:48,010 --> 00:24:52,060 So again, we have a Schaefer-like SAT style 476 00:24:52,060 --> 00:24:53,132 of set of clauses. 477 00:24:53,132 --> 00:24:55,590 Either we want to maximize the number of satisfied clauses, 478 00:24:55,590 --> 00:24:58,170 or we want to minimize the number of satisfied clauses, 479 00:24:58,170 --> 00:25:02,360 or we want to maximize the number of true variables 480 00:25:02,360 --> 00:25:03,980 and satisfy everything. 481 00:25:03,980 --> 00:25:06,360 Or we want to minimize the number of true variables 482 00:25:06,360 --> 00:25:09,040 and satisfy everything. 483 00:25:09,040 --> 00:25:09,540 OK. 484 00:25:09,540 --> 00:25:11,930 Now obviously, if the SAT problem is hard, 485 00:25:11,930 --> 00:25:13,840 it's going to be hard to do this. 486 00:25:13,840 --> 00:25:15,710 But it's still interesting. 487 00:25:15,710 --> 00:25:17,000 You can still think about it. 488 00:25:17,000 --> 00:25:23,260 And even when the SAT problem is easy, Max ones can be hard. 489 00:25:23,260 --> 00:25:25,650 So I am going to-- I wrote it all down, 490 00:25:25,650 --> 00:25:27,280 and then I realized how long it was. 491 00:25:27,280 --> 00:25:29,060 And so I will just show you. 492 00:25:29,060 --> 00:25:32,460 Imagine I just hand-wrote this. 493 00:25:32,460 --> 00:25:35,310 So this is the easy case. 494 00:25:35,310 --> 00:25:36,401 Max CSP. 495 00:25:36,401 --> 00:25:38,400 So we want to maximize the number of constraints 496 00:25:38,400 --> 00:25:40,990 that we satisfy. 497 00:25:40,990 --> 00:25:45,430 And I'm going to characterize when it is polynomial.
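To make the objectives concrete, here is a brute-force sketch of Max CSP and of the two Ones objectives. It is exponential in the number of variables and meant only to pin down the definitions. The representation of a constraint as a `(predicate, variable_indices)` pair and all function names are assumptions made for illustration.

```python
from itertools import product


def count_satisfied(constraints, assignment):
    """Number of constraints satisfied by a tuple of booleans."""
    return sum(
        1
        for predicate, variables in constraints
        if predicate(*(assignment[v] for v in variables))
    )


def max_csp(constraints, num_vars):
    """Max CSP: the largest number of simultaneously satisfied constraints."""
    return max(
        count_satisfied(constraints, a)
        for a in product((False, True), repeat=num_vars)
    )


def feasible_ones(constraints, num_vars):
    """Count of true variables in each assignment satisfying everything."""
    return [
        sum(a)
        for a in product((False, True), repeat=num_vars)
        if count_satisfied(constraints, a) == len(constraints)
    ]


def max_ones(constraints, num_vars):
    """Max Ones: satisfy everything, maximize true variables (None if infeasible)."""
    values = feasible_ones(constraints, num_vars)
    return max(values) if values else None


def min_ones(constraints, num_vars):
    """Min Ones: satisfy everything, minimize true variables (None if infeasible)."""
    values = feasible_ones(constraints, num_vars)
    return min(values) if values else None


def xor(a, b):
    return a != b


def either(a, b):
    return a or b


# Three XOR constraints around a triangle cannot all hold at once.
triangle = [(xor, (0, 1)), (xor, (1, 2)), (xor, (0, 2))]
# One OR constraint and one XOR constraint on three variables.
mixed = [(either, (0, 1)), (xor, (1, 2))]
```

On `triangle`, `max_csp` is still well defined while both Ones objectives have no feasible solution, which reflects the point that the Ones problems inherit the difficulty of plain satisfiability.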
498 00:25:45,430 --> 00:25:47,710 Now here, PO I haven't defined, but that's 499 00:25:47,710 --> 00:25:49,737 the analog of P for optimization problems. 500 00:25:49,737 --> 00:25:51,570 So it's the set of all optimization problems 501 00:25:51,570 --> 00:25:55,340 that are in P, that have a polynomial-time algorithm 502 00:25:55,340 --> 00:25:57,110 to solve them exactly. 503 00:25:57,110 --> 00:25:58,520 So it turns out in this situation 504 00:25:58,520 --> 00:26:01,330 you are either polynomial or APX-complete. 505 00:26:01,330 --> 00:26:04,940 So it's only about constant factor versus perfect. 506 00:26:04,940 --> 00:26:08,310 There's never a PTAS, unless there's a polynomial time 507 00:26:08,310 --> 00:26:09,012 algorithm. 508 00:26:09,012 --> 00:26:10,470 And the cases should look familiar. 509 00:26:10,470 --> 00:26:13,170 It's either when you set all the variables true 510 00:26:13,170 --> 00:26:15,860 or all the variables false, that satisfies everything. 511 00:26:15,860 --> 00:26:17,690 In that case, Max CSP is, of course, easy. 512 00:26:17,690 --> 00:26:19,790 You can satisfy everything. 513 00:26:19,790 --> 00:26:23,150 Another case is if you write the clauses 514 00:26:23,150 --> 00:26:26,510 in disjunctive normal form-- this is a new type 515 00:26:26,510 --> 00:26:29,360 that we hadn't seen before, all your clauses are-- 516 00:26:29,360 --> 00:26:32,450 when you write them in DNF, they have exactly two terms. 517 00:26:32,450 --> 00:26:36,265 So it's the OR of two things that are anded together. 518 00:26:36,265 --> 00:26:36,765 Sorry. 519 00:26:36,765 --> 00:26:38,050 There's an "or" in the middle. 520 00:26:38,050 --> 00:26:40,340 And you have a bunch of things anded together 521 00:26:40,340 --> 00:26:41,630 in each of my hands. 522 00:26:41,630 --> 00:26:44,760 And all the ones in here are positive, and all the ones 523 00:26:44,760 --> 00:26:46,110 in here are negative.
524 00:26:46,110 --> 00:26:49,090 If every clause looks like that, then you 525 00:26:49,090 --> 00:26:51,670 can solve this in polynomial time. 526 00:26:51,670 --> 00:26:56,180 And in all other cases, this problem is APX-complete. 527 00:26:56,180 --> 00:26:59,582 So that's a nice, very clean characterization. 528 00:26:59,582 --> 00:27:01,998 AUDIENCE: Wait. [INAUDIBLE] that we learned about earlier. 529 00:27:01,998 --> 00:27:03,390 Is this the [INAUDIBLE]? 530 00:27:03,390 --> 00:27:04,015 PROFESSOR: Yes. 531 00:27:04,015 --> 00:27:05,530 This is disjunctive normal form. 532 00:27:05,530 --> 00:27:09,390 So it's the or of ands. 533 00:27:09,390 --> 00:27:12,590 Usually we deal with CNF, ands of ors. 534 00:27:12,590 --> 00:27:17,530 But for this characterization, every clause 535 00:27:17,530 --> 00:27:19,810 can be uniquely converted into a DNF, 536 00:27:19,810 --> 00:27:21,150 and uniquely converted into CNF. 537 00:27:21,150 --> 00:27:23,990 So that's a well-defined thing to say. 538 00:27:26,405 --> 00:27:28,530 With Schaefer, we just had to look at the CNF form. 539 00:27:28,530 --> 00:27:31,990 But here we get a new set of things. 540 00:27:31,990 --> 00:27:33,130 All right. 541 00:27:33,130 --> 00:27:35,350 That was one out of four. 542 00:27:35,350 --> 00:27:37,240 Max Min CSP Ones. 543 00:27:37,240 --> 00:27:40,486 Next one is Max Ones. 544 00:27:40,486 --> 00:27:41,860 This is not the most complicated. 545 00:27:44,540 --> 00:27:46,390 But let's go through them. 546 00:27:46,390 --> 00:27:49,862 So again, we want to maximize the number of true variables. 547 00:27:49,862 --> 00:27:51,945 So of course, if we set all the variables to true, 548 00:27:51,945 --> 00:27:55,570 and everything is satisfied, yay, polynomial, OK? 549 00:27:55,570 --> 00:27:58,180 But curiously, if you set all the variables to false, 550 00:27:58,180 --> 00:28:02,910 and that satisfies everything, that's going to be here.
551 00:28:02,910 --> 00:28:05,230 That's Poly-APX-complete. 552 00:28:05,230 --> 00:28:08,050 Poly-APX-complete, you can translate to something like n 553 00:28:08,050 --> 00:28:10,160 to the 1 minus epsilon, approximable, 554 00:28:10,160 --> 00:28:12,850 and that's the best you can do. 555 00:28:12,850 --> 00:28:15,620 Or there's a lower bound of n to the 1 minus epsilon. 556 00:28:15,620 --> 00:28:18,231 Upper bound might be n or something. 557 00:28:18,231 --> 00:28:18,730 OK. 558 00:28:18,730 --> 00:28:23,180 So because maximizing ones, when setting things all at false, 559 00:28:23,180 --> 00:28:24,450 does not necessarily help you. 560 00:28:24,450 --> 00:28:26,800 There are some more positive cases. 561 00:28:26,800 --> 00:28:28,730 If you have a Dual-Horn set up. 562 00:28:28,730 --> 00:28:31,270 So this is another one of the Schaefer situations. 563 00:28:31,270 --> 00:28:34,675 If every clause when you write it in CNF every subclause 564 00:28:34,675 --> 00:28:37,780 is Dual-Horn, at most, one negated thing, 565 00:28:37,780 --> 00:28:40,070 that is a good situation for maximizing ones, 566 00:28:40,070 --> 00:28:44,170 because only one of them has to be negative. 567 00:28:44,170 --> 00:28:48,646 But with Horn, for example, you get Poly-APX-complete, 568 00:28:48,646 --> 00:28:51,020 because we have an asymmetry here between ones and zeros. 569 00:28:51,020 --> 00:28:51,968 Question? 570 00:28:51,968 --> 00:28:53,160 AUDIENCE: In this list, do we just read down it 571 00:28:53,160 --> 00:28:54,210 until we hit the thing? 572 00:28:54,210 --> 00:28:55,010 PROFESSOR: Yes. 573 00:28:55,010 --> 00:28:55,860 Good question. 574 00:28:55,860 --> 00:29:01,290 This is a sequential algorithm for determining what you have. 575 00:29:01,290 --> 00:29:03,430 If any of these says, oh, you're in PO, 576 00:29:03,430 --> 00:29:05,760 then you should stop reading the rest of the theorem. 
577 00:29:05,760 --> 00:29:09,640 The way they write the theorem is probably clearer. 578 00:29:09,640 --> 00:29:11,386 They write an else if for each one, 579 00:29:11,386 --> 00:29:13,260 but I wrote it backwards, so it's hard for me 580 00:29:13,260 --> 00:29:14,730 to write else if. 581 00:29:14,730 --> 00:29:15,410 Yeah. 582 00:29:15,410 --> 00:29:18,530 Occasionally I'll mention that the previous things 583 00:29:18,530 --> 00:29:19,030 don't apply. 584 00:29:19,030 --> 00:29:20,860 But you should read this sequentially. 585 00:29:24,100 --> 00:29:24,600 OK. 586 00:29:24,600 --> 00:29:25,870 So it was Dual-Horn. 587 00:29:25,870 --> 00:29:31,300 Another polynomial case is what I call 2-X(N)OR-SAT, 588 00:29:31,300 --> 00:29:32,590 where the N is in parentheses. 589 00:29:32,590 --> 00:29:35,110 So in other words, you have linear equations. 590 00:29:35,110 --> 00:29:39,300 Each equation only has two terms, sort of like 2SAT. 591 00:29:39,300 --> 00:29:41,330 And you have equations that say equal zero 592 00:29:41,330 --> 00:29:44,120 or equal one on those two terms. 593 00:29:44,120 --> 00:29:45,870 That is also polynomially solvable. 594 00:29:45,870 --> 00:29:47,490 This is a special case. 595 00:29:47,490 --> 00:29:49,650 We didn't need the 2 for Schaefer. 596 00:29:49,650 --> 00:29:54,490 Here we need the 2, because if you have X(N)OR-SAT in general. 597 00:29:54,490 --> 00:29:57,760 And when I say this, I mean that all constraints 598 00:29:57,760 --> 00:29:58,940 fall into this category. 599 00:29:58,940 --> 00:30:00,990 If all constraints are of this form, 600 00:30:00,990 --> 00:30:03,080 all clauses are of this form, then you're good.
601 00:30:03,080 --> 00:30:06,420 If all clauses are of the form X(N)OR-SAT, 602 00:30:06,420 --> 00:30:10,450 but they're not in this class, they're not all of length 2, 603 00:30:10,450 --> 00:30:12,800 then the problem becomes APX-complete, 604 00:30:12,800 --> 00:30:16,630 by contrast to Schaefer, where, I mean, 605 00:30:16,630 --> 00:30:19,370 deciding whether you can satisfy all those things is easy-- 606 00:30:19,370 --> 00:30:22,670 maximizing the number of ones when you do it is APX-complete. 607 00:30:22,670 --> 00:30:25,950 So that's particularly interesting. 608 00:30:25,950 --> 00:30:27,700 AUDIENCE: Not all equal 3SAT fall in that? 609 00:30:27,700 --> 00:30:28,610 Is that? 610 00:30:32,620 --> 00:30:35,330 PROFESSOR: Not all equal 3SAT. 611 00:30:35,330 --> 00:30:37,527 AUDIENCE: Those are X(N)OR clauses, right? 612 00:30:37,527 --> 00:30:38,110 PROFESSOR: No. 613 00:30:38,110 --> 00:30:39,526 They should not be X(N)OR clauses, 614 00:30:39,526 --> 00:30:40,930 because it's NP-complete. 615 00:30:40,930 --> 00:30:42,800 And when you have X(N)OR clauses, 616 00:30:42,800 --> 00:30:45,650 it's always polynomial to decide whether you can satisfy 617 00:30:45,650 --> 00:30:47,040 everything. 618 00:30:47,040 --> 00:30:49,645 So it's in the other case. 619 00:30:52,570 --> 00:30:54,070 But good question, because we should 620 00:30:54,070 --> 00:30:56,920 be getting APX-completeness. 621 00:30:56,920 --> 00:30:58,837 Yeah, but Max not all equal 3SAT is different. 622 00:30:58,837 --> 00:31:01,128 Here we're trying to maximize the number of clauses that 623 00:31:01,128 --> 00:31:01,810 are satisfied. 624 00:31:01,810 --> 00:31:04,309 So if you have not all equal 3SAT, 625 00:31:04,309 --> 00:31:06,350 and you want to maximize the number of ones, that 626 00:31:06,350 --> 00:31:08,724 means first you have to satisfy not all equal 3SAT, which 627 00:31:08,724 --> 00:31:09,610 is hard. 628 00:31:09,610 --> 00:31:11,760 So that's going to fall into this.
629 00:31:11,760 --> 00:31:13,800 The bottom one is feasibility. 630 00:31:13,800 --> 00:31:15,930 Just finding a feasible solution is NP hard. 631 00:31:18,590 --> 00:31:24,630 The X(N)OR-SAT is this thing-- linear equations over Z2. 632 00:31:24,630 --> 00:31:27,139 And it could be equal to 0, or equal to 1. 633 00:31:27,139 --> 00:31:28,930 This is what you might call an XOR clause, 634 00:31:28,930 --> 00:31:32,940 or this is an XOR clause, this is an X(N)OR clause. 635 00:31:32,940 --> 00:31:36,890 So if they don't all have size two, then you're APX-complete. 636 00:31:36,890 --> 00:31:41,400 But you can find a solution by Schaefer's theorem. 637 00:31:41,400 --> 00:31:42,280 OK. 638 00:31:42,280 --> 00:31:45,390 So as I mentioned, Horn clauses and 2SAT clauses 639 00:31:45,390 --> 00:31:46,570 are actually really hard. 640 00:31:46,570 --> 00:31:49,320 They're Poly-APX-complete, n to the 1 minus epsilon. 641 00:31:49,320 --> 00:31:51,350 Also these are all situations where 642 00:31:51,350 --> 00:31:54,724 you can find feasible solutions easily by Schaefer, like when 643 00:31:54,724 --> 00:31:57,140 you can set them all false, and that satisfies everything. 644 00:31:57,140 --> 00:31:58,020 It doesn't help you when you're trying 645 00:31:58,020 --> 00:31:59,311 to maximize the number of ones. 646 00:31:59,311 --> 00:32:01,916 It just gets you to zero. 647 00:32:01,916 --> 00:32:03,040 Then you want to do better. 648 00:32:03,040 --> 00:32:06,680 And it's really hard to get any better factor. 649 00:32:06,680 --> 00:32:08,630 One more situation. 650 00:32:08,630 --> 00:32:09,130 Sorry. 651 00:32:11,934 --> 00:32:13,350 There's a slight distinction here. 652 00:32:13,350 --> 00:32:15,800 So suppose you have the feature that you 653 00:32:15,800 --> 00:32:20,290 can set one variable true, and the rest false. 654 00:32:20,290 --> 00:32:22,650 If that satisfies all your constraints, then great, 655 00:32:22,650 --> 00:32:24,467 you found the value 1.
656 00:32:24,467 --> 00:32:26,300 And there's a big difference between 0 and 1 657 00:32:26,300 --> 00:32:28,216 when you're looking at relative approximation, 658 00:32:28,216 --> 00:32:30,950 because anything divided by 0 is huge. 659 00:32:30,950 --> 00:32:32,880 So it's really hard to get a good factor. 660 00:32:32,880 --> 00:32:33,760 That's the situation. 661 00:32:33,760 --> 00:32:35,260 Distinguishing between 0 and greater 662 00:32:35,260 --> 00:32:39,150 than 0, which is an infinite ratio, it could be NP-hard. 663 00:32:39,150 --> 00:32:41,470 That's when you, in this situation, 664 00:32:41,470 --> 00:32:42,980 we set all the variables false. 665 00:32:42,980 --> 00:32:43,680 You get zero. 666 00:32:43,680 --> 00:32:46,690 But finding any other solution is going to be NP-hard. 667 00:32:46,690 --> 00:32:48,280 Here, if you can at least get 1, you 668 00:32:48,280 --> 00:32:50,930 can get an N approximation, whereas here you 669 00:32:50,930 --> 00:32:52,320 can't get an N approximation. 670 00:32:52,320 --> 00:32:55,290 Here you can get Poly approximation. 671 00:32:55,290 --> 00:32:57,700 And finally, if you have none of the above situations, 672 00:32:57,700 --> 00:33:01,950 then testing feasibility is NP-hard by Schaefer's theorem. 673 00:33:01,950 --> 00:33:04,310 So it's like Schaefer's theorem, but some of the cases 674 00:33:04,310 --> 00:33:08,200 split up into parts. 675 00:33:08,200 --> 00:33:09,660 Now, that was maximization. 676 00:33:09,660 --> 00:33:10,510 Question? 677 00:33:10,510 --> 00:33:12,510 AUDIENCE: So, what's special about 1 here? 678 00:33:12,510 --> 00:33:15,977 It seems to me if you replace that 1 by K 679 00:33:15,977 --> 00:33:17,310 it should still be in that case. 680 00:33:17,310 --> 00:33:18,390 PROFESSOR: This case. 681 00:33:18,390 --> 00:33:19,330 AUDIENCE: Yeah. 682 00:33:19,330 --> 00:33:22,620 If I just replace that one with a fixed K. Like 2. 683 00:33:22,620 --> 00:33:23,880 PROFESSOR: Yes.
684 00:33:23,880 --> 00:33:27,290 So that problem will still be-- so if you 685 00:33:27,290 --> 00:33:30,000 can set all but K of them true, I 686 00:33:30,000 --> 00:33:32,000 think you can also set all but one of them true, 687 00:33:32,000 --> 00:33:33,430 and still satisfy. 688 00:33:33,430 --> 00:33:34,190 Yeah. 689 00:33:34,190 --> 00:33:35,310 So here's the thing. 690 00:33:35,310 --> 00:33:36,680 This is all variables, right? 691 00:33:36,680 --> 00:33:39,440 So the idea is you have tons of variables, 692 00:33:39,440 --> 00:33:41,857 and let's say two of them are set to true. 693 00:33:41,857 --> 00:33:43,440 So if you look at a clause, the clause 694 00:33:43,440 --> 00:33:46,685 might just apply to these guys-- all the false guys-- 695 00:33:46,685 --> 00:33:49,060 or it might apply to false guys and one of the true guys, 696 00:33:49,060 --> 00:33:52,595 or it might apply to false guys and two of the true guys. 697 00:33:52,595 --> 00:33:54,220 All of those would have to be satisfied 698 00:33:54,220 --> 00:33:56,050 in your hypothetical situation. 699 00:33:56,050 --> 00:33:58,810 If that's true, that implies that all the clauses are 700 00:33:58,810 --> 00:34:00,950 satisfied when only one of them is set true, 701 00:34:00,950 --> 00:34:02,400 and the rest are false. 702 00:34:02,400 --> 00:34:04,980 So your case would fall into this case as well, 703 00:34:04,980 --> 00:34:07,260 and you'd get Poly-APX-completeness again. 704 00:34:07,260 --> 00:34:10,040 So it's not totally obvious when these things apply. 705 00:34:10,040 --> 00:34:14,256 But this is the complete list of different cases. 706 00:34:14,256 --> 00:34:14,839 Any questions? 707 00:34:17,480 --> 00:34:19,530 OK. 708 00:34:19,530 --> 00:34:21,440 Two out of four. 709 00:34:21,440 --> 00:34:25,460 Next one, this is the longest one, is Min CSP. 
710 00:34:25,460 --> 00:34:28,639 Now here we don't get as nice a characterization, 711 00:34:28,639 --> 00:34:31,159 because there are some open problems left. 712 00:34:31,159 --> 00:34:33,420 I haven't checked whether all of these open problems 713 00:34:33,420 --> 00:34:36,610 remain open, but as of 2001 they were open, 714 00:34:36,610 --> 00:34:38,639 which was a while ago. 715 00:34:38,639 --> 00:34:41,800 And we can check whether there's more explicit status. 716 00:34:41,800 --> 00:34:45,310 But I have the status as of this paper here. 717 00:34:45,310 --> 00:34:47,150 So Min CSP. 718 00:34:47,150 --> 00:34:51,130 This is, you want to minimize the number of constraints 719 00:34:51,130 --> 00:34:54,122 that are satisfied, whereas before we 720 00:34:54,122 --> 00:34:55,080 looked at maximization. 721 00:34:55,080 --> 00:34:58,740 There are only three cases which were something like this. 722 00:34:58,740 --> 00:35:02,270 Again, if setting all the variables false or true 723 00:35:02,270 --> 00:35:08,810 satisfies all the clauses, this is good, apparently. 724 00:35:08,810 --> 00:35:10,830 That's less obvious in this case. 725 00:35:10,830 --> 00:35:12,240 In general, minimization problems 726 00:35:12,240 --> 00:35:14,365 behave quite differently from maximization problems 727 00:35:14,365 --> 00:35:16,110 in terms of approximability. 728 00:35:16,110 --> 00:35:17,970 Maximization is generally easier to 729 00:35:17,970 --> 00:35:22,130 approximate, because your solutions tend to be big, 730 00:35:22,130 --> 00:35:24,370 and it's easier to approximate big things. 731 00:35:24,370 --> 00:35:27,830 Minimization-- small-- is hard. 732 00:35:27,830 --> 00:35:31,380 Also we had the situation from Max CSP, 733 00:35:31,380 --> 00:35:33,540 if when you write it in DNF, is exactly 734 00:35:33,540 --> 00:35:35,107 two terms for every clause. 
735 00:35:35,107 --> 00:35:36,690 One of them is all positive variables, 736 00:35:36,690 --> 00:35:38,356 and the other is all negative variables. 737 00:35:38,356 --> 00:35:40,470 That's also easy. 738 00:35:40,470 --> 00:35:46,270 And here's a new case of APX-completeness. 739 00:35:46,270 --> 00:35:48,610 So if the problem you're trying to solve 740 00:35:48,610 --> 00:35:51,290 is exactly this problem, they call this, 741 00:35:51,290 --> 00:35:54,190 I think, implication hitting set. 742 00:35:54,190 --> 00:35:57,910 So you have a clause which lets you say x1 implies 743 00:35:57,910 --> 00:36:01,620 x2 for any two variables. 744 00:36:01,620 --> 00:36:06,010 And you have some set of clauses like this, where you 745 00:36:06,010 --> 00:36:08,720 can say here's five variables. 746 00:36:08,720 --> 00:36:10,680 The OR of them is true. 747 00:36:10,680 --> 00:36:13,479 No negation here. 748 00:36:13,479 --> 00:36:15,520 So this is called hitting set, meaning I give you 749 00:36:15,520 --> 00:36:19,370 a set of vertices in a graph, and I want at least one of them 750 00:36:19,370 --> 00:36:22,320 to be hit, to be included, to be true. 751 00:36:22,320 --> 00:36:24,700 And we're trying to minimize the number of such things 752 00:36:24,700 --> 00:36:26,533 that we satisfy. 753 00:36:26,533 --> 00:36:31,490 So this turns out to be hard, but only in the sense that there's no PTAS, 754 00:36:31,490 --> 00:36:35,600 and there's a constant factor approximation. 755 00:36:35,600 --> 00:36:38,360 And then we have these four cases 756 00:36:38,360 --> 00:36:41,770 which show that they are equivalent to known studied 757 00:36:41,770 --> 00:36:42,860 problems. 758 00:36:42,860 --> 00:36:44,720 So there are these special cases.
759 00:36:44,720 --> 00:36:48,414 Other than these, getting any approximation 760 00:36:48,414 --> 00:36:49,830 factor of less than infinity would 761 00:36:49,830 --> 00:36:52,430 require you to distinguish between OPT is zero 762 00:36:52,430 --> 00:36:55,400 and OPT is greater than zero, and it's NP-complete, 763 00:36:55,400 --> 00:36:57,980 unless you have these. 764 00:36:57,980 --> 00:37:00,970 So there are some special cases like Min Uncut. 765 00:37:00,970 --> 00:37:03,150 This is the reverse of Max Cut. 766 00:37:03,150 --> 00:37:05,880 You want to minimize the number of uncut edges. 767 00:37:05,880 --> 00:37:10,320 So that plus Max Cut should be equal to the number of edges. 768 00:37:10,320 --> 00:37:12,920 But the approximability of the two sides is quite different. 769 00:37:12,920 --> 00:37:16,480 And here the best results are APX-hardness, 770 00:37:16,480 --> 00:37:19,900 and a log n upper bound for approximation. 771 00:37:19,900 --> 00:37:21,870 So that's a little bit harder maybe. 772 00:37:21,870 --> 00:37:25,110 It's at least as hard as this. 773 00:37:25,110 --> 00:37:30,480 And that happens when you are in the 2-X(N)OR-SAT situation, 774 00:37:30,480 --> 00:37:33,320 something we saw from the last slide. 775 00:37:33,320 --> 00:37:35,820 So here it reduces to this other problem. 776 00:37:35,820 --> 00:37:39,025 Basically the same, but the X(N)ORs don't buy you anything 777 00:37:39,025 --> 00:37:39,525 new. 778 00:37:42,580 --> 00:37:44,860 In the case of 2SAT, you get a problem 779 00:37:44,860 --> 00:37:47,950 known as Min 2CNF deletion. 780 00:37:47,950 --> 00:37:51,780 And it's similar-- APX-hard, and best approximation 781 00:37:51,780 --> 00:37:54,680 is log times log log.
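The remark that Min Uncut plus Max Cut equals the number of edges, while the two approximation ratios behave very differently, can be checked on a small graph. The brute-force sketch below uses hypothetical function names and is exponential in the number of vertices, so it is only an illustration.

```python
from itertools import product


def cut_and_uncut(edges, side):
    """Return (cut edges, uncut edges) for a 0/1 side label per vertex."""
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return cut, len(edges) - cut


def max_cut_and_min_uncut(edges, num_vertices):
    """Brute-force optimum of both objectives over all 2-colorings."""
    results = [
        cut_and_uncut(edges, side)
        for side in product((0, 1), repeat=num_vertices)
    ]
    return max(c for c, _ in results), min(u for _, u in results)


# A 5-cycle: an odd cycle, so at least one edge must stay uncut.
five_cycle = [(i, (i + 1) % 5) for i in range(5)]
best_cut, best_uncut = max_cut_and_min_uncut(five_cycle, 5)

# A particular suboptimal solution: vertices 0 and 1 on one side.
cut, uncut = cut_and_uncut(five_cycle, (0, 0, 1, 1, 1))
```

For this particular labeling the cut is half of the optimum while the uncut count is three times the optimum, so the same solution looks noticeably worse under the minimization objective even though the two objectives always sum to the number of edges.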
782 00:37:54,680 --> 00:37:57,880 In the case where you have X(N)OR-SAT in general, 783 00:37:57,880 --> 00:38:01,330 but it's not all of the linear equations have only two terms-- 784 00:38:01,330 --> 00:38:05,110 so we have some larger ones-- then it turns out to be 785 00:38:05,110 --> 00:38:07,000 equivalent to Nearest Codeword. 786 00:38:07,000 --> 00:38:10,120 So it turns out you can write all such equations 787 00:38:10,120 --> 00:38:13,260 by using equations of length 3 788 00:38:13,260 --> 00:38:13,760 always. 789 00:38:13,760 --> 00:38:15,750 So this is linear equation. 790 00:38:15,750 --> 00:38:20,820 This should equal 1, or this says equals zero. 791 00:38:20,820 --> 00:38:23,276 And from that, you can construct all such things. 792 00:38:23,276 --> 00:38:24,525 This is a really hard problem. 793 00:38:27,610 --> 00:38:29,800 Poly-APX-hardness is not known. 794 00:38:29,800 --> 00:38:31,680 Current best lower bound is this 2 795 00:38:31,680 --> 00:38:33,460 to the log to the 1 minus epsilon, which 796 00:38:33,460 --> 00:38:37,440 we saw in the table of various inapproximability results 797 00:38:37,440 --> 00:38:37,940 last time. 798 00:38:37,940 --> 00:38:42,620 So this is a little bit smaller than n to the epsilon, 799 00:38:42,620 --> 00:38:43,890 but it's kind of close-ish. 800 00:38:47,150 --> 00:38:50,300 And finally, in the-- I didn't write it. 801 00:38:50,300 --> 00:38:52,810 If you're in CNF form, and all of the subclauses 802 00:38:52,810 --> 00:38:55,960 are either Horn, or all of the subclauses are Dual-Horn, 803 00:38:55,960 --> 00:39:00,350 then you get something called Min Horn Deletion. 804 00:39:00,350 --> 00:39:02,170 And this has the same inapproximability. 805 00:39:04,730 --> 00:39:06,070 Here it's known. 806 00:39:06,070 --> 00:39:07,580 So up here, the best approximation 807 00:39:07,580 --> 00:39:11,770 is n-- nothing, basically. 808 00:39:11,770 --> 00:39:13,110 Put them all in.
809 00:39:13,110 --> 00:39:16,990 And here there's a slightly better approximation known , 810 00:39:16,990 --> 00:39:18,990 I think, n to the 1 minus epsilon, or something. 811 00:39:18,990 --> 00:39:20,804 But these are all super hard. 812 00:39:20,804 --> 00:39:22,470 The main point of this is so that you're 813 00:39:22,470 --> 00:39:23,820 aware of these problems. 814 00:39:23,820 --> 00:39:26,640 If you ever encounter a problem that looks anything like this, 815 00:39:26,640 --> 00:39:29,740 or it looks like some kind of CSP problem, 816 00:39:29,740 --> 00:39:31,900 you should go to this list and check it out. 817 00:39:31,900 --> 00:39:35,430 So don't memorize these, but look at the notes. 818 00:39:35,430 --> 00:39:36,772 Definitely memorize these guys. 819 00:39:36,772 --> 00:39:37,730 These are good to know. 820 00:39:37,730 --> 00:39:42,140 But there's a few obscure problems here. 821 00:39:42,140 --> 00:39:42,640 OK. 822 00:39:42,640 --> 00:39:47,560 Last one is minimizing the number of ones. 823 00:39:47,560 --> 00:39:49,990 So this is like the hardest of two worlds. 824 00:39:49,990 --> 00:39:51,760 Minimization is kind of harder. 825 00:39:51,760 --> 00:39:54,460 And here you have to satisfy everything, but minimize 826 00:39:54,460 --> 00:39:56,390 the number of true variables. 827 00:39:59,530 --> 00:40:03,250 So this is easy if you can set them all false. 828 00:40:03,250 --> 00:40:04,820 And then you win. 829 00:40:04,820 --> 00:40:07,120 This is easy in the Horn case. 830 00:40:07,120 --> 00:40:09,170 The Horn case is when at most one is positive, 831 00:40:09,170 --> 00:40:11,900 so most of them can be set to zero. 832 00:40:11,900 --> 00:40:15,990 This is easy in the 2X(N)OR case. 833 00:40:15,990 --> 00:40:19,060 So if you have linear equations, two terms each, equal to 0 834 00:40:19,060 --> 00:40:21,320 or equals 1, that's also. 835 00:40:21,320 --> 00:40:24,100 And you want to minimize the number of true variables. 
836 00:40:24,100 --> 00:40:25,410 That's good. 837 00:40:25,410 --> 00:40:28,060 If you're in 2CNF form, there's a constant factor 838 00:40:28,060 --> 00:40:28,780 approximation. 839 00:40:28,780 --> 00:40:30,240 That's the best you can do. 840 00:40:30,240 --> 00:40:30,781 APX-complete. 841 00:40:33,090 --> 00:40:36,300 This is a case from the last slide. 842 00:40:36,300 --> 00:40:39,290 If you have the hitting set constraints on constant number 843 00:40:39,290 --> 00:40:41,830 of constant size vertex sets, and you 844 00:40:41,830 --> 00:40:44,230 have implication constraints, then your problem 845 00:40:44,230 --> 00:40:45,535 is APX-complete again. 846 00:40:48,380 --> 00:40:50,300 And then we have these guys appearing, again 847 00:40:50,300 --> 00:40:51,070 Nearest Codeword. 848 00:40:51,070 --> 00:40:52,980 And Min Horn Deletion. 849 00:40:52,980 --> 00:40:55,020 This one we get in the Dual-Horn case. 850 00:40:55,020 --> 00:40:56,490 The Horn case is good. 851 00:40:56,490 --> 00:40:59,880 Dual-Horn, we get this thing, which was like log N 852 00:40:59,880 --> 00:41:00,380 approximable. 853 00:41:00,380 --> 00:41:01,490 Or no. 854 00:41:01,490 --> 00:41:05,880 This was the 2 to the log N to the 1 minus epsilon. 855 00:41:05,880 --> 00:41:10,380 And this is X(N)OR-SAT when they're not all binary. 856 00:41:10,380 --> 00:41:12,870 Then we get Nearest Codeword-complete. 857 00:41:12,870 --> 00:41:16,590 And finally, oh, two more. 858 00:41:16,590 --> 00:41:19,450 The dual to this, if all the variables being set true 859 00:41:19,450 --> 00:41:22,590 satisfies your constraint, that gives you a solution, 860 00:41:22,590 --> 00:41:27,780 but it's like the worst solution possible, because you get N. 861 00:41:27,780 --> 00:41:32,320 And so in that case, you can get probably a poly approximation. 862 00:41:32,320 --> 00:41:34,740 Not very impressive.
863 00:41:34,740 --> 00:41:37,380 And that's actually the best you can do, at some N 864 00:41:37,380 --> 00:41:39,180 to the 1 minus epsilon. 865 00:41:39,180 --> 00:41:42,250 And in all other cases, by Schaefer's theorem, 866 00:41:42,250 --> 00:41:45,250 deciding whether even finding a feasible solution is NP-hard. 867 00:41:45,250 --> 00:41:47,960 So, good luck approximating. 868 00:41:47,960 --> 00:41:49,360 Cool? 869 00:41:49,360 --> 00:41:54,275 This is the Khanna, Sudan, Trevisan, Williamson 870 00:41:54,275 --> 00:41:55,150 multichotomy theorem. 871 00:41:59,100 --> 00:41:59,600 All right. 872 00:42:03,860 --> 00:42:11,280 So let's do some more reductions. 873 00:42:38,260 --> 00:42:42,740 My goal on this page is to get to our good friend 874 00:42:42,740 --> 00:42:46,180 from one of the first lectures, edge-matching-puzzles. 875 00:42:46,180 --> 00:42:50,480 You have little square tiles, colors on the edges. 876 00:42:50,480 --> 00:42:52,910 Normally we want to satisfy all of the edge constraints. 877 00:42:52,910 --> 00:42:57,480 Only equal colors match, are adjacent to each other. 878 00:42:57,480 --> 00:43:00,040 Now the problem is going to be maximize the number 879 00:43:00,040 --> 00:43:03,335 of satisfied edge constraints. 880 00:43:03,335 --> 00:43:05,160 But before I show you that reduction, 881 00:43:05,160 --> 00:43:08,020 I need another problem, which is APX-complete. 882 00:43:08,020 --> 00:43:10,330 So that problem is APX-complete. 883 00:43:10,330 --> 00:43:14,540 So I need two more problems. 884 00:43:14,540 --> 00:43:28,996 One is Max independent set in 3-regular 3-edge colorable 885 00:43:28,996 --> 00:43:29,495 graphs. 886 00:43:32,790 --> 00:43:33,290 OK. 
887 00:43:33,290 --> 00:43:35,415 I'm not going to prove this one, because we already 888 00:43:35,415 --> 00:43:37,030 did a version of independent set, 889 00:43:37,030 --> 00:43:39,340 and it's just tedious to make it-- first, 890 00:43:39,340 --> 00:43:42,210 to make it exactly degree three everywhere, 891 00:43:42,210 --> 00:43:45,260 and secondly make it 3-edge-colorable. 892 00:43:45,260 --> 00:43:48,630 3-regular 3-edge-colorable is a nice kind of graph, 893 00:43:48,630 --> 00:43:55,370 because at every vertex, you've got one edge of each color class. 894 00:43:55,370 --> 00:43:56,930 So that's kind of cool. 895 00:43:56,930 --> 00:43:57,990 And we can use this. 896 00:43:57,990 --> 00:44:00,310 This problem is basically equivalent 897 00:44:00,310 --> 00:44:03,720 to the actual problem I want, which 898 00:44:03,720 --> 00:44:07,610 is a variation of three-dimensional matching. 899 00:44:07,610 --> 00:44:09,980 So remember three-dimensional matching, 900 00:44:09,980 --> 00:44:16,310 you have three sets-- A, B, and C. You 901 00:44:16,310 --> 00:44:19,080 look at the triples on A, B, and C. 902 00:44:19,080 --> 00:44:23,140 And you're given some set of interesting triples 903 00:44:23,140 --> 00:44:24,960 among those. 904 00:44:24,960 --> 00:44:32,350 And with 3DM, what we wanted was to choose a set of such triples 905 00:44:32,350 --> 00:44:36,080 that covers all the vertices, and no two of them intersect. 906 00:44:36,080 --> 00:44:38,500 That's the matching aspect. 907 00:44:38,500 --> 00:44:40,740 In this problem, we want to choose as many triples 908 00:44:40,740 --> 00:44:43,700 as we can that don't intersect each other. 909 00:44:43,700 --> 00:44:55,530 So the problem is choose max subset S prime of S 910 00:44:55,530 --> 00:44:59,750 with no duplicate coordinates, I'll say. 911 00:45:03,720 --> 00:45:05,900 So let's assume A, B, and C are disjoint.
912 00:45:05,900 --> 00:45:09,020 Then I don't want any element in A union B union C 913 00:45:09,020 --> 00:45:13,800 to appear twice in this chosen set S prime. 914 00:45:13,800 --> 00:45:15,710 So that's the problem. 915 00:45:15,710 --> 00:45:19,521 Now I'm going to prove that that's hard. 916 00:45:19,521 --> 00:45:24,990 It is basically the same as Max independent set 917 00:45:24,990 --> 00:45:29,830 in 3-regular 3-edge-colored graphs, 918 00:45:29,830 --> 00:45:33,760 because what I do is I take such a graph, 919 00:45:33,760 --> 00:45:43,490 and for each edge color class-- there are three of them-- 920 00:45:43,490 --> 00:45:46,040 those are going to be A, B, and C. 921 00:45:46,040 --> 00:45:47,800 So if I have red, green, and blue, 922 00:45:47,800 --> 00:45:49,910 all the red edges are going to be elements of A, 923 00:45:49,910 --> 00:45:52,220 all the green edges are going to be the elements 924 00:45:52,220 --> 00:45:54,720 of B-- B for green. 925 00:45:54,720 --> 00:45:58,090 And then all the blue edges are elements of C. 926 00:45:58,090 --> 00:45:58,710 OK. 927 00:45:58,710 --> 00:46:06,380 Then a vertex, as I said, has exactly one of each class. 928 00:46:06,380 --> 00:46:07,790 So that's going to be my triple. 929 00:46:11,410 --> 00:46:13,540 And that's it. 930 00:46:13,540 --> 00:46:16,150 So now, if I want to solve three-dimensional matching 931 00:46:16,150 --> 00:46:17,930 among those triples, that's going 932 00:46:17,930 --> 00:46:22,735 to correspond to choosing a set of vertices in here, no two 933 00:46:22,735 --> 00:46:25,760 of which share a color. 934 00:46:25,760 --> 00:46:30,105 No two of which share the same item of A. Let's say A 935 00:46:30,105 --> 00:46:32,360 is this color of edge. 936 00:46:32,360 --> 00:46:35,720 So that means that the vertices over here 937 00:46:35,720 --> 00:46:37,890 are not connected by an edge.
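The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`graph_to_triples`, `is_matching`, `is_independent`) and the example graph `K33` are hypothetical and not from the lecture, and the edge coloring is assumed to be given as part of the input.

```python
from itertools import combinations

def graph_to_triples(edge_colors):
    """Map a 3-regular, properly 3-edge-colored graph to a Max 3DM instance.

    edge_colors maps frozenset({u, v}) -> 0, 1, or 2 (the color class).
    Color class 0 plays the role of A, class 1 of B, class 2 of C.
    Returns {vertex: (a, b, c)}: the three edges incident to that vertex.
    """
    incident = {}
    for edge, color in edge_colors.items():
        for v in edge:
            incident.setdefault(v, {})[color] = edge
    triples = {}
    for v, by_color in incident.items():
        # Every vertex must see exactly one edge of each color.
        assert set(by_color) == {0, 1, 2}, "need one edge of each color"
        triples[v] = (by_color[0], by_color[1], by_color[2])
    return triples

def is_matching(triples, chosen):
    """True if no element of A, B, or C appears twice among the chosen triples."""
    used = set()
    for v in chosen:
        for coord, element in enumerate(triples[v]):
            if (coord, element) in used:
                return False
            used.add((coord, element))
    return True

def is_independent(edge_colors, chosen):
    """True if no two chosen vertices are joined by an edge."""
    return not any(frozenset(p) in edge_colors for p in combinations(chosen, 2))

# Hypothetical example: K_{3,3} with edge (i, 3 + j) colored (i + j) mod 3.
K33 = {frozenset((i, 3 + j)): (i + j) % 3 for i in range(3) for j in range(3)}
```

Two chosen triples collide exactly when the two vertices share an edge, so a set of triples is a matching exactly when the corresponding vertices form an independent set, and the objective values are equal.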
938 00:46:37,890 --> 00:46:40,920 So the cool thing here is that each element of A, B, and C 939 00:46:40,920 --> 00:46:49,000 only appears in two different triples. 940 00:46:49,000 --> 00:46:51,800 Corresponding to the two ends of the edge. 941 00:46:51,800 --> 00:46:54,540 So now we have max three-dimensional matching 942 00:46:54,540 --> 00:46:58,670 where every element in A, B, C appears in exactly two triples. 943 00:46:58,670 --> 00:47:03,188 So I guess I can even write E2 if I want to. 944 00:47:03,188 --> 00:47:05,060 OK. 945 00:47:05,060 --> 00:47:08,130 That was our sort of homework. 946 00:47:08,130 --> 00:47:13,370 Now we have max edge-matching puzzles. 947 00:47:13,370 --> 00:47:17,287 Again, we're given square tiles. 948 00:47:17,287 --> 00:47:18,870 There's different colors on the tiles. 949 00:47:18,870 --> 00:47:20,780 Any number of colors. 950 00:47:20,780 --> 00:47:23,950 And we would like to lay things out. 951 00:47:23,950 --> 00:47:26,880 And I'll tell you the instance here is going to be 2 by N. 952 00:47:26,880 --> 00:47:29,760 So it's fairly narrow, unlike the construction 953 00:47:29,760 --> 00:47:32,240 we saw in class. 954 00:47:32,240 --> 00:47:36,330 And we're reducing from Max 3DM-2. 955 00:47:36,330 --> 00:47:38,156 That's why I introduced it. 956 00:47:38,156 --> 00:47:43,090 And this is a result from four years ago. 957 00:47:43,090 --> 00:47:47,640 So the idea is the triple is represented by these three 958 00:47:47,640 --> 00:47:49,210 tiles, and some more. 959 00:47:49,210 --> 00:47:52,090 But for starters, these three tiles. 960 00:47:52,090 --> 00:47:54,870 The u glue is unique-- globally unique. 961 00:47:54,870 --> 00:47:57,090 So it wants to be on the boundary. 962 00:47:57,090 --> 00:47:58,890 And here tiles are not allowed to rotate, 963 00:47:58,890 --> 00:48:01,490 so it wants to be on the bottom boundary. 964 00:48:01,490 --> 00:48:08,676 So these a, b glues only appear as single pairs.
965 00:48:08,676 --> 00:48:10,300 I guess they'll also appear over there. 966 00:48:10,300 --> 00:48:11,383 But not very many of them. 967 00:48:11,383 --> 00:48:13,800 So basically a, b, and c have to glue together 968 00:48:13,800 --> 00:48:14,720 in sequence like that. 969 00:48:14,720 --> 00:48:15,980 And the percent signs are going to be 970 00:48:15,980 --> 00:48:17,140 the same on the bottom row. 971 00:48:17,140 --> 00:48:19,130 So nothing else. 972 00:48:19,130 --> 00:48:20,832 This is basically forced to do this. 973 00:48:20,832 --> 00:48:22,540 We'll actually have to do it a few times, 974 00:48:22,540 --> 00:48:24,920 but you have to build this bottom structure. 975 00:48:24,920 --> 00:48:28,210 And then the question is what do you build on top. 976 00:48:28,210 --> 00:48:32,900 And the idea is there are exactly one each of these three 977 00:48:32,900 --> 00:48:37,110 tiles which just communicate dollar sign left to right, 978 00:48:37,110 --> 00:48:39,550 and have a, b, c on the bottom. 979 00:48:39,550 --> 00:48:40,432 So those are cool. 980 00:48:40,432 --> 00:48:42,890 And if you want to put a triple into your three-dimensional 981 00:48:42,890 --> 00:48:46,950 matching, then you put those in sequence. 982 00:48:46,950 --> 00:48:48,020 No mismatches. 983 00:48:48,020 --> 00:48:48,680 This is great. 984 00:48:48,680 --> 00:48:49,820 You can take a whole bunch of these, 985 00:48:49,820 --> 00:48:52,028 stick them next to each other, everything will match. 986 00:48:52,028 --> 00:48:53,000 No errors. 987 00:48:53,000 --> 00:48:54,990 So you're getting some constant number 988 00:48:54,990 --> 00:48:58,230 of points for each of these. 989 00:48:58,230 --> 00:49:03,240 But you will have to build more-- at least two copies 990 00:49:03,240 --> 00:49:04,930 of this bottom structure. 991 00:49:04,930 --> 00:49:07,540 And there's only one copy of this top thing. 992 00:49:07,540 --> 00:49:09,110 So that's the annoying part. 
993 00:49:09,110 --> 00:49:11,820 But there are some variations of these tiles which 994 00:49:11,820 --> 00:49:13,570 look something like this-- I'll 995 00:49:13,570 --> 00:49:16,930 show you all of them in a moment-- which have exactly one 996 00:49:16,930 --> 00:49:18,450 mismatch. 997 00:49:18,450 --> 00:49:20,842 So you don't get quite as many points. 998 00:49:20,842 --> 00:49:22,800 You get, I don't know, 15 instead of 16 points, 999 00:49:22,800 --> 00:49:24,740 or whatever. 1000 00:49:24,740 --> 00:49:26,750 Bottom structure looks the same. 1001 00:49:26,750 --> 00:49:31,571 And the point of this is we know a appears 1002 00:49:31,571 --> 00:49:32,570 in two different places. 1003 00:49:32,570 --> 00:49:35,870 So we need two versions of the a tile. 1004 00:49:35,870 --> 00:49:39,015 But we only want one of them to be happy and give you 1005 00:49:39,015 --> 00:49:40,640 all the points, because you should only 1006 00:49:40,640 --> 00:49:44,400 be able to choose the a thing once. 1007 00:49:44,400 --> 00:49:46,520 So yet this triple will still exist. 1008 00:49:46,520 --> 00:49:48,400 a, d, c will still be floating around there. 1009 00:49:48,400 --> 00:49:52,410 You want it to still be buildable, but at a cost of negative 1. 1010 00:49:52,410 --> 00:49:54,880 So this part's still built. 1011 00:49:54,880 --> 00:49:57,030 Then you have these sort of filler tiles. 1012 00:49:57,030 --> 00:49:59,000 Your goal is then just get rid of all the stuff 1013 00:49:59,000 --> 00:50:00,770 and pay a penalty. 1014 00:50:00,770 --> 00:50:03,410 But you want to minimize the number of times you do this, 1015 00:50:03,410 --> 00:50:05,850 or maximize the number of times you do this, 1016 00:50:05,850 --> 00:50:09,200 and then it will be simulating Max 3DM. 1017 00:50:09,200 --> 00:50:12,640 There'll be some additive constant cost, 1018 00:50:12,640 --> 00:50:16,690 which is the cost of all the unpicked triples.
1019 00:50:16,690 --> 00:50:20,525 And then this will be an L-reduction. 1020 00:50:20,525 --> 00:50:21,650 So I have some more slides. 1021 00:50:21,650 --> 00:50:24,070 It's a bit complicated to do all of the details, 1022 00:50:24,070 --> 00:50:28,020 but this is a fully worked-out example with two triples. 1023 00:50:28,020 --> 00:50:30,680 We have a, b, c and a, d, c. 1024 00:50:30,680 --> 00:50:32,264 And because they share a, we don't 1025 00:50:32,264 --> 00:50:33,430 want them both to be picked. 1026 00:50:33,430 --> 00:50:36,380 So the same as what I showed you just in the previous slide. 1027 00:50:36,380 --> 00:50:38,500 But then there are all these other tiles 1028 00:50:38,500 --> 00:50:41,420 that are floating around in order to make 1029 00:50:41,420 --> 00:50:43,320 all the combinations possible. 1030 00:50:43,320 --> 00:50:45,730 And there's all these tiles to basically allow 1031 00:50:45,730 --> 00:50:47,390 them to get thrown away. 1032 00:50:47,390 --> 00:50:50,710 And so that's not so clear. 1033 00:50:50,710 --> 00:50:54,104 This is the overall construction. 1034 00:50:54,104 --> 00:50:56,520 For every triple, you're going to have exactly these three 1035 00:50:56,520 --> 00:50:59,310 tiles that we saw. 1036 00:50:59,310 --> 00:51:01,310 It got rotated relative to the previous picture. 1037 00:51:01,310 --> 00:51:03,560 Maybe rotations are allowed. 1038 00:51:03,560 --> 00:51:05,890 And then for every variable, here 1039 00:51:05,890 --> 00:51:08,150 they're called x, y, z instead of a, b, c. 1040 00:51:08,150 --> 00:51:09,290 But the same thing. 1041 00:51:09,290 --> 00:51:13,100 For every a thing we'll have some constant set of tiles that 1042 00:51:13,100 --> 00:51:15,250 includes the really good one. 1043 00:51:15,250 --> 00:51:15,750 Sorry. 1044 00:51:15,750 --> 00:51:17,270 The good one has two dollar signs. 1045 00:51:17,270 --> 00:51:19,465 This is the one you really like. 
1046 00:51:19,465 --> 00:51:21,090 And then there's all this stuff to make 1047 00:51:21,090 --> 00:51:23,350 sure things can get consumed. 1048 00:51:23,350 --> 00:51:24,880 And you can get rid of the triples 1049 00:51:24,880 --> 00:51:27,950 and pay exactly one per unpicked triple. 1050 00:51:27,950 --> 00:51:29,700 So I don't want to go through the details, 1051 00:51:29,700 --> 00:51:34,711 but once you have that, you get an L-reduction from Max 3DM-2. 1052 00:51:34,711 --> 00:51:35,210 Questions? 1053 00:51:38,508 --> 00:51:39,494 All right. 1054 00:51:44,960 --> 00:51:50,590 So I want to go up the hierarchy. 1055 00:51:50,590 --> 00:51:55,130 We've been focusing on constant-factor approximable problems 1056 00:51:55,130 --> 00:51:56,270 that have no PTASes. 1057 00:51:59,080 --> 00:52:00,870 I will mention before we go on 1058 00:52:00,870 --> 00:52:04,050 that there are some constant-factor approximable 1059 00:52:04,050 --> 00:52:08,020 problems that have no PTAS, 1060 00:52:08,020 --> 00:52:10,600 and yet are not APX-complete. 1061 00:52:10,600 --> 00:52:17,520 So APX-complete is not all of APX minus PTAS. 1062 00:52:17,520 --> 00:52:22,460 So there are APX minus PTAS problems 1063 00:52:22,460 --> 00:52:23,620 that are not APX-complete. 1064 00:52:26,380 --> 00:52:29,140 So these are still useful from a reduction standpoint. 1065 00:52:29,140 --> 00:52:33,910 You can use them to show that your problem has no PTAS. 1066 00:52:33,910 --> 00:52:36,450 But you have to state them differently. 1067 00:52:40,690 --> 00:52:43,190 And they're somewhat familiar problems. 1068 00:52:43,190 --> 00:52:46,230 One of them is bin packing. 1069 00:52:46,230 --> 00:52:48,950 This is you're moving out of your house. 1070 00:52:48,950 --> 00:52:50,950 You have a bunch of objects. 1071 00:52:50,950 --> 00:52:52,700 You live in a one-dimensional universe. 1072 00:52:52,700 --> 00:52:55,620 So each box is exactly the same size.
1073 00:52:55,620 --> 00:52:57,240 It's one-dimensional in size. 1074 00:52:57,240 --> 00:52:58,920 And you have a bunch of items which are one-dimensional. 1075 00:52:58,920 --> 00:53:01,211 And you want to pack as many as you can into each box-- 1076 00:53:01,211 --> 00:53:03,190 but overall use the minimum number of boxes. 1077 00:53:03,190 --> 00:53:05,690 It's a minimization problem. 1078 00:53:05,690 --> 00:53:08,770 This has a constant factor approximation. 1079 00:53:08,770 --> 00:53:14,740 But you can find what's called an asymptotic PTAS, where 1080 00:53:14,740 --> 00:53:17,950 you can get a PTAS-style result-- 1 plus epsilon 1081 00:53:17,950 --> 00:53:21,822 times OPT plus 1. 1082 00:53:21,822 --> 00:53:24,881 So an additive error. 1083 00:53:24,881 --> 00:53:26,380 And so in particular, distinguishing 1084 00:53:26,380 --> 00:53:29,930 between two bins and three bins is weakly NP-complete. 1085 00:53:29,930 --> 00:53:36,325 That's like partition, right, between two bins 1086 00:53:36,325 --> 00:53:37,570 and three bins. 1087 00:53:37,570 --> 00:53:39,280 So you need this sort of additive one. 1088 00:53:39,280 --> 00:53:42,060 You can't get a PTAS without the additive one. 1089 00:53:42,060 --> 00:53:45,360 So it's not as hard as all constant factor approximable 1090 00:53:45,360 --> 00:53:49,300 problems, but somewhere in between. 1091 00:53:49,300 --> 00:53:52,440 APX-intermediate is the technical term. 1092 00:53:52,440 --> 00:53:56,501 Some other ones are minimum. 1093 00:53:56,501 --> 00:53:58,432 AUDIENCE: [INAUDIBLE]. 1094 00:53:58,432 --> 00:54:00,765 PROFESSOR: Oh, this is all assuming P does not equal NP. 1095 00:54:00,765 --> 00:54:01,120 Yes. 1096 00:54:01,120 --> 00:54:03,530 If P equals NP, then I think all these things are equal. 1097 00:54:03,530 --> 00:54:05,300 So, thank you.
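Written out, the bin-packing guarantee and the two-versus-three-bins argument mentioned above look roughly as follows. The symbols ALG(I), OPT(I), and r are notation introduced here for illustration, and the whole argument assumes P does not equal NP.

```latex
% Asymptotic PTAS: for every fixed \varepsilon > 0, a polynomial-time
% algorithm returns a packing of instance I with
\[
  \mathrm{ALG}(I) \;\le\; (1+\varepsilon)\,\mathrm{OPT}(I) + 1 .
\]
% Why the additive 1 cannot be dropped: suppose a polynomial-time algorithm
% had ratio r < 3/2. On any instance with \mathrm{OPT}(I) = 2 it would give
\[
  \mathrm{ALG}(I) \;\le\; r \cdot 2 \;<\; 3
  \quad\Longrightarrow\quad \mathrm{ALG}(I) = 2 ,
\]
% so it would distinguish two bins from three bins, which is NP-hard
% (it is the partition problem).
```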
1098 00:54:08,400 --> 00:54:10,670 Another problem I've seen in some situations 1099 00:54:10,670 --> 00:54:15,260 is you want to find the spanning tree in a graph that 1100 00:54:15,260 --> 00:54:16,840 minimizes the maximum degree. 1101 00:54:16,840 --> 00:54:19,070 This is also APX-intermediate. 1102 00:54:19,070 --> 00:54:21,220 There's a constant factor approximation. 1103 00:54:21,220 --> 00:54:26,130 No PTAS, but not as hard as all of APX. 1104 00:54:26,130 --> 00:54:28,440 And another one is min edge coloring, 1105 00:54:28,440 --> 00:54:33,120 which is quite a bit easier than vertex coloring. 1106 00:54:33,120 --> 00:54:34,864 So these are problems to watch out for. 1107 00:54:34,864 --> 00:54:37,280 They're the only ones I know of that are APX-intermediate. 1108 00:54:37,280 --> 00:54:38,280 There may be more known. 1109 00:54:41,330 --> 00:54:42,190 OK. 1110 00:54:42,190 --> 00:54:44,900 So unless there are questions, I want to go up 1111 00:54:44,900 --> 00:54:46,795 to log factor approximation. 1112 00:54:54,180 --> 00:54:56,050 Surprisingly, in the CSP universe, 1113 00:54:56,050 --> 00:54:59,970 we didn't get any log approximation 1114 00:54:59,970 --> 00:55:00,970 as the right answer. 1115 00:55:00,970 --> 00:55:03,350 But there are problems where log is the right answer. 1116 00:55:07,774 --> 00:55:09,690 Again, there's probably intermediate problems. 1117 00:55:09,690 --> 00:55:11,720 But here are some problems that are actually 1118 00:55:11,720 --> 00:55:14,880 complete over all log approximable problems. 1119 00:55:14,880 --> 00:55:16,930 So there's a log lower-bound and upper-bound 1120 00:55:16,930 --> 00:55:19,390 on their approximability. 1121 00:55:19,390 --> 00:55:25,090 I've mentioned two of them-- set cover and dominating set. 1122 00:55:29,859 --> 00:55:32,150 First thing I'd like to show is that these two problems 1123 00:55:32,150 --> 00:55:33,390 are the same. 
1124 00:55:33,390 --> 00:55:35,810 I'm not going to try to prove lower bounds on them-- 1125 00:55:35,810 --> 00:55:37,240 at least for now. 1126 00:55:37,240 --> 00:55:40,720 But let me show that you could L-reduce one to the other. 1127 00:55:40,720 --> 00:55:44,080 So the easy direction is L-reducing dominating 1128 00:55:44,080 --> 00:55:47,260 set to set cover, because dominating set 1129 00:55:47,260 --> 00:55:49,100 says, well, if I choose this vertex, 1130 00:55:49,100 --> 00:55:52,550 then I cover these vertices. 1131 00:55:52,550 --> 00:55:53,050 OK. 1132 00:55:53,050 --> 00:55:57,920 So let's call this vertex V, and then maybe a, b, c, d. 1133 00:55:57,920 --> 00:56:04,450 I can represent that by a set-- namely v, a, b, c, d. 1134 00:56:04,450 --> 00:56:06,502 If I choose that set, it covers those elements, 1135 00:56:06,502 --> 00:56:07,960 just like when I choose this vertex 1136 00:56:07,960 --> 00:56:09,450 it covers those vertices. 1137 00:56:09,450 --> 00:56:09,950 OK. 1138 00:56:09,950 --> 00:56:12,490 So that's a strict reduction from dominating 1139 00:56:12,490 --> 00:56:14,865 set to set cover. 1140 00:56:14,865 --> 00:56:18,320 In some sense, the bipartite version gives you more control. 1141 00:56:18,320 --> 00:56:18,820 OK. 1142 00:56:18,820 --> 00:56:22,500 This is the non-bipartite version of set cover. 1143 00:56:22,500 --> 00:56:24,110 So what about the other reduction-- 1144 00:56:24,110 --> 00:56:27,500 reducing set cover to dominating set? 1145 00:56:30,090 --> 00:56:33,170 So this is a little more fun. 1146 00:56:33,170 --> 00:56:35,710 We need to build a graph dominating 1147 00:56:35,710 --> 00:56:39,000 set that somehow has two very different types of vertices. 1148 00:56:39,000 --> 00:56:42,810 We want to represent sets, and we want to represent elements. 1149 00:56:42,810 --> 00:56:44,400 So here's what we're going to do. 1150 00:56:44,400 --> 00:56:49,040 We build a clique representing the sets. 
1151 00:56:49,040 --> 00:56:53,560 So there are nodes in this clique-- one for every set. 1152 00:56:53,560 --> 00:56:57,240 And then we're going to have an independent set over here that 1153 00:56:57,240 --> 00:56:59,710 will represent the elements. 1154 00:56:59,710 --> 00:57:01,910 And then whenever a set over here 1155 00:57:01,910 --> 00:57:05,940 contains an element over there, we will add an edge. 1156 00:57:05,940 --> 00:57:08,970 So in general, an element may appear in several sets, 1157 00:57:08,970 --> 00:57:12,200 and the set is going to consist of many elements. 1158 00:57:12,200 --> 00:57:14,440 But over here, there's not going to be any edges 1159 00:57:14,440 --> 00:57:15,410 between these elements. 1160 00:57:15,410 --> 00:57:18,390 These are independent. 1161 00:57:18,390 --> 00:57:22,370 And over here, all of the edges exist. 1162 00:57:22,370 --> 00:57:25,540 So the intent is you choose a set of these vertices 1163 00:57:25,540 --> 00:57:29,620 corresponding to sets in order to cover those vertices. 1164 00:57:29,620 --> 00:57:31,870 And that's going to work, because these vertices 1165 00:57:31,870 --> 00:57:33,820 are super easy to cover in the dominating set. 1166 00:57:33,820 --> 00:57:36,880 You choose any of them, you cover all of them. 1167 00:57:36,880 --> 00:57:40,800 These guys, you never want to put them in a dominating set. 1168 00:57:40,800 --> 00:57:42,800 Why would you put this in a dominating set, when 1169 00:57:42,800 --> 00:57:44,466 you could just follow one of these edges 1170 00:57:44,466 --> 00:57:45,780 and put this in instead? 1171 00:57:45,780 --> 00:57:49,960 That vertex will cover this one, and it will cover all of these. 1172 00:57:49,960 --> 00:57:52,680 And the only edges from here are to over here. 1173 00:57:52,680 --> 00:57:56,451 So if you choose a set, you'll cover all the sets and that one 1174 00:57:56,451 --> 00:57:56,950 element. 
1175 00:57:56,950 --> 00:57:58,324 If you choose the element, you'll 1176 00:57:58,324 --> 00:58:01,390 cover the element and some of the sets. 1177 00:58:01,390 --> 00:58:04,100 So in any optimal solution, if this ever appears, 1178 00:58:04,100 --> 00:58:06,530 you can keep it optimal and move over here. 1179 00:58:06,530 --> 00:58:09,170 That is the sort of argument we've been doing over and over. 1180 00:58:09,170 --> 00:58:11,240 So there is an optimal solution where you only 1181 00:58:11,240 --> 00:58:16,810 choose vertices on the left, and then that is a set cover. 1182 00:58:16,810 --> 00:58:19,570 Again, it's a strict reduction. 1183 00:58:19,570 --> 00:58:21,170 No loss. 1184 00:58:21,170 --> 00:58:21,670 Cool? 1185 00:58:21,670 --> 00:58:24,425 So that is why these two problems are equivalent. 1186 00:58:24,425 --> 00:58:26,300 Now we're just going to take on faith for now 1187 00:58:26,300 --> 00:58:29,290 that they are log inapproximable. 1188 00:58:29,290 --> 00:58:32,044 And you've probably seen that this one is log approximable. 1189 00:58:32,044 --> 00:58:33,960 So now you know that this is log approximable. 1190 00:58:39,540 --> 00:58:45,170 I would say most of the literature 1191 00:58:45,170 --> 00:58:50,040 I see for inapproximability is either APX-hardness, 1192 00:58:50,040 --> 00:58:52,465 or what people usually call set cover hardness. 1193 00:58:55,140 --> 00:58:57,440 I mean, the fact that set cover is log-APX-complete, 1194 00:58:57,440 --> 00:58:58,814 that is complete for that class-- 1195 00:58:58,814 --> 00:59:01,230 not just a log lower-bound-- is fairly recent. 1196 00:59:01,230 --> 00:59:03,760 So people usually have called it set cover hardness. 1197 00:59:03,760 --> 00:59:07,000 Now you can call it log-APX-hardness. 1198 00:59:07,000 --> 00:59:10,120 So let me show you one example.
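The two reductions between dominating set and set cover described above can be sketched as follows. This is a minimal sketch, not the lecture's own code: all function names are hypothetical, the brute-force solvers are only there to check tiny examples, and the second reduction assumes a nonempty universe in which every element belongs to at least one set.

```python
from itertools import combinations

def dominating_set_to_set_cover(adj):
    """adj: {vertex: set of neighbors}. One set per vertex: the vertex plus its neighbors."""
    universe = set(adj)
    sets = {v: {v} | set(adj[v]) for v in adj}
    return universe, sets

def set_cover_to_dominating_set(universe, sets):
    """Clique on the set-vertices, independent element-vertices,
    and an edge whenever a set contains an element."""
    set_nodes = [("set", name) for name in sets]
    adj = {node: set() for node in set_nodes}
    for e in universe:
        adj[("elem", e)] = set()
    for s, t in combinations(set_nodes, 2):  # clique among the sets
        adj[s].add(t)
        adj[t].add(s)
    for name, members in sets.items():       # set-to-element edges
        for e in members:
            adj[("set", name)].add(("elem", e))
            adj[("elem", e)].add(("set", name))
    return adj

def min_set_cover(universe, sets):
    """Brute force, exponential time: size of a smallest cover (tiny inputs only)."""
    for k in range(len(sets) + 1):
        for chosen in combinations(sets, k):
            if set().union(*(sets[c] for c in chosen)) >= set(universe):
                return k
    return None

def min_dominating_set(adj):
    """Brute force, exponential time: size of a smallest dominating set."""
    for k in range(len(adj) + 1):
        for chosen in combinations(adj, k):
            covered = set(chosen)
            for v in chosen:
                covered |= adj[v]
            if covered == set(adj):
                return k
    return None
```

On small instances the optimal values on both sides of each reduction agree, which is what makes the reductions strict.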
1199 00:59:10,120 --> 00:59:11,880 There are a lot of both out there, 1200 00:59:11,880 --> 00:59:15,852 and I'm actually just showing you sort of a small sampling, 1201 00:59:15,852 --> 00:59:17,900 because there's so much. 1202 00:59:17,900 --> 00:59:20,120 So here's a fun problem. 1203 00:59:20,120 --> 00:59:23,167 It's called token reconfiguration. 1204 00:59:23,167 --> 00:59:24,750 And the idea is you're doing some kind 1205 00:59:24,750 --> 00:59:27,410 of motion planning in a graph. 1206 00:59:27,410 --> 00:59:29,380 So something like pushing blocks, 1207 00:59:29,380 --> 00:59:33,300 except you have a bunch of robots, 1208 00:59:33,300 --> 00:59:37,100 which here are represented-- well, you have a graph. 1209 00:59:37,100 --> 00:59:40,760 And each vertex can either have a robot or not. 1210 00:59:40,760 --> 00:59:43,580 In sum, you're given an initial configuration 1211 00:59:43,580 --> 00:59:45,320 of how the robots are placed, and you're 1212 00:59:45,320 --> 00:59:46,903 given a final configuration of how you 1213 00:59:46,903 --> 00:59:48,160 want the robots to be placed. 1214 00:59:48,160 --> 00:59:49,826 And they have the same number of robots, 1215 00:59:49,826 --> 00:59:53,220 because you can't eat robots, or create them yet. 1216 00:59:53,220 --> 00:59:55,470 So when robots can create robots, 1217 00:59:55,470 --> 00:59:57,990 that will be another problem. 1218 00:59:57,990 --> 00:59:59,490 So here you have robot conservation. 1219 01:00:03,200 --> 01:00:05,370 So in a configuration, there are three types 1220 01:00:05,370 --> 01:00:08,350 of vertices in that situation. 1221 01:00:08,350 --> 01:00:10,760 It could be you have a vertex that currently 1222 01:00:10,760 --> 01:00:12,580 has a robot-- here they're called tokens, 1223 01:00:12,580 --> 01:00:16,210 to be a little more generic. 1224 01:00:16,210 --> 01:00:19,480 It could have a robot, but not be a place 1225 01:00:19,480 --> 01:00:20,690 that should have a robot.
1226 01:00:20,690 --> 01:00:22,690 So in the initial configuration, it has a robot, 1227 01:00:22,690 --> 01:00:24,910 but in the final configuration it does not. 1228 01:00:24,910 --> 01:00:28,750 It could be you have some robots that are basically 1229 01:00:28,750 --> 01:00:29,940 where they want to be. 1230 01:00:29,940 --> 01:00:33,240 There is a robot, and also in the target configuration, 1231 01:00:33,240 --> 01:00:34,780 there's a robot there. 1232 01:00:34,780 --> 01:00:36,870 Or I guess there's four cases, but in this case 1233 01:00:36,870 --> 01:00:38,040 we'll only have three. 1234 01:00:38,040 --> 01:00:40,260 Or it could be that you want to have a robot there, 1235 01:00:40,260 --> 01:00:42,240 but currently you do not. 1236 01:00:42,240 --> 01:00:46,817 So this is an instance that simulates set cover. 1237 01:00:46,817 --> 01:00:48,650 And this is a situation where robots are all 1238 01:00:48,650 --> 01:00:49,520 treated identically. 1239 01:00:49,520 --> 01:00:52,400 So you don't care which robot goes where. 1240 01:00:52,400 --> 01:00:54,030 So you've got these robots over here, 1241 01:00:54,030 --> 01:00:55,350 which don't want to be here. 1242 01:00:55,350 --> 01:00:56,860 They want to be over there. 1243 01:00:56,860 --> 01:00:58,450 I mean, if you measure this length, 1244 01:00:58,450 --> 01:01:01,900 it's the same as this length. 1245 01:01:01,900 --> 01:01:03,540 And these robots don't want to move, 1246 01:01:03,540 --> 01:01:05,930 but they're going to have to, because they're in the way. 1247 01:01:05,930 --> 01:01:08,590 In this tripartite graph, they're in the way from here 1248 01:01:08,590 --> 01:01:09,950 to there.
1249 01:01:09,950 --> 01:01:12,840 I didn't tell you a move in this scenario 1250 01:01:12,840 --> 01:01:18,220 is that you can take a robot and follow any empty path, OK 1251 01:01:18,220 --> 01:01:21,410 So you can make a sequence of moves all at a cost of one, 1252 01:01:21,410 --> 01:01:23,610 as long as it doesn't hit any other robots. 1253 01:01:23,610 --> 01:01:25,570 So, a collision-free path. 1254 01:01:25,570 --> 01:01:27,850 You follow it, then you can pick up another robot, 1255 01:01:27,850 --> 01:01:29,349 move it along a collision-free path, 1256 01:01:29,349 --> 01:01:32,720 pick up another robot, and so on. 1257 01:01:32,720 --> 01:01:34,884 So if you want to move all these guys over here, 1258 01:01:34,884 --> 01:01:37,300 you're going to have to move some of these out of the way. 1259 01:01:37,300 --> 01:01:38,300 How many? 1260 01:01:38,300 --> 01:01:39,570 Set cover many. 1261 01:01:39,570 --> 01:01:42,330 Here's the set cover instance in this bipartite graph. 1262 01:01:42,330 --> 01:01:45,710 So what you can do is take this robot, move it out of the way, 1263 01:01:45,710 --> 01:01:47,240 move it to one of these elements, 1264 01:01:47,240 --> 01:01:49,200 and then for the remainder of this set, which 1265 01:01:49,200 --> 01:01:51,597 are these two nodes, you can take this guy 1266 01:01:51,597 --> 01:01:53,430 and move it there in one step, take this guy 1267 01:01:53,430 --> 01:01:54,800 and move it there in one step. 1268 01:01:54,800 --> 01:01:56,200 The length of this doesn't matter, because you 1269 01:01:56,200 --> 01:01:57,480 can follow a long path. 1270 01:01:57,480 --> 01:02:01,900 And you just drain out this thing one at a time-- 1271 01:02:01,900 --> 01:02:05,490 except for this guy, who you moved out of the way. 1272 01:02:05,490 --> 01:02:08,260 You move one of these to fill his spot. 
1273 01:02:08,260 --> 01:02:10,690 And if you can cover all the elements over here 1274 01:02:10,690 --> 01:02:13,640 with only k of these guys moving, 1275 01:02:13,640 --> 01:02:20,215 then the number of moves will be k plus A. So 1276 01:02:20,215 --> 01:02:21,340 that's what's written here. 1277 01:02:21,340 --> 01:02:26,940 OPT is a fixed additive cost plus the set cover. 1278 01:02:26,940 --> 01:02:30,600 And this is going to be an L-reduction, provided 1279 01:02:30,600 --> 01:02:36,990 this is linear in A, which is easy enough to arrange. 1280 01:02:36,990 --> 01:02:38,590 So that's the unlabeled case. 1281 01:02:38,590 --> 01:02:40,860 You can also solve the labeled case. 1282 01:02:40,860 --> 01:02:44,170 Maybe you want robot one to go to position one, 1283 01:02:44,170 --> 01:02:47,190 and you want robot two to go to position two. 1284 01:02:47,190 --> 01:02:48,901 Same thing, but here these robots 1285 01:02:48,901 --> 01:02:50,900 are going to have to go back where they started. 1286 01:02:50,900 --> 01:02:53,525 So you just add a little vertex so they can get out of the way. 1287 01:02:53,525 --> 01:02:55,590 Everything can move where they want to. 1288 01:02:55,590 --> 01:02:58,710 Again, choose a set cover, move those over, 1289 01:02:58,710 --> 01:02:59,970 and then move them back. 1290 01:02:59,970 --> 01:03:02,130 So you end up paying two times the set cover. 1291 01:03:02,130 --> 01:03:03,840 But just a constant factor loss. 1292 01:03:03,840 --> 01:03:05,960 Still an L-reduction. 1293 01:03:05,960 --> 01:03:07,960 And this problem is motivated, it's 1294 01:03:07,960 --> 01:03:10,290 sort of a generalization of the 15 puzzle. 1295 01:03:10,290 --> 01:03:12,750 You have a little 4 by 4 grid. 1296 01:03:12,750 --> 01:03:13,990 You've got movable tiles. 1297 01:03:13,990 --> 01:03:16,300 You can only move one at a time in that case, 1298 01:03:16,300 --> 01:03:18,420 because there's only a single gap.
1299 01:03:18,420 --> 01:03:20,890 This is sort of a generalized form of that, 1300 01:03:20,890 --> 01:03:22,770 where you have various tiles. 1301 01:03:22,770 --> 01:03:25,230 You want to get them into the right spots, 1302 01:03:25,230 --> 01:03:28,300 but you can't have collisions during that motion. 1303 01:03:28,300 --> 01:03:31,470 So that's where this problem came from. 1304 01:03:31,470 --> 01:03:34,320 15 puzzle, by the way, in the generalized n by n form 1305 01:03:34,320 --> 01:03:37,077 is NP-hard and in APX, but I think it's open 1306 01:03:37,077 --> 01:03:38,160 whether it's APX-complete. 1307 01:03:40,700 --> 01:03:44,820 I would show the proof, but it's very complicated, so, I won't. 1308 01:03:48,450 --> 01:03:50,140 Cool. 1309 01:03:50,140 --> 01:03:53,170 Well, in the last little bit, I wanted to tell you 1310 01:03:53,170 --> 01:03:56,230 about the super high end. 1311 01:03:56,230 --> 01:03:57,835 So we went to log approximation. 1312 01:04:00,640 --> 01:04:03,720 There are other things known, but not 1313 01:04:03,720 --> 01:04:05,110 a lot of completeness results. 1314 01:04:05,110 --> 01:04:06,610 So we're going to get to other kinds 1315 01:04:06,610 --> 01:04:09,370 of inapproximability next class. 1316 01:04:09,370 --> 01:04:13,430 For now, I want to stick to something-APX-complete. 1317 01:04:13,430 --> 01:04:15,790 And the most studied class above log 1318 01:04:15,790 --> 01:04:19,740 is poly, which is like n to the 1 minus epsilon. 1319 01:04:34,860 --> 01:04:38,360 And my main goal here is to tell you about some problems 1320 01:04:38,360 --> 01:04:40,880 that you should, if you think your problem is 1321 01:04:40,880 --> 01:04:44,730 like Poly-APX-hard, these are the standard problems 1322 01:04:44,730 --> 01:04:46,390 to start from. 1323 01:04:46,390 --> 01:04:47,629 There are two of them. 1324 01:04:47,629 --> 01:04:49,920 And I've mentioned them, but not quite in this context.
1325 01:04:57,920 --> 01:05:03,194 They are clique and independent set. 1326 01:05:03,194 --> 01:05:04,610 These are really the same problem. 1327 01:05:04,610 --> 01:05:08,670 One is the complement graph of the other. 1328 01:05:08,670 --> 01:05:09,975 Both maximization problems. 1329 01:05:12,930 --> 01:05:14,480 And those are the standard ones. 1330 01:05:14,480 --> 01:05:16,690 I'll leave it at that. 1331 01:05:16,690 --> 01:05:18,490 I'm going to keep going up. 1332 01:05:18,490 --> 01:05:22,173 The next level most studied is Exp-APX-complete. 1333 01:05:25,136 --> 01:05:27,010 So for these problems, the best approximation 1334 01:05:27,010 --> 01:05:29,960 is n divided by log squared n. 1335 01:05:29,960 --> 01:05:32,234 And there's a lower bound of n to the 1 minus epsilon. 1336 01:05:32,234 --> 01:05:34,400 So there is a gap in terms of their approximability. 1337 01:05:34,400 --> 01:05:35,775 But what we know is that they are 1338 01:05:35,775 --> 01:05:39,930 the hardest problems that have any n to the c approximation. 1339 01:05:39,930 --> 01:05:44,380 They're all reducible to each other via PTAS reductions. 1340 01:05:44,380 --> 01:05:45,705 So, fairly preserving. 1341 01:05:48,680 --> 01:05:52,010 So our next class up is Exp-APX-complete, 1342 01:05:52,010 --> 01:05:59,980 problems approximable within exponential in n approximation 1343 01:05:59,980 --> 01:06:00,480 factors. 1344 01:06:00,480 --> 01:06:02,850 How would that happen? 1345 01:06:02,850 --> 01:06:04,420 This is kind of funny. 1346 01:06:04,420 --> 01:06:09,350 And the canonical problem here-- the basic reason is numbers. 1347 01:06:12,190 --> 01:06:14,410 We take the traveling salesman problem. 1348 01:06:14,410 --> 01:06:16,950 And every edge can have a weight. 1349 01:06:16,950 --> 01:06:18,730 Let's say it's integer weights.
1350 01:06:18,730 --> 01:06:21,960 But any integer weight that is expressible in n bits 1351 01:06:21,960 --> 01:06:25,740 is fair game, which means the actual value of that edge 1352 01:06:25,740 --> 01:06:28,500 is going to be exponential in n. 1353 01:06:28,500 --> 01:06:31,210 And from that, you can get a very easy lower bound. 1354 01:06:31,210 --> 01:06:33,480 And in fact, all problems that are 1355 01:06:33,480 --> 01:06:38,430 approximable in exponential APX can be reduced to general TSP, 1356 01:06:38,430 --> 01:06:40,267 where you're just given a bunch of distances 1357 01:06:40,267 --> 01:06:41,350 between pairs of vertices. 1358 01:06:41,350 --> 01:06:43,040 It doesn't satisfy triangle inequality. 1359 01:06:43,040 --> 01:06:44,930 That's the non-metric aspect. 1360 01:06:44,930 --> 01:06:48,100 For triangle inequality TSP, which is what normally happens, 1361 01:06:48,100 --> 01:06:49,260 there is a constant factor. 1362 01:06:49,260 --> 01:06:51,430 It's APX-complete. 1363 01:06:51,430 --> 01:06:57,030 But for general weights between pairs of vertices, 1364 01:06:57,030 --> 01:06:59,370 non-metric, it's Exp-APX-complete, 1365 01:06:59,370 --> 01:07:03,220 because you can basically make a graph 1366 01:07:03,220 --> 01:07:05,240 and solve Hamiltonicity by saying 1367 01:07:05,240 --> 01:07:09,050 all the edges in the graph have weight one or zero, 1368 01:07:09,050 --> 01:07:12,790 and all of the edges-- I guess one would be a little bit more 1369 01:07:12,790 --> 01:07:14,240 legitimate. 1370 01:07:14,240 --> 01:07:16,350 And all the non-edges in the graph 1371 01:07:16,350 --> 01:07:17,890 are going to give weight infinity. 1372 01:07:17,890 --> 01:07:19,920 Infinity is the largest expressible number which 1373 01:07:19,920 --> 01:07:22,360 is 1, 1, 1, 1, n bits long. 1374 01:07:22,360 --> 01:07:24,610 And so either you use one of those edges or you don't. 1375 01:07:24,610 --> 01:07:27,540 And there's an exponential gap between them.
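The Hamiltonicity construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the lecture: the function names `tsp_weights_from_graph` and `optimal_tour_cost` are hypothetical, graph edges get weight 1, non-edges get the largest n-bit value 2^n - 1 standing in for "infinity", and the tour cost is found by brute force, so it is only usable on tiny instances.

```python
from itertools import permutations


def tsp_weights_from_graph(n, edges):
    """Build a non-metric TSP weight matrix from an undirected graph.

    Graph edges get weight 1; non-edges get 2**n - 1, the largest value
    expressible in n bits (the stand-in for "infinity").
    """
    big = 2**n - 1
    edge_set = {frozenset(e) for e in edges}
    return [
        [0 if i == j else (1 if frozenset((i, j)) in edge_set else big)
         for j in range(n)]
        for i in range(n)
    ]


def optimal_tour_cost(w):
    """Brute-force optimal tour cost (illustration only, exponential time)."""
    n = len(w)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        cost = sum(w[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best is None or cost < best:
            best = cost
    return best


# A 4-cycle is Hamiltonian, so the optimal tour uses only weight-1 edges.
cycle_weights = tsp_weights_from_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
# A 4-vertex path is not Hamiltonian, so every tour uses at least one non-edge.
path_weights = tsp_weights_from_graph(4, [(0, 1), (1, 2), (2, 3)])
```

In this sketch, a Hamiltonian graph yields a tour of cost exactly n, while a non-Hamiltonian graph forces cost at least 2^n - 1, so any approximation factor well below 2^n / n would distinguish the two cases.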
1376 01:07:27,540 --> 01:07:29,590 So even if we disallow zeros being an output, 1377 01:07:29,590 --> 01:07:33,446 then we get exponential separation. 1378 01:07:33,446 --> 01:07:35,070 That doesn't prove completeness, but it 1379 01:07:35,070 --> 01:07:38,070 proves that you can't hope for better than exponential 1380 01:07:38,070 --> 01:07:40,910 approximation there. 1381 01:07:40,910 --> 01:07:42,240 OK. 1382 01:07:42,240 --> 01:07:46,620 Two more even crazier classes. 1383 01:07:46,620 --> 01:07:48,420 Now we did see these classes come up 1384 01:07:48,420 --> 01:07:52,580 with the characterization theorem. 1385 01:07:52,580 --> 01:07:54,990 But these are probably how these results were proved. 1386 01:08:17,750 --> 01:08:20,778 So you might think, well, double the exponential. 1387 01:08:20,778 --> 01:08:21,319 I don't know. 1388 01:08:21,319 --> 01:08:22,376 What's next? 1389 01:08:22,376 --> 01:08:24,189 Next, you could define that. 1390 01:08:24,189 --> 01:08:27,550 But what seems to appear most often 1391 01:08:27,550 --> 01:08:33,040 is this is the ultimate class among all NP optimization 1392 01:08:33,040 --> 01:08:34,810 problems, you could imagine being complete 1393 01:08:34,810 --> 01:08:36,060 against all of them. 1394 01:08:36,060 --> 01:08:40,270 And this is with respect to AP-reductions, 1395 01:08:40,270 --> 01:08:41,279 one of the ones we saw. 1396 01:08:44,090 --> 01:08:47,490 And I'm going to define a very closely related class, which 1397 01:08:47,490 --> 01:08:51,560 is NPO PB, NPO polynomially bounded. 1398 01:08:57,700 --> 01:08:58,992 OK. 1399 01:08:58,992 --> 01:09:02,220 So these are the hardest problems to approximate. 
1400 01:09:02,220 --> 01:09:04,740 This is basically the problems that have numbers in them, 1401 01:09:04,740 --> 01:09:06,810 and these are the problems that have no numbers, 1402 01:09:06,810 --> 01:09:10,180 or if they have numbers they are polynomially bounded, 1403 01:09:10,180 --> 01:09:12,660 like the polynomial situation. 1404 01:09:12,660 --> 01:09:16,160 So non-metric TSP, well, it's not as hard as NPO-complete, 1405 01:09:16,160 --> 01:09:18,021 but it's more in this category. 1406 01:09:18,021 --> 01:09:20,645 AUDIENCE: Is there a notion of strongness, weakness 1407 01:09:20,645 --> 01:09:22,450 in these kind of things? 1408 01:09:22,450 --> 01:09:23,620 PROFESSOR: That's funny. 1409 01:09:23,620 --> 01:09:25,090 This is a stronger result. 1410 01:09:25,090 --> 01:09:26,560 So there's not quite an analog. 1411 01:09:26,560 --> 01:09:29,569 But you can do exponential tricks 1412 01:09:29,569 --> 01:09:33,140 and give yourself a hard time over here. 1413 01:09:33,140 --> 01:09:36,080 And here you're just not allowed to use. 1414 01:09:36,080 --> 01:09:37,760 Everything's polynomial. 1415 01:09:37,760 --> 01:09:41,870 So 3-partition is sort of more in this universe. 1416 01:09:41,870 --> 01:09:45,490 But in this situation, if you sort of have 3-partition, 1417 01:09:45,490 --> 01:09:50,410 but with exponential numbers, then you get this harder class. 1418 01:09:50,410 --> 01:09:53,040 So this is not the analog of weak. 1419 01:09:53,040 --> 01:09:57,724 You could maybe imagine-- well, in some sense, 1420 01:09:57,724 --> 01:09:59,390 weak is a modifier in the problem, where 1421 01:09:59,390 --> 01:10:01,139 you say I want to restrict all the numbers 1422 01:10:01,139 --> 01:10:02,700 to a polynomial size. 1423 01:10:02,700 --> 01:10:05,900 So when you do something like 3-partition, 1424 01:10:05,900 --> 01:10:10,160 it's sort of a weak problem, or it's 1425 01:10:10,160 --> 01:10:12,270 a polynomially bounded problem.
1426 01:10:12,270 --> 01:10:15,850 Strong NP-hardness means that that is NP-complete. 1427 01:10:15,850 --> 01:10:19,012 Anyway, vague analog, but not quite. 1428 01:10:19,012 --> 01:10:21,470 It's possible some of these, you could add a weak modifier, 1429 01:10:21,470 --> 01:10:24,590 and it would mean something, but I don't know. 1430 01:10:24,590 --> 01:10:25,090 All right. 1431 01:10:25,090 --> 01:10:27,230 So I just want to give you some sample problems 1432 01:10:27,230 --> 01:10:29,290 on both of these sides. 1433 01:10:29,290 --> 01:10:31,930 Maybe let's start with this side, which 1434 01:10:31,930 --> 01:10:35,117 is a little more interesting, because you 1435 01:10:35,117 --> 01:10:36,575 get some kind of familiar problems, 1436 01:10:36,575 --> 01:10:37,533 and they're super hard. 1437 01:10:40,520 --> 01:10:46,150 Minimum independent dominating set. 1438 01:10:46,150 --> 01:10:47,300 We've seen independent set. 1439 01:10:47,300 --> 01:10:48,383 We've seen dominating set. 1440 01:10:48,383 --> 01:10:51,390 Independent set is already hard to approximate. 1441 01:10:51,390 --> 01:10:56,360 But this problem is worse, because even 1442 01:10:56,360 --> 01:10:58,080 finding an independent dominating set 1443 01:10:58,080 --> 01:11:02,020 is NP-complete, whereas finding an independent set, 1444 01:11:02,020 --> 01:11:04,210 I can choose nothing. 1445 01:11:04,210 --> 01:11:06,990 But if I want to simultaneously be dominating and independent, 1446 01:11:06,990 --> 01:11:07,870 that's NP-hard. 1447 01:11:07,870 --> 01:11:09,570 Hard to find any solution. 1448 01:11:09,570 --> 01:11:17,680 In general in NPO PB problems, NPO PB-complete problems, 1449 01:11:17,680 --> 01:11:20,920 it's always NP-complete to find a feasible solution. 1450 01:11:20,920 --> 01:11:22,543 But it's worse than that. 1451 01:11:22,543 --> 01:11:25,210 So the first level would be to find a feasible solution.
1452 01:11:25,210 --> 01:11:26,910 And this is saying on top of that you 1453 01:11:26,910 --> 01:11:28,490 want to minimize the size. 1454 01:11:28,490 --> 01:11:30,276 I think Max would also be hard. 1455 01:11:30,276 --> 01:11:32,080 But I think there's a general theorem, 1456 01:11:32,080 --> 01:11:33,640 that if you're hard in the min case, 1457 01:11:33,640 --> 01:11:35,570 you're also hard in the max case. 1458 01:11:35,570 --> 01:11:38,960 But it depends on the exact set-up. 1459 01:11:38,960 --> 01:11:41,540 So this is sort of an optimization version 1460 01:11:41,540 --> 01:11:44,110 that makes it even harder than NP-complete. 1461 01:11:44,110 --> 01:11:49,330 So I think this is NP-complete, and this is kind of even worse. 1462 01:11:49,330 --> 01:11:52,910 It's sort of stating the stronger thing about when 1463 01:11:52,910 --> 01:11:55,380 you're trying to optimize over a space of solutions, 1464 01:11:55,380 --> 01:11:57,130 that it's NP-complete to decide. 1465 01:11:57,130 --> 01:11:59,320 Notice that's still an NPO problem. 1466 01:11:59,320 --> 01:12:01,490 We define that solutions need to be 1467 01:12:01,490 --> 01:12:03,244 recognizable in polynomial time. 1468 01:12:03,244 --> 01:12:05,035 But we didn't say that you can generate one 1469 01:12:05,035 --> 01:12:06,200 in polynomial time. 1470 01:12:06,200 --> 01:12:09,030 So it could be NP-complete to find a single solution, 1471 01:12:09,030 --> 01:12:09,744 like here. 1472 01:12:09,744 --> 01:12:11,660 All of these problems will have that property. 1473 01:12:15,990 --> 01:12:21,250 Another fun problem is shortest computation. 1474 01:12:21,250 --> 01:12:23,189 This is sort of the most intuitive one 1475 01:12:23,189 --> 01:12:23,980 at a certain level. 
1476 01:12:23,980 --> 01:12:25,480 If you know Turing machines, and you 1477 01:12:25,480 --> 01:12:27,396 have a non-deterministic Turing machine, which 1478 01:12:27,396 --> 01:12:29,020 could take non-deterministic branches, 1479 01:12:29,020 --> 01:12:31,630 you want to find the computation in such a machine that 1480 01:12:31,630 --> 01:12:34,720 terminates the earliest using the fewest steps. 1481 01:12:34,720 --> 01:12:39,080 So you might think of that as the canonical NPO PB problem. 1482 01:12:39,080 --> 01:12:41,690 There are no numbers in it, but as you can imagine, 1483 01:12:41,690 --> 01:12:44,440 that's super hard to do. 1484 01:12:44,440 --> 01:12:46,840 Here are some more graph theoretic ones. 1485 01:12:46,840 --> 01:12:50,920 Quite natural problems, but super hard. 1486 01:12:50,920 --> 01:12:52,510 Longest induced path. 1487 01:12:52,510 --> 01:12:55,030 Induced means there are no other edges 1488 01:12:55,030 --> 01:12:57,320 between the chosen vertices. 1489 01:12:57,320 --> 01:13:00,810 So this is sort of longest path is one thing. 1490 01:13:00,810 --> 01:13:03,030 That's quite hard to approximate-- like, I think, 1491 01:13:03,030 --> 01:13:05,070 n to the 1 minus epsilon. 1492 01:13:05,070 --> 01:13:07,070 That's sort of the analog of Hamiltonicity. 1493 01:13:07,070 --> 01:13:09,740 Longest induced path is worse. 1494 01:13:09,740 --> 01:13:12,190 Even finding an induced path of length k, 1495 01:13:12,190 --> 01:13:16,550 finding a feasible solution, finding an induced path 1496 01:13:16,550 --> 01:13:17,150 is hard. 1497 01:13:24,310 --> 01:13:33,749 Another fun one is longest path with forbidden pairs. 1498 01:13:33,749 --> 01:13:35,540 So there are pairs of edges that you're not 1499 01:13:35,540 --> 01:13:38,160 allowed to choose together, and subject to those constraints 1500 01:13:38,160 --> 01:13:40,000 you want to find the longest path. 1501 01:13:40,000 --> 01:13:42,600 So these are all NPO PB-complete.
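To make the "induced" condition concrete, here is a minimal Python sketch of a polynomial-time checker, the kind of solution recognizer that NPO membership requires. It is an illustration only: the function name `is_induced_path` and the adjacency-dictionary representation are assumptions, not something given in the lecture.

```python
def is_induced_path(adj, path):
    """Check whether `path` is an induced path in an undirected graph.

    adj:  dict mapping each vertex to the set of its neighbors (assumed format).
    path: list of vertices in path order.

    The path is induced when two chosen vertices are adjacent in the graph
    exactly when they are consecutive on the path.
    """
    if len(set(path)) != len(path):
        return False  # a path may not repeat vertices
    for i, u in enumerate(path):
        for j in range(i + 1, len(path)):
            v = path[j]
            adjacent = v in adj[u]
            consecutive = (j == i + 1)
            if adjacent != consecutive:
                return False
    return True


# Example graph: a 4-cycle 0-1-2-3-0.
four_cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
```

On the 4-cycle, `[0, 1, 2]` passes the check, while `[0, 1, 2, 3]` fails because the extra edge between 3 and 0 joins two non-consecutive chosen vertices. The check uses a quadratic number of adjacency lookups, so recognizing a solution is easy even though finding a long one is hard.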
1502 01:13:42,600 --> 01:13:44,437 No numbers in any of them. 1503 01:13:44,437 --> 01:13:46,145 Now let me give you some number problems. 1504 01:13:58,930 --> 01:14:03,230 So Ones was you want to maximize the number of true variables. 1505 01:14:03,230 --> 01:14:05,710 Now we're going to add weights. 1506 01:14:05,710 --> 01:14:09,330 So we want to maximize the sum of the weights 1507 01:14:09,330 --> 01:14:12,370 of the true variables-- and while 1508 01:14:12,370 --> 01:14:15,830 satisfying a Boolean formula. 1509 01:14:15,830 --> 01:14:17,970 So again, finding a feasible solution is hard. 1510 01:14:17,970 --> 01:14:19,800 That's not surprising. 1511 01:14:19,800 --> 01:14:22,440 Here, the weights can be exponential in value, 1512 01:14:22,440 --> 01:14:24,500 because we allow n bits for the weights. 1513 01:14:24,500 --> 01:14:28,450 And that pushes you into NPO-completeness. 1514 01:14:28,450 --> 01:14:31,210 If you say the weights have to be polynomially bounded, 1515 01:14:31,210 --> 01:14:33,276 then this problem is NPO PB-complete. 1516 01:14:33,276 --> 01:14:34,900 And that's sort of the starting problem 1517 01:14:34,900 --> 01:14:36,820 that they used to prove all of these are hard. 1518 01:14:36,820 --> 01:14:39,420 So there are reductions from this with polynomial weights 1519 01:14:39,420 --> 01:14:40,100 to these guys. 1520 01:14:44,438 --> 01:14:47,330 AUDIENCE: [INAUDIBLE]? 1521 01:14:47,330 --> 01:14:49,220 PROFESSOR: 3SAT. 1522 01:14:49,220 --> 01:14:54,080 Whether you could go down to 2SAT is interesting; I don't know. 1523 01:14:54,080 --> 01:14:57,960 Here they say, I think, probably 3SAT or CNF-SAT. 1524 01:14:57,960 --> 01:15:00,040 Those reductions definitely still work. 1525 01:15:00,040 --> 01:15:03,050 Whether you could put the 2SAT into the Max aspect, 1526 01:15:03,050 --> 01:15:03,630 I don't know. 1527 01:15:03,630 --> 01:15:06,550 But this could be fun to look at.
1528 01:15:06,550 --> 01:15:09,000 There aren't a ton of papers about these two classes, 1529 01:15:09,000 --> 01:15:11,050 but there are a few before they nailed down 1530 01:15:11,050 --> 01:15:12,860 any interesting problems. 1531 01:15:12,860 --> 01:15:14,740 Here's another interesting problem. 1532 01:15:20,600 --> 01:15:24,830 Suppose you want to do integer linear programming. 1533 01:15:24,830 --> 01:15:28,710 To keep it simple, we'll assume that the variables are 1534 01:15:28,710 --> 01:15:33,452 zero or one, and then that is equally hard. 1535 01:15:33,452 --> 01:15:34,910 Here it's a little, unless you know 1536 01:15:34,910 --> 01:15:37,034 a lot about linear programming, it's not so obvious 1537 01:15:37,034 --> 01:15:39,290 that finding a feasible solution here is hard. 1538 01:15:39,290 --> 01:15:41,589 But in general, linear programming-- at least 1539 01:15:41,589 --> 01:15:43,880 in the non-integer case-- you could reduce optimization 1540 01:15:43,880 --> 01:15:45,660 to feasibility. 1541 01:15:45,660 --> 01:15:47,992 So I think the same thing applies here. 1542 01:15:47,992 --> 01:15:49,950 If you're not familiar with linear programming, 1543 01:15:49,950 --> 01:15:53,450 it's basically a bunch of inequality constraints, 1544 01:15:53,450 --> 01:15:55,260 linear inequality constraints. 1545 01:15:55,260 --> 01:15:58,330 And now this is a bunch of integers. 1546 01:15:58,330 --> 01:16:01,840 These are both given integer matrices and vectors. 1547 01:16:01,840 --> 01:16:05,400 And they can have exponential value. 1548 01:16:05,400 --> 01:16:06,310 Question? 1549 01:16:06,310 --> 01:16:08,750 AUDIENCE: For the max/min weighted ones, 1550 01:16:08,750 --> 01:16:12,320 for polynomially bounded, is it still hard 1551 01:16:12,320 --> 01:16:15,460 if you just do ones and minus ones? 1552 01:16:15,460 --> 01:16:19,560 PROFESSOR: I think min or max ones without weights 1553 01:16:19,560 --> 01:16:21,600 is NPO PB-complete.
1554 01:16:21,600 --> 01:16:23,600 I should double-check. 1555 01:16:23,600 --> 01:16:27,100 I didn't actually mention, but this characterization theorem 1556 01:16:27,100 --> 01:16:30,230 works for weighted problems also. 1557 01:16:30,230 --> 01:16:33,540 For every single case, they show that weighted and unweighted 1558 01:16:33,540 --> 01:16:38,640 are the same complexity, except for this one. 1559 01:16:38,640 --> 01:16:42,740 In the min ones case, if setting all the variables true satisfies it, 1560 01:16:42,740 --> 01:16:45,330 you get Poly-APX-completeness if you're unweighted. 1561 01:16:45,330 --> 01:16:50,300 If you're weighted, then you can't find any approximation. 1562 01:16:50,300 --> 01:16:55,390 It's NP-hard to find any factor, which I think, this is, I 1563 01:16:55,390 --> 01:16:58,037 think, before the introduction or popularization 1564 01:16:58,037 --> 01:16:58,745 of these classes. 1565 01:16:58,745 --> 01:17:03,549 So that may be distinguishing between Poly-APX-complete, 1566 01:17:03,549 --> 01:17:05,590 which is definitely smaller than NPO PB-complete. 1567 01:17:05,590 --> 01:17:08,430 This might be NPO PB-completeness. 1568 01:17:08,430 --> 01:17:08,930 Unclear. 1569 01:17:08,930 --> 01:17:12,120 But it's definitely worse than Poly-APX. 1570 01:17:12,120 --> 01:17:13,150 Yeah? 1571 01:17:13,150 --> 01:17:15,150 AUDIENCE: How is that distinguished from Exp-APX? 1572 01:17:15,150 --> 01:17:17,550 Because I'm just confused how you would ever get anything 1573 01:17:17,550 --> 01:17:19,950 worse than this, because, that's like the biggest 1574 01:17:19,950 --> 01:17:22,370 that you [INAUDIBLE]. 1575 01:17:22,370 --> 01:17:25,470 PROFESSOR: So this problem is exponential APX-hard 1576 01:17:25,470 --> 01:17:26,795 if you forbid zero. 1577 01:17:26,795 --> 01:17:30,030 If you allow zero, then you can't get any approximation.
1578 01:17:30,030 --> 01:17:32,500 Here, I think even when you allow zero, 1579 01:17:32,500 --> 01:17:34,820 or even when you forbid zero, you still 1580 01:17:34,820 --> 01:17:36,000 can't get an approximation. 1581 01:17:36,000 --> 01:17:39,370 I think that's the idea here. 1582 01:17:39,370 --> 01:17:42,020 Here, these problems generally you 1583 01:17:42,020 --> 01:17:44,812 can get, depending on your set-up, 1584 01:17:44,812 --> 01:17:47,395 these problems you can all get like a factor n approximation. 1585 01:17:49,900 --> 01:17:52,360 Well, maybe not in polynomial time. 1586 01:17:52,360 --> 01:17:54,090 This is hard to find. 1587 01:17:54,090 --> 01:17:55,320 Some of these you can. 1588 01:17:55,320 --> 01:17:58,750 Longest induced path, just have a path of length 1. 1589 01:17:58,750 --> 01:18:00,220 That will be induced. 1590 01:18:00,220 --> 01:18:02,040 So that gives you a factor n approximation. 1591 01:18:02,040 --> 01:18:05,239 There is a lower bound on this situation, 1592 01:18:05,239 --> 01:18:07,030 n to the 1 minus epsilon inapproximability. 1593 01:18:09,640 --> 01:18:12,810 I think morally it should be a factor n, 1594 01:18:12,810 --> 01:18:15,190 but this is the best result I found. 1595 01:18:15,190 --> 01:18:17,290 So it's funny. 1596 01:18:17,290 --> 01:18:19,800 This is only for number problems. 1597 01:18:19,800 --> 01:18:21,520 So I presented this as in between. 1598 01:18:21,520 --> 01:18:23,832 But this is actually in some sense lower 1599 01:18:23,832 --> 01:18:24,915 than Exp-APX-completeness. 1600 01:18:27,716 --> 01:18:29,465 It's sort of a harder version of Poly-APX. 1601 01:18:32,130 --> 01:18:34,740 This is a slightly harder version of Exp-APX. 1602 01:18:37,320 --> 01:18:39,920 I think it's a small difference, but it's 1603 01:18:39,920 --> 01:18:43,410 good to know there is this difference. 1604 01:18:43,410 --> 01:18:46,160 Other questions? 1605 01:18:46,160 --> 01:18:46,660 All right.
1606 01:18:46,660 --> 01:18:54,460 So this ends what I plan to say about L-reduction-style proofs, 1607 01:18:54,460 --> 01:18:57,562 which are all about preserving approximability. 1608 01:18:57,562 --> 01:18:59,020 The next class, we're going to look 1609 01:18:59,020 --> 01:19:01,980 at a different take on inapproximability, which 1610 01:19:01,980 --> 01:19:06,180 is called gaps, and gap-preserving reductions, 1611 01:19:06,180 --> 01:19:08,320 where you can set up a problem that either 1612 01:19:08,320 --> 01:19:10,980 it has a great solution, or the next solution 1613 01:19:10,980 --> 01:19:12,110 below that is way lower. 1614 01:19:12,110 --> 01:19:15,105 And there's a gap between the best and the next to best. 1615 01:19:15,105 --> 01:19:16,480 And whenever you have such a gap, 1616 01:19:16,480 --> 01:19:18,249 you also have an inapproximability gap, 1617 01:19:18,249 --> 01:19:20,290 because you know there's this solution out there, 1618 01:19:20,290 --> 01:19:24,510 but finding it, if it's NP-complete to find this, 1619 01:19:24,510 --> 01:19:27,240 to solve it exactly, and so the next level down you 1620 01:19:27,240 --> 01:19:28,110 lose some factor. 1621 01:19:28,110 --> 01:19:30,600 And whatever that gap is is your inapproximability bound. 1622 01:19:30,600 --> 01:19:33,280 It doesn't give you completeness results like this 1623 01:19:33,280 --> 01:19:35,030 in general-- not always. 1624 01:19:35,030 --> 01:19:37,732 But it tends to give you really good inapproximability bounds. 1625 01:19:37,732 --> 01:19:40,190 Here I've completely ignored what the constant factors are. 1626 01:19:40,190 --> 01:19:42,860 Most of them are not so great. 1627 01:19:42,860 --> 01:19:44,650 Like when you prove APX-hardness, 1628 01:19:44,650 --> 01:19:48,770 usually you get a 1 plus 1 over 1,000 kind of lower bound 1629 01:19:48,770 --> 01:19:50,540 on the approximability factor. 1630 01:19:50,540 --> 01:19:53,750 But the best upper bound is like 2, or 1.5.
1631 01:19:53,750 --> 01:19:55,290 And what we'll talk about next time, 1632 01:19:55,290 --> 01:19:58,290 you can get much closer-- sometimes exact bounds 1633 01:19:58,290 --> 01:20:00,380 between upper and lower. 1634 01:20:00,380 --> 01:20:03,130 But that will be next week.