1
00:00:00,070 --> 00:00:02,430
The following content is
provided under a Creative

2
00:00:02,430 --> 00:00:03,820
Commons license.

3
00:00:03,820 --> 00:00:06,050
Your support will help
MIT OpenCourseWare

4
00:00:06,050 --> 00:00:10,150
continue to offer high-quality
educational resources for free.

5
00:00:10,150 --> 00:00:12,700
To make a donation or to
view additional materials

6
00:00:12,700 --> 00:00:16,600
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:16,600 --> 00:00:17,263
at ocw.mit.edu.

8
00:00:25,846 --> 00:00:26,720
PROFESSOR: All right.

9
00:00:26,720 --> 00:00:31,300
Today we continue our theme
of approximation, lower bounds

10
00:00:31,300 --> 00:00:33,020
inapproximability.

11
00:00:33,020 --> 00:00:35,470
Quick recap of last time.

12
00:00:35,470 --> 00:00:39,530
We talked about lots of
different reductions.

13
00:00:39,530 --> 00:00:44,900
We, I guess, in particular
talked about PTAS, AP, and L.

14
00:00:44,900 --> 00:00:48,720
And in particular we'll be using
L-reductions almost exclusively

15
00:00:48,720 --> 00:00:51,880
today, except the occasional
strict reduction, which

16
00:00:51,880 --> 00:00:55,160
is even stronger, in a sense.

17
00:00:55,160 --> 00:00:57,117
So what's an L-reduction?

18
00:00:57,117 --> 00:00:59,450
We're trying to go from one
problem A to another problem

19
00:00:59,450 --> 00:01:05,530
B. We're given an instance x of
A. We convert it via function f

20
00:01:05,530 --> 00:01:10,790
to an instance x prime of B.
Then we imagine that somehow we

21
00:01:10,790 --> 00:01:12,290
obtain a solution.

22
00:01:12,290 --> 00:01:16,030
We don't know anything about
it. y prime to x prime.

23
00:01:16,030 --> 00:01:17,540
That's in B space.

24
00:01:17,540 --> 00:01:19,210
And then, in the
reduction, we're

25
00:01:19,210 --> 00:01:21,160
supposed to be able to
map any such solution

26
00:01:21,160 --> 00:01:28,040
y prime to x prime via g into
solution y of x in A problem--

27
00:01:28,040 --> 00:01:31,560
so that's given by the function
g-- such that two things hold.

28
00:01:31,560 --> 00:01:36,070
The first one is that
for f, optimal solution

29
00:01:36,070 --> 00:01:39,520
of x prime should be at
most some constant times

30
00:01:39,520 --> 00:01:41,930
the optimal solution to x.

31
00:01:41,930 --> 00:01:44,190
So we don't blow
up OPTs too much.

32
00:01:44,190 --> 00:01:47,660
And secondly the
absolute difference

33
00:01:47,660 --> 00:01:51,600
between the cost of y
versus the optimal solution

34
00:01:51,600 --> 00:01:56,150
for x should be within a
constant factor of this kind

35
00:01:56,150 --> 00:01:59,650
of gap-- additive gap
between the cost of y

36
00:01:59,650 --> 00:02:04,210
prime versus the optimal
solution to x prime,

37
00:02:04,210 --> 00:02:06,430
meaning that if we were
given a y prime that's

38
00:02:06,430 --> 00:02:08,430
very close to optimal
for x prime, then the y

39
00:02:08,430 --> 00:02:10,949
we produce is very
close to optimal for x.

40
00:02:10,949 --> 00:02:14,446
And we want that in
an additive sense

41
00:02:14,446 --> 00:02:17,070
that will imply that things are
good in a multiplicative sense.

42
00:02:17,070 --> 00:02:20,180
Last time we proved
that for the min case,

43
00:02:20,180 --> 00:02:21,490
for minimization problems.
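
Written out, the two L-reduction conditions recapped above and the minimization-case calculation look roughly as follows. The notation is chosen here for illustration and is not from the lecture board: OPT_A and OPT_B are the optimal values, c_A and c_B the solution costs, and alpha and beta the two constants.

```latex
\begin{align*}
\mathrm{OPT}_B(x') &\le \alpha \cdot \mathrm{OPT}_A(x), \\
\lvert c_A(y) - \mathrm{OPT}_A(x) \rvert
  &\le \beta \cdot \lvert c_B(y') - \mathrm{OPT}_B(x') \rvert .
\end{align*}
% Minimization case: suppose c_B(y') \le (1+\varepsilon)\,\mathrm{OPT}_B(x'). Then
\begin{align*}
c_A(y) &\le \mathrm{OPT}_A(x) + \beta\,\bigl(c_B(y') - \mathrm{OPT}_B(x')\bigr) \\
       &\le \mathrm{OPT}_A(x) + \beta\,\varepsilon\,\mathrm{OPT}_B(x') \\
       &\le (1 + \alpha\beta\varepsilon)\,\mathrm{OPT}_A(x).
\end{align*}
```

So to get a (1 + epsilon)-approximation for A in the minimization case, it is enough to ask for a (1 + epsilon/(alpha beta))-approximation for B.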

44
00:02:21,490 --> 00:02:24,110
If you're curious, I
worked out the details

45
00:02:24,110 --> 00:02:26,840
for maximization problems.

46
00:02:26,840 --> 00:02:29,670
It's a little bit uglier
in terms of the arithmetic.

47
00:02:29,670 --> 00:02:32,920
But you again get that if
you had a constant factor

48
00:02:32,920 --> 00:02:35,970
approximation over here, you
preserve a constant factor

49
00:02:35,970 --> 00:02:39,630
approximation over
here, and you only-- you

50
00:02:39,630 --> 00:02:41,500
lose a reasonable factor.

51
00:02:44,530 --> 00:02:47,119
We also have that if you
can get a PTAS over here,

52
00:02:47,119 --> 00:02:49,160
so you can get an arbitrarily
good approximation,

53
00:02:49,160 --> 00:02:50,550
you also get a PTAS over here.

54
00:02:50,550 --> 00:02:52,380
That was the PTAS reduction.

55
00:02:52,380 --> 00:02:55,430
And it turns out the constant
in the end is roughly

56
00:02:55,430 --> 00:02:59,710
epsilon over alpha beta,
where alpha was this constant,

57
00:02:59,710 --> 00:03:02,270
and beta was this constant.

58
00:03:02,270 --> 00:03:03,564
That's what we had before.

59
00:03:03,564 --> 00:03:04,730
It's a little bit different.

60
00:03:04,730 --> 00:03:06,380
For small epsilon,
it's about the same.

61
00:03:06,380 --> 00:03:09,310
But for large epsilon, it
does make a difference.

62
00:03:09,310 --> 00:03:13,470
And this is why, in
case you were confused,

63
00:03:13,470 --> 00:03:17,230
an L-reduction does not
imply in the maximization

64
00:03:17,230 --> 00:03:21,030
case an AP-reduction, because
you have this non-linear term.

65
00:03:21,030 --> 00:03:23,680
Here, everything was
linear in epsilon.

66
00:03:23,680 --> 00:03:26,320
With minimization, that's true.

67
00:03:26,320 --> 00:03:27,780
The L implies AP.

68
00:03:27,780 --> 00:03:30,350
But for maximization
it's not quite true.

69
00:03:30,350 --> 00:03:32,560
It's close.

70
00:03:32,560 --> 00:03:34,270
So there's some funny.

71
00:03:34,270 --> 00:03:36,270
What I said didn't quite
match this picture.

72
00:03:36,270 --> 00:03:38,920
That's an explanation.

73
00:03:38,920 --> 00:03:42,280
And then we did
a few reductions.

74
00:03:42,280 --> 00:03:47,680
I claimed that Max
E3SAT-E5, this was exactly

75
00:03:47,680 --> 00:03:49,650
three distinct
literals per clause,

76
00:03:49,650 --> 00:03:54,790
exactly five occurrences
of each variable in five

77
00:03:54,790 --> 00:03:57,150
different clauses.

78
00:03:57,150 --> 00:03:58,490
I claimed that was APX-complete.

79
00:03:58,490 --> 00:04:00,130
We didn't prove it.

80
00:04:00,130 --> 00:04:02,810
What we did prove is
that assuming Max 3SAT

81
00:04:02,810 --> 00:04:06,940
is APX-complete, we
reduce that to Max 3SAT3,

82
00:04:06,940 --> 00:04:10,240
which is at most three
occurrences, each thing,

83
00:04:10,240 --> 00:04:12,200
first by using
an expander, and then

84
00:04:12,200 --> 00:04:14,620
splitting the constant
size-- constant occurrence

85
00:04:14,620 --> 00:04:18,190
variables-- with the cycle
of implications trick.

86
00:04:18,190 --> 00:04:21,019
And then we reduced from
that to bounded degree.

87
00:04:21,019 --> 00:04:23,800
I think we did
like max degree 4.

88
00:04:23,800 --> 00:04:27,320
But all of these can be
done in max degree 3.

89
00:04:27,320 --> 00:04:30,951
Independent set, vertex
cover, and dominating set.

90
00:04:30,951 --> 00:04:32,200
Vertex cover we've seen a lot.

91
00:04:32,200 --> 00:04:33,940
You want to cover all the
edges by choosing vertices.

92
00:04:33,940 --> 00:04:35,650
Dominating set, you want
to cover all the vertices

93
00:04:35,650 --> 00:04:36,630
by choosing vertices.

94
00:04:36,630 --> 00:04:38,610
Each vertex covers
its neighbor set.

95
00:04:38,610 --> 00:04:43,556
And independent set, for general
graphs this is super hard.

96
00:04:43,556 --> 00:04:45,055
But for bounded
degree graphs, there

97
00:04:45,055 --> 00:04:46,950
is a constant factor
approximation.

98
00:04:46,950 --> 00:04:50,750
This was choosing vertices
that induced no edges.

99
00:04:50,750 --> 00:04:55,850
So with that in mind, let's
do some more APX-reductions,

100
00:04:55,850 --> 00:04:58,030
APX-hardness,
using L-reductions.

101
00:05:00,600 --> 00:05:06,450
So the next problem we're
going to do is Max 2SAT.

102
00:05:12,110 --> 00:05:15,470
So because we're in the world
of optimization, in some sense

103
00:05:15,470 --> 00:05:19,420
the distinction between 2SAT
and 3SAT is not so important.

104
00:05:19,420 --> 00:05:22,340
It turns out Max 2SAT
will be APX-complete

105
00:05:22,340 --> 00:05:24,230
just like Max 3SAT was.

106
00:05:24,230 --> 00:05:25,877
So when we didn't
have Max, of course

107
00:05:25,877 --> 00:05:27,460
the complexities
were quite different.

108
00:05:27,460 --> 00:05:29,847
3SAT was hard, 2SAT was easy.

109
00:05:29,847 --> 00:05:31,430
With maximization,
they're going to be

110
00:05:31,430 --> 00:05:34,460
equivalent in this perspective.

111
00:05:34,460 --> 00:05:43,150
So I'm going to
do an L-reduction

112
00:05:43,150 --> 00:05:52,250
from independent set of,
let's say a degree 3.

113
00:05:55,275 --> 00:05:56,900
So it'll work with
any constant degree,

114
00:05:56,900 --> 00:05:59,260
but we'll get a different
number of occurrences.

115
00:05:59,260 --> 00:06:01,710
And the reduction
is the following.

116
00:06:01,710 --> 00:06:05,240
There are two types of
gadgets for every vertex.

117
00:06:05,240 --> 00:06:07,280
So I'm given an
independent set instance.

118
00:06:07,280 --> 00:06:11,240
For every vertex v, we're
going to convert that

119
00:06:11,240 --> 00:06:19,484
into a clause-- namely v. I
want v to be true, if possible.

120
00:06:19,484 --> 00:06:21,150
It's a funny way of
thinking when you're

121
00:06:21,150 --> 00:06:23,608
maximizing the number of clauses, because a lot of the clauses
because a lot of the clauses

122
00:06:23,608 --> 00:06:24,490
won't be satisfied.

123
00:06:24,490 --> 00:06:27,240
But you're going to try to
put v in the independent set

124
00:06:27,240 --> 00:06:28,040
if you can.

125
00:06:28,040 --> 00:06:30,060
That's the meaning
of that clause.

126
00:06:30,060 --> 00:06:34,420
Then for every edge--
let's say connecting v

127
00:06:34,420 --> 00:06:38,070
to w-- we're going to convert
that into a clause which

128
00:06:38,070 --> 00:06:43,220
is not v or not w.

129
00:06:43,220 --> 00:06:46,200
We don't want them both to
be in the independent set.

130
00:06:46,200 --> 00:06:48,794
That's the meaning of-- yeah.

131
00:06:48,794 --> 00:06:50,460
I'm trying to simulate
independent sets.

132
00:06:50,460 --> 00:06:52,200
So I don't want
these both to be in.

133
00:06:52,200 --> 00:06:55,220
This is a 2SAT clause.

134
00:06:55,220 --> 00:06:56,950
So what's the claim here?

135
00:06:56,950 --> 00:07:00,800
Suppose you have some
assignment to the variables.

136
00:07:00,800 --> 00:07:03,360
So there's one variable
per vertex over here.

137
00:07:03,360 --> 00:07:07,900
The idea is that variable should
indicate whether the vertex is

138
00:07:07,900 --> 00:07:10,450
in the independent set.

139
00:07:10,450 --> 00:07:13,290
And the claim is
that we will never

140
00:07:13,290 --> 00:07:18,315
violate an edge constraint, or
it's never useful to violate.

141
00:07:18,315 --> 00:07:21,050
The claim is that there
exists an OPT-- optimal

142
00:07:21,050 --> 00:07:29,075
solution-- satisfying all
of these edge constraints.

143
00:07:32,021 --> 00:07:33,020
So we're doing Max 2SAT.

144
00:07:33,020 --> 00:07:35,610
So we get a point for
every one of these things

145
00:07:35,610 --> 00:07:37,670
that we satisfy.

146
00:07:37,670 --> 00:07:40,790
And so in particular,
if you didn't

147
00:07:40,790 --> 00:07:45,530
get this point-- not v or
not w-- the negation of this

148
00:07:45,530 --> 00:07:48,740
is that they are both in.

149
00:07:48,740 --> 00:07:50,570
Then the idea is
that you instead

150
00:07:50,570 --> 00:07:54,510
take one of those vertices
out of the independent set,

151
00:07:54,510 --> 00:07:57,484
and that will be better for you.

152
00:07:57,484 --> 00:07:59,900
In general, when you put a
variable in an independent set,

153
00:07:59,900 --> 00:08:02,462
it only helps you
for one clause.

154
00:08:02,462 --> 00:08:04,170
There's only one
occurrence of positive v

155
00:08:04,170 --> 00:08:05,500
in all of these things.

156
00:08:05,500 --> 00:08:08,710
You might have many edges
coming into a vertex,

157
00:08:08,710 --> 00:08:12,350
and they all prefer the
case that v is false.

158
00:08:12,350 --> 00:08:15,220
So things are going to be
easier if you set v to false.

159
00:08:15,220 --> 00:08:18,420
So if you discover a clause like
this, which is currently false,

160
00:08:18,420 --> 00:08:20,880
meaning both v and
w are true, you're

161
00:08:20,880 --> 00:08:24,170
going to gain a point
by setting v to false.

162
00:08:24,170 --> 00:08:26,564
You'll also lose a point, but
you'll only lose one point.

163
00:08:26,564 --> 00:08:27,980
Potentially, you
gain many points,

164
00:08:27,980 --> 00:08:31,620
but you gain at least one point
and lose at most one point

165
00:08:31,620 --> 00:08:36,240
by switching from both v
and w true into just one

166
00:08:36,240 --> 00:08:37,490
of them true.

167
00:08:37,490 --> 00:08:41,840
So you can always convert
without losing anything in OPT

168
00:08:41,840 --> 00:08:44,900
into a solution that satisfies
all edge constraints.

169
00:08:44,900 --> 00:08:48,000
And then we know we
have an independent set.

170
00:08:48,000 --> 00:08:50,070
That's what the edge
constraints say.

171
00:08:50,070 --> 00:08:53,006
And therefore the
remaining problem

172
00:08:53,006 --> 00:08:54,630
is to maximize the
number vertices that

173
00:08:54,630 --> 00:08:55,853
are in the independent set.

174
00:09:04,350 --> 00:09:08,820
So that means if we're given
any solution y prime to this Max

175
00:09:08,820 --> 00:09:11,915
2SAT instance, we can convert
it back to an independent set.

176
00:09:11,915 --> 00:09:14,970
Now it's not quite
of the same value.

177
00:09:14,970 --> 00:09:20,700
In general, the optimal solution
here for the 2SAT instance

178
00:09:20,700 --> 00:09:23,080
is going to be the
optimal solution

179
00:09:23,080 --> 00:09:28,197
for the independent set instance
plus the total number of edges,

180
00:09:28,197 --> 00:09:30,030
because we're going to
satisfy all of these.

181
00:09:30,030 --> 00:09:32,030
That's what we just showed.

182
00:09:32,030 --> 00:09:36,410
So this is where we get a
kind of additive behavior,

183
00:09:36,410 --> 00:09:39,060
like in this L-reduction.

184
00:09:39,060 --> 00:09:40,800
The gap is an additive thing.

185
00:09:40,800 --> 00:09:42,310
But here it's a
nice fixed thing.

186
00:09:42,310 --> 00:09:46,080
And so these are
pretty much the same.

187
00:09:46,080 --> 00:09:48,160
There's just this
additive offset.

188
00:09:48,160 --> 00:09:51,110
So that's going to be fine in
terms of the second property.

189
00:09:51,110 --> 00:09:54,310
The additive difference between
one of these solutions and OPT

190
00:09:54,310 --> 00:09:55,460
will be exactly the same.

191
00:09:55,460 --> 00:09:58,017
The beta here at this
constant will be 1.

192
00:09:58,017 --> 00:10:00,100
But we do have to worry
about the first condition.

193
00:10:00,100 --> 00:10:02,641
We need to make sure OPT doesn't
blow up too much, because we

194
00:10:02,641 --> 00:10:04,490
did make it bigger.

195
00:10:04,490 --> 00:10:08,790
So for that, all
we need is this is

196
00:10:08,790 --> 00:10:12,200
Omega of the number of vertices.

197
00:10:12,200 --> 00:10:16,840
And that's because we assumed
our graph had bounded degree,

198
00:10:16,840 --> 00:10:19,610
and so we can always find
an independent set of size

199
00:10:19,610 --> 00:10:23,190
something like n over constant.

200
00:10:23,190 --> 00:10:24,610
So because that's
already linear,

201
00:10:24,610 --> 00:10:26,630
we only added
another linear thing.

202
00:10:26,630 --> 00:10:32,120
Again, also this is
order the number of vertices.

203
00:10:32,120 --> 00:10:35,200
So we're not adding too
much relative to this,

204
00:10:35,200 --> 00:10:37,110
because bounded degree.
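
One way to make the constants explicit, assuming maximum degree 3 and using the standard greedy bound that a graph with n vertices and maximum degree Delta has an independent set of size at least n/(Delta + 1). Here n is the number of vertices and |E| the number of edges; the specific value alpha = 7 is worked out here and was not stated in the lecture.

```latex
\begin{align*}
|E| &\le \tfrac{3}{2}\,n, \qquad \mathrm{OPT}_{\mathrm{IS}} \ge \tfrac{n}{4}, \\
\mathrm{OPT}_{\mathrm{2SAT}} &= \mathrm{OPT}_{\mathrm{IS}} + |E|
  \le \mathrm{OPT}_{\mathrm{IS}} + \tfrac{3}{2}\,n
  \le \mathrm{OPT}_{\mathrm{IS}} + 6\,\mathrm{OPT}_{\mathrm{IS}}
  = 7\,\mathrm{OPT}_{\mathrm{IS}}.
\end{align*}
```

So under these assumptions the reduction is an L-reduction with alpha = 7 and beta = 1.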

205
00:10:37,110 --> 00:10:37,900
Cool?

206
00:10:37,900 --> 00:10:40,460
So that's Max
2SAT, APX-hardness.

207
00:10:48,110 --> 00:10:53,190
Fun fact which I won't prove.

208
00:10:53,190 --> 00:11:03,050
Max E2SAT-E3 is
also APX-complete.

209
00:11:03,050 --> 00:11:07,030
So here we got some bounded
number of occurrences.

210
00:11:07,030 --> 00:11:10,680
I guess each variable is
going to appear in one

211
00:11:10,680 --> 00:11:13,060
plus three, four clauses.

212
00:11:13,060 --> 00:11:16,090
You can get that down to
three clauses per variable.

213
00:11:21,150 --> 00:11:21,650
OK.

214
00:11:27,810 --> 00:11:29,630
Now that we have
Max 2SAT, we can

215
00:11:29,630 --> 00:11:36,105
do another one, which is
Max not all equal 3SAT.

216
00:11:40,580 --> 00:11:48,060
So from SAT-land, we have
3SAT, not all equal 3SAT,

217
00:11:48,060 --> 00:11:49,710
and 1-in-3SAT.

218
00:11:49,710 --> 00:11:51,230
We're going to get all of those.

219
00:11:51,230 --> 00:11:53,320
Actually, we can
even get 1-in-2SAT.

220
00:11:53,320 --> 00:11:55,160
Little bit stronger.

221
00:11:55,160 --> 00:11:58,260
But let's do not all equal 3SAT.

222
00:11:58,260 --> 00:12:02,870
So here we are going
to do, I believe,

223
00:12:02,870 --> 00:12:15,920
a strict reduction from Max
2SAT which we just proved,

224
00:12:15,920 --> 00:12:18,510
APX-complete.

225
00:12:18,510 --> 00:12:20,230
Yeah.

226
00:12:20,230 --> 00:12:22,655
It's again in APX,
because you can, say, take

227
00:12:22,655 --> 00:12:24,330
your random
assignment, and you'll

228
00:12:24,330 --> 00:12:28,100
satisfy some constant
fraction of the clauses.
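
A quick calculation behind the random-assignment remark, assuming every clause has three literals on distinct variables and m denotes the number of clauses (m is notation introduced here):

```latex
\begin{align*}
\Pr[\text{all three literals equal}] &= \tfrac{2}{8} = \tfrac{1}{4}, \\
\mathbb{E}[\#\text{satisfied clauses}] &= \tfrac{3}{4}\,m \;\ge\; \tfrac{3}{4}\,\mathrm{OPT}.
\end{align*}
```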

229
00:12:28,100 --> 00:12:30,710
And OK.

230
00:12:30,710 --> 00:12:32,630
So here's the reduction.

231
00:12:32,630 --> 00:12:34,370
Again, very easy.

232
00:12:34,370 --> 00:12:36,294
Suppose we're starting
from Max 2SAT,

233
00:12:36,294 --> 00:12:37,710
so all our clauses
look like this.

234
00:12:37,710 --> 00:12:40,380
These may be negated or not.

235
00:12:40,380 --> 00:12:50,700
And we're going to convert it
into not all equal of x, y,

236
00:12:50,700 --> 00:12:52,056
and a.

237
00:12:52,056 --> 00:12:57,250
a is a new variable, and it
appears in every single clause.

238
00:12:57,250 --> 00:12:57,750
OK?

239
00:12:57,750 --> 00:12:58,791
So this is kind of funny.

240
00:13:01,980 --> 00:13:04,880
So a appears everywhere.

241
00:13:04,880 --> 00:13:06,990
And not all equal has
this nice symmetry, right?

242
00:13:06,990 --> 00:13:08,240
There wasn't really
a zero or one.

243
00:13:08,240 --> 00:13:09,990
You can think of
them as red, as blue.

244
00:13:09,990 --> 00:13:12,910
Doesn't matter whether red
is true or blue is true.

245
00:13:12,910 --> 00:13:15,130
So in particular, we
can use that symmetry

246
00:13:15,130 --> 00:13:18,530
to consider a as false.

247
00:13:18,530 --> 00:13:21,210
So by possibly flipping everything,
flipping everything,

248
00:13:21,210 --> 00:13:25,080
we can imagine
that a equals zero.

249
00:13:25,080 --> 00:13:29,242
If not, flip all the bits, and
you'll still be not all equal.

250
00:13:29,242 --> 00:13:31,700
Or all the things that were
not all equal before will still

251
00:13:31,700 --> 00:13:32,408
be not all equal.

252
00:13:32,408 --> 00:13:34,350
You'll preserve OPT.

253
00:13:34,350 --> 00:13:38,540
Now once you think of a as
false, then not all equal

254
00:13:38,540 --> 00:13:41,790
is saying that these
are not both 0, which is

255
00:13:41,790 --> 00:13:44,010
the same thing as saying 2SAT.

256
00:13:44,010 --> 00:13:45,890
Duh.

257
00:13:45,890 --> 00:13:46,510
OK.

258
00:13:46,510 --> 00:13:49,370
Again, I mean this is
saying OPT is preserved.

259
00:13:49,370 --> 00:13:51,560
But if you take any
solution to this problem,

260
00:13:51,560 --> 00:13:54,100
you first possibly flip
it so that a is zero,

261
00:13:54,100 --> 00:13:57,910
and then the x, y's are just exactly the x, y's over here,
exactly the xy's over here,

262
00:13:57,910 --> 00:14:00,450
and you'll preserve the
size of the solution.

263
00:14:00,450 --> 00:14:02,420
You won't get any scale
here, and you also

264
00:14:02,420 --> 00:14:04,840
preserved OPT exactly.

265
00:14:04,840 --> 00:14:06,750
So it's in particular
an L-reduction,

266
00:14:06,750 --> 00:14:08,750
but it's even a
strict reduction.

267
00:14:08,750 --> 00:14:10,820
Didn't lose anything.

268
00:14:10,820 --> 00:14:14,110
No additive slop or whatever.
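
The reduction and its solution map can be sketched as below. Assumptions for this sketch: every input clause has exactly two literals, a literal is a (variable, polarity) pair, and the fresh variable is stored under the hypothetical name "a". The helper names are made up here.

```python
A = "a"  # hypothetical name for the fresh variable added to every clause

def max2sat_to_maxnae3sat(clauses):
    """f: each 2SAT clause (l1 or l2) becomes NAE(l1, l2, a)."""
    return [(l1, l2, (A, True)) for l1, l2 in clauses]

def lit_value(lit, assignment):
    var, polarity = lit
    return assignment[var] == polarity

def count_2sat(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(any(lit_value(l, assignment) for l in c) for c in clauses)

def count_nae(clauses, assignment):
    """Number of clauses whose literals are not all equal."""
    return sum(len({lit_value(l, assignment) for l in c}) == 2 for c in clauses)

def recover(assignment):
    """g: if a is true, flip every variable (NAE is symmetric), then drop a."""
    flip = assignment[A]
    return {v: (val != flip) for v, val in assignment.items() if v != A}
```

For every assignment to the NAE instance, count_2sat on the recovered assignment equals count_nae on the original one, which is the "no scaling, no additive slop" property of a strict reduction.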

269
00:14:14,110 --> 00:14:14,750
OK.

270
00:14:14,750 --> 00:14:16,430
That's nice.

271
00:14:16,430 --> 00:14:21,625
Next is usually called Max-Cut.

272
00:14:24,690 --> 00:14:25,760
You're given a graph.

273
00:14:25,760 --> 00:14:27,550
You want to split
it into two parts

274
00:14:27,550 --> 00:14:31,660
to maximize the number of
edges between the two parts.

275
00:14:31,660 --> 00:14:43,100
But this is the same thing
as Max Positive 1-in-2SAT,

276
00:14:43,100 --> 00:14:46,360
which is simpler
than 1-in-3SAT.

277
00:14:46,360 --> 00:14:51,860
You have, I mean, in a cut,
again, you have two sides.

278
00:14:51,860 --> 00:14:54,320
Call them true or false, or
red and blue, or whatever.

279
00:14:54,320 --> 00:14:57,800
You would like to assign
exactly one of these to be true.

280
00:14:57,800 --> 00:15:00,010
Then that edge
will be in the cut.

281
00:15:00,010 --> 00:15:01,630
So it's the same problem.

282
00:15:01,630 --> 00:15:10,440
And you can also think of
it as max positive XOR-SAT.

283
00:15:10,440 --> 00:15:12,320
Maybe actually call it 2XOR-SAT.

284
00:15:15,320 --> 00:15:15,890
Same thing.

285
00:15:15,890 --> 00:15:19,100
It's just every constraint is
of the form this XOR this.

286
00:15:19,100 --> 00:15:21,480
You want to maximize the
number of those constraints.
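
To see that these formulations really count the same thing, here is a small sketch with one constraint per edge and one Boolean per vertex. The function names are chosen for this illustration only.

```python
def cut_size(edges, side):
    """Max-Cut view: edges whose endpoints lie on different sides."""
    return sum(side[u] != side[v] for u, v in edges)

def xor_satisfied(edges, value):
    """Positive 2XOR-SAT view: constraints (u XOR v) that hold."""
    return sum(value[u] ^ value[v] for u, v in edges)

def one_in_two_satisfied(edges, value):
    """Positive 1-in-2SAT view: clauses with exactly one true variable."""
    return sum((value[u] + value[v]) == 1 for u, v in edges)
```

All three functions return the same number for any Boolean labeling of the vertices, so maximizing one maximizes the others.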

287
00:15:21,480 --> 00:15:23,510
So a lot of these problems
have different formulations

288
00:15:23,510 --> 00:15:25,551
depending on whether you're
thinking about logic,

289
00:15:25,551 --> 00:15:27,970
or thinking about
a graph problem.

290
00:15:27,970 --> 00:15:31,890
So we're going to get all of
these four with one reduction.

291
00:15:31,890 --> 00:15:35,010
And it's going to be
from probably this one.

292
00:15:35,010 --> 00:15:36,430
Yes.

293
00:15:36,430 --> 00:15:38,260
The great chain of
reductions here.

294
00:15:47,320 --> 00:15:51,450
So we're going to reduce
from Max not all equal 3SAT.

295
00:15:54,060 --> 00:15:58,130
I should mention, all of the
reductions we've been seeing,

296
00:15:58,130 --> 00:16:01,260
including this initial batch
where we started from 3SAT,

297
00:16:01,260 --> 00:16:02,902
converted into 3SAT
3, converted it

298
00:16:02,902 --> 00:16:04,610
into an independent
set, to vertex cover,

299
00:16:04,610 --> 00:16:09,460
to dominating set to Max
2SAT, to Max not all equal 3SAT

300
00:16:09,460 --> 00:16:12,870
to Max-Cut, are all in this
seminal paper by Papadimitriou

301
00:16:12,870 --> 00:16:15,350
and Yannakakis, 1991.

302
00:16:15,350 --> 00:16:18,826
This is before APX
was really a thing.

303
00:16:18,826 --> 00:16:20,450
It had a different
name at that point--

304
00:16:20,450 --> 00:16:24,530
Max SNP-- which later was proved
to be essentially equal to APX,

305
00:16:24,530 --> 00:16:26,710
or the completeness
version is the same.

306
00:16:26,710 --> 00:16:29,019
You don't need to
know about that.

307
00:16:29,019 --> 00:16:31,310
It comes from a different
world, but all the reductions

308
00:16:31,310 --> 00:16:32,570
apply here.

309
00:16:32,570 --> 00:16:36,230
So here is the
reduction for a Max-Cut.

310
00:16:36,230 --> 00:16:39,040
So again we're trying
to simulate Max

311
00:16:39,040 --> 00:16:40,720
not all equal 3SAT.

312
00:16:40,720 --> 00:16:44,850
Now we actually saw in the
planar lecture, planar 3SAT,

313
00:16:44,850 --> 00:16:49,650
that you can reduce planar
not all equal 3SAT to planar

314
00:16:49,650 --> 00:16:53,470
Max-Cut, and that we use that to
get a polynomial time algorithm

315
00:16:53,470 --> 00:16:55,340
for planar not all equal 3SAT.

316
00:16:55,340 --> 00:16:57,190
We're just going
to do the reverse.

317
00:16:57,190 --> 00:17:00,930
And if you recall, this was
the heart of that reduction.

318
00:17:00,930 --> 00:17:03,810
The point is that
you can represent

319
00:17:03,810 --> 00:17:08,800
a not all equal clause as
a cut, as a Max-Cut problem

320
00:17:08,800 --> 00:17:09,599
on a triangle.

321
00:17:09,599 --> 00:17:11,864
Because in a triangle,
either they're all equal,

322
00:17:11,864 --> 00:17:14,849
and then there's no cut edges,
or they're not all equal,

323
00:17:14,849 --> 00:17:17,520
and then there's
exactly two cut edges.

324
00:17:17,520 --> 00:17:19,079
So that's for a clause of size 3.

325
00:17:19,079 --> 00:17:22,050
We also need to handle the
case of a clause of size 2.

326
00:17:22,050 --> 00:17:24,905
But that's a two-gon, I
guess, instead of a triangle.

327
00:17:24,905 --> 00:17:26,030
It works the same way here.

328
00:17:26,030 --> 00:17:29,950
You get 1 if they're not all
equal, and zero otherwise.

329
00:17:29,950 --> 00:17:32,120
This is shown as the zero case.

330
00:17:32,120 --> 00:17:32,620
OK.

331
00:17:32,620 --> 00:17:35,910
Now the one thing we need,
because not all equal 3SAT

332
00:17:35,910 --> 00:17:39,300
here, we need negation.

333
00:17:39,300 --> 00:17:45,480
So we're going to build each
variable and its negation

334
00:17:45,480 --> 00:17:46,350
with this gadget.

335
00:17:46,350 --> 00:17:48,220
This is a new gadget,
variable gadget.

336
00:17:48,220 --> 00:17:52,790
It's just a whole bunch of
edges connecting xi and xi bar.

337
00:17:52,790 --> 00:17:53,900
And you can make this.

338
00:17:53,900 --> 00:17:56,570
You can avoid the
multigraph aspect here.

339
00:17:56,570 --> 00:17:59,530
But let's not worry
about it here.

340
00:17:59,530 --> 00:18:03,950
So in general, if there are k
occurrences of this variable,

341
00:18:03,950 --> 00:18:07,370
then we're going to
have 2k parallel edges,

342
00:18:07,370 --> 00:18:11,480
because the cost over here, the
potential benefit here is 2.

343
00:18:11,480 --> 00:18:14,730
Again, we want to argue that
if we take an optimal solution,

344
00:18:14,730 --> 00:18:18,550
we can make it another optimal
solution where xi and xi

345
00:18:18,550 --> 00:18:21,706
bar are on opposite
sides of the cut.

346
00:18:21,706 --> 00:18:23,830
And the reason is, if
they're both on the same side

347
00:18:23,830 --> 00:18:27,690
of the cut, you're not
getting this benefit.

348
00:18:27,690 --> 00:18:29,930
If you flip one
of the sides, you

349
00:18:29,930 --> 00:18:32,150
get this huge
benefit, which is 2k.

350
00:18:32,150 --> 00:18:33,960
And you say, well,
how much do I lose

351
00:18:33,960 --> 00:18:37,470
if I flip this from one side
of the cut to the other.

352
00:18:37,470 --> 00:18:41,590
Well, it appears in at most k
different clauses, each of them

353
00:18:41,590 --> 00:18:43,480
gives me at most two points.

354
00:18:43,480 --> 00:18:45,970
So I'm losing, at
most, 2k points

355
00:18:45,970 --> 00:18:47,330
by making these opposite.

356
00:18:47,330 --> 00:18:48,410
But I gain 2k points.

357
00:18:48,410 --> 00:18:50,990
So it never hurts me
to do that switch.

358
00:18:50,990 --> 00:18:53,560
So I can assume these two
guys are on opposite sides,

359
00:18:53,560 --> 00:18:56,540
and therefore I can assume
it's sort of validly doing

360
00:18:56,540 --> 00:18:57,740
the negation part.

361
00:18:57,740 --> 00:19:01,810
And then it just reduces
to not all equal 3SAT.

362
00:19:01,810 --> 00:19:04,770
There's a difference between
this one, where we only

363
00:19:04,770 --> 00:19:07,250
get one point, and this
one we only get two points.

364
00:19:07,250 --> 00:19:09,212
AUDIENCE: You get two points.

365
00:19:09,212 --> 00:19:10,670
PROFESSOR: You get
two points here?

366
00:19:10,670 --> 00:19:10,910
Oh yeah.

367
00:19:10,910 --> 00:19:11,701
You get two points.

368
00:19:11,701 --> 00:19:14,820
That's why we doubled the edge.

369
00:19:14,820 --> 00:19:16,747
So that's cool.

370
00:19:16,747 --> 00:19:17,830
I think you would be fine.

371
00:19:17,830 --> 00:19:20,121
It'd still be an L-reduction
even if you have one edge.

372
00:19:20,121 --> 00:19:21,690
But this is nicer.

373
00:19:21,690 --> 00:19:23,410
And yeah.

374
00:19:23,410 --> 00:19:24,750
That's it.

375
00:19:24,750 --> 00:19:25,250
Cool.

376
00:19:25,250 --> 00:19:28,050
This is Max-Cut.

377
00:19:28,050 --> 00:19:32,020
It will be a
bounded degree based

378
00:19:32,020 --> 00:19:34,470
on the number of occurrences
we got, which was like four.

379
00:19:34,470 --> 00:19:37,600
I mean, we can use three,
and then we'll multiply.

380
00:19:37,600 --> 00:19:42,270
In general you can prove
Max-Cut remains APX-complete

381
00:19:42,270 --> 00:19:45,440
for degree three graphs.

382
00:19:45,440 --> 00:19:47,630
So we're not going
to prove it here.

383
00:19:47,630 --> 00:19:51,600
So another kind of reduction
trick to reduce degrees, just

384
00:19:51,600 --> 00:19:55,680
say degree 3 is possible.

385
00:19:55,680 --> 00:20:03,690
It's also Max Cut in degree
3 graphs is APX-complete.

386
00:20:03,690 --> 00:20:10,100
So you could call that max
positive 1-in-2SAT, hyphen 3.

387
00:20:10,100 --> 00:20:10,740
Maybe even E3.

388
00:20:13,580 --> 00:20:14,175
All right.

389
00:20:16,690 --> 00:20:18,587
So this gives you a flavor.

390
00:20:18,587 --> 00:20:20,420
This is a fun series
of reductions, each one

391
00:20:20,420 --> 00:20:22,150
building on the previous one.

392
00:20:22,150 --> 00:20:24,730
But it gives you kind
of starting point.

393
00:20:24,730 --> 00:20:27,310
A lot of the problems
we're familiar with in NP

394
00:20:27,310 --> 00:20:29,480
completeness land, if you
just add "Max" in front,

395
00:20:29,480 --> 00:20:32,930
they become hard.

396
00:20:32,930 --> 00:20:35,960
I mean I guess Max-Cut
always had a Max in front.

397
00:20:35,960 --> 00:20:38,850
Max 2SAT for NP completeness,
we also had a Max in front.

398
00:20:38,850 --> 00:20:41,341
So those are familiar,
and they're APX-complete.

399
00:20:41,341 --> 00:20:42,840
All of the problems,
I've described,

400
00:20:42,840 --> 00:20:44,298
at least for bounded
degree graphs,

401
00:20:44,298 --> 00:20:46,340
have constant factor
approximations.

402
00:20:46,340 --> 00:20:47,730
So this is the right level.

403
00:20:47,730 --> 00:20:49,350
They are APX-complete.

404
00:20:49,350 --> 00:20:51,650
And that determines
their approximability.

405
00:20:51,650 --> 00:20:52,710
Constant factor, no PTAS.

406
00:20:55,890 --> 00:21:03,060
Now it would be nice to know
which problems are hard.

407
00:21:03,060 --> 00:21:06,790
With NP-completeness,
and in the SAT universe,

408
00:21:06,790 --> 00:21:09,170
we had Schaefer's
dichotomy theorem that

409
00:21:09,170 --> 00:21:12,380
said-- let me cheat and
look at my notes from,

410
00:21:12,380 --> 00:21:17,390
I think, lecture four--
that SAT is polynomial if

411
00:21:17,390 --> 00:21:18,980
and only if the
clauses that you're

412
00:21:18,980 --> 00:21:21,270
allowed to do-- the
operations you're allowed

413
00:21:21,270 --> 00:21:25,491
to do with variables--
either have

414
00:21:25,491 --> 00:21:27,740
the property that when you
set all the variables true,

415
00:21:27,740 --> 00:21:28,810
everything's satisfied.

416
00:21:28,810 --> 00:21:31,730
Or you set all the variables
false, everything satisfied.

417
00:21:31,730 --> 00:21:37,080
Or every single clause is a
conjunction of Horn clauses.

418
00:21:37,080 --> 00:21:43,200
Horn clauses were a few
variables, and at most one

419
00:21:43,200 --> 00:21:45,200
of them is positive.

420
00:21:45,200 --> 00:21:48,520
Or all the clauses you have
are conjunctions of Dual-Horn,

421
00:21:48,520 --> 00:21:54,300
which was, in every clause at
most one of them is negated,

422
00:21:54,300 --> 00:21:58,900
or all of the clauses
are conjunctions of 2CNF,

423
00:21:58,900 --> 00:22:00,660
only like 2SAT.

424
00:22:00,660 --> 00:22:05,010
Or what I didn't give
a name at the time,

425
00:22:05,010 --> 00:22:10,140
but is essentially a slight
generalization of XOR-SAT.

426
00:22:10,140 --> 00:22:11,580
Let me give it a name here.

427
00:22:11,580 --> 00:22:13,040
I'm going to call it X(N)OR-SAT.

428
00:22:19,350 --> 00:22:23,190
You can also phrase them as
linear equations over Z2.

429
00:22:32,390 --> 00:22:34,290
So this is zero and one.

430
00:22:34,290 --> 00:22:38,120
And it's either XOR, meaning
you take the XOR of all

431
00:22:38,120 --> 00:22:40,340
the things-- that's like
the summation of all things,

432
00:22:40,340 --> 00:22:42,370
or it's X(N)OR, meaning
when you take that sum,

433
00:22:42,370 --> 00:22:44,420
it should equal zero.

434
00:22:44,420 --> 00:22:46,300
And such systems
of linear equations

435
00:22:46,300 --> 00:22:52,250
can be solved in polynomial
time using Gaussian elimination

436
00:22:52,250 --> 00:22:53,920
over Z2.
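
A minimal sketch of the Gaussian elimination just mentioned, assuming each clause is given as a list of variable indices plus a right-hand side (1 for an XOR clause, 0 for an XNOR clause). The function name `solve_xnor_sat` and the bitmask row encoding are illustrative choices, not something from the lecture.

```python
def solve_xnor_sat(equations, num_vars):
    """Solve linear equations over Z2 by Gauss-Jordan elimination.

    equations: list of (variables, rhs); the XOR of the listed
    variables must equal rhs. Returns a 0/1 list, or None if the
    system is inconsistent. Free variables are set to 0.
    """
    # One bitmask per equation; bit `num_vars` stores the right-hand side.
    rows = []
    for variables, rhs in equations:
        row = 0
        for v in variables:
            row ^= 1 << v  # a variable listed twice cancels mod 2
        row |= rhs << num_vars
        rows.append(row)

    pivots = []  # (column, fully reduced row)
    for col in range(num_vars):
        pivot = next((r for r in rows if (r >> col) & 1), None)
        if pivot is None:
            continue  # no remaining equation mentions this variable
        rows.remove(pivot)
        # Clear this column from every other row, pivots included.
        rows = [r ^ pivot if (r >> col) & 1 else r for r in rows]
        pivots = [(c, p ^ pivot if (p >> col) & 1 else p) for c, p in pivots]
        pivots.append((col, pivot))

    # Leftover rows have no variables; "0 = 1" means no solution.
    if any((r >> num_vars) & 1 for r in rows):
        return None

    assignment = [0] * num_vars
    for col, row in pivots:
        assignment[col] = (row >> num_vars) & 1
    return assignment
```

Each variable is eliminated once across all rows, so the running time is polynomial in the number of variables and equations.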

437
00:22:53,920 --> 00:22:56,060
And all of the things
I just mentioned

438
00:22:56,060 --> 00:22:59,420
are all the situations
where SAT is polynomial.

439
00:22:59,420 --> 00:23:03,810
Every other type of clause,
SAT is NP-complete--

440
00:23:03,810 --> 00:23:05,607
or set of clauses.

441
00:23:05,607 --> 00:23:06,690
Now why do I mention this?

442
00:23:06,690 --> 00:23:11,520
Because there is an
analogous theorem for it's

443
00:23:11,520 --> 00:23:15,690
not quite SAT, because we
need something like this Max.

444
00:23:15,690 --> 00:23:17,690
We need to turn it into
an optimization problem.

445
00:23:17,690 --> 00:23:21,050
SAT is not normally an
optimization problem by itself.

446
00:23:21,050 --> 00:23:25,270
And characterizing how
approximable those problems are.

447
00:23:25,270 --> 00:23:32,750
Now it is a complicated
theorem-- so complicated,

448
00:23:32,750 --> 00:23:35,200
that I don't want to
write it on the board,

449
00:23:35,200 --> 00:23:36,670
because there's a lot of cases.

450
00:23:36,670 --> 00:23:39,140
But the point is,
it's exhaustive.

451
00:23:39,140 --> 00:23:41,166
It will tell you if
you have anything

452
00:23:41,166 --> 00:23:42,540
of the type we
had with Schaefer,

453
00:23:42,540 --> 00:23:44,515
which was you define a
kind of clause function.

454
00:23:44,515 --> 00:23:46,190
It's either satisfied or not.

455
00:23:46,190 --> 00:23:48,120
It applies to some
number of variables.

456
00:23:48,120 --> 00:23:51,150
And then, once you've
defined that clause type,

457
00:23:51,150 --> 00:23:52,830
you can apply it
to any combination

458
00:23:52,830 --> 00:23:54,450
of variables you want.

459
00:23:54,450 --> 00:23:57,400
That family of problems
with no other restrictions

460
00:23:57,400 --> 00:23:58,505
is what we get.

461
00:23:58,505 --> 00:24:03,590
And I will just tell you
what the problems are.

462
00:24:03,590 --> 00:24:04,547
There's four of them.

463
00:24:04,547 --> 00:24:06,380
This is part of what
makes the theorem long,

464
00:24:06,380 --> 00:24:08,750
but also extremely powerful.

465
00:24:08,750 --> 00:24:12,340
The first dichotomy
is max versus min.

466
00:24:12,340 --> 00:24:15,580
And then the second
dichotomy is they

467
00:24:15,580 --> 00:24:18,162
call it CSP for constraint
satisfaction problem.

468
00:24:18,162 --> 00:24:19,620
So you have a bunch
of constraints.

469
00:24:19,620 --> 00:24:21,970
You want to satisfy
as many as possible.

470
00:24:21,970 --> 00:24:26,750
So this would be the number
of satisfied constraints

471
00:24:26,750 --> 00:24:29,940
is your objective, or
your cost function.

472
00:24:33,240 --> 00:24:37,870
Or the other version is what's
called the ones problem, or max

473
00:24:37,870 --> 00:24:39,530
ones, or min ones.

474
00:24:39,530 --> 00:24:42,560
This is the number
of true variables.

475
00:24:48,010 --> 00:24:52,060
So again, we have a
Schaefer-like SAT style

476
00:24:52,060 --> 00:24:53,132
of set of clauses.

477
00:24:53,132 --> 00:24:55,590
Either we want to maximize the
number of satisfied clauses,

478
00:24:55,590 --> 00:24:58,170
or we want to minimize the
number of satisfied clauses,

479
00:24:58,170 --> 00:25:02,360
or we want to maximize the
number of true variables

480
00:25:02,360 --> 00:25:03,980
and satisfy everything.

481
00:25:03,980 --> 00:25:06,360
Or we want to minimize the
number of true variables

482
00:25:06,360 --> 00:25:09,040
and satisfy everything.
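
To make the four objectives concrete, here is a brute-force sketch that evaluates all of them on a tiny instance. It assumes constraints are given as (predicate, variable indices) pairs; the name `csp_objectives` and the dictionary keys are hypothetical, and Min CSP is computed exactly as worded above. It enumerates every assignment, so it is only for small examples.

```python
from itertools import product

def csp_objectives(num_vars, constraints):
    """Return the optimum of each of the four objectives by brute force.

    constraints: list of (predicate, variable_indices), where predicate
    takes one 0/1 argument per listed variable and returns a bool.
    The "ones" entries stay None if no assignment satisfies everything.
    """
    best = {"max_csp": 0, "min_csp": len(constraints),
            "max_ones": None, "min_ones": None}
    for assignment in product([0, 1], repeat=num_vars):
        satisfied = sum(
            1 for pred, idx in constraints
            if pred(*(assignment[i] for i in idx))
        )
        best["max_csp"] = max(best["max_csp"], satisfied)
        best["min_csp"] = min(best["min_csp"], satisfied)
        # The ones problems only count fully feasible assignments.
        if satisfied == len(constraints):
            ones = sum(assignment)
            if best["max_ones"] is None or ones > best["max_ones"]:
                best["max_ones"] = ones
            if best["min_ones"] is None or ones < best["min_ones"]:
                best["min_ones"] = ones
    return best

# Two variables, one OR constraint and one XOR constraint.
example = csp_objectives(2, [
    (lambda a, b: bool(a or b), (0, 1)),
    (lambda a, b: a != b, (0, 1)),
])
```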

483
00:25:09,040 --> 00:25:09,540
OK.

484
00:25:09,540 --> 00:25:11,930
Now obviously, if the
SAT problem is hard,

485
00:25:11,930 --> 00:25:13,840
it's going to be
hard to do this.

486
00:25:13,840 --> 00:25:15,710
But it's still interesting.

487
00:25:15,710 --> 00:25:17,000
You can still think about it.

488
00:25:17,000 --> 00:25:23,260
And even when the SAT problem
is easy, Max ones can be hard.

489
00:25:23,260 --> 00:25:25,650
So I am going to--
I wrote it all down,

490
00:25:25,650 --> 00:25:27,280
and then I realized
how long it was.

491
00:25:27,280 --> 00:25:29,060
And so I will just show you.

492
00:25:29,060 --> 00:25:32,460
Imagine I just hand-wrote this.

493
00:25:32,460 --> 00:25:35,310
So this is the easy case.

494
00:25:35,310 --> 00:25:36,401
Max CSP.

495
00:25:36,401 --> 00:25:38,400
So we want to maximize
the number of constraints

496
00:25:38,400 --> 00:25:40,990
that we satisfy.

497
00:25:40,990 --> 00:25:45,430
And I'm going to characterize
when it is polynomial.

498
00:25:45,430 --> 00:25:47,710
Now here, PO I haven't
defined, but that's

499
00:25:47,710 --> 00:25:49,737
the analog of P for
optimization problems.

500
00:25:49,737 --> 00:25:51,570
So it's the set of all
optimization problems

501
00:25:51,570 --> 00:25:55,340
that are in P that have a
polynomial-time algorithm

502
00:25:55,340 --> 00:25:57,110
to solve them exactly.

503
00:25:57,110 --> 00:25:58,520
So it turns out
in this situation

504
00:25:58,520 --> 00:26:01,330
you are either polynomial
or APX-complete.

505
00:26:01,330 --> 00:26:04,940
So it's only about constant
factor versus perfect.

506
00:26:04,940 --> 00:26:08,310
There's never a PTAS, unless
there's a polynomial time

507
00:26:08,310 --> 00:26:09,012
algorithm.

508
00:26:09,012 --> 00:26:10,470
And the cases should
look familiar.

509
00:26:10,470 --> 00:26:13,170
It's either when you set
all the variables true

510
00:26:13,170 --> 00:26:15,860
or all the variables false,
that satisfies everything.

511
00:26:15,860 --> 00:26:17,690
In that case, Max CSP
is, of course, easy.

512
00:26:17,690 --> 00:26:19,790
You can satisfy everything.

513
00:26:19,790 --> 00:26:23,150
Another case is if
you write the clauses

514
00:26:23,150 --> 00:26:26,510
in disjunctive normal
form-- this is a new type

515
00:26:26,510 --> 00:26:29,360
that we hadn't seen before,
all your clauses are--

516
00:26:29,360 --> 00:26:32,450
when you write them in DNF,
they have exactly two terms.

517
00:26:32,450 --> 00:26:36,265
So it's the OR of two things
that are anded together.

518
00:26:36,265 --> 00:26:36,765
Sorry.

519
00:26:36,765 --> 00:26:38,050
There's an "or" in the middle.

520
00:26:38,050 --> 00:26:40,340
And you have a bunch of
things anded together

521
00:26:40,340 --> 00:26:41,630
in each of my hands.

522
00:26:41,630 --> 00:26:44,760
And all the ones in here are
positive, and all the ones

523
00:26:44,760 --> 00:26:46,110
in here are negative.

524
00:26:46,110 --> 00:26:49,090
If every clause looks
like that, then you

525
00:26:49,090 --> 00:26:51,670
can solve this in
polynomial time.

526
00:26:51,670 --> 00:26:56,180
And in all other cases, this
problem is APX-complete.

527
00:26:56,180 --> 00:26:59,582
So that's a nice, very
clean characterization.

528
00:26:59,582 --> 00:27:01,998
AUDIENCE: Wait. [INAUDIBLE]
that we learned about earlier.

529
00:27:01,998 --> 00:27:03,390
Is this the [INAUDIBLE]?

530
00:27:03,390 --> 00:27:04,015
PROFESSOR: Yes.

531
00:27:04,015 --> 00:27:05,530
This is disjunctive normal form.

532
00:27:05,530 --> 00:27:09,390
So it's the or of ands.

533
00:27:09,390 --> 00:27:12,590
Usually we deal
with CNF, ands of ors.

534
00:27:12,590 --> 00:27:17,530
But for this
characterization, every clause

535
00:27:17,530 --> 00:27:19,810
can be uniquely
converted into a DNF,

536
00:27:19,810 --> 00:27:21,150
and uniquely converted into CNF.

537
00:27:21,150 --> 00:27:23,990
So that's a well-defined
thing to say.

538
00:27:26,405 --> 00:27:28,530
With Schaefer, we just had
to look at the CNF form.

539
00:27:28,530 --> 00:27:31,990
But here we get a
new set of things.

540
00:27:31,990 --> 00:27:33,130
All right.

541
00:27:33,130 --> 00:27:35,350
That was one out of four.

542
00:27:35,350 --> 00:27:37,240
Max Min CSP Ones.

543
00:27:37,240 --> 00:27:40,486
Next one is Max Ones.

544
00:27:40,486 --> 00:27:41,860
This is not the
most complicated.

545
00:27:44,540 --> 00:27:46,390
But let's go through them.

546
00:27:46,390 --> 00:27:49,862
So again, we want to maximize
the number of true variables.

547
00:27:49,862 --> 00:27:51,945
So of course, if we set
all the variables to true,

548
00:27:51,945 --> 00:27:55,570
and everything is satisfied,
yay, a polynomial, OK?

549
00:27:55,570 --> 00:27:58,180
But curiously, if you set all
the variables to false,

550
00:27:58,180 --> 00:28:02,910
and that satisfies everything,
that's going to be here.

551
00:28:02,910 --> 00:28:05,230
That's Poly-APX-complete.

552
00:28:05,230 --> 00:28:08,050
Poly-APX-complete, you can
translate to something like n

553
00:28:08,050 --> 00:28:10,160
to the 1 minus
epsilon, approximable,

554
00:28:10,160 --> 00:28:12,850
and that's the best you can do.

555
00:28:12,850 --> 00:28:15,620
Or there's a lower bound of
n to the 1 minus epsilon.

556
00:28:15,620 --> 00:28:18,231
Upper bound might
be n or something.

557
00:28:18,231 --> 00:28:18,730
OK.

558
00:28:18,730 --> 00:28:23,180
So because maximizing ones, when
setting things all at false,

559
00:28:23,180 --> 00:28:24,450
does not necessarily help you.

560
00:28:24,450 --> 00:28:26,800
There are some more
positive cases.

561
00:28:26,800 --> 00:28:28,730
If you have a Dual-Horn set up.

562
00:28:28,730 --> 00:28:31,270
So this is another one of
the Schaefer situations.

563
00:28:31,270 --> 00:28:34,675
If every clause when you write
it in CNF every subclause

564
00:28:34,675 --> 00:28:37,780
is Dual-Horn, at most,
one negated thing,

565
00:28:37,780 --> 00:28:40,070
that is a good situation
for maximizing ones,

566
00:28:40,070 --> 00:28:44,170
because only one of
them has to be negative.

567
00:28:44,170 --> 00:28:48,646
But with Horn, for example,
you get Poly-APX-complete,

568
00:28:48,646 --> 00:28:51,020
because we have an asymmetry
here between ones and zeros.

569
00:28:51,020 --> 00:28:51,968
Question?

570
00:28:51,968 --> 00:28:53,160
AUDIENCE: In this list,
do we just read down it

571
00:28:53,160 --> 00:28:54,210
until we hit the thing?

572
00:28:54,210 --> 00:28:55,010
PROFESSOR: Yes.

573
00:28:55,010 --> 00:28:55,860
Good question.

574
00:28:55,860 --> 00:29:01,290
This is a sequential algorithm
for determining what you have.

575
00:29:01,290 --> 00:29:03,430
If any of these says,
oh, you're in PO,

576
00:29:03,430 --> 00:29:05,760
then you should stop reading
the rest of the theorem.

577
00:29:05,760 --> 00:29:09,640
The way they write the theorem
is probably clearer.

578
00:29:09,640 --> 00:29:11,386
They write an else
if for each one,

579
00:29:11,386 --> 00:29:13,260
but I wrote it backwards,
so it's hard for me

580
00:29:13,260 --> 00:29:14,730
to write else if.

581
00:29:14,730 --> 00:29:15,410
Yeah.

582
00:29:15,410 --> 00:29:18,530
Occasionally I'll mention
that the previous things

583
00:29:18,530 --> 00:29:19,030
don't apply.

584
00:29:19,030 --> 00:29:20,860
But you should read
this sequentially.

585
00:29:24,100 --> 00:29:24,600
OK.

586
00:29:24,600 --> 00:29:25,870
So it was Dual-Horn.

587
00:29:25,870 --> 00:29:31,300
Another polynomial case is
what I call 2-X(N)OR-SAT,

588
00:29:31,300 --> 00:29:32,590
where the N is in parentheses.

589
00:29:32,590 --> 00:29:35,110
So in other words, you
have linear equations.

590
00:29:35,110 --> 00:29:39,300
Each equation only has two
terms, sort of like 2SAT.

591
00:29:39,300 --> 00:29:41,330
And you have equations
that say equal zero

592
00:29:41,330 --> 00:29:44,120
or equal one on those two terms.

593
00:29:44,120 --> 00:29:45,870
That is also
polynomially solvable.

594
00:29:45,870 --> 00:29:47,490
This is a special case.

595
00:29:47,490 --> 00:29:49,650
We didn't need the
2 for Schaefer.

596
00:29:49,650 --> 00:29:54,490
Here we need the 2, because if
you have X(N)OR-SAT in general.

597
00:29:54,490 --> 00:29:57,760
And when I say this, I
mean that all constraints

598
00:29:57,760 --> 00:29:58,940
fall into this category.

599
00:29:58,940 --> 00:30:00,990
If all constraints
are of this form,

600
00:30:00,990 --> 00:30:03,080
all clauses are of this
form, then you're good.

601
00:30:03,080 --> 00:30:06,420
If all clauses are of
the form X(N)OR-SAT,

602
00:30:06,420 --> 00:30:10,450
but they're not in this class,
they're not all of length 2,

603
00:30:10,450 --> 00:30:12,800
then the problem
becomes APX-complete,

604
00:30:12,800 --> 00:30:16,630
by contrast to
Schaefer, where, I mean,

605
00:30:16,630 --> 00:30:19,370
deciding whether you can satisfy
all those things is easy--

606
00:30:19,370 --> 00:30:22,670
maximizing the number of ones
when you do it is APX-complete.

607
00:30:22,670 --> 00:30:25,950
So that's particularly
interesting.

608
00:30:25,950 --> 00:30:27,700
AUDIENCE: Not all equal
3SAT fall in that?

609
00:30:27,700 --> 00:30:28,610
Is that?

610
00:30:32,620 --> 00:30:35,330
PROFESSOR: Not all equal 3SAT.

611
00:30:35,330 --> 00:30:37,527
AUDIENCE: Those are
X(N)OR clauses, right?

612
00:30:37,527 --> 00:30:38,110
PROFESSOR: No.

613
00:30:38,110 --> 00:30:39,526
They should not
be X(N)OR clauses,

614
00:30:39,526 --> 00:30:40,930
because it's NP-complete.

615
00:30:40,930 --> 00:30:42,800
And when you have
X(N)OR clauses,

616
00:30:42,800 --> 00:30:45,650
it's always polynomial to
decide whether you can satisfy

617
00:30:45,650 --> 00:30:47,040
everything.

618
00:30:47,040 --> 00:30:49,645
So it's in the other case.

619
00:30:52,570 --> 00:30:54,070
But good question,
because we should

620
00:30:54,070 --> 00:30:56,920
be getting APX-completeness.

621
00:30:56,920 --> 00:30:58,837
Yeah, but Max not all
equal 3SAT is different.

622
00:30:58,837 --> 00:31:01,128
Here we're trying to maximize
the number of clauses that

623
00:31:01,128 --> 00:31:01,810
were satisfied.

624
00:31:01,810 --> 00:31:04,309
So if you have not
all equal 3SAT,

625
00:31:04,309 --> 00:31:06,350
and you want to maximize
the number of ones, that

626
00:31:06,350 --> 00:31:08,724
means first you have to satisfy
not all equal 3SAT, which

627
00:31:08,724 --> 00:31:09,610
is hard.

628
00:31:09,610 --> 00:31:11,760
So that's going
to fall into this.

629
00:31:11,760 --> 00:31:13,800
The bottom one is feasibility.

630
00:31:13,800 --> 00:31:15,930
Just finding a feasible
solution is NP-hard.

631
00:31:18,590 --> 00:31:24,630
The X(N)OR-SAT is this thing--
linear equations over Z2.

632
00:31:24,630 --> 00:31:27,139
And it could be equal
to 0, or equal to 1.

633
00:31:27,139 --> 00:31:28,930
This is what you might
call an XOR clause,

634
00:31:28,930 --> 00:31:32,940
or this is an XOR clause,
this is an X(N)OR clause.

635
00:31:32,940 --> 00:31:36,890
So if they don't all have size
two, then you're APX-complete.

636
00:31:36,890 --> 00:31:41,400
But you can find a solution
by Schaefer's theorem.

637
00:31:41,400 --> 00:31:42,280
OK.

638
00:31:42,280 --> 00:31:45,390
So as I mentioned, Horn
clauses and 2SAT clauses

639
00:31:45,390 --> 00:31:46,570
are actually really hard.

640
00:31:46,570 --> 00:31:49,320
They're Poly-APX-complete,
n to the 1 minus epsilon.

641
00:31:49,320 --> 00:31:51,350
Also these are all
situations where

642
00:31:51,350 --> 00:31:54,724
you can find feasible solutions
easily by Schaefer, like when

643
00:31:54,724 --> 00:31:57,140
you can set them all false,
and that satisfies everything.

644
00:31:57,140 --> 00:31:58,020
It doesn't help you
when you're trying

645
00:31:58,020 --> 00:31:59,311
to maximize the number of ones.

646
00:31:59,311 --> 00:32:01,916
It just gets you to zero.

647
00:32:01,916 --> 00:32:03,040
Then you want to do better.

648
00:32:03,040 --> 00:32:06,680
And it's really hard to
get any better factor.

649
00:32:06,680 --> 00:32:08,630
One more situation.

650
00:32:08,630 --> 00:32:09,130
Sorry.

651
00:32:11,934 --> 00:32:13,350
There's a slight
distinction here.

652
00:32:13,350 --> 00:32:15,800
So suppose you have
the feature that you

653
00:32:15,800 --> 00:32:20,290
can set one variable
true, and the rest false.

654
00:32:20,290 --> 00:32:22,650
If that satisfies all your
constraints, then great,

655
00:32:22,650 --> 00:32:24,467
you found the value 1.

656
00:32:24,467 --> 00:32:26,300
And there's a big
difference between 0 and 1

657
00:32:26,300 --> 00:32:28,216
when you're looking at
relative approximation,

658
00:32:28,216 --> 00:32:30,950
because anything
divided by 0 is huge.

659
00:32:30,950 --> 00:32:32,880
So it's really hard
to get a good factor.

660
00:32:32,880 --> 00:32:33,760
That's the situation.

661
00:32:33,760 --> 00:32:35,260
Distinguishing
between 0 and greater

662
00:32:35,260 --> 00:32:39,150
than 0, which is an infinite
ratio, it could be NP-hard.

663
00:32:39,150 --> 00:32:41,470
That's when you,
in this situation,

664
00:32:41,470 --> 00:32:42,980
we set all the variables false.

665
00:32:42,980 --> 00:32:43,680
You get zero.

666
00:32:43,680 --> 00:32:46,690
But finding any other solution
is going to be NP-hard.

667
00:32:46,690 --> 00:32:48,280
Here, if you can
at least get 1, you

668
00:32:48,280 --> 00:32:50,930
can get an N approximation,
whereas here you

669
00:32:50,930 --> 00:32:52,320
can't get an N approximation.

670
00:32:52,320 --> 00:32:55,290
Here you can get
Poly approximation.

671
00:32:55,290 --> 00:32:57,700
And finally, if you have none
of the above situations,

672
00:32:57,700 --> 00:33:01,950
then testing feasibility is
NP-hard by Schaefer's theorem.

673
00:33:01,950 --> 00:33:04,310
So it's like Schaefer's
theorem, but some of the cases

674
00:33:04,310 --> 00:33:08,200
split up into parts.

675
00:33:08,200 --> 00:33:09,660
Now, that was maximization.

676
00:33:09,660 --> 00:33:10,510
Question?

677
00:33:10,510 --> 00:33:12,510
AUDIENCE: So, what's
special about 1 here?

678
00:33:12,510 --> 00:33:15,977
It seems to me if you
replace that 1 by K

679
00:33:15,977 --> 00:33:17,310
it should still be in that case.

680
00:33:17,310 --> 00:33:18,390
PROFESSOR: This case.

681
00:33:18,390 --> 00:33:19,330
AUDIENCE: Yeah.

682
00:33:19,330 --> 00:33:22,620
If I just replace that one
with a fixed K. Like 2.

683
00:33:22,620 --> 00:33:23,880
PROFESSOR: Yes.

684
00:33:23,880 --> 00:33:27,290
So that problem will
still be-- so if you

685
00:33:27,290 --> 00:33:30,000
can set all but
K of them false, I

686
00:33:30,000 --> 00:33:32,000
think you can also set
all but one of them false,

687
00:33:32,000 --> 00:33:33,430
and still satisfy.

688
00:33:33,430 --> 00:33:34,190
Yeah.

689
00:33:34,190 --> 00:33:35,310
So here's the thing.

690
00:33:35,310 --> 00:33:36,680
This is all variables, right?

691
00:33:36,680 --> 00:33:39,440
So the idea is you
have tons of variables,

692
00:33:39,440 --> 00:33:41,857
and let's say two of
them are set to true.

693
00:33:41,857 --> 00:33:43,440
So if you look at a
clause, the clause

694
00:33:43,440 --> 00:33:46,685
might just apply to these
guys-- all the false guys--

695
00:33:46,685 --> 00:33:49,060
or it might apply to false
guys and one of the true guys,

696
00:33:49,060 --> 00:33:52,595
or it might apply to false
guys and two of the true guys.

697
00:33:52,595 --> 00:33:54,220
All of those would
have to be satisfied

698
00:33:54,220 --> 00:33:56,050
in your hypothetical situation.

699
00:33:56,050 --> 00:33:58,810
If that's true, that implies
that all the clauses are

700
00:33:58,810 --> 00:34:00,950
satisfied when only one
of them is set true,

701
00:34:00,950 --> 00:34:02,400
and the rest are false.

702
00:34:02,400 --> 00:34:04,980
So your case would fall
into this case as well,

703
00:34:04,980 --> 00:34:07,260
and you'd get
Poly-APX-completeness again.

704
00:34:07,260 --> 00:34:10,040
So it's not totally obvious
when these things apply.

705
00:34:10,040 --> 00:34:14,256
But this is the complete
list of different cases.

706
00:34:14,256 --> 00:34:14,839
Any questions?

707
00:34:17,480 --> 00:34:19,530
OK.

708
00:34:19,530 --> 00:34:21,440
Two out of four.

709
00:34:21,440 --> 00:34:25,460
Next one, this is the
longest one, is Min CSP.

710
00:34:25,460 --> 00:34:28,639
Now here we don't get as
nice a characterization,

711
00:34:28,639 --> 00:34:31,159
because there are some
open problems left.

712
00:34:31,159 --> 00:34:33,420
I haven't checked whether
all of these open problems

713
00:34:33,420 --> 00:34:36,610
remain open, but as of
2001 they were open,

714
00:34:36,610 --> 00:34:38,639
which was a while ago.

715
00:34:38,639 --> 00:34:41,800
And we can check whether
there's more explicit status.

716
00:34:41,800 --> 00:34:45,310
But I have the status
as of this paper here.

717
00:34:45,310 --> 00:34:47,150
So Min CSP.

718
00:34:47,150 --> 00:34:51,130
This is, you want to minimize
the number of constraints

719
00:34:51,130 --> 00:34:54,122
that are satisfied,
whereas before we

720
00:34:54,122 --> 00:34:55,080
looked at maximization.

721
00:34:55,080 --> 00:34:58,740
There are only three cases
which were something like this.

722
00:34:58,740 --> 00:35:02,270
Again, if setting all the
variables false or true

723
00:35:02,270 --> 00:35:08,810
satisfies all the clauses,
this is good, apparently.

724
00:35:08,810 --> 00:35:10,830
That's less obvious
in this case.

725
00:35:10,830 --> 00:35:12,240
In general,
minimization problems

726
00:35:12,240 --> 00:35:14,365
behave quite differently
from maximization problems

727
00:35:14,365 --> 00:35:16,110
in terms of approximability.

728
00:35:16,110 --> 00:35:17,970
Maximization is
generally easier to

729
00:35:17,970 --> 00:35:22,130
approximate, because your
solutions tend to be big,

730
00:35:22,130 --> 00:35:24,370
and it's easier to
approximate big things.

731
00:35:24,370 --> 00:35:27,830
Minimization-- small-- is hard.

732
00:35:27,830 --> 00:35:31,380
Also we had the
situation from Max CSP,

733
00:35:31,380 --> 00:35:33,540
if when you write it
in DNF, is exactly

734
00:35:33,540 --> 00:35:35,107
two terms for every clause.

735
00:35:35,107 --> 00:35:36,690
One of them is all
positive variables,

736
00:35:36,690 --> 00:35:38,356
and the other is all
negative variables.

737
00:35:38,356 --> 00:35:40,470
That's also easy.

738
00:35:40,470 --> 00:35:46,270
And here's a new case
of APX-completeness.

739
00:35:46,270 --> 00:35:48,610
So if the problem
you're trying to solve

740
00:35:48,610 --> 00:35:51,290
is exactly this
problem, they call this,

741
00:35:51,290 --> 00:35:54,190
I think, implication
hitting set.

742
00:35:54,190 --> 00:35:57,910
So you have a clause which
lets you say x1 implies

743
00:35:57,910 --> 00:36:01,620
x2 for any two variables.

744
00:36:01,620 --> 00:36:06,010
And you have some set of
clauses like this, where you

745
00:36:06,010 --> 00:36:08,720
can say here's five variables.

746
00:36:08,720 --> 00:36:10,680
The OR of them is true.

747
00:36:10,680 --> 00:36:13,479
No negation here.

748
00:36:13,479 --> 00:36:15,520
So this is called hitting
set, meaning I give you

749
00:36:15,520 --> 00:36:19,370
a set of vertices in a graph,
and I want at least one of them

750
00:36:19,370 --> 00:36:22,320
to be hit, to be
included, to be true.

751
00:36:22,320 --> 00:36:24,700
And we're trying to minimize
the number of such things

752
00:36:24,700 --> 00:36:26,533
that we satisfy.

753
00:36:26,533 --> 00:36:31,490
So this turns out to be hard,
but only there's no PTAS,

754
00:36:31,490 --> 00:36:35,600
but there's a constant
factor approximation.

755
00:36:35,600 --> 00:36:38,360
And then we have
these four cases

756
00:36:38,360 --> 00:36:41,770
which show that they are
equivalent to known studied

757
00:36:41,770 --> 00:36:42,860
problems.

758
00:36:42,860 --> 00:36:44,720
So there are these
special cases.

759
00:36:44,720 --> 00:36:48,414
Other than these getting
any approximation

760
00:36:48,414 --> 00:36:49,830
factor of less
than infinity would

761
00:36:49,830 --> 00:36:52,430
require you to distinguish
between OPT is zero,

762
00:36:52,430 --> 00:36:55,400
and OPT is greater than
zero, and it's NP-complete,

763
00:36:55,400 --> 00:36:57,980
unless you have these.

764
00:36:57,980 --> 00:37:00,970
So there are some special
cases like Min Uncut.

765
00:37:00,970 --> 00:37:03,150
This is the reverse of Max Cut.

766
00:37:03,150 --> 00:37:05,880
You want to minimize the
number of uncut edges.

767
00:37:05,880 --> 00:37:10,320
So that plus Max Cut should be
equal to the number of edges.
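
A small brute-force check of that identity, as a sketch: it assumes vertices are numbered from 0 and edges are pairs of vertices, and the name `cut_and_uncut` is illustrative. Every two-sided partition cuts some edges and leaves the rest uncut, so maximizing one count minimizes the other.

```python
from itertools import product

def cut_and_uncut(num_vertices, edges):
    """Return (max cut size, min number of uncut edges) by enumeration."""
    max_cut = 0
    min_uncut = len(edges)
    for side in product([0, 1], repeat=num_vertices):
        # An edge is cut when its endpoints land on different sides.
        cut = sum(1 for u, v in edges if side[u] != side[v])
        uncut = sum(1 for u, v in edges if side[u] == side[v])
        max_cut = max(max_cut, cut)
        min_uncut = min(min_uncut, uncut)
    return max_cut, min_uncut

# A triangle: any partition leaves at least one edge uncut.
triangle = [(0, 1), (1, 2), (0, 2)]
max_cut, min_uncut = cut_and_uncut(3, triangle)
assert max_cut + min_uncut == len(triangle)
```

The two optima always sum to the number of edges, yet an approximation factor for one says little about the other: when the minimum number of uncut edges is small, a near-optimal cut can still leave many times more edges uncut than necessary.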

768
00:37:10,320 --> 00:37:12,920
But the approximability of the
two sides is quite different.

769
00:37:12,920 --> 00:37:16,480
And here the best
results are APX-hardness,

770
00:37:16,480 --> 00:37:19,900
and a log n upper bound
for approximation.

771
00:37:19,900 --> 00:37:21,870
So that's a little
bit harder maybe.

772
00:37:21,870 --> 00:37:25,110
It's at least as hard as this.

773
00:37:25,110 --> 00:37:30,480
And that happens when you are
in the 2-X(N)OR-SAT situation,

774
00:37:30,480 --> 00:37:33,320
something we saw
from the last slide.

775
00:37:33,320 --> 00:37:35,820
So here it reduces to
this other problem.

776
00:37:35,820 --> 00:37:39,025
Basically the same, but the
X(N)ORs don't buy you anything

777
00:37:39,025 --> 00:37:39,525
new.

778
00:37:42,580 --> 00:37:44,860
In the case of 2SAT,
you get a problem

779
00:37:44,860 --> 00:37:47,950
known as Min 2CNF deletion.

780
00:37:47,950 --> 00:37:51,780
And it's similar-- APX-hard,
and best approximation

781
00:37:51,780 --> 00:37:54,680
is log times log log.

782
00:37:54,680 --> 00:37:57,880
In the case where you
have X(N)OR-SAT in general,

783
00:37:57,880 --> 00:38:01,330
but it's not all of the linear
equations have only two terms--

784
00:38:01,330 --> 00:38:05,110
so we have some larger ones--
then it turns out to be

785
00:38:05,110 --> 00:38:07,000
equivalent to Nearest Codeword.

786
00:38:07,000 --> 00:38:10,120
So it turns out you can write
all such equations using

787
00:38:10,120 --> 00:38:13,260
either equations of length,
by using equations of length 3

788
00:38:13,260 --> 00:38:13,760
always.

789
00:38:13,760 --> 00:38:15,750
So this is a linear equation.

790
00:38:15,750 --> 00:38:20,820
This should equal 1, or
this says equals zero.

791
00:38:20,820 --> 00:38:23,276
And from that, you can
construct all such things.

792
00:38:23,276 --> 00:38:24,525
This is a really hard problem.

793
00:38:27,610 --> 00:38:29,800
Poly-APX-hardness is not known.

794
00:38:29,800 --> 00:38:31,680
Current best
lower bound is this 2

795
00:38:31,680 --> 00:38:33,460
to the log to the 1
minus epsilon, which

796
00:38:33,460 --> 00:38:37,440
we saw in the table of various
inapproximability results

797
00:38:37,440 --> 00:38:37,940
last time.

798
00:38:37,940 --> 00:38:42,620
So this is a little bit
smaller than n to the epsilon,

799
00:38:42,620 --> 00:38:43,890
but it's kind of close-ish.
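
To see where this bound sits, a short calculation may help. It assumes logarithms base 2 and a fixed epsilon strictly between 0 and 1; the symbols delta and c are introduced here only for the comparison.

```latex
\[
  2^{\log^{1-\varepsilon} n}
  \;=\; 2^{\log n / \log^{\varepsilon} n}
  \;=\; n^{1/\log^{\varepsilon} n}.
\]
% The exponent 1/\log^{\varepsilon} n tends to 0, so for every fixed \delta > 0
\[
  2^{\log^{1-\varepsilon} n} \;=\; o\!\left(n^{\delta}\right),
\]
% while for every constant c, (\log n)^c = 2^{c \log\log n} and
% c \log\log n = o(\log^{1-\varepsilon} n), so
\[
  (\log n)^{c} \;=\; o\!\left(2^{\log^{1-\varepsilon} n}\right).
\]
```

So the bound grows faster than any polylogarithm but slower than any fixed polynomial.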

800
00:38:47,150 --> 00:38:50,300
And finally, in the--
I didn't write it.

801
00:38:50,300 --> 00:38:52,810
If you're in CNF form,
and all of the subclauses

802
00:38:52,810 --> 00:38:55,960
are either Horn, or all of
the subclauses are Dual-Horn,

803
00:38:55,960 --> 00:39:00,350
then you get something
called Min Horn Deletion.

804
00:39:00,350 --> 00:39:02,170
And this has the same
inapproximability.

805
00:39:04,730 --> 00:39:06,070
Here it's known.

806
00:39:06,070 --> 00:39:07,580
So up here, the
best approximation

807
00:39:07,580 --> 00:39:11,770
is n-- nothing, basically.

808
00:39:11,770 --> 00:39:13,110
Put them all in.

809
00:39:13,110 --> 00:39:16,990
And here there's a slightly
better approximation known,

810
00:39:16,990 --> 00:39:18,990
I think, n to the 1 minus
epsilon, or something.

811
00:39:18,990 --> 00:39:20,804
But these are all super hard.

812
00:39:20,804 --> 00:39:22,470
The main point of
this is so that you're

813
00:39:22,470 --> 00:39:23,820
aware of these problems.

814
00:39:23,820 --> 00:39:26,640
If you ever encounter a problem
that looks anything like this,

815
00:39:26,640 --> 00:39:29,740
or it looks like some
kind of CSP problem,

816
00:39:29,740 --> 00:39:31,900
you should go to this
list and check it out.

817
00:39:31,900 --> 00:39:35,430
So don't memorize these,
but look at the notes.

818
00:39:35,430 --> 00:39:36,772
Definitely memorize these guys.

819
00:39:36,772 --> 00:39:37,730
These are good to know.

820
00:39:37,730 --> 00:39:42,140
But there's a few
obscure problems here.

821
00:39:42,140 --> 00:39:42,640
OK.

822
00:39:42,640 --> 00:39:47,560
Last one is minimizing
the number of ones.

823
00:39:47,560 --> 00:39:49,990
So this is like the
hardest of two worlds.

824
00:39:49,990 --> 00:39:51,760
Minimization is kind of harder.

825
00:39:51,760 --> 00:39:54,460
And here you have to satisfy
everything, but minimize

826
00:39:54,460 --> 00:39:56,390
the number of true variables.

827
00:39:59,530 --> 00:40:03,250
So this is easy if you
can set them all false.

828
00:40:03,250 --> 00:40:04,820
And then you win.

829
00:40:04,820 --> 00:40:07,120
This is easy in the Horn case.

830
00:40:07,120 --> 00:40:09,170
The Horn case is when
at most one is positive,

831
00:40:09,170 --> 00:40:11,900
so most of them
can be set to zero.
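
One way to see why the Horn case is easy, as a sketch: forward chaining computes the least model of a set of Horn clauses, and that model has the fewest true variables of any satisfying assignment. The clause encoding (a list of negated variables plus at most one positive variable) and the name `horn_min_ones` are assumptions for illustration. It covers plain Horn clauses only.

```python
def horn_min_ones(clauses):
    """Return the set of variables forced true, or None if unsatisfiable.

    clauses: list of (negatives, positive). A clause reads
    "if every variable in negatives is true, then positive is true";
    positive is None when the clause has no positive literal.
    """
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for negatives, positive in clauses:
            # The clause only bites once all its negated variables are true.
            if all(v in true_vars for v in negatives):
                if positive is None:
                    return None  # every literal of this clause is false
                if positive not in true_vars:
                    true_vars.add(positive)
                    changed = True
    # Everything not forced true can stay false.
    return true_vars

# x0; x0 -> x1; not (x1 and x2); x2 -> x3
model = horn_min_ones([([], 0), ([0], 1), ([1, 2], None), ([2], 3)])
```

Only variables that every solution must set true are ever switched on, which is why the result minimizes the number of ones.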

832
00:40:11,900 --> 00:40:15,990
This is easy in
the 2X(N)OR case.

833
00:40:15,990 --> 00:40:19,060
So if you have linear equations,
two terms each, equal to 0

834
00:40:19,060 --> 00:40:21,320
or equals 1, that's also.

835
00:40:21,320 --> 00:40:24,100
And you want to minimize the
number of true variables.

836
00:40:24,100 --> 00:40:25,410
That's good.

837
00:40:25,410 --> 00:40:28,060
If you're in 2CNF form,
there's a constant factor

838
00:40:28,060 --> 00:40:28,780
approximation.

839
00:40:28,780 --> 00:40:30,240
That's the best you can do.

840
00:40:30,240 --> 00:40:30,781
APX-complete.

841
00:40:33,090 --> 00:40:36,300
This is a case from
the last slide.

842
00:40:36,300 --> 00:40:39,290
If you have the hitting set
constraints on constant number

843
00:40:39,290 --> 00:40:41,830
of constant size
vertex sets, and you

844
00:40:41,830 --> 00:40:44,230
have implication constraints,
then your problem

845
00:40:44,230 --> 00:40:45,535
is APX-complete again.

846
00:40:48,380 --> 00:40:50,300
And then we have these
guys appearing, again

847
00:40:50,300 --> 00:40:51,070
Nearest Codeword.

848
00:40:51,070 --> 00:40:52,980
And Min Horn Deletion.

849
00:40:52,980 --> 00:40:55,020
This one we get in
the Dual-Horn case.

850
00:40:55,020 --> 00:40:56,490
The Horn case is good.

851
00:40:56,490 --> 00:40:59,880
Dual-Horn, we get this
thing, which was like log N

852
00:40:59,880 --> 00:41:00,380
approximal.

853
00:41:00,380 --> 00:41:01,490
Or no.

854
00:41:01,490 --> 00:41:05,880
This was the 2 to the log
N to the 1 minus epsilon.

855
00:41:05,880 --> 00:41:10,380
And this is X(N)OR-SAT when
they're not all binary.

856
00:41:10,380 --> 00:41:12,870
Then we get Nearest
Codeword-complete.

857
00:41:12,870 --> 00:41:16,590
And finally, oh, two more.

858
00:41:16,590 --> 00:41:19,450
The dual to this, if all
the variables being set true

859
00:41:19,450 --> 00:41:22,590
satisfies your constraint,
that gives you a solution,

860
00:41:22,590 --> 00:41:27,780
but it's like the worst solution
possible, because you get N.

861
00:41:27,780 --> 00:41:32,320
And so in that case, you can get
probably a poly approximation.

862
00:41:32,320 --> 00:41:34,740
Not very impressive.

863
00:41:34,740 --> 00:41:37,380
And that's actually the
best you can do, at some N

864
00:41:37,380 --> 00:41:39,180
to the 1 minus epsilon.

865
00:41:39,180 --> 00:41:42,250
And in all other cases,
by Schaefer's theorem,

866
00:41:42,250 --> 00:41:45,250
even finding a feasible
solution is NP-hard.

867
00:41:45,250 --> 00:41:47,960
So, good luck approximating.

868
00:41:47,960 --> 00:41:49,360
Cool?

869
00:41:49,360 --> 00:41:54,275
This is the Khanna, Sudan,
Trevisan, Williamson

870
00:41:54,275 --> 00:41:55,150
multichotomy theorem.

871
00:41:59,100 --> 00:41:59,600
All right.

872
00:42:03,860 --> 00:42:11,280
So let's do some
more reductions.

873
00:42:38,260 --> 00:42:42,740
My goal on this page is
to get to our good friend

874
00:42:42,740 --> 00:42:46,180
from one of the first lectures,
edge-matching puzzles.

875
00:42:46,180 --> 00:42:50,480
You have little square
tiles, colors on the edges.

876
00:42:50,480 --> 00:42:52,910
Normally we want to satisfy
all of the edge constraints.

877
00:42:52,910 --> 00:42:57,480
Only equal colors match,
are adjacent to each other.

878
00:42:57,480 --> 00:43:00,040
Now the problem is going
to be maximize the number

879
00:43:00,040 --> 00:43:03,335
of satisfied edge constraints.

880
00:43:03,335 --> 00:43:05,160
But before I show
you that reduction,

881
00:43:05,160 --> 00:43:08,020
I need another problem,
which is APX-complete.

882
00:43:08,020 --> 00:43:10,330
So that problem is APX-complete.

883
00:43:10,330 --> 00:43:14,540
So I need two more problems.

884
00:43:14,540 --> 00:43:28,996
One is Max independent set
in 3-regular 3-edge colorable

885
00:43:28,996 --> 00:43:29,495
graphs.

886
00:43:32,790 --> 00:43:33,290
OK.

887
00:43:33,290 --> 00:43:35,415
I'm not going to prove this
one, because we already

888
00:43:35,415 --> 00:43:37,030
did a version of
independent set,

889
00:43:37,030 --> 00:43:39,340
and it's just tedious
to make it-- first,

890
00:43:39,340 --> 00:43:42,210
to make it exactly
degree three everywhere,

891
00:43:42,210 --> 00:43:45,260
and secondly make
it 3-edge colorable.

892
00:43:45,260 --> 00:43:48,630
3-regular 3-edge-colorable
is a nice kind of graph,

893
00:43:48,630 --> 00:43:55,370
because at every vertex, you've
got one edge of each color class.

894
00:43:55,370 --> 00:43:56,930
So that's kind of cool.

895
00:43:56,930 --> 00:43:57,990
And we can use this.

896
00:43:57,990 --> 00:44:00,310
This problem is
basically equivalent

897
00:44:00,310 --> 00:44:03,720
to the actual
problem I want, which

898
00:44:03,720 --> 00:44:07,610
is a variation of
three-dimensional matching.

899
00:44:07,610 --> 00:44:09,980
So remember
three-dimensional matching,

900
00:44:09,980 --> 00:44:16,310
you have three sets--
A, B, and C. You

901
00:44:16,310 --> 00:44:19,080
look at the triples
on A, B, and C.

902
00:44:19,080 --> 00:44:23,140
And you're given some set
of interesting triples

903
00:44:23,140 --> 00:44:24,960
among those.

904
00:44:24,960 --> 00:44:32,350
And with 3DM, what we wanted was
to choose a set of such triples

905
00:44:32,350 --> 00:44:36,080
that covers all the vertices,
and no two of them intersect.

906
00:44:36,080 --> 00:44:38,500
That's the matching aspect.

907
00:44:38,500 --> 00:44:40,740
In this problem, we want
to choose as many triples

908
00:44:40,740 --> 00:44:43,700
as we can that don't
intersect each other.

909
00:44:43,700 --> 00:44:55,530
So the problem is choose
max subset S prime of S

910
00:44:55,530 --> 00:44:59,750
with no duplicate
coordinates, I'll say.

911
00:45:03,720 --> 00:45:05,900
So let's assume A, B,
and C are disjoint.

912
00:45:05,900 --> 00:45:09,020
Then I don't want any
element in A union B union C

913
00:45:09,020 --> 00:45:13,800
to appear twice in this
chosen set S prime.

914
00:45:13,800 --> 00:45:15,710
So that's the problem.

915
00:45:15,710 --> 00:45:19,521
Now I'm going to prove
that that's hard.

916
00:45:19,521 --> 00:45:24,990
It is basically the same
as Max independent set,

917
00:45:24,990 --> 00:45:29,830
in 3-regular
3-edge-colorable graphs,

918
00:45:29,830 --> 00:45:33,760
because what I do is
I take such a graph,

919
00:45:33,760 --> 00:45:43,490
and for each edge color class--
there are three of them--

920
00:45:43,490 --> 00:45:46,040
those are going
to be A, B, and C.

921
00:45:46,040 --> 00:45:47,800
So if I have red,
green, and blue,

922
00:45:47,800 --> 00:45:49,910
all the red edges are
going to be elements of A,

923
00:45:49,910 --> 00:45:52,220
all the green edges are
going to be the elements

924
00:45:52,220 --> 00:45:54,720
of B-- B for green.

925
00:45:54,720 --> 00:45:58,090
And then all the blue
edges are elements of C.

926
00:45:58,090 --> 00:45:58,710
OK.

927
00:45:58,710 --> 00:46:06,380
Then a vertex, as I said, has
exactly one edge of each class.

928
00:46:06,380 --> 00:46:07,790
So that's going to be my triple.

929
00:46:11,410 --> 00:46:13,540
And that's it.

930
00:46:13,540 --> 00:46:16,150
So now, if I want to solve
three-dimensional matching

931
00:46:16,150 --> 00:46:17,930
among those triples,
that's going

932
00:46:17,930 --> 00:46:22,735
to correspond to choosing a
set of vertices in here, no two

933
00:46:22,735 --> 00:46:25,760
of which share a color.

934
00:46:25,760 --> 00:46:30,105
No two of which share the
same item of A. Let's say A

935
00:46:30,105 --> 00:46:32,360
is this color of edge.

936
00:46:32,360 --> 00:46:35,720
So that means that
the vertices over here

937
00:46:35,720 --> 00:46:37,890
are not connected by an edge.

938
00:46:37,890 --> 00:46:40,920
So the cool thing here is that
each element of A, B, and C

939
00:46:40,920 --> 00:46:49,000
only appears in two
different triples.

940
00:46:49,000 --> 00:46:51,800
Corresponding to the
two ends of the edge.

941
00:46:51,800 --> 00:46:54,540
So now we have max
three-dimensional matching

942
00:46:54,540 --> 00:46:58,670
where every element of A, B, and C
appears in exactly two triples.

943
00:46:58,670 --> 00:47:03,188
So I guess I can even
write E2 if I want to.
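
The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the color labels, and the choice of K_{3,3} as the example graph are all hypothetical and do not come from the lecture. Each edge becomes an element of A, B, or C according to its color, and each vertex becomes the triple of its three incident edges.

```python
from itertools import combinations

COLORS = ("red", "green", "blue")  # red -> A, green -> B, blue -> C (labels are an assumption)

def triples_from_colored_graph(edge_colors):
    """Map each vertex of a 3-regular, properly 3-edge-colored graph to a triple.

    edge_colors maps an edge (u, v) to one of COLORS. The triple for a vertex
    lists its incident red, green, and blue edge, in that order.
    """
    index = {color: i for i, color in enumerate(COLORS)}
    triples = {}
    for (u, v), color in edge_colors.items():
        for w in (u, v):
            triples.setdefault(w, [None, None, None])[index[color]] = (u, v)
    return {w: tuple(t) for w, t in triples.items()}

def disjoint(t1, t2):
    """Two triples may be chosen together iff they agree in no coordinate."""
    return all(x != y for x, y in zip(t1, t2))

# Hypothetical example: K_{3,3} with left side 0..2 and right side 3..5.
# Coloring edge (i, 3 + j) by (i + j) mod 3 is a proper 3-edge-coloring.
k33 = {(i, 3 + j): COLORS[(i + j) % 3] for i in range(3) for j in range(3)}
triples = triples_from_colored_graph(k33)

# Adjacent vertices share an edge, so their triples collide;
# non-adjacent vertices give disjoint triples.
adjacent = {frozenset(e) for e in k33}
for u, v in combinations(triples, 2):
    assert disjoint(triples[u], triples[v]) == (frozenset((u, v)) not in adjacent)
```

So a set of pairwise disjoint triples is exactly an independent set of vertices, and every edge (element) shows up in exactly two triples, one per endpoint.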

944
00:47:03,188 --> 00:47:05,060
OK.

945
00:47:05,060 --> 00:47:08,130
That was our sort of homework.

946
00:47:08,130 --> 00:47:13,370
Now we have max edge
matching puzzles.

947
00:47:13,370 --> 00:47:17,287
Again, we're given square tiles.

948
00:47:17,287 --> 00:47:18,870
There's different
colors on the tiles.

949
00:47:18,870 --> 00:47:20,780
Any number of colors.

950
00:47:20,780 --> 00:47:23,950
And we would like
to lay things out.

951
00:47:23,950 --> 00:47:26,880
And I'll tell you the instance
here is going to be 2 by N.

952
00:47:26,880 --> 00:47:29,760
So it's fairly narrow,
unlike the construction

953
00:47:29,760 --> 00:47:32,240
we saw in class.

954
00:47:32,240 --> 00:47:36,330
And we're reducing
from Max 3DM-2.

955
00:47:36,330 --> 00:47:38,156
That's why I introduced it.

956
00:47:38,156 --> 00:47:43,090
And this is a result
from four years ago.

957
00:47:43,090 --> 00:47:47,640
So the idea is the triple is
represented by these three

958
00:47:47,640 --> 00:47:49,210
tiles, and some more.

959
00:47:49,210 --> 00:47:52,090
But for starters,
these three tiles.

960
00:47:52,090 --> 00:47:54,870
The u glue is unique--
globally unique.

961
00:47:54,870 --> 00:47:57,090
So it wants to be
on the boundary.

962
00:47:57,090 --> 00:47:58,890
And here tiles are
not allowed to rotate,

963
00:47:58,890 --> 00:48:01,490
so it wants to be on
the bottom boundary.

964
00:48:01,490 --> 00:48:08,676
So these a-b glues only
appear as single pairs.

965
00:48:08,676 --> 00:48:10,300
I guess they'll also
appear over there.

966
00:48:10,300 --> 00:48:11,383
But not very many of them.

967
00:48:11,383 --> 00:48:13,800
So basically a, b, and
c have to glue together

968
00:48:13,800 --> 00:48:14,720
in sequence like that.

969
00:48:14,720 --> 00:48:15,980
And the percent
signs are going to be

970
00:48:15,980 --> 00:48:17,140
the same on the bottom row.

971
00:48:17,140 --> 00:48:19,130
So nothing else.

972
00:48:19,130 --> 00:48:20,832
This is basically
forced to do this.

973
00:48:20,832 --> 00:48:22,540
We'll actually have
to do it a few times,

974
00:48:22,540 --> 00:48:24,920
but you have to build
this bottom structure.

975
00:48:24,920 --> 00:48:28,210
And then the question is
what do you build on top.

976
00:48:28,210 --> 00:48:32,900
And the idea is there are
exactly one each of these three

977
00:48:32,900 --> 00:48:37,110
tiles which just communicate
dollar sign left to right,

978
00:48:37,110 --> 00:48:39,550
and have a, b, c on the bottom.

979
00:48:39,550 --> 00:48:40,432
So those are cool.

980
00:48:40,432 --> 00:48:42,890
And if you want to put a triple
into your three-dimensional

981
00:48:42,890 --> 00:48:46,950
matching, then you
put those in sequence.

982
00:48:46,950 --> 00:48:48,020
No mismatches.

983
00:48:48,020 --> 00:48:48,680
This is great.

984
00:48:48,680 --> 00:48:49,820
You can take a whole
bunch of these,

985
00:48:49,820 --> 00:48:52,028
stick them next to each
other, everything will match.

986
00:48:52,028 --> 00:48:53,000
No errors.

987
00:48:53,000 --> 00:48:54,990
So you're getting
some constant number

988
00:48:54,990 --> 00:48:58,230
of points for each of these.

989
00:48:58,230 --> 00:49:03,240
But you will have to build
more-- at least two copies

990
00:49:03,240 --> 00:49:04,930
of this bottom structure.

991
00:49:04,930 --> 00:49:07,540
And there's only one
copy of this top thing.

992
00:49:07,540 --> 00:49:09,110
So that's the annoying part.

993
00:49:09,110 --> 00:49:11,820
But there are some variations
of these tiles which

994
00:49:11,820 --> 00:49:13,570
look like something
like this-- I'll

995
00:49:13,570 --> 00:49:16,930
show you all of them in a
moment-- which have exactly one

996
00:49:16,930 --> 00:49:18,450
mismatch.

997
00:49:18,450 --> 00:49:20,842
So you don't get
quite as many points.

998
00:49:20,842 --> 00:49:22,800
You get, I don't know,
15 instead of 16 points,

999
00:49:22,800 --> 00:49:24,740
or whatever.

1000
00:49:24,740 --> 00:49:26,750
Bottom structure looks the same.

1001
00:49:26,750 --> 00:49:31,571
And the point of this
is we know a appears

1002
00:49:31,571 --> 00:49:32,570
in two different places.

1003
00:49:32,570 --> 00:49:35,870
So we need two
versions of the a tile.

1004
00:49:35,870 --> 00:49:39,015
But we only want one of them
to be happy and give you

1005
00:49:39,015 --> 00:49:40,640
all the points,
because you should only

1006
00:49:40,640 --> 00:49:44,400
be able to choose
the a thing once.

1007
00:49:44,400 --> 00:49:46,520
So yet this triple
will still exist.

1008
00:49:46,520 --> 00:49:48,400
adc will still be
floating around there.

1009
00:49:48,400 --> 00:49:52,410
You want it to still be buildable,
but at a cost of negative 1.

1010
00:49:52,410 --> 00:49:54,880
So this part's still built.

1011
00:49:54,880 --> 00:49:57,030
Then you have these
sort of filler tiles.

1012
00:49:57,030 --> 00:49:59,000
Your goal is then just
get rid of all the stuff

1013
00:49:59,000 --> 00:50:00,770
and pay a penalty.

1014
00:50:00,770 --> 00:50:03,410
But you want to minimize the
number of times you do this,

1015
00:50:03,410 --> 00:50:05,850
or maximize the number
of times you do this,

1016
00:50:05,850 --> 00:50:09,200
and then it will be
simulating Max 3DM.

1017
00:50:09,200 --> 00:50:12,640
There'll be some
additive constant cost,

1018
00:50:12,640 --> 00:50:16,690
which is the cost of all
the unpicked triples.

1019
00:50:16,690 --> 00:50:20,525
And then this will
be an L-reduction.

1020
00:50:20,525 --> 00:50:21,650
So I have some more slides.

1021
00:50:21,650 --> 00:50:24,070
It's a bit complicated
to do all of the details,

1022
00:50:24,070 --> 00:50:28,020
but this is a fully worked-out
example with two triples.

1023
00:50:28,020 --> 00:50:30,680
We have a, b, c and a, d, c.

1024
00:50:30,680 --> 00:50:32,264
And because they
share a, we don't

1025
00:50:32,264 --> 00:50:33,430
want them both to be picked.

1026
00:50:33,430 --> 00:50:36,380
So the same as what I showed
you just in the previous slide.

1027
00:50:36,380 --> 00:50:38,500
But then there are
all these other tiles

1028
00:50:38,500 --> 00:50:41,420
that are floating
around in order to make

1029
00:50:41,420 --> 00:50:43,320
all the combinations possible.

1030
00:50:43,320 --> 00:50:45,730
And there's all these
tiles to basically allow

1031
00:50:45,730 --> 00:50:47,390
them to get thrown away.

1032
00:50:47,390 --> 00:50:50,710
And so that's not so clear.

1033
00:50:50,710 --> 00:50:54,104
This is the overall
construction.

1034
00:50:54,104 --> 00:50:56,520
For every triple, you're going
to have exactly these three

1035
00:50:56,520 --> 00:50:59,310
tiles that we saw.

1036
00:50:59,310 --> 00:51:01,310
It got rotated relative
to the previous picture.

1037
00:51:01,310 --> 00:51:03,560
Maybe rotations are allowed.

1038
00:51:03,560 --> 00:51:05,890
And then for every
variable, here

1039
00:51:05,890 --> 00:51:08,150
they're called x, y,
z instead of a, b, c.

1040
00:51:08,150 --> 00:51:09,290
But the same thing.

1041
00:51:09,290 --> 00:51:13,100
For every a thing we'll have
some constant set of tiles that

1042
00:51:13,100 --> 00:51:15,250
includes the really good one.

1043
00:51:15,250 --> 00:51:15,750
Sorry.

1044
00:51:15,750 --> 00:51:17,270
The good one has
two dollar signs.

1045
00:51:17,270 --> 00:51:19,465
This is the one you really like.

1046
00:51:19,465 --> 00:51:21,090
And then there's all
this stuff to make

1047
00:51:21,090 --> 00:51:23,350
sure things can get consumed.

1048
00:51:23,350 --> 00:51:24,880
And you can get
rid of the triples

1049
00:51:24,880 --> 00:51:27,950
and pay exactly one
per unpicked triple.

1050
00:51:27,950 --> 00:51:29,700
So I don't want to go
through the details,

1051
00:51:29,700 --> 00:51:34,711
but once you have that, you get
an L-reduction from Max 3DM-2.

1052
00:51:34,711 --> 00:51:35,210
Questions?

1053
00:51:38,508 --> 00:51:39,494
All right.

1054
00:51:44,960 --> 00:51:50,590
So I want to go
up the hierarchy.

1055
00:51:50,590 --> 00:51:55,130
We've been focusing on
constant-factor approximable problems

1056
00:51:55,130 --> 00:51:56,270
that have no PTASes.

1057
00:51:59,080 --> 00:52:00,870
I will mention here,
before we go on,

1058
00:52:00,870 --> 00:52:04,050
that there are some
constant factor approximable

1059
00:52:04,050 --> 00:52:08,020
problems that
have no PTAS,

1060
00:52:08,020 --> 00:52:10,600
and yet are not APX-complete.

1061
00:52:10,600 --> 00:52:17,520
So APX-complete is not
all of APX minus PTAS.

1062
00:52:17,520 --> 00:52:22,460
So there are APX
minus PTAS problems

1063
00:52:22,460 --> 00:52:23,620
that are not APX-complete.

1064
00:52:26,380 --> 00:52:29,140
So these are still useful
from a reduction standpoint.

1065
00:52:29,140 --> 00:52:33,910
You can use them to show that
your problem has no PTAS.

1066
00:52:33,910 --> 00:52:36,450
But you have to state
them differently.

1067
00:52:40,690 --> 00:52:43,190
And they're somewhat
familiar problems.

1068
00:52:43,190 --> 00:52:46,230
One of them is bin packing.

1069
00:52:46,230 --> 00:52:48,950
This is you're moving
out of your house.

1070
00:52:48,950 --> 00:52:50,950
You have a bunch of objects.

1071
00:52:50,950 --> 00:52:52,700
You live in a
one-dimensional universe.

1072
00:52:52,700 --> 00:52:55,620
So each box is
exactly the same size.

1073
00:52:55,620 --> 00:52:57,240
It's one-dimensional in size.

1074
00:52:57,240 --> 00:52:58,920
And you have a bunch of items
which are one-dimensional.

1075
00:52:58,920 --> 00:53:01,211
And you want to pack as many
as you can into each box--

1076
00:53:01,211 --> 00:53:03,190
but overall use the
minimum number of boxes.

1077
00:53:03,190 --> 00:53:05,690
It's a minimization problem.

1078
00:53:05,690 --> 00:53:08,770
This has a constant
factor approximation.

1079
00:53:08,770 --> 00:53:14,740
But you can find what's called
an asymptotic PTAS, where

1080
00:53:14,740 --> 00:53:17,950
you can get a PTAS-style
result-- 1 plus epsilon

1081
00:53:17,950 --> 00:53:21,822
times OPT plus 1.

1082
00:53:21,822 --> 00:53:24,881
So an additive error.

1083
00:53:24,881 --> 00:53:26,380
And so in particular,
distinguishing

1084
00:53:26,380 --> 00:53:29,930
between two bins and three
bins is weakly NP-complete.

1085
00:53:29,930 --> 00:53:36,325
That's like partition,
right, between two bins

1086
00:53:36,325 --> 00:53:37,570
and three bins.

1087
00:53:37,570 --> 00:53:39,280
So you need this
sort of additive one.

1088
00:53:39,280 --> 00:53:42,060
You can't get a PTAS
without the additive one.

1089
00:53:42,060 --> 00:53:45,360
So it's not as hard as all
constant-factor approximable

1090
00:53:45,360 --> 00:53:49,300
problems, but
somewhere in between.

1091
00:53:49,300 --> 00:53:52,440
APX-intermediate is
the technical term.

1092
00:53:52,440 --> 00:53:56,501
Some other ones are minimum.

1093
00:53:56,501 --> 00:53:58,432
AUDIENCE: [INAUDIBLE].

1094
00:53:58,432 --> 00:54:00,765
PROFESSOR: Oh, this is all
assuming P does not equal NP.

1095
00:54:00,765 --> 00:54:01,120
Yes.

1096
00:54:01,120 --> 00:54:03,530
If P equals NP, then I think
all these things are equal.

1097
00:54:03,530 --> 00:54:05,300
So, thank you.

1098
00:54:08,400 --> 00:54:10,670
Another problem I've
seen in some situations

1099
00:54:10,670 --> 00:54:15,260
is you want to find the
spanning tree in a graph that

1100
00:54:15,260 --> 00:54:16,840
minimizes the maximum degree.

1101
00:54:16,840 --> 00:54:19,070
This is also APX-intermediate.

1102
00:54:19,070 --> 00:54:21,220
There's a constant
factor approximation.

1103
00:54:21,220 --> 00:54:26,130
No PTAS, but not as
hard as all of APX.

1104
00:54:26,130 --> 00:54:28,440
And another one is
min edge coloring,

1105
00:54:28,440 --> 00:54:33,120
which is quite a bit easier
than vertex coloring.

1106
00:54:33,120 --> 00:54:34,864
So these are problems
to watch out for.

1107
00:54:34,864 --> 00:54:37,280
They're the only ones I know
of that are APX-intermediate.

1108
00:54:37,280 --> 00:54:38,280
There may be more known.

1109
00:54:41,330 --> 00:54:42,190
OK.

1110
00:54:42,190 --> 00:54:44,900
So unless there are
questions, I want to go up

1111
00:54:44,900 --> 00:54:46,795
to log factor approximation.

1112
00:54:54,180 --> 00:54:56,050
Surprisingly, in
the CSP universe,

1113
00:54:56,050 --> 00:54:59,970
we didn't get any
log approximation

1114
00:54:59,970 --> 00:55:00,970
as the right answer.

1115
00:55:00,970 --> 00:55:03,350
But there are problems where
log is the right answer.

1116
00:55:07,774 --> 00:55:09,690
Again, there's probably
intermediate problems.

1117
00:55:09,690 --> 00:55:11,720
But here are some
problems that are actually

1118
00:55:11,720 --> 00:55:14,880
complete over all log
approximable problems.

1119
00:55:14,880 --> 00:55:16,930
So there's a log
lower-bound and upper-bound

1120
00:55:16,930 --> 00:55:19,390
on their approximability.

1121
00:55:19,390 --> 00:55:25,090
I've mentioned two of them--
set cover and dominating set.

1122
00:55:29,859 --> 00:55:32,150
First thing I'd like to show
is that these two problems

1123
00:55:32,150 --> 00:55:33,390
are the same.

1124
00:55:33,390 --> 00:55:35,810
I'm not going to try to
prove lower bounds on them--

1125
00:55:35,810 --> 00:55:37,240
at least for now.

1126
00:55:37,240 --> 00:55:40,720
But let me show that you could
L-reduce one to the other.

1127
00:55:40,720 --> 00:55:44,080
So the easy direction
is L-reducing dominating

1128
00:55:44,080 --> 00:55:47,260
set to set cover,
because dominating set

1129
00:55:47,260 --> 00:55:49,100
says, well, if I
choose this vertex,

1130
00:55:49,100 --> 00:55:52,550
then I cover these vertices.

1131
00:55:52,550 --> 00:55:53,050
OK.

1132
00:55:53,050 --> 00:55:57,920
So let's call this vertex V,
and then maybe a, b, c, d.

1133
00:55:57,920 --> 00:56:04,450
I can represent that by a
set-- namely v, a, b, c, d.

1134
00:56:04,450 --> 00:56:06,502
If I choose that set, it
covers those elements,

1135
00:56:06,502 --> 00:56:07,960
just like when I
choose this vertex

1136
00:56:07,960 --> 00:56:09,450
it covers those vertices.

1137
00:56:09,450 --> 00:56:09,950
OK.

1138
00:56:09,950 --> 00:56:12,490
So that's a strict
reduction from dominating

1139
00:56:12,490 --> 00:56:14,865
set to set cover.
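
As a rough sketch of this direction, assuming the graph is given as an adjacency dictionary (the function name and the star-shaped example are illustrative, not from the lecture): every vertex contributes one set, its closed neighborhood, and the universe is the vertex set.

```python
def dominating_set_to_set_cover(adj):
    """Turn a dominating set instance into a set cover instance.

    adj maps each vertex to an iterable of its neighbors. Choosing the set
    built from v covers exactly what choosing v dominates: v and its neighbors.
    """
    universe = set(adj)
    sets = {v: {v} | set(neighbors) for v, neighbors in adj.items()}
    return universe, sets

# Hypothetical example matching the picture: v adjacent to a, b, c, d.
star = {
    "v": ["a", "b", "c", "d"],
    "a": ["v"], "b": ["v"], "c": ["v"], "d": ["v"],
}
universe, sets = dominating_set_to_set_cover(star)
```

Solutions map back and forth one-to-one with the same size, which is why no approximation factor is lost.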

1140
00:56:14,865 --> 00:56:18,320
In some sense, the bipartite
version gives you more control.

1141
00:56:18,320 --> 00:56:18,820
OK.

1142
00:56:18,820 --> 00:56:22,500
This is the non-bipartite
version of set cover.

1143
00:56:22,500 --> 00:56:24,110
So what about the
other reduction--

1144
00:56:24,110 --> 00:56:27,500
reducing set cover
to dominating set?

1145
00:56:30,090 --> 00:56:33,170
So this is a little more fun.

1146
00:56:33,170 --> 00:56:35,710
We need to build
a graph dominating

1147
00:56:35,710 --> 00:56:39,000
set that somehow has two very
different types of vertices.

1148
00:56:39,000 --> 00:56:42,810
We want to represent sets, and
we want to represent elements.

1149
00:56:42,810 --> 00:56:44,400
So here's what
we're going to do.

1150
00:56:44,400 --> 00:56:49,040
We build a clique
representing the sets.

1151
00:56:49,040 --> 00:56:53,560
So there are nodes in this
clique-- one for every set.

1152
00:56:53,560 --> 00:56:57,240
And then we're going to have an
independent set over here that

1153
00:56:57,240 --> 00:56:59,710
will represent the elements.

1154
00:56:59,710 --> 00:57:01,910
And then whenever
a set over here

1155
00:57:01,910 --> 00:57:05,940
contains an element over
there, we will add an edge.

1156
00:57:05,940 --> 00:57:08,970
So in general, an element
may appear in several sets,

1157
00:57:08,970 --> 00:57:12,200
and the set is going to
consist of many elements.

1158
00:57:12,200 --> 00:57:14,440
But over here, there's
not going to be any edges

1159
00:57:14,440 --> 00:57:15,410
between these elements.

1160
00:57:15,410 --> 00:57:18,390
These are independent.

1161
00:57:18,390 --> 00:57:22,370
And over here, all
of the edges exist.

1162
00:57:22,370 --> 00:57:25,540
So the intent is you choose
a set of these vertices

1163
00:57:25,540 --> 00:57:29,620
corresponding to sets in
order to cover those vertices.

1164
00:57:29,620 --> 00:57:31,870
And that's going to work,
because these vertices

1165
00:57:31,870 --> 00:57:33,820
are super easy to cover
in the dominating set.

1166
00:57:33,820 --> 00:57:36,880
You choose any of them,
you cover all of them.

1167
00:57:36,880 --> 00:57:40,800
These guys, you never want to
put them in a dominating set.

1168
00:57:40,800 --> 00:57:42,800
Why would you put this
in a dominating set, when

1169
00:57:42,800 --> 00:57:44,466
you could just follow
one of these edges

1170
00:57:44,466 --> 00:57:45,780
and put this in instead?

1171
00:57:45,780 --> 00:57:49,960
That vertex will cover this one,
and it will cover all of these.

1172
00:57:49,960 --> 00:57:52,680
And the only edges from
here are to over here.

1173
00:57:52,680 --> 00:57:56,451
So if you choose a set, you'll
cover all the sets and that one

1174
00:57:56,451 --> 00:57:56,950
element.

1175
00:57:56,950 --> 00:57:58,324
If you choose the
element, you'll

1176
00:57:58,324 --> 00:58:01,390
cover the element
and some of the sets.

1177
00:58:01,390 --> 00:58:04,100
So in any optimal solution,
if this ever appears,

1178
00:58:04,100 --> 00:58:06,530
you can keep it optimal
and move over here.

1179
00:58:06,530 --> 00:58:09,170
That is the sort of argument
we've been doing over and over.

1180
00:58:09,170 --> 00:58:11,240
So there is an optimal
solution where you only

1181
00:58:11,240 --> 00:58:16,810
choose vertices on the left,
and then that is a set cover.

1182
00:58:16,810 --> 00:58:19,570
Again, it's a strict reduction.

1183
00:58:19,570 --> 00:58:21,170
No loss.

1184
00:58:21,170 --> 00:58:21,670
Cool?

1185
00:58:21,670 --> 00:58:24,425
So that is why these two
problems are equivalent.
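
The set cover to dominating set direction can be sketched as follows. All names here are hypothetical, and the brute-force search exists only to check the tiny example; it is not part of the reduction. Set vertices form a clique, element vertices form an independent set, and a set is joined to each element it contains. The recovery step swaps any chosen element vertex for some set containing it, as in the exchange argument above.

```python
from itertools import combinations

def set_cover_to_dominating_set(sets):
    """Build the graph: clique on set vertices, edges set--element for containment."""
    set_nodes = [("set", s) for s in sets]
    adj = {node: set() for node in set_nodes}
    for a, b in combinations(set_nodes, 2):   # clique on the sets
        adj[a].add(b)
        adj[b].add(a)
    for s, elems in sets.items():             # element vertices stay independent
        for e in elems:
            adj.setdefault(("elem", e), set()).add(("set", s))
            adj[("set", s)].add(("elem", e))
    return adj

def recover_cover(dominating, sets):
    """Map a dominating set back to a set cover of no larger size."""
    cover = set()
    for kind, name in dominating:
        if kind == "set":
            cover.add(name)
        else:  # replace an element vertex by any set that contains it
            cover.add(next(s for s, elems in sets.items() if name in elems))
    return cover

def is_dominating(adj, chosen):
    chosen = set(chosen)
    return all(v in chosen or adj[v] & chosen for v in adj)

def min_dominating_set(adj):
    """Exponential brute force, only for checking small examples."""
    nodes = list(adj)
    for k in range(1, len(nodes) + 1):
        for cand in combinations(nodes, k):
            if is_dominating(adj, cand):
                return set(cand)

# Hypothetical instance over elements 1..5.
sets = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5}, "S4": {1, 5}}
graph = set_cover_to_dominating_set(sets)
cover = recover_cover(min_dominating_set(graph), sets)
```

On this instance no single set covers everything, so both optima have size two.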

1186
00:58:24,425 --> 00:58:26,300
Now we're just going to
take on faith for now

1187
00:58:26,300 --> 00:58:29,290
that they are log
inapproximable.

1188
00:58:29,290 --> 00:58:32,044
And you've probably seen that
this one is log approximable.

1189
00:58:32,044 --> 00:58:33,960
So now you know that
this is log approximable.

1190
00:58:39,540 --> 00:58:45,170
I would say most
of the literature

1191
00:58:45,170 --> 00:58:50,040
I see for inapproximability
is either APX hardness,

1192
00:58:50,040 --> 00:58:52,465
or what people usually
call set cover hardness.

1193
00:58:55,140 --> 00:58:57,440
I mean, the fact that set
cover is log-APX-complete,

1194
00:58:57,440 --> 00:58:58,814
that is complete
for that class--

1195
00:58:58,814 --> 00:59:01,230
not just a log lower-bound--
is fairly recent.

1196
00:59:01,230 --> 00:59:03,760
So people usually have
called it set cover hardness.

1197
00:59:03,760 --> 00:59:07,000
Now you can call it
log APX-hardness.

1198
00:59:07,000 --> 00:59:10,120
So let me show you one example.

1199
00:59:10,120 --> 00:59:11,880
There are a lot
of both out there,

1200
00:59:11,880 --> 00:59:15,852
and I'm actually just showing
you sort of a small sampling,

1201
00:59:15,852 --> 00:59:17,900
because there's so much.

1202
00:59:17,900 --> 00:59:20,120
So here's a fun problem.

1203
00:59:20,120 --> 00:59:23,167
It's called token
reconfiguration.

1204
00:59:23,167 --> 00:59:24,750
And the idea is
you're doing some kind

1205
00:59:24,750 --> 00:59:27,410
of motion planning in a graph.

1206
00:59:27,410 --> 00:59:29,380
So something like
pushing blocks,

1207
00:59:29,380 --> 00:59:33,300
except you have a
bunch of robots,

1208
00:59:33,300 --> 00:59:37,100
which here are represented--
well, you have a graph.

1209
00:59:37,100 --> 00:59:40,760
And each vertex can either
have a robot or not.

1210
00:59:40,760 --> 00:59:43,580
In sum, you're given
an initial configuration

1211
00:59:43,580 --> 00:59:45,320
of how the robots are
placed, and you're

1212
00:59:45,320 --> 00:59:46,903
given a final
configuration of how you

1213
00:59:46,903 --> 00:59:48,160
want the robots to be placed.

1214
00:59:48,160 --> 00:59:49,826
And they have the
same number of robots,

1215
00:59:49,826 --> 00:59:53,220
because you can't eat
robots, or create them yet.

1216
00:59:53,220 --> 00:59:55,470
So when robots
can create robots,

1217
00:59:55,470 --> 00:59:57,990
that will be another problem.

1218
00:59:57,990 --> 00:59:59,490
So here you have
robot conservation.

1219
01:00:03,200 --> 01:00:05,370
So in a configuration,
there are three types

1220
01:00:05,370 --> 01:00:08,350
of vertices in that situation.

1221
01:00:08,350 --> 01:00:10,760
It could be you have a
vertex that currently

1222
01:00:10,760 --> 01:00:12,580
has a robot-- here
they're called tokens,

1223
01:00:12,580 --> 01:00:16,210
to be a little more generic.

1224
01:00:16,210 --> 01:00:19,480
It could have a robot,
but not be a place

1225
01:00:19,480 --> 01:00:20,690
that should have a robot.

1226
01:00:20,690 --> 01:00:22,690
So in the initial
configuration, it has a robot,

1227
01:00:22,690 --> 01:00:24,910
but in the final
configuration it does not.

1228
01:00:24,910 --> 01:00:28,750
It could be you have some
robots that are basically

1229
01:00:28,750 --> 01:00:29,940
where they want to be.

1230
01:00:29,940 --> 01:00:33,240
There is a robot, and also in
the target configuration,

1231
01:00:33,240 --> 01:00:34,780
there's a robot there.

1232
01:00:34,780 --> 01:00:36,870
Or I guess there's four
cases, but in this case

1233
01:00:36,870 --> 01:00:38,040
we'll only have three.

1234
01:00:38,040 --> 01:00:40,260
Or it could be that you
want to have a robot there,
want to have robot there,

1235
01:00:40,260 --> 01:00:42,240
but currently you do not.

1236
01:00:42,240 --> 01:00:46,817
So this is an instance
that simulates set cover.

1237
01:00:46,817 --> 01:00:48,650
And this is a situation
where robots are all

1238
01:00:48,650 --> 01:00:49,520
treated identically.

1239
01:00:49,520 --> 01:00:52,400
So you don't care
which robot goes where.

1240
01:00:52,400 --> 01:00:54,030
So you've got these
robots over here,

1241
01:00:54,030 --> 01:00:55,350
which don't want to be here.

1242
01:00:55,350 --> 01:00:56,860
They want to be over there.

1243
01:00:56,860 --> 01:00:58,450
I mean, if you
measure this length,

1244
01:00:58,450 --> 01:01:01,900
it's the same as this length.

1245
01:01:01,900 --> 01:01:03,540
And these robots
don't want to move,

1246
01:01:03,540 --> 01:01:05,930
but they're going to have to,
because they're in the way.

1247
01:01:05,930 --> 01:01:08,590
In this tripartite graph,
they're in the way from here

1248
01:01:08,590 --> 01:01:09,950
to there.

1249
01:01:09,950 --> 01:01:12,840
I didn't tell you: a
move in this scenario

1250
01:01:12,840 --> 01:01:18,220
is that you can take a robot
and follow any empty path, OK

1251
01:01:18,220 --> 01:01:21,410
So you can make a sequence of
moves all at a cost of one,

1252
01:01:21,410 --> 01:01:23,610
as long as it doesn't
hit any other robots.

1253
01:01:23,610 --> 01:01:25,570
So, a collision-free path.

1254
01:01:25,570 --> 01:01:27,850
You follow it, then you
can pick up another robot,

1255
01:01:27,850 --> 01:01:29,349
move it along a
collision-free path,

1256
01:01:29,349 --> 01:01:32,720
pick up another
robot, and so on.

1257
01:01:32,720 --> 01:01:34,884
So if you want to move
all these guys over here,

1258
01:01:34,884 --> 01:01:37,300
you're going to have to move
some of these out of the way.

1259
01:01:37,300 --> 01:01:38,300
How many?

1260
01:01:38,300 --> 01:01:39,570
Set cover many.

1261
01:01:39,570 --> 01:01:42,330
Here's the set cover instance
in this bipartite graph.

1262
01:01:42,330 --> 01:01:45,710
So what you can do is take this
robot, move it out of the way,

1263
01:01:45,710 --> 01:01:47,240
move it to one of
these elements,

1264
01:01:47,240 --> 01:01:49,200
and then for the remainder
of this set, which

1265
01:01:49,200 --> 01:01:51,597
are these two nodes,
you can take this guy

1266
01:01:51,597 --> 01:01:53,430
and move it there in
one step, take this guy

1267
01:01:53,430 --> 01:01:54,800
and move it there in one step.

1268
01:01:54,800 --> 01:01:56,200
The length of this doesn't
matter, because you

1269
01:01:56,200 --> 01:01:57,480
can follow a long path.

1270
01:01:57,480 --> 01:02:01,900
And you just drain out
this thing one at a time--

1271
01:02:01,900 --> 01:02:05,490
except for this guy, who
you moved out of the way.

1272
01:02:05,490 --> 01:02:08,260
You move one of these
to fill his spot.

1273
01:02:08,260 --> 01:02:10,690
And if you can cover all
the elements over here

1274
01:02:10,690 --> 01:02:13,640
with only k of
these guys moving,

1275
01:02:13,640 --> 01:02:20,215
then the number of moves
will be k plus A. So

1276
01:02:20,215 --> 01:02:21,340
that's what's written here.

1277
01:02:21,340 --> 01:02:26,940
OPT is a fixed additive
cost plus the set cover.

1278
01:02:26,940 --> 01:02:30,600
And this is going to be
an L-reduction, provided

1279
01:02:30,600 --> 01:02:36,990
this is a linear in A, which
is easy enough to arrange.

1280
01:02:36,990 --> 01:02:38,590
So that's the unlabeled case.

1281
01:02:38,590 --> 01:02:40,860
You can also solve
the labeled case.

1282
01:02:40,860 --> 01:02:44,170
Maybe you want robot one
to go to position one,

1283
01:02:44,170 --> 01:02:47,190
and you want robot two
to go to position two.

1284
01:02:47,190 --> 01:02:48,901
Same thing, but
here these robots

1285
01:02:48,901 --> 01:02:50,900
are going to have to go
back where they started.

1286
01:02:50,900 --> 01:02:53,525
So you just add a little vertex
so they can get out of the way.

1287
01:02:53,525 --> 01:02:55,590
Everything can move
where they want to.

1288
01:02:55,590 --> 01:02:58,710
Again, choose a set
cover, move those over,

1289
01:02:58,710 --> 01:02:59,970
and then move them back.

1290
01:02:59,970 --> 01:03:02,130
So you end up paying
two times the set cover.

1291
01:03:02,130 --> 01:03:03,840
But just a constant factor loss.

1292
01:03:03,840 --> 01:03:05,960
Still an L-reduction.

1293
01:03:05,960 --> 01:03:07,960
And this problem
is motivated, it's

1294
01:03:07,960 --> 01:03:10,290
sort of a generalization
of the 15 puzzle.

1295
01:03:10,290 --> 01:03:12,750
You have a little 4 by 4 grid.

1296
01:03:12,750 --> 01:03:13,990
You've got movable tiles.

1297
01:03:13,990 --> 01:03:16,300
You can only move one
at a time in that case,

1298
01:03:16,300 --> 01:03:18,420
because there's
only a single gap.

1299
01:03:18,420 --> 01:03:20,890
This is sort of a
generalized form of that,

1300
01:03:20,890 --> 01:03:22,770
where you have various tiles.

1301
01:03:22,770 --> 01:03:25,230
You want to get them
into the right spots,

1302
01:03:25,230 --> 01:03:28,300
but you can't have collisions
during that motion.

1303
01:03:28,300 --> 01:03:31,470
So that's where this
problem came from.

1304
01:03:31,470 --> 01:03:34,320
15 puzzle, by the way, in
the generalized n by n form

1305
01:03:34,320 --> 01:03:37,077
is NP-hard and in APX,
but I think it's open

1306
01:03:37,077 --> 01:03:38,160
whether it's APX-complete.

1307
01:03:40,700 --> 01:03:44,820
I would show the proof, but it's
very complicated, so, I won't.

1308
01:03:48,450 --> 01:03:50,140
Cool.

1309
01:03:50,140 --> 01:03:53,170
Well, in the last little
bit, I wanted to tell you

1310
01:03:53,170 --> 01:03:56,230
about the super high end.

1311
01:03:56,230 --> 01:03:57,835
So we went to log approximation.

1312
01:04:00,640 --> 01:04:03,720
There are other
things known, but not

1313
01:04:03,720 --> 01:04:05,110
a lot of completeness results.

1314
01:04:05,110 --> 01:04:06,610
So we're going to
get to other kinds

1315
01:04:06,610 --> 01:04:09,370
of inapproximability
next class.

1316
01:04:09,370 --> 01:04:13,430
For now, I want to stick
to something APX-complete.

1317
01:04:13,430 --> 01:04:15,790
And the most studied
class above log

1318
01:04:15,790 --> 01:04:19,740
is poly, which is like n
to the 1 minus epsilon.

1319
01:04:34,860 --> 01:04:38,360
And my main goal here is to
tell you about some problems

1320
01:04:38,360 --> 01:04:40,880
that you should, if you
think your problem is

1321
01:04:40,880 --> 01:04:44,730
like Poly-APX-hard, these
are the standard problems

1322
01:04:44,730 --> 01:04:46,390
to start from.

1323
01:04:46,390 --> 01:04:47,629
There are two of them.

1324
01:04:47,629 --> 01:04:49,920
And I've mentioned them, but
not quite in this context.

1325
01:04:57,920 --> 01:05:03,194
They are clique and
independent set.

1326
01:05:03,194 --> 01:05:04,610
These are really
the same problem.

1327
01:05:04,610 --> 01:05:08,670
One is the complement
graph of the other.
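
A minimal sketch of the complement relationship just described, assuming small undirected graphs given as a vertex count and an edge list. The helper names (`complement`, `largest_subset`) are illustrative and not from the lecture, and the search is brute force, so it only suits tiny inputs.

```python
from itertools import combinations


def complement(n, edges):
    """Return the edges of the complement graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v)
        for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in present
    ]


def is_clique(subset, edges):
    """True if every pair of chosen vertices is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(subset, 2))


def is_independent(subset, edges):
    """True if no pair of chosen vertices is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) not in present for p in combinations(subset, 2))


def largest_subset(n, edges, predicate):
    """Brute force: size of the largest vertex subset satisfying predicate."""
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if predicate(subset, edges):
                return size
    return 0


# Example: a triangle {0, 1, 2} plus a separate edge (3, 4).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
co_edges = complement(5, edges)
clique_size = largest_subset(5, edges, is_clique)
independent_size = largest_subset(5, co_edges, is_independent)
```

A clique in the graph is exactly an independent set in the complement, so the two sizes agree on any input.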

1328
01:05:08,670 --> 01:05:09,975
Both maximization problems.

1329
01:05:12,930 --> 01:05:14,480
And those are the standard ones.

1330
01:05:14,480 --> 01:05:16,690
I'll leave it at that.

1331
01:05:16,690 --> 01:05:18,490
I'm going to keep going up.

1332
01:05:18,490 --> 01:05:22,173
The next level most studied
is Exp-APX-complete.

1333
01:05:25,136 --> 01:05:27,010
So for these problems,
the best approximation

1334
01:05:27,010 --> 01:05:29,960
is n divided by log squared n.

1335
01:05:29,960 --> 01:05:32,234
And there's a lower bound
of n to the 1 minus epsilon.

1336
01:05:32,234 --> 01:05:34,400
So there is a gap in terms
of their approximability.

1337
01:05:34,400 --> 01:05:35,775
But what we know
is that they are

1338
01:05:35,775 --> 01:05:39,930
the hardest problems that have
any n to the c approximation.

1339
01:05:39,930 --> 01:05:44,380
They're all reducible to each
other via PTAS reductions.

1340
01:05:44,380 --> 01:05:45,705
So, fairly preserving.

1341
01:05:48,680 --> 01:05:52,010
So our next class
up is Exp-APX-complete,

1342
01:05:52,010 --> 01:05:59,980
things: problems approximable with
exponential-in-n approximation

1343
01:05:59,980 --> 01:06:00,480
factors.

1344
01:06:00,480 --> 01:06:02,850
How would that happen?

1345
01:06:02,850 --> 01:06:04,420
This is kind of funny.

1346
01:06:04,420 --> 01:06:09,350
And the canonical problem here--
the basic reason is numbers.

1347
01:06:12,190 --> 01:06:14,410
We take the traveling
salesman problem.

1348
01:06:14,410 --> 01:06:16,950
And every edge
can have a weight.

1349
01:06:16,950 --> 01:06:18,730
Let's say it's integer weights.

1350
01:06:18,730 --> 01:06:21,960
But any integer weight that
can be expressible in n bits

1351
01:06:21,960 --> 01:06:25,740
is fair game, which means
the actual value of that edge

1352
01:06:25,740 --> 01:06:28,500
is going to be exponential in n.

1353
01:06:28,500 --> 01:06:31,210
And from that, you can get
a very easy lower bound.

1354
01:06:31,210 --> 01:06:33,480
And in fact, all
problems that are

1355
01:06:33,480 --> 01:06:38,430
approximable in exponential APX
can be reduced to general TSP,

1356
01:06:38,430 --> 01:06:40,267
where you're just given
a bunch of distances

1357
01:06:40,267 --> 01:06:41,350
between pairs of vertices.

1358
01:06:41,350 --> 01:06:43,040
It doesn't satisfy
triangle inequality.

1359
01:06:43,040 --> 01:06:44,930
That's the non-metric aspect.

1360
01:06:44,930 --> 01:06:48,100
The triangle inequality TSP,
which is what normally happens,

1361
01:06:48,100 --> 01:06:49,260
there is a constant factor.

1362
01:06:49,260 --> 01:06:51,430
It's APX-complete.

1363
01:06:51,430 --> 01:06:57,030
But for general weights
between pairs of vertices,

1364
01:06:57,030 --> 01:06:59,370
non-metric, it's
Exp-APX-complete,

1365
01:06:59,370 --> 01:07:03,220
because you can
basically make a graph

1366
01:07:03,220 --> 01:07:05,240
and solve
Hamiltonicity by saying

1367
01:07:05,240 --> 01:07:09,050
all the edges in the graph
have weight one or zero,

1368
01:07:09,050 --> 01:07:12,790
and all of the edges-- I guess
one would be a little bit more

1369
01:07:12,790 --> 01:07:14,240
legitimate.

1370
01:07:14,240 --> 01:07:16,350
And all the non-edges
in the graph

1371
01:07:16,350 --> 01:07:17,890
are going to give
weight infinity.

1372
01:07:17,890 --> 01:07:19,920
Infinity is the largest
expressible number which

1373
01:07:19,920 --> 01:07:22,360
is 1, 1, 1, 1, n bits long.

1374
01:07:22,360 --> 01:07:24,610
And so either you use one
of those edges or you don't.

1375
01:07:24,610 --> 01:07:27,540
And there's an exponential
gap between them.

1376
01:07:27,540 --> 01:07:29,590
So even if we disallow
zeros being an output,

1377
01:07:29,590 --> 01:07:33,446
then we get
exponential separation.

1378
01:07:33,446 --> 01:07:35,070
That doesn't prove
completeness, but it

1379
01:07:35,070 --> 01:07:38,070
proves that you can't hope
for better than exponential

1380
01:07:38,070 --> 01:07:40,910
approximation there.
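
A minimal sketch of the Hamiltonicity construction just described, assuming graph edges get weight 1 and non-edges get the largest n-bit number, 2^n - 1, standing in for "infinity". The function names are illustrative, and the tour search is brute force, so it is only meant for tiny graphs.

```python
from itertools import permutations


def tsp_weights(n, edges):
    """Weight 1 on graph edges, 2**n - 1 (all ones, n bits) on non-edges."""
    big = 2 ** n - 1
    present = {frozenset(e) for e in edges}
    return [
        [0 if u == v else (1 if frozenset((u, v)) in present else big)
         for v in range(n)]
        for u in range(n)
    ]


def optimal_tour_cost(weights):
    """Brute-force cost of the cheapest tour that starts and ends at vertex 0."""
    n = len(weights)
    best = None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        cost = sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if best is None or cost < best:
            best = cost
    return best


# A 4-cycle is Hamiltonian; a star on 4 vertices is not.
cycle_cost = optimal_tour_cost(tsp_weights(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
star_cost = optimal_tour_cost(tsp_weights(4, [(0, 1), (0, 2), (0, 3)]))
```

If the graph has a Hamiltonian cycle, the optimal tour costs exactly n. Otherwise every tour uses at least one non-edge and costs at least 2^n - 1, which is the exponential gap.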

1381
01:07:40,910 --> 01:07:42,240
OK.

1382
01:07:42,240 --> 01:07:46,620
Two more even crazier classes.

1383
01:07:46,620 --> 01:07:48,420
Now we did see these
classes come up

1384
01:07:48,420 --> 01:07:52,580
with the
characterization theorem.

1385
01:07:52,580 --> 01:07:54,990
But these are probably how
these results were proved.

1386
01:08:17,750 --> 01:08:20,778
So you might think, well,
doubly exponential.

1387
01:08:20,778 --> 01:08:21,319
I don't know.

1388
01:08:21,319 --> 01:08:22,376
What's next?

1389
01:08:22,376 --> 01:08:24,189
Next, you could define that.

1390
01:08:24,189 --> 01:08:27,550
But what seems to
appear most often

1391
01:08:27,550 --> 01:08:33,040
is this is the ultimate class
among all NP optimization

1392
01:08:33,040 --> 01:08:34,810
problems, you could
imagine being complete

1393
01:08:34,810 --> 01:08:36,060
against all of them.

1394
01:08:36,060 --> 01:08:40,270
And this is with respect
to AP-reductions,

1395
01:08:40,270 --> 01:08:41,279
one of the ones we saw.

1396
01:08:44,090 --> 01:08:47,490
And I'm going to define a very
closely related class, which

1397
01:08:47,490 --> 01:08:51,560
is NPO PB, NPO
polynomially bounded.

1398
01:08:57,700 --> 01:08:58,992
OK.

1399
01:08:58,992 --> 01:09:02,220
So these are the hardest
problems to approximate.

1400
01:09:02,220 --> 01:09:04,740
This is basically the problems
that have numbers in them,

1401
01:09:04,740 --> 01:09:06,810
and this is the problem
that have no numbers,

1402
01:09:06,810 --> 01:09:10,180
or if they have numbers they
are polynomially bounded,

1403
01:09:10,180 --> 01:09:12,660
like the polynomial situation.

1404
01:09:12,660 --> 01:09:16,160
So non-metric TSP, well, it's
not as hard as NPO-complete,

1405
01:09:16,160 --> 01:09:18,021
but it's more in this category.

1406
01:09:18,021 --> 01:09:20,645
AUDIENCE: Is there a notion
of strongness, weakness

1407
01:09:20,645 --> 01:09:22,450
in these kind of things?

1408
01:09:22,450 --> 01:09:23,620
PROFESSOR: That's funny.

1409
01:09:23,620 --> 01:09:25,090
This is a stronger result.

1410
01:09:25,090 --> 01:09:26,560
So there's not quite an analog.

1411
01:09:26,560 --> 01:09:29,569
But you can do
exponential tricks

1412
01:09:29,569 --> 01:09:33,140
and give yourself a
hard time over here.

1413
01:09:33,140 --> 01:09:36,080
And here you're just
not allowed to use.

1414
01:09:36,080 --> 01:09:37,760
Everything's polynomial.

1415
01:09:37,760 --> 01:09:41,870
So 3-Partition is sort
of more in this universe.

1416
01:09:41,870 --> 01:09:45,490
But in this situation, if you
sort of have 3-Partition,

1417
01:09:45,490 --> 01:09:50,410
but with exponential numbers,
then you get this harder class.

1418
01:09:50,410 --> 01:09:53,040
So this is not the
analog of weak.

1419
01:09:53,040 --> 01:09:57,724
You could maybe imagine--
well, in some sense,

1420
01:09:57,724 --> 01:09:59,390
weak is a modifier
in the problem, where

1421
01:09:59,390 --> 01:10:01,139
you say I want to
restrict all the numbers

1422
01:10:01,139 --> 01:10:02,700
to a polynomial size.

1423
01:10:02,700 --> 01:10:05,900
So when you do something
like 3-Partition,

1424
01:10:05,900 --> 01:10:10,160
it's sort of a weak
problem, or it's

1425
01:10:10,160 --> 01:10:12,270
a polynomially bounded problem.

1426
01:10:12,270 --> 01:10:15,850
Strong NP hardness means
that that is NP-complete.

1427
01:10:15,850 --> 01:10:19,012
Anyway, vague analog,
but not quite.

1428
01:10:19,012 --> 01:10:21,470
It's possible some of these,
you could add a weak modifier,

1429
01:10:21,470 --> 01:10:24,590
and it would mean
something, but I don't know.

1430
01:10:24,590 --> 01:10:25,090
All right.

1431
01:10:25,090 --> 01:10:27,230
So I just want to give
you some sample problems

1432
01:10:27,230 --> 01:10:29,290
on both of these sides.

1433
01:10:29,290 --> 01:10:31,930
Maybe let's start
with this side, which

1434
01:10:31,930 --> 01:10:35,117
is a little more
interesting, because you

1435
01:10:35,117 --> 01:10:36,575
get some kind of
familiar problems,

1436
01:10:36,575 --> 01:10:37,533
and they're super hard.

1437
01:10:40,520 --> 01:10:46,150
Minimum independent
dominating set.

1438
01:10:46,150 --> 01:10:47,300
We've seen independent set.

1439
01:10:47,300 --> 01:10:48,383
We've seen dominating set.

1440
01:10:48,383 --> 01:10:51,390
Independent set is already
hard to approximate.

1441
01:10:51,390 --> 01:10:56,360
But this problem is
worse, because even

1442
01:10:56,360 --> 01:10:58,080
finding an independent
dominating set

1443
01:10:58,080 --> 01:11:02,020
is NP-complete, whereas
finding an independent set,

1444
01:11:02,020 --> 01:11:04,210
I can choose nothing.

1445
01:11:04,210 --> 01:11:06,990
But if I want to simultaneously
be dominating and independent,

1446
01:11:06,990 --> 01:11:07,870
that's NP-hard

1447
01:11:07,870 --> 01:11:09,570
to find any solution.

1448
01:11:09,570 --> 01:11:17,680
In general in NPO PB problems,
NPO PB-complete problems,

1449
01:11:17,680 --> 01:11:20,920
it's always NP-complete to
find a feasible solution.

1450
01:11:20,920 --> 01:11:22,543
But it's worse than that.

1451
01:11:22,543 --> 01:11:25,210
So the first level would be
to find a feasible solution.

1452
01:11:25,210 --> 01:11:26,910
And this is saying
on top of that you

1453
01:11:26,910 --> 01:11:28,490
want to minimize the size.

1454
01:11:28,490 --> 01:11:30,276
I think Max would also be hard.

1455
01:11:30,276 --> 01:11:32,080
But I think there's
a general theorem,

1456
01:11:32,080 --> 01:11:33,640
that if you're hard
in the min case,

1457
01:11:33,640 --> 01:11:35,570
you're also hard
in the max case.

1458
01:11:35,570 --> 01:11:38,960
But it depends on
the exact set-up.

1459
01:11:38,960 --> 01:11:41,540
So this is sort of an
optimization version

1460
01:11:41,540 --> 01:11:44,110
that makes it even
harder than NP-complete.

1461
01:11:44,110 --> 01:11:49,330
So I think this is NP-complete,
and this is kind of even worse.

1462
01:11:49,330 --> 01:11:52,910
It's sort of stating the
stronger thing about when

1463
01:11:52,910 --> 01:11:55,380
you're trying to optimize
over a space of solutions,

1464
01:11:55,380 --> 01:11:57,130
that it's NP-complete to decide.

1465
01:11:57,130 --> 01:11:59,320
Notice that's still
an NPO problem.

1466
01:11:59,320 --> 01:12:01,490
We define that
solutions need to be

1467
01:12:01,490 --> 01:12:03,244
recognizable in polynomial time.

1468
01:12:03,244 --> 01:12:05,035
But we didn't say that
you can generate one

1469
01:12:05,035 --> 01:12:06,200
in polynomial time.

1470
01:12:06,200 --> 01:12:09,030
So it could be NP-complete
to find a single solution,

1471
01:12:09,030 --> 01:12:09,744
like here.

1472
01:12:09,744 --> 01:12:11,660
All of these problems
will have that property.
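
A minimal sketch of the "recognizable in polynomial time" requirement, using independent dominating set as the example. It only checks a given candidate set and says nothing about how to find one. The function name and the edge-list input format are assumptions made for illustration.

```python
def is_independent_dominating_set(n, edges, chosen):
    """Check a candidate solution in time polynomial in the graph size."""
    chosen = set(chosen)
    adjacent = {v: set() for v in range(n)}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    # Independent: no chosen vertex has a chosen neighbor.
    independent = all(adjacent[v].isdisjoint(chosen) for v in chosen)
    # Dominating: every vertex is chosen or has a chosen neighbor.
    dominating = all(
        v in chosen or not adjacent[v].isdisjoint(chosen) for v in range(n)
    )
    return independent and dominating


# Path 0 - 1 - 2.
path_edges = [(0, 1), (1, 2)]
middle_ok = is_independent_dominating_set(3, path_edges, {1})
ends_ok = is_independent_dominating_set(3, path_edges, {0, 2})
```

Both `{1}` and `{0, 2}` pass the check on this path, and the minimization version prefers the smaller one.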

1473
01:12:15,990 --> 01:12:21,250
Another fun problem is
shortest computation.

1474
01:12:21,250 --> 01:12:23,189
This is sort of the
most intuitive one

1475
01:12:23,189 --> 01:12:23,980
at a certain level.

1476
01:12:23,980 --> 01:12:25,480
If you know Turing
machines, and you

1477
01:12:25,480 --> 01:12:27,396
have a non-deterministic
Turing machine, which

1478
01:12:27,396 --> 01:12:29,020
could take
non-deterministic branches,

1479
01:12:29,020 --> 01:12:31,630
you want to find the computation
in such a machine that

1480
01:12:31,630 --> 01:12:34,720
terminates the earliest
using the fewest steps.

1481
01:12:34,720 --> 01:12:39,080
So you might think of that
as canonical NPO PB problem.

1482
01:12:39,080 --> 01:12:41,690
There's no numbers in it,
but as you can imagine,

1483
01:12:41,690 --> 01:12:44,440
that's super hard to do.

1484
01:12:44,440 --> 01:12:46,840
Here's some more
graph theoretic ones.

1485
01:12:46,840 --> 01:12:50,920
Quite natural problems,
but super hard.

1486
01:12:50,920 --> 01:12:52,510
Longest induced path.

1487
01:12:52,510 --> 01:12:55,030
Induced means, there
are no other edges

1488
01:12:55,030 --> 01:12:57,320
between the chosen vertices.
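
A minimal sketch of what "induced" means here, assuming the path is given as an ordered list of vertices and the graph as an edge list. The function name is illustrative. Consecutive vertices must be adjacent, and no other pair on the path may be.

```python
def is_induced_path(edges, path):
    """True if path is a simple path with no extra edges among its vertices."""
    present = {frozenset(e) for e in edges}
    if len(set(path)) != len(path):
        return False  # a vertex repeats, so this is not a simple path
    for i in range(len(path)):
        for j in range(i + 1, len(path)):
            linked = frozenset((path[i], path[j])) in present
            # Consecutive vertices must be linked; all other pairs must not be.
            if linked != (j == i + 1):
                return False
    return True


# On a 4-cycle, three vertices form an induced path but all four do not,
# because the closing edge (3, 0) joins the two endpoints.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
three_ok = is_induced_path(square, [0, 1, 2])
four_ok = is_induced_path(square, [0, 1, 2, 3])
```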

1489
01:12:57,320 --> 01:13:00,810
So this is sort of
longest path is one thing.

1490
01:13:00,810 --> 01:13:03,030
That's quite hard to
approximate-- like, I think,

1491
01:13:03,030 --> 01:13:05,070
n to the 1 minus epsilon.

1492
01:13:05,070 --> 01:13:07,070
That's sort of the
analog of Hamiltonicity.

1493
01:13:07,070 --> 01:13:09,740
Longest induced
path is worse.

1494
01:13:09,740 --> 01:13:12,190
Even finding an induced
path of length k,

1495
01:13:12,190 --> 01:13:16,550
finding a feasible solution,
finding an induced path

1496
01:13:16,550 --> 01:13:17,150
is hard.

1497
01:13:24,310 --> 01:13:33,749
Another fun one is longest
path with forbidden pairs.

1498
01:13:33,749 --> 01:13:35,540
So there are pairs of
edges that you're not

1499
01:13:35,540 --> 01:13:38,160
allowed to choose together, and
subject to those constraints

1500
01:13:38,160 --> 01:13:40,000
you want to find
the longest path.

1501
01:13:40,000 --> 01:13:42,600
So these are all
NPO PB-complete.

1502
01:13:42,600 --> 01:13:44,437
No numbers in any of them.

1503
01:13:44,437 --> 01:13:46,145
Now let me give you
some number problems.

1504
01:13:58,930 --> 01:14:03,230
So Max Ones was: you want to maximize
the number of true variables.

1505
01:14:03,230 --> 01:14:05,710
Now we're going to add weights.

1506
01:14:05,710 --> 01:14:09,330
So we want to maximize
the sum of the weights

1507
01:14:09,330 --> 01:14:12,370
of the true
variables-- and while

1508
01:14:12,370 --> 01:14:15,830
satisfying a Boolean formula.
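
A minimal brute-force sketch of the weighted problem just stated, assuming the formula is in CNF with clauses written as lists of nonzero integers (`+i` for variable i, `-i` for its negation, numbered from 1). That encoding and the function name are assumptions for illustration. Python integers are unbounded, so exponentially large weights need no special handling.

```python
from itertools import product


def weighted_max_ones(num_vars, clauses, weights):
    """Return (best_weight, assignment), or None if the formula is unsatisfiable."""
    best = None
    for assignment in product([False, True], repeat=num_vars):
        satisfied = all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if satisfied:
            # Objective: total weight of the variables set to true.
            value = sum(w for w, bit in zip(weights, assignment) if bit)
            if best is None or value > best[0]:
                best = (value, assignment)
    return best


# (x1 or x2) and (not x1 or not x2): exactly one variable is true.
result = weighted_max_ones(2, [[1, 2], [-1, -2]], [5, 3])
```

The search tries all 2^n assignments, so it only illustrates the objective and the feasibility condition on small formulas.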

1509
01:14:15,830 --> 01:14:17,970
So again, finding a
feasible solution is hard.

1510
01:14:17,970 --> 01:14:19,800
That's not surprising.

1511
01:14:19,800 --> 01:14:22,440
Here, the weights can
be exponential in value,

1512
01:14:22,440 --> 01:14:24,500
because we allow n
bits for the weights.

1513
01:14:24,500 --> 01:14:28,450
And that pushes you
into NPO completeness.

1514
01:14:28,450 --> 01:14:31,210
If you say the weights have
to be polynomially bounded,

1515
01:14:31,210 --> 01:14:33,276
then this problem
is NPO PB-complete.

1516
01:14:33,276 --> 01:14:34,900
And that's sort of
the starting problem

1517
01:14:34,900 --> 01:14:36,820
that they used to prove
all of these are hard.

1518
01:14:36,820 --> 01:14:39,420
So they're reductions from
this with polynomial weights

1519
01:14:39,420 --> 01:14:40,100
to these guys.

1520
01:14:44,438 --> 01:14:47,330
AUDIENCE: [INAUDIBLE]?

1521
01:14:47,330 --> 01:14:49,220
PROFESSOR: 3SAT.

1522
01:14:49,220 --> 01:14:54,080
I don't know. Whether you could
go down to 2SAT is interesting.

1523
01:14:54,080 --> 01:14:57,960
Here they say, I think,
probably 3SAT or CNFSAT.

1524
01:14:57,960 --> 01:15:00,040
Those reductions
definitely still work.

1525
01:15:00,040 --> 01:15:03,050
Whether you could put the
2SAT into the Max aspect,

1526
01:15:03,050 --> 01:15:03,630
I don't know.

1527
01:15:03,630 --> 01:15:06,550
But this could be
fun to look at.

1528
01:15:06,550 --> 01:15:09,000
There aren't a ton of papers
about these two classes,

1529
01:15:09,000 --> 01:15:11,050
but there are a few
before they nailed down

1530
01:15:11,050 --> 01:15:12,860
any interesting problems.

1531
01:15:12,860 --> 01:15:14,740
Here's another
interesting problem.

1532
01:15:20,600 --> 01:15:24,830
Suppose you want to do
integer linear programming.

1533
01:15:24,830 --> 01:15:28,710
To keep it simple, we'll
assume that the variables are

1534
01:15:28,710 --> 01:15:33,452
zero or one, and then
that is equally hard.

1535
01:15:33,452 --> 01:15:34,910
Here it's a little,
unless you know

1536
01:15:34,910 --> 01:15:37,034
a lot about linear programming,
it's not so obvious

1537
01:15:37,034 --> 01:15:39,290
that finding a feasible
solution here is hard.

1538
01:15:39,290 --> 01:15:41,589
But in general, linear
programming-- at least

1539
01:15:41,589 --> 01:15:43,880
in the non-integer case--
you could reduce optimization

1540
01:15:43,880 --> 01:15:45,660
to feasibility.

1541
01:15:45,660 --> 01:15:47,992
So I think the same
thing applies here.

1542
01:15:47,992 --> 01:15:49,950
If you're not familiar
with linear programming,

1543
01:15:49,950 --> 01:15:53,450
it's basically a bunch of
inequality constraints,

1544
01:15:53,450 --> 01:15:55,260
linear inequality constraints.

1545
01:15:55,260 --> 01:15:58,330
And now this is a
bunch of integers.

1546
01:15:58,330 --> 01:16:01,840
These are both given integer
matrices and vectors.

1547
01:16:01,840 --> 01:16:05,400
And they can have
exponential value.
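
A minimal brute-force sketch of 0-1 integer linear programming. The lecture does not spell out the exact form, so this assumes constraints `A x <= b` and a maximization objective `c . x`, and the function name is illustrative. Entries may be arbitrarily large integers.

```python
from itertools import product


def solve_01_ilp(A, b, c):
    """Maximize c.x over x in {0,1}^n subject to A x <= b; None if infeasible."""
    best = None
    for x in product((0, 1), repeat=len(c)):
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= bound
            for row, bound in zip(A, b)
        )
        if feasible:
            value = sum(ci * xi for ci, xi in zip(c, x))
            if best is None or value > best[0]:
                best = (value, x)
    return best


# At most one of x0, x1 may be 1; x1 is worth more.
small = solve_01_ilp([[1, 1]], [1], [2, 3])
# No 0-1 value of x0 satisfies x0 <= -1, so there is no feasible solution.
infeasible = solve_01_ilp([[1]], [-1], [1])
```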

1548
01:16:05,400 --> 01:16:06,310
Question?

1549
01:16:06,310 --> 01:16:08,750
AUDIENCE: For the
max/min weighted ones,

1550
01:16:08,750 --> 01:16:12,320
for polynomial bounded,
is it still hard

1551
01:16:12,320 --> 01:16:15,460
if you just do ones
and minus ones?

1552
01:16:15,460 --> 01:16:19,560
PROFESSOR: I think min or
max ones without weights

1553
01:16:19,560 --> 01:16:21,600
is NPO PB-complete.

1554
01:16:21,600 --> 01:16:23,600
I should double-check.

1555
01:16:23,600 --> 01:16:27,100
I didn't actually mention, but
this characterization theorem

1556
01:16:27,100 --> 01:16:30,230
works for weighted
problems also.

1557
01:16:30,230 --> 01:16:33,540
For every single case, they show
that weighted and unweighted

1558
01:16:33,540 --> 01:16:38,640
are the same complexity,
except for this one.

1559
01:16:38,640 --> 01:16:42,740
In the min ones case, if all
the variables' true, satisfy it,

1560
01:16:42,740 --> 01:16:45,330
you get Poly-APX-completeness
if you're unweighted.

1561
01:16:45,330 --> 01:16:50,300
If you're weighted, then you
can't find any approximation.

1562
01:16:50,300 --> 01:16:55,390
It's NP-hard to find any factor,
which I think, this is, I

1563
01:16:55,390 --> 01:16:58,037
think, before the
introduction or popularization

1564
01:16:58,037 --> 01:16:58,745
of these classes.

1565
01:16:58,745 --> 01:17:03,549
So that may be distinguishing
between Poly-APX-complete,

1566
01:17:03,549 --> 01:17:05,590
which is definitely smaller
than NPO PB-complete.

1567
01:17:05,590 --> 01:17:08,430
This might be NPO
PB-completeness.

1568
01:17:08,430 --> 01:17:08,930
Unclear.

1569
01:17:08,930 --> 01:17:12,120
But it's definitely
worse than Poly-APX.

1570
01:17:12,120 --> 01:17:13,150
Yeah?

1571
01:17:13,150 --> 01:17:15,150
AUDIENCE: How is it that
distinguished from Exp-APX?

1572
01:17:15,150 --> 01:17:17,550
Because I'm just confused how
you would ever get anything

1573
01:17:17,550 --> 01:17:19,950
worse than this, because,
that's like the biggest

1574
01:17:19,950 --> 01:17:22,370
that you [INAUDIBLE].

1575
01:17:22,370 --> 01:17:25,470
PROFESSOR: So this problem
is exponential APX-hard

1576
01:17:25,470 --> 01:17:26,795
if you forbid zero.

1577
01:17:26,795 --> 01:17:30,030
If you allow zero, then you
can't get any approximation.

1578
01:17:30,030 --> 01:17:32,500
Here, I think even
when you allow zero,

1579
01:17:32,500 --> 01:17:34,820
or even when you
forbid zero, you still

1580
01:17:34,820 --> 01:17:36,000
can't get an approximation.

1581
01:17:36,000 --> 01:17:39,370
I think that's the idea here.

1582
01:17:39,370 --> 01:17:42,020
Here, these problems
generally you

1583
01:17:42,020 --> 01:17:44,812
can get, depending
on your set-up,

1584
01:17:44,812 --> 01:17:47,395
these problems you can all get
like a factor-n approximation.

1585
01:17:49,900 --> 01:17:52,360
Well, maybe not in
polynomial time.

1586
01:17:52,360 --> 01:17:54,090
This is hard to find.

1587
01:17:54,090 --> 01:17:55,320
Some of these you can.

1588
01:17:55,320 --> 01:17:58,750
Longest induced path, just
have a path of length 1.

1589
01:17:58,750 --> 01:18:00,220
That will be induced.

1590
01:18:00,220 --> 01:18:02,040
So that gives you a
factor n approximation.

1591
01:18:02,040 --> 01:18:05,239
There is a lower bound
on this situation,

1592
01:18:05,239 --> 01:18:07,030
n to the 1 minus epsilon
inapproximability.

1593
01:18:09,640 --> 01:18:12,810
I think morally it
should be a factor n,

1594
01:18:12,810 --> 01:18:15,190
but this is the
best result I found.

1595
01:18:15,190 --> 01:18:17,290
So it's funny.

1596
01:18:17,290 --> 01:18:19,800
This is only for
number problems.

1597
01:18:19,800 --> 01:18:21,520
So I presented this as in between.
is as in between.

1598
01:18:21,520 --> 01:18:23,832
But this is actually
in some sense lower

1599
01:18:23,832 --> 01:18:24,915
than Exp-APX-completeness.

1600
01:18:27,716 --> 01:18:29,465
It's sort of a harder
version of Poly-APX.

1601
01:18:32,130 --> 01:18:34,740
This is a slightly harder
version of Exp-APX.

1602
01:18:37,320 --> 01:18:39,920
I think it's a small
difference, but it's

1603
01:18:39,920 --> 01:18:43,410
good to know there
is this difference.

1604
01:18:43,410 --> 01:18:46,160
Other questions?

1605
01:18:46,160 --> 01:18:46,660
All right.

1606
01:18:46,660 --> 01:18:54,460
So this ends what I plan to say
about L-reduction-style proofs,

1607
01:18:54,460 --> 01:18:57,562
which are all about
preserving approximability.

1608
01:18:57,562 --> 01:18:59,020
The next class,
we're going to look

1609
01:18:59,020 --> 01:19:01,980
at a different take on
inapproximability, which

1610
01:19:01,980 --> 01:19:06,180
is called gaps, and gap
preserving reductions,

1611
01:19:06,180 --> 01:19:08,320
where you can set up
a problem that either

1612
01:19:08,320 --> 01:19:10,980
it has a great solution,
or the next solution

1613
01:19:10,980 --> 01:19:12,110
below that is way lower.

1614
01:19:12,110 --> 01:19:15,105
And there's a gap between the
best and the next to best.

1615
01:19:15,105 --> 01:19:16,480
And whenever you
have such a gap,

1616
01:19:16,480 --> 01:19:18,249
you also have an
inapproximability gap,

1617
01:19:18,249 --> 01:19:20,290
because you know there's
this solution out there,

1618
01:19:20,290 --> 01:19:24,510
but finding it, if it's
NP-complete to find this,

1619
01:19:24,510 --> 01:19:27,240
to solve it exactly, and
so the next level down you

1620
01:19:27,240 --> 01:19:28,110
lose some factor.

1621
01:19:28,110 --> 01:19:30,600
And whatever that gap is is
your inapproximability bound.
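
A short worked version of the gap argument for a maximization problem. The symbols a, b, rho, and ALG are notation introduced here, not from the lecture. Assume it is NP-hard to tell "yes" instances from "no" instances, and that a polynomial-time algorithm ALG guarantees ratio rho with rho <= a/b.

```latex
\[
\text{yes: } \mathrm{OPT}(x) \ge a, \qquad
\text{no: } \mathrm{OPT}(x) < b, \qquad b < a .
\]
\[
\text{yes} \;\Rightarrow\; \mathrm{ALG}(x) \ge \frac{\mathrm{OPT}(x)}{\rho}
  \ge \frac{a}{a/b} = b,
\qquad
\text{no} \;\Rightarrow\; \mathrm{ALG}(x) \le \mathrm{OPT}(x) < b .
\]
```

Comparing ALG(x) against b would then decide the NP-hard question, so no such algorithm exists unless P = NP, and a/b is the inapproximability bound.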

1622
01:19:30,600 --> 01:19:33,280
It doesn't give you
completeness results like this

1623
01:19:33,280 --> 01:19:35,030
in general-- not always.

1624
01:19:35,030 --> 01:19:37,732
But it tends to give you really
good inapproximability bounds.

1625
01:19:37,732 --> 01:19:40,190
Here I've completely ignored
what the constant factors are.

1626
01:19:40,190 --> 01:19:42,860
Most of them are not so great.

1627
01:19:42,860 --> 01:19:44,650
Like when you
prove APX-hardness,

1628
01:19:44,650 --> 01:19:48,770
usually you get a 1 plus 1
over 1,000 kind of lower bound

1629
01:19:48,770 --> 01:19:50,540
on the approximability factor.

1630
01:19:50,540 --> 01:19:53,750
But the best upper
bound is like 2, or 1.5.

1631
01:19:53,750 --> 01:19:55,290
And what we'll talk
about next time,

1632
01:19:55,290 --> 01:19:58,290
you can get much closer--
sometimes exact bounds

1633
01:19:58,290 --> 01:20:00,380
between upper and lower.

1634
01:20:00,380 --> 01:20:03,130
But that will be next week.