1
00:00:00,080 --> 00:00:01,770
The following
content is provided

2
00:00:01,770 --> 00:00:04,010
under a Creative
Commons license.

3
00:00:04,010 --> 00:00:06,860
Your support will help MIT
OpenCourseWare continue

4
00:00:06,860 --> 00:00:10,720
to offer high quality
educational resources for free.

5
00:00:10,720 --> 00:00:13,330
To make a donation or
view additional materials

6
00:00:13,330 --> 00:00:17,226
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,226 --> 00:00:17,851
at ocw.mit.edu.

8
00:00:22,720 --> 00:00:26,270
PROFESSOR: Today, we are going
to do computational complexity.

9
00:00:26,270 --> 00:00:28,989
This is rather different
from every other thing

10
00:00:28,989 --> 00:00:30,030
we've seen in this class.

11
00:00:32,729 --> 00:00:36,120
This class is basically about
polynomial time algorithms

12
00:00:36,120 --> 00:00:38,580
and problems where we
can solve your problem

13
00:00:38,580 --> 00:00:40,290
in polynomial time.

14
00:00:40,290 --> 00:00:43,130
And today, it's about
when you can't do that.

15
00:00:43,130 --> 00:00:44,880
Sometimes, we can prove
you can't do that.

16
00:00:44,880 --> 00:00:47,020
Sometimes, we're pretty
sure you can't do that.

17
00:00:47,020 --> 00:00:49,140
But it's all about
negative results

18
00:00:49,140 --> 00:00:52,850
when your problems
are really complex.

19
00:00:52,850 --> 00:00:55,040
And there's a lot
of fun topics, here.

20
00:00:55,040 --> 00:00:59,280
This is the topic of
entire classes, like 6.045.

21
00:00:59,280 --> 00:01:02,950
We're just going to get
a one-hour flavor of it.

22
00:01:02,950 --> 00:01:04,519
So think of it as
a high level intro.

23
00:01:04,519 --> 00:01:06,893
But we're going to prove real
theorems and do real things

24
00:01:06,893 --> 00:01:09,580
and you'll get a sense
of how all this works.

25
00:01:09,580 --> 00:01:13,680
So I'm going to start out with
three complexity classes--

26
00:01:13,680 --> 00:01:21,390
P, EXP, and R. How many
people know what P is?

27
00:01:21,390 --> 00:01:23,760
And it is?

28
00:01:23,760 --> 00:01:25,990
Polynomial time.

29
00:01:25,990 --> 00:01:28,510
More precisely, it's
the set of all problems

30
00:01:28,510 --> 00:01:30,060
you can solve in
polynomial time.

31
00:01:35,590 --> 00:01:37,090
This is what the
class is all about.

32
00:01:39,736 --> 00:01:41,110
Almost every
problem we have seen

33
00:01:41,110 --> 00:01:44,410
in this class-- there's
one exception-- is

34
00:01:44,410 --> 00:01:48,960
in P. Does anyone
know the exception?

35
00:01:48,960 --> 00:01:51,150
It's a good puzzle for you.

36
00:01:51,150 --> 00:01:51,940
Not NP.

37
00:01:51,940 --> 00:01:52,440
What's next?

38
00:01:52,440 --> 00:01:55,120
EXP.

39
00:01:55,120 --> 00:01:56,660
How many people
know what EXP is?

40
00:01:56,660 --> 00:01:58,810
Or you can guess.

41
00:01:58,810 --> 00:02:00,400
Any guesses?

42
00:02:00,400 --> 00:02:01,569
Exponential.

43
00:02:01,569 --> 00:02:04,110
These are all the problems you
can solve in exponential time.

44
00:02:21,190 --> 00:02:23,210
If you want to be formal
about it, in this case,

45
00:02:23,210 --> 00:02:29,200
exponential means 2 to
the n to some constant power.

46
00:02:29,200 --> 00:02:31,700
So not just 2 to the n, but also
2 to the n squared, 2 to the n

47
00:02:31,700 --> 00:02:32,199
cubed.

48
00:02:32,199 --> 00:02:34,350
Those are all
considered exponential--

49
00:02:34,350 --> 00:02:38,050
an exponential of a polynomial is
considered in the class EXP.
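
In symbols, one common way to write the two classes so far. The TIME(f(n)) notation, meaning the decision problems solvable in O(f(n)) time, is standard but is an assumption here; the lecture itself does not use it.

```latex
\mathrm{P} = \bigcup_{c \ge 1} \mathrm{TIME}\left(n^{c}\right),
\qquad
\mathrm{EXP} = \bigcup_{c \ge 1} \mathrm{TIME}\left(2^{n^{c}}\right)
```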

50
00:02:38,050 --> 00:02:42,010
Now, basically, almost every
problem you can dream of you

51
00:02:42,010 --> 00:02:43,170
can solve in EXP.

52
00:02:43,170 --> 00:02:45,370
Exponential time
is so much time.

53
00:02:45,370 --> 00:02:47,705
And this class has always
been about taking things that

54
00:02:47,705 --> 00:02:51,460
are obviously in EXP and showing
that they're actually in P.

55
00:02:51,460 --> 00:02:53,120
So if you want to
draw a picture,

56
00:02:53,120 --> 00:02:54,870
you could say, OK,
here's all the problems

57
00:02:54,870 --> 00:02:57,070
we can solve in polynomial time.

58
00:02:57,070 --> 00:03:00,287
Here's all the problems we
can solve in exponential time.

59
00:03:00,287 --> 00:03:01,620
And there are problems out here.

60
00:03:01,620 --> 00:03:03,390
These are different classes.

61
00:03:03,390 --> 00:03:06,070
And we want to sort
of bring things

62
00:03:06,070 --> 00:03:09,620
into here as much as possible.

63
00:03:09,620 --> 00:03:11,940
I actually want to
draw this picture

64
00:03:11,940 --> 00:03:16,050
in a different way, which
is as a horizontal line.

65
00:03:20,760 --> 00:03:21,715
So an axis.

66
00:03:24,860 --> 00:03:27,949
I'm going to call this
computational difficulty.

67
00:03:27,949 --> 00:03:29,740
You could call it
computational complexity,

68
00:03:29,740 --> 00:03:30,930
but that's a bit of
a loaded term that

69
00:03:30,930 --> 00:03:32,470
actually has formal meaning.

70
00:03:32,470 --> 00:03:34,000
Difficulty is nice and vague.

71
00:03:34,000 --> 00:03:36,040
So I can draw an
abstract picture.

72
00:03:36,040 --> 00:03:38,460
This is not a true
diagram, but it's

73
00:03:38,460 --> 00:03:40,770
a very good guideline
of what's going on.

74
00:03:40,770 --> 00:03:46,905
So we have-- I'm going to draw--
I believe-- three notches.

75
00:03:50,200 --> 00:03:52,770
No, eventually four, so let
me give myself some room.

76
00:03:56,480 --> 00:04:01,920
We have over here, the
easy problems are P. Then,

77
00:04:01,920 --> 00:04:05,180
we have these problems,
which are EXP.

78
00:04:05,180 --> 00:04:08,930
We're going to fill in
something in the middle.

79
00:04:08,930 --> 00:04:11,910
And then this is
something called R.

80
00:04:11,910 --> 00:04:13,410
So you've got P is
everything, here.

81
00:04:13,410 --> 00:04:19,140
EXP is all the way out to
here, in some abstract view.

82
00:04:19,140 --> 00:04:23,290
The next thing is R. How
many people know what R is?

83
00:04:23,290 --> 00:04:26,250
This one, I had to look up.

84
00:04:26,250 --> 00:04:29,780
It's not usually given a name.

85
00:04:29,780 --> 00:04:30,805
No one.

86
00:04:30,805 --> 00:04:31,430
Teaching staff?

87
00:04:31,430 --> 00:04:32,445
You guys know it?

88
00:04:36,500 --> 00:04:39,900
These are all problems
solvable in finite time.

89
00:04:39,900 --> 00:04:40,910
R stands for finite.

90
00:04:49,980 --> 00:04:52,280
R stands for recursive.

91
00:04:52,280 --> 00:04:54,530
Recursive used to mean
something completely different,

92
00:04:54,530 --> 00:04:56,863
back in the '30s, when people
were thinking about what's

93
00:04:56,863 --> 00:04:58,470
computable, what's
not computable.

94
00:04:58,470 --> 00:05:02,390
These are, basically, solvable
problems, computable problems.

95
00:05:02,390 --> 00:05:04,890
Finite time is a reasonable
requirement, I think,

96
00:05:04,890 --> 00:05:06,105
for all algorithms.

97
00:05:06,105 --> 00:05:09,690
And that's R. Now,
I've drawn this arrow

98
00:05:09,690 --> 00:05:13,106
to keep going because there
are problems out here.

99
00:05:13,106 --> 00:05:14,605
It's kind of
discouraging, but there

100
00:05:14,605 --> 00:05:17,080
are problems that
are unsolvable.

101
00:05:17,080 --> 00:05:19,977
In fact, most problems
are unsolvable.

102
00:05:19,977 --> 00:05:21,060
We're going to prove that.

103
00:05:21,060 --> 00:05:23,340
It's actually really
easy to prove.

104
00:05:23,340 --> 00:05:28,360
Kind of depressing, but true.

105
00:05:28,360 --> 00:05:31,820
Let me start with some examples
before we get to that proof.

106
00:05:36,200 --> 00:05:40,730
So I'm writing examples
of some things we've seen.

107
00:05:40,730 --> 00:05:44,170
So here's an example of
a problem we've seen.

108
00:05:47,740 --> 00:05:49,095
Negative-weight cycle detection.

109
00:05:52,314 --> 00:05:54,440
I give you a graph--
a weighted graph.

110
00:05:54,440 --> 00:05:58,090
I want to know does it have
any negative-weight cycles?

111
00:05:58,090 --> 00:06:00,560
What classes is this problem in?

112
00:06:00,560 --> 00:06:03,050
P. We know how to solve
this in polynomial time--

113
00:06:03,050 --> 00:06:06,260
in VE time-- using Bellman-Ford.

114
00:06:06,260 --> 00:06:08,650
VE time-- well, that finds
negative-weight cycles

115
00:06:08,650 --> 00:06:09,540
reachable from s.

116
00:06:09,540 --> 00:06:11,370
But, I guess, if you
add a source that

117
00:06:11,370 --> 00:06:14,390
can reach anywhere--
zero weight-- then

118
00:06:14,390 --> 00:06:18,390
that'll tell you
overall that it's in P.
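
A minimal sketch of the idea just described, assuming the graph is given as a list of (u, v, weight) edges with vertices numbered from 0. The function name and input format are illustrative, not from the lecture. Starting every distance at 0 plays the role of the extra zero-weight source that reaches every vertex.

```python
def has_negative_cycle(num_vertices, edges):
    """Return True if the directed graph has any negative-weight cycle.

    edges is a list of (u, v, weight) tuples, with vertices 0..num_vertices-1.
    """
    # Setting every distance to 0 is equivalent to adding a virtual source
    # with a zero-weight edge to every vertex and relaxing those edges first.
    dist = [0] * num_vertices

    # Bellman-Ford: relax every edge V - 1 times, O(VE) work in total.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            # Distances are stable, so no negative-weight cycle exists.
            return False

    # If any edge can still be relaxed, some negative-weight cycle exists.
    return any(dist[u] + w < dist[v] for u, v, w in edges)
```

Because of the virtual source, this also catches cycles that are not reachable from any particular start vertex.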

119
00:06:18,390 --> 00:06:19,640
It's also in EXP, of course.

120
00:06:19,640 --> 00:06:21,322
Everything in P is also in EXP.

121
00:06:21,322 --> 00:06:23,280
Because if you can solve
it in polynomial time,

122
00:06:23,280 --> 00:06:25,290
you can solve it in
exponential time.

123
00:06:25,290 --> 00:06:29,230
This is at most
exponential time.

124
00:06:29,230 --> 00:06:30,048
At most polynomial.

125
00:06:33,440 --> 00:06:35,680
Here's a problem
we haven't seen.

126
00:06:35,680 --> 00:06:37,160
But it's pretty cool.

127
00:06:37,160 --> 00:06:39,190
n by n Chess.

128
00:06:39,190 --> 00:06:41,320
So this is the
problem I give you.

129
00:06:41,320 --> 00:06:43,440
So we're on an n by n
board, and I give you

130
00:06:43,440 --> 00:06:45,690
a whole bunch of
pieces on the board,

131
00:06:45,690 --> 00:06:48,890
and I want to know does
White win from here?

132
00:06:48,890 --> 00:06:51,660
I say it's White to
move or Black to move,

133
00:06:51,660 --> 00:06:55,250
and who's going to win
from this position?

134
00:06:55,250 --> 00:06:59,085
This problem can be
solved in exponential time.

135
00:06:59,085 --> 00:07:02,080
You can sort of play out
all possible strategies

136
00:07:02,080 --> 00:07:05,210
and see who wins.
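
To make "play out all possible strategies" concrete, here is a rough sketch on a much smaller game than Chess. The game is an assumption chosen for brevity: players alternately take 1 to 3 stones, and whoever takes the last stone wins. The same exhaustive recursion applies to any finite two-player game, and without memoization its running time grows exponentially with the size of the position.

```python
def player_to_move_wins(stones, max_take=3):
    """Exhaustively decide whether the player to move can force a win.

    Stand-in game (not Chess): take 1..max_take stones per turn;
    the player who takes the last stone wins.
    """
    if stones == 0:
        # No stones left: the previous player took the last one and won.
        return False
    # The mover wins if some move leaves the opponent in a losing position.
    return any(
        not player_to_move_wins(stones - take, max_take)
        for take in range(1, min(max_take, stones) + 1)
    )
```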

137
00:07:05,210 --> 00:07:10,060
And it's not in P. There's
no polynomial time algorithm

138
00:07:10,060 --> 00:07:12,150
to play generalized Chess.

139
00:07:12,150 --> 00:07:15,250
This sort of captures why
Chess-- even at eight by eight

140
00:07:15,250 --> 00:07:17,510
Chess-- is hard-- because
there's no general way

141
00:07:17,510 --> 00:07:19,220
to do it.

142
00:07:19,220 --> 00:07:23,210
So there's no special
way to do it, probably.

143
00:07:23,210 --> 00:07:25,780
Computational complexity is
all about order of growth.

144
00:07:25,780 --> 00:07:27,770
So we can't analyze
eight by eight Chess,

145
00:07:27,770 --> 00:07:29,290
but we can analyze n by n Chess.

146
00:07:29,290 --> 00:07:32,290
And that gives us a flavor of
why 8 by 8 is so difficult.

147
00:07:32,290 --> 00:07:37,000
Go is also in EXP, but
not in P-- lots of games

148
00:07:37,000 --> 00:07:41,430
are in this category, lots of
complicated games, let's say.

149
00:07:41,430 --> 00:07:45,130
And so this is a first example
of a problem that we know we

150
00:07:45,130 --> 00:07:48,930
cannot solve in polynomial time.

151
00:07:48,930 --> 00:07:50,880
Bad news.

152
00:07:50,880 --> 00:07:53,070
I also talked about
Tetris a little bit.

153
00:07:56,940 --> 00:07:58,920
Unlike the Tetris
training, which we saw,

154
00:07:58,920 --> 00:08:00,630
this is sort of
realistic Tetris--

155
00:08:00,630 --> 00:08:02,340
all the rules of Tetris.

156
00:08:02,340 --> 00:08:04,990
The only catch is that I
tell you all the pieces that

157
00:08:04,990 --> 00:08:06,470
are going to come in advance.

158
00:08:06,470 --> 00:08:08,442
Because, otherwise,
it's some random process

159
00:08:08,442 --> 00:08:11,025
and it's kind of hard to think
about what's the best strategy.

160
00:08:11,025 --> 00:08:13,524
But if I tell you
what's going to come--

161
00:08:13,524 --> 00:08:14,940
say it's a
pseudo-random generator

162
00:08:14,940 --> 00:08:16,365
and you know how it works.

163
00:08:16,365 --> 00:08:17,990
You know all the
pieces that will come.

164
00:08:17,990 --> 00:08:22,830
I want to know can I survive
from a given initial board mess

165
00:08:22,830 --> 00:08:24,980
and for a given
sequence of pieces.

166
00:08:24,980 --> 00:08:27,740
This can also be solved
in exponential time.

167
00:08:27,740 --> 00:08:29,575
Just try all the possibilities.

168
00:08:34,780 --> 00:08:45,760
We don't know whether it's
in P. We're pretty sure

169
00:08:45,760 --> 00:08:47,920
it's not in P. And by the
end of today's lecture,

170
00:08:47,920 --> 00:08:51,500
you'll understand why
we think it's not in P.

171
00:08:51,500 --> 00:08:54,930
But it's going to be
somewhere in between here.

172
00:08:54,930 --> 00:08:57,110
Tetris is actually right here.

173
00:08:57,110 --> 00:08:59,410
But I haven't defined
what right here is yet.

174
00:09:06,040 --> 00:09:10,460
And then the next one
is halting problem.

175
00:09:24,720 --> 00:09:27,140
So halting problem
is particularly cool,

176
00:09:27,140 --> 00:09:29,320
as we'll see-- or interesting.

177
00:09:29,320 --> 00:09:34,710
It's the problem of given a
computer program-- Python,

178
00:09:34,710 --> 00:09:37,540
whatever, it doesn't really
matter what language.

179
00:09:37,540 --> 00:09:42,150
They're all the same in
a theoretical sense--

180
00:09:42,150 --> 00:09:43,005
does it ever halt?

181
00:09:46,680 --> 00:09:50,335
Does it ever stop running,
return a result, whatever?

182
00:09:53,070 --> 00:09:55,600
This would be really handy--
you're writing some code,

183
00:09:55,600 --> 00:09:58,360
and you've run it
for 5 hours, and you

184
00:09:58,360 --> 00:10:00,120
don't know is that
because there's a bug

185
00:10:00,120 --> 00:10:01,453
and you've got an infinite loop?

186
00:10:01,453 --> 00:10:04,000
Or is it just because
it's really slow?

187
00:10:04,000 --> 00:10:08,137
So you'd like to give it
to some program-- checking

188
00:10:08,137 --> 00:10:09,845
program-- that says
will this run forever

189
00:10:09,845 --> 00:10:11,532
or will it terminate.

190
00:10:11,532 --> 00:10:13,160
That's the halting problem.

191
00:10:13,160 --> 00:10:17,080
And this problem
is not in R. There

192
00:10:17,080 --> 00:10:20,260
is no correct algorithm
for solving this problem.

193
00:10:20,260 --> 00:10:24,270
There's no way to tell,
given an arbitrary program,

194
00:10:24,270 --> 00:10:25,840
whether it will halt.

195
00:10:25,840 --> 00:10:28,130
Now, in some situations--
take the empty program--

196
00:10:28,130 --> 00:10:29,630
I can tell that it halts.

197
00:10:29,630 --> 00:10:33,580
Or I take some special
simple class of programs,

198
00:10:33,580 --> 00:10:36,890
I can tell whether they halt or
determine that they don't halt.

199
00:10:36,890 --> 00:10:40,890
But there's no algorithm that
solves it for all programs,

200
00:10:40,890 --> 00:10:42,380
in finite time.

201
00:10:42,380 --> 00:10:44,330
In infinite time,
I can solve it.

202
00:10:44,330 --> 00:10:46,540
Just run it.

203
00:10:46,540 --> 00:10:48,340
Run the program.

204
00:10:48,340 --> 00:10:50,340
Given finite time, there's
no way to solve this.

205
00:10:50,340 --> 00:10:53,370
And so this is a little bit
beyond what we can prove today.

206
00:10:53,370 --> 00:10:54,930
It's not that hard
to prove, but it

207
00:10:54,930 --> 00:10:56,440
takes half an hour or something.

208
00:10:56,440 --> 00:10:57,690
I want to get to other things.

209
00:10:57,690 --> 00:11:02,020
But if you take 6.045,
they'll prove this.

210
00:11:02,020 --> 00:11:03,990
What I want to show you
instead is an easier

211
00:11:03,990 --> 00:11:29,800
result-- that almost
every problem is not in R.

212
00:11:29,800 --> 00:11:32,680
I need one term, though,
which is decision problems.

213
00:11:32,680 --> 00:11:35,050
All of these problems,
I set it up in a way

214
00:11:35,050 --> 00:11:37,556
that the answer is
binary-- yes or no.

215
00:11:37,556 --> 00:11:38,930
Is there a
negative-weight cycle?

216
00:11:38,930 --> 00:11:41,090
Yes or no?

217
00:11:41,090 --> 00:11:43,950
Does White win from
this position in Chess?

218
00:11:43,950 --> 00:11:46,000
Can you survive in Tetris?

219
00:11:46,000 --> 00:11:48,240
And does this program halt?

220
00:11:48,240 --> 00:11:51,430
For various reasons--
basically convenience--

221
00:11:51,430 --> 00:11:53,320
the whole field of
computational complexity

222
00:11:53,320 --> 00:11:56,550
focuses on decision problems.

223
00:11:56,550 --> 00:11:59,370
And, in fact-- so
decision problems

224
00:11:59,370 --> 00:12:01,080
are ones where the
answer is yes or no.

225
00:12:01,080 --> 00:12:02,920
That's all.

226
00:12:02,920 --> 00:12:03,880
Why?

227
00:12:03,880 --> 00:12:05,520
Essentially because
it doesn't matter.

228
00:12:05,520 --> 00:12:07,920
If you take a problem
you care about,

229
00:12:07,920 --> 00:12:10,200
you can convert it into
a decision problem.

230
00:12:10,200 --> 00:12:12,760
We can see examples
of that later.

231
00:12:12,760 --> 00:12:14,290
Decision problems
are basically as

232
00:12:14,290 --> 00:12:17,989
hard as optimization
problems or whatever.

233
00:12:17,989 --> 00:12:19,530
But let's focus on
decision problems.

234
00:12:19,530 --> 00:12:20,920
The answer is yes or no.

235
00:12:20,920 --> 00:12:23,590
Claim that most of
them are uncomputable.

236
00:12:23,590 --> 00:12:26,100
And we can prove
this pretty easily

237
00:12:26,100 --> 00:12:30,390
if you know a bit of
set theory, I guess.

238
00:12:35,309 --> 00:12:37,350
On the one hand, I have
problems I want to solve.

239
00:12:37,350 --> 00:12:38,520
These are decision problems.

240
00:12:38,520 --> 00:12:41,220
And on the other hand,
I have algorithms,

241
00:12:41,220 --> 00:12:42,820
or computer programs
to solve them.

242
00:12:42,820 --> 00:12:44,445
I'm going to think
of computer programs

243
00:12:44,445 --> 00:12:46,640
because they're more precise.
Algorithms can

244
00:12:46,640 --> 00:12:50,230
be a little bit nebulous if you're
thinking about pseudocode--

245
00:12:50,230 --> 00:12:51,460
what's valid, what's invalid.

246
00:12:51,460 --> 00:12:53,610
But computer programs
are very clear.

247
00:12:53,610 --> 00:12:55,049
I give you some code.

248
00:12:55,049 --> 00:12:56,090
You throw it into Python.

249
00:12:56,090 --> 00:12:57,500
Either it works or it doesn't.

250
00:12:57,500 --> 00:12:59,552
And it does something.

251
00:12:59,552 --> 00:13:00,260
Runs for a while.

252
00:13:04,080 --> 00:13:08,750
How can I think about the
space of all possible programs?

253
00:13:08,750 --> 00:13:12,020
Well, programs are things
you type into a computer

254
00:13:12,020 --> 00:13:13,400
in ASCII, whatever.

255
00:13:13,400 --> 00:13:16,380
In the end, you can think of
it as just a binary string.

256
00:13:16,380 --> 00:13:18,060
Somehow it gets
encoded in binary.

257
00:13:18,060 --> 00:13:21,870
Everything is reduced to binary
in the end, on a computer.

258
00:13:21,870 --> 00:13:27,340
So this is a binary string.

259
00:13:27,340 --> 00:13:29,430
Now, you can also think
of a binary string

260
00:13:29,430 --> 00:13:33,280
as representing a
number, in binary.

261
00:13:33,280 --> 00:13:34,870
So you can also
think of a program,

262
00:13:34,870 --> 00:13:38,960
then, as a natural number-- some
number between 0 and infinity.

263
00:13:38,960 --> 00:13:41,870
And an integer.

264
00:13:41,870 --> 00:13:45,850
So usually we represent
this as math bold N.

265
00:13:45,850 --> 00:13:48,470
That's just 0, 1, 2, 3.

266
00:13:48,470 --> 00:13:50,670
You can think of every
program as ultimately

267
00:13:50,670 --> 00:13:51,960
reducing to an integer.

268
00:13:51,960 --> 00:13:53,615
It's a big integer, but, hey.

269
00:13:53,615 --> 00:13:55,770
It's an integer.

270
00:13:55,770 --> 00:13:57,647
So that's the space
of all programs.
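
One possible way to turn program text into a natural number and back, as a sketch. The UTF-8 encoding and the leading 1 byte are assumptions made here so the mapping is reversible; any fixed encoding works for the argument. Not every number decodes to a valid program, but every program gets its own number, which is all the counting argument needs.

```python
def program_to_number(source: str) -> int:
    """Encode program text as a natural number."""
    data = source.encode("utf-8")
    # Prefix a 1 byte so leading zero bytes in the data are not lost.
    return int.from_bytes(b"\x01" + data, "big")


def number_to_program(n: int) -> str:
    """Invert program_to_number for numbers it produced."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")  # drop the marker byte
```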

271
00:13:57,647 --> 00:14:00,230
Now, I want to think about the
space of all decision problems.

272
00:14:02,900 --> 00:14:06,474
So how can I define
a decision problem?

273
00:14:06,474 --> 00:14:08,640
Well, the natural way to
think of a decision problem

274
00:14:08,640 --> 00:14:12,650
is as a function that
maps inputs to yes or no.

275
00:14:19,350 --> 00:14:28,430
Function from
inputs to yes or no.

276
00:14:28,430 --> 00:14:32,310
Or you can think
of that as 1 and 0.

277
00:14:32,310 --> 00:14:34,310
So what's an input?

278
00:14:34,310 --> 00:14:36,220
Well, an input is
a binary string.

279
00:14:36,220 --> 00:14:39,000
So an input is a number--
a natural number.

280
00:14:41,790 --> 00:14:51,580
Input is a binary string, which
we can think of as being in N.

281
00:14:51,580 --> 00:14:58,570
So we've got a
function from N to 0,1.

282
00:14:58,570 --> 00:15:03,000
So another way to represent
one of these functions

283
00:15:03,000 --> 00:15:04,220
is as a table.

284
00:15:04,220 --> 00:15:06,070
I could just write
down all the answers.

285
00:15:06,070 --> 00:15:10,110
So I've got, well, the input
could be 0-- the number 0.

286
00:15:10,110 --> 00:15:11,969
And then, maybe it's a 0.

287
00:15:11,969 --> 00:15:14,260
Input could be 1
and then, maybe, output is 0.

288
00:15:14,260 --> 00:15:21,750
Then, the input could be 2, 3, 4, 5,
with outputs 1, 0, 1, 1, whatever.

289
00:15:21,750 --> 00:15:24,510
So I could write the
table of all answers.

290
00:15:24,510 --> 00:15:28,430
This is another way to
write down such a function.

291
00:15:28,430 --> 00:15:32,110
What we have, here, is an
infinite string of bits.

292
00:15:32,110 --> 00:15:34,290
Each of them could be 0 or 1.

293
00:15:34,290 --> 00:15:36,810
It would be a different problem.

294
00:15:36,810 --> 00:15:37,850
But they all exist.

295
00:15:37,850 --> 00:15:41,060
Any infinite string of bits
represents a decision problem.

296
00:15:41,060 --> 00:15:42,750
They're the same thing.

297
00:15:42,750 --> 00:15:45,950
So a decision problem is
an infinite string of bits.

298
00:15:45,950 --> 00:15:49,676
A program is a finite
string of bits.

299
00:15:49,676 --> 00:15:52,260
These are different things.

300
00:15:52,260 --> 00:15:54,170
One way to see that
they're different

301
00:15:54,170 --> 00:15:57,640
is put a decimal point, here.

302
00:15:57,640 --> 00:15:59,630
Now, this infinite
string of bits

303
00:15:59,630 --> 00:16:03,710
is a number-- a real
number-- between 0 and 1.

304
00:16:03,710 --> 00:16:04,720
It's written in binary.

305
00:16:04,720 --> 00:16:07,627
You may not be used
to binary point.

306
00:16:07,627 --> 00:16:08,960
This dot is not a decimal point.

307
00:16:08,960 --> 00:16:10,180
It's a binary point.

308
00:16:10,180 --> 00:16:12,970
But, hey.

309
00:16:12,970 --> 00:16:16,040
Any real number can be expressed
by an infinite string of bits

310
00:16:16,040 --> 00:16:18,820
in this way-- any real
number between 0 and 1.

311
00:16:22,210 --> 00:16:31,940
So a decision
problem is basically

312
00:16:31,940 --> 00:16:35,260
something in R, the set
of all real numbers,

313
00:16:35,260 --> 00:16:38,555
whereas a program is something
in N, the set of all natural numbers.

314
00:16:41,750 --> 00:16:45,360
And the thing is, the
number of real numbers

315
00:16:45,360 --> 00:16:50,460
is much, much bigger than
the number of integers.

316
00:16:50,460 --> 00:16:53,110
In a formal sense, we call
this one uncountably infinite,

317
00:16:53,110 --> 00:16:55,060
and this one is
countably infinite.

318
00:16:55,060 --> 00:16:56,830
I'm not going to prove
that here, today.

319
00:16:56,830 --> 00:16:59,010
You may have seen that proof.

320
00:16:59,010 --> 00:17:01,320
It's pretty simple.
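
For reference, the skipped proof is usually Cantor's diagonal argument, which is not shown in the lecture. Here is a finite sketch of its key step, assuming decision problems are modeled as Python functions from a natural number to True or False. Given any list of problems, it builds a problem that disagrees with the i-th one on input i, so the new problem cannot appear anywhere in the list.

```python
def diagonal_problem(listed_problems):
    """Build a decision problem that differs from every listed one.

    The result disagrees with listed_problems[i] on input i.
    """
    def differs(n):
        if n < len(listed_problems):
            # Flip the diagonal entry: answer the opposite of problem n on n.
            return not listed_problems[n](n)
        return False  # arbitrary answer beyond the list
    return differs
```

The same flip applied to an infinite list shows that no list indexed by natural numbers, such as the list of all programs, can cover every decision problem.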

321
00:17:01,320 --> 00:17:02,350
And that's bad news.

322
00:17:02,350 --> 00:17:04,829
That means that there
are way more problems

323
00:17:04,829 --> 00:17:07,550
than there are
programs to solve them.

324
00:17:07,550 --> 00:17:12,890
So this means almost every
problem that we could conceive

325
00:17:12,890 --> 00:17:16,525
of is unsolvable
by every program.

326
00:17:30,600 --> 00:17:33,245
And this was pretty depressing
the first time I saw it.

327
00:17:33,245 --> 00:17:35,120
That's why we put it at
the end of the class.

328
00:17:37,950 --> 00:17:40,160
I think you get all existential.

329
00:17:40,160 --> 00:17:42,040
I mean the thing is
every program only

330
00:17:42,040 --> 00:17:43,201
solves one problem.

331
00:17:43,201 --> 00:17:44,700
It takes some input,
and it's either

332
00:17:44,700 --> 00:17:46,247
going to output yes or no.

333
00:17:46,247 --> 00:17:48,580
And if it's wrong on any of
the inputs, then it's wrong.

334
00:17:48,580 --> 00:17:51,010
So it's going to give an answer.

335
00:17:51,010 --> 00:17:52,570
Say it's a
deterministic algorithm.

336
00:17:52,570 --> 00:17:55,950
No random numbers or things.

337
00:17:55,950 --> 00:17:57,670
Then, there's just
not enough programs

338
00:17:57,670 --> 00:18:01,402
to go around if each program
only solves one problem.

339
00:18:01,402 --> 00:18:02,610
This is the end of the proof.

340
00:18:02,610 --> 00:18:05,110
Any questions about that?

341
00:18:05,110 --> 00:18:07,390
Kind of weird.

342
00:18:07,390 --> 00:18:10,390
Because yet somehow, most of
the problems that we think about

343
00:18:10,390 --> 00:18:11,530
are computable.

344
00:18:11,530 --> 00:18:13,100
I don't know why that is.

345
00:18:13,100 --> 00:18:15,520
But mathematically,
most problems

346
00:18:15,520 --> 00:18:17,425
that you could think
of are uncomputable.

347
00:18:21,450 --> 00:18:22,974
Question?

348
00:18:22,974 --> 00:18:23,890
AUDIENCE: [INAUDIBLE].

349
00:18:27,850 --> 00:18:28,750
PROFESSOR: Yeah.

350
00:18:28,750 --> 00:18:32,270
It's something like,
the way that we describe

351
00:18:32,270 --> 00:18:35,990
problems is usually almost
algorithmic, anyway.

352
00:18:35,990 --> 00:18:39,610
And so, usually, most problems
we think of are in EXP.

353
00:18:39,610 --> 00:18:42,300
And so they're
definitely computable.

354
00:18:42,300 --> 00:18:43,820
There's some
metatheorem about how

355
00:18:43,820 --> 00:18:46,385
we think about problems,
not just programs.

356
00:18:51,110 --> 00:18:53,970
So that's all I'm going to
say about R. So out here,

357
00:18:53,970 --> 00:18:57,972
we have halting problem and,
actually, most problems.

358
00:18:57,972 --> 00:18:59,680
You can think of this
as an infinite line

359
00:18:59,680 --> 00:19:01,980
and then there's just
this small portion

360
00:19:01,980 --> 00:19:03,840
which are things you can solve.

361
00:19:03,840 --> 00:19:05,880
But we care about this
portion because that's

362
00:19:05,880 --> 00:19:07,040
the interesting stuff.

363
00:19:07,040 --> 00:19:08,750
That's what
algorithms are about.

364
00:19:08,750 --> 00:19:13,560
Out here kind of
nothing happens.

365
00:19:13,560 --> 00:19:17,070
So I want to talk about
this notch, which is NP.

366
00:19:22,747 --> 00:19:24,080
I imagine you've heard about NP.

367
00:19:27,252 --> 00:19:30,240
It's pretty cool, but
also kind of confusing.

368
00:19:34,616 --> 00:19:37,890
But it's actually very
closely related to something

369
00:19:37,890 --> 00:19:42,320
we've seen with dynamic
programming, which is guessing.

370
00:19:42,320 --> 00:19:44,460
So I'm going to give you
a couple of definitions

371
00:19:44,460 --> 00:19:48,285
of NP-- not formal definitions,
but high level definitions.

372
00:19:52,210 --> 00:19:57,350
So just like P, EXP, and R,
it's a set of decision problems.

373
00:19:57,350 --> 00:20:03,650
And it's going to look very
similar to P. NP does not

374
00:20:03,650 --> 00:20:05,640
stand for not polynomial.

375
00:20:05,640 --> 00:20:08,750
It stands for
nondeterministic polynomial.

376
00:20:08,750 --> 00:20:12,330
We'll get to
nondeterministic in a moment.

377
00:20:12,330 --> 00:20:14,436
The first line is the same.

378
00:20:14,436 --> 00:20:17,740
It's all decision problems you
can solve in polynomial time.

379
00:20:17,740 --> 00:20:19,720
That sounds like P.
But then, there's

380
00:20:19,720 --> 00:20:25,480
this extra line, which is
via a "lucky" algorithm.

381
00:20:33,730 --> 00:20:36,840
Let me tell you--
at a high level what

382
00:20:36,840 --> 00:20:40,434
a lucky algorithm does
is it can make guesses.

383
00:20:40,434 --> 00:20:42,475
But unlike the way that
we've been making guesses

384
00:20:42,475 --> 00:20:44,360
with dynamic programming--
with dynamic programming

385
00:20:44,360 --> 00:20:45,443
we had to guess something.

386
00:20:45,443 --> 00:20:47,190
We tried all the possibilities.

387
00:20:47,190 --> 00:20:50,597
A lucky algorithm just
needs to try one possibility

388
00:20:50,597 --> 00:20:51,680
because it's really lucky.

389
00:20:51,680 --> 00:20:54,860
It always guesses
the right choice.

390
00:20:54,860 --> 00:20:56,440
It's like magic.

391
00:20:56,440 --> 00:20:59,310
This is not a realistic
model of computation,

392
00:20:59,310 --> 00:21:04,470
but it is a model of computation
called the nondeterministic model.

393
00:21:09,040 --> 00:21:11,950
And it's going to sound
crazy because it is crazy,

394
00:21:11,950 --> 00:21:15,080
but nonetheless it's
actually really useful--

395
00:21:15,080 --> 00:21:17,410
even though you could
never really build

396
00:21:17,410 --> 00:21:19,576
this on a real computer.

397
00:21:19,576 --> 00:21:20,950
The nondeterministic
model is not

398
00:21:20,950 --> 00:21:22,116
a model of real computation.

399
00:21:22,116 --> 00:21:25,410
It is a model of theoretical
hypothetical computation.

400
00:21:25,410 --> 00:21:27,300
It gets at the
root-- at the core

401
00:21:27,300 --> 00:21:29,730
of what is possible to solve.

402
00:21:29,730 --> 00:21:32,800
You'll see why, in a little bit.

403
00:21:32,800 --> 00:21:39,800
So in this model, an algorithm--
it can compute stuff,

404
00:21:39,800 --> 00:21:43,682
but, in particular,
it makes guesses.

405
00:21:43,682 --> 00:21:46,030
So should I do this
or should I do this?

406
00:21:46,030 --> 00:21:48,790
And it just says-- It
doesn't flip a coin.

407
00:21:48,790 --> 00:21:50,010
It's not random.

408
00:21:50,010 --> 00:21:53,990
It just thinks-- it
just makes a guess.

409
00:21:53,990 --> 00:21:54,800
Well, I don't know.

410
00:21:54,800 --> 00:21:56,390
Let's go this way.

411
00:21:56,390 --> 00:21:58,182
And then it comes to
another fork in the road.

412
00:21:58,182 --> 00:21:59,431
It's like, well, I don't know.

413
00:21:59,431 --> 00:22:00,410
I'll go this way.

414
00:22:00,410 --> 00:22:01,660
That's the guessing.

415
00:22:01,660 --> 00:22:04,160
You give it a list of
choices and somehow a choice

416
00:22:04,160 --> 00:22:08,910
is determined, by magic--
nondeterministic magic.

417
00:22:08,910 --> 00:22:18,970
And then the fun part is--
I should say, at the end

418
00:22:18,970 --> 00:22:24,652
the algorithm either
says yes or no.

419
00:22:24,652 --> 00:22:25,610
It gives you an output.

420
00:22:28,420 --> 00:22:34,590
The guesses are guaranteed--
this is the magic part--

421
00:22:34,590 --> 00:22:43,780
to lead to a yes
answer, if possible.

422
00:22:47,650 --> 00:22:50,800
So if you imagine the space
of executions of this program,

423
00:22:50,800 --> 00:22:53,342
you start here, and you
make some guess and you

424
00:22:53,342 --> 00:22:55,190
don't know which way to go.

425
00:22:55,190 --> 00:22:57,119
In dynamic programming,
we try all of them.

426
00:22:57,119 --> 00:22:58,910
But this algorithm
doesn't try all of them.

427
00:22:58,910 --> 00:23:01,580
It's like a branching universe
model of the universe.

428
00:23:01,580 --> 00:23:03,930
So you make some
choice, and then you

429
00:23:03,930 --> 00:23:06,430
make some other choice, and
then you make some other choice.

430
00:23:06,430 --> 00:23:08,900
All of these are guesses.

431
00:23:08,900 --> 00:23:11,500
And some of these
things will lead to yes.

432
00:23:11,500 --> 00:23:13,120
Some of these things
will lead to no.

433
00:23:13,120 --> 00:23:17,040
And in this magical model,
if there's any yes out there,

434
00:23:17,040 --> 00:23:19,660
you will follow a path to a yes.

435
00:23:19,660 --> 00:23:21,974
If all of the answers
are no, then, of course,

436
00:23:21,974 --> 00:23:23,640
it doesn't matter
what choices you make.

437
00:23:23,640 --> 00:23:25,100
You will output no.

438
00:23:25,100 --> 00:23:27,940
But if there's ever a yes,
magically these guesses

439
00:23:27,940 --> 00:23:28,580
find it.

440
00:23:28,580 --> 00:23:30,390
This is the sense of lucky.

441
00:23:30,390 --> 00:23:33,940
If you're trying to find a
yes-- that's your goal in life--

442
00:23:33,940 --> 00:23:37,290
then this corresponds to luck.

443
00:23:37,290 --> 00:23:40,260
And NP is the class of
all problems solvable

444
00:23:40,260 --> 00:23:43,800
in polynomial time by a
really lucky algorithm.
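
A real computer has no such magic, but it can simulate the lucky model by exploring every branch of guesses and answering yes if any branch does. A minimal Python sketch of that idea follows; the names `accepts`, `choices`, `num_guesses`, and `check` are made up for illustration and are not from the lecture.

```python
from itertools import product

def accepts(choices, num_guesses, check):
    """Deterministically simulate a 'lucky' (nondeterministic) algorithm.

    choices:     the options available at each guess
    num_guesses: how many guesses the algorithm makes
    check:       function that maps one full sequence of guesses to True/False
    Returns True if ANY sequence of guesses leads to a yes answer.
    """
    # This tries len(choices) ** num_guesses branches, so it takes
    # exponential time even when check itself is fast.
    return any(check(guesses) for guesses in product(choices, repeat=num_guesses))

# Example: is there a 3-bit string with exactly two 1s?
print(accepts([0, 1], 3, lambda g: sum(g) == 2))  # True
```

The lucky model pays only for one branch; the simulation pays for all of them, which is where the exponential cost comes from.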

445
00:23:43,800 --> 00:23:44,850
Crazy.

446
00:23:44,850 --> 00:23:45,350
I know.

447
00:23:50,450 --> 00:23:53,115
Let's talk about Tetris.

448
00:23:56,320 --> 00:23:59,871
Tetris, I claim, is in NP.

449
00:23:59,871 --> 00:24:01,870
And we know how to solve
it in exponential time.

450
00:24:01,870 --> 00:24:04,280
Just try all the options.

451
00:24:04,280 --> 00:24:07,640
But, in fact, I don't need
to try all the options.

452
00:24:07,640 --> 00:24:11,150
It would be enough just to use
this nondeterministic magic.

453
00:24:11,150 --> 00:24:14,420
I could say, well, should I
drop the piece here, here, here,

454
00:24:14,420 --> 00:24:15,739
here, here, or here.

455
00:24:15,739 --> 00:24:17,780
And should it be rotated
like this, or like this,

456
00:24:17,780 --> 00:24:20,170
or like this, or like this?

457
00:24:20,170 --> 00:24:20,910
I don't know.

458
00:24:20,910 --> 00:24:22,210
So I guess.

459
00:24:22,210 --> 00:24:23,775
And I just place that piece.

460
00:24:23,775 --> 00:24:25,900
I make another guess where
to place the next piece.

461
00:24:25,900 --> 00:24:27,550
Then I make another guess
where to place the next piece.

462
00:24:27,550 --> 00:24:29,520
I implement the rules
of Tetris, which

463
00:24:29,520 --> 00:24:32,426
is if there's a
full line it clears.

464
00:24:32,426 --> 00:24:34,980
I figure out where
these things fall.

465
00:24:34,980 --> 00:24:39,080
I can even think about, should
I rotate at the last second.

466
00:24:39,080 --> 00:24:41,894
If I don't know, I'll guess.

467
00:24:41,894 --> 00:24:43,810
Any choice you have to
make in playing Tetris,

468
00:24:43,810 --> 00:24:45,280
you can just guess.

469
00:24:45,280 --> 00:24:47,660
There's only polynomially
many guesses you need to make.

470
00:24:47,660 --> 00:24:49,600
So it's still polynomial time.

471
00:24:49,600 --> 00:24:50,440
That's important.

472
00:24:50,440 --> 00:24:52,060
It's not like we
can do anything.

473
00:24:52,060 --> 00:24:54,770
But we can make a polynomial
number of these magic guesses.

474
00:24:54,770 --> 00:24:59,330
And then at the end, I
determine did I die--

475
00:24:59,330 --> 00:25:01,070
or rather, did I survive.

476
00:25:01,070 --> 00:25:02,120
It's important, actually.

477
00:25:02,120 --> 00:25:03,980
It only works one way.

478
00:25:03,980 --> 00:25:04,780
Did I survive?

479
00:25:04,780 --> 00:25:05,555
Yes or no?

480
00:25:05,555 --> 00:25:06,680
And that's easy to compute.

481
00:25:06,680 --> 00:25:11,120
I just see did I ever
go above the top row.

482
00:25:11,120 --> 00:25:13,640
So what this model says
is if there is any way

483
00:25:13,640 --> 00:25:17,100
to survive-- if there is
any way to get a yes answer,

484
00:25:17,100 --> 00:25:21,220
then, my guesses will find
it, magically, in this model.

485
00:25:21,220 --> 00:25:22,370
Therefore, Tetris is in NP.

486
00:25:24,980 --> 00:25:28,200
If I had instead
said, did I die, then,

487
00:25:28,200 --> 00:25:31,120
what this algorithm would
tell me is there any way

488
00:25:31,120 --> 00:25:33,970
to die-- which, the
answer's probably yes,

489
00:25:33,970 --> 00:25:36,360
unless you're given a
really trivial input.

490
00:25:36,360 --> 00:25:39,710
So it's important you set up
the yes versus no, correctly.

491
00:25:39,710 --> 00:25:43,980
But the Tetris decision problem
"can I survive," is in NP.

492
00:25:43,980 --> 00:25:48,670
The decision problem "can I
die," should not be in NP.

493
00:25:48,670 --> 00:25:49,494
But we don't know.

494
00:25:57,110 --> 00:25:58,430
Another way to think about NP.

495
00:26:01,382 --> 00:26:02,950
And you might find
this intuitive

496
00:26:02,950 --> 00:26:04,800
because we've been
doing lots of guessing.

497
00:26:04,800 --> 00:26:06,330
It's just a little crazy.

498
00:26:06,330 --> 00:26:11,490
There's another way that's
more intuitive to many people.

499
00:26:11,490 --> 00:26:14,520
So if this doesn't make
sense, don't worry, yet.

500
00:26:14,520 --> 00:26:16,110
This is another
way to phrase it.

501
00:26:53,152 --> 00:26:55,110
Another way to think
about NP-- which turns out

502
00:26:55,110 --> 00:27:01,450
to be equivalent-- is that don't
think so much about algorithms

503
00:27:01,450 --> 00:27:04,300
for solving a problem,
just think about algorithms

504
00:27:04,300 --> 00:27:07,437
for checking the
solution to a problem.

505
00:27:07,437 --> 00:27:09,270
It's usually a lot
easier to check your work

506
00:27:09,270 --> 00:27:11,980
than it is to solve a
problem in the first place.

507
00:27:11,980 --> 00:27:15,400
And NP is all about that issue.

508
00:27:15,400 --> 00:27:17,540
So think of decision
problems and think

509
00:27:17,540 --> 00:27:21,270
about if you have a solution--
so let's say in Tetris,

510
00:27:21,270 --> 00:27:24,790
the solution is yes.

511
00:27:24,790 --> 00:27:27,660
In fact, I need to
say this, probably.

512
00:27:27,660 --> 00:27:31,150
The more formal
version is whenever

513
00:27:31,150 --> 00:27:37,900
the answer is yes,
you can prove it.

514
00:27:41,880 --> 00:27:44,100
And you can check that
proof in polynomial time.

515
00:27:49,890 --> 00:27:53,140
This is the more formal--
this is a little bit high level.

516
00:27:53,140 --> 00:27:54,130
What does check mean?

517
00:27:54,130 --> 00:27:56,080
Here's what check means.

518
00:27:56,080 --> 00:27:59,560
Whenever an answer is "yes,"
you can write down a proof

519
00:27:59,560 --> 00:28:00,900
that the answer is yes.

520
00:28:00,900 --> 00:28:02,400
And someone can
come along and check

521
00:28:02,400 --> 00:28:04,370
that proof in polynomial
time and be convinced

522
00:28:04,370 --> 00:28:06,310
that the answer is yes.

523
00:28:06,310 --> 00:28:07,600
What does convinced mean?

524
00:28:07,600 --> 00:28:09,800
It's not that hard.

525
00:28:09,800 --> 00:28:11,660
Think of it as a
two-player game.

526
00:28:11,660 --> 00:28:13,400
There's me trying
to play Tetris,

527
00:28:13,400 --> 00:28:15,400
and there's you
trying to be convinced

528
00:28:15,400 --> 00:28:18,100
that I'm really good at Tetris.

529
00:28:18,100 --> 00:28:23,060
It seems a little one-sided,
but-- it's an asymmetric game.

530
00:28:23,060 --> 00:28:27,420
So you want to prove Tetris is--
I want to show Tetris is in NP.

531
00:28:27,420 --> 00:28:29,790
Imagine I'm this
magical creature.

532
00:28:29,790 --> 00:28:31,160
Actually, it's kind of funny.

533
00:28:31,160 --> 00:28:32,680
It reminds me of a story.

534
00:28:32,680 --> 00:28:34,620
On the front of my
office door, you

535
00:28:34,620 --> 00:28:37,540
may have seen there's
an email I received,

536
00:28:37,540 --> 00:28:39,900
maybe 15 years
ago-- oh no, I guess

537
00:28:39,900 --> 00:28:42,100
it can't be that long ago.

538
00:28:42,100 --> 00:28:43,710
Must've been about
7 years ago when

539
00:28:43,710 --> 00:28:47,040
we proved that Tetris
is NP-complete.

540
00:28:47,040 --> 00:28:51,660
And the email says, "Dear
Sir,"-- or whatever--

541
00:28:51,660 --> 00:28:54,349
"I am NP-complete."

542
00:28:54,349 --> 00:28:55,890
We don't know what
NP-complete means, yet,

543
00:28:55,890 --> 00:28:57,310
but it's a
meaningless statement.

544
00:28:57,310 --> 00:28:59,640
So it doesn't matter that
you don't know what it means.

545
00:28:59,640 --> 00:29:03,940
It might get funnier
throughout the lecture today.

546
00:29:03,940 --> 00:29:07,860
And he's like, I
can solve Tetris.

547
00:29:07,860 --> 00:29:09,534
I'm really good
at playing Tetris.

548
00:29:09,534 --> 00:29:11,200
I'm really good at
playing Minesweeper--

549
00:29:11,200 --> 00:29:14,210
all these games that are
thought to be intractable.

550
00:29:14,210 --> 00:29:15,810
He gave me his
records and so on.

551
00:29:15,810 --> 00:29:20,230
It's like how can
I apply my talent.

552
00:29:20,230 --> 00:29:26,230
So I will translate: what he
meant to say was, "I am lucky."

553
00:29:26,230 --> 00:29:29,017
And this is probably
not true, but he

554
00:29:29,017 --> 00:29:30,100
thought that he was lucky.

555
00:29:30,100 --> 00:29:31,940
He wanted to convince
me he was lucky.

556
00:29:31,940 --> 00:29:33,610
So how could we do it?

557
00:29:33,610 --> 00:29:36,540
Well, I could give him a
really hard Tetris problem.

558
00:29:36,540 --> 00:29:38,870
And say, can you
survive these pieces?

559
00:29:38,870 --> 00:29:41,450
And he says, "yes,
I can survive."

560
00:29:41,450 --> 00:29:43,450
And how does he prove to
me that he can survive?

561
00:29:43,450 --> 00:29:45,150
Well, he just plays it.

562
00:29:45,150 --> 00:29:47,420
He shows me what to do.

563
00:29:47,420 --> 00:29:53,740
So proof is sequence
of moves that you make.

564
00:29:53,740 --> 00:29:55,870
It's really easy
to convince someone

565
00:29:55,870 --> 00:30:00,290
that you can survive a
given level of Tetris.

566
00:30:00,290 --> 00:30:04,380
You just show what the
sequence of moves are.

567
00:30:04,380 --> 00:30:07,860
And then I, as a mere mortal
polynomial time algorithm

568
00:30:07,860 --> 00:30:09,780
can check that that
sequence works.

569
00:30:09,780 --> 00:30:12,064
I just have to implement
the rules of Tetris.

570
00:30:12,064 --> 00:30:13,980
So in Tetris, the rules
are easy to implement.

571
00:30:13,980 --> 00:30:18,120
It's knowing what
thing to do that is hard.

572
00:30:18,120 --> 00:30:21,840
But in NP, knowing
which way to go is easy.

573
00:30:21,840 --> 00:30:23,340
In this version,
you don't even talk

574
00:30:23,340 --> 00:30:24,820
about how to find the solution.

575
00:30:24,820 --> 00:30:26,486
It's just a matter
of can you write down

576
00:30:26,486 --> 00:30:29,280
a solution that can be checked.

577
00:30:29,280 --> 00:30:30,000
Can prove it.

578
00:30:30,000 --> 00:30:31,420
This is not in polynomial time.

579
00:30:31,420 --> 00:30:34,710
You get arbitrarily
much time to prove it.

580
00:30:34,710 --> 00:30:37,540
But then, the check has to
happen in polynomial time.
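
To make the "check" side concrete, here is a sketch of a polynomial-time verifier for a toy game that only resembles Tetris. The simplifications are assumptions for illustration, not the real rules: every piece is a horizontal bar, a move is just the bar's left column, and there is no rotation or sliding. The name `survives` and its parameters are hypothetical.

```python
def survives(width, height, pieces, moves):
    """Toy verifier: does this sequence of moves survive all the pieces?

    pieces: bar lengths, in the order the bars arrive (1 <= length <= width)
    moves:  the proof, one left-column index per piece
    """
    if len(moves) != len(pieces):
        return False
    rows = []  # rows[r][c] is True when the cell is filled; row 0 is the bottom
    for length, left in zip(pieces, moves):
        if left < 0 or left + length > width:
            return False  # bar would stick out of the board
        cols = range(left, left + length)
        # The bar lands one row above the highest filled cell beneath it.
        landing = 0
        for r, row in enumerate(rows):
            if any(row[c] for c in cols):
                landing = r + 1
        if landing == len(rows):
            rows.append([False] * width)
        for c in cols:
            rows[landing][c] = True
        # Rule of the game: full lines clear.
        rows = [row for row in rows if not all(row)]
        if len(rows) > height:
            return False  # went above the top row
    return True

print(survives(4, 2, [2, 2, 2], [0, 2, 0]))  # True: the first two bars clear a line
print(survives(4, 2, [2, 2, 2], [0, 0, 0]))  # False: the bars stack too high
```

The verifier never searches for good moves. It only replays the moves it is handed, which takes time polynomial in the board size and the number of pieces.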

581
00:30:41,047 --> 00:30:41,630
Kind of clear?

582
00:30:44,290 --> 00:30:46,450
That's Tetris.

583
00:30:46,450 --> 00:30:49,220
And every problem that you
can solve in polynomial

584
00:30:49,220 --> 00:30:51,049
time you can also,
of course, check it.

585
00:30:51,049 --> 00:30:53,090
Because if you could solve
it in polynomial time,

586
00:30:53,090 --> 00:30:54,590
you could just solve
it and then see

587
00:30:54,590 --> 00:30:56,320
did you get the same
answer that I did.

588
00:30:56,320 --> 00:30:59,790
So P is inside NP.
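
As a sketch of why P is inside NP: a checker for a polynomial-time problem can ignore the proof it is handed and just solve the problem itself. The example problem here, whether t is reachable from s in a graph, and the names `reachable` and `verify` are chosen for illustration only.

```python
from collections import deque

def reachable(graph, s, t):
    """Polynomial-time solver (BFS): is there a path from s to t?"""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in graph.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def verify(graph, s, t, proof):
    """Checker for the same problem: the proof is ignored entirely."""
    return reachable(graph, s, t)

graph = {"s": ["a"], "a": ["t"], "t": [], "x": []}
print(verify(graph, "s", "t", proof=None))  # True
```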

589
00:30:59,790 --> 00:31:04,910
But the big question
is does P equal NP.

590
00:31:04,910 --> 00:31:08,600
And most people think no.

591
00:31:08,600 --> 00:31:12,060
P does not equal NP--
most sane people.

592
00:31:16,690 --> 00:31:18,910
So this is a big problem.

593
00:31:18,910 --> 00:31:21,900
It's one of the famous
Millennium Prize problems.

594
00:31:21,900 --> 00:31:27,030
So in particular, if you solved
it, you would get $1 million,

595
00:31:27,030 --> 00:31:29,080
and fame, and probably
other fortune.

596
00:31:29,080 --> 00:31:31,060
You could do TV spots.

597
00:31:31,060 --> 00:31:34,160
I think that's how people
mostly make their money.

598
00:31:34,160 --> 00:31:35,280
You could do a lot.

599
00:31:35,280 --> 00:31:38,020
You would become the most famous
computer scientist in the world

600
00:31:38,020 --> 00:31:40,020
if you prove this.

601
00:31:40,020 --> 00:31:41,270
So a lot of people have tried.

602
00:31:41,270 --> 00:31:44,070
Every year, there's an
attempt to prove either

603
00:31:44,070 --> 00:31:46,500
what everyone believes
or, most often,

604
00:31:46,500 --> 00:31:49,742
people try to prove the
reverse-- that they are equal.

605
00:31:49,742 --> 00:31:50,450
I don't know why.

606
00:31:50,450 --> 00:31:53,250
They should bet the other way.

607
00:31:53,250 --> 00:31:55,360
So what does P does
not equal NP mean?

608
00:31:55,360 --> 00:32:00,040
It means that there are
problems, here, that are in NP

609
00:32:00,040 --> 00:32:03,240
but not in P. Think
about what this means.

610
00:32:03,240 --> 00:32:05,810
This is saying P are the
problems that we can actually

611
00:32:05,810 --> 00:32:07,680
solve on a legitimate computer.

612
00:32:07,680 --> 00:32:10,950
NP are problems that we can
solve in this magical fairy

613
00:32:10,950 --> 00:32:14,370
computer where all of
our dreams are granted.

614
00:32:14,370 --> 00:32:16,120
You say, oh, I don't
know which way to go.

615
00:32:16,120 --> 00:32:19,510
It doesn't matter because
the machine magically

616
00:32:19,510 --> 00:32:21,400
tells you which way to go.

617
00:32:21,400 --> 00:32:24,210
If your goal is
to get to a yes.

618
00:32:24,210 --> 00:32:27,930
So NP is a really powerful
model of computation.

619
00:32:27,930 --> 00:32:29,690
It's an insane model
of computation.

620
00:32:29,690 --> 00:32:32,100
No one in their right mind
would consider it legitimate.

621
00:32:32,100 --> 00:32:35,250
So obviously, it's
more powerful than P,

622
00:32:35,250 --> 00:32:37,727
except we don't know
how to prove it.

623
00:32:37,727 --> 00:32:38,310
Very annoying.

624
00:32:45,480 --> 00:32:47,450
Other phrasings of
P does not equal

625
00:32:47,450 --> 00:32:50,870
NP is-- these are my
phrasings, I made them up-- you

626
00:32:50,870 --> 00:32:53,090
can't engineer luck.

627
00:32:57,520 --> 00:32:59,160
You can believe in
luck, if you want.

628
00:32:59,160 --> 00:33:01,410
But it's not something
that we can build out

629
00:33:01,410 --> 00:33:03,960
of a regular computer.

630
00:33:03,960 --> 00:33:07,545
That's the meaning
of this statement.

631
00:33:07,545 --> 00:33:09,485
And so I think most
people believe that.

632
00:33:13,530 --> 00:33:19,520
Another phrasing would
be that solving problems

633
00:33:19,520 --> 00:33:22,460
is harder than
checking solutions.

634
00:33:27,300 --> 00:33:30,645
A more formal version is that
generating solutions or proofs

635
00:33:30,645 --> 00:33:37,510
of solutions can be
harder than checking them.

636
00:33:44,850 --> 00:33:47,860
Another phrasing is
it's harder to generate

637
00:33:47,860 --> 00:33:49,550
a proof of a theorem
than it is to check

638
00:33:49,550 --> 00:33:50,780
the proof of a theorem.

639
00:33:50,780 --> 00:33:53,400
We all know checking
the proof of a theorem

640
00:33:53,400 --> 00:33:56,000
should be easy if you
write it precisely.

641
00:33:56,000 --> 00:33:58,420
Just make sure each step
follows from the previous ones.

642
00:33:58,420 --> 00:34:00,152
Done.

643
00:34:00,152 --> 00:34:01,610
But proving a
theorem, that's hard.

644
00:34:01,610 --> 00:34:02,550
You need inspiration.

645
00:34:02,550 --> 00:34:03,740
You need some clever idea.

646
00:34:03,740 --> 00:34:04,920
That's guessing.

647
00:34:04,920 --> 00:34:09,020
Inspiration equals luck equals
guessing, in this model.

648
00:34:09,020 --> 00:34:10,370
And that's hard.

649
00:34:13,380 --> 00:34:15,880
The only way we know is
to try all the proofs.

650
00:34:15,880 --> 00:34:17,270
See which of them work.

651
00:34:24,020 --> 00:34:26,350
So what the heck?

652
00:34:26,350 --> 00:34:27,510
What could we possibly say?

653
00:34:27,510 --> 00:34:30,020
This is all kind of weird.

654
00:34:30,020 --> 00:34:31,520
This would be the
end of the lecture

655
00:34:31,520 --> 00:34:34,770
if you say, OK,
well we don't know.

656
00:34:34,770 --> 00:34:37,350
That's it.

657
00:34:37,350 --> 00:34:41,524
But thankfully-- I kind
of need this board.

658
00:34:41,524 --> 00:34:43,690
I also want this one, but
I guess I'll go over here.

659
00:34:48,364 --> 00:34:50,280
Fortunately, this is not
the end of the story.

660
00:34:50,280 --> 00:34:55,340
And we can say a lot
about things like Tetris.

661
00:34:55,340 --> 00:34:57,930
See I drew Tetris not
just in this regime.

662
00:34:57,930 --> 00:35:01,330
We're pretty sure Tetris
is between NP and P.

663
00:35:01,330 --> 00:35:06,480
That it's in NP minus P.

664
00:35:06,480 --> 00:35:08,830
So let me write that down.

665
00:35:08,830 --> 00:35:16,640
Tetris is in NP minus P. We
don't know that because we

666
00:35:16,640 --> 00:35:20,070
don't know-- this
could be the empty set.

667
00:35:20,070 --> 00:35:26,040
What we do know
is that if there's

668
00:35:26,040 --> 00:35:32,040
anything in NP minus P--
if they are different,

669
00:35:32,040 --> 00:35:35,900
then-- if there's
anything in NP minus P,

670
00:35:35,900 --> 00:35:39,060
then Tetris is one
of those things.

671
00:35:39,060 --> 00:35:40,760
That's why I drew
Tetris out there.

672
00:35:40,760 --> 00:35:45,800
It is, in a certain sense,
the hardest problem in NP.

673
00:35:45,800 --> 00:35:47,690
Tetris.

674
00:35:47,690 --> 00:35:49,550
Why Tetris?

675
00:35:49,550 --> 00:35:50,809
Well, it's not just Tetris.

676
00:35:50,809 --> 00:35:53,100
There are a lot of problems
right at that little notch.

677
00:35:53,100 --> 00:35:57,220
But this is pretty interesting
because, while we can't figure

678
00:35:57,220 --> 00:35:59,920
this out, most people
believe this is true.

679
00:35:59,920 --> 00:36:01,982
And so as long as you
believe in that-- as long

680
00:36:01,982 --> 00:36:05,920
as you have faith--
then you can prove

681
00:36:05,920 --> 00:36:08,160
that Tetris is in NP minus P.

682
00:36:08,160 --> 00:36:09,650
And so it's hard.

683
00:36:09,650 --> 00:36:11,880
It's not in P, in this case.

684
00:36:11,880 --> 00:36:19,739
In particular, not in
P. That's kind of cool.

685
00:36:19,739 --> 00:36:21,780
How in the world do we
prove something like this?

686
00:36:21,780 --> 00:36:23,910
It's actually not that hard.

687
00:36:23,910 --> 00:36:25,830
I mean it took us
several months,

688
00:36:25,830 --> 00:36:29,640
but that's just months, whereas
this thing has been around

689
00:36:29,640 --> 00:36:33,170
since, I guess, the '70s.

690
00:36:33,170 --> 00:36:36,030
P versus NP.

691
00:36:36,030 --> 00:36:38,760
Why is this true?

692
00:36:38,760 --> 00:36:42,960
Because Tetris is NP-hard.

693
00:36:46,210 --> 00:36:48,420
What does NP-hard mean?

694
00:36:48,420 --> 00:36:54,640
This means as hard as
every problem in NP.

695
00:36:59,340 --> 00:37:02,010
I can't say harder than
because it's non-strict.

696
00:37:02,010 --> 00:37:04,910
So it's at least as hard
as every problem in NP.

697
00:37:04,910 --> 00:37:07,580
And that's why I drew
it at the far right.

698
00:37:07,580 --> 00:37:10,340
It's sort of the
hardest extreme of NP.

699
00:37:10,340 --> 00:37:13,030
Among everything in NP
you can possibly imagine,

700
00:37:13,030 --> 00:37:16,000
Tetris is as hard
as all of them.

701
00:37:16,000 --> 00:37:19,430
And therefore, if there's
anything that's harder than P,

702
00:37:19,430 --> 00:37:22,350
then Tetris is going to be
harder than P because it's

703
00:37:22,350 --> 00:37:23,700
as far to the right as possible.

704
00:37:23,700 --> 00:37:27,490
Either P equals NP, in which
case the picture is like this.

705
00:37:27,490 --> 00:37:29,920
Here's P. Here's NP.

706
00:37:29,920 --> 00:37:32,300
Tetris is still at the
right extreme, here.

707
00:37:32,300 --> 00:37:35,430
But it's less interesting
because it's still in P.

708
00:37:35,430 --> 00:37:37,590
Or the picture looks like
this, and NP is strictly

709
00:37:37,590 --> 00:37:41,020
bigger than P. And then, because
Tetris is at the right extreme,

710
00:37:41,020 --> 00:37:45,290
it's outside of P. So
we prove this in order

711
00:37:45,290 --> 00:37:47,110
to establish this claim.

712
00:37:51,010 --> 00:37:52,630
Just to get some
terminology, what

713
00:37:52,630 --> 00:37:53,940
is this NP-complete business?

714
00:37:58,810 --> 00:38:09,550
Tetris is NP-complete,
which means two things.

715
00:38:09,550 --> 00:38:11,470
One is that it's NP-hard.

716
00:38:11,470 --> 00:38:13,960
And the other is
that it's in NP.

717
00:38:13,960 --> 00:38:16,340
So if you think of the
intersection, NP intersect

718
00:38:16,340 --> 00:38:18,210
NP-hard, that's NP-complete.

719
00:38:18,210 --> 00:38:26,490
Let me draw on the picture
here what this means.

720
00:38:26,490 --> 00:38:28,140
So I'm going to
draw it on the top.

721
00:38:38,590 --> 00:38:39,720
This is NP-hard.

722
00:38:42,390 --> 00:38:46,040
Everything from here to
the right is NP-hard.

723
00:38:46,040 --> 00:38:48,922
NP-hard means it's at least
as hard as everything in NP.

724
00:38:48,922 --> 00:38:50,380
That means it might
be at this line

725
00:38:50,380 --> 00:38:52,390
or it might be to the right.

726
00:38:52,390 --> 00:38:55,130
But in the case of Tetris,
we know that it's in NP.

727
00:38:55,130 --> 00:38:57,494
We proved that a
couple of times.

728
00:38:57,494 --> 00:38:59,535
And so we know that Tetris
is also in this range.

729
00:38:59,535 --> 00:39:01,850
And so if it's in this
range and in this range,

730
00:39:01,850 --> 00:39:03,690
it's got to be right here.

731
00:39:03,690 --> 00:39:04,940
Completeness is nice.

732
00:39:04,940 --> 00:39:07,370
If you prove something
is something complete--

733
00:39:07,370 --> 00:39:09,920
prove a problem is some
complexity class complete--

734
00:39:09,920 --> 00:39:13,550
then you know sort of exactly
where it falls on this line.

735
00:39:13,550 --> 00:39:15,750
NP-complete means right here.

736
00:39:15,750 --> 00:39:18,520
EXP-complete means right here.

737
00:39:18,520 --> 00:39:22,880
Turns out Chess is EXP-complete.

738
00:39:22,880 --> 00:39:27,710
EXP-hard is anything
from here over.

739
00:39:27,710 --> 00:39:30,670
EXP is anything from
here, over this way.

740
00:39:30,670 --> 00:39:32,335
Chess is right at
that borderline.

741
00:39:32,335 --> 00:39:34,512
It is the hardest
problem in EXP.

742
00:39:34,512 --> 00:39:35,970
And that's actually
the only way we

743
00:39:35,970 --> 00:39:37,970
know to prove that it's not in P.

744
00:39:37,970 --> 00:39:39,970
It's pretty easy to
show that EXP is bigger

745
00:39:39,970 --> 00:39:43,770
than P. And Chess is the
farthest to the right in EXP--

746
00:39:43,770 --> 00:39:47,800
of any problem in EXP-- and
so, therefore, it's not in P.

747
00:39:47,800 --> 00:39:51,350
So whereas this one-- these two,
we're not sure are they equal.

748
00:39:51,350 --> 00:39:55,190
This line we know is
different from this one.

749
00:39:55,190 --> 00:39:58,720
We don't know about
these two, though.

750
00:39:58,720 --> 00:40:01,550
Does NP equal EXP?

751
00:40:01,550 --> 00:40:02,240
Not as famous.

752
00:40:02,240 --> 00:40:04,850
You won't get a million
dollars, but still a very big,

753
00:40:04,850 --> 00:40:07,550
open question.

754
00:40:07,550 --> 00:40:09,590
What else do I wanna say?

755
00:40:09,590 --> 00:40:11,020
Tetris, Chess, EXP-hard.

756
00:40:11,020 --> 00:40:16,369
So these lines, here--
this is NP-complete.

757
00:40:16,369 --> 00:40:17,410
And this is EXP-complete.

758
00:40:35,980 --> 00:40:39,015
So the last thing I want to
talk about is reductions.

759
00:40:43,770 --> 00:40:45,980
Reductions-- so how do you
prove something like this?

760
00:40:45,980 --> 00:40:47,710
What does "as hard as" even mean?

761
00:40:47,710 --> 00:40:49,230
I haven't defined that.

762
00:40:49,230 --> 00:40:51,270
But it's not hard to define.

763
00:40:51,270 --> 00:40:53,130
In fact, it's a concept
we've seen already.

764
00:41:18,610 --> 00:41:21,380
Reductions are actually a
way to design algorithms

765
00:41:21,380 --> 00:41:24,354
that we've been using
implicitly a lot.

766
00:41:24,354 --> 00:41:25,770
You may have even
heard this term.

767
00:41:25,770 --> 00:41:28,010
A bunch of recitations have
used the word reduction

768
00:41:28,010 --> 00:41:29,970
for graph reduction.

769
00:41:29,970 --> 00:41:31,770
You have some problem,
you convert it

770
00:41:31,770 --> 00:41:34,590
into a graph problem, then you
just call the graph algorithm.

771
00:41:34,590 --> 00:41:35,830
You're done.

772
00:41:35,830 --> 00:41:36,760
That's reduction.

773
00:41:36,760 --> 00:41:38,820
In general, you have
some problem, A,

774
00:41:38,820 --> 00:41:40,830
that you want to solve.

775
00:41:40,830 --> 00:41:44,030
And you convert it into
some other problem, B,

776
00:41:44,030 --> 00:41:46,182
that you already
know how to solve.

777
00:41:46,182 --> 00:41:47,890
It's a great tool
because, in this class,

778
00:41:47,890 --> 00:41:50,630
you learn tons of algorithms
for solving tons of problems.

779
00:41:50,630 --> 00:41:55,070
Now, someone gives you,
in your job or whatever,

780
00:41:55,070 --> 00:41:56,950
or you think about
some problem that you

781
00:41:56,950 --> 00:41:59,180
don't know how to solve,
the first thing you should

782
00:41:59,180 --> 00:42:01,000
do is-- can I convert
it into something

783
00:42:01,000 --> 00:42:02,930
I know how to solve
because then you're done.

784
00:42:02,930 --> 00:42:04,721
Now it may not be the
best way to solve it,

785
00:42:04,721 --> 00:42:06,410
but at least it's
a way to solve it.

786
00:42:06,410 --> 00:42:09,015
Probably in polynomial time
because we think of B as things

787
00:42:09,015 --> 00:42:10,390
you can solve in
polynomial time.

788
00:42:10,390 --> 00:42:13,160
Great.

789
00:42:13,160 --> 00:42:20,730
So just convert
problem A, which you

790
00:42:20,730 --> 00:42:27,615
want to solve, into some problem
B that you know how to solve.

791
00:42:30,690 --> 00:42:32,370
That's reduction.

792
00:42:32,370 --> 00:42:35,460
Let me give you some examples
that we've already seen,

793
00:42:35,460 --> 00:42:38,065
just to fit this into your
mental map of the class.

794
00:42:42,640 --> 00:42:45,060
It's kind of a funny one
but it's a very simple one.

795
00:42:52,470 --> 00:42:54,460
So how do you solve
unweighted shortest paths?

796
00:42:58,300 --> 00:42:59,810
In general?

797
00:42:59,810 --> 00:43:00,670
Easy one.

798
00:43:00,670 --> 00:43:02,794
Give you a graph with no
weights on the edges and I

799
00:43:02,794 --> 00:43:04,466
want the shortest
path from s to t.

800
00:43:04,466 --> 00:43:05,390
AUDIENCE: BFS

801
00:43:05,390 --> 00:43:06,180
PROFESSOR: BFS.

802
00:43:06,180 --> 00:43:07,600
Linear time, right?

803
00:43:07,600 --> 00:43:10,050
Well, that's if
you're smart or if you

804
00:43:10,050 --> 00:43:11,250
feel like implementing BFS.

805
00:43:11,250 --> 00:43:14,380
Suppose someone
gave you Dijkstra.

806
00:43:14,380 --> 00:43:16,125
Said, here, look, I've
got Dijkstra code.
got Djikstra code.

807
00:43:16,125 --> 00:43:17,375
You don't have to do anything.

808
00:43:17,375 --> 00:43:18,940
There's Dijkstra
code right there.

809
00:43:18,940 --> 00:43:21,100
But Dijkstra solves
weighted shortest path.

810
00:43:21,100 --> 00:43:22,160
I don't have any weights.

811
00:43:22,160 --> 00:43:24,960
What do I do?

812
00:43:24,960 --> 00:43:28,140
Set the weights to 1.

813
00:43:28,140 --> 00:43:30,630
It's very easy, but
this is a reduction--

814
00:43:30,630 --> 00:43:32,460
a simple example of reduction.

815
00:43:32,460 --> 00:43:35,330
Not the smartest of reductions,
but it's a reduction.

816
00:43:38,840 --> 00:43:40,780
So I can convert
unweighted shortest paths

817
00:43:40,780 --> 00:43:43,750
into weighted shortest paths
by adding weights of 1.

818
00:43:43,750 --> 00:43:44,320
Done.

819
00:43:44,320 --> 00:43:46,070
Adding weights of
0 would not work.

820
00:43:46,070 --> 00:43:47,170
But weights of 1.

821
00:43:47,170 --> 00:43:47,900
OK.

822
00:43:47,900 --> 00:43:49,492
Weights of 2 also works.

823
00:43:49,492 --> 00:43:51,950
Pick your favorite number, but
as long as you're consistent

824
00:43:51,950 --> 00:43:52,780
about it.

825
00:43:52,780 --> 00:43:54,520
That's a reduction.
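
The reduction just described can be sketched in Python as follows. The Dijkstra code stands in for the code "someone gave you"; the function names and the adjacency-list representation are assumptions for illustration.

```python
import heapq

def dijkstra(graph, s):
    """Standard Dijkstra. graph: {u: [(v, weight), ...]}, weights non-negative."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def unweighted_shortest_paths(adj, s):
    """Reduction: give every edge weight 1, then call the weighted solver."""
    weighted = {u: [(v, 1) for v in nbrs] for u, nbrs in adj.items()}
    return dijkstra(weighted, s)

adj = {"s": ["a", "b"], "a": ["t"], "b": ["a"], "t": []}
print(unweighted_shortest_paths(adj, "s")["t"])  # 2
```

All of the work specific to the unweighted problem is the one line that builds `weighted`; the solver itself is untouched.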

826
00:43:54,520 --> 00:43:56,570
Here's some more
interesting ones.

827
00:43:56,570 --> 00:44:03,920
On the problem set--
problem set six--

828
00:44:03,920 --> 00:44:08,205
there was this RenBook problem,
"I Can Haz Moar Frendz?"

829
00:44:08,205 --> 00:44:09,580
That was the name
of the problem.

830
00:44:09,580 --> 00:44:14,640
And the goal was
to solve-- to find

831
00:44:14,640 --> 00:44:17,884
paths that minimize
the product of weights.

832
00:44:17,884 --> 00:44:19,300
But what we've
covered in class is

833
00:44:19,300 --> 00:44:21,910
how to solve a problem when
it's the sum of weights.

834
00:44:21,910 --> 00:44:23,890
How do you do it?

835
00:44:23,890 --> 00:44:26,070
In one word, or less?

836
00:44:26,070 --> 00:44:26,990
Logs.

837
00:44:26,990 --> 00:44:28,920
Just take logs.

838
00:44:28,920 --> 00:44:31,597
That converts
products into sums.

839
00:44:31,597 --> 00:44:32,930
Now you start to get the flavor.

840
00:44:32,930 --> 00:44:37,150
This is a problem that you could
take Dijkstra or Bellman-Ford,

841
00:44:37,150 --> 00:44:39,390
and change all the
relaxation steps

842
00:44:39,390 --> 00:44:42,470
and change it to work
directly with products.

843
00:44:42,470 --> 00:44:46,570
That would work,
but it's more work.

844
00:44:46,570 --> 00:44:49,200
You have to prove that
that's still correct.

845
00:44:49,200 --> 00:44:50,500
It's annoying to think about.

846
00:44:50,500 --> 00:44:52,660
And it's annoying to program.

847
00:44:52,660 --> 00:44:54,590
It's not modular,
blah, blah, blah.

848
00:44:54,590 --> 00:44:56,720
Whereas if you just
do this reduction,

849
00:44:56,720 --> 00:44:59,990
you can use exactly the
code that you had before,

850
00:44:59,990 --> 00:45:01,960
at the end.

851
00:45:01,960 --> 00:45:03,220
So that's nice.
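
A sketch of the log reduction in Python. It assumes every weight is at least 1, so that each log is non-negative and Dijkstra applies; with weights below 1 the logs go negative and Bellman-Ford would be the solver to call instead. The function names are made up for illustration, and this is not the problem set's reference solution.

```python
import heapq
import math

def shortest_path(graph, s, t):
    """Ordinary sum-of-weights Dijkstra; returns the path from s to t, or None."""
    dist, parent = {s: 0.0}, {s: None}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None
    path, node = [], t
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def min_product_path(graph, s, t):
    """Reduction: log turns products into sums, so reuse the sum solver as is."""
    logged = {u: [(v, math.log(w)) for v, w in nbrs] for u, nbrs in graph.items()}
    return shortest_path(logged, s, t)

graph = {"s": [("a", 4), ("b", 1)], "a": [("t", 4)], "b": [("t", 9)], "t": []}
print(min_product_path(graph, "s", "t"))  # ['s', 'b', 't'], product 9 rather than 16
```

In this example the minimum-sum path (s, a, t with sum 8) and the minimum-product path (s, b, t with product 9) differ, which is why the weights have to be transformed before the sum solver is called.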

852
00:45:03,220 --> 00:45:04,670
This is why
reductions are really

853
00:45:04,670 --> 00:45:07,562
the most common algorithm design
technique because you don't

854
00:45:07,562 --> 00:45:10,020
want to implement an algorithm
for every single problem you

855
00:45:10,020 --> 00:45:10,700
have.

856
00:45:10,700 --> 00:45:13,200
It would be nice if you could
reuse some of those algorithms

857
00:45:13,200 --> 00:45:14,630
that you had before.

858
00:45:14,630 --> 00:45:17,100
Reductions let you do that.

859
00:45:17,100 --> 00:45:21,680
Another one, which was on the
quiz in the true-false-- quiz

860
00:45:21,680 --> 00:45:25,532
two-- was converting longest
path into shortest path.

861
00:45:25,532 --> 00:45:26,990
We didn't phrase
it as a reduction.

862
00:45:26,990 --> 00:45:29,730
It was just can you
solve longest path using

863
00:45:29,730 --> 00:45:30,910
Bellman-Ford.

864
00:45:30,910 --> 00:45:31,832
And the answer is yes.

865
00:45:31,832 --> 00:45:33,165
You just negate all the weights.

866
00:45:33,165 --> 00:45:34,900
And that converts a
longest path problem

867
00:45:34,900 --> 00:45:37,660
into a shortest path problem.

868
00:45:37,660 --> 00:45:40,310
Easy.
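
A sketch of the negation trick in Python. One caveat is made explicit here as an assumption: after negation the graph must have no negative cycles reachable from the source, which holds, for example, when the original graph is acyclic. The function names and the edge-list representation are illustrative.

```python
def bellman_ford(vertices, edges, s):
    """Standard Bellman-Ford. edges: list of (u, v, weight).
    Assumes no negative-weight cycle is reachable from s."""
    dist = {v: float("inf") for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # relax edge (u, v)
    return dist

def longest_paths(vertices, edges, s):
    """Reduction: negate every weight, solve shortest paths, negate the answers."""
    negated = [(u, v, -w) for u, v, w in edges]
    dist = bellman_ford(vertices, negated, s)
    return {v: -d for v, d in dist.items()}

vertices = ["s", "a", "b", "t"]
edges = [("s", "a", 1), ("s", "b", 5), ("a", "b", 10), ("a", "t", 1), ("b", "t", 1)]
print(longest_paths(vertices, edges, "s")["t"])  # 12, via s -> a -> b -> t
```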

869
00:45:40,310 --> 00:45:43,030
Also on the quiz-- maybe I don't
need to write all of these down

870
00:45:43,030 --> 00:45:45,200
because they're slightly
weird problems.

871
00:45:45,200 --> 00:45:46,370
We made them up.

872
00:45:46,370 --> 00:45:50,220
There was the-- what was
the duck tour called?

873
00:45:50,220 --> 00:45:50,990
Bird tours?

874
00:45:50,990 --> 00:45:51,950
Bird tours?

875
00:45:51,950 --> 00:45:52,700
Aviation tours?

876
00:45:52,700 --> 00:45:53,610
Whatever.

877
00:45:53,610 --> 00:45:56,990
You want to visit a bunch of
sites in some specified order.

878
00:45:56,990 --> 00:45:58,990
The point in that problem
is you could reduce it

879
00:45:58,990 --> 00:46:02,900
to a single shortest
paths query.

880
00:46:02,900 --> 00:46:05,682
And so if you already
have shortest path code,

881
00:46:05,682 --> 00:46:06,890
you don't have to think much.

882
00:46:06,890 --> 00:46:08,400
You just do the
graph application.

883
00:46:08,400 --> 00:46:09,970
Done.

884
00:46:09,970 --> 00:46:11,600
Then there's the
leaky tank problem,

885
00:46:11,600 --> 00:46:14,570
which is also a graph
reduction problem.

886
00:46:14,570 --> 00:46:16,570
You could represent all
these extra weird things

887
00:46:16,570 --> 00:46:18,640
that were happening
in your car by just

888
00:46:18,640 --> 00:46:20,202
changing the graph a little bit.

889
00:46:20,202 --> 00:46:21,660
And it's a very
powerful technique.

890
00:46:21,660 --> 00:46:24,860
In this class, we see it
mostly in graph reductions.

891
00:46:24,860 --> 00:46:28,120
But it could apply
all over the place.

892
00:46:28,120 --> 00:46:30,810
And while this is a powerful
technique for coming up

893
00:46:30,810 --> 00:46:34,310
with new algorithms, it's
also a powerful technique

894
00:46:34,310 --> 00:46:41,380
for proving things
like Tetris is NP-hard.

895
00:46:41,380 --> 00:46:43,830
So what we proved
is that a problem

896
00:46:43,830 --> 00:46:49,600
called 3-Partition can
be reduced to Tetris.

897
00:46:57,810 --> 00:46:58,610
What's 3-Partition?

898
00:46:58,610 --> 00:47:01,000
3-Partition is I
give you n numbers.

899
00:47:01,000 --> 00:47:03,930
I want to know can I
divide them into triples,

900
00:47:03,930 --> 00:47:06,450
each of the same sum.

901
00:47:06,450 --> 00:47:07,780
So I have n numbers.

902
00:47:07,780 --> 00:47:10,170
Divide them into n
over 3 groups of 3,

903
00:47:10,170 --> 00:47:14,030
such that the sum of
each of the 3s is equal.
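
[Editor's sketch, not from the lecture: a brute-force check for the 3-Partition problem as just defined, assuming the inputs are integers. The function name is made up for illustration, and the search takes exponential time in the worst case, consistent with the problem being NP-complete.]

```python
from itertools import combinations


def three_partition(numbers):
    """Return a list of equal-sum triples covering numbers, or None."""
    n = len(numbers)
    if n % 3 != 0:
        return None
    if n == 0:
        return []
    groups = n // 3
    total = sum(numbers)
    if total % groups != 0:
        return None
    target = total // groups  # every triple must sum to this

    def solve(remaining):
        if not remaining:
            return []
        # The first remaining number must go in some triple; try all partners.
        first, rest = remaining[0], remaining[1:]
        for i, j in combinations(range(len(rest)), 2):
            if first + rest[i] + rest[j] == target:
                leftover = [x for k, x in enumerate(rest) if k not in (i, j)]
                result = solve(leftover)
                if result is not None:
                    return [(first, rest[i], rest[j])] + result
        return None

    return solve(list(numbers))


# Hypothetical inputs: one that can be split, one that cannot.
example_yes = three_partition([1, 2, 3, 4, 5, 6, 7, 8, 9])
example_no = three_partition([1, 1, 1, 1, 1, 4])
```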

904
00:47:14,030 --> 00:47:15,780
Sounds like an easy
enough problem.

905
00:47:15,780 --> 00:47:18,230
But it's an NP-complete problem.

906
00:47:18,230 --> 00:47:22,950
And people knew that since
one of the first papers.

907
00:47:22,950 --> 00:47:26,790
I guess that was late
'70s, early '80s, by Karp.

908
00:47:26,790 --> 00:47:28,800
So Karp already proved
this is standing

909
00:47:28,800 --> 00:47:32,410
on the shoulders of giants.

910
00:47:32,410 --> 00:47:34,360
Karp proved 3-Partition
is NP-complete,

911
00:47:34,360 --> 00:47:37,060
so I don't need to
think about that.

912
00:47:37,060 --> 00:47:39,210
All I need to
focus on is showing

913
00:47:39,210 --> 00:47:43,470
that Tetris is harder
than 3-Partition.

914
00:47:43,470 --> 00:47:45,270
This is what I mean by harder.

915
00:47:45,270 --> 00:47:48,990
Harder means-- so when
I can reduce A to B,

916
00:47:48,990 --> 00:48:02,090
we say that B is at least
as hard as A. Why's that?

917
00:48:02,090 --> 00:48:05,820
Because I can solve A by solving
B. I just apply this reduction

918
00:48:05,820 --> 00:48:08,570
and then solve B. So if I
had some good way to solve B,

919
00:48:08,570 --> 00:48:11,110
it would turn into a
good way to solve A.

920
00:48:11,110 --> 00:48:14,940
Now 3-Partition-- which
is A, here-- we're

921
00:48:14,940 --> 00:48:17,440
pretty sure there's no good
algorithm for solving this.

922
00:48:17,440 --> 00:48:22,900
Pretty sure it's not in P.
And so Tetris better not be in P

923
00:48:22,900 --> 00:48:25,430
either because if
Tetris were in P, then

924
00:48:25,430 --> 00:48:27,140
we could just take
our 3-Partition,

925
00:48:27,140 --> 00:48:30,990
reduce it to Tetris, and then
3-Partition would be in P.

926
00:48:30,990 --> 00:48:33,210
In fact, all of the
NP-complete problems,

927
00:48:33,210 --> 00:48:36,470
you can reduce to each other.

928
00:48:36,470 --> 00:48:39,820
And so to show that something
is at that little position,

929
00:48:39,820 --> 00:48:41,900
NP-complete, all
you need to do is

930
00:48:41,900 --> 00:48:44,120
find some known
NP-complete problem

931
00:48:44,120 --> 00:48:47,520
and reduce it to your problem.

932
00:48:47,520 --> 00:48:51,400
So reductions are super useful
for getting positive results

933
00:48:51,400 --> 00:48:53,580
for making new
algorithms, but also

934
00:48:53,580 --> 00:48:56,110
for proving negative results--
showing that one problem is

935
00:48:56,110 --> 00:48:57,310
harder than another.

936
00:48:57,310 --> 00:48:59,080
And if you already
believe A is hard,

937
00:48:59,080 --> 00:49:00,621
then you should
believe B is hard.

938
00:49:08,570 --> 00:49:12,060
I think that's all I
really have time for.

939
00:49:12,060 --> 00:49:14,480
I'll give you a couple
more NP-complete problems.

940
00:49:14,480 --> 00:49:15,930
Kind of fun.

941
00:49:15,930 --> 00:49:18,896
Traveling salesman problem,
you may have heard of.

942
00:49:18,896 --> 00:49:20,020
Let's say you have a graph.

943
00:49:20,020 --> 00:49:22,040
And you want to find out
the shortest path that

944
00:49:22,040 --> 00:49:25,770
visits all the vertices,
not just one vertex.

945
00:49:25,770 --> 00:49:28,680
That's NP-complete.

946
00:49:28,680 --> 00:49:31,680
We solved longest common
subsequence for two strings,

947
00:49:31,680 --> 00:49:33,280
but if I give you
n strings that you

948
00:49:33,280 --> 00:49:35,238
need to find the longest
common subsequence of,

949
00:49:35,238 --> 00:49:37,730
that's NP-complete.

950
00:49:37,730 --> 00:49:41,560
Minesweeper, Sudoku, most
puzzles that are interesting

951
00:49:41,560 --> 00:49:43,990
are NP-complete.

952
00:49:43,990 --> 00:49:45,360
SAT.

953
00:49:45,360 --> 00:49:53,120
SAT is a-- I give you a Boolean
formula like x OR y AND NOT

954
00:49:53,120 --> 00:49:55,050
x-- something like that.

955
00:49:55,050 --> 00:49:57,499
I want to know is there some
setting of the variables that

956
00:49:57,499 --> 00:49:58,790
makes this thing come out true?

957
00:49:58,790 --> 00:50:01,634
Is it possible to
make this true?

958
00:50:01,634 --> 00:50:02,800
That's NP-complete.
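
[Editor's sketch, not from the lecture: a brute-force satisfiability check that tries every setting of the variables. The spoken formula is read here as (x OR y) AND NOT x, which is an assumption about the parenthesization; the function names are made up for illustration. Trying all 2^n assignments is exponential in the number of variables.]

```python
from itertools import product


def satisfying_assignment(formula, variables):
    """Return a dict of variable settings making formula true, or None."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(**assignment):
            return assignment
    return None


def example_formula(x, y):
    # Assumed reading of the formula from the lecture.
    return (x or y) and not x


example_result = satisfying_assignment(example_formula, ["x", "y"])
```

[Checking one proposed assignment is fast; it is finding one that seems to require search.]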

959
00:50:02,800 --> 00:50:04,310
This was actually
the first problem

960
00:50:04,310 --> 00:50:05,610
that was shown NP-complete.

961
00:50:05,610 --> 00:50:06,880
There's this issue, right?

962
00:50:06,880 --> 00:50:08,754
If I'm going to show
everything's NP-complete

963
00:50:08,754 --> 00:50:10,910
by reduction, how the
heck do I get started?

964
00:50:10,910 --> 00:50:12,360
What's the first problem?

965
00:50:12,360 --> 00:50:15,620
And this is the first problem.

966
00:50:15,620 --> 00:50:18,580
You could sort of prove it
by definition, almost, of NP,

967
00:50:18,580 --> 00:50:19,480
here.

968
00:50:19,480 --> 00:50:22,760
But I won't do that.

969
00:50:22,760 --> 00:50:24,610
Three coloring a graph.

970
00:50:24,610 --> 00:50:25,280
Shortest paths.

971
00:50:25,280 --> 00:50:26,010
This is fun.

972
00:50:26,010 --> 00:50:27,840
Shortest paths in
a graph is easy.

973
00:50:27,840 --> 00:50:30,620
But in the real world, we
live in a three dimensional,

974
00:50:30,620 --> 00:50:31,880
geometric environment.

975
00:50:31,880 --> 00:50:33,338
What if I want to
find the shortest

976
00:50:33,338 --> 00:50:35,620
path from this point,
where I am, to that point,

977
00:50:35,620 --> 00:50:37,500
over on the ceiling
or something.

978
00:50:37,500 --> 00:50:40,020
And I can fly.

979
00:50:40,020 --> 00:50:41,669
That's NP-complete.

980
00:50:41,669 --> 00:50:42,460
It's kind of weird.

981
00:50:42,460 --> 00:50:44,160
Shortest paths in a two
dimensional environment

982
00:50:44,160 --> 00:50:44,743
is polynomial.

983
00:50:44,743 --> 00:50:47,532
It's a good thing that we are
on ground because, then, we

984
00:50:47,532 --> 00:50:48,990
can model things
by two dimensions.

985
00:50:48,990 --> 00:50:50,470
We can model things by graphs.

986
00:50:50,470 --> 00:50:53,500
But in 3D, shortest
paths is NP-complete.

987
00:50:53,500 --> 00:50:56,139
So all these things where
a problem-- knapsack,

988
00:50:56,139 --> 00:50:56,930
that's another one.

989
00:50:56,930 --> 00:50:58,221
We've already covered knapsack.

990
00:50:58,221 --> 00:50:59,990
We saw a pseudo-polynomial
algorithm.

991
00:50:59,990 --> 00:51:02,390
Turns out, you can't do
better than pseudo-polynomial

992
00:51:02,390 --> 00:51:07,030
unless P equals NP because
knapsack is NP-complete.

993
00:51:07,030 --> 00:51:08,160
So there you go.

994
00:51:08,160 --> 00:51:11,313
Computational complexity
in 50 minutes.