The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Good morning, everybody. All right, I ended the last lecture talking about how to calculate the absolute goodness of fit using something called the coefficient of determination. It's usually written as R-squared, or R². And the formula was quite simple. We measure the goodness of the fit as R-squared equals 1 minus the estimated error divided by the measured variance.

As I observed, R-squared always lies between 0 and 1. If R-squared equals 1, that means that the model we constructed, the predicted values if you will, explains all of the variability in the data, so that any change in the data is explained perfectly by the model. We don't usually get 1. In fact, if I ever got 1, I would think somebody cheated me.

Conversely, if R-squared equals 0, it means there's no linear relationship at all between the values predicted by the model and the actual data. That is to say, the model is totally worthless.

So we have code here, at the top of the screen, showing how easy it is to compute R-squared. And for those of you who have a little trouble interpreting the formula, because maybe you're not quite sure what EE and MV mean, this will give you a very straightforward way to understand it.

So now we can run it. We can get some answers. So if we look at it, you'll remember last time we looked at two different fits. We looked at a quadratic fit and a linear fit for the trajectory of an arrow fired from my bow. And we can now compare the two.

And not surprisingly, given what we know about the physics of projectiles, we see it is exactly what we'd expect: the linear fit has an R-squared of 0.0177, showing that, in fact, it explains almost none of the data.
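A minimal sketch of what that on-screen R-squared computation might look like, assuming the measured and predicted values are pylab (numpy) arrays; the function name and the commented usage below are assumptions for illustration, not the exact code shown in the lecture:

```python
import pylab

def rSquared(measured, predicted):
    """Goodness of fit: 1 minus (estimated error / variability of the measured data).
       Assumes measured and predicted are pylab/numpy arrays of equal length."""
    estimatedError = ((predicted - measured)**2).sum()
    meanOfMeasured = measured.sum() / float(len(measured))
    variability = ((measured - meanOfMeasured)**2).sum()
    return 1 - estimatedError / variability

# Hypothetical usage with trajectory data (xVals, yVals not shown here):
#   a, b, c = pylab.polyfit(xVals, yVals, 2)          # quadratic fit y = a*x**2 + b*x + c
#   print(rSquared(yVals, pylab.polyval((a, b, c), xVals)))
```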
Whereas the quadratic fit has a really astonishingly good R-squared of 0.98, saying that almost all of the changes in the values of the variables, that is to say the way the y value changes with respect to the x value, is explained by the model. I.e., we have a really good model of the physical situation. Very comforting.

Essentially, it's telling us that less than 2% of the variation is explained by the linear model, and 98% by the quadratic model. Presumably the other 2% is experimental error.

Well, now that we know that we have a really good model of the data, we can ask the question, why do we care? We have the data itself. What's the point of building a model of the data? And that, of course, is what we're getting when we run polyfit to get this curve. The whole purpose of creating a model, or an important purpose of creating a model, is to be able to answer questions about the actual physical situation.

So one of the questions one might ask, for example, about firing an arrow is, how fast is it going? That's kind of a useful thing to know if you're worried about whether it will penetrate a target and kill somebody on the other side, for example. We can't answer that question directly by looking at the data points. You look at the data; well, I don't know. But we can use the model to answer the question.

And that's an exercise I want to go through now to show you the interplay between models, and theory, and computation, and how we can use the three to answer relevant questions about data.

No, I do not want to check for new software, thank you. In fact, let's make sure it won't do that anymore.

So let's look at the PowerPoint. Here we'll see how I'm using a little bit of theory, not very much, to understand how to use the model to compute the speed of the arrow. So what we see is that we know, by our model and by the good fit, that the trajectory is given by y equals ax-squared plus bx plus c. We know that.
We also know from looking at this equation that the highest point of the arrow, which I'll call yPeak, must occur at xMid, the middle of the x-axis. So if we look at a parabola, and it doesn't matter what the parabola is, we always know that the vertical peak is halfway along the x-axis. The math tells us that from the equation.

So we can say yPeak is a times xMid squared plus b times xMid plus c. So now we have a model that can tell us how high the arrow gets.

The next question I'll ask is: if I fired the arrow from here, and it hits the target here -- I've exaggerated it by drawing it this way; it's nowhere near this steep -- how long does it take to get from here to here? We don't have anything about time in our data. Yet I claim we have enough information to go from the distance here and the distance here to how long it's going to take the arrow to get from here to the target.

Why do I know that? What determines how long it's going to take to get from here to here? It's going to be how long it takes it to fall that far. It's going to be gravity. Because we know that gravity, at least on this planet, is a constant, or close enough to it. Unless maybe the arrow were going a million miles. And it's going to be gravity that tells me how long it takes to get from here to here.

And when it gets to the bottom, it's going to be here. So again, I can use some very simple math and say that the time will be the square root of 2 times yPeak divided by the gravitational constant. Because I know that however long it takes to get from this height to this height is going to be the same time it takes to get from this point to this point.

And that will therefore let me compute the average speed from here to here. And once I know that, I'm done.

Now again, this is assuming no drag and things like that. The thing that we always have to understand about a model is that no model is actually ever correct. On the other hand, many models are very useful, and they're close enough to correct.
So I left out things like drag, wind shear, and stuff like that. But in fact, the answer we get here will turn out to be very close to correct.

We can now go back and look at some code. Get rid of this. And so now, you'll see this on the handout. I'm going to write a little bit of code that just goes through the math I just showed you to compute the average x velocity. I've got a print statement here that I used to debug it. And I'm going to return it. And then, we'll just be able to run it and see what we get.

Well, all right, that we looked at before. I forgot to close the previous figure. So now, I'm sure this is a problem you've all seen. And we'll fix it the way we always fix things, just start over.

I'll bet you guys have also seen this happen. What this is suggesting, as we've seen before, is that the process, the old process, still exists. Not a good thing. Again, I'm sure you've all seen these. Let's make sure we don't have anything running here that looks like IDLE. We don't. Just takes it a little time. There it is, all right. All right, now we'll go back.

All of this happened because I forgot to close the figure and executed pyLab.show twice, which we know can lead to bad things. So let's get rid of this. Now, we'll run it.

And now, we have our figure using just the quadratic fit. And we see that the speed is 136.25 feet per second. Do I believe 136.25? Not really. I know it's in the ballpark. I confused precision with accuracy here by giving it to you to two decimal places. I can compute it as precisely as I want. But that doesn't mean it's actually accurate. Probably, I should have just said it's about 135 or something like that. But it's pretty good.

And for those of you who don't know how to do this arithmetic in your head, like me, this is about 93 miles per hour.
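A minimal sketch of what that velocity computation might look like, given the quadratic coefficients a, b, c from the fit. The function name, the assumption that the x and y measurements are in feet, and the landing height of zero are all illustrative assumptions, not the lecture's exact code:

```python
def getHorizontalSpeed(a, b, c, minX, maxX):
    """Estimate the arrow's average horizontal speed from the fit
       y = a*x**2 + b*x + c, assuming x and y are measured in feet."""
    g = 32.16                          # gravitational acceleration, feet/sec**2
    xMid = (maxX - minX) / 2.0         # the peak of the parabola is halfway along the x-axis
    yPeak = a * xMid**2 + b * xMid + c # height at the peak
    t = (2.0 * yPeak / g) ** 0.5       # time to fall from yPeak to the target height
    return xMid / t                    # half the range over the fall time = average horizontal speed
```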
And for comparison, the speed of sound, as opposed to our 136 feet per second, is about 1,100 feet per second. So it's traveling pretty fast.

Well, what's the point of this? I don't really care if you know how fast an arrow travels. I don't expect you'll ever need to compute that. But I wanted to show you this as an example of a pattern that we use a lot.

So what we did is we started with an experiment. You didn't see this, but I actually stood in my backyard and shot a bunch of arrows and measured them, got real data out of that. And this gave me some data about the behavior of a physical system.

That's what I get for wearing a tie. Maybe it'll be quieter if I put it in my shirt. Actually, it looks silly. Excuse me. I hope none of you will mind if I take my tie off? It seems to be making noises in the microphone. Maybe we should write a computation. All right, so ends my experiment with trying to look dignified. Not something I'm good at.

OK, we had an experiment. That gave us some data. We then used computation both to find and, very importantly, to evaluate a model. It's no good just to find the model. You need to do some evaluation to convince yourself that it's a good model of the actual physical system.

And then, finally, we used some theory and analysis and computation to derive a consequence of the model. And then, since we believe the accuracy of the model, we assume this consequence is also a true fact about the physical system we started with.

This is a pattern that we see over and over again these days in all branches of science and engineering. And it's just the kind of thing that you should get used to doing. It is what you will do if you go on to a career in science or engineering.

OK, that's all I want to say now about the topic of data and experiments and analysis.
We will return to this topic of interpretation of data later in the semester, near the end, when we start talking about machine learning and clustering.

But for now, I want to pull back and start down a new track that will, I'm sure you'll be pleased to hear, dovetail nicely with the next few problem sets that you're going to have to work on. What I want to talk about is the topic of optimization. Not so much optimization in the sense of how do you make a program fast, though we will talk a little about that, but what people refer to as optimization problems. How do we write programs to find optimal solutions to problems that occur in real life?

Every optimization problem we'll look at is going to have two parts. There's going to be (1) an objective function that will either be maximized or minimized. So for example, I might want to find the minimal airfare between Boston and Istanbul. Or, more likely, the minimum bus fare between Boston and New York. So there's an objective function. Sometimes you find the least. Sometimes you find the most. Maybe I want to maximize my income. And (2) a set of constraints that have to be satisfied. So maybe I want to find the minimum-cost transportation between Boston and New York, subject to the constraint that it not take more than eight hours, or some such thing.

So there's the objective function that you're minimizing or maximizing, and some set of constraints that must be obeyed. A vast number of problems of practical importance can be formulated this way. Once we've formulated a problem in this systematic way, we can then think about how to attack it with a computation that will help us solve the problem.

You guys do this all the time. I heard a talk yesterday by Jeremy Wertheimer, an MIT graduate, who founded a company called ITA. If you ever use Kayak, for example, or many of these systems to find an airline fare, they use some of Jeremy's code and algorithms to solve various optimization problems like this.
If you've ever used Google or Bing, they solve optimization problems to decide what pages to show you. They're all over the place.

There are a lot of classic optimization problems that people have worked on for decades. What we often do when confronted with a new problem, and it's something you'll get some experience with on the problem sets, is take a seemingly new problem and map it onto a classic problem, and then use one of the classic solutions. So as we go through this section of the course, we'll look at a number of classic optimization problems. And then you can think about how you would map other problems onto those.

This is the process known as problem reduction, where we take a problem and map it onto an existing problem that we already know how to solve. I'm not going to go through a list of classic optimization problems right now. But we'll see a bunch of them as we go forward.

Now, an important thing to think about when we think about optimization problems is how long, how hard, they are to solve. So far, we have looked at problems that, for the most part, have pretty fast solutions: often sub-linear, as with binary search, sometimes linear, and in the worst case, low-order polynomial.

Optimization problems, as we'll see, are typically much worse than that. In fact, what we'll see is that there is often no computationally efficient way to solve them. And so we end up dealing with approximate solutions to them, or what people might call best-effort solutions. And we see that as an increasing trend in tackling problems.

All right, enough of this abstract stuff. Let's look at an example. One of the classic optimization problems is called the knapsack problem. People know what a knapsack is? Sort of an archaic term. Today, people would use the word backpack. But in the old days, they called them knapsacks when they started looking at these things. And the problem is also discussed in the context of a burglar or various kinds of thieves.
So it's not easy being a burglar, by the way. I don't know if any of you ever tried it. You've got some of the obvious problems, like making sure the house is empty and picking locks, circumventing alarms, et cetera. But one of the really hard problems a burglar has to deal with is deciding what to steal. Because you break into the typical luxury home -- and why would you break into a poor person's house if you were a burglar -- and there's usually far more to steal than you can carry away.

And so the problem is formulated in terms of the burglar having a backpack. They can put a certain amount of stuff in it. And they have to maximize the value of what they steal, subject to the constraint of how much weight they can actually carry. So it's a classic optimization problem. And people have worked for years at how to solve it, not so much because they want to be burglars. But as you'll see, these kinds of optimization problems are actually quite common.

So let's look at an example. You break into the house. And among other things, you have a choice of what to steal. You have a rather strange looking clock, some artwork, a book, a Velvet Elvis in case you lean in that direction, all sorts of things. And for some reason, the owner was nice enough to leave you information about how much everything cost and how much it weighed. So you find this piece of paper. And now, you're trying to decide what to steal based upon this, in a way to maximize your value.

How do we go about doing it? Oh, I should show you, by the way, there's a picture of a typical knapsack. All right, it's almost Easter, after all.

Well, the simplest solution is probably a greedy algorithm. And we'll talk a lot about greedy algorithms because they are very popular and often the right way to tackle a hard problem. So the notion of a greedy algorithm is that it's iterative. And at each step, you pick the locally optimal solution. So you make the best choice, put that item in the knapsack.
Then ask if you have room, or if you're out of weight. If not, you make the best choice of the remaining ones. Ask the same question. You do that until you can't fit anything else in.

Now of course, to do that assumes that we know at each stage what we mean by locally optimal. And of course, we have choices here. We're trying to figure out, in some sense, what greedy algorithm, what approach to being greedy, will give us the best result.

So one could, for example, say: all right, at each step, I'll choose the most valuable item and put that in my knapsack. And I'll do that till I run out of valuable items. Or you could, at each step, say: well, what I'm really going to choose is the one that weighs the least. That will give me the most items. And maybe that will give me the most total value when I'm done. Or maybe, at each step, you could say: well, let me choose the one that has the best value-to-weight ratio and put that in. And maybe that will give me the best solution.

As we will see, in this case, none of those is guaranteed to give you the best solution all the time. In fact, as we'll see, none of them is guaranteed to be better than any of the others all the time. And that's one of the issues with greedy algorithms.

I should point out, by the way, that this version of the knapsack problem that we're talking about is typically called the 0/1 knapsack problem. And that's because we either have to take the entire item or none of the item. We're not allowed to cut the Velvet Elvis in half and take half of it.

This is in contrast to the continuous knapsack problem. If you imagine you break into the house and you see a barrel of gold dust, and a barrel of silver dust, and a barrel of raisins, what you would do is fill your knapsack with as much gold as you could carry, or until you ran out of gold. And then, you would fill it with as much silver as you could carry. And then, if there's any room left, you'd put in the raisins.
For the continuous knapsack problem, a greedy algorithm provides an optimal solution. Unfortunately, most of the problems we actually encounter in life, as we'll see, are 0/1 knapsack problems. You either take something or you don't. And that's more complicated.

All right, let's look at some code. So I'm going to formulate it. I'm first going to start by putting in a class, just so the rest of my code is simpler. This is something we've been talking about, that increasingly people want to start by putting in some useful data abstractions. So I've got a class Item where I can construct an item, get its name, get its value, get its weight, and print it. Kind of a boring class, but useful to have.

Then, I'm going to use this class to build items. And in this case, I'm going to build the items based upon what we just looked at, the table that -- I think it's in your handout, and it's also on this slide. Later, if we want, we can have a randomized program to build up a much bigger choice of items. But here, we'll just try the clock, the painting, the radio, the vase, the book, and the computer.

Now comes the interesting part. I've written a function, greedy, that takes three arguments: the set of items that I have to choose from, which makes sense; the maximum weight the burglar can carry; and something called a key function, which essentially defines what I mean by locally optimal.

Then, it's quite simple. I'm going to sort the items using the key function. Remember, sort has this optional argument that says, what's the ordering? So maybe I'll order it by value. Maybe I'll order it by density. Maybe I'll order it by weight. I'm going to reverse it, because I want the most valuable first, not the least valuable, for example. And then, I'm going to just take the first things on my list until I run out of weight, and then I'm done. And I'll return the result and the total value.
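A minimal sketch of what that Item class and the greedy function might look like. The exact names, the string formatting, and the loop structure are assumptions for illustration, not the code on the lecturer's screen:

```python
class Item(object):
    """A candidate thing to steal, with a name, a value, and a weight."""
    def __init__(self, name, value, weight):
        self.name = name
        self.value = float(value)
        self.weight = float(weight)
    def getName(self):   return self.name
    def getValue(self):  return self.value
    def getWeight(self): return self.weight
    def __str__(self):
        return '<' + self.name + ', ' + str(self.value) + ', ' + str(self.weight) + '>'

def greedy(items, maxWeight, keyFunction):
    """keyFunction defines what 'locally optimal' means.
       Returns (list of items taken, total value of those items)."""
    # Sort so the "best" item under keyFunction comes first
    itemsCopy = sorted(items, key=keyFunction, reverse=True)
    result, totalValue, totalWeight = [], 0.0, 0.0
    for item in itemsCopy:
        # Take each item in order, as long as it still fits under the weight limit
        if totalWeight + item.getWeight() <= maxWeight:
            result.append(item)
            totalWeight += item.getWeight()
            totalValue += item.getValue()
    return result, totalValue
```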
To make life simple, I'm going to define some functions. These are the functions that I can use for the ordering: value, which just returns the value of the item; the inverse of the weight, because I'm thinking, as a greedy algorithm, I'll take the lightest, not the heaviest, and since I'm reversing the sort, I want the inverse; and the density, which is just the value divided by the weight.

OK, make sense to everybody? You with me? Speak now, or not.

And then, we'll test it. So again, kind of a theme of this part of the course: as we write these more complex programs, we tend to have to worry about our test harnesses. So I've got a function that tests the greedy algorithm, and then another function that tests all three greedy approaches -- the one algorithm with different key functions -- and looks at what our results are.

So let's run it. See what we get. Oh, you know what I did? Just the same thing I did last time. But this time, I'm going to be smarter. We're going to get rid of this figure and comment out the code that generated it. And now, we'll test the greedy algorithms.

So we see the items we had to choose from, which I printed using the string function on the items. And if I use greedy by value to fill a knapsack of size 20, we see that I end up getting just the computer. This is for the nerd burglar. If I use weight, I get a different result -- more things, not surprisingly, but a lower total value. And if I use density, I also get four things, but four different things, and I get a higher value.

So I see that I can run these greedy algorithms. I can get an answer. But it's not always the same answer. As I said earlier, greedy by density happens to work best here. But you shouldn't assume that will always be the case.
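A minimal sketch of what those key functions and the test harness might look like, reusing the greedy function sketched above; the function names and the printed text are assumptions, and the items themselves would come from the table on the slide (not reproduced here):

```python
def value(item):
    return item.getValue()

def weightInverse(item):
    # Lightest first: a smaller weight gives a bigger key
    return 1.0 / item.getWeight()

def density(item):
    return item.getValue() / item.getWeight()

def testGreedy(items, maxWeight, keyFunction):
    taken, val = greedy(items, maxWeight, keyFunction)
    print('Total value of items taken =', val)
    for item in taken:
        print('  ', item)

def testGreedys(items, maxWeight):
    print('Greedy by value:')
    testGreedy(items, maxWeight, value)
    print('Greedy by weight (lightest first):')
    testGreedy(items, maxWeight, weightInverse)
    print('Greedy by density (value/weight):')
    testGreedy(items, maxWeight, density)
```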
I'm sure you can all imagine a different assignment of weights and values that would make greedy by density give you a bad answer.

All right, before we talk about how good these answers are -- and we will come back to that, in particular to the case where I want the best answer -- I want to stop for a minute and talk about the algorithmic efficiency of the greedy algorithm.

So let's go back and look at the code. And this is why people use greedy algorithms. Actually, there are two reasons. One reason is that they're easy to program, and that's always a good thing. And the other is that they are typically highly efficient.

So what's the efficiency of this? How would we think about the efficiency of this greedy algorithm? What are we looking at here? Well, the first thing we have to ask is, what's the first thing it does? It sorts the list, right? So one thing that governs the efficiency might be the amount of time it takes to sort the list of items. Well, we know how long that takes. Or we can speculate, at least. Let's assume it does something like merge sort. So what's that term going to be? Order of what? Len of items times what? Times log of the len of items, right?

So maybe that's going to tell us the complexity, but maybe not. The next thing we have to do is look at the while loop and see how many times we go through the while loop. What's the worst case? Somebody? I know I didn't bring any candy today, but you could answer the question anyway. Be a sport. Do it for free. Yeah?

AUDIENCE: The length of the items.

PROFESSOR: The length of the items, right. Well, we know the sorting term, n log n, is bigger than that. So it looks like that's the complexity, right? So we can say, all right, pretty good. Slightly worse than linear in the length of the items, but not bad at all. And that's a big attraction of greedy algorithms.
They are typically order of the length of the items, or order of the length of the items times the log of that length. So greedy algorithms are usually very close to linear. And that's why we really like them. Why we don't like them is that it may be that the accumulation of a sequence of locally optimal solutions does not yield a globally optimal solution.

So now, let's ask the question: suppose that's not good enough. I have a very demanding thief. Or maybe the thief works for a very demanding person and needs to choose the absolute optimal set. Let's think first about how we formulate that carefully. And then, what the complexity of solving it would be. And then, algorithms that might be useful.

Again, the important step here, I think, is not the solution to the problem, but the process used to formulate the problem. Often, it is the case that once one has done a careful formulation of a problem, it becomes obvious how to solve it, at least in a brute force way.

So now, let's look at a formalization of the 0/1 knapsack problem. And it's the kind of formalization we'll use for a lot of problems.

So step one, we'll represent each item by a pair. Because in fact, in deciding whether or not to take an item, we don't care what its name is. We don't care if it's a clock, or a radio, or whatever. What matters is what its value is and what its weight is.

We'll write W for the maximum weight that the thief can carry, or that can fit in the knapsack. So far, so good. Nothing complicated there.

Now comes the interesting step. We're going to represent the set of available items as a vector. We'll call it I. And then we'll have another vector, V, which indicates whether or not each item in I has been taken. So V is a vector of 0's and 1's. If V_i is equal to 1, that implies I_i -- big I sub little i -- has been taken, is in the knapsack. Conversely, if V_i is 0, it means I_i is not in the knapsack.
So, having formulated the situation thusly, we can now go back to our notion of an optimization problem as an objective function and a set of constraints, to carefully state the problem.

For the objective function, we want to maximize the sum of V_i times I_i.value, where i ranges over the length of the vectors. That's the trick of the 0/1: if I don't take an item, it's 0 times the value; if I take it, it's 1 times the value. So this is going to give me the sum of the values of the items I've taken.

And then, this is subject to the constraint. Again, we'll do a summation, and it will look very similar: the sum of V_i times I_i.weight, which has to be less than or equal to W.

Straightforward, but a useful kind of skill. And people do spend a lot of time doing that. If you've ever used MATLAB, you know it wants everything to be a vector. And that's often because a lot of these problems can be nicely formulated in this kind of way.

All right, now let's return to the question of complexity. What happens if we implement this in the most straightforward way? What would the most straightforward implementation look like? Well, we could enumerate all possibilities and then choose the best one that meets the constraint. So this would be the obvious brute force solution to the optimization problem: look at all possible solutions, choose the best one.

I think you can see immediately that this is guaranteed to give you the optimal solution. Actually, an optimal solution; maybe there's more than one best. But in that case, you can just choose whichever one you like, or whichever comes first, for example.

The question is, how long will this take to run? Well, we can think about that by asking the question, how big will this set be? How many possibilities are there?
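Before counting the possibilities, here is a minimal sketch of what that brute-force enumeration might look like, using the Item class sketched earlier and a tuple of 0's and 1's for the vector V, exactly as in the formulation above. The function name and structure are assumptions, not the code used later in the course:

```python
from itertools import product

def chooseBest(items, maxWeight):
    """Enumerate every 0/1 vector V over the items; keep the legal combination
       (total weight <= maxWeight) with the largest total value."""
    bestValue, bestTaken = 0.0, []
    # Each V is a tuple of 0's and 1's of length len(items): 2**n possibilities
    for V in product((0, 1), repeat=len(items)):
        totalValue = sum(v * item.getValue() for v, item in zip(V, items))
        totalWeight = sum(v * item.getWeight() for v, item in zip(V, items))
        if totalWeight <= maxWeight and totalValue > bestValue:
            bestValue = totalValue
            bestTaken = [item for v, item in zip(V, items) if v == 1]
    return bestTaken, bestValue
```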
Well, we can think about that in a pretty straightforward way. Because if we look at our formulation, we can ask ourselves, how many possible vectors are there? How many vectors V could there be, showing which items were taken and which weren't? And what's the answer to that?

Well, if we have n items, how long will V be? Length n, right? A 0 or 1 for each item. If we have a vector of 0's and 1's of length n, how many different values can that vector take on? We asked this question before. What's the answer? Somebody shout it out. I've got a vector of length n. Every value in the vector is either a 0 or a 1. So maybe it looks something like this. How many possible combinations of 0's and 1's are there?

AUDIENCE: 2 to the n?

PROFESSOR: 2 to the n. Because essentially, this is a binary number, exactly. And so if I have an n-bit binary number, I can represent 2 to the n different values. And so we see that we have 2 to the n possible combinations to look at if we use a brute force solution.

How bad is this? Well, if the number of items is small, it's not so bad. And you'll see that, in fact, I can run this on the example we've looked at. 2 to the 5 is not a huge number.

Suppose I have a different number. Suppose I have 50 items to choose from. Not a big problem. I heard yesterday that the number of different airfares between two cities in the US is on the order of 500 -- 500 different airfares between, say, Boston and Chicago. So looking at the best there might mean 2 to the 500, kind of a bigger number. But let's look at 2 to the 50.

Let's say there were 50 items to choose from in this question. And let's say, for the sake of argument, it takes a microsecond, one millionth of a second, to generate a solution. How long will it take to solve this problem in a brute force way for 50 items? Who thinks you can do it in under four seconds? How about under four minutes?
Wow, skeptics. Four hours? That's a lot of computation. Four hours, you're starting to get some people. Four days? All right. Well, how about four years? Still longer, just under four decades?

Looking at one choice every microsecond, it takes you roughly 36 years to evaluate all these possibilities. Certainly for people of my age, that's not a practical solution, to have to wait 36 years for an answer. So we have to find something better. And we'll be talking about that later.
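As a quick check on that figure, here is the arithmetic: 2 to the 50 possibilities at one microsecond each.

```python
# 2**50 possibilities, one microsecond (1e-6 seconds) per possibility
seconds = 2**50 / 1_000_000              # about 1.1 billion seconds
years = seconds / (60 * 60 * 24 * 365)
print(round(years, 1))                   # about 35.7, i.e. roughly 36 years
```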