1
00:00:00,080 --> 00:00:01,800
The following
content is provided

2
00:00:01,800 --> 00:00:04,030
under a Creative
Commons license.

3
00:00:04,030 --> 00:00:06,880
Your support will help MIT
OpenCourseWare continue

4
00:00:06,880 --> 00:00:10,740
to offer high quality
educational resources for free.

5
00:00:10,740 --> 00:00:13,360
To make a donation, or
view additional materials

6
00:00:13,360 --> 00:00:17,256
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,256 --> 00:00:17,881
at ocw.mit.edu.

8
00:00:21,184 --> 00:00:23,600
PROFESSOR: So you guys know
the quiz is cumulative, right?

9
00:00:23,600 --> 00:00:25,590
Everything all the way
back from lecture one,

10
00:00:25,590 --> 00:00:28,620
so I would look at all the
lectures and all the P sets,

11
00:00:28,620 --> 00:00:31,390
and look at all the
stuff that we taught you,

12
00:00:31,390 --> 00:00:33,590
so data structures,
algorithms, everything.

13
00:00:33,590 --> 00:00:37,420
And at least be able to
know, for every one of them,

14
00:00:37,420 --> 00:00:40,910
what's the name, what it does,
and what's the running time.

15
00:00:40,910 --> 00:00:43,360
Proofs and how it does
it might be harder,

16
00:00:43,360 --> 00:00:46,410
but at least be able to
call it as a black box

17
00:00:46,410 --> 00:00:49,800
and argue about
the running times.

18
00:00:49,800 --> 00:00:54,680
So I have a dp problem, and
I have a non-dp problem.

19
00:00:54,680 --> 00:00:58,320
Which problem would you
like me to start with?

20
00:00:58,320 --> 00:00:58,820
OK.

21
00:01:03,910 --> 00:01:06,630
Do you guys know the saying, if
a woodchucker would chuck wood,

22
00:01:06,630 --> 00:01:09,620
how much wood would
a woodchucker chuck?

23
00:01:09,620 --> 00:01:12,640
Today we're going to chuck wood.

24
00:01:12,640 --> 00:01:19,200
So you have a piece of
wood that is l meters long,

25
00:01:19,200 --> 00:01:20,390
and it has n markings.

26
00:01:27,360 --> 00:01:32,560
So say the first
mark is at 3 meters,

27
00:01:32,560 --> 00:01:37,620
the second mark is at 5
meters, so on, so forth.

28
00:01:37,620 --> 00:01:44,400
And m3, and m4, all
the way up to mn.

29
00:01:44,400 --> 00:01:50,870
So we want to cut this piece
of wood at all the markings.

30
00:01:50,870 --> 00:01:55,150
The thing is the woodchucker
doesn't work for free.

31
00:01:55,150 --> 00:01:59,140
If you give it a piece
of wood of length l,

32
00:01:59,140 --> 00:02:01,710
and you ask it to cut
it at some marking,

33
00:02:01,710 --> 00:02:07,930
you're going to get two pieces
of wood, length l1 and l2.

34
00:02:07,930 --> 00:02:15,730
The price for this
is l1 times l2.

35
00:02:15,730 --> 00:02:18,200
So we like woodchuckers,
but we would also

36
00:02:18,200 --> 00:02:19,020
like our wallets.

37
00:02:19,020 --> 00:02:22,590
So we want to cut
this up by paying

38
00:02:22,590 --> 00:02:23,820
the minimum amount of money.

39
00:02:27,820 --> 00:02:30,270
Rings a bell?

40
00:02:30,270 --> 00:02:32,540
So I'll let you guys
think for a minute,

41
00:02:32,540 --> 00:02:35,150
then I'll give you the running
time, then we'll start talking.

42
00:02:38,351 --> 00:02:40,350
So we usually give you
running times on quizzes.

43
00:02:40,350 --> 00:02:42,935
The running time is why you
should know all the problems

44
00:02:42,935 --> 00:02:45,560
and their matching running times,
because the moment we give you

45
00:02:45,560 --> 00:02:48,450
a running time you can
automatically eliminate all

46
00:02:48,450 --> 00:02:51,450
the things that don't match,
and just focus on a few things.

47
00:02:57,870 --> 00:03:00,930
So you're going to have to
cut it at all the markings,

48
00:03:00,930 --> 00:03:05,120
eventually, but the order in
which you cut is important.

49
00:03:05,120 --> 00:03:08,550
So if I cut here first, then
I'm going to pay three times l

50
00:03:08,550 --> 00:03:11,310
minus 3, whereas if I
cut in the middle first,

51
00:03:11,310 --> 00:03:19,120
I'm going to pay whatever this
is, m3 times l minus m3.

52
00:03:23,780 --> 00:03:27,400
So we're trying to
decide the order.

53
00:03:27,400 --> 00:03:29,220
Does this look like
any familiar problem?

54
00:03:33,300 --> 00:03:37,260
AUDIENCE: [INAUDIBLE]
using dp, right?

55
00:03:37,260 --> 00:03:38,960
PROFESSOR: dp, that is good.

56
00:03:38,960 --> 00:03:41,490
I did say that we're going
to start with a dp problem,

57
00:03:41,490 --> 00:03:44,371
so this is dp.

58
00:03:44,371 --> 00:03:45,120
It's a good start.

59
00:03:54,468 --> 00:03:55,452
AUDIENCE: [INAUDIBLE]

60
00:03:55,452 --> 00:03:58,404
PROFESSOR: What?

61
00:03:58,404 --> 00:03:59,880
Not exactly.

62
00:03:59,880 --> 00:04:01,860
AUDIENCE: Yeah.

63
00:04:01,860 --> 00:04:05,880
PROFESSOR: So, it is
not like any problems

64
00:04:05,880 --> 00:04:07,275
on the recitations.

65
00:04:11,420 --> 00:04:13,900
So far recitations did
prefixes and suffixes.

66
00:04:13,900 --> 00:04:17,209
We're going to solve this
using a running time of n

67
00:04:17,209 --> 00:04:20,700
cubed, which is like
the parentheses problem.

68
00:04:23,740 --> 00:04:26,430
It should be what you said, but
I don't know how to spell that,

69
00:04:26,430 --> 00:04:28,013
so we're going to
go for this instead.

70
00:04:31,720 --> 00:04:33,604
So running time n
cubed-- the moment I

71
00:04:33,604 --> 00:04:35,270
said this you guys
should know that this

72
00:04:35,270 --> 00:04:37,940
is the n cubed problem that
we have in lecture notes.

73
00:04:40,712 --> 00:04:42,670
So make sure to have
those on the cheat sheets,

74
00:04:42,670 --> 00:04:46,010
and try to understand
them, right?

75
00:04:46,010 --> 00:04:50,040
OK, so given that
I've said this,

76
00:04:50,040 --> 00:04:54,020
you should know
the solution now.

77
00:04:54,020 --> 00:04:56,330
To make sure
everyone is with me,

78
00:04:56,330 --> 00:04:58,280
we're going to go through
the solution, whole.

79
00:04:58,280 --> 00:04:59,280
So what is a subproblem?

80
00:05:03,903 --> 00:05:06,300
AUDIENCE: Smaller piece of wood.

81
00:05:06,300 --> 00:05:07,738
PROFESSOR: OK.

82
00:05:07,738 --> 00:05:09,960
AUDIENCE: Like how to cut it up.

83
00:05:09,960 --> 00:05:10,680
PROFESSOR: OK.

84
00:05:10,680 --> 00:05:12,700
So this is how you
think of it informally.

85
00:05:12,700 --> 00:05:14,930
When you write it up,
I want to see this.

86
00:05:14,930 --> 00:05:21,650
I want to see dp of
something means something.

87
00:05:21,650 --> 00:05:23,240
So how you fill
out your dp table.

88
00:05:25,920 --> 00:05:30,040
It's really useful to
write this up on your exam

89
00:05:30,040 --> 00:05:33,140
before, because one, this will
help you write the recursion

90
00:05:33,140 --> 00:05:36,240
correctly, and two, if
the grader sees this

91
00:05:36,240 --> 00:05:38,410
they might skim over the
recursion completely.

92
00:05:38,410 --> 00:05:40,040
And then you might
have bugs there.

93
00:05:40,040 --> 00:05:41,150
We might not see them.

94
00:05:41,150 --> 00:05:43,250
Good for you.

95
00:05:43,250 --> 00:05:45,630
So this says how you're
going to fill out the table.

96
00:05:45,630 --> 00:05:48,810
Right? dp of something
equals something.

97
00:05:48,810 --> 00:05:50,150
What's in a dp table?

98
00:05:50,150 --> 00:05:51,460
Numbers.

99
00:05:51,460 --> 00:05:53,140
It's never how to do something.

100
00:05:53,140 --> 00:05:55,570
It's always the
numbers, so it's always

101
00:05:55,570 --> 00:05:58,680
the maximum profit,
or the minimum cost,

102
00:05:58,680 --> 00:06:01,290
or the shortest distance,
or the longest something.

103
00:06:01,290 --> 00:06:03,050
So it's always a number.

104
00:06:03,050 --> 00:06:04,760
So what do we do here?

105
00:06:04,760 --> 00:06:10,037
AUDIENCE: Start and dp, start
location to the end location is

106
00:06:10,037 --> 00:06:12,620
PROFESSOR: OK, so we're going
to get the mean distance, right?

107
00:06:12,620 --> 00:06:16,390
We usually do i j k and
whatever else it takes.

108
00:06:16,390 --> 00:06:18,620
So start to end is?

109
00:06:18,620 --> 00:06:22,030
AUDIENCE: The minimum
cost of cutting that up.

110
00:06:22,030 --> 00:06:35,020
PROFESSOR: Minimum cost of
cutting up the wood board

111
00:06:35,020 --> 00:06:40,460
from marking i, all
the way to marking j.

112
00:06:43,250 --> 00:06:46,560
There's a tiny problem here,
that the initial-- there's

113
00:06:46,560 --> 00:06:52,990
no subproblem for this big
piece of wood, right?

114
00:06:52,990 --> 00:06:56,380
If I can only consider
the board from i to j,

115
00:06:56,380 --> 00:07:00,980
so if I can only consider
the board from marking 1

116
00:07:00,980 --> 00:07:03,580
to marking n, then
I get to this.

117
00:07:03,580 --> 00:07:05,720
So this part and this
part get left out.

118
00:07:11,110 --> 00:07:13,070
AUDIENCE: [INAUDIBLE]

119
00:07:13,070 --> 00:07:14,420
PROFESSOR: Exactly.

120
00:07:14,420 --> 00:07:15,700
We add fake markings.

121
00:07:15,700 --> 00:07:21,170
Then m0 is 0, and
mn plus 1 equals l.

122
00:07:21,170 --> 00:07:21,670
Very good.

123
00:07:21,670 --> 00:07:23,770
AUDIENCE: [INAUDIBLE]
equally spaced?

124
00:07:23,770 --> 00:07:24,450
PROFESSOR: No.

125
00:07:24,450 --> 00:07:27,100
So these are numbers.

126
00:07:27,100 --> 00:07:30,250
If they were evenly spaced,
I think there's an algorithm.

127
00:07:30,250 --> 00:07:33,180
You might come up with some
math and say, you always

128
00:07:33,180 --> 00:07:34,030
cut it up like this.

129
00:07:40,380 --> 00:07:42,970
So while we solve this,
you guys have candy, right?

130
00:07:42,970 --> 00:07:46,390
You should eat the candy and
be energetic and everything.

131
00:07:49,460 --> 00:07:51,980
So min cost of
cutting up the board

132
00:07:51,980 --> 00:07:53,360
from marking i to marking j.

133
00:07:53,360 --> 00:07:54,280
I like this.

134
00:07:54,280 --> 00:07:55,966
Have this on your
exam if possible,

135
00:07:55,966 --> 00:07:57,590
because this will
make our life easier,

136
00:07:57,590 --> 00:07:59,214
and it's going to
make your life easier

137
00:07:59,214 --> 00:08:00,720
when you get to the
next step, which

138
00:08:00,720 --> 00:08:03,590
is how do we compute dp of i j?

139
00:08:06,490 --> 00:08:11,930
So suppose I'm looking at
the subboard from m1 to m4

140
00:08:11,930 --> 00:08:16,650
so I'm looking at only this.

141
00:08:16,650 --> 00:08:20,820
How do I compute the best way
to cut the board from m1 to m4?

142
00:08:33,147 --> 00:08:33,980
What are my options?

143
00:08:36,698 --> 00:08:38,510
AUDIENCE: The locations
you can cut it.

144
00:08:38,510 --> 00:08:39,630
PROFESSOR: Exactly.

145
00:08:39,630 --> 00:08:42,110
So in order to cut
this up, I can either

146
00:08:42,110 --> 00:08:45,230
make a first cut at m2.

147
00:08:45,230 --> 00:08:48,220
So say I make my first
cut here, and then I

148
00:08:48,220 --> 00:08:52,880
recursively cut
this, and cut this.

149
00:08:52,880 --> 00:08:59,750
Or the other alternative
is take the same guy-- m1,

150
00:08:59,750 --> 00:09:07,110
m2, m3, m4-- cut it at m3,
and then recursively cut this,

151
00:09:07,110 --> 00:09:10,730
and recursively cut this.

152
00:09:10,730 --> 00:09:15,790
So I'm iterating over all the
markings inside the board.

153
00:09:15,790 --> 00:09:17,505
Now suppose I'm
cutting it-- yes?

154
00:09:17,505 --> 00:09:21,176
AUDIENCE: [INAUDIBLE] cutting
both, or actually, never mind.

155
00:09:21,176 --> 00:09:22,550
PROFESSOR: Yeah,
when I recursed,

156
00:09:22,550 --> 00:09:24,470
that takes care of it.

157
00:09:24,470 --> 00:09:27,900
So suppose I'm looking
at m1 through m4,

158
00:09:27,900 --> 00:09:35,470
and I'm cutting it at m2.

159
00:09:35,470 --> 00:09:38,220
What's the total cost?

160
00:09:38,220 --> 00:09:41,670
So what's the best way
to cut, given that I

161
00:09:41,670 --> 00:09:43,562
know I'm going to cut there?

162
00:09:43,562 --> 00:09:46,470
AUDIENCE: The sum of the dp's.

163
00:09:46,470 --> 00:09:56,580
PROFESSOR: OK, so it's the
best way to cut m1 through m2,

164
00:09:56,580 --> 00:10:07,050
plus best way to
cut m2 through m4,

165
00:10:07,050 --> 00:10:10,520
plus the price I'm paying
for this cut, right?

166
00:10:10,520 --> 00:10:11,910
Not just the sum of the dp's.

167
00:10:11,910 --> 00:10:12,890
One more term.

168
00:10:12,890 --> 00:10:16,105
What's this term?

169
00:10:16,105 --> 00:10:17,560
AUDIENCE: 4 minus 1?

170
00:10:17,560 --> 00:10:20,634
Or the location of 4
minus the location of 1.

171
00:10:23,390 --> 00:10:27,300
PROFESSOR: So,
not quite, almost.

172
00:10:27,300 --> 00:10:31,340
So if I'm cutting a
board into two pieces,

173
00:10:31,340 --> 00:10:33,840
the cost is the product of
the length of the two pieces.

174
00:10:38,710 --> 00:10:46,050
m2 minus m1, times-- yes.

175
00:10:46,050 --> 00:10:48,580
OK, why did I bother doing this?

176
00:10:48,580 --> 00:10:51,410
Some people think better
with concrete numbers.

177
00:10:51,410 --> 00:10:56,050
If that's the case, then
give yourself an example.

178
00:10:56,050 --> 00:10:59,310
Write some numbers on
your sheet of paper,

179
00:10:59,310 --> 00:11:01,680
then see what
letters match to what

180
00:11:01,680 --> 00:11:03,940
numbers, and copy
it up using letters.

181
00:11:03,940 --> 00:11:07,220
And there you go, you've
solved the problem.

182
00:11:07,220 --> 00:11:09,680
So where are i and j here?

183
00:11:12,630 --> 00:11:14,200
AUDIENCE: i would be 1.

184
00:11:14,200 --> 00:11:16,438
PROFESSOR: OK, so this is i.

185
00:11:16,438 --> 00:11:19,130
AUDIENCE: That's j.

186
00:11:19,130 --> 00:11:24,400
PROFESSOR: Cool, so let's
try to write it up, now.

187
00:11:24,400 --> 00:11:31,040
So in order to cut the board
from i to j, what am I doing?

188
00:11:31,040 --> 00:11:33,010
So what am I computing?

189
00:11:33,010 --> 00:11:35,670
Usually the first word in
your subproblem definition

190
00:11:35,670 --> 00:11:38,080
is the function that
you're going to use.

191
00:11:38,080 --> 00:11:43,008
So it's minimum, and I'm
going to iterate over something.

192
00:11:43,008 --> 00:11:50,770
AUDIENCE: dp of i to--
it has to be all of j.

193
00:11:50,770 --> 00:11:52,260
dp of i, j, and
you're looking to--

194
00:11:52,260 --> 00:11:54,380
PROFESSOR: So I'm
computing dp of i j.

195
00:11:54,380 --> 00:11:57,380
AUDIENCE: I know, of
j minus [INAUDIBLE].

196
00:11:57,380 --> 00:12:00,620
AUDIENCE: j minus i, then
k j minus [INAUDIBLE].

197
00:12:00,620 --> 00:12:01,980
PROFESSOR: There's a k, right?

198
00:12:01,980 --> 00:12:05,980
I need a new variable for where
I'm going to cut up, right?

199
00:12:05,980 --> 00:12:08,680
So fortunately, we have a lot
of letters in the alphabet,

200
00:12:08,680 --> 00:12:11,030
i, j, k, so on and
so forth, l, m.

201
00:12:13,622 --> 00:12:14,930
AUDIENCE: i plus k.

202
00:12:17,209 --> 00:12:19,250
PROFESSOR: So let's say
that k is the place where

203
00:12:19,250 --> 00:12:21,410
we cut, to make our life easy.

204
00:12:21,410 --> 00:12:26,270
So I'm going to have dp of

205
00:12:26,270 --> 00:12:28,269
AUDIENCE: Well i is
the starting point.

206
00:12:28,269 --> 00:12:28,810
PROFESSOR: OK

207
00:12:28,810 --> 00:12:34,280
AUDIENCE: And then, the
endpoint is i plus k, right?

208
00:12:34,280 --> 00:12:36,267
PROFESSOR: So what's k here?

209
00:12:36,267 --> 00:12:37,600
AUDIENCE: k is an actual number.

210
00:12:37,600 --> 00:12:40,350
It's not the offset,
it's the actual number,

211
00:12:40,350 --> 00:12:41,840
so it should be i to k.

212
00:12:41,840 --> 00:12:43,484
It depends how you define k.

213
00:12:43,484 --> 00:12:45,900
PROFESSOR: So I'm going to
make my life easy, and define k

214
00:12:45,900 --> 00:12:47,735
as exactly the marking
at which I cut.

215
00:12:50,940 --> 00:12:51,680
k is this 2 here.

216
00:12:55,800 --> 00:12:58,380
And this is easier, trust me.

217
00:12:58,380 --> 00:13:01,042
OK, plus?

218
00:13:01,042 --> 00:13:01,625
AUDIENCE: k j?

219
00:13:05,030 --> 00:13:08,866
PROFESSOR: OK, and?

220
00:13:08,866 --> 00:13:16,156
AUDIENCE: Cost of m--

221
00:13:16,156 --> 00:13:23,910
AUDIENCE: j minus i, m of
k minus m of i times m of j

222
00:13:23,910 --> 00:13:25,470
minus m of k.

223
00:13:25,470 --> 00:13:26,394
PROFESSOR: Cool.

224
00:13:26,394 --> 00:13:28,060
Yeah, other way
around-- doesn't matter.

225
00:13:30,750 --> 00:13:35,430
So now where does k go?

226
00:13:35,430 --> 00:13:38,374
We have to come up with
numbers for the loop, right?

227
00:13:40,912 --> 00:13:42,160
AUDIENCE: Between i and j.

228
00:13:42,160 --> 00:13:44,522
AUDIENCE: j minus i.

229
00:13:44,522 --> 00:13:48,490
AUDIENCE: Just for k in i to j.

230
00:13:48,490 --> 00:13:52,624
PROFESSOR: So if I have
the board from 1 to 4,

231
00:13:52,624 --> 00:13:54,400
do I cut at 1?

232
00:13:54,400 --> 00:13:56,854
I can, but that's kind of weird.

233
00:13:56,854 --> 00:13:59,580
Because I'm recursing
on the same subproblem.

234
00:13:59,580 --> 00:14:01,880
By the way, if you recurse
to the same subproblem,

235
00:14:01,880 --> 00:14:05,150
what are you going to
get as your running time?

236
00:14:05,150 --> 00:14:07,450
Infinite.

237
00:14:07,450 --> 00:14:09,430
So let's not do that.

238
00:14:09,430 --> 00:14:11,470
So we're going to go from?

239
00:14:11,470 --> 00:14:14,014
AUDIENCE: [INAUDIBLE]

240
00:14:14,014 --> 00:14:15,680
PROFESSOR: So going
from i would be bad.

241
00:14:15,680 --> 00:14:16,991
So i plus 1.

242
00:14:16,991 --> 00:14:17,490
To?

243
00:14:17,490 --> 00:14:19,430
AUDIENCE: j minus 1.

244
00:14:19,430 --> 00:14:20,750
PROFESSOR: Very good.

245
00:14:20,750 --> 00:14:26,480
AUDIENCE: Would it be m of
i plus 1, because [INAUDIBLE].

246
00:14:26,480 --> 00:14:28,650
PROFESSOR: So k is which
marking I'm cutting at.

247
00:14:28,650 --> 00:14:32,020
I never want to cut
inside a marking.

248
00:14:32,020 --> 00:14:34,760
However, I don't even
know these are integers.

249
00:14:34,760 --> 00:14:37,890
AUDIENCE: They wouldn't
be called [INAUDIBLE].

250
00:14:37,890 --> 00:14:39,975
PROFESSOR: So k is
which marking, i, j,

251
00:14:39,975 --> 00:14:41,975
and k are which
marking I'm cutting at.

252
00:14:44,580 --> 00:14:46,330
These are the only
discrete things I have.

253
00:14:46,330 --> 00:14:51,132
This board is all filled
with real numbers.

254
00:14:51,132 --> 00:14:52,590
So if I want to
cut somewhere here,

255
00:14:52,590 --> 00:14:54,480
that's a real number--
I don't like that.

256
00:14:54,480 --> 00:14:55,900
I want to have integers.

257
00:14:55,900 --> 00:15:00,040
So my markings help
me get integers.

258
00:15:00,040 --> 00:15:01,530
I only want to cut
at the marking,

259
00:15:01,530 --> 00:15:05,370
so I always look at my problem
in terms of which marking I'm

260
00:15:05,370 --> 00:15:06,080
cutting at.

261
00:15:09,110 --> 00:15:11,650
So this always
iterates over markings.

262
00:15:11,650 --> 00:15:16,790
So this looks very much like
the parentheses problem, right?

263
00:15:16,790 --> 00:15:20,740
Same subproblems, roughly
the same recursion.

264
00:15:20,740 --> 00:15:22,500
Turns out that these
problems, where

265
00:15:22,500 --> 00:15:24,550
you're not considering
suffixes or prefixes,

266
00:15:24,550 --> 00:15:27,100
but rather you're
considering substrings,

267
00:15:27,100 --> 00:15:30,600
are reasonably hard to come by,
and reasonably hard to solve.

268
00:15:30,600 --> 00:15:32,840
So if we give these
to you, chances

269
00:15:32,840 --> 00:15:35,890
are they're going to be exactly
like the parentheses problem,

270
00:15:35,890 --> 00:15:37,990
except for the cost function.

271
00:15:37,990 --> 00:15:42,050
This isn't what we had in the
parentheses problem, right?

272
00:15:42,050 --> 00:15:44,220
So you should be prepared
to solve problems

273
00:15:44,220 --> 00:15:46,070
that look exactly like
the paren problem,

274
00:15:46,070 --> 00:15:49,780
but might have a
different cost function.

275
00:15:49,780 --> 00:15:52,960
And this is how we solve it.
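
The recurrence worked out above can be sketched in Python. This is a minimal illustration, not code from the course: the function name `min_cut_cost` and its arguments are hypothetical, and memoized recursion stands in for filling out the dp table by hand. The fake markings m0 = 0 and m(n+1) = l are added as discussed, and k ranges from i plus 1 to j minus 1.

```python
from functools import lru_cache


def min_cut_cost(length, marks):
    """Minimum total price to cut a board of the given length at every marking.

    Hypothetical helper: `length` is l, `marks` holds m1..mn in any order.
    """
    # Add the fake markings: m[0] = 0 and m[n + 1] = l.
    m = [0] + sorted(marks) + [length]

    @lru_cache(maxsize=None)
    def dp(i, j):
        # dp(i, j) = min cost of cutting up the board from marking i to marking j.
        if j - i < 2:
            return 0  # no marking strictly between i and j, so nothing to cut
        # Try every marking k strictly inside (i, j) as the first cut.
        return min(
            dp(i, k) + dp(k, j) + (m[k] - m[i]) * (m[j] - m[k])
            for k in range(i + 1, j)
        )

    return dp(0, len(m) - 1)
```

There are on the order of n squared subproblems (one per pair i, j), and each one loops over at most n choices of k, which matches the n cubed running time quoted earlier.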

276
00:15:52,960 --> 00:15:53,460
OK.

277
00:15:53,460 --> 00:15:55,668
AUDIENCE: When you say that
the complexity determines

278
00:15:55,668 --> 00:15:58,245
which type of dp
example we use, does

279
00:15:58,245 --> 00:16:03,260
that mean that a
problem can be solved

280
00:16:03,260 --> 00:16:08,514
using any of the dp examples?

281
00:16:08,514 --> 00:16:13,760
It's just that the only thing
that changes is the complexity.

282
00:16:13,760 --> 00:16:15,310
PROFESSOR: I don't
think you can map

283
00:16:15,310 --> 00:16:16,840
every approach
onto every problem.

284
00:16:16,840 --> 00:16:20,450
For example, if you tried
to map prefixes onto this,

285
00:16:20,450 --> 00:16:23,050
you'd come up with
a solution that

286
00:16:23,050 --> 00:16:25,370
doesn't look at all
the possible choices,

287
00:16:25,370 --> 00:16:27,540
so your answer would
be sub-optimal.

288
00:16:27,540 --> 00:16:30,330
So you'd come up with a fast,
but incorrect algorithm.

289
00:16:30,330 --> 00:16:35,750
However, if you take the
problem of finding the longest

290
00:16:35,750 --> 00:16:37,900
increasing sub-sequence,
you can definitely

291
00:16:37,900 --> 00:16:39,230
apply this technique to it.

292
00:16:39,230 --> 00:16:41,290
It's more general than
suffixes or prefixes.

293
00:16:41,290 --> 00:16:44,200
So it's going to work, but
it's going to be slower.

294
00:16:44,200 --> 00:16:46,610
So in theory, what
you should do is,

295
00:16:46,610 --> 00:16:48,510
you have all these techniques.

296
00:16:48,510 --> 00:16:51,450
Given a problem, you
try all the techniques.

297
00:16:51,450 --> 00:16:53,720
You see which ones apply,
and out of those, you

298
00:16:53,720 --> 00:16:56,460
see which one gives you
the best running time.

299
00:16:56,460 --> 00:16:59,620
In practice, if we give
you the running time,

300
00:16:59,620 --> 00:17:03,360
you match it to the techniques
that match the running time.

301
00:17:03,360 --> 00:17:05,700
You start backwards from
the stuff that you know.

302
00:17:10,630 --> 00:17:12,869
OK.

303
00:17:12,869 --> 00:17:15,730
Does this problem make sense?

304
00:17:15,730 --> 00:17:18,180
Sweet.

305
00:17:18,180 --> 00:17:19,680
Now let's do a hard problem.

306
00:17:19,680 --> 00:17:23,819
Do people remember hashing?

307
00:17:23,819 --> 00:17:25,589
You have one minute
to remember hashing

308
00:17:25,589 --> 00:17:26,640
while I erase the board.

309
00:17:26,640 --> 00:17:28,530
[LAUGHING]

310
00:17:28,530 --> 00:17:31,940
So suppose we want
to implement the set.

311
00:17:31,940 --> 00:17:36,590
The way we're going to implement
the set is, we have n elements.

312
00:17:39,810 --> 00:17:41,680
We're going to put
them into the set,

313
00:17:41,680 --> 00:17:50,020
so for i goes from
1 through n, we're

314
00:17:50,020 --> 00:17:54,700
going to insert element
i, so first we're

315
00:17:54,700 --> 00:17:57,730
going to insert all the
elements into the set.

316
00:17:57,730 --> 00:18:02,140
And then after that, given a
random number, we want to see

317
00:18:02,140 --> 00:18:03,810
is it in the set, or not.

318
00:18:03,810 --> 00:18:09,190
So for some other number--
I used n before, so let's

319
00:18:09,190 --> 00:18:15,450
use-- for some other number f,
we want to see is f in the set,

320
00:18:15,450 --> 00:18:19,068
or is f not in the set?

321
00:18:22,140 --> 00:18:24,400
What data structure would
you use normally for this?

322
00:18:27,490 --> 00:18:28,560
A hash table, right?

323
00:18:28,560 --> 00:18:31,080
You stick everything
into a hash table,

324
00:18:31,080 --> 00:18:32,820
then you try to
find the elements.

325
00:18:32,820 --> 00:18:34,880
If you find them,
then you say yes.

326
00:18:34,880 --> 00:18:36,980
If not, then you say no.

327
00:18:36,980 --> 00:18:38,710
Well, it turns out
that this would

328
00:18:38,710 --> 00:18:41,160
take more memory
than what we have.

329
00:18:41,160 --> 00:18:44,720
So instead, we're
going to do this.

330
00:18:44,720 --> 00:18:47,850
We're going to have a
hash table of m bits.

331
00:18:53,660 --> 00:18:54,840
So these are m bits.

332
00:18:54,840 --> 00:18:58,670
And say we have a
hash function that

333
00:18:58,670 --> 00:19:02,410
satisfies with uniform
hashing, so given any element,

334
00:19:02,410 --> 00:19:07,784
the value is anywhere
from 0 to m minus 1,

335
00:19:07,784 --> 00:19:08,950
and they're all independent.

336
00:19:12,490 --> 00:19:14,600
So the way we're going
to insert an element

337
00:19:14,600 --> 00:19:23,040
is-- this table is T-- we're
going to say that T of h of ai

338
00:19:23,040 --> 00:19:24,690
equals 1.

339
00:19:24,690 --> 00:19:26,600
So this is a table of bits.

340
00:19:26,600 --> 00:19:28,570
For every element
we hash the element,

341
00:19:28,570 --> 00:19:32,610
and we set the
corresponding bit to 1.

342
00:19:32,610 --> 00:19:38,230
So we're going to have some 1s,
and some zeros in the table.

343
00:19:38,230 --> 00:19:43,680
Say if this is ai, it
hashes somewhere here.

344
00:19:43,680 --> 00:19:46,530
OK so the question
is, we inserted

345
00:19:46,530 --> 00:19:49,560
n elements into a
table of size n.

346
00:19:49,560 --> 00:19:55,110
Given a new element, f, where
f stands for false positive-- f

347
00:19:55,110 --> 00:19:58,395
is not one of the
elements that we inserted.

348
00:20:02,040 --> 00:20:04,490
I want to know what's the
probability that the set will

349
00:20:04,490 --> 00:20:08,700
say that the element is
in the set, so basically,

350
00:20:08,700 --> 00:20:11,674
the probability of
a false positive.

351
00:20:11,674 --> 00:20:14,660
AUDIENCE: So what are we
doing about [INAUDIBLE]?

352
00:20:14,660 --> 00:20:15,595
PROFESSOR: Nothing.

353
00:20:15,595 --> 00:20:18,612
AUDIENCE: Is it chaining,
is it open addressing?

354
00:20:18,612 --> 00:20:21,604
Does it even matter?

355
00:20:21,604 --> 00:20:23,520
PROFESSOR: So we're not
inserting the elements

356
00:20:23,520 --> 00:20:25,200
into the table.

357
00:20:25,200 --> 00:20:26,890
This table only has bits.

358
00:20:26,890 --> 00:20:31,800
The elements are lost
completely after we insert them.

359
00:20:31,800 --> 00:20:34,200
So the tradeoff is
it uses a lot less memory.

360
00:20:34,200 --> 00:20:36,050
Instead of having to
store entire elements,

361
00:20:36,050 --> 00:20:38,050
you just store bits.

362
00:20:38,050 --> 00:20:40,927
On the downside you're going
to have false positives.

363
00:20:40,927 --> 00:20:42,510
Because if I have a
different element,

364
00:20:42,510 --> 00:20:47,090
say f, if it hashes
to the same location,

365
00:20:47,090 --> 00:20:51,360
then the set is going to
say, yeah, it's in the set.

366
00:20:51,360 --> 00:20:53,041
So you get false positives.

367
00:20:53,041 --> 00:20:54,290
Would you get false negatives?

368
00:20:59,230 --> 00:20:59,730
No, right?

369
00:21:02,530 --> 00:21:05,790
Because you start out
with a table of 0's,

370
00:21:05,790 --> 00:21:07,680
and you only set
the table to ones

371
00:21:07,680 --> 00:21:11,570
for the numbers
that match to hashes

372
00:21:11,570 --> 00:21:13,340
of elements that are in the set.
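
The bit table described above can be sketched as a small Python class. This is only an illustration under stated assumptions: the class name `BitTableSet` and its method names are hypothetical, and Python's built-in `hash` taken modulo m stands in for the uniform hash function h, which it is not guaranteed to be.

```python
class BitTableSet:
    """Set that stores only m bits; the inserted elements themselves are lost."""

    def __init__(self, m, hash_fn=hash):
        self.m = m
        self.table = [0] * m      # T starts out as all 0's
        self.hash_fn = hash_fn    # assumption: stand-in for the uniform hash h

    def _slot(self, x):
        # Map the element to a position between 0 and m - 1.
        return self.hash_fn(x) % self.m

    def insert(self, x):
        # T[h(x)] = 1
        self.table[self._slot(x)] = 1

    def might_contain(self, x):
        # True for every inserted element (no false negatives), but possibly
        # also True for an element that merely hashes to the same slot.
        return self.table[self._slot(x)] == 1
```

In CPython small non-negative integers hash to themselves, so with m = 8 the values 3 and 19 land in the same slot: after inserting 3, a query for 19 answers yes, which is exactly the false positive discussed here.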

373
00:21:13,340 --> 00:21:14,954
Did you have a question?

374
00:21:14,954 --> 00:21:15,453
OK.

375
00:21:19,910 --> 00:21:21,680
OK, do we understand
the problem,

376
00:21:21,680 --> 00:21:23,584
before we attempt to solve it?

377
00:21:23,584 --> 00:21:27,250
AUDIENCE: Is it probably 1/m?

378
00:21:27,250 --> 00:21:29,880
PROFESSOR: You'd wish, but no.

379
00:21:32,860 --> 00:21:34,245
AUDIENCE: It's less than n/m.

380
00:21:37,100 --> 00:21:38,560
PROFESSOR: OK, I like that.

381
00:21:38,560 --> 00:21:41,410
So what are you thinking?

382
00:21:41,410 --> 00:21:43,410
AUDIENCE: If there are
no collisions previously,

383
00:21:43,410 --> 00:21:48,730
then it would equal to n/m, but
there are collisions, probably

384
00:21:48,730 --> 00:21:50,566
collisions.

385
00:21:50,566 --> 00:21:52,940
PROFESSOR: OK, I'm going to
open up a window in your head

386
00:21:52,940 --> 00:21:56,770
and tell everyone else the small
steps you took to get here.

387
00:21:56,770 --> 00:21:58,940
So we have this new number f.

388
00:21:58,940 --> 00:22:01,200
How are we going to check
if it's in the set or not?

389
00:22:01,200 --> 00:22:04,300
We're going to compute
h of f, and we're

390
00:22:04,300 --> 00:22:09,580
going to check if T
of h of f is 0 or 1.

391
00:22:13,370 --> 00:22:15,870
f is different from
all the other elements.

392
00:22:15,870 --> 00:22:19,920
So its hash value is independent
from all the other hash values

393
00:22:19,920 --> 00:22:20,640
we had before.

394
00:22:24,300 --> 00:22:26,480
We don't really care
about this anymore,

395
00:22:26,480 --> 00:22:29,900
after we have the
independence assumption.

396
00:22:29,900 --> 00:22:34,400
So h of f is just some
random position in the table.

397
00:22:34,400 --> 00:22:38,080
So the question is, given some
random position in the table,

398
00:22:38,080 --> 00:22:41,090
will that be a 0 or a 1?

399
00:22:41,090 --> 00:22:42,570
How do you know?

400
00:22:42,570 --> 00:22:44,910
If I knew how many 1's
I have in the table--

401
00:22:44,910 --> 00:22:49,530
if I have k 1's in the table,
and automatically this means m

402
00:22:49,530 --> 00:22:56,190
minus k 0's-- then what's the
probability that h of f will

403
00:22:56,190 --> 00:22:57,060
point to a 1?

404
00:23:06,332 --> 00:23:08,265
AUDIENCE: k/m.

405
00:23:08,265 --> 00:23:08,890
PROFESSOR: Yes.

406
00:23:08,890 --> 00:23:11,580
So the hash takes
m possible values.

407
00:23:11,580 --> 00:23:13,090
k of them are 1's.

408
00:23:13,090 --> 00:23:18,750
So the probability that the hash
is going to guess a 1 is k/m.

409
00:23:18,750 --> 00:23:23,820
So if we knew how many 1's we
have, then this is the answer.

410
00:23:23,820 --> 00:23:26,520
We know that we're going to
have at most n 1's-- that's what

411
00:23:26,520 --> 00:23:28,200
you're thinking, right?

412
00:23:28,200 --> 00:23:31,950
So k is definitely
smaller than or equal to n,

413
00:23:31,950 --> 00:23:39,900
so the answer definitely has to
be smaller than or equal to n/m.

414
00:23:39,900 --> 00:23:41,930
Now if you're in a
rush, you might say,

415
00:23:41,930 --> 00:23:44,960
well, we inserted n
elements, so we're definitely

416
00:23:44,960 --> 00:23:48,030
going to have n 1's here.

417
00:23:48,030 --> 00:23:49,320
That is not true.

418
00:23:49,320 --> 00:23:53,060
The hashes of all the
elements are independent.

419
00:23:53,060 --> 00:23:55,440
So there is some probability
that two elements will

420
00:23:55,440 --> 00:23:59,210
hash to the same value, and as
the number of elements grows,

421
00:23:59,210 --> 00:24:00,785
that probability also grows.

422
00:24:03,970 --> 00:24:06,510
OK, so now by
looking at this, we

423
00:24:06,510 --> 00:24:08,650
got rid of this
part of the problem.

424
00:24:08,650 --> 00:24:10,372
We don't care that
there's a new element.

425
00:24:10,372 --> 00:24:12,080
We don't care that
it's a false positive.

426
00:24:12,080 --> 00:24:14,270
All that we care
about is how many

427
00:24:14,270 --> 00:24:18,080
1's do we have in the table
after inserting n values.

428
00:24:23,010 --> 00:24:23,760
Well, what's that?

429
00:24:23,760 --> 00:24:41,810
That's m times the probability
that a slot in the table is 1.

430
00:24:41,810 --> 00:24:46,820
Right, the probability that the
slot in the table is 1 is k/m.

431
00:24:46,820 --> 00:24:49,390
So if we know this probability,
and we multiply it by m,

432
00:24:49,390 --> 00:24:50,130
then we get k.

433
00:24:57,450 --> 00:25:00,240
People still with me?

434
00:25:00,240 --> 00:25:03,470
AUDIENCE: And what does
that variable represent, h?

435
00:25:03,470 --> 00:25:05,050
PROFESSOR: This is k.

436
00:25:05,050 --> 00:25:07,050
Represents that my
handwriting sucks, basically.

437
00:25:07,050 --> 00:25:10,530
AUDIENCE: I mean, why do we
do m times the probability.

438
00:25:10,530 --> 00:25:15,208
That's the expected
number of 1's in the table?

439
00:25:15,208 --> 00:25:15,874
PROFESSOR: Yeah.

440
00:25:19,710 --> 00:25:24,330
Yeah, this is E of k, I guess.

441
00:25:24,330 --> 00:25:30,194
So then our final answer
is this thing divided by m.

442
00:25:34,940 --> 00:25:40,390
So the answer is the
expected value of k,

443
00:25:40,390 --> 00:25:43,830
or you can just think of it as
the average value of k, divided

444
00:25:43,830 --> 00:25:44,396
by m.

445
00:25:44,396 --> 00:25:53,300
So this is m times this
probability, divided by m.

446
00:25:53,300 --> 00:25:56,450
So it is exactly
this probability.

447
00:25:56,450 --> 00:25:59,920
So the thing that
we want to focus on

448
00:25:59,920 --> 00:26:07,710
is, what's the probability
that a random slot in the table

449
00:26:07,710 --> 00:26:08,210
is a 1?

450
00:26:16,900 --> 00:26:18,858
AUDIENCE: It's equal to
1 minus the probability

451
00:26:18,858 --> 00:26:21,290
that it was never fixed.

452
00:26:21,290 --> 00:26:26,090
PROFESSOR: Exactly,
the first thing we do.

453
00:26:26,090 --> 00:26:30,570
1 minus the probability
that a slot is 0.

454
00:26:33,270 --> 00:26:35,560
This is easy, right,
like it looks easy.

455
00:26:35,560 --> 00:26:38,210
But this makes a
huge difference,

456
00:26:38,210 --> 00:26:42,660
because once we're here,
well, a slot is zero

457
00:26:42,660 --> 00:26:46,246
if none of the
insertions made it a one.

458
00:26:46,246 --> 00:26:47,870
And the insertions
are all independent.

459
00:26:50,710 --> 00:26:54,030
So this is like,
you're flipping a coin.

460
00:26:54,030 --> 00:26:56,380
What's the probability that
after you flip it n times,

461
00:26:56,380 --> 00:26:57,470
you never get a head?

462
00:27:01,450 --> 00:27:04,384
So this is 1 minus

463
00:27:04,384 --> 00:27:07,950
AUDIENCE: 1 over m
to the something.

464
00:27:07,950 --> 00:27:11,420
PROFESSOR: That--
So a slot is 0 means

465
00:27:11,420 --> 00:27:14,260
that no number was
inserted in it.

466
00:27:14,260 --> 00:27:17,990
We're inserting n numbers,
so it's the probability

467
00:27:17,990 --> 00:27:27,940
that a single number
was not inserted

468
00:27:27,940 --> 00:27:36,350
in the slot, raised
to the power of n.

469
00:27:36,350 --> 00:27:38,720
So we have n independent
experiments, right?

470
00:27:38,720 --> 00:27:43,390
Every time you insert a
number into the hash function,

471
00:27:43,390 --> 00:27:45,380
that's one experiment.

472
00:27:45,380 --> 00:27:47,790
The hash function gives
you independent values

473
00:27:47,790 --> 00:27:50,560
for all the elements.

474
00:27:50,560 --> 00:27:53,530
So all the insertions are
independent of each other.

475
00:27:53,530 --> 00:27:58,140
If, in a single insertion,
you've hit that slot,

476
00:27:58,140 --> 00:28:00,460
then you've made
it a 1-- game over.

477
00:28:00,460 --> 00:28:03,670
So the slot is only a zero
if none of the insertions

478
00:28:03,670 --> 00:28:05,420
make it a 1.

479
00:28:05,420 --> 00:28:07,670
So you take the probability
that the insertion doesn't

480
00:28:07,670 --> 00:28:09,810
make it a one, and you
raise it to the power n,

481
00:28:09,810 --> 00:28:12,129
because that has to
happen n times in order

482
00:28:12,129 --> 00:28:13,670
for the whole thing
to be successful.

483
00:28:24,320 --> 00:28:26,200
And the probability
that the number was not

484
00:28:26,200 --> 00:28:29,520
inserted in a slot is
1 minus the probability

485
00:28:29,520 --> 00:28:31,440
that it was inserted.

486
00:28:31,440 --> 00:28:33,820
Right, we're doing this again.

487
00:28:33,820 --> 00:28:39,710
1 minus probability
that a number hit.

488
00:28:44,081 --> 00:28:45,205
Well, what's this probability?

489
00:28:47,740 --> 00:28:48,760
Uniform hashing.

490
00:28:48,760 --> 00:28:49,865
AUDIENCE: 1/m

491
00:28:49,865 --> 00:28:50,490
PROFESSOR: 1/m.

492
00:28:53,800 --> 00:29:01,690
So this whole thing is
1 - (1 - 1/m)^n.

493
00:29:01,690 --> 00:29:07,470
Which is 1 - ((m - 1)/m)^n.
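[Note: a minimal sketch, not part of the lecture, of the result just derived: the chance that a given slot is 1 after n insertions into m slots. It assumes Python's random module as a stand-in for uniform hashing; the names slot_is_one_probability and simulate_fraction_of_ones are made up for illustration.]
```python
import random
def slot_is_one_probability(n: int, m: int) -> float:
    # Closed form from the derivation: 1 - ((m - 1) / m) ** n
    return 1.0 - ((m - 1) / m) ** n
def simulate_fraction_of_ones(n: int, m: int, trials: int = 2000, seed: int = 0) -> float:
    # Assumption: each insertion picks one of the m slots uniformly and independently.
    rng = random.Random(seed)
    ones = 0
    for _ in range(trials):
        table = [0] * m
        for _ in range(n):
            table[rng.randrange(m)] = 1
        ones += sum(table)
    # Average fraction of slots that ended up as 1 across all trials.
    return ones / (trials * m)
print(slot_is_one_probability(50, 100), simulate_fraction_of_ones(50, 100))
```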

494
00:29:13,398 --> 00:29:15,868
AUDIENCE: Can we go
through this again.

495
00:29:15,868 --> 00:29:20,067
From 1 minus probability
of a slot is 0, to 1

496
00:29:20,067 --> 00:29:25,260
minus probability of a number
was not inserted in a slot?

497
00:29:25,260 --> 00:29:26,200
PROFESSOR: OK.

498
00:29:26,200 --> 00:29:28,120
So first off, the
point of the problem.

499
00:29:28,120 --> 00:29:29,570
It's a hard problem, right?

500
00:29:29,570 --> 00:29:31,284
Don't panic, don't be angry.

501
00:29:31,284 --> 00:29:33,450
You're not going to have
something this hard on the exam.

502
00:29:33,450 --> 00:29:35,984
The point of this is, I want
to go through probabilities

503
00:29:35,984 --> 00:29:37,900
a little bit, and I want
to go through hashing

504
00:29:37,900 --> 00:29:39,220
and the math behind hashing.

505
00:29:39,220 --> 00:29:43,140
Because remembering
that will be useful.

506
00:29:43,140 --> 00:29:48,970
OK, so now you said you're
having trouble with this step?

507
00:29:53,474 --> 00:29:55,470
OK, so let's see.

508
00:29:55,470 --> 00:29:59,100
Let's do this here.

509
00:29:59,100 --> 00:30:01,540
So we have this
table here, right?

510
00:30:01,540 --> 00:30:08,550
And we have n elements-- e1,
e2, e3, all the way through en.

511
00:30:08,550 --> 00:30:10,140
How do we put them in the table?

512
00:30:10,140 --> 00:30:13,720
We hash each of them,
and each of them maps

513
00:30:13,720 --> 00:30:16,570
to a random slot in the table.

514
00:30:16,570 --> 00:30:22,060
If these are the slots,
then e1 might map here,

515
00:30:22,060 --> 00:30:26,260
e2 might map here,
e3 might map here,

516
00:30:26,260 --> 00:30:30,220
e4 might map here,
so on and so forth.

517
00:30:30,220 --> 00:30:32,460
So I have arrows, right?

518
00:30:32,460 --> 00:30:38,162
Every time I do a hash, that's
going to set something to a 1.

519
00:30:38,162 --> 00:30:40,370
The numbers don't necessarily
map to different slots,

520
00:30:40,370 --> 00:30:44,820
because each number, on its
own, maps to a random slot.

521
00:30:44,820 --> 00:30:48,240
So these are all
going to be ones.

522
00:30:48,240 --> 00:30:50,480
And everything
else becomes zero.

523
00:30:50,480 --> 00:30:54,840
If no number maps
to a slot, it is 0.

524
00:30:54,840 --> 00:30:58,490
OK, let's look at
one slot, any slot.

525
00:30:58,490 --> 00:31:01,590
So let's say I'm looking
at this slot over here.

526
00:31:01,590 --> 00:31:03,870
Can you guys see, by the way?

527
00:31:03,870 --> 00:31:06,420
OK, so let's look
at this guy here.

528
00:31:06,420 --> 00:31:09,940
What's the probability
that it's a 0?

529
00:31:09,940 --> 00:31:14,935
So the probability
that the slot is

530
00:31:14,935 --> 00:31:21,180
a 0 is the probability that
the first number didn't

531
00:31:21,180 --> 00:31:27,560
map to it-- otherwise
it would be a 1-- e1

532
00:31:27,560 --> 00:31:30,490
didn't hash to that slot.

533
00:31:33,410 --> 00:31:37,950
e2 also couldn't map
to that slot, right?

534
00:31:37,950 --> 00:31:43,180
So it's the probability that
e1 didn't hash to the slot,

535
00:31:43,180 --> 00:31:54,940
and e2 didn't hash
into the slot, and e3

536
00:31:54,940 --> 00:31:59,100
didn't hash into the slot,
so on and so forth, right?

537
00:31:59,100 --> 00:32:04,600
All the way up until en
didn't hash to the slot.

538
00:32:04,600 --> 00:32:06,510
This makes sense?

539
00:32:06,510 --> 00:32:08,240
Now these are all
independent events,

540
00:32:08,240 --> 00:32:10,440
because all the hashes
are independent,

541
00:32:10,440 --> 00:32:13,100
by the uniform
hashing assumption.

542
00:32:13,100 --> 00:32:16,860
So then I can turn
ands into products.

543
00:32:16,860 --> 00:32:20,880
So I can say that this
equals to the probability

544
00:32:20,880 --> 00:32:28,500
that e1 didn't hash into the
slot, times the probability

545
00:32:28,500 --> 00:32:34,990
that e2 didn't hash into the
slot, times the probability

546
00:32:34,990 --> 00:32:39,240
that e3 didn't hash into the
slot, so on and so forth,

547
00:32:39,240 --> 00:32:43,915
all the way to the probability
that en didn't hash.

548
00:32:51,800 --> 00:32:53,970
So since I'm dealing with
the same hash function,

549
00:32:53,970 --> 00:32:56,920
turns out that all the
probabilities are the same.

550
00:32:56,920 --> 00:33:01,810
So there, the probability
that some fixed number

551
00:33:01,810 --> 00:33:06,700
didn't hash, to the power n.

552
00:33:11,700 --> 00:33:15,300
So this is how I got
from here to here.

553
00:33:15,300 --> 00:33:18,900
Probabilities and the
properties of hashes and hashing

554
00:33:18,900 --> 00:33:19,930
assumptions.

555
00:33:19,930 --> 00:33:22,700
So you guys should have
those on your cheat sheet,

556
00:33:22,700 --> 00:33:25,521
and maybe if you have time,
review probabilities a bit.

557
00:33:25,521 --> 00:33:26,896
AUDIENCE: What is
the probability

558
00:33:26,896 --> 00:33:29,540
that any given one
doesn't hash, 1/m?

559
00:33:32,697 --> 00:33:34,985
So if e1 doesn't
hash in that spot,

560
00:33:34,985 --> 00:33:37,690
isn't that probability 1/m?

561
00:33:37,690 --> 00:33:39,590
PROFESSOR: Not quite.

562
00:33:39,590 --> 00:33:41,520
You're close, but not quite.

563
00:33:41,520 --> 00:33:44,550
So you're saying that
the probability that e1

564
00:33:44,550 --> 00:33:47,722
doesn't hash to
this slot is 1/m?

565
00:33:47,722 --> 00:33:49,180
AUDIENCE: I guess
it's 1 minus 1/m.

566
00:33:49,180 --> 00:33:50,750
PROFESSOR: Exactly.

567
00:33:50,750 --> 00:33:52,590
The probability that
it would hash here

568
00:33:52,590 --> 00:33:55,480
is 1/m, because it has
to pick that one slot out

569
00:33:55,480 --> 00:33:57,630
of m possible slots.

570
00:33:57,630 --> 00:33:59,470
But if you're just
saying, all I want

571
00:33:59,470 --> 00:34:02,920
is that it doesn't
hash here, well, it

572
00:34:02,920 --> 00:34:05,290
means it can hash anywhere else.

573
00:34:05,290 --> 00:34:07,680
So it has m minus 1 options.

574
00:34:07,680 --> 00:34:09,850
It can go to any of
those m minus 1 places,

575
00:34:09,850 --> 00:34:11,520
just not to that one place.

576
00:34:11,520 --> 00:34:13,111
So m minus 1 over m.

577
00:34:19,845 --> 00:34:22,250
AUDIENCE: It's interesting
it went the other direction.

578
00:34:22,250 --> 00:34:25,050
Instead of saying, it's
1, it's 1 minus it.

579
00:34:27,387 --> 00:34:29,053
Wouldn't it have been
just as easy to go

580
00:34:29,053 --> 00:34:30,409
the other direction, or no?

581
00:34:30,409 --> 00:34:32,510
PROFESSOR: No.

582
00:34:32,510 --> 00:34:34,460
Not doing this makes
the problem hard,

583
00:34:34,460 --> 00:34:36,489
so that's why we're doing it.

584
00:34:36,489 --> 00:34:40,100
This kind of flipping is
easy to do conceptually,

585
00:34:40,100 --> 00:34:43,152
but it might make a hard problem
into a really easy problem,

586
00:34:43,152 --> 00:34:44,610
or at least into
a do-able problem.

587
00:34:49,923 --> 00:34:52,338
AUDIENCE: Isn't
this the same thing?

588
00:34:52,338 --> 00:34:55,474
I guess maybe not totally.

589
00:34:55,474 --> 00:34:57,890
PROFESSOR: So it is exactly
the same in terms of the math,

590
00:34:57,890 --> 00:35:01,875
but computing this without
turning it into this

591
00:35:01,875 --> 00:35:05,240
is really hard.

592
00:35:05,240 --> 00:35:07,070
AUDIENCE: Any given
slot is 1, isn't it

593
00:35:07,070 --> 00:35:10,125
kind of like what we just
said, except if the probability

594
00:35:10,125 --> 00:35:17,020
of any one mapping is 1/m,
mapping to a 1, right?

595
00:35:17,020 --> 00:35:18,550
And then you take
1 over m raised

596
00:35:18,550 --> 00:35:23,636
to the n, that's the
probability of it being a 1

597
00:35:23,636 --> 00:35:27,479
at that one place, right?

598
00:35:27,479 --> 00:35:28,520
PROFESSOR: No, not quite.

599
00:35:37,304 --> 00:35:40,250
Yeah.

600
00:35:40,250 --> 00:35:43,610
OK, so are we getting this?

601
00:35:43,610 --> 00:35:46,000
Somewhat?

602
00:35:46,000 --> 00:35:46,555
Yes?

603
00:35:46,555 --> 00:35:48,500
AUDIENCE: So the probability
of a false positive,

604
00:35:48,500 --> 00:35:50,666
you're saying that's what's
the probability that you

605
00:35:50,666 --> 00:35:54,004
get the 1, if you actually
should [INAUDIBLE] the 0.

606
00:35:54,004 --> 00:35:57,479
It's because multiple things
mapped to that one slot, right?

607
00:35:57,479 --> 00:35:59,520
PROFESSOR: So the probability
of a false positive

608
00:35:59,520 --> 00:36:04,420
is the probability that, given
a new element, when we hash it

609
00:36:04,420 --> 00:36:07,260
we get the 1.

610
00:36:07,260 --> 00:36:09,927
The hash of that new element
is independent of all

611
00:36:09,927 --> 00:36:10,635
the other hashes.

612
00:36:12,898 --> 00:36:14,814
AUDIENCE: Then why is
it simply the probability

613
00:36:14,814 --> 00:36:17,877
that you get the 1?

614
00:36:17,877 --> 00:36:19,460
PROFESSOR: So if I
have a new element,

615
00:36:19,460 --> 00:36:21,350
I'm going to compute
its hash, and I'm

616
00:36:21,350 --> 00:36:22,970
going to look in the table.

617
00:36:22,970 --> 00:36:24,801
If I see a 1, I'm
going to say, oh.

618
00:36:24,801 --> 00:36:26,176
AUDIENCE: Oh, it's
a new element.

619
00:36:26,176 --> 00:36:26,240
OK.

620
00:36:26,240 --> 00:36:27,656
PROFESSOR: Yeah,
so it's something

621
00:36:27,656 --> 00:36:29,230
that was not in the set.

622
00:36:29,230 --> 00:36:30,714
AUDIENCE: OK.

623
00:36:30,714 --> 00:36:31,630
PROFESSOR: Okay, cool.

624
00:36:36,940 --> 00:36:40,080
OK, so let's see if we can
do a harder version of this.

625
00:36:43,000 --> 00:36:48,030
So this probability isn't
great, but if we do one trick,

626
00:36:48,030 --> 00:36:49,390
we can make this really nice.

627
00:36:49,390 --> 00:36:52,430
And this puts together a trick
called Bloom filters that

628
00:36:52,430 --> 00:36:54,490
is used in all
sorts of situations.

629
00:37:01,510 --> 00:37:07,220
So for Bloom filters, we
still have n elements,

630
00:37:07,220 --> 00:37:11,175
and we still have
a table of m bits.

631
00:37:16,310 --> 00:37:19,230
What changes this time is
instead of having one function,

632
00:37:19,230 --> 00:37:21,510
we have k hash functions.

633
00:37:27,570 --> 00:37:31,120
So when we take an
element and insert it,

634
00:37:31,120 --> 00:37:32,930
we're taking element i.

635
00:37:32,930 --> 00:37:34,640
The way to insert
it is we're going

636
00:37:34,640 --> 00:37:43,980
to compute its hash value
using all the hash functions,

637
00:37:43,980 --> 00:37:46,500
and set all the
corresponding bits to 1.

638
00:37:54,010 --> 00:38:06,350
So insert ei becomes,
for j in 1 through k,

639
00:38:06,350 --> 00:38:09,530
the table bit corresponding
to the hash function, j,

640
00:38:09,530 --> 00:38:13,970
of the element is 1.

641
00:38:13,970 --> 00:38:16,030
So each element
sets k bits to 1.

642
00:38:19,050 --> 00:38:21,330
Now how do we check if an
element is in the table?

643
00:38:26,395 --> 00:38:27,270
AUDIENCE: [INAUDIBLE]

644
00:38:34,680 --> 00:38:36,180
PROFESSOR: Since,
for every element,

645
00:38:36,180 --> 00:38:38,980
we set all the corresponding
k bits to 1, now when

646
00:38:38,980 --> 00:38:40,850
we have a new
element, we're going

647
00:38:40,850 --> 00:38:44,900
to compute the k positions,
and if any of them is a 0,

648
00:38:44,900 --> 00:38:47,540
then we couldn't have possibly
put that in the table.

649
00:38:52,610 --> 00:39:03,830
So all T[h_j(f)]
have to be 1.

650
00:39:06,530 --> 00:39:09,070
So for every element,
we hashed it k times,

651
00:39:09,070 --> 00:39:10,990
and set the corresponding bits.

652
00:39:10,990 --> 00:39:17,730
If we have a new element, and
by hashing we get here and here,

653
00:39:17,730 --> 00:39:21,370
but we also get here,
and this guy was a zero,

654
00:39:21,370 --> 00:39:23,137
we know we definitely
didn't put this in.
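[Note: a minimal sketch, not part of the lecture, of the insert and check operations just described. It assumes salted SHA-256 digests as a stand-in for k independent hash functions; the class name BloomFilter and the method names are made up for illustration.]
```python
import hashlib
class BloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m = m  # number of bits in the table
        self.k = k  # number of hash functions
        self.table = [0] * m
    def _positions(self, item: str):
        # Assumption: salting the input with j approximates k independent hash functions.
        for j in range(self.k):
            digest = hashlib.sha256(f"{j}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m
    def insert(self, item: str) -> None:
        # Set the bit chosen by each of the k hash functions.
        for pos in self._positions(item):
            self.table[pos] = 1
    def might_contain(self, item: str) -> bool:
        # Any 0 bit proves the item was never inserted; all 1s may be a false positive.
        return all(self.table[pos] == 1 for pos in self._positions(item))
bf = BloomFilter(m=64, k=3)
bf.insert("e1")
print(bf.might_contain("e1"))
```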

655
00:39:25,940 --> 00:39:28,480
So now what's the probability
of a false positive?

656
00:39:33,420 --> 00:39:36,384
AUDIENCE: My first intuition is
just raising that to a power.

657
00:39:41,324 --> 00:39:44,222
AUDIENCE: The probability
that when you check--

658
00:39:44,222 --> 00:39:46,430
PROFESSOR: Oh, I forgot to
say something, by the way.

659
00:39:46,430 --> 00:39:50,320
The k hash functions-- I think
they satisfy simple uniform

660
00:39:50,320 --> 00:39:51,285
hashing.

661
00:39:51,285 --> 00:39:52,910
I'm not sure if that's
the right thing,

662
00:39:52,910 --> 00:39:55,345
but they all have independent
values from each other.

663
00:39:55,345 --> 00:39:56,470
So they're all independent.

664
00:40:02,080 --> 00:40:05,170
So for any number you
give, any hash function

665
00:40:05,170 --> 00:40:07,000
returns a value that's
independent of all

666
00:40:07,000 --> 00:40:10,551
the other hash functions,
and they're all 0

667
00:40:10,551 --> 00:40:11,300
through m minus 1.

668
00:40:18,500 --> 00:40:20,780
AUDIENCE: Why is that not
just raised to something?

669
00:40:20,780 --> 00:40:22,321
Because we know the
probability-- OK,

670
00:40:22,321 --> 00:40:25,290
actually we need to
recalculate that.

671
00:40:25,290 --> 00:40:27,581
AUDIENCE: Because it's the
probability that all of them

672
00:40:27,581 --> 00:40:30,606
are 1, even though you
haven't hashed yet.

673
00:40:35,930 --> 00:40:38,130
PROFESSOR: So the false
positive, the probability

674
00:40:38,130 --> 00:40:40,280
of false positives
is the probability

675
00:40:40,280 --> 00:40:46,995
that all the k slots that
correspond to f are 1's, right?

676
00:40:54,760 --> 00:41:01,620
So, since the hash functions
are all independent,

677
00:41:01,620 --> 00:41:03,730
this is the probability
that one slot

678
00:41:03,730 --> 00:41:05,810
is the 1, raised to the power k.

679
00:41:05,810 --> 00:41:08,940
Right, because they're
all independent slots.

680
00:41:08,940 --> 00:41:14,160
So it's the probability
that one slot

681
00:41:14,160 --> 00:41:18,740
is a 1, raised to the power k.

682
00:41:18,740 --> 00:41:20,510
OK, so now what's
the probability

683
00:41:20,510 --> 00:41:22,500
that one slot is a 1?

684
00:41:22,500 --> 00:41:26,170
It looks a lot like
this problem, right?

685
00:41:26,170 --> 00:41:27,850
Except there's a tweak.

686
00:41:27,850 --> 00:41:30,294
How many times did we
put the 1 in the table?

687
00:41:33,430 --> 00:41:38,010
So here, we put a 1 in the
table for every element.

688
00:41:38,010 --> 00:41:42,920
So we have n sets, right?

689
00:41:42,920 --> 00:41:49,220
So n times we're going to
set t of something to 1.

690
00:41:52,220 --> 00:41:53,160
Right?

691
00:41:53,160 --> 00:41:55,690
For every element,
we have one set.

692
00:41:55,690 --> 00:41:57,049
We set one bit to 1.

693
00:41:57,049 --> 00:41:59,340
It might have been set
before-- that's something else.

694
00:41:59,340 --> 00:41:59,840
Yes?

695
00:41:59,840 --> 00:42:03,570
AUDIENCE: So here it's
raised to the n k?

696
00:42:03,570 --> 00:42:05,790
PROFESSOR: Yeah, pretty much.

697
00:42:05,790 --> 00:42:08,440
So here, for every element
we hash it through all the k

698
00:42:08,440 --> 00:42:12,210
functions, and set the
corresponding bits to 1.

699
00:42:12,210 --> 00:42:20,370
So one element generates
k set operations,

700
00:42:20,370 --> 00:42:25,230
and we have n elements,
so we set n k bits to 1.

701
00:42:35,060 --> 00:42:36,274
Does this make sense?

702
00:42:36,274 --> 00:42:39,754
AUDIENCE: Can two hash functions
point to the same slot?

703
00:42:39,754 --> 00:42:40,420
PROFESSOR: Sure.

704
00:42:43,216 --> 00:42:44,840
But they're all
independent, and that's

705
00:42:44,840 --> 00:42:46,840
the only thing that matters.

706
00:42:46,840 --> 00:42:50,000
So every time we set the
bit, which bit was set

707
00:42:50,000 --> 00:42:54,380
is independent of all
the other bits we set,

708
00:42:54,380 --> 00:42:56,500
because all the hash
functions are independent,

709
00:42:56,500 --> 00:42:58,560
and all the values are
independent of each other.

710
00:43:01,830 --> 00:43:05,410
So this time, the table size is
still m, so that didn't change.

711
00:43:05,410 --> 00:43:08,750
This time we set n bits to 1,
this time we set n k bits to 1.

712
00:43:08,750 --> 00:43:12,420
So then the right thing
to do is copy this answer,

713
00:43:12,420 --> 00:43:14,790
and replace n with n k.

714
00:43:14,790 --> 00:43:17,330
And if you have to
write the proof,

715
00:43:17,330 --> 00:43:19,969
you'd copy-paste the proof
and replace n with n k.

716
00:43:26,460 --> 00:43:32,900
So this is
1 - ((m - 1)/m)^(n k).

717
00:43:35,815 --> 00:43:37,940
And of course you should
go through the whole thing

718
00:43:37,940 --> 00:43:40,324
in your head and convince
yourselves that this is true.

719
00:43:40,324 --> 00:43:42,490
AUDIENCE: Does that say one
of the elements is what?

720
00:43:42,490 --> 00:43:44,460
k, something?

721
00:43:44,460 --> 00:43:45,460
AUDIENCE: Sets.

722
00:43:45,460 --> 00:43:47,270
PROFESSOR: Bit sets.

723
00:43:47,270 --> 00:43:51,920
So one element sets k bits
in the table, not necessarily

724
00:43:51,920 --> 00:43:54,400
different bits, just
independent bits.

725
00:43:54,400 --> 00:43:56,350
So if you have n
elements altogether,

726
00:43:56,350 --> 00:43:57,913
they set n times k bits.

727
00:44:09,120 --> 00:44:14,880
This thing gets run n times
k times, whereas here,

728
00:44:14,880 --> 00:44:21,180
the set operation gets
run n times in total.

729
00:44:21,180 --> 00:44:22,930
That's the difference
in the two problems.

730
00:44:30,360 --> 00:44:32,560
Right here you have one
function for each element,

731
00:44:32,560 --> 00:44:34,055
here you have k hash functions.

732
00:44:44,240 --> 00:44:46,500
This is hard, right?

733
00:44:46,500 --> 00:44:50,060
Well, it's the hardest
hashing problem

734
00:44:50,060 --> 00:44:51,530
that I could think
about and that

735
00:44:51,530 --> 00:44:54,430
makes us go through
probabilities and through all

736
00:44:54,430 --> 00:44:55,990
the hash stuff.

737
00:44:55,990 --> 00:44:59,200
The problems on the exam will
be easier, so one, don't panic.

738
00:44:59,200 --> 00:45:02,720
Two, review hashing,
review probabilities.

739
00:45:02,720 --> 00:45:06,160
When I said, from the
theory, this is what you get,

740
00:45:06,160 --> 00:45:09,540
if you didn't understand that
then please review the theory.

741
00:45:09,540 --> 00:45:11,550
AUDIENCE: Why is
it raised to the k?

742
00:45:11,550 --> 00:45:14,590
Because we did down there,
if we replace n with n k,

743
00:45:14,590 --> 00:45:18,310
then we'd just get
everything except.

744
00:45:18,310 --> 00:45:22,510
PROFESSOR: So this thing
in here is the answer

745
00:45:22,510 --> 00:45:28,500
to the previous problem,
except you take an n

746
00:45:28,500 --> 00:45:31,640
and you replace it with an n k.

747
00:45:31,640 --> 00:45:36,560
So this is the probability
that one bit is set to 1.

748
00:45:36,560 --> 00:45:38,430
But here, when you're
given an element,

749
00:45:38,430 --> 00:45:41,450
you're going to hash it through
the k functions-- you take

750
00:45:41,450 --> 00:45:44,550
this guy-- you're going to hash
it through the k functions,

751
00:45:44,550 --> 00:45:46,580
and you're going to
check all the bits.

752
00:45:46,580 --> 00:45:50,450
So you're going to check k bits.

753
00:45:50,450 --> 00:45:53,130
So as long as any of the
k bits is a zero, it's not

754
00:45:53,130 --> 00:45:55,140
a false positive.

755
00:45:55,140 --> 00:45:58,812
So we need all the
k bits to be a 1.
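[Note: a minimal sketch, not part of the lecture, of the false positive probability as assembled here: the single-slot answer with n replaced by n k, raised to the power k. It relies on the same independence assumption used in the derivation; the name bloom_false_positive is made up for illustration.]
```python
def bloom_false_positive(n: int, m: int, k: int) -> float:
    # Probability that one fixed slot is 1 after n * k independent bit sets.
    slot_is_one = 1.0 - ((m - 1) / m) ** (n * k)
    # A false positive needs all k checked slots to be 1.
    return slot_is_one ** k
# With k = 1 this reduces to the single-hash-function answer.
print(bloom_false_positive(50, 100, 1), bloom_false_positive(50, 100, 2))
```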

756
00:45:58,812 --> 00:45:59,645
AUDIENCE: Oh, I see.

757
00:46:02,579 --> 00:46:04,977
AUDIENCE: What if the hash
functions are dependent?

758
00:46:04,977 --> 00:46:06,435
PROFESSOR: Then it
becomes intractable.

759
00:46:09,140 --> 00:46:11,530
AUDIENCE: And what if they are?

760
00:46:15,370 --> 00:46:19,770
I think in this problem,
the way they are being hashed,

761
00:46:19,770 --> 00:46:21,566
that becomes
dependent, because I

762
00:46:21,566 --> 00:46:24,670
think there were some problems
where, if something is being

763
00:46:24,670 --> 00:46:27,694
hashed somewhere,
then the probability--

764
00:46:27,694 --> 00:46:29,110
there could be
hash functions that

765
00:46:29,110 --> 00:46:33,980
would put the other
thing in the next slot.

766
00:46:33,980 --> 00:46:36,770
PROFESSOR: Yes, so you want
to reduce these problems

767
00:46:36,770 --> 00:46:37,870
to independent hashing.

768
00:46:37,870 --> 00:46:40,050
If you look at the
proofs, all the proofs

769
00:46:40,050 --> 00:46:42,890
assume uniform hashing,
simple uniform,

770
00:46:42,890 --> 00:46:45,722
whatever it takes to get the
math down to independence.

771
00:46:45,722 --> 00:46:47,305
Because this is the
only thing that we

772
00:46:47,305 --> 00:46:49,310
know how to solve
with probabilities.

773
00:46:49,310 --> 00:46:51,090
If everything is
independent, then things

774
00:46:51,090 --> 00:46:54,700
multiply and add up in the right
places, and everything is easy.

775
00:46:54,700 --> 00:46:56,720
If things are
dependent, then proofs

776
00:46:56,720 --> 00:46:57,890
become really, really hard.

777
00:46:57,890 --> 00:46:59,750
So whenever you have
dependent things,

778
00:46:59,750 --> 00:47:02,397
you want to find a way to reduce
that to independent things.

779
00:47:15,560 --> 00:47:17,860
Is everyone tired,
or do you guys really

780
00:47:17,860 --> 00:47:19,830
not like this problem?

781
00:47:19,830 --> 00:47:23,180
By the way, really
cool trick-- so this

782
00:47:23,180 --> 00:47:25,260
turns out to be a
lot better than that,

783
00:47:25,260 --> 00:47:27,280
and I think the
optimal value of k

784
00:47:27,280 --> 00:47:30,150
is around square root of log n.

785
00:47:30,150 --> 00:47:34,220
And that gives you some
filters with a really low

786
00:47:34,220 --> 00:47:36,244
false positive rate.

787
00:47:36,244 --> 00:47:38,680
AUDIENCE: What do
you mean by optimal?

788
00:47:38,680 --> 00:47:41,080
PROFESSOR: Minimize
the false positives.

789
00:47:41,080 --> 00:47:47,840
So given n and m, pick a k
so that this thing is minimized.
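[Note: a minimal sketch, not part of the lecture, of choosing the number of hash functions by brute force for given n and m. It uses the false positive formula derived above under the independence assumption. For comparison it also prints (m/n) ln 2, the commonly cited approximation for the minimizing value. The names best_k and max_k are made up for illustration.]
```python
import math
def bloom_false_positive(n: int, m: int, k: int) -> float:
    # All k checked slots are 1 after n * k independent bit sets.
    return (1.0 - ((m - 1) / m) ** (n * k)) ** k
def best_k(n: int, m: int, max_k: int = 64) -> int:
    # Brute-force search over k = 1..max_k; max_k is an arbitrary cap.
    return min(range(1, max_k + 1), key=lambda k: bloom_false_positive(n, m, k))
n, m = 1000, 10000
print(best_k(n, m), (m / n) * math.log(2))
```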

790
00:47:47,840 --> 00:47:50,370
AUDIENCE: What was
the answer again?

791
00:47:50,370 --> 00:47:52,812
Or actually, regardless
of that, what's

792
00:47:52,812 --> 00:47:55,540
the percentage of
false positives?

793
00:47:55,540 --> 00:47:58,270
PROFESSOR: It depends on
what your n and m are, right?

794
00:47:58,270 --> 00:48:00,040
The more bits you can afford

795
00:48:00,040 --> 00:48:01,510
AUDIENCE: But if you
maximize your k,

796
00:48:01,510 --> 00:48:06,100
you said you came up with
some k that's maximized

797
00:48:06,100 --> 00:48:07,079
PROFESSOR: I think k is

798
00:48:07,079 --> 00:48:08,370
AUDIENCE: Square root of log n.

799
00:48:10,964 --> 00:48:12,660
AUDIENCE: So then
if you use that.

800
00:48:12,660 --> 00:48:14,206
PROFESSOR: Let's
not do the math.

801
00:48:14,206 --> 00:48:15,595
[LAUGHTER]

802
00:48:15,595 --> 00:48:16,900
It's really, really good.

803
00:48:16,900 --> 00:48:21,540
So these are used for all sorts
of practical problems, all

804
00:48:21,540 --> 00:48:25,268
the way from branch predictors
in processors, to databases.

805
00:48:25,268 --> 00:48:26,836
AUDIENCE: So is
it better than 1%?

806
00:48:26,836 --> 00:48:27,960
Do you know that, at least?

807
00:48:27,960 --> 00:48:33,210
PROFESSOR: Oh yeah, for
practical uses, this gets you,

808
00:48:33,210 --> 00:48:36,030
I think to 1% of 1% of 1%.

809
00:48:41,680 --> 00:48:45,110
So usually, put a Bloom filter
before a really expensive

810
00:48:45,110 --> 00:48:48,300
check, and the Bloom
filter gets rid of most

811
00:48:48,300 --> 00:48:50,264
of the false positives.

812
00:48:50,264 --> 00:48:51,680
And then you have
a few more where

813
00:48:51,680 --> 00:48:53,206
you do the more expensive check.

814
00:49:05,620 --> 00:49:06,952
Okay, does this make sense?

815
00:49:11,240 --> 00:49:11,910
Any questions?

816
00:49:15,290 --> 00:49:19,812
AUDIENCE: Do you get more optimal
if you repeated this Bloom

817
00:49:19,812 --> 00:49:22,040
filter independently
of the other one,

818
00:49:22,040 --> 00:49:25,330
with more hash functions
in that memory structure?

819
00:49:25,330 --> 00:49:31,090
PROFESSOR: I think doubling
the memory size is better.

820
00:49:31,090 --> 00:49:33,600
So two filters is the
same as having 2m bits.

821
00:49:33,600 --> 00:49:36,150
I think doubling gives you
better results, always.

822
00:49:47,500 --> 00:49:48,650
OK, so general stuff.

823
00:49:48,650 --> 00:49:51,610
We're going to have a lot
of conceptual questions,

824
00:49:51,610 --> 00:49:55,300
so please make sure, again,
make sure that for everything

825
00:49:55,300 --> 00:49:57,130
that we did, go
through the problem.

826
00:49:57,130 --> 00:50:00,760
Understand the problem, know
that there is a solution.

827
00:50:00,760 --> 00:50:02,430
Know the running
time, maybe know

828
00:50:02,430 --> 00:50:03,880
how to implement the solution.

829
00:50:03,880 --> 00:50:05,905
Don't worry so much
about the proof.

830
00:50:05,905 --> 00:50:07,530
We're going to have
some problems where

831
00:50:07,530 --> 00:50:10,150
you have to come up with
new things on your own,

832
00:50:10,150 --> 00:50:12,950
so get a good night's
sleep before the exam.

833
00:50:12,950 --> 00:50:14,495
Really, if you have
five hours left,

834
00:50:14,495 --> 00:50:16,620
then you have to choose
between sleeping five hours

835
00:50:16,620 --> 00:50:18,985
or reading notes
for five hours--

836
00:50:18,985 --> 00:50:21,060
AUDIENCE: Drink caffeine.

837
00:50:21,060 --> 00:50:22,480
PROFESSOR: It's
not going to help,

838
00:50:22,480 --> 00:50:24,300
so caffeine actually
helps you stay up,

839
00:50:24,300 --> 00:50:26,280
but it decreases
your performance.

840
00:50:26,280 --> 00:50:30,030
And so if you're on caffeine,
you're not going to think.

841
00:50:30,030 --> 00:50:33,530
You can regurgitate stuff,
but you can't think.

842
00:50:33,530 --> 00:50:36,556
So caffeinating yourself is a--

843
00:50:36,556 --> 00:50:39,670
AUDIENCE: I thought it was like
it gives you concentration.

844
00:50:39,670 --> 00:50:42,460
PROFESSOR: So there's an optimum
amount of sleep and caffeine

845
00:50:42,460 --> 00:50:43,080
combination.

846
00:50:43,080 --> 00:50:45,240
If you don't sleep and
caffeinate yourself,

847
00:50:45,240 --> 00:50:46,720
I guarantee that
you will not solve

848
00:50:46,720 --> 00:50:49,246
any of the problems that
require new algorithms.

849
00:50:49,246 --> 00:50:51,620
AUDIENCE: Caffeine just squirts
adrenaline in your brain.

850
00:50:51,620 --> 00:50:54,870
It doesn't do anything else.

851
00:50:54,870 --> 00:50:57,370
PROFESSOR: So the thing is the
memory is going to be better.

852
00:50:57,370 --> 00:50:59,270
If all you're doing
is memorization stuff,

853
00:50:59,270 --> 00:51:01,230
then it's going to be better.

854
00:51:01,230 --> 00:51:03,850
So you're going to do well on
the pattern matching stuff.

855
00:51:03,850 --> 00:51:05,290
But when your
brain is panicking,

856
00:51:05,290 --> 00:51:07,540
you're not going to come up
with new solutions, right?

857
00:51:07,540 --> 00:51:10,030
Usually, you have a
problem, a hard problem.

858
00:51:10,030 --> 00:51:12,370
You're thinking about it,
and then at some point

859
00:51:12,370 --> 00:51:14,495
when you're relaxed, like
when you're in the shower

860
00:51:14,495 --> 00:51:18,380
or when you wake up you're
like, crap, I found a solution.

861
00:51:18,380 --> 00:51:20,470
So the brain finds
solutions when it's relaxed,

862
00:51:20,470 --> 00:51:23,160
not when it's like, holy
shit, holy shit, holy shit.

863
00:51:23,160 --> 00:51:26,570
And adrenaline gets
it in that mood.

864
00:51:26,570 --> 00:51:27,700
That's what it does.

865
00:51:27,700 --> 00:51:29,920
And that's what caffeine
does in the end.

866
00:51:29,920 --> 00:51:32,970
So a little bit of caffeine
might help you get up

867
00:51:32,970 --> 00:51:35,250
and get you running,
but don't caffeinate

868
00:51:35,250 --> 00:51:37,990
yourself to not sleep
the entire night.

869
00:51:37,990 --> 00:51:40,520
That's probably going to make
you bomb the hard questions.

870
00:51:40,520 --> 00:51:41,550
Good luck on Friday.

871
00:51:41,550 --> 00:51:43,140
Eat candy.