The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Good morning, everyone. Morning. Let's get started. This is the second of two lectures on numerics. Last time we had this motivating question of finding the millionth digit of the square root of 2, or the square root of other quantities that end up being irrational. We talked about high-precision arithmetic, and we used Newton's method to compute square roots. You saw a demo of computing square roots, but there are a few things missing. We don't quite know how to do division, which is required for Newton's method, and we didn't really talk at all about algorithmic complexity, beyond the complexity of multiplication.
So multiplication is a primitive that at this point we know how to do in a couple of different ways, including the naive order n-squared algorithm and the Karatsuba algorithm, which is something like n raised to 1.58. But how many times is multiplication called when you compute square roots? In fact, multiplication is called by the division operator, which in turn is called when you compute square roots. So there are really two levels of computation going on here, and we need to open this up, look at it in detail, and figure out what our overall algorithmic complexity is. That's really the meat of today's lecture: getting to the point where we know what we have with respect to the asymptotic complexity of computing the square root of a number.

So let me start with a review of what we covered last time. We decided that we wanted the millionth digit of the square root of 2. And the way we're going to do this is by working with integers: we compute the floor, since we need it to be an integer, of the square root of 2 times 10 raised to 2d, where d is the number of digits of precision. That's the n over there.
So we'll take a look at an example or two here as to how this works with integers. But what we do is compute, essentially, the floor of the square root of some quantity a via Newton's method. And the way Newton's method works is you go through an iteration. You start with x_0 = 1, which is your initial guess, and compute x_(i+1) = (x_i + a/x_i) / 2. As you can see, this requires division, because we're computing a divided by x_i. That's the outer Newton iteration. And I said a couple of things that implied you're going to have a quadratic rate of convergence: the precision, in terms of the number of digits, is going to increase by a factor of 2 every iteration. So if you start out with one digit of precision, you go to two, then four, eight, et cetera. That's a geometric progression, which means we're going to have a logarithmic number of iterations, which is nice. And we were all happy about that, and you believed me. I gave you an example and it looked pretty good, but I didn't really prove anything about the rate of convergence.
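[The integer-only version of this iteration can be sketched in Python. The function name and termination test here are editorial choices, not from the lecture; the lecture's setup is computing floor(sqrt(2 * 10^(2d))).]

```python
def isqrt_newton(a):
    """Floor of sqrt(a) for a nonnegative integer a, via Newton's method.

    Runs x_(i+1) = (x_i + a // x_i) // 2 entirely on integers,
    so there is no floating-point precision limit.
    """
    if a < 2:
        return a
    x = a  # any starting guess >= floor(sqrt(a)) works here
    while True:
        y = (x + a // x) // 2
        if y >= x:       # no longer decreasing: x is floor(sqrt(a))
            return x
        x = y

# the millionth-digit idea at small scale: d digits of sqrt(2)
d = 30
print(isqrt_newton(2 * 10**(2 * d)))
```

The printed integer is the first d+1 digits of sqrt(2) with the decimal point removed, which is exactly the trick described above.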
What I'd like to do now is take a look at this particular iterative computation, where we're computing x_(i+1) given x_i, and argue that it does, in fact, have a quadratic rate of convergence. You can think of this as doing an error analysis of Newton's method. Let's say that x_n = sqrt(a) * (1 + epsilon_n), where epsilon_n may be positive or negative. So we have an error associated with x_n in the n-th iteration, relative to what we want, which is the square root of a. It's off by something, possibly a large quantity in the beginning. We want to show convergence, so obviously we want epsilon_n to tend to 0 as n becomes large. How fast does it approach 0? That's the question. So take this equation and plug it into that, and ask: what is x_(n+1)? It's x_(n+1) = (sqrt(a) * (1 + epsilon_n) + a / (sqrt(a) * (1 + epsilon_n))) / 2. I'm just plugging in the value of x_n.
Then, with a couple of steps of algebraic simplification, you can pull out the sqrt(a) here, and you have (1 + epsilon_n) plus 1/(1 + epsilon_n), the whole thing divided by 2. And if you keep going -- there's one step that I'm skipping here in terms of simplification -- let me write the last result out: x_(n+1) = sqrt(a) * (1 + epsilon_n^2 / (2 * (1 + epsilon_n))). So what do we have here, in terms of the overall observation for epsilon_(n+1), the error in the (n+1)-th iteration, given that you have an epsilon_n error in the n-th iteration? You have the relationship epsilon_(n+1) = epsilon_n^2 / (2 * (1 + epsilon_n)), so epsilon_(n+1) is related to epsilon_n squared. And this part down at the bottom: as n becomes large, epsilon_n is going to go to 0, assuming a decent initial guess, so you can say the (1 + epsilon_n) factor is essentially 1. Which means you have this quadratic rate of convergence, where the error, which is a small quantity, is getting squared at every iteration.
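[That error recurrence is easy to check numerically; here is a small editorial sketch in ordinary floating point, with a = 2 and epsilon_n defined by x_n = sqrt(a) * (1 + epsilon_n) as above.]

```python
a = 2.0
root = a ** 0.5

x = 3.0                        # deliberately poor starting guess
for _ in range(4):
    eps = x / root - 1         # epsilon_n, the current relative error
    x = (x + a / x) / 2        # one Newton step gives x_(n+1)
    eps_next = x / root - 1
    predicted = eps**2 / (2 * (1 + eps))   # the derived recurrence
    print(f"actual {eps_next:.3e}   predicted {predicted:.3e}")
```

The two columns agree to roundoff, and the number of correct digits roughly doubles each step.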
And so if you have something like a 0.01 error at the beginning for epsilon_n, epsilon_n squared is going to be 0.0001. So that's where you get the quadratic rate of convergence. It really comes from this relationship between epsilon_n squared and epsilon_(n+1). Any questions about this?

Great. So given the quadratic rate of convergence, if you want to get to d digits of precision like I have here, you can argue that you need log d iterations. So that's kind of nice: you have a logarithmic number of iterations. I'm going to get back to that. There's one little subtlety associated with the asymptotic analysis that goes beyond simply the number of iterations and the digits of precision. But so far so good. We're happy with this logarithmic number of iterations. And if we now want the complexity of the division, then obviously we need an algorithm for it.
But once we have an algorithm and we figure out what the complexity of the division algorithm is, then we have the complexity of computing the square root of 2, or the square root of a, using Newton's method.

So that justifies what I said last time with respect to the quadratic rate of convergence. Now, we also talked about multiplication last time, and I want to revisit that. You have multiplication algorithms, and we want to be able to multiply d-digit numbers. There's the naive algorithm, and you could imagine doing divide and conquer. So you take x1, x0 and y1, y0, where x1 is the most significant half of x -- you're trying to multiply x times y -- and the same thing for y1 and y0. Each of these halves will have d/2 digits of precision. And if you implement the naive divide-and-conquer algorithm, which looks like T(n) = 4 T(n/2) + Theta(n), you end up with Theta(n^2) complexity, because you have to do four multiplications, corresponding to x1 times y1, x1 times y0, et cetera.
At each level in the recursion tree, you're breaking things down by a factor of 2 with respect to the digits of precision that you need to multiply as you go down the tree. And with the four multiplications, you get your Theta(n^2) complexity. This gentleman by the name of Karatsuba recognized that you could play a few mathematical tricks, which I won't go over again, and reduce that to three multiplications. You do a few more additions, but given that the additions have Theta(n) complexity, the recurrence relationship turns into T(n) = 3 T(n/2) + Theta(n). And this ends up having n raised to 1.58-dot-dot-dot complexity.

There's no reason to stop at breaking things up into two parts. You could imagine generalizing Karatsuba, and people have done this. Two different researchers, Toom and Cook, generalized Karatsuba to the case where k is greater than or equal to 2, where you're breaking each number into k parts. So the Toom-Cook 2 algorithm is basically Karatsuba, but you have Toom 3, Toom 4, and so on. I'm not going to give you a lot of details on this.
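[The three-multiplication trick can be sketched on Python integers. This is illustrative only: base-10 splitting, no cutoff tuning, so it won't beat the naive method at practical sizes.]

```python
def karatsuba(x, y):
    """Multiply nonnegative integers x and y with 3 recursive multiplies
    per level instead of 4: the T(n) = 3 T(n/2) + Theta(n) recurrence."""
    if x < 10 or y < 10:
        return x * y                       # base case: a single-digit operand
    h = max(len(str(x)), len(str(y))) // 2
    x1, x0 = divmod(x, 10 ** h)            # x = x1 * 10^h + x0
    y1, y0 = divmod(y, 10 ** h)
    z2 = karatsuba(x1, y1)                 # high * high
    z0 = karatsuba(x0, y0)                 # low * low
    # one more multiply recovers both cross terms:
    # (x1 + x0)(y1 + y0) - z2 - z0 == x1*y0 + x0*y1
    z1 = karatsuba(x1 + x0, y1 + y0) - z2 - z0
    return z2 * 10 ** (2 * h) + z1 * 10 ** h + z0

print(karatsuba(1234, 5678))  # 7006652
```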
We don't expect you to work on this, at least in 6.006. But just to give you a sense of what happens, the Toom 3 method, or the Toom-Cook 3 method, breaks a number up into three parts. So each of these parts will have d/3 digits of precision. You're starting out with a d-digit number, but at the very first level of recursion you're going to break it up into three numbers x_i that are d/3 digits long, and the same thing for y. Now, if you did a naive multiplication of this, how many multiplications do I need? If I just forget about any mathematical tricks and try to multiply these things out, how many (d/3)-digit-by-(d/3)-digit multiplications do I need?

AUDIENCE: Nine.

PROFESSOR: Nine. So if you can beat nine using mathematical tricks, you have a better divide-and-conquer algorithm. And it turns out that Toom 3 plays some arithmetic games and ends up with a recurrence relationship that looks like T(n) = 5 T(n/3) + Theta(n), where you reduce the nine multiplications down to five. So that's a win. And that ends up being Theta of n raised to what?
Someone? Someone loudly. Log--

AUDIENCE: Base 3.

PROFESSOR: Log, base 3, of 5. Another irrational number. And this ends up being n raised to 1.465. So you won, if you use Toom 3, assuming the constants work out. And Victor can say a little bit more about that, because we're having a little trouble justifying this particular problem set question that we want to give you, given the constant factors involved. So the issue really is this: this is correct, it's n raised to 1.46; that one is n raised to 1.58; and then the naive algorithm is n squared. But how big does n have to be in order for the n^1.58 algorithm to beat the n^2 algorithm, and for the n^1.46 algorithm to beat the n^1.58 algorithm, et cetera? It turns out n needs to be really, really large if you implement these in Python. So we're having a little trouble here giving you this pristine problem set where you can go off and learn about multiplication, and also appreciate asymptotic complexity.
So that's a bit of a catch-22. Anyway, for the purposes of theory, this is great. And it turns out people have done even better. Multiplication is just this obviously incredibly important primitive that you would need for doing any reasonable computation. And so people have worked on using things like fast Fourier transforms and other techniques to improve the complexity of multiplication. The best scheme until a few years ago was the Schonhage-Strassen scheme, which is almost linear in complexity: it's n log n log log n time. And it uses the fast Fourier transform, the FFT. You can play with all of these things -- Karatsuba, the naive algorithm, Toom 3, et cetera -- in the gmpy package in Python. And you can see what the value of n needs to be in order for one of these algorithms to beat another. This is not something you're going to do specifically in the problem set, but I mention it as an aside: these algorithms are implemented, and they're used in real life. Eric?
ERIC: It may be worth mentioning that Python itself uses Karatsuba for long integers.

PROFESSOR: Yeah. So beyond a certain n, there are decisions that are made within the package, and Python shifts to Karatsuba after n becomes large. But if n is small, it's going to run the naive algorithm. Now, if you write your own multiplication, you can do whatever you want. You can have your own adaptive scheme, assuming you have many of these algorithms implemented, or you're calling them using the gmpy package.

So lastly, this looked pretty good for a while, and then, from a theoretical standpoint, there was a breakthrough. A guy by the name of Furer came up with an algorithm that is n log n -- and let me write this carefully -- times 2 raised to big O -- that's an upper bound -- of log star n. Does that make sense? No. I'll have to explain it. OK, so what does this mean? This part is clear: this is like sorting. It doesn't really use sorting, but that's n log n. And then you have this 2 raised to big O of log star n.
I need to define what log star n is. Log star n is what's called the iterated logarithm. I guess it's computed by an iterative algorithm, but it computes logs. The iterated logarithm is the number of times log needs to be applied to get a result that is less than or equal to 1. So this thing really cuts you down to size really fast. It doesn't matter where you start. You could be at 10 raised to 24, or 2 raised to 50, let's say, if you were doing binary logs. In the very first iteration you go down to 50, right? Then you take the log of 50 and you go down to about 6, and then you take the log of that, and if you're talking about base 2, like we were, you're down to less than 3. So in four or five iterations, you're down to less than or equal to 1. And that's what log star n computes. It's not the logarithm so much as the number of times you have to apply log to get a result that's less than or equal to 1. So you have these giant numbers, and it's only like five, six, eight times that you apply log before you're down to one.
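[That definition translates directly into a few lines of Python, using binary logs as in the lecture.]

```python
import math

def log_star(n):
    """Iterated logarithm: how many times log2 must be applied to n
    before the result drops to <= 1."""
    count = 0
    x = float(n)
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

# 2^50 -> 50 -> ~5.6 -> ~2.5 -> ~1.3 -> ~0.4: five applications
print(log_star(2 ** 50))
```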
So for all practical purposes -- and this is an upper bound -- you can think of this, even though it's 2 raised to something, as 2 raised to a pretty small number. Even 2 raised to 10 would only be about 1,000. And so from an asymptotic complexity standpoint, this is the winner. From a practical standpoint, Schonhage-Strassen is really what you probably want to use when n becomes very large, into the billions and so on and so forth. And as of now, to the best of my knowledge, this newer algorithm hasn't been implemented in the gmpy package. So if you actually want to use gmpy, Schonhage-Strassen is where you stop.

So that's multiplication. We have a bunch of different ways you could do multiplication. What I'd like to do now is give you a sense of, assuming a given complexity of multiplication, how long division would take. We are a lecture and a half in, and I haven't really told you how we're going to do division, which is what we have to do when we compute a divided by x_i, the basic iteration in Newton's method. So let's get to that. So, finally: high-precision division.
We want a high-precision representation of a divided by b. And we're going to compute a high-precision representation of 1 divided by b first. What we mean by that is that we'll compute the floor of r divided by b, where r is a really large value and, more importantly, it's easy to divide by r in a particular base. So for example, with r = 2^k, when we use base 2, you can easily divide by r with a shift operation. If I give you r divided by b, this long computer word in base 2, which could typically have millions of digits in its representation, I can shift it by the appropriate amount. Given r divided by b, I can get 1 over b by shifting that quantity. So it feels like, hey, wait a minute, why are we dividing by r? Well, remember that you want 1 over b. And if you're computing the floor of r divided by b, and you actually want 1 over b, which you could then multiply by a so you can run your Newton iteration, then you want to divide by r. And that division is essentially going to be something that shifts things to the right.
So the most significant bits move to the right, and you get a smaller number. Does that make sense? We all know how to divide by shifting, assuming the bases work out right. And if you had a decimal representation, you could certainly divide by 10 raised to k easily. You've done this many times: you just move the decimal point when you're working with decimal arithmetic. When you divide 72 by 100, you get 0.72. And it's a very similar notion here; it doesn't really matter what base you're talking about. So that's the setup. That's how we're going to try to tackle this division problem. But we still have this problem of computing r divided by b. So how are we going to compute r divided by b? We want this computed to a large number of digits of precision, so we're going to use Newton's method again. You've got some nonlinearity here with respect to 1 over x, and we're going to use Newton's method again. And we'll have to hope that this works out: that Newton's method will converge, and that it will require only operations that we know how to do.
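[To make the role of r concrete before the derivation, here is a small sketch with toy numbers of my own choosing; r = 2^k so that dividing by r is a right shift.]

```python
k = 64
r = 1 << k              # r = 2**64; dividing by r is just "shift right by k"

b = 7
rb = r // b             # floor(r / b): a k-bit fixed-point version of 1/7

# multiply by rb, then shift right by k: this approximates a // b
a = 10**12
print((a * rb) >> k)    # same value as a // b here
print(a // b)
```

With enough bits in k relative to the sizes of a and b, the shifted product matches the true floor quotient; the point is that the only hard part left is computing floor(r / b) itself.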
And all of this is going to work out really well. I'm going to set up a function f(x) = 1/x - b/r. What this means is that this function has a zero at x = r/b. So if I try to find the zero of this function, and I start out with a decent initial guess, I'm going to end up with r divided by b. And if I'm working with integers, effectively that's the floor of r divided by b. Then I do my shift and I end up with 1 over b. So, for someone who remembers differentiation: if you're going to apply Newton's method, tell me what the derivative of f(x) is. Somebody's stretching at the back, but I don't think that was an answer. Someone at the back? Too easy a question? For the cushion.

AUDIENCE: Minus 1 over x squared.

PROFESSOR: Minus 1 over x squared. Who's that? All right. You can come pick this up. Whatever. Cut the monotony here. Just veered to the left. I think next time I'm going to weight them or something.
Let's just do frisbees next time. It makes it easy. Forget cushions. No? Frisbees or cushions? How many want frisbees? How many want cushions? Frisbees win.

So you've got the derivative: f'(x) = -1/x^2. And then you go off and apply Newton's method -- I'm not going to go through the symbolic derivation here, but it's basically the same as we did before. You compute a tangent, and the new value x_(i+1), given the value x_i, is the x-intercept, and we need the derivative to compute that. Bottom line, you have x_(i+1) = x_i - f(x_i) / f'(x_i). That's the Newton iteration. And it's worth plugging in the various values here: 1/x_i - b/r, that's f(x_i) on top, divided by -1/x_i^2, the derivative, at the bottom. All I'm doing is plugging things in, but you want to visualize this, because this is really what we need to compute.
455 00:30:01,620 --> 00:30:10,820 And we have xi plus 1 equals xi plus xi squared times 456 00:30:10,820 --> 00:30:16,050 1 over xi minus b divided by r. 457 00:30:16,050 --> 00:30:24,800 And finally I get 2xi minus b xi squared divided by r. 458 00:30:24,800 --> 00:30:26,800 That is key. 459 00:30:26,800 --> 00:30:29,690 This is pretty important. 460 00:30:29,690 --> 00:30:32,380 So let's look all the way to the left, which 461 00:30:32,380 --> 00:30:37,620 is xi plus 1, all the way to the right, 2 times xi. 462 00:30:37,620 --> 00:30:40,610 That doesn't scare us, 2 times something. 463 00:30:40,610 --> 00:30:43,070 Especially base 2, pretty easy. 464 00:30:43,070 --> 00:30:43,840 That's a multiply. 465 00:30:43,840 --> 00:30:45,340 Multiplies don't scare us because we 466 00:30:45,340 --> 00:30:47,090 know how to do multiplies anyway. 467 00:30:47,090 --> 00:30:49,810 This is a simple multiply. 468 00:30:49,810 --> 00:30:52,061 And then I've got a square here. 469 00:30:52,061 --> 00:30:52,560 Square. 470 00:30:52,560 --> 00:30:54,130 Not a square root. 471 00:30:54,130 --> 00:30:57,240 Squares don't scare us because that's a multiply, 472 00:30:57,240 --> 00:30:59,700 just multiplying the same number to itself. 473 00:30:59,700 --> 00:31:01,230 And this doesn't scare us because we 474 00:31:01,230 --> 00:31:06,620 know that we've chosen r to be an easy division. 475 00:31:06,620 --> 00:31:12,140 So all of the operations here are either easy, 476 00:31:12,140 --> 00:31:15,720 or they require a multiply. 477 00:31:15,720 --> 00:31:19,290 So remember I'm going to put a picture up towards the end here 478 00:31:19,290 --> 00:31:23,280 that tells you the overall structure for computing 479 00:31:23,280 --> 00:31:25,400 square root of a or square root of 2. 480 00:31:25,400 --> 00:31:29,010 But we've just sort of sold out to Newton, if you will. 
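For readers following along in code, here is a minimal Python sketch of that reciprocal iteration, assuming r = 2**k so the division by r is a shift. The function name and the final plus-or-minus-one correction are illustrative additions, not from the lecture (with integer floors, the iteration can settle one away from the exact floor of r over b):

```python
def reciprocal_newton(b, k):
    """Approximate floor(2**k / b) for b >= 2 via the Newton iteration
    x_{i+1} = 2*x_i - b*x_i**2 / r, where r = 2**k makes the division a shift."""
    r = 1 << k
    # Easy initial guess: divide r by the nearest power of two below b.
    x = r >> (b.bit_length() - 1)
    while True:
        x_new = 2 * x - ((b * x * x) >> k)
        if abs(x_new - x) <= 1:       # quadratic convergence has bottomed out
            x = max(x, x_new)
            while b * x > r:          # final correction to the exact floor
                x -= 1
            while b * (x + 1) <= r:
                x += 1
            return x
        x = x_new
```

For the example coming up in the lecture, reciprocal_newton(5, 16) returns 13107, the floor of 2**16 divided by 5.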
481 00:31:29,010 --> 00:31:32,360 Because we said that we're going to use Newton's method 482 00:31:32,360 --> 00:31:39,960 to compute essentially, iteratively, square root of a. 483 00:31:39,960 --> 00:31:43,350 And within the Newton method, the first iteration, 484 00:31:43,350 --> 00:31:44,890 if you will, of the Newton method, 485 00:31:44,890 --> 00:31:47,750 we had to compute a reciprocal. 486 00:31:47,750 --> 00:31:49,730 We had to compute 1 over xi. 487 00:31:49,730 --> 00:31:52,070 And in order to compute 1 over xi, 488 00:31:52,070 --> 00:31:56,320 we're going to apply Newton's method again like I showed over 489 00:31:56,320 --> 00:31:58,270 here and over there. 490 00:31:58,270 --> 00:32:03,920 And so that division is going to require iteration. 491 00:32:03,920 --> 00:32:09,150 But the iteration at the second level is one of multiplication. 492 00:32:09,150 --> 00:32:11,180 You're gonna repeatedly apply multiplication 493 00:32:11,180 --> 00:32:13,480 because you're going to go xi plus 1 494 00:32:13,480 --> 00:32:17,890 based on xi using multiplication and some easy operations. 495 00:32:17,890 --> 00:32:22,070 And then you go xi plus 2, xi plus 3, and so on and so forth. 496 00:32:22,070 --> 00:32:24,062 That make sense? 497 00:32:24,062 --> 00:32:27,590 I'll try and put this up to give you the complete picture 498 00:32:27,590 --> 00:32:32,250 once we're done talking about the division 499 00:32:32,250 --> 00:32:35,736 algorithm and its complexity. 500 00:32:35,736 --> 00:32:37,110 But before I do that, I just want 501 00:32:37,110 --> 00:32:41,440 to give you a sense of the convergence of this scheme. 502 00:32:41,440 --> 00:32:43,820 Again, I want to give you an example first, 503 00:32:43,820 --> 00:32:46,000 and then I'll argue about the convergence. 504 00:32:50,330 --> 00:32:52,230 You have to run this iteratively. 
505 00:32:52,230 --> 00:32:54,820 You've got to make i large enough to get to the point 506 00:32:54,820 --> 00:32:59,230 where you have your digits of precision. 507 00:32:59,230 --> 00:33:01,340 And just as an example, let's say 508 00:33:01,340 --> 00:33:08,387 we want r divided by b equals 2 raised to 16 divided by 5. 509 00:33:08,387 --> 00:33:10,220 So this is a fairly straightforward example. 510 00:33:10,220 --> 00:33:14,660 But when you get up to integers, it turns out it's evocative. 511 00:33:14,660 --> 00:33:21,760 So r was selected to be 2 raised to k to make for easy division. 512 00:33:21,760 --> 00:33:26,780 And what I really want is that. 513 00:33:26,780 --> 00:33:31,540 And I want to see how I get to that using Newton's method. 514 00:33:31,540 --> 00:33:42,180 And our initial guess, let's say we 515 00:33:42,180 --> 00:33:45,110 try 2 raised to 16 divided by 4, because we 516 00:33:45,110 --> 00:33:49,010 know how to divide by a power of two. 517 00:33:49,010 --> 00:33:50,600 And so that's 2 raised to 14. 518 00:33:50,600 --> 00:33:51,890 And that's our initial guess. 519 00:33:51,890 --> 00:33:56,400 So think of that as being x0. 520 00:33:56,400 --> 00:33:58,160 That is x0. 521 00:33:58,160 --> 00:34:02,834 And that's 16384. 522 00:34:02,834 --> 00:34:10,679 x1 is going to be 2 times 16384, which is exactly that, 523 00:34:10,679 --> 00:34:16,850 minus 5 times 16384 squared. 524 00:34:16,850 --> 00:34:19,610 So now you're starting to square a fairly big number. 525 00:34:19,610 --> 00:34:22,159 And obviously if you'd started with an even bigger r, 526 00:34:22,159 --> 00:34:23,820 this would be a large number. 527 00:34:26,920 --> 00:34:35,661 Divided by 65536 equals-- and this is 12288. 528 00:34:35,661 --> 00:34:38,989 So you really have one digit of precision there. 529 00:34:38,989 --> 00:34:46,440 But the next time around, you get 2 times 12288 minus 5 530 00:34:46,440 --> 00:34:53,159 times 12288 squared divided by 65536. 
531 00:34:53,159 --> 00:34:55,239 And this division is easy. 532 00:34:55,239 --> 00:34:55,780 It's a shift. 533 00:34:55,780 --> 00:34:59,810 You get to 13056. 534 00:34:59,810 --> 00:35:01,640 And I won't write this whole thing out, 535 00:35:01,640 --> 00:35:07,080 but if you take that, the next thing you'll get is 13107. 536 00:35:07,080 --> 00:35:11,710 So as you can see, there's rapid convergence here. 537 00:35:11,710 --> 00:35:16,660 And you can actually do a very similar analysis to the epsilon 538 00:35:16,660 --> 00:35:18,620 analysis-- and I'll put it in the notes, 539 00:35:18,620 --> 00:35:22,220 but I won't do it here-- that I did for the square root 540 00:35:22,220 --> 00:35:24,860 iteration to show that you have a quadratic rate 541 00:35:24,860 --> 00:35:31,800 of convergence when you apply Newton's method to division as 542 00:35:31,800 --> 00:35:33,700 well. 543 00:35:33,700 --> 00:35:38,190 So you can prove that using the symbolic analysis that we 544 00:35:38,190 --> 00:35:41,420 did, very similar to the epsilon n relationship 545 00:35:41,420 --> 00:35:42,949 to epsilon n plus 1. 546 00:35:42,949 --> 00:35:44,740 I'd suggest that it's a difference equation 547 00:35:44,740 --> 00:35:48,230 here so that analysis is not exactly the same. 548 00:35:48,230 --> 00:35:50,330 But you can run through that, and you 549 00:35:50,330 --> 00:35:53,200 can read that in the notes. 550 00:35:53,200 --> 00:35:54,700 So we're in business. 551 00:35:54,700 --> 00:35:57,250 Finally things are looking up with respect 552 00:35:57,250 --> 00:36:00,850 to being able to actually implement this in practice. 553 00:36:00,850 --> 00:36:02,850 I want to talk about complexity. 
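That rapid convergence is easy to check numerically. A small sketch reproducing the board example (the variable names are mine; the shift by 16 is the easy division by r = 2**16):

```python
# Reproduce the board example: r / b = 2**16 / 5, initial guess x0 = 2**14.
b, k = 5, 16
x = 1 << 14                          # x0 = 16384
trace = [x]
for _ in range(3):
    x = 2 * x - ((b * x * x) >> k)   # x_{i+1} = 2*x_i - b*x_i**2 / r
    trace.append(x)
print(trace)   # [16384, 12288, 13056, 13107] -- and floor(2**16 / 5) = 13107
```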
554 00:36:02,850 --> 00:36:05,720 And I promised that there was a subtlety associated 555 00:36:05,720 --> 00:36:11,400 with the complexity of division in relation to multiplication, 556 00:36:11,400 --> 00:36:16,490 but let me just go over and write down what I just told you 557 00:36:16,490 --> 00:36:19,220 with respect to the number of iterations 558 00:36:19,220 --> 00:36:23,150 that division requires. 559 00:36:23,150 --> 00:36:30,880 So division, quadratic convergence. 560 00:36:35,020 --> 00:36:43,470 So number of digits doubles at each step. 561 00:36:43,470 --> 00:36:44,730 Good news. 562 00:36:44,730 --> 00:36:56,770 So d digits of precision, log d iterations. 563 00:37:00,130 --> 00:37:05,320 Now let's say that we have a particular algorithm 564 00:37:05,320 --> 00:37:08,820 for multiplication that I'm just going to say, 565 00:37:08,820 --> 00:37:13,330 since we have so many different algorithms, 566 00:37:13,330 --> 00:37:18,660 I'm going to say multiplication runs in theta n raised 567 00:37:18,660 --> 00:37:24,290 to alpha time, where alpha is greater than or equal to 1. 568 00:37:24,290 --> 00:37:26,750 I just want to be general about it. 569 00:37:26,750 --> 00:37:32,640 And so assuming that I have a multiplication algorithm that 570 00:37:32,640 --> 00:37:35,150 can run in theta n raised to alpha, 571 00:37:35,150 --> 00:37:40,950 where clearly you know alpha can be 1.46 for Toom 3, et cetera. 572 00:37:40,950 --> 00:37:45,360 And it's not quite that for Schonhage-Strassen, 573 00:37:45,360 --> 00:37:50,450 but I just want to be working with one particular complexity. 574 00:37:50,450 --> 00:37:52,340 So I'll parameterize it in this fashion. 575 00:37:52,340 --> 00:37:56,620 And everything I say is going to be true for Schonhage-Strassen 576 00:37:56,620 --> 00:37:58,090 and Furer as well. 577 00:37:58,090 --> 00:38:00,890 But first, easy question. 
578 00:38:00,890 --> 00:38:03,890 What is the complexity of division 579 00:38:03,890 --> 00:38:09,610 using the analysis that I've put on the board so far? 580 00:38:09,610 --> 00:38:14,064 n digit numbers it's going to be? 581 00:38:14,064 --> 00:38:14,980 I wanna hear from you. 582 00:38:17,630 --> 00:38:21,640 How many hard multipliers do I have? 583 00:38:24,520 --> 00:38:25,476 Log of? 584 00:38:25,476 --> 00:38:26,270 AUDIENCE: n. 585 00:38:26,270 --> 00:38:27,530 PROFESSOR: Log of n, right? 586 00:38:27,530 --> 00:38:30,400 It wasn't a hard question. 587 00:38:30,400 --> 00:38:37,240 So division would be theta log n times n raised to alpha. 588 00:38:40,059 --> 00:38:40,850 Everybody buy that? 589 00:38:44,651 --> 00:38:45,150 No? 590 00:38:50,144 --> 00:38:51,560 Ask a question if you're confused. 591 00:38:55,610 --> 00:39:00,620 Maybe I should say everybody buy that? 592 00:39:04,797 --> 00:39:06,130 How many people agree with that? 593 00:39:06,130 --> 00:39:07,737 Big O? 594 00:39:07,737 --> 00:39:09,070 How many people agree with that? 595 00:39:12,099 --> 00:39:12,890 Yeah, that's right. 596 00:39:12,890 --> 00:39:16,170 Big O. I'm hedging my bets here. 597 00:39:16,170 --> 00:39:19,680 I'm just saying big O. I could say big O of n cubed 598 00:39:19,680 --> 00:39:21,370 and you should all agree with me. 599 00:39:21,370 --> 00:39:22,780 Or big O of whatever. 600 00:39:22,780 --> 00:39:23,816 You had a question? 601 00:39:23,816 --> 00:39:25,482 AUDIENCE: What's the longest [INAUDIBLE] 602 00:39:25,482 --> 00:39:27,565 number of [INAUDIBLE] we need to get 603 00:39:27,565 --> 00:39:29,310 a certain level of [INAUDIBLE]? 604 00:39:29,310 --> 00:39:30,310 PROFESSOR: That's right. 
605 00:39:30,310 --> 00:39:35,990 So if you want d digits of precision, 606 00:39:35,990 --> 00:39:41,002 then according to this argument-- and I think you 607 00:39:41,002 --> 00:39:42,710 guys are a little doubtful here because I 608 00:39:42,710 --> 00:39:44,990 kept talking about subtleties, and in fact there's 609 00:39:44,990 --> 00:39:48,910 a subtlety here, which I want to get to-- but this big O 610 00:39:48,910 --> 00:39:50,260 thing is perfectly correct. 611 00:39:50,260 --> 00:39:52,010 But to answer your question, yes. 612 00:39:52,010 --> 00:39:53,980 Let's assume that it's n digits of precision. 613 00:39:53,980 --> 00:39:56,530 That's what we assume whether it's n or d. 614 00:39:56,530 --> 00:39:58,520 You can plug in the appropriate symbol here. 615 00:39:58,520 --> 00:40:02,130 And we're saying that, look, every iteration is bounded 616 00:40:02,130 --> 00:40:07,050 by n raised to alpha complexity for the multiply. 617 00:40:07,050 --> 00:40:08,860 And I'm going to do a logarithmic number 618 00:40:08,860 --> 00:40:09,770 of iterations. 619 00:40:09,770 --> 00:40:13,150 So I end up getting log n times n raised to alpha. 620 00:40:13,150 --> 00:40:15,220 So that is correct, in fact. 621 00:40:15,220 --> 00:40:16,450 Big O is correct. 622 00:40:16,450 --> 00:40:19,810 So now it comes to the interesting question, 623 00:40:19,810 --> 00:40:23,050 which is can you do a better analysis? 624 00:40:23,050 --> 00:40:26,250 So this sort of hearkens back to three weeks 625 00:40:26,250 --> 00:40:27,750 ago, maybe you've forgotten. 626 00:40:27,750 --> 00:40:29,560 Maybe you've blanked it out of your memory, 627 00:40:29,560 --> 00:40:34,670 but I thought I described to you build max-heap. 628 00:40:34,670 --> 00:40:36,540 And we had this straightforward analysis 629 00:40:36,540 --> 00:40:39,379 of build max-heap that was n log n complexity. 
630 00:40:39,379 --> 00:40:41,420 And then we looked at it a little more carefully, 631 00:40:41,420 --> 00:40:44,160 and we started adding things up much more carefully. 632 00:40:44,160 --> 00:40:45,940 We turned into bank accountants. 633 00:40:45,940 --> 00:40:49,370 And then we decided that it was theta n complexity. 634 00:40:49,370 --> 00:40:50,590 People remember that? 635 00:40:50,590 --> 00:40:51,090 Right? 636 00:40:51,090 --> 00:40:52,881 So I want you to turn into bank accountants 637 00:40:52,881 --> 00:40:57,920 again, and then tell me first, there's a nice observation 638 00:40:57,920 --> 00:41:02,110 that you can make here that we haven't made yet 639 00:41:02,110 --> 00:41:05,542 with respect to the size of these numbers. 640 00:41:05,542 --> 00:41:07,000 We know what we want eventually, 641 00:41:07,000 --> 00:41:09,416 but there's a nice observation we can make with respect 642 00:41:09,416 --> 00:41:10,930 to the size of these numbers. 643 00:41:10,930 --> 00:41:14,290 And then we want to exploit that observation 644 00:41:14,290 --> 00:41:19,830 to do a better analysis of the theta complexity of division. 645 00:41:19,830 --> 00:41:22,962 So who wants to tell me what the observation is? 646 00:41:22,962 --> 00:41:25,260 This is definitely worth a cushion. 647 00:41:25,260 --> 00:41:26,680 What's the observation? 648 00:41:26,680 --> 00:41:29,270 I want to end up with d digits of precision. 649 00:41:32,930 --> 00:41:35,440 If I give you another hint, I'm gonna give it away. 650 00:41:35,440 --> 00:41:38,690 Someone tell me. 651 00:41:38,690 --> 00:41:42,630 This is a dynamic process, OK? 652 00:41:42,630 --> 00:41:46,160 So what do I start with? 653 00:41:46,160 --> 00:41:49,230 What do I start with? 654 00:41:49,230 --> 00:41:51,154 If I want to compute something and you 655 00:41:51,154 --> 00:41:53,320 want to use Newton's method, what do you start with? 656 00:41:53,320 --> 00:41:54,242 Yeah? 
657 00:41:54,242 --> 00:41:55,749 AUDIENCE: [INAUDIBLE] 658 00:41:55,749 --> 00:41:57,790 PROFESSOR: You start with one digit of precision. 659 00:41:57,790 --> 00:41:59,507 That's fantastic. 660 00:41:59,507 --> 00:42:01,590 I don't know if you already have a cushion or not, 661 00:42:01,590 --> 00:42:03,140 but here's the second one. 662 00:42:03,140 --> 00:42:07,830 So you start with a small number of digits of precision. 663 00:42:07,830 --> 00:42:12,550 And then you end up with a large million, whatever, number, 664 00:42:12,550 --> 00:42:15,100 which is your d. 665 00:42:15,100 --> 00:42:17,340 So what does that mean? 666 00:42:17,340 --> 00:42:19,780 So now somebody take that and run with it. 667 00:42:19,780 --> 00:42:22,770 Somebody take that and run with it. 668 00:42:22,770 --> 00:42:24,380 You already have a cushion. 669 00:42:24,380 --> 00:42:25,050 Like many? 670 00:42:28,510 --> 00:42:31,720 You guys, usual suspects. 671 00:42:31,720 --> 00:42:33,620 So someone take that and run with it. 672 00:42:33,620 --> 00:42:34,614 What can I do now? 673 00:42:34,614 --> 00:42:37,030 What does it mean if I start with a small number of digits 674 00:42:37,030 --> 00:42:38,520 of precision? 675 00:42:38,520 --> 00:42:40,230 My initial guess was one, right? 676 00:42:40,230 --> 00:42:42,270 I mean, that had one digit of precision. 677 00:42:42,270 --> 00:42:46,330 And then the number of digits doubles with each step. 678 00:42:46,330 --> 00:42:50,270 So is there any reason why I'm doing, 679 00:42:50,270 --> 00:42:52,450 if I had d digits of precision, eventually 680 00:42:52,450 --> 00:43:00,550 that I'll have to do d digit multiplies in each iteration? 681 00:43:00,550 --> 00:43:01,740 Any reason why? 682 00:43:01,740 --> 00:43:02,526 Yeah. 683 00:43:02,526 --> 00:43:04,984 AUDIENCE: You don't have to, because [INAUDIBLE] multiplies 684 00:43:04,984 --> 00:43:05,468 are going to be trivial. 
685 00:43:05,468 --> 00:43:07,525 And [INAUDIBLE] then you're going to eventually approach 686 00:43:07,525 --> 00:43:08,856 the d to the alpha iteration. 687 00:43:08,856 --> 00:43:09,910 PROFESSOR: That's exactly right. 688 00:43:09,910 --> 00:43:10,620 Exactly right. 689 00:43:10,620 --> 00:43:12,400 That's worth a cushion. 690 00:43:12,400 --> 00:43:15,490 But now I want you or someone else, 691 00:43:15,490 --> 00:43:19,540 tell me what the iteration looks like. 692 00:43:19,540 --> 00:43:21,250 So this is the key observation. 693 00:43:21,250 --> 00:43:40,390 The key observation is that if I want d digits of precision, 694 00:43:40,390 --> 00:43:42,960 I'm going to start with maybe one digit of precision. 695 00:43:42,960 --> 00:43:49,120 So this is d of p, or dig of p, not to be confused. 696 00:43:49,120 --> 00:43:52,810 I start with 1, 2, 4, and I end up with d. 697 00:43:52,810 --> 00:43:57,460 And our claim was that this was log d iterations, right? 698 00:43:57,460 --> 00:44:05,460 So the initial multiplies are easy. 699 00:44:05,460 --> 00:44:07,580 Initially you're doing constant work 700 00:44:07,580 --> 00:44:10,530 if you have really small numbers associated 701 00:44:10,530 --> 00:44:11,840 with these multiplies. 702 00:44:11,840 --> 00:44:13,940 It's only towards the end that you end up 703 00:44:13,940 --> 00:44:16,720 doing a lot more work, right? 704 00:44:16,720 --> 00:44:24,510 So someone tell me if I have n raised to alpha, 705 00:44:24,510 --> 00:44:30,440 and if I say I want to write an equation. 706 00:44:30,440 --> 00:44:32,841 And I don't want to use theta here. 707 00:44:32,841 --> 00:44:34,340 I'm going to use constants because I 708 00:44:34,340 --> 00:44:38,200 want to add up constants, and it's a little iffy then 709 00:44:38,200 --> 00:44:40,400 you add up thetas. 710 00:44:40,400 --> 00:44:43,630 You need to be looking at constants. 
711 00:44:43,630 --> 00:44:52,330 Now I can imagine that for this iteration, the very first one, 712 00:44:52,330 --> 00:44:55,850 I have something like c times 1 raised to alpha, 713 00:44:55,850 --> 00:44:58,070 because it's just a single digit of precision. 714 00:44:58,070 --> 00:45:02,210 OK. And the next one, I'm using the same algorithm. 715 00:45:02,210 --> 00:45:05,646 This is c times 2 raised to alpha, c times 4 716 00:45:05,646 --> 00:45:06,330 raised to alpha. 717 00:45:09,910 --> 00:45:12,890 And then out here I'm going to have 718 00:45:12,890 --> 00:45:19,000 c times d by 4 raised to alpha plus c times d by 2 719 00:45:19,000 --> 00:45:24,120 raised to alpha plus finally c times d raised to alpha. 720 00:45:24,120 --> 00:45:29,022 And someone give me a bound. 721 00:45:29,022 --> 00:45:30,755 Who wants to give me a bound on this? 722 00:45:33,882 --> 00:45:37,370 Who wants to give me a bound on this? 723 00:45:37,370 --> 00:45:40,320 Less than or equal to. 724 00:45:40,320 --> 00:45:42,130 Let's just make it less than. 725 00:45:42,130 --> 00:45:43,030 What? 726 00:45:43,030 --> 00:45:43,530 Someone? 727 00:45:47,500 --> 00:45:49,650 Just plug in a value of alpha. 728 00:45:49,650 --> 00:45:54,020 And remember your convergent geometric series and things 729 00:45:54,020 --> 00:45:55,084 like that. 730 00:45:55,084 --> 00:45:55,625 What is that? 731 00:45:58,300 --> 00:45:59,890 Someone? 732 00:45:59,890 --> 00:46:00,550 Yeah. 733 00:46:00,550 --> 00:46:03,005 AUDIENCE: Just some constant times d to the alpha? 734 00:46:03,005 --> 00:46:04,338 PROFESSOR: That's exactly right. 735 00:46:04,338 --> 00:46:07,710 Just some constant times d to the alpha. 736 00:46:07,710 --> 00:46:12,320 And in fact, you can say, it's 2c d to the alpha. 737 00:46:16,100 --> 00:46:17,840 I'll keep a question for you aside. 738 00:46:17,840 --> 00:46:18,500 So that's it. 
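Reading the sum backwards makes the geometric series explicit: it is c times d to the alpha times 1 plus 2 to the minus alpha plus 4 to the minus alpha and so on, and for alpha at least 1 that series is bounded by 1 over 1 minus 2 to the minus alpha, which is at most 2. A quick numeric check of the bound (a sketch with c equal to 1):

```python
# Check: 1**a + 2**a + 4**a + ... + (d/2)**a + d**a < 2 * d**a for alpha >= 1,
# because backwards the sum is d**a * (1 + 2**-a + 4**-a + ...), a geometric
# series bounded by 1 / (1 - 2**-a) <= 2.
d = 1 << 20
for alpha in (1.0, 1.58, 2.0):
    total = sum((d >> j) ** alpha for j in range(d.bit_length()))
    assert total < 2 * d ** alpha
```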
739 00:46:18,500 --> 00:46:22,280 That's the little careful analysis that we had to do, 740 00:46:22,280 --> 00:46:26,610 which basically without changing your code, really, 741 00:46:26,610 --> 00:46:28,810 suddenly gave you a better complexity. 742 00:46:28,810 --> 00:46:30,000 Isn't that fun? 743 00:46:30,000 --> 00:46:31,270 That's always fun. 744 00:46:31,270 --> 00:46:34,760 You had this neat algorithm to begin with. 745 00:46:34,760 --> 00:46:37,840 And the bottom line is you're just accounting for things 746 00:46:37,840 --> 00:46:41,130 a little more accurately, rather than essentially saying 747 00:46:41,130 --> 00:46:44,540 that you had to do all of this work 748 00:46:44,540 --> 00:46:48,360 with a large number of digits of precision at every iteration. 749 00:46:48,360 --> 00:46:51,250 The number of digits actually increases. 750 00:46:51,250 --> 00:46:52,840 So what does this mean? 751 00:46:52,840 --> 00:46:56,000 I guess ultimately, the complexity of division 752 00:46:56,000 --> 00:46:57,930 is now what? 753 00:46:57,930 --> 00:47:03,650 It's the same as the complexity of multiplication, right? 754 00:47:03,650 --> 00:47:08,570 So regardless of whether we did a Newton iteration or not, 755 00:47:08,570 --> 00:47:12,905 the complexity of division is just that of multiplication. 756 00:47:24,940 --> 00:47:27,270 You are doing a logarithmic number of iterations, 757 00:47:27,270 --> 00:47:29,870 but eventually all of the work 758 00:47:29,870 --> 00:47:32,766 is going to get done at the end here. 759 00:47:32,766 --> 00:47:34,765 Most of the work is getting done at the end when 760 00:47:34,765 --> 00:47:36,740 you have these long numbers. 761 00:47:36,740 --> 00:47:40,250 That's basically the essence of the argument. 762 00:47:40,250 --> 00:47:44,430 So let me finish up and talk about the complexity 763 00:47:44,430 --> 00:47:45,785 of computing square roots. 
764 00:47:51,000 --> 00:47:56,030 And as you can imagine, even though you 765 00:47:56,030 --> 00:47:58,830 have two nested Newton iterations here, 766 00:47:58,830 --> 00:48:01,850 you can make basically the same argument. 767 00:48:01,850 --> 00:48:04,690 So let's recall what we're doing in terms 768 00:48:04,690 --> 00:48:06,400 of computing square roots. 769 00:48:06,400 --> 00:48:09,320 We want to compute square root of a. 770 00:48:09,320 --> 00:48:12,270 And we said, well we don't quite know how to do this. 771 00:48:12,270 --> 00:48:18,370 We're going to end up doing 10 raised to 2d times a, 772 00:48:18,370 --> 00:48:20,710 and we're going to run Newton's method on it. 773 00:48:20,710 --> 00:48:24,000 So you've got one level of Newton's method. 774 00:48:26,590 --> 00:48:29,730 And the iteration here with respect to Newton's method 775 00:48:29,730 --> 00:48:40,100 is something like xi plus 1 equals xi plus a divided by xi, all divided by 2. 776 00:48:40,100 --> 00:48:44,970 Now every time you do that for a particular xi, 777 00:48:44,970 --> 00:48:49,320 you're going to end up having to call a division. 778 00:48:49,320 --> 00:48:52,940 So you're going to call a division here, 779 00:48:52,940 --> 00:48:56,050 and then you're going to call a division here. 780 00:48:56,050 --> 00:48:58,430 For each iteration you have to call a division. 781 00:48:58,430 --> 00:49:00,300 And what we're saying is, well, 782 00:49:00,300 --> 00:49:03,747 for each of these division 783 00:49:03,747 --> 00:49:05,580 operations we're going to call Newton's method. 784 00:49:09,260 --> 00:49:16,830 And that is something like 2xi 785 00:49:16,830 --> 00:49:22,520 minus b xi squared divided by r. 786 00:49:22,520 --> 00:49:25,975 And that's going to be a bunch of multiplications. 
787 00:49:28,600 --> 00:49:30,970 And what we argued up until this point was 788 00:49:30,970 --> 00:49:33,460 that the complexity of the division, 789 00:49:33,460 --> 00:49:35,640 even though we had a bunch of iterations here, 790 00:49:35,640 --> 00:49:37,920 a logarithmic number of iterations, the complexity 791 00:49:37,920 --> 00:49:40,000 of the division was the same as the complexity 792 00:49:40,000 --> 00:49:42,240 of the multiplication because the numbers 793 00:49:42,240 --> 00:49:44,690 started out small and grew big. 794 00:49:44,690 --> 00:49:45,200 All right? 795 00:49:45,200 --> 00:49:46,960 Everybody buy that? 796 00:49:46,960 --> 00:49:49,250 I'm going to use exactly the same argument 797 00:49:49,250 --> 00:49:52,500 for this level of iteration as well. 798 00:49:52,500 --> 00:49:57,010 And again, when you start out with the digits of precision 799 00:49:57,010 --> 00:49:59,180 corresponding to square root of 2, 800 00:49:59,180 --> 00:50:01,630 you're going to start out guessing 1.5, 801 00:50:01,630 --> 00:50:04,435 which is your initial guess for the square root of 2, 802 00:50:04,435 --> 00:50:07,270 and it's going to be a small number of digits of precision. 803 00:50:07,270 --> 00:50:09,560 And eventually you'll get to a million digits. 804 00:50:09,560 --> 00:50:14,840 So using essentially the same equation summing, 805 00:50:14,840 --> 00:50:17,610 you can argue that the complexity of computing 806 00:50:17,610 --> 00:50:25,060 square roots is the complexity of division, which of course is 807 00:50:25,060 --> 00:50:28,200 the complexity of multiplication. 808 00:50:32,500 --> 00:50:34,880 And that's the story. 809 00:50:34,880 --> 00:50:37,990 So obviously the code would be a little more complicated 810 00:50:37,990 --> 00:50:39,890 than a multiplication code, because you 811 00:50:39,890 --> 00:50:42,360 have all this control structure outside of it. 812 00:50:42,360 --> 00:50:44,920 It's really two nested loops. 
813 00:50:44,920 --> 00:50:47,320 The multiply is getting called a bunch of times 814 00:50:47,320 --> 00:50:49,100 to do the divide, and the divide is 815 00:50:49,100 --> 00:50:51,840 getting called a bunch of times to compute the square root. 816 00:50:51,840 --> 00:50:54,440 But ultimately, because the numbers are growing 817 00:50:54,440 --> 00:50:56,690 and you start out with small numbers, most of the work 818 00:50:56,690 --> 00:50:58,820 is done when you get to the millions of digits 819 00:50:58,820 --> 00:50:59,760 of precision. 820 00:50:59,760 --> 00:51:03,760 And you basically have theta n raised 821 00:51:03,760 --> 00:51:06,940 to alpha complexity for computing square roots. 822 00:51:06,940 --> 00:51:09,940 If you have n raised to alpha multiply, 823 00:51:09,940 --> 00:51:13,060 and you want n digits of precision. 824 00:51:13,060 --> 00:51:13,990 All right? 825 00:51:13,990 --> 00:51:14,970 See you next time. 826 00:51:14,970 --> 00:51:16,980 Stick around for questions.
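Putting the two levels together, the whole pipeline fits in a few lines of Python. This is a sketch, not the course implementation: the function name is mine, and the built-in big-integer floor division stands in for the Newton-based division developed above.

```python
def sqrt_digits(a, d):
    """floor(sqrt(a) * 10**d): the first digits of sqrt(a), computed as the
    integer square root of n = a * 10**(2*d) via the Newton iteration
    x_{i+1} = (x_i + n // x_i) // 2.  In the lecture, each inner // is itself
    a Newton iteration; here Python's big-integer division stands in for it."""
    n = a * 10 ** (2 * d)
    x = 1 << ((n.bit_length() + 1) // 2)   # power-of-two guess >= sqrt(n)
    while True:
        y = (x + n // x) // 2
        if y >= x:       # the sequence stopped decreasing: x is floor(sqrt(n))
            return x
        x = y

print(sqrt_digits(2, 10))   # 14142135623, the first digits of sqrt(2)
```

Since the starting guess has only a few correct digits and each step roughly doubles them, the same geometric-sum accounting as for division gives theta of n raised to alpha overall.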