NARRATOR: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: To take up the topic of quantization, you remember we're talking about source coding in the first part of the course, channel coding in the last 2/3 of the course. Source coding, like all Gaul, is divided into three parts. If you have waveforms, such as speech, you start out with the waveform. The typical way to encode waveforms is you first either sample the waveform or you expand it in some kind of expansion. When you do that, you wind up with a sequence of numbers. You put the sequence of numbers into a quantizer, and the quantizer reduces that to a discrete alphabet. You put the discrete symbols into the discrete encoder. You pass it through a reliable binary channel.

What is a reliable binary channel? It's a layered view of any old channel in the world, OK? In other words, the way that discrete channels work these days is that, in almost all cases, what goes into them is a binary stream of symbols, and what comes out of them is a binary stream of symbols, and the output is essentially the same as the input. That's the whole purpose of how you design digital channels, and they work over analog media and all sorts of things.

OK, so discrete encoders and discrete decoders are really a valid topic to study in their own right. I mean, you have text and stuff like that, which is discrete to start with, so there's a general topic of how do you encode discrete things? We've pretty much answered that problem, at least in an abstract sense, and the main point there is you find the entropy of that discrete sequence, the entropy per symbol, and then you find ways of encoding that discrete source in such a way that the number of bits per symbol is approximately equal to that entropy. We know you can't do any better than that.
You can't do an encoding which is uniquely decodable, that is, one from which you can recover the original symbols again, with anything less than the entropy. So at least we know roughly what the answer to that is. We even know some classy schemes like the Lempel-Ziv algorithm, which will in fact operate without even knowing anything about what the probabilities are. So we sort of understand this block here at this point.

And we could start with this next, or we could start with this next, and unlike the electrician here, we're going to move in sequence and look at this next and this third. There's another reason for that. When we get into this question, we will be talking about what do you do with waveforms? How do you deal with waveforms? It's what you've been studying probably since the sixth grade, and you're all familiar with it. You do integration and things like that. You know how to work with functions. What we're going to do with functions in this course is a little bit different than what you're used to. It's quite a bit different than what you learned in your Signals and Systems course, and you'll find out why as we move along.

But we have to understand this business of how to deal with waveforms, both in terms of this block, which is the final block we'll study in source coding, and also in terms of how to deal with channels, because in real channels, generally what you transmit is a waveform. What the noise does to you is to add a waveform or, in some sense, multiply what you put in by something else. And all of that is waveform stuff. And all of information theory and all of digital communication is based on thinking bits. So somehow or other, we have to become very facile in going from waveforms to bits.

Now I've been around the professional communities of both communication and information theory for a long, long time. There is one fundamental problem that gives everyone trouble, because information theory texts do not deal with it and communication texts do not deal with it.
And that problem is how do you go from one to the other, which seems like it ought to be an easy thing. It's not as easy as it looks, and therefore, we're going to spend quite a bit of time on that. In fact, I passed out lectures 6 and 7 today, which in previous years I've done in two separate lectures, because quantization is a problem that looks important. We'll see that it's not quite as important as it looks.

And I guess the other thing is, I mean, at different times you have to teach different things or you get stale on it. And I've just finished writing some fairly nice notes, I think, on this question of how do you go from waveforms to numbers and how do you go from numbers to waveforms. And I want to spend a little more time on that this year than I did last year. I want to spend, therefore, a little less time on quantization, so that next time, we will briefly review what we've done today on quantization, but we will essentially just compress those two lectures all into one.

In dealing with waveforms, we're going to learn some interesting and kind of cool things. For those of you who are not that interested in mathematics, you know that people study things like the theory of real variables and functional analysis and all of these neat things, which are, in a sense, very advanced mathematics. They're all based on measure theory, and you're going to find out a little bit about measure theory here. Not an awful lot, but just enough to know why one has to deal with those questions, because the major results in dealing with waveforms and samples really can't be stated in any other form than in a somewhat measure-theoretic form. So we're going to find out just enough about that so we can understand what those issues are about. So that's why next time we're going to start dealing with the waveform issues.

OK, so today, we're dealing with these quantizer issues: how do you take a sequence of numbers and turn them into a sequence of symbols at the other end? How do you take a sequence of symbols and turn them back into numbers?
So when you convert real numbers to binary strings, you need a mapping from the set of real numbers to a discrete alphabet. And we're typically going to have a mapping from the set of real numbers into a finite discrete alphabet. Now what's the first obvious thing that you notice when you have a mapping that goes from an infinite set of things into a finite set of things?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Not really. It's a much simpler idea than that. How are you going to get back? I mean, usually when you map something into something else, you would like to get back again. When you map an infinite set into a finite set, how are you going to get back?

AUDIENCE: You're not.

PROFESSOR: You're not. Good! There's not any way in hell that you're ever going to get back, OK? So, in other words, what you've done here is to deliberately introduce some distortion into the picture. You've introduced distortion because you have no choice. If you want to turn numbers into bits, you can't get back to the exact numbers again. So you can only get back to some approximation of what the numbers are.

But anyway, this process is called scalar quantization, if we're mapping from the set of real numbers to a discrete alphabet. If instead you want to convert real n-tuples into sequences of discrete symbols, in other words, into a finite alphabet, you call that vector quantization, because you can view a sequence of n real numbers as a vector with n coordinates, or n components. And I'm not doing anything fancy with vectors here. You just look at an n-tuple of numbers as a vector.

OK, so scalar quantization is going to encode each term of the source sequence separately. And vector quantization is first going to segment this sequence of numbers into blocks of n numbers each, and then it's going to find a way of encoding those n-blocks into discrete symbols.
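A minimal Python sketch of that distinction, with made-up representation points and a made-up two-dimensional codebook; none of these numbers come from the lecture:

```python
import numpy as np

# Made-up representation points for a scalar quantizer (alphabet size M = 4).
points = np.array([-3.0, -1.0, 1.0, 3.0])

def scalar_quantize(u):
    """Encode one source number by itself: map it to the nearest point."""
    return float(points[np.argmin(np.abs(points - u))])

def vector_quantize(block, codebook):
    """Encode an n-tuple jointly: map it to the nearest codebook vector."""
    dists = np.linalg.norm(codebook - np.asarray(block), axis=1)
    return codebook[np.argmin(dists)].tolist()

source = [0.3, -2.7, 1.9, 0.8]

# Scalar quantization: each term of the source sequence separately.
print([scalar_quantize(u) for u in source])

# Vector quantization: segment the sequence into blocks of n = 2 numbers,
# then encode each block against a codebook of two-dimensional points.
codebook = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
print([vector_quantize(source[i:i + 2], codebook)
       for i in range(0, len(source), 2)])
```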
Does this sound a little bit like what we've already done in dealing with discrete sources? Yeah, it's exactly the same thing. I mean, we started out by mapping individual symbols into bit sequences, and then we said, gee, we can also map n-blocks of those symbols into bits, and we said, gee, this is the same problem again. There's nothing different, nothing new.

And it's the same thing here, almost, except here the properties of the real numbers are important. Why are the properties of the real numbers important? Why can't we just look at this as symbols? Well, because, since we can't do this mapping in an invertible way, you have to deal with the fact that you have distortion here. There's no other way to think about it. There is distortion. You might as well face it. If you try to cover it up, it just comes up to kick you later. So we face it right in the beginning, and that's why we deal with these things as numbers.

So let's look at a simple example of what a scalar quantizer is going to do. Basically, what we have to do is to map the line R, namely the set of real numbers, into M different regions, which we'll call R1 up to R sub M. And in this picture here, here's R1, here's R2, here's R3, here's R4, R5 and R6, and that's all the regions we have. You'll notice one of the things that that does is it takes an enormous set of numbers, namely all these numbers less than this, and maps them all into the same symbol. So you might wind up with a fair amount of distortion there, no matter how you measure distortion. All these outliers here all get mapped into a6, and everything in the middle gets mapped somehow into these intermediate values.

But every source value now in the region R sub j, we're going to map into a representation point a sub j. So everything in R1 is going to be mapped into a1. Everything in R2 is going to get mapped into a2, and so forth. Is this a general way to map R into a set of M symbols? Is there anything else I ought to be thinking about?
Well, here, these regions here have a very special property. Namely, each region is an interval. And we might say to ourselves, well, maybe we shouldn't map points into intervals. But aside from the fact that we've chosen intervals here, this is a perfectly general way to represent a mapping from the real numbers into a discrete set of things. Namely, when you're doing a mapping from the real numbers into a discrete set of things, there's some set of real numbers that get mapped into a1, and that by definition is called R1. There's some set of numbers which get mapped into a2. That by definition is called R2, and so forth. So aside from the fact that these regions in this picture happen to be intervals, this is a perfectly general mapping from R into a discrete alphabet.

So since I've decided I'm going to look at scalar quantizers first, this is a completely general view of what a scalar quantizer is. You tell me how many quantization regions you want, namely how big the alphabet is that you're mapping things into, and then your only problem is how do you choose these regions and how do you choose the representation points?

OK, one new thing here: before, we said when you have a set of symbols a1 up to a6, it doesn't matter what you call them. They're just six symbols. Here it makes a difference what you call them, because here they are representing real numbers, and when you map some real number u on the real line into one of these letters, the distortion is u minus a sub j. If you're mapping u into a sub j, then you get a distortion, which is this difference here. I haven't said yet what I am interested in as far as distortion is concerned. Am I interested in squared distortion, cubed distortion, absolute magnitude of distortion? I haven't answered that question yet. But there is a distortion here, and somehow that has to be important. We have to come to grips with what we call a distortion and somehow how big that distortion is going to be.
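To get a feel for the size of that distortion, here is a toy Monte Carlo sketch in Python, assuming a Gaussian source and a made-up six-point quantizer whose regions are the nearest-neighbor intervals around the points (using the squared measure that the lecture settles on shortly):

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up six-point scalar quantizer; take the regions to be the
# nearest-neighbor intervals around the points.
points = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])

# Draw a Gaussian source. Every outlier far to the left or right collapses
# onto an end point, so those samples contribute large distortions.
u = rng.standard_normal(100_000)
v = points[np.argmin(np.abs(points[:, None] - u[None, :]), axis=0)]

# Estimate the mean squared distortion E[(U - V)^2] by averaging.
print(np.mean((u - v) ** 2))
```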
So our problem here is somehow to trade off between distortion and number of points. As we make the number of points bigger, we can presumably make the distortion smaller in some sense, although the distortion is always going to be very big from these really big negative numbers and from really big positive numbers. But aside from that, we just have a problem of how do you choose the regions? How do you choose the points?

OK, I've sort of forgotten about the problem that we started with. And the problem that we started with was to have a source where the source was a sequence of numbers. And when we're talking about sources, we're talking about something stochastic. We need a probability measure on these real numbers that we're encoding. If we knew what they were, there wouldn't be any need to encode them. I mean, if we knew and the receiver knew, there would be no need to encode them. The receiver would just print out what they were, or store them, or do whatever the receiver wants to do with them.

OK, so we're going to view the source value u as the sample value of some random variable capital U. And more generally, since we have a sequence, we're going to consider a source sequence to be U1, U2, U3 and so forth, or you could consider it a bi-infinite sequence, starting at U minus infinity and working its way up forever. And then we're going to have some sort of statistical model for this sequence of random variables. Our typical model for these is to assume that we have a memoryless source. In other words, U1, U2, U3 are independent, identically distributed random variables. That's the model we'll use until we get smarter and start to think of something else.

OK, so now each of these source values we're going to map into some representation point a sub j. That's what defines the quantizer. And now since u is a sample value of a random variable U, a sub j is going to be a sample value of some random variable V.
OK, in other words, the probabilities of these different sample values are going to be determined by the set of u's that map into that a sub j. So we have a source sequence U1, U2, blah, blah, blah. We have a representation sequence V1, V2, blah, blah, blah, which is defined by: if U sub k is in R sub j, then V sub k is equal to a sub j.

Point of confusion here: it's not confusing now, but it will be confusing at some point to you. When you're talking about sources, you really need two indices that you're talking about all the time. OK, one of them is how to represent different elements in time. Here we're using k as a way of keeping track of what element in time we're talking about. We're also talking about a discrete alphabet, which has a certain number of elements in it, which is completely independent of time. Namely, we've just described the quantizer as something which maps real numbers into sample values. It has nothing to do with time at all. We're going to use that same thing again and again and again, and we're using the subscript j to talk about that.

When you write out problem solutions, you are going to find that it's incredibly difficult sometimes to write sentences which distinguish whether you're talking about one element out of an alphabet or one element out of a time sequence. And everybody has that trouble, and you read most of the literature in information theory or communication theory, and you can't sort out most of the time what people are talking about, because they're doing that. I recommend saying "an element of an alphabet" to talk about this sort of thing, what a sub j is, and "an element of a time sequence" to keep track of things at different times. It's a nice way of keeping them straight.

OK, so anyway, for a scalar quantizer, we're going to be able to just look at a single random variable U, which is a continuous-valued random variable, which takes values anywhere on the real line, and map it into a single element in this discrete alphabet, which is the set a1 up to a6 that we were talking about here.
So a scalar quantizer then is just a map of this form, OK? So the only thing we need for a scalar quantizer: we can now forget about time and talk about how do you choose the regions, how do you choose the representation points?

OK, and there's a nice algorithm there. Again, it's one of these things where, if you were the first person to think about it, it's an easy way to become famous. You might not stay famous, but you can get famous initially.

Anyway, we're almost always interested in the mean squared error or the mean squared distortion, MSD or MSE, which is the expected value of the square of U minus V. U is this real-valued random variable. V is the discrete random variable into which it maps. We have the distortion between U and V, which is U minus V. We now have the expected value of that squared distortion.

Why is everybody interested in squared distortion instead of magnitude distortion or something else? In many engineering problems, you should be more interested in magnitude distortion. Sometimes you're much more interested in fourth-moment distortion or some other strange thing. Why do we always use mean squared distortion? And why do we use mean-squared everything throughout almost everything we do in communication? I'll tell you the reason now. We'll come back and talk about it more a number of times.

It's because this quantization problem that we're talking about is almost always a subproblem of this problem, where you're dealing with waveforms, where you take the waveform and you sample the waveform, or you take the waveform and you turn the waveform into some expansion. When you find the mean-squared distortion in a quantizer, it turns out that that maps in a beautiful way into the mean-squared distortion between waveforms. If you deal with magnitudes or with anything else in the world, all of that beauty goes away, OK? In other words, whenever you want to go from waveforms to numbers, the one thing which remains invariant, which remains nice all the way through, is this mean-square distortion, mean-square value, OK?
In other words, if you want to layer this problem into looking at this problem separately from looking at this problem, almost the only way you can do it that makes sense is to worry about mean square distortion rather than some other kind of distortion. So even though as engineers we might be interested in something else, we almost always stick to that, because that's the thing that we can deal with most nicely.

I mean, as engineers, we're like the drunk who dropped his wallet on a dark street, and he's searching for it, and somebody comes along. And here he is underneath a beautiful light where he can see everything. Somebody asks him what he's looking for, and he says he's looking for his wallet. The guy looks down and says, well, there's no wallet there. The drunk says, I know. I dropped it over there, but it's dark over there. So we use mean square distortion in exactly the same sense. It isn't necessarily the problem we're interested in, but it's a problem where we can see things clearly.

So given that we're interested in the mean square distortion of a scalar quantizer, an interesting analytical problem that we can play with is this: for a given probability density on this real random variable (we're assuming we have a probability density) and a given alphabet size M, how do you choose these regions and how do you choose these representation points in such a way as to minimize the mean square error, OK? So, in other words, we've taken a big sort of messy, amorphous engineering problem, and we said, OK, we're going to deal with mean square error, and OK, we're going to deal with scalar quantizers, and OK, we're going to fix the number of quantization levels, so we've made all those choices to start with. We still have this interesting problem of how do you choose the right regions? How do you choose the right sample points? And that turns out to be a simple problem. I wouldn't talk about it if it wasn't simple. And we'll break it into subproblems, and the subproblems are really simple.
The first subproblem is: if I tell you what representation points I want to use, namely, in this picture here, I say, OK, I want to use these representation points, and then I ask you, how are you going to choose the regions in an optimal way to minimize mean square error? Well, you think about that for a while, and you think about it in a number of ways. When you think about it in just the right way, the answer becomes obvious.

And the answer is: let me not think about the regions here, but let me think about a particular value that comes out of the source. Let me think, how should I construct a rule for how to take outputs from this source and map them into some number here? So if I get some output from the source u, I say, OK, what's the distortion between u and a1? It's u minus a1. And the magnitude of that is the magnitude of u minus a1. The square of it is the square of u minus a1. And then I say, OK, let me compare that with u minus a2. Let me compare it with u minus a3, and so forth. Let me choose the smallest of those things. Which is the smallest for any given u?

Suppose I have a u which happens to be right there, for example. What's the closest representation point? Well, a3 is obviously closer than a2. And in fact for any point in here, which is closer to a3 than it is to a2, we're going to choose a3. Now what's the set of points which are closer to a3 than they are to a2? Well, you put a line here right between a2 and a3, OK? When you put that line right between a2 and a3, everything on this side is closer to a3 and everything on this side is closer to a2.

So the answer is, in fact, simple, once you see the answer. And the answer is: given these points, we simply construct bisectors between them. Namely, halfway between a1 and a2, we call that b1. Halfway between a2 and a3, we call that b2, and those are the separators between the regions, OK?
In other words, what we wind up doing is we define the region R sub j, the set of things which get mapped into a sub j, as the interval bounded by b sub j minus 1 and b sub j, where b sub j minus 1 is the average of a sub j minus 1 and a sub j, and b sub j is the average of a sub j and a sub j plus 1, where I've already ordered the points a sub j going from left to right, OK?

And that also tells us that the minimum mean square error regions have got to be intervals. There is no reason at all to ever pick regions which are not intervals, because as soon as you start to solve this problem for any given set of representation points, you wind up with intervals. So if you ever think of using things that are not intervals for this mean square error problem, then as soon as you look at this, you say, well, aha, that can't be the best thing to do. I will make my regions intervals. I will therefore simplify the whole problem, and I'll also improve it. And when you can both simplify things and improve them at the same time, it usually is worth doing it, unless you're dealing with standards bodies or something, and then all bets are off.

OK, so this one part of the problem is easy. If we know what the representation points are, we can solve the problem.

OK, we have a second problem, which is just the opposite problem. Suppose then that somebody gives you the interval regions and asks you, OK, if I give you these interval regions, how are you going to choose the representation points to minimize the mean square error? And analytically that's harder, but conceptually, it's just about as easy. And let's look at it and see if we can understand it.

Somebody gives us this region here, R sub 2, and says, OK, where should we put the point a sub 2? Well, anybody have any clues as to where you want to put it?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?

AUDIENCE: At the midpoint?

PROFESSOR: At the midpoint. It sounds like a reasonable thing, but it's not quite right.

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?
AUDIENCE: [INAUDIBLE]

PROFESSOR: It depends on the probability density, yes. If you have a probability density which is highly weighted over on -- let's see, was I talking about R2 or R3? Well, it doesn't make any difference. We'll talk about R2. If I have a probability density which looks like this, then I want a2 to be closer to the left-hand side than to the right-hand side. I want it to be a little bit weighted towards here, because that's more important to the mean square error.

OK, if you didn't have any of this stuff, and I said how do I choose this to minimize the mean square error, what's your answer then? If I just have one region and I want to minimize the mean square error, what do you do? Anybody who doesn't know should go back and study elementary probability theory, because this is almost day one of elementary probability theory, when you start to study what random variables are all about. And the next thing you start to look at is things like variance and second moment. And what I'm asking you here is, how do you choose a2 in order to minimize the second moment of U, whatever it is in here, minus a2? Yes?

AUDIENCE: [INAUDIBLE]

PROFESSOR: I want to take the expectation of U over the region R2. In other words, to say it more technically, I want to take the expectation of the conditional random variable U, conditional on being in R2, OK? And all of you could figure that out yourselves if you sat down quietly and thought about it for five minutes. If you got frightened about it and started looking it up in a book or something, it would take you about two hours to do it. But if you just asked yourselves, how do I do it, you'll come up with the right answer very, very soon, OK?

So subproblem 2 says: let's look at the conditional density of U in this region R sub j.
I'll call the conditional density -- well, the conditional density, given that you're in this region, is, in fact, the real density divided by the probability of being in that interval, OK? So we'll call that the conditional density. And I'll let U sub j be the random variable which has this density. In other words, this isn't a random variable on the probability space that we started out dealing with. It's sort of a phony random variable. But, in fact, it's the intuitive thing that you think of, OK? In other words, if this is exactly what you were thinking of, you probably wouldn't have called it a separate random variable, OK? So this is a random variable with this density.

The expected value of U sub j minus a sub j, quantity squared, as you all know, is sigma squared of U sub j plus the quantity, expected value of U sub j minus a sub j, squared, OK? How do you minimize this? Well, you're stuck with the variance term. The other term, you minimize by making a sub j equal to the expected value of U sub j, which is exactly what you said. Namely, you set a sub j to be the conditional mean of U sub j, conditional on being in the interval. It's harder to say mathematically than it is to see it.

But, in fact, the intuitive idea is exactly right. Namely, you condition your random variable on being in that interval, and then what you want to do is choose the mean within that interval, which is exactly the sort of thing we were thinking about here. When I drew this curve here, if I scale it, this, in fact, is the density of u conditional on being in R sub j. And now all I'm trying to do is choose the point here, which happens to be the mean, which minimizes this second moment. The second moment, in fact, is this mean square error. So, bingo! That's the second problem.

Well, how do you put the two problems together? Well, you can put it together in two ways.
One of them is to say, OK, well then, clearly an optimal scalar quantizer has to satisfy both of these conditions. Namely, the endpoints of the regions have to be the midpoints between the representation points, and the representation points have to be the conditional means of the points within a region that you start with. And then you say, OK, how do I solve that problem?

Well, if you're a computer scientist, and sometimes it's good to be a computer scientist, you say, well, I don't know how to solve the problem, but generating an algorithm to solve the problem is almost trivial. I start out with some arbitrary set of representation points. That should be a capital M, because that's the number of points I'm allowed to use. Then the next step in the algorithm is: I choose these separation points so that b sub j is the midpoint between a sub j and a sub j plus 1, for 1 less than or equal to j less than or equal to M minus 1. Then as soon as I get these midpoints, I have a set of intervals, and my next step is to set a sub j equal to the expected value of this conditional random variable U sub j, where R sub j is now this new interval, open on the left side and closed on the right side. Since there's a probability density, it doesn't make any difference which; I just wanted to give you an explicit rule for what to do with that probability-zero event when you happen to wind up on that special boundary point.

OK, and then you iterate on steps 2 and 3 until you get a negligible improvement. And then you ask, of course, well, if I got a negligible improvement this time, maybe I'll get a bigger improvement next time. And that, of course, is true. And you then say, well, it's possible that after I can't get any improvement anymore, I still don't have the optimal solution. And that, of course, is true also. But at least you have an algorithm which makes sense and which, each time you try it, you do a little better than you did before.
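Here is a sketch of that iteration in Python, assuming a Gaussian source and working from a large batch of source samples rather than an explicit density, so the conditional means become sample averages; the function names and the fixed iteration count are made up for illustration:

```python
import numpy as np

def lloyd_max(samples, M, iters=100):
    """A sketch of the Lloyd-Max iteration in one dimension, run on
    source samples; the lecture iterates until the improvement is
    negligible, but a fixed count keeps the sketch short."""
    rng = np.random.default_rng(1)
    # Step 1: start with an arbitrary ordered set of M points.
    a = np.sort(rng.choice(samples, size=M, replace=False))
    for _ in range(iters):
        b = (a[:-1] + a[1:]) / 2           # step 2: midpoints between points
        idx = np.searchsorted(b, samples)  # which interval each sample is in
        # Step 3: move each point to the conditional mean of its interval
        # (keep the old point if the interval happens to be empty).
        a = np.array([samples[idx == j].mean() if np.any(idx == j) else a[j]
                      for j in range(M)])
        a.sort()
    b = (a[:-1] + a[1:]) / 2               # final midpoints for final points
    return a, b

samples = np.random.default_rng(0).standard_normal(100_000)
a, b = lloyd_max(samples, M=6)
v = a[np.searchsorted(b, samples)]         # quantize with the final regions
print(a)                                   # the representation points
print(np.mean((samples - v) ** 2))         # the resulting mean square error
```

Each pass can only decrease the sample mean square error, which is exactly the nonincreasing behavior discussed next.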
Now, this mean square error, for any choice of regions and any choice of representation points, is going to be nonnegative, because it's an expected value of squared terms. The algorithm is nonincreasing with iterations. In other words, the algorithm is going down all the time. So you have zero down here. You have an algorithm which is marching you down toward zero. It's not going to get to zero, but it has to reach a minimum. That's a major theorem in analysis, but you don't need any major theorems in analysis to see this. You have a set of numbers which are decreasing all the time, and they're bounded underneath. After a while, you have to get to some point, and you don't go any further. So it has a limit. So that's nice. So you have an algorithm which has to converge. It can't keep on going. Well, it can keep on going forever, but it keeps going on forever with smaller and smaller improvements. So eventually, you might as well stop, because you're not getting anywhere.

OK, well, for those conditions that we stated, the way a mathematician would say this is that these Lloyd-Max conditions are necessary but not sufficient, OK? In other words, any solution to this problem that we're looking at has to have the property that the representation points are the conditional means of the intervals and the interval boundaries are the midpoints between the representation points. But that isn't necessarily enough.

Here's a simple example of where it's not enough. Suppose you have a probability density which runs along at almost zero, jumps up to a big value, drops back to almost zero, jumps up to a big value, and jumps up to a big value a third time. This hump intentionally is wider than this one and wider than this one. In other words, there's a lot more probability over here than there is here or here. And you're unlucky, and you start out somehow with points like a1 and a2, and you start out with regions like R1 -- well, actually, you just want to start out with the points a1 and a2.
So you start out with a point a1, which is way over here, and a point a2, which is not too far over here. And your algorithm then says: pick the midpoint b1. Here the algorithm is particularly simple because, since we're only using two regions, all you need is one separator point. So we wind up with b1 here, which is halfway between a1 and a2. a1 happens to be right in the middle of that big interval there, and there's hardly anything else here. So a1 just stays there as the conditional mean, given that you're on this side. And a2 stays as the conditional mean, given that you're over here, which means that a2 is a little closer to this big region than it is to this region, but a2 is perfectly happy there. And we go back and we iterate again. And we haven't changed b1, we haven't changed a1, we haven't changed a2, and therefore, the algorithm sticks there.

Well, that's not surprising. I mean, you know that there are many problems where you try to minimize things by differentiating, or all the different tricks you have for minimizing things, and you very often find local minima. People call algorithms like this hill-climbing algorithms. They should call them valley-finding algorithms, because we're sitting at some place, we try to find a better place, so we wind up moving down into the valley further and further. Of course, if it's a real geographical area, you finally wind up at the river. You then move down the river, you wind up in the ocean, and all that. But let's forget about that. Let's just assume that there aren't rivers or anything, that we just have some arbitrary geographical area. We wind up in the bottom of a valley, and we say, OK, are we at the minimum or not? Well, with hill climbing, you can take a pair of binoculars, and you look around to see if there's any higher peak someplace else. With valley seeking, you can't do that.
741 00:43:27,350 --> 00:43:29,780 We're sitting there at the bottom of the valley, and we 742 00:43:29,780 --> 00:43:32,640 have no idea whether there's a better valley somewhere else 743 00:43:32,640 --> 00:43:35,970 or not, OK? 744 00:43:35,970 --> 00:43:39,110 So that's the trouble with hill-climbing algorithms or 745 00:43:39,110 --> 00:43:41,700 valley-seeking algorithms. 746 00:43:41,700 --> 00:43:45,030 And that's exactly what this algorithm does. 747 00:43:45,030 --> 00:43:48,660 This is called the Lloyd-Max algorithm because a guy by the 748 00:43:48,660 --> 00:43:53,820 name of Lloyd at Bell Labs discovered it, I think in '57. 749 00:43:53,820 --> 00:43:57,350 A guy by the name of Joel Max discovered it again. 750 00:43:57,350 --> 00:44:01,770 He was at MIT in 1960, and because all the information 751 00:44:01,770 --> 00:44:04,660 theorists around were at MIT at that time, they called it 752 00:44:04,660 --> 00:44:06,990 the Max algorithm for many years. 753 00:44:06,990 --> 00:44:09,250 And then somebody discovered that Lloyd had done it three 754 00:44:09,250 --> 00:44:10,030 years earlier. 755 00:44:10,030 --> 00:44:12,300 Lloyd never even published it. 756 00:44:12,300 --> 00:44:15,170 So it became the Lloyd-Max algorithm. 757 00:44:15,170 --> 00:44:17,520 And now there's somebody else who did it even earlier, I 758 00:44:17,520 --> 00:44:21,780 think, so we should probably take Max's name off it. 759 00:44:21,780 --> 00:44:27,210 But anyway, sometime when I revise the notes, I will give 760 00:44:27,210 --> 00:44:29,410 the whole story of that. 761 00:44:29,410 --> 00:44:31,780 But I hope you see that this algorithm 762 00:44:31,780 --> 00:44:34,400 is no big deal anyway. 763 00:44:34,400 --> 00:44:38,220 It was just people fortunately looking at the question at the 764 00:44:38,220 --> 00:44:41,170 right time before too many other people had looked at it. 765 00:44:41,170 --> 00:44:43,680 And Max unfortunately looked at it. 766 00:44:43,680 --> 00:44:44,580 He was in a valley. 767 00:44:44,580 --> 00:44:47,900 He didn't see that all the other people had looked at it. 768 00:44:47,900 --> 00:44:49,130 But he became famous. 769 00:44:49,130 --> 00:44:53,840 He had his moment in time and then sank gradually into 770 00:44:53,840 --> 00:44:59,040 oblivion except when once in a while we call it the 771 00:44:59,040 --> 00:45:00,570 Lloyd-Max algorithm. 772 00:45:00,570 --> 00:45:03,020 Most people call it the Lloyd algorithm, though, and I 773 00:45:03,020 --> 00:45:05,710 really should also. 774 00:45:05,710 --> 00:45:10,290 OK, vector quantization: We talked about vector 775 00:45:10,290 --> 00:45:11,610 quantization a little bit. 776 00:45:11,610 --> 00:45:16,530 It's the idea of segmenting the source outputs before you 777 00:45:16,530 --> 00:45:19,330 try to do the encoding. 778 00:45:19,330 --> 00:45:21,800 So we ask: is scalar quantization going to be the 779 00:45:21,800 --> 00:45:23,710 right approach? 780 00:45:23,710 --> 00:45:27,080 To answer that question, we want to look at quantizing two 781 00:45:27,080 --> 00:45:31,070 sample values jointly and drawing pictures. 782 00:45:31,070 --> 00:45:35,100 Incidentally, what's the simplest way of quantizing 783 00:45:35,100 --> 00:45:37,490 that you can think of? 784 00:45:37,490 --> 00:45:40,440 And what do people who do simple things call quantizers? 785 00:45:44,780 --> 00:45:48,360 Ever hear of an analog-to-digital converter? 786 00:45:48,360 --> 00:45:51,380 That's what a quantizer is.
787 00:45:51,380 --> 00:45:53,330 And an analog-to-digital converter, the way that 788 00:45:53,330 --> 00:45:56,890 everybody does it, is scalar. 789 00:45:56,890 --> 00:46:00,370 So that says that either the people who implement things 790 00:46:00,370 --> 00:46:03,400 are very stupid, or there's something pretty good about 791 00:46:03,400 --> 00:46:05,630 scalar quantization. 792 00:46:05,630 --> 00:46:09,190 But anyway, since this is a course trying to find better 793 00:46:09,190 --> 00:46:13,220 ways of doing things, we ought to investigate whether it is 794 00:46:13,220 --> 00:46:16,000 better to use vector quantizers and 795 00:46:16,000 --> 00:46:18,260 what they result in. 796 00:46:18,260 --> 00:46:20,880 OK, well, the first thing that we can do is look at 797 00:46:20,880 --> 00:46:23,360 quantizing two samples. 798 00:46:23,360 --> 00:46:26,480 In other words, when you want to generalize a problem to 799 00:46:26,480 --> 00:46:31,050 vectors, I find it better to generalize it to two dimensions 800 00:46:31,050 --> 00:46:33,980 first and see what goes on there. 801 00:46:33,980 --> 00:46:37,720 And one possible approach is to use a rectangular grid of 802 00:46:37,720 --> 00:46:40,280 quantization regions. 803 00:46:40,280 --> 00:46:43,740 And as I'll show you in the next slide, that really is 804 00:46:43,740 --> 00:46:48,420 just a camouflaged scalar quantizer again. 805 00:46:48,420 --> 00:46:52,410 So you have a two-dimensional region corresponding to two 806 00:46:52,410 --> 00:46:53,880 real samples. 807 00:46:53,880 --> 00:46:57,460 So you've got two real numbers, U1 and U2. 808 00:46:57,460 --> 00:47:03,080 You're trying to map them into a finite set 809 00:47:03,080 --> 00:47:05,790 of representation points. 810 00:47:05,790 --> 00:47:09,760 Since you're going to be interested in the distortion 811 00:47:09,760 --> 00:47:16,370 between the vector U1, U2 and your 812 00:47:16,370 --> 00:47:23,680 representation vector, a1, a2, these 813 00:47:23,680 --> 00:47:26,350 representation points are going to be two-dimensional 814 00:47:26,350 --> 00:47:28,280 points also. 815 00:47:28,280 --> 00:47:31,970 So if you start out by saying let's put these points on a 816 00:47:31,970 --> 00:47:36,420 rectangular grid, well, we can then look at it, and we say, 817 00:47:36,420 --> 00:47:39,560 well, given the points, how do we choose the regions? 818 00:47:42,360 --> 00:47:44,710 You see, it's exactly the same problem 819 00:47:44,710 --> 00:47:47,440 that you solved before. 820 00:47:47,440 --> 00:47:52,440 If I give you the points and then I ask you, well, if we 821 00:47:52,440 --> 00:47:58,820 get a vector U1, U2 that's there, what do we map it to? 822 00:47:58,820 --> 00:48:02,275 Well, we map it to the closest thing, which means if we want 823 00:48:02,275 --> 00:48:05,570 to find these regions, we set up these perpendicular 824 00:48:05,570 --> 00:48:09,800 bisectors halfway between the representation points. 825 00:48:09,800 --> 00:48:13,250 So all of this is looking very rectangular now because we 826 00:48:13,250 --> 00:48:15,660 started out with these points rectangular. 827 00:48:15,660 --> 00:48:19,840 These lines are rectangular, and now I say, well, is this 828 00:48:19,840 --> 00:48:23,240 really any different from a scalar quantizer?
829 00:48:23,240 --> 00:48:26,740 And, of course, it isn't because for this particular 830 00:48:26,740 --> 00:48:31,340 vector quantizer, I can first ask the question, OK, here's 831 00:48:31,340 --> 00:48:37,520 U1, which is something in this direction. 832 00:48:37,520 --> 00:48:39,270 How do I find regions for that? 833 00:48:39,270 --> 00:48:44,090 Well, for U1, I just establish these regions here. 834 00:48:44,090 --> 00:48:46,920 And then I say, OK, let's look at U2 next. 835 00:48:46,920 --> 00:48:51,350 And then I look at things in this direction, and I wind up 836 00:48:51,350 --> 00:48:54,770 saying, OK, that's all there is to the problem. 837 00:48:54,770 --> 00:48:57,310 I have a scalar quantizer again. 838 00:48:57,310 --> 00:49:00,520 Everything that I said before works. 839 00:49:00,520 --> 00:49:04,040 Now if you're a theoretician, you go one step further, and 840 00:49:04,040 --> 00:49:10,160 you say, what this tells me is that vector quantizers cannot 841 00:49:10,160 --> 00:49:13,280 be any worse than scalar quantizers. 842 00:49:13,280 --> 00:49:19,360 Because, in fact, a vector quantizer -- or at least 843 00:49:19,360 --> 00:49:22,590 a vector quantizer in two dimensions -- 844 00:49:22,590 --> 00:49:23,430 has two scalar 845 00:49:23,430 --> 00:49:28,030 quantizers as a special case. 846 00:49:28,030 --> 00:49:32,210 And therefore, whatever squared distortion I can 847 00:49:32,210 --> 00:49:37,600 achieve with a scalar quantizer, 848 00:49:40,970 --> 00:49:44,940 I can do 849 00:49:44,940 --> 00:49:49,440 just as well with a vector quantizer by choosing it, in 850 00:49:49,440 --> 00:49:51,770 fact, to be rectangular like this. 851 00:49:51,770 --> 00:49:56,320 And you can also get some intuitive ideas that if 852 00:49:56,320 --> 00:50:00,770 instead of having IID random variables, U1 and U2 are 853 00:50:00,770 --> 00:50:03,430 very heavily correlated somehow so that they're very 854 00:50:03,430 --> 00:50:06,860 close together, I mean, you sort of get an engineering 855 00:50:06,860 --> 00:50:08,700 view of what you want to do. 856 00:50:08,700 --> 00:50:11,610 You want to take this rectangular picture here. 857 00:50:11,610 --> 00:50:14,300 You want to skew it around that way. 858 00:50:14,300 --> 00:50:16,790 And you want to have lots of points going this way and not 859 00:50:16,790 --> 00:50:20,010 too many points going this way because almost everything is 860 00:50:20,010 --> 00:50:22,220 going this way, and there's not much 861 00:50:22,220 --> 00:50:24,050 going on in that direction. 862 00:50:24,050 --> 00:50:30,470 So you get some picture of what you want to do. 863 00:50:30,470 --> 00:50:35,700 These regions here, a little bit of terminology, are called 864 00:50:35,700 --> 00:50:38,330 Voronoi regions. 865 00:50:38,330 --> 00:50:42,680 Anytime you start out with a set of points and you put 866 00:50:42,680 --> 00:50:46,520 perpendicular bisectors in between those points, halfway 867 00:50:46,520 --> 00:50:49,550 between the points, you call the regions that you wind up 868 00:50:49,550 --> 00:50:52,640 with Voronoi regions. 869 00:50:52,640 --> 00:50:58,000 So, in fact, part of this Lloyd-Max algorithm, 870 00:50:58,000 --> 00:51:03,800 generalized to two dimensions, says given any set of points, 871 00:51:03,800 --> 00:51:07,360 the regions ought to be the Voronoi regions for them.
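The same two steps carry over to two dimensions, and the sketch below shows them in Python; nearest-point assignment produces the Voronoi regions, and sample averages stand in for the conditional means. The names voronoi_assign and lloyd_step are my own, and using empirical samples rather than a density is an assumption for illustration.

    import numpy as np

    def voronoi_assign(points, samples):
        # Squared distance from every sample to every point; each sample goes to
        # its nearest point, i.e., into that point's Voronoi region.
        d2 = ((samples[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    def lloyd_step(points, samples):
        # One generalized Lloyd iteration: Voronoi regions first, then the
        # conditional mean (here the average of the samples) in each region.
        idx = voronoi_assign(points, samples)
        return np.array([samples[idx == j].mean(axis=0) if np.any(idx == j)
                         else points[j] for j in range(len(points))])

    # Heavily correlated U1, U2, as in the skewed-grid intuition above:
    rng = np.random.default_rng(0)
    u1 = rng.normal(size=20000)
    samples = np.column_stack([u1, u1 + 0.1 * rng.normal(size=20000)])
    points = samples[rng.choice(len(samples), size=8, replace=False)]
    for _ in range(30):
        points = lloyd_step(points, samples)
    # The points end up strung out along the diagonal, where the probability is.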
872 00:51:07,360 --> 00:51:10,190 So that's that first subproblem generalized to two 873 00:51:10,190 --> 00:51:11,440 dimensions. 874 00:51:15,920 --> 00:51:19,730 And if you have an arbitrary set of points, the Voronoi 875 00:51:19,730 --> 00:51:24,580 regions look sort of like this, OK? 876 00:51:24,580 --> 00:51:27,410 And I've only drawn it for the center point because, well, 877 00:51:27,410 --> 00:51:30,860 there aren't enough points to do anything more than that. 878 00:51:30,860 --> 00:51:34,520 So anything in this region gets mapped into this point. 879 00:51:34,520 --> 00:51:38,030 Anything in this semi-infinite region gets mapped into this 880 00:51:38,030 --> 00:51:40,590 point and so forth. 881 00:51:40,590 --> 00:51:43,510 So even in two dimensions, this part of the 882 00:51:43,510 --> 00:51:46,580 algorithm is simple. 883 00:51:46,580 --> 00:51:49,810 When you start out with the regions, with almost the same 884 00:51:49,810 --> 00:51:55,340 argument that we used before, you can see that the mean 885 00:51:55,340 --> 00:51:59,370 square error is going to be minimized by using conditional 886 00:51:59,370 --> 00:52:03,510 means for the representation points. 887 00:52:03,510 --> 00:52:06,380 I mean, that's done in detail in the notes. 888 00:52:06,380 --> 00:52:09,660 It's just algebra to do that. 889 00:52:09,660 --> 00:52:11,750 It's sort of intuitive that the same thing ought to 890 00:52:11,750 --> 00:52:14,940 happen, and in fact, it does. 891 00:52:14,940 --> 00:52:18,490 So you can still find a local minimum by 892 00:52:18,490 --> 00:52:22,440 this Lloyd-Max algorithm. 893 00:52:22,440 --> 00:52:24,950 If you're unhappy with the fact that the Lloyd-Max 894 00:52:24,950 --> 00:52:30,020 algorithm doesn't always work in one dimension, take comfort 895 00:52:30,020 --> 00:52:32,880 in the fact that it's far worse in two dimensions, and 896 00:52:32,880 --> 00:52:35,330 it gets worse and worse as you go to higher numbers of 897 00:52:35,330 --> 00:52:36,780 dimensions. 898 00:52:36,780 --> 00:52:41,910 So it is a local minimum, but not 899 00:52:41,910 --> 00:52:45,360 necessarily the best thing. 900 00:52:45,360 --> 00:52:50,490 OK, well, about that time -- let's go forward maybe 10 901 00:52:50,490 --> 00:52:54,250 years from 1957 and '60, when people were inventing the 902 00:52:54,250 --> 00:52:57,560 Lloyd-Max algorithm and thought that quantization 903 00:52:57,560 --> 00:53:00,640 was a really neat academic problem, and many people were 904 00:53:00,640 --> 00:53:04,950 writing theses on it and having lots of fun with it. 905 00:53:04,950 --> 00:53:08,420 And then eventually, they started to realize that when 906 00:53:08,420 --> 00:53:11,390 you try to solve that problem and find the minimum, it 907 00:53:11,390 --> 00:53:13,990 really is just a very ugly problem. 908 00:53:13,990 --> 00:53:17,180 At least it looks like a very ugly problem, given everything 909 00:53:17,180 --> 00:53:19,750 that anybody knows after having worked on it 910 00:53:19,750 --> 00:53:21,160 for a very long time. 911 00:53:21,160 --> 00:53:24,380 So not many people work on this anymore. 912 00:53:24,380 --> 00:53:27,090 So we stop and say, well, we really don't want to go too 913 00:53:27,090 --> 00:53:29,640 far on this because it's ugly. 914 00:53:29,640 --> 00:53:31,640 But then we stop and think.
915 00:53:31,640 --> 00:53:34,510 I mean, anytime you get stuck on a problem, you ought to 916 00:53:34,510 --> 00:53:36,640 stop and ask, well, am I really looking 917 00:53:36,640 --> 00:53:37,890 at the right problem? 918 00:53:40,500 --> 00:53:45,770 Now why am I or why am I not looking at the right problem? 919 00:53:45,770 --> 00:53:47,640 And remember where we started off. 920 00:53:50,730 --> 00:53:54,570 We started off with this kind of layered solution, which 921 00:53:54,570 --> 00:53:58,360 said we were going to quantize these things into a finite 922 00:53:58,360 --> 00:54:00,200 alphabet and then we were going to 923 00:54:00,200 --> 00:54:04,870 discrete code them, OK? 924 00:54:04,870 --> 00:54:06,890 And here's what we've been doing for a while, 925 00:54:06,890 --> 00:54:09,250 and none of you objected. 926 00:54:09,250 --> 00:54:12,800 Of course, it's hard to object at 9:30 in the morning. 927 00:54:12,800 --> 00:54:17,020 You just listen and you -- but you should have objected. 928 00:54:17,020 --> 00:54:20,230 You should have said why in hell am I choosing the number 929 00:54:20,230 --> 00:54:24,360 of quantization levels to minimize over? 930 00:54:24,360 --> 00:54:26,830 What should I be minimizing over? 931 00:54:26,830 --> 00:54:31,510 I should be trying to find the minimum mean square error 932 00:54:31,510 --> 00:54:37,050 subject to a constraint on the entropy of this output alphabet. 933 00:54:37,050 --> 00:54:39,900 Because the entropy of the output alphabet is what 934 00:54:39,900 --> 00:54:45,020 determines what I can accomplish by discrete coding. 935 00:54:45,020 --> 00:54:49,080 That's a slightly phony problem because I'm insisting, 936 00:54:49,080 --> 00:54:56,360 now at least to start with, that the quantizer is a scalar 937 00:54:56,360 --> 00:55:01,330 quantizer, and by using coding here, I'm allowing memory in 938 00:55:01,330 --> 00:55:03,170 the coding process. 939 00:55:03,170 --> 00:55:06,860 But it's not that phony because, in fact, this 940 00:55:06,860 --> 00:55:10,680 quantization job is a real mess, and this job 941 00:55:10,680 --> 00:55:13,770 is very, very easy. 942 00:55:13,770 --> 00:55:15,930 And, in fact, when you think about it a little bit, you 943 00:55:15,930 --> 00:55:23,210 say, OK, how do people really do this if they're trying to 944 00:55:23,210 --> 00:55:25,180 implement things? 945 00:55:25,180 --> 00:55:28,230 And how would you go about implementing something like 946 00:55:28,230 --> 00:55:32,660 this entire problem that we're talking about? 947 00:55:32,660 --> 00:55:34,790 What would you do if you had to implement it? 948 00:55:38,250 --> 00:55:41,940 Well, you're probably afraid to say it, but you would use 949 00:55:41,940 --> 00:55:45,040 digital signal processing, wouldn't you? 950 00:55:45,040 --> 00:55:47,550 I mean, the first thing you would try to do is to get rid 951 00:55:47,550 --> 00:55:52,790 of all these analog values, and you would try to turn them 952 00:55:52,790 --> 00:55:57,560 into discrete values, so that really, after you 953 00:55:57,560 --> 00:56:02,200 somehow find this sequence of numbers here, you would go 954 00:56:02,200 --> 00:56:05,030 through a quantizer and quantize these numbers very, 955 00:56:05,030 --> 00:56:06,890 very finely. 956 00:56:06,890 --> 00:56:11,810 You would think of them as real numbers, and then you 957 00:56:11,810 --> 00:56:15,470 would do some kind of discrete coding, and you would wind up 958 00:56:15,470 --> 00:56:16,310 with something.
959 00:56:16,310 --> 00:56:20,820 And then you would say, ah, I have quantized these things 960 00:56:20,820 --> 00:56:22,660 very, very finely. 961 00:56:22,660 --> 00:56:25,350 And we'll see that when you quantize it very, very finely, 962 00:56:25,350 --> 00:56:30,310 what you're going to wind up with, which is almost optimal, 963 00:56:30,310 --> 00:56:34,880 is a uniform scalar quantizer, which is just what a 964 00:56:34,880 --> 00:56:39,060 garden-variety analog-to-digital converter 965 00:56:39,060 --> 00:56:42,180 does for you, OK? 966 00:56:42,180 --> 00:56:44,490 But then you say, aha! 967 00:56:44,490 --> 00:56:46,920 At that moment, I don't have enough 968 00:56:46,920 --> 00:56:52,220 bits to represent what this very fine quantizer has done. 969 00:56:52,220 --> 00:56:55,820 So then I think of these quantized values as real numbers. 970 00:56:55,820 --> 00:56:59,500 I go through the same process again, and then I think of 971 00:56:59,500 --> 00:57:01,740 quantizing the real numbers down to the number of 972 00:57:01,740 --> 00:57:04,710 bits I really want. 973 00:57:04,710 --> 00:57:08,190 Anybody catch what I just said? 974 00:57:08,190 --> 00:57:11,020 I'm saying you do this in two steps. 975 00:57:11,020 --> 00:57:15,630 The first step is a very fine quantization, strictly for 976 00:57:15,630 --> 00:57:20,440 implementation purposes and for no other reason. 977 00:57:20,440 --> 00:57:24,460 And at that point, you have bits to process. 978 00:57:24,460 --> 00:57:27,880 You do digital-signal processing, but you think of 979 00:57:27,880 --> 00:57:31,300 those bits as representing numbers, OK? 980 00:57:31,300 --> 00:57:34,430 In other words, as far as your thoughts are concerned, you're 981 00:57:34,430 --> 00:57:36,970 not dealing with the quantization errors that 982 00:57:36,970 --> 00:57:38,360 occurred here at all. 983 00:57:38,360 --> 00:57:40,730 You're just thinking of real numbers. 984 00:57:40,730 --> 00:57:43,960 And at that point, you try to design conceptually what it is 985 00:57:43,960 --> 00:57:47,020 you want to do in this process of quantizing 986 00:57:47,020 --> 00:57:49,420 and discrete coding. 987 00:57:49,420 --> 00:57:52,800 And you then go back to looking at these things as 988 00:57:52,800 --> 00:57:55,750 numbers again, and you quantize them again, and you 989 00:57:55,750 --> 00:57:59,560 discrete encode them again in whatever way makes sense. 990 00:57:59,560 --> 00:58:02,730 So you do one thing strictly for implementation purposes. 991 00:58:02,730 --> 00:58:05,600 You do the other thing for conceptual purposes. 992 00:58:05,600 --> 00:58:08,050 Then you put them together into something that works. 993 00:58:08,050 --> 00:58:12,390 And that's the way engineers operate all the time, I think. 994 00:58:12,390 --> 00:58:15,880 At least they did when I was more active doing real 995 00:58:15,880 --> 00:58:17,910 engineering. 996 00:58:17,910 --> 00:58:23,530 OK, so that's that. 997 00:58:23,530 --> 00:58:26,870 But anyway, it says finding a minimum mean square error 998 00:58:26,870 --> 00:58:30,620 quantizer for fixed M isn't the right problem that we're 999 00:58:30,620 --> 00:58:32,910 interested in. 1000 00:58:32,910 --> 00:58:34,660 If you're going to have quantization followed by 1001 00:58:34,660 --> 00:58:38,150 discrete coding, the quantizer should minimize mean square 1002 00:58:38,150 --> 00:58:43,460 error for fixed representation-point entropy.
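As a rough numerical illustration of the reformulated problem -- and this is my own illustration, not anything from the notes -- the Python sketch below quantizes Gaussian samples uniformly and reports both the mean square error and the entropy of the output. The entropy, not the count of levels actually used, is what discrete coding can get the rate down to.

    import numpy as np

    def mse_and_entropy(samples, delta):
        # Uniform scalar quantizer with cell width delta: map each sample to the
        # center of its cell, then measure distortion and output entropy.
        v = delta * np.round(samples / delta)
        mse = np.mean((samples - v) ** 2)
        _, counts = np.unique(v, return_counts=True)
        p = counts / counts.sum()
        H = -(p * np.log2(p)).sum()     # bits per symbol after good discrete coding
        return mse, H, len(counts)

    rng = np.random.default_rng(1)
    samples = rng.normal(size=1_000_000)
    mse, H, M = mse_and_entropy(samples, delta=0.25)
    # M counts every cell that was ever touched, including rare far-out ones;
    # those cost almost nothing in entropy but save a lot of mean square error.
    print(M, H, mse)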
1003 00:58:43,460 --> 00:58:45,670 In other words, I'd want to find these 1004 00:58:45,670 --> 00:58:47,460 representation points. 1005 00:58:47,460 --> 00:58:49,720 It's important what the numerical values of the 1006 00:58:49,720 --> 00:58:54,380 representation points are for mean square error, but I'm not 1007 00:58:54,380 --> 00:58:56,480 interested in how many of them I have. 1008 00:58:56,480 --> 00:59:00,120 What I'm interested in is the entropy of them. 1009 00:59:00,120 --> 00:59:04,900 OK, so the quantizer should minimize mean square error for 1010 00:59:04,900 --> 00:59:07,810 a fixed representation-point entropy. 1011 00:59:07,810 --> 00:59:12,570 I would like some algorithm, which goes through and changes 1012 00:59:12,570 --> 00:59:16,110 my representation points, including perhaps changing the 1013 00:59:16,110 --> 00:59:18,960 number of representation points, as a 1014 00:59:18,960 --> 00:59:21,530 way of reducing entropy. 1015 00:59:21,530 --> 00:59:25,240 OK, sometimes you can get more representation points by 1016 00:59:25,240 --> 00:59:28,830 having them out where there's virtually no probability, and 1017 00:59:28,830 --> 00:59:32,460 therefore they don't happen very often, but when they do 1018 00:59:32,460 --> 00:59:34,640 happen, they sure save you an awful lot 1019 00:59:34,640 --> 00:59:36,390 of mean square error. 1020 00:59:36,390 --> 00:59:40,320 So you can wind up with things using many, many more 1021 00:59:40,320 --> 00:59:43,800 quantization points than you would think you would want 1022 00:59:43,800 --> 00:59:48,270 because that's the optimal thing to do. 1023 00:59:48,270 --> 00:59:52,470 OK, when you're given the regions, if we try to say what 1024 00:59:52,470 --> 00:59:56,140 happens to the Lloyd-Max algorithm then, the 1025 00:59:56,140 --> 00:59:58,720 representation points should still be 1026 00:59:58,720 --> 01:00:00,420 the conditional means. 1027 01:00:00,420 --> 01:00:01,670 Why? 1028 01:00:06,040 --> 01:00:11,730 Anybody figure out why we want to solve the problem that way? 1029 01:00:11,730 --> 01:00:18,800 If I've already figured out what the regions should be -- 1030 01:00:18,800 --> 01:00:23,760 well, before, when I told you what the regions were, you 1031 01:00:23,760 --> 01:00:27,250 told me that you wanted to make the representation points 1032 01:00:27,250 --> 01:00:29,670 the conditional means. 1033 01:00:29,670 --> 01:00:31,570 Now if I make the representation points 1034 01:00:31,570 --> 01:00:36,350 something other than the conditional means, what's 1035 01:00:36,350 --> 01:00:37,600 going to happen to the entropy? 1036 01:00:41,840 --> 01:00:43,730 The entropy stays the same. 1037 01:00:43,730 --> 01:00:45,480 The entropy has nothing to do with what 1038 01:00:45,480 --> 01:00:48,260 you call these symbols. 1039 01:00:48,260 --> 01:00:51,760 I can move where they are, but they still have the same 1040 01:00:51,760 --> 01:00:56,800 probability because they still occur whenever you wind up in 1041 01:00:56,800 --> 01:00:59,670 that region. 1042 01:00:59,670 --> 01:01:02,630 And therefore, the entropy doesn't change, and therefore, 1043 01:01:02,630 --> 01:01:06,520 the same rule holds. 1044 01:01:06,520 --> 01:01:10,060 But now the peculiar thing is you don't want to make the 1045 01:01:10,060 --> 01:01:14,150 representation regions Voronoi regions anymore. 1046 01:01:14,150 --> 01:01:17,290 In other words, sometimes you want to make a region much 1047 01:01:17,290 --> 01:01:20,630 closer to one point than to the other point.
1048 01:01:20,630 --> 01:01:22,880 And why do you want to do that? 1049 01:01:22,880 --> 01:01:25,210 Because you'd like to make these probabilities of the 1050 01:01:25,210 --> 01:01:28,850 points as unequal as you can because you're trying to 1051 01:01:28,850 --> 01:01:30,570 reduce entropy. 1052 01:01:30,570 --> 01:01:33,540 And you can reduce entropy by making things have different 1053 01:01:33,540 --> 01:01:36,230 probabilities. 1054 01:01:36,230 --> 01:01:40,740 OK, so that's where we wind up with that. 1055 01:01:40,740 --> 01:01:43,140 I would like to say there's a nice algorithm 1056 01:01:43,140 --> 01:01:45,070 to solve this problem. 1057 01:01:45,070 --> 01:01:46,670 There isn't. 1058 01:01:46,670 --> 01:01:49,060 It's an incredibly ugly problem. 1059 01:01:49,060 --> 01:01:52,230 You might think it makes sense to use a Lagrange multiplier 1060 01:01:52,230 --> 01:01:56,100 approach and try to minimize some linear combination of 1061 01:01:56,100 --> 01:02:00,660 entropy and mean square error. 1062 01:02:00,660 --> 01:02:03,710 I've never been able to make it work. 1063 01:02:03,710 --> 01:02:08,860 So anyway, let's go on. 1064 01:02:13,730 --> 01:02:17,220 When we want to look at a high-rate quantizer, which is 1065 01:02:17,220 --> 01:02:22,350 what we very often want, you can do a very simple 1066 01:02:22,350 --> 01:02:24,450 approximation. 1067 01:02:24,450 --> 01:02:28,740 And the simple approximation makes the problem much easier, 1068 01:02:28,740 --> 01:02:31,020 and it gives you an added benefit. 1069 01:02:31,020 --> 01:02:34,130 There's something called differential entropy. 1070 01:02:34,130 --> 01:02:38,530 And differential entropy is the same as ordinary entropy, 1071 01:02:38,530 --> 01:02:42,300 except instead of dealing with probabilities, it deals with 1072 01:02:42,300 --> 01:02:45,440 probability densities. 1073 01:02:45,440 --> 01:02:49,030 And you look at this, and you say, well, that's virtually 1074 01:02:49,030 --> 01:02:53,570 the same thing, and it looks like the same thing. 1075 01:02:53,570 --> 01:02:56,020 And most physicists look at something 1076 01:02:56,020 --> 01:03:00,135 like this -- physicists who are into statistical mechanics -- 1077 01:03:00,135 --> 01:03:01,990 and say, oh, of course that's the same thing. 1078 01:03:01,990 --> 01:03:04,400 There's no fundamental difference between a 1079 01:03:04,400 --> 01:03:08,650 differential entropy and a real entropy. 1080 01:03:08,650 --> 01:03:10,790 And you say, well, but there are all these 1081 01:03:10,790 --> 01:03:11,600 scaling issues there. 1082 01:03:11,600 --> 01:03:15,060 And they say, blah, that's not important. 1083 01:03:15,060 --> 01:03:17,035 And they're right, because once you understand it, it's 1084 01:03:17,035 --> 01:03:19,160 not important. 1085 01:03:19,160 --> 01:03:25,670 But let's look and see what the similarities and what the 1086 01:03:25,670 --> 01:03:26,920 differences are. 1087 01:03:29,960 --> 01:03:34,130 And I'll think in terms of, in fact, a quantization problem, 1088 01:03:34,130 --> 01:03:38,410 where I'm taking this continuous-valued random 1089 01:03:38,410 --> 01:03:42,250 variable with density, and I'm quantizing it into a set of 1090 01:03:42,250 --> 01:03:44,110 discrete points. 1091 01:03:44,110 --> 01:03:47,750 And I want to say what's the difference between this 1092 01:03:47,750 --> 01:03:52,630 differential entropy here and this discrete entropy, which 1093 01:03:52,630 --> 01:03:55,410 we sort of understand by now.
1094 01:03:55,410 --> 01:03:59,760 Well, things that are the same -- the first thing that's the 1095 01:03:59,760 --> 01:04:04,830 same is the differential entropy is still the expected 1096 01:04:04,830 --> 01:04:09,980 value of minus the logarithm of the probability density. 1097 01:04:13,830 --> 01:04:16,340 OK, we found that that was useful before when we were 1098 01:04:16,340 --> 01:04:17,840 trying to understand the entropy. 1099 01:04:17,840 --> 01:04:22,180 It was the expected value of minus the log pmf. 1100 01:04:22,180 --> 01:04:27,020 Now the differential entropy is the expected value of minus the log pdf 1101 01:04:27,020 --> 01:04:30,180 instead of pmf. 1102 01:04:30,180 --> 01:04:35,350 And the entropy of two random variables, U1 and U2, if 1103 01:04:35,350 --> 01:04:40,020 they're independent, is just the differential entropy of U1 1104 01:04:40,020 --> 01:04:42,940 plus the differential entropy of U2. 1105 01:04:42,940 --> 01:04:43,980 How do I see that? 1106 01:04:43,980 --> 01:04:45,160 You just write it down. 1107 01:04:45,160 --> 01:04:46,790 You write down these joint densities. 1108 01:04:46,790 --> 01:04:48,150 You write down this. 1109 01:04:48,150 --> 01:04:53,650 The joint density for IID random variables splits into a 1110 01:04:53,650 --> 01:04:55,580 product of densities. 1111 01:04:55,580 --> 01:04:59,210 A log of a product is the sum of the logs. 1112 01:04:59,210 --> 01:05:01,380 It's the same as the argument before. 1113 01:05:01,380 --> 01:05:06,940 In other words, it's not only that this is the same as the 1114 01:05:06,940 --> 01:05:09,270 answer that we got before, but the argument 1115 01:05:09,270 --> 01:05:11,560 is exactly the same. 1116 01:05:11,560 --> 01:05:15,440 And the next thing is, if you shift this 1117 01:05:15,440 --> 01:05:16,800 density here. 1118 01:05:16,800 --> 01:05:20,045 If I have a density, which is going along here and I shift 1119 01:05:20,045 --> 01:05:23,410 it over to here and have a density over there, the 1120 01:05:23,410 --> 01:05:25,510 entropy is the same again. 1121 01:05:25,510 --> 01:05:28,930 Because this entropy fundamentally 1122 01:05:28,930 --> 01:05:32,230 doesn't have to do with where you happen to be. 1123 01:05:32,230 --> 01:05:35,520 It has to do with what these probability densities are. 1124 01:05:35,520 --> 01:05:40,320 And you can stick that shift into here, and if you're very 1125 01:05:40,320 --> 01:05:45,140 good at doing calculus, you can, in fact, see that when 1126 01:05:45,140 --> 01:05:49,770 you put a shift in here, you still get the same entropy. 1127 01:05:49,770 --> 01:05:53,390 I couldn't do that at this point, but I'm sure that all 1128 01:05:53,390 --> 01:05:55,100 of you can. 1129 01:05:55,100 --> 01:05:56,830 And I can do it if I spent enough 1130 01:05:56,830 --> 01:05:58,000 time thinking it through. 1131 01:05:58,000 --> 01:06:01,600 It would just take me longer than most of you. 1132 01:06:01,600 --> 01:06:05,690 OK, so all of these things are still true. 1133 01:06:05,690 --> 01:06:07,300 There are a couple of very disturbing 1134 01:06:07,300 --> 01:06:10,590 differences, though. 1135 01:06:10,590 --> 01:06:13,450 And the first difference is that h of U 1136 01:06:13,450 --> 01:06:14,700 is not scale invariant. 1137 01:06:17,720 --> 01:06:19,800 And, in fact, if it were scale invariant, it 1138 01:06:19,800 --> 01:06:22,690 would be totally useless.
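If you would rather check these properties numerically than by calculus, here is a small Python sketch; the grid-based diff_entropy helper is my own construction, not from the notes, and it also previews the scaling behavior discussed next -- stretching by a factor a adds log2 of a to the differential entropy, with a plus sign.

    import numpy as np

    def diff_entropy(f, u):
        # h(U) = E[-log2 f(U)], approximated by summing -f log2 f over a fine grid.
        du = u[1] - u[0]
        fz = f[f > 0]
        return -(fz * np.log2(fz)).sum() * du

    u = np.linspace(-40.0, 40.0, 800001)
    gauss = lambda m, s: np.exp(-0.5 * ((u - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    print(diff_entropy(gauss(0, 1), u))   # ~0.5 * log2(2*pi*e) = 2.047 bits
    print(diff_entropy(gauss(3, 1), u))   # shifting the density changes nothing
    print(diff_entropy(gauss(0, 4), u))   # stretching by a = 4 adds log2(4) = 2 bits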
1139 01:06:22,690 --> 01:06:25,610 The fact that it's not scale invariant turns out to be very 1140 01:06:25,610 --> 01:06:29,850 important when you try to understand it, OK? 1141 01:06:29,850 --> 01:06:36,890 In other words, if you stretch this probability density by 1142 01:06:36,890 --> 01:06:40,930 some quantity a, then your probability density, which 1143 01:06:40,930 --> 01:06:44,570 looks like this, shifts like this. 1144 01:06:44,570 --> 01:06:47,990 It shifts down and it shifts out. 1145 01:06:47,990 --> 01:06:52,790 So minus the log of the pdf gets bigger. 1146 01:06:52,790 --> 01:06:56,450 The pdf itself is just sort of spread out. 1147 01:06:56,450 --> 01:06:59,280 It does what you would expect it to do, and 1148 01:06:59,280 --> 01:07:01,090 that works very nicely. 1149 01:07:01,090 --> 01:07:04,630 But the log of the pdf has this extra term which comes in 1150 01:07:04,630 --> 01:07:08,470 here, which is a factor of a. 1151 01:07:08,470 --> 01:07:12,400 And it turns out you can just factor that factor of a out, 1152 01:07:12,400 --> 01:07:17,730 and you wind up with the differential entropy of the 1153 01:07:17,730 --> 01:07:22,590 scaled random variable aU equal to the differential 1154 01:07:22,590 --> 01:07:30,760 entropy of the original U plus log of a. 1155 01:07:30,760 --> 01:07:32,010 It's plus rather than minus: 1156 01:07:34,610 --> 01:07:36,280 stretching the density spreads it out, 1157 01:07:36,280 --> 01:07:38,780 so the differential entropy has to go up. 1158 01:07:38,780 --> 01:07:42,450 But the exact sign isn't the main point here. 1159 01:07:42,450 --> 01:07:45,240 I don't want to dwell on that at this point. 1160 01:07:45,240 --> 01:07:47,260 All I'm interested in is that you 1161 01:07:47,260 --> 01:07:50,130 recognize that it's different. 1162 01:07:50,130 --> 01:07:52,860 An even more disturbing point is this differential entropy 1163 01:07:52,860 --> 01:07:54,280 can be negative. 1164 01:07:54,280 --> 01:07:55,830 It can be negative or positive. 1165 01:07:55,830 --> 01:07:59,720 It can do whatever it wants to do. 1166 01:07:59,720 --> 01:08:05,540 And so we're left with a sort of a peculiar thing. 1167 01:08:05,540 --> 01:08:09,860 But we say, well, all right, that's the way it is. 1168 01:08:09,860 --> 01:08:11,410 The physicists had to deal with that 1169 01:08:11,410 --> 01:08:14,050 for many, many years. 1170 01:08:14,050 --> 01:08:19,410 And the physicists, who think about it all the time, deal 1171 01:08:19,410 --> 01:08:22,470 with it and say it's not very important. 1172 01:08:22,470 --> 01:08:26,980 So we'll say, OK, we'll go along with this, and we ask: 1173 01:08:26,980 --> 01:08:31,310 What happens if we try to build a uniform high-rate 1174 01:08:31,310 --> 01:08:35,100 scalar quantizer, which is exactly what you would do for 1175 01:08:35,100 --> 01:08:38,670 implementation purposes anyway? 1176 01:08:38,670 --> 01:08:40,550 So how does this work? 1177 01:08:40,550 --> 01:08:43,380 You pick a little tiny delta. 1178 01:08:43,380 --> 01:08:45,920 You have all these regions here. 1179 01:08:45,920 --> 01:08:50,090 And by uniform, I mean you make all the regions the same. 1180 01:08:50,090 --> 01:08:52,750 I mean, conceptually, this might mean having a countably 1181 01:08:52,750 --> 01:08:54,150 infinite number of regions. 1182 01:08:54,150 --> 01:08:57,610 Let's not worry about that for the time being. 1183 01:08:57,610 --> 01:09:01,130 And you have points in here, which are the conditional 1184 01:09:01,130 --> 01:09:04,310 means of the regions.
1185 01:09:04,310 --> 01:09:11,020 Well, if I assume that delta is small, then the probability 1186 01:09:11,020 --> 01:09:14,960 density of u is almost constant within a region, if 1187 01:09:14,960 --> 01:09:17,970 it really is a density. 1188 01:09:17,970 --> 01:09:23,800 And I can define an average value of f of u within each 1189 01:09:23,800 --> 01:09:25,350 region in the following way. 1190 01:09:25,350 --> 01:09:28,070 It's easier to see what it is graphically if I have a 1191 01:09:28,070 --> 01:09:31,220 density here, which runs along like this. 1192 01:09:33,810 --> 01:09:39,930 And within each region, I will choose f-bar of u to be a 1193 01:09:39,930 --> 01:09:47,630 piecewise constant version of this density, which might be 1194 01:09:47,630 --> 01:09:49,620 like that, OK? 1195 01:09:49,620 --> 01:09:51,610 Now that's just analytical stuff to make 1196 01:09:51,610 --> 01:09:53,980 things come out right. 1197 01:09:53,980 --> 01:09:58,300 If you read the notes about how to do this, you find out 1198 01:09:58,300 --> 01:10:02,060 that this quantity is important analytically to 1199 01:10:02,060 --> 01:10:04,270 trace through what's going on. 1200 01:10:04,270 --> 01:10:07,510 I think for the level that we're at now, it's fine to 1201 01:10:07,510 --> 01:10:13,380 just say, OK, this is the same as this, if we make the 1202 01:10:13,380 --> 01:10:16,190 quantization very small. 1203 01:10:16,190 --> 01:10:18,350 I mean, at some point in this argument, you want to go 1204 01:10:18,350 --> 01:10:21,420 through and say, OK, what do I mean by approximate? 1205 01:10:21,420 --> 01:10:23,340 Is the approximation close? 1206 01:10:23,340 --> 01:10:27,520 How does the approximation become better as I make delta 1207 01:10:27,520 --> 01:10:29,440 smaller, and all of these questions. 1208 01:10:29,440 --> 01:10:35,050 But let's just now say, OK, this is the same as that, and 1209 01:10:35,050 --> 01:10:39,710 I'd like to find out how well this uniform high-rate scalar 1210 01:10:39,710 --> 01:10:40,960 quantizer does. 1211 01:10:44,470 --> 01:10:49,610 So my high-rate approximation is that this average density 1212 01:10:49,610 --> 01:10:53,870 is the same as the true density, for 1213 01:10:53,870 --> 01:10:56,290 all possible u. 1214 01:10:56,290 --> 01:11:02,760 So conditional on u being in Rj, this quantity is constant. 1215 01:11:02,760 --> 01:11:07,770 It's constant and equal to 1 over delta, and the actual 1216 01:11:07,770 --> 01:11:10,860 conditional density is approximately equal to 1 over delta. 1217 01:11:10,860 --> 01:11:11,590 What's that mean? 1218 01:11:11,590 --> 01:11:14,510 It means that the mean square error in one of these 1219 01:11:14,510 --> 01:11:19,500 quantization regions is our usual delta squared over 12, 1220 01:11:19,500 --> 01:11:22,850 which is the mean square error of a uniform density in a 1221 01:11:22,850 --> 01:11:26,530 region from minus delta over 2 to plus delta over 2. 1222 01:11:26,530 --> 01:11:28,550 So that's the mean square error. 1223 01:11:28,550 --> 01:11:30,010 You can't do anything with that. 1224 01:11:30,010 --> 01:11:32,680 You're stuck with it. 1225 01:11:32,680 --> 01:11:36,650 The next question is what is the entropy 1226 01:11:36,650 --> 01:11:39,880 of the quantizer output? 1227 01:11:39,880 --> 01:11:42,760 I'm sorry, I'm giving a typical MIT lecture today, 1228 01:11:42,760 --> 01:11:46,780 which people have characterized as a fire hose, 1229 01:11:46,780 --> 01:11:51,130 and it's because I want to get done with this material today.
1230 01:11:51,130 --> 01:11:53,680 I think if you read the notes afterwards, you've got a 1231 01:11:53,680 --> 01:11:57,010 pretty good picture of what's going on. 1232 01:11:57,010 --> 01:11:59,300 We are not going to have an enormous number 1233 01:11:59,300 --> 01:12:00,380 of problems on this. 1234 01:12:00,380 --> 01:12:03,295 It's not something we're stressing, so I just want to get you 1235 01:12:03,295 --> 01:12:07,280 to have some idea of what this is all about. 1236 01:12:07,280 --> 01:12:10,830 This slide is probably the most important slide because 1237 01:12:10,830 --> 01:12:13,640 it tells you what this differential 1238 01:12:13,640 --> 01:12:15,390 entropy really is. 1239 01:12:18,040 --> 01:12:22,070 OK, I'm going to look at the probabilities of each of these 1240 01:12:22,070 --> 01:12:27,680 discrete points now. p sub j is the probability that the 1241 01:12:27,680 --> 01:12:30,990 quantizer will produce the point a sub j. 1242 01:12:30,990 --> 01:12:33,260 In other words, it is the probability of the 1243 01:12:33,260 --> 01:12:35,790 region R sub j. 1244 01:12:35,790 --> 01:12:40,320 And that's just the integral of the probability density 1245 01:12:40,320 --> 01:12:41,490 over the j-th region. 1246 01:12:41,490 --> 01:12:46,450 Namely, it's the probability of being in that j-th region. 1247 01:12:46,450 --> 01:12:52,080 p sub j is also equal to this average value times delta. 1248 01:12:52,080 --> 01:12:55,090 OK, so I look at what the entropy is of 1249 01:12:55,090 --> 01:12:57,610 the high-rate quantizer. 1250 01:12:57,610 --> 01:13:02,450 It's the sum of minus pj log pj. 1251 01:13:02,450 --> 01:13:07,090 I substitute pj equal to f-bar of aj times delta, 1252 01:13:07,090 --> 01:13:08,900 so the sum over j 1253 01:13:08,900 --> 01:13:15,850 becomes an integral: minus the integral of fU of u, times 1254 01:13:15,850 --> 01:13:19,200 the log of the product f-bar of u times delta, 1255 01:13:19,200 --> 01:13:22,440 integrated over u. 1256 01:13:22,440 --> 01:13:25,660 OK, in other words, these probabilities are scaled by 1257 01:13:25,660 --> 01:13:27,560 delta, which is what I was telling you before. 1258 01:13:27,560 --> 01:13:30,330 That's the crucial thing here. 1259 01:13:30,330 --> 01:13:33,790 As you make delta smaller and smaller, the probabilities of 1260 01:13:33,790 --> 01:13:37,510 the intervals go down with delta. 1261 01:13:37,510 --> 01:13:41,810 OK, so I break this up into two terms now. 1262 01:13:41,810 --> 01:13:47,230 I take out the log of delta, which integrates against fU to give just minus log of delta. 1263 01:13:47,230 --> 01:13:49,200 I have this quantity left. 1264 01:13:49,200 --> 01:13:52,670 This f-bar is approximately equal to the density, and what 1265 01:13:52,670 --> 01:13:56,380 I wind up with is that the actual entropy of the 1266 01:13:56,380 --> 01:14:00,680 quantized version is equal to the differential entropy minus 1267 01:14:00,680 --> 01:14:04,010 the log of delta at high rate. 1268 01:14:04,010 --> 01:14:07,550 OK, that's the only way I know to interpret differential 1269 01:14:07,550 --> 01:14:11,750 entropy that makes any sense. In other words, usually 1270 01:14:11,750 --> 01:14:14,120 when you think of integrals, you think of them 1271 01:14:14,120 --> 01:14:15,190 in a Riemann sense.
1272 01:14:15,190 --> 01:14:18,160 You think of breaking up the interval into lots of very 1273 01:14:18,160 --> 01:14:22,530 thin slices with delta, and then you integrate things by 1274 01:14:22,530 --> 01:14:26,250 adding things up, and that integral is then a sum. 1275 01:14:26,250 --> 01:14:28,600 It's the same as this kind of sum. 1276 01:14:28,600 --> 01:14:30,990 And in the integral that you wind up with when you're dealing 1277 01:14:30,990 --> 01:14:36,110 with the log of the pmf, you get this extra delta 1278 01:14:36,110 --> 01:14:40,550 sticking in there, OK? 1279 01:14:40,550 --> 01:14:48,640 So this entropy is this phony thing minus log of delta. 1280 01:14:48,640 --> 01:14:52,650 As I quantize more and more finely, this 1281 01:14:52,650 --> 01:14:55,020 entropy keeps going up. 1282 01:14:55,020 --> 01:14:58,990 It goes up because when delta is very, very small, minus log 1283 01:14:58,990 --> 01:15:01,590 delta is a large number. 1284 01:15:01,590 --> 01:15:05,900 So as delta gets smaller and smaller, this entropy heads 1285 01:15:05,900 --> 01:15:08,330 off towards infinity. 1286 01:15:08,330 --> 01:15:12,030 And in fact, you would expect that because if you take a 1287 01:15:12,030 --> 01:15:16,150 real number with a distribution on -- well, 1288 01:15:16,150 --> 01:15:21,470 anything just between minus 1 and plus 1, for example -- and I 1289 01:15:21,470 --> 01:15:25,250 want to represent it very, very well when it's uniformly 1290 01:15:25,250 --> 01:15:28,670 distributed there, then the better I try to represent it, the 1291 01:15:28,670 --> 01:15:30,800 more bits it takes. 1292 01:15:30,800 --> 01:15:33,580 That's what this is saying. 1293 01:15:33,580 --> 01:15:37,680 This thing is changing as I try to represent things better 1294 01:15:37,680 --> 01:15:39,350 and better. 1295 01:15:39,350 --> 01:15:43,590 This quantity here is just some funny thing that deals 1296 01:15:43,590 --> 01:15:45,790 with the shape of the probability density and 1297 01:15:45,790 --> 01:15:48,040 nothing else. 1298 01:15:48,040 --> 01:15:53,180 It essentially has a scale factor built into it because 1299 01:15:53,180 --> 01:15:56,690 the probability density has a scale factor built into it. 1300 01:15:56,690 --> 01:16:02,270 A probability density is probability per unit length. 1301 01:16:02,270 --> 01:16:08,500 And therefore, this kind of entropy has to have that unit 1302 01:16:08,500 --> 01:16:11,490 length coming in here somehow, and that's the 1303 01:16:11,490 --> 01:16:13,380 way it comes in. 1304 01:16:13,380 --> 01:16:14,630 That's the way it is. 1305 01:16:19,800 --> 01:16:25,050 So to summarize all of that, and I'm sure you haven't quite 1306 01:16:25,050 --> 01:16:30,050 understood all of it, but if you do efficient discrete 1307 01:16:30,050 --> 01:16:35,240 coding at the end of the whole thing, the number of bits per 1308 01:16:35,240 --> 01:16:40,545 sample, namely, per number coming into the quantizer that 1309 01:16:40,545 --> 01:16:44,030 you need, is H of V, OK? 1310 01:16:44,030 --> 01:16:52,510 With this uniform quantizer, which produces an entropy of 1311 01:16:52,510 --> 01:16:56,530 the symbols H of V, then you need L-bar bits per symbol to 1312 01:16:56,530 --> 01:16:57,470 represent that. 1313 01:16:57,470 --> 01:17:00,110 That's the result that we had before. 1314 01:17:00,110 --> 01:17:08,870 This quantity H of V depends only on delta and h of U. 1315 01:17:08,870 --> 01:17:14,920 Namely, h of U is equal to H of V plus log delta.
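That relation is easy to check empirically. The Python sketch below -- my own illustration, under the assumption of a unit-variance Gaussian source, where the closed form h(U) = (1/2) log2(2 pi e) applies -- quantizes at two cell widths and compares the measured output entropy with h(U) minus log2 of delta, and the measured distortion with delta squared over 12.

    import numpy as np

    rng = np.random.default_rng(2)
    samples = rng.normal(size=2_000_000)       # unit-variance Gaussian source
    h_U = 0.5 * np.log2(2 * np.pi * np.e)      # differential entropy, ~2.047 bits

    for delta in (0.2, 0.1):                   # halving delta should add one bit
        v = delta * np.round(samples / delta)  # uniform quantizer, cell centers
        _, counts = np.unique(v, return_counts=True)
        p = counts / counts.sum()
        H = -(p * np.log2(p)).sum()            # measured entropy of the output
        mse = np.mean((samples - v) ** 2)      # measured distortion
        print(delta, H, h_U - np.log2(delta), mse, delta ** 2 / 12)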
1316 01:17:14,920 --> 01:17:17,160 So, in other words, analytically, the only thing 1317 01:17:17,160 --> 01:17:20,120 you have to worry about is what is this 1318 01:17:20,120 --> 01:17:21,880 differential entropy? 1319 01:17:21,880 --> 01:17:25,310 You can't interpret what it is, but you can calculate it. 1320 01:17:25,310 --> 01:17:28,660 And once you calculate it, this tells you what this 1321 01:17:28,660 --> 01:17:31,630 entropy is for every choice of delta so 1322 01:17:31,630 --> 01:17:34,240 long as delta is small. 1323 01:17:34,240 --> 01:17:38,530 It says that when you get to the point where delta is small 1324 01:17:38,530 --> 01:17:42,690 and the probability density is essentially constant over a 1325 01:17:42,690 --> 01:17:46,360 region, if I want to make delta half as big as it was 1326 01:17:46,360 --> 01:17:49,700 before, then I'm going to wind up with twice as many 1327 01:17:49,700 --> 01:17:51,270 quantization regions. 1328 01:17:51,270 --> 01:17:52,990 They're all going to be only half as 1329 01:17:52,990 --> 01:17:55,000 probable as the ones before. 1330 01:17:55,000 --> 01:17:56,660 It's going to take me one extra bit to 1331 01:17:56,660 --> 01:17:59,550 represent that, OK? 1332 01:17:59,550 --> 01:18:03,930 Namely, this one extra bit for each of these old regions is 1333 01:18:03,930 --> 01:18:07,110 going to tell me whether I'm on the right side or the left 1334 01:18:07,110 --> 01:18:13,220 side of that old quantization region, to specify the new 1335 01:18:13,220 --> 01:18:14,310 quantization region. 1336 01:18:14,310 --> 01:18:17,280 So this all makes sense, OK? 1337 01:18:17,280 --> 01:18:19,770 Namely, the only thing that's happening as you make the 1338 01:18:19,770 --> 01:18:25,270 quantization finer and finer is that you have these extra 1339 01:18:25,270 --> 01:18:28,580 bits coming in, which are sort of telling you what the fine 1340 01:18:28,580 --> 01:18:30,540 structure is. 1341 01:18:30,540 --> 01:18:34,930 And little h of U already has built into it the overall 1342 01:18:34,930 --> 01:18:38,280 shape of the thing. 1343 01:18:38,280 --> 01:18:42,850 OK, now I've added one thing more here, which I haven't 1344 01:18:42,850 --> 01:18:44,450 talked about at all. 1345 01:18:44,450 --> 01:18:50,440 This uniform scalar quantizer in fact becomes optimal as 1346 01:18:50,440 --> 01:18:53,440 delta becomes small. 1347 01:18:53,440 --> 01:18:56,830 And that's not obvious; it's not intuitive. 1348 01:18:56,830 --> 01:18:59,070 There's an argument in the text that 1349 01:18:59,070 --> 01:19:00,410 shows why that is true. 1350 01:19:00,410 --> 01:19:05,680 It uses a Lagrange multiplier to do it. 1351 01:19:05,680 --> 01:19:09,210 I guess there's a certain elegance to it. 1352 01:19:09,210 --> 01:19:12,310 I mean, after years of thinking about it, I would 1353 01:19:12,310 --> 01:19:14,590 say, yeah, it's pretty likely that it's true even if I didn't 1354 01:19:14,590 --> 01:19:17,670 have the mathematics to know it's true. 1355 01:19:17,670 --> 01:19:20,700 But, in fact, it's a very interesting result. 1356 01:19:20,700 --> 01:19:26,760 It says that what you do with classical A-to-D converters, 1357 01:19:26,760 --> 01:19:29,320 if you're going to have very fine quantization, is, in 1358 01:19:29,320 --> 01:19:31,780 fact, the right thing to do.