NARRATOR: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: To take up the topic of quantization, you remember we're talking about source coding in the first part of the course, channel coding in the last 2/3 of the course. Source coding, like all Gaul, is divided into three parts. If you have waveforms, such as speech, you start out with the waveform. The typical way to encode waveforms is you first either sample the waveform or you expand it in some kind of expansion. When you do that, you wind up with a sequence of numbers. You put the sequence of numbers into a quantizer, and the quantizer reduces that to a discrete alphabet. You put the discrete symbols into the discrete encoder. You pass it through a reliable binary channel.

What is a reliable binary channel? It's a layered view of any old channel in the world, OK? In other words, the way that discrete channels work these days is that, in almost all cases, what goes into them is a binary stream of symbols, and what comes out of them is a binary stream of symbols, and the output is essentially the same as the input. That's the whole purpose of how you design digital channels, and they work over analog media and all sorts of things.

OK, so discrete encoders and discrete decoders are really a valid topic to study in their own right. I mean, you have text and stuff like that, which is discrete to start with, so there's a general topic of how do you encode discrete things? We've pretty much answered that problem, at least in an abstract sense, and the main point there is you find the entropy of that discrete sequence, the entropy per symbol, and then you find ways of encoding that discrete source in such a way that the number of bits per symbol is approximately equal to that entropy. We know you can't do any better than that.
You can't do an encoding which is uniquely decodable, that is, one from which you can recover the original symbols again, with anything less than the entropy. So at least we know roughly what the answer to that is. We even know some classy schemes like the Lempel-Ziv algorithm, which will in fact operate without even knowing anything about what the probabilities are. So we sort of understand this block here at this point.

And we could start with this next, or we could start with this next, and unlike the electrician here, we're going to move in sequence and look at this next and this third. There's another reason for that. When we get into this question, we will be talking about what do you do with waveforms? How do you deal with waveforms? It's what you've been studying probably since the sixth grade, and you're all familiar with it. You do integration and things like that. You know how to work with functions. What we're going to do with functions in this course is a little bit different than what you're used to. It's quite a bit different than what you learned in your Signals and Systems course, and you'll find out why as we move along.

But we have to understand this business of how to deal with waveforms, both in terms of this block, which is the final block we'll study in source coding, and also in terms of how to deal with channels, because in real channels, generally what you transmit is a waveform. What the noise does to you is to add a waveform or, in some sense, multiply what you put in by something else. And all of that is waveform stuff. And all of information theory and all of digital communication is based on thinking bits. So somehow or other, we have to become very facile in going from waveforms to bits.

Now I've been around the professional communities of both communication and information theory for a long, long time. There is one fundamental problem that gives everyone trouble, because information theory texts do not deal with it and communication texts do not deal with it.
And that problem is how do you go from one to the other, which seems like it ought to be an easy thing. It's not as easy as it looks, and therefore, we're going to spend quite a bit of time on that. In fact, I passed out lectures 6 and 7 today, which in previous years I've done in two separate lectures, because quantization is a problem that looks important. We'll see that it's not quite as important as it looks.

And I guess the other thing is, I mean, at different times you have to teach different things or you get stale on it. And I've just finished writing some fairly nice notes, I think, on this question of how do you go from waveforms to numbers and how do you go from numbers to waveforms. And I want to spend a little more time on that this year than I did last year. I want to spend, therefore, a little less time on quantization, so that next time, we will briefly review what we've done today on quantization, but we will essentially just compress those two lectures all into one.

In dealing with waveforms, we're going to learn some interesting and kind of cool things. For those of you who are not that interested in mathematics, you know that people study things like the theory of real variables and functional analysis and all of these neat things, which are, in a sense, very advanced mathematics. They're all based on measure theory, and you're going to find out a little bit about measure theory here. Not an awful lot, but just enough to know why one has to deal with those questions, because the major results in dealing with waveforms and samples really can't be stated in any other form than in a somewhat measure-theoretic form. So we're going to find out just enough about that so we can understand what those issues are about. So that's why next time we're going to start dealing with the waveform issues.

OK, so today, we're dealing with these quantizer issues: how do you take a sequence of numbers and turn them into a sequence of symbols at the other end? How do you take a sequence of symbols and turn them back into numbers?
So when you convert real numbers to binary strings, you need a mapping from the set of real numbers to a discrete alphabet. And we're typically going to have a mapping from the set of real numbers into a finite discrete alphabet. Now what's the first obvious thing that you notice when you have a mapping that goes from an infinite set of things into a finite set of things?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Not really. It's a much simpler idea than that. How are you going to get back? I mean, usually when you map something into something else, you would like to get back again. When you map an infinite set into a finite set, how are you going to get back?

AUDIENCE: You're not.

PROFESSOR: You're not. Good! There's not any way in hell that you're ever going to get back, OK? So, in other words, what you've done here is to deliberately introduce some distortion into the picture. You've introduced distortion because you have no choice. If you want to turn numbers into bits, you can't get back to the exact numbers again. So you can only get back to some approximation of what the numbers are.

But anyway, this process is called scalar quantization, if we're mapping from the set of real numbers to a discrete alphabet. If instead you want to convert real n-tuples into sequences of discrete symbols, in other words, into a finite alphabet, you call that vector quantization, because you can view a sequence of n real numbers as a vector with n coordinates, or n components. And I'm not doing anything fancy with vectors here. You just look at an n-tuple of numbers as a vector.

OK, so scalar quantization is going to encode each term of the source sequence separately. And vector quantization is first going to segment this sequence of numbers into blocks of n numbers each, and then it's going to find a way of encoding those n-blocks into discrete symbols.
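A minimal Python sketch of that distinction, with made-up representation points and a made-up two-dimensional codebook; none of these numbers come from the lecture:

```python
import numpy as np

# Made-up representation points for a scalar quantizer (alphabet size M = 4).
points = np.array([-3.0, -1.0, 1.0, 3.0])

def scalar_quantize(u):
    """Encode one source number by itself: map it to the nearest point."""
    return float(points[np.argmin(np.abs(points - u))])

def vector_quantize(block, codebook):
    """Encode an n-tuple jointly: map it to the nearest codebook vector."""
    dists = np.linalg.norm(codebook - np.asarray(block), axis=1)
    return codebook[np.argmin(dists)].tolist()

source = [0.3, -2.7, 1.9, 0.8]

# Scalar quantization: each term of the source sequence separately.
print([scalar_quantize(u) for u in source])

# Vector quantization: segment the sequence into blocks of n = 2 numbers,
# then encode each block against a codebook of two-dimensional points.
codebook = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
print([vector_quantize(source[i:i + 2], codebook)
       for i in range(0, len(source), 2)])
```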
Does this sound a little bit like what we've already done in dealing with discrete sources? Yeah, it's exactly the same thing. I mean, we started out by mapping individual symbols into bit sequences, and then we said, gee, we can also map n-blocks of those symbols into bits, and we said, gee, this is the same problem again. There's nothing different, nothing new.

And it's the same thing here, almost, except here the properties of the real numbers are important. Why are the properties of the real numbers important? Why can't we just look at this as symbols? Well, because, since we can't do this mapping in an invertible way, you have to deal with the fact that you have distortion here. There's no other way to think about it. There is distortion. You might as well face it. If you try to cover it up, it just comes up to kick you later. So we face it right in the beginning, and that's why we deal with these things as numbers.

So let's look at a simple example of what a scalar quantizer is going to do. Basically, what we have to do is to map the line R, namely the set of real numbers, into M different regions, which we'll call R1 up to R sub M. And in this picture here, here's R1, here's R2, here's R3, here's R4, R5 and R6, and that's all the regions we have. You'll notice one of the things that that does is it takes an enormous set of numbers, namely all these numbers less than this, and maps them all into the same symbol. So you might wind up with a fair amount of distortion there, no matter how you measure distortion. All these outliers here all get mapped into a6, and everything in the middle gets mapped somehow into these intermediate values.

But every source value now in the region R sub j, we're going to map into a representation point a sub j. So everything in R1 is going to be mapped into a1. Everything in R2 is going to get mapped into a2, and so forth. Is this a general way to map R into a set of M symbols? Is there anything else I ought to be thinking about?
Well, here, these regions here have a very special property. Namely, each region is an interval. And we might say to ourselves, well, maybe we shouldn't map points into intervals. But aside from the fact that we've chosen intervals here, this is a perfectly general way to represent a mapping from the real numbers into a discrete set of things. Namely, when you're doing a mapping from the real numbers into a discrete set of things, there's some set of real numbers that get mapped into a1, and that by definition is called R1. There's some set of numbers which get mapped into a2. That by definition is called R2, and so forth. So aside from the fact that these regions in this picture happen to be intervals, this is a perfectly general mapping from R into a discrete alphabet.

So since I've decided I'm going to look at scalar quantizers first, this is a completely general view of what a scalar quantizer is. You tell me how many quantization regions you want, namely how big the alphabet is that you're mapping things into, and then your only problem is how do you choose these regions and how do you choose the representation points?

OK, one new thing here: before, we said when you have a set of symbols a1 up to a6, it doesn't matter what you call them. They're just six symbols. Here it makes a difference what you call them, because here they are representing real numbers, and when you map some real number u on the real line into one of these letters, the distortion is u minus a sub j. If you're mapping u into a sub j, then you get a distortion, which is this difference here. I haven't said yet what I am interested in as far as distortion is concerned. Am I interested in squared distortion, cubed distortion, absolute magnitude of distortion? I haven't answered that question yet. But there is a distortion here, and somehow that has to be important. We have to come to grips with what we call a distortion and somehow how big that distortion is going to be.
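To get a feel for the size of that distortion, here is a toy Monte Carlo sketch in Python, assuming a Gaussian source and a made-up six-point quantizer whose regions are the nearest-neighbor intervals around the points (using the squared measure that the lecture settles on shortly):

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up six-point scalar quantizer; take the regions to be the
# nearest-neighbor intervals around the points.
points = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])

# Draw a Gaussian source. Every outlier far to the left or right collapses
# onto an end point, so those samples contribute large distortions.
u = rng.standard_normal(100_000)
v = points[np.argmin(np.abs(points[:, None] - u[None, :]), axis=0)]

# Estimate the mean squared distortion E[(U - V)^2] by averaging.
print(np.mean((u - v) ** 2))
```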
So our problem here is somehow to trade off between distortion and number of points. As we make the number of points bigger, we can presumably make the distortion smaller in some sense, although the distortion is always going to be very big from these really big negative numbers and from really big positive numbers. But aside from that, we just have a problem of how do you choose the regions? How do you choose the points?

OK, I've sort of forgotten about the problem that we started with. And the problem that we started with was to have a source where the source was a sequence of numbers. And when we're talking about sources, we're talking about something stochastic. We need a probability measure on these real numbers that we're encoding. If we knew what they were, there wouldn't be any need to encode them. I mean, if we knew and the receiver knew, there would be no need to encode them. The receiver would just print out what they were, or store them, or do whatever the receiver wants to do with them.

OK, so we're going to view the source value u as the sample value of some random variable capital U. And more generally, since we have a sequence, we're going to consider a source sequence to be U1, U2, U3 and so forth, or you could consider it a bi-infinite sequence, starting at U minus infinity and working its way up forever. And then we're going to have some sort of statistical model for this sequence of random variables. Our typical model for these is to assume that we have a memoryless source. In other words, U1, U2, U3 are independent, identically distributed random variables. That's the model we'll use until we get smarter and start to think of something else.

OK, so now each of these source values we're going to map into some representation point a sub j. That's what defines the quantizer. And now since u is a sample value of a random variable U, a sub j is going to be a sample value of some random variable V.
OK, in other words, the probabilities of these different sample values are going to be determined by the set of u's that map into that a sub j. So we have a source sequence U1, U2, blah, blah, blah. We have a representation sequence V1, V2, blah, blah, blah, which is defined by: if U sub k is in R sub j, then V sub k is equal to a sub j.

Point of confusion here: it's not confusing now, but it will be confusing at some point to you. When you're talking about sources, you really need two indices that you're talking about all the time. OK, one of them is how to represent different elements in time. Here we're using k as a way of keeping track of what element in time we're talking about. We're also talking about a discrete alphabet, which has a certain number of elements in it, which is completely independent of time. Namely, we've just described the quantizer as something which maps real numbers into sample values. It has nothing to do with time at all. We're going to use that same thing again and again and again, and we're using the subscript j to talk about that.

When you write out problem solutions, you are going to find that it's incredibly difficult sometimes to write sentences which distinguish whether you're talking about one element out of an alphabet or one element out of a time sequence. And everybody has that trouble, and you read most of the literature in information theory or communication theory, and you can't sort out most of the time what people are talking about, because they're doing that. I recommend saying "an element of an alphabet" to talk about this sort of thing, what a sub j is, and "an element of a time sequence" to keep track of things at different times. It's a nice way of keeping them straight.

OK, so anyway, for a scalar quantizer, we're going to be able to just look at a single random variable U, which is a continuous-valued random variable, which takes values anywhere on the real line, and map it into a single element in this discrete alphabet, which is the set a1 up to a6 that we were talking about here.
So a scalar quantizer then is just a map of this form, OK? So the only thing we need for a scalar quantizer: we can now forget about time and talk about how do you choose the regions, how do you choose the representation points?

OK, and there's a nice algorithm there. Again, it's one of these things where, if you were the first person to think about it, it's an easy way to become famous. You might not stay famous, but you can get famous initially.

Anyway, we're almost always interested in the mean squared error or the mean squared distortion, MSD or MSE, which is the expected value of the square of U minus V. U is this real-valued random variable. V is the discrete random variable into which it maps. We have the distortion between U and V, which is U minus V. We now have the expected value of that squared distortion.

Why is everybody interested in squared distortion instead of magnitude distortion or something else? In many engineering problems, you should be more interested in magnitude distortion. Sometimes you're much more interested in fourth-moment distortion or some other strange thing. Why do we always use mean squared distortion? And why do we use mean-squared everything throughout almost everything we do in communication? I'll tell you the reason now. We'll come back and talk about it more a number of times.

It's because this quantization problem that we're talking about is almost always a subproblem of this problem, where you're dealing with waveforms, where you take the waveform and you sample the waveform, or you take the waveform and you turn the waveform into some expansion. When you find the mean-squared distortion in a quantizer, it turns out that that maps in a beautiful way into the mean-squared distortion between waveforms. If you deal with magnitudes or with anything else in the world, all of that beauty goes away, OK? In other words, whenever you want to go from waveforms to numbers, the one thing which remains invariant, which remains nice all the way through, is this mean-square distortion, mean-square value, OK?
In other words, if you want to layer this problem into looking at this problem separately from looking at this problem, almost the only way you can do it that makes sense is to worry about mean square distortion rather than some other kind of distortion. So even though as engineers we might be interested in something else, we almost always stick to that, because that's the thing that we can deal with most nicely.

I mean, as engineers, we're like the drunk who dropped his wallet on a dark street, and he's searching for it, and somebody comes along. And here he is underneath a beautiful light where he can see everything. Somebody asks him what he's looking for, and he says he's looking for his wallet. The guy looks down and says, well, there's no wallet there. The drunk says, I know. I dropped it over there, but it's dark over there. So we use mean square distortion in exactly the same sense. It isn't necessarily the problem we're interested in, but it's a problem where we can see things clearly.

So given that we're interested in the mean square distortion of a scalar quantizer, an interesting analytical problem that we can play with is this: for a given probability density on this real random variable (we're assuming we have a probability density) and a given alphabet size M, how do you choose these regions and how do you choose these representation points in such a way as to minimize the mean square error, OK? So, in other words, we've taken a big sort of messy, amorphous engineering problem, and we said, OK, we're going to deal with mean square error, and OK, we're going to deal with scalar quantizers, and OK, we're going to fix the number of quantization levels, so we've made all those choices to start with. We still have this interesting problem of how do you choose the right regions? How do you choose the right sample points? And that turns out to be a simple problem. I wouldn't talk about it if it wasn't simple. And we'll break it into subproblems, and the subproblems are really simple.
The first subproblem is: if I tell you what representation points I want to use, namely, in this picture here, I say, OK, I want to use these representation points, and then I ask you, how are you going to choose the regions in an optimal way to minimize mean square error? Well, you think about that for a while, and you think about it in a number of ways. When you think about it in just the right way, the answer becomes obvious.

And the answer is: let me not think about the regions here, but let me think about a particular value that comes out of the source. Let me think, how should I construct a rule for how to take outputs from this source and map them into some number here? So if I get some output from the source u, I say, OK, what's the distortion between u and a1? It's u minus a1. And the magnitude of that is the magnitude of u minus a1. The square of it is the square of u minus a1. And then I say, OK, let me compare that with u minus a2. Let me compare it with u minus a3, and so forth. Let me choose the smallest of those things. Which is the smallest for any given u?

Suppose I have a u which happens to be right there, for example. What's the closest representation point? Well, a3 is obviously closer than a2. And in fact for any point in here, which is closer to a3 than it is to a2, we're going to choose a3. Now what's the set of points which are closer to a3 than they are to a2? Well, you put a line here right between a2 and a3, OK? When you put that line right between a2 and a3, everything on this side is closer to a3 and everything on this side is closer to a2.

So the answer is, in fact, simple, once you see the answer. And the answer is: given these points, we simply construct bisectors between them. Namely, halfway between a1 and a2, we call that b1. Halfway between a2 and a3, we call that b2, and those are the separators between the regions, OK?
In other words, what we wind up doing is we define the region R sub j, the set of things which get mapped into a sub j, as the interval bounded by b sub j minus 1 and b sub j, where b sub j minus 1 is the average of a sub j minus 1 and a sub j, and b sub j is the average of a sub j and a sub j plus 1, where I've already ordered the points a sub j going from left to right, OK?

And that also tells us that the minimum mean square error regions have got to be intervals. There is no reason at all to ever pick regions which are not intervals, because as soon as you start to solve this problem for any given set of representation points, you wind up with intervals. So if you ever think of using things that are not intervals for this mean square error problem, then as soon as you look at this, you say, well, aha, that can't be the best thing to do. I will make my regions intervals. I will therefore simplify the whole problem, and I'll also improve it. And when you can both simplify things and improve them at the same time, it usually is worth doing it, unless you're dealing with standards bodies or something, and then all bets are off.

OK, so this one part of the problem is easy. If we know what the representation points are, we can solve the problem.

OK, we have a second problem, which is just the opposite problem. Suppose then that somebody gives you the interval regions and asks you, OK, if I give you these interval regions, how are you going to choose the representation points to minimize the mean square error? And analytically that's harder, but conceptually, it's just about as easy. And let's look at it and see if we can understand it.

Somebody gives us this region here, R sub 2, and says, OK, where should we put the point a sub 2? Well, anybody have any clues as to where you want to put it?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?

AUDIENCE: At the midpoint?

PROFESSOR: At the midpoint. It sounds like a reasonable thing, but it's not quite right.

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?
AUDIENCE: [INAUDIBLE]

PROFESSOR: It depends on the probability density, yes. If you have a probability density which is highly weighted over on -- let's see, was I talking about R2 or R3? Well, it doesn't make any difference. We'll talk about R2. If I have a probability density which looks like this, then I want a2 to be closer to the left-hand side than to the right-hand side. I want it to be a little bit weighted towards here, because that's more important to the mean square error.

OK, if you didn't have any of this stuff, and I said how do I choose this to minimize the mean square error, what's your answer then? If I just have one region and I want to minimize the mean square error, what do you do? Anybody who doesn't know should go back and study elementary probability theory, because this is almost day one of elementary probability theory, when you start to study what random variables are all about. And the next thing you start to look at is things like variance and second moment. And what I'm asking you here is, how do you choose a2 in order to minimize the second moment of U, whatever it is in here, minus a2? Yes?

AUDIENCE: [INAUDIBLE]

PROFESSOR: I want to take the expectation of U over the region R2. In other words, to say it more technically, I want to take the expectation of the conditional random variable U, conditional on being in R2, OK? And all of you could figure that out yourselves if you sat down quietly and thought about it for five minutes. If you got frightened about it and started looking it up in a book or something, it would take you about two hours to do it. But if you just asked yourselves, how do I do it, you'll come up with the right answer very, very soon, OK?

So subproblem 2 says: let's look at the conditional density of U in this region R sub j.
I'll call the conditional density -- well, the conditional density, given that you're in this region, is, in fact, the real density divided by the probability of being in that interval, OK? So we'll call that the conditional density. And I'll let U sub j be the random variable which has this density. In other words, this isn't a random variable on the probability space that we started out dealing with. It's sort of a phony random variable. But, in fact, it's the intuitive thing that you think of, OK? In other words, if this is exactly what you were thinking of, you probably wouldn't have called it a separate random variable, OK? So this is a random variable with this density.

The expected value of U sub j minus a sub j, quantity squared, as you all know, is sigma squared of U sub j plus the quantity, expected value of U sub j minus a sub j, squared, OK? How do you minimize this? Well, you're stuck with the variance term. The other term, you minimize by making a sub j equal to the expected value of U sub j, which is exactly what you said. Namely, you set a sub j to be the conditional mean of U sub j, conditional on being in the interval. It's harder to say mathematically than it is to see it.

But, in fact, the intuitive idea is exactly right. Namely, you condition your random variable on being in that interval, and then what you want to do is choose the mean within that interval, which is exactly the sort of thing we were thinking about here. When I drew this curve here, if I scale it, this, in fact, is the density of u conditional on being in R sub j. And now all I'm trying to do is choose the point here, which happens to be the mean, which minimizes this second moment. The second moment, in fact, is this mean square error. So, bingo! That's the second problem.

Well, how do you put the two problems together? Well, you can put it together in two ways.
One of them is to say, OK, well then, clearly an optimal scalar quantizer has to satisfy both of these conditions. Namely, the endpoints of the regions have to be the midpoints between the representation points, and the representation points have to be the conditional means of the points within a region that you start with. And then you say, OK, how do I solve that problem?

Well, if you're a computer scientist, and sometimes it's good to be a computer scientist, you say, well, I don't know how to solve the problem, but generating an algorithm to solve the problem is almost trivial. I start out with some arbitrary set of representation points. That should be a capital M, because that's the number of points I'm allowed to use. Then the next step in the algorithm is: I choose these separation points so that b sub j is the midpoint between a sub j and a sub j plus 1, for 1 less than or equal to j less than or equal to M minus 1. Then as soon as I get these midpoints, I have a set of intervals, and my next step is to set a sub j equal to the expected value of this conditional random variable U sub j, where R sub j is now this new interval, open on the left side and closed on the right side. Since there's a probability density, it doesn't make any difference which; I just wanted to give you an explicit rule for what to do with that probability-zero event when you happen to wind up on that special boundary point.

OK, and then you iterate on steps 2 and 3 until you get a negligible improvement. And then you ask, of course, well, if I got a negligible improvement this time, maybe I'll get a bigger improvement next time. And that, of course, is true. And you then say, well, it's possible that after I can't get any improvement anymore, I still don't have the optimal solution. And that, of course, is true also. But at least you have an algorithm which makes sense and which, each time you try it, you do a little better than you did before.
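Here is a sketch of that iteration in Python, assuming a Gaussian source and working from a large batch of source samples rather than an explicit density, so the conditional means become sample averages; the function names and the fixed iteration count are made up for illustration:

```python
import numpy as np

def lloyd_max(samples, M, iters=100):
    """A sketch of the Lloyd-Max iteration in one dimension, run on
    source samples; the lecture iterates until the improvement is
    negligible, but a fixed count keeps the sketch short."""
    rng = np.random.default_rng(1)
    # Step 1: start with an arbitrary ordered set of M points.
    a = np.sort(rng.choice(samples, size=M, replace=False))
    for _ in range(iters):
        b = (a[:-1] + a[1:]) / 2           # step 2: midpoints between points
        idx = np.searchsorted(b, samples)  # which interval each sample is in
        # Step 3: move each point to the conditional mean of its interval
        # (keep the old point if the interval happens to be empty).
        a = np.array([samples[idx == j].mean() if np.any(idx == j) else a[j]
                      for j in range(M)])
        a.sort()
    b = (a[:-1] + a[1:]) / 2               # final midpoints for final points
    return a, b

samples = np.random.default_rng(0).standard_normal(100_000)
a, b = lloyd_max(samples, M=6)
v = a[np.searchsorted(b, samples)]         # quantize with the final regions
print(a)                                   # the representation points
print(np.mean((samples - v) ** 2))         # the resulting mean square error
```

Each pass can only decrease the sample mean square error, which is exactly the nonincreasing behavior discussed next.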
Now, this mean square error, for any choice of regions and any choice of representation points, is going to be nonnegative, because it's an expected value of squared terms. The algorithm is nonincreasing with iterations. In other words, the algorithm is going down all the time. So you have zero down here. You have an algorithm which is marching you down toward zero. It's not going to get to zero, but it has to reach a minimum. That's a major theorem in analysis, but you don't need any major theorems in analysis to see this. You have a set of numbers which are decreasing all the time, and they're bounded underneath. After a while, you have to get to some point, and you don't go any further. So it has a limit. So that's nice. So you have an algorithm which has to converge. It can't keep on going. Well, it can keep on going forever, but it keeps going on forever with smaller and smaller improvements. So eventually, you might as well stop, because you're not getting anywhere.

OK, well, for those conditions that we stated, the way a mathematician would say this is that these Lloyd-Max conditions are necessary but not sufficient, OK? In other words, any solution to this problem that we're looking at has to have the property that the representation points are the conditional means of the intervals and the interval boundaries are the midpoints between the representation points. But that isn't necessarily enough.

Here's a simple example of where it's not enough. Suppose you have a probability density which runs along at almost zero, jumps up to a big value, drops back to almost zero, jumps up to a big value, and jumps up to a big value a third time. This hump intentionally is wider than this one and wider than this one. In other words, there's a lot more probability over here than there is here or here. And you're unlucky, and you start out somehow with points like a1 and a2, and you start out with regions like R1 -- well, actually, you just want to start out with the points a1 and a2.
So you start out with a point a1, which is way over here, and a point a2, which is not too far over here. And your algorithm then says: pick the midpoint b1. Here the algorithm is particularly simple because, since we're only using two regions, all you need is one separator point. So we wind up with b1 here, which is halfway between a1 and a2. a1 happens to be right in the middle of that big interval there, and there's hardly anything else here. So a1 just stays there as the conditional mean, given that you're on this side. And a2 stays as the conditional mean, given that you're over here, which means that a2 is a little closer to this big region than it is to this region, but a2 is perfectly happy there. And we go back and we iterate again. And we haven't changed b1, we haven't changed a1, we haven't changed a2, and therefore, the algorithm sticks there.

Well, that's not surprising. I mean, you know that there are many problems where you try to minimize things by differentiating, or all the different tricks you have for minimizing things, and you very often find local minima. People call algorithms like this hill-climbing algorithms. They should call them valley-finding algorithms, because we're sitting at some place, we try to find a better place, so we wind up moving down into the valley further and further. Of course, if it's a real geographical area, you finally wind up at the river. You then move down the river, you wind up in the ocean, and all that. But let's forget about that. Let's just assume that there aren't rivers or anything, that we just have some arbitrary geographical area. We wind up in the bottom of a valley, and we say, OK, are we at the minimum or not? Well, with hill climbing, you can take a pair of binoculars, and you look around to see if there's any higher peak someplace else. With valley seeking, you can't do that.
741 00:43:27,350 --> 00:43:29,780 We're sitting there at the bottom of the valley, and we 742 00:43:29,780 --> 00:43:32,640 have no idea whether there's a better valley somewhere else 743 00:43:32,640 --> 00:43:35,970 or not, OK? 744 00:43:35,970 --> 00:43:39,110 So that's the trouble with hill-climbing algorithms or 745 00:43:39,110 --> 00:43:41,700 valley-seeking algorithms. 746 00:43:41,700 --> 00:43:45,030 And that's exactly what this algorithm does. 747 00:43:45,030 --> 00:43:48,660 This is called the Lloyd-Max algorithm because a guy by the 748 00:43:48,660 --> 00:43:53,820 name of Lloyd at Bell Labs discovered it, I think in '57. 749 00:43:53,820 --> 00:43:57,350 A guy by the name of Joel Max discovered it again. 750 00:43:57,350 --> 00:44:01,770 He was at MIT in 1960, and because all the information 751 00:44:01,770 --> 00:44:04,660 theorists around were at MIT at that time, they called it 752 00:44:04,660 --> 00:44:06,990 the Max algorithm for many years. 753 00:44:06,990 --> 00:44:09,250 And then somebody discovered that Lloyd had done it three 754 00:44:09,250 --> 00:44:10,030 years earlier. 755 00:44:10,030 --> 00:44:12,300 Lloyd never even published it. 756 00:44:12,300 --> 00:44:15,170 So it became the Lloyd-Max algorithm. 757 00:44:15,170 --> 00:44:17,520 And now there's somebody else who did it even earlier, I 758 00:44:17,520 --> 00:44:21,780 think, so we should probably take Max's name off it. 759 00:44:21,780 --> 00:44:27,210 But anyway, sometime when I revise the notes, I will give 760 00:44:27,210 --> 00:44:29,410 the whole story of that. 761 00:44:29,410 --> 00:44:31,780 But I hope you see that this algorithm 762 00:44:31,780 --> 00:44:34,400 is no big deal anyway. 763 00:44:34,400 --> 00:44:38,220 It was just people fortunately looking at the question at the 764 00:44:38,220 --> 00:44:41,170 right time before too many other people had looked at it. 765 00:44:41,170 --> 00:44:43,680 And Max unfortunately looked at it. 766 00:44:43,680 --> 00:44:44,580 He was in a valley. 767 00:44:44,580 --> 00:44:47,900 He didn't see that all the other people had looked at it. 768 00:44:47,900 --> 00:44:49,130 But he became famous. 769 00:44:49,130 --> 00:44:53,840 He had his moment in time and then sank gradually into 770 00:44:53,840 --> 00:44:59,040 oblivion except when once in a while we call it the 771 00:44:59,040 --> 00:45:00,570 Lloyd-Max algorithm. 772 00:45:00,570 --> 00:45:03,020 Most people call it the Lloyd algorithm, though, and I 773 00:45:03,020 --> 00:45:05,710 really should also. 774 00:45:05,710 --> 00:45:10,290 OK, vector quantization: We talked about vector 775 00:45:10,290 --> 00:45:11,610 quantization a little bit. 776 00:45:11,610 --> 00:45:16,530 It's the idea of segmenting the source outputs before you 777 00:45:16,530 --> 00:45:19,330 try to do the encoding. 778 00:45:19,330 --> 00:45:21,800 So we ask: is scalar quantization going to be the 779 00:45:21,800 --> 00:45:23,710 right approach? 780 00:45:23,710 --> 00:45:27,080 To answer that question, we want to look at quantizing two 781 00:45:27,080 --> 00:45:31,070 sample values jointly and drawing pictures. 782 00:45:31,070 --> 00:45:35,100 Incidentally, what's the simplest way of quantizing 783 00:45:35,100 --> 00:45:37,490 that you can think of? 784 00:45:37,490 --> 00:45:40,440 And what do people who do simple things call quantizers? 785 00:45:44,780 --> 00:45:48,360 Ever hear of an analog-to-digital converter? 786 00:45:48,360 --> 00:45:51,380 That's what a quantizer is.
787 00:45:51,380 --> 00:45:53,330 And an analog-to-digital converter, the way that 788 00:45:53,330 --> 00:45:56,890 everybody does it, is scalar. 789 00:45:56,890 --> 00:46:00,370 So that says that either the people who implement things 790 00:46:00,370 --> 00:46:03,400 are very stupid, or there's something pretty good about 791 00:46:03,400 --> 00:46:05,630 scalar quantization. 792 00:46:05,630 --> 00:46:09,190 But anyway, since this is a course trying to find better 793 00:46:09,190 --> 00:46:13,220 ways of doing things, we ought to investigate whether it is 794 00:46:13,220 --> 00:46:16,000 better to use vector quantizers and 795 00:46:16,000 --> 00:46:18,260 what they result in. 796 00:46:18,260 --> 00:46:20,880 OK, well, the first thing that we can do is look at 797 00:46:20,880 --> 00:46:23,360 quantizing two samples. 798 00:46:23,360 --> 00:46:26,480 In other words, when you want to generalize a problem to 799 00:46:26,480 --> 00:46:31,050 vectors, I find it better to generalize it to two dimensions 800 00:46:31,050 --> 00:46:33,980 first and see what goes on there. 801 00:46:33,980 --> 00:46:37,720 And one possible approach is to use a rectangular grid of 802 00:46:37,720 --> 00:46:40,280 quantization regions. 803 00:46:40,280 --> 00:46:43,740 And as I'll show you in the next slide, that really is 804 00:46:43,740 --> 00:46:48,420 just a camouflaged scalar quantizer again. 805 00:46:48,420 --> 00:46:52,410 So you have a two-dimensional region corresponding to two 806 00:46:52,410 --> 00:46:53,880 real samples. 807 00:46:53,880 --> 00:46:57,460 So you've got two real numbers, U1 and U2. 808 00:46:57,460 --> 00:47:03,080 You're trying to map them into a finite set 809 00:47:03,080 --> 00:47:05,790 of representation points. 810 00:47:05,790 --> 00:47:09,760 Since you're going to be interested in the distortion 811 00:47:09,760 --> 00:47:16,370 between the vector U1, U2 and your 812 00:47:16,370 --> 00:47:23,680 representation vector, a1, a2, these 813 00:47:23,680 --> 00:47:26,350 representation points are going to be two-dimensional 814 00:47:26,350 --> 00:47:28,280 points also. 815 00:47:28,280 --> 00:47:31,970 So if you start out by saying let's put these points on a 816 00:47:31,970 --> 00:47:36,420 rectangular grid, well, we can then look at it, and we say, 817 00:47:36,420 --> 00:47:39,560 well, given the points, how do we choose the regions? 818 00:47:42,360 --> 00:47:44,710 You see, it's exactly the same problem 819 00:47:44,710 --> 00:47:47,440 that you solved before. 820 00:47:47,440 --> 00:47:52,440 If I give you the points and then I ask you, well, if we 821 00:47:52,440 --> 00:47:58,820 get a vector U1, U2 that's there, what do we map it to? 822 00:47:58,820 --> 00:48:02,275 Well, we map it to the closest thing, which means if we want 823 00:48:02,275 --> 00:48:05,570 to find these regions, we set up these perpendicular 824 00:48:05,570 --> 00:48:09,800 bisectors halfway between the representation points. 825 00:48:09,800 --> 00:48:13,250 So all of this is looking very rectangular now because we 826 00:48:13,250 --> 00:48:15,660 started out with these points rectangular. 827 00:48:15,660 --> 00:48:19,840 These lines are rectangular, and now I say, well, is this 828 00:48:19,840 --> 00:48:23,240 really any different from a scalar quantizer?
829 00:48:23,240 --> 00:48:26,740 And, of course, it isn't because for this particular 830 00:48:26,740 --> 00:48:31,340 vector quantizer, I can first ask the question, OK, here's 831 00:48:31,340 --> 00:48:37,520 U1, which is something in this direction. 832 00:48:37,520 --> 00:48:39,270 How do I find regions for that? 833 00:48:39,270 --> 00:48:44,090 Well, for U1, I just establish these regions here. 834 00:48:44,090 --> 00:48:46,920 And then I say, OK, let's look at U2 next. 835 00:48:46,920 --> 00:48:51,350 And then I look at things in this direction, and I wind up 836 00:48:51,350 --> 00:48:54,770 saying, OK, that's all there is to the problem. 837 00:48:54,770 --> 00:48:57,310 I have a scalar quantizer again. 838 00:48:57,310 --> 00:49:00,520 Everything that I said before works. 839 00:49:00,520 --> 00:49:04,040 Now if you're a theoretician, you go one step further, and 840 00:49:04,040 --> 00:49:10,160 you say, what this tells me is that vector quantizers cannot 841 00:49:10,160 --> 00:49:13,280 be any worse than scalar quantizers. 842 00:49:13,280 --> 00:49:19,360 Because, in fact, a vector quantizer -- or at least 843 00:49:19,360 --> 00:49:22,590 a vector quantizer in two dimensions -- 844 00:49:22,590 --> 00:49:23,430 has two scalar 845 00:49:23,430 --> 00:49:28,030 quantizers as a special case. 846 00:49:28,030 --> 00:49:32,210 And therefore, whatever squared distortion I can 847 00:49:32,210 --> 00:49:37,600 achieve with a scalar quantizer, 848 00:49:40,970 --> 00:49:44,940 I can do 849 00:49:44,940 --> 00:49:49,440 just as well with a vector quantizer by choosing it, in 850 00:49:49,440 --> 00:49:51,770 fact, to be rectangular like this. 851 00:49:51,770 --> 00:49:56,320 And you can also get some intuitive ideas that if 852 00:49:56,320 --> 00:50:00,770 instead of having IID random variables, U1 and U2 are 853 00:50:00,770 --> 00:50:03,430 very heavily correlated somehow so that they're very 854 00:50:03,430 --> 00:50:06,860 close together, I mean, you sort of get an engineering 855 00:50:06,860 --> 00:50:08,700 view of what you want to do. 856 00:50:08,700 --> 00:50:11,610 You want to take this rectangular picture here. 857 00:50:11,610 --> 00:50:14,300 You want to skew it around that way. 858 00:50:14,300 --> 00:50:16,790 And you want to have lots of points going this way and not 859 00:50:16,790 --> 00:50:20,010 too many points going this way because almost everything is 860 00:50:20,010 --> 00:50:22,220 going this way, and there's not much 861 00:50:22,220 --> 00:50:24,050 going on in that direction. 862 00:50:24,050 --> 00:50:30,470 So you get some picture of what you want to do. 863 00:50:30,470 --> 00:50:35,700 These regions here, a little bit of terminology, are called 864 00:50:35,700 --> 00:50:38,330 Voronoi regions. 865 00:50:38,330 --> 00:50:42,680 Anytime you start out with a set of points and you put 866 00:50:42,680 --> 00:50:46,520 perpendicular bisectors in between those points, halfway 867 00:50:46,520 --> 00:50:49,550 between the points, you call the regions that you wind up 868 00:50:49,550 --> 00:50:52,640 with Voronoi regions. 869 00:50:52,640 --> 00:50:58,000 So, in fact, part of this Lloyd-Max algorithm, 870 00:50:58,000 --> 00:51:03,800 generalized to two dimensions, says given any set of points, 871 00:51:03,800 --> 00:51:07,360 the regions ought to be the Voronoi regions for them.
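The same two steps carry over to two dimensions, and the sketch below shows them in Python; nearest-point assignment produces the Voronoi regions, and sample averages stand in for the conditional means. The names voronoi_assign and lloyd_step are my own, and using empirical samples rather than a density is an assumption for illustration.

    import numpy as np

    def voronoi_assign(points, samples):
        # Squared distance from every sample to every point; each sample goes to
        # its nearest point, i.e., into that point's Voronoi region.
        d2 = ((samples[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    def lloyd_step(points, samples):
        # One generalized Lloyd iteration: Voronoi regions first, then the
        # conditional mean (here the average of the samples) in each region.
        idx = voronoi_assign(points, samples)
        return np.array([samples[idx == j].mean(axis=0) if np.any(idx == j)
                         else points[j] for j in range(len(points))])

    # Heavily correlated U1, U2, as in the skewed-grid intuition above:
    rng = np.random.default_rng(0)
    u1 = rng.normal(size=20000)
    samples = np.column_stack([u1, u1 + 0.1 * rng.normal(size=20000)])
    points = samples[rng.choice(len(samples), size=8, replace=False)]
    for _ in range(30):
        points = lloyd_step(points, samples)
    # The points end up strung out along the diagonal, where the probability is.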
872 00:51:07,360 --> 00:51:10,190 So that's that first subproblem generalized to two 873 00:51:10,190 --> 00:51:11,440 dimensions. 874 00:51:15,920 --> 00:51:19,730 And if you have an arbitrary set of points, the Voronoi 875 00:51:19,730 --> 00:51:24,580 regions look sort of like this, OK? 876 00:51:24,580 --> 00:51:27,410 And I've only drawn it for the center point because, well, 877 00:51:27,410 --> 00:51:30,860 there aren't enough points to do anything more than that. 878 00:51:30,860 --> 00:51:34,520 So anything in this region gets mapped into this point. 879 00:51:34,520 --> 00:51:38,030 Anything in this semi-infinite region gets mapped into this 880 00:51:38,030 --> 00:51:40,590 point and so forth. 881 00:51:40,590 --> 00:51:43,510 So even in two dimensions, this part of the 882 00:51:43,510 --> 00:51:46,580 algorithm is simple. 883 00:51:46,580 --> 00:51:49,810 When you start out with the regions, with almost the same 884 00:51:49,810 --> 00:51:55,340 argument that we used before, you can see that the mean 885 00:51:55,340 --> 00:51:59,370 square error is going to be minimized by using conditional 886 00:51:59,370 --> 00:52:03,510 means for the representation points. 887 00:52:03,510 --> 00:52:06,380 I mean, that's done in detail in the notes. 888 00:52:06,380 --> 00:52:09,660 It's just algebra to do that. 889 00:52:09,660 --> 00:52:11,750 It's sort of intuitive that the same thing ought to 890 00:52:11,750 --> 00:52:14,940 happen, and in fact, it does. 891 00:52:14,940 --> 00:52:18,490 So you can still find a local minimum by 892 00:52:18,490 --> 00:52:22,440 this Lloyd-Max algorithm. 893 00:52:22,440 --> 00:52:24,950 If you're unhappy with the fact that the Lloyd-Max 894 00:52:24,950 --> 00:52:30,020 algorithm doesn't always work in one dimension, take comfort 895 00:52:30,020 --> 00:52:32,880 in the fact that it's far worse in two dimensions, and 896 00:52:32,880 --> 00:52:35,330 it gets worse and worse as you go to higher numbers of 897 00:52:35,330 --> 00:52:36,780 dimensions. 898 00:52:36,780 --> 00:52:41,910 So it is a local minimum, but not 899 00:52:41,910 --> 00:52:45,360 necessarily the best thing. 900 00:52:45,360 --> 00:52:50,490 OK, well, about that time -- let's go forward maybe 10 901 00:52:50,490 --> 00:52:54,250 years from 1957 and '60, when people were inventing the 902 00:52:54,250 --> 00:52:57,560 Lloyd-Max algorithm and thought that quantization 903 00:52:57,560 --> 00:53:00,640 was a really neat academic problem, and many people were 904 00:53:00,640 --> 00:53:04,950 writing theses on it and having lots of fun with it. 905 00:53:04,950 --> 00:53:08,420 And then eventually, they started to realize that when 906 00:53:08,420 --> 00:53:11,390 you try to solve that problem and find the minimum, it 907 00:53:11,390 --> 00:53:13,990 really is just a very ugly problem. 908 00:53:13,990 --> 00:53:17,180 At least it looks like a very ugly problem, given everything 909 00:53:17,180 --> 00:53:19,750 that anybody knows after having worked on it 910 00:53:19,750 --> 00:53:21,160 for a very long time. 911 00:53:21,160 --> 00:53:24,380 So not many people work on this anymore. 912 00:53:24,380 --> 00:53:27,090 So we stop and say, well, we really don't want to go too 913 00:53:27,090 --> 00:53:29,640 far on this because it's ugly. 914 00:53:29,640 --> 00:53:31,640 But then we stop and think.
915 00:53:31,640 --> 00:53:34,510 I mean, anytime you get stuck on a problem, you ought to 916 00:53:34,510 --> 00:53:36,640 stop and ask, well, am I really looking 917 00:53:36,640 --> 00:53:37,890 at the right problem? 918 00:53:40,500 --> 00:53:45,770 Now why am I or why am I not looking at the right problem? 919 00:53:45,770 --> 00:53:47,640 And remember where we started off. 920 00:53:50,730 --> 00:53:54,570 We started off with this kind of layered solution, which 921 00:53:54,570 --> 00:53:58,360 said we were going to quantize these things into a finite 922 00:53:58,360 --> 00:54:00,200 alphabet and then we were going to 923 00:54:00,200 --> 00:54:04,870 discrete code them, OK? 924 00:54:04,870 --> 00:54:06,890 And here's what we've been doing for a while, 925 00:54:06,890 --> 00:54:09,250 and none of you objected. 926 00:54:09,250 --> 00:54:12,800 Of course, it's hard to object at 9:30 in the morning. 927 00:54:12,800 --> 00:54:17,020 You just listen and you -- but you should have objected. 928 00:54:17,020 --> 00:54:20,230 You should have said why in hell am I choosing the number 929 00:54:20,230 --> 00:54:24,360 of quantization levels to minimize over? 930 00:54:24,360 --> 00:54:26,830 What should I be minimizing over? 931 00:54:26,830 --> 00:54:31,510 I should be trying to find the minimum mean square error 932 00:54:31,510 --> 00:54:37,050 subject to a constraint on the entropy of this output alphabet. 933 00:54:37,050 --> 00:54:39,900 Because the entropy of the output alphabet is what 934 00:54:39,900 --> 00:54:45,020 determines what I can accomplish by discrete coding. 935 00:54:45,020 --> 00:54:49,080 That's a slightly phony problem because I'm insisting, 936 00:54:49,080 --> 00:54:56,360 now at least to start with, that the quantizer is a scalar 937 00:54:56,360 --> 00:55:01,330 quantizer, and by using coding here, I'm allowing memory in 938 00:55:01,330 --> 00:55:03,170 the coding process. 939 00:55:03,170 --> 00:55:06,860 But it's not that phony because, in fact, this 940 00:55:06,860 --> 00:55:10,680 quantization job is a real mess, and this job 941 00:55:10,680 --> 00:55:13,770 is very, very easy. 942 00:55:13,770 --> 00:55:15,930 And, in fact, when you think about it a little bit, you 943 00:55:15,930 --> 00:55:23,210 say, OK, how do people really do this if they're trying to 944 00:55:23,210 --> 00:55:25,180 implement things? 945 00:55:25,180 --> 00:55:28,230 And how would you go about implementing something like 946 00:55:28,230 --> 00:55:32,660 this entire problem that we're talking about? 947 00:55:32,660 --> 00:55:34,790 What would you do if you had to implement it? 948 00:55:38,250 --> 00:55:41,940 Well, you're probably afraid to say it, but you would use 949 00:55:41,940 --> 00:55:45,040 digital signal processing, wouldn't you? 950 00:55:45,040 --> 00:55:47,550 I mean, the first thing you would try to do is to get rid 951 00:55:47,550 --> 00:55:52,790 of all these analog values, and you would try to turn them 952 00:55:52,790 --> 00:55:57,560 into discrete values, so that really, after you 953 00:55:57,560 --> 00:56:02,200 somehow find this sequence of numbers here, you would go 954 00:56:02,200 --> 00:56:05,030 through a quantizer and quantize these numbers very, 955 00:56:05,030 --> 00:56:06,890 very finely. 956 00:56:06,890 --> 00:56:11,810 You would think of them as real numbers, and then you 957 00:56:11,810 --> 00:56:15,470 would do some kind of discrete coding, and you would wind up 958 00:56:15,470 --> 00:56:16,310 with something.
959 00:56:16,310 --> 00:56:20,820 And then you would say, ah, I have quantized these things 960 00:56:20,820 --> 00:56:22,660 very, very finely. 961 00:56:22,660 --> 00:56:25,350 And we'll see that when you quantize it very, very finely, 962 00:56:25,350 --> 00:56:30,310 what you're going to wind up with, which is almost optimal, 963 00:56:30,310 --> 00:56:34,880 is a uniform scalar quantizer, which is just what a 964 00:56:34,880 --> 00:56:39,060 garden-variety analog-to-digital converter 965 00:56:39,060 --> 00:56:42,180 does for you, OK? 966 00:56:42,180 --> 00:56:44,490 But then you say, aha! 967 00:56:44,490 --> 00:56:46,920 At that moment, I don't have enough 968 00:56:46,920 --> 00:56:52,220 bits to represent what this very fine quantizer has done. 969 00:56:52,220 --> 00:56:55,820 So then I think of these quantized values as real numbers. 970 00:56:55,820 --> 00:56:59,500 I go through the same process again, and then I think of 971 00:56:59,500 --> 00:57:01,740 quantizing the real numbers down to the number of 972 00:57:01,740 --> 00:57:04,710 bits I really want. 973 00:57:04,710 --> 00:57:08,190 Anybody catch what I just said? 974 00:57:08,190 --> 00:57:11,020 I'm saying you do this in two steps. 975 00:57:11,020 --> 00:57:15,630 The first step is a very fine quantization, strictly for 976 00:57:15,630 --> 00:57:20,440 implementation purposes and for no other reason. 977 00:57:20,440 --> 00:57:24,460 And at that point, you have bits to process. 978 00:57:24,460 --> 00:57:27,880 You do digital-signal processing, but you think of 979 00:57:27,880 --> 00:57:31,300 those bits as representing numbers, OK? 980 00:57:31,300 --> 00:57:34,430 In other words, as far as your thoughts are concerned, you're 981 00:57:34,430 --> 00:57:36,970 not dealing with the quantization errors that 982 00:57:36,970 --> 00:57:38,360 occurred here at all. 983 00:57:38,360 --> 00:57:40,730 You're just thinking of real numbers. 984 00:57:40,730 --> 00:57:43,960 And at that point, you try to design conceptually what it is 985 00:57:43,960 --> 00:57:47,020 you want to do in this process of quantizing 986 00:57:47,020 --> 00:57:49,420 and discrete coding. 987 00:57:49,420 --> 00:57:52,800 And you then go back to looking at these things as 988 00:57:52,800 --> 00:57:55,750 numbers again, and you quantize them again, and you 989 00:57:55,750 --> 00:57:59,560 discrete encode them again in whatever way makes sense. 990 00:57:59,560 --> 00:58:02,730 So you do one thing strictly for implementation purposes. 991 00:58:02,730 --> 00:58:05,600 You do the other thing for conceptual purposes. 992 00:58:05,600 --> 00:58:08,050 Then you put them together into something that works. 993 00:58:08,050 --> 00:58:12,390 And that's the way engineers operate all the time, I think. 994 00:58:12,390 --> 00:58:15,880 At least they did when I was more active doing real 995 00:58:15,880 --> 00:58:17,910 engineering. 996 00:58:17,910 --> 00:58:23,530 OK, so that's that. 997 00:58:23,530 --> 00:58:26,870 But anyway, it says finding a minimum mean square error 998 00:58:26,870 --> 00:58:30,620 quantizer for fixed M isn't the right problem that we're 999 00:58:30,620 --> 00:58:32,910 interested in. 1000 00:58:32,910 --> 00:58:34,660 If you're going to have quantization followed by 1001 00:58:34,660 --> 00:58:38,150 discrete coding, the quantizer should minimize mean square 1002 00:58:38,150 --> 00:58:43,460 error for fixed representation-point entropy.
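As a rough numerical illustration of the reformulated problem -- and this is my own illustration, not anything from the notes -- the Python sketch below quantizes Gaussian samples uniformly and reports both the mean square error and the entropy of the output. The entropy, not the count of levels actually used, is what discrete coding can get the rate down to.

    import numpy as np

    def mse_and_entropy(samples, delta):
        # Uniform scalar quantizer with cell width delta: map each sample to the
        # center of its cell, then measure distortion and output entropy.
        v = delta * np.round(samples / delta)
        mse = np.mean((samples - v) ** 2)
        _, counts = np.unique(v, return_counts=True)
        p = counts / counts.sum()
        H = -(p * np.log2(p)).sum()     # bits per symbol after good discrete coding
        return mse, H, len(counts)

    rng = np.random.default_rng(1)
    samples = rng.normal(size=1_000_000)
    mse, H, M = mse_and_entropy(samples, delta=0.25)
    # M counts every cell that was ever touched, including rare far-out ones;
    # those cost almost nothing in entropy but save a lot of mean square error.
    print(M, H, mse)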
1003 00:58:43,460 --> 00:58:45,670 In other words, I'd want to find these 1004 00:58:45,670 --> 00:58:47,460 representation points. 1005 00:58:47,460 --> 00:58:49,720 It's important what the numerical values of the 1006 00:58:49,720 --> 00:58:54,380 representation points are for mean square error, but I'm not 1007 00:58:54,380 --> 00:58:56,480 interested in how many of them I have. 1008 00:58:56,480 --> 00:59:00,120 What I'm interested in is the entropy of them. 1009 00:59:00,120 --> 00:59:04,900 OK, so the quantizer should minimize mean square error for 1010 00:59:04,900 --> 00:59:07,810 a fixed representation-point entropy. 1011 00:59:07,810 --> 00:59:12,570 I would like some algorithm, which goes through and changes 1012 00:59:12,570 --> 00:59:16,110 my representation points, including perhaps changing the 1013 00:59:16,110 --> 00:59:18,960 number of representation points, as a 1014 00:59:18,960 --> 00:59:21,530 way of reducing entropy. 1015 00:59:21,530 --> 00:59:25,240 OK, sometimes you can get more representation points by 1016 00:59:25,240 --> 00:59:28,830 having them out where there's virtually no probability, and 1017 00:59:28,830 --> 00:59:32,460 therefore they don't happen very often, but when they do 1018 00:59:32,460 --> 00:59:34,640 happen, they sure save you an awful lot 1019 00:59:34,640 --> 00:59:36,390 of mean square error. 1020 00:59:36,390 --> 00:59:40,320 So you can wind up with things using many, many more 1021 00:59:40,320 --> 00:59:43,800 quantization points than you would think you would want 1022 00:59:43,800 --> 00:59:48,270 because that's the optimal thing to do. 1023 00:59:48,270 --> 00:59:52,470 OK, when you're given the regions, if we try to say what 1024 00:59:52,470 --> 00:59:56,140 happens to the Lloyd-Max algorithm then, the 1025 00:59:56,140 --> 00:59:58,720 representation points should still be 1026 00:59:58,720 --> 01:00:00,420 the conditional means. 1027 01:00:00,420 --> 01:00:01,670 Why? 1028 01:00:06,040 --> 01:00:11,730 Anybody figure out why we want to solve the problem that way? 1029 01:00:11,730 --> 01:00:18,800 If I've already figured out what the regions should be -- 1030 01:00:18,800 --> 01:00:23,760 well, before, when I told you what the regions were, you 1031 01:00:23,760 --> 01:00:27,250 told me that you wanted to make the representation points 1032 01:00:27,250 --> 01:00:29,670 the conditional means. 1033 01:00:29,670 --> 01:00:31,570 Now if I make the representation points 1034 01:00:31,570 --> 01:00:36,350 something other than the conditional means, what's 1035 01:00:36,350 --> 01:00:37,600 going to happen to the entropy? 1036 01:00:41,840 --> 01:00:43,730 The entropy stays the same. 1037 01:00:43,730 --> 01:00:45,480 The entropy has nothing to do with what 1038 01:00:45,480 --> 01:00:48,260 you call these symbols. 1039 01:00:48,260 --> 01:00:51,760 I can move where they are, but they still have the same 1040 01:00:51,760 --> 01:00:56,800 probability because they still occur whenever you wind up in 1041 01:00:56,800 --> 01:00:59,670 that region. 1042 01:00:59,670 --> 01:01:02,630 And therefore, the entropy doesn't change, and therefore, 1043 01:01:02,630 --> 01:01:06,520 the same rule holds. 1044 01:01:06,520 --> 01:01:10,060 But now the peculiar thing is you don't want to make the 1045 01:01:10,060 --> 01:01:14,150 representation regions Voronoi regions anymore. 1046 01:01:14,150 --> 01:01:17,290 In other words, sometimes you want to make a region much 1047 01:01:17,290 --> 01:01:20,630 closer to one point than to the other point.
1048 01:01:20,630 --> 01:01:22,880 And why do you want to do that? 1049 01:01:22,880 --> 01:01:25,210 Because you'd like to make these probabilities of the 1050 01:01:25,210 --> 01:01:28,850 points as unequal as you can because you're trying to 1051 01:01:28,850 --> 01:01:30,570 reduce entropy. 1052 01:01:30,570 --> 01:01:33,540 And you can reduce entropy by making things have different 1053 01:01:33,540 --> 01:01:36,230 probabilities. 1054 01:01:36,230 --> 01:01:40,740 OK, so that's where we wind up with that. 1055 01:01:40,740 --> 01:01:43,140 I would like to say there's a nice algorithm 1056 01:01:43,140 --> 01:01:45,070 to solve this problem. 1057 01:01:45,070 --> 01:01:46,670 There isn't. 1058 01:01:46,670 --> 01:01:49,060 It's an incredibly ugly problem. 1059 01:01:49,060 --> 01:01:52,230 You might think it makes sense to use a Lagrange multiplier 1060 01:01:52,230 --> 01:01:56,100 approach and try to minimize some linear combination of 1061 01:01:56,100 --> 01:02:00,660 entropy and mean square error. 1062 01:02:00,660 --> 01:02:03,710 I've never been able to make it work. 1063 01:02:03,710 --> 01:02:08,860 So anyway, let's go on. 1064 01:02:13,730 --> 01:02:17,220 When we want to look at a high-rate quantizer, which is 1065 01:02:17,220 --> 01:02:22,350 what we very often want, you can do a very simple 1066 01:02:22,350 --> 01:02:24,450 approximation. 1067 01:02:24,450 --> 01:02:28,740 And the simple approximation makes the problem much easier, 1068 01:02:28,740 --> 01:02:31,020 and it gives you an added benefit. 1069 01:02:31,020 --> 01:02:34,130 There's something called differential entropy. 1070 01:02:34,130 --> 01:02:38,530 And differential entropy is the same as ordinary entropy, 1071 01:02:38,530 --> 01:02:42,300 except instead of dealing with probabilities, it deals with 1072 01:02:42,300 --> 01:02:45,440 probability densities. 1073 01:02:45,440 --> 01:02:49,030 And you look at this, and you say, well, that's virtually 1074 01:02:49,030 --> 01:02:53,570 the same thing, and it looks like the same thing. 1075 01:02:53,570 --> 01:02:56,020 And most physicists look at something 1076 01:02:56,020 --> 01:03:00,135 like this -- physicists who are into statistical mechanics -- 1077 01:03:00,135 --> 01:03:01,990 and say, oh, of course that's the same thing. 1078 01:03:01,990 --> 01:03:04,400 There's no fundamental difference between a 1079 01:03:04,400 --> 01:03:08,650 differential entropy and a real entropy. 1080 01:03:08,650 --> 01:03:10,790 And you say, well, but there are all these 1081 01:03:10,790 --> 01:03:11,600 scaling issues there. 1082 01:03:11,600 --> 01:03:15,060 And they say, blah, that's not important. 1083 01:03:15,060 --> 01:03:17,035 And they're right, because once you understand it, it's 1084 01:03:17,035 --> 01:03:19,160 not important. 1085 01:03:19,160 --> 01:03:25,670 But let's look and see what the similarities and what the 1086 01:03:25,670 --> 01:03:26,920 differences are. 1087 01:03:29,960 --> 01:03:34,130 And I'll think in terms of, in fact, a quantization problem, 1088 01:03:34,130 --> 01:03:38,410 where I'm taking this continuous-valued random 1089 01:03:38,410 --> 01:03:42,250 variable with density, and I'm quantizing it into a set of 1090 01:03:42,250 --> 01:03:44,110 discrete points. 1091 01:03:44,110 --> 01:03:47,750 And I want to say what's the difference between this 1092 01:03:47,750 --> 01:03:52,630 differential entropy here and this discrete entropy, which 1093 01:03:52,630 --> 01:03:55,410 we sort of understand by now.
1094 01:03:55,410 --> 01:03:59,760 Well, things that are the same -- the first thing that's the 1095 01:03:59,760 --> 01:04:04,830 same is the differential entropy is still the expected 1096 01:04:04,830 --> 01:04:09,980 value of minus the logarithm of the probability density. 1097 01:04:13,830 --> 01:04:16,340 OK, we found that that was useful before when we were 1098 01:04:16,340 --> 01:04:17,840 trying to understand the entropy. 1099 01:04:17,840 --> 01:04:22,180 It was the expected value of minus the log pmf. 1100 01:04:22,180 --> 01:04:27,020 Now the differential entropy is the expected value of minus the log pdf 1101 01:04:27,020 --> 01:04:30,180 instead of pmf. 1102 01:04:30,180 --> 01:04:35,350 And the entropy of two random variables, U1 and U2, if 1103 01:04:35,350 --> 01:04:40,020 they're independent, is just the differential entropy of U1 1104 01:04:40,020 --> 01:04:42,940 plus the differential entropy of U2. 1105 01:04:42,940 --> 01:04:43,980 How do I see that? 1106 01:04:43,980 --> 01:04:45,160 You just write it down. 1107 01:04:45,160 --> 01:04:46,790 You write down these joint densities. 1108 01:04:46,790 --> 01:04:48,150 You write down this. 1109 01:04:48,150 --> 01:04:53,650 The joint density for IID random variables splits into a 1110 01:04:53,650 --> 01:04:55,580 product of densities. 1111 01:04:55,580 --> 01:04:59,210 A log of a product is the sum of the logs. 1112 01:04:59,210 --> 01:05:01,380 It's the same as the argument before. 1113 01:05:01,380 --> 01:05:06,940 In other words, it's not only that this is the same as the 1114 01:05:06,940 --> 01:05:09,270 answer that we got before, but the argument 1115 01:05:09,270 --> 01:05:11,560 is exactly the same. 1116 01:05:11,560 --> 01:05:15,440 And the next thing is, if you shift this 1117 01:05:15,440 --> 01:05:16,800 density here. 1118 01:05:16,800 --> 01:05:20,045 If I have a density, which is going along here and I shift 1119 01:05:20,045 --> 01:05:23,410 it over to here and have a density over there, the 1120 01:05:23,410 --> 01:05:25,510 entropy is the same again. 1121 01:05:25,510 --> 01:05:28,930 Because this entropy fundamentally 1122 01:05:28,930 --> 01:05:32,230 doesn't have to do with where you happen to be. 1123 01:05:32,230 --> 01:05:35,520 It has to do with what these probability densities are. 1124 01:05:35,520 --> 01:05:40,320 And you can stick that shift into here, and if you're very 1125 01:05:40,320 --> 01:05:45,140 good at doing calculus, you can, in fact, see that when 1126 01:05:45,140 --> 01:05:49,770 you put a shift in here, you still get the same entropy. 1127 01:05:49,770 --> 01:05:53,390 I couldn't do that at this point, but I'm sure that all 1128 01:05:53,390 --> 01:05:55,100 of you can. 1129 01:05:55,100 --> 01:05:56,830 And I can do it if I spent enough 1130 01:05:56,830 --> 01:05:58,000 time thinking it through. 1131 01:05:58,000 --> 01:06:01,600 It would just take me longer than most of you. 1132 01:06:01,600 --> 01:06:05,690 OK, so all of these things are still true. 1133 01:06:05,690 --> 01:06:07,300 There are a couple of very disturbing 1134 01:06:07,300 --> 01:06:10,590 differences, though. 1135 01:06:10,590 --> 01:06:13,450 And the first difference is that h of U 1136 01:06:13,450 --> 01:06:14,700 is not scale invariant. 1137 01:06:17,720 --> 01:06:19,800 And, in fact, if it were scale invariant, it 1138 01:06:19,800 --> 01:06:22,690 would be totally useless.
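If you would rather check these properties numerically than by calculus, here is a small Python sketch; the grid-based diff_entropy helper is my own construction, not from the notes, and it also previews the scaling behavior discussed next -- stretching by a factor a adds log2 of a to the differential entropy, with a plus sign.

    import numpy as np

    def diff_entropy(f, u):
        # h(U) = E[-log2 f(U)], approximated by summing -f log2 f over a fine grid.
        du = u[1] - u[0]
        fz = f[f > 0]
        return -(fz * np.log2(fz)).sum() * du

    u = np.linspace(-40.0, 40.0, 800001)
    gauss = lambda m, s: np.exp(-0.5 * ((u - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    print(diff_entropy(gauss(0, 1), u))   # ~0.5 * log2(2*pi*e) = 2.047 bits
    print(diff_entropy(gauss(3, 1), u))   # shifting the density changes nothing
    print(diff_entropy(gauss(0, 4), u))   # stretching by a = 4 adds log2(4) = 2 bits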
1139 01:06:22,690 --> 01:06:25,610 The fact that it's not scale invariant turns out to be very 1140 01:06:25,610 --> 01:06:29,850 important when you try to understand it, OK? 1141 01:06:29,850 --> 01:06:36,890 In other words, if you stretch this probability density by 1142 01:06:36,890 --> 01:06:40,930 some quantity a, then your probability density, which 1143 01:06:40,930 --> 01:06:44,570 looks like this, shifts like this. 1144 01:06:44,570 --> 01:06:47,990 It shifts down and it shifts out. 1145 01:06:47,990 --> 01:06:52,790 So minus the log of the pdf gets bigger. 1146 01:06:52,790 --> 01:06:56,450 The pdf itself is just sort of spread out. 1147 01:06:56,450 --> 01:06:59,280 It does what you would expect it to do, and 1148 01:06:59,280 --> 01:07:01,090 that works very nicely. 1149 01:07:01,090 --> 01:07:04,630 But the log of the pdf has this extra term which comes in 1150 01:07:04,630 --> 01:07:08,470 here, which is a factor of a. 1151 01:07:08,470 --> 01:07:12,400 And it turns out you can just factor that factor of a out, 1152 01:07:12,400 --> 01:07:17,730 and you wind up with the differential entropy of the 1153 01:07:17,730 --> 01:07:22,590 scaled random variable aU equal to the differential 1154 01:07:22,590 --> 01:07:30,760 entropy of the original U plus log of a. 1155 01:07:30,760 --> 01:07:32,010 It's plus rather than minus: 1156 01:07:34,610 --> 01:07:36,280 stretching the density spreads it out, 1157 01:07:36,280 --> 01:07:38,780 so the differential entropy has to go up. 1158 01:07:38,780 --> 01:07:42,450 But the exact sign isn't the main point here. 1159 01:07:42,450 --> 01:07:45,240 I don't want to dwell on that at this point. 1160 01:07:45,240 --> 01:07:47,260 All I'm interested in is that you 1161 01:07:47,260 --> 01:07:50,130 recognize that it's different. 1162 01:07:50,130 --> 01:07:52,860 An even more disturbing point is this differential entropy 1163 01:07:52,860 --> 01:07:54,280 can be negative. 1164 01:07:54,280 --> 01:07:55,830 It can be negative or positive. 1165 01:07:55,830 --> 01:07:59,720 It can do whatever it wants to do. 1166 01:07:59,720 --> 01:08:05,540 And so we're left with a sort of a peculiar thing. 1167 01:08:05,540 --> 01:08:09,860 But we say, well, all right, that's the way it is. 1168 01:08:09,860 --> 01:08:11,410 The physicists had to deal with that 1169 01:08:11,410 --> 01:08:14,050 for many, many years. 1170 01:08:14,050 --> 01:08:19,410 And the physicists, who think about it all the time, deal 1171 01:08:19,410 --> 01:08:22,470 with it and say it's not very important. 1172 01:08:22,470 --> 01:08:26,980 So we'll say, OK, we'll go along with this, and we ask: 1173 01:08:26,980 --> 01:08:31,310 What happens if we try to build a uniform high-rate 1174 01:08:31,310 --> 01:08:35,100 scalar quantizer, which is exactly what you would do for 1175 01:08:35,100 --> 01:08:38,670 implementation purposes anyway? 1176 01:08:38,670 --> 01:08:40,550 So how does this work? 1177 01:08:40,550 --> 01:08:43,380 You pick a little tiny delta. 1178 01:08:43,380 --> 01:08:45,920 You have all these regions here. 1179 01:08:45,920 --> 01:08:50,090 And by uniform, I mean you make all the regions the same. 1180 01:08:50,090 --> 01:08:52,750 I mean, conceptually, this might mean having a countably 1181 01:08:52,750 --> 01:08:54,150 infinite number of regions. 1182 01:08:54,150 --> 01:08:57,610 Let's not worry about that for the time being. 1183 01:08:57,610 --> 01:09:01,130 And you have points in here, which are the conditional 1184 01:09:01,130 --> 01:09:04,310 means of the regions.
1185 01:09:04,310 --> 01:09:11,020 Well, if I assume that delta is small, then the probability 1186 01:09:11,020 --> 01:09:14,960 density of u is almost constant within a region, if 1187 01:09:14,960 --> 01:09:17,970 it really is a density. 1188 01:09:17,970 --> 01:09:23,800 And I can define an average value of f of u within each 1189 01:09:23,800 --> 01:09:25,350 region in the following way. 1190 01:09:25,350 --> 01:09:28,070 It's easier to see what it is graphically if I have a 1191 01:09:28,070 --> 01:09:31,220 density here, which runs along like this. 1192 01:09:33,810 --> 01:09:39,930 And within each region, I will choose f-bar of u to be a 1193 01:09:39,930 --> 01:09:47,630 piecewise constant version of this density, which might be 1194 01:09:47,630 --> 01:09:49,620 like that, OK? 1195 01:09:49,620 --> 01:09:51,610 Now that's just analytical stuff to make 1196 01:09:51,610 --> 01:09:53,980 things come out right. 1197 01:09:53,980 --> 01:09:58,300 If you read the notes about how to do this, you find out 1198 01:09:58,300 --> 01:10:02,060 that this quantity is important analytically to 1199 01:10:02,060 --> 01:10:04,270 trace through what's going on. 1200 01:10:04,270 --> 01:10:07,510 I think for the level that we're at now, it's fine to 1201 01:10:07,510 --> 01:10:13,380 just say, OK, this is the same as this, if we make the 1202 01:10:13,380 --> 01:10:16,190 quantization very small. 1203 01:10:16,190 --> 01:10:18,350 I mean, at some point in this argument, you want to go 1204 01:10:18,350 --> 01:10:21,420 through and say, OK, what do I mean by approximate? 1205 01:10:21,420 --> 01:10:23,340 Is the approximation close? 1206 01:10:23,340 --> 01:10:27,520 How does the approximation become better as I make delta 1207 01:10:27,520 --> 01:10:29,440 smaller, and all of these questions. 1208 01:10:29,440 --> 01:10:35,050 But let's just now say, OK, this is the same as that, and 1209 01:10:35,050 --> 01:10:39,710 I'd like to find out how well this uniform high-rate scalar 1210 01:10:39,710 --> 01:10:40,960 quantizer does. 1211 01:10:44,470 --> 01:10:49,610 So my high-rate approximation is that this average density 1212 01:10:49,610 --> 01:10:53,870 is the same as the true density, for 1213 01:10:53,870 --> 01:10:56,290 all possible u. 1214 01:10:56,290 --> 01:11:02,760 So conditional on u being in Rj, this quantity is constant. 1215 01:11:02,760 --> 01:11:07,770 It's constant and equal to 1 over delta, and the actual 1216 01:11:07,770 --> 01:11:10,860 conditional density is approximately equal to 1 over delta. 1217 01:11:10,860 --> 01:11:11,590 What's that mean? 1218 01:11:11,590 --> 01:11:14,510 It means that the mean square error in one of these 1219 01:11:14,510 --> 01:11:19,500 quantization regions is our usual delta squared over 12, 1220 01:11:19,500 --> 01:11:22,850 which is the mean square error of a uniform density in a 1221 01:11:22,850 --> 01:11:26,530 region from minus delta over 2 to plus delta over 2. 1222 01:11:26,530 --> 01:11:28,550 So that's the mean square error. 1223 01:11:28,550 --> 01:11:30,010 You can't do anything with that. 1224 01:11:30,010 --> 01:11:32,680 You're stuck with it. 1225 01:11:32,680 --> 01:11:36,650 The next question is what is the entropy 1226 01:11:36,650 --> 01:11:39,880 of the quantizer output? 1227 01:11:39,880 --> 01:11:42,760 I'm sorry, I'm giving a typical MIT lecture today, 1228 01:11:42,760 --> 01:11:46,780 which people have characterized as a fire hose, 1229 01:11:46,780 --> 01:11:51,130 and it's because I want to get done with this material today.
1230 01:11:51,130 --> 01:11:53,680 I think if you read the notes afterwards, you've got a 1231 01:11:53,680 --> 01:11:57,010 pretty good picture of what's going on. 1232 01:11:57,010 --> 01:11:59,300 We are not going to have an enormous number 1233 01:11:59,300 --> 01:12:00,380 of problems on this. 1234 01:12:00,380 --> 01:12:03,295 It's not something we're stressing, so I just want to get you 1235 01:12:03,295 --> 01:12:07,280 to have some idea of what this is all about. 1236 01:12:07,280 --> 01:12:10,830 This slide is probably the most important slide because 1237 01:12:10,830 --> 01:12:13,640 it tells you what this differential 1238 01:12:13,640 --> 01:12:15,390 entropy really is. 1239 01:12:18,040 --> 01:12:22,070 OK, I'm going to look at the probabilities of each of these 1240 01:12:22,070 --> 01:12:27,680 discrete points now. p sub j is the probability that the 1241 01:12:27,680 --> 01:12:30,990 quantizer will produce the point a sub j. 1242 01:12:30,990 --> 01:12:33,260 In other words, it is the probability of the 1243 01:12:33,260 --> 01:12:35,790 region R sub j. 1244 01:12:35,790 --> 01:12:40,320 And that's just the integral of the probability density 1245 01:12:40,320 --> 01:12:41,490 over the j-th region. 1246 01:12:41,490 --> 01:12:46,450 Namely, it's the probability of being in that j-th region. 1247 01:12:46,450 --> 01:12:52,080 p sub j is also equal to this average value times delta. 1248 01:12:52,080 --> 01:12:55,090 OK, so I look at what the entropy is of 1249 01:12:55,090 --> 01:12:57,610 the high-rate quantizer. 1250 01:12:57,610 --> 01:13:02,450 It's the sum of minus pj log pj. 1251 01:13:02,450 --> 01:13:07,090 I substitute pj equal to f-bar of aj times delta, 1252 01:13:07,090 --> 01:13:08,900 so the sum over j 1253 01:13:08,900 --> 01:13:15,850 becomes an integral: minus the integral of fU of u, times 1254 01:13:15,850 --> 01:13:19,200 the log of the product f-bar of u times delta, 1255 01:13:19,200 --> 01:13:22,440 integrated over u. 1256 01:13:22,440 --> 01:13:25,660 OK, in other words, these probabilities are scaled by 1257 01:13:25,660 --> 01:13:27,560 delta, which is what I was telling you before. 1258 01:13:27,560 --> 01:13:30,330 That's the crucial thing here. 1259 01:13:30,330 --> 01:13:33,790 As you make delta smaller and smaller, the probabilities of 1260 01:13:33,790 --> 01:13:37,510 the intervals go down with delta. 1261 01:13:37,510 --> 01:13:41,810 OK, so I break this up into two terms now. 1262 01:13:41,810 --> 01:13:47,230 I take out the log of delta, which integrates against fU to give just minus log of delta. 1263 01:13:47,230 --> 01:13:49,200 I have this quantity left. 1264 01:13:49,200 --> 01:13:52,670 This f-bar is approximately equal to the density, and what 1265 01:13:52,670 --> 01:13:56,380 I wind up with is that the actual entropy of the 1266 01:13:56,380 --> 01:14:00,680 quantized version is equal to the differential entropy minus 1267 01:14:00,680 --> 01:14:04,010 the log of delta at high rate. 1268 01:14:04,010 --> 01:14:07,550 OK, that's the only way I know to interpret differential 1269 01:14:07,550 --> 01:14:11,750 entropy that makes any sense. In other words, usually 1270 01:14:11,750 --> 01:14:14,120 when you think of integrals, you think of them 1271 01:14:14,120 --> 01:14:15,190 in a Riemann sense.
1272 01:14:15,190 --> 01:14:18,160 You think of breaking up the interval into lots of very 1273 01:14:18,160 --> 01:14:22,530 thin slices with delta, and then you integrate things by 1274 01:14:22,530 --> 01:14:26,250 adding things up, and that integral is then a sum. 1275 01:14:26,250 --> 01:14:28,600 It's the same as this kind of sum. 1276 01:14:28,600 --> 01:14:30,990 And in the integral that you wind up with when you're dealing 1277 01:14:30,990 --> 01:14:36,110 with the log of the pmf, you get this extra delta 1278 01:14:36,110 --> 01:14:40,550 sticking in there, OK? 1279 01:14:40,550 --> 01:14:48,640 So this entropy is this phony thing minus log of delta. 1280 01:14:48,640 --> 01:14:52,650 As I quantize more and more finely, this 1281 01:14:52,650 --> 01:14:55,020 entropy keeps going up. 1282 01:14:55,020 --> 01:14:58,990 It goes up because when delta is very, very small, minus log 1283 01:14:58,990 --> 01:15:01,590 delta is a large number. 1284 01:15:01,590 --> 01:15:05,900 So as delta gets smaller and smaller, this entropy heads 1285 01:15:05,900 --> 01:15:08,330 off towards infinity. 1286 01:15:08,330 --> 01:15:12,030 And in fact, you would expect that because if you take a 1287 01:15:12,030 --> 01:15:16,150 real number with a distribution on -- well, 1288 01:15:16,150 --> 01:15:21,470 anything just between minus 1 and plus 1, for example -- and I 1289 01:15:21,470 --> 01:15:25,250 want to represent it very, very well when it's uniformly 1290 01:15:25,250 --> 01:15:28,670 distributed there, then the better I try to represent it, the 1291 01:15:28,670 --> 01:15:30,800 more bits it takes. 1292 01:15:30,800 --> 01:15:33,580 That's what this is saying. 1293 01:15:33,580 --> 01:15:37,680 This thing is changing as I try to represent things better 1294 01:15:37,680 --> 01:15:39,350 and better. 1295 01:15:39,350 --> 01:15:43,590 This quantity here is just some funny thing that deals 1296 01:15:43,590 --> 01:15:45,790 with the shape of the probability density and 1297 01:15:45,790 --> 01:15:48,040 nothing else. 1298 01:15:48,040 --> 01:15:53,180 It essentially has a scale factor built into it because 1299 01:15:53,180 --> 01:15:56,690 the probability density has a scale factor built into it. 1300 01:15:56,690 --> 01:16:02,270 A probability density is probability per unit length. 1301 01:16:02,270 --> 01:16:08,500 And therefore, this kind of entropy has to have that unit 1302 01:16:08,500 --> 01:16:11,490 length coming in here somehow, and that's the 1303 01:16:11,490 --> 01:16:13,380 way it comes in. 1304 01:16:13,380 --> 01:16:14,630 That's the way it is. 1305 01:16:19,800 --> 01:16:25,050 So to summarize all of that, and I'm sure you haven't quite 1306 01:16:25,050 --> 01:16:30,050 understood all of it, but if you do efficient discrete 1307 01:16:30,050 --> 01:16:35,240 coding at the end of the whole thing, the number of bits per 1308 01:16:35,240 --> 01:16:40,545 sample, namely, per number coming into the quantizer that 1309 01:16:40,545 --> 01:16:44,030 you need, is H of V, OK? 1310 01:16:44,030 --> 01:16:52,510 With this uniform quantizer, which produces an entropy of 1311 01:16:52,510 --> 01:16:56,530 the symbols H of V, then you need L-bar bits per symbol to 1312 01:16:56,530 --> 01:16:57,470 represent that. 1313 01:16:57,470 --> 01:17:00,110 That's the result that we had before. 1314 01:17:00,110 --> 01:17:08,870 This quantity H of V depends only on delta and h of U. 1315 01:17:08,870 --> 01:17:14,920 Namely, h of U is equal to H of V plus log delta.
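That relation is easy to check empirically. The Python sketch below -- my own illustration, under the assumption of a unit-variance Gaussian source, where the closed form h(U) = (1/2) log2(2 pi e) applies -- quantizes at two cell widths and compares the measured output entropy with h(U) minus log2 of delta, and the measured distortion with delta squared over 12.

    import numpy as np

    rng = np.random.default_rng(2)
    samples = rng.normal(size=2_000_000)       # unit-variance Gaussian source
    h_U = 0.5 * np.log2(2 * np.pi * np.e)      # differential entropy, ~2.047 bits

    for delta in (0.2, 0.1):                   # halving delta should add one bit
        v = delta * np.round(samples / delta)  # uniform quantizer, cell centers
        _, counts = np.unique(v, return_counts=True)
        p = counts / counts.sum()
        H = -(p * np.log2(p)).sum()            # measured entropy of the output
        mse = np.mean((samples - v) ** 2)      # measured distortion
        print(delta, H, h_U - np.log2(delta), mse, delta ** 2 / 12)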
1316 01:17:14,920 --> 01:17:17,160 So, in other words, analytically, the only thing 1317 01:17:17,160 --> 01:17:20,120 you have to worry about is what is this 1318 01:17:20,120 --> 01:17:21,880 differential entropy? 1319 01:17:21,880 --> 01:17:25,310 You can't interpret what it is, but you can calculate it. 1320 01:17:25,310 --> 01:17:28,660 And once you calculate it, this tells you what this 1321 01:17:28,660 --> 01:17:31,630 entropy is for every choice of delta so 1322 01:17:31,630 --> 01:17:34,240 long as delta is small. 1323 01:17:34,240 --> 01:17:38,530 It says that when you get to the point where delta is small 1324 01:17:38,530 --> 01:17:42,690 and the probability density is essentially constant over a 1325 01:17:42,690 --> 01:17:46,360 region, if I want to make delta half as big as it was 1326 01:17:46,360 --> 01:17:49,700 before, then I'm going to wind up with twice as many 1327 01:17:49,700 --> 01:17:51,270 quantization regions. 1328 01:17:51,270 --> 01:17:52,990 They're all going to be only half as 1329 01:17:52,990 --> 01:17:55,000 probable as the ones before. 1330 01:17:55,000 --> 01:17:56,660 It's going to take me one extra bit to 1331 01:17:56,660 --> 01:17:59,550 represent that, OK? 1332 01:17:59,550 --> 01:18:03,930 Namely, this one extra bit for each of these old regions is 1333 01:18:03,930 --> 01:18:07,110 going to tell me whether I'm on the right side or the left 1334 01:18:07,110 --> 01:18:13,220 side of that old quantization region, to specify the new 1335 01:18:13,220 --> 01:18:14,310 quantization region. 1336 01:18:14,310 --> 01:18:17,280 So this all makes sense, OK? 1337 01:18:17,280 --> 01:18:19,770 Namely, the only thing that's happening as you make the 1338 01:18:19,770 --> 01:18:25,270 quantization finer and finer is that you have these extra 1339 01:18:25,270 --> 01:18:28,580 bits coming in, which are sort of telling you what the fine 1340 01:18:28,580 --> 01:18:30,540 structure is. 1341 01:18:30,540 --> 01:18:34,930 And little h of U already has built into it the overall 1342 01:18:34,930 --> 01:18:38,280 shape of the thing. 1343 01:18:38,280 --> 01:18:42,850 OK, now I've added one thing more here, which I haven't 1344 01:18:42,850 --> 01:18:44,450 talked about at all. 1345 01:18:44,450 --> 01:18:50,440 This uniform scalar quantizer in fact becomes optimal as 1346 01:18:50,440 --> 01:18:53,440 delta becomes small. 1347 01:18:53,440 --> 01:18:56,830 And that's not obvious; it's not intuitive. 1348 01:18:56,830 --> 01:18:59,070 There's an argument in the text that 1349 01:18:59,070 --> 01:19:00,410 shows why that is true. 1350 01:19:00,410 --> 01:19:05,680 It uses a Lagrange multiplier to do it. 1351 01:19:05,680 --> 01:19:09,210 I guess there's a certain elegance to it. 1352 01:19:09,210 --> 01:19:12,310 I mean, after years of thinking about it, I would 1353 01:19:12,310 --> 01:19:14,590 say, yeah, it's pretty likely that it's true even if I didn't 1354 01:19:14,590 --> 01:19:17,670 have the mathematics to know it's true. 1355 01:19:17,670 --> 01:19:20,700 But, in fact, it's a very interesting result. 1356 01:19:20,700 --> 01:19:26,760 It says that what you do with classical A-to-D converters, 1357 01:19:26,760 --> 01:19:29,320 if you're going to have very fine quantization, is, in 1358 01:19:29,320 --> 01:19:31,780 fact, the right thing to do.