1 00:00:00,000 --> 00:00:02,360 The following content is provided under a Creative 2 00:00:02,360 --> 00:00:03,650 Commons license. 3 00:00:03,650 --> 00:00:06,540 Your support will help MIT OpenCourseWare continue to 4 00:00:06,540 --> 00:00:09,515 offer high quality educational resources for free. 5 00:00:09,515 --> 00:00:12,810 To make a donation or to view additional materials from 6 00:00:12,810 --> 00:00:16,920 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:16,920 --> 00:00:18,170 ocw.mit.edu. 8 00:00:21,380 --> 00:00:24,225 PROFESSOR: As this says, and all the handouts say 9 00:00:24,225 --> 00:00:26,650 that this is 6.450. 10 00:00:26,650 --> 00:00:31,500 It's a first year graduate course in the principles of 11 00:00:31,500 --> 00:00:33,930 digital communication. 12 00:00:33,930 --> 00:00:38,870 It's sort of the major first course that you take as a 13 00:00:38,870 --> 00:00:42,850 graduate student in the communication area. 14 00:00:42,850 --> 00:00:46,740 The information theory course uses this as a prerequisite, 15 00:00:46,740 --> 00:00:49,050 uses it in a rather strong way. 16 00:00:49,050 --> 00:00:52,950 6.432, the stochastic process course uses it as a 17 00:00:52,950 --> 00:00:54,080 prerequisite. 18 00:00:54,080 --> 00:00:57,710 6.451, which is the companion course which follows after 19 00:00:57,710 --> 00:01:00,600 this uses it as a prerequisite. 20 00:01:03,130 --> 00:01:07,110 It's sort of material that you have to know if you're going 21 00:01:07,110 --> 00:01:11,360 to go into the communication area. 22 00:01:11,360 --> 00:01:14,320 It deals with this huge industry, which is the 23 00:01:14,320 --> 00:01:15,520 communication industry. 
24 00:01:15,520 --> 00:01:19,080 You look at the faculty here in the department and you're a 25 00:01:19,080 --> 00:01:21,760 little surprised to see that there are so few faculty 26 00:01:21,760 --> 00:01:25,630 members in digital communication, so many faculty 27 00:01:25,630 --> 00:01:27,960 members in the computer area. 28 00:01:27,960 --> 00:01:30,070 You wonder why this is. 29 00:01:30,070 --> 00:01:34,810 It's sort of this historical accident, because somehow or 30 00:01:34,810 --> 00:01:37,930 other the department didn't realize as these two fields 31 00:01:37,930 --> 00:01:41,070 were growing, the one that looked more glamorous for a 32 00:01:41,070 --> 00:01:44,180 long time was the computer field. 33 00:01:44,180 --> 00:01:47,840 Nobody in retrospect understands why that is. 34 00:01:47,840 --> 00:01:51,580 But at the same time that's how it is. 35 00:01:51,580 --> 00:01:54,200 You will not see as much excitement here. 36 00:01:54,200 --> 00:01:59,450 You also will see that because of the big bust in the year 37 00:01:59,450 --> 00:02:04,750 2000, there's much less commercial activity in the 38 00:02:04,750 --> 00:02:07,840 communications field now. 39 00:02:07,840 --> 00:02:11,410 As students, you ought to relish this. 40 00:02:11,410 --> 00:02:14,060 You ought to really be happy about this if you want to go 41 00:02:14,060 --> 00:02:17,950 into the communication field, because at some point in the 42 00:02:17,950 --> 00:02:24,010 future, all of the failure to start new investments, and all 43 00:02:24,010 --> 00:02:27,410 of the large communication companies means a very 44 00:02:27,410 --> 00:02:30,480 explosive growth is about to start. 45 00:02:30,480 --> 00:02:34,170 So I don't think this is a dead field by any means. 
46 00:02:34,170 --> 00:02:39,050 Looking back historically, every time the communication 47 00:02:39,050 --> 00:02:42,390 field has seemed dead, and I can think of three or four 48 00:02:42,390 --> 00:02:47,270 such times in history, those have been the exact times 49 00:02:47,270 --> 00:02:50,470 where it was great to get into the field. 50 00:02:50,470 --> 00:02:53,660 There was a time in the early '70s when all the 51 00:02:53,660 --> 00:02:57,680 theoreticians who worked in communication theory all were 52 00:02:57,680 --> 00:03:02,070 going around with long faces saying everything that we can 53 00:03:02,070 --> 00:03:06,790 do has been done, the field is dead, nothing more to do, 54 00:03:06,790 --> 00:03:09,550 might as well give up and go into another field. 55 00:03:09,550 --> 00:03:15,080 This was really exactly the time that a number of small 56 00:03:15,080 --> 00:03:18,420 communication companies were really starting to grow very 57 00:03:18,420 --> 00:03:22,940 fast because they were using some of the modern ideas of 58 00:03:22,940 --> 00:03:26,340 communication theory and they were using these modern ideas 59 00:03:26,340 --> 00:03:27,980 to do new things. 60 00:03:27,980 --> 00:03:31,160 You now see companies like Qualcomm that were started 61 00:03:31,160 --> 00:03:34,670 then, Motorola of whom major divisions were 62 00:03:34,670 --> 00:03:37,070 started back then. 63 00:03:37,070 --> 00:03:41,420 An enormous amount of activity was starting just then, partly 64 00:03:41,420 --> 00:03:46,090 because the field seemed to be moribund and the theoreticians 65 00:03:46,090 --> 00:03:49,260 were turning and deciding well, I guess we have to do 66 00:03:49,260 --> 00:03:51,780 something practical now. 
67 00:03:51,780 --> 00:03:55,310 If you want to find a good practical engineer, if you 68 00:03:55,310 --> 00:03:58,840 wind up in industry, if you wind up as an entrepreneur or 69 00:03:58,840 --> 00:04:01,980 if you wind up climbing the ladder and being a chief 70 00:04:01,980 --> 00:04:05,380 executive, if you want to find somebody who will really do 71 00:04:05,380 --> 00:04:08,370 important things, find somebody who understands 72 00:04:08,370 --> 00:04:12,620 theory who has decided that they suddenly want to start to 73 00:04:12,620 --> 00:04:14,800 do practical things. 74 00:04:14,800 --> 00:04:17,900 Because if you point to the people who have really done 75 00:04:17,900 --> 00:04:21,570 exciting things in this field, those are the ones who have 76 00:04:21,570 --> 00:04:22,950 done most of it. 77 00:04:22,950 --> 00:04:27,340 OK, so anyway, the big field, it's an important field. 78 00:04:27,340 --> 00:04:32,350 What we want to do is to study the aspects of communication 79 00:04:32,350 --> 00:04:37,380 systems that are unique to communication systems. 80 00:04:37,380 --> 00:04:40,900 In other words, we don't want to study hardware here, we 81 00:04:40,900 --> 00:04:44,180 don't want to study software here, because hardware is 82 00:04:44,180 --> 00:04:47,400 pretty much the same for any problem you want to look at. 83 00:04:47,400 --> 00:04:50,810 It's not that highly specialized, and software is 84 00:04:50,810 --> 00:04:51,860 not either. 85 00:04:51,860 --> 00:04:54,220 So we're not going to study either of those things, we're 86 00:04:54,220 --> 00:04:58,310 going to study much more the architectural principles of 87 00:04:58,310 --> 00:04:59,820 communication. 88 00:04:59,820 --> 00:05:02,970 We're going to study a lot of the theory of communication, 89 00:05:02,970 --> 00:05:06,500 and I want to explain why it is that we're going to do that 90 00:05:06,500 --> 00:05:07,390 as we move on. 
91 00:05:07,390 --> 00:05:12,080 Because it's not at all clear at first why we want to study 92 00:05:12,080 --> 00:05:15,810 the silly things that we'll be studying in the course, and I 93 00:05:15,810 --> 00:05:19,460 want to give you some idea today of why we're pushed into 94 00:05:19,460 --> 00:05:21,370 doing that. 95 00:05:21,370 --> 00:05:23,340 As we start doing these things, I'll give 96 00:05:23,340 --> 00:05:25,390 you more of an idea. 97 00:05:25,390 --> 00:05:29,470 I know when I was a student I couldn't figure out why I was 98 00:05:29,470 --> 00:05:31,660 learning the things that I was learning. 99 00:05:31,660 --> 00:05:34,130 I just found that some things were much more interesting 100 00:05:34,130 --> 00:05:36,810 than others, and the things that tended to be interesting 101 00:05:36,810 --> 00:05:40,480 were the more theoretical things, so I started looking 102 00:05:40,480 --> 00:05:44,700 at them harder and thought gee, this is neat stuff. 103 00:05:44,700 --> 00:05:48,000 But I had no idea that it was ever going to be important. 104 00:05:48,000 --> 00:05:52,190 So I'm going to try to explain to you why this is important, 105 00:05:52,190 --> 00:05:56,800 and something today about what the proper inter-relationship 106 00:05:56,800 --> 00:06:00,160 is of engineering and theory. 107 00:06:00,160 --> 00:06:04,800 As another example of my own life, for a long time I 108 00:06:04,800 --> 00:06:10,960 thought -- this is rather embarrassing to say -- 109 00:06:10,960 --> 00:06:15,730 but I was a theoretician and not much of an engineer for a 110 00:06:15,730 --> 00:06:17,850 very long time. 111 00:06:17,850 --> 00:06:20,570 And I kept worrying about this and saying well the stuff I'm 112 00:06:20,570 --> 00:06:24,530 doing is fun and I enjoy doing it, but why should 113 00:06:24,530 --> 00:06:27,070 anybody pay me for it? 
114 00:06:27,070 --> 00:06:30,450 Of course, I was being paid by MIT, I was being paid as a 115 00:06:30,450 --> 00:06:33,380 consultant, but I felt I was robbing all these people. 116 00:06:33,380 --> 00:06:37,600 And finally, I decided on a rationale for all of this. 117 00:06:37,600 --> 00:06:42,270 Most people felt that theory was silly, and therefore, they 118 00:06:42,270 --> 00:06:44,640 would keep asking me well, if you're so smart 119 00:06:44,640 --> 00:06:46,940 why aren't you rich? 120 00:06:46,940 --> 00:06:50,530 I didn't have any very good answer to that because I felt 121 00:06:50,530 --> 00:06:53,390 I was smart because I understood all this theory, 122 00:06:53,390 --> 00:06:54,810 but I wasn't rich. 123 00:06:54,810 --> 00:06:57,630 So I thought gee, what I've gotta do is do enough 124 00:06:57,630 --> 00:07:00,750 engineering to get rich, and then when people ask me that 125 00:07:00,750 --> 00:07:02,780 question I can say I am. 126 00:07:05,980 --> 00:07:09,910 I would suggest to all of you, if you have theoretical leanings, 127 00:07:09,910 --> 00:07:14,170 to focus on some engineering also so you can become rich, 128 00:07:14,170 --> 00:07:17,420 not because there's anything to do with the money, but just 129 00:07:17,420 --> 00:07:22,440 because it lets you hold your head up high in a society that 130 00:07:22,440 --> 00:07:25,770 values money a lot more than values intellect. 131 00:07:25,770 --> 00:07:27,390 So it's important. 132 00:07:27,390 --> 00:07:29,850 OK. 133 00:07:29,850 --> 00:07:32,720 This is a part of this two-term sequence. 134 00:07:32,720 --> 00:07:37,580 One comment here is I'm not sure whether 6.451, which 135 00:07:37,580 --> 00:07:40,800 is the second part of this sequence, is going to be 136 00:07:40,800 --> 00:07:43,590 taught in the spring or not. 137 00:07:43,590 --> 00:07:48,020 It might not be taught until spring of the following year. 
138 00:07:48,020 --> 00:07:51,400 If a large number of you decide that you really want to 139 00:07:51,400 --> 00:07:56,300 do this and do it now, make yourselves heard a little bit 140 00:07:56,300 --> 00:07:59,990 because the fact that a lot of people want this course will 141 00:07:59,990 --> 00:08:03,180 have something to do with whether somebody manages to 142 00:08:03,180 --> 00:08:04,430 teach it or not. 143 00:08:09,430 --> 00:08:14,270 As I was starting to say, theory has had a very large 144 00:08:14,270 --> 00:08:18,210 impact on almost every field we can think of. 145 00:08:18,210 --> 00:08:21,240 I mean you think of electromagnetism, you think of 146 00:08:21,240 --> 00:08:23,140 all these other fields and theory is very 147 00:08:23,140 --> 00:08:24,370 important in them. 148 00:08:24,370 --> 00:08:29,650 But in communication systems it is particularly important. 149 00:08:29,650 --> 00:08:35,070 It's partly important because back in 1948, Claude Shannon 150 00:08:35,070 --> 00:08:38,160 who was the inventor of information theory -- you 151 00:08:38,160 --> 00:08:42,420 usually don't invent theories, but he really did. 152 00:08:42,420 --> 00:08:46,500 This came whole cloth out of this one mind. 153 00:08:46,500 --> 00:08:49,990 He had these ideas turning around in his head for about 154 00:08:49,990 --> 00:08:53,620 eight years while the second World War was going on. 155 00:08:53,620 --> 00:08:56,830 He was working on other things, which were, in fact, 156 00:08:56,830 --> 00:09:01,140 very important things, and he was doing those things for the 157 00:09:01,140 --> 00:09:02,510 government. 158 00:09:02,510 --> 00:09:05,900 He was working on aircraft control and things like that. 159 00:09:05,900 --> 00:09:09,040 He was working on cryptography. 160 00:09:09,040 --> 00:09:12,230 But he was just fascinated by these communication problems 161 00:09:12,230 --> 00:09:14,970 and he couldn't get his mind off them. 
162 00:09:14,970 --> 00:09:19,050 He took up cryptography because it happened to use all 163 00:09:19,050 --> 00:09:21,900 the ideas that he was really interested in. 164 00:09:21,900 --> 00:09:26,030 He wrote something about cryptography during the middle 165 00:09:26,030 --> 00:09:29,100 of the war, which made many people believe that he had 166 00:09:29,100 --> 00:09:32,040 done his cryptography work before he did his 167 00:09:32,040 --> 00:09:33,370 communication work. 168 00:09:33,370 --> 00:09:36,090 But in fact, it was the other way around. 169 00:09:36,090 --> 00:09:39,470 This was purely the fact that he could write about 170 00:09:39,470 --> 00:09:45,080 cryptography, and he never liked to write, so he didn't 171 00:09:45,080 --> 00:09:47,390 want to write about communication because it was a 172 00:09:47,390 --> 00:09:49,570 more complicated field and he didn't 173 00:09:49,570 --> 00:09:51,640 understand the whole thing. 174 00:09:51,640 --> 00:09:55,080 So for 25 years it was a theory floating around. 175 00:09:55,080 --> 00:09:57,840 You will learn a good deal of what this theory is. 176 00:09:57,840 --> 00:10:02,160 Most graduate courses on communication don't spend much 177 00:10:02,160 --> 00:10:05,800 time on information theory. 178 00:10:05,800 --> 00:10:09,910 This was not a mistake many, many years ago, but it is a 179 00:10:09,910 --> 00:10:14,790 mistake now, because communication theory at its 180 00:10:14,790 --> 00:10:17,930 core is information theory. 181 00:10:17,930 --> 00:10:19,300 Most people realize this. 182 00:10:19,300 --> 00:10:22,760 Most kids recognize it, most people in the 183 00:10:22,760 --> 00:10:24,370 street recognize it. 184 00:10:24,370 --> 00:10:28,450 When they start to use a system to communicate with the 185 00:10:28,450 --> 00:10:30,620 internet, what do they talk about? 186 00:10:30,620 --> 00:10:32,470 Do they talk about the bandwidth? 187 00:10:32,470 --> 00:10:32,630 No. 
188 00:10:32,630 --> 00:10:35,560 They might call it bandwidth but what they're talking about 189 00:10:35,560 --> 00:10:38,750 is the number of bits per second they can communicate. 190 00:10:38,750 --> 00:10:41,930 This is a distinctly information theoretic idea, 191 00:10:41,930 --> 00:10:45,470 because communication used to be involved with studying 192 00:10:45,470 --> 00:10:47,490 different kinds of wave forms. 193 00:10:47,490 --> 00:10:50,620 Suddenly, at this point, people are studying all these 194 00:10:50,620 --> 00:10:55,430 other things and are based on Claude Shannon's ideas, 195 00:10:55,430 --> 00:11:00,640 therefore, we will teach some of those ideas as we go. 196 00:11:00,640 --> 00:11:05,960 As a matter of fact, I was talking a little bit about the 197 00:11:05,960 --> 00:11:11,020 fact that there's been a sort of a love-hate relationship by 198 00:11:11,020 --> 00:11:15,400 communication engineers for information theory. 199 00:11:15,400 --> 00:11:19,550 Back in 1960, a long, long time ago when I got my PhD 200 00:11:19,550 --> 00:11:25,080 degree, the people at MIT, the people who started this field, 201 00:11:25,080 --> 00:11:29,090 many of them told me that information theory was dead at 202 00:11:29,090 --> 00:11:30,360 that point. 203 00:11:30,360 --> 00:11:33,180 There'd been an enormous amount of research done on it 204 00:11:33,180 --> 00:11:36,450 in the ten years before 1960. 205 00:11:36,450 --> 00:11:38,780 They told me I should go into vacuum tubes, which was the 206 00:11:38,780 --> 00:11:41,680 coming field at that time. 207 00:11:41,680 --> 00:11:42,950 Two points to that story. 208 00:11:42,950 --> 00:11:45,800 If people tell you that information theory isn't 209 00:11:45,800 --> 00:11:49,880 important, please remember that story or else tell them 210 00:11:49,880 --> 00:11:53,940 to start to study vacuum tubes. 
211 00:11:53,940 --> 00:11:56,620 The other point of it -- well, I forgot what the other point 212 00:11:56,620 --> 00:11:58,720 of it is so let's go on. 213 00:11:58,720 --> 00:12:03,560 Anyway, information theory is very much based on a bunch of 214 00:12:03,560 --> 00:12:05,410 very abstract ideas. 215 00:12:05,410 --> 00:12:08,210 I will start to say what that means as we go on, but 216 00:12:08,210 --> 00:12:11,000 basically it means that it's based on very 217 00:12:11,000 --> 00:12:12,860 simple-minded models. 218 00:12:12,860 --> 00:12:16,220 It's based on ignoring all the complexities of the 219 00:12:16,220 --> 00:12:20,650 communication systems and starting to play with toy 220 00:12:20,650 --> 00:12:23,110 models and trying to understand those. 221 00:12:23,110 --> 00:12:26,970 That's part of what we have to understand to understand why 222 00:12:26,970 --> 00:12:28,310 we're doing what we're doing. 223 00:12:30,850 --> 00:12:35,390 A very complex relationship between modeling, theory, 224 00:12:35,390 --> 00:12:39,780 exercises and engineering design. 225 00:12:39,780 --> 00:12:42,290 Maybe I shouldn't be talking about that now and maybe the 226 00:12:42,290 --> 00:12:45,660 notes shouldn't talk about it, but I want to open the subject 227 00:12:45,660 --> 00:12:48,080 because I want you to be thinking about that 228 00:12:48,080 --> 00:12:49,430 throughout the term. 229 00:12:49,430 --> 00:12:52,000 Because it's something that most people really don't 230 00:12:52,000 --> 00:12:53,230 understand. 231 00:12:53,230 --> 00:12:55,260 Something that most teachers don't understand. 232 00:12:55,260 --> 00:12:58,990 It's something that most students don't understand. 233 00:12:58,990 --> 00:13:03,120 When you're a student, you do the exercises you do because 234 00:13:03,120 --> 00:13:05,290 you know you have to do them. 
235 00:13:05,290 --> 00:13:09,660 There was a sort of apocryphal story awhile ago that somebody 236 00:13:09,660 --> 00:13:12,870 told me about a graduate student, not at MIT, thank 237 00:13:12,870 --> 00:13:18,700 God, who went in to see the person teaching a course on 238 00:13:18,700 --> 00:13:22,240 wireless and said, I don't want to know all that theory, 239 00:13:22,240 --> 00:13:24,850 I don't want to have to think about these things, just tell 240 00:13:24,850 --> 00:13:28,450 me how to do the problems. 241 00:13:28,450 --> 00:13:33,210 Now, we will explain the enormous stupidity of that as 242 00:13:33,210 --> 00:13:34,430 we move on. 243 00:13:34,430 --> 00:13:40,360 Part of the stupidity is that the problems that you work on 244 00:13:40,360 --> 00:13:43,230 as a graduate student, even in your thesis, the problems that 245 00:13:43,230 --> 00:13:45,165 you work on are not real problems. 246 00:13:45,165 --> 00:13:50,080 The problems that we work on everywhere in engineering are 247 00:13:50,080 --> 00:13:53,030 toy problems. 248 00:13:53,030 --> 00:13:56,480 We work on them because we want to understand some aspect 249 00:13:56,480 --> 00:13:58,060 of the real problem. 250 00:13:58,060 --> 00:14:01,190 We can never understand the whole problem, so we study one 251 00:14:01,190 --> 00:14:03,890 aspect, then we study another aspect. 252 00:14:03,890 --> 00:14:07,780 The reason for all these equations that we have, it's 253 00:14:07,780 --> 00:14:10,160 not a way to understand reality. 254 00:14:10,160 --> 00:14:11,120 It really isn't. 255 00:14:11,120 --> 00:14:14,020 It's a way to understand little pieces of reality. 
256 00:14:14,020 --> 00:14:17,370 You take all of those and after you have all of these in 257 00:14:17,370 --> 00:14:21,020 your mind, you then start to study the real engineering 258 00:14:21,020 --> 00:14:24,850 problem and you say I put together this and this and 259 00:14:24,850 --> 00:14:28,140 this and this and this -- this is important, this is not so 260 00:14:28,140 --> 00:14:31,100 important, here, this is more important. 261 00:14:31,100 --> 00:14:34,550 And then finally, if you're a really good engineer, instead 262 00:14:34,550 --> 00:14:37,090 of writing down an equation, you take an envelope, as the 263 00:14:37,090 --> 00:14:41,540 story goes, and you scribble down a few numbers, and you 264 00:14:41,540 --> 00:14:45,650 say the way this system ought to be built is the following. 265 00:14:45,650 --> 00:14:50,060 That's the way all important systems get built, I claim. 266 00:14:50,060 --> 00:14:52,950 Important systems are not designed by the kinds of 267 00:14:52,950 --> 00:14:55,550 things you do in homework exercises. 268 00:14:55,550 --> 00:14:59,130 The things you do in homework exercises are designed to help 269 00:14:59,130 --> 00:15:02,500 you understand small pieces of problems. 270 00:15:02,500 --> 00:15:05,570 You have to understand those small pieces of problems in 271 00:15:05,570 --> 00:15:09,210 order to understand the whole engineering problems, but you 272 00:15:09,210 --> 00:15:13,660 don't do simulations to build a communications system. 273 00:15:13,660 --> 00:15:17,570 You understand what the whole system is, you can see it in 274 00:15:17,570 --> 00:15:20,830 your mind after you think about it long enough, and 275 00:15:20,830 --> 00:15:24,370 that's where good system design comes from. 276 00:15:24,370 --> 00:15:27,730 Simulations are a big part of all of this, writing down 277 00:15:27,730 --> 00:15:29,860 equations are a big part of it. 
278 00:15:29,860 --> 00:15:33,190 But when you're all done, if you design an engineering 279 00:15:33,190 --> 00:15:36,330 system and you don't see in your mind why the whole thing 280 00:15:36,330 --> 00:15:40,680 works, you will wind up with something like Microsoft Word. 281 00:15:44,360 --> 00:15:48,200 That's the truth. 282 00:15:48,200 --> 00:15:53,910 So the exercises that we're going to be doing are really 283 00:15:53,910 --> 00:15:58,940 aimed at understanding these simple models. 284 00:15:58,940 --> 00:16:02,960 The point of the exercises is not to get the right answer. 285 00:16:02,960 --> 00:16:06,540 The answer is totally irrelevant to everything. 286 00:16:06,540 --> 00:16:10,850 What's important is you start to understand what assumptions 287 00:16:10,850 --> 00:16:14,090 in the problem lead to what conclusions. 288 00:16:14,090 --> 00:16:18,040 How if you change something it changes something else. 289 00:16:18,040 --> 00:16:20,640 That's when you're using system design, because in 290 00:16:20,640 --> 00:16:24,440 system design you can never understand the whole thing. 291 00:16:24,440 --> 00:16:27,450 You have to look at what parts of it are important, what 292 00:16:27,450 --> 00:16:30,440 parts of it aren't important, and the only way you can do 293 00:16:30,440 --> 00:16:33,830 that is having a catalog of the simple-minded things in 294 00:16:33,830 --> 00:16:35,470 the back of your mind. 295 00:16:35,470 --> 00:16:40,820 Practical engineers get that by dealing with real systems 296 00:16:40,820 --> 00:16:42,330 day-to-day all their lives. 297 00:16:42,330 --> 00:16:44,850 They see catastrophes occur. 298 00:16:44,850 --> 00:16:49,420 Those things tell them what to avoid and what not to avoid. 299 00:16:49,420 --> 00:16:52,850 Students can do this in a much more efficient way. 
300 00:16:52,850 --> 00:16:56,970 Obviously, in the practical experience also, but the 301 00:16:56,970 --> 00:17:01,090 experience that you get from looking at these exercises and 302 00:17:01,090 --> 00:17:04,580 looking at this analytical material in the right way, 303 00:17:04,580 --> 00:17:08,940 mainly looking at it is how do you analyze these simple toy 304 00:17:08,940 --> 00:17:12,510 models, how do you make conclusions from them is what 305 00:17:12,510 --> 00:17:16,090 lets you start being a good engineer. 306 00:17:16,090 --> 00:17:20,880 I'm stressing this because at some point this term, all of 307 00:17:20,880 --> 00:17:25,190 you, all of you at different times are going to start 308 00:17:25,190 --> 00:17:27,640 saying why the hell am I doing this? 309 00:17:27,640 --> 00:17:31,580 Why am I dealing with this stupid little problem? 310 00:17:31,580 --> 00:17:36,080 Stupid little problems can get incredibly frustrating at 311 00:17:36,080 --> 00:17:38,920 times, because you see something that's so simple and 312 00:17:38,920 --> 00:17:43,270 you can't understand it. 313 00:17:43,270 --> 00:17:45,200 Then you say well, this simple thing can't 314 00:17:45,200 --> 00:17:46,980 be important anyway. 315 00:17:46,980 --> 00:17:49,670 What I really want to do is understand what this overall 316 00:17:49,670 --> 00:17:52,500 system is, and you give up at that point. 317 00:17:52,500 --> 00:17:55,250 Don't do that. 318 00:17:55,250 --> 00:18:02,500 Sometimes there's a very difficult question about what 319 00:18:02,500 --> 00:18:05,110 you mean by something being simple. 320 00:18:05,110 --> 00:18:08,240 I will sometimes say that something is very simple, and 321 00:18:08,240 --> 00:18:14,530 I will offend many of you to whom it's not simple. 322 00:18:14,530 --> 00:18:17,650 The point is something becomes simple after 323 00:18:17,650 --> 00:18:19,340 you understand it. 324 00:18:19,340 --> 00:18:22,200 Nothing is simple before you understand it. 
325 00:18:22,200 --> 00:18:25,140 So there's that point at which the light goes off in your 326 00:18:25,140 --> 00:18:27,580 head at which it becomes simple. 327 00:18:27,580 --> 00:18:30,860 When I say that something is simple, what I mean is that if 328 00:18:30,860 --> 00:18:33,240 you think about it long enough a light will go off in your 329 00:18:33,240 --> 00:18:35,810 head and it will become simple. 330 00:18:35,810 --> 00:18:37,960 It's not simple to start with. 331 00:18:37,960 --> 00:18:41,050 Other things are just messy. 332 00:18:41,050 --> 00:18:43,610 You've all heard of mathematical problems which 333 00:18:43,610 --> 00:18:45,290 are just ugly. 334 00:18:45,290 --> 00:18:48,270 There are these things of extraordinary complexity. 335 00:18:48,270 --> 00:18:50,870 There are things where there just isn't any structure. 336 00:18:50,870 --> 00:18:54,020 You see a lot of these things in computer science. 337 00:18:54,020 --> 00:18:55,810 I talked about Microsoft Word. 338 00:18:55,810 --> 00:18:59,300 Part of the reason it's such a lousy language is because the 339 00:18:59,300 --> 00:19:01,240 problem is incredibly difficult. 340 00:19:01,240 --> 00:19:03,230 It's incredibly unstructured. 341 00:19:03,230 --> 00:19:06,630 So people come up with things that don't make any sense, 342 00:19:06,630 --> 00:19:09,700 because that inherently is not simple. 343 00:19:09,700 --> 00:19:13,030 OK, these other things that we'll be studying here are 344 00:19:13,030 --> 00:19:16,460 inherently simple and the only problem is how do you get to 345 00:19:16,460 --> 00:19:21,070 the point of seeing these simple ideas. 346 00:19:21,070 --> 00:19:24,710 OK, so that's enough philosophy. 347 00:19:27,390 --> 00:19:31,710 No, not quite enough, a little more of that. 
348 00:19:31,710 --> 00:19:37,160 All the everyday communication systems that we deal with are 349 00:19:37,160 --> 00:19:41,490 incredibly complex in terms of the amount of hardware, the 350 00:19:41,490 --> 00:19:43,100 amount of software in them. 351 00:19:43,100 --> 00:19:45,640 If you look at the whole system and you try to 352 00:19:45,640 --> 00:19:50,020 understand it as a whole, you don't have a prayer. 353 00:19:50,020 --> 00:19:52,310 There's no way to do it. 354 00:19:52,310 --> 00:19:55,760 These things work and they can be understood because they're 355 00:19:55,760 --> 00:19:59,870 very highly structured. 356 00:19:59,870 --> 00:20:03,800 They have simple architectural principles. 357 00:20:03,800 --> 00:20:05,910 What do I mean by architectural principles? 358 00:20:05,910 --> 00:20:09,390 It's a word that engineers use more and more. 359 00:20:09,390 --> 00:20:13,030 It started off with computer systems because it became 360 00:20:13,030 --> 00:20:15,720 essential there to start thinking in terms of 361 00:20:15,720 --> 00:20:16,700 architecture. 362 00:20:16,700 --> 00:20:19,020 It's exactly the same thing that you mean when you're 363 00:20:19,020 --> 00:20:21,830 talking about a house or a building. 364 00:20:21,830 --> 00:20:25,990 Mainly the architecture is the overall design. 365 00:20:25,990 --> 00:20:29,000 It's the way the different pieces fit together. 366 00:20:29,000 --> 00:20:31,510 In engineering, architecture is the same thing. 367 00:20:31,510 --> 00:20:34,160 It's the thing that happens when your eyes fuzz over a 368 00:20:34,160 --> 00:20:38,500 little bit and you can't see any of the details anymore, 369 00:20:38,500 --> 00:20:41,170 and suddenly what you're doing is you're looking at the whole 370 00:20:41,170 --> 00:20:43,540 thing as an entity and saying how do 371 00:20:43,540 --> 00:20:46,320 the pieces fit together. 372 00:20:46,320 --> 00:20:50,370 Well, what we're going to be focusing on here. 
373 00:20:50,370 --> 00:20:56,510 One of the major keys to making these things simpler is 374 00:20:56,510 --> 00:20:57,560 to have standardized 375 00:20:57,560 --> 00:21:01,340 interfaces and to have layering. 376 00:21:01,340 --> 00:21:04,410 And we'll talk a little bit about each of these because 377 00:21:04,410 --> 00:21:08,420 our entire study of this subject is based on studying 378 00:21:08,420 --> 00:21:12,220 the different layers that we'll wind up with. 379 00:21:12,220 --> 00:21:18,050 OK, the most important and most critical interface in a 380 00:21:18,050 --> 00:21:20,950 communication system is between the 381 00:21:20,950 --> 00:21:24,000 source and the channel. 382 00:21:24,000 --> 00:21:27,850 Today, that standard interface is almost always 383 00:21:27,850 --> 00:21:30,250 a binary data stream. 384 00:21:30,250 --> 00:21:32,490 You all know this. 385 00:21:32,490 --> 00:21:37,070 When you talk about your modem, how do 386 00:21:37,070 --> 00:21:38,920 you talk about it? 387 00:21:38,920 --> 00:21:41,860 48 kilobits per second if you're an 388 00:21:41,860 --> 00:21:44,180 old-fashioned kind of guy. 389 00:21:44,180 --> 00:21:49,250 1.2 megabits if you're a medium technology person, or 390 00:21:49,250 --> 00:21:52,660 several gigabits if you're one of these persons who likes to 391 00:21:52,660 --> 00:21:55,830 gobble up huge amounts of stuff. 392 00:21:55,830 --> 00:21:58,130 That's the way you describe channels -- how many bits per 393 00:21:58,130 --> 00:22:00,470 second can you send? 394 00:22:00,470 --> 00:22:04,790 The way you describe sources is how many bits per second do 395 00:22:04,790 --> 00:22:08,540 you need as a way of viewing that source. 396 00:22:08,540 --> 00:22:11,780 When you store a picture, how do you talk about it? 397 00:22:11,780 --> 00:22:14,030 You don't talk about it in terms of the colors or any of these 398 00:22:14,030 --> 00:22:16,700 other things, a picture is a bunch of bits. 
399 00:22:19,750 --> 00:22:21,820 That, at the fundamental level, is 400 00:22:21,820 --> 00:22:22,940 what we're doing here. 401 00:22:22,940 --> 00:22:25,340 When you deal with source coding, you're talking 402 00:22:25,340 --> 00:22:28,660 fundamentally about the problem of taking whatever the 403 00:22:28,660 --> 00:22:33,050 source is, and it can be any sort of thing at all -- it can 404 00:22:33,050 --> 00:22:37,460 be voice, it can be text, it can be emails, the emails with 405 00:22:37,460 --> 00:22:42,680 viruses in it, whatever -- and the problem is you want to 406 00:22:42,680 --> 00:22:46,020 turn it into a bit stream. 407 00:22:46,020 --> 00:22:50,500 Then this bit stream gets presented to the channel. 408 00:22:50,500 --> 00:22:54,520 Channels and the channel encoding equipment don't have 409 00:22:54,520 --> 00:22:58,210 any idea of what those bits mean. 410 00:22:58,210 --> 00:23:01,110 They don't want to have any idea of what those bits mean. 411 00:23:01,110 --> 00:23:03,440 That's what an interface means. 412 00:23:03,440 --> 00:23:07,310 It means the problem of interpreting the bits, the 413 00:23:07,310 --> 00:23:10,690 problem of going from something which you view as 414 00:23:10,690 --> 00:23:14,650 having intelligence down to a sequence of bits is the 415 00:23:14,650 --> 00:23:17,190 problem of the source coder. 416 00:23:17,190 --> 00:23:20,260 The problem of taking those bits and moving them from one 417 00:23:20,260 --> 00:23:24,270 place to another is the function of the channel 418 00:23:24,270 --> 00:23:29,250 designer, the network designer and all of those things. 419 00:23:29,250 --> 00:23:32,760 When you talk about information theory, it's a 420 00:23:32,760 --> 00:23:35,570 total misnomer from the beginning. 421 00:23:35,570 --> 00:23:39,120 Information theory does not deal with information at all, 422 00:23:39,120 --> 00:23:41,230 it deals with data. 
423 00:23:41,230 --> 00:23:45,390 We will understand what that distinction is as we go on. 424 00:23:45,390 --> 00:23:48,250 When you talk about a bit stream, what you're talking 425 00:23:48,250 --> 00:23:51,740 about is a sequence of data. 426 00:23:51,740 --> 00:23:53,970 It doesn't mean anything. 427 00:23:53,970 --> 00:23:57,040 It's just that a one is different from a zero, and you 428 00:23:57,040 --> 00:23:59,200 don't care how it's different, it just is. 429 00:24:01,750 --> 00:24:05,390 So the channel input is a binary stream. 430 00:24:05,390 --> 00:24:07,210 It doesn't have any meaning as far as 431 00:24:07,210 --> 00:24:08,670 the channel is concerned. 432 00:24:08,670 --> 00:24:11,520 The bit stream does have a meaning as far as the source 433 00:24:11,520 --> 00:24:13,020 is concerned. 434 00:24:13,020 --> 00:24:18,290 So the problem is the source encoder takes these bits, 435 00:24:18,290 --> 00:24:23,030 takes the source, whatever it is, with its meaning and 436 00:24:23,030 --> 00:24:26,960 everything else, turns it into a stream of bits -- 437 00:24:26,960 --> 00:24:30,700 you would like to turn it into as few bits as possible. 438 00:24:30,700 --> 00:24:33,960 Then you take those bits, you transmit them on the channel, 439 00:24:33,960 --> 00:24:37,050 the channel designer understands what the physical 440 00:24:37,050 --> 00:24:40,000 medium is all about, understands what the network 441 00:24:40,000 --> 00:24:42,630 is all about, and doesn't have a clue as to what 442 00:24:42,630 --> 00:24:45,200 all those bits mean. 443 00:24:45,200 --> 00:24:46,610 So the picture is the following. 444 00:24:51,060 --> 00:24:57,350 That's the major layering of all communication systems. 445 00:24:57,350 --> 00:25:01,500 Here's the input, which is whatever it happens to be. 446 00:25:01,500 --> 00:25:03,250 Here's the source encoder. 
447 00:25:03,250 --> 00:25:07,970 The source encoder has to know a great deal about 448 00:25:07,970 --> 00:25:09,490 what that input is. 449 00:25:09,490 --> 00:25:11,160 You have to know the structure of the input. 450 00:25:11,160 --> 00:25:15,880 You have a source encoder for voice and you use it to try to 451 00:25:15,880 --> 00:25:21,400 encode a video stream it's not going to work, obviously. 452 00:25:21,400 --> 00:25:24,340 If you try to use a source encoder for English and try to 453 00:25:24,340 --> 00:25:29,860 use it on Chinese, it probably won't work very well either. 454 00:25:29,860 --> 00:25:33,600 It won't work for anything other than what 455 00:25:33,600 --> 00:25:34,460 it's intended for. 456 00:25:34,460 --> 00:25:38,910 So the source encoder has to understand the source. 457 00:25:38,910 --> 00:25:42,280 When I say understand the source, what do I mean? 458 00:25:42,280 --> 00:25:44,340 It means to understand the probabilistic 459 00:25:44,340 --> 00:25:47,150 structure of the source. 460 00:25:47,150 --> 00:25:52,080 This is another of the things that Shannon recognized 461 00:25:52,080 --> 00:25:57,600 clearly that hadn't been recognized at all before this. 462 00:25:57,600 --> 00:26:02,610 You have to somehow understand how to view what's coming out 463 00:26:02,610 --> 00:26:08,090 of the source, this picture, say, as one of a possible set 464 00:26:08,090 --> 00:26:10,470 of pictures. 465 00:26:10,470 --> 00:26:14,950 If you know ahead of time that a communication system is 466 00:26:14,950 --> 00:26:21,110 going to be sending the Gettysburg Address or one of 467 00:26:21,110 --> 00:26:28,230 50 other highly-inspiring texts, what do you do? 468 00:26:28,230 --> 00:26:31,950 Do you try to encode these things, all 469 00:26:31,950 --> 00:26:33,120 these different texts? 470 00:26:33,120 --> 00:26:34,100 Of course not. 
471 00:26:34,100 --> 00:26:36,790 You just assign number one to the Gettysburg Address, number 472 00:26:36,790 --> 00:26:39,690 two to the second thing that you might be interested in. 473 00:26:39,690 --> 00:26:43,500 You send a number and then at the output out comes the 474 00:26:43,500 --> 00:26:46,930 Gettysburg Address because you've stored that there. 475 00:26:46,930 --> 00:26:49,220 So that's the idea of source coding. 476 00:26:49,220 --> 00:26:53,790 What is important is not the complexity of the individual 477 00:26:53,790 --> 00:26:57,110 messages, it's the probabilistic structure that 478 00:26:57,110 --> 00:27:00,330 tells you what are the different possibilities. 479 00:27:00,330 --> 00:27:02,530 That's what happens when you try to turn 480 00:27:02,530 --> 00:27:04,440 things into binary digits. 481 00:27:04,440 --> 00:27:07,780 You have one sequence of binary digits for each of the 482 00:27:07,780 --> 00:27:11,890 possible things that you want to represent. 483 00:27:11,890 --> 00:27:13,960 Then in the channel you have the same sort of thing. 484 00:27:13,960 --> 00:27:16,760 You have noise, you have all sorts of other crazy things on 485 00:27:16,760 --> 00:27:18,220 the channel. 486 00:27:18,220 --> 00:27:21,470 You have a sequence of bits coming in, you have a sequence 487 00:27:21,470 --> 00:27:23,160 of bits coming out. 488 00:27:23,160 --> 00:27:27,030 The fundamental problem of the channel is very, very simple. 489 00:27:27,030 --> 00:27:30,440 You want to spit out the same bits that came in. 490 00:27:30,440 --> 00:27:34,360 You can take a certain amount of delay doing that, but 491 00:27:34,360 --> 00:27:36,110 that's what you have to do. 492 00:27:36,110 --> 00:27:39,170 In order to do that, you have to understand something about 493 00:27:39,170 --> 00:27:42,110 the probabilistic structure of the noise. 494 00:27:42,110 --> 00:27:45,290 So both ways you have probabilistic structure. 
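The Gettysburg Address example above can be sketched in a few lines of Python. This is only an illustrative sketch, not something shown in the lecture: the codebook contents and the names `CODEBOOK`, `bits_needed`, `encode`, and `decode` are assumptions made for the example. It assumes the encoder and decoder both store the same list of possible texts, so only an index has to be sent.

```python
# Hypothetical shared codebook: both ends store the same list of texts.
# The entries are placeholders, not material from the lecture.
CODEBOOK = [
    "Four score and seven years ago ...",  # message 0
    "Placeholder for a second text ...",   # message 1
    "Placeholder for a third text ...",    # message 2
]


def bits_needed(num_messages: int) -> int:
    """Smallest number of bits that gives each message a distinct index."""
    return max(1, (num_messages - 1).bit_length())


def encode(message: str) -> str:
    """Send only the index of the message, as a fixed-length bit string."""
    index = CODEBOOK.index(message)
    return format(index, f"0{bits_needed(len(CODEBOOK))}b")


def decode(bits: str) -> str:
    """Look the stored text up again from the received index."""
    return CODEBOOK[int(bits, 2)]
```

With the 51 possible texts mentioned above (the Gettysburg Address plus 50 others), `bits_needed(51)` is 6, no matter how long the individual texts are. The number of bits depends on how many possibilities there are, not on the complexity of each one.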
495 00:27:45,290 --> 00:27:50,170 It's why you have to understand probability at the 496 00:27:50,170 --> 00:27:55,250 level of 6.041, which is the undergraduate course here 497 00:27:55,250 --> 00:27:56,950 studying probability. 498 00:27:56,950 --> 00:28:00,010 If you don't have that background, please don't take 499 00:28:00,010 --> 00:28:03,680 the course because you're going to be crucified. 500 00:28:03,680 --> 00:28:09,660 If you think you can learn it on the fly, don't. 501 00:28:09,660 --> 00:28:11,680 You can't learn it on the fly. 502 00:28:14,570 --> 00:28:18,650 As a matter of fact, if you've taken the course, you still 503 00:28:18,650 --> 00:28:24,530 don't understand it, and you will need to scramble a little 504 00:28:24,530 --> 00:28:27,900 bit to understand it at a deeper level in order to 505 00:28:27,900 --> 00:28:29,910 understand what's going on here. 506 00:28:29,910 --> 00:28:34,190 6.041 is a particularly good undergraduate course. 507 00:28:34,190 --> 00:28:36,650 I think it's probably taught here better than it's taught 508 00:28:36,650 --> 00:28:38,710 at most places. 509 00:28:38,710 --> 00:28:41,410 But you still don't understand it the first time through. 510 00:28:41,410 --> 00:28:47,330 It's a tricky, subtle subject you really need to understand 511 00:28:47,330 --> 00:28:50,370 enough of it, If you have no exposure to it. 512 00:28:50,370 --> 00:28:52,630 If you've taken a course called statistics and 513 00:28:52,630 --> 00:28:56,220 probability, which first teaches you statistics and 514 00:28:56,220 --> 00:29:01,070 then tucks in a little bit of probability at the end, go 515 00:29:01,070 --> 00:29:04,710 learn probability first because you don't have enough 516 00:29:04,710 --> 00:29:07,940 of it to take this subject. 
517 00:29:07,940 --> 00:29:12,640 The other part of dealing with these channels is that 518 00:29:12,640 --> 00:29:16,070 suddenly we have to deal with Fourier analysis and all of these 519 00:29:16,070 --> 00:29:20,090 things, if you don't have some kind of subject 520 00:29:20,090 --> 00:29:22,020 in signals and systems. 521 00:29:22,020 --> 00:29:24,480 Some computer scientists don't learn that 522 00:29:24,480 --> 00:29:26,070 kind of material anymore. 523 00:29:26,070 --> 00:29:29,510 Again, you can't learn it on the fly because you need a 524 00:29:29,510 --> 00:29:32,240 little bit of background for it. 525 00:29:32,240 --> 00:29:35,530 Those two prerequisites are really essential. 526 00:29:35,530 --> 00:29:37,810 Nothing else is essential. 527 00:29:37,810 --> 00:29:40,960 The more mathematics you know the better off you are, but we 528 00:29:40,960 --> 00:29:43,470 will develop what we need as we go. 529 00:29:50,150 --> 00:29:53,510 Now, we talked about this fundamental layer in all 530 00:29:53,510 --> 00:29:57,030 communication systems, which is between source coding and 531 00:29:57,030 --> 00:29:58,670 channel coding. 532 00:29:58,670 --> 00:30:02,470 You first turn the source into bits, and then you turn the 533 00:30:02,470 --> 00:30:06,290 bits into something you can transmit on the channel. 534 00:30:06,290 --> 00:30:10,840 Source coding breaks down into three pieces itself. 535 00:30:10,840 --> 00:30:13,980 It doesn't have to break down into those three pieces, but 536 00:30:13,980 --> 00:30:16,590 it usually does. 537 00:30:16,590 --> 00:30:21,330 Most source coding you start out with a wave form, such as 538 00:30:21,330 --> 00:30:26,860 with voice, or with pictures you start out with what's more 539 00:30:26,860 --> 00:30:30,130 complicated than a wave form, some kind of mapping from 540 00:30:30,130 --> 00:30:35,110 x-coordinate and y-coordinate and time, and you map all of 541 00:30:35,110 --> 00:30:40,980 those things into something.
542 00:30:40,980 --> 00:30:45,420 Our problem here is we want to take these analog wave forms 543 00:30:45,420 --> 00:30:46,885 or generalizations of. 544 00:30:46,885 --> 00:30:49,720 It turns out that all we have to deal with here is the 545 00:30:49,720 --> 00:30:52,630 analog wave forms because everything else 546 00:30:52,630 --> 00:30:54,370 follows easily from that. 547 00:30:54,370 --> 00:30:55,320 So there's one question. 548 00:30:55,320 --> 00:30:58,790 How do you turn an analog wave form into 549 00:30:58,790 --> 00:31:00,810 a sequence of numbers? 550 00:31:00,810 --> 00:31:06,020 This is the common way of doing source coding. 551 00:31:06,020 --> 00:31:08,180 Many of you who have been exposed to the sampling 552 00:31:08,180 --> 00:31:10,810 theorem, you think that's the way to do it, this is 553 00:31:10,810 --> 00:31:12,590 one way to do it. 554 00:31:12,590 --> 00:31:14,870 We will talk a great deal about this. 555 00:31:14,870 --> 00:31:18,260 You will find out that the sampling theorem really isn't 556 00:31:18,260 --> 00:31:19,270 the way to do it. 557 00:31:19,270 --> 00:31:22,080 Although again, it's a simple model, it's the way to start 558 00:31:22,080 --> 00:31:23,220 dealing with it. 559 00:31:23,220 --> 00:31:25,350 Then you learn the various other parts of 560 00:31:25,350 --> 00:31:27,070 that as you go along. 561 00:31:27,070 --> 00:31:30,130 After you wind up with sequences, sequences of 562 00:31:30,130 --> 00:31:34,750 numbers, we worry about how to quantize those numbers, 563 00:31:34,750 --> 00:31:37,670 because fundamentally we're trying to get from 564 00:31:37,670 --> 00:31:40,220 wave forms into bits. 
565 00:31:40,220 --> 00:31:43,970 So the three stages in that process are first go to 566 00:31:43,970 --> 00:31:48,910 sequences, go from sequences to symbols -- namely, you have 567 00:31:48,910 --> 00:31:53,470 a finite number of symbols, which are quantized versions 568 00:31:53,470 --> 00:31:56,910 of those analog numbers which can be anything. 569 00:31:56,910 --> 00:32:01,360 Then finally, you encode the symbols into bits. 570 00:32:01,360 --> 00:32:03,920 The decoding goes in the opposite order. 571 00:32:03,920 --> 00:32:08,480 Here's a picture which shows it better than the words do. 572 00:32:11,270 --> 00:32:16,310 From the input wave form, you sample it or something more 573 00:32:16,310 --> 00:32:18,770 generalized than sampling it. 574 00:32:18,770 --> 00:32:21,450 That gives you a sequence of numbers. 575 00:32:21,450 --> 00:32:24,380 You quantize those numbers, that gives 576 00:32:24,380 --> 00:32:26,690 you a symbol sequence. 577 00:32:26,690 --> 00:32:30,140 You have a discrete coder, which turns those symbols into 578 00:32:30,140 --> 00:32:33,410 a sequence of bits in some nice way. 579 00:32:33,410 --> 00:32:36,460 Then you have this reliable binary channel. 580 00:32:36,460 --> 00:32:38,500 Notice what I've done here. 581 00:32:38,500 --> 00:32:41,520 This thing is that entire system we were 582 00:32:41,520 --> 00:32:43,300 talking about before. 583 00:32:43,300 --> 00:32:47,560 This has a channel encoder in it, a channel, a channel 584 00:32:47,560 --> 00:32:50,140 decoder and all of that. 585 00:32:50,140 --> 00:32:53,900 But what we've been saying in this layering idea is when 586 00:32:53,900 --> 00:32:57,580 you're dealing with source coding you ignore all of that. 587 00:32:57,580 --> 00:33:02,700 You say OK, it's Tom's job who was designing the channel 588 00:33:02,700 --> 00:33:05,880 encoder to make sure that my bits come out 589 00:33:05,880 --> 00:33:08,400 here as the same bits.
590 00:33:08,400 --> 00:33:10,740 If the bits are wrong it's his fault. 591 00:33:10,740 --> 00:33:15,210 If the bits are right, and this output wave form doesn't 592 00:33:15,210 --> 00:33:19,130 look like the input wave form, it's my fault. 593 00:33:19,130 --> 00:33:24,260 So part of the layering idea is we just recreate this whole 594 00:33:24,260 --> 00:33:28,370 thing here as the same bits and we don't worry about what 595 00:33:28,370 --> 00:33:32,470 happens when the bits are different. 596 00:33:32,470 --> 00:33:36,490 The other part of this is that all of this breaks 597 00:33:36,490 --> 00:33:40,120 down in the same way. 598 00:33:40,120 --> 00:33:45,870 Namely, at this interface between here and here, the 599 00:33:45,870 --> 00:33:47,480 idea is the same. 600 00:33:47,480 --> 00:33:50,470 Namely, there's an input wave form that comes in, there's a 601 00:33:50,470 --> 00:33:53,540 sequence of numbers here. 602 00:33:53,540 --> 00:33:56,670 The job of this analog filter, as we call it, and you'll see 603 00:33:56,670 --> 00:34:01,500 why we call it that later, is to take numbers here, turn 604 00:34:01,500 --> 00:34:04,370 them back into wave forms. 605 00:34:04,370 --> 00:34:08,526 These numbers are supposed to resemble these numbers in some 606 00:34:08,526 --> 00:34:10,300 way or other. 607 00:34:10,300 --> 00:34:14,600 I can deal with this part of the problem totally separately 608 00:34:14,600 --> 00:34:17,350 from dealing with the other parts of the problem. 609 00:34:17,350 --> 00:34:20,810 Namely, the only problem here is how do I take numbers, turn 610 00:34:20,810 --> 00:34:23,520 them into wave forms. 611 00:34:23,520 --> 00:34:26,880 There are a bunch of other problems hidden here, like 612 00:34:26,880 --> 00:34:30,180 these numbers are not going to be exactly the same as these 613 00:34:30,180 --> 00:34:33,910 wave forms because I have quantization involved here. 
614 00:34:33,910 --> 00:34:36,920 Since the numbers are not exactly the same, we have to 615 00:34:36,920 --> 00:34:39,280 deal with questions of approximation. 616 00:34:39,280 --> 00:34:43,250 How do approximations in numbers come out in terms of 617 00:34:43,250 --> 00:34:46,120 approximations in terms of wave forms? 618 00:34:46,120 --> 00:34:49,380 That's where you need things like the sampling theorem and 619 00:34:49,380 --> 00:34:51,740 the generalizations of it that we're going to spend 620 00:34:51,740 --> 00:34:52,980 a lot of time with. 621 00:34:52,980 --> 00:34:55,390 Then you go down to this point. 622 00:34:55,390 --> 00:34:59,900 At this point the quantizer produces a string of symbols. 623 00:34:59,900 --> 00:35:04,180 Whatever those symbols happen to be, but those symbols 624 00:35:04,180 --> 00:35:07,810 might, in fact, just be integers whereas the things 625 00:35:07,810 --> 00:35:12,440 here were real valued numbers, and in quantizing we turn real 626 00:35:12,440 --> 00:35:17,560 numbers into integers by saying -- well, it's the same 627 00:35:17,560 --> 00:35:19,420 way you approximate things all the time. 628 00:35:19,420 --> 00:35:22,620 You round off the pennies in your checkbook. 629 00:35:22,620 --> 00:35:25,720 It's the same idea. 630 00:35:25,720 --> 00:35:29,300 So the problem here is the quantizer produces these 631 00:35:29,300 --> 00:35:31,530 symbols here. 632 00:35:31,530 --> 00:35:35,410 The job of all of this stuff is to recreate those same 633 00:35:35,410 --> 00:35:38,090 symbols here. 634 00:35:38,090 --> 00:35:41,020 So the symbols now go into something that does the 635 00:35:41,020 --> 00:35:42,750 reverse of a quantizer -- 636 00:35:42,750 --> 00:35:44,690 we'll call it a table look-up because that's 637 00:35:44,690 --> 00:35:46,090 sort of what it is. 638 00:35:46,090 --> 00:35:50,890 So we can study this part of the problem separately, also.
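The sampler, quantizer, and table look-up stages described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the design used in the course: it assumes plain uniform sampling and a uniform quantizer with a fixed step size, and the names `sample`, `quantize`, and `table_lookup` are hypothetical, chosen only for the example.

```python
import math


def sample(waveform, num_samples, interval):
    """Turn a wave form (a function of time) into a sequence of real numbers."""
    return [waveform(k * interval) for k in range(num_samples)]


def quantize(x, step):
    """Map a real number to an integer symbol: the nearest multiple of `step`.

    With step = 1.0 this is the checkbook idea of rounding off the pennies.
    """
    return math.floor(x / step + 0.5)


def table_lookup(symbol, step):
    """Reverse of the quantizer: map a symbol back to a representative number."""
    return symbol * step
```

For example, `quantize(12.34, 1.0)` gives the symbol `12`, and `table_lookup(12, 1.0)` gives back `12.0`. The reconstructed number is not the original one; with this uniform quantizer it differs from it by at most half a step, which is the approximation question raised above.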
639 00:35:50,890 --> 00:35:53,890 You don't have to understand anything about this problem to 640 00:35:53,890 --> 00:35:56,580 understand this problem. 641 00:35:56,580 --> 00:36:01,430 Finally, we have symbols here going into a discrete encoder. 642 00:36:01,430 --> 00:36:04,390 We have an interesting problem which says how do we take 643 00:36:04,390 --> 00:36:07,310 these symbols which have some kind of probabilistic 644 00:36:07,310 --> 00:36:11,500 structure to them and turn them into binary digits in an 645 00:36:11,500 --> 00:36:12,960 efficient way. 646 00:36:12,960 --> 00:36:17,760 If the symbols here, for example, are binary symbols 647 00:36:17,760 --> 00:36:24,500 and the symbols happen to be 1 with probability 999 out of 648 00:36:24,500 --> 00:36:29,000 1,000, and zero with probability 1 in 1,000, you 649 00:36:29,000 --> 00:36:32,300 don't really just want to take these symbols and map them in 650 00:36:32,300 --> 00:36:35,420 a straightforward way, 1 into 1 and zero into zero. 651 00:36:35,420 --> 00:36:37,600 You want to look at whole sequences of them. 652 00:36:37,600 --> 00:36:39,460 You want to do run length coding, for example. 653 00:36:39,460 --> 00:36:44,210 You want to count how long it is between these zero's, and 654 00:36:44,210 --> 00:36:47,260 then what you do is send those counts and you encode them. 655 00:36:47,260 --> 00:36:50,000 So there are all kinds of tricks that you can use here. 656 00:36:50,000 --> 00:36:53,870 But the point is, this problem is separate, and this problem 657 00:36:53,870 --> 00:36:56,480 is separate from this problem. 658 00:36:56,480 --> 00:36:58,980 Now what we're going to do in this course is start out with 659 00:36:58,980 --> 00:37:04,140 this problem, because this is the cleanest and the neatest 660 00:37:04,140 --> 00:37:06,460 of the three problems. 661 00:37:06,460 --> 00:37:09,700 It's the one you can say the most about. 
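The run length idea mentioned above, counting how long it is between the zeros, can be sketched like this. It is an illustrative sketch only: the function names `run_lengths` and `from_run_lengths` are assumptions, and the sketch stops at producing the counts. Turning the counts themselves into binary digits efficiently is a separate step that is not shown here.

```python
def run_lengths(bits):
    """Count the 1s before each 0, plus the run of 1s after the last 0."""
    runs = []
    count = 0
    for b in bits:
        if b == 1:
            count += 1
        else:
            runs.append(count)  # a 0 ends the current run of 1s
            count = 0
    runs.append(count)  # trailing run of 1s (possibly empty)
    return runs


def from_run_lengths(runs):
    """Rebuild the original bit sequence from the counts."""
    bits = []
    for r in runs[:-1]:
        bits.extend([1] * r)
        bits.append(0)
    bits.extend([1] * runs[-1])
    return bits
```

For the sequence `[1, 1, 1, 0, 1, 1, 0, 0, 1]` the counts are `[3, 2, 0, 1]`, and `from_run_lengths` recovers the original sequence exactly. When zeros occur only about once in 1,000 symbols, there are far fewer counts than there are symbols, which is where the saving comes from.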
662 00:37:09,700 --> 00:37:12,920 It's the one which has the nicest results about it. 663 00:37:12,920 --> 00:37:16,430 In fact, a lot of source coding problems just deal with 664 00:37:16,430 --> 00:37:17,510 this one part of it. 665 00:37:17,510 --> 00:37:20,820 Whenever you're encoding text, you're starting out with 666 00:37:20,820 --> 00:37:24,060 symbols which are the letters of whatever language you're 667 00:37:24,060 --> 00:37:27,340 dealing with and perhaps a bunch of other things -- ASCII 668 00:37:27,340 --> 00:37:31,570 code, you've generated 256 different things, including 669 00:37:31,570 --> 00:37:34,540 all the letters, all the capital letters, all of the 670 00:37:34,540 --> 00:37:37,850 junk that used to be on typewriters, and some of the 671 00:37:37,850 --> 00:37:41,430 junk that's now in word processing languages, and you 672 00:37:41,430 --> 00:37:44,310 have a way of mapping those into binary digits. 673 00:37:44,310 --> 00:37:46,600 We want to look at better ways of doing that. 674 00:37:46,600 --> 00:37:50,240 So this, in fact, covers the whole problem of source coding 675 00:37:50,240 --> 00:37:53,950 when you're dealing with text rather than 676 00:37:53,950 --> 00:37:57,150 dealing with wave forms. 677 00:37:57,150 --> 00:38:01,120 This problem here combined with this problem deals with 678 00:38:01,120 --> 00:38:03,720 all of these many problems where what you're interested 679 00:38:03,720 --> 00:38:07,780 in is taking a sequence of numbers or a sequence of 680 00:38:07,780 --> 00:38:10,190 whatjamacallits and quantizing 681 00:38:10,190 --> 00:38:12,050 them and then coding. 682 00:38:12,050 --> 00:38:15,180 Finally, when you get to the input wave forms and output 683 00:38:15,180 --> 00:38:18,830 wave forms, you've solved both of these problems and you can 684 00:38:18,830 --> 00:38:21,760 then focus on what's important here.
685 00:38:21,760 --> 00:38:25,660 This is a good case study of why the information theoretic 686 00:38:25,660 --> 00:38:29,010 approach works and why theory works. 687 00:38:29,010 --> 00:38:31,760 Because if you started out with the problem of saying I 688 00:38:31,760 --> 00:38:36,300 want to build something to encode voice, and you try to 689 00:38:36,300 --> 00:38:39,700 do all of these together, and you look up all the things you 690 00:38:39,700 --> 00:38:44,070 can find about voice encoding on the web and you read all of 691 00:38:44,070 --> 00:38:47,320 them, what you will wind up with, I can guarantee you, is 692 00:38:47,320 --> 00:38:50,990 you will know an enormous amount of jargon, you will 693 00:38:50,990 --> 00:38:54,210 know an enormous number of different systems, which each 694 00:38:54,210 --> 00:38:59,420 half work, you will not have the slightest clue as to how 695 00:38:59,420 --> 00:39:02,420 to build a better system. 696 00:39:02,420 --> 00:39:05,640 So you have to take the structured viewpoint, which is 697 00:39:05,640 --> 00:39:09,090 really the only way to do things to find new and better 698 00:39:09,090 --> 00:39:11,430 ways of doing things. 699 00:39:18,700 --> 00:39:20,530 Now there are some extra things. 700 00:39:20,530 --> 00:39:22,940 I have over-simplified it. 701 00:39:22,940 --> 00:39:25,850 I want to over-simplify things in this course. 702 00:39:25,850 --> 00:39:29,170 What I will always do is I'll try to cheat you into thinking 703 00:39:29,170 --> 00:39:33,760 the problem is simpler than it is, and then I will talk about 704 00:39:33,760 --> 00:39:37,760 all of the added little nasties that come in as soon 705 00:39:37,760 --> 00:39:40,080 as you try to use any of this stuff. 706 00:39:40,080 --> 00:39:41,890 There are always nasties. 
707 00:39:41,890 --> 00:39:46,010 What I hope you will all do by the end of the term is find 708 00:39:46,010 --> 00:39:49,680 those little nasties on your own before I start to talk 709 00:39:49,680 --> 00:39:50,390 about them. 710 00:39:50,390 --> 00:39:53,700 Namely, when we start to talk about a simple model, be 711 00:39:53,700 --> 00:39:57,610 interested in the simple model, study it, find out what 712 00:39:57,610 --> 00:40:01,230 it's all about, but at the same time, for Pete's sake, 713 00:40:01,230 --> 00:40:02,670 ask yourself the question. 714 00:40:02,670 --> 00:40:06,940 What does this have to do with the price of rice in China or 715 00:40:06,940 --> 00:40:09,050 with anything else? 716 00:40:09,050 --> 00:40:10,520 And ask those questions. 717 00:40:10,520 --> 00:40:13,840 Don't let those questions interfere with understanding 718 00:40:13,840 --> 00:40:17,280 the simple model, but you've got to always be focusing on 719 00:40:17,280 --> 00:40:23,030 how that simple model relates to what you visualize as a 720 00:40:23,030 --> 00:40:26,630 communication system problem. 721 00:40:26,630 --> 00:40:28,540 It usually will have some relation but 722 00:40:28,540 --> 00:40:30,250 not a complete relation. 723 00:40:30,250 --> 00:40:34,590 Here the problems are in this binary interface. 724 00:40:34,590 --> 00:40:39,580 The source might produce packets or the source might 725 00:40:39,580 --> 00:40:41,870 produce a stream of data. 726 00:40:41,870 --> 00:40:45,480 In other words, if what you're dealing with is the kind of 727 00:40:45,480 --> 00:40:50,270 situation where you're an amateur photographer, you go 728 00:40:50,270 --> 00:40:53,840 around taking all sorts of pictures, and then you encode 729 00:40:53,840 --> 00:40:55,420 these pictures -- 730 00:40:55,420 --> 00:40:57,130 what do I mean by encoding the pictures? 731 00:40:57,130 --> 00:41:00,010 You turn them into binary digits. 
732 00:41:00,010 --> 00:41:03,040 Then you want to send your pictures to somebody else, but 733 00:41:03,040 --> 00:41:05,590 what you're really doing is sending these packets to 734 00:41:05,590 --> 00:41:06,560 someone else. 735 00:41:06,560 --> 00:41:09,580 You're not sending a stream of data -- you have 10 pictures, 736 00:41:09,580 --> 00:41:12,180 you want to send those 10 pictures. 737 00:41:12,180 --> 00:41:15,300 At the output of the whole system, you hope you see 10 738 00:41:15,300 --> 00:41:16,270 separate pictures. 739 00:41:16,270 --> 00:41:20,310 You hope there's something in there which can recognize that 740 00:41:20,310 --> 00:41:23,020 out of the stream of binary digits that come across the 741 00:41:23,020 --> 00:41:26,350 channel, there's some packet structure there. 742 00:41:26,350 --> 00:41:29,490 Well we're not going to talk about that really. 743 00:41:29,490 --> 00:41:32,600 Whenever you start dealing with this problem of source 744 00:41:32,600 --> 00:41:35,700 encoding, you also need to worry a little bit about the 745 00:41:35,700 --> 00:41:36,750 packet structure. 746 00:41:36,750 --> 00:41:40,720 What are the protocols for knowing when something new 747 00:41:40,720 --> 00:41:43,900 starts, when something new ends. 748 00:41:43,900 --> 00:41:46,790 Those kinds of problems are dealt with mostly in a network 749 00:41:46,790 --> 00:41:50,590 course here, and you can find out all sorts of things about 750 00:41:50,590 --> 00:41:51,150 them there. 751 00:41:51,150 --> 00:41:53,750 But there are these two general types of things -- 752 00:41:53,750 --> 00:41:56,860 packets and streams of data. 753 00:41:56,860 --> 00:41:59,560 In terms of understanding voice encoding, you can study 754 00:41:59,560 --> 00:42:01,000 them both together. 755 00:42:01,000 --> 00:42:02,790 Why can you study them both together? 
756 00:42:02,790 --> 00:42:05,980 Because the packets are long, and because the packets are 757 00:42:05,980 --> 00:42:08,960 long because they include a lot of data, you have a little 758 00:42:08,960 --> 00:42:12,110 bit of end effect about how to start it, a little bit of end 759 00:42:12,110 --> 00:42:15,110 effect about how to end it, but the main structural 760 00:42:15,110 --> 00:42:17,200 problem you'll be dealing with is what to do 761 00:42:17,200 --> 00:42:19,010 with the whole thing. 762 00:42:19,010 --> 00:42:23,550 It's an added piece that comes at the end to worry about how 763 00:42:23,550 --> 00:42:26,690 do you denote the beginning and the end. 764 00:42:26,690 --> 00:42:29,970 That's a problem you have with stream data also, because 765 00:42:29,970 --> 00:42:34,820 stream data does not start at time minus infinity. 766 00:42:34,820 --> 00:42:37,300 Whatever the stream data is coming from, what it's coming 767 00:42:37,300 --> 00:42:40,060 from did not start at time t equals minus infinity. 768 00:42:40,060 --> 00:42:43,810 You just think of it that way because you recognize you can 769 00:42:43,810 --> 00:42:47,450 postpone the problem of how do you start it and how do you 770 00:42:47,450 --> 00:42:51,650 end it, and hopefully you can postpone it until somebody 771 00:42:51,650 --> 00:42:53,040 else has taken over the job. 772 00:42:55,700 --> 00:42:58,300 What the channel accepts is either of these binary streams 773 00:42:58,300 --> 00:43:02,300 or packets, but then queueing exists. 774 00:43:02,300 --> 00:43:05,650 In other words, you have stuff coming into a channel -- 775 00:43:05,650 --> 00:43:09,330 sometimes I'm sitting at home and I want to send long files, 776 00:43:09,330 --> 00:43:10,830 I want to send this whole textbook 777 00:43:10,830 --> 00:43:13,260 I'm writing to somebody.
778 00:43:13,260 --> 00:43:16,870 It queues up, it takes a long time to get it out of my 779 00:43:16,870 --> 00:43:20,110 computer and into this other person's computer. 780 00:43:20,110 --> 00:43:23,370 It would be nice if I could send it at optical speeds. 781 00:43:23,370 --> 00:43:24,410 But I don't care much. 782 00:43:24,410 --> 00:43:28,750 I don't care much whether it takes a second or a minute or 783 00:43:28,750 --> 00:43:30,740 an hour to get to this other person. 784 00:43:30,740 --> 00:43:35,190 He won't read it for a week anyway, if he reads it at all, 785 00:43:35,190 --> 00:43:37,230 so what difference does it make? 786 00:43:37,230 --> 00:43:39,650 But anyway, we have these queueing problems, which are, 787 00:43:39,650 --> 00:43:40,990 again, separable. 788 00:43:40,990 --> 00:43:44,550 They're dealt with mostly in the network course. 789 00:43:44,550 --> 00:43:48,710 What we're mostly interested in is how to reduce bit rate 790 00:43:48,710 --> 00:43:49,520 for sources. 791 00:43:49,520 --> 00:43:52,080 How do you encode things more efficiently? 792 00:43:52,080 --> 00:43:54,920 Data compression is the word usually given to this. 793 00:43:54,920 --> 00:43:58,970 How do you compress things into a smaller number of bits 794 00:43:58,970 --> 00:44:01,790 using the statistical structure of it? 795 00:44:01,790 --> 00:44:04,570 Then how do you increase the number of bits you can send 796 00:44:04,570 --> 00:44:05,470 over a channel? 797 00:44:05,470 --> 00:44:07,340 That's the problem with channels. 798 00:44:07,340 --> 00:44:11,240 If you have a channel that'll send 4,800 bits per second, 799 00:44:11,240 --> 00:44:14,440 which is what telephone lines used to do. 
800 00:44:14,440 --> 00:44:17,550 If you build a new product that sends 9,600 bits per 801 00:44:17,550 --> 00:44:21,080 second, which was a major technological achievement back 802 00:44:21,080 --> 00:44:26,540 in the '70s and '80s, suddenly your company becomes a very 803 00:44:26,540 --> 00:44:29,140 important communication player. 804 00:44:29,140 --> 00:44:32,450 If you then take the same channel and learn how to 805 00:44:32,450 --> 00:44:36,770 communicate at megabits over it, that's a major thing also. 806 00:44:36,770 --> 00:44:39,080 How did people learn to do that? 807 00:44:39,080 --> 00:44:43,340 Mostly by using all the ideas that the people used in going 808 00:44:43,340 --> 00:44:47,900 from 4,800 bits per second to 9,600 bits per second. 809 00:44:47,900 --> 00:44:51,830 It was the people who did that first job who were the primary 810 00:44:51,830 --> 00:44:55,260 people involved in the second job. 811 00:44:55,260 --> 00:44:57,870 The point of that is it doesn't really make any 812 00:44:57,870 --> 00:45:01,690 difference as an engineer whether you're going from 813 00:45:01,690 --> 00:45:10,040 4,800 to 9,600 or from 9,600 to 19.2 or from 19.2 to 38.4 814 00:45:10,040 --> 00:45:12,640 or whatever it is, or so forth up. 815 00:45:12,640 --> 00:45:16,120 Each one of these is just putting in a few new goodies, 816 00:45:16,120 --> 00:45:18,790 recognizing a few new things, making the system 817 00:45:18,790 --> 00:45:20,040 a little bit better. 818 00:45:22,560 --> 00:45:24,150 Let's talk about the channel now. 819 00:45:27,160 --> 00:45:28,340 The channel is given -- 820 00:45:28,340 --> 00:45:31,050 in other words, it's not under the control of the designer. 821 00:45:31,050 --> 00:45:34,820 This is something we haven't talked about yet.
822 00:45:34,820 --> 00:45:39,880 In all of these communication system design problems, one of 823 00:45:39,880 --> 00:45:44,180 the things to get straight is when you're worrying about 824 00:45:44,180 --> 00:45:48,440 the engineering of something, what parts of the system can 825 00:45:48,440 --> 00:45:53,540 you control and what parts of it can't you control. 826 00:45:53,540 --> 00:45:57,040 Even more when you get into layering -- what parts can you 827 00:45:57,040 --> 00:45:59,920 control and what parts can other people control. 828 00:45:59,920 --> 00:46:02,750 Namely, if I have some source and I'm trying to transmit it 829 00:46:02,750 --> 00:46:07,330 over some physical channel and I have two engineers -- one of 830 00:46:07,330 --> 00:46:12,380 them is a data compressor, and the other is a channel guy. 831 00:46:12,380 --> 00:46:15,830 What the channel guy is trying to do is to find a way of 832 00:46:15,830 --> 00:46:18,570 sending more bits over this channel than 833 00:46:18,570 --> 00:46:20,170 could be done before. 834 00:46:20,170 --> 00:46:24,050 What the data compressor is trying to do is to find a way 835 00:46:24,050 --> 00:46:28,180 to encode the source into fewer bits. 836 00:46:28,180 --> 00:46:31,560 If the two come together, the whole thing works. 837 00:46:31,560 --> 00:46:34,550 If the two don't come together, then you have to 838 00:46:34,550 --> 00:46:37,710 fire the engineers, hire some new engineers or do something 839 00:46:37,710 --> 00:46:39,890 else or you get fired. 840 00:46:39,890 --> 00:46:44,270 So that's the story. 841 00:46:44,270 --> 00:46:46,720 The things you can't change here, you can't change the 842 00:46:46,720 --> 00:46:50,620 structure of the source usually, and you can't change 843 00:46:50,620 --> 00:46:52,700 the structure of the channel.
844 00:46:52,700 --> 00:46:55,700 There's some very interesting new problems coming into 845 00:46:55,700 --> 00:46:59,430 existence these days by information theorists who are 846 00:46:59,430 --> 00:47:02,540 trying to study biological systems. 847 00:47:02,540 --> 00:47:06,720 One of the peculiar things in trying to understand how 848 00:47:06,720 --> 00:47:11,320 biological organisms, which have remarkable systems for 849 00:47:11,320 --> 00:47:15,870 communicating information from one place to another, how they 850 00:47:15,870 --> 00:47:17,380 manage to do it. 851 00:47:17,380 --> 00:47:20,720 One of the things you find is that this separation that we 852 00:47:20,720 --> 00:47:24,820 use in studying electronic communication does not exist 853 00:47:24,820 --> 00:47:26,800 in the body at all. 854 00:47:26,800 --> 00:47:30,020 In other words, what we call a source gets 855 00:47:30,020 --> 00:47:32,440 blended with the channel. 856 00:47:32,440 --> 00:47:35,910 What we call the statistics of the source get very much 857 00:47:35,910 --> 00:47:39,130 blended in with what this organism is trying to do. 858 00:47:39,130 --> 00:47:42,840 If the organism cannot send such refined data, it figures 859 00:47:42,840 --> 00:47:46,680 out ways to get the data that it needs which is essential 860 00:47:46,680 --> 00:47:50,390 for its survival or it dies and some other 861 00:47:50,390 --> 00:47:52,590 organism takes over. 862 00:47:52,590 --> 00:47:56,740 So you find the system which has no constraints in it, and 863 00:47:56,740 --> 00:47:58,820 it evolves with all of these things 864 00:47:58,820 --> 00:48:00,760 changing at the same time. 865 00:48:00,760 --> 00:48:03,650 The channels are changing, the source statistics are 866 00:48:03,650 --> 00:48:07,790 changing, and the way the source data is encoded is 867 00:48:07,790 --> 00:48:12,360 changing, and the way the data is transmitted is changing. 
868 00:48:12,360 --> 00:48:14,940 So it's a whole new generalization of this whole 869 00:48:14,940 --> 00:48:17,210 problem of how do you communicate. 870 00:48:17,210 --> 00:48:20,510 Here though, the problem we're dealing with is these fixed 871 00:48:20,510 --> 00:48:26,210 channels, which you can think of, depending on what your 872 00:48:26,210 --> 00:48:30,670 interests are, you can view the canonical channel as being 873 00:48:30,670 --> 00:48:35,110 a piece of wire in a telephone network, or even view it as 874 00:48:35,110 --> 00:48:41,000 being a cable in a cable TV system, or you can view it as 875 00:48:41,000 --> 00:48:42,670 being a wireless channel. 876 00:48:42,670 --> 00:48:44,760 We will talk about all of these in the course. 877 00:48:44,760 --> 00:48:48,890 The last three weeks of the course is primarily talking 878 00:48:48,890 --> 00:48:52,540 about wireless because that's where the problems are most 879 00:48:52,540 --> 00:48:55,380 interesting. 880 00:48:55,380 --> 00:49:01,910 But in any situation we look at, the channel is fixed. 881 00:49:01,910 --> 00:49:04,500 It's given to us and we can't change it. 882 00:49:04,500 --> 00:49:08,520 All we can do is fiddle around with the channel encoding and 883 00:49:08,520 --> 00:49:10,460 the channel decoding. 884 00:49:10,460 --> 00:49:13,750 The channels that we're most interested in usually send 885 00:49:13,750 --> 00:49:15,510 wave forms. 886 00:49:15,510 --> 00:49:17,530 That brings up another interesting question. 887 00:49:17,530 --> 00:49:21,140 We call this digital communication, and what I've 888 00:49:21,140 --> 00:49:24,230 been telling you is that the sources that we're interested 889 00:49:24,230 --> 00:49:28,890 in are most often wave form sources. 890 00:49:28,890 --> 00:49:31,600 The channels that we're interested in are most often 891 00:49:31,600 --> 00:49:34,870 channels that send wave forms, receive wave forms and add 892 00:49:34,870 --> 00:49:36,920 noise, which is wave forms. 
893 00:49:36,920 --> 00:49:38,910 Why do we call it digital communication? 894 00:49:42,450 --> 00:49:45,760 Anybody have a clue as to why we might call all of this 895 00:49:45,760 --> 00:49:48,700 digital communication? 896 00:49:48,700 --> 00:49:48,950 Yeah? 897 00:49:48,950 --> 00:49:50,200 AUDIENCE: [INAUDIBLE] 898 00:49:52,590 --> 00:49:55,660 PROFESSOR: Because the analog gets represented in binary 899 00:49:55,660 --> 00:50:01,040 digits, because we have chosen essentially, because of having 900 00:50:01,040 --> 00:50:05,180 studied Shannon at some point, to turn all the analog sources 901 00:50:05,180 --> 00:50:08,070 into binary bit streams. 902 00:50:08,070 --> 00:50:10,900 Namely, when you talk about digital communication, what 903 00:50:10,900 --> 00:50:14,220 you're really doing is saying I have decided there will be a 904 00:50:14,220 --> 00:50:18,800 digital interface between source and channel. 905 00:50:18,800 --> 00:50:20,820 That's what digital communication is. 906 00:50:20,820 --> 00:50:23,370 That's what most communication today is. 907 00:50:23,370 --> 00:50:26,050 There's very little communication today that 908 00:50:26,050 --> 00:50:29,750 starts out with an analog wave form and finds a way to 909 00:50:29,750 --> 00:50:36,130 transmit it without first going into a binary sequence. 910 00:50:36,130 --> 00:50:39,850 So here's a picture of one of the noises that 911 00:50:39,850 --> 00:50:42,430 we're going to study. 912 00:50:42,430 --> 00:50:47,640 We have an input which is now a wave form. 913 00:50:47,640 --> 00:50:51,700 We have noise, which is a wave form. 914 00:50:51,700 --> 00:50:55,180 And we have an output, which is a wave form. 915 00:50:55,180 --> 00:51:01,280 This input here is going to be created somehow in a process 916 00:51:01,280 --> 00:51:06,380 of modulation from this binary stream that's coming into the 917 00:51:06,380 --> 00:51:07,730 channel encoder. 
918 00:51:07,730 --> 00:51:11,020 So somehow or other we're taking a binary stream, 919 00:51:11,020 --> 00:51:14,820 turning it into a wave form. 920 00:51:14,820 --> 00:51:17,900 Now, think a little bit about what might be essential in 921 00:51:17,900 --> 00:51:19,150 that process. 922 00:51:25,110 --> 00:51:28,530 If you know something about practical engineering systems 923 00:51:28,530 --> 00:51:32,620 for communication, you know that there's an organization 924 00:51:32,620 --> 00:51:37,130 in the U.S. called the FCC, and every other country has 925 00:51:37,130 --> 00:51:43,450 its companion set of initials, which says what part of the 926 00:51:43,450 --> 00:51:46,650 radio spectrum you can use and what part of the radio 927 00:51:46,650 --> 00:51:49,510 spectrum you can't use. 928 00:51:49,510 --> 00:51:51,870 You suddenly realize that somehow it's going to be 929 00:51:51,870 --> 00:51:57,260 important to turn these binary streams into wave forms, which 930 00:51:57,260 --> 00:52:02,360 are more or less compressed into some frequency bands. 931 00:52:02,360 --> 00:52:03,670 That's one of the problems we're going to 932 00:52:03,670 --> 00:52:06,120 have to worry about. 933 00:52:06,120 --> 00:52:11,270 But at some level if I take a whole bunch of different 934 00:52:11,270 --> 00:52:17,330 binary sequences, one thing I can do, for example, I could 935 00:52:17,330 --> 00:52:20,730 take a binary sequence of length 100. 936 00:52:20,730 --> 00:52:24,160 How many such binary sequences are there? 937 00:52:24,160 --> 00:52:28,030 Has length 100, the first bit can be 1 or zero, second bit 938 00:52:28,030 --> 00:52:30,650 can be 1 or zero -- that's four combinations. 939 00:52:30,650 --> 00:52:33,160 Third bit can be 1 or zero also -- that makes eight 940 00:52:33,160 --> 00:52:35,290 combinations of three bits. 941 00:52:35,290 --> 00:52:38,270 There are 2 to the 100th combinations 942 00:52:38,270 --> 00:52:41,430 of 100 binary digits. 
943 00:52:41,430 --> 00:52:45,830 So a theoretician says fine, I map those 2 to the 100th 944 00:52:45,830 --> 00:52:50,120 different combinations of binary digits into evenly 945 00:52:50,120 --> 00:52:55,500 spaced numbers between zero and 1. 946 00:52:55,500 --> 00:52:58,810 Then I will take those evenly spaced numbers between zero 947 00:52:58,810 --> 00:53:03,210 and 1 and I'll modulate some wave form, whatever wave form 948 00:53:03,210 --> 00:53:07,120 I happen to choose, and I will send that wave form. 949 00:53:07,120 --> 00:53:10,700 In the absence of noise, I pick up that wave form and 950 00:53:10,700 --> 00:53:15,010 turn it back into my binary sequence again. 951 00:53:15,010 --> 00:53:16,660 What's the conclusion from this? 952 00:53:19,800 --> 00:53:22,700 The conclusion is if you don't have noise there's no 953 00:53:22,700 --> 00:53:26,120 constraint on how much data you can send. 954 00:53:26,120 --> 00:53:31,250 I can send as much data as I want to, and there's nothing 955 00:53:31,250 --> 00:53:35,450 to stop me from doing it, if I don't have noise. 956 00:53:35,450 --> 00:53:35,690 Yeah. 957 00:53:35,690 --> 00:53:38,666 AUDIENCE: So, is that where something that 958 00:53:38,666 --> 00:53:40,155 [UNINTELLIGIBLE] basically comes in. 959 00:53:40,155 --> 00:53:42,390 Like a difference between one signal and the next signal. 960 00:53:42,390 --> 00:53:45,490 PROFESSOR: Somehow I have to keep these things separate. 961 00:53:45,490 --> 00:53:49,710 I don't necessarily have to separate one binary digit from 962 00:53:49,710 --> 00:53:52,280 the next binary digit in time. 963 00:53:52,280 --> 00:53:56,730 I could, in fact, do something like creating 2 to the 100th 964 00:53:56,730 --> 00:54:03,820 different wave forms or I could do anything in between. 
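The noiseless mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `bits_to_level` and `level_to_bits` are hypothetical, the levels are placed at k / 2^n in the half-open interval from zero up to (but not including) 1, and exact rational arithmetic stands in for the "no noise" condition, since ordinary floating point could not tell 2 to the 100th levels apart.

```python
from fractions import Fraction

def bits_to_level(bits):
    """Map an n-bit string to one of 2**n evenly spaced numbers in [0, 1)."""
    n = len(bits)
    return Fraction(int(bits, 2), 2 ** n)

def level_to_bits(level, n):
    """Invert the map: recover the n-bit string from a noiseless level."""
    return format(int(level * 2 ** n), "0{}b".format(n))

bits = "1011" * 25                      # an arbitrary 100-bit sequence
level = bits_to_level(bits)             # exact, so nothing is lost
assert level_to_bits(level, 100) == bits
print(2 ** 100)                         # how many 100-bit sequences exist
```

The round trip only works because the level is received exactly; any perturbation larger than half the spacing, 2 to the minus 101 here, would change the decoded bits.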
965 00:54:03,820 --> 00:54:08,290 I could take each sequence of eight bits, turn them into 1 966 00:54:08,290 --> 00:54:13,540 of 256 wave forms, transmit one wave form, then a little 967 00:54:13,540 --> 00:54:16,520 bit later transmit another wave form and so forth. 968 00:54:16,520 --> 00:54:19,850 So I can split up the pie in any way I want to, and we'll 969 00:54:19,850 --> 00:54:21,790 talk about that a lot as we go on. 970 00:54:21,790 --> 00:54:24,610 Now, I split them up into different wave forms, which I 971 00:54:24,610 --> 00:54:26,760 send at spaced times. 972 00:54:26,760 --> 00:54:30,210 I somehow have to worry about the fact that if I send them 973 00:54:30,210 --> 00:54:35,750 over a finite bandwidth, there is no way in hell that you can 974 00:54:35,750 --> 00:54:39,790 take finite bandwidth wave forms and send 975 00:54:39,790 --> 00:54:42,270 them in finite time. 976 00:54:42,270 --> 00:54:47,680 Anything which lasts for a finite amount of time spreads 977 00:54:47,680 --> 00:54:49,360 out in bandwidth. 978 00:54:49,360 --> 00:54:52,620 Anything which is in a finite amount of bandwidth spreads 979 00:54:52,620 --> 00:54:53,540 out in time. 980 00:54:53,540 --> 00:54:56,090 That's what you ought to know from 6.003. 981 00:54:56,090 --> 00:54:59,890 So we're dealing with an insoluble problem here. 982 00:54:59,890 --> 00:55:03,580 Well, Nyquist back in 1928 solved that problem. 983 00:55:03,580 --> 00:55:06,330 He said well no, you can't make the wave forms separate, 984 00:55:06,330 --> 00:55:10,000 but you can make the samples separate. 985 00:55:10,000 --> 00:55:12,670 People earlier had figured out they could do that with the 986 00:55:12,670 --> 00:55:15,390 sampling theorem, but the sampling theorem wasn't very 987 00:55:15,390 --> 00:55:19,300 practical, but Nyquist's way of doing it was practical. 988 00:55:19,300 --> 00:55:22,350 So, in fact, you can separate these wave forms. 
989 00:55:22,350 --> 00:55:24,340 But you still have the problem how do you 990 00:55:24,340 --> 00:55:25,590 deal with the noise. 991 00:55:27,710 --> 00:55:31,290 Well, one of the favorite ways of dealing with noise is to 992 00:55:31,290 --> 00:55:35,510 assume that it's something called white Gaussian noise. 993 00:55:35,510 --> 00:55:36,910 What is white Gaussian noise? 994 00:55:36,910 --> 00:55:42,400 White Gaussian noise is noise which no matter where you look 995 00:55:42,400 --> 00:55:43,500 it's sitting there. 996 00:55:43,500 --> 00:55:44,990 You can't get away from it. 997 00:55:44,990 --> 00:55:47,940 You move around to different frequencies, the noise is 998 00:55:47,940 --> 00:55:48,750 still there. 999 00:55:48,750 --> 00:55:51,960 You move around to different times, it's still there. 1000 00:55:51,960 --> 00:55:54,520 It is somehow uniform throughout time 1001 00:55:54,520 --> 00:55:57,640 and throughout frequency. 1002 00:55:57,640 --> 00:56:02,830 Kind of an awkward thing because if I developed a 1003 00:56:02,830 --> 00:56:07,830 receiver which didn't have any bandwidth constraint on it, 1004 00:56:07,830 --> 00:56:10,510 and I looked at this noise coming in, which was spread 1005 00:56:10,510 --> 00:56:14,040 out over all frequencies, the noise would 1006 00:56:14,040 --> 00:56:16,190 burn out the receiver. 1007 00:56:16,190 --> 00:56:19,300 In other words, this white Gaussian noise 1008 00:56:19,300 --> 00:56:22,540 has infinite power. 1009 00:56:22,540 --> 00:56:23,890 That shouldn't bother you too much. 1010 00:56:23,890 --> 00:56:26,610 You deal with unit impulses all the time. 1011 00:56:26,610 --> 00:56:30,310 A unit impulse has infinite energy also. 
1012 00:56:30,310 --> 00:56:32,840 You probably never thought about it because most 1013 00:56:32,840 --> 00:56:36,380 undergraduate courses try very hard to conceal that fact from 1014 00:56:36,380 --> 00:56:39,530 you because they like to use impulses for everything, but 1015 00:56:39,530 --> 00:56:44,170 impulses have infinite energy associated with them, and 1016 00:56:44,170 --> 00:56:45,770 that's kind of a nasty thing. 1017 00:56:45,770 --> 00:56:48,900 But anyway, we will deal with the fact that impulses have 1018 00:56:48,900 --> 00:56:51,720 infinite energy, and therefore, not use them to 1019 00:56:51,720 --> 00:56:53,050 transmit data. 1020 00:56:53,050 --> 00:56:56,550 And we will deal with the fact that this noise has infinite 1021 00:56:56,550 --> 00:57:00,930 power, and find out how to deal with that. 1022 00:57:00,930 --> 00:57:04,440 And we will learn how to understand the statistical 1023 00:57:04,440 --> 00:57:06,620 structure of that noise. 1024 00:57:06,620 --> 00:57:10,590 So we'll deal with all of these things when we start 1025 00:57:10,590 --> 00:57:13,190 studying these channels. 1026 00:57:13,190 --> 00:57:16,820 As we do that we will find out that no matter what you do on 1027 00:57:16,820 --> 00:57:21,540 a channel like this, there are some fundamental constraints 1028 00:57:21,540 --> 00:57:24,540 on how much data can be transmitted. 1029 00:57:24,540 --> 00:57:27,190 I can take you through a little bit of a thought 1030 00:57:27,190 --> 00:57:32,230 experiment which will sort of explain that, I think. 1031 00:57:32,230 --> 00:57:36,150 It's partly what we were just starting to say over here. 1032 00:57:36,150 --> 00:57:42,460 A simple-minded way to look at this problem is I have binary 1033 00:57:42,460 --> 00:57:44,960 digits coming in. 1034 00:57:44,960 --> 00:57:51,690 I take these binary digits and I separate them into short 1035 00:57:51,690 --> 00:57:55,390 sequences of m binary digits each. 
1036 00:57:55,390 --> 00:57:58,640 So, I take the first m binary digits, then I take the next m 1037 00:57:58,640 --> 00:58:03,540 binary digits, and the next m binary digits and so forth. 1038 00:58:03,540 --> 00:58:08,240 A sequence of m binary digits, there are 2 to the m 1039 00:58:08,240 --> 00:58:09,960 combinations of this. 1040 00:58:09,960 --> 00:58:15,310 So I map these 2 to the m combinations into one of 2 to 1041 00:58:15,310 --> 00:58:17,950 the m different numbers. 1042 00:58:17,950 --> 00:58:20,190 Now, you look at this noise and you say I have to keep 1043 00:58:20,190 --> 00:58:24,260 some separation between these numbers. 1044 00:58:24,260 --> 00:58:27,230 So, with a separation between the numbers, say the 1045 00:58:27,230 --> 00:58:31,170 separation is 1, let this define the number 1. 1046 00:58:31,170 --> 00:58:35,580 That's something we theoreticians do all the time. 1047 00:58:35,580 --> 00:58:39,020 So I have 2 to the m different levels. 1048 00:58:39,020 --> 00:58:41,890 I have to send this in a certain bandwidth. 1049 00:58:41,890 --> 00:58:45,270 The sampling theorem says I can't send samples at 1050 00:58:45,270 --> 00:58:49,150 more than twice the bandwidth of this system, so I have a 1051 00:58:49,150 --> 00:58:51,750 sample rate of 2w. 1052 00:58:51,750 --> 00:58:59,100 I can send m binary digits in this bandwidth, w. 1053 00:58:59,100 --> 00:59:01,580 Therefore, I can send bits at a certain rate. 1054 00:59:01,580 --> 00:59:04,540 This is all very crude but it's going to lead us to an 1055 00:59:04,540 --> 00:59:07,000 interesting trade-off. 1056 00:59:07,000 --> 00:59:10,910 I can increase m and by increasing m I can get by with 1057 00:59:10,910 --> 00:59:17,210 a smaller bandwidth, because I have these symbols, these m 1058 00:59:17,210 --> 00:59:20,760 bit symbols coming along at a slower rate. 
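To make the 2 to the m levels concrete, here is a rough sketch that builds them with unit spacing and computes their mean-square value, used here as a stand-in for transmitted power. Centering the levels on zero and treating every level as equally likely are assumptions for illustration, and `pam_levels` and `mean_square` are hypothetical names.

```python
def pam_levels(m):
    """2**m levels with spacing 1, centered on zero (centering is an assumption)."""
    count = 2 ** m
    return [k - (count - 1) / 2 for k in range(count)]

def mean_square(levels):
    """Average of the squared levels, assuming each level is equally likely."""
    return sum(x * x for x in levels) / len(levels)

for m in range(1, 7):
    levels = pam_levels(m)
    # Each symbol carries m bits, so the symbol rate is the bit rate divided by m.
    print(m, len(levels), mean_square(levels))
```

Under these assumptions the mean-square value works out to (4^m - 1) / 12, so each extra bit per symbol roughly quadruples it while the symbol rate falls only in proportion to 1/m.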
1059 00:59:20,760 --> 00:59:24,490 So by increasing m, I can decrease the bandwidth, by 1060 00:59:24,490 --> 00:59:28,590 increasing m, I can reduce the bandwidth, but by reducing the 1061 00:59:28,590 --> 00:59:33,990 bandwidth, the power is going up. 1062 00:59:33,990 --> 00:59:38,070 You can see that as I change m, the power is going up 1063 00:59:38,070 --> 00:59:41,700 exponentially with m. 1064 00:59:41,700 --> 00:59:47,590 In other words, the trade-off between bandwidth and power is 1065 00:59:47,590 --> 00:59:50,390 kind of nasty in this simple-minded picture. 1066 00:59:54,320 --> 01:00:01,570 Let me jump ahead. 1067 01:00:06,920 --> 01:00:09,080 For this additive white Gaussian noise channel with a 1068 01:00:09,080 --> 01:00:14,510 bandwidth constraint, Shannon showed that the capacity was w 1069 01:00:14,510 --> 01:00:19,300 times log of 1 plus the power in the system divided by the noise 1070 01:00:19,300 --> 01:00:21,470 power times w. 1071 01:00:21,470 --> 01:00:25,410 Now, I don't expect you to understand this formula now, 1072 01:00:25,410 --> 01:00:28,520 and there's a lot of very tricky and somewhat confusing 1073 01:00:28,520 --> 01:00:30,410 things about it. 1074 01:00:30,410 --> 01:00:34,320 One thing is that the power in the noise is going up with 1075 01:00:34,320 --> 01:00:35,910 the bandwidth that you're looking at. 1076 01:00:35,910 --> 01:00:39,170 Because I said the noise is everywhere, you can't avoid 1077 01:00:39,170 --> 01:00:44,710 it, and therefore, if you use a wider bandwidth system, you 1078 01:00:44,710 --> 01:00:46,870 get more noise coming into it. 1079 01:00:46,870 --> 01:00:48,770 If you have a certain amount of power that you're willing 1080 01:00:48,770 --> 01:00:53,860 to transmit and a certain amount of noise, then the 1081 01:00:53,860 --> 01:00:56,060 number of bits per second you can transmit is 1082 01:00:56,060 --> 01:00:59,200 going up with w. 
1083 01:00:59,200 --> 01:01:03,690 But look at this effect with p here, the effect with p is 1084 01:01:03,690 --> 01:01:07,520 logarithmic, which says as I increase this parameter m and 1085 01:01:07,520 --> 01:01:12,770 I put more and more bits into one symbol, this quantity is 1086 01:01:12,770 --> 01:01:17,430 going up exponentially, which means the logarithm of this is 1087 01:01:17,430 --> 01:01:23,760 going up just linearly with m. 1088 01:01:23,760 --> 01:01:27,230 This gives you, if you look at it carefully, the same 1089 01:01:27,230 --> 01:01:32,380 trade-off as the simple-minded example I was talking about. 1090 01:01:32,380 --> 01:01:34,530 That simple-minded example does not explain 1091 01:01:34,530 --> 01:01:35,770 this formula at all. 1092 01:01:35,770 --> 01:01:38,370 This formula's rather deep. 1093 01:01:38,370 --> 01:01:39,760 We probably won't even completely 1094 01:01:39,760 --> 01:01:41,990 have proven this term. 1095 01:01:41,990 --> 01:01:46,420 What Shannon's result says is that no matter what you do, no 1096 01:01:46,420 --> 01:01:50,160 matter how clever you are with this noise that exists 1097 01:01:50,160 --> 01:01:55,270 everywhere, the fastest you can transmit with the most 1098 01:01:55,270 --> 01:02:00,640 sophisticated coding schemes you can think of is this rate 1099 01:02:00,640 --> 01:02:04,380 right here, which is what you would think it might be just 1100 01:02:04,380 --> 01:02:09,540 from the scaling in terms of w and m, where when you increase 1101 01:02:09,540 --> 01:02:11,960 m the power goes up exponentially. 1102 01:02:11,960 --> 01:02:15,420 It's the same kind of answer you get with that simple 1103 01:02:15,420 --> 01:02:17,010 thought experiment. 1104 01:02:17,010 --> 01:02:19,820 The fact that you have a 1 in here is a little bit strange, 1105 01:02:19,820 --> 01:02:23,070 and we'll find out where that comes from. 
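Shannon's capacity for the band-limited additive white Gaussian noise channel is usually written C = W log2(1 + P / (N0 W)). The sketch below evaluates it numerically. The symbol N0 for noise power per unit bandwidth, the base-2 logarithm (which gives bits per second), the function name `awgn_capacity`, and the sample numbers are all assumptions added for illustration.

```python
import math

def awgn_capacity(W, P, N0):
    """Capacity in bits/second: W * log2(1 + P / (N0 * W)).

    W  : bandwidth in Hz
    P  : signal power
    N0 : noise power per Hz, so N0 * W is the noise power inside the band
    """
    return W * math.log2(1 + P / (N0 * W))

W, N0 = 3000.0, 1.0
for P in (1e5, 2e5, 4e5, 8e5):
    # Doubling the power adds only a fixed increment to the capacity.
    print(P, awgn_capacity(W, P, N0))
```

At high signal-to-noise ratio each doubling of P adds a little under W bits per second, which is the logarithmic dependence on power described above.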
1106 01:02:23,070 --> 01:02:26,610 The idea here is that this noise is 1107 01:02:26,610 --> 01:02:28,040 something you can't beat. 1108 01:02:28,040 --> 01:02:29,480 The noise is fundamental. 1109 01:02:29,480 --> 01:02:33,330 It's a fundamental part of every communication system. 1110 01:02:33,330 --> 01:02:38,130 As I always like to say, noise is like death and taxes, you 1111 01:02:38,130 --> 01:02:41,960 can't avoid them, they're there, and there's nothing you 1112 01:02:41,960 --> 01:02:43,160 can do about them. 1113 01:02:43,160 --> 01:02:46,220 Like death and taxes, you can take care of yourself and live 1114 01:02:46,220 --> 01:02:51,750 longer, and with taxes, well, you can become wealthy and 1115 01:02:51,750 --> 01:02:55,030 support the government and make all sorts of payments and 1116 01:02:55,030 --> 01:02:57,680 reduce your taxes that way. 1117 01:02:57,680 --> 01:03:00,180 Or you can hire a good accountant. 1118 01:03:00,180 --> 01:03:02,380 So you can do things there also. 1119 01:03:02,380 --> 01:03:05,860 But those things are still always there. 1120 01:03:05,860 --> 01:03:08,960 So you're always stuck with this. 1121 01:03:08,960 --> 01:03:12,650 One of the major parts of the course will be to understand 1122 01:03:12,650 --> 01:03:16,600 what this noise is in a much deeper context 1123 01:03:16,600 --> 01:03:19,020 than we would otherwise. 1124 01:03:19,020 --> 01:03:21,880 So, I'll save that and come back to it later. 1125 01:03:25,660 --> 01:03:33,085 We have the same kind of layering in channel encoding. 1126 01:03:33,085 --> 01:03:36,850 When I talk about channel encoding, I'm not talking 1127 01:03:36,850 --> 01:03:40,510 about any kind of complicated coding technique. 1128 01:03:40,510 --> 01:03:44,270 6.451 talks about all of those. 
1129 01:03:44,270 --> 01:03:47,380 What I'm talking about is simply the question of how do 1130 01:03:47,380 --> 01:03:51,040 you take a sequence of bits, turn it into a sequence of 1131 01:03:51,040 --> 01:03:56,170 wave forms in such a way that at the receiver we can take 1132 01:03:56,170 --> 01:03:59,890 that sequence of wave forms and map them reliably back 1133 01:03:59,890 --> 01:04:02,170 into the original bits. 1134 01:04:02,170 --> 01:04:05,850 There's a standard way of doing this, which is to take 1135 01:04:05,850 --> 01:04:08,620 the sequence of bits, first send them 1136 01:04:08,620 --> 01:04:10,690 through a discrete encoder. 1137 01:04:10,690 --> 01:04:14,170 What the discrete encoder will do is 1138 01:04:14,170 --> 01:04:17,895 something like this. 1139 01:04:17,895 --> 01:04:22,080 It maps bits at one rate into bits at a higher rate, 1140 01:04:22,080 --> 01:04:24,250 and this lets you correct channel errors. 1141 01:04:24,250 --> 01:04:28,530 A simple example of this is you map zero into zero, 1142 01:04:28,530 --> 01:04:29,720 zero, zero. 1143 01:04:29,720 --> 01:04:32,380 1 into 1, 1, 1. 1144 01:04:32,380 --> 01:04:37,670 If the rest of your system, namely, if this part of the 1145 01:04:37,670 --> 01:04:43,280 system makes a single error at some point, this part of the 1146 01:04:43,280 --> 01:04:47,750 system escapes from it. 1147 01:04:47,750 --> 01:04:51,200 I should have put these on one slide but I couldn't. 1148 01:04:51,200 --> 01:04:56,060 This takes a single zero, turns it into three zeros. 1149 01:04:56,060 --> 01:05:00,590 These three zeros go into this modulator, which maps this 1150 01:05:00,590 --> 01:05:05,330 into some wave form responsive to these three zeros. 1151 01:05:05,330 --> 01:05:08,140 Wave form goes through here, noise gets added, the 1152 01:05:08,140 --> 01:05:12,930 demodulator detects those binary digits, comes out with 1153 01:05:12,930 --> 01:05:15,840 some approximation to these three zeros. 
1154 01:05:15,840 --> 01:05:19,780 Every once in a while because of the noise, one of these 1155 01:05:19,780 --> 01:05:24,100 bits comes out wrong, and because they come out wrong 1156 01:05:24,100 --> 01:05:29,910 this discrete decoder says ah-ha, I know that what was 1157 01:05:29,910 --> 01:05:34,430 put in was either zero, zero, zero or 1, 1, 1. 1158 01:05:34,430 --> 01:05:38,120 It's more likely to have one error than to have two errors, 1159 01:05:38,120 --> 01:05:40,780 and therefore, any time one error occurs I 1160 01:05:40,780 --> 01:05:43,760 can decode it correctly. 1161 01:05:43,760 --> 01:05:49,000 Now, that is a miserable code. 1162 01:05:49,000 --> 01:05:54,100 A guy by the name of Hamming became very famous for 1163 01:05:54,100 --> 01:05:58,300 generalizing this just a little bit into another single 1164 01:05:58,300 --> 01:06:02,230 error correcting code, which came through at a slightly 1165 01:06:02,230 --> 01:06:05,740 higher rate but still corrected single errors. 1166 01:06:05,740 --> 01:06:09,290 He became inordinately famous because he was a tireless 1167 01:06:09,290 --> 01:06:14,220 self-promoter and people think that he was the person who 1168 01:06:14,220 --> 01:06:17,290 invented error correction coding. 1169 01:06:17,290 --> 01:06:19,040 He really wasn't. 1170 01:06:19,040 --> 01:06:21,770 And his heirs will probably -- 1171 01:06:21,770 --> 01:06:24,210 I don't know. 1172 01:06:24,210 --> 01:06:32,270 Anyway, coding of this type, namely mapping binary digits 1173 01:06:32,270 --> 01:06:36,300 into longer strings of binary digits in such a way that you 1174 01:06:36,300 --> 01:06:42,450 can correct binary errors, has always been a major part of 1175 01:06:42,450 --> 01:06:45,580 communication system research. 1176 01:06:45,580 --> 01:06:49,790 A large number of communication systems work in 1177 01:06:49,790 --> 01:06:51,630 exactly this way. 
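The three-fold repetition code described above, zero into zero, zero, zero and 1 into 1, 1, 1, with the decoder picking whichever codeword needs fewer bit flips, can be sketched as follows. The function names and the example message are illustrative, not from the lecture.

```python
def encode(bits):
    """Repeat every bit three times: 0 -> 0,0,0 and 1 -> 1,1,1."""
    return [b for b in bits for _ in range(3)]

def decode(received):
    """Majority vote over each block of three received bits."""
    out = []
    for i in range(0, len(received), 3):
        block = received[i:i + 3]
        out.append(1 if sum(block) >= 2 else 0)
    return out

msg = [0, 1, 1, 0]
sent = encode(msg)
corrupted = sent[:]
corrupted[1] ^= 1    # flip one bit in the first block
corrupted[9] ^= 1    # flip one bit in the fourth block
assert decode(corrupted) == msg   # one error per block is corrected
```

The price is rate: three channel bits are spent for every source bit, and two errors in the same block still decode to the wrong bit.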
1178 01:06:51,630 --> 01:06:55,750 In fact, the older systems which use coding tended to 1179 01:06:55,750 --> 01:06:57,220 work this way. 1180 01:06:57,220 --> 01:07:00,960 This was a very popular way of trying to design systems. 1181 01:07:00,960 --> 01:07:04,940 You would put a total layering across here. 1182 01:07:04,940 --> 01:07:08,040 Namely, this part of the system didn't know that there 1183 01:07:08,040 --> 01:07:11,540 was any coding going on here. 1184 01:07:11,540 --> 01:07:15,490 At this point where you're doing discrete decoding, you 1185 01:07:15,490 --> 01:07:18,675 don't know anything about what the detector is doing here. 1186 01:07:18,675 --> 01:07:22,100 It doesn't take a lot of imagination to say if you have 1187 01:07:22,100 --> 01:07:26,900 a detector here and there's this symbol that comes through 1188 01:07:26,900 --> 01:07:30,830 here, noise gets added to it, you're trying to decode this 1189 01:07:30,830 --> 01:07:33,890 symbol, and there's this Gaussian noise, which is just 1190 01:07:33,890 --> 01:07:37,810 sort of spread around, it's concentrated. 1191 01:07:37,810 --> 01:07:43,960 Usually when errors occur, errors occur just barely. 1192 01:07:43,960 --> 01:07:47,990 What you see at this point is you almost flip a coin to 1193 01:07:47,990 --> 01:07:52,870 decide whether a zero or a 1 was transmitted at this point. 1194 01:07:52,870 --> 01:07:57,470 If this poor guy had that information, this poor guy 1195 01:07:57,470 --> 01:08:00,030 could work much, much better. 
1196 01:08:00,030 --> 01:08:02,440 Namely, this guy could correct double errors instead of 1197 01:08:02,440 --> 01:08:05,500 single errors, because if one bit came through with a clear 1198 01:08:05,500 --> 01:08:09,050 indication that that's what it was, and the other two bits 1199 01:08:09,050 --> 01:08:11,930 came through saying well, I can't really tell what these 1200 01:08:11,930 --> 01:08:15,380 are, then this guy could correct two errors 1201 01:08:15,380 --> 01:08:17,260 instead of one error. 1202 01:08:17,260 --> 01:08:21,980 So this is a good example where this layering in here is 1203 01:08:21,980 --> 01:08:24,680 really a rotten idea. 1204 01:08:24,680 --> 01:08:29,300 Most modern systems don't do that strict layering. 1205 01:08:29,300 --> 01:08:34,560 But a whole lot of modern systems do, in fact, do this 1206 01:08:34,560 --> 01:08:39,080 binary encoding here and they use the principles that came 1207 01:08:39,080 --> 01:08:44,210 out of the binary encoding from a long time ago. 1208 01:08:44,210 --> 01:08:47,320 The only extra thing they do here is they use this extra 1209 01:08:47,320 --> 01:08:50,180 information from the demodulator. 1210 01:08:50,180 --> 01:08:52,400 This is something called soft decoding 1211 01:08:52,400 --> 01:08:54,040 instead of hard decoding. 1212 01:08:54,040 --> 01:08:57,270 In other words, what comes out of the demodulator is soft 1213 01:08:57,270 --> 01:09:01,820 decisions, and soft decisions say well, I think it was this 1214 01:09:01,820 --> 01:09:06,900 but I'm not sure or I am sure and so forth. 1215 01:09:06,900 --> 01:09:09,160 There's a whole range of gradations. 1216 01:09:09,160 --> 01:09:12,470 We will study detection theory and we'll understand how that 1217 01:09:12,470 --> 01:09:16,120 works and we'll understand how you use those soft decisions 1218 01:09:16,120 --> 01:09:22,200 and all of that stuff and it's kind of neat. 
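A small numerical sketch of the difference between hard and soft decoding for the three-fold repetition code follows. It assumes an antipodal mapping (bit 0 sent as +1, bit 1 sent as -1) and additive noise on each sample; that mapping, the function names, and the received values are illustrative assumptions rather than anything specified in the lecture.

```python
def hard_decode(samples):
    """Threshold each sample to a bit first, then take a majority vote."""
    bits = [0 if x > 0 else 1 for x in samples]
    return 1 if sum(bits) >= 2 else 0

def soft_decode(samples):
    """Add up the raw samples and make a single decision at the end."""
    return 0 if sum(samples) > 0 else 1

# Bit 0 was sent as (+1, +1, +1). One sample arrives clearly positive;
# the other two are near zero and land just on the wrong side.
received = [0.9, -0.1, -0.2]
print(hard_decode(received))   # two "barely" errors outvote the clear sample
print(soft_decode(received))   # the clear sample dominates the sum
```

With these numbers the hard decoder sees two bit errors and outputs 1, while the soft decoder still recovers 0, which is the double-error case described above.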
1219 01:09:22,200 --> 01:09:27,120 The modulation part of this is what maps bit sequences into 1220 01:09:27,120 --> 01:09:28,850 wave forms. 1221 01:09:28,850 --> 01:09:32,660 When we map bit sequences into wave form, there's this very 1222 01:09:32,660 --> 01:09:33,800 simple idea. 1223 01:09:33,800 --> 01:09:38,940 If I take a single bit that's either zero or 1, I map a zero 1224 01:09:38,940 --> 01:09:44,120 into one wave form, I map a 1 into another wave form, I send 1225 01:09:44,120 --> 01:09:47,920 either wave form a or wave form b. 1226 01:09:47,920 --> 01:09:50,820 I somehow want to make those wave forms as different from 1227 01:09:50,820 --> 01:09:53,850 each other as I can. 1228 01:09:53,850 --> 01:09:58,370 Now, what do you mean by making wave forms different? 1229 01:09:58,370 --> 01:10:01,830 I know what it means to make numbers different. 1230 01:10:01,830 --> 01:10:06,425 I know that zero and 3 are more different 1231 01:10:06,425 --> 01:10:08,120 than zero and 2 are. 1232 01:10:08,120 --> 01:10:11,200 Namely, I have a good measure of distance for 1233 01:10:11,200 --> 01:10:13,490 a sequence of numbers. 1234 01:10:13,490 --> 01:10:16,540 One of the things that we have to deal with is how do you 1235 01:10:16,540 --> 01:10:22,680 find an appropriate measure of distance for wave forms. 1236 01:10:22,680 --> 01:10:26,010 When I find the measure of distance for wave forms, what 1237 01:10:26,010 --> 01:10:29,230 do I want to make it responsive to? 1238 01:10:29,230 --> 01:10:32,440 What I'm trying to do is to overcome this noise, and 1239 01:10:32,440 --> 01:10:35,210 therefore, I have to find the measure of distance which is 1240 01:10:35,210 --> 01:10:39,450 appropriate for what the noise is doing to me. 1241 01:10:39,450 --> 01:10:41,930 Well that's why we have to study wave forms as much as we 1242 01:10:41,930 --> 01:10:43,850 do in this course. 
1243 01:10:43,850 --> 01:10:48,090 What we will find is that you can treat wave forms in the 1244 01:10:48,090 --> 01:10:51,290 same way as you can treat sequences of numbers. 1245 01:10:51,290 --> 01:10:54,720 We will find that there's a vector space associated with 1246 01:10:54,720 --> 01:10:59,040 wave forms, which is exactly the same as a vector space 1247 01:10:59,040 --> 01:11:04,130 associated with infinite sequences of numbers. 1248 01:11:04,130 --> 01:11:08,750 It's almost the same as the vector space associated with 1249 01:11:08,750 --> 01:11:12,070 finite sequences of numbers, and that vector space 1250 01:11:12,070 --> 01:11:20,950 associated with a finite sequence of numbers is exactly 1251 01:11:20,950 --> 01:11:24,420 the vector space that you've been looking at all of your 1252 01:11:24,420 --> 01:11:27,510 lives, and which you use primarily as a notational 1253 01:11:27,510 --> 01:11:31,520 convenience to talk about a sequence of numbers. 1254 01:11:31,520 --> 01:11:35,060 What's the distance that you use there? 1255 01:11:35,060 --> 01:11:37,560 Well, you take the difference between each number in the 1256 01:11:37,560 --> 01:11:41,170 sequence, you square it, you add it up and you take the 1257 01:11:41,170 --> 01:11:42,550 square root of it. 1258 01:11:42,550 --> 01:11:44,350 Namely, you look at the energy difference 1259 01:11:44,350 --> 01:11:46,310 between these sequences. 1260 01:11:46,310 --> 01:11:49,260 What we will find remarkably is when we do things right, 1261 01:11:49,260 --> 01:11:53,630 the appropriate distance to talk about on wave forms, is 1262 01:11:53,630 --> 01:11:55,810 exactly what comes from 1263 01:11:55,810 --> 01:11:58,460 looking appropriately at sequences. 1264 01:11:58,460 --> 01:12:01,730 In fact, we'll take wave forms and break them into sequences. 1265 01:12:01,730 --> 01:12:07,530 That's a major part of this modulation process. 
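The distance described above for finite sequences, take the differences, square them, add them up, take the square root, is the ordinary Euclidean distance. A minimal sketch, with a hypothetical function name and made-up sequences:

```python
import math

def distance(u, v):
    """Square root of the summed squared differences of two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(distance([0], [3]))         # zero and 3 ...
print(distance([0], [2]))         # ... are further apart than zero and 2
print(distance([0, 0], [3, 4]))   # the 3-4-5 right triangle
```

The same function applies to two wave forms once each has been represented by a sequence of numbers, which is the step the lecture says will be developed later in the course.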
1266 01:12:07,530 --> 01:12:10,100 You think of that in terms of the sampling theorem. 1267 01:12:10,100 --> 01:12:14,250 Namely, you take a bunch of numbers, you take each number 1268 01:12:14,250 --> 01:12:18,780 in the sampling theorem and you put a little sin x over x 1269 01:12:18,780 --> 01:12:23,790 hat around it and you transmit that, and then you add up all 1270 01:12:23,790 --> 01:12:27,370 of these different samples, which is a sequence of 1271 01:12:27,370 --> 01:12:31,520 numbers, each of them with these little sin x over x caps 1272 01:12:31,520 --> 01:12:32,630 around them. 1273 01:12:32,630 --> 01:12:36,520 The neat thing about the sin x over x caps around them is 1274 01:12:36,520 --> 01:12:38,200 each one goes to zero at the other sample 1275 01:12:38,200 --> 01:12:42,650 points and it all works. 1276 01:12:42,650 --> 01:12:45,570 Therefore, the sequences look almost the 1277 01:12:45,570 --> 01:12:47,500 same as a wave form. 1278 01:12:47,500 --> 01:12:50,800 We'll find out that that works in general, so we 1279 01:12:50,800 --> 01:12:53,190 will do most of that. 1280 01:12:53,190 --> 01:12:56,970 Modern practice often combines all these layers into what's 1281 01:12:56,970 --> 01:12:58,530 called coded modulation. 1282 01:12:58,530 --> 01:13:01,460 In other words, they go further than just this idea of 1283 01:13:01,460 --> 01:13:06,180 soft decisions, and actually treat the problem in general 1284 01:13:06,180 --> 01:13:10,130 of what do you do with bits coming into a sequence, into a 1285 01:13:10,130 --> 01:13:13,490 system, how do you turn those into wave forms that can be 1286 01:13:13,490 --> 01:13:15,690 dealt with in an appropriate way. 1287 01:13:15,690 --> 01:13:19,180 And we'll talk just a little bit about that and mostly 1288 01:13:19,180 --> 01:13:25,520 6.451 talks about that. 
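[Editorial sketch, not part of the lecture.] The sampling-theorem picture described above, a sequence of numbers each carried by a sin x over x pulse, can be tried numerically. This sketch assumes the normalized form sin(pi x)/(pi x) and a sample spacing `T`; the names `sinc`, `reconstruct`, `samples`, and `T` are hypothetical and the sample values are made up. It checks only the property the lecture points to: at each sample time, every pulse except one is zero, so the original number comes back.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the limiting value 1 at x = 0."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, T, t):
    """Wave form value at time t: sample-weighted sinc pulses, the k-th centered at k*T."""
    return sum(s * sinc((t - k * T) / T) for k, s in enumerate(samples))

samples = [0.5, -1.0, 2.0, 0.25]  # hypothetical numbers to transmit
T = 1.0                           # hypothetical spacing between sample times

# At time k*T only the k-th pulse contributes; the others are zero up to rounding.
recovered = [reconstruct(samples, T, k * T) for k in range(len(samples))]

for s, r in zip(samples, recovered):
    print(s, round(r, 9))
```

Between sample times all the pulses overlap, which is why the sum is a continuous wave form and not just a list of points.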
1289 01:13:25,520 --> 01:13:28,070 I want to talk a little bit right at the end, I'm almost 1290 01:13:28,070 --> 01:13:34,920 finished, about computational complexity in these 1291 01:13:34,920 --> 01:13:39,190 communication systems that we're trying to worry about. 1292 01:13:39,190 --> 01:13:43,730 One of the things that you will become curious about as 1293 01:13:43,730 --> 01:13:49,350 we move along is that we are sometimes suggesting doing 1294 01:13:49,350 --> 01:13:53,750 things in ways that look very complex. 1295 01:13:53,750 --> 01:13:57,820 The word complex is an extraordinarily complex word. 1296 01:13:57,820 --> 01:14:00,890 None of us know what it means because it means so many 1297 01:14:00,890 --> 01:14:03,600 different things. 1298 01:14:03,600 --> 01:14:06,610 I was saying that these telephone systems are 1299 01:14:06,610 --> 01:14:08,980 inordinately complex. 1300 01:14:08,980 --> 01:14:13,770 In a sense they aren't because they have an incredible number 1301 01:14:13,770 --> 01:14:16,780 of pieces, all of which are the same. 1302 01:14:16,780 --> 01:14:19,930 So you can understand a telephone system in terms of 1303 01:14:19,930 --> 01:14:23,530 understanding a relatively small number of principles. 1304 01:14:23,530 --> 01:14:26,590 You can understand these Gaussian channels in terms of 1305 01:14:26,590 --> 01:14:30,110 understanding a small number of principles. 1306 01:14:30,110 --> 01:14:32,680 So they're complex if you don't know how to look at 1307 01:14:32,680 --> 01:14:36,090 them, they're simple if you do know how to look at them. 1308 01:14:36,090 --> 01:14:41,260 Here what I'm talking about is how many chips do you need to 1309 01:14:41,260 --> 01:14:43,240 build something? 1310 01:14:43,240 --> 01:14:45,520 How complicated are those chips? 1311 01:14:45,520 --> 01:14:49,870 Fundamentally, how much does it cost to build the system? 
1312 01:14:49,870 --> 01:14:52,920 Well, one of the curious things and one of the things 1313 01:14:52,920 --> 01:14:57,740 that has made information theory so popular in designing 1314 01:14:57,740 --> 01:15:02,130 communication systems is that we're almost at the point 1315 01:15:02,130 --> 01:15:06,060 where chips are free. 1316 01:15:06,060 --> 01:15:08,860 Now, what does that mean? 1317 01:15:08,860 --> 01:15:12,200 It means if you want to do something more complicated it 1318 01:15:12,200 --> 01:15:13,780 hardly costs anything. 1319 01:15:13,780 --> 01:15:15,110 This is sort of Moore's law. 1320 01:15:15,110 --> 01:15:19,400 Moore's law says every year you can put more and more 1321 01:15:19,400 --> 01:15:22,780 stuff on a single chip. 1322 01:15:22,780 --> 01:15:26,540 You can do this cheaply and the chips work faster and 1323 01:15:26,540 --> 01:15:30,150 faster so you can do more and more complicated things very, 1324 01:15:30,150 --> 01:15:31,160 very easily. 1325 01:15:31,160 --> 01:15:33,580 There are some caveats. 1326 01:15:33,580 --> 01:15:39,240 Low cost with high complexity requires large volume. 1327 01:15:39,240 --> 01:15:42,380 In other words, it takes a very long time to design a 1328 01:15:42,380 --> 01:15:47,110 chip, it takes an enormous amount of lead time -- 1329 01:15:47,110 --> 01:15:50,740 I mean we're not going to study these details in the 1330 01:15:50,740 --> 01:15:54,220 course, but I want you to be aware of why it is that in a 1331 01:15:54,220 --> 01:15:57,430 sense complexity doesn't matter anymore. 1332 01:15:57,430 --> 01:16:00,380 If you spend a long enough time designing it and you can 1333 01:16:00,380 --> 01:16:04,130 produce enough of them, you can do extraordinarily complex 1334 01:16:04,130 --> 01:16:06,330 things very, very cheaply. 1335 01:16:06,330 --> 01:16:09,000 I mean you see this with all the cellular 1336 01:16:09,000 --> 01:16:11,030 phones that you buy. 
1337 01:16:11,030 --> 01:16:14,830 I mean cellular phones now are actually give-away devices. 1338 01:16:14,830 --> 01:16:18,010 You look at them to see what's in them and they're 1339 01:16:18,010 --> 01:16:20,370 inordinately complicated. 1340 01:16:20,370 --> 01:16:23,260 You buy a personal computer today and it's 1,000 times 1341 01:16:23,260 --> 01:16:25,770 more powerful than the biggest computers in the 1342 01:16:25,770 --> 01:16:28,590 world 30 years ago. 1343 01:16:28,590 --> 01:16:33,170 You can do everything more cheaply now 1344 01:16:33,170 --> 01:16:34,740 than you could before. 1345 01:16:34,740 --> 01:16:37,560 What's the hitch? 1346 01:16:37,560 --> 01:16:40,970 Well, you see the hitch when you program a computer, 1347 01:16:40,970 --> 01:16:45,670 namely, if you look at word processing languages, yes, 1348 01:16:45,670 --> 01:16:51,520 they become bigger and bigger every year, they become more 1349 01:16:51,520 --> 01:16:54,570 and more complex, they will do more and more things. 1350 01:16:54,570 --> 01:16:58,670 As far as their function of word processing, some of us 1351 01:16:58,670 --> 01:17:01,900 think there have been advances in the last 30 years. 1352 01:17:01,900 --> 01:17:05,870 Others, like myself, feel that there have not been advances 1353 01:17:05,870 --> 01:17:09,690 and, in fact, we're going backwards. 1354 01:17:09,690 --> 01:17:13,360 The problem is we have so much complexity we don't know how 1355 01:17:13,360 --> 01:17:15,410 to deal with it anymore. 1356 01:17:15,410 --> 01:17:18,190 Most of us have become blinking 12's. 1357 01:17:18,190 --> 01:17:19,670 You know what a blinking 12 is. 
1358 01:17:19,670 --> 01:17:22,250 It's all of the electronic equipment you have in your 1359 01:17:22,250 --> 01:17:25,760 home, whenever you don't program it right you see a 12 1360 01:17:25,760 --> 01:17:29,315 blinking on and off, which is where the clock is and the 1361 01:17:29,315 --> 01:17:31,550 clock hasn't been set, and therefore, 1362 01:17:31,550 --> 01:17:33,890 it's blinking at you. 1363 01:17:33,890 --> 01:17:36,690 Most people who have complicated devices, whether 1364 01:17:36,690 --> 01:17:40,620 they're audio devices or what have you, are using less than 1365 01:17:40,620 --> 01:17:43,900 one percent of the fancy features in them. 1366 01:17:43,900 --> 01:17:47,100 You spend hours and hours trying to bring that from one 1367 01:17:47,100 --> 01:17:48,350 percent up to two percent. 1368 01:17:51,230 --> 01:17:57,050 So that the cost of complexity is not in what you can build, 1369 01:17:57,050 --> 01:18:01,180 but it's in this conceptual complexity. 1370 01:18:01,180 --> 01:18:04,500 If you design an algorithm and you understand the algorithm, 1371 01:18:04,500 --> 01:18:08,230 it hardly makes any difference how complicated it is, unless 1372 01:18:08,230 --> 01:18:11,440 the complexity is going up exponentially with something. 1373 01:18:11,440 --> 01:18:13,880 That's the first caveat. 1374 01:18:13,880 --> 01:18:15,680 It's actually the second caveat, too. 1375 01:18:15,680 --> 01:18:19,050 Complex systems are often not well thought through. 1376 01:18:19,050 --> 01:18:22,790 They often don't work or are not robust. 1377 01:18:22,790 --> 01:18:25,620 The non-robustness is the worst part of it. 1378 01:18:25,620 --> 01:18:30,040 The third caveat is when you have special applications. 1379 01:18:30,040 --> 01:18:32,920 Since they involve small numbers, you're not going to 1380 01:18:32,920 --> 01:18:36,660 build a lot of them, it takes a long time to design them. 
1381 01:18:36,660 --> 01:18:39,900 The only way you can build special applications is to 1382 01:18:39,900 --> 01:18:43,780 make them minor modifications of other things that have been 1383 01:18:43,780 --> 01:18:45,920 done before. 1384 01:18:45,920 --> 01:18:48,800 So this question of how complicated the systems we can 1385 01:18:48,800 --> 01:18:50,760 build are, if you can make enough of 1386 01:18:50,760 --> 01:18:52,610 them they become cheap. 1387 01:18:52,610 --> 01:18:56,090 If you have enough lead time they become cheap. 1388 01:18:56,090 --> 01:18:59,300 Which says that this layering of communication systems that 1389 01:18:59,300 --> 01:19:03,130 we're talking about really ultimately makes sense. 1390 01:19:03,130 --> 01:19:05,800 Starting next time, we're going to start talking about 1391 01:19:05,800 --> 01:19:09,090 this first part of source coding, which is the discrete 1392 01:19:09,090 --> 01:19:10,340 part of source coding. 1393 01:19:13,400 --> 01:19:17,150 We will have those notes on the web shortly. 1394 01:19:17,150 --> 01:19:18,870 You can look ahead at them, if you would 1395 01:19:18,870 --> 01:19:21,250 like to, before Monday. 1396 01:19:21,250 --> 01:19:26,190 Please review the probability that you are supposed to know 1397 01:19:26,190 --> 01:19:27,640 because you will need it very shortly. 1398 01:19:27,640 --> 01:19:28,890 Thanks.