SPEAKER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation, or to view additional material from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, I want to review a little bit what we said about detection at the end of last hour -- last hour and a half -- because we were going through it relatively quickly.

Detection is a very funny subject, particularly the way that we do it here, because we go through a bunch of very, very simple steps. Everything looks trivial -- I hope it looks trivial -- as we go. At least after you think about it for a while, and you work with it for a while, you will come back and look at it and you will say, "Yes, in fact that is trivial." Nothing, when you look at it for the first time, is trivial.

And the kind of detection problem that we're interested in, to start out with: we want to look just at binary detection. We're sending a binary signal. It's going to go through some signal encoder, which is the kind of channel encoder we've been thinking about. It's going to go through a baseband modulator, then a baseband to passband modulator. It's going to have white Gaussian noise, or some kind of noise, added to it. It's going to come out the other end. We come back from passband to baseband. We then go through a baseband demodulator. We then sample at that point.

And the point is, when you're all done with all of that, what you've done is you've started out sending either plus a or minus a as a one-dimensional numerical signal. And when you're all through, there's some one-dimensional number that comes out, v. And on the basis of that one-dimensional number, v, you're supposed to guess whether the input was a zero or a one.

Now, one of the things that we're doing right now is we're simplifying the problem, in the sense that we're not looking at a sequence of inputs coming in, and we're not looking at a sequence of outputs coming out.
We're only looking at a single input coming in. In other words, you build this piece of communication equipment, you get it all tuned up, you get it into steady state. You send one bit. You receive something. You try to guess, at the receiver, what was sent. And at that point you tear the whole thing down and you wait a year until you've set it up perfectly again. You send another bit. And we're not going to worry at all about what happens with the sequence; we're only going to worry about this one-shot problem.

You sort of have some kind of clue that if you send a whole sequence of bits in a system like this, and you don't have intersymbol interference, and the noise is white -- so it's sort of independent from time to time -- you're going to get the same answer whether you send the sequence of data or whether you just send a single bit. And we're going to show later that that, in fact, is true. But for the time being we want to understand what's going on, and to understand what's going on we take this simplest possible case, where there's only one bit being transmitted.

It's the question, "Are we going to destroy ourselves in the next five years or not?" And this question is important to most of us, and at the output we find out, in fact, whether we're going to destroy ourselves or not. So it's one bit, but it's one important bit.

OK, why are we doing things this way? I want to tell you a little story about the first time I really talked to Claude Shannon. I was a young member of the faculty at that time, and I was working on a problem which I thought was really a neat problem. It was interesting theoretically. It was important practically. And I thought, "Gee, I finally have something I can go to this great man and talk to him about." So I screwed up my courage for about two days. Finally I saw his door open and him sitting there, so I went in and started to tell him about this problem. He's a very kind person and he listened very patiently. And after about 15 minutes he said, "My god!
I'm just sort of lost with all of this. There's so much stuff going on in this problem. Can't we simplify it a little bit by throwing out this kind of practical constraint you've put on it?" I said, "Yeah, I guess so." So we threw that out, and then we went on for a while longer, and then he said, "My god, I'm still terribly confused about this whole thing. Why don't we simplify it in some other way?"

And this went on for about an hour. As I say, he was a very patient guy. And at the end of an hour I was getting really depressed. Because here was this beautiful problem that I thought was going to make me famous, give me tenure, do all these neat things. And here he'd reduced the thing to a totally trivial toy problem. And we looked at it. And we said, yes, this is a trivial toy problem. This is the answer. The problem is solved. But so what?

And then he suggested putting some of those constraints back in again. And as we started putting the constraints back in, one by one, we saw that each time we put a new constraint in -- since we understood the problem in its simplest form -- putting the constraint in, it was still simple. And by the time we built the whole thing back up again, it was clear what the answer was.

OK, in other words, what theory means is really solving these toy problems. And solving the toy problems first. And in terms of practice, some people think the most practical thing is to be practical. But the whole point of this course -- and this particular subject of detection is a wonderful example of this -- is that the most practical thing is to be theoretical. I mean, you need to add practice to the theory, but the way you do things is you start with a theory -- which means you start with the toy problems, you build up from those toy problems, and after you build up for a while, understanding what the practical problem is also -- you then understand how to deal with the practical problem.
And the practical engineer who doesn't have any of that fundamental knowledge about how to deal with these problems is always submerged in a sea of complexity. Always doing simulations of something that he or she doesn't understand. Always trying to interpret something from it, but with just too many things going on to have any idea of what it really means.

OK, so that's why we're making this trivial assumption here. We're only putting one bit in. We're ignoring what happens all the way through the system. We only get one number out. We're going to assume that this one number here is either plus or minus a, plus a Gaussian noise random variable. And we're not quite sure why it's going to be plus or minus a plus a Gaussian noise random variable, but we're going to assume that for the time being. OK?

So the detector observes the sample value of the random variable for which this is the sample value, and then guesses the value of this random variable, H, which is what we call the input now. Because we view it from the standpoint of the detector -- the detector has two possible hypotheses -- one is that a zero was sent, and the other is that a one was sent. And on the basis of this observation, you take first the hypothesis zero and you say, "Is this a reasonable hypothesis?" Then you look at the hypothesis one and say, "Is this a reasonable hypothesis?" And then you guess whether you think zero is more likely or one is more likely, given this observation that you've had.

So what the detector has, at this point, is a full statistical characterization of the entire problem. Namely, you have a model of the problem. You understand every probability in the universe that might have any effect on this. And what might have any effect on this -- as far as the way we've set up the problem -- is only the question of: what are the probabilities that you're going to send one or the other of these signals here?
And conditional on each of these, what are the probabilities of this random variable appearing at the output? Because you have to base your decision only on this. So all of the probabilities only give you this one simple thing.

Hypothesis testing, decision making, decoding -- all mean the same thing. They mean exactly the same thing. And they're just done by different people.

OK, so what that says is we're assuming the detector uses a known probability model. And in designing the detector, you know what that probability model is. It might not be the right probability model, and one of the things that many people interested in detection study is the question of: when you think the probability model is one thing and it's actually something else, how well does the detection work? It's a little like the Lempel-Ziv algorithm that we studied earlier for doing source coding. Which is, how do you do source coding when you don't know what the probabilities are? And we found the best way to study that, of course, was to first find out how to do source encoding when you did know what the probabilities were. So we're doing the same thing here.

We assume the detector is designed to maximize the probability of guessing correctly. In other words, it's trying to minimize the probability of error. We call that a MAP detector -- maximum a posteriori probability decoding.

You can try to do other things. You can say that there's a cost of one kind of error, and there's another cost of another kind of error. I mean, if you're doing medical testing or something. If you guess wrong in one way, you tell the patient there's nothing wrong with them, the patient goes out, drops dead the next day. And you don't care about that, of course, but you care about the fact that the patient is going to sue the hospital for 100 million dollars and you're going to lose your job because of it. So there's a big cost to guessing wrong in that way.
But for now, we're not going to bother about the costs. One of the things that you'll see when we get all done is that putting in costs doesn't make the problem any harder, really. You really wind up with the same kind of problem.

OK, so H is the random variable that will be detected, and v is the random variable that's going to be observed. The experiment is performed. Some sample value of v is observed, and some sample value of the hypothesis has actually happened. In other words, what has happened is you prepared the whole system. Then at the input end of the whole system -- the input to the channel -- somebody has chosen a one or a zero without the knowledge of the receiver. That one or zero has been sent through this whole system, and the receiver has observed some output, v. So in fact we're now dealing with the sample values of two different things: the sample value of the input, which is h, and the sample value of the output, which is v. And in terms of the sample value of the output, we're trying to guess what the sample value of the input is.

OK, an error then occurs if the detector chooses a particular hypothesis as its guess and that hypothesis is not the one that actually occurred. That guess, then, is a function of what it receives. In other words, after you receive something, what the detector has to do is somehow map what gets received, which is some number, into either zero or one. It's like what a quantizer does. Namely, it maps the whole region into two different sub-regions. Some things are mapped into zero, some things are mapped into one.

This H hat then becomes a random variable, but it is a random variable that is a function of what's received. So we have one random variable, H, which is what actually happened. There's another random variable, H hat, which is what the detector guesses happened. This is an unusual random variable, because it's not determined ahead of time. It's determined only in terms of what you decide your detection rule is going to be.
This is a random variable that you have some control over. These other random variables you have no control over at all. So that's the random variable we're going to choose. And, in fact, what we're going to do is we're going to say what we want to do is this MAP decoding -- maximum a posteriori probability decoding -- where we're trying to minimize the probability of screwing up. And we don't care whether we make an error of one kind or make an error of the other kind.

OK, is that formulation of the problem crystal clear? Anybody have any questions about it? I mean, the easiest way to get screwed up with detection is, at a certain point, to be going through, studying a detection problem, and then you suddenly realize you don't understand what the whole problem is about.

OK, let's assume we do know what the problem is, then. In principle it's simple. Given a particular observed value, what we're going to do is calculate what we call the a posteriori probability -- the probability, given that particular sample value of the observation -- we're going to calculate the probability that what went into the system is a zero, and the probability that what went into the system is a one. OK? This is the probability that j is the sample value of H, conditional on what we observed.

OK, if you can calculate this quantity, it tells you: if I guess that H is equal to j, this in fact tells me the probability that that guess is correct. And if this is the probability that the guess is correct, and I want to maximize my probability of guessing correctly, what do I do? Well, what I do is: my MAP rule is the arg max of this probability. And "arg max" means that instead of trying to maximize this quantity over something, what we're doing is trying to find the value of j which maximizes it. In other words, we calculate this for each value of j, and then we pick the j for which this quantity is largest.
In other words, we maximize this, but we're not interested in the maximum value of it at this point. We're interested in it later, because that's the probability that we're choosing correctly. What we're interested in, for now, is: what is the hypothesis that we're going to guess? OK? So this probability of being correct is going to be this probability for this maximizing j. And when we average over v we get the overall probability of being correct.

There's a theorem which is stated in the notes, which is one of the more trivial theorems you can think of, which says that if you do the best thing for every sample point, you have, in fact, done the best thing on average. I think that's pretty clear. You can read it if you want a formal proof, but if you do the best thing all the time, then it's the overall best thing.

OK, so that's the general idea of detection. And in doing this we have to be able to calculate these probabilities, so that's the only constraint. These are probabilities, which means that this set of hypotheses is discrete. If you have an uncountably infinite number of hypotheses, at that point you're dealing with an estimation problem. Because you don't have any chance in hell of getting the right answer -- exactly the right answer. And therefore you have to have some criterion for how close you are. And that's what's important in estimation. And here what's important is really: do we guess right or don't we guess right? Beyond that, we don't care. There aren't any near misses here. You either get it on the nose or you don't.

OK, so we want to study binary detection now, to start off with. We want to trivialize the problem, because even that problem we just stated is too hard. So we're going to trivialize it in two ways. We're going to assume that there are only two hypotheses -- that it's a binary detection problem. And we're also going to assume that it's Gaussian noise. And that will make it sort of transparent what's happening.
So H takes the values zero or one. And we'll call the probabilities with which it takes those values p0 and p1. These are called a priori probabilities. In other words, these are the probabilities that the hypothesis takes the value zero or one before seeing any observation. And the probabilities after you see the observation are called a posteriori probabilities. In other words: probabilities after the observation, and probabilities before the observation.

Up until about 1950, statisticians used to argue terribly about whether it was valid to assume a priori probabilities. And as you can see by thinking about it a little bit, the problem they were facing was they couldn't separate the problem of choosing a mathematical model and analyzing it from the problem of figuring out whether the model was valid or not. And at that point, people studying in that area had not gotten to the point where they could say, "Well, maybe I ought to analyze the problem for different models, and then after I understand what happens for different models, I then ought to go back, because I'll know what's important to find out in the real problem." But up until that time, there was just fighting among everyone.

Bayes was the person who decided you really ought to assume that there's a model to start with. And he developed most of detection theory at an early time. And people used to think that Bayes was a terrible fraud, because in fact he was using models of the problem rather than nothing. But anyway, that's where we were.

We're also going to assume that, after we get all through with modulation and demodulation -- and we really want to look at a general problem here: there's one discrete random variable, H, and one analog random variable, v, which has a probability density -- what we want to assume is that there's a probability density that we know, which is the probability density of the observation conditional on the hypothesis.
We're assuming that we know this, and we know this -- we call these things likelihoods -- and in most communication problems, anyway, it's far easier to get your hands on these likelihoods than it is to get your hands on the a posteriori probabilities, which are what you're really interested in. So given these likelihoods, we can find the marginal density of the observation. Which is just the weighted sum: the probability that the hypothesis is zero, times the conditional density of the observation given zero, and so forth.

So we're going to assume that those densities exist. We're going to assume that we know them. And then, with a great feat of probability theory, we say the a posteriori probability is equal to the a priori probability times the likelihood, divided by the marginal probability of v. OK?

What was the first thing you learned in probability when you started studying random variables? It was probably this formula, which says when you have a joint density of two random variables, you can write it in two ways. You can either write "density of one, times density of two conditional on one," or equally "density of two, times density of one conditional on two." And then you think about it a little bit and you say, "Aha!" It doesn't matter whether the first one is a density or whether it's a probability. You can deal with it in the same way. And you get this formula, which I hope is not unusual to you.

OK, so our MAP decision rule, then -- our MAP decision rule, remember, is to pick the a posteriori probability which is largest, because that is the probability of being correct. So, in fact, if this probability is bigger than this probability -- this is the a posteriori probability that H is equal to zero, and this is the a posteriori probability that H is equal to one -- we're just going to compare those two and pick the larger. And if this one is larger than this one, we pick our choice equal to zero. And if it's smaller, we pick our choice equal to one. So this is what MAP detection is.
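As a minimal sketch of that comparison -- not from the lecture; the function names and the example numbers are placeholders -- the binary MAP rule might look like this in Python:

```python
import math

def gaussian_density(v, mean, var):
    """Density of a Gaussian with the given mean and variance, at v."""
    return math.exp(-(v - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def map_detect(v, p0, likelihood0, likelihood1):
    """Binary MAP rule: pick the hypothesis with the larger a posteriori
    probability.  p0 is the a priori probability of H = 0, and the
    likelihoods are the conditional densities f(v | H = j).
    Ties go to zero, as in the lecture."""
    p1 = 1.0 - p0
    # Bayes: p(H = j | v) = p_j * f(v | j) / f(v).  The marginal f(v)
    # divides both posteriors, so comparing the numerators is enough.
    post0 = p0 * likelihood0(v)
    post1 = p1 * likelihood1(v)
    return 0 if post0 >= post1 else 1

# Example: 2-PAM with a = 1 and noise variance n0/2 = 0.25.
a, var = 1.0, 0.25
guess = map_detect(0.2, 0.5,
                   lambda v: gaussian_density(v, +a, var),
                   lambda v: gaussian_density(v, -a, var))
```

Because the marginal density cancels out of the comparison, only the ratio of the two likelihoods matters, which is exactly the observation made next.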
Why did I make this one "greater than or equal" and this one "less than"? Well, if we have densities, it usually doesn't make any difference. Strangely enough, it does sometimes make a difference. Because sometimes you can have a density where the densities are the same for both of these likelihoods, and you can find situations where it's important. But when the two probabilities are the same, the probability of being correct is the same in both cases, so it doesn't make any difference what you do when you have equality here. And therefore we've just made a decision. We've said: OK, what we're going to do is, whenever this is equal to this, we're going to choose zero. If you prefer choosing one, be my guest. All of your MAP error probabilities will be exactly the same. Nothing will change. It just is easier to do the same thing all the time.

OK, well, then we look at this formula, and we say, "Well, I can simplify this a little bit." If I take this likelihood and move it over to this side, and if I take this marginal density and move it over to this side, and if I take p0 and move it over to this side, then the marginal densities cancel out. They had nothing to do with the problem. And I wind up with a ratio of the likelihoods. And what do you think the ratio of the likelihoods is called? Somebody got the smart idea of calling that a likelihood ratio. Somehow the people in statistics were much better at generating notation than the people in communication theory, who have done just an abominable job of choosing notation for things. But anyway, they call this a likelihood ratio.

And the rule then becomes: if the likelihood ratio is greater than or equal to the ratio of p1 to p0, we choose zero. And if it's less, we choose one. And we call this ratio the threshold. So in fact what this says is: binary MAP tests are always threshold tests.
And by a threshold test I mean: find the likelihood ratio, compare the likelihood ratio with the threshold -- the threshold, in fact, is this ratio of a priori probabilities -- and at that point you have actually achieved the MAP test. In other words, you have done something which actually, for real, minimizes the probability of error. Maximizes the probability of being correct.

Well, because of that, this thing here, this likelihood ratio, is called a sufficient statistic. And it's called a sufficient statistic because you can do MAP decoding just by knowing this number. OK? In other words, it says you can calculate these likelihoods, you can find the ratio of them -- which is this likelihood ratio -- and after you know the likelihood ratio, you don't have to worry about these likelihoods anymore. This is the only thing relevant to the problem.

Now this doesn't seem to be a huge saving, because here we're dealing with two real numbers -- well, here we've reduced it to one real number -- which is something. When we start dealing with vectors, when we start dealing with waveforms, this is really a big thing. Because what you're doing is reducing the vectors to numbers. And when you reduce a countably infinite dimensional vector to a number, that's a big advantage.

It also, in terms of the communication problems we're facing, breaks up a detector into two pieces in an interesting way. Namely, it says there are things you do with the waveform in order to calculate what this likelihood ratio is, and then after you find the likelihood ratio you just forget about what the waveform was and you deal only with that. What we're going to find out is, in this problem we were looking at here -- we're going to find out later, when we look at the vector problem -- that this thing here is in fact the likelihood ratio if you make an observation out at this point here.
In other words, right at the front end of the receiver, that's where you have all the information you can possibly have. If you calculate likelihood ratios at that point, then to find the likelihood ratio you're going to go through all this stuff right here, and you wind up with something which is proportional to the likelihood ratio right here. OK, so one of the things we're doing right now is we're not looking at that problem. We're only looking at the simpler problem, assuming a one-dimensional problem. But the reason we're looking at it is that later we're going to show that this is, in fact, the solution to the more general problem. Which was Shannon's idea in the first place: you solve the trivial problem first, and then see what the complicated problem is.

OK, so that's what we're trying to do, summarized here for any binary detection problem where the observation is a sample value of a random something. Namely, a random vector, a random process, a random variable, a complex variable, a complex anything. Anything whatsoever, so long as you can assign a probability density to it. You calculate the likelihood ratio, which is this ratio here, so long as you have densities to even talk about. The MAP rule is to compare this likelihood ratio with the threshold eta -- which is just the ratio of the a priori probabilities. If this is greater than or equal to that, you choose zero. Otherwise you choose one.

The MAP rule, as I said before, partitions this observation space into two pieces. Into two segments. And one of those pieces gets mapped into zero, one of the pieces gets mapped into one. It's exactly like a binary quantizer, except the rule you use to choose the quantization regions is different. But a quantizer maps a space into a finite set of regions, and this detection rule does exactly the same thing.
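That general rule is compact enough to write down directly. As a minimal sketch -- algebraically the same test as the posterior comparison sketched earlier, again with placeholder names -- in Python:

```python
def threshold_test(v, p0, p1, likelihood0, likelihood1):
    """Generic binary MAP threshold test.

    Compares the likelihood ratio Lambda(v) = f(v|0) / f(v|1) with the
    threshold eta = p1 / p0; ties go to hypothesis zero, as in the
    lecture.  The observation v can be anything the likelihood
    functions accept (a number, a vector, ...), which is the point
    of the reduction to a single sufficient statistic.
    """
    eta = p1 / p0
    likelihood_ratio = likelihood0(v) / likelihood1(v)
    return 0 if likelihood_ratio >= eta else 1
```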
And since the beginning of information theory, people have been puzzling over how to make use of the correspondence between quantization on the one hand and detection on the other hand. And there are some correspondences, but they aren't all that good most of the time.

OK, so you get an error when the actual hypothesis that occurred -- namely, the bit that got sent -- was i, and the observation landed in the other subset. We know that the MAP rule minimizes the error probability. So you have a rule which you can use for all binary detection problems, so long as you have the densities. And if you don't have a density, you can generalize it without too much trouble.

OK, so we want to look at the problem in Gaussian noise. In particular, we want to look at it for 2-PAM. In other words, for a standard PAM system where zero gets mapped into plus a and one gets mapped into minus a. This is often called antipodal signaling, because you're sending plus something and minus something. They are at opposite ends of the spectrum. You push them as far away as you can -- because as you push them further and further away it requires more and more energy, so you use the energy you have, you get them as far apart as you can, and you hope that's going to help you. And we'll see that it does help you.

OK, so what you receive, then, we'll assume, is either plus or minus a -- depending on which hypothesis occurred -- plus a Gaussian random variable. And here's where the notation of communication theorists rears its ugly head. We call the variance of this random variable n0 over 2. I would prefer to call it sigma squared, but unfortunately you can't fight city hall on something like this. And everybody talks about n0 and n0 over 2, and you've got to get used to it. So here's where we're starting to get used to it. So that's the variance of this noise random variable.

OK, we're only going to send one binary digit, H, so this is the sole problem we have to deal with.
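As a minimal sketch of this observation model -- the names and the use of Python's random module are assumptions of mine, not the lecture's -- one shot through the channel looks like:

```python
import math
import random

def channel(h, a, n0):
    """One-shot 2-PAM observation: v = +a for h = 0, -a for h = 1,
    plus a zero-mean Gaussian noise sample of variance n0/2."""
    signal = a if h == 0 else -a
    noise = random.gauss(0.0, math.sqrt(n0 / 2))  # std dev = sqrt(n0/2)
    return signal + noise
```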
We've made a binary choice. Added one Gaussian random variable to it. You observe the sum, and you guess.

So what are these likelihoods in this case? Well, the likelihood if H is equal to zero -- in other words, if you're sending plus a -- is just a Gaussian density shifted over by a. And if you're sending, on the other hand, a one -- which means you're sending minus a -- you have a Gaussian density shifted over the other way. Let me show you a picture of that. We'll come back to analyze more things about the picture in a little bit, so don't worry about most of the picture at this point.

OK, this is the likelihood -- the probability density of the output given that you sent a zero. Namely, that you sent plus a. So we have a Gaussian density -- this bell-shaped curve -- centered around plus a. If you sent a one, you're sending minus a -- one gets mapped into minus a -- and you have the same bell-shaped curve centered around minus a. If you receive any particular value of v -- namely, suppose you receive this value of v here -- you calculate these two likelihoods. One of them is this. One of them is that. You compare them -- the ratio with the threshold -- and you make your choice.

OK, so let's go back and do it. Do the arithmetic. Here are the two likelihoods. You take the ratio of these two things. When you take the ratio of them, what happens? And this sort of always happens in these Gaussian problems: these terms cancel out. Well, it always happens in these additive Gaussian problems. These terms cancel out. You take a ratio of two exponentials; you just get the difference of the exponents. So the likelihood ratio -- this divided by this -- is then e to the minus (v minus a) squared over n0, divided by e to the minus (v plus a) squared over n0. OK? Because normally the Gaussian density has the exponent divided by two sigma squared, and sigma squared here is n0 over 2, so the 2's cancel out.
One nice thing about the notation, anyway: you get rid of one factor of two in it. Well, so you have this minus this. When you take the difference of these two exponents, the v squareds cancel out. Because one of these things is in the numerator, and the other one was in the denominator. So this term comes through as is. This one -- you're dividing by it -- so when you multiply it out, its sign turns into a plus. So the v squared here cancels out with the v squared here. The a squared here cancels out with the a squared here. And it's only the inner product term that survives this whole thing. And here you have plus 2va; here you have plus 2va. So you wind up with e to the 4av divided by n0. Which is very nice, because what it says is that this likelihood ratio, which is what determines everything in the world, depends on the observation only through a scalar multiple of it. And that's going to simplify things a fair amount. It's why that picture comes out as simply as it does.

OK, so to do a little more of the arithmetic. This is the likelihood ratio here: e to the 4av over n0. So our rule is: you compare this likelihood ratio to the threshold -- which is p1 over p0, which we call eta -- and you look at that for a while and you say, "Gee, this is going to be much easier to deal with if, instead of looking at the likelihood ratio, I look at the log likelihood ratio." And people who deal with Gaussian problems a lot -- you never hear them talk about likelihood ratios; you always hear them talk about log likelihood ratios. And you can find one from the other, so either one is equally good. In other words, the log likelihood ratio is a sufficient statistic, because you can calculate the likelihood ratio from it. So this is a sufficient statistic. It's equal to 4av over n0. And when this is greater than or equal to the log of the threshold, you go this way. When it's less than that, you go the other way. So when you then multiply by n0 over 4a, your decision rule is: you just look at the observation.
You compare it with n0 times log of eta, divided by 4a. And at this point we can go back to this picture and sort of sort out what all of it means. Because this point here is now the threshold. It's n0 times log of eta, divided by 4a. That's what we said the threshold had to be.

So we have these two Gaussian curves now. Why do we have to go back and look at these Gaussian curves? I told you that once we calculated the likelihood ratio we could forget about the curves. So why do I want to put the curves back in? Well, because I want to calculate the probability of error at this point. OK? And it's easier to calculate the probability of error if, in fact, I draw the curve for myself and I look at what's going on.

So here's the threshold. Here's the density when H equals one is the correct hypothesis. The probability of error is the probability that, when I send one, the observation is going to be a random variable with this probability density, and if it's a wild case, and I get an enormous value of noise -- positive noise -- the noise is going to push me over that threshold there, and I'm going to make a mistake. So, in fact, the probability of error -- conditional on sending one -- is just the probability of that little space in there. OK? Which is the probability that I'm going to say that zero occurred when, in fact, one occurred. So that's my probability of error when one occurs.

What's the probability of error when zero occurs? Well, it's the same analysis. When zero occurs -- namely, when the correct hypothesis is zero -- the output, v, follows this probability density here. And I'm going to screw up if the noise carries me beyond this point here. So you can see what the threshold is doing now. I mean, when you choose a threshold which is positive, it makes it much harder to screw up when you send a minus a. It makes it much easier to screw up when you send a plus a.
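It may help to collect the algebra of the last few steps in one place. With noise variance n0/2, the lecture's chain of reductions is:

$$
\Lambda(v) \;=\; \frac{f(v \mid H=0)}{f(v \mid H=1)}
\;=\; \frac{e^{-(v-a)^2/n_0}}{e^{-(v+a)^2/n_0}}
\;=\; e^{4av/n_0},
$$

$$
\ln \Lambda(v) \;=\; \frac{4av}{n_0} \;\ge\; \ln\eta
\quad\Longleftrightarrow\quad
v \;\ge\; \frac{n_0 \ln\eta}{4a},
\qquad \eta = \frac{p_1}{p_0},
$$

with the detector choosing zero when the inequality holds and one otherwise (taking a > 0, so that multiplying by n0 over 4a preserves the direction of the inequality).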
761 00:41:47,750 --> 00:41:50,690 But you see that's what we wanted to do, because the 762 00:41:50,690 --> 00:41:58,230 threshold was positive in this case, because p1 was so much 763 00:41:58,230 --> 00:42:00,300 larger than p0. 764 00:42:00,300 --> 00:42:03,250 And because p1 is so much larger than p0-- 765 00:42:03,250 --> 00:42:06,800 p1 happens almost all the time-- 766 00:42:06,800 --> 00:42:10,320 and therefore you would normally almost choose p1 767 00:42:10,320 --> 00:42:13,860 without looking at v. Which says you want to push it over 768 00:42:13,860 --> 00:42:16,730 that way a little bit. 769 00:42:16,730 --> 00:42:21,630 OK, when you calculate this probability of error, it's the 770 00:42:21,630 --> 00:42:26,650 probability of the tail of a Gaussian random variable. 771 00:42:26,650 --> 00:42:33,790 So you define this tail function, q of x, which is the 772 00:42:33,790 --> 00:42:36,710 complementary distribution function of a 773 00:42:36,710 --> 00:42:38,900 normal random variable. 774 00:42:38,900 --> 00:42:41,710 It's the integral from x to infinity of one over the 775 00:42:41,710 --> 00:42:48,750 square root of 2pi, e to the minus z squared over 2. 776 00:42:48,750 --> 00:42:51,890 I guess this would make better sense if this were a z-- 777 00:42:51,890 --> 00:42:54,510 one and-- 778 00:42:54,510 --> 00:42:55,810 oh, no. 779 00:42:55,810 --> 00:42:57,800 No, I did it right the first time. 780 00:42:57,800 --> 00:43:01,380 That's an x, because x is the limit in there, you see. 781 00:43:01,380 --> 00:43:06,250 So I'm calculating all of the probability density that's off 782 00:43:06,250 --> 00:43:08,550 to the right of x. 783 00:43:08,550 --> 00:43:13,900 And the probability of error when H is equal to one is this 784 00:43:13,900 --> 00:43:17,640 probability-- which looks like it's the tail on the negative 785 00:43:17,640 --> 00:43:20,120 side but if you think about it a little bit, since the 786 00:43:20,120 --> 00:43:24,480 Gaussian curve is symmetric, you can also look at it as a q 787 00:43:24,480 --> 00:43:32,070 function; and when H is equal to zero, 788 00:43:32,070 --> 00:43:37,720 this corresponds to changing this plus to a minus here and 789 00:43:37,720 --> 00:43:40,280 that's the only change. 790 00:43:40,280 --> 00:43:46,080 OK, so this looks a little ugly and it 791 00:43:46,080 --> 00:43:49,410 looks a little strange. 792 00:43:49,410 --> 00:43:52,000 I mean you can sort of interpret 793 00:43:52,000 --> 00:43:55,180 this part of it here-- 794 00:43:55,180 --> 00:43:57,780 I can interpret this part if I'm using 795 00:43:57,780 --> 00:44:00,580 maximum likelihood decoding-- 796 00:44:00,580 --> 00:44:04,390 maximum likelihood is MAP decoding where the threshold 797 00:44:04,390 --> 00:44:05,270 is equal to one. 798 00:44:05,270 --> 00:44:08,360 In other words, it's where you're assuming that the 799 00:44:08,360 --> 00:44:13,340 hypothesis is equally likely to be zero or one-- a priori-- 800 00:44:13,340 --> 00:44:17,330 which is a good assumption almost always in communication 801 00:44:17,330 --> 00:44:22,680 because we work very hard in doing source coding to make 802 00:44:22,680 --> 00:44:27,500 those binary digits equally likely zero or one. 803 00:44:27,500 --> 00:44:29,420 And there are other reasons for choosing maximum 804 00:44:29,420 --> 00:44:30,320 likelihood.
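That tail function is easy to evaluate numerically; a sketch, using the standard identity that q of x equals one half of erfc of x over root two:

    from math import erfc, sqrt

    def q(x):
        # q(x) = integral from x to infinity of (1/sqrt(2*pi)) e^(-z^2/2) dz,
        # which equals erfc(x / sqrt(2)) / 2.
        return 0.5 * erfc(x / sqrt(2.0))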
805 00:44:30,320 --> 00:44:34,020 If you don't know anything about the probabilities it's a 806 00:44:34,020 --> 00:44:37,910 good assumption in a sort of max/min sense. 807 00:44:37,910 --> 00:44:42,130 It sort of limits how much you can screw up by having the 808 00:44:42,130 --> 00:44:47,390 wrong probability, so it's a very robust choice also. 809 00:44:47,390 --> 00:44:52,690 OK, but now this: we're taking the ratio of a with the square 810 00:44:52,690 --> 00:44:54,580 root of n0 over 2. 811 00:44:54,580 --> 00:44:59,110 Well the square root of n0 over 2 is really the standard 812 00:44:59,110 --> 00:45:00,820 deviation of the noise. 813 00:45:00,820 --> 00:45:03,770 So what we're doing is comparing the amount of input 814 00:45:03,770 --> 00:45:10,000 we've put in with the standard deviation of the noise. 815 00:45:10,000 --> 00:45:11,260 Now does that make any sense? 816 00:45:11,260 --> 00:45:13,950 The probability of error depends on that ratio? 817 00:45:13,950 --> 00:45:16,600 Well yeah, it makes a whole lot of sense. 818 00:45:16,600 --> 00:45:19,540 Because if, for example, I wanted to look at this problem 819 00:45:19,540 --> 00:45:22,450 in a different scaling system-- 820 00:45:22,450 --> 00:45:26,440 if this is volts and I want to look at it in millivolts-- 821 00:45:26,440 --> 00:45:32,020 I'm going to multiply a by 1000. 822 00:45:32,020 --> 00:45:35,670 I'm going to multiply the standard deviation of the 823 00:45:35,670 --> 00:45:37,430 noise by 1000. 824 00:45:37,430 --> 00:45:39,660 Because one of the things we always do here-- the way we 825 00:45:39,660 --> 00:45:42,390 choose n0 over 2-- 826 00:45:42,390 --> 00:45:46,210 n0 over 2 is sort of a meaningless quantity. 827 00:45:46,210 --> 00:45:52,990 It's the noise energy in one degree of freedom in the 828 00:45:52,990 --> 00:45:56,920 scaling reference that we're using for the data. 829 00:45:56,920 --> 00:45:57,890 OK? 830 00:45:57,890 --> 00:46:00,190 And that's the only definition you can come up with that 831 00:46:00,190 --> 00:46:01,640 makes any sense. 832 00:46:01,640 --> 00:46:06,420 I mean you scale the data in whatever way you please, and 833 00:46:06,420 --> 00:46:12,580 when we've gone from baseband to passband, we in fact have 834 00:46:12,580 --> 00:46:19,140 multiplied the energy in the input by a factor of two. 835 00:46:19,140 --> 00:46:21,510 And therefore, because of that, we're going to-- 836 00:46:21,510 --> 00:46:26,190 n0 at passband is going to be a square root of 2 bigger than 837 00:46:26,190 --> 00:46:27,870 it is at baseband. 838 00:46:27,870 --> 00:46:31,150 If you don't like that, live with it. 839 00:46:31,150 --> 00:46:33,140 That's the way it is. 840 00:46:33,140 --> 00:46:36,030 Nobody will change n0, no matter who wants 841 00:46:36,030 --> 00:46:37,910 them to change it. 842 00:46:37,910 --> 00:46:40,870 OK, so this term makes sense. 843 00:46:40,870 --> 00:46:48,080 It's the ratio of the signal amplitude to the standard 844 00:46:48,080 --> 00:46:50,130 deviation of the noise. 845 00:46:50,130 --> 00:46:53,720 And that should be the only way that the signal amplitude 846 00:46:53,720 --> 00:46:57,500 or the standard deviation of the noise enters in, because 847 00:46:57,500 --> 00:47:00,250 it's really the ratio that has to be important. 848 00:47:00,250 --> 00:47:04,070 Why this crazy term? 849 00:47:04,070 --> 00:47:09,410 Well if you look at the curve you can sort of see why it is.
850 00:47:09,410 --> 00:47:13,300 The threshold test is comparing the likelihood ratio 851 00:47:13,300 --> 00:47:15,250 of this curve with the likelihood 852 00:47:15,250 --> 00:47:17,310 ratio of this curve. 853 00:47:17,310 --> 00:47:27,150 What's going to happen as a gets very, very large? 854 00:47:27,150 --> 00:47:32,240 You move a out, and the thing that's happening then is this 855 00:47:32,240 --> 00:47:36,460 curve-- which is now coming down in a modest way here-- if 856 00:47:36,460 --> 00:47:39,370 you move a way out here, you're going to have almost 857 00:47:39,370 --> 00:47:40,970 nothing there. 858 00:47:40,970 --> 00:47:44,750 And it's going to be going down very fast. 859 00:47:44,750 --> 00:47:46,690 It's going to be going down very fast 860 00:47:46,690 --> 00:47:50,470 relative to its magnitude. 861 00:47:50,470 --> 00:47:53,960 In other words the bigger a gets, the bigger this 862 00:47:53,960 --> 00:47:57,850 difference is going to be for any given threshold. 863 00:47:57,850 --> 00:48:03,370 And that's why you get a over square root of n0 here. 864 00:48:03,370 --> 00:48:06,130 And here you get exactly the opposite thing. 865 00:48:06,130 --> 00:48:09,400 That's because for a given threshold, as this signal to 866 00:48:09,400 --> 00:48:17,730 noise ratio gets bigger, this threshold term becomes almost 867 00:48:17,730 --> 00:48:20,560 totally unimportant. 868 00:48:20,560 --> 00:48:23,270 I mean you get so much information out of the reading 869 00:48:23,270 --> 00:48:30,470 you're making, because it's so reliable, that having a 870 00:48:30,470 --> 00:48:34,240 threshold is almost completely irrelevant. 871 00:48:34,240 --> 00:48:35,990 And therefore you can sort of forget about it. 872 00:48:35,990 --> 00:48:40,100 If a is very large, this term is zilch. 873 00:48:40,100 --> 00:48:41,350 OK? 874 00:48:43,240 --> 00:48:45,770 So if you want to have reliable communication and you 875 00:48:45,770 --> 00:48:49,000 use a large signal to noise ratio to get it, that's 876 00:48:49,000 --> 00:48:55,320 another reason for forgetting about whether the threshold is 877 00:48:55,320 --> 00:48:58,000 one or something else. 878 00:48:58,000 --> 00:49:02,520 And we would certainly like to deal with problems where the 879 00:49:02,520 --> 00:49:05,790 threshold is equal to one because most people can 880 00:49:05,790 --> 00:49:10,820 remember q of a signal to noise ratio. 881 00:49:10,820 --> 00:49:14,340 I don't know anybody who can remember this formula. 882 00:49:14,340 --> 00:49:18,330 I'm sure there's some people, but I don't think anybody who 883 00:49:18,330 --> 00:49:20,940 works in the communication field ever thinks 884 00:49:20,940 --> 00:49:22,390 about this at all. 885 00:49:22,390 --> 00:49:25,020 Except the first time they derive it and they say, "Oh, 886 00:49:25,020 --> 00:49:29,160 that's very nice." And then they promptly forget about it. 887 00:49:29,160 --> 00:49:31,420 The only reason I think about it more is I 888 00:49:31,420 --> 00:49:33,790 teach the course sometimes. 889 00:49:33,790 --> 00:49:35,780 Otherwise I would forget about it, too. 890 00:49:41,730 --> 00:49:44,200 OK, which is what this says. 891 00:49:44,200 --> 00:49:48,390 For communication we assume p0 is equal to p1. 892 00:49:48,390 --> 00:49:50,920 So we assume that eta is equal to one.
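You can watch the threshold term die off numerically. With sigma squared equal to n0 over 2, the two conditional error probabilities work out to q of a over sigma, minus or plus sigma log eta over 2a; a sketch, with the q function repeated from above:

    from math import erfc, log, sqrt

    def q(x):
        # same q as above
        return 0.5 * erfc(x / sqrt(2.0))

    def conditional_errors(a, n0, eta):
        sigma = sqrt(n0 / 2.0)                # standard deviation of the noise
        shift = sigma * log(eta) / (2.0 * a)  # the threshold's contribution
        pe_given_0 = q(a / sigma - shift)     # sent plus a, fell below threshold
        pe_given_1 = q(a / sigma + shift)     # sent minus a, pushed above it
        return pe_given_0, pe_given_1

    # As a grows, shift shrinks like 1/a while a/sigma grows, so the
    # threshold matters less and less:
    for a in (0.5, 1.0, 2.0, 4.0):
        print(a, conditional_errors(a, n0=1.0, eta=4.0))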
893 00:49:50,920 --> 00:49:54,330 So the probability of error, which is also the probability 894 00:49:54,330 --> 00:49:58,100 of error when H is equal to one-- in other words when a 895 00:49:58,100 --> 00:50:01,230 one actually enters the communication system-- is 896 00:50:01,230 --> 00:50:04,890 equal to the probability of error when H is equal to 0. 897 00:50:04,890 --> 00:50:08,980 In other words these two tails here, when the threshold is 898 00:50:08,980 --> 00:50:12,340 equal to one you set the threshold right there. 899 00:50:12,340 --> 00:50:15,540 The probability of this tail is clearly equal to the 900 00:50:15,540 --> 00:50:16,960 probability of this tail. 901 00:50:16,960 --> 00:50:19,400 Just by symmetry. 902 00:50:19,400 --> 00:50:22,280 So these two error probabilities are the same. 903 00:50:22,280 --> 00:50:26,540 And in fact they are just q of a over the square 904 00:50:26,540 --> 00:50:29,660 root of n0 over 2. 905 00:50:29,660 --> 00:50:31,870 It's nice to put this in terms of energy. 906 00:50:31,870 --> 00:50:34,400 We said before that energy is sort of important in the 907 00:50:34,400 --> 00:50:36,800 communication field. 908 00:50:36,800 --> 00:50:40,330 So we call e sub b the energy per bit that we're 909 00:50:40,330 --> 00:50:42,900 spending to send data. 910 00:50:42,900 --> 00:50:45,020 I mean don't worry about the fact that we're only sending 911 00:50:45,020 --> 00:50:48,150 one bit and then we're tearing the communication system down. 912 00:50:48,150 --> 00:50:51,300 Because pretty soon we're going to send multiple bits. 913 00:50:51,300 --> 00:50:55,830 But the amount of energy we're spending sending this one bit 914 00:50:55,830 --> 00:50:57,970 is a squared. 915 00:50:57,970 --> 00:51:01,270 At least back in this frame of reference that we're looking 916 00:51:01,270 --> 00:51:03,640 at now, where we're just looking at this discrete 917 00:51:03,640 --> 00:51:08,980 signal and a single noise variable. 918 00:51:08,980 --> 00:51:14,010 And n0 over 2 is the noise variance of this particular 919 00:51:14,010 --> 00:51:20,960 random variable, z, so when we write this out in terms of Eb, 920 00:51:20,960 --> 00:51:27,010 which is a squared, it looks like this-- it's 2Eb over n0. 921 00:51:27,010 --> 00:51:31,970 So the probability of error for this binary communication 922 00:51:31,970 --> 00:51:36,210 problem is just q of the square root of 2Eb over n0. 923 00:51:36,210 --> 00:51:39,750 Which is a formula that you want to remember. 924 00:51:39,750 --> 00:51:47,850 It's the error probability for binary detection when n0 over 925 00:51:47,850 --> 00:51:55,520 2 is the noise energy on this one degree of freedom and Eb 926 00:51:55,520 --> 00:51:57,680 is the amount of energy you're spending on this 927 00:51:57,680 --> 00:52:00,660 one degree of freedom. 928 00:52:00,660 --> 00:52:02,960 You will see about 50 variations of 929 00:52:02,960 --> 00:52:06,090 this as we go on. 930 00:52:06,090 --> 00:52:11,220 If you try to remember this fundamental definition, it'll 931 00:52:11,220 --> 00:52:14,210 save you a lot of agony. 932 00:52:14,210 --> 00:52:20,170 Even so, everybody I know who deals with this kind of thing 933 00:52:20,170 --> 00:52:23,440 always screws up the factors of two. 934 00:52:23,440 --> 00:52:27,200 And finally when they get all done, they try to figure out 935 00:52:27,200 --> 00:52:30,020 from common sense or from something else, what the 936 00:52:30,020 --> 00:52:32,660 factors of two ought to be.
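Since the factors of two are exactly what get screwed up, it's worth having the formula once in executable form; a sketch, with the same q as before:

    from math import erfc, sqrt

    def q(x):
        return 0.5 * erfc(x / sqrt(2.0))

    def pe_antipodal(eb, n0):
        # Pr(e) = q(sqrt(2 * Eb / n0)), with Eb = a**2 and noise variance n0/2.
        return q(sqrt(2.0 * eb / n0))

    # For example, pe_antipodal(4.0, 1.0) is q(sqrt(8)), roughly 2.3e-3.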
937 00:52:32,660 --> 00:52:35,850 And they reduce their probability of error to about 938 00:52:35,850 --> 00:52:38,790 a quarter after they're all done with doing all of that. 939 00:52:42,870 --> 00:52:46,490 OK, so we spent a lot of time analyzing 940 00:52:46,490 --> 00:52:50,010 binary antipodal signals. 941 00:52:50,010 --> 00:52:53,430 What about the binary non antipodal signals? 942 00:52:53,430 --> 00:52:57,530 This is a beautiful example of Shannon's idea of studying the 943 00:52:57,530 --> 00:52:58,910 simplest cases first. 944 00:53:01,660 --> 00:53:05,500 If you have two signals, one of which is b and one of which 945 00:53:05,500 --> 00:53:14,330 is b prime, and they can be put anywhere on the real line. 946 00:53:14,330 --> 00:53:17,140 And what I've done, because I didn't want to plot this whole 947 00:53:17,140 --> 00:53:21,440 picture again, is I just took the zero out, and I replaced 948 00:53:21,440 --> 00:53:25,650 the zero by the point halfway between these two points which 949 00:53:25,650 --> 00:53:28,530 is b plus b prime over 2. 950 00:53:28,530 --> 00:53:33,040 And then we look at it and we say what happens if you have 951 00:53:33,040 --> 00:53:38,500 an arbitrary set of two points anywhere on the real line? 952 00:53:38,500 --> 00:53:44,900 Well when I send this point the likelihood, conditional on 953 00:53:44,900 --> 00:53:48,350 this point being sent, is this Gaussian curve 954 00:53:48,350 --> 00:53:50,470 centered on b prime. 955 00:53:50,470 --> 00:53:54,640 When I send b the likelihood is a Gaussian 956 00:53:54,640 --> 00:53:57,310 curve centered on b. 957 00:53:57,310 --> 00:54:02,210 And it is in fact the same curve that we drew before. 958 00:54:02,210 --> 00:54:06,870 If in fact I replaced zero with the center point between 959 00:54:06,870 --> 00:54:10,930 these two, which is b plus b prime over 2. 960 00:54:10,930 --> 00:54:28,500 And if I then define a as this distance in here, the 961 00:54:28,500 --> 00:54:32,140 probability of error that I calculated before is the same 962 00:54:32,140 --> 00:54:34,280 as it was before. 963 00:54:34,280 --> 00:54:40,050 Now I would suggest to all of you that you try to find the 964 00:54:40,050 --> 00:54:46,700 probability of error for this system here, without using 965 00:54:46,700 --> 00:54:49,910 what we've already done, and just writing out the 966 00:54:49,910 --> 00:54:53,160 likelihood ratios as a function of an arbitrary b 967 00:54:53,160 --> 00:54:58,370 prime and an arbitrary b and finding a likelihood ratio, 968 00:54:58,370 --> 00:55:00,640 and calculating through all of that. 969 00:55:00,640 --> 00:55:04,760 And most of you are capable of calculating through all of it. 970 00:55:04,760 --> 00:55:07,930 But when you do so, you will get a god-awful looking 971 00:55:07,930 --> 00:55:12,530 formula, which just looks totally messy. 972 00:55:12,530 --> 00:55:17,360 And by looking at the formula you are not going to be able 973 00:55:17,360 --> 00:55:20,540 to realize that what's going on is what we see here from 974 00:55:20,540 --> 00:55:22,670 the picture. 975 00:55:22,670 --> 00:55:25,230 And the only reason we figured out what was going on in the 976 00:55:25,230 --> 00:55:27,610 picture is we already solved the problem 977 00:55:27,610 --> 00:55:29,370 for the simplest case. 978 00:55:29,370 --> 00:55:35,400 OK so it never makes any sense in this problem to look at 979 00:55:35,400 --> 00:55:37,120 this general case.
980 00:55:37,120 --> 00:55:40,300 You always want to say the general case is just a special 981 00:55:40,300 --> 00:55:43,360 case of a special case. 982 00:55:43,360 --> 00:55:48,110 Where you just have to define things slightly differently. 983 00:55:48,110 --> 00:55:51,650 OK, so this center point, then-- I mean it 984 00:55:51,650 --> 00:55:54,280 might be a pilot tone. 985 00:55:54,280 --> 00:55:58,600 It might be any other non-information-bearing signal. 986 00:55:58,600 --> 00:56:00,900 In other words we're sending the one bit. 987 00:56:00,900 --> 00:56:03,470 Sometimes for some reason or other, you need to get 988 00:56:03,470 --> 00:56:04,470 synchronization. 989 00:56:04,470 --> 00:56:07,470 You need to get other things in a communication system. 990 00:56:07,470 --> 00:56:09,500 And for that reason, you send other things. 991 00:56:09,500 --> 00:56:12,260 We'll talk about a lot of those later. 992 00:56:12,260 --> 00:56:15,500 But they don't change the error probability at all. 993 00:56:15,500 --> 00:56:19,370 The error probability is determined solely by the 994 00:56:19,370 --> 00:56:21,910 distance between these two points which we call 2a. 995 00:56:25,820 --> 00:56:28,660 So probability of error remains the same in terms of 996 00:56:28,660 --> 00:56:30,310 this distance. 997 00:56:30,310 --> 00:56:33,020 The energy per bit now changes. 998 00:56:33,020 --> 00:56:36,080 The energy per bit is the energy here 999 00:56:36,080 --> 00:56:37,630 plus the energy here. 1000 00:56:37,630 --> 00:56:40,270 Which in fact is the energy in the center 1001 00:56:40,270 --> 00:56:43,990 point plus a squared. 1002 00:56:43,990 --> 00:56:46,390 I mean we've done that in a number of contexts, the way to 1003 00:56:46,390 --> 00:56:52,990 find the energy in a binary random variable is to take the 1004 00:56:52,990 --> 00:56:56,650 energy in the center point plus the energy in the 1005 00:56:56,650 --> 00:56:58,070 difference. 1006 00:56:58,070 --> 00:57:02,040 It's the same as finding the energy in the fluctuation plus the 1007 00:57:02,040 --> 00:57:02,900 square of the mean. 1008 00:57:02,900 --> 00:57:05,530 It's that same underlying idea. 1009 00:57:05,530 --> 00:57:10,280 So any time you use non antipodal signals and you shift things 1010 00:57:10,280 --> 00:57:13,840 off the mean, you can see what's going 1011 00:57:13,840 --> 00:57:15,840 on very, very easily. 1012 00:57:15,840 --> 00:57:17,580 You waste energy. 1013 00:57:17,580 --> 00:57:21,170 I mean it might not be wasted, you might have to waste it for 1014 00:57:21,170 --> 00:57:22,500 some reason. 1015 00:57:22,500 --> 00:57:25,360 But as far as communication is concerned you're simply 1016 00:57:25,360 --> 00:57:26,830 wasting it. 1017 00:57:26,830 --> 00:57:30,960 So your energy per bit changes, but your probability 1018 00:57:30,960 --> 00:57:33,310 of error remains the same. 1019 00:57:33,310 --> 00:57:38,450 Because of that, you get a very clear cut idea of what 1020 00:57:38,450 --> 00:57:42,040 it's costing you to send that pilot tone. 1021 00:57:42,040 --> 00:57:45,990 Because in fact what you've done is to just increase this 1022 00:57:45,990 --> 00:57:50,480 energy, which we talk about in terms of db. 1023 00:57:50,480 --> 00:57:54,340 If c is equal to a in this case which, as we'll see, is a 1024 00:57:54,340 --> 00:57:57,910 common thing that happens in a lot of systems, what you've 1025 00:57:57,910 --> 00:57:59,590 lost is a factor of three db.
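That three db figure can be checked directly; a short sketch, with q as defined earlier, where b and b prime are the two signal points:

    from math import erfc, log10, sqrt

    def q(x):
        return 0.5 * erfc(x / sqrt(2.0))

    def non_antipodal(b, b_prime, n0):
        c = (b + b_prime) / 2.0            # center point, carries no information
        a = abs(b - b_prime) / 2.0         # half the distance between the points
        pe = q(a / sqrt(n0 / 2.0))         # same error probability as antipodal
        eb = c**2 + a**2                   # center energy plus a squared
        loss_db = 10.0 * log10(eb / a**2)  # energy lost relative to antipodal
        return pe, eb, loss_db

    # With b = 2 and b_prime = 0 we get c = a = 1, eb = 2, and loss_db is
    # 10*log10(2), about three db.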
1026 00:57:59,590 --> 00:58:02,970 Because you're using twice as much energy, which is three db 1027 00:58:02,970 --> 00:58:05,850 more energy than you have to use for the pure 1028 00:58:05,850 --> 00:58:07,590 communication. 1029 00:58:07,590 --> 00:58:12,040 So it's costing you three db to do whatever silly thing you 1030 00:58:12,040 --> 00:58:14,960 want to do for synchronization or something else. 1031 00:58:14,960 --> 00:58:18,560 Which is why people work very hard to try to send signals 1032 00:58:18,560 --> 00:58:20,850 which carry their own synchronization 1033 00:58:20,850 --> 00:58:23,010 information in them. 1034 00:58:23,010 --> 00:58:25,030 And we will talk about that more when we get to wireless 1035 00:58:25,030 --> 00:58:26,280 and things like that. 1036 00:58:31,100 --> 00:58:31,960 OK. 1037 00:58:31,960 --> 00:58:35,800 Let's go on to real antipodal vectors in 1038 00:58:35,800 --> 00:58:39,030 white Gaussian noise. 1039 00:58:39,030 --> 00:58:42,160 And again, let me point out to you again that one of the 1040 00:58:42,160 --> 00:58:45,500 remarkable things about detection theory is once you 1041 00:58:45,500 --> 00:58:51,940 understand detection for antipodal binary signals and 1042 00:58:51,940 --> 00:58:54,510 Gaussian noise, everything else just follows along. 1043 00:58:58,260 --> 00:59:02,410 OK, so here what we're going to do is to assume that under 1044 00:59:02,410 --> 00:59:05,840 the hypothesis H equals zero-- in other words conditional on 1045 00:59:05,840 --> 00:59:09,570 a zero entering the communication system-- 1046 00:59:09,570 --> 00:59:14,700 what we're going to send is not 1047 00:59:14,700 --> 00:59:16,960 something in a single degree of freedom. 1048 00:59:16,960 --> 00:59:19,470 But we're actually going to send a vector. 1049 00:59:19,470 --> 00:59:23,520 And you can think of that if you want to as sending a waveform 1050 00:59:23,520 --> 00:59:27,930 and breaking up the waveform into an orthonormal 1051 00:59:27,930 --> 00:59:31,440 expansion and a1 to ak as being the 1052 00:59:31,440 --> 00:59:33,940 coefficients in that expansion. 1053 00:59:33,940 --> 00:59:35,560 So we're going to use several degrees of 1054 00:59:35,560 --> 00:59:37,870 freedom to send one signal. 1055 00:59:37,870 --> 00:59:38,210 Yes? 1056 00:59:38,210 --> 00:59:42,750 AUDIENCE: [INAUDIBLE] 1057 00:59:42,750 --> 00:59:46,270 PROFESSOR: I'm only sending one bit. 1058 00:59:46,270 --> 00:59:50,100 And on Wednesday I'm going to talk about what happens when 1059 00:59:50,100 --> 00:59:54,400 we want to send multiple bits or when we want to send a 1060 00:59:54,400 --> 00:59:57,770 large number of signals in one degree of freedom, or all 1061 00:59:57,770 --> 01:00:01,040 those cases, multiple hypotheses. 1062 01:00:01,040 --> 01:00:03,640 You know what's going to happen? 1063 01:00:03,640 --> 01:00:06,750 It's going to turn out to be a trivial problem again. 1064 01:00:06,750 --> 01:00:10,560 Multiple hypotheses are no harder than just binary 1065 01:00:10,560 --> 01:00:12,560 hypotheses. 1066 01:00:12,560 --> 01:00:15,720 So again, once you understand the simplest case, all 1067 01:00:15,720 --> 01:00:18,660 Gaussian problems turn out to be solved. 1068 01:00:18,660 --> 01:00:24,240 Just with minor variations and minor tweaks. 1069 01:00:24,240 --> 01:00:26,790 OK, so I have antipodal vectors. 1070 01:00:26,790 --> 01:00:30,020 One vector is a1 to a sub k.
1071 01:00:30,020 --> 01:00:34,290 Under the other hypothesis we're going to send minus a, 1072 01:00:34,290 --> 01:00:36,550 which is the opposite vector. 1073 01:00:36,550 --> 01:00:40,030 So if we're dealing with two dimensional space with 1074 01:00:40,030 --> 01:00:43,140 coordinates here and here, if I send this I'm 1075 01:00:43,140 --> 01:00:44,360 going to send this. 1076 01:00:44,360 --> 01:00:47,330 If I send that I'm going to send that, and so forth. 1077 01:00:47,330 --> 01:00:51,030 As the opposite alternative. 1078 01:00:51,030 --> 01:00:58,650 The likelihood ratio is then the probability of vector v 1079 01:00:58,650 --> 01:01:02,140 conditional on sending this. 1080 01:01:02,140 --> 01:01:06,780 I'm assuming here that the noise is IID and each noise 1081 01:01:06,780 --> 01:01:12,650 variable has mean zero and variance n0 over 2. 1082 01:01:12,650 --> 01:01:16,550 Namely, we're pretending we're communication people here, 1083 01:01:16,550 --> 01:01:18,430 using an n0 over 2 here. 1084 01:01:18,430 --> 01:01:21,790 So the conditional density-- 1085 01:01:21,790 --> 01:01:27,090 the likelihood of this output vector given zero-- is just 1086 01:01:27,090 --> 01:01:32,350 this density of z shifted over by a. 1087 01:01:32,350 --> 01:01:37,580 So it's what we've talked about as the Gaussian density. 1088 01:01:37,580 --> 01:01:46,540 Just this, which is just the energy in v minus a-- that's what 1089 01:01:46,540 --> 01:01:49,040 it turns out to be. 1090 01:01:49,040 --> 01:01:55,460 So the likelihood ratio is the ratio of this quantity to 1091 01:01:55,460 --> 01:01:58,270 the density where H is equal to one. 1092 01:01:58,270 --> 01:02:06,180 And when H is equal to one, what happens is the same thing 1093 01:02:06,180 --> 01:02:07,390 as happened before. 1094 01:02:07,390 --> 01:02:14,040 H equal to one makes this minus sign turn into a plus sign. 1095 01:02:14,040 --> 01:02:17,550 So when I look at the log likelihood ratio, I want to 1096 01:02:17,550 --> 01:02:23,480 take the ratio of this quantity to the same quantity 1097 01:02:23,480 --> 01:02:26,110 with a plus put into it. 1098 01:02:26,110 --> 01:02:29,250 And when I take the log of that, what happens is I get 1099 01:02:29,250 --> 01:02:32,990 this term minus the opposite term of the opposite side. 1100 01:02:32,990 --> 01:02:38,930 So I have minus the norm squared of v minus a plus the 1101 01:02:38,930 --> 01:02:43,650 norm squared of v plus a over n0. 1102 01:02:43,650 --> 01:02:46,080 And again, if you multiply this out, the v squareds 1103 01:02:46,080 --> 01:02:47,130 cancel out. 1104 01:02:47,130 --> 01:02:48,950 The a squareds cancel out. 1105 01:02:48,950 --> 01:02:51,460 And you just get the inner product terms. 1106 01:02:51,460 --> 01:02:54,060 And strangely enough you get the same formula that you got 1107 01:02:54,060 --> 01:02:58,130 before, almost, except here you have the inner product of 1108 01:02:58,130 --> 01:03:02,140 v with a instead of just the product of v times a. 1109 01:03:02,140 --> 01:03:06,440 So in fact we just have a slight generalization of the 1110 01:03:06,440 --> 01:03:09,140 thing that we did before. 1111 01:03:09,140 --> 01:03:21,980 In other words, the scalar product is 1112 01:03:21,980 --> 01:03:25,100 a sufficient statistic. 1113 01:03:25,100 --> 01:03:27,850 Now what does that tell you? 1114 01:03:27,850 --> 01:03:30,900 It tells you how to build a detector. 1115 01:03:30,900 --> 01:03:31,350 OK?
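In code, the detector it tells you to build is barely longer than the scalar one; a sketch, again with H equals zero sending plus a, and numpy's dot doing the inner product:

    import numpy as np

    def vector_map_detect(v, a, n0, eta=1.0):
        # Log likelihood ratio: (-||v - a||^2 + ||v + a||^2) / n0 collapses
        # to 4 <v, a> / n0, so <v, a> is a sufficient statistic.
        llr = 4.0 * np.dot(v, a) / n0
        return 0 if llr >= np.log(eta) else 1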
1116 01:03:31,350 --> 01:03:34,770 It tells you when you have a vector detection problem, the 1117 01:03:34,770 --> 01:03:40,440 thing that you want to do is to take this vector, v, that 1118 01:03:40,440 --> 01:03:47,260 you have and form the inner product of v with a. 1119 01:03:47,260 --> 01:03:51,350 If in fact v is a waveform and a is a waveform, 1120 01:03:51,350 --> 01:03:52,600 what do you do then? 1121 01:03:56,140 --> 01:03:58,550 Well the first thing you do is to think of 1122 01:03:58,550 --> 01:04:01,730 v as being a vector-- 1123 01:04:01,730 --> 01:04:04,150 where it's the vector of coefficients-- in the 1124 01:04:04,150 --> 01:04:10,470 expansion for that waveform and a in the same way. 1125 01:04:10,470 --> 01:04:13,670 You look at what the inner product is then, and then you 1126 01:04:13,670 --> 01:04:16,860 say, "well what does that correspond to when I deal with 1127 01:04:16,860 --> 01:04:19,550 L2 waveforms?" What's the inner 1128 01:04:19,550 --> 01:04:21,700 product for L2 waveforms? 1129 01:04:26,620 --> 01:04:28,550 It's the integral of the product of the waveforms. 1130 01:04:31,260 --> 01:04:32,930 And how do you form the integral of the 1131 01:04:32,930 --> 01:04:35,770 product of the waveforms? 1132 01:04:35,770 --> 01:04:38,520 You take this waveform here. 1133 01:04:38,520 --> 01:04:44,770 You turn it around and you call it a matched filter to a. 1134 01:04:44,770 --> 01:04:46,890 And you take the received waveform. 1135 01:04:46,890 --> 01:04:50,280 You pass it through the matched filter for a, and you look at 1136 01:04:50,280 --> 01:04:51,530 the output for it. 1137 01:04:53,960 --> 01:05:01,820 Now, let's go back and look at what all of this was doing. 1138 01:05:01,820 --> 01:05:04,560 And for now let's forget about the baseband to 1139 01:05:04,560 --> 01:05:05,730 the passband business. 1140 01:05:05,730 --> 01:05:10,070 Let's just look at this part here because it's a little 1141 01:05:10,070 --> 01:05:11,670 easier to see this first. 1142 01:05:21,040 --> 01:05:23,540 So this comes in here. 1143 01:05:23,540 --> 01:05:25,120 Now remember what we were saying 1144 01:05:25,120 --> 01:05:26,530 when we studied Nyquist. 1145 01:05:26,530 --> 01:05:32,095 We said a neat thing to do was to use a square root of the 1146 01:05:32,095 --> 01:05:34,640 Nyquist pulse at the transmitter. 1147 01:05:34,640 --> 01:05:37,040 When you use a square root of the Nyquist pulse at the 1148 01:05:37,040 --> 01:05:41,920 transmitter, what you have is orthogonality between the 1149 01:05:41,920 --> 01:05:44,140 pulse and all of its shifts. 1150 01:05:44,140 --> 01:05:46,760 Well now we don't much care about the orthogonality 1151 01:05:46,760 --> 01:05:49,820 between the pulse and all of the shifts because we're only 1152 01:05:49,820 --> 01:05:52,930 sending this one bit anyway. 1153 01:05:52,930 --> 01:05:54,990 But it sort of looks like we're going to be able to put 1154 01:05:54,990 --> 01:05:58,180 that back in in a nice convenient way. 1155 01:05:58,180 --> 01:06:02,000 So we're sending this one pulse, p of t, and what 1156 01:06:02,000 --> 01:06:04,150 did we do in this baseband demodulator? 1157 01:06:06,670 --> 01:06:11,050 We passed this through another filter, q of t, which was the 1158 01:06:11,050 --> 01:06:14,620 matched filter to p of t. 1159 01:06:14,620 --> 01:06:19,220 What's our optimal detector for maximum likelihood?
1160 01:06:19,220 --> 01:06:22,490 It's to take whatever this waveform was, pass it through 1161 01:06:22,490 --> 01:06:23,550 the matched filter. 1162 01:06:23,550 --> 01:06:26,120 In other words, to calculate that inner product we just 1163 01:06:26,120 --> 01:06:28,990 talked about. 1164 01:06:28,990 --> 01:06:30,570 OK? 1165 01:06:30,570 --> 01:06:34,090 So in fact when we were looking at the Nyquist problem 1166 01:06:34,090 --> 01:06:37,990 and worrying about intersymbol interference, in fact 1167 01:06:37,990 --> 01:06:41,590 what we were doing was also doing the first part of an 1168 01:06:41,590 --> 01:06:44,390 optimal MAP detector. 1169 01:06:44,390 --> 01:06:48,760 And at this point what comes out of here is a single 1170 01:06:48,760 --> 01:06:54,740 number, v, which in fact now is the inner product of this 1171 01:06:54,740 --> 01:06:57,330 waveform at this point 1172 01:06:57,330 --> 01:07:01,450 with the waveform, a, that we sent. 1173 01:07:01,450 --> 01:07:02,860 OK? 1174 01:07:02,860 --> 01:07:05,800 In other words, we started out by saying, "let's suppose that 1175 01:07:05,800 --> 01:07:08,280 what we have here is a number. 1176 01:07:08,280 --> 01:07:12,390 What's the optimal detector to build?" And then we go on and 1177 01:07:12,390 --> 01:07:15,850 say, "OK, let's suppose we look at the problem here. 1178 01:07:15,850 --> 01:07:19,730 What's the optimal detector to build now?" And the optimal 1179 01:07:19,730 --> 01:07:23,850 detector to build now at this point is this matched filter 1180 01:07:23,850 --> 01:07:25,100 to this input waveform. 1181 01:07:27,610 --> 01:07:30,770 Followed by the inner product here-- which is what the matched 1182 01:07:30,770 --> 01:07:35,250 filter does for us-- followed by our binary antipodal 1183 01:07:35,250 --> 01:07:37,620 detector again. 1184 01:07:37,620 --> 01:07:40,080 OK? 1185 01:07:40,080 --> 01:07:44,770 So by studying the problem at this point, we now understand 1186 01:07:44,770 --> 01:07:46,100 what happens at this point. 1187 01:07:51,050 --> 01:07:53,760 And do I have time to show you what happens at this point? 1188 01:07:53,760 --> 01:07:55,380 I don't know. 1189 01:07:55,380 --> 01:07:56,630 Let me-- 1190 01:08:03,270 --> 01:08:07,710 let's not do that at least right now-- 1191 01:08:07,710 --> 01:08:10,520 let's look at the picture of this that we get when we just 1192 01:08:10,520 --> 01:08:15,200 look at the problem when we have two dimensions. 1193 01:08:15,200 --> 01:08:18,940 So we're either going to transmit a vector, a, or we're 1194 01:08:18,940 --> 01:08:20,910 going to transmit a vector, minus a. 1195 01:08:20,910 --> 01:08:23,100 And think of this in two dimensions. 1196 01:08:23,100 --> 01:08:26,050 When we transmit the vector, a, we have 1197 01:08:26,050 --> 01:08:28,100 two dimensional noise. 1198 01:08:28,100 --> 01:08:31,490 We've already pointed out that two dimensional Gaussian noise 1199 01:08:31,490 --> 01:08:32,750 has circular symmetry. 1200 01:08:32,750 --> 01:08:35,530 Spherical symmetry in an arbitrary number of dimensions. 1201 01:08:35,530 --> 01:08:38,740 So what happens is you get these equal probability 1202 01:08:38,740 --> 01:08:42,930 regions which are spreading out like when you drop a rock 1203 01:08:42,930 --> 01:08:45,360 into a pool of water. 1204 01:08:45,360 --> 01:08:49,240 You see all of these things spreading out in circles. 1205 01:08:49,240 --> 01:08:56,120 And you then say, "OK, what's this inner product going to 1206 01:09:01,330 --> 01:09:01,330 correspond to?"
Finding the inner product and comparing it 1207 01:09:01,330 --> 01:09:03,480 with a threshold. 1208 01:09:03,480 --> 01:09:06,940 Well you can see geometrically what's going to happen here. 1209 01:09:06,940 --> 01:09:11,090 You're trying to do maximum likelihood. 1210 01:09:11,090 --> 01:09:13,230 And we already know we're supposed to calculate the 1211 01:09:13,230 --> 01:09:16,260 inner product, so what the inner product is going to do 1212 01:09:16,260 --> 01:09:19,250 is take whatever v that we receive-- it's going to 1213 01:09:19,250 --> 01:09:25,920 project it on to this line between 0 and a. 1214 01:09:25,920 --> 01:09:30,020 So if I got a v here I'm going to project it down to here. 1215 01:09:30,020 --> 01:09:32,210 And then what I'm going to do is I'm going to compare the 1216 01:09:32,210 --> 01:09:34,600 distance from here to there with the distance 1217 01:09:34,600 --> 01:09:36,680 from here to there. 1218 01:09:36,680 --> 01:09:41,160 Which says first project, then do the old decision in a one 1219 01:09:41,160 --> 01:09:42,510 dimensional way. 1220 01:09:42,510 --> 01:09:47,910 Now geometrically, this distance squared is equal to 1221 01:09:47,910 --> 01:09:50,850 this distance squared plus this distance squared. 1222 01:09:50,850 --> 01:09:53,880 And this distance squared is equal to the same distance 1223 01:09:53,880 --> 01:09:56,920 squared plus this distance squared. 1224 01:09:56,920 --> 01:10:00,980 So whatever you decide to do in terms of these distances, 1225 01:10:00,980 --> 01:10:04,280 you will also decide to do in terms of these distances. 1226 01:10:04,280 --> 01:10:07,940 Which also means that the maximum likelihood regions 1227 01:10:07,940 --> 01:10:10,470 that you're going to develop, or in fact the maximum a 1228 01:10:10,470 --> 01:10:15,480 posteriori probability regions, are simply planes. 1229 01:10:15,480 --> 01:10:18,520 Which are perpendicular to the line between 1230 01:10:18,520 --> 01:10:21,330 minus a and plus a. 1231 01:10:21,330 --> 01:10:22,520 OK? 1232 01:10:22,520 --> 01:10:24,650 So if you're doing maximum likelihood you just form a 1233 01:10:24,650 --> 01:10:27,330 plane halfway between these two points. 1234 01:10:27,330 --> 01:10:27,560 Yeah? 1235 01:10:27,560 --> 01:10:42,100 AUDIENCE: [UNINTELLIGIBLE] 1236 01:10:42,100 --> 01:10:44,990 PROFESSOR: We got the error probability just by first 1237 01:10:44,990 --> 01:10:48,820 doing the projection and then turning it into this scalar 1238 01:10:48,820 --> 01:10:50,500 problem again. 1239 01:10:50,500 --> 01:10:51,830 So in fact the error probability-- 1240 01:10:51,830 --> 01:10:52,770 What? 1241 01:10:52,770 --> 01:10:55,870 AUDIENCE: [UNINTELLIGIBLE] 1242 01:10:55,870 --> 01:10:58,110 PROFESSOR: The probability of error is just the probability 1243 01:10:58,110 --> 01:10:59,940 of error in the projection. 1244 01:10:59,940 --> 01:11:02,860 Did I write it down someplace? 1245 01:11:02,860 --> 01:11:04,770 Oh yeah, I did write it down. 1246 01:11:08,100 --> 01:11:14,000 But I wrote it down, well I sort of cheated. 1247 01:11:14,000 --> 01:11:16,980 It's in the notes. 1248 01:11:16,980 --> 01:11:21,330 I mean the likelihood ratio is just an inner product here 1249 01:11:21,330 --> 01:11:23,410 which is a number. 1250 01:11:23,410 --> 01:11:26,290 And when you find the error probability, you just use the 1251 01:11:26,290 --> 01:11:30,030 same q formula that we used before.
1252 01:11:30,030 --> 01:11:32,360 And in place of a you substitute-- 1253 01:11:36,700 --> 01:11:42,800 in place of a you substitute the norm of the vector a. 1254 01:11:42,800 --> 01:11:45,180 Which is the corresponding quantity. 1255 01:11:45,180 --> 01:12:01,080 So it's q of the norm of a divided by the square root of n0 over 2. 1256 01:12:01,080 --> 01:12:02,510 OK? 1257 01:12:02,510 --> 01:12:07,780 So that's the maximum likelihood error probability. 1258 01:12:07,780 --> 01:12:08,010 OK? 1259 01:12:08,010 --> 01:12:11,870 In other words, nothing new has happened here. 1260 01:12:11,870 --> 01:12:15,900 You just go through the matched filter and then you do this 1261 01:12:15,900 --> 01:12:19,470 same one dimensional problem that we've already 1262 01:12:19,470 --> 01:12:21,620 figured out how to do. 1263 01:12:21,620 --> 01:12:26,090 I think I'm going to stop there and we'll do the complex 1264 01:12:26,090 --> 01:12:33,410 case which really corresponds to what happens after baseband 1265 01:12:33,410 --> 01:12:35,160 to passband and then passband to baseband.
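To pull the vector case together, here is a sketch of the whole maximum likelihood receiver in one place: the matched filter reduces everything to the inner product, and the error probability depends only on the length of a. The function names are ours.

    import numpy as np
    from math import erfc, sqrt

    def q(x):
        return 0.5 * erfc(x / sqrt(2.0))

    def detect_and_pe(v, a, n0):
        # Matched filter / inner product: reduces the received vector (or
        # sampled waveform) to the one sufficient statistic <v, a>.
        stat = float(np.dot(v, a))
        decision = 0 if stat >= 0.0 else 1      # maximum likelihood, eta = 1
        # Pr(e) = q(||a|| / sigma) with sigma = sqrt(n0 / 2), the vector
        # version of the scalar formula from earlier.
        pe = q(np.linalg.norm(a) / sqrt(n0 / 2.0))
        return decision, pe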