1 00:00:01,180 --> 00:00:04,050 Least mean squares estimation is remarkable 2 00:00:04,050 --> 00:00:06,820 because it has such a simple answer. 3 00:00:06,820 --> 00:00:09,100 The way to come up with estimates, 4 00:00:09,100 --> 00:00:12,070 if what you care about is to keep the mean squared 5 00:00:12,070 --> 00:00:14,630 error small, the way to come up with estimates 6 00:00:14,630 --> 00:00:18,120 is to just report the conditional expectation, which 7 00:00:18,120 --> 00:00:20,440 is going to be a number, once you have obtained 8 00:00:20,440 --> 00:00:22,060 some values of the data. 9 00:00:22,060 --> 00:00:24,080 Or more abstractly, you can think of it 10 00:00:24,080 --> 00:00:27,240 as a random variable, if you do not know ahead of time 11 00:00:27,240 --> 00:00:29,510 what data you're going to obtain. 12 00:00:29,510 --> 00:00:31,820 Because this estimator is so important, 13 00:00:31,820 --> 00:00:35,480 it is worth writing down what the performance 14 00:00:35,480 --> 00:00:37,810 of that estimator is. 15 00:00:37,810 --> 00:00:39,620 So suppose that you have obtained 16 00:00:39,620 --> 00:00:42,350 a particular measurement, a particular value 17 00:00:42,350 --> 00:00:45,520 of the observation, then the resulting mean squared 18 00:00:45,520 --> 00:00:48,100 error within that conditional universe 19 00:00:48,100 --> 00:00:51,000 where you have already obtained that value, 20 00:00:51,000 --> 00:00:52,690 is just this quantity. 21 00:00:52,690 --> 00:00:57,350 It's the mean square of the error between the variable 22 00:00:57,350 --> 00:01:00,160 that you're trying to estimate and your estimate. 23 00:01:00,160 --> 00:01:02,170 And everything gets computed within 24 00:01:02,170 --> 00:01:03,950 this conditional universe. 25 00:01:03,950 --> 00:01:07,010 Now this is a very familiar quantity, however. 26 00:01:07,010 --> 00:01:11,039 It's the expected value of a random variable difference 27 00:01:11,039 --> 00:01:13,070 from its mean, squared. 28 00:01:13,070 --> 00:01:15,240 This is just the variance, except that 29 00:01:15,240 --> 00:01:17,155 because all quantities are calculated 30 00:01:17,155 --> 00:01:21,850 in a conditional universe, this is the conditional variance. 31 00:01:21,850 --> 00:01:26,140 So the conditional variance is the optimal mean squared error, 32 00:01:26,140 --> 00:01:27,960 the mean squared error that you obtain 33 00:01:27,960 --> 00:01:30,620 when you use this particular estimate. 34 00:01:30,620 --> 00:01:33,759 And it's the value that you would report to your boss 35 00:01:33,759 --> 00:01:37,270 if you were asked how good is the estimate that you're 36 00:01:37,270 --> 00:01:38,870 giving me. 37 00:01:38,870 --> 00:01:43,140 But suppose that you have not yet obtained a measurement, 38 00:01:43,140 --> 00:01:45,320 but you're going to your boss and you're 39 00:01:45,320 --> 00:01:50,180 proposing this particular estimator as your design. 40 00:01:50,180 --> 00:01:52,500 What are you going to report to your boss 41 00:01:52,500 --> 00:01:55,180 as the performance of your design? 42 00:01:55,180 --> 00:01:59,030 Since you have not yet obtained the value of X, 43 00:01:59,030 --> 00:02:02,030 X is a random variable, you do not 44 00:02:02,030 --> 00:02:04,320 know what the value of this conditional expectation 45 00:02:04,320 --> 00:02:05,090 is going to be. 46 00:02:05,090 --> 00:02:06,630 It's a random variable. 47 00:02:06,630 --> 00:02:08,759 But no matter what it is, this is 48 00:02:08,759 --> 00:02:12,380 going to be the error that you're going to be obtaining. 49 00:02:12,380 --> 00:02:16,110 And this is the overall value of the mean squared error. 50 00:02:16,110 --> 00:02:17,800 So this is the quantity that you would 51 00:02:17,800 --> 00:02:21,350 report to your boss as your overall mean squared 52 00:02:21,350 --> 00:02:25,180 error, the value that you report before obtaining 53 00:02:25,180 --> 00:02:27,480 any specific measurement. 54 00:02:27,480 --> 00:02:29,030 Now, what is this quantity? 55 00:02:29,030 --> 00:02:34,020 This quantity is just the average of this quantity 56 00:02:34,020 --> 00:02:40,310 up here, averaged over all the possible values of X. 57 00:02:40,310 --> 00:02:43,329 And in our more abstract notation, 58 00:02:43,329 --> 00:02:47,600 it is just the expectation of the conditional variance. 59 00:02:47,600 --> 00:02:50,730 The conditional variance, the abstract conditional variance, 60 00:02:50,730 --> 00:02:53,829 is a random variable that takes this value whenever 61 00:02:53,829 --> 00:02:57,400 capital X happens to be equal to little x. 62 00:02:57,400 --> 00:03:01,550 And when we average it over all possible values of X, 63 00:03:01,550 --> 00:03:04,600 we just have the expectation of this random variable. 64 00:03:07,110 --> 00:03:09,710 Let me continue now with a few more comments 65 00:03:09,710 --> 00:03:11,860 on LMS estimation. 66 00:03:11,860 --> 00:03:14,800 First, something that should be pretty clear at this point 67 00:03:14,800 --> 00:03:18,010 is that LMS estimation is only relevant to estimation 68 00:03:18,010 --> 00:03:19,030 problems. 69 00:03:19,030 --> 00:03:21,470 This is because in hypothesis testing problems 70 00:03:21,470 --> 00:03:24,690 we typically care about the probability of error, 71 00:03:24,690 --> 00:03:28,380 not the mean squared error. 72 00:03:28,380 --> 00:03:32,890 A second important comment is that in some cases 73 00:03:32,890 --> 00:03:37,380 the LMS estimates and the MAP estimates turn out 74 00:03:37,380 --> 00:03:38,960 to be the same. 75 00:03:38,960 --> 00:03:40,430 When is that the case? 76 00:03:40,430 --> 00:03:43,210 If the posterior distribution of Theta 77 00:03:43,210 --> 00:03:48,160 happens to have a single peak, and it is also 78 00:03:48,160 --> 00:03:51,660 symmetric around a certain point, 79 00:03:51,660 --> 00:03:55,720 so that the peak also occurs at that particular point, 80 00:03:55,720 --> 00:03:58,780 then clearly the peak occurs here. 81 00:03:58,780 --> 00:04:02,050 But the conditional expectation is also that same point, 82 00:04:02,050 --> 00:04:04,560 because it is the center of symmetry. 83 00:04:04,560 --> 00:04:09,400 So in those cases, the two types of estimates, or estimators, 84 00:04:09,400 --> 00:04:10,960 coincide. 85 00:04:10,960 --> 00:04:12,380 When does this happen? 86 00:04:12,380 --> 00:04:17,640 This happens in one particular important special case. 87 00:04:17,640 --> 00:04:20,870 We have seen that in linear-normal models 88 00:04:20,870 --> 00:04:23,600 the posterior distribution is normal. 89 00:04:23,600 --> 00:04:27,880 And so this is one case where the MAP estimate and the LMS 90 00:04:27,880 --> 00:04:30,871 estimate are going to coincide.