1 00:00:01,270 --> 00:00:05,080 Let us now illustrate the linear least mean squares estimation 2 00:00:05,080 --> 00:00:07,730 methodology in the context of an example. 3 00:00:07,730 --> 00:00:11,230 And we're going to revisit our familiar example 4 00:00:11,230 --> 00:00:15,780 that we considered earlier in the context of general least 5 00:00:15,780 --> 00:00:17,790 mean squares estimation. 6 00:00:17,790 --> 00:00:19,740 Let us remind ourselves what were 7 00:00:19,740 --> 00:00:22,540 the assumptions behind this example. 8 00:00:22,540 --> 00:00:24,360 There is an unknown random variable 9 00:00:24,360 --> 00:00:28,790 that we wish to estimate, and that random variable happens 10 00:00:28,790 --> 00:00:34,610 to be uniform on the range from 4 to 10. 11 00:00:34,610 --> 00:00:39,220 What we get to observe is a random variable X, 12 00:00:39,220 --> 00:00:43,110 which is equal to Theta plus or minus something. 13 00:00:43,110 --> 00:00:47,230 So X is Theta plus a noise term. 14 00:00:47,230 --> 00:00:51,370 And that noise term can be anything 15 00:00:51,370 --> 00:00:57,800 in the range from minus 1 to plus 1. 16 00:00:57,800 --> 00:01:00,540 Furthermore, the distribution of U 17 00:01:00,540 --> 00:01:02,750 is this particular uniform no matter 18 00:01:02,750 --> 00:01:07,710 what theta is, so Theta and U are independent. 19 00:01:14,760 --> 00:01:18,090 These modeling assumptions correspond to this picture. 20 00:01:18,090 --> 00:01:20,440 This is the range of X and Theta. 21 00:01:20,440 --> 00:01:22,850 And the joint distribution of X and Theta 22 00:01:22,850 --> 00:01:25,789 happens to be a uniform distribution 23 00:01:25,789 --> 00:01:28,930 on this particular shape. 24 00:01:28,930 --> 00:01:32,280 However, we're not really going to use this picture other than 25 00:01:32,280 --> 00:01:34,320 for illustration purposes. 26 00:01:34,320 --> 00:01:38,039 You could take just this as the formulation of the problem 27 00:01:38,039 --> 00:01:40,470 that we're interested in. 28 00:01:40,470 --> 00:01:45,789 So now, to develop the form of the optimal linear estimator, 29 00:01:45,789 --> 00:01:47,750 all we need to do is to determine 30 00:01:47,750 --> 00:01:50,789 the various constants that show up. 31 00:01:50,789 --> 00:01:53,550 So let's start with expectations. 32 00:01:53,550 --> 00:01:56,960 Theta is uniform from 4 to 10. 33 00:01:56,960 --> 00:01:59,440 Therefore, the expected value is the midpoint, 34 00:01:59,440 --> 00:02:01,710 which is equal to 7. 35 00:02:01,710 --> 00:02:05,980 U has a symmetric distribution around 0, 36 00:02:05,980 --> 00:02:09,720 so its expected value is going to be equal to 0. 37 00:02:09,720 --> 00:02:13,270 X is the sum of Theta and U. Therefore, 38 00:02:13,270 --> 00:02:16,550 its expected value is the sum of these two expected values 39 00:02:16,550 --> 00:02:19,800 and is equal to 7. 40 00:02:19,800 --> 00:02:22,600 Let us now look at variances. 41 00:02:22,600 --> 00:02:28,110 The variance of a uniform is equal to the square 42 00:02:28,110 --> 00:02:31,930 of the length of the interval on which the uniform is 43 00:02:31,930 --> 00:02:37,180 distributed-- in this case, it's 6 squared-- divided always 44 00:02:37,180 --> 00:02:39,270 by a coefficient of 12. 45 00:02:39,270 --> 00:02:44,600 6 squared is 36, so we obtain three. 46 00:02:44,600 --> 00:02:49,210 The variance of U is determined by a similar formula, 47 00:02:49,210 --> 00:02:52,180 except that now we have an interval of length 2, 48 00:02:52,180 --> 00:02:56,560 so we obtain 2 squared over 12. 49 00:02:56,560 --> 00:02:57,720 And this is 1/3. 50 00:03:00,750 --> 00:03:04,300 Now let us look at the variance of X. 51 00:03:04,300 --> 00:03:08,510 Since X is the sum of Theta and U, and since the two of them 52 00:03:08,510 --> 00:03:11,130 are independent, the variance of X 53 00:03:11,130 --> 00:03:15,420 is going to be the sum of these two variances, which 54 00:03:15,420 --> 00:03:17,755 is 10 over 3. 55 00:03:21,220 --> 00:03:25,070 Now let us try to calculate the covariance term. 56 00:03:28,890 --> 00:03:36,560 The covariance of Theta with X is this expression, 57 00:03:36,560 --> 00:03:40,480 because X is Theta plus U. And then, 58 00:03:40,480 --> 00:03:44,210 using linearity properties of covariances, 59 00:03:44,210 --> 00:03:48,400 this is the covariance of Theta with itself, 60 00:03:48,400 --> 00:03:54,120 plus the covariance of Theta with U. Now, 61 00:03:54,120 --> 00:03:59,280 Theta and U are independent, so this covariance is equal to 0. 62 00:03:59,280 --> 00:04:01,320 The covariance of Theta with itself 63 00:04:01,320 --> 00:04:03,440 is just the same as the variance, 64 00:04:03,440 --> 00:04:09,030 so here we obtain an answer of 3. 65 00:04:09,030 --> 00:04:12,140 And so now we have all the pieces of information 66 00:04:12,140 --> 00:04:15,770 that we need, and we can proceed and write down 67 00:04:15,770 --> 00:04:18,589 the form of the linear estimator. 68 00:04:18,589 --> 00:04:23,080 The expected value of Theta is 7. 69 00:04:23,080 --> 00:04:26,540 Then, the covariance of Theta with X 70 00:04:26,540 --> 00:04:32,310 is 3, divided by the variance, which is 10 over 3. 71 00:04:32,310 --> 00:04:34,380 So this ratio becomes 9/10. 72 00:04:37,780 --> 00:04:44,440 And then X minus the expected value of X gives us this term. 73 00:04:44,440 --> 00:04:48,760 So this is the form of the optimal linear estimator, 74 00:04:48,760 --> 00:04:54,110 and if you wish to plot it, it is a curve of this kind. 75 00:04:54,110 --> 00:04:58,450 It is actually interesting to contrast this solution 76 00:04:58,450 --> 00:05:02,710 to the shape of the optimal estimator, the least mean 77 00:05:02,710 --> 00:05:05,530 squares estimator, or the conditional expectation 78 00:05:05,530 --> 00:05:08,265 estimator, which we had found earlier, 79 00:05:08,265 --> 00:05:12,960 and which corresponds to this particular blue curve. 80 00:05:12,960 --> 00:05:15,910 So in some sense, the linear estimator 81 00:05:15,910 --> 00:05:21,830 is a pretty good approximation of the optimal non-linear one. 82 00:05:21,830 --> 00:05:24,240 It does the best job that it can do, 83 00:05:24,240 --> 00:05:27,570 subject to the constraint that it has to be a linear function. 84 00:05:30,200 --> 00:05:33,300 Notice also that this coefficient here is, of course, 85 00:05:33,300 --> 00:05:34,290 positive. 86 00:05:34,290 --> 00:05:38,220 This reflects the fact that the two random variables, 87 00:05:38,220 --> 00:05:41,780 X and Theta, are positively correlated. 88 00:05:41,780 --> 00:05:43,800 This should be clear from this diagram. 89 00:05:43,800 --> 00:05:48,200 When X increases, Theta tends to also increase, and vice versa. 90 00:05:48,200 --> 00:05:51,430 It's also reflected in the fact that the covariance 91 00:05:51,430 --> 00:05:54,600 is a positive number. 92 00:05:54,600 --> 00:05:56,740 On the other hand, because this coefficient 93 00:05:56,740 --> 00:06:01,410 is 9/10 and not equal to 1, this red line 94 00:06:01,410 --> 00:06:03,880 is somewhat slanted in comparison 95 00:06:03,880 --> 00:06:07,405 with the orientation of the diagram that we have. 96 00:06:12,060 --> 00:06:14,320 The calculations that we went through 97 00:06:14,320 --> 00:06:18,710 in this particular example are pretty generic. 98 00:06:18,710 --> 00:06:21,620 This is what you need to do in general. 99 00:06:21,620 --> 00:06:24,100 You just look at the random variables involved, 100 00:06:24,100 --> 00:06:27,840 you calculate their means, you calculate their variances. 101 00:06:27,840 --> 00:06:30,420 Then you may have to do some extra work 102 00:06:30,420 --> 00:06:33,800 to calculate the covariance of interest. 103 00:06:33,800 --> 00:06:36,500 And once you're done, you plug in the numerical values 104 00:06:36,500 --> 00:06:39,040 that you have found, and you obtain 105 00:06:39,040 --> 00:06:41,900 the form of the linear estimator.