1 00:00:00,820 --> 00:00:04,100 We will now develop the solution to the linear least 2 00:00:04,100 --> 00:00:06,910 mean squares estimation problem. 3 00:00:06,910 --> 00:00:09,630 We want to find coefficients a and b 4 00:00:09,630 --> 00:00:12,970 that make this expression, this mean squared error, 5 00:00:12,970 --> 00:00:14,640 as small as possible. 6 00:00:14,640 --> 00:00:18,200 We will approach this problem by proceeding in two stages. 7 00:00:18,200 --> 00:00:22,290 In the first stage, we assume that the choice of a 8 00:00:22,290 --> 00:00:25,080 has already been found and concentrate 9 00:00:25,080 --> 00:00:27,470 on the question of choosing b. 10 00:00:27,470 --> 00:00:30,310 Now if a has been found and has been 11 00:00:30,310 --> 00:00:33,520 fixed to a specific number, then this quantity 12 00:00:33,520 --> 00:00:36,400 here is a specific random variable. 13 00:00:36,400 --> 00:00:38,090 And what do we have? 14 00:00:38,090 --> 00:00:40,930 We have a random variable minus a constant. 15 00:00:40,930 --> 00:00:42,760 And we want to choose that constant, 16 00:00:42,760 --> 00:00:47,100 so that this difference squared is as small as possible 17 00:00:47,100 --> 00:00:49,180 in the expected value sense. 18 00:00:49,180 --> 00:00:52,750 So essentially, we're trying to choose a constant b that 19 00:00:52,750 --> 00:00:56,740 estimates this random variable in the best possible way. 20 00:00:56,740 --> 00:00:59,040 But this is a problem that's familiar to us, 21 00:00:59,040 --> 00:01:01,830 and we know that the best choice of b 22 00:01:01,830 --> 00:01:06,020 is equal to the expected value of that random variable. 23 00:01:06,020 --> 00:01:09,630 And using also linearity, this expected value 24 00:01:09,630 --> 00:01:13,070 can be written in this form. 25 00:01:13,070 --> 00:01:18,020 So if we know a, this is how b should be chosen. 26 00:01:18,020 --> 00:01:22,840 Let us now move on to the choice of a. 
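For reference, the first stage described above can be written out as follows. The slide is not part of this transcript, so the notation here is an assumption inferred from the spoken description: Theta is the unknown quantity, X is the observation, and the estimator has the linear form aX + b.

```latex
% Problem: choose a and b to minimize the mean squared error
\min_{a,\,b}\ \mathbf{E}\big[(\Theta - aX - b)^2\big]

% Stage 1: with a fixed, the best constant estimate of the random variable
% \Theta - aX is its expected value; linearity then splits the expectation
b = \mathbf{E}[\Theta - aX] = \mathbf{E}[\Theta] - a\,\mathbf{E}[X]
```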
27 00:01:22,840 --> 00:01:26,930 Since we know what b should be equal to, 28 00:01:26,930 --> 00:01:30,250 we can rewrite this expression that we're 29 00:01:30,250 --> 00:01:35,130 trying to minimize by substituting our choice of b, 30 00:01:35,130 --> 00:01:38,393 which is the expected value of Theta minus aX. 31 00:01:43,229 --> 00:01:45,590 And this is the quantity we want to minimize. 32 00:01:45,590 --> 00:01:46,660 What is it? 33 00:01:46,660 --> 00:01:49,430 We have a random variable minus the expected value 34 00:01:49,430 --> 00:01:51,259 of that random variable squared. 35 00:01:51,259 --> 00:01:53,110 And then we take expected value. 36 00:01:53,110 --> 00:02:00,620 This is just the variance of the random variable Theta minus aX. 37 00:02:00,620 --> 00:02:02,680 This is the variance of the difference 38 00:02:02,680 --> 00:02:04,660 of two random variables. 39 00:02:04,660 --> 00:02:07,550 Because the two random variables are dependent, 40 00:02:07,550 --> 00:02:11,640 this is not just the sum of the individual variances. 41 00:02:11,640 --> 00:02:15,630 But we do have a formula, even for the general case. 42 00:02:15,630 --> 00:02:18,720 And the formula tells us that the variance 43 00:02:18,720 --> 00:02:20,780 of the difference of two random variables 44 00:02:20,780 --> 00:02:24,010 is the variance of the first random variable 45 00:02:24,010 --> 00:02:26,420 plus the variance of the second. 46 00:02:26,420 --> 00:02:29,170 And when we pull a outside the variance, 47 00:02:29,170 --> 00:02:33,000 that gives us a contribution of a squared. 48 00:02:33,000 --> 00:02:35,400 And then we have a cross-term. 49 00:02:35,400 --> 00:02:37,680 Because of the minus sign here, the cross-term 50 00:02:37,680 --> 00:02:39,329 will have a minus sign. 51 00:02:39,329 --> 00:02:41,600 We have a factor of 2. 52 00:02:41,600 --> 00:02:46,120 And then we want the covariance of Theta with aX. 
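Written out, the steps described so far would read roughly as below. This is a sketch in the same assumed notation (Theta, X, a), not a copy of the slide.

```latex
% Substituting b = E[\Theta - aX] turns the objective into a variance
\mathbf{E}\Big[\big(\Theta - aX - \mathbf{E}[\Theta - aX]\big)^2\Big]
  = \operatorname{var}(\Theta - aX)

% Variance of a difference of two (possibly dependent) random variables
\operatorname{var}(\Theta - aX)
  = \operatorname{var}(\Theta) + a^2\operatorname{var}(X) - 2\operatorname{cov}(\Theta, aX)
```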
53 00:02:46,120 --> 00:02:49,390 And because we have seen that covariances behave 54 00:02:49,390 --> 00:02:53,270 in a linear manner, we can pull a outside the covariance. 55 00:02:53,270 --> 00:02:55,940 And this is what we are left with. 56 00:02:55,940 --> 00:02:59,710 This is the quantity we want to optimize with respect to a. 57 00:02:59,710 --> 00:03:02,570 And to do this optimization, we just 58 00:03:02,570 --> 00:03:07,320 set the derivative with respect to a to 0. 59 00:03:07,320 --> 00:03:13,180 And that's going to give us 2a times the variance of X 60 00:03:13,180 --> 00:03:19,850 minus twice the covariance of Theta with X, equal to 0. 61 00:03:19,850 --> 00:03:22,460 From which it follows that a should 62 00:03:22,460 --> 00:03:25,310 be equal to the covariance of Theta 63 00:03:25,310 --> 00:03:31,670 with X divided by the variance of X. 64 00:03:31,670 --> 00:03:34,140 So we have found what a should be. 65 00:03:34,140 --> 00:03:38,110 Once we know what a is, we know what b should be equal to. 66 00:03:38,110 --> 00:03:40,079 So we have solved the problem. 67 00:03:40,079 --> 00:03:43,030 And here's the form of the solution. 68 00:03:43,030 --> 00:03:48,050 This coefficient here is the coefficient a. 69 00:03:48,050 --> 00:03:50,990 So we have here the term aX. 70 00:03:50,990 --> 00:03:55,940 And then this term together with a times the expected value 71 00:03:55,940 --> 00:04:01,290 of X, this corresponds to the coefficient b. 72 00:04:01,290 --> 00:04:04,070 It's also instructive to rewrite this solution 73 00:04:04,070 --> 00:04:07,460 in a slightly different form involving the correlation 74 00:04:07,460 --> 00:04:08,870 coefficient. 75 00:04:08,870 --> 00:04:11,930 Recall that the correlation coefficient between two 76 00:04:11,930 --> 00:04:16,730 random variables is defined as the covariance between the two 77 00:04:16,730 --> 00:04:20,350 random variables divided by the product 78 00:04:20,350 --> 00:04:23,560 of their standard deviations. 
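The minimization over a and the resulting estimator can be summarized as follows. The symbol \hat{\Theta}_L for the linear estimator is an assumed name, and the derivation assumes var(X) > 0, in which case the second derivative 2 var(X) is positive and the stationary point is a minimum.

```latex
% Objective after pulling a outside the covariance
\operatorname{var}(\Theta) + a^2\operatorname{var}(X) - 2a\operatorname{cov}(\Theta, X)

% Setting the derivative with respect to a to zero
2a\operatorname{var}(X) - 2\operatorname{cov}(\Theta, X) = 0
\quad\Longrightarrow\quad
a = \frac{\operatorname{cov}(\Theta, X)}{\operatorname{var}(X)}

% Solution, using b = E[\Theta] - a E[X]
\hat{\Theta}_L = \mathbf{E}[\Theta]
  + \frac{\operatorname{cov}(\Theta, X)}{\operatorname{var}(X)}\,\big(X - \mathbf{E}[X]\big)

% Correlation coefficient
\rho = \frac{\operatorname{cov}(\Theta, X)}{\sigma_\Theta\,\sigma_X}
```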
79 00:04:23,560 --> 00:04:27,340 Using this relation, we can now write the coefficient a 80 00:04:27,340 --> 00:04:34,060 as the covariance, which is rho times sigma Theta sigma 81 00:04:34,060 --> 00:04:41,430 X divided by the variance of X, which is sigma X squared. 82 00:04:41,430 --> 00:04:44,920 And after we cancel a factor of sigma X from the numerator 83 00:04:44,920 --> 00:04:48,640 and the denominator, we see that a is also 84 00:04:48,640 --> 00:04:52,640 equal to rho times sigma Theta divided by sigma X. 85 00:04:52,640 --> 00:04:55,190 And this gives us this alternative form 86 00:04:55,190 --> 00:04:57,470 for the solution. 87 00:04:57,470 --> 00:05:01,360 What we will be doing next will be to interpret this solution 88 00:05:01,360 --> 00:05:04,410 and also to give a number of examples.
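As a quick numerical sanity check, the two forms of the coefficient a can be compared on simulated data. This is a minimal sketch and not part of the lecture: the function names `llms_coefficients` and `mse` are hypothetical, the model (Theta normal, X equal to Theta plus independent noise) is an assumption chosen only for illustration, and sample averages stand in for the expectations.

```python
import numpy as np


def llms_coefficients(theta, x):
    """Return (a, b) minimizing the sample average of (theta - a*x - b)**2."""
    theta = np.asarray(theta, dtype=float)
    x = np.asarray(x, dtype=float)
    # Sample covariance and variance, both normalized by n (ddof=0).
    cov_theta_x = np.mean((theta - theta.mean()) * (x - x.mean()))
    var_x = np.var(x)
    a = cov_theta_x / var_x            # a = cov(Theta, X) / var(X)
    b = theta.mean() - a * x.mean()    # b = E[Theta] - a E[X]
    return a, b


def mse(theta, x, a, b):
    """Sample mean squared error of the linear estimator a*x + b."""
    return np.mean((theta - a * x - b) ** 2)


# Assumed illustrative model: Theta ~ N(1, 2^2), X = Theta + N(0, 1) noise.
rng = np.random.default_rng(0)
theta = rng.normal(1.0, 2.0, size=10_000)
x = theta + rng.normal(0.0, 1.0, size=10_000)

a, b = llms_coefficients(theta, x)

# Alternative form: a = rho * sigma_Theta / sigma_X.
rho = np.corrcoef(theta, x)[0, 1]
a_alt = rho * np.std(theta) / np.std(x)

print("a from covariance form: ", a)
print("a from correlation form:", a_alt)
print("b:", b)
print("MSE at (a, b):      ", mse(theta, x, a, b))
print("MSE at (a + 0.1, b):", mse(theta, x, a + 0.1, b))
```

The two values of a should agree up to floating-point error, and moving a or b away from the computed values should only increase the sample mean squared error.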