1 00:00:00,000 --> 00:00:00,040 2 00:00:00,040 --> 00:00:00,760 Hey guys. 3 00:00:00,760 --> 00:00:02,140 Welcome back. 4 00:00:02,140 --> 00:00:05,440 Today we're going to do a fun problem that will test your 5 00:00:05,440 --> 00:00:08,010 knowledge of the law of total variance. 6 00:00:08,010 --> 00:00:11,050 And in the process, we'll also get more practice dealing with 7 00:00:11,050 --> 00:00:14,460 joint PDFs and computing conditional expectations and 8 00:00:14,460 --> 00:00:16,810 conditional variances. 9 00:00:16,810 --> 00:00:22,030 So in this problem, we are given a joint PDF for x and y. 10 00:00:22,030 --> 00:00:26,490 So we're told that x and y can take on the following values 11 00:00:26,490 --> 00:00:27,320 in the shape of this 12 00:00:27,320 --> 00:00:29,660 parallelogram, which I've drawn. 13 00:00:29,660 --> 00:00:33,250 And moreover, that x and y are uniformly distributed. 14 00:00:33,250 --> 00:00:37,630 So the joint PDF is just flat over this parallelogram. 15 00:00:37,630 --> 00:00:41,430 And because the parallelogram has an area of 1, the height 16 00:00:41,430 --> 00:00:47,100 of the PDF must also be 1 so that the PDF integrates to 1. 17 00:00:47,100 --> 00:00:47,430 OK. 18 00:00:47,430 --> 00:00:49,670 And then we are asked to compute the 19 00:00:49,670 --> 00:00:51,470 variance of x plus y. 20 00:00:51,470 --> 00:00:54,560 So you can think of x plus y as a new random variable whose 21 00:00:54,560 --> 00:00:56,520 variance we want to compute. 22 00:00:56,520 --> 00:00:59,530 And moreover, we're told we should compute this variance 23 00:00:59,530 --> 00:01:03,240 by using something called the law of total variance. 24 00:01:03,240 --> 00:01:07,140 So from lecture, you should remember or you should recall 25 00:01:07,140 --> 00:01:09,170 that the law of total variance can be written 26 00:01:09,170 --> 00:01:11,470 in these two ways. 27 00:01:11,470 --> 00:01:14,310 And the reason why there's two different forms for this case 28 00:01:14,310 --> 00:01:17,350 is because the formula always has you 29 00:01:17,350 --> 00:01:18,590 conditioning on something. 30 00:01:18,590 --> 00:01:22,120 Here we condition on x, here we condition on y. 31 00:01:22,120 --> 00:01:25,590 And for this problem, the logical choice you have for 32 00:01:25,590 --> 00:01:28,400 what to condition on is x or y. 33 00:01:28,400 --> 00:01:29,970 So again, we have this option. 34 00:01:29,970 --> 00:01:33,450 And my claim is that we should condition on x. 35 00:01:33,450 --> 00:01:37,740 And the reason has to do with the geometry of this diagram. 36 00:01:37,740 --> 00:01:42,220 So notice that if you freeze an x and then you sort of vary 37 00:01:42,220 --> 00:01:47,120 x, the width of this parallelogram stays constant. 38 00:01:47,120 --> 00:01:50,190 However, if you condition on y and look at the width this 39 00:01:50,190 --> 00:01:53,070 way, you see that the width of the slices you get by 40 00:01:53,070 --> 00:01:56,680 conditioning vary with y. 41 00:01:56,680 --> 00:02:00,280 So to make our lives easier, we're going to condition on x. 42 00:02:00,280 --> 00:02:02,480 And I'm going to erase this bottom one, because 43 00:02:02,480 --> 00:02:04,980 we're not using it. 44 00:02:04,980 --> 00:02:08,889 So this really can seem quite intimidating, because we have 45 00:02:08,889 --> 00:02:11,590 nested variances and expectations going on, but 46 00:02:11,590 --> 00:02:13,920 we'll just take it slowly step by step. 47 00:02:13,920 --> 00:02:16,650 So first, I want to focus on this term-- 48 00:02:16,650 --> 00:02:23,600 the conditional expectation of x plus y conditioned on x. 49 00:02:23,600 --> 00:02:28,160 So coming back over to this picture, if you fix an 50 00:02:28,160 --> 00:02:32,100 arbitrary x in the interval, 0 to 1, we're restricting 51 00:02:32,100 --> 00:02:34,900 ourselves to this universe. 52 00:02:34,900 --> 00:02:39,860 So y can only vary between this point and this point. 53 00:02:39,860 --> 00:02:42,560 Now, I've already written down here that the formula for this 54 00:02:42,560 --> 00:02:45,555 line is given by y is equal to x. 55 00:02:45,555 --> 00:02:48,510 And the formula for this line is given by y is 56 00:02:48,510 --> 00:02:50,190 equal to x plus 1. 57 00:02:50,190 --> 00:02:53,990 So in particular, when we condition on x, we know that y 58 00:02:53,990 --> 00:02:56,920 varies between x and x plus 1. 59 00:02:56,920 --> 00:02:58,960 But we actually know more than that. 60 00:02:58,960 --> 00:03:01,810 We know that in the unconditional universe, x and 61 00:03:01,810 --> 00:03:04,180 y were uniformly distributed. 62 00:03:04,180 --> 00:03:07,520 So it follows that in the conditional universe, y should 63 00:03:07,520 --> 00:03:10,600 also be uniformly distributed, because conditioning doesn't 64 00:03:10,600 --> 00:03:13,880 change the relative frequency of outcomes. 65 00:03:13,880 --> 00:03:19,210 So that reasoning means that we can draw the conditional 66 00:03:19,210 --> 00:03:22,270 PDF of y conditioned on x as this. 67 00:03:22,270 --> 00:03:24,770 We said it varies between x and x plus 1. 68 00:03:24,770 --> 00:03:27,770 And we also said that it's uniform, which means that it 69 00:03:27,770 --> 00:03:29,670 must have a height of 1. 70 00:03:29,670 --> 00:03:34,540 So this is py given x, y given x. 71 00:03:34,540 --> 00:03:37,820 Now, you might be concerned, because, well, we're trying to 72 00:03:37,820 --> 00:03:43,150 compute the expectation of x plus y and this is the 73 00:03:43,150 --> 00:03:47,450 conditional PDF of y, not of the random variable, x plus y. 74 00:03:47,450 --> 00:03:50,990 But I claim that we're OK, this is still useful, because 75 00:03:50,990 --> 00:03:53,680 if we're conditioning on x, this x 76 00:03:53,680 --> 00:03:55,640 just acts as a constant. 77 00:03:55,640 --> 00:03:58,555 It's not really going to change anything except shift 78 00:03:58,555 --> 00:04:02,400 the expectation of y by an amount of x. 79 00:04:02,400 --> 00:04:05,350 So what I'm saying in math terms is that this is actually 80 00:04:05,350 --> 00:04:11,820 just x plus the expectation of y given x. 81 00:04:11,820 --> 00:04:16,470 And now our conditional PDF comes into play. 82 00:04:16,470 --> 00:04:18,930 Conditioned on x, this is the PDF of y. 83 00:04:18,930 --> 00:04:21,899 And because it's uniformly distributed and because 84 00:04:21,899 --> 00:04:24,860 expectation acts like center of mass, we know that the 85 00:04:24,860 --> 00:04:27,800 expectation should be the midpoint, right? 86 00:04:27,800 --> 00:04:31,695 And so to compute this point, we simply take the average of 87 00:04:31,695 --> 00:04:35,990 the endpoints, x plus 1 plus x over 2, which gives us 88 00:04:35,990 --> 00:04:38,240 2x plus 1 over 2. 89 00:04:38,240 --> 00:04:45,805 So plugging this back up here, we get 2x/2 plus 2x plus 1 90 00:04:45,805 --> 00:04:53,960 over 2, which is 4x plus 1 over 2, or 2x plus 1/2. 91 00:04:53,960 --> 00:04:54,540 OK. 92 00:04:54,540 --> 00:04:58,810 So now I want to look at the next term, the next inner 93 00:04:58,810 --> 00:05:01,610 term, which is this guy. 94 00:05:01,610 --> 00:05:05,070 So this computation is going to be very 95 00:05:05,070 --> 00:05:07,320 similar in nature, actually. 96 00:05:07,320 --> 00:05:10,180 So we already discussed that the joint-- 97 00:05:10,180 --> 00:05:13,140 sorry, not the joint, the conditional PDF of y given x 98 00:05:13,140 --> 00:05:14,640 is this guy. 99 00:05:14,640 --> 00:05:18,520 So the variance of x plus y conditioned on x, we sort of 100 00:05:18,520 --> 00:05:20,940 have a similar phenomenon occurring. 101 00:05:20,940 --> 00:05:24,180 x now in this conditional world just acts like a 102 00:05:24,180 --> 00:05:28,250 constant that shifts the PDF but doesn't change the width 103 00:05:28,250 --> 00:05:30,700 of the distribution at all. 104 00:05:30,700 --> 00:05:35,550 So this is actually just equal to the variance of y given x, 105 00:05:35,550 --> 00:05:39,060 because constants don't affect the variance. 106 00:05:39,060 --> 00:05:42,480 And now we can look at this conditional PDF to figure out 107 00:05:42,480 --> 00:05:44,350 what this is. 108 00:05:44,350 --> 00:05:47,530 So we're going to take a quick tangent over here, and I'm 109 00:05:47,530 --> 00:05:51,070 just going to remind you guys that we have a formula for 110 00:05:51,070 --> 00:05:53,880 computing the variance of a random variable when it's 111 00:05:53,880 --> 00:05:57,420 uniformly distributed between two endpoints. 112 00:05:57,420 --> 00:06:02,270 So say we have a random variable whose PDF looks 113 00:06:02,270 --> 00:06:04,330 something like this. 114 00:06:04,330 --> 00:06:06,420 Let's call it, let's say, w. 115 00:06:06,420 --> 00:06:09,190 This is pww. 116 00:06:09,190 --> 00:06:13,510 We have a formula that says variance of w is equal to b 117 00:06:13,510 --> 00:06:17,000 minus a squared over 12. 118 00:06:17,000 --> 00:06:20,280 So we can apply that formula over here. 119 00:06:20,280 --> 00:06:23,000 b is x plus 1, a is x. 120 00:06:23,000 --> 00:06:26,220 So b minus a squared over 12 is just 1/12. 121 00:06:26,220 --> 00:06:29,150 So we get 1/12. 122 00:06:29,150 --> 00:06:32,250 So we're making good progress, because we have this inner 123 00:06:32,250 --> 00:06:34,460 quantity and this inner quantity. 124 00:06:34,460 --> 00:06:37,170 So now all we need to do is take the outer variance and 125 00:06:37,170 --> 00:06:39,430 the outer expectation. 126 00:06:39,430 --> 00:06:44,220 So writing this all down, we get variance of x plus y is 127 00:06:44,220 --> 00:06:51,390 equal to variance of this guy, 2x plus 1/2 plus the 128 00:06:51,390 --> 00:06:52,890 expectation of 1/12. 129 00:06:52,890 --> 00:06:55,720 130 00:06:55,720 --> 00:06:59,230 So this term is quite simple. 131 00:06:59,230 --> 00:07:02,120 We know that the expectation of a constant or of a scalar 132 00:07:02,120 --> 00:07:03,590 is simply that scalar. 133 00:07:03,590 --> 00:07:05,900 So this evaluates to 1/12. 134 00:07:05,900 --> 00:07:07,790 And this one is not bad either. 135 00:07:07,790 --> 00:07:12,340 So similar to our discussion up here, we know constants do 136 00:07:12,340 --> 00:07:13,930 not affect variance. 137 00:07:13,930 --> 00:07:16,180 You know they shift your distribution, they don't 138 00:07:16,180 --> 00:07:17,480 change the variance. 139 00:07:17,480 --> 00:07:19,760 So we can ignore the 1/2. 140 00:07:19,760 --> 00:07:21,550 This scaling factor of 2, however, 141 00:07:21,550 --> 00:07:23,670 will change the variance. 142 00:07:23,670 --> 00:07:25,190 But we know how to handle this already 143 00:07:25,190 --> 00:07:26,600 from previous lectures. 144 00:07:26,600 --> 00:07:30,850 We know that you can just take out this scalar scaling factor 145 00:07:30,850 --> 00:07:33,320 as long as we square it. 146 00:07:33,320 --> 00:07:36,820 So this becomes 2 squared, or 4 times the 147 00:07:36,820 --> 00:07:40,790 variance of x plus 1/12. 148 00:07:40,790 --> 00:07:43,410 And now to compute the variance of x, we're going to 149 00:07:43,410 --> 00:07:47,050 use that formula again, and we're 150 00:07:47,050 --> 00:07:48,670 going to use this picture. 151 00:07:48,670 --> 00:07:54,650 So here we have the joint PDF of x and y, but really we want 152 00:07:54,650 --> 00:07:57,810 now the PDF of x, so we can figure out what 153 00:07:57,810 --> 00:08:00,380 the variance is. 154 00:08:00,380 --> 00:08:02,800 So hopefully you remember a trick we taught you called 155 00:08:02,800 --> 00:08:04,360 marginalization. 156 00:08:04,360 --> 00:08:08,950 To get the PDF of x given a joint PDF, you simply 157 00:08:08,950 --> 00:08:12,090 marginalize over the values of y. 158 00:08:12,090 --> 00:08:17,750 So if you freeze x is equal to 0, you get the probability 159 00:08:17,750 --> 00:08:20,730 density line over x by integrating over this 160 00:08:20,730 --> 00:08:22,130 interval, over y. 161 00:08:22,130 --> 00:08:25,000 So if you integrate over this strip, you get 1. 162 00:08:25,000 --> 00:08:27,340 If you move x over a little bit and you integrate over 163 00:08:27,340 --> 00:08:29,210 this strip, you get 1. 164 00:08:29,210 --> 00:08:32,480 This is the argument I was making earlier that the width 165 00:08:32,480 --> 00:08:35,240 of this interval stays the same, and hence, the variance 166 00:08:35,240 --> 00:08:36,600 stays the same. 167 00:08:36,600 --> 00:08:40,590 So based on that argument, which was slightly hand wavy, 168 00:08:40,590 --> 00:08:42,059 let's come over here and draw it. 169 00:08:42,059 --> 00:08:44,860 170 00:08:44,860 --> 00:08:50,060 We're claiming that the PDF of x, px of x, looks like this. 171 00:08:50,060 --> 00:08:53,390 It's just uniformly distributed between 0 and 1. 172 00:08:53,390 --> 00:08:56,330 And if you buy that, then we're done, we're home free, 173 00:08:56,330 --> 00:08:59,380 because we can apply this formula, b minus a squared 174 00:08:59,380 --> 00:09:01,570 over 12, gives us the variance. 175 00:09:01,570 --> 00:09:04,620 So b is 1, a is 0, which gives variance of 176 00:09:04,620 --> 00:09:07,010 x is equal to 1/12. 177 00:09:07,010 --> 00:09:13,980 So coming back over here, we get 4 times 1/12 plus 1/12, 178 00:09:13,980 --> 00:09:16,100 which is 5/12. 179 00:09:16,100 --> 00:09:19,340 And that is our answer. 180 00:09:19,340 --> 00:09:22,340 So this problem was straightforward in the sense 181 00:09:22,340 --> 00:09:24,850 that our task was very clear. 182 00:09:24,850 --> 00:09:28,010 We had to compute this, and we had to do so by using the law 183 00:09:28,010 --> 00:09:29,730 of total variance. 184 00:09:29,730 --> 00:09:33,170 But we sort of reviewed a lot of concepts along the way. 185 00:09:33,170 --> 00:09:38,170 We saw how, given a joint PDF, you marginalize to 186 00:09:38,170 --> 00:09:39,950 get the PDF of x. 187 00:09:39,950 --> 00:09:44,580 We saw how constants don't change variance. 188 00:09:44,580 --> 00:09:47,780 We got a lot of practice finding conditional 189 00:09:47,780 --> 00:09:49,700 distributions and computing conditional 190 00:09:49,700 --> 00:09:51,670 expectations and variances. 191 00:09:51,670 --> 00:09:54,530 And we also saw this trick. 192 00:09:54,530 --> 00:09:58,040 And it might seem like cheating to memorize formulas, 193 00:09:58,040 --> 00:09:59,960 but there's a few important ones you should know. 194 00:09:59,960 --> 00:10:03,240 And it will help you sort of become faster at doing 195 00:10:03,240 --> 00:10:04,170 computations. 196 00:10:04,170 --> 00:10:06,200 And that's important, especially if you 197 00:10:06,200 --> 00:10:08,390 guys take the exams. 198 00:10:08,390 --> 00:10:09,020 So that's it. 199 00:10:09,020 --> 00:10:10,270 See you next time. 200 00:10:10,270 --> 00:10:11,220