1 00:00:01,370 --> 00:00:04,160 In this segment, we point out and discuss 2 00:00:04,160 --> 00:00:07,330 some important but also intuitive properties 3 00:00:07,330 --> 00:00:09,580 of conditional expectations. 4 00:00:09,580 --> 00:00:13,900 The first property is the one that is written up here. 5 00:00:13,900 --> 00:00:16,300 What is the intuitive meaning? 6 00:00:16,300 --> 00:00:21,140 If you condition on Y, then the value of Y is known, 7 00:00:21,140 --> 00:00:23,710 and so g of Y is also known. 8 00:00:23,710 --> 00:00:25,340 There's no randomness in it. 9 00:00:25,340 --> 00:00:28,400 It can be treated as a constant, and therefore it 10 00:00:28,400 --> 00:00:31,860 can be pulled outside the expectation. 11 00:00:31,860 --> 00:00:33,560 So that's the intuition. 12 00:00:33,560 --> 00:00:37,160 How does one establish such a result formally? 13 00:00:37,160 --> 00:00:38,850 Let us take the discrete case. 14 00:00:38,850 --> 00:00:43,250 So let us assume that X and Y are both discrete. 15 00:00:45,850 --> 00:00:49,270 What does it take to establish a fact of this kind? 16 00:00:49,270 --> 00:00:52,410 We want to show that two random variables are equal, 17 00:00:52,410 --> 00:00:54,830 and that amounts to the following. 18 00:00:54,830 --> 00:00:58,790 We consider an outcome of the experiment, 19 00:00:58,790 --> 00:01:00,880 and we want to show that whatever 20 00:01:00,880 --> 00:01:03,100 the outcome of the experiment is, 21 00:01:03,100 --> 00:01:06,480 these two random variables will be the same. 22 00:01:06,480 --> 00:01:16,370 So let us consider an outcome for which the random variable, 23 00:01:16,370 --> 00:01:22,480 Y, takes a specific value, little y. 24 00:01:22,480 --> 00:01:25,950 And of course, this has to be a specific little y that 25 00:01:25,950 --> 00:01:27,630 is possible. 26 00:01:27,630 --> 00:01:33,759 Otherwise, conditioning on that event would not be meaningful. 27 00:01:33,759 --> 00:01:39,500 So if an outcome has this value for the random variable, 28 00:01:39,500 --> 00:01:43,740 Y, then what does the random variable do? 29 00:01:43,740 --> 00:01:48,150 This is, by definition, the random variable 30 00:01:48,150 --> 00:01:51,080 that takes the value-- expected value 31 00:01:51,080 --> 00:01:57,770 of g Y X, conditional expectation, 32 00:01:57,770 --> 00:02:01,610 given that capital Y took on this value. 33 00:02:01,610 --> 00:02:04,220 This was our definition of the concept 34 00:02:04,220 --> 00:02:06,610 of the abstract conditional expectation 35 00:02:06,610 --> 00:02:08,030 as a random variable. 36 00:02:08,030 --> 00:02:10,990 This is the random variable that takes 37 00:02:10,990 --> 00:02:13,430 this specific numerical value whenever 38 00:02:13,430 --> 00:02:17,320 the random variable, capital Y, takes the value, little y. 39 00:02:17,320 --> 00:02:22,860 And similarly, if the random variable, capital Y, 40 00:02:22,860 --> 00:02:27,120 takes the value, little y, this random variable here 41 00:02:27,120 --> 00:02:33,800 is the expected value of X, given that Y is little y. 42 00:02:33,800 --> 00:02:36,190 And when capital Y takes the value, 43 00:02:36,190 --> 00:02:39,110 little y, this function, g of Y, takes 44 00:02:39,110 --> 00:02:42,990 on this particular numerical value. 45 00:02:42,990 --> 00:02:45,720 So we want to show that these two expressions 46 00:02:45,720 --> 00:02:49,500 we will be equal no matter what capital Y is. 47 00:02:49,500 --> 00:02:52,370 Now, when we place ourselves in a conditional universe, where 48 00:02:52,370 --> 00:02:56,280 capital Y takes a value, little y, 49 00:02:56,280 --> 00:02:59,660 then the joint PMF of X and Y gets 50 00:02:59,660 --> 00:03:03,500 concentrated on those values of capital Y 51 00:03:03,500 --> 00:03:06,080 that obey this relation. 52 00:03:06,080 --> 00:03:10,190 So conditioned on this event, capital Y 53 00:03:10,190 --> 00:03:13,490 is, with certainty, equal to little y. 54 00:03:13,490 --> 00:03:15,970 Therefore, this random variable here, 55 00:03:15,970 --> 00:03:20,530 in the conditional universe, is the same as this number. 56 00:03:27,210 --> 00:03:29,750 But now, since this is a number, it 57 00:03:29,750 --> 00:03:31,825 can be pulled outside the expectation. 58 00:03:38,740 --> 00:03:43,880 So we have concluded that for any outcome for which 59 00:03:43,880 --> 00:03:47,600 the random variable, capital Y, takes this specific value, 60 00:03:47,600 --> 00:03:52,260 little y, this random variable takes this value. 61 00:03:52,260 --> 00:03:55,110 This random variable takes this value. 62 00:03:55,110 --> 00:03:56,970 They are the same. 63 00:03:56,970 --> 00:04:00,050 So no matter what the outcome is, 64 00:04:00,050 --> 00:04:02,740 these two random variables take the same value, 65 00:04:02,740 --> 00:04:08,130 and therefore they are the same random variables. 66 00:04:08,130 --> 00:04:13,290 Now, this is a correct proof if the random variables 67 00:04:13,290 --> 00:04:14,625 are discrete. 68 00:04:14,625 --> 00:04:18,470 If the random variables are continuous or general, 69 00:04:18,470 --> 00:04:23,160 then carrying out a rigorous proof is actually quite subtle, 70 00:04:23,160 --> 00:04:26,430 and it is beyond our scope. 71 00:04:26,430 --> 00:04:29,960 However, the intuition is still correct, 72 00:04:29,960 --> 00:04:31,990 and the result is correct. 73 00:04:31,990 --> 00:04:36,780 And we will be using it freely whenever we need to. 74 00:04:36,780 --> 00:04:39,980 Let us now move to a second observation. 75 00:04:39,980 --> 00:04:43,260 Suppose that h is an invertible function. 76 00:04:43,260 --> 00:04:44,260 What does that mean? 77 00:04:44,260 --> 00:04:46,770 That if I give you the value of h, 78 00:04:46,770 --> 00:04:49,720 you can tell me the value of the argument. 79 00:04:49,720 --> 00:04:54,980 So in some sense, Y and h of Y can 80 00:04:54,980 --> 00:04:56,490 be recovered from each other. 81 00:04:56,490 --> 00:04:59,900 If I give you Y, you can calculate h of Y. 82 00:04:59,900 --> 00:05:04,870 But also, if I give you h of Y, you can figure out what Y was. 83 00:05:04,870 --> 00:05:09,890 An example could be the function, h of Y, 84 00:05:09,890 --> 00:05:14,980 equals Y to the third power. 85 00:05:14,980 --> 00:05:18,220 If I tell you the value of Y, you know Y cubed. 86 00:05:18,220 --> 00:05:21,710 But if I tell you Y cubed, you can also figure out Y. 87 00:05:21,710 --> 00:05:27,180 So Y and Y cubed carry exactly the same information. 88 00:05:27,180 --> 00:05:30,740 In that case, the conditional expectation-- 89 00:05:30,740 --> 00:05:33,930 what you expect, on the average, X to be-- if I tell you 90 00:05:33,930 --> 00:05:39,159 the value of Y, should be the same as what you would expect 91 00:05:39,159 --> 00:05:43,670 X to be if I give you the value of, let's say, Y cubed. 92 00:05:43,670 --> 00:05:47,060 In both cases, I'm giving you the same amount of information, 93 00:05:47,060 --> 00:05:50,520 so the conditional distribution of X should be the same. 94 00:05:50,520 --> 00:05:54,060 And the conditional expectations should also be the same. 95 00:05:54,060 --> 00:05:57,730 So this is, again, a very intuitive fact. 96 00:05:57,730 --> 00:06:00,470 How do we verify that this fact is true? 97 00:06:00,470 --> 00:06:03,610 Using the same method as before. 98 00:06:03,610 --> 00:06:09,950 So fix some particular outcome for which 99 00:06:09,950 --> 00:06:14,570 the random variable, capital Y, takes a specific value, little 100 00:06:14,570 --> 00:06:15,070 y. 101 00:06:18,610 --> 00:06:22,080 When that happens, this random variable 102 00:06:22,080 --> 00:06:25,610 will take this value here. 103 00:06:25,610 --> 00:06:28,740 That's just by the definition of conditional expectation. 104 00:06:28,740 --> 00:06:31,720 This is the random variable that takes this value whenever 105 00:06:31,720 --> 00:06:34,070 capital Y happens to be equal to little y. 106 00:06:36,850 --> 00:06:41,650 In that case, we also have that h of capital 107 00:06:41,650 --> 00:06:46,025 Y takes on a specific value, h of little y. 108 00:06:48,659 --> 00:06:54,440 When this random variable takes this specific value, 109 00:06:54,440 --> 00:07:03,810 this random variable here will take a value of this kind. 110 00:07:07,830 --> 00:07:11,350 So this is the random variable that 111 00:07:11,350 --> 00:07:15,800 takes this value whenever h of capital Y 112 00:07:15,800 --> 00:07:19,460 happens to be this specific number. 113 00:07:19,460 --> 00:07:25,510 But now, the event that h of Y takes this specific value, 114 00:07:25,510 --> 00:07:28,350 because the function, h, is invertible, 115 00:07:28,350 --> 00:07:35,750 is identical to the event that Y takes that particular value. 116 00:07:35,750 --> 00:07:39,620 And so, since this event is identical to that event, 117 00:07:39,620 --> 00:07:42,207 the conditional probabilities, given this event, 118 00:07:42,207 --> 00:07:44,540 would be the same as the conditional probabilities given 119 00:07:44,540 --> 00:07:45,560 that the event. 120 00:07:45,560 --> 00:07:48,720 And therefore, the conditional expectations 121 00:07:48,720 --> 00:07:50,130 would also be the same. 122 00:07:52,800 --> 00:07:54,780 Once more, this is a proof that's 123 00:07:54,780 --> 00:07:57,110 entirely rigorous if we are dealing 124 00:07:57,110 --> 00:07:59,630 with discrete random variables, although 125 00:07:59,630 --> 00:08:01,640 in the continuous case, there could 126 00:08:01,640 --> 00:08:03,870 be some subtleties involved. 127 00:08:03,870 --> 00:08:07,040 However, the result is true in general. 128 00:08:07,040 --> 00:08:10,910 The technical details are beyond our scope.