1 00:00:00,820 --> 00:00:03,680 In this segment, we discuss the expected value rule for 2 00:00:03,680 --> 00:00:05,890 calculating the expected value of a 3 00:00:05,890 --> 00:00:08,000 function of a random variable. 4 00:00:08,000 --> 00:00:11,180 It corresponds to a nice formula that we will see 5 00:00:11,180 --> 00:00:15,220 shortly, but it also involves a much more general idea that 6 00:00:15,220 --> 00:00:18,970 we will encounter many times in this course, in somewhat 7 00:00:18,970 --> 00:00:20,480 different forms. 8 00:00:20,480 --> 00:00:22,920 Here's what it is all about. 9 00:00:22,920 --> 00:00:27,910 We start with a certain random variable that has a known PMF. 10 00:00:27,910 --> 00:00:30,720 However, we're ultimately interested in another random 11 00:00:30,720 --> 00:00:34,190 variable Y, which is defined as a function of the original 12 00:00:34,190 --> 00:00:35,460 random variable. 13 00:00:35,460 --> 00:00:38,130 We're interested in calculating the expected value 14 00:00:38,130 --> 00:00:42,130 of this new random variable, Y. How should we do it? 15 00:00:42,130 --> 00:00:45,150 We will illustrate the ideas involved through a simple 16 00:00:45,150 --> 00:00:46,670 numerical example. 17 00:00:46,670 --> 00:00:49,810 In this example, we have a random variable, X, that takes 18 00:00:49,810 --> 00:00:52,910 values 2, 3, 4, or 5, according to some given 19 00:00:52,910 --> 00:00:54,140 probabilities. 20 00:00:54,140 --> 00:00:56,570 We are also given a function that maps 21 00:00:56,570 --> 00:00:59,300 x-values into y-values. 22 00:00:59,300 --> 00:01:04,080 And this function, g, then defines a new random variable. 23 00:01:04,080 --> 00:01:07,550 So if the outcome of the experiment leads to an X equal 24 00:01:07,550 --> 00:01:11,200 to 4, then the random variable, Y, will also take a 25 00:01:11,200 --> 00:01:13,130 value equal to 4. 26 00:01:13,130 --> 00:01:16,200 How do we calculate the expected value of Y? 27 00:01:16,200 --> 00:01:18,720 The only tool that we have available in our hands at this 28 00:01:18,720 --> 00:01:22,310 point is the definition of the expected value, which tells us 29 00:01:22,310 --> 00:01:26,620 that we should run a summation over the y-axis, consider 30 00:01:26,620 --> 00:01:29,520 different values of y one at the time. 31 00:01:29,520 --> 00:01:33,640 And for each value for y, multiply that value by its 32 00:01:33,640 --> 00:01:35,430 corresponding probability. 33 00:01:35,430 --> 00:01:39,640 So in this case, we start with Y equal to 3, which needs to 34 00:01:39,640 --> 00:01:41,870 be multiplied by the probability that 35 00:01:41,870 --> 00:01:43,620 Y is equal to 3. 36 00:01:43,620 --> 00:01:45,150 What is that probability? 37 00:01:45,150 --> 00:01:49,850 Well, Y is equal to three, if and only if X is 2 or 3, which 38 00:01:49,850 --> 00:01:56,430 happens with probability, 0.1 plus 0.2. 39 00:01:56,430 --> 00:01:59,400 Then we continue with the summation by considering the 40 00:01:59,400 --> 00:02:01,790 next value of little y. 41 00:02:01,790 --> 00:02:04,280 The next possible value is 4. 42 00:02:04,280 --> 00:02:08,009 And this gives us a contribution of 4, weighted by 43 00:02:08,009 --> 00:02:10,490 the probability of obtaining a 4. 44 00:02:10,490 --> 00:02:13,340 The probability that Y is equal to 4 is the probability 45 00:02:13,340 --> 00:02:17,579 that X is either equal to 4 or to 5, which happens with 46 00:02:17,579 --> 00:02:18,340 probability. 47 00:02:18,340 --> 00:02:20,750 0.3 plus 0.4. 48 00:02:20,750 --> 00:02:24,660 So this way, we obtain an arithmetic expression which we 49 00:02:24,660 --> 00:02:25,890 can evaluate. 50 00:02:25,890 --> 00:02:29,480 And it's going to give us the expected value of Y. But 51 00:02:29,480 --> 00:02:31,680 here's an alternative way of calculating 52 00:02:31,680 --> 00:02:33,400 the expected value. 53 00:02:33,400 --> 00:02:38,310 And this corresponds to the following type of thinking. 54 00:02:38,310 --> 00:02:42,260 10% of the time, X is going to be equal to 2. 55 00:02:42,260 --> 00:02:46,079 And when that happens, Y takes on a value of 3. 56 00:02:46,079 --> 00:02:49,930 So this should give us a contribution to the average 57 00:02:49,930 --> 00:02:55,430 value of Y, which is 3 times 0.1. 58 00:02:55,430 --> 00:03:02,590 Then, 20% of the time, X is 3 and Y is also 3. 59 00:03:02,590 --> 00:03:09,980 So 20% of the time, we also get 3's in Y. 60 00:03:09,980 --> 00:03:15,790 Then 30% of the time, X is 4, which results in a Y that's 61 00:03:15,790 --> 00:03:17,280 equal to 4. 62 00:03:17,280 --> 00:03:21,620 So we obtain a 4 30% of the time. 63 00:03:21,620 --> 00:03:26,440 And finally, 40% of the time, X equals to [5], which results 64 00:03:26,440 --> 00:03:30,329 in a Y equal to 4. 65 00:03:30,329 --> 00:03:33,710 And we obtain this arithmetic expression. 66 00:03:33,710 --> 00:03:36,600 Now you can compare the two arithmetic expressions, the 67 00:03:36,600 --> 00:03:39,940 red and the blue one, and you will notice that they're 68 00:03:39,940 --> 00:03:43,470 equal, except that the terms are arranged in a slightly 69 00:03:43,470 --> 00:03:44,880 different way. 70 00:03:44,880 --> 00:03:48,210 Conceptually, however, there's a very big difference. 71 00:03:48,210 --> 00:03:50,920 In the first summation, we run over the values of 72 00:03:50,920 --> 00:03:53,079 Y one at the time. 73 00:03:53,079 --> 00:03:56,630 In the second summation, we run over the different values 74 00:03:56,630 --> 00:04:01,320 of X one at a time, and took into account their individual 75 00:04:01,320 --> 00:04:02,990 contributions. 76 00:04:02,990 --> 00:04:07,500 This second way of calculating the expected value of Y is 77 00:04:07,500 --> 00:04:09,710 called the expected value rule. 78 00:04:09,710 --> 00:04:13,560 And it corresponds to the following formula. 79 00:04:13,560 --> 00:04:17,390 We carry out a summation over the x-axis. 80 00:04:17,390 --> 00:04:22,330 For each x-value that we consider, we calculate what is 81 00:04:22,330 --> 00:04:27,830 the corresponding y-value, that's g of x, and also weigh 82 00:04:27,830 --> 00:04:30,620 this term according to the probability of 83 00:04:30,620 --> 00:04:32,510 this particular x. 84 00:04:32,510 --> 00:04:37,710 So for instance, a typical term here would be when x is 85 00:04:37,710 --> 00:04:42,440 equal to 2, g of x would be equal to 3. 86 00:04:42,440 --> 00:04:44,480 And the corresponding probability, that's the 87 00:04:44,480 --> 00:04:49,560 probability of a 2, would be 0.1. 88 00:04:49,560 --> 00:04:52,790 The advantage of using the expected value rule instead of 89 00:04:52,790 --> 00:04:55,180 the definition of the expectation is that the 90 00:04:55,180 --> 00:04:58,510 expected value rule only involves the PMF of the 91 00:04:58,510 --> 00:05:02,540 original random variable, so we do not need to do any 92 00:05:02,540 --> 00:05:05,295 additional work to find the PMF of 93 00:05:05,295 --> 00:05:08,230 the new random variable. 94 00:05:08,230 --> 00:05:13,610 Now we argued in favor of the expected value rule by 95 00:05:13,610 --> 00:05:16,450 considering this numerical example, and by checking that 96 00:05:16,450 --> 00:05:18,730 it gives the right result. 97 00:05:18,730 --> 00:05:20,520 But now let us verify. 98 00:05:20,520 --> 00:05:23,760 Let us argue more generally that it's going to give us the 99 00:05:23,760 --> 00:05:25,410 right answer. 100 00:05:25,410 --> 00:05:30,300 So what we're going to do is to take this summation and 101 00:05:30,300 --> 00:05:34,980 argue that it's equal to the expected value of Y, which is 102 00:05:34,980 --> 00:05:37,040 defined by that summation. 103 00:05:37,040 --> 00:05:38,510 So let us start with this. 104 00:05:38,510 --> 00:05:41,790 It's a sum over all x's. 105 00:05:41,790 --> 00:05:48,920 Let us first fix a particular value of y, and add over all 106 00:05:48,920 --> 00:05:52,970 those x's that correspond to that particular y. 107 00:05:52,970 --> 00:05:56,440 So we're fixing a particular y. 108 00:05:56,440 --> 00:06:02,320 And so we're adding only over those x's that lead to that 109 00:06:02,320 --> 00:06:04,170 particular y. 110 00:06:04,170 --> 00:06:06,085 And we carry out to the summation. 111 00:06:09,680 --> 00:06:13,500 So this is the part of this sum associated with one 112 00:06:13,500 --> 00:06:16,620 particular choice of y. 113 00:06:16,620 --> 00:06:20,000 And it's a sum, really, over this set of x's. 114 00:06:20,000 --> 00:06:24,370 But in order to exhaust all x's, we need to consider all 115 00:06:24,370 --> 00:06:26,930 possible values of y. 116 00:06:26,930 --> 00:06:31,690 And this gives rise to an outer summation over the 117 00:06:31,690 --> 00:06:33,130 different y's. 118 00:06:33,130 --> 00:06:37,630 So for any fixed y, we add over the associated x's. 119 00:06:37,630 --> 00:06:40,250 But we want to consider all the possible y's. 120 00:06:43,010 --> 00:06:47,960 Now at this point, we make the following observation. 121 00:06:50,460 --> 00:06:53,460 Here, we have a summation over y's. 122 00:06:53,460 --> 00:06:57,710 And let's look at the inner summation. 123 00:06:57,710 --> 00:07:03,520 The inner summation involves x's, all of which are 124 00:07:03,520 --> 00:07:07,140 associated with a specific value of y. 125 00:07:07,140 --> 00:07:13,360 Having fixed y, all the terms inside this sum have the 126 00:07:13,360 --> 00:07:16,290 property that g of x is equal to y. 127 00:07:16,290 --> 00:07:20,960 So g of x is equal to that particular y. 128 00:07:20,960 --> 00:07:25,710 And we can make this substitution here. 129 00:07:25,710 --> 00:07:28,510 Now when we look at this summation, we now realize that 130 00:07:28,510 --> 00:07:33,470 it's a summation over x's while y is being fixed. 131 00:07:33,470 --> 00:07:37,690 And so we can take this term of y and pull 132 00:07:37,690 --> 00:07:39,165 it outside the summation. 133 00:07:42,620 --> 00:07:50,590 What this leaves us with is a sum over all y's of y, and 134 00:07:50,590 --> 00:07:54,930 then a further sum over all x's that lead to that 135 00:07:54,930 --> 00:07:59,590 particular y, of the probabilities of those x's. 136 00:08:04,640 --> 00:08:09,850 Now what can we say about this inner summation? 137 00:08:09,850 --> 00:08:12,960 We have fixed a y. 138 00:08:12,960 --> 00:08:16,050 For that particular y, we're adding the probabilities of 139 00:08:16,050 --> 00:08:19,910 all the x's that lead to that particular y. 140 00:08:19,910 --> 00:08:23,820 Fixing y, consider all the x's that lead to it. 141 00:08:23,820 --> 00:08:27,100 This is just the probability of that particular y. 142 00:08:29,740 --> 00:08:33,549 But what we have now is just the definition of the expected 143 00:08:33,549 --> 00:08:38,840 value of Y. And this concludes the proof that this 144 00:08:38,840 --> 00:08:42,490 expression, as given by the expected value rule, gives us 145 00:08:42,490 --> 00:08:45,530 the same answer as the original definition of the 146 00:08:45,530 --> 00:08:48,110 expected value of Y. 147 00:08:48,110 --> 00:08:50,860 Now before closing, a few observations. 148 00:08:50,860 --> 00:08:54,350 The expected value rule is really simple to use. 149 00:08:54,350 --> 00:08:56,890 For example, if you want to calculate the expected value 150 00:08:56,890 --> 00:09:00,330 of the square of a random variable, then you're dealing 151 00:09:00,330 --> 00:09:03,560 with a situation where the g function 152 00:09:03,560 --> 00:09:07,350 is the square function. 153 00:09:07,350 --> 00:09:11,790 And so, the expected value of X-squared will be the sum over 154 00:09:11,790 --> 00:09:16,700 x's of x squared weighted according to the probability 155 00:09:16,700 --> 00:09:19,260 of a particular x. 156 00:09:19,260 --> 00:09:22,950 And finally, one important word of caution, that in 157 00:09:22,950 --> 00:09:26,090 general, the expected value of the function-- 158 00:09:26,090 --> 00:09:29,750 so for example, the expected value of X-squared. 159 00:09:29,750 --> 00:09:35,480 In general, it's not going to be the same as taking the 160 00:09:35,480 --> 00:09:40,620 expected value of X and squaring it. 161 00:09:40,620 --> 00:09:43,800 So this is a word [of] caution, that in general, you 162 00:09:43,800 --> 00:09:46,960 cannot interchange the order with which you apply a 163 00:09:46,960 --> 00:09:50,820 function, and then you calculate expectation. 164 00:09:50,820 --> 00:09:54,060 There are exceptions, however, in which we happen to have 165 00:09:54,060 --> 00:09:55,400 equality here. 166 00:09:55,400 --> 00:09:57,320 And this is going to be our next topic.