1 00:00:00,840 --> 00:00:04,950 We have introduced the concept of expected value or mean, 2 00:00:04,950 --> 00:00:08,670 which tells us the average value of a random variable. 3 00:00:08,670 --> 00:00:13,170 We will now introduce another quantity, the variance, which 4 00:00:13,170 --> 00:00:15,700 quantifies the spread of the 5 00:00:15,700 --> 00:00:18,530 distribution of a random variable. 6 00:00:18,530 --> 00:00:25,100 So consider a random variable with a given PMF, for example 7 00:00:25,100 --> 00:00:27,640 like the PMF shown in this diagram. 8 00:00:31,930 --> 00:00:37,310 And consider another random variable that happens to have 9 00:00:37,310 --> 00:00:40,280 the same mean, but it's distribution 10 00:00:40,280 --> 00:00:42,200 is more spread out. 11 00:00:45,100 --> 00:00:49,090 So both random variables have the same mean, which we denote 12 00:00:49,090 --> 00:00:53,880 by mu, and which in this picture would be somewhere 13 00:00:53,880 --> 00:00:55,130 around here. 14 00:00:58,310 --> 00:01:03,840 However, the second PMF, the blue PMF, has typical outcomes 15 00:01:03,840 --> 00:01:08,020 that tend to have a larger distance from the mean. 16 00:01:08,020 --> 00:01:13,920 By distance from the mean what we mean is that if the result 17 00:01:13,920 --> 00:01:17,289 of the random variable, its numerical value, happens to 18 00:01:17,289 --> 00:01:23,230 be, let's say for example, this one, then this quantity 19 00:01:23,230 --> 00:01:27,700 here is X minus mu is the distance from the mean, how 20 00:01:27,700 --> 00:01:31,440 far away the outcome of the random variable happens to be 21 00:01:31,440 --> 00:01:34,400 from the mean of that random variable. 22 00:01:34,400 --> 00:01:37,820 Of course, the distance from the mean is a random quantity. 23 00:01:37,820 --> 00:01:39,520 It is a random variable. 24 00:01:39,520 --> 00:01:43,210 Its value is determined once we know the outcome of the 25 00:01:43,210 --> 00:01:47,140 experiment and the value of the random variable. 26 00:01:47,140 --> 00:01:50,640 What can we say about the distance from the mean. 27 00:01:50,640 --> 00:01:55,390 Let us calculate its average or expected value. 28 00:01:55,390 --> 00:01:58,840 The expected value of the distance from the mean, which 29 00:01:58,840 --> 00:02:03,290 is this quantity, using the linearity of expectations, is 30 00:02:03,290 --> 00:02:08,288 equal to the expected value of X minus the constant mu. 31 00:02:08,288 --> 00:02:12,040 But the expected value is by definition equal to mu. 32 00:02:12,040 --> 00:02:15,420 And so we obtain zero. 33 00:02:15,420 --> 00:02:18,090 So we see that the average value of the distance from the 34 00:02:18,090 --> 00:02:19,890 mean is always zero. 35 00:02:19,890 --> 00:02:22,240 And so it is uninformative. 36 00:02:22,240 --> 00:02:26,360 What we really want is the average absolute value of the 37 00:02:26,360 --> 00:02:30,750 distance from the mean, or something with this flavor. 38 00:02:30,750 --> 00:02:33,700 Mathematically, it turns out that the average of the 39 00:02:33,700 --> 00:02:37,690 squared distance from the mean is a better behaved 40 00:02:37,690 --> 00:02:39,280 mathematical object. 41 00:02:39,280 --> 00:02:42,220 And this is the quantity that we will consider. 42 00:02:42,220 --> 00:02:43,490 It has a name. 43 00:02:43,490 --> 00:02:45,410 It is called the variance. 44 00:02:45,410 --> 00:02:50,190 And it is defined as the expected value of the squared 45 00:02:50,190 --> 00:02:53,050 distance from the mean. 46 00:02:53,050 --> 00:02:56,350 The first thing to note is that the variance is always 47 00:02:56,350 --> 00:02:57,600 non-negative. 48 00:02:59,990 --> 00:03:04,400 This is because it is the expected value of non-negative 49 00:03:04,400 --> 00:03:06,590 quantities. 50 00:03:06,590 --> 00:03:10,040 How exactly do we computer the variance? 51 00:03:10,040 --> 00:03:15,440 The squared distance from the mean is really a function of 52 00:03:15,440 --> 00:03:24,010 the random variable X. So it is a function of the form g of 53 00:03:24,010 --> 00:03:30,490 X, where g is a particular function defined this way. 54 00:03:37,530 --> 00:03:42,460 So we can use the expected value rule applied to this 55 00:03:42,460 --> 00:03:44,410 particular function g. 56 00:03:44,410 --> 00:03:46,700 And we obtain the following. 57 00:03:55,310 --> 00:04:00,790 So what we have to do is to go over all numerical values of 58 00:04:00,790 --> 00:04:05,310 the random variable X. For each one, calculate its 59 00:04:05,310 --> 00:04:10,630 squared distance from the mean and weigh that quantity 60 00:04:10,630 --> 00:04:15,190 according to the corresponding probability of that particular 61 00:04:15,190 --> 00:04:16,440 numerical value. 62 00:04:19,149 --> 00:04:22,860 One final comment, the variance is a bit hard to 63 00:04:22,860 --> 00:04:25,865 interpret, because it is in the wrong units. 64 00:04:25,865 --> 00:04:29,790 If capital X corresponds to meters, then the variance has 65 00:04:29,790 --> 00:04:32,590 units of meters squared. 66 00:04:32,590 --> 00:04:36,450 A more intuitive quantity is the square root of the 67 00:04:36,450 --> 00:04:39,970 variance, which is called the standard deviation. 68 00:04:39,970 --> 00:04:43,280 It has the same units as the random variable and captures 69 00:04:43,280 --> 00:04:44,845 the width of the distribution. 70 00:04:47,490 --> 00:04:50,360 Let us now take a quick look at some of the properties of 71 00:04:50,360 --> 00:04:51,380 the variance. 72 00:04:51,380 --> 00:04:54,710 We know that expectations have a linearity property. 73 00:04:54,710 --> 00:04:57,300 Is this the case for the variance as well? 74 00:04:57,300 --> 00:04:58,500 Not quite. 75 00:04:58,500 --> 00:05:01,820 Instead we have this relation for the variance of a linear 76 00:05:01,820 --> 00:05:03,880 function of a random variable. 77 00:05:03,880 --> 00:05:06,920 Let us see why it is true. 78 00:05:06,920 --> 00:05:11,590 We use the shorthand notation mu for the expected value of 79 00:05:11,590 --> 00:05:15,950 X. We will proceed one step at a time and first consider what 80 00:05:15,950 --> 00:05:18,160 happens to the variance if we add the 81 00:05:18,160 --> 00:05:20,470 constant to a random variable. 82 00:05:20,470 --> 00:05:26,040 So let Y be X plus some constant b. 83 00:05:26,040 --> 00:05:31,120 And let us just define nu to be the expected value of Y, 84 00:05:31,120 --> 00:05:34,450 which, using linearity of expectations, is the expected 85 00:05:34,450 --> 00:05:37,470 value of X plus b. 86 00:05:37,470 --> 00:05:40,030 Let us now calculate the variance. 87 00:05:40,030 --> 00:05:45,890 By definition the variance of Y is the expected value of the 88 00:05:45,890 --> 00:05:50,170 distance squared of Y from its mean. 89 00:05:53,170 --> 00:05:58,290 Now we substitute, because in this case Y is 90 00:05:58,290 --> 00:06:00,700 equal to X plus b. 91 00:06:00,700 --> 00:06:05,170 Whereas the mean, nu, is mu plus b. 92 00:06:10,790 --> 00:06:16,890 And now we notice that this b cancels with that b. 93 00:06:16,890 --> 00:06:25,080 And we are left with the expected value of X minus mu 94 00:06:25,080 --> 00:06:34,190 squared, which is just the variance of X. So this proves 95 00:06:34,190 --> 00:06:38,020 this relation for the case where a is equal to 1. 96 00:06:38,020 --> 00:06:43,030 The variance of X plus b is equal to the variance of X. So 97 00:06:43,030 --> 00:06:46,960 we see that when we add a constant to a random variable, 98 00:06:46,960 --> 00:06:49,300 the variance remains unchanged. 99 00:06:49,300 --> 00:06:53,890 Intuitively, adding a constant just moves the entire PMF 100 00:06:53,890 --> 00:06:56,880 right or left by some amount, but without 101 00:06:56,880 --> 00:06:58,750 changing its shape. 102 00:06:58,750 --> 00:07:03,370 And so the spread of this PMF remains unchanged. 103 00:07:03,370 --> 00:07:07,080 Let us now see what happens if we multiply a random variable 104 00:07:07,080 --> 00:07:08,866 by a constant. 105 00:07:08,866 --> 00:07:14,920 Let again nu be the expected value of Y. And so in this 106 00:07:14,920 --> 00:07:19,000 case by linearity this is equal to a times the expected 107 00:07:19,000 --> 00:07:22,880 value of X. So it is a times mu. 108 00:07:22,880 --> 00:07:29,220 We calculate the variance once more using the definition and 109 00:07:29,220 --> 00:07:33,490 substituting in the place of Y what Y is in this case-- 110 00:07:33,490 --> 00:07:34,909 it's aX-- 111 00:07:34,909 --> 00:07:40,770 and subtracting the mean of Y, which is a mu, squared. 112 00:07:40,770 --> 00:07:44,170 We take out a factor of a squared. 113 00:07:47,150 --> 00:07:51,909 And then we use linearity of expectations to note that this 114 00:07:51,909 --> 00:07:55,750 is a squared times the expected value of X minus mu 115 00:07:55,750 --> 00:08:04,050 squared, which is a squared times the variance of X. 116 00:08:04,050 --> 00:08:07,680 So this establishes this formula for the case where b 117 00:08:07,680 --> 00:08:09,260 equals zero. 118 00:08:09,260 --> 00:08:12,900 Putting together these two facts, if we multiply a random 119 00:08:12,900 --> 00:08:19,080 variable by a, the variance gets multiplied by a squared. 120 00:08:19,080 --> 00:08:22,510 And if we add a constant, the variance doesn't change. 121 00:08:22,510 --> 00:08:26,720 And this establishes this particular fact. 122 00:08:26,720 --> 00:08:38,159 As an example, the variance of, let's say, 3 minus 4X is 123 00:08:38,159 --> 00:08:45,100 going to be equal minus 4 squared times the variance of 124 00:08:45,100 --> 00:08:54,230 X, which is 16 times the variance of X. 125 00:08:54,230 --> 00:08:58,100 Finally, let me mention an alternative way of computing 126 00:08:58,100 --> 00:09:03,260 variances, which is often a bit quicker. 127 00:09:03,260 --> 00:09:06,020 We have this useful formula here. 128 00:09:06,020 --> 00:09:09,770 We will see later a few examples of how it is used, 129 00:09:09,770 --> 00:09:15,180 but for now let me just show why it is true. 130 00:09:15,180 --> 00:09:21,800 We have by definition that the variance of X is the expected 131 00:09:21,800 --> 00:09:27,410 value of X minus mu squared. 132 00:09:27,410 --> 00:09:32,700 Now let us rewrite what is inside the expectation by just 133 00:09:32,700 --> 00:09:35,495 expanding this square, which is [X squared minus] 134 00:09:35,495 --> 00:09:40,900 2 mu X plus mu squared. 135 00:09:40,900 --> 00:09:44,290 Using linearity of expectations, this is broken 136 00:09:44,290 --> 00:09:52,340 down into expected value of X squared minus the expected 137 00:09:52,340 --> 00:09:56,000 value of two times mu X. But mu is a constant. 138 00:09:56,000 --> 00:09:58,450 So we can take it outside the expected value. 139 00:09:58,450 --> 00:10:01,480 And we're left with 2mu expected value 140 00:10:01,480 --> 00:10:09,160 of X plus mu squared. 141 00:10:09,160 --> 00:10:13,970 But remember that mu is just the same as the expected value 142 00:10:13,970 --> 00:10:18,480 of X. So what we have here is twice the expected value of X, 143 00:10:18,480 --> 00:10:22,840 squared, plus the expected value of X, squared, and that 144 00:10:22,840 --> 00:10:27,730 leaves us just minus the expected value of X, squared. 145 00:10:35,090 --> 00:10:38,280 So we will now move in the next segment into a few 146 00:10:38,280 --> 00:10:40,190 examples of variance calculations.