1 00:00:02,250 --> 00:00:05,040 In this segment, we will go through the calculation of the 2 00:00:05,040 --> 00:00:08,900 variances of some familiar random variables, starting 3 00:00:08,900 --> 00:00:12,170 with the simplest one that we know, which is the Bernoulli 4 00:00:12,170 --> 00:00:13,480 random variable. 5 00:00:13,480 --> 00:00:18,990 So let X take values 0 or 1, and it takes a value of 1 with 6 00:00:18,990 --> 00:00:20,500 probability p. 7 00:00:20,500 --> 00:00:24,950 We have already calculated the expected value of X, and we 8 00:00:24,950 --> 00:00:27,150 know that it is equal to p. 9 00:00:27,150 --> 00:00:29,930 Let us now compute its variance. 10 00:00:29,930 --> 00:00:34,040 One way of proceeding is to use the definition and then 11 00:00:34,040 --> 00:00:36,460 the expected value rule. 12 00:00:36,460 --> 00:00:39,600 So if we now apply the expected value rule, we need 13 00:00:39,600 --> 00:00:42,570 the summation over all possible values of X. There 14 00:00:42,570 --> 00:00:43,600 are two values-- 15 00:00:43,600 --> 00:00:46,730 x equal to 1 or x equal to 0. 16 00:00:46,730 --> 00:00:51,400 The contribution when X is equal to 1 is 1 minus the 17 00:00:51,400 --> 00:00:55,010 expected value, which is p squared. 18 00:00:55,010 --> 00:00:58,610 And the value of 1 is taken with probability p. 19 00:00:58,610 --> 00:01:02,430 There is another contribution to this sum when little x is 20 00:01:02,430 --> 00:01:03,890 equal to 0. 21 00:01:03,890 --> 00:01:08,390 And that contribution is going to be 0 minus p, all of this 22 00:01:08,390 --> 00:01:15,789 squared, times the probability of 0, which is 1 minus p. 23 00:01:15,789 --> 00:01:19,020 And now we carry out some algebra. 24 00:01:19,020 --> 00:01:24,100 We expand the square here, 1 minus 2p plus p squared. 25 00:01:24,100 --> 00:01:29,170 And after we multiply with this factor of p, we obtain p 26 00:01:29,170 --> 00:01:36,220 minus 2p squared plus p to the third power. 27 00:01:36,220 --> 00:01:41,500 And then from here we have a factor of p squared times 1, p 28 00:01:41,500 --> 00:01:44,710 squared times minus p. 29 00:01:44,710 --> 00:01:48,310 That gives us a minus p cubed. 30 00:01:48,310 --> 00:01:53,350 Then we notice that this term cancels out with that term. 31 00:01:53,350 --> 00:01:59,860 p squared minus 2p squared leaves us 32 00:01:59,860 --> 00:02:05,330 with p minus p squared. 33 00:02:05,330 --> 00:02:09,020 And we factor this as p times 1 minus p. 34 00:02:11,760 --> 00:02:16,120 An alternative calculation uses the formula that we 35 00:02:16,120 --> 00:02:19,810 provided a little earlier. 36 00:02:19,810 --> 00:02:23,190 Let's see how this will go. 37 00:02:23,190 --> 00:02:25,220 We have the following observation. 38 00:02:25,220 --> 00:02:30,210 The random variable X squared and the random variable X-- 39 00:02:30,210 --> 00:02:32,380 they are one and the same. 40 00:02:32,380 --> 00:02:35,120 When X is 0, X squared is also 0. 41 00:02:35,120 --> 00:02:37,430 When X is 1, X squared is also 1. 42 00:02:37,430 --> 00:02:42,530 So as random variables, these two random variables are equal 43 00:02:42,530 --> 00:02:45,430 in the case where X is a Bernoulli. 44 00:02:45,430 --> 00:02:52,790 So what we have here is just the expected value of X minus 45 00:02:52,790 --> 00:02:58,720 the expected value of X squared, to the second power. 46 00:02:58,720 --> 00:03:04,230 And this is p minus p squared, which is the same answer as we 47 00:03:04,230 --> 00:03:05,070 got before-- 48 00:03:05,070 --> 00:03:07,030 p times 1 minus p. 49 00:03:07,030 --> 00:03:09,420 And we see that the calculations and the algebra 50 00:03:09,420 --> 00:03:12,620 involved using this formula were a little simpler than 51 00:03:12,620 --> 00:03:15,110 they were before. 52 00:03:15,110 --> 00:03:18,940 Now the form of the variance of the Bernoulli random 53 00:03:18,940 --> 00:03:24,120 variable has an interesting dependence on p. 54 00:03:24,120 --> 00:03:29,395 It's instructive to plot it as a function of p. 55 00:03:29,395 --> 00:03:32,810 So this is a plot of the variance of the Bernoulli as a 56 00:03:32,810 --> 00:03:37,380 function of p, as p ranges between 0 and 1. 57 00:03:37,380 --> 00:03:41,450 p times 1 minus p is a parabola. 58 00:03:41,450 --> 00:03:48,590 And it's a parabola that is 0 when p is either 0 or 1. 59 00:03:48,590 --> 00:03:52,460 And it has this particular shape, and the peak of this 60 00:03:52,460 --> 00:03:58,300 parabola occurs when p is equal to 1/2, in which case 61 00:03:58,300 --> 00:04:01,770 the variance is 1/4. 62 00:04:01,770 --> 00:04:05,030 In some sense, the variance is a measure of the amount of 63 00:04:05,030 --> 00:04:08,590 uncertainty in a random variable, a measure of the 64 00:04:08,590 --> 00:04:10,260 amount of randomness. 65 00:04:10,260 --> 00:04:15,060 A coin is most random if it is fair, that is, when 66 00:04:15,060 --> 00:04:17,060 p is equal to 1/2. 67 00:04:17,060 --> 00:04:21,570 And in this case, the variance confirms this intuition. 68 00:04:21,570 --> 00:04:29,460 The variance of a coin flip is biggest if that coin is fair. 69 00:04:29,460 --> 00:04:31,630 On the other hand, in the extreme cases 70 00:04:31,630 --> 00:04:32,880 where p equals 0-- 71 00:04:32,880 --> 00:04:38,440 so the coin always results in tails, or if p equals to 1 so 72 00:04:38,440 --> 00:04:41,180 that the coin always results in heads-- in those cases, we 73 00:04:41,180 --> 00:04:42,920 do not have any randomness. 74 00:04:42,920 --> 00:04:43,750 And the variance, 75 00:04:43,750 --> 00:04:45,720 correspondingly, is equal to 0. 76 00:04:48,300 --> 00:04:50,270 Let us now calculate the variance of a 77 00:04:50,270 --> 00:04:52,440 uniform random variable. 78 00:04:52,440 --> 00:04:55,159 Let us start with a simple case where the range of the 79 00:04:55,159 --> 00:04:59,370 uniform random variable starts at 0 and extends up to some n. 80 00:04:59,370 --> 00:05:03,300 So there is a total of n plus 1 possible values, each one of 81 00:05:03,300 --> 00:05:04,960 them having the same probability-- 82 00:05:04,960 --> 00:05:06,930 1 over n plus 1. 83 00:05:06,930 --> 00:05:10,725 We calculate the variance using the alternative formula. 84 00:05:16,060 --> 00:05:19,230 And let us start with the first term. 85 00:05:19,230 --> 00:05:20,700 What is it? 86 00:05:20,700 --> 00:05:26,010 We use the expected value rule, and we argue that with 87 00:05:26,010 --> 00:05:31,070 probability 1 over n plus 1, the random variable X squared 88 00:05:31,070 --> 00:05:35,460 takes the value 0 squared, with the same probability, 89 00:05:35,460 --> 00:05:37,350 takes the value 1 squared. 90 00:05:37,350 --> 00:05:41,370 With the same probability, it takes the value 2 squared, and 91 00:05:41,370 --> 00:05:46,500 so on, all of the way up to n squared. 92 00:05:46,500 --> 00:05:48,860 And then there's the next term. 93 00:05:48,860 --> 00:05:52,360 The expected value of the uniform is the midpoint of the 94 00:05:52,360 --> 00:05:54,040 distribution by symmetry. 95 00:05:54,040 --> 00:06:00,820 So it's n over 2, and we take the square of that. 96 00:06:00,820 --> 00:06:06,000 Now to make progress here, we need to evaluate this sum. 97 00:06:06,000 --> 00:06:10,580 Fortunately, this has been done by others. 98 00:06:10,580 --> 00:06:17,630 And it turns out to be equal to 1 over 6 n, n plus 1 99 00:06:17,630 --> 00:06:20,970 times 2n plus 1. 100 00:06:20,970 --> 00:06:24,610 This formula can be proved by induction, but we will just 101 00:06:24,610 --> 00:06:26,470 take it for granted. 102 00:06:26,470 --> 00:06:31,740 Using this formula, and after a little bit of simple algebra 103 00:06:31,740 --> 00:06:35,659 and after we simplify, we obtain a final answer, which 104 00:06:35,659 --> 00:06:42,240 is of the form 1 over 12 n times n plus 2. 105 00:06:42,240 --> 00:06:44,680 How about the variance of a more general 106 00:06:44,680 --> 00:06:46,580 uniform random variable? 107 00:06:46,580 --> 00:06:49,620 So suppose we have a uniform random variable whose range is 108 00:06:49,620 --> 00:06:51,690 from a to b. 109 00:06:51,690 --> 00:06:56,610 How is this PMF related to the one that we already studied? 110 00:06:56,610 --> 00:07:01,410 First, let us assume that n is chosen so that it is 111 00:07:01,410 --> 00:07:03,800 equal to b minus a. 112 00:07:03,800 --> 00:07:06,320 So in that case, the difference between the last 113 00:07:06,320 --> 00:07:10,240 and the first value of the random variable is the same as 114 00:07:10,240 --> 00:07:13,060 the difference between the last and the first possible 115 00:07:13,060 --> 00:07:14,830 value in this PMF. 116 00:07:14,830 --> 00:07:18,320 So both PMFs have the same number of terms. 117 00:07:18,320 --> 00:07:21,110 They have exactly the same shape. 118 00:07:21,110 --> 00:07:24,730 The only difference is that the second PMF is shifted away 119 00:07:24,730 --> 00:07:29,640 from 0, and it starts at a instead of starting at 0. 120 00:07:29,640 --> 00:07:34,220 Now what does shifting a PMF correspond to? 121 00:07:34,220 --> 00:07:37,780 It essentially amounts to taking a random variable-- 122 00:07:37,780 --> 00:07:39,400 let's say, with this PMF-- 123 00:07:39,400 --> 00:07:42,860 and adding a constant to that random variable. 124 00:07:42,860 --> 00:07:46,070 So if the original random variable takes the value of 0, 125 00:07:46,070 --> 00:07:49,090 the new random variable takes the value of a. 126 00:07:49,090 --> 00:07:51,890 If the original takes the value of 1, this new random 127 00:07:51,890 --> 00:07:56,780 variable takes the value of a plus 1, and so on. 128 00:07:56,780 --> 00:08:02,650 So this shifted PMF is the PMF associated to a random 129 00:08:02,650 --> 00:08:05,010 variable equal to the original random 130 00:08:05,010 --> 00:08:07,010 variable plus a constant. 131 00:08:07,010 --> 00:08:09,230 But we know that adding a constant does 132 00:08:09,230 --> 00:08:10,910 not change the variance. 133 00:08:10,910 --> 00:08:14,160 Therefore, the variance of this PMF is going to be the 134 00:08:14,160 --> 00:08:18,380 same as the variance of the original PMF, as long as we 135 00:08:18,380 --> 00:08:22,320 make the correspondence that n is equal to b minus a. 136 00:08:22,320 --> 00:08:26,050 So doing this substitution in the formula that we derived 137 00:08:26,050 --> 00:08:34,900 earlier, we obtain 1 over 12 b minus a times b 138 00:08:34,900 --> 00:08:36,830 minus a plus 2.