The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation, or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

OK. So let us start.

All right. So today we're starting a new unit in this class. We have covered, so far, the basics of probability theory -- the main concepts and tools, as far as just probabilities are concerned. But if that was all that there is in this subject, the subject would not be rich enough. What makes probability theory a lot more interesting and richer is that we can also talk about random variables, which are ways of assigning numerical results to the outcomes of an experiment.

So we're going to define what random variables are, and then we're going to describe them using so-called probability mass functions. Basically some numerical values are more likely to occur than other numerical values, and we capture this by assigning probabilities to them the usual way.
And we represent these in a compact way using the so-called probability mass functions. We're going to see a couple of examples of random variables, some of which we have already seen but with different terminology. And so far, it's going to be just a couple of definitions and calculations of the type that you already know how to do.

But then we're going to introduce the one new, big concept of the day. So up to here it's going to be mostly an exercise in notation and definitions. But then we get our big concept, which is the concept of the expected value of a random variable, which is some kind of average value of the random variable. And then we're going to also talk, very briefly, about the spread of a distribution around the expectation, which is the concept of the variance of a random variable.

OK. So what is a random variable? It's an assignment of a numerical value to every possible outcome of the experiment. So here's the picture. The sample space is this class, and we've got lots of students in here. This is our sample space, omega.
I'm interested in the height of a random student. So I'm going to use a real line where I record height, and let's say this is height in inches. And the experiment happens, I pick a random student. And I go and measure the height of that random student, and that gives me a specific number. So what's a good number in inches? Let's say 60. OK. Or I pick another student, and that student has a height of 71 inches, and so on. So this is the experiment. These are the outcomes. These are the numerical values of the random variable that we call height.

OK. So mathematically, what are we dealing with here? We're basically dealing with a function from the sample space into the real numbers. That function takes as argument an outcome of the experiment, that is, a typical student, and produces the value of that function, which is the height of that particular student. So we think of an abstract object that we denote by a capital H, which is the random variable called height. And that random variable is essentially this particular function that we talked about here. OK.
So there's a distinction that we're making here -- H is height in the abstract. It's the function. These numbers here are particular numerical values that this function takes when you choose one particular outcome of the experiment.

Now, when you have a single probability experiment, you can have multiple random variables. So perhaps, instead of just height, I'm also interested in the weight of a typical student. And so when the experiment happens, I pick that random student -- this is the height of the student. But that student would also have a weight, and I could record it here. And similarly, every student is going to have their own particular weight. So the weight function is a different function from the sample space to the real numbers, and it's a different random variable.

So the point I'm making here is that a single probabilistic experiment may involve several interesting random variables. I may be interested in the height of a random student or the weight of the random student. These are different random variables that could be of interest. I can also do other things.
Suppose I define an object such as H bar, which is 2.54 times H. What does that correspond to? Well, this is the height in centimeters. Now, H bar is a function of H itself, but if you were to draw the picture, the picture would go this way. 60 gets mapped to about 150, 71 gets mapped to, oh, that's too hard for me. OK, gets mapped to something, and so on.

So H bar is also a random variable. Why? Once I pick a particular student, that particular outcome determines completely the numerical value of H bar, which is the height of that student but measured in centimeters. What we have here is actually a random variable which is defined as a function of another random variable. And the point that this example is trying to make is that functions of random variables are also random variables.

The experiment happens, the experiment determines a numerical value for this object. And once you have the numerical value for this object, that determines also the numerical value for that object. So given an outcome, the numerical value of this particular object is determined.
So H bar is itself a function from the sample space, from outcomes to numerical values. And that makes it a random variable according to the formal definition that we have here. So the formal definition is that a random variable is not random, it's not a variable, it's just a function from the sample space to the real numbers. That's the abstract, right way of thinking about them.

Now, random variables can be of different types. They can be discrete or continuous. Suppose that I measure the heights in inches, but I round to the nearest inch. Then the numerical values I'm going to get here would be just integers. So that would make it an integer-valued random variable. And this is a discrete random variable. Or maybe I have a scale for measuring height which is infinitely precise and records your height to an infinite number of digits of precision. In that case, your height would be just a general real number. So we would have a random variable that takes values in the entire set of real numbers. Well, I guess not really negative numbers, but the set of non-negative numbers.
And that would be a continuous random variable. It takes values in a continuous set. So we will be talking about both discrete and continuous random variables. The first thing we will do will be to devote a few lectures to discrete random variables, because discrete is always easier. And then we're going to repeat everything in the continuous setting.

So discrete is easier, and it's the right place to understand all the concepts, even those that may appear to be elementary. And then you will be set to understand what's going on when we go to the continuous case. So in the continuous case, you get all the complications of calculus and some extra math that comes in there. So it's important to have nailed down all the concepts very well in the easy, discrete case so that you don't have conceptual hurdles when you move on to the continuous case.

Now, one important remark that may seem trivial but is actually very important, so that you don't get tangled up between different types of concepts -- there's a fundamental distinction between the random variable itself and the numerical values that it takes.
Abstractly speaking, or mathematically speaking, a random variable, X, or H in this example, is a function. OK. Maybe if you like programming, the words "procedure" or "subroutine" might be better. So what's the subroutine height? Given a student, I take that student, force them on the scale and measure them. That's the subroutine that measures heights. It's really a function that takes students as input and produces numbers as output. The subroutine we denote by capital H. That's the random variable.

But once you plug in a particular student into that subroutine, you end up getting a particular number. This is the numerical output of that subroutine, or the numerical value of that function. And that numerical value is an element of the real numbers. So the numerical value is a real number, whereas this capital X is a function from omega to the real numbers. So they are very different types of objects. And the way that we keep track of what we're talking about at any given time is by using capital letters for random variables and lowercase letters for numbers. OK.
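The subroutine analogy translates directly into code. Here is a minimal sketch (the student names and heights are made up for illustration) of a sample space as a dictionary, the random variable H as a function on outcomes, and the centimeter version H bar defined as a function of H:

```python
# A toy sample space: each outcome is a student (names invented for illustration).
heights_in_inches = {"student_a": 60, "student_b": 71, "student_c": 66}

def H(omega):
    """The random variable 'height': a function from outcomes to real numbers."""
    return heights_in_inches[omega]

def H_bar(omega):
    """Height in centimeters: a function of H, hence itself a random variable."""
    return 2.54 * H(omega)

print(H("student_a"))      # numerical value of H at this outcome: 60
print(H_bar("student_a"))  # 2.54 * 60 = 152.4
```

Capital H is the function itself; a value like `H("student_a")` is the lowercase-letter numerical value it produces for one particular outcome.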
So now, once we have a random variable at hand, that random variable takes on different numerical values. And we want to say something about the relative likelihoods of the different numerical values that the random variable can take.

So here's our sample space, and here's the real line. And there's a bunch of outcomes that gave rise to one particular numerical value. There's another numerical value that arises if we have this outcome. There's another numerical value that arises if we have this outcome. So our sample space is here. The real numbers are here. And what we want to do is to ask the question, how likely is that particular numerical value to occur?

So what we're essentially asking is, how likely is it that we obtain an outcome that leads to that particular numerical value? We calculate that overall probability of that numerical value, and we represent that probability using a bar, so that we end up generating a bar graph. So that could be a possible bar graph associated with this picture.
The size of this bar is the total probability that our random variable took on this numerical value, which is just the sum of the probabilities of the different outcomes that led to that numerical value. So the thing that we're plotting here, the bar graph -- we give a name to it. It's a function, which we denote by a lowercase p with a subscript capital X. The capital X indicates which random variable we're talking about. And it's a function of little x, which ranges over the values that our random variable can take.

So in mathematical notation, the value of the PMF at some particular number, little x, is the probability that our random variable takes on the numerical value, little x. And if you want to be precise about what this means, it's the overall probability of all outcomes for which the random variable ends up taking that value, little x. So this is the overall probability of all omegas that lead to that particular numerical value, x, of interest.

So what do we know about PMFs? Since these are probabilities, all these entries in the bar graph have to be non-negative.
Also, if you exhaust all the possible values of little x, you will have exhausted all the possible outcomes here, because every outcome leads to some particular x. So the sum of these probabilities should be equal to one. This is the second relation here. So this relation tells us that some little x is going to happen. They happen with different probabilities, but when you consider all the possible little x's together, one of those little x's is going to be realized. Probabilities need to add to one.

OK. So let's get our first example of a non-trivial bar graph. Consider the experiment where I start with a coin and I start flipping it over and over. And I do this until I obtain heads for the first time. So what are the possible outcomes of this experiment? One possible outcome is that I obtain heads at the first toss, and then I stop. In this case, my random variable takes the value 1. Or it's possible that I obtain tails and then heads. How many tosses did it take until heads appeared? This would be X equal to 2.
Or more generally, I might obtain tails for k minus 1 times, and then obtain heads at the k-th time, in which case our random variable takes the value, little k. So that's the experiment. So capital X is a well-defined random variable. It's the number of tosses it takes until I see heads for the first time. These are the possible outcomes. These are elements of our sample space. And these are the values of X depending on the outcome. Clearly X is a function of the outcome. You tell me the outcome, I'm going to tell you what X is.

So what we want to do now is to calculate the PMF of X. So Px(k) is, by definition, the probability that our random variable takes the value k. For the random variable to take the value k, the first head appears at toss number k. The only way that this event can happen is if we obtain this sequence of events: tails the first k minus 1 times, and heads at the k-th flip. So this event, that the random variable is equal to k, is the same as this event, k minus 1 tails followed by 1 head. What's the probability of that event? We're assuming that the coin tosses are independent.
So to find the probability of this event, we need to multiply the probability of tails, times the probability of tails, times the probability of tails -- we multiply k minus 1 times -- times the probability of heads, which puts an extra p at the end. And this is the formula for the so-called geometric PMF: Px(k) = (1-p)^(k-1) * p.

And why do we call it geometric? Because if you go and plot the bar graph of this random variable X, we start at 1 with a certain number, which is p. And then at 2 we get p(1-p). At 3 we're going to get something smaller, it's p times (1-p)-squared. And the bars keep going down at the rate of a geometric progression. Each bar is smaller than the previous bar, because each time we get an extra factor of 1-p involved. So the shape of this PMF is the graph of a geometric sequence. For that reason, we say that it's the geometric PMF, and we call X also a geometric random variable. So the number of coin tosses until the first head is a geometric random variable. So this was an example of how to compute the PMF of a random variable.
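As a quick numerical sketch (not part of the lecture itself), the geometric PMF can be evaluated directly: each bar is (1-p) times the previous one, and the bars collectively carry total probability one.

```python
def geometric_pmf(k, p):
    """P(X = k): k-1 tails, each with probability 1-p, then one head with probability p."""
    return (1 - p) ** (k - 1) * p

p = 0.5
bars = [geometric_pmf(k, p) for k in range(1, 6)]
print(bars)  # each value is (1-p) times the one before it

# Summing over many k: the total probability approaches 1.
print(sum(geometric_pmf(k, p) for k in range(1, 200)))
```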
This was an easy example, because this event could be realized in one and only one way. So to find the probability of this, we just needed to find the probability of this particular outcome. More generally, there are going to be many outcomes that can lead to the same numerical value. And we need to keep track of all of them. For example, in this picture, if I want to find this value of the PMF, I need to add up the probabilities of all the outcomes that lead to that value.

So the general procedure is exactly what this picture suggests. To find this probability, you go and identify which outcomes lead to this numerical value, and add their probabilities.

So let's do a simple example. I take a tetrahedral die. I toss it twice. And there are lots of random variables that you can associate with the same experiment. So the outcome of the first throw -- we can call it F. That's a random variable, because it's determined once you tell me what happens in the experiment. The outcome of the second throw is another random variable. The minimum of the two throws is also a random variable.
Once I do the experiment, this random variable takes on a specific numerical value. So suppose I do the experiment and I get a 2 and a 3. So this random variable is going to take the numerical value of 2. This is going to take the numerical value of 3. This is going to take the numerical value of 2.

And now suppose that I want to calculate the PMF of this random variable. What I will need to do is to calculate Px(0), Px(1), Px(2), Px(3), and so on. Let's not do the entire calculation, then; let's just calculate one of the entries of the PMF. So Px(2) -- that's the probability that the minimum of the two throws gives us a 2. And this can happen in many ways. There are five ways that it can happen. Those are all of the outcomes for which the smaller of the two is equal to 2. That's five outcomes. Assuming that the tetrahedral die is fair and the tosses are independent, each one of these outcomes has probability 1/16. There's five of them, so we get an answer, 5/16.
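The add-up-the-outcomes procedure is easy to check by brute force. This sketch enumerates all 16 equally likely outcomes of the two tosses and builds the whole PMF of the minimum:

```python
from fractions import Fraction
from itertools import product

# All 16 equally likely outcomes of two independent tosses of a fair four-sided die.
outcomes = list(product(range(1, 5), repeat=2))

pmf = {}
for first, second in outcomes:
    x = min(first, second)  # value of the random variable X = min of the two throws
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 16)  # add this outcome's probability

print(pmf[2])  # 5/16: five outcomes have minimum equal to 2
print(sum(pmf.values()))  # a valid PMF sums to 1
```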
Conceptually, this is just the procedure that you use to calculate PMFs -- the way that you construct this particular bar graph. You consider all the possible values of your random variable, and for each one of those values you find the probability that the random variable takes on that value, by adding the probabilities of all the possible outcomes that lead to that particular numerical value.

So let's do another, more interesting one. So let's revisit the coin tossing problem from last time. Let us fix a number n, and we decide to flip a coin n consecutive times. The coin tosses are independent. And each one of the tosses will have a probability, p, of obtaining heads. Let's consider the random variable which is the total number of heads that have been obtained.

Well, that's something that we dealt with last time. We know the probabilities for different numbers of heads, but we're just going to do the same now using today's notation. So let's take, for concreteness, n equal to 4. Px is the PMF of that random variable, X.
Px(2) is meant to be, by definition, the probability that the random variable takes the value 2. So this is the probability that we have exactly two heads in our four tosses. The event of exactly two heads can happen in multiple ways. And here I've written down the different ways that it can happen. It turns out that there are exactly six ways that it can happen. And each one of these ways, luckily enough, has the same probability -- p-squared times (1-p)-squared. So that gives us the value for the PMF evaluated at 2.

So here we just counted explicitly that we have six possible ways that this can happen, and this gave rise to this factor of 6. But this factor of 6 turns out to be the same as 4 choose 2. If you remember the definition from last time, 4 choose 2 is 4 factorial divided by 2 factorial, divided by 2 factorial, which is indeed equal to 6. And this is the more general formula that you would be using. In general, if you have n tosses and you're interested in the probability of obtaining k heads, the probability of that event is given by this formula: Px(k) = (n choose k) * p^k * (1-p)^(n-k).
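This formula is easy to evaluate directly. The sketch below uses Python's math.comb for n choose k and reproduces the factor of 6 in the n = 4, k = 2 case:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k heads in n independent tosses) = C(n, k) * p**k * (1-p)**(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p = 0.3  # any heads probability works here
print(comb(4, 2))             # 6: the number of ways to place two heads among four tosses
print(binomial_pmf(2, 4, p))  # 6 * p**2 * (1-p)**2

# Summing over k = 0, ..., n gives total probability 1.
print(sum(binomial_pmf(k, 4, p) for k in range(5)))
```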
438 00:23:05,560 --> 00:23:08,710 So that's the formula that we derived last time. 439 00:23:08,710 --> 00:23:11,230 Except that last time we didn't use this notation. 440 00:23:11,230 --> 00:23:15,300 We just said the probability of k heads is equal to this. 441 00:23:15,300 --> 00:23:18,020 Today we introduce the extra notation. 442 00:23:18,020 --> 00:23:22,470 And also having that notation, we may be tempted to also plot 443 00:23:22,470 --> 00:23:26,310 a bar graph for the Px. 444 00:23:26,310 --> 00:23:29,390 In this case, for the coin tossing problem. 445 00:23:29,390 --> 00:23:35,090 And if you plot that bar graph as a function of k when n is a 446 00:23:35,090 --> 00:23:40,850 fairly large number, what you will end up obtaining is a bar 447 00:23:40,850 --> 00:23:47,525 graph that has a shape of something like this. 448 00:23:47,525 --> 00:23:53,840 449 00:23:53,840 --> 00:23:58,800 So certain values of k are more likely than others, and 450 00:23:58,800 --> 00:24:00,790 the more likely values are somewhere in the 451 00:24:00,790 --> 00:24:02,230 middle of the range. 452 00:24:02,230 --> 00:24:03,490 And extreme values-- 453 00:24:03,490 --> 00:24:07,110 too few heads or too many heads, are unlikely. 454 00:24:07,110 --> 00:24:09,870 Now, the miraculous thing is that it turns out that this 455 00:24:09,870 --> 00:24:15,550 curve gets a pretty definite shape, like a so-called bell 456 00:24:15,550 --> 00:24:18,210 curve, when n is big. 457 00:24:18,210 --> 00:24:20,770 458 00:24:20,770 --> 00:24:24,920 This is a very deep and central fact from probability 459 00:24:24,920 --> 00:24:30,210 theory that we will get to in a couple of months. 460 00:24:30,210 --> 00:24:33,900 For now, it just could be a curious observation. 461 00:24:33,900 --> 00:24:38,390 If you go into MATLAB and put this formula in and ask MATLAB 462 00:24:38,390 --> 00:24:41,540 to plot it for you, you're going to get an interesting 463 00:24:41,540 --> 00:24:43,140 shape of this form. 
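One way to see this shape without plotting software is to tabulate the same formula for a large n and check where the probability mass concentrates. A minimal sketch in Python, assuming for illustration a fair coin (p = 1/2) and n = 100 (the lecture itself suggests MATLAB):

```python
from math import comb

n, p = 100, 0.5  # illustrative values, not the lecture's
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

# the most likely value of k sits in the middle of the range,
# and extreme values (too few or too many heads) are unlikely
k_mode = max(range(n + 1), key=lambda k: pmf[k])
```

Printing `pmf` shows the middle values dominating and the tails vanishing, the shape described above.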
464 00:24:43,140 --> 00:24:46,700 And later on we will have to sort of understand where this 465 00:24:46,700 --> 00:24:50,760 is coming from and whether there's a nice, simple formula 466 00:24:50,760 --> 00:24:54,920 for the asymptotic form that we get. 467 00:24:54,920 --> 00:24:55,370 All right. 468 00:24:55,370 --> 00:25:00,580 So, so far I've said essentially nothing new, just 469 00:25:00,580 --> 00:25:05,240 a little bit of notation and this little conceptual thing 470 00:25:05,240 --> 00:25:07,900 that you have to think of random variables as functions 471 00:25:07,900 --> 00:25:09,060 on the sample space. 472 00:25:09,060 --> 00:25:11,620 So now it's time to introduce something new. 473 00:25:11,620 --> 00:25:14,250 This is the big concept of the day. 474 00:25:14,250 --> 00:25:17,180 In some sense it's an easy concept. 475 00:25:17,180 --> 00:25:23,420 But it's the most central, most important concept that we 476 00:25:23,420 --> 00:25:26,970 have for dealing with random variables. 477 00:25:26,970 --> 00:25:28,790 It's the concept of the expected 478 00:25:28,790 --> 00:25:30,860 value of a random variable. 479 00:25:30,860 --> 00:25:34,570 So the expected value is meant to be, loosely speaking, 480 00:25:34,570 --> 00:25:38,100 something like an average, where you interpret 481 00:25:38,100 --> 00:25:41,520 probabilities as something like frequencies. 482 00:25:41,520 --> 00:25:46,490 So you play a certain game and your rewards are going to be-- 483 00:25:46,490 --> 00:25:49,660 484 00:25:49,660 --> 00:25:52,010 to use my standard numbers-- 485 00:25:52,010 --> 00:25:54,530 your rewards are going to be one dollar 486 00:25:54,530 --> 00:25:58,040 with probability 1/6, 487 00:25:58,040 --> 00:26:04,711 two dollars with probability 1/2, and four 488 00:26:04,711 --> 00:26:08,670 dollars with probability 1/3. 489 00:26:08,670 --> 00:26:11,920 So this is a plot of the PMF of some random variable. 
490 00:26:11,920 --> 00:26:15,270 If you play that game and you get so many dollars with this 491 00:26:15,270 --> 00:26:18,520 probability, and so on, how much do you expect to get on 492 00:26:18,520 --> 00:26:21,670 the average if you play the game a zillion times? 493 00:26:21,670 --> 00:26:23,420 Well, you can think as follows-- 494 00:26:23,420 --> 00:26:27,990 one sixth of the time I'm going to get one dollar. 495 00:26:27,990 --> 00:26:31,620 One half of the time that outcome is going to happen and 496 00:26:31,620 --> 00:26:34,140 I'm going to get two dollars. 497 00:26:34,140 --> 00:26:37,920 And one third of the time the other outcome happens, and I'm 498 00:26:37,920 --> 00:26:40,690 going to get four dollars. 499 00:26:40,690 --> 00:26:45,230 And you evaluate that number and it turns out to be 2.5. 500 00:26:45,230 --> 00:26:45,490 OK. 501 00:26:45,490 --> 00:26:50,410 So that's a reasonable way of calculating the average payoff 502 00:26:50,410 --> 00:26:52,550 if you think of these probabilities as the 503 00:26:52,550 --> 00:26:56,440 frequencies with which you obtain the different payoffs. 504 00:26:56,440 --> 00:26:59,430 And loosely speaking, it doesn't hurt to think of 505 00:26:59,430 --> 00:27:02,430 probabilities as frequencies when you try to make sense of 506 00:27:02,430 --> 00:27:04,990 various things. 507 00:27:04,990 --> 00:27:06,480 So what did we do here? 508 00:27:06,480 --> 00:27:11,710 We took the probabilities of the different outcomes, of the 509 00:27:11,710 --> 00:27:15,370 different numerical values, and multiplied them with the 510 00:27:15,370 --> 00:27:17,610 corresponding numerical value. 511 00:27:17,610 --> 00:27:19,910 Similarly here, we have a probability and the 512 00:27:19,910 --> 00:27:24,890 corresponding numerical value and we added up over all x's. 513 00:27:24,890 --> 00:27:26,430 So that's what we did. 514 00:27:26,430 --> 00:27:29,750 It looks like an interesting quantity to deal with. 
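The computation just described can be written out in a few lines. A sketch in Python using the same numbers (rewards of 1, 2, and 4 dollars with probabilities 1/6, 1/2, 1/3); the simulation half is my own addition, illustrating the "play a zillion times" frequency interpretation:

```python
import random

# the game's reward PMF from the lecture
pmf = {1: 1/6, 2: 1/2, 4: 1/3}

# E[X]: weigh each reward by its probability and add up
expected_reward = sum(x * p for x, p in pmf.items())   # 2.5

# frequency interpretation: average payoff over many plays
random.seed(0)  # fixed seed so the run is reproducible
values, weights = zip(*pmf.items())
plays = random.choices(values, weights=weights, k=100_000)
empirical_average = sum(plays) / len(plays)
```

With many plays, the empirical average lands very close to the 2.5 computed from the PMF.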
515 00:27:29,750 --> 00:27:32,800 So we're going to give a name to it, and we're going to call 516 00:27:32,800 --> 00:27:35,740 it the expected value of a random variable. 517 00:27:35,740 --> 00:27:39,610 So this formula just captures the calculation that we did. 518 00:27:39,610 --> 00:27:43,520 How do we interpret the expected value? 519 00:27:43,520 --> 00:27:46,490 So the one interpretation is the one that I 520 00:27:46,490 --> 00:27:48,110 used in this example. 521 00:27:48,110 --> 00:27:52,290 You can think of it as the average that you get over a 522 00:27:52,290 --> 00:27:56,200 large number of repetitions of an experiment where you 523 00:27:56,200 --> 00:27:59,330 interpret the probabilities as the frequencies with which the 524 00:27:59,330 --> 00:28:02,090 different numerical values can happen. 525 00:28:02,090 --> 00:28:04,870 There's another interpretation that's a little more visual 526 00:28:04,870 --> 00:28:07,550 and that's kind of insightful, if you remember your freshman 527 00:28:07,550 --> 00:28:10,860 physics, this kind of formula gives you the center of 528 00:28:10,860 --> 00:28:14,390 gravity of an object of this kind. 529 00:28:14,390 --> 00:28:17,290 If you take that picture literally and think of this as 530 00:28:17,290 --> 00:28:20,700 a mass of one sixth sitting here, and the mass of one half 531 00:28:20,700 --> 00:28:24,000 sitting here, and one third sitting there, and you ask me 532 00:28:24,000 --> 00:28:26,920 what's the center of gravity of that structure. 533 00:28:26,920 --> 00:28:29,320 This is the formula that gives you the center of gravity. 534 00:28:29,320 --> 00:28:30,900 Now what's the center of gravity? 535 00:28:30,900 --> 00:28:34,960 It's the place where if you put your pen right underneath, 536 00:28:34,960 --> 00:28:38,050 that diagram will stay in place and will not fall on one 537 00:28:38,050 --> 00:28:40,440 side and will not fall on the other side. 
538 00:28:40,440 --> 00:28:44,950 So in this thing, by the picture, since the 4 is a little more 539 00:28:44,950 --> 00:28:47,880 to the right and a little heavier, the center of gravity 540 00:28:47,880 --> 00:28:50,200 should be somewhere around here. 541 00:28:50,200 --> 00:28:52,290 And that's what the math gave us. 542 00:28:52,290 --> 00:28:54,740 It turns out to be two and a half. 543 00:28:54,740 --> 00:28:56,920 Once you have this interpretation about centers 544 00:28:56,920 --> 00:28:58,890 of gravity, sometimes you can calculate 545 00:28:58,890 --> 00:29:01,090 expectations pretty fast. 546 00:29:01,090 --> 00:29:04,410 So here's our new random variable. 547 00:29:04,410 --> 00:29:07,840 It's the uniform random variable in which each one of 548 00:29:07,840 --> 00:29:10,420 the numerical values is equally likely. 549 00:29:10,420 --> 00:29:13,980 Here there's a total of n plus 1 possible numerical values. 550 00:29:13,980 --> 00:29:17,600 So each one of them has probability 1 over (n + 1). 551 00:29:17,600 --> 00:29:20,650 Let's calculate the expected value of this random variable. 552 00:29:20,650 --> 00:29:24,620 We can take the formula literally and consider all 553 00:29:24,620 --> 00:29:28,920 possible outcomes, or all possible numerical values, and 554 00:29:28,920 --> 00:29:32,330 weigh them by their corresponding probability, and 555 00:29:32,330 --> 00:29:35,170 do this calculation and obtain an answer. 556 00:29:35,170 --> 00:29:38,520 But I gave you the intuition of centers of gravity. 557 00:29:38,520 --> 00:29:41,990 Can you use that intuition to guess the answer? 558 00:29:41,990 --> 00:29:46,680 What's the center of gravity of a structure of this kind? 559 00:29:46,680 --> 00:29:47,860 We have symmetry. 560 00:29:47,860 --> 00:29:50,710 So it should be in the middle. 561 00:29:50,710 --> 00:29:51,970 And what's the middle? 562 00:29:51,970 --> 00:29:54,850 It's the average of the two end points. 
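The symmetry argument is easy to confirm by summing the definition directly. A small sketch in Python (the helper name is mine), for the uniform PMF on the n + 1 values 0 through n:

```python
def uniform_mean(n):
    # uniform PMF on {0, 1, ..., n}: each value has probability 1/(n+1)
    return sum(x * (1 / (n + 1)) for x in range(n + 1))
```

For any n, the sum works out to the midpoint of the range, as the symmetry intuition predicts.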
563 00:29:54,850 --> 00:29:57,850 So without having to do the algebra, you know that the 564 00:29:57,850 --> 00:30:01,200 answer is going to be n over 2. 565 00:30:01,200 --> 00:30:05,850 So this is a moral that you should keep in mind: whenever you have 566 00:30:05,850 --> 00:30:11,460 a PMF which is symmetric around a certain point, 567 00:30:11,460 --> 00:30:15,350 that certain point is going to be the expected value 568 00:30:15,350 --> 00:30:17,460 associated with this particular PMF. 569 00:30:17,460 --> 00:30:21,920 570 00:30:21,920 --> 00:30:22,380 OK. 571 00:30:22,380 --> 00:30:29,610 So having defined the expected value, what is there that's 572 00:30:29,610 --> 00:30:31,810 left for us to do? 573 00:30:31,810 --> 00:30:37,290 Well, we want to investigate how it behaves, what kind of 574 00:30:37,290 --> 00:30:43,390 properties it has, and also how you calculate 575 00:30:43,390 --> 00:30:48,040 expected values of complicated random variables. 576 00:30:48,040 --> 00:30:52,130 So the first complication that we're going to start with is 577 00:30:52,130 --> 00:30:54,985 the case where we deal with a function of a random variable. 578 00:30:54,985 --> 00:30:59,002 579 00:30:59,002 --> 00:30:59,680 OK. 580 00:30:59,680 --> 00:31:05,890 So let me redraw this same picture as before. 581 00:31:05,890 --> 00:31:07,090 We have omega. 582 00:31:07,090 --> 00:31:09,580 This is our sample space. 583 00:31:09,580 --> 00:31:12,310 This is the real line. 584 00:31:12,310 --> 00:31:17,370 And we have a random variable that gives rise to various 585 00:31:17,370 --> 00:31:24,400 values for X. So the random variable is capital X, and 586 00:31:24,400 --> 00:31:28,690 every outcome leads to a particular numerical value x 587 00:31:28,690 --> 00:31:33,030 for our random variable X. So capital X is really the 588 00:31:33,030 --> 00:31:37,930 function that maps these points into the real line. 
589 00:31:37,930 --> 00:31:42,710 And then I consider a function of this random variable, call 590 00:31:42,710 --> 00:31:47,080 it capital Y, and it's a function of my 591 00:31:47,080 --> 00:31:49,980 previous random variable. 592 00:31:49,980 --> 00:31:54,190 And this new random variable Y takes numerical values that 593 00:31:54,190 --> 00:31:58,140 are completely determined once I know the numerical value of 594 00:31:58,140 --> 00:32:03,090 capital X. And perhaps you get a diagram of this kind. 595 00:32:03,090 --> 00:32:08,520 596 00:32:08,520 --> 00:32:10,970 So X is a random variable. 597 00:32:10,970 --> 00:32:14,506 Once you have an outcome, this determines the value of x. 598 00:32:14,506 --> 00:32:16,760 Y is also a random variable. 599 00:32:16,760 --> 00:32:19,200 Once you have the outcome, that determines 600 00:32:19,200 --> 00:32:21,230 the value of y. 601 00:32:21,230 --> 00:32:26,630 Y is completely determined once you know X. We have a 602 00:32:26,630 --> 00:32:31,560 formula for how to calculate the expected value of X. 603 00:32:31,560 --> 00:32:34,380 Suppose that you're interested in calculating the expected 604 00:32:34,380 --> 00:32:39,910 value of Y. How would you go about it? 605 00:32:39,910 --> 00:32:40,710 OK. 606 00:32:40,710 --> 00:32:43,580 The only thing you have in your hands is the definition, 607 00:32:43,580 --> 00:32:47,750 so you could start by just using the definition. 608 00:32:47,750 --> 00:32:50,150 And what does this entail? 609 00:32:50,150 --> 00:32:55,330 It entails the following: for every particular value of y, collect 610 00:32:55,330 --> 00:32:59,160 all the outcomes that lead to that value of y. 611 00:32:59,160 --> 00:33:01,010 Find their probability. 612 00:33:01,010 --> 00:33:02,280 Do the same here. 613 00:33:02,280 --> 00:33:04,550 For that value, collect those outcomes. 614 00:33:04,550 --> 00:33:07,700 Find their probability and weight it by y. 
615 00:33:07,700 --> 00:33:13,050 So this formula does the addition over this line. 616 00:33:13,050 --> 00:33:17,060 We consider the different outcomes and add things up. 617 00:33:17,060 --> 00:33:20,290 There's an alternative way of doing the same accounting 618 00:33:20,290 --> 00:33:23,930 where instead of doing the addition over those numbers, 619 00:33:23,930 --> 00:33:26,540 we do the addition up here. 620 00:33:26,540 --> 00:33:30,250 We consider the different possible values of x, and we 621 00:33:30,250 --> 00:33:31,500 think as follows-- 622 00:33:31,500 --> 00:33:34,500 623 00:33:34,500 --> 00:33:38,900 for each possible value of x, that value is going to occur 624 00:33:38,900 --> 00:33:41,310 with this probability. 625 00:33:41,310 --> 00:33:45,890 And if that value has occurred, this is how much I'm 626 00:33:45,890 --> 00:33:47,840 getting, the g of x. 627 00:33:47,840 --> 00:33:52,990 So I'm considering the probability of this outcome. 628 00:33:52,990 --> 00:33:56,050 And in that case, y takes this value. 629 00:33:56,050 --> 00:34:00,240 Then I'm considering the probabilities of this outcome. 630 00:34:00,240 --> 00:34:04,650 And in that case, g of x takes again that value. 631 00:34:04,650 --> 00:34:08,100 Then I consider this particular x, it happens with 632 00:34:08,100 --> 00:34:11,280 this much probability, and in that case, g of x takes that 633 00:34:11,280 --> 00:34:14,300 value, and similarly here. 634 00:34:14,300 --> 00:34:18,170 We end up doing exactly the same arithmetic, it's only a 635 00:34:18,170 --> 00:34:21,760 question whether we bundle things together. 636 00:34:21,760 --> 00:34:25,790 That is, if we calculate the probability of this, then 637 00:34:25,790 --> 00:34:28,239 we're bundling these two cases together. 
638 00:34:28,239 --> 00:34:32,110 Whereas if we do the addition up here, we do a separate 639 00:34:32,110 --> 00:34:32,949 calculation-- 640 00:34:32,949 --> 00:34:35,639 this probability times this number, and then this 641 00:34:35,639 --> 00:34:37,989 probability times that number. 642 00:34:37,989 --> 00:34:41,420 So it's just a simple rearrangement of the way that 643 00:34:41,420 --> 00:34:45,330 we do the calculations, but it does make a big difference in 644 00:34:45,330 --> 00:34:49,010 practice if you actually want to calculate expectations. 645 00:34:49,010 --> 00:34:52,389 So the second procedure that I mentioned, where you do the 646 00:34:52,389 --> 00:34:56,790 addition by running over the x-axis 647 00:34:56,790 --> 00:34:59,710 corresponds to this formula. 648 00:34:59,710 --> 00:35:05,830 Consider all possibilities for x and when that x happens, how 649 00:35:05,830 --> 00:35:07,530 much money are you getting? 650 00:35:07,530 --> 00:35:10,850 That gives you the average money that you are getting. 651 00:35:10,850 --> 00:35:11,270 All right. 652 00:35:11,270 --> 00:35:14,840 So I kind of hand waved and argued that it's just a 653 00:35:14,840 --> 00:35:17,690 different way of accounting, of course one needs to prove 654 00:35:17,690 --> 00:35:19,060 this formula. 655 00:35:19,060 --> 00:35:20,950 And fortunately it can be proved. 656 00:35:20,950 --> 00:35:23,470 You're going to see that in recitation. 657 00:35:23,470 --> 00:35:25,710 Most people, once they're a little comfortable with the 658 00:35:25,710 --> 00:35:28,570 concepts of probability, actually believe that this is 659 00:35:28,570 --> 00:35:30,130 true by definition. 660 00:35:30,130 --> 00:35:31,860 In fact it's not true by definition. 661 00:35:31,860 --> 00:35:34,610 It's called the law of the unconscious statistician. 662 00:35:34,610 --> 00:35:37,930 It's something that you always do, but it's something that 663 00:35:37,930 --> 00:35:40,750 does require justification. 
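The two accounting routes can be compared directly on a toy example. A sketch in Python (the PMF and the function g here are my own illustrative choices, not the lecture's); note how two different x's bundle into the same y in the first route:

```python
# illustrative example: Y = g(X) = X^2
px = {-2: 0.25, -1: 0.25, 1: 0.25, 2: 0.25}

def g(x):
    return x * x

# route 1: first build the PMF of Y (bundling x's with equal g(x)),
# then average over the values of y
py = {}
for x, p in px.items():
    py[g(x)] = py.get(g(x), 0.0) + p
e_y_from_py = sum(y * p for y, p in py.items())

# route 2: expected value rule -- sum g(x) * px(x) directly over x
e_y_from_rule = sum(g(x) * p for x, p in px.items())
```

Both routes rearrange the same arithmetic, so the two results agree, which is exactly what the law of the unconscious statistician asserts.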
664 00:35:40,750 --> 00:35:41,100 All right. 665 00:35:41,100 --> 00:35:44,160 So this gives us basically a shortcut for calculating 666 00:35:44,160 --> 00:35:47,770 expected values of functions of a random variable without 667 00:35:47,770 --> 00:35:51,990 having to find the PMF of that function. 668 00:35:51,990 --> 00:35:54,470 We can work with the PMF of the original random variable. 669 00:35:54,470 --> 00:35:57,140 670 00:35:57,140 --> 00:35:57,430 All right. 671 00:35:57,430 --> 00:36:00,940 So we're going to use this property over and over. 672 00:36:00,940 --> 00:36:06,570 Before we start using it, one general word of caution-- 673 00:36:06,570 --> 00:36:10,640 the average of a function of a random variable, in general, 674 00:36:10,640 --> 00:36:16,400 is not the same as the function of the average. 675 00:36:16,400 --> 00:36:20,820 So these two operations of taking averages and taking 676 00:36:20,820 --> 00:36:23,180 functions do not commute. 677 00:36:23,180 --> 00:36:28,130 What this inequality tells you is that, in general, you 678 00:36:28,130 --> 00:36:30,280 cannot reason on the average. 679 00:36:30,280 --> 00:36:34,420 680 00:36:34,420 --> 00:36:38,600 So we're going to see instances where this property 681 00:36:38,600 --> 00:36:39,610 is not true. 682 00:36:39,610 --> 00:36:41,080 You're going to see lots of them. 683 00:36:41,080 --> 00:36:43,920 Let me just throw it out here that it's something that's not true 684 00:36:43,920 --> 00:36:47,710 in general, but we will be interested in the exceptions 685 00:36:47,710 --> 00:36:51,480 where a relation like this is true. 686 00:36:51,480 --> 00:36:53,360 But these will be the exceptions. 687 00:36:53,360 --> 00:36:56,960 So in general, expectations are 688 00:36:56,960 --> 00:36:58,850 something like averages. 689 00:36:58,850 --> 00:37:02,400 But the function of an average is not the same as the average 690 00:37:02,400 --> 00:37:05,070 of the function. 691 00:37:05,070 --> 00:37:05,440 OK. 
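Here is one concrete instance of that caution, sketched in Python with an illustrative PMF of my own choosing: for g(x) = x-squared, the average of the function differs from the function of the average.

```python
# illustrative example: a fair coin on {0, 1}
px = {0: 0.5, 1: 0.5}

def g(x):
    return x * x

mean_x = sum(x * p for x, p in px.items())               # E[X] = 0.5
avg_of_function = sum(g(x) * p for x, p in px.items())   # E[g(X)] = 0.5
function_of_avg = g(mean_x)                              # g(E[X]) = 0.25
```

Since 0.5 is not 0.25, the two operations visibly do not commute here.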
692 00:37:05,440 --> 00:37:09,530 So now let's go to properties of expectations. 693 00:37:09,530 --> 00:37:15,170 Suppose that alpha is a real number, and I ask you, what's 694 00:37:15,170 --> 00:37:17,740 the expected value of that real number? 695 00:37:17,740 --> 00:37:21,010 So for example, if I write down this expression-- 696 00:37:21,010 --> 00:37:23,070 expected value of 2. 697 00:37:23,070 --> 00:37:25,930 What is this? 698 00:37:25,930 --> 00:37:29,470 Well, we defined random variables and we defined 699 00:37:29,470 --> 00:37:31,860 expectations of random variables. 700 00:37:31,860 --> 00:37:35,870 So for this to make syntactic sense, this thing inside here 701 00:37:35,870 --> 00:37:37,670 should be a random variable. 702 00:37:37,670 --> 00:37:39,260 Is 2 -- 703 00:37:39,260 --> 00:37:41,140 the number 2 --- is it a random variable? 704 00:37:41,140 --> 00:37:44,740 705 00:37:44,740 --> 00:37:48,420 In some sense, yes. 706 00:37:48,420 --> 00:37:55,750 It's the random variable that takes, always, the value of 2. 707 00:37:55,750 --> 00:37:59,220 So suppose that you have some experiment and that experiment 708 00:37:59,220 --> 00:38:02,580 always outputs 2 whenever it happens. 709 00:38:02,580 --> 00:38:05,880 Then you can say, yes, it's a random experiment but it 710 00:38:05,880 --> 00:38:06,960 always gives me 2. 711 00:38:06,960 --> 00:38:08,600 The value of the random variable is 712 00:38:08,600 --> 00:38:10,460 always 2 no matter what. 713 00:38:10,460 --> 00:38:13,200 It's kind of a degenerate random variable that doesn't 714 00:38:13,200 --> 00:38:17,230 have any real randomness in it, but it's still useful to 715 00:38:17,230 --> 00:38:20,130 think of it as a special case. 716 00:38:20,130 --> 00:38:23,000 So it corresponds to a function from the sample space 717 00:38:23,000 --> 00:38:26,750 to the real line that takes only one value. 718 00:38:26,750 --> 00:38:30,390 No matter what the outcome is, it always gives me a 2. 
719 00:38:30,390 --> 00:38:30,770 OK. 720 00:38:30,770 --> 00:38:34,390 If you have a random variable that always gives you a 2, 721 00:38:34,390 --> 00:38:37,980 what is the expected value going to be? 722 00:38:37,980 --> 00:38:40,530 The only entry that shows up in this summation 723 00:38:40,530 --> 00:38:43,000 is that number 2. 724 00:38:43,000 --> 00:38:46,270 The probability of a 2 is equal to 1, and the value of 725 00:38:46,270 --> 00:38:48,330 that random variable is equal to 2. 726 00:38:48,330 --> 00:38:51,030 So it's the number itself. 727 00:38:51,030 --> 00:38:53,910 So the average value in an experiment that always gives 728 00:38:53,910 --> 00:38:56,580 you 2's is 2. 729 00:38:56,580 --> 00:38:57,100 All right. 730 00:38:57,100 --> 00:38:59,450 So that's nice and simple. 731 00:38:59,450 --> 00:39:04,890 Now let's go to our experiment where H was 732 00:39:04,890 --> 00:39:07,310 your height in inches. 733 00:39:07,310 --> 00:39:11,160 And I know your height in inches, but I'm interested in 734 00:39:11,160 --> 00:39:15,880 your height measured in centimeters. 735 00:39:15,880 --> 00:39:19,040 How is that going to be related to 736 00:39:19,040 --> 00:39:22,675 your height in inches? 737 00:39:22,675 --> 00:39:27,440 Well, if you take your height in inches and convert it to 738 00:39:27,440 --> 00:39:30,690 centimeters, I have another random variable, which is 739 00:39:30,690 --> 00:39:34,280 always, no matter what, two and a half times bigger than 740 00:39:34,280 --> 00:39:36,570 the random variable I started with. 741 00:39:36,570 --> 00:39:40,470 If you take some quantity and always multiply it by two and a 742 00:39:40,470 --> 00:39:43,610 half, what happens to the average of that quantity? 743 00:39:43,610 --> 00:39:46,990 It also gets multiplied by two and a half. 
744 00:39:46,990 --> 00:39:52,030 So you get a relation like this, which says that the 745 00:39:52,030 --> 00:39:56,480 average height of a student measured in centimeters is two 746 00:39:56,480 --> 00:39:58,660 and a half times the average height of a 747 00:39:58,660 --> 00:40:01,660 student measured in inches. 748 00:40:01,660 --> 00:40:03,730 So that makes perfect intuitive sense. 749 00:40:03,730 --> 00:40:07,490 If you generalize it, it gives us this relation, that if you 750 00:40:07,490 --> 00:40:13,790 have a number, you can pull it outside the expectation and 751 00:40:13,790 --> 00:40:16,210 you get the right result. 752 00:40:16,210 --> 00:40:20,440 So this is a case where you can reason on the average. 753 00:40:20,440 --> 00:40:23,150 If you take a number, such as height, and multiply it by a 754 00:40:23,150 --> 00:40:25,500 certain number, you can reason on the average. 755 00:40:25,500 --> 00:40:27,650 I multiply the numbers by two, the averages 756 00:40:27,650 --> 00:40:29,630 will go up by two. 757 00:40:29,630 --> 00:40:33,750 So this is an exception to this cautionary statement that 758 00:40:33,750 --> 00:40:35,460 I had up there. 759 00:40:35,460 --> 00:40:39,860 How do we prove that this fact is true? 760 00:40:39,860 --> 00:40:44,360 Well, we can use the expected value rule here, which tells 761 00:40:44,360 --> 00:40:52,690 us that the expected value of alpha X, this is our g of X, 762 00:40:52,690 --> 00:40:59,720 essentially, is going to be the sum over all x's of my 763 00:40:59,720 --> 00:41:04,900 function, g of X, times the probability of the x's. 764 00:41:04,900 --> 00:41:11,270 In our particular case, g of X is alpha times X. And we have 765 00:41:11,270 --> 00:41:12,450 those probabilities. 766 00:41:12,450 --> 00:41:15,600 And the alpha goes outside the summation. 767 00:41:15,600 --> 00:41:23,100 So we get alpha, sum over x's, x Px of x, which is alpha 768 00:41:23,100 --> 00:41:26,740 times the expected value of X. 
769 00:41:26,740 --> 00:41:30,580 So that's how you prove this relation formally using this 770 00:41:30,580 --> 00:41:32,490 rule up here. 771 00:41:32,490 --> 00:41:35,810 And the next formula that I have here also gets 772 00:41:35,810 --> 00:41:37,310 proved the same way. 773 00:41:37,310 --> 00:41:41,110 What does this formula tell you? 774 00:41:41,110 --> 00:41:46,560 If I take everybody's height in centimeters-- 775 00:41:46,560 --> 00:41:49,030 we already multiplied by alpha-- 776 00:41:49,030 --> 00:41:52,800 and the gods give everyone a bonus of ten extra 777 00:41:52,800 --> 00:41:54,670 centimeters. 778 00:41:54,670 --> 00:41:57,720 What's going to happen to the average height of the class? 779 00:41:57,720 --> 00:42:02,800 Well, it will just go up by an extra ten centimeters. 780 00:42:02,800 --> 00:42:08,040 So giving everyone the bonus of 781 00:42:08,040 --> 00:42:15,710 beta just adds beta to this expectation-- the average height in centimeters-- 782 00:42:15,710 --> 00:42:20,740 which we also know to be alpha times the expected 783 00:42:20,740 --> 00:42:24,430 value of X, plus beta. 784 00:42:24,430 --> 00:42:29,390 So this is a linearity property of expectations. 785 00:42:29,390 --> 00:42:34,140 If you take a linear function of a single random variable, 786 00:42:34,140 --> 00:42:38,390 the expected value of that linear function is the linear 787 00:42:38,390 --> 00:42:41,140 function of the expected value. 788 00:42:41,140 --> 00:42:44,100 So this is our big exception to this cautionary note, that 789 00:42:44,100 --> 00:42:48,710 we have equality if g is linear. 790 00:42:48,710 --> 00:42:55,840 791 00:42:55,840 --> 00:42:57,090 OK. 792 00:42:57,090 --> 00:42:59,790 793 00:42:59,790 --> 00:43:00,790 All right. 794 00:43:00,790 --> 00:43:05,850 So let's get to the last concept of the day. 795 00:43:05,850 --> 00:43:07,470 What kind of functions of random 796 00:43:07,470 --> 00:43:11,010 variables may be of interest? 
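As a brief aside, the linearity property just stated is straightforward to verify numerically. A sketch in Python (the height PMF and the bonus value are made-up illustrations, not class data):

```python
# hypothetical PMF for height in inches (numbers are invented)
px = {60: 0.2, 66: 0.5, 72: 0.3}
alpha, beta = 2.5, 10  # rough inches-to-centimeters factor, plus a 10 cm bonus

e_x = sum(x * p for x, p in px.items())                       # E[X]
e_linear = sum((alpha * x + beta) * p for x, p in px.items())  # E[alpha X + beta]
# linearity: E[alpha X + beta] = alpha * E[X] + beta
```

Both sides of the linearity identity come out equal, so for a linear g you really can reason on the average.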
797 00:43:11,010 --> 00:43:15,660 One possibility might be the average value of X-squared. 798 00:43:15,660 --> 00:43:18,780 799 00:43:18,780 --> 00:43:20,150 Why is it interesting? 800 00:43:20,150 --> 00:43:21,760 Well, why not. 801 00:43:21,760 --> 00:43:24,290 It's the simplest function that you can think of. 802 00:43:24,290 --> 00:43:27,560 803 00:43:27,560 --> 00:43:30,800 So if you want to calculate the expected value of 804 00:43:30,800 --> 00:43:35,260 X-squared, you would use this general rule for how you can 805 00:43:35,260 --> 00:43:39,470 calculate expected values of functions of random variables. 806 00:43:39,470 --> 00:43:41,340 You consider all the possible x's. 807 00:43:41,340 --> 00:43:45,550 For each x, you see what's the probability that it occurs. 808 00:43:45,550 --> 00:43:49,790 And if that x occurs, you consider and see how big 809 00:43:49,790 --> 00:43:52,090 x-squared is. 810 00:43:52,090 --> 00:43:54,810 Now, the more interesting quantity, a more interesting 811 00:43:54,810 --> 00:43:58,580 expectation that you can calculate has to do not with 812 00:43:58,580 --> 00:44:03,570 x-squared, but with the distance of x from the mean 813 00:44:03,570 --> 00:44:05,710 and then squared. 814 00:44:05,710 --> 00:44:10,800 So let's try to parse what we've got up here. 815 00:44:10,800 --> 00:44:14,970 Let's look just at the quantity inside here. 816 00:44:14,970 --> 00:44:16,610 What kind of quantity is it? 817 00:44:16,610 --> 00:44:19,190 818 00:44:19,190 --> 00:44:21,030 It's a random variable. 819 00:44:21,030 --> 00:44:22,370 Why? 820 00:44:22,370 --> 00:44:26,540 X is random, the random variable, expected value of X 821 00:44:26,540 --> 00:44:28,090 is a number. 822 00:44:28,090 --> 00:44:30,800 Subtract a number from a random variable, you get 823 00:44:30,800 --> 00:44:32,460 another random variable. 824 00:44:32,460 --> 00:44:35,630 Take a random variable and square it, you get another 825 00:44:35,630 --> 00:44:36,810 random variable. 
826 00:44:36,810 --> 00:44:40,590 So the thing inside here is a legitimate random variable. 827 00:44:40,590 --> 00:44:44,950 What kind of random variable is it? 828 00:44:44,950 --> 00:44:47,720 So suppose that we have our experiment and we have 829 00:44:47,720 --> 00:44:49,452 different x's that can happen. 830 00:44:49,452 --> 00:44:52,310 831 00:44:52,310 --> 00:44:56,090 And the mean of X in this picture might be somewhere 832 00:44:56,090 --> 00:44:57,340 around here. 833 00:44:57,340 --> 00:45:00,500 834 00:45:00,500 --> 00:45:02,570 I do the experiment. 835 00:45:02,570 --> 00:45:05,350 I obtain some numerical value of x. 836 00:45:05,350 --> 00:45:09,610 Let's say I obtain this numerical value. 837 00:45:09,610 --> 00:45:13,810 I look at the distance from the mean, which is this 838 00:45:13,810 --> 00:45:18,460 length, and I take the square of that. 839 00:45:18,460 --> 00:45:22,730 Each time that I do the experiment, I go and record my 840 00:45:22,730 --> 00:45:25,780 distance from the mean and square it. 841 00:45:25,780 --> 00:45:29,490 So I give more emphasis to big distances. 842 00:45:29,490 --> 00:45:33,370 And then I take the average over all possible outcomes, 843 00:45:33,370 --> 00:45:35,520 all possible numerical values. 844 00:45:35,520 --> 00:45:39,510 So I'm trying to compute the average squared 845 00:45:39,510 --> 00:45:42,980 distance from the mean. 846 00:45:42,980 --> 00:45:47,770 This corresponds to this formula here. 847 00:45:47,770 --> 00:45:51,110 So the picture that I drew corresponds to that. 848 00:45:51,110 --> 00:45:55,920 For every possible numerical value of x, that numerical 849 00:45:55,920 --> 00:45:59,010 value corresponds to a certain distance from the mean 850 00:45:59,010 --> 00:46:03,580 squared, and I weight according to how likely is 851 00:46:03,580 --> 00:46:07,360 that particular value of x to arise. 
852 00:46:07,360 --> 00:46:10,840 So this measures the average squared 853 00:46:10,840 --> 00:46:13,280 distance from the mean. 854 00:46:13,280 --> 00:46:17,180 Now, because of that expected value rule, of course, this 855 00:46:17,180 --> 00:46:20,010 thing is the same as that expectation. 856 00:46:20,010 --> 00:46:23,880 It's the average value of the random variable, which is the 857 00:46:23,880 --> 00:46:26,300 squared distance from the mean. 858 00:46:26,300 --> 00:46:29,820 With this probability, the random variable takes on this 859 00:46:29,820 --> 00:46:33,050 numerical value, and the squared distance from the mean 860 00:46:33,050 --> 00:46:37,200 ends up taking that particular numerical value. 861 00:46:37,200 --> 00:46:37,680 OK. 862 00:46:37,680 --> 00:46:40,560 So why is the variance interesting? 863 00:46:40,560 --> 00:46:45,380 It tells us how far away from the mean we expect to be on 864 00:46:45,380 --> 00:46:46,900 the average. 865 00:46:46,900 --> 00:46:49,550 Well, actually we're not counting distances from the 866 00:46:49,550 --> 00:46:51,630 mean, it's distances squared. 867 00:46:51,630 --> 00:46:56,500 So it gives more emphasis to the kind of outliers in here. 868 00:46:56,500 --> 00:46:59,090 But it's a measure of how spread out the 869 00:46:59,090 --> 00:47:01,180 distribution is. 870 00:47:01,180 --> 00:47:05,240 A big variance means that those bars go far to the left 871 00:47:05,240 --> 00:47:07,010 and to the right, typically. 872 00:47:07,010 --> 00:47:10,230 Whereas a small variance would mean that all those bars 873 00:47:10,230 --> 00:47:13,850 are tightly concentrated around the mean value. 874 00:47:13,850 --> 00:47:16,190 It's the average squared deviation. 875 00:47:16,190 --> 00:47:18,970 Small variance means that we generally have small 876 00:47:18,970 --> 00:47:19,580 deviations. 877 00:47:19,580 --> 00:47:22,500 Large variances mean that we generally have large 878 00:47:22,500 --> 00:47:24,210 deviations. 
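To make this concrete, here is the variance of the earlier reward example (1, 2, and 4 dollars with probabilities 1/6, 1/2, 1/3), computed in Python straight from the definition, and checked against the equivalent formula E[X^2] - (E[X])^2 that the lecture cites:

```python
# the reward PMF from earlier in the lecture
px = {1: 1/6, 2: 1/2, 4: 1/3}

e_x = sum(x * p for x, p in px.items())        # E[X] = 2.5
e_x2 = sum(x * x * p for x, p in px.items())   # E[X^2]

# definition: expected squared deviation from the mean
var_by_definition = sum((x - e_x) ** 2 * p for x, p in px.items())

# equivalent shortcut: var(X) = E[X^2] - (E[X])^2
var_by_formula = e_x2 - e_x ** 2
```

Both routes give the same number, 1.25, for this PMF.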
879 00:47:24,210 --> 00:47:27,310 Now as a practical matter, when you want to calculate the 880 00:47:27,310 --> 00:47:31,140 variance, there's a handy formula which I'm not proving 881 00:47:31,140 --> 00:47:33,110 but you will see it in recitation. 882 00:47:33,110 --> 00:47:36,270 It's just two lines of algebra. 883 00:47:36,270 --> 00:47:40,680 And it allows us to calculate it in a somewhat simpler way. 884 00:47:40,680 --> 00:47:43,110 We need to calculate the expected value of the random 885 00:47:43,110 --> 00:47:45,210 variable and the expected value of the square of the 886 00:47:45,210 --> 00:47:47,580 random variable, and these two are going 887 00:47:47,580 --> 00:47:49,710 to give us the variance. 888 00:47:49,710 --> 00:47:53,970 So to summarize what we did up here, the variance, by 889 00:47:53,970 --> 00:47:57,370 definition, is given by this formula. 890 00:47:57,370 --> 00:48:01,470 It's the expected value of the squared deviation. 891 00:48:01,470 --> 00:48:06,380 But we have the equivalent formula, which comes from 892 00:48:06,380 --> 00:48:13,960 applying the expected value rule to the function g 893 00:48:13,960 --> 00:48:18,690 of X equal to (x minus the expected value of X)-squared. 894 00:48:18,690 --> 00:48:25,640 895 00:48:25,640 --> 00:48:26,330 OK. 896 00:48:26,330 --> 00:48:27,460 So this is the definition. 897 00:48:27,460 --> 00:48:31,170 This comes from the expected value rule. 898 00:48:31,170 --> 00:48:35,010 What are some properties of the variance? 899 00:48:35,010 --> 00:48:38,650 Of course variances are always non-negative. 900 00:48:38,650 --> 00:48:40,880 Why is it always non-negative? 901 00:48:40,880 --> 00:48:43,650 Well, you look at the definition and you're just 902 00:48:43,650 --> 00:48:45,660 adding up non-negative things. 903 00:48:45,660 --> 00:48:47,630 We're adding squared deviations. 
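[The handy formula mentioned above is Var(X) = E[X^2] - (E[X])^2. The Python sketch below (not from the lecture; the PMF is invented) checks that it agrees with the definition E[(X - E[X])^2].]

```python
# Hypothetical PMF for illustration: value -> probability.
pmf = {0: 0.5, 1: 0.25, 2: 0.25}

mean = sum(x * p for x, p in pmf.items())        # E[X]
e_x2 = sum(x ** 2 * p for x, p in pmf.items())   # E[X^2]

# Variance by the definition: E[(X - E[X])^2].
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Variance by the shortcut formula: E[X^2] - (E[X])^2.
var_shortcut = e_x2 - mean ** 2

print(var_definition)  # 0.6875
print(var_shortcut)    # 0.6875 -- the two formulas agree
```

[The shortcut is often simpler in practice because E[X] and E[X^2] are each a single pass over the PMF, with no need to first center the values.]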
904 00:48:47,630 --> 00:48:50,100 So when you add non-negative things, you get something 905 00:48:50,100 --> 00:48:51,400 non-negative. 906 00:48:51,400 --> 00:48:55,800 The next question is, how do things scale if you take a 907 00:48:55,800 --> 00:48:59,880 linear function of a random variable? 908 00:48:59,880 --> 00:49:02,350 Let's think about the effects of beta. 909 00:49:02,350 --> 00:49:06,200 If I take a random variable and add a constant to it, 910 00:49:06,200 --> 00:49:09,820 how does this affect the amount of spread that we have? 911 00:49:09,820 --> 00:49:10,950 It doesn't affect-- 912 00:49:10,950 --> 00:49:14,610 whatever the spread of this thing is, if I add the 913 00:49:14,610 --> 00:49:18,840 constant beta, it just moves this diagram here, but the 914 00:49:18,840 --> 00:49:21,930 spread doesn't grow or get reduced. 915 00:49:21,930 --> 00:49:24,470 The thing is that when I'm adding a constant to a random 916 00:49:24,470 --> 00:49:28,160 variable, all the x's that are going to appear are further to 917 00:49:28,160 --> 00:49:32,890 the right, but the expected value also moves to the right. 918 00:49:32,890 --> 00:49:35,960 And since we're only interested in distances from 919 00:49:35,960 --> 00:49:39,500 the mean, these distances do not get affected. 920 00:49:39,500 --> 00:49:42,180 x gets increased by something. 921 00:49:42,180 --> 00:49:44,390 The mean gets increased by that same something. 922 00:49:44,390 --> 00:49:46,180 The difference stays the same. 923 00:49:46,180 --> 00:49:49,350 So adding a constant to a random variable doesn't do 924 00:49:49,350 --> 00:49:51,050 anything to its variance. 925 00:49:51,050 --> 00:49:54,940 But if I multiply a random variable by a constant alpha, 926 00:49:54,940 --> 00:49:58,730 what is that going to do to its variance? 
927 00:49:58,730 --> 00:50:04,720 Because we have a square here, when I multiply my random 928 00:50:04,720 --> 00:50:08,430 variable by a constant, this x gets multiplied by a constant, 929 00:50:08,430 --> 00:50:12,310 the mean gets multiplied by a constant, the square gets 930 00:50:12,310 --> 00:50:15,650 multiplied by the square of that constant. 931 00:50:15,650 --> 00:50:18,960 And because of that reason, we get this square of alpha 932 00:50:18,960 --> 00:50:20,210 showing up here. 933 00:50:20,210 --> 00:50:22,870 So that's how variances transform under linear 934 00:50:22,870 --> 00:50:23,650 transformations. 935 00:50:23,650 --> 00:50:26,180 You multiply your random variable by a constant, the 936 00:50:26,180 --> 00:50:30,540 variance goes up by the square of that same constant. 937 00:50:30,540 --> 00:50:31,290 OK. 938 00:50:31,290 --> 00:50:32,950 That's it for today. 939 00:50:32,950 --> 00:50:34,200 See you on Wednesday. 940 00:50:34,200 --> 00:50:34,750
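[The two properties of the variance discussed in this lecture -- adding a constant beta leaves it unchanged, while multiplying by alpha scales it by alpha squared -- can be checked numerically. This sketch is not from the lecture; the PMF and the constants alpha and beta are made up for illustration.]

```python
# Hypothetical PMF of X and hypothetical constants, chosen for illustration.
pmf = {-1: 0.5, 1: 0.5}
alpha, beta = 3.0, 10.0

def var(pmf):
    """Variance of a discrete random variable given as a value -> probability dict."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# PMF of X + beta: every value shifts right, and so does the mean.
shifted = {x + beta: p for x, p in pmf.items()}
# PMF of alpha * X: every value (and the mean) gets multiplied by alpha.
scaled = {alpha * x: p for x, p in pmf.items()}

print(var(pmf))      # 1.0
print(var(shifted))  # 1.0 -- adding beta doesn't change the variance
print(var(scaled))   # 9.0 -- variance goes up by alpha squared
```

[Together these give the general rule for a linear function: Var(alpha X + beta) = alpha^2 Var(X).]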