1 00:00:00,650 --> 00:00:03,990 In this segment, we introduce the concept of continuous 2 00:00:03,990 --> 00:00:07,230 random variables and their characterization in terms of 3 00:00:07,230 --> 00:00:11,100 probability density functions, or PDFs for short. 4 00:00:11,100 --> 00:00:13,810 Let us first go back to discrete random variables. 5 00:00:13,810 --> 00:00:15,800 A discrete random variable takes values 6 00:00:15,800 --> 00:00:17,130 in a discrete set. 7 00:00:17,130 --> 00:00:20,430 There is a total of one unit of probability assigned to the 8 00:00:20,430 --> 00:00:21,780 possible values. 9 00:00:21,780 --> 00:00:26,080 And the PMF tells us exactly how much of this probability 10 00:00:26,080 --> 00:00:28,100 is assigned to each value. 11 00:00:28,100 --> 00:00:33,590 So we can think of the bars in the PMF as point masses with 12 00:00:33,590 --> 00:00:37,000 positive weight that sit on top of each 13 00:00:37,000 --> 00:00:38,870 possible numerical value. 14 00:00:38,870 --> 00:00:41,440 And we can calculate the probability that the random 15 00:00:41,440 --> 00:00:45,720 variable falls inside an interval by adding all the 16 00:00:45,720 --> 00:00:49,270 masses that sit on top of that interval. 17 00:00:49,270 --> 00:00:52,890 So for example, if we're looking at the interval from a 18 00:00:52,890 --> 00:00:57,730 to b, the probability of this interval is equal to the sum 19 00:00:57,730 --> 00:01:01,530 of the probabilities of these three masses that fall inside 20 00:01:01,530 --> 00:01:02,870 this interval. 21 00:01:02,870 --> 00:01:05,690 On the other hand, a continuous random variable 22 00:01:05,690 --> 00:01:08,860 will be taking values over a continuous range-- 23 00:01:08,860 --> 00:01:12,800 for example, the real line or an interval on the real line. 
24 00:01:12,800 --> 00:01:18,000 In this case, we still have one total unit of probability 25 00:01:18,000 --> 00:01:21,350 mass that is assigned to the possible values of the random 26 00:01:21,350 --> 00:01:25,390 variable, except that this unit of mass is spread all 27 00:01:25,390 --> 00:01:26,860 over the real line. 28 00:01:26,860 --> 00:01:29,990 But it is not spread in a uniform manner. 29 00:01:29,990 --> 00:01:32,210 Some parts of the real line have more 30 00:01:32,210 --> 00:01:33,880 mass per unit length. 31 00:01:33,880 --> 00:01:35,670 Some have less. 32 00:01:35,670 --> 00:01:39,900 How much mass exactly is sitting on top of each part of 33 00:01:39,900 --> 00:01:43,090 the real line is described by the probability density 34 00:01:43,090 --> 00:01:47,380 function, this function plotted here, which we denote 35 00:01:47,380 --> 00:01:48,840 with this notation. 36 00:01:48,840 --> 00:01:52,090 The letter f will always indicate that we are dealing 37 00:01:52,090 --> 00:01:53,440 with a PDF. 38 00:01:53,440 --> 00:01:57,310 And the subscript will indicate which random variable 39 00:01:57,310 --> 00:02:00,190 we're talking about. 40 00:02:00,190 --> 00:02:03,210 We use the probability density function to calculate the 41 00:02:03,210 --> 00:02:07,020 probability that X lies in a certain interval-- 42 00:02:07,020 --> 00:02:10,560 let's say the interval from a to b. 43 00:02:10,560 --> 00:02:16,079 And we calculate it by finding the area under the PDF that 44 00:02:16,079 --> 00:02:19,900 sits on top of that interval. 45 00:02:19,900 --> 00:02:23,300 So this area here, the shaded area, is the probability that 46 00:02:23,300 --> 00:02:26,100 X takes values in this interval. 47 00:02:26,100 --> 00:02:28,990 Think of probability as snowfall. 48 00:02:28,990 --> 00:02:31,770 There is one pound of snow that has fallen on 49 00:02:31,770 --> 00:02:34,500 top of the real line. 
50 00:02:34,500 --> 00:02:39,160 The PDF tells us the height of the snow accumulated over a 51 00:02:39,160 --> 00:02:41,329 particular point. 52 00:02:41,329 --> 00:02:47,190 We then find the weight of the overall amount of snow sitting 53 00:02:47,190 --> 00:02:51,079 on top of an interval by calculating the area under 54 00:02:51,079 --> 00:02:52,400 this curve. 55 00:02:52,400 --> 00:02:56,750 Of course, mathematically, area under the curve is just 56 00:02:56,750 --> 00:02:57,720 an integral. 57 00:02:57,720 --> 00:03:01,320 So the probability that X takes values in this interval 58 00:03:01,320 --> 00:03:07,790 is the integral of the PDF over this particular interval. 59 00:03:07,790 --> 00:03:10,320 What properties should the PDF have? 60 00:03:10,320 --> 00:03:14,780 By analogy with the discrete case, a PDF must be 61 00:03:14,780 --> 00:03:18,060 non-negative, because we do not want to get negative 62 00:03:18,060 --> 00:03:20,290 probabilities. 63 00:03:20,290 --> 00:03:24,870 In the discrete case, the sum of the PMF entries has to be 64 00:03:24,870 --> 00:03:26,680 equal to 1. 65 00:03:26,680 --> 00:03:31,240 In the continuous case, X is certain to lie in the interval 66 00:03:31,240 --> 00:03:34,200 between minus infinity and plus infinity. 67 00:03:34,200 --> 00:03:39,260 So letting a be minus infinity and b plus infinity, we should 68 00:03:39,260 --> 00:03:41,680 get a probability of 1. 69 00:03:41,680 --> 00:03:46,180 So the total area under the PDF, when we integrate over 70 00:03:46,180 --> 00:03:49,780 the entire real line, should be equal to 1. 71 00:03:49,780 --> 00:03:55,140 These two conditions are all that we need in order to have 72 00:03:55,140 --> 00:03:58,300 a legitimate PDF. 73 00:03:58,300 --> 00:04:01,280 We can now give a formal definition of what a 74 00:04:01,280 --> 00:04:03,770 continuous random variable is. 
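Before turning to that definition, here is a small numerical sketch of the two PDF properties and the area-under-the-curve formula described above. The exponential density, the helper names `pdf` and `integrate`, and the interval from 1 to 2 are all assumptions chosen for illustration; none of them come from the lecture.

```python
import math


def pdf(x, rate=1.0):
    """Exponential density with the given rate (an illustrative choice)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# P(a <= X <= b) is the area under the PDF over [a, b].
prob = integrate(pdf, 1.0, 2.0)
exact = math.exp(-1.0) - math.exp(-2.0)  # closed form for this density

# Property 1: the PDF is non-negative (checked on a grid from -5 to 45).
nonneg = all(pdf(-5.0 + 0.01 * k) >= 0.0 for k in range(5000))

# Property 2: the total area is 1. The density is 0 below 0 and the
# tail beyond 50 is negligible, so integrating over [0, 50] is enough.
total = integrate(pdf, 0.0, 50.0)

print(prob, exact, nonneg, total)
```

Any other non-negative function with total area 1 could be substituted for `pdf` here; the midpoint rule is just one simple way to approximate the integral.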
75 00:04:03,770 --> 00:04:08,270 A continuous random variable is a random variable whose 76 00:04:08,270 --> 00:04:14,380 probabilities can be described by a PDF according to a 77 00:04:14,380 --> 00:04:17,209 formula of this type. 78 00:04:17,209 --> 00:04:19,899 An important point-- 79 00:04:19,899 --> 00:04:22,550 the fact that a random variable takes values on a 80 00:04:22,550 --> 00:04:26,980 continuous set is not enough to make it what we call a 81 00:04:26,980 --> 00:04:29,180 continuous random variable. 82 00:04:29,180 --> 00:04:30,910 For a continuous random variable, we're 83 00:04:30,910 --> 00:04:32,730 asking for a bit more-- 84 00:04:32,730 --> 00:04:36,630 that it can be described by a PDF, that a formula of this 85 00:04:36,630 --> 00:04:37,880 type is valid. 86 00:04:40,760 --> 00:04:44,240 Now, once we have the probabilities of intervals as 87 00:04:44,240 --> 00:04:48,380 given by a PDF, we can use additivity to calculate the 88 00:04:48,380 --> 00:04:51,340 probabilities of more complicated sets. 89 00:04:51,340 --> 00:04:54,520 For example, suppose you're interested in the probability 90 00:04:54,520 --> 00:05:06,050 that X lies between 1 and 3 or that X lies between 4 and 5-- 91 00:05:06,050 --> 00:05:11,250 so this is the probability that X falls in a region that 92 00:05:11,250 --> 00:05:14,940 consists of two disjoint intervals. 93 00:05:14,940 --> 00:05:19,790 We find the probability of the union of these two intervals, 94 00:05:19,790 --> 00:05:24,020 by additivity, by adding the probabilities of the two 95 00:05:24,020 --> 00:05:29,480 intervals, since these intervals are disjoint. 96 00:05:29,480 --> 00:05:35,840 And then we can use the PDF to calculate the probabilities of 97 00:05:35,840 --> 00:05:39,380 each one of these intervals according to this formula. 98 00:05:42,570 --> 00:05:45,780 At this point, you may be wondering what happened to the 99 00:05:45,780 --> 00:05:49,120 sample space in all this discussion. 
100 00:05:49,120 --> 00:05:53,070 Well, there is still an underlying sample space 101 00:05:53,070 --> 00:05:54,320 lurking in the background. 102 00:05:57,800 --> 00:06:01,860 And different outcomes in the sample space result in 103 00:06:01,860 --> 00:06:05,670 different numerical values for the 104 00:06:05,670 --> 00:06:07,135 random variable of interest. 105 00:06:09,790 --> 00:06:13,650 And when we talk about the probability that X takes 106 00:06:13,650 --> 00:06:18,050 values between some numbers a and b, what we really mean is 107 00:06:18,050 --> 00:06:23,780 the probability of those outcomes for which the 108 00:06:23,780 --> 00:06:27,430 resulting value of X lies inside 109 00:06:27,430 --> 00:06:29,800 this particular interval. 110 00:06:29,800 --> 00:06:31,720 So that's what probability means. 111 00:06:31,720 --> 00:06:35,659 On the other hand, once we have a PDF in our hands, we 112 00:06:35,659 --> 00:06:39,290 can completely forget about the underlying sample space. 113 00:06:39,290 --> 00:06:42,890 And we can carry out any calculations we may be 114 00:06:42,890 --> 00:06:46,680 interested in by just working with the PDF. 115 00:06:46,680 --> 00:06:50,600 So as we move on in this course, the sample space will 116 00:06:50,600 --> 00:06:52,400 be moved offstage. 117 00:06:52,400 --> 00:06:54,470 There will be less and less mention of it. 118 00:06:54,470 --> 00:06:58,159 And we will be working just directly with PDFs or with 119 00:06:58,159 --> 00:07:03,270 PMFs if we are dealing with discrete random variables. 120 00:07:03,270 --> 00:07:06,470 Let us now build a little bit on our understanding of what 121 00:07:06,470 --> 00:07:08,580 PDFs really are by looking at 122 00:07:08,580 --> 00:07:11,110 probabilities of small intervals. 123 00:07:11,110 --> 00:07:16,660 Let us look at an interval that starts at some a and goes 124 00:07:16,660 --> 00:07:20,340 up to some number a plus delta. 
125 00:07:20,340 --> 00:07:23,410 So here, delta is a positive number. 126 00:07:23,410 --> 00:07:25,270 But we're interested in the case where 127 00:07:25,270 --> 00:07:28,150 delta is very small. 128 00:07:28,150 --> 00:07:33,750 Let us look at the probability that X falls in this interval. 129 00:07:33,750 --> 00:07:40,640 The probability that X lies inside this interval is the 130 00:07:40,640 --> 00:07:42,390 area of this region. 131 00:07:45,650 --> 00:07:50,630 On the other hand, as long as f does not change too much 132 00:07:50,630 --> 00:07:53,590 over this little interval, which will be the case if we 133 00:07:53,590 --> 00:07:59,520 have a continuous density f, then we can approximate the 134 00:07:59,520 --> 00:08:04,670 area of this region by the area of a rectangle where 135 00:08:04,670 --> 00:08:08,130 we keep the height constant. 136 00:08:08,130 --> 00:08:14,800 The area of this rectangle is equal to the height, which is 137 00:08:14,800 --> 00:08:21,360 the value of the PDF at the point a, times the base of the 138 00:08:21,360 --> 00:08:24,360 rectangle, which is equal to delta. 139 00:08:24,360 --> 00:08:27,560 So this gives us an interpretation of PDFs in 140 00:08:27,560 --> 00:08:30,940 terms of probabilities of small intervals. 141 00:08:30,940 --> 00:08:34,370 If we take this factor of delta and send it to the other 142 00:08:34,370 --> 00:08:37,960 side in this approximate equality, we see that the 143 00:08:37,960 --> 00:08:43,308 value of the PDF can be interpreted as probability per 144 00:08:43,308 --> 00:08:44,760 unit length. 145 00:08:44,760 --> 00:08:47,350 So PDFs are not probabilities. 146 00:08:47,350 --> 00:08:48,750 They are densities. 147 00:08:48,750 --> 00:08:52,770 Their units are probability per unit length. 148 00:08:52,770 --> 00:08:57,710 Now, if the probability per unit length is finite and the 149 00:08:57,710 --> 00:09:04,410 length delta is sent to 0, we will get 0 probability. 
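The small-interval approximation above can be sketched numerically. The standard normal density, the point a = 0.5, and the function names `pdf`, `cdf`, and `interval_vs_rectangle` are assumptions made for illustration and are not part of the lecture. The exact probability of the interval from a to a plus delta is computed from the normal CDF (via `math.erf`) and compared with the rectangle area, the PDF at a times delta.

```python
import math


def pdf(x):
    """Standard normal density (an illustrative choice of continuous PDF)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal CDF, i.e. the integral of pdf from -infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_vs_rectangle(a, delta):
    """Return (exact probability of [a, a + delta], rectangle area f(a) * delta)."""
    exact = cdf(a + delta) - cdf(a)
    approx = pdf(a) * delta
    return exact, approx


a = 0.5
for delta in (0.1, 0.01, 0.001):
    exact, approx = interval_vs_rectangle(a, delta)
    # The ratio approaches 1 as delta shrinks, and both values go to 0.
    print(delta, exact, approx, exact / approx)
```

As delta shrinks, the ratio of the two quantities approaches 1, which is the sense in which the PDF is probability per unit length, and both quantities themselves shrink toward 0.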
150 00:09:04,410 --> 00:09:09,690 More formally, if we look at this integral and we let b 151 00:09:09,690 --> 00:09:16,200 be the same as a, then we obtain the probability that X 152 00:09:16,200 --> 00:09:19,150 is equal to a. 153 00:09:19,150 --> 00:09:21,620 And on that side, we get an integral 154 00:09:21,620 --> 00:09:24,176 over a 0 length interval. 155 00:09:24,176 --> 00:09:26,800 And that integral is going to be 0. 156 00:09:26,800 --> 00:09:30,960 So we obtain that the probability that X takes a 157 00:09:30,960 --> 00:09:35,900 value equal to a specific, particular point-- 158 00:09:35,900 --> 00:09:39,220 that probability is going to be equal to 0. 159 00:09:39,220 --> 00:09:42,520 So for a continuous random variable, any particular point 160 00:09:42,520 --> 00:09:44,710 has 0 probability. 161 00:09:44,710 --> 00:09:50,280 Yet somehow, collectively, the infinitely many points in an 162 00:09:50,280 --> 00:09:55,070 interval together will have positive probability. 163 00:09:55,070 --> 00:09:56,650 Is this a puzzle? 164 00:09:56,650 --> 00:09:57,260 Not really. 165 00:09:57,260 --> 00:09:59,890 That's exactly what happens, also, with the 166 00:09:59,890 --> 00:10:02,050 ordinary notion of length. 167 00:10:02,050 --> 00:10:06,590 Single points have 0 length, yet by putting together lots 168 00:10:06,590 --> 00:10:14,030 of points, we can create a set that has a positive length. 169 00:10:14,030 --> 00:10:17,680 And a final consequence of the fact that individual points 170 00:10:17,680 --> 00:10:20,650 have 0 probability. 
171 00:10:20,650 --> 00:10:24,380 Using the additivity axiom, the probability that our 172 00:10:24,380 --> 00:10:29,680 random variable takes values inside an interval is equal to 173 00:10:29,680 --> 00:10:33,150 the probability that our random variable takes a value 174 00:10:33,150 --> 00:10:37,780 of a plus the probability that our random variable takes a 175 00:10:37,780 --> 00:10:42,470 value of b plus the probability that our random 176 00:10:42,470 --> 00:10:47,390 variable is strictly between a and b. 177 00:10:47,390 --> 00:10:54,030 According to our discussion, this term is equal to 0. 178 00:10:54,030 --> 00:10:56,230 And this term is equal to 0. 179 00:10:56,230 --> 00:10:59,070 And so we conclude that the probability of a closed 180 00:10:59,070 --> 00:11:00,650 interval is the same as the 181 00:11:00,650 --> 00:11:03,100 probability of an open interval. 182 00:11:03,100 --> 00:11:05,340 When calculating probabilities, it does not 183 00:11:05,340 --> 00:11:08,300 matter whether we include the endpoints or not.
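A minimal sketch of these last points, together with the earlier example of two disjoint intervals, under stated assumptions: an exponential random variable with a hypothetical rate of 0.5, whose CDF has a closed form. The names `RATE`, `cdf`, and `prob_interval` are illustrative and do not come from the lecture.

```python
import math

RATE = 0.5  # hypothetical rate of an exponential random variable


def cdf(x):
    """Integral of the exponential PDF from -infinity to x."""
    return 1.0 - math.exp(-RATE * x) if x > 0 else 0.0


def prob_interval(a, b):
    """P(a <= X <= b): the integral of the PDF over [a, b]."""
    return cdf(b) - cdf(a)


# A single point is a zero-length interval, so its probability is 0.
point = prob_interval(2.0, 2.0)

# Additivity: P(a <= X <= b) = P(X = a) + P(X = b) + P(a < X < b),
# so removing the two endpoint terms leaves the open-interval probability.
closed = prob_interval(1.0, 3.0)
open_ = closed - prob_interval(1.0, 1.0) - prob_interval(3.0, 3.0)

# Two disjoint intervals: the probabilities add.
union = prob_interval(1.0, 3.0) + prob_interval(4.0, 5.0)

print(point, closed, open_, union)
```

The closed-interval and open-interval probabilities come out identical because both endpoint terms are exactly 0.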