1 00:00:01,110 --> 00:00:04,280 We now move to the case of continuous random variables. 2 00:00:04,280 --> 00:00:07,660 We will start with a special case where we want to find the 3 00:00:07,660 --> 00:00:12,810 PDF of a linear function of a continuous random variable. 4 00:00:12,810 --> 00:00:17,780 We will start by considering a simple example, and study it 5 00:00:17,780 --> 00:00:20,410 using an intuitive argument. 6 00:00:20,410 --> 00:00:23,580 And afterwards, we will justify our conclusions 7 00:00:23,580 --> 00:00:24,870 mathematically. 8 00:00:24,870 --> 00:00:28,190 So we start with a random variable X that has a PDF of 9 00:00:28,190 --> 00:00:31,560 the form shown in this figure so that it is a piecewise 10 00:00:31,560 --> 00:00:33,510 constant PDF. 11 00:00:33,510 --> 00:00:36,650 We then consider a random variable Z, which is defined 12 00:00:36,650 --> 00:00:42,180 to be 2 times X. The random variable X takes values 13 00:00:42,180 --> 00:00:44,250 between minus 1 and 1. 14 00:00:44,250 --> 00:00:51,560 So Z takes values between minus 2 and 2. 15 00:00:51,560 --> 00:00:56,500 Now, values of X between minus 1 and 0 correspond to values 16 00:00:56,500 --> 00:00:59,810 of Z between minus 2 and 0. 17 00:00:59,810 --> 00:01:03,780 The different values of X in this range are, in some sense, 18 00:01:03,780 --> 00:01:06,750 equally likely, because we have a constant PDF. 19 00:01:06,750 --> 00:01:09,300 And that argues that the corresponding values of Z 20 00:01:09,300 --> 00:01:12,200 should also be, in some sense, equally likely. 21 00:01:12,200 --> 00:01:16,180 So the PDF should be constant over this range. 22 00:01:16,180 --> 00:01:20,930 By a similar argument, the PDF of Z should also be constant 23 00:01:20,930 --> 00:01:23,480 over the range from 0 to 2. 24 00:01:23,480 --> 00:01:28,200 And the PDF must, of course, be 0 outside this range, 25 00:01:28,200 --> 00:01:31,780 because these are values of Z that are impossible. 
26 00:01:31,780 --> 00:01:35,759 Let us now try to figure out the parameters of this PDF. 27 00:01:35,759 --> 00:01:40,729 The probability that X is positive is the 28 00:01:40,729 --> 00:01:42,979 area of this rectangle. 29 00:01:42,979 --> 00:01:46,000 And the area of this rectangle is 2/3. 30 00:01:46,000 --> 00:01:50,710 So the area of this rectangle should also be 2/3. 31 00:01:50,710 --> 00:01:54,810 And that means that the height of this rectangle should be 32 00:01:54,810 --> 00:01:56,830 equal to 1/3. 33 00:01:56,830 --> 00:02:00,770 Similarly, the probability that X is negative is the area 34 00:02:00,770 --> 00:02:04,070 of this rectangle, and the area of this rectangle is 35 00:02:04,070 --> 00:02:05,920 equal to 1/3. 36 00:02:05,920 --> 00:02:09,130 When X is negative, Z is also negative, so the probability 37 00:02:09,130 --> 00:02:11,960 of a negative value should be equal to 1/3. 38 00:02:11,960 --> 00:02:15,650 And for the area of this rectangle to be 1/3, it means 39 00:02:15,650 --> 00:02:20,240 that the height of this rectangle should be 1/6. 40 00:02:20,240 --> 00:02:21,530 So what happened here? 41 00:02:21,530 --> 00:02:26,420 We started with a PDF of X and essentially stretched it out 42 00:02:26,420 --> 00:02:31,100 by a factor of 2 while keeping the same shape. 43 00:02:31,100 --> 00:02:35,300 However, we also scaled it down by a 44 00:02:35,300 --> 00:02:36,880 corresponding amount. 45 00:02:36,880 --> 00:02:41,660 So 2/3 became 1/3, and 1/3 became 1/6. 46 00:02:41,660 --> 00:02:45,590 The reason for this scaling down is because we need the 47 00:02:45,590 --> 00:02:49,870 total probability, the total area under this PDF, to be 48 00:02:49,870 --> 00:02:52,270 equal to 1. 49 00:02:52,270 --> 00:02:57,540 If we now add a number, let's say 3, to the random variable 50 00:02:57,540 --> 00:03:00,340 Z, what is going to happen? 
51 00:03:00,340 --> 00:03:03,740 The random variable Y now will take values from 52 00:03:03,740 --> 00:03:05,290 minus 2 plus 3-- 53 00:03:05,290 --> 00:03:07,010 this is plus 1-- 54 00:03:07,010 --> 00:03:11,660 all the way up to 2 plus 3, which is plus 5. 55 00:03:14,170 --> 00:03:20,700 Values in the range from 1 to 3 correspond to values of Z in 56 00:03:20,700 --> 00:03:22,590 the range from minus 2 to 0. 57 00:03:22,590 --> 00:03:26,360 These values are all, in some sense, equally likely. 58 00:03:26,360 --> 00:03:29,290 So they should also be equally likely here. 59 00:03:29,290 --> 00:03:33,380 And by a similar argument, these values in the range from 60 00:03:33,380 --> 00:03:36,600 3 to 5 should also be equally likely. 61 00:03:36,600 --> 00:03:42,280 This rectangle corresponds to this rectangle here. 62 00:03:42,280 --> 00:03:44,030 So the area should be the same. 63 00:03:44,030 --> 00:03:47,320 And therefore, the height should also be the same. 64 00:03:47,320 --> 00:03:50,090 Therefore, the height here should be 1/6. 65 00:03:50,090 --> 00:03:52,150 And by the same argument, the height here 66 00:03:52,150 --> 00:03:53,450 should be equal to 1/3. 67 00:03:56,130 --> 00:04:00,640 So what happens here is that when we add 3 to a random 68 00:04:00,640 --> 00:04:06,590 variable, the PDF just gets shifted by 3 but otherwise 69 00:04:06,590 --> 00:04:09,510 retains the same shape. 70 00:04:09,510 --> 00:04:12,220 So the story is entirely similar to what happened in 71 00:04:12,220 --> 00:04:13,520 the discrete case. 72 00:04:13,520 --> 00:04:18,660 We start with a PDF of X. We stretch it horizontally by a 73 00:04:18,660 --> 00:04:20,360 factor of 2. 74 00:04:20,360 --> 00:04:24,280 And then we shift it horizontally by 3. 75 00:04:24,280 --> 00:04:27,640 The only difference is that here in the continuous case, 76 00:04:27,640 --> 00:04:33,820 we also need to scale the plot in the vertical dimension by a 77 00:04:33,820 --> 00:04:35,330 factor of 2. 
78 00:04:35,330 --> 00:04:38,909 Actually, make it smaller by a factor of 2. 79 00:04:38,909 --> 00:04:42,300 And this needs to be done in order to keep the total area 80 00:04:42,300 --> 00:04:46,302 under the PDF equal to 1. 81 00:04:46,302 --> 00:04:49,390 Let us now go through a mathematical argument with the 82 00:04:49,390 --> 00:04:52,700 purpose of also finding a formula that represents what 83 00:04:52,700 --> 00:04:55,730 we just did in our previous example. 84 00:04:55,730 --> 00:04:58,980 Let Y be equal to aX plus b. 85 00:04:58,980 --> 00:05:03,350 Here, X is a random variable with a given PDF. 86 00:05:03,350 --> 00:05:06,140 a and b are given constants. 87 00:05:06,140 --> 00:05:11,990 Now, if a is equal to 0, then Y is identically equal to b. 88 00:05:11,990 --> 00:05:14,020 So it is a constant random variable and 89 00:05:14,020 --> 00:05:15,560 does not have a PDF. 90 00:05:15,560 --> 00:05:20,080 So let us exclude this case and start by assuming that a 91 00:05:20,080 --> 00:05:23,400 is a positive number. 92 00:05:23,400 --> 00:05:27,490 We can try to work, as in the discrete case, and try 93 00:05:27,490 --> 00:05:29,440 something like the following. 94 00:05:29,440 --> 00:05:34,120 The probability that Y takes on a specific value is the 95 00:05:34,120 --> 00:05:39,430 same as the probability that aX plus b takes on a specific 96 00:05:39,430 --> 00:05:44,030 value, which is the same as the probability that X takes 97 00:05:44,030 --> 00:05:48,390 on the specific value, y minus b divided by a. 98 00:05:48,390 --> 00:05:50,909 This equality was useful in the discrete case. 99 00:05:50,909 --> 00:05:52,610 Is it useful here? 100 00:05:52,610 --> 00:05:53,850 Unfortunately not. 
101 00:05:53,850 --> 00:05:56,180 When we're dealing with continuous random variables, 102 00:05:56,180 --> 00:05:58,710 the probability that the continuous random variable is 103 00:05:58,710 --> 00:06:02,210 exactly equal to a given number, this probability is 104 00:06:02,210 --> 00:06:03,440 going to be equal to 0. 105 00:06:03,440 --> 00:06:05,750 And the same applies to this side as well. 106 00:06:05,750 --> 00:06:08,190 So we have that 0 is equal to 0. 107 00:06:08,190 --> 00:06:12,030 And this is uninformative, and we have not made any progress. 108 00:06:12,030 --> 00:06:15,020 So instead of working with probabilities of individual 109 00:06:15,020 --> 00:06:19,110 points which will always be 0, we will work with 110 00:06:19,110 --> 00:06:23,140 probabilities of intervals that generally have non-zero 111 00:06:23,140 --> 00:06:24,340 probability. 112 00:06:24,340 --> 00:06:26,930 The trick is to work with CDFs. 113 00:06:26,930 --> 00:06:33,690 So let us try to find the CDF of Y. The CDF of the random 114 00:06:33,690 --> 00:06:37,950 variable Y is defined as the probability that the random 115 00:06:37,950 --> 00:06:42,000 variable is less than or equal to a certain number. 116 00:06:42,000 --> 00:06:45,400 Now, in our case, Y is aX plus b. 117 00:06:49,210 --> 00:06:53,310 We move b to the other side of the inequality and then divide 118 00:06:53,310 --> 00:06:56,230 both sides of the inequality by a. 119 00:06:56,230 --> 00:07:00,700 And we get that this is the same as the probability that X 120 00:07:00,700 --> 00:07:07,010 is less than or equal to y minus b divided by a, which is 121 00:07:07,010 --> 00:07:14,350 the same as the CDF of X evaluated at y minus b over a. 122 00:07:14,350 --> 00:07:18,500 So we have a formula for the CDF of Y in terms of the CDF 123 00:07:18,500 --> 00:07:20,280 of X. 124 00:07:20,280 --> 00:07:22,060 How can we find the PDF? 125 00:07:22,060 --> 00:07:24,080 Simply by differentiating. 
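Written out, the chain of equalities just described for the case a > 0 can be sketched as follows. The symbols F_X and F_Y for the two CDFs are notation assumed here for illustration; the spoken text does not name them.

```latex
F_Y(y) \;=\; \mathbf{P}(Y \le y)
       \;=\; \mathbf{P}(aX + b \le y)
       \;=\; \mathbf{P}\!\left(X \le \frac{y-b}{a}\right)
       \;=\; F_X\!\left(\frac{y-b}{a}\right),
\qquad a > 0 .
```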
126 00:07:24,080 --> 00:07:28,130 We differentiate both sides of this equation. 127 00:07:28,130 --> 00:07:31,220 The derivative of a CDF is a PDF. 128 00:07:31,220 --> 00:07:36,890 And therefore, the PDF of Y is going to be equal to the 129 00:07:36,890 --> 00:07:38,730 derivative of this side. 130 00:07:38,730 --> 00:07:41,080 Here we need to use the chain rule. 131 00:07:41,080 --> 00:07:45,380 First, we take the derivative of this function. 132 00:07:45,380 --> 00:07:50,930 And the derivative of the CDF is a PDF, so the PDF of X 133 00:07:50,930 --> 00:07:53,315 evaluated at this particular number. 134 00:07:55,960 --> 00:07:59,680 But then we also need to take the derivative of the argument 135 00:07:59,680 --> 00:08:02,230 inside with respect to y. 136 00:08:02,230 --> 00:08:06,480 And that derivative is equal to 1/a. 137 00:08:06,480 --> 00:08:11,130 And this gives us a formula for the PDF of Y in terms of 138 00:08:11,130 --> 00:08:13,680 the PDF of X. 139 00:08:13,680 --> 00:08:18,950 How about the case where a is less than 0? 140 00:08:18,950 --> 00:08:20,370 What is going to change? 141 00:08:20,370 --> 00:08:24,570 The first step up to here remains valid. 142 00:08:24,570 --> 00:08:29,140 But now when we divide both sides of the inequality by a, 143 00:08:29,140 --> 00:08:32,520 the direction of the inequality gets reversed. 144 00:08:32,520 --> 00:08:38,510 So we obtain instead the probability that X is larger 145 00:08:38,510 --> 00:08:43,850 than or equal to y minus b divided by a. 146 00:08:43,850 --> 00:08:50,840 And this is 1 minus the probability that X is less 147 00:08:50,840 --> 00:08:54,040 than y minus b over a. 148 00:08:54,040 --> 00:08:57,070 Now, X is a continuous random variable, so the probability 149 00:08:57,070 --> 00:09:00,860 is not going to change if here we make the inequality to be a 150 00:09:00,860 --> 00:09:02,900 less than or equal sign. 
151 00:09:06,320 --> 00:09:12,760 And what we have here is 1 minus the CDF of X evaluated 152 00:09:12,760 --> 00:09:16,890 at y minus b over a. 153 00:09:16,890 --> 00:09:22,300 We use the chain rule once more, and we obtain that the 154 00:09:22,300 --> 00:09:32,920 PDF of Y, in this case, is equal to minus the PDF of X 155 00:09:32,920 --> 00:09:36,970 evaluated at y minus b over a times 1/a. 156 00:09:41,420 --> 00:09:45,240 Now, when a is positive, a is the same as the 157 00:09:45,240 --> 00:09:47,052 absolute value of a. 158 00:09:47,052 --> 00:09:50,450 When a is negative and we have this formula, we have here 159 00:09:50,450 --> 00:09:54,690 minus a, which is the same as the absolute value of a. 160 00:09:54,690 --> 00:09:59,560 So we can unify these two formulas by replacing the 161 00:09:59,560 --> 00:10:03,150 occurrences of a and that minus sign by just using the 162 00:10:03,150 --> 00:10:04,670 absolute value. 163 00:10:04,670 --> 00:10:10,360 And this gives us this formula for the PDF of Y in terms of 164 00:10:10,360 --> 00:10:14,270 the PDF of X. And it is a formula that's valid whether a 165 00:10:14,270 --> 00:10:18,510 is positive or negative. 166 00:10:18,510 --> 00:10:22,420 What this formula represents is the following. 167 00:10:22,420 --> 00:10:27,700 Because of the factor of a that we have here, we take the 168 00:10:27,700 --> 00:10:32,482 PDF of X and scale it horizontally by a factor of a. 169 00:10:32,482 --> 00:10:37,040 Because of the term b that we have here, the PDF also gets 170 00:10:37,040 --> 00:10:39,560 shifted horizontally by b. 171 00:10:39,560 --> 00:10:43,380 And finally, this term here corresponds to a vertical 172 00:10:43,380 --> 00:10:45,920 scaling of the plot that we have. 173 00:10:45,920 --> 00:10:49,980 And the reason that this term is present is so that the PDF 174 00:10:49,980 --> 00:10:53,750 of Y integrates to 1. 
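As a rough numerical check of the unified formula (the PDF of Y at y equals the PDF of X at (y - b)/a, divided by the absolute value of a), here is a minimal Python sketch applied to the earlier example with a = 2 and b = 3. The function names `pdf_x`, `pdf_linear`, and `integrate` are hypothetical helpers introduced only for illustration, and the endpoint conventions of the piecewise PDF are an assumption, since the figure is not reproduced in the text.

```python
def pdf_x(x):
    """Piecewise-constant PDF of X from the example.

    Height 1/3 on [-1, 0) and 2/3 on [0, 1]; which side owns each
    endpoint is an arbitrary choice and does not affect probabilities.
    """
    if -1 <= x < 0:
        return 1 / 3
    if 0 <= x <= 1:
        return 2 / 3
    return 0.0


def pdf_linear(pdf, a, b):
    """Return the PDF of Y = a*X + b given the PDF of X (requires a != 0)."""
    if a == 0:
        raise ValueError("a must be nonzero: Y = b is a constant and has no PDF")

    def pdf_y(y):
        # Horizontal scaling and shift via (y - b) / a,
        # vertical scaling via the factor 1 / |a|.
        return pdf((y - b) / a) / abs(a)

    return pdf_y


def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


pdf_y = pdf_linear(pdf_x, a=2, b=3)
print(pdf_y(2.0))               # a point in (1, 3); should be close to 1/6
print(pdf_y(4.0))               # a point in (3, 5); should be close to 1/3
print(integrate(pdf_y, 1, 5))   # total area; should be close to 1
```

Calling `pdf_linear(pdf_x, a=-2, b=3)` exercises the negative case: the shape is flipped as well as stretched, and dividing by the absolute value keeps the result nonnegative.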
175 00:10:53,750 --> 00:10:56,730 It is interesting to also compare with the corresponding 176 00:10:56,730 --> 00:10:59,620 discrete formula that we derived earlier. 177 00:10:59,620 --> 00:11:03,940 The discrete formula has exactly the same appearance 178 00:11:03,940 --> 00:11:07,630 except that the scaling factor is not present. 179 00:11:07,630 --> 00:11:10,340 So for the case of continuous random variables, we need to 180 00:11:10,340 --> 00:11:12,880 scale the PDF vertically. 181 00:11:12,880 --> 00:11:16,370 But in the discrete case, such a scaling is not present.
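For reference, the two formulas being compared can be written side by side. The symbols p_X, p_Y for the PMFs and f_X, f_Y for the PDFs are notation assumed here, since the formulas are shown on screen rather than spoken.

```latex
\text{Discrete:}\quad   p_Y(y) = p_X\!\left(\frac{y-b}{a}\right),
\qquad\qquad
\text{Continuous:}\quad f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y-b}{a}\right),
\qquad a \neq 0 .
```

In the example with a = 2 and b = 3, the continuous formula gives f_Y(y) = (1/2) f_X((y - 3)/2), which matches the heights 1/6 on the range from 1 to 3 and 1/3 on the range from 3 to 5.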