1 00:00:00,000 --> 00:00:00,840 2 00:00:00,840 --> 00:00:01,980 Hi. 3 00:00:01,980 --> 00:00:03,910 In this video, we're going to do some approximate 4 00:00:03,910 --> 00:00:07,850 calculations using the central limit theorem. 5 00:00:07,850 --> 00:00:10,490 We're given that Xn is the number of gadgets produced on 6 00:00:10,490 --> 00:00:12,140 day n by a factory. 7 00:00:12,140 --> 00:00:15,140 And it has a normal distribution with mean 5 and 8 00:00:15,140 --> 00:00:16,460 variance 9. 9 00:00:16,460 --> 00:00:20,340 And they're all independent and identically distributed. 10 00:00:20,340 --> 00:00:22,020 We're looking for the probability that the total 11 00:00:22,020 --> 00:00:26,180 number of gadgets in 100 days is less than 440. 12 00:00:26,180 --> 00:00:33,660 To start, we can first write this as the probability of the 13 00:00:33,660 --> 00:00:37,820 sum of the gadgets produced on each of 100 days 14 00:00:37,820 --> 00:00:42,320 being less than 440. 15 00:00:42,320 --> 00:00:45,890 Notice that this is a sum of a large number of independent 16 00:00:45,890 --> 00:00:46,960 random variables. 17 00:00:46,960 --> 00:00:50,370 So we can use the central limit theorem and approximate 18 00:00:50,370 --> 00:00:53,435 the sum as a normal random variable. 19 00:00:53,435 --> 00:00:56,780 And then, basically, in order to compute this probability, 20 00:00:56,780 --> 00:00:59,790 we'd basically need to standardize this and then use 21 00:00:59,790 --> 00:01:01,440 the standard normal table. 22 00:01:01,440 --> 00:01:03,490 So let's first compute the expectation and 23 00:01:03,490 --> 00:01:06,920 variance of the sum. 24 00:01:06,920 --> 00:01:13,400 So I'm going to actually sum up from 1 to n instead of 100, 25 00:01:13,400 --> 00:01:15,790 to do it more generally. 26 00:01:15,790 --> 00:01:21,700 So the linearity is preserved for the expectation operator. 27 00:01:21,700 --> 00:01:24,660 So this is the sum of the expected value. 
28 00:01:24,660 --> 00:01:28,120 And since they're all identically distributed, they 29 00:01:28,120 --> 00:01:31,210 all have the same expectation, and there are n of them. 30 00:01:31,210 --> 00:01:35,240 And so we have this being n times 5. 31 00:01:35,240 --> 00:01:45,240 The variance of the sum is also the sum of the variances 32 00:01:45,240 --> 00:01:48,710 because of independence. 33 00:01:48,710 --> 00:01:50,838 And since they're identically distributed, we 34 00:01:50,838 --> 00:01:58,860 have n times the variance of Xi, and this is n times 9. 35 00:01:58,860 --> 00:02:03,860 So now, we can standardize it, or make it 0 mean 36 00:02:03,860 --> 00:02:04,980 and variance 1. 37 00:02:04,980 --> 00:02:09,750 So to do that we would take the sum of these Xi's 38 00:02:09,750 --> 00:02:11,220 and subtract their mean. 39 00:02:11,220 --> 00:02:17,670 So it's going to be 5 times 100 of them, so it's 500, over 40 00:02:17,670 --> 00:02:19,620 the square root of the variance, which is going to be 41 00:02:19,620 --> 00:02:24,430 9 times 100 of them, so it's going to be 900. 42 00:02:24,430 --> 00:02:31,400 So that's going to be less than 440 minus 500 over square 43 00:02:31,400 --> 00:02:34,980 root of 900. 44 00:02:34,980 --> 00:02:39,790 So notice what we're trying to do here is-- 45 00:02:39,790 --> 00:02:45,840 notice that the sum of Xi's is a discrete quantity. 46 00:02:45,840 --> 00:02:47,780 So it's a discrete random variable, so it may 47 00:02:47,780 --> 00:02:49,190 have a PMF like this. 48 00:02:49,190 --> 00:02:53,110 49 00:02:53,110 --> 00:02:54,140 And we're trying to approximate it 50 00:02:54,140 --> 00:02:56,600 with a normal density. 51 00:02:56,600 --> 00:03:01,160 So this is not drawn to scale, but let's say that this is 440 52 00:03:01,160 --> 00:03:04,700 and this is 439.
53 00:03:04,700 --> 00:03:07,170 Basically, we're trying to say what's the probability of this 54 00:03:07,170 --> 00:03:10,990 being less than 440, so it's the probability that it's 439, 55 00:03:10,990 --> 00:03:15,410 or 438, or 437, and so on. 56 00:03:15,410 --> 00:03:19,310 But in the continuous case, a good approximation to this 57 00:03:19,310 --> 00:03:24,540 would be to take the middle, say, 439.5, and compute the 58 00:03:24,540 --> 00:03:27,690 area below that. 59 00:03:27,690 --> 00:03:32,560 So in this case, when we do the normal approximation, it 60 00:03:32,560 --> 00:03:37,130 works out better if we use this half correction. 61 00:03:37,130 --> 00:03:42,320 And so, for this probability, let's call Z the 62 00:03:42,320 --> 00:03:44,220 standard normal. 63 00:03:44,220 --> 00:03:47,390 And so this is approximately equal to the 64 00:03:47,390 --> 00:03:52,800 probability of the standard normal being less 65 00:03:52,800 --> 00:03:53,950 than whatever that is. 66 00:03:53,950 --> 00:03:55,850 And if you plug that into your calculator, you 67 00:03:55,850 --> 00:03:59,790 get negative 2.02. 68 00:03:59,790 --> 00:04:03,670 So now, if we try to figure out what this is 69 00:04:03,670 --> 00:04:06,230 from the table, we'll find that negative 70 00:04:06,230 --> 00:04:07,730 values are not tabulated. 71 00:04:07,730 --> 00:04:11,970 But we know that the standard normal is 72 00:04:11,970 --> 00:04:15,520 symmetric, and so if we want to compute the area in this 73 00:04:15,520 --> 00:04:18,010 region, it's the same as the area in this 74 00:04:18,010 --> 00:04:20,019 region, above 2.02. 75 00:04:20,019 --> 00:04:22,970 So this is the same as the probability that Z 76 00:04:22,970 --> 00:04:28,400 is bigger than 2.02.
77 00:04:28,400 --> 00:04:32,480 That's just 1 minus the probability that Z is less 78 00:04:32,480 --> 00:04:36,370 than or equal to 2.02, and so that's, by 79 00:04:36,370 --> 00:04:40,910 definition, 1 minus phi of 2.02. 80 00:04:40,910 --> 00:04:45,950 And if we look it up on the table, 2.02 has probability 81 00:04:45,950 --> 00:04:49,490 here of 0.9783. 82 00:04:49,490 --> 00:04:51,590 And we can just write that in: 1 minus 0.9783, about 0.0217. 83 00:04:51,590 --> 00:04:59,590 That's the answer for Part A. 84 00:04:59,590 --> 00:05:01,800 So now for Part B. 85 00:05:01,800 --> 00:05:07,500 We're asked what's the largest n, approximately, so that it 86 00:05:07,500 --> 00:05:09,600 satisfies this. 87 00:05:09,600 --> 00:05:13,590 So again, we can use the central limit theorem. 88 00:05:13,590 --> 00:05:20,360 Use similar steps here so that we have, in this case, the sum 89 00:05:20,360 --> 00:05:25,290 of the Xi's greater than or equal to 200 plus 5n. 90 00:05:25,290 --> 00:05:26,620 And we standardize. 91 00:05:26,620 --> 00:05:35,320 So we have n and the mean here-- this is where this 92 00:05:35,320 --> 00:05:36,030 comes in handy. 93 00:05:36,030 --> 00:05:40,450 It's going to be 5n and the variance is 9n. 94 00:05:40,450 --> 00:05:42,060 It's greater than or equal to. 95 00:05:42,060 --> 00:05:44,010 The 5n's cancel when you subtract. 96 00:05:44,010 --> 00:05:49,460 And then you get 200 over the square root of 9n. 97 00:05:49,460 --> 00:05:54,600 And we can, again, use the half approximation here, half 98 00:05:54,600 --> 00:05:55,520 correction here. 99 00:05:55,520 --> 00:06:00,820 But I'm not going to do it, to keep the problem simple. 100 00:06:00,820 --> 00:06:05,340 And so in this case, this is approximately equal to the 101 00:06:05,340 --> 00:06:07,890 probability of the standard 102 00:06:07,890 --> 00:06:11,870 normal being greater than or equal to 200 over square 103 00:06:11,870 --> 00:06:13,520 root of 9n.
104 00:06:13,520 --> 00:06:19,160 And so same sort of thing here. 105 00:06:19,160 --> 00:06:23,540 This is just 1 minus this. 106 00:06:23,540 --> 00:06:26,740 The equal sign doesn't matter because Z is a continuous 107 00:06:26,740 --> 00:06:28,140 random variable. 108 00:06:28,140 --> 00:06:34,270 And so we have this here. 109 00:06:34,270 --> 00:06:40,310 And we want this to be less than or equal to 0.05. 110 00:06:40,310 --> 00:06:47,170 So that means that phi of 200 over square root of 9n has to 111 00:06:47,170 --> 00:06:49,970 be greater than or equal to 0.95. 112 00:06:49,970 --> 00:07:00,020 So we're basically looking for something here that ensures 113 00:07:00,020 --> 00:07:04,360 that this region's at least 0.95. 114 00:07:04,360 --> 00:07:09,810 So if you look at the table, 0.95 lies somewhere in between 115 00:07:09,810 --> 00:07:13,140 1.64 and 1.65. 116 00:07:13,140 --> 00:07:17,780 And I'm going to use 1.65 to be conservative, because we 117 00:07:17,780 --> 00:07:20,640 want this region to be at least 0.95. 118 00:07:20,640 --> 00:07:24,030 So 1.65 works better here. 119 00:07:24,030 --> 00:07:30,320 And so we want this thing, this here, which is going to 120 00:07:30,320 --> 00:07:35,560 be 200 over the 121 00:07:35,560 --> 00:07:43,030 square root of 9n, to be bigger than or equal to 1.65. 122 00:07:43,030 --> 00:07:51,240 So n here is going to be less than or equal to 200 over 1.65 123 00:07:51,240 --> 00:07:55,140 squared, times 1 over 9. 124 00:07:55,140 --> 00:07:57,650 If you plug this into your calculator, you might have a 125 00:07:57,650 --> 00:07:58,660 decimal in there. 126 00:07:58,660 --> 00:08:02,210 Then we just pick n, the largest integer 127 00:08:02,210 --> 00:08:05,220 that satisfies this. 128 00:08:05,220 --> 00:08:09,200 So if you plug that into your calculator, you'll find that 129 00:08:09,200 --> 00:08:12,920 it's going to be 1,632. 130 00:08:12,920 --> 00:08:13,290 That's 131 00:08:13,290 --> 00:08:16,690 part B.
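The Part A and Part B numbers can be checked with a few lines of Python. This is only a rough sketch, not part of the recitation: the function names `part_a` and `part_b` are made up for illustration, and `statistics.NormalDist` from the standard library stands in for the standard normal table.

```python
import math
from statistics import NormalDist

MEAN, VAR = 5.0, 9.0   # per-day mean and variance given in the problem
Z = NormalDist()       # standard normal, used in place of the printed table


def part_a(days=100, threshold=440, half_correction=True):
    """CLT approximation of P(X_1 + ... + X_days < threshold)."""
    # With the half correction, "less than 440" becomes "at most 439.5".
    cutoff = threshold - 0.5 if half_correction else threshold
    z = (cutoff - days * MEAN) / math.sqrt(days * VAR)
    return z, Z.cdf(z)


def part_b(excess=200, z_table=1.65):
    """Largest integer n with excess / sqrt(VAR * n) >= z_table.

    z_table = 1.65 is the conservative table value chosen in the video.
    """
    return math.floor((excess / z_table) ** 2 / VAR)


z_a, p_a = part_a()
n_b = part_b()
print(round(z_a, 2), round(p_a, 4), n_b)
```

The probability printed for Part A differs slightly from the table answer in the last digits, because the table lookup rounds z to negative 2.02 before reading off phi, while the code uses the unrounded value.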
Last part. 132 00:08:16,690 --> 00:08:19,910 Let n be the first day when the total number of gadgets is 133 00:08:19,910 --> 00:08:21,720 greater than 1,000. 134 00:08:21,720 --> 00:08:23,930 What's the probability that n is greater 135 00:08:23,930 --> 00:08:25,350 than or equal to 220? 136 00:08:25,350 --> 00:08:29,090 Again, we want to use the central limit theorem, but the 137 00:08:29,090 --> 00:08:35,650 trick here is to recognize that this is actually equal to 138 00:08:35,650 --> 00:08:42,240 the probability that the sum from i equals 1 to 219 of Xi, 139 00:08:42,240 --> 00:08:45,370 is less than or equal to 1,000. 140 00:08:45,370 --> 00:08:48,200 So let's look at both directions to check this. 141 00:08:48,200 --> 00:08:51,390 If n is greater than or equal to 220, then 142 00:08:51,390 --> 00:08:52,500 this has to be true. 143 00:08:52,500 --> 00:08:56,320 Because if it weren't true, and if this were greater than 144 00:08:56,320 --> 00:09:00,790 1,000, then n would have been less than or equal to 219. 145 00:09:00,790 --> 00:09:04,190 So this direction works. 146 00:09:04,190 --> 00:09:05,270 The other direction. 147 00:09:05,270 --> 00:09:08,990 If this were the case, it has to be the case that n is 148 00:09:08,990 --> 00:09:14,060 greater than or equal to 220, because up till 219 it hasn't 149 00:09:14,060 --> 00:09:15,410 exceeded 1,000. 150 00:09:15,410 --> 00:09:20,050 And so, at some point beyond that, it's going to exceed 151 00:09:20,050 --> 00:09:24,090 1,000 and n is going to be greater than or equal to 220. 152 00:09:24,090 --> 00:09:25,710 So this is the key trick here. 153 00:09:25,710 --> 00:09:34,120 And once you see this, you realize that this is very easy 154 00:09:34,120 --> 00:09:38,070 because we do the same steps as we did before. 155 00:09:38,070 --> 00:09:44,160 So you're looking for this, this is equal to, again, you 156 00:09:44,160 --> 00:09:46,923 do your standardization. 
157 00:09:46,923 --> 00:09:49,850 158 00:09:49,850 --> 00:09:56,780 So this is from 1 to 219, and you get 5 times 219 for the mean, 159 00:09:56,780 --> 00:10:01,180 and 9 times 219 for the variance, less than or equal 160 00:10:01,180 --> 00:10:07,300 to 1,000 minus 5 times 219 over square 161 00:10:07,300 --> 00:10:09,710 root of 9 times 219. 162 00:10:09,710 --> 00:10:14,130 Again, you can do the half correction here, make it 163 00:10:14,130 --> 00:10:19,410 1,000.5, but I'm not going to do that in this case, for 164 00:10:19,410 --> 00:10:20,760 simplicity. 165 00:10:20,760 --> 00:10:23,780 So this is approximately equal to the probability of Z being less than 166 00:10:23,780 --> 00:10:27,170 whatever this is. 167 00:10:27,170 --> 00:10:29,080 And if you plug it in, you'll find that 168 00:10:29,080 --> 00:10:31,950 this is negative 2.14. 169 00:10:31,950 --> 00:10:39,670 So in this case, we have this is the probability that Z-- 170 00:10:39,670 --> 00:10:41,410 again, we do the same thing-- 171 00:10:41,410 --> 00:10:44,550 is greater than or equal to 2.14. 172 00:10:44,550 --> 00:10:49,590 And this is 1 minus the probability that Z is less 173 00:10:49,590 --> 00:10:53,750 than or equal to 2.14. 174 00:10:53,750 --> 00:11:00,640 And that's just phi of 2.14-- 175 00:11:00,640 --> 00:11:03,570 rather, 1 minus phi of 2.14. 176 00:11:03,570 --> 00:11:04,830 And that's-- 177 00:11:04,830 --> 00:11:07,850 if you look up 2.14 on the table, 178 00:11:07,850 --> 00:11:12,700 it's 0.9838. 179 00:11:12,700 --> 00:11:14,160 So the answer is 1 minus 0.9838, about 0.0162. 180 00:11:14,160 --> 00:11:16,440 So we're done with Part C as well. 181 00:11:16,440 --> 00:11:20,330 So in this exercise, we did a lot of approximate 182 00:11:20,330 --> 00:11:23,070 calculations using the central--