In this problem, we're given a collection of 10 random variables, x1 through x10, where each xi is a uniform random variable between 0 and 1, and all 10 variables are independent. We'd like to develop a bound on the probability that the sum of the 10 variables is greater than or equal to 7, using different methods.

In part (a), we'll be using Markov's inequality, written here: if we have a nonnegative random variable x, the probability that x is greater than or equal to a, where a is some positive number, is bounded above by the expected value of x divided by a.

Let's see how that works out in our situation. We'll call x the summation of the xi for i equal to 1 to 10, and therefore E[x] is simply 10 times E[x1], which gives us 5. Here we used the linearity of expectation: the expectation of a sum of random variables is simply the sum of the expectations. Now we can invoke Markov's inequality: the probability that x is greater than or equal to 7 is less than or equal to E[x] divided by 7, and this gives us 5/7.
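As a side note, the Markov bound, together with a quick simulation check, can be sketched in a few lines of Python (the simulation is an illustration added here, not part of the original derivation):

```python
import random

# Markov's inequality: for a nonnegative random variable X and a > 0,
# P(X >= a) <= E[X] / a.
n, a = 10, 7.0
e_x = n * 0.5           # E[X] = 10 * E[X1] = 10 * (1/2) = 5
markov_bound = e_x / a  # 5/7, about 0.714

# Monte Carlo estimate of the actual probability, for comparison:
random.seed(0)
trials = 200_000
hits = sum(sum(random.random() for _ in range(n)) >= a for _ in range(trials))
print(markov_bound, hits / trials)
```

The simulated frequency comes out far below 5/7, which already hints that the Markov bound is quite loose for this problem.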
For part (b), let's see if we can improve the bound we got in part (a) using the Chebyshev inequality, which takes into account the variance of the random variable x. To refresh you, the Chebyshev inequality says that the probability that x deviates from its mean E[x] by at least a is bounded above by the variance of x divided by a squared.

We have to do some work to transform the probability we're interested in, which is that x is greater than or equal to 7, into a form that is convenient for the Chebyshev inequality. To do so, we'll rewrite this probability as the probability that x minus 5 is greater than or equal to 2, simply by moving 5 from the right to the left. The reason we chose 5 is that 5 is the expected value of x, as we found in part (a). In fact, this quantity is also equal to the probability that x minus 5 is less than or equal to negative 2. To see why this is true, recall that x is simply the summation of the 10 xi's, and each xi is a uniform random variable between 0 and 1. Therefore each xi has a distribution that is symmetric around its mean, 1/2.
So we can see that after we add up all the xi's, the resulting distribution of x is also symmetric around its mean, 5. As a result, the probability that x minus 5 is greater than or equal to 2 is equal to the probability that x minus 5 is less than or equal to negative 2. Knowing this, we can say that both are equal to 1/2 times the probability that the absolute value of x minus 5 is greater than or equal to 2, because that probability is simply the sum of the two terms.

At this point, we have transformed the probability that x is greater than or equal to 7 into a form where we can apply the Chebyshev inequality directly. The probability is less than or equal to 1/2 times the variance of x divided by 2 squared, where 2 plays the role of a in the inequality. Now, the variance of x is 10 times the variance of a uniform random variable between 0 and 1, which is 1/12, so the bound is 1/8 times 10/12, which gives us 5/48.

Let's compare this with the number we got earlier using the Markov inequality, which was 5/7.
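The symmetry step and the Chebyshev computation can be written out as a short Python sketch (using exact fractions so nothing is lost to rounding; this is an added check, not part of the original solution):

```python
from fractions import Fraction

# Chebyshev: P(|X - E[X]| >= a) <= Var(X) / a**2.
# By symmetry of X around its mean 5, P(X >= 7) = (1/2) * P(|X - 5| >= 2).
var_x = 10 * Fraction(1, 12)   # Var(X) = 10 * Var(Uniform(0, 1)) = 10/12
a = 2
chebyshev_bound = Fraction(1, 2) * var_x / a**2
print(chebyshev_bound)         # 5/48
```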
We see that 5/48 is much smaller, and this tells us that, at least for this example, using the Chebyshev inequality combined with knowledge of the variance of x, we're able to get a stronger upper bound on the probability of the event we're interested in.

In part (b), we saw that by using the additional information of the variance combined with the Chebyshev inequality, we can improve upon the bound given by Markov's inequality. In part (c), we'll use a somewhat more powerful approach in addition to the Chebyshev inequality, the so-called central limit theorem, and see if we can get an even better bound.

To remind you what the central limit theorem says, suppose we have a summation from i equal to 1 to some number n of independent and identically distributed random variables xi. The central limit theorem says the following: we take the sum, subtract out its mean, which is the expectation of that same summation, and then divide (what we call normalizing) by the standard deviation of the summation, in other words, the square root of the variance of the sum of the xi's.
If we perform this procedure, then as the number of terms in the sum goes to infinity, that is, as n goes to infinity, this standardized random variable converges in distribution to a standard normal random variable, with mean 0 and variance 1. Since we know what the distribution of a standard normal looks like, we can go to a table and look up properties of the resulting distribution. So that is the plan.

Right now we have only 10 variables, which is not that many compared to a huge number, but if we believe the approximation is good, we can get some information out of the central limit theorem. We're interested in the probability that the summation from i equal to 1 to 10 of xi is greater than or equal to 7. We'll rewrite this as 1 minus the probability that the summation is less than or equal to 7. Now we apply the scaling to the summation: this is equal to 1 minus the probability that the sum of the xi's minus 5, divided by the square root of 10/12, is less than or equal to 7 minus 5 divided by the square root of 10/12. We know from the previous parts that 5 is the expected value of the sum, and 10/12 is the variance of the sum of the xi's.

If we compute the quantity on the right, we find it is roughly 2.19, and by the central limit theorem, if we believe 10 is a large enough number, the probability will be roughly equal to 1 minus the CDF of a standard normal evaluated at 2.19. Looking up the number in a table gives us roughly 0.014.

Now let's do a quick summary of what this problem is about. We were asked to compute the probability that x is greater than or equal to 7, where x is the sum of 10 uniform random variables between 0 and 1, the xi's. Because each random variable has expectation 1/2, adding 10 of them up gives us an expectation of 5. So we're essentially asking: what is the chance that x is at least 2 above its expectation?
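The central limit theorem calculation can be reproduced in Python, using the error function in place of a printed normal table (an added check, not part of the original solution):

```python
import math

# CLT approximation for X = sum of 10 Uniform(0, 1) random variables:
# standardize X with mean 5 and variance 10/12, then evaluate the
# standard normal CDF via the error function instead of a table.
mean, var = 5.0, 10 / 12
z = (7 - mean) / math.sqrt(var)  # roughly 2.19

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

approx = 1 - phi(z)
print(round(z, 2), round(approx, 3))  # 2.19 0.014
```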
If we draw the real line with 5 here, x has some distribution centered around 5, since the expected value is 5, and we wonder how likely it is to see something greater than or equal to 7.

Let's see where we land on the probability spectrum from 0 to 1. Without using any information, we know a probability cannot be greater than 1, so a trivial upper bound is 1. In the first part, we used Markov's inequality, which gave us 5/7, roughly 0.7. That's already better than 1: it tells us the probability cannot lie between 5/7 and 1. But can we do better? In part (b), we saw that using the additional information of the variance, we can get this number down to 5/48, which is roughly 0.1. That's already much better than 0.7, and it came from the Chebyshev inequality. Can we do even better? It turns out we can: using the central limit theorem, we can squeeze this number all the way down to about 0.014, almost a 10 times improvement over the previous number.
This last number came from the central limit theorem. As we can see, by using different bounding techniques, we can progressively improve the bound on the probability of x exceeding 7. From this problem we learned that even with only 10 variables, the distribution of x concentrates quite heavily around 5, and hence the probability of x being greater than or equal to 7 can be much smaller than one might expect.
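To close the loop, here is a small Python comparison of the three numbers against a Monte Carlo estimate of the true probability (the simulation is an added illustration, not part of the original problem):

```python
import math
import random

# The three bounds derived above, plus a Monte Carlo estimate of P(X >= 7).
markov = 5 / 7
chebyshev = 5 / 48
clt = 1 - 0.5 * (1 + math.erf((7 - 5) / math.sqrt(10 / 12) / math.sqrt(2)))

random.seed(1)
trials = 500_000
empirical = sum(
    sum(random.random() for _ in range(10)) >= 7 for _ in range(trials)
) / trials

print(f"Markov {markov:.3f}  Chebyshev {chebyshev:.3f}  "
      f"CLT {clt:.4f}  empirical {empirical:.4f}")
```

Each successive number sits closer to the simulated frequency, matching the progression 1, then 5/7, then 5/48, then roughly 0.014 described above.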