1 00:00:00,360 --> 00:00:03,020 Let us now put to use our understanding of the 2 00:00:03,020 --> 00:00:06,000 coin-tossing model and the associated binomial 3 00:00:06,000 --> 00:00:07,550 probabilities. 4 00:00:07,550 --> 00:00:10,210 We will solve the following problem. 5 00:00:10,210 --> 00:00:13,910 We have a coin, which is tossed 10 times. 6 00:00:13,910 --> 00:00:17,710 And we're told that exactly three out of the 10 tosses 7 00:00:17,710 --> 00:00:19,800 resulted in heads. 8 00:00:19,800 --> 00:00:22,530 Given this information, we would like to calculate the 9 00:00:22,530 --> 00:00:27,310 probability that the first two tosses were heads. 10 00:00:27,310 --> 00:00:29,940 This is a question of calculating a conditional 11 00:00:29,940 --> 00:00:33,670 probability of one event given another. 12 00:00:33,670 --> 00:00:37,720 The conditional probability of event A, namely that the first 13 00:00:37,720 --> 00:00:41,480 two tosses were heads, given that another event B has 14 00:00:41,480 --> 00:00:45,250 occurred, namely that we had exactly three heads 15 00:00:45,250 --> 00:00:47,600 out of the 10 tosses. 16 00:00:47,600 --> 00:00:51,560 However, before we can start working towards the solution 17 00:00:51,560 --> 00:00:54,990 to this problem, we need to specify a probability model 18 00:00:54,990 --> 00:00:57,080 that we will be working with. 19 00:00:57,080 --> 00:01:00,070 We need to be explicit about our assumptions. 20 00:01:00,070 --> 00:01:03,290 To this effect, let us introduce the following 21 00:01:03,290 --> 00:01:04,780 assumptions. 22 00:01:04,780 --> 00:01:08,140 We will assume that the different coin tosses are 23 00:01:08,140 --> 00:01:09,390 independent. 24 00:01:09,390 --> 00:01:13,270 In addition, we will assume that each coin toss has a 25 00:01:13,270 --> 00:01:18,300 fixed probability, p, the same for each toss, that the 26 00:01:18,300 --> 00:01:21,560 particular toss results in heads. 
27 00:01:21,560 --> 00:01:24,090 These are the exact same assumptions that we made 28 00:01:24,090 --> 00:01:27,580 earlier when we derived the binomial probabilities. 29 00:01:27,580 --> 00:01:31,450 And in particular, we have the following formula that if we 30 00:01:31,450 --> 00:01:35,650 have n tosses, the probability that we obtain exactly k heads 31 00:01:35,650 --> 00:01:37,880 is given by this expression. 32 00:01:37,880 --> 00:01:42,550 So now, we have a model in place and also the tools that 33 00:01:42,550 --> 00:01:46,400 we can use to analyze this particular model. 34 00:01:46,400 --> 00:01:48,289 Let us start working towards a solution. 35 00:01:48,289 --> 00:01:51,600 Actually, we will develop two different solutions and 36 00:01:51,600 --> 00:01:54,200 compare them at the end. 37 00:01:54,200 --> 00:01:56,060 The first approach, which is the 38 00:01:56,060 --> 00:01:58,710 safest one, is the following. 39 00:01:58,710 --> 00:02:02,125 Since we want to calculate a conditional probability, let 40 00:02:02,125 --> 00:02:05,000 us just start with the definition of conditional 41 00:02:05,000 --> 00:02:06,520 probabilities. 42 00:02:06,520 --> 00:02:10,228 The conditional probability of an event given another event 43 00:02:10,228 --> 00:02:13,970 is the probability that both events happen, divided by the 44 00:02:13,970 --> 00:02:16,540 probability of the conditioning event. 45 00:02:16,540 --> 00:02:21,070 Now, let us specialize to the particular example that we're 46 00:02:21,070 --> 00:02:23,740 trying to solve. 47 00:02:23,740 --> 00:02:27,110 So in the numerator, we're talking about the probability 48 00:02:27,110 --> 00:02:31,329 that event A happens and event B happens. 49 00:02:31,329 --> 00:02:33,060 What does that mean? 
50 00:02:33,060 --> 00:02:35,440 This means that event A happens-- 51 00:02:35,440 --> 00:02:39,970 that is, the first two tosses resulted in heads, which I'm 52 00:02:39,970 --> 00:02:43,440 going to denote symbolically this way. 53 00:02:43,440 --> 00:02:47,610 But in addition to that, event B happens. 54 00:02:47,610 --> 00:02:51,260 And event B requires that there is a total of three 55 00:02:51,260 --> 00:02:57,720 heads, which means that we had one more head in 56 00:02:57,720 --> 00:02:59,250 the remaining tosses. 57 00:02:59,250 --> 00:03:07,510 So we have one head in tosses 3 all the way to 10. 58 00:03:15,810 --> 00:03:19,530 As for the denominator, let's keep it the way it is for now. 59 00:03:22,410 --> 00:03:26,270 So let's continue with the numerator. 60 00:03:26,270 --> 00:03:29,380 We're talking about the probability of two events 61 00:03:29,380 --> 00:03:33,970 happening, that the first two tosses were heads and that in 62 00:03:33,970 --> 00:03:38,700 tosses 3 up to 10, we had exactly one head. 63 00:03:38,700 --> 00:03:42,550 Here comes the independence assumption. 64 00:03:42,550 --> 00:03:45,750 Because the different tosses are independent, whatever 65 00:03:45,750 --> 00:03:49,650 happens in the first two tosses is independent from 66 00:03:49,650 --> 00:03:53,480 whatever happened in tosses 3 up to 10. 67 00:03:53,480 --> 00:03:57,270 So the probability of these two events happening is the 68 00:03:57,270 --> 00:04:00,420 product of their individual probabilities. 69 00:04:00,420 --> 00:04:04,880 So we first have the probability that the first two 70 00:04:04,880 --> 00:04:10,170 tosses were heads, which is p squared. 71 00:04:10,170 --> 00:04:13,610 And we need to multiply it with the probability that 72 00:04:13,610 --> 00:04:17,230 there was exactly one head in the tosses numbered 73 00:04:17,230 --> 00:04:19,019 from 3 up to 10. 74 00:04:19,019 --> 00:04:21,450 These are eight tosses. 
75 00:04:21,450 --> 00:04:25,980 The probability of one head in eight tosses is given by the 76 00:04:25,980 --> 00:04:31,820 binomial formula, with k equal to 1 and n equal to 8. 77 00:04:31,820 --> 00:04:38,190 So this expression, this part, becomes 8 choose 1, p to the 78 00:04:38,190 --> 00:04:44,790 first power times 1 minus p to the seventh power. 79 00:04:44,790 --> 00:04:47,560 So this is the numerator. 80 00:04:47,560 --> 00:04:51,420 The denominator is easier to find. 81 00:04:51,420 --> 00:04:53,230 This is the probability that we had 82 00:04:53,230 --> 00:04:55,909 three heads in 10 tosses. 83 00:04:55,909 --> 00:04:57,690 So we just use this formula. 84 00:04:57,690 --> 00:05:04,220 The probability of three heads is given by: 10 tosses choose 85 00:05:04,220 --> 00:05:10,685 three, p to the third, times 1 minus p to the seventh power. 86 00:05:13,740 --> 00:05:16,690 And here we notice that terms in the numerator and 87 00:05:16,690 --> 00:05:23,350 denominator cancel out, and we obtain 8 choose 1 divided by 88 00:05:23,350 --> 00:05:25,690 10 choose 3. 89 00:05:25,690 --> 00:05:29,480 And to simplify things just a little more, 90 00:05:29,480 --> 00:05:31,090 what is 8 choose 1? 91 00:05:31,090 --> 00:05:34,150 This is the number of ways that we can choose one item 92 00:05:34,150 --> 00:05:36,090 out of eight items. 93 00:05:36,090 --> 00:05:37,680 And this is just 8. 94 00:05:42,220 --> 00:05:45,930 And let's leave the denominator the way it is. 95 00:05:45,930 --> 00:05:51,150 So this is the answer to the question that we had. 96 00:05:51,150 --> 00:05:55,840 And now let us work towards developing a second approach 97 00:05:55,840 --> 00:05:58,140 towards this particular answer. 98 00:06:00,710 --> 00:06:05,790 In our second approach, we start first by looking at the 99 00:06:05,790 --> 00:06:08,750 sample space and understanding what 100 00:06:08,750 --> 00:06:12,110 conditioning is all about. 
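For reference, the first approach completed above can be written out as a single chain of equalities. This is a sketch of what the on-screen calculation presumably looked like; the cancellation assumes 0 < p < 1, and the final simplification to 1/15 is added here rather than stated in the lecture.

```latex
\[
P(A \mid B)
  = \frac{P(A \cap B)}{P(B)}
  = \frac{p^{2} \cdot \binom{8}{1}\, p\,(1-p)^{7}}{\binom{10}{3}\, p^{3}\,(1-p)^{7}}
  = \frac{\binom{8}{1}}{\binom{10}{3}}
  = \frac{8}{120}
  = \frac{1}{15}.
\]
```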
101 00:06:12,110 --> 00:06:16,760 In our model, we have a sample space. 102 00:06:16,760 --> 00:06:19,660 As usual we can denote it by omega. 103 00:06:19,660 --> 00:06:25,380 And the sample space contains a bunch of possible outcomes. 104 00:06:25,380 --> 00:06:33,090 A typical outcome is going to be a sequence of length 10. 105 00:06:33,090 --> 00:06:36,409 It's a sequence of heads or tails. 106 00:06:36,409 --> 00:06:39,070 And it's a sequence that has length 10. 107 00:06:42,680 --> 00:06:45,870 We want to calculate conditional probabilities. 108 00:06:45,870 --> 00:06:50,290 And this places us in a conditional universe. 109 00:06:50,290 --> 00:06:54,500 We have the conditioning event B, which is some set. 110 00:06:58,280 --> 00:07:02,010 And conditional probabilities are probabilities defined 111 00:07:02,010 --> 00:07:05,810 inside this set B and define the probabilities, the 112 00:07:05,810 --> 00:07:09,140 conditional probabilities of the different outcomes. 113 00:07:09,140 --> 00:07:11,460 What are the elements of the set B? 114 00:07:11,460 --> 00:07:17,140 A typical element of the set B is a sequence, which is, again 115 00:07:17,140 --> 00:07:23,630 of length 10, but has exactly three heads. 116 00:07:23,630 --> 00:07:26,593 So these are the three-head sequences. 117 00:07:32,580 --> 00:07:37,750 Now, since we're conditioning on event B, we can just work 118 00:07:37,750 --> 00:07:39,790 with conditional probabilities. 119 00:07:39,790 --> 00:07:44,850 So let us find the conditional probability law. 120 00:07:44,850 --> 00:07:50,880 Recall that any three-head sequence has the same 121 00:07:50,880 --> 00:07:55,520 probability of occurring in the original unconditional 122 00:07:55,520 --> 00:07:59,870 probability model, namely as we discussed earlier, any 123 00:07:59,870 --> 00:08:04,560 particular three-head sequence has a probability equal to 124 00:08:04,560 --> 00:08:06,530 this expression. 
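The expression referred to here is not captured in the transcript. Under the independence and fixed-p assumptions stated at the start, it is presumably the following, where the symbol s (an assumption of this note) stands for any one particular sequence of 10 tosses with exactly three heads:

```latex
\[
P(\{s\}) = p^{3}\,(1-p)^{7}
\qquad \text{for every sequence } s \in B .
\]
```

The value does not depend on where the three heads occur, only on how many there are.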
125 00:08:06,530 --> 00:08:09,150 So three-head sequences are all equally likely. 126 00:08:09,150 --> 00:08:11,740 This means that the unconditional probabilities of 127 00:08:11,740 --> 00:08:14,790 all the elements of B are the same. 128 00:08:14,790 --> 00:08:18,300 When we construct conditional probabilities given an event 129 00:08:18,300 --> 00:08:23,510 B, what happens is that the ratio or the relative 130 00:08:23,510 --> 00:08:28,520 proportions of the probabilities remain the same. 131 00:08:28,520 --> 00:08:31,640 So conditional probabilities are proportional to 132 00:08:31,640 --> 00:08:34,070 unconditional probabilities. 133 00:08:34,070 --> 00:08:36,730 These elements of B were equally likely in 134 00:08:36,730 --> 00:08:38,159 the original model. 135 00:08:38,159 --> 00:08:41,960 Therefore, they remain equally likely in the conditional 136 00:08:41,960 --> 00:08:43,870 model as well. 137 00:08:43,870 --> 00:08:48,260 What this means is that the conditional probability law on 138 00:08:48,260 --> 00:08:50,910 the set B is uniform. 139 00:08:50,910 --> 00:08:55,140 Given that B occurred, all the possible outcomes now have the 140 00:08:55,140 --> 00:08:56,910 same probability. 141 00:08:56,910 --> 00:08:59,310 Since we have a uniform probability law, this means 142 00:08:59,310 --> 00:09:01,560 that we can now answer probability 143 00:09:01,560 --> 00:09:03,940 questions by just counting. 144 00:09:03,940 --> 00:09:06,850 We're interested in the probability of a certain 145 00:09:06,850 --> 00:09:11,640 event, A, given that B occurred. 146 00:09:11,640 --> 00:09:15,660 Now, given that B occurred, this part of A cannot happen. 147 00:09:15,660 --> 00:09:18,900 So we're interested in the probability of outcomes that 148 00:09:18,900 --> 00:09:23,240 belong in this shaded region, those outcomes that belong 149 00:09:23,240 --> 00:09:28,830 within the set B. 
To find the probability of this shaded 150 00:09:28,830 --> 00:09:33,080 region occurring, we just need to count how many outcomes 151 00:09:33,080 --> 00:09:37,220 belong to the shaded region and divide that count by the number 152 00:09:37,220 --> 00:09:41,290 of outcomes that belong to the set B. 153 00:09:41,290 --> 00:09:44,880 That is, we work inside this conditional universe. 154 00:09:44,880 --> 00:09:47,230 All of the elements in this conditional universe are 155 00:09:47,230 --> 00:09:48,700 equally likely. 156 00:09:48,700 --> 00:09:51,860 And therefore, we calculate probabilities by counting. 157 00:09:51,860 --> 00:09:55,410 So the desired probability is going to be the number of 158 00:09:55,410 --> 00:10:00,520 elements in the shaded region, which is the intersection of A 159 00:10:00,520 --> 00:10:08,280 with B, divided by the number of elements that belong to the 160 00:10:08,280 --> 00:10:13,000 set B. 161 00:10:13,000 --> 00:10:18,270 How many elements are there in the intersection of A and B? 162 00:10:18,270 --> 00:10:22,590 These are the outcomes or sequences of length 10, in 163 00:10:22,590 --> 00:10:24,980 which the first two tosses were heads-- 164 00:10:24,980 --> 00:10:26,940 no choice here. 165 00:10:26,940 --> 00:10:29,630 And there is one more head. 166 00:10:29,630 --> 00:10:33,350 That additional head can appear in one out of eight 167 00:10:33,350 --> 00:10:34,830 possible places. 168 00:10:34,830 --> 00:10:38,240 So there are eight possible sequences that have the 169 00:10:38,240 --> 00:10:40,710 desired property. 170 00:10:40,710 --> 00:10:44,010 How many elements are there in the set B? 171 00:10:44,010 --> 00:10:49,270 How many three-head sequences are there? 172 00:10:49,270 --> 00:10:54,800 Well, the number of three-head sequences is the same as the 173 00:10:54,800 --> 00:10:58,920 number of ways that we can choose three elements out of a 174 00:10:58,920 --> 00:11:01,450 set of cardinality 10.
175 00:11:01,450 --> 00:11:07,580 And this is 10 choose 3, as we also discussed earlier. 176 00:11:07,580 --> 00:11:11,750 So this is the same answer as we derived before with our 177 00:11:11,750 --> 00:11:13,100 first approach. 178 00:11:13,100 --> 00:11:17,650 So both approaches, of course, give the same solution. 179 00:11:17,650 --> 00:11:21,310 This second approach is a little easier, because we 180 00:11:21,310 --> 00:11:24,730 never had to involve any p's in our calculation. 181 00:11:24,730 --> 00:11:27,210 We go to the answer directly. 182 00:11:27,210 --> 00:11:31,110 The reason that this approach worked was that the 183 00:11:31,110 --> 00:11:35,970 conditional universe, the event B, had a uniform 184 00:11:35,970 --> 00:11:37,300 probability law on it.
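Both approaches can be checked numerically. The sketch below is not part of the lecture; the function names and the choice p = 1/3 are assumptions made for illustration. The first function counts sequences inside the conditional universe B, and the second applies the definition of conditional probability with the binomial formula.

```python
from fractions import Fraction
from itertools import product
from math import comb


def conditional_by_counting(n=10, heads=3, prefix=2):
    """Second approach: count outcomes in B and in the intersection of A and B."""
    in_b = 0
    in_a_and_b = 0
    for seq in product("HT", repeat=n):
        if seq.count("H") != heads:
            continue  # outcome is outside the conditional universe B
        in_b += 1
        if all(toss == "H" for toss in seq[:prefix]):
            in_a_and_b += 1  # the first `prefix` tosses are all heads
    return Fraction(in_a_and_b, in_b)


def conditional_by_definition(p, n=10, heads=3, prefix=2):
    """First approach: P(A and B) / P(B), with 0 < p < 1 given as a Fraction."""
    rest = n - prefix        # tosses 3 through 10
    extra = heads - prefix   # heads still needed among those tosses
    numerator = (p ** prefix) * comb(rest, extra) * (p ** extra) * (1 - p) ** (rest - extra)
    denominator = comb(n, heads) * (p ** heads) * (1 - p) ** (n - heads)
    return numerator / denominator


counting = conditional_by_counting()
by_definition = conditional_by_definition(Fraction(1, 3))
print(counting, by_definition)
```

Using exact fractions instead of floating-point numbers makes the cancellation of the p terms visible: any p strictly between 0 and 1 should give the same result as the counting argument.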