1 00:00:02,330 --> 00:00:06,150 Conditional probabilities are probabilities associated with 2 00:00:06,150 --> 00:00:09,860 a revised model that takes into account some additional 3 00:00:09,860 --> 00:00:13,850 information about the outcome of a probabilistic experiment. 4 00:00:13,850 --> 00:00:16,730 The question is how to carry out this 5 00:00:16,730 --> 00:00:18,930 revision of our model. 6 00:00:18,930 --> 00:00:21,390 We will give a mathematical definition of conditional 7 00:00:21,390 --> 00:00:25,730 probabilities, but first let us motivate this definition by 8 00:00:25,730 --> 00:00:29,480 examining a simple concrete example. 9 00:00:29,480 --> 00:00:33,660 Consider a probability model with 12 equally likely 10 00:00:33,660 --> 00:00:38,180 possible outcomes, and so each one of them has probability 11 00:00:38,180 --> 00:00:39,430 equal to 1/12. 12 00:00:41,900 --> 00:00:45,940 We will focus on two particular events, event A and 13 00:00:45,940 --> 00:00:48,750 B, two subsets of the sample space. 14 00:00:48,750 --> 00:00:54,800 Event A has five elements, so its probability is 5/12, and 15 00:00:54,800 --> 00:01:01,190 event B has six elements, so it has probability 6/12. 16 00:01:01,190 --> 00:01:04,800 Suppose now that someone tells you that event B has occurred, 17 00:01:04,800 --> 00:01:07,980 but tells you nothing more about the outcome. 18 00:01:07,980 --> 00:01:10,770 How should the model change? 19 00:01:10,770 --> 00:01:16,450 First, those outcomes that are outside event B 20 00:01:16,450 --> 00:01:19,390 are no longer possible. 21 00:01:19,390 --> 00:01:23,190 So we can either eliminate them, as was done in this 22 00:01:23,190 --> 00:01:28,220 picture, or we might keep them in the picture but assign them 23 00:01:28,220 --> 00:01:33,130 0 probability, so that they cannot occur. 24 00:01:33,130 --> 00:01:36,570 How about the outcomes inside the event B? 25 00:01:36,570 --> 00:01:41,460 So we're told that one of these has occurred. 
26 00:01:41,460 --> 00:01:45,979 Now these 6 outcomes inside the event B were equally 27 00:01:45,979 --> 00:01:50,280 likely in the original model, and there is no reason to 28 00:01:50,280 --> 00:01:52,500 change their relative probabilities. 29 00:01:52,500 --> 00:01:56,479 So they should remain equally likely in the revised model as 30 00:01:56,479 --> 00:02:00,530 well, so each one of them should now have probability 31 00:02:00,530 --> 00:02:03,710 1/6 since there are 6 of them. 32 00:02:03,710 --> 00:02:05,680 And this is our revised model, the 33 00:02:05,680 --> 00:02:07,760 conditional probability law. 34 00:02:07,760 --> 00:02:12,740 0 probability to outcomes outside B, and probability 1/6 35 00:02:12,740 --> 00:02:18,120 to each one of the outcomes that is inside the event B. 36 00:02:18,120 --> 00:02:21,790 Let us now write this down mathematically. 37 00:02:21,790 --> 00:02:26,710 We will use this notation to describe the conditional 38 00:02:26,710 --> 00:02:32,280 probability of an event A given that some other event B 39 00:02:32,280 --> 00:02:34,500 is known to have occurred. 40 00:02:34,500 --> 00:02:40,540 We read this expression as the probability of A given B. So 41 00:02:40,540 --> 00:02:44,760 what are these conditional probabilities in our example? 42 00:02:44,760 --> 00:02:48,829 So in the new model, where these outcomes are equally 43 00:02:48,829 --> 00:02:52,610 likely, we know that event A can occur in 44 00:02:52,610 --> 00:02:54,240 two different ways. 45 00:02:54,240 --> 00:02:56,829 Each one of them has probability 1/6. 46 00:02:56,829 --> 00:03:02,320 So the probability of event A is 2/6, which 47 00:03:02,320 --> 00:03:06,060 is the same as 1/3. 48 00:03:06,060 --> 00:03:11,350 How about event B? Well, B consists of 6 possible 49 00:03:11,350 --> 00:03:13,850 outcomes, each with probability 1/6. 50 00:03:13,850 --> 00:03:18,670 So event B in this revised model should have probability 51 00:03:18,670 --> 00:03:20,030 equal to 1.
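The revised model just described can be sketched in a few lines of Python. This is only an illustration: the outcome labels 1 through 12 and the particular choice of which outcomes belong to A and to B are assumptions, picked so that A has 5 outcomes, B has 6, and the two share 2, as in the example. The conditional probability of A given B is commonly written P(A | B).

```python
from fractions import Fraction

# Assumed labels: the lecture only gives the counts, not which outcomes are which.
OUTCOMES = range(1, 13)
A = {1, 2, 3, 4, 5}       # 5 outcomes, so P(A) = 5/12
B = {4, 5, 6, 7, 8, 9}    # 6 outcomes, so P(B) = 6/12; A and B share 2 outcomes

# Original model: 12 equally likely outcomes.
original = {w: Fraction(1, 12) for w in OUTCOMES}

# Revised model once B is known to have occurred:
# probability 0 outside B, and equal probability 1/6 for each outcome inside B.
revised = {w: Fraction(1, len(B)) if w in B else Fraction(0) for w in OUTCOMES}


def prob(event, law):
    """Add up the probabilities that `law` assigns to the outcomes in `event`."""
    return sum((law[w] for w in event), Fraction(0))


p_a_given_b = prob(A, revised)   # two outcomes of A survive, each worth 1/6
p_b_given_b = prob(B, revised)   # all six outcomes of B, each worth 1/6
print(p_a_given_b, p_b_given_b)  # prints: 1/3 1
```

Exact fractions are used instead of floating point so that the results match the hand calculation, 2/6 = 1/3 and 6/6 = 1, without rounding.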
52 00:03:20,030 --> 00:03:22,770 Of course, this is just saying the obvious. 53 00:03:22,770 --> 00:03:26,150 Given that we already know that B has occurred, the 54 00:03:26,150 --> 00:03:29,310 probability that B occurs in this new model 55 00:03:29,310 --> 00:03:31,920 should be equal to 1. 56 00:03:31,920 --> 00:03:35,010 How about now, if the sample space does not consist of 57 00:03:35,010 --> 00:03:38,340 equally likely outcomes, but instead we're given the 58 00:03:38,340 --> 00:03:42,120 probabilities of different pieces of the sample space, as 59 00:03:42,120 --> 00:03:44,210 in this example. 60 00:03:44,210 --> 00:03:47,800 Notice here that the probabilities are consistent 61 00:03:47,800 --> 00:03:51,350 with what was used in the original example. 62 00:03:51,350 --> 00:03:55,329 So this part of A that lies outside B has probability 63 00:03:55,329 --> 00:04:00,190 3/12, but in this case I'm not telling you how that 64 00:04:00,190 --> 00:04:01,780 probability is made up. 65 00:04:01,780 --> 00:04:03,780 I'm not telling you that it consists of 3 66 00:04:03,780 --> 00:04:04,990 equally likely outcomes. 67 00:04:04,990 --> 00:04:07,390 So all I'm telling you is that the collective probability in 68 00:04:07,390 --> 00:04:10,070 this region is 3/12. 69 00:04:10,070 --> 00:04:14,820 The total probability of A is, again, 5/12 as before. 70 00:04:14,820 --> 00:04:17,930 The total probability of B is 2 plus 4 equals 71 00:04:17,930 --> 00:04:21,940 6/12, exactly as before. 72 00:04:21,940 --> 00:04:25,440 So it's a sort of similar situation as before. 73 00:04:25,440 --> 00:04:31,160 How should we revise our probabilities and create-- 74 00:04:31,160 --> 00:04:32,040 construct-- 75 00:04:32,040 --> 00:04:35,800 conditional probabilities once we are told 76 00:04:35,800 --> 00:04:39,430 that event B has occurred? 77 00:04:39,430 --> 00:04:44,430 First, this relation should remain true. 
78 00:04:44,430 --> 00:04:49,780 Once we are told that B has occurred, then B is certain to 79 00:04:49,780 --> 00:04:51,340 occur, so it should have conditional 80 00:04:51,340 --> 00:04:53,280 probability equal to 1. 81 00:04:53,280 --> 00:04:56,600 How about the conditional probability of A given that B 82 00:04:56,600 --> 00:04:57,760 has occurred? 83 00:04:57,760 --> 00:05:00,140 Well, we can reason as follows. 84 00:05:00,140 --> 00:05:06,320 In the original model, and if we just look inside event B, 85 00:05:06,320 --> 00:05:12,050 those outcomes that make event A happen had a collective 86 00:05:12,050 --> 00:05:17,340 probability which was 1/3 of the total probability assigned 87 00:05:17,340 --> 00:05:22,230 to B. So out of the overall probability assigned to B, 1/3 88 00:05:22,230 --> 00:05:25,210 of that probability corresponds to outcomes in 89 00:05:25,210 --> 00:05:27,840 which event A is happening. 90 00:05:27,840 --> 00:05:33,010 So therefore, if I tell you that B has occurred, I should 91 00:05:33,010 --> 00:05:37,710 assign probability equal to 1/3 that event A is 92 00:05:37,710 --> 00:05:39,159 also going to happen. 93 00:05:39,159 --> 00:05:41,750 So that, given that B happened, the conditional 94 00:05:41,750 --> 00:05:46,440 probability of A given B should be equal to 1/3. 95 00:05:46,440 --> 00:05:49,290 By now, we should be satisfied that this approach is a 96 00:05:49,290 --> 00:05:52,210 reasonable way of constructing conditional probabilities. 97 00:05:52,210 --> 00:05:56,220 But now let us translate our reasoning into a formula. 98 00:06:00,920 --> 00:06:05,450 So we wish to come up with a formula that gives us the 99 00:06:05,450 --> 00:06:09,890 conditional probability of an event given another event. 100 00:06:09,890 --> 00:06:14,630 The particular formula that captures our way of thinking, 101 00:06:14,630 --> 00:06:19,540 as motivated before, is the following. 
102 00:06:19,540 --> 00:06:22,700 Out of the total probability assigned to B-- 103 00:06:22,700 --> 00:06:25,290 which is this-- 104 00:06:25,290 --> 00:06:32,230 we ask the question, which fraction of that probability 105 00:06:32,230 --> 00:06:40,650 is assigned to outcomes under which event A also happens? 106 00:06:40,650 --> 00:06:46,890 So we are living inside event B, but within that event, we 107 00:06:46,890 --> 00:06:51,290 look at those outcomes for which event A also happens. 108 00:06:51,290 --> 00:06:55,430 So this is the intersection of A and B. And we ask, out of 109 00:06:55,430 --> 00:06:58,040 the total probability of B, what fraction of that 110 00:06:58,040 --> 00:07:03,460 probability is allocated to that intersection of A with B? 111 00:07:03,460 --> 00:07:06,690 So this formula, this definition, captures our 112 00:07:06,690 --> 00:07:11,370 intuition of what we did before to construct 113 00:07:11,370 --> 00:07:14,530 conditional probabilities in our particular example. 114 00:07:14,530 --> 00:07:17,510 Let us check that the definition indeed does what 115 00:07:17,510 --> 00:07:19,310 it's supposed to do. 116 00:07:19,310 --> 00:07:21,520 In this example, the probability of the 117 00:07:21,520 --> 00:07:27,750 intersection was 2/12 and the total probability of B was 118 00:07:27,750 --> 00:07:33,010 6/12, which gives us 1/3, which is the answer that we 119 00:07:33,010 --> 00:07:38,170 had gotten intuitively a little earlier. 120 00:07:38,170 --> 00:07:42,430 At this point, let me also make a comment that this 121 00:07:42,430 --> 00:07:47,730 definition of conditional probabilities makes sense only 122 00:07:47,730 --> 00:07:51,230 if we do not attempt to divide by zero. 123 00:07:51,230 --> 00:07:55,280 That is, only if the event B, on which we're conditioning, 124 00:07:55,280 --> 00:07:57,900 has positive probability.
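In symbols, the definition described above reads P(A | B) = P(A and B) / P(B), which only makes sense when P(B) is positive. A minimal Python sketch of that definition follows. The function name `conditional_probability` and the choice to raise an error for a zero-probability B are assumptions made for illustration; the numbers 2/12 and 4/12 are the probabilities of the two pieces of B from the example.

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B).

    Hypothetical helper: raising an error is one way to reflect that the
    definition only applies when P(B) is positive.
    """
    if p_b == 0:
        raise ValueError("P(A | B) is not defined when P(B) = 0")
    return p_a_and_b / p_b


# Probabilities from the example: the part of B inside A has 2/12,
# and the part of B outside A has 4/12.
p_a_and_b = Fraction(2, 12)
p_b = Fraction(2, 12) + Fraction(4, 12)   # total probability of B, 6/12

result = conditional_probability(p_a_and_b, p_b)
print(result)  # prints: 1/3
```

Note that the function needs only the two collective probabilities, not a list of individual outcomes, which is why the same calculation works whether or not the outcomes are equally likely.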
125 00:07:57,900 --> 00:08:03,470 If an event B has 0 probability, then conditional 126 00:08:03,470 --> 00:08:08,350 probabilities given B will be left undefined. 127 00:08:08,350 --> 00:08:11,440 And one final comment. 128 00:08:11,440 --> 00:08:14,880 This is a definition. 129 00:08:14,880 --> 00:08:18,310 It's not a theorem. 130 00:08:18,310 --> 00:08:19,690 What does that mean? 131 00:08:19,690 --> 00:08:23,770 It means that there is no question whether this equality 132 00:08:23,770 --> 00:08:25,600 is correct or not. 133 00:08:25,600 --> 00:08:27,520 It's just a definition. 134 00:08:27,520 --> 00:08:31,010 There's no issue of correctness. 135 00:08:31,010 --> 00:08:36,909 The earlier argument that we gave was just a motivation of 136 00:08:36,909 --> 00:08:38,110 the definition. 137 00:08:38,110 --> 00:08:42,610 We tried to figure out what the definition should be if we 138 00:08:42,610 --> 00:08:46,160 want to have a certain intuitive and meaningful 139 00:08:46,160 --> 00:08:49,165 interpretation of the conditional probabilities. 140 00:08:52,600 --> 00:08:54,810 Let us now continue with a simple example.