1 00:00:00,790 --> 00:00:03,450 In this segment, we develop the inclusion-exclusion 2 00:00:03,450 --> 00:00:07,220 formula, which is a beautiful generalization of a formula 3 00:00:07,220 --> 00:00:09,390 that we have seen before. 4 00:00:09,390 --> 00:00:11,500 Let us look at this formula and remind 5 00:00:11,500 --> 00:00:13,520 ourselves what it says. 6 00:00:13,520 --> 00:00:19,110 If we have two sets, A1 and A2, and we're interested in 7 00:00:19,110 --> 00:00:22,080 the probability of their union, how can we find it? 8 00:00:22,080 --> 00:00:25,960 We take the probability of the first set, we add to it the 9 00:00:25,960 --> 00:00:29,810 probability of the second set, but then we realize that by 10 00:00:29,810 --> 00:00:32,710 doing so we have double counted 11 00:00:32,710 --> 00:00:34,770 this part of the diagram. 12 00:00:34,770 --> 00:00:38,740 And so we need to correct for that and we need to subtract 13 00:00:38,740 --> 00:00:41,040 the probability of this intersection. 14 00:00:41,040 --> 00:00:43,830 And that's how this formula comes about. 15 00:00:43,830 --> 00:00:46,350 Can we generalize this thinking, let's say, to the 16 00:00:46,350 --> 00:00:49,240 case of three events? 17 00:00:49,240 --> 00:01:01,240 Suppose that we have three events, A1, A2, and A3. 18 00:01:01,240 --> 00:01:05,470 And we want to calculate the probability of their union. 19 00:01:05,470 --> 00:01:08,920 We first start by adding the probabilities of 20 00:01:08,920 --> 00:01:10,660 the different sets. 21 00:01:10,660 --> 00:01:14,260 But then we realize that, for example, this part of the 22 00:01:14,260 --> 00:01:17,410 diagram has been counted twice. 23 00:01:17,410 --> 00:01:21,090 It shows up once inside the probability of A1 and once 24 00:01:21,090 --> 00:01:24,180 inside the probability of A2. 25 00:01:24,180 --> 00:01:27,740 So, for this reason, we need to make a correction and we 26 00:01:27,740 --> 00:01:29,930 need to subtract the probability of this 27 00:01:29,930 --> 00:01:30,670 intersection. 28 00:01:30,670 --> 00:01:32,630 Similarly, subtract the probability of that 29 00:01:32,630 --> 00:01:34,940 intersection and of this one. 30 00:01:34,940 --> 00:01:37,630 So we subtract the probabilities of these 31 00:01:37,630 --> 00:01:38,800 intersections. 32 00:01:38,800 --> 00:01:41,120 But, actually, the intersections are not just 33 00:01:41,120 --> 00:01:42,340 what I drew here. 34 00:01:42,340 --> 00:01:45,780 The intersections also involve this part. 35 00:01:45,780 --> 00:01:50,940 So now, let us just focus on this part of the diagram here. 36 00:01:50,940 --> 00:01:57,570 A typical element that belongs to all three of the sets will 37 00:01:57,570 --> 00:02:02,520 show up once here, once here and once there. 38 00:02:02,520 --> 00:02:07,350 But it will also show up in all of these intersections. 39 00:02:07,350 --> 00:02:10,610 And so it shows up three times with a plus sign, three times 40 00:02:10,610 --> 00:02:13,390 with a minus sign, which means that these elements will not 41 00:02:13,390 --> 00:02:14,990 to be counted at all. 42 00:02:14,990 --> 00:02:19,280 In order to count them, we need to add one more term 43 00:02:19,280 --> 00:02:23,640 which is the probability of the three way intersection. 44 00:02:23,640 --> 00:02:27,760 So this is the formula for the probability of the union of 45 00:02:27,760 --> 00:02:30,550 three events. 46 00:02:30,550 --> 00:02:34,329 It has a rationale similar to this formula, and you can 47 00:02:34,329 --> 00:02:37,829 convince yourself that it is a correct formula by just 48 00:02:37,829 --> 00:02:40,490 looking at the different pieces of this diagram and 49 00:02:40,490 --> 00:02:44,410 make sure that each one of them is accounted properly. 50 00:02:44,410 --> 00:02:48,055 But instead of working in terms of such a picture, let 51 00:02:48,055 --> 00:02:51,050 us think about a more formal derivation. 52 00:02:51,050 --> 00:02:54,550 And the formal derivation will use a beautiful trick. 53 00:02:54,550 --> 00:02:58,950 Namely, indicator functions. 54 00:02:58,950 --> 00:03:02,800 So here is the formula that we want to establish. 55 00:03:02,800 --> 00:03:08,190 And let us remind ourselves what indicator functions are. 56 00:03:08,190 --> 00:03:11,680 To any set or event, we can associate 57 00:03:11,680 --> 00:03:13,640 an indicator function. 58 00:03:13,640 --> 00:03:16,440 Let's say that this is the set Ai. 59 00:03:16,440 --> 00:03:20,900 We're going to associate an indicator function, call it 60 00:03:20,900 --> 00:03:25,070 Xi, which is equal to 1 when the outcome is inside this 61 00:03:25,070 --> 00:03:29,250 set, and it's going to be 0 when the outcome is outside. 62 00:03:29,250 --> 00:03:32,829 What is the indicator function of the complement? 63 00:03:32,829 --> 00:03:37,190 The indicator function of the complement is 1 minus the 64 00:03:37,190 --> 00:03:39,600 indicator of the event. 65 00:03:39,600 --> 00:03:41,300 Why is this? 66 00:03:41,300 --> 00:03:47,180 If the outcome is in the complement, then Xi is equal 67 00:03:47,180 --> 00:03:50,600 to 0, and this expression is equal to 1. 68 00:03:50,600 --> 00:03:55,110 On the other hand, if the outcome is inside Ai, then the 69 00:03:55,110 --> 00:03:58,610 indicator function will be equal to 1 and this quantity 70 00:03:58,610 --> 00:04:01,520 is going to be equal to 0. 71 00:04:01,520 --> 00:04:07,520 If we have the intersection of two events, Ai and Aj, what is 72 00:04:07,520 --> 00:04:09,360 their indicator function? 73 00:04:09,360 --> 00:04:12,930 It is Xi times Xj. 74 00:04:12,930 --> 00:04:17,149 This expression is equal to 1, if and only if, Xi is equal to 75 00:04:17,149 --> 00:04:21,890 1 and Xj is equal to 1, which happens, if and only if, the 76 00:04:21,890 --> 00:04:27,000 outcome is inside Ai and also inside Aj. 77 00:04:27,000 --> 00:04:31,710 Now, what about the indicator of the intersection of the 78 00:04:31,710 --> 00:04:32,960 complements? 79 00:04:35,409 --> 00:04:36,940 Well, it's an intersection. 80 00:04:36,940 --> 00:04:40,150 So the associated indicator function is going to be the 81 00:04:40,150 --> 00:04:43,640 product of the indicator function of the first set, 82 00:04:43,640 --> 00:04:48,560 which is 1 minus Xi times the indicator function of the 83 00:04:48,560 --> 00:04:52,840 second set, which is 1 minus Xj. 84 00:04:52,840 --> 00:04:55,930 And finally, what is the indicator 85 00:04:55,930 --> 00:04:57,895 function of this event? 86 00:05:00,840 --> 00:05:04,380 Here we remember De Morgan's Laws. 87 00:05:04,380 --> 00:05:09,770 De Morgan's Laws tell us that the complement of this set-- 88 00:05:09,770 --> 00:05:11,470 the complement of a union-- 89 00:05:11,470 --> 00:05:13,880 is the intersection of the complements. 90 00:05:13,880 --> 00:05:18,140 So this event here is the complement of that event. 91 00:05:18,140 --> 00:05:21,850 And, therefore, the associated indicator function is going to 92 00:05:21,850 --> 00:05:24,435 be 1 minus this expression. 93 00:05:31,440 --> 00:05:35,030 And if we were dealing with more than two sets-- 94 00:05:35,030 --> 00:05:38,600 and here we had, for example, three way intersections-- 95 00:05:38,600 --> 00:05:41,430 you would get the product of three terms. 96 00:05:41,430 --> 00:05:44,770 And if we had a three way union, we would get a similar 97 00:05:44,770 --> 00:05:47,460 expression, except that here we would have, again, a 98 00:05:47,460 --> 00:05:51,250 product of three terms instead of two. 99 00:05:51,250 --> 00:05:57,159 So now, let us put to use what we have done so far. 100 00:05:57,159 --> 00:06:02,030 We are interested in the probability that the outcome 101 00:06:02,030 --> 00:06:06,550 falls in the union of three sets. 102 00:06:06,550 --> 00:06:09,760 Now, an important fact to remember is that the 103 00:06:09,760 --> 00:06:15,300 probability of an event is the same as the expected value of 104 00:06:15,300 --> 00:06:17,660 the indicator of that event. 105 00:06:22,810 --> 00:06:28,090 This is because the indicator is equal to 1, if and only if, 106 00:06:28,090 --> 00:06:32,400 the outcome happens to be inside that set. 107 00:06:32,400 --> 00:06:36,820 And so the contribution that we get to the expectation is 1 108 00:06:36,820 --> 00:06:40,810 times the probability that the indicator is 1, which is just 109 00:06:40,810 --> 00:06:43,070 this probability. 110 00:06:43,070 --> 00:06:51,600 Now, the indicator of a three way union is going to be, by 111 00:06:51,600 --> 00:06:56,620 what we just discussed, 1 minus a product of this kind, 112 00:06:56,620 --> 00:06:58,125 but now with three terms. 113 00:07:05,660 --> 00:07:10,320 Let us now calculate this expectation by expanding the 114 00:07:10,320 --> 00:07:12,020 product involved. 115 00:07:12,020 --> 00:07:16,480 We have this first term, then, when we multiply those three 116 00:07:16,480 --> 00:07:19,960 terms together, we're going to get a bunch of contributions. 117 00:07:19,960 --> 00:07:25,170 One contribution with a minus sign is 1 times 1 times 1. 118 00:07:25,170 --> 00:07:27,970 Another contribution would be minus minus-- 119 00:07:27,970 --> 00:07:28,580 that's a plus-- 120 00:07:28,580 --> 00:07:32,420 X1 times 1 times 1. 121 00:07:32,420 --> 00:07:37,360 And similarly, we get a contribution of X2 and X3. 122 00:07:37,360 --> 00:07:41,050 And then we have a contribution such as X1 times 123 00:07:41,050 --> 00:07:43,310 X2 times 1. 124 00:07:43,310 --> 00:07:45,630 And if you look at the minus signs-- 125 00:07:45,630 --> 00:07:48,250 there are three minuses involved-- so, overall, it's 126 00:07:48,250 --> 00:07:49,750 going to be a minus. 127 00:07:49,750 --> 00:07:52,510 Minus X1 times X2. 128 00:07:52,510 --> 00:07:55,350 And then there is going to be similar terms, such as 129 00:07:55,350 --> 00:08:01,010 X1 X3 and X2 X3. 130 00:08:01,010 --> 00:08:03,680 And, finally, there's going to be a term X1 131 00:08:03,680 --> 00:08:06,100 times X2 times X3. 132 00:08:06,100 --> 00:08:09,770 There's a total of four minus signs involved, so everything 133 00:08:09,770 --> 00:08:12,230 shows up in the end with a plus sign. 134 00:08:17,410 --> 00:08:20,280 So the probability of this event is equal to the 135 00:08:20,280 --> 00:08:23,530 expectation of this random variable here. 136 00:08:23,530 --> 00:08:26,270 We notice that the ones cancel out. 137 00:08:26,270 --> 00:08:31,310 The expected value of X1 for an indicator variable is just 138 00:08:31,310 --> 00:08:32,799 the probability of that event. 139 00:08:32,799 --> 00:08:33,820 And we get this term. 140 00:08:33,820 --> 00:08:39,120 The expected value of X2 and X3 give us these terms. 141 00:08:39,120 --> 00:08:42,159 The expected value of X1 times X2. 142 00:08:42,159 --> 00:08:46,110 This is the indicator random variable of the intersection. 143 00:08:46,110 --> 00:08:50,380 So the expected value of this term is just the probability 144 00:08:50,380 --> 00:08:53,740 of the intersection of A1 and A2. 145 00:08:53,740 --> 00:08:58,800 And, similarly, these terms here give rise to those two 146 00:08:58,800 --> 00:09:00,360 terms here. 147 00:09:00,360 --> 00:09:06,610 Finally, X1 times X2 times X3 is the indicator variable for 148 00:09:06,610 --> 00:09:10,290 the event A1 intersection A2 intersection A3. 149 00:09:10,290 --> 00:09:14,220 Therefore, the expected value of this term, is equal to this 150 00:09:14,220 --> 00:09:15,600 probability. 151 00:09:15,600 --> 00:09:19,290 And, therefore, we have established exactly the 152 00:09:19,290 --> 00:09:22,940 formula that we wanted to establish. 153 00:09:22,940 --> 00:09:26,310 Now this derivation that we carried out here, there's 154 00:09:26,310 --> 00:09:29,070 nothing special about the case of three. 155 00:09:29,070 --> 00:09:32,770 We could have the union of many more events, we would 156 00:09:32,770 --> 00:09:36,750 just have here the product of more terms, and we would need 157 00:09:36,750 --> 00:09:40,010 to carry out the multiplication and we would 158 00:09:40,010 --> 00:09:45,770 get cross terms of all types involving just one of the 159 00:09:45,770 --> 00:09:48,400 indicator variables, or products of two indicator 160 00:09:48,400 --> 00:09:50,680 variables, or products of three indicator 161 00:09:50,680 --> 00:09:53,020 variables, and so on. 162 00:09:53,020 --> 00:09:56,790 And after you carry out this exercise and keep track of the 163 00:09:56,790 --> 00:10:01,870 various terms, you end up with this general version of what 164 00:10:01,870 --> 00:10:04,530 is called the inclusion-exclusion formula. 165 00:10:04,530 --> 00:10:07,390 So the probability of a union is-- 166 00:10:07,390 --> 00:10:10,000 there's the sum of the probabilities, but then you 167 00:10:10,000 --> 00:10:14,600 subtract all possible probabilities of two way 168 00:10:14,600 --> 00:10:16,060 intersections. 169 00:10:16,060 --> 00:10:20,510 Then we add probabilities of three way intersections, then 170 00:10:20,510 --> 00:10:23,560 you subtract probabilities of four way intersections, and 171 00:10:23,560 --> 00:10:28,330 you keep going this way alternating sings until you 172 00:10:28,330 --> 00:10:32,300 get to the last term, which is the probability of the 173 00:10:32,300 --> 00:10:36,570 intersection of all the events involved. 174 00:10:36,570 --> 00:10:39,810 And this exponent here of n minus 1 is the exponent that 175 00:10:39,810 --> 00:10:44,370 you need so that the last term has the correct sign. 176 00:10:44,370 --> 00:10:50,210 So, for example, if n is equal to 3, the exponent would be 2, 177 00:10:50,210 --> 00:10:54,130 so this would be a plus sign, which is consistent with what 178 00:10:54,130 --> 00:10:56,640 we got here. 179 00:10:56,640 --> 00:11:00,140 So this is a formula that is quite useful when you want to 180 00:11:00,140 --> 00:11:03,730 calculate probabilities of unions of events. 181 00:11:03,730 --> 00:11:08,640 But also, this derivation using indicator functions, is 182 00:11:08,640 --> 00:11:09,890 quite beautiful.