1 00:00:01,639 --> 00:00:03,180 PROFESSOR: The arguments that we just 2 00:00:03,180 --> 00:00:05,590 reviewed for proving inclusion-exclusion 3 00:00:05,590 --> 00:00:07,990 by looking at how many times points are counted 4 00:00:07,990 --> 00:00:09,540 can be made perfectly rigorous. 5 00:00:09,540 --> 00:00:11,650 In fact, we'll do that in a later segment, 6 00:00:11,650 --> 00:00:13,960 but it's interesting and good practice 7 00:00:13,960 --> 00:00:16,530 to realize that they can also be proved just 8 00:00:16,530 --> 00:00:20,070 from some simple set-theoretic identities using 9 00:00:20,070 --> 00:00:23,060 the ordinary disjoint sum rule. 10 00:00:23,060 --> 00:00:27,590 So how am I going to prove the inclusion-exclusion principle 11 00:00:27,590 --> 00:00:28,260 for two sets? 12 00:00:30,890 --> 00:00:32,530 The size of A union B is the size of A 13 00:00:32,530 --> 00:00:36,110 plus the size of B minus the size of A intersect B, 14 00:00:36,110 --> 00:00:43,120 and the idea is just to break up A union B into disjoint sets 15 00:00:43,120 --> 00:00:48,110 because once they're disjoint sets, I can add up their sizes. 16 00:00:48,110 --> 00:00:52,570 So if we look at A union B, A union B 17 00:00:52,570 --> 00:00:56,430 can be expressed as the union of two disjoint sets, 18 00:00:56,430 --> 00:01:00,870 namely A-- the round blue circle-- and what's 19 00:01:00,870 --> 00:01:05,710 left-- the points in B that are not in A, so this lighter 20 00:01:05,710 --> 00:01:07,230 orange-colored region. 21 00:01:07,230 --> 00:01:11,010 So A and B minus A-- so these are the points in A 22 00:01:11,010 --> 00:01:14,140 and these are the points that are in B that are not in A-- 23 00:01:14,140 --> 00:01:19,320 are two disjoint sets whose union is A union B. 24 00:01:19,320 --> 00:01:22,050 That means that I know the size of this.
25 00:01:22,050 --> 00:01:25,420 It's just the size of A plus the size of B minus A 26 00:01:25,420 --> 00:01:28,990 because they're disjoint, so we conclude by the sum rule 27 00:01:28,990 --> 00:01:32,320 that the size of A union B is equal to the size of A 28 00:01:32,320 --> 00:01:35,800 plus the size of B minus A. 29 00:01:35,800 --> 00:01:37,860 So now I need a little lemma that 30 00:01:37,860 --> 00:01:42,360 says that the size of B minus A is the size of B 31 00:01:42,360 --> 00:01:45,480 minus the size of A intersection B. 32 00:01:45,480 --> 00:01:48,230 If I get that, then I've proved inclusion-exclusion 33 00:01:48,230 --> 00:01:52,530 because now I have the size of A plus the size of B minus the size of A intersection B. 34 00:01:52,530 --> 00:01:53,930 So we need a lemma. 35 00:01:53,930 --> 00:01:57,000 The lemma says that the size of B minus A 36 00:01:57,000 --> 00:02:00,925 is equal to the size of B minus the size of A intersect B. 37 00:02:00,925 --> 00:02:02,550 So here we're back to the Venn diagram. 38 00:02:05,710 --> 00:02:11,350 A is blue, B is reddish, and the intersection region, 39 00:02:11,350 --> 00:02:16,040 that lens-shaped region, is shown in purple, 40 00:02:16,040 --> 00:02:18,350 and we want to prove this lemma. 41 00:02:18,350 --> 00:02:27,520 Well, again, what we can do is look at the set B broken up 42 00:02:27,520 --> 00:02:29,560 into two pieces. 43 00:02:29,560 --> 00:02:33,830 B can be expressed as a disjoint union of the part of B 44 00:02:33,830 --> 00:02:38,220 that's in A, union the part of B that's not in A. That is, 45 00:02:38,220 --> 00:02:41,610 for any set B and any set A, B is 46 00:02:41,610 --> 00:02:44,840 equal to the B points that are in A 47 00:02:44,840 --> 00:02:47,700 and the B points that are not in A-- that covers all the cases. 48 00:02:47,700 --> 00:02:50,020 And this, again, is a disjoint union.
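The two disjoint decompositions described so far can be checked on a small example. This is only an illustrative sketch: the sets `A` and `B` below are made-up values, not anything from the lecture, and Python's built-in set operators stand in for the Venn-diagram regions.

```python
# Hypothetical example sets; any finite sets would work here.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7}

b_minus_a = B - A   # the points of B that are not in A
b_and_a = B & A     # the lens-shaped overlap of A and B

# First decomposition: A and B - A are disjoint, and their union is A union B.
assert A.isdisjoint(b_minus_a)
assert A | b_minus_a == A | B

# Second decomposition: B is the disjoint union of (B intersect A) and (B - A).
assert b_and_a.isdisjoint(b_minus_a)
assert b_and_a | b_minus_a == B

# Sum rule applied to the first decomposition.
union_size = len(A) + len(b_minus_a)
```

Because each pair of pieces is disjoint, the sum rule lets the sizes simply be added, which is what `union_size` does for the first decomposition.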
49 00:02:50,020 --> 00:02:54,700 So we conclude immediately from the sum rule that the size of B 50 00:02:54,700 --> 00:02:57,410 is equal to the size of B intersection 51 00:02:57,410 --> 00:03:01,430 A plus the size of B minus A. And then just transposing 52 00:03:01,430 --> 00:03:03,810 this term for the size of B minus A 53 00:03:03,810 --> 00:03:07,450 to the left-hand side of the equality, I've proven the lemma 54 00:03:07,450 --> 00:03:10,540 and we're done. 55 00:03:10,540 --> 00:03:12,530 Now, inclusion-exclusion for three sets, 56 00:03:12,530 --> 00:03:15,310 we've said before it's this slightly more complicated thing 57 00:03:15,310 --> 00:03:19,895 where you've got the sum of the sizes of the individual 58 00:03:19,895 --> 00:03:25,740 sets minus the sizes of the intersections of two sets 59 00:03:25,740 --> 00:03:29,000 plus the size of the intersection of three sets. 60 00:03:29,000 --> 00:03:34,190 And that generalizes to the following somewhat messy 61 00:03:34,190 --> 00:03:37,010 formula, but let's look at it closely together. 62 00:03:37,010 --> 00:03:41,110 If I want to know what's the size of the union of A1 through An-- 63 00:03:41,110 --> 00:03:45,510 if I have n sets potentially overlapping A1 A2 through An 64 00:03:45,510 --> 00:03:49,540 and I want the size of their union-- I can express it 65 00:03:49,540 --> 00:03:53,210 in terms of a sum of sizes of intersections, 66 00:03:53,210 --> 00:03:55,060 and here's the formula. 67 00:03:55,060 --> 00:03:58,260 Let's read this slowly together. 68 00:03:58,260 --> 00:04:03,060 So this is a sum over every possible subset of the indices 69 00:04:03,060 --> 00:04:05,710 1 through n that's not empty. 70 00:04:05,710 --> 00:04:10,100 So this sum is ranging over S, and it can do it in any order.
71 00:04:10,100 --> 00:04:13,580 It's typical to write the sum where you sum up first all 72 00:04:13,580 --> 00:04:18,380 the sets S of size 1 and then sum up all the sets S of size 2 73 00:04:18,380 --> 00:04:20,459 and sum-- but that's not necessary. 74 00:04:20,459 --> 00:04:26,280 We just sum in any order for every subset of the indices, 75 00:04:26,280 --> 00:04:29,080 the size of the intersection of the Ai's that 76 00:04:29,080 --> 00:04:32,060 are in this set of indices specified by S. 77 00:04:32,060 --> 00:04:39,140 So this is just the intersection of those A's that S specifies. 78 00:04:39,140 --> 00:04:43,980 Now, what's the sign of that size of intersection? 79 00:04:43,980 --> 00:04:48,300 As we said, if it's of odd size, I want it to count positively. 80 00:04:48,300 --> 00:04:52,400 So if I take minus 1 to the odd size plus 1, 81 00:04:52,400 --> 00:04:57,520 I get an even power of minus 1, so it comes out to be 1. 82 00:04:57,520 --> 00:05:00,620 If, on the other hand, the size of S 83 00:05:00,620 --> 00:05:02,560 is even-- so I'm taking an intersection 84 00:05:02,560 --> 00:05:06,440 of an even number of sets-- then this number plus 1 85 00:05:06,440 --> 00:05:07,030 is odd. 86 00:05:07,030 --> 00:05:09,070 I'm taking minus 1 to an odd power, 87 00:05:09,070 --> 00:05:11,610 and sure enough, I'm getting the negative sign 88 00:05:11,610 --> 00:05:15,440 on all the intersections of even size. 89 00:05:15,440 --> 00:05:19,830 So that's what this rather concise but hairy formula 90 00:05:19,830 --> 00:05:20,810 looks like. 91 00:05:20,810 --> 00:05:24,120 Here we have an intersection over the Ai's where 92 00:05:24,120 --> 00:05:27,200 the i is specified by the set S of indices, 93 00:05:27,200 --> 00:05:31,930 and I sum up these terms over every possible nonempty set 94 00:05:31,930 --> 00:05:36,490 S. That is the generalized form of inclusion-exclusion 95 00:05:36,490 --> 00:05:38,950 for n sets.
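The formula just read aloud can be turned into a short computation. The sketch below is not from the lecture: the function name `union_size_by_inclusion_exclusion` and the example sets are assumptions chosen for illustration. It sums (-1)^(|S|+1) times the size of the intersection of the A_i with i in S, over every nonempty index set S, visiting the index sets in order of size, which the lecture notes is typical but not required.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets):
    """Size of the union of `sets`, computed by the inclusion-exclusion sum.

    `sets` is a list of finite sets A_1, ..., A_n (a hypothetical input format).
    """
    n = len(sets)
    total = 0
    # Enumerate every nonempty subset S of the indices, grouped by size k = |S|.
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            # Intersection of the sets whose indices are in S.
            inter = set(sets[S[0]]).intersection(*(sets[i] for i in S[1:]))
            # The sign is +1 when |S| is odd and -1 when |S| is even.
            total += (-1) ** (k + 1) * len(inter)
    return total


# Made-up example with three overlapping sets.
example = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
ie_size = union_size_by_inclusion_exclusion(example)
direct_size = len(set().union(*example))
```

For the example sets, the sum works out to 3 + 3 + 3 - 2 - 1 - 2 + 1, so `ie_size` and `direct_size` should agree. The loop visits 2^n - 1 index sets, so this is a check of the formula rather than an efficient way to compute a union.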
96 00:05:38,950 --> 00:05:40,230 Now, how do we prove this? 97 00:05:40,230 --> 00:05:42,455 Well, there's lots of ways to prove it. 98 00:05:42,455 --> 00:05:44,580 The simplest way is to do it actually by induction. 99 00:05:44,580 --> 00:05:46,900 It's not very hard to do by induction. 100 00:05:46,900 --> 00:05:51,720 You just use the two-set version of inclusion-exclusion, which 101 00:05:51,720 --> 00:05:55,160 we've proved rigorously using the disjoint sum rule, 102 00:05:55,160 --> 00:05:58,800 and you go from the union of the first n minus 1 sets plus the nth 103 00:05:58,800 --> 00:06:01,299 and apply the formula and simplify, and that works fine. 104 00:06:01,299 --> 00:06:03,590 The only problem with it is it's not really very 105 00:06:03,590 --> 00:06:04,117 informative. 106 00:06:04,117 --> 00:06:05,450 You prove the theorem all right. 107 00:06:05,450 --> 00:06:09,080 As is frequently the case with induction proofs, 108 00:06:09,080 --> 00:06:10,860 you have the right induction hypothesis, 109 00:06:10,860 --> 00:06:13,068 the proof is kind of mechanical, but you don't really 110 00:06:13,068 --> 00:06:15,400 learn much from it-- not always, but in this case, 111 00:06:15,400 --> 00:06:16,990 I don't think you do. 112 00:06:16,990 --> 00:06:21,490 A second way to do it is by using the binomial theorem 113 00:06:21,490 --> 00:06:22,130 and counting. 114 00:06:22,130 --> 00:06:24,510 This is a way to make rigorous the argument that said, 115 00:06:24,510 --> 00:06:30,730 OK, we counted points in the intersection of two things 116 00:06:30,730 --> 00:06:32,500 twice, so we had to subtract them away, 117 00:06:32,500 --> 00:06:34,001 and then that meant that we were not 118 00:06:34,001 --> 00:06:36,416 counting the things that were in the intersection of three 119 00:06:36,416 --> 00:06:37,430 at all, and so on.
120 00:06:37,430 --> 00:06:40,940 And we've talked through that argument informally, 121 00:06:40,940 --> 00:06:44,230 and I will do that in the next segment, making 122 00:06:44,230 --> 00:06:46,310 that argument a little bit more precise to prove 123 00:06:46,310 --> 00:06:47,640 the general theorem. 124 00:06:47,640 --> 00:06:48,140 And-- 125 00:06:48,140 --> 00:06:50,996 [AUDIO OUT] 126 00:06:52,430 --> 00:06:56,690 Law from algebra, we can just look at a product of sums 127 00:06:56,690 --> 00:07:00,060 and understand how it expands into a sum of products 128 00:07:00,060 --> 00:07:01,620 by the distributivity law. 129 00:07:01,620 --> 00:07:05,730 And that's worked out in a problem that is in the text, 130 00:07:05,730 --> 00:07:08,650 and I'm not going to do that in a video.
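For the induction proof mentioned earlier, the first move of the inductive step can be written out as follows. This is only a sketch of the setup, under the assumption that the induction is on the number of sets n: it applies the two-set version to the union of the first n - 1 sets and A_n, then distributes the intersection over the union. The remaining simplification, which applies the induction hypothesis to both unions of n - 1 sets, is not shown.

```latex
\begin{align*}
\Bigl|\bigcup_{i=1}^{n} A_i\Bigr|
  &= \Bigl|\bigcup_{i=1}^{n-1} A_i\Bigr| + |A_n|
     - \Bigl|\Bigl(\bigcup_{i=1}^{n-1} A_i\Bigr) \cap A_n\Bigr| \\
  &= \Bigl|\bigcup_{i=1}^{n-1} A_i\Bigr| + |A_n|
     - \Bigl|\bigcup_{i=1}^{n-1} \bigl(A_i \cap A_n\bigr)\Bigr|
\end{align*}
```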