1 00:00:01,696 --> 00:00:03,520 PROFESSOR: An elementary idea that 2 00:00:03,520 --> 00:00:06,920 gets you a long way in counting things 3 00:00:06,920 --> 00:00:10,660 is this idea of counting with bijections, which is counting 4 00:00:10,660 --> 00:00:12,410 one thing by counting another. 5 00:00:12,410 --> 00:00:14,490 And we can illustrate that by example. 6 00:00:14,490 --> 00:00:16,960 Let's begin with looking at some stuff that 7 00:00:16,960 --> 00:00:22,490 is easy to count using just the simple sum and product rules. 8 00:00:22,490 --> 00:00:25,810 So suppose that I'm trying to count passwords. 9 00:00:25,810 --> 00:00:27,730 This is a contrived, over-simplified example, 10 00:00:27,730 --> 00:00:29,220 but it gives you the idea. 11 00:00:29,220 --> 00:00:31,130 And this is what I mean by a password. 12 00:00:31,130 --> 00:00:33,840 A password is a sequence of characters that are either 13 00:00:33,840 --> 00:00:36,734 letters or digits subject to the constraints 14 00:00:36,734 --> 00:00:38,400 that they are supposed to be between six 15 00:00:38,400 --> 00:00:39,870 and eight characters long. 16 00:00:39,870 --> 00:00:41,620 They're supposed to start with a letter, 17 00:00:41,620 --> 00:00:42,800 and they're case sensitive. 18 00:00:42,800 --> 00:00:45,490 So you can tell the difference between uppercase and lowercase 19 00:00:45,490 --> 00:00:46,830 letters. 20 00:00:46,830 --> 00:00:51,400 So let's define the set L of all the letters-- uppercase 21 00:00:51,400 --> 00:00:53,310 and lowercase together. 22 00:00:53,310 --> 00:00:58,590 And let D be the set of digits from 0 through 9. 23 00:00:58,590 --> 00:01:01,870 Then we said that passwords are supposed 24 00:01:01,870 --> 00:01:04,019 to be between six and eight words long, 25 00:01:04,019 --> 00:01:06,560 but it's a little bit easier actually to just use links 26 00:01:06,560 --> 00:01:07,420 as a parameter. 27 00:01:07,420 --> 00:01:10,840 So let's think about words of length n 28 00:01:10,840 --> 00:01:14,400 that satisfy the password conditions. 29 00:01:14,400 --> 00:01:16,500 So Pn is going to be the length n 30 00:01:16,500 --> 00:01:24,330 words starting with a letter, which is one of the password 31 00:01:24,330 --> 00:01:24,940 constraints. 32 00:01:24,940 --> 00:01:27,500 So we can express that has a length n 33 00:01:27,500 --> 00:01:30,770 word can be broken up into the first character, which 34 00:01:30,770 --> 00:01:34,490 is an L, paired with the rest of the word-- 35 00:01:34,490 --> 00:01:37,060 the remaining n minus 1 characters. 36 00:01:37,060 --> 00:01:40,340 And the remaining n minus 1 characters can be either 37 00:01:40,340 --> 00:01:41,290 L's or D's. 38 00:01:41,290 --> 00:01:45,730 So though length n passwords can be expressed 39 00:01:45,730 --> 00:01:50,950 as the product of L with the n-th power of L union D-- that 40 00:01:50,950 --> 00:01:53,110 is, L union D cross L union D cross 41 00:01:53,110 --> 00:01:55,597 L union D, n minus 1 times. 42 00:01:55,597 --> 00:01:57,430 Well, now we have an easy way to count this, 43 00:01:57,430 --> 00:02:01,750 because the size of this product by the product rule 44 00:02:01,750 --> 00:02:04,180 is the size of L times the size of L union 45 00:02:04,180 --> 00:02:07,570 D to the n minus first power. 46 00:02:07,570 --> 00:02:10,960 And of course, L union D, since letters and digits 47 00:02:10,960 --> 00:02:14,030 don't overlap by the sum rule, the size of them 48 00:02:14,030 --> 00:02:17,540 is just L plus D. And so I get this nice formula 49 00:02:17,540 --> 00:02:22,650 that is 52 letters times 52 letters 50 00:02:22,650 --> 00:02:27,830 plus 10 digits raised to the n minus first power. 51 00:02:27,830 --> 00:02:29,160 What about the passwords? 52 00:02:29,160 --> 00:02:33,930 Well, the passwords were then P6 union P7 union P8. 53 00:02:33,930 --> 00:02:36,437 And since words of length six don't 54 00:02:36,437 --> 00:02:38,270 overlap with words of length seven or eight, 55 00:02:38,270 --> 00:02:39,750 this is a disjoint union. 56 00:02:39,750 --> 00:02:42,880 And therefore, the total number of passwords as specified 57 00:02:42,880 --> 00:02:46,220 is simply the size of P6 plus the size P7 58 00:02:46,220 --> 00:02:47,690 plus the size of P8. 59 00:02:47,690 --> 00:02:49,690 There's the formula when I plug in. 60 00:02:49,690 --> 00:02:53,410 And it turns out to be a good size number, 19 times 10 61 00:02:53,410 --> 00:02:55,860 to the 14th. 62 00:02:55,860 --> 00:02:58,110 That's one simple example where I'm 63 00:02:58,110 --> 00:03:01,560 translating a spec into because something that I can express 64 00:03:01,560 --> 00:03:07,530 easily as a products and disjoint sums of stuff 65 00:03:07,530 --> 00:03:09,740 that I already know the size of. 66 00:03:09,740 --> 00:03:12,530 Let's just do another example. 67 00:03:12,530 --> 00:03:16,850 Suppose that I want to count the number of 4-digit numbers. 68 00:03:16,850 --> 00:03:19,670 So the elements of these 4-digit numbers 69 00:03:19,670 --> 00:03:24,410 are 0 through 9-- there are 10 possibilities-- with at least 70 00:03:24,410 --> 00:03:29,580 one 7-- the number of 4-digit sequences of digits that 71 00:03:29,580 --> 00:03:32,890 have at least one 7 in them. 72 00:03:32,890 --> 00:03:35,380 And one way to count is I can make 73 00:03:35,380 --> 00:03:39,500 it a sum of different 4-digit numbers containing 74 00:03:39,500 --> 00:03:42,090 one 7, depending on where the first 7 is. 75 00:03:42,090 --> 00:03:44,440 If there's at least one 7, there's a first 7. 76 00:03:44,440 --> 00:03:47,080 That's the well-ordering principle applied. 77 00:03:47,080 --> 00:03:49,820 So if we let x abbreviate any digit-- 78 00:03:49,820 --> 00:03:52,500 there are 10 possible values of x-- and o 79 00:03:52,500 --> 00:03:54,900 represent any digit other than 7-- so there's 80 00:03:54,900 --> 00:04:00,660 nine possible values of o-- then the words that start with 7 81 00:04:00,660 --> 00:04:04,940 can then be followed with any three digits. 82 00:04:04,940 --> 00:04:09,390 So 7xxx is one possible pattern when the first occurrence of 7 83 00:04:09,390 --> 00:04:10,470 is first. 84 00:04:10,470 --> 00:04:13,400 Another possible pattern is when you have a digit that's 85 00:04:13,400 --> 00:04:15,035 not 7 followed by a 7. 86 00:04:15,035 --> 00:04:18,430 This is when 7 occurs second followed by anything at all. 87 00:04:18,430 --> 00:04:23,320 Likewise, here 7 occurs third, and here, 7 occurs forth. 88 00:04:23,320 --> 00:04:26,840 Now, these individual patterns are easy enough 89 00:04:26,840 --> 00:04:30,030 to count using the product rule, because here, I 90 00:04:30,030 --> 00:04:33,090 have to count how many triples of any digits are there. 91 00:04:33,090 --> 00:04:35,180 Well, there's 10 digits, so it's 10 cubed. 92 00:04:35,180 --> 00:04:38,990 Here, how many sequences of where 93 00:04:38,990 --> 00:04:44,470 the first choice is 9 and the second two choices are 10. 94 00:04:44,470 --> 00:04:46,360 And it's 9 times 10 squared. 95 00:04:46,360 --> 00:04:48,040 Here, it's 9 squared times 10. 96 00:04:48,040 --> 00:04:49,800 And here it's 9 cubed. 97 00:04:49,800 --> 00:04:52,920 These are disjoint, because they're 98 00:04:52,920 --> 00:04:55,630 distinguished by where the first 7 occurs. 99 00:04:55,630 --> 00:04:56,920 And so I just add them up. 100 00:04:56,920 --> 00:04:58,810 And I get this number. 101 00:04:58,810 --> 00:05:03,540 It's not especially interesting, but it's 3,439. 102 00:05:03,540 --> 00:05:06,140 So that's an exercise in counting something 103 00:05:06,140 --> 00:05:09,160 by somewhat ingeniously breaking it up 104 00:05:09,160 --> 00:05:11,020 into a sum of disjoint things that 105 00:05:11,020 --> 00:05:15,488 are themselves easier to count. 106 00:05:15,488 --> 00:05:17,737 There's another way that's another standard trick that 107 00:05:17,737 --> 00:05:19,290 comes up in combinatorics of how do 108 00:05:19,290 --> 00:05:23,470 you count the sequence of 4-digit numbers 109 00:05:23,470 --> 00:05:29,030 with at least one 7, by counting the complement. 110 00:05:29,030 --> 00:05:33,400 Count the numbers of 4-digit numbers 111 00:05:33,400 --> 00:05:36,220 that don't have any 7's and simply 112 00:05:36,220 --> 00:05:41,210 subtract that number, the number of 4-digit numbers with no 7's, 113 00:05:41,210 --> 00:05:43,575 from the total number of 4-digit numbers. 114 00:05:43,575 --> 00:05:45,200 And that's going to be the numbers that 115 00:05:45,200 --> 00:05:47,110 are left over that have one 7. 116 00:05:47,110 --> 00:05:50,150 Now, the number of 4-digit numbers is easy to count. 117 00:05:50,150 --> 00:05:52,600 And it will turn out that the number of 4-digit numbers 118 00:05:52,600 --> 00:05:54,880 with no 7's is also really easy to count, 119 00:05:54,880 --> 00:05:58,460 because the number of 4-digit numbers is 10 to the fourth 120 00:05:58,460 --> 00:06:01,880 and the number of 4-digit numbers with no 7's, 121 00:06:01,880 --> 00:06:05,910 there's nine possible choices for each of the remaining 122 00:06:05,910 --> 00:06:06,410 digits. 123 00:06:06,410 --> 00:06:09,630 So it's just the digits 0 through 9, 124 00:06:09,630 --> 00:06:13,681 leaving out 7, to the fourth power, or 9 to the fourth. 125 00:06:13,681 --> 00:06:15,930 And you can double check that 10 to the fourth minus 9 126 00:06:15,930 --> 00:06:21,500 to the fourth is 3,4,39. 127 00:06:21,500 --> 00:06:24,950 So now, with that practice using the basic sum and product 128 00:06:24,950 --> 00:06:28,104 rules, we can start applying and thinking 129 00:06:28,104 --> 00:06:29,145 about the bijection rule. 130 00:06:29,145 --> 00:06:30,840 So the bijection rule simply says 131 00:06:30,840 --> 00:06:34,330 that if I have a bijection between two sets A and B, 132 00:06:34,330 --> 00:06:36,740 then they have the same size, at least 133 00:06:36,740 --> 00:06:38,650 assuming that they are finite sets. 134 00:06:38,650 --> 00:06:43,260 And the only kind of things we're counting are finite sets. 135 00:06:43,260 --> 00:06:45,720 Let's use an example of that, where 136 00:06:45,720 --> 00:06:50,940 I'm going to count the number of subsets of a set A 137 00:06:50,940 --> 00:06:54,630 by finding a bijection between the subsets of a set A 138 00:06:54,630 --> 00:06:56,670 and something that I do know how to count. 139 00:06:56,670 --> 00:06:58,390 In fact, we've already counted them, 140 00:06:58,390 --> 00:07:01,260 the binary strings of a given length. 141 00:07:01,260 --> 00:07:02,350 What's the bijection? 142 00:07:02,350 --> 00:07:05,090 Well, suppose that A is a set of n elements, 143 00:07:05,090 --> 00:07:08,430 call them a1 through an. 144 00:07:08,430 --> 00:07:13,080 And I have some arbitrary subset of A. Say, it's got a1, 145 00:07:13,080 --> 00:07:16,060 and it doesn't have a2, and it has a3, and it has a4, 146 00:07:16,060 --> 00:07:17,620 and it doesn't have a5. 147 00:07:17,620 --> 00:07:20,250 And then it's got some selection of the other numbers. 148 00:07:20,250 --> 00:07:23,070 And it turns out it has a n in it. 149 00:07:23,070 --> 00:07:25,130 Well, if I think of a subset laid out 150 00:07:25,130 --> 00:07:29,162 this way up against the corresponding elements in A, 151 00:07:29,162 --> 00:07:31,670 I can code this in an obvious way 152 00:07:31,670 --> 00:07:36,300 by putting a 1 where the element is in the subset and a 0 153 00:07:36,300 --> 00:07:38,700 where the element is not in the subset. 154 00:07:38,700 --> 00:07:41,590 In effect, this is the so-called characteristic function 155 00:07:41,590 --> 00:07:45,830 of the subset where 1 means that that index element-- a 1 156 00:07:45,830 --> 00:07:48,400 in the i-th position means that ai is there. 157 00:07:48,400 --> 00:07:51,950 And a 0 in the i-th position means that ai is not there. 158 00:07:51,950 --> 00:07:56,730 So the second coordinate here is a 0. 159 00:07:56,730 --> 00:07:58,460 That means a2 is not there. 160 00:07:58,460 --> 00:08:01,162 And this is easily seen to be a bijection. 161 00:08:01,162 --> 00:08:03,120 That is, given the string, you could figure out 162 00:08:03,120 --> 00:08:03,940 what the subset is. 163 00:08:03,940 --> 00:08:06,730 Given the subset, you can figure out what the unique string is. 164 00:08:06,730 --> 00:08:08,240 So we have a bijection. 165 00:08:08,240 --> 00:08:10,590 And what we conclude then is that the number 166 00:08:10,590 --> 00:08:15,255 of n-bit strings is equal to the size of power set of A. 167 00:08:15,255 --> 00:08:17,440 It's equal to the number of subsets of A. 168 00:08:17,440 --> 00:08:19,314 And of course, we know how to count 169 00:08:19,314 --> 00:08:20,480 the number of n-bit strings. 170 00:08:20,480 --> 00:08:21,860 It's 2 to the n. 171 00:08:21,860 --> 00:08:25,250 So what we just figured out is, if I have a set of size n, 172 00:08:25,250 --> 00:08:27,770 it's got 2 to the n subsets. 173 00:08:27,770 --> 00:08:30,220 And a slick way to say that without mentioning 174 00:08:30,220 --> 00:08:32,989 n is that the size of the power set of A 175 00:08:32,989 --> 00:08:38,970 is simply 2 the size of A. 176 00:08:38,970 --> 00:08:41,890 One more example of bijection counting 177 00:08:41,890 --> 00:08:47,440 that is kind of fun and interesting 178 00:08:47,440 --> 00:08:50,910 will illustrate the fact that we learn something 179 00:08:50,910 --> 00:08:52,810 by finding a bijection, even if we don't know 180 00:08:52,810 --> 00:08:55,100 how to count either one yet. 181 00:08:55,100 --> 00:08:56,830 So what I'm interested in is, suppose 182 00:08:56,830 --> 00:08:58,246 I have a situation where there are 183 00:08:58,246 --> 00:09:01,320 five kinds of doughnuts-- five different flavors of doughnuts. 184 00:09:01,320 --> 00:09:02,920 And I want to sort of select a dozen. 185 00:09:02,920 --> 00:09:05,003 Now, I want to know how many selections there are. 186 00:09:05,003 --> 00:09:08,220 So for example-- these little O's represent doughnuts-- 187 00:09:08,220 --> 00:09:10,370 I might choose a selection of a dozen 188 00:09:10,370 --> 00:09:12,340 by choosing two chocolate and no lemon-- I 189 00:09:12,340 --> 00:09:15,330 don't like those so much-- and six sugars and two 190 00:09:15,330 --> 00:09:17,350 glazed and two plain. 191 00:09:17,350 --> 00:09:23,630 So there are 12 doughnuts here using four out 192 00:09:23,630 --> 00:09:25,780 of the five possible flavors of doughnuts. 193 00:09:25,780 --> 00:09:28,670 This is what I'll call a selection of a doughnut. 194 00:09:28,670 --> 00:09:31,670 And I'd like to know how many such selections of doughnuts 195 00:09:31,670 --> 00:09:33,000 are there. 196 00:09:33,000 --> 00:09:36,910 Well, let that be the set A, the set of all these different ways 197 00:09:36,910 --> 00:09:40,090 of selecting 12 doughnuts when there are five 198 00:09:40,090 --> 00:09:41,920 flavors of doughnuts available. 199 00:09:41,920 --> 00:09:45,630 Well, this is, again, an obvious correspondence between the set 200 00:09:45,630 --> 00:09:51,590 A of doughnut selections and the set B of 0's and 1's of length 201 00:09:51,590 --> 00:09:54,120 16 that contain four 1's. 202 00:09:54,120 --> 00:09:55,240 What's the correspondence? 203 00:09:55,240 --> 00:09:57,200 Well, here's my doughnut selection. 204 00:09:57,200 --> 00:10:00,570 And of course, the reason why I use those O's for doughnuts is 205 00:10:00,570 --> 00:10:03,240 that they also correspond to 0's. 206 00:10:03,240 --> 00:10:06,410 I can just put in 1's as delimiters 207 00:10:06,410 --> 00:10:08,130 between the groups of flavors. 208 00:10:08,130 --> 00:10:11,850 So after the chocolate doughnuts, I put a 1. 209 00:10:11,850 --> 00:10:13,730 And then after the lemon doughnuts, 210 00:10:13,730 --> 00:10:15,640 that happen to be none, I put another 1. 211 00:10:15,640 --> 00:10:20,280 And then after the six sugar doughnuts, I put a 1. 212 00:10:20,280 --> 00:10:23,560 And then I kind of consolidate and I extract from the doughnut 213 00:10:23,560 --> 00:10:28,770 selection this 16-bit word with 12 0's 214 00:10:28,770 --> 00:10:32,330 corresponding to 12 doughnuts and four 1's 215 00:10:32,330 --> 00:10:37,690 corresponding to breaking up those groups of 0's into five 216 00:10:37,690 --> 00:10:40,320 categories, five slots, corresponding 217 00:10:40,320 --> 00:10:44,050 to the number of doughnuts of each flavor. 218 00:10:44,050 --> 00:10:45,850 So the general bijection, of course, 219 00:10:45,850 --> 00:10:49,910 is that if I have a selection of c chocolate doughnuts, 220 00:10:49,910 --> 00:10:52,660 l lemon doughnuts, s sugar doughnuts, g glazed, 221 00:10:52,660 --> 00:10:57,310 and p plain of any number really, 222 00:10:57,310 --> 00:11:01,120 a selection of doughnuts with this number of chocolates, 223 00:11:01,120 --> 00:11:06,730 lemons, glazed, plain corresponds to a binary word 224 00:11:06,730 --> 00:11:12,830 with c plus l plus s plus g plus p 0's and four 1's. 225 00:11:12,830 --> 00:11:15,130 And so what we can say is that the set 226 00:11:15,130 --> 00:11:21,270 of 16-digit words with four 1's is exactly the same size 227 00:11:21,270 --> 00:11:22,820 as the number of doughnut selections, 228 00:11:22,820 --> 00:11:24,840 even though at this moment we don't 229 00:11:24,840 --> 00:11:26,590 know how to count either one. 230 00:11:26,590 --> 00:11:30,110 We will see in the next lecture an easy way to count the number 231 00:11:30,110 --> 00:11:34,090 of those 16-bit words with four 1's. 232 00:11:34,090 --> 00:11:36,720 But for now, our conclusion from bijection counting 233 00:11:36,720 --> 00:11:39,300 is that these two sets are the same size, even though I 234 00:11:39,300 --> 00:11:41,809 haven't counted yet either one.