1 00:00:00,061 --> 00:00:01,810 PROFESSOR: We're going to look at the most 2 00:00:01,810 --> 00:00:05,710 fundamental of all mathematical data types, namely sets, 3 00:00:05,710 --> 00:00:07,661 and let's begin with the definitions. 4 00:00:12,070 --> 00:00:15,460 So informally, a set is a collection 5 00:00:15,460 --> 00:00:18,280 of mathematical objects, and the idea 6 00:00:18,280 --> 00:00:22,350 is that you treat the collection of objects as one new object. 7 00:00:22,350 --> 00:00:24,390 And that's a working definition. 8 00:00:24,390 --> 00:00:27,170 But of course, it might help a little bit, 9 00:00:27,170 --> 00:00:28,510 but it's a circular definition. 10 00:00:28,510 --> 00:00:31,347 This is not math yet because I haven't defined 11 00:00:31,347 --> 00:00:32,930 what a collection is, and a collection 12 00:00:32,930 --> 00:00:36,030 is no clearer or easier to define than a set is. 13 00:00:36,030 --> 00:00:38,960 So let's try to work up the idea of sets 14 00:00:38,960 --> 00:00:42,539 by looking at some examples. 15 00:00:42,539 --> 00:00:44,580 So we've already talked about some familiar ones. 16 00:00:44,580 --> 00:00:45,996 There's the real numbers for which 17 00:00:45,996 --> 00:00:50,120 we had this symbol R in a special font, 18 00:00:50,120 --> 00:00:54,120 and the complex numbers C, and the integers 19 00:00:54,120 --> 00:00:58,850 Z. And we might have mentioned, and you 20 00:00:58,850 --> 00:01:01,990 might have seen already, the idea of the empty set for which 21 00:01:01,990 --> 00:01:06,170 we use this symbol that looks like a zero with a line 22 00:01:06,170 --> 00:01:08,420 through it. 23 00:01:08,420 --> 00:01:10,740 Let's look at an example to pin things down. 24 00:01:10,740 --> 00:01:13,820 Let's look at this set of four things. 25 00:01:13,820 --> 00:01:18,850 Namely, it's got two numbers, pi/2 and 7, a character 26 00:01:18,850 --> 00:01:25,076 string in quotes, "Albert R," and the Boolean value true. 27 00:01:25,076 --> 00:01:26,950 So those are the four different things in it. 28 00:01:26,950 --> 00:01:28,400 They're of mixed type. 29 00:01:28,400 --> 00:01:33,017 And you might not like to have a mixed type like this 30 00:01:33,017 --> 00:01:34,100 in a programming language. 31 00:01:34,100 --> 00:01:37,650 But mathematicians don't worry about such things very much, 32 00:01:37,650 --> 00:01:39,590 rarely. 33 00:01:39,590 --> 00:01:43,450 Anyway, the first observation is that the order 34 00:01:43,450 --> 00:01:45,820 in which these elements are listed doesn't matter. 35 00:01:45,820 --> 00:01:48,470 This set-- the braces indicates that it's 36 00:01:48,470 --> 00:01:50,580 the set of these things-- is the same if I 37 00:01:50,580 --> 00:01:55,370 listed T first, then the string, and the two numbers last. 38 00:01:55,370 --> 00:01:59,265 There is no notion of order in a set. 39 00:01:59,265 --> 00:02:02,194 Now, to a computer scientist this is a little unnatural. 40 00:02:02,194 --> 00:02:03,610 The most natural thing to be would 41 00:02:03,610 --> 00:02:05,490 be to define a sequence of things, 42 00:02:05,490 --> 00:02:07,764 like the sequence that began with 7, then 43 00:02:07,764 --> 00:02:09,680 had the character string, then had the number, 44 00:02:09,680 --> 00:02:12,220 then had the Boolean. 45 00:02:12,220 --> 00:02:15,940 And you could get by with working with lists of things 46 00:02:15,940 --> 00:02:18,280 as long as they're finite. 47 00:02:18,280 --> 00:02:21,500 But they very quickly get out of hand 48 00:02:21,500 --> 00:02:25,000 when you have to talk about, say, a set of lists. 49 00:02:25,000 --> 00:02:27,630 Then it's not clear how to make a list out of those, 50 00:02:27,630 --> 00:02:30,610 and you wind up making sets again. 51 00:02:30,610 --> 00:02:33,660 So sets, in fact, are an unavoidable kind of idea. 52 00:02:36,380 --> 00:02:39,030 So another basic thing to understand 53 00:02:39,030 --> 00:02:40,880 about the notion of a set is that an element 54 00:02:40,880 --> 00:02:43,360 is either in a set or not in a set. 55 00:02:43,360 --> 00:02:46,800 So if I write down 7, pi/2, 7, this 56 00:02:46,800 --> 00:02:50,432 is the same description of the same set 7, pi/2. 57 00:02:50,432 --> 00:02:52,390 I'm just telling you the same thing twice here. 58 00:02:52,390 --> 00:02:56,280 That 7 is in the set, and the 7 is in the set again. 59 00:02:56,280 --> 00:02:59,530 So no notion of being in the set more than once. 60 00:02:59,530 --> 00:03:01,810 Now, sometimes, technically you want 61 00:03:01,810 --> 00:03:03,880 to add a notion of so-called multisets 62 00:03:03,880 --> 00:03:05,740 in which elements can be in a set 63 00:03:05,740 --> 00:03:08,360 a certain number of times, an integer number of times. 64 00:03:08,360 --> 00:03:10,030 But there's no real need for that. 65 00:03:10,030 --> 00:03:11,610 It's a secondary idea. 66 00:03:11,610 --> 00:03:14,270 And from our point of view, you're in or out of a set. 67 00:03:14,270 --> 00:03:19,340 If you repeat elements, it's the same as mentioning them once. 68 00:03:19,340 --> 00:03:22,870 So the most fundamental feature of a set is what's in it. 69 00:03:22,870 --> 00:03:25,040 And for that, there's a special notation. 70 00:03:25,040 --> 00:03:28,580 So we'll say that x is a member of A, where A is a set, 71 00:03:28,580 --> 00:03:31,850 and use this epsilon symbol to indicate membership. 72 00:03:31,850 --> 00:03:36,350 It's read x is a member of A. So for example, pi/2 73 00:03:36,350 --> 00:03:40,730 is a member of that set that we saw before that had pi/2 in it. 74 00:03:40,730 --> 00:03:44,790 14/2 is also a member of that set because 14/2 is just 75 00:03:44,790 --> 00:03:46,440 another description of 7. 76 00:03:46,440 --> 00:03:49,810 When I write 7 here, I don't mean the character 7. 77 00:03:49,810 --> 00:03:50,960 I mean the number 7. 78 00:03:50,960 --> 00:03:54,350 And so 14/2 is the description of the same number. 79 00:03:54,350 --> 00:03:55,360 It's in that set. 80 00:03:55,360 --> 00:03:57,790 On the other hand, pi/3 is a number 81 00:03:57,790 --> 00:03:59,450 that's simply not in that set. 82 00:03:59,450 --> 00:04:03,360 So I'm using the epsilon with a vertical bar through it, 83 00:04:03,360 --> 00:04:09,180 or some kind of a line through it, to mean not a member of. 84 00:04:09,180 --> 00:04:11,770 Membership is so basic that there's a lot of different ways 85 00:04:11,770 --> 00:04:12,270 to say it. 86 00:04:12,270 --> 00:04:15,340 Besides using the membership symbol x is a member of A, 87 00:04:15,340 --> 00:04:18,290 you can sometimes say x is an element of A, 88 00:04:18,290 --> 00:04:22,640 or x is in A, as well as x is a member of A. 89 00:04:22,640 --> 00:04:25,010 They're all synonyms. 90 00:04:25,010 --> 00:04:28,450 So for example, 7 is a member of the integers. 91 00:04:28,450 --> 00:04:30,900 Z is our symbol for the integers. 92 00:04:30,900 --> 00:04:34,210 2/3 is not a member of the integers 93 00:04:34,210 --> 00:04:37,040 because it's a fraction that's not an integer. 94 00:04:37,040 --> 00:04:39,210 And on the other hand, the set Z of integers 95 00:04:39,210 --> 00:04:42,830 itself is a member of this three-element set consisting 96 00:04:42,830 --> 00:04:47,090 of the truth value T, the set of all integers, 97 00:04:47,090 --> 00:04:48,600 and the element 7. 98 00:04:48,600 --> 00:04:52,680 So here's an example where a set can contain sets, quite 99 00:04:52,680 --> 00:04:55,760 big ones even, and that's fine. 100 00:04:55,760 --> 00:04:59,480 That's not any problem mathematically. 101 00:04:59,480 --> 00:05:02,660 Related to membership is another fundamental notion of subset. 102 00:05:02,660 --> 00:05:06,050 So A is a subset of B, it's pronounced. 103 00:05:06,050 --> 00:05:10,120 So that horizontal U with a line under it 104 00:05:10,120 --> 00:05:13,520 is meant to resemble a less than or equal to symbol. 105 00:05:13,520 --> 00:05:16,400 So you can think of it as being A is less than or equal to B. 106 00:05:16,400 --> 00:05:17,880 But don't overload the symbols. 107 00:05:17,880 --> 00:05:21,520 Less than or equal to is used on numbers and other things 108 00:05:21,520 --> 00:05:22,810 that we know how to order. 109 00:05:22,810 --> 00:05:28,220 And this is a relation that's only allowed between sets. 110 00:05:28,220 --> 00:05:31,290 So A is a subset of B-- A synonym is that A is contained 111 00:05:31,290 --> 00:05:35,170 in B-- simply means that every element of A 112 00:05:35,170 --> 00:05:37,560 is also an element of B. If I wrote 113 00:05:37,560 --> 00:05:41,130 that out in predicate logic notation 114 00:05:41,130 --> 00:05:46,750 as a predicate formula, I'd say for every x, x is in A implies 115 00:05:46,750 --> 00:05:52,650 x is in B. If it's in A, then it's in B. Everything in A 116 00:05:52,650 --> 00:05:55,860 is in B. 117 00:05:55,860 --> 00:05:59,180 So some examples of the subset relation 118 00:05:59,180 --> 00:06:01,960 are that the integers are a kind of-- an integer 119 00:06:01,960 --> 00:06:03,560 is a special case of a real number. 120 00:06:03,560 --> 00:06:06,910 So the set of integers is a subset of the real numbers. 121 00:06:06,910 --> 00:06:09,740 A real number is a special case of a complex number, 122 00:06:09,740 --> 00:06:13,430 so the real numbers are a subset of the complex numbers. 123 00:06:13,430 --> 00:06:15,530 And here's a concrete example, where 124 00:06:15,530 --> 00:06:18,270 I have a set of three things, 5, 7, and 3, 125 00:06:18,270 --> 00:06:22,230 and this is the set with just the element 3 in it. 126 00:06:22,230 --> 00:06:25,360 Now, we sometimes are sloppy about distinguishing 127 00:06:25,360 --> 00:06:29,220 the element 3 from the set that's consisting of just 3 128 00:06:29,220 --> 00:06:30,240 as its only element. 129 00:06:30,240 --> 00:06:32,640 But in fact, it's a pretty important distinction 130 00:06:32,640 --> 00:06:34,060 to keep track of. 131 00:06:34,060 --> 00:06:38,990 In this case, 3 is not a subset of this set on the right. 132 00:06:38,990 --> 00:06:42,910 But the set consisting of 3 is a subset of the set on the right 133 00:06:42,910 --> 00:06:46,370 because, after all, the only member of this set is 3, 134 00:06:46,370 --> 00:06:50,250 and that is a member of this set. 135 00:06:50,250 --> 00:06:53,350 A consequence of this general definition is that every set is 136 00:06:53,350 --> 00:06:56,770 a subset of itself because everything in A is in A . 137 00:06:56,770 --> 00:06:58,680 That's not really very interesting. 138 00:06:58,680 --> 00:07:01,340 Another important general observation 139 00:07:01,340 --> 00:07:04,380 is that the empty set is a subset of everything. 140 00:07:04,380 --> 00:07:06,500 The empty set is a subset of every set. 141 00:07:06,500 --> 00:07:08,780 Let's look at why that is in more detail. 142 00:07:08,780 --> 00:07:13,120 So the claim is that the empty set is a subset of everything. 143 00:07:13,120 --> 00:07:15,480 Let B be any old set, then the empty set 144 00:07:15,480 --> 00:07:18,310 is a subset of B. What exactly does that mean according 145 00:07:18,310 --> 00:07:19,770 to the definition of subset? 146 00:07:19,770 --> 00:07:24,110 Well, it says that everything that's in the empty set, 147 00:07:24,110 --> 00:07:25,960 if it's in the empty set, then it 148 00:07:25,960 --> 00:07:28,970 implies that it's in B. For every element, 149 00:07:28,970 --> 00:07:32,430 if it's in the empty set, then it's in B. 150 00:07:32,430 --> 00:07:34,820 Well, what do we know about this? 151 00:07:34,820 --> 00:07:39,110 The assertion that x is in the empty set is false. 152 00:07:39,110 --> 00:07:42,720 No matter what x is, there's nothing in the empty set. 153 00:07:42,720 --> 00:07:44,790 And now I have an implication that 154 00:07:44,790 --> 00:07:49,110 implies where the left-hand side, the hypothesis, is false. 155 00:07:49,110 --> 00:07:51,490 That means that the whole implication is true, 156 00:07:51,490 --> 00:07:54,170 and it doesn't depend on what B is. 157 00:07:54,170 --> 00:07:55,590 I'm not even going to look at B. I 158 00:07:55,590 --> 00:07:58,340 can see that x is in empty set is false, 159 00:07:58,340 --> 00:08:01,320 so the whole implication is true. 160 00:08:01,320 --> 00:08:04,450 And so what I'm saying is that for every x, 161 00:08:04,450 --> 00:08:06,000 something that's true has to be true. 162 00:08:06,000 --> 00:08:06,920 Well, it is. 163 00:08:06,920 --> 00:08:10,280 And that's why the empty set is a subset of B 164 00:08:10,280 --> 00:08:13,280 satisfies this definition in a formal way. 165 00:08:13,280 --> 00:08:16,750 And this is an example of why that convention 166 00:08:16,750 --> 00:08:20,400 that false implies anything is convenient and is made 167 00:08:20,400 --> 00:08:21,097 use here. 168 00:08:24,190 --> 00:08:29,550 So when you're defining sets, if they're small, 169 00:08:29,550 --> 00:08:32,480 you can just list the elements, as we 170 00:08:32,480 --> 00:08:37,620 did with that set with 7 and pi/2 and "Albert R." 171 00:08:37,620 --> 00:08:40,682 Sometimes we can even describe infinite sets 172 00:08:40,682 --> 00:08:41,640 as some kind of a list. 173 00:08:41,640 --> 00:08:45,900 Like I might describe the set of integers as saying, 174 00:08:45,900 --> 00:08:49,700 well, it's 0, 1, minus 1, 2, minus 2, and so on, 175 00:08:49,700 --> 00:08:51,890 and you would understand that. 176 00:08:51,890 --> 00:08:55,510 But in general, if I'm describing 177 00:08:55,510 --> 00:08:57,710 a set that is not so easy to list, 178 00:08:57,710 --> 00:09:00,400 say the real numbers, then what I'm going to do 179 00:09:00,400 --> 00:09:04,250 is define a set by a defining property of in the set. 180 00:09:04,250 --> 00:09:08,880 So I'm interested in a property P of elements, 181 00:09:08,880 --> 00:09:10,720 and I'm going to look at the set of elements 182 00:09:10,720 --> 00:09:14,330 x that are in some set A such that P of x is true, 183 00:09:14,330 --> 00:09:16,630 and that's the notation we use. 184 00:09:16,630 --> 00:09:21,640 So this would be read as the set of x in A such that P of x 185 00:09:21,640 --> 00:09:25,270 holds, that x has property P. So notice this vertical bar is 186 00:09:25,270 --> 00:09:26,200 read as "such that." 187 00:09:26,200 --> 00:09:28,790 It's just a mathematical abbreviation. 188 00:09:28,790 --> 00:09:32,860 This is those elements in A that have property P that P of x 189 00:09:32,860 --> 00:09:36,279 holds for, and that defines a set of those elements. 190 00:09:36,279 --> 00:09:37,570 Let's look at a simple example. 191 00:09:37,570 --> 00:09:41,950 The set E of even integers is simply the set of numbers n 192 00:09:41,950 --> 00:09:45,020 that are integers such that n is even. 193 00:09:45,020 --> 00:09:49,500 So in this case, the property P of n means that n is even. 194 00:09:52,240 --> 00:09:55,895 One last concept is the concept of the power set. 195 00:09:55,895 --> 00:10:02,790 So the power set of a set A is all of the subsets of A. 196 00:10:02,790 --> 00:10:05,180 So we could define it using set notation 197 00:10:05,180 --> 00:10:08,960 as it's the set of B such that B is 198 00:10:08,960 --> 00:10:14,430 a subset of A. An example would be-- 199 00:10:14,430 --> 00:10:18,670 let's take the power set of the two Boolean values 200 00:10:18,670 --> 00:10:19,570 true and false. 201 00:10:19,570 --> 00:10:22,030 So the power set of true and false, 202 00:10:22,030 --> 00:10:23,870 of that set consisting of two elements, 203 00:10:23,870 --> 00:10:26,100 is-- well, what are some of its subsets? 204 00:10:26,100 --> 00:10:30,260 The set consisting of just true is a subset of true, false. 205 00:10:30,260 --> 00:10:31,850 So is the set consisting of false, 206 00:10:31,850 --> 00:10:33,380 and so is the whole thing. 207 00:10:33,380 --> 00:10:34,960 It's a subset of itself. 208 00:10:34,960 --> 00:10:37,810 And one final element, the empty set, 209 00:10:37,810 --> 00:10:43,970 is a subset of the set of Boolean values true and false. 210 00:10:43,970 --> 00:10:45,990 So the power set of this two-element set 211 00:10:45,990 --> 00:10:50,290 is a set that has four things in it-- two elements of size 1, 212 00:10:50,290 --> 00:10:53,480 one element of size 2, one element of size 0. 213 00:10:53,480 --> 00:10:55,860 And that's going to be a general phenomenon 214 00:10:55,860 --> 00:10:57,110 that we'll examine more later. 215 00:10:57,110 --> 00:10:59,350 How big is the power set of a set? 216 00:11:02,110 --> 00:11:06,750 The even numbers, E that we just defined on the previous slide, 217 00:11:06,750 --> 00:11:09,670 is a member of the power set of Z because it's 218 00:11:09,670 --> 00:11:11,260 a subset of integers. 219 00:11:11,260 --> 00:11:14,780 Even integers are a special case of integers. 220 00:11:14,780 --> 00:11:19,350 And the integers are a member of the power set of R. 221 00:11:19,350 --> 00:11:21,830 That's just a synonym for saying that integers 222 00:11:21,830 --> 00:11:24,140 are a subset of reals. 223 00:11:24,140 --> 00:11:26,364 Every integer is a real, so the integers 224 00:11:26,364 --> 00:11:27,780 are a subset of reals, which means 225 00:11:27,780 --> 00:11:31,080 they're a member of the power set of reals. 226 00:11:31,080 --> 00:11:34,250 So the general property is that a set B 227 00:11:34,250 --> 00:11:37,390 is a member of the power set of A if and only 228 00:11:37,390 --> 00:11:40,660 if B is a subset of A. That was the defining 229 00:11:40,660 --> 00:11:45,170 condition for power set. 230 00:11:45,170 --> 00:11:49,900 And that's a fact to remember, and it may potentially 231 00:11:49,900 --> 00:11:50,500 confuse you. 232 00:11:50,500 --> 00:11:53,720 But it's a good exercise in keeping 233 00:11:53,720 --> 00:11:56,750 track of the difference between is a member of 234 00:11:56,750 --> 00:11:59,300 and is a subset of.