1 00:00:00,040 --> 00:00:02,460 The following content is provided under a Creative 2 00:00:02,460 --> 00:00:03,870 Commons license. 3 00:00:03,870 --> 00:00:06,910 Your support will help MIT OpenCourseWare continue to 4 00:00:06,910 --> 00:00:10,560 offer high-quality educational resources for free. 5 00:00:10,560 --> 00:00:13,460 To make a donation or view additional materials from 6 00:00:13,460 --> 00:00:19,290 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:19,290 --> 00:00:20,540 ocw.mit.edu. 8 00:00:22,080 --> 00:00:27,070 PROFESSOR: I want to finish up our little excursion into 9 00:00:27,070 --> 00:00:31,450 searching and sorting, a very important topic in computing. 10 00:00:31,450 --> 00:00:35,060 How do you sort lists or databases? 11 00:00:35,060 --> 00:00:37,910 How do you search them? 12 00:00:37,910 --> 00:00:41,140 And I want to finish it up with hashing. 13 00:00:41,140 --> 00:00:46,670 Hashing is how dictionaries are implemented in Python. 14 00:00:46,670 --> 00:00:49,520 And this leads to a very efficient-- 15 00:00:49,520 --> 00:00:53,250 at least most of the time-- search. 16 00:00:53,250 --> 00:00:56,280 But it comes at the cost of space. 17 00:00:56,280 --> 00:01:01,970 So this is an example where one can trade space for time. 18 00:01:01,970 --> 00:01:03,480 So I want to start with a simple 19 00:01:03,480 --> 00:01:06,260 example to explain hashing. 20 00:01:06,260 --> 00:01:09,800 And I'm going to begin by assuming that what we're 21 00:01:09,800 --> 00:01:15,480 hashing is a set of integers. 22 00:01:15,480 --> 00:01:19,650 So we want to build a set of integers and detect whether or 23 00:01:19,650 --> 00:01:23,290 not a particular integer is in that set. 24 00:01:23,290 --> 00:01:28,360 And we want to do that quickly. 25 00:01:28,360 --> 00:01:37,610 So the basic idea is we're going to take an integer-- 26 00:01:37,610 --> 00:01:39,880 call it i-- 27 00:01:39,880 --> 00:01:41,560 and we're going to hash it. 
28 00:01:41,560 --> 00:01:44,870 I'll tell you what that means in a minute. 29 00:01:44,870 --> 00:01:51,420 And what a hash function does is it converts i to a 30 00:01:51,420 --> 00:01:57,090 different integer, perhaps, in some range. 31 00:01:57,090 --> 00:02:09,729 So it, say, converts i to some integer in the range 0 to k, 32 00:02:09,729 --> 00:02:13,540 for some constant k. 33 00:02:13,540 --> 00:02:21,660 We're going to then use this integer to index 34 00:02:21,660 --> 00:02:31,320 into a list of lists. 35 00:02:31,320 --> 00:02:36,580 So each of these is called a bucket. 36 00:02:36,580 --> 00:02:38,550 And a bucket will itself be a list. 37 00:02:41,740 --> 00:02:43,560 So this together is called a bucket. 38 00:02:49,880 --> 00:02:54,640 We've already seen that we can find the i-th element of a 39 00:02:54,640 --> 00:02:57,470 list in constant time. 40 00:03:02,300 --> 00:03:07,910 So when we ask whether some integer is in this set, we'll hash 41 00:03:07,910 --> 00:03:14,870 it, we'll go immediately to the correct bucket, the bucket 42 00:03:14,870 --> 00:03:17,610 associated with that integer. 43 00:03:17,610 --> 00:03:21,430 And then we'll search the list at that bucket to see if the 44 00:03:21,430 --> 00:03:24,200 integer is there. 45 00:03:24,200 --> 00:03:28,130 If this list is short enough, it will be very efficient. 46 00:03:28,130 --> 00:03:30,960 All right, I realize that's very abstract. 47 00:03:30,960 --> 00:03:32,970 But let's look at the code, which will make 48 00:03:32,970 --> 00:03:34,660 it much less abstract. 49 00:03:47,330 --> 00:03:50,320 So the code starts with something very ugly that I'll 50 00:03:50,320 --> 00:03:52,090 apologize for. 51 00:03:52,090 --> 00:03:55,030 But very soon, we'll see how to get rid of that. 52 00:03:55,030 --> 00:03:59,530 I'm using a global variable here to say how many buckets 53 00:03:59,530 --> 00:04:01,220 there are going to be. 54 00:04:01,220 --> 00:04:03,690 And I've very arbitrarily chosen 47.
55 00:04:07,200 --> 00:04:11,220 I then have a function called create, which uses this global 56 00:04:11,220 --> 00:04:17,620 variable and creates a list of lists, each 57 00:04:17,620 --> 00:04:21,130 element of which is empty. 58 00:04:21,130 --> 00:04:24,595 Because initially, we have no elements in the set. 59 00:04:28,280 --> 00:04:33,680 When I want to insert an element in the set, I'll call 60 00:04:33,680 --> 00:04:41,215 the function insert, which will hash the element. 61 00:04:41,215 --> 00:04:44,340 It doesn't actually even need to use the global numBuckets 62 00:04:44,340 --> 00:04:47,640 in this case, in fact. 63 00:04:47,640 --> 00:04:52,190 And then append it to the correct list. 64 00:04:52,190 --> 00:04:56,950 So it calls this function hashElem, which could hardly 65 00:04:56,950 --> 00:04:58,770 be simpler. 66 00:04:58,770 --> 00:05:03,780 It just takes the remainder, the modulus of the element and 67 00:05:03,780 --> 00:05:05,410 the number of buckets. 68 00:05:05,410 --> 00:05:09,490 So that will give me a value between 0 and 69 00:05:09,490 --> 00:05:11,220 numBuckets minus 1. 70 00:05:13,806 --> 00:05:15,940 It gives me one of the lists, and I'll just 71 00:05:15,940 --> 00:05:17,190 insert it at the end. 72 00:05:20,610 --> 00:05:25,300 When I want to check for membership, you'll see it's 73 00:05:25,300 --> 00:05:27,130 quite simple. 74 00:05:27,130 --> 00:05:35,760 All I do is ask the question, is i in the list associated 75 00:05:35,760 --> 00:05:37,010 with the correct bucket? 76 00:05:41,250 --> 00:05:44,930 Remove is a little bit more complicated, but in fact, we 77 00:05:44,930 --> 00:05:47,890 don't need to spend much time looking at it now. 78 00:05:47,890 --> 00:05:49,410 It's just some code. 79 00:05:49,410 --> 00:05:53,290 And the only reason it's complicated is Insert doesn't 80 00:05:53,290 --> 00:05:55,470 look whether or not the element is already there.
81 00:05:55,470 --> 00:05:58,530 So it may occur multiple times in the list. 82 00:05:58,530 --> 00:06:00,740 So I would have to remove each one of them. 83 00:06:05,580 --> 00:06:08,230 People see the basic structure of this? 84 00:06:14,240 --> 00:06:17,000 Why a list of lists? 85 00:06:17,000 --> 00:06:23,780 Why don't I just, say, have a list of Booleans, where I hash 86 00:06:23,780 --> 00:06:27,700 the integer, and it's true or false, whether or not I've 87 00:06:27,700 --> 00:06:30,660 seen it, depending upon the value of that bucket? 88 00:06:30,660 --> 00:06:31,910 Why can't I do that? 89 00:06:37,690 --> 00:06:38,940 Somebody? 90 00:06:40,940 --> 00:06:44,430 Well, what's the property of this hash function? 91 00:06:44,430 --> 00:06:46,505 The key issue here is the hash function -- 92 00:06:53,950 --> 00:06:55,940 is many-to-one. 93 00:07:00,850 --> 00:07:05,490 That is to say, an infinite number of different integers 94 00:07:05,490 --> 00:07:08,690 will hash to the same value. 95 00:07:08,690 --> 00:07:10,470 Because after all, I have a set in which I 96 00:07:10,470 --> 00:07:14,170 can store any integer-- 97 00:07:14,170 --> 00:07:16,920 or any positive integer, at least-- 98 00:07:16,920 --> 00:07:21,660 and there are only 47 buckets. 99 00:07:21,660 --> 00:07:25,790 So it's pretty obvious that many integers will hash to the 100 00:07:25,790 --> 00:07:27,040 same bucket. 101 00:07:29,490 --> 00:07:35,930 When two different elements hash to the same bucket, we 102 00:07:35,930 --> 00:07:37,330 have what is called a collision. 103 00:07:44,890 --> 00:07:49,350 There are lots of different ways to handle collisions. 104 00:07:49,350 --> 00:07:53,170 What I've shown you here is probably the simplest way, 105 00:07:53,170 --> 00:07:55,375 which is called linear rehashing. 106 00:07:58,550 --> 00:07:59,815 I'm not actually rehashing. 107 00:08:02,680 --> 00:08:03,930 I'm just keeping a list. 108 00:08:08,026 --> 00:08:11,570 Does that make sense?
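The set-of-integers code described above can be sketched roughly as follows. This is a minimal reconstruction from the spoken description, not the code shown in the lecture: the snake_case names (create, hash_elem, insert, member, remove) and the slice-assignment style of remove are assumptions, and it is written for Python 3. Keeping a list per bucket, as done here, is commonly called chaining.

```python
NUM_BUCKETS = 47  # the arbitrary bucket count mentioned in the lecture

def create():
    """Return an empty set: a list of NUM_BUCKETS empty buckets."""
    return [[] for _ in range(NUM_BUCKETS)]

def hash_elem(e):
    """Map an integer to a bucket index between 0 and NUM_BUCKETS - 1."""
    return e % NUM_BUCKETS

def insert(h_set, i):
    """Append i to its bucket; no check is made for duplicates."""
    h_set[hash_elem(i)].append(i)

def member(h_set, i):
    """Search only the single bucket that i hashes to."""
    return i in h_set[hash_elem(i)]

def remove(h_set, i):
    """Remove every copy of i, since insert may have stored it several times."""
    bucket = h_set[hash_elem(i)]
    bucket[:] = [x for x in bucket if x != i]

s = create()
insert(s, 34)
insert(s, 81)          # 81 % 47 is also 34, so this collides with 34
print(member(s, 81))   # both values live in the same bucket
```

A list of Booleans could not distinguish 34 from 81 here, which is why each bucket has to hold the actual elements.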
109 00:08:11,570 --> 00:08:12,780 Yes, thank you for-- 110 00:08:12,780 --> 00:08:14,390 I'm glad somebody has a question. 111 00:08:18,800 --> 00:08:21,700 You have to ask loudly. 112 00:08:21,700 --> 00:08:23,300 AUDIENCE: When you take the modulus 47, 113 00:08:23,300 --> 00:08:25,120 what does that return? 114 00:08:27,940 --> 00:08:28,926 PROFESSOR: 0. 115 00:08:28,926 --> 00:08:33,210 AUDIENCE: So hashElem always returns 0? 116 00:08:33,210 --> 00:08:34,190 PROFESSOR: Well, no, sorry. 117 00:08:34,190 --> 00:08:37,150 It depends what I'm hashing. 118 00:08:37,150 --> 00:08:38,865 Sorry, I thought if you were saying that if I 119 00:08:38,865 --> 00:08:41,730 asked 47 mod 47. 120 00:08:41,730 --> 00:08:47,395 If I take 48 mod 47, I get 1. 121 00:08:47,395 --> 00:08:49,790 If I take 49 mod 47-- 122 00:08:49,790 --> 00:08:52,000 it's the remainder. 123 00:08:52,000 --> 00:08:53,675 Maybe I should have called it just the remainder. 124 00:08:56,540 --> 00:08:57,000 Look up. 125 00:08:57,000 --> 00:08:59,910 I'm about to throw something at you. 126 00:08:59,910 --> 00:09:01,730 Ooh, I threw a curve ball. 127 00:09:09,830 --> 00:09:10,035 OK. 128 00:09:10,035 --> 00:09:12,905 What's the complexity here of the membership test? 129 00:09:20,970 --> 00:09:23,380 Kind of hard to analyze. 130 00:09:23,380 --> 00:09:31,380 Roughly speaking, or exactly, it will be the length of the 131 00:09:31,380 --> 00:09:35,640 bucket, the size of the bucket. 132 00:09:35,640 --> 00:09:37,710 Now, I don't know how many elements 133 00:09:37,710 --> 00:09:39,170 will be in the bucket. 134 00:09:39,170 --> 00:09:40,650 But what will this depend on? 135 00:09:44,250 --> 00:09:47,910 It will depend upon the number of buckets. 136 00:09:47,910 --> 00:09:52,250 If I have a million buckets, I'll get a lot fewer 137 00:09:52,250 --> 00:09:57,280 collisions than if I have two buckets. 138 00:09:57,280 --> 00:10:00,500 So let's look at an example here. 
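To make the dependence on the number of buckets concrete, here is a small sketch that counts how many values land in each bucket under the remainder hash. The helper name bucket_sizes and the choice of the integers 0 through 19 as test data are assumptions made for illustration only.

```python
def bucket_sizes(values, num_buckets):
    """Count how many values fall into each bucket under v % num_buckets."""
    sizes = [0] * num_buckets
    for v in values:
        sizes[v % num_buckets] += 1
    return sizes

values = range(20)
# The longest bucket bounds the cost of a membership test.
print(max(bucket_sizes(values, 47)))  # many buckets: short lists
print(max(bucket_sizes(values, 3)))   # few buckets: long lists
```

With 47 buckets every one of these twenty values gets its own bucket; with 3 buckets the longest list holds seven of them.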
139 00:10:00,500 --> 00:10:02,490 There's a small program called test. 140 00:10:11,700 --> 00:10:15,660 I set numBuckets to 47 in this case. 141 00:10:15,660 --> 00:10:19,820 And then I'm going to create it, create a set. 142 00:10:19,820 --> 00:10:25,220 And then I'm going to put a bunch of integers in it, then 143 00:10:25,220 --> 00:10:28,260 a few more, just for fun, including one very big 144 00:10:28,260 --> 00:10:31,180 integer, just to show that it works. 145 00:10:31,180 --> 00:10:34,800 Then I'm going to show you what the set looks like. 146 00:10:34,800 --> 00:10:38,430 And in fact, what we'll do is we'll stop it here and see 147 00:10:38,430 --> 00:10:39,680 what we get. 148 00:10:52,960 --> 00:10:56,950 So what we'll see here is, as you would expect with that 149 00:10:56,950 --> 00:11:00,650 number of buckets, each of the small numbers hashes to a 150 00:11:00,650 --> 00:11:02,510 separate thing. 151 00:11:02,510 --> 00:11:04,170 That's just the way remainder works. 152 00:11:06,920 --> 00:11:08,470 Not surprisingly-- 153 00:11:08,470 --> 00:11:11,200 in fact, it would be disappointing if 325 didn't 154 00:11:11,200 --> 00:11:15,200 have the same value both times I inserted it. 155 00:11:15,200 --> 00:11:17,980 So we see we happen to have one bucket that's got two 156 00:11:17,980 --> 00:11:19,140 elements in it. 157 00:11:19,140 --> 00:11:20,950 Happen to be the same. 158 00:11:20,950 --> 00:11:24,730 But this very big number happened to hash to the same 159 00:11:24,730 --> 00:11:27,500 value of 30 as 34. 160 00:11:27,500 --> 00:11:30,260 So here we have two elements in it. 161 00:11:33,190 --> 00:11:39,910 A good hash function has the property that it will widely 162 00:11:39,910 --> 00:11:43,700 disperse the values you hash.
163 00:11:43,700 --> 00:11:49,740 So they end up in different buckets, rather than some 164 00:11:49,740 --> 00:11:52,770 stupid hash function that tends to put everything in the 165 00:11:52,770 --> 00:11:55,830 same bucket. 166 00:11:55,830 --> 00:12:03,970 Now, let's see what happens if I change the number of buckets 167 00:12:03,970 --> 00:12:05,460 to, say, 3. 168 00:12:17,240 --> 00:12:26,770 Well, not surprisingly, we get some very big buckets, because 169 00:12:26,770 --> 00:12:30,760 there are relatively few choices. 170 00:12:30,760 --> 00:12:34,260 So what we see here is we have a genuine trade-off between 171 00:12:34,260 --> 00:12:35,510 time and space. 172 00:12:38,000 --> 00:12:43,550 If the number of buckets is large relative to the number 173 00:12:43,550 --> 00:12:50,980 of elements that we insert in the table, then looking at 174 00:12:50,980 --> 00:12:55,540 whether or not an element is in it is roughly order one. 175 00:12:55,540 --> 00:12:59,950 Because these lists will be very short. 176 00:12:59,950 --> 00:13:04,290 So we can actually look up something in constant time if 177 00:13:04,290 --> 00:13:06,760 we dedicate enough space to the hash table. 178 00:13:09,600 --> 00:13:13,310 If the hash table is very small-- 179 00:13:13,310 --> 00:13:17,190 the reductio ad absurdum case of one bucket-- 180 00:13:17,190 --> 00:13:20,610 then it's order n. 181 00:13:20,610 --> 00:13:21,770 It's not constant time. 182 00:13:21,770 --> 00:13:23,040 It's linear in the number of elements. 183 00:13:26,040 --> 00:13:30,090 Typically, when people use hash tables, they make the 184 00:13:30,090 --> 00:13:35,240 hash tables big enough that for all intents and purposes, 185 00:13:35,240 --> 00:13:38,330 you can assume that looking something up is 186 00:13:38,330 --> 00:13:42,370 constant time, order 1. 187 00:13:42,370 --> 00:13:48,060 And that, in fact, is what Python does with dictionaries.
188 00:13:48,060 --> 00:13:55,640 It hashes the keys and chooses a big enough table so that 189 00:13:55,640 --> 00:13:58,890 looking up whether or not something is in a dictionary 190 00:13:58,890 --> 00:14:02,600 can be done in constant time. 191 00:14:02,600 --> 00:14:07,280 If it then notices that the table is too small, because 192 00:14:07,280 --> 00:14:11,390 you've ended up putting a lot of elements in it, it just 193 00:14:11,390 --> 00:14:12,990 re-does it and gets a bigger table. 194 00:14:17,660 --> 00:14:21,430 So hashing is an extremely powerful technique. 195 00:14:21,430 --> 00:14:27,420 And it's used all the time for quite complicated things. 196 00:14:27,420 --> 00:14:31,430 Now is it useful here only when we want to store ints? 197 00:14:31,430 --> 00:14:32,270 No. 198 00:14:32,270 --> 00:14:34,810 It would be kind of bad if that were the case. 199 00:14:34,810 --> 00:14:40,225 In fact, any kind of immutable object can be hashed. 200 00:14:43,910 --> 00:14:48,570 Now, you may have wondered why the keys in 201 00:14:48,570 --> 00:14:50,990 dicts have to be immutable. 202 00:14:50,990 --> 00:14:55,070 And that's so that they can be hashed. 203 00:14:55,070 --> 00:14:57,120 Why does it have to be immutable? 204 00:14:57,120 --> 00:15:00,970 Well, imagine that you used a list as a key. 205 00:15:00,970 --> 00:15:04,040 You'd hash it when you put the list in the hash table. 206 00:15:04,040 --> 00:15:06,460 But then you might mutate it, and the next time you hashed 207 00:15:06,460 --> 00:15:09,490 it, you'd get a different value, and so you wouldn't be 208 00:15:09,490 --> 00:15:12,110 able to find it again. 209 00:15:12,110 --> 00:15:15,410 So you need it to be a kind of object, where every time you 210 00:15:15,410 --> 00:15:18,385 apply the hash function, you get the same value. 211 00:15:28,370 --> 00:15:30,260 I don't need to show you that this works. 212 00:15:30,260 --> 00:15:33,210 You'll just believe me, I'm sure. 
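The point about immutable keys can be checked directly. The sketch below is an illustration only: the helper name can_be_key is an assumption, and it relies on the try/except mechanism that is covered later in this lecture. Python refuses to hash a list and raises a TypeError, while a tuple with the same contents is accepted.

```python
def can_be_key(obj):
    """Return True if obj can be hashed, and so used as a dict key."""
    try:
        hash(obj)
        return True
    except TypeError:  # raised for unhashable objects such as lists
        return False

print(can_be_key((1, 2)))  # tuple: immutable
print(can_be_key([1, 2]))  # list: mutable
```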
213 00:15:33,210 --> 00:15:36,600 Let's look at a slightly more complicated hash function. 214 00:15:39,960 --> 00:15:43,220 Here, I want to hash something that could be 215 00:15:43,220 --> 00:15:46,220 either an int or a string. 216 00:15:46,220 --> 00:15:48,480 So I first check and see if it's an int. 217 00:15:48,480 --> 00:15:53,300 If so, I'll set the value to be e. 218 00:15:53,300 --> 00:15:59,040 Then down at the bottom, I'll do the modulus operator again. 219 00:15:59,040 --> 00:16:02,340 But if it's a string, I'm going to first 220 00:16:02,340 --> 00:16:06,150 convert e to an int. 221 00:16:06,150 --> 00:16:09,020 And this is basically the trick that people typically 222 00:16:09,020 --> 00:16:11,940 use when they're doing hashing, is they convert 223 00:16:11,940 --> 00:16:13,915 whatever thing they have, to some integer. 224 00:16:16,610 --> 00:16:19,520 People do this with all sorts of things. 225 00:16:19,520 --> 00:16:24,510 For example, this is the way airport security systems today 226 00:16:24,510 --> 00:16:27,250 do face recognition. 227 00:16:27,250 --> 00:16:31,160 They hash every picture of a face to an integer, and then 228 00:16:31,160 --> 00:16:34,010 look it up. 229 00:16:34,010 --> 00:16:36,420 So we can see it here. 230 00:16:36,420 --> 00:16:38,160 The way I'm doing it-- and the details 231 00:16:38,160 --> 00:16:40,000 don't very much matter-- 232 00:16:40,000 --> 00:16:45,350 is I'm going to do it a character at a time, do a 233 00:16:45,350 --> 00:16:50,540 shift ord of C, takes the ASCII, the internal 234 00:16:50,540 --> 00:16:55,950 representation bits of each character. 235 00:16:55,950 --> 00:16:58,070 And again, I don't care if you understand how 236 00:16:58,070 --> 00:16:59,860 the code works here. 237 00:16:59,860 --> 00:17:03,580 What I just want you to see is that it's not very long. 
238 00:17:03,580 --> 00:17:07,630 And in fact, it's typically fairly simple to hash almost 239 00:17:07,630 --> 00:17:09,880 any kind of an object. 240 00:17:09,880 --> 00:17:12,920 Because deep down, we know that every object is 241 00:17:12,920 --> 00:17:17,380 represented by some string of bits in the computer's memory. 242 00:17:17,380 --> 00:17:21,450 And we can always convert a string of bits to an integer. 243 00:17:25,750 --> 00:17:30,440 All right, so that's hashing, a very powerful and extremely 244 00:17:30,440 --> 00:17:31,690 useful technique. 245 00:17:35,370 --> 00:17:38,250 Yeah? 246 00:17:38,250 --> 00:17:39,210 AUDIENCE: [INAUDIBLE] 247 00:17:39,210 --> 00:17:43,221 you're returning something, will you always find the 248 00:17:43,221 --> 00:17:45,656 remainder, or can you use [INAUDIBLE]? 249 00:17:45,656 --> 00:17:47,510 PROFESSOR: There are many different ways of doing hash. 250 00:17:47,510 --> 00:17:54,770 All the hash function has to do is convert its argument 251 00:17:54,770 --> 00:17:58,240 into an integer in some way or another within a fixed range. 252 00:18:01,420 --> 00:18:04,180 It happens to be that remainder is a very simple way 253 00:18:04,180 --> 00:18:07,960 to do that and is often used. 254 00:18:07,960 --> 00:18:11,650 There's a whole theory about hash functions. 255 00:18:11,650 --> 00:18:15,730 You can read about them in gory detail on Wikipedia. 256 00:18:15,730 --> 00:18:18,720 The math can actually be quite complicated, and there's no 257 00:18:18,720 --> 00:18:20,730 real need to understand it. 258 00:18:20,730 --> 00:18:23,780 I typically do something like dividing by a prime number, 259 00:18:23,780 --> 00:18:26,980 which is known to have good properties. 260 00:18:26,980 --> 00:18:30,000 But, again, you can get carried away with this. 261 00:18:30,000 --> 00:18:34,310 And it's usually not worth the trouble, unless you're very 262 00:18:34,310 --> 00:18:37,550 deeply involved in something.
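A hash function that accepts either an int or a string might look something like the sketch below. It follows the spoken outline (convert to an integer, then take the remainder), but the details are assumptions: the exact way the lecture code combines the shift with ord(c) is not recoverable from the transcript, so here each character code is simply weighted by its position, and the bucket count is passed as a parameter rather than read from a global.

```python
def hash_elem(e, num_buckets=47):
    """Hash an int or a str to a bucket index in 0..num_buckets - 1."""
    if isinstance(e, int):
        val = e
    elif isinstance(e, str):
        # Convert the string to an int one character at a time,
        # weighting each character code by its 1-based position.
        val = 0
        for shift, c in enumerate(e, start=1):
            val += shift * ord(c)
    else:
        raise TypeError("hash_elem expects an int or a str")
    return val % num_buckets

print(hash_elem(48))    # an int is hashed by remainder alone
print(hash_elem("ab"))  # a string is first turned into an int
```

Weighting by position means "ab" and "ba" generally hash differently, which a plain sum of character codes would not achieve.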
263 00:18:37,550 --> 00:18:39,360 OK? 264 00:18:39,360 --> 00:18:41,830 And then I've got another program that does this, but 265 00:18:41,830 --> 00:18:43,750 again, I don't think much would be served by 266 00:18:43,750 --> 00:18:45,380 running it for you. 267 00:18:45,380 --> 00:18:47,100 Any questions about hashing? 268 00:18:50,970 --> 00:18:51,380 Yes? 269 00:18:51,380 --> 00:18:54,132 AUDIENCE: So does this only work because you said 270 00:18:54,132 --> 00:18:54,615 [INAUDIBLE] 271 00:18:54,615 --> 00:18:57,513 Python treats lists in a certain way? 272 00:18:57,513 --> 00:19:01,860 You said other languages, you can't have this constant time 273 00:19:01,860 --> 00:19:02,826 list search. 274 00:19:02,826 --> 00:19:04,290 So how would hashing work? 275 00:19:04,290 --> 00:19:08,410 PROFESSOR: So the question is that, because Python gives us 276 00:19:08,410 --> 00:19:11,781 constant-time looking up for lists is the key to making 277 00:19:11,781 --> 00:19:13,710 this work -- 278 00:19:13,710 --> 00:19:17,180 every programming language I know has some 279 00:19:17,180 --> 00:19:19,680 concept that's similar. 280 00:19:19,680 --> 00:19:23,720 In, say, C, it's not a list, but it's an array. 281 00:19:23,720 --> 00:19:26,620 And so in C, you would use an array for this purpose. 282 00:19:26,620 --> 00:19:30,570 In Java, you would use arrays for this purpose. 283 00:19:30,570 --> 00:19:34,730 But yes, you can do this in any programming language, 284 00:19:34,730 --> 00:19:38,320 because every reasonable programming language has some 285 00:19:38,320 --> 00:19:44,380 way to create the equivalent of a list, in which you can 286 00:19:44,380 --> 00:19:48,180 get a particular index in constant time. 287 00:19:48,180 --> 00:19:50,440 So it's a universally useful technique. 288 00:19:53,000 --> 00:19:53,830 Good question. 289 00:19:53,830 --> 00:19:56,010 Anything else?
290 00:19:56,010 --> 00:20:01,160 If not, we're about to abandon searching and sorting, and in 291 00:20:01,160 --> 00:20:04,395 fact, abandon algorithms in general for a while. 292 00:20:07,650 --> 00:20:11,230 Bounced right off his hands. 293 00:20:11,230 --> 00:20:13,110 I will not sign you to a baseball contract. 294 00:20:16,930 --> 00:20:19,460 Pardon? 295 00:20:19,460 --> 00:20:21,430 Yes, I threw him a Butterfinger. 296 00:20:21,430 --> 00:20:23,790 Oh, it's terrible. 297 00:20:23,790 --> 00:20:25,330 Boy, your jokes are worse than mine. 298 00:20:31,490 --> 00:20:36,300 I now want to move on to the last-- 299 00:20:36,300 --> 00:20:39,540 go back to Python, away from algorithms, away from computer 300 00:20:39,540 --> 00:20:43,210 science in general, and talk about the last three major 301 00:20:43,210 --> 00:20:48,240 linguistic concepts in Python, exceptions, classes, and then 302 00:20:48,240 --> 00:20:49,490 later iterators. 303 00:20:51,600 --> 00:20:54,310 Let's start with exceptions, because we've already seen 304 00:20:54,310 --> 00:20:58,340 them, and they're pretty simple. 305 00:20:58,340 --> 00:21:02,250 Exceptions are everywhere in Python. 306 00:21:02,250 --> 00:21:04,070 And we've certainly seen plenty of them 307 00:21:04,070 --> 00:21:05,320 all semester already. 308 00:21:11,960 --> 00:21:22,970 We get them if I set some list, test, to, say, [1,2], 309 00:21:22,970 --> 00:21:30,540 and then I ask for test[12], I get an index error. 310 00:21:30,540 --> 00:21:31,790 This is an exception. 311 00:21:34,180 --> 00:21:37,150 We've got exceptions when we've tried to convert things 312 00:21:37,150 --> 00:21:39,360 to incorrect types. 313 00:21:39,360 --> 00:21:44,820 So if I do int(test), I'll get a type error. 314 00:21:44,820 --> 00:21:49,360 Anything that ends in the word "error" is a built-in kind of 315 00:21:49,360 --> 00:21:51,630 exception in Python.
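The two failures just mentioned can be reproduced with a short sketch. The helper name_of_exception is hypothetical, introduced only so the example can report the exception instead of stopping; it uses the try/except form that the lecture explains a little later.

```python
test = [1, 2]

def name_of_exception(action):
    """Run action() and return the name of the exception it raises, or None."""
    try:
        action()
    except Exception as e:
        return type(e).__name__
    return None

print(name_of_exception(lambda: test[12]))   # index out of range
print(name_of_exception(lambda: int(test)))  # a list cannot become an int
```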
316 00:21:51,630 --> 00:21:54,440 We've got them when we accessed nonexistent 317 00:21:54,440 --> 00:21:57,430 variables, a name error. 318 00:21:57,430 --> 00:22:00,610 So there are a whole bunch of these things. 319 00:22:00,610 --> 00:22:02,490 In each of these cases-- 320 00:22:02,490 --> 00:22:06,370 here, I'm just kind of playing around in Python. 321 00:22:06,370 --> 00:22:10,930 And it printed an error message, and it stopped. 322 00:22:10,930 --> 00:22:18,500 These kinds of exceptions are called unhandled exceptions, 323 00:22:18,500 --> 00:22:21,480 where they cause the program to effectively crash. 324 00:22:44,310 --> 00:22:45,950 The program just stops running. 325 00:22:48,740 --> 00:22:53,650 And I suspect that some of you have written programs in which 326 00:22:53,650 --> 00:22:55,680 this has happened. 327 00:22:55,680 --> 00:22:59,250 In fact, is there anybody here who has not written a program 328 00:22:59,250 --> 00:23:01,475 that crashed because of an unhandled exception? 329 00:23:04,600 --> 00:23:06,510 Good. 330 00:23:06,510 --> 00:23:09,730 For those of you watching on TV, no hands went up. 331 00:23:12,310 --> 00:23:14,900 Almost every day, I write a program that crashes because 332 00:23:14,900 --> 00:23:17,330 of an unhandled exception. 333 00:23:17,330 --> 00:23:22,140 On the other hand, once my program is debugged-- 334 00:23:22,140 --> 00:23:23,750 once your program is debugged-- this 335 00:23:23,750 --> 00:23:26,660 should never happen. 336 00:23:26,660 --> 00:23:30,090 Because there are mechanisms in Python for handling 337 00:23:30,090 --> 00:23:32,610 exceptions. 338 00:23:32,610 --> 00:23:37,700 And in fact, as we'll see, it's a perfectly valid flow of 339 00:23:37,700 --> 00:23:39,740 control concept. 340 00:23:39,740 --> 00:23:42,640 You will sometimes write programs that are intended to 341 00:23:42,640 --> 00:23:44,760 raise exceptions.
342 00:23:44,760 --> 00:23:48,040 And then you'll catch the exception and do something 343 00:23:48,040 --> 00:23:49,290 useful with it. 344 00:23:51,680 --> 00:23:55,340 The way we use exceptions is in something called a 345 00:23:55,340 --> 00:23:59,080 try-except block. 346 00:23:59,080 --> 00:24:07,080 So you write the word "try," and then you have some code, 347 00:24:07,080 --> 00:24:12,325 and then "except," and then some more code. 348 00:24:17,740 --> 00:24:24,060 What the interpreter does is it starts executing this code. 349 00:24:24,060 --> 00:24:27,470 If it gets through this code without raising any kind of an 350 00:24:27,470 --> 00:24:30,290 exception, it jumps to the code 351 00:24:30,290 --> 00:24:32,220 following the except block. 352 00:24:35,330 --> 00:24:40,660 On the other hand, if an exception gets raised here, it 353 00:24:40,660 --> 00:24:45,020 immediately stops executing this code and jumps to the 354 00:24:45,020 --> 00:24:49,940 start of this code, the code associated with the except, 355 00:24:49,940 --> 00:24:52,460 and executes that. 356 00:24:52,460 --> 00:24:55,600 And, then when it finishes this, it again goes to the 357 00:24:55,600 --> 00:24:56,870 code following the except. 358 00:24:59,430 --> 00:25:04,760 Just like an if, you can nest these things. 359 00:25:04,760 --> 00:25:08,860 It's just a control concept. 360 00:25:08,860 --> 00:25:11,350 It's nothing more. 361 00:25:11,350 --> 00:25:12,600 Let's look at an example. 362 00:25:31,940 --> 00:25:33,190 So here's readVal. 363 00:25:36,320 --> 00:25:38,080 This is a function. 364 00:25:38,080 --> 00:25:41,020 It's a polymorphic function. 365 00:25:41,020 --> 00:25:45,550 Its first argument is of type 'type'. 
366 00:25:45,550 --> 00:25:46,230 Remember-- 367 00:25:46,230 --> 00:25:50,760 and you'll hear me say this 1,000 more times at least-- 368 00:25:50,760 --> 00:25:54,950 in Python, everything is an object and can be manipulated 369 00:25:54,950 --> 00:25:57,200 by the program. 370 00:25:57,200 --> 00:25:59,590 It's one of the beauties of Python. 371 00:25:59,590 --> 00:26:03,860 It's something that's not true in many programming languages. 372 00:26:03,860 --> 00:26:10,220 So types are objects, just like ints or floats. 373 00:26:10,220 --> 00:26:14,740 So the first argument to readVal is a type. 374 00:26:14,740 --> 00:26:17,740 And then there are two strings, the request message 375 00:26:17,740 --> 00:26:20,890 and the error message. 376 00:26:20,890 --> 00:26:26,730 It then sets a local variable, numTries, to 0. 377 00:26:26,730 --> 00:26:32,170 And while numTries is less than 4, it sets val equal to 378 00:26:32,170 --> 00:26:35,510 raw_input, printing the request message. 379 00:26:38,830 --> 00:26:45,410 And then it tries to convert val to valType, whatever that 380 00:26:45,410 --> 00:26:47,480 happens to be. 381 00:26:47,480 --> 00:26:51,950 And if it succeeds, it returns it. 382 00:26:51,950 --> 00:26:56,680 On the other hand, if during this attempt to convert it an 383 00:26:56,680 --> 00:26:58,470 exception is raised-- 384 00:26:58,470 --> 00:27:03,720 for example, the user is asked to input an integer, and they 385 00:27:03,720 --> 00:27:07,760 type in the letter b, which can't be converted to an int-- 386 00:27:07,760 --> 00:27:14,130 then a type error exception will be raised, or a value 387 00:27:14,130 --> 00:27:15,590 error exception. 388 00:27:15,590 --> 00:27:19,810 And if a value error exception is raised, it prints the error 389 00:27:19,810 --> 00:27:26,740 message, increases numTries by 1, and goes back to the top of 390 00:27:26,740 --> 00:27:27,990 the while loop.
391 00:27:31,020 --> 00:27:35,740 If it goes through this while loop too many times, it leaves, 392 00:27:35,740 --> 00:27:40,140 raising the type error with the message that the argument 393 00:27:40,140 --> 00:27:40,880 "numTries 394 00:27:40,880 --> 00:27:46,970 exceeded." All right. 395 00:27:46,970 --> 00:27:48,875 Let's run this and see what happens. 396 00:28:10,990 --> 00:28:14,180 Every once in a while, funny things happen. 397 00:28:14,180 --> 00:28:18,430 When that happens, somehow, clicking in this window, and 398 00:28:18,430 --> 00:28:20,410 then clicking back in that window tends to 399 00:28:20,410 --> 00:28:23,040 make it work again. 400 00:28:23,040 --> 00:28:24,290 This is a bug in Python. 401 00:28:30,420 --> 00:28:33,940 So it's now asked me to enter an int. 402 00:28:33,940 --> 00:28:40,870 If I enter an int, everything is good. 403 00:28:40,870 --> 00:28:42,550 It just prints it. 404 00:28:42,550 --> 00:28:51,630 On the other hand, if I run it, and I enter the letter A, 405 00:28:51,630 --> 00:28:54,960 it will give me another chance. 406 00:28:54,960 --> 00:28:59,190 And now I can enter an int, and it's happy. 407 00:29:10,270 --> 00:29:10,530 All right. 408 00:29:10,530 --> 00:29:17,290 Now, suppose I am unhappy with the fact that if four times in 409 00:29:17,290 --> 00:29:21,510 a row I've failed to enter an int, something bad happens. 410 00:29:21,510 --> 00:29:25,110 It comes back, and it raises an exception. 411 00:29:25,110 --> 00:29:39,050 Well, I can catch that in the code that calls readVal. 412 00:29:39,050 --> 00:29:41,970 So here at the top level, I'm going to try readVal. 413 00:29:44,480 --> 00:29:47,080 And then I'm going to say, except if a type error is 414 00:29:47,080 --> 00:29:52,340 raised, print the argument returned by that exception. 415 00:29:55,940 --> 00:29:58,320 So this is what's called the handler for the exception. 416 00:30:01,470 --> 00:30:05,030 And so I don't have to crash when the exception is raised.
417 00:30:05,030 --> 00:30:11,030 But I can actually deal with it and do something sensible. 418 00:30:11,030 --> 00:30:14,730 If following the word "except," as you see over 419 00:30:14,730 --> 00:30:22,370 there, I have not listed any exception names, then I'll go 420 00:30:22,370 --> 00:30:25,080 to the except clause for all exceptions. 421 00:30:25,080 --> 00:30:26,820 Doesn't matter what the exception is. 422 00:30:26,820 --> 00:30:29,650 I will go there. 423 00:30:29,650 --> 00:30:32,635 So I can write code that captures any exception. 424 00:30:38,610 --> 00:30:38,837 OK. 425 00:30:38,837 --> 00:30:42,790 Usually, that's not as good a thing, because it shows that I 426 00:30:42,790 --> 00:30:45,510 did not anticipate what the exception might be. 427 00:30:48,270 --> 00:30:50,680 But you can see that this is a pretty 428 00:30:50,680 --> 00:30:53,830 powerful programming paradigm. 429 00:30:53,830 --> 00:30:58,080 I can write this fairly compact readVal function 430 00:30:58,080 --> 00:31:00,340 that's pretty robust. 431 00:31:00,340 --> 00:31:04,130 It's polymorphic, it can take in what the error messages 432 00:31:04,130 --> 00:31:07,030 are, and it can try as many times as I want. 433 00:31:10,900 --> 00:31:12,898 Yeah? 434 00:31:12,898 --> 00:31:16,650 AUDIENCE: [INAUDIBLE] after type error [INAUDIBLE]? 435 00:31:16,650 --> 00:31:19,200 PROFESSOR: You'll note that when I raised the 436 00:31:19,200 --> 00:31:23,330 exception type error after it, I had an open [UNINTELLIGIBLE] 437 00:31:23,330 --> 00:31:25,480 a string. 438 00:31:25,480 --> 00:31:30,030 So what this basically says is that an exception can have 439 00:31:30,030 --> 00:31:33,220 associated with it a set of arguments, 440 00:31:33,220 --> 00:31:35,300 a sequence of arguments. 441 00:31:35,300 --> 00:31:40,360 And I've just chosen to call the first of the arguments s, 442 00:31:40,360 --> 00:31:43,060 so that I could then print it.
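The readVal function and its handler, as described, can be sketched as follows. This is a reconstruction under stated assumptions rather than the lecture's code: it is written for Python 3, so input stands in for raw_input and "except TypeError as s" replaces the older comma form; the read and max_tries parameters are additions that are not in the original, included so the function can be exercised without a keyboard.

```python
def read_val(val_type, request_msg, error_msg, read=input, max_tries=4):
    """Prompt until the reply converts to val_type; give up after max_tries."""
    num_tries = 0
    while num_tries < max_tries:
        val = read(request_msg)
        try:
            return val_type(val)   # e.g. int("12") succeeds, int("b") raises
        except ValueError:
            print(error_msg)
            num_tries += 1
    raise TypeError("numTries exceeded")

# Caller-side handler: four bad replies in a row, then the message is printed.
replies = iter(["a", "b", "c", "d"])
try:
    print(read_val(int, "Enter an int: ", "Not an int.",
                   read=lambda prompt: next(replies)))
except TypeError as s:
    print(s)
```

Called with the default read=input, the function prompts at the keyboard just as the lecture demonstration does.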
443 00:31:43,060 --> 00:31:46,810 So this is a fairly common paradigm, that you associate a 444 00:31:46,810 --> 00:31:51,710 message with an exception, explaining the exception. 445 00:31:51,710 --> 00:31:55,750 Since after all, type error is not all that meaningful. 446 00:31:55,750 --> 00:31:57,860 And this tells me why it was raised, that I 447 00:31:57,860 --> 00:32:01,030 tried too many times. 448 00:32:01,030 --> 00:32:04,630 OK, does it make sense? 449 00:32:04,630 --> 00:32:05,880 Anything else? 450 00:32:07,980 --> 00:32:09,302 Yeah? 451 00:32:09,302 --> 00:32:10,552 AUDIENCE: [INAUDIBLE] 452 00:32:13,586 --> 00:32:16,920 before you said [INAUDIBLE]? 453 00:32:16,920 --> 00:32:22,340 PROFESSOR: Well, what we'll see when I execute something 454 00:32:22,340 --> 00:32:34,010 like assert false, it's actually raising an exception. 455 00:32:34,010 --> 00:32:37,400 It's raising an assertion error exception. 456 00:32:37,400 --> 00:32:41,560 And so I can actually catch that and do something with it. 457 00:32:41,560 --> 00:32:43,360 I've been using these asserts just 458 00:32:43,360 --> 00:32:46,310 basically to stop the program. 459 00:32:46,310 --> 00:32:49,290 But we've also seen that sometimes in functions, we use 460 00:32:49,290 --> 00:32:52,940 assertions to check the types of the arguments. 461 00:32:52,940 --> 00:32:56,560 But if you don't catch that exception, then the program 462 00:32:56,560 --> 00:32:58,830 crashes with the wrong arguments, without doing 463 00:32:58,830 --> 00:33:00,560 anything useful. 464 00:33:00,560 --> 00:33:05,040 Because that's just an exception, I can catch it, and 465 00:33:05,040 --> 00:33:07,350 then do something useful. 466 00:33:07,350 --> 00:33:11,250 So it's, again, just an example of that. 467 00:33:11,250 --> 00:33:13,730 We're going to try your hands once more. 468 00:33:13,730 --> 00:33:16,860 I'll throw you a different kind of candy. 469 00:33:16,860 --> 00:33:19,820 And this time he caught it. 
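A small sketch of the point about assert, in Python 3: a failed assertion raises AssertionError, and that exception can be caught like any other. The function names safe_sqrt and try_sqrt are made up for this illustration.

```python
def safe_sqrt(x):
    # Use an assertion to check the type of the argument.
    # A failed assert raises AssertionError with this message.
    assert isinstance(x, (int, float)), "x must be a number"
    return x ** 0.5


def try_sqrt(x):
    try:
        return safe_sqrt(x)
    except AssertionError as e:
        # Instead of crashing, do something useful with the failure.
        return "caught: " + str(e)
```

Without the try in try_sqrt, a call such as safe_sqrt("four") would stop the program with a traceback. One caveat: Python skips assert statements entirely when run with the -O option.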
470 00:33:19,820 --> 00:33:21,070 Two for two. 471 00:33:25,360 --> 00:33:28,460 Very powerful. 472 00:33:28,460 --> 00:33:32,140 A very useful mechanism. 473 00:33:32,140 --> 00:33:35,400 You can use them for sort of catching errors, 474 00:33:35,400 --> 00:33:36,650 as we've done here. 475 00:33:39,100 --> 00:33:43,050 They are frequently used in situations where you are 476 00:33:43,050 --> 00:33:46,350 getting input from a user. 477 00:33:46,350 --> 00:33:50,490 So for example, if you were writing a text editor, and you 478 00:33:50,490 --> 00:33:54,710 wanted to open up a file, and you typed in the file name, 479 00:33:54,710 --> 00:33:59,130 and it didn't exist, you would in Python get an error message 480 00:33:59,130 --> 00:34:01,130 when you try to open that file. 481 00:34:01,130 --> 00:34:05,510 It would raise an exception, "File Not Found," basically, 482 00:34:05,510 --> 00:34:09,870 which you could then catch, and then print to the user a 483 00:34:09,870 --> 00:34:13,760 useful error message saying, "File Not Found." 484 00:34:13,760 --> 00:34:17,510 Similarly, if you try and write to a file that already 485 00:34:17,510 --> 00:34:22,370 exists, you can get an exception saying, do you 486 00:34:22,370 --> 00:34:24,650 really want to overwrite this file? 487 00:34:24,650 --> 00:34:26,310 And ask the user. 488 00:34:26,310 --> 00:34:29,429 So it's very commonly used in a lot of 489 00:34:29,429 --> 00:34:31,800 programs as a mechanism. 490 00:34:31,800 --> 00:34:35,620 And now, as we go on and see more and more code, you'll see 491 00:34:35,620 --> 00:34:39,850 that I'm going to start using exceptions fairly frequently 492 00:34:39,850 --> 00:34:43,060 as a flow of control mechanism. 493 00:34:43,060 --> 00:34:45,830 It makes certain kinds of code easier to write. 494 00:34:48,350 --> 00:34:49,440 All right. 495 00:34:49,440 --> 00:34:53,230 That's all I have to say about exceptions. 496 00:34:53,230 --> 00:34:57,250 Simple but useful mechanism. 
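The text editor example might look roughly like this in Python 3, where opening a missing file for reading raises FileNotFoundError. The function name read_text and the wording of the message are assumptions for illustration.

```python
def read_text(file_name):
    """Return the file's contents, or None if the file does not exist."""
    try:
        with open(file_name) as f:
            return f.read()
    except FileNotFoundError:
        # Tell the user something useful instead of showing a traceback.
        print("File Not Found: " + file_name)
        return None
```

For the overwrite case, Python 3 offers an exclusive-creation mode: open(file_name, "x") raises FileExistsError if the file is already there, which a program could catch before asking the user whether to overwrite.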
497 00:34:57,250 --> 00:34:59,900 Now, on to something that's even more 498 00:34:59,900 --> 00:35:02,113 useful, but not so simple. 499 00:35:04,830 --> 00:35:08,770 It is probably the distinguishing thing, not only 500 00:35:08,770 --> 00:35:13,100 in Python, but in a whole class of 501 00:35:13,100 --> 00:35:15,620 modern programming languages. 502 00:35:15,620 --> 00:35:17,780 And that's the notion of a class. 503 00:35:26,230 --> 00:35:27,800 I'm not going to finish it today. 504 00:35:27,800 --> 00:35:32,220 I'm barely going to scratch the surface of it today. 505 00:35:32,220 --> 00:35:35,430 But we're going to start leading up and I will pretty 506 00:35:35,430 --> 00:35:37,215 much finish it on Thursday. 507 00:35:39,880 --> 00:35:52,100 We've already seen the notion of a module, which has been a 508 00:35:52,100 --> 00:35:53,750 collection of related functions. 509 00:36:01,480 --> 00:36:06,660 So, for example, we've seen code that includes something 510 00:36:06,660 --> 00:36:08,390 like import math. 511 00:36:13,160 --> 00:36:17,220 And that provided me with access to 512 00:36:17,220 --> 00:36:18,620 functions like math.log. 513 00:36:23,710 --> 00:36:28,480 What the module mechanism and the import mechanism do is 514 00:36:28,480 --> 00:36:32,690 make it convenient to import a lot of 515 00:36:32,690 --> 00:36:36,550 related things at once. 516 00:36:36,550 --> 00:36:47,920 And then we use this dot notation to disambiguate, to 517 00:36:47,920 --> 00:36:50,740 tell us, well, which log. 518 00:36:50,740 --> 00:36:54,760 Typically, there's probably only one log function. 519 00:36:54,760 --> 00:37:03,540 But certainly, you might imagine that set.member and 520 00:37:03,540 --> 00:37:09,940 table.member would be different, and 521 00:37:09,940 --> 00:37:11,190 that both might exist. 522 00:37:15,250 --> 00:37:20,030 And as we've said before, the dot notation avoids conflicts 523 00:37:20,030 --> 00:37:21,280 by disambiguating.
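As a small illustration of how the dot notation disambiguates, here is a Python 3 sketch in which two different functions are both named log. The local log function and the variable names are made up for this example; math.log and math.e are the real standard-library names.

```python
import math


def log(x):
    """A different function that happens to share the name log."""
    return "logged " + str(x)


natural = math.log(math.e)   # the log that lives in the math module
local = log(10)              # the log defined in this file
```

Both functions coexist without conflict, because math.log is always reached through the module name.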
524 00:37:31,460 --> 00:37:31,695 OK. 525 00:37:31,695 --> 00:37:32,870 That's a module. 526 00:37:32,870 --> 00:37:41,100 What a class is, it is like a module, but it's not just a 527 00:37:41,100 --> 00:37:44,750 collection of functions. 528 00:37:44,750 --> 00:38:04,010 A class is a collection of data and functions, functions 529 00:38:04,010 --> 00:38:05,730 that operate on that data. 530 00:38:08,420 --> 00:38:13,520 They are bound together, so that you can pass an object 531 00:38:13,520 --> 00:38:17,490 from one part of a program to another. 532 00:38:17,490 --> 00:38:21,950 And the part of the program to which you pass it 533 00:38:21,950 --> 00:38:27,680 automatically gets access to the functions associated with 534 00:38:27,680 --> 00:38:29,245 that type of object. 535 00:38:33,670 --> 00:38:37,550 And this is really the key to what people call 536 00:38:37,550 --> 00:38:42,430 object-oriented programming, a very popular buzzword. 537 00:38:53,150 --> 00:38:56,580 So we've already seen that kind of thing, where if we 538 00:38:56,580 --> 00:39:01,690 pass a list from one function to another, we can write 539 00:39:01,690 --> 00:39:07,775 something like L.append, some value. 540 00:39:16,690 --> 00:39:21,760 The data and functions associated with an object are 541 00:39:21,760 --> 00:39:24,800 called that object's attributes. 542 00:39:32,510 --> 00:39:41,670 So you can think of this as a way to associate attributes 543 00:39:41,670 --> 00:39:42,920 with objects. 544 00:39:51,400 --> 00:39:54,480 Now, I've been talking about data and functions as if 545 00:39:54,480 --> 00:39:57,720 they're different kinds of things. 546 00:39:57,720 --> 00:40:02,210 In fact, they're not really, because they're just objects. 547 00:40:02,210 --> 00:40:05,360 In Python, everything is an object, including, as we'll 548 00:40:05,360 --> 00:40:10,570 see on Thursday, the class itself is an object. 
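Class definitions are covered in detail later in the course, so the following Python 3 sketch is only a preview of the idea that data and the functions operating on it travel together. The class Counter and the functions bump_twice and add_greeting are hypothetical names invented for this illustration.

```python
class Counter:
    """Data (count) bundled with the functions that operate on it."""

    def __init__(self):
        self.count = 0            # data attribute

    def increment(self):          # function attribute, i.e. a method
        self.count += 1


def bump_twice(counter):
    # This function never imports or defines increment; it gets access
    # to it simply by being handed the object.
    counter.increment()
    counter.increment()


def add_greeting(L):
    # The same thing happens with a built-in list passed between functions.
    L.append("hello")


c = Counter()
bump_twice(c)
words = []
add_greeting(words)
```

After this runs, c.count is 2 and words holds one string. The class Counter and the function bump_twice are themselves objects too, which is why isinstance(Counter, object) is True.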
549 00:40:10,570 --> 00:40:13,710 When people talk about objects and object-oriented 550 00:40:13,710 --> 00:40:21,570 programming, they often use a message passing metaphor. 551 00:40:21,570 --> 00:40:24,385 And I want to emphasize it's nothing more than a metaphor. 552 00:40:28,980 --> 00:40:32,960 And I almost hesitate to bring it up, because it makes it all 553 00:40:32,960 --> 00:40:35,460 sound more complicated than it is. 554 00:40:35,460 --> 00:40:37,980 But you will see this phrase in the literature. 555 00:40:37,980 --> 00:40:39,480 People will use it. 556 00:40:39,480 --> 00:40:42,050 So you need to know what it is. 557 00:40:42,050 --> 00:40:48,970 The basic metaphor is that when I write something like 558 00:40:48,970 --> 00:40:57,690 L.append, I am passing the message append e to the object 559 00:40:57,690 --> 00:41:03,970 L. And then there's a mechanism for looking up what 560 00:41:03,970 --> 00:41:08,760 that message means. 561 00:41:08,760 --> 00:41:13,270 And then the object, L, executes that message, and 562 00:41:13,270 --> 00:41:14,520 does something. 563 00:41:17,970 --> 00:41:29,150 So, for example, if I had a class called Circle, I might 564 00:41:29,150 --> 00:41:32,020 pass the object C 565 00:41:32,020 --> 00:41:36,850 the message area, which would cause this object to return 566 00:41:36,850 --> 00:41:41,120 its own area, the area of the circle that is 567 00:41:41,120 --> 00:41:42,370 that object. 568 00:41:45,140 --> 00:41:48,510 Again, nothing dramatic going on here. 569 00:41:48,510 --> 00:41:52,800 If you just think of this as a fancy way of writing 570 00:41:52,800 --> 00:41:57,000 functions, you'll be absolutely correct. 571 00:41:57,000 --> 00:42:01,520 But I did think you should hear about this metaphor. 572 00:42:01,520 --> 00:42:05,160 I add by the way -- should have-- 573 00:42:05,160 --> 00:42:08,270 I've been using the word...
574 00:42:08,270 --> 00:42:18,290 Method is a function associated with an object. 575 00:42:29,110 --> 00:42:34,280 So in this case, the method area is associated with the 576 00:42:34,280 --> 00:42:40,610 object C. And purists would say, always refer to append as 577 00:42:40,610 --> 00:42:44,910 a method, rather than a function, because we use the 578 00:42:44,910 --> 00:42:48,890 dot notation to get to it, and it's always associated with 579 00:42:48,890 --> 00:42:50,885 some object of type list. 580 00:42:58,800 --> 00:43:06,310 Now, just as data can have types, objects, as we know, 581 00:43:06,310 --> 00:43:08,450 have types. 582 00:43:08,450 --> 00:43:16,290 What a class is is it's a collection of objects with 583 00:43:16,290 --> 00:43:23,870 identical characteristics that form a type. 584 00:43:23,870 --> 00:43:28,190 So we can use classes to introduce new types into the 585 00:43:28,190 --> 00:43:30,490 programming environment. 586 00:43:30,490 --> 00:43:34,890 So as you think about existing things-- 587 00:43:34,890 --> 00:43:37,570 now, I won't put that up, because it will disappear 588 00:43:37,570 --> 00:43:39,990 behind the screen. 589 00:43:39,990 --> 00:43:46,365 We've looked at things like lists and dict. 590 00:43:51,980 --> 00:43:54,180 What these are are built-in classes. 591 00:44:00,240 --> 00:44:06,960 They happen to be classes that are so useful that somebody 592 00:44:06,960 --> 00:44:09,750 decided they should be part of the language, they should have 593 00:44:09,750 --> 00:44:13,160 efficient implementations, they should be built-in, 594 00:44:13,160 --> 00:44:16,770 people shouldn't have to reimplement them themselves. 595 00:44:16,770 --> 00:44:19,180 And in fact, there are a whole bunch of interesting built-in 596 00:44:19,180 --> 00:44:21,300 classes in Python. 
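The Circle class is only mentioned in passing here, so this Python 3 sketch is a guess at what such a class could look like rather than the course's own code. The radius attribute and the variable names are assumptions.

```python
import math


class Circle:
    """A new type whose objects know how to compute their own area."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):               # a method: a function tied to the object
        return math.pi * self.radius ** 2


c = Circle(2.0)
a = c.area()                      # "pass the message area to the object c"
same = Circle.area(c)             # the same call written as a plain function
```

The last two lines compute the same value, which is the sense in which the message passing metaphor is just a fancy way of writing a function call. The built-in list and dict are classes in the same way, so type([]) is list, just as type(c) is Circle.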
597 00:44:21,300 --> 00:44:24,580 And there are a whole bunch of libraries of classes you can 598 00:44:24,580 --> 00:44:27,140 bring in-- and we'll look at many of those-- 599 00:44:27,140 --> 00:44:30,390 that extend it. 600 00:44:30,390 --> 00:44:33,000 And that's the beauty of the class mechanism. 601 00:44:33,000 --> 00:44:39,180 It lets you add new types to the language that are every 602 00:44:39,180 --> 00:44:42,530 bit as easy to use as the built-in types. 603 00:44:42,530 --> 00:44:47,220 So in effect, the language can be extended to add new and 604 00:44:47,220 --> 00:44:49,440 useful types. 605 00:44:49,440 --> 00:44:51,260 And we'll look at several examples of 606 00:44:51,260 --> 00:44:52,510 that starting on Thursday.