The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Good morning, everyone. Today, we're going to talk about boosting. Boosting is pretty awesome, and it's not as hard as it might seem. It's actually pretty easy, as long as you do it right. So let's take a look at this boosting problem.

Just like with ID trees, you may have noticed there's the ID tree problem where there's a graph, and all of the tests are x- and y-axis tests on that graph. And then there's the ID tree problem where there are a lot of crazy different classifiers about characteristics that are discrete, and all of the ID tree stumps, if you will, are built out of those discrete qualities. This is an example of a boosting problem of the second type, with a bunch of discrete qualities, like evil, emo, transforms, sparkly. And it has a numerical quality, number of romantic interests. So it's one of basically the two kinds of boosting problems that you might see. And then there is the two-axis Cartesian problem. A good example of that is the hours of sleep versus coffee problem, which I, for one, am planning on doing in tutorial on Monday so that you guys get a sense of both types of problems.

This one looks big. I mean, it has ten different vampires or non-vampires to classify, and a whole bunch of possible classifiers. But if you do it right, you can do it fast.

So here's the prompt for the problem. Let's see. After graduating MIT, you get a job working for Van Helsing and Sommers, a famous vampire-hunting consulting agency. Gabriel Van Helsing, one of the two founders, once attended several 6.034 lectures as a guest, and he remembers Professor Winston's vampire identification tree lecture.
He assigns you the task of creating a superior classifier for vampires by using boosting on the following data.

So we've got the ID number, which we can use to write things out in shorthand. We've got the names of several vampires and non-vampires. Then you see whether they're a vampire or not -- that's whether their classification is a plus or minus for the quality of vampire. After that, there's a bunch of possible ways to classify whether they're a vampire or not: whether or not they're evil, whether or not they're emo, whether or not they transform, and whether or not they're sparkly, as well as the number of romantic interests that they have.

So for instance, on the one hand, you have Dracula, who's evil, but he's not emo. He can transform into a bat or a cloud of mist. He does not sparkle, and he has five romantic interests -- those three vampire chicks at the beginning, Wilhelmina Murray, and Lucy Westenra. On the other hand, you have Squall Leonhart, the protagonist of Final Fantasy VIII, who is extremely emo and doesn't have any of the other characteristics. And he's not a vampire. However, he's a nice counterexample to a possible rule that all emo people are vampires, because he's very, very emo, and he's not a vampire.

So how will we go about tackling this problem with boosting? Well, there's a whole bunch of different classifiers. And if you think this is all of them -- evil, emo, transforms, sparkly, romantic interests, and true -- it's actually only half of them. The other half are the opposite versions, but we'll ignore them for now.
So if you look at these, you can probably figure out what they mean -- evil equals yes means vampire is what we're saying here -- except maybe true. You might be saying, why is there one that just says true? The one that just says true says that everybody is a vampire. You might think, oh, that sucks. But it's not that bad, since seven of the ten samples are vampires.

The key, crucial thing about boosting is that for any possible classifier -- like classifying on the evil dimension, which actually sounds like some kind of weird place that you'd go in a comic book, or classifying on the emo dimension or whatever -- as long as it's not a 50-50 split of the data, you're guaranteed to be able to use it in some way for boosting. If there is a 50-50 split, it's like flipping a coin, so it's useless. Because if you had some other thing, like gender equals male or female -- and let's say that was 50-50 between vampire and non-vampire; it's not, but let's say it was -- it's a useless classifier, because it would be just the same as flipping a coin. You'd get no information.

Now, you might say, wait a minute. What about classifiers that do worse than 50-50? What about them? Might they not be even worse than a 50-50 classifier? I claim a classifier that gets less than 50-50 is still better than a classifier that gets exactly a 50-50 split. Is there a question?

AUDIENCE: Yeah. In the ID tree example, somebody said you used 50-50 classifiers and played around. And after you already produced the elements per set, then you only used 50-50 classifiers [INAUDIBLE] per set.

PROFESSOR: So the question is, in the ID tree example, you might use 50-50 classifiers in later rounds if, for instance, there's a 50-50 classifier except that most of the things off of one side have already been removed. Let's say there's 20 data points, and there's a classifier that splits them 10 and 10, and it gets half plus, half minus on both sides. But all the pluses from the right side have been removed by some other classifier. You might use it. That's true. But in boosting, you will never use something that's a 50-50 classifier. You never use something that has exactly a 50-50 chance of being correct.
Because if it has a 50-50 chance of being correct, it's useless. And if it has a 50-50 chance of -- no, sorry. Let me specify again. You'll never use something that has a 50-50 chance of giving you the right answer given the weights. That's very, very important. And that may be what your question was getting at.

As I'm about to show you, and as Patrick told you in the lecture, in later rounds of boosting, you change the weights of each of the ten data points. At first, you start with all weights being 1/10. The weights have to add up to 1. In this case, you never, ever choose a classifier that gets five of them right and five of them wrong. In the later rounds, you'll never, ever choose a classifier that gets exactly half of the weight wrong. But half of the weight may not be half of the data points. So it's possible to choose a classifier that gets half of the data points wrong if it doesn't get half of the weight wrong. And that's similar to an ID tree when you've already gotten things right before, because you'll see that the weight is going to go to the ones you got wrong.

So I'm not saying that you should throw out right away anything that gets five of the points wrong. Hell, you shouldn't even throw out right away something that gets seven of the points wrong. It's possible -- possible -- to get seven of the points wrong while getting less than half of the weight wrong, if those other three points are really, really annoying to get right. And we'll see that later on.

But for insight: at every step along the way, we're willing to choose any classifier that doesn't get 50-50. However, we want to choose the classifier that gets the most of the weight right. By most of the weight, at first, we mean most of the points right. Later, we will mean exactly what I said -- most of the weight. And if you don't understand that, it's sometimes hard to get it right away when Patrick just lectures through it and introduces a new concept.
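To pin down "most of the weight right," here is a minimal Python sketch of the selection rule just described. It is not from the recitation itself; the names and the dictionary-of-sets representation are illustrative. Each classifier is summarized by the set of points it misclassifies, and its error is the total weight of that set.

```python
# Minimal sketch of the selection rule: a classifier's error is the total
# weight of the points it gets wrong, and we pick the one with least error.
# (Representation is illustrative: weights and misclassified sets are keyed
# by data-point ID.)

def weighted_error(misclassified, weights):
    """Total weight of the points this classifier misclassifies."""
    return sum(weights[i] for i in misclassified)

def pick_classifier(wrong_sets, weights):
    """Choose the classifier with the least weighted error.

    An error of exactly 1/2 is useless (a coin flip). An error above 1/2
    just means you should use the flipped version of the classifier,
    whose error is 1 minus the original.
    """
    return min(wrong_sets,
               key=lambda name: weighted_error(wrong_sets[name], weights))
```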
If you don't understand that, you'll see what I mean when we go through, all right? So the point I was making before is, what about things that get less than half of the weight right? Well, those are always OK, because you can just flip them around, use their inverse, and that gets more than half of the weight right.

It's sort of like -- yeah, it's sort of like my girlfriend always tells me that she is more than 50% likely to choose the wrong direction when you're trying to go between two places, which I'm kind of skeptical of. But I said, if that's really true, then we can just go wherever you didn't say to go, and we'll be more likely to go the right way. So you're actually really good at finding the place that we want to go. And then she's like, no, that won't work, because then I'll know that you're going to do that, and I'll double say the wrong way. And then you'll go the wrong way again. But that notwithstanding, you can see you can apply the same concept to boosting. And that's why, underneath this, I have all the opposite versions of all these tests.

So what should we be doing to solve this problem more quickly? First, let's figure out which data points are misclassified by each of these classifiers. In other words, if we say all the evil things are vampires and all the non-evil things are not vampires, what do we get wrong? And if we do that for every classifier, that'll make it faster later on. Because later on, we're going to go through classifiers, and we're going to have to add up the ones they got wrong. So this chart over here is going to be extremely useful. And on the test that this appeared on, they even made you fill it in to help yourself out.

So let's see. If we said that all the evil equals yes are vampires and all the evil equals no are not vampires, then -- I'll do the first one for you. So we get all of the non-vampires correct, because they are all evil equals no.
But unfortunately, we get Angel, Edward Cullen, Saya Otonashi, and Lestat de Lioncourt wrong, because they are vampires, and they're evil equals no. Apparently, Lestat is iffy. But I never read those books, and the Wikipedia article said that in the end, he wasn't that evil. So there we go. Evil equals yes misclassifies 2, 3, 4, and 5.

All right, so let's try emo equals yes. I'll have someone else do it, so let's see if you guys got it. If we say that all the emo people are vampires and all the non-emo people are not vampires, what do we get wrong?

AUDIENCE: 1, 6, 7, 9.

PROFESSOR: 1, 6, 7, 9 -- that's exactly right, and fast. Good. We get 1, 6, 7, and 9 wrong. 1, 6, and 7 are wrong because they are not emo, but they're vampires. 9 is wrong because Squall is emo, and he is not a vampire. Good.

OK, what if we said that exactly the transforming characters are vampires, and the ones that do not transform are not vampires? Which ones will we get wrong?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Transforms is the next one over.

AUDIENCE: All of the ones [INAUDIBLE].

PROFESSOR: So which ones would we get wrong if we said that transforms equals yes were vampires and transforms equals no were not vampires?

AUDIENCE: We'd get 8 wrong.

PROFESSOR: We'd definitely get 8 wrong, because she's not a vampire.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Well, no. It's actually on there.

AUDIENCE: And 3 and 4 we'd also get wrong.

PROFESSOR: Yep.

AUDIENCE: As well as 5.

PROFESSOR: Yes, exactly. OK. Oh, man. You didn't see the chart. You were just like, hmmm. You saw the left. You just said, hm, which ones of these are the transforming characters? OK, that's pretty hardcore.

[LAUGHTER]

PROFESSOR: But yeah -- 3, 4, 5, and 8. No, no, it's definitely given to you.
That would be like the worst test ever for international students -- ah, if you don't know these ten characters as vampires, you lose.

All right, so what about: sparkly equals yes is a vampire, and if it's not sparkly, it's not a vampire? This is guaranteed not to go well. What do you think it's going to get wrong?

AUDIENCE: For sparkly?

PROFESSOR: Yeah, sparkly equals yes are the only vampires.

AUDIENCE: [INAUDIBLE] wrong. Angel's going to be wrong. Saya -- so 1, 2, 4, 5, 6, 7, and 8.

PROFESSOR: And 8 -- yes, that's right. It gets 1, 2, 4, 5, 6, 7, and 8 wrong. That's pretty awful. But dammit, it gets Edward Cullen right. And he's hard to get correct, due to the fact that he's not very much like a vampire. He's more of a superhero who says he's a vampire.

OK, so next: number of romantic interests greater than two. So if they have more than two romantic interests, they're a vampire. And otherwise, they're not a vampire. So which ones would that get wrong? Hm?

AUDIENCE: 3 and 10.

PROFESSOR: Just 3 and 10, that's right. Because Circe had Odysseus. She had Telemachus. Actually, she had that guy she turned into a woodpecker. She had that other guy who was a sea god, who caused her to turn Scylla into the nine-headed thing, and probably at least one other person. So Circe it gets wrong. And it also gets Edward Cullen wrong, because he only has one. So 3 and 10. You can tell I thought about this problem when I was writing it up. I wrote this one.

All right, number of romantic interests greater than four. So it's a little bit different this time. Now you have to have at least four romantic interests -- or actually, greater than four, but there are none that have exactly four -- to be classified as a vampire. Which ones do you think it's going to get wrong?

AUDIENCE: 3, 4, 10.

PROFESSOR: Yup, it is going to get 3, 4, and 10 wrong.
Because now, you run into the fact that Saya has that blond [INAUDIBLE] guy, Haji, and Kai.

So, the last of the positive ones -- because I claim I can derive all the negative ones if you guys give me the positive ones. The last of the positive ones is: everybody's a vampire. Who does that get wrong?

AUDIENCE: 8, 9, 10.

PROFESSOR: Yes, OK. Now, I can derive all the negative ones from this without a sweat. Evil equals no -- well, it's 1, 6, 7, 8, 9, 10, without looking at the chart. Raise your hand if you see why it's 1, 6, 7, 8, 9, 10 without looking at the chart. Raise your hand if you don't. Nobody, OK -- wait. One hand. OK, I saw another one back there too, later. They were just more tentative. OK, it's the complement of A, because A is evil equals yes is a vampire. It gets 2, 3, 4, and 5 wrong. So therefore, evil equals no is a vampire is guaranteed to get all the opposite ones wrong.

AUDIENCE: Oh, we could have looked at that too?

PROFESSOR: Yeah, we're looking here, but we're not looking at the big chart there.

AUDIENCE: You can look at any?

PROFESSOR: Oh, yeah. If you can't look at anything, then you're screwed -- unless not only are you as hardcore as this guy, but you've also memorized the numbers.

All right, so emo equals no is going to be 2, 3, 4, 5, 8, 10. Transforms equals no is 1, 2, 6, 7, 9, 10. Sparkly equals no is 3, 9, 10. Romantic interests not greater than two is everything except 3 and 10 -- that's 1, 2, 4, 5, 6, 7, 8, 9. Romantic interests not greater than four is 1, 2, 5, 6, 7, 8, 9. And then finally, everything but 8, 9, and 10, so 1, 2, 3, 4, 5, 6, 7.

All right. So when we started off, we know what everything gets wrong. I then make a bold claim. Because there are n of these, which is 14, I make the claim that there are only six that, in your wildest dreams, you would ever possibly even consider using. And the rest, you would never, ever use. Question?
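For reference, the misclassification sets just read off the board can be collected in one place. A small sketch; the dictionary layout is illustrative, but the sets are exactly the ones worked out above (vampires are points 1 through 7, non-vampires 8 through 10), and each negated classifier's set is the complement of the positive one's, as derived on the board.

```python
# Misclassification sets from the board: for each classifier, the set of
# data points (by ID) it gets wrong. Vampires are 1-7; non-vampires 8-10.
WRONG = {
    "A: evil = yes":       {2, 3, 4, 5},
    "B: emo = yes":        {1, 6, 7, 9},
    "C: transforms = yes": {3, 4, 5, 8},
    "D: sparkly = yes":    {1, 2, 4, 5, 6, 7, 8},
    "E: romantic > 2":     {3, 10},
    "F: romantic > 4":     {3, 4, 10},
    "G: true":             {8, 9, 10},
}

ALL_POINTS = set(range(1, 11))

# A negated classifier gets wrong exactly what its positive version gets
# right, so its set is the complement -- e.g. "evil = no" misclassifies
# {1, 6, 7, 8, 9, 10}, matching the board.
NEGATED_WRONG = {name + ", negated": ALL_POINTS - wrong
                 for name, wrong in WRONG.items()}
```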
AUDIENCE: Yeah, I just have a question about the number of romantic interests.

PROFESSOR: Yes.

AUDIENCE: You negated it without an equals on either side.

PROFESSOR: That's true. That works only because there are no values of exactly 2 or 4. But it should have been negated with a less than or equal. I'm copying off of the quiz. But yes. I noticed that this morning when I was putting myself through my paces. I'm like, there's not a less than or equal to. Wait a minute. And then, oh, wait. It doesn't have any 2's or 4's. Actually, I don't remember writing them all as 5's and 3's. It's possible that somebody else in the post-editing process changed them all to be about the same number, and then changed the less than or equal tos to be less confusing. It's possible I had Circe at 4, and there was an equal to somewhere. And they were like, forget it -- because I can't think of a fifth romantic interest for her. So yes, normally, you would have to negate it with an equal to sign, but there happen to not be any things that are equal to 4 or 2 here. So they get away with it this time. But it's good practice.

So I'm claiming that in our wildest dreams, we'd only ever want to use six of these. And the other eight, forget it. So let's see. I will call on people at random -- the first people obviously are getting it really easy -- to tell me which of these you think that you might ever want to use. Give me one you might ever want to use.

AUDIENCE: E.

PROFESSOR: Why, E? Of course. That's the best one. Yes, that's one that you might ever want to use. I'll circle the ones that you might ever want to use. E -- it only gets 3 and 10 wrong. That's amazing. It's like the best class of [INAUDIBLE]. OK, so give me another one that you might ever want to use.

AUDIENCE: F.

PROFESSOR: F. Let's see.
F -- F is great. It only gets three wrong. Do people agree that you would ever want to use F?

AUDIENCE: No.

PROFESSOR: Everyone's saying no. Why not?

AUDIENCE: It's like E, except worse.

PROFESSOR: It's like E, except worse. It's guaranteed at every step, no matter what the weights are, to have a worse accuracy than E. It is definitely good. If E wasn't around, it would be one of our best classifiers of all. But actually, F is not one of the six. This is why I had them write on the test that there were six, because people might not have found all six. Because people who did figure out not to include F might not have figured out to include some of the ones you want to include. So --

AUDIENCE: I don't understand why you can't use F.

PROFESSOR: Why you can't use F, OK. So we start off with 1/10 weight for all our data points. But let's say during our time of boosting that all ten of the data points now have different weights. You want to minimize the error, right? So call the weight of 3, which goes into the error of E, x. The weight of 10 can be y. So if you're thinking of choosing E, your error is x plus y.

AUDIENCE: Sure.

PROFESSOR: If you're thinking of choosing F, your error is x plus y plus z, where z is the weight of 4. And since you're never going to have a negative weight, x plus y plus z is always greater than x plus y.

AUDIENCE: You can't choose something without the 3 and the 10 anymore, because you already chose E.

PROFESSOR: That's -- yes, you would probably choose something that didn't get the 3 and the 10 wrong. But you would certainly never choose F, ever, because it's always worse than E. In fact, this is exactly the process that will allow you to find the correct six. And by "will," I mean "can." And by "can," I mean, let's see if you guys get it.
Give me another one of the six that you might keep.

AUDIENCE: K.

PROFESSOR: K is the claim -- sparkly equals no. K, I'm going to say, will lose for the same reason as F. It gets 3, 9, and 10 wrong, and that's dominated just like 3, 4, and 10 was. So -- oh, by the way, we should not be only going for the ones with the fewest incorrect. You need to be going for ones that do not have something that is strictly better. In this case, getting 3 and 10 wrong is strictly better than getting 3, 9, and 10 wrong. Question?

AUDIENCE: Oh, I was going to say transforms.

PROFESSOR: You were going to say transforms. You are going to be correct. Transforms is one of the ones we need -- C. 3, 4, 5, and 8: there's nothing down here that gets a strict subset of those wrong. There's nothing that gets just 3, 4, 5 wrong, for instance. Yeah, there's no way to get 3 and 4 wrong without getting either 10 wrong, or 5 and 8 wrong.

AUDIENCE: So why is the 8 like that?

PROFESSOR: What?

AUDIENCE: Why not G?

PROFESSOR: Why not G? Why not? Let's include G, too.

AUDIENCE: Oh. We don't have to do it all?

PROFESSOR: We need six. No, I just said give me any, and someone gave me the easiest one, E. Question?

AUDIENCE: B.

PROFESSOR: Why not B? B looks great. I love B. Let's include B. Does someone else want to give another one that they want to include?

AUDIENCE: A.

PROFESSOR: A. Why not A? Sure. I mean, it's hard to see down here, because there might be something better on the bottom. But yeah, there's not. So let's include A. Why not A? I love A. A's great.

OK. So that is now five. There's one more that we need. It's by far the hardest one to find. Find me one more that has nothing better than it -- nothing that has a strict subset of the same ones wrong.
AUDIENCE: Question.

PROFESSOR: What?

AUDIENCE: Sorry, can we quickly -- why would you choose A before we chose C?

PROFESSOR: OK, why would you choose A before you've chosen C? Let's say 8 was a real problem for you, and 3, 4, and 5 weren't that bad. OK, you got them wrong here with transforms; you chose C. But sometime later, 8 was just by far your issue, all right? Of 3, 4, 5, and 8 -- 3, 4, and 5 have much smaller weights than 8. And then after you've gotten 3, 4, 5, and 8 wrong, 3, 4, and 5 are still not that bad, and 8 still has a high weight. And then sometime later down the line, you're looking at things. And you're saying, you know what? I really don't want to get 8 wrong again, but I don't mind if I get 3, 4, and 5 wrong. Maybe I'll get it wrong with 2, which I've never gotten wrong yet. Actually, none of the ones we've circled here get 2 wrong, so it's probably not that bad to get 2 wrong. So that's why -- because A doesn't get 8 wrong. If A were 2, 3, 4, 5, 8, you'd never take it. Do you see what I mean? Oh, did someone raise their hand? Did someone find it?

AUDIENCE: I just have a question. You can use the same reasoning for choosing K, right? Because after E, we could have chosen A and said that 9 is only a little different --

PROFESSOR: But it's strictly worse.

AUDIENCE: Sorry, sorry, I meant [INAUDIBLE] E to 9 and 10, and then we could have chosen 3, 9, and 10, right? Because --

PROFESSOR: But we chose to use E, because getting only 3 and 10 wrong is better than getting 3, 9, and 10 wrong in any universe. You pick. You see what I mean? It might not be that much worse. It might be only a little bit worse to choose K, but it's always worse. So it -- question?
AUDIENCE: [INAUDIBLE] the right one by reasoning [INAUDIBLE] we need one that doesn't have 3 in it. Because right now, [INAUDIBLE] get 3 wrong.

PROFESSOR: OK, that's a pretty good insight. What are you thinking about?

AUDIENCE: Well, I'm trying to justify D.

PROFESSOR: You're trying to justify D.

AUDIENCE: [INAUDIBLE].

PROFESSOR: D is huge. It gets more than half of them wrong. But you know what? It gets 3 right. You know what? It gets 10 right. And unlike our things that get 3 and 10 right -- which is B -- it also gets 9 right. D is the last classifier. You got it. It's hard to choose one that has seven of them wrong, but D is the last one you might pick. It turns out there's nothing better than this for correctly classifying those annoying data points of Edward Cullen and Squall, and also Circe, who's not that annoying, but she tends to be a problem where romance is concerned.

So these are the six that we might use. We can now ignore the rest of them forever -- or at least until someone reuses this problem or something like that. But we can ignore everything except A, B, C, D, E, G. In fact, why did I even bring that up? All the ones we want are on the front. I'll bring it back down. Then I'll cross these off with reckless abandon. That even broke off a piece of my chalk. Now, these are the ones that we're actually thinking about using.

There is a chart over here already prepared to do some boosting with these six classifiers, all right? So let's give it a try. Now, remember, in boosting, we always try to choose whichever classifier has the least error. Is there a question?

AUDIENCE: Sorry, yeah. Before we move on, can you say again, a little more slowly, what exactly we were looking for when we were choosing our classifiers -- like, something about the subset?

PROFESSOR: You generally want to take classifiers.
So I'll tell you what lets you cross off a classifier. That may be a better way to do it. You can cross off a classifier as useless if -- and by the way, this is only useful if you can do it faster than just looking at all of them. Because if you can't cross some of them off as useless -- usually on the test, they won't make you. You can just waste your time and have 14 instead of six possibilities at every step of the boosting.

But take a look at this one -- 1, 2, 3, 4, 5, 6, 7. Then see: do you have anything here that has a strict subset of these wrong? Oh, look. 2, 3, 4, 5 is a strict subset. This can be crossed off. 1, 2, 5, 6, 7, 8, 9 -- anything that's a strict subset? Yes, 1, 6, 7, 9. So it can be crossed off. 1, 2, 4, 5, 6, 7, 8, 9 -- let's see. 1, 6, 7, 9 is a strict subset. 3, 9, 10 -- 3, 10 is a strict subset. Then 1, 6, 7, 9 is a strict subset of the next one. For the one after that, 3, 4, 5, 8 is a strict subset, as is 2, 3, 4, 5. Then 1, 6, 7, 9 is a strict subset again. And up here, 3, 4, 10 -- 3, 10 is a strict subset. But the others don't have one, even 1, 2, 4, 5, 6, 7, 8.

In general, you want to keep them. You want to keep every classifier you might use. The only ones you'll never use are ones for which something else is always better, by having a strict subset of them wrong. Hopefully, that was more clear. It's tricky. Very few people realized -- we're brave enough to take sparkly even when it got seven things wrong. (A short code sketch of this cross-off rule appears just below.)

So let's start out some boosting. This wasn't boosting; this was setting yourself up -- but setting yourself up with the knowledge of how boosting works. Less knowledge, less search. Now we only have to search six things. Ah -- I mean more knowledge, less search, not less knowledge, less search. So we start off with all weights being equal. And since there are ten data points, all ten of the data points are weighted 1/10.
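Here is that cross-off rule as a minimal Python sketch (illustrative, not from the quiz), using the WRONG and NEGATED_WRONG dictionaries from the earlier sketch: a classifier is dominated if some other classifier misclassifies a strict subset of its points, since nonnegative weights then make the dominating one's error no larger in every round.

```python
# Cross-off rule as code: drop any classifier whose misclassified set is a
# strict superset of another's. With nonnegative weights, the dominating
# classifier's weighted error can never be larger, in any round.

def prune_dominated(wrong_sets):
    """Keep only classifiers that no other classifier strictly dominates."""
    return {name: wrong
            for name, wrong in wrong_sets.items()
            if not any(other < wrong            # '<' is strict-subset on sets
                       for other_name, other in wrong_sets.items()
                       if other_name != name)}

# Run on all 14 sets -- the 7 positives plus their negations, e.g.
# prune_dominated({**WRONG, **NEGATED_WRONG}) -- this leaves exactly the
# six circled on the board: A, B, C, D, E, and G.
```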
OK, we're now weighting all of them equally. Since we're weighting all of them equally, when we want to find the classifier with the least error, we just want to find the one that gets the fewest points wrong. Which one is that? That's our friend E, the first one that people realized was a good one. So we're going to choose classifier E.

What's our error? It's just the sum of the weights of the ones we get wrong. So what's our error this time? It's 1/5. We got points 3 and 10 wrong. They both have a weight of 1/10. 1/10 plus 1/10 is 1/5. So I'll put 1/5, and alpha. Alpha is sort of a vote that will be used at the very end to aggregate our classifier. Alpha is 1/2 the natural log of (1 minus the error) over the error.

However, I have a little trick for you. It's not that impressive of a trick, but it's a little fun. So since error is 1/2 -- sorry, not error -- alpha. Alpha is 1/2 natural log of (1 minus error) over error. If error is 1/x, then alpha is 1/2 natural log of (x minus 1). That just follows from the math: (1 minus 1/x) divided by 1/x is x minus 1. It's a little shortcut. If the error is in the form 1/x, then alpha is just 1/2 natural log of (x minus 1). Which means, since this error is in the form 1/5, everyone, alpha is? 1/2 ln 4.

OK, 1/2 ln 4. So now we come to the part of boosting that many people consider the hardest part, and I'm going to show you how to do it more easily. This is the part where we rescale the weights so that all the ones we got right add up to 1/2, and all the ones we got wrong add up to 1/2. Here is my automated process. It's called the "numerator stays the same" method. Here's how it works.

Here are our ten data points. Their current weight is 1/10, all of them. We're about to re-weight them for the next step. So you agree they're all 1/10? They're equal, to start off. So step one -- erase the denominators. Screw fractions.
I don't like them. There's division, multiplication. It's a pain. I just want to add whole numbers. That's what we're going to do. So which ones did we get wrong? 3 and 10. Circle those. All right. Add the numbers in the circles and multiply by 2. What does that give you? 4. That's the new denominator for the circled ones.

AUDIENCE: Do you always multiply by 2, or just --

PROFESSOR: You always multiply by 2. Add the numbers not in the circles. Multiply by 2. What does that give you?

AUDIENCE: 16.

PROFESSOR: 16. That's the new denominator for the uncircled ones. The final, crucial step, so that we can do this again next round -- it's by far the most mathematically complicated thing here, because we have to actually do something with fractions, but it's not too bad -- is to put everything over the same denominator. So the 1/4's become?

AUDIENCE: 4/16.

PROFESSOR: 4/16. All right. I can also uncircle these for next -- ah. I hit that button. All right. New weights -- 1/16, 1/16, 4/16, 1/16, 1/16, 1/16, 1/16, 1/16, 1/16, 4/16. Note, the weights add up to 1. The ones you got wrong add up to 1/2. The ones you got right add up to 1/2. You're happy.

So now that you get 4/16 of error for getting 3 wrong and 4/16 of error for getting 10 wrong, take a look at these six. I'm not going to call on someone -- just whoever's good at math and can add these up most quickly. 3 and 10 count as 4; all the others count as 1. Add them up. Tell me which one's the lightest. What did you say?

AUDIENCE: I'd go with B.

PROFESSOR: You'd go with B. It doesn't get 3 wrong. That sounds pretty good to me. Does everyone else like B as well? I like it.
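The "numerator stays the same" trick is just the usual re-normalization: wrong points are rescaled to total 1/2, right points to total 1/2. A minimal sketch (function name illustrative), using exact fractions so the board arithmetic is reproduced:

```python
from fractions import Fraction

def reweight(weights, wrong):
    """Rescale so the wrong points sum to 1/2 and the right points to 1/2.

    This is the 'numerator stays the same' method in normalized form:
    dividing each wrong weight by 2 * (total wrong weight) is the same as
    keeping its numerator and setting the denominator to twice the sum of
    the circled numerators, and likewise for the right points.
    """
    wrong_total = sum(w for i, w in weights.items() if i in wrong)
    return {i: w / (2 * wrong_total) if i in wrong
               else w / (2 * (1 - wrong_total))
            for i, w in weights.items()}

# Round 1: everything starts at 1/10, and E misclassifies {3, 10}.
w = {i: Fraction(1, 10) for i in range(1, 11)}
w = reweight(w, {3, 10})
# Points 3 and 10 become 1/4 (i.e., 4/16); the other eight become 1/16.
```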
I mean, of all of our ones that don't get 3 wrong or 10 wrong, we're only looking at B and D. And D has seven. B has four. So B is the best. B gets 4/16 wrong. Does everyone see that? Because even getting one of 3 or 10 wrong is as bad as all of the ones that B gets wrong, because of the new weights.

So cool, let's choose B. That's right. And I sort of gave it away. What's the error that B has? It has four wrong, each of which is worth 1/16. The error is? What? 4/16, or 1/4, whichever is your favorite. Which means that the alpha is?

AUDIENCE: 1/2 ln 3.

PROFESSOR: 1/2 ln 3. Bingo.

Final round. OK, what did we get wrong? We got 1, 6, 7, and 9 wrong. Oh yeah, we can erase the denominators. All right. What are the numbers in the circles, summed up, multiplied by 2?

AUDIENCE: 8.

PROFESSOR: That's 8 -- so the circled ones are 1/8.

And what about the numbers not in the circles, summed up, multiplied by 2?

AUDIENCE: 24.

PROFESSOR: That's right -- 24. Which means I'm going to have to change all the numbers in the circles to 3/24 -- except I guess I don't, because this is the last round. But if I were going to do another round -- let's prepare in case we were -- change all of these to 3/24. Besides, it makes it easier to calculate which one is the best possible classifier, because you can just use the numerators and add them up. So while I'm writing that up, you guys figure out which classifier you like and call it out to me when I'm done. 3/24. 1/24. 4/24. 1/24. 1/24. 3/24. 3/24. 1/24. 3/24 -- wait. I'm off by one here. 3, 1, 4 --

AUDIENCE: It's because w1 is not assigned to anything.
847 00:37:03,730 --> 00:37:05,600 So w2 is really w1. 848 00:37:05,600 --> 00:37:06,710 PROFESSOR: Aha. 849 00:37:06,710 --> 00:37:08,730 You're right. 850 00:37:08,730 --> 00:37:19,176 w1 is not assigned to anything, so w2 is really w1. Yeah? 851 00:37:19,176 --> 00:37:22,902 AUDIENCE: There's an extra 1/16 between w1, w2. 852 00:37:22,902 --> 00:37:25,760 There's an extra 1/16. 853 00:37:25,760 --> 00:37:28,390 PROFESSOR: Yes, that's true. 854 00:37:28,390 --> 00:37:29,057 OK, well-- 855 00:37:29,057 --> 00:37:29,890 AUDIENCE: We get it. 856 00:37:29,890 --> 00:37:32,160 PROFESSOR: You get it. 857 00:37:32,160 --> 00:37:34,950 H-- so what is the best H? 858 00:37:34,950 --> 00:37:38,550 You get it because it's right here. 859 00:37:38,550 --> 00:37:39,420 See? 860 00:37:39,420 --> 00:37:41,580 The process is so foolproof, even a fool 861 00:37:41,580 --> 00:37:45,060 like me can get it right while the chart is wrong. 862 00:37:45,060 --> 00:37:47,220 All right, so what's the best classifier? 863 00:37:47,220 --> 00:37:48,300 AUDIENCE: C. 864 00:37:48,300 --> 00:37:52,382 PROFESSOR: You say C. I say that seems pretty reasonable. 865 00:37:52,382 --> 00:37:54,770 It only gets 3, 4, 5, and 8 wrong. 866 00:37:54,770 --> 00:37:56,460 Does anyone else get a different answer? 867 00:37:56,460 --> 00:37:57,510 AUDIENCE: A. 868 00:37:57,510 --> 00:38:01,950 PROFESSOR: Someone else gets A. I like A. Who said A? 869 00:38:01,950 --> 00:38:03,690 A lot of people said A. 870 00:38:03,690 --> 00:38:05,750 Well, let's figure it out. 871 00:38:05,750 --> 00:38:12,000 So A gets 1, 5, 6, 7-- that's a running total of the weights it gets wrong. 872 00:38:12,000 --> 00:38:14,740 C gets 4, 5, 6, 7. 873 00:38:14,740 --> 00:38:16,900 They're, in fact, equal-- 7/24 each. 874 00:38:16,900 --> 00:38:19,500 Tie-break goes to the lower letter 875 00:38:19,500 --> 00:38:22,224 because that's what we said. 876 00:38:22,224 --> 00:38:24,390 In fact, I didn't tell you, but that's what we said. 877 00:38:24,390 --> 00:38:24,965 Question? 878 00:38:24,965 --> 00:38:27,090 AUDIENCE: So when we were deciding which classifier 879 00:38:27,090 --> 00:38:29,540 to use, can we only look at the weights, 880 00:38:29,540 --> 00:38:32,970 or do we also have to look at the [INAUDIBLE]-- 881 00:38:32,970 --> 00:38:35,160 PROFESSOR: Ignore all previous rounds. 882 00:38:35,160 --> 00:38:37,700 The question is, do you only look at the current weights 883 00:38:37,700 --> 00:38:39,460 when determining a classifier? 884 00:38:39,460 --> 00:38:42,530 Or do you look at the previous rounds as well? 885 00:38:42,530 --> 00:38:44,261 You have to ignore the previous rounds. 886 00:38:44,261 --> 00:38:44,760 Trust me. 887 00:38:44,760 --> 00:38:47,450 They will be used later in the vote. 888 00:38:47,450 --> 00:38:50,940 But it's sort of like tainting the jury a little bit 889 00:38:50,940 --> 00:38:54,250 to use the previous rounds when you're doing the current round. 890 00:38:54,250 --> 00:38:56,910 Because you want to start fresh with these new weights, 891 00:38:56,910 --> 00:38:57,930 get a new classifier. 892 00:38:57,930 --> 00:39:00,460 And then later, everyone will get to make their vote. 893 00:39:00,460 --> 00:39:02,750 So you only do it based on the current weights. 894 00:39:02,750 --> 00:39:04,458 AUDIENCE: We don't take into consideration 895 00:39:04,458 --> 00:39:06,309 if last round got 6 wrong or anything. 896 00:39:06,309 --> 00:39:08,850 PROFESSOR: Nope, although the weights do take it into consideration-- 897 00:39:08,850 --> 00:39:11,471 when a point is wrong, its weight is going to increase.
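That selection rule-- lowest weighted error on the current weights only, with ties going to the lower letter-- might look like this in code. This is a sketch under stated assumptions: the dictionary layouts and names are hypothetical, not from the lecture.

```python
from math import log

def boost_round(weights, classifiers, labels):
    """Pick the weak classifier with the lowest weighted error.

    weights:     dict point id -> current weight (Fractions work nicely)
    classifiers: dict name -> {point id: predicted label, +1 or -1}
    labels:      dict point id -> true label, +1 or -1
    Assumes 0 < error < 1, i.e. no weak classifier is perfect
    or perfectly wrong on the weighted data.
    """
    def weighted_error(name):
        return sum(w for i, w in weights.items()
                   if classifiers[name][i] != labels[i])

    # min() keeps the first of several ties, and sorting the names
    # first means ties go to the lower letter.
    best = min(sorted(classifiers), key=weighted_error)
    error = weighted_error(best)
    alpha = 0.5 * log((1 - error) / error)   # the classifier's vote weight
    return best, error, alpha
```

Note that nothing from previous rounds appears anywhere in the function: the history enters only through the current weights.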
898 00:39:11,471 --> 00:39:12,425 AUDIENCE: OK. 899 00:39:12,425 --> 00:39:13,379 PROFESSOR: Question? 900 00:39:13,379 --> 00:39:14,754 AUDIENCE: Could you theoretically 901 00:39:14,754 --> 00:39:15,764 reuse a classifier? 902 00:39:15,764 --> 00:39:18,139 PROFESSOR: The question is, could you theoretically reuse 903 00:39:18,139 --> 00:39:18,895 a classifier? 904 00:39:18,895 --> 00:39:21,220 Answer-- you absolutely can. 905 00:39:21,220 --> 00:39:23,457 When that happens, it essentially gets extra weight 906 00:39:23,457 --> 00:39:24,540 because you used it again. 907 00:39:24,540 --> 00:39:29,680 But you can never, ever use it twice in a row. 908 00:39:29,680 --> 00:39:30,460 Here's why. 909 00:39:30,460 --> 00:39:33,150 Let's say that we want to use-- which was the one we used last 910 00:39:33,150 --> 00:39:33,650 over there? 911 00:39:33,650 --> 00:39:34,170 B? 912 00:39:34,170 --> 00:39:35,628 Let's say we wanted to use B again. 913 00:39:35,628 --> 00:39:37,780 What does it give us? 914 00:39:37,780 --> 00:39:39,890 50-50. 915 00:39:39,890 --> 00:39:43,750 If we wanted to use B and then B again-- 3, 6, 9, 12-- that's 12/24 wrong. 916 00:39:43,750 --> 00:39:45,820 Always guaranteed to give you 50-50, 917 00:39:45,820 --> 00:39:48,420 because the re-weighting made the ones it got wrong sum to exactly 1/2-- so you can 918 00:39:48,420 --> 00:39:50,030 be sure you'll never use it twice in a row. 919 00:39:50,030 --> 00:39:51,880 In fact, that's by design. 920 00:39:51,880 --> 00:39:54,420 You could reuse it, but not twice in a row. 921 00:39:54,420 --> 00:39:57,100 It could be used later on down the stream. 922 00:39:57,100 --> 00:39:58,360 And it will be used. 923 00:39:58,360 --> 00:40:03,160 Because if you do seven rounds, one of them has to be reused. 924 00:40:03,160 --> 00:40:05,980 It just gives more weight to whichever one is reused. 925 00:40:05,980 --> 00:40:09,640 But yes, A wins against C. C was a perfectly good answer 926 00:40:09,640 --> 00:40:10,140 as well. 927 00:40:10,140 --> 00:40:10,640 Question? 928 00:40:10,640 --> 00:40:12,674 AUDIENCE: Wait, can you reuse [INAUDIBLE]? 929 00:40:12,674 --> 00:40:13,340 PROFESSOR: What? 930 00:40:13,340 --> 00:40:14,628 AUDIENCE: Instead of A or C. 931 00:40:14,628 --> 00:40:15,211 PROFESSOR: OK. 932 00:40:15,211 --> 00:40:18,520 If we can reuse, why didn't we pick E? 933 00:40:18,520 --> 00:40:20,380 E gets eight out of 24 wrong. 934 00:40:20,380 --> 00:40:25,620 It's one worse than A and C. That's the only reason. 935 00:40:25,620 --> 00:40:27,760 Next step will probably use A-- or sorry. 936 00:40:27,760 --> 00:40:29,660 Well, next step, we'll probably use E, 937 00:40:29,660 --> 00:40:35,190 frankly-- although maybe not, because we got 3 wrong on A. 938 00:40:35,190 --> 00:40:37,060 But pretty soon, we would use E again 939 00:40:37,060 --> 00:40:38,670 because E's pretty awesome. 940 00:40:38,670 --> 00:40:43,310 But anyway, here we used A. And we said we got 7/24 wrong. 941 00:40:43,310 --> 00:40:45,220 Oh, man, we can't use my little shortcut. 942 00:40:45,220 --> 00:40:49,450 So the answer has to be 17/7-- or rather, 1/2 natural log 943 00:40:49,450 --> 00:40:50,060 of 17/7. 944 00:40:55,060 --> 00:40:58,260 So there we go. 945 00:40:58,260 --> 00:41:02,870 Now, we have to ask, what is the final classifier that we 946 00:41:02,870 --> 00:41:04,330 created from all these things? 947 00:41:04,330 --> 00:41:09,680 All we do is we sum up all the classifiers we chose. 948 00:41:09,680 --> 00:41:12,100 And we multiply them by their weights, alpha.
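In symbols, that sum is the standard statement of the boosted classifier, written here in general form consistent with what follows:

$$H(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right), \qquad \alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t},$$

where each weak classifier \(h_t(x)\) votes +1 or -1 and \(\varepsilon_t\) is its weighted error in round \(t\).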
949 00:41:12,100 --> 00:41:19,020 So 1/2 ln 4 times E, whether or not 950 00:41:19,020 --> 00:41:28,780 E returns true, plus 1/2 ln 3 times B 951 00:41:28,780 --> 00:41:36,370 plus 1/2 ln 17/7 times A, is our final classifier, where 952 00:41:36,370 --> 00:41:41,330 E returns a plus 1 if E thinks it's a vampire, and a minus 1 953 00:41:41,330 --> 00:41:42,920 if E thinks it's not. 954 00:41:42,920 --> 00:41:47,020 Same for B and A, all right? 955 00:41:47,020 --> 00:41:49,180 And then we take the sign of this. 956 00:41:49,180 --> 00:41:50,830 And I don't mean sine and cosine. 957 00:41:50,830 --> 00:41:54,640 I mean just, is it positive or negative? 958 00:41:54,640 --> 00:41:56,190 OK? 959 00:41:56,190 --> 00:42:01,190 So the question now on the exam is, how many of the ten data 960 00:42:01,190 --> 00:42:04,720 points do you get right if we use this? 961 00:42:04,720 --> 00:42:05,906 Let's give it a look-see. 962 00:42:05,906 --> 00:42:09,830 E is-- so we have romantic interests greater than 2. 963 00:42:09,830 --> 00:42:11,610 We have emo yes. 964 00:42:11,610 --> 00:42:13,140 And we have evil yes. 965 00:42:13,140 --> 00:42:18,760 So oh my gosh, logarithms, they're sometimes annoying. 966 00:42:18,760 --> 00:42:20,285 Do we have to actually add them up? 967 00:42:20,285 --> 00:42:22,290 I claim we don't. 968 00:42:22,290 --> 00:42:24,010 Here's a nice special case of having 969 00:42:24,010 --> 00:42:26,420 three logarithms on the board. 970 00:42:26,420 --> 00:42:28,040 One of two things is true. 971 00:42:28,040 --> 00:42:30,090 Either one of those three logarithms 972 00:42:30,090 --> 00:42:33,900 is so large that it's bigger than the other two 973 00:42:33,900 --> 00:42:38,930 combined, in which case, if that one returns a positive or a 974 00:42:38,930 --> 00:42:42,830 negative, the total is just positive or negative, because that one's 975 00:42:42,830 --> 00:42:44,340 big. 976 00:42:44,340 --> 00:42:48,370 Or none is that large, in which 977 00:42:48,370 --> 00:42:51,180 case any two can dominate the other one, 978 00:42:51,180 --> 00:42:53,710 and so it's just equivalent to a majority vote. 979 00:42:53,710 --> 00:42:57,070 So I claim we never have to add them when there's only three. 980 00:42:57,070 --> 00:42:58,122 You guys see what I mean? 981 00:42:58,122 --> 00:43:00,330 Like, let's say one of them was 1/2 log of a billion, 982 00:43:00,330 --> 00:43:03,320 and the others were 1/2 log of 3 and 1/2 log of 4. 983 00:43:03,320 --> 00:43:05,500 Obviously, whatever the classifier 984 00:43:05,500 --> 00:43:08,570 that's multiplied by 1/2 log of a billion says, 985 00:43:08,570 --> 00:43:11,834 the total is just going to be that, and the others will be ignored. 986 00:43:11,834 --> 00:43:13,750 However, if it's not the case that one of them 987 00:43:13,750 --> 00:43:15,630 is larger than the other two combined, 988 00:43:15,630 --> 00:43:17,690 then it's a simple vote between the three, 989 00:43:17,690 --> 00:43:20,320 because any two can out-vote the other one 990 00:43:20,320 --> 00:43:21,820 if they work together. 991 00:43:21,820 --> 00:43:28,120 And in this case, let's see, 17/7 is not quite 3. 992 00:43:28,120 --> 00:43:32,160 However, log of 4 is certainly not 993 00:43:32,160 --> 00:43:34,900 bigger than log of 3 plus log of 17/7. 994 00:43:34,900 --> 00:43:36,820 It's not even-- log of 4 is equal to log 995 00:43:36,820 --> 00:43:38,450 of 2 plus log of 2. 996 00:43:38,450 --> 00:43:40,175 And log of 3 and log of 17/7 are both bigger than log of 2.
997 00:43:43,270 --> 00:43:46,930 That's rules of logs-- log of 4 equals log of 2 squared, 998 00:43:46,930 --> 00:43:48,260 and you can take the 2 out. 999 00:43:48,260 --> 00:43:51,357 So these are not big enough that one of them 1000 00:43:51,357 --> 00:43:52,940 is bigger than the other two combined. 1001 00:43:52,940 --> 00:43:54,630 So it's just going to be a simple vote. 1002 00:43:54,630 --> 00:43:55,800 So let's go through. 1003 00:43:55,800 --> 00:44:00,750 Dracula-- OK, he's got tons of his little vampyrettes, 1004 00:44:00,750 --> 00:44:04,670 so E gets it right. 1005 00:44:04,670 --> 00:44:05,770 He's not emo. 1006 00:44:05,770 --> 00:44:06,950 So that gets it wrong. 1007 00:44:06,950 --> 00:44:08,190 But he is evil. 1008 00:44:08,190 --> 00:44:09,070 That gets it right. 1009 00:44:09,070 --> 00:44:12,780 Two out of three vote that he's a vampire-- correct. 1010 00:44:12,780 --> 00:44:13,660 Next. 1011 00:44:13,660 --> 00:44:17,640 Angel-- OK, well, he was in a long-running series. 1012 00:44:17,640 --> 00:44:19,560 He's got plenty of romantic interests. 1013 00:44:19,560 --> 00:44:22,140 So that gets it right. 1014 00:44:22,140 --> 00:44:23,230 He's certainly emo. 1015 00:44:23,230 --> 00:44:24,072 That gets it right. 1016 00:44:24,072 --> 00:44:26,030 And even though he's not evil, two out of three 1017 00:44:26,030 --> 00:44:28,850 says he's a vampire, so correct. 1018 00:44:28,850 --> 00:44:34,010 Next, Edward Cullen-- well, Twilight, here we come. 1019 00:44:34,010 --> 00:44:34,584 Let's see. 1020 00:44:34,584 --> 00:44:36,000 He only has one romantic interest, 1021 00:44:36,000 --> 00:44:37,610 so that gets it wrong. 1022 00:44:37,610 --> 00:44:38,610 OK. 1023 00:44:38,610 --> 00:44:40,410 He's emo, so that gets it right. 1024 00:44:40,410 --> 00:44:42,150 But he's not evil, so two wrong. 1025 00:44:42,150 --> 00:44:44,210 So Edward's not a vampire according 1026 00:44:44,210 --> 00:44:45,820 to our final classifier. 1027 00:44:45,820 --> 00:44:46,880 But he is. 1028 00:44:46,880 --> 00:44:49,620 So we got one of the data points wrong. 1029 00:44:49,620 --> 00:44:50,380 You guys see that? 1030 00:44:50,380 --> 00:44:52,780 Because two out of three of our classifiers 1031 00:44:52,780 --> 00:44:56,405 here said that he was not a vampire. 1032 00:44:56,405 --> 00:44:57,280 All right, let's see. 1033 00:44:57,280 --> 00:45:02,410 Saya-- well, she has more than two romantic interests. 1034 00:45:02,410 --> 00:45:03,610 And she's emo. 1035 00:45:03,610 --> 00:45:06,600 So even though she's not evil, we get it right. 1036 00:45:06,600 --> 00:45:07,290 OK? 1037 00:45:07,290 --> 00:45:08,260 Let's see. 1038 00:45:08,260 --> 00:45:14,950 Lestat-- he also has many love interests, 1039 00:45:14,950 --> 00:45:16,580 is emo, and is not evil. 1040 00:45:16,580 --> 00:45:18,640 So we get it right. 1041 00:45:18,640 --> 00:45:23,140 OK, Bianca is evil with many love interests. 1042 00:45:23,140 --> 00:45:26,620 Even though she's not emo, two out of three, we get it right. 1043 00:45:26,620 --> 00:45:29,670 All right, Carmilla-- I'm going to call her Karnstein-- 1044 00:45:29,670 --> 00:45:31,944 is basically exactly the same as Bianca 1045 00:45:31,944 --> 00:45:34,360 with the number of romantic interests fixed the way it is. 1046 00:45:34,360 --> 00:45:37,460 So she will always do the same thing that Bianca does. 1047 00:45:37,460 --> 00:45:39,790 It's why 6 and 7 always travel together. 1048 00:45:39,790 --> 00:45:41,150 So we get it right.
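Before the walkthrough wraps up, here is that no-single-vote-dominates check written out explicitly, using the three alphas from the rounds above:

$$\alpha_E = \tfrac{1}{2}\ln 4, \qquad \alpha_B = \tfrac{1}{2}\ln 3, \qquad \alpha_A = \tfrac{1}{2}\ln\tfrac{17}{7},$$

$$\ln 3 + \ln\tfrac{17}{7} = \ln\tfrac{51}{7} \approx \ln 7.3 > \ln 4,$$

so even the largest vote, E's, is smaller than the other two combined, and the sign of the weighted sum always agrees with the two-out-of-three majority used in this walkthrough.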
1049 00:45:41,150 --> 00:45:44,710 Sailor Moon is supposed to be not a vampire. 1050 00:45:44,710 --> 00:45:47,320 So her number of love interests says that she's not a vampire 1051 00:45:47,320 --> 00:45:49,040 because she only has one. 1052 00:45:49,040 --> 00:45:51,150 The fact that she's not evil and not emo 1053 00:45:51,150 --> 00:45:53,550 says that actually, she's perfectly not a vampire. 1054 00:45:53,550 --> 00:45:54,810 They all agree. 1055 00:45:54,810 --> 00:45:56,400 And that's correct. 1056 00:45:56,400 --> 00:45:59,560 Squall has only one love interest, Rinoa. 1057 00:45:59,560 --> 00:46:03,020 And he's not evil, both of which say he's not a vampire. 1058 00:46:03,020 --> 00:46:03,710 But he is emo. 1059 00:46:03,710 --> 00:46:05,585 Still, two out of three says he's not a vampire. 1060 00:46:05,585 --> 00:46:06,610 We get it correct. 1061 00:46:06,610 --> 00:46:09,190 And Circe, despite her many romantic interests, 1062 00:46:09,190 --> 00:46:10,750 which say she might be a vampire, 1063 00:46:10,750 --> 00:46:13,870 is neither evil nor emo, and is not a vampire. 1064 00:46:13,870 --> 00:46:16,085 So we got everything right except Edward Cullen, 1065 00:46:16,085 --> 00:46:19,190 which perhaps says more about Stephenie Meyer's writing 1066 00:46:19,190 --> 00:46:21,770 than about our boosting. 1067 00:46:21,770 --> 00:46:25,130 All right, final question-- Wesley Wyndham, 1068 00:46:25,130 --> 00:46:28,200 a fellow consultant, has noticed a few correlations between some 1069 00:46:28,200 --> 00:46:29,780 of the classifiers you used. 1070 00:46:29,780 --> 00:46:33,270 He suggests using a new set of weak classifiers consisting 1071 00:46:33,270 --> 00:46:40,280 of pairs of your classifiers that are logically 1072 00:46:40,280 --> 00:46:41,744 ANDed or ORed together. 1073 00:46:41,744 --> 00:46:43,410 For instance, two of the new classifiers 1074 00:46:43,410 --> 00:46:48,960 would be "emo equals yes OR evil equals yes" and "sparkly equals 1075 00:46:48,960 --> 00:46:51,470 no AND transforms equals yes." 1076 00:46:51,470 --> 00:46:55,230 So that would cut out Sailor Moon from the transforms crowd. 1077 00:46:55,230 --> 00:46:58,000 He believes that you'll be able to classify large vampire data 1078 00:46:58,000 --> 00:46:59,980 sets-- larger than this one, anyway-- 1079 00:46:59,980 --> 00:47:03,160 more quickly, with fewer rounds of boosting, using his system. 1080 00:47:03,160 --> 00:47:04,710 Do you agree or disagree with Wesley? 1081 00:47:04,710 --> 00:47:06,560 Explain your arguments. 1082 00:47:06,560 --> 00:47:09,890 So this was, you know, the tough concept question. 1083 00:47:09,890 --> 00:47:12,037 Does anyone have just an instinctual answer, other 1084 00:47:12,037 --> 00:47:13,370 than, oh, man, it's Wesley. 1085 00:47:13,370 --> 00:47:14,440 He must be wrong? 1086 00:47:18,851 --> 00:47:20,601 AUDIENCE: You'll probably use fewer rounds 1087 00:47:20,601 --> 00:47:23,487 of boosting because you have more classifiers. 1088 00:47:23,487 --> 00:47:26,800 But you'll have to search through more classifiers. 1089 00:47:26,800 --> 00:47:30,360 PROFESSOR: Aha, that is the rare full-point answer. 1090 00:47:30,360 --> 00:47:33,420 Very few people realized that Wesley was partially right. 1091 00:47:33,420 --> 00:47:35,920 They either said something about him being completely wrong, 1092 00:47:35,920 --> 00:47:38,220 which was wrong, or said that he was completely right.
1093 00:47:38,220 --> 00:47:41,740 Yes, it will use fewer rounds of boosting, because 1094 00:47:41,740 --> 00:47:45,970 essentially, one thing boosting already 1095 00:47:45,970 --> 00:47:49,410 does is sort of get classifiers to vote together 1096 00:47:49,410 --> 00:47:50,860 in an [INAUDIBLE] fashion. 1097 00:47:50,860 --> 00:47:52,662 So it will use approximately half 1098 00:47:52,662 --> 00:47:54,370 the number of rounds of boosting by being 1099 00:47:54,370 --> 00:47:55,770 able to combine two classifiers into one. 1100 00:47:55,770 --> 00:47:58,300 But there are a lot of ANDs and ORs. 1101 00:47:58,300 --> 00:47:59,830 There are in fact n choose 2 pairs, where 1102 00:47:59,830 --> 00:48:01,620 n is the number of classifiers. 1103 00:48:01,620 --> 00:48:04,340 And using half the number of rounds while taking n 1104 00:48:04,340 --> 00:48:08,040 choose 2 times as long for each round is not less time. 1105 00:48:08,040 --> 00:48:10,200 So that's exactly correct. 1106 00:48:10,200 --> 00:48:13,520 Not that many people got full credit on that one 1107 00:48:13,520 --> 00:48:16,201 because sometimes, they were seduced by Wesley's idea. 1108 00:48:16,201 --> 00:48:17,700 Or they were just like, it's Wesley. 1109 00:48:17,700 --> 00:48:23,080 He's wrong-- or just some other funny answer. 1110 00:48:23,080 --> 00:48:25,080 Any questions about boosting? 1111 00:48:25,080 --> 00:48:25,580 Question? 1112 00:48:25,580 --> 00:48:27,848 AUDIENCE: How do you know how many rounds of boosting 1113 00:48:27,848 --> 00:48:28,677 to do? 1114 00:48:28,677 --> 00:48:30,260 PROFESSOR: The question is, how do you 1115 00:48:30,260 --> 00:48:32,690 know how many rounds of boosting to do? 1116 00:48:32,690 --> 00:48:36,140 The answer is-- so on the quiz, it 1117 00:48:36,140 --> 00:48:38,070 tells you that you have three. 1118 00:48:38,070 --> 00:48:42,830 In real life, you might want to just kind of keep 1119 00:48:42,830 --> 00:48:46,100 it running until it converges. 1120 00:48:46,100 --> 00:48:48,330 That's one possibility-- keep it running 1121 00:48:48,330 --> 00:48:51,204 until it converges to an answer, and it 1122 00:48:51,204 --> 00:48:52,370 doesn't do anything anymore. 1123 00:48:52,370 --> 00:48:56,592 Patrick has a little widget on the 6.034 website, I think, 1124 00:48:56,592 --> 00:48:58,300 that lets you plunk down some data points 1125 00:48:58,300 --> 00:48:59,710 and run boosting on them. 1126 00:48:59,710 --> 00:49:03,640 And you can see that eventually, it converges. 1127 00:49:03,640 --> 00:49:06,932 The boosting converges to an answer, and it doesn't change. 1128 00:49:06,932 --> 00:49:09,524 AUDIENCE: What converges? 1129 00:49:09,524 --> 00:49:12,390 PROFESSOR: Basically, the-- not the classifiers 1130 00:49:12,390 --> 00:49:14,050 you picked, of course, or the weights. 1131 00:49:14,050 --> 00:49:17,100 But what converges is which ones of your data set 1132 00:49:17,100 --> 00:49:18,470 you get correct. 1133 00:49:18,470 --> 00:49:21,720 Because he does his in two-dimensional space 1134 00:49:21,720 --> 00:49:23,114 rather than like this. 1135 00:49:23,114 --> 00:49:24,780 And he shows you the lines that boosting 1136 00:49:24,780 --> 00:49:26,196 is drawing between classifications, 1137 00:49:26,196 --> 00:49:29,320 and colors things in green and red or something like that. 1138 00:49:29,320 --> 00:49:31,604 And eventually, it converges on where the lines are 1139 00:49:31,604 --> 00:49:33,020 and which ones it's getting right.
1140 00:49:33,020 --> 00:49:36,470 It generally converges to getting everything correct. 1141 00:49:36,470 --> 00:49:40,030 And when that happens, then you can stop. 1142 00:49:40,030 --> 00:49:41,240 But that's a good question. 1143 00:49:41,240 --> 00:49:43,830 And it's not always that easy in the real world. 1144 00:49:43,830 --> 00:49:47,330 You have to sometimes just say, this is enough for me. 1145 00:49:47,330 --> 00:49:50,230 I've given it n rounds. 1146 00:49:50,230 --> 00:49:52,320 And that's many more than the number of classifiers, 1147 00:49:52,320 --> 00:49:55,300 so maybe it won't get anything better.
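As a rough sketch of that stopping rule, here is one way to write the loop, reusing the hypothetical reweight() and boost_round() helpers from the sketches above and assuming, as before, that no weak classifier is perfect on the weighted data:

```python
from fractions import Fraction

def boost_until_converged(labels, classifiers, max_rounds=50):
    """Run boosting rounds until the ensemble's predictions on the
    training data stop changing, or until a round cap is hit."""
    weights = {i: Fraction(1, len(labels)) for i in labels}
    ensemble, prev = [], None            # ensemble: list of (alpha, name)
    for _ in range(max_rounds):
        name, _err, alpha = boost_round(weights, classifiers, labels)
        ensemble.append((alpha, name))
        weights = reweight(weights, {i for i in labels
                                     if classifiers[name][i] != labels[i]})
        # Sign of the alpha-weighted vote for every training point.
        pred = [1 if sum(a * classifiers[n][i] for a, n in ensemble) > 0
                else -1
                for i in sorted(labels)]
        if pred == prev:                 # nothing changed: converged
            break
        prev = pred
    return ensemble
```

The max_rounds cap plays the role of "this is enough for me"-- on real data, convergence isn't guaranteed to come quickly, so you stop either when the predictions settle or when the budget runs out.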