1 00:00:00,070 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high-quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,229 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,229 --> 00:00:17,854 at ocw.mit.edu. 8 00:00:21,530 --> 00:00:25,640 PROFESSOR: Hope everyone had a great Veteran's Day break 9 00:00:25,640 --> 00:00:28,670 yesterday, spending it, of course, 10 00:00:28,670 --> 00:00:32,049 as I check my test audience, spending it mostly doing 11 00:00:32,049 --> 00:00:34,050 your [INAUDIBLE] research or PSETs. 12 00:00:34,050 --> 00:00:37,910 But at least one person got to watch TV. 13 00:00:37,910 --> 00:00:40,830 So at least one person got to have a real break, 14 00:00:40,830 --> 00:00:44,170 and that's something truly amazing and special. 15 00:00:44,170 --> 00:00:48,160 So now we're going to talk about SVMs. 16 00:00:48,160 --> 00:00:51,990 They're pretty much the hardest thing in 6.034. 17 00:00:51,990 --> 00:00:58,280 However, in recent years a few shortcuts 18 00:00:58,280 --> 00:01:00,384 have popped up that will sometimes 19 00:01:00,384 --> 00:01:02,800 allow you to solve the question, depending on what they're 20 00:01:02,800 --> 00:01:08,170 asking for, without solving some vast ugly set of equations 21 00:01:08,170 --> 00:01:10,630 with a vast ugly number of unknowns. 22 00:01:10,630 --> 00:01:12,410 So I'm going to show that to you guys. 23 00:01:12,410 --> 00:01:14,910 I'm also going to try to explain all the alphabet 24 00:01:14,910 --> 00:01:18,620 soup that's in [INAUDIBLE] and what all the letters stand 25 00:01:18,620 --> 00:01:26,260 for because it took me a few times going through SVMs. 
26 00:01:26,260 --> 00:01:28,810 It took me a few times going through SVMs 27 00:01:28,810 --> 00:01:32,174 to actually find out for sure what 28 00:01:32,174 --> 00:01:33,340 all those letters stood for. 29 00:01:33,340 --> 00:01:36,630 And if you guys figure it out first try, 30 00:01:36,630 --> 00:01:40,130 that's going to be great, and you guys will be just fine. 31 00:01:40,130 --> 00:01:44,490 So let's take a look at the problem that's perhaps 32 00:01:44,490 --> 00:01:49,470 most optimized for using some of the shortcuts to solving it 33 00:01:49,470 --> 00:01:53,560 and not putting up all the equations. 34 00:01:53,560 --> 00:01:57,180 Then I will-- not because I'm sadistic, 35 00:01:57,180 --> 00:02:00,630 but because I'm being nice, I will force you with me 36 00:02:00,630 --> 00:02:05,240 to solve some of the things they didn't ask for us to solve so 37 00:02:05,240 --> 00:02:07,260 that you can see that we can't get away 38 00:02:07,260 --> 00:02:11,680 with everything without doing some of the harder stuff. 39 00:02:11,680 --> 00:02:16,770 And of course, definitely ask questions as always, 40 00:02:16,770 --> 00:02:19,330 but this time even more so. 41 00:02:19,330 --> 00:02:21,350 You guys, well, if you were looking around, 42 00:02:21,350 --> 00:02:25,520 you saw nobody in this entire lecture hall 43 00:02:25,520 --> 00:02:28,880 raised their hand that they are already set and ready 44 00:02:28,880 --> 00:02:30,120 and know SVMs. 45 00:02:30,120 --> 00:02:34,390 So if you have a question, maybe everybody else does. 46 00:02:34,390 --> 00:02:38,030 So let's go. 47 00:02:38,030 --> 00:02:41,180 We'll start right here. 48 00:02:41,180 --> 00:02:43,520 As always, pretend that I can draw, 49 00:02:43,520 --> 00:02:45,870 and that therefore all the pluses and minuses are only 50 00:02:45,870 --> 00:02:48,105 on integer coordinates. 51 00:02:50,690 --> 00:02:55,620 So we are asked in this problem to circle the support vectors. 
52 00:02:55,620 --> 00:02:59,020 Draw the edges of the street and then the dotted line 53 00:02:59,020 --> 00:03:00,850 in the middle that separates them, 54 00:03:00,850 --> 00:03:05,940 the separator, as a dashed line. 55 00:03:05,940 --> 00:03:11,140 And then to give w and b. 56 00:03:11,140 --> 00:03:14,920 So what are w and b? 57 00:03:14,920 --> 00:03:21,300 Well, there's a few important equations in SVMs 58 00:03:21,300 --> 00:03:22,924 that we really hope-- and I'm going 59 00:03:22,924 --> 00:03:25,340 to tell you we're lucky in this because we don't have to-- 60 00:03:25,340 --> 00:03:26,910 but we really hope that we don't have 61 00:03:26,910 --> 00:03:30,960 to use because they provide a huge number of variables. 62 00:03:30,960 --> 00:03:33,500 So one of those crucial equations 63 00:03:33,500 --> 00:03:43,190 is that for a plus support vector, w vector dot x plus, 64 00:03:43,190 --> 00:03:48,095 the plus support vector, plus b equals 1. 65 00:03:50,885 --> 00:03:55,630 w dot x minus plus b equals minus 1. 66 00:03:55,630 --> 00:04:00,490 And w dot that dotted line-- I don't know, 67 00:04:00,490 --> 00:04:06,140 we'll call it dot dot dot-- plus b equals 0. 68 00:04:06,140 --> 00:04:07,410 So what does this mean? 69 00:04:07,410 --> 00:04:09,400 There are a lot of vectors. 70 00:04:09,400 --> 00:04:12,430 Well, I mean, we're usually in two-dimensional space, 71 00:04:12,430 --> 00:04:14,810 so we can basically just say that there's 72 00:04:14,810 --> 00:04:17,899 two components of this w vector, w1 and w2. 73 00:04:17,899 --> 00:04:20,130 And they're just two coefficients 74 00:04:20,130 --> 00:04:21,810 in a linear equation. 75 00:04:21,810 --> 00:04:25,850 So for instance, what we're interested in finding, 76 00:04:25,850 --> 00:04:27,690 this dot dot dot line, we'll just 77 00:04:27,690 --> 00:04:29,920 call that x, so with nothing on it. 78 00:04:29,920 --> 00:04:31,840 Actually, maybe that'll be easier. 
79 00:04:31,840 --> 00:04:44,900 This is equivalent to saying w1x1 plus w2x2 plus b equals 0, 80 00:04:44,900 --> 00:04:49,380 where x1 is this, and x2 is this. 81 00:04:49,380 --> 00:04:52,210 We would possibly call them x and y. 82 00:04:52,210 --> 00:04:55,690 So one way to think about it is w1, 83 00:04:55,690 --> 00:05:03,200 we'll call it a, ax plus-- call w2 b-- by. 84 00:05:03,200 --> 00:05:05,890 Oh, don't call it b. 85 00:05:05,890 --> 00:05:12,340 Well, ax plus cy plus b equals-- I'll put this all in 86 00:05:12,340 --> 00:05:15,740 parentheses-- this is basically an equation like this. 87 00:05:15,740 --> 00:05:29,830 Or y equals negative a over c x minus b over c. 88 00:05:29,830 --> 00:05:33,270 It's basically y equals mx plus b. 89 00:05:33,270 --> 00:05:34,394 Does everyone see that? 90 00:05:34,394 --> 00:05:35,810 This thing that we're looking for, 91 00:05:35,810 --> 00:05:39,490 this w dot x plus b equals 0, is the equation 92 00:05:39,490 --> 00:05:43,440 of a line in Cartesian coordinates. 93 00:05:43,440 --> 00:05:45,840 It just looks uglier. 94 00:05:45,840 --> 00:05:50,320 Normally, when we're doing all this solving for w and b, 95 00:05:50,320 --> 00:05:54,790 we would have to put in tons of equations, 96 00:05:54,790 --> 00:05:57,940 plug in all of the support vectors in there. 97 00:05:57,940 --> 00:06:03,220 And we'd have to use these little devils called alphas. 98 00:06:03,220 --> 00:06:06,280 Alphas essentially-- if it wasn't clear 99 00:06:06,280 --> 00:06:08,650 in the lecture, which it usually isn't completely 100 00:06:08,650 --> 00:06:12,420 clear to everyone, wasn't clear to me completely-- alphas, 101 00:06:12,420 --> 00:06:14,120 the way I like to think about them, 102 00:06:14,120 --> 00:06:16,630 the alphas in this problem is they 103 00:06:16,630 --> 00:06:20,390 are the weight of how significant any particular 104 00:06:20,390 --> 00:06:25,280 point on the graph is towards creating the boundary. 
105 00:06:25,280 --> 00:06:28,845 The higher the alpha is, the more 106 00:06:28,845 --> 00:06:32,210 that that point narrows in the boundary. 107 00:06:32,210 --> 00:06:34,740 The lower the alpha is, the less that that point narrows 108 00:06:34,740 --> 00:06:38,570 in the boundary, the wider the road can be. 109 00:06:38,570 --> 00:06:40,320 And if that point doesn't do anything, 110 00:06:40,320 --> 00:06:42,327 if that point is irrelevant and could be removed 111 00:06:42,327 --> 00:06:44,410 and it wouldn't affect the boundary, the alpha is? 112 00:06:44,410 --> 00:06:45,230 Everyone? 113 00:06:45,230 --> 00:06:45,964 AUDIENCE: Zero. 114 00:06:45,964 --> 00:06:46,630 PROFESSOR: Zero. 115 00:06:46,630 --> 00:06:49,940 Well, that was one person, but you can suffice for everyone. 116 00:06:49,940 --> 00:06:51,060 The alpha is 0. 117 00:06:51,060 --> 00:06:53,620 And that means if it's not a support vector, if it's not 118 00:06:53,620 --> 00:06:56,520 one of the vectors on the boundary lines, 119 00:06:56,520 --> 00:07:02,210 it will always have an alpha of 0 because it doesn't affect. 120 00:07:02,210 --> 00:07:06,380 So keeping that in mind, there's a few fun 121 00:07:06,380 --> 00:07:08,072 and important equations about alphas 122 00:07:08,072 --> 00:07:10,030 that we'll need if we're solving many equations 123 00:07:10,030 --> 00:07:13,990 for many unknowns, which hopefully we won't have to do. 124 00:07:13,990 --> 00:07:18,380 The sum over the positive alphas equals 125 00:07:18,380 --> 00:07:23,300 the sum over the alphas-- the negative points. 126 00:07:23,300 --> 00:07:25,530 And this is true over all the points. 
127 00:07:25,530 --> 00:07:28,830 But since all of the alphas are 0, 128 00:07:28,830 --> 00:07:30,560 except for the support vectors, it also 129 00:07:30,560 --> 00:07:32,710 means the sum of the alphas of the positive support 130 00:07:32,710 --> 00:07:35,250 vectors is equal to the sum of the alphas of the negative support 131 00:07:35,250 --> 00:07:36,490 vectors. 132 00:07:36,490 --> 00:07:41,070 Additionally, our old buddy, the w vector, 133 00:07:41,070 --> 00:07:46,450 is equal to the sum over all i that 134 00:07:46,450 --> 00:07:56,800 are plus vectors of wi alpha i minus the sum over j 135 00:07:56,800 --> 00:08:01,500 that are minus vectors of wj alpha j. 136 00:08:04,060 --> 00:08:07,510 Now, all of these equations can be used in a bloody mess 137 00:08:07,510 --> 00:08:11,110 to figure out the answer to what we're trying to find, 138 00:08:11,110 --> 00:08:13,940 which is circle-- well, actually, they 139 00:08:13,940 --> 00:08:15,580 can't be used to circle the support vectors 140 00:08:15,580 --> 00:08:17,820 and draw the dotted line. 141 00:08:17,820 --> 00:08:20,450 But once we do that, all these equations 142 00:08:20,450 --> 00:08:23,860 can be used in a bloody mess to give us the next thing that we 143 00:08:23,860 --> 00:08:25,380 want, which is w and b. 144 00:08:30,390 --> 00:08:35,120 So fortunately, there's another way to get w and b. 145 00:08:35,120 --> 00:08:38,429 If you guys really want, at the end of the hour 146 00:08:38,429 --> 00:08:43,080 we can also try to derive w and b using 147 00:08:43,080 --> 00:08:44,740 the many equations in many unknowns, 148 00:08:44,740 --> 00:08:46,200 but it's a bit painful. 149 00:08:46,200 --> 00:08:47,930 We'll try to do it the cool way. 150 00:08:47,930 --> 00:08:50,545 So let's start off. 151 00:08:50,545 --> 00:08:51,920 This is the one we're looking at. 152 00:08:51,920 --> 00:08:54,590 We need to find where the support vectors are. 153 00:08:54,590 --> 00:08:59,010 So the first thing we need to do is simply eye it. 
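For reference, the equations mentioned so far can be written out as below. This is a summary sketch, not the board itself: the subscripts and the "plus"/"minus" index sets are notation assumed here, and the sums in the last line are taken over the data points x_i, which is the usual form of that equation.

```latex
\begin{align*}
\vec{w}\cdot\vec{x}_{+} + b &= 1   && \text{(plus support vector)}\\
\vec{w}\cdot\vec{x}_{-} + b &= -1  && \text{(minus support vector)}\\
\vec{w}\cdot\vec{x} + b &= 0       && \text{(separator, the dashed line)}\\
w_1 x + w_2 y + b = 0 \;&\Longleftrightarrow\; y = -\frac{w_1}{w_2}\,x - \frac{b}{w_2} && (w_2 \neq 0)\\
\sum_{i \in \text{plus}} \alpha_i &= \sum_{j \in \text{minus}} \alpha_j\\
\vec{w} &= \sum_{i \in \text{plus}} \alpha_i\,\vec{x}_i \;-\; \sum_{j \in \text{minus}} \alpha_j\,\vec{x}_j
\end{align*}
```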
154 00:08:59,010 --> 00:09:00,970 Fortunately, on the test, there will always 155 00:09:00,970 --> 00:09:02,960 be ones that you can eye if you're supposed 156 00:09:02,960 --> 00:09:04,850 to circle the support vectors. 157 00:09:04,850 --> 00:09:08,200 There's obviously some number of pluses 158 00:09:08,200 --> 00:09:09,460 and some number of minuses. 159 00:09:09,460 --> 00:09:11,564 I say obviously, but maybe not. 160 00:09:11,564 --> 00:09:12,980 But hopefully obviously, and we'll 161 00:09:12,980 --> 00:09:15,490 find out because I'm going to call on random people. 162 00:09:15,490 --> 00:09:19,899 So give me a positive support vector. 163 00:09:19,899 --> 00:09:28,120 AUDIENCE: Um, going to the one that looked like [INAUDIBLE]? 164 00:09:28,120 --> 00:09:31,210 PROFESSOR: Which plus sign, [INAUDIBLE]? 165 00:09:31,210 --> 00:09:32,580 AUDIENCE: One on the right. 166 00:09:32,580 --> 00:09:33,380 PROFESSOR: The one all the way on the right. 167 00:09:33,380 --> 00:09:35,840 Yeah, that plus sign is a positive support vector. 168 00:09:35,840 --> 00:09:37,170 That's good. 169 00:09:37,170 --> 00:09:37,920 All right? 170 00:09:37,920 --> 00:09:39,256 Excellent. 171 00:09:39,256 --> 00:09:40,880 Now, give me a negative support vector. 172 00:09:44,810 --> 00:09:45,310 That one? 173 00:09:49,360 --> 00:09:50,114 No? 174 00:09:50,114 --> 00:09:51,030 AUDIENCE: Yeah, sorry. 175 00:09:51,030 --> 00:09:52,113 PROFESSOR: Ah, no problem. 176 00:09:52,113 --> 00:09:53,530 Give me a negative support vector. 177 00:09:53,530 --> 00:09:54,904 AUDIENCE: I should definitely ask 178 00:09:54,904 --> 00:09:56,470 you, what's a support vector? 179 00:09:56,470 --> 00:09:57,955 [LAUGHTER] 180 00:09:57,955 --> 00:09:59,440 PROFESSOR: That is a good question. 181 00:09:59,440 --> 00:10:03,720 The question is, what is a support vector? 182 00:10:03,720 --> 00:10:08,600 How many other people will admit to having this question? 183 00:10:08,600 --> 00:10:09,100 See? 
184 00:10:09,100 --> 00:10:12,225 You're not alone. 185 00:10:12,225 --> 00:10:12,725 OK. 186 00:10:15,510 --> 00:10:18,390 Before I go on, I'm going to assume-- you guys make 187 00:10:18,390 --> 00:10:22,331 sure I'm correct-- Monday was, just being sure 188 00:10:22,331 --> 00:10:23,580 so I can tailor based on this. 189 00:10:23,580 --> 00:10:27,230 Monday was the support vector machine lecture. 190 00:10:27,230 --> 00:10:30,470 But it was also very difficult to follow. 191 00:10:30,470 --> 00:10:32,460 That's what I usually expect. 192 00:10:32,460 --> 00:10:35,170 So what is a support vector? 193 00:10:35,170 --> 00:10:40,350 Well, all these pluses and minuses, if we were me, 194 00:10:40,350 --> 00:10:43,600 and if, I guess-- yeah, if we were me 195 00:10:43,600 --> 00:10:47,490 and if I was describing this problem, the one that we work 196 00:10:47,490 --> 00:10:49,970 out in class, I would call them points 197 00:10:49,970 --> 00:10:51,840 because they're on the graph. 198 00:10:51,840 --> 00:10:53,320 They're points. 199 00:10:53,320 --> 00:10:55,515 They're data points. 200 00:10:55,515 --> 00:10:59,240 But however, in more difficult versions of this problem 201 00:10:59,240 --> 00:11:01,760 that have n dimensions, where n is 202 00:11:01,760 --> 00:11:03,330 some ridiculous number of dimensions 203 00:11:03,330 --> 00:11:05,040 that you're never going to graph. 204 00:11:05,040 --> 00:11:08,310 Like say some of the research I'm doing now, 205 00:11:08,310 --> 00:11:10,060 I could use support vector machines 206 00:11:10,060 --> 00:11:15,420 on some of these articles that I'm reading about cyber events 207 00:11:15,420 --> 00:11:17,520 to try to figure out if there's a real event 208 00:11:17,520 --> 00:11:19,780 or if it's just someone complaining about how we're 209 00:11:19,780 --> 00:11:21,540 really vulnerable or something like that 210 00:11:21,540 --> 00:11:23,550 and no event actually happened. 
211 00:11:23,550 --> 00:11:25,820 So the reason why they call these guys vectors is 212 00:11:25,820 --> 00:11:28,990 when you're not able to graph them on a Cartesian plane, 213 00:11:28,990 --> 00:11:33,790 there's still this long vector of many different dimensions. 214 00:11:33,790 --> 00:11:37,750 Right now, though, these points represent the vectors. 215 00:11:37,750 --> 00:11:39,134 This is very simple. 216 00:11:39,134 --> 00:11:40,550 It's easier to view them this way. 217 00:11:40,550 --> 00:11:44,685 But for instance, that plus at negative 1, 2 218 00:11:44,685 --> 00:11:46,060 represents the fact that there is 219 00:11:46,060 --> 00:11:49,670 a vector going in the direction of negative 1, 2 220 00:11:49,670 --> 00:11:55,415 with a magnitude such that it reaches negative 1, 2. 221 00:11:55,415 --> 00:11:57,540 So all these points are just a point representation 222 00:11:57,540 --> 00:11:58,180 of a vector. 223 00:11:58,180 --> 00:12:01,400 You probably, in any class that worked with vectors, 224 00:12:01,400 --> 00:12:04,330 saw this, saw vectors being represented as points. 225 00:12:04,330 --> 00:12:04,830 Question? 226 00:12:05,593 --> 00:12:07,865 AUDIENCE: Always with respect to the origin? 227 00:12:07,865 --> 00:12:08,490 PROFESSOR: Yes. 228 00:12:08,490 --> 00:12:10,680 The question is, always with respect to the origin? 229 00:12:10,680 --> 00:12:12,842 The answer is canonically, when vectors 230 00:12:12,842 --> 00:12:15,340 are represented as points, yes, it's always 231 00:12:15,340 --> 00:12:17,820 with respect to the origin. 232 00:12:17,820 --> 00:12:21,604 So that's the basic idea, that all these points are vectors. 233 00:12:21,604 --> 00:12:22,770 So what are support vectors? 234 00:12:22,770 --> 00:12:25,400 Well, you could call them support points for this case. 
235 00:12:25,400 --> 00:12:27,191 But the reason we call them support vectors 236 00:12:27,191 --> 00:12:28,800 is again, in the generalized case 237 00:12:28,800 --> 00:12:31,980 that you might be doing in the real world with real AI, 238 00:12:31,980 --> 00:12:33,480 you're going to have a giant vector. 239 00:12:33,480 --> 00:12:36,560 And it's not just going to be points on a graph. 240 00:12:36,560 --> 00:12:37,850 Well, usually. 241 00:12:37,850 --> 00:12:41,970 So the support vectors, the support points, 242 00:12:41,970 --> 00:12:44,000 we found one of them correctly. 243 00:12:44,000 --> 00:12:45,660 It's this guy. 244 00:12:45,660 --> 00:12:50,124 They're going to be the ones that again, they 245 00:12:50,124 --> 00:12:51,290 don't have an alpha of zero. 246 00:12:51,290 --> 00:12:54,740 They're the ones that bind in the, as Petra calls it, 247 00:12:54,740 --> 00:12:57,780 the road, the boundary lines. 248 00:12:57,780 --> 00:12:59,930 They're going to be on the edge of plus. 249 00:12:59,930 --> 00:13:02,010 Whichever direction we draw it, this plus 250 00:13:02,010 --> 00:13:04,616 is the edge of the plus region. 251 00:13:04,616 --> 00:13:06,947 If we made this the edge of the plus region 252 00:13:06,947 --> 00:13:09,030 and everything on this side is plus and everything 253 00:13:09,030 --> 00:13:10,710 on this side is minus, we'd be screwed 254 00:13:10,710 --> 00:13:15,170 because there's two pluses on the other side of that. 255 00:13:15,170 --> 00:13:18,040 Generally, when trying to find a support vector, 256 00:13:18,040 --> 00:13:22,100 you do something a little bit similar to my crazy method 257 00:13:22,100 --> 00:13:24,950 of doing nearest neighbors, and try 258 00:13:24,950 --> 00:13:30,860 to find a plus-minus pair that's close to each other. 
259 00:13:30,860 --> 00:13:33,650 Sometimes though, it's not just two points 260 00:13:33,650 --> 00:13:35,900 because sometimes if you try to draw the simple-minded 261 00:13:35,900 --> 00:13:38,760 thing, which is the perpendicular 262 00:13:38,760 --> 00:13:41,780 bisector of the two points, you get screwed because there's 263 00:13:41,780 --> 00:13:43,800 another point in your way. 264 00:13:43,800 --> 00:13:46,400 So now that I've given away a clue, 265 00:13:46,400 --> 00:13:48,722 let's go-- and hopefully that made sense to you guys. 266 00:13:48,722 --> 00:13:50,805 The support vectors are the ones on the edges that 267 00:13:50,805 --> 00:13:53,320 are just barely a plus for sure, or just barely 268 00:13:53,320 --> 00:13:54,390 a minus for sure. 269 00:13:54,390 --> 00:13:55,040 Let's go back. 270 00:13:55,040 --> 00:13:56,900 Can you give me a negative support vector? 271 00:13:56,900 --> 00:13:57,852 AUDIENCE: The top one? 272 00:13:57,852 --> 00:13:58,477 PROFESSOR: Hmm? 273 00:13:58,477 --> 00:14:00,055 AUDIENCE: The top negative point? 274 00:14:00,055 --> 00:14:02,380 PROFESSOR: The one on the top left? 275 00:14:02,380 --> 00:14:02,880 Yes. 276 00:14:07,214 --> 00:14:09,630 And does anyone think that there's a third support vector? 277 00:14:09,630 --> 00:14:12,420 Well, let's simple-mindedly try the thing that-- remember, 278 00:14:12,420 --> 00:14:14,160 support vectors always attempt to have 279 00:14:14,160 --> 00:14:18,470 the widest possible space between the pluses and minuses 280 00:14:18,470 --> 00:14:19,600 that they can. 281 00:14:19,600 --> 00:14:23,240 So let's simple-mindedly try to do the perpendicular bisector 282 00:14:23,240 --> 00:14:26,380 and see if screws us over. 283 00:14:26,380 --> 00:14:29,050 So when we simple-mindedly do the perpendicular bisector, 284 00:14:29,050 --> 00:14:39,330 it goes through here like this. 285 00:14:39,330 --> 00:14:41,665 And it's just fine. 
286 00:14:41,665 --> 00:14:43,415 So these are the only two support vectors. 287 00:14:59,620 --> 00:15:02,490 And there's our divider line. 288 00:15:02,490 --> 00:15:06,210 So we're on the home stretch. 289 00:15:06,210 --> 00:15:08,160 But we have to find w and b. 290 00:15:08,160 --> 00:15:14,160 In olden days, we would find w and b by plugging in w dot 291 00:15:14,160 --> 00:15:17,130 the plus support vector, plus b equals 1. 292 00:15:17,130 --> 00:15:18,600 Oh, that's very crucial. 293 00:15:18,600 --> 00:15:21,800 These w dot x plus and x minus equations, 294 00:15:21,800 --> 00:15:23,780 equaling 1 or negative 1, are 295 00:15:23,780 --> 00:15:26,540 only true for support vectors. 296 00:15:26,540 --> 00:15:31,630 It's always true that w dot any positive point plus b will 297 00:15:31,630 --> 00:15:34,520 be some positive number. 298 00:15:34,520 --> 00:15:37,730 But it won't always be 1. 299 00:15:37,730 --> 00:15:42,870 In fact, it will always be greater than 1 up over here. 300 00:15:42,870 --> 00:15:45,275 It will always be less than -1 down over there. 301 00:15:48,310 --> 00:15:53,330 In olden days, we would plug in -1, 2 into this equation. 302 00:15:53,330 --> 00:15:57,670 We would plug in 3, -2 into this equation. 303 00:15:57,670 --> 00:16:00,557 We'd plug in alpha plus equals alpha minus in its sums. 304 00:16:00,557 --> 00:16:02,140 And since there's only one plus and one minus, 305 00:16:02,140 --> 00:16:04,030 we'd know they were equal. 306 00:16:04,030 --> 00:16:07,440 And then we'd fidget around with this w equation. 307 00:16:07,440 --> 00:16:10,880 However, there is a better way to do it. 308 00:16:10,880 --> 00:16:14,720 And so let's use this cheap strategy 309 00:16:14,720 --> 00:16:17,310 to solve this version of the SVM. 310 00:16:17,310 --> 00:16:18,180 Here's how. 311 00:16:18,180 --> 00:16:22,730 First, and I know I didn't draw these completely straight. 312 00:16:22,730 --> 00:16:24,030 Sorry. 
313 00:16:24,030 --> 00:16:28,060 But can anyone, by looking at-- this is 3, -2. 314 00:16:28,060 --> 00:16:30,150 And this is -1, 2. 315 00:16:30,150 --> 00:16:34,680 Can anyone tell me what the equation-- you can do y 316 00:16:34,680 --> 00:16:35,720 equals mx plus b. 317 00:16:35,720 --> 00:16:38,011 Can anyone tell me what the equation of the dotted line 318 00:16:38,011 --> 00:16:40,354 is supposed to be if I was good at drawing? 319 00:16:40,354 --> 00:16:42,764 AUDIENCE: [SEVERAL ANSWERS] 320 00:16:42,764 --> 00:16:46,138 PROFESSOR: People say y equals x minus 1. 321 00:16:46,138 --> 00:16:50,100 And I say yes, y equals x minus 1. 322 00:16:50,100 --> 00:16:51,800 So therefore, the pluses would be 323 00:16:51,800 --> 00:16:57,330 y is greater than or equal to x minus 1 indeed. 324 00:16:57,330 --> 00:17:02,710 So we've already seen that w dot x plus b somehow can 325 00:17:02,710 --> 00:17:05,640 be converted into this form. 326 00:17:05,640 --> 00:17:06,480 Right? 327 00:17:06,480 --> 00:17:09,730 So therefore, if we have y equals x minus 1, 328 00:17:09,730 --> 00:17:19,490 then we know that we have w dot x plus b equals 0. 329 00:17:19,490 --> 00:17:21,250 Let's do that here. 330 00:17:21,250 --> 00:17:28,730 So we know that w1 x1. 331 00:17:28,730 --> 00:17:30,450 We can even call it x and y. 332 00:17:30,450 --> 00:17:32,170 I think it'll be fine. 333 00:17:32,170 --> 00:17:33,720 No one will come after us. 334 00:17:33,720 --> 00:17:39,160 w2 y plus b equals 0. 335 00:17:39,160 --> 00:17:41,770 But we also know that y equals x minus 1, 336 00:17:41,770 --> 00:17:48,060 which means that if y equals x minus 1, 337 00:17:48,060 --> 00:17:51,310 then according to this thing we have over here, 338 00:17:51,310 --> 00:17:57,915 then negative w1 over w2 equals-- 339 00:18:01,020 --> 00:18:12,340 So we know that negative w1 over w2-- and we have -b over w2. 340 00:18:12,340 --> 00:18:15,960 So y equals x minus 1. 
341 00:18:15,960 --> 00:18:18,330 And if we solve this equation to make it look like this, 342 00:18:18,330 --> 00:18:29,220 we would have y equals negative w1 over w2 minus b over w2. 343 00:18:29,220 --> 00:18:37,920 So we know that in some way, shape, or form-- we know that 344 00:18:37,920 --> 00:18:43,100 then therefore, w1 over w2 is some scalar 345 00:18:43,100 --> 00:18:47,900 multiple of minus 1. 346 00:18:47,900 --> 00:18:53,280 And we know that b over w2 is, in fact, some scalar 347 00:18:53,280 --> 00:18:54,910 multiple of positive 1. 348 00:18:54,910 --> 00:18:56,660 Scalar multiple, what's a scalar multiple? 349 00:18:56,660 --> 00:18:58,410 Well, why is it a scalar multiple? 350 00:18:58,410 --> 00:19:01,520 Why isn't it just going to be negative 1 or positive 1? 351 00:19:05,510 --> 00:19:07,530 Just because in this equation, we 352 00:19:07,530 --> 00:19:11,284 can multiply the entire equation by any number 353 00:19:11,284 --> 00:19:13,200 and it will still have the same boundary line. 354 00:19:16,480 --> 00:19:17,255 You guys see that? 355 00:19:19,834 --> 00:19:22,320 Oh, there's an x here. 356 00:19:22,320 --> 00:19:26,170 If we multiplied everything, since it's all divided by w2. 357 00:19:26,170 --> 00:19:31,310 If we doubled w2, but also doubled b and w1, 358 00:19:31,310 --> 00:19:33,125 it would be the exact same equation. 359 00:19:33,125 --> 00:19:35,010 Do you guys agree? 360 00:19:35,010 --> 00:19:37,356 So there's, in fact, infinitely many possible equations. 361 00:19:37,356 --> 00:19:38,480 You say, well, great, Mark. 362 00:19:38,480 --> 00:19:40,450 You've figured out what form it is. 363 00:19:40,450 --> 00:19:45,260 So you figured out that w1 over w2 364 00:19:45,260 --> 00:19:51,040 equals some scalar multiple of negative 1. 365 00:19:51,040 --> 00:19:53,600 So it's negative 1 times-- what's 366 00:19:53,600 --> 00:19:55,446 everyone's favorite letter? 367 00:19:55,446 --> 00:19:56,179 AUDIENCE: k. 
368 00:19:56,179 --> 00:19:56,720 PROFESSOR: k. 369 00:19:56,720 --> 00:19:58,440 Negative 1 times k. 370 00:19:58,440 --> 00:20:05,180 And we figured out that b over w2 is-- 371 00:20:05,180 --> 00:20:08,320 I guess we can just do negative -- is positive k. 372 00:20:08,320 --> 00:20:09,000 But what's k? 373 00:20:09,000 --> 00:20:10,416 How are we going to figure it out? 374 00:20:10,416 --> 00:20:11,600 Well, it's a good question. 375 00:20:11,600 --> 00:20:12,600 And I will tell you how. 376 00:20:15,440 --> 00:20:19,090 I will assert the following fact as true without proof. 377 00:20:19,090 --> 00:20:20,320 Then I will not prove it. 378 00:20:23,720 --> 00:20:28,640 1 over the magnitude of w, which is this vector here 379 00:20:28,640 --> 00:20:36,300 with w1 and w2, equals this where 380 00:20:36,300 --> 00:20:38,720 this is that line that I just drew, the line from here 381 00:20:38,720 --> 00:20:39,870 to this point. 382 00:20:45,690 --> 00:20:47,840 1 over the magnitude of w equals this. 383 00:20:50,850 --> 00:20:54,970 Therefore, since 1 over the magnitude of w equals this, 384 00:20:54,970 --> 00:20:58,250 and this equals, I believe, 2 root 2, 385 00:20:58,250 --> 00:21:03,550 because we're going over 2, down 2, 386 00:21:03,550 --> 00:21:05,200 Pythagorean Theorem, 2 root 2. 387 00:21:09,930 --> 00:21:13,760 So therefore, flip everything over. 388 00:21:13,760 --> 00:21:16,740 Magnitude of w equals 1 over 2 root 2. 389 00:21:16,740 --> 00:21:19,430 So therefore, magnitude of w equals root 2 over 4. 390 00:21:23,520 --> 00:21:26,130 But why are we OK? 391 00:21:26,130 --> 00:21:29,900 Well, how do we calculate the magnitude of w? 392 00:21:29,900 --> 00:21:33,560 Do people know, in general, magnitudes of vectors? 393 00:21:33,560 --> 00:21:37,720 Generally, for these vectors, we do it 394 00:21:37,720 --> 00:21:41,990 by the square root of the sum of the components squared. 
395 00:21:41,990 --> 00:21:49,290 So the square root of w1 squared plus w2 squared 396 00:21:49,290 --> 00:21:50,300 equals root 2 over 4. 397 00:21:50,300 --> 00:21:52,610 But that's not all. 398 00:21:52,610 --> 00:22:00,710 That's not all, we say, because we know from this over here 399 00:22:00,710 --> 00:22:04,412 that the ratio of w1 and w2 is-- 400 00:22:04,412 --> 00:22:07,090 AUDIENCE: [SEVERAL ANSWERS] 401 00:22:07,090 --> 00:22:13,900 PROFESSOR: Yeah, the ratio of w1 and w2 402 00:22:13,900 --> 00:22:18,340 is going to be-- actually, sorry. 403 00:22:18,340 --> 00:22:20,150 I shouldn't put a k here. 404 00:22:20,150 --> 00:22:23,490 I realize I probably have been confusing you guys a lot. 405 00:22:23,490 --> 00:22:25,475 w1 over w2 is just -1. 406 00:22:25,475 --> 00:22:27,510 b over w2 is just 1. 407 00:22:27,510 --> 00:22:29,570 That's just a fact. 408 00:22:29,570 --> 00:22:30,600 There's no k. 409 00:22:30,600 --> 00:22:35,630 The k is to determine what w1 and w2 are. 410 00:22:35,630 --> 00:22:38,403 So w1 equals -k. 411 00:22:38,403 --> 00:22:41,760 And w2 equals positive k. 412 00:22:41,760 --> 00:22:44,156 And b equals also positive k. 413 00:22:47,560 --> 00:22:49,790 By the way, here's a question for you. 414 00:22:49,790 --> 00:22:54,830 Could I have put the negative sign on w2 and b 415 00:22:54,830 --> 00:22:55,860 instead of on w1? 416 00:22:58,600 --> 00:22:59,970 So many people said yes. 417 00:22:59,970 --> 00:23:01,250 That's a very smart answer. 418 00:23:01,250 --> 00:23:05,080 Actually, no, because of the fact 419 00:23:05,080 --> 00:23:09,270 that the pluses are on the negative x-axis. 420 00:23:09,270 --> 00:23:11,402 It's just a little trick I picked up. 421 00:23:11,402 --> 00:23:13,902 When one of them is negative and the other one isn't, follow 422 00:23:13,902 --> 00:23:15,400 the pluses. 423 00:23:15,400 --> 00:23:18,620 So we know that w1 is -k, w2 is positive k, 424 00:23:18,620 --> 00:23:20,196 and b is positive k. 
425 00:23:20,196 --> 00:23:22,455 w1 over w2 is -1. 426 00:23:22,455 --> 00:23:24,700 b over w2 is positive 1. 427 00:23:24,700 --> 00:23:27,460 So what do we know about the ratio of w1 and w2? 428 00:23:30,600 --> 00:23:32,720 It's equal to -1. 429 00:23:32,720 --> 00:23:34,810 And that means that when we square it, 430 00:23:34,810 --> 00:23:38,290 w1 squared equals w2 squared. 431 00:23:38,290 --> 00:23:40,720 So therefore, this is the square root 432 00:23:40,720 --> 00:23:46,540 of 2 w1 squared, which equals root 2 w1. 433 00:23:50,860 --> 00:23:54,460 Well, actually no, it doesn't equal root 2 w1 434 00:23:54,460 --> 00:23:57,340 because w1 is actually negative. 435 00:23:57,340 --> 00:24:02,420 So it's negative root 2 w1. 436 00:24:02,420 --> 00:24:03,430 It doesn't matter. 437 00:24:03,430 --> 00:24:09,100 The point is that if that equals root 2 over 4, 438 00:24:09,100 --> 00:24:12,042 then w1 is-- everyone? 439 00:24:12,042 --> 00:24:13,000 AUDIENCE: Negative 1/4. 440 00:24:13,000 --> 00:24:14,210 PROFESSOR: Negative 1/4. 441 00:24:14,210 --> 00:24:15,042 Bingo. 442 00:24:15,042 --> 00:24:18,690 And if w1 is -1/4, everything else falls into place. 443 00:24:18,690 --> 00:24:19,630 What are w2 and b? 444 00:24:19,630 --> 00:24:20,460 Everyone? 445 00:24:20,460 --> 00:24:21,600 AUDIENCE: Positive 1/4. 446 00:24:21,600 --> 00:24:22,720 PROFESSOR: Positive 1/4. 447 00:24:22,720 --> 00:24:23,600 We got it. 448 00:24:23,600 --> 00:24:25,720 We're done with this part of the problem. 449 00:24:25,720 --> 00:24:27,210 However, bonus. 450 00:24:27,210 --> 00:24:28,800 Let's come to the alphas, which they 451 00:24:28,800 --> 00:24:30,420 didn't ask you to calculate. 452 00:24:30,420 --> 00:24:31,719 Actually, you know what? 453 00:24:31,719 --> 00:24:33,510 We'll do the alphas if we have enough time, 454 00:24:33,510 --> 00:24:35,718 since they didn't actually ask you to calculate them. 
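The shortcut for w and b can be checked numerically. The sketch below is one way to do it, assuming exactly one plus and one minus support vector (as in this problem): it takes the direction of w from the minus point toward the plus point, which is perpendicular to the separator, instead of reading the ratio off y = x - 1, and then scales w so that 1 over its magnitude is half the street width. The function names `solve_two_point_svm` and `decision` are hypothetical helpers, not anything from the course materials.

```python
import math

# Support vectors read off the board in this problem.
x_plus = (-1.0, 2.0)    # the plus support vector
x_minus = (3.0, -2.0)   # the minus support vector


def solve_two_point_svm(x_plus, x_minus):
    """Return (w, b) when one plus and one minus support vector define the street.

    Hypothetical helper for illustration; valid only in this two-point case.
    """
    # w points from the minus support vector toward the plus support vector.
    dx = x_plus[0] - x_minus[0]
    dy = x_plus[1] - x_minus[1]
    street_width = math.hypot(dx, dy)    # distance between the two points
    half_width = street_width / 2.0      # this is 1 / |w|
    norm_w = 1.0 / half_width            # so |w| = 1 / half_width
    w = (dx / street_width * norm_w, dy / street_width * norm_w)
    # The separator passes through the midpoint, where w . x + b = 0.
    mid = ((x_plus[0] + x_minus[0]) / 2.0, (x_plus[1] + x_minus[1]) / 2.0)
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b


def decision(w, b, x):
    """Evaluate w . x + b for a 2-D point x."""
    return w[0] * x[0] + w[1] * x[1] + b


w, b = solve_two_point_svm(x_plus, x_minus)
print("w =", w, "b =", b)
print("plus support vector: ", decision(w, b, x_plus))
print("minus support vector:", decision(w, b, x_minus))
```

Up to floating-point rounding, this should reproduce w = (-1/4, 1/4) and b = 1/4, with the two support vectors landing on +1 and -1.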
455 00:24:35,718 --> 00:24:38,240 However, my recommendation is since there's only 456 00:24:38,240 --> 00:24:40,040 one alpha-plus and one alpha-minus, 457 00:24:40,040 --> 00:24:42,570 they must be equal from this equation, 458 00:24:42,570 --> 00:24:46,010 since the sum of the alpha-plus equals the sum of alpha-minus. 459 00:24:46,010 --> 00:24:54,030 And so therefore, w equals the sum of w-- sorry, 460 00:24:54,030 --> 00:24:56,520 this should be an x. 461 00:24:56,520 --> 00:24:58,980 Of course, there's not a million w's in this equation. 462 00:24:58,980 --> 00:25:02,912 The sum of the positive data points times their alphas 463 00:25:02,912 --> 00:25:05,490 minus the negative data points times their alphas. 464 00:25:05,490 --> 00:25:14,550 So we're looking at here -1/4, 1/4 equals-- what 465 00:25:14,550 --> 00:25:15,180 do we got here? 466 00:25:15,180 --> 00:25:17,110 Positive point negative 1, 2? 467 00:25:17,110 --> 00:25:27,020 So we've got alpha, alpha of that point negative 1, 2 minus 468 00:25:27,020 --> 00:25:28,750 alpha of that minus point. 469 00:25:28,750 --> 00:25:29,890 And what is that? 470 00:25:29,890 --> 00:25:30,790 It's 3, -2. 471 00:25:33,420 --> 00:25:35,690 3, -2. 472 00:25:35,690 --> 00:25:38,370 So if both of the alphas which are equal were 1, 473 00:25:38,370 --> 00:25:42,440 we'd have -4, 4. 474 00:25:42,440 --> 00:25:44,220 But we want -1/4, 1/4. 475 00:25:44,220 --> 00:25:53,070 So actually both of the alphas are 1/16. 476 00:25:53,070 --> 00:25:55,340 And that's the answer. 477 00:25:55,340 --> 00:25:57,370 We'll do that more in depth if we have time. 478 00:25:57,370 --> 00:25:58,300 But we won't. 479 00:25:58,300 --> 00:26:00,530 So let's do number two. 480 00:26:00,530 --> 00:26:02,550 So let's go into faster mode. 481 00:26:02,550 --> 00:26:06,780 Number two, very similar to number one in many ways.
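For problem one, the alpha shortcut above can be checked with a few lines of Python. This is only a sketch under the recitation's own assumptions: exactly one positive and one negative support vector, so both alphas are equal, and w is the alpha-weighted plus point minus the alpha-weighted minus point. The function name `solve_equal_alpha` is made up for illustration.

```python
# Problem-one values as read off in the recitation.
w = (-0.25, 0.25)
x_plus = (-1.0, 2.0)    # the one positive support vector
x_minus = (3.0, -2.0)   # the one negative support vector


def solve_equal_alpha(w, x_plus, x_minus):
    """Solve w = alpha * x_plus - alpha * x_minus for the shared alpha.

    Hypothetical helper: it projects w onto (x_plus - x_minus), which is
    exact only when w is parallel to that difference, as it is here.
    """
    diff = (x_plus[0] - x_minus[0], x_plus[1] - x_minus[1])  # (-4, 4)
    return (w[0] * diff[0] + w[1] * diff[1]) / (diff[0] ** 2 + diff[1] ** 2)


alpha = solve_equal_alpha(w, x_plus, x_minus)
print(alpha)  # 0.0625, i.e. 1/16
```

With alpha equal to 1, the difference of the two points is (-4, 4), so scaling by 1/16 gives the (-1/4, 1/4) found earlier.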
482 00:26:06,780 --> 00:26:09,680 But as you can see, one of the main things is that they added 483 00:26:09,680 --> 00:26:11,800 an extra minus sign at 2, -1. 484 00:26:11,800 --> 00:26:15,890 So I think we can all agree that this will still 485 00:26:15,890 --> 00:26:19,560 be our plus-- Actually, they added another plus sign there, 486 00:26:19,560 --> 00:26:20,470 too. 487 00:26:20,470 --> 00:26:24,060 So maybe this plus sign is a support vector. 488 00:26:24,060 --> 00:26:24,870 But it's not. 489 00:26:24,870 --> 00:26:26,496 This plus sign is a support vector. 490 00:26:26,496 --> 00:26:28,620 What do you guys think about the new negative sign? 491 00:26:28,620 --> 00:26:31,100 Will it become a support vector since it is strictly 492 00:26:31,100 --> 00:26:33,370 closer to the pluses? 493 00:26:33,370 --> 00:26:35,460 Yep, you're right. 494 00:26:35,460 --> 00:26:41,080 OK, so this is a very beautiful division 495 00:26:41,080 --> 00:26:44,580 because if I do this correctly, which I didn't, but if we 496 00:26:44,580 --> 00:26:45,980 pretend that I did. 497 00:26:45,980 --> 00:26:48,445 Then the dotted line is-- 498 00:26:48,445 --> 00:26:49,840 AUDIENCE: [INAUDIBLE]. 499 00:26:49,840 --> 00:26:50,805 PROFESSOR: y equals x. 500 00:26:57,660 --> 00:27:02,780 OK, so with the dotted line at y equals x, then 501 00:27:02,780 --> 00:27:11,800 just like we did up here, we know that if y equals x plus 0, 502 00:27:11,800 --> 00:27:15,700 we know that first of all, b equals 0. 503 00:27:15,700 --> 00:27:22,550 Second of all, we know that if y equals x, then 504 00:27:22,550 --> 00:27:27,940 we know that -w1 over w2 equals 1. 505 00:27:33,460 --> 00:27:35,910 The pluses are still on the left and up, 506 00:27:35,910 --> 00:27:40,364 so we know that w1 is some negative number, -k, 507 00:27:40,364 --> 00:27:43,580 and w2 is some positive number k. 508 00:27:43,580 --> 00:27:44,390 Great. 509 00:27:44,390 --> 00:27:46,480 How are we going to figure it out?
510 00:27:46,480 --> 00:27:51,020 Well, let's call this d for distance, 511 00:27:51,020 --> 00:27:52,710 or whatever you want to call it. 512 00:27:52,710 --> 00:27:57,300 So 1 over magnitude of w equals d. 513 00:27:57,300 --> 00:28:00,860 d in this case is not 2 root 2. 514 00:28:00,860 --> 00:28:05,270 Can everyone tell what d is here? 515 00:28:05,270 --> 00:28:06,226 AUDIENCE: [INAUDIBLE] 516 00:28:11,490 --> 00:28:16,700 PROFESSOR: It's actually-- so it goes over 2 and 1. 517 00:28:16,700 --> 00:28:23,820 So it should be 1 1/2 root 2 since this width distance, 518 00:28:23,820 --> 00:28:28,420 which is twice as much, goes over 3 and 3, 519 00:28:28,420 --> 00:28:29,650 which is 3 root 2. 520 00:28:29,650 --> 00:28:31,780 So it's 1 1/2 root 2. 521 00:28:31,780 --> 00:28:35,170 I don't like putting in decimals and stuff there. 522 00:28:35,170 --> 00:28:39,820 So we'll say that 2 over magnitude of w 523 00:28:39,820 --> 00:28:45,630 equals 2d equals 3 root 2. 524 00:28:45,630 --> 00:28:49,510 So therefore, right, Pythagorean, one, two, three, 525 00:28:49,510 --> 00:28:51,310 one, two, three, 3 root 2. 526 00:28:51,310 --> 00:28:59,300 So therefore, magnitude of w equals-- let's see. 527 00:28:59,300 --> 00:29:00,040 Switch them over. 528 00:29:00,040 --> 00:29:01,330 We should get root 2 over 3. 529 00:29:05,026 --> 00:29:07,230 And if magnitude of w is root 2 over 3, 530 00:29:07,230 --> 00:29:10,420 we can do our same trick from before, 531 00:29:10,420 --> 00:29:16,630 square root of 2 w2 squared equals root 2 over 3. 532 00:29:19,950 --> 00:29:26,250 And this is just root 2 times w2 equals root 2 over 3. 533 00:29:26,250 --> 00:29:28,650 So therefore, w2 is? 534 00:29:28,650 --> 00:29:29,240 1/3. 535 00:29:29,240 --> 00:29:31,430 And w1 is? 536 00:29:31,430 --> 00:29:32,250 -1/3. 537 00:29:32,250 --> 00:29:33,370 Bingo. 538 00:29:33,370 --> 00:29:34,730 We've got w1. 539 00:29:34,730 --> 00:29:36,980 We've got w2.
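The distance trick used in problems one and two can be sketched in Python as below. This is an illustration, not the official solution method: it assumes the road is pinned down by exactly one plus and one minus support vector, so that w points along their difference. The helper name is invented, and the plus support vector (-1, 2) for problem two is an assumption inferred from the "3 and 3" width, since its coordinates are not stated outright.

```python
import math


def w_and_b_from_two_support_vectors(x_plus, x_minus):
    """Return (w, b) with w.x_plus + b = +1 and w.x_minus + b = -1.

    Hypothetical helper: assumes w is parallel to x_plus - x_minus.
    """
    diff = [p - m for p, m in zip(x_plus, x_minus)]
    width_sq = sum(d * d for d in diff)        # squared width of the road
    w = [2.0 * d / width_sq for d in diff]     # so that |w| = 2 / width
    b = 1.0 - sum(wi * xi for wi, xi in zip(w, x_plus))
    return w, b


# Problem one: plus at (-1, 2), minus at (3, -2).
w_prob1, b_prob1 = w_and_b_from_two_support_vectors((-1, 2), (3, -2))

# Problem two: minus at (2, -1); plus at (-1, 2) is an assumption.
w_prob2, b_prob2 = w_and_b_from_two_support_vectors((-1, 2), (2, -1))

print(w_prob1, b_prob1)        # roughly (-1/4, 1/4) and 1/4
print(w_prob2, b_prob2)        # roughly (-1/3, 1/3) and 0
print(math.hypot(*w_prob2))    # roughly root 2 over 3
```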
540 00:29:36,980 --> 00:29:39,960 We know that b was zero because obviously it's 0. 541 00:29:39,960 --> 00:29:41,300 It's y equals x. 542 00:29:41,300 --> 00:29:42,940 And we're done. 543 00:29:42,940 --> 00:29:44,040 That was fast. 544 00:29:44,040 --> 00:29:45,340 The alphas might take longer. 545 00:29:45,340 --> 00:29:47,923 Actually, the alphas on this one are more of a pain in the ass 546 00:29:47,923 --> 00:29:51,030 than anywhere else because let's take a look at this one 547 00:29:51,030 --> 00:29:52,770 if you can see it. 548 00:29:52,770 --> 00:29:54,850 We've added in yet some new points. 549 00:29:54,850 --> 00:29:58,100 We've got this point up here and this point down there. 550 00:29:58,100 --> 00:30:00,640 So I think pretty clearly this plus and minus 551 00:30:00,640 --> 00:30:01,906 are the closest to each other. 552 00:30:01,906 --> 00:30:03,780 But what happens if we take the perpendicular 553 00:30:03,780 --> 00:30:09,200 bisector between these two and do like this? 554 00:30:09,200 --> 00:30:12,194 This plus is in the middle. 555 00:30:12,194 --> 00:30:13,860 So therefore, this plus is going to have 556 00:30:13,860 --> 00:30:15,540 to also be a support vector. 557 00:30:15,540 --> 00:30:20,090 So we can't just draw this line. 558 00:30:20,090 --> 00:30:21,810 We have to include this. 559 00:30:21,810 --> 00:30:23,040 What's our best division? 560 00:30:25,660 --> 00:30:27,800 Vertical lines, that's right. 561 00:30:27,800 --> 00:30:34,252 Vertical lines just so. 562 00:30:34,252 --> 00:30:39,100 And that means that the equation of our boundary, 563 00:30:39,100 --> 00:30:42,085 the dotted line here, y-axis. 564 00:30:46,180 --> 00:30:49,695 So the equation of the boundary with the y-axis, then b 565 00:30:49,695 --> 00:30:50,770 equals 0. 566 00:30:50,770 --> 00:30:53,490 And hell, w2 equals 0. 567 00:30:53,490 --> 00:30:55,860 So the only thing that is not-- 568 00:30:55,860 --> 00:30:58,770 AUDIENCE: Wait, w1 [INAUDIBLE].
569 00:30:58,770 --> 00:31:00,077 PROFESSOR: w2 equals 0. 570 00:31:04,460 --> 00:31:06,560 So w2 equals 0. 571 00:31:06,560 --> 00:31:07,490 b equals 0. 572 00:31:07,490 --> 00:31:12,420 But w1 is not equal to zero because it's just 573 00:31:12,420 --> 00:31:15,070 the equation of the y-axis. 574 00:31:15,070 --> 00:31:21,270 So we therefore know that the equation is just 575 00:31:21,270 --> 00:31:25,980 w1 times x equals 0. 576 00:31:32,420 --> 00:31:35,400 So it's w1 times x equals 0. 577 00:31:35,400 --> 00:31:38,470 And we know that that just means essentially 578 00:31:38,470 --> 00:31:39,777 that w1 is going to be some k. 579 00:31:39,777 --> 00:31:41,860 It's also going to be negative because of the fact 580 00:31:41,860 --> 00:31:44,400 that the pluses are still on the left. 581 00:31:44,400 --> 00:31:47,610 Then we're going to have to figure out what that k is. 582 00:31:47,610 --> 00:31:51,500 We'll use our old trick-- by this point old, hopefully. 583 00:31:51,500 --> 00:31:54,060 One over magnitude of w equals d. 584 00:31:54,060 --> 00:31:55,280 This time d is just 1. 585 00:31:58,650 --> 00:32:02,050 So therefore, magnitude of w equals 1. 586 00:32:02,050 --> 00:32:04,450 There's only one component in w. 587 00:32:04,450 --> 00:32:07,000 So therefore w1 is-- 588 00:32:07,000 --> 00:32:08,360 AUDIENCE: -1. 589 00:32:08,360 --> 00:32:12,330 PROFESSOR: -1 because the plus is on the left. 590 00:32:12,330 --> 00:32:13,900 Do people see that? 591 00:32:13,900 --> 00:32:14,810 Not too bad. 592 00:32:14,810 --> 00:32:16,563 This one's easy to calculate the w. 593 00:32:16,563 --> 00:32:20,460 But it's not as easy to get all the alphas. 594 00:32:20,460 --> 00:32:23,740 But let's move on to a new and even more 595 00:32:23,740 --> 00:32:30,920 fun-- maybe not-- question, which is this guy. 596 00:32:30,920 --> 00:32:34,100 As you can see-- well maybe not. 597 00:32:34,100 --> 00:32:36,960 This is a one dimensional vector.
598 00:32:36,960 --> 00:32:38,970 These vectors only have a single dimension. 599 00:32:38,970 --> 00:32:41,380 So it just looks like a number line here. 600 00:32:41,380 --> 00:32:46,610 That dimension varies from it looks like -9 to positive 9. 601 00:32:46,610 --> 00:32:48,247 It just has one component. 602 00:32:48,247 --> 00:32:50,705 You don't have to worry about any of these crazy magnitudes 603 00:32:50,705 --> 00:32:53,530 with two components, just everything 604 00:32:53,530 --> 00:32:55,100 as a single component. 605 00:32:55,100 --> 00:32:59,300 However, it's obvious that a linear basis 606 00:32:59,300 --> 00:33:01,890 line is going to completely screw us up here, 607 00:33:01,890 --> 00:33:05,090 since lines at this point are just like, grunk, 608 00:33:05,090 --> 00:33:06,024 all these are pluses. 609 00:33:06,024 --> 00:33:06,940 All these are minuses. 610 00:33:06,940 --> 00:33:08,860 Well, great, that doesn't get them all. 611 00:33:08,860 --> 00:33:10,570 So how are we going to do it? 612 00:33:10,570 --> 00:33:13,180 Well, we're going to use what is usually perhaps the hardest 613 00:33:13,180 --> 00:33:15,180 thing in SVMs, but in this case is not 614 00:33:15,180 --> 00:33:17,010 going to be too bad for us. 615 00:33:17,010 --> 00:33:19,520 We're going to use a kernel. 616 00:33:19,520 --> 00:33:23,160 Now, based on how little I understood kernels 617 00:33:23,160 --> 00:33:26,320 the first time I took this class, 618 00:33:26,320 --> 00:33:27,970 I'm guessing that you guys would like 619 00:33:27,970 --> 00:33:30,714 to have some explanation on these kernels. 620 00:33:30,714 --> 00:33:31,630 You probably saw them. 621 00:33:31,630 --> 00:33:35,070 You remember the kernels from Patrick's lecture vaguely? 622 00:33:35,070 --> 00:33:36,120 There's this phi. 623 00:33:36,120 --> 00:33:37,160 And then there's this k. 624 00:33:37,160 --> 00:33:39,730 And then they get really complicated. 
625 00:33:39,730 --> 00:33:43,420 OK, so here's how the kernel works. 626 00:33:43,420 --> 00:33:45,660 The basic idea is this. 627 00:33:45,660 --> 00:33:51,420 And I'll write it over here. 628 00:33:51,420 --> 00:33:53,030 Oh, wow, there's more stuff. 629 00:33:53,030 --> 00:33:53,739 OK. 630 00:33:53,739 --> 00:33:54,780 I'll write it right here. 631 00:33:54,780 --> 00:33:56,730 The basic idea is this. 632 00:33:56,730 --> 00:34:01,910 We're taking the normal space, which is this number line 633 00:34:01,910 --> 00:34:04,100 or it could be any kind of normal space, 634 00:34:04,100 --> 00:34:06,100 and we're going to take a vector, 635 00:34:06,100 --> 00:34:11,230 we're going to put into it a function called phi. 636 00:34:11,230 --> 00:34:16,120 And phi of vector x brings x into some new dimension. 637 00:34:16,120 --> 00:34:18,659 Phi, or "phee" if you like it better, 638 00:34:18,659 --> 00:34:21,400 is usually a nasty piece of work and something 639 00:34:21,400 --> 00:34:22,929 you never, ever want to look at. 640 00:34:22,929 --> 00:34:24,820 Sometimes it's not too bad. 641 00:34:24,820 --> 00:34:27,679 Phi is the function that brings it into the new dimension. 642 00:34:27,679 --> 00:34:29,224 OK? 643 00:34:29,224 --> 00:34:31,570 And when you brought the data into a new dimension, 644 00:34:31,570 --> 00:34:34,070 sometimes you can just cut a straight line in that dimension 645 00:34:34,070 --> 00:34:36,699 and you'll just be happy. 
646 00:34:36,699 --> 00:34:42,159 However, something that was noted 647 00:34:42,159 --> 00:34:48,449 by the very, very smart inventor of support vector machines 648 00:34:48,449 --> 00:34:54,710 is that you don't actually need to work with function phi, 649 00:34:54,710 --> 00:34:58,740 even if phi is an absolutely horrible monstrosity, because 650 00:34:58,740 --> 00:35:04,850 of the fact that you never need to know what all these vectors 651 00:35:04,850 --> 00:35:10,400 x actually are in the new space, at least not directly. 652 00:35:10,400 --> 00:35:11,930 In none of these equations up here 653 00:35:11,930 --> 00:35:14,720 do we ever use x by itself. 654 00:35:14,720 --> 00:35:19,440 However, we do use x being dot product with something else. 655 00:35:19,440 --> 00:35:23,780 So he figured out a very sneaky and excellent shortcut. 656 00:35:26,580 --> 00:35:29,510 OK, so-- oh, I shouldn't use x1 and x2. 657 00:35:29,510 --> 00:35:35,530 I'll use x and z. 658 00:35:35,530 --> 00:35:37,910 So if you have two vectors, x and z, which 659 00:35:37,910 --> 00:35:40,140 are in a regular space, you put them 660 00:35:40,140 --> 00:35:42,650 into this function called the kernel. 661 00:35:42,650 --> 00:35:55,120 Then it will tell you phi x dotted with phi z. 662 00:35:55,120 --> 00:35:59,920 And if you have that, you don't need phi. 663 00:35:59,920 --> 00:36:00,934 Does everyone see that? 664 00:36:00,934 --> 00:36:02,600 Does everyone see why we don't need phi? 665 00:36:02,600 --> 00:36:04,099 Look at all these equations up here. 666 00:36:04,099 --> 00:36:07,770 We have never looked at x by itself 667 00:36:07,770 --> 00:36:12,390 in these vector equations at least. 668 00:36:12,390 --> 00:36:15,610 Now, calculating alphas, yeah, that gets a little bit fuzzy. 669 00:36:18,650 --> 00:36:21,240 Also, you may ask, why would you do this? 670 00:36:21,240 --> 00:36:22,720 You can't calculate the alphas. 
671 00:36:22,720 --> 00:36:29,200 It turns out that actually, other than for these very 672 00:36:29,200 --> 00:36:31,800 simple linear problems, human minds cannot calculate 673 00:36:31,800 --> 00:36:32,740 the alphas. 674 00:36:32,740 --> 00:36:35,827 In fact, you run a very complicated 675 00:36:35,827 --> 00:36:36,785 quadratic optimization. 676 00:36:36,785 --> 00:36:39,090 In fact, finding out the best alphas 677 00:36:39,090 --> 00:36:42,720 is the thing that you hill climb on when you're 678 00:36:42,720 --> 00:36:44,440 doing SVMs in the real world. 679 00:36:44,440 --> 00:36:47,075 You say, all right, I'll run my algorithm 680 00:36:47,075 --> 00:36:48,770 when I know there's only one peak, which 681 00:36:48,770 --> 00:36:51,430 is very, very good because it's quadratic optimization. 682 00:36:51,430 --> 00:36:53,080 Let me figure out the alphas. 683 00:36:53,080 --> 00:36:55,010 So in fact, it doesn't matter that you 684 00:36:55,010 --> 00:37:00,359 can't use these alpha equations to figure out the alphas if you 685 00:37:00,359 --> 00:37:01,900 only know the kernel function and not 686 00:37:01,900 --> 00:37:05,280 the phi function because normally, the computer 687 00:37:05,280 --> 00:37:09,166 figures out the alphas for you with quadratic optimization. 688 00:37:09,166 --> 00:37:10,540 Just in these simple problems, we 689 00:37:10,540 --> 00:37:12,400 know you can calculate the alphas. 690 00:37:12,400 --> 00:37:17,120 So we have the kernel, which basically gives us 691 00:37:17,120 --> 00:37:19,500 the dot product of the things in the new space. 692 00:37:19,500 --> 00:37:22,200 So being that as it may, I'll give you the kernel here. 693 00:37:22,200 --> 00:37:23,410 I'd like you to give me phi. 694 00:37:26,670 --> 00:37:29,670 Someone got an idea, whose name was Susan Q. Random 695 00:37:29,670 --> 00:37:31,360 Student, apparently.
696 00:37:31,360 --> 00:37:37,980 She got an idea that if we had a kernel for x and z-- actually, 697 00:37:37,980 --> 00:37:39,210 they're not vectors, I guess. 698 00:37:39,210 --> 00:37:43,430 They're just single components. 699 00:37:43,430 --> 00:37:55,500 And the kernel equals cosine Pi over 4x times cosine Pi over 4 700 00:37:55,500 --> 00:38:07,260 z plus sine Pi over 4x plus sine Pi over 4z. 701 00:38:10,630 --> 00:38:13,670 So that is the new dot product. 702 00:38:16,180 --> 00:38:17,090 Oh, wait, sorry. 703 00:38:17,090 --> 00:38:21,030 I put one of the z's not inside the parentheses. 704 00:38:21,030 --> 00:38:22,340 That was silly of me. 705 00:38:22,340 --> 00:38:28,220 So cosine of the quantity Pi over 4x times 706 00:38:28,220 --> 00:38:31,300 cosine of the quantity Pi over 4z plus sine of quantity Pi 707 00:38:31,300 --> 00:38:36,480 over 4x plus sine of quantity Pi over 4z is the new product. 708 00:38:36,480 --> 00:38:37,910 So that begs the question. 709 00:38:37,910 --> 00:38:42,150 This is an easy one so we can calculate the phi. 710 00:38:42,150 --> 00:38:47,620 What is phi of x? 711 00:38:47,620 --> 00:38:51,025 We're actually taking it from one dimension 712 00:38:51,025 --> 00:38:53,890 and we may be playing around with it a lot to get this. 713 00:38:53,890 --> 00:38:56,910 And this thing has become a new dot product. 714 00:38:56,910 --> 00:38:58,532 It replaces dot product. 715 00:38:58,532 --> 00:39:00,240 And remember, the dot product for scalars 716 00:39:00,240 --> 00:39:02,750 would have just been multiplying two numbers together. 717 00:39:02,750 --> 00:39:05,000 So it actually makes it a little bit more complicated. 718 00:39:05,000 --> 00:39:07,650 Does anyone think they know the phi? 719 00:39:07,650 --> 00:39:08,930 Oh, we got one. 720 00:39:08,930 --> 00:39:09,896 What do you think? 721 00:39:09,896 --> 00:39:10,888 AUDIENCE: [INAUDIBLE] 722 00:39:13,870 --> 00:39:16,228 PROFESSOR: You mean two-component vectors, two dimensional?
723 00:39:16,228 --> 00:39:17,540 AUDIENCE: The two points. 724 00:39:17,540 --> 00:39:19,040 PROFESSOR: Absolutely. 725 00:39:19,040 --> 00:39:22,550 That's exactly correct. 726 00:39:22,550 --> 00:39:25,310 How would you have solved this on the actual quiz 727 00:39:25,310 --> 00:39:28,900 if you're not our brave volunteer? 728 00:39:28,900 --> 00:39:34,330 Well, that k, if you squint at it-- not very much actually-- 729 00:39:34,330 --> 00:39:40,200 is pretty much a dot product between cosine of Pi over 4 730 00:39:40,200 --> 00:39:41,570 and sine Pi over 4. 731 00:39:41,570 --> 00:39:43,070 I mean, look at it. 732 00:39:43,070 --> 00:39:48,310 Remember, if the dot product of x and z vectors 733 00:39:48,310 --> 00:39:56,550 is x1, z1 plus x2, z2-- so that basically is x1, z1 plus x2 z2. 734 00:39:56,550 --> 00:39:59,490 Oh, this should have been a times. 735 00:39:59,490 --> 00:40:01,190 Yeah, this should have been a times. 736 00:40:01,190 --> 00:40:01,770 Sorry. 737 00:40:01,770 --> 00:40:03,090 There's a plus there. 738 00:40:03,090 --> 00:40:05,807 Anyone who missed it because of that, my bad. 739 00:40:05,807 --> 00:40:07,140 That should have been a times. 740 00:40:07,140 --> 00:40:09,080 That should have been a times up there. 741 00:40:09,080 --> 00:40:12,580 It's cosine Pi over 4x cosine Pi over 4z plus sine Pi 742 00:40:12,580 --> 00:40:15,260 over 4x times sine Pi over 4z. 743 00:40:15,260 --> 00:40:17,040 So yeah, it's basically the dot product 744 00:40:17,040 --> 00:40:19,420 between cosine Pi over 4x and sine Pi over 4x. 745 00:40:19,420 --> 00:40:20,930 Bingo. 746 00:40:20,930 --> 00:40:24,177 All right, last thing. 747 00:40:24,177 --> 00:40:26,010 Well, we're not done yet because we're going 748 00:40:26,010 --> 00:40:27,399 to maybe ask some questions. 749 00:40:27,399 --> 00:40:29,940 And then we're going to see if we can calculate those alphas.
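The reverse engineering above can be sanity-checked numerically. The sketch below assumes the corrected kernel (with the times, not the plus) and the two-component phi the volunteer proposed; the function names `phi`, `kernel`, and `dot` are just illustrative. As a side note, the angle-difference identity says this kernel is also cosine of Pi over 4 times (x minus z).

```python
import math


def phi(x):
    """Map a scalar x to the two-component vector (cos(pi*x/4), sin(pi*x/4))."""
    return (math.cos(math.pi * x / 4), math.sin(math.pi * x / 4))


def kernel(x, z):
    """The kernel from the problem, with the corrected 'times' between the sines."""
    return (math.cos(math.pi * x / 4) * math.cos(math.pi * z / 4)
            + math.sin(math.pi * x / 4) * math.sin(math.pi * z / 4))


def dot(u, v):
    """Ordinary dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


# Check every integer pair on the number line from -9 to 9.
for x in range(-9, 10):
    for z in range(-9, 10):
        assert math.isclose(kernel(x, z), dot(phi(x), phi(z)), abs_tol=1e-9)
        # Angle-difference identity: cos(a)cos(b) + sin(a)sin(b) = cos(a - b).
        assert math.isclose(kernel(x, z), math.cos(math.pi * (x - z) / 4), abs_tol=1e-9)

print("kernel(x, z) matches phi(x) . phi(z) on all integer pairs")
```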
750 00:40:29,940 --> 00:40:33,850 But last thing, let's graph in this new dimension 751 00:40:33,850 --> 00:40:35,610 all the points. 752 00:40:35,610 --> 00:40:40,330 So obviously, cosines and sines, so we're 753 00:40:40,330 --> 00:40:44,050 going to get results between 1 and -1. 754 00:40:44,050 --> 00:40:44,550 Let's see. 755 00:40:44,550 --> 00:40:52,280 Maybe I can graph it-- did I write on all these? 756 00:40:55,160 --> 00:40:59,060 Wait, maybe this one. 757 00:40:59,060 --> 00:41:01,730 No, people drew weird stick figures there. 758 00:41:01,730 --> 00:41:02,230 OK. 759 00:41:05,092 --> 00:41:06,550 Oh, yeah, this one's kind of messy. 760 00:41:06,550 --> 00:41:08,980 But we'll do it on this. 761 00:41:08,980 --> 00:41:26,400 OK, so this is 1, -1, 1/2, -1/2, 1, -1, -1/2, 1/2. 762 00:41:26,400 --> 00:41:27,460 OK? 763 00:41:27,460 --> 00:41:31,140 So given that, let's try to graph 764 00:41:31,140 --> 00:41:33,110 all these points on this number line 765 00:41:33,110 --> 00:41:38,740 into this brave new dimension by using their cosine times 766 00:41:38,740 --> 00:41:39,830 Pi over 4. 767 00:41:39,830 --> 00:41:42,660 So all right, great. 768 00:41:42,660 --> 00:41:46,920 So let's do the pluses first. 769 00:41:46,920 --> 00:41:51,110 The plus at 0 is cosine 0, sine 0. 770 00:41:51,110 --> 00:41:52,170 So what is that? 771 00:41:52,170 --> 00:41:53,202 AUDIENCE: 1, 0. 772 00:41:53,202 --> 00:41:54,160 PROFESSOR: That's 1, 0. 773 00:41:54,160 --> 00:41:54,790 That's right. 774 00:41:57,890 --> 00:42:08,140 In fact, the 8 and the -8 are also that times 3. 775 00:42:08,140 --> 00:42:10,340 The 8 and the -8 are also that because then it's 776 00:42:10,340 --> 00:42:15,300 just 2 Pi minus 2 Pi, which both cosine and sine are periodic. 777 00:42:15,300 --> 00:42:16,740 OK, great. 778 00:42:16,740 --> 00:42:18,600 What about the 1? 779 00:42:18,600 --> 00:42:21,130 Well, that's cosine Pi over 4, sine Pi over 4. 780 00:42:21,130 --> 00:42:22,015 And what's that? 
781 00:42:24,780 --> 00:42:26,700 AUDIENCE: [SEVERAL ANSWERS] 782 00:42:26,700 --> 00:42:29,080 PROFESSOR: Yeah, it's root 2 over 2, root 2 over 2. 783 00:42:29,080 --> 00:42:33,950 So that's something like here, we'll say. 784 00:42:33,950 --> 00:42:42,320 And in fact, the 9 and the -7 are also that. 785 00:42:42,320 --> 00:42:43,570 So there's three of these two. 786 00:42:47,150 --> 00:42:48,640 What about the -1? 787 00:42:48,640 --> 00:42:52,554 That's cosine negative Pi over 4, sine negative Pi over 4. 788 00:42:55,422 --> 00:42:57,040 AUDIENCE: [SEVERAL ANSWERS] 789 00:42:57,040 --> 00:42:58,040 PROFESSOR: That's right. 790 00:42:58,040 --> 00:43:02,780 The x value is positive root 2 over 2. 791 00:43:02,780 --> 00:43:05,660 And the y value is negative. 792 00:43:05,660 --> 00:43:09,070 And again, there's three of them. 793 00:43:09,070 --> 00:43:10,520 All right, great. 794 00:43:10,520 --> 00:43:11,910 Now let's do the minuses. 795 00:43:11,910 --> 00:43:14,840 There's the minus at 3, which is also 796 00:43:14,840 --> 00:43:16,110 the same as the minus at -5. 797 00:43:16,110 --> 00:43:20,260 The minus at 3 is cosine 3 Pi over 4, sine 3 Pi over 4. 798 00:43:20,260 --> 00:43:21,035 Which one is that? 799 00:43:23,740 --> 00:43:24,700 AUDIENCE: [INAUDIBLE]. 800 00:43:30,540 --> 00:43:33,670 PROFESSOR: Yeah, that's going to be in the second quadrant. 801 00:43:33,670 --> 00:43:37,010 The cosine is going to be negative. 802 00:43:37,010 --> 00:43:40,870 But the sine is going to be positive. 803 00:43:40,870 --> 00:43:44,260 And so we get 3 points here. 804 00:43:44,260 --> 00:43:48,730 And as you may have predicted, the other one, the 5 Pi over 4, 805 00:43:48,730 --> 00:43:50,980 is in the third quadrant. 806 00:43:50,980 --> 00:43:56,044 We get 3 points here. Where are the support vectors? 807 00:44:01,852 --> 00:44:03,304 AUDIENCE: Question. 808 00:44:03,304 --> 00:44:04,756 PROFESSOR: Question?
809 00:44:04,756 --> 00:44:06,934 AUDIENCE: I understand where you're 810 00:44:06,934 --> 00:44:09,418 getting at the three quantities of pluses 811 00:44:09,418 --> 00:44:11,314 in the first of the quadrants. 812 00:44:11,314 --> 00:44:13,684 But according to the [INAUDIBLE] line there, 813 00:44:13,684 --> 00:44:15,959 you want that the total of four-- maybe the values. 814 00:44:15,959 --> 00:44:17,125 PROFESSOR: Oh, you're right. 815 00:44:17,125 --> 00:44:18,795 There's only two. 816 00:44:18,795 --> 00:44:20,250 Good call. 817 00:44:20,250 --> 00:44:21,739 There's two negatives here. 818 00:44:21,739 --> 00:44:23,030 And there's two negatives here. 819 00:44:23,030 --> 00:44:23,700 Good call. 820 00:44:23,700 --> 00:44:25,977 It doesn't change the problem. 821 00:44:25,977 --> 00:44:27,560 In fact, if we just graph more points, 822 00:44:27,560 --> 00:44:28,684 there might have been more. 823 00:44:28,684 --> 00:44:30,940 But that's a very subtle and important distinction. 824 00:44:30,940 --> 00:44:33,860 There are two negatives. 825 00:44:33,860 --> 00:44:36,780 But otherwise, yeah, these are graphed correctly. 826 00:44:36,780 --> 00:44:39,714 Does anyone see where the support vectors are? 827 00:44:39,714 --> 00:44:42,199 AUDIENCE: The first two. 828 00:44:42,199 --> 00:44:44,187 AUDIENCE: Maybe the top two. 829 00:44:44,187 --> 00:44:46,449 PROFESSOR: So the top two, the minus and plus. 830 00:44:46,449 --> 00:44:48,240 We'll try to do the perpendicular bisector. 831 00:44:48,240 --> 00:44:49,460 Let's see it. 832 00:44:55,160 --> 00:44:56,100 That works. 833 00:44:56,100 --> 00:44:56,880 But guess what? 834 00:44:56,880 --> 00:44:58,700 These guys are on the same line. 835 00:44:58,700 --> 00:45:01,350 So we'd better circle them. 836 00:45:01,350 --> 00:45:04,410 So actually, the question is what isn't a support vector? 837 00:45:04,410 --> 00:45:05,170 Only this. 838 00:45:05,170 --> 00:45:06,050 Only those three.
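The graphing step and the support-vector count can be reproduced with a short Python sketch. The exact point lists are partly an assumption: the recitation names 0, 8, -8, 1, 9, -7, -1, 3 and the 5 Pi over 4 point, and the rest (7, -9, -5, -3) are inferred from the period-8 pattern and the "two negatives here and two negatives here" correction. The boundary is taken to be the vertical axis of the new space, as the perpendicular bisector of the top two points suggests.

```python
import math

# Number-line points; partly inferred, see the note above.
pluses = [-9, -8, -7, -1, 0, 1, 7, 8, 9]
minuses = [-5, -3, 3, 5]


def phi(x):
    """Map a scalar x to (cos(pi*x/4), sin(pi*x/4))."""
    return (math.cos(math.pi * x / 4), math.sin(math.pi * x / 4))


def distance_to_boundary(x):
    """Distance from phi(x) to the vertical axis (first coordinate = 0)."""
    return abs(phi(x)[0])


points = pluses + minuses
margin = min(distance_to_boundary(x) for x in points)

# Support vectors are the points sitting exactly on the gutter.
support = sorted(x for x in points
                 if math.isclose(distance_to_boundary(x), margin, abs_tol=1e-9))
not_support = sorted(x for x in points if x not in support)

print(not_support)  # [-8, 0, 8]
```

Only the three pluses that land on (1, 0) fall outside the gutter, which matches "only those three."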
839 00:45:06,050 --> 00:45:06,997 Question? 840 00:45:06,997 --> 00:45:09,496 AUDIENCE: Couldn't you have just done this in one dimension? 841 00:45:09,496 --> 00:45:14,406 I mean, you just showed that those ran on the same lines. 842 00:45:14,406 --> 00:45:16,861 So you really didn't need the cosine term and a sine term. 843 00:45:16,861 --> 00:45:19,314 You could have proved all this with just the cosine. 844 00:45:19,314 --> 00:45:20,730 PROFESSOR: All right, the question 845 00:45:20,730 --> 00:45:22,840 is couldn't we have done this in one dimension. 846 00:45:22,840 --> 00:45:25,800 All we do is only the cosine. 847 00:45:25,800 --> 00:45:29,910 So if we did only the cosine, then they 848 00:45:29,910 --> 00:45:32,470 would've still been easily divisible. 849 00:45:32,470 --> 00:45:36,650 The answer is, absolutely, we could have. 850 00:45:36,650 --> 00:45:40,170 However, the question did not because Susan Q. Random Student 851 00:45:40,170 --> 00:45:42,950 decided to do cosine and sine. 852 00:45:42,950 --> 00:45:46,740 But yes, if we had said you, [INAUDIBLE] student, 853 00:45:46,740 --> 00:45:49,940 find a phi that will work for this, 854 00:45:49,940 --> 00:45:52,230 you could have found a phi that was just cosine. 855 00:45:52,230 --> 00:45:53,860 That would have been easier. 856 00:45:53,860 --> 00:45:56,930 However, it's important to be able to work with what 857 00:45:56,930 --> 00:45:57,930 somebody else gives you. 858 00:45:57,930 --> 00:46:01,634 In this case, they gave you that transformation, which yeah, 859 00:46:01,634 --> 00:46:03,175 was wasteful with an extra dimension. 860 00:46:03,175 --> 00:46:05,630 You didn't need the sine because you didn't 861 00:46:05,630 --> 00:46:09,240 need the y-axis really here. 862 00:46:09,240 --> 00:46:10,235 You just needed the x. 863 00:46:12,800 --> 00:46:16,030 Does everyone see this, how this works? 864 00:46:16,030 --> 00:46:17,970 You can maybe transform dimensions?
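The cosine-only idea from this exchange can be sketched as follows. As before, the point lists are partly inferred from the period-8 pattern rather than read directly off the quiz, and the names `phi_cos` and `kernel_cos` are hypothetical. With a one-component phi, the matching kernel is just the product of the two cosines.

```python
import math

# Number-line points; partly inferred from the period-8 pattern.
pluses = [-9, -8, -7, -1, 0, 1, 7, 8, 9]
minuses = [-5, -3, 3, 5]


def phi_cos(x):
    """One-component transform: keep only the cosine."""
    return math.cos(math.pi * x / 4)


def kernel_cos(x, z):
    """Kernel for a one-component phi: just multiply the two values."""
    return phi_cos(x) * phi_cos(z)


# A threshold at 0 in the new one-dimensional space splits the classes.
separable = (all(phi_cos(x) > 0 for x in pluses)
             and all(phi_cos(x) < 0 for x in minuses))
print(separable)  # True
```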
865 00:46:17,970 --> 00:46:20,542 The main hardest part is they'll usually give you a k 866 00:46:20,542 --> 00:46:22,850 and ask for a phi or give you a phi and ask for a k. 867 00:46:22,850 --> 00:46:24,700 But it's not too bad. 868 00:46:24,700 --> 00:46:26,520 Just remember, if they give you a phi, 869 00:46:26,520 --> 00:46:28,190 do a dot product with it. 870 00:46:28,190 --> 00:46:29,650 And if they give you a phi that's 871 00:46:29,650 --> 00:46:32,820 just one component, dot product of one component, 872 00:46:32,820 --> 00:46:34,290 just multiply them together. 873 00:46:34,290 --> 00:46:34,930 Easy enough. 874 00:46:34,930 --> 00:46:37,994 If they give you a k, treat it as a dot product 875 00:46:37,994 --> 00:46:39,160 and try to reverse engineer. 876 00:46:39,160 --> 00:46:40,784 It's usually something like this that's 877 00:46:40,784 --> 00:46:42,200 easy to reverse engineer. 878 00:46:42,200 --> 00:46:44,240 I really haven't seen it where it's not. 879 00:46:44,240 --> 00:46:48,370 So it often looks like the scariest problem. 880 00:46:48,370 --> 00:46:51,690 But it's usually not too bad to go between phis and k's. 881 00:46:51,690 --> 00:46:53,660 Does anyone have any questions on anything 882 00:46:53,660 --> 00:46:59,646 that we did on support vector machines? 883 00:46:59,646 --> 00:47:01,594 Question. 884 00:47:01,594 --> 00:47:04,029 AUDIENCE: So what's the intuition behind the w? 885 00:47:04,029 --> 00:47:06,451 We solved it and figured out numbers and integrations 886 00:47:06,451 --> 00:47:06,951 with it. 887 00:47:06,951 --> 00:47:08,671 But what is it in relation to-- 888 00:47:08,671 --> 00:47:10,420 PROFESSOR: Question is, what is intuition? 889 00:47:10,420 --> 00:47:12,976 What is w? 890 00:47:12,976 --> 00:47:14,980 w is the dividing line. 891 00:47:14,980 --> 00:47:16,996 It is the drop dead dividing line.
892 00:47:16,996 --> 00:47:18,620 When I say the drop dead dividing line, 893 00:47:18,620 --> 00:47:21,960 you like those big, bold solid lines over there. 894 00:47:21,960 --> 00:47:23,680 Those are your pretty certain lines. 895 00:47:23,680 --> 00:47:25,057 Everything past that was a minus. 896 00:47:25,057 --> 00:47:26,640 In your training, it's that everything 897 00:47:26,640 --> 00:47:30,150 past the big bold line there was a plus in your training stuff. 898 00:47:30,150 --> 00:47:32,290 But the dotted line is the one you're really 899 00:47:32,290 --> 00:47:34,350 going to use in the test data. 900 00:47:34,350 --> 00:47:36,770 In the test data, when push comes to shove, 901 00:47:36,770 --> 00:47:39,910 you might get something if it's inside of that gutter. 902 00:47:39,910 --> 00:47:41,879 And if it's on the, say, of that one 903 00:47:41,879 --> 00:47:44,420 up there, if it's on the upper left side of that dotted line, 904 00:47:44,420 --> 00:47:47,040 you're going to call it a plus. 905 00:47:47,040 --> 00:47:49,870 So that dotted line is your decision boundary. 906 00:47:49,870 --> 00:47:54,010 And that is basically the idea. 907 00:47:54,010 --> 00:47:56,952 And in fact, the way that the algorithm would do it 908 00:47:56,952 --> 00:47:58,660 on the computer is it would quadratically 909 00:47:58,660 --> 00:48:01,430 optimize the alphas, which messes around 910 00:48:01,430 --> 00:48:02,610 with the dotted line. 911 00:48:02,610 --> 00:48:06,230 And by quadratically maximizing the alphas-- you see 912 00:48:06,230 --> 00:48:07,884 how the alphas add up to a w. 913 00:48:07,884 --> 00:48:08,925 It just checks it around. 914 00:48:08,925 --> 00:48:12,519 And eventually, it finds, oh, making the alpha of this one 0 915 00:48:12,519 --> 00:48:13,810 makes it a better optimization. 916 00:48:13,810 --> 00:48:15,980 You're trying to get the widest possible road. 917 00:48:15,980 --> 00:48:17,715 It would eventually come out to this. 
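The role of the dotted line as the decision boundary can be illustrated with a tiny classifier. This is a sketch using the problem-one values w = (-1/4, 1/4) and b = 1/4 from earlier in the recitation; the function name `classify` and the two test points inside the gutter are made up for illustration.

```python
def classify(w, b, x):
    """Return '+' if x is on the plus side of the dotted line w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "+" if score >= 0 else "-"


# Problem-one boundary from the recitation.
w = (-0.25, 0.25)
b = 0.25

print(classify(w, b, (-1, 2)))   # '+'  (the plus support vector, score +1)
print(classify(w, b, (3, -2)))   # '-'  (the minus support vector, score -1)
print(classify(w, b, (0, 0)))    # '+'  (inside the gutter, upper left of the line)
print(classify(w, b, (2, 0)))    # '-'  (inside the gutter, lower right of the line)
```

Points with a score strictly between -1 and +1 are inside the gutter, but the sign of the score still decides the label.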
918 00:48:17,715 --> 00:48:19,340 This is trivial for a human to eyeball. 919 00:48:19,340 --> 00:48:21,500 But some real problems with 200 data 920 00:48:21,500 --> 00:48:24,720 points that have to get one or two of them wrong, classified, 921 00:48:24,720 --> 00:48:27,960 and you may be using a quadratic kernel or something 922 00:48:27,960 --> 00:48:29,500 that, you can't do that. 923 00:48:29,500 --> 00:48:30,510 You just can't. 924 00:48:30,510 --> 00:48:32,930 Well, maybe you can, in which case you 925 00:48:32,930 --> 00:48:34,782 should be getting a MacArthur Fellowship 926 00:48:34,782 --> 00:48:35,740 or something like that. 927 00:48:35,740 --> 00:48:38,310 But the computer can. 928 00:48:38,310 --> 00:48:40,680 And the basic idea is when it comes down to it, 929 00:48:40,680 --> 00:48:44,860 it figures out the alphas that give the best w for the widest road. 930 00:48:44,860 --> 00:48:47,410 And the w is your decision boundary. 931 00:48:47,410 --> 00:48:48,620 Good question. 932 00:48:48,620 --> 00:48:52,270 Any other questions about our old friend, SVM? 933 00:48:52,270 --> 00:48:54,250 I have a question for you. 934 00:48:54,250 --> 00:48:57,090 After seeing this-- and let's pretend that they only 935 00:48:57,090 --> 00:49:01,065 asked to solve for w's, b's, these kind of kernels and phi, 936 00:49:01,065 --> 00:49:03,810 which are the typical things-- how many people now 937 00:49:03,810 --> 00:49:10,030 think that you can go through and work an SVM problem? 938 00:49:10,030 --> 00:49:12,270 All right, we've got a few. 939 00:49:12,270 --> 00:49:14,250 We've got a happy few. 940 00:49:14,250 --> 00:49:15,150 Band of brothers. 941 00:49:15,150 --> 00:49:16,980 Maybe eight people raised their hand there. 942 00:49:16,980 --> 00:49:18,010 That's good. 943 00:49:18,010 --> 00:49:21,630 How many people know what a support vector is now? 944 00:49:21,630 --> 00:49:22,635 That's really good.
945 00:49:25,360 --> 00:49:29,410 Because if that's all you learned from today's 946 00:49:29,410 --> 00:49:32,300 recitation, it's still good. 947 00:49:32,300 --> 00:49:33,130 It really is. 948 00:49:33,130 --> 00:49:33,800 I'm telling you. 949 00:49:33,800 --> 00:49:37,360 I had to take two classes on this and then TA it before I 950 00:49:37,360 --> 00:49:39,800 really, really understood it. 951 00:49:39,800 --> 00:49:44,660 So you guys are ahead of me. 952 00:49:44,660 --> 00:49:45,790 All right, take care. 953 00:49:45,790 --> 00:49:47,340 Have a great weekend. 954 00:49:47,340 --> 00:49:51,390 And we'll see you for boosting and vampires next week.
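[Editor's sketch] The rule for going between phis and k's from earlier in this recitation (given a phi, take the dot product; given a k, treat it as a dot product and reverse engineer) can be checked numerically. This is a minimal sketch under stated assumptions: the two transforms below are textbook examples, not the ones written on the board, and the names `phi_quadratic`, `phi_radius`, and `kernel_from_phi` are hypothetical. It relies only on the definition K(u, v) = phi(u) . phi(v).

```python
import math

def dot(u, v):
    """Ordinary dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def kernel_from_phi(phi, u, v):
    """Given a phi, the kernel is the dot product of the transformed points."""
    return dot(phi(u), phi(v))

# Direction 1: given a k, reverse engineer a phi.
# Assumed example kernel: K(u, v) = (u . v)^2 in two dimensions.
def kernel_quadratic(u, v):
    return dot(u, v) ** 2

# Expanding (u1*v1 + u2*v2)^2 term by term suggests this transform.
def phi_quadratic(x):
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

# Direction 2: a phi with just one component.
# The "dot product" of one-component vectors is a plain multiplication.
def phi_radius(x):
    x1, x2 = x
    return (x1 * x1 + x2 * x2,)

u, v = (1.0, 2.0), (3.0, -1.0)

# Both routes to the quadratic kernel should agree up to rounding.
print(kernel_quadratic(u, v), kernel_from_phi(phi_quadratic, u, v))

# One-component case: K(u, v) = (u1^2 + u2^2) * (v1^2 + v2^2).
print(kernel_from_phi(phi_radius, u, v))
```

The comparison between the two quadratic routes uses a tolerance rather than exact equality because the square root of 2 is not exactly representable in floating point.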