1 00:00:01,150 --> 00:00:05,510 We've seen graphs involving boys and girls and connections 2 00:00:05,510 --> 00:00:09,600 between them in the context of our sexual demographics 3 00:00:09,600 --> 00:00:12,320 calculation and study. 4 00:00:12,320 --> 00:00:16,300 A similar problem comes up in terms 5 00:00:16,300 --> 00:00:19,780 of what I call stable matching, which is again 6 00:00:19,780 --> 00:00:24,230 the issue of matching up boys and girls in a special way 7 00:00:24,230 --> 00:00:25,480 according to some constraints. 8 00:00:25,480 --> 00:00:27,479 It turns out to have a lot of applications which 9 00:00:27,479 --> 00:00:29,030 we'll discuss toward the end. 10 00:00:29,030 --> 00:00:30,680 Let's just look at what the problem is. 11 00:00:30,680 --> 00:00:35,040 The setup is that there's some number of boys, in this case 5, 12 00:00:35,040 --> 00:00:38,920 1 through 5, and an equal number of girls labeled A through E. 13 00:00:38,920 --> 00:00:43,451 And each of the boys has a ranking of the girls. 14 00:00:43,451 --> 00:00:45,200 Different rankings, because different boys 15 00:00:45,200 --> 00:00:46,620 have different preferences. 16 00:00:46,620 --> 00:00:49,549 And likewise, the girls have rankings of the boys. 17 00:00:49,549 --> 00:00:51,340 Different girls have different preferences. 18 00:00:51,340 --> 00:00:56,070 So here, girl A likes boy 3 best and boy 5 second best, 19 00:00:56,070 --> 00:01:01,670 and boy 1 likes girl C best and girl D least. 20 00:01:01,670 --> 00:01:03,660 So the problem, basically, is that we 21 00:01:03,660 --> 00:01:05,870 want to get all the boys married to all the girls. 22 00:01:05,870 --> 00:01:11,470 We want to form 5 monogamous bisexual marriages, a boy 23 00:01:11,470 --> 00:01:16,770 and a girl, and we'd like, in some vague way, 24 00:01:16,770 --> 00:01:21,946 to acknowledge these preferences and satisfy as many as we can. 25 00:01:21,946 --> 00:01:23,820 I'll be more specific about that in a minute. 
26 00:01:23,820 --> 00:01:25,420 But let's just play with that idea 27 00:01:25,420 --> 00:01:29,420 of trying to accommodate people's preferences. 28 00:01:29,420 --> 00:01:32,250 So one way to do it is let's just decide, 29 00:01:32,250 --> 00:01:33,970 well, we'll favor the boys this time. 30 00:01:33,970 --> 00:01:36,430 Let's try a greedy strategy for the boys. 31 00:01:36,430 --> 00:01:38,170 Let's just look at the boy preferences. 32 00:01:38,170 --> 00:01:39,711 And a greedy strategy means I'm going 33 00:01:39,711 --> 00:01:43,520 to try to give each boy the best possible choice that he 34 00:01:43,520 --> 00:01:44,040 can make. 35 00:01:44,040 --> 00:01:46,300 So I'm going to start off by deciding 36 00:01:46,300 --> 00:01:49,540 that let's let boy 1 have his first choice, girl 37 00:01:49,540 --> 00:01:51,742 C. I'm going to marry them off. 38 00:01:51,742 --> 00:01:53,450 And once I've married them off, I'll just 39 00:01:53,450 --> 00:01:57,240 stop considering 1 and C. And now I have a reduced problem. 40 00:01:57,240 --> 00:02:00,250 I have four remaining boys and four remaining girls. 41 00:02:00,250 --> 00:02:02,740 And proceeding in this way-- greedy way for boys-- 42 00:02:02,740 --> 00:02:07,910 I'm going to now give boy 2 his next choice that remains, 43 00:02:07,910 --> 00:02:10,530 namely girl A. And I'll marry them off. 44 00:02:10,530 --> 00:02:15,160 And again, now 2 and A can be eliminated from consideration. 45 00:02:15,160 --> 00:02:17,780 I continue in this way and I wind up 46 00:02:17,780 --> 00:02:21,660 with this set of five marriages ending with boy 5 47 00:02:21,660 --> 00:02:24,050 married to girl E. 48 00:02:24,050 --> 00:02:25,184 OK. 49 00:02:25,184 --> 00:02:26,850 Now if we look at this set of marriages, 50 00:02:26,850 --> 00:02:28,347 there's a problem which motivates 51 00:02:28,347 --> 00:02:30,180 the whole stable marriage problem that we're 52 00:02:30,180 --> 00:02:32,280 going to be examining. 
53 00:02:32,280 --> 00:02:36,000 Namely, we've married off boy 1 to his first choice, girl C. 54 00:02:36,000 --> 00:02:38,700 He should be happy, but she may not be. 55 00:02:38,700 --> 00:02:42,810 And we've also married off boy 4 to girl B. 56 00:02:42,810 --> 00:02:48,690 Now a difficulty here is that if you look at the preferences, 57 00:02:48,690 --> 00:02:55,810 girl C actually is more desirable to boy 4 than girl B. 58 00:02:55,810 --> 00:02:59,970 That is, boy 4 likes somebody else's wife 59 00:02:59,970 --> 00:03:01,280 better than his own. 60 00:03:01,280 --> 00:03:06,130 And what makes it really bad is that girl C, the other person's 61 00:03:06,130 --> 00:03:10,000 wife, likes boy 4 better than her husband. 62 00:03:10,000 --> 00:03:12,530 Each of them would be better off if they ran off together. 63 00:03:12,530 --> 00:03:14,340 Whether they run off or not, 64 00:03:14,340 --> 00:03:16,215 they certainly are under tremendous pressure. 65 00:03:16,215 --> 00:03:19,540 It makes this set of marriages unstable. 66 00:03:19,540 --> 00:03:21,330 So they're called a rogue couple. 67 00:03:21,330 --> 00:03:23,300 When you have in a set of marriages 68 00:03:23,300 --> 00:03:27,930 a boy and a girl who prefer each other 69 00:03:27,930 --> 00:03:30,040 over their current spouses, they are 70 00:03:30,040 --> 00:03:35,300 said to be a rogue couple and a source of instability. 71 00:03:35,300 --> 00:03:37,330 So the stable marriage problem is: 72 00:03:37,330 --> 00:03:40,820 let's see if we can get everybody married off 73 00:03:40,820 --> 00:03:43,420 and have no rogue couples. 74 00:03:43,420 --> 00:03:45,400 It'll be a stable set of marriages. 
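The greedy strategy and the rogue-couple definition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `greedy_for_boys` and `rogue_couples` are made up for this sketch, and the three-boy, three-girl preference lists are hypothetical, not the five-by-five chart shown in the lecture (which is not reproduced in this transcript).

```python
# Hypothetical preference lists (NOT the lecture's chart), best choice first.
boy_prefs = {
    1: ["C", "A", "B"],
    2: ["C", "A", "B"],
    3: ["A", "B", "C"],
}
girl_prefs = {
    "A": [2, 3, 1],
    "B": [1, 2, 3],
    "C": [2, 1, 3],
}


def greedy_for_boys(boy_prefs):
    """Take the boys in order; each marries his favorite girl still unmarried."""
    taken = set()
    marriages = {}  # boy -> girl
    for boy, prefs in boy_prefs.items():
        girl = next(g for g in prefs if g not in taken)
        marriages[boy] = girl
        taken.add(girl)
    return marriages


def rogue_couples(marriages, boy_prefs, girl_prefs):
    """Return every (boy, girl) pair who prefer each other to their spouses."""
    husband = {girl: boy for boy, girl in marriages.items()}
    rogues = []
    for boy, wife in marriages.items():
        for girl in boy_prefs[boy]:
            if girl == wife:
                break  # every later girl ranks below his own wife
            ranking = girl_prefs[girl]
            # A lower index means the girl likes that boy better.
            if ranking.index(boy) < ranking.index(husband[girl]):
                rogues.append((boy, girl))
    return rogues


greedy = greedy_for_boys(boy_prefs)
print(greedy)
print(rogue_couples(greedy, boy_prefs, girl_prefs))
```

With these made-up lists, the greedy pass leaves boy 2 and girl C preferring each other to their spouses, the same kind of instability as boy 4 and girl C in the lecture. A set of marriages is stable exactly when `rogue_couples` returns an empty list.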
75 00:03:45,400 --> 00:03:47,600 Now people may not be happy, but it 76 00:03:47,600 --> 00:03:49,150 doesn't matter because they'll never 77 00:03:49,150 --> 00:03:53,510 find anybody else who is unhappy in the same way and 78 00:03:53,510 --> 00:03:57,320 would be willing to run off with them and make them happier. 79 00:03:57,320 --> 00:03:59,680 So it's stable. 80 00:03:59,680 --> 00:04:01,520 And it turns out that there always 81 00:04:01,520 --> 00:04:04,220 is a way to find a stable set of marriages. 82 00:04:04,220 --> 00:04:06,050 A couple of ways. 83 00:04:06,050 --> 00:04:08,860 But why don't we just play with the idea. 84 00:04:08,860 --> 00:04:12,820 Here is a display of those preferences again 85 00:04:12,820 --> 00:04:16,850 and you can stop the video and fiddle with a piece of paper 86 00:04:16,850 --> 00:04:21,100 and see if you can come up with a stable set of marriages 87 00:04:21,100 --> 00:04:22,760 between the boys and girls. 88 00:04:22,760 --> 00:04:24,880 We used to do this in class in real time. 89 00:04:24,880 --> 00:04:27,430 We would give five different boys 90 00:04:27,430 --> 00:04:29,520 a chart of preferences of girls, and we'd 91 00:04:29,520 --> 00:04:32,454 give the five different girls a chart of preferences of boys. 92 00:04:32,454 --> 00:04:34,370 They were not supposed to tell each other what 93 00:04:34,370 --> 00:04:36,250 their preferences were, but the girls 94 00:04:36,250 --> 00:04:38,499 were supposed to be interviewing the boys and the boys 95 00:04:38,499 --> 00:04:41,400 interviewing the girls simultaneously, in parallel, 96 00:04:41,400 --> 00:04:43,880 and try to agree to get married and see 97 00:04:43,880 --> 00:04:46,200 whether the set of marriages that they wound up with 98 00:04:46,200 --> 00:04:46,920 was stable. 99 00:04:46,920 --> 00:04:48,810 Most of the time they actually did wind up 100 00:04:48,810 --> 00:04:51,390 with a stable set of marriages, but not always. 
101 00:04:51,390 --> 00:04:52,930 Just an amusing exercise. 102 00:04:52,930 --> 00:04:55,187 And it does illustrate something about the fact 103 00:04:55,187 --> 00:04:57,520 that the procedures that we're going to be going through 104 00:04:57,520 --> 00:05:00,440 to find stable marriages work very nicely if you 105 00:05:00,440 --> 00:05:03,190 choose to do them in parallel. 106 00:05:03,190 --> 00:05:07,390 Anyway, there are, it turns out, two sets of stable marriages 107 00:05:07,390 --> 00:05:11,290 that we can find in this particular set of preferences. 108 00:05:11,290 --> 00:05:14,152 The simplest one to understand is all the girls 109 00:05:14,152 --> 00:05:15,110 get their first choice. 110 00:05:15,110 --> 00:05:18,580 It so happens, if you look at that chart, all of the girls 111 00:05:18,580 --> 00:05:21,240 have different first choice boys. 112 00:05:21,240 --> 00:05:23,890 If we simply give them their first choice, 113 00:05:23,890 --> 00:05:26,349 no girl is going to be tempted to be part of a rogue couple 114 00:05:26,349 --> 00:05:27,806 because she's got her first choice. 115 00:05:27,806 --> 00:05:28,880 It's absolutely stable. 116 00:05:28,880 --> 00:05:31,110 But of course, that's a very special circumstance. 117 00:05:31,110 --> 00:05:34,210 It would be unusual that all the girls' first choices were 118 00:05:34,210 --> 00:05:36,370 different, or likewise, it would be unusual 119 00:05:36,370 --> 00:05:38,610 if all the boys' first choices were different. 120 00:05:38,610 --> 00:05:40,140 But if they were, then you instantly 121 00:05:40,140 --> 00:05:42,530 get a stable set of marriages. 122 00:05:42,530 --> 00:05:45,600 There's another stable set that's not quite so obvious. 123 00:05:45,600 --> 00:05:50,337 And you can check that all of these pairs 124 00:05:50,337 --> 00:05:51,170 have no instability. 125 00:05:51,170 --> 00:05:55,730 There's no rogue couples in here when I marry 5 to A and 1 to E. 
126 00:05:55,730 --> 00:06:00,340 This is a so-called "boy optimal" set of marriages. 127 00:06:00,340 --> 00:06:03,020 It turns out that in this set of marriages, 128 00:06:03,020 --> 00:06:05,890 every boy gets the best possible spouse 129 00:06:05,890 --> 00:06:09,910 that he could possibly get in any set of stable marriages. 130 00:06:09,910 --> 00:06:11,510 There's no set of stable marriages 131 00:06:11,510 --> 00:06:14,810 in which boy 5 gets a more desirable girl than A. There's 132 00:06:14,810 --> 00:06:18,230 no set of stable marriages in which boy 1 gets a girl that's 133 00:06:18,230 --> 00:06:20,720 more desirable to him than girl E. 134 00:06:20,720 --> 00:06:23,490 The sad news is that it's simultaneously 135 00:06:23,490 --> 00:06:24,770 pessimal for the girls. 136 00:06:24,770 --> 00:06:28,732 That is, each girl is getting their worst possible spouse 137 00:06:28,732 --> 00:06:30,190 among all sets of stable marriages. 138 00:06:30,190 --> 00:06:32,210 We'll examine that further in a minute. 139 00:06:32,210 --> 00:06:35,290 But let me just point out that this is more than a puzzle. 140 00:06:35,290 --> 00:06:37,460 I mean, it's fun, and it's a nice puzzle, 141 00:06:37,460 --> 00:06:38,890 but it's more than a puzzle. 142 00:06:38,890 --> 00:06:42,220 Because the original case where it was studied or published 143 00:06:42,220 --> 00:06:45,810 first was in a paper by Gale and Shapley in 1962. 144 00:06:45,810 --> 00:06:49,070 You may remember the name David Gale from the subset game 145 00:06:49,070 --> 00:06:51,110 that we played early in the term when we were 146 00:06:51,110 --> 00:06:53,880 practicing with set relations. 
147 00:06:53,880 --> 00:06:56,170 And what they were dealing with was the problem 148 00:06:56,170 --> 00:06:58,490 of college admissions where students 149 00:06:58,490 --> 00:07:01,200 have rankings of colleges that they've applied to 150 00:07:01,200 --> 00:07:03,410 and their preferences, and colleges have 151 00:07:03,410 --> 00:07:06,990 rankings of students that have applied to them. 152 00:07:06,990 --> 00:07:11,020 And we're trying to get a matching between college offers 153 00:07:11,020 --> 00:07:13,490 and student preferences. 154 00:07:13,490 --> 00:07:17,390 And in a circumstance where a college made 155 00:07:17,390 --> 00:07:19,850 an offer and a student sort of accepted, but then later 156 00:07:19,850 --> 00:07:23,024 the student got another offer from a place they preferred 157 00:07:23,024 --> 00:07:24,690 more, and they were changing their mind, 158 00:07:24,690 --> 00:07:28,450 and withdrawing acceptances and so on, 159 00:07:28,450 --> 00:07:30,920 it was making everybody crazy, the administrators 160 00:07:30,920 --> 00:07:32,410 and the students themselves. 161 00:07:32,410 --> 00:07:36,260 And the desire was, let's get some stable set 162 00:07:36,260 --> 00:07:38,460 of offers on the table. 163 00:07:38,460 --> 00:07:41,940 And Gale and Shapley proposed a protocol 164 00:07:41,940 --> 00:07:43,570 to get stable marriages that would 165 00:07:43,570 --> 00:07:45,050 apply to college admissions. 166 00:07:45,050 --> 00:07:46,540 It turns out, interestingly enough, 167 00:07:46,540 --> 00:07:48,410 that although Gale and Shapley are credited 168 00:07:48,410 --> 00:07:50,540 with the stable marriage solution 169 00:07:50,540 --> 00:07:52,806 that we're going to discuss, they 170 00:07:52,806 --> 00:07:54,930 get that credit because they were the first to publish it. 
171 00:07:54,930 --> 00:07:56,763 But, in fact, it had been discovered and put 172 00:07:56,763 --> 00:07:59,690 into practice at least 20 years earlier 173 00:07:59,690 --> 00:08:04,050 by a national board whose job was to match interns 174 00:08:04,050 --> 00:08:06,866 in hospitals, that is graduating medical students who 175 00:08:06,866 --> 00:08:08,990 were about to start their further clinical training 176 00:08:08,990 --> 00:08:13,310 as interns, now called residents in modern language, 177 00:08:13,310 --> 00:08:14,880 had to be matched up with hospitals. 178 00:08:14,880 --> 00:08:17,330 And the residents had preferred hospitals 179 00:08:17,330 --> 00:08:19,310 that they'd like to go to, and the hospitals 180 00:08:19,310 --> 00:08:22,700 had rankings of residents that met their criteria. 181 00:08:22,700 --> 00:08:25,935 And again, the issue was, how do you assign residents 182 00:08:25,935 --> 00:08:28,130 to hospitals in a stable way? 183 00:08:28,130 --> 00:08:30,760 Before they had discovered this stability algorithm, 184 00:08:30,760 --> 00:08:31,480 it was a mess. 185 00:08:31,480 --> 00:08:34,159 Again, there's a wonderful story in the book by Gusfield 186 00:08:34,159 --> 00:08:38,130 and Irving that is an entire book about the stable marriage 187 00:08:38,130 --> 00:08:41,122 problem published by MIT Press, I think in '89. 188 00:08:41,122 --> 00:08:42,830 And as a matter of fact, I was the editor 189 00:08:42,830 --> 00:08:44,830 of the series in which it appears. 190 00:08:44,830 --> 00:08:47,160 This stable marriage problem turns out 191 00:08:47,160 --> 00:08:49,160 to have a lot of structure. 
192 00:08:49,160 --> 00:08:52,430 And they describe a wonderful anecdote in their preface 193 00:08:52,430 --> 00:08:54,186 about the problems that were happening 194 00:08:54,186 --> 00:08:55,810 between the hospitals and the residents 195 00:08:55,810 --> 00:08:59,080 and the measures that were taken to try to achieve stability 196 00:08:59,080 --> 00:09:00,670 before they discovered this algorithm. 197 00:09:03,700 --> 00:09:06,890 Another genuine computer science application 198 00:09:06,890 --> 00:09:09,835 is one that was described by Tom Leighton, who 199 00:09:09,835 --> 00:09:15,790 was a co-author of the text and is now the CEO of Akamai, 200 00:09:15,790 --> 00:09:19,080 an internet infrastructure plumbing company that 201 00:09:19,080 --> 00:09:21,880 has some large number of servers, 202 00:09:21,880 --> 00:09:26,080 I think 65,000 servers in 2010 around the world. 203 00:09:26,080 --> 00:09:28,960 And it basically is providing cached web pages 204 00:09:28,960 --> 00:09:33,260 so that they can respond more quickly to local calls. 205 00:09:33,260 --> 00:09:35,830 Their problem is that they have something 206 00:09:35,830 --> 00:09:39,020 like 62,000 to 65,000 servers and they 207 00:09:39,020 --> 00:09:42,480 get I think 200 billion web requests a day. 208 00:09:42,480 --> 00:09:44,740 So the web requests are kind of acting 209 00:09:44,740 --> 00:09:49,430 like the boys and the servers are 210 00:09:49,430 --> 00:09:51,940 kind of acting like the girls, or the hospitals 211 00:09:51,940 --> 00:09:53,990 and the residents. 212 00:09:53,990 --> 00:09:57,400 And the web requests have preferences 213 00:09:57,400 --> 00:09:59,547 based on proximity and speed of server, 214 00:09:59,547 --> 00:10:02,130 and the servers have preferences based on where they're located 215 00:10:02,130 --> 00:10:05,420 and the magnitude of the web requests coming to them. 
216 00:10:05,420 --> 00:10:09,500 And the question is, how do you assign web requests to servers 217 00:10:09,500 --> 00:10:13,840 so that things get done expeditiously? 218 00:10:13,840 --> 00:10:16,770 And it turns out that the stable marriage 219 00:10:16,770 --> 00:10:19,880 method gave a satisfactory way to accomplish 220 00:10:19,880 --> 00:10:22,255 that kind of matching. 221 00:10:22,255 --> 00:10:23,880 And in particular, it helps, because there are such 222 00:10:23,880 --> 00:10:27,010 large numbers involved, that the stable marriage ritual, which 223 00:10:27,010 --> 00:10:31,280 we'll describe shortly, is very amenable to being 224 00:10:31,280 --> 00:10:32,660 run in parallel. 225 00:10:32,660 --> 00:10:34,540 Another application that turned out to come up 226 00:10:34,540 --> 00:10:36,020 was in matching dance partners, when 227 00:10:36,020 --> 00:10:39,050 I was teaching this course ten years ago with a co-instructor 228 00:10:39,050 --> 00:10:41,090 who was a member of the Indian dance team. 229 00:10:41,090 --> 00:10:43,470 She said, we could use this, because it turns out 230 00:10:43,470 --> 00:10:47,340 that, again, there are boy and girl partners in the dance, 231 00:10:47,340 --> 00:10:50,850 and it was constantly the case that one boy would 232 00:10:50,850 --> 00:10:53,630 like another boy's partner better, and vice versa. 233 00:10:53,630 --> 00:10:57,132 And they would start pairing up and leaving 234 00:10:57,132 --> 00:10:59,340 the other people hanging and there were bad feelings, 235 00:10:59,340 --> 00:11:03,510 and it was a source of disruption in the society. 236 00:11:03,510 --> 00:11:05,900 There's a picture of that Indian dance group. 237 00:11:05,900 --> 00:11:07,950 My co-instructor's not actually there, 238 00:11:07,950 --> 00:11:12,080 but it gives you some sense of the reality 239 00:11:12,080 --> 00:11:14,480 of the problem here at MIT. 
240 00:11:14,480 --> 00:11:15,980 And she told me that it was actually 241 00:11:15,980 --> 00:11:18,420 being used by that group.