1 00:00:01,060 --> 00:00:04,120 PROFESSOR: The example of scheduling courses in terms 2 00:00:04,120 --> 00:00:06,780 is really a special case of a general problem 3 00:00:06,780 --> 00:00:10,850 that you can probably see of scheduling a bunch of tasks 4 00:00:10,850 --> 00:00:13,290 or jobs under constraints of which ones have 5 00:00:13,290 --> 00:00:15,830 to be done before other ones, which 6 00:00:15,830 --> 00:00:17,610 is a topic that comes up actually 7 00:00:17,610 --> 00:00:18,860 in lots of applications. 8 00:00:18,860 --> 00:00:21,160 But you can see applications in computer science 9 00:00:21,160 --> 00:00:25,550 where you might have a complex calculation, pieces 10 00:00:25,550 --> 00:00:27,800 of which could be done in parallel and other parts had 11 00:00:27,800 --> 00:00:30,060 to be done in order because later results depended 12 00:00:30,060 --> 00:00:33,100 on the results of an earlier computation. 13 00:00:33,100 --> 00:00:34,740 It leads us to the general discussion 14 00:00:34,740 --> 00:00:36,500 of parallel scheduling. 15 00:00:36,500 --> 00:00:40,810 And we've already worked out some theory of that really just 16 00:00:40,810 --> 00:00:42,080 from the example. 17 00:00:42,080 --> 00:00:44,590 Namely, if we look at the minimum number of terms 18 00:00:44,590 --> 00:00:49,600 to graduate, this corresponds to the minimum amount 19 00:00:49,600 --> 00:00:52,930 of number of stages or the minimum amount of time 20 00:00:52,930 --> 00:00:55,650 that it takes to process a bunch of tasks, 21 00:00:55,650 --> 00:00:58,860 assuming that you can do tasks in parallel 22 00:00:58,860 --> 00:01:02,080 and as many in parallel as you need 23 00:01:02,080 --> 00:01:04,760 to-- that there's no limit on the amount of parallelism 24 00:01:04,760 --> 00:01:05,570 allowed. 25 00:01:05,570 --> 00:01:09,470 In that case, what we can say is that the minimum parallel time 26 00:01:09,470 --> 00:01:14,890 for a bunch of constrained tasks is simply the maximum chain 27 00:01:14,890 --> 00:01:17,730 size in the constraint graph. 28 00:01:17,730 --> 00:01:20,470 We saw that example with the course 29 00:01:20,470 --> 00:01:23,020 prerequisites where we had five. 30 00:01:23,020 --> 00:01:24,930 And in general, this is the theorem. 31 00:01:24,930 --> 00:01:29,980 Minimum parallel time is exactly equal to maximum change size 32 00:01:29,980 --> 00:01:32,470 for chains in the graph that constrains 33 00:01:32,470 --> 00:01:35,550 the order in which tasks can be completed. 34 00:01:35,550 --> 00:01:38,730 Now what about the maximum term load? 35 00:01:38,730 --> 00:01:40,830 Well, that corresponds to the number 36 00:01:40,830 --> 00:01:47,330 of processors you need to be doing tasks in parallel. 37 00:01:47,330 --> 00:01:50,700 So for the course scheduling example, 38 00:01:50,700 --> 00:01:54,110 it means how many subjects can you take in one term? 39 00:01:54,110 --> 00:01:56,880 But if you were, say, doing computations, 40 00:01:56,880 --> 00:02:00,850 how many separate CPUs would you need in order 41 00:02:00,850 --> 00:02:04,190 to be able to fully utilize the parallelism to as 42 00:02:04,190 --> 00:02:07,620 much in parallel as you possibly could and abound 43 00:02:07,620 --> 00:02:11,070 on the number of processors that are needed for minimum time is 44 00:02:11,070 --> 00:02:15,110 simply the maximum antichain size, which 45 00:02:15,110 --> 00:02:18,065 in the example from the previous segment on course scheduling, 46 00:02:18,065 --> 00:02:20,930 it turns out there were five courses you could take 47 00:02:20,930 --> 00:02:22,460 in one term, the second term. 48 00:02:22,460 --> 00:02:25,000 And that was, in fact, the maximum antichain size. 49 00:02:25,000 --> 00:02:30,190 So that's an upper bound on the number of processors that you 50 00:02:30,190 --> 00:02:32,640 need to achieve minimum time. 51 00:02:32,640 --> 00:02:34,990 But in fact, it's a course upper bound 52 00:02:34,990 --> 00:02:37,870 because although the number of processors needed 53 00:02:37,870 --> 00:02:40,340 to achieve minimum parallel time is at most 54 00:02:40,340 --> 00:02:42,450 the maximum antichain size. 55 00:02:42,450 --> 00:02:44,481 In fact, in the previous example, 56 00:02:44,481 --> 00:02:46,730 it turns out you could get away with three processors. 57 00:02:46,730 --> 00:02:49,330 It was possible to schedule the subjects 58 00:02:49,330 --> 00:02:51,780 so you only took three courses a term 59 00:02:51,780 --> 00:02:54,710 and still finished in minimum time. 60 00:02:54,710 --> 00:02:56,514 So can you do better than three subjects? 61 00:02:56,514 --> 00:02:58,930 Well, there's a trivial argument that says, no, you can't. 62 00:02:58,930 --> 00:03:01,710 Because in that previous example, 63 00:03:01,710 --> 00:03:04,280 we had 13 subjects to schedule. 64 00:03:04,280 --> 00:03:06,510 The maximum chain size was 5. 65 00:03:06,510 --> 00:03:09,480 So it was going to take at least five terms. 66 00:03:09,480 --> 00:03:12,270 So that means you have to distribute these 13 67 00:03:12,270 --> 00:03:14,250 subjects among five terms. 68 00:03:14,250 --> 00:03:17,280 There has to be some term that has 69 00:03:17,280 --> 00:03:19,980 at least the average number of subjects, 70 00:03:19,980 --> 00:03:22,620 namely 13 divided by 5. 71 00:03:22,620 --> 00:03:26,800 So that means there has to be a term in which you're taking 72 00:03:26,800 --> 00:03:28,680 13 divided by 5 subjects. 73 00:03:28,680 --> 00:03:32,280 Of course, you round up because it has to be an integer. 74 00:03:32,280 --> 00:03:36,800 So the minimum number of terms to finish and graduate-- 75 00:03:36,800 --> 00:03:39,330 finishing these 13 subjects in five terms-- 76 00:03:39,330 --> 00:03:43,330 is 3 because 13 divided by 5 rounded up is 3. 77 00:03:43,330 --> 00:03:47,270 And this is a general phenomenon that applies. 78 00:03:47,270 --> 00:03:52,410 And what we can say is that if you have a DAG with n vertices 79 00:03:52,410 --> 00:03:56,700 and the maximum chain size is c-- so that's 80 00:03:56,700 --> 00:04:00,830 how deep it can be at most-- and the maximum antichain size is 81 00:04:00,830 --> 00:04:03,700 a-- that's the largest number of things 82 00:04:03,700 --> 00:04:07,880 that you could ever possibly do in parallel-- then clearly, 83 00:04:07,880 --> 00:04:12,940 the total number of vertices is c times a, at most. 84 00:04:12,940 --> 00:04:15,190 So the total number of tasks that you 85 00:04:15,190 --> 00:04:19,399 can do where you are going to finish 86 00:04:19,399 --> 00:04:23,840 in c steps using at most a processors is bounded 87 00:04:23,840 --> 00:04:25,060 by c times a. 88 00:04:25,060 --> 00:04:27,280 So what that tells you is that you can't both 89 00:04:27,280 --> 00:04:30,540 have the antichain size and the chain size 90 00:04:30,540 --> 00:04:34,020 be too small because their product has to be at least n. 91 00:04:34,020 --> 00:04:37,760 That can be rephrased as a lemma that is credited 92 00:04:37,760 --> 00:04:38,830 to a guy named Dilworth. 93 00:04:38,830 --> 00:04:41,540 Dilworth is actually famous for Dilworth's theorem of which 94 00:04:41,540 --> 00:04:43,510 this Dilworth's lemma is a special case, 95 00:04:43,510 --> 00:04:45,550 but we don't need the general theorem. 96 00:04:45,550 --> 00:04:49,860 Dilworth's lemma says that if you have an n-vertex DAG, then 97 00:04:49,860 --> 00:04:54,750 for any number t, it either has a chain of size bigger than t, 98 00:04:54,750 --> 00:04:57,370 or it has an antichain of size greater than 99 00:04:57,370 --> 00:04:58,860 or equal to n over t. 100 00:04:58,860 --> 00:05:01,600 And we proved this on the previous slide. 101 00:05:01,600 --> 00:05:04,710 The product of these two things has to be at least n, 102 00:05:04,710 --> 00:05:07,730 and the general case is t times n over t 103 00:05:07,730 --> 00:05:10,100 is greater than or equal to n. 104 00:05:10,100 --> 00:05:12,860 And this holds for all t between 1 and n. 105 00:05:12,860 --> 00:05:15,710 Well, let's think of a simple application of that. 106 00:05:15,710 --> 00:05:20,730 If I choose the t that balances antichain size from chain size, 107 00:05:20,730 --> 00:05:22,790 then I choose t to be the square root of n. 108 00:05:22,790 --> 00:05:24,567 So over here, I have square root of n, 109 00:05:24,567 --> 00:05:26,650 and here I have n divided by the square root of n, 110 00:05:26,650 --> 00:05:28,020 which is also square root of n. 111 00:05:28,020 --> 00:05:33,070 And what we can conclude is that every end vertex 112 00:05:33,070 --> 00:05:36,830 DAG has either a chain of size at least the square root of n 113 00:05:36,830 --> 00:05:40,220 or an anti chain of size at least square root of n. 114 00:05:40,220 --> 00:05:43,550 This turns out to actually have a few applications, 115 00:05:43,550 --> 00:05:45,630 but we're just going to look at a fun 116 00:05:45,630 --> 00:05:47,960 application of this remark that you 117 00:05:47,960 --> 00:05:50,820 have to have a chain or an antichain of size 118 00:05:50,820 --> 00:05:52,110 at least square root of n. 119 00:05:52,110 --> 00:05:54,060 You might have only one of these. 120 00:05:54,060 --> 00:05:55,430 You might have both. 121 00:05:55,430 --> 00:05:58,440 But one or the other has to be at least as big 122 00:05:58,440 --> 00:06:00,820 as square root of n. 123 00:06:00,820 --> 00:06:02,960 Let's think of a new DAG that I'm 124 00:06:02,960 --> 00:06:04,210 going to construct as follows. 125 00:06:04,210 --> 00:06:08,460 I'm going to draw an edge between students in the class, 126 00:06:08,460 --> 00:06:10,600 and I'm going to think of one student 127 00:06:10,600 --> 00:06:13,720 as having a direct edge to another student 128 00:06:13,720 --> 00:06:18,210 if the first student is both shorter and younger-- actually 129 00:06:18,210 --> 00:06:21,540 meaning no taller than and no older than the other. 130 00:06:21,540 --> 00:06:23,870 Let's just say shorter-- meaning shorter or possibly 131 00:06:23,870 --> 00:06:27,060 the same height-- younger or possibly the same age. 132 00:06:27,060 --> 00:06:29,110 And so the rule is if I think of a student 133 00:06:29,110 --> 00:06:32,810 as being represented by their shortness s and their age 134 00:06:32,810 --> 00:06:37,360 a, then a student with a height s 1 135 00:06:37,360 --> 00:06:42,070 and age a 1 has a direct arrow to another student with height 136 00:06:42,070 --> 00:06:49,460 s 2 and age a 2, providing that the first pair is less than 137 00:06:49,460 --> 00:06:51,460 or equal to the second pair in both coordinates. 138 00:06:51,460 --> 00:06:53,370 S 1 is less than or equal to s 2. 139 00:06:53,370 --> 00:06:56,510 And A 1 is less than or equal to A 2. 140 00:06:56,510 --> 00:06:58,500 Now, we don't want ties here because that 141 00:06:58,500 --> 00:06:59,950 would break the DAG property if I 142 00:06:59,950 --> 00:07:02,970 have two students with exactly the same age and height. 143 00:07:02,970 --> 00:07:06,010 So let's assume that we're measuring age in microseconds 144 00:07:06,010 --> 00:07:08,210 and the height in micrometers. 145 00:07:08,210 --> 00:07:10,700 And with that kind of a fineness, 146 00:07:10,700 --> 00:07:12,400 the likelihood of a tie is pretty low. 147 00:07:12,400 --> 00:07:15,720 So then it becomes a DAG again. 148 00:07:15,720 --> 00:07:21,310 So this is the definition of taking a DAG built out 149 00:07:21,310 --> 00:07:25,950 of pairs-- there's a pure DAG for height, 150 00:07:25,950 --> 00:07:27,420 and there's a pure DAG for age. 151 00:07:27,420 --> 00:07:30,310 And I combine them into pairs, and I get a new DAG 152 00:07:30,310 --> 00:07:32,995 by looking at how the coordinates behave together. 153 00:07:32,995 --> 00:07:34,390 This is called the product graph. 154 00:07:34,390 --> 00:07:37,160 It's a general construction that comes up, 155 00:07:37,160 --> 00:07:38,940 and we will talk a little bit more 156 00:07:38,940 --> 00:07:43,240 about when we reexamined DAGs in the context 157 00:07:43,240 --> 00:07:46,270 of the language of relations and partial orders. 158 00:07:46,270 --> 00:07:49,350 Anyway, this is the product graph. 159 00:07:49,350 --> 00:07:52,490 According to Dilworth's lemma, in a class like ours 160 00:07:52,490 --> 00:07:55,450 of 141 students, it means that we're 161 00:07:55,450 --> 00:07:58,920 going to have a chain or an antichain in this product 162 00:07:58,920 --> 00:08:03,895 DAG of size square root of 141 rounded up, or 12. 163 00:08:07,046 --> 00:08:08,420 According to Dilworth's lemma, in 164 00:08:08,420 --> 00:08:13,100 this particular age-height graph, what does 165 00:08:13,100 --> 00:08:17,330 it mean for this to be an antichain? 166 00:08:17,330 --> 00:08:18,970 Suppose I take a bunch of students 167 00:08:18,970 --> 00:08:21,580 and I line them up in order of size 168 00:08:21,580 --> 00:08:26,360 with the tallest on the left and the shortest on the right. 169 00:08:26,360 --> 00:08:28,160 If this is going to be an antichain, 170 00:08:28,160 --> 00:08:33,940 it means that they have to be getting older 171 00:08:33,940 --> 00:08:35,230 as they get shorter. 172 00:08:35,230 --> 00:08:38,230 Because if I ever had a case where somebody to the right 173 00:08:38,230 --> 00:08:42,320 was both shorter and younger than somebody to the left, 174 00:08:42,320 --> 00:08:44,695 it wouldn't be an antichain because they'd be comparable. 175 00:08:46,640 --> 00:08:48,430 So an antichain-- according to Dilworth's 176 00:08:48,430 --> 00:08:50,830 lemma-- if you'd sort the students by height, 177 00:08:50,830 --> 00:08:55,740 they have to be getting older and as they get shorter. 178 00:08:55,740 --> 00:08:59,230 If it was a chain, they would be getting younger 179 00:08:59,230 --> 00:09:01,430 as they got shorter. 180 00:09:01,430 --> 00:09:05,620 But the more interesting one is the antichain 181 00:09:05,620 --> 00:09:08,050 in this height-birthday example. 182 00:09:08,050 --> 00:09:10,350 So we should be looking at-- we'll either have a chain 183 00:09:10,350 --> 00:09:15,230 or an antichain in this class according to this product DAG. 184 00:09:15,230 --> 00:09:19,040 As a matter of fact, we really had an antichain. 185 00:09:19,040 --> 00:09:22,160 Here's a quick list of a dozen students. 186 00:09:22,160 --> 00:09:24,950 And indeed, if you look at the birthdays, 187 00:09:24,950 --> 00:09:29,590 there's somebody who's 6'1 was born in August '94 and then 188 00:09:29,590 --> 00:09:34,010 somebody who was born in April '94 and is 6'0 all the way down 189 00:09:34,010 --> 00:09:37,820 to somebody who was born in 1991 and who's five feet tall. 190 00:09:37,820 --> 00:09:39,620 So we lucked out. 191 00:09:39,620 --> 00:09:41,650 We could have only had to chain, but we actually 192 00:09:41,650 --> 00:09:45,030 had the antichain in this case.