1 00:00:00,820 --> 00:00:03,140 PROFESSOR: The greatest common divisor of two numbers 2 00:00:03,140 --> 00:00:04,890 is easy to compute. 3 00:00:04,890 --> 00:00:08,850 And that's a factor will play a crucial role in the number 4 00:00:08,850 --> 00:00:12,760 theory we're going to develop, and the properties of some 5 00:00:12,760 --> 00:00:17,100 of the modern codes that are based on number theory. 6 00:00:17,100 --> 00:00:20,870 The efficient way to compute the GCD of two numbers 7 00:00:20,870 --> 00:00:22,940 is based on a classical algorithm 8 00:00:22,940 --> 00:00:25,140 known as Euclidean algorithm, which 9 00:00:25,140 --> 00:00:27,840 is several thousand years old. 10 00:00:27,840 --> 00:00:32,680 And let's describe how it works now. 11 00:00:32,680 --> 00:00:38,480 So the Euclidean algorithm is based on the following lemma, 12 00:00:38,480 --> 00:00:40,340 which we'll call the remainder lemma, 13 00:00:40,340 --> 00:00:43,650 and it says that if a and b are two integers, 14 00:00:43,650 --> 00:00:46,510 then the greatest common divisor of a and b 15 00:00:46,510 --> 00:00:49,920 is the same as the greatest common divisor of b, 16 00:00:49,920 --> 00:00:54,720 and the remainder of a divided by b-- providing, of course, 17 00:00:54,720 --> 00:00:58,140 b is not 0, because otherwise you can't divide by b. 18 00:00:58,140 --> 00:01:00,622 OK how do you make sense out of this? 19 00:01:00,622 --> 00:01:01,330 Why is this true? 20 00:01:01,330 --> 00:01:02,913 Well, it's actually a very easy proof. 21 00:01:02,913 --> 00:01:07,090 Remember that by the so-called division algorithm-- 22 00:01:07,090 --> 00:01:12,134 or it's really a theorem-- if you divide a by b 23 00:01:12,134 --> 00:01:14,300 and we're doing integer division, what that means is 24 00:01:14,300 --> 00:01:17,710 you find a quotient of a divided by b in the quotient, 25 00:01:17,710 --> 00:01:18,626 and a remainder. 26 00:01:18,626 --> 00:01:20,000 And the quotient has the property 27 00:01:20,000 --> 00:01:24,030 that q times b plus the remainder is equal to a. 28 00:01:24,030 --> 00:01:27,190 The remainder is always going to be smaller than a. 29 00:01:27,190 --> 00:01:30,950 It will be the range from 0 up to, but not including, a. 30 00:01:30,950 --> 00:01:34,930 OK, if you look at this simple expression, what 31 00:01:34,930 --> 00:01:39,900 becomes apparent is that if you've got a divisor of two 32 00:01:39,900 --> 00:01:42,990 out of three of these terms, then 33 00:01:42,990 --> 00:01:44,980 it's going to divide the third term. 34 00:01:44,980 --> 00:01:49,420 So for example, if you have a divisor of b and r, 35 00:01:49,420 --> 00:01:52,730 then the sum of those two things is also 36 00:01:52,730 --> 00:01:55,180 going to have the same divisor, which means 37 00:01:55,180 --> 00:01:57,080 that a will have that divisor. 38 00:01:57,080 --> 00:02:01,120 If something divides both a and b, then it divides r. 39 00:02:01,120 --> 00:02:04,672 And if it divides b and r, it divides a. 40 00:02:04,672 --> 00:02:08,495 And that means that a and b and b and r 41 00:02:08,495 --> 00:02:10,470 have exactly the same divisors. 42 00:02:10,470 --> 00:02:13,510 They not only have the same greatest common divisor, 43 00:02:13,510 --> 00:02:14,860 all their divisors are the same. 44 00:02:14,860 --> 00:02:18,180 So obviously, the greatest one is the same. 45 00:02:18,180 --> 00:02:21,980 And that proves this key remainder lemma. 46 00:02:21,980 --> 00:02:24,580 Well, the remainder lemma now gives us a very lovely way 47 00:02:24,580 --> 00:02:25,960 to compute the GCD. 48 00:02:25,960 --> 00:02:27,620 And here's an example. 49 00:02:27,620 --> 00:02:31,370 Suppose I want to compute the GCD of 899 and 493. 50 00:02:31,370 --> 00:02:34,650 A is 899, b is 493. 51 00:02:34,650 --> 00:02:38,030 Well, so I want this GCD, 899 of 493. 52 00:02:38,030 --> 00:02:43,660 Well, according to the remainder lemma, if I divide 899 by 493, 53 00:02:43,660 --> 00:02:48,620 I get a quotient of 1, and a remainder of 406. 54 00:02:48,620 --> 00:02:52,180 So that means that 899 and 493 have 55 00:02:52,180 --> 00:03:00,790 the same GCD as 493 and 406. 56 00:03:00,790 --> 00:03:04,210 That is the original number b, and the new remainder 406. 57 00:03:04,210 --> 00:03:07,660 But now, I can divide 493 by 406. 58 00:03:07,660 --> 00:03:11,750 I get a quotient of zero and a remainder of 87. 59 00:03:11,750 --> 00:03:14,990 So 406 and 87 have the same GCD. 60 00:03:14,990 --> 00:03:19,930 Dividing 406 by 87, I get that 87 and 58 have the same GCD. 61 00:03:19,930 --> 00:03:27,830 Dividing 87 by 58, I get that 58 and 29 have the same GCD. 62 00:03:27,830 --> 00:03:33,250 And now I win, because look, when I divide 58 by 29, 63 00:03:33,250 --> 00:03:35,660 I get a remainder of 0. 64 00:03:35,660 --> 00:03:39,360 And the GCD of anything and 0 is that thing. 65 00:03:39,360 --> 00:03:43,290 So the GCD of 29 and 0 is 0. 66 00:03:43,290 --> 00:03:46,070 I guess the only exception is the GCD of 0 and 0, 67 00:03:46,070 --> 00:03:48,740 which is not defined. 68 00:03:48,740 --> 00:03:53,540 But if it's not 0, then the GCD of x and 0 is x. 69 00:03:53,540 --> 00:03:54,230 And there it is. 70 00:03:54,230 --> 00:03:59,610 So I've just found that the GCD of 899 and 493 is 29. 71 00:03:59,610 --> 00:04:01,400 And this is a quite fast algorithm, 72 00:04:01,400 --> 00:04:03,660 because I keep dividing the numbers that I 73 00:04:03,660 --> 00:04:07,260 have by each other, and it gets small fast. 74 00:04:07,260 --> 00:04:09,240 We'll be more precise about that in a minute. 75 00:04:09,240 --> 00:04:13,830 OK, it's a good exercise in state machine thinking 76 00:04:13,830 --> 00:04:15,570 and practice in program verification 77 00:04:15,570 --> 00:04:17,800 to reformulate the Euclidean algorithm, 78 00:04:17,800 --> 00:04:19,759 or formulate it explicitly as a state machine. 79 00:04:19,759 --> 00:04:22,190 It's a very simple kind of state machine. 80 00:04:22,190 --> 00:04:24,660 The states of this Euclidean algorithm state machine 81 00:04:24,660 --> 00:04:26,350 will be pairs of non-negative integers. 82 00:04:26,350 --> 00:04:29,340 So the states are n cross n, the Cartesian product 83 00:04:29,340 --> 00:04:32,300 of the non-negative integers, with itself. 84 00:04:32,300 --> 00:04:34,650 The start state is going to be the pair a, b, 85 00:04:34,650 --> 00:04:37,990 whose GCD I want to compute. 86 00:04:37,990 --> 00:04:42,000 And the transitions are simply repeatedly applying 87 00:04:42,000 --> 00:04:43,050 the remainder lemma. 88 00:04:43,050 --> 00:04:47,890 Namely, if I'm in state x, y, where 89 00:04:47,890 --> 00:04:52,050 you think of x as and y as the GCD that I'm trying to compute, 90 00:04:52,050 --> 00:04:56,120 I simply convert x and y to y, and the remainder 91 00:04:56,120 --> 00:04:58,250 of x divided by y. 92 00:04:58,250 --> 00:05:02,680 And I keep doing that as long as y is not 0. 93 00:05:02,680 --> 00:05:05,470 OK, very simple state machine-- really, 94 00:05:05,470 --> 00:05:08,010 just one transition rule. 95 00:05:08,010 --> 00:05:10,810 Well, according to the lemma, since I'm 96 00:05:10,810 --> 00:05:14,960 replacing the GCD of x and y by the GCD of y 97 00:05:14,960 --> 00:05:17,000 and the remainder of x divided by y, 98 00:05:17,000 --> 00:05:19,850 the GCD is actually staying constant. 99 00:05:19,850 --> 00:05:22,790 This transition preserves the GCD 100 00:05:22,790 --> 00:05:26,480 that's left in the pair of registers, x and y. 101 00:05:26,480 --> 00:05:29,330 So what we can say is that since the GCD of x and y 102 00:05:29,330 --> 00:05:32,120 doesn't change from one step to another, 103 00:05:32,120 --> 00:05:35,340 we can say that the GCD of x and y at any point 104 00:05:35,340 --> 00:05:38,790 is equal to its original value, which is the GCD of a and b. 105 00:05:38,790 --> 00:05:40,380 So in other words, this equation, 106 00:05:40,380 --> 00:05:42,820 GCD of x and y in the current state 107 00:05:42,820 --> 00:05:46,490 is equal to GCD of a and b, the GCD of a and b 108 00:05:46,490 --> 00:05:49,550 that we started with, is a preserved invariant 109 00:05:49,550 --> 00:05:50,160 of the state. 110 00:05:50,160 --> 00:05:54,900 So p of a state xy, the property that GCD of x and y 111 00:05:54,900 --> 00:05:59,170 is the original GCD is a preserved invariant 112 00:05:59,170 --> 00:06:00,990 of the state machine. 113 00:06:00,990 --> 00:06:04,220 Moreover, p of start is trivially true, 114 00:06:04,220 --> 00:06:08,290 because at the start, x and y are a equals b. 115 00:06:08,290 --> 00:06:10,560 So p of x and y is just saying the GCD of a and b 116 00:06:10,560 --> 00:06:12,690 is equal to GCD of a and b. 117 00:06:12,690 --> 00:06:13,280 Cool. 118 00:06:13,280 --> 00:06:16,620 So I've got that this property is true at the start, 119 00:06:16,620 --> 00:06:18,600 and it's preserved by the transitions. 120 00:06:18,600 --> 00:06:20,900 So the invariance principle tells me 121 00:06:20,900 --> 00:06:24,350 that if the program stops, I'm going 122 00:06:24,350 --> 00:06:27,820 to have the GCD of x and y when it terminates 123 00:06:27,820 --> 00:06:30,440 is equal to the actual GCD that I want. 124 00:06:30,440 --> 00:06:33,050 And that enables us to prove partial correctness. 125 00:06:33,050 --> 00:06:35,450 The claim is that if this program terminates-- 126 00:06:35,450 --> 00:06:38,830 we haven't determined that it does yet-- but at termination, 127 00:06:38,830 --> 00:06:45,595 if any, I claim that x is left in-- that the GCD of a and b 128 00:06:45,595 --> 00:06:47,230 is left in register x. 129 00:06:47,230 --> 00:06:49,780 The value of x at the end is going 130 00:06:49,780 --> 00:06:51,040 to be the GCD of and and b. 131 00:06:51,040 --> 00:06:51,789 Well, why is that? 132 00:06:51,789 --> 00:06:55,430 Well, look-- at termination, what we know is that y is 0. 133 00:06:55,430 --> 00:06:57,470 That's the only way that this procedure stops, 134 00:06:57,470 --> 00:07:01,230 because otherwise, the transition rule is applicable. 135 00:07:01,230 --> 00:07:03,650 So that means that when y equals 0 136 00:07:03,650 --> 00:07:09,480 at termination, what we have is that since y is 0, 137 00:07:09,480 --> 00:07:13,420 GCD of x and y is equal to the GCD of x and 0. 138 00:07:13,420 --> 00:07:17,046 And that's equal to x, assuming, again, that x is positive, 139 00:07:17,046 --> 00:07:18,480 or not 0.z 140 00:07:18,480 --> 00:07:20,330 So x is the GCD of x and y. 141 00:07:20,330 --> 00:07:23,170 And by the invariant, the GCD of x and y 142 00:07:23,170 --> 00:07:25,180 is equal to the GCD of a and b. 143 00:07:25,180 --> 00:07:26,970 So I've prove this little fact. 144 00:07:26,970 --> 00:07:30,630 This procedure correctly computes the GCD of a and b, 145 00:07:30,630 --> 00:07:34,830 leaving the answer in register x, if it terminates. 146 00:07:34,830 --> 00:07:37,710 Well, of course it terminates, and it terminates fast. 147 00:07:37,710 --> 00:07:40,610 So let's see why. 148 00:07:40,610 --> 00:07:46,390 Notice that at each transition, we're going to replace x by y, 149 00:07:46,390 --> 00:07:50,050 and y by the remainder of x divided by y. 150 00:07:50,050 --> 00:07:51,650 Let's just assume for simplicity that 151 00:07:51,650 --> 00:07:55,400 of the [? pairings, ?] y that x is the bigger one. 152 00:07:55,400 --> 00:08:01,460 So there's two cases of why these numbers are getting small 153 00:08:01,460 --> 00:08:02,240 fast. 154 00:08:02,240 --> 00:08:06,900 The first case is suppose that y is less than x over 2, 155 00:08:06,900 --> 00:08:08,370 or less than or equal to x over 2. 156 00:08:08,370 --> 00:08:15,870 Well, since at this step, you're going to replace x by y, 157 00:08:15,870 --> 00:08:19,020 it means that you're replacing x by something that's 158 00:08:19,020 --> 00:08:19,920 less than half x. 159 00:08:19,920 --> 00:08:23,420 So x gets halved at this step. 160 00:08:23,420 --> 00:08:24,870 What about if y is big? 161 00:08:24,870 --> 00:08:27,330 Well, if y is bigger than x over 2, 162 00:08:27,330 --> 00:08:32,270 then the remainder of x divided by y is simply x minus y. 163 00:08:32,270 --> 00:08:34,490 And it's going to be less than x over 2. 164 00:08:34,490 --> 00:08:38,870 But that's going to be the value of y after the next step. 165 00:08:38,870 --> 00:08:40,659 So y is going to be halved either 166 00:08:40,659 --> 00:08:43,130 at this step or the next step when it's replaced 167 00:08:43,130 --> 00:08:45,000 by the remainder of x and y. 168 00:08:45,000 --> 00:08:49,460 And the net result is that y it gets cut in half, 169 00:08:49,460 --> 00:08:53,610 or even smaller, at every other step, which 170 00:08:53,610 --> 00:08:58,290 means that this procedure can't continue for more than twice 171 00:08:58,290 --> 00:09:02,730 the log to the base 2 of the original value of y, which 172 00:09:02,730 --> 00:09:06,930 is b number of steps, because that's how many 173 00:09:06,930 --> 00:09:10,200 halves you can do before you start hitting 0. 174 00:09:10,200 --> 00:09:13,920 So we've just shown that this procedure holds 175 00:09:13,920 --> 00:09:16,970 in logarithmic number of steps, which 176 00:09:16,970 --> 00:09:20,140 is the same as saying that it's about the length of b 177 00:09:20,140 --> 00:09:24,970 in binary, and even fewer steps than the length of b 178 00:09:24,970 --> 00:09:25,710 in decimal. 179 00:09:25,710 --> 00:09:30,470 The GCD algorithm is really very efficient.