1 00:00:00,060 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,236 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,236 --> 00:00:17,861 at ocw.mit.edu. 8 00:00:20,364 --> 00:00:21,530 SRINIVAS DEVADAS: All right. 9 00:00:21,530 --> 00:00:23,860 Good morning, everyone. 10 00:00:23,860 --> 00:00:29,780 So more of the same in terms of cryptography and cryptographic 11 00:00:29,780 --> 00:00:32,750 techniques similar to Tuesday's lecture. 12 00:00:32,750 --> 00:00:35,430 So if you liked it, you'll like this one. 13 00:00:35,430 --> 00:00:37,180 If you didn't like it, well, it's 14 00:00:37,180 --> 00:00:41,020 going to be more of the same, so sorry. 15 00:00:41,020 --> 00:00:44,340 But what we're going to do today is 16 00:00:44,340 --> 00:00:47,840 do a couple of things a little bit differently. 17 00:00:47,840 --> 00:00:49,760 We're going to talk about encryption. 18 00:00:49,760 --> 00:00:51,660 So we talked about hashing, which, of course, 19 00:00:51,660 --> 00:00:55,150 you know about from the use of dictionaries. 20 00:00:55,150 --> 00:00:58,770 We haven't really talked about encryption in 046 or even 21 00:00:58,770 --> 00:01:02,770 in 006 previously. 22 00:01:02,770 --> 00:01:07,940 But we look at two different kinds of encryption algorithms. 23 00:01:07,940 --> 00:01:11,950 We spend a little bit of time on symmetric key encryption, 24 00:01:11,950 --> 00:01:15,310 which is really something that was used in encoding machines 25 00:01:15,310 --> 00:01:21,180 including the enigma from World War II that you probably saw 26 00:01:21,180 --> 00:01:23,190 if you saw The Imitation Game. 
27 00:01:23,190 --> 00:01:26,410 So there you have a shared secret. 28 00:01:26,410 --> 00:01:28,560 And it's that single, what's called, 29 00:01:28,560 --> 00:01:29,910 symmetric shared secret. 30 00:01:29,910 --> 00:01:32,320 Both parties know the secret. 31 00:01:32,320 --> 00:01:37,570 And very quickly, we'll talk about what it means 32 00:01:37,570 --> 00:01:39,620 to actually exchange a secret. 33 00:01:39,620 --> 00:01:41,360 So we'll talk about key exchange. 34 00:01:41,360 --> 00:01:45,350 And then I'll move to asymmetric key encryption, which 35 00:01:45,350 --> 00:01:47,240 I alluded to a little bit when we 36 00:01:47,240 --> 00:01:49,520 talked about digital signatures last time. 37 00:01:49,520 --> 00:01:51,860 I talked about public keys and private keys. 38 00:01:51,860 --> 00:01:53,700 But we're going to actually look at a couple 39 00:01:53,700 --> 00:01:56,840 of different public key encryption algorithms today. 40 00:01:56,840 --> 00:02:00,650 The classical algorithm, the very first algorithm really 41 00:02:00,650 --> 00:02:04,520 that stayed secure, the RSA algorithm. 42 00:02:04,520 --> 00:02:07,270 It stands for Rivest, Shamir, and Adleman, 43 00:02:07,270 --> 00:02:09,139 the initials of the three inventors, 44 00:02:09,139 --> 00:02:14,990 invented at MIT in 1977 and still in use today. 45 00:02:14,990 --> 00:02:17,410 And the last part of today's lecture 46 00:02:17,410 --> 00:02:21,220 is going to be looking at hardness. 47 00:02:21,220 --> 00:02:25,840 So in encryption, if you don't know the secret key, 48 00:02:25,840 --> 00:02:28,830 it should be hard for an adversary 49 00:02:28,830 --> 00:02:33,040 to discover what the message corresponds to. 50 00:02:33,040 --> 00:02:36,040 The adversary sees what's called the cipher text, which 51 00:02:36,040 --> 00:02:38,270 is the encrypted text. 52 00:02:38,270 --> 00:02:40,890 The adversary does not know the secret key. 
53 00:02:40,890 --> 00:02:43,990 If he or she knows the secret key, game over. 54 00:02:43,990 --> 00:02:47,280 But assuming the adversary does not know the secret key, 55 00:02:47,280 --> 00:02:51,810 it should be computationally hard to discover the message, right? 56 00:02:51,810 --> 00:02:53,340 That makes sense. 57 00:02:53,340 --> 00:02:55,510 So there's clearly a relationship 58 00:02:55,510 --> 00:02:59,840 between this hardness and NP-complete problems, 59 00:02:59,840 --> 00:03:01,930 which are computationally hard. 60 00:03:01,930 --> 00:03:04,520 And it's not surprising that people 61 00:03:04,520 --> 00:03:08,230 have tried to build what are called cryptosystems, which 62 00:03:08,230 --> 00:03:10,830 are essentially cryptographic techniques, based 63 00:03:10,830 --> 00:03:13,540 on the hardness of NP-complete problems, 64 00:03:13,540 --> 00:03:15,430 including graph coloring and knapsack 65 00:03:15,430 --> 00:03:16,810 and so on and so forth. 66 00:03:16,810 --> 00:03:21,260 But it turns out there's a real subtle difference 67 00:03:21,260 --> 00:03:28,520 between the kinds of problems that have been useful in terms 68 00:03:28,520 --> 00:03:31,490 of building secure cryptosystems like RSA 69 00:03:31,490 --> 00:03:33,320 and NP-complete problems. 70 00:03:33,320 --> 00:03:36,140 And we'll talk about that at the end of lecture. 71 00:03:36,140 --> 00:03:38,960 We'll probably spend a bunch of time on that. 72 00:03:38,960 --> 00:03:43,230 So let's get started with the symmetric key encryption, which 73 00:03:43,230 --> 00:03:45,130 is, at some level, kind of boring 74 00:03:45,130 --> 00:03:46,870 from a mathematical standpoint. 75 00:03:46,870 --> 00:03:49,380 So we won't spend a lot of time on it. 76 00:03:49,380 --> 00:03:51,520 It's definitely very useful. 77 00:03:51,520 --> 00:03:57,280 It essentially assumes that there's a secret key k. 78 00:03:57,280 --> 00:04:00,760 And you can think of this as a 128-bit number.
79 00:04:00,760 --> 00:04:04,280 Some people want it to be larger, 256. 80 00:04:04,280 --> 00:04:06,670 But certainly at 64, 81 00:04:06,670 --> 00:04:09,960 at this point, it's probably not enough, 82 00:04:09,960 --> 00:04:13,380 even though 2 raised to 64 is still a fairly large number. 83 00:04:13,380 --> 00:04:18,700 With parallelism and with fast computers, 84 00:04:18,700 --> 00:04:22,250 it's a little worrisome that the adversary only 85 00:04:22,250 --> 00:04:25,820 requires 2 raised to 64 work to enumerate 86 00:04:25,820 --> 00:04:30,160 all possible secret keys of 64 bits, right? 87 00:04:30,160 --> 00:04:33,910 So 2 raised to 128 is much better. 88 00:04:33,910 --> 00:04:37,830 And this is shared between Alice and Bob. 89 00:04:37,830 --> 00:04:41,330 So we'll think about the protagonists 90 00:04:41,330 --> 00:04:44,420 here as being Alice and Bob. 91 00:04:44,420 --> 00:04:47,440 And they want to exchange information. 92 00:04:47,440 --> 00:04:51,770 And typically, the adversary is Mal, for malicious. 93 00:04:54,770 --> 00:04:58,565 But there's other ways you can obviously name adversaries. 94 00:05:01,080 --> 00:05:05,120 So the basic equation here is really straightforward. 95 00:05:05,120 --> 00:05:08,310 It's c, which is the cipher text. 96 00:05:08,310 --> 00:05:12,880 So a little terminology here, cipher text. 97 00:05:12,880 --> 00:05:16,175 m is the plain text or the message. 98 00:05:19,100 --> 00:05:21,240 And plain text means you can just read it. 99 00:05:21,240 --> 00:05:23,870 Cipher text means it's scrambled. 100 00:05:23,870 --> 00:05:29,130 k is, of course, the secret key that's up here. 101 00:05:29,130 --> 00:05:31,130 And the e is the encryption function.
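A rough back-of-the-envelope sketch of the key-size point made above (64-bit versus 128-bit keys). The guess rate of 10^12 keys per second is an assumed figure chosen only for illustration; it does not come from the lecture.

```python
# Assumed attacker speed (hypothetical): 10**12 key guesses per second.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_enumerate(key_bits: int) -> float:
    """Worst-case time, in years, to try every key of the given bit length."""
    return 2**key_bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR


years_64 = years_to_enumerate(64)
years_128 = years_to_enumerate(128)

print(f"64-bit keys:  about {years_64:.2f} years")
print(f"128-bit keys: about {years_128:.2e} years")
```

Under that assumption a 64-bit key space falls in well under a year, while every extra key bit doubles the work, so 128 bits is 2^64 times more expensive to enumerate.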
102 00:05:37,680 --> 00:05:40,680 Now, in symmetric key cryptography, 103 00:05:40,680 --> 00:05:44,450 basically the requirement is that you 104 00:05:44,450 --> 00:05:49,390 have to be able to get back the plain text from the cipher 105 00:05:49,390 --> 00:05:54,690 text given a public decryption function and knowledge 106 00:05:54,690 --> 00:05:55,930 of the secret key. 107 00:05:55,930 --> 00:05:58,110 And this should be a straightforward operation, 108 00:05:58,110 --> 00:05:58,780 right? 109 00:05:58,780 --> 00:06:04,640 So by the way, e of k m, this is a polytime computation. 110 00:06:04,640 --> 00:06:06,410 It's not constant time. 111 00:06:06,410 --> 00:06:08,140 It's typically linear time. 112 00:06:08,140 --> 00:06:11,980 And you don't really want it to be quadratic time even, 113 00:06:11,980 --> 00:06:15,340 because you want to do this very fast, streaming. 114 00:06:15,340 --> 00:06:18,000 You need to send streams of messages, 115 00:06:18,000 --> 00:06:21,180 many gigabytes potentially, that are all encrypted. 116 00:06:21,180 --> 00:06:22,560 And this is actually what happens 117 00:06:22,560 --> 00:06:24,700 when you get stuff from your satellite 118 00:06:24,700 --> 00:06:26,380 and you're downloading movies. 119 00:06:26,380 --> 00:06:28,350 That's exactly what happens. 120 00:06:28,350 --> 00:06:31,670 It's symmetric key encryption of a lot of data. 121 00:06:31,670 --> 00:06:36,554 So backwards would be d of k c. 122 00:06:36,554 --> 00:06:38,720 And the only difference here is that everything else 123 00:06:38,720 --> 00:06:39,960 stays the same. 124 00:06:39,960 --> 00:06:41,440 This is our decryption function. 125 00:06:44,980 --> 00:06:46,690 All right. 126 00:06:46,690 --> 00:06:50,630 So that's what symmetric encryption is. 127 00:06:50,630 --> 00:06:54,100 And there's a requirement of reversibility here. 128 00:06:54,100 --> 00:06:57,590 So it's a lot different from one way hashes 129 00:06:57,590 --> 00:06:59,340 that we talked about last time.
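Written out, the relations described above look roughly as follows. The subscript notation is an assumption about what was on the board; the transcript only gives the spoken forms "e of k m" and "dk c".

```latex
c = e_k(m), \qquad m = d_k(c), \qquad d_k\bigl(e_k(m)\bigr) = m \quad \text{for every message } m .
```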
130 00:06:59,340 --> 00:07:06,800 Because here, while you want going from c to m to be hard, 131 00:07:06,800 --> 00:07:11,990 it's only hard if the adversary doesn't know k. 132 00:07:11,990 --> 00:07:14,490 If anybody knows k, it should be easy. 133 00:07:14,490 --> 00:07:18,030 In fact, e and d are going to be 134 00:07:18,030 --> 00:07:21,770 virtually identical in terms of complexity 135 00:07:21,770 --> 00:07:23,850 and sometimes implementation. 136 00:07:23,850 --> 00:07:27,300 It's just you run it in reverse, and you get back 137 00:07:27,300 --> 00:07:29,040 what you encrypted. 138 00:07:29,040 --> 00:07:30,960 That's the way you want to think about it. 139 00:07:30,960 --> 00:07:37,340 So really what happens is that you need reversible operations 140 00:07:37,340 --> 00:07:45,030 in order to build e or d, which are the encryption 141 00:07:45,030 --> 00:07:46,840 and the decryption functions. 142 00:07:46,840 --> 00:07:50,440 So permutation, for example, is reversible. 143 00:07:50,440 --> 00:07:52,520 You can always take something and permute it, 144 00:07:52,520 --> 00:07:54,660 and you can go backwards. 145 00:07:54,660 --> 00:07:57,390 So you could do something like that. 146 00:07:57,390 --> 00:08:01,530 And the permutation, the reverse, looks like that. 147 00:08:01,530 --> 00:08:03,800 What is this supposed to be? 148 00:08:03,800 --> 00:08:07,350 It's the fact that I have 3 bits here. 149 00:08:07,350 --> 00:08:12,300 And they turn into 3 bits over there. 150 00:08:12,300 --> 00:08:16,265 And obviously, if I reverse the permutation 151 00:08:16,265 --> 00:08:19,220 and if I just had a simple thing where it was abc 152 00:08:19,220 --> 00:08:21,190 and I'm going to convert it to cba, 153 00:08:21,190 --> 00:08:23,830 then cba can go back to abc through 154 00:08:23,830 --> 00:08:25,060 the reverse permutation. 155 00:08:25,060 --> 00:08:27,660 So clearly, this is a reversible operation.
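A minimal sketch of the reversible-permutation idea, using the abc to cba example from the lecture. The helper names `permute` and `invert` are made up for this illustration.

```python
def permute(items, perm):
    """Output position i receives the input element at index perm[i]."""
    return [items[p] for p in perm]


def invert(perm):
    """Build the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for i, p in enumerate(perm):
        inverse[p] = i
    return inverse


# The lecture's example: abc becomes cba, and the reverse permutation restores it.
reverse_order = [2, 1, 0]
scrambled = permute(list("abc"), reverse_order)
restored = permute(scrambled, invert(reverse_order))

# A rotation, to show the inverse is not always the permutation itself.
rotation = [1, 2, 0]
rotated = permute(list("abc"), rotation)
unrotated = permute(rotated, invert(rotation))

print(scrambled, restored, rotated, unrotated)
```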
156 00:08:27,660 --> 00:08:30,330 But there are other operations that are reversible as well. 157 00:08:30,330 --> 00:08:35,840 Plus can be reversed by a negation. 158 00:08:35,840 --> 00:08:39,580 And exclusive OR is simply exclusive OR. 159 00:08:39,580 --> 00:08:45,905 Because if you do A exclusive OR B, then you get C. 160 00:08:45,905 --> 00:08:50,590 And imagine if you did B again to that, you get A back. 161 00:08:50,590 --> 00:08:52,230 That's what I mean by reversal. 162 00:08:52,230 --> 00:08:56,240 Because you have, essentially, A exclusive OR A cancels out. 163 00:08:56,240 --> 00:09:00,780 And so it's something like a exclusive OR B exclusive OR B 164 00:09:00,780 --> 00:09:04,330 again would give you A. 165 00:09:04,330 --> 00:09:09,110 So if you go look at AES, for example, which is the Advanced 166 00:09:09,110 --> 00:09:12,430 Encryption Standard, is really only about four lines of code, 167 00:09:12,430 --> 00:09:14,380 maybe it's eight lines of code. 168 00:09:14,380 --> 00:09:18,520 But this is a well used, it has been around 169 00:09:18,520 --> 00:09:22,670 for a while, symmetric key cipher that 170 00:09:22,670 --> 00:09:27,020 runs in 128-bit mode as well as 256-bit mode. 171 00:09:27,020 --> 00:09:30,410 And it's, like I said, a few lines of code. 172 00:09:30,410 --> 00:09:34,370 And you'll see operations like this, permutations. 173 00:09:34,370 --> 00:09:37,720 And you'll see hat symbols if you see a C program, which 174 00:09:37,720 --> 00:09:39,140 is exclusive OR. 175 00:09:39,140 --> 00:09:42,260 And you can see the encryption and the decryption 176 00:09:42,260 --> 00:09:43,840 are identical in implementation. 177 00:09:43,840 --> 00:09:46,500 You're just going to run it one way, and you run it again. 178 00:09:46,500 --> 00:09:49,040 And you get back the result. Because these permutations 179 00:09:49,040 --> 00:09:50,060 are reversible. 180 00:09:50,060 --> 00:09:52,080 And the hats are reversible. 
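A small sketch of why exclusive OR and addition are reversible. The `xor_bytes` function is a toy written for this note; it is not AES and should not be used to protect anything. Its only purpose is to show that running the same XOR step twice with the same key returns the original data.

```python
a = 0b10110100
b = 0b01101001          # plays the role of the key

c = a ^ b               # "encrypt"
recovered = c ^ b       # XOR with b again: b ^ b cancels, leaving a

# Addition modulo 256 is reversed by subtraction modulo 256.
s = (a + b) % 256
recovered_from_sum = (s - b) % 256


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy illustration only: XOR each data byte with a repeating key."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))


message = b"attack at dawn"
key = b"\x13\x37\xc0\xde"
ciphertext = xor_bytes(message, key)
plaintext = xor_bytes(ciphertext, key)   # same function, run a second time

print(recovered == a, recovered_from_sum == a, plaintext)
```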
181 00:09:52,080 --> 00:09:54,880 And they're all signed integers, so you're just 182 00:09:54,880 --> 00:09:55,940 sort of adding them up. 183 00:09:55,940 --> 00:09:58,220 And it's just complement arithmetic, 184 00:09:58,220 --> 00:10:00,140 so that looks the same as well. 185 00:10:00,140 --> 00:10:02,440 So I encourage you to go take a look at AES. 186 00:10:02,440 --> 00:10:04,500 I'm not going to spend more time on it. 187 00:10:04,500 --> 00:10:09,390 The key idea here is of symmetry and reversibility. 188 00:10:09,390 --> 00:10:10,944 We're going to move away from that. 189 00:10:10,944 --> 00:10:12,360 Clearly, we didn't talk about that 190 00:10:12,360 --> 00:10:14,720 when we talked about one way hash functions, et cetera. 191 00:10:14,720 --> 00:10:18,500 That was a different situation where we had no secrecy. 192 00:10:18,500 --> 00:10:21,770 But here, we wanted symmetry. 193 00:10:21,770 --> 00:10:23,560 The big question that you would ask 194 00:10:23,560 --> 00:10:26,590 and you should ask yourself when you see this, you say, 195 00:10:26,590 --> 00:10:27,470 OK great. 196 00:10:27,470 --> 00:10:30,810 I can build symmetric key encryption algorithms. 197 00:10:30,810 --> 00:10:32,790 They're actually, I would say, somewhat easier 198 00:10:32,790 --> 00:10:36,020 to do than to build hash functions, which 199 00:10:36,020 --> 00:10:39,560 have a whole lot of more interesting properties 200 00:10:39,560 --> 00:10:43,260 and hard-to-obtain properties like collision resistance. 201 00:10:43,260 --> 00:10:48,470 But the question really is how do Alice and Bob 202 00:10:48,470 --> 00:10:50,950 share the secret key, k? 203 00:10:50,950 --> 00:10:52,040 So you need that. 204 00:10:52,040 --> 00:10:54,770 You need this 128-bit number in order 205 00:10:54,770 --> 00:10:58,750 for there to be a channel, a secure channel, that Alice 206 00:10:58,750 --> 00:11:02,570 can communicate to Bob with and vice versa.
207 00:11:02,570 --> 00:11:05,320 And so now you could say Alice sends Bob a letter. 208 00:11:05,320 --> 00:11:09,000 But you know, Mal could intercept that letter. 209 00:11:09,000 --> 00:11:12,930 And even worse, what Mal wants to do is look at the letter, 210 00:11:12,930 --> 00:11:15,290 actually deliver the letter, which 211 00:11:15,290 --> 00:11:20,920 is the best thing for him, and Alice and Bob 212 00:11:20,920 --> 00:11:22,750 think that they have a secure channel. 213 00:11:22,750 --> 00:11:25,360 That's sort of best case scenario for Mal, right? 214 00:11:25,360 --> 00:11:29,320 So you could have Mal in the middle here. 215 00:11:29,320 --> 00:11:33,050 And you've got to worry about stuff like that. 216 00:11:33,050 --> 00:11:39,050 So key exchange, let's move on to talk about key exchange. 217 00:11:46,445 --> 00:12:00,060 And how does the secret key k get shared? 218 00:12:00,060 --> 00:12:03,980 You can't put this out on a website, right? 219 00:12:03,980 --> 00:12:06,750 So it has to be the case, sharing 220 00:12:06,750 --> 00:12:12,200 is something that has to be secure in the sense 221 00:12:12,200 --> 00:12:14,730 that there can't be any eavesdroppers. 222 00:12:14,730 --> 00:12:20,195 So here's my favorite example of a puzzle that most of you 223 00:12:20,195 --> 00:12:23,170 have probably heard about. 224 00:12:23,170 --> 00:12:26,800 But those of you who haven't, it's really pretty cool. 225 00:12:26,800 --> 00:12:32,700 And those of you who have, it's still pretty cool and worth 226 00:12:32,700 --> 00:12:33,790 recalling. 227 00:12:33,790 --> 00:12:36,300 And there's actually something that you probably 228 00:12:36,300 --> 00:12:38,250 haven't thought about very much even if you've 229 00:12:38,250 --> 00:12:39,450 heard about this puzzle. 230 00:12:39,450 --> 00:12:42,480 That's the mathematical assumption made 231 00:12:42,480 --> 00:12:44,080 in the solution of this puzzle. 232 00:12:44,080 --> 00:12:45,930 That will be interesting to you.
233 00:12:45,930 --> 00:12:47,980 But the puzzle is a pirate puzzle. 234 00:12:47,980 --> 00:12:50,050 So you've got Alice and Bob. 235 00:12:50,050 --> 00:12:51,900 Let's call it the Caribbean, because that's 236 00:12:51,900 --> 00:12:53,740 my favorite ocean. 237 00:12:53,740 --> 00:12:56,896 And Alice and Bob are in two different islands. 238 00:12:56,896 --> 00:12:59,020 And we all know there are pirates in the Caribbean, 239 00:12:59,020 --> 00:13:01,600 right? 240 00:13:01,600 --> 00:13:06,140 And so Alice and Bob want to communicate with each other. 241 00:13:06,140 --> 00:13:12,090 And what Alice has are a bunch of boxes and locks and keys. 242 00:13:14,930 --> 00:13:19,150 She's got the keys for her locks and nothing else 243 00:13:19,150 --> 00:13:22,980 and the same thing with Bob. 244 00:13:22,980 --> 00:13:30,460 So Bob has boxes, locks, keys for his locks 245 00:13:30,460 --> 00:13:32,360 that he can put on his boxes. 246 00:13:32,360 --> 00:13:36,690 And in this case, Alice wants to send a message to Bob. 247 00:13:36,690 --> 00:13:39,110 And Alice wants to exchange a key with Bob, 248 00:13:39,110 --> 00:13:43,770 so they can eventually communicate in a secure way 249 00:13:43,770 --> 00:13:47,050 regardless of the pirates or whoever else is listening. 250 00:13:47,050 --> 00:13:49,852 So the problem here, of course, is that if you just 251 00:13:49,852 --> 00:13:52,310 send a message on a boat-- so the pirates are kind of nice, 252 00:13:52,310 --> 00:13:58,260 in a way, that they will deliver messages. 253 00:13:58,260 --> 00:14:01,730 But they are curious, right? 254 00:14:01,730 --> 00:14:03,000 So they're very curious. 255 00:14:03,000 --> 00:14:04,654 And they will open up boxes. 256 00:14:04,654 --> 00:14:06,820 And so that's supposed to be a boat, by the way, not 257 00:14:06,820 --> 00:14:10,880 a box, a little mast there. 258 00:14:10,880 --> 00:14:14,150 So I'm not a sailor. 
259 00:14:14,150 --> 00:14:18,830 But you have these boxes that are going to get delivered. 260 00:14:18,830 --> 00:14:20,740 And the deal is this. 261 00:14:20,740 --> 00:14:23,376 If there's an open box, the pirates will open the box. 262 00:14:23,376 --> 00:14:25,584 And if there's any message in it, they might read it, 263 00:14:25,584 --> 00:14:26,760 they might throw it away. 264 00:14:26,760 --> 00:14:32,550 So clearly, a secret key can't be exchanged 265 00:14:32,550 --> 00:14:36,150 by just putting a box with a message in it 266 00:14:36,150 --> 00:14:38,350 if you don't lock it. 267 00:14:38,350 --> 00:14:43,220 A locked box they will not be able to open, they'll not touch, 268 00:14:43,220 --> 00:14:47,040 and they will deliver a locked box. 269 00:14:47,040 --> 00:14:51,390 But if they ever see a key, they'll keep it. 270 00:14:51,390 --> 00:14:53,435 If they ever see kind of a key on the boat, 271 00:14:53,435 --> 00:14:54,810 they'll just grab it and keep it. 272 00:14:54,810 --> 00:14:57,018 And then the next time around, they see a locked box, 273 00:14:57,018 --> 00:15:00,140 they'll stick the key in and try and open it. 274 00:15:00,140 --> 00:15:01,970 All right. 275 00:15:01,970 --> 00:15:06,340 So how do Alice and Bob securely exchange a secret 276 00:15:06,340 --> 00:15:09,600 where security is based on this notion of piracy, 277 00:15:09,600 --> 00:15:16,980 I guess, with the pirates having a certain amount of capability 278 00:15:16,980 --> 00:15:20,610 in terms of storing keys, and opening up the locks, 279 00:15:20,610 --> 00:15:23,460 but they will not touch a locked box? 280 00:15:23,460 --> 00:15:24,635 All right. 281 00:15:24,635 --> 00:15:25,510 So that's the puzzle. 282 00:15:25,510 --> 00:15:27,790 How many of you have heard of this puzzle before? 283 00:15:27,790 --> 00:15:30,090 So all of you keep quiet.
284 00:15:30,090 --> 00:15:34,350 Someone who hasn't heard of this puzzle before, 285 00:15:34,350 --> 00:15:39,770 think for a few seconds here and see if a solution occurs to you 286 00:15:39,770 --> 00:15:43,490 with respect to going back and forth-- so that's the hint, 287 00:15:43,490 --> 00:15:47,980 going back and forth-- and being able to securely exchange 288 00:15:47,980 --> 00:15:51,660 a message, which could be a 128-bit secret key written 289 00:15:51,660 --> 00:15:55,060 on a little note between Alice and Bob. 290 00:15:55,060 --> 00:15:55,560 Yeah? 291 00:15:55,560 --> 00:15:56,578 Back there. 292 00:15:56,578 --> 00:16:00,780 AUDIENCE: Are they allowed to pass locks such that-- 293 00:16:00,780 --> 00:16:02,530 SRINIVAS DEVADAS: So if you see it locked, 294 00:16:02,530 --> 00:16:04,510 they'll throw away the lock, too. 295 00:16:04,510 --> 00:16:08,370 If the lock is on the box, then sure, 296 00:16:08,370 --> 00:16:10,340 they will deliver that box. 297 00:16:10,340 --> 00:16:12,610 But if the key is outside of that box, 298 00:16:12,610 --> 00:16:13,960 then they'll keep the key. 299 00:16:21,990 --> 00:16:24,270 No-- this is a difficult puzzle. 300 00:16:24,270 --> 00:16:26,246 Yeah? 301 00:16:26,246 --> 00:16:29,107 AUDIENCE: Alice could lock the box and send it to Bob. 302 00:16:29,107 --> 00:16:30,065 SRINIVAS DEVADAS: Yeah. 303 00:16:30,065 --> 00:16:34,572 AUDIENCE: And then Bob could also lock the box. 304 00:16:34,572 --> 00:16:35,900 SRINIVAS DEVADAS: Yeah. 305 00:16:35,900 --> 00:16:37,090 And then? 306 00:16:37,090 --> 00:16:40,801 At that point-- no, you're on the right track. 307 00:16:40,801 --> 00:16:41,300 Keep going. 308 00:16:41,300 --> 00:16:43,740 Just push it a little more. 309 00:16:43,740 --> 00:16:46,492 Couple of more boat rides and we're done. 310 00:16:46,492 --> 00:16:48,456 Yeah. 
311 00:16:48,456 --> 00:16:52,501 AUDIENCE: And then if Bob sends-- 312 00:16:52,501 --> 00:16:54,750 SRINIVAS DEVADAS: So there's now two locks on the box. 313 00:16:54,750 --> 00:16:55,590 AUDIENCE: Yes. 314 00:16:55,590 --> 00:16:58,562 SRINIVAS DEVADAS: And Bob has two locks on the box. 315 00:16:58,562 --> 00:17:00,020 And so what's Bob going to do next? 316 00:17:00,020 --> 00:17:03,742 What's the logical thing for Bob to do next? 317 00:17:03,742 --> 00:17:05,089 AUDIENCE: Send it back to Alice. 318 00:17:05,089 --> 00:17:06,630 SRINIVAS DEVADAS: Send it back to Alice. 319 00:17:06,630 --> 00:17:10,300 And now the key's inside. 320 00:17:10,300 --> 00:17:14,170 Alice now sees two locks on the box. 321 00:17:14,170 --> 00:17:16,420 And one of the locks is Bob's. 322 00:17:16,420 --> 00:17:17,180 And the other is? 323 00:17:17,180 --> 00:17:18,069 AUDIENCE: Hers. 324 00:17:18,069 --> 00:17:18,690 SRINIVAS DEVADAS: Is hers. 325 00:17:18,690 --> 00:17:19,380 AUDIENCE: She can unlock it. 326 00:17:19,380 --> 00:17:21,046 SRINIVAS DEVADAS: She can unlock the box 327 00:17:21,046 --> 00:17:23,540 and send it back to Bob. 328 00:17:23,540 --> 00:17:27,650 And all of this time, the only thing that's gone in transit 329 00:17:27,650 --> 00:17:29,750 is a locked box. 330 00:17:29,750 --> 00:17:32,050 No keys have gone in transit, 331 00:17:32,050 --> 00:17:34,400 whether they're inside the box or wherever. 332 00:17:34,400 --> 00:17:38,720 But the only thing that's moved is a locked box. 333 00:17:38,720 --> 00:17:39,410 So that's good. 334 00:17:39,410 --> 00:17:40,550 You're exactly right. 335 00:17:40,550 --> 00:17:45,110 So that gets you a Frisbee. 336 00:17:45,110 --> 00:17:46,930 Whoops, sorry. 337 00:17:46,930 --> 00:17:48,710 We need a secure exchange there. 338 00:17:48,710 --> 00:17:50,730 Yeah, that's good. 339 00:17:50,730 --> 00:17:54,250 So that is exactly right.
340 00:17:54,250 --> 00:18:05,610 So just to recap, Alice locks box with KA, 341 00:18:05,610 --> 00:18:09,980 and that's the key for the lock, sends it to Bob. 342 00:18:09,980 --> 00:18:18,320 Bob locks box with KB, sends it to Alice. 343 00:18:18,320 --> 00:18:24,220 And Alice unlocks-- oh, I should've said 344 00:18:24,220 --> 00:18:27,100 that Alice puts message in box. 345 00:18:32,590 --> 00:18:36,630 And that message has the secret key inside of it. 346 00:18:36,630 --> 00:18:45,840 Alice unlocks KA and sends box to Bob. 347 00:18:45,840 --> 00:18:56,130 And then Bob unlocks KB and reads message. 348 00:18:56,130 --> 00:18:57,470 So that's good. 349 00:18:57,470 --> 00:18:59,870 That's all good. 350 00:18:59,870 --> 00:19:03,480 Let's look at it a little more deeply 351 00:19:03,480 --> 00:19:08,170 and think about it from a mathematical standpoint, not 352 00:19:08,170 --> 00:19:09,330 a physical standpoint. 353 00:19:09,330 --> 00:19:11,990 You could think about it from a physical standpoint as well. 354 00:19:11,990 --> 00:19:18,050 What is the relationship between this locking, this sequence, 355 00:19:18,050 --> 00:19:22,550 in this case a pair of locks, that we require here 356 00:19:22,550 --> 00:19:25,310 in order for this to physically make sense? 358 00:19:25,810 --> 00:19:27,435 I mean, there's different ways that you 359 00:19:27,435 --> 00:19:29,890 could add two locks to a box. 360 00:19:29,890 --> 00:19:32,120 There's many different ways. 361 00:19:32,120 --> 00:19:37,440 One way is to have a box that is locked if I lock it over here. 362 00:19:37,440 --> 00:19:39,630 And it doesn't open, because the lid doesn't open. 363 00:19:39,630 --> 00:19:41,310 I've got suitcases like that. 364 00:19:41,310 --> 00:19:43,549 And then there could be another spot here 365 00:19:43,549 --> 00:19:44,465 that locks it as well. 366 00:19:47,030 --> 00:19:49,090 So that could be one way.
367 00:19:49,090 --> 00:19:51,800 Well, what's another way of having two locks? 368 00:19:51,800 --> 00:19:52,300 Yeah 369 00:19:52,300 --> 00:19:53,680 AUDIENCE: Putting a box within a box. 370 00:19:53,680 --> 00:19:54,860 SRINIVAS DEVADAS: Yeah, putting a box within a box. 371 00:19:54,860 --> 00:19:56,090 That's great. 372 00:19:56,090 --> 00:20:02,270 So if you put a box within a box, then does this work? 373 00:20:02,270 --> 00:20:04,090 It doesn't work, right? 374 00:20:04,090 --> 00:20:06,350 So nested locks actually don't work. 375 00:20:06,350 --> 00:20:17,680 Because what happened here is that KA was put first, then KB. 376 00:20:17,680 --> 00:20:22,730 And if, in fact, you had KA, and then KB out there, 377 00:20:22,730 --> 00:20:35,729 then you can't remove KA without removing KB. 378 00:20:35,729 --> 00:20:38,020 And this would be the case where you have nested locks. 379 00:20:42,180 --> 00:20:44,860 So the mathematical operation that we require 380 00:20:44,860 --> 00:20:48,890 is commutativity between the locks. 381 00:20:48,890 --> 00:20:51,010 And the locks need to commute. 382 00:20:51,010 --> 00:20:53,900 I want to put KA in first. 383 00:20:53,900 --> 00:20:56,320 Like I described, the physical realization 384 00:20:56,320 --> 00:20:59,440 could simply be a suitcase with two different positions 385 00:20:59,440 --> 00:21:00,250 for the two locks. 386 00:21:00,250 --> 00:21:03,760 And any one of those positions locks the suitcase. 387 00:21:03,760 --> 00:21:05,820 So you put KA here, KB here. 388 00:21:05,820 --> 00:21:08,400 And then you can take KA out, right? 389 00:21:08,400 --> 00:21:10,860 And it's still locked, because you have KB. 390 00:21:10,860 --> 00:21:20,130 So this commutativity between the locks and essentially 391 00:21:20,130 --> 00:21:24,200 the keys is what's required. 
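A toy model of the point about commuting locks, assuming two made-up classes: `Suitcase` (independent lock positions, so locks come off in any order) and `NestedBoxes` (a box inside a box, so only the outermost lock can come off). The `three_pass` function replays the recap: Alice locks with KA, Bob locks with KB, Alice removes KA, Bob removes KB.

```python
class Suitcase:
    """Two independent lock positions: the locks commute."""

    def __init__(self):
        self.locks = set()

    def lock(self, key):
        self.locks.add(key)

    def unlock(self, key):
        removed = key in self.locks
        self.locks.discard(key)
        return removed

    def is_open(self):
        return not self.locks


class NestedBoxes:
    """A box within a box: only the most recently added lock can be removed."""

    def __init__(self):
        self.locks = []

    def lock(self, key):
        self.locks.append(key)

    def unlock(self, key):
        if self.locks and self.locks[-1] == key:
            self.locks.pop()
            return True
        return False

    def is_open(self):
        return not self.locks


def three_pass(box):
    box.lock("KA")      # Alice locks the box and sends it to Bob
    box.lock("KB")      # Bob adds his lock and sends it back
    box.unlock("KA")    # Alice tries to remove her lock, sends it to Bob
    box.unlock("KB")    # Bob removes his lock
    return box.is_open()


suitcase_works = three_pass(Suitcase())
nested_works = three_pass(NestedBoxes())
print(suitcase_works, nested_works)
```

With nested boxes Alice's unlock step fails because KB sits outside KA, so the box still carries KA at the end and Bob cannot read the message.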
392 00:21:27,490 --> 00:21:30,930 And so now, let's move away from pirates 393 00:21:30,930 --> 00:21:33,930 and go into the cryptography domain, 394 00:21:33,930 --> 00:21:36,670 pure mathematical domain, and see 395 00:21:36,670 --> 00:21:38,740 how this turns into what's called 396 00:21:38,740 --> 00:21:43,130 the Diffie-Hellman key exchange, which is a key exchange 397 00:21:43,130 --> 00:21:48,720 algorithm or a protocol that under certain conditions 398 00:21:48,720 --> 00:21:50,590 gives you exactly what you see here. 399 00:21:50,590 --> 00:21:52,420 It gives you a secure key exchange. 400 00:21:52,420 --> 00:21:55,410 And there's one issue associated with it, 401 00:21:55,410 --> 00:21:58,020 and that'll be kind of clear once we write it out here, 402 00:21:58,020 --> 00:21:59,720 that we'll get back to. 403 00:21:59,720 --> 00:22:02,430 But Diffie-Hellman key exchange assumes that you 404 00:22:02,430 --> 00:22:04,420 have commutative locks. 405 00:22:04,420 --> 00:22:09,930 And this is how it works. 406 00:22:09,930 --> 00:22:16,040 You'll see what commutes when I give you the equations 407 00:22:16,040 --> 00:22:18,760 associated with Diffie-Hellman key exchange. 408 00:22:18,760 --> 00:22:25,100 And this was also described in the '70s. 409 00:22:25,100 --> 00:22:29,630 So what we're going to do is we're going to work in a finite 410 00:22:29,630 --> 00:22:33,120 field Fp*. 411 00:22:33,120 --> 00:22:36,450 And the finite field means that we're 412 00:22:36,450 --> 00:22:41,820 going to be doing mod p, where p is prime. 413 00:22:41,820 --> 00:22:45,060 And the star means we're going to be looking 414 00:22:45,060 --> 00:22:48,930 at invertible elements only. 415 00:22:48,930 --> 00:22:52,449 So we drop the noninvertible elements. 416 00:22:52,449 --> 00:22:54,240 These things aren't particularly important. 417 00:22:57,470 --> 00:22:59,560 And so we'll drop 0. 418 00:22:59,560 --> 00:23:03,480 And we'll be looking at 1, 2, to p minus 1.
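A quick sketch of what "invertible elements mod p" means, using the small prime p = 23 as an assumed example value. It relies on Python's built-in three-argument `pow`, which accepts an exponent of -1 for modular inverses from Python 3.8 onward.

```python
p = 23  # assumed toy prime, far too small for real use

# Every element of 1..p-1 has a multiplicative inverse mod p.
elements = list(range(1, p))
inverses = {x: pow(x, -1, p) for x in elements}
all_invertible = all((x * inverses[x]) % p == 1 for x in elements)

# Zero is the one element that gets dropped: it has no inverse.
try:
    pow(0, -1, p)
    zero_invertible = True
except ValueError:
    zero_invertible = False

print(all_invertible, zero_invertible)
```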
419 00:23:03,480 --> 00:23:05,330 So all the numbers that you see are going 420 00:23:05,330 --> 00:23:07,690 to be 1 through p minus 1. 421 00:23:07,690 --> 00:23:14,440 Now, what is the analog of this protocol 422 00:23:14,440 --> 00:23:20,120 that Alice and Bob, in our pirate puzzle, 423 00:23:20,120 --> 00:23:23,070 operated on or ran? 424 00:23:23,070 --> 00:23:27,070 What is the analog in the mathematical domain 425 00:23:27,070 --> 00:23:29,060 or in the finite field domain? 426 00:23:29,060 --> 00:23:31,100 Well, here's what happens. 427 00:23:31,100 --> 00:23:39,691 Alice is going to select a random a. 428 00:23:43,128 --> 00:23:45,450 And we're going to assume that g is public. 429 00:23:45,450 --> 00:23:51,880 So she just shouts that out to-- Alice 430 00:23:51,880 --> 00:23:54,950 can see Bob from her house. 431 00:23:54,950 --> 00:23:58,069 So she shouts out g and shouts out p. 432 00:23:58,069 --> 00:23:58,860 They're all public. 433 00:23:58,860 --> 00:24:02,680 She doesn't care if the pirates can hear this. 434 00:24:02,680 --> 00:24:06,440 And Alice is going to select a, which is random, 435 00:24:06,440 --> 00:24:11,810 and compute g raised to a. 436 00:24:11,810 --> 00:24:15,470 And this is in the finite field, capital G. 437 00:24:15,470 --> 00:24:17,310 So you're going to do your mods. 438 00:24:17,310 --> 00:24:23,550 And she's just going to send g raised to a over to Bob. 439 00:24:23,550 --> 00:24:37,540 Now, what Bob does is select b, compute g raised to b, 440 00:24:37,540 --> 00:24:40,490 and send that over. 441 00:24:40,490 --> 00:24:42,950 So g raised to a is being sent over. g 442 00:24:42,950 --> 00:24:45,600 raised to b is being sent over, sent over to Alice. 443 00:24:45,600 --> 00:24:49,160 So Alice gets g raised to b. 444 00:24:49,160 --> 00:24:53,880 And the key realization here is that Alice 445 00:24:53,880 --> 00:25:02,880 can compute g raised to b raised to a, because she knows a.
446 00:25:02,880 --> 00:25:05,030 This is all going to be mod p. 447 00:25:05,030 --> 00:25:08,200 And we're going to call that K. And thanks to the fact 448 00:25:08,200 --> 00:25:14,210 that exponentiation commutes, Bob 449 00:25:14,210 --> 00:25:20,810 computes g raised to a raised to b, because Bob 450 00:25:20,810 --> 00:25:25,727 knows b, which is also exactly K-- I should say mod 451 00:25:25,727 --> 00:25:28,827 p over here. 452 00:25:28,827 --> 00:25:29,660 Everything is mod p. 453 00:25:33,430 --> 00:25:34,920 So that's it. 454 00:25:34,920 --> 00:25:38,040 That's Diffie-Hellman key exchange. 455 00:25:38,040 --> 00:25:41,690 You have now created a shared secret 456 00:25:41,690 --> 00:25:46,110 based on the commutativity of exponentiation. 457 00:25:46,110 --> 00:25:49,840 And the part that's still missing here 458 00:25:49,840 --> 00:25:55,032 with respect to the analogy is the fact that g raised to a 459 00:25:55,032 --> 00:25:57,760 is essentially the locked box. 460 00:25:57,760 --> 00:26:01,690 So what g raised to a is hiding is a. 461 00:26:01,690 --> 00:26:05,160 Because you want a to be hidden here and the same thing with g 462 00:26:05,160 --> 00:26:05,660 raised to b. 463 00:26:05,660 --> 00:26:07,080 It needs to hide b. 464 00:26:07,080 --> 00:26:11,700 So the problem that the pirates had was they 465 00:26:11,700 --> 00:26:14,420 couldn't open up the box. 466 00:26:14,420 --> 00:26:19,170 The problem that the adversary, let's call him Mal, has 467 00:26:19,170 --> 00:26:26,630 is that he has to invert g raised to a in order to discover a. 468 00:26:26,630 --> 00:26:30,870 And in this particular finite field, 469 00:26:30,870 --> 00:26:34,990 and many such finite fields, you can think of this 470 00:26:34,990 --> 00:26:37,370 as being what's called a discrete logarithm problem. 471 00:26:37,370 --> 00:26:38,870 So you all know how to compute 472 00:26:38,870 --> 00:26:41,099 logarithms in the continuous domain.
473 00:26:41,099 --> 00:26:41,890 And there's tables. 474 00:26:41,890 --> 00:26:43,720 And it's pretty easy to do. 475 00:26:43,720 --> 00:26:45,740 But this is what's called a discrete logarithm 476 00:26:45,740 --> 00:26:48,400 problem, because we're in a finite field. 477 00:26:48,400 --> 00:26:50,980 And we obviously want integers. a is an integer. 478 00:26:50,980 --> 00:26:54,800 So we need to discover what that integer is. 479 00:26:54,800 --> 00:26:56,800 Because we're doing the mod p, et cetera, 480 00:26:56,800 --> 00:26:59,490 and p is typically a large number, 481 00:26:59,490 --> 00:27:02,650 it's actually a hard problem computationally 482 00:27:02,650 --> 00:27:05,520 to do a discrete log. 483 00:27:05,520 --> 00:27:10,820 So when you see g raised to a, you know g, you know p, 484 00:27:10,820 --> 00:27:15,710 but trying to figure out what produced that g raised to a 485 00:27:15,710 --> 00:27:16,920 is a difficult problem. 486 00:27:16,920 --> 00:27:19,010 And people have looked at this for 30, 40 years 487 00:27:19,010 --> 00:27:21,929 and there are no great algorithms to solve 488 00:27:21,929 --> 00:27:23,720 this problem, certainly not anything that 489 00:27:23,720 --> 00:27:25,440 runs in polynomial time. 490 00:27:25,440 --> 00:27:27,550 They're all kind of subexponential. 491 00:27:27,550 --> 00:27:32,070 And you can make the numbers large enough such that g raised to a 492 00:27:32,070 --> 00:27:37,270 is secure in the sense that it doesn't give away what a is. 493 00:27:37,270 --> 00:27:43,260 So that's the insight here that Diffie and Hellman had, 494 00:27:43,260 --> 00:27:55,020 which comes down to: the discrete log problem is hard. 495 00:27:55,020 --> 00:27:59,910 And what this simply means is given g raised to a, 496 00:27:59,910 --> 00:28:03,160 the discrete log problem is compute a. 497 00:28:03,160 --> 00:28:05,950 And the same thing for b, of course.
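The exchange and the discrete log problem can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a secure implementation: p = 23 and g = 5 are assumed values (5 happens to generate all of F23*), the helper names dh_keypair and brute_force_dlog are hypothetical, and real systems use primes that are thousands of bits long.

```python
import secrets

# Assumed toy public parameters: a small prime p and a public base g.
p = 23
g = 5  # 5 generates every element of F_23*; this choice is an assumption of the sketch


def dh_keypair(g, p):
    """Pick a random secret exponent and return (secret, g^secret mod p)."""
    secret = secrets.randbelow(p - 2) + 1  # random integer in 1 .. p - 2
    return secret, pow(g, secret, p)


# Alice picks a and sends g^a; Bob picks b and sends g^b.
a, g_a = dh_keypair(g, p)
b, g_b = dh_keypair(g, p)

# Each side raises what it received to its own secret exponent.
k_alice = pow(g_b, a, p)  # (g^b)^a mod p
k_bob = pow(g_a, b, p)    # (g^a)^b mod p
assert k_alice == k_bob   # exponentiation commutes, so both hold the same K


def brute_force_dlog(g, target, p):
    """Discrete log by trying every exponent; only feasible because p is tiny."""
    value = 1
    for x in range(p - 1):
        if value == target:
            return x
        value = (value * g) % p
    return None


# With a 5-bit prime the adversary recovers a from g^a immediately.
assert brute_force_dlog(g, g_a, p) == a
```

The loop in brute_force_dlog takes up to p minus 1 steps, which is why the protocol only hides a when p is made very large.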
498 00:28:05,950 --> 00:28:09,300 There's one other thing that you want 499 00:28:09,300 --> 00:28:11,400 to say to be precise to sort of just 500 00:28:11,400 --> 00:28:15,490 cover the spectrum with respect to how this could break. 501 00:28:15,490 --> 00:28:19,340 And that is what's called the Diffie-Hellman problem, 502 00:28:19,340 --> 00:28:23,470 for want of a better name and since these are the folks who 503 00:28:23,470 --> 00:28:25,770 first came up with it. 504 00:28:25,770 --> 00:28:27,960 And the Diffie-Hellman problem is simply 505 00:28:27,960 --> 00:28:33,680 that given g raised to a and g raised to b, which is what the pirates see 506 00:28:33,680 --> 00:28:37,610 and what the adversary Mal sees, we should not 507 00:28:37,610 --> 00:28:43,400 be able to compute g raised to a times b, which 508 00:28:43,400 --> 00:28:44,790 is exactly what we get here. 509 00:28:44,790 --> 00:28:47,600 g raised to a, raised to b, is g raised to a times b. 510 00:28:47,600 --> 00:28:49,680 So given those two things, if there's 511 00:28:49,680 --> 00:28:52,990 a way of computing g raised to a times b, 512 00:28:52,990 --> 00:28:55,800 even though you potentially haven't discovered 513 00:28:55,800 --> 00:28:59,340 a and b precisely, g raised to a times b 514 00:28:59,340 --> 00:29:02,060 is just the secret key that Alice and Bob exchanged. 515 00:29:02,060 --> 00:29:03,620 So you're hosed. 516 00:29:03,620 --> 00:29:07,240 Alice and Bob are hosed if Mal can do this. 517 00:29:07,240 --> 00:29:09,410 So there's two things going on here. 518 00:29:09,410 --> 00:29:12,540 You want the Diffie-Hellman problem to be hard. 519 00:29:12,540 --> 00:29:16,500 And you want the discrete log problem to be hard. 520 00:29:16,500 --> 00:29:20,500 OK. So generally, this is how cryptography works. 521 00:29:20,500 --> 00:29:22,880 You set up protocols. 522 00:29:22,880 --> 00:29:26,330 And there's some information that's bound to be exposed.
523 00:29:26,330 --> 00:29:30,280 You want this information to be hard to reverse 524 00:29:30,280 --> 00:29:32,190 to get the crucial information. 525 00:29:32,190 --> 00:29:34,890 That requires a computational hardness assumption 526 00:29:34,890 --> 00:29:36,760 like the two that we've made here. 527 00:29:36,760 --> 00:29:39,100 And then you're off and running. 528 00:29:39,100 --> 00:29:42,060 Your system will break if your computational hardness 529 00:29:42,060 --> 00:29:44,010 assumptions are incorrect. 530 00:29:44,010 --> 00:29:46,500 And they may be correct for a particular time, 531 00:29:46,500 --> 00:29:49,910 for example, the 1970s for particular parameters. 532 00:29:49,910 --> 00:29:53,120 But they may end up being incorrect assumptions, 533 00:29:53,120 --> 00:29:57,180 at least for those parameters, at a later point of time, 534 00:29:57,180 --> 00:30:00,160 simply because computers got faster. 535 00:30:00,160 --> 00:30:03,110 It's like 2 raised to 40 was this huge number when 536 00:30:03,110 --> 00:30:04,320 I was your age. 537 00:30:04,320 --> 00:30:05,840 Now, it's like nothing. 538 00:30:05,840 --> 00:30:08,350 So that's basically part of the game. 539 00:30:08,350 --> 00:30:12,670 But the good systems are those that scale where you increase 540 00:30:12,670 --> 00:30:16,470 the parameter size and the system and the protocol 541 00:30:16,470 --> 00:30:17,950 stay the same. 542 00:30:17,950 --> 00:30:22,510 And so you just increase p, for example, in this case. 543 00:30:22,510 --> 00:30:24,300 And the discrete log problem is still 544 00:30:24,300 --> 00:30:26,400 hard for modern computers. 545 00:30:26,400 --> 00:30:29,600 So those are the good protocols and the good cryptosystems that 546 00:30:29,600 --> 00:30:31,950 stand the test of time, not necessarily 547 00:30:31,950 --> 00:30:36,850 the ones that have exactly particular parameters. 548 00:30:36,850 --> 00:30:37,610 That's hard to do.
549 00:30:37,610 --> 00:30:40,650 Because as I said, Moore's law and computers 550 00:30:40,650 --> 00:30:43,740 have been getting really exponentially faster. 551 00:30:43,740 --> 00:30:45,830 All right. 552 00:30:45,830 --> 00:30:51,350 So there's one main problem with the Diffie-Hellman protocol 553 00:30:51,350 --> 00:30:54,020 and the solution to our pirate puzzle. 554 00:30:54,020 --> 00:30:57,167 And so can someone tell me-- and it could be just 555 00:30:57,167 --> 00:30:59,250 for the sake of the Diffie-Hellman problem or just 556 00:30:59,250 --> 00:31:01,130 in the context of the Diffie-Hellman problem, 557 00:31:01,130 --> 00:31:05,290 but also in the context of the pirate puzzle-- what assumption 558 00:31:05,290 --> 00:31:08,690 are we making here that's as yet unstated 559 00:31:08,690 --> 00:31:14,560 with respect to this secure key exchange being actually secure? 560 00:31:14,560 --> 00:31:16,080 Someone. 561 00:31:16,080 --> 00:31:17,556 Yeah. 562 00:31:17,556 --> 00:31:21,500 AUDIENCE: If I were to intercept a message from [INAUDIBLE] 563 00:31:21,500 --> 00:31:22,990 something of their own back? 564 00:31:22,990 --> 00:31:25,060 SRINIVAS DEVADAS: What does that mean? 565 00:31:25,060 --> 00:31:26,820 The pirates see a locked box. 566 00:31:26,820 --> 00:31:29,650 So the first step of the protocol, Alice 567 00:31:29,650 --> 00:31:32,470 is sending a message inside a locked box 568 00:31:32,470 --> 00:31:33,740 with a single lock in it. 569 00:31:36,520 --> 00:31:38,120 You're kind of on the right track. 570 00:31:38,120 --> 00:31:43,340 And what could the pirates do in order to break this protocol? 571 00:31:43,340 --> 00:31:45,250 We've kind of made an assumption here. 572 00:31:45,250 --> 00:31:47,955 And I might have said it explicitly. 573 00:31:47,955 --> 00:31:48,580 Yeah, go ahead. 574 00:31:48,580 --> 00:31:50,350 AUDIENCE: [INAUDIBLE] throw the box away. 
575 00:31:50,350 --> 00:31:51,230 SRINIVAS DEVADAS: They could just throw the box away. 576 00:31:51,230 --> 00:31:53,396 But that doesn't break the security of the protocol. 577 00:31:53,396 --> 00:31:56,565 That breaks the functionality of the protocol, right? 578 00:31:56,565 --> 00:31:57,190 Yeah, go ahead. 579 00:31:57,190 --> 00:31:58,770 AUDIENCE: Put their own lock on [INAUDIBLE]. 580 00:31:58,770 --> 00:32:00,561 SRINIVAS DEVADAS: Ah, that's exactly right, 581 00:32:00,561 --> 00:32:01,680 put their own lock on it. 582 00:32:01,680 --> 00:32:05,051 You know, if they had locks-- so these pirates don't have locks, 583 00:32:05,051 --> 00:32:05,550 right? 584 00:32:05,550 --> 00:32:06,758 We're making that assumption. 585 00:32:06,758 --> 00:32:09,330 If they had their own lock with a key, 586 00:32:09,330 --> 00:32:10,940 if they just had a lock that locks 587 00:32:10,940 --> 00:32:12,440 and they didn't have the key for it, 588 00:32:12,440 --> 00:32:13,898 they wouldn't be able to break 589 00:32:13,898 --> 00:32:15,140 the security of the protocol. 590 00:32:15,140 --> 00:32:18,910 But if they had a lock and a key for that lock, 591 00:32:18,910 --> 00:32:22,350 then they can pretend to have delivered this to Bob. 592 00:32:22,350 --> 00:32:25,040 And there's no authenticity here, in that Alice 593 00:32:25,040 --> 00:32:27,860 doesn't quite know whether she's communicating with Bob or not. 594 00:32:27,860 --> 00:32:31,640 She's at the mercy of the pirates to deliver these boxes. 595 00:32:31,640 --> 00:32:37,480 So if the pirates had a lock and the key for the lock, 596 00:32:37,480 --> 00:32:41,760 then we're in a situation where Alice may have exchanged 597 00:32:41,760 --> 00:32:44,370 the key with the pirates. 598 00:32:44,370 --> 00:32:46,520 And she thinks she's exchanged it with Bob. 599 00:32:46,520 --> 00:32:50,100 And she's, in fact, communicating with the pirates.
600 00:32:50,100 --> 00:33:01,340 So there's a man in the middle attack, 601 00:33:01,340 --> 00:33:10,190 which corresponds to the pirates having their own locks 602 00:33:10,190 --> 00:33:11,240 and keys. 603 00:33:11,240 --> 00:33:15,500 And it's even more trivial in the case of our picture here. 604 00:33:15,500 --> 00:33:18,460 Because assuming the pirates know mathematics, 605 00:33:18,460 --> 00:33:22,020 they can generate a random number-- and the number 606 00:33:22,020 --> 00:33:27,020 could be c, for example-- and just send back g raised to c. 607 00:33:27,020 --> 00:33:30,170 And for all you know, they could actually 608 00:33:30,170 --> 00:33:34,430 get Bob to send g raised to b back, 609 00:33:34,430 --> 00:33:38,940 but they would intercept it and replace it with g raised to c. 610 00:33:38,940 --> 00:33:40,767 And they know what c is. 611 00:33:40,767 --> 00:33:42,850 So what's happening now is so you can [INAUDIBLE]. 612 00:33:42,850 --> 00:33:45,320 I won't go through all of the math here. 613 00:33:45,320 --> 00:33:48,130 But you can kind of see it, I hope. 614 00:33:48,130 --> 00:33:51,950 You end up, if you're Alice, exchanging 615 00:33:51,950 --> 00:33:58,030 a secret key with the pirates as opposed to Bob. 616 00:33:58,030 --> 00:34:03,280 And the way you can set this up is the pirates actually 617 00:34:03,280 --> 00:34:06,070 will get into a situation where Alice and Bob think 618 00:34:06,070 --> 00:34:08,980 that they're communicating with each other in a secure fashion, 619 00:34:08,980 --> 00:34:11,510 but the pirates can listen to all of the messages. 620 00:34:11,510 --> 00:34:13,210 They can decrypt all of the messages, 621 00:34:13,210 --> 00:34:16,940 because they know what the secret key, k, is. 622 00:34:16,940 --> 00:34:17,880 OK.
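The math that was skipped here is short enough to sketch in Python. The parameters p = 23 and g = 5 are assumed toy values, the keypair helper is hypothetical, and having the pirates reuse one exponent c toward both Alice and Bob is a simplification of this sketch rather than something the lecture specifies.

```python
import secrets

# Assumed toy public parameters, far too small for real use.
p, g = 23, 5


def keypair():
    """Return (secret exponent, g^secret mod p)."""
    s = secrets.randbelow(p - 2) + 1
    return s, pow(g, s, p)


a, g_a = keypair()  # Alice
b, g_b = keypair()  # Bob
c, g_c = keypair()  # the pirates in the middle

# The pirates intercept g^a and g^b and forward g^c to both sides instead.
k_alice = pow(g_c, a, p)  # Alice believes this key is shared with Bob
k_bob = pow(g_c, b, p)    # Bob believes this key is shared with Alice

# The pirates can compute both keys, because they know c.
k_pirates_alice = pow(g_a, c, p)
k_pirates_bob = pow(g_b, c, p)

assert k_alice == k_pirates_alice
assert k_bob == k_pirates_bob
```

With both keys in hand, the pirates can decrypt what Alice sends, read it, and re-encrypt it for Bob, so neither side notices anything unless the exchanged values are authenticated.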
623 00:34:17,880 --> 00:34:19,730 And remember that the secret key, k, 624 00:34:19,730 --> 00:34:21,219 is something that is probably going 625 00:34:21,219 --> 00:34:25,239 to be used in a symmetric key encryption scheme eventually 626 00:34:25,239 --> 00:34:28,510 to send real messages. 627 00:34:28,510 --> 00:34:31,190 So you're going to have ciphered text with that secret key, 628 00:34:31,190 --> 00:34:33,179 capital K, over there. 629 00:34:33,179 --> 00:34:37,730 And if the pirates or Mal get to know what the secret key, k, is 630 00:34:37,730 --> 00:34:41,000 through a man in the middle attack, you've got problems. 631 00:34:41,000 --> 00:34:42,080 All right? 632 00:34:42,080 --> 00:34:46,139 So the man in the middle attack is something 633 00:34:46,139 --> 00:34:47,730 that we have to worry about. 634 00:34:47,730 --> 00:34:49,440 What we're going to talk about next 635 00:34:49,440 --> 00:34:53,030 is something that addresses this problem. 636 00:34:53,030 --> 00:34:55,739 And it may not seem like it's directly 637 00:34:55,739 --> 00:34:57,200 addressing the problem. 638 00:34:57,200 --> 00:34:59,300 But fundamentally, what's going on here 639 00:34:59,300 --> 00:35:04,290 is you need to have authenticity in who you're 640 00:35:04,290 --> 00:35:05,310 communicating with. 641 00:35:05,310 --> 00:35:08,080 Alice has to somehow authenticate Bob. 642 00:35:08,080 --> 00:35:10,710 And Alice has to know somehow that g 643 00:35:10,710 --> 00:35:14,560 raised to b is something that came from Bob. 644 00:35:14,560 --> 00:35:16,780 It's not g raised to c that came from somebody 645 00:35:16,780 --> 00:35:18,380 else in the middle. 646 00:35:18,380 --> 00:35:21,110 And that's where asymmetric key cryptography and public keys 647 00:35:21,110 --> 00:35:26,700 come in, where you have a certified public key associated 648 00:35:26,700 --> 00:35:29,170 with yourself.
649 00:35:29,170 --> 00:35:31,230 And maybe you need VeriSign or you 650 00:35:31,230 --> 00:35:34,090 need the DMV, or the Registry of Motor Vehicles, 651 00:35:34,090 --> 00:35:36,150 RMV, to do this for you. 652 00:35:36,150 --> 00:35:39,020 But you create a certified public key, 653 00:35:39,020 --> 00:35:41,430 which is associated with your identity. 654 00:35:41,430 --> 00:35:42,840 And it's public. 655 00:35:42,840 --> 00:35:45,100 You can put it on a website. 656 00:35:45,100 --> 00:35:47,590 And everyone can access it using HTTPS, 657 00:35:47,590 --> 00:35:50,300 so they know they're going to the exact website 658 00:35:50,300 --> 00:35:51,740 that you've put up. 659 00:35:51,740 --> 00:35:57,230 And that gives you a way of identifying yourself. 660 00:35:57,230 --> 00:35:59,440 And if you can do that, you can protect 661 00:35:59,440 --> 00:36:01,560 against the man in the middle attack using 662 00:36:01,560 --> 00:36:04,140 asymmetric key cryptography. 663 00:36:04,140 --> 00:36:09,390 So that's kind of the final part of this puzzle that's 664 00:36:09,390 --> 00:36:15,470 associated with authentication and secret key exchange and all 665 00:36:15,470 --> 00:36:16,213 of that. 666 00:36:16,213 --> 00:36:18,337 Once we do that, you'll know what the functionality 667 00:36:18,337 --> 00:36:19,789 is that we require. 668 00:36:19,789 --> 00:36:21,330 And then we'll have to talk about how 669 00:36:21,330 --> 00:36:22,975 we can build such systems. 670 00:36:27,520 --> 00:36:28,020 Cool. 671 00:36:28,020 --> 00:36:29,560 Any questions so far? 672 00:36:29,560 --> 00:36:32,930 How we doing? 673 00:36:32,930 --> 00:36:33,430 OK. 674 00:36:36,060 --> 00:36:40,880 So public key encryption, let me just do a little set up. 675 00:36:40,880 --> 00:36:44,370 I said some of this last time. 676 00:36:44,370 --> 00:36:48,580 But to make sure we're on the same page, what we have here 677 00:36:48,580 --> 00:36:53,970 is we really want a message plus a public key.
678 00:36:53,970 --> 00:37:02,680 And you want to obtain ciphered text using this operation. 679 00:37:02,680 --> 00:37:05,920 And this plus is not arithmetic addition. 680 00:37:05,920 --> 00:37:08,900 It's just that we're putting these two things together 681 00:37:08,900 --> 00:37:11,930 into an algorithm, a public key encryption algorithm, that 682 00:37:11,930 --> 00:37:13,500 produces ciphered text. 683 00:37:13,500 --> 00:37:14,290 All right. 684 00:37:14,290 --> 00:37:18,910 And this public key, just to reiterate what I just said, 685 00:37:18,910 --> 00:37:23,790 is going to be, if Alice is producing the message 686 00:37:23,790 --> 00:37:26,460 and Bob is getting the ciphered text, 687 00:37:26,460 --> 00:37:30,650 this is going to be Bob's public key. 688 00:37:30,650 --> 00:37:33,800 And the fact that it's Bob's public key 689 00:37:33,800 --> 00:37:35,420 is something that Alice should be 690 00:37:35,420 --> 00:37:38,370 able to authenticate using VeriSign, using 691 00:37:38,370 --> 00:37:40,574 the Registry of Motor Vehicles, what have you. 692 00:37:40,574 --> 00:37:42,490 That's what's going to protect against the man 693 00:37:42,490 --> 00:37:43,600 in the middle attack. 694 00:37:43,600 --> 00:37:44,690 OK. 695 00:37:44,690 --> 00:37:47,790 We're not going to talk a lot about how 696 00:37:47,790 --> 00:37:48,920 you can get a certificate. 697 00:37:48,920 --> 00:37:51,340 Your MIT certificate is something which 698 00:37:51,340 --> 00:37:54,310 corresponds to your MIT ID. 699 00:37:54,310 --> 00:37:57,330 It's got information, what year you are, what your name is. 700 00:37:57,330 --> 00:37:59,470 And when you generate that certificate, 701 00:37:59,470 --> 00:38:03,886 you are getting a certificate of authenticity that you are you. 702 00:38:03,886 --> 00:38:05,260 And of course, if you give that away 703 00:38:05,260 --> 00:38:07,180 and you hand it to someone else, someone 704 00:38:07,180 --> 00:38:09,160 can pretend to be you as well.
705 00:38:09,160 --> 00:38:13,860 But that's what's happening when we talk about public keys 706 00:38:13,860 --> 00:38:17,500 and you owning public keys. 707 00:38:17,500 --> 00:38:21,390 We're not going to, as I said, get into that very much more. 708 00:38:21,390 --> 00:38:25,310 I'm more interested in describing this algorithm 709 00:38:25,310 --> 00:38:26,960 for public key encryption. 710 00:38:26,960 --> 00:38:31,410 We'll look at a couple that produce a ciphered text given 711 00:38:31,410 --> 00:38:34,440 a message and a public key. 712 00:38:34,440 --> 00:38:37,130 Now, of course, what Bob needs to do 713 00:38:37,130 --> 00:38:41,670 is to take the ciphered text. 714 00:38:41,670 --> 00:38:43,750 And this is what Bob's doing. 715 00:38:43,750 --> 00:38:53,360 And Bob has a private key that is distinct from the public key 716 00:38:53,360 --> 00:38:57,600 and needs to get back exactly the message using a decryption 717 00:38:57,600 --> 00:39:01,680 algorithm that corresponds to the message that Alice sent. 718 00:39:01,680 --> 00:39:02,720 OK. 719 00:39:02,720 --> 00:39:10,010 And this whole thing is going to work provided 720 00:39:10,010 --> 00:39:15,010 knowing the public key, let's call it PK, 721 00:39:15,010 --> 00:39:16,060 and the private key. 722 00:39:16,060 --> 00:39:17,990 We can't call it PK as well, obviously. 723 00:39:17,990 --> 00:39:19,960 So we call it SK. 724 00:39:19,960 --> 00:39:26,770 Knowing the PK does not reveal anything 725 00:39:26,770 --> 00:39:33,290 in a mathematical sense about SK. 726 00:39:33,290 --> 00:39:36,740 But obviously, in order for this whole thing to work, 727 00:39:36,740 --> 00:39:42,950 PK and SK have to have some mathematical relationship. 728 00:39:42,950 --> 00:39:46,070 And the different cryptosystems including RSA, 729 00:39:46,070 --> 00:39:48,330 and we'll look at a knapsack cryptosystem, 730 00:39:48,330 --> 00:39:53,310 all have different algorithms for encryption and decryption.
731 00:39:53,310 --> 00:39:55,780 And they have different mathematical relationships 732 00:39:55,780 --> 00:39:57,440 between PK and SK. 733 00:39:57,440 --> 00:40:00,320 And for each of these relationships, 734 00:40:00,320 --> 00:40:03,260 you have to show that the adversary has 735 00:40:03,260 --> 00:40:06,580 to solve a computationally hard problem in order 736 00:40:06,580 --> 00:40:12,140 to discover SK given PK. 737 00:40:12,140 --> 00:40:14,870 And it turns out that for most of these systems, 738 00:40:14,870 --> 00:40:19,330 it's symmetric in the sense that these algorithms, 739 00:40:19,330 --> 00:40:23,120 at least for RSA, you could use either one 740 00:40:23,120 --> 00:40:25,690 of these interchangeably. 741 00:40:25,690 --> 00:40:27,390 And there's issues associated with that. 742 00:40:27,390 --> 00:40:30,120 So we really won't go too deep into that. 743 00:40:30,120 --> 00:40:35,660 But what I said should hold, which is if you have one, 744 00:40:35,660 --> 00:40:38,360 it shouldn't tell you anything about the other. 745 00:40:38,360 --> 00:40:40,760 There has to be a computationally hard problem 746 00:40:40,760 --> 00:40:44,790 associated with discovering one of these 747 00:40:44,790 --> 00:40:48,350 given only the other one. 748 00:40:48,350 --> 00:40:50,410 And we'll talk about what those hardness 749 00:40:50,410 --> 00:40:53,580 assumptions are certainly for RSA and also 750 00:40:53,580 --> 00:40:57,730 for another cryptosystem, a knapsack. 751 00:40:57,730 --> 00:41:00,270 So we're going to present RSA, which 752 00:41:00,270 --> 00:41:02,400 is this real magical algorithm. 753 00:41:02,400 --> 00:41:04,580 It's amazing it works. 754 00:41:04,580 --> 00:41:08,100 Every time I prepare for this lecture, 755 00:41:08,100 --> 00:41:10,130 I've got to relearn some things. 756 00:41:10,130 --> 00:41:17,370 And that's because there's one subtle aspect of this. 757 00:41:17,370 --> 00:41:18,580 It's all about number theory.
758 00:41:18,580 --> 00:41:21,190 Number theory can get pretty subtle. 759 00:41:21,190 --> 00:41:26,850 But it's also intricate enough that I forget the details. 760 00:41:26,850 --> 00:41:30,030 So let's get started on that. 761 00:41:30,030 --> 00:41:37,300 Basically, RSA is based on primes and factoring numbers 762 00:41:37,300 --> 00:41:42,760 into primes and using number theory 763 00:41:42,760 --> 00:41:45,820 to make sure that you can actually accomplish 764 00:41:45,820 --> 00:41:49,050 what this is trying to do. 765 00:41:49,050 --> 00:41:51,580 The functionality of RSA should be 766 00:41:51,580 --> 00:41:55,550 distinct from the security of RSA. 767 00:41:55,550 --> 00:41:57,450 When we talk about the functionality of RSA, 768 00:41:57,450 --> 00:42:01,390 we are saying for any message, if Alice 769 00:42:01,390 --> 00:42:05,270 uses Bob's public key to encrypt it, 770 00:42:05,270 --> 00:42:07,780 the ciphered text resulting from that 771 00:42:07,780 --> 00:42:11,700 should be decryptable into exactly the message 772 00:42:11,700 --> 00:42:16,220 that Alice sent given Bob's private key. 773 00:42:16,220 --> 00:42:18,240 That's the functional requirement 774 00:42:18,240 --> 00:42:21,120 of a public key encryption algorithm 775 00:42:21,120 --> 00:42:23,220 or a public key cryptosystem. 776 00:42:23,220 --> 00:42:26,690 The security requirement of a public key cryptosystem 777 00:42:26,690 --> 00:42:28,520 is what I wrote up there. 778 00:42:28,520 --> 00:42:32,540 It's the knowledge of SK should be hidden even 779 00:42:32,540 --> 00:42:33,790 given the knowledge of PK. 780 00:42:33,790 --> 00:42:36,190 And there's precise computational hardness 781 00:42:36,190 --> 00:42:39,460 assumptions that are associated with each cryptosystem. 782 00:42:39,460 --> 00:42:41,745 So let's separate out functionality from security. 783 00:42:41,745 --> 00:42:43,370 We're going to talk about functionality 784 00:42:43,370 --> 00:42:44,411 for the next few minutes. 
785 00:42:47,760 --> 00:43:00,970 Alice is going to pick two large secret primes, p and q. 786 00:43:00,970 --> 00:43:02,660 So what I'm going to describe here 787 00:43:02,660 --> 00:43:07,480 is Alice generating her public key and her private key. 788 00:43:07,480 --> 00:43:09,960 She's going to then publish her public key 789 00:43:09,960 --> 00:43:12,360 and keep her private key secret. 790 00:43:12,360 --> 00:43:14,160 Bob does the same thing. 791 00:43:14,160 --> 00:43:16,760 And then after that, they have to register. 792 00:43:16,760 --> 00:43:20,120 And this is not something we'll spend time on beyond me 793 00:43:20,120 --> 00:43:21,360 saying it one more time. 794 00:43:21,360 --> 00:43:23,820 They have to register their public keys with VeriSign 795 00:43:23,820 --> 00:43:26,190 or the RMVs like I talked about. 796 00:43:26,190 --> 00:43:30,380 So everyone knows that Alice's public key is this long number. 797 00:43:30,380 --> 00:43:33,500 But no one knows Alice's private key. 798 00:43:33,500 --> 00:43:35,860 So Alice picks two large secret primes. 799 00:43:35,860 --> 00:43:38,240 So these are actually going to result 800 00:43:38,240 --> 00:43:40,560 in the creation of her private key. 801 00:43:40,560 --> 00:43:48,600 And then Alice computes N equals pq. 802 00:43:48,600 --> 00:43:51,320 So she just multiplies those out. 803 00:43:51,320 --> 00:44:00,660 She chooses an encryption exponent, 804 00:44:00,660 --> 00:44:07,600 e, which satisfies this little equation, which 805 00:44:07,600 --> 00:44:11,560 says that it's relatively prime in relation 806 00:44:11,560 --> 00:44:14,880 to p minus 1 times q minus 1. 807 00:44:14,880 --> 00:44:15,910 And she knows p and q. 808 00:44:15,910 --> 00:44:19,550 So she can compute p minus 1 times q minus 1. 809 00:44:19,550 --> 00:44:24,010 So the gcd of e and p minus 1, q minus 1 is 1.
810 00:44:24,010 --> 00:44:26,740 And you can certainly accomplish this simply 811 00:44:26,740 --> 00:44:29,890 by choosing e to be a prime. 812 00:44:29,890 --> 00:44:34,600 Because then a gcd of a prime with anything else is 1. 813 00:44:34,600 --> 00:44:39,130 And it turns out that RSA uses-- this is all going to be public, 814 00:44:39,130 --> 00:44:41,370 by the way. 815 00:44:41,370 --> 00:44:42,560 e is going to be public. 816 00:44:42,560 --> 00:44:44,080 So you can just fix that. 817 00:44:44,080 --> 00:44:48,470 And most RSA algorithms just fix that to be a small number. 818 00:44:48,470 --> 00:44:51,430 The encryption exponent is a small number. 819 00:44:51,430 --> 00:44:52,910 And the reason it's a small number 820 00:44:52,910 --> 00:44:55,160 is because you're worried about performance. 821 00:44:55,160 --> 00:44:57,500 And we're going to exponentiate using e. 822 00:44:57,500 --> 00:44:59,750 And the smaller it is, the faster the encryption 823 00:44:59,750 --> 00:45:00,910 is going to go. 824 00:45:00,910 --> 00:45:04,530 So if you want to encrypt fast and decrypt more slowly, 825 00:45:04,530 --> 00:45:06,910 unfortunately, that's the trade off here. 826 00:45:06,910 --> 00:45:08,127 You would pick a small e. 827 00:45:08,127 --> 00:45:10,710 And then we're going to compute our decryption exponent, which 828 00:45:10,710 --> 00:45:12,650 obviously is going to have to be private. 829 00:45:12,650 --> 00:45:14,470 Because that's part of our private key. 830 00:45:14,470 --> 00:45:16,810 But that's going to be bigger if e is small. 831 00:45:16,810 --> 00:45:18,349 And that's just a trade off. 832 00:45:18,349 --> 00:45:20,140 It's symmetric in the sense that while it's 833 00:45:20,140 --> 00:45:21,940 an asymmetric algorithm, it's kind of 834 00:45:21,940 --> 00:45:24,290 symmetric in the mathematical sense 835 00:45:24,290 --> 00:45:28,160 that the private keys and the public key operations 836 00:45:28,160 --> 00:45:30,910 are symmetric. 
837 00:45:30,910 --> 00:45:33,015 So what is Alice's public key? 838 00:45:33,015 --> 00:45:39,540 Well, Alice's public key, which she can then publish, 839 00:45:39,540 --> 00:45:42,130 is simply N, e. 840 00:45:42,130 --> 00:45:42,630 OK. 841 00:45:50,740 --> 00:45:53,920 Now, the fun starts. 842 00:45:53,920 --> 00:45:56,520 We have to figure out what the private key is 843 00:45:56,520 --> 00:45:58,070 going to correspond to. 844 00:45:58,070 --> 00:46:02,220 And it turns out-- and this is one of those things 845 00:46:02,220 --> 00:46:05,110 where how did they ever think of this? 846 00:46:05,110 --> 00:46:09,020 And that's still true 40 years later. 847 00:46:09,020 --> 00:46:15,155 You get the decryption exponent using the extended Euclidean 848 00:46:15,155 --> 00:46:15,655 algorithm. 849 00:46:24,280 --> 00:46:30,490 And this is done by Alice secretly, where what you want 850 00:46:30,490 --> 00:46:35,110 is to have the relationship that e times d is 1. 851 00:46:35,110 --> 00:46:41,230 And this is mod p minus 1, q minus 1. 852 00:46:41,230 --> 00:46:44,540 And there's algorithms out there that 853 00:46:44,540 --> 00:46:50,150 would find the inverse that corresponds to d for e 854 00:46:50,150 --> 00:46:51,520 or vice versa. 855 00:46:51,520 --> 00:46:53,810 And they're polytime algorithms. 856 00:46:53,810 --> 00:46:58,149 As long as you know this number here, mod p minus 1, q minus 1, 857 00:46:58,149 --> 00:46:59,940 and you know that Alice knows that, you can 858 00:46:59,940 --> 00:47:02,210 get your decryption exponent. 859 00:47:02,210 --> 00:47:06,880 And typically, if e is small, as I said, d is going to be large. 860 00:47:06,880 --> 00:47:09,590 By the way, the numbers here, p and q, 861 00:47:09,590 --> 00:47:12,710 are going to be roughly 1,000 bits long. 862 00:47:12,710 --> 00:47:15,650 So that's essentially-- we're talking about huge primes here. 863 00:47:15,650 --> 00:47:19,180 And so n would be 2048 bits in that case.
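The key generation steps can be sketched in Python as follows. This is a minimal illustration under assumptions, not how a production library does it: the primes 61 and 53 and the exponent e = 17 are toy values chosen for readability, and the function names extended_gcd and rsa_keygen are hypothetical. The sketch checks the gcd condition explicitly instead of relying on e being prime, since a prime e can still divide p minus 1 times q minus 1.

```python
from math import gcd


def extended_gcd(a, b):
    """Extended Euclidean algorithm: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def rsa_keygen(p, q, e):
    """Toy RSA key generation from two primes p, q and a chosen exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p - 1)(q - 1)")
    # The coefficient of e in e*x + phi*y == 1 is the inverse of e mod phi.
    _, x, _ = extended_gcd(e, phi)
    d = x % phi  # normalize to the range 0 .. phi - 1
    return (n, e), (d, p, q)


# Assumed toy inputs; real p and q are on the order of a thousand bits each.
public_key, private_key = rsa_keygen(61, 53, 17)
n, e = public_key
d, p, q = private_key
assert (e * d) % ((p - 1) * (q - 1)) == 1
```

With these toy inputs the public key is (3233, 17) and the decryption exponent is 2753, which shows a small e paired with a much larger d.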
864 00:47:19,180 --> 00:47:27,010 So the private key, Alice's private key, 865 00:47:27,010 --> 00:47:32,450 you can think of as d, p, q. 866 00:47:32,450 --> 00:47:35,255 So now, it's clear as to what's public and what's private. 867 00:47:35,255 --> 00:47:37,040 n and e are public. 868 00:47:37,040 --> 00:47:40,950 d, p, and q are private. 869 00:47:40,950 --> 00:47:43,030 So that's the set up for RSA. 870 00:47:43,030 --> 00:47:48,160 And it's not at all clear that RSA accomplishes 871 00:47:48,160 --> 00:47:52,700 either of the two things that we need, the first of which 872 00:47:52,700 --> 00:47:57,490 is functionality, the fact that encrypting, and then decrypting 873 00:47:57,490 --> 00:48:00,470 a message should get you back that message. 874 00:48:00,470 --> 00:48:02,580 So that's the first thing we need to look at. 875 00:48:02,580 --> 00:48:05,140 And the security part is actually a little bit easier. 876 00:48:05,140 --> 00:48:07,190 Because you can see we're going to have 877 00:48:07,190 --> 00:48:11,460 to make assumptions about factoring primes 878 00:48:11,460 --> 00:48:13,050 and so on and so forth. 879 00:48:13,050 --> 00:48:15,690 Right here, you can just see that immediately. 880 00:48:15,690 --> 00:48:18,590 The biggest assumption made by RSA 881 00:48:18,590 --> 00:48:20,840 from a computational hardness standpoint 882 00:48:20,840 --> 00:48:24,450 is simply that if the adversary sees n, 883 00:48:24,450 --> 00:48:28,660 that they should not be able to factor it into p and q. 884 00:48:28,660 --> 00:48:30,860 Because if they can do that, it's over. 885 00:48:30,860 --> 00:48:35,870 So that's actually easier than the functionality argument. 886 00:48:35,870 --> 00:48:37,130 So why does this work? 887 00:48:42,970 --> 00:48:46,080 And amazingly, we can actually do this in about 10 minutes. 888 00:48:46,080 --> 00:48:49,350 I'm going to explain to you why this works in 10 minutes. 
889 00:48:49,350 --> 00:48:53,570 And the only theorem that we'll require 890 00:48:53,570 --> 00:48:56,720 on top of this, which I will not prove, 891 00:48:56,720 --> 00:49:01,130 because Fermat proved it centuries ago, 892 00:49:01,130 --> 00:49:05,830 is Fermat's Little Theorem that says 893 00:49:05,830 --> 00:49:12,020 that when you have p being a prime-- 894 00:49:12,020 --> 00:49:13,985 you can think of this as a special case. 895 00:49:23,870 --> 00:49:27,290 You take m, and m is an arbitrary number. 896 00:49:27,290 --> 00:49:34,760 And if p is a prime, then this relationship holds. 897 00:49:34,760 --> 00:49:39,190 So you raise it to the p minus 1 power, and you get 1, mod p. 898 00:49:39,190 --> 00:49:40,980 So that's Fermat's Little Theorem 899 00:49:40,980 --> 00:49:42,160 that's going to be required. 900 00:49:42,160 --> 00:49:43,701 And that's pretty much the only thing 901 00:49:43,701 --> 00:49:47,980 that you have to invoke beyond sort of standard mod arithmetic 902 00:49:47,980 --> 00:49:51,720 to show that RSA works. 903 00:49:51,720 --> 00:49:53,010 So what's going on here? 904 00:49:53,010 --> 00:49:58,734 Let's call phi p minus 1 times q minus 1. 905 00:49:58,734 --> 00:50:00,650 Obviously, that's showed up a couple of times. 906 00:50:00,650 --> 00:50:03,190 And you may as well represent it by using 907 00:50:03,190 --> 00:50:05,830 a smaller, simpler symbol. 908 00:50:05,830 --> 00:50:07,530 So we'll call that phi. 909 00:50:07,530 --> 00:50:13,870 And we are going to say that ed equals 1 910 00:50:13,870 --> 00:50:16,280 mod phi is given to us. 911 00:50:21,870 --> 00:50:28,490 And therefore, we can say that ed equals 1 plus k phi. 912 00:50:31,080 --> 00:50:32,840 So that's it. 913 00:50:32,840 --> 00:50:34,510 All I'm saying is the remainder 914 00:50:34,510 --> 00:50:37,790 when you took the mod with respect to phi was 1. 915 00:50:37,790 --> 00:50:40,900 So the actual number was 1 plus k times phi.
916 00:50:40,900 --> 00:50:43,300 k is some integer. 917 00:50:43,300 --> 00:50:46,390 Think of it as a positive integer. 918 00:50:46,390 --> 00:50:50,570 Remember that we now have the p and the q are over there. 919 00:50:50,570 --> 00:50:52,440 And p and q are primes. 920 00:50:52,440 --> 00:50:56,230 So p and q are primes. 921 00:50:56,230 --> 00:50:58,980 And given that that's the case, we really 922 00:50:58,980 --> 00:51:02,310 have two cases to analyze. 923 00:51:02,310 --> 00:51:03,210 Oh, I'm sorry. 924 00:51:03,210 --> 00:51:14,530 I missed one crucial point, which I should have told you, 925 00:51:14,530 --> 00:51:16,360 which is I gave you this. 926 00:51:16,360 --> 00:51:20,050 But I didn't actually tell you what is going on. 927 00:51:20,050 --> 00:51:23,040 I mentioned it in passing, exponentiation. 928 00:51:23,040 --> 00:51:25,980 But I didn't tell you exactly what the encryption 929 00:51:25,980 --> 00:51:28,670 algorithm was and the decryption algorithm was. 930 00:51:28,670 --> 00:51:30,250 And obviously, you need that in order 931 00:51:30,250 --> 00:51:31,270 to prove their correctness. 932 00:51:31,270 --> 00:51:32,853 I mean, it'd be wonderful if you could 933 00:51:32,853 --> 00:51:33,990 prove correctness of this. 934 00:51:33,990 --> 00:51:38,440 There exists an algorithm that is such 935 00:51:38,440 --> 00:51:43,340 that RSA works or public key encryption works. 936 00:51:43,340 --> 00:51:46,240 So it turns out it's extremely straightforward. 937 00:51:46,240 --> 00:51:48,660 c equals m raised to e. 938 00:51:48,660 --> 00:51:51,210 And that's part of the usefulness 939 00:51:51,210 --> 00:51:56,315 and the power of RSA, which is you take m 940 00:51:56,315 --> 00:51:58,860 and you exponentiate it. 941 00:51:58,860 --> 00:52:04,310 And you take c and you exponentiate it. 942 00:52:04,310 --> 00:52:06,350 And the first is the encryption. 
943 00:52:06,350 --> 00:52:08,480 You get the c through encryption, 944 00:52:08,480 --> 00:52:10,440 as you can see over there, the ciphered text. 945 00:52:10,440 --> 00:52:11,970 And m is the plain text. 946 00:52:11,970 --> 00:52:13,500 And you get that back. 947 00:52:13,500 --> 00:52:24,650 So our goal here is to show that you have essentially something 948 00:52:24,650 --> 00:52:31,660 where when you exponentiate m raised to ed, 949 00:52:31,660 --> 00:52:34,060 it should give you m. 950 00:52:36,730 --> 00:52:40,080 For these choices that we have, of e and d, 951 00:52:40,080 --> 00:52:43,160 we've set up the d in such a way that m 952 00:52:43,160 --> 00:52:47,030 raised to ed-- because if you just go do encryption 953 00:52:47,030 --> 00:52:51,754 followed by decryption, you are doubly exponentiating. 954 00:52:51,754 --> 00:52:52,420 That make sense? 955 00:52:52,420 --> 00:52:54,510 Ask me questions if this doesn't make sense. 956 00:52:54,510 --> 00:52:55,600 This is important. 957 00:52:55,600 --> 00:52:58,930 m raised to ed should give you back m. 958 00:52:58,930 --> 00:53:01,860 And if you can show that for any m, you're done. 959 00:53:01,860 --> 00:53:03,920 That's the functionality of RSA. 960 00:53:03,920 --> 00:53:06,550 All right. 961 00:53:06,550 --> 00:53:08,770 So that's encryption and decryption. 962 00:53:08,770 --> 00:53:13,140 And so now let's go back to here. 963 00:53:13,140 --> 00:53:15,340 Now, ed equals 1 mod phi. 964 00:53:15,340 --> 00:53:16,670 Because I've set that up. 965 00:53:16,670 --> 00:53:20,440 This is how I discovered d given e. 966 00:53:20,440 --> 00:53:21,440 So that's a given to me. 967 00:53:21,440 --> 00:53:26,100 That's part of what's called the key generation phase of RSA. 968 00:53:26,100 --> 00:53:29,140 And that's the mathematical relationship 969 00:53:29,140 --> 00:53:31,830 that I keep harping on in terms of the relationship 970 00:53:31,830 --> 00:53:34,320 between the public and the private key. 
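[Editor's sketch] The encryption and decryption steps just described (c equals m raised to e, then c raised to d, both mod n) can be tried out on toy numbers. This is a minimal sketch under assumed parameters: the primes 61 and 53, the exponent 17, the message 65, and the function names are illustrative and not from the lecture. It relies on Python 3.8 or later, where the built-in `pow` accepts an exponent of -1 to compute a modular inverse.

```python
# Toy key generation (hypothetical parameters, far too small for real use).
p, q, e = 61, 53, 17
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)  # modular inverse of e mod phi (Python 3.8+)


def encrypt(m, e, n):
    """Encryption: c = m^e mod n, using only the public key (n, e)."""
    return pow(m, e, n)


def decrypt(c, d, n):
    """Decryption: m = c^d mod n, using the private exponent d."""
    return pow(c, d, n)


m = 65                      # plaintext, must satisfy 0 <= m < n
c = encrypt(m, e, n)        # ciphertext
recovered = decrypt(c, d, n)
print("round trip succeeded:", recovered == m)
```

The three-argument `pow` does modular exponentiation by repeated squaring, which is what keeps the cost polynomial in the number of bits.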
971 00:53:34,320 --> 00:53:37,755 So given that p and q are primes, I have two cases. 972 00:53:43,530 --> 00:53:50,050 The first case is that gcd of m, p 973 00:53:50,050 --> 00:53:54,620 is exactly 1, which means that the message m. 974 00:53:54,620 --> 00:53:56,860 So what I'm comparing here, the two cases, 975 00:53:56,860 --> 00:53:58,220 is I have the message. 976 00:53:58,220 --> 00:53:59,920 And I'm going to break up the messages 977 00:53:59,920 --> 00:54:02,010 into two different categories. 978 00:54:02,010 --> 00:54:03,057 That's it. 979 00:54:03,057 --> 00:54:05,140 There are all kinds of messages that are possible. 980 00:54:05,140 --> 00:54:06,670 These are arbitrary numbers. 981 00:54:06,670 --> 00:54:10,410 I'm going to break them up into two categories, one of which 982 00:54:10,410 --> 00:54:14,860 where the message is relatively prime in relation to the prime, 983 00:54:14,860 --> 00:54:16,080 p. 984 00:54:16,080 --> 00:54:17,710 And it's not a multiple of p. 985 00:54:17,710 --> 00:54:19,460 That's the way you want to think about it. 986 00:54:19,460 --> 00:54:24,100 Obviously, gcdmp would be 2 if M were 2p. 987 00:54:24,100 --> 00:54:28,850 So the case I'm looking at is gcd of mp equals 1. 988 00:54:28,850 --> 00:54:32,330 And then the another case is going to be trivial, actually, 989 00:54:32,330 --> 00:54:38,060 which is gcd of mp equals p. 990 00:54:38,060 --> 00:54:42,110 Did I say 2 when I said gcd of 2p, p equals 2? 991 00:54:42,110 --> 00:54:45,100 Come on. 992 00:54:45,100 --> 00:54:45,635 Wake up. 993 00:54:45,635 --> 00:54:46,260 Wow, that's it. 994 00:54:46,260 --> 00:54:47,250 Perfect. 995 00:54:47,250 --> 00:54:48,230 Wake up. 996 00:54:48,230 --> 00:54:49,710 OK. 997 00:54:49,710 --> 00:54:53,551 So gcd of 2p and p is p. 998 00:54:53,551 --> 00:54:54,675 So those are the two cases. 999 00:54:57,290 --> 00:54:59,060 All right. 
1000 00:54:59,060 --> 00:55:05,576 So by Fermat's Little Theorem, this 1001 00:55:05,576 --> 00:55:06,700 is really Fermat's theorem. 1002 00:55:06,700 --> 00:55:08,920 And because he had a last theorem, for some reason, 1003 00:55:08,920 --> 00:55:11,560 some people call this the Little Theorem. 1004 00:55:11,560 --> 00:55:12,960 But it's Fermat's theorem. 1005 00:55:12,960 --> 00:55:15,270 And we know what that is. 1006 00:55:15,270 --> 00:55:17,500 I just wrote that out there. 1007 00:55:17,500 --> 00:55:21,810 You can now write something that says what I'm going to do 1008 00:55:21,810 --> 00:55:30,530 is I'm going to just take m raised to p minus 1, 1009 00:55:30,530 --> 00:55:32,030 which is 1. 1010 00:55:32,030 --> 00:55:33,600 So this thing is 1. 1011 00:55:33,600 --> 00:55:37,254 And then I'm going to raise it to k times q minus 1. 1012 00:55:37,254 --> 00:55:39,170 And you'll see why I'm doing this in a second. 1013 00:55:39,170 --> 00:55:42,270 Because I want to get the 1 plus k phi factor here. 1014 00:55:42,270 --> 00:55:44,730 So I'm taking 1 and I'm raising it 1015 00:55:44,730 --> 00:55:47,570 to this power, which obviously is going to give me 1 back. 1016 00:55:47,570 --> 00:55:51,100 So all of that is straightforward. 1017 00:55:51,100 --> 00:55:53,460 And then I'm going to multiply this by m. 1018 00:55:53,460 --> 00:55:54,550 OK. 1019 00:55:54,550 --> 00:56:02,310 And this is clearly the same as m mod p. 1020 00:56:02,310 --> 00:56:06,320 Because all I've done is multiply it by 1. 1021 00:56:06,320 --> 00:56:06,820 All right. 1022 00:56:06,820 --> 00:56:08,310 So why did I do this? 1023 00:56:08,310 --> 00:56:12,500 Well, I did this, because I want to group together these two 1024 00:56:12,500 --> 00:56:13,570 exponents. 1025 00:56:13,570 --> 00:56:16,990 And since I've run out of room here, 1026 00:56:16,990 --> 00:56:19,750 let me just erase and finish this properly. 1027 00:56:19,750 --> 00:56:21,940 The other case is easy anyway. 
1028 00:56:21,940 --> 00:56:30,376 And so I can write this as 1 plus k p minus 1 q minus 1. 1029 00:56:30,376 --> 00:56:36,450 And that, of course, is exactly m raised to ed, right? 1030 00:56:36,450 --> 00:56:42,050 And so what I've done here is, because 1 times m is clearly m, 1031 00:56:42,050 --> 00:56:45,910 but if I look at this, this is m raised to ed. 1032 00:56:45,910 --> 00:56:49,080 So that's clearly m. 1033 00:56:49,080 --> 00:56:51,760 That's it. 1034 00:56:51,760 --> 00:56:55,160 Just figured out that when I have k phi 1035 00:56:55,160 --> 00:56:58,680 here, that's going to turn into 1, basically, 1036 00:56:58,680 --> 00:57:01,550 when you exponentiate it. 1037 00:57:01,550 --> 00:57:05,350 So that's the hard part, actually, as it turns out, 1038 00:57:05,350 --> 00:57:09,400 of proving RSA's correctness, just 1039 00:57:09,400 --> 00:57:12,770 introducing this 1 raised to something. 1040 00:57:12,770 --> 00:57:16,360 And then the easier part is simply 1041 00:57:16,360 --> 00:57:20,560 the case where the m is actually a multiple of p. 1042 00:57:20,560 --> 00:57:26,000 So you have a gcd m, comma p equals p. 1043 00:57:26,000 --> 00:57:31,940 And in this case, you know that m mod p is actually 0. 1044 00:57:31,940 --> 00:57:34,360 Because m is a multiple of p. 1045 00:57:34,360 --> 00:57:36,530 So you're basically exponentiating 0. 1046 00:57:36,530 --> 00:57:37,860 What do you do with 0? 1047 00:57:37,860 --> 00:57:40,220 You're going to get 0 on both sides. 1048 00:57:40,220 --> 00:57:44,790 So you're sending a message that's essentially 0 mod p 1049 00:57:44,790 --> 00:57:48,010 And when you decrypt it on that side, you get 0. 1050 00:57:48,010 --> 00:57:54,800 But one last line here is simply that m raised to ed 1051 00:57:54,800 --> 00:58:04,090 is m trivially all of this mod p if m is 0, right? 1052 00:58:04,090 --> 00:58:05,784 So that was the easy case. 1053 00:58:05,784 --> 00:58:07,450 So it was really pretty straightforward. 
1054 00:58:07,450 --> 00:58:10,640 In one case, we took 1 and we exponentiated it and showed 1055 00:58:10,640 --> 00:58:12,170 the result in a couple steps. 1056 00:58:12,170 --> 00:58:15,240 In the other case, you had messages that were 0 mod p. 1057 00:58:15,240 --> 00:58:16,370 So it's very pretty. 1058 00:58:16,370 --> 00:58:17,870 It works. 1059 00:58:17,870 --> 00:58:20,890 And so what happened here? 1060 00:58:20,890 --> 00:58:23,740 I said that m raised to ed equals m. 1061 00:58:23,740 --> 00:58:27,500 And what I did here, of course, was I did everything. 1062 00:58:27,500 --> 00:58:30,350 So it's not quite done, a little sleight of hand 1063 00:58:30,350 --> 00:58:35,120 here as I switched over and talked about mod p. 1064 00:58:35,120 --> 00:58:37,410 So I had mod p over here. 1065 00:58:37,410 --> 00:58:39,340 And I said p and q were primes. 1066 00:58:39,340 --> 00:58:42,770 And I looked at p first. 1067 00:58:42,770 --> 00:58:45,690 But what I need to do, just to finish this off, 1068 00:58:45,690 --> 00:58:47,880 let me do this over here, is I have 1069 00:58:47,880 --> 00:58:50,541 to do the same argument for q. 1070 00:58:50,541 --> 00:58:56,160 And I'm going to put it together for n. 1071 00:58:56,160 --> 00:59:00,530 And the reason for that is simply that I have n over here. 1072 00:59:00,530 --> 00:59:03,050 So remember n equals p times q. 1073 00:59:03,050 --> 00:59:06,460 The encryption and the decryption 1074 00:59:06,460 --> 00:59:09,580 are going to be done mod n. 1075 00:59:09,580 --> 00:59:13,210 Certainly, the encryption has to be done mod n, because n 1076 00:59:13,210 --> 00:59:15,760 is the only public number that you have 1077 00:59:15,760 --> 00:59:17,670 that you can mod with, correct? 1078 00:59:17,670 --> 00:59:21,440 So what I've done here, this analysis is for p, 1079 00:59:21,440 --> 00:59:26,550 can do the same for q.
1080 00:59:26,550 --> 00:59:29,250 It's exactly the same, because p is a prime 1081 00:59:29,250 --> 00:59:31,710 and q is also a prime. 1082 00:59:31,710 --> 00:59:34,910 But I have to just do one last thing, which is put these two 1083 00:59:34,910 --> 00:59:39,130 things together and say that n equals p times q, 1084 00:59:39,130 --> 00:59:41,446 so the math is all going to work out. 1085 00:59:41,446 --> 00:59:42,570 Let me just write that out. 1086 00:59:42,570 --> 00:59:45,740 It's not too difficult to explain, 1087 00:59:45,740 --> 00:59:47,750 once I have this up here. 1088 00:59:47,750 --> 00:59:52,100 So in both cases of p and q, so when I say both cases, 1089 00:59:52,100 --> 00:59:55,330 I mean the p case and the q case. 1090 00:59:55,330 --> 01:00:02,070 I have m raised to ed equals m mod p. 1091 01:00:02,070 --> 01:00:09,320 And m raised to ed is the same as m mod q. 1092 01:00:09,320 --> 01:00:19,810 And since p and q are distinct primes, 1093 01:00:19,810 --> 01:00:27,690 we can say that m raised to ed equals m mod N. 1094 01:00:27,690 --> 01:00:31,410 And that essentially says c raised to d, if you really 1095 01:00:31,410 --> 01:00:33,000 want to put it all together, which 1096 01:00:33,000 --> 01:00:41,590 is m raised to e raised to d, equals m mod N, which 1097 01:00:41,590 --> 01:00:45,530 of course is what we want here. 1098 01:00:45,530 --> 01:00:51,000 And this thing was also mod N. This is mod N, mod N, mod N. 1099 01:00:51,000 --> 01:00:52,270 All right. 1100 01:00:52,270 --> 01:00:53,090 So that's RSA. 1101 01:00:53,090 --> 01:00:55,400 That's your first public key algorithm, 1102 01:00:55,400 --> 01:00:57,560 the first public key algorithm, at least 1103 01:00:57,560 --> 01:01:00,075 that stood the test of time, still in use today. 1104 01:01:03,900 --> 01:01:06,430 From a standpoint of computation, 1105 01:01:06,430 --> 01:01:09,670 this is the hardest part.
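[Editor's sketch] The correctness argument above (m raised to ed equals m mod p, the same mod q, and therefore mod n) can be checked exhaustively on a toy modulus. This is only a numerical sanity check under assumed parameters (the primes 61 and 53 and the exponent 17 are illustrative); it is not a substitute for the proof. The loop covers both cases from the lecture, including messages that are multiples of p or of q.

```python
# Toy parameters (hypothetical, chosen only so the loop is small).
p, q, e = 61, 53, 17
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)  # e*d == 1 + k*phi for some integer k (Python 3.8+)


def roundtrip_holds(m):
    """Check m^(ed) == m modulo p, modulo q, and modulo n = p*q."""
    return (pow(m, e * d, p) == m % p
            and pow(m, e * d, q) == m % q
            and pow(m, e * d, n) == m % n)


# Every message in [0, n), including the "easy case" where gcd(m, p) == p.
all_ok = all(roundtrip_holds(m) for m in range(n))
multiples_of_p_ok = all(roundtrip_holds(m) for m in range(0, n, p))
print("holds for every message:", all_ok)
```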
1106 01:01:09,670 --> 01:01:12,900 You have to exponentiate, and you have these large numbers. 1107 01:01:12,900 --> 01:01:19,000 And as the years have rolled by, RSA, as I've said, 1108 01:01:19,000 --> 01:01:20,400 withstood the test of time. 1109 01:01:20,400 --> 01:01:23,140 But the parameters have increased. 1110 01:01:23,140 --> 01:01:25,210 Way back then in the '70s, they were 1111 01:01:25,210 --> 01:01:28,960 thinking about 512-bit primes. 1112 01:01:28,960 --> 01:01:32,060 In fact, I can't recall whether n was 512-bits or p 1113 01:01:32,060 --> 01:01:33,440 and q were 512-bits. 1114 01:01:33,440 --> 01:01:36,940 But if p and q were 512-bits, then n would be 1024. 1115 01:01:36,940 --> 01:01:43,330 And now, NSA recommends 8192-bits for n. 1116 01:01:43,330 --> 01:01:44,660 So there's been an increase. 1117 01:01:44,660 --> 01:01:46,870 But the nice thing is that it's not 1118 01:01:46,870 --> 01:01:52,040 like there's an exponential increase in the computation. 1119 01:01:52,040 --> 01:01:55,910 Because the computation is polynomially related 1120 01:01:55,910 --> 01:01:57,570 to the number of bits. 1121 01:01:57,570 --> 01:02:01,560 So if you double it, I think, if I recall correctly, 1122 01:02:01,560 --> 01:02:04,430 decryption is going to be the cube of that. 1123 01:02:04,430 --> 01:02:06,370 Or actually, verifying signatures 1124 01:02:06,370 --> 01:02:07,820 is probably the cube of that. 1125 01:02:07,820 --> 01:02:11,470 But don't worry too much about that. 1126 01:02:11,470 --> 01:02:15,950 The bottom line is that as you double 1127 01:02:15,950 --> 01:02:18,050 the size of the exponent, you're going 1128 01:02:18,050 --> 01:02:22,530 to have a fairly small increase in the time required 1129 01:02:22,530 --> 01:02:26,460 to decrypt or to verify a signature, et cetera. 1130 01:02:26,460 --> 01:02:32,270 But it has grown from 512 or 1024 to 8192. 
1131 01:02:32,270 --> 01:02:35,160 And so hopefully, you all understand 1132 01:02:35,160 --> 01:02:38,340 how RSA works to some extent. 1133 01:02:38,340 --> 01:02:44,270 I just will leave it at the hardness assumptions here 1134 01:02:44,270 --> 01:02:50,050 are, like in the case of Diffie-Hellman, two-fold. 1135 01:02:50,050 --> 01:02:56,170 And the first one is kind of immediately obvious. 1136 01:02:56,170 --> 01:02:58,550 Just like it was the case with Diffie-Hellman, 1137 01:02:58,550 --> 01:03:01,390 where you had g raised to a flying about 1138 01:03:01,390 --> 01:03:04,130 and obviously that has to hide a, 1139 01:03:04,130 --> 01:03:08,130 here you got N, capital N, being published. 1140 01:03:08,130 --> 01:03:12,039 And if anybody could take N and factor it, 1141 01:03:12,039 --> 01:03:13,580 there may be multiple factorizations. 1142 01:03:13,580 --> 01:03:16,660 But you're going to get a unique prime factorization. 1143 01:03:16,660 --> 01:03:19,240 So that's what you want, that unique prime factorization 1144 01:03:19,240 --> 01:03:20,250 of N. 1145 01:03:20,250 --> 01:03:24,830 And if you get that, then you've broken the System 1146 01:03:24,830 --> 01:03:27,180 because you know what p and q are. 1147 01:03:27,180 --> 01:03:28,960 And so this is all public in the sense 1148 01:03:28,960 --> 01:03:31,120 that this algorithm is public. 1149 01:03:31,120 --> 01:03:33,559 If you're using RSA, this is what you're following. 1150 01:03:33,559 --> 01:03:35,100 So the person is trying to figure out 1151 01:03:35,100 --> 01:03:38,170 what the two primes are that together get multiplied 1152 01:03:38,170 --> 01:03:42,030 to form capital N. And so that's a factorization problem. 1153 01:03:42,030 --> 01:03:52,300 And so RSA hardness assumptions are given N, 1154 01:03:52,300 --> 01:03:59,230 hard to factor into p, comma q. 1155 01:03:59,230 --> 01:04:00,315 And this is factoring. 
1156 01:04:03,740 --> 01:04:06,480 And then one other thing, which is 1157 01:04:06,480 --> 01:04:09,320 given e-- so you're not actually breaking 1158 01:04:09,320 --> 01:04:11,130 the entire cryptosystem. 1159 01:04:11,130 --> 01:04:14,860 But you're breaking the privacy associated 1160 01:04:14,860 --> 01:04:17,680 with a particular message. 1161 01:04:17,680 --> 01:04:19,730 And so you could break the privacy associated 1162 01:04:19,730 --> 01:04:20,820 with a particular message. 1163 01:04:20,820 --> 01:04:23,370 You're given e, because that's public. 1164 01:04:23,370 --> 01:04:27,830 And you don't know what p and q are. 1165 01:04:27,830 --> 01:04:34,260 But you know that e is relatively prime with respect 1166 01:04:34,260 --> 01:04:38,680 to p minus 1, q minus 1, because that's RSA algorithm. 1167 01:04:38,680 --> 01:04:40,480 And that's a publicly known. 1168 01:04:40,480 --> 01:04:45,030 And you also know c, which is the ciphered text. 1169 01:04:45,030 --> 01:04:47,520 And so what you're doing is you're trying to discover m. 1170 01:04:47,520 --> 01:04:50,780 So you're trying to break a particular encryption that was 1171 01:04:50,780 --> 01:04:53,520 created by the RSA algorithm. 1172 01:04:53,520 --> 01:04:55,910 And you haven't discovered the private key here. 1173 01:04:55,910 --> 01:04:59,450 That's only discoverable through the factoring problem. 1174 01:04:59,450 --> 01:05:02,750 But you could break security if you 1175 01:05:02,750 --> 01:05:13,440 can find m such that m raised to e is c mod N. 1176 01:05:13,440 --> 01:05:16,076 So you're doing the searching for an m. 1177 01:05:16,076 --> 01:05:17,450 So you're trying to discover an m 1178 01:05:17,450 --> 01:05:19,500 that you can exponentiate to get what you 1179 01:05:19,500 --> 01:05:20,790 have on the right-hand side. 1180 01:05:20,790 --> 01:05:23,840 Because clearly, you can compute what's on the right-hand side. 1181 01:05:23,840 --> 01:05:25,890 So this is simply called RSA problem. 
1182 01:05:29,160 --> 01:05:31,510 So those are the two computational assumptions 1183 01:05:31,510 --> 01:05:35,620 that you need to make in order for RSA to be secure. 1184 01:05:35,620 --> 01:05:36,120 Cool. 1185 01:05:36,120 --> 01:05:36,703 Any questions? 1186 01:05:40,123 --> 01:05:43,477 AUDIENCE: [INAUDIBLE] center? 1187 01:05:43,477 --> 01:05:45,060 SRINIVAS DEVADAS: So that's anonymity. 1188 01:05:45,060 --> 01:05:45,630 Yes. 1189 01:05:45,630 --> 01:05:47,840 There's ring cryptography. 1190 01:05:47,840 --> 01:05:49,980 And there's a whole host of protocols. 1191 01:05:49,980 --> 01:05:54,550 I actually did some of them based on RSA, where you can, 1192 01:05:54,550 --> 01:05:59,170 by collecting a bunch of private keys together, 1193 01:05:59,170 --> 01:06:02,130 essentially set it up so it can be verified 1194 01:06:02,130 --> 01:06:06,850 that the message came from a collection of people, 1195 01:06:06,850 --> 01:06:10,680 but you can't tell which person the message came from. 1196 01:06:10,680 --> 01:06:12,550 So there's just a whole host of things. 1197 01:06:12,550 --> 01:06:14,400 There's thousands of papers written. 1198 01:06:14,400 --> 01:06:15,740 There's a wonderful field. 1199 01:06:15,740 --> 01:06:19,850 I encourage you to look into it if your interests are 1200 01:06:19,850 --> 01:06:21,110 inclined this way. 1201 01:06:21,110 --> 01:06:22,860 And it's just gone on and on. 1202 01:06:22,860 --> 01:06:25,870 It's become more important with the internet. 1203 01:06:25,870 --> 01:06:29,970 RSA, the company, was probably founded in the late '70s. 1204 01:06:29,970 --> 01:06:31,532 And they struggled for a while. 1205 01:06:31,532 --> 01:06:32,990 And then eventually, they were used 1206 01:06:32,990 --> 01:06:35,720 for Secure Sockets Layer in Netscape, 1207 01:06:35,720 --> 01:06:37,060 which was their big break. 1208 01:06:37,060 --> 01:06:40,510 And then, of course, Netscape meant the internet was around. 
1209 01:06:40,510 --> 01:06:44,140 And so really, the internet made RSA what it is today. 1210 01:06:44,140 --> 01:06:46,740 And so just a whole host of wonderful 1211 01:06:46,740 --> 01:06:48,360 algorithms out there, some of which 1212 01:06:48,360 --> 01:06:51,050 are based in RSA and some that are broken. 1213 01:06:51,050 --> 01:06:53,750 And so let's talk for the last few minutes 1214 01:06:53,750 --> 01:07:01,020 about all of the fits and starts that occurred in cryptography. 1215 01:07:01,020 --> 01:07:04,440 And precisely what I'd like to focus on 1216 01:07:04,440 --> 01:07:07,640 for the time we have left is hardness. 1217 01:07:07,640 --> 01:07:11,010 So we spent a lot of time talking about hard problems. 1218 01:07:11,010 --> 01:07:17,600 And we talked about NP-complete problems that are hard. 1219 01:07:17,600 --> 01:07:20,760 But they're hard in the worst case. 1220 01:07:20,760 --> 01:07:30,250 So you have a situation where you have NP-complete problems. 1221 01:07:39,396 --> 01:07:40,770 And I'd like to talk a little bit 1222 01:07:40,770 --> 01:07:44,030 about the relationship between NP-completeness and crypto. 1223 01:07:44,030 --> 01:07:46,660 Because we've made these assumptions about hardness. 1224 01:07:46,660 --> 01:07:53,000 Now, what's interesting here is that N composite is clearly 1225 01:07:53,000 --> 01:08:02,940 in NP, but unknown if NP-complete. 1226 01:08:02,940 --> 01:08:04,380 So this is very interesting. 1227 01:08:04,380 --> 01:08:08,290 The tried and trusted algorithm for public key encryption 1228 01:08:08,290 --> 01:08:10,450 relies on a computational assumption 1229 01:08:10,450 --> 01:08:12,660 where the problem associated with that assumption 1230 01:08:12,660 --> 01:08:14,810 is not even known to be NPC. 1231 01:08:14,810 --> 01:08:15,930 All right. 1232 01:08:15,930 --> 01:08:17,920 So that's kind of wild. 1233 01:08:17,920 --> 01:08:19,430 So how does this work? 
1234 01:08:19,430 --> 01:08:21,210 Or why does this work? 1235 01:08:21,210 --> 01:08:25,854 Now, if you take other problems, like, is a graph 3-colorable? 1236 01:08:29,840 --> 01:08:31,790 And so what does that mean? 1237 01:08:31,790 --> 01:08:33,810 Well, you have three colors. 1238 01:08:33,810 --> 01:08:41,580 And you're not allowed to reuse the same color 1239 01:08:41,580 --> 01:08:45,189 on two ends of an edge. 1240 01:08:45,189 --> 01:08:49,580 So if you put red over here, you can put red here, 1241 01:08:49,580 --> 01:08:51,800 but you can't put red here and there. 1242 01:08:51,800 --> 01:08:53,890 And so that graph is 3-colorable. 1243 01:08:53,890 --> 01:08:59,410 But if you had a clique, then this would not be 3-colorable. 1244 01:08:59,410 --> 01:09:01,439 Because you have all these edges. 1245 01:09:01,439 --> 01:09:02,899 You have three edges coming out. 1246 01:09:02,899 --> 01:09:05,700 And so clearly, the degree from a vertex 1247 01:09:05,700 --> 01:09:07,420 is going to tell you what you have. 1248 01:09:07,420 --> 01:09:09,270 So if you have a 4-clique over there, 1249 01:09:09,270 --> 01:09:11,020 immediately it's not 3-colorable. 1250 01:09:11,020 --> 01:09:14,890 But checking whether a graph is 3-colorable is NPC. 1251 01:09:20,330 --> 01:09:25,800 You can use 3SAT as a way of showing that. 1252 01:09:25,800 --> 01:09:31,770 So you can say, oh, wow, maybe I shouldn't be worried about RSA. 1253 01:09:31,770 --> 01:09:33,880 I should just be building cryptosystems 1254 01:09:33,880 --> 01:09:35,210 based on 3-colorability. 1255 01:09:35,210 --> 01:09:38,380 Because it seems like a much simpler problem than all 1256 01:09:38,380 --> 01:09:40,609 these grungy math that you have out there-- actually, 1257 01:09:40,609 --> 01:09:42,490 beautiful math that you have out there. 1258 01:09:42,490 --> 01:09:43,250 OK.
1259 01:09:43,250 --> 01:09:47,550 So that's something that's a perfectly reasonable question 1260 01:09:47,550 --> 01:09:48,560 to ask. 1261 01:09:48,560 --> 01:09:51,370 And then we have Knapsack. 1262 01:09:51,370 --> 01:09:55,850 Knapsack is simply you've got a bunch of items 1263 01:09:55,850 --> 01:09:59,290 and you just want to figure out whether you 1264 01:09:59,290 --> 01:10:06,940 can get this particular sum S. Knapsack is NPC as well. 1265 01:10:11,620 --> 01:10:14,110 And you got a bunch of weights given to you. 1266 01:10:14,110 --> 01:10:17,350 And the b_i are going to have to be 0 or 1. 1267 01:10:17,350 --> 01:10:20,740 So you just want to discover a particular assignment 1268 01:10:20,740 --> 01:10:24,770 of the b_i's, such that you pick the appropriate items to put 1269 01:10:24,770 --> 01:10:26,350 into the knapsack, right? 1270 01:10:26,350 --> 01:10:27,110 That's it. 1271 01:10:27,110 --> 01:10:29,830 That's a perfectly reasonable problem 1272 01:10:29,830 --> 01:10:33,760 to potentially use as a basis for computational hardness 1273 01:10:33,760 --> 01:10:36,060 to go build cryptosystems. 1274 01:10:36,060 --> 01:10:37,530 And people did that. 1275 01:10:37,530 --> 01:10:38,940 People did that for years. 1276 01:10:38,940 --> 01:10:40,670 They tried and they tried. 1277 01:10:40,670 --> 01:10:43,770 And they produced cryptosystems, public key cryptosystems, 1278 01:10:43,770 --> 01:10:48,350 based on Knapsack that look fantastic. 1279 01:10:48,350 --> 01:10:50,970 And they work from a functionality standpoint 1280 01:10:50,970 --> 01:10:53,609 in the sense that you would use this Knapsack-- 1281 01:10:53,609 --> 01:10:55,150 and I'll give you a sense of how this 1282 01:10:55,150 --> 01:11:01,390 is done in a minute-- problem to encrypt. 1283 01:11:01,390 --> 01:11:03,830 And then you'd use a different kind of Knapsack problem 1284 01:11:03,830 --> 01:11:05,260 to decrypt.
1285 01:11:05,260 --> 01:11:07,240 And when you encrypt it and decrypt it, 1286 01:11:07,240 --> 01:11:09,100 you did get that same message back, 1287 01:11:09,100 --> 01:11:11,720 except the whole world knew what the message was. 1288 01:11:11,720 --> 01:11:15,810 Because the problem associated with the Knapsack 1289 01:11:15,810 --> 01:11:17,760 wasn't hard enough. 1290 01:11:17,760 --> 01:11:20,540 So the computational hardness was what 1291 01:11:20,540 --> 01:11:22,660 broke the Knapsack schemes. 1292 01:11:22,660 --> 01:11:25,680 And then you come down to asking why 1293 01:11:25,680 --> 01:11:31,820 is it that this problem that's not known to be NPC 1294 01:11:31,820 --> 01:11:34,310 seems to have stood the test of time, 1295 01:11:34,310 --> 01:11:36,850 but all these other problems, like Knapsack 1296 01:11:36,850 --> 01:11:39,790 and 3-colorability, which is even worse than Knapsack, 1297 01:11:39,790 --> 01:11:42,540 when people have built cryptosystems based on this, 1298 01:11:42,540 --> 01:11:45,220 they've all been broken very quickly? 1299 01:11:45,220 --> 01:11:47,210 And so why is that? 1300 01:11:47,210 --> 01:11:54,340 What do you think the reason is, sort of at a high level? 1301 01:11:54,340 --> 01:11:55,600 What does NP-completeness say? 1302 01:11:59,740 --> 01:12:03,389 When we talk about complexity, what are we worried about? 1303 01:12:03,389 --> 01:12:04,930 Most of the time, what are we talking 1304 01:12:04,930 --> 01:12:07,940 about when we talk about complexity of an algorithm? 1305 01:12:07,940 --> 01:12:13,370 Or in this case, in the case of a problem, what adjective 1306 01:12:13,370 --> 01:12:17,130 do we put in front of runtime, for example, 1307 01:12:17,130 --> 01:12:18,627 when we compute complexity? 1308 01:12:18,627 --> 01:12:19,920 AUDIENCE: Worst [INAUDIBLE]. 1309 01:12:19,920 --> 01:12:21,420 SRINIVAS DEVADAS: Worst case, right? 1310 01:12:21,420 --> 01:12:23,080 Worst case.
1311 01:12:23,080 --> 01:12:25,790 In the worst case, you're going to be 1312 01:12:25,790 --> 01:12:27,900 able to create random graphs where 1313 01:12:27,900 --> 01:12:31,300 it takes exponential time to discover whether they're 1314 01:12:31,300 --> 01:12:33,170 3-colorable or not. 1315 01:12:33,170 --> 01:12:35,370 But in the average case, all you do 1316 01:12:35,370 --> 01:12:38,010 is if you have a large graph, if there's 1317 01:12:38,010 --> 01:12:41,440 one little 4-clique in the graph and you can find it, 1318 01:12:41,440 --> 01:12:44,500 instantly you know that it's not 3-colorable, right? 1319 01:12:44,500 --> 01:12:46,360 So it turns out that 3-colorability is 1320 01:12:46,360 --> 01:12:49,740 just the worst thing ever when it comes to cryptography. 1321 01:12:49,740 --> 01:12:51,770 Because the larger the graph-- and you 1322 01:12:51,770 --> 01:12:53,020 need this graph to be large. 1323 01:12:53,020 --> 01:12:55,871 Because anything that's small, is constant time, right? 1324 01:12:55,871 --> 01:12:57,370 Because so what if it's exponential? 1325 01:12:57,370 --> 01:12:58,450 It's constant time. 1326 01:12:58,450 --> 01:13:00,620 So you need the graph to be large. 1327 01:13:00,620 --> 01:13:03,340 When you have a random graph that's large, 1328 01:13:03,340 --> 01:13:06,680 the chances you're going to find a 4-clique in a 2,000 vertex 1329 01:13:06,680 --> 01:13:09,120 graph is pretty high. 1330 01:13:09,120 --> 01:13:11,770 And so if you just go scan and look for a 4-clique, 1331 01:13:11,770 --> 01:13:14,330 instantly you know that this graph is not 3-colorable, 1332 01:13:14,330 --> 01:13:14,830 right? 1333 01:13:14,830 --> 01:13:19,640 So in the average case, 3-colorability is easy. 1334 01:13:19,640 --> 01:13:22,590 It's easy to solve in the average case.
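[Editor's sketch] The "scan for a 4-clique" shortcut mentioned above can be written down directly. This is a minimal brute-force sketch; the function name and the edge-list graph representation are assumptions for illustration. It checks every set of four vertices, so it takes polynomial time (on the order of the fourth power of the vertex count). Finding a 4-clique proves the graph is not 3-colorable; finding none proves nothing either way.

```python
from itertools import combinations


def has_4_clique(num_vertices, edges):
    """Return True if some four vertices are all pairwise adjacent."""
    adjacent = {frozenset(edge) for edge in edges}
    for quad in combinations(range(num_vertices), 4):
        # A 4-clique needs all six pairs among the four vertices to be edges.
        if all(frozenset(pair) in adjacent for pair in combinations(quad, 2)):
            return True
    return False


# Complete graph on 4 vertices: a 4-clique, hence not 3-colorable.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
# A 4-cycle: no 4-clique (and it happens to be 2-colorable).
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(has_4_clique(4, k4_edges), has_4_clique(4, cycle_edges))
```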
1335 01:13:22,590 --> 01:13:26,540 And the wonderful thing about factoring 1336 01:13:26,540 --> 01:13:29,390 is as long as the numbers are large, 1337 01:13:29,390 --> 01:13:30,940 doesn't matter what the numbers are, 1338 01:13:30,940 --> 01:13:33,822 it's hard to factor in the average case. 1339 01:13:33,822 --> 01:13:35,030 So that's the big difference. 1340 01:13:35,030 --> 01:13:37,530 If you're going to take anything away from the rest of this, 1341 01:13:37,530 --> 01:13:39,270 it's the difference between problems 1342 01:13:39,270 --> 01:13:44,070 that cryptographic systems are based on. 1343 01:13:44,070 --> 01:13:46,270 The systems that have stood the test of time, 1344 01:13:46,270 --> 01:13:49,320 they're based on problems that are hard on the average. 1345 01:13:49,320 --> 01:13:53,720 And the NP-complete problems, like the simple ones 1346 01:13:53,720 --> 01:13:57,600 here, are hard in the worst case. 1347 01:13:57,600 --> 01:14:02,290 And this is also true for Knapsack. 1348 01:14:02,290 --> 01:14:05,000 So that's the essence of it. 1349 01:14:05,000 --> 01:14:07,450 I'll just give you a sense. 1350 01:14:07,450 --> 01:14:09,650 You can read the notes. 1351 01:14:09,650 --> 01:14:14,140 There's a way of generating secret keys 1352 01:14:14,140 --> 01:14:17,250 and public keys using Knapsack that I think 1353 01:14:17,250 --> 01:14:20,800 is kind of interesting that is worth looking at, 1354 01:14:20,800 --> 01:14:23,090 even though all of these systems are broken. 1355 01:14:23,090 --> 01:14:24,380 It's just kind of cool. 1356 01:14:24,380 --> 01:14:26,980 You know, how would you get encryption out of a knapsack? 1357 01:14:26,980 --> 01:14:28,200 I mean, you're putting things in a knapsack 1358 01:14:28,200 --> 01:14:29,320 and taking things out. 1359 01:14:29,320 --> 01:14:33,170 How can you set it up so you get an asymmetric key system, 1360 01:14:33,170 --> 01:14:35,657 a public key system, through a Knapsack problem?
1361 01:14:35,657 --> 01:14:37,240 So I'll just give you a sense of that. 1362 01:14:37,240 --> 01:14:38,090 And you can read. 1363 01:14:38,090 --> 01:14:39,150 I won't finish. 1364 01:14:39,150 --> 01:14:44,840 But I'd like to do what we did here for RSA in the time 1365 01:14:44,840 --> 01:14:47,490 that I have left for Knapsack. 1366 01:14:47,490 --> 01:14:49,140 That is kind of cool. 1367 01:14:49,140 --> 01:14:51,530 You get a sense of the variety of different public key 1368 01:14:51,530 --> 01:14:54,760 cryptosystems that are out there by looking at something that 1369 01:14:54,760 --> 01:14:57,180 is very different from RSA. 1370 01:14:59,940 --> 01:15:12,140 So in the Knapsack problem, the general Knapsack problem, 1371 01:15:12,140 --> 01:15:14,720 which is hard, is in NPC. 1372 01:15:14,720 --> 01:15:24,250 There's a super increasing Knapsack problem that's easy, 1373 01:15:24,250 --> 01:15:29,290 that can be solved in linear time. 1374 01:15:29,290 --> 01:15:31,960 What is a super increasing knapsack? 1375 01:15:31,960 --> 01:15:34,140 Well, a super increasing knapsack 1376 01:15:34,140 --> 01:15:45,460 is something where each w_j has this property: it is larger than the sum of all the weights before it. 1377 01:15:45,460 --> 01:15:55,074 So an example of that is 2, 3, 6, 13, 27, 52. 1378 01:15:55,074 --> 01:15:56,490 So the weights are super increasing. 1379 01:15:56,490 --> 01:15:58,100 2 plus 3 is less than 6. 1380 01:15:58,100 --> 01:16:02,150 2 plus 3 plus 6 is less than 13 and so on and so forth. 1381 01:16:02,150 --> 01:16:07,330 Do you see why super increasing knapsack is easily solvable? 1382 01:16:07,330 --> 01:16:08,560 I mean, what is a knapsack? 1383 01:16:08,560 --> 01:16:10,960 I got a limit on the amount of stuff 1384 01:16:10,960 --> 01:16:12,500 I can put into the knapsack. 1385 01:16:12,500 --> 01:16:16,470 And I want to be able to say yes or no in terms 1386 01:16:16,470 --> 01:16:19,120 of whether it fits exactly or not, just 1387 01:16:19,120 --> 01:16:20,900 in terms of our definition.
1388 01:16:20,900 --> 01:16:25,650 So what do I do here in super increasing knapsack? 1389 01:16:25,650 --> 01:16:26,893 Yep. 1390 01:16:26,893 --> 01:16:29,258 AUDIENCE: [INAUDIBLE] from the biggest 1391 01:16:29,258 --> 01:16:30,857 to the smallest [INAUDIBLE]. 1392 01:16:30,857 --> 01:16:31,940 SRINIVAS DEVADAS: Exactly. 1393 01:16:31,940 --> 01:16:33,880 That means that there's a linear time algorithm that 1394 01:16:33,880 --> 01:16:35,300 basically solves the problem. 1395 01:16:35,300 --> 01:16:38,730 And you know that you can do that, 1396 01:16:38,730 --> 01:16:40,850 and you would get the correct answer. 1397 01:16:40,850 --> 01:16:42,840 So that's pretty much what you've got. 1398 01:16:42,840 --> 01:16:45,710 That'll give you the highest weight. 1399 01:16:45,710 --> 01:16:49,100 If you have 13 exactly, you know you can't put 52 and 27. 1400 01:16:49,100 --> 01:16:50,170 You get 13. 1401 01:16:50,170 --> 01:16:52,580 There's no point in putting 2 and 3 and 6, 1402 01:16:52,580 --> 01:16:56,400 because that's not going to give you 13. 1403 01:16:56,400 --> 01:16:57,920 So that's clearly easy. 1404 01:16:57,920 --> 01:17:00,827 So we've got an interesting case here, 1405 01:17:00,827 --> 01:17:02,410 assuming this is all going to work out 1406 01:17:02,410 --> 01:17:05,670 from an adversarial standpoint, which unfortunately it doesn't, 1407 01:17:05,670 --> 01:17:09,920 you can look and say, ah, I want encryption to be 1408 01:17:09,920 --> 01:17:12,070 the super increasing knapsack. 1409 01:17:12,070 --> 01:17:14,730 Because that should be easy to do. 1410 01:17:14,730 --> 01:17:21,630 And I want the decryption, not knowing the private key, 1411 01:17:21,630 --> 01:17:23,580 to be as hard as knapsack. 1412 01:17:23,580 --> 01:17:24,250 OK. 1413 01:17:24,250 --> 01:17:25,999 So that's the kind of thing that you could 1414 01:17:25,999 --> 01:17:29,550 do if you built a cryptosystem. 1415 01:17:29,550 --> 01:17:31,550 And people did, Merkle and Hellman.
1416 01:17:31,550 --> 01:17:34,600 Hellman is the same guy, the second name, in Diffie-Hellman. 1417 01:17:34,600 --> 01:17:38,970 They proposed this particular system that ended up 1418 01:17:38,970 --> 01:17:40,830 being broken soon after. 1419 01:17:40,830 --> 01:17:43,410 But the idea is that you create a private key. 1420 01:17:46,150 --> 01:17:48,645 And the private key is a super increasing knapsack. 1421 01:17:54,980 --> 01:18:03,570 And then you use a private transform in order 1422 01:18:03,570 --> 01:18:08,270 to get a-- and this is really I put it in quotes, 1423 01:18:08,270 --> 01:18:14,725 because this was the bug-- "hard" Knapsack problem. 1424 01:18:17,650 --> 01:18:21,010 And this corresponded to the public key. 1425 01:18:25,500 --> 01:18:27,860 And so what you do is you won't actually 1426 01:18:27,860 --> 01:18:32,459 have to solve Knapsack for encryption, the hard problem. 1427 01:18:32,459 --> 01:18:34,000 The encryption would just simply take 1428 01:18:34,000 --> 01:18:36,810 the public key, which is completely public, 1429 01:18:36,810 --> 01:18:41,120 and you would create the encryption 1430 01:18:41,120 --> 01:18:45,450 of a message using this particular public key 1431 01:18:45,450 --> 01:18:49,680 in polynomial time. 1432 01:18:49,680 --> 01:18:54,830 But the inversion, not knowing the private key, 1433 01:18:54,830 --> 01:18:57,820 you would force the adversary to solve 1434 01:18:57,820 --> 01:19:00,050 what you think was a hard general Knapsack problem 1435 01:19:00,050 --> 01:19:03,920 to actually break the scheme or to get the decryption. 1436 01:19:03,920 --> 01:19:07,210 And so let me show you really quickly how this 1437 01:19:07,210 --> 01:19:08,230 works with numbers. 1438 01:19:08,230 --> 01:19:11,850 And we won't have you worry about symbols and things 1439 01:19:11,850 --> 01:19:12,820 like that. 1440 01:19:12,820 --> 01:19:16,924 So just give me a couple of extra minutes here.
1441 01:19:16,924 --> 01:19:18,340 So let's say that I had a message. 1442 01:19:22,310 --> 01:19:26,960 Oh, before I do that, let me look at-- let's say, 1443 01:19:26,960 --> 01:19:31,200 that we chose N equals 31 and M equals 105. 1444 01:19:31,200 --> 01:19:32,750 This is actually the message. 1445 01:19:36,520 --> 01:19:37,950 No, I'm sorry. 1446 01:19:37,950 --> 01:19:39,260 Capital M is not the message. 1447 01:19:39,260 --> 01:19:43,150 These are public parameters. 1448 01:19:43,150 --> 01:19:47,900 And we're going to take-- this is a transform. 1449 01:19:52,037 --> 01:19:52,620 Oh, I'm sorry. 1450 01:19:52,620 --> 01:19:53,660 These are not public. 1451 01:19:53,660 --> 01:19:54,600 These are private. 1452 01:19:54,600 --> 01:19:57,210 My bad. 1453 01:19:57,210 --> 01:19:58,710 So what I'm going to show you is I'm 1454 01:19:58,710 --> 01:20:00,830 going to take a super increasing Knapsack. 1455 01:20:00,830 --> 01:20:02,620 And that's exactly what I have up there. 1456 01:20:02,620 --> 01:20:05,580 So that corresponds to an easy Knapsack problem. 1457 01:20:05,580 --> 01:20:09,540 I'm going to convert it using these private parameters, 1458 01:20:09,540 --> 01:20:12,480 N equals 31 and M equals 105. 1459 01:20:12,480 --> 01:20:17,310 And so our private key is our super increasing knapsack, 1460 01:20:17,310 --> 01:20:23,720 which is 2, 3, 6, 13, 27, and 52. 1461 01:20:23,720 --> 01:20:29,920 And the public key, what I'm going to do 1462 01:20:29,920 --> 01:20:37,310 is simply multiply each of these, 2 times N. 1463 01:20:37,310 --> 01:20:42,950 And I'm going to take mod M. So for each of those values, 1464 01:20:42,950 --> 01:20:46,560 I multiplied by N, which is 31, and take the mod of 105, 1465 01:20:46,560 --> 01:20:55,480 and I end up getting 62, 93, 81, 88, 102, and 37. 1466 01:20:55,480 --> 01:21:01,250 So you can get a private key and a public key 1467 01:21:01,250 --> 01:21:03,649 using this private transform. 
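The key generation just described can be sketched in Python using the lecture's own numbers. The function names are hypothetical, and two of the sanity checks are standard conditions for this kind of knapsack scheme rather than something stated in the lecture: that M exceeds the sum of the private weights, and that N and M share no common factor (so that N has an inverse mod M).

```python
from math import gcd

def is_superincreasing(weights):
    """Check that every weight is strictly larger than the sum of all earlier ones."""
    total = 0
    for w in weights:
        if w <= total:
            return False
        total += w
    return True

def make_public_key(private_weights, n, m):
    """Apply the private transform w -> (w * n) mod m to each private weight."""
    assert is_superincreasing(private_weights)
    # Assumed conditions (standard for this scheme, not stated in the lecture):
    assert m > sum(private_weights)  # keeps private subset sums distinct mod m
    assert gcd(n, m) == 1            # guarantees n is invertible mod m
    return [(w * n) % m for w in private_weights]

private_key = [2, 3, 6, 13, 27, 52]
public_key = make_public_key(private_key, n=31, m=105)
print(public_key)  # [62, 93, 81, 88, 102, 37]
```

The printed list matches the public key written on the board: each entry is the corresponding private weight times 31, reduced mod 105.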
1468 01:21:03,649 --> 01:21:04,940 I'll let you look at the notes. 1469 01:21:04,940 --> 01:21:06,550 But basically what happens is when 1470 01:21:06,550 --> 01:21:10,030 you take a particular message M, what you end up 1471 01:21:10,030 --> 01:21:13,680 doing is you want to encrypt it using the public key, which 1472 01:21:13,680 --> 01:21:15,690 is this quantity over here. 1473 01:21:15,690 --> 01:21:18,860 And the way you encrypt that is simply 1474 01:21:18,860 --> 01:21:22,090 by taking a particular message. 1475 01:21:22,090 --> 01:21:26,770 And let's say that the message is written as 011000. 1476 01:21:26,770 --> 01:21:30,880 Then all you do is add up 93 and 81, 1477 01:21:30,880 --> 01:21:32,440 because those are those two. 1478 01:21:32,440 --> 01:21:37,200 And you say this is going to get encrypted by 174. 1479 01:21:37,200 --> 01:21:39,500 So the message encryption is simply 1480 01:21:39,500 --> 01:21:43,430 a simple operation where you add up weights of the knapsack. 1481 01:21:43,430 --> 01:21:46,320 So you end up getting 174 out here. 1482 01:21:46,320 --> 01:21:52,250 And the hope is, of course, that when the adversary sees 174-- 1483 01:21:52,250 --> 01:21:58,030 and this is the part where things get a little iffy-- 1484 01:21:58,030 --> 01:22:00,990 that it's hard-- you have to think about lots of numbers 1485 01:22:00,990 --> 01:22:03,520 here, of course-- for the adversary 1486 01:22:03,520 --> 01:22:08,610 to figure out that that 174 is actually 93 plus 81. 1487 01:22:08,610 --> 01:22:10,490 OK. 1488 01:22:10,490 --> 01:22:14,470 So the inverse is not necessarily an easy problem. 1489 01:22:14,470 --> 01:22:16,400 And that's exactly what Knapsack is, right? 1490 01:22:16,400 --> 01:22:19,840 I tell you what the sum is over there, which is S. 1491 01:22:19,840 --> 01:22:21,710 And I tell you what the weights are. 1492 01:22:21,710 --> 01:22:24,950 And it's hard for you to figure out what the b_i's are.
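A minimal Python sketch of the encryption step, using the public key and the message 011000 from the lecture. The function names and the choice to represent the message as a string of bits are assumptions for illustration. The second function is a naive stand-in for the adversary: it tries all 2^k bit patterns, which is instant for six weights but grows exponentially with the key length. It is not the attack that actually broke these systems.

```python
from itertools import product

def encrypt(bits, public_key):
    """Add up the public weights at the positions where the message bit is 1."""
    if len(bits) != len(public_key):
        raise ValueError("message must have one bit per public weight")
    return sum(w for b, w in zip(bits, public_key) if b == "1")

def brute_force(ciphertext, public_key):
    """Naive adversary: enumerate every bit string and keep those that match."""
    candidates = ("".join(p) for p in product("01", repeat=len(public_key)))
    return [bits for bits in candidates if encrypt(bits, public_key) == ciphertext]

public_key = [62, 93, 81, 88, 102, 37]
ciphertext = encrypt("011000", public_key)
print(ciphertext)  # 174, that is 93 + 81

recovered = brute_force(ciphertext, public_key)
print(recovered)  # ['011000']
```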
1493 01:22:24,950 --> 01:22:28,550 So now you see why this didn't work. 1494 01:22:28,550 --> 01:22:31,650 So you have to have a situation where 1495 01:22:31,650 --> 01:22:35,850 in the average case whatever you produce for the ciphertext 1496 01:22:35,850 --> 01:22:38,685 here, you're sending out the ciphertext according 1497 01:22:38,685 --> 01:22:41,140 to this cryptosystem, which is 174, 1498 01:22:41,140 --> 01:22:45,390 and you want to make sure that the adversary can't figure out 1499 01:22:45,390 --> 01:22:47,830 that this is actually 93 plus 81. 1500 01:22:47,830 --> 01:22:50,480 Amazingly, people thought they could build systems 1501 01:22:50,480 --> 01:22:52,770 using this assuming these numbers were 1502 01:22:52,770 --> 01:22:54,500 much larger than they are here. 1503 01:22:54,500 --> 01:22:57,220 But that certainly wasn't the case. 1504 01:22:57,220 --> 01:22:59,450 Because in the average case, you end up 1505 01:22:59,450 --> 01:23:01,590 being able to break these systems. 1506 01:23:01,590 --> 01:23:03,860 The last thing is, of course, you 1507 01:23:03,860 --> 01:23:06,250 don't want to necessarily solve the hard Knapsack 1508 01:23:06,250 --> 01:23:10,420 problem associated with this. 1509 01:23:10,420 --> 01:23:16,330 So what ends up happening is you end up using N equals 31. 1510 01:23:16,330 --> 01:23:19,470 So if you want to decrypt, you have N equals 31 1511 01:23:19,470 --> 01:23:21,620 and M equals 105. 1512 01:23:21,620 --> 01:23:25,240 And what you're going to do is take this and multiply it 1513 01:23:25,240 --> 01:23:30,270 by N inverse mod M. 1514 01:23:30,270 --> 01:23:37,200 So rather than doing times N mod M, you divide by N mod M. 1515 01:23:37,200 --> 01:23:40,720 And you can do this operation relatively simply. 1516 01:23:40,720 --> 01:23:46,810 And you can go back from 174 to figuring out 1517 01:23:46,810 --> 01:23:52,770 what the actual message was by computing this quantity.
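The decryption step can be sketched in Python with the same numbers. Multiplying 174 by N inverse mod M undoes the private transform, and what is left is an easy super increasing knapsack that the biggest-to-smallest greedy scan solves in linear time. The function names are hypothetical, and the three-argument pow with exponent -1 (modular inverse) assumes Python 3.8 or later.

```python
def solve_superincreasing(target, weights):
    """Greedy scan from the biggest weight to the smallest.

    Because each weight exceeds the sum of all smaller ones, a weight that
    fits in the remaining target must be taken. Returns the bit string, or
    raises ValueError if no subset hits the target exactly.
    """
    bits = []
    for w in reversed(weights):
        if w <= target:
            bits.append("1")
            target -= w
        else:
            bits.append("0")
    if target != 0:
        raise ValueError("target is not an exact subset sum")
    return "".join(reversed(bits))

def decrypt(ciphertext, private_weights, n, m):
    """Undo the private transform, then solve the easy knapsack."""
    n_inv = pow(n, -1, m)              # modular inverse; Python 3.8+
    target = (ciphertext * n_inv) % m  # subset sum over the private weights
    return solve_superincreasing(target, private_weights)

private_key = [2, 3, 6, 13, 27, 52]
n, m = 31, 105

print(pow(n, -1, m))                     # 61, since 31 * 61 = 1891 = 18 * 105 + 1
print((174 * 61) % m)                    # 9, which is 3 + 6 in the private weights
print(decrypt(174, private_key, n, m))   # 011000
```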
1518 01:23:52,770 --> 01:23:54,280 So I'll stop there. 1519 01:23:54,280 --> 01:23:56,720 I didn't quite get to everything that I wanted to cover. 1520 01:23:56,720 --> 01:23:58,460 But take a look at the notes. 1521 01:23:58,460 --> 01:24:01,950 Get a sense for why the difference exists 1522 01:24:01,950 --> 01:24:04,500 between NP-complete problems and problems that 1523 01:24:04,500 --> 01:24:06,260 were used in cryptosystems. 1524 01:24:06,260 --> 01:24:10,100 And happy to stick around and answer questions.