1 00:00:00,090 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation, or view additional materials 6 00:00:13,330 --> 00:00:17,236 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,236 --> 00:00:17,861 at ocw.mit.edu. 8 00:00:21,222 --> 00:00:23,180 SRINIVAS DEVADAS: All right, let's get started. 9 00:00:23,180 --> 00:00:24,740 Good morning everyone. 10 00:00:24,740 --> 00:00:28,210 I see a lot of tired faces. 11 00:00:28,210 --> 00:00:29,030 I'm not tired. 12 00:00:29,030 --> 00:00:29,930 Why are you tired? 13 00:00:29,930 --> 00:00:30,894 [LAUGHTER] 14 00:00:31,860 --> 00:00:33,130 I only lecture half the time. 15 00:00:33,130 --> 00:00:36,240 You guys take the class all the time. 16 00:00:36,240 --> 00:00:41,640 So today's lecture is about hash functions. 17 00:00:41,640 --> 00:00:45,680 And you may think that you know a lot about hash functions, 18 00:00:45,680 --> 00:00:47,860 and you probably do. 19 00:00:47,860 --> 00:00:50,840 But what we're going to do today is talk about really 20 00:00:50,840 --> 00:00:54,990 a completely different application of hash functions, 21 00:00:54,990 --> 00:00:57,700 and a new set of properties that we're 22 00:00:57,700 --> 00:01:00,190 going to require of hash functions 23 00:01:00,190 --> 00:01:01,900 that I'll elaborate on. 24 00:01:01,900 --> 00:01:04,459 And we're going to see a bunch of different applications 25 00:01:04,459 --> 00:01:06,700 to things like password protection, 26 00:01:06,700 --> 00:01:10,220 checking the integrity of files, auctions, 27 00:01:10,220 --> 00:01:11,316 and so on and so forth. 28 00:01:11,316 --> 00:01:12,940 So a little bit of a different lecture. 
29 00:01:12,940 --> 00:01:15,692 Both today and on Thursday I'm going 30 00:01:15,692 --> 00:01:18,320 to be doing cryptography and applications, 31 00:01:18,320 --> 00:01:20,380 not too much of algorithms. 32 00:01:20,380 --> 00:01:22,960 But we will do a little bit of analysis with respect 33 00:01:22,960 --> 00:01:25,390 to whether properties are satisfied, 34 00:01:25,390 --> 00:01:28,010 in this case by hash functions or not. 35 00:01:28,010 --> 00:01:29,910 So let's just dive right in. 36 00:01:29,910 --> 00:01:33,240 You all know what hash functions are. 37 00:01:33,240 --> 00:01:37,380 There's no real change in the definition. 38 00:01:37,380 --> 00:01:39,370 But the kinds of hash functions that we're 39 00:01:39,370 --> 00:01:41,960 going to be looking at today are quite different 40 00:01:41,960 --> 00:01:45,590 from the simple hash functions, like taking a mod 41 00:01:45,590 --> 00:01:49,950 with a prime number that we've looked at in the past. 42 00:01:49,950 --> 00:01:51,680 And the notion of collisions is going 43 00:01:51,680 --> 00:01:53,980 to come up again, except that again we're 44 00:01:53,980 --> 00:01:56,600 going to raise the stakes a little bit. 45 00:01:56,600 --> 00:02:04,100 So a hash function maps arbitrary 46 00:02:04,100 --> 00:02:08,539 strings-- let me do this right. 47 00:02:11,940 --> 00:02:15,645 So you're not making a statement about the length of the string. 48 00:02:18,440 --> 00:02:23,490 You will break it up, even if you had a string of length 512, 49 00:02:23,490 --> 00:02:28,830 or maybe it was 27, you do want to get a number out of it 50 00:02:28,830 --> 00:02:31,540 in a specific range. There's going 51 00:02:31,540 --> 00:02:34,190 to be a number of bits associated 52 00:02:34,190 --> 00:02:35,346 with our hash functions. 53 00:02:35,346 --> 00:02:36,970 And previously we had a number of slots 54 00:02:36,970 --> 00:02:39,620 associated with the output of the hash function. 
55 00:02:39,620 --> 00:02:42,100 But the input could be arbitrary. 56 00:02:42,100 --> 00:02:48,080 And these arbitrary strings of data 57 00:02:48,080 --> 00:02:50,310 are going to get mapped, as I just said, 58 00:02:50,310 --> 00:02:53,190 to a fixed length output. 59 00:02:56,282 --> 00:02:57,990 And we're going to think about this fixed 60 00:02:57,990 --> 00:03:01,590 length as being a number of bits today, 61 00:03:01,590 --> 00:03:04,530 as opposed to slots in the hash table. 62 00:03:04,530 --> 00:03:08,170 Because we really aren't going to be storing 63 00:03:08,170 --> 00:03:10,520 a dictionary or a hash table in the applications 64 00:03:10,520 --> 00:03:11,780 we're going to look at today. 65 00:03:11,780 --> 00:03:14,750 It's simply a question of computing a hash. 66 00:03:14,750 --> 00:03:18,120 And because the fixed length output 67 00:03:18,120 --> 00:03:23,880 is going to be something on the order of 160-bits, or 256-bits, 68 00:03:23,880 --> 00:03:26,980 there's no way that you could store 2 raised 69 00:03:26,980 --> 00:03:33,370 to 160 elements in a hash table, or even 2 raised to 64 70 00:03:33,370 --> 00:03:34,340 really. 71 00:03:34,340 --> 00:03:37,120 And so we're going to just assume 72 00:03:37,120 --> 00:03:41,880 that we're computing these hashes 73 00:03:41,880 --> 00:03:45,960 and using them for certain applications. 74 00:03:45,960 --> 00:03:48,800 I just wrote output twice I guess. 75 00:03:48,800 --> 00:03:52,720 So map it to a fixed length output. 76 00:03:52,720 --> 00:03:58,990 We want to do this in a deterministic fashion. 77 00:03:58,990 --> 00:04:04,430 So once we've computed the hash of a particular arbitrary 78 00:04:04,430 --> 00:04:07,960 string that is given to us, we want 79 00:04:07,960 --> 00:04:10,150 to be able to repeat that process to get 80 00:04:10,150 --> 00:04:13,090 the same hash every time. 81 00:04:13,090 --> 00:04:15,070 We want to do this in a public fashion. 
82 00:04:15,070 --> 00:04:16,170 So everything is public. 83 00:04:16,170 --> 00:04:17,420 There's no secrecy. 84 00:04:17,420 --> 00:04:19,920 There's keyed hash functions that we won't actually 85 00:04:19,920 --> 00:04:22,210 look at today, but maybe in passing 86 00:04:22,210 --> 00:04:24,500 I'll mention it next time. 87 00:04:24,500 --> 00:04:26,700 We're not looking at keyed hash functions here. 88 00:04:26,700 --> 00:04:30,920 There's no secrets in any of the descriptions of algorithms 89 00:04:30,920 --> 00:04:34,540 or techniques I'm going to be describing today. 90 00:04:34,540 --> 00:04:37,720 And we want this to be random. 91 00:04:37,720 --> 00:04:39,870 We want it to look random. 92 00:04:39,870 --> 00:04:43,750 True randomness is going to be impossible to achieve, 93 00:04:43,750 --> 00:04:45,142 given our other constraints. 94 00:04:45,142 --> 00:04:46,850 But we're going to try and approximate it 95 00:04:46,850 --> 00:04:48,880 with pseudo-randomness. 96 00:04:48,880 --> 00:04:51,040 But we'd want it to look random, because we 97 00:04:51,040 --> 00:04:55,210 are interested-- as we were in the case of dictionaries 98 00:04:55,210 --> 00:04:58,030 and the regular application of hash functions-- we 99 00:04:58,030 --> 00:05:00,870 are interested in minimizing collisions. 100 00:05:00,870 --> 00:05:03,570 And in fact we're going to raise the stakes really high 101 00:05:03,570 --> 00:05:05,470 with respect to collisions. 102 00:05:05,470 --> 00:05:09,870 We want it to be impossible for you, or anyone else, 103 00:05:09,870 --> 00:05:12,210 to discover collisions. 104 00:05:12,210 --> 00:05:15,150 And that's going to be an important property of collision 105 00:05:15,150 --> 00:05:20,520 resistance that obviously is going to require randomness. 106 00:05:20,520 --> 00:05:24,220 And those are the three things we want, 107 00:05:24,220 --> 00:05:26,840 deterministic, public, and random. 
108 00:05:26,840 --> 00:05:32,870 And so just from a function description standpoint 109 00:05:32,870 --> 00:05:35,800 you have {0, 1}^* here, which implies that it's 110 00:05:35,800 --> 00:05:37,770 an arbitrary length string. 111 00:05:37,770 --> 00:05:41,830 And we want to go to {0, 1}^d. 112 00:05:41,830 --> 00:05:43,955 And this is a string of length d. 113 00:05:49,170 --> 00:05:51,250 So that means that you're getting d-bits out 114 00:05:51,250 --> 00:05:52,650 from your hash function. 115 00:05:52,650 --> 00:05:56,630 And here the length is greater than or equal to 0. 116 00:06:00,360 --> 00:06:01,750 So that's it. 117 00:06:01,750 --> 00:06:05,000 Not a lot that's new here. 118 00:06:05,000 --> 00:06:08,880 But a few things that are going to be a little bit different. 119 00:06:08,880 --> 00:06:11,630 And there's some subtleties here that we'll get to. 120 00:06:11,630 --> 00:06:18,600 I want to emphasize two things, one of which I just said. 121 00:06:18,600 --> 00:06:23,570 There's no secrecy, no secret keys here in the hash functions 122 00:06:23,570 --> 00:06:25,350 that we are describing. 123 00:06:25,350 --> 00:06:27,250 All operations are public. 124 00:06:27,250 --> 00:06:33,330 So just like you had your hash function, which was k mod p, 125 00:06:33,330 --> 00:06:38,220 and p was a prime and p was public and known to everyone 126 00:06:38,220 --> 00:06:40,600 who used the dictionary, everything here 127 00:06:40,600 --> 00:06:42,570 we are going to be talking about is public. 128 00:06:42,570 --> 00:06:44,680 So anyone can compute h. 129 00:06:51,942 --> 00:06:53,400 And we're going to assume that this 130 00:06:53,400 --> 00:06:57,310 is poly-time computation-- not too surprising-- but I'm 131 00:06:57,310 --> 00:07:01,700 being quite flexible here. 
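As a minimal sketch of the signature just described, h mapping {0, 1}^* to {0, 1}^d, the following uses SHA-256 from Python's standard hashlib module as a stand-in, so d = 256. The choice of SHA-256 and the helper name h are assumptions made for illustration and are not taken from the lecture.

```python
import hashlib

D_BITS = 256  # output length d when SHA-256 is the stand-in


def h(data: bytes) -> bytes:
    """Public, deterministic map from arbitrary-length input to d bits."""
    return hashlib.sha256(data).digest()


# Inputs of length 0, 1, and 512 bytes all produce the same output length.
inputs = [b"", b"a", b"x" * 512]
digests = [h(m) for m in inputs]
for m, dg in zip(inputs, digests):
    print(len(m), "bytes in ->", len(dg) * 8, "bits out")
```

Anyone can run h with no secret key, and calling it twice on the same input returns the same digest.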
132 00:07:01,700 --> 00:07:03,720 When you look at dictionaries, and you 133 00:07:03,720 --> 00:07:07,140 think about using dictionaries, and using it to implement 134 00:07:07,140 --> 00:07:10,540 efficient algorithms, what is the assumption 135 00:07:10,540 --> 00:07:13,690 we kind of implicitly made-- or perhaps explicitly 136 00:07:13,690 --> 00:07:18,810 in some cases-- with respect to computing the hash? 137 00:07:18,810 --> 00:07:19,310 Anybody? 138 00:07:22,210 --> 00:07:22,710 Yeah? 139 00:07:22,710 --> 00:07:23,690 AUDIENCE: Constant time? 140 00:07:23,690 --> 00:07:25,023 SRINIVAS DEVADAS: Constant time. 141 00:07:25,023 --> 00:07:33,680 We assumed-- so this is not necessarily order 1, right? 142 00:07:33,680 --> 00:07:34,840 So that's important. 143 00:07:34,840 --> 00:07:42,180 So we're going to-- I want to make sure you're watching. 144 00:07:42,180 --> 00:07:45,770 So you're going to raise the stakes even with respect 145 00:07:45,770 --> 00:07:47,990 to the complexity of the hash. 146 00:07:47,990 --> 00:07:50,459 And as you'll see, because of the desirable properties, 147 00:07:50,459 --> 00:07:51,750 we're going to have to do that. 148 00:07:51,750 --> 00:07:54,010 We're going to ask for really a lot with respect 149 00:07:54,010 --> 00:07:55,610 to these hash functions. 150 00:07:55,610 --> 00:07:58,350 Nobody can find a collision, right? 151 00:07:58,350 --> 00:08:01,750 And if you have something as simple as k mod p, 152 00:08:01,750 --> 00:08:03,910 it's going to be trivial to find a collision. 153 00:08:03,910 --> 00:08:06,710 And so these order 1 hash functions 154 00:08:06,710 --> 00:08:08,380 that you're familiar with aren't going 155 00:08:08,380 --> 00:08:12,160 to make the grade with respect to any of the properties 156 00:08:12,160 --> 00:08:14,500 that we'll discuss in a few minutes. 157 00:08:14,500 --> 00:08:16,962 All right, so remember this is poly-time computation. 
158 00:08:16,962 --> 00:08:19,170 And there's lots of examples of these hash functions. 159 00:08:19,170 --> 00:08:21,820 And for those of you who are kind of into computer security 160 00:08:21,820 --> 00:08:24,890 and cryptography already, you might have heard 161 00:08:24,890 --> 00:08:29,240 of examples like MD4 and MD5. 162 00:08:29,240 --> 00:08:30,260 These are versions. 163 00:08:30,260 --> 00:08:32,309 MD stands for message digest. 164 00:08:32,309 --> 00:08:35,610 These were functions that were invented by Professor Rivest. 165 00:08:35,610 --> 00:08:42,970 And they had d equals 128 way back when-- 1992, 166 00:08:42,970 --> 00:08:45,830 if I recall-- when they were proposed. 167 00:08:45,830 --> 00:08:50,830 And these algorithms have since been broken in the sense 168 00:08:50,830 --> 00:08:53,490 that it was conjectured that they had particular properties 169 00:08:53,490 --> 00:08:59,060 of collision resistance that it would take exponential time 170 00:08:59,060 --> 00:09:01,660 for anybody to find collisions. 171 00:09:01,660 --> 00:09:04,490 And it still kind of takes exponential time, 172 00:09:04,490 --> 00:09:11,040 but 2 raised to 37 is exponential at one level, 173 00:09:11,040 --> 00:09:13,700 but constant in another level. 174 00:09:13,700 --> 00:09:17,720 So you can kind of do it in a few seconds now. 175 00:09:17,720 --> 00:09:20,610 So a little bit of history. 176 00:09:20,610 --> 00:09:23,300 I'm not going to spend a lot of time on this. 177 00:09:23,300 --> 00:09:28,400 MD5 was used to create what was called a secure hash algorithm. 178 00:09:28,400 --> 00:09:31,770 This is 160-bits. 179 00:09:31,770 --> 00:09:36,330 And this is not quite broken at this point. 180 00:09:36,330 --> 00:09:41,920 But people consider it broken, or soon to be broken. 181 00:09:41,920 --> 00:09:45,560 Right now the recommended algorithm 182 00:09:45,560 --> 00:09:50,260 is called SHA-3, secure hash algorithm version three. 
183 00:09:50,260 --> 00:09:53,870 And there was a contest that ran for like 18 months, 184 00:09:53,870 --> 00:09:56,770 or maybe even longer, that eventually was won 185 00:09:56,770 --> 00:09:59,580 by what turned into the SHA-3. 186 00:09:59,580 --> 00:10:03,200 And they had a different name for it that I can't recall. 187 00:10:03,200 --> 00:10:05,180 But it turned into SHA-3. 188 00:10:05,180 --> 00:10:08,220 And what happened along the way, as we went from MD4, 189 00:10:08,220 --> 00:10:12,690 MD5, SHA-1 to SHA-3, is that this amount of computation 190 00:10:12,690 --> 00:10:14,810 that you had to do increased. 191 00:10:14,810 --> 00:10:16,570 And the complexity of operations that you 192 00:10:16,570 --> 00:10:21,690 had to do in order to compute the hash of an arbitrary string 193 00:10:21,690 --> 00:10:24,460 increased to the point where-- you 194 00:10:24,460 --> 00:10:27,720 want to think about this as 100 rounds of computation. 195 00:10:27,720 --> 00:10:31,780 And certainly order d computation, 196 00:10:31,780 --> 00:10:34,200 where d is the number of bits. 197 00:10:34,200 --> 00:10:35,940 And perhaps even more. 198 00:10:35,940 --> 00:10:38,850 So it's definitely not order 1. 199 00:10:38,850 --> 00:10:43,270 So as I said a little bit of context with respect 200 00:10:43,270 --> 00:10:45,355 to the things that are out there. 201 00:10:45,355 --> 00:10:46,980 At the end of the lecture I'll give you 202 00:10:46,980 --> 00:10:49,570 a sense for how these hash functions are built. 203 00:10:49,570 --> 00:10:51,380 We're not going to spend a lot of time 204 00:10:51,380 --> 00:10:53,220 on creating these hash functions. 205 00:10:53,220 --> 00:10:56,320 It's really a research topic unto itself and not really 206 00:10:56,320 --> 00:10:58,390 in the scope of 6.046. 
207 00:10:58,390 --> 00:11:00,920 What is in the scope of 6.046, and what 208 00:11:00,920 --> 00:11:02,510 I think is more interesting, which 209 00:11:02,510 --> 00:11:05,690 is what we'll focus our energy and time on, 210 00:11:05,690 --> 00:11:07,990 is the properties of these hash functions. 211 00:11:07,990 --> 00:11:10,640 And why these properties are useful in a bunch 212 00:11:10,640 --> 00:11:12,270 of different apps. 213 00:11:12,270 --> 00:11:15,100 And so what is it that we want? 214 00:11:15,100 --> 00:11:19,440 We want a random oracle. 215 00:11:19,440 --> 00:11:23,790 We want to essentially build something 216 00:11:23,790 --> 00:11:27,960 that looks like that, deterministic, public, random. 217 00:11:27,960 --> 00:11:31,390 And we're going to claim that what we want 218 00:11:31,390 --> 00:11:33,135 is this random oracle which has all 219 00:11:33,135 --> 00:11:35,892 of these wonderful properties that I'm going to describe. 220 00:11:35,892 --> 00:11:37,850 I'm going to describe the random oracle to you, 221 00:11:37,850 --> 00:11:40,391 and then I'm going to tell you about what the properties are. 222 00:11:40,391 --> 00:11:44,420 And then unfortunately this is an ideal world 223 00:11:44,420 --> 00:11:47,890 and we can't build this in the real world. 224 00:11:47,890 --> 00:11:49,919 And so we're going to have to approximate it. 225 00:11:49,919 --> 00:11:52,460 And that's where the MD4's and the MD5's and the SHA-1's came 226 00:11:52,460 --> 00:11:55,200 in, OK? 227 00:11:55,200 --> 00:11:56,885 So this is not achievable in practice. 228 00:12:05,560 --> 00:12:09,300 So what is this oracle? 229 00:12:09,300 --> 00:12:17,170 This oracle is on input x, belonging to 0,1 star. 230 00:12:17,170 --> 00:12:20,740 So that could be an arbitrary string. 231 00:12:20,740 --> 00:12:26,100 If x not in the book-- so there's this the book, 232 00:12:26,100 --> 00:12:26,840 all right? 
233 00:12:26,840 --> 00:12:29,720 And there's this infinite capacity book 234 00:12:29,720 --> 00:12:35,170 that has all of the computations that were ever done prior. 235 00:12:35,170 --> 00:12:36,770 And they're always stored in the book. 236 00:12:36,770 --> 00:12:38,770 And that's how we're going to get determinism. 237 00:12:38,770 --> 00:12:42,100 Because this book initially gets filled in. 238 00:12:42,100 --> 00:12:44,380 All of the entries in the book are filled 239 00:12:44,380 --> 00:12:47,000 in using pure randomness. 240 00:12:47,000 --> 00:12:55,610 So you flip a coin d times to determine h of x. 241 00:12:55,610 --> 00:12:57,760 So that's basically it. 242 00:12:57,760 --> 00:12:59,420 And you just keep flipping. 243 00:12:59,420 --> 00:13:00,950 You have to flip d times. 244 00:13:00,950 --> 00:13:05,320 And so if x was 0, you flip d times, d was 160. 245 00:13:05,320 --> 00:13:08,300 You flipped a coin 160 times and got a string. 246 00:13:08,300 --> 00:13:13,430 If x were 1, flip 160 times, you get a different string 247 00:13:13,430 --> 00:13:15,530 with very high probability, obviously. 248 00:13:15,530 --> 00:13:16,890 And so on and so forth. 249 00:13:16,890 --> 00:13:19,280 But what you do is you have this book. 250 00:13:19,280 --> 00:13:29,870 So you're going to record x h of x in the book, OK? 251 00:13:29,870 --> 00:13:31,770 So at some level your hash function 252 00:13:31,770 --> 00:13:35,220 is this giant look-up table in the sky, right? 253 00:13:35,220 --> 00:13:37,480 Actually not giant, infinite capacity look-up table 254 00:13:37,480 --> 00:13:38,510 in the sky. 255 00:13:38,510 --> 00:13:42,230 Because you can put arbitrary strings into this. 256 00:13:42,230 --> 00:13:47,910 And if it's in the book-- this is obviously the important part 257 00:13:47,910 --> 00:13:52,740 that gives you determinism-- then you return y, 258 00:13:52,740 --> 00:13:58,700 where x and y are in the book, OK? 
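The oracle described above can be sketched as a lookup table that is filled in lazily. This is only a toy model under stated assumptions: a Python dict stands in for the infinite-capacity book, secrets.token_bytes stands in for the d coin flips, and the names book and random_oracle are illustrative. A real random oracle cannot be built this way, because the book would need unbounded storage shared by everyone.

```python
import secrets

D_BITS = 160  # d, matching the 160-bit example in the lecture
book = {}     # the "book": every (x, h(x)) pair recorded so far


def random_oracle(x: bytes) -> bytes:
    if x not in book:
        # First query for x: flip d coins and record (x, h(x)) in the book.
        book[x] = secrets.token_bytes(D_BITS // 8)
    # Any later query returns the recorded y, which gives determinism.
    return book[x]


first = random_oracle(b"Eric")
second = random_oracle(b"Eric")  # a later query for the same string
other = random_oracle(b"0")      # a fresh string gets fresh coin flips
```

The second query for the same string returns exactly what the first one put into the book, while a new string triggers a new set of coin flips.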
259 00:14:01,350 --> 00:14:05,040 So you get a random answer every time, 260 00:14:05,040 --> 00:14:08,380 except as required for consistency 261 00:14:08,380 --> 00:14:10,180 with previous answers. 262 00:14:10,180 --> 00:14:11,930 So the very first time you see a string, 263 00:14:11,930 --> 00:14:16,730 or-- and the whole world can create this book. 264 00:14:16,730 --> 00:14:17,740 It's public. 265 00:14:17,740 --> 00:14:22,430 So if I created the book at first with a particular string, 266 00:14:22,430 --> 00:14:23,710 let's say Eric. 267 00:14:23,710 --> 00:14:25,120 I was the string. 268 00:14:25,120 --> 00:14:29,960 And I'm the one who put the entry-- x equals Eric, 269 00:14:29,960 --> 00:14:34,040 and h of x, h of Eric equals some random 160-bit string-- 270 00:14:34,040 --> 00:14:36,640 into the book, I get credit for that, right? 271 00:14:36,640 --> 00:14:43,650 But if you come a nanosecond later and ask for h of Eric, 272 00:14:43,650 --> 00:14:46,090 you should get exactly what got put into the book 273 00:14:46,090 --> 00:14:49,850 when I asked for h of Eric. 274 00:14:49,850 --> 00:14:51,290 And so on and so forth. 275 00:14:51,290 --> 00:14:53,050 So this is true for everybody. 276 00:14:53,050 --> 00:14:56,810 So this is like-- I mean basically impossible to get. 277 00:14:56,810 --> 00:15:01,770 Because not only can anybody and everybody query, 278 00:15:01,770 --> 00:15:05,190 you have to have this ordering associated 279 00:15:05,190 --> 00:15:09,450 with people querying the book. 280 00:15:09,450 --> 00:15:11,330 And you have to have consistency. 281 00:15:11,330 --> 00:15:11,990 All right. 282 00:15:11,990 --> 00:15:15,990 So everyone convinced that we can't build this? 283 00:15:15,990 --> 00:15:16,710 All right. 284 00:15:16,710 --> 00:15:18,660 If you took anything out of this lecture, 285 00:15:18,660 --> 00:15:20,011 that's what you should take. 286 00:15:20,011 --> 00:15:20,510 No, no. 
287 00:15:20,510 --> 00:15:22,290 There's a lot more. 288 00:15:22,290 --> 00:15:26,750 So we want to approximate the random oracle. 289 00:15:26,750 --> 00:15:28,840 And we're going to get to that. 290 00:15:28,840 --> 00:15:34,767 Obviously we're going to have to do this in poly-space as well. 291 00:15:34,767 --> 00:15:35,850 So what's wrong with this 292 00:15:35,850 --> 00:15:38,810 picture, of course, is-- I didn't actually say this, 293 00:15:38,810 --> 00:15:42,320 but you'd like things to be polynomial in terms of space. 294 00:15:42,320 --> 00:15:46,210 You don't want to store an infinite number-- this 295 00:15:46,210 --> 00:15:48,649 is worse than poly-time, worse than exponential time, 296 00:15:48,649 --> 00:15:51,190 because it's arbitrary strings that we're talking about here, 297 00:15:51,190 --> 00:15:51,920 right? 298 00:15:51,920 --> 00:15:53,940 So you can't possibly do that. 299 00:15:53,940 --> 00:15:56,350 So we have to do something better. 300 00:15:56,350 --> 00:16:00,180 But before I get into how we'd actually build this, and give 301 00:16:00,180 --> 00:16:04,346 you a sense of how SHA-1 and MD5 were built-- 302 00:16:04,346 --> 00:16:06,220 and that's going to come a little bit later-- 303 00:16:06,220 --> 00:16:11,910 I want to spend a lot of time on what is interesting, 304 00:16:11,910 --> 00:16:13,630 which are the desirable properties. 305 00:16:13,630 --> 00:16:16,640 Which you can kind of see using the random oracle. 306 00:16:16,640 --> 00:16:18,690 So what is cool about the random oracle 307 00:16:18,690 --> 00:16:21,140 is that it's a simple algorithm. 308 00:16:21,140 --> 00:16:23,060 You can understand it. 309 00:16:23,060 --> 00:16:24,410 You can't implement it. 310 00:16:24,410 --> 00:16:27,130 But now you can see what wonderful properties 311 00:16:27,130 --> 00:16:28,230 it gives you. 
312 00:16:28,230 --> 00:16:30,350 And these properties are going to be important 313 00:16:30,350 --> 00:16:32,490 for our applications, OK? 314 00:16:32,490 --> 00:16:36,146 And so let's get started with a bunch of different properties. 315 00:16:36,146 --> 00:16:37,520 And these are all properties that 316 00:16:37,520 --> 00:16:43,030 are going to be useful for verification or computer 317 00:16:43,030 --> 00:16:44,850 security applications. 318 00:16:44,850 --> 00:16:51,100 The first one, it's not ow, it's O, W. It's one-wayness, 319 00:16:51,100 --> 00:16:51,780 all right? 320 00:16:51,780 --> 00:16:53,930 So one-way, or one-wayness. 321 00:16:53,930 --> 00:17:03,230 And it's also called-- you're not going to call it this-- 322 00:17:03,230 --> 00:17:09,390 but perhaps this is a more technical term, a more precise 323 00:17:09,390 --> 00:17:11,150 term, pre-image resistance. 324 00:17:11,150 --> 00:17:13,060 And so what does this mean? 325 00:17:13,060 --> 00:17:15,167 Well this is a very strong requirement. 326 00:17:15,167 --> 00:17:16,750 I mean a couple of other ones are also 327 00:17:16,750 --> 00:17:18,990 going to be perhaps stronger. 328 00:17:18,990 --> 00:17:21,270 But this is a pretty strong requirement 329 00:17:21,270 --> 00:17:28,710 which says it's infeasible, given y, 330 00:17:28,710 --> 00:17:45,170 which is in the-- it's basically a d-bit vector, to find any x 331 00:17:45,170 --> 00:17:50,950 such that h of x equals y. 332 00:17:50,950 --> 00:18:00,120 And so x is the pre-image of y. 333 00:18:00,120 --> 00:18:01,580 So what does this say? 
334 00:18:01,580 --> 00:18:04,400 It says that I want to create a hash function such 335 00:18:04,400 --> 00:18:08,030 that if I give you a specific-- we call 336 00:18:08,030 --> 00:18:12,870 it a 160-bit string, because we're talking SHA-1 here, 337 00:18:12,870 --> 00:18:16,792 and that's the hash-- I'm going to have, 338 00:18:16,792 --> 00:18:18,250 it's going to have to be impossible 339 00:18:18,250 --> 00:18:25,430 for me to discover an x that produced that 160-bit string, 340 00:18:25,430 --> 00:18:26,390 OK? 341 00:18:26,390 --> 00:18:29,100 Now if you go look at our random oracle, 342 00:18:29,100 --> 00:18:32,750 you realize that if you had a 160-bit string, 343 00:18:32,750 --> 00:18:36,970 and perhaps you have the entire book 344 00:18:36,970 --> 00:18:39,290 and you can read the entire book. 345 00:18:39,290 --> 00:18:41,940 It's an infinite capacity book. 346 00:18:41,940 --> 00:18:44,750 It's got a bunch of stuff in it. 347 00:18:44,750 --> 00:18:49,800 And know that any time anyone queried the book the first time 348 00:18:49,800 --> 00:18:53,580 for a given x, that there was this random 160-bit number that 349 00:18:53,580 --> 00:18:55,700 was generated and put into the book. 350 00:18:55,700 --> 00:18:58,100 And there's a whole lot of these numbers, right? 351 00:18:58,100 --> 00:18:59,954 So what's going to happen is, you're 352 00:18:59,954 --> 00:19:01,870 going to have to look through the entire book, 353 00:19:01,870 --> 00:19:04,710 this entire potentially infinite capacity book, 354 00:19:04,710 --> 00:19:13,500 in order to figure out if this particular y is in the book 355 00:19:13,500 --> 00:19:14,660 or not. 356 00:19:14,660 --> 00:19:18,310 And that's going to take a long time to do, potentially, OK? 
357 00:19:18,310 --> 00:19:23,290 So in the case where you have a random oracle you'd 358 00:19:23,290 --> 00:19:27,660 have to go through and find-- looking at the output hash 359 00:19:27,660 --> 00:19:30,502 corresponding to each of the entries in the random oracle, 360 00:19:30,502 --> 00:19:32,960 you're going to start matching, match, match, match, match, 361 00:19:32,960 --> 00:19:35,290 it's going to take you exponential time. 362 00:19:35,290 --> 00:19:37,760 Well actually worse than that, given the infinite capacity 363 00:19:37,760 --> 00:19:38,930 of the book. 364 00:19:38,930 --> 00:19:40,970 So this clearly gives you that. 365 00:19:40,970 --> 00:19:44,070 Now you may not be completely satisfied with that answer 366 00:19:44,070 --> 00:19:46,620 because you say well, you can't implement that. 367 00:19:46,620 --> 00:19:48,410 But we'll talk a little bit, as I said, 368 00:19:48,410 --> 00:19:50,410 about how you could actually get this. 369 00:19:50,410 --> 00:19:54,510 But what's-- I should be clear-- is that the simple hash 370 00:19:54,510 --> 00:19:59,180 functions that we've looked at in the past just to build 371 00:19:59,180 --> 00:20:02,570 dictionaries do not satisfy this, right? 372 00:20:02,570 --> 00:20:11,860 So suppose I had h of x equals x squared mod p. 373 00:20:11,860 --> 00:20:18,050 Is this one-way, given a public p? 374 00:20:18,050 --> 00:20:19,210 No of course not, right? 375 00:20:19,210 --> 00:20:22,310 Because I'm going to be-- it's going to be easy 376 00:20:22,310 --> 00:20:24,730 for me to do something. 377 00:20:24,730 --> 00:20:29,320 Even though this is discrete arithmetic I could do something 378 00:20:29,320 --> 00:20:32,670 like, well, I know that what I have here-- actually let's 379 00:20:32,670 --> 00:20:34,170 do it with something that's simpler, 380 00:20:34,170 --> 00:20:36,580 and then I'll talk about the x squared. 
381 00:20:36,580 --> 00:20:38,650 If I had something as simple as x mod p, 382 00:20:38,650 --> 00:20:42,100 I mean that's trivially broken in terms of one-wayness. 383 00:20:42,100 --> 00:20:45,900 Because I know that h of x could be viewed as the remainder. 384 00:20:45,900 --> 00:20:51,870 So anything-- if this is h of x, and let's 385 00:20:51,870 --> 00:20:54,310 just call that y for a second, because that's 386 00:20:54,310 --> 00:20:56,280 what we had it out there. 387 00:20:56,280 --> 00:21:00,340 Something that's a multiple of y plus the remainder-- so I 388 00:21:00,340 --> 00:21:02,777 could have a-- is that right? 389 00:21:02,777 --> 00:21:03,610 Is that what I want? 390 00:21:03,610 --> 00:21:04,109 Yeah. 391 00:21:04,109 --> 00:21:05,050 No, plus y. 392 00:21:05,050 --> 00:21:13,760 So I want a of-- well since I can't figure it out, 393 00:21:13,760 --> 00:21:16,250 why can't you? 394 00:21:16,250 --> 00:21:17,970 What do I need to put in there in order 395 00:21:17,970 --> 00:21:24,170 to discover an x that would produce a y? 396 00:21:24,170 --> 00:21:25,290 Can I write an equation? 397 00:21:25,290 --> 00:21:26,192 Yeah? 398 00:21:26,192 --> 00:21:27,967 AUDIENCE: Could you just write y itself? 399 00:21:27,967 --> 00:21:29,300 SRINIVAS DEVADAS: Just y itself. 400 00:21:29,300 --> 00:21:29,960 That's right. 401 00:21:29,960 --> 00:21:30,900 Good point. 402 00:21:30,900 --> 00:21:32,635 Just y itself in this case. 403 00:21:32,635 --> 00:21:33,820 Good. 404 00:21:33,820 --> 00:21:35,650 I knew you guys were smarter than me. 405 00:21:35,650 --> 00:21:38,060 This proves it. 406 00:21:38,060 --> 00:21:41,500 So if you just take y-- and y remember 407 00:21:41,500 --> 00:21:46,190 is going to be something that's 0 to p minus 1, right? 408 00:21:46,190 --> 00:21:47,520 And that's it. 409 00:21:47,520 --> 00:21:49,170 It just goes through, right? 410 00:21:49,170 --> 00:21:51,150 So that's a trivial example, right? 
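The audience answer above, that y itself is a pre-image of y under x mod p, can be checked with a short sketch. The prime P = 101 and the names h_mod and preimage_mod are assumptions picked only for illustration.

```python
P = 101  # small public prime, chosen only for illustration


def h_mod(x: int) -> int:
    """The dictionary-style hash h(x) = x mod p."""
    return x % P


def preimage_mod(y: int) -> int:
    # For any output 0 <= y <= P - 1, y itself satisfies y mod P == y,
    # so inverting the hash takes no work at all.
    return y


y = h_mod(123456789)       # someone hands us this output
x_found = preimage_mod(y)  # an input that produces the same output
```

The recovered input is usually not the original one, but one-wayness only asks for any x with h of x equal to y, so this hash is not one-way.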
411 00:21:51,150 --> 00:21:55,780 Now if I put x squared in here, obviously it's not y, 412 00:21:55,780 --> 00:22:03,050 but I could start looking at-- what I have here is 413 00:22:03,050 --> 00:22:05,280 I'm going to get y that looks like x squared. 414 00:22:05,280 --> 00:22:07,220 But I could take the y that I have, 415 00:22:07,220 --> 00:22:09,240 take the square root of that, and then start 416 00:22:09,240 --> 00:22:13,570 looking for x's that give me the y that I have. 417 00:22:13,570 --> 00:22:18,020 Actually it's not a complicated process to try and figure out, 418 00:22:18,020 --> 00:22:20,230 through trial and error potentially, 419 00:22:20,230 --> 00:22:23,250 what an x is that produces a particular y 420 00:22:23,250 --> 00:22:25,360 for the kinds of hash functions that we've 421 00:22:25,360 --> 00:22:26,900 looked at, all right? 422 00:22:26,900 --> 00:22:32,050 Now as you complicate this equation it gets harder. 423 00:22:32,050 --> 00:22:34,626 Because you have to invert this set of equations. 424 00:22:34,626 --> 00:22:36,000 And that's what the game is going 425 00:22:36,000 --> 00:22:38,620 to be when you go create one-way hash functions. 426 00:22:38,620 --> 00:22:41,520 The amount of computation that you do in order 427 00:22:41,520 --> 00:22:44,770 to compute the y is going to increase to the point 428 00:22:44,770 --> 00:22:47,770 where, as I mentioned, you have 80, 100 rounds of computation, 429 00:22:47,770 --> 00:22:49,400 things getting mixed in. 430 00:22:49,400 --> 00:22:53,210 And the hope is that you create this circuit, if you will, 431 00:22:53,210 --> 00:22:54,970 that has all this computation in that. 432 00:22:54,970 --> 00:22:57,170 Going forwards is easy, because you've 433 00:22:57,170 --> 00:22:59,350 specified the multiplications and the mods 434 00:22:59,350 --> 00:23:00,700 and so on and so forth. 435 00:23:00,700 --> 00:23:04,830 But not all of these operations have simple inverses. 
436 00:23:04,830 --> 00:23:07,790 And going backwards, which is what 437 00:23:07,790 --> 00:23:11,010 you need to do in order to break one-wayness, 438 00:23:11,010 --> 00:23:14,890 or discover the x given a y, is going 439 00:23:14,890 --> 00:23:17,100 to be harder and harder as the computations get 440 00:23:17,100 --> 00:23:18,890 more complex, OK? 441 00:23:18,890 --> 00:23:20,905 So everyone have a sense of what one-wayness is? 442 00:23:24,810 --> 00:23:26,990 So that's one-wayness. 443 00:23:26,990 --> 00:23:30,930 There's four other properties, two of which are very related. 444 00:23:30,930 --> 00:23:33,700 CR and TCR. 445 00:23:33,700 --> 00:23:35,550 So CR is collision resistance. 446 00:23:42,970 --> 00:23:54,290 It's infeasible to find x and x prime such that x not equal 447 00:23:54,290 --> 00:24:02,269 to x prime, and h of x equals h of x prime, 448 00:24:02,269 --> 00:24:03,560 which is of course a collision. 449 00:24:08,300 --> 00:24:09,690 OK? 450 00:24:09,690 --> 00:24:14,790 And that just says you have this crazy hash function where 451 00:24:14,790 --> 00:24:16,650 you can't discover collisions. 452 00:24:16,650 --> 00:24:18,620 Well it would be absolutely wonderful. 453 00:24:18,620 --> 00:24:21,740 In fact that's what we wanted when we built dictionaries. 454 00:24:21,740 --> 00:24:25,290 But why don't we use SHA-3 in dictionaries? 455 00:24:28,410 --> 00:24:30,350 Why don't we use SHA-3 in dictionaries? 456 00:24:30,350 --> 00:24:30,851 Yeah? 457 00:24:30,851 --> 00:24:33,058 AUDIENCE: Because it's more complicated than we need. 458 00:24:33,058 --> 00:24:35,270 SRINIVAS DEVADAS: Yeah, it's horribly slow, right? 459 00:24:35,270 --> 00:24:39,365 It would take longer to compute the hash than access 460 00:24:39,365 --> 00:24:40,740 the dictionary, when you actually 461 00:24:40,740 --> 00:24:44,847 had a reasonable dictionary that maybe had some collisions. 
462 00:24:44,847 --> 00:24:46,930 I mean you just go off and you have a linked list, 463 00:24:46,930 --> 00:24:50,090 you can afford a few collisions, what's the big deal, right? 464 00:24:50,090 --> 00:24:51,860 So it just doesn't make any sense 465 00:24:51,860 --> 00:24:57,420 to use this level of heavyweight hash function, 466 00:24:57,420 --> 00:25:00,830 even if it satisfies collision resistance-- which 467 00:25:00,830 --> 00:25:04,070 some of these are conjectured to do-- for the applications we've 468 00:25:04,070 --> 00:25:04,617 looked at. 469 00:25:04,617 --> 00:25:06,950 But there'll be other apps where collision resistance is 470 00:25:06,950 --> 00:25:08,520 going to be important. 471 00:25:08,520 --> 00:25:10,110 So that's collision resistance. 472 00:25:10,110 --> 00:25:15,470 And then there's-- TCR is target collision resistance. 473 00:25:15,470 --> 00:25:18,300 It's a weaker form-- so sometimes people 474 00:25:18,300 --> 00:25:24,190 call CR strong collision resistance, and TCR weak collision 475 00:25:24,190 --> 00:25:24,810 resistance. 476 00:25:24,810 --> 00:25:28,090 We'll use CR and TCR here. 477 00:25:28,090 --> 00:25:35,460 And this says it's infeasible, given 478 00:25:35,460 --> 00:25:39,200 x-- so there's a specific x that you 479 00:25:39,200 --> 00:25:41,590 want to find a collision for, as opposed 480 00:25:41,590 --> 00:25:45,360 to just finding a pair that corresponds to x and x prime. 481 00:25:45,360 --> 00:25:49,700 And any pair would suffice to break the collision resistance 482 00:25:49,700 --> 00:25:50,560 property. 483 00:25:50,560 --> 00:25:54,630 But what TCR says is I'm going to give you a specific x. 484 00:25:54,630 --> 00:25:57,750 And I want you to find an x prime whose 485 00:25:57,750 --> 00:26:01,050 hash collides with the hash of x, OK? 486 00:26:01,050 --> 00:26:02,065 That's TCR. 487 00:26:16,350 --> 00:26:18,082 OK that's TCR for you.
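To make the CR versus TCR distinction concrete, here is a small experiment, assuming a deliberately weakened hash: SHA-256 truncated to 16 bits, so that both attacks finish quickly. The truncation, the function names, and the input encodings are all assumptions for illustration and are not from the lecture.

```python
import hashlib
from itertools import count


def h16(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits, so collisions are cheap to find."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """CR attack: ANY pair x != x2 with the same hash will do."""
    seen = {}
    for i in count():
        x = str(i).encode()
        d = h16(x)
        if d in seen:
            return seen[d], x, i + 1      # the pair and the number of tries
        seen[d] = x


def find_target_collision(x: bytes):
    """TCR attack: x is fixed in advance; find x2 != x with the same hash."""
    target = h16(x)
    for i in count():
        x2 = b"t" + str(i).encode()
        if x2 != x and h16(x2) == target:
            return x2, i + 1              # the match and the number of tries


a, b, tries_any = find_collision()
x2, tries_target = find_target_collision(b"bid:100")
assert a != b and h16(a) == h16(b)
assert x2 != b"bid:100" and h16(x2) == h16(b"bid:100")
```

With a 16-bit output, the free-choice search typically succeeds after a few hundred tries, while the targeted search typically needs tens of thousands. That gap is the sense in which TCR is the weaker requirement: an attacker who must hit a given x has a harder job.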
488 00:26:18,082 --> 00:26:20,040 And that just to be clear, I think you probably 489 00:26:20,040 --> 00:26:23,420 all got this, obviously we want this here 490 00:26:23,420 --> 00:26:26,340 because we have a deterministic hash function. 491 00:26:26,340 --> 00:26:29,430 And it's a trivial thing to say that if you had x, 492 00:26:29,430 --> 00:26:32,380 and you had x again, that you get the same hash back from it. 493 00:26:32,380 --> 00:26:33,740 That's a requirement, really. 494 00:26:33,740 --> 00:26:36,670 So we want two distinct x and x primes that are not 495 00:26:36,670 --> 00:26:38,590 equal that end up colliding. 496 00:26:38,590 --> 00:26:40,890 That's really what a collision is. 497 00:26:40,890 --> 00:26:44,200 And so you see the difference between CR and TCR? 498 00:26:44,200 --> 00:26:44,700 Yup? 499 00:26:44,700 --> 00:26:45,812 Yeah? 500 00:26:45,812 --> 00:26:49,144 AUDIENCE: Are we to assume that given an x 501 00:26:49,144 --> 00:26:51,105 it's very easy to get the h of x back? 502 00:26:51,105 --> 00:26:52,480 SRINIVAS DEVADAS: So the question 503 00:26:52,480 --> 00:26:57,150 was, given an x, it's poly-time computation to get h of x. 504 00:26:57,150 --> 00:26:58,230 Absolutely. 505 00:26:58,230 --> 00:27:02,480 Public poly-time computation given an x to get h of x. 506 00:27:02,480 --> 00:27:08,840 So going this way is easy. 507 00:27:08,840 --> 00:27:15,170 Going this way-- I ran out of room-- hard. 508 00:27:15,170 --> 00:27:16,954 OK? 509 00:27:16,954 --> 00:27:20,160 AUDIENCE: So does that mean that TCR is basically the same as 1? 510 00:27:20,160 --> 00:27:22,230 SRINIVAS DEVADAS: No, no, no, absolutely not. 511 00:27:22,230 --> 00:27:25,890 TCR says it's OK. 512 00:27:25,890 --> 00:27:27,620 You can compute this. 513 00:27:27,620 --> 00:27:29,030 You can get x. 514 00:27:29,030 --> 00:27:30,720 And you can get h of x. 515 00:27:30,720 --> 00:27:33,125 So given x, you know that you can get h of x. 
516 00:27:33,125 --> 00:27:35,000 I didn't actually put that in the definition. 517 00:27:35,000 --> 00:27:36,800 And maybe I should have. 518 00:27:36,800 --> 00:27:38,860 So given x you can always get h of x. 519 00:27:38,860 --> 00:27:40,080 Remember that. 520 00:27:40,080 --> 00:27:41,640 It's easy to get h of x. 521 00:27:41,640 --> 00:27:44,350 So any time I say given x, you can always add it, 522 00:27:44,350 --> 00:27:46,400 saying given x and h of x. 523 00:27:46,400 --> 00:27:48,690 So I'm given x. 524 00:27:48,690 --> 00:27:49,960 I'm given h of x. 525 00:27:49,960 --> 00:27:53,600 I obviously need to map-- I need to discover 526 00:27:53,600 --> 00:27:58,080 an x prime such that h of x prime equals h of x, OK? 527 00:27:58,080 --> 00:28:04,490 Now you have situations where for-- it 528 00:28:04,490 --> 00:28:07,920 may be the case that for particular x's you 529 00:28:07,920 --> 00:28:08,900 can actually do this. 530 00:28:08,900 --> 00:28:10,363 And that's enough to break TCR. 531 00:28:13,270 --> 00:28:15,640 So you have to have this strong property 532 00:28:15,640 --> 00:28:22,520 that you really can't find collisions for some-- 533 00:28:22,520 --> 00:28:26,470 even if there's a constant fraction of x's that 534 00:28:26,470 --> 00:28:29,210 break the TCR property, you don't like your hash function, 535 00:28:29,210 --> 00:28:29,710 OK? 536 00:28:29,710 --> 00:28:31,850 Because you might end up picking those and go 537 00:28:31,850 --> 00:28:35,490 build security applications using those properties. 538 00:28:35,490 --> 00:28:37,990 I want to talk a little bit about the relationship 539 00:28:37,990 --> 00:28:41,240 between OW, CR, and TCR. 540 00:28:41,240 --> 00:28:42,700 So I'm going to get back to that. 541 00:28:42,700 --> 00:28:45,290 And we're going to be talking about hash functions that 542 00:28:45,290 --> 00:28:48,076 satisfy one property but don't satisfy the other.
543 00:28:48,076 --> 00:28:49,700 And I think your question will probably 544 00:28:49,700 --> 00:28:52,150 be answered better, OK? 545 00:28:52,150 --> 00:28:53,460 Thanks for the question. 546 00:28:53,460 --> 00:28:56,160 So those are the main ones. 547 00:28:56,160 --> 00:28:59,260 And really quickly, I don't want to spend a lot of time 548 00:28:59,260 --> 00:29:02,972 on this-- but I do want to put up-- 549 00:29:02,972 --> 00:29:05,320 I think I'll leave these properties up here 550 00:29:05,320 --> 00:29:06,590 for the duration. 551 00:29:06,590 --> 00:29:10,350 Because it's important for you to see these definitions as we 552 00:29:10,350 --> 00:29:13,580 look at the applications where we 553 00:29:13,580 --> 00:29:17,090 require these properties, or a subset of these properties. 554 00:29:17,090 --> 00:29:19,580 But then we have pseudo-randomness. 555 00:29:19,580 --> 00:29:22,910 And this is simply a function of the fact 556 00:29:22,910 --> 00:29:31,100 that-- so this is PRF-- we know we can't build a random oracle. 557 00:29:31,100 --> 00:29:35,300 And so we're going to have to do something that's pseudo-random. 558 00:29:35,300 --> 00:29:37,840 And basically what we're saying here 559 00:29:37,840 --> 00:29:45,870 is the behavior is indistinguishable from random. 560 00:29:50,990 --> 00:29:56,140 So we're going to have to use non-linearity, things that 561 00:29:56,140 --> 00:29:58,730 are called non-linear feedback shift registers, 562 00:29:58,730 --> 00:30:00,370 to create pseudo-random functions. 563 00:30:00,370 --> 00:30:03,710 There's many ways that we can create pseudo-random functions. 564 00:30:03,710 --> 00:30:05,310 We won't really get into that. 565 00:30:05,310 --> 00:30:07,680 But obviously that's what we want. 566 00:30:07,680 --> 00:30:14,420 And then the last one is a bit tricky. 567 00:30:14,420 --> 00:30:18,830 And we will have an app that requires this way at the end.
568 00:30:18,830 --> 00:30:29,240 But this is infeasible given h of x 569 00:30:29,240 --> 00:30:42,010 to produce h of x prime, where x and x prime are-- and it gets 570 00:30:42,010 --> 00:30:50,150 a little bit fuzzy here-- are related in some fashion, right? 571 00:30:50,150 --> 00:30:53,630 So a concrete example of this is, 572 00:30:53,630 --> 00:30:59,770 let's say that x prime is x plus 1. 573 00:30:59,770 --> 00:31:02,630 So this is a reasonable example of this. 574 00:31:02,630 --> 00:31:09,930 So what this says is you're just given h of x. 575 00:31:09,930 --> 00:31:12,680 It doesn't actually say anything about one-wayness yet. 576 00:31:12,680 --> 00:31:14,670 But you could assume, for example, 577 00:31:14,670 --> 00:31:18,510 that if this was a one-way hash function, 578 00:31:18,510 --> 00:31:23,581 that it would be impossible to get x from h of x, correct? 579 00:31:26,300 --> 00:31:28,070 And let's keep that thought. 580 00:31:28,070 --> 00:31:29,470 Hold that thought, all right? 581 00:31:29,470 --> 00:31:31,290 We're going to get back to it. 582 00:31:31,290 --> 00:31:36,710 So if I'm just given the hash through some computation, 583 00:31:36,710 --> 00:31:40,300 it may be possible for me to create another hash, h 584 00:31:40,300 --> 00:31:45,330 of x prime, such that there's some relationship 585 00:31:45,330 --> 00:31:51,010 that I can prove or argue for between the strings that 586 00:31:51,010 --> 00:31:54,390 created the hashes, namely x and x prime, OK? 587 00:31:54,390 --> 00:31:57,330 That's what malleability is, right? 588 00:31:57,330 --> 00:32:03,440 Now you might just go off and say here's an x, here's a y, 589 00:32:03,440 --> 00:32:07,700 here's h of x, and here's h of y. 590 00:32:07,700 --> 00:32:09,620 These look completely random.
591 00:32:09,620 --> 00:32:12,890 And you might go off-- I'm being facetious here-- and 592 00:32:12,890 --> 00:32:17,767 say that y is x's third cousin's roommate's brother-in-law 593 00:32:17,767 --> 00:32:18,600 or something, right? 594 00:32:18,600 --> 00:32:20,600 I mean just make something up, right? 595 00:32:20,600 --> 00:32:26,470 So clearly there's got to be a strong, precise relationship 596 00:32:26,470 --> 00:32:27,780 between x and y. 597 00:32:27,780 --> 00:32:32,180 If in fact you could do this and get y 598 00:32:32,180 --> 00:32:36,160 equals x plus 1, that'd be a problem, right? 599 00:32:36,160 --> 00:32:38,840 But if you are-- and then you can 600 00:32:38,840 --> 00:32:42,280 do this sort of consistently for different x's and y's, that 601 00:32:42,280 --> 00:32:44,980 would absolutely be a problem, right? 602 00:32:44,980 --> 00:32:48,440 But what you're really asking for-- and typically 603 00:32:48,440 --> 00:32:50,710 when you want non-malleability-- it's 604 00:32:50,710 --> 00:32:55,000 things where you have auctions, for example, where 605 00:32:55,000 --> 00:32:58,350 you have to be careful about making sure that you don't 606 00:32:58,350 --> 00:33:01,320 expose your bid. 607 00:33:01,320 --> 00:33:04,700 And so maybe what you're doing is exposing h of x. 608 00:33:04,700 --> 00:33:08,960 You don't want somebody to look at your h of x 609 00:33:08,960 --> 00:33:10,420 and figure out how they could beat 610 00:33:10,420 --> 00:33:13,540 your bid by just a little bit. 611 00:33:13,540 --> 00:33:17,140 Or in case of Vickrey auctions, where the second highest bidder 612 00:33:17,140 --> 00:33:20,031 wins, how to just be a little bit below you, right?
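As an illustration of what a malleable function looks like, here is a toy example that is not from the lecture: modular exponentiation, h(x) = g^x mod p. Recovering x from the output is the discrete logarithm problem, which is believed to be hard for well-chosen parameters, yet anyone holding h(x) can produce h(x + 1) without learning x. The constants and function names below are assumptions chosen for the sketch.

```python
P = 2**127 - 1   # a Mersenne prime, used here only as a toy modulus
G = 3            # illustrative base


def h_exp(x: int) -> int:
    """Toy function h(x) = G^x mod P. NOT a recommended hash."""
    return pow(G, x, P)


def hash_of_successor(hx: int) -> int:
    """Given only h(x), compute h(x + 1): multiply by G once more."""
    return (hx * G) % P


secret_bid = 1_000_000
published = h_exp(secret_bid)            # the bidder exposes only this value
outbid = hash_of_successor(published)    # adversary never learns secret_bid
assert outbid == h_exp(secret_bid + 1)
```

In the auction setting described above, this is the failure mode to avoid: the adversary can commit to a bid exactly one higher while knowing nothing about the original bid.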
613 00:33:20,031 --> 00:33:21,530 So that's the kind of thing that you 614 00:33:21,530 --> 00:33:25,110 want to think about when it comes to non-malleability, 615 00:33:25,110 --> 00:33:28,880 or malleability, where you want a strong relationship 616 00:33:28,880 --> 00:33:32,300 between two strings that are related 617 00:33:32,300 --> 00:33:35,510 in some ordered fashion, like x equals-- x prime 618 00:33:35,510 --> 00:33:38,950 equals x plus 1, or just x prime equals 2 times x. 619 00:33:38,950 --> 00:33:43,040 And you don't want to be able to-- you 620 00:33:43,040 --> 00:33:45,350 don't want the adversary to be able to discover 621 00:33:45,350 --> 00:33:47,670 these new strings. 622 00:33:47,670 --> 00:33:51,440 Because that would break the system, all right? 623 00:33:51,440 --> 00:33:55,580 So any questions about properties? 624 00:33:55,580 --> 00:33:57,620 Are we all good on these properties? 625 00:33:57,620 --> 00:33:59,840 All right, because I'm going to start asking you 626 00:33:59,840 --> 00:34:03,010 how to use them for particular applications, 627 00:34:03,010 --> 00:34:09,170 or what properties are required for certain applications, OK? 628 00:34:09,170 --> 00:34:11,150 One last thing before we get there. 629 00:34:11,150 --> 00:34:16,960 I promised a slightly more detailed analysis 630 00:34:16,960 --> 00:34:20,170 of the relationships between these properties. 631 00:34:20,170 --> 00:34:20,974 So let's do that. 632 00:34:24,810 --> 00:34:27,830 Now if you just look at it, eyeball it, 633 00:34:27,830 --> 00:34:34,888 and you look at collision resistance and TCR, 634 00:34:34,888 --> 00:34:36,429 what can I say about the relationship 635 00:34:36,429 --> 00:34:40,820 between CR and TCR? 636 00:34:40,820 --> 00:34:45,953 If h is CR, it's going to be TCR, right? 637 00:34:45,953 --> 00:34:46,744 It's got to be TCR. 638 00:34:46,744 --> 00:34:48,735 It's a strictly stronger requirement. 639 00:34:54,659 --> 00:34:55,415 But not reverse.
640 00:34:57,940 --> 00:35:04,230 And you can actually give a concrete example 641 00:35:04,230 --> 00:35:07,077 of a particular hash function that is TCR. 642 00:35:07,077 --> 00:35:08,160 I'm not going to go there. 643 00:35:08,160 --> 00:35:09,659 It's actually a little more involved 644 00:35:09,659 --> 00:35:12,780 than you might think it is, where a TCR hash function is 645 00:35:12,780 --> 00:35:14,430 not collision resistant. 646 00:35:14,430 --> 00:35:17,180 But you can see that examples such as these 647 00:35:17,180 --> 00:35:20,340 should exist, simply because I have a more stringent property 648 00:35:20,340 --> 00:35:22,280 corresponding to collision resistance 649 00:35:22,280 --> 00:35:24,680 as opposed to TCR, right? 650 00:35:24,680 --> 00:35:27,170 So if you're interested in that particular example, 651 00:35:27,170 --> 00:35:29,780 you're not responsible for it, get in touch with me 652 00:35:29,780 --> 00:35:32,545 and I'll point you to a, like a three-page description 653 00:35:32,545 --> 00:35:34,180 of an example. 654 00:35:34,180 --> 00:35:35,930 So I didn't really want to go in there. 655 00:35:35,930 --> 00:35:40,170 But what I do want to do is talk about one-wayness and collision 656 00:35:40,170 --> 00:35:40,820 resistance. 657 00:35:40,820 --> 00:35:43,069 Because I think that's actually much more interesting, 658 00:35:43,069 --> 00:35:43,720 all right? 659 00:35:43,720 --> 00:35:59,060 So if h is one-way-- any conjectures 660 00:35:59,060 --> 00:36:03,370 as to what the question mark is in the middle? 661 00:36:03,370 --> 00:36:07,950 Can I make strong statements about the collision resistance 662 00:36:07,950 --> 00:36:10,430 of a hash function, if I'm guaranteed 663 00:36:10,430 --> 00:36:14,010 that the hash function I have is a one-way hash function, 664 00:36:14,010 --> 00:36:14,730 or vice versa? 
665 00:36:20,960 --> 00:36:23,080 Another way of putting it is, can you 666 00:36:23,080 --> 00:36:28,970 give me an example of, just to start with, 667 00:36:28,970 --> 00:36:35,096 a hash function which is one-way but not TCR, not 668 00:36:35,096 --> 00:36:36,220 target collision resistant? 669 00:36:40,520 --> 00:36:43,540 So I'm going to try and extract this out of you. 670 00:36:43,540 --> 00:36:46,870 This is somewhat subtle. 671 00:36:46,870 --> 00:36:48,990 But the way you want to think about this 672 00:36:48,990 --> 00:36:59,260 is, let's say that h of x is OW and TCR, OK? 673 00:36:59,260 --> 00:37:02,660 And so I have a bunch of inputs. 674 00:37:02,660 --> 00:37:03,660 And this is the output. 675 00:37:03,660 --> 00:37:06,160 And I get d-bits out. 676 00:37:06,160 --> 00:37:12,010 And I've got x1, x2, to xn, OK? 677 00:37:12,010 --> 00:37:16,620 Now I've given this h-- I've been given this h which 678 00:37:16,620 --> 00:37:18,240 is one-way and TCR. 679 00:37:18,240 --> 00:37:20,960 It satisfies those properties that you have up there. 680 00:37:20,960 --> 00:37:24,590 In the case of one-way, I give you an arbitrary d-bit string. 681 00:37:24,590 --> 00:37:28,770 You can't go backwards and find a bunch of the xi's that 682 00:37:28,770 --> 00:37:34,150 produce exactly that d-bit string, all right? 683 00:37:34,150 --> 00:37:36,530 So it's going to be hard to get here. 684 00:37:36,530 --> 00:37:40,380 But you're allowed now to give me an example. 685 00:37:40,380 --> 00:37:45,390 So this is some hash function that you can create, 686 00:37:45,390 --> 00:37:48,300 which may use h as well. 687 00:37:48,300 --> 00:37:51,780 And h is kind of nice because it has this one-way property. 688 00:37:51,780 --> 00:37:55,030 So let's say that we want to discover something where 689 00:37:55,030 --> 00:37:59,080 one-way does not imply TCR. 
690 00:37:59,080 --> 00:38:03,490 So I want to cook up a hash function h prime such 691 00:38:03,490 --> 00:38:09,550 that h prime is one-way, but it's not TCR, OK? 692 00:38:09,550 --> 00:38:13,610 The way you want to think about this is you want to add to h. 693 00:38:13,610 --> 00:38:16,790 And you want to add something to h such that it's still hard-- 694 00:38:16,790 --> 00:38:20,347 if you add to h it's still hard to go from here to there. 695 00:38:20,347 --> 00:38:21,680 Because you've got to go deeper. 696 00:38:21,680 --> 00:38:23,760 If you add to, for example, the inputs of h. 697 00:38:23,760 --> 00:38:26,170 Or you could add to the outputs of h as well, 698 00:38:26,170 --> 00:38:27,730 or the outputs of the current h. 699 00:38:27,730 --> 00:38:34,670 But you can basically go deeper, or need to go deeper in order 700 00:38:34,670 --> 00:38:39,580 to break one-wayness, in order 701 00:38:39,580 --> 00:38:43,820 to find an x, whatever you have, that produces the d-bit string 702 00:38:43,820 --> 00:38:44,910 that you have, right? 703 00:38:44,910 --> 00:38:49,690 So what's a simple way of creating an h prime such that 704 00:38:49,690 --> 00:38:53,700 it's going to be pretty easy to find targeted collisions even, 705 00:38:53,700 --> 00:38:56,150 not necessarily collisions, it's pretty easy to find 706 00:38:56,150 --> 00:38:58,740 targeted collisions, without breaking 707 00:38:58,740 --> 00:39:00,180 the one-way property of h? 708 00:39:03,785 --> 00:39:05,264 Yeah? 709 00:39:05,264 --> 00:39:11,673 AUDIENCE: So if you have x sub i, if i odd then 710 00:39:11,673 --> 00:39:14,631 return h of x of i. 711 00:39:14,631 --> 00:39:16,603 So that's minus 1. 712 00:39:16,603 --> 00:39:18,552 So return the even group. 713 00:39:18,552 --> 00:39:19,510 SRINIVAS DEVADAS: Sure. 714 00:39:19,510 --> 00:39:21,004 Yep.
715 00:39:21,004 --> 00:39:24,241 AUDIENCE: Given x any x of i, you 716 00:39:24,241 --> 00:39:27,478 can usually find another x of i that was the same output? 717 00:39:27,478 --> 00:39:28,980 You can go backwards. 718 00:39:28,980 --> 00:39:29,700 SRINIVAS DEVADAS: You can't go backwards. 719 00:39:29,700 --> 00:39:30,500 Yeah, that's good. 720 00:39:30,500 --> 00:39:31,450 That's good. 721 00:39:31,450 --> 00:39:34,114 I'm going to do something that's almost exactly what you said. 722 00:39:34,114 --> 00:39:35,655 But I'm going to draw it pictorially. 723 00:39:38,270 --> 00:39:42,705 And what you can do, you can do a parity, like odd and even 724 00:39:42,705 --> 00:39:43,705 that was just described. 725 00:39:47,520 --> 00:39:51,440 And all I'll do is add a little XOR 726 00:39:51,440 --> 00:39:55,240 gate, which is a parity gate, to one of the inputs. 727 00:39:55,240 --> 00:39:56,830 So you have a and b here. 728 00:39:56,830 --> 00:40:01,010 So I've taken x1, and I have a and b here. 729 00:40:01,010 --> 00:40:04,560 So I've added-- I can add as many inputs 730 00:40:04,560 --> 00:40:06,190 as I want to this function. 731 00:40:06,190 --> 00:40:08,830 Oh I should mention by the way, h of x 732 00:40:08,830 --> 00:40:11,290 is working on arbitrary strings. 733 00:40:11,290 --> 00:40:13,630 And obviously I put in some number 734 00:40:13,630 --> 00:40:16,774 here that corresponds to n, which is a fixed number. 735 00:40:16,774 --> 00:40:19,190 So you might ask, what the heck happened here with respect 736 00:40:19,190 --> 00:40:20,610 to arbitrary strings? 737 00:40:20,610 --> 00:40:22,570 And there's two answers. 738 00:40:22,570 --> 00:40:25,000 The first answer is, well, ignore arbitrary. 739 00:40:25,000 --> 00:40:27,350 And assume that you only have n-bit strings. 740 00:40:27,350 --> 00:40:29,370 And n is a really large number, right? 741 00:40:29,370 --> 00:40:31,500 And that may not be particularly satisfying.
742 00:40:31,500 --> 00:40:34,220 The other answer is, which is more practical, 743 00:40:34,220 --> 00:40:35,850 which is what's used in practice, 744 00:40:35,850 --> 00:40:38,140 is that typically what happens is, 745 00:40:38,140 --> 00:40:41,000 you do have particular implementations 746 00:40:41,000 --> 00:40:43,180 of hash functions that obviously need to have 747 00:40:43,180 --> 00:40:46,440 fixed inputs, n, for example. 748 00:40:46,440 --> 00:40:48,110 And n is typically 512. 749 00:40:48,110 --> 00:40:49,680 It's usually the block size. 750 00:40:49,680 --> 00:40:52,940 And you chunk the input up into 512-bit blocks. 751 00:40:52,940 --> 00:40:54,770 And typically what you do is, you 752 00:40:54,770 --> 00:40:57,800 take the first 512 bits, compute the hash for it. 753 00:40:57,800 --> 00:41:02,280 And then you can do it for the remaining blocks. 754 00:41:02,280 --> 00:41:04,530 And then you can hash all of them together, all right? 755 00:41:04,530 --> 00:41:06,872 So there's typically more invocations. 756 00:41:06,872 --> 00:41:08,330 I don't really want to get into it. 757 00:41:08,330 --> 00:41:11,370 But there's typically more invocations of h 758 00:41:11,370 --> 00:41:15,600 when the input would be 2 times n, or 3 times n, all right? 759 00:41:15,600 --> 00:41:17,410 So we don't really need to go there 760 00:41:17,410 --> 00:41:18,960 for the purposes of this lecture. 761 00:41:18,960 --> 00:41:20,270 But keep that in mind. 762 00:41:20,270 --> 00:41:23,750 So we'll still stick with our arbitrary string requirement. 763 00:41:23,750 --> 00:41:26,410 So having said that, take a look at this picture. 764 00:41:26,410 --> 00:41:30,190 And see what this picture implies. 765 00:41:30,190 --> 00:41:33,340 I have an h prime that I've constructed, right?
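A loose sketch of the chunking idea described above: split the input into 512-bit blocks, hash each block, then hash the results together. This follows the verbal description only and is not how any standardized hash is actually built internally. `h_block` uses SHA-256 as a stand-in for the fixed-input function h, and the final combining step feeds it more than one block's worth of data, which a true fixed-input h would have to handle by chunking again.

```python
import hashlib

BLOCK_BYTES = 64  # 512 bits, the block size mentioned in the lecture


def h_block(block: bytes) -> bytes:
    """Stand-in for the fixed-input hash h (an assumption of this sketch)."""
    return hashlib.sha256(block).digest()


def hash_long(message: bytes) -> bytes:
    """Hash each 512-bit chunk, then hash all the chunk digests together."""
    blocks = [message[i:i + BLOCK_BYTES]
              for i in range(0, len(message), BLOCK_BYTES)]
    digests = [h_block(b) for b in blocks]
    return h_block(b"".join(digests))


digest = hash_long(b"a" * 200)   # 200 bytes -> four chunks, the last one short
assert len(digest) == 32
```

An input of length 2n or 3n triggers correspondingly more calls to `h_block`, which is the "more invocations of h" point made above.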
766 00:41:33,340 --> 00:41:36,720 Now if I look at h prime, and I give you 767 00:41:36,720 --> 00:41:40,270 an output for h prime-- so h prime now has, 768 00:41:40,270 --> 00:41:45,640 it's a function of a and b, and x2 all the way to xn, right? 769 00:41:45,640 --> 00:41:47,850 So it's got an extra input. 770 00:41:47,850 --> 00:41:50,630 If I look at h prime, and I look at the output of h prime that 771 00:41:50,630 --> 00:41:56,280 is given to me, and I need to discover something that 772 00:41:56,280 --> 00:42:00,280 produces that, it is pretty clear that I need to figure out 773 00:42:00,280 --> 00:42:03,400 what these values are, all right? 774 00:42:03,400 --> 00:42:06,930 And I need to know what the parity of a and b is. 775 00:42:06,930 --> 00:42:09,293 And maybe I don't need to know exactly what a and b are, 776 00:42:09,293 --> 00:42:11,626 but I absolutely need to know what the parity of a and b 777 00:42:11,626 --> 00:42:13,230 is, because that's x1. 778 00:42:13,230 --> 00:42:15,490 And the one-way break would require 779 00:42:15,490 --> 00:42:17,670 me to tell you what the value of x1 is, 780 00:42:17,670 --> 00:42:20,070 and the value of x2, and so on and so forth. 781 00:42:20,070 --> 00:42:23,640 So it's pretty clear that h prime is one-way, right? 782 00:42:23,640 --> 00:42:25,520 Everybody buy that? 783 00:42:25,520 --> 00:42:28,870 h prime is one-way. 784 00:42:28,870 --> 00:42:30,160 But you know what? 785 00:42:30,160 --> 00:42:33,860 I've got target collisions galore, right? 786 00:42:33,860 --> 00:42:37,360 All I have to do is flip-- I have a equals 1 and b equals 1. 787 00:42:37,360 --> 00:42:39,770 And I have a equals 0 and b equals 0. 788 00:42:39,770 --> 00:42:42,350 They're going to give me the same hash, right?
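The construction just described can be sketched as follows, with SHA-256 standing in for the assumed one-way, TCR function h. The bit-list encoding and the names `h` and `h_prime` are assumptions of this sketch.

```python
import hashlib


def h(bits):
    """Stand-in for the assumed one-way, TCR hash h(x1, ..., xn)."""
    return hashlib.sha256(bytes(bits)).hexdigest()


def h_prime(a, b, rest):
    """h'(a, b, x2, ..., xn) = h(a XOR b, x2, ..., xn)."""
    return h([a ^ b, *rest])


rest = [1, 0, 1, 1]                      # arbitrary x2..xn
# Target collision: (a, b) = (0, 0) and (1, 1) have the same parity,
# so the inputs differ but h only ever sees the same x1.
assert h_prime(0, 0, rest) == h_prime(1, 1, rest)
assert h_prime(0, 1, rest) == h_prime(1, 0, rest)
```

Given any input to `h_prime`, flipping both a and b yields a different input with the same output, so target collisions cost nothing. Inverting `h_prime` still requires learning x1 = a XOR b along with x2 through xn, which means inverting h.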
789 00:42:42,350 --> 00:42:45,690 So trivial example, but that gets 790 00:42:45,690 --> 00:42:50,070 to the essence of the difference between collision resistance 791 00:42:50,070 --> 00:42:52,290 and one-wayness, target collision resistance 792 00:42:52,290 --> 00:42:54,210 and one-wayness, all right? 793 00:42:54,210 --> 00:43:03,710 So this is one-way but not TCR, simply because a equals 0, b 794 00:43:03,710 --> 00:43:06,500 equals 0 for arbitrary x's produces 795 00:43:06,500 --> 00:43:11,200 the same thing as a equals 1 and b equals 1, right? 796 00:43:11,200 --> 00:43:13,940 So those are collisions. 797 00:43:13,940 --> 00:43:15,690 So admittedly contrived. 798 00:43:15,690 --> 00:43:19,350 But it's a counterexample. 799 00:43:19,350 --> 00:43:21,150 Counterexamples can be contrived. 800 00:43:21,150 --> 00:43:23,510 It's OK. 801 00:43:23,510 --> 00:43:24,710 All right. 802 00:43:24,710 --> 00:43:28,470 So that was what happens with that. 803 00:43:28,470 --> 00:43:32,400 Let's look at one more interesting thing 804 00:43:32,400 --> 00:43:36,150 that corresponds to the other way, right? 805 00:43:36,150 --> 00:43:46,030 So what I want to show is that TCR does not imply one-wayness. 806 00:43:59,040 --> 00:44:03,122 OK, so now I want an example where it is clear 807 00:44:03,122 --> 00:44:05,580 that I have target collision resistance, because I can just 808 00:44:05,580 --> 00:44:06,370 assume that. 809 00:44:06,370 --> 00:44:08,310 And we're going to use the same strategy. 810 00:44:08,310 --> 00:44:10,550 I'm just going to assume that I have an h that's 811 00:44:10,550 --> 00:44:12,240 target collision resistant. 812 00:44:12,240 --> 00:44:16,250 And I'm going to try and cook up an h prime that is not one-way. 813 00:44:16,250 --> 00:44:21,080 So I'm going to assume that in fact h is TCR and OW. 814 00:44:21,080 --> 00:44:24,420 And I'm going to take away one of the properties.
815 00:44:24,420 --> 00:44:26,060 And if I take away one of the properties 816 00:44:26,060 --> 00:44:28,350 I have a counterexample, right? 817 00:44:28,350 --> 00:44:34,320 So think about how you could do this. 818 00:44:34,320 --> 00:44:38,355 You have h as before. 819 00:44:41,920 --> 00:44:46,330 And I want to add some stuff around it 820 00:44:46,330 --> 00:44:52,820 such that it's going to be easy to discover-- for a large, 821 00:44:52,820 --> 00:44:55,610 for a constant fraction of hashes 822 00:44:55,610 --> 00:44:58,430 that are given to me, not for any old hash. 823 00:44:58,430 --> 00:45:01,000 Because you can always claim that one-wayness 824 00:45:01,000 --> 00:45:06,360 is broken by saying I have x, I computed h of x, now 825 00:45:06,360 --> 00:45:09,780 I know what-- given h of x I know what x is. 826 00:45:09,780 --> 00:45:11,970 I mean you can't do that, right? 827 00:45:11,970 --> 00:45:14,360 So that's not breaking the one-wayness of it. 828 00:45:14,360 --> 00:45:16,420 It's when you have an h of x and this 829 00:45:16,420 --> 00:45:18,250 is the first time you've seen it, 830 00:45:18,250 --> 00:45:20,660 you're trying to find what x is, right? 831 00:45:20,660 --> 00:45:23,370 So how would you-- how would you set it up 832 00:45:23,370 --> 00:45:28,230 so you break the one-wayness of h 833 00:45:28,230 --> 00:45:31,310 without necessarily breaking the target collision 834 00:45:31,310 --> 00:45:37,430 resistance of the overall hash function that you're creating? 835 00:45:37,430 --> 00:45:41,339 And you have to do something with the outputs, OK? 836 00:45:41,339 --> 00:45:42,380 You have to do something. 837 00:45:42,380 --> 00:45:43,671 This is a little more involved. 838 00:45:43,671 --> 00:45:45,734 It's not as easy as this example. 839 00:45:45,734 --> 00:45:46,900 It's a little more involved. 840 00:45:46,900 --> 00:45:47,920 But any ideas? 841 00:45:51,240 --> 00:45:52,761 Yeah, go ahead.
842 00:45:52,761 --> 00:45:55,707 AUDIENCE: So x is less than b returns x. 843 00:45:55,707 --> 00:45:57,964 If x is greater than b, return [INAUDIBLE]. 844 00:45:57,964 --> 00:45:59,130 SRINIVAS DEVADAS: Beautiful. 845 00:45:59,130 --> 00:45:59,460 Right. 846 00:45:59,460 --> 00:46:00,970 What color did you get last time? 847 00:46:00,970 --> 00:46:02,150 AUDIENCE: Blue. 848 00:46:02,150 --> 00:46:03,050 SRINIVAS DEVADAS: You got a blue last time? 849 00:46:03,050 --> 00:46:03,800 All right. 850 00:46:03,800 --> 00:46:04,890 Well you get a purple. 851 00:46:04,890 --> 00:46:06,190 You have a set. 852 00:46:06,190 --> 00:46:09,220 Actually we have these red ones that are precious, that are-- 853 00:46:09,220 --> 00:46:12,780 no, we don't. 854 00:46:12,780 --> 00:46:14,479 We chose not to do red. 855 00:46:14,479 --> 00:46:15,020 I don't know. 856 00:46:15,020 --> 00:46:17,370 There was some subliminal message 857 00:46:17,370 --> 00:46:20,750 I think with throwing red Frisbees that we didn't like. 858 00:46:20,750 --> 00:46:21,380 But OK. 859 00:46:21,380 --> 00:46:22,550 So thank you. 860 00:46:22,550 --> 00:46:33,260 And h prime of x is simply something where 861 00:46:33,260 --> 00:46:37,360 I'm going to concatenate a zero to the x value 862 00:46:37,360 --> 00:46:38,660 and just put it out. 863 00:46:38,660 --> 00:46:40,810 And clearly this is breaking one-wayness 864 00:46:40,810 --> 00:46:43,610 because I'm just taking the input, I'm adding a zero to it, 865 00:46:43,610 --> 00:46:44,730 and shipping it out. 866 00:46:44,730 --> 00:46:46,900 So it's going to be easy to go backwards, right? 867 00:46:46,900 --> 00:46:53,500 And this only happens if x is less than n, 868 00:46:53,500 --> 00:46:55,460 as the gentleman just said. 869 00:46:55,460 --> 00:47:00,220 Less than or equal to n in terms of the input length, OK? 870 00:47:00,220 --> 00:47:03,131 Otherwise I'm going to do h of x. 871 00:47:08,270 --> 00:47:10,160 So this is good news.
872 00:47:10,160 --> 00:47:15,400 Because I'm actually using the hash function in the case 873 00:47:15,400 --> 00:47:17,890 where I have a longer input string. 874 00:47:17,890 --> 00:47:20,660 This is bad news for one-wayness because I'm just 875 00:47:20,660 --> 00:47:23,010 piping out the input. 876 00:47:23,010 --> 00:47:30,927 And so if I get an x, and I see what the x is out here, 877 00:47:30,927 --> 00:47:32,510 and let's just say for argument's sake 878 00:47:32,510 --> 00:47:38,480 that-- you could even say that n is 879 00:47:38,480 --> 00:47:43,330 going to be something that is less than d, 880 00:47:43,330 --> 00:47:46,210 which is the final output, which has d-bits. 881 00:47:46,210 --> 00:47:49,090 And so if you see something that h prime produces 882 00:47:49,090 --> 00:47:51,450 that's less than d-bits you instantly 883 00:47:51,450 --> 00:47:54,030 know that you can go backwards and discover 884 00:47:54,030 --> 00:47:57,186 what input produced that for the h prime, right? 885 00:47:57,186 --> 00:47:59,060 Because you just go off and you go backwards. 886 00:47:59,060 --> 00:48:00,350 This is what it tells you. 887 00:48:00,350 --> 00:48:01,850 Now on the other hand if it's larger 888 00:48:01,850 --> 00:48:03,160 obviously you can't do that. 889 00:48:03,160 --> 00:48:06,770 But there's a whole lot of combinations 890 00:48:06,770 --> 00:48:08,100 that you can do that for. 891 00:48:08,100 --> 00:48:11,300 So this breaks one-wayness, OK? 892 00:48:11,300 --> 00:48:13,074 Now you think about TCR. 893 00:48:13,074 --> 00:48:14,490 And what you want to show of course 894 00:48:14,490 --> 00:48:17,570 is that this maintains TCR. 895 00:48:17,570 --> 00:48:20,622 So that's the last thing that we have to show. 896 00:48:20,622 --> 00:48:22,080 We know that it breaks one-wayness. 897 00:48:22,080 --> 00:48:25,182 But if it broke TCR we don't quite have our example.
898 00:48:25,182 --> 00:48:26,640 So we want to show that it actually 899 00:48:26,640 --> 00:48:31,220 maintains TCR, which is kind of a weakish property 900 00:48:31,220 --> 00:48:33,440 that we need to maintain. 901 00:48:33,440 --> 00:48:35,890 And the reason this maintains TCR 902 00:48:35,890 --> 00:48:39,290 is that there's really only two cases here obviously, 903 00:48:39,290 --> 00:48:41,720 corresponding to the if statement. 904 00:48:41,720 --> 00:48:49,280 And it's pretty clear that if x is less than or equal to n, 905 00:48:49,280 --> 00:49:03,520 clearly different x's produce different h prime x's, correct? 906 00:49:03,520 --> 00:49:06,620 Because I'm just passing along the x out to the output. 907 00:49:06,620 --> 00:49:09,730 So if x is less than n I am going to get different hashes 908 00:49:09,730 --> 00:49:10,570 at the output. 909 00:49:10,570 --> 00:49:12,350 I'm just passing them out. 910 00:49:12,350 --> 00:49:13,940 So that's easy. 911 00:49:13,940 --> 00:49:17,490 And for the other case, well I assume that h of x 912 00:49:17,490 --> 00:49:20,129 was CCR, correct? 913 00:49:20,129 --> 00:49:22,420 Because that was the original assumption, that I had h, 914 00:49:22,420 --> 00:49:23,540 which was CCR. 915 00:49:23,540 --> 00:49:30,690 So in both cases TCR is maintained because else h 916 00:49:30,690 --> 00:49:38,350 of x maintains TCR, all right? 917 00:49:38,350 --> 00:49:41,284 So again, a bit of a contrived example 918 00:49:41,284 --> 00:49:42,700 to show you the difference between 919 00:49:42,700 --> 00:49:45,510 these different properties so you know not to mix them up. 920 00:49:45,510 --> 00:49:47,630 You know what you want to ask for, 921 00:49:47,630 --> 00:49:51,150 what is required when you actually 922 00:49:51,150 --> 00:49:53,870 implement an application that depends 923 00:49:53,870 --> 00:49:56,000 on particular properties. 924 00:49:56,000 --> 00:49:57,230 All right? 
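As a rough illustration of the h prime construction just described, here is a minimal Python sketch. SHA-256 standing in for h, the threshold `N_BYTES`, and the helper `invert_short` are assumptions made for the example, not part of the lecture. The one-byte marker on the long-input branch is also an added assumption that simply keeps the two branches visibly apart.

```python
import hashlib

N_BYTES = 16  # hypothetical threshold n on the input length (kept below the digest size d)


def h(x: bytes) -> bytes:
    # Stand-in for the underlying hash h; SHA-256 is an assumption.
    return hashlib.sha256(x).digest()


def h_prime(x: bytes) -> bytes:
    if len(x) <= N_BYTES:
        # Short input: prepend a zero marker and pass x straight through.
        return b"\x00" + x
    # Long input: fall back to h(x); the one marker is an assumption.
    return b"\x01" + h(x)


def invert_short(y: bytes):
    # Short outputs carry the input in the clear, so inversion is trivial.
    if y[:1] == b"\x00":
        return y[1:]
    return None  # long-input outputs are not inverted here
```

Any output that starts with the zero marker gives away its input, which is the break in one-wayness. Distinct short inputs still yield distinct outputs, and long inputs inherit whatever collision behavior h has.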
925 00:49:57,230 --> 00:49:59,010 Any questions so far about properties 926 00:49:59,010 --> 00:50:01,040 or any of these examples? 927 00:50:01,040 --> 00:50:03,227 We're going to dive in to using them. 928 00:50:06,970 --> 00:50:08,510 OK. 929 00:50:08,510 --> 00:50:12,170 So start thinking computer security. 930 00:50:12,170 --> 00:50:18,090 Start thinking hackers, protecting yourself 931 00:50:18,090 --> 00:50:20,655 against the bad guys that are out there who 932 00:50:20,655 --> 00:50:22,640 are trying to discover your passwords, 933 00:50:22,640 --> 00:50:24,924 trying to corrupt your files, generally 934 00:50:24,924 --> 00:50:25,965 make your life miserable. 935 00:50:32,880 --> 00:50:38,880 And we'll start out with fairly simple examples, where 936 00:50:38,880 --> 00:50:41,730 the properties are somewhat obvious, 937 00:50:41,730 --> 00:50:46,205 and graduate to this auction bidding example which 938 00:50:46,205 --> 00:50:48,080 should be sort of the culmination of at least 939 00:50:48,080 --> 00:50:50,120 this part of the lecture. 940 00:50:50,120 --> 00:50:52,470 And depending on how much time I have 941 00:50:52,470 --> 00:50:54,800 I'll tell you a little bit about how 942 00:50:54,800 --> 00:50:56,730 to implement hash functions. 943 00:50:56,730 --> 00:50:59,640 But I think these things are more 944 00:50:59,640 --> 00:51:03,580 important from a standpoint of giving you 945 00:51:03,580 --> 00:51:08,610 a sense of cryptographic hashes. 946 00:51:08,610 --> 00:51:10,380 All right. 947 00:51:10,380 --> 00:51:11,970 Password storage. 948 00:51:11,970 --> 00:51:16,730 How many of you write your password in an unencrypted text 949 00:51:16,730 --> 00:51:22,230 file and store it in a readable location? 950 00:51:22,230 --> 00:51:24,380 There you go, man. 951 00:51:24,380 --> 00:51:27,390 Thank you for being honest. 952 00:51:27,390 --> 00:51:29,550 And I do worse. 
953 00:51:29,550 --> 00:51:32,610 Not only do I do that, I use my first daughter's 954 00:51:32,610 --> 00:51:35,334 name for four passwords. 955 00:51:35,334 --> 00:51:36,750 I won't tell you what the name is. 956 00:51:41,350 --> 00:51:43,470 So that's something that we'd like to fix, right? 957 00:51:43,470 --> 00:51:45,500 So what do real systems do? 958 00:51:45,500 --> 00:51:49,530 Real systems cannot protect against me using my first 959 00:51:49,530 --> 00:51:51,400 daughter's name as a password, right? 960 00:51:51,400 --> 00:51:53,580 So there's no way you can protect against that. 961 00:51:53,580 --> 00:51:56,830 But if I had a reasonable password, which 962 00:51:56,830 --> 00:51:59,030 had reasonable entropy in it-- so 963 00:51:59,030 --> 00:52:01,344 let's assume here that we have reasonable entropy 964 00:52:01,344 --> 00:52:02,010 in the password. 965 00:52:02,010 --> 00:52:04,000 And you can just say 128-bits. 966 00:52:04,000 --> 00:52:05,240 And it's not a lot, right? 967 00:52:05,240 --> 00:52:09,135 128-bits is 16 characters, OK? 968 00:52:09,135 --> 00:52:11,260 And you don't have to answer this-- how many of you 969 00:52:11,260 --> 00:52:15,390 have 16 characters in your password? 970 00:52:15,390 --> 00:52:16,710 Oh I'm impressed. 971 00:52:16,710 --> 00:52:17,350 OK. 972 00:52:17,350 --> 00:52:18,980 So you've got 128-bits of entropy. 973 00:52:18,980 --> 00:52:21,710 But the rest of you, forget it. 974 00:52:21,710 --> 00:52:25,040 This is not going to help you, OK? 975 00:52:25,040 --> 00:52:28,140 But what I want, assuming you have 976 00:52:28,140 --> 00:52:31,830 significant entropy in your password-- because otherwise, 977 00:52:31,830 --> 00:52:33,940 if there's not enough entropy you 978 00:52:33,940 --> 00:52:38,272 can just enumerate all possible passwords of eight letters. 979 00:52:38,272 --> 00:52:39,230 And it's not that much. 980 00:52:39,230 --> 00:52:41,391 It's 2 raised to 50, what have you. 
981 00:52:41,391 --> 00:52:42,390 And you can just go off. 982 00:52:42,390 --> 00:52:44,150 And none of these properties matter. 983 00:52:44,150 --> 00:52:45,810 You just-- you have your h of x. 984 00:52:45,810 --> 00:52:48,206 It's public. 985 00:52:48,206 --> 00:52:50,080 We'll talk about how we use that in a second. 986 00:52:50,080 --> 00:52:53,350 But clearly if the domain is small 987 00:52:53,350 --> 00:52:55,120 you can just enumerate the domain. 988 00:52:55,120 --> 00:52:57,062 So keep that in mind. 989 00:52:57,062 --> 00:52:58,770 I talked about h of x, and it's obviously 990 00:52:58,770 --> 00:53:00,300 going to be relevant here. 991 00:53:00,300 --> 00:53:02,520 But suppose I wanted to build a system, 992 00:53:02,520 --> 00:53:04,300 and this is how systems are built, 993 00:53:04,300 --> 00:53:06,700 the /etc/passwd file, assuming 994 00:53:06,700 --> 00:53:11,040 you have long passwords it does it this way, 995 00:53:11,040 --> 00:53:13,320 otherwise it needs something that's called a salt. 996 00:53:13,320 --> 00:53:16,540 But that's 6.857 and we won't go there. 997 00:53:16,540 --> 00:53:19,590 So we just assume a large entropy. 998 00:53:19,590 --> 00:53:21,980 What is it that a system can do? 999 00:53:21,980 --> 00:53:26,210 What can it store in order to let you in, and only 1000 00:53:26,210 --> 00:53:28,830 let you in when you type your password, 1001 00:53:28,830 --> 00:53:32,190 and not let some bogus password into the system? 1002 00:53:32,190 --> 00:53:34,610 Or somebody with a bogus password into the system. 1003 00:53:34,610 --> 00:53:35,249 Yeah, go ahead. 1004 00:53:35,249 --> 00:53:37,540 AUDIENCE: If you capture the password when you enter it 1005 00:53:37,540 --> 00:53:39,380 and compare it to what's stored-- 1006 00:53:39,380 --> 00:53:40,347 SRINIVAS DEVADAS: Yes. 1007 00:53:40,347 --> 00:53:42,430 AUDIENCE: If it's a one-way hash you know you have 1008 00:53:42,430 --> 00:53:42,730 what the correct password is.
1009 00:53:42,730 --> 00:53:43,820 SRINIVAS DEVADAS: That's exactly right. 1010 00:53:43,820 --> 00:53:44,790 That's exactly right. 1011 00:53:44,790 --> 00:53:49,950 So it's a really simple idea, a very powerful idea. 1012 00:53:49,950 --> 00:53:54,610 It, as I said, assumed that the entropy-- and I'm belaboring 1013 00:53:54,610 --> 00:53:56,890 the obvious now-- but it is important 1014 00:53:56,890 --> 00:53:59,890 when you talk about security to state your assumptions. 1015 00:53:59,890 --> 00:54:04,380 But you do not store password on your computer. 1016 00:54:04,380 --> 00:54:06,940 And you store the hash of the password. 1017 00:54:06,940 --> 00:54:09,530 Now why do I store my password on the computer? 1018 00:54:09,530 --> 00:54:12,200 Because this is so inconvenient, right? 1019 00:54:12,200 --> 00:54:15,180 So this is what the system does for me. 1020 00:54:15,180 --> 00:54:18,110 But the fact of the matter is, if I lose my password, 1021 00:54:18,110 --> 00:54:19,470 this doesn't help me. 1022 00:54:19,470 --> 00:54:24,050 Because what the system wants you to do is choose a password 1023 00:54:24,050 --> 00:54:26,720 that is long enough, and the h is one-way. 1024 00:54:26,720 --> 00:54:30,960 So anybody who discovers h of PW that is publicly readable 1025 00:54:30,960 --> 00:54:33,840 cannot discover PW, all right? 1026 00:54:33,840 --> 00:54:36,420 That's what's cool about this. 1027 00:54:36,420 --> 00:54:38,740 How do you let the person log in? 1028 00:54:38,740 --> 00:54:47,860 Use h of PW to compare against h of PW prime, 1029 00:54:47,860 --> 00:54:54,420 which is what is entered, where PW prime is the typed password. 1030 00:55:00,540 --> 00:55:08,530 And clearly what we need is the disclosure of h of PW 1031 00:55:08,530 --> 00:55:14,960 should not reveal PW. 1032 00:55:14,960 --> 00:55:19,570 So we definitely need one-wayness. 1033 00:55:19,570 --> 00:55:24,370 What about-- what about collision resistance? 
1034 00:55:24,370 --> 00:55:28,340 Or target collision resistance? 1035 00:55:28,340 --> 00:55:31,350 Think practitioner now, right? 1036 00:55:31,350 --> 00:55:33,590 Are we interested in this hash function 1037 00:55:33,590 --> 00:55:34,880 being collision resistant? 1038 00:55:34,880 --> 00:55:37,150 What does that mean in this case? 1039 00:55:37,150 --> 00:55:40,315 Give me the context in this particular application? 1040 00:55:40,315 --> 00:55:40,940 Yeah, go ahead. 1041 00:55:40,940 --> 00:55:44,860 AUDIENCE: It means that someone entering a different password 1042 00:55:44,860 --> 00:55:47,107 will have the same hash [INAUDIBLE]. 1043 00:55:47,107 --> 00:55:48,190 SRINIVAS DEVADAS: Exactly. 1044 00:55:48,190 --> 00:55:56,600 So it means that what you have is a situation where you do not 1045 00:55:56,600 --> 00:56:00,900 reveal-- and so what might happen is that h of PW prime 1046 00:56:00,900 --> 00:56:02,460 equals h of PW. 1047 00:56:02,460 --> 00:56:07,190 But h of PW equals h of PW prime. 1048 00:56:07,190 --> 00:56:11,490 But PW is not equal to PW prime. 1049 00:56:11,490 --> 00:56:13,950 What you have is a false positive. 1050 00:56:13,950 --> 00:56:15,570 Someone who didn't know your password 1051 00:56:15,570 --> 00:56:19,060 but guessed right-- and this is a 128-bit value, 1052 00:56:19,060 --> 00:56:22,840 and they guessed right-- is going to get it. 1053 00:56:22,840 --> 00:56:24,940 You don't particularly care about the probability 1054 00:56:24,940 --> 00:56:26,190 of this occurrence. 1055 00:56:26,190 --> 00:56:27,900 It's really small. 1056 00:56:27,900 --> 00:56:30,570 Typically you're going to have systems that lock you out 1057 00:56:30,570 --> 00:56:34,770 if you try, say, 10 wrong passwords, 1058 00:56:34,770 --> 00:56:35,270 right?
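The login check described above, storing h of PW and comparing it against h of PW prime, can be sketched roughly as follows. SHA-256 standing in for h and the function names `enroll` and `check_login` are assumptions for illustration. No salt is used, matching the lecture's assumption of a high-entropy password.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    # Stand-in for the one-way hash h; SHA-256 is an assumption.
    return hashlib.sha256(data).digest()


def enroll(password: str) -> bytes:
    # The system stores only h(PW), never PW itself.
    return h(password.encode("utf-8"))


def check_login(stored_hash: bytes, typed_password: str) -> bool:
    # Compare h(PW') for the typed password against the stored h(PW).
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(stored_hash, h(typed_password.encode("utf-8")))
```

For example, `check_login(enroll("some long passphrase"), "some long passphrase")` returns `True`, while a different typed password is rejected unless it happens to collide with the stored hash.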
1059 00:56:35,270 --> 00:56:37,965 So really in systems you do not require-- 1060 00:56:37,965 --> 00:56:39,340 you do want to build systems that 1061 00:56:39,340 --> 00:56:42,090 have minimal properties with respect 1062 00:56:42,090 --> 00:56:43,570 to the primitives that are used. 1063 00:56:43,570 --> 00:56:47,090 So from a system building standpoint just require OW. 1064 00:56:47,090 --> 00:56:48,350 Don't go overboard. 1065 00:56:48,350 --> 00:56:53,100 Don't require collision resistance or TCR, OK? 1066 00:56:53,100 --> 00:56:55,420 Let's do a slightly different example. 1067 00:56:55,420 --> 00:56:59,010 Also a bit of a warm-up for what's 1068 00:56:59,010 --> 00:57:01,895 coming next, which is a file modification detector. 1069 00:57:22,080 --> 00:57:32,800 So for each file F, I'm going to store h of F, and store it securely. 1070 00:57:32,800 --> 00:57:36,980 So you assume that this means that h of F cannot be modified 1071 00:57:36,980 --> 00:57:40,380 by anybody, h of F itself. 1072 00:57:47,860 --> 00:57:56,030 And now we want to check if F is modified 1073 00:57:56,030 --> 00:58:04,470 by re-computing h of F. Which could be, 1074 00:58:04,470 --> 00:58:05,640 this could be modified. 1075 00:58:05,640 --> 00:58:07,130 So this could actually be F prime. 1076 00:58:07,130 --> 00:58:09,250 You don't know that. 1077 00:58:09,250 --> 00:58:10,500 You have a file. 1078 00:58:10,500 --> 00:58:11,780 It's a gigabyte. 1079 00:58:11,780 --> 00:58:14,270 And somebody might have tampered with one 1080 00:58:14,270 --> 00:58:16,030 of the bits in the file. 1081 00:58:16,030 --> 00:58:19,340 All you have is a d-bit digest that 1082 00:58:19,340 --> 00:58:23,670 corresponds to h of F that you stored in a secure location.
1083 00:58:23,670 --> 00:58:27,190 And you want to check to see, by re-computing 1084 00:58:27,190 --> 00:58:31,940 h of F, the file that is given to you, 1085 00:58:31,940 --> 00:58:34,135 and comparing it with what you've stored, the h of F 1086 00:58:34,135 --> 00:58:35,730 that you've stored. 1087 00:58:35,730 --> 00:58:42,200 And so what property do we need in order to pull this off? 1088 00:58:42,200 --> 00:58:44,590 Of hash functions. 1089 00:58:44,590 --> 00:58:48,070 What precisely do we need to pull this off? 1090 00:58:50,620 --> 00:58:53,040 What is the adversary trying to do? 1091 00:58:53,040 --> 00:58:55,530 And what is a successful break? 1092 00:58:55,530 --> 00:59:02,000 A successful break is if an adversary can modify the file 1093 00:59:02,000 --> 00:59:08,720 and keep h of F the same, right? 1094 00:59:08,720 --> 00:59:10,780 That would be a successful break, right? 1095 00:59:10,780 --> 00:59:13,600 Yup. 1096 00:59:13,600 --> 00:59:14,125 Go ahead. 1097 00:59:14,125 --> 00:59:14,910 AUDIENCE: TCR. 1098 00:59:14,910 --> 00:59:15,550 SRINIVAS DEVADAS: TCR? 1099 00:59:15,550 --> 00:59:16,300 Yeah, absolutely. 1100 00:59:16,300 --> 00:59:16,841 You need TCR. 1101 00:59:19,350 --> 00:59:21,750 So you want to modify the file. 1102 00:59:34,830 --> 00:59:38,230 So you're given that the file-- the adversary 1103 00:59:38,230 --> 00:59:41,980 is given the file, which is the input to the hash, 1104 00:59:41,980 --> 00:59:47,550 and is going to try and modify-- modify the file, right? 1105 00:59:47,550 --> 00:59:51,130 So let's do a couple more. 1106 00:59:51,130 --> 00:59:57,470 And we're going to advance our requirements here a little bit. 1107 00:59:57,470 --> 01:00:00,891 So those two are basic properties. 1108 01:00:00,891 --> 01:00:02,140 I want to leave this up there. 1109 01:00:04,937 --> 01:00:06,770 We're going to do something that corresponds 1110 01:00:06,770 --> 01:00:08,690 to digital signatures. 
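The file modification detector described above can be sketched in a few lines of Python. SHA-256 standing in for h, the chunk size, and the function names are assumptions for illustration. The stored digest is assumed to live somewhere the adversary cannot modify, as the lecture requires.

```python
import hashlib


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    # Hash the file in chunks so a gigabyte-sized file need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: str, stored_digest: str) -> bool:
    # Recompute h(F) for the file as it is now and compare with the stored h(F).
    return file_digest(path) == stored_digest
```

A tampered file F prime passes this check only if h of F prime equals the stored h of F, which is exactly the target collision that TCR rules out.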
1111 01:00:08,690 --> 01:00:13,030 So digital signatures are this wonderful invention 1112 01:00:13,030 --> 01:00:18,290 that came out of MIT in a computer science laboratory-- 1113 01:00:18,290 --> 01:00:23,160 again, Ron Rivest and collaborators-- which 1114 01:00:23,160 --> 01:00:28,120 are a way of digitally signing a document using 1115 01:00:28,120 --> 01:00:31,170 a secret key, a private key. 1116 01:00:31,170 --> 01:00:35,660 But anybody who has access to a public key, 1117 01:00:35,660 --> 01:00:37,210 so it could be pretty much anybody, 1118 01:00:37,210 --> 01:00:41,647 could verify the authenticity of that signature, right? 1119 01:00:41,647 --> 01:00:43,230 So that's what a digital signature is. 1120 01:00:52,490 --> 01:00:55,960 So we're going to talk about public-key cryptography 1121 01:00:55,960 --> 01:01:00,730 on Thursday, in terms of how you could build 1122 01:01:00,730 --> 01:01:06,640 systems or encryption algorithms that are public key algorithms. 1123 01:01:06,640 --> 01:01:12,470 But here I'll just tell you what we want out of them. 1124 01:01:12,470 --> 01:01:15,100 Essentially what we have here in the case of signatures, 1125 01:01:15,100 --> 01:01:18,100 we actually want to talk about encryption here, 1126 01:01:18,100 --> 01:01:20,180 are-- there's two keys associated 1127 01:01:20,180 --> 01:01:24,030 with a public key system. 1128 01:01:24,030 --> 01:01:26,880 Anybody and everybody in the system 1129 01:01:26,880 --> 01:01:31,090 would have a public key that you can put on your website. 1130 01:01:31,090 --> 01:01:34,500 And you also have a secret key-- that's like your password-- 1131 01:01:34,500 --> 01:01:35,930 that you don't want to write down, 1132 01:01:35,930 --> 01:01:38,221 you don't want to give away, because that's effectively 1133 01:01:38,221 --> 01:01:39,930 your identity. 1134 01:01:39,930 --> 01:01:44,700 And what digital signatures correspond to 1135 01:01:44,700 --> 01:01:46,880 are that you have two operations.
1136 01:01:46,880 --> 01:01:51,030 You have signing and verification. 1137 01:01:51,030 --> 01:01:56,760 So signing means that you create a signature sigma that 1138 01:01:56,760 --> 01:02:06,420 is the sign using your private key, your secret key, 1139 01:02:06,420 --> 01:02:10,070 of a message M. So you're saying this is this message, 1140 01:02:10,070 --> 01:02:12,060 it came from me, right? 1141 01:02:12,060 --> 01:02:13,655 That's what signing means. 1142 01:02:13,655 --> 01:02:16,030 You have this long message and you sign it at the bottom. 1143 01:02:16,030 --> 01:02:20,620 You're taking responsibility for the contents of that message. 1144 01:02:20,620 --> 01:02:27,710 And then verification is you have M, sigma, and a public key. 1145 01:02:27,710 --> 01:02:31,770 And this is simply going to output true or false. 1146 01:02:35,780 --> 01:02:42,260 And so the public key should not reveal any information 1147 01:02:42,260 --> 01:02:43,260 about the secret key. 1148 01:02:48,570 --> 01:02:51,700 And that's the challenge of building PKI systems, 1149 01:02:51,700 --> 01:02:56,800 that we'll talk about in some detail next time. 1150 01:02:56,800 --> 01:03:01,440 But we don't need to think about that other 1151 01:03:01,440 --> 01:03:06,100 than acknowledging it today. 1152 01:03:06,100 --> 01:03:09,680 So the public and private key are two distinct things, 1153 01:03:09,680 --> 01:03:12,150 neither one of which reveals anything about the other. 1154 01:03:12,150 --> 01:03:14,430 Think of them as completely distinct passwords. 1155 01:03:14,430 --> 01:03:16,730 But they happen to be mathematically related. 1156 01:03:16,730 --> 01:03:18,500 That's why this whole thing works. 1157 01:03:18,500 --> 01:03:20,260 And that mathematical relationship 1158 01:03:20,260 --> 01:03:24,750 we'll look at in some detail on Thursday. 1159 01:03:24,750 --> 01:03:26,920 But having said that, take a look 1160 01:03:26,920 --> 01:03:29,490 at what this app is doing for us, right?
1161 01:03:29,490 --> 01:03:31,370 This is a security application. 1162 01:03:31,370 --> 01:03:33,930 And I haven't quite gotten to hash functions yet. 1163 01:03:33,930 --> 01:03:36,600 But I'll get to it in just a minute. 1164 01:03:36,600 --> 01:03:39,330 But what I want to do is emphasize that there's 1165 01:03:39,330 --> 01:03:41,150 two operations going on. 1166 01:03:41,150 --> 01:03:42,760 One of which is a signature, which 1167 01:03:42,760 --> 01:03:46,050 is a private signature, in the sense that it's private to me, 1168 01:03:46,050 --> 01:03:47,160 if I'm Alice. 1169 01:03:47,160 --> 01:03:48,500 Or private to Alice. 1170 01:03:48,500 --> 01:03:50,590 And you're using secret information 1171 01:03:50,590 --> 01:03:52,810 on this public message, M, because that's 1172 01:03:52,810 --> 01:03:54,690 going to be publicized. 1173 01:03:54,690 --> 01:03:57,580 And you're going to sign the public message. 1174 01:03:57,580 --> 01:04:01,160 And then anybody in the world who has access 1175 01:04:01,160 --> 01:04:04,190 to Alice's public key is going to be able to say, 1176 01:04:04,190 --> 01:04:06,840 oh I'm looking at the signature, which is a bunch of bits. 1177 01:04:06,840 --> 01:04:09,900 I'm looking at the message, which is a whole lot of bits. 1178 01:04:09,900 --> 01:04:12,590 And I have this public key, which is a bunch of bits. 1179 01:04:12,590 --> 01:04:16,150 And I'm going to be able to tell for sure 1180 01:04:16,150 --> 01:04:19,340 that either Alice signed this message, 1181 01:04:19,340 --> 01:04:22,560 or Alice did not sign this message. 1182 01:04:22,560 --> 01:04:26,710 And the assumption here is that Alice 1183 01:04:26,710 --> 01:04:28,950 kept her private key secret. 1184 01:04:28,950 --> 01:04:30,970 And of course, what I just wrote there, 1185 01:04:30,970 --> 01:04:33,450 that the public key does not reveal anything 1186 01:04:33,450 --> 01:04:35,530 about the secret key, OK? 
1187 01:04:35,530 --> 01:04:38,350 So that's digital signatures for you, in a nutshell. 1188 01:04:38,350 --> 01:04:40,990 And when you do MIT certificates you're 1189 01:04:40,990 --> 01:04:45,130 using digital signatures a la Rivest-Shamir-Adleman, the RSA 1190 01:04:45,130 --> 01:04:45,900 algorithm. 1191 01:04:45,900 --> 01:04:48,580 So you're using this all the time, 1192 01:04:48,580 --> 01:04:52,290 when you click on 6.046 links, for example. 1193 01:04:52,290 --> 01:04:56,440 And what happens is M is typically really large. 1194 01:04:56,440 --> 01:04:58,060 I mean it could be a file, right? 1195 01:04:58,060 --> 01:04:59,510 It could be a large file. 1196 01:04:59,510 --> 01:05:02,730 And you don't necessarily want to compute these operations 1197 01:05:02,730 --> 01:05:04,150 on large files. 1198 01:05:04,150 --> 01:05:09,580 So for convenience, what happens is you end up hashing the file. 1199 01:05:09,580 --> 01:05:22,550 And for large M it's easier to sign h of M. 1200 01:05:22,550 --> 01:05:29,810 And so replace the M's that you see here with h of M, 1201 01:05:29,810 --> 01:05:30,720 all right? 1202 01:05:30,720 --> 01:05:38,640 So now that we're given that we're going to be doing h of M 1203 01:05:38,640 --> 01:05:42,550 in here, think about what we wanted 1204 01:05:42,550 --> 01:05:45,390 to accomplish with M, right? 1205 01:05:45,390 --> 01:05:48,150 I told you what we wanted to accomplish with M. 1206 01:05:48,150 --> 01:05:49,360 There's a particular message. 1207 01:05:49,360 --> 01:05:50,190 I'm Alice. 1208 01:05:50,190 --> 01:05:53,850 I'm going to keep my secret key secret. 1209 01:05:53,850 --> 01:05:57,910 But I want to commit to signing this message M, all right? 1210 01:05:57,910 --> 01:06:00,330 And I want to make sure that nobody 1211 01:06:00,330 --> 01:06:05,320 can pretend to be me who doesn't know my secret key. 1212 01:06:05,320 --> 01:06:07,290 And nobody does. 
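To make the idea of signing h of M instead of M concrete, here is a toy Python sketch. It uses unpadded textbook RSA with small, publicly known primes purely as an assumed stand-in for the signature scheme, and SHA-256 as an assumed stand-in for h. It is not secure and is not the scheme used by any real certificate system. All names are hypothetical, and `pow(E, -1, ...)` needs Python 3.8 or later.

```python
import hashlib

# Toy parameters, NOT secure: two well-known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # secret exponent (modular inverse of E)


def hash_to_int(message: bytes) -> int:
    # h(M), reduced modulo N so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Sign the short digest h(M) rather than the (possibly huge) message M.
    return pow(hash_to_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Anyone holding the public key (N, E) can recompute h(M) and check.
    return pow(signature, E, N) == hash_to_int(message)
```

Note that the signature only ever depends on h of M, so two messages with the same hash share the same valid signature.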
1213 01:06:07,290 --> 01:06:10,760 So if I'm going to be signing the hash of the message, 1214 01:06:10,760 --> 01:06:13,930 now it comes down to today's lecture. 1215 01:06:13,930 --> 01:06:16,680 I'm signing the hash of the message h of M. 1216 01:06:16,680 --> 01:06:22,120 What property do I require of h in order for this whole thing 1217 01:06:22,120 --> 01:06:23,640 to work out? 1218 01:06:23,640 --> 01:06:24,636 Yeah, go ahead. 1219 01:06:24,636 --> 01:06:26,540 AUDIENCE: Is it non-malleability? 1220 01:06:26,540 --> 01:06:28,665 SRINIVAS DEVADAS: Non malleability, but even before 1221 01:06:28,665 --> 01:06:31,770 that-- suppose-- absolutely, but non-malleability 1222 01:06:31,770 --> 01:06:36,590 is kind of beyond one of these properties over on the right. 1223 01:06:36,590 --> 01:06:39,570 You're on the right track, right? 1224 01:06:39,570 --> 01:06:45,219 So do you want to give me a different answer? 1225 01:06:45,219 --> 01:06:46,677 You can give me a different answer. 1226 01:06:46,677 --> 01:06:50,090 AUDIENCE: Oh, I'm not sure. 1227 01:06:50,090 --> 01:06:52,190 SRINIVAS DEVADAS: OK. 1228 01:06:52,190 --> 01:06:52,690 What? 1229 01:06:52,690 --> 01:06:53,898 Yeah, back there. 1230 01:06:53,898 --> 01:06:56,766 AUDIENCE: I think you wanted to one-way because otherwise you 1231 01:06:56,766 --> 01:07:00,112 could take that signature and find another message that you 1232 01:07:00,112 --> 01:07:01,080 could credit. 1233 01:07:01,080 --> 01:07:02,740 SRINIVAS DEVADAS: I can make M public. 1234 01:07:02,740 --> 01:07:05,480 I can make M-- M can be public. 1235 01:07:05,480 --> 01:07:07,060 And h of M is public. 1236 01:07:07,060 --> 01:07:13,570 So one-wayness is not interesting for this example 1237 01:07:13,570 --> 01:07:14,690 if M is public. 1238 01:07:14,690 --> 01:07:16,690 And we can assume that M eventually gets public. 1239 01:07:16,690 --> 01:07:18,840 Because that's the message I'm signing, right? 
1240 01:07:18,840 --> 01:07:21,082 I can also put M out. 1241 01:07:21,082 --> 01:07:22,540 So I want the relationship-- I want 1242 01:07:22,540 --> 01:07:25,760 you to focus on the relationship between h of M and M 1243 01:07:25,760 --> 01:07:28,720 and tell me what would break this system. 1244 01:07:28,720 --> 01:07:31,120 And you're on the right track. 1245 01:07:31,120 --> 01:07:31,970 Yeah, go ahead. 1246 01:07:31,970 --> 01:07:32,932 Or way back there. 1247 01:07:32,932 --> 01:07:33,890 Yeah, sorry about that. 1248 01:07:33,890 --> 01:07:35,074 AUDIENCE: TCR. 1249 01:07:35,074 --> 01:07:35,990 SRINIVAS DEVADAS: TCR. 1250 01:07:35,990 --> 01:07:36,780 Why TCR? 1251 01:07:36,780 --> 01:07:37,696 AUDIENCE: [INAUDIBLE]. 1252 01:07:46,130 --> 01:07:49,070 SRINIVAS DEVADAS: So I have M. So what happens here-- 1253 01:07:49,070 --> 01:07:51,920 I should write this out. 1254 01:07:51,920 --> 01:08:12,640 I'm given-- as an adversary I have M and h of M. It is bad 1255 01:08:12,640 --> 01:08:33,010 if Alice signs h of M, but Bob claims Alice signed M prime. 1256 01:08:33,010 --> 01:08:39,830 Because h of M equals h of M prime, right? 1257 01:08:39,830 --> 01:08:41,600 That is bad. 1258 01:08:41,600 --> 01:08:44,729 So the M is public-- could you stand up? 1259 01:08:49,229 --> 01:08:50,600 M is given. 1260 01:08:50,600 --> 01:08:53,329 There's a specific M, and a specific h 1261 01:08:53,329 --> 01:08:56,470 of M in particular, that has been exposed. 1262 01:08:56,470 --> 01:08:59,620 And h of M is what was used for the signature. 1263 01:08:59,620 --> 01:09:01,140 So you want to keep h of M the same. 1264 01:09:01,140 --> 01:09:02,170 It's a specific one. 1265 01:09:02,170 --> 01:09:03,544 So it's not collision resistance, 1266 01:09:03,544 --> 01:09:05,850 it's target collision resistance, 1267 01:09:05,850 --> 01:09:07,460 because that's given to you. 1268 01:09:07,460 --> 01:09:09,430 And you want to keep that the same. 
1269 01:09:09,430 --> 01:09:13,600 But you want to claim that oh, you promised me $10,000, not 1270 01:09:13,600 --> 01:09:15,319 $20, right? 1271 01:09:15,319 --> 01:09:17,899 If you can do that, you signed saying 1272 01:09:17,899 --> 01:09:22,149 you want to pay $10,000, not $20, then you've got a problem. 1273 01:09:22,149 --> 01:09:24,160 So your thing is very close. 1274 01:09:24,160 --> 01:09:27,130 It's just that it doesn't need to be a strong relationship 1275 01:09:27,130 --> 01:09:28,710 between the 10,000 or the 20. 1276 01:09:28,710 --> 01:09:31,000 I mean I give you a concrete example of that. 1277 01:09:31,000 --> 01:09:33,720 But it could be more, it could be less. 1278 01:09:33,720 --> 01:09:36,479 Anything that is different from what you signed, 1279 01:09:36,479 --> 01:09:38,870 be it with the numerical relationship or not, 1280 01:09:38,870 --> 01:09:43,080 would cause a problem and break this scheme, all right? 1281 01:09:43,080 --> 01:09:45,260 Are we good? 1282 01:09:45,260 --> 01:09:50,490 All right, one last example, the most interesting one. 1283 01:09:50,490 --> 01:09:57,250 And as I guessed I'm probably not going 1284 01:09:57,250 --> 01:10:01,670 to get to saying very much about how hash functions are 1285 01:10:01,670 --> 01:10:02,250 implemented. 1286 01:10:02,250 --> 01:10:04,041 But maybe I'll spend a minute or two on it. 1287 01:10:08,770 --> 01:10:12,700 So let's do this example that has to do with commitments. 1288 01:10:19,260 --> 01:10:20,890 Commitment is important, right? 1289 01:10:20,890 --> 01:10:22,640 You want to commit to doing things. 1290 01:10:22,640 --> 01:10:24,420 You want to keep your promises. 1291 01:10:24,420 --> 01:10:28,310 And in this case we have a legal requirement 1292 01:10:28,310 --> 01:10:34,550 that you want to be able to make people honor their commitments, 1293 01:10:34,550 --> 01:10:37,040 and not weasel their way out of commitments, right?
1294 01:10:37,040 --> 01:10:39,670 And we want to deal with this computationally. 1295 01:10:39,670 --> 01:10:42,720 And let's think about auctions. 1296 01:10:42,720 --> 01:10:51,325 So Alice has value x, e.g. an auction bid. 1297 01:10:54,940 --> 01:11:02,170 Alice computes what we're going to call 1298 01:11:02,170 --> 01:11:11,500 C of x, which is a commitment of x, and submits it, right? 1299 01:11:11,500 --> 01:11:26,670 C of x, C of x is-- let's assume that the auctioneer, 1300 01:11:26,670 --> 01:11:32,470 and perhaps other auctionees as well, see C of x. 1301 01:11:32,470 --> 01:11:34,770 You have to submit it to somebody, right? 1302 01:11:34,770 --> 01:11:37,100 So you can assume that that's exposed. 1303 01:11:37,100 --> 01:11:49,460 And what is going to happen is, when bidding is over Alice 1304 01:11:49,460 --> 01:11:53,145 is going to open-- so this is-- C 1305 01:11:53,145 --> 01:12:00,069 of x can be thought of as sealing the bid. 1306 01:12:00,069 --> 01:12:01,110 So that's the commitment. 1307 01:12:01,110 --> 01:12:03,030 You're sealing the-- you're making a bid 1308 01:12:03,030 --> 01:12:04,600 and you're sealing it in an envelope. 1309 01:12:04,600 --> 01:12:05,650 You've committed to that. 1310 01:12:05,650 --> 01:12:08,110 That's obviously what happens in real life 1311 01:12:08,110 --> 01:12:09,740 without cryptography, but we want 1312 01:12:09,740 --> 01:12:12,300 to do this with cryptography, with hash functions. 1313 01:12:12,300 --> 01:12:19,250 And so now Alice opens C of x to reveal x. 1314 01:12:19,250 --> 01:12:25,670 So she has to prove that in fact x was her bid. 1315 01:12:25,670 --> 01:12:28,580 And that it matches what she sealed.
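The seal-then-open flow just described can be sketched as below, taking C of x to be a plain hash of x. SHA-256 standing in for h and the function names are assumptions for illustration. A bare hash of a short bid can be enumerated, as noted earlier for small domains, so this shows only the mechanics and is not a complete commitment scheme.

```python
import hashlib


def commit(bid: bytes) -> bytes:
    # Seal the bid: publish C(x) = h(x) while keeping x private.
    return hashlib.sha256(bid).digest()


def open_commitment(commitment: bytes, revealed_bid: bytes) -> bool:
    # Open the envelope: Alice reveals x, and anyone can recompute h(x)
    # and check that it matches what she sealed.
    return hashlib.sha256(revealed_bid).digest() == commitment
```

For example, after `c = commit(b"bid: 100")`, the call `open_commitment(c, b"bid: 100")` returns `True`. Claiming a different bid later succeeds only if it hashes to the same value.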
1316 01:12:28,580 --> 01:12:31,930 When you open it up, think about it conceptually 1317 01:12:31,930 --> 01:12:34,660 from a standpoint of what happens with paper, 1318 01:12:34,660 --> 01:12:38,620 and then we have to think about this computationally 1319 01:12:38,620 --> 01:12:41,120 and what this implies, right? 1320 01:12:41,120 --> 01:12:43,245 So again I'll do a little bit of set up. 1321 01:12:43,245 --> 01:12:45,370 And then we'll start talking about the properties 1322 01:12:45,370 --> 01:12:48,997 that we want for this particular application. 1323 01:12:48,997 --> 01:12:50,580 So there are a bunch of people who are 1324 01:12:50,580 --> 01:12:54,680 doing bidding for this auction. 1325 01:12:54,680 --> 01:12:56,999 I don't-- I want to be the first-- 1326 01:12:56,999 --> 01:12:58,540 I don't want to spend a lot of money. 1327 01:12:58,540 --> 01:12:59,560 But I want to win. 1328 01:12:59,560 --> 01:13:01,640 All of us are like that, right? 1329 01:13:01,640 --> 01:13:04,350 If I know information about your bid, 1330 01:13:04,350 --> 01:13:06,490 that is obviously a tremendous advantage. 1331 01:13:06,490 --> 01:13:09,110 So clearly that can't happen, right? 1332 01:13:09,110 --> 01:13:13,000 If I know one other person's bid I just do plus 1 on that. 1333 01:13:13,000 --> 01:13:16,090 If I know everybody else's I just do plus 1 on the maximum. 1334 01:13:16,090 --> 01:13:19,420 So clearly there's some secrecy that's required here, correct? 1335 01:13:19,420 --> 01:13:23,000 So C of x is going to have to do two things. 1336 01:13:23,000 --> 01:13:26,160 It can't reveal x. 1337 01:13:26,160 --> 01:13:28,760 Because then even maybe the auctioneer is bad. 1338 01:13:28,760 --> 01:13:31,570 Or other people are looking at this. 1339 01:13:31,570 --> 01:13:34,760 And you can just assume that C of x is-- the C of x's are all 1340 01:13:34,760 --> 01:13:36,000 public.
1341 01:13:36,000 --> 01:13:39,840 But I also need a constraint that's 1342 01:13:39,840 --> 01:13:43,530 associated with C of x that corresponds to making 1343 01:13:43,530 --> 01:13:46,540 sure Alice is honest, correct? 1344 01:13:46,540 --> 01:13:50,940 So I need to make Alice commit to something, right? 1345 01:13:50,940 --> 01:13:56,000 So what are the different properties of the hash function 1346 01:13:56,000 --> 01:14:03,350 that if I use h of x here, that I'd 1347 01:14:03,350 --> 01:14:08,090 want h to satisfy in order for this whole process 1348 01:14:08,090 --> 01:14:14,700 to work like it's supposed to work with paper and envelopes? 1349 01:14:14,700 --> 01:14:15,695 Yeah, go ahead. 1350 01:14:15,695 --> 01:14:18,406 AUDIENCE: It has to be one-way [INAUDIBLE]. 1351 01:14:18,406 --> 01:14:20,030 SRINIVAS DEVADAS: It has to be one-way. 1352 01:14:20,030 --> 01:14:24,210 And explain to me-- so I want a description of it 1353 01:14:24,210 --> 01:14:26,260 has to be one-way, because why? 1354 01:14:26,260 --> 01:14:27,957 AUDIENCE: Because you want all the 1355 01:14:27,957 --> 01:14:29,790 x's to be hidden from all the other bidders. 1356 01:14:29,790 --> 01:14:31,200 SRINIVAS DEVADAS: Right. 1357 01:14:31,200 --> 01:14:40,930 C of x should not reveal x, all right? 1358 01:14:40,930 --> 01:14:41,430 All right. 1359 01:14:41,430 --> 01:14:41,950 That's good. 1360 01:14:41,950 --> 01:14:44,320 Do you have more? 1361 01:14:44,320 --> 01:14:46,765 It has to be collision resistant. 1362 01:14:53,180 --> 01:14:55,560 OK. 1363 01:14:55,560 --> 01:14:57,852 I guess. 1364 01:14:57,852 --> 01:15:00,580 A little bit more. 1365 01:15:00,580 --> 01:15:02,560 You're getting there. 1366 01:15:02,560 --> 01:15:05,672 What-- why is it collision resistant? 1367 01:15:05,672 --> 01:15:08,132 AUDIENCE: Because you want to make sure that Alice, 1368 01:15:08,132 --> 01:15:12,560 when she makes a bid that she commits that bid.
1369 01:15:12,560 --> 01:15:15,512 If it's not collision resistant then she could bid $100 1370 01:15:15,512 --> 01:15:16,805 and then find something else. 1371 01:15:16,805 --> 01:15:18,430 SRINIVAS DEVADAS: That's exactly right. 1372 01:15:18,430 --> 01:15:26,540 So CR, because Alice should not be 1373 01:15:26,540 --> 01:15:37,760 able to open this in multiple ways, right? 1374 01:15:37,760 --> 01:15:41,940 And in this case it's not TCR in the sense 1375 01:15:41,940 --> 01:15:45,350 that Alice controls what her bids are. 1376 01:15:45,350 --> 01:15:51,440 And so she might find a pair of bids that collide, correct? 1377 01:15:51,440 --> 01:15:55,840 She might realize that in this particular hash function, 1378 01:15:55,840 --> 01:16:01,000 you know $10,000 and a billion dollars collide, right? 1379 01:16:01,000 --> 01:16:04,450 And so she figures depending on what happens, 1380 01:16:04,450 --> 01:16:07,820 she's a billionaire, let's assume. 1381 01:16:07,820 --> 01:16:09,320 She's going to open the right thing. 1382 01:16:09,320 --> 01:16:11,320 She's a billionaire, but she doesn't necessarily 1383 01:16:11,320 --> 01:16:13,390 want to spend the billion, OK? 1384 01:16:13,390 --> 01:16:15,040 So that's that, right? 1385 01:16:15,040 --> 01:16:18,360 But I want more. 1386 01:16:18,360 --> 01:16:19,115 Go ahead. 1387 01:16:19,115 --> 01:16:21,590 AUDIENCE: You don't want it to be malleable. 1388 01:16:21,590 --> 01:16:23,482 Assuming that the auctioneer is not honest 1389 01:16:23,482 --> 01:16:25,690 because you don't want them to accept a bribe from someone 1390 01:16:25,690 --> 01:16:27,200 and then change everyone else's bid 1391 01:16:27,200 --> 01:16:29,485 to square root of whatever they bid. 1392 01:16:29,485 --> 01:16:31,110 SRINIVAS DEVADAS: That's exactly right. 1393 01:16:31,110 --> 01:16:34,480 Or plus 1, which is a great example, right? 1394 01:16:34,480 --> 01:16:37,050 So there you go. 1395 01:16:37,050 --> 01:16:38,000 I ran out of Frisbees.
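To make the collision-resistance point concrete, here is a small sketch of what Alice could do if the hash were weak. The weak hash is an assumption made purely for illustration: SHA-256 truncated to 16 bits, so that collisions are cheap to find. The helper names `weak_hash` and `find_colliding_bids` are hypothetical.

```python
import hashlib


def weak_hash(bid: int, bits: int = 16) -> int:
    """A deliberately weak hash: the top `bits` bits (at most 32) of SHA-256."""
    digest = hashlib.sha256(str(bid).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def find_colliding_bids(bits: int = 16, limit: int = 1_000_000):
    """Scan bids 1, 2, 3, ... and return the first pair with equal hashes."""
    seen = {}  # maps hash value -> first bid that produced it
    for bid in range(1, limit):
        tag = weak_hash(bid, bits)
        if tag in seen:
            return seen[tag], bid
        seen[tag] = bid
    return None


# Alice controls both bids, so any colliding pair is useful to her: she
# seals one hash value and later opens whichever bid suits her.
pair = find_colliding_bids()
low_bid, high_bid = pair
```

With a 16-bit output there are only 65,536 possible hash values, so a collision must appear among the first 65,537 bids, and in practice it shows up far sooner. A full-length hash is meant to make this same search infeasible.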
1396 01:16:38,000 --> 01:16:39,083 You can get one next time. 1397 01:16:42,610 --> 01:16:45,640 So yeah, I don't need this anymore. 1398 01:16:45,640 --> 01:16:47,020 You're exactly right. 1399 01:16:47,020 --> 01:16:49,790 There's another-- it turns out it's even more subtle than what 1400 01:16:49,790 --> 01:16:51,070 you just described. 1401 01:16:51,070 --> 01:16:54,470 And I think I might be able to point that out to you. 1402 01:16:54,470 --> 01:16:59,730 But let me just first describe this answer, which 1403 01:16:59,730 --> 01:17:02,960 gives us non-malleability. 1404 01:17:02,960 --> 01:17:06,130 So the claim is that you also want non-malleability 1405 01:17:06,130 --> 01:17:08,000 in your hash function. 1406 01:17:08,000 --> 01:17:14,147 And the simple reason is, given C of x-- and let's assume 1407 01:17:14,147 --> 01:17:14,980 that this is public. 1408 01:17:14,980 --> 01:17:16,646 It's certainly public to the auctioneer, 1409 01:17:16,646 --> 01:17:19,530 and it could be public to the other bidders as well. 1410 01:17:19,530 --> 01:17:23,370 Because the notion of sealing is that you've 1411 01:17:23,370 --> 01:17:24,372 sealed it using C of x. 1412 01:17:24,372 --> 01:17:26,580 But people can see the outside of the envelope, which 1413 01:17:26,580 --> 01:17:27,990 is C of x. 1414 01:17:27,990 --> 01:17:29,510 So everyone can see C of x. 1415 01:17:29,510 --> 01:17:32,250 You still want this to work, even though all other bidders 1416 01:17:32,250 --> 01:17:33,650 can see C of x. 1417 01:17:33,650 --> 01:17:44,990 So given C of x, it should not be possible to produce 1418 01:17:44,990 --> 01:17:48,110 C of x plus 1. 1419 01:17:48,110 --> 01:17:49,250 You don't know what x is. 1420 01:17:49,250 --> 01:17:54,050 But if you can produce C of x plus 1, you win, all right? 1421 01:17:54,050 --> 01:17:57,590 And so that's the problem. 1422 01:17:57,590 --> 01:18:04,930 Now it turns out you now say OK, am I done?
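As an aside on the non-malleability requirement, it may help to see what a malleable commitment looks like. The function below is a toy chosen only for illustration and is not from the lecture: it commits to x as G to the power x modulo a prime P, where `P`, `G`, `toy_commit`, and `outbid_without_knowing_x` are all hypothetical names and values.

```python
# Toy parameters (assumptions for illustration only).
P = 2**61 - 1   # a Mersenne prime used as the modulus
G = 3           # base of the exponentiation


def toy_commit(x: int) -> int:
    """Toy commitment C(x) = G^x mod P. Malleable by construction."""
    return pow(G, x, P)


def outbid_without_knowing_x(commitment: int) -> int:
    """Turn C(x) into C(x + 1) using only the public value C(x)."""
    return (commitment * G) % P


alice_sealed = toy_commit(10_000)

# A rival never learns Alice's bid, yet can seal a bid exactly one higher,
# because G^(x+1) = G^x * G.
rival_sealed = outbid_without_knowing_x(alice_sealed)
matches_plus_one = rival_sealed == toy_commit(10_001)
```

The algebraic structure is what allows the transformation. A non-malleable hash is one where no comparably useful relationship between C of x and C of x plus 1 can be exploited.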
1423 01:18:04,930 --> 01:18:06,930 I want these three properties. 1424 01:18:06,930 --> 01:18:10,350 And I'm done, right? 1425 01:18:10,350 --> 01:18:13,060 There's a little subtlety here which 1426 01:18:13,060 --> 01:18:15,750 these properties don't capture. 1427 01:18:15,750 --> 01:18:18,290 So that's why there's more here. 1428 01:18:18,290 --> 01:18:21,770 And I don't mean to titillate, because I'll 1429 01:18:21,770 --> 01:18:24,000 tell you what is missing here. 1430 01:18:24,000 --> 01:18:29,370 But let's say that I have a hash function that looks like this. 1431 01:18:33,600 --> 01:18:39,970 And this here is non-malleable. 1432 01:18:39,970 --> 01:18:41,690 It is collision resistant. 1433 01:18:41,690 --> 01:18:43,290 And it's one-way, all right? 1434 01:18:43,290 --> 01:18:46,730 So h of x has all these wonderful properties, 1435 01:18:46,730 --> 01:18:48,710 all right? 1436 01:18:48,710 --> 01:18:52,160 I'm creating an h prime x that looks 1437 01:18:52,160 --> 01:18:54,660 like this, which is a concatenation of h 1438 01:18:54,660 --> 01:19:00,210 of x, and giving away the most significant bit of x, which 1439 01:19:00,210 --> 01:19:01,670 is my bid, right? 1440 01:19:01,670 --> 01:19:03,780 I'm just giving that away, right? 1441 01:19:03,780 --> 01:19:08,190 The problem here is that we haven't really 1442 01:19:08,190 --> 01:19:11,660 made our properties broad enough to solve 1443 01:19:11,660 --> 01:19:14,230 this particular application to the extent 1444 01:19:14,230 --> 01:19:19,140 that there's contrived cases where these properties aren't 1445 01:19:19,140 --> 01:19:20,420 enough, OK? 1446 01:19:20,420 --> 01:19:22,180 And the reason is simple. 1447 01:19:22,180 --> 01:19:30,000 h prime x is arguably NM, CR, and OW. 1448 01:19:30,000 --> 01:19:32,660 And I won't go into each of those arguments. 1449 01:19:32,660 --> 01:19:36,630 But you can think about it, right?
1450 01:19:36,630 --> 01:19:40,030 If I'm just giving you one bit, there's 159 others, 1451 01:19:40,030 --> 01:19:42,140 there's a couple of hundred others, whatever it 1452 01:19:42,140 --> 01:19:43,860 is that I have in the domain. 1453 01:19:43,860 --> 01:19:46,230 It's not going to be invertible. 1454 01:19:46,230 --> 01:19:49,420 h prime x is not going to be invertible if h of x 1455 01:19:49,420 --> 01:19:51,080 is not invertible. 1456 01:19:51,080 --> 01:19:57,880 h prime x is not going to be breakable in terms of collision 1457 01:19:57,880 --> 01:20:00,950 resistance if h of x is not breakable, 1458 01:20:00,950 --> 01:20:02,450 and so on and so forth. 1459 01:20:02,450 --> 01:20:04,740 But if I had a hash function like that, 1460 01:20:04,740 --> 01:20:09,340 is it a good hash function for my commitment application? 1461 01:20:09,340 --> 01:20:10,090 No, obviously not. 1462 01:20:10,090 --> 01:20:12,298 Because if I publicize this hash function-- remember, 1463 01:20:12,298 --> 01:20:13,890 everything is public here with respect 1464 01:20:13,890 --> 01:20:18,030 to h and h prime-- you are giving away the most 1465 01:20:18,030 --> 01:20:21,360 significant bit that corresponds to your bid 1466 01:20:21,360 --> 01:20:23,350 in this particular hash function, right? 1467 01:20:23,350 --> 01:20:33,170 So you really need a little bit more than these for secrecy, 1468 01:20:33,170 --> 01:20:34,200 for true secrecy. 1469 01:20:37,510 --> 01:20:39,890 But in the context of this example, 1470 01:20:39,890 --> 01:20:41,770 I mean it's common sense that you would not 1471 01:20:41,770 --> 01:20:43,550 use a hash function like that, right? 1472 01:20:43,550 --> 01:20:46,950 So it's not that there's anything profound here. 1473 01:20:46,950 --> 01:20:48,540 It's just that I want to make sure 1474 01:20:48,540 --> 01:20:51,480 that you understand the nuances of the properties 1475 01:20:51,480 --> 01:20:52,580 that we're requiring.
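The contrived h prime construction can be written down directly. This is a sketch under assumptions that go beyond what the lecture states: bids are fixed-width 32-bit integers, SHA-256 stands in for h, and the leaked bit is appended as one extra byte. The names `h`, `h_prime`, and `leaked_msb` are hypothetical.

```python
import hashlib

BID_BITS = 32  # assumed fixed width of a bid, so "most significant bit" is well defined


def h(bid: int) -> bytes:
    """Stand-in for a good hash h(x): SHA-256 of the bid's 32-bit encoding."""
    return hashlib.sha256(bid.to_bytes(BID_BITS // 8, "big")).digest()


def h_prime(bid: int) -> bytes:
    """h'(x) = h(x) concatenated with the most significant bit of x."""
    msb = bid >> (BID_BITS - 1)
    return h(bid) + bytes([msb])


def leaked_msb(commitment: bytes) -> int:
    """What any observer can read off a published h'(x)."""
    return commitment[-1]


small_bid = h_prime(10_000)           # below 2**31, so the top bit is 0
large_bid = h_prime(3_000_000_000)    # at least 2**31, so the top bit is 1

observer_sees = (leaked_msb(small_bid), leaked_msb(large_bid))
```

An observer cannot recover either bid from the hash part, yet learns at once which bidder is in the upper half of the range. That is the gap between the three listed properties and the secrecy the auction actually needs.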
1476 01:20:52,580 --> 01:20:55,560 We had all the requirements corresponding 1477 01:20:55,560 --> 01:20:58,900 to the definitions of NM and CR and OW. 1478 01:20:58,900 --> 01:21:01,150 And you need a little bit more for this example, where 1479 01:21:01,150 --> 01:21:04,300 you have to say something, perhaps informally, 1480 01:21:04,300 --> 01:21:10,870 like the bits of your auction bid are scrambled in the final hash 1481 01:21:10,870 --> 01:21:14,010 output, which most hash functions should do anyway, 1482 01:21:14,010 --> 01:21:15,730 and h of x will definitely do. 1483 01:21:15,730 --> 01:21:19,290 But you kind of unscrambled it by adding this little thing 1484 01:21:19,290 --> 01:21:22,210 in here, corresponding to the most significant bit, 1485 01:21:22,210 --> 01:21:23,050 all right? 1486 01:21:23,050 --> 01:21:25,480 So I'll stop with that. 1487 01:21:25,480 --> 01:21:29,760 Let me just say that the operation-- or sorry, 1488 01:21:29,760 --> 01:21:33,590 the work involved in creating hash functions that 1489 01:21:33,590 --> 01:21:37,430 are poly-time computable is research work. 1490 01:21:37,430 --> 01:21:40,290 People put up hash functions and they get broken, 1491 01:21:40,290 --> 01:21:43,770 like MD4 was put up in '92 and then got broken, SHA-1 and so 1492 01:21:43,770 --> 01:21:44,700 on and so forth. 1493 01:21:44,700 --> 01:21:49,580 And so I just encourage you to look up SHA-3 and just take 1494 01:21:49,580 --> 01:21:52,480 a quick scan of what the complexity of SHA-3 1495 01:21:52,480 --> 01:21:56,820 is with respect to computing the hash given an arbitrary string, 1496 01:21:56,820 --> 01:21:57,590 all right? 1497 01:21:57,590 --> 01:21:59,575 I'll stick around for questions.
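For anyone following up on the SHA-3 suggestion, a quick way to experiment is Python's standard `hashlib` module, which provides `sha3_256`. The wrapper name `sha3_hex` and the sample inputs below are hypothetical, chosen only to show that inputs of very different lengths map to outputs of one fixed length.

```python
import hashlib


def sha3_hex(message: bytes) -> str:
    """Return the SHA3-256 digest of an arbitrary byte string as hex."""
    return hashlib.sha3_256(message).hexdigest()


# A seven-byte input and a million-byte input both yield 256-bit digests.
short_digest = sha3_hex(b"bid: 20")
long_digest = sha3_hex(b"x" * 1_000_000)

digest_lengths = (len(short_digest), len(long_digest))
```

Timing `sha3_hex` on inputs of increasing size is one informal way to see how the cost of computing the hash grows with the length of the string.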