1 00:00:00,070 --> 00:00:02,430 The following content is provided under a Creative 2 00:00:02,430 --> 00:00:03,810 Commons license. 3 00:00:03,810 --> 00:00:06,060 Your support will help MIT OpenCourseWare 4 00:00:06,060 --> 00:00:10,150 continue to offer high quality educational resources for free. 5 00:00:10,150 --> 00:00:12,700 To make a donation, or to view additional materials 6 00:00:12,700 --> 00:00:16,600 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,600 --> 00:00:17,266 at ocw.mit.edu. 8 00:00:26,290 --> 00:00:28,000 PROFESSOR: This is very exciting. 9 00:00:28,000 --> 00:00:29,830 In the previous lecture, we learned 10 00:00:29,830 --> 00:00:32,600 all about buffer overflow attacks, 11 00:00:32,600 --> 00:00:34,080 and today we're going to continue 12 00:00:34,080 --> 00:00:37,430 to discuss some techniques to launch these attacks. 13 00:00:37,430 --> 00:00:41,210 So, the basic idea of all these buffer overflow attacks 14 00:00:41,210 --> 00:00:42,460 is as follows. 15 00:00:42,460 --> 00:00:46,360 So, first of all, they leverage a couple different facts. 16 00:00:54,080 --> 00:00:58,170 So, one thing that they leverage is that system software 17 00:00:58,170 --> 00:01:00,191 is often written in C. 18 00:01:07,990 --> 00:01:10,248 And so by system software, I mean things 19 00:01:10,248 --> 00:01:12,346 like databases, compilers, network servers, 20 00:01:12,346 --> 00:01:15,212 things like that. 21 00:01:15,212 --> 00:01:17,670 And you can also think of things like your favorite command 22 00:01:17,670 --> 00:01:18,680 shell. 23 00:01:18,680 --> 00:01:21,302 All of those types of things are typically written in C. So, 24 00:01:21,302 --> 00:01:23,135 why are these things typically written in C? 25 00:01:23,135 --> 00:01:25,610 Well, they're written in C because our community, 26 00:01:25,610 --> 00:01:28,085 of course, is obsessed with speed. 
27 00:01:28,085 --> 00:01:31,070 And so C is supposed to be like high-level assembly, 28 00:01:31,070 --> 00:01:34,010 it takes us very close to the hardware, and so as a result, 29 00:01:34,010 --> 00:01:36,410 all these very mission critical systems 30 00:01:36,410 --> 00:01:38,690 are written in this very low level language. 31 00:01:38,690 --> 00:01:42,260 Now, the problem with things being written in C 32 00:01:42,260 --> 00:01:48,705 is that C actually exposes raw memory addresses. 33 00:01:57,360 --> 00:01:57,860 Right? 34 00:01:57,860 --> 00:02:00,880 And so not only does it expose raw memory addresses, 35 00:02:00,880 --> 00:02:04,750 but it also performs no bounds checking when programs 36 00:02:04,750 --> 00:02:06,500 manipulate those raw addresses. 37 00:02:06,500 --> 00:02:07,000 Right? 38 00:02:07,000 --> 00:02:09,820 And so as you can imagine, this is a recipe for disaster. 39 00:02:09,820 --> 00:02:10,490 OK? 40 00:02:10,490 --> 00:02:13,260 So, once again, why doesn't C check these bounds? 41 00:02:13,260 --> 00:02:15,830 Well, one reason is because the hardware doesn't do that. 42 00:02:15,830 --> 00:02:17,670 And people who write in C typically 43 00:02:17,670 --> 00:02:20,870 want the max amount of speed possible. 44 00:02:20,870 --> 00:02:23,510 The other reason is that in C, as we'll discuss later, 45 00:02:23,510 --> 00:02:25,520 it can actually be very difficult to determine 46 00:02:25,520 --> 00:02:28,050 the semantics of what it means to have a pointer that's 47 00:02:28,050 --> 00:02:29,270 actually in bounds. 48 00:02:29,270 --> 00:02:31,720 So, in some cases, it would be very difficult for the C 49 00:02:31,720 --> 00:02:33,490 runtime to automatically do that. 50 00:02:33,490 --> 00:02:35,366 Now we'll discuss some techniques 51 00:02:35,366 --> 00:02:36,990 which will actually try to do that type 52 00:02:36,990 --> 00:02:37,950 of automatic inference. 
53 00:02:37,950 --> 00:02:39,700 But as we'll see, none of these techniques 54 00:02:39,700 --> 00:02:42,310 are fully bulletproof. 55 00:02:42,310 --> 00:02:48,130 And so these attacks also leverage knowledge 56 00:02:48,130 --> 00:02:51,025 of the x86 architecture. 57 00:02:56,884 --> 00:02:58,425 And by knowledge of that architecture 58 00:02:58,425 --> 00:03:01,350 I mean things like what's the direction that the stack grows, 59 00:03:01,350 --> 00:03:02,260 right? 60 00:03:02,260 --> 00:03:04,632 What are the calling conventions for functions? 61 00:03:04,632 --> 00:03:06,590 When you invoke a C function, what is the stack 62 00:03:06,590 --> 00:03:07,407 going to look like? 63 00:03:07,407 --> 00:03:09,240 And when you allocate an object on the heap, 64 00:03:09,240 --> 00:03:12,380 what are those heap allocation structures going to look like? 65 00:03:12,380 --> 00:03:15,040 And so let's look at a simple example. 66 00:03:15,040 --> 00:03:18,850 It's very similar to something that you 67 00:03:18,850 --> 00:03:21,250 saw in the last lecture. 68 00:03:21,250 --> 00:03:25,240 So, we've got your standard read request up here. 69 00:03:28,610 --> 00:03:30,845 And then you've got a buffer. 70 00:03:30,845 --> 00:03:31,760 That's here. 71 00:03:34,980 --> 00:03:37,382 And by now you've probably trained your lizard brain 72 00:03:37,382 --> 00:03:39,590 instincts-- whenever you see a buffer you're probably 73 00:03:39,590 --> 00:03:41,506 filled with fear-- that is the right attitude. 74 00:03:41,506 --> 00:03:43,380 And so we've got the buffer up here, 75 00:03:43,380 --> 00:03:48,000 and then we've got the canonical int i. 76 00:03:48,000 --> 00:03:51,540 And then we've got the infamous "gets" command. 77 00:03:56,780 --> 00:03:58,530 And then you've got some other stuff here. 78 00:03:58,530 --> 00:03:59,029 Right? 79 00:03:59,029 --> 00:04:01,996 So as we discussed in lecture last week, 80 00:04:01,996 --> 00:04:03,120 this is problematic, right? 
81 00:04:03,120 --> 00:04:05,210 Because this gets operation here does not actually 82 00:04:05,210 --> 00:04:06,990 check the bounds on the buffer. 83 00:04:06,990 --> 00:04:11,680 So, what can happen is that if the user actually supplies 84 00:04:11,680 --> 00:04:17,000 the buffer-- and actually put that guy up here, for example-- 85 00:04:17,000 --> 00:04:19,660 if that buffer comes in from the user 86 00:04:19,660 --> 00:04:21,579 and we use this unsafe function here, 87 00:04:21,579 --> 00:04:23,910 we can actually overflow this buffer. 88 00:04:23,910 --> 00:04:26,800 We can actually rewrite stuff that's on the stack. 89 00:04:26,800 --> 00:04:29,160 So, just a reminder of what that stuff looks 90 00:04:29,160 --> 00:04:36,960 like-- let's look at a stack diagram here-- so 91 00:04:36,960 --> 00:04:55,680 let's say here we've got i. Let's say here we've got buf. 92 00:04:55,680 --> 00:04:56,180 Right? 93 00:04:56,180 --> 00:04:58,250 So, we've got the first address of buffer here. 94 00:04:58,250 --> 00:04:59,862 We've got the last one up here. 95 00:04:59,862 --> 00:05:01,320 I apologize for my handwriting, I'm 96 00:05:01,320 --> 00:05:02,610 used to writing on the marker board. 97 00:05:02,610 --> 00:05:03,568 You should pray for me. 98 00:05:03,568 --> 00:05:11,235 So, anyways, then up here, we've got the saved value 99 00:05:11,235 --> 00:05:14,870 of the base pointer. 100 00:05:18,120 --> 00:05:26,770 We've got the return address for the function there. 101 00:05:26,770 --> 00:05:32,089 And then we've got some other stuff from the previous frame. 102 00:05:36,220 --> 00:05:41,180 So, don't forget, we've got the stack pointer, 103 00:05:41,180 --> 00:05:42,020 which goes there. 104 00:05:45,150 --> 00:05:52,540 And then we've got the new base pointer, which goes here. 
105 00:06:02,960 --> 00:06:04,780 The entry stack pointer goes there, 106 00:06:04,780 --> 00:06:06,290 and then somewhere up here, we've 107 00:06:06,290 --> 00:06:09,216 got the entry base pointer. 108 00:06:14,430 --> 00:06:14,930 Right? 109 00:06:14,930 --> 00:06:17,790 So, just as a reminder, the way that the stack overflow works 110 00:06:17,790 --> 00:06:22,031 is that basically, it goes this way. 111 00:06:22,031 --> 00:06:22,530 Right? 112 00:06:22,530 --> 00:06:24,670 So, when the gets operation is called, 113 00:06:24,670 --> 00:06:27,370 we start writing bytes into buf, and eventually it's 114 00:06:27,370 --> 00:06:31,604 going to start overwriting these things that are on the stack. 115 00:06:31,604 --> 00:06:33,145 And so this is basically-- should all 116 00:06:33,145 --> 00:06:34,710 look pretty familiar to you. 117 00:06:34,710 --> 00:06:36,190 So. 118 00:06:36,190 --> 00:06:38,830 What does the attacker do to take advantage of that? 119 00:06:38,830 --> 00:06:40,720 Basically supplies that long input. 120 00:06:40,720 --> 00:06:45,452 And so the key idea here is that this can be attacker-supplied. 121 00:06:45,452 --> 00:06:48,690 And so, if this return address is attacker-supplied, then 122 00:06:48,690 --> 00:06:50,590 basically the attacker can determine 123 00:06:50,590 --> 00:06:52,200 where this function's going to jump to 124 00:06:52,200 --> 00:06:53,780 after [INAUDIBLE] execution. 125 00:06:53,780 --> 00:06:57,410 So, what can the attacker do once it's actually 126 00:06:57,410 --> 00:06:59,506 been able to hijack that return address, 127 00:06:59,506 --> 00:07:00,630 and jump wherever it wants. 128 00:07:00,630 --> 00:07:02,635 Well, basically the attacker is now 129 00:07:02,635 --> 00:07:05,495 running code with the privileges of the process 130 00:07:05,495 --> 00:07:07,360 that it's just hijacked, for example. 
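[Editor's sketch] The stack diagram and overflow just described can be modeled without touching a real stack. The code below is a minimal simulation, not the code from the board: the frame is a plain byte array, the 128-byte buffer size, the 32-bit slot widths, the two made-up addresses, and the names `simulate_overflow` and `unchecked_copy` are all assumptions for illustration. It shows that a copy with no bounds check, the way `gets` behaves, reaches the saved base pointer and then the return address once the input is long enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated frame, lowest address first (all sizes are assumptions):
 *   [0..3]     int i
 *   [4..131]   buf[128]
 *   [132..135] saved base pointer
 *   [136..139] return address
 */
enum { I_OFF = 0, BUF_OFF = 4, BUF_LEN = 128,
       EBP_OFF = 132, RET_OFF = 136, FRAME_LEN = 140 };

static uint32_t load32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void store32(unsigned char *p, uint32_t v) {
    memcpy(p, &v, sizeof v);
}

/* Copies n bytes into buf with no check against BUF_LEN, like gets().
 * The assert only keeps the simulation inside its own array. */
static void unchecked_copy(unsigned char *frame,
                           const unsigned char *src, size_t n) {
    assert((size_t)BUF_OFF + n <= (size_t)FRAME_LEN);
    memcpy(frame + BUF_OFF, src, n);
}

/* Builds a frame, "reads" input_len attacker bytes into buf, and
 * returns whatever return address the frame holds afterwards. */
uint32_t simulate_overflow(size_t input_len, uint32_t attacker_addr) {
    unsigned char frame[FRAME_LEN];
    unsigned char input[FRAME_LEN - BUF_OFF];

    memset(frame, 0, sizeof frame);
    store32(frame + EBP_OFF, 0xbffff000u);   /* made-up saved base pointer */
    store32(frame + RET_OFF, 0x08048123u);   /* made-up legitimate return */

    if (input_len > sizeof input)
        input_len = sizeof input;
    memset(input, 'A', input_len);
    /* A long enough input carries the attacker's address in the bytes
     * that land on the return-address slot. */
    if (input_len >= (size_t)(RET_OFF - BUF_OFF) + 4)
        store32(input + (RET_OFF - BUF_OFF), attacker_addr);

    unchecked_copy(frame, input, input_len);
    return load32(frame + RET_OFF);
}
```

Calling `simulate_overflow(100, addr)` leaves the original return address in place, since 100 bytes fit in the buffer; calling it with 136 bytes replaces the return address with `addr`.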
131 00:07:07,360 --> 00:07:10,438 So, if that process was a high priority process, 132 00:07:10,438 --> 00:07:12,662 let's say it was running root, or admin, whatever 133 00:07:12,662 --> 00:07:15,120 they call the super-user of your favorite operating system, 134 00:07:15,120 --> 00:07:18,420 then now, that program, which is controlled by the attacker, 135 00:07:18,420 --> 00:07:22,370 can do whatever it wants using the authority 136 00:07:22,370 --> 00:07:24,110 of that high-priority program. 137 00:07:24,110 --> 00:07:26,420 So, it can do things, like it could maybe read files, 138 00:07:26,420 --> 00:07:29,490 it can send spam, let's say if you corrupted a mail server. 139 00:07:29,490 --> 00:07:33,245 It can even do things like actually defeat firewalls, 140 00:07:33,245 --> 00:07:35,250 right, because the idea of a firewall 141 00:07:35,250 --> 00:07:36,610 is that there's going to be this distinction 142 00:07:36,610 --> 00:07:38,693 between good machines that are behind the firewall 143 00:07:38,693 --> 00:07:40,880 and bad machines that are outside of the firewall. 144 00:07:40,880 --> 00:07:43,460 So, typically machines are inside of the firewall, 145 00:07:43,460 --> 00:07:45,424 they have a lot of trust with each other. 146 00:07:45,424 --> 00:07:46,965 But if you can subvert a machine that 147 00:07:46,965 --> 00:07:50,080 is actually inside the firewall, right, that's great. 148 00:07:50,080 --> 00:07:52,880 Because now you can just sort of skip past a lot of those checks 149 00:07:52,880 --> 00:07:55,590 that those machines don't have because they think that you're 150 00:07:55,590 --> 00:07:57,420 a trusted individual. 
151 00:07:57,420 --> 00:07:59,239 So, one thing you might be thinking, 152 00:07:59,239 --> 00:08:01,530 or I remember I was thinking this when I was a student, 153 00:08:01,530 --> 00:08:02,960 was, "OK, fine, so I've showed you 154 00:08:02,960 --> 00:08:05,100 how to do this buffer overflow, but why 155 00:08:05,100 --> 00:08:06,990 didn't the OS stop this? 156 00:08:06,990 --> 00:08:07,490 Right? 157 00:08:07,490 --> 00:08:09,365 Isn't the OS supposed to be that thing that's 158 00:08:09,365 --> 00:08:11,489 sort of sitting around like Guardians of the Galaxy 159 00:08:11,489 --> 00:08:13,950 and defending all this kind of evil stuff from happening?" 160 00:08:13,950 --> 00:08:18,310 The thing to note is that the OS actually isn't watching you 161 00:08:18,310 --> 00:08:19,411 all the time. 162 00:08:19,411 --> 00:08:19,910 Right? 163 00:08:19,910 --> 00:08:21,817 The hardware is watching all the time. 164 00:08:21,817 --> 00:08:24,025 It's the thing that's actually fetching instructions, 165 00:08:24,025 --> 00:08:26,180 and decoding them, and doing things like that. 166 00:08:26,180 --> 00:08:29,430 But to a first approximation, what does the OS do? 167 00:08:29,430 --> 00:08:31,790 It basically sets up some page table stuff, 168 00:08:31,790 --> 00:08:33,789 and then it basically lets you, the application, 169 00:08:33,789 --> 00:08:36,679 run, and if you ask the operating system for services-- 170 00:08:36,679 --> 00:08:38,840 so for example, you want to send a network packet, 171 00:08:38,840 --> 00:08:41,634 or you want to do some IPC, or things like that, 172 00:08:41,634 --> 00:08:43,360 then you'll invoke a system call, 173 00:08:43,360 --> 00:08:45,280 and you'll actually trap into OS. 174 00:08:45,280 --> 00:08:47,020 But other than that, the operating system 175 00:08:47,020 --> 00:08:49,780 is not looking at each and every instruction 176 00:08:49,780 --> 00:08:52,510 that your application is executing. 
177 00:08:52,510 --> 00:08:56,070 So, in other words, when this buffer overflowed, 178 00:08:56,070 --> 00:08:57,490 it's not like the operating system 179 00:08:57,490 --> 00:09:00,000 was looking at each of these memory accesses for signs 180 00:09:00,000 --> 00:09:00,730 that [INAUDIBLE]. 181 00:09:00,730 --> 00:09:01,230 Right? 182 00:09:01,230 --> 00:09:02,604 All of this address space belongs 183 00:09:02,604 --> 00:09:04,570 to you, this [INAUDIBLE] process right, 184 00:09:04,570 --> 00:09:06,430 so you get to do with it what you want to do with it, right? 185 00:09:06,430 --> 00:09:08,971 Or at least this is the whole C attitude towards life, right? 186 00:09:08,971 --> 00:09:10,080 Live fast, die young. 187 00:09:10,080 --> 00:09:10,600 So. 188 00:09:10,600 --> 00:09:14,090 That's why the operating system can't help you right there. 189 00:09:14,090 --> 00:09:17,590 So, later in the lecture, we will discuss some things 190 00:09:17,590 --> 00:09:21,000 that the operating system can do with respect to the hardware 191 00:09:21,000 --> 00:09:23,560 so that it can help protect against these types of attacks. 192 00:09:23,560 --> 00:09:25,690 Once again, it's actually just the hardware 193 00:09:25,690 --> 00:09:27,980 that's interposing on every little thing that you do. 194 00:09:27,980 --> 00:09:29,130 So, you can actually take advantage 195 00:09:29,130 --> 00:09:30,650 of some of that stuff, for example, 196 00:09:30,650 --> 00:09:31,730 using special types of [INAUDIBLE] protections 197 00:09:31,730 --> 00:09:34,229 and things like that, that we'll discuss a little bit later. 198 00:09:35,070 --> 00:09:37,710 That's basically an overview of what 199 00:09:37,710 --> 00:09:39,500 the buffer overflow looks like. 200 00:09:39,500 --> 00:09:41,820 So, how are we gonna fix these things? 201 00:09:41,820 --> 00:09:49,010 So, one fix for avoiding buffer overflow 202 00:09:49,010 --> 00:09:54,180 is to simply avoid bugs in your C code. 
203 00:09:59,620 --> 00:10:02,360 This has the nice advantage of being correct by construction, 204 00:10:02,360 --> 00:10:02,580 right. 205 00:10:02,580 --> 00:10:04,371 If you don't have any bugs in your program, 206 00:10:04,371 --> 00:10:06,950 ipso facto the attacker cannot take advantage of any bugs. 207 00:10:06,950 --> 00:10:08,490 That's on the professor, I get paid 208 00:10:08,490 --> 00:10:10,240 to think about something deeply like that. 209 00:10:10,240 --> 00:10:13,180 Now, this of course, is easier said than done. 210 00:10:13,180 --> 00:10:13,680 Right? 211 00:10:13,680 --> 00:10:15,220 There's a couple of very straightforward things 212 00:10:15,220 --> 00:10:17,980 that programmers can do to practice good security hygiene. 213 00:10:17,980 --> 00:10:21,300 So, for example, functions like this gets function, right? 214 00:10:21,300 --> 00:10:22,800 These are kind of like go-tos, these 215 00:10:22,800 --> 00:10:24,381 are now known to be bad ideas. 216 00:10:24,381 --> 00:10:24,880 Right? 217 00:10:24,880 --> 00:10:27,350 So, when you compile your code, and you include functions 218 00:10:27,350 --> 00:10:30,290 like this-- if you're using a modern compiler, GCC, 219 00:10:30,290 --> 00:10:33,050 Visual Studio, whatever, it will actually complain about that. 220 00:10:33,050 --> 00:10:35,510 It'll say, hey, you're using one of these unsafe functions. 221 00:10:35,510 --> 00:10:37,526 Consider using fgets, or using 222 00:10:37,526 --> 00:10:39,590 a version of [INAUDIBLE] that actually 223 00:10:39,590 --> 00:10:41,270 can track the bounds of things. 224 00:10:41,270 --> 00:10:43,800 So, that's one simple thing that programmers can do. 225 00:10:43,800 --> 00:10:45,640 But note that a lot of applications 226 00:10:45,640 --> 00:10:48,319 actually manipulate buffers without necessarily 227 00:10:48,319 --> 00:10:49,610 calling one of these functions. 228 00:10:49,610 --> 00:10:50,110 Right? 
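[Editor's sketch] As a rough illustration of the "use a bounded function instead of gets" advice, the helper below wraps the standard `fgets`, which takes the buffer capacity and never stores more than that many bytes. The helper name `read_line_bounded` and its return convention are assumptions made for this sketch, not something from the lecture.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads at most cap - 1 characters of one line from in into buf and
 * NUL-terminates it. A trailing newline, if one was read, is removed.
 * Returns the number of characters stored, or -1 on EOF or error. */
int read_line_bounded(FILE *in, char *buf, size_t cap) {
    assert(in != NULL && buf != NULL && cap > 0);
    if (fgets(buf, (int)cap, in) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';
    return (int)len;
}
```

With an 8-byte buffer and a 24-character input line, the call stores 7 characters plus the terminator and leaves the rest of the line in the stream, so nothing past the end of the buffer is written. A caller still has to decide what to do with the unread remainder.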
229 00:10:50,110 --> 00:10:52,540 This is very common in network servers, things like that. 230 00:10:52,540 --> 00:10:54,360 They'll define their own custom parsing routines, 231 00:10:54,360 --> 00:10:55,565 then make sure that things are extracted 232 00:10:55,565 --> 00:10:57,110 from the buffers in the way that they want. 233 00:10:57,110 --> 00:10:59,480 So, just restricting yourself to these types of things 234 00:10:59,480 --> 00:11:03,350 won't solve the problem completely. 235 00:11:03,350 --> 00:11:07,340 So, another thing that makes this approach difficult 236 00:11:07,340 --> 00:11:12,114 is that it's not always obvious what is a bug in a C program. 237 00:11:12,114 --> 00:11:14,655 So, if you've ever worked on a very large scale system that's 238 00:11:14,655 --> 00:11:17,030 been written in C, you'll know that it can be tricky 239 00:11:17,030 --> 00:11:20,060 if you've got some function definition that takes in 240 00:11:20,060 --> 00:11:22,080 18 void star pointers. 241 00:11:22,080 --> 00:11:24,620 I mean, only Zeus knows what all those things mean, right? 242 00:11:24,620 --> 00:11:27,490 And so it's much more difficult in a language like C, 243 00:11:27,490 --> 00:11:30,097 that has weak typing and things like that, 244 00:11:30,097 --> 00:11:31,680 to actually understand as a programmer 245 00:11:31,680 --> 00:11:33,006 what it means to have a bug, and what 246 00:11:33,006 --> 00:11:34,320 it means to not have a bug. 247 00:11:34,320 --> 00:11:34,819 OK? 248 00:11:34,819 --> 00:11:36,990 So, in general, one of the main themes 249 00:11:36,990 --> 00:11:39,920 that you'll see in this class is that C is probably 250 00:11:39,920 --> 00:11:42,498 the spawn of the devil, right? 251 00:11:42,498 --> 00:11:44,710 And we use it because, once again, 252 00:11:44,710 --> 00:11:46,630 we typically want to be fast, right? 
253 00:11:46,630 --> 00:11:48,430 But as hardware gets faster and as we 254 00:11:48,430 --> 00:11:51,780 get more and better languages to write large-scale systems code, 255 00:11:51,780 --> 00:11:54,090 we'll see that maybe it doesn't always 256 00:11:54,090 --> 00:11:56,670 make sense to write your stuff in C. Even 257 00:11:56,670 --> 00:11:58,300 if you think it has to be fast. 258 00:11:58,300 --> 00:12:01,400 So, we'll discuss some of that in later lectures. 259 00:12:01,400 --> 00:12:03,549 So, that's one approach, avoiding bugs 260 00:12:03,549 --> 00:12:04,340 in the first place. 261 00:12:04,340 --> 00:12:15,170 So, another approach-- is to build tools that 262 00:12:15,170 --> 00:12:18,090 allow programmers to find bugs. 263 00:12:26,670 --> 00:12:29,530 And so an example of this is something 264 00:12:29,530 --> 00:12:30,926 that's called static analysis. 265 00:12:30,926 --> 00:12:33,175 Now we'll talk a little bit more about static analysis 266 00:12:33,175 --> 00:12:34,805 in later lectures, but suffice it 267 00:12:34,805 --> 00:12:38,400 to say that static analysis is a way of analyzing the source 268 00:12:38,400 --> 00:12:40,640 code of your program before it even runs 269 00:12:40,640 --> 00:12:42,530 and looking for potential problems. 270 00:12:42,530 --> 00:12:46,550 So, imagine that you have a function like this. 271 00:12:46,550 --> 00:12:50,342 So, the [INAUDIBLE] foo function, 272 00:12:50,342 --> 00:12:52,640 it takes in a pointer. 273 00:12:56,420 --> 00:12:59,570 Let's say it declares an integer offset value. 274 00:13:02,810 --> 00:13:10,110 It declares another pointer and adds the offset 275 00:13:10,110 --> 00:13:11,540 to that pointer. 276 00:13:11,540 --> 00:13:13,620 Now, even just at this moment in the code, 277 00:13:13,620 --> 00:13:15,650 right, static analysis can tell you 278 00:13:15,650 --> 00:13:18,110 that this offset variable is uninitialized. 279 00:13:18,110 --> 00:13:18,610 Right? 
280 00:13:18,610 --> 00:13:20,700 So, essentially you can do things like saying, 281 00:13:20,700 --> 00:13:22,375 is there any way, is there any control 282 00:13:22,375 --> 00:13:26,150 flow through this program by which offset could have been 283 00:13:26,150 --> 00:13:28,410 initialized before it was actually used 284 00:13:28,410 --> 00:13:29,660 in this calculation here. 285 00:13:29,660 --> 00:13:32,540 Now, in this example it is very simple to see the answer is no. 286 00:13:32,540 --> 00:13:32,770 Right? 287 00:13:32,770 --> 00:13:34,960 You can imagine that if there were more branches, or things 288 00:13:34,960 --> 00:13:36,630 like this, it would be more difficult to tell. 289 00:13:36,630 --> 00:13:39,130 But one thing that a static analysis tool can tell you, 290 00:13:39,130 --> 00:13:41,090 and in fact, one thing that [? popular ?] compilers will 291 00:13:41,090 --> 00:13:43,610 tell you, is you'll compile this, and it'll say, hey buddy, 292 00:13:43,610 --> 00:13:45,190 this has not been initialized. 293 00:13:45,190 --> 00:13:46,940 Are you sure, is this what you want to do? 294 00:13:46,940 --> 00:13:49,330 So, that's one very simple example of static analysis. 295 00:13:49,330 --> 00:13:53,330 Another example of what you can do is, let's say after this, 296 00:13:53,330 --> 00:13:54,960 we have a branch condition here. 297 00:14:02,260 --> 00:14:02,760 Right? 298 00:14:02,760 --> 00:14:06,240 So, you say, if the offset is greater than eight, then 299 00:14:06,240 --> 00:14:12,470 we'll call some function bar, passing in the offset. 300 00:14:12,470 --> 00:14:14,090 Now, one thing you can note about this 301 00:14:14,090 --> 00:14:17,620 is that this branch condition here actually tells us 302 00:14:17,620 --> 00:14:20,160 something about what the value of offset is. 303 00:14:20,160 --> 00:14:20,730 Right? 
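[Editor's sketch] The function described on the board might look roughly like `foo_buggy` below. The names `p` and `z`, the body of `bar`, and the repaired variant `foo_fixed` are assumptions added for illustration. The buggy version is kept behind a `SHOW_BUG` macro (also an assumption) so the rest of the block compiles and runs without reading an uninitialized variable; building with `-DSHOW_BUG -Wall` on GCC or Clang may produce an uninitialized-use warning of the kind mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee. Because the only call sites sit behind an
 * "offset > 8" branch, an analysis of bar may assume that constraint. */
int bar(int offset) {
    assert(offset > 8);
    return offset - 8;
}

#ifdef SHOW_BUG
/* As described in the lecture: offset is read before any assignment. */
void foo_buggy(int *p) {
    int offset;
    int *z = p + offset;          /* uninitialized use */
    if (offset > 8)
        bar(offset);
    (void)z;
}
#endif

/* One possible repair: the caller supplies offset, so every path into
 * the pointer arithmetic sees an initialized, range-checked value. */
int foo_fixed(int *p, int len, int offset) {
    if (offset < 0 || offset >= len)
        return -1;                /* offset would leave the array */
    int *z = p + offset;
    if (offset > 8)
        return bar(offset) + *z;  /* here offset > 8 is known */
    return *z;
}
```

The `assert` inside `bar` is a runtime stand-in for the fact a static analyzer would carry across the call: inside `bar`, `offset` can only take values greater than eight.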
304 00:14:20,730 --> 00:14:22,646 Ignoring the fact that it wasn't initialized, 305 00:14:22,646 --> 00:14:24,496 we do know that once we get here, 306 00:14:24,496 --> 00:14:26,870 we know the offset actually has to be greater than eight. 307 00:14:26,870 --> 00:14:28,430 So, in some cases, what we can do 308 00:14:28,430 --> 00:14:31,840 is actually propagate that constraint, that notion 309 00:14:31,840 --> 00:14:33,830 that the offset must be greater than eight, 310 00:14:33,830 --> 00:14:35,340 into our analysis of bar. 311 00:14:35,340 --> 00:14:35,840 Right? 312 00:14:35,840 --> 00:14:37,760 So, when we start statically analyzing bar, 313 00:14:37,760 --> 00:14:40,260 we know that offset can only take certain values. 314 00:14:40,260 --> 00:14:42,510 So, once again, this is a very high-level introduction 315 00:14:42,510 --> 00:14:44,051 to static analysis, and we'll discuss 316 00:14:44,051 --> 00:14:45,190 it more in later lectures. 317 00:14:45,190 --> 00:14:46,860 But this is a basic intuition of how 318 00:14:46,860 --> 00:14:49,420 we might be able to detect some types of bugs 319 00:14:49,420 --> 00:14:51,180 without even executing your code. 320 00:14:51,180 --> 00:14:52,430 So, does that all make sense? 321 00:14:55,184 --> 00:14:57,560 So, another thing you can think about doing too 322 00:14:57,560 --> 00:15:02,250 is what they call program fuzzing. 323 00:15:02,250 --> 00:15:04,380 So, the idea behind program fuzzing 324 00:15:04,380 --> 00:15:07,660 is that essentially you take all of the functions in your code, 325 00:15:07,660 --> 00:15:10,630 and then essentially throw random values for input 326 00:15:10,630 --> 00:15:12,175 to those functions. 327 00:15:12,175 --> 00:15:15,010 And so the idea is that you want to have high code 328 00:15:15,010 --> 00:15:17,450 coverage for all of your tests. 
329 00:15:17,450 --> 00:15:19,480 So, if you go out in the real world, 330 00:15:19,480 --> 00:15:21,160 typically when you check in unit test, 331 00:15:21,160 --> 00:15:24,486 you can't just do things like, I tried values two, four, eight, 332 00:15:24,486 --> 00:15:26,170 and 15, because 15 is an odd number, 333 00:15:26,170 --> 00:15:28,150 so I probably tested all the branches right. 334 00:15:28,150 --> 00:15:29,251 What you actually have to do is you 335 00:15:29,251 --> 00:15:31,720 have to look at things like, like I said how many branches 336 00:15:31,720 --> 00:15:35,238 in the program overall were actually touched by your test 337 00:15:35,238 --> 00:15:36,125 code, right? 338 00:15:36,125 --> 00:15:38,000 Because that's typically where the bugs hide. 339 00:15:38,000 --> 00:15:39,870 The programmers don't think about the corner cases, 340 00:15:39,870 --> 00:15:42,430 and so as a result, they do have some unit tests that pass. 341 00:15:42,430 --> 00:15:44,310 They even have bigger tests that pass. 342 00:15:44,310 --> 00:15:46,268 But they're not actually pinning all the corner 343 00:15:46,268 --> 00:15:47,170 cases in the program. 344 00:15:47,170 --> 00:15:50,200 So, static analysis can actually help with this fuzzing here. 345 00:15:50,200 --> 00:15:52,960 Once again, using things like this notion of constraint. 346 00:15:52,960 --> 00:15:55,120 So, for example, in this program here, we 347 00:15:55,120 --> 00:15:58,260 have this branch condition here that specified the offset 348 00:15:58,260 --> 00:15:59,400 being greater than eight. 349 00:15:59,400 --> 00:16:01,402 So, we can know what that offset is statically. 
350 00:16:01,402 --> 00:16:03,860 So, we can make sure that if we're automatically generating 351 00:16:03,860 --> 00:16:08,194 fuzzed inputs, we can ensure that one of those inputs 352 00:16:08,194 --> 00:16:10,110 hopefully will ensure that, somehow, offset is 353 00:16:10,110 --> 00:16:12,693 less than eight, one will ensure that offset's equal to eight, 354 00:16:12,693 --> 00:16:15,290 one will ensure that it's greater than eight. 355 00:16:15,290 --> 00:16:18,500 So, does that all make sense? 356 00:16:18,500 --> 00:16:19,000 Cool. 357 00:16:19,000 --> 00:16:22,280 So, that's the basic idea behind the notion of building tools 358 00:16:22,280 --> 00:16:24,100 to help programmers find bugs. 359 00:16:24,100 --> 00:16:29,030 So, the nice thing is that even partial analysis can 360 00:16:29,030 --> 00:16:31,073 be very, very useful, particularly when 361 00:16:31,073 --> 00:16:32,975 you're dealing with C. A lot of these tools 362 00:16:32,975 --> 00:16:35,350 that we'll discuss, to prevent against things like buffer 363 00:16:35,350 --> 00:16:37,210 overflow or uninitialized variables, 364 00:16:37,210 --> 00:16:38,910 they can't catch all the problems. 365 00:16:38,910 --> 00:16:39,410 Right? 366 00:16:39,410 --> 00:16:42,040 But they can actually give us forward progress towards making 367 00:16:42,040 --> 00:16:44,065 these programs more secure. 368 00:16:44,065 --> 00:16:46,120 Now, of course, the disadvantage of these things 369 00:16:46,120 --> 00:16:48,510 is that they're not complete. 370 00:16:48,510 --> 00:16:50,280 Forward progress is not complete progress. 371 00:16:50,280 --> 00:16:52,443 And so it's still a very active area 372 00:16:52,443 --> 00:16:56,147 of research of how you defend against security exploits in C 373 00:16:56,147 --> 00:16:57,480 and just in programs in general. 374 00:17:00,440 --> 00:17:03,649 So, those were two approaches to deal with defending 375 00:17:03,649 --> 00:17:05,210 against buffer overflow. 
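[Editor's sketch] A toy version of constraint-guided fuzzing for the "offset greater than eight" branch could look like this. It is a sketch under stated assumptions rather than a real fuzzer: `classify` stands in for the code under test, the boundary values 7, 8, and 9 are hard-coded where a real tool would derive them from the branch condition, and the names `classify` and `fuzz_offsets` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum { BRANCH_TAKEN = 1, BRANCH_NOT_TAKEN = 2 };

/* Stand-in for the code under test: reports which side of the
 * "offset > 8" branch a given input exercises. */
int classify(int offset) {
    return offset > 8 ? BRANCH_TAKEN : BRANCH_NOT_TAKEN;
}

/* Feeds the boundary inputs (below, at, above the constant in the
 * branch) and then n_random pseudo-random inputs. Returns a bitmask
 * of the branch outcomes that were observed. */
int fuzz_offsets(unsigned seed, int n_random) {
    static const int boundary[] = { 7, 8, 9 };
    int covered = 0;

    for (size_t k = 0; k < sizeof boundary / sizeof boundary[0]; k++)
        covered |= classify(boundary[k]);

    srand(seed);
    for (int k = 0; k < n_random; k++)
        covered |= classify(rand() % 64 - 32);   /* values in [-32, 31] */

    return covered;
}
```

Even with zero random inputs, the three boundary values already cover both sides of the branch, which is the benefit of feeding the statically known constraint into input generation.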
376 00:17:05,210 --> 00:17:07,810 There's actually some other approaches. 377 00:17:07,810 --> 00:17:13,410 So, a third approach you might think about using 378 00:17:13,410 --> 00:17:14,606 is to use a memory-safe language. 379 00:17:21,672 --> 00:17:36,524 And so examples of these are things like Python, Java, C#-- 380 00:17:36,524 --> 00:17:38,940 I'm not going to put up Perl there because people who use 381 00:17:38,940 --> 00:17:39,815 Perl are bad people. 382 00:17:39,815 --> 00:17:43,700 So you can use a memory-safe language like that. 383 00:17:43,700 --> 00:17:46,755 And this to a certain extent seems like the most obvious 384 00:17:46,755 --> 00:17:48,300 thing that you could do. 385 00:17:48,300 --> 00:17:51,060 I just told you over there that basically C 386 00:17:51,060 --> 00:17:54,090 is high-level assembly code, and it exposes raw pointers 387 00:17:54,090 --> 00:17:56,690 and does all these things that you don't want it to do, 388 00:17:56,690 --> 00:17:57,590 and it doesn't do things you do want 389 00:17:57,590 --> 00:17:58,729 it to do, like [INAUDIBLE]. 390 00:17:58,729 --> 00:18:01,020 So, why not just use one of these high level languages? 391 00:18:01,020 --> 00:18:03,340 Well, there's a couple reasons for that. 392 00:18:03,340 --> 00:18:09,890 So, first of all, there's actually a lot of legacy code 393 00:18:09,890 --> 00:18:11,360 that's out there. 394 00:18:14,581 --> 00:18:15,080 Right? 395 00:18:15,080 --> 00:18:17,310 So, it's all fine and dandy if you want to go out and start 396 00:18:17,310 --> 00:18:18,905 your new project and you want to write it 397 00:18:18,905 --> 00:18:20,488 in one of these really safe languages. 
398 00:18:20,488 --> 00:18:22,781 But what if you've been given this big binary 399 00:18:22,781 --> 00:18:24,572 or this big source code distribution that's 400 00:18:24,572 --> 00:18:27,415 been written in C, it's been maintained for 10, 15 years, 401 00:18:27,415 --> 00:18:28,915 it's been this generational project, 402 00:18:28,915 --> 00:18:31,240 I mean our children's children will be working on it. 403 00:18:31,240 --> 00:18:33,100 You can't just say, I'm just going to write everything in C# 404 00:18:33,100 --> 00:18:34,431 and change the world. 405 00:18:34,431 --> 00:18:34,930 Right? 406 00:18:34,930 --> 00:18:37,160 And this isn't just a problem in C, for example. 407 00:18:37,160 --> 00:18:38,662 There's actually systems that you 408 00:18:38,662 --> 00:18:41,110 use that you should be afraid, because they actually 409 00:18:41,110 --> 00:18:43,570 use Fortran and COBOL code. 410 00:18:43,570 --> 00:18:44,170 What? 411 00:18:44,170 --> 00:18:46,260 That's stuff from the Civil War. 412 00:18:46,260 --> 00:18:48,169 So, why does that happen? 413 00:18:48,169 --> 00:18:49,710 Once again, the reason why it happens 414 00:18:49,710 --> 00:18:52,024 is because as engineers, we kind of want to think, 415 00:18:52,024 --> 00:18:54,565 oh, we can just build everything ourselves, it'll be awesome, 416 00:18:54,565 --> 00:18:55,714 it'll be just the way that I want it, 417 00:18:55,714 --> 00:18:57,740 I'll call my variables the things that I want. 418 00:18:57,740 --> 00:18:59,020 But in the real world, that doesn't happen. 419 00:18:59,020 --> 00:18:59,520 Right? 420 00:18:59,520 --> 00:19:02,195 You show up on your job, and you have this thing that exists, 421 00:19:02,195 --> 00:19:04,712 and you look at the code base, and you say, well, 422 00:19:04,712 --> 00:19:05,670 why doesn't it do this? 423 00:19:05,670 --> 00:19:07,280 And then you say, listen. 424 00:19:07,280 --> 00:19:08,942 We'll deal with that in V2. 
425 00:19:08,942 --> 00:19:10,490 But for now, you got to make things 426 00:19:10,490 --> 00:19:13,266 work because the customers are taking away their money. 427 00:19:13,266 --> 00:19:15,782 So, there's basically this huge issue of legacy code here, 428 00:19:15,782 --> 00:19:17,140 and how do we deal with it? 429 00:19:17,140 --> 00:19:20,382 And as you'll see with the baggy bounds system, 430 00:19:20,382 --> 00:19:22,340 one of the advantages of it is that it actually 431 00:19:22,340 --> 00:19:25,484 inter-operates quite well with this legacy code. 432 00:19:25,484 --> 00:19:27,525 So, anyway, this is one reason why you can't just 433 00:19:27,525 --> 00:19:29,775 necessarily make all these buffer overflow problems go 434 00:19:29,775 --> 00:19:33,360 away by using one of these memory-safe languages. 435 00:19:33,360 --> 00:19:39,420 So, another challenge is that what if you need 436 00:19:39,420 --> 00:19:42,794 low-level access to hardware? 437 00:19:48,832 --> 00:19:51,290 This might happen if you're writing something like a device 438 00:19:51,290 --> 00:19:53,060 driver or something like that. 439 00:19:53,060 --> 00:19:56,420 So, in that case, you really do need 440 00:19:56,420 --> 00:19:58,146 the benefits that C gives you 441 00:19:58,146 --> 00:19:59,520 in terms of being able to look at 442 00:19:59,520 --> 00:20:01,240 registers and actually understand 443 00:20:01,240 --> 00:20:04,350 a little of [INAUDIBLE] and things like that. 444 00:20:04,350 --> 00:20:07,840 There's another thing too, which people always bring up 445 00:20:07,840 --> 00:20:12,390 and which I've alluded to before, but it's performance. 446 00:20:12,390 --> 00:20:12,890 Right? 
447 00:20:12,890 --> 00:20:14,560 So, if you care about performance, 448 00:20:14,560 --> 00:20:16,060 typically the thing that you're told 449 00:20:16,060 --> 00:20:17,644 is you've got to write in C, otherwise 450 00:20:17,644 --> 00:20:19,268 you're just going to be so slow, you're 451 00:20:19,268 --> 00:20:21,480 going to get laughed out of code academy or whatever. 452 00:20:21,480 --> 00:20:24,930 Now, this is increasingly less of an issue. 453 00:20:24,930 --> 00:20:26,110 Like the perf stuff. 454 00:20:26,110 --> 00:20:28,320 Because people have actually gotten very good 455 00:20:28,320 --> 00:20:30,390 with doing things like making better compilers 456 00:20:30,390 --> 00:20:32,530 that have all kinds of powerful optimizations. 457 00:20:32,530 --> 00:20:34,180 And also, there are these things called 458 00:20:34,180 --> 00:20:36,440 JITs which actually really reduce 459 00:20:36,440 --> 00:20:38,960 the cost of using these memory-safe languages. 460 00:20:38,960 --> 00:20:41,110 So, have you guys heard of JITs before? 461 00:20:41,110 --> 00:20:43,526 So, I'll give you a very brief introduction to what it is. 462 00:20:43,526 --> 00:20:46,740 The idea is that, think about a language like Java, 463 00:20:46,740 --> 00:20:47,760 or JavaScript. 464 00:20:47,760 --> 00:20:50,360 It's very high level, it's dynamically typed, 465 00:20:50,360 --> 00:20:54,740 right, it has automatic heap management, things like that. 466 00:20:54,740 --> 00:20:58,270 So, typically, when these languages first came out, 467 00:20:58,270 --> 00:20:59,780 they were always interpreted. 468 00:20:59,780 --> 00:21:00,280 Right? 469 00:21:00,280 --> 00:21:02,196 And by interpreted I mean they didn't actually 470 00:21:02,196 --> 00:21:04,310 execute raw x86 instructions. 471 00:21:04,310 --> 00:21:06,410 Instead, these languages were compiled down 472 00:21:06,410 --> 00:21:07,890 to some type of intermediate form.
473 00:21:07,890 --> 00:21:11,370 You may have heard of things like the JVM, the Java Virtual 474 00:21:11,370 --> 00:21:13,370 Machine byte code, things like that. 475 00:21:13,370 --> 00:21:13,870 Right? 476 00:21:13,870 --> 00:21:16,230 You basically had a program that sat in a loop 477 00:21:16,230 --> 00:21:18,450 and took these byte codes, and basically 478 00:21:18,450 --> 00:21:21,132 executed the high level instruction that 479 00:21:21,132 --> 00:21:22,750 was encoded in that byte code. 480 00:21:22,750 --> 00:21:24,971 So, for example, some of the JVM byte codes 481 00:21:24,971 --> 00:21:26,720 dealt with things like pushing and popping 482 00:21:26,720 --> 00:21:28,070 things up on the stack. 483 00:21:28,070 --> 00:21:31,150 So, you have a program that would go through a loop, 484 00:21:31,150 --> 00:21:34,350 operate that stack, and simulate those operations. 485 00:21:34,350 --> 00:21:34,850 OK. 486 00:21:34,850 --> 00:21:36,786 So, that all seemed fine and dandy, but once 487 00:21:36,786 --> 00:21:38,910 again, all of the speed freaks out there were like, 488 00:21:38,910 --> 00:21:39,950 what about the perf? 489 00:21:39,950 --> 00:21:40,870 This is too slow. 490 00:21:40,870 --> 00:21:42,490 You've got sort of that interpreter 491 00:21:42,490 --> 00:21:44,190 sitting in that loop, and getting 492 00:21:44,190 --> 00:21:46,090 in the way of our bare metal performance. 493 00:21:46,090 --> 00:21:50,120 So, what people started to do is actually take these high level 494 00:21:50,120 --> 00:21:52,210 interpreted languages and dynamically 495 00:21:52,210 --> 00:21:55,140 generate x86 code for them on the fly. 496 00:21:55,140 --> 00:21:55,800 Right?
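The interpreter loop described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the opcodes (`OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_HALT`) and the function `run_bytecode` are hypothetical names invented here, not real JVM byte codes. The point is the shape of the loop: fetch a byte code, dispatch on it, and simulate its effect on an operand stack.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny stack machine (NOT real JVM byte codes). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Interpret `code` and return the value on top of the stack at OP_HALT.
 * Returns -1 for malformed programs (stack underflow or overflow, unknown
 * opcode, or running off the end of the code). */
int run_bytecode(const int *code, size_t len)
{
    int stack[64];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next byte code to execute  */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH:                    /* operand follows the opcode */
            if (pc >= len || sp >= 64) return -1;
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:                     /* pop two values, push their sum */
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:                     /* pop two values, push their product */
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return sp ? stack[sp - 1] : -1;
        default:
            return -1;
        }
    }
    return -1;
}
```

Every byte code pays for a fetch, a bounds check, and a `switch` dispatch before any useful work happens, which is the per-instruction overhead that generating machine code ahead of execution is meant to remove.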
497 00:21:55,800 --> 00:21:59,230 So, in terms of just-in-time compilation, that 498 00:21:59,230 --> 00:22:00,910 means I take your snippet of JavaScript, 499 00:22:00,910 --> 00:22:03,570 I take your snippet of Java whatever, 500 00:22:03,570 --> 00:22:06,270 and I actually spend a little bit of time upfront 501 00:22:06,270 --> 00:22:08,650 to create actual raw machine instructions. 502 00:22:08,650 --> 00:22:12,020 Raw x86 that will run directly on the bare metal. 503 00:22:12,020 --> 00:22:14,890 So, I take that initial performance hit for the JIT 504 00:22:14,890 --> 00:22:17,570 compilation, but then after that, my program actually 505 00:22:17,570 --> 00:22:19,450 does run on the raw hardware. 506 00:22:19,450 --> 00:22:19,970 Right? 507 00:22:19,970 --> 00:22:22,260 So, things like the perf argument 508 00:22:22,260 --> 00:22:23,842 are not necessarily as compelling 509 00:22:23,842 --> 00:22:25,800 as they used to be, because of stuff like this. 510 00:22:25,800 --> 00:22:29,620 There's also some crazy stuff out there, like asm.js. 511 00:22:29,620 --> 00:22:31,320 So, we can talk more about this offline 512 00:22:31,320 --> 00:22:33,050 if you actually are a JavaScript hacker. 513 00:22:33,050 --> 00:22:34,674 But there are actually some neat tricks 514 00:22:34,674 --> 00:22:36,470 that you can do, like compiling down 515 00:22:36,470 --> 00:22:39,750 JavaScript to a very restricted subset of the language that 516 00:22:39,750 --> 00:22:42,093 only operates on arrays. 517 00:22:42,093 --> 00:22:44,426 Right, so what this allows you to do is actually get rid 518 00:22:44,426 --> 00:22:46,950 of a lot of the dynamic typing overhead in standard 519 00:22:46,950 --> 00:22:50,030 JavaScript, and you can actually get JavaScript code now to run 520 00:22:50,030 --> 00:22:54,220 within 2x of raw C or C++ performance. 521 00:22:54,220 --> 00:22:56,860 2x might sound like a lot, but it used 522 00:22:56,860 --> 00:22:58,729 to be things like 10x or 20x.
523 00:22:58,729 --> 00:23:01,145 So, we're actually making a lot of progress on that front. 524 00:23:04,003 --> 00:23:06,336 And so the other thing to keep in mind with performance, 525 00:23:06,336 --> 00:23:08,820 too, is that a lot of times, you don't need performance as much as 526 00:23:08,820 --> 00:23:10,001 you might think that you do. 527 00:23:10,001 --> 00:23:10,500 Right? 528 00:23:10,500 --> 00:23:12,340 So, think about it like this. 529 00:23:12,340 --> 00:23:15,200 Let's say that your program is actually IO bound. 530 00:23:15,200 --> 00:23:16,330 So, it's not CPU bound. 531 00:23:16,330 --> 00:23:18,300 In other words, let's say that your program 532 00:23:18,300 --> 00:23:20,720 spends most of its time waiting for network input, 533 00:23:20,720 --> 00:23:23,120 waiting for disk input, waiting for user input, 534 00:23:23,120 --> 00:23:24,550 things like that. 535 00:23:24,550 --> 00:23:26,650 In those types of cases, you don't actually 536 00:23:26,650 --> 00:23:29,490 need to have blazing fast raw compute speed. 537 00:23:29,490 --> 00:23:29,990 Right? 538 00:23:29,990 --> 00:23:31,448 Because your program actually isn't 539 00:23:31,448 --> 00:23:34,174 spending a lot of time doing that kind of stuff. 540 00:23:34,174 --> 00:23:35,840 So, once again, this perf argument here, 541 00:23:35,840 --> 00:23:37,195 you've got to take this stuff with a grain of salt. 542 00:23:37,195 --> 00:23:38,732 And I actually see a lot of students 543 00:23:38,732 --> 00:23:39,690 who struggle with this. 544 00:23:39,690 --> 00:23:41,702 So, for example, I'll ask someone 545 00:23:41,702 --> 00:23:43,810 to go out and write me a very simple program 546 00:23:43,810 --> 00:23:44,860 to parse a text file. 547 00:23:44,860 --> 00:23:47,360 So, they spend all this time trying to get this to work in C 548 00:23:47,360 --> 00:23:49,987 or C++ and it's super fast and uses the templates and all that 549 00:23:49,987 --> 00:23:50,570 kind of stuff.
550 00:23:50,570 --> 00:23:53,680 But it's like a one line solution in Python. 551 00:23:53,680 --> 00:23:55,335 And it essentially runs just as fast. 552 00:23:55,335 --> 00:23:57,384 And you could develop it much, much easier. 553 00:23:57,384 --> 00:23:59,300 So, you just have to take these perf arguments 554 00:23:59,300 --> 00:24:01,760 with a grain of salt. 555 00:24:01,760 --> 00:24:06,290 So, anyway, we've discussed the three ways you can possibly 556 00:24:06,290 --> 00:24:07,445 avoid buffer overflow. 557 00:24:07,445 --> 00:24:09,070 So, just avoid bugs in the first place. 558 00:24:09,070 --> 00:24:11,162 LOL, that's difficult to do. 559 00:24:11,162 --> 00:24:13,730 Approach two, you can build tools to help 560 00:24:13,730 --> 00:24:15,218 you discover those bugs. 561 00:24:15,218 --> 00:24:17,230 Then approach three is, in a certain sense, 562 00:24:17,230 --> 00:24:19,205 you can push those tools into the runtime. 563 00:24:19,205 --> 00:24:22,060 You can actually hopefully rely on some of the language 564 00:24:22,060 --> 00:24:24,741 runtime features to prevent you from seeing raw memory 565 00:24:24,741 --> 00:24:25,240 addresses. 566 00:24:25,240 --> 00:24:27,073 And you can do things like bounds checking, 567 00:24:27,073 --> 00:24:29,370 and so on and so forth. 568 00:24:29,370 --> 00:24:32,320 Once again, as we discussed before, 569 00:24:32,320 --> 00:24:35,800 there's a lot of legacy C and C++ code out there. 570 00:24:35,800 --> 00:24:38,437 So, it's difficult to apply some of these techniques, 571 00:24:38,437 --> 00:24:40,145 particularly number two and number three, 572 00:24:40,145 --> 00:24:43,220 if you've got to deal with that legacy code. 573 00:24:43,220 --> 00:24:47,000 So, how can we do buffer overflow mitigation 574 00:24:47,000 --> 00:24:49,512 despite all these challenges?
575 00:24:49,512 --> 00:24:53,090 Besides just, you know, dropping out of computer science classes 576 00:24:53,090 --> 00:24:55,070 and becoming a painter, or something like that. 577 00:24:55,070 --> 00:24:59,260 So, what actually is going on in a buffer overflow? 578 00:24:59,260 --> 00:25:04,812 So, in a buffer overflow the attacker exploits two things. 579 00:25:11,020 --> 00:25:15,840 So, the first thing that the attacker is going to exploit 580 00:25:15,840 --> 00:25:23,060 is gaining control over the instruction pointer. 581 00:25:29,910 --> 00:25:30,410 Right? 582 00:25:30,410 --> 00:25:33,470 And by this, I mean that somehow, the attacker 583 00:25:33,470 --> 00:25:36,710 figures out someplace in the code 584 00:25:36,710 --> 00:25:39,980 that it can make the program jump to against the program's 585 00:25:39,980 --> 00:25:40,680 will. 586 00:25:40,680 --> 00:25:43,880 Now, this is necessary but insufficient for an attack 587 00:25:43,880 --> 00:25:46,500 typically to happen. 588 00:25:46,500 --> 00:25:48,990 Because the other thing that the attacker needs to do 589 00:25:48,990 --> 00:25:57,729 is basically make that pointer point to malicious code. 590 00:26:08,440 --> 00:26:08,940 Right? 591 00:26:08,940 --> 00:26:12,530 So, how are we going to basically make the hijacked 592 00:26:12,530 --> 00:26:14,930 IP, instruction pointer, point to something 593 00:26:14,930 --> 00:26:18,230 that does something useful for the attacker. 594 00:26:18,230 --> 00:26:20,730 So, what's interesting is that in many cases, 595 00:26:20,730 --> 00:26:24,070 it's often fairly straightforward 596 00:26:24,070 --> 00:26:26,660 for the attacker to put some interesting code in memory. 597 00:26:26,660 --> 00:26:28,660 So we looked at some of those shell code attacks 598 00:26:28,660 --> 00:26:31,530 in the last lecture, where you can actually embed that attack 599 00:26:31,530 --> 00:26:32,530 code in a string.
600 00:26:32,530 --> 00:26:35,029 As we'll discuss a little bit today and more 601 00:26:35,029 --> 00:26:36,570 in the next lecture, you can actually 602 00:26:36,570 --> 00:26:38,700 take advantage of some of the pre-existing code 603 00:26:38,700 --> 00:26:41,745 the application has and jump to in an unexpected way 604 00:26:41,745 --> 00:26:43,850 to make some evil things happen. 605 00:26:43,850 --> 00:26:48,119 So, typically, figuring out what code the attacker wants to run, 606 00:26:48,119 --> 00:26:49,910 maybe that's not as challenging as actually 607 00:26:49,910 --> 00:26:52,820 being able to force the program to jump 608 00:26:52,820 --> 00:26:56,692 to that location in memory. 609 00:26:56,692 --> 00:26:58,150 And the reason why that's tricky is 610 00:26:58,150 --> 00:27:03,210 because, basically, the attacker has to know in some way 611 00:27:03,210 --> 00:27:04,800 where it should jump to. 612 00:27:04,800 --> 00:27:05,300 Right? 613 00:27:05,300 --> 00:27:07,275 So, as we'll see in a second, and as you actually 614 00:27:07,275 --> 00:27:09,691 saw in the last lecture, a lot of these shell code attacks 615 00:27:09,691 --> 00:27:13,650 actually take advantage of these hard-coded locations in memory 616 00:27:13,650 --> 00:27:16,004 where the instruction pointer needs to get sent to. 617 00:27:16,004 --> 00:27:18,170 So, some of the defenses that we're about to look at 618 00:27:18,170 --> 00:27:21,960 can actually randomize things in terms of code layout, heap 619 00:27:21,960 --> 00:27:24,710 layout, and make it a little difficult for the attacker 620 00:27:24,710 --> 00:27:27,590 to figure out where things are located. 621 00:27:27,590 --> 00:27:33,820 So, let's look at one simple mitigation approach first. 622 00:27:33,820 --> 00:27:37,280 So, this is the idea of stack canaries. 
623 00:27:42,920 --> 00:27:45,650 So, the basic idea behind stack canaries 624 00:27:45,650 --> 00:27:48,400 is that, during a buffer overflow, 625 00:27:48,400 --> 00:27:53,060 it's actually OK if we allow the attacker to overwrite 626 00:27:53,060 --> 00:27:56,080 the return address if we can actually 627 00:27:56,080 --> 00:27:59,840 catch that overwrite before we actually jump to the place 628 00:27:59,840 --> 00:28:02,860 that the attacker wants us to go. 629 00:28:02,860 --> 00:28:05,230 So, basically, here's how it works. 630 00:28:05,230 --> 00:28:11,810 Let's return to Neal stack diagram. 631 00:28:11,810 --> 00:28:15,150 Essentially we have to think of it as a magic value. 632 00:28:15,150 --> 00:28:20,060 Basically, in front of the return address. 633 00:28:20,060 --> 00:28:22,060 Such that any overflow would have 634 00:28:22,060 --> 00:28:25,804 to hit the canary first, and then hit the return address. 635 00:28:25,804 --> 00:28:27,915 And if we can check that canary before we 636 00:28:27,915 --> 00:28:30,190 return from the function, then we can detect the evil. 637 00:28:30,190 --> 00:28:35,600 So, let's say that, once again, we've got the buffer here. 638 00:28:44,760 --> 00:28:46,875 Then we're going to put the canary here. 639 00:28:53,960 --> 00:28:59,514 And then this will be the saved value of the base pointer. 640 00:29:02,746 --> 00:29:04,454 And then this will be the return address. 641 00:29:09,900 --> 00:29:12,470 So, once again, remember the overflow goes this way. 642 00:29:12,470 --> 00:29:16,650 So the idea is that if the overflow wants 643 00:29:16,650 --> 00:29:18,870 to get to that return address, it first 644 00:29:18,870 --> 00:29:22,770 has to trample on this canary thing here, right? 645 00:29:22,770 --> 00:29:24,085 You have a question? 646 00:29:24,085 --> 00:29:27,000 AUDIENCE: Why does it have to touch the canary?
647 00:29:27,000 --> 00:29:29,018 PROFESSOR: Well, because-- assuming 648 00:29:29,018 --> 00:29:31,130 that the attacker doesn't know how 649 00:29:31,130 --> 00:29:34,795 to jump around in memory arbitrarily-- the way 650 00:29:34,795 --> 00:29:36,940 that traditionally [INAUDIBLE] overflow attacks work 651 00:29:36,940 --> 00:29:42,190 is that you look in GDB, figure out where all this stuff is. 652 00:29:42,190 --> 00:29:44,300 And then, you essentially have this string, 653 00:29:44,300 --> 00:29:46,810 [INAUDIBLE] radius grows this way. 654 00:29:46,810 --> 00:29:49,050 Now, you're correct that if the attacker could just 655 00:29:49,050 --> 00:29:52,135 go here directly, then all the bets are off. 656 00:29:52,135 --> 00:29:54,200 But in the very simple overflow approach, 657 00:29:54,200 --> 00:29:57,690 everything just has to grow strictly that way. 658 00:29:57,690 --> 00:30:00,260 So the basic idea behind the canary 659 00:30:00,260 --> 00:30:03,970 is that we allow the buffer overflow exploit to take place. 660 00:30:03,970 --> 00:30:06,436 But then we have run time code that, 661 00:30:06,436 --> 00:30:08,850 at the time of the return from the function, 662 00:30:08,850 --> 00:30:11,350 is going to check this canary and make sure 663 00:30:11,350 --> 00:30:12,800 that it has the right value. 664 00:30:12,800 --> 00:30:13,300 Right? 665 00:30:13,300 --> 00:30:15,900 So it's called the canary because back in the days, when 666 00:30:15,900 --> 00:30:17,395 PETA wasn't around, you could use 667 00:30:17,395 --> 00:30:18,880 birds to test for evil things. 668 00:30:18,880 --> 00:30:20,860 So that's why it's called canary. 669 00:30:20,860 --> 00:30:24,077 AUDIENCE: My question is if the attacker is 670 00:30:24,077 --> 00:30:31,750 able to overwrite the return address, and modify the canary, 671 00:30:31,750 --> 00:30:34,225 how does he check that the canary was not modified, 672 00:30:34,225 --> 00:30:37,690 but was going to be performed?
673 00:30:37,690 --> 00:30:41,670 So the attacker overwrites the return address, right? 674 00:30:41,670 --> 00:30:47,444 So how is the check that the canary was modified-- 675 00:30:47,444 --> 00:30:48,110 PROFESSOR: Yeah. 676 00:30:48,110 --> 00:30:50,640 So basically, you have to have some piece of code 677 00:30:50,640 --> 00:30:54,505 that will actually check this before the return takes place. 678 00:30:54,505 --> 00:30:55,838 So in other words, you're right. 679 00:30:55,838 --> 00:30:58,200 There has to be that order in there. 680 00:30:58,200 --> 00:31:00,530 So essentially, what you have to do 681 00:31:00,530 --> 00:31:03,750 is you have to have a support from the compiler here 682 00:31:03,750 --> 00:31:07,090 that will actually extend the calling convention, 683 00:31:07,090 --> 00:31:08,150 if you will. 684 00:31:08,150 --> 00:31:10,610 Such that part of the return sequence 685 00:31:10,610 --> 00:31:13,700 is before we actually treat this value as valid, 686 00:31:13,700 --> 00:31:16,140 make sure this guy hasn't been trampled. 687 00:31:16,140 --> 00:31:18,557 Then, and only then, can we think of going somewhere else. 688 00:31:18,557 --> 00:31:20,640 AUDIENCE: I think I might be jumping the gun here, 689 00:31:20,640 --> 00:31:22,390 but doesn't this assume that the attacker 690 00:31:22,390 --> 00:31:25,365 can't find out or guess what the canary value is? 691 00:31:25,365 --> 00:31:27,990 PROFESSOR: That, in fact, is the very next thing my lecture is. 692 00:31:27,990 --> 00:31:29,281 If I had prizes, you'd get one. 693 00:31:29,281 --> 00:31:30,162 I don't have any. 694 00:31:30,162 --> 00:31:30,870 But good for you. 695 00:31:30,870 --> 00:31:31,370 Gold star. 696 00:31:31,370 --> 00:31:32,970 That's exactly correct. 697 00:31:32,970 --> 00:31:35,250 So one of the next things I'd like to say 698 00:31:35,250 --> 00:31:37,280 is what's the problem with this scheme? 
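The return-time check described above can be sketched with a simulated frame. This is a hedged model, not what any particular compiler emits: `struct frame`, `frame_init`, `unchecked_copy`, and `safe_to_return` are hypothetical names, and the frame is an ordinary struct rather than the real machine stack so that the overflow stays well-defined C. The layout follows the board diagram: buffer, then canary, then saved base pointer, then return address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8

/* Simulated stack frame, lowest address first:
 * buffer, canary, saved base pointer, return address. */
struct frame {
    unsigned char buf[BUF_SIZE];
    uint32_t canary;
    uint32_t saved_bp;
    uint32_t ret_addr;
};

/* Function entry: place the canary between the buffer and the saved state. */
void frame_init(struct frame *f, uint32_t canary, uint32_t ret_addr)
{
    memset(f, 0, sizeof *f);
    f->canary = canary;
    f->ret_addr = ret_addr;
}

/* gets()-style write: starts at buf and keeps going upward with no check
 * against BUF_SIZE. Clamped to the frame size only so that the simulation
 * itself never writes outside the struct. */
void unchecked_copy(struct frame *f, const void *input, size_t n)
{
    if (n > sizeof *f)
        n = sizeof *f;
    memcpy(f, input, n);
}

/* Return-time check: 1 if the canary is intact, 0 if it was trampled.
 * Real code would abort instead of using ret_addr when this returns 0. */
int safe_to_return(const struct frame *f, uint32_t expected)
{
    return f->canary == expected;
}
```

A short input leaves the canary alone and the check passes. An input long enough to reach `ret_addr` necessarily rewrites `canary` on the way, so the check fails before the corrupted return address is ever used.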
699 00:31:37,280 --> 00:31:39,910 What if, for example, on every program, 700 00:31:39,910 --> 00:31:41,890 we always put the value a? 701 00:31:41,890 --> 00:31:44,164 Just like four values of a. 702 00:31:44,164 --> 00:31:46,330 So this is like a single [INAUDIBLE] at work, right? 703 00:31:46,330 --> 00:31:47,790 Then you'd have that exact problem 704 00:31:47,790 --> 00:31:48,789 that you just mentioned. 705 00:31:48,789 --> 00:31:50,742 Because then, the attacker-- this 706 00:31:50,742 --> 00:31:54,990 gets back to your question-- he or she knows how big this is. 707 00:31:54,990 --> 00:31:57,035 This is deterministic on every system. 708 00:31:57,035 --> 00:31:58,993 So you just make sure that your buffer overflow 709 00:31:58,993 --> 00:32:01,440 has a bunch of a's here, and then you overwrite this side. 710 00:32:01,440 --> 00:32:02,898 So you're exactly right about that. 711 00:32:02,898 --> 00:32:05,033 And so there's basically different types of values 712 00:32:05,033 --> 00:32:08,760 you could put in this canary to try to prevent that. 713 00:32:08,760 --> 00:32:10,330 One thing that you can do here is 714 00:32:10,330 --> 00:32:18,760 you can use-- this is sort of a very funny type of canary, 715 00:32:18,760 --> 00:32:21,020 but it basically exploits the ways 716 00:32:21,020 --> 00:32:27,700 that a lot of C programs and C functions 717 00:32:27,700 --> 00:32:29,310 handle special characters. 718 00:32:29,310 --> 00:32:32,180 So imagine that you used this value for the canary. 719 00:32:32,180 --> 00:32:34,990 So the binary value is 0, which is like the null byte, 720 00:32:34,990 --> 00:32:36,900 the null character in ASCII. 721 00:32:36,900 --> 00:32:41,000 Carriage return line feed, and then the negative 1.
722 00:32:41,000 --> 00:32:43,770 What's funny about this is that a lot of the functions that you 723 00:32:43,770 --> 00:32:47,090 can exploit-- that manipulate strings, for example-- 724 00:32:47,090 --> 00:32:50,260 they will stop when they encounter one of these words, 725 00:32:50,260 --> 00:32:51,610 or one of these values. 726 00:32:51,610 --> 00:32:54,780 So you can imagine that you're using some string manipulation 727 00:32:54,780 --> 00:32:56,020 function to go up this way. 728 00:32:56,020 --> 00:32:57,603 It's going to hit that null character. 729 00:32:57,603 --> 00:32:59,310 Oops-- it's going to stop processing. 730 00:32:59,310 --> 00:32:59,810 Right? 731 00:32:59,810 --> 00:33:02,962 Or maybe if you're using a line-oriented function-- 732 00:33:02,962 --> 00:33:04,670 carriage return, line feed-- that's often 733 00:33:04,670 --> 00:33:05,836 used as the line terminator. 734 00:33:05,836 --> 00:33:08,060 So once again, you're using that dangerous function 735 00:33:08,060 --> 00:33:09,335 that's trying to go this way. 736 00:33:09,335 --> 00:33:10,300 It hits that. 737 00:33:10,300 --> 00:33:11,870 Oops, it's going to quit. 738 00:33:11,870 --> 00:33:14,650 And the negative 1 is another similar magic token. 739 00:33:14,650 --> 00:33:16,400 So that's one way you can get around that. 740 00:33:16,400 --> 00:33:17,500 One second. 741 00:33:17,500 --> 00:33:19,050 And then another thing you can do 742 00:33:19,050 --> 00:33:22,628 is you can use a randomized value. 743 00:33:27,140 --> 00:33:29,662 So here, you just [INAUDIBLE] from this whole idea 744 00:33:29,662 --> 00:33:31,620 of trying to figure out what exactly it is that 745 00:33:31,620 --> 00:33:33,385 might cause that attack to terminate. 746 00:33:33,385 --> 00:33:35,700 And you just pull some random number 747 00:33:35,700 --> 00:33:37,795 and either make it difficult for the attacker 748 00:33:37,795 --> 00:33:39,550 to guess what that is. 
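The effect of the 0, CR, LF, -1 canary on a string-copying function can be sketched as follows. This is an illustrative simulation under assumptions: the 16-byte frame layout, the offsets, and the helper names (`frame_setup`, `strcpy_like`, `canary_intact`, `ret_intact`) are all hypothetical, and `strcpy_like` is a hand-written stand-in for an unbounded `strcpy` that is clamped to the simulated frame so the example stays well-defined.

```c
#include <stddef.h>
#include <string.h>

/* Simulated frame: 8-byte buffer, 4-byte canary, 4-byte return address. */
enum { BUF_LEN = 8, CANARY_OFF = 8, RET_OFF = 12, FRAME_LEN = 16 };

/* The deterministic canary from the lecture: 0, CR, LF, -1 (0xFF). */
static const unsigned char TERMINATOR_CANARY[4] = { 0x00, '\r', '\n', 0xFF };

void frame_setup(unsigned char *frame)
{
    memset(frame, 0, FRAME_LEN);
    memcpy(frame + CANARY_OFF, TERMINATOR_CANARY, 4);
    memset(frame + RET_OFF, 0x11, 4);   /* stand-in for the real return address */
}

/* strcpy-like copy: copies up to and including the first NUL byte and knows
 * nothing about BUF_LEN. Clamped to FRAME_LEN only to keep the simulation
 * in bounds. Returns the number of bytes written. */
size_t strcpy_like(unsigned char *frame, const unsigned char *src)
{
    size_t i = 0;
    while (i < FRAME_LEN) {
        frame[i] = src[i];
        if (src[i++] == 0)
            break;
    }
    return i;
}

int canary_intact(const unsigned char *frame)
{
    return memcmp(frame + CANARY_OFF, TERMINATOR_CANARY, 4) == 0;
}

int ret_intact(const unsigned char *frame)
{
    static const unsigned char original[4] = { 0x11, 0x11, 0x11, 0x11 };
    return memcmp(frame + RET_OFF, original, 4) == 0;
}
```

If the attacker embeds the known canary bytes in the payload to keep the check happy, the copy stops at the embedded 0 byte and never reaches the return address. If the attacker avoids 0 bytes so the copy does reach the return address, the canary is necessarily destroyed. As the discussion that follows points out, this only helps against copies that stop on one of those special values.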
749 00:33:39,550 --> 00:33:42,282 Now, of course, this random value-- its strength 750 00:33:42,282 --> 00:33:43,990 is basically based on how difficult it is 751 00:33:43,990 --> 00:33:46,150 for the attacker to guess that. 752 00:33:46,150 --> 00:33:48,970 So the attacker, for example, can 753 00:33:48,970 --> 00:33:51,250 understand that if there's only, let's say, 754 00:33:51,250 --> 00:33:54,320 three bits of entropy in your system, then maybe the attacker 755 00:33:54,320 --> 00:33:57,357 could use some type of brute-force attack, so on and so forth. 756 00:33:57,357 --> 00:33:59,065 So one thing to keep in mind, in general, 757 00:33:59,065 --> 00:34:00,523 is that whenever someone tells you, 758 00:34:00,523 --> 00:34:03,699 here's a randomized defense against attack foo, 759 00:34:03,699 --> 00:34:05,240 if there are not a lot of random bits 760 00:34:05,240 --> 00:34:07,800 there, that defense may not give you as much protection 761 00:34:07,800 --> 00:34:09,749 as you think. 762 00:34:09,749 --> 00:34:12,603 You had a question? 763 00:34:12,603 --> 00:34:14,186 AUDIENCE: Usually what tends to happen 764 00:34:14,186 --> 00:34:16,651 is you read from another buffer and you write 765 00:34:16,651 --> 00:34:18,130 into that buffer on the stack. 766 00:34:18,130 --> 00:34:22,074 So in that situation, it seems like that terminator canary 767 00:34:22,074 --> 00:34:23,060 is kind of useless. 768 00:34:23,060 --> 00:34:25,032 Because if I read from the [INAUDIBLE], 769 00:34:25,032 --> 00:34:26,511 I know what the canary is. 770 00:34:26,511 --> 00:34:28,980 And I have this other buffer that I control. 771 00:34:28,980 --> 00:34:30,414 And I never check. 772 00:34:30,414 --> 00:34:32,866 And in that buffer, I can put as much of it as I want. 773 00:34:32,866 --> 00:34:34,366 I don't want the terminator canary, 774 00:34:34,366 --> 00:34:36,342 so I can overwrite it very safely.
775 00:34:36,342 --> 00:34:39,489 So I don't see how this really works, 776 00:34:39,489 --> 00:34:42,159 and in what scenario it's-- you're assuming you're reading 777 00:34:42,159 --> 00:34:44,010 from the buffer on this stack and you're going to stop-- 778 00:34:44,010 --> 00:34:45,384 PROFESSOR: Well, we're assuming-- 779 00:34:45,384 --> 00:34:47,239 we're writing to the buffer. 780 00:34:47,239 --> 00:34:51,226 So basically, the idea is that you write some too-long 781 00:34:51,226 --> 00:34:52,720 string this way. 782 00:34:52,720 --> 00:34:56,750 And then the idea is that if you can't guess what this is, then 783 00:34:56,750 --> 00:35:02,303 you can't, basically, put this value inside of your overflow 784 00:35:02,303 --> 00:35:03,209 string. 785 00:35:03,209 --> 00:35:05,560 AUDIENCE: But you said it's deterministic, right? 786 00:35:05,560 --> 00:35:06,490 0, CR, LF, negative 1. 787 00:35:06,490 --> 00:35:07,950 PROFESSOR: Oh, yeah. 788 00:35:07,950 --> 00:35:08,560 Right. 789 00:35:08,560 --> 00:35:09,060 OK. 790 00:35:09,060 --> 00:35:10,230 So I think I understand your question now. 791 00:35:10,230 --> 00:35:10,975 Yes. 792 00:35:10,975 --> 00:35:16,660 If you use this system here, with the deterministic canary, 793 00:35:16,660 --> 00:35:19,754 and you essentially are not using 794 00:35:19,754 --> 00:35:21,910 one of these functions from, let's say, 795 00:35:21,910 --> 00:35:25,422 the standard library that would be fooled by this, 796 00:35:25,422 --> 00:35:27,380 then, yeah, you can defeat the system that way. 797 00:35:27,380 --> 00:35:30,576 AUDIENCE: But I can use strcpy 798 00:35:30,576 --> 00:35:32,076 and the destination can be the buffer. 799 00:35:32,076 --> 00:35:35,631 And the source can be [INAUDIBLE]. 800 00:35:35,631 --> 00:35:37,422 And that would not protect me against that. 801 00:35:40,972 --> 00:35:43,948 PROFESSOR: I'm not sure I understand the attack, so.
802 00:35:43,948 --> 00:35:45,932 AUDIENCE: So the strcpy would take in 803 00:35:45,932 --> 00:35:50,075 the user input for my data, would overwrite canary-- oh, 804 00:35:50,075 --> 00:35:51,710 and you're saying-- hmm, actually, I 805 00:35:51,710 --> 00:35:52,570 understand what you're saying. 806 00:35:52,570 --> 00:35:53,278 PROFESSOR: Right? 807 00:35:53,278 --> 00:35:56,665 Because the idea is that you can fill this buffer with bytes 808 00:35:56,665 --> 00:35:58,140 from wherever, right? 809 00:35:58,140 --> 00:36:00,098 But the idea is that unless you can guess this, 810 00:36:00,098 --> 00:36:02,590 then it doesn't matter. 811 00:36:02,590 --> 00:36:03,700 But you're correct. 812 00:36:03,700 --> 00:36:07,300 In general, anything that allows you to guess this or randomly 813 00:36:07,300 --> 00:36:11,939 get that value correct will lead to the defeat of the system. 814 00:36:11,939 --> 00:36:13,807 AUDIENCE: In terms of [INAUDIBLE], 815 00:36:13,807 --> 00:36:16,677 can you just take something like the number of seconds 816 00:36:16,677 --> 00:36:19,820 or milliseconds since the epoch and use 817 00:36:19,820 --> 00:36:23,600 that at the [INAUDIBLE]? 818 00:36:23,600 --> 00:36:24,974 PROFESSOR: Well, as it turns out, 819 00:36:24,974 --> 00:36:27,554 a lot of times, calls that get [INAUDIBLE] 820 00:36:27,554 --> 00:36:30,000 don't contain as much randomness as you might think. 821 00:36:30,000 --> 00:36:33,121 Because the program itself might somehow-- 822 00:36:33,121 --> 00:36:36,795 let's say, for example, have a log statement or function you 823 00:36:36,795 --> 00:36:38,962 could call to get the time that the program was 824 00:36:38,962 --> 00:36:40,170 launched or things like that. 825 00:36:40,170 --> 00:36:40,615 But you're right.
826 00:36:40,615 --> 00:36:42,406 In practice, if you can use something like, 827 00:36:42,406 --> 00:36:46,020 let's say, the hardware system clock, which is often the lowest 828 00:36:46,020 --> 00:36:49,250 level, better system of timing of it-- yes, that kind of thing 829 00:36:49,250 --> 00:36:49,750 might work. 830 00:36:49,750 --> 00:36:53,282 AUDIENCE: But even if you can pull the logs, 831 00:36:53,282 --> 00:36:56,276 it still depends on exactly what time you refuse a request. 832 00:36:56,276 --> 00:37:00,180 And so if you don't have control over how long it takes 833 00:37:00,180 --> 00:37:03,596 for your requests to get from your computer to the server, 834 00:37:03,596 --> 00:37:06,036 then I don't think you can deterministically 835 00:37:06,036 --> 00:37:07,476 guess exactly the right time. 836 00:37:07,476 --> 00:37:08,476 PROFESSOR: That's right. 837 00:37:08,476 --> 00:37:09,017 That's right. 838 00:37:09,017 --> 00:37:11,884 The devil's in the details with all this kind of stuff. 839 00:37:11,884 --> 00:37:14,300 In other words, if there's some way for you to figure out, 840 00:37:14,300 --> 00:37:16,424 for example, that type of timing channel, 841 00:37:16,424 --> 00:37:18,840 you might find out that the amount of entropy-- the amount 842 00:37:18,840 --> 00:37:20,359 of randomness-- is not, let's say, 843 00:37:20,359 --> 00:37:22,400 the full size of a timestamp, but maybe something 844 00:37:22,400 --> 00:37:23,433 that's much smaller. 845 00:37:23,433 --> 00:37:25,141 Because maybe the attacker can figure out 846 00:37:25,141 --> 00:37:26,880 the hour and the minute in which you 847 00:37:26,880 --> 00:37:30,608 did this, but not the second, for example. 848 00:37:30,608 --> 00:37:33,326 We'll take one more question, then we'll move on. 849 00:37:33,326 --> 00:37:35,826 AUDIENCE: For the record, trying to roll your own randomness 850 00:37:35,826 --> 00:37:37,477 is usually a bad idea, right?
851 00:37:37,477 --> 00:37:38,560 PROFESSOR: That's correct. 852 00:37:38,560 --> 00:37:38,870 AUDIENCE: Usually, you should just 853 00:37:38,870 --> 00:37:40,547 use whatever's supplied by your systems. 854 00:37:40,547 --> 00:37:41,338 PROFESSOR: Oh, yes. 855 00:37:41,338 --> 00:37:42,832 That's very true. 856 00:37:42,832 --> 00:37:44,990 It's like inventing your own cryptosystem, which 857 00:37:44,990 --> 00:37:46,865 is another popular thing undergrads sometimes 858 00:37:46,865 --> 00:37:47,473 want to do. 859 00:37:47,473 --> 00:37:49,306 We're not the NSA, we're not mathematicians. 860 00:37:49,306 --> 00:37:50,302 That typically fails. 861 00:37:50,302 --> 00:37:51,800 So you're exactly right about that. 862 00:37:51,800 --> 00:37:54,630 But even if you use system-supplied randomness, 863 00:37:54,630 --> 00:37:57,310 you still may end up with fewer bits of entropy 864 00:37:57,310 --> 00:37:58,150 than you expect. 865 00:37:58,150 --> 00:38:00,441 And I'll give you an example of that when we talk about 866 00:38:00,441 --> 00:38:01,640 address space randomization. 867 00:38:01,640 --> 00:38:07,150 So that's basically how the stack canary approach works. 868 00:38:07,150 --> 00:38:12,040 And so since we're in a security class, you might be wondering, 869 00:38:12,040 --> 00:38:17,692 so what kinds of things will stack canaries not catch? 870 00:38:17,692 --> 00:38:20,762 So when do canaries fail? 871 00:38:28,310 --> 00:38:35,540 One way they can fail is if the attacker 872 00:38:35,540 --> 00:38:38,187 overwrites things like function pointers. 873 00:38:45,430 --> 00:38:47,780 Because if function pointers get [INAUDIBLE], 874 00:38:47,780 --> 00:38:49,807 there's nothing that the canary can 875 00:38:49,807 --> 00:38:52,120 do to prevent that type of exploit from taking place. 876 00:38:52,120 --> 00:38:57,890 For example, let's say you have code that declared a pointer.
877 00:39:00,890 --> 00:39:05,312 It gets initialized in some way, it doesn't really matter. 878 00:39:05,312 --> 00:39:08,648 Then you have a buffer here. 879 00:39:11,930 --> 00:39:15,010 Once again, the gets function rears its ugly head. 880 00:39:17,710 --> 00:39:25,090 And then, let's say, down here, we assign some value 5 881 00:39:25,090 --> 00:39:27,160 for the pointer. 882 00:39:27,160 --> 00:39:29,230 Now note that we haven't actually 883 00:39:29,230 --> 00:39:32,780 tried to attack the return address of the function that 884 00:39:32,780 --> 00:39:34,845 contains this code. 885 00:39:34,845 --> 00:39:37,130 When we do the buffer overflow, 886 00:39:37,130 --> 00:39:40,710 this pointer address up here is going to get corrupted. 887 00:39:40,710 --> 00:39:43,955 And so what ends up happening is that if the attacker can 888 00:39:43,955 --> 00:39:46,180 corrupt that pointer, then the attacker's 889 00:39:46,180 --> 00:39:50,930 able to write 5 to some attacker-controlled address. 890 00:39:50,930 --> 00:39:53,170 Does everyone see how the canary doesn't help here? 891 00:39:53,170 --> 00:39:54,711 Because we're basically not attacking 892 00:39:54,711 --> 00:39:57,650 the way that the function returns. 893 00:39:57,650 --> 00:40:01,026 AUDIENCE: But won't the pointer be below the buffer? 894 00:40:03,840 --> 00:40:06,035 PROFESSOR: So, yeah. 895 00:40:06,035 --> 00:40:07,160 AUDIENCE: Not necessarily-- 896 00:40:07,160 --> 00:40:08,440 PROFESSOR: So you're worried about, is it going to be here, 897 00:40:08,440 --> 00:40:09,730 or is it going to be here? 898 00:40:09,730 --> 00:40:11,715 AUDIENCE: I'm worried about when you-- 899 00:40:11,715 --> 00:40:14,048 will you actually be able to access where the pointer is 900 00:40:14,048 --> 00:40:14,960 when you're overflowing-- 901 00:40:14,960 --> 00:40:15,877 PROFESSOR: Ah, yeah. 902 00:40:15,877 --> 00:40:17,960 So you can't necessarily-- that's a good question.
903 00:40:17,960 --> 00:40:20,982 So I think, in a lot of the previous examples, 904 00:40:20,982 --> 00:40:23,610 you've been assuming that this guy would be here. 905 00:40:23,610 --> 00:40:24,690 Like, in the [INAUDIBLE]. 906 00:40:24,690 --> 00:40:27,102 If the stack is going this way, then the pointer 907 00:40:27,102 --> 00:40:28,490 would be down here. 908 00:40:28,490 --> 00:40:30,282 But the order of the particular variables-- 909 00:40:30,282 --> 00:40:32,031 it depends on a bunch of different things. 910 00:40:32,031 --> 00:40:34,400 It depends on the way that the compiler lays stuff out. 911 00:40:34,400 --> 00:40:36,730 It depends on the calling convention of the hardware, 912 00:40:36,730 --> 00:40:38,350 so on and so forth. 913 00:40:38,350 --> 00:40:41,820 But you're right that if the-- basically, 914 00:40:41,820 --> 00:40:43,740 if the buffer overflow was going this way, 915 00:40:43,740 --> 00:40:45,780 but the pointer was in front of the buffer, 916 00:40:45,780 --> 00:40:48,140 then it's going to work. 917 00:40:48,140 --> 00:40:50,028 AUDIENCE: Why can't you associate a canary 918 00:40:50,028 --> 00:40:51,798 with the function pointer, just like you 919 00:40:51,798 --> 00:40:53,587 did with the return address? 920 00:40:53,587 --> 00:40:54,170 PROFESSOR: Ah. 921 00:40:54,170 --> 00:40:55,685 That's an interesting point. 922 00:40:55,685 --> 00:40:57,274 You could do those things. 923 00:40:57,274 --> 00:40:59,860 In fact, you could try to imagine a compiler 924 00:40:59,860 --> 00:41:02,590 that, whenever it had any pointer whatsoever, 925 00:41:02,590 --> 00:41:05,451 it would always try to add padding for various things. 926 00:41:05,451 --> 00:41:05,950 Right? 927 00:41:05,950 --> 00:41:08,910 As it turns out, it seems like that will quickly 928 00:41:08,910 --> 00:41:12,766 get expensive, in terms of all the code that's 929 00:41:12,766 --> 00:41:15,250 added, to have to check for all those types of things.
930 00:41:15,250 --> 00:41:18,630 Because then you could imagine that every single time you 931 00:41:18,630 --> 00:41:21,048 want to invoke any pointer, or call any function, 932 00:41:21,048 --> 00:41:22,506 you've got to have this code that's 933 00:41:22,506 --> 00:41:24,690 going to check whether that canary is correct. 934 00:41:24,690 --> 00:41:27,064 But yeah, in principle, you could do something like that. 935 00:41:29,380 --> 00:41:30,510 So does this make sense? 936 00:41:30,510 --> 00:41:33,012 So we see that canaries don't help you in this situation. 937 00:41:36,490 --> 00:41:39,160 And so another thing, as we've discussed before, 938 00:41:39,160 --> 00:41:46,112 is that if you can guess the randomness, then, basically, 939 00:41:46,112 --> 00:41:48,080 the random canaries don't work. 940 00:41:57,440 --> 00:42:01,560 Producing secure sources of randomness 941 00:42:01,560 --> 00:42:03,234 is actually a topic in and of itself. 942 00:42:03,234 --> 00:42:05,025 That's very, very complicated, so we're not 943 00:42:05,025 --> 00:42:06,710 going to go into great depth about that here. 944 00:42:06,710 --> 00:42:08,380 But suffice it to say, if you can guess the randomness, 945 00:42:08,380 --> 00:42:09,420 everything falls apart. 946 00:42:09,420 --> 00:42:11,915 AUDIENCE: So do canaries usually have less bits than the return 947 00:42:11,915 --> 00:42:12,414 address? 948 00:42:12,414 --> 00:42:13,914 Because otherwise, couldn't you just 949 00:42:13,914 --> 00:42:17,706 memorize the return address and check that the address changed? 950 00:42:17,706 --> 00:42:18,580 PROFESSOR: Let's see. 951 00:42:18,580 --> 00:42:23,310 So you're saying if the canary here is, let's say, 952 00:42:23,310 --> 00:42:25,568 smaller than-- 953 00:42:25,568 --> 00:42:28,459 AUDIENCE: I'm saying for the canary is that you know 954 00:42:28,459 --> 00:42:32,051 what that value is [INAUDIBLE].
955 00:42:32,051 --> 00:42:34,051 Can't you also memorize the return address value 956 00:42:34,051 --> 00:42:37,550 and check if that's been changed? 957 00:42:37,550 --> 00:42:40,505 PROFESSOR: Oh, so you're saying can't the secure system-- 958 00:42:40,505 --> 00:42:42,373 can't it look at the return address 959 00:42:42,373 --> 00:42:45,320 and figure out if that's been changed. 960 00:42:45,320 --> 00:42:46,270 Yeah. 961 00:42:46,270 --> 00:42:50,329 In other words, if there-- well, yes and no. 962 00:42:50,329 --> 00:42:51,787 Note that there's still this that's 963 00:42:51,787 --> 00:42:53,953 going to get overwritten in the buffer overflow attack. 964 00:42:53,953 --> 00:42:56,350 So this may still cause problems. 965 00:42:56,350 --> 00:42:59,720 But in principle, if somehow these things 966 00:42:59,720 --> 00:43:04,620 were invariant somehow, then you could do something like that. 967 00:43:04,620 --> 00:43:07,640 But the problem is that, in many cases, 968 00:43:07,640 --> 00:43:09,750 this return-- the bookkeeping overhead for that 969 00:43:09,750 --> 00:43:10,968 would be a little bit tricky. 970 00:43:10,968 --> 00:43:13,060 Because you can imagine that particular function 971 00:43:13,060 --> 00:43:16,147 may be called from different places, and so on and so forth. 972 00:43:16,147 --> 00:43:17,605 Just in the interest of time, we're 973 00:43:17,605 --> 00:43:19,253 going to zoom forward a little bit. 974 00:43:19,253 --> 00:43:20,794 But if we have time at the end, we'll 975 00:43:20,794 --> 00:43:22,335 come back to some of these questions. 976 00:43:25,308 --> 00:43:29,330 So those are some situations in which the canary can fail. 977 00:43:29,330 --> 00:43:32,800 There's some other places in which it can fail, too. 978 00:43:32,800 --> 00:43:35,220 For example, one way that it might fail 979 00:43:35,220 --> 00:43:38,650 is with malloc and free attacks. 980 00:43:38,650 --> 00:43:44,446 This is a uniquely C-style attack.
981 00:43:44,446 --> 00:43:45,750 Let's see what happens here. 982 00:43:49,860 --> 00:43:59,012 Imagine that you have two pointers here, p and q. 983 00:43:59,012 --> 00:44:08,110 And then imagine that we issue a malloc for both of these. 984 00:44:08,110 --> 00:44:11,310 We give p 1,024 bytes of memory. 985 00:44:11,310 --> 00:44:15,080 We also give q 1,024 bytes of memory. 986 00:44:17,880 --> 00:44:29,478 And then, let's say that we do a strcpy on p 987 00:44:29,478 --> 00:44:31,898 from some buffer that's controlled by the attacker. 988 00:44:31,898 --> 00:44:35,300 So here's where the overflow happens. 989 00:44:35,300 --> 00:44:43,770 And then let's say that we free q 990 00:44:43,770 --> 00:44:48,010 and then let's say that we free p. 991 00:44:48,010 --> 00:44:48,510 OK. 992 00:44:48,510 --> 00:44:50,360 So it's fairly straightforward code, right? 993 00:44:50,360 --> 00:44:54,321 Two pointers-- malloc the memory for each one of them. 994 00:44:54,321 --> 00:44:55,945 You use one of these unsafe functions, 995 00:44:55,945 --> 00:45:03,380 the buffer overflow happens, and we free q and we free p. 996 00:45:03,380 --> 00:45:12,540 Let's assume that p and q-- the memory that's 997 00:45:12,540 --> 00:45:22,142 assigned to them-- are nearby, in terms of the layout in terms 998 00:45:22,142 --> 00:45:23,190 of [INAUDIBLE]. 999 00:45:23,190 --> 00:45:27,520 So both of these objects lie next to each other 1000 00:45:27,520 --> 00:45:30,320 in the memory space. 1001 00:45:30,320 --> 00:45:34,460 There's some subtle and evil things that can happen, right? 1002 00:45:34,460 --> 00:45:39,860 Because this strcpy here might actually over-- 1003 00:45:39,860 --> 00:45:41,830 it'll fill p with a bunch of stuff, 1004 00:45:41,830 --> 00:45:47,510 but it might also corrupt some of the state that belongs to q. 1005 00:45:47,510 --> 00:45:48,010 OK? 1006 00:45:48,010 --> 00:45:49,020 And this can cause problems.
1007 00:45:49,020 --> 00:45:50,519 And some of you may have done things 1008 00:45:50,519 --> 00:45:52,635 like this unintentionally in your own code, when 1009 00:45:52,635 --> 00:45:55,110 you have some type of weird use of pointers. 1010 00:45:55,110 --> 00:45:56,890 And then stuff seems to work, but when 1011 00:45:56,890 --> 00:45:58,778 you call free later on, it segfaults 1012 00:45:58,778 --> 00:45:59,880 or something like that. 1013 00:45:59,880 --> 00:46:00,500 Right? 1014 00:46:00,500 --> 00:46:01,660 What I'm going to talk about here 1015 00:46:01,660 --> 00:46:03,201 is the way that the attacker can take 1016 00:46:03,201 --> 00:46:04,550 advantage of that behavior. 1017 00:46:04,550 --> 00:46:06,591 We're actually going to explain why that happens. 1018 00:46:06,591 --> 00:46:12,520 So imagine that inside the implementation 1019 00:46:12,520 --> 00:46:17,380 of free and malloc, an allocated block looks like this. 1020 00:46:21,040 --> 00:46:29,320 So let's say that there is the app-visible data that lives up 1021 00:46:29,320 --> 00:46:29,820 here. 1022 00:46:29,820 --> 00:46:35,250 And then let's say you had a size variable down here. 1023 00:46:35,250 --> 00:46:38,310 This is not something that the application sees directly. 1024 00:46:38,310 --> 00:46:40,200 This is like some bookkeeping info 1025 00:46:40,200 --> 00:46:43,110 that the free or the malloc system 1026 00:46:43,110 --> 00:46:45,390 tracks so that it knows the size of the buffer 1027 00:46:45,390 --> 00:46:47,990 that it allocated. 1028 00:46:47,990 --> 00:46:55,480 Let's say that a free block has some metadata that 1029 00:46:55,480 --> 00:46:56,350 looks like this. 1030 00:47:03,092 --> 00:47:06,126 You've got the size of the free block here. 1031 00:47:06,126 --> 00:47:09,070 And then you've got a bunch of empty space here. 1032 00:47:09,070 --> 00:47:11,320 Then let's say-- this is where things get interesting.
1033 00:47:11,320 --> 00:47:17,230 You've got a backwards pointer and then 1034 00:47:17,230 --> 00:47:19,770 you've got a forward pointer. 1035 00:47:25,352 --> 00:47:27,060 And maybe you've got some size data here. 1036 00:47:27,060 --> 00:47:28,976 Now why are we having these two pointers here? 1037 00:47:28,976 --> 00:47:30,415 It's because the memory allocation 1038 00:47:30,415 --> 00:47:33,800 system, in this case, is using a doubly-linked list 1039 00:47:33,800 --> 00:47:37,950 to track how the free blocks relate to each other. 1040 00:47:37,950 --> 00:47:39,605 So when you allocate a free block, 1041 00:47:39,605 --> 00:47:41,520 you take it off of this doubly-linked list. 1042 00:47:41,520 --> 00:47:45,070 And then when you deallocate it, you do some pointer arithmetic, 1043 00:47:45,070 --> 00:47:46,480 and then you fix these things up. 1044 00:47:46,480 --> 00:47:48,690 Then you add it back to that linked list, right? 1045 00:47:48,690 --> 00:47:51,100 So as always, whenever you hear pointer arithmetic, 1046 00:47:51,100 --> 00:47:52,667 you should think it's your canary. 1047 00:47:52,667 --> 00:47:55,000 Because that's where a lot of these problems come about. 1048 00:47:55,000 --> 00:48:01,230 And so the thing to note is that we had this buffer overflow 1049 00:48:01,230 --> 00:48:03,007 here, on p. 1050 00:48:03,007 --> 00:48:05,930 If we assume that p and q are next to each other, 1051 00:48:05,930 --> 00:48:08,795 or very close in memory, then what can end up happening 1052 00:48:08,795 --> 00:48:12,080 is that this buffer overflow can overwrite 1053 00:48:12,080 --> 00:48:19,050 some of this size data for the allocated pointer, q. 1054 00:48:19,050 --> 00:48:20,470 Is everybody with me so far? 1055 00:48:20,470 --> 00:48:22,860 Because if you're with me so far, then basically, 1056 00:48:22,860 --> 00:48:24,720 you can use your imagination at this point 1057 00:48:24,720 --> 00:48:26,230 and see where things go wrong.
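The adjacency problem described above can be reproduced in a toy model. This is only a sketch under stated assumptions: one byte array stands in for the heap, each block is laid out as a size header followed by its payload (a real allocator's layout will differ, and the lecture's diagram draws the size field elsewhere), and the names toy_alloc, toy_size, unchecked_copy and overflow_demo are made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one byte array stands in for the heap, so every write
 * below stays inside a single C object and the demo is well defined. */
enum { HDR = sizeof(size_t), PAYLOAD = 16 };

static unsigned char arena[2 * (HDR + PAYLOAD)];

/* Hypothetical helper: lay out block number idx as [size header][payload]. */
static unsigned char *toy_alloc(int idx) {
    unsigned char *hdr = arena + (size_t)idx * (HDR + PAYLOAD);
    size_t size = PAYLOAD;
    memcpy(hdr, &size, HDR);
    return hdr + HDR;               /* the application only sees the payload */
}

/* Read back the size header that sits just before a payload. */
static size_t toy_size(const unsigned char *payload) {
    size_t size;
    memcpy(&size, payload - HDR, HDR);
    return size;
}

/* Copy n attacker-supplied bytes with no check against the block size,
 * which is the mistake an unbounded strcpy makes. */
static void unchecked_copy(unsigned char *p, const unsigned char *src, size_t n) {
    assert(p + n <= arena + sizeof arena);  /* keeps the demo itself in bounds */
    memcpy(p, src, n);
}

/* Overflow block p by HDR bytes and return the size now recorded for q. */
static size_t overflow_demo(void) {
    unsigned char *p = toy_alloc(0);
    unsigned char *q = toy_alloc(1);
    unsigned char evil[PAYLOAD + HDR];
    size_t forged = 0x41414141u;
    memset(evil, 'A', PAYLOAD);
    memcpy(evil + PAYLOAD, &forged, HDR);   /* these bytes land on q's header */
    unchecked_copy(p, evil, sizeof evil);
    return toy_size(q);
}
```

Calling overflow_demo() from a main function returns the forged value rather than 16, so the allocator's bookkeeping for q now holds attacker-chosen data even though the program only ever wrote through p.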
1058 00:48:26,230 --> 00:48:27,605 Because essentially, what's going 1059 00:48:27,605 --> 00:48:31,840 to end up happening is that these free operations-- they 1060 00:48:31,840 --> 00:48:35,978 look at this metadata to do all kinds of pointer manipulations 1061 00:48:35,978 --> 00:48:37,226 with this kind of stuff. 1062 00:48:53,220 --> 00:48:56,790 Somewhere in the implementation of free, 1063 00:48:56,790 --> 00:49:05,480 it's going to get some pointer based 1064 00:49:05,480 --> 00:49:11,100 on the value of size, where size is something 1065 00:49:11,100 --> 00:49:12,236 the attacker controls. 1066 00:49:12,236 --> 00:49:14,110 Because the attacker did the buffer overflow. 1067 00:49:14,110 --> 00:49:14,609 Right? 1068 00:49:14,609 --> 00:49:18,610 So then, you can imagine that it does 1069 00:49:18,610 --> 00:49:20,090 a bunch of pointer arithmetic. 1070 00:49:25,030 --> 00:49:28,950 So it's going to look at the back and the forward 1071 00:49:28,950 --> 00:49:33,110 pointers of this block. 1072 00:49:33,110 --> 00:49:35,910 And then it's going to do something 1073 00:49:35,910 --> 00:49:37,880 like update the back pointer. 1074 00:49:41,660 --> 00:49:44,760 And also update the forward pointer. 1075 00:49:49,807 --> 00:49:51,348 And the exact specifics of this-- you 1076 00:49:51,348 --> 00:49:52,867 don't need to worry about. 1077 00:49:52,867 --> 00:49:55,450 This is just an example of the code that takes place in there. 1078 00:49:55,450 --> 00:49:58,520 But the point is that note that because the attacker's 1079 00:49:58,520 --> 00:50:00,570 overwritten size, the attacker now 1080 00:50:00,570 --> 00:50:03,860 controls this pointer that's passed into the free code. 1081 00:50:03,860 --> 00:50:06,040 And because of that, these two statements 1082 00:50:06,040 --> 00:50:08,830 here, these are actually pointer updates. 1083 00:50:08,830 --> 00:50:09,330 Right? 1084 00:50:09,330 --> 00:50:10,870 This is a pointer somewhere.
1085 00:50:10,870 --> 00:50:15,080 And because the attacker has been able to control this p, 1086 00:50:15,080 --> 00:50:17,690 he actually controls all this stuff, too. 1087 00:50:17,690 --> 00:50:20,680 This is where the attack can actually take place. 1088 00:50:20,680 --> 00:50:22,862 So when the free code operates and it 1089 00:50:22,862 --> 00:50:25,370 tries to do things like, for example, merge these two 1090 00:50:25,370 --> 00:50:27,289 blocks, that's typically why you have 1091 00:50:27,289 --> 00:50:28,580 [INAUDIBLE] doubly-linked list. 1092 00:50:28,580 --> 00:50:30,791 Because if you have two blocks that are adjacent to each other 1093 00:50:30,791 --> 00:50:33,650 and they're both free, you want to merge them to one big block. 1094 00:50:33,650 --> 00:50:36,180 Well, we control size. 1095 00:50:36,180 --> 00:50:38,154 That means we control this whole process here. 1096 00:50:38,154 --> 00:50:41,235 That means if we've been clever in how these overflows are 1097 00:50:41,235 --> 00:50:44,474 working, at these points, we can write to memory in the way 1098 00:50:44,474 --> 00:50:46,840 that we choose. 1099 00:50:46,840 --> 00:50:49,065 Does that make sense? 1100 00:50:49,065 --> 00:50:50,550 And like I said, this type of thing 1101 00:50:50,550 --> 00:50:52,974 often happens in your own code when you're not being 1102 00:50:52,974 --> 00:50:54,015 very careful with pointers. 1103 00:50:54,015 --> 00:50:56,985 When you make some mistake with the double freeing or whatever, 1104 00:50:56,985 --> 00:50:59,507 this is why stuff will segfault sometimes. 1105 00:50:59,507 --> 00:51:01,090 Because you've messed up this metadata 1106 00:51:01,090 --> 00:51:03,690 that lives with each one of these allocated blocks. 1107 00:51:03,690 --> 00:51:05,950 And then at some point, this calculation 1108 00:51:05,950 --> 00:51:08,465 will point to some garbage value, and then you're dead.
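The two pointer updates inside free can be sketched as an ordinary doubly-linked-list removal. This is a simplified illustration, not any real allocator's code: the struct chunk layout, the field names fd and bk, and the helper names are assumptions, and the step where free derives the chunk address from the corrupted size is left out. To keep the demo well defined, the "attacker-chosen" addresses are just two other local structs.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>

/* Toy free-list node; the field names are illustrative only. */
struct chunk {
    size_t size;
    struct chunk *fd;   /* forward pointer */
    struct chunk *bk;   /* backward pointer */
};

/* Plain list removal with no sanity checks on fd or bk. */
static void unlink_chunk(struct chunk *c) {
    assert(c && c->fd && c->bk);
    c->bk->fd = c->fd;   /* write 1: lands wherever c->bk points */
    c->fd->bk = c->bk;   /* write 2: lands wherever c->fd points */
}

/* Returns 1 if unlinking a forged chunk modified two unrelated objects. */
static int forged_unlink_demo(void) {
    struct chunk a = {0, NULL, NULL};    /* stand-ins for locations the   */
    struct chunk b = {0, NULL, NULL};    /* attacker wants to write to    */
    struct chunk forged = {0, &a, &b};   /* fd and bk came from the overflow */
    unlink_chunk(&forged);
    return b.fd == &a && a.bk == &b;
}
```

forged_unlink_demo() returns 1: neither a nor b was ever on a free list, yet both were written, because the list code trusted pointers that the overflow supplied.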
1109 00:51:08,465 --> 00:51:11,048 But if you're the attacker, you can actually choose that value 1110 00:51:11,048 --> 00:51:12,482 and use it for your own advantage. 1111 00:51:17,280 --> 00:51:17,790 OK. 1112 00:51:17,790 --> 00:51:22,345 So now let's get to another approach 1113 00:51:22,345 --> 00:51:27,070 for getting rid of some of these buffer overflow attacks. 1114 00:51:27,070 --> 00:51:30,742 And that approach is bounds checking. 1115 00:51:38,480 --> 00:51:45,090 The goal of bounds checking is to make sure 1116 00:51:45,090 --> 00:51:48,768 that when you use a particular pointer, 1117 00:51:48,768 --> 00:51:54,380 it only refers to something that is a memory object. 1118 00:51:54,380 --> 00:51:58,380 And that pointer's in the valid bounds of that memory object. 1119 00:51:58,380 --> 00:52:00,760 So that's the basic idea behind the idea. 1120 00:52:00,760 --> 00:52:03,180 It's actually pretty simple-- at a high level. 1121 00:52:03,180 --> 00:52:05,610 Once again, in C, though, it's very difficult 1122 00:52:05,610 --> 00:52:07,010 to actually understand things. 1123 00:52:07,010 --> 00:52:08,926 Like, what does it actually mean for a pointer 1124 00:52:08,926 --> 00:52:11,260 to be in bounds or out of bounds, or valid or invalid? 1125 00:52:11,260 --> 00:52:13,920 So for example, let's say that you have 1126 00:52:13,920 --> 00:52:16,620 two pieces of code like this. 1127 00:52:16,620 --> 00:52:24,580 So you declare a character array of 1,024 bytes. 1128 00:52:24,580 --> 00:52:29,570 And then let's say that you use something like this. 1129 00:52:29,570 --> 00:52:32,920 You declare a pointer, and then you'd 1130 00:52:32,920 --> 00:52:38,686 get the address of one of the elements in x. 1131 00:52:41,620 --> 00:52:43,070 Does this make sense? 1132 00:52:43,070 --> 00:52:45,240 Is this a good idea to do that? 1133 00:52:45,240 --> 00:52:46,900 It's hard to say. 
1134 00:52:46,900 --> 00:52:50,182 If you're treating this x up here as a string, 1135 00:52:50,182 --> 00:52:52,556 maybe it makes sense for you to take a pointer like this. 1136 00:52:52,556 --> 00:52:54,871 Then you can increment and decrement, because maybe you're 1137 00:52:54,871 --> 00:52:57,287 looking for some special value of a character in there. 1138 00:52:57,287 --> 00:53:00,545 But if this is a network message or something like that, 1139 00:53:00,545 --> 00:53:04,120 maybe there's actually some struct that's embedded in here. 1140 00:53:04,120 --> 00:53:05,670 So it doesn't actually make sense 1141 00:53:05,670 --> 00:53:07,661 to walk this character by character, right? 1142 00:53:07,661 --> 00:53:09,535 So the challenge here is that, once again, 1143 00:53:09,535 --> 00:53:12,110 C allows you to do whatever you want. 1144 00:53:12,110 --> 00:53:15,510 It's hard to determine what it is you actually want it to do. 1145 00:53:15,510 --> 00:53:18,280 And so, as a result, it's a little bit 1146 00:53:18,280 --> 00:53:19,960 subtle with how you define things 1147 00:53:19,960 --> 00:53:23,440 like pointer safety in C. 1148 00:53:23,440 --> 00:53:26,470 You can also imagine that life gets even more complicated 1149 00:53:26,470 --> 00:53:30,600 if you use structs and unions. 1150 00:53:30,600 --> 00:53:32,075 Imagine you had a union. 1151 00:53:32,075 --> 00:53:35,462 It would look like this. 1152 00:53:35,462 --> 00:53:38,941 It's got some integer value in there. 1153 00:53:38,941 --> 00:53:43,414 And then you've got some struct. 1154 00:53:43,414 --> 00:53:46,396 And then, it has two integers inside of it. 1155 00:53:59,600 --> 00:54:02,510 Don't forget the way that the unions work is that, basically, 1156 00:54:02,510 --> 00:54:04,910 the union's going to allocate the maximum size 1157 00:54:04,910 --> 00:54:07,184 for the largest element.
1158 00:54:07,184 --> 00:54:08,600 At any given moment, you typically 1159 00:54:08,600 --> 00:54:11,320 expect that either this int i will be valid 1160 00:54:11,320 --> 00:54:14,810 or this struct s will be valid, but not both. 1161 00:54:14,810 --> 00:54:18,694 So imagine that you had code that did something like this. 1162 00:54:21,540 --> 00:54:26,414 You get a pointer to the address of this guy. 1163 00:54:33,218 --> 00:54:37,990 So I get an integer pointer to the address of, in the union, 1164 00:54:37,990 --> 00:54:40,800 this struct, and then k. 1165 00:54:40,800 --> 00:54:45,920 Well, this reference is strictly speaking in bounds. 1166 00:54:45,920 --> 00:54:47,910 There's memory that's been allocated for this. 1167 00:54:47,910 --> 00:54:49,280 That's not incorrect. 1168 00:54:49,280 --> 00:54:51,960 But are you actually, at this moment in program execution, 1169 00:54:51,960 --> 00:54:54,700 treating this union as one of these guys 1170 00:54:54,700 --> 00:54:56,470 or one of these guys? 1171 00:54:56,470 --> 00:54:58,200 It's hard to say. 1172 00:54:58,200 --> 00:55:01,720 So as a result of these ambiguous pointer semantics 1173 00:55:01,720 --> 00:55:05,730 that can arise in these C programs, typically, 1174 00:55:05,730 --> 00:55:09,300 these bounds checking approaches can only 1175 00:55:09,300 --> 00:55:12,540 offer a weaker notion of pointer correctness. 1176 00:55:12,540 --> 00:55:16,840 And so that notion is as follows. 1177 00:55:23,995 --> 00:55:32,860 If you have a pointer p prime that's 1178 00:55:32,860 --> 00:55:50,530 derived from the base pointer p, then p prime 1179 00:55:50,530 --> 00:56:07,604 should only be used to dereference memory that belongs 1180 00:56:07,604 --> 00:56:08,812 to the original base pointer.
1181 00:56:17,620 --> 00:56:20,850 So for a derived pointer p prime that's 1182 00:56:20,850 --> 00:56:23,200 derived from some original p, then 1183 00:56:23,200 --> 00:56:25,850 p prime should only be used to dereference memory that 1184 00:56:25,850 --> 00:56:27,680 belongs to p. 1185 00:56:27,680 --> 00:56:31,880 Note that this is a weaker goal than enforcing completely 1186 00:56:31,880 --> 00:56:34,380 correct pointer semantics. 1187 00:56:34,380 --> 00:56:36,380 Because for example, you could still 1188 00:56:36,380 --> 00:56:41,244 have weird issues like with this union here, for example. 1189 00:56:41,244 --> 00:56:43,160 Maybe at this particular point in the program, 1190 00:56:43,160 --> 00:56:45,380 it wasn't correct for the program 1191 00:56:45,380 --> 00:56:49,150 to be able to reference that particular value in the union. 1192 00:56:49,150 --> 00:56:53,500 But at least this pointer reference is in bounds. 1193 00:56:53,500 --> 00:56:59,490 So maybe-- like this example up here-- maybe this creation 1194 00:56:59,490 --> 00:57:02,660 of this pointer here violated the semantics 1195 00:57:02,660 --> 00:57:04,600 of the network message embedded in x. 1196 00:57:04,600 --> 00:57:07,710 But at least you're not trampling on arbitrary memory. 1197 00:57:07,710 --> 00:57:11,221 You're only trampling on the memory that belongs to you. 1198 00:57:11,221 --> 00:57:13,470 And so, in the world of C, this is considered success. 1199 00:57:16,949 --> 00:57:17,990 So that's the basic idea. 1200 00:57:17,990 --> 00:57:20,750 Now, the challenge with enforcing 1201 00:57:20,750 --> 00:57:24,103 these types of semantics here is that, in many cases, 1202 00:57:24,103 --> 00:57:26,132 you need help from the compiler. 1203 00:57:26,132 --> 00:57:27,590 So you need help from the compiler. 1204 00:57:27,590 --> 00:57:30,640 You typically need to recompile programs 1205 00:57:30,640 --> 00:57:32,346 to enforce these semantics.
1206 00:57:32,346 --> 00:57:34,836 That can be a drag for backwards compatibility. 1207 00:57:34,836 --> 00:57:38,520 But this is the basic notion of bounds checking. 1208 00:57:38,520 --> 00:57:41,230 What are some ways that you can implement bounds checking? 1209 00:57:49,000 --> 00:57:55,498 One very simple way is this notion called electric fencing. 1210 00:58:01,870 --> 00:58:06,410 The notion here is that, for every object that you allocate 1211 00:58:06,410 --> 00:58:13,930 on the heap, you allocate a guard page that's 1212 00:58:13,930 --> 00:58:15,440 immediately next to it. 1213 00:58:15,440 --> 00:58:18,820 And you set the page protection on that page, such 1214 00:58:18,820 --> 00:58:22,104 that if anybody tries to touch that, you get a hard fault. 1215 00:58:22,104 --> 00:58:23,770 The hardware says that's out of bounds, 1216 00:58:23,770 --> 00:58:26,205 and then the program will stop right there. 1217 00:58:26,205 --> 00:58:29,195 And so this is a very simple thing that you can do. 1218 00:58:29,195 --> 00:58:31,316 And what's nice about this approach actually, 1219 00:58:31,316 --> 00:58:34,560 is that whenever you have an invalid memory reference, 1220 00:58:34,560 --> 00:58:37,140 this causes a fault immediately, right. 1221 00:58:37,140 --> 00:58:39,230 If you've ever debugged a C or C++ program, 1222 00:58:39,230 --> 00:58:41,854 one of the big problems is that a lot of times when you corrupt 1223 00:58:41,854 --> 00:58:46,300 memory, that memory is corrupted silently, and for a while, 1224 00:58:46,300 --> 00:58:49,130 and it isn't until later that something crashes and then only 1225 00:58:49,130 --> 00:58:50,870 then you realize something happened. 1226 00:58:50,870 --> 00:58:52,380 But you don't know what that something is. 1227 00:58:52,380 --> 00:58:54,390 These are what they call heisenbugs, right. 1228 00:58:54,390 --> 00:58:56,030 Things that have this notion of uncertainty in them.
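A guard page of the kind described above can be sketched with POSIX mmap and mprotect. This is a minimal illustration under assumptions, not how any particular electric-fence tool is implemented: fenced_alloc and signal_from_write are hypothetical helpers, the object is limited to at most one page, alignment of the returned pointer is ignored, and the mapping is never released. The out-of-bounds write runs in a forked child so the calling process survives to report the result.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return n bytes that end exactly where an
 * inaccessible (PROT_NONE) guard page begins. */
static char *fenced_alloc(size_t n) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert(n > 0 && n <= page);
    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return NULL;
    if (mprotect(base + page, page, PROT_NONE) != 0) return NULL;
    return base + page - n;         /* last byte of the object abuts the guard */
}

/* Write buf[index] in a child process. Returns the number of the signal
 * that killed the child, 0 if the write succeeded, or -1 on setup failure. */
static int signal_from_write(size_t n, size_t index) {
    char *buf = fenced_alloc(n);
    if (!buf) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        ((volatile char *)buf)[index] = 'x';
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With an 8-byte object, signal_from_write(8, 7) returns 0 because the last valid byte is writable, while signal_from_write(8, 8) returns a nonzero signal number (typically SIGSEGV on Linux), since the very first byte past the object is on the guard page. The two pages spent on 8 bytes also show the space cost.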
1229 00:58:56,030 --> 00:58:58,150 So what's nice about this is that as soon 1230 00:58:58,150 --> 00:59:00,760 as the pointer hits here, boom, it's a guard page, 1231 00:59:00,760 --> 00:59:03,060 everything blows up. 1232 00:59:03,060 --> 00:59:05,130 Now can you think of a disadvantage 1233 00:59:05,130 --> 00:59:06,937 with this approach? 1234 00:59:06,937 --> 00:59:08,520 AUDIENCE: It takes longer [INAUDIBLE]. 1235 00:59:08,520 --> 00:59:10,049 PROFESSOR: Yeah exactly. 1236 00:59:10,049 --> 00:59:12,090 So imagine that this little-- this key thing here 1237 00:59:12,090 --> 00:59:15,120 was super, super small, then I've allocated a whole page 1238 00:59:15,120 --> 00:59:19,240 just to make sure that my little tiny thing here didn't get-- 1239 00:59:19,240 --> 00:59:21,133 didn't have one of these pointer attacks. 1240 00:59:21,133 --> 00:59:23,864 So this is very space intensive. 1241 00:59:23,864 --> 00:59:25,905 And so-- but people don't really deploy something 1242 00:59:25,905 --> 00:59:28,000 like this in production. 1243 00:59:28,000 --> 00:59:29,890 This could be useful as a debugging thing, 1244 00:59:29,890 --> 00:59:32,859 but you would never do this for a real program. 1245 00:59:32,859 --> 00:59:33,650 So that make sense? 1246 00:59:33,650 --> 00:59:36,600 So these electric fences are actually pretty-- pretty 1247 00:59:36,600 --> 00:59:39,222 simple to understand. 1248 00:59:39,222 --> 00:59:42,340 AUDIENCE: Why does that have to be so large, necessarily? 1249 00:59:42,340 --> 00:59:46,930 PROFESSOR: Ah, so the reason is because this guard page here, 1250 00:59:46,930 --> 00:59:50,142 you're typically relying on the hardware, like page level 1251 00:59:50,142 --> 00:59:52,100 protections to deal with those types of things. 1252 00:59:52,100 --> 00:59:54,230 And so there's like certain memory size 1253 00:59:54,230 --> 00:59:56,359 you can set to the size of the page, according to 1254 00:59:56,359 --> 00:59:56,900 [? Hollis ?].
1255 00:59:56,900 --> 00:59:58,886 But typically that page is 4k, for example. 1256 00:59:58,886 --> 01:00:00,260 So getting back to your question, 1257 01:00:00,260 --> 01:00:02,910 if this is some like super small value here, 1258 01:00:02,910 --> 01:00:05,100 then yeah [INAUDIBLE] 2 bytes where 1259 01:00:05,100 --> 01:00:08,292 you got 4k here protecting it. 1260 01:00:08,292 --> 01:00:11,638 AUDIENCE: In protecting [INAUDIBLE] individual 1261 01:00:11,638 --> 01:00:13,027 [INAUDIBLE]. 1262 01:00:13,027 --> 01:00:14,818 PROFESSOR: Oh sorry yeah, yeah so by heap I 1263 01:00:14,818 --> 01:00:16,044 mean like heap object. 1264 01:00:16,044 --> 01:00:16,960 AUDIENCE: [INAUDIBLE]. 1265 01:00:16,960 --> 01:00:18,834 PROFESSOR: Yeah thank you for-- yeah exactly. 1266 01:00:18,834 --> 01:00:20,504 So imagine like for each malloc you do, 1267 01:00:20,504 --> 01:00:22,920 you can have one of these-- and set the guard page for it. 1268 01:00:22,920 --> 01:00:25,110 AUDIENCE: And you do it for below and above? 1269 01:00:25,110 --> 01:00:25,990 Or just above? 1270 01:00:25,990 --> 01:00:27,234 PROFESSOR: You can do either. 1271 01:00:27,234 --> 01:00:28,210 AUDIENCE: [INAUDIBLE] 1272 01:00:30,260 --> 01:00:31,260 PROFESSOR: That's right. 1273 01:00:31,260 --> 01:00:31,740 AUDIENCE: [INAUDIBLE]. 1274 01:00:31,740 --> 01:00:33,480 PROFESSOR: That's right, well you could do either. 1275 01:00:33,480 --> 01:00:34,979 The ones we have depending on this-- 1276 01:00:34,979 --> 01:00:37,030 on the size of the object. 1277 01:00:37,030 --> 01:00:40,010 I mean now you got to declare two guard fences, right. 1278 01:00:40,010 --> 01:00:42,610 So now this quickly gets out of control. 1279 01:00:42,610 --> 01:00:46,615 Which yeah, you could have a booking [INAUDIBLE]. 1280 01:00:46,615 --> 01:00:48,571 So that's the basic idea behind that. 1281 01:00:58,351 --> 01:01:02,280 And then another approach you can look at 1282 01:01:02,280 --> 01:01:07,028 is what they call fat pointers.
1283 01:01:11,990 --> 01:01:13,490 And so the idea here is we actually 1284 01:01:13,490 --> 01:01:16,280 want to modify the pointer representation itself 1285 01:01:16,280 --> 01:01:18,390 to include bounds information in it. 1286 01:01:18,390 --> 01:01:27,290 So if you look at your regular 32-bit pointer what's it 1287 01:01:27,290 --> 01:01:28,430 look like? 1288 01:01:28,430 --> 01:01:30,128 Well the answer is, 32-bits. 1289 01:01:30,128 --> 01:01:32,842 And then you got [INAUDIBLE]. 1290 01:01:32,842 --> 01:01:33,341 Right? 1291 01:01:33,341 --> 01:01:41,116 If you look at a fat pointer then one 1292 01:01:41,116 --> 01:01:42,740 way you can think about looking at this 1293 01:01:42,740 --> 01:01:46,710 is you got a 4 byte base. 1294 01:01:50,580 --> 01:01:57,143 And then you have a 4 byte end. 1295 01:01:57,143 --> 01:01:59,393 So in other words, this is where the allocated object 1296 01:01:59,393 --> 01:02:02,000 starts, that's where it ends 1297 01:02:02,000 --> 01:02:09,125 and then you've got a 4 byte cur address. 1298 01:02:12,660 --> 01:02:14,220 So this is where the pointer actually 1299 01:02:14,220 --> 01:02:16,500 is, within that bounds, right. 1300 01:02:16,500 --> 01:02:20,550 So basically what happens is that the compiler will generate 1301 01:02:20,550 --> 01:02:24,530 code, such that when you access these fat pointers this gets 1302 01:02:24,530 --> 01:02:26,655 updated, but then it'll also check these two things 1303 01:02:26,655 --> 01:02:28,109 to make sure that nothing bad has 1304 01:02:28,109 --> 01:02:30,220 happened during that update. 1305 01:02:30,220 --> 01:02:33,740 So for example you can imagine that if I had code like this. 1306 01:02:42,240 --> 01:02:47,780 So I have an int pointer and then I allocate 8 bytes. 1307 01:02:47,780 --> 01:02:49,880 So assuming that we're on a 32-bit architecture 1308 01:02:49,880 --> 01:02:53,480 to point to 2 [INAUDIBLE].
1309 01:02:53,480 --> 01:02:59,277 And then I have some while loop that 1310 01:02:59,277 --> 01:03:05,138 is going to just assign some value to the pointer and then 1311 01:03:05,138 --> 01:03:11,520 increment the pointer-- what you'll see 1312 01:03:11,520 --> 01:03:13,830 is that the current address for this pointer, 1313 01:03:13,830 --> 01:03:18,700 like at this point in code, will point to the base, right. 1314 01:03:18,700 --> 01:03:21,930 And then every time we iterate through here, 1315 01:03:21,930 --> 01:03:24,067 we can see that we're either checking a bound, 1316 01:03:24,067 --> 01:03:26,700 or incrementing the pointer. 1317 01:03:26,700 --> 01:03:29,100 So at this point we want to dereference it. 1318 01:03:29,100 --> 01:03:32,425 We can actually check and see, is the current address 1319 01:03:32,425 --> 01:03:34,748 of that pointer in this range. 1320 01:03:34,748 --> 01:03:36,456 And if it's not you throw an exception 1321 01:03:36,456 --> 01:03:39,180 here and so on and so forth. 1322 01:03:39,180 --> 01:03:41,765 So once again, where is this taking place? 1323 01:03:41,765 --> 01:03:45,230 This is taking place in new code that the compiler generated. 1324 01:03:45,230 --> 01:03:48,019 So one question that came up on the online discussion group, 1325 01:03:48,019 --> 01:03:49,435 some people were saying, well what 1326 01:03:49,435 --> 01:03:52,100 if it's instrumented code, what does that mean, right? 1327 01:03:52,100 --> 01:03:54,359 So when I say that the-- that the compiler generates 1328 01:03:54,359 --> 01:03:56,442 new code, imagine that there-- this is 1329 01:03:56,442 --> 01:03:58,530 what you see as a programmer. 1330 01:03:58,530 --> 01:04:02,010 But before this operation actually takes place, 1331 01:04:02,010 --> 01:04:05,250 imagine the compiler inserted some new C code here 1332 01:04:05,250 --> 01:04:07,835 that basically looks at these base bounds here.
1333 01:04:07,835 --> 01:04:09,815 And then if there was something out of bounds 1334 01:04:09,815 --> 01:04:12,785 it would then do an exit, or an abort, or something like that. 1335 01:04:12,785 --> 01:04:14,493 So that's what it means to say that there 1336 01:04:14,493 --> 01:04:15,440 is instrumented code. 1337 01:04:15,440 --> 01:04:17,960 It's that you take the source code of the C program, 1338 01:04:17,960 --> 01:04:20,424 add some new C source code and then compile 1339 01:04:20,424 --> 01:04:22,690 that new program. 1340 01:04:22,690 --> 01:04:24,823 So the basic idea I think behind the fat pointer 1341 01:04:24,823 --> 01:04:26,860 is pretty simple. 1342 01:04:26,860 --> 01:04:29,520 There are some disadvantages to this. 1343 01:04:29,520 --> 01:04:32,465 The biggest disadvantage is that, oh 1344 01:04:32,465 --> 01:04:34,870 my goodness look how big the pointers are now, right. 1345 01:04:34,870 --> 01:04:37,370 And so what this means is that you can't just 1346 01:04:37,370 --> 01:04:40,140 take a fat pointer and pass it to an unmodified, 1347 01:04:40,140 --> 01:04:41,849 off-the-shelf library. 1348 01:04:41,849 --> 01:04:43,515 Because it may have certain expectations 1349 01:04:43,515 --> 01:04:46,220 that pointers are a certain size and if you give it this thing, 1350 01:04:46,220 --> 01:04:48,674 it's just going to-- it's going to blow up. 1351 01:04:48,674 --> 01:04:50,465 We also have trouble if you want to include 1352 01:04:50,465 --> 01:04:52,840 these types of pointers in structs, or things like that. 1353 01:04:52,840 --> 01:04:56,180 Because that can actually change the size of the struct, right. 1354 01:04:56,180 --> 01:04:58,110 So a very popular thing in C code to do 1355 01:04:58,110 --> 01:05:00,016 is to take like the size of the struct 1356 01:05:00,016 --> 01:05:01,974 and then like do something as a result of that.
1357 01:05:01,974 --> 01:05:04,557 Like reserve some disk space for a struct of that size, 1358 01:05:04,557 --> 01:05:05,515 and so on and so forth. 1359 01:05:05,515 --> 01:05:07,837 So this causes all that stuff to blow up, right. 1360 01:05:07,837 --> 01:05:11,740 Because once again, the pointers have gotten very, very big. 1361 01:05:11,740 --> 01:05:13,780 And another thing, which is a bit subtle, 1362 01:05:13,780 --> 01:05:17,810 is that these fat pointers typically 1363 01:05:17,810 --> 01:05:21,500 will not be able to be updated in an atomic fashion, right. 1364 01:05:21,500 --> 01:05:24,630 So on 32-bit architectures typically, 1365 01:05:24,630 --> 01:05:27,310 if you do like a write to a 32-bit variable, 1366 01:05:27,310 --> 01:05:29,180 that write is atomic, right. 1367 01:05:29,180 --> 01:05:33,270 But now, these pointers are these three-integer-sized 1368 01:05:33,270 --> 01:05:34,590 things, right. 1369 01:05:34,590 --> 01:05:37,270 So if you have any code that takes advantage of the fact 1370 01:05:37,270 --> 01:05:39,780 that it expects pointer writes to be atomic, 1371 01:05:39,780 --> 01:05:41,830 then you may get in trouble, right. 1372 01:05:41,830 --> 01:05:45,460 Because you can imagine that to do some of these checks, 1373 01:05:45,460 --> 01:05:48,460 you have to look at the current address and then look at this 1374 01:05:48,460 --> 01:05:49,960 and then you might have to increment 1375 01:05:49,960 --> 01:05:51,570 that, and so on and so forth. 1376 01:05:51,570 --> 01:05:53,520 So this can cause very subtle concurrency bugs 1377 01:05:53,520 --> 01:05:55,728 if you have code that depends on that atomicity 1378 01:05:55,728 --> 01:05:58,180 [INAUDIBLE]. 1379 01:05:58,180 --> 01:05:59,770 So does that all make sense? 1380 01:05:59,770 --> 01:06:01,340 So that's one approach you can take.
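The fat pointer layout and the compiler-inserted checks described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names `fat_ptr`, `fat_malloc`, `fat_in_bounds`, `fat_store`, `fat_inc` and `fat_demo` are hypothetical, and the checks are written out by hand here, whereas in the scheme from the lecture the compiler would generate them.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fat pointer: three words instead of one. */
typedef struct {
    char *base; /* start of the allocation */
    char *end;  /* one past the last valid byte */
    char *cur;  /* current address, the "real" pointer value */
} fat_ptr;

/* Allocate n bytes and record the bounds alongside the pointer. */
static fat_ptr fat_malloc(size_t n) {
    char *p = malloc(n);
    fat_ptr f = { p, p ? p + n : NULL, p };
    return f;
}

/* The check that would be inserted before each dereference. */
static int fat_in_bounds(fat_ptr f) {
    return f.base != NULL && f.cur >= f.base && f.cur < f.end;
}

/* Instrumented version of "*p = v": check first, then store. */
static void fat_store(fat_ptr f, char v) {
    if (!fat_in_bounds(f)) {
        fprintf(stderr, "fat pointer out of bounds\n");
        abort();
    }
    *f.cur = v;
}

/* Instrumented version of "p++": only the cur field moves. */
static fat_ptr fat_inc(fat_ptr f) {
    f.cur += 1;
    return f;
}

/* Fill an 8-byte allocation, stopping when the check would fail.
   Returns the number of bytes written. */
static int fat_demo(void) {
    fat_ptr p = fat_malloc(8);
    int written = 0;
    while (fat_in_bounds(p)) {
        fat_store(p, 'x');
        p = fat_inc(p);
        written++;
    }
    free(p.base);
    return written;
}
```

Note that `sizeof(fat_ptr)` is three times the size of a plain pointer, which is exactly the compatibility and atomicity problem raised above.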
1381 01:06:01,340 --> 01:06:05,506 But kind of like electric fences, this 1382 01:06:05,506 --> 01:06:09,190 has some nasty side effects, which means that people don't 1383 01:06:09,190 --> 01:06:10,980 typically use that in practice. 1384 01:06:14,660 --> 01:06:19,470 So now we can start talking about bounds checking, 1385 01:06:19,470 --> 01:06:22,165 with respect to the shadow data structure 1386 01:06:22,165 --> 01:06:25,890 that I mentioned in the baggy bounds paper. 1387 01:06:25,890 --> 01:06:37,200 So the basic idea for the shadow data structure 1388 01:06:37,200 --> 01:06:43,670 is for each object that you allocate, 1389 01:06:43,670 --> 01:06:46,830 you want to store how big the object is. 1390 01:06:53,650 --> 01:06:58,860 Right, so for example, if you have some pointer 1391 01:06:58,860 --> 01:07:03,763 that you get from malloc right, you 1392 01:07:03,763 --> 01:07:07,460 need to store the size of that object there, 1393 01:07:07,460 --> 01:07:09,460 and then note that if you have something that's 1394 01:07:09,460 --> 01:07:15,984 like a static variable like this, right, 1395 01:07:15,984 --> 01:07:18,109 the compiler can automatically figure out 1396 01:07:18,109 --> 01:07:19,525 what the bounds are for that thing 1397 01:07:19,525 --> 01:07:21,840 there, statically speaking. 1398 01:07:21,840 --> 01:07:23,635 So for each one of these pointers 1399 01:07:23,635 --> 01:07:31,415 you need to interpose somehow on two operations. 1400 01:07:34,600 --> 01:07:36,635 Basically you interpose on arithmetic. 1401 01:07:41,290 --> 01:07:49,930 So this is things like q equals p plus 7, or whatever. 1402 01:07:49,930 --> 01:07:55,550 And then you want to interpose on dereferencing. 1403 01:07:55,550 --> 01:08:02,399 So this is something like *q equals 1404 01:08:02,399 --> 01:08:03,440 a or something like that.
1405 01:08:06,200 --> 01:08:09,730 So what's interesting is that you might think, 1406 01:08:09,730 --> 01:08:13,690 well why can't we just rely on the dereference 1407 01:08:13,690 --> 01:08:16,090 when interposing stuff? 1408 01:08:16,090 --> 01:08:20,205 Why do we have to look at this pointer arithmetic here? 1409 01:08:20,205 --> 01:08:22,170 But similarly you might wonder the other thing. 1410 01:08:22,170 --> 01:08:23,711 Like why can't you just deal with one 1411 01:08:23,711 --> 01:08:26,040 of these non [INAUDIBLE] interpose [INAUDIBLE]? 1412 01:08:26,040 --> 01:08:29,684 So you can't just signal an error 1413 01:08:29,684 --> 01:08:34,120 if you see the arithmetic going out of bounds because in C 1414 01:08:34,120 --> 01:08:37,040 that may or may not be an error. 1415 01:08:37,040 --> 01:08:40,939 So in other words, a very common idiom in C and C++ is you 1416 01:08:40,939 --> 01:08:44,695 might have a pointer that points to one past the valid end 1417 01:08:44,695 --> 01:08:47,569 of an object right, and then you use that as a stop condition, 1418 01:08:47,569 --> 01:08:48,068 right. 1419 01:08:48,068 --> 01:08:49,910 So you iterate through the object and once you 1420 01:08:49,910 --> 01:08:52,896 hit that end pointer, that's when you actually stop the loop 1421 01:08:52,896 --> 01:08:54,076 or whatever. 1422 01:08:54,076 --> 01:08:56,908 So if we just interpose on arithmetic 1423 01:08:56,908 --> 01:08:58,890 and we always cause a hard fault, 1424 01:08:58,890 --> 01:09:00,990 when we see a pointer go out of bounds, 1425 01:09:00,990 --> 01:09:04,060 that may actually break a lot of legitimate applications, right. 1426 01:09:04,060 --> 01:09:06,520 So we can't just interpose on that.
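The one-past-the-end idiom mentioned above looks roughly like this in C. It is a small sketch with hypothetical names (`sum_bytes`, `sum_demo`): the pointer `end` lies outside the object, yet the program is legitimate because `end` is only compared against and never dereferenced.

```c
#include <stddef.h>

/* Sum a buffer, using a one-past-the-end pointer as the stop condition. */
static int sum_bytes(const unsigned char *buf, size_t n) {
    /* Points one past the last element: legal to form and compare,
       not legal to dereference. */
    const unsigned char *end = buf + n;
    int total = 0;
    for (const unsigned char *q = buf; q != end; q++) {
        total += *q; /* only in-bounds addresses are ever dereferenced */
    }
    return total;
}

/* Small usage example on a 4-byte array. */
static int sum_demo(void) {
    unsigned char data[4] = {1, 2, 3, 4};
    return sum_bytes(data, 4);
}
```

A checker that faulted as soon as the arithmetic produced `buf + n` would reject this loop, which is why the arithmetic alone cannot be the place where the hard error is raised.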
1427 01:09:06,520 --> 01:09:09,466 And so you might say, well why can't you just interpose 1428 01:09:09,466 --> 01:09:12,442 on the dereference thing, and you just-- when we notice 1429 01:09:12,442 --> 01:09:14,430 that you've got something out of bounds, 1430 01:09:14,430 --> 01:09:15,845 we'll just raise an error then and there. 1431 01:09:15,845 --> 01:09:17,470 Well the challenge there is that how do 1432 01:09:17,470 --> 01:09:18,636 you know it's out of bounds? 1433 01:09:18,636 --> 01:09:21,279 Right, it's the-- it's the arithmetic interpositioning 1434 01:09:21,279 --> 01:09:24,089 that actually allows us to tell whether or not 1435 01:09:24,089 --> 01:09:25,880 this thing's going to be legal here, right. 1436 01:09:25,880 --> 01:09:27,340 Because it's the interpositioning 1437 01:09:27,340 --> 01:09:29,960 on the arithmetic that allows us to track 1438 01:09:29,960 --> 01:09:31,517 where the pointer is with respect 1439 01:09:31,517 --> 01:09:34,120 to its original baseline. 1440 01:09:34,120 --> 01:09:36,146 So that's the basic idea there. 1441 01:09:41,740 --> 01:09:45,480 And so the next question is how do we actually 1442 01:09:45,480 --> 01:09:46,730 implement the bounds checking? 1443 01:09:49,550 --> 01:09:55,720 Because basically we need some way to map a particular pointer 1444 01:09:55,720 --> 01:10:00,640 address to some type of bounds information for that pointer. 1445 01:10:00,640 --> 01:10:02,420 And so a lot of the previous solutions 1446 01:10:02,420 --> 01:10:05,274 use things like, for example, like a hash table, or a tree, 1447 01:10:05,274 --> 01:10:07,190 right that will allow you to do lookups right, 1448 01:10:07,190 --> 01:10:08,620 [INAUDIBLE]. 1449 01:10:08,620 --> 01:10:11,899 So given a pointer address, I do some lookup 1450 01:10:11,899 --> 01:10:14,190 in this data structure, figure out what the bounds are.
1451 01:10:14,190 --> 01:10:16,050 Given those bounds I can then figure out 1452 01:10:16,050 --> 01:10:18,790 if I want to allow the action to take place or not. 1453 01:10:18,790 --> 01:10:21,917 Now the problem with that is that it's a slow lookup, 1454 01:10:21,917 --> 01:10:24,250 right, because with these data structures, if you're thinking it's 1455 01:10:24,250 --> 01:10:26,733 a tree, you're going through a bunch of branches 1456 01:10:26,733 --> 01:10:29,409 before you can actually hit the value potentially. 1457 01:10:29,409 --> 01:10:31,200 And even if it's a hash table where there's 1458 01:10:31,200 --> 01:10:33,880 an overflow in the bucket you got to follow chains, 1459 01:10:33,880 --> 01:10:36,740 or do [INAUDIBLE], or things like that. 1460 01:10:36,740 --> 01:10:40,160 So the baggy bounds paper that we 1461 01:10:40,160 --> 01:10:42,585 are about to look at actually figured out 1462 01:10:42,585 --> 01:10:45,800 a very efficient data structure to track these bounds, 1463 01:10:45,800 --> 01:10:49,480 to make that bounds checking very fast. 1464 01:10:49,480 --> 01:10:51,170 So let's just step into that right now. 1465 01:10:51,170 --> 01:10:53,340 But before we go into that let me very briefly 1466 01:10:53,340 --> 01:10:55,110 talk about how buddy allocation works. 1467 01:10:55,110 --> 01:10:56,985 Because that's one of the things that came up 1468 01:10:56,985 --> 01:10:58,480 in a lot of the questions. 1469 01:10:58,480 --> 01:11:00,830 So one thing you will see for these papers is that a lot 1470 01:11:00,830 --> 01:11:02,663 of times they are not self-contained, right. 1471 01:11:02,663 --> 01:11:05,580 So they will mention things that they will assume that you know, 1472 01:11:05,580 --> 01:11:07,135 but you may not know them. 1473 01:11:07,135 --> 01:11:08,385 Don't get discouraged by that. 1474 01:11:08,385 --> 01:11:10,076 That happens to me too sometimes.
1475 01:11:10,076 --> 01:11:11,450 These papers are written in a way 1476 01:11:11,450 --> 01:11:12,991 that assumes a lot of prior knowledge, 1477 01:11:12,991 --> 01:11:14,500 so don't get discouraged by that. 1478 01:11:14,500 --> 01:11:16,616 Luckily we actually have access to the internet, 1479 01:11:16,616 --> 01:11:17,572 so we can look up some of that stuff. 1480 01:11:17,572 --> 01:11:18,990 Can you imagine what happened in our parents' time? 1481 01:11:18,990 --> 01:11:19,534 If they didn't understand stuff, they just 1482 01:11:19,534 --> 01:11:21,182 had to go home, right. 1483 01:11:21,182 --> 01:11:25,915 So don't be afraid to look stuff up. Go to Wikipedia, 1484 01:11:25,915 --> 01:11:27,335 it's mostly correct. 1485 01:11:30,210 --> 01:11:37,600 So how does-- how does the buddy allocation system work? 1486 01:11:37,600 --> 01:11:40,390 So basically what it does at first 1487 01:11:40,390 --> 01:11:44,196 is treat unallocated memory as one big block. 1488 01:11:44,196 --> 01:11:44,870 OK. 1489 01:11:44,870 --> 01:11:47,810 And then when you request a smaller block 1490 01:11:47,810 --> 01:11:51,662 for dynamic allocation, it tries to split that address 1491 01:11:51,662 --> 01:11:56,060 space using powers of 2 until it finds a block that 1492 01:11:56,060 --> 01:11:57,603 is just big enough to work. 1493 01:11:57,603 --> 01:12:00,860 So let's say a request came in and say A 1494 01:12:00,860 --> 01:12:06,835 is going to equal malloc 28. 1495 01:12:06,835 --> 01:12:07,440 28 bytes. 1496 01:12:07,440 --> 01:12:09,023 And let's just say this toy example has 1497 01:12:09,023 --> 01:12:11,590 only 128 bytes of memory total. 1498 01:12:11,590 --> 01:12:13,603 So the buddy allocator is going to look at this 1499 01:12:13,603 --> 01:12:14,820 and say, well I have 128 bytes of memory, 1500 01:12:14,820 --> 01:12:17,320 but it's too wasteful to allocate this whole thing 1501 01:12:17,320 --> 01:12:18,710 to this 28 byte request.
1502 01:12:18,710 --> 01:12:20,780 So I'm going to split this block in two 1503 01:12:20,780 --> 01:12:24,870 and then see if I have a smaller block that's just big enough. 1504 01:12:24,870 --> 01:12:29,290 So it's going to say, OK split this into 0 to 64 and 64 to 128. 1505 01:12:29,290 --> 01:12:31,932 Ah OK, but this block here is still too big, right. 1506 01:12:31,932 --> 01:12:33,970 Basically what the buddy algorithm wants to do 1507 01:12:33,970 --> 01:12:36,660 is find a block such that the allocated 1508 01:12:36,660 --> 01:12:38,996 data in the real object, 28 bytes, 1509 01:12:38,996 --> 01:12:42,000 is more than half the size of that block. 1510 01:12:42,000 --> 01:12:44,470 So the buddy allocator says, OK this thing over here 1511 01:12:44,470 --> 01:12:45,224 is still too big. 1512 01:12:45,224 --> 01:12:47,640 So what it's going to do is it's going to split the memory 1513 01:12:47,640 --> 01:12:51,940 space again, right. 1514 01:12:51,940 --> 01:12:58,050 So from 0 to 32, and then it's going to say, 1515 01:12:58,050 --> 01:13:02,160 ah OK 28 bytes that is more than half the size of this block 1516 01:13:02,160 --> 01:13:02,830 here. 1517 01:13:02,830 --> 01:13:08,726 So now this block is going to be allocated to A. OK, 1518 01:13:08,726 --> 01:13:10,940 and so it gets this address here. 1519 01:13:10,940 --> 01:13:17,910 Now let's say that we have another request that comes in for B 1520 01:13:17,910 --> 01:13:23,030 and let's say we want to malloc 50 right. 1521 01:13:23,030 --> 01:13:26,485 So what's going to happen is that the buddy allocator will 1522 01:13:26,485 --> 01:13:29,410 say, ah OK I actually have a block here 1523 01:13:29,410 --> 01:13:31,015 that's big enough, right. 1524 01:13:31,015 --> 01:13:33,270 50 is greater than half the size of this thing 1525 01:13:33,270 --> 01:13:35,240 so I'll just allocate that right there.
1526 01:13:35,240 --> 01:13:41,740 So we have this system, or setup, where we have A here, 1527 01:13:41,740 --> 01:13:44,990 and then we have B here, and then 1528 01:13:44,990 --> 01:13:51,635 let's say we had another request, C, that came in for 20 bytes. 1529 01:13:53,955 --> 01:13:55,580 This is actually pretty straightforward 1530 01:13:55,580 --> 01:13:57,910 because we can put that right here, right. 1531 01:13:57,910 --> 01:14:03,496 So then you have something that looks like this. 1532 01:14:03,496 --> 01:14:07,280 Then what's interesting is that when you deallocate memory, 1533 01:14:07,280 --> 01:14:09,776 if you have two deallocated blocks that 1534 01:14:09,776 --> 01:14:11,650 are next to each other and are the same size, 1535 01:14:11,650 --> 01:14:13,310 the buddy allocator will merge them 1536 01:14:13,310 --> 01:14:15,700 into a block that's twice as big, right. 1537 01:14:15,700 --> 01:14:29,720 So if we free, let's say, C, then we get to this situation: 1538 01:14:29,720 --> 01:14:33,361 we can't do any merging, because this is the only possible block 1539 01:14:33,361 --> 01:14:35,110 that this one could have been merged with. 1540 01:14:35,110 --> 01:14:37,330 It's the same size, but this thing's still occupied. 1541 01:14:37,330 --> 01:14:49,830 So then if we do a free on A, then we 1542 01:14:49,830 --> 01:14:52,480 have this situation here. 1543 01:14:52,480 --> 01:14:56,120 Right, where these two 32 byte blocks 1544 01:14:56,120 --> 01:14:59,850 were merged into one of size 64, and this other one of size 64 1545 01:14:59,850 --> 01:15:01,405 is still out there.
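The arithmetic behind the toy example above can be sketched in a few lines of C. The helper names `round_up_pow2` and `buddy_of` are hypothetical, and the XOR trick for locating a buddy is the commonly used one for size-aligned blocks rather than something spelled out in the lecture. Offsets are relative to the start of the 128-byte arena.

```c
#include <stddef.h>

/* Smallest power of two that is >= n.
   Assumes 1 <= n <= SIZE_MAX / 2 + 1 so the shift cannot overflow. */
static size_t round_up_pow2(size_t n) {
    size_t s = 1;
    while (s < n)
        s <<= 1;
    return s;
}

/* Offset of the buddy of the block that starts at `offset` and has
   `size` bytes. Blocks are aligned to their own size, so a block and
   its buddy differ in exactly one bit of the offset. */
static size_t buddy_of(size_t offset, size_t size) {
    return offset ^ size;
}
```

With these helpers, the 28-byte request rounds up to a 32-byte block, the 50-byte request to a 64-byte block, and the block at offset 32 (the 20-byte request) has the block at offset 0 as its buddy, which is why freeing it alone cannot trigger a merge while the first block is still in use.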
1546 01:15:01,405 --> 01:15:03,840 Right, so it's called the buddy system because once again, 1547 01:15:03,840 --> 01:15:06,560 whenever you have two adjacent blocks that 1548 01:15:06,560 --> 01:15:08,995 are of the same size and that could 1549 01:15:08,995 --> 01:15:11,960 be merged to form an aligned block, 1550 01:15:11,960 --> 01:15:14,840 then the system will merge that buddy with this other buddy 1551 01:15:14,840 --> 01:15:18,253 and then create that new block that's twice as big. 1552 01:15:18,253 --> 01:15:20,510 So the thing that's nice about this system 1553 01:15:20,510 --> 01:15:26,322 is that it's very simple to figure out where buddies are. 1554 01:15:26,322 --> 01:15:28,314 Because you can do very cutesy arithmetic, 1555 01:15:28,314 --> 01:15:31,287 like the way the baggy bounds system works. 1556 01:15:31,287 --> 01:15:32,870 But anyway I'm not going into details. 1557 01:15:32,870 --> 01:15:34,744 This is basically how buddy allocation works. 1558 01:15:34,744 --> 01:15:37,710 Does that make sense? 1559 01:15:37,710 --> 01:15:39,210 Right, and one question that came up 1560 01:15:39,210 --> 01:15:43,560 a lot in all my discussions is, isn't this wasteful? 1561 01:15:43,560 --> 01:15:47,510 Right, so for example, imagine that up here at the beginning 1562 01:15:47,510 --> 01:15:52,484 I had a request for size 65 bytes, right. 1563 01:15:52,484 --> 01:15:54,890 So if I have a request for 65 bytes, 1564 01:15:54,890 --> 01:15:57,368 I would allocate this whole structure up here and then 1565 01:15:57,368 --> 01:16:00,034 there's-- actually you're out of dynamic memory and can't do any 1566 01:16:00,034 --> 01:16:00,742 more allocations. 1567 01:16:00,742 --> 01:16:02,894 And the answer is yes, that is wasteful. 1568 01:16:02,894 --> 01:16:04,560 But once again, it's a trade-off, right.
1569 01:16:04,560 --> 01:16:07,060 Because it's very easy to do these calculations on how to do 1570 01:16:07,060 --> 01:16:08,530 merging and stuff like that. 1571 01:16:08,530 --> 01:16:10,520 So if you want finer-grained allocation, 1572 01:16:10,520 --> 01:16:12,021 there are other allocators for that. 1573 01:16:12,021 --> 01:16:13,728 It's outside the scope of the lecture so, 1574 01:16:13,728 --> 01:16:15,420 we can discuss that offline if you want. 1575 01:16:15,420 --> 01:16:19,190 That's basically how the buddy-- sorry the, the buddy allocator 1576 01:16:19,190 --> 01:16:21,300 works. 1577 01:16:21,300 --> 01:16:26,126 So what is the baggy bounds system going to do? 1578 01:16:26,126 --> 01:16:29,078 Well, it is going to rely on a couple of tricks. 1579 01:16:41,400 --> 01:16:52,830 So the first idea is you round up each allocation 1580 01:16:52,830 --> 01:17:06,245 to a power of 2, and you align the request to that power of 2. 1581 01:17:13,129 --> 01:17:15,295 Right, so essentially the buddy allocator's very nice 1582 01:17:15,295 --> 01:17:17,205 because it handles a lot of that for you, right. 1583 01:17:17,205 --> 01:17:18,880 It naturally will do that kind of thing. 1584 01:17:18,880 --> 01:17:21,970 Because that's just the way that it allocates and deallocates 1585 01:17:21,970 --> 01:17:23,930 memory. 1586 01:17:23,930 --> 01:17:27,880 And so the second thing that's going to happen 1587 01:17:27,880 --> 01:17:41,530 in the baggy bounds system is you express each bound as log base 1588 01:17:41,530 --> 01:17:44,260 2 of the allocation size. 1589 01:17:48,290 --> 01:17:51,710 Right, and so what this means-- and so why can we do this? 1590 01:17:51,710 --> 01:17:53,950 Well once again all of our allocation sizes 1591 01:17:53,950 --> 01:17:56,070 are powers of 2, right. 1592 01:17:56,070 --> 01:17:59,512 So we don't need very many bits to represent 1593 01:17:59,512 --> 01:18:01,470 how big a particular allocation size is.
1594 01:18:01,470 --> 01:18:10,602 So for example, if your allocation size is 16, 1595 01:18:10,602 --> 01:18:14,110 then you just need to store 4-- the logarithm 1596 01:18:14,110 --> 01:18:17,420 of that-- to represent the allocation size, right. 1597 01:18:17,420 --> 01:18:19,340 Does that make sense? 1598 01:18:19,340 --> 01:18:21,350 Right, this is another popular question here. 1599 01:18:21,350 --> 01:18:23,770 This is why you only need a small number of bits 1600 01:18:23,770 --> 01:18:25,570 here, because we're basically forcing 1601 01:18:25,570 --> 01:18:30,043 the allocation sizes to fit this quantized way that you grow. 1602 01:18:30,043 --> 01:18:31,626 Like you could only have something, 1603 01:18:31,626 --> 01:18:33,987 let's say 16 bytes or 32 bytes. 1604 01:18:33,987 --> 01:18:36,136 You can't have for example, 33 bytes. 1605 01:18:38,930 --> 01:18:41,730 And then the third thing that baggy bounds is going to do 1606 01:18:41,730 --> 01:19:04,818 is store the limit info in a linear array, 1 byte per entry, 1607 01:19:04,818 --> 01:19:12,258 but we're going to allocate memory 1608 01:19:12,258 --> 01:19:13,780 at the granularity of a slot. 1609 01:19:17,240 --> 01:19:24,530 In the paper they used 16 bytes as the slot width. 1610 01:19:24,530 --> 01:19:26,170 So for example, now this next one, 1611 01:19:26,170 --> 01:19:28,336 this is one bit that wasn't actually specifically said 1612 01:19:28,336 --> 01:19:31,030 in the paper which if you don't grasp 1613 01:19:31,030 --> 01:19:33,380 it'll make the paper very tricky to understand, right. 1614 01:19:33,380 --> 01:19:40,190 So now you can have a slot size which is equal to 16, 1615 01:19:40,190 --> 01:19:48,350 so if you do p equals malloc 16 so what's going to happen? 1616 01:19:48,350 --> 01:19:51,495 So in this bounds table you're going 1617 01:19:51,495 --> 01:20:02,600 to say take that pointer divided by slot size, and that entry is 1618 01:20:02,600 --> 01:20:04,240 going to equal 4, right.
1619 01:20:04,240 --> 01:20:05,926 So in that bounds table we're going 1620 01:20:05,926 --> 01:20:11,830 to put the logarithm of the allocation size in the table. 1621 01:20:11,830 --> 01:20:12,976 Does that make sense? 1622 01:20:12,976 --> 01:20:14,350 OK, now the tricky thing is, 1623 01:20:14,350 --> 01:20:16,372 let's say that you have something like this. 1624 01:20:22,570 --> 01:20:26,280 Right, so let's say that you malloc 32 bytes. 1625 01:20:26,280 --> 01:20:29,050 What is the bounds table going to look like there? 1626 01:20:29,050 --> 01:20:31,640 So here we actually have to update the bounds 1627 01:20:31,640 --> 01:20:37,920 table [INAUDIBLE] for the size you need. 1628 01:20:37,920 --> 01:20:39,545 But we have to update the bounds table twice. 1629 01:20:44,970 --> 01:20:47,706 Right, once for the first slot of memory 1630 01:20:47,706 --> 01:20:49,085 that this allocation takes up. 1631 01:20:49,085 --> 01:20:56,645 And then a second time for that second slot that it takes up. 1632 01:21:00,740 --> 01:21:03,740 Right, so once again 32 is the allocation size. 1633 01:21:03,740 --> 01:21:06,404 This is the log of that allocation size. 1634 01:21:06,404 --> 01:21:09,250 So for the two slots that this memory takes up, 1635 01:21:09,250 --> 01:21:11,482 we're going to update the bounds table twice. 1636 01:21:11,482 --> 01:21:13,815 Does that make sense? 1637 01:21:13,815 --> 01:21:15,190 Right, and this is really the key 1638 01:21:15,190 --> 01:21:16,150 that I think for a lot of people that's 1639 01:21:16,150 --> 01:21:18,650 going to make the paper make sense or not make sense, right. 1640 01:21:18,650 --> 01:21:21,667 Because you update that bounds table multiple times, depending 1641 01:21:21,667 --> 01:21:22,625 on the size of the allocation. 1642 01:21:22,625 --> 01:21:22,970 AUDIENCE: Can you repeat that for me again? 1643 01:21:22,970 --> 01:21:23,880 PROFESSOR: Excuse me? 1644 01:21:23,880 --> 01:21:25,380 AUDIENCE: Can you repeat that again?
1645 01:21:25,380 --> 01:21:26,326 PROFESSOR: Oh yeah, yeah, sure, sure. 1646 01:21:26,326 --> 01:21:27,826 So basically the idea is that, I 1647 01:21:27,826 --> 01:21:32,290 mean, you've got this bounds table here 1648 01:21:32,290 --> 01:21:34,230 and it's got a bunch of entries. 1649 01:21:34,230 --> 01:21:38,360 But it basically needs entries to cover 1650 01:21:38,360 --> 01:21:41,167 all of p's size, all of the allocation size. 1651 01:21:41,167 --> 01:21:44,900 OK, so in this case it was very simple because basically this 1652 01:21:44,900 --> 01:21:46,744 is just one slot, due to the size. 1653 01:21:46,744 --> 01:21:48,447 Here it's multiple slot sizes, right. 1654 01:21:48,447 --> 01:21:50,363 So what's going to happen is that imagine then 1655 01:21:50,363 --> 01:21:53,570 that we had a pointer that's moving in the range of p. 1656 01:21:53,570 --> 01:21:55,700 You have to have a bounds table 1657 01:21:55,700 --> 01:21:58,910 slot for each one of those places where p [INAUDIBLE], 1658 01:21:58,910 --> 01:21:59,410 right. 1659 01:21:59,410 --> 01:22:01,792 And so it's this second piece that 1660 01:22:01,792 --> 01:22:03,750 makes the paper a little bit confusing I think. 1661 01:22:03,750 --> 01:22:06,140 It doesn't really go into depth about that, 1662 01:22:06,140 --> 01:22:07,568 but this is how that works. 1663 01:22:10,430 --> 01:22:20,357 OK so armed with the bounds table 1664 01:22:20,357 --> 01:22:30,806 stuff what happens if we have C code that looks like this? 1665 01:22:30,806 --> 01:22:36,870 So you have a pointer, p-prime, you derive it from p, 1666 01:22:36,870 --> 01:22:40,210 by adding some variable i. 1667 01:22:40,210 --> 01:22:47,340 So how do you get the size of the allocation belonging to p? 1668 01:22:47,340 --> 01:22:56,620 Well you look in the table using this lookup here.
1669 01:23:05,910 --> 01:23:09,890 Right, so the size of the data that's been allocated to p 1670 01:23:09,890 --> 01:23:11,910 is going to be equal to 1, Left 1671 01:23:11,910 --> 01:23:14,420 Shifted by the value you get by looking in the table, 1672 01:23:14,420 --> 01:23:17,475 taking that pointer value, and then Right Shifting that 1673 01:23:17,475 --> 01:23:19,360 by the log of the slot size. 1674 01:23:19,360 --> 01:23:21,600 Right, the arithmetic works out 1675 01:23:21,600 --> 01:23:24,385 because of the way that we're binding 1676 01:23:24,385 --> 01:23:27,730 pointers to the table bounds, right. 1677 01:23:27,730 --> 01:23:32,620 So this will get us-- this thing right here, will get us 1678 01:23:32,620 --> 01:23:33,765 the log of the size. 1679 01:23:33,765 --> 01:23:36,130 And then this thing over here basically 1680 01:23:36,130 --> 01:23:39,280 expands that into like the regular value, right. 1681 01:23:39,280 --> 01:23:42,120 So for example, if the size of this pointer 1682 01:23:42,120 --> 01:23:46,890 were 32, in terms of bytes we've allocated, right. 1683 01:23:46,890 --> 01:23:50,295 This is going to get us 5 when we look at the table, 1684 01:23:50,295 --> 01:23:52,660 then when we Left Shift it this way, Left 1685 01:23:52,660 --> 01:23:54,076 Shift the one this way, then we're 1686 01:23:54,076 --> 01:23:57,810 going to get 32 back again from here. 1687 01:23:57,810 --> 01:23:58,680 OK. 1688 01:23:58,680 --> 01:24:06,738 And then we want to find the base of that pointer. 1689 01:24:06,738 --> 01:24:14,930 We take the pointer itself and then we're 1690 01:24:14,930 --> 01:24:24,549 going to AND that with the inverse of the size minus 1. 1691 01:24:24,549 --> 01:24:26,590 Now what this is going to do is, this is actually 1692 01:24:26,590 --> 01:24:29,700 going to give us a mask, you can think of it that way. 1693 01:24:29,700 --> 01:24:34,280 And that mask is going to allow us to recover the base here.
1694 01:24:34,280 --> 01:24:40,100 So imagine that your size equals 16. 1695 01:24:40,100 --> 01:24:49,768 So 16 equals this in binary. 1696 01:24:49,768 --> 01:24:51,642 Right, there's a bunch of zeros off this way. 1697 01:24:51,642 --> 01:24:55,300 So we've got a 1 here, we've got some zeros over here. 1698 01:24:55,300 --> 01:25:07,394 So if we look at the bitwise inverse of 16 minus 1, 1699 01:25:07,394 --> 01:25:09,070 then-- actually sorry. 1700 01:25:09,070 --> 01:25:12,450 So if we look at 16 minus 1, so what's that going to look like? 1701 01:25:12,450 --> 01:25:19,620 16 minus 1 is going to look like, right, something 1702 01:25:19,620 --> 01:25:20,270 like this. 1703 01:25:20,270 --> 01:25:21,040 OK. 1704 01:25:21,040 --> 01:25:26,193 And if we take the inverse of that 1705 01:25:26,193 --> 01:25:27,443 what is that going to give us? 1706 01:25:32,536 --> 01:25:33,530 Right, in binary. 1707 01:25:33,530 --> 01:25:37,290 So basically this thing here allows us to basically clear 1708 01:25:37,290 --> 01:25:40,860 the bits that essentially would be the offset 1709 01:25:40,860 --> 01:25:43,230 in that valid pointer and just give us 1710 01:25:43,230 --> 01:25:44,556 the base of that pointer. 1711 01:25:44,556 --> 01:25:46,260 OK. 1712 01:25:46,260 --> 01:25:48,560 And so once we've got this, then it's 1713 01:25:48,560 --> 01:25:50,810 very simple to check whether this pointer's in bounds, 1714 01:25:50,810 --> 01:25:51,310 right. 1715 01:25:51,310 --> 01:25:55,360 So we can basically just check whether p-prime 1716 01:25:55,360 --> 01:26:04,990 is greater than or equal to base and whether p-prime 1717 01:26:04,990 --> 01:26:13,422 minus the base is less than size. 1718 01:26:13,422 --> 01:26:15,640 This is just a straightforward thing you do, right. 1719 01:26:15,640 --> 01:26:17,672 Just seeing whether that derived pointer 1720 01:26:17,672 --> 01:26:19,664 exists within the bounds of this [INAUDIBLE].
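Putting the last few steps together, here is a minimal C sketch of the bounds table, the size and base recovery, and the final check. It is a toy under stated assumptions: addresses are plain offsets into a 128-byte arena rather than real pointers, the names (`bounds_table`, `table_set`, `alloc_size`, `alloc_base`, `in_bounds`) are hypothetical, the right shift is taken to be by the log of the 16-byte slot size, and the optimized check from the paper is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_LOG 4     /* log2 of the 16-byte slot size */
#define ARENA_SIZE 128 /* toy arena size in bytes (assumption) */

/* One byte per 16-byte slot; each entry holds log2 of the allocation size. */
static unsigned char bounds_table[ARENA_SIZE >> SLOT_LOG];

/* Record an allocation of `size` bytes at arena offset `addr`.
   Assumes size is a power of two >= 16 and addr is aligned to size.
   Every slot the allocation covers gets the same entry. */
static void table_set(uintptr_t addr, size_t size) {
    unsigned char log = 0;
    while (((size_t)1 << log) < size)
        log++;
    for (uintptr_t a = addr; a < addr + size; a += (uintptr_t)1 << SLOT_LOG)
        bounds_table[a >> SLOT_LOG] = log;
}

/* size = 1 << table[p >> log2(slot size)] */
static size_t alloc_size(uintptr_t p) {
    return (size_t)1 << bounds_table[p >> SLOT_LOG];
}

/* base = p & ~(size - 1): clear the offset bits within the allocation. */
static uintptr_t alloc_base(uintptr_t p) {
    return p & ~((uintptr_t)alloc_size(p) - 1);
}

/* Is q, derived from p, still inside p's allocation? */
static int in_bounds(uintptr_t p, uintptr_t q) {
    uintptr_t base = alloc_base(p);
    return q >= base && (q - base) < alloc_size(p);
}
```

For a 32-byte allocation at offset 32, `table_set(32, 32)` writes the value 5 into two consecutive entries, so a pointer anywhere inside the allocation (for instance offset 48) still recovers size 32 and base 32.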
1721 01:26:19,664 --> 01:26:22,080 Right, so at this point things are pretty straightforward. 1722 01:26:22,080 --> 01:26:24,300 Now they have like an optimized check in the paper, 1723 01:26:24,300 --> 01:26:25,690 I'm not going to go into that detail. 1724 01:26:25,690 --> 01:26:27,898 But suffice it to say that all the binary arithmetic, 1725 01:26:27,898 --> 01:26:29,839 it resolves down to the same thing. 1726 01:26:29,839 --> 01:26:31,380 There's just some clever tricks there 1727 01:26:31,380 --> 01:26:35,700 to avoid some of the explicit calculations we do here. 1728 01:26:35,700 --> 01:26:36,690 That's the basic idea. 1729 01:26:36,690 --> 01:26:49,140 And so the fifth trick that the baggy bounds system uses 1730 01:26:49,140 --> 01:26:59,915 is that it uses the virtual memory system to prevent out 1731 01:26:59,915 --> 01:27:04,710 of bounds [INAUDIBLE] right. 1732 01:27:04,710 --> 01:27:07,266 So the idea here is that-- how much time 1733 01:27:07,266 --> 01:27:08,182 do we have by the way? 1734 01:27:08,182 --> 01:27:09,174 Probably like zero? 1735 01:27:09,174 --> 01:27:12,490 So the basic idea here is that if we 1736 01:27:12,490 --> 01:27:15,990 have a pointer [INAUDIBLE] here, that we detect 1737 01:27:15,990 --> 01:27:19,145 is out of bounds, what we can do is actually set the high-order 1738 01:27:19,145 --> 01:27:21,820 bit on the pointer, right. 1739 01:27:21,820 --> 01:27:26,350 And by doing that we guarantee that if that pointer is dereferenced, 1740 01:27:26,350 --> 01:27:28,502 then the paging hardware's going to be [INAUDIBLE], 1741 01:27:28,502 --> 01:27:30,210 it's going to throw a hard error, right. 1742 01:27:30,210 --> 01:27:31,626 Now in and of itself, just setting 1743 01:27:31,626 --> 01:27:33,940 that bit does not cause a problem. 1744 01:27:33,940 --> 01:27:35,740 It's only when you dereference that pointer 1745 01:27:35,740 --> 01:27:37,540 that you get into problems. 1746 01:27:37,540 --> 01:27:39,090 OK?
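The high-order-bit trick can be illustrated with a few lines of C on pointer-sized integers. This is a sketch, not the paper's implementation: the names `OOB_BIT`, `mark_oob`, `is_oob` and `clear_oob` are hypothetical, and it assumes that no valid user-space address has its top bit set, so that dereferencing a marked pointer would fault. The code only manipulates the bit and never dereferences anything.

```c
#include <limits.h>
#include <stdint.h>

/* Top bit of a pointer-sized integer (assumed unused by valid addresses). */
#define OOB_BIT ((uintptr_t)1 << (sizeof(uintptr_t) * CHAR_BIT - 1))

/* Mark a pointer value as out of bounds; harmless until dereferenced. */
static uintptr_t mark_oob(uintptr_t p) {
    return p | OOB_BIT;
}

/* Has this pointer value been marked out of bounds? */
static int is_oob(uintptr_t p) {
    return (p & OOB_BIT) != 0;
}

/* Recover the original address, e.g. if later arithmetic brings the
   pointer back in bounds. */
static uintptr_t clear_oob(uintptr_t p) {
    return p & ~OOB_BIT;
}
```

Marking is cheap because it happens at arithmetic time, while the actual fault is deferred to the dereference, which matches the point above that setting the bit is not itself an error.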