1 00:00:00,090 --> 00:00:02,430 The following content is provided under a Creative 2 00:00:02,430 --> 00:00:03,820 Commons license. 3 00:00:03,820 --> 00:00:06,060 Your support will help MIT OpenCourseWare 4 00:00:06,060 --> 00:00:10,150 continue to offer high quality educational resources for free. 5 00:00:10,150 --> 00:00:12,700 To make a donation or to view additional materials 6 00:00:12,700 --> 00:00:16,600 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,600 --> 00:00:17,310 at ocw.mit.edu. 8 00:00:26,790 --> 00:00:28,930 PROFESSOR: All right, guys, let's get started. 9 00:00:28,930 --> 00:00:31,350 So today, we're going to talk about a very 10 00:00:31,350 --> 00:00:33,944 different and principled approach to building secure web 11 00:00:33,944 --> 00:00:34,485 applications. 12 00:00:34,485 --> 00:00:36,604 And it's going to be about a system called Ur/Web. 13 00:00:36,604 --> 00:00:37,978 And right now, our guest lecturer 14 00:00:37,978 --> 00:00:39,769 is the author of the system, Adam Chlipala, 15 00:00:39,769 --> 00:00:41,982 who's a professor at MIT, is going to tell you 16 00:00:41,982 --> 00:00:44,361 more about the system he built. 17 00:00:44,361 --> 00:00:48,830 ADAM CHLIPALA: All right, so I want to get to a demo 18 00:00:48,830 --> 00:00:49,770 as soon as possible. 19 00:00:49,770 --> 00:00:51,630 But before that, I just want to spend 20 00:00:51,630 --> 00:00:54,890 some slides setting up part of the context about this system. 21 00:00:54,890 --> 00:00:57,150 And you've probably gotten some of that context 22 00:00:57,150 --> 00:00:59,560 already from the draft paper that 23 00:00:59,560 --> 00:01:03,740 was the reading for this class. 24 00:01:03,740 --> 00:01:05,360 So what is Ur/Web? 25 00:01:05,360 --> 00:01:07,360 It's always good to start out by explaining what 26 00:01:07,360 --> 00:01:08,885 the name of something means. 
27 00:01:08,885 --> 00:01:12,600 So Ur/Web, first it's a programming language 28 00:01:12,600 --> 00:01:14,030 for building web applications. 29 00:01:14,030 --> 00:01:16,045 That's what the Web part of the name means. 30 00:01:16,045 --> 00:01:18,050 And it's sort of a full stack system. 31 00:01:18,050 --> 00:01:22,640 It does everything you need to do to build web applications. 32 00:01:22,640 --> 00:01:26,560 And Ur is a new general purpose functional programming 33 00:01:26,560 --> 00:01:28,950 language that is used to implement 34 00:01:28,950 --> 00:01:31,000 these web specific features. 35 00:01:33,820 --> 00:01:36,340 And the whole point of Ur/Web is that instead 36 00:01:36,340 --> 00:01:38,340 of having a general purpose programming language 37 00:01:38,340 --> 00:01:40,840 and then having a library or a traditional framework 38 00:01:40,840 --> 00:01:42,540 for building web applications, it's 39 00:01:42,540 --> 00:01:45,730 all integrated into a customized programming language in Ur/Web. 40 00:01:45,730 --> 00:01:49,690 And it's a language that involves compilation, not 41 00:01:49,690 --> 00:01:51,100 interpretation at run time. 42 00:01:51,100 --> 00:01:53,890 And the compiler in some sense understands what a web 43 00:01:53,890 --> 00:01:55,400 application is supposed to do. 44 00:01:55,400 --> 00:01:57,150 And it will point out mistakes that you're 45 00:01:57,150 --> 00:02:00,700 making that a conventional compiler, for say Java, 46 00:02:00,700 --> 00:02:04,170 would not be able to realize were mistakes. 47 00:02:04,170 --> 00:02:06,620 So there are really three main principles 48 00:02:06,620 --> 00:02:11,590 that I was trying to follow in designing this language. 49 00:02:11,590 --> 00:02:13,850 The middle one is most relevant in this context. 50 00:02:13,850 --> 00:02:16,970 But they are programmer productivity, security, 51 00:02:16,970 --> 00:02:17,860 and performance.
52 00:02:17,860 --> 00:02:22,300 And the last part, especially on the server side, because that 53 00:02:22,300 --> 00:02:24,350 seemed more important for scaling reasons. 54 00:02:24,350 --> 00:02:27,200 In many cases, the users of your application 55 00:02:27,200 --> 00:02:30,771 won't notice small performance issues on the client side. 56 00:02:30,771 --> 00:02:32,270 But a small issue on the server side 57 00:02:32,270 --> 00:02:34,590 could force you to buy many more servers than you would have 58 00:02:34,590 --> 00:02:35,090 otherwise. 59 00:02:37,950 --> 00:02:41,820 And at this point, there are some users of Ur/Web-- 60 00:02:41,820 --> 00:02:44,380 not nearly as many as pretty much any other language 61 00:02:44,380 --> 00:02:45,490 you can probably think of. 62 00:02:45,490 --> 00:02:48,730 But there's at least this one commercial web application, 63 00:02:48,730 --> 00:02:53,420 which is an RSS feed reader that supports such exotic features 64 00:02:53,420 --> 00:02:55,170 as displaying comments. 65 00:02:55,170 --> 00:02:59,060 And there's the URL chosen by a non-native English speaker who 66 00:02:59,060 --> 00:03:00,210 regrets it now. 67 00:03:00,210 --> 00:03:03,200 It's called BazQux Reader, as a combination of common metasyntactic 68 00:03:03,200 --> 00:03:06,840 variables from the hacker community. 69 00:03:06,840 --> 00:03:10,019 And there are a few thousand paying users. 70 00:03:10,019 --> 00:03:12,060 And it looks like that-- much nicer than anything 71 00:03:12,060 --> 00:03:15,010 I know how to make with CSS. 72 00:03:15,010 --> 00:03:17,380 But here's a proof that it can be done using Ur/Web. 73 00:03:20,640 --> 00:03:22,829 Feel free to jump in with questions at any point, 74 00:03:22,829 --> 00:03:24,870 though I probably haven't gotten to the point yet 75 00:03:24,870 --> 00:03:27,400 that provokes many questions.
76 00:03:27,400 --> 00:03:29,505 So the basic sales pitch for Ur/Web 77 00:03:29,505 --> 00:03:31,880 is that it has a very high level programming model, which 78 00:03:31,880 --> 00:03:33,380 is very different from, say, Django, 79 00:03:33,380 --> 00:03:35,810 which I know you spent some time reading about or talking 80 00:03:35,810 --> 00:03:38,020 about in class. 81 00:03:38,020 --> 00:03:41,362 And it has a good security story. 82 00:03:41,362 --> 00:03:42,820 Some features you want for security 83 00:03:42,820 --> 00:03:44,580 are really integrated into the system 84 00:03:44,580 --> 00:03:48,340 so that you would really have to work hard to avoid inheriting 85 00:03:48,340 --> 00:03:49,360 these security benefits. 86 00:03:49,360 --> 00:03:51,780 And I'll say more about the details shortly. 87 00:03:51,780 --> 00:03:53,410 And also, the server side performance 88 00:03:53,410 --> 00:03:56,874 is unusually good, even among the popular tools 89 00:03:56,874 --> 00:03:58,790 for building web applications that you're more 90 00:03:58,790 --> 00:04:01,490 likely to have heard of before. 91 00:04:01,490 --> 00:04:06,180 And the caveat is that you probably 92 00:04:06,180 --> 00:04:09,290 need to have internalized the big ideas 93 00:04:09,290 --> 00:04:10,800 of functional programming languages 94 00:04:10,800 --> 00:04:13,890 like Haskell before you're ready to start 95 00:04:13,890 --> 00:04:14,550 using Ur/Web. 96 00:04:14,550 --> 00:04:19,110 And looking at the questions and answers for this class, 97 00:04:19,110 --> 00:04:22,060 maybe a fifth of you were complaining 98 00:04:22,060 --> 00:04:25,380 about the functional programming parts of the paper 99 00:04:25,380 --> 00:04:26,700 being hard to follow. 100 00:04:26,700 --> 00:04:29,166 I apologize.
101 00:04:29,166 --> 00:04:30,540 There are just so many good ideas 102 00:04:30,540 --> 00:04:31,800 in the world of functional programming 103 00:04:31,800 --> 00:04:33,340 that it's hard not to start from that point 104 00:04:33,340 --> 00:04:35,070 and add more cool stuff on top of that. 105 00:04:35,070 --> 00:04:38,110 And I will try to avoid any requirement 106 00:04:38,110 --> 00:04:40,800 to know that material to follow what 107 00:04:40,800 --> 00:04:45,240 I'll be doing in class today. 108 00:04:45,240 --> 00:04:48,580 So the programming model is really closely connected 109 00:04:48,580 --> 00:04:49,910 to static typing. 110 00:04:49,910 --> 00:04:52,060 And that's not just static typing like 111 00:04:52,060 --> 00:04:54,170 in, say, Java, which has a relatively inexpressive 112 00:04:54,170 --> 00:04:56,190 clunky type system, but static typing 113 00:04:56,190 --> 00:04:58,460 like in Haskell or OCaml. 114 00:04:58,460 --> 00:05:00,260 And these types are one of the ways 115 00:05:00,260 --> 00:05:02,380 that the compiler understands what you're doing 116 00:05:02,380 --> 00:05:05,190 and catches mistakes in your program. 117 00:05:05,190 --> 00:05:07,954 And it turns out that the core Ur language 118 00:05:07,954 --> 00:05:10,120 that Ur/Web is built on top of has a very expressive 119 00:05:10,120 --> 00:05:11,090 static type system. 120 00:05:11,090 --> 00:05:12,700 So many of the things that Ur/Web does 121 00:05:12,700 --> 00:05:14,710 are actually just exposed as libraries 122 00:05:14,710 --> 00:05:16,630 with no special compiler support. 123 00:05:16,630 --> 00:05:20,520 For instance, we'll teach the compiler how to type check 124 00:05:20,520 --> 00:05:23,860 SQL queries without actually building the typing rules 125 00:05:23,860 --> 00:05:25,420 of SQL into the compiler. 
126 00:05:25,420 --> 00:05:29,310 They can be encoded as a library and use a standard type checker 127 00:05:29,310 --> 00:05:33,425 to make sure your SQL queries are following the rules of SQL. 128 00:05:36,950 --> 00:05:40,010 Most relevant in this context, the security story 129 00:05:40,010 --> 00:05:44,200 at a high level-- most of the most common security 130 00:05:44,200 --> 00:05:48,690 vulnerabilities are impossible by construction in Ur/Web. 131 00:05:48,690 --> 00:05:51,400 You will have to explicitly enable scary 132 00:05:51,400 --> 00:05:55,520 looking flag names to be allowed to do most of the most 133 00:05:55,520 --> 00:05:57,550 awful things you can do in a web application, 134 00:05:57,550 --> 00:06:02,090 like no cross site scripting vulnerabilities unless you 135 00:06:02,090 --> 00:06:04,610 really invoke some black magic, say, by using 136 00:06:04,610 --> 00:06:08,020 the foreign function interface. 137 00:06:08,020 --> 00:06:11,130 And there are a few other security-specific features 138 00:06:11,130 --> 00:06:14,100 that I'll highlight later. 139 00:06:14,100 --> 00:06:16,540 And the performance is also very good. 140 00:06:16,540 --> 00:06:21,790 The compiler is, first of all, a domain specific compiler 141 00:06:21,790 --> 00:06:22,970 for a web application. 142 00:06:22,970 --> 00:06:25,890 So it understands what the web application is doing and is 143 00:06:25,890 --> 00:06:28,800 able to optimize some things that a more general compiler 144 00:06:28,800 --> 00:06:29,770 wouldn't catch. 145 00:06:29,770 --> 00:06:31,904 And usually the code that comes out 146 00:06:31,904 --> 00:06:33,570 of this compiler that runs on the server 147 00:06:33,570 --> 00:06:36,100 is native code, which is very, very 148 00:06:36,100 --> 00:06:41,450 competitive with what you might bother to write by hand in C. 
149 00:06:41,450 --> 00:06:44,449 And the performance costs that there 150 00:06:44,449 --> 00:06:45,990 are compared to other approaches tend 151 00:06:45,990 --> 00:06:47,630 to have to do with the concurrency model, which 152 00:06:47,630 --> 00:06:49,340 makes the programmer's life easier 153 00:06:49,340 --> 00:06:50,905 at some cost in performance. 154 00:06:50,905 --> 00:06:53,155 And I'll say a little bit more about that in a moment. 155 00:06:55,920 --> 00:06:59,960 Here's a quick plug for this web framework benchmarking 156 00:06:59,960 --> 00:07:03,380 initiative that is run by a third party. 157 00:07:03,380 --> 00:07:05,750 This is a screenshot of the results of the most 158 00:07:05,750 --> 00:07:09,830 recent round where a number of different web programming tasks 159 00:07:09,830 --> 00:07:11,870 were completed in many different frameworks, 160 00:07:11,870 --> 00:07:15,460 and they were compared pretty much exclusively on performance 161 00:07:15,460 --> 00:07:16,470 so far. 162 00:07:16,470 --> 00:07:18,780 And here you can see Ur/Web sitting 163 00:07:18,780 --> 00:07:23,400 at fourth out of about 60 frameworks on this benchmark. 164 00:07:23,400 --> 00:07:24,890 And there's been some improvements 165 00:07:24,890 --> 00:07:28,460 to the Ur/Web compiler since this screenshot was taken. 166 00:07:28,460 --> 00:07:29,960 And I expect in the next round it'll 167 00:07:29,960 --> 00:07:32,740 move up a little bit higher. 168 00:07:32,740 --> 00:07:35,880 But basically, already this is a simple example using 169 00:07:35,880 --> 00:07:37,630 SQL to generate HTML pages. 170 00:07:37,630 --> 00:07:40,970 You get about 100,000 requests per second 171 00:07:40,970 --> 00:07:43,100 from the Ur/Web server, which is going 172 00:07:43,100 --> 00:07:45,920 to be just plenty for most applications. 
173 00:07:45,920 --> 00:07:48,940 So sort of maybe the important takeaway message 174 00:07:48,940 --> 00:07:53,110 from this slide in this class is that you can adopt a high level 175 00:07:53,110 --> 00:07:57,270 model that makes security easier to achieve without just giving 176 00:07:57,270 --> 00:08:00,914 up all the performance that you would expect to get from more 177 00:08:00,914 --> 00:08:01,830 mainstream techniques. 178 00:08:04,840 --> 00:08:07,760 All right, so let me start out by giving my cartoon 179 00:08:07,760 --> 00:08:09,810 impression of the way web programmers think 180 00:08:09,810 --> 00:08:12,885 about writing web applications in mainstream frameworks today. 181 00:08:12,885 --> 00:08:15,250 And then I'll show the different perspective 182 00:08:15,250 --> 00:08:17,810 that Ur/Web provides, where some of the things that 183 00:08:17,810 --> 00:08:20,520 can go wrong at this level given the abstractions that 184 00:08:20,520 --> 00:08:24,115 are exposed can no longer go wrong by construction. 185 00:08:24,115 --> 00:08:26,740 So the basic cartoon picture is there's a web server out there. 186 00:08:26,740 --> 00:08:29,410 And it's sort of in charge of the whole process 187 00:08:29,410 --> 00:08:30,720 of your application. 188 00:08:30,720 --> 00:08:33,049 And there's a whole fleet of browsers out there 189 00:08:33,049 --> 00:08:35,260 that are going to interact with that server. 190 00:08:35,260 --> 00:08:38,010 It'll have some state that winds up effectively 191 00:08:38,010 --> 00:08:40,350 shared across all these browsers through their contact 192 00:08:40,350 --> 00:08:42,179 with the server. 193 00:08:42,179 --> 00:08:44,210 So the usual picture is that the browser 194 00:08:44,210 --> 00:08:46,060 starts interacting with the web server 195 00:08:46,060 --> 00:08:49,520 by sending it an HTTP request that includes 196 00:08:49,520 --> 00:08:51,320 some URLs embedded in it. 
197 00:08:51,320 --> 00:08:53,380 And then the web server throws back, 198 00:08:53,380 --> 00:08:55,597 again over HTTP, an HTML page. 199 00:08:55,597 --> 00:08:57,305 And there are some URLs embedded in that, 200 00:08:57,305 --> 00:08:59,860 which can be used to decide which requests to make to the web 201 00:08:59,860 --> 00:09:02,760 server in the future. 202 00:09:02,760 --> 00:09:04,350 This web server might also be talking 203 00:09:04,350 --> 00:09:07,450 to a database that provides a persistent store that 204 00:09:07,450 --> 00:09:10,170 is shared across all the users of the application. 205 00:09:10,170 --> 00:09:13,240 One popular protocol to speak between the web 206 00:09:13,240 --> 00:09:15,870 server and the database is SQL. 207 00:09:15,870 --> 00:09:20,070 That's what I'll be focusing on in talking about Ur/Web. 208 00:09:20,070 --> 00:09:24,390 And also, with modern web applications, 209 00:09:24,390 --> 00:09:27,569 it's not just the one page at a time model 210 00:09:27,569 --> 00:09:29,610 where whenever anything has to change on the page 211 00:09:29,610 --> 00:09:31,010 you make a new request to the server 212 00:09:31,010 --> 00:09:32,843 and then replace the whole page as a unit. 213 00:09:32,843 --> 00:09:35,424 There's this Ajax style where the browser 214 00:09:35,424 --> 00:09:37,090 within a single page view will sometimes 215 00:09:37,090 --> 00:09:39,960 make extra HTTP requests to the web server 216 00:09:39,960 --> 00:09:42,390 and receive responses that are processed programmatically 217 00:09:42,390 --> 00:09:43,970 in a customized way. 218 00:09:43,970 --> 00:09:45,870 And this often uses representations 219 00:09:45,870 --> 00:09:49,070 like XML and JSON and other simple wire 220 00:09:49,070 --> 00:09:51,980 formats for exchanging data between the client 221 00:09:51,980 --> 00:09:54,400 and the server.
222 00:09:54,400 --> 00:09:56,980 And then when the browser gets back that response, 223 00:09:56,980 --> 00:09:58,480 there's some JavaScript code running 224 00:09:58,480 --> 00:10:01,440 there, which implements arbitrary logic for controlling 225 00:10:01,440 --> 00:10:03,875 the UI that we're displaying to the user. 226 00:10:03,875 --> 00:10:06,000 And the way this works is that this JavaScript code 227 00:10:06,000 --> 00:10:08,200 can read the responses that the server has given 228 00:10:08,200 --> 00:10:10,050 to those different Ajax calls. 229 00:10:10,050 --> 00:10:14,480 And then it can modify the page that's displayed basically 230 00:10:14,480 --> 00:10:18,470 by mutating a global variable that stands for the page. 231 00:10:18,470 --> 00:10:21,170 And any part of the program can have 232 00:10:21,170 --> 00:10:25,130 arbitrary effects on this global variable that is the page. 233 00:10:25,130 --> 00:10:27,230 And often, parts of the page are looked up 234 00:10:27,230 --> 00:10:30,391 by string IDs that are annotated on nodes of the tree that's 235 00:10:30,391 --> 00:10:31,390 describing the document. 236 00:10:34,000 --> 00:10:36,300 And finally, one more complication-- sometimes 237 00:10:36,300 --> 00:10:40,730 we want to allow what feels like the web server contacting 238 00:10:40,730 --> 00:10:43,840 the browser without prompting. 239 00:10:43,840 --> 00:10:45,700 So say there's a new email message. 240 00:10:45,700 --> 00:10:48,260 The web server wants to tell the browser, new message. 241 00:10:48,260 --> 00:10:50,850 So there are a variety of ways of doing this involving 242 00:10:50,850 --> 00:10:53,610 acronyms like Comet and WebSockets 243 00:10:53,610 --> 00:10:56,540 that really look a lot like the browser contacting the server. 244 00:10:56,540 --> 00:10:59,210 It's the same sort of thing conceptually 245 00:10:59,210 --> 00:11:02,610 in the other direction. 
246 00:11:02,610 --> 00:11:05,940 All right, so I want to bring back on the screen 247 00:11:05,940 --> 00:11:08,360 all these protocols and languages, 248 00:11:08,360 --> 00:11:11,630 highlight some parts in yellow here. 249 00:11:11,630 --> 00:11:13,530 Having read the paper, does anyone 250 00:11:13,530 --> 00:11:16,580 have a guess about what is the commonality between all 251 00:11:16,580 --> 00:11:19,465 these highlighted parts here from a security perspective? 252 00:11:24,190 --> 00:11:24,690 Yes. 253 00:11:24,690 --> 00:11:26,310 STUDENT: They're all strings. 254 00:11:26,310 --> 00:11:28,327 So you can put whatever you want in them. 255 00:11:28,327 --> 00:11:30,410 ADAM CHLIPALA: Right, in the mainstream approaches 256 00:11:30,410 --> 00:11:32,620 to web application programming, all of these things 257 00:11:32,620 --> 00:11:33,460 are strings. 258 00:11:33,460 --> 00:11:35,600 And the programming language doesn't understand 259 00:11:35,600 --> 00:11:37,140 the way you're using them and can't 260 00:11:37,140 --> 00:11:39,460 help you avoid making mistakes. 261 00:11:39,460 --> 00:11:42,380 So for instance, by representing these things as strings, 262 00:11:42,380 --> 00:11:44,960 you get code injection attacks. 263 00:11:44,960 --> 00:11:47,260 So as far as I'm concerned, code injection attacks 264 00:11:47,260 --> 00:11:49,160 are basically about the consequence 265 00:11:49,160 --> 00:11:51,280 of including as a primitive in your programming 266 00:11:51,280 --> 00:11:55,410 language or your framework some function that runs programs 267 00:11:55,410 --> 00:11:59,090 as text in some sufficiently expressive language. 268 00:11:59,090 --> 00:12:01,850 In Ur/Web, there is no built-in interpreter 269 00:12:01,850 --> 00:12:04,640 at runtime for strings as programs. 270 00:12:04,640 --> 00:12:07,830 And that makes a lot of the most common mistakes 271 00:12:07,830 --> 00:12:10,684 in web applications impossible by construction.
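As a rough illustration of the point about strings, here is a minimal sketch in Python with the standard sqlite3 module. It is not Ur/Web code, and the table, column names, and helper functions are made up for the example. It contrasts a query assembled as text with one where the user input is bound as a value.

```python
import sqlite3

def make_db():
    # Hypothetical schema, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_unsafe(conn, name):
    # The query is assembled as text, so the input can rewrite the query itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def lookup_safe(conn, name):
    # The input is bound as a value; it never becomes part of the SQL text.
    rows = conn.execute("SELECT secret FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]

conn = make_db()
attack = "nobody' OR '1'='1"
leaked = lookup_unsafe(conn, attack)   # the WHERE clause is now always true
blocked = lookup_safe(conn, attack)    # no user has this literal name
```

In Python, choosing the safe form is a discipline the programmer has to remember. The claim in the lecture is stronger: in Ur/Web there is no primitive that runs a string as a program, so the unsafe form cannot be written without stepping outside the language.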
272 00:12:10,684 --> 00:12:12,350 So all these things that are highlighted 273 00:12:12,350 --> 00:12:15,240 will either be invisible, or they'll 274 00:12:15,240 --> 00:12:20,040 be represented with special types that make clear what kind 275 00:12:20,040 --> 00:12:23,380 of code you're dealing with and don't 276 00:12:23,380 --> 00:12:24,960 have any sort of automatic coercion 277 00:12:24,960 --> 00:12:27,780 from string into any of those special types. 278 00:12:31,210 --> 00:12:33,700 All right, so here's the alternative model 279 00:12:33,700 --> 00:12:36,170 that Ur/Web exposes, which gets compiled 280 00:12:36,170 --> 00:12:37,270 to the traditional model. 281 00:12:37,270 --> 00:12:40,810 So it works in all the widely deployed browsers. 282 00:12:40,810 --> 00:12:43,630 But the programmer can think at this higher level 283 00:12:43,630 --> 00:12:46,450 and avoid the potential for mistakes 284 00:12:46,450 --> 00:12:48,334 that were possible in the previous picture. 285 00:12:48,334 --> 00:12:50,500 So we still have the web server, which is in charge. 286 00:12:50,500 --> 00:12:52,110 And we still have this fleet of browsers that are 287 00:12:52,110 --> 00:12:53,630 trying to use the web server. 288 00:12:53,630 --> 00:12:55,240 But now, the first important change 289 00:12:55,240 --> 00:12:58,370 is that when the browser wants to initiate use of a web 290 00:12:58,370 --> 00:13:02,710 application, it doesn't just send a string of HTTP requests 291 00:13:02,710 --> 00:13:03,980 with a URL in it. 292 00:13:03,980 --> 00:13:09,090 Effectively, the abstraction is the browser names a function 293 00:13:09,090 --> 00:13:12,680 that should be called where the call runs on the server instead 294 00:13:12,680 --> 00:13:14,720 of the client. 295 00:13:14,720 --> 00:13:19,030 And then the server responds with not a string 296 00:13:19,030 --> 00:13:24,680 of HTTP protocol text but a strongly typed document tree.
297 00:13:24,680 --> 00:13:27,640 So instead of a string of HTML, it's a tree, 298 00:13:27,640 --> 00:13:31,210 a first class object in the language. 299 00:13:31,210 --> 00:13:34,420 And that is how the program manipulates 300 00:13:34,420 --> 00:13:37,750 it, not as a string. 301 00:13:37,750 --> 00:13:40,162 And each of these trees contains within it links, 302 00:13:40,162 --> 00:13:41,620 which are themselves basically just 303 00:13:41,620 --> 00:13:43,453 references to other functions that you might 304 00:13:43,453 --> 00:13:45,500 choose to call on the server. 305 00:13:45,500 --> 00:13:48,590 So then the browser, when the user clicks on those links, 306 00:13:48,590 --> 00:13:50,861 picks out the function and conceptually calls it 307 00:13:50,861 --> 00:13:53,110 on the server, just like the original function that we 308 00:13:53,110 --> 00:13:54,276 called to get to this point. 309 00:13:56,840 --> 00:13:59,740 And we have a database interface, 310 00:13:59,740 --> 00:14:01,850 which is accessed by the web server 311 00:14:01,850 --> 00:14:03,450 throwing queries at the database. 312 00:14:03,450 --> 00:14:05,730 And these are not just text in the Ur/Web model. 313 00:14:05,730 --> 00:14:09,610 They're strongly typed SQL syntax trees. 314 00:14:09,610 --> 00:14:13,930 And then the database will respond back with not text, 315 00:14:13,930 --> 00:14:18,500 but a list of records of native values 316 00:14:18,500 --> 00:14:20,950 in the programming language that we're working with. 317 00:14:20,950 --> 00:14:24,220 So we don't have to worry about incorrectly converting 318 00:14:24,220 --> 00:14:27,190 between strings and native representations, 319 00:14:27,190 --> 00:14:29,920 or native representations in any other format 320 00:14:29,920 --> 00:14:32,070 that the database might traditionally 321 00:14:32,070 --> 00:14:33,860 be presenting to us. 
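The idea of a page as a tree whose links are references to server functions can be sketched roughly as follows. This is Python, not Ur/Web, and every name here (Text, Link, Element, render, home, greet) is hypothetical. Python checks none of this at compile time, so the sketch only mirrors the structure: text enters the document as a Text node and is escaped when rendered, and a link holds a function rather than a URL string.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Callable, List, Union

@dataclass
class Text:
    value: str  # plain text, never interpreted as markup

@dataclass
class Link:
    handler: Callable[[], "Element"]  # the server function this link names
    label: str

@dataclass
class Element:
    tag: str
    children: List[Union["Element", Text, Link]] = field(default_factory=list)

def render(node) -> str:
    # Serialization happens in exactly one place, so escaping cannot be forgotten.
    if isinstance(node, Text):
        return escape(node.value)
    if isinstance(node, Link):
        # The URL is derived from the function, not written by hand.
        return f'<a href="/{node.handler.__name__}">{escape(node.label)}</a>'
    inner = "".join(render(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"

def home() -> Element:
    return Element("p", [Text("Welcome")])

def greet(name: str) -> Element:
    return Element("body", [
        Element("p", [Text("Hello, " + name)]),
        Link(home, "Home"),
    ])

page = render(greet("<script>alert(1)</script>"))
```

Because the hostile name only ever exists as a Text node, it comes out as inert escaped text. The difference in Ur/Web, as described in the lecture, is that the type checker rejects any attempt to use a plain string where a document fragment is expected.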
322 00:14:36,760 --> 00:14:41,450 And here's a key element of how the semantics of Ur/Web 323 00:14:41,450 --> 00:14:43,630 makes it easier for programmers to think about fewer 324 00:14:43,630 --> 00:14:46,340 scenarios that can actually happen when the application is 325 00:14:46,340 --> 00:14:48,340 running. 326 00:14:48,340 --> 00:14:50,590 There's the standard idea of transactions 327 00:14:50,590 --> 00:14:52,890 in the world of relational databases 328 00:14:52,890 --> 00:14:54,630 where you can run a series of operations 329 00:14:54,630 --> 00:14:56,940 that seem to run with no interruption 330 00:14:56,940 --> 00:14:58,500 by other concurrent threads. 331 00:14:58,500 --> 00:15:00,570 And Ur/Web adopts that model and builds it 332 00:15:00,570 --> 00:15:02,790 into the semantics of the language. 333 00:15:02,790 --> 00:15:06,190 So when a single function is running 334 00:15:06,190 --> 00:15:08,820 on the server on behalf of a client, 335 00:15:08,820 --> 00:15:11,300 then all of its database accesses 336 00:15:11,300 --> 00:15:13,000 appear to happen as an atomic unit 337 00:15:13,000 --> 00:15:16,860 without any interruption by any other concurrent requests 338 00:15:16,860 --> 00:15:18,320 to the same server. 339 00:15:18,320 --> 00:15:21,100 And you can't even avoid this behavior if you want to. 340 00:15:21,100 --> 00:15:24,410 Transactions are built into the language. 341 00:15:24,410 --> 00:15:27,840 And they really make concurrency a lot easier 342 00:15:27,840 --> 00:15:30,050 to think about, and potentially help 343 00:15:30,050 --> 00:15:32,510 you avoid security issues that only arise when 344 00:15:32,510 --> 00:15:35,870 some rare interleaving happens with a particular combination 345 00:15:35,870 --> 00:15:38,610 of requests. 346 00:15:38,610 --> 00:15:41,380 And actually, I want to get to one of the questions 347 00:15:41,380 --> 00:15:44,830 that someone submitted for this class that I found intriguing. 
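One part of the transactional model, the all-or-nothing effect of a request handler on the database, can be sketched with Python's sqlite3 module. This is an assumption-laden illustration, not how Ur/Web is implemented: run_in_transaction, transfer, and the account table are invented for the example, and the sketch shows atomicity only, not isolation between concurrent requests.

```python
import sqlite3

def run_in_transaction(conn, handler):
    # Commit only if the handler finishes; otherwise undo every write it made.
    try:
        result = handler(conn)
        conn.commit()
        return result
    except Exception:
        conn.rollback()
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(amount):
    def handler(c):
        # Deliberately credit bob first, so a later failure leaves a partial write.
        c.execute("UPDATE account SET balance = balance + ? WHERE owner = 'bob'", (amount,))
        (remaining,) = c.execute(
            "SELECT balance FROM account WHERE owner = 'alice'").fetchone()
        if remaining < amount:
            raise ValueError("insufficient funds")
        c.execute("UPDATE account SET balance = balance - ? WHERE owner = 'alice'", (amount,))
    return handler

def balances(c):
    return dict(c.execute("SELECT owner, balance FROM account"))

run_in_transaction(conn, transfer(30))
try:
    run_in_transaction(conn, transfer(500))
except ValueError:
    pass  # the partial credit to bob was rolled back
final = balances(conn)
```

Here the wrapper has to be applied by hand around each handler. In the model described in the lecture, every server-side function call made on behalf of a client runs this way without the programmer asking for it.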
348 00:15:44,830 --> 00:15:47,940 Ur/Web will detect when a transaction fails because 349 00:15:47,940 --> 00:15:49,810 of a concurrency problem, like a deadlock, 350 00:15:49,810 --> 00:15:52,420 and automatically restart the transaction. 351 00:15:52,420 --> 00:15:56,550 Someone's response to a question said, 352 00:15:56,550 --> 00:16:00,090 this might make it easier to launch security attacks that 353 00:16:00,090 --> 00:16:02,400 depend on causing transactions to fail 354 00:16:02,400 --> 00:16:05,512 because of concurrency issues. 355 00:16:05,512 --> 00:16:07,095 I just wanted to ask the class, what's 356 00:16:07,095 --> 00:16:09,125 an example of an attack like that, 357 00:16:09,125 --> 00:16:13,260 if anyone happens to have one in mind? 358 00:16:13,260 --> 00:16:15,260 If you have a system that automatically restarts 359 00:16:15,260 --> 00:16:17,330 transactions that run into deadlocks, 360 00:16:17,330 --> 00:16:21,001 how does that cause a security problem, if it does? 361 00:16:21,001 --> 00:16:23,042 This is a question I don't have an answer in mind 362 00:16:23,042 --> 00:16:24,793 for, which is why I'm asking it. 363 00:16:34,940 --> 00:16:36,782 It might also have only a non-obvious answer 364 00:16:36,782 --> 00:16:38,990 that no one would come up with on the spot like this, 365 00:16:38,990 --> 00:16:39,878 which is fine, too. 366 00:16:44,856 --> 00:16:45,356 Yeah. 367 00:16:45,356 --> 00:16:47,846 STUDENT: Can you maybe do some sort of denial of service? 368 00:16:47,846 --> 00:16:50,336 If it's going to restart a transaction that you're 369 00:16:50,336 --> 00:16:51,830 sending, and you know it will fail, 370 00:16:51,830 --> 00:16:54,320 can you just keep restarting that and try again? 
371 00:16:56,840 --> 00:16:59,335 ADAM CHLIPALA: OK, so-- 372 00:16:59,335 --> 00:17:01,248 STUDENT: So if you could cause the system 373 00:17:01,248 --> 00:17:04,741 to do some transaction you know is about to fail and repeatedly 374 00:17:04,741 --> 00:17:07,966 fail, it keeps trying over and over again, 375 00:17:07,966 --> 00:17:09,397 it would never [INAUDIBLE]. 376 00:17:09,397 --> 00:17:11,730 ADAM CHLIPALA: Right, so you'd need at least two threads 377 00:17:11,730 --> 00:17:13,140 running at once to do that. 378 00:17:13,140 --> 00:17:17,140 But potentially that could work. 379 00:17:17,140 --> 00:17:20,369 So you could launch a denial of service attack taking advantage 380 00:17:20,369 --> 00:17:23,150 of the fact that contention leads 381 00:17:23,150 --> 00:17:26,474 to request handlers restarting over and over again 382 00:17:26,474 --> 00:17:28,640 and purposely cause contention and use this as a way 383 00:17:28,640 --> 00:17:31,630 to amplify the strength of your denial of service attack 384 00:17:31,630 --> 00:17:36,401 beyond what you can get with a traditional model. 385 00:17:36,401 --> 00:17:37,650 All right, I can believe that. 386 00:17:37,650 --> 00:17:38,010 Yeah. 387 00:17:38,010 --> 00:17:40,551 STUDENT: Is [INAUDIBLE] the only way to cause the transaction 388 00:17:40,551 --> 00:17:41,500 to fail? 389 00:17:41,500 --> 00:17:42,880 ADAM CHLIPALA: It is. 390 00:17:42,880 --> 00:17:45,580 Well, it's the only way to cause it to fail and automatically 391 00:17:45,580 --> 00:17:46,080 restart. 392 00:17:50,660 --> 00:17:52,392 Yeah. 393 00:17:52,392 --> 00:17:54,450 STUDENT: Perhaps it could have a third party, 394 00:17:54,450 --> 00:17:56,346 which would conditionally fail. 395 00:17:56,346 --> 00:17:59,427 And then you could use that to monitor some other user's 396 00:17:59,427 --> 00:18:00,140 behavior. 
397 00:18:00,140 --> 00:18:01,640 ADAM CHLIPALA: You'd also need a way 398 00:18:01,640 --> 00:18:04,390 to observe the fact that it had failed, which you should only 399 00:18:04,390 --> 00:18:05,790 be able to do through timing. 400 00:18:05,790 --> 00:18:07,320 But that could still be an issue. 401 00:18:07,320 --> 00:18:11,300 OK, right, so you can use this as a side channel 402 00:18:11,300 --> 00:18:12,800 to see what other threads are doing, 403 00:18:12,800 --> 00:18:14,758 because their actions might or might not create 404 00:18:14,758 --> 00:18:15,880 a conflict in your thread. 405 00:18:19,170 --> 00:18:25,520 OK, that sounds possible in principle, and very twisty. 406 00:18:25,520 --> 00:18:26,897 I'm not sure. 407 00:18:26,897 --> 00:18:28,730 It's hard to think of a concrete attack that 408 00:18:28,730 --> 00:18:31,810 would work predictably. 409 00:18:31,810 --> 00:18:33,430 But it could be a fun exercise. 410 00:18:33,430 --> 00:18:33,930 Yeah. 411 00:18:33,930 --> 00:18:38,752 STUDENT: So do the transactions you run-- for each request that 412 00:18:38,752 --> 00:18:41,145 comes in, you run a transaction for the code 413 00:18:41,145 --> 00:18:42,470 you run at the web server. 414 00:18:42,470 --> 00:18:44,303 But when you send that code to the database, 415 00:18:44,303 --> 00:18:46,705 does that translate into a database transaction as well? 416 00:18:46,705 --> 00:18:47,830 ADAM CHLIPALA: It does, yeah. 417 00:18:47,830 --> 00:18:49,800 The whole execution on the server side 418 00:18:49,800 --> 00:18:52,040 is wrapped in one database transaction 419 00:18:52,040 --> 00:18:54,981 if the application uses the database. 420 00:18:54,981 --> 00:18:55,480 Yeah. 421 00:18:55,480 --> 00:18:57,313 STUDENT: So if you have a transaction that's 422 00:18:57,313 --> 00:19:00,427 not going to end up updating, do you think [INAUDIBLE]? 423 00:19:00,427 --> 00:19:01,260 ADAM CHLIPALA: Yeah.
424 00:19:01,260 --> 00:19:02,820 STUDENT: Are you telling the database 425 00:19:02,820 --> 00:19:04,851 that nothing's going to be updated later? 426 00:19:04,851 --> 00:19:06,975 Because presumably, the database doesn't know that. 427 00:19:06,975 --> 00:19:09,308 ADAM CHLIPALA: Yes, so the compiler does static analysis 428 00:19:09,308 --> 00:19:11,560 and finds out transactions that need to be read-only. 429 00:19:11,560 --> 00:19:14,310 And it creates the transaction in read-only mode, 430 00:19:14,310 --> 00:19:18,516 which in some database systems enables extra optimizations. 431 00:19:18,516 --> 00:19:23,450 STUDENT: What about if you read some stuff, and some 432 00:19:23,450 --> 00:19:25,142 of the stuff you read doesn't affect 433 00:19:25,142 --> 00:19:26,962 what you're going to write, but some of the other stuff 434 00:19:26,962 --> 00:19:27,770 you read does? 435 00:19:27,770 --> 00:19:29,395 ADAM CHLIPALA: I see, so you're asking, 436 00:19:29,395 --> 00:19:32,470 could we use our knowledge of the semantics 437 00:19:32,470 --> 00:19:35,000 of the application to give hints to the database system 438 00:19:35,000 --> 00:19:39,860 saying some of what looked like concurrency violations are 439 00:19:39,860 --> 00:19:44,130 actually benign, and we don't need to restart at that point? 440 00:19:44,130 --> 00:19:45,450 I think the short answer is no. 441 00:19:45,450 --> 00:19:46,940 The current implementation doesn't do that. 442 00:19:46,940 --> 00:19:48,670 But that would be interesting to look into. 443 00:19:48,670 --> 00:19:51,003 I think it would require changes to the database engine, 444 00:19:51,003 --> 00:19:52,660 not just the interface in the language. 445 00:19:52,660 --> 00:19:54,124 STUDENT: Usually you could split it 446 00:19:54,124 --> 00:19:55,707 into two separate transactions, maybe, 447 00:19:55,707 --> 00:20:00,394 or something under certain circumstances. 
448 00:20:00,394 --> 00:20:03,700 ADAM CHLIPALA: Yeah, that sounds hard to do right, 449 00:20:03,700 --> 00:20:06,872 but potentially worthwhile for-- I don't know how to estimate 450 00:20:06,872 --> 00:20:09,330 what fraction of applications could take advantage of that, 451 00:20:09,330 --> 00:20:10,410 but it's a neat idea. 452 00:20:13,910 --> 00:20:16,235 All right, so transactions are great. 453 00:20:19,370 --> 00:20:22,190 We also have-- so I was just telling you about the model, 454 00:20:22,190 --> 00:20:23,725 the old school model of the browser 455 00:20:23,725 --> 00:20:25,920 requesting a single page from the web server. 456 00:20:25,920 --> 00:20:28,720 We can also have this Ajax style stuff that basically 457 00:20:28,720 --> 00:20:30,432 looks like code on the client. 458 00:20:30,432 --> 00:20:31,890 It's calling a function that's just 459 00:20:31,890 --> 00:20:33,370 marked to run on the server. 460 00:20:33,370 --> 00:20:37,460 When it finishes, the result comes back in the client code. 461 00:20:37,460 --> 00:20:39,190 And the result is just a native value 462 00:20:39,190 --> 00:20:40,356 in the programming language. 463 00:20:40,356 --> 00:20:42,800 You don't have to worry about making it into a string 464 00:20:42,800 --> 00:20:44,190 somehow and translating it back. 465 00:20:47,580 --> 00:20:49,079 And then we have to take the result 466 00:20:49,079 --> 00:20:51,120 and use it to change the page that the user sees. 467 00:20:51,120 --> 00:20:54,150 Otherwise, it wasn't a very useful request to make. 468 00:20:54,150 --> 00:20:56,250 So the model in Ur/Web is very different 469 00:20:56,250 --> 00:20:58,310 from the standard document object model 470 00:20:58,310 --> 00:20:59,840 that browsers expose directly. 471 00:20:59,840 --> 00:21:01,730 The basic idea is something called 472 00:21:01,730 --> 00:21:03,840 functional reactive programming, which I won't try 473 00:21:03,840 --> 00:21:05,090 to explain in too much detail. 
474 00:21:05,090 --> 00:21:10,210 Because I know it requires a nontrivial grokking 475 00:21:10,210 --> 00:21:13,585 of functional programming first, even if we cut off 476 00:21:13,585 --> 00:21:14,890 that reactive part. 477 00:21:14,890 --> 00:21:16,730 But the basic idea is the document 478 00:21:16,730 --> 00:21:19,490 is described in terms of a set of mutable cells, which 479 00:21:19,490 --> 00:21:21,360 are sort of the data the page depends on. 480 00:21:21,360 --> 00:21:23,450 And the page itself is something different, 481 00:21:23,450 --> 00:21:25,350 described as a function that takes 482 00:21:25,350 --> 00:21:27,030 as inputs the values of those cells, 483 00:21:27,030 --> 00:21:28,710 and then computes a page. 484 00:21:28,710 --> 00:21:30,650 And then the runtime system of the language 485 00:21:30,650 --> 00:21:32,865 watches changes to those mutable cells. 486 00:21:32,865 --> 00:21:34,680 And when they do change, it automatically 487 00:21:34,680 --> 00:21:37,550 computes the consequences for the displayed page 488 00:21:37,550 --> 00:21:41,066 and efficiently updates just the parts of the page 489 00:21:41,066 --> 00:21:42,990 that have changed based on those cells. 490 00:21:47,050 --> 00:21:49,570 All right, and on each client, there 491 00:21:49,570 --> 00:21:51,690 can be many different threads running at once. 492 00:21:54,320 --> 00:21:56,870 These threads are spawned in Ur/Web code 493 00:21:56,870 --> 00:21:58,780 and themselves run Ur/Web code. 494 00:21:58,780 --> 00:22:01,460 But the compiler needs to translate them into JavaScript 495 00:22:01,460 --> 00:22:03,170 to get the browser to run them. 496 00:22:03,170 --> 00:22:06,840 So that's one of the services the compiler provides. 497 00:22:06,840 --> 00:22:09,495 That's one important point about the threads. 
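The mutable-cell model described above can be illustrated with a small Python analogy. This is only a sketch of the idea, not Ur/Web's API: the names `Cell`, `View`, `unread`, `user`, and `banner` are hypothetical, and a real runtime would update just the affected part of the page rather than recompute a string.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Cell(Generic[T]):
    """A mutable cell that notifies subscribers whenever its value changes."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._subscribers: List[Callable[[], None]] = []

    def get(self) -> T:
        return self._value

    def set(self, value: T) -> None:
        self._value = value
        for notify in list(self._subscribers):
            notify()

    def subscribe(self, notify: Callable[[], None]) -> None:
        self._subscribers.append(notify)


class View:
    """A page fragment defined as a pure function of the cells it depends on."""

    def __init__(self, cells, render) -> None:
        self._cells = cells
        self._render = render
        self.text = ""
        self.recomputations = 0
        for cell in cells:
            cell.subscribe(self._refresh)  # the "runtime" watches each cell
        self._refresh()

    def _refresh(self) -> None:
        self.text = self._render(*(cell.get() for cell in self._cells))
        self.recomputations += 1


# The program only writes to cells; it never edits the displayed text directly.
user = Cell("alice")
unread = Cell(0)
banner = View([user, unread], lambda u, n: f"{u}: {n} unread")
unread.set(3)  # the view recomputes itself in response
```

The point of the design is that code changes data, and the displayed page follows automatically, so there is no separate step where page markup is patched by hand.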
498 00:22:09,495 --> 00:22:11,620 Another key point is that the client side threading 499 00:22:11,620 --> 00:22:13,860 follows what's called the cooperative multi-threading 500 00:22:13,860 --> 00:22:14,580 model. 501 00:22:14,580 --> 00:22:16,329 A thread doesn't have to worry about being 502 00:22:16,329 --> 00:22:19,160 preempted by another thread at an arbitrary point. 503 00:22:19,160 --> 00:22:21,220 There are well defined operations 504 00:22:21,220 --> 00:22:23,691 that signal, OK, it's all right to switch to another thread 505 00:22:23,691 --> 00:22:24,190 here. 506 00:22:24,190 --> 00:22:26,530 One of them is making a remote function call 507 00:22:26,530 --> 00:22:29,700 to the server, for instance, or asking 508 00:22:29,700 --> 00:22:31,690 to sleep for a certain number of milliseconds. 509 00:22:31,690 --> 00:22:34,270 But just regular code can't be interrupted arbitrarily. 510 00:22:34,270 --> 00:22:35,770 So that means the programmer doesn't 511 00:22:35,770 --> 00:22:37,690 need to think about as many interleavings, 512 00:22:37,690 --> 00:22:39,780 and it's easier to convince yourself 513 00:22:39,780 --> 00:22:41,970 that, say, a particular piece of code 514 00:22:41,970 --> 00:22:44,915 avoids some security issue or other bug. 515 00:22:44,915 --> 00:22:47,700 Because you can more easily enumerate all the possible ways 516 00:22:47,700 --> 00:22:49,982 for the two threads to interact with each other. 517 00:22:49,982 --> 00:22:51,440 And this is sort of a natural model 518 00:22:51,440 --> 00:22:55,890 to use given the way JavaScript is usually implemented. 519 00:22:55,890 --> 00:22:58,480 There isn't preemption in JavaScript and browsers 520 00:22:58,480 --> 00:22:58,980 already. 521 00:22:58,980 --> 00:23:01,270 So this is just presenting a threading abstraction 522 00:23:01,270 --> 00:23:03,404 on top of the callbacks-based model 523 00:23:03,404 --> 00:23:05,320 that JavaScript shows the programmer directly.
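Cooperative multi-threading as described above can be sketched with Python generators, where `yield` plays the role of an explicit switch point such as a remote call or a sleep. This is an analogy under stated assumptions, not how the Ur/Web compiler works internally; `worker`, `run`, and `log` are hypothetical names.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative thread: it can only be suspended where it says `yield`."""
    for i in range(steps):
        log.append(f"{name}{i}a")
        log.append(f"{name}{i}b")  # never separated from the line above
        yield  # explicit switch point, standing in for an RPC or a sleep


def run(threads):
    """Round-robin scheduler: resume each thread until its next switch point."""
    queue = deque(threads)
    while queue:
        thread = queue.popleft()
        try:
            next(thread)
            queue.append(thread)  # still running, so schedule it again
        except StopIteration:
            pass  # thread finished


log = []
run([worker("A", 2, log), worker("B", 2, log)])
```

Because a switch can only happen at `yield`, each pair of `a` and `b` entries stays adjacent in `log`, which is the smaller set of interleavings the programmer has to reason about.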
524 00:23:09,000 --> 00:23:11,720 And the last piece, one of the built-in 525 00:23:11,720 --> 00:23:14,100 abstractions that Ur/Web applications use, 526 00:23:14,100 --> 00:23:19,060 is channels for passing messages between different machines. 527 00:23:19,060 --> 00:23:22,300 So each channel has a type, which expresses what 528 00:23:22,300 --> 00:23:23,610 kind of data can flow over it. 529 00:23:23,610 --> 00:23:25,734 You don't have to convert things to and from string 530 00:23:25,734 --> 00:23:28,630 or JSON or anything else to make this work. 531 00:23:28,630 --> 00:23:31,550 And channels can live in the database. 532 00:23:31,550 --> 00:23:33,425 So imagine this picture is showing us there's 533 00:23:33,425 --> 00:23:34,549 a channel that was created. 534 00:23:34,549 --> 00:23:36,160 It has a write side and a read side, 535 00:23:36,160 --> 00:23:38,170 which can go to separate places. 536 00:23:38,170 --> 00:23:40,660 The write end is sitting in the database. 537 00:23:40,660 --> 00:23:42,970 And the read end somehow made its way to the client 538 00:23:42,970 --> 00:23:46,010 and is sitting in the variable environment of a thread. 539 00:23:46,010 --> 00:23:48,500 So imagine that thread earlier made a remote call 540 00:23:48,500 --> 00:23:50,610 to the server, which created the channel, 541 00:23:50,610 --> 00:23:53,580 returned it to the client, and put it in the database in one 542 00:23:53,580 --> 00:23:55,930 transaction. 543 00:23:55,930 --> 00:23:59,870 So later, the server decides, OK, I'll query that channel out 544 00:23:59,870 --> 00:24:00,560 of the database. 545 00:24:00,560 --> 00:24:02,080 And I'll dump a value into it. 546 00:24:02,080 --> 00:24:05,786 And it just sort of pops out the other end on the client. 547 00:24:05,786 --> 00:24:08,290 And everything is strongly typed throughout this process. 548 00:24:11,450 --> 00:24:14,380 All right, I think this is the last step of my animation here.
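The channel flow described above (create a channel, store the write end, hand the read end to the client, send later) can be mimicked in a few lines of Python. This is a rough single-process analogy: `Channel`, `subscribe`, `notify`, and the `database` dictionary are hypothetical stand-ins, and the type check here happens at run time, whereas the talk describes static typing.

```python
from collections import deque
from typing import Deque, Dict, Generic, Type, TypeVar

T = TypeVar("T")


class Channel(Generic[T]):
    """A one-way message queue that carries values of a single type."""

    def __init__(self, item_type: Type[T]) -> None:
        self._item_type = item_type
        self._queue: Deque[T] = deque()

    def send(self, item: T) -> None:
        if not isinstance(item, self._item_type):
            raise TypeError(
                f"channel carries {self._item_type.__name__}, "
                f"got {type(item).__name__}"
            )
        self._queue.append(item)

    def recv(self) -> T:
        return self._queue.popleft()


# A dictionary stands in for the database table that stores write ends.
database: Dict[str, Channel] = {}


def subscribe(user: str) -> Channel:
    """Remote call made by the client: create a channel, store it, return it."""
    chan = Channel(str)
    database[user] = chan
    return chan


def notify(user: str, message: str) -> None:
    """Later, server-initiated: look the channel up and push a value into it."""
    database[user].send(message)


inbox = subscribe("alice")            # read end now lives with the client
notify("alice", "New mail from bob")  # server pushes without being asked
received = inbox.recv()
```

No string or JSON conversion appears anywhere in the exchange; the value that goes in is the value that comes out.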
549 00:24:14,380 --> 00:24:18,940 Any questions about this model before I switch to a code demo? 550 00:24:23,940 --> 00:24:27,940 STUDENT: So how is this different than [INAUDIBLE]? 551 00:24:27,940 --> 00:24:30,940 Why do you need message passing if you already 552 00:24:30,940 --> 00:24:33,940 have that [INAUDIBLE]? 553 00:24:33,940 --> 00:24:36,860 ADAM CHLIPALA: OK, so RPC interface is going from browser 554 00:24:36,860 --> 00:24:40,900 initiates the call, the server handles it. 555 00:24:40,900 --> 00:24:42,546 The message-passing channels are 556 00:24:42,546 --> 00:24:44,420 intended for cases where the server initiates 557 00:24:44,420 --> 00:24:45,710 the communication. 558 00:24:45,710 --> 00:24:48,060 For instance, new email message-- that 559 00:24:48,060 --> 00:24:49,460 would be a canonical example. 560 00:24:49,460 --> 00:24:51,560 And the client is waiting to hear that there's a new email 561 00:24:51,560 --> 00:24:52,059 message. 562 00:24:52,059 --> 00:24:54,050 But it can't determine on its own 563 00:24:54,050 --> 00:24:55,840 when the next message is available. 564 00:24:55,840 --> 00:24:57,012 Yeah. 565 00:24:57,012 --> 00:24:58,720 STUDENT: Are all the messages multiplexed 566 00:24:58,720 --> 00:25:00,732 through one connection, or is it [INAUDIBLE]? 567 00:25:00,732 --> 00:25:02,190 ADAM CHLIPALA: They are multiplexed 568 00:25:02,190 --> 00:25:04,380 through one HTTP connection. 569 00:25:04,380 --> 00:25:06,710 I know there are these newfangled things today called 570 00:25:06,710 --> 00:25:08,700 web sockets and maybe some other protocols 571 00:25:08,700 --> 00:25:11,033 like that, which didn't exist when this was implemented. 572 00:25:11,033 --> 00:25:13,650 This all works over old-school HTTP 573 00:25:13,650 --> 00:25:17,263 with one connection for all the messages on different channels. 574 00:25:22,000 --> 00:25:24,000 All right, let's see what's next. 575 00:25:24,000 --> 00:25:25,850 Yeah, let me switch to a demo here.
576 00:25:29,740 --> 00:25:33,510 So here's a Hello World program in Ur/Web. 577 00:25:33,510 --> 00:25:35,510 Probably it deserves more of the screen space 578 00:25:35,510 --> 00:25:38,180 than this compilation output. 579 00:25:38,180 --> 00:25:44,660 So it looks pretty un-scary at this point, I hope. 580 00:25:44,660 --> 00:25:47,800 The unusual thing here maybe is that this is really 581 00:25:47,800 --> 00:25:48,550 the whole program. 582 00:25:48,550 --> 00:25:50,980 There's no extra routing logic that 583 00:25:50,980 --> 00:25:54,410 explains how to map a URL into some code to run 584 00:25:54,410 --> 00:25:57,010 to serve requests to that URL. 585 00:25:57,010 --> 00:25:59,810 We just have regular functions of a standard kind 586 00:25:59,810 --> 00:26:00,840 of programming language. 587 00:26:00,840 --> 00:26:06,130 And the compiler exposes all the functions in your main module 588 00:26:06,130 --> 00:26:08,740 as callable via URLs. 589 00:26:08,740 --> 00:26:11,220 And the URL is just formed from the function name. 590 00:26:11,220 --> 00:26:13,740 And if there's some nested structure modules, 591 00:26:13,740 --> 00:26:19,890 the module's structure is also replicated in the URL. 592 00:26:19,890 --> 00:26:22,230 And then we have a function that returns 593 00:26:22,230 --> 00:26:26,160 a piece of XHTML syntax. 594 00:26:26,160 --> 00:26:29,240 The compiler is actually using a special parsing extension 595 00:26:29,240 --> 00:26:32,680 for processing this XHTML syntax. 596 00:26:32,680 --> 00:26:34,640 And it's also doing some basic type 597 00:26:34,640 --> 00:26:38,372 checking to make sure that different XML elements appear 598 00:26:38,372 --> 00:26:39,830 inside others that they're actually 599 00:26:39,830 --> 00:26:43,140 authorized to appear inside of. 600 00:26:43,140 --> 00:26:46,340 And I think I compiled this before we started. 
601 00:26:46,340 --> 00:26:48,610 And it does a not very surprising thing 602 00:26:48,610 --> 00:26:50,070 in the browser. 603 00:26:50,070 --> 00:26:56,290 And here's the HTML page that comes out. 604 00:26:56,290 --> 00:27:00,050 So among other properties, it automatically 605 00:27:00,050 --> 00:27:01,940 adds the right XHTML header. 606 00:27:01,940 --> 00:27:06,700 And it declares the character encoding for this document. 607 00:27:06,700 --> 00:27:08,365 I was mildly horrified to look at some 608 00:27:08,365 --> 00:27:09,990 of your assigned reading for this class 609 00:27:09,990 --> 00:27:13,120 and see how much time this book spends talking about character 610 00:27:13,120 --> 00:27:17,331 encodings and what happens if you're not using UTF-8. 611 00:27:17,331 --> 00:27:18,930 I hope I understood that correctly. 612 00:27:18,930 --> 00:27:21,570 This forces you to use UTF-8 so that those horrible things 613 00:27:21,570 --> 00:27:23,080 aren't going to happen, I hope. 614 00:27:23,080 --> 00:27:24,680 But if anyone sees a way to replicate 615 00:27:24,680 --> 00:27:28,620 any of the attacks from that book Tangled Web in Ur/Web, 616 00:27:28,620 --> 00:27:30,180 or has a hypothesis about something 617 00:27:30,180 --> 00:27:32,060 we should try to see if it works, 618 00:27:32,060 --> 00:27:33,530 I'd be interested to hear that. 619 00:27:33,530 --> 00:27:35,446 And by the way, at any point during this demo, 620 00:27:35,446 --> 00:27:39,740 please suggest experiments that come 621 00:27:39,740 --> 00:27:41,350 to mind about things we should try, 622 00:27:41,350 --> 00:27:44,290 mistakes you might make that you wonder whether this system is 623 00:27:44,290 --> 00:27:45,390 able to catch. 624 00:27:45,390 --> 00:27:47,340 I think that's the most fun kind of demo. 625 00:27:47,340 --> 00:27:47,840 Yeah. 626 00:27:47,840 --> 00:27:49,828 STUDENT: So things like CSRF [INAUDIBLE], 627 00:27:49,828 --> 00:27:51,319 you said that [INAUDIBLE].
628 00:27:55,300 --> 00:27:58,910 ADAM CHLIPALA: So cross site request forgery 629 00:27:58,910 --> 00:28:02,120 I wanted to explain a little later explicitly. 630 00:28:02,120 --> 00:28:04,880 I think the paper sort of explains why cross site 631 00:28:04,880 --> 00:28:06,200 scripting can't work. 632 00:28:06,200 --> 00:28:11,640 And the reason is whenever you build a piece of syntax, 633 00:28:11,640 --> 00:28:18,830 it's an object, a tree of different sub 634 00:28:18,830 --> 00:28:19,810 parts of that syntax. 635 00:28:19,810 --> 00:28:21,620 It's not just a string. 636 00:28:21,620 --> 00:28:23,830 And you're not going to accidentally turn 637 00:28:23,830 --> 00:28:26,899 a string from the user into a tree with structure. 638 00:28:26,899 --> 00:28:28,190 You would know if you did that. 639 00:28:28,190 --> 00:28:30,370 Because it's hard to write an interpreter. 640 00:28:30,370 --> 00:28:32,370 And in Ur/Web, you have to write an interpreter. 641 00:28:32,370 --> 00:28:34,490 It doesn't automatically happen for you. 642 00:28:34,490 --> 00:28:38,650 But I'll have an example shortly that might also 643 00:28:38,650 --> 00:28:40,440 address that concern. 644 00:28:40,440 --> 00:28:44,690 So I want to show you what this syntactic sugar actually 645 00:28:44,690 --> 00:28:46,200 turns into in the compiler. 646 00:28:46,200 --> 00:28:47,920 So this might look like we could just 647 00:28:47,920 --> 00:28:49,740 add some double quotes around the HTML, 648 00:28:49,740 --> 00:28:51,710 and then we're back in the normal world. 649 00:28:51,710 --> 00:28:53,760 We might wonder, why is it such a big deal 650 00:28:53,760 --> 00:28:56,640 to omit the double quotes and put XML instead? 651 00:28:56,640 --> 00:29:02,340 So we can actually take my word for it 652 00:29:02,340 --> 00:29:05,820 that this is equivalent code for what this does.
653 00:29:05,820 --> 00:29:08,910 So tag is a built in function that builds a tree 654 00:29:08,910 --> 00:29:11,260 node of an HTML document. 655 00:29:11,260 --> 00:29:14,080 And I'm passing a bunch of arguments that are expressing 656 00:29:14,080 --> 00:29:17,162 the CSS styling on that node. 657 00:29:17,162 --> 00:29:19,120 This one doesn't really have anything going on, 658 00:29:19,120 --> 00:29:22,930 so it's a variety of different ways of saying nothing. 659 00:29:22,930 --> 00:29:26,690 And it doesn't take any attributes. 660 00:29:26,690 --> 00:29:28,770 And the tag is a body tag. 661 00:29:28,770 --> 00:29:31,340 So that's another thing in the standard library. 662 00:29:31,340 --> 00:29:34,680 All of the standard tags are functions with first class 663 00:29:34,680 --> 00:29:36,890 status in the standard library. 664 00:29:36,890 --> 00:29:39,870 And then we need to put a "Hello World" text inside it. 665 00:29:39,870 --> 00:29:41,740 So we call a cdata function where 666 00:29:41,740 --> 00:29:46,027 cdata is the XML word for character data 667 00:29:46,027 --> 00:29:47,110 or just a string constant. 668 00:29:47,110 --> 00:29:51,710 And we can put exactly the text from below. 669 00:29:51,710 --> 00:29:53,460 We'll comment that out. 670 00:29:53,460 --> 00:29:56,390 This should give us the same result as before. 671 00:29:56,390 --> 00:29:58,880 Let me see if that worked. 672 00:29:58,880 --> 00:30:05,500 OK, and now I'll go back to the actual page. 673 00:30:05,500 --> 00:30:10,530 Same thing as before, so this is what that function was really 674 00:30:10,530 --> 00:30:11,517 doing at the beginning. 675 00:30:11,517 --> 00:30:12,850 It's not just building a string.
676 00:30:12,850 --> 00:30:14,308 It's calling a series of operations 677 00:30:14,308 --> 00:30:16,480 that are designed so that they only 678 00:30:16,480 --> 00:30:19,580 allow you to build valid HTML, and they never 679 00:30:19,580 --> 00:30:24,240 implicitly interpret a string as code instead of just 680 00:30:24,240 --> 00:30:25,240 content that sits there. 681 00:30:25,240 --> 00:30:26,107 Yeah? 682 00:30:26,107 --> 00:30:26,982 STUDENT: [INAUDIBLE]? 683 00:30:29,450 --> 00:30:31,939 ADAM CHLIPALA: Right, you are anticipating 684 00:30:31,939 --> 00:30:32,730 the next few steps. 685 00:30:32,730 --> 00:30:34,715 Let me do something less complicated 686 00:30:34,715 --> 00:30:37,556 first, which is also potentially worrisome. 687 00:30:37,556 --> 00:30:41,290 Let's decide that we're really happy to see the world, so we 688 00:30:41,290 --> 00:30:47,965 better put the word "hello" in bold and compile that again. 689 00:30:47,965 --> 00:30:51,410 It just shows up as interpreting that literally 690 00:30:51,410 --> 00:30:53,800 as text instead of markup. 691 00:30:53,800 --> 00:30:59,010 So this presentation of HTML syntax 692 00:30:59,010 --> 00:31:02,140 as a function that builds syntax doesn't 693 00:31:02,140 --> 00:31:04,407 have any of the usual syntactic encoding 694 00:31:04,407 --> 00:31:05,490 conventions built into it. 695 00:31:05,490 --> 00:31:07,910 It interprets things in the way you would want it to. 696 00:31:07,910 --> 00:31:11,320 And so the implementation of cdata 697 00:31:11,320 --> 00:31:13,150 does what's usually called escaping. 698 00:31:13,150 --> 00:31:14,640 But the programmer doesn't need to know there 699 00:31:14,640 --> 00:31:15,720 is any such thing as escaping. 700 00:31:15,720 --> 00:31:17,210 You can just think of it as, here's 701 00:31:17,210 --> 00:31:19,972 a set of convenient functions for building a tree object that 702 00:31:19,972 --> 00:31:20,680 describes a page. 
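The difference between building a page as a tree and building it as a string can be shown with a short Python analogy. This is a sketch and not Ur/Web's `tag` and `cdata` functions: `Text`, `Element`, and `render` are hypothetical names, and only the standard-library `html.escape` is a real API.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Union


@dataclass
class Text:
    """Character data: always content, never markup."""
    content: str


@dataclass
class Element:
    """A tree node with a tag name and child nodes."""
    name: str
    children: List["Node"]


Node = Union[Text, Element]


def render(node: Node) -> str:
    """Serialize the tree; escaping happens here, in exactly one place."""
    if isinstance(node, Text):
        return escape(node.content)
    inner = "".join(render(child) for child in node.children)
    return f"<{node.name}>{inner}</{node.name}>"


# A string that looks like markup is still only text once it sits in a Text node.
page = Element("body", [Text("<b>Hello</b> world!")])
html_out = render(page)
# html_out == "<body>&lt;b&gt;Hello&lt;/b&gt; world!</body>"
```

To get real bold text, the program would have to construct an `Element("b", ...)` node explicitly; no string supplied at run time can turn itself into structure.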
703 00:31:20,680 --> 00:31:22,774 Did I see a question over there? 704 00:31:22,774 --> 00:31:24,584 STUDENT: [INAUDIBLE]? 705 00:31:24,584 --> 00:31:27,000 ADAM CHLIPALA: You want to see the HTML that it generates. 706 00:31:27,000 --> 00:31:32,170 OK, it's going to be not the most exciting thing. 707 00:31:32,170 --> 00:31:33,874 I don't know if that's [INAUDIBLE]. 708 00:31:33,874 --> 00:31:36,290 I can make it bigger, but then it doesn't fit on one line. 709 00:31:36,290 --> 00:31:39,680 So let me know if I should make it bigger. 710 00:31:39,680 --> 00:31:44,065 It just put in the usual escapes for the less than character 711 00:31:44,065 --> 00:31:47,660 with an ampersand. 712 00:31:47,660 --> 00:31:49,484 STUDENT: So given that you're using XHTML, 713 00:31:49,484 --> 00:31:52,376 couldn't you just use the cdata [INAUDIBLE] 714 00:31:52,376 --> 00:31:55,567 instead of doing manual [INAUDIBLE]? 715 00:31:55,567 --> 00:31:56,566 ADAM CHLIPALA: Probably. 716 00:31:56,566 --> 00:31:59,040 That would require me knowing more about XML than I do. 717 00:32:04,430 --> 00:32:07,880 All right, so there was another question about JavaScript URLs, 718 00:32:07,880 --> 00:32:08,740 which is a good one. 719 00:32:08,740 --> 00:32:11,140 If we allow JavaScript URLs, then we 720 00:32:11,140 --> 00:32:15,890 have a back door for automatic interpretation of strings 721 00:32:15,890 --> 00:32:17,790 as programs at runtime. 722 00:32:17,790 --> 00:32:19,510 And that causes all sorts of issues. 723 00:32:19,510 --> 00:32:21,600 So let's try to avoid that. 724 00:32:21,600 --> 00:32:26,130 I'll switch back, first of all, to the shorter version of this. 725 00:32:26,130 --> 00:32:32,510 And then inside the body, I'll make this multiple lines. 726 00:32:32,510 --> 00:32:37,000 And let's put a link that tries to do something appropriate. 727 00:32:46,820 --> 00:32:48,795 We'll leave some room for error messages here. 728 00:32:48,795 --> 00:32:49,878 This is working correctly. 
729 00:32:54,700 --> 00:32:57,350 Invalid URL, JavaScript something, passed bless. 730 00:32:57,350 --> 00:32:59,305 So bless is a built in function that 731 00:32:59,305 --> 00:33:02,420 is the gatekeeper of which URLs are allowed. 732 00:33:02,420 --> 00:33:05,340 And by default, no URLs are allowed. 733 00:33:05,340 --> 00:33:08,320 So certainly this one is not allowed. 734 00:33:08,320 --> 00:33:12,430 And in general, it is a bad idea to write your URL policy 735 00:33:12,430 --> 00:33:16,777 so that you can create values that represent JavaScript URLs. 736 00:33:16,777 --> 00:33:19,110 Because then all sorts of guarantees that you might like 737 00:33:19,110 --> 00:33:20,400 are invalid. 738 00:33:20,400 --> 00:33:22,630 To make it a little clearer how it 739 00:33:22,630 --> 00:33:28,350 works, let me factor this code into a separate function 740 00:33:28,350 --> 00:33:32,335 called a linker that takes in a URL. 741 00:33:32,335 --> 00:33:33,331 So URL is a type. 742 00:33:33,331 --> 00:33:34,205 It's not just string. 743 00:33:34,205 --> 00:33:37,830 It's a type that stands for a URL that is explicitly 744 00:33:37,830 --> 00:33:41,260 authorized by your application's policy. 745 00:33:41,260 --> 00:33:47,140 And so we can [INAUDIBLE] XML. 746 00:33:47,140 --> 00:33:52,160 And instead of a constant, I'll just put u here. 747 00:33:52,160 --> 00:33:54,430 And so I'm using the curly braces 748 00:33:54,430 --> 00:33:58,470 like in some popular HTML template frameworks 749 00:33:58,470 --> 00:34:01,260 to indicate inserting some code from the host 750 00:34:01,260 --> 00:34:03,722 language inside the HTML that we're building. 751 00:34:03,722 --> 00:34:05,180 And this is all done in a way where 752 00:34:05,180 --> 00:34:06,500 it's type checked statically. 753 00:34:06,500 --> 00:34:08,020 So the system will check, yeah, this 754 00:34:08,020 --> 00:34:09,739 is a spot where a URL belongs. 755 00:34:09,739 --> 00:34:11,090 And this says it is a URL. 
756 00:34:11,090 --> 00:34:13,020 So that's fine. 757 00:34:13,020 --> 00:34:18,210 And then I can explicitly expose the call to bless by saying, 758 00:34:18,210 --> 00:34:21,107 let's just call the linker function here 759 00:34:21,107 --> 00:34:23,280 on the result of blessing that URL. 760 00:34:28,679 --> 00:34:31,300 We should get basically the same error message as before. 761 00:34:31,300 --> 00:34:35,380 There's some program analysis going on here to figure out-- 762 00:34:35,380 --> 00:34:36,820 I guess it doesn't need that. 763 00:34:36,820 --> 00:34:39,350 Because this string is passed directly to bless. 764 00:34:39,350 --> 00:34:42,368 And we can see-- I could wait to run this for you at runtime 765 00:34:42,368 --> 00:34:43,409 and discover the failure. 766 00:34:43,409 --> 00:34:45,342 But I can tell it's definitely going to fail. 767 00:34:45,342 --> 00:34:46,925 So I'll just make it a compiler error. 768 00:34:46,925 --> 00:34:50,800 This URL is not going to be accepted by the URL policy. 769 00:34:50,800 --> 00:34:54,728 STUDENT: So if you didn't have the [INAUDIBLE]? 770 00:34:54,728 --> 00:34:57,039 ADAM CHLIPALA: If I left out this call to bless, 771 00:34:57,039 --> 00:35:00,546 it would be a much more basic compile time error. 772 00:35:00,546 --> 00:35:01,920 You have a string and need a URL. 773 00:35:01,920 --> 00:35:02,920 They're different types. 774 00:35:06,738 --> 00:35:09,370 All right, but let's make this a little more interesting. 775 00:35:09,370 --> 00:35:11,120 And I'm going to open up the configuration 776 00:35:11,120 --> 00:35:12,900 file for this demo. 777 00:35:12,900 --> 00:35:15,820 It's pretty short, as these things go, 778 00:35:15,820 --> 00:35:19,760 at least if you look at any Java web application framework. 779 00:35:19,760 --> 00:35:22,080 They have these gigantic XML files for configuration. 780 00:35:22,080 --> 00:35:27,000 This is a little nicer than that, or so I claim.
781 00:35:27,000 --> 00:35:30,990 We can add a rule that says, anything on Wikipedia 782 00:35:30,990 --> 00:35:31,950 is allowed. 783 00:35:31,950 --> 00:35:34,765 And then we can put the Wikipedia URL in here. 784 00:35:41,517 --> 00:35:46,008 Now we're in good shape. 785 00:35:46,008 --> 00:35:47,006 What's missing? 786 00:35:53,992 --> 00:35:56,890 Oh, I guess I don't remember the URL scheme for that. 787 00:35:56,890 --> 00:35:58,414 But we got to the website. 788 00:35:58,414 --> 00:35:59,205 That's good enough. 789 00:36:01,983 --> 00:36:03,840 All right, so the big idea here is 790 00:36:03,840 --> 00:36:06,735 to have an abstract type of URL, just 791 00:36:06,735 --> 00:36:08,860 like you could have an abstract type of hash tables 792 00:36:08,860 --> 00:36:11,851 that encodes invariants about how the hash table looks 793 00:36:11,851 --> 00:36:13,850 and prevents code from reaching inside the array 794 00:36:13,850 --> 00:36:15,030 of the hash table. 795 00:36:15,030 --> 00:36:17,210 We can do the same thing for URLs. 796 00:36:17,210 --> 00:36:20,250 And the system enforces via this bless function 797 00:36:20,250 --> 00:36:22,710 that every value of this type has 798 00:36:22,710 --> 00:36:24,835 passed the appropriate check at some point. 799 00:36:24,835 --> 00:36:26,400 And for instance, with this policy, 800 00:36:26,400 --> 00:36:28,980 we know there will never be a JavaScript URL. 801 00:36:28,980 --> 00:36:33,780 And it's safe to take a URL value and use it as a link. 802 00:36:33,780 --> 00:36:36,492 It won't break the basic abstractions of the language. 803 00:36:36,492 --> 00:36:37,505 Yeah. 804 00:36:37,505 --> 00:36:38,380 STUDENT: [INAUDIBLE]? 805 00:36:49,100 --> 00:36:52,060 ADAM CHLIPALA: OK, so we have to try something like that. 806 00:36:52,060 --> 00:36:54,310 And this should go through. 807 00:36:54,310 --> 00:36:58,982 And then the browser knows it's a quote. 808 00:36:58,982 --> 00:37:01,000 And we can look at the source. 
809 00:37:01,000 --> 00:37:05,010 That is because it was escaped in the right way. 810 00:37:05,010 --> 00:37:07,447 STUDENT: But can you still use-- so JavaScript allows you 811 00:37:07,447 --> 00:37:11,488 to say, [INAUDIBLE], and then specify inline JavaScript 812 00:37:11,488 --> 00:37:11,988 there. 813 00:37:11,988 --> 00:37:14,870 Is that something that [INAUDIBLE]? 814 00:37:14,870 --> 00:37:16,580 ADAM CHLIPALA: Yes and no. 815 00:37:16,580 --> 00:37:18,780 So we can put body onload. 816 00:37:18,780 --> 00:37:21,781 And instead of JavaScript, you put some Ur/Web code 817 00:37:21,781 --> 00:37:22,743 that does something. 818 00:37:29,490 --> 00:37:33,695 So it would be a disaster to interpret JavaScript code 819 00:37:33,695 --> 00:37:35,419 in string form as a program there. 820 00:37:35,419 --> 00:37:37,210 But we can put code of the same programming 821 00:37:37,210 --> 00:37:39,600 language you're working with already escaped in 822 00:37:39,600 --> 00:37:41,086 with these curly braces. 823 00:37:41,086 --> 00:37:42,710 And then it automatically gets compiled 824 00:37:42,710 --> 00:37:44,842 to JavaScript to run on the client. 825 00:37:52,778 --> 00:37:54,762 All right, any more questions? 826 00:37:54,762 --> 00:37:55,375 Yeah. 827 00:37:55,375 --> 00:37:56,250 STUDENT: [INAUDIBLE]? 828 00:38:04,499 --> 00:38:09,490 ADAM CHLIPALA: I think it's everything? 829 00:38:09,490 --> 00:38:12,560 Is it embarrassing that I said everything? 830 00:38:12,560 --> 00:38:15,919 Is there something that shouldn't be allowed? 831 00:38:15,919 --> 00:38:16,794 STUDENT: [INAUDIBLE]. 832 00:38:24,966 --> 00:38:27,340 ADAM CHLIPALA: I see, so symbols that would independently 833 00:38:27,340 --> 00:38:29,960 have funny things happening with software execution 834 00:38:29,960 --> 00:38:32,414 would confuse the human user? 835 00:38:32,414 --> 00:38:33,289 STUDENT: [INAUDIBLE]. 836 00:38:36,600 --> 00:38:39,070 ADAM CHLIPALA: OK, I remember reading some of that stuff. 
837 00:38:39,070 --> 00:38:42,090 And maybe it said the new browser versions 838 00:38:42,090 --> 00:38:42,990 avoid those problems. 839 00:38:42,990 --> 00:38:45,820 But some old ones will get confused. 840 00:38:45,820 --> 00:38:49,540 It's possible this will create problems in the old ones that 841 00:38:49,540 --> 00:38:50,690 are too permissive. 842 00:38:50,690 --> 00:38:51,660 I'm not sure. 843 00:38:55,540 --> 00:38:57,750 But at least all these things are 844 00:38:57,750 --> 00:39:00,435 going to be interpreted as UTF-8 if they go into the document. 845 00:39:00,435 --> 00:39:02,643 So if there's some problem with a different encoding, 846 00:39:02,643 --> 00:39:06,444 it shouldn't be applicable here. 847 00:39:06,444 --> 00:39:06,944 Yeah. 848 00:39:06,944 --> 00:39:09,809 STUDENT: The string of the [INAUDIBLE], right now 849 00:39:09,809 --> 00:39:11,809 it's checking at compile time that that string is 850 00:39:11,809 --> 00:39:13,440 an allowed URL. 851 00:39:13,440 --> 00:39:16,006 But if you compute a string at runtime, 852 00:39:16,006 --> 00:39:19,104 does bless perform a check at runtime 853 00:39:19,104 --> 00:39:21,096 whether or not the string is allowed, or are 854 00:39:21,096 --> 00:39:22,590 you not allowed to-- 855 00:39:22,590 --> 00:39:25,578 ADAM CHLIPALA: So let's write a form to test that claim. 856 00:39:31,056 --> 00:39:32,550 So we can put a form in here. 857 00:39:38,028 --> 00:39:42,510 And form wants us to enter URL in a text box called URL. 858 00:39:45,498 --> 00:39:48,520 Then we can have a Submit button. 859 00:39:48,520 --> 00:39:52,090 When you click on it, it should call the linker function 860 00:39:52,090 --> 00:39:55,480 with a record of one value for every field in the form. 861 00:39:55,480 --> 00:39:57,990 In this case, there's just one field called URL. 862 00:39:57,990 --> 00:40:00,710 And so linker will get passed a record that contains 863 00:40:00,710 --> 00:40:02,430 the URL as a string type.
864 00:40:02,430 --> 00:40:05,152 And then we'll explicitly try to bless it up there 865 00:40:05,152 --> 00:40:05,985 and see if it works. 866 00:40:16,150 --> 00:40:17,990 This is an example of an exciting type error 867 00:40:17,990 --> 00:40:20,730 message, which is admittedly sub-optimal in some ways. 868 00:40:27,247 --> 00:40:29,830 Here's one of those things that won't make any sense if you're 869 00:40:29,830 --> 00:40:30,913 not familiar with Haskell. 870 00:40:30,913 --> 00:40:32,670 I forgot a return. 871 00:40:32,670 --> 00:40:35,165 But at least now it looks more like a Java program. 872 00:40:39,342 --> 00:40:43,770 Have a string-- let me scroll to the end, do one of these, 873 00:40:43,770 --> 00:40:46,370 sort of copying the full type of all the attributes 874 00:40:46,370 --> 00:40:47,775 that this tag can take. 875 00:40:47,775 --> 00:40:50,790 And I also forgot to say, this is now a full page. 876 00:40:50,790 --> 00:40:54,880 So we can't use an a tag until we're inside a body tag. 877 00:40:54,880 --> 00:40:56,631 And this is the abstruse type error 878 00:40:56,631 --> 00:40:57,714 message for that property. 879 00:41:01,800 --> 00:41:05,330 OK, so now let's see what happens. 880 00:41:05,330 --> 00:41:13,250 URL is-- yay. 881 00:41:23,830 --> 00:41:27,253 There we go. 882 00:41:27,253 --> 00:41:30,805 So that was a somewhat long and not necessarily super 883 00:41:30,805 --> 00:41:32,180 exciting answer to your question. 884 00:41:32,180 --> 00:41:32,680 Yeah. 885 00:41:32,680 --> 00:41:36,750 STUDENT: The URL [INAUDIBLE], are those just for [INAUDIBLE], 886 00:41:36,750 --> 00:41:38,250 or is it more restrictive than that? 887 00:41:38,250 --> 00:41:39,791 ADAM CHLIPALA: It's more restrictive. 888 00:41:39,791 --> 00:41:42,155 It's currently just constants and prefixes. 889 00:41:42,155 --> 00:41:44,575 But you can also have disallow rules. 890 00:41:44,575 --> 00:41:46,450 And they run in the order that you write. 
891 00:41:46,450 --> 00:41:52,351 STUDENT: Oh, so if you stick to disallow JavaScript [INAUDIBLE] 892 00:41:52,351 --> 00:41:54,601 that if you put a line break in the middle of the word 893 00:41:54,601 --> 00:41:57,812 "JavaScript," it will still interpret it as-- 894 00:41:57,812 --> 00:42:00,150 ADAM CHLIPALA: That would be too bad. 895 00:42:00,150 --> 00:42:01,745 That's why it's good to stick to the whitelist approach instead 896 00:42:01,745 --> 00:42:02,991 of the blacklist approach. 897 00:42:02,991 --> 00:42:05,020 So you probably want all the rules 898 00:42:05,020 --> 00:42:10,020 to start with a particular protocol, like HTTP, 899 00:42:10,020 --> 00:42:13,584 and only allow things that fall in your approved set 900 00:42:13,584 --> 00:42:15,192 of protocols. 901 00:42:15,192 --> 00:42:18,066 That's what I recommend. 902 00:42:18,066 --> 00:42:18,566 Yeah. 903 00:42:18,566 --> 00:42:20,012 STUDENT: For many sites, you might 904 00:42:20,012 --> 00:42:21,928 let users share links, in which case, you need 905 00:42:21,928 --> 00:42:23,734 to allow links to anywhere. 906 00:42:23,734 --> 00:42:25,192 ADAM CHLIPALA: You can allow links. 907 00:42:25,192 --> 00:42:27,442 Well, do you want your users to share JavaScript links 908 00:42:27,442 --> 00:42:29,890 or, I don't know, Flash links, or whatever's allowed? 909 00:42:29,890 --> 00:42:34,990 You see, you can whitelist all the HTTP and HTTPS URLs 910 00:42:34,990 --> 00:42:37,705 and be in good shape for most websites. 911 00:42:37,705 --> 00:42:39,200 That would do that. 912 00:42:39,200 --> 00:42:42,630 And the guarantees are a little weaker 913 00:42:42,630 --> 00:42:44,460 compared to allowing only particular URLs. 914 00:42:44,460 --> 00:42:47,408 But you can at least ensure that there's no automatic execution 915 00:42:47,408 --> 00:42:48,842 of the string as a program.
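The allow and disallow rules discussed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Ur/Web's implementation: the rule list, the trailing `*` prefix convention, the first-match-wins ordering, and the names `url_allowed` and `bless` as written here are all assumptions made for the sketch. It shows why a default-deny list of approved protocols is more robust than trying to disallow `javascript:` in all its spellings.

```python
from typing import List, Tuple

# Hypothetical rule list, checked in the order written. A trailing "*" marks
# a prefix rule; any other pattern must equal the whole URL (a constant).
RULES: List[Tuple[str, str]] = [
    ("allow", "http://*"),
    ("allow", "https://*"),
]


def url_allowed(url: str, rules: List[Tuple[str, str]] = RULES) -> bool:
    """Return the verdict of the first rule that matches (an assumption);
    reject the URL when no rule matches at all."""
    for action, pattern in rules:
        if pattern.endswith("*"):
            matched = url.startswith(pattern[:-1])
        else:
            matched = url == pattern
        if matched:
            return action == "allow"
    # Default-deny: unknown schemes never get through, however they are spelled.
    return False


def bless(url: str) -> str:
    """Hypothetical runtime check for a URL computed while the program runs."""
    if not url_allowed(url):
        raise ValueError("disallowed URL: " + repr(url))
    return url
```

With these rules, a string such as `java\nscript:alert(1)` is rejected simply because it does not start with an approved protocol, so nobody has to anticipate every obfuscated spelling.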
916 00:42:54,590 --> 00:42:59,664 So let me pull up one of the examples from the paper, which 917 00:42:59,664 --> 00:43:10,600 is this one, an example of a simple system 918 00:43:10,600 --> 00:43:13,600 with a set of chat rooms represented in the database. 919 00:43:13,600 --> 00:43:16,110 And the user can click on a link to go to a room 920 00:43:16,110 --> 00:43:17,360 and then send a message. 921 00:43:17,360 --> 00:43:20,085 This was the first of several variants on that scheme. 922 00:43:22,640 --> 00:43:25,040 First, I'll point out I'm going to recompile this. 923 00:43:25,040 --> 00:43:27,770 And then magically, all the database tables 924 00:43:27,770 --> 00:43:30,410 that it declares are going to be added to the database. 925 00:43:30,410 --> 00:43:33,014 And we can now just start using the application. 926 00:43:33,014 --> 00:43:35,010 But first, we have to add some rooms. 927 00:43:35,010 --> 00:43:39,160 So let's open our [INAUDIBLE] interface to the demo database 928 00:43:39,160 --> 00:43:48,216 and insert into the room table some values like one and two. 929 00:43:48,216 --> 00:43:51,100 Hopefully these are here now. 930 00:43:51,100 --> 00:43:53,210 OK, and we go in there, and we can 931 00:43:53,210 --> 00:43:57,930 entertain ourselves all day long sending strings of text. 932 00:43:57,930 --> 00:44:05,983 Maybe a little more interesting, you can try to send HTML, 933 00:44:05,983 --> 00:44:09,634 and it just gets handled right away. 934 00:44:09,634 --> 00:44:12,040 That's the basic functionality there. 935 00:44:12,040 --> 00:44:15,870 And just to quickly go over some of how this works again, 936 00:44:15,870 --> 00:44:17,625 so we have these two SQL tables that 937 00:44:17,625 --> 00:44:19,870 are just declared in this first-class 938 00:44:19,870 --> 00:44:21,370 way inside the programming language. 939 00:44:21,370 --> 00:44:23,125 And we give the schema of each table.
940 00:44:23,125 --> 00:44:25,720 And then later, when we try to access those tables, 941 00:44:25,720 --> 00:44:27,990 the compiler will check that we're accessing them 942 00:44:27,990 --> 00:44:30,323 in a way that's consistent with the schema from a typing 943 00:44:30,323 --> 00:44:31,550 perspective. 944 00:44:31,550 --> 00:44:34,420 So we have a table of rooms where each room is 945 00:44:34,420 --> 00:44:37,520 a record of an ID, which is the integer, 946 00:44:37,520 --> 00:44:39,720 and a title, which is the string. 947 00:44:39,720 --> 00:44:42,540 This is the type we were just generating records in. 948 00:44:42,540 --> 00:44:47,000 And I created a few rooms at the SQL console. 949 00:44:47,000 --> 00:44:52,730 And we also have messages that each message belongs to a room. 950 00:44:52,730 --> 00:44:54,990 And it has a time when it was posted. 951 00:44:54,990 --> 00:44:59,210 And it has some text, which is the content of the message. 952 00:44:59,210 --> 00:45:03,520 And let me fast forward to the main function. 953 00:45:03,520 --> 00:45:06,092 We run an SQL query. 954 00:45:06,092 --> 00:45:08,610 So here's an example of SQL syntax embedded inside 955 00:45:08,610 --> 00:45:09,130 of Ur/Web. 956 00:45:09,130 --> 00:45:11,410 I don't want to go through the expansion of this one 957 00:45:11,410 --> 00:45:13,790 into calling functions from the standard library. 958 00:45:13,790 --> 00:45:17,370 Because it's pretty verbose if I do that. 959 00:45:17,370 --> 00:45:19,710 But take my word for it, this is de-sugared into calls 960 00:45:19,710 --> 00:45:21,790 of functions in the standard library that 961 00:45:21,790 --> 00:45:25,720 represent the valid ways of constructing an SQL query. 962 00:45:25,720 --> 00:45:27,630 And those functions have types that 963 00:45:27,630 --> 00:45:29,580 cause them to type check the query for you, 964 00:45:29,580 --> 00:45:33,320 not just guarantee that the syntax is reasonable. 
965 00:45:33,320 --> 00:45:36,630 So this gets de-sugared into an indication of an SQL query. 966 00:45:36,630 --> 00:45:38,210 And then the code here is basically 967 00:45:38,210 --> 00:45:41,910 just looping over all the rows that come out of that query 968 00:45:41,910 --> 00:45:44,580 and generating a piece of HTML for each one. 969 00:45:44,580 --> 00:45:46,950 In particular, we're going to take the title 970 00:45:46,950 --> 00:45:52,472 field of a query result and convert that 971 00:45:52,472 --> 00:45:57,000 into HTML with this notation that involves curly braces. 972 00:45:57,000 --> 00:45:59,310 And the square brackets are additionally saying, 973 00:45:59,310 --> 00:46:01,900 this isn't literally a piece of HTML yet. 974 00:46:01,900 --> 00:46:04,455 But please convert it for me in the standard way. 975 00:46:04,455 --> 00:46:06,445 So we can do that with strings and integers 976 00:46:06,445 --> 00:46:07,862 and all sorts of other types. 977 00:46:07,862 --> 00:46:08,677 Yeah. 978 00:46:08,677 --> 00:46:11,479 STUDENT: So if that contained malicious HTML or something, 979 00:46:11,479 --> 00:46:12,630 would that be filtered out? 980 00:46:15,220 --> 00:46:16,410 ADAM CHLIPALA: It would be. 981 00:46:16,410 --> 00:46:19,030 So in the usual way of talking about these things, 982 00:46:19,030 --> 00:46:21,186 escaping happens in the way you'd want it to. 983 00:46:21,186 --> 00:46:24,640 In Ur/Web, you can just think of this as building a tree. 984 00:46:24,640 --> 00:46:27,080 This is a node that stands for some text. 985 00:46:27,080 --> 00:46:28,550 Obviously text can't do anything. 986 00:46:28,550 --> 00:46:30,951 STUDENT: So if that title was User Control, 987 00:46:30,951 --> 00:46:34,362 and someone made a chat room with the title Alert something, 988 00:46:34,362 --> 00:46:35,570 that would not be JavaScript? 
989 00:46:35,570 --> 00:46:36,480 ADAM CHLIPALA: It wouldn't automatically 990 00:46:36,480 --> 00:46:39,336 be interpreted as JavaScript or HTML or anything else. 991 00:46:39,336 --> 00:46:41,220 It would just be text only. 992 00:46:41,220 --> 00:46:42,595 All right, so we have this title. 993 00:46:42,595 --> 00:46:45,200 And let's wrap an a tag around it. 994 00:46:45,200 --> 00:46:47,900 And instead of href, the usual way to do a link in HTML, 995 00:46:47,900 --> 00:46:50,567 we use the link attribute, which is 996 00:46:50,567 --> 00:46:54,200 sort of a pseudo attribute in Ur/Web, which 997 00:46:54,200 --> 00:46:57,570 takes as an argument not a URL, but basically 998 00:46:57,570 --> 00:46:58,570 an Ur/Web expression. 999 00:46:58,570 --> 00:47:00,290 And the meaning is when you click on this link, 1000 00:47:00,290 --> 00:47:02,623 please run this expression to generate the new page that 1001 00:47:02,623 --> 00:47:04,125 should be displayed. 1002 00:47:04,125 --> 00:47:06,000 In this case, we're calling a function called 1003 00:47:06,000 --> 00:47:10,280 chat, which is defined up here. 1004 00:47:10,280 --> 00:47:14,990 And here's what it is. 1005 00:47:14,990 --> 00:47:16,804 I won't go too much into the details. 1006 00:47:16,804 --> 00:47:18,220 But we have a few more SQL queries 1007 00:47:18,220 --> 00:47:21,280 using a variety of standard library functions 1008 00:47:21,280 --> 00:47:24,000 for different ways of using queried results. 1009 00:47:24,000 --> 00:47:26,421 We generate this HTML page. 1010 00:47:26,421 --> 00:47:27,920 And we say, you're in the chat room. 1011 00:47:27,920 --> 00:47:28,900 Here's the title. 1012 00:47:28,900 --> 00:47:30,750 We get the same kind of escaping there. 1013 00:47:30,750 --> 00:47:33,320 And there's a form where the user can enter some text. 1014 00:47:33,320 --> 00:47:36,325 That's the form that I used to demonstrate this 1015 00:47:36,325 --> 00:47:38,500 a few moments ago. 
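The earlier point that a room title is a text node in a tree, rather than a fragment of markup, can be sketched in Python. This is a minimal illustration and not how Ur/Web represents pages internally: the `Text` and `Element` classes and the `render` function are hypothetical names introduced here, and the standard-library `html.escape` stands in for the conversion the compiler performs.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Dict, List, Union


@dataclass
class Text:
    """A leaf holding plain text; it is never parsed as markup."""
    value: str


@dataclass
class Element:
    """A tag with attributes and child nodes."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


Node = Union[Text, Element]


def render(node: Node) -> str:
    """Serialize the tree; escaping happens only here, at the text leaves
    and attribute values, so callers never escape anything by hand."""
    if isinstance(node, Text):
        return escape(node.value)
    attrs = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in node.attrs.items())
    inner = "".join(render(child) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"
```

A title supplied by a user, for example `Text("<script>alert(1)</script>")`, can only ever become a leaf of the tree, so it is displayed as text whatever characters it contains.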
1016 00:47:38,500 --> 00:47:40,530 And the Submit button of the form 1017 00:47:40,530 --> 00:47:44,250 has this action attribute that contains say, 1018 00:47:44,250 --> 00:47:46,018 which is the name of a function in Ur/Web. 1019 00:47:46,018 --> 00:47:46,902 And here it is. 1020 00:47:46,902 --> 00:47:49,060 So when we submit the form, we call this function. 1021 00:47:49,060 --> 00:47:51,050 We run some more SQL. 1022 00:47:51,050 --> 00:47:53,470 Insert a new row into a table. 1023 00:47:53,470 --> 00:47:58,870 We automatically jump in the ID of the chat room 1024 00:47:58,870 --> 00:48:00,940 and the text field that came from the form. 1025 00:48:00,940 --> 00:48:03,054 And these are automatically escaped as necessary. 1026 00:48:03,054 --> 00:48:05,095 But again, you don't have to think about escaping 1027 00:48:05,095 --> 00:48:06,870 in that way in Ur/Web. 1028 00:48:06,870 --> 00:48:09,642 Because this is just syntax for building a tree. 1029 00:48:09,642 --> 00:48:12,580 It doesn't stand for a string. 1030 00:48:12,580 --> 00:48:16,070 So there's no way to have strange things happen 1031 00:48:16,070 --> 00:48:18,270 with parsing that you don't expect from the way 1032 00:48:18,270 --> 00:48:21,100 that the syntax is written. 1033 00:48:21,100 --> 00:48:21,665 Yeah. 1034 00:48:21,665 --> 00:48:22,540 STUDENT: [INAUDIBLE]? 1035 00:48:29,740 --> 00:48:34,660 ADAM CHLIPALA: Yes, so from the fact that there's one widget, 1036 00:48:34,660 --> 00:48:37,700 one GUI widget in this form, and its name is text, 1037 00:48:37,700 --> 00:48:40,101 and that one is a text box, the compiler 1038 00:48:40,101 --> 00:48:42,350 infers that the record that stands for the form result 1039 00:48:42,350 --> 00:48:47,870 should have one element called text that is of type string. 1040 00:48:47,870 --> 00:48:51,187 And this encoding of forms, the typing rules for it 1041 00:48:51,187 --> 00:48:52,520 are not built into the language.
1042 00:48:52,520 --> 00:48:55,130 You can actually with the type system in Ur 1043 00:48:55,130 --> 00:48:58,690 express as a library, what are the operations for building 1044 00:48:58,690 --> 00:49:01,400 forms, and how do you check that they're used correctly, 1045 00:49:01,400 --> 00:49:02,990 including what consequences they have 1046 00:49:02,990 --> 00:49:04,510 of the types of the functions that 1047 00:49:04,510 --> 00:49:05,974 actually handle those forms? 1048 00:49:13,770 --> 00:49:14,270 [INAUDIBLE] 1049 00:49:19,660 --> 00:49:21,330 Any other questions about this code 1050 00:49:21,330 --> 00:49:25,755 before I switch to the next step of the sequence 1051 00:49:25,755 --> 00:49:28,129 in versions from the paper, which is only a small change? 1052 00:49:33,552 --> 00:49:36,510 All right, then here's what I'm going to do. 1053 00:49:36,510 --> 00:49:40,860 It's basically taking advantage of a way 1054 00:49:40,860 --> 00:49:44,156 to get enforced encapsulation of different parts 1055 00:49:44,156 --> 00:49:45,780 of an application that Ur/Web supports, 1056 00:49:45,780 --> 00:49:49,590 which is at least only rarely supported elsewhere. 1057 00:49:49,590 --> 00:49:51,640 I'm going to take this room. 1058 00:49:51,640 --> 00:49:53,560 I'm going to take some of these definitions 1059 00:49:53,560 --> 00:49:55,760 here and put them inside a module that encapsulates 1060 00:49:55,760 --> 00:49:56,760 some of them as private. 1061 00:49:56,760 --> 00:50:01,080 In particular, the database tables are going to be private. 1062 00:50:01,080 --> 00:50:02,917 So no one can access them directly. 1063 00:50:02,917 --> 00:50:05,000 They can only access them through a set of methods 1064 00:50:05,000 --> 00:50:05,800 that we provide. 1065 00:50:05,800 --> 00:50:08,388 So one method runs inside a transaction. 1066 00:50:08,388 --> 00:50:09,513 That's what this type says. 
1067 00:50:09,513 --> 00:50:12,110 And it produces a list of records with ID 1068 00:50:12,110 --> 00:50:16,180 and title fields that stand for which rooms are available. 1069 00:50:16,180 --> 00:50:19,870 And we'll also just expose this chat operation. 1070 00:50:19,870 --> 00:50:21,370 And one thing I've done here is I've 1071 00:50:21,370 --> 00:50:24,400 introduced a name for the concept of an ID. 1072 00:50:24,400 --> 00:50:26,150 I won't just say that an ID is an integer. 1073 00:50:26,150 --> 00:50:28,140 I'll say it's a new type. 1074 00:50:28,140 --> 00:50:30,930 And the only way the outside world will ever get one 1075 00:50:30,930 --> 00:50:32,520 is to list all the rooms. 1076 00:50:32,520 --> 00:50:34,990 And the only way the outside world can ever use one 1077 00:50:34,990 --> 00:50:37,130 is to call the chat function on it. 1078 00:50:37,130 --> 00:50:39,010 So just like, say, the abstract type 1079 00:50:39,010 --> 00:50:41,870 of a hash table inside a hash table class where 1080 00:50:41,870 --> 00:50:45,740 the details of what is an ID and how do they get 1081 00:50:45,740 --> 00:50:48,750 produced internally are private to this module. 1082 00:50:48,750 --> 00:50:51,130 And the client code that calls this module 1083 00:50:51,130 --> 00:50:52,700 isn't going to need to use them. 1084 00:50:52,700 --> 00:50:56,640 So I'll use this syntax to put everything 1085 00:50:56,640 --> 00:50:58,920 down here inside the module so it's not 1086 00:50:58,920 --> 00:51:04,199 exposed to the rest of the code by default. 1087 00:51:04,199 --> 00:51:08,360 And we also are going to want to implement this rooms method. 1088 00:51:08,360 --> 00:51:10,180 We already happen to have chat around. 1089 00:51:10,180 --> 00:51:13,105 But we can implement rooms in a simple way 1090 00:51:13,105 --> 00:51:15,480 as using another standard library 1091 00:51:15,480 --> 00:51:17,800 function for interpreting a query in a useful way. 
1092 00:51:17,800 --> 00:51:19,800 Let's just select everything from the room table 1093 00:51:19,800 --> 00:51:21,120 ordering by title. 1094 00:51:21,120 --> 00:51:23,280 And as usual, this query is type checked for us. 1095 00:51:23,280 --> 00:51:25,204 And the system determines, OK, this expression 1096 00:51:25,204 --> 00:51:26,787 is going to generate a list of records 1097 00:51:26,787 --> 00:51:28,920 that happens to match the type that we declared 1098 00:51:28,920 --> 00:51:30,870 in the signature of this module. 1099 00:51:30,870 --> 00:51:33,320 So now outside this module, no other code 1100 00:51:33,320 --> 00:51:36,242 is allowed to mention the room table or the message table. 1101 00:51:36,242 --> 00:51:38,126 So we can, at least from the perspective 1102 00:51:38,126 --> 00:51:40,010 of this application, enforce whatever 1103 00:51:40,010 --> 00:51:41,500 invariants we want on them. 1104 00:51:41,500 --> 00:51:43,685 We can even hide secrets inside of them 1105 00:51:43,685 --> 00:51:47,120 that would be a security problem if some other part of the code 1106 00:51:47,120 --> 00:51:48,918 was able to get a hold of them. 1107 00:51:48,918 --> 00:51:49,856 Yeah. 1108 00:51:49,856 --> 00:51:51,897 STUDENT: But couldn't some other part of the code 1109 00:51:51,897 --> 00:51:53,322 just declare table room as well? 1110 00:51:53,322 --> 00:51:55,280 ADAM CHLIPALA: That would be a different table. 1111 00:51:55,280 --> 00:51:57,782 We could do that, actually. 1112 00:51:57,782 --> 00:51:59,620 It's got to be in here. 1113 00:51:59,620 --> 00:52:04,442 I think this should have no effect on the behavior. 1114 00:52:04,442 --> 00:52:06,434 I think in this case we're going to get 1115 00:52:06,434 --> 00:52:07,928 something funny happening. 1116 00:52:07,928 --> 00:52:10,674 Let's put this in a different module 1117 00:52:10,674 --> 00:52:15,950 just to avoid something goofy. 1118 00:52:15,950 --> 00:52:17,170 Great, so we can do that.
1119 00:52:17,170 --> 00:52:19,252 And we can do whatever we want with this table. 1120 00:52:19,252 --> 00:52:22,355 And I'll compile this in maybe about 30 seconds 1121 00:52:22,355 --> 00:52:23,670 and we'll see what happens. 1122 00:52:23,670 --> 00:52:25,400 But it's actually a different table, 1123 00:52:25,400 --> 00:52:27,610 just like if you have the same private field 1124 00:52:27,610 --> 00:52:31,282 name across two classes in Java they're different field names. 1125 00:52:31,282 --> 00:52:31,895 Yeah. 1126 00:52:31,895 --> 00:52:32,770 STUDENT: [INAUDIBLE]? 1127 00:52:42,690 --> 00:52:47,570 ADAM CHLIPALA: So you're suggesting we have, 1128 00:52:47,570 --> 00:52:49,070 inside this module, an abstract type 1129 00:52:49,070 --> 00:52:52,604 called room, which contains both the ID and the title. 1130 00:52:52,604 --> 00:52:54,693 Is that right? 1131 00:52:54,693 --> 00:52:55,568 STUDENT: [INAUDIBLE]? 1132 00:53:05,460 --> 00:53:07,920 ADAM CHLIPALA: So I think what would work to do instead 1133 00:53:07,920 --> 00:53:10,790 is instead of type ID have type room, 1134 00:53:10,790 --> 00:53:12,290 have rooms return a list of rooms, 1135 00:53:12,290 --> 00:53:13,665 and chat take a room as an input. 1136 00:53:13,665 --> 00:53:15,010 Is that what you have in mind? 1137 00:53:15,010 --> 00:53:17,970 So what would happen then is when we call the chat function, 1138 00:53:17,970 --> 00:53:20,429 it'll actually be called via a URL given 1139 00:53:20,429 --> 00:53:21,720 the way we use this eventually. 1140 00:53:21,720 --> 00:53:25,040 That would be passing the ID and the title 1141 00:53:25,040 --> 00:53:28,925 within the URL in the URL representation for a function 1142 00:53:28,925 --> 00:53:30,700 call. 1143 00:53:30,700 --> 00:53:33,042 And we only need the ID to implement that function.
1144 00:53:33,042 --> 00:53:34,750 So it would be a little wasteful of space 1145 00:53:34,750 --> 00:53:36,526 and might look gross to the user to have 1146 00:53:36,526 --> 00:53:40,170 to have the title passed along as an extra argument 1147 00:53:40,170 --> 00:53:43,136 in the invocation of chat via a URL. 1148 00:53:43,136 --> 00:53:46,690 Does that make sense? 1149 00:53:46,690 --> 00:53:48,336 Or maybe another way of saying it, 1150 00:53:48,336 --> 00:53:54,410 if I have this one [INAUDIBLE], is look up at the URL bar. 1151 00:53:54,410 --> 00:53:57,850 The ID of the channel we're going into 1152 00:53:57,850 --> 00:54:00,985 is serialized automatically in the URL at the end here. 1153 00:54:00,985 --> 00:54:02,513 And if we were passing a record that 1154 00:54:02,513 --> 00:54:04,560 contained an ID and a title, the title 1155 00:54:04,560 --> 00:54:07,312 would be serialized, too, which is at least a little 1156 00:54:07,312 --> 00:54:08,020 counterintuitive. 1157 00:54:13,300 --> 00:54:17,210 OK, the last thing we need to do-- actually, 1158 00:54:17,210 --> 00:54:21,340 it might be instructive to make just a shallow change 1159 00:54:21,340 --> 00:54:25,840 to this code, reference the room module there, and then try 1160 00:54:25,840 --> 00:54:28,177 to access the room table like before. 1161 00:54:28,177 --> 00:54:29,260 This shouldn't be allowed. 1162 00:54:29,260 --> 00:54:31,430 This would be like being able to read and write 1163 00:54:31,430 --> 00:54:34,300 the private fields of a class in Java. 1164 00:54:34,300 --> 00:54:37,230 And indeed, we get a pretty straightforward message 1165 00:54:37,230 --> 00:54:42,130 basically saying, this right here is an unbound variable. 1166 00:54:42,130 --> 00:54:44,280 There's no table called room in scope. 1167 00:54:44,280 --> 00:54:46,060 And we could mention this extra one 1168 00:54:46,060 --> 00:54:47,700 that we created just for fun.
1169 00:54:47,700 --> 00:54:50,120 But then it would be a different table. 1170 00:54:50,120 --> 00:54:53,050 It wouldn't be a problem that we could access that. 1171 00:54:53,050 --> 00:54:56,780 So instead, what we should do is I'll break this into two parts. 1172 00:54:59,330 --> 00:55:02,240 We'll start out by just calling the rooms method, 1173 00:55:02,240 --> 00:55:06,930 and then do a slightly different thing to read its elements, 1174 00:55:06,930 --> 00:55:12,884 map over the list of results that gives-- what did I call 1175 00:55:12,884 --> 00:55:13,384 [INAUDIBLE]? 1176 00:55:21,112 --> 00:55:24,766 Map over the list of results instead of the other way it 1177 00:55:24,766 --> 00:55:27,465 was working, which was roughly equivalent except for using 1178 00:55:27,465 --> 00:55:28,431 different data types. 1179 00:55:28,431 --> 00:55:30,846 Let's see how this goes. 1180 00:55:30,846 --> 00:55:36,740 All right, so I'll go back here. 1181 00:55:36,740 --> 00:55:40,660 And we can do all the tremendously exciting things 1182 00:55:40,660 --> 00:55:43,109 we could do before. 1183 00:55:43,109 --> 00:55:44,400 But we have this encapsulation. 1184 00:55:44,400 --> 00:55:46,530 And you can sort of think of this room structure 1185 00:55:46,530 --> 00:55:49,264 as now it's a library, and you can call this 1186 00:55:49,264 --> 00:55:50,680 from all sorts of different places 1187 00:55:50,680 --> 00:55:52,050 that want to have this functionality. 1188 00:55:52,050 --> 00:55:52,790 You don't have to worry 1189 00:55:52,790 --> 00:55:54,165 that there are different places that are 1190 00:55:54,165 --> 00:55:56,540 going to break the internal invariants of the system. 1191 00:55:56,540 --> 00:55:59,350 Maybe you want to know that once a message is added, 1192 00:55:59,350 --> 00:56:00,440 it will never be deleted. 1193 00:56:00,440 --> 00:56:02,160 It's always there in the logs.
1194 00:56:02,160 --> 00:56:04,060 This structure enforces that independently 1195 00:56:04,060 --> 00:56:07,034 of which other code the room module might be composed with, 1196 00:56:07,034 --> 00:56:07,575 for instance. 1197 00:56:10,341 --> 00:56:11,245 Yeah. 1198 00:56:11,245 --> 00:56:13,589 STUDENT: Say you change the definition of room, 1199 00:56:13,589 --> 00:56:15,038 [INAUDIBLE]. 1200 00:56:15,038 --> 00:56:17,679 What's going to happen to the database table? 1201 00:56:17,679 --> 00:56:19,220 ADAM CHLIPALA: It'll be a little sad. 1202 00:56:19,220 --> 00:56:23,345 We'll have to manually run an alter table command if you 1203 00:56:23,345 --> 00:56:26,680 want to save the old data. 1204 00:56:26,680 --> 00:56:28,890 But when the application starts up, 1205 00:56:28,890 --> 00:56:31,770 it queries the system database catalog and checks 1206 00:56:31,770 --> 00:56:33,840 that the schema still matches what it expects. 1207 00:56:33,840 --> 00:56:35,350 So you'll get a static error then. 1208 00:56:35,350 --> 00:56:37,940 And that will hopefully give you a hint about what you 1209 00:56:37,940 --> 00:56:39,265 should change in the database. 1210 00:56:39,265 --> 00:56:40,660 STUDENT: But it wouldn't automatically 1211 00:56:40,660 --> 00:56:42,055 drop your database or something? 1212 00:56:42,055 --> 00:56:43,138 ADAM CHLIPALA: I hope not. 1213 00:56:43,138 --> 00:56:44,810 I don't think it should do that. 1214 00:56:44,810 --> 00:56:46,850 And you can imagine tweaking the compiler 1215 00:56:46,850 --> 00:56:48,830 to understand the evolution of a database. 1216 00:56:48,830 --> 00:56:52,002 I think you have to write alter table commands to run. 1217 00:56:52,002 --> 00:56:53,486 It doesn't do that right now. 1218 00:56:57,190 --> 00:57:02,800 OK, so now let's talk about cross site request forgery 1219 00:57:02,800 --> 00:57:03,874 and preventing it. 
1220 00:57:03,874 --> 00:57:06,220 Actually, before we do that, let's 1221 00:57:06,220 --> 00:57:09,750 look at the code on this page. 1222 00:57:09,750 --> 00:57:11,665 We have a traditional looking HTML form 1223 00:57:11,665 --> 00:57:13,780 that gets generated here. 1224 00:57:13,780 --> 00:57:17,150 And there's certainly no cross site request forgery protection 1225 00:57:17,150 --> 00:57:19,930 in here, which I think is good. 1226 00:57:19,930 --> 00:57:22,660 Because as I understand cross site request forgery, 1227 00:57:22,660 --> 00:57:24,580 the problem is there's some implicit context 1228 00:57:24,580 --> 00:57:26,954 that your application sends on every request. 1229 00:57:26,954 --> 00:57:28,370 So there's some attacker out there 1230 00:57:28,370 --> 00:57:30,280 who doesn't know your implicit context. 1231 00:57:30,280 --> 00:57:32,250 Let's say your password is stored in a cookie, 1232 00:57:32,250 --> 00:57:34,020 for a really simple example. 1233 00:57:34,020 --> 00:57:36,330 And when the attacker tricks you into following a link 1234 00:57:36,330 --> 00:57:39,240 to the application, your browser sends the implicit context 1235 00:57:39,240 --> 00:57:41,320 automatically and causes the application 1236 00:57:41,320 --> 00:57:44,580 to do something the attacker could not have done directly. 1237 00:57:44,580 --> 00:57:46,890 In this case, there's no implicit context. 1238 00:57:46,890 --> 00:57:49,899 So there's no risk of a cross site request forgery. 1239 00:57:49,899 --> 00:57:51,940 Does anyone want to dispute that characterization 1240 00:57:51,940 --> 00:57:53,425 before I go on? 1241 00:57:53,425 --> 00:57:54,910 It could be educational for me. 1242 00:57:58,307 --> 00:58:00,390 All right, so now let's add some implicit context. 
1243 00:58:00,390 --> 00:58:01,730 And the system is automatically going 1244 00:58:01,730 --> 00:58:03,438 to deploy the right countermeasures based 1245 00:58:03,438 --> 00:58:05,140 on program analysis that realizes 1246 00:58:05,140 --> 00:58:08,660 now there's implicit context. 1247 00:58:08,660 --> 00:58:14,222 In particular, we just throw in a cookie here. 1248 00:58:20,174 --> 00:58:22,890 As another example of module encapsulation, 1249 00:58:22,890 --> 00:58:25,723 actually, I'll put in a whole sort of user authentication 1250 00:58:25,723 --> 00:58:31,830 system where we have the user accounts and abstract types 1251 00:58:31,830 --> 00:58:32,705 of IDs and passwords. 1252 00:58:32,705 --> 00:58:36,795 So you can't just build the value of either of these types 1253 00:58:36,795 --> 00:58:37,295 directly. 1254 00:58:37,295 --> 00:58:39,388 You'll have to go through some kind 1255 00:58:39,388 --> 00:58:45,400 of approved method of building values of these types. 1256 00:58:45,400 --> 00:58:50,040 And I'm actually going to expose the table directly 1257 00:58:50,040 --> 00:58:51,340 in the signature. 1258 00:58:51,340 --> 00:58:53,040 And I'll put a constraint on it, too, 1259 00:58:53,040 --> 00:58:55,002 saying the ID form is a key for it. 1260 00:58:55,002 --> 00:58:55,502 [INAUDIBLE] 1261 00:58:58,160 --> 00:59:00,390 But the thing is, on this user table, 1262 00:59:00,390 --> 00:59:02,680 ID and password are abstract types. 1263 00:59:02,680 --> 00:59:06,580 So the code can't actually look at the password. 1264 00:59:06,580 --> 00:59:11,080 And it can't generate all IDs in sequence 1265 00:59:11,080 --> 00:59:12,470 and try them against this table. 1266 00:59:12,470 --> 00:59:13,678 Because the type is abstract. 1267 00:59:13,678 --> 00:59:14,970 There's no way to make an ID. 1268 00:59:14,970 --> 00:59:16,180 There's no way to make a password.
1269 00:59:16,180 --> 00:59:17,310 They just come out of this table, 1270 00:59:17,310 --> 00:59:18,393 and they're opaque tokens. 1271 00:59:22,880 --> 00:59:27,640 But we might want to allow them to be input from strings. 1272 00:59:27,640 --> 00:59:29,900 You might want to allow one direction of conversion 1273 00:59:29,900 --> 00:59:31,370 between strings and these types. 1274 00:59:31,370 --> 00:59:33,495 So that's what I'll do here. 1275 00:59:33,495 --> 00:59:35,745 Basically, the details I don't want to try to explain. 1276 00:59:35,745 --> 00:59:38,910 But this is like a declaration, OK, 1277 00:59:38,910 --> 00:59:41,275 you're allowed to convert strings into IDs. 1278 00:59:41,275 --> 00:59:44,130 For those who speak Haskell, this is a type class instance. 1279 00:59:44,130 --> 00:59:46,530 For those who don't, it's permission 1280 00:59:46,530 --> 00:59:49,124 to turn strings into IDs. 1281 00:59:49,124 --> 00:59:51,040 We're going to leave out the other permission. 1282 00:59:51,040 --> 00:59:55,980 We don't want to be able to turn an ID back into anything. 1283 00:59:55,980 --> 00:59:58,635 And the password-- let's do the same thing. 1284 00:59:58,635 --> 01:00:00,760 We want to be able to read a password from the user 1285 01:00:00,760 --> 01:00:04,709 but not take a password and turn it into a string 1286 01:00:04,709 --> 01:00:06,750 where we can actually tell what the user entered. 1287 01:00:06,750 --> 01:00:08,166 So other parts of the code will be 1288 01:00:08,166 --> 01:00:10,880 able to accept password input from the user, 1289 01:00:10,880 --> 01:00:15,110 convert it into this type, and ship it off to the user module 1290 01:00:15,110 --> 01:00:16,820 and have it be checked. 1291 01:00:16,820 --> 01:00:19,170 But what they can't do is query the user table 1292 01:00:19,170 --> 01:00:22,310 and get all the passwords in a form where they can actually 1293 01:00:22,310 --> 01:00:25,500 extract the text from them.
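The one-directional conversion described above, where a string can become a password but a password cannot be turned back into a string, can be approximated in Python. This is a sketch under loose assumptions: the `Password` class and its `matches` method are hypothetical names, and Python enforces privacy only by convention, whereas the abstract type in the lecture is enforced by the compiler.

```python
import hmac


class Password:
    """Opaque wrapper: it can be built from a string, but it exposes no
    method or printed form that gives the string back."""

    __slots__ = ("_secret",)

    def __init__(self, raw: str) -> None:
        # The only permitted direction of conversion: string -> Password.
        self._secret = raw

    def __repr__(self) -> str:
        # Printing or logging a Password never reveals what the user typed.
        return "Password(<hidden>)"

    __str__ = __repr__

    def matches(self, other: "Password") -> bool:
        """Compare two passwords without returning either one as text."""
        return hmac.compare_digest(self._secret.encode(), other._secret.encode())
```

Code outside the class can accept input, wrap it with `Password(...)`, and hand it on to be checked, which mirrors the flow described for the user module. Nothing stops a determined Python caller from reading `_secret`, so this shows the shape of the interface rather than a real security boundary.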
1294 01:00:25,500 --> 01:00:32,438 Then we can have a login method that takes these two components 1295 01:00:32,438 --> 01:00:36,406 and just runs for its side effects, which is effectively 1296 01:00:36,406 --> 01:00:37,398 what that code says. 1297 01:00:37,398 --> 01:00:40,814 We'll also need a way to tell which user is logged in. 1298 01:00:40,814 --> 01:00:43,520 That is code that runs a transaction that 1299 01:00:43,520 --> 01:00:46,730 produces an ID. 1300 01:00:46,730 --> 01:00:49,264 All right, so step one, we can just copy this definition. 1301 01:00:54,566 --> 01:00:56,650 And I'll fill in what these actually are. 1302 01:00:56,650 --> 01:00:58,590 It turns out-- surprise, surprise-- user IDs 1303 01:00:58,590 --> 01:00:59,910 and passwords are both strings. 1304 01:00:59,910 --> 01:01:03,706 But outside the module, that won't be exposed. 1305 01:01:03,706 --> 01:01:05,330 And now we're going to create a cookie. 1306 01:01:05,330 --> 01:01:09,410 So cookies are another thing that's built into the language. 1307 01:01:09,410 --> 01:01:12,220 Effectively, they act like mutable global variables 1308 01:01:12,220 --> 01:01:16,400 that have one copy per client that uses your application. 1309 01:01:16,400 --> 01:01:18,940 So we're going to create a cookie that on each client 1310 01:01:18,940 --> 01:01:24,102 will store basically just a copy of the same two fields 1311 01:01:24,102 --> 01:01:25,453 that we have here. 1312 01:01:25,453 --> 01:01:27,715 So this cookie is private to this module. 1313 01:01:27,715 --> 01:01:30,470 Other parts of the code won't be able to read the cookie, 1314 01:01:30,470 --> 01:01:33,270 because they just don't have this private field in scope. 1315 01:01:33,270 --> 01:01:35,170 So no one else will be able to see directly 1316 01:01:35,170 --> 01:01:38,090 the ID and password that are stored for this user.
1317 01:01:38,090 --> 01:01:40,870 But they will be persisted across different page views, 1318 01:01:40,870 --> 01:01:45,594 just like you would expect for cookies usually. 1319 01:01:45,594 --> 01:01:48,170 I'm going to give it a login function that's 1320 01:01:48,170 --> 01:01:53,460 going to run some incantation to check against the database 1321 01:01:53,460 --> 01:01:58,328 whether this is really a correct pair of username and password. 1322 01:01:58,328 --> 01:02:00,302 It'll just check, can we find a row 1323 01:02:00,302 --> 01:02:05,945 in the database that has this user ID and has this password? 1324 01:02:10,350 --> 01:02:13,450 If we find one, then yes, good, that's the correct value. 1325 01:02:13,450 --> 01:02:15,580 Let's just save it into the cookie. 1326 01:02:15,580 --> 01:02:18,390 We use a method that modifies the cookie value. 1327 01:02:18,390 --> 01:02:20,180 And we have to put some things in here, 1328 01:02:20,180 --> 01:02:23,960 like just for simplicity, I'll say this cookie never expires. 1329 01:02:23,960 --> 01:02:26,660 And I don't want to run SSL here, 1330 01:02:26,660 --> 01:02:29,180 so I'll say it doesn't need to be secure. 1331 01:02:29,180 --> 01:02:32,220 But if you really care about security, 1332 01:02:32,220 --> 01:02:35,840 obviously you would write secure equals true. 1333 01:02:35,840 --> 01:02:38,300 And if the check failed, then we can-- I don't know. 1334 01:02:38,300 --> 01:02:39,050 It doesn't matter. 1335 01:02:39,050 --> 01:02:41,150 If it signals an error, execution 1336 01:02:41,150 --> 01:02:44,720 stops with this error description. 1337 01:02:44,720 --> 01:02:47,520 Finally, we can create this function 1338 01:02:47,520 --> 01:02:50,230 that tells who the user is logged in as by getting 1339 01:02:50,230 --> 01:02:51,620 the current cookie value. 
1340 01:02:51,620 --> 01:02:57,320 And then it might be none if the user hasn't 1341 01:02:57,320 --> 01:03:00,150 logged in yet, in which case, we can have a different error 1342 01:03:00,150 --> 01:03:01,870 message. 1343 01:03:01,870 --> 01:03:05,070 Or it might be some record of exactly the type 1344 01:03:05,070 --> 01:03:06,486 we used up there. 1345 01:03:06,486 --> 01:03:07,986 So I'll just copy some of this here. 1346 01:03:14,762 --> 01:03:17,110 Let's run the same check there. 1347 01:03:17,110 --> 01:03:19,730 If it worked, then we'll just return 1348 01:03:19,730 --> 01:03:21,480 the ID part of the record that we just 1349 01:03:21,480 --> 01:03:22,735 verified against the database. 1350 01:03:25,350 --> 01:03:26,940 Otherwise, [INAUDIBLE]. 1351 01:03:34,062 --> 01:03:39,496 So let me just type check this to see if this is on track, 1352 01:03:39,496 --> 01:03:44,930 that part-- Oops, capital Id. 1353 01:03:53,370 --> 01:03:55,950 All right, so the important thing is there's all 1354 01:03:55,950 --> 01:03:57,250 those implementation details. 1355 01:03:57,250 --> 01:03:59,083 But from outside this module, we think of it 1356 01:03:59,083 --> 01:04:00,990 in terms of the interface up there. 1357 01:04:00,990 --> 01:04:03,660 There are some unknown types of IDs and passwords. 1358 01:04:03,660 --> 01:04:06,780 This table of users is expressed in terms of them. 1359 01:04:06,780 --> 01:04:09,532 We're allowed to turn strings into IDs and passwords, but not 1360 01:04:09,532 --> 01:04:11,280 the other way around. 1361 01:04:11,280 --> 01:04:13,795 And we have these two methods to log in in the first place 1362 01:04:13,795 --> 01:04:18,830 and to check which user is logged in at this point. 1363 01:04:18,830 --> 01:04:19,830 Any question about this? 1364 01:04:19,830 --> 01:04:20,330 Yeah. 1365 01:04:20,330 --> 01:04:22,602 STUDENT: Do you need to expose the user table?
1366 01:04:24,510 --> 01:04:27,135 ADAM CHLIPALA: Because I want to use it as a foreign key later. 1367 01:04:27,135 --> 01:04:28,504 That was the reason I did it. 1368 01:04:28,504 --> 01:04:30,472 It's not that great of a reason. 1369 01:04:34,408 --> 01:04:36,290 All right, so we're almost at the point 1370 01:04:36,290 --> 01:04:38,510 where I can show you CSRF protection in action. 1371 01:04:38,510 --> 01:04:41,090 We have to actually start logging in. 1372 01:04:41,090 --> 01:04:44,030 So that's easy enough to do. 1373 01:04:50,400 --> 01:04:52,720 OK, so what can we do here? 1374 01:04:52,720 --> 01:04:56,696 Let's just add another part of this page that says, 1375 01:04:56,696 --> 01:04:58,684 here's where you log in. 1376 01:05:02,660 --> 01:05:03,654 This is the form. 1377 01:05:10,681 --> 01:05:13,097 This is where you would put the username and the password. 1378 01:05:22,540 --> 01:05:24,405 And then click on the button. 1379 01:05:24,405 --> 01:05:26,920 It's trying to go to call a function called login, 1380 01:05:26,920 --> 01:05:29,370 which we'll define in a moment. 1381 01:05:36,230 --> 01:05:43,090 Let's define login as a function that does these things. 1382 01:05:43,090 --> 01:05:47,580 It's actually just a wrapper around calling the login 1383 01:05:47,580 --> 01:05:51,270 function from that module where we take each of the components 1384 01:05:51,270 --> 01:05:54,387 and convert it from string to the abstract types. 1385 01:05:54,387 --> 01:05:55,720 That's what read error is doing. 1386 01:05:55,720 --> 01:05:58,090 Error means if it doesn't work, just abort execution 1387 01:05:58,090 --> 01:06:01,333 instead of signaling the failure with a special return value. 1388 01:06:04,279 --> 01:06:06,925 Here's both of those, login and then jump to main. 1389 01:06:06,925 --> 01:06:08,848 So now we should be able to log in. 1390 01:06:08,848 --> 01:06:10,252 Let's check if that's true. 
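
The login and whoami behavior described above can be summarized in a short sketch. This is a Python approximation under stated assumptions, not the Ur/Web module from the demo: the class `UserModule`, the exception `LoginError`, and the error messages are hypothetical, the user table is a plain dictionary, and a single private attribute stands in for the per-client cookie. It stores the password as text only because the demo does.

```python
from typing import Dict, Optional


class LoginError(Exception):
    """Raised to abort the request, like signaling an error in the demo."""


class UserModule:
    """Hypothetical stand-in for the user module, serving a single client."""

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = dict(users)  # user table: id -> password
        # Private "cookie": other code has no name for it in scope.
        self._cookie: Optional[Dict[str, str]] = None

    def _valid(self, user_id: str, password: str) -> bool:
        # Is there a row with this user ID and this password?
        return self._users.get(user_id) == password

    def login(self, user_id: str, password: str) -> None:
        """Run for its side effect: save verified credentials in the cookie."""
        if not self._valid(user_id, password):
            raise LoginError("wrong ID or password")
        self._cookie = {"Id": user_id, "Password": password}

    def whoami(self) -> str:
        """Return the logged-in ID after re-running the same check."""
        if self._cookie is None:
            raise LoginError("not logged in")
        if not self._valid(self._cookie["Id"], self._cookie["Password"]):
            raise LoginError("stored credentials no longer match")
        return self._cookie["Id"]


users = UserModule({"a": "secret"})
users.login("a", "secret")
print(users.whoami())
```

Re-checking the cookie contents inside `whoami` mirrors the lecture's choice to run the same database check again rather than trust whatever the client sends back.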
1391 01:06:13,160 --> 01:06:16,106 OK, so that was [INAUDIBLE]. 1392 01:06:22,500 --> 01:06:25,570 We'll probably want to create an account to allow us to log in. 1393 01:06:25,570 --> 01:06:27,550 So let me [INAUDIBLE]. 1394 01:06:34,655 --> 01:06:39,022 So now I should be able to log in as a. 1395 01:06:39,022 --> 01:06:40,360 OK, and take my word for it. 1396 01:06:40,360 --> 01:06:45,052 There's now a cookie set to record that information. 1397 01:06:45,052 --> 01:06:47,510 And then let's go back in the chat room and send a message. 1398 01:06:47,510 --> 01:06:50,300 We didn't actually add any access control here yet. 1399 01:06:50,300 --> 01:06:51,880 So there's not much going on here. 1400 01:06:51,880 --> 01:06:53,852 But we can check to see. 1401 01:06:53,852 --> 01:06:54,560 There's a cookie. 1402 01:06:54,560 --> 01:06:57,070 But the system has determined that we're not 1403 01:06:57,070 --> 01:06:58,970 using the cookie. 1404 01:06:58,970 --> 01:07:01,500 When we submit this form, the cookie is not read. 1405 01:07:01,500 --> 01:07:04,492 So there's actually no need to add any CSRF protection here 1406 01:07:04,492 --> 01:07:04,992 yet. 1407 01:07:04,992 --> 01:07:07,144 So now we have to add the way to use the cookie. 1408 01:07:07,144 --> 01:07:09,018 And then we should see the protection appear. 1409 01:07:09,018 --> 01:07:09,974 Yeah. 1410 01:07:09,974 --> 01:07:10,930 STUDENT: What are the contents of the cookie? 1411 01:07:10,930 --> 01:07:12,265 ADAM CHLIPALA: What are the contents of the cookie? 1412 01:07:12,265 --> 01:07:15,130 The contents are exactly what you'd expect from the code. 1413 01:07:15,130 --> 01:07:17,840 In other words, the cookie is declared 1414 01:07:17,840 --> 01:07:21,670 as having type this record, an ID, and a password. 1415 01:07:21,670 --> 01:07:23,045 So that's exactly what's in there 1416 01:07:23,045 --> 01:07:24,378 in a particular serialized form. 
1417 01:07:29,930 --> 01:07:32,000 So now let's actually use the cookie. 1418 01:07:32,000 --> 01:07:34,310 And we should hopefully see despite the fact 1419 01:07:34,310 --> 01:07:36,995 we're going to use the cookie indirectly, 1420 01:07:36,995 --> 01:07:39,620 because we're going to use it in the room module, which doesn't 1421 01:07:39,620 --> 01:07:40,870 even have the cookie in scope. 1422 01:07:40,870 --> 01:07:44,675 But we'll call methods of the user module, which indirectly 1423 01:07:44,675 --> 01:07:45,550 are using the cookie. 1424 01:07:45,550 --> 01:07:47,341 And then the system will realize that means 1425 01:07:47,341 --> 01:07:48,710 we have dependency on it. 1426 01:07:48,710 --> 01:07:55,671 So let's make this really simple and just say 1427 01:07:55,671 --> 01:07:57,360 call the whoami method. 1428 01:08:00,430 --> 01:08:02,520 And I'm actually just going to ignore this. 1429 01:08:02,520 --> 01:08:04,320 Or we can do this. 1430 01:08:04,320 --> 01:08:07,710 Let's decide this a user we created is really special. 1431 01:08:07,710 --> 01:08:12,938 And only this user is allowed to post anything. 1432 01:08:18,427 --> 01:08:22,419 And we'll fail if we're not a. 1433 01:08:22,419 --> 01:08:25,413 All right, let's see if this works. 1434 01:08:25,413 --> 01:08:26,910 Did I forget a slash somewhere? 1435 01:08:26,910 --> 01:08:28,407 Oh, yeah. 1436 01:08:35,392 --> 01:08:35,892 [INAUDIBLE] 1437 01:08:48,367 --> 01:08:51,120 Oh, I expect him to be a string. 1438 01:08:51,120 --> 01:08:53,910 But it's actually an ID. 1439 01:08:53,910 --> 01:08:59,979 So let's just read a into an ID just like we did below 1440 01:08:59,979 --> 01:09:03,670 to process login. 1441 01:09:03,670 --> 01:09:05,510 And we haven't exposed that the ID 1442 01:09:05,510 --> 01:09:07,729 type supports equality testing. 1443 01:09:07,729 --> 01:09:10,540 So I'll just add that to the user module. 1444 01:09:10,540 --> 01:09:13,080 And then that should work. 
1445 01:09:13,080 --> 01:09:16,160 ID supports equality testing. 1446 01:09:16,160 --> 01:09:18,948 And we should be OK. 1447 01:09:18,948 --> 01:09:20,920 So now we've broadened the interface. 1448 01:09:20,920 --> 01:09:23,648 Now we can do more things with the ID, which could 1449 01:09:23,648 --> 01:09:25,136 trigger some security issues. 1450 01:09:25,136 --> 01:09:27,756 But it lets us add this access control check, 1451 01:09:27,756 --> 01:09:34,560 so let's see how that works, go back to the main page 1452 01:09:34,560 --> 01:09:35,552 to [INAUDIBLE]. 1453 01:09:45,472 --> 01:09:47,770 All right, now the form automatically 1454 01:09:47,770 --> 01:09:50,550 has a hidden input named sig, which 1455 01:09:50,550 --> 01:09:53,544 is a cryptographic signature of the values of all 1456 01:09:53,544 --> 01:09:54,890 of the cookies. 1457 01:09:54,890 --> 01:10:00,380 And it's signed using a key that's a secret for the server. 1458 01:10:00,380 --> 01:10:04,110 And when the form is submitted, the application 1459 01:10:04,110 --> 01:10:06,360 knows, because the compiler told it, 1460 01:10:06,360 --> 01:10:09,130 that it should be checking signatures for the following 1461 01:10:09,130 --> 01:10:10,170 set of operations. 1462 01:10:10,170 --> 01:10:13,460 In this case, the only one is this say operation. 1463 01:10:13,460 --> 01:10:13,960 Yeah. 1464 01:10:13,960 --> 01:10:17,910 STUDENT: Does the signature have any sort of time stamp as well? 1465 01:10:17,910 --> 01:10:19,785 ADAM CHLIPALA: It does not have a time stamp. 1466 01:10:19,785 --> 01:10:22,866 STUDENT: Otherwise, if the attacker ever saw this live, 1467 01:10:22,866 --> 01:10:25,311 they could pretend to be the user. 1468 01:10:25,311 --> 01:10:27,010 It never expires. 1469 01:10:27,010 --> 01:10:29,140 ADAM CHLIPALA: It never expires, right.
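
The hidden `sig` field described above, a signature over all cookie values under a server-side secret, can be approximated with a keyed hash. This is a minimal Python sketch under assumptions: the lecture does not say which signature scheme or serialization Ur/Web uses, so HMAC-SHA256, the length-prefixed encoding, and the names `SERVER_KEY`, `sign_cookies`, and `check_signature` are all hypothetical. Like the system as described, the sketch includes no time stamp, so a signature stays valid for as long as the cookie values are unchanged.

```python
import hashlib
import hmac
import secrets
from typing import Dict

# Assumption: one secret generated on the server and never sent to clients.
SERVER_KEY = secrets.token_bytes(32)


def sign_cookies(cookies: Dict[str, str], key: bytes = SERVER_KEY) -> str:
    """Return a hex signature covering every cookie name and value."""
    parts = []
    for name, value in sorted(cookies.items()):
        # Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        parts.append(f"{len(name)}:{name}{len(value)}:{value}")
    message = "".join(parts).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_signature(cookies: Dict[str, str], submitted: str,
                    key: bytes = SERVER_KEY) -> bool:
    """Accept a form submission only if its sig matches the current cookies."""
    expected = sign_cookies(cookies, key)
    # Constant-time comparison on bytes avoids leaking how many digits match.
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted.encode("utf-8"))


cookies = {"login": "a/secret"}
sig = sign_cookies(cookies)            # embedded in the form as a hidden input
print(check_signature(cookies, sig))   # a genuine submission is accepted
```

A page on another site can make the browser send the cookie, but it cannot compute `sig` without the server key, which is why a forged submission fails the check.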
1470 01:10:29,140 --> 01:10:32,354 So that's something that could be changed just 1471 01:10:32,354 --> 01:10:34,020 by modifying the language implementation 1472 01:10:34,020 --> 01:10:36,216 without modifying the applications, 1473 01:10:36,216 --> 01:10:37,668 and then deployed instantly. 1474 01:10:37,668 --> 01:10:39,120 But it's not there now. 1475 01:10:39,120 --> 01:10:43,476 And I can see why that could be a useful thing to add. 1476 01:10:43,476 --> 01:10:44,928 Question, yeah. 1477 01:10:44,928 --> 01:10:48,030 STUDENT: You could also fix that by just putting an [INAUDIBLE] 1478 01:10:48,030 --> 01:10:49,440 as well. 1479 01:10:49,440 --> 01:10:50,565 ADAM CHLIPALA: That's true. 1480 01:10:50,565 --> 01:10:52,398 You're right, you can change the application 1481 01:10:52,398 --> 01:10:54,750 to purposely modify the cookie data frequently enough 1482 01:10:54,750 --> 01:10:56,574 that the signature would go out of date. 1483 01:10:56,574 --> 01:10:58,065 That's also true. 1484 01:11:10,000 --> 01:11:11,860 So we've got 10 minutes left. 1485 01:11:11,860 --> 01:11:14,340 Any requests for things that someone particularly wants 1486 01:11:14,340 --> 01:11:16,026 to see before class is over? 1487 01:11:19,190 --> 01:11:21,930 I can start showing some Ajax stuff by default 1488 01:11:21,930 --> 01:11:24,329 if no one has another request. 1489 01:11:30,245 --> 01:11:31,231 Yeah. 1490 01:11:31,231 --> 01:11:35,190 STUDENT: Can you remap the URLs? 1491 01:11:35,190 --> 01:11:36,854 ADAM CHLIPALA: You can, yes. 1492 01:11:36,854 --> 01:11:39,670 So what remapping would you like to see? 1493 01:11:39,670 --> 01:11:40,211 STUDENT: Any. 1494 01:11:40,211 --> 01:11:41,923 I just want to see how it's done. 1495 01:11:41,923 --> 01:11:45,630 ADAM CHLIPALA: OK, so the compiler 1496 01:11:45,630 --> 01:11:48,210 is assigning-- as we can see back over here, 1497 01:11:48,210 --> 01:11:49,460 we called the say function. 
1498 01:11:49,460 --> 01:11:50,835 And basically, that function call 1499 01:11:50,835 --> 01:11:52,590 is serialized as a particular URL form. 1500 01:11:52,590 --> 01:11:54,560 Maybe we don't like that form. 1501 01:11:54,560 --> 01:12:00,615 We decide we're going to rewrite URL so say 1502 01:12:00,615 --> 01:12:07,696 is inside the room module, inside demo. 1503 01:12:07,696 --> 01:12:10,414 Better put this up top so it runs 1504 01:12:10,414 --> 01:12:14,620 before these other rewrites-- rewrite url Demo/Room/say 1505 01:12:14,620 --> 01:12:18,085 into Demo/Room/speak. 1506 01:12:22,045 --> 01:12:26,005 And hopefully that's what I want it to. 1507 01:12:26,005 --> 01:12:27,985 Let's see what happens. 1508 01:12:34,915 --> 01:12:39,587 Yep, and you can have wild cards in those rules 1509 01:12:39,587 --> 01:12:41,170 also to map one prefix to another one. 1510 01:12:44,490 --> 01:12:47,100 And the compiler will enforce that every function 1511 01:12:47,100 --> 01:12:49,620 has a distinct URL schema. 1512 01:12:49,620 --> 01:12:51,890 So if you add a rule that causes a clash, 1513 01:12:51,890 --> 01:12:53,842 you'll get [INAUDIBLE]. 1514 01:12:53,842 --> 01:12:55,794 By default, the automatically generated 1515 01:12:55,794 --> 01:12:58,722 URL schemes are [INAUDIBLE]. 1516 01:12:58,722 --> 01:13:03,602 You can break that by using this feature. 1517 01:13:03,602 --> 01:13:06,518 Any other requests? 1518 01:13:06,518 --> 01:13:07,018 Yeah. 1519 01:13:07,018 --> 01:13:13,220 STUDENT: So you mentioned that the HTML [INAUDIBLE] 1520 01:13:13,220 --> 01:13:15,110 is not compiler specific. 1521 01:13:15,110 --> 01:13:16,335 It's like one is a library. 1522 01:13:16,335 --> 01:13:20,790 Are there other libraries for other formats as well? 1523 01:13:20,790 --> 01:13:22,760 ADAM CHLIPALA: There are other libraries 1524 01:13:22,760 --> 01:13:27,352 that don't do type checking at the same level of richness. 
1525 01:13:27,352 --> 01:13:28,810 But for instance, there's a library 1526 01:13:28,810 --> 01:13:32,580 for serializing and de-serializing JSON. 1527 01:13:32,580 --> 01:13:34,440 And most of the automated way that's 1528 01:13:34,440 --> 01:13:36,940 driven by type structure. 1529 01:13:36,940 --> 01:13:40,364 So you can do things like that that aren't as integrated 1530 01:13:40,364 --> 01:13:41,114 with the compiler. 1531 01:13:47,040 --> 01:13:47,540 Yeah. 1532 01:13:47,540 --> 01:13:50,396 STUDENT: Presumably you'd still want to write JavaScript. 1533 01:13:50,396 --> 01:13:52,435 Is there any-- 1534 01:13:52,435 --> 01:13:53,678 ADAM CHLIPALA: I don't. 1535 01:13:53,678 --> 01:13:54,674 You do. 1536 01:13:54,674 --> 01:13:58,029 STUDENT: Right, no, but for, say, I don't know, 1537 01:13:58,029 --> 01:13:59,654 you want to animate things on the page. 1538 01:13:59,654 --> 01:14:00,974 There are still things where-- 1539 01:14:00,974 --> 01:14:03,140 ADAM CHLIPALA: Let me load the Ajax version of this. 1540 01:14:03,140 --> 01:14:05,132 And that might answer your question. 1541 01:14:08,118 --> 01:14:08,618 [INAUDIBLE] 1542 01:14:29,036 --> 01:14:31,526 All right, so this version has client side code. 1543 01:14:31,526 --> 01:14:33,518 Let's just [INAUDIBLE]. 1544 01:14:41,010 --> 01:14:45,946 Believe it or not, this time the add worked by an Ajax call. 1545 01:14:45,946 --> 01:14:52,350 And you get things like, here's a button tag. 1546 01:14:52,350 --> 01:14:54,210 And it has an onclick attribute that when 1547 01:14:54,210 --> 01:14:56,630 a user clicks the button, all this code here 1548 01:14:56,630 --> 01:14:58,330 runs on the client side. 1549 01:14:58,330 --> 01:14:59,769 But it's Ur/Web code. 1550 01:14:59,769 --> 01:15:00,810 It's not JavaScript code. 
1551 01:15:00,810 --> 01:15:03,360 The compiler translates it into JavaScript for you 1552 01:15:03,360 --> 01:15:07,110 and guarantees that it maintains the properties that we want 1553 01:15:07,110 --> 01:15:09,388 for the abstractions in our list, 1554 01:15:09,388 --> 01:15:12,934 as long as the user isn't mucking around with it manually 1555 01:15:12,934 --> 01:15:15,279 [INAUDIBLE]. 1556 01:15:15,279 --> 01:15:16,820 STUDENT: I'm more thinking that there 1557 01:15:16,820 --> 01:15:18,426 are a lot of [INAUDIBLE] libraries 1558 01:15:18,426 --> 01:15:22,345 out there today that do useful things, and in many cases 1559 01:15:22,345 --> 01:15:25,672 complex things if you want to recode everything yourself. 1560 01:15:25,672 --> 01:15:28,355 Is there any way of interfacing with JavaScript from Ur/Web? 1561 01:15:28,355 --> 01:15:30,730 ADAM CHLIPALA: Yes, there's a foreign function interface, 1562 01:15:30,730 --> 01:15:33,670 which lets you give Ur/Web function names to JavaScript 1563 01:15:33,670 --> 01:15:35,192 function names and call them. 1564 01:15:35,192 --> 01:15:38,452 But then whenever you use the foreign function interface, 1565 01:15:38,452 --> 01:15:41,641 you don't get all of these nice properties like construction 1566 01:15:41,641 --> 01:15:42,141 anymore. 1567 01:15:42,141 --> 01:15:43,632 You have to be very careful. 1568 01:15:43,632 --> 01:15:45,123 And to some extent, you have to understand the implementations 1569 01:15:45,123 --> 01:15:48,050 of some of these abstractions to avoid disturbing them. 1570 01:15:51,256 --> 01:15:56,245 While I have this code up here, let me just show you. 1571 01:15:56,245 --> 01:15:58,820 We still have the same say function as before, roughly.
1572 01:15:58,820 --> 01:16:00,940 But now, instead of calling it via a link, 1573 01:16:00,940 --> 01:16:02,860 we just take the function call, which 1574 01:16:02,860 --> 01:16:05,740 is populated with arguments that come about from the context 1575 01:16:05,740 --> 01:16:08,640 of this onclick handler. 1576 01:16:08,640 --> 01:16:11,410 And we just wrap that function call inside the RPC syntax. 1577 01:16:11,410 --> 01:16:14,680 And that means this is a function call on the client, 1578 01:16:14,680 --> 01:16:17,195 but run the call itself on the server with access 1579 01:16:17,195 --> 01:16:19,295 to the database and other server resources, 1580 01:16:19,295 --> 01:16:22,020 and then ship the result back over here. 1581 01:16:22,020 --> 01:16:23,920 And it's written in this direct style 1582 01:16:23,920 --> 01:16:26,490 here instead of the callbacks that you 1583 01:16:26,490 --> 01:16:28,150 need to use in JavaScript usually 1584 01:16:28,150 --> 01:16:32,245 to accomplish a remote server call [INAUDIBLE]. 1585 01:16:32,245 --> 01:16:32,745 Yeah. 1586 01:16:32,745 --> 01:16:33,620 STUDENT: [INAUDIBLE]? 1587 01:16:38,717 --> 01:16:40,217 ADAM CHLIPALA: The client is allowed 1588 01:16:40,217 --> 01:16:41,804 to call anything in scope. 1589 01:16:41,804 --> 01:16:43,345 So you just have to use scope the way 1590 01:16:43,345 --> 01:16:46,410 we usually use it to hide private fields 1591 01:16:46,410 --> 01:16:48,954 and so forth inside of an abstraction. 1592 01:16:57,420 --> 01:16:59,480 I mean, because there's a call here, 1593 01:16:59,480 --> 01:17:01,190 the functions we're allowed to call 1594 01:17:01,190 --> 01:17:04,530 are the ones whose names are in scope. 1595 01:17:04,530 --> 01:17:06,660 This name happens to not be in scope here. 1596 01:17:06,660 --> 01:17:08,209 So we couldn't call it directly here. 1597 01:17:08,209 --> 01:17:09,667 But because it's in scope up there, 1598 01:17:09,667 --> 01:17:11,158 we're allowed to call it.
1599 01:17:16,128 --> 01:17:18,613 Did I see another hand? 1600 01:17:23,583 --> 01:17:25,226 Let's see, is there anything else 1601 01:17:25,226 --> 01:17:27,559 interesting about this version that I wanted to mention? 1602 01:17:32,070 --> 01:17:35,110 It involves an implementation of a GUI widget using 1603 01:17:35,110 --> 01:17:36,760 this functional reactive style, which 1604 01:17:36,760 --> 01:17:40,060 is cool from a programming modularity perspective 1605 01:17:40,060 --> 01:17:43,790 but maybe less interesting from a security perspective. 1606 01:17:43,790 --> 01:17:47,210 But here's an example of calling a method of this abstraction 1607 01:17:47,210 --> 01:17:49,500 of a portion of the page that displays 1608 01:17:49,500 --> 01:17:51,103 a list of lines of text that you can 1609 01:17:51,103 --> 01:17:53,470 add to but never delete from. 1610 01:17:53,470 --> 01:17:55,040 And you can actually enforce that. 1611 01:17:55,040 --> 01:17:56,944 Because we don't have the DOM here. 1612 01:17:56,944 --> 01:17:58,360 It's not that any part of the code 1613 01:17:58,360 --> 01:18:01,424 can reach into the document tree and mutate it and change 1614 01:18:01,424 --> 01:18:03,590 the log and delete lines that were previously added. 1615 01:18:06,175 --> 01:18:07,675 The more functional style here means 1616 01:18:07,675 --> 01:18:09,570 you can actually have a GUI widget that 1617 01:18:09,570 --> 01:18:11,280 owns a part of the page and controls 1618 01:18:11,280 --> 01:18:13,710 exactly what's shown there, and bugs in other code 1619 01:18:13,710 --> 01:18:16,682 can't interfere with computing what shows up there. 1620 01:18:21,570 --> 01:18:23,070 This is probably a good point to stop, 1621 01:18:23,070 --> 01:18:26,735 unless there are any last questions. 1622 01:18:26,735 --> 01:18:28,190 STUDENT: Channels? 1623 01:18:28,190 --> 01:18:29,650 ADAM CHLIPALA: Channels.
1624 01:18:29,650 --> 01:18:32,285 I don't think there's enough time to properly demonstrate 1625 01:18:32,285 --> 01:18:33,199 channels. 1626 01:18:33,199 --> 01:18:34,610 But there's code in the paper. 1627 01:18:34,610 --> 01:18:36,526 And there are all sorts of demos and tutorials 1628 01:18:36,526 --> 01:18:39,250 on the website for this project. 1629 01:18:39,250 --> 01:18:40,220 Yeah. 1630 01:18:40,220 --> 01:18:42,645 STUDENT: It's really hard writing correct [INAUDIBLE] 1631 01:18:42,645 --> 01:18:43,615 and compilers. 1632 01:18:43,615 --> 01:18:46,436 How do you mitigate problems that 1633 01:18:46,436 --> 01:18:52,284 might be present from the abstraction layers themselves? 1634 01:18:52,284 --> 01:18:54,450 ADAM CHLIPALA: Get people to use it and report bugs. 1635 01:18:54,450 --> 01:18:58,060 That's the best I have for you. 1636 01:18:58,060 --> 01:19:01,967 I guess the idea is compilers like this 1637 01:19:01,967 --> 01:19:03,550 should be written much less frequently 1638 01:19:03,550 --> 01:19:04,810 than new applications. 1639 01:19:04,810 --> 01:19:07,675 So to condense all the bug finding in this one place 1640 01:19:07,675 --> 01:19:09,586 is still an improvement, even if it's not 1641 01:19:09,586 --> 01:19:13,581 done in a particularly principled way. 1642 01:19:13,581 --> 01:19:14,080 Yeah. 1643 01:19:14,080 --> 01:19:15,538 STUDENT: Just out of curiosity, how 1644 01:19:15,538 --> 01:19:18,440 are [INAUDIBLE] files handled? 1645 01:19:18,440 --> 01:19:20,500 ADAM CHLIPALA: You can use that configuration 1646 01:19:20,500 --> 01:19:24,090 file I showed to map them into parts of the URL space. 1647 01:19:24,090 --> 01:19:27,430 Or you can manually produce values 1648 01:19:27,430 --> 01:19:29,630 in the program that stand for files 1649 01:19:29,630 --> 01:19:31,720 and ask to return those as the result of the page. 1650 01:19:31,720 --> 01:19:35,319 There are a few different approaches. 1651 01:19:35,319 --> 01:19:36,205 Yeah. 
1652 01:19:36,205 --> 01:19:37,540 STUDENT: Why Ur? 1653 01:19:37,540 --> 01:19:39,785 ADAM CHLIPALA: You're asking how I chose the name? 1654 01:19:39,785 --> 01:19:41,064 STUDENT: Yeah, like why-- 1655 01:19:41,064 --> 01:19:43,480 ADAM CHLIPALA: Oh, you're asking why you want to use this. 1656 01:19:43,480 --> 01:19:45,870 STUDENT: No, no, the name of the language, 1657 01:19:45,870 --> 01:19:47,850 just out of curiosity. 1658 01:19:47,850 --> 01:19:51,320 ADAM CHLIPALA: So Ur language is a concept from linguistics 1659 01:19:51,320 --> 01:19:55,175 to describe the language that is the ancestor 1660 01:19:55,175 --> 01:19:56,930 of the modern languages. 1661 01:19:56,930 --> 01:19:58,750 And the idea is in this language, 1662 01:19:58,750 --> 01:20:00,958 you can embed all sorts of other languages inside it. 1663 01:20:00,958 --> 01:20:02,919 So it's sort of the ancestor of all those.