1 00:00:00,500 --> 00:00:02,600 Quiz two is this week on Friday from, 2 00:00:02,600 --> 00:00:04,092 I hope that's correct, whatever's 3 00:00:04,092 --> 00:00:06,300 on the webpage is correct, but I think it's lecture 9 4 00:00:06,300 --> 00:00:09,000 through recitation number 15. 5 00:00:09,000 --> 00:00:11,380 So, in particular, the log-structured 6 00:00:11,380 --> 00:00:13,482 file system paper is not on the exam. 7 00:00:13,482 --> 00:00:15,190 Some of you may have been told otherwise. 8 00:00:18,480 --> 00:00:20,529 What we're going to do today is continue 9 00:00:20,529 --> 00:00:21,695 our discussion of atomicity. 10 00:00:27,100 --> 00:00:30,630 Which, as you'll recall, is two properties 11 00:00:30,630 --> 00:00:32,970 that have to hold in order for atomicity to hold. 12 00:00:32,970 --> 00:00:39,200 And the first property that we talked about 13 00:00:39,200 --> 00:00:41,129 was recoverability. 14 00:00:41,129 --> 00:00:42,670 And the other property, which is what 15 00:00:42,670 --> 00:00:47,060 we're going to spend most of our time on today, is isolation. 16 00:01:06,450 --> 00:01:07,900 So, before we get into isolation, 17 00:01:07,900 --> 00:01:10,650 I just want to wrap up the discussion of recoverability 18 00:01:10,650 --> 00:01:14,660 because we left it about three-quarters done 19 00:01:14,660 --> 00:01:17,120 and didn't quite finish it. 20 00:01:17,120 --> 00:01:19,780 So the story here was what we did 21 00:01:19,780 --> 00:01:23,870 first was talk about how to achieve 22 00:01:23,870 --> 00:01:28,280 a recoverable sector using two copies 23 00:01:28,280 --> 00:01:32,160 of the data and the chooser sector to choose between them. 24 00:01:32,160 --> 00:01:36,190 We talked about one way of achieving recoverability 25 00:01:36,190 --> 00:01:40,980 using version histories. 26 00:01:40,980 --> 00:01:42,980 And we did complete that discussion. 
27 00:01:42,980 --> 00:01:46,360 And then we started talking about a more efficient way 28 00:01:46,360 --> 00:01:49,130 of achieving recoverability in situations where 29 00:01:49,130 --> 00:01:51,295 you cared a lot about performance using logging. 30 00:01:57,580 --> 00:02:00,250 And the main idea in logging is this protocol 31 00:02:00,250 --> 00:02:04,832 for when you decide to write the log called the write ahead 32 00:02:04,832 --> 00:02:05,790 logging (WAL) protocol. 33 00:02:05,790 --> 00:02:07,580 And the idea is very simple. 34 00:02:07,580 --> 00:02:11,860 Always write to the log before you update cell store. 35 00:02:11,860 --> 00:02:15,250 And that's the main discipline that if you always follow then 36 00:02:15,250 --> 00:02:17,890 you'll get things right. 37 00:02:17,890 --> 00:02:19,900 But one of the consequences of logging, 38 00:02:19,900 --> 00:02:22,310 you get two things from having this log. 39 00:02:22,310 --> 00:02:24,452 The first is when an action aborts. 40 00:02:24,452 --> 00:02:25,910 Independent of failure, when you're 41 00:02:25,910 --> 00:02:29,410 running an atomic action and the program 42 00:02:29,410 --> 00:02:31,750 calls abort or the system aborts that action, 43 00:02:31,750 --> 00:02:35,160 what the log allows you to do is go back through the log, 44 00:02:35,160 --> 00:02:38,010 scan the log backwards, look at all of the steps 45 00:02:38,010 --> 00:02:40,392 that that action took. 46 00:02:40,392 --> 00:02:41,850 And that action didn't quite commit 47 00:02:41,850 --> 00:02:43,730 so you have to back those changes out, 48 00:02:43,730 --> 00:02:46,540 and the log helps you unroll backward. 49 00:02:46,540 --> 00:02:50,990 So primarily what you need with a logging scheme, 50 00:02:50,990 --> 00:02:56,410 in order to do aborts, is the ability to undo. 
51 00:02:56,410 --> 00:03:01,010 Which means that what you need in the log, whenever you update 52 00:03:01,010 --> 00:03:02,710 a cell store to something else, you need 53 00:03:02,710 --> 00:03:04,440 to keep track of the old value. 54 00:03:04,440 --> 00:03:07,040 And this is what we called in the log as an undo step. 55 00:03:10,040 --> 00:03:13,530 But now failures also happen in addition to aborts, 56 00:03:13,530 --> 00:03:22,010 and we need a little bit more mechanism in addition 57 00:03:22,010 --> 00:03:23,310 to just the ability to undo. 58 00:03:23,310 --> 00:03:27,880 And to understand why, let's take a specific example 59 00:03:27,880 --> 00:03:31,110 where you have an application and it is writing data 60 00:03:31,110 --> 00:03:33,560 to a database that is on disk. 61 00:03:37,620 --> 00:03:40,260 And what we said was there was a log 62 00:03:40,260 --> 00:03:44,392 and the log is stored on disk as well. 63 00:03:44,392 --> 00:03:46,100 And let's assume that the log is actually 64 00:03:46,100 --> 00:03:48,160 stored on a different disk. 65 00:03:48,160 --> 00:03:49,970 Now, the write ahead log protocol basically 66 00:03:49,970 --> 00:03:52,080 says that whenever you update, so you 67 00:03:52,080 --> 00:03:53,840 have an action which has things like 68 00:03:53,840 --> 00:03:55,565 reads and writes of cell store. 69 00:03:55,565 --> 00:03:57,440 And whenever there's a write to a cell store, 70 00:03:57,440 --> 00:03:59,680 the write ahead log protocol says write to the log 71 00:03:59,680 --> 00:04:04,140 before you write to the database or to the cell store. 72 00:04:04,140 --> 00:04:05,670 So two things can happen. 73 00:04:05,670 --> 00:04:07,370 The first thing, the simplest model 74 00:04:07,370 --> 00:04:11,040 here is that all writes are synchronous. 
75 00:04:11,040 --> 00:04:12,730 What this means is that if you see 76 00:04:12,730 --> 00:04:17,250 a write statement in an atomic action, 77 00:04:17,250 --> 00:04:20,630 the first thing you have to do is to write to the log. 78 00:04:20,630 --> 00:04:21,550 And then that returns. 79 00:04:21,550 --> 00:04:23,990 And then you write to the database. 80 00:04:23,990 --> 00:04:25,400 And then you run the next action, 81 00:04:25,400 --> 00:04:27,980 the next step of the action. 82 00:04:27,980 --> 00:04:29,890 So clearly by the time in this action, 83 00:04:29,890 --> 00:04:32,080 if you ever get to the commit point 84 00:04:32,080 --> 00:04:34,060 and you're about to exit your commit, 85 00:04:34,060 --> 00:04:35,932 it means that all of the data has already 86 00:04:35,932 --> 00:04:37,140 been written to the database. 87 00:04:37,140 --> 00:04:39,960 Because, by assumption, we assume that all of the writes 88 00:04:39,960 --> 00:04:42,800 are synchronous and you write to the database and only 89 00:04:42,800 --> 00:04:44,830 then do you go to the next step. 90 00:04:44,830 --> 00:04:47,140 And, assuming there's only one thread of execution, 91 00:04:47,140 --> 00:04:50,855 then in this simple model, you always write to the database, 92 00:04:50,855 --> 00:04:52,480 you first write to the log and then you 93 00:04:52,480 --> 00:04:55,610 write to this cell store in the database. 94 00:04:55,610 --> 00:04:57,860 But by construction, when you get to the commit point, 95 00:04:57,860 --> 00:04:59,443 you know for sure that all of the data 96 00:04:59,443 --> 00:05:01,930 has been written to the database. 
97 00:05:01,930 --> 00:05:05,040 So if a failure happens now and you didn't get to commit 98 00:05:05,040 --> 00:05:08,190 and the system failed before the commit ran 99 00:05:08,190 --> 00:05:10,690 any time in the middle of this action, 100 00:05:10,690 --> 00:05:12,980 the only thing you really need to do 101 00:05:12,980 --> 00:05:16,800 is to roll back all of the changes made by actions that 102 00:05:16,800 --> 00:05:19,360 didn't get to commit, which means 103 00:05:19,360 --> 00:05:21,580 that in this model that I've described so far, 104 00:05:21,580 --> 00:05:26,150 the only thing you need to do is to undo steps that correspond 105 00:05:26,150 --> 00:05:31,070 to actions that didn't commit. 106 00:05:31,070 --> 00:05:33,750 Any action that committed by construction 107 00:05:33,750 --> 00:05:35,160 must have written all of its data 108 00:05:35,160 --> 00:05:37,560 and installed that data into cell store. 109 00:05:37,560 --> 00:05:39,840 Because if an action gets to the statement to run 110 00:05:39,840 --> 00:05:42,436 commit and then finishes commit, when it got to commit, 111 00:05:42,436 --> 00:05:44,060 you know that because of all the writes 112 00:05:44,060 --> 00:05:47,310 being synchronous to the cell store, 113 00:05:47,310 --> 00:05:49,170 you know that that write got written. 114 00:05:49,170 --> 00:05:50,992 And, by the write-ahead protocol's definition, 115 00:05:50,992 --> 00:05:52,700 you know the log got written before that. 116 00:05:52,700 --> 00:05:54,449 So you don't actually have to do anything. 117 00:05:54,449 --> 00:05:58,600 Even though the log contains the results of these actions that 118 00:05:58,600 --> 00:06:01,180 committed, you don't have to do anything 119 00:06:01,180 --> 00:06:03,020 for the committed actions. 
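As a rough sketch of the simple model just described (synchronous writes, one thread, an undo-only log), the following Python may help. Everything here is an assumption made for illustration: the names `log`, `cell_store`, `write`, `commit`, and `abort` are hypothetical, a list stands in for the on-disk log, a dict stands in for cell store, and `None` is used to mean "this variable had no previous value".

```python
# Hypothetical stand-ins: a list for the on-disk log, a dict for cell store.
log = []
cell_store = {}

def write(action_id, var, new_value):
    # Write-ahead rule: record the old value in the log first,
    # and only then install the new value in cell store.
    log.append(("UPDATE", action_id, var, cell_store.get(var)))
    cell_store[var] = new_value

def commit(action_id):
    log.append(("COMMIT", action_id))

def abort(action_id):
    # Scan the log backwards and restore every old value this action overwrote.
    for record in reversed(log):
        if record[0] == "UPDATE" and record[1] == action_id:
            _, _, var, old_value = record
            if old_value is None:          # assumption: None means "did not exist"
                cell_store.pop(var, None)
            else:
                cell_store[var] = old_value
    # The abort record is written only after the undo steps are done.
    log.append(("ABORT", action_id))

# Usage: T1 commits, T2 writes twice and then aborts.
write("T1", "x", 1)
commit("T1")
write("T2", "x", 2)
write("T2", "y", 5)
abort("T2")
print(cell_store)
```

After the abort, only T1's committed write to `x` remains in the dict, which is the behavior the undo step is meant to give.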
120 00:06:03,020 --> 00:06:05,550 So, in fact, this undo log is enough to handle 121 00:06:05,550 --> 00:06:09,340 the simple database as well where all of the writes 122 00:06:09,340 --> 00:06:13,360 are being done synchronously in this application. 123 00:06:13,360 --> 00:06:16,230 But there are a few things that can happen. 124 00:06:16,230 --> 00:06:18,880 One of the reasons we threw out version histories 125 00:06:18,880 --> 00:06:21,810 or discarded the idea to go to this logging oriented model 126 00:06:21,810 --> 00:06:24,305 is for higher performance. 127 00:06:24,305 --> 00:06:26,430 And one of the ways we got higher performance is we 128 00:06:26,430 --> 00:06:28,596 didn't have to do these linked list traversals in order 129 00:06:28,596 --> 00:06:30,660 to read and write items. 130 00:06:30,660 --> 00:06:32,270 But it's also going to be tempting, 131 00:06:32,270 --> 00:06:35,680 and you've seen this before, to not 132 00:06:35,680 --> 00:06:38,450 want to do synchronous writes to cell store on a database 133 00:06:38,450 --> 00:06:40,160 in order to get high performance. 134 00:06:40,160 --> 00:06:41,650 You might have asynchronous writes 135 00:06:41,650 --> 00:06:43,483 happening to the database and you don't know 136 00:06:43,483 --> 00:06:45,430 when those writes complete. 137 00:06:45,430 --> 00:06:49,300 And what could happen, as a result of that, 138 00:06:49,300 --> 00:06:51,330 is you'd issue some writes to the database 139 00:06:51,330 --> 00:06:55,760 and then you go ahead and you commit the action. 
140 00:06:55,760 --> 00:06:58,240 But what could happen is the data 141 00:06:58,240 --> 00:07:00,240 that was supposed to be written to this database 142 00:07:00,240 --> 00:07:03,140 cell store may not actually have gotten written because there's 143 00:07:03,140 --> 00:07:07,930 some kind of a cache in memory that actually returned 144 00:07:07,930 --> 00:07:11,490 to you from the write, but the write actually never made it 145 00:07:11,490 --> 00:07:13,320 to the database. 146 00:07:13,320 --> 00:07:15,580 So this system, for higher performance, 147 00:07:15,580 --> 00:07:18,060 if you stick the cache in here, what could happen 148 00:07:18,060 --> 00:07:23,280 is that you might reach the commit point 149 00:07:23,280 --> 00:07:27,360 and finish commit but not be guaranteed that the cell store 150 00:07:27,360 --> 00:07:30,400 data actually got written to the database. 151 00:07:30,400 --> 00:07:32,460 And if you fail now after commit, 152 00:07:32,460 --> 00:07:36,180 you have to go through the log and redo the actions 153 00:07:36,180 --> 00:07:39,490 for all of those actions that committed because some 154 00:07:39,490 --> 00:07:41,650 of those actions may not have made it into the cell 155 00:07:41,650 --> 00:07:44,107 storage in the database. 156 00:07:44,107 --> 00:07:45,690 So what this means is that in general, 157 00:07:45,690 --> 00:07:48,190 when you put a cache here, any memory cache here 158 00:07:48,190 --> 00:07:50,390 or if you have any situation where the writes are 159 00:07:50,390 --> 00:07:53,050 asynchronously being done to cell store, 160 00:07:53,050 --> 00:07:59,390 you need both an undo log in order to handle aborts 161 00:07:59,390 --> 00:08:02,170 and to handle the simple case and you 162 00:08:02,170 --> 00:08:07,950 need the ability to redo the results of certain actions 163 00:08:07,950 --> 00:08:09,500 from the log. 
164 00:08:09,500 --> 00:08:11,540 And both of these are done, so the system fails 165 00:08:11,540 --> 00:08:13,123 and then, when you recover, before you 166 00:08:13,123 --> 00:08:16,680 allow other actions that might be ready to go to take over, 167 00:08:16,680 --> 00:08:21,330 the system has to recover by running the recovery process. 168 00:08:21,330 --> 00:08:23,840 And the recovery happens from the log. 169 00:08:27,090 --> 00:08:29,590 And the way that recovery works is, and we went through this 170 00:08:29,590 --> 00:08:31,790 the last time, you scan the log backwards 171 00:08:31,790 --> 00:08:34,350 from the most recent entry in the log. 172 00:08:40,854 --> 00:08:43,270 And, while scanning the log backwards, you make two lists. 173 00:08:43,270 --> 00:08:46,590 You make a list of winners and a list of losers. 174 00:08:46,590 --> 00:08:50,600 And the winners are all of those actions that 175 00:08:50,600 --> 00:08:52,920 either committed or aborted. 176 00:08:52,920 --> 00:08:56,652 And there's one important point in the abort step, which 177 00:08:56,652 --> 00:08:58,110 is that when the system calls abort 178 00:08:58,110 --> 00:09:01,640 or when the application calls abort and abort returns, what 179 00:09:01,640 --> 00:09:03,360 the system does is goes through the log 180 00:09:03,360 --> 00:09:05,359 and looks at all the changes made by that action 181 00:09:05,359 --> 00:09:06,769 and undoes all of them. 182 00:09:06,769 --> 00:09:09,060 And then it writes the record, called the abort record, 183 00:09:09,060 --> 00:09:10,220 onto the log. 184 00:09:10,220 --> 00:09:12,420 So when you recover and you see an abort record, 185 00:09:12,420 --> 00:09:14,470 you already know that those changes 186 00:09:14,470 --> 00:09:18,224 made by an aborted action have been undone. 
187 00:09:18,224 --> 00:09:19,640 When you compose a list of winners 188 00:09:19,640 --> 00:09:22,860 that are committed actions and aborted actions, 189 00:09:22,860 --> 00:09:25,220 the only things you really need to redo 190 00:09:25,220 --> 00:09:28,704 are the steps corresponding to the committed actions. 191 00:09:28,704 --> 00:09:30,620 You don't have to undo the steps corresponding 192 00:09:30,620 --> 00:09:33,160 to the aborted actions because they already got undone. 193 00:09:33,160 --> 00:09:35,930 Because it was only after they got undone that the abort 194 00:09:35,930 --> 00:09:39,320 entry was written to the log. 195 00:09:39,320 --> 00:09:43,080 In addition to these committed and aborted actions 196 00:09:43,080 --> 00:09:45,610 that are winners, there are all other actions 197 00:09:45,610 --> 00:09:50,080 that were pending or active at the time of the crash, 198 00:09:50,080 --> 00:09:52,640 and those are losers. 199 00:09:52,640 --> 00:09:55,530 And so what you have to do to the losers 200 00:09:55,530 --> 00:10:01,160 is to undo the actions done by losers. 201 00:10:01,160 --> 00:10:06,020 So the actual recovery step runs after composing these winners 202 00:10:06,020 --> 00:10:13,040 and losers and it corresponds to redoing the committed winners 203 00:10:13,040 --> 00:10:15,230 and undoing the losers. 204 00:10:28,890 --> 00:10:31,256 And independent of crashes, independent 205 00:10:31,256 --> 00:10:33,630 of failures, when you just have actions that might abort, 206 00:10:33,630 --> 00:10:35,260 the only thing you really need is undo. 207 00:10:35,260 --> 00:10:36,620 You don't need to redo anything. 208 00:10:36,620 --> 00:10:38,650 If you don't ever have any failures but just actions 209 00:10:38,650 --> 00:10:40,816 aborting, the only thing you need to do with the log 210 00:10:40,816 --> 00:10:45,420 is to undo the results of uncommitted actions 211 00:10:45,420 --> 00:10:48,817 before abort returns. 
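The recovery pass described above (scan backwards, build winners and losers, undo the losers, redo the committed winners) could be sketched roughly as follows in Python. This is only an illustration under assumptions: the record layout `("UPDATE", action, var, old_value, new_value)`, the function name `recover`, and the dict standing in for cell store are all hypothetical, and concurrency between actions is ignored.

```python
def recover(log, cell_store):
    """Rebuild cell_store from the log after a crash (illustrative sketch)."""
    winners, losers, committed = set(), set(), set()

    # Backward scan: an action's COMMIT or ABORT record is seen before its updates.
    for record in reversed(log):
        kind, action = record[0], record[1]
        if kind in ("COMMIT", "ABORT"):
            winners.add(action)
            if kind == "COMMIT":
                committed.add(action)
        elif kind == "UPDATE" and action not in winners:
            # No outcome record was seen, so this action is a loser: undo its step.
            losers.add(action)
            _, _, var, old_value, _ = record
            cell_store[var] = old_value

    # Forward scan: redo the steps of committed winners only.
    # Aborted winners need nothing because they were undone before ABORT was logged.
    for record in log:
        if record[0] == "UPDATE" and record[1] in committed:
            _, _, var, _, new_value = record
            cell_store[var] = new_value

    return winners, losers

# Example: T1 committed but its write was still in the cache at the crash,
# T2 was still active (its write did reach disk), T3 aborted before the crash.
log = [
    ("UPDATE", "T1", "x", 0, 1), ("COMMIT", "T1"),
    ("UPDATE", "T2", "y", 0, 7),
    ("UPDATE", "T3", "z", 0, 9), ("ABORT", "T3"),
]
disk = {"x": 0, "y": 7, "z": 0}
winners, losers = recover(log, disk)
print(disk, winners, losers)
```

In this example the redo step installs T1's lost write to `x`, the undo step removes T2's uncommitted write to `y`, and T3 is left alone.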
212 00:10:48,817 --> 00:10:49,900 Now, this procedure is OK if 213 00:10:49,900 --> 00:10:51,950 failures happen rarely and you're 214 00:10:51,950 --> 00:10:55,060 willing to take a substantial amount of time 215 00:10:55,060 --> 00:10:58,550 to recover from a failure, since the log might be quite long. 216 00:10:58,550 --> 00:11:01,747 If a system fails once a week, your log might be quite long. 217 00:11:01,747 --> 00:11:03,330 And if you're willing to take the time 218 00:11:03,330 --> 00:11:06,240 to scan through the log entirely and build up these winners 219 00:11:06,240 --> 00:11:09,280 and losers list then this is fine, 220 00:11:09,280 --> 00:11:11,330 this approach works just fine. 221 00:11:11,330 --> 00:11:13,890 But people are often interested in optimizing the time it 222 00:11:13,890 --> 00:11:15,480 takes to recover from a crash. 223 00:11:15,480 --> 00:11:19,670 And a common optimization that's done is called a checkpoint. 224 00:11:19,670 --> 00:11:21,920 And I believe you've seen this in System R. 225 00:11:21,920 --> 00:11:23,378 And, if you haven't seen it, you'll 226 00:11:23,378 --> 00:11:26,012 probably see it in tomorrow's recitation. 227 00:11:26,012 --> 00:11:27,470 And the main idea in the checkpoint 228 00:11:27,470 --> 00:11:29,960 is it's an optimization that allows this recovery 229 00:11:29,960 --> 00:11:32,760 process not to have to scan the log all the way back to time 230 00:11:32,760 --> 00:11:34,170 zero. 231 00:11:34,170 --> 00:11:37,100 That the system periodically, while it's normally operating, 232 00:11:37,100 --> 00:11:39,010 takes this checkpoint, which is to write 233 00:11:39,010 --> 00:11:41,790 a special record into the log. 
234 00:11:41,790 --> 00:11:44,740 And that record basically says at this point in time 235 00:11:44,740 --> 00:11:47,290 here are all of the actions that have committed 236 00:11:47,290 --> 00:11:49,040 and whose results have already been 237 00:11:49,040 --> 00:11:52,305 installed into the cell store. 238 00:11:52,305 --> 00:11:54,180 In other words, they've committed and already 239 00:11:54,180 --> 00:11:55,810 been installed so when you recover 240 00:11:55,810 --> 00:11:58,270 you don't have to go and scan the log all the way back. 241 00:11:58,270 --> 00:12:00,750 You just have to go back enough so you 242 00:12:00,750 --> 00:12:03,330 find all of the actions whose results haven't yet 243 00:12:03,330 --> 00:12:08,370 been installed completely which have been committed. 244 00:12:08,370 --> 00:12:11,250 And the checkpoint also contains a list of all the actions 245 00:12:11,250 --> 00:12:13,390 that are currently active. 246 00:12:13,390 --> 00:12:16,320 So once you write a checkpoint record to the log, 247 00:12:16,320 --> 00:12:20,140 during recovery you don't have to go all the way back in time. 248 00:12:20,140 --> 00:12:22,170 There are a few other optimizations 249 00:12:22,170 --> 00:12:24,504 that you can do with checkpoints that are not 250 00:12:24,504 --> 00:12:25,920 that interesting to get into here, 251 00:12:25,920 --> 00:12:27,410 but the primary optimization that's 252 00:12:27,410 --> 00:12:30,070 done to speed up the recovery process 253 00:12:30,070 --> 00:12:32,150 is to use this checkpoint record. 254 00:12:32,150 --> 00:12:35,604 And in most database systems, the checkpoint record 255 00:12:35,604 --> 00:12:36,270 is pretty small. 256 00:12:36,270 --> 00:12:37,910 It's not like you're checkpointing the entire state 257 00:12:37,910 --> 00:12:38,720 of the database. 
258 00:12:38,720 --> 00:12:40,178 The main thing you're doing is it's 259 00:12:40,178 --> 00:12:42,460 a pretty small amount of state that you're using 260 00:12:42,460 --> 00:12:45,890 to just speed up recovery. 261 00:12:45,890 --> 00:12:48,230 So you shouldn't be thinking that the checkpoint is 262 00:12:48,230 --> 00:12:50,063 something where you take the entire database 263 00:12:50,063 --> 00:12:50,772 and copy it over. 264 00:12:50,772 --> 00:12:51,771 That's not what goes on. 265 00:12:51,771 --> 00:12:53,880 It's a pretty lightweight, small amount of state 266 00:12:53,880 --> 00:12:56,860 rather than the size of all of the data in the system. 267 00:12:59,420 --> 00:13:03,820 So that's the story behind recoverability. 268 00:13:03,820 --> 00:13:06,570 And we're actually going to come back to a piece of the story 269 00:13:06,570 --> 00:13:09,380 after we talk about isolation now because it will turn out 270 00:13:09,380 --> 00:13:10,929 that the mechanisms for isolation 271 00:13:10,929 --> 00:13:12,470 and the mechanisms for recoverability 272 00:13:12,470 --> 00:13:14,640 using logs interact in certain ways, 273 00:13:14,640 --> 00:13:17,560 so we have to come back to this either later today 274 00:13:17,560 --> 00:13:20,880 or on Wednesday. 275 00:13:20,880 --> 00:13:25,130 So now we're going to start talking about isolation. 276 00:13:25,130 --> 00:13:31,340 If you remember, the idea behind isolation 277 00:13:31,340 --> 00:13:33,960 is when you have a set of actions 278 00:13:33,960 --> 00:13:36,770 that run concurrently, what you would like 279 00:13:36,770 --> 00:13:44,400 is an equivalent ordering of the steps of the actions 280 00:13:44,400 --> 00:13:47,130 running concurrently such that the results are 281 00:13:47,130 --> 00:13:50,520 equivalent to some serial ordering of the actions. 282 00:13:50,520 --> 00:13:53,029 The simple way of describing isolation is do it all before 283 00:13:53,029 --> 00:13:53,820 or do it all after. 
284 00:13:53,820 --> 00:13:58,660 If you have actions, let's say T1, T2 and so 285 00:13:58,660 --> 00:14:00,040 on running at the same time. 286 00:14:00,040 --> 00:14:05,030 And these actions might operate on some data that is in common, 287 00:14:05,030 --> 00:14:08,030 they might act on data that's not at all in common, 288 00:14:08,030 --> 00:14:11,192 what you would like is that the state 289 00:14:11,192 --> 00:14:13,650 of the system after running these concurrent actions should 290 00:14:13,650 --> 00:14:18,690 be equivalent to some serial ordering of the actions. 291 00:14:18,690 --> 00:14:21,060 And it will turn out that what's tricky about isolation 292 00:14:21,060 --> 00:14:23,470 is that if you want isolation and you 293 00:14:23,470 --> 00:14:26,450 don't care about performance it's very, very easy to do. 294 00:14:26,450 --> 00:14:28,990 So it's very easy to get isolation 295 00:14:28,990 --> 00:14:30,490 if you don't care about performance. 296 00:14:30,490 --> 00:14:31,864 It's very easy to run fast if you 297 00:14:31,864 --> 00:14:34,030 don't care about correctness. 298 00:14:34,030 --> 00:14:36,660 So, you know, fast and correct is the hard problem. 299 00:14:36,660 --> 00:14:39,850 Fast and not correct is trivial and correct and slow 300 00:14:39,850 --> 00:14:40,740 is also trivial. 301 00:14:40,740 --> 00:14:42,240 I mean correct and slow is very easy 302 00:14:42,240 --> 00:14:44,569 because you could take all of these concurrent actions 303 00:14:44,569 --> 00:14:46,110 and just run them one after the other 304 00:14:46,110 --> 00:14:47,790 so you don't actually take advantage 305 00:14:47,790 --> 00:14:51,640 of any potential concurrency that might be possible. 306 00:14:51,640 --> 00:14:54,367 So certainly slow and correct is very easy to do. 307 00:14:54,367 --> 00:14:56,450 And we'll actually start out with slow and correct 308 00:14:56,450 --> 00:15:01,690 and then optimize a simple scheme. 
309 00:15:01,690 --> 00:15:04,580 There has also been a huge amount of work 310 00:15:04,580 --> 00:15:06,700 that's been done on fast and correct schemes. 311 00:15:06,700 --> 00:15:10,190 And, in the end, they all boil down to this one basic idea 312 00:15:10,190 --> 00:15:16,579 that we'll talk about toward the end of lecture today. 313 00:15:16,579 --> 00:15:17,620 So let's take an example. 314 00:15:17,620 --> 00:15:18,911 Let's say you have two actions. 315 00:15:18,911 --> 00:15:22,570 I'm going to call them with Ts because very often these 316 00:15:22,570 --> 00:15:24,930 are intended to be transactions which 317 00:15:24,930 --> 00:15:27,230 have consistency and durability in addition 318 00:15:27,230 --> 00:15:29,312 to isolation and recoverability. 319 00:15:29,312 --> 00:15:31,770 So I'm just going to use the word T to represent an action. 320 00:15:34,385 --> 00:15:35,760 Let's say you have an action that 321 00:15:35,760 --> 00:15:42,970 does read of some variable x and it does write of a variable y 322 00:15:42,970 --> 00:15:46,520 and you have transaction T2 that does write x 323 00:15:46,520 --> 00:15:50,567 and it does write y. 324 00:15:50,567 --> 00:15:52,900 So we're going to take a few examples like this in order 325 00:15:52,900 --> 00:15:56,290 to understand what it means for actions to run such 326 00:15:56,290 --> 00:15:58,440 that their steps are equivalent to some serial order. 327 00:15:58,440 --> 00:16:00,620 So we're going to spend some time really understanding that. 328 00:16:00,620 --> 00:16:02,380 And then, once we understand what we want, 329 00:16:02,380 --> 00:16:04,421 it will turn out to be relatively easy to come up 330 00:16:04,421 --> 00:16:09,480 with schemes to achieve it. 331 00:16:09,480 --> 00:16:11,610 Let's say that what happens here is 332 00:16:11,610 --> 00:16:15,710 that the system gets presented with these concurrent actions. 333 00:16:15,710 --> 00:16:18,580 And let's assume each of these is an atomic step. 
334 00:16:18,580 --> 00:16:21,150 So there are four steps that can be interleaved in arbitrary 335 00:16:21,150 --> 00:16:23,210 ways, in any number of ways. 336 00:16:23,210 --> 00:16:25,377 Let's say that what happens is you run that first 337 00:16:25,377 --> 00:16:27,710 and then you run that second and then you run this third 338 00:16:27,710 --> 00:16:29,970 and then you run this fourth. 339 00:16:29,970 --> 00:16:31,525 So what you get is that I'm going 340 00:16:31,525 --> 00:16:33,400 to introduce a little bit of a notation here. 341 00:16:33,400 --> 00:16:47,130 I'm going to write these steps as r1(x), w2(x), w1(y) 342 00:16:47,130 --> 00:16:47,805 and w2(y). 343 00:16:50,520 --> 00:16:53,030 What the r represents is a read, w 344 00:16:53,030 --> 00:16:56,260 represents a write, what the subscripts represent 345 00:16:56,260 --> 00:17:00,090 is the identifier of the action doing the read or write. 346 00:17:00,090 --> 00:17:03,070 And what's in parentheses is the variable that's being read or written. 347 00:17:03,070 --> 00:17:07,390 So this says that action one does a read of x, action two 348 00:17:07,390 --> 00:17:12,280 does a write of x, action one does a write of y 349 00:17:12,280 --> 00:17:15,369 and action two does a write of y. 350 00:17:18,050 --> 00:17:20,935 Now, if you look at these different actions, 351 00:17:20,935 --> 00:17:23,060 there are a few steps that conflict with each other 352 00:17:23,060 --> 00:17:25,230 and a few that don't. 353 00:17:25,230 --> 00:17:27,410 If you look at the read of x and the write of y, 354 00:17:27,410 --> 00:17:29,060 they're independent of everything else. 355 00:17:29,060 --> 00:17:31,670 If you just have read x here and write y here, 356 00:17:31,670 --> 00:17:33,200 they don't conflict with each other 357 00:17:33,200 --> 00:17:34,830 because those can happen in any order 358 00:17:34,830 --> 00:17:37,780 and the results are exactly the same. 
359 00:17:37,780 --> 00:17:39,480 Similarly, the write y and the write x 360 00:17:39,480 --> 00:17:41,294 don't conflict with each other. 361 00:17:41,294 --> 00:17:43,460 The only things that really conflict with each other 362 00:17:43,460 --> 00:17:47,230 are the read x and the write x in this example 363 00:17:47,230 --> 00:17:51,240 because the results depend on which goes first. 364 00:17:51,240 --> 00:17:53,460 Write y conflicts with this write y, 365 00:17:53,460 --> 00:17:55,230 and those are basically the two things 366 00:17:55,230 --> 00:17:58,970 that conflict with each other in this example. 367 00:17:58,970 --> 00:18:02,680 Generally, two actions 368 00:18:02,680 --> 00:18:05,280 conflict with one another if between them they contain 369 00:18:05,280 --> 00:18:08,980 a read and a write of the same variable or a write 370 00:18:08,980 --> 00:18:10,880 and a write of the same variable. 371 00:18:10,880 --> 00:18:13,120 So there are basically three things that conflict. 372 00:18:13,120 --> 00:18:17,000 If you have some variable, call it z, if you have a read of z 373 00:18:17,000 --> 00:18:20,450 and a write of z in one action or the other, 374 00:18:20,450 --> 00:18:25,710 if you have write of z and read of z or if you have write of z 375 00:18:25,710 --> 00:18:29,759 and write of z, those conflict with one another. 376 00:18:29,759 --> 00:18:31,300 Now a read and a read don't conflict. 377 00:18:31,300 --> 00:18:33,570 Because it doesn't matter what order they run in, 378 00:18:33,570 --> 00:18:35,361 they are going to give you the same answer. 379 00:18:37,560 --> 00:18:44,050 If you look at this ordering of r1(x), w2(x), w1(y) 380 00:18:44,050 --> 00:18:47,220 and w2(y), the things that conflict 381 00:18:47,220 --> 00:18:50,550 are these two and these two. 
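To make the conflict rule concrete, here is a small Python sketch that lists the conflicting pairs of steps in a trace. The tuple encoding `(op, action, variable)` and the function name `conflicts` are assumptions chosen for this illustration, not notation from the lecture; `("r", 1, "x")` stands for r1(x).

```python
def conflicts(trace):
    """Return pairs of steps (earlier, later) that conflict with each other.

    Two steps conflict when they come from different actions, touch the
    same variable, and at least one of them is a write.
    """
    pairs = []
    for i, (op_a, act_a, var_a) in enumerate(trace):
        for op_b, act_b, var_b in trace[i + 1:]:
            if act_a != act_b and var_a == var_b and "w" in (op_a, op_b):
                pairs.append(((op_a, act_a, var_a), (op_b, act_b, var_b)))
    return pairs

# The trace from the board: r1(x), w2(x), w1(y), w2(y)
trace = [("r", 1, "x"), ("w", 2, "x"), ("w", 1, "y"), ("w", 2, "y")]
for earlier, later in conflicts(trace):
    print(earlier, "before", later)
```

For this trace the sketch reports exactly the two conflicts pointed out above: the read and write of x, and the two writes of y.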
382 00:18:53,450 --> 00:18:55,880 Now, if you look at the two things 383 00:18:55,880 --> 00:19:00,740 that conflict and you draw arrows from which one happens 384 00:19:00,740 --> 00:19:02,780 before the other, what you'll find 385 00:19:02,780 --> 00:19:08,570 is that for this x conflict one runs before two and for this y 386 00:19:08,570 --> 00:19:10,430 conflict one runs before two. 387 00:19:10,430 --> 00:19:13,050 So the arrows, if I were to draw them in time order, 388 00:19:13,050 --> 00:19:15,750 would point this way. 389 00:19:15,750 --> 00:19:17,490 And this ordering is basically the same 390 00:19:17,490 --> 00:19:19,180 as the ordering you would get, 391 00:19:19,180 --> 00:19:21,360 even though the steps run in different order, 392 00:19:21,360 --> 00:19:25,840 it's exactly the same as if you ran T1 completely before T2. 393 00:19:25,840 --> 00:19:28,120 Because running T1 completely before T2 says 394 00:19:28,120 --> 00:19:33,630 the ordering is r1(x), w1(y), w2(x) and w2(y). 395 00:19:33,630 --> 00:19:35,490 But the results are exactly the same 396 00:19:35,490 --> 00:19:37,610 with this different interleaving, which 397 00:19:37,610 --> 00:19:41,390 does one, two, three, four. 398 00:19:41,390 --> 00:19:43,570 So what this says is that this trace that you get, 399 00:19:43,570 --> 00:19:45,480 we're going to use the word trace for this, 400 00:19:45,480 --> 00:19:48,880 you present these concurrent actions each of which 401 00:19:48,880 --> 00:19:52,510 has one or more steps and the system runs them. 402 00:19:52,510 --> 00:19:54,520 And then the order in which the individual steps 403 00:19:54,520 --> 00:19:56,390 run produces a trace. 404 00:19:56,390 --> 00:19:58,480 And then the question is whether that trace 405 00:19:58,480 --> 00:20:00,790 is what is called serializable. 
406 00:20:00,790 --> 00:20:05,860 A trace is serializable if the trace's results 407 00:20:05,860 --> 00:20:09,600 are identical to running the actions in some serial order 408 00:20:09,600 --> 00:20:11,241 one after the other. 409 00:20:11,241 --> 00:20:12,990 And so what we're going to be trying to do 410 00:20:12,990 --> 00:20:15,960 is, what we're going to do today is 411 00:20:15,960 --> 00:20:19,889 to come up with schemes which take these different steps 412 00:20:19,889 --> 00:20:22,430 corresponding to the action that produces an order that turns 413 00:20:22,430 --> 00:20:24,090 out to be a serializable order. 414 00:20:24,090 --> 00:20:26,340 And the challenge is to do it in a way that allows you 415 00:20:26,340 --> 00:20:29,730 to get reasonable performance. 416 00:20:29,730 --> 00:20:33,380 To give you an example of a nonserializable order, if we 417 00:20:33,380 --> 00:20:42,910 had the following trace, r1(x) followed by w2(x) 418 00:20:42,910 --> 00:20:56,090 followed by w2(y) followed by w1(y). 419 00:20:56,090 --> 00:20:59,240 What happens here is that these two guys 420 00:20:59,240 --> 00:21:03,970 conflict so you've got an arrow from one to two going this way. 421 00:21:03,970 --> 00:21:07,970 And, similarly, these two guys conflict but the arrow in time 422 00:21:07,970 --> 00:21:11,830 goes from two to one, which means 423 00:21:11,830 --> 00:21:13,810 that if you drew the arrow one to two this way, 424 00:21:13,810 --> 00:21:16,330 you would have an arrow going the opposite direction. 425 00:21:16,330 --> 00:21:18,290 So this doesn't correspond to any serial order 426 00:21:18,290 --> 00:21:22,010 because, as far as this conflict is concerned, 427 00:21:22,010 --> 00:21:25,861 this trace says that action one should run before action two. 428 00:21:25,861 --> 00:21:27,610 And, as far as this conflict is concerned, 429 00:21:27,610 --> 00:21:31,580 it says that action two should run before action one. 
430 00:21:31,580 --> 00:21:34,120 Which means you're in trouble because there is really 431 00:21:34,120 --> 00:21:35,990 no serial ordering here. 432 00:21:35,990 --> 00:21:40,760 This trace does not correspond to either T1 before T2 or T2 433 00:21:40,760 --> 00:21:44,370 before T1, which means that 434 00:21:44,370 --> 00:21:47,351 if you have a scheme that runs the steps 435 00:21:47,351 --> 00:21:49,100 of your actions and produces this trace, it 436 00:21:49,100 --> 00:21:52,717 means that that scheme does not provide isolation. 437 00:21:52,717 --> 00:21:54,800 Notice that we don't actually care whether T1 runs 438 00:21:54,800 --> 00:21:57,000 before T2 or T2 runs before T1. 439 00:21:57,000 --> 00:21:58,244 We're not worried about that. 440 00:21:58,244 --> 00:21:59,660 We're just worried about producing 441 00:21:59,660 --> 00:22:03,000 some equivalent serial order. 442 00:22:09,630 --> 00:22:16,870 So that's the definition of the property that we want, 443 00:22:16,870 --> 00:22:18,010 serializability. 444 00:22:18,010 --> 00:22:21,390 And what it says is a trace whose conflict arrows, 445 00:22:21,390 --> 00:22:24,210 these are these conflict arrows, are 446 00:22:24,210 --> 00:22:28,060 equivalent to some serial ordering 447 00:22:28,060 --> 00:22:30,950 of the steps of the action. 448 00:22:30,950 --> 00:22:39,790 What we want is a trace whose conflicts 449 00:22:39,790 --> 00:22:46,170 are in the same order as in some serial schedule 450 00:22:46,170 --> 00:22:51,765 or some serial order of the actions. 451 00:22:58,780 --> 00:23:02,770 So what we're going to do is in three parts. 452 00:23:02,770 --> 00:23:06,940 The first part is we're going to look at one of these traces. 453 00:23:06,940 --> 00:23:08,472 And given a trace, the first problem 454 00:23:08,472 --> 00:23:10,430 is to figure out whether that trace corresponds 455 00:23:10,430 --> 00:23:12,760 to some serial order.
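The two-action examples above can be checked mechanically. What follows is only a minimal sketch, not anything shown in the lecture: the tuple encoding of steps and the function names `conflict_edges` and `two_action_serializable` are made up for illustration, and the "good" interleaving r1(x), w2(x), w1(y), w2(y) is an assumption about what was on the board. Two steps are treated as conflicting when they come from different actions, touch the same variable, and at least one of them is a write.

```python
def conflict_edges(trace):
    """Return the set of (earlier_action, later_action) pairs, one per conflict direction."""
    edges = set()
    for i, (a1, op1, var1) in enumerate(trace):
        for a2, op2, var2 in trace[i + 1:]:
            # Conflict: different actions, same variable, at least one write.
            if a1 != a2 and var1 == var2 and "w" in (op1, op2):
                edges.add((a1, a2))
    return edges


def two_action_serializable(trace):
    """For two actions: serializable iff no conflict arrow points both ways."""
    edges = conflict_edges(trace)
    return not any((b, a) in edges for (a, b) in edges)


# Assumed interleaving for the serializable example: every arrow goes 1 -> 2.
good = [(1, "r", "x"), (2, "w", "x"), (1, "w", "y"), (2, "w", "y")]
# The nonserializable trace from the lecture: r1(x), w2(x), w2(y), w1(y).
bad = [(1, "r", "x"), (2, "w", "x"), (2, "w", "y"), (1, "w", "y")]

print(conflict_edges(good), two_action_serializable(good))
print(conflict_edges(bad), two_action_serializable(bad))
```

In the second trace the x conflict says action one goes first and the y conflict says action two goes first, which is exactly the pair of opposing arrows described above.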
456 00:23:12,760 --> 00:23:16,310 And then we're going to derive a property such that, 457 00:23:16,310 --> 00:23:18,950 if that property holds, we 458 00:23:18,950 --> 00:23:21,864 would be assured that a trace is equivalent to some serial order. 459 00:23:21,864 --> 00:23:23,280 And then the second part is we are 460 00:23:23,280 --> 00:23:25,405 going to come up with various schemes for achieving 461 00:23:25,405 --> 00:23:26,587 serializability. 462 00:23:26,587 --> 00:23:29,170 And the third part is in order to prove that those schemes are 463 00:23:29,170 --> 00:23:32,300 correct, what we are going to do is 464 00:23:32,300 --> 00:23:35,600 to prove that this property, which all serial orderings 465 00:23:35,600 --> 00:23:38,660 should satisfy, holds for the protocol or for the algorithm 466 00:23:38,660 --> 00:23:40,390 that we design. 467 00:23:40,390 --> 00:23:44,970 That is the plan for the rest of today. 468 00:23:44,970 --> 00:23:49,640 This property for serializability 469 00:23:49,640 --> 00:23:53,870 is going to use a data structure or a construction called 470 00:23:53,870 --> 00:23:55,492 an action graph. 471 00:23:55,492 --> 00:23:57,200 And it turns out what we are going to do, 472 00:23:57,200 --> 00:24:00,330 given one of these traces, is produce a graph out 473 00:24:00,330 --> 00:24:03,330 of those traces called the action graph. 474 00:24:03,330 --> 00:24:06,050 And then we are going to look to see whether a certain property 475 00:24:06,050 --> 00:24:09,890 holds for that action graph. 476 00:24:09,890 --> 00:24:12,480 Let me show you what this action graph is by example. 477 00:24:12,480 --> 00:24:14,930 The graph itself consists of nodes and edges 478 00:24:14,930 --> 00:24:15,890 [UNINTELLIGIBLE]. 479 00:24:15,890 --> 00:24:19,160 And the nodes are not these r1s and w2s. 480 00:24:19,160 --> 00:24:21,640 What the nodes are, are the actions themselves.
481 00:24:21,640 --> 00:24:24,930 So, if you have four actions running, 482 00:24:24,930 --> 00:24:27,180 you have four nodes on the graph. 483 00:24:27,180 --> 00:24:30,657 And then there are edges between these nodes on the graph. 484 00:24:30,657 --> 00:24:31,740 Let's do it by an example. 485 00:24:31,740 --> 00:24:36,990 Let's say you have first action T1 which 486 00:24:36,990 --> 00:24:41,110 has read of x and write of y. 487 00:24:41,110 --> 00:24:43,120 And just so we are sure that it is action one, 488 00:24:43,120 --> 00:24:45,440 I'm going to draw one underneath as a subscript 489 00:24:45,440 --> 00:24:47,730 because they're just reads. 490 00:24:47,730 --> 00:24:50,980 Action two has a write of x and a write of y. 491 00:24:55,210 --> 00:25:00,000 Action three has a read of y and a write of another variable z. 492 00:25:03,300 --> 00:25:08,033 And action four has a read of x. 493 00:25:17,880 --> 00:25:23,110 First of all, given these actions, which of the actions 494 00:25:23,110 --> 00:25:24,590 conflict with each other? 495 00:25:24,590 --> 00:25:28,280 Let's first write for T1. 496 00:25:28,280 --> 00:25:29,680 Does T2 conflict with T1? 497 00:25:32,270 --> 00:25:34,039 Yes it does because the read x, I 498 00:25:34,039 --> 00:25:35,580 mean that's the same as that example, 499 00:25:35,580 --> 00:25:37,780 so certainly T2 conflicts with T1. 500 00:25:37,780 --> 00:25:39,300 What about T3? 501 00:25:39,300 --> 00:25:40,880 Does T3 conflict with T1? 502 00:25:40,880 --> 00:25:43,680 What that means is the interleaving 503 00:25:43,680 --> 00:25:46,210 of the individual steps matters as far as the final answer is 504 00:25:46,210 --> 00:25:48,452 concerned. 505 00:25:48,452 --> 00:25:49,910 Yes, it does because the write of y 506 00:25:49,910 --> 00:25:53,290 and the read of y conflict, so you've got T1 and T3 507 00:25:53,290 --> 00:25:54,860 that you have to worry about. 508 00:25:54,860 --> 00:25:56,125 Does T1 conflict with T4?
509 00:25:58,677 --> 00:25:59,260 No it doesn't. 510 00:25:59,260 --> 00:26:00,930 The read and the read do not conflict. 511 00:26:00,930 --> 00:26:03,990 Does T2 conflict with T3? 512 00:26:03,990 --> 00:26:06,300 Yes, it does because it has got the y. 513 00:26:09,690 --> 00:26:12,080 Does T2 conflict with T4? 514 00:26:12,080 --> 00:26:12,675 It does. 515 00:26:15,460 --> 00:26:17,160 And does T3 conflict with T4? 516 00:26:17,160 --> 00:26:17,660 It does not. 517 00:26:17,660 --> 00:26:19,340 There's nothing that's shared. 518 00:26:19,340 --> 00:26:22,380 Out of the six possible, or whatever, four, 519 00:26:22,380 --> 00:26:24,480 choose two possible conflicts, for conflicts 520 00:26:24,480 --> 00:26:26,771 you've got four of them that you've got to worry about. 521 00:26:29,095 --> 00:26:30,720 Now we're going to draw this graph that 522 00:26:30,720 --> 00:26:33,990 has T1, T2, T3 and T4, and we're going 523 00:26:33,990 --> 00:26:36,110 to call this the action graph. 524 00:26:39,550 --> 00:26:43,610 And what it's going to do is to draw arrows between actions 525 00:26:43,610 --> 00:26:45,460 that conflict with one another. 526 00:26:45,460 --> 00:26:47,050 But of course the answer, the arrows 527 00:26:47,050 --> 00:26:50,530 depend on the order in which these individual steps get 528 00:26:50,530 --> 00:26:54,640 scheduled by the system running these actions. 529 00:26:54,640 --> 00:26:57,710 We need an actual example for that. 530 00:26:57,710 --> 00:26:59,460 Let's say that what happens is you present 531 00:26:59,460 --> 00:27:00,640 these concurrent actions. 532 00:27:00,640 --> 00:27:05,940 And what happens is this guy runs first and then two, three, 533 00:27:05,940 --> 00:27:11,950 four, five, six and seven. 534 00:27:15,070 --> 00:27:16,570 That's the order in which the system 535 00:27:16,570 --> 00:27:18,870 runs the individual steps of this action. 536 00:27:21,690 --> 00:27:25,910 Now we're going to draw these arrows between actions. 
537 00:27:25,910 --> 00:27:29,960 If two actions share a conflicting operation 538 00:27:29,960 --> 00:27:35,290 and in the first action the conflicting operation 539 00:27:35,290 --> 00:27:36,946 occurs before the second action then 540 00:27:36,946 --> 00:27:39,070 you're going to draw an arrow from the first action 541 00:27:39,070 --> 00:27:39,944 to the second action. 542 00:27:39,944 --> 00:27:44,440 So more generally there's an arrow from Ti to Tj 543 00:27:44,440 --> 00:27:50,690 if Ti and Tj have a conflicting operation 544 00:27:50,690 --> 00:27:54,980 whose individual step is run by the system for Ti 545 00:27:54,980 --> 00:27:57,970 first, before Tj. 546 00:27:57,970 --> 00:27:59,800 If you look here at T1 to T2 there's 547 00:27:59,800 --> 00:28:02,260 an arrow between T1 and T2. 548 00:28:02,260 --> 00:28:07,740 Because it ran r1(x) before it ran w2(x). 549 00:28:07,740 --> 00:28:13,850 If you look at T1 and T3, this is a little subtle 550 00:28:13,850 --> 00:28:18,630 because it ran read of x before it ran r3(y), 551 00:28:18,630 --> 00:28:21,700 but that doesn't matter because read of x there and read of y 552 00:28:21,700 --> 00:28:23,850 there don't conflict. 553 00:28:23,850 --> 00:28:26,870 But then it ran w1(y) after it ran 554 00:28:26,870 --> 00:28:31,280 r3(y), which means that the conflict is 555 00:28:31,280 --> 00:28:34,720 equivalent to running that step of T3 before the step of T1. 556 00:28:34,720 --> 00:28:38,820 So you actually have an arrow going back from T3 to T1. 557 00:28:41,770 --> 00:28:45,190 Now what about T2 and T3? 558 00:28:45,190 --> 00:28:49,160 T2 and T3, the same story, w2(x) and r3(y) don't conflict. 559 00:28:49,160 --> 00:28:52,820 But r3(y) before w2(y), that's the conflict, 560 00:28:52,820 --> 00:28:55,237 so you have an arrow going this way. 561 00:28:55,237 --> 00:28:57,320 So we've got three of them, you need a fourth one, 562 00:28:57,320 --> 00:28:59,550 and that is between T2 and T4.
563 00:28:59,550 --> 00:29:03,820 Between T2 and T4, w2(x) runs before r4(x) 564 00:29:03,820 --> 00:29:09,490 which means you have an arrow going this way. 565 00:29:09,490 --> 00:29:14,080 If you actually look at this picture, 566 00:29:14,080 --> 00:29:16,830 and we'll come up with a method to systematically argue 567 00:29:16,830 --> 00:29:19,220 this point, but if you look at that schedule, 568 00:29:19,220 --> 00:29:21,460 as shown here, where the system runs 569 00:29:21,460 --> 00:29:23,240 the individual steps in that order, 570 00:29:23,240 --> 00:29:26,640 this is actually equivalent to T3 571 00:29:26,640 --> 00:29:29,140 running and then T1 running and then 572 00:29:29,140 --> 00:29:31,530 T2 running and then T4 running. 573 00:29:31,530 --> 00:29:33,737 The order of interleaving these different steps 574 00:29:33,737 --> 00:29:36,070 in the way shown in that picture, one, two, three, four, 575 00:29:36,070 --> 00:29:38,756 five, six, seven is actually equivalent to the same result, 576 00:29:38,756 --> 00:29:40,130 it's the same result that you get 577 00:29:40,130 --> 00:29:42,850 if you run T3 completely and then T1 and then T2 and then 578 00:29:42,850 --> 00:29:43,670 T4. 579 00:29:43,670 --> 00:29:45,390 In fact, if you think about it, it's 580 00:29:45,390 --> 00:29:48,510 also equivalent to the same ordering 581 00:29:48,510 --> 00:29:51,270 that you get if you run T3 and then you run T1 582 00:29:51,270 --> 00:29:53,590 and then you run T4 and then you run T2. 583 00:29:56,610 --> 00:29:59,370 That just says that for the exact same scheduling 584 00:29:59,370 --> 00:30:01,660 of individual steps in the concurrent actions 585 00:30:01,660 --> 00:30:40,640 you might find multiple equivalent serial 586 00:30:40,640 --> 00:30:57,750 orders that all give you the same answer. 587 00:31:00,580 --> 00:31:18,920 Is that clear? 588 00:31:18,920 --> 00:31:26,500 The arrow from two to four is correct. 
589 00:31:26,500 --> 00:31:28,540 Write of x runs before read of x. 590 00:31:33,605 --> 00:31:34,730 So I think this is correct. 591 00:31:34,730 --> 00:31:37,150 Is there a problem? 592 00:31:37,150 --> 00:31:39,720 All right. 593 00:31:39,720 --> 00:31:42,350 So, in this example, there aren't multiple serial orders. 594 00:31:42,350 --> 00:31:44,020 But in general there 595 00:31:44,020 --> 00:31:45,520 can be multiple serial orders possible. 596 00:31:53,454 --> 00:31:55,620 What does this action graph have to do with anything? 597 00:31:58,550 --> 00:32:01,170 It turns out, and we'll prove this, 598 00:32:01,170 --> 00:32:03,370 that if the action graph-- 599 00:32:09,710 --> 00:32:13,260 --for any ordering of steps within an action, 600 00:32:13,260 --> 00:32:15,935 if the action graph does not have a cycle-- 601 00:32:21,960 --> 00:32:24,400 --then the corresponding trace from which the action graph 602 00:32:24,400 --> 00:32:27,010 was derived is serializable. 603 00:32:36,270 --> 00:32:41,700 What that means is that what you have to do 604 00:32:41,700 --> 00:32:43,900 is, given a certain ordering of steps, 605 00:32:43,900 --> 00:32:45,526 you construct this action graph, you 606 00:32:45,526 --> 00:32:46,900 look to see if it has any cycles. 607 00:32:46,900 --> 00:32:49,120 And, if it doesn't have any cycles, 608 00:32:49,120 --> 00:32:53,900 then you know that the order is equivalent to some serial order 609 00:32:53,900 --> 00:32:57,861 of the steps of the individual actions. 610 00:32:57,861 --> 00:33:00,360 And it actually turns out the result is a bit more powerful. 611 00:33:00,360 --> 00:33:03,410 The converse is also true that if you 612 00:33:03,410 --> 00:33:08,020 have a serializable trace then the corresponding action graph 613 00:33:08,020 --> 00:33:09,770 has no cycles.
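As a rough sketch of the construction just described, here is the four-action example built into an action graph. Everything about the encoding is an assumption made for illustration: the transcript does not record the board's exact numbering of the seven steps, so the order below is just one order consistent with the arrows described (r1(x) before w2(x), r3(y) before w1(y) and w2(y), w2(x) before r4(x)), and `action_graph` and `compatible_serial_orders` are hypothetical helper names.

```python
from itertools import permutations

# Each step is (action, operation, variable). This seven-step order is an
# assumption chosen to match the arrows described in the lecture.
trace = [
    ("T1", "r", "x"),
    ("T2", "w", "x"),
    ("T3", "r", "y"),
    ("T4", "r", "x"),
    ("T1", "w", "y"),
    ("T2", "w", "y"),
    ("T3", "w", "z"),
]


def action_graph(trace):
    """Arrow Ti -> Tj whenever a step of Ti runs before a conflicting step of Tj."""
    arrows = set()
    for i, (a1, op1, v1) in enumerate(trace):
        for a2, op2, v2 in trace[i + 1:]:
            if a1 != a2 and v1 == v2 and "w" in (op1, op2):
                arrows.add((a1, a2))
    return arrows


def compatible_serial_orders(actions, arrows):
    """Brute force: keep every serial order that respects every arrow."""
    result = []
    for order in permutations(actions):
        position = {a: i for i, a in enumerate(order)}
        if all(position[src] < position[dst] for src, dst in arrows):
            result.append(order)
    return result


arrows = action_graph(trace)
orders = compatible_serial_orders(["T1", "T2", "T3", "T4"], arrows)
print(sorted(arrows))
print(orders)
```

Under these assumptions the only serial order that respects all four arrows is T3, T1, T2, T4; the arrow from T2 to T4 rules out running T4 before T2.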
614 00:33:09,770 --> 00:33:12,080 Now, it turns out, the more interesting result for us 615 00:33:12,080 --> 00:33:13,710 is going to be the following direction 616 00:33:13,710 --> 00:33:17,714 where if the graph is acyclic then the trace is serializable. 617 00:33:17,714 --> 00:33:19,130 Because what we're going to end up 618 00:33:19,130 --> 00:33:25,780 doing is inventing one or two protocols for achieving 619 00:33:25,780 --> 00:33:27,769 isolation, for achieving serializability. 620 00:33:27,769 --> 00:33:29,810 And we're going to prove that those protocols are 621 00:33:29,810 --> 00:33:33,100 correct by proving that the corresponding action graph, all 622 00:33:33,100 --> 00:33:35,950 of the possible action graphs produced by those protocols 623 00:33:35,950 --> 00:33:37,930 all have no cycles. 624 00:33:37,930 --> 00:33:40,080 So this direction is the direction that's actually 625 00:33:40,080 --> 00:33:42,610 more important for us, but the opposite 626 00:33:42,610 --> 00:33:45,900 is also true and not that hard to prove. 627 00:33:45,900 --> 00:33:48,640 So what is the intuition behind why, 628 00:33:48,640 --> 00:33:51,180 if you have an action graph that's serializable, 629 00:33:51,180 --> 00:33:55,660 sorry, that doesn't have cycles the trace is serializable? 630 00:33:55,660 --> 00:33:57,470 Well, notice one thing about this 631 00:33:57,470 --> 00:34:01,030 which is to draw a little bit of intuition. 632 00:34:01,030 --> 00:34:03,020 Suppose, in fact, what happened here 633 00:34:03,020 --> 00:34:05,920 was we didn't execute the actions, the steps 634 00:34:05,920 --> 00:34:11,600 in that order, but what we did was to run this at step five 635 00:34:11,600 --> 00:34:14,760 and that at step six. 636 00:34:14,760 --> 00:34:18,130 What would happen with the resulting action graph 637 00:34:18,130 --> 00:34:22,960 is that you would actually have an arc from T2 to T1 638 00:34:22,960 --> 00:34:24,190 going the other way as well.
639 00:34:26,759 --> 00:34:28,550 Because what this says is between T1 and T2 640 00:34:28,550 --> 00:34:30,020 there is one conflicting operation 641 00:34:30,020 --> 00:34:33,500 that goes this way where action one runs before two. 642 00:34:33,500 --> 00:34:37,920 And these two guys conflict where the step in two 643 00:34:37,920 --> 00:34:40,370 runs before the step in one and those two steps 644 00:34:40,370 --> 00:34:42,330 conflict with one another. 645 00:34:42,330 --> 00:34:43,830 And this is actually the cycle here 646 00:34:43,830 --> 00:34:47,810 that causes the whole scheme to be not serializable anymore. 647 00:34:51,100 --> 00:34:54,750 That's a little bit of intuition as to why this acyclic property 648 00:34:54,750 --> 00:34:55,880 is important. 649 00:34:55,880 --> 00:34:59,840 But to really prove this notice that if you 650 00:34:59,840 --> 00:35:03,020 have a directed acyclic graph you could do something 651 00:35:03,020 --> 00:35:05,400 called a topological sort on the graph. 652 00:35:05,400 --> 00:35:08,140 How many people know what a topological sort is? 653 00:35:08,140 --> 00:35:08,710 OK. 654 00:35:08,710 --> 00:35:12,570 For those who don't, the idea is very simple. 655 00:35:12,570 --> 00:35:17,170 In any directed acyclic graph there 656 00:35:17,170 --> 00:35:19,460 is going to be at least one node that 657 00:35:19,460 --> 00:35:22,060 has no arrows coming into it. 658 00:35:22,060 --> 00:35:24,870 All of its arrows are only going out of the node. 659 00:35:24,870 --> 00:35:27,790 And you can actually prove that by arguing by contradiction. 660 00:35:27,790 --> 00:35:30,560 If it turns out that every node has an arc coming in 661 00:35:30,560 --> 00:35:32,920 and an arc going out then by traversing 662 00:35:32,920 --> 00:35:36,159 that chain of pointers you'll end up with a cycle.
663 00:35:36,159 --> 00:35:38,700 So it's pretty easy to see that in any directed acyclic graph 664 00:35:38,700 --> 00:35:40,290 you're going to have some node that 665 00:35:40,290 --> 00:35:44,400 has no arrows coming into it and only arrows going out of it. 666 00:35:44,400 --> 00:35:48,510 So find that action, in this picture it is T3, 667 00:35:48,510 --> 00:35:52,270 and take that action and put it in first. 668 00:35:52,270 --> 00:35:54,650 That's the first action that you run. 669 00:35:54,650 --> 00:35:56,310 Because it has no arrows coming into it 670 00:35:56,310 --> 00:35:58,850 and only arrows going out, what that means 671 00:35:58,850 --> 00:36:03,420 is that no other action in a serial order runs before it. 672 00:36:03,420 --> 00:36:05,344 Or at least there's no reason for any action 673 00:36:05,344 --> 00:36:06,760 to run before it because there are 674 00:36:06,760 --> 00:36:09,770 no arrows from any other action coming into this action. 675 00:36:09,770 --> 00:36:10,945 So you put that in first. 676 00:36:13,480 --> 00:36:15,970 Now, remove that node from the entire graph. 677 00:36:15,970 --> 00:36:17,780 The resulting graph is acyclic, right? 678 00:36:17,780 --> 00:36:20,680 You cannot manufacture a cycle by removing a node so you 679 00:36:20,680 --> 00:36:23,460 recursively apply the same idea, find some other node which has 680 00:36:23,460 --> 00:36:26,520 no other things coming into it and only things going out 681 00:36:26,520 --> 00:36:27,500 of it. 682 00:36:27,500 --> 00:36:29,950 And if there are ties just pick one at random. 683 00:36:29,950 --> 00:36:31,737 And, therefore, construct an order. 684 00:36:31,737 --> 00:36:33,320 And all such orders that you construct 685 00:36:33,320 --> 00:36:34,640 are topological sort orders. 686 00:36:38,520 --> 00:36:41,680 This topological sort order, by construction, 687 00:36:41,680 --> 00:36:46,420 is a serial order of the actions. 
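The procedure just described (find a node with no incoming arrows, put it first, remove it, repeat) can be sketched as follows. This is an illustrative sketch rather than code from the course: `topological_order` is a hypothetical name, the edge set is the four-arrow action graph from the example, and ties are broken by picking the smallest name instead of at random so that the output is repeatable.

```python
def topological_order(nodes, edges):
    """Repeatedly remove a node with no incoming arrows.

    Returns a list of nodes in a serial order that respects every arrow,
    or None if the graph has a cycle (no such node can be found).
    """
    remaining = set(nodes)
    live_edges = set(edges)
    order = []
    while remaining:
        # Nodes that no live arrow points into.
        sources = sorted(
            n for n in remaining if not any(dst == n for _, dst in live_edges)
        )
        if not sources:
            return None  # every remaining node has an incoming arrow: a cycle
        chosen = sources[0]  # tie-break: smallest name (the lecture says any will do)
        order.append(chosen)
        remaining.remove(chosen)
        # Removing the node also removes its outgoing arrows.
        live_edges = {(s, d) for s, d in live_edges if s != chosen}
    return order


nodes = ["T1", "T2", "T3", "T4"]
acyclic = {("T1", "T2"), ("T3", "T1"), ("T3", "T2"), ("T2", "T4")}
# Variant where the two writes of y are swapped, adding an arrow from T2 back to T1.
cyclic = acyclic | {("T2", "T1")}

print(topological_order(nodes, acyclic))
print(topological_order(nodes, cyclic))
```

For the acyclic graph the sort starts with T3, as in the lecture; for the variant with arrows in both directions between T1 and T2, the procedure gets stuck after removing T3, which is the cycle showing up.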
688 00:36:46,420 --> 00:36:50,180 And this topological sort, if you now draw the arrows 689 00:36:50,180 --> 00:36:51,830 in the topological sort, they're going 690 00:36:51,830 --> 00:36:55,640 to be the same arrows as in the original directed graph, 691 00:36:55,640 --> 00:36:58,430 it's exactly the same graph, and now you have 692 00:36:58,430 --> 00:36:59,900 an equivalent serial order. 693 00:37:02,860 --> 00:37:06,230 That's the reason why the acyclic property is actually important 694 00:37:06,230 --> 00:37:18,560 as far as serializability is concerned. 695 00:37:18,560 --> 00:37:20,590 Now we can look at schemes that actually 696 00:37:20,590 --> 00:37:23,760 guarantee serializability. 697 00:37:23,760 --> 00:37:26,440 And the schemes we're going to discuss all are in this system 698 00:37:26,440 --> 00:37:28,320 where you have cell storage and where 699 00:37:28,320 --> 00:37:30,360 you have logs for recovery. 700 00:37:30,360 --> 00:37:33,270 The logs are not going to matter, for the most part. 701 00:37:33,270 --> 00:37:36,710 You can also do isolation in version histories, 702 00:37:36,710 --> 00:37:38,657 and one of the sections of the notes deals 703 00:37:38,657 --> 00:37:39,490 with that at length. 704 00:37:42,630 --> 00:37:44,680 I personally think it's not that important, 705 00:37:44,680 --> 00:37:50,109 but that doesn't mean it's not on the quiz. 706 00:37:50,109 --> 00:37:51,900 I think you get a little bit more intuition 707 00:37:51,900 --> 00:37:53,920 reading that in addition to this discussion. 708 00:37:53,920 --> 00:37:56,800 And a bulk of this discussion is actually not in the notes. 709 00:37:56,800 --> 00:38:00,330 It's just another way of looking at the problem. 710 00:38:00,330 --> 00:38:02,070 The mechanism we're going to build on 711 00:38:02,070 --> 00:38:04,670 is, in fact, described in the notes as well. 712 00:38:04,670 --> 00:38:09,570 It's a mechanism you've seen before, and it is called locks.
713 00:38:09,570 --> 00:38:12,680 As you recall, if you have variable x or any chunk of data 714 00:38:12,680 --> 00:38:16,140 you can protect it using two calls, acquire and release. 715 00:38:16,140 --> 00:38:18,110 So you could do things like acquire lock 716 00:38:18,110 --> 00:38:23,700 of x and release lock of x. 717 00:38:23,700 --> 00:38:26,822 And the lock protocol is that only one person 718 00:38:26,822 --> 00:38:28,030 can acquire a lock at a time. 719 00:38:28,030 --> 00:38:30,620 And all of the people wanting all other actions wishing 720 00:38:30,620 --> 00:38:35,810 to acquire the same lock will wait until the lock is released 721 00:38:35,810 --> 00:38:39,050 and then they fight to acquire it. 722 00:38:39,050 --> 00:38:41,020 And ultimately at the lowest level 723 00:38:41,020 --> 00:38:43,820 the lock has to be implemented with some low level 724 00:38:43,820 --> 00:38:44,640 atomic instruction. 725 00:38:44,640 --> 00:38:47,750 For example, something like a test and set lock instruction. 726 00:38:47,750 --> 00:38:51,530 It's the same story as before. 727 00:38:51,530 --> 00:38:53,880 Now, there are a few things to worry about with locks. 728 00:38:53,880 --> 00:38:56,420 The first one is the granularity of the lock. 729 00:38:56,420 --> 00:38:58,659 You could be very, very conservative 730 00:38:58,659 --> 00:39:00,450 and decide that the granularity of the lock 731 00:39:00,450 --> 00:39:02,079 is your entire system. 732 00:39:02,079 --> 00:39:03,620 So all of the data on the system gets 733 00:39:03,620 --> 00:39:07,130 protected with one lock, which means that you're running 734 00:39:07,130 --> 00:39:10,750 this very slow scheme because what you're guaranteeing 735 00:39:10,750 --> 00:39:13,710 is, in fact, isolation but you're 736 00:39:13,710 --> 00:39:16,680 running essentially one action at a time 737 00:39:16,680 --> 00:39:19,652 and you have no concurrency at all. 
738 00:39:19,652 --> 00:39:21,860 The other extreme, you could be very, very aggressive 739 00:39:21,860 --> 00:39:24,940 and every individual cell storage item 740 00:39:24,940 --> 00:39:26,332 is protected with its own lock. 741 00:39:26,332 --> 00:39:28,790 And now you're trying to max out the degree of concurrency, 742 00:39:28,790 --> 00:39:30,964 which means that although things could be fast 743 00:39:30,964 --> 00:39:33,130 you have to be a lot more careful if you want things 744 00:39:33,130 --> 00:39:33,960 to be correct. 745 00:39:33,960 --> 00:39:36,400 And correct here means that you have 746 00:39:36,400 --> 00:39:39,200 an order that is equivalent to some serial order. 747 00:39:42,960 --> 00:39:45,770 Why does the locking protocol actually matter? 748 00:39:45,770 --> 00:39:55,660 Well, let's go back to this two-action example of r1(x), w2(x) 749 00:39:55,660 --> 00:39:59,240 and w1(y) and w2(y). 750 00:40:01,850 --> 00:40:06,560 And let's just throw in an acquire lock of x here 751 00:40:06,560 --> 00:40:17,410 and an acquire lock of y here and a release lock of x here 752 00:40:17,410 --> 00:40:24,390 and in between here you have the read 753 00:40:24,390 --> 00:40:27,230 and here you do the release of the lock of y 754 00:40:27,230 --> 00:40:31,240 and similarly you do an acquire lx here 755 00:40:31,240 --> 00:40:35,380 and you do an acquire ly here. 756 00:40:35,380 --> 00:40:41,380 And then you release the locks at the end here. 757 00:40:41,380 --> 00:40:45,510 So acquire lock x, read x, release lock x, acquire lock y, 758 00:40:45,510 --> 00:40:48,970 write y, release lock y, and acquire, write, 759 00:40:48,970 --> 00:40:50,680 acquire, write, release, release.
760 00:40:50,680 --> 00:40:53,890 And, on the face of it, this kind of thing 761 00:40:53,890 --> 00:40:58,740 is actually reasonable for the old style synchronization, some 762 00:40:58,740 --> 00:41:00,660 of the old style synchronization things 763 00:41:00,660 --> 00:41:03,480 that we wanted because we didn't actually care about atomicity 764 00:41:03,480 --> 00:41:05,210 of full actions. 765 00:41:05,210 --> 00:41:10,440 If you look at what happens with this kind of locking, 766 00:41:10,440 --> 00:41:13,530 as shown here, you might be a little bit in trouble 767 00:41:13,530 --> 00:41:19,870 because it could be that these three steps happen first 768 00:41:19,870 --> 00:41:26,570 and then this whole set of steps up to here happen second, 769 00:41:26,570 --> 00:41:29,330 actually, up to the end. 770 00:41:29,330 --> 00:41:35,300 And then this chunk happens third. 771 00:41:35,300 --> 00:41:37,340 Now you're in trouble because read 772 00:41:37,340 --> 00:41:40,770 x happens before write x and the write y 773 00:41:40,770 --> 00:41:43,715 happens before this write y. 774 00:41:43,715 --> 00:41:45,340 And now you have not achieved isolation 775 00:41:45,340 --> 00:41:47,480 because, if you draw the conflict graph, 776 00:41:47,480 --> 00:41:49,170 you'll get an arrow from T1 to T2 777 00:41:49,170 --> 00:41:50,980 and an arrow coming back from T2 to T1 778 00:41:50,980 --> 00:41:54,340 and you have a cycle, which means that just throwing 779 00:41:54,340 --> 00:41:55,850 in the right acquires and releases 780 00:41:55,850 --> 00:42:00,030 isn't going to be sufficient to achieve isolation. 781 00:42:00,030 --> 00:42:02,750 So we're going to need a much better set of skills, a better 782 00:42:02,750 --> 00:42:05,519 scheme than just this throw in the right 783 00:42:05,519 --> 00:42:06,435 acquires and releases. 784 00:42:09,170 --> 00:42:11,770 The first scheme we're going to see is a simple scheme. 785 00:42:11,770 --> 00:42:13,030 It's called simple locking.
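The failure of the naive acquire/release placement can be replayed in a few lines. This is a hedged sketch, not the lecture's own notation: `check_schedule` is a hypothetical helper, and the order of the two releases at the end of T2 is an assumption, since the lecture only says "release, release". The replay confirms that every step obeys the lock rules (nobody acquires a lock that is held, and every read or write is done under the right lock), and still the conflict arrows form a cycle.

```python
def check_schedule(schedule):
    """Replay a schedule of lock and data steps and return the conflict arrows.

    Raises RuntimeError if a step breaks the lock rules.
    """
    holder = {}      # lock name -> action currently holding it
    data_steps = []  # reads and writes, in the order they ran
    for action, op, name in schedule:
        if op == "acquire":
            if name in holder:
                raise RuntimeError(f"{action} cannot acquire {name}")
            holder[name] = action
        elif op == "release":
            if holder.get(name) != action:
                raise RuntimeError(f"{action} does not hold {name}")
            del holder[name]
        else:  # "r" or "w" on the variable protected by the lock of the same name
            if holder.get(name) != action:
                raise RuntimeError(f"{action} touches {name} without its lock")
            data_steps.append((action, op, name))
    arrows = set()
    for i, (a1, op1, v1) in enumerate(data_steps):
        for a2, op2, v2 in data_steps[i + 1:]:
            if a1 != a2 and v1 == v2 and "w" in (op1, op2):
                arrows.add((a1, a2))
    return arrows


schedule = [
    # T1's first three steps
    ("T1", "acquire", "x"), ("T1", "r", "x"), ("T1", "release", "x"),
    # all of T2
    ("T2", "acquire", "x"), ("T2", "w", "x"),
    ("T2", "acquire", "y"), ("T2", "w", "y"),
    ("T2", "release", "x"), ("T2", "release", "y"),
    # the rest of T1
    ("T1", "acquire", "y"), ("T1", "w", "y"), ("T1", "release", "y"),
]

naive_arrows = check_schedule(schedule)
print(sorted(naive_arrows))
```

Both ("T1", "T2") and ("T2", "T1") come out, so the locks were used legally and the trace is still not serializable.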
786 00:42:17,330 --> 00:42:21,770 The idea in simple locking is that every action 787 00:42:21,770 --> 00:42:24,390 knows beforehand all of the data items 788 00:42:24,390 --> 00:42:26,880 that it wants to read or write, and it 789 00:42:26,880 --> 00:42:30,210 acquires all of the locks before it does anything. 790 00:42:30,210 --> 00:42:33,760 Before doing any reads or writes it acquires all of the locks. 791 00:42:33,760 --> 00:42:41,550 The idea would be here you would acquire the lock of x 792 00:42:41,550 --> 00:42:44,520 and acquire the lock of y, and then you run the steps. 793 00:42:44,520 --> 00:42:47,400 Similarly, the other action does the same thing. 794 00:42:47,400 --> 00:42:49,580 And by construction, if you have actions 795 00:42:49,580 --> 00:42:52,770 that conflict with one another and one of the actions 796 00:42:52,770 --> 00:42:56,370 reaches the point where all of the locks it needs 797 00:42:56,370 --> 00:42:58,570 have been acquired then by construction 798 00:42:58,570 --> 00:43:03,200 no other conflicting action could be in the same state. 799 00:43:03,200 --> 00:43:06,360 Because if you have another action that conflicts then 800 00:43:06,360 --> 00:43:10,105 it means that there's at least one data item common to them 801 00:43:10,105 --> 00:43:11,480 which means that only one of them 802 00:43:11,480 --> 00:43:13,563 could have reached this point because they're both 803 00:43:13,563 --> 00:43:15,760 trying to acquire. 804 00:43:15,760 --> 00:43:19,090 This protocol where every action acquires all of the locks 805 00:43:19,090 --> 00:43:22,530 before running any step of the action will guarantee isolation. 806 00:43:22,530 --> 00:43:24,940 And the isolation order, the equivalent serial order 807 00:43:24,940 --> 00:43:28,540 that it guarantees is the same as the order 808 00:43:28,540 --> 00:43:31,039 in which the different actions reach this point where they 809 00:43:31,039 --> 00:43:32,330 have acquired all of the locks.
810 00:43:32,330 --> 00:43:34,837 And that point is also called the lock point. 811 00:43:34,837 --> 00:43:36,420 The lock point of an action is defined 812 00:43:36,420 --> 00:43:41,110 as the point where all of the locks that it needs or it wants 813 00:43:41,110 --> 00:43:43,260 to acquire have been acquired. 814 00:43:43,260 --> 00:43:45,920 And, because in this discipline where no action does 815 00:43:45,920 --> 00:43:47,840 any operation until all of the locks it needs 816 00:43:47,840 --> 00:43:49,930 have been acquired, you're guaranteed 817 00:43:49,930 --> 00:43:53,506 that this serial order of every action following 818 00:43:53,506 --> 00:43:55,130 this protocol, independent of knowledge 819 00:43:55,130 --> 00:43:57,420 of any other actions, everybody just blindly follows 820 00:43:57,420 --> 00:43:59,870 this protocol, the serial order is the same 821 00:43:59,870 --> 00:44:03,110 as if the actions had run in the order 822 00:44:03,110 --> 00:44:06,330 of acquiring the lock points, whatever that order might be. 823 00:44:11,160 --> 00:44:14,150 This is called simple locking, and it does provide isolation. 824 00:44:16,800 --> 00:44:18,740 But the problem with simple locking 825 00:44:18,740 --> 00:44:23,660 where you acquire all of the locks before running anything 826 00:44:23,660 --> 00:44:24,660 is two-fold. 827 00:44:24,660 --> 00:44:28,360 The first problem is that this dictates 828 00:44:28,360 --> 00:44:31,320 that all of the actions know the different items that they want 829 00:44:31,320 --> 00:44:33,267 to read or write beforehand. 830 00:44:33,267 --> 00:44:35,350 Which means that if you're deep inside some action 831 00:44:35,350 --> 00:44:37,391 and there are all sorts of conditional statements 832 00:44:37,391 --> 00:44:38,860 you kind of have to know beforehand 833 00:44:38,860 --> 00:44:41,110 all of the data items you might be reading and writing. 834 00:44:41,110 --> 00:44:42,300 And that could be pretty tricky. 
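A minimal sketch of simple locking with Python threads, under several assumptions that are not from the lecture: the helper name `run_with_simple_locking`, the concrete values the two actions write, and the choice to acquire locks in sorted name order (added here only so the two threads cannot deadlock against each other). Each action declares its items up front, takes every lock, records its lock point, and only then runs its steps.

```python
import threading

store = {"x": 0, "y": 0}
locks = {name: threading.Lock() for name in store}
lock_point_order = []  # appended to only while holding all of an action's locks


def run_with_simple_locking(name, items, body):
    """Acquire every lock the action needs, then run it, then release."""
    held = [locks[item] for item in sorted(items)]  # sorted order: an added assumption
    for lock in held:
        lock.acquire()
    lock_point_order.append(name)  # the lock point: all needed locks are now held
    try:
        body(store)
    finally:
        for lock in reversed(held):
            lock.release()


def t1(s):
    # Illustrative T1: read x, then write y.
    s["y"] = s["x"] + 1


def t2(s):
    # Illustrative T2: write x, then write y.
    s["x"] = 10
    s["y"] = 20


threads = [
    threading.Thread(target=run_with_simple_locking, args=("T1", {"x", "y"}, t1)),
    threading.Thread(target=run_with_simple_locking, args=("T2", {"x", "y"}, t2)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Replay the actions serially in lock-point order; the result should match.
replay = {"x": 0, "y": 0}
for name in lock_point_order:
    {"T1": t1, "T2": t2}[name](replay)

print(lock_point_order, store, replay)
```

Which action reaches its lock point first can vary from run to run, but in either case the final contents of `store` match a serial replay in lock-point order, which is the property claimed for simple locking.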
835 00:44:42,300 --> 00:44:44,450 In practice, if you want to adopt simple locking, 836 00:44:44,450 --> 00:44:46,180 you might have to be very conservative 837 00:44:46,180 --> 00:44:47,990 and sort of try to lock everything or lock 838 00:44:47,990 --> 00:44:52,200 a large amount of data which reduces performance. 839 00:44:52,200 --> 00:44:55,530 Second, it actually doesn't give you enough opportunity 840 00:44:55,530 --> 00:44:59,260 to get high performance, even when you do know all 841 00:44:59,260 --> 00:45:01,020 of the data items beforehand. 842 00:45:01,020 --> 00:45:02,790 For example, one thing you could do 843 00:45:02,790 --> 00:45:06,270 is acquire the lock of x and then read x. 844 00:45:06,270 --> 00:45:10,320 And while you're off trying to read x then you might 845 00:45:10,320 --> 00:45:13,830 acquire a lock of y and read y. 846 00:45:13,830 --> 00:45:17,440 And so, depending on how you have structured your system 847 00:45:17,440 --> 00:45:19,700 scheme where you do some work, maybe 848 00:45:19,700 --> 00:45:23,070 some computation as well in between the lock 849 00:45:23,070 --> 00:45:25,365 acquisition steps might give you higher performance. 850 00:45:28,120 --> 00:45:32,030 So the question is can we be a little bit more aggressive 851 00:45:32,030 --> 00:45:34,660 than the simple locking scheme in order 852 00:45:34,660 --> 00:45:39,430 to get isolation as well as a little bit higher performance? 853 00:45:39,430 --> 00:45:41,554 And the answer is that there is such a scheme 854 00:45:41,554 --> 00:45:42,970 and it's called two-phase locking. 
855 00:45:49,980 --> 00:45:51,850 And although a lot of work has happened 856 00:45:51,850 --> 00:45:56,320 for maybe a couple of decades on very high performance 857 00:45:56,320 --> 00:45:59,556 locking schemes, it turns out in the end they all, or at least 858 00:45:59,556 --> 00:46:00,930 for a large class of schemes that 859 00:46:00,930 --> 00:46:03,900 use this kind of locking, they all boil down to some variant 860 00:46:03,900 --> 00:46:05,680 of two-phase locking. 861 00:46:05,680 --> 00:46:06,860 And the idea is very simple. 862 00:46:06,860 --> 00:46:09,410 The two-phase locking idea says there 863 00:46:09,410 --> 00:46:16,960 should be no release before all the acquires are done. 864 00:46:21,770 --> 00:46:24,670 So do not release any lock until all of the locks 865 00:46:24,670 --> 00:46:28,710 that you need have been acquired. 866 00:46:28,710 --> 00:46:30,200 That's what happens here. 867 00:46:30,200 --> 00:46:32,940 This scheme violates two-phase locking because you actually 868 00:46:32,940 --> 00:46:35,260 release lock x before you acquire lock y, 869 00:46:35,260 --> 00:46:36,890 and that violates two-phase locking. 870 00:46:39,915 --> 00:46:42,040 And this idea that you don't release before all the 871 00:46:42,040 --> 00:46:44,123 acquires, there is a little bit of a subtlety that 872 00:46:44,123 --> 00:46:48,310 happens because you want recoverability, 873 00:46:48,310 --> 00:46:53,646 which is that the action could at any stage abort. 874 00:46:53,646 --> 00:46:55,020 And, in order to abort an action, 875 00:46:55,020 --> 00:46:58,220 you need to go back and restore that variable 876 00:46:58,220 --> 00:46:59,910 to a previous value. 877 00:46:59,910 --> 00:47:01,700 Which means that in order to abort 878 00:47:01,700 --> 00:47:03,180 you need to hold onto the lock.
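To make the "no release before all the acquires are done" rule concrete, here is a small Python sketch that checks one action's sequence of lock operations. The function name `follows_two_phase_locking` and the `(operation, lock)` tuple encoding are assumptions made for illustration, not something from the lecture.

```python
def follows_two_phase_locking(steps):
    """Return True if no acquire appears after any release.

    `steps` is a list of (operation, lock_name) pairs for a single action,
    where operation is "acquire" or "release" (hypothetical encoding).
    """
    released_any = False
    for operation, _lock_name in steps:
        if operation == "release":
            released_any = True
        elif operation == "acquire" and released_any:
            # An acquire after a release breaks the two-phase rule.
            return False
    return True

# All acquires come before the first release, so the rule holds.
good_trace = [("acquire", "x"), ("acquire", "y"),
              ("release", "x"), ("release", "y")]

# Lock x is released before lock y is acquired, so the rule is violated.
bad_trace = [("acquire", "x"), ("release", "x"),
             ("acquire", "y"), ("release", "y")]
```

The second trace has the same shape as the violating scheme described above: lock x is released before lock y is acquired.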
879 00:47:03,180 --> 00:47:05,388 You better make sure that an action, in order to abort, 880 00:47:05,388 --> 00:47:08,860 has the locks for all of the data items 881 00:47:08,860 --> 00:47:11,760 whose values it wishes to change back to the original value. 882 00:47:11,760 --> 00:47:14,290 Which means that in practice, at least for all the items 883 00:47:14,290 --> 00:47:17,470 that you're writing, the locks that you hold 884 00:47:17,470 --> 00:47:21,000 should not be released until the commit point, 885 00:47:21,000 --> 00:47:25,770 until the place where the action calls commit. 886 00:47:25,770 --> 00:47:29,330 So no release before all acquires 887 00:47:29,330 --> 00:47:31,220 is basically equivalent to this: 888 00:47:31,220 --> 00:47:32,720 there should be no acquire statement 889 00:47:32,720 --> 00:47:33,810 after any release statement. 890 00:47:33,810 --> 00:47:35,300 The moment you see a release statement and then an 891 00:47:35,300 --> 00:47:37,280 acquire after that of anything, then 892 00:47:37,280 --> 00:47:41,550 you know that this violates two-phase locking. 893 00:47:41,550 --> 00:47:43,910 It turns out that two-phase locking is correct, 894 00:47:43,910 --> 00:47:47,390 in that it provides isolation. 895 00:47:47,390 --> 00:47:53,650 And to see why, let's look at a picture. 896 00:47:53,650 --> 00:47:55,650 We're going to go back to this action graph idea 897 00:47:55,650 --> 00:48:02,064 and look at a picture of what happens with two-phase locking. 898 00:48:02,064 --> 00:48:03,730 What we're going to prove is that if you 899 00:48:03,730 --> 00:48:06,460 use two-phase locking and you construct an action 900 00:48:06,460 --> 00:48:10,720 graph of what you get from running a sequence of steps, 901 00:48:10,720 --> 00:48:12,900 that action graph has no cycles.
902 00:48:12,900 --> 00:48:15,630 And we know that if you have no cycles in the action graph 903 00:48:15,630 --> 00:48:17,870 you're guaranteed that it is equivalent 904 00:48:17,870 --> 00:48:19,070 to some serial order. 905 00:48:19,070 --> 00:48:21,170 We will argue this by contradiction. 906 00:48:21,170 --> 00:48:27,710 Let's say you have T1 and T2 all the way through some action Tk 907 00:48:27,710 --> 00:48:31,550 and you have a cycle going back from Tk to T1 in the action 908 00:48:31,550 --> 00:48:33,100 graph. 909 00:48:33,100 --> 00:48:35,470 Now, if there is an arrow from T1 to T2, 910 00:48:35,470 --> 00:48:37,450 it means that there is some data item 911 00:48:37,450 --> 00:48:41,210 x1 in common between T1 and T2. 912 00:48:41,210 --> 00:48:47,160 And you know that T2 ran after T1, which means that in T1 there 913 00:48:47,160 --> 00:48:51,640 was a release done of l1, the lock of x1, after which in the system 914 00:48:51,640 --> 00:48:53,080 there was an acquire done of l1 by T2. 915 00:48:56,960 --> 00:49:02,000 Likewise, between T2 and T3, there is some data item x2 such 916 00:49:02,000 --> 00:49:08,430 that a release was done of l2 by T2. 917 00:49:08,430 --> 00:49:15,540 And after that an acquire was done of l2 by T3. 918 00:49:15,540 --> 00:49:18,560 And T2's release of l2 has to have been done after its acquire of l1 919 00:49:18,560 --> 00:49:21,420 because we're following two-phase locking. 920 00:49:21,420 --> 00:49:27,110 If you continue that all the way out to Tk, 921 00:49:27,110 --> 00:49:35,700 Tk did an acquire somewhere later in time of lk minus one. 922 00:49:35,700 --> 00:49:42,160 And then it did a release of a lock lk, 923 00:49:42,160 --> 00:49:44,900 where lk is actually the lock of some data item that 924 00:49:44,900 --> 00:49:47,600 is shared between Tk and T1, so there is actually some data 925 00:49:47,600 --> 00:49:50,290 item xk whose lock is lk.
926 00:49:50,290 --> 00:49:53,850 And you know that Tk did a release of lk 927 00:49:53,850 --> 00:49:59,970 before T1 did an acquire of lk. 928 00:49:59,970 --> 00:50:01,646 But T1's acquire of lk must have 929 00:50:01,646 --> 00:50:03,020 happened after the release of lk, 930 00:50:03,020 --> 00:50:05,600 so it must have happened at some point here. 931 00:50:05,600 --> 00:50:08,560 T1 must have done an acquire of lk at some point at the bottom 932 00:50:08,560 --> 00:50:09,950 here. 933 00:50:09,950 --> 00:50:11,940 Just going out in time, by two-phase 934 00:50:11,940 --> 00:50:14,600 locking you get release of l1, then the other guy 935 00:50:14,600 --> 00:50:16,480 does an acquire of l1, release of l2, 936 00:50:16,480 --> 00:50:19,130 acquire of l2 all the way out, then release of lk, and then 937 00:50:19,130 --> 00:50:21,290 after that in time an acquire of lk. 938 00:50:21,290 --> 00:50:23,620 So that must have happened later on in time, 939 00:50:23,620 --> 00:50:25,690 but now this picture here violates two-phase 940 00:50:25,690 --> 00:50:28,460 locking because T1, for this cycle to hold, 941 00:50:28,460 --> 00:50:32,120 has to have done an acquire of lk after its release of l1. 942 00:50:32,120 --> 00:50:34,370 But that violates two-phase locking because you're not 943 00:50:34,370 --> 00:50:36,036 allowed to acquire anything after you've 944 00:50:36,036 --> 00:50:37,510 released something. 945 00:50:37,510 --> 00:50:40,407 So two-phase locking, therefore, cannot have a cycle 946 00:50:40,407 --> 00:50:41,240 in the action graph. 947 00:50:41,240 --> 00:50:42,820 And, from the previous story, it means 948 00:50:42,820 --> 00:50:44,736 that it's equivalent to some serial order that 949 00:50:44,736 --> 00:50:46,920 corresponds to a topological sort 950 00:50:46,920 --> 00:50:50,662 of the directed acyclic graph. 951 00:50:50,662 --> 00:50:51,620 I'm going to stop here.
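The link between an acyclic action graph and a serial order can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the construction from the lecture: `action_graph` and `equivalent_serial_order` are hypothetical names, a schedule is encoded as a list of `(action, item)` steps in the order they ran, and every pair of accesses to the same item by different actions is treated as a conflict. The standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) does the sort and raises `CycleError` when the graph has a cycle.

```python
from graphlib import TopologicalSorter, CycleError

def action_graph(schedule):
    """Map each action to the set of actions that must come before it.

    An arrow goes from an earlier action to a later one whenever both
    touched the same data item (assumption: every shared access conflicts).
    """
    predecessors = {}
    for i, (later, item) in enumerate(schedule):
        predecessors.setdefault(later, set())
        for earlier, earlier_item in schedule[:i]:
            if earlier_item == item and earlier != later:
                predecessors[later].add(earlier)
    return predecessors

def equivalent_serial_order(schedule):
    """Return a serial order of the actions, or None if the graph has a cycle."""
    try:
        return list(TopologicalSorter(action_graph(schedule)).static_order())
    except CycleError:
        return None

# T1 before T2 on x, then T2 before T3 on y: a chain with no cycle.
acyclic = [("T1", "x"), ("T2", "x"), ("T2", "y"), ("T3", "y")]

# T1 before T2 on x, but T2 before T1 on y: a cycle between T1 and T2.
cyclic = [("T1", "x"), ("T2", "x"), ("T2", "y"), ("T1", "y")]
```

The cyclic schedule is the kind of interleaving that the argument above shows cannot occur when every action follows two-phase locking.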
952 00:50:51,620 --> 00:50:54,130 We will continue with this stuff and then 953 00:50:54,130 --> 00:50:56,560 talk about other aspects of transactions next time. 954 00:50:56,560 --> 00:50:58,809 And if there are any questions either send me an email 955 00:50:58,809 --> 00:51:00,900 or ask me the next time.