1 00:00:00,000 --> 00:00:05,000 So in the last lecture I spent quite a while trying to convey a sense of 2 00:00:05,000 --> 00:00:10,000 how the structure of DNA was discovered. The crystallographic 3 00:00:10,000 --> 00:00:16,000 data that led to it, as I said, was collected by Rosalind 4 00:00:16,000 --> 00:00:21,000 Franklin. And I saw there was some confusion about this picture that I 5 00:00:21,000 --> 00:00:27,000 showed you next. This is not a photograph of a 6 00:00:27,000 --> 00:00:32,000 double helix. This is what happened when she 7 00:00:32,000 --> 00:00:36,000 bounced the x-ray off the crystal of DNA. This is the diffraction 8 00:00:36,000 --> 00:00:40,000 pattern that she saw. And then one works backwards from 9 00:00:40,000 --> 00:00:44,000 that trying to figure out what kind of structure it was that would have 10 00:00:44,000 --> 00:00:49,000 caused that diffraction pattern. And you have to be a pretty good 11 00:00:49,000 --> 00:00:53,000 x-ray crystallographer to draw any kind of inferences from that. 12 00:00:53,000 --> 00:00:57,000 And there were people, including Francis Crick, who saw the implications 13 00:00:57,000 --> 00:01:02,000 of it right away. But the point was she collected the 14 00:01:02,000 --> 00:01:06,000 data and then two people that I told you about then whose names you know 15 00:01:06,000 --> 00:01:10,000 so well, Jim Watson and Francis Crick, were the two individuals that 16 00:01:10,000 --> 00:01:14,000 came up with the model that explained the diffraction pattern. 17 00:01:14,000 --> 00:01:19,000 And therefore we learned the structure of DNA as a 18 00:01:19,000 --> 00:01:23,000 double-stranded helix. I also tried to make the case that 19 00:01:23,000 --> 00:01:27,000 it wasn't two geniuses who sat down in the room, took a look at this and 20 00:01:27,000 --> 00:01:32,000 popped up with the model. 
It was a story of real people with 21 00:01:32,000 --> 00:01:36,000 misadventures and mistakes and recovery from mistakes and so on 22 00:01:36,000 --> 00:01:41,000 getting it. It was also a very small group. And I'm going to take 23 00:01:41,000 --> 00:01:45,000 just a very small minute at the beginning of the class because I 24 00:01:45,000 --> 00:01:49,000 have a colleague, Vernon Ingram who's sitting down 25 00:01:49,000 --> 00:01:54,000 here in the front, who was a member of this very small 26 00:01:54,000 --> 00:01:58,000 group with Jim Watson and Francis Crick. So he was there where all 27 00:01:58,000 --> 00:02:02,000 this happened. And almost nobody in the world has 28 00:02:02,000 --> 00:02:06,000 had a chance in your generation to hear directly from somebody who was 29 00:02:06,000 --> 00:02:10,000 there when it happened. So I asked Vernon if he would come 30 00:02:10,000 --> 00:02:14,000 and just talk to you for a little bit about just what it was 31 00:02:14,000 --> 00:02:54,000 like to be there. 32 00:02:54,000 --> 00:03:00,000 Well, thanks, Graham. You seem to be at a very exciting 33 00:03:00,000 --> 00:03:06,000 state in 7.014. This structure of the secret of life, 34 00:03:06,000 --> 00:03:14,000 no less. And it's interesting that immediately when Watson and Crick 35 00:03:14,000 --> 00:03:22,000 put together a model of the DNA molecule that fit the x-ray data, 36 00:03:22,000 --> 00:03:30,000 that was the point, how do you know a model is correct? 37 00:03:30,000 --> 00:03:36,000 Because there are certain distances in the model, and those have to 38 00:03:36,000 --> 00:03:43,000 correlate exactly with the distances of the x-ray spots in the 39 00:03:43,000 --> 00:03:49,000 diffraction pattern that you saw. 
That's how you know that a model 40 00:03:49,000 --> 00:03:56,000 that you've built to certain specifications corresponds to what 41 00:03:56,000 --> 00:04:02,000 the molecule itself in the crystal that you're examining 42 00:04:02,000 --> 00:04:09,000 actually is composed of. It was by sheer accident that I 43 00:04:09,000 --> 00:04:15,000 happened to be working as a biochemist in the MRC, 44 00:04:15,000 --> 00:04:22,000 the Medical Research Council lab at the Cavendish Laboratory where 45 00:04:22,000 --> 00:04:28,000 Watson and Crick were working. Sheer accident. It was a very 46 00:04:28,000 --> 00:04:35,000 crowded lab, as Graham said. And that's something that you should 47 00:04:35,000 --> 00:04:41,000 remember. When you're choosing a lab to work in, 48 00:04:41,000 --> 00:04:47,000 always go to a lab that's overcrowded. Never go to a lab 49 00:04:47,000 --> 00:04:53,000 where there's lots of space because a really successful lab attracts so 50 00:04:53,000 --> 00:04:59,000 many coworkers, visitors that it rapidly gets 51 00:04:59,000 --> 00:05:05,000 overcrowded. And that was the case in this 52 00:05:05,000 --> 00:05:12,000 laboratory. The director was Max Perutz. Co-director John Kendrew 53 00:05:12,000 --> 00:05:20,000 doing x-ray crystallography of proteins for almost the first time, 54 00:05:20,000 --> 00:05:27,000 and solving the protein structure. Francis Crick was a graduate student 55 00:05:27,000 --> 00:05:33,000 of Max Perutz's doing his PhD work. And the first thing I remember about 56 00:05:33,000 --> 00:05:39,000 Francis was when I went there as a biochemist to work with Max Perutz, 57 00:05:39,000 --> 00:05:44,000 when I went there, there was this tall gangling guy constantly 58 00:05:44,000 --> 00:05:50,000 circulating between the top floor of the building, his office in the 59 00:05:50,000 --> 00:05:55,000 middle and the x-ray machines at the bottom. He was constantly going up 60 00:05:55,000 --> 00:06:01,000 and down. 
And in those days the buildings didn't have elevators 61 00:06:01,000 --> 00:06:07,000 or lifts as the English called them. 62 00:06:07,000 --> 00:06:13,000 So he was in excellent physical shape. Very crowded, 63 00:06:13,000 --> 00:06:19,000 a very modest lab. And what's usually forgotten is a key member of 64 00:06:19,000 --> 00:06:25,000 that group, an engineer, Tony Broad, key person because he 65 00:06:25,000 --> 00:06:31,000 invented what was then the world's best and most efficient x-ray 66 00:06:31,000 --> 00:06:37,000 machine, a rotating anode x-ray machine. 67 00:06:37,000 --> 00:06:43,000 And because to the x-ray crystallographers in that group this 68 00:06:43,000 --> 00:06:49,000 machine was available, because of that they were the 69 00:06:49,000 --> 00:06:56,000 preeminent x-ray structure group in the world. My job was as a 70 00:06:56,000 --> 00:07:02,000 biochemist, protein biochemistry, putting a heavy atom, 71 00:07:02,000 --> 00:07:08,000 mercury, very heavy atom into Max Perutz's hemoglobin crystals 72 00:07:08,000 --> 00:07:15,000 in specific places. That has a predictable effect on the 73 00:07:15,000 --> 00:07:21,000 x-ray pattern and that enables the Fourier diagram to be constructed 74 00:07:21,000 --> 00:07:28,000 with real phase values for the x-ray diffractions, for the physicists 75 00:07:28,000 --> 00:07:36,000 among you here. Are there any physicists here? 76 00:07:36,000 --> 00:07:44,000 Yeah, I thought so. That was a big step forward and that was also a big 77 00:07:44,000 --> 00:07:52,000 step in figuring out the structure of the DNA samples, semi-crystals 78 00:07:52,000 --> 00:08:00,000 that Professor Walker just referred to. 79 00:08:00,000 --> 00:08:05,000 All dependent on the engineer Tony Broad who is never mentioned in any 80 00:08:05,000 --> 00:08:11,000 of these histories, but without him this would not have 81 00:08:11,000 --> 00:08:17,000 happened. 
So it was an exciting place to work in, 82 00:08:17,000 --> 00:08:22,000 very exciting. We were all young in those days. And living the lives of 83 00:08:22,000 --> 00:08:28,000 young men and young women with all the complications that arise when 84 00:08:28,000 --> 00:08:34,000 you put a whole bunch of very energetic young men, 85 00:08:34,000 --> 00:08:39,000 very energetic young women together. And by that I mean the interpersonal 86 00:08:39,000 --> 00:08:45,000 relationships which when you're in a crowded, very active situation can 87 00:08:45,000 --> 00:08:50,000 sometimes interfere. And always very entertaining, 88 00:08:50,000 --> 00:08:55,000 I can tell you that. I could give you chapter and verse. 89 00:08:55,000 --> 00:09:01,000 But it isn't really so very different from people 90 00:09:01,000 --> 00:09:07,000 your age now, right? I mean I'm not saying it interferes 91 00:09:07,000 --> 00:09:13,000 with you, sometimes it might. But it was an exciting lab, an 92 00:09:13,000 --> 00:09:20,000 exciting time to be there because we were not the only group trying to 93 00:09:20,000 --> 00:09:26,000 figure out the structure of DNA. A huge competitor was Linus Pauling 94 00:09:26,000 --> 00:09:33,000 at Caltech who had beaten that same group once before, 95 00:09:33,000 --> 00:09:39,000 quite recently, over the alpha helix, the crucial component of 96 00:09:39,000 --> 00:09:45,000 protein structure. He got the right answer first, 97 00:09:45,000 --> 00:09:51,000 1.5 angstrom reflection, the alpha helix. And our group, 98 00:09:51,000 --> 00:09:57,000 Max Perutz and our group had been wrong. So the group was smarting 99 00:09:57,000 --> 00:10:03,000 under that kind of defeat, if you like. 100 00:10:03,000 --> 00:10:10,000 And competition is a wonderful spur, as long as you don't let it get out 101 00:10:10,000 --> 00:10:17,000 of hand. 
Well, needless to say we didn't, 102 00:10:17,000 --> 00:10:24,000 but the competition with the Pauling lab was certainly so severe that we 103 00:10:24,000 --> 00:10:30,000 awaited the next letter. You see, in those days new 104 00:10:30,000 --> 00:10:35,000 scientific information arrived not by publications, 105 00:10:35,000 --> 00:10:40,000 that took much too long, but by personal letter. And, 106 00:10:40,000 --> 00:10:45,000 in fact, the NIH has put together all these various letters in the 107 00:10:45,000 --> 00:10:50,000 Francis Crick collection. And when you have time you should 108 00:10:50,000 --> 00:10:55,000 look at those. They're quite interesting because 109 00:10:55,000 --> 00:11:01,000 they tell you in a way a scientific paper does not tell you. 110 00:11:01,000 --> 00:11:06,000 What I feel about my experiment results. What she feels about her 111 00:11:06,000 --> 00:11:11,000 experiment results. What it means to me as a person, 112 00:11:11,000 --> 00:11:16,000 to her as a person, to him as a person. So we were constantly 113 00:11:16,000 --> 00:11:22,000 watching the mail and discussing the news as it came in, 114 00:11:22,000 --> 00:11:27,000 mostly over a beer at the pub next door. It was very conveniently 115 00:11:27,000 --> 00:11:33,000 located. But being a small group crowded 116 00:11:33,000 --> 00:11:39,000 together made communication within our group very easy indeed. 117 00:11:39,000 --> 00:11:45,000 And we had fights. I don't mean physical fights. 118 00:11:45,000 --> 00:11:51,000 We had scientific fights. And as a biochemist I was able to 119 00:11:51,000 --> 00:11:57,000 settle a crucial fight among the crystallographers Crick and Watson 120 00:11:57,000 --> 00:12:02,000 who were building the model. Because, quite frankly, 121 00:12:02,000 --> 00:12:07,000 they didn't know much chemistry. And were trying to build a model 122 00:12:07,000 --> 00:12:13,000 with the wrong conformation of the peptide bond. 
They didn't realize 123 00:12:13,000 --> 00:12:18,000 that the peptide bond has two possible conformations. 124 00:12:18,000 --> 00:12:23,000 And they had at one point a terrible time trying to fit 125 00:12:23,000 --> 00:12:29,000 everything together because they were using the wrong conformation. 126 00:12:29,000 --> 00:12:34,000 I'm talking about lactam-lactim for those of you who are organic 127 00:12:34,000 --> 00:12:39,000 chemists and it means something, a conformation. And once they got 128 00:12:39,000 --> 00:12:44,000 the first conformation then the model clicked into place. 129 00:12:44,000 --> 00:12:50,000 So we all helped, that's what I'm trying to say. 130 00:12:50,000 --> 00:12:55,000 We all helped with one great aim in mind. It was clear. 131 00:12:55,000 --> 00:13:00,000 And you know from what Professor Walker said, that the DNA structure, 132 00:13:00,000 --> 00:13:06,000 in its structure held the clue to crucial physiological 133 00:13:06,000 --> 00:13:11,000 behavior of DNA. And Crick and Watson said this in 134 00:13:11,000 --> 00:13:17,000 their first paper, the structure itself because of its 135 00:13:17,000 --> 00:13:22,000 complementarity gives you an immediate clue as to how it 136 00:13:22,000 --> 00:13:27,000 replicates. And replication of DNA structure from generation to 137 00:13:27,000 --> 00:13:33,000 generation is, of course, the crucial thing about 138 00:13:33,000 --> 00:13:38,000 DNA. The copying, the precise copying 139 00:13:38,000 --> 00:13:43,000 from generation to generation. And that fell out of the x-ray 140 00:13:43,000 --> 00:13:49,000 structure. That's why the x-ray structure was so very important, 141 00:13:49,000 --> 00:13:54,000 because it gave you an immediate understanding of the role of DNA in 142 00:13:54,000 --> 00:14:00,000 modern biology. So that's what we did. 
143 00:14:00,000 --> 00:14:06,000 And eventually the people in the group, the group got so overcrowded 144 00:14:06,000 --> 00:14:13,000 they built a huge lab that was beautiful, like any new lab is. 145 00:14:13,000 --> 00:14:19,000 But the thing I remember most of all was the atmosphere in that place. 146 00:14:19,000 --> 00:14:26,000 So remember, when you go and choose a lab, choose one that's overcrowded. 147 00:14:26,000 --> 00:14:45,000 It will pay off. [APPLAUSE] 148 00:14:45,000 --> 00:14:47,000 Thank you so much. That was really wonderful. 149 00:14:47,000 --> 00:14:59,000 Thank you. 150 00:14:59,000 --> 00:15:02,000 I don't know if some of you realized quite how rare that was, 151 00:15:02,000 --> 00:15:05,000 this discovery of the structure of DNA. As I said, 152 00:15:05,000 --> 00:15:09,000 probably one of the big discoveries of mankind. Because, 153 00:15:09,000 --> 00:15:12,000 as Vernon said, you could see so many of the secrets of life as soon 154 00:15:12,000 --> 00:15:16,000 as you saw that structure. Very few people have ever heard 155 00:15:16,000 --> 00:15:19,000 from someone who was there at the time. Maybe you'll forget a bunch 156 00:15:19,000 --> 00:15:23,000 of stuff down the line, but I hope you'll remember you heard 157 00:15:23,000 --> 00:15:26,000 somebody who was there when Watson and Crick were there and maybe his 158 00:15:26,000 --> 00:15:30,000 extra piece of advice about choosing a lab. 159 00:15:30,000 --> 00:15:33,000 To say one thing quickly, some of you I think understood what 160 00:15:33,000 --> 00:15:37,000 I've been trying to do. I spent quite a bit of time talking 161 00:15:37,000 --> 00:15:40,000 about science being done by real people doing real experiments. 162 00:15:40,000 --> 00:15:44,000 Thanks for your comments. 
A few of you have gone out of your way to say 163 00:15:44,000 --> 00:15:48,000 that this was a total waste of time and you didn't understand why I 164 00:15:48,000 --> 00:15:51,000 didn't teach you something instead of doing something on the test. 165 00:15:51,000 --> 00:15:55,000 Well, I'm making up the test. And if you don't think there'll be 166 00:15:55,000 --> 00:15:59,000 something on scientific process on the second exam you'll 167 00:15:59,000 --> 00:16:02,000 be surprised. So I'm spending a lot of time on 168 00:16:02,000 --> 00:16:06,000 this, and the reason is because you are MIT students. 169 00:16:06,000 --> 00:16:09,000 You know, you can go many places in the country to many high school 170 00:16:09,000 --> 00:16:12,000 biology courses and you can memorize, someone will tell you to memorize 171 00:16:12,000 --> 00:16:16,000 everything that's in the book, and you'll get tested whether you 172 00:16:16,000 --> 00:16:19,000 can memorize it. You guys are at MIT because you 173 00:16:19,000 --> 00:16:23,000 have the potential to be leaders in whatever you do. 174 00:16:23,000 --> 00:16:26,000 I've made the transition from being an undergrad sort of trying to 175 00:16:26,000 --> 00:16:30,000 memorize stuff in a textbook to working on the cutting edge. 176 00:16:30,000 --> 00:16:32,000 I've made some reasonably significant discoveries in science, 177 00:16:32,000 --> 00:16:35,000 as have my other colleagues in the department, some of them making 178 00:16:35,000 --> 00:16:38,000 greater ones than I. But nevertheless if you're on the 179 00:16:38,000 --> 00:16:41,000 cutting edge then you're dealing with all the stuff I'm trying to 180 00:16:41,000 --> 00:16:44,000 tell you about in this thing. You're working as a part of a group. 181 00:16:44,000 --> 00:16:47,000 There's competition. There are interpersonal 182 00:16:47,000 --> 00:16:50,000 relationships. You make mistakes. 183 00:16:50,000 --> 00:16:53,000 You recover from them. You're making inferences. 
184 00:16:53,000 --> 00:16:56,000 You're testing models. This is a very complex, very real, 185 00:16:56,000 --> 00:16:59,000 very dynamic, very human interaction. I hope you got a little bit of 186 00:16:59,000 --> 00:17:03,000 a whiff of that from Vernon. And I wouldn't be, 187 00:17:03,000 --> 00:17:07,000 I'm quite capable of reproducing diagrams from the textbook without 188 00:17:07,000 --> 00:17:11,000 trying to give you a deeper understanding, 189 00:17:11,000 --> 00:17:15,000 and that's what I'm trying to do here. And I hope if it hasn't made 190 00:17:15,000 --> 00:17:19,000 sense to you by the end that at least a few more of you will get it. 191 00:17:19,000 --> 00:17:23,000 And those of you who I think saw what I was doing I appreciate your 192 00:17:23,000 --> 00:17:27,000 telling me that in the things. These are anonymous so I don't know, 193 00:17:27,000 --> 00:17:31,000 but a couple of you are certainly trying to make it clear that you 194 00:17:31,000 --> 00:17:35,000 didn't think it was worth your time coming to lecture. 195 00:17:35,000 --> 00:17:38,000 I'm trying to tell you why I'm trying to do it. 196 00:17:38,000 --> 00:17:41,000 I'm trying to teach you in a deeper way. And this is a required course. 197 00:17:41,000 --> 00:17:44,000 It's important for your life. I hope some of you will see that or if 198 00:17:44,000 --> 00:17:47,000 you don't see it now you'll see it later in your career. 199 00:17:47,000 --> 00:17:50,000 OK. Now, we're going to talk about DNA replication. 200 00:17:50,000 --> 00:17:53,000 I'm going to start to dive into some of the details that maybe are 201 00:17:53,000 --> 00:17:56,000 more the kind of things you're expecting. I just want to make one 202 00:17:56,000 --> 00:17:59,000 quick point here. I've talked about cell division and 203 00:17:59,000 --> 00:18:03,000 we saw this, how cells come from other cells going to make more cells. 
204 00:18:03,000 --> 00:18:07,000 I showed you this little movie you've seen a few times of a yeast 205 00:18:07,000 --> 00:18:10,000 cell dividing, but all cells divide. 206 00:18:10,000 --> 00:18:14,000 Here's a cancer cell dividing. If you get a cancer it's a cell 207 00:18:14,000 --> 00:18:17,000 that's forgotten how to stop dividing and is growing to make a 208 00:18:17,000 --> 00:18:21,000 tumor. There's this cancer cell dividing. It looks not unlike a 209 00:18:21,000 --> 00:18:25,000 yeast on a molecular level, very, very similar. But there's 210 00:18:25,000 --> 00:18:29,000 another point. I told you how the structure of DNA 211 00:18:29,000 --> 00:18:33,000 with the complementary strands with G pairing with C and A pairing with 212 00:18:33,000 --> 00:18:37,000 T immediately gave rise to an insight as to how the genetic 213 00:18:37,000 --> 00:18:41,000 material could be replicated. And you guys know that it's held 214 00:18:41,000 --> 00:18:45,000 together by hydrogen bonds between base pairs which are about 215 00:18:45,000 --> 00:18:49,000 one-twentieth the strength of the covalent bonds. 216 00:18:49,000 --> 00:18:53,000 So you're able to peel the strands apart without breaking the covalent 217 00:18:53,000 --> 00:18:57,000 bonds. And then by pairing A with T and G with C and doing that on both 218 00:18:57,000 --> 00:19:02,000 strands then you can end up with two identical copies. 219 00:19:02,000 --> 00:19:05,000 And so if you do two identical copies and you do it again you get 220 00:19:05,000 --> 00:19:09,000 four. One of the things we've realized over the last two or three 221 00:19:09,000 --> 00:19:13,000 years in looking through the exams is somehow, at least some of the 222 00:19:13,000 --> 00:19:17,000 class, didn't connect the business about cells coming from other cells 223 00:19:17,000 --> 00:19:21,000 and DNA duplicating to give daughter DNA. 
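The copying rule just described (peel the strands apart, then pair A with T and G with C on each one) can be sketched in a few lines of Python. This is only an illustration: the names `PAIRING`, `complement`, `replicate` and `divide`, and the toy sequence, are assumptions made up for this example, and strand direction is deliberately ignored so that the two strands are simply compared position by position.

```python
# Illustrative sketch only: names and the toy sequence are assumptions,
# and strand direction (5 prime versus 3 prime) is ignored here.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base."""
    return "".join(PAIRING[base] for base in strand)


def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Peel the two strands apart and build a new partner on each one."""
    first, second = duplex
    return [(first, complement(first)), (complement(second), second)]


def divide(duplexes: list[tuple[str, str]], rounds: int) -> list[tuple[str, str]]:
    """Replicate every duplex once per round, as happens at each cell division."""
    for _ in range(rounds):
        duplexes = [copy for duplex in duplexes for copy in replicate(duplex)]
    return duplexes


parent = ("GATTACA", complement("GATTACA"))
daughters = replicate(parent)

# Both daughters should match the parent duplex.
print(daughters[0] == parent and daughters[1] == parent)

# Number of duplexes after 0, 1, 2 and 3 rounds of replication.
print([len(divide([parent], rounds)) for rounds in range(4)])
```

Each round doubles the number of copies, which is the link between cell division and DNA duplication that the lecture stresses.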
And I'm just trying to hammer 224 00:19:21,000 --> 00:19:25,000 home the point that these are related. Every time a cell divides 225 00:19:25,000 --> 00:19:29,000 it has to duplicate its genetic information. 226 00:19:29,000 --> 00:19:33,000 That's why I'm going to be telling you about DNA replication. 227 00:19:33,000 --> 00:19:37,000 Here's a picture of that same cancer cell, but watch over here. 228 00:19:37,000 --> 00:19:41,000 This is the DNA. And you see it's doubled. And see how the DNA, 229 00:19:41,000 --> 00:19:46,000 which is the chromosomes, has pulled apart so that at the end you now 230 00:19:46,000 --> 00:19:50,000 have two cells and you've got identical copies of DNA. 231 00:19:50,000 --> 00:19:54,000 So if you're studying cancer, for example, this sort of thing is 232 00:19:54,000 --> 00:19:59,000 relevant to you. OK. So the issue of how -- 233 00:19:59,000 --> 00:20:03,000 Well, before I do that, I'm sorry. Just a couple of things 234 00:20:03,000 --> 00:20:08,000 about DNA replication before I dive into this. So we all started out as 235 00:20:08,000 --> 00:20:13,000 a single cell. I've got a lot more obviously 236 00:20:13,000 --> 00:20:18,000 because I'm made up of a lot of cells. If I took all the DNA in my 237 00:20:18,000 --> 00:20:22,000 body and I lined up all the molecules in it, do you guys have any idea how 238 00:20:22,000 --> 00:20:27,000 long that would be? Who thinks it would reach let's say 239 00:20:27,000 --> 00:20:33,000 across the room? OK. Across campus? 240 00:20:33,000 --> 00:20:39,000 Across Cambridge? Around the world? To the moon? Anybody left? 241 00:20:39,000 --> 00:20:46,000 To the sun? I've got ten to the fourteenth cells. 242 00:20:46,000 --> 00:20:52,000 There's about a meter or two in each cell. 10 to 20 billion miles 243 00:20:52,000 --> 00:20:59,000 of DNA in each of our bodies, human DNA. 244 00:20:59,000 --> 00:21:03,000 They would go back and forth to the sun multiple times. 
245 00:21:03,000 --> 00:21:07,000 So that much DNA had to get replicated in order for the 246 00:21:07,000 --> 00:21:12,000 fertilized egg we all started out as to become you. 247 00:21:12,000 --> 00:21:16,000 Another thing, the accuracy of replication is about 248 00:21:16,000 --> 00:21:21,000 ten to the minus tenth. Most people, including myself, 249 00:21:21,000 --> 00:21:25,000 don't have a very good feel for exponents. So that's one 250 00:21:25,000 --> 00:21:30,000 mistake in 10 billion. You know, it could be one mistake in 251 00:21:30,000 --> 00:21:34,000 10 to the ninety-ninth. Well, what does one mistake in 10 252 00:21:34,000 --> 00:21:39,000 billion mean? So let's relate it to something we know. 253 00:21:39,000 --> 00:21:43,000 If I was typing let's say an eight letter word, 60 words a minute, 254 00:21:43,000 --> 00:21:48,000 24 hours a day, 7 days a week, and I was as good as DNA replication, 255 00:21:48,000 --> 00:21:52,000 how often would I make a mistake? So you can each think of how long 256 00:21:52,000 --> 00:21:57,000 you think that is. But if I was that good on average, 257 00:21:57,000 --> 00:22:02,000 I would make a mistake once every 38 years. 258 00:22:02,000 --> 00:22:06,000 So I'm about to tell you about a process that's absolutely 259 00:22:06,000 --> 00:22:11,000 astonishing in terms of how fast and how much you can do and with an 260 00:22:11,000 --> 00:22:15,000 accuracy that goes beyond what we're used to in our ordinary life. 261 00:22:15,000 --> 00:22:20,000 So how does it do this? It has to be more than just pulling the 262 00:22:20,000 --> 00:22:24,000 strands apart. And there's been some confusion as 263 00:22:24,000 --> 00:22:29,000 to why I'm emphasizing 5 prime and 3 prime. 264 00:22:29,000 --> 00:22:34,000 Well, each of these subunits, each nucleotide, this is a 3 prime 265 00:22:34,000 --> 00:22:39,000 hydroxyl and this is the 5 prime position. 
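As an aside on the typing analogy above, the arithmetic can be checked with a short Python sketch. The inputs are the figures quoted in the lecture (one mistake in 10 billion letters, eight-letter words, 60 words a minute, typing around the clock); the variable names are made up for this example, and the year length ignores leap years, which is an assumption.

```python
# Worked arithmetic for the typing analogy. Inputs are the lecture's figures;
# variable names and the 365-day year are assumptions for this sketch.
ERROR_RATE = 1e-10                 # one mistake per ten billion letters copied
LETTERS_PER_WORD = 8
WORDS_PER_MINUTE = 60
MINUTES_PER_YEAR = 60 * 24 * 365   # typing 24 hours a day, 7 days a week

letters_per_mistake = 1 / ERROR_RATE
letters_per_minute = LETTERS_PER_WORD * WORDS_PER_MINUTE
years_per_mistake = letters_per_mistake / letters_per_minute / MINUTES_PER_YEAR

print(f"about {years_per_mistake:.0f} years between mistakes")
```

With these inputs the answer comes out at roughly four decades, the same ballpark as the figure of about 38 years quoted in the lecture; the small difference presumably comes from rounding.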
If we were joining 266 00:22:39,000 --> 00:22:44,000 together subunits that had a hook and an eye it would make a 267 00:22:44,000 --> 00:22:49,000 difference because it's not the same on both ends. If we're going to 268 00:22:49,000 --> 00:22:54,000 start hooking together it's exactly the same thing when we get to a 269 00:22:54,000 --> 00:22:59,000 biochemical level, the 5 prime end is not the same as 270 00:22:59,000 --> 00:23:04,000 the hydroxyl at the 3 prime end because the whole thing is asymmetric. 271 00:23:04,000 --> 00:23:12,000 So the enzymes that copy DNA are known as DNA polymerases. 272 00:23:12,000 --> 00:23:20,000 And it was a very difficult challenge to figure out how they 273 00:23:20,000 --> 00:23:28,000 operated, but Arthur Kornberg was the first person to solve 274 00:23:28,000 --> 00:23:36,000 this problem. He was an extraordinarily gifted 275 00:23:36,000 --> 00:23:43,000 biochemist. He's still at Stanford. And what he found was if we have a 276 00:23:43,000 --> 00:23:49,000 5 prime end this would then be the 3 prime end, and there's a 3 prime 277 00:23:49,000 --> 00:23:56,000 hydroxyl which is this one right here. And this was paired, 278 00:23:56,000 --> 00:24:02,000 say, with a C and A paired with a T. And let's say a G paired with a C 279 00:24:02,000 --> 00:24:07,000 here. And let's say the next template base was, 280 00:24:07,000 --> 00:24:12,000 let's make it a T. What Arthur Kornberg was able to find was an 281 00:24:12,000 --> 00:24:18,000 enzyme activity that catalyzed a template-dependent replication of 282 00:24:18,000 --> 00:24:23,000 DNA. That was critical because he had to find, if you broke the cells 283 00:24:23,000 --> 00:24:28,000 open, somewhere in that gemisch of enzymes and things from 284 00:24:28,000 --> 00:24:33,000 inside a cell, there had to be something that was 285 00:24:33,000 --> 00:24:37,000 able to copy DNA. So in order to do that he had to 286 00:24:37,000 --> 00:24:41,000 work out an assay. 
And he also had to have some kind 287 00:24:41,000 --> 00:24:45,000 of guess as to what the cell would be using in order to carry out the 288 00:24:45,000 --> 00:24:49,000 synthesis. But one thing that was sort of obvious was a DNA template 289 00:24:49,000 --> 00:24:53,000 because that was being copied. But the other part was you had to 290 00:24:53,000 --> 00:24:58,000 have energy to form a covalent bond. 291 00:24:58,000 --> 00:25:02,000 So somehow there had to be something that was sort of activated with the 292 00:25:02,000 --> 00:25:07,000 energy built into the molecule so that thermodynamically the whole 293 00:25:07,000 --> 00:25:12,000 thing would slide downhill when you made a bond. And he knew that the 294 00:25:12,000 --> 00:25:17,000 cell had triphosphates, just the same type that we talked 295 00:25:17,000 --> 00:25:36,000 about when we talked about ATP. 296 00:25:36,000 --> 00:25:43,000 So this would be a deoxyribonucleotide triphosphate. 297 00:25:43,000 --> 00:25:51,000 And he was able to make a guess, because he had to try things until 298 00:25:51,000 --> 00:25:59,000 he found something that would work, that this was what's used in DNA 299 00:25:59,000 --> 00:26:06,000 synthesis. So this hydroxyl ultimately attacks 300 00:26:06,000 --> 00:26:13,000 this phosphate here. And these two other phosphates then 301 00:26:13,000 --> 00:26:21,000 come off as a leaving group. So if we thought of it as a P 302 00:26:21,000 --> 00:26:28,000 like this with two more Ps here, these two come off and you get a new 303 00:26:28,000 --> 00:26:36,000 bond formed to the phosphate. And so what Kornberg then was able 304 00:26:36,000 --> 00:26:45,000 to find by using a DNA template that had this sort of structure and 305 00:26:45,000 --> 00:26:54,000 [TTATA?] like this, that he was now able to get an A 306 00:26:54,000 --> 00:27:04,000 added here. This hydroxyl here became the new hydroxyl. 
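The template-dependent addition described above, where the next template base is a T and so an A is added at the 3 prime hydroxyl of the growing strand, can be sketched as a toy model in Python. This is a sketch under stated assumptions, not a description of the enzyme: the function names and sequences are hypothetical, the template is written 3 prime to 5 prime so that it lines up position by position with the growing strand written 5 prime to 3 prime, and the chemistry (the triphosphate and the two phosphates that leave) is not modeled at all.

```python
# Illustrative sketch only: function names and sequences are assumptions.
# The template is written 3'->5' so that index i pairs with index i of the
# growing strand, which is written 5'->3'. Chemistry is not modeled.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def add_next_nucleotide(template_3_to_5: str, growing_5_to_3: str) -> str:
    """Extend the growing strand by one base at its 3 prime end."""
    position = len(growing_5_to_3)
    if position >= len(template_3_to_5):
        raise ValueError("template is already fully copied")
    # The existing strand must already be base-paired with the template.
    for index, base in enumerate(growing_5_to_3):
        if PAIRING[template_3_to_5[index]] != base:
            raise ValueError("growing strand is not paired with the template")
    # The next unpaired template base dictates which nucleotide is added.
    incoming = PAIRING[template_3_to_5[position]]
    return growing_5_to_3 + incoming


def copy_template(template_3_to_5: str, primer_5_to_3: str) -> str:
    """Keep adding nucleotides to the 3 prime end until the template runs out."""
    strand = primer_5_to_3
    while len(strand) < len(template_3_to_5):
        strand = add_next_nucleotide(template_3_to_5, strand)
    return strand


template = "CTCTAT"   # hypothetical template, written 3'->5'
primer = "GAG"        # hypothetical growing strand, written 5'->3'

print(add_next_nucleotide(template, primer))  # next template base is T, so A is added
print(copy_template(template, primer))
```

Because new bases are only ever appended to the end that carries the free 3 prime hydroxyl, the strand in this toy model can only grow in one direction.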
307 00:27:04,000 --> 00:27:10,000 And so the direction of synthesis, this strand is the other way, so the 308 00:27:10,000 --> 00:27:16,000 direction of synthesis of a DNA polymerase, it's polymerizing in the 309 00:27:16,000 --> 00:27:22,000 5 prime to 3 prime direction. This was again an amazing discovery 310 00:27:22,000 --> 00:27:28,000 because it was the first time that anyone had found an enzyme 311 00:27:28,000 --> 00:27:34,000 that could copy DNA. Arthur Kornberg got a Nobel Prize 312 00:27:34,000 --> 00:27:38,000 for it. But at this point actually genetics came in because there was a 313 00:27:38,000 --> 00:27:43,000 scientist John Cairns who was at that point down at Cold Spring 314 00:27:43,000 --> 00:27:47,000 Harbor, as I told you the other day. And John, in spite of the fact that 315 00:27:47,000 --> 00:27:52,000 Arthur had found a DNA polymerase that had all the properties that you 316 00:27:52,000 --> 00:27:56,000 would expect for copying DNA, didn't think that was the one that 317 00:27:56,000 --> 00:28:01,000 actually copied the DNA necessary for cellular replication. 318 00:28:01,000 --> 00:28:05,000 So he reasoned if he was right he'd be able to find a mutation that 319 00:28:05,000 --> 00:28:09,000 would eliminate the activity of that enzyme and the cell would still live. 320 00:28:09,000 --> 00:28:13,000 And so they did a screening, and it was a lot of work, but they 321 00:28:13,000 --> 00:28:18,000 eventually found a mutant of E. coli that lacked this DNA polymerase 322 00:28:18,000 --> 00:28:22,000 that Arthur Kornberg had discovered. And the cell was still alive and 323 00:28:22,000 --> 00:28:26,000 was still replicating its DNA. So it told both John and then 324 00:28:26,000 --> 00:28:31,000 Arthur there must be another enzyme in the cell. 325 00:28:31,000 --> 00:28:34,000 And so Arthur went back. 
And now working in a mutant that 326 00:28:34,000 --> 00:28:37,000 was missing this first polymerase he found the one that 327 00:28:37,000 --> 00:28:40,000 really replicates the DNA. The first one is important, 328 00:28:40,000 --> 00:28:44,000 too. It's needed for DNA repair. I'm going to talk to you about that 329 00:28:44,000 --> 00:28:47,000 in the next lecture, but it's not absolutely crucial for 330 00:28:47,000 --> 00:28:50,000 life. And there's an interplay of genetics and biochemistry. 331 00:28:50,000 --> 00:28:53,000 And you'll see I'm just sort of foreshadowing what we're going to 332 00:28:53,000 --> 00:28:57,000 get to when we talk about the genetics of this. 333 00:28:57,000 --> 00:29:00,000 And I know a couple of you clearly were frustrated about me showing you 334 00:29:00,000 --> 00:29:03,000 pictures of the people who did this, but nevertheless since this was such 335 00:29:03,000 --> 00:29:07,000 a historic event a couple of years ago at Cold Spring Harbor. 336 00:29:07,000 --> 00:29:10,000 This you see the helix model down there. There was Jim Watson opening 337 00:29:10,000 --> 00:29:14,000 the symposium. When I got up to talk I said, 338 00:29:14,000 --> 00:29:17,000 well, I told my students that I'd let them know what it was like when 339 00:29:17,000 --> 00:29:21,000 I was there, so I took out a camera and I took a picture of the audience. 340 00:29:21,000 --> 00:29:24,000 And so there are a bunch of Nobel Laureates and types here who were 341 00:29:24,000 --> 00:29:28,000 sitting there smiling for you guys in the class. And there was Arthur 342 00:29:28,000 --> 00:29:32,000 Kornberg giving his talk. Now, these DNA polymerases are 343 00:29:32,000 --> 00:29:36,000 incredible protein machines. The crystal structures of DNA 344 00:29:36,000 --> 00:29:40,000 polymerases operating on their template have been solved. 
345 00:29:40,000 --> 00:29:44,000 And you can solve, depending on how many diffractions 346 00:29:44,000 --> 00:29:48,000 you can get, you can get a model that's more and more detailed. 347 00:29:48,000 --> 00:29:52,000 And there have been very high resolution models of DNA polymerases. 348 00:29:52,000 --> 00:29:56,000 This blue and white stuff is the surface of the protein, 349 00:29:56,000 --> 00:30:00,000 and this is sort of a template and the various parts. 350 00:30:00,000 --> 00:30:04,000 Just to give you an idea here are these tracings of the shapes of the 351 00:30:04,000 --> 00:30:08,000 electron density. You can see how the 352 00:30:08,000 --> 00:30:12,000 crystallographers have fit the nucleotides right in the crystal 353 00:30:12,000 --> 00:30:17,000 into these electron densities. And here putting it together a bit 354 00:30:17,000 --> 00:30:21,000 in the blue is the secondary structure of the protein and the 355 00:30:21,000 --> 00:30:25,000 templates and whatnot. And I don't expect you to see very 356 00:30:25,000 --> 00:30:29,000 much in that, but the point is I wanted to sort of just set you up to 357 00:30:29,000 --> 00:30:34,000 show you this little movie. Because DNA polymerases are 358 00:30:34,000 --> 00:30:38,000 incredible machines. They copy at about a thousand 359 00:30:38,000 --> 00:30:43,000 nucleotides a second and their accuracy is really amazing. 360 00:30:43,000 --> 00:30:47,000 And I'll tell you all the tricks to the accuracy in the next lecture, 361 00:30:47,000 --> 00:30:52,000 but I want to show you this little movie because this is sort of a 362 00:30:52,000 --> 00:30:56,000 simulation of what must happen every time a nucleotide is added. 363 00:30:56,000 --> 00:31:01,000 Now, we'll see this over and over again so I'll take it in pieces. 364 00:31:01,000 --> 00:31:04,000 The yellow and the orange are the secondary structures. 365 00:31:04,000 --> 00:31:08,000 That's an alpha helix. 
And certainly one thing you can see 366 00:31:08,000 --> 00:31:12,000 is happening, as we're looking at this, is the parts of the protein 367 00:31:12,000 --> 00:31:16,000 are moving during this. So you can see this alpha helix 368 00:31:16,000 --> 00:31:20,000 that's sort of swinging up and swinging back down. 369 00:31:20,000 --> 00:31:24,000 Now, what's over here is the template base. 370 00:31:24,000 --> 00:31:28,000 That's the base that corresponds to the T that I was just 371 00:31:28,000 --> 00:31:31,000 showing you here. This is the incoming nucleotide. 372 00:31:31,000 --> 00:31:35,000 There is the triphosphate coming down here. And, 373 00:31:35,000 --> 00:31:39,000 in fact, you just see those two phosphates going. 374 00:31:39,000 --> 00:31:43,000 So what's happening here, this is going to be the end of the 375 00:31:43,000 --> 00:31:46,000 growing chain. It's going to attack right there, 376 00:31:46,000 --> 00:31:50,000 join the phosphate and the pyrophosphate will leave. 377 00:31:50,000 --> 00:31:54,000 And if you'll take a look, when you see this movement of this 378 00:31:54,000 --> 00:31:58,000 helix from the beginning state to up to here, you'll see what happens. 379 00:31:58,000 --> 00:32:02,000 It's squeezing the template base and the incoming nucleotide together. 380 00:32:02,000 --> 00:32:07,000 What it's really doing is testing for the correct shape. 381 00:32:07,000 --> 00:32:11,000 Remember the shape of an A-T base pair and a G-C base pair is the same. 382 00:32:11,000 --> 00:32:16,000 And for those of you who are confused about guanine and the 383 00:32:16,000 --> 00:32:21,000 keto-enol thing, try to draw hydrogen bonds with the 384 00:32:21,000 --> 00:32:25,000 enol form of guanine and see how you do. I think you'll begin 385 00:32:25,000 --> 00:32:30,000 to understand a bit. So at the heart of life is something 386 00:32:30,000 --> 00:32:36,000 that can copy DNA.
And there are these exquisitely 387 00:32:36,000 --> 00:32:42,000 beautiful machines. The replication machine in E. 388 00:32:42,000 --> 00:32:48,000 coli has 18 proteins and the ones in our bodies are even more 389 00:32:48,000 --> 00:32:54,000 sophisticated with even more parts. OK. But to replicate a DNA 390 00:32:54,000 --> 00:33:00,000 molecule there's another problem that comes up. 391 00:33:00,000 --> 00:33:18,000 Because DNA polymerases copy -- 392 00:33:18,000 --> 00:33:23,000 -- and grow chains in a 5 prime to 3 prime direction. 393 00:33:23,000 --> 00:33:36,000 And they need a 3 prime hydroxyl 394 00:33:36,000 --> 00:33:43,000 terminus. So they won't work if you just gave it a single strand of DNA. 395 00:33:43,000 --> 00:33:51,000 No DNA polymerase can handle that. It has to have something like this 396 00:33:51,000 --> 00:34:03,000 where there's a template strand -- 397 00:34:03,000 --> 00:34:07,000 -- and there's what's known as the primer strand. 398 00:34:07,000 --> 00:34:11,000 So there has to be something that has the 3 prime hydroxyl and there 399 00:34:11,000 --> 00:34:16,000 has to be something that's going to provide the template that's going to 400 00:34:16,000 --> 00:34:20,000 be copied. So if we pull strands apart like this with 5 prime to 3 401 00:34:20,000 --> 00:34:24,000 prime then they'll be 5 prime to 3 prime running in the opposite 402 00:34:24,000 --> 00:34:31,000 direction. If we have a template like this, 403 00:34:31,000 --> 00:34:39,000 this is OK because the strand here can be copied 5 prime to 3 prime. 404 00:34:39,000 --> 00:34:48,000 This is the new strand being synthesized by the DNA polymerase. 405 00:34:48,000 --> 00:34:56,000 But what about the other strand? The replication fork is moving in 406 00:34:56,000 --> 00:35:03,000 this direction, but if the -- So here is the 3 to 5 prime 407 00:35:03,000 --> 00:35:08,000 direction here.
So if the DNA polymerase is going 408 00:35:08,000 --> 00:35:13,000 to be copying this strand it's going to be moving backwards to the 409 00:35:13,000 --> 00:35:17,000 direction of the replication fork. Now, I guess evolution and nature 410 00:35:17,000 --> 00:35:22,000 could have selected for two types of DNA polymerases, 411 00:35:22,000 --> 00:35:27,000 one that copies 5 prime to 3 prime and one that copies in the opposite 412 00:35:27,000 --> 00:35:32,000 direction. But it didn't. And there are a number of 413 00:35:32,000 --> 00:35:36,000 theoretical reasons that we could discuss in a more advanced course 414 00:35:36,000 --> 00:35:40,000 perhaps for why that is true. But, in fact, what it does is it 415 00:35:40,000 --> 00:35:44,000 uses the same polymerase. So as these things peel apart the 416 00:35:44,000 --> 00:35:48,000 polymerase works in the other direction, but there's another 417 00:35:48,000 --> 00:35:52,000 problem. If I just peel it apart like this there's no 3 prime 418 00:35:52,000 --> 00:35:56,000 hydroxyl. So it took people quite a few years to figure out the strategy 419 00:35:56,000 --> 00:36:01,000 that's used in nature. Nature has a special enzyme that 420 00:36:01,000 --> 00:36:08,000 makes a little piece of RNA. It's called an RNA primer. And 421 00:36:08,000 --> 00:36:15,000 what it does is it provides a 3 prime hydroxyl. 422 00:36:15,000 --> 00:36:22,000 And once you have the 3 prime hydroxyl at the end of the little 423 00:36:22,000 --> 00:36:29,000 RNA chain then the DNA polymerase -- 424 00:36:29,000 --> 00:36:39,000 -- can be made 5 prime to 3 prime. 425 00:36:39,000 --> 00:36:47,000 So as you peel open the replication fork then little pieces of RNA are 426 00:36:47,000 --> 00:36:55,000 used to make a new strand of DNA and it goes this way. 427 00:36:55,000 --> 00:37:03,000 Now that obviously doesn't give you a new intact DNA strand. 
428 00:37:03,000 --> 00:37:08,000 And part of the clue to this working out what was going on at DNA 429 00:37:08,000 --> 00:37:13,000 replication was the recognition that newly synthesized DNA was made as 430 00:37:13,000 --> 00:37:18,000 little pieces. And then later it got joined into 431 00:37:18,000 --> 00:37:23,000 longer pieces. And the person who discovered this 432 00:37:23,000 --> 00:37:28,000 was Okazaki. So these fragments of DNA that are synthesized initially 433 00:37:28,000 --> 00:37:34,000 are called Okazaki fragments, after the person who discovered this. 434 00:37:34,000 --> 00:37:38,000 It was rather puzzling because when you tried to look at the synthesis 435 00:37:38,000 --> 00:37:42,000 of DNA you're looking at a long molecule, and you found some of the 436 00:37:42,000 --> 00:37:47,000 newly synthesized material was in short pieces. And as you watched 437 00:37:47,000 --> 00:37:51,000 over time it got longer. So the cell, I think you can sort 438 00:37:51,000 --> 00:37:55,000 of see from first principles what has to happen here then. 439 00:37:55,000 --> 00:38:06,000 That in order to come and make -- 440 00:38:06,000 --> 00:38:10,000 This strand is pretty easy to do, but what the cells have to do now is 441 00:38:10,000 --> 00:38:14,000 they've got these little RNA primers. 442 00:38:14,000 --> 00:38:25,000 And then they remove the RNA by an 443 00:38:25,000 --> 00:38:32,000 enzyme that's capable of degrading the RNA or clipping it 444 00:38:32,000 --> 00:38:38,000 at the junction. And that then leaves the cell in 445 00:38:38,000 --> 00:38:43,000 this sort of situation where there are little tiny gaps in between 446 00:38:43,000 --> 00:38:48,000 these pieces of DNA. But at the end of each one of these 447 00:38:48,000 --> 00:38:53,000 is a 3 prime hydroxyl. So another polymerase or one or 448 00:38:53,000 --> 00:38:58,000 another polymerase in the cell can fill those little pieces 449 00:38:58,000 --> 00:39:04,000 of DNA out.
And then there's one little nick 450 00:39:04,000 --> 00:39:10,000 that needs to be sealed. And so what you finally end up with 451 00:39:10,000 --> 00:39:16,000 is a 3 prime hydroxyl here, a 5 prime phosphate that's at the 452 00:39:16,000 --> 00:39:22,000 other end, and then these are joined together. This is one nucleotide 453 00:39:22,000 --> 00:39:28,000 here and the other here. These are then joined together to 454 00:39:28,000 --> 00:39:34,000 give the ordinary phosphodiester bond that links -- 455 00:39:34,000 --> 00:39:41,000 -- the two nucleotides together like 456 00:39:41,000 --> 00:39:46,000 that. And the enzyme that does that is an enzyme called DNA ligase. 457 00:39:46,000 --> 00:39:51,000 You can almost think about it as DNA Scotch tape that will take a 458 00:39:51,000 --> 00:39:56,000 little nick in DNA, if we've got a phosphate and 459 00:39:56,000 --> 00:40:01,000 hydroxyl, and it will join them together. 460 00:40:01,000 --> 00:40:06,000 So this process of replication, which can go at about a thousand 461 00:40:06,000 --> 00:40:12,000 nucleotides a second with this amazing degree of accuracy, 462 00:40:12,000 --> 00:40:17,000 uses two different DNA polymerases, both of which biochemically can only 463 00:40:17,000 --> 00:40:23,000 go in one direction. But you can see they have to be 464 00:40:23,000 --> 00:40:28,000 somehow oriented so that one of them is able to move in this direction 465 00:40:28,000 --> 00:40:34,000 and the other one is able to move in that direction. 466 00:40:34,000 --> 00:40:38,000 The key thing in this part of the course is to try and understand this 467 00:40:38,000 --> 00:40:42,000 5 to 3 prime and to get this basic idea that nature had to do something.
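As an editorial aside, the bookkeeping for the strand that is copied in pieces can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of the real enzymes: the function names, the example sequence and the fixed fragment length are all hypothetical, and the RNA primers, their removal and the gap filling are left out, so only the "copy short pieces 5 prime to 3 prime, then join them" idea is shown.

```python
# Toy model of lagging-strand synthesis (illustrative only; names are hypothetical).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the new strand, written 5'->3', for a template written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def okazaki_fragments(lagging_template: str, fragment_len: int) -> list[str]:
    """Copy the lagging-strand template in short pieces as the fork opens.

    Assumption: the template is written 5'->3' from left to right and the
    fork opens from the left, so chunks are exposed in left-to-right order.
    Each chunk is copied 5'->3', which runs backwards relative to the fork.
    """
    fragments = []
    for start in range(0, len(lagging_template), fragment_len):
        chunk = lagging_template[start:start + fragment_len]
        fragments.append(reverse_complement(chunk))
    return fragments


def ligate(fragments: list[str]) -> str:
    """Join the fragments into one strand, written 5'->3'.

    The most recently made fragment sits nearest the fork, which is the 5'
    end of the finished strand, so fragments are joined in reverse order
    of synthesis.
    """
    return "".join(reversed(fragments))


template = "ATGCCGTAAGCT"  # made-up sequence
fragments = okazaki_fragments(template, 4)
new_strand = ligate(fragments)
print(fragments)
print(new_strand)
```

The joined strand comes out identical to what a single continuous copy of the template would give, which is the point of the ligase step: the pieces are made separately but the final product is an ordinary intact strand.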
468 00:40:42,000 --> 00:40:47,000 It was fairly easy to copy one strand because the 469 00:40:47,000 --> 00:40:51,000 direction of the polymerase movement was the same as the replication fork 470 00:40:51,000 --> 00:40:55,000 movement, but the other strand had to have been much 471 00:40:55,000 --> 00:40:59,000 more of a problem. And so when you get down to a 472 00:40:59,000 --> 00:41:03,000 biochemical level, though, it's very conceptually easy 473 00:41:03,000 --> 00:41:07,000 to say, oh, you've got complementary strands so we just take it apart, 474 00:41:07,000 --> 00:41:11,000 we take the photograph and the negative and we make the opposite 475 00:41:11,000 --> 00:41:14,000 one and now we've got two copies. When you get down to the 476 00:41:14,000 --> 00:41:18,000 biochemical details there is this major biochemical issue of whether 477 00:41:18,000 --> 00:41:22,000 the polymerase can go in the 3 prime or the 5 prime direction. 478 00:41:22,000 --> 00:41:26,000 And nature has chosen to do it all or has been selected to do it all 479 00:41:26,000 --> 00:41:30,000 somehow with a polymerase going in one direction. 480 00:41:30,000 --> 00:41:35,000 There are many other aspects to DNA replication. And one of the tricks 481 00:41:35,000 --> 00:41:40,000 that I find most fascinating is that these polymerases, 482 00:41:40,000 --> 00:41:45,000 once they get on DNA they stay on. And that's part of the secret 483 00:41:45,000 --> 00:41:50,000 because it takes about a millisecond to add a nucleotide, 484 00:41:50,000 --> 00:41:55,000 but if it comes off the DNA and has to get back on, then it 485 00:41:55,000 --> 00:42:00,000 takes about a minute. So the whole trick to being a very, 486 00:42:00,000 --> 00:42:04,000 very fast DNA polymerase is to somehow hang onto the DNA.
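The arithmetic behind that last point can be made concrete with a short sketch. The two time constants are the rough figures quoted in the lecture (about a millisecond per nucleotide, about a minute to get back on); the function name and the example values for how many nucleotides are added per binding event are assumptions chosen only for illustration.

```python
# Rough figures quoted in the lecture.
ADD_TIME_S = 0.001     # about 1 millisecond to add one nucleotide
REBIND_TIME_S = 60.0   # about 1 minute to get back onto the DNA


def effective_rate(nucleotides_per_binding: int) -> float:
    """Average nucleotides per second if the polymerase falls off after
    adding a fixed number of nucleotides and then has to rebind.

    Hypothetical simplification: every binding event adds the same number
    of nucleotides and every rebinding costs the same time.
    """
    cycle_time = nucleotides_per_binding * ADD_TIME_S + REBIND_TIME_S
    return nucleotides_per_binding / cycle_time


# Illustrative values only: a polymerase that lets go often versus one
# that hangs on for a very long stretch.
for n in (10, 1_000, 1_000_000):
    print(f"{n:>9} nucleotides per binding: {effective_rate(n):8.1f} nt/s")
```

With these numbers, a polymerase that falls off every thousand nucleotides spends almost all of its time getting back on, and only one that stays on for a very long stretch comes close to the thousand nucleotides a second mentioned above.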
487 00:42:04,000 --> 00:42:08,000 So what biochemists did was they purified the actual enzymatic 488 00:42:08,000 --> 00:42:12,000 activity that could carry out this process, and then they started to 489 00:42:12,000 --> 00:42:16,000 look for other protein factors that would help the process to work 490 00:42:16,000 --> 00:42:20,000 better. And they discovered something called a processivity 491 00:42:20,000 --> 00:42:24,000 factor which made the polymerase stay on the DNA. 492 00:42:24,000 --> 00:42:28,000 And people wondered for a lot of years how that worked and why did 493 00:42:28,000 --> 00:42:33,000 this system work so well. And finally the crystal structure of 494 00:42:33,000 --> 00:42:38,000 the processivity factor was discovered. And if I go back to 495 00:42:38,000 --> 00:42:43,000 this sort of diagram where this is the piece of DNA that's copied, 496 00:42:43,000 --> 00:42:48,000 what it turned out was that the processivity factor is basically a 497 00:42:48,000 --> 00:42:53,000 doughnut that kind of gets clamped around the DNA like that. 498 00:42:53,000 --> 00:42:58,000 So it's sort of like taking a washer with a place where you can 499 00:42:58,000 --> 00:43:03,000 pry it apart opening it up, putting it around the DNA like this. 500 00:43:03,000 --> 00:43:07,000 And then the polymerase, more or less since this is 501 00:43:07,000 --> 00:43:11,000 topologically linked to the DNA, is like a washer sliding on a wire. 502 00:43:11,000 --> 00:43:16,000 This DNA polymerase hangs onto that and it doesn't come off. 503 00:43:16,000 --> 00:43:20,000 And I think there's a little picture of it. 504 00:43:20,000 --> 00:43:25,000 Here's a little movie. There's the DNA going through and 505 00:43:25,000 --> 00:43:29,000 this is one of these clamps. It's virtually the same structure 506 00:43:29,000 --> 00:43:34,000 in a bacterium and inside of us. But, in fact, the amino acids are 507 00:43:34,000 --> 00:43:38,000 almost all different. 
But the underlying structure of the 508 00:43:38,000 --> 00:43:43,000 protein is almost identical. And there are special machines 509 00:43:43,000 --> 00:43:48,000 called clamp loaders that pry open these clamps, clamp them around DNA, 510 00:43:48,000 --> 00:43:52,000 and that's part of the secret to how these polymerases are able to 511 00:43:52,000 --> 00:43:57,000 polymerize DNA so fast. There are a lot of other pieces of 512 00:43:57,000 --> 00:44:02,000 this machinery. If you go on you'll hear more about 513 00:44:02,000 --> 00:44:06,000 them. I just want to give you one of the most recent insights. 514 00:44:06,000 --> 00:44:10,000 I mean this, as you might guess, since DNA replication is at the 515 00:44:10,000 --> 00:44:14,000 heart of life it's been studied very, very hard, ever since the discovery 516 00:44:14,000 --> 00:44:19,000 of the DNA helix. My colleague, Alan Grossman, made quite a 517 00:44:19,000 --> 00:44:23,000 discovery just probably three or four years ago. 518 00:44:23,000 --> 00:44:27,000 He took that green fluorescent protein that we've seen a few times, 519 00:44:27,000 --> 00:44:31,000 and he actually joined the gene encoding green fluorescent protein 520 00:44:31,000 --> 00:44:36,000 to the back end of a piece of the DNA polymerase. 521 00:44:36,000 --> 00:44:39,000 So wherever the DNA polymerase went now there was a little fluorescent 522 00:44:39,000 --> 00:44:43,000 molecule. And he looked to see where it was in the cell. 523 00:44:43,000 --> 00:44:47,000 And I, like many other people, had for years taught, and this is 524 00:44:47,000 --> 00:44:50,000 why, you know, I have respect for the fact that I'm 525 00:44:50,000 --> 00:44:54,000 just teaching you the current model. For much of my career I taught, so 526 00:44:54,000 --> 00:44:58,000 DNA polymerase is sort of like a train going down the tracks a 527 00:44:58,000 --> 00:45:03,000 thousand nucleotides per second.
And we're doing all this stuff with 528 00:45:03,000 --> 00:45:09,000 the leading and with the two strands. So let me just put those words up 529 00:45:09,000 --> 00:45:15,000 while I'm up there. This one is called, 530 00:45:15,000 --> 00:45:22,000 this strand that's easy to replicate is called the leading strand. 531 00:45:22,000 --> 00:45:28,000 And this one where you have to do the primer and whatnot is called the 532 00:45:28,000 --> 00:45:33,000 lagging strand. In any case, what I had taught was 533 00:45:33,000 --> 00:45:37,000 that polymerase was like a train running on tracks. 534 00:45:37,000 --> 00:45:41,000 You could calculate how fast it would move. What Alan, 535 00:45:41,000 --> 00:45:45,000 to his amazement I imagine, found was when he looked to see 536 00:45:45,000 --> 00:45:48,000 where the DNA polymerase was, it wasn't spread out all over the 537 00:45:48,000 --> 00:45:52,000 cell as if you thought it was a thing running on tracks. 538 00:45:52,000 --> 00:45:56,000 In fact, it was in the center of the cell. And then late in cell 539 00:45:56,000 --> 00:46:00,000 division it split into two spots that went to the midpoints of what 540 00:46:00,000 --> 00:46:04,000 would be the daughter cells. And so what he ended up realizing 541 00:46:04,000 --> 00:46:09,000 from that was that instead it was more as if the polymerase was a 542 00:46:09,000 --> 00:46:14,000 factory and it pulled the DNA through it rather than it traveling 543 00:46:14,000 --> 00:46:19,000 down the tracks of the DNA. And that was a very surprising 544 00:46:19,000 --> 00:46:24,000 discovery that went against all the dogma and all the pictures in the 545 00:46:24,000 --> 00:46:30,000 textbook. And it was a discovery at MIT. 546 00:46:30,000 --> 00:46:34,000 That was published in, I think it was 2001, something like 547 00:46:34,000 --> 00:46:39,000 that, a very recent discovery. Things keep changing.
That's again 548 00:46:39,000 --> 00:46:43,000 why I keep emphasizing I cannot teach you a fact in biology. 549 00:46:43,000 --> 00:46:48,000 I can teach you the best understanding we have that explains 550 00:46:48,000 --> 00:46:52,000 the experiments to date. But somebody may make a discovery 551 00:46:52,000 --> 00:46:57,000 tomorrow. That means we'll have to change our understanding. 552 00:46:57,000 --> 00:47:00,000 OK? So good luck on the exam. I'll see you on Monday, OK?