1 00:00:00,000 --> 00:00:05,000 OK, so I'd like to go now to the next segment of the course. 2 00:00:05,000 --> 00:00:11,000 I think you can probably appreciate a little bit better this triangle I 3 00:00:11,000 --> 00:00:17,000 had on before about how what biochemists did was they tended to 4 00:00:17,000 --> 00:00:23,000 break cells open, look at the component parts, 5 00:00:23,000 --> 00:00:29,000 there are other things in there, and then proteins. 6 00:00:29,000 --> 00:00:33,000 But an awful lot of stuff having to do with function is proteins, 7 00:00:33,000 --> 00:00:37,000 and what geneticists, the discipline of genetics, would do 8 00:00:37,000 --> 00:00:41,000 was make mutants of living organisms, and then look at how 9 00:00:41,000 --> 00:00:45,000 function was affected by mutating individual genes, 10 00:00:45,000 --> 00:00:49,000 and how those were both very powerful approaches. Genetics told you what 11 00:00:49,000 --> 00:00:53,000 was really important, and biochemistry told you how it 12 00:00:53,000 --> 00:00:57,000 worked at a molecular level, but the real problem was knowing 13 00:00:57,000 --> 00:01:01,000 whether this thing you had doing something in the test tube was 14 00:01:01,000 --> 00:01:06,000 actually the one that did it in life. 15 00:01:06,000 --> 00:01:09,000 And I think I told you, when Arthur Kornberg isolated the very first DNA 16 00:01:09,000 --> 00:01:13,000 polymerase, he was able to copy DNA. And he got a Nobel prize, and it 17 00:01:13,000 --> 00:01:17,000 was the first enzyme that could copy DNA, and then John Cairns made a 18 00:01:17,000 --> 00:01:20,000 mutant that was lacking the enzyme. And the organism was still alive. 19 00:01:20,000 --> 00:01:24,000 So therefore, it couldn't have been that DNA polymerase that was copying 20 00:01:24,000 --> 00:01:28,000 the chromosome. It actually turned out to be a DNA 21 00:01:28,000 --> 00:01:31,000 repair enzyme. 
So, if you can actually unite 22 00:01:31,000 --> 00:01:35,000 genetics and biochemistry, if you can take a mutant that's 23 00:01:35,000 --> 00:01:39,000 broken in a function and you find your protein is missing, 24 00:01:39,000 --> 00:01:42,000 or vice versa, then you have a very, very powerful insight because you 25 00:01:42,000 --> 00:01:46,000 connect your knowledge of what is physiologically important to the 26 00:01:46,000 --> 00:01:50,000 biochemistry that you're doing. But it was really, really hard for 27 00:01:50,000 --> 00:01:53,000 years, and only on very rare occasions did some geneticist have 28 00:01:53,000 --> 00:01:57,000 a mutant that suggested something so strongly that some biochemist would look, 29 00:01:57,000 --> 00:02:01,000 or some biochemist would have such a powerful result that they talked 30 00:02:01,000 --> 00:02:05,000 to a geneticist to see if anybody had found the result. 31 00:02:05,000 --> 00:02:08,000 And the power of recombinant DNA, although it started a biotech 32 00:02:08,000 --> 00:02:12,000 industry, and it made possible the sequencing of the genome, 33 00:02:12,000 --> 00:02:16,000 is at another level, one higher in conceptual understanding. 34 00:02:16,000 --> 00:02:20,000 And what it did was, it let you go back and forth between here and here. 35 00:02:20,000 --> 00:02:24,000 You wouldn't have any problem now if I gave you the sequence of the 36 00:02:24,000 --> 00:02:27,000 gene; you could order it. You could stick it in a cell and 37 00:02:27,000 --> 00:02:31,000 make massive amounts of the protein. You'd purify it twenty-fold and it 38 00:02:31,000 --> 00:02:35,000 would be pure, whereas before you might have had to 39 00:02:35,000 --> 00:02:39,000 purify it fifteen thousand-fold out of 1,000 g of cells, 40 00:02:39,000 --> 00:02:43,000 and you would have had to be a very good biochemist with 15 steps 41 00:02:43,000 --> 00:02:46,000 in order to purify it. 
So, you can go from the sequence of 42 00:02:46,000 --> 00:02:50,000 the gene to the protein, or if we've got a protein and we want 43 00:02:50,000 --> 00:02:54,000 to know which gene, we'd just sequence a bit of the protein, 44 00:02:54,000 --> 00:02:58,000 use the genetic code, work ourselves backwards to some possible 45 00:02:58,000 --> 00:03:02,000 sequences, then go looking for the sequences, and then go 46 00:03:02,000 --> 00:03:05,000 find the gene. So, what recombinant DNA allows us 47 00:03:05,000 --> 00:03:09,000 to do is close that loop. You can go from genetics to 48 00:03:09,000 --> 00:03:13,000 biochemistry and back and forth. Now everybody does everything 49 00:03:13,000 --> 00:03:17,000 instead of it being isolated disciplines, which it was when I 50 00:03:17,000 --> 00:03:20,000 entered the field. So, all of this stuff depended on 51 00:03:20,000 --> 00:03:24,000 developing the ability to clone particular pieces of DNA. 52 00:03:24,000 --> 00:03:28,000 And I want to make clear right at the beginning, 53 00:03:28,000 --> 00:03:32,000 there's a couple of uses of cloning that are in popular 54 00:03:32,000 --> 00:03:35,000 usage right now. What we're talking about in this 55 00:03:35,000 --> 00:03:39,000 lecture is cloning a piece of DNA. What that means is I'm going to 56 00:03:39,000 --> 00:03:43,000 take a particular segment of DNA, say, cut it here, and cut it there. 57 00:03:43,000 --> 00:03:46,000 And I'm going to take that piece, and I'm going to do something to it 58 00:03:46,000 --> 00:03:50,000 that lets me amplify it and make many, many copies of that piece of 59 00:03:50,000 --> 00:03:54,000 DNA. As with cloning of anything else, you make a whole lot of copies. 60 00:03:54,000 --> 00:03:58,000 So that's one use of the word cloning. 
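The step of working backwards from a bit of protein sequence to "some possible sequences" of DNA can be sketched in Python. This is only an illustration, assuming the standard genetic code and one-letter amino acid codes; the names `CODONS_FOR`, `count_candidates`, and `back_translate` are made up for this sketch and do not come from the lecture.

```python
from itertools import product
from math import prod

# Standard genetic code as DNA codons (coding strand, 5' to 3').
# Codons are enumerated in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> every codon that encodes it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def count_candidates(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def back_translate(peptide):
    """List every DNA sequence that could encode the peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    peptide = "MWK"  # Met-Trp-Lys, a hypothetical stretch of protein sequence
    print(count_candidates(peptide))
    print(back_translate(peptide))
```

Because most amino acids have several codons, the number of candidate sequences grows quickly with peptide length, which is why one picks a short stretch rich in amino acids with few codons, such as Met and Trp.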
61 00:03:58,000 --> 00:04:03,000 The other use, which you see in the popular press 62 00:04:03,000 --> 00:04:08,000 all the time, is cloning an individual, and what's done there is 63 00:04:08,000 --> 00:04:13,000 you would take the nucleus from a cell of the individual, 64 00:04:13,000 --> 00:04:19,000 you put that nucleus into an egg that has had its own nucleus removed, 65 00:04:19,000 --> 00:04:24,000 and now you hope what you get out of that is an organism that has all the 66 00:04:24,000 --> 00:04:30,000 same genetic content as the starting individual. 67 00:04:30,000 --> 00:04:33,000 And in fact, although it sounds very good on paper, 68 00:04:33,000 --> 00:04:37,000 as you're probably beginning to see, it's not the panacea that people 69 00:04:37,000 --> 00:04:41,000 thought it was, nor do we have to worry that in 70 00:04:41,000 --> 00:04:45,000 10 years, all of my MIT students will be clones of the brightest 71 00:04:45,000 --> 00:04:48,000 person in the class or anything like that, because other stuff happens, 72 00:04:48,000 --> 00:04:52,000 which you won't see unless you go on to advanced biology courses, 73 00:04:52,000 --> 00:04:56,000 but there are modifications of DNA. There's all sorts of stuff that 74 00:04:56,000 --> 00:05:00,000 happens to it, so it's not identical. 75 00:05:00,000 --> 00:05:03,000 And so many of these cloned organisms, like Dolly the sheep, 76 00:05:03,000 --> 00:05:07,000 that was famous, died early with, I forget, arthritis and things. So, 77 00:05:07,000 --> 00:05:11,000 there are a lot of problems on that score. But that's the other use of 78 00:05:11,000 --> 00:05:15,000 the word cloning, and that's not what these next three lectures are about; we 79 00:05:15,000 --> 00:05:19,000 are talking about cloning a piece of DNA. 
And that was the big problem 80 00:05:19,000 --> 00:05:22,000 that faced the field, certainly when I was an undergrad 81 00:05:22,000 --> 00:05:26,000 and even when I was a grad student. I was interested in synthesizing 82 00:05:26,000 --> 00:05:30,000 pieces of DNA, and it was one of those things where 83 00:05:30,000 --> 00:05:34,000 people said, why are you doing it? Well, because you could try to do 84 00:05:34,000 --> 00:05:38,000 it. What if you got a piece of DNA, like Gobind Khorana, who's my 85 00:05:38,000 --> 00:05:42,000 colleague, who got the Nobel prize, and who synthesized the first gene. 86 00:05:42,000 --> 00:05:46,000 He synthesized it. It was a tRNA gene that was 120 87 00:05:46,000 --> 00:05:50,000 nucleotide base pairs long or something. He synthesized it. 88 00:05:50,000 --> 00:05:55,000 He'd shown you could do it. But you couldn't do anything with it. 89 00:05:55,000 --> 00:05:59,000 And there were sort of two big problems. One was the fact that 90 00:05:59,000 --> 00:06:04,000 this DNA, although it's not a monotonous tetranucleotide, 91 00:06:04,000 --> 00:06:07,000 it's pretty hard to tell. Each one of these things is a base 92 00:06:07,000 --> 00:06:11,000 pair, and human DNA has 3 billion of those. And a bit down here 93 00:06:11,000 --> 00:06:15,000 doesn't look very different than the bit out there. 94 00:06:15,000 --> 00:06:19,000 And it certainly wouldn't look very different than the bit of DNA 95 00:06:19,000 --> 00:06:23,000 that's 2 billion base pairs over on the other side of campus or 96 00:06:23,000 --> 00:06:27,000 something. So, there was no way to take DNA and cut 97 00:06:27,000 --> 00:06:31,000 it reproducibly, so you could get fragments. 98 00:06:31,000 --> 00:06:34,000 What you could see from first principles was that what you would need 99 00:06:34,000 --> 00:06:38,000 was magic scissors. And what would the scissors look 100 00:06:38,000 --> 00:06:41,000 like? 
Well, it would have to be scissors that could see sequence 101 00:06:41,000 --> 00:06:45,000 because there's nothing else different. You know it's a regular 102 00:06:45,000 --> 00:06:48,000 backbone, and it's only four nucleotides. So, 103 00:06:48,000 --> 00:06:52,000 if you wanted to cut DNA in particular places, 104 00:06:52,000 --> 00:06:55,000 you had to have scissors that could see a sequence. 105 00:06:55,000 --> 00:06:59,000 And furthermore, you can see they couldn't just see 106 00:06:59,000 --> 00:07:02,000 the hydrogen bonding parts of A and T or G and C because those are 107 00:07:02,000 --> 00:07:06,000 stuck together there in the middle of the DNA. 108 00:07:06,000 --> 00:07:10,000 So, you'd need scissors that could somehow find a sequence and make a 109 00:07:10,000 --> 00:07:14,000 cut. And those were found. I'm going to tell you about those. 110 00:07:14,000 --> 00:07:18,000 They're called restriction enzymes. So that was part of the thing. The 111 00:07:18,000 --> 00:07:22,000 other thing was, imagine I could cut out this 112 00:07:22,000 --> 00:07:26,000 fragment. And I gave it to you, and I said, great. Now I've got it. 113 00:07:26,000 --> 00:07:30,000 Would you make me a lot of copies of this DNA? 114 00:07:30,000 --> 00:07:35,000 Could you do that? Let's say, you now know how to 115 00:07:35,000 --> 00:07:41,000 transform, the principle that we saw back with Avery. 116 00:07:41,000 --> 00:07:46,000 We could take naked DNA and put this fragment into the cell. 117 00:07:46,000 --> 00:07:52,000 Would it replicate? What do you think? Anybody remember? 118 00:07:52,000 --> 00:07:58,000 No? OK, we talked about some other languages, right? 119 00:07:58,000 --> 00:08:02,000 One of the things that's in the DNA is the genetic code with all the 120 00:08:02,000 --> 00:08:06,000 genes. And we can find the reading frame. But remember when we talked 121 00:08:06,000 --> 00:08:10,000 about an origin of replication. 
I said that, at least 122 00:08:10,000 --> 00:08:15,000 for E. coli, there's one origin. In eukaryotes the origins are 123 00:08:15,000 --> 00:08:19,000 spaced out along the DNA. And every time you have a round of 124 00:08:19,000 --> 00:08:23,000 replication, it starts at one of those origins and then goes. 125 00:08:23,000 --> 00:08:27,000 So the chance that this piece of DNA is going to have an origin 126 00:08:27,000 --> 00:08:32,000 is pretty small. So if I put it into an organism, 127 00:08:32,000 --> 00:08:36,000 it's going to sit there if you're lucky, or maybe get degraded because it's 128 00:08:36,000 --> 00:08:40,000 got [blunt ends?], or even if I made it into a circle 129 00:08:40,000 --> 00:08:44,000 it probably wouldn't replicate because it probably doesn't have the 130 00:08:44,000 --> 00:08:49,000 word in the DNA that says start a round of DNA replication. 131 00:08:49,000 --> 00:08:53,000 So, the other overarching principle of DNA replication is you somehow 132 00:08:53,000 --> 00:08:57,000 have to take the fragment of DNA that you're looking at, 133 00:08:57,000 --> 00:09:02,000 and you have to attach it to an origin of replication. 134 00:09:02,000 --> 00:09:06,000 Now, if you have an origin of replication and you have a fragment 135 00:09:06,000 --> 00:09:10,000 of DNA, and you put it in the cell, now you'll get a lot of copies of 136 00:09:10,000 --> 00:09:14,000 that piece of DNA. So that's what recombinant DNA is 137 00:09:14,000 --> 00:09:18,000 all about in a really, really simple form. I'm just going 138 00:09:18,000 --> 00:09:22,000 to take you now into increasing levels of detail. 139 00:09:22,000 --> 00:09:26,000 So, let me just give you a really broad view of 140 00:09:26,000 --> 00:09:30,000 this cloning, and then we'll start to dive in to some of the 141 00:09:30,000 --> 00:09:35,000 fancier techniques that have come out of this. 
142 00:09:35,000 --> 00:09:40,000 We'll talk about DNA sequencing, and PCR, and stuff in the next 143 00:09:40,000 --> 00:09:45,000 lecture. So the first principle here is to cut the DNA, 144 00:09:45,000 --> 00:09:50,000 and I know you may think this is sort of baby talk, 145 00:09:50,000 --> 00:09:55,000 but this is how I think. If you really think about this 146 00:09:55,000 --> 00:10:00,000 stuff, this is what it really is. You cut with sequence specific molecular 147 00:10:00,000 --> 00:10:06,000 scissors, and these have a rather odd name. 148 00:10:06,000 --> 00:10:10,000 They're called restriction enzymes. I don't know if any of you know why 149 00:10:10,000 --> 00:10:14,000 they're called restriction enzymes, although I'm sure that some of 150 00:10:14,000 --> 00:10:18,000 you have used them in your UROP to cut up pieces of DNA. 151 00:10:18,000 --> 00:10:22,000 But these are enzymes, as I'll tell you, 152 00:10:22,000 --> 00:10:26,000 that recognize a particular sequence. And they always cut at that 153 00:10:26,000 --> 00:10:30,000 sequence. And the value of that is you can reproducibly cut DNA at exactly 154 00:10:30,000 --> 00:10:34,000 the same spot, and the spots are specified by 155 00:10:34,000 --> 00:10:38,000 whatever sequence that particular pair of scissors knows how to read. 156 00:10:38,000 --> 00:10:42,000 Then the next thing we have to do is we need to join the piece of DNA to 157 00:10:42,000 --> 00:10:47,000 an origin of replication. So the thing that carries the 158 00:10:47,000 --> 00:10:51,000 origin of replication is called a vector. And usually, 159 00:10:51,000 --> 00:10:56,000 not always, these are circles. The ones that grow 160 00:10:56,000 --> 00:11:00,000 in E. coli, or in bacteria, are most of the time circles. 161 00:11:00,000 --> 00:11:05,000 They are the ones that are broadly used for most cloning. 162 00:11:05,000 --> 00:11:12,000 So, we'll talk about those. What makes a vector? 
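The idea that sequence-specific scissors always cut at the same spots can be illustrated with a short Python sketch. It looks at one strand only, written 5' to 3', and uses GAATTC (the site the lecture introduces later) with a cut one base into the site; the function names and the example sequence are invented for this illustration.

```python
def find_sites(dna, site):
    """Return the start index of every occurrence of the recognition site."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)  # allow overlapping matches
    return positions


def digest(dna, site, cut_offset):
    """Cut a linear strand at every site, cut_offset bases into the site."""
    fragments = []
    previous_cut = 0
    for position in find_sites(dna, site):
        cut = position + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
    fragments.append(dna[previous_cut:])
    return fragments


if __name__ == "__main__":
    dna = "AAGAATTCCCGAATTCTT"  # made-up sequence containing two sites
    print(find_sites(dna, "GAATTC"))
    print(digest(dna, "GAATTC", 1))
```

Running the digest again on the same sequence gives the same fragments, which is the reproducibility the lecture is describing; a random cutter would give a different set each time.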
What it has 163 00:11:12,000 --> 00:11:19,000 to have is an origin of DNA replication. They usually have 164 00:11:19,000 --> 00:11:26,000 something else. We could call it a selectable 165 00:11:26,000 --> 00:11:34,000 marker, something like a drug resistance. 166 00:11:34,000 --> 00:11:37,000 If any of you have done cloning in a UROP, usually it's something like a 167 00:11:37,000 --> 00:11:41,000 gene for making the cell ampicillin resistant, or tetracycline resistant, 168 00:11:41,000 --> 00:11:45,000 some antibiotic that would normally kill the cell. 169 00:11:45,000 --> 00:11:48,000 So you can tell whether that cell has that vector or 170 00:11:48,000 --> 00:11:52,000 not because the cell starts out ampicillin sensitive. 171 00:11:52,000 --> 00:11:56,000 When it acquires the vector that's replicating, it 172 00:11:56,000 --> 00:12:00,000 also acquires the gene that gives it the drug resistance. 173 00:12:00,000 --> 00:12:06,000 But if we're going to join a piece of 174 00:12:06,000 --> 00:12:12,000 DNA to that, we can't join into a circle without breaking it. 175 00:12:12,000 --> 00:12:18,000 So, we need to cut the vector at a unique site. And we would use a 176 00:12:18,000 --> 00:12:24,000 restriction enzyme for that. And you can also see in designing a 177 00:12:24,000 --> 00:12:30,000 vector, you'd want something that only has one site. 178 00:12:30,000 --> 00:12:35,000 So, what we would have achieved from this conceptually is we've now got 179 00:12:35,000 --> 00:12:41,000 this, this is the vector, its origin of replication here, 180 00:12:41,000 --> 00:12:46,000 and let's say ampicillin resistance, for example, as a selectable marker, 181 00:12:46,000 --> 00:12:52,000 the gene for that could be encoded here. And we have the fragment of, 182 00:12:52,000 --> 00:12:58,000 if you want to put that down, just put it down on the floor, 183 00:12:58,000 --> 00:13:04,000 I think. We've made the point at this stage. Thanks. 
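The requirement that a circular vector be cut at a unique site can be checked with a small Python sketch. It treats the plasmid as one strand written as a string whose last base joins back to the first, and it assumes a palindromic site so that searching one strand is enough; `count_sites_circular`, `linearize`, and the example sequences are hypothetical.

```python
def site_positions_circular(plasmid, site):
    """Start indices of the site on a circular strand, including sites
    that span the point where the string wraps around."""
    extended = plasmid + plasmid[: len(site) - 1]
    return [i for i in range(len(plasmid)) if extended.startswith(site, i)]


def count_sites_circular(plasmid, site):
    """How many times the site occurs on the circle."""
    return len(site_positions_circular(plasmid, site))


def linearize(plasmid, site, cut_offset):
    """Open the circle at its single site; refuse if the site is not unique."""
    positions = site_positions_circular(plasmid, site)
    if len(positions) != 1:
        raise ValueError(f"expected exactly one site, found {len(positions)}")
    cut = (positions[0] + cut_offset) % len(plasmid)
    return plasmid[cut:] + plasmid[:cut]


if __name__ == "__main__":
    vector = "ACGTGAATTCACGT"  # made-up circular sequence with one site
    print(count_sites_circular(vector, "GAATTC"))
    print(linearize(vector, "GAATTC", 1))
```

With one site the circle opens into a single linear molecule that can accept an insert; with two or more sites the vector would fall into pieces, which is why the sketch raises an error in that case.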
184 00:13:04,000 --> 00:13:10,000 What one has to do is to join this piece of DNA to that. 185 00:13:10,000 --> 00:13:17,000 And we'll go into the molecular details of this. 186 00:13:17,000 --> 00:13:23,000 But, we'll join the fragment to the vector, and actually this was 187 00:13:23,000 --> 00:13:30,000 something that was already in molecular biologists' toolkits, 188 00:13:30,000 --> 00:13:35,000 from having studied DNA replication. That's DNA ligase. 189 00:13:35,000 --> 00:13:38,000 When we finished an Okazaki fragment, we had to seal [UNINTELLIGIBLE] and 190 00:13:38,000 --> 00:13:42,000 the enzyme that did that was an enzyme called DNA ligase. 191 00:13:42,000 --> 00:13:45,000 So, molecular biologists basically had the scotch tape or the glue to 192 00:13:45,000 --> 00:13:48,000 join stuff back together. What they were missing for many, 193 00:13:48,000 --> 00:13:51,000 many years were the sequence specific molecular scissors. 194 00:13:51,000 --> 00:13:55,000 So at this stage, if we were doing the recombinant DNA, 195 00:13:55,000 --> 00:13:58,000 we now have a vector. We now have a piece of DNA joined 196 00:13:58,000 --> 00:14:03,000 to it. In fact, we probably have a whole 197 00:14:03,000 --> 00:14:09,000 other mess of things that happen along the way. 198 00:14:09,000 --> 00:14:15,000 But at the moment, they're in a test tube. 199 00:14:15,000 --> 00:14:22,000 So, if we want to have this thing grow, what do we have to do next? 200 00:14:22,000 --> 00:14:28,000 We are going to have to get the DNA from outside the cell to inside the 201 00:14:28,000 --> 00:14:35,000 cell. That's the word; we need to transform the DNA into a cell. 
Again, the word transform, that goes back to those 203 00:14:38,000 --> 00:14:42,000 transformation experiments with Streptococcus pneumoniae going from 204 00:14:42,000 --> 00:14:45,000 smooth to rough, and you were taking stuff from the 205 00:14:45,000 --> 00:14:49,000 cell that transformed them from rough to smooth, 206 00:14:49,000 --> 00:14:53,000 whatever, that's where the word came from, but we now know it's getting 207 00:14:53,000 --> 00:14:56,000 naked DNA inside the cell where it can be replicated. 208 00:14:56,000 --> 00:15:00,000 And then the next thing we need to know is, what cells have acquired 209 00:15:00,000 --> 00:15:04,000 this vector, or that at least have the vector. 210 00:15:04,000 --> 00:15:09,000 We'll settle for that in the beginning, and to do that, 211 00:15:09,000 --> 00:15:14,000 you need to select for the marker on the vector. In the case of this one, 212 00:15:14,000 --> 00:15:19,000 we would start with a strain that is killed by ampicillin, 213 00:15:19,000 --> 00:15:25,000 and then we just plate it out and ask for guys that are ampicillin 214 00:15:25,000 --> 00:15:29,000 resistant. And, you can see that there is 215 00:15:29,000 --> 00:15:33,000 another class of problem because if we had uncut vector, 216 00:15:33,000 --> 00:15:36,000 and there would probably be, for sure, some of that in our mix, 217 00:15:36,000 --> 00:15:40,000 that would make the cells ampicillin resistant. And if we had an insert, 218 00:15:40,000 --> 00:15:43,000 it would also be ampicillin resistant. So, if we really wanted to get into 219 00:15:43,000 --> 00:15:47,000 this, we'd have to do some more work to sort out what's in there. 220 00:15:47,000 --> 00:15:51,000 But that's the basic stuff. I suspect most of you have known this 221 00:15:51,000 --> 00:15:54,000 practically since kindergarten. 
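The selection logic here, and the "other class of problem" with uncut vector, can be written out as a toy model in Python. The `Cell` class and its two fields are assumptions made for the sketch; real transformations involve many more outcomes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_vector: bool  # carries the vector, and so its ampicillin resistance gene
    has_insert: bool  # the vector also carries a cloned fragment


def grows_on_ampicillin(cell):
    """Only the resistance gene on the vector matters to the antibiotic."""
    return cell.has_vector


if __name__ == "__main__":
    population = [
        Cell(has_vector=False, has_insert=False),  # took up no DNA
        Cell(has_vector=True, has_insert=False),   # uncut or re-closed vector
        Cell(has_vector=True, has_insert=True),    # the recombinant we want
    ]
    colonies = [cell for cell in population if grows_on_ampicillin(cell)]
    for cell in colonies:
        print(cell)
```

Both vector-bearing cells survive, so the antibiotic alone cannot tell an empty vector from one with an insert; that is the extra sorting-out work the lecture mentions.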
But that's the overall framework 222 00:15:54,000 --> 00:15:58,000 into which, now, I'm going to start layering 223 00:15:58,000 --> 00:16:01,000 different pieces of detail. And the next part, 224 00:16:01,000 --> 00:16:05,000 again, some of you may know. I don't think it will be a totally 225 00:16:05,000 --> 00:16:09,000 foreign concept. You are probably familiar with this, 226 00:16:09,000 --> 00:16:13,000 but what are these restriction enzymes? The actual term is 227 00:16:13,000 --> 00:16:16,000 restriction endonuclease. They are usually called 228 00:16:16,000 --> 00:16:20,000 restriction enzymes in lab parlance. A nuclease is something 229 00:16:20,000 --> 00:16:24,000 that cuts nucleic acid, and an endonuclease is one that doesn't 230 00:16:24,000 --> 00:16:28,000 need a free end. So, it can cut in the middle of the 231 00:16:28,000 --> 00:16:32,000 sequence instead of nibbling at the end. That would be 232 00:16:32,000 --> 00:16:37,000 an exonuclease. So, these things have names that 233 00:16:37,000 --> 00:16:43,000 tend to be something like EcoRI, which has something to do with where 234 00:16:43,000 --> 00:16:50,000 they are derived from. And a typical one, one of the very 235 00:16:50,000 --> 00:16:56,000 first ones, which is still in really wide use, is EcoRI. 236 00:16:56,000 --> 00:17:03,000 And this recognizes the sequence G A A T T C. 237 00:17:03,000 --> 00:17:07,000 Now, you'll notice that if you read the sequence in this way it's the 238 00:17:07,000 --> 00:17:11,000 same sequence when you read it on the other strand. 239 00:17:11,000 --> 00:17:15,000 It's called a palindrome but be careful because palindromes in 240 00:17:15,000 --> 00:17:19,000 English are words that read the same from the front and from the back. 241 00:17:19,000 --> 00:17:23,000 With an English letter, 242 00:17:23,000 --> 00:17:28,000 it doesn't matter whether it's an A here or an A there. 
243 00:17:28,000 --> 00:17:32,000 But you guys know something about DNA structure. 244 00:17:32,000 --> 00:17:36,000 There's a five prime to three prime polarity. So, 245 00:17:36,000 --> 00:17:40,000 reading this way doesn't look at all the same. It's totally different. 246 00:17:40,000 --> 00:17:45,000 But reading on this strand, we say that's five prime to three prime. 247 00:17:45,000 --> 00:17:49,000 The thing that's identical is the reciprocal sequence on the other 248 00:17:49,000 --> 00:17:53,000 side. So there is this, you see G A A and G A A but it's not 249 00:17:53,000 --> 00:17:58,000 like the English word palindrome, so don't get yourself mixed up about that. 250 00:17:58,000 --> 00:18:03,000 Anyway, what this enzyme then does is it 251 00:18:03,000 --> 00:18:08,000 cuts to the side here. It cuts symmetrically. 252 00:18:08,000 --> 00:18:14,000 And what it generates then is a G with a three prime hydroxyl. 253 00:18:14,000 --> 00:18:19,000 Remember the ribose? If we have, say, an A there, this is the three 254 00:18:19,000 --> 00:18:25,000 prime position, and that's the five prime position 255 00:18:25,000 --> 00:18:30,000 in the sugar. So, it leaves a three prime hydroxyl, 256 00:18:30,000 --> 00:18:36,000 and it also then leaves a five prime phosphate. 257 00:18:36,000 --> 00:18:45,000 So, we'd have A A T T C here, and then on the other side, we would 258 00:18:45,000 --> 00:18:54,000 get the reciprocal thing. So, we'd have G with a three prime 259 00:18:54,000 --> 00:19:03,000 hydroxyl, and then over here A A T T C like that. 260 00:19:03,000 --> 00:19:07,000 So now we've got a break here. We can pull those apart. But one 261 00:19:07,000 --> 00:19:12,000 of the nice things you can see right from this is that we're generating 262 00:19:12,000 --> 00:19:17,000 five prime single stranded ends, and this one is the sequence A A T T 263 00:19:17,000 --> 00:19:22,000 C. 
This is A A T T C here, and these guys, if they could get 264 00:19:22,000 --> 00:19:27,000 together and line up as they would here, they'd be able to 265 00:19:27,000 --> 00:19:32,000 form hydrogen bonds. So, if you take an enzyme like 266 00:19:32,000 --> 00:19:36,000 EcoRI, and we took, let's say, a circle that had a 267 00:19:36,000 --> 00:19:40,000 single EcoRI site, G A A T T C, if we cut it with the 268 00:19:40,000 --> 00:19:44,000 restriction enzyme we would make [nicks?]. And if we kept it cold, 269 00:19:44,000 --> 00:19:48,000 all that we'd have is DNA with nicks. And if we warm things up a little 270 00:19:48,000 --> 00:19:52,000 bit, there's only four base pairs that are holding that together. 271 00:19:52,000 --> 00:19:56,000 So, the thing would linearize and just flop around in the breeze. 272 00:19:56,000 --> 00:20:00,000 If we cooled it slowly, the thermodynamically most favorable 273 00:20:00,000 --> 00:20:04,000 state, the lowest energy state, would be with those ends coming back 274 00:20:04,000 --> 00:20:08,000 together. So, we could then add DNA ligase. 275 00:20:08,000 --> 00:20:12,000 If we lined these up and added DNA ligase, we could reverse the process 276 00:20:12,000 --> 00:20:16,000 and go back and forth, EcoRI to cut it, DNA ligase to ligate it. 277 00:20:16,000 --> 00:20:23,000 And then, the beauty of recombinant DNA is this rejoining part doesn't 278 00:20:23,000 --> 00:20:30,000 see what's out here or what's out there. 279 00:20:30,000 --> 00:20:34,000 All it sees is the little ends that are generated at an EcoRI site. 280 00:20:34,000 --> 00:20:38,000 So if I take some of my DNA, and I cut it up, 281 00:20:38,000 --> 00:20:42,000 I'll get a zillion EcoRI fragments, but they'll all have the same little 282 00:20:42,000 --> 00:20:47,000 overhanging bit that's complementary to the vector. 
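The difference between a DNA palindrome and an English one, and the reason every end cut by this enzyme can pair with every other, can be worked through in a few lines of Python. This is a sketch on strings only, assuming the cut falls one base into G A A T T C on each strand as described above; the function names are invented for the illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """The partner strand, also read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq == reverse_complement(seq)


def overhang(site, cut_offset):
    """Single-stranded 5' overhang left by a symmetric staggered cut."""
    return site[cut_offset: len(site) - cut_offset]


def can_anneal(end_a, end_b):
    """Two overhangs pair if one is the reverse complement of the other."""
    return end_a == reverse_complement(end_b)


if __name__ == "__main__":
    site = "GAATTC"
    print(is_dna_palindrome(site))    # same sequence on the other strand
    print(is_dna_palindrome("GAAG"))  # an English palindrome, not a DNA one
    sticky = overhang(site, 1)
    print(sticky, can_anneal(sticky, sticky))
```

Because the overhang is its own reverse complement, an end from the vector and an end from any other fragment cut with the same enzyme can line up, and the joining step never needs to see the rest of the molecule.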
283 00:20:47,000 --> 00:20:51,000 So if I take a vector cut with EcoRI, and I take some of my own 284 00:20:51,000 --> 00:20:55,000 DNA and I mix them, I can get a little fragment 285 00:20:55,000 --> 00:20:59,000 to go in between the vector ends, and it does exactly that joining that I was 286 00:20:59,000 --> 00:21:04,000 diagramming right here. So again, it was the discovery of 287 00:21:04,000 --> 00:21:10,000 these restriction enzymes that made possible almost all the stuff that's 288 00:21:10,000 --> 00:21:16,000 happened in biology since 1975. The development of restriction 289 00:21:16,000 --> 00:21:21,000 enzymes was essentially then; I was a postdoc at Berkeley at that 290 00:21:21,000 --> 00:21:27,000 point, and the labs of Stan Cohen at Stanford, 291 00:21:27,000 --> 00:21:33,000 Herb Boyer at UCSF, and a couple of others around the country were 292 00:21:33,000 --> 00:21:37,000 working on this. They were almost all labs that had 293 00:21:37,000 --> 00:21:41,000 worked on bacterial plasmids. Plasmids are little circles of DNA, 294 00:21:41,000 --> 00:21:45,000 so the labs that started were ones that had been busy studying little 295 00:21:45,000 --> 00:21:49,000 circles of DNA that usually carry drug resistances between cells. 296 00:21:49,000 --> 00:21:53,000 And so that was happening while I was a postdoc. 297 00:21:53,000 --> 00:21:57,000 And when I got to MIT in '76 the technology was just beginning. 298 00:21:57,000 --> 00:22:01,000 Mine was one of the first labs trying to cut pieces of DNA and join 299 00:22:01,000 --> 00:22:04,000 them back together. So, it's a pretty recent development. 300 00:22:04,000 --> 00:22:08,000 At that point, DNA sequencing hadn't been invented. 301 00:22:08,000 --> 00:22:12,000 The idea that you could pull out a piece of DNA and do something with 302 00:22:12,000 --> 00:22:16,000 it or produce a protein was just a thought. It didn't exist. 
303 00:22:16,000 --> 00:22:19,000 So it's hard to overemphasize how critical the discovery of these 304 00:22:19,000 --> 00:22:23,000 restriction enzymes was. Now, I just want to tell you where 305 00:22:23,000 --> 00:22:27,000 they came from, or how people found them. 306 00:22:27,000 --> 00:22:31,000 And I'll try and do this quickly because I know some of you get 307 00:22:31,000 --> 00:22:35,000 impatient with history. But this is really important because 308 00:22:35,000 --> 00:22:39,000 it's very easy to make fun of basic research. You can ridicule anything 309 00:22:39,000 --> 00:22:43,000 pretty easily, and you might just ask yourself, as I'm 310 00:22:43,000 --> 00:22:47,000 telling you the story, what if somebody proposed doing this? 311 00:22:47,000 --> 00:22:51,000 I'm going to tell you the experiment that basically is the 312 00:22:51,000 --> 00:22:55,000 basis of the biotech industry, and would you have been smart enough 313 00:22:55,000 --> 00:22:59,000 to recognize that? It was the discovery of a phenomenon called 314 00:22:59,000 --> 00:23:03,000 restriction, restriction in bacteriophage growth on bacteria. 315 00:23:03,000 --> 00:23:06,000 And here are actually a couple of EMs of these little 316 00:23:06,000 --> 00:23:10,000 plasmids. This one is an electron micrograph. 317 00:23:10,000 --> 00:23:14,000 These little circles have been shadowed. And this is actually 318 00:23:14,000 --> 00:23:18,000 artificially colored, but that was the kind of plasmid 319 00:23:18,000 --> 00:23:22,000 that people were cutting up. So, as I said, getting 320 00:23:22,000 --> 00:23:26,000 through this DNA, this stuff is what's made possible 321 00:23:26,000 --> 00:23:30,000 the sequencing at the Whitehead Genome Center and stuff that I'll 322 00:23:30,000 --> 00:23:33,000 tell you about that is going on. I didn't really set this one up. 323 00:23:33,000 --> 00:23:37,000 But that's Eric Lander who teaches 7.01 in the fall. 
324 00:23:37,000 --> 00:23:40,000 I showed you a picture from that DNA 50th. Well, they had a banquet at 325 00:23:40,000 --> 00:23:44,000 the end of it, and I was there. 326 00:23:44,000 --> 00:23:47,000 This is Svante Pääbo from Europe who is sequencing the 327 00:23:47,000 --> 00:23:51,000 chimp genome. And that's Francis Collins who is head of the entire 328 00:23:51,000 --> 00:23:54,000 Human Genome Project. This is Evelyn Witkin, 329 00:23:54,000 --> 00:23:58,000 who was a big discoverer of early DNA repair events. 330 00:23:58,000 --> 00:24:02,000 And I put that one in because it was sort of interesting. 331 00:24:02,000 --> 00:24:05,000 Eric, and Svante, and Francis were 332 00:24:05,000 --> 00:24:08,000 talking about what would happen when they knew the sequence of the chimp 333 00:24:08,000 --> 00:24:11,000 genome, which wasn't done. And there was an advertisement, 334 00:24:11,000 --> 00:24:14,000 a poster advertising Jim Watson's latest book. And they ripped that 335 00:24:14,000 --> 00:24:17,000 in half, and were writing notes on the back all the way through dinner. 336 00:24:17,000 --> 00:24:20,000 So if you want to see what scientists on the cutting edge, 337 00:24:20,000 --> 00:24:24,000 including someone who teaches 7.01 in the fall, look like when they are 338 00:24:24,000 --> 00:24:27,000 not teaching 7.01, there is a picture. 339 00:24:27,000 --> 00:24:30,000 So anyway, the discovery of restriction goes back to Salvador 340 00:24:30,000 --> 00:24:34,000 Luria, who I've mentioned. He was a member of the biology 341 00:24:34,000 --> 00:24:38,000 department, and one of our Nobel Prize winners. 342 00:24:38,000 --> 00:24:42,000 He started the cancer center. He also trained Jim Watson, 343 00:24:42,000 --> 00:24:46,000 as when I showed you that picture. This is Salvador standing over here. 
344 00:24:46,000 --> 00:24:50,000 Another thing that Salvador did, he was a Nobel Prize winner but he 345 00:24:50,000 --> 00:24:54,000 taught introductory biology. So I am basically following in the 346 00:24:54,000 --> 00:24:58,000 footsteps of Salvador. He wrote a book called, 347 00:24:58,000 --> 00:25:02,000 even though he was a Nobel Prize winner, a book called 36 lectures in 348 00:25:02,000 --> 00:25:06,000 introductory biology. And at some universities, 349 00:25:06,000 --> 00:25:10,000 the intro to biology is taught by whoever is at the bottom of the food 350 00:25:10,000 --> 00:25:15,000 chain. The most junior professor gets stuck with intro to biology. 351 00:25:15,000 --> 00:25:19,000 And here it's the other way. I mean, that you're getting Eric and Bob 352 00:25:19,000 --> 00:25:24,000 Weinberg, for example, to teach in the fall tells you that. 353 00:25:24,000 --> 00:25:28,000 And really where that comes from is the fact that Salvador Luria had 354 00:25:28,000 --> 00:25:32,000 such an interest in replication. So he trained Jim Watson, 355 00:25:32,000 --> 00:25:35,000 started the cancer center here, and he also worked out this 356 00:25:35,000 --> 00:25:39,000 phenomenon of restriction. And he did this working with 357 00:25:39,000 --> 00:25:42,000 bacteriophage. And I know a [couple of you wrote?] 358 00:25:42,000 --> 00:25:45,000 you didn't like to see old guys on porches. So I got freaked out, 359 00:25:45,000 --> 00:25:48,000 and I took this next picture out for this morning. Oh, 360 00:25:48,000 --> 00:25:51,000 I've got to show it anyway for two reasons. This is Salvador sitting 361 00:25:51,000 --> 00:25:55,000 on a porch at Cold Spring Harbor with Max Delbrück who started, 362 00:25:55,000 --> 00:25:58,000 really, much of the work on bacteriophage that gave us the 363 00:25:58,000 --> 00:26:01,000 underpinnings of microbiology. 
And I put it in partly because 364 00:26:01,000 --> 00:26:04,000 it shows the informality of the molecular biology culture which 365 00:26:04,000 --> 00:26:08,000 persists to this day, and also because Salvador had such 366 00:26:08,000 --> 00:26:11,000 an impish sense of humor. You would have really enjoyed it had 367 00:26:11,000 --> 00:26:14,000 he been teaching this course. Anyway, Salvador was studying this 368 00:26:14,000 --> 00:26:18,000 bacteriophage. Remember we talked about it? 369 00:26:18,000 --> 00:26:21,000 And they basically, [like a syringe?], inject their DNA into the 370 00:26:21,000 --> 00:26:24,000 cell. There's an electron micrograph. The DNA is up top there. 371 00:26:24,000 --> 00:26:28,000 It goes in, and then the DNA takes over the cell, 372 00:26:28,000 --> 00:26:32,000 reprograms it, and makes baby phage. And I showed you how we make plaques. 373 00:26:32,000 --> 00:26:38,000 So that was what Salvador was studying. And it's a little like 374 00:26:38,000 --> 00:26:44,000 what we were talking about with Mendel. He didn't have very many 375 00:26:44,000 --> 00:26:49,000 techniques available to him at the time. He couldn't sequence DNA. 376 00:26:49,000 --> 00:26:55,000 He couldn't do a lot of things. But he could plate phage 377 00:26:55,000 --> 00:27:01,000 and count, and things like that. And what Salvador was looking at, 378 00:27:01,000 --> 00:27:07,000 he had a bacteriophage, and he had two strains of bacteria. 379 00:27:07,000 --> 00:27:13,000 I'll call them A and B. OK, here comes the experiment that 380 00:27:13,000 --> 00:27:20,000 founded the biotech industry. You ready? You going to fund me? 381 00:27:20,000 --> 00:27:27,000 All right, so what I propose doing is I'm going to grow the phage on 382 00:27:27,000 --> 00:27:34,000 strain A, and now I'm going to plate on strain A and strain B. 383 00:27:34,000 --> 00:27:38,000 I laid awake all last week thinking of this experiment. 
384 00:27:38,000 --> 00:27:43,000 So what did I get? I got a lot of plaques on strain A, 385 00:27:43,000 --> 00:27:48,000 probably something like 10^9 or 10^10 per mL because that's what this 386 00:27:48,000 --> 00:27:52,000 phage lysate usually looks like once you've grown them up. 387 00:27:52,000 --> 00:27:57,000 And if I plate them over on strain B, no phage, maybe an 388 00:27:57,000 --> 00:28:02,000 occasional phage. So, obviously the bacteriophage can 389 00:28:02,000 --> 00:28:08,000 grow fine on strain A. It can't grow on strain B most of 390 00:28:08,000 --> 00:28:13,000 the time, but some variant has managed to figure out how to grow on 391 00:28:13,000 --> 00:28:18,000 strain B. Given our foray into genetics, I think many of you would 392 00:28:18,000 --> 00:28:24,000 think, probably as I suspect Salvador did at the time, 393 00:28:24,000 --> 00:28:29,000 the thing's mutated. It's learned. It's made some change in its 394 00:28:29,000 --> 00:28:34,000 genome that's allowed it to grow on strain B. So, 395 00:28:34,000 --> 00:28:40,000 I wonder, if it learned to grow on strain B, can it still 396 00:28:40,000 --> 00:28:47,000 grow on strain A? So, basically what he took was the 397 00:28:47,000 --> 00:28:57,000 phage from that experiment, and then he plated them on strain A 398 00:28:57,000 --> 00:29:04,000 and strain B. And, well, as you might guess, 399 00:29:04,000 --> 00:29:10,000 since it was growing on strain B, lots of plaques, and there were lots 400 00:29:10,000 --> 00:29:15,000 of plaques over here. OK, so it didn't forget how to grow 401 00:29:15,000 --> 00:29:20,000 on strain A. So, better check over here, 402 00:29:20,000 --> 00:29:26,000 too, need a control experiment, so take this guy, plate it out, 403 00:29:26,000 --> 00:29:31,000 strain A, strain B, some plaques over here. 404 00:29:31,000 --> 00:29:37,000 That's not a surprise, growing on strain A. 405 00:29:37,000 --> 00:29:42,000 We are back to where we started from. 
It doesn't sound like a mutant, 406 00:29:42,000 --> 00:29:47,000 does it? And if it was a mutant, everybody should have been the same. 407 00:29:47,000 --> 00:29:53,000 Instead, when you grew a phage on strain A, it didn't have the ability 408 00:29:53,000 --> 00:29:58,000 to grow on strain B. But if you gave it a chance to grow 409 00:29:58,000 --> 00:30:04,000 on strain B, most of them wouldn't make it. 410 00:30:04,000 --> 00:30:09,000 But if it ever did, it had now acquired the ability to 411 00:30:09,000 --> 00:30:15,000 grow on strain B. So it could still grow on both. 412 00:30:15,000 --> 00:30:21,000 But if you take something that had then been 413 00:30:21,000 --> 00:30:27,000 growing on strain A, it lost it. And so, the idea there 414 00:30:27,000 --> 00:30:33,000 was then that it wasn't a mutation. Something was happening in 415 00:30:33,000 --> 00:30:39,000 strain B that enabled it to grow on strain B. 416 00:30:39,000 --> 00:30:43,000 And if it ever got away from that environment, it 417 00:30:43,000 --> 00:30:48,000 lost it. So, the phenomenon was called restriction. 418 00:30:48,000 --> 00:30:53,000 It wasn't a mutation. It was something else. 419 00:30:53,000 --> 00:30:58,000 And it turned out, then, that restriction was due to an 420 00:30:58,000 --> 00:31:04,000 enzyme. An example of this kind of thing, 421 00:31:04,000 --> 00:31:11,000 then, would be this EcoRI activity that's able to cut at a very 422 00:31:11,000 --> 00:31:18,000 specific sequence. Now, if you are going to have a set 423 00:31:18,000 --> 00:31:25,000 of molecular scissors inside of you that could cut GAATTC sequences, 424 00:31:25,000 --> 00:31:32,000 you'd have a problem unless you did something else because every one of 425 00:31:32,000 --> 00:31:39,000 your GAATTC sequences would be cut by the restriction enzymes. 
426 00:31:39,000 --> 00:31:46,000 So what the cells that have a restriction enzyme have 427 00:31:46,000 --> 00:31:54,000 is a modification enzyme that recognizes the same sequence, 428 00:31:54,000 --> 00:32:02,000 and then modifies it in some way that makes it resistant to the 429 00:32:02,000 --> 00:32:09,000 restriction enzyme. And in the case of EcoRI, 430 00:32:09,000 --> 00:32:15,000 it puts a methyl group on this A. You might have thought that that 431 00:32:15,000 --> 00:32:22,000 would interfere with base pairing, but it doesn't because adenine looks 432 00:32:22,000 --> 00:32:29,000 like this. These are the guys that do the base pairing, 433 00:32:29,000 --> 00:32:35,000 if you look back, and you'll see that you could put a methyl 434 00:32:35,000 --> 00:32:41,000 group in there. It wouldn't interfere with the base 435 00:32:41,000 --> 00:32:45,000 pairing, but would allow this to go. So, it was the discovery of this 436 00:32:45,000 --> 00:32:49,000 phenomenon of restriction, that a bacteriophage grew on one strain, 437 00:32:49,000 --> 00:32:53,000 not on another. It could learn to grow on the other strain. 438 00:32:53,000 --> 00:32:57,000 It could lose that acquisition. It was that phenomenon. People 439 00:32:57,000 --> 00:33:01,000 didn't have any reason to understand it other than that it was an interesting problem 440 00:33:01,000 --> 00:33:05,000 in biology. Once they understood the basis of it, 441 00:33:05,000 --> 00:33:09,000 another whole world opened up because you could see from basic 442 00:33:09,000 --> 00:33:13,000 principles, now I could cut any piece of DNA. I could generate 443 00:33:13,000 --> 00:33:16,000 these little overhanging sticky ends. I could take a plasmid. 444 00:33:16,000 --> 00:33:20,000 People knew about those. I could find one that only has one 445 00:33:20,000 --> 00:33:24,000 restriction site. I could stick things in. 
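The restriction and modification idea just described can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the enzymes were characterized: it assumes the six-base EcoRI site GAATTC with the cut placed after the G, it looks at one strand only, and it treats methylation as a per-site yes/no flag. The names `find_sites`, `digest`, and `phage_dna` are made up for illustration.

```python
# Toy model of restriction plus modification (illustrative only).
SITE = "GAATTC"   # assumed EcoRI recognition sequence
CUT_OFFSET = 1    # cut falls between G and AATTC on the strand shown


def find_sites(seq, site=SITE):
    """Return the 0-based start position of every occurrence of site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, methylated=frozenset()):
    """Cut seq at every site that is not methylated; return the fragments."""
    cuts = [p + CUT_OFFSET for p in find_sites(seq) if p not in methylated]
    fragments = []
    previous = 0
    for cut in cuts:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


# Hypothetical phage DNA carrying two sites.
phage_dna = "TTGAATTCAGGCTAGAATTCCA"

# Phage grown on a strain without the modification enzyme: every site is cut.
unprotected = digest(phage_dna)

# Phage grown on the restricting strain: its sites were methylated, so it survives.
protected = digest(phage_dna, methylated=set(find_sites(phage_dna)))

print(unprotected)  # three fragments
print(protected)    # one intact molecule
```

The point of the sketch is that protection is a property of where the DNA was last copied, not a change in its sequence, which is why the phage could gain and lose the ability to grow on strain B without mutating.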
446 00:33:24,000 --> 00:33:28,000 Now I've attached an origin of replication to each of those pieces. 447 00:33:28,000 --> 00:33:32,000 And I'm in business. I can now, for the first time ever, 448 00:33:32,000 --> 00:33:38,000 take a particular piece of DNA and make as many copies as I want. 449 00:33:38,000 --> 00:33:44,000 And that was an absolute transformation to the way people 450 00:33:44,000 --> 00:33:50,000 were able to think about biology. So I'm going to just kind of give 451 00:33:50,000 --> 00:33:56,000 you an idea of how people would start. So the way people began and 452 00:33:56,000 --> 00:34:02,000 still begin most things is, the usual term is, 453 00:34:02,000 --> 00:34:08,000 constructing a recombinant DNA library. 454 00:34:08,000 --> 00:34:17,000 And there are a variety of different ways of doing this. 455 00:34:17,000 --> 00:34:26,000 But the principle is the same. We'd take the DNA from whatever 456 00:34:26,000 --> 00:34:36,000 organism you're interested in studying. 457 00:34:36,000 --> 00:34:43,000 And we cut with some restriction enzyme. And this restriction enzyme 458 00:34:43,000 --> 00:34:51,000 will cut wherever there happen to be sites. They might be close 459 00:34:51,000 --> 00:34:59,000 together. They might be far apart. But whatever, it will still generate some 460 00:34:59,000 --> 00:35:05,000 characteristic set of fragments. And we'll now have, 461 00:35:05,000 --> 00:35:09,000 in this case, fragment number one, fragment number two, fragment number 462 00:35:09,000 --> 00:35:13,000 three, fragment number four, and so on. And if it's my DNA, 463 00:35:13,000 --> 00:35:18,000 there's a lot of fragments. And of course, they're all mixed up. 464 00:35:18,000 --> 00:35:22,000 I can't tell where any of them are. They're just all mixed together in 465 00:35:22,000 --> 00:35:26,000 the test tube. Then we'll take that vector that 466 00:35:26,000 --> 00:35:31,000 we've opened. 
And now, we'll mix all of these 467 00:35:31,000 --> 00:35:37,000 fragments together with this vector. And then we join it just the way 468 00:35:37,000 --> 00:35:43,000 [that it's cut?]. And now what we'll get 469 00:35:43,000 --> 00:35:49,000 is a collection of plasmids that have different inserts. 470 00:35:49,000 --> 00:35:55,000 So, one of the plasmids will have that fragment number one. 471 00:35:55,000 --> 00:36:01,000 Another one of them will have fragment number two, number 472 00:36:01,000 --> 00:36:06,000 three, and so on. Then this whole thing is what's 473 00:36:06,000 --> 00:36:11,000 known as a library. You can see if it's DNA from me, 474 00:36:11,000 --> 00:36:17,000 there were 3 billion base pairs to start. Given the human DNA, 475 00:36:17,000 --> 00:36:22,000 GAATTC sequences are pretty common. You can calculate the frequency 476 00:36:22,000 --> 00:36:27,000 yourself for how many sites on average there would be for a 477 00:36:27,000 --> 00:36:32,000 restriction enzyme within a piece of DNA and figure out roughly how many 478 00:36:32,000 --> 00:36:38,000 fragments there would be in the library of human DNA. 479 00:36:38,000 --> 00:36:42,000 So, we are partway there. We can now make a library. 480 00:36:42,000 --> 00:36:46,000 We can make it from bacterial DNA. We can make it from human DNA. But 481 00:36:46,000 --> 00:36:51,000 the next thing that people had to learn how to do was to figure out 482 00:36:51,000 --> 00:36:55,000 how to find a particular fragment that had the gene that you 483 00:36:55,000 --> 00:36:59,000 are interested in. And there's a whole variety of 484 00:36:59,000 --> 00:37:02,000 things. I mean, ultimately today since the human 485 00:37:02,000 --> 00:37:06,000 genome is sequenced, you go on a computer and type and 486 00:37:06,000 --> 00:37:09,000 you find it because the sequence is all known. But the only reason we 487 00:37:09,000 --> 00:37:12,000 can do that is because of all the work that was done in between. 
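The back-of-the-envelope calculation suggested above can be written out as a short Python sketch. It rests on simplifying assumptions that are not from the lecture: all four bases are equally frequent and independent of their neighbors, and the enzyme recognizes a six-base site. Real genomes deviate from both, so treat the result as an order-of-magnitude estimate only.

```python
# Rough estimate of how many fragments a six-base cutter makes from a genome.
GENOME_SIZE_BP = 3_000_000_000  # roughly 3 billion base pairs, as in the lecture
SITE_LENGTH = 6                 # assumed length of the recognition site


def expected_spacing(site_length):
    """Average distance between sites if each base is 1 of 4, independently."""
    return 4 ** site_length


def expected_fragments(genome_size_bp, site_length):
    """Approximate number of fragments: genome size over average spacing."""
    return genome_size_bp / expected_spacing(site_length)


spacing = expected_spacing(SITE_LENGTH)
fragments = expected_fragments(GENOME_SIZE_BP, SITE_LENGTH)

print(f"one site about every {spacing} bp")
print(f"about {fragments:,.0f} fragments in the library")
```

Under these assumptions a site turns up about once every 4,096 base pairs, so a library of human DNA cut this way holds on the order of several hundred thousand different fragments.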
488 00:37:12,000 --> 00:37:15,000 So, I'll give you several ways of doing this. But one of the ways I 489 00:37:15,000 --> 00:37:18,000 think you can see very easily, and it's actually going back to the 490 00:37:18,000 --> 00:37:22,000 term complementation. Remember complementation? 491 00:37:22,000 --> 00:37:25,000 We had something that was mutant, and then we'd put in a wild type 492 00:37:25,000 --> 00:37:29,000 gene, and fixed it up again. So, for example, 493 00:37:29,000 --> 00:37:33,000 suppose I was studying histidine biosynthesis in E. 494 00:37:33,000 --> 00:37:38,000 coli, and I wanted to find the gene that encoded the enzyme that I had 495 00:37:38,000 --> 00:37:43,000 just disabled in my histidine minus mutant. So, if I have a 496 00:37:43,000 --> 00:37:47,000 his auxotroph, I'll call it the hisG 497 00:37:47,000 --> 00:37:52,000 gene, for example, that's one of the genes involved in making 498 00:37:52,000 --> 00:37:57,000 histidine. So, since it's a histidine auxotroph, 499 00:37:57,000 --> 00:38:01,000 if I have it on just minimal glucose plates, and I streak it out, 500 00:38:01,000 --> 00:38:07,000 it's not going to grow. But if I grow it on minimal glucose 501 00:38:07,000 --> 00:38:13,000 plus histidine, then it will be able to grow, 502 00:38:13,000 --> 00:38:19,000 right? So, I've got a variant of this organism. 503 00:38:19,000 --> 00:38:25,000 It's got a single mutation in it that's affecting one gene. 504 00:38:25,000 --> 00:38:31,000 And because I don't have that gene, I can't grow on minimal. 505 00:38:31,000 --> 00:38:35,000 If I made a library of E. coli DNA, which is going to have a 506 00:38:35,000 --> 00:38:40,000 lot of fragments as well, and I took that library and I put it 507 00:38:40,000 --> 00:38:45,000 into this mutant, I'm going to get a big mess of 508 00:38:45,000 --> 00:38:49,000 things, all the different plasmids with all the different fragments 509 00:38:49,000 --> 00:38:54,000 that go into that mutant. 
How am I going to find the one that 510 00:38:54,000 --> 00:38:59,000 I want? Anybody see that? It's not that hard. I'm the mutant. 511 00:38:59,000 --> 00:39:04,000 I can't grow on minimal because I can't make this enzyme. 512 00:39:04,000 --> 00:39:09,000 Therefore, I can't make histidine. What do I need? How could you fix 513 00:39:09,000 --> 00:39:14,000 me up if you were a doctor? What's the gene we want out of here? 514 00:39:14,000 --> 00:39:20,000 The one that makes that particular enzyme that makes histidine. 515 00:39:20,000 --> 00:39:25,000 Yeah, take the whole library, stick it in this mutant. If the 516 00:39:25,000 --> 00:39:31,000 gene coming in encodes DNA polymerase, it's not going 517 00:39:31,000 --> 00:39:36,000 to help this guy. It still won't grow on minimal. 518 00:39:36,000 --> 00:39:42,000 A gene involved in making part of the cell wall wouldn't help either. 519 00:39:42,000 --> 00:39:48,000 But if I get a fragment of DNA that includes the 520 00:39:48,000 --> 00:39:53,000 hisG+ gene, and I put it in here, 521 00:39:53,000 --> 00:39:59,000 it's going to grow up, if it has the plasmid that has, 522 00:39:59,000 --> 00:40:05,000 or let's say the vector that has, the hisG+ gene. 523 00:40:05,000 --> 00:40:08,000 So what you've done is, really, you've used that principle 524 00:40:08,000 --> 00:40:12,000 of complementation that some of you were sort of wondering about when we 525 00:40:12,000 --> 00:40:16,000 were doing genetics. So, you'd break a copy of a gene. 526 00:40:16,000 --> 00:40:20,000 In the things we talked about, we'd bring in a whole chromosome that 527 00:40:20,000 --> 00:40:24,000 included in it just a wild type copy of the gene. With recombinant DNA, 528 00:40:24,000 --> 00:40:28,000 we can really narrow it down in the extreme. We can bring in a piece of 529 00:40:28,000 --> 00:40:32,000 DNA that is only the gene that's broken. 
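The selection by complementation described above can be mimicked with a small Python sketch. It is a toy model, not a lab protocol: each plasmid is reduced to the set of genes on its insert, the host is assumed to lack only a working hisG, and the plasmid names and the non-his gene labels are invented for illustration.

```python
# Toy complementation screen: which library plasmids rescue a hisG- mutant?

# Hypothetical library; each plasmid carries one insert with the genes listed.
library = {
    "plasmid_1": {"dna_polymerase_gene"},
    "plasmid_2": {"cell_wall_gene"},
    "plasmid_3": {"hisG"},
}


def grows_on_minimal(plasmid_genes):
    """The hisG- host makes histidine only if the plasmid supplies hisG."""
    return "hisG" in plasmid_genes


def select_complementing(plasmids):
    """Return the names of plasmids whose transformants grow without histidine."""
    return sorted(name for name, genes in plasmids.items()
                  if grows_on_minimal(genes))


survivors = select_complementing(library)
print(survivors)
```

Plating the whole transformed population on minimal glucose does this filtering in one step: the only colonies that come up are the ones carrying the insert that supplies the missing function.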
530 00:40:32,000 --> 00:40:36,000 And we can take the gene back to the wild type. One thing, 531 00:40:36,000 --> 00:40:41,000 just to close, you'll see, if you remember back when I talked 532 00:40:41,000 --> 00:40:45,000 about languages that are not universal, although the genetic code 533 00:40:45,000 --> 00:40:50,000 is universal, promoters and things are not. So I couldn't ever do this 534 00:40:50,000 --> 00:40:54,000 with human DNA, could I, because it wouldn't get 535 00:40:54,000 --> 00:40:59,000 expressed. So, we need some other ways of finding 536 00:40:59,000 --> 00:41:02,000 those. We'll talk about those in the next lecture, OK?