1 00:00:07,060 --> 00:00:08,540 PROFESSOR: So now I have this. 2 00:00:08,540 --> 00:00:12,066 And what I need to do is I need to transform. 3 00:00:17,790 --> 00:00:28,200 So I need to take E.coli growing in a test tube. 4 00:00:28,200 --> 00:00:34,510 And I need to put into it a whole collection of zillions 5 00:00:34,510 --> 00:00:40,710 and zillions of plasmid molecules, plasmid circles 6 00:00:40,710 --> 00:00:44,070 that all have different pieces of human DNA. 7 00:00:44,070 --> 00:00:46,260 One of these things might have ARG1, one ARG2, 8 00:00:46,260 --> 00:00:48,630 one ARG3, et cetera. 9 00:00:48,630 --> 00:00:50,680 Or if it's human DNA, one of them your hemoglobin, one of 10 00:00:50,680 --> 00:00:52,420 them your collagen, whatever. 11 00:00:52,420 --> 00:00:55,390 I have to throw the DNA on top of the bacteria. 12 00:00:59,410 --> 00:01:02,210 And then I need to come up with some engineering trick to 13 00:01:02,210 --> 00:01:08,570 make the bacteria take up this DNA across their cell 14 00:01:08,570 --> 00:01:13,630 membranes and use it. 15 00:01:13,630 --> 00:01:16,460 Why should I be able to persuade bacteria 16 00:01:16,460 --> 00:01:19,656 to take up my DNA? 17 00:01:19,656 --> 00:01:20,620 AUDIENCE: They do it anyway. 18 00:01:20,620 --> 00:01:21,570 PROFESSOR: They do it anyway. 19 00:01:21,570 --> 00:01:23,550 Remember, we said they like to exchange DNA. 20 00:01:23,550 --> 00:01:25,710 That's kind of their thing, right? 21 00:01:25,710 --> 00:01:29,060 They pick up DNA from their environment anyway. 22 00:01:29,060 --> 00:01:32,440 See, as usual, we haven't really invented something. 23 00:01:32,440 --> 00:01:35,780 What we've done is we have used the natural property 24 00:01:35,780 --> 00:01:36,970 that's out there. 25 00:01:36,970 --> 00:01:40,690 The natural property is, bacteria like to slurp up DNA. 
26 00:01:40,690 --> 00:01:44,520 Now in fairness, some pretty cool protocols were developed 27 00:01:44,520 --> 00:01:47,050 that improve the ability to do it. 28 00:01:47,050 --> 00:01:49,290 You treat the bacteria a little nastily. 29 00:01:49,290 --> 00:01:50,720 You give it certain ions. 30 00:01:50,720 --> 00:01:52,640 You heat shock it in a certain way. 31 00:01:52,640 --> 00:01:56,420 And then it encourages it to slurp up DNA better. 32 00:01:56,420 --> 00:01:58,410 But the ability to slurp up DNA was there 33 00:01:58,410 --> 00:01:59,580 in the first place. 34 00:01:59,580 --> 00:02:02,630 We couldn't have given it the ability to slurp up DNA if it 35 00:02:02,630 --> 00:02:03,200 wasn't there. 36 00:02:03,200 --> 00:02:06,840 But we can always goose it up, improve it in certain ways. 37 00:02:06,840 --> 00:02:09,500 And there are great protocols that were developed in order 38 00:02:09,500 --> 00:02:13,510 to be able to increase the efficiency of such 39 00:02:13,510 --> 00:02:15,150 transformation. 40 00:02:15,150 --> 00:02:20,670 So now we have our E.coli. 41 00:02:20,670 --> 00:02:23,800 Here's an E.coli. 42 00:02:23,800 --> 00:02:32,160 Some of the E.coli have acquired a plasmid containing 43 00:02:32,160 --> 00:02:35,130 human DNA or maybe, if we used yeast, yeast DNA. 44 00:02:40,750 --> 00:02:41,890 Some of them have it, right? 45 00:02:41,890 --> 00:02:43,550 Because it's kind of a random process. 46 00:02:43,550 --> 00:02:45,910 Some slurp it up, some haven't. 47 00:02:45,910 --> 00:02:50,240 So I might have, I don't know, maybe one in 100 of my cells 48 00:02:50,240 --> 00:02:51,520 will have slurped up DNA. 49 00:02:51,520 --> 00:02:56,500 But 99 in 100 won't have slurped up DNA. 50 00:02:56,500 --> 00:03:01,530 I want to plate these out on a Petri plate and grow them up 51 00:03:01,530 --> 00:03:04,710 into individual colonies. 
52 00:03:04,710 --> 00:03:07,000 But I would like it to be the case, please, that the 53 00:03:07,000 --> 00:03:09,750 individual colonies that grow up are only the ones that got 54 00:03:09,750 --> 00:03:11,190 my plasmid. 55 00:03:11,190 --> 00:03:12,840 I really don't want to see all these guys who 56 00:03:12,840 --> 00:03:16,210 didn't get my plasmid. 57 00:03:16,210 --> 00:03:16,980 So what do I do? 58 00:03:16,980 --> 00:03:20,630 Do I go in there with a microscope and try to figure 59 00:03:20,630 --> 00:03:23,528 out which ones got my human DNA plasmid? 60 00:03:27,690 --> 00:03:31,410 So I would love to plate it out and only have the guys who 61 00:03:31,410 --> 00:03:32,660 got my plasmid grow. 62 00:03:35,480 --> 00:03:38,030 How do I arrange that only the guys who got my 63 00:03:38,030 --> 00:03:39,280 plasmid will grow? 64 00:03:42,240 --> 00:03:43,060 Sorry? 65 00:03:43,060 --> 00:03:45,250 AUDIENCE: [INAUDIBLE]. 66 00:03:45,250 --> 00:03:45,860 PROFESSOR: For something. 67 00:03:45,860 --> 00:03:49,160 I got to put in a gene into that plasmid so that only the 68 00:03:49,160 --> 00:03:51,530 guys who get that will grow. 69 00:03:51,530 --> 00:03:55,086 Any ideas for what gene might go in there? 70 00:03:55,086 --> 00:03:57,000 AUDIENCE: [INAUDIBLE]. 71 00:03:57,000 --> 00:03:57,746 PROFESSOR: Sorry? 72 00:03:57,746 --> 00:03:59,810 AUDIENCE: [INAUDIBLE]. 73 00:03:59,810 --> 00:04:02,420 PROFESSOR: How about a gene conferring resistance to an 74 00:04:02,420 --> 00:04:04,410 antibiotic. 75 00:04:04,410 --> 00:04:06,270 And then what could I do? 76 00:04:06,270 --> 00:04:07,520 Well, I could plate. 77 00:04:12,730 --> 00:04:13,980 I'm going to select-- 78 00:04:18,459 --> 00:04:21,290 see this E.coli here, it had my plasmid. 79 00:04:21,290 --> 00:04:23,150 And it's going to have a drug resistance gene. 80 00:04:23,150 --> 00:04:25,080 And it will grow up. 81 00:04:25,080 --> 00:04:28,565 This guy over here, he didn't get a drug resistance gene. 
82 00:04:28,565 --> 00:04:30,360 He doesn't grow up. 83 00:04:30,360 --> 00:04:34,780 The only ones who grow up have a drug resistant-- 84 00:04:34,780 --> 00:04:37,950 wait a second. 85 00:04:37,950 --> 00:04:39,617 Why doesn't this guy grow up? 86 00:04:44,090 --> 00:04:45,090 AUDIENCE: You add the drug. 87 00:04:45,090 --> 00:04:48,170 PROFESSOR: Oh, I got to add the drug, good point. 88 00:04:48,170 --> 00:04:51,110 So what I'm going to do is I add my drug to the plate. 89 00:04:53,640 --> 00:04:58,960 I add my drug to the medium, to the Petri plate here. 90 00:04:58,960 --> 00:05:02,170 And now my Petri plate is, say, a penicillin plate or a 91 00:05:02,170 --> 00:05:04,740 streptomycin plate or an ampicillin plate or a 92 00:05:04,740 --> 00:05:07,150 kanamycin plate or whatever. 93 00:05:07,150 --> 00:05:10,300 And then the only things that could grow up are the ones 94 00:05:10,300 --> 00:05:13,110 that have resistance to whatever antibiotic I used. 95 00:05:13,110 --> 00:05:15,580 So I grow them up on that plate. 96 00:05:15,580 --> 00:05:17,910 And they grew up just fine. 97 00:05:17,910 --> 00:05:21,960 The ones who didn't get-- now how did I get that antibiotic 98 00:05:21,960 --> 00:05:23,238 gene into the plasmid? 99 00:05:26,990 --> 00:05:29,890 It actually comes that way in nature, sort of. 100 00:05:29,890 --> 00:05:31,050 Now when I get good, I can move them 101 00:05:31,050 --> 00:05:32,430 around from one to another. 102 00:05:32,430 --> 00:05:36,310 But nature thought of it first. 103 00:05:36,310 --> 00:05:38,630 Nature gave us something that had an origin of replication 104 00:05:38,630 --> 00:05:41,060 and an antibiotic resistance gene. 105 00:05:41,060 --> 00:05:42,840 And we could just add things to it. 106 00:05:42,840 --> 00:05:44,640 It had everything we needed. 107 00:05:44,640 --> 00:05:46,390 It had the ability to slurp up DNA. 
108 00:05:46,390 --> 00:05:50,730 It had the ability to confer a new property on the cells like 109 00:05:50,730 --> 00:05:52,370 resistance to antibiotics. 110 00:05:52,370 --> 00:05:56,510 And when we're done, we have a library here. 111 00:06:02,070 --> 00:06:11,420 And we call this a library of cells or colonies, each one of 112 00:06:11,420 --> 00:06:15,040 which has a distinct piece of human DNA, each one of which 113 00:06:15,040 --> 00:06:16,600 has a different piece of human DNA. 114 00:06:20,987 --> 00:06:23,190 Let me stop and ask, any questions? 115 00:06:23,190 --> 00:06:25,040 This is like one of the coolest 116 00:06:25,040 --> 00:06:26,830 new procedures invented. 117 00:06:26,830 --> 00:06:27,830 Yes? 118 00:06:27,830 --> 00:06:29,763 AUDIENCE: Is it possible that the same E.coli takes in two 119 00:06:29,763 --> 00:06:31,470 of the things? 120 00:06:31,470 --> 00:06:36,750 PROFESSOR: Is it possible that same E.coli takes in two? 121 00:06:36,750 --> 00:06:42,000 Yes, turns out we do this at a dilution in a way so that most 122 00:06:42,000 --> 00:06:47,170 take up zero, some take up one, and a couple take up two. 123 00:06:47,170 --> 00:06:50,960 And what we say there is, oh well. 124 00:06:50,960 --> 00:06:51,990 It's not so common. 125 00:06:51,990 --> 00:06:53,590 And we live with it. 126 00:06:53,590 --> 00:06:55,158 What else? 127 00:06:55,158 --> 00:06:58,152 AUDIENCE: You need to make sure the bacteria that doesn't 128 00:06:58,152 --> 00:06:59,650 already have a resistance to it? 129 00:06:59,650 --> 00:07:01,990 PROFESSOR: I better use a bacteria that's not already 130 00:07:01,990 --> 00:07:05,750 resistant to this antibiotic, good point. 131 00:07:05,750 --> 00:07:06,490 That's right. 132 00:07:06,490 --> 00:07:07,040 And I can do that. 133 00:07:07,040 --> 00:07:07,993 Yes? 
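The uptake numbers mentioned here (most cells take up zero plasmids, some take up one, a couple take up two) can be sketched with a simple probability model. This is only an illustration: it assumes that the number of plasmids per cell follows a Poisson distribution, which the lecture does not state, and the mean of 0.01 plasmids per cell is an assumption chosen to echo the "one in 100" figure. The function names `poisson_pmf` and `uptake_summary` are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a cell takes up exactly k plasmids (Poisson assumption)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def uptake_summary(mean: float) -> dict:
    """Summarize plasmid uptake per cell for a given mean number of plasmids."""
    p_zero = poisson_pmf(0, mean)
    p_one = poisson_pmf(1, mean)
    p_multi = 1.0 - p_zero - p_one
    # Only cells with at least one plasmid carry the resistance gene,
    # so only they form colonies on the antibiotic plate.
    p_colony = 1.0 - p_zero
    return {
        "no_plasmid": p_zero,
        "one_plasmid": p_one,
        "two_or_more": p_multi,
        "colonies_with_multiple": p_multi / p_colony,
    }


# Assumed mean uptake, chosen only to match "maybe one in 100 of my cells".
MEAN_PLASMIDS_PER_CELL = 0.01

summary = uptake_summary(MEAN_PLASMIDS_PER_CELL)
for name, value in summary.items():
    print(f"{name}: {value:.6f}")
```

Under these assumptions, roughly half a percent of the colonies that grow on the antibiotic plate would carry more than one plasmid, which fits the "it's not so common, and we live with it" answer.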
134 00:07:07,993 --> 00:07:09,965 AUDIENCE: When making those vectors, how do you make sure 135 00:07:09,965 --> 00:07:13,662 it only cuts once so you can put one piece of human DNA and 136 00:07:13,662 --> 00:07:14,895 it doesn't break apart? 137 00:07:14,895 --> 00:07:20,790 PROFESSOR: Oh, so what if the vector had five EcoR1 sites? 138 00:07:20,790 --> 00:07:24,170 People have engineered them to just have one. 139 00:07:24,170 --> 00:07:27,500 In fact, what they've really done now is they engineer them 140 00:07:27,500 --> 00:07:30,260 so that within the vector there's a little region, 141 00:07:30,260 --> 00:07:34,370 called a polylinker, that has many different sites for many 142 00:07:34,370 --> 00:07:36,470 different restriction enzymes, depending on which one you 143 00:07:36,470 --> 00:07:37,170 might want to use. 144 00:07:37,170 --> 00:07:39,130 And they're all right over here in the same place. 145 00:07:39,130 --> 00:07:41,670 And they go around and they fix the vector. 146 00:07:41,670 --> 00:07:43,730 So it doesn't have any of the others. 147 00:07:43,730 --> 00:07:46,740 You could, for example cut it open and change the sequence a 148 00:07:46,740 --> 00:07:47,040 little bit. 149 00:07:47,040 --> 00:07:48,810 All these tools are available to us. 150 00:07:48,810 --> 00:07:50,510 And rather than you having to do it, 151 00:07:50,510 --> 00:07:51,590 they're all in the catalog. 152 00:07:51,590 --> 00:07:54,300 The catalog has vectors that have polylinkers, don't have 153 00:07:54,300 --> 00:07:55,650 any other cut sites, et cetera. 154 00:07:55,650 --> 00:07:57,132 Yes? 155 00:07:57,132 --> 00:08:00,604 AUDIENCE: How do you isolate the sequence you want if the 156 00:08:00,604 --> 00:08:03,090 cutting sequence is so frequent? 157 00:08:03,090 --> 00:08:06,210 PROFESSOR: Oh, so you're asking two questions. 
158 00:08:06,210 --> 00:08:10,290 One is you're asking how do I find the particular thing I'm 159 00:08:10,290 --> 00:08:12,390 looking for on the plate amongst the millions of 160 00:08:12,390 --> 00:08:14,080 colonies that might grow up? 161 00:08:14,080 --> 00:08:16,320 Or are you asking if the cutting sequence is so 162 00:08:16,320 --> 00:08:19,770 frequent, how do I avoid cutting in the middle of the 163 00:08:19,770 --> 00:08:21,020 sequence I want? 164 00:08:25,600 --> 00:08:25,945 AUDIENCE: Well both, I guess. 165 00:08:25,945 --> 00:08:30,470 If you want to replicate human genes, if you have a bunch of 166 00:08:30,470 --> 00:08:32,905 DNA, how can you make it so that you 167 00:08:32,905 --> 00:08:35,830 isolate only that gene? 168 00:08:35,830 --> 00:08:37,530 PROFESSOR: OK, so now there are really two questions 169 00:08:37,530 --> 00:08:38,120 you're asking. 170 00:08:38,120 --> 00:08:40,220 They're both really good questions. 171 00:08:40,220 --> 00:08:45,530 One, I take my human DNA. 172 00:08:45,530 --> 00:08:49,970 And let's say we're trying to clone the gene for 173 00:08:49,970 --> 00:08:51,680 beta-globin. 174 00:08:51,680 --> 00:08:53,890 Beta-globin is a part of hemoglobin. 175 00:08:53,890 --> 00:08:58,280 It's one of the two proteins in hemoglobin. 176 00:08:58,280 --> 00:09:06,710 And let's say the beta-globin gene is a certain length. 177 00:09:06,710 --> 00:09:12,820 And suppose the beta-globin gene actually has several 178 00:09:12,820 --> 00:09:14,070 EcoR1 sites. 179 00:09:20,480 --> 00:09:22,380 I'm going to chop it up into multiple pieces. 180 00:09:22,380 --> 00:09:24,250 That's bad, right? 181 00:09:24,250 --> 00:09:27,370 That's step one bad. 182 00:09:27,370 --> 00:09:28,620 How do I avoid that? 183 00:09:31,240 --> 00:09:32,130 Sorry? 184 00:09:32,130 --> 00:09:33,540 AUDIENCE: Paste it. 185 00:09:33,540 --> 00:09:34,160 PROFESSOR: Paste it again. 
186 00:09:34,160 --> 00:09:36,960 Oh no, once I cut it, all the pieces of human DNA are 187 00:09:36,960 --> 00:09:37,700 floating around. 188 00:09:37,700 --> 00:09:40,090 How do I know what to paste to what? 189 00:09:40,090 --> 00:09:41,470 AUDIENCE: Methylate it. 190 00:09:41,470 --> 00:09:42,630 PROFESSOR: Methylate it. 191 00:09:42,630 --> 00:09:44,800 So first I could take my human DNA. 192 00:09:44,800 --> 00:09:46,470 And I could methylate it. 193 00:09:46,470 --> 00:09:48,030 Then it won't get cut up by the enzyme. 194 00:09:48,030 --> 00:09:50,120 But what's the problem with that? 195 00:09:50,120 --> 00:09:51,640 None of it will get cut up with the enzyme. 196 00:09:55,230 --> 00:09:58,620 Suppose I did something that was a compromise though. 197 00:09:58,620 --> 00:10:05,890 Suppose I added a little methylase and a little 198 00:10:05,890 --> 00:10:08,180 restriction enzyme. 199 00:10:08,180 --> 00:10:11,700 And the methylase went around randomly and put some 200 00:10:11,700 --> 00:10:14,020 methyl groups on. 201 00:10:14,020 --> 00:10:16,640 Then these wouldn't get cut. 202 00:10:16,640 --> 00:10:20,620 And sometimes it might put on that. 203 00:10:20,620 --> 00:10:22,010 And those wouldn't get cut. 204 00:10:22,010 --> 00:10:23,260 And it would just get cut here. 205 00:10:25,770 --> 00:10:28,805 I would call such a thing a partial digestion. 206 00:10:32,210 --> 00:10:34,810 It can be done one of two ways. 207 00:10:34,810 --> 00:10:37,380 I could just add a little bit of restriction enzyme and not 208 00:10:37,380 --> 00:10:38,630 give it too long. 209 00:10:38,630 --> 00:10:40,800 Or I could add a mixture of a little restriction enzyme, a 210 00:10:40,800 --> 00:10:43,730 little methylation enzyme, and let them have at it. 
211 00:10:43,730 --> 00:10:46,540 And if I had the right balance, I could cut on 212 00:10:46,540 --> 00:10:50,730 average every second restriction site or on average 213 00:10:50,730 --> 00:10:52,880 every fourth restriction site. 214 00:10:52,880 --> 00:10:56,970 Or if I use a different ratio, every tenth restriction site. 215 00:10:56,970 --> 00:11:00,382 So I actually can play a stochastic, a probabilistic 216 00:11:00,382 --> 00:11:05,490 game of cutting every second, fourth, tenth site and 217 00:11:05,490 --> 00:11:06,420 obviously get a mix. 218 00:11:06,420 --> 00:11:08,250 I'll get a mixture of different cuts. 219 00:11:08,250 --> 00:11:11,450 But in terms of cutting in the middle of my gene, I no longer 220 00:11:11,450 --> 00:11:15,010 care because I can do partial digestions. 221 00:11:15,010 --> 00:11:18,130 Now that means that in my library I might not just have 222 00:11:18,130 --> 00:11:19,270 the Eco fragments. 223 00:11:19,270 --> 00:11:21,820 But because I did a partial digestion, I might have two 224 00:11:21,820 --> 00:11:24,130 consecutive or five consecutive Eco fragments, if 225 00:11:24,130 --> 00:11:25,700 I've done a partial digestion. 226 00:11:25,700 --> 00:11:26,860 But your question still stands. 227 00:11:26,860 --> 00:11:29,740 How do I find my right gene? 228 00:11:29,740 --> 00:11:31,640 How do I figure out which is the right gene? 229 00:11:35,690 --> 00:11:38,430 So I've got this. 230 00:11:38,430 --> 00:11:39,680 I've got my plate. 231 00:11:42,040 --> 00:11:44,820 I've done this amazing purification. 232 00:11:44,820 --> 00:11:49,320 Every one of these guys has a chunk of human DNA that might 233 00:11:49,320 --> 00:11:51,090 be just one EcoR1 fragment. 234 00:11:51,090 --> 00:11:54,430 Or if I happen to have done a partial digestion, it might be 235 00:11:54,430 --> 00:11:56,450 two EcoR1 fragments. 236 00:11:56,450 --> 00:12:04,262 But it turns out that this guy here is beta-globin. 
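The partial digestion idea described above can be sketched as a small simulation. This is only an illustration under stated assumptions: each restriction site is cut independently with a fixed probability (1/2, 1/4 or 1/10, echoing "every second, fourth, tenth site"), and the molecule length, site spacing and gene coordinates are made-up numbers, not data from the lecture. The function names are hypothetical.

```python
import random


def partial_digest(site_positions, molecule_length, cut_probability, rng):
    """Cut each site independently with probability cut_probability.

    site_positions must be sorted. Returns (start, end) fragments that
    together cover the whole molecule.
    """
    cuts = [pos for pos in site_positions if rng.random() < cut_probability]
    boundaries = [0] + cuts + [molecule_length]
    return list(zip(boundaries[:-1], boundaries[1:]))


def gene_is_intact(fragments, gene_start, gene_end):
    """True if a single fragment contains the whole gene."""
    return any(start <= gene_start and gene_end <= end for start, end in fragments)


def intact_fraction(site_positions, molecule_length, gene_start, gene_end,
                    cut_probability, trials, rng):
    """Fraction of simulated digests in which the gene survives in one piece."""
    intact = 0
    for _ in range(trials):
        fragments = partial_digest(site_positions, molecule_length, cut_probability, rng)
        if gene_is_intact(fragments, gene_start, gene_end):
            intact += 1
    return intact / trials


# Hypothetical numbers, chosen only for illustration.
MOLECULE_LENGTH = 100_000
SITES = list(range(4_000, MOLECULE_LENGTH, 4_000))  # one site every 4,000 bases
GENE_START, GENE_END = 41_000, 51_000               # spans the sites at 44,000 and 48,000

rng = random.Random(0)
for p in (1.0, 0.5, 0.25, 0.1):
    frac = intact_fraction(SITES, MOLECULE_LENGTH, GENE_START, GENE_END, p, 2_000, rng)
    print(f"cut probability {p}: gene intact in {frac:.2f} of digests")
```

In this sketch the gene has two sites inside it, so the chance that neither is cut is (1 - p)^2. A complete digestion (p = 1) always breaks the gene, while lower cut probabilities leave it in one piece more often, at the price of longer fragments made of several consecutive pieces.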
237 00:12:04,262 --> 00:12:05,970 How do I know? 238 00:12:05,970 --> 00:12:07,050 I've purified beta-globin. 239 00:12:07,050 --> 00:12:08,430 I get points, right? 240 00:12:08,430 --> 00:12:10,070 I purified beta-globin. 241 00:12:10,070 --> 00:12:11,930 Somewhere on my plate is a pure 242 00:12:11,930 --> 00:12:14,520 beta-globin sitting there. 243 00:12:14,520 --> 00:12:18,180 But the problem is I don't know where it is. 244 00:12:18,180 --> 00:12:19,890 How am I going to find it? 245 00:12:19,890 --> 00:12:21,080 They all still look the same. 246 00:12:21,080 --> 00:12:22,330 They look like bacteria. 247 00:12:25,867 --> 00:12:27,791 AUDIENCE: [INAUDIBLE]. 248 00:12:27,791 --> 00:12:29,864 PROFESSOR: To do what? 249 00:12:29,864 --> 00:12:33,343 AUDIENCE: Well, in this case find some way to determine 250 00:12:33,343 --> 00:12:37,330 whether or not a certain colony has beta-globin. 251 00:12:37,330 --> 00:12:38,300 PROFESSOR: Whether a certain colony-- 252 00:12:38,300 --> 00:12:39,500 AUDIENCE: Has beta-globin. 253 00:12:39,500 --> 00:12:41,610 PROFESSOR: Has the beta-globin gene in it. 254 00:12:41,610 --> 00:12:45,330 So I somehow have to test each colony to figure out whether 255 00:12:45,330 --> 00:12:46,780 it's got the beta-globin gene in it. 256 00:12:46,780 --> 00:12:48,070 How do you propose to do that? 257 00:12:52,840 --> 00:12:54,090 AUDIENCE: There's some observation you can make. 258 00:12:58,130 --> 00:12:58,605 PROFESSOR: Talk loud. 259 00:12:58,605 --> 00:13:01,160 AUDIENCE: There's some observation you can make. 260 00:13:01,160 --> 00:13:02,910 PROFESSOR: And you make that observation. 261 00:13:02,910 --> 00:13:05,460 Right, so-- 262 00:13:05,460 --> 00:13:05,960 I'm with you. 263 00:13:05,960 --> 00:13:06,918 Yeah? 264 00:13:06,918 --> 00:13:09,313 AUDIENCE: You basically want some ribosomes 265 00:13:09,313 --> 00:13:10,750 to create the proteins. 
266 00:13:10,750 --> 00:13:12,870 PROFESSOR: Oh, oh, oh, so maybe I can make that thing 267 00:13:12,870 --> 00:13:14,860 make beta-globin. 268 00:13:14,860 --> 00:13:17,190 Oh, and then look for the protein. 269 00:13:17,190 --> 00:13:18,960 Problem with that is E.coli doesn't read human 270 00:13:18,960 --> 00:13:20,360 instructions. 271 00:13:20,360 --> 00:13:22,030 So E.coli's polymerase won't read. 272 00:13:22,030 --> 00:13:22,960 But it's a great idea. 273 00:13:22,960 --> 00:13:23,940 Maybe we can do that. 274 00:13:23,940 --> 00:13:25,590 Any other ideas? 275 00:13:25,590 --> 00:13:27,360 How are we going to find our gene? 276 00:13:27,360 --> 00:13:28,410 We've made a library. 277 00:13:28,410 --> 00:13:29,100 We have a library. 278 00:13:29,100 --> 00:13:30,790 It's a bigger library than MIT. 279 00:13:30,790 --> 00:13:32,630 It's got more volumes than MIT. 280 00:13:32,630 --> 00:13:33,980 How are we going to find our book? 281 00:13:33,980 --> 00:13:36,820 AUDIENCE: Put it in a medium without beta-globin. 282 00:13:36,820 --> 00:13:39,932 PROFESSOR: Put a medium without beta-globin. 283 00:13:39,932 --> 00:13:40,900 AUDIENCE: See if it grows or not. 284 00:13:40,900 --> 00:13:43,390 PROFESSOR: See if it grows up. 285 00:13:43,390 --> 00:13:45,700 E.coli couldn't care less about beta-globin. 286 00:13:45,700 --> 00:13:46,800 It doesn't need beta-globin. 287 00:13:46,800 --> 00:13:49,060 And in addition, it won't produce beta-globin because it 288 00:13:49,060 --> 00:13:51,480 doesn't read the human instructions. 289 00:13:51,480 --> 00:13:53,960 All right, so we've gone to all this trouble. 290 00:13:53,960 --> 00:13:57,680 We've made a library with all the possible books in the 291 00:13:57,680 --> 00:13:59,380 human genome. 292 00:13:59,380 --> 00:14:01,170 And we don't know how to use the library. 293 00:14:04,480 --> 00:14:07,190 In Friday's lecture, let us discuss 294 00:14:07,190 --> 00:14:08,440 how to use the library.