1 00:00:00,000 --> 00:00:02,496 There were some other questions sort 2 00:00:02,496 --> 00:00:06,500 of running along this general idea of the fact 3 00:00:06,500 --> 00:00:10,000 that the information in DNA doesn't go directly, 4 00:00:10,000 --> 00:00:14,000 even though it encodes the information for proteins, 5 00:00:14,000 --> 00:00:16,998 but goes via this mRNA intermediate. 6 00:00:16,998 --> 00:00:21,000 Someone asked what was the M. 7 00:00:21,000 --> 00:00:22,665 The M is for messenger. 8 00:00:22,665 --> 00:00:25,000 The idea being that since the DNA, 9 00:00:25,000 --> 00:00:28,070 at least in eukaryotes the DNA was in the nucleus 10 00:00:28,070 --> 00:00:30,815 and proteins were made out in the cytoplasm, 11 00:00:30,815 --> 00:00:34,000 somehow that information had to be carried from the nucleus 12 00:00:34,000 --> 00:00:36,000 where the DNA was out to the cytoplasm. 13 00:00:36,000 --> 00:00:37,842 And that's where the term messenger 14 00:00:37,842 --> 00:00:41,284 came from, because the RNA was seen as something that would 15 00:00:41,284 --> 00:00:43,000 carry the information out. 16 00:00:43,000 --> 00:00:46,108 Now, a point here that's really critical 17 00:00:46,108 --> 00:00:51,000 because we're going to continue to talk about gene regulation. 18 00:00:51,000 --> 00:00:56,000 And that is when a cell is making one of these mRNAs, 19 00:00:56,000 --> 00:00:59,663 it doesn't make one single copy of all of the genes 20 00:00:59,663 --> 00:01:02,816 that are in the genome on one RNA. 21 00:01:02,816 --> 00:01:06,936 Instead it does it either one gene at a time, 22 00:01:06,936 --> 00:01:09,120 which is the usual case, or occasionally 23 00:01:09,120 --> 00:01:14,178 as we see in the lac operon a little tiny cluster of genes 24 00:01:14,178 --> 00:01:16,000 that have related functions. 
25 00:01:16,000 --> 00:01:19,375 And the beauty of that is that it then 26 00:01:19,375 --> 00:01:24,080 enables the cell to dial in how much protein is being made, 27 00:01:24,080 --> 00:01:28,535 in part at least, by determining how much RNA is being made. 28 00:01:28,535 --> 00:01:32,596 So you're going to make no RNA and not make the protein at all 29 00:01:32,596 --> 00:01:35,307 or make a little RNA and get a little protein, 30 00:01:35,307 --> 00:01:37,763 or if it's really a thing you need 31 00:01:37,763 --> 00:01:41,394 very large quantities of you can really crank out a lot of RNA 32 00:01:41,394 --> 00:01:43,000 and make a lot of protein. 33 00:01:43,000 --> 00:01:45,100 So the potential is there for regulation. 34 00:01:45,100 --> 00:01:48,834 And in bacteria, as I said, for almost all bacterial genes 35 00:01:48,834 --> 00:01:50,000 it's pretty straightforward. 36 00:01:50,000 --> 00:01:52,394 You can look in the DNA, and at least 37 00:01:52,394 --> 00:01:55,596 if you know where to start, see the start of a protein, 38 00:01:55,596 --> 00:01:58,400 you can just use that table of the genetic code 39 00:01:58,400 --> 00:02:00,400 and read off the sequence. 40 00:02:00,400 --> 00:02:04,725 But eukaryotes, in particular higher eukaryotes, have this odd 41 00:02:04,725 --> 00:02:09,536 business, what seemed odd and surprising, that when you 42 00:02:09,536 --> 00:02:13,800 look at their genes, for many of them you cannot do that 43 00:02:13,800 --> 00:02:18,200 because it's as if there are extra bits of DNA stuck 44 00:02:18,200 --> 00:02:19,357 in the middle. 45 00:02:19,357 --> 00:02:22,213 And in some cases you heard it could 46 00:02:22,213 --> 00:02:27,688 be really huge amounts of DNA so that there is this extra thing 47 00:02:27,688 --> 00:02:30,000 where there's a pre-mRNA. 48 00:02:30,000 --> 00:02:34,224 And this RNA splicing we talked about has to take place 49 00:02:34,224 --> 00:02:35,922 to generate the mRNA. 
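As a rough sketch of the two cases just described, reading a bacterial gene straight off with the genetic code versus first removing an intervening sequence, something like the following could be written. The sequences, the exon coordinates, and the function names `splice` and `translate` are made up for illustration, the codon table holds only the handful of standard codons the toy example needs, and the sequence is written with DNA letters (T rather than U) for simplicity.

```python
# Tiny subset of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "GCT": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop
}


def splice(gene, exons):
    """Join the exon intervals (start, end), end exclusive, dropping the rest."""
    return "".join(gene[start:end] for start, end in exons)


def translate(coding_dna):
    """Read codons three letters at a time from the first base until a stop."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical bacterial-style gene: it can be read off directly.
bacterial_gene = "ATGGCTAAATGGTAA"

# Hypothetical eukaryotic-style gene: the same coding sequence with a
# made-up 12-letter intervening sequence stuck in the middle.
eukaryotic_gene = "ATGGCT" + "GTAAGTTTTCAG" + "AAATGGTAA"
exons = [(0, 6), (18, 27)]

mature = splice(eukaryotic_gene, exons)
print(translate(bacterial_gene))
print(translate(mature))
```

Both calls give the same short protein, because splicing out the intervening sequence restores the uninterrupted reading frame. Reading the unspliced toy gene directly would run into the intervening letters after the second codon.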
50 00:02:35,922 --> 00:02:41,384 And once you have the mRNA then the ribosome and the charged 51 00:02:41,384 --> 00:02:44,456 tRNAs can be used to make the proteins. 52 00:02:44,456 --> 00:02:49,000 And someone asked what all that extra DNA is for. 53 00:02:49,000 --> 00:02:53,428 I mean we still don't fully understand that. 54 00:02:53,428 --> 00:02:57,000 There are some regulatory sequences 55 00:02:57,000 --> 00:03:03,000 and regulatory elements that are buried in that noncoding DNA. 56 00:03:03,000 --> 00:03:05,565 But another thing may be that this is just 57 00:03:05,565 --> 00:03:07,333 the way it's worked in evolution. 58 00:03:07,333 --> 00:03:10,663 And as long as it works there's no driving force 59 00:03:10,663 --> 00:03:13,000 necessarily to get rid of it. 60 00:03:13,000 --> 00:03:16,000 If you look at microorganisms, for example, 61 00:03:16,000 --> 00:03:20,000 yeast is a eukaryote that, like E. coli, 62 00:03:20,000 --> 00:03:22,541 has to replicate pretty fast in order 63 00:03:22,541 --> 00:03:26,220 to compete with other microorganisms for the food 64 00:03:26,220 --> 00:03:29,000 and whatnot in its environment. 65 00:03:29,000 --> 00:03:32,600 And it has relatively few of these extra intervening 66 00:03:32,600 --> 00:03:35,496 sequences compared to what we find in our DNA. 67 00:03:35,496 --> 00:03:38,000 I gave you the example of Factor 8. 68 00:03:38,000 --> 00:03:41,663 It didn't really particularly matter what it was in the sense 69 00:03:41,663 --> 00:03:45,456 that it was just an example of something that has 70 00:03:45,456 --> 00:03:47,454 a lot of intervening sequence. 71 00:03:47,454 --> 00:03:51,540 What it is, though, it's one of a set 72 00:03:51,540 --> 00:03:57,000 of proteins that are involved in clotting of your blood. 
73 00:03:57,000 --> 00:04:00,632 When you cut yourself, we have this system 74 00:04:00,632 --> 00:04:04,220 that prevents us from bleeding to death, 75 00:04:04,220 --> 00:04:08,816 unless you have hemophilia or something like that where 76 00:04:08,816 --> 00:04:12,000 there's a problem with the clotting system, 77 00:04:12,000 --> 00:04:14,664 then a very complex set of things happens. 78 00:04:14,664 --> 00:04:18,875 And Factor 8 is one of the several proteins that 79 00:04:18,875 --> 00:04:22,000 play critical roles in that. 80 00:04:22,000 --> 00:04:26,440 Somebody asked, I talked about a few things 81 00:04:26,440 --> 00:04:29,688 that were, this was sort of the dogma. 82 00:04:29,688 --> 00:04:32,908 This was how information was thought to go. 83 00:04:32,908 --> 00:04:35,632 And then how Dave Baltimore found 84 00:04:35,632 --> 00:04:39,724 there was reverse transcriptase that could take an RNA 85 00:04:39,724 --> 00:04:42,000 and make a DNA copy. 86 00:04:42,000 --> 00:04:45,178 And the question was they didn't understand 87 00:04:45,178 --> 00:04:47,357 how the dogma could change. 88 00:04:47,357 --> 00:04:50,213 I mean that's sort of what I'm trying 89 00:04:50,213 --> 00:04:52,832 to emphasize a lot in this course 90 00:04:52,832 --> 00:04:56,576 is what I'm teaching you is what human experimentation 91 00:04:56,576 --> 00:04:59,496 and thought has brought us up until spring 92 00:04:59,496 --> 00:05:02,000 of 2005 in terms of biology. 
93 00:05:02,000 --> 00:05:03,750 Some of the sort of basic discoveries 94 00:05:03,750 --> 00:05:06,332 were made in physics so long ago that it's 95 00:05:06,332 --> 00:05:09,000 very unlikely that you'll come in and discover 96 00:05:09,000 --> 00:05:13,000 that the Newtonian mechanics you learned as a freshman is not 97 00:05:13,000 --> 00:05:14,998 operative anymore when you're a senior, 98 00:05:14,998 --> 00:05:18,776 but you can still have these massive revolutions in biology 99 00:05:18,776 --> 00:05:22,332 where just suddenly whole things, like RNA splicing, 100 00:05:22,332 --> 00:05:25,000 emerge from the woodwork almost overnight. 101 00:05:25,000 --> 00:05:27,904 And that's, in fact, what happened with that. 102 00:05:27,904 --> 00:05:30,284 That's what happened with reverse transcriptase. 103 00:05:30,284 --> 00:05:33,140 So, in a sense, it's almost a joke 104 00:05:33,140 --> 00:05:34,565 that Crick called it dogma. 105 00:05:34,565 --> 00:05:36,428 He didn't know what he was doing. 106 00:05:36,428 --> 00:05:38,140 But it sort of took that property on. 107 00:05:38,140 --> 00:05:41,128 And I'm trying to caution you that even though some of you 108 00:05:41,128 --> 00:05:43,333 would like me to stick to just facts 109 00:05:43,333 --> 00:05:45,331 that we're continually learning and there 110 00:05:45,331 --> 00:05:48,000 are discoveries being made even as we're 111 00:05:48,000 --> 00:05:50,000 going on with this course. 112 00:05:50,000 --> 00:05:53,444 Now, I told you that there were certain viruses, HIV 113 00:05:53,444 --> 00:05:56,108 being the one I really emphasized, 114 00:05:56,108 --> 00:05:58,000 that have this property. 115 00:05:58,000 --> 00:06:01,000 Their genetic material is not DNA. 116 00:06:01,000 --> 00:06:01,800 It's RNA. 
117 00:06:01,800 --> 00:06:05,363 And the reverse transcriptase makes a double strand copy 118 00:06:05,363 --> 00:06:07,904 that it inserts into the organism's DNA 119 00:06:07,904 --> 00:06:10,000 and becomes a permanent part. 120 00:06:10,000 --> 00:06:13,714 And I'm trying to caution you that's why 121 00:06:13,714 --> 00:06:16,213 safe sex is such a big deal. 122 00:06:16,213 --> 00:06:18,570 Because if you get infected with HIV 123 00:06:18,570 --> 00:06:21,135 you'll have it for the rest of your life. 124 00:06:21,135 --> 00:06:23,452 I mentioned there were some cancer viruses, 125 00:06:23,452 --> 00:06:26,000 and someone said, well, if you do 126 00:06:26,000 --> 00:06:29,330 that how can you cure cancer? 127 00:06:29,330 --> 00:06:32,425 Well, in fact, we're lucky in the sense 128 00:06:32,425 --> 00:06:36,428 that we don't have to contend at this point in a major way 129 00:06:36,428 --> 00:06:40,307 with human cancer viruses, that would be more of a problem, 130 00:06:40,307 --> 00:06:41,842 but your cats have to. 131 00:06:41,842 --> 00:06:44,888 You may have heard of the feline leukemia virus. 132 00:06:44,888 --> 00:06:48,454 This is a retrovirus of this same class, 133 00:06:48,454 --> 00:06:53,000 and cats infected with it have it in their saliva. 134 00:06:53,000 --> 00:06:54,998 And although it can be transmitted 135 00:06:54,998 --> 00:06:57,000 amongst pets in the same household, 136 00:06:57,000 --> 00:07:01,978 the big problem is usually cat fights. 137 00:07:01,978 --> 00:07:03,936 And then you get a scratch and then it gets in. 138 00:07:03,936 --> 00:07:05,666 So if you have a cat you probably 139 00:07:05,666 --> 00:07:09,000 had to take it to the vet and get vaccinated. 140 00:07:09,000 --> 00:07:11,456 And one of the reasons is you're trying 141 00:07:11,456 --> 00:07:14,452 to get it vaccinated against the feline leukemia virus. 
142 00:07:14,452 --> 00:07:19,284 And it's just sort of like that story I told you 143 00:07:19,284 --> 00:07:22,842 with the streptococcus, that if your immune system has 144 00:07:22,842 --> 00:07:25,307 seen the thing beforehand then if you actually 145 00:07:25,307 --> 00:07:28,070 get an infection it has a very quick response 146 00:07:28,070 --> 00:07:31,178 and your cat doesn't get infected by the virus. 147 00:07:31,178 --> 00:07:33,500 And therefore doesn't become a candidate 148 00:07:33,500 --> 00:07:36,500 for getting leukemia in later life. 149 00:07:36,500 --> 00:07:37,000 OK. 150 00:07:37,000 --> 00:07:40,458 So at least there's an effort to try and respond to at least 151 00:07:40,458 --> 00:07:42,500 a few of the things. 152 00:07:42,500 --> 00:07:46,152 There were some very interesting and thoughtful questions. 153 00:07:46,152 --> 00:07:50,921 So I want to now go back to talk a little bit more 154 00:07:50,921 --> 00:07:53,684 about this issue of regulation because that is one 155 00:07:53,684 --> 00:07:55,815 of the real secrets to life. 156 00:07:55,815 --> 00:07:58,000 And it's the ability of organisms 157 00:07:58,000 --> 00:07:59,995 to turn on and off certain functions 158 00:07:59,995 --> 00:08:02,333 and to have rheostats where they can control 159 00:08:02,333 --> 00:08:03,665 the levels of expression. 160 00:08:03,665 --> 00:08:07,000 But all of this has to be in the DNA. 161 00:08:07,000 --> 00:08:09,664 And so before people knew how this worked, 162 00:08:09,664 --> 00:08:11,000 it was a little mysterious. 
163 00:08:11,000 --> 00:08:16,000 And maybe one way you might think about it, well, 164 00:08:16,000 --> 00:08:18,904 if I have a gene that encodes something 165 00:08:18,904 --> 00:08:21,920 for lactose metabolism, and I tell you there's 166 00:08:21,920 --> 00:08:25,999 a sequence upstream of this and somehow this gene is regulated 167 00:08:25,999 --> 00:08:29,666 depending on whether I put lactose in the medium or not, 168 00:08:29,666 --> 00:08:31,664 how is it going to work? 169 00:08:31,664 --> 00:08:34,285 Does it have to fit into little holes 170 00:08:34,285 --> 00:08:36,565 in between the letters of the genetic code 171 00:08:36,565 --> 00:08:40,000 sort of the way Gamow originally thought about it, 172 00:08:40,000 --> 00:08:43,000 or is there some other mechanism? 173 00:08:43,000 --> 00:08:46,663 And what you'll see here is one of the general strategies 174 00:08:46,663 --> 00:08:50,330 that evolution has chosen is although there's 175 00:08:50,330 --> 00:08:52,726 regulatory information in the DNA, 176 00:08:52,726 --> 00:08:55,630 DNA isn't a particularly good molecule for recognizing 177 00:08:55,630 --> 00:09:01,000 things, but what it is good at is encoding proteins. 178 00:09:01,000 --> 00:09:04,213 So the trick is to have some proteins made 179 00:09:04,213 --> 00:09:07,500 whose role in life is to be regulators. 180 00:09:07,500 --> 00:09:11,000 So that this is what is underlying 181 00:09:11,000 --> 00:09:17,000 this system that I started to tell you about on Friday. 182 00:09:17,000 --> 00:09:30,496 So remember, beta galactosidase is the enzyme 183 00:09:30,496 --> 00:09:34,800 that takes lactose, which is galactose beta 1,4 glucose. 184 00:09:34,800 --> 00:09:40,665 And somebody said it's easy to get them mixed up. 185 00:09:40,665 --> 00:09:44,600 I apologize, but those are the names. 186 00:09:44,600 --> 00:09:48,800 And cleaves it, just breaks the bond 187 00:09:48,800 --> 00:09:51,800 to give galactose plus glucose. 
188 00:09:51,800 --> 00:09:57,714 Both of those can be metabolized by ordinary elements 189 00:09:57,714 --> 00:10:02,000 you'll find in most cells. 190 00:10:02,000 --> 00:10:06,000 But taking lactose, which you also know as milk sugar 191 00:10:06,000 --> 00:10:10,000 because it's in milk, needs this extra function. 192 00:10:10,000 --> 00:10:12,997 If you're lactose intolerant, as a fraction of you 193 00:10:12,997 --> 00:10:17,500 will be because that's quite common in the human population, 194 00:10:17,500 --> 00:10:20,565 then, although you had the enzyme when you were a baby, 195 00:10:20,565 --> 00:10:23,332 it's been shut off in your body since then 196 00:10:23,332 --> 00:10:24,664 and it causes problems. 197 00:10:24,664 --> 00:10:27,815 Because if you eat lactose, drink milk or something, 198 00:10:27,815 --> 00:10:30,571 the lactose goes right through your stomach 199 00:10:30,571 --> 00:10:34,000 and ends up in your intestine. 200 00:10:34,000 --> 00:10:38,000 And there are bacteria in there that are able to break it open. 201 00:10:38,000 --> 00:10:42,641 And when they do that it leads to gas and some other sort 202 00:10:42,641 --> 00:10:45,856 of uncomfortablenesses that are associated 203 00:10:45,856 --> 00:10:48,000 with lactose intolerance. 204 00:10:48,000 --> 00:10:50,541 So, as I'd said, the major finding 205 00:10:50,541 --> 00:10:53,664 was that this enzyme beta galactosidase or beta 206 00:10:53,664 --> 00:10:54,912 gal was regulated. 207 00:10:54,912 --> 00:10:57,000 That if you grow 208 00:10:57,000 --> 00:11:07,142 E. coli on glucose there was no beta gal. 209 00:11:07,142 --> 00:11:11,362 And if you grow them on lactose as the carbon source 210 00:11:11,362 --> 00:11:15,000 then there were high levels of beta gal. 
211 00:11:15,000 --> 00:11:20,000 And that then led Jacques Monod and Francois Jacob, 212 00:11:20,000 --> 00:11:23,996 the two French scientists I mentioned, 213 00:11:23,996 --> 00:11:27,110 to begin studying this problem. 214 00:11:27,110 --> 00:11:31,000 And, like so many things in biology, 215 00:11:31,000 --> 00:11:34,178 they were working on a huge problem, 216 00:11:34,178 --> 00:11:36,000 how are genes regulated? 217 00:11:36,000 --> 00:11:39,072 But it wasn't so evident at the beginning. 218 00:11:39,072 --> 00:11:43,000 What they were doing was this very modest thing, 219 00:11:43,000 --> 00:11:47,771 why is beta galactosidase there in one condition and not 220 00:11:47,771 --> 00:11:48,270 another? 221 00:11:48,270 --> 00:11:51,384 Just a little problem in bacterial metabolism. 222 00:11:51,384 --> 00:11:54,072 And ultimately it gave us the roots 223 00:11:54,072 --> 00:11:57,248 to the answer of how genes are regulated. 224 00:11:57,248 --> 00:12:01,999 And I just have to leave out all the beautiful stuff that 225 00:12:01,999 --> 00:12:03,664 led to what they found. 226 00:12:03,664 --> 00:12:07,997 Let me just recapitulate what I put on the board the other day. 227 00:12:07,997 --> 00:12:12,664 It turned out that the lacZ gene, this is the gene that 228 00:12:12,664 --> 00:12:15,000 encodes beta galactosidase, so that's 229 00:12:15,000 --> 00:12:19,000 the sequence, that has the sequence of codons, 230 00:12:19,000 --> 00:12:21,280 that if you could start at the beginning 231 00:12:21,280 --> 00:12:25,912 and go along and just put all the amino acids in you would 232 00:12:25,912 --> 00:12:28,000 end up with beta galactosidase. 233 00:12:28,000 --> 00:12:32,580 That unit of genetic information, 234 00:12:32,580 --> 00:12:41,538 which is called the lacZ gene, let's put it here. 235 00:12:41,538 --> 00:12:47,460 Then there were two other genes just 236 00:12:47,460 --> 00:12:52,748 downstream of this in the DNA. 
237 00:12:52,748 --> 00:13:01,000 And this unit is made as a single mRNA. 238 00:13:01,000 --> 00:13:05,000 So this is a little bit different than what I told you. 239 00:13:05,000 --> 00:13:06,815 It's not just one gene. 240 00:13:06,815 --> 00:13:09,800 It's actually two or three because these bacteria 241 00:13:09,800 --> 00:13:13,000 try to do everything very efficiently because they're 242 00:13:13,000 --> 00:13:13,888 growing quickly. 243 00:13:13,888 --> 00:13:18,600 This means it can turn on three genes of related function 244 00:13:18,600 --> 00:13:19,400 very efficiently. 245 00:13:19,400 --> 00:13:23,800 When you have several genes that are expressed using one mRNA, 246 00:13:23,800 --> 00:13:30,000 as I said, the genes are said to be organized in an operon. 247 00:13:30,000 --> 00:13:33,570 But the key point that we talked about before is, if 248 00:13:33,570 --> 00:13:36,816 you're going to have these genes expressed, everything 249 00:13:36,816 --> 00:13:40,000 has to be written in the DNA. 250 00:13:40,000 --> 00:13:42,688 And it's not using the genetic code. 251 00:13:42,688 --> 00:13:46,110 It's using other words that are written there. 252 00:13:46,110 --> 00:13:52,000 And you've seen this word now in several lectures. 253 00:13:52,000 --> 00:13:55,000 That's a promoter. 254 00:13:55,000 --> 00:14:00,000 And that means to start transcription. 255 00:14:00,000 --> 00:14:02,565 And to stop the mRNA there has to be 256 00:14:02,565 --> 00:14:04,000 something at the other end. 257 00:14:04,000 --> 00:14:07,330 And I'll show you the sequence of at least one 258 00:14:07,330 --> 00:14:09,666 of these promoters in a minute. 259 00:14:09,666 --> 00:14:13,000 This means to stop transcription, 260 00:14:13,000 --> 00:14:15,496 to stop making the RNA copy. 
261 00:14:15,496 --> 00:14:18,000 If you didn't have sequences like 262 00:14:18,000 --> 00:14:21,377 that the cell wouldn't know where should I begin the RNA 263 00:14:21,377 --> 00:14:23,500 and where does it end. 264 00:14:23,500 --> 00:14:27,000 And since there are many, many genes, 265 00:14:27,000 --> 00:14:32,000 there have to be many promoters and many terminators. 266 00:14:32,000 --> 00:14:33,840 And the other point I tried to hammer 267 00:14:33,840 --> 00:14:37,178 home the other day is although the genetic code is universal, 268 00:14:37,178 --> 00:14:41,000 you can take that little table and read the sequence 269 00:14:41,000 --> 00:14:43,000 of human proteins or E. 270 00:14:43,000 --> 00:14:44,500 coli proteins, these other languages 271 00:14:44,500 --> 00:14:48,500 that are written using this four letter nucleic acid alphabet 272 00:14:48,500 --> 00:14:50,000 are not universal. 273 00:14:50,000 --> 00:14:53,330 So the sequences that E. coli uses for a promoter, 274 00:14:53,330 --> 00:14:55,998 a start transcription are very different than what 275 00:14:55,998 --> 00:14:59,110 our bodies use as a start transcription thing. 276 00:14:59,110 --> 00:15:03,932 And when we get to the recombinant DNA stuff 277 00:15:03,932 --> 00:15:06,262 that will be an issue. 278 00:15:06,262 --> 00:15:10,636 So this mRNA is then used to make proteins. 279 00:15:10,636 --> 00:15:17,538 This would be beta galactosidase or the lac Z gene product, 280 00:15:17,538 --> 00:15:22,380 and these other genes make, so these are proteins. 281 00:15:22,380 --> 00:15:27,766 Here you see this flow of information from the DNA 282 00:15:27,766 --> 00:15:32,800 through the mRNA down to being made into proteins. 283 00:15:32,800 --> 00:15:37,357 In the case of bacteria there's no nucleus. 
284 00:15:37,357 --> 00:15:40,927 Everything is in one big pot so that mRNA doesn't 285 00:15:40,927 --> 00:15:43,920 have to go anywhere, but it's all there 286 00:15:43,920 --> 00:15:47,908 and it gets translated to give copies of the protein. 287 00:15:47,908 --> 00:15:51,540 So, as I said, somehow if the cell 288 00:15:51,540 --> 00:15:56,086 is going to now regulate whether these genes are expressed 289 00:15:56,086 --> 00:16:03,000 or not, depending on whether lactose is present. 290 00:16:03,000 --> 00:16:04,926 And the way they do it makes perfect sense. 291 00:16:04,926 --> 00:16:06,921 Don't bother to make the enzyme if there's 292 00:16:06,921 --> 00:16:09,070 no lactose in the neighborhood, and only 293 00:16:09,070 --> 00:16:10,532 make it if it's present. 294 00:16:10,532 --> 00:16:12,660 So how are they going to do that? 295 00:16:12,660 --> 00:16:15,088 Is the lactose going to come along and stick 296 00:16:15,088 --> 00:16:17,000 into some little hole here or something? 297 00:16:17,000 --> 00:16:18,995 That's not a general strategy and it 298 00:16:18,995 --> 00:16:22,600 wouldn't work if you look at the structure of DNA anyway. 299 00:16:22,600 --> 00:16:26,998 It wouldn't have access to the sequence of bases. 300 00:16:26,998 --> 00:16:35,165 So what was discovered was that there was another gene very 301 00:16:35,165 --> 00:16:37,664 close called lacI. 302 00:16:37,664 --> 00:16:45,844 And the protein that it encodes is called the lac repressor. 303 00:16:45,844 --> 00:16:56,250 And since it's a gene it has to have a promoter, 304 00:16:56,250 --> 00:17:00,000 a start transcription. 305 00:17:00,000 --> 00:17:05,918 But this one, be careful now, don't get yourself mixed up, 306 00:17:05,918 --> 00:17:09,800 this is for the lacI gene. 307 00:17:09,800 --> 00:17:16,544 This is a different promoter over here for this thing. 
308 00:17:16,544 --> 00:17:21,583 And then there is also then a terminator 309 00:17:21,583 --> 00:17:23,915 or a stop transcription. 310 00:17:23,915 --> 00:17:28,700 And, again, this is for the lacI gene. 311 00:17:28,700 --> 00:17:35,000 So this gets made into an mRNA as well. 312 00:17:35,000 --> 00:17:40,726 And this gets translated into a protein 313 00:17:40,726 --> 00:17:44,750 that's known as lac repressor. 314 00:17:44,750 --> 00:17:53,000 And what that lac repressor has is the ability to bind 315 00:17:53,000 --> 00:18:03,000 to a particular sequence in DNA that's located right here. 316 00:18:03,000 --> 00:18:16,998 This is a binding site right here for lac repressor. 317 00:18:16,998 --> 00:18:23,000 So let me just try to blow that up 318 00:18:23,000 --> 00:18:30,326 just a little bit because this could be a little bit 319 00:18:30,326 --> 00:18:31,000 confusing. 320 00:18:31,000 --> 00:18:50,815 So here's the promoter for this lacZYA operon. 321 00:18:50,815 --> 00:18:53,307 Here's the beginning of the lacZ gene. 322 00:18:53,307 --> 00:18:55,763 And so this is where the RNA polymerase, 323 00:18:55,763 --> 00:18:59,456 the machine that's going to make the RNA copy has to bind. 324 00:18:59,456 --> 00:19:01,000 And here's the binding site 325 00:19:01,000 --> 00:19:13,400 for lac repressor, or the lacI protein. 326 00:19:13,400 --> 00:19:15,800 So how does this circuit work? 327 00:19:15,800 --> 00:19:21,140 And I sort of pose that as an issue for those of you 328 00:19:21,140 --> 00:19:25,000 who had followed it at least to that point. 329 00:19:25,000 --> 00:19:32,000 So let's consider the two situations. 330 00:19:32,000 --> 00:19:33,998 If there's no lactose present, what 331 00:19:33,998 --> 00:19:37,110 scientists knew from experimentation 332 00:19:37,110 --> 00:19:40,440 was that there was no beta galactosidase 333 00:19:40,440 --> 00:19:42,200 activity inside the cell. 
334 00:19:42,200 --> 00:19:46,771 You could crack them open and you wouldn't find this enzyme 335 00:19:46,771 --> 00:19:47,270 there. 336 00:19:47,270 --> 00:19:50,454 And when you added lactose they knew, 337 00:19:50,454 --> 00:19:53,178 from that experiment described on Friday, 338 00:19:53,178 --> 00:19:55,625 it was synthesized de novo. 339 00:19:55,625 --> 00:20:00,000 So if there's no lactose present, 340 00:20:00,000 --> 00:20:09,000 what happens is we have the lacI gene, mRNA is being made, 341 00:20:09,000 --> 00:20:13,500 this lac repressor is being made, 342 00:20:13,500 --> 00:20:21,000 here's the promoter for lacZYA, and here's the sequence. 343 00:20:21,000 --> 00:20:29,331 And this repressor goes up and binds to that. 344 00:20:29,331 --> 00:20:34,400 And by binding to this particular sequence 345 00:20:34,400 --> 00:20:38,000 what it does is it covers up the promoter. 346 00:20:38,000 --> 00:20:41,632 It covers up the start signal, the signal 347 00:20:41,632 --> 00:20:44,428 that says "start transcription here". 348 00:20:44,428 --> 00:20:48,000 Are you guys with me? 349 00:20:48,000 --> 00:20:49,815 It's a relatively simple strategy. 350 00:20:49,815 --> 00:20:52,768 It's just by lac repressor having that ability 351 00:20:52,768 --> 00:20:55,456 to bind to a particular sequence it's 352 00:20:55,456 --> 00:20:58,666 able to prevent the RNA polymerase 353 00:20:58,666 --> 00:21:02,000 from seeing the promoter. 354 00:21:02,000 --> 00:21:05,270 And therefore it's able to prevent 355 00:21:05,270 --> 00:21:08,000 the RNA from being made. 356 00:21:08,000 --> 00:21:10,000 So there's no mRNA. 357 00:21:10,000 --> 00:21:14,500 And if there's no mRNA over here then there's 358 00:21:14,500 --> 00:21:17,400 no beta galactosidase being made. 359 00:21:17,400 --> 00:21:21,000 This is an exercise in futility. 
360 00:21:21,000 --> 00:21:26,380 Why has the cell gone and made this useless protein 361 00:21:26,380 --> 00:21:32,375 that isn't doing anything in terms of helping 362 00:21:32,375 --> 00:21:35,000 it metabolize lactose? 363 00:21:35,000 --> 00:21:37,128 But now take a look at the system 364 00:21:37,128 --> 00:21:40,332 compared to what I described when we were first doing it. 365 00:21:40,332 --> 00:21:43,416 If we had to do all the regulation directly 366 00:21:43,416 --> 00:21:46,328 with the DNA we have this problem 367 00:21:46,328 --> 00:21:48,855 that lactose would somehow have to be 368 00:21:48,855 --> 00:21:51,705 able to see a sequence in DNA and somehow determine 369 00:21:51,705 --> 00:21:52,357 what happened. 370 00:21:52,357 --> 00:21:55,213 But what this cell has done now is 371 00:21:55,213 --> 00:21:58,228 it's set this system up so that the ability 372 00:21:58,228 --> 00:22:00,377 to make beta galactosidase or not make it 373 00:22:00,377 --> 00:22:06,000 is conditional on this protein called the lac repressor. 374 00:22:06,000 --> 00:22:10,086 If lac repressor is bound, as it's shown here, 375 00:22:10,086 --> 00:22:12,816 it's basically covering up the promoter, 376 00:22:12,816 --> 00:22:17,362 the cell cannot make RNA and cannot make beta galactosidase. 377 00:22:17,362 --> 00:22:23,000 If it was absent, if we just got rid of lac repressor 378 00:22:23,000 --> 00:22:26,000 then the promoter would be exposed 379 00:22:26,000 --> 00:22:29,080 and the cell could make beta galactosidase. 380 00:22:29,080 --> 00:22:33,875 And so it's now lac repressor that has the conditionality. 381 00:22:33,875 --> 00:22:37,000 It's, in essence, a sensor. 382 00:22:37,000 --> 00:22:40,328 It can at least be considered a sensor 383 00:22:40,328 --> 00:22:42,454 for whether lactose is present. 384 00:22:42,454 --> 00:22:47,000 And indeed that is the property that lac repressor has. 385 00:22:47,000 --> 00:22:49,080 It's able to bind lactose. 
386 00:22:49,080 --> 00:23:00,000 So if we think lactose is present what happens? 387 00:23:00,000 --> 00:23:04,000 I mean this lacI gene is pretty uninteresting, 388 00:23:04,000 --> 00:23:06,178 uninteresting from the standpoint of regulation 389 00:23:06,178 --> 00:23:09,776 in the sense that it's made all the time. 390 00:23:09,776 --> 00:23:13,842 The cell just continually cranks out a bit of lac repressor. 391 00:23:13,842 --> 00:23:15,377 It doesn't need very much. 392 00:23:15,377 --> 00:23:18,565 It just needs to make enough so that the one binding 393 00:23:18,565 --> 00:23:21,332 site for lac repressor has somebody bound to it. 394 00:23:21,332 --> 00:23:24,750 So it can get away with pretty low levels. 395 00:23:24,750 --> 00:23:30,545 But over here then we have this promoter, 396 00:23:30,545 --> 00:23:36,000 and we have the lacZ gene and so on here, 397 00:23:36,000 --> 00:23:40,664 and there is the binding sequence right there. 398 00:23:40,664 --> 00:23:46,000 But this lac repressor has the ability to bind lactose, 399 00:23:46,000 --> 00:23:51,500 which I'm going to draw as a little triangle here, 400 00:23:51,500 --> 00:23:57,000 even though you know it's a disaccharide. 401 00:23:57,000 --> 00:24:02,000 It has a different property. 402 00:24:02,000 --> 00:24:04,000 But the fundamental characteristic 403 00:24:04,000 --> 00:24:07,000 of the lac repressor binding lactose 404 00:24:07,000 --> 00:24:09,912 is it undergoes a change in conformation. 405 00:24:09,912 --> 00:24:14,000 So if it's got a binding pocket, and lactose 406 00:24:14,000 --> 00:24:16,768 fits into that binding pocket, those 407 00:24:16,768 --> 00:24:20,224 alpha helices and beta sheets and so on move 408 00:24:20,224 --> 00:24:21,768 around a little bit. 409 00:24:21,768 --> 00:24:24,456 And what happens then is it perturbs 410 00:24:24,456 --> 00:24:27,816 the part of the protein that would normally 411 00:24:27,816 --> 00:24:31,000 be able to recognize this DNA sequence. 
412 00:24:31,000 --> 00:24:44,200 And this cannot bind to the DNA sequence up here. 413 00:24:44,200 --> 00:24:49,180 And I'll tell you the special name for that binding site. 414 00:24:49,180 --> 00:24:55,544 It's just one of these terms that you'll see in biology. 415 00:24:55,544 --> 00:25:00,000 Everything has to be given a name. 416 00:25:00,000 --> 00:25:05,250 It's called an operator, for historical reasons. 417 00:25:05,250 --> 00:25:09,000 But, in any case, the lac repressor, 418 00:25:09,000 --> 00:25:12,461 once it's hanging onto a lactose it's 419 00:25:12,461 --> 00:25:14,766 unable to bind this sequence. 420 00:25:14,766 --> 00:25:19,635 That means that the start site for transcription is exposed. 421 00:25:19,635 --> 00:25:25,000 And so you get the mRNA made and then 422 00:25:25,000 --> 00:25:30,000 you get beta galactosidase present. 423 00:25:30,000 --> 00:25:33,267 A little bit complicated, but sort of underlying it 424 00:25:33,267 --> 00:25:35,710 is this idea that now the cell is 425 00:25:35,710 --> 00:25:38,000 using a protein rather than DNA to tell 426 00:25:38,000 --> 00:25:39,816 whether lactose is present. 427 00:25:39,816 --> 00:25:44,200 And hopefully you can see this is really general now 428 00:25:44,200 --> 00:25:47,000 because you can design a regulatory protein. 429 00:25:47,000 --> 00:25:49,856 And basically it's got to have two things. 430 00:25:49,856 --> 00:25:54,220 It's got to have a part that talks to the DNA 431 00:25:54,220 --> 00:25:56,832 and recognizes some sequence, and it's 432 00:25:56,832 --> 00:26:01,000 got to have another part that senses whatever it is. 433 00:26:01,000 --> 00:26:03,500 Histidine, temperature, you name it. 
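The two-situation circuit walked through above (no lactose, repressor on the operator, no mRNA; lactose present, repressor released, mRNA made) can be written down as a small on/off model. This is only a sketch under simplifying assumptions: the `Regulator` class and the `operon_transcribed` function are hypothetical names, and the model is purely all-or-nothing, ignoring how much repressor or lactose is around.

```python
from dataclasses import dataclass


@dataclass
class Regulator:
    """Hypothetical two-part regulatory protein.

    operator_site: label for the DNA sequence the DNA-binding part recognizes.
    ligand: the small molecule the sensing part binds.
    """
    operator_site: str
    ligand: str

    def binds_dna(self, molecules_present):
        # With the ligand in its binding pocket the protein changes
        # conformation and can no longer recognize its DNA sequence.
        return self.ligand not in molecules_present


def operon_transcribed(repressor, molecules_present):
    # A repressor sitting on the operator covers up the promoter, so the
    # RNA polymerase cannot start; no mRNA means no enzyme.
    return not repressor.binds_dna(molecules_present)


lac_repressor = Regulator(operator_site="lac operator", ligand="lactose")

for medium in (set(), {"glucose"}, {"lactose"}):
    state = "on" if operon_transcribed(lac_repressor, medium) else "off"
    print(sorted(medium), "->", "lacZYA mRNA", state)
```

Swapping in a different `ligand` and `operator_site` gives a regulator for some other condition, which is the sense in which the design is general.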
434 00:26:03,500 --> 00:26:06,920 But once you understand that design principle then 435 00:26:06,920 --> 00:26:11,428 you can begin to see how it is that the cell is 436 00:26:11,428 --> 00:26:15,416 able to turn genes on and off just by encoding information 437 00:26:15,416 --> 00:26:16,664 in the DNA. 438 00:26:16,664 --> 00:26:20,416 And, as I say, one of the big tricks 439 00:26:20,416 --> 00:26:25,000 there is to let the protein do the sensing for you. 440 00:26:25,000 --> 00:26:29,680 Now, I just want to give you a little bit of a blowup of what 441 00:26:29,680 --> 00:26:31,800 this thing looks like. 442 00:26:31,800 --> 00:26:36,800 Because sort of all I've done is kind of 443 00:26:36,800 --> 00:26:39,200 put it here as a sequence. 444 00:26:39,200 --> 00:26:41,200 I've called it a promoter. 445 00:26:41,200 --> 00:26:45,885 What a promoter then is, again it 446 00:26:45,885 --> 00:26:50,000 means the start for transcription, 447 00:26:50,000 --> 00:26:55,000 the process of making RNA. 448 00:26:55,000 --> 00:27:01,360 And I've tried to stress that these promoters are not 449 00:27:01,360 --> 00:27:02,000 universal. 450 00:27:02,000 --> 00:27:09,000 So when I tell you this for E. 451 00:27:09,000 --> 00:27:13,400 coli, this is what a promoter for E. coli looks like, 452 00:27:13,400 --> 00:27:17,664 but it doesn't look at all like a promoter in our bodies. 453 00:27:17,664 --> 00:27:20,000 And so it's basically a word that's 454 00:27:20,000 --> 00:27:23,996 written using this nucleic acid alphabet. 455 00:27:23,996 --> 00:27:27,998 And it looks something like this. 456 00:27:27,998 --> 00:27:35,000 There's TTGACA and then there are about 17 base pairs that 457 00:27:35,000 --> 00:27:38,000 can be just about anything. 458 00:27:38,000 --> 00:27:40,400 And then there's TATAAT. 459 00:27:40,400 --> 00:27:44,428 And then there's another little bit here 460 00:27:44,428 --> 00:27:46,996 that's about ten base pairs long.
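The promoter "word" written on the board (TTGACA, then about 17 base pairs of anything, then TATAAT) can be sketched as a simple pattern search. This is only an illustration, not something from the lecture: the function name `find_promoter_like_sites`, the spacer range of 16 to 18 bases, and the example sequence are all assumptions, and real E. coli promoters usually match these two hexamers only approximately.

```python
import re
from typing import List

# Assumption: exact matches to the two hexamers from the board, separated
# by a spacer of 16 to 18 bases of any kind ("about 17 base pairs").
PROMOTER_PATTERN = re.compile(r"TTGACA[ACGT]{16,18}TATAAT")


def find_promoter_like_sites(dna: str) -> List[int]:
    """Return 0-based start positions of promoter-like matches in dna."""
    return [m.start() for m in PROMOTER_PATTERN.finditer(dna.upper())]


# Hypothetical test sequence: two filler bases, TTGACA, a 17-base spacer,
# TATAAT, then ten more bases leading toward the start of the mRNA.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "C" * 10
print(find_promoter_like_sites(example))
```

The spacer is written as "any base" because, as just described, that stretch can be just about anything; only the two short words on either side are checked.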
461 00:27:46,996 --> 00:27:52,725 And then this is the start of the mRNA which is usually 462 00:27:52,725 --> 00:27:58,400 given the convention of being called the plus one position. 463 00:27:58,400 --> 00:28:03,875 So you'll notice this word, as I've called it, 464 00:28:03,875 --> 00:28:09,304 that says start transcription has even got two parts to it. 465 00:28:09,304 --> 00:28:13,152 And this is usually referred to as the minus ten region 466 00:28:13,152 --> 00:28:17,000 of the promoter, and this is the minus 35 region 467 00:28:17,000 --> 00:28:22,000 because that's the distance from the start of transcription. 468 00:28:22,000 --> 00:28:25,120 It may seem sort of weird to you to see that 469 00:28:25,120 --> 00:28:28,875 what I'm telling you is sort of a word written 470 00:28:28,875 --> 00:28:32,000 in the nucleic acid language. 471 00:28:32,000 --> 00:28:35,000 It's got some bits in the middle that don't matter. 472 00:28:35,000 --> 00:28:38,377 But remember the DNA is a helix and things going around 473 00:28:38,377 --> 00:28:39,000 like this. 474 00:28:39,000 --> 00:28:41,500 So if you were just to take a DNA helix 475 00:28:41,500 --> 00:28:44,776 and then lay something down on one side of it, 476 00:28:44,776 --> 00:28:48,535 it would contact it here, it wouldn't contact it there, 477 00:28:48,535 --> 00:28:52,665 and then when it came back up again it would contact it here. 478 00:28:52,665 --> 00:28:56,330 And so as these things hang on along the sides of DNA 479 00:28:56,330 --> 00:28:59,726 it's not at all uncommon to find this sort of broken pattern where 480 00:28:59,726 --> 00:29:01,904 you can have something that matters, 481 00:29:01,904 --> 00:29:04,998 something that doesn't matter and something 482 00:29:04,998 --> 00:29:07,000 that matters again. 483 00:29:07,000 --> 00:29:12,994 The RNA polymerase in E. coli, it's a machine. 484 00:29:12,994 --> 00:29:19,500 It's got four proteins that are the core.
485 00:29:19,500 --> 00:29:24,818 That's the part that actually synthesizes 486 00:29:24,818 --> 00:29:34,776 the RNA, plus one protein, which is known as the sigma subunit. 487 00:29:34,776 --> 00:29:43,100 And it has the special job of recognizing the promoter. 488 00:29:43,100 --> 00:29:48,000 And so when we start asking where 489 00:29:48,000 --> 00:29:55,104 does this regulatory sequence that the lac repressor 490 00:29:55,104 --> 00:29:59,330 binds fit in all of this? 491 00:29:59,330 --> 00:30:18,416 It turns out that the sequence for binding lac repressor 492 00:30:18,416 --> 00:30:20,912 overlaps with this minus ten region. 493 00:30:20,912 --> 00:30:24,248 So when the lac repressor is sitting down 494 00:30:24,248 --> 00:30:27,160 it's covering up a very important part 495 00:30:27,160 --> 00:30:28,000 of transcription. 496 00:30:28,000 --> 00:30:30,332 You guys with me? 497 00:30:30,332 --> 00:30:30,915 OK. 498 00:30:30,915 --> 00:30:36,285 So this is an interesting kind of regulation. 499 00:30:36,285 --> 00:30:53,000 It's given the general term negative regulation. 500 00:30:53,000 --> 00:30:55,331 And the reason that term is applied, 501 00:30:55,331 --> 00:31:11,500 it means that the regulatory protein 502 00:31:11,500 --> 00:31:19,000 interferes with transcription. 503 00:31:19,000 --> 00:31:22,072 And let's take a brief foray, since we're 504 00:31:22,072 --> 00:31:28,332 going to talk about genetics as our next subject. 505 00:31:28,332 --> 00:31:41,284 And I think maybe we can let you sort of already 506 00:31:41,284 --> 00:31:57,000 get a sense of how some of this was figured out. 507 00:31:57,000 --> 00:32:09,304 So there's a substance called X gal, which 508 00:32:09,304 --> 00:32:21,724 is a galactose with some chemical entity hanging off 509 00:32:21,724 --> 00:32:24,086 that's colorless. 510 00:32:24,086 --> 00:32:30,285 But yet it's a substrate whose 511 00:32:30,285 --> 00:32:31,544 bond can be cleaved by beta galactosidase.
512 00:32:31,544 --> 00:32:33,377 And it gives galactose plus the free X entity. 513 00:32:33,377 --> 00:32:35,627 And this is colored. 514 00:32:35,627 --> 00:32:37,250 And this is a very useful thing for bacterial geneticists. 515 00:32:37,250 --> 00:32:39,000 Someone said they thought this was 516 00:32:39,000 --> 00:32:39,918 too much lab stuff, but I think if you don't have 517 00:32:39,918 --> 00:32:41,142 some sense of how this is done 518 00:32:41,142 --> 00:32:42,683 experimentally I'm not doing too good 519 00:32:42,683 --> 00:32:45,499 a job of conveying to you how we learn all this kind of thing. 520 00:32:45,499 --> 00:32:47,184 It just doesn't come out of a textbook. 521 00:32:47,184 --> 00:32:48,600 And so if we grow E. 522 00:32:48,600 --> 00:32:50,610 coli on plates that have glucose plus X gal, 523 00:32:50,610 --> 00:32:52,235 the colonies would be colorless. 524 00:32:52,235 --> 00:32:55,768 And if we were to grow them on plates 525 00:32:55,768 --> 00:32:57,684 that had lactose plus X gal then all of the colonies 526 00:32:57,684 --> 00:32:59,600 would be colored because they're making beta galactosidase. 527 00:32:59,600 --> 00:33:07,090 And part of the way that this stuff that I've 528 00:33:07,090 --> 00:33:12,545 been telling you was figured out was by bacterial geneticists 529 00:33:12,545 --> 00:33:14,180 looking for something. 530 00:33:14,180 --> 00:33:18,000 What they looked for was back here 531 00:33:18,000 --> 00:33:21,270 on this plate that had X gal. 532 00:33:21,270 --> 00:33:24,750 Almost all the colonies were colorless. 533 00:33:24,750 --> 00:33:31,090 These were colored because they could make beta galactosidase. 534 00:33:31,090 --> 00:33:39,330 So if I gave you some plates of this, you looked in the lab 535 00:33:39,330 --> 00:33:46,181 and then you found a colored colony, it's a mutant. 536 00:33:46,181 --> 00:33:52,000 I'll define these terms for you very shortly.
537 00:33:52,000 --> 00:33:58,150 But it's got an alteration in the DNA that affects 538 00:33:58,150 --> 00:34:08,356 the regulation of beta gal or the product of the lacZ 539 00:34:08,356 --> 00:34:08,856 gene. 540 00:34:08,856 --> 00:34:14,000 And on the basis of what I've told you about this model, 541 00:34:14,000 --> 00:34:18,800 can you guys come up with two types of things, two places 542 00:34:18,800 --> 00:34:24,900 or kinds of mutations that could break this system that 543 00:34:24,900 --> 00:34:31,662 would lead to beta galactosidase being on even though there's 544 00:34:31,662 --> 00:34:35,250 no lactose in the medium? 545 00:34:35,250 --> 00:34:39,000 Anybody see one of them? 546 00:34:39,000 --> 00:34:40,844 Why is it off? 547 00:34:40,844 --> 00:34:42,688 Because of lac repressor? 548 00:34:42,688 --> 00:34:47,000 In a wild type strain it's because lac repressor 549 00:34:47,000 --> 00:34:50,500 is bound to that sequence and it's 550 00:34:50,500 --> 00:34:52,200 shutting off transcription. 551 00:34:52,200 --> 00:35:03,500 So we had a variant that could now transcribe. 552 00:35:03,500 --> 00:35:10,000 Yeah. 553 00:35:10,000 --> 00:35:12,304 OK, so that's a good idea. 554 00:35:12,304 --> 00:35:16,500 So if we could somehow mutate that little binding sequence 555 00:35:16,500 --> 00:35:21,555 in a way that didn't screw up everything else then, 556 00:35:21,555 --> 00:35:26,000 even though there was lac repressor being made, 557 00:35:26,000 --> 00:35:32,000 if it couldn't bind here because the sequence had been changed 558 00:35:32,000 --> 00:35:35,125 then you'd get it made. 559 00:35:35,125 --> 00:35:37,000 That's exactly right. 560 00:35:37,000 --> 00:35:38,580 That's one of them. 561 00:35:38,580 --> 00:35:39,080 Yeah. 562 00:35:39,080 --> 00:35:42,000 OK, if there was a problem making lacI. 563 00:35:42,000 --> 00:35:43,248 What would happen?
564 00:35:43,248 --> 00:35:47,800 Well, if we couldn't make this and it couldn't bind there 565 00:35:47,800 --> 00:35:49,000 we'd be on. 566 00:35:49,000 --> 00:35:51,000 And that's the other class. 567 00:35:51,000 --> 00:35:54,328 Can you think of a kind of mutation 568 00:35:54,328 --> 00:35:56,908 that we learned about, think back 569 00:35:56,908 --> 00:36:00,086 to the genetic code, that would prevent 570 00:36:00,086 --> 00:36:10,000 lac repressor from being made? 571 00:36:10,000 --> 00:36:12,149 I'm trying to give you a clue. 572 00:36:12,149 --> 00:36:15,142 There were 61 codons that encoded amino acids. 573 00:36:15,142 --> 00:36:15,713 Yeah. 574 00:36:15,713 --> 00:36:18,000 Oh, that would work. 575 00:36:18,000 --> 00:36:21,178 Yup, if we messed up the promoter. 576 00:36:21,178 --> 00:36:23,000 That's a sophisticated answer. 577 00:36:23,000 --> 00:36:26,330 Yeah, if we messed up the promoter for making lacI 578 00:36:26,330 --> 00:36:28,800 that would certainly give that. 579 00:36:28,800 --> 00:36:33,500 Can you think of another type of thing 580 00:36:33,500 --> 00:36:38,000 that would affect the lacI gene, would prevent lacI 581 00:36:38,000 --> 00:36:42,333 from being made? 582 00:36:42,333 --> 00:36:45,666 Somebody? 583 00:36:45,666 --> 00:36:48,773 Yeah. 584 00:36:48,773 --> 00:36:49,272 OK. 585 00:36:49,272 --> 00:36:52,000 If the sequence was wrong so that it couldn't bind, 586 00:36:52,000 --> 00:36:53,228 that would be good. 587 00:36:53,228 --> 00:36:56,000 The one I'm trying to tease out of you, 588 00:36:56,000 --> 00:36:59,526 but I won't take longer now, is remember those three stop 589 00:36:59,526 --> 00:37:01,150 codons that didn't encode for anything? 590 00:37:01,150 --> 00:37:04,332 There should be one of those at the end of the protein.
591 00:37:04,332 --> 00:37:07,307 But if you changed one of the amino acid 592 00:37:07,307 --> 00:37:09,763 codons into a stop codon that would also 593 00:37:09,763 --> 00:37:11,333 prevent you from making it. 594 00:37:11,333 --> 00:37:15,000 So there are at least a couple of kinds of mutations. 595 00:37:15,000 --> 00:37:16,712 And what I've sort of done here is 596 00:37:16,712 --> 00:37:19,452 I've skipped all the evidence and given you the model. 597 00:37:19,452 --> 00:37:23,815 And I cannot give you all the evidence that led to this, 598 00:37:23,815 --> 00:37:26,400 which is a pretty well established model. 599 00:37:26,400 --> 00:37:30,000 I don't think this is likely to change. 600 00:37:30,000 --> 00:37:32,331 We've been studying it for so long. 601 00:37:32,331 --> 00:37:36,816 But this is the kind of evidence on which it was based. 602 00:37:36,816 --> 00:37:40,908 It was by people finding things and figuring out 603 00:37:40,908 --> 00:37:46,090 that parts of the machinery were broken and then working on. 604 00:37:46,090 --> 00:37:49,360 So there's another kind of regulation 605 00:37:49,360 --> 00:37:51,454 known as positive regulation. 606 00:37:51,454 --> 00:37:55,540 And for a long time people thought maybe everything 607 00:37:55,540 --> 00:37:57,250 was negative regulation. 608 00:37:57,250 --> 00:38:03,220 But it turns out positive regulation is far more common. 609 00:38:03,220 --> 00:38:06,625 In this case, the regulatory protein 610 00:38:06,625 --> 00:38:09,125 instead of inhibiting transcription 611 00:38:09,125 --> 00:38:12,000 assists with the transcription. 612 00:38:12,000 --> 00:38:17,362 And it turned out, after people had 613 00:38:17,362 --> 00:38:19,632 been studying the beta galactosidase 614 00:38:19,632 --> 00:38:26,000 system for a number of years, that it had a positive control 615 00:38:26,000 --> 00:38:31,328 system superimposed or together with the negative regulatory 616 00:38:31,328 --> 00:38:32,000 system.
617 00:38:32,000 --> 00:38:35,328 The same thing engineers do all the time, 618 00:38:35,328 --> 00:38:39,856 pile up regulatory circuits and get all kinds 619 00:38:39,856 --> 00:38:42,000 of additional conditionalities. 620 00:38:42,000 --> 00:38:44,664 And the thing I've told you so far 621 00:38:44,664 --> 00:38:48,333 was we asked whether beta gal is present. 622 00:38:48,333 --> 00:38:57,331 If the carbon source is glucose, 623 00:38:57,331 --> 00:39:02,000 this is low or not there. 624 00:39:02,000 --> 00:39:08,000 If it's lactose it's high, but if cells were grown on both, 625 00:39:08,000 --> 00:39:13,000 glucose plus lactose, then beta galactosidase is low again. 626 00:39:13,000 --> 00:39:18,000 And this makes some physiological sense because E. 627 00:39:18,000 --> 00:39:21,000 coli likes to use glucose. 628 00:39:21,000 --> 00:39:24,000 It's its favorite food source. 629 00:39:24,000 --> 00:39:28,224 And so if it's got its favorite food source around then 630 00:39:28,224 --> 00:39:31,140 it doesn't want to make proteins that 631 00:39:31,140 --> 00:39:36,875 are used to eat its sort of less favorite food source. 632 00:39:36,875 --> 00:39:40,307 So there's a nice conditionality here. 633 00:39:40,307 --> 00:39:42,456 The circuitry is 634 00:39:42,456 --> 00:39:46,800 set up so that it only makes the enzyme for metabolizing lactose 635 00:39:46,800 --> 00:39:50,724 when the cell realizes its favorite food source isn't 636 00:39:50,724 --> 00:39:54,000 there and then it senses there's lactose. 637 00:39:54,000 --> 00:39:59,856 So only under those conditions does it make the enzymes 638 00:39:59,856 --> 00:40:02,000 for metabolizing lactose. 639 00:40:02,000 --> 00:40:04,800 And, again, the way this circuitry works, 640 00:40:04,800 --> 00:40:07,665 this positive regulation needs two things. 641 00:40:07,665 --> 00:40:11,000 Once again, it needs a protein. 642 00:40:11,000 --> 00:40:13,912 This one has been given the name CRP.
643 00:40:13,912 --> 00:40:17,071 And, again, it's something that's able to bind 644 00:40:17,071 --> 00:40:18,856 to a sequence in DNA. 645 00:40:18,856 --> 00:40:21,666 And then it's also got a conditionality. 646 00:40:21,666 --> 00:40:25,416 It's able to recognize something else. 647 00:40:25,416 --> 00:40:29,576 And what this one recognizes is this small molecule that's 648 00:40:29,576 --> 00:40:35,142 known as cyclic AMP. It's just the 649 00:40:35,142 --> 00:40:39,428 familiar adenosine monophosphate that you've seen 650 00:40:39,428 --> 00:40:46,000 before but it's looped around and formed an ester bond here. 651 00:40:46,000 --> 00:40:49,998 And that's why it's called cyclic AMP. 652 00:40:49,998 --> 00:40:54,000 But the important thing about this 653 00:40:54,000 --> 00:41:01,000 is that the levels of cyclic AMP are dependent on glucose. 654 00:41:01,000 --> 00:41:08,557 So if you have high glucose you have low cyclic AMP 655 00:41:08,557 --> 00:41:17,076 and if you have low glucose you have high cyclic AMP. 656 00:41:17,076 --> 00:41:23,857 And here again is what's going to happen. 657 00:41:23,857 --> 00:41:29,856 So this is the promoter for lacZYA 658 00:41:29,856 --> 00:41:36,570 and here's the start of the lacZ gene. 659 00:41:36,570 --> 00:41:46,000 And I told you this is where the operator would be found. 660 00:41:46,000 --> 00:41:59,250 This CRP protein is able to bind to 661 00:41:59,250 --> 00:42:12,000 a site that's even a little bit farther upstream of the lacZ 662 00:42:12,000 --> 00:42:16,000 gene than is the promoter. 663 00:42:16,000 --> 00:42:26,250 And the idea of this is that if this CRP is 664 00:42:26,250 --> 00:42:36,228 just by itself, it doesn't bind to the DNA. 665 00:42:36,228 --> 00:42:40,571 But it has a little binding pocket that 666 00:42:40,571 --> 00:42:47,423 senses the levels of cyclic AMP.
So if the cell is starving 667 00:42:47,423 --> 00:42:51,766 for glucose, there are high levels of cyclic AMP, 668 00:42:51,766 --> 00:43:00,328 then the CRP bound to cyclic AMP, this is capable of binding 669 00:43:00,328 --> 00:43:03,000 to this sequence. 670 00:43:03,000 --> 00:43:12,000 So that sounds weird, but what we've, again, 671 00:43:12,000 --> 00:43:19,704 got now is we've got a protein whose binding to DNA is 672 00:43:19,704 --> 00:43:24,272 conditional on something inside the cell. 673 00:43:24,272 --> 00:43:30,692 And rather than getting in the way, what 674 00:43:30,692 --> 00:43:36,228 this does, if you have a situation where 675 00:43:36,228 --> 00:43:42,750 you have CRP with cyclic AMP bound to it 676 00:43:42,750 --> 00:43:51,750 and it's next door to this promoter, what it's able to do 677 00:43:51,750 --> 00:43:57,000 is help RNA polymerase recognize the promoter. 678 00:43:57,000 --> 00:44:01,090 So this helps RNA polymerase. 679 00:44:01,090 --> 00:44:07,428 Whereas, when we were talking about the lac repressor what 680 00:44:07,428 --> 00:44:10,284 it was doing, if you recall, was getting 681 00:44:10,284 --> 00:44:13,000 in the way of RNA polymerase. 682 00:44:13,000 --> 00:44:17,000 And this actually has a relatively simple sort 683 00:44:17,000 --> 00:44:20,500 of molecular explanation for what's going on. 684 00:44:20,500 --> 00:44:37,178 The best sequence for the minus ten 685 00:44:37,178 --> 00:44:42,688 region of the promoter, that I showed you over on the other 686 00:44:42,688 --> 00:44:46,428 board, is something with the sequence TATAAT.
687 00:44:46,428 --> 00:44:50,461 So the RNA polymerase machinery, it 688 00:44:50,461 --> 00:44:53,227 can recognize more than one sequence, 689 00:44:53,227 --> 00:44:58,499 but if it sees a promoter that has a minus ten region that 690 00:44:58,499 --> 00:45:01,461 is TATAAT, it really binds well and that 691 00:45:01,461 --> 00:45:07,000 would be a very strong promoter and you get lots of mRNA. 692 00:45:07,000 --> 00:45:12,816 Now, the lac promoter actually has two nucleotides 693 00:45:12,816 --> 00:45:15,000 that are different. 694 00:45:15,000 --> 00:45:26,000 It's TATGTT. 695 00:45:26,000 --> 00:45:38,000 So this is a very weak promoter without help. 696 00:45:38,000 --> 00:45:43,000 So in this lac system, if we got rid of lac repressor 697 00:45:43,000 --> 00:45:48,000 entirely so that the promoter was just exposed all the time, 698 00:45:48,000 --> 00:45:50,808 as I showed you up here, I've left out 699 00:45:50,808 --> 00:45:57,285 a detail in this first part in that this promoter is not 700 00:45:57,285 --> 00:45:59,000 very strong. 701 00:45:59,000 --> 00:46:01,394 And we'd only get a little bit of RNA 702 00:46:01,394 --> 00:46:03,000 and a little bit of protein. 703 00:46:03,000 --> 00:46:08,000 And so what the cell does is if it 704 00:46:08,000 --> 00:46:11,840 knows there's no glucose around and cyclic AMP levels are 705 00:46:11,840 --> 00:46:15,664 high then it uses the binding of the CRP to here 706 00:46:15,664 --> 00:46:18,416 to assist the RNA polymerase and get it turned on. 707 00:46:18,416 --> 00:46:21,328 I understand this is a bit complicated. 708 00:46:21,328 --> 00:46:23,624 Some of you probably have it. 709 00:46:23,624 --> 00:46:26,120 Some of you will be lost and you'll 710 00:46:26,120 --> 00:46:29,710 have to sit and look at your textbook for a little while. 711 00:46:29,710 --> 00:46:33,200 But it comes down to a couple of really simple principles.
712 00:46:33,200 --> 00:46:36,999 One is, to detect what's going on, the cell makes 713 00:46:36,999 --> 00:46:39,663 a regulatory protein, the regulatory protein binds DNA 714 00:46:39,663 --> 00:46:41,452 and it also senses something. 715 00:46:41,452 --> 00:46:44,363 And these things can work in two ways. 716 00:46:44,363 --> 00:46:48,000 They can either bind DNA and get in the way, 717 00:46:48,000 --> 00:46:49,995 they can be a negative regulatory element, 718 00:46:49,995 --> 00:46:52,726 or they can bind to DNA and they can 719 00:46:52,726 --> 00:46:56,000 help something happen and be a positive regulatory element. 720 00:46:56,000 --> 00:47:00,000 All the rest are just the details of lac. 721 00:47:00,000 --> 00:47:02,763 And we could spend the entire course on regulation 722 00:47:02,763 --> 00:47:05,071 and barely scratch the surface, but it's 723 00:47:05,071 --> 00:47:08,284 one of the huge secrets of life that cells 724 00:47:08,284 --> 00:47:11,178 are able to individually turn on different genes 725 00:47:11,178 --> 00:47:13,454 in different ways at different times, 726 00:47:13,454 --> 00:47:17,540 have rheostats for levels, coordinate great sets of genes 727 00:47:17,540 --> 00:47:19,000 in response to various stimuli. 728 00:47:19,000 --> 00:47:19,500 OK? 729 00:47:19,500 --> 00:47:21,650 See you on Wednesday.
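As a recap of the lac circuitry, the two regulatory proteins can be sketched as a pair of yes/no conditions. This is a deliberately crude illustration and not something from the lecture: the function names, the boolean inputs, and the three output labels ("off", "low", "high") are all assumptions, and a real cell responds in graded amounts. The mismatch count between the best minus ten sequence (TATAAT) and the lac one (TATGTT) is included to show why the promoter is weak without help.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def lac_expression(glucose: bool, lactose: bool) -> str:
    """Crude yes/no model of beta galactosidase expression (assumption)."""
    # Negative regulation: the repressor sits on the operator unless it
    # is holding onto lactose.
    repressor_on_operator = not lactose
    # Positive regulation: cyclic AMP is high only when glucose is low,
    # and CRP binds its DNA site only when bound to cyclic AMP.
    crp_on_dna = not glucose

    if repressor_on_operator:
        return "off"   # repressor is in the way of RNA polymerase
    if crp_on_dna:
        return "high"  # CRP helps RNA polymerase at the weak promoter
    return "low"       # promoter exposed, but weak without help


print(count_mismatches("TATAAT", "TATGTT"))
for glucose in (True, False):
    for lactose in (True, False):
        print(glucose, lactose, lac_expression(glucose, lactose))
```

In this sketch, glucose plus lactose gives "low" rather than "off", matching the point that an exposed but unassisted lac promoter only makes a little RNA.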