1 00:00:00,080 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,235 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,235 --> 00:00:17,860 at ocw.mit.edu. 8 00:00:20,482 --> 00:00:22,690 PROFESSOR: So yeah, but today, what we're going to do 9 00:00:22,690 --> 00:00:24,590 is talk about evolution. 10 00:00:24,590 --> 00:00:26,340 And in particular, we're going to complete 11 00:00:26,340 --> 00:00:27,750 our discussion of evolution in the presence 12 00:00:27,750 --> 00:00:30,440 of clonal interference when multiple mutant lineages are 13 00:00:30,440 --> 00:00:32,640 competing in the population at the same time. 14 00:00:32,640 --> 00:00:34,320 And then, we'll move on to try to think 15 00:00:34,320 --> 00:00:36,850 about evolution on these so-called rugged fitness 16 00:00:36,850 --> 00:00:38,290 landscapes. 17 00:00:38,290 --> 00:00:41,750 So such ruggedness occurs when there 18 00:00:41,750 --> 00:00:45,025 are interactions between the mutations within the organism. 19 00:00:45,025 --> 00:00:47,560 So these so-called epistatic interactions 20 00:00:47,560 --> 00:00:51,070 can, perhaps, constrain the path of evolution. 21 00:00:51,070 --> 00:00:54,566 So we'll complete our discussion of Roy Kishony's paper 22 00:00:54,566 --> 00:00:56,440 on the equivalence principle in the presence 23 00:00:56,440 --> 00:00:57,730 of clonal interference. 24 00:00:57,730 --> 00:01:01,390 Then we'll say something about how clonal interference slows 25 00:01:01,390 --> 00:01:02,565 down the rate of evolution.
26 00:01:02,565 --> 00:01:06,700 It slows down the rate at which the population can increase 27 00:01:06,700 --> 00:01:08,760 in fitness because of this competition 28 00:01:08,760 --> 00:01:11,640 between different beneficial lineages. 29 00:01:11,640 --> 00:01:14,280 And then, we'll discuss this paper 30 00:01:14,280 --> 00:01:17,790 by Daniel Weinreich, which I think, 31 00:01:17,790 --> 00:01:21,120 for many of us, had a real important effect 32 00:01:21,120 --> 00:01:23,950 just in terms of getting us to think about evolution 33 00:01:23,950 --> 00:01:27,100 in a different way. 34 00:01:27,100 --> 00:01:29,826 Any questions before? 35 00:01:29,826 --> 00:01:30,325 Yes. 36 00:01:30,325 --> 00:01:32,116 AUDIENCE: Question about the exam. 37 00:01:32,116 --> 00:01:33,580 PROFESSOR: Yes. 38 00:01:33,580 --> 00:01:35,570 In case you've tried to forget, we 39 00:01:35,570 --> 00:01:39,370 have an exam again next week. 40 00:01:39,370 --> 00:01:42,680 So research in education indicates that 41 00:01:42,680 --> 00:01:45,080 the more exams, the better. 42 00:01:45,080 --> 00:01:49,200 So if that makes you feel any better in terms of process, 43 00:01:49,200 --> 00:01:50,510 then use that. 44 00:01:50,510 --> 00:01:57,210 So next Thursday, 7 o'clock, we will announce the room later. 45 00:01:57,210 --> 00:01:59,028 Did you have a question about that? 46 00:01:59,028 --> 00:02:01,370 AUDIENCE: Is it everything from the start or [INAUDIBLE] 47 00:02:01,370 --> 00:02:05,500 PROFESSOR: Yeah, so it will be weighted towards the material 48 00:02:05,500 --> 00:02:08,190 that we did not test through exam one. 49 00:02:08,190 --> 00:02:12,160 But you can expect to get two plus or minus 50 00:02:12,160 --> 00:02:16,520 1 questions on material that was covered 51 00:02:16,520 --> 00:02:19,720 in the first part of the class. 52 00:02:19,720 --> 00:02:20,220 OK. 53 00:02:20,220 --> 00:02:21,053 Any other questions? 54 00:02:24,236 --> 00:02:27,160 OK.
55 00:02:27,160 --> 00:02:30,500 All right, so coming back to this discussion 56 00:02:30,500 --> 00:02:31,780 of the equivalence principle. 57 00:02:36,140 --> 00:02:42,360 So the last figure of the paper-- it's figure four-- 58 00:02:42,360 --> 00:02:48,262 illustrated some alternative underlying distributions. 59 00:02:48,262 --> 00:02:49,970 So what we wanted to know is-- all right, 60 00:02:49,970 --> 00:02:52,100 so there are going to be some beneficial mutations 61 00:02:52,100 --> 00:02:56,117 that these E. coli can get in this new environment. 62 00:02:56,117 --> 00:02:58,700 And they're going to be distributed according to some probability 63 00:02:58,700 --> 00:02:59,762 distribution. 64 00:02:59,762 --> 00:03:01,345 So what we, in principle, want to know 65 00:03:01,345 --> 00:03:04,300 is this distribution of effects of beneficial mutations. 66 00:03:04,300 --> 00:03:06,360 So the probability distribution, P beneficial, 67 00:03:06,360 --> 00:03:09,940 is a function of s. 68 00:03:09,940 --> 00:03:14,410 And what was, kind of, the scale of beneficial mutations 69 00:03:14,410 --> 00:03:19,246 that they observe in the paper? 70 00:03:19,246 --> 00:03:21,760 All right, how much better do the populations 71 00:03:21,760 --> 00:03:24,075 get after these few hundred generations? 72 00:03:27,196 --> 00:03:27,695 Hm? 73 00:03:27,695 --> 00:03:28,610 AUDIENCE: 1%. 74 00:03:28,610 --> 00:03:30,609 PROFESSOR: All right, so [? order of ?] percent. 75 00:03:30,609 --> 00:03:32,870 A few percent, right? 76 00:03:32,870 --> 00:03:33,440 Precisely. 77 00:03:33,440 --> 00:03:36,570 And indeed, they had three different models 78 00:03:36,570 --> 00:03:38,437 for some of the underlying distributions 79 00:03:38,437 --> 00:03:39,520 and how they might behave. 80 00:03:39,520 --> 00:03:43,827 So they had the exponential, the uniform, and the [INAUDIBLE] 81 00:03:43,827 --> 00:03:44,785 kind of delta function.
82 00:03:49,240 --> 00:03:52,490 And in all these cases, are those 83 00:03:52,490 --> 00:03:56,190 the only three possible underlying distributions? 84 00:03:56,190 --> 00:03:58,620 Those are just, kind of, typical distributions. 85 00:03:58,620 --> 00:04:01,712 What you want, in this case, is you want these three 86 00:04:01,712 --> 00:04:03,920 distributions to, somehow, be qualitatively different 87 00:04:03,920 --> 00:04:06,290 so you can drive home the point that you 88 00:04:06,290 --> 00:04:09,430 can, in principle, describe the results of their evolution 89 00:04:09,430 --> 00:04:15,020 experiment via wildly different underlying distributions. 90 00:04:15,020 --> 00:04:16,240 OK. 91 00:04:16,240 --> 00:04:21,380 Now in all these cases, you have to specify 92 00:04:21,380 --> 00:04:30,582 the mutation rate, as well as the mean selection coefficient. 93 00:04:30,582 --> 00:04:32,550 All right, so the mean of this distribution. 94 00:04:45,684 --> 00:04:50,110 OK, now the question is, can we get any intuitive insight 95 00:04:50,110 --> 00:04:53,040 into why there were the patterns that they observe in terms 96 00:04:53,040 --> 00:04:56,407 of the mutation rate that they had to assume 97 00:04:56,407 --> 00:04:57,865 and the mean selection coefficient? 98 00:05:00,420 --> 00:05:03,110 There was some region of parameter space for those 99 00:05:03,110 --> 00:05:05,809 distributions, for each of those distributions, that was, kind 100 00:05:05,809 --> 00:05:07,100 of, consistent with their data. 101 00:05:10,890 --> 00:05:14,160 So in particular, there was a mutation rate-- mu-- 102 00:05:14,160 --> 00:05:19,206 for the exponential distribution, 103 00:05:19,206 --> 00:05:25,270 a mutation rate for the uniform, and a mutation rate 104 00:05:25,270 --> 00:05:30,290 for the delta function.
105 00:05:30,290 --> 00:05:36,440 And also, there was a mean s associated 106 00:05:36,440 --> 00:05:42,960 with the exponential, mean s for the uniform 107 00:05:42,960 --> 00:05:48,275 and, again, a mean s for the delta. 108 00:05:48,275 --> 00:05:53,032 AUDIENCE: Were they trying to minimize the parameter numbers? 109 00:05:53,032 --> 00:05:53,740 PROFESSOR: Right. 110 00:05:53,740 --> 00:05:57,750 OK, so in this case, they wanted to compare these three 111 00:05:57,750 --> 00:05:59,590 distributions. 112 00:05:59,590 --> 00:06:03,370 Each of them was specified by two parameters. 113 00:06:03,370 --> 00:06:05,835 You could come up with other underlying distributions that 114 00:06:05,835 --> 00:06:07,210 might, for example, have a larger 115 00:06:07,210 --> 00:06:09,751 number-- that might be specified by a larger number of parameters. 116 00:06:09,751 --> 00:06:12,140 But then, it's harder to compare the quality of the fit 117 00:06:12,140 --> 00:06:12,770 and so forth. 118 00:06:12,770 --> 00:06:15,580 So they chose these three distributions 119 00:06:15,580 --> 00:06:17,410 just because this is, somehow, the rate 120 00:06:17,410 --> 00:06:19,830 that these new mutations will appear in the population. 121 00:06:19,830 --> 00:06:22,260 And then, this tells us something about how good 122 00:06:22,260 --> 00:06:23,296 those mutations are. 123 00:06:28,030 --> 00:06:28,530 Right. 124 00:06:28,530 --> 00:06:34,300 Now some of you have the paper in front of you. 125 00:06:34,300 --> 00:06:35,760 And that's OK. 126 00:06:35,760 --> 00:06:41,620 But based on our understanding of how the clonal interference, 127 00:06:41,620 --> 00:06:44,510 kind of, manifests itself in terms of leading, 128 00:06:44,510 --> 00:06:48,110 eventually, they have the log of the fraction. 129 00:06:50,710 --> 00:06:54,320 So the fraction starts out 50-50. 130 00:06:54,320 --> 00:06:56,530 Log F1 over F2. 131 00:06:56,530 --> 00:06:59,390 So like, cyan and yellow, say?
132 00:06:59,390 --> 00:07:00,220 All right. 133 00:07:00,220 --> 00:07:01,680 So this thing starts out here. 134 00:07:01,680 --> 00:07:04,340 And then, one side gets beneficial mutation. 135 00:07:04,340 --> 00:07:05,610 So it, kind of, comes up. 136 00:07:05,610 --> 00:07:09,800 So they measure the slope, for example, 137 00:07:09,800 --> 00:07:14,270 of the lineage that is taking over the population. 138 00:07:14,270 --> 00:07:19,130 So they want to know, well, which of these distributions 139 00:07:19,130 --> 00:07:20,760 and associated parameters will be 140 00:07:20,760 --> 00:07:24,210 able to explain the range of different trajectories 141 00:07:24,210 --> 00:07:25,110 that they saw? 142 00:07:32,960 --> 00:07:40,340 So the question is, can we order these things? 143 00:07:40,340 --> 00:07:41,000 And why? 144 00:07:48,251 --> 00:07:49,750 All right, which one of these should 145 00:07:49,750 --> 00:07:53,340 be the largest, second largest, third largest, and so forth? 146 00:07:53,340 --> 00:07:55,130 OK. 147 00:07:55,130 --> 00:07:59,006 Now it's OK if you just-- well, if you 148 00:07:59,006 --> 00:08:01,710 had the paper in front of you, you could just read it off. 149 00:08:01,710 --> 00:08:02,726 But ultimately, you're going to have 150 00:08:02,726 --> 00:08:04,392 to be able to explain why it is that one 151 00:08:04,392 --> 00:08:07,034 is larger than the other. 152 00:08:07,034 --> 00:08:08,938 OK? 153 00:08:08,938 --> 00:08:14,610 So what I want to know is, for example, at what order should 154 00:08:14,610 --> 00:08:15,795 these things come in? 155 00:08:15,795 --> 00:08:16,320 All right. 156 00:08:16,320 --> 00:08:21,250 So what I want to do is let you think about it for a minute. 157 00:08:21,250 --> 00:08:23,110 And then, we're going to vote by putting 158 00:08:23,110 --> 00:08:26,050 our cards from high mutation rate 159 00:08:26,050 --> 00:08:31,550 to low mutation rate among a, b, and c. 160 00:08:31,550 --> 00:08:32,332 Yes? 
161 00:08:32,332 --> 00:08:34,592 AUDIENCE: So the means are constrained to be the same? 162 00:08:37,377 --> 00:08:39,960 PROFESSOR: So it's really going to be some range of parameters 163 00:08:39,960 --> 00:08:41,020 on each of these. 164 00:08:41,020 --> 00:08:43,809 So the question is, why is it that in the range of parameters 165 00:08:43,809 --> 00:08:45,335 that are consistent with what they observe experimentally, 166 00:08:45,335 --> 00:08:47,190 that these things have some order? 167 00:09:03,479 --> 00:09:05,520 Are there any other questions about the question? 168 00:09:13,810 --> 00:09:15,970 I'll give you 30 seconds to think 169 00:09:15,970 --> 00:09:17,820 about what the mutation rate should, 170 00:09:17,820 --> 00:09:19,445 kind of, be in this situation. 171 00:09:28,606 --> 00:09:31,442 AUDIENCE: The answer is highest to lowest mutation rate. 172 00:09:31,442 --> 00:09:32,150 PROFESSOR: Right. 173 00:09:32,150 --> 00:09:35,060 So you're going to put the highest mutation rate up high, 174 00:09:35,060 --> 00:09:37,270 the lowest mutation rate down there, 175 00:09:37,270 --> 00:09:38,800 and the middle one in between. 176 00:09:38,800 --> 00:09:39,300 Yeah. 177 00:10:23,670 --> 00:10:24,170 All right. 178 00:10:24,170 --> 00:10:25,150 Do you need more time? 179 00:10:27,778 --> 00:10:30,890 All right, let's see where we are. 180 00:10:30,890 --> 00:10:34,300 And it's OK if you're confused or don't 181 00:10:34,300 --> 00:10:35,485 know what I'm trying to ask. 182 00:10:35,485 --> 00:10:38,011 But let me see where the group is. 183 00:10:38,011 --> 00:10:38,510 Ready? 184 00:10:38,510 --> 00:10:42,870 Three, two, one. 185 00:10:42,870 --> 00:10:43,410 All right. 186 00:10:43,410 --> 00:10:48,170 So I would say it's, pretty much, all over the place, 187 00:10:48,170 --> 00:10:51,270 whether people are voting something in reality or not. 188 00:10:54,330 --> 00:10:54,840 OK, right. 
189 00:10:54,840 --> 00:11:01,819 So the situation is that we have the data, which 190 00:11:01,819 --> 00:11:04,360 is shown in figure 3A, which is a bunch of these things that, 191 00:11:04,360 --> 00:11:05,360 kind of, look like this. 192 00:11:10,600 --> 00:11:11,195 All right. 193 00:11:15,670 --> 00:11:17,410 So we have some times. 194 00:11:17,410 --> 00:11:19,500 We have some slopes. 195 00:11:19,500 --> 00:11:23,250 We want to know how can we understand that data 196 00:11:23,250 --> 00:11:24,138 that we get out. 197 00:11:27,074 --> 00:11:28,490 So what we're going to do is we're 198 00:11:28,490 --> 00:11:30,040 going to take a model in which we say, 199 00:11:30,040 --> 00:11:32,498 all right, we're going to start with this population that's 200 00:11:32,498 --> 00:11:33,230 all identical. 201 00:11:33,230 --> 00:11:38,720 And then, we're going to allow some mutations to accumulate. 202 00:11:38,720 --> 00:11:41,730 And we're going to let them compete against each other. 203 00:11:41,730 --> 00:11:45,154 And then, see what happens to the other. 204 00:11:45,154 --> 00:11:47,570 So there's going to be some, again, distribution of slopes 205 00:11:47,570 --> 00:11:48,970 and so forth. 206 00:11:48,970 --> 00:11:49,470 All right. 207 00:11:49,470 --> 00:11:51,450 To what degree does this sort of data 208 00:11:51,450 --> 00:11:56,280 constrain that underlying distribution? 209 00:11:56,280 --> 00:11:59,920 Between something that looks like an exponential, 210 00:11:59,920 --> 00:12:03,060 something that has a uniform distribution, 211 00:12:03,060 --> 00:12:06,220 and something that is a delta function. 212 00:12:09,940 --> 00:12:12,031 So that's the exponential. 213 00:12:12,031 --> 00:12:12,780 This is the delta. 214 00:12:12,780 --> 00:12:13,780 And this is the uniform. 215 00:12:20,660 --> 00:12:22,145 Yeah. 216 00:12:22,145 --> 00:12:25,610 AUDIENCE: So we measure the slope at what time [INAUDIBLE]? 
217 00:12:29,100 --> 00:12:30,959 PROFESSOR: Yeah, it could. 218 00:12:30,959 --> 00:12:31,500 That's right. 219 00:12:31,500 --> 00:12:35,790 So this thing could turn around at various times. 220 00:12:35,790 --> 00:12:39,864 So I think that there are a number of different ways 221 00:12:39,864 --> 00:12:42,030 that you could argue about the right way to do this. 222 00:12:42,030 --> 00:12:43,488 In practice, I think it's not going 223 00:12:43,488 --> 00:12:47,590 to be very sensitive because there's a minority of them that 224 00:12:47,590 --> 00:12:50,850 will actually be turning around, for example. 225 00:12:50,850 --> 00:12:53,054 So you could, for example, just say 226 00:12:53,054 --> 00:12:54,970 all of the trajectories that cross some point, 227 00:12:54,970 --> 00:12:56,180 I mean, measure the slope. 228 00:12:56,180 --> 00:12:59,760 And I think that would be sufficient. 229 00:12:59,760 --> 00:13:07,110 AUDIENCE: But if you don't see this fraction turn over, 230 00:13:07,110 --> 00:13:09,226 you could still have clonal interference? 231 00:13:09,226 --> 00:13:11,230 PROFESSOR: If you don't see the fraction. 232 00:13:11,230 --> 00:13:11,855 AUDIENCE: Like, in the sense-- 233 00:13:11,855 --> 00:13:12,854 PROFESSOR: That's right. 234 00:13:12,854 --> 00:13:17,100 So even if you don't see these things like, the flatten out, 235 00:13:17,100 --> 00:13:20,870 for example, then you could still have clonal interference 236 00:13:20,870 --> 00:13:23,270 because the slope might still be steeper 237 00:13:23,270 --> 00:13:25,880 than it would be in the absence of clonal interference. 238 00:13:39,154 --> 00:13:40,570 All right, what I'm going to do is 239 00:13:40,570 --> 00:13:44,140 I'm going to let you discuss with a neighbor for one minute. 
240 00:13:44,140 --> 00:13:45,989 And then, we'll, maybe, discuss as a group 241 00:13:45,989 --> 00:13:48,530 just because I want to make sure that everybody gets a chance 242 00:13:48,530 --> 00:13:52,250 to try and verbalize their thought process. 243 00:13:52,250 --> 00:13:55,219 And if we discuss in a group, then only a few of us get to. 244 00:13:55,219 --> 00:13:56,260 All right, so one minute. 245 00:13:56,260 --> 00:13:57,890 Try to discuss it with your neighbor. 246 00:13:57,890 --> 00:14:00,256 And then we'll reconvene. 247 00:14:00,256 --> 00:14:03,742 [SIDE CONVERSATIONS] 248 00:15:51,310 --> 00:15:54,310 Yeah, and we're going to discuss the means in a moment. 249 00:15:54,310 --> 00:15:56,110 So indeed, these distributions will not 250 00:15:56,110 --> 00:15:59,110 end up having the same mean s. 251 00:15:59,110 --> 00:16:01,576 AUDIENCE: What are you controlling? 252 00:16:01,576 --> 00:16:02,950 PROFESSOR: What we're controlling 253 00:16:02,950 --> 00:16:05,490 is that we're asking about what range 254 00:16:05,490 --> 00:16:07,310 of parameters for each distribution 255 00:16:07,310 --> 00:16:08,855 will adequately fit the data. 256 00:16:12,934 --> 00:16:14,368 AUDIENCE: I know [INAUDIBLE]. 257 00:16:16,919 --> 00:16:18,210 Does it depend on what you get? 258 00:16:18,210 --> 00:16:19,910 PROFESSOR: The data will be, basically, 259 00:16:19,910 --> 00:16:23,620 the initial slopes here and when they 260 00:16:23,620 --> 00:16:25,379 deviated from a 50-50 mixture. 261 00:16:25,379 --> 00:16:25,920 AUDIENCE: OK. 262 00:16:25,920 --> 00:16:26,850 PROFESSOR: All right, so that's what [INAUDIBLE] 263 00:16:26,850 --> 00:16:27,970 is those histograms. 264 00:16:27,970 --> 00:16:30,350 AUDIENCE: Yeah. 265 00:16:30,350 --> 00:16:31,278 OK. 266 00:16:31,278 --> 00:16:31,778 [INAUDIBLE] 267 00:16:36,062 --> 00:16:38,880 PROFESSOR: All right, so it seems like we've quieted down, 268 00:16:38,880 --> 00:16:41,880 which means that we all agree on the answer. 
269 00:16:41,880 --> 00:16:44,200 Is that-- no? 270 00:16:44,200 --> 00:16:47,690 OK, well I think that this is, actually, pretty tricky. 271 00:16:47,690 --> 00:16:48,315 So that's fine. 272 00:16:48,315 --> 00:16:49,981 I just want to see where we are, though. 273 00:16:49,981 --> 00:16:50,580 All right. 274 00:16:50,580 --> 00:16:52,530 Reconfigure your cards. 275 00:16:52,530 --> 00:16:58,280 Your best guess for the orders of the mutation rates 276 00:16:58,280 --> 00:17:01,130 between exponential uniform and delta. 277 00:17:01,130 --> 00:17:01,920 All right, ready? 278 00:17:01,920 --> 00:17:03,490 Three, two, one. 279 00:17:06,619 --> 00:17:12,089 OK, so we're migrating towards some things. 280 00:17:12,089 --> 00:17:17,990 OK, great. 281 00:17:17,990 --> 00:17:26,310 And can somebody verbalize the answer that their group got? 282 00:17:33,114 --> 00:17:37,487 AUDIENCE: So our answer is A, B, and C. 283 00:17:37,487 --> 00:17:38,070 PROFESSOR: OK. 284 00:17:38,070 --> 00:17:40,450 AUDIENCE: [INAUDIBLE] exponential [INAUDIBLE]. 285 00:17:44,734 --> 00:17:53,380 We can not see most of the [INAUDIBLE] 286 00:17:53,380 --> 00:17:57,340 lower selection coefficient mutations. 287 00:17:57,340 --> 00:17:58,330 PROFESSOR: OK. 288 00:17:58,330 --> 00:18:00,805 AUDIENCE: So we're actually underestimating the mutation 289 00:18:00,805 --> 00:18:03,775 rate from the data. 290 00:18:03,775 --> 00:18:05,089 PROFESSOR: Underestimate. 291 00:18:05,089 --> 00:18:06,630 OK, no, I can see what you're saying. 
292 00:18:06,630 --> 00:18:08,130 OK, yeah, so the idea is that you're 293 00:18:08,130 --> 00:18:10,420 saying that we don't see an awful lot of the mutations 294 00:18:10,420 --> 00:18:13,550 here, which means that the true mutation 295 00:18:13,550 --> 00:18:14,967 rate, the underlying mutation rate 296 00:18:14,967 --> 00:18:17,383 is, somehow, much larger than you would have thought based 297 00:18:17,383 --> 00:18:20,130 on the mutations that you actually see here or something. 298 00:18:20,130 --> 00:18:21,750 And there's maybe another. 299 00:18:21,750 --> 00:18:23,400 OK, so they're different. 300 00:18:23,400 --> 00:18:26,480 All right. 301 00:18:26,480 --> 00:18:29,339 OK, it's certainly along-- yeah, sometimes it's true. 302 00:18:29,339 --> 00:18:31,880 And then, of course, there are different ways of saying this. 303 00:18:31,880 --> 00:18:33,250 Yes? 304 00:18:33,250 --> 00:18:36,136 AUDIENCE: Yeah, same answer but a slightly different way 305 00:18:36,136 --> 00:18:37,010 of thinking about it. 306 00:18:39,650 --> 00:18:42,912 If you're just randomly sampling any of these distributions, 307 00:18:42,912 --> 00:18:49,164 then your sample drawn from the exponential distribution, 308 00:18:49,164 --> 00:18:52,918 it's going to be low selection [INAUDIBLE] 309 00:18:52,918 --> 00:18:55,084 more often than it's going to be for the other ones. 310 00:18:55,084 --> 00:18:56,270 Like the delta is [INAUDIBLE]. 311 00:18:56,270 --> 00:18:56,520 PROFESSOR: That's right. 312 00:18:56,520 --> 00:18:57,228 Yeah, yeah, yeah. 313 00:18:57,228 --> 00:18:59,560 AUDIENCE: [INAUDIBLE] every time. The uniform, 314 00:18:59,560 --> 00:19:04,325 it's going to be equally likely to be a high selection 315 00:19:04,325 --> 00:19:06,800 coefficient as opposed to a low selection coefficient. 316 00:19:06,800 --> 00:19:08,425 But with the exponential distribution, 317 00:19:08,425 --> 00:19:10,760 you're most likely to be a low selection coefficient.
318 00:19:10,760 --> 00:19:12,160 So you want more mutations. 319 00:19:12,160 --> 00:19:13,160 PROFESSOR: That's right. 320 00:19:13,160 --> 00:19:16,200 You, somehow, need more mutations of that exponential 321 00:19:16,200 --> 00:19:18,041 in order to sample out there. 322 00:19:18,041 --> 00:19:18,540 Right? 323 00:19:18,540 --> 00:19:22,104 So which one is going to have more clonal interference? 324 00:19:22,104 --> 00:19:23,770 Which of these distributions will end up 325 00:19:23,770 --> 00:19:25,645 having the most clonal interference after you 326 00:19:25,645 --> 00:19:26,930 fit the data? 327 00:19:33,180 --> 00:19:33,903 Yeah? 328 00:19:33,903 --> 00:19:34,870 AUDIENCE: The one with the highest mutation. 329 00:19:34,870 --> 00:19:37,036 PROFESSOR: The one with the highest mutation, right? 330 00:19:37,036 --> 00:19:37,990 Kind of has to. 331 00:19:37,990 --> 00:19:39,620 Of course, and even though some of those mutations 332 00:19:39,620 --> 00:19:40,476 are going to be lost, still it's going 333 00:19:40,476 --> 00:19:42,370 to have the most clonal interference there. 334 00:19:42,370 --> 00:19:45,540 And indeed, if the underlying distribution 335 00:19:45,540 --> 00:19:48,990 were modeled as a delta function 336 00:19:48,990 --> 00:19:51,610 and, in their [? fit ?], what they got 337 00:19:51,610 --> 00:19:56,580 was that this might be around 5 and 1/2%, I think. 338 00:19:59,165 --> 00:20:01,250 Yeah, so [? 5 to 5 and 1/2 ?]. 339 00:20:01,250 --> 00:20:01,750 OK. 340 00:20:01,750 --> 00:20:09,810 So here, this guy was around 0.055. 341 00:20:09,810 --> 00:20:11,110 Between 5 and 5 and 1/2%.
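The sampling argument from the discussion can be sketched numerically. This is only an illustration, not the fitting procedure from the paper: the threshold and the exponential and uniform means below are hypothetical values chosen for the example, and only the delta value of about 0.055 comes from the lecture. The sketch computes, for each distribution shape, the fraction of new beneficial mutations whose effect is at least as large as some threshold.

```python
import math

def tail_exponential(mean_s, threshold):
    """P(s >= threshold) for an exponential distribution with mean mean_s."""
    return math.exp(-threshold / mean_s)

def tail_uniform(mean_s, threshold):
    """P(s >= threshold) for a uniform distribution on [0, 2 * mean_s]."""
    upper = 2.0 * mean_s
    return min(1.0, max(0.0, 1.0 - threshold / upper))

def tail_delta(mean_s, threshold):
    """P(s >= threshold) when every mutation has effect exactly mean_s."""
    return 1.0 if mean_s >= threshold else 0.0

# Hypothetical threshold for a "large enough to be seen" mutation.
THRESHOLD = 0.05
# Delta value is from the lecture; the other two means are assumptions.
MEAN_DELTA = 0.055
MEAN_UNIFORM = 0.03
MEAN_EXPONENTIAL = 0.01

def large_effect_fractions():
    """Fraction of mutations at or above THRESHOLD for each shape."""
    return {
        "exponential": tail_exponential(MEAN_EXPONENTIAL, THRESHOLD),
        "uniform": tail_uniform(MEAN_UNIFORM, THRESHOLD),
        "delta": tail_delta(MEAN_DELTA, THRESHOLD),
    }

if __name__ == "__main__":
    for name, fraction in large_effect_fractions().items():
        # A smaller fraction means more total mutations are needed
        # to supply the same number of large-effect ones.
        print(f"{name}: fraction of large-effect mutations = {fraction:.4f}")
```

Under these assumed numbers, the exponential has the smallest fraction of large-effect mutations and the delta the largest, which is the same ordering as the argument that the exponential needs the highest mutation rate and the delta the lowest.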
342 00:20:13,624 --> 00:20:15,540 So what they're saying is, all right, well you 343 00:20:15,540 --> 00:20:18,210 could explain all of our data just 344 00:20:18,210 --> 00:20:21,930 by assuming that there's some mutation rate where, 345 00:20:21,930 --> 00:20:25,420 periodically, some individual gets a beneficial mutation that 346 00:20:25,420 --> 00:20:28,420 is 5, 5 and 1/2%. 347 00:20:28,420 --> 00:20:30,500 And that could, in principle, be used 348 00:20:30,500 --> 00:20:34,240 to explain the base features here, how long you have 349 00:20:34,240 --> 00:20:37,310 to wait before anything happens, and the slope when something 350 00:20:37,310 --> 00:20:38,840 starts happening. 351 00:20:38,840 --> 00:20:41,780 So the histogram that they plot is actually, somehow, 352 00:20:41,780 --> 00:20:44,380 this initial slope once you start 353 00:20:44,380 --> 00:20:47,330 seeing it deviate from 50-50. 354 00:20:47,330 --> 00:20:47,830 OK? 355 00:20:47,830 --> 00:20:48,330 All right. 356 00:20:48,330 --> 00:20:50,190 But their point is that that does not 357 00:20:50,190 --> 00:20:51,900 prove that the underlying distribution is 358 00:20:51,900 --> 00:20:54,740 a delta function with some mutation. 359 00:20:54,740 --> 00:20:57,210 And indeed, to explain the data with a delta function, 360 00:20:57,210 --> 00:21:01,780 you actually don't need any clonal interference. 361 00:21:01,780 --> 00:21:02,280 Right? 362 00:21:02,280 --> 00:21:04,279 You just say, OK, well somebody gets a mutation. 363 00:21:04,279 --> 00:21:05,530 It's 5%. 364 00:21:05,530 --> 00:21:07,140 And eventually, it's going to spread. 365 00:21:07,140 --> 00:21:08,830 And that's what we see. 366 00:21:08,830 --> 00:21:11,350 If you want to explain the later dynamics of flattening 367 00:21:11,350 --> 00:21:15,160 out and so forth, then you have to allow the other lineage 368 00:21:15,160 --> 00:21:18,580 to get mutations as well, to cause a flattening.
369 00:21:18,580 --> 00:21:21,010 But as far as the base dynamics of when 370 00:21:21,010 --> 00:21:23,104 you leave the 50-50 and the initial slope, 371 00:21:23,104 --> 00:21:25,520 you don't even really need to have any clonal interference 372 00:21:25,520 --> 00:21:30,312 to explain their data with a delta function underlying. 373 00:21:30,312 --> 00:21:32,770 And that's why you also can get by with a very low mutation 374 00:21:32,770 --> 00:21:34,606 rate because you don't really need much 375 00:21:34,606 --> 00:21:35,980 in the way of competing lineages. 376 00:21:39,560 --> 00:21:40,120 Yeah? 377 00:21:40,120 --> 00:21:42,564 AUDIENCE: But what if the slopes are [INAUDIBLE]? 378 00:21:42,564 --> 00:21:43,480 PROFESSOR: Yeah, yeah. 379 00:21:43,480 --> 00:21:43,980 No, right. 380 00:21:43,980 --> 00:21:47,510 So you're not actually going to get the true distribution 381 00:21:47,510 --> 00:21:48,560 of slopes. 382 00:21:48,560 --> 00:21:52,267 But their argument is that a lot of that 383 00:21:52,267 --> 00:21:54,100 could just be noise in measuring the slopes 384 00:21:54,100 --> 00:21:56,620 and so forth because, if everything is a delta function, 385 00:21:56,620 --> 00:22:00,560 then you would start out by just getting one slope, 386 00:22:00,560 --> 00:22:02,847 unless you allow for multiple mutations on a lineage. 387 00:22:02,847 --> 00:22:04,680 And then, things could get more complicated. 388 00:22:04,680 --> 00:22:06,340 But yeah. 389 00:22:06,340 --> 00:22:10,010 In this case, all of these guys would have the same slope. 390 00:22:10,010 --> 00:22:14,170 But that's, at least, a reasonable first order 391 00:22:14,170 --> 00:22:16,830 approximation to the data. 392 00:22:16,830 --> 00:22:19,090 However, as you move to these distributions, the uniform 393 00:22:19,090 --> 00:22:21,649 and exponential, you're going to need more and more clonal 394 00:22:21,649 --> 00:22:23,440 interference to, kind of, explain the data.
395 00:22:23,440 --> 00:22:26,422 So you'll need higher and higher mutation rate. 396 00:22:26,422 --> 00:22:27,880 What's interesting is that you also 397 00:22:27,880 --> 00:22:32,031 have a lower and lower mean s. 398 00:22:32,031 --> 00:22:32,531 OK? 399 00:22:36,010 --> 00:22:37,998 AUDIENCE: Can you just explain why 400 00:22:37,998 --> 00:22:42,924 you need to explain the data? 401 00:22:42,924 --> 00:22:43,840 PROFESSOR: Yeah, sure. 402 00:22:43,840 --> 00:22:47,580 And I think that drawing these underlying distributions 403 00:22:47,580 --> 00:22:49,250 is really helpful. 404 00:22:49,250 --> 00:22:52,060 So first, we're going to draw the delta function. 405 00:22:52,060 --> 00:22:53,910 That, kind of, makes sense that you 406 00:22:53,910 --> 00:22:57,400 can fit everything just by assuming 5, 5 and 1/2%, right. 407 00:22:57,400 --> 00:23:02,240 So what we're going to do is draw the various P. 408 00:23:02,240 --> 00:23:03,810 So I drew those distributions. 409 00:23:03,810 --> 00:23:06,390 But they weren't necessarily to scale. 410 00:23:06,390 --> 00:23:09,520 i.e, they didn't necessarily have the proper mean selection 411 00:23:09,520 --> 00:23:11,220 coefficient. 412 00:23:11,220 --> 00:23:13,380 What we can do here is we can draw 413 00:23:13,380 --> 00:23:16,330 this is the mean s of the delta function, which 414 00:23:16,330 --> 00:23:18,285 was 5, 5 and 1/2%. 415 00:23:18,285 --> 00:23:22,350 All right, we got this guy here. 416 00:23:22,350 --> 00:23:27,190 Now the question is can we describe 417 00:23:27,190 --> 00:23:33,480 the data using a uniform distribution 418 00:23:33,480 --> 00:23:36,105 with the same mean selection coefficient? 419 00:23:40,040 --> 00:23:42,680 All right, so we're going to have you vote yes and no. 420 00:23:42,680 --> 00:23:45,260 And if you say no, then you have to say 421 00:23:45,260 --> 00:23:46,951 what's going to go wrong. 422 00:23:46,951 --> 00:23:47,451 All right? 
423 00:23:47,451 --> 00:23:51,550 The question is can we just use the same mean selection 424 00:23:51,550 --> 00:23:53,555 coefficient for our uniform distribution. 425 00:23:56,230 --> 00:23:57,170 AUDIENCE: [INAUDIBLE]. 426 00:24:00,372 --> 00:24:01,080 PROFESSOR: Right. 427 00:24:01,080 --> 00:24:01,700 So what we're going to do is we're 428 00:24:01,700 --> 00:24:04,370 going to measure the experiment where we would have, 429 00:24:04,370 --> 00:24:07,289 say, 96 different evolutionary trajectories where we measure 430 00:24:07,289 --> 00:24:09,580 the time that it takes for something to happen and then 431 00:24:09,580 --> 00:24:10,454 the initial slope. 432 00:24:10,454 --> 00:24:12,620 And we're going to take a histogram of those things, 433 00:24:12,620 --> 00:24:14,170 and compare it between what we would 434 00:24:14,170 --> 00:24:16,310 get in the model with what we got experimentally. 435 00:24:21,816 --> 00:24:23,190 All right, the question is can we 436 00:24:23,190 --> 00:24:26,340 use the same mean s for a uniform 437 00:24:26,340 --> 00:24:28,532 as we did for the delta function. 438 00:24:28,532 --> 00:24:30,680 And if you say no, you have to say why not. 439 00:24:30,680 --> 00:24:31,370 A is yes. 440 00:24:31,370 --> 00:24:31,880 B is no. 441 00:24:34,661 --> 00:24:35,160 Ready? 442 00:24:38,180 --> 00:24:42,074 Three, two, one. 443 00:24:42,074 --> 00:24:44,220 All right, so we have a bunch of no's. 444 00:24:44,220 --> 00:24:45,030 Maybe a few yes's. 445 00:24:45,030 --> 00:24:48,570 All right but then some of the no's, 446 00:24:48,570 --> 00:24:50,104 it's incumbent on you [INAUDIBLE]. 447 00:24:50,104 --> 00:24:50,770 So I don't know. 448 00:24:50,770 --> 00:24:54,992 So yes, so one of the no's, why not? 
449 00:24:54,992 --> 00:24:57,457 AUDIENCE: So if we have [INAUDIBLE] 450 00:24:57,457 --> 00:25:00,168 if it has the same average, then there 451 00:25:00,168 --> 00:25:02,217 are going to be outliers that are 452 00:25:02,217 --> 00:25:03,931 better for the [INAUDIBLE]. 453 00:25:03,931 --> 00:25:04,930 PROFESSOR: That's right. 454 00:25:04,930 --> 00:25:07,513 So the problem here is that, if we use the same mean selection 455 00:25:07,513 --> 00:25:10,500 coefficient for the uniform, 456 00:25:10,500 --> 00:25:12,000 then what we're going to end up with 457 00:25:12,000 --> 00:25:15,326 is something that comes out twice as far. 458 00:25:15,326 --> 00:25:15,951 AUDIENCE: Wait. 459 00:25:15,951 --> 00:25:19,879 But the delta function is just a uniform distribution with 0. 460 00:25:19,879 --> 00:25:20,702 AUDIENCE: Yeah. 461 00:25:20,702 --> 00:25:23,160 AUDIENCE: So you could always just fit it with [INAUDIBLE]. 462 00:25:23,160 --> 00:25:23,820 PROFESSOR: Yeah, we're assuming it's 463 00:25:23,820 --> 00:25:25,380 a uniform distribution that starts at 0 464 00:25:25,380 --> 00:25:27,005 and then goes out to some [? amount ?]. 465 00:25:29,379 --> 00:25:31,560 Yeah. 466 00:25:31,560 --> 00:25:34,440 So the idea is that-- I mean, whatever. 467 00:25:34,440 --> 00:25:35,330 That's the model. 468 00:25:35,330 --> 00:25:37,620 And I think it's a reasonable model 469 00:25:37,620 --> 00:25:40,350 because we know that there are a lot of mutations 470 00:25:40,350 --> 00:25:41,480 that have little effect. 471 00:25:41,480 --> 00:25:44,670 So it makes sense for a distribution to start at 0, 472 00:25:44,670 --> 00:25:47,231 if you're going to have something like a uniform. 473 00:25:47,231 --> 00:25:47,730 Right?
474 00:25:47,730 --> 00:25:50,420 And the problem with such a uniform distribution 475 00:25:50,420 --> 00:25:53,564 is that because there's going to be some clonal interference, 476 00:25:53,564 --> 00:25:55,730 what that means is that you're going to be, kind of, 477 00:25:55,730 --> 00:25:57,270 weighted out here. 478 00:25:57,270 --> 00:25:58,770 And that means that you'll, kind of, 479 00:25:58,770 --> 00:26:01,310 see mutations that are out here around 10%, 480 00:26:01,310 --> 00:26:05,050 instead of around 5%. 481 00:26:05,050 --> 00:26:06,624 So what you actually want, then, is 482 00:26:06,624 --> 00:26:08,040 something where the mean selection 483 00:26:08,040 --> 00:26:09,760 coefficient is around half of what 484 00:26:09,760 --> 00:26:11,650 you had as the delta function. 485 00:26:11,650 --> 00:26:14,180 So you want something that really looks more like this. 486 00:26:16,750 --> 00:26:18,785 And that's actually why, if you look at the data 487 00:26:18,785 --> 00:26:23,774 or if you look at this figure, then the area that 488 00:26:23,774 --> 00:26:25,190 works for the uniform distribution 489 00:26:25,190 --> 00:26:27,900 has a mean selection coefficient of 3%. 490 00:26:27,900 --> 00:26:31,120 So this thing comes out to around 6% here. 491 00:26:31,120 --> 00:26:37,380 So just beyond the s corresponding to the delta function. 492 00:26:37,380 --> 00:26:38,650 So this is the delta. 493 00:26:38,650 --> 00:26:41,470 And this is the uniform. 494 00:26:41,470 --> 00:26:43,552 So here is around 6%. 495 00:26:43,552 --> 00:26:45,468 And the mean of this is [? at half ?] of that. 496 00:26:45,468 --> 00:26:47,700 Right? 497 00:26:47,700 --> 00:26:50,420 What's happening is that there's some clonal interference.
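[Editor's sketch] The uniform-versus-delta argument can be checked numerically. The following is a minimal Python sketch, not the paper's model: it assumes a uniform distribution on 0 to 6% (mean 3%), a survival probability proportional to s, and that the fittest surviving lineage wins. The function names, the number of competing candidates, and the rescaling of the survival probability by s_max are all illustrative assumptions.

```python
import random

def winning_mutation(s_max, n_candidates, rng):
    """Selection coefficient of the lineage that wins one competition.

    Assumptions (hypothetical, for illustration only): candidates are
    uniform on [0, s_max]; each survives stochastic extinction with
    probability s / s_max (proportional to s, rescaled so the loop ends
    quickly); the fittest survivor takes over.
    """
    while True:
        draws = [rng.uniform(0.0, s_max) for _ in range(n_candidates)]
        survivors = [s for s in draws if rng.random() < s / s_max]
        if survivors:
            return max(survivors)

def mean_winner(s_max=0.06, n_candidates=4, trials=20000, seed=0):
    """Average selection coefficient of the winning mutation."""
    rng = random.Random(seed)
    total = sum(winning_mutation(s_max, n_candidates, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # The underlying mean is s_max / 2 = 3%; the winners sit well above it.
    print(f"mean of winners: {mean_winner():.4f}")
```

With only a handful of competitors the winners already cluster toward the upper end of the uniform, which is why the fitted uniform has roughly half the mean s of the delta function.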
498 00:26:50,420 --> 00:26:53,450 And you only need a modest amount of clonal interference 499 00:26:53,450 --> 00:26:56,120 because if you sample from a uniform distribution just a few, 500 00:26:56,120 --> 00:27:03,130 then the most fit one will be around here. 501 00:27:03,130 --> 00:27:06,190 And it'll already be relatively peaked. 502 00:27:06,190 --> 00:27:08,020 And of course, there's also this issue 503 00:27:08,020 --> 00:27:09,978 that you have to survive stochastic extinction. 504 00:27:09,978 --> 00:27:12,210 So that amplifies the effect further. 505 00:27:12,210 --> 00:27:16,152 So really, if you just have, say, two mutations sampled 506 00:27:16,152 --> 00:27:18,360 from the uniform that survive stochastic extinction, 507 00:27:18,360 --> 00:27:19,960 you're already going to get something 508 00:27:19,960 --> 00:27:24,020 that's peaked around there. 509 00:27:24,020 --> 00:27:25,422 Does that make sense? 510 00:27:25,422 --> 00:27:25,922 Yeah. 511 00:27:25,922 --> 00:27:27,463 AUDIENCE: But what if we [INAUDIBLE]? 512 00:27:29,709 --> 00:27:32,000 PROFESSOR: So then, the question is what exact mutation 513 00:27:32,000 --> 00:27:32,740 rate do you need. 514 00:27:32,740 --> 00:27:38,240 And basically, you need a high enough mutation rate 515 00:27:38,240 --> 00:27:40,130 that you get some mutations and that you can 516 00:27:40,130 --> 00:27:42,507 have some clonal interference. 517 00:27:42,507 --> 00:27:44,090 And the question is, what prevents you 518 00:27:44,090 --> 00:27:47,980 from having a mutation rate that's too high, maybe? 519 00:27:47,980 --> 00:27:51,310 AUDIENCE: My question was why do you need clonal interference 520 00:27:51,310 --> 00:27:54,084 to explain the data. 521 00:27:54,084 --> 00:27:55,920 PROFESSOR: Ah, right. 522 00:27:55,920 --> 00:28:00,076 AUDIENCE: I mean, of course, you have these [INAUDIBLE] focused 523 00:28:00,076 --> 00:28:01,299 on what happened early on.
524 00:28:01,299 --> 00:28:02,540 PROFESSOR: That's right. 525 00:28:02,540 --> 00:28:05,050 Yeah. 526 00:28:05,050 --> 00:28:05,550 Right, OK. 527 00:28:05,550 --> 00:28:08,610 So then it comes down to how peaked 528 00:28:08,610 --> 00:28:13,645 this distribution of slopes is because there 529 00:28:13,645 --> 00:28:15,625 are very few shallow slopes. 530 00:28:15,625 --> 00:28:18,000 And of course, then, it gets into questions about quality 531 00:28:18,000 --> 00:28:19,370 of data and so forth. 532 00:28:19,370 --> 00:28:21,120 And that's more subtle. 533 00:28:21,120 --> 00:28:23,845 But certainly in principle, in the absence 534 00:28:23,845 --> 00:28:27,310 of clonal interference, you would have some fair number 535 00:28:27,310 --> 00:28:29,654 of shallow slopes. 536 00:28:29,654 --> 00:28:31,070 They would be underrepresented just 537 00:28:31,070 --> 00:28:33,080 because of the stochastic extinction business. 538 00:28:33,080 --> 00:28:36,215 But still, you need some to explain the, sort of, 539 00:28:36,215 --> 00:28:38,590 peakiness of that slope distribution. 540 00:28:42,610 --> 00:28:44,540 Now the exponential is interesting 541 00:28:44,540 --> 00:28:48,570 because it's in a very, very different regime. 542 00:28:48,570 --> 00:28:54,660 So I just want to show this is mean s for the uniform. 543 00:28:54,660 --> 00:28:56,970 And actually, if you look at the figure 544 00:28:56,970 --> 00:29:00,140 of the mean s for the exponential, 545 00:29:00,140 --> 00:29:05,030 it's down there around 1%, which is a little bit 546 00:29:05,030 --> 00:29:08,530 surprising because what this is saying 547 00:29:08,530 --> 00:29:11,320 is that, if you look at this distribution, 548 00:29:11,320 --> 00:29:16,350 this initial slope, kind of, extends down here. 549 00:29:16,350 --> 00:29:21,130 And you have something that falls off, 550 00:29:21,130 --> 00:29:23,710 dramatically, though, right. 551 00:29:23,710 --> 00:29:24,210 OK.
552 00:29:24,210 --> 00:29:26,040 So how is it possible that you could 553 00:29:26,040 --> 00:29:28,310 use such a distribution that's peaked over here 554 00:29:28,310 --> 00:29:31,270 that's so far over on the left and still explain 555 00:29:31,270 --> 00:29:32,393 the same data? 556 00:29:32,393 --> 00:29:33,309 AUDIENCE: [INAUDIBLE]. 557 00:29:35,870 --> 00:29:36,870 PROFESSOR: That's right. 558 00:29:36,870 --> 00:29:39,810 So it's true that I've drawn this, kind of, around 0. 559 00:29:39,810 --> 00:29:41,470 But the exponential, in principle, 560 00:29:41,470 --> 00:29:41,880 goes to [? infinity ?]. 561 00:29:41,880 --> 00:29:43,630 It's just that it falls off exponentially. 562 00:29:43,630 --> 00:29:46,490 But what this means is that you're sampling pretty far 563 00:29:46,490 --> 00:29:52,500 out on the exponential in order to get the same mean effect. 564 00:29:52,500 --> 00:29:55,790 So you're actually going out to five or six times 565 00:29:55,790 --> 00:29:57,690 this characteristic s. 566 00:29:57,690 --> 00:30:00,170 So you're talking e to the minus 5. 567 00:30:00,170 --> 00:30:00,760 Right? 568 00:30:00,760 --> 00:30:02,718 That means there's a lot of clonal interference 569 00:30:02,718 --> 00:30:05,530 that has to be happening in order to explain this, right? 570 00:30:05,530 --> 00:30:07,500 But why is it that it's way over here? 571 00:30:07,500 --> 00:30:10,990 I mean, why not just use an exponential with an s that's 572 00:30:10,990 --> 00:30:12,480 more like 3, 4, 5, 6%? 573 00:30:17,400 --> 00:30:19,120 What's that? 574 00:30:19,120 --> 00:30:19,620 Well OK. 575 00:30:19,620 --> 00:30:21,470 But, you know, I mean, this is just a model. 576 00:30:21,470 --> 00:30:25,120 I can do whatever I want because, actually, they fit 577 00:30:25,120 --> 00:30:26,740 their data with these models. 578 00:30:26,740 --> 00:30:30,890 So realistic just means that it explains their data.
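[Editor's sketch] The "e to the minus 5" remark can be made concrete. For an exponential distribution, the fraction of draws that land beyond k times the mean is e^(-k), so roughly e^k draws are needed before one is expected that far out. The short Python sketch below just evaluates those two numbers; the function names are illustrative, and it ignores the extra weighting from stochastic extinction.

```python
import math

def tail_fraction(k):
    """Fraction of draws from an exponential that exceed k times its mean."""
    return math.exp(-k)

def draws_needed(k):
    """Rough number of draws before one is expected beyond k times the mean."""
    return 1.0 / tail_fraction(k)

if __name__ == "__main__":
    for k in (1, 3, 5, 6):
        print(f"k={k}: tail fraction {tail_fraction(k):.4f}, "
              f"about {draws_needed(k):.0f} draws needed")
```

Reaching five times the characteristic s therefore takes on the order of a hundred or more competing mutations, which is the sense in which the exponential fit requires a lot of clonal interference.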
579 00:30:30,890 --> 00:30:33,870 So there's nothing a priori wrong with saying, oh, 580 00:30:33,870 --> 00:30:38,270 here's an exponential with a characteristic fall off of 5%. 581 00:30:38,270 --> 00:30:40,252 That's, in principle, fine. 582 00:30:40,252 --> 00:30:41,960 I mean, it doesn't work, for some reason, 583 00:30:41,960 --> 00:30:45,240 but we have to figure out why. 584 00:30:45,240 --> 00:30:45,740 Yeah. 585 00:30:45,740 --> 00:30:46,690 AUDIENCE: [INAUDIBLE]. 586 00:30:49,540 --> 00:30:50,440 PROFESSOR: OK, right. 587 00:30:50,440 --> 00:30:54,852 So you're likely going to get a low selection coefficient, OK. 588 00:30:54,852 --> 00:30:55,734 Is that the problem? 589 00:30:55,734 --> 00:30:56,650 AUDIENCE: [INAUDIBLE]. 590 00:31:03,564 --> 00:31:05,000 PROFESSOR: OK, that's true. 591 00:31:05,000 --> 00:31:12,766 But I guess the question is why can't we use? 592 00:31:12,766 --> 00:31:15,664 AUDIENCE: [INAUDIBLE]. 593 00:31:15,664 --> 00:31:17,119 PROFESSOR: Two mutations? 594 00:31:17,119 --> 00:31:17,660 Two clusters? 595 00:31:17,660 --> 00:31:18,512 What do you mean? 596 00:31:18,512 --> 00:31:19,496 AUDIENCE: [INAUDIBLE]. 597 00:31:26,384 --> 00:31:28,370 PROFESSOR: OK, the time between two. 598 00:31:28,370 --> 00:31:30,460 OK, that's an interesting statement. 599 00:31:30,460 --> 00:31:34,610 Although, the region of parameter space that they claim 600 00:31:34,610 --> 00:31:38,300 works is actually a region where we have a really high mutation 601 00:31:38,300 --> 00:31:39,230 rate. 602 00:31:39,230 --> 00:31:41,730 Orders of magnitude higher than the other two distributions.
603 00:31:41,730 --> 00:31:43,640 So in that sense, the time between mutations 604 00:31:43,640 --> 00:31:46,570 being established is really small 605 00:31:46,570 --> 00:31:48,444 because they're saying that, oh, if you 606 00:31:48,444 --> 00:31:50,110 want to fit the [INAUDIBLE] exponential, 607 00:31:50,110 --> 00:31:52,276 then you have to assume lots of clonal interference. 608 00:31:52,276 --> 00:31:55,590 So that means the time between successive establishments 609 00:31:55,590 --> 00:31:58,560 of mutations is really, actually, very short. 610 00:31:58,560 --> 00:32:03,190 So that's actually the regime that they claim works. 611 00:32:03,190 --> 00:32:04,690 So the question is why is it that we 612 00:32:04,690 --> 00:32:08,074 can't go to this other regime? 613 00:32:08,074 --> 00:32:09,490 In this figure that they make, why 614 00:32:09,490 --> 00:32:13,810 is it that the allowed mean selection coefficient doesn't extend out 615 00:32:13,810 --> 00:32:16,494 to over there? 616 00:32:16,494 --> 00:32:16,994 Yeah. 617 00:32:16,994 --> 00:32:17,956 AUDIENCE: [INAUDIBLE]. 618 00:32:24,209 --> 00:32:25,160 PROFESSOR: Right. 619 00:32:25,160 --> 00:32:27,120 AUDIENCE: Because the exponential is 620 00:32:27,120 --> 00:32:29,080 [INAUDIBLE] that [INAUDIBLE]. 621 00:32:29,080 --> 00:32:30,660 PROFESSOR: That's right. 622 00:32:30,660 --> 00:32:31,960 Yeah, that's right. 623 00:32:31,960 --> 00:32:33,940 The point is that, to explain their data, 624 00:32:33,940 --> 00:32:36,360 you need something to cause this distribution 625 00:32:36,360 --> 00:32:38,340 to get more peaked over here, which 626 00:32:38,340 --> 00:32:41,340 means you need to have a fair amount of clonal interference. 627 00:32:41,340 --> 00:32:45,510 But that means you need to have a high mutation rate.
628 00:32:45,510 --> 00:32:47,260 But once you have that high mutation rate, 629 00:32:47,260 --> 00:32:49,090 if you have an exponential that comes out here, 630 00:32:49,090 --> 00:32:51,420 then you would actually sample way out here, as well. 631 00:32:51,420 --> 00:32:53,742 So the only way to get a peaked distribution around here 632 00:32:53,742 --> 00:32:55,700 is to have it so that the exponential is really 633 00:32:55,700 --> 00:32:57,790 suppressing those really good mutations. 634 00:32:57,790 --> 00:33:01,020 But you have a lot of clonal interference 635 00:33:01,020 --> 00:33:03,650 that, kind of, pulls things out. 636 00:33:03,650 --> 00:33:08,610 Now there still is a fair range of parameters that work here. 637 00:33:08,610 --> 00:33:11,390 You know, it goes from, say, half a percent up 638 00:33:11,390 --> 00:33:14,676 to, maybe, 1 and 1/2% in terms of this mean selection 639 00:33:14,676 --> 00:33:15,670 coefficient. 640 00:33:15,670 --> 00:33:19,430 And the mutation rate that works then changes. 641 00:33:19,430 --> 00:33:22,610 So as you get a larger mean selection coefficient, 642 00:33:22,610 --> 00:33:24,610 the mutation rate that is compatible 643 00:33:24,610 --> 00:33:27,830 goes down because you need less clonal interference. 644 00:33:27,830 --> 00:33:31,550 So that's why, if you look at this figure for the region that 645 00:33:31,550 --> 00:33:34,340 works in terms of the mutation rate 646 00:33:34,340 --> 00:33:37,900 for the beneficial mutation and the mean s, 647 00:33:37,900 --> 00:33:41,890 there's some region that looks, kind of, like this 648 00:33:41,890 --> 00:33:43,900 that works for the exponential. 649 00:33:43,900 --> 00:33:47,010 And this is, actually, a big range. 650 00:33:47,010 --> 00:33:51,980 So this is, actually, a factor of 100 in mutation rate 651 00:33:51,980 --> 00:33:53,250 that would be compatible.
652 00:33:53,250 --> 00:33:56,602 And then, a factor of, maybe, three in mean selection 653 00:33:56,602 --> 00:33:58,610 coefficient. 654 00:33:58,610 --> 00:34:00,934 So there's some range of parameters that work. 655 00:34:00,934 --> 00:34:03,100 And you can understand why it is that this thing has 656 00:34:03,100 --> 00:34:04,420 to be shaped the way it is. 657 00:34:09,069 --> 00:34:09,944 Does that make sense? 658 00:34:12,689 --> 00:34:13,677 Yeah. 659 00:34:13,677 --> 00:34:14,665 AUDIENCE: [INAUDIBLE]. 660 00:34:25,039 --> 00:34:29,239 PROFESSOR: Right because the uniform is out here. 661 00:34:29,239 --> 00:34:31,710 And then, the delta function is here. 662 00:34:31,710 --> 00:34:36,804 So this is delta, uniform, and exponential. 663 00:34:36,804 --> 00:34:39,345 And you're just saying that it all, [? kind of ?], following. 664 00:34:39,345 --> 00:34:40,844 Is that-- yeah. 665 00:34:40,844 --> 00:34:41,838 AUDIENCE: [INAUDIBLE]. 666 00:34:54,760 --> 00:34:58,180 PROFESSOR: Yeah, well maybe some power law fall off 667 00:34:58,180 --> 00:35:01,611 would do that. 668 00:35:01,611 --> 00:35:03,601 AUDIENCE: [INAUDIBLE]. 669 00:35:03,601 --> 00:35:05,600 PROFESSOR: There's nothing magic 670 00:35:05,600 --> 00:35:08,280 about these three regions. 671 00:35:08,280 --> 00:35:12,016 And it's not that we're claiming that the mean selection 672 00:35:12,016 --> 00:35:14,157 coefficient cannot be in here. 673 00:35:14,157 --> 00:35:16,240 It's just that, if you pick these three underlying 674 00:35:16,240 --> 00:35:18,450 distributions, you get this range of different values 675 00:35:18,450 --> 00:35:19,116 that would work. 676 00:35:19,116 --> 00:35:21,100 So if you chose other underlying distributions, 677 00:35:21,100 --> 00:35:23,380 you could get other blobs.
678 00:35:23,380 --> 00:35:25,450 But it's true that there is a general trend 679 00:35:25,450 --> 00:35:27,610 that, the higher the mean selection coefficient, 680 00:35:27,610 --> 00:35:29,750 the less clonal interference you need 681 00:35:29,750 --> 00:35:31,821 or want to explain the data. 682 00:35:31,821 --> 00:35:32,775 AUDIENCE: [INAUDIBLE]. 683 00:35:36,591 --> 00:35:37,330 PROFESSOR: Right. 684 00:35:37,330 --> 00:35:40,231 Why can't it go higher or lower? 685 00:35:40,231 --> 00:35:40,730 Yeah. 686 00:35:43,598 --> 00:35:44,830 I don't know. 687 00:35:44,830 --> 00:35:47,310 A factor of 100 isn't enough for you? 688 00:35:47,310 --> 00:35:49,280 Yeah, it's a good question. 689 00:35:49,280 --> 00:35:52,240 I'd have to think about it to figure out which-- 690 00:35:52,240 --> 00:35:54,740 because there's going to be a different effect on each side, 691 00:35:54,740 --> 00:35:55,239 presumably. 692 00:35:59,310 --> 00:36:01,374 Yeah, in all of these distributions, 693 00:36:01,374 --> 00:36:02,790 there's some floor just because we 694 00:36:02,790 --> 00:36:06,980 know that you have to get these beneficial mutations 695 00:36:06,980 --> 00:36:09,550 within the first few tens of generations. 696 00:36:09,550 --> 00:36:12,220 Otherwise, the mutation wouldn't have gotten a chance 697 00:36:12,220 --> 00:36:15,240 to spread when it did. 698 00:36:15,240 --> 00:36:16,670 So that means we know that there's 699 00:36:16,670 --> 00:36:18,490 going to be a lower bound on the mutation rate, 700 00:36:18,490 --> 00:36:22,590 always, because, if it's too low, then we 701 00:36:22,590 --> 00:36:24,810 wouldn't have gotten the mutations in time. 702 00:36:24,810 --> 00:36:30,817 And now, in terms of why it can't be higher, yeah, 703 00:36:30,817 --> 00:36:37,180 I'd have to think about it. 704 00:36:37,180 --> 00:36:39,280 Are there any other questions about this paper?
705 00:36:39,280 --> 00:36:41,190 I think it's a challenging paper, kind of, 706 00:36:41,190 --> 00:36:43,860 conceptually/mathematically. 707 00:36:43,860 --> 00:36:45,900 But I think it's interesting because it 708 00:36:45,900 --> 00:36:48,875 does get you to think about this process of clonal interference 709 00:36:48,875 --> 00:36:49,375 in new ways. 710 00:36:53,582 --> 00:36:54,570 AUDIENCE: [INAUDIBLE]. 711 00:36:54,570 --> 00:36:55,558 PROFESSOR: Yeah. 712 00:36:55,558 --> 00:36:57,534 AUDIENCE: So is that enough? 713 00:36:57,534 --> 00:37:03,462 Like, is this condition rate known [INAUDIBLE]? 714 00:37:03,462 --> 00:37:04,546 PROFESSOR: Oh, I see. 715 00:37:04,546 --> 00:37:05,670 Oh, that's a good question. 716 00:37:05,670 --> 00:37:06,674 Yeah. 717 00:37:06,674 --> 00:37:08,090 Yeah, so I think that the mutation 718 00:37:08,090 --> 00:37:13,730 rates that-- first of all, this is not a per base pair mutation 719 00:37:13,730 --> 00:37:14,400 rate. 720 00:37:14,400 --> 00:37:17,225 This is the rate at which you get beneficial mutations. 721 00:37:20,180 --> 00:37:23,172 So I'd say, the numbers are not ridiculous. 722 00:37:23,172 --> 00:37:24,630 But it's not that you can just take 723 00:37:24,630 --> 00:37:28,340 the known per base pair mutation rate and say, oh, well 724 00:37:28,340 --> 00:37:31,310 it has to be here because an awful lot of them 725 00:37:31,310 --> 00:37:33,660 are deleterious. 726 00:37:33,660 --> 00:37:35,767 And it's very sensitive because the mutations that 727 00:37:35,767 --> 00:37:37,350 are occurring around here don't really 728 00:37:37,350 --> 00:37:40,484 matter for evolution because they tend not to survive 729 00:37:40,484 --> 00:37:41,400 stochastic extinction. 730 00:37:41,400 --> 00:37:43,220 They're not going to survive clonal interference. 731 00:37:43,220 --> 00:37:44,900 Yet, it could be a big part of the distribution. 732 00:37:44,900 --> 00:37:45,399 Right?
733 00:37:45,399 --> 00:37:48,290 It could be that the majority of the mutations-- for example, 734 00:37:48,290 --> 00:37:50,410 in the exponential-- is there. 735 00:37:50,410 --> 00:37:53,899 So you can actually have very different distributions 736 00:37:53,899 --> 00:37:55,940 that don't really change the evolutionary process 737 00:37:55,940 --> 00:37:59,330 but would change the rate of beneficial mutations. 738 00:37:59,330 --> 00:38:02,209 So I think the numbers are not ridiculous. 739 00:38:02,209 --> 00:38:03,750 But it's hard to constrain, actually. 740 00:38:11,740 --> 00:38:13,919 What I want to do is, before talking 741 00:38:13,919 --> 00:38:15,460 about these rugged fitness landscapes 742 00:38:15,460 --> 00:38:18,020 and the Weinreich paper, let's just say something 743 00:38:18,020 --> 00:38:21,660 about the rate of evolution. 744 00:38:26,020 --> 00:38:27,990 And when we say rate, we're referring 745 00:38:27,990 --> 00:38:31,266 to the change in the mean fitness of the population 746 00:38:31,266 --> 00:38:32,140 with respect to time. 747 00:38:41,469 --> 00:38:44,850 All right, so this is the rate of evolution. 748 00:38:48,170 --> 00:38:52,240 So this is the change in the mean fitness. 749 00:38:52,240 --> 00:38:54,220 Delta. 750 00:38:54,220 --> 00:38:58,700 Delta mean fitness divided by delta time. 751 00:39:01,460 --> 00:39:03,025 So what we want to do is just start 752 00:39:03,025 --> 00:39:05,780 by thinking about a situation where 753 00:39:05,780 --> 00:39:10,060 we assume that we're not all running out of new mutations 754 00:39:10,060 --> 00:39:11,090 that are good for us. 755 00:39:11,090 --> 00:39:14,840 All right, so what we can do is just assume that, at some rate, 756 00:39:14,840 --> 00:39:21,580 mu-- so there's a rate, mu beneficial, we'll say. 757 00:39:21,580 --> 00:39:23,990 We sample from some probability distribution 758 00:39:23,990 --> 00:39:27,010 of beneficial mutations.
759 00:39:27,010 --> 00:39:31,300 And then, something happens: stochastic extinction. 760 00:39:31,300 --> 00:39:33,290 Maybe clonal interference and whatnot. 761 00:39:33,290 --> 00:39:34,900 But some one of them will fix. 762 00:39:34,900 --> 00:39:36,441 And then, that increases the fitness. 763 00:39:36,441 --> 00:39:41,180 And then, for now, we'll just assume that the fitnesses add. 764 00:39:41,180 --> 00:39:42,780 But for small s's, it doesn't matter 765 00:39:42,780 --> 00:39:43,730 whether we're thinking about fitness as 766 00:39:43,730 --> 00:39:45,005 adding or multiplying. 767 00:39:47,720 --> 00:39:50,300 You guys understand what I just said there? 768 00:39:50,300 --> 00:39:57,461 So if you get a mutation that has effect s1 and maybe 769 00:39:57,461 --> 00:39:59,210 the right way to think about this is that, 770 00:39:59,210 --> 00:40:02,425 if you get a mutation s2, then the fitnesses, perhaps, 771 00:40:02,425 --> 00:40:04,480 should multiply as the [? mu ?] model. 772 00:40:04,480 --> 00:40:06,870 But this is, of course, for a small s1 and s2, 773 00:40:06,870 --> 00:40:10,410 this is around 1 plus s1 plus s2. 774 00:40:10,410 --> 00:40:12,730 So for small s's for short times, maybe 775 00:40:12,730 --> 00:40:18,625 we don't need to worry about this because this is, for s1 776 00:40:18,625 --> 00:40:23,271 and s2, much less than 1. 777 00:40:23,271 --> 00:40:23,770 Yes? 778 00:40:23,770 --> 00:40:25,936 AUDIENCE: My intuition would be that, if you already 779 00:40:25,936 --> 00:40:28,004 had a 5% increase in fitness, 780 00:40:28,004 --> 00:40:29,896 it would be harder to get [INAUDIBLE]. 781 00:40:29,896 --> 00:40:31,439 PROFESSOR: That's right. 782 00:40:31,439 --> 00:40:33,022 So eventually, that certainly is going 783 00:40:33,022 --> 00:40:35,310 to be the case that we're going to run out 784 00:40:35,310 --> 00:40:37,560 of these beneficial mutations.
785 00:40:37,560 --> 00:40:44,440 But for the first few thousand generations, 786 00:40:44,440 --> 00:40:45,920 it's roughly linear. 787 00:40:45,920 --> 00:40:48,390 So eventually, it does start curving over. 788 00:40:48,390 --> 00:40:52,010 But maybe not as fast as you would have thought. 789 00:40:56,500 --> 00:41:00,230 And at the very least, this is a good null model. 790 00:41:00,230 --> 00:41:03,134 And then, we can, of course, complicate things later. 791 00:41:03,134 --> 00:41:05,050 But for now, we'll just assume that you always 792 00:41:05,050 --> 00:41:06,925 sample from the same probability distribution 793 00:41:06,925 --> 00:41:09,386 of beneficial mutations, just for simplicity. 794 00:41:09,386 --> 00:41:13,170 The question is, how fast will the fitness of the population 795 00:41:13,170 --> 00:41:15,835 increase with time? 796 00:41:15,835 --> 00:41:16,335 OK? 797 00:41:24,506 --> 00:41:26,005 Do you guys understand the question? 798 00:41:30,190 --> 00:41:32,840 All right, let's start by thinking about the regime where 799 00:41:32,840 --> 00:41:39,670 mu b N is much less than 1. 800 00:41:39,670 --> 00:41:47,860 So very low rates of mutation relative to the population 801 00:41:47,860 --> 00:41:50,010 size. 802 00:41:50,010 --> 00:41:55,390 Now you might recall-- for clonal interference. 803 00:41:58,710 --> 00:42:02,400 So clonal interference not relevant. 804 00:42:02,400 --> 00:42:05,280 What we found last time was it required 805 00:42:05,280 --> 00:42:10,370 that the time that it took for a mutation to fix 806 00:42:10,370 --> 00:42:12,530 had to be much less than the time 807 00:42:12,530 --> 00:42:16,570 between successive establishments 808 00:42:16,570 --> 00:42:19,180 of these beneficial mutations. 809 00:42:19,180 --> 00:42:22,910 And this was 1 over s times log of N s. 810 00:42:31,890 --> 00:42:34,040 You should be able to derive both of these. 811 00:42:34,040 --> 00:42:39,255 This and that and this, and the next step, as well.
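[Editor's sketch] The condition for ignoring clonal interference compares two timescales. The Python sketch below uses the fixation time stated in the lecture, (1/s) log(N s), and assumes a time between successive establishments of 1/(mu_b N s), which follows if each beneficial mutation establishes with probability of about s. That establishment probability, the function names, and the safety margin standing in for "much less than" are assumptions for illustration.

```python
import math

def fixation_time(s, N):
    """Generations for an established mutation to fix: (1/s) * ln(N s)."""
    return math.log(N * s) / s

def establishment_interval(mu_b, N, s):
    """Generations between successive establishments.

    Assumes mu_b * N new beneficial mutations per generation, each
    surviving stochastic extinction with probability ~ s (an assumption).
    """
    return 1.0 / (mu_b * N * s)

def clonal_interference_negligible(mu_b, N, s, margin=10.0):
    """True when fixation is faster than establishment by a chosen margin."""
    return margin * fixation_time(s, N) < establishment_interval(mu_b, N, s)

if __name__ == "__main__":
    N, s = 1e4, 0.05
    for mu_b in (1e-7, 1e-5, 1e-3):
        print(mu_b, clonal_interference_negligible(mu_b, N, s))
```

Under these assumptions the ratio of the two times is mu_b N log(N s), so the condition reduces to mu_b N being small up to a logarithmic factor.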
812 00:42:51,264 --> 00:42:53,964 All right, so you can ignore clonal interference 813 00:42:53,964 --> 00:42:54,630 if this is true. 814 00:42:57,430 --> 00:43:02,483 So let's say that we can ignore clonal interference. 815 00:43:08,640 --> 00:43:10,660 So for small population sizes or in the limit 816 00:43:10,660 --> 00:43:13,080 of low mutation rates. 817 00:43:13,080 --> 00:43:15,750 What we want to know is the rate of evolution. 818 00:43:15,750 --> 00:43:17,585 How will it scale with various things? 819 00:43:23,230 --> 00:43:25,340 I'll go ahead and have us vote for-- OK. 820 00:43:35,040 --> 00:43:37,155 How does it scale with, in particular, both 821 00:43:37,155 --> 00:43:39,970 the mutation rate and the population size? 822 00:43:51,307 --> 00:43:52,390 It's proportional to what? 823 00:44:12,624 --> 00:44:14,040 Holding other things constant. 824 00:44:14,040 --> 00:44:19,170 Holding, for example, the distribution and N constant. 825 00:44:19,170 --> 00:44:19,670 OK? 826 00:44:26,144 --> 00:44:30,050 Do you understand the question? 827 00:44:30,050 --> 00:44:30,550 Yes. 828 00:44:30,550 --> 00:44:33,427 AUDIENCE: [INAUDIBLE]. 829 00:44:33,427 --> 00:44:34,010 PROFESSOR: OK. 830 00:44:34,010 --> 00:44:37,590 So this is just the rate of beneficial mutations. 831 00:44:37,590 --> 00:44:39,132 And this is just to the 0th power. 832 00:44:39,132 --> 00:44:40,090 i.e., it doesn't depend. 833 00:44:40,090 --> 00:44:42,280 AUDIENCE: Oh, OK. 834 00:44:42,280 --> 00:44:43,815 PROFESSOR: Linearly, squared. 835 00:44:54,292 --> 00:44:55,776 All right, ready? 836 00:44:55,776 --> 00:44:56,742 AUDIENCE: [INAUDIBLE]. 837 00:44:59,574 --> 00:45:02,120 PROFESSOR: Yeah, assuming that clonal interference is not 838 00:45:02,120 --> 00:45:04,150 relevant. 839 00:45:04,150 --> 00:45:06,130 So assuming, for a small mutation rate, 840 00:45:06,130 --> 00:45:09,535 how does it scale in mutation rate? 841 00:45:09,535 --> 00:45:12,260 OK, ready.
842 00:45:12,260 --> 00:45:14,790 Three, two, one. 843 00:45:17,520 --> 00:45:23,260 All right, so we got a lot of B's, which is nice. 844 00:45:23,260 --> 00:45:27,440 So this is saying that, indeed, if mutation rate is small, 845 00:45:27,440 --> 00:45:30,120 then there's going to be some rate that mutations 846 00:45:30,120 --> 00:45:31,610 enter into the population. 847 00:45:31,610 --> 00:45:35,190 They may or may not survive stochastic extinction. 848 00:45:35,190 --> 00:45:37,760 But if they do, and that doesn't depend on mutation rate, 849 00:45:37,760 --> 00:45:40,720 then they can fix. 850 00:45:40,720 --> 00:45:42,560 And in this low mutation rate regime, 851 00:45:42,560 --> 00:45:45,550 they don't compete with each other. 852 00:45:45,550 --> 00:45:48,130 In which case, if you double the rate of these things entering 853 00:45:48,130 --> 00:45:49,921 into the population, you'll double the rate 854 00:45:49,921 --> 00:45:51,020 that they get established. 855 00:45:51,020 --> 00:45:53,920 And you'll double the rate that these beneficial mutations will 856 00:45:53,920 --> 00:45:58,190 fix in the population. So you double the rate. 857 00:45:58,190 --> 00:46:00,080 OK? 858 00:46:00,080 --> 00:46:03,940 All right, now as a function of N. 859 00:46:03,940 --> 00:46:11,906 Does it go as N to the 0 or other? 860 00:46:11,906 --> 00:46:22,210 We got A, B, C, D. All right, I'll give you 15 seconds 861 00:46:22,210 --> 00:46:23,689 to think about it. 862 00:46:41,950 --> 00:46:42,620 Ready? 863 00:46:42,620 --> 00:46:47,290 Three, two, one. 864 00:46:47,290 --> 00:46:50,920 OK, so now we're getting more disagreement. 865 00:46:50,920 --> 00:46:51,790 All right. 866 00:46:51,790 --> 00:46:55,655 And I'd say, largely, A's and B's. 867 00:46:55,655 --> 00:46:57,280 All right, there's enough disagreement. 868 00:46:57,280 --> 00:47:00,290 Let's go ahead and spend just 30 seconds. 869 00:47:00,290 --> 00:47:01,846 Turn to your neighbor.
870 00:47:01,846 --> 00:47:05,332 [SIDE CONVERSATIONS] 871 00:47:35,310 --> 00:47:36,499 Per individual, yeah. 872 00:47:36,499 --> 00:47:38,495 [SIDE CONVERSATIONS] 873 00:48:15,199 --> 00:48:16,740 All right, let's go ahead and re-vote 874 00:48:16,740 --> 00:48:19,091 just so I can see where we are. 875 00:48:19,091 --> 00:48:20,590 How is it that the rate of evolution 876 00:48:20,590 --> 00:48:22,540 in this regime, no clonal interference, 877 00:48:22,540 --> 00:48:23,950 how is this going to scale with the population size? 878 00:48:23,950 --> 00:48:24,450 Ready? 879 00:48:24,450 --> 00:48:27,940 Three, two, one. 880 00:48:27,940 --> 00:48:28,440 OK. 881 00:48:28,440 --> 00:48:30,530 So, I'd say that we have not really convinced 882 00:48:30,530 --> 00:48:31,640 each other of anything. 883 00:48:31,640 --> 00:48:34,350 All right, so A's, B's? 884 00:48:34,350 --> 00:48:39,225 Somebody explain the reasoning. 885 00:48:39,225 --> 00:48:39,724 Yeah. 886 00:48:39,724 --> 00:48:48,060 AUDIENCE: [INAUDIBLE] mutation rate [INAUDIBLE] 887 00:48:48,060 --> 00:48:50,140 larger the population, the larger we're 888 00:48:50,140 --> 00:48:53,417 going to get [INAUDIBLE] population. 889 00:48:53,417 --> 00:48:54,000 PROFESSOR: OK. 890 00:48:54,000 --> 00:48:55,840 All right, so if you double the size of the population, 891 00:48:55,840 --> 00:48:57,605 you'll double the rate that new mutations enter 892 00:48:57,605 --> 00:48:58,438 into the population. 893 00:48:58,438 --> 00:49:00,260 AUDIENCE: And the other important thing 894 00:49:00,260 --> 00:49:04,845 is that the fixation time just goes, like, [INAUDIBLE]. 895 00:49:04,845 --> 00:49:05,720 PROFESSOR: OK, right. 896 00:49:05,720 --> 00:49:07,760 So then there's the fixation time business. 897 00:49:07,760 --> 00:49:09,980 So how is that relevant? 898 00:49:09,980 --> 00:49:16,015 Yeah, somebody that said A, what was your partner's reasoning? 899 00:49:22,400 --> 00:49:26,425 Or were you convinced by this argument?
900 00:49:26,425 --> 00:49:28,900 Or confused. 901 00:49:28,900 --> 00:49:32,855 AUDIENCE: Well if it's [INAUDIBLE], then the 902 00:49:32,855 --> 00:49:33,355 [INAUDIBLE]. 903 00:49:35,807 --> 00:49:36,390 PROFESSOR: Ah. 904 00:49:36,390 --> 00:49:38,556 Right, so if it's a nearly new [INAUDIBLE] mutation, 905 00:49:38,556 --> 00:49:42,075 then it's true that the [INAUDIBLE] would be 1/N but-- 906 00:49:42,075 --> 00:49:45,136 AUDIENCE: But if N is very large then, sometimes, 907 00:49:45,136 --> 00:49:46,534 s will not [INAUDIBLE]. 908 00:49:46,534 --> 00:49:48,280 PROFESSOR: OK, right. 909 00:49:48,280 --> 00:49:50,540 But now you're invoking N being large, 910 00:49:50,540 --> 00:49:54,689 which I don't think we necessarily want to do. 911 00:49:54,689 --> 00:49:56,980 I mean, I guess there are a couple things to say there. 912 00:49:56,980 --> 00:50:00,800 One is that the mutations that are really nearly neutral 913 00:50:00,800 --> 00:50:07,580 will not have a very significant effect on the fitness. 914 00:50:07,580 --> 00:50:12,870 And within that regime, I think, yeah, 915 00:50:12,870 --> 00:50:16,000 you'd have to check what happens there, right? 916 00:50:16,000 --> 00:50:21,980 But I think that, in most of these cases, small population, 917 00:50:21,980 --> 00:50:24,830 we're often saying population [? must be ?] 10 to the 4 918 00:50:24,830 --> 00:50:26,350 or so. 919 00:50:26,350 --> 00:50:28,560 In which case, the nearly neutral mutations 920 00:50:28,560 --> 00:50:31,100 are not very relevant. 921 00:50:31,100 --> 00:50:31,900 Right? 922 00:50:31,900 --> 00:50:34,620 So indeed, over a broad range of conditions, in this situation, 923 00:50:34,620 --> 00:50:41,610 it's going to scale as N. So the rate, so far, 924 00:50:41,610 --> 00:50:47,730 it's going to equal to there's a mu b times an N. 925 00:50:47,730 --> 00:50:52,780 And that's basically because the rate the new mutations enter 926 00:50:52,780 --> 00:50:57,880 into the populations is mu N. 
And the rate that they get 927 00:50:57,880 --> 00:50:59,500 established is just mu N s. 928 00:50:59,500 --> 00:51:03,420 This is, indeed, what this calculation is telling us. 929 00:51:03,420 --> 00:51:08,010 Now the question is, how much of a fitness gain will we get? 930 00:51:08,010 --> 00:51:11,696 How is it going to scale with this probability distribution? 931 00:51:11,696 --> 00:51:13,570 So this probability distribution will give us 932 00:51:13,570 --> 00:51:17,950 some function of s. 933 00:51:17,950 --> 00:51:19,910 So this distribution has to be relevant. 934 00:51:19,910 --> 00:51:21,470 Do we agree? 935 00:51:21,470 --> 00:51:24,750 So we want to know in what way it is relevant. 936 00:51:24,750 --> 00:51:35,440 Is it relevant via the mean, via the mean squared, 937 00:51:35,440 --> 00:51:50,090 the mean cubed-- I don't know-- the mean of s squared, or other? 938 00:51:50,090 --> 00:51:51,460 This one is harder. 939 00:51:51,460 --> 00:51:54,130 So it's worth spending, I'll give you, 940 00:51:54,130 --> 00:51:56,000 a full 30 seconds to think about it. 941 00:51:56,000 --> 00:51:58,083 So the question is, how is it that the probability 942 00:51:58,083 --> 00:52:00,320 distribution will enter into the rate of evolution 943 00:52:00,320 --> 00:52:01,070 in this situation? 944 00:52:03,690 --> 00:52:04,190 OK? 945 00:52:04,190 --> 00:52:05,476 Question. 946 00:52:05,476 --> 00:52:06,462 No? 947 00:52:06,462 --> 00:52:07,941 AUDIENCE: Can [INAUDIBLE]? 948 00:52:14,350 --> 00:52:16,000 PROFESSOR: I'm sorry, multiple what? 949 00:52:16,000 --> 00:52:17,659 AUDIENCE: Is it possible that it's 950 00:52:17,659 --> 00:52:19,320 entered in multiple situation? 951 00:52:19,320 --> 00:52:20,837 PROFESSOR: Oh, yeah. 952 00:52:20,837 --> 00:52:22,670 You know, this is why I gave you flashcards. 953 00:52:22,670 --> 00:52:25,800 You can put up any combination you want. 954 00:52:29,680 --> 00:52:30,560 Another 15 seconds.
955 00:52:30,560 --> 00:52:31,790 This one is trickier. 956 00:52:59,800 --> 00:53:02,100 Let's go ahead and vote. 957 00:53:02,100 --> 00:53:02,790 Ready? 958 00:53:02,790 --> 00:53:07,130 Three, two, one. 959 00:53:07,130 --> 00:53:11,140 All right, so we got a lot of A's and B's. 960 00:53:11,140 --> 00:53:12,630 Some, maybe, C's and D's. 961 00:53:16,380 --> 00:53:17,400 All right, so yeah. 962 00:53:17,400 --> 00:53:19,300 We're pretty far. 963 00:53:19,300 --> 00:53:20,960 OK, we're all over the place. 964 00:53:20,960 --> 00:53:24,230 All right, so there's going to be some distribution. 965 00:53:24,230 --> 00:53:31,220 There might be, for example, a probability distribution 966 00:53:31,220 --> 00:53:32,620 as a function of s. 967 00:53:32,620 --> 00:53:34,110 We know it's going to be something that's 968 00:53:34,110 --> 00:53:35,330 going to fall off in some way. 969 00:53:35,330 --> 00:53:36,080 Maybe exponential. 970 00:53:36,080 --> 00:53:38,330 Maybe something else. 971 00:53:38,330 --> 00:53:40,640 Now to keep track of this, what we need to do 972 00:53:40,640 --> 00:53:48,900 is remember that the probability of establishment goes as s. 973 00:53:48,900 --> 00:53:51,550 The probability of establishment. 974 00:53:51,550 --> 00:53:53,667 And in the [INAUDIBLE] model it is equal to s. 975 00:53:53,667 --> 00:53:55,000 In other models, it might be 2s. 976 00:53:55,000 --> 00:53:55,960 But it's around s. 977 00:53:58,960 --> 00:54:03,460 And then, the question is, how much of a benefit 978 00:54:03,460 --> 00:54:05,410 will you get if you do establish? 979 00:54:08,092 --> 00:54:09,670 All right. 980 00:54:09,670 --> 00:54:11,900 And that is, again, going to go as-- 981 00:54:11,900 --> 00:54:17,210 and whatever the delta fitness is, it's actually, again, equal 982 00:54:17,210 --> 00:54:21,540 to whatever s it is that you sampled. 983 00:54:21,540 --> 00:54:27,980 And for the given mutation that appeared, you sample somewhere.
984 00:54:27,980 --> 00:54:33,390 It has to both survive and, well, if it survives, then 985 00:54:33,390 --> 00:54:35,452 it gives you an s. 986 00:54:35,452 --> 00:54:37,160 So what that means is that, actually, you 987 00:54:37,160 --> 00:54:43,150 end up averaging s squared over the distribution 988 00:54:43,150 --> 00:54:45,140 to determine the rate of evolution. 989 00:54:49,810 --> 00:54:51,710 If it was just a delta function at s, 990 00:54:51,710 --> 00:54:53,380 then it's just an s squared. 991 00:54:53,380 --> 00:54:55,831 So then you think, oh, it's going to be the mean s, squared. 992 00:54:55,831 --> 00:54:56,330 Right? 993 00:54:56,330 --> 00:54:57,880 But if it's a delta function, then the mean 994 00:54:57,880 --> 00:55:00,171 of s squared and the mean s, squared, are the same thing. 995 00:55:02,930 --> 00:55:05,020 But in this situation, it's just useful to play 996 00:55:05,020 --> 00:55:06,395 with some different distributions 997 00:55:06,395 --> 00:55:13,320 and see how it plays out. 998 00:55:13,320 --> 00:55:18,380 But in this case, it is, indeed, the mean of s squared. 999 00:55:18,380 --> 00:55:25,210 So the rate of evolution in the limit of no clonal interference 1000 00:55:25,210 --> 00:55:27,691 is, actually, rather simple. 1001 00:55:27,691 --> 00:55:29,830 It just goes as mu N. But then, you 1002 00:55:29,830 --> 00:55:31,230 have to take the expectation of s 1003 00:55:31,230 --> 00:55:33,480 squared of whatever this distribution is 1004 00:55:33,480 --> 00:55:35,739 of underlying mutations. 1005 00:55:35,739 --> 00:55:36,239 OK? 1006 00:55:41,241 --> 00:55:42,990 But of course, this is going to break down 1007 00:55:42,990 --> 00:55:46,740 at some point, as everything does. 1008 00:55:46,740 --> 00:55:49,559 And can somebody remind us why it is 1009 00:55:49,559 --> 00:55:50,808 that it's going to break down? 1010 00:55:59,900 --> 00:56:01,450 Clonal interference. 1011 00:56:01,450 --> 00:56:03,130 Perfect.
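As a rough numerical sketch of the no-clonal-interference argument above: beneficial mutations arrive at rate mu_b times N, each one establishes with probability of about s, and an established mutation adds s to fitness, so the rate of fitness gain goes as mu_b N times the mean of s squared. The function name, the parameter values, and the sample effect sizes below are illustrative assumptions, not numbers from the lecture.

```python
def rate_of_fitness_gain(mu_b, N, s_samples):
    """Approximate rate of fitness gain with no clonal interference.

    mu_b      : beneficial mutation rate per individual (assumed value)
    N         : population size
    s_samples : selection coefficients drawn from the distribution of effects
    """
    # Each arriving mutation contributes (establishment probability ~ s)
    # times (fitness gain s), so we average s * s over the distribution.
    mean_s_squared = sum(s * s for s in s_samples) / len(s_samples)
    return mu_b * N * mean_s_squared


# Hypothetical numbers, chosen only to show the scaling.
base = rate_of_fitness_gain(mu_b=1e-5, N=1e4, s_samples=[0.02])
double_n = rate_of_fitness_gain(mu_b=1e-5, N=2e4, s_samples=[0.02])
double_mu = rate_of_fitness_gain(mu_b=2e-5, N=1e4, s_samples=[0.02])
print(base, double_n / base, double_mu / base)
```

Doubling either N or mu_b doubles the result, which is the slope-1 behavior on a log-log plot that holds only while lineages do not compete.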
1012 00:56:03,130 --> 00:56:06,330 And we know exactly where clonal interference is going 1013 00:56:06,330 --> 00:56:08,094 to start being relevant here. 1014 00:56:08,094 --> 00:56:09,510 In particular, what we can imagine 1015 00:56:09,510 --> 00:56:11,135 drawing is something that's [INAUDIBLE] 1016 00:56:11,135 --> 00:56:15,640 the rate of evolution as a function of N. 1017 00:56:15,640 --> 00:56:21,120 So rate as a function of N. And we might even 1018 00:56:21,120 --> 00:56:23,170 want to plot on a log-log scale. 1019 00:56:26,190 --> 00:56:31,450 So we'll say the log of the rate. 1020 00:56:31,450 --> 00:56:33,490 The log of N. 1021 00:56:33,490 --> 00:56:41,216 And at the beginning, what do I draw here for small N? 1022 00:56:45,980 --> 00:56:50,615 A line with slope 1. 1023 00:56:50,615 --> 00:56:51,590 OK. 1024 00:56:51,590 --> 00:56:53,700 If it had depended on N squared, then 1025 00:56:53,700 --> 00:56:57,000 what would I be drawing here? 1026 00:56:57,000 --> 00:57:00,290 A line with slope 2. 1027 00:57:00,290 --> 00:57:00,992 Yep. 1028 00:57:00,992 --> 00:57:02,950 Don't mess that up because it's really easy to. 1029 00:57:02,950 --> 00:57:06,250 OK, so it's a line with slope 1. 1030 00:57:06,250 --> 00:57:07,820 Right? 1031 00:57:07,820 --> 00:57:10,454 So at the beginning, for small population size, 1032 00:57:10,454 --> 00:57:11,870 if you double the population size, 1033 00:57:11,870 --> 00:57:14,050 you should double the rate of evolution. 1034 00:57:14,050 --> 00:57:16,482 But this can't go on forever, and it won't. 1035 00:57:16,482 --> 00:57:18,065 So it's going to curve over somewhere. 1036 00:57:21,146 --> 00:57:22,014 All right. 1037 00:57:22,014 --> 00:57:23,972 AUDIENCE: How can the rate go down [INAUDIBLE]? 1038 00:57:29,745 --> 00:57:32,245 PROFESSOR: The rate goes down relative to what it would have 1039 00:57:32,245 --> 00:57:34,280 if it had continued.
1040 00:57:34,280 --> 00:57:36,077 And that's because you're wasting 1041 00:57:36,077 --> 00:57:37,160 some beneficial mutations. 1042 00:57:37,160 --> 00:57:38,909 With clonal interference, what's happening 1043 00:57:38,909 --> 00:57:40,870 is that you have a great mutation over here 1044 00:57:40,870 --> 00:57:42,370 but also a great mutation over here. 1045 00:57:42,370 --> 00:57:44,850 And only one of those great mutations can win. 1046 00:57:44,850 --> 00:57:46,750 So with clonal interference, you're, 1047 00:57:46,750 --> 00:57:49,500 somehow, wasting some of the beneficial mutations 1048 00:57:49,500 --> 00:57:50,250 that you acquired. 1049 00:57:54,026 --> 00:57:55,599 AUDIENCE: Even though you're always 1050 00:57:55,599 --> 00:57:57,432 taking the maximum because what you're doing 1051 00:57:57,432 --> 00:57:59,204 is you're [INAUDIBLE]. 1052 00:57:59,204 --> 00:58:00,870 PROFESSOR: Right because the alternative 1053 00:58:00,870 --> 00:58:04,030 would have been to take the sum of them, which 1054 00:58:04,030 --> 00:58:05,320 is what happens over here. 1055 00:58:05,320 --> 00:58:07,520 If they're not competing, then you get one. 1056 00:58:07,520 --> 00:58:08,000 And you get the other. 1057 00:58:08,000 --> 00:58:09,458 And you just get higher and higher. 1058 00:58:09,458 --> 00:58:12,820 With clonal interference, you indeed take the maximum. 1059 00:58:12,820 --> 00:58:14,130 But that's lower than the sum. 1060 00:58:16,782 --> 00:58:18,490 And that effect just gets worse and worse 1061 00:58:18,490 --> 00:58:21,140 as you get to larger population sizes. 1062 00:58:21,140 --> 00:58:26,410 So although it's relatively simple 1063 00:58:26,410 --> 00:58:28,210 to calculate the rate of evolution 1064 00:58:28,210 --> 00:58:29,857 in the limit of small populations, 1065 00:58:29,857 --> 00:58:32,190 the rate of evolution in the case of clonal interference 1066 00:58:32,190 --> 00:58:36,790 is, actually, a very hard problem.
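The sum-versus-maximum point above can be put in a few lines. This is only a toy sketch: the two function names and the example effect sizes are assumptions made for illustration, and it ignores timing, drift, and later mutations arising on the winning lineage.

```python
def gain_without_interference(effects):
    # Lineages fix one after another, so every established benefit is kept.
    return sum(effects)


def gain_with_interference(effects):
    # Lineages compete at the same time and only the fittest one wins.
    return max(effects)


# Two hypothetical beneficial mutations present at the same time.
effects = [0.03, 0.05]
kept = gain_with_interference(effects)
wasted = gain_without_interference(effects) - kept
print(kept, wasted)
```

With more simultaneous lineages, the maximum grows much more slowly than the sum, which matches the statement that the effect gets worse at larger population sizes.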
1067 00:58:36,790 --> 00:58:42,650 Hard, well, experimentally, theoretically, and in all ways. 1068 00:58:42,650 --> 00:58:45,975 I just want to say a few things. 1069 00:58:45,975 --> 00:58:48,350 There have been some really, I think, interesting studies 1070 00:58:48,350 --> 00:58:51,900 occurring over the last 10 years trying to get at this regime. 1071 00:58:51,900 --> 00:58:55,540 So the question is how should it behave. 1072 00:58:55,540 --> 00:59:07,640 And there's a paper by Desai, Fisher, and Murray 1073 00:59:07,640 --> 00:59:10,010 when they were all at Harvard. 1074 00:59:10,010 --> 00:59:12,140 Since then, Daniel Fisher, the [INAUDIBLE], 1075 00:59:12,140 --> 00:59:14,010 has moved to Stanford. 1076 00:59:14,010 --> 00:59:17,700 Desai went to Princeton but then came back to Harvard. 1077 00:59:17,700 --> 00:59:20,820 So he's now Harvard faculty. 1078 00:59:20,820 --> 00:59:28,350 So this is current biology in 2006, '07, '08. 1079 00:59:28,350 --> 00:59:30,980 It's 2007. 1080 00:59:30,980 --> 00:59:35,930 So they have, kind of, a simplification of this 1081 00:59:35,930 --> 00:59:37,870 where they asked let's just assume 1082 00:59:37,870 --> 00:59:39,750 that we don't have a probability distribution 1083 00:59:39,750 --> 00:59:41,970 of beneficial mutations. 1084 00:59:41,970 --> 00:59:46,960 Instead, let's just assume that there's some mutation rate 1085 00:59:46,960 --> 00:59:50,690 to acquire beneficial mutation that's exactly s. 1086 00:59:50,690 --> 00:59:51,190 OK? 1087 00:59:51,190 --> 00:59:53,398 So you say, all right, well that sounds super simple. 1088 00:59:53,398 --> 00:59:54,840 Of course, it's not true. 1089 00:59:54,840 --> 00:59:57,300 But even that problem is hard. 
1090 00:59:57,300 --> 00:59:59,120 But what they can do in this regime 1091 00:59:59,120 --> 01:00:02,180 is that then it's nice because everything's, kind of, discrete 1092 01:00:02,180 --> 01:00:03,680 because then the population is going 1093 01:00:03,680 --> 01:00:10,480 to described by a series of-- so this 1094 01:00:10,480 --> 01:00:13,150 is abundance as a function of the fitness. 1095 01:00:13,150 --> 01:00:17,340 So this is, maybe, the bulk minus fitness relative 0. 1096 01:00:17,340 --> 01:00:23,030 Here, this is s, 2s, 3s, 4s, and maybe there's a little bit here 1097 01:00:23,030 --> 01:00:23,730 at 5s. 1098 01:00:23,730 --> 01:00:25,909 OK? 1099 01:00:25,909 --> 01:00:27,450 So what happens is that there's going 1100 01:00:27,450 --> 01:00:30,540 to be some equilibrium distribution of this front 1101 01:00:30,540 --> 01:00:33,660 or the nose of the population. 1102 01:00:33,660 --> 01:00:35,920 And at some rate, these guys get mutations 1103 01:00:35,920 --> 01:00:38,812 where some individual, kind of, comes like this because it gets 1104 01:00:38,812 --> 01:00:40,020 a mutation that's beneficial. 1105 01:00:40,020 --> 01:00:41,980 But that doesn't actually affect the dynamics 1106 01:00:41,980 --> 01:00:45,480 very much because all these guys are growing exponentially. 1107 01:00:45,480 --> 01:00:49,230 What's really relevant are when individuals that 1108 01:00:49,230 --> 01:00:51,870 have several or more of these beneficial mutations 1109 01:00:51,870 --> 01:00:56,650 than the rest of the population, when these guys get a mutation 1110 01:00:56,650 --> 01:00:58,610 that they can move forward. 1111 01:00:58,610 --> 01:01:01,260 And it's this dynamic that is, kind of, 1112 01:01:01,260 --> 01:01:03,320 pulling the population here. 
1113 01:01:03,320 --> 01:01:07,270 So you can actually do analytic calculations on this model 1114 01:01:07,270 --> 01:01:09,895 that you would not be able to do if you did a full distribution 1115 01:01:09,895 --> 01:01:11,670 of mutations here. 1116 01:01:11,670 --> 01:01:14,460 But it's actually still complicated and hard. 1117 01:01:14,460 --> 01:01:17,350 And I do not claim to have gone through the full derivation 1118 01:01:17,350 --> 01:01:20,500 because the full thing is they get 1119 01:01:20,500 --> 01:01:24,626 that the velocity in this model is approximately equal to. 1120 01:01:24,626 --> 01:01:25,750 By then, it's an s squared. 1121 01:01:47,630 --> 01:01:52,170 All right, so you might sneer at this model and say, oh, well, 1122 01:01:52,170 --> 01:01:54,790 you know, this is an oversimplification, 1123 01:01:54,790 --> 01:01:55,965 blah, blah, blah. 1124 01:01:55,965 --> 01:01:57,180 But even this is hard. 1125 01:01:57,180 --> 01:02:00,610 And you get a complicated expression. 1126 01:02:00,610 --> 01:02:02,419 And it's describing something fundamental 1127 01:02:02,419 --> 01:02:03,960 that's happening in these populations 1128 01:02:03,960 --> 01:02:05,330 with clonal interference. 1129 01:02:05,330 --> 01:02:08,690 The important thing is that you can-- more or less, 1130 01:02:08,690 --> 01:02:11,774 this term is going to be the dominant one in many cases, 1131 01:02:11,774 --> 01:02:13,940 especially because we're interested in the large N 1132 01:02:13,940 --> 01:02:14,840 regime here. 1133 01:02:14,840 --> 01:02:18,800 So what you see is that, for large N in this model, 1134 01:02:18,800 --> 01:02:20,690 and they did experiments in this paper that 1135 01:02:20,690 --> 01:02:23,740 are consistent with it, they find 1136 01:02:23,740 --> 01:02:27,400 that the velocity-- the rate of evolution-- this is the rate. 1137 01:02:30,150 --> 01:02:33,740 You see here the s squared term that we already talked about. 1138 01:02:33,740 --> 01:02:34,844 Right?
1139 01:02:34,844 --> 01:02:36,510 And there's no probability distribution, 1140 01:02:36,510 --> 01:02:37,920 so it's just an s squared. 1141 01:02:37,920 --> 01:02:41,050 But what you see is that it ends up going as log N for large N. 1142 01:02:41,050 --> 01:02:43,400 So this is when there's a lot of clonal interference. 1143 01:02:43,400 --> 01:02:48,070 So maybe this thing goes as log N. 1144 01:02:48,070 --> 01:02:52,190 But I'd say, maybe, this is not an open and shut case 1145 01:02:52,190 --> 01:02:54,540 because real populations are more complicated than this 1146 01:02:54,540 --> 01:02:57,110 because it's not that every mutation has magnitude s. 1147 01:02:57,110 --> 01:03:00,110 But I think it's a reasonable first order model. 1148 01:03:00,110 --> 01:03:03,530 And I think this is a nice set of calculations 1149 01:03:03,530 --> 01:03:04,545 to make sense of things. 1150 01:03:08,510 --> 01:03:12,700 Are there any questions about that calculation 1151 01:03:12,700 --> 01:03:15,490 or why this thing curves off [INAUDIBLE]? 1152 01:03:15,490 --> 01:03:16,407 Yes. 1153 01:03:16,407 --> 01:03:18,275 AUDIENCE: [INAUDIBLE]. 1154 01:03:18,275 --> 01:03:19,209 PROFESSOR: Yeah. 1155 01:03:19,209 --> 01:03:22,478 AUDIENCE: [INAUDIBLE]? 1156 01:03:22,478 --> 01:03:23,270 PROFESSOR: Sure. 1157 01:03:23,270 --> 01:03:27,450 OK, so the question is why is it that it's the average of s squared 1158 01:03:27,450 --> 01:03:32,760 instead of the average s, squared? 1159 01:03:32,760 --> 01:03:35,870 So what is it that's useful to say? 1160 01:03:35,870 --> 01:03:38,370 I'm trying to think of a nice probability distribution 1161 01:03:38,370 --> 01:03:42,730 where those things will have very different-- 1162 01:03:47,310 --> 01:03:49,021 AUDIENCE: [INAUDIBLE]? 1163 01:03:49,021 --> 01:03:50,520 PROFESSOR: For a delta distribution, 1164 01:03:50,520 --> 01:03:52,130 they're the same thing.
1165 01:03:52,130 --> 01:03:57,650 So if we wanted to do an intuitive explanation, maybe, 1166 01:03:57,650 --> 01:04:00,870 [? probably ?]-- I'm trying to think if we 1167 01:04:00,870 --> 01:04:02,540 had two delta distributions. 1168 01:04:02,540 --> 01:04:04,760 So the mean s would be here. 1169 01:04:04,760 --> 01:04:05,260 Yeah, OK. 1170 01:04:05,260 --> 01:04:09,170 So imagine that-- I know that this 1171 01:04:09,170 --> 01:04:12,577 is not a real beneficial mutation so maybe but OK. 1172 01:04:12,577 --> 01:04:13,910 But it's something that's small. 1173 01:04:13,910 --> 01:04:18,660 You know, negligibly small here and magnitude s0 over here. 1174 01:04:18,660 --> 01:04:26,880 So the mean s is, indeed, equal to this s0 over 2. 1175 01:04:26,880 --> 01:04:28,680 Right? 1176 01:04:28,680 --> 01:04:35,735 Whereas, the mean s, squared, is, then, s0 squared over 4. 1177 01:04:38,507 --> 01:04:40,090 Are we looking at something different? 1178 01:04:40,090 --> 01:04:41,770 Oh, no. 1179 01:04:41,770 --> 01:04:47,110 OK, I think I'm going to have to come up with a better example 1180 01:04:47,110 --> 01:04:47,610 to answer. 1181 01:04:47,610 --> 01:04:50,540 So maybe after class we can come up with one. 1182 01:04:50,540 --> 01:04:52,499 I don't want to take five minutes finding 1183 01:04:52,499 --> 01:04:53,290 a good explanation. 1184 01:04:53,290 --> 01:04:53,790 Yeah? 1185 01:04:53,790 --> 01:04:54,772 AUDIENCE: [INAUDIBLE]? 1186 01:04:59,640 --> 01:05:01,140 PROFESSOR: Yeah, so this is the rate 1187 01:05:01,140 --> 01:05:05,700 of evolution in the regime of large population size. 1188 01:05:05,700 --> 01:05:06,520 AUDIENCE: OK. 1189 01:05:06,520 --> 01:05:08,150 PROFESSOR: And I don't know how small 1190 01:05:08,150 --> 01:05:11,360 you have to go before [? funny ?] [INAUDIBLE]. 1191 01:05:11,360 --> 01:05:15,022 AUDIENCE: That's only for the [? large ?] [INAUDIBLE]. 1192 01:05:15,022 --> 01:05:15,960 PROFESSOR: Yep.
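The two-point example sketched in the discussion above can be checked exactly. This is a minimal sketch that assumes equal weight 1/2 on a negligible effect (taken as exactly 0) and on an effect s0; the value s0 = 1/10 and the helper name `moments` are hypothetical. Under those assumptions the mean of s squared is s0 squared over 2, while the square of the mean is s0 squared over 4, so the two differ by a factor of 2.

```python
from fractions import Fraction


def moments(points):
    """Return (mean, mean of the square) for a list of (value, probability)."""
    mean = sum(p * v for v, p in points)
    mean_of_square = sum(p * v * v for v, p in points)
    return mean, mean_of_square


s0 = Fraction(1, 10)      # hypothetical size of the large-effect mutation
half = Fraction(1, 2)     # assumed equal weights on the two points
mean, mean_of_square = moments([(Fraction(0), half), (s0, half)])

print(mean == s0 / 2)                  # mean s
print(mean * mean == s0**2 / 4)        # (mean s) squared
print(mean_of_square == s0**2 / 2)     # mean of s squared
```

Exact fractions are used so the comparison does not depend on floating-point rounding.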
1193 01:05:15,960 --> 01:05:17,585 Yeah, and, of course, you could imagine 1194 01:05:17,585 --> 01:05:19,560 that taking various limits is complicated here. 1195 01:05:19,560 --> 01:05:23,890 But the important thing is that the rate 1196 01:05:23,890 --> 01:05:30,153 goes as log N in the case of clonal interference. 1197 01:05:30,153 --> 01:05:34,130 That's the key thing to remember, besides the fact 1198 01:05:34,130 --> 01:05:37,540 that they actually had to work to analyze this model. 1199 01:05:44,785 --> 01:05:48,190 OK, so what I want to do is, in the last 10, 15 minutes, 1200 01:05:48,190 --> 01:05:50,640 to talk about this [INAUDIBLE] paper 1201 01:05:50,640 --> 01:05:54,620 because I think it's pretty. 1202 01:05:54,620 --> 01:06:02,040 And I mean, it's an elegant example 1203 01:06:02,040 --> 01:06:05,340 of how, if you look at a problem in a different way, 1204 01:06:05,340 --> 01:06:09,390 then you can get, I think, really interesting insights 1205 01:06:09,390 --> 01:06:16,160 using a minimal amount of, like, measurements. 1206 01:06:16,160 --> 01:06:19,992 I mean, basically, how many measurements are in this paper? 1207 01:06:19,992 --> 01:06:20,824 AUDIENCE: 32. 1208 01:06:20,824 --> 01:06:22,290 PROFESSOR: Basically, 32. 1209 01:06:22,290 --> 01:06:24,960 Right? 1210 01:06:24,960 --> 01:06:27,630 So what they are doing is they're analyzing mutations 1211 01:06:27,630 --> 01:06:29,620 in the enzyme beta lactamase or the gene 1212 01:06:29,620 --> 01:06:32,630 encoding beta lactamase, which confers resistance 1213 01:06:32,630 --> 01:06:36,051 to beta lactam drugs like, in this case, cefotaxime. 1214 01:06:36,051 --> 01:06:42,550 OK, so this guy gives resistance to these beta lactam 1215 01:06:42,550 --> 01:06:46,700 drugs that are like ampicillin or penicillin. 1216 01:06:46,700 --> 01:06:50,590 In this case, cefotaxime.
1217 01:06:50,590 --> 01:06:54,440 But not all of the versions of this gene or the enzyme, 1218 01:06:54,440 --> 01:06:59,250 actually, can break down this new drug. 1219 01:06:59,250 --> 01:07:03,180 So all 32 versions of the enzyme that they study 1220 01:07:03,180 --> 01:07:05,230 break down, for example, ampicillin. 1221 01:07:05,230 --> 01:07:10,110 But they had widely varying levels of resistance or ability 1222 01:07:10,110 --> 01:07:12,060 to break down this drug, cefotaxime. 1223 01:07:15,966 --> 01:07:17,840 So they wanted to try to understand something 1224 01:07:17,840 --> 01:07:22,330 about what happens if you start out with this base 1225 01:07:22,330 --> 01:07:23,460 version of the enzyme. 1226 01:07:23,460 --> 01:07:25,280 You know, if you just look up what's 1227 01:07:25,280 --> 01:07:28,850 the sequence for beta lactamase. 1228 01:07:28,850 --> 01:07:30,340 What's going to be the sequence? 1229 01:07:30,340 --> 01:07:33,220 And that we're going to call the minus, minus, minus, minus, 1230 01:07:33,220 --> 01:07:33,720 minus. 1231 01:07:36,460 --> 01:07:39,660 And we might, reasonably, want to know 1232 01:07:39,660 --> 01:07:43,399 how does it get to the version of the enzyme that has all five 1233 01:07:43,399 --> 01:07:44,690 of these [? point ?] mutations? 1234 01:07:48,190 --> 01:07:52,110 Does anybody remember what those five mutations were? 1235 01:07:52,110 --> 01:07:54,748 Like, what kind of mutations are they? 1236 01:07:54,748 --> 01:07:55,700 AUDIENCE: [INAUDIBLE]. 1237 01:07:55,700 --> 01:07:58,170 PROFESSOR: They're all [? point ?] mutations. 1238 01:07:58,170 --> 01:08:01,380 And are they all protein coding mutations? 1239 01:08:01,380 --> 01:08:02,030 No. 1240 01:08:02,030 --> 01:08:05,220 So actually, this one here is actually 1241 01:08:05,220 --> 01:08:07,880 a promoter mutation that increases expression 1242 01:08:07,880 --> 01:08:10,540 by a factor of two or three.
1243 01:08:10,540 --> 01:08:12,420 Whereas, these things here are indeed protein 1244 01:08:12,420 --> 01:08:16,529 coding and change the amino acid. 1245 01:08:16,529 --> 01:08:17,736 Protein coding. 1246 01:08:17,736 --> 01:08:20,050 The amino acids in the end. 1247 01:08:20,050 --> 01:08:24,115 Well, they each change one amino acid in the resulting protein. 1248 01:08:28,149 --> 01:08:31,670 So what Weinreich in this paper was 1249 01:08:31,670 --> 01:08:35,300 trying to understand is what is the shape of these fitness 1250 01:08:35,300 --> 01:08:36,120 landscapes? 1251 01:08:36,120 --> 01:08:39,779 And what does that mean about the course of evolution 1252 01:08:39,779 --> 01:08:43,210 or the repeatability or predictability of evolution? 1253 01:08:45,982 --> 01:08:50,800 And I just want to stress, this is Weinreich 1254 01:08:50,800 --> 01:08:58,220 2006, because this version of the gene/enzyme 1255 01:08:58,220 --> 01:09:02,040 is, essentially, unable to break down cefotaxime at all. 1256 01:09:02,040 --> 01:09:07,080 So E. coli that has this version of the enzyme, 1257 01:09:07,080 --> 01:09:09,729 it's almost as if they don't have any enzyme at all. 1258 01:09:09,729 --> 01:09:11,439 Whereas, this version of the enzyme 1259 01:09:11,439 --> 01:09:16,174 is able to break down this drug, cefotaxime, at very high rates. 1260 01:09:16,174 --> 01:09:17,715 And indeed, the way that these things 1261 01:09:17,715 --> 01:09:20,319 are quantified in this paper is we have what's known as the MIC 1262 01:09:20,319 --> 01:09:21,985 or the Minimum Inhibitory Concentration.
1263 01:09:26,580 --> 01:09:30,490 And basically, you just ask-- oops, 1264 01:09:30,490 --> 01:09:35,000 inhibitory concentration-- what's 1265 01:09:35,000 --> 01:09:36,654 the minimum amount of the antibiotic 1266 01:09:36,654 --> 01:09:39,390 that you have to add to prevent growth 1267 01:09:39,390 --> 01:09:42,649 of the bacterial population after 20 hours starting 1268 01:09:42,649 --> 01:09:45,180 from some standard cell density, OK? 1269 01:09:45,180 --> 01:09:47,160 So it's a very easy experiment to do 1270 01:09:47,160 --> 01:09:52,290 because, in a 96-well [? plate ?], you just have many wells. 1271 01:09:52,290 --> 01:09:55,610 And you just go down in concentration, maybe, 1272 01:09:55,610 --> 01:09:59,220 by a factor of root 2 each time. 1273 01:09:59,220 --> 01:10:01,580 So then, you go across 12 or 24. 1274 01:10:01,580 --> 01:10:04,480 And you get over a broad range of antibiotic concentrations. 1275 01:10:04,480 --> 01:10:07,250 And what you should see is that this is, maybe, 1276 01:10:07,250 --> 01:10:08,850 dividing by root 2 each time. 1277 01:10:08,850 --> 01:10:10,190 So you get growth here. 1278 01:10:10,190 --> 01:10:11,620 You get growth here. 1279 01:10:11,620 --> 01:10:14,270 Growth here but then no growth, no growth, no growth. 1280 01:10:14,270 --> 01:10:17,890 And the concentration that you added here is the MIC. 1281 01:10:17,890 --> 01:10:19,510 What you'll see is that, depending 1282 01:10:19,510 --> 01:10:23,150 on the version of the enzyme that the bacteria have, 1283 01:10:23,150 --> 01:10:26,960 the growth will occur up to different concentrations. 1284 01:10:26,960 --> 01:10:29,215 AUDIENCE: Why would you [INAUDIBLE] 1285 01:10:29,215 --> 01:10:30,639 PROFESSOR: Why would you? 1286 01:10:30,639 --> 01:10:31,180 Why is that-- 1287 01:10:31,180 --> 01:10:32,122 AUDIENCE: [INAUDIBLE].
1288 01:10:32,122 --> 01:10:33,580 PROFESSOR: Oh, I'm just telling you 1289 01:10:33,580 --> 01:10:35,780 what they actually did experimentally in this paper. 1290 01:10:35,780 --> 01:10:37,500 You could do a factor of 2. 1291 01:10:37,500 --> 01:10:42,860 Or it's just a question of how fine of a resolution you want. 1292 01:10:42,860 --> 01:10:46,170 How to do root 2? 1293 01:10:46,170 --> 01:10:49,560 OK, so this is like a mathematician's question, 1294 01:10:49,560 --> 01:10:53,290 all right, because your point is going to be that-- 1295 01:10:53,290 --> 01:10:54,646 AUDIENCE: [INAUDIBLE] root 2. 1296 01:10:54,646 --> 01:10:56,340 PROFESSOR: OK, so it's true. 1297 01:10:56,340 --> 01:11:00,430 Root 2, it's an irrational number. 1298 01:11:00,430 --> 01:11:04,530 It's the first proof in an analysis textbook. 1299 01:11:04,530 --> 01:11:06,190 It doesn't matter, OK? 1300 01:11:06,190 --> 01:11:09,920 Our error in pipetting is a percent, 1301 01:11:09,920 --> 01:11:18,860 which means that if you do 1.41, that's fine. 1302 01:11:18,860 --> 01:11:20,280 Yes. 1303 01:11:20,280 --> 01:11:23,320 So don't be paralyzed by pipetting a root 2. 1304 01:11:23,320 --> 01:11:24,820 OK? 1305 01:11:24,820 --> 01:11:29,095 AUDIENCE: Is it always very sharp, this changing in growth? 1306 01:11:29,095 --> 01:11:31,250 PROFESSOR: You know, biology and the word 1307 01:11:31,250 --> 01:11:36,290 always should never be used in the same sentence. 1308 01:11:36,290 --> 01:11:38,820 I'd say that it's a reasonable [? assay. ?] 1309 01:11:38,820 --> 01:11:41,030 It's, typically, sharp. 1310 01:11:41,030 --> 01:11:43,260 It happens, though, that you get growth here. 1311 01:11:43,260 --> 01:11:45,010 And then, you get stressed out because you 1312 01:11:45,010 --> 01:11:46,030 don't know what to do.
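The MIC readout described above can be sketched in a few lines. This is an illustration under stated assumptions, not the protocol from the paper: the top concentration, the number of wells, the growth pattern, and the rule for a non-sharp transition (walk down from the highest concentration and stop at the first well that grew) are all hypothetical choices. Passing `factor=1.41` in place of the exact root 2 is the practical shortcut mentioned in the discussion.

```python
import math


def dilution_series(top_conc, n_wells, factor=math.sqrt(2)):
    """Concentrations starting at top_conc, divided by `factor` per well."""
    return [top_conc / factor**i for i in range(n_wells)]


def mic(concs, grew):
    """Lowest concentration such that it and every higher one show no growth.

    Returns None if even the highest concentration allowed growth.
    """
    result = None
    # Walk from the highest concentration downward.
    for conc, well_grew in sorted(zip(concs, grew), reverse=True):
        if well_grew:
            break
        result = conc
    return result


# Hypothetical plate: 12 wells from 64 (arbitrary units), growth below 10.
concs = dilution_series(64.0, 12)
grew = [c < 10.0 for c in concs]
print(mic(concs, grew))
```

Repeating the assay and applying one fixed rule to every plate, as suggested in the lecture, matters more than the particular rule chosen here.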
1313 01:11:46,030 --> 01:11:50,050 I mean, the important thing is that you do the experiment 1314 01:11:50,050 --> 01:11:52,390 multiple times and you have some reasonable rule 1315 01:11:52,390 --> 01:11:54,270 for treating these things. 1316 01:11:54,270 --> 01:11:56,320 Yeah, it can be more complicated though. 1317 01:12:00,924 --> 01:12:02,340 All right, what we're going to use 1318 01:12:02,340 --> 01:12:03,881 in the context of this paper, though, 1319 01:12:03,881 --> 01:12:06,650 is we're just going to assume that this MIC is 1320 01:12:06,650 --> 01:12:09,940 a measure for fitness. 1321 01:12:09,940 --> 01:12:13,570 The mapping from MIC to fitness is, actually, very nontrivial. 1322 01:12:13,570 --> 01:12:16,130 Something my group has spent a long time thinking about. 1323 01:12:16,130 --> 01:12:20,090 But for the purpose of this paper, just when you hear MIC, 1324 01:12:20,090 --> 01:12:22,195 you can just think of it as fitness. 1325 01:12:22,195 --> 01:12:26,130 OK, [? higher ?] MIC, he assumes that it could 1326 01:12:26,130 --> 01:12:27,520 be selected for by evolution. 1327 01:12:30,860 --> 01:12:36,490 So there are, in principle, 2 to the 5 different states. 1328 01:12:41,280 --> 01:12:43,560 Different versions of this gene. 1329 01:12:43,560 --> 01:12:45,670 So what he did is he constructed each of the 2 to the 5 1330 01:12:45,670 --> 01:12:50,930 versions of the gene, put them into the same strain of E. coli, 1331 01:12:50,930 --> 01:12:55,041 and then measured the MIC of each of those 32 strains. 1332 01:12:55,041 --> 01:12:57,040 And that was all the measurements in this paper, 1333 01:12:57,040 --> 01:13:00,040 basically because everything else is just 1334 01:13:00,040 --> 01:13:02,950 analysis of that resulting fitness landscape.
1335 01:13:02,950 --> 01:13:05,940 But what's exciting about this is just 1336 01:13:05,940 --> 01:13:09,530 the ability to have an experimental fitness 1337 01:13:09,530 --> 01:13:12,720 landscape because we've talked about fitness 1338 01:13:12,720 --> 01:13:13,730 landscapes for years. 1339 01:13:13,730 --> 01:13:16,710 But then, it tends to be much more like what you saw 1340 01:13:16,710 --> 01:13:19,237 in Martin Novak's book, right? 1341 01:13:19,237 --> 01:13:21,320 That you can think about these fitness landscapes. 1342 01:13:21,320 --> 01:13:25,614 And you can do calculations of what should happen on them. 1343 01:13:25,614 --> 01:13:27,280 But this is a case where we can actually 1344 01:13:27,280 --> 01:13:29,280 just measure something akin to a fitness landscape, 1345 01:13:29,280 --> 01:13:30,700 and to try to say what it means. 1346 01:13:30,700 --> 01:13:34,120 So you can ask questions about how rugged is the landscape. 1347 01:13:34,120 --> 01:13:37,240 How many different paths can you take from this version 1348 01:13:37,240 --> 01:13:40,490 to this version? 1349 01:13:40,490 --> 01:13:44,910 So first of all, how many peaks were in this landscape 1350 01:13:44,910 --> 01:13:46,162 that he measured? 1351 01:13:50,581 --> 01:13:51,563 AUDIENCE: [INAUDIBLE]. 1352 01:13:51,563 --> 01:13:53,670 PROFESSOR: One peak. 1353 01:13:53,670 --> 01:13:55,330 This is important. 1354 01:13:55,330 --> 01:13:58,020 This was the one and only peak. 1355 01:14:00,660 --> 01:14:03,210 And when you read this paper, you come away thinking, 1356 01:14:03,210 --> 01:14:05,690 oh yeah, this is a really rugged landscape, right? 1357 01:14:05,690 --> 01:14:08,480 Many of the paths were not allowed by so-called Darwinian 1358 01:14:08,480 --> 01:14:10,690 or selective evolution. 1359 01:14:10,690 --> 01:14:14,030 But it's easy to forget that, actually, it's not that. 
1360 01:14:14,030 --> 01:14:15,910 It's a moderately rugged landscape 1361 01:14:15,910 --> 01:14:19,690 because there was still one peak. 1362 01:14:19,690 --> 01:14:27,080 In particular, if you just assume that the population 1363 01:14:27,080 --> 01:14:29,860 starts at that minus, minus, minus, state, 1364 01:14:29,860 --> 01:14:32,170 starts travelling uphill in fitness, 1365 01:14:32,170 --> 01:14:35,190 gets a mutation and goes uphill, is there any possibility for it 1366 01:14:35,190 --> 01:14:37,670 to get trapped in a non-optimal? 1367 01:14:37,670 --> 01:14:42,090 No because there are no other peaks. 1368 01:14:42,090 --> 01:14:43,740 So there's only one peak. 1369 01:14:43,740 --> 01:14:46,050 That's the same thing as saying that you 1370 01:14:46,050 --> 01:14:48,240 can take any path you like going uphill, 1371 01:14:48,240 --> 01:14:50,580 and you will always get to the same final location. 1372 01:14:50,580 --> 01:14:54,940 You will never get stuck anywhere. 1373 01:14:54,940 --> 01:14:59,150 I mean, it does not mean that you can take 1374 01:14:59,150 --> 01:15:01,280 any old path that you want. 1375 01:15:01,280 --> 01:15:03,410 Many of the paths may be blocked in the sense 1376 01:15:03,410 --> 01:15:06,250 that they may go downhill and so forth. 1377 01:15:06,250 --> 01:15:08,850 But at any location you're at, there's always, at least, 1378 01:15:08,850 --> 01:15:11,392 one path going up in fitness, up in MIC. 1379 01:15:11,392 --> 01:15:12,850 Doesn't matter which path you take. 1380 01:15:12,850 --> 01:15:15,980 You will always be able to get to the peak of this landscape. 1381 01:15:15,980 --> 01:15:17,450 OK? 1382 01:15:17,450 --> 01:15:20,319 So it's not too rugged of a landscape, in that sense. 1383 01:15:20,319 --> 01:15:20,819 Yeah? 1384 01:15:20,819 --> 01:15:23,652 AUDIENCE: Wasn't it, [INAUDIBLE]? 1385 01:15:23,652 --> 01:15:24,360 PROFESSOR: Right. 
1386 01:15:24,360 --> 01:15:26,710 So some paths are not allowed in the sense 1387 01:15:26,710 --> 01:15:30,957 that some paths decrease fitness, locally. 1388 01:15:30,957 --> 01:15:33,540 But what I'm saying is that you can take a different path that 1389 01:15:33,540 --> 01:15:34,331 goes up in fitness. 1390 01:15:34,331 --> 01:15:37,922 And you'll still get to the same peak. 1391 01:15:37,922 --> 01:15:41,331 AUDIENCE: Right, but you started in that [? wrong path ?] 1392 01:15:41,331 --> 01:15:42,772 and you [INAUDIBLE]. 1393 01:15:42,772 --> 01:15:43,730 PROFESSOR: Well no, no. 1394 01:15:43,730 --> 01:15:45,780 The thing is that you can't take that path 1395 01:15:45,780 --> 01:15:50,640 because that path goes down in fitness, is the claim. 1396 01:15:50,640 --> 01:15:52,860 So the statement from this paper is 1397 01:15:52,860 --> 01:15:57,650 that, if we just assume that the only mutations that 1398 01:15:57,650 --> 01:15:59,980 can fix in a population are mutations that increase 1399 01:15:59,980 --> 01:16:02,845 fitness, then it does not matter which 1400 01:16:02,845 --> 01:16:04,220 of those beneficial mutations you 1401 01:16:04,220 --> 01:16:06,880 take because you will always end up reaching the peak. 1402 01:16:14,610 --> 01:16:16,987 So there are 120 possible trajectories. 1403 01:16:16,987 --> 01:16:18,320 Can somebody say how we got 120? 1404 01:16:25,952 --> 01:16:27,860 AUDIENCE: [INAUDIBLE]. 1405 01:16:27,860 --> 01:16:29,000 PROFESSOR: Right. 1406 01:16:29,000 --> 01:16:30,075 So this is 5 factorial. 1407 01:16:34,210 --> 01:16:37,180 What he found by analyzing the resulting fitness landscape 1408 01:16:37,180 --> 01:16:40,070 is that 102 were selectively inaccessible. 1409 01:16:46,190 --> 01:16:50,790 Oh, I don't know how many-- is that the right way to spell 1410 01:16:50,790 --> 01:16:52,048 that? 1411 01:16:52,048 --> 01:16:53,750 Yeah.
1412 01:16:53,750 --> 01:16:57,230 And what he's assuming is that the only mutations that can fix 1413 01:16:57,230 --> 01:17:00,360 are beneficial mutations. 1414 01:17:00,360 --> 01:17:01,480 OK, right. 1415 01:17:01,480 --> 01:17:03,310 In particular, if you have two states, 1416 01:17:03,310 --> 01:17:05,659 there's a mutation that you could acquire. 1417 01:17:05,659 --> 01:17:07,450 From here, let's say you get this mutation, 1418 01:17:07,450 --> 01:17:09,570 and that just leads to the same MIC. 1419 01:17:09,570 --> 01:17:13,054 He assumes that that is inaccessible. 1420 01:17:13,054 --> 01:17:14,032 All right? 1421 01:17:14,032 --> 01:17:15,740 You know, and of course, like all things, 1422 01:17:15,740 --> 01:17:17,150 you can argue about it. 1423 01:17:17,150 --> 01:17:19,300 What he's saying is that it won't 1424 01:17:19,300 --> 01:17:24,490 fix in reasonable times, which is fair if it's really 1425 01:17:24,490 --> 01:17:26,060 a neutral mutation. 1426 01:17:26,060 --> 01:17:29,410 In particular, if you have 10 to the 6 bacteria, 1427 01:17:29,410 --> 01:17:32,670 and if the mutation rates are the same everywhere 1428 01:17:32,670 --> 01:17:36,280 and some mutations lead to a significant increase 1429 01:17:36,280 --> 01:17:38,430 in fitness, then a neutral mutation 1430 01:17:38,430 --> 01:17:41,150 would be unlikely to fix. 1431 01:17:41,150 --> 01:17:43,500 So what he found is that there were only, 1432 01:17:43,500 --> 01:17:49,820 then, 18 of the 120 paths of this fitness landscape that 1433 01:17:49,820 --> 01:17:52,170 had monotonically increasing fitness values. 1434 01:17:55,150 --> 01:17:57,275 And indeed, if you analyze those trajectories, what 1435 01:17:57,275 --> 01:18:00,000 he found is that, actually, 18 [? isn't ?] 
maybe even 1436 01:18:00,000 --> 01:18:02,690 over [INAUDIBLE] because only a few of those trajectories 1437 01:18:02,690 --> 01:18:06,310 would likely occupy the majority of what 1438 01:18:06,310 --> 01:18:09,730 you might call the actually observed paths just because 1439 01:18:09,730 --> 01:18:13,060 of the statistics of when those paths branch and so forth. 1440 01:18:13,060 --> 01:18:15,120 So the argument from this paper is that we 1441 01:18:15,120 --> 01:18:16,915 can measure fitness landscapes. 1442 01:18:16,915 --> 01:18:18,290 And from it, we can say something 1443 01:18:18,290 --> 01:18:20,580 about the path of evolution, perhaps. 1444 01:18:20,580 --> 01:18:22,580 Other people have since gone and done, actually, 1445 01:18:22,580 --> 01:18:24,355 laboratory evolution on a different antibiotic resistance 1446 01:18:24,355 --> 01:18:25,970 gene-- again, Roy Kishony, 1447 01:18:25,970 --> 01:18:28,500 actually-- to confirm that these landscapes, 1448 01:18:28,500 --> 01:18:32,580 at least in some cases, can inform laboratory evolution. 1449 01:18:32,580 --> 01:18:35,670 So there is a sense that maybe evolution is more predictable 1450 01:18:35,670 --> 01:18:37,800 than you would have thought. 1451 01:18:37,800 --> 01:18:38,640 We're out of time. 1452 01:18:38,640 --> 01:18:41,290 But if you have any questions, please, go ahead and come on up 1453 01:18:41,290 --> 01:18:43,190 and ask them. 1454 01:18:43,190 --> 01:18:43,790 All right? 1455 01:18:43,790 --> 01:18:45,340 Thanks.
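The path counting discussed in the lecture can be sketched in Python: with five mutations there are 5 factorial = 120 orders in which to acquire them, and a path counts as selectively accessible only if every step strictly increases fitness (neutral steps are treated as inaccessible, as the lecture describes). The landscape below, `toy_mic`, is a made-up toy and NOT the measured MIC data from the paper; it only shows how such a count could be done. The function names are likewise hypothetical.

```python
from itertools import permutations, product
from math import factorial

N = 5  # number of mutations


def toy_mic(genotype):
    """Hypothetical landscape: each mutation adds 1 to fitness,
    except that mutation 0 on its own is deleterious."""
    if genotype == (1, 0, 0, 0, 0):
        return -1
    return sum(genotype)


def count_accessible_paths(fitness):
    """Count mutation orders in which every step strictly increases fitness."""
    accessible = 0
    for order in permutations(range(N)):
        genotype = [0] * N
        current = fitness(tuple(genotype))
        path_ok = True
        for site in order:
            genotype[site] = 1
            nxt = fitness(tuple(genotype))
            if nxt <= current:  # neutral or deleterious step blocks the path
                path_ok = False
                break
            current = nxt
        accessible += path_ok
    return accessible


def count_peaks(fitness):
    """Count genotypes with no single-mutation neighbor of higher fitness."""
    peaks = 0
    for g in product((0, 1), repeat=N):
        neighbors = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(N)]
        if not any(fitness(nb) > fitness(g) for nb in neighbors):
            peaks += 1
    return peaks


print(factorial(N))                     # 120 possible trajectories
print(count_accessible_paths(toy_mic))  # 96 on this toy landscape
print(count_peaks(toy_mic))             # 1
```

On this toy landscape, the 24 orders that start with mutation 0 are blocked, yet there is still only one peak, so an uphill walk can never get trapped. That mirrors the lecture's point that blocked paths and a single peak can coexist; on the measured landscape the lecture reports 102 of the 120 paths blocked, leaving 18.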