1 00:00:00,060 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high-quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation, or view additional materials 6 00:00:13,340 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:25,732 --> 00:00:26,760 PROFESSOR: I.e. 9 00:00:26,760 --> 00:00:30,090 When multiple lineages of beneficial mutants 10 00:00:30,090 --> 00:00:32,790 are kind of spreading in a population at the same time, 11 00:00:32,790 --> 00:00:34,660 so when these clones are somehow interfering 12 00:00:34,660 --> 00:00:36,550 with each other in the population, 13 00:00:36,550 --> 00:00:39,110 how is it that that clonal interference effect 14 00:00:39,110 --> 00:00:42,035 kind of plays out in the context of these microbial asexual 15 00:00:42,035 --> 00:00:42,535 populations. 16 00:00:42,535 --> 00:00:45,720 And in particular, in this paper, what 17 00:00:45,720 --> 00:00:48,700 they were trying to measure was the distribution of effects 18 00:00:48,700 --> 00:00:50,390 of beneficial mutations. 19 00:00:50,390 --> 00:00:53,650 So you've put a population into some new environment. 20 00:00:53,650 --> 00:00:56,540 There may be some beneficial mutations that are possible. 21 00:00:56,540 --> 00:00:59,270 The question is, what is the probability distribution 22 00:00:59,270 --> 00:01:01,732 of effects of those beneficial mutations. 23 00:01:01,732 --> 00:01:04,000 And this is, as you might imagine, 24 00:01:04,000 --> 00:01:07,460 a very basic quantity we'd like to understand just 25 00:01:07,460 --> 00:01:10,574 to understand how evolution works in new environments.
26 00:01:10,574 --> 00:01:11,990 But it's actually something that's 27 00:01:11,990 --> 00:01:13,156 rather difficult to measure. 28 00:01:13,156 --> 00:01:15,130 And what's interesting here is it's 29 00:01:15,130 --> 00:01:20,960 difficult to measure for maybe a surprising reason. 30 00:01:20,960 --> 00:01:26,160 This is a wonderful example of how you kind of-- if life 31 00:01:26,160 --> 00:01:28,370 gives you lemons, then make lemonade. 32 00:01:28,370 --> 00:01:30,066 Is that the expression? 33 00:01:30,066 --> 00:01:30,953 OK. 34 00:01:30,953 --> 00:01:31,452 All right. 35 00:01:31,452 --> 00:01:34,090 So in this paper, they set out to try and measure 36 00:01:34,090 --> 00:01:37,450 the distribution of effects of beneficial mutations. 37 00:01:37,450 --> 00:01:39,227 They were unsuccessful. 38 00:01:39,227 --> 00:01:41,435 But they were unsuccessful for an interesting reason, 39 00:01:41,435 --> 00:01:43,336 so then they were able to write what I think 40 00:01:43,336 --> 00:01:47,304 is a very nice science paper. 41 00:01:47,304 --> 00:01:50,215 Are there any questions about where 42 00:01:50,215 --> 00:01:52,590 we are or administrative things before we get going? 43 00:01:57,340 --> 00:02:01,610 All right, so just to recap where we were from last time, 44 00:02:01,610 --> 00:02:03,990 we introduced this Moran model, in which 45 00:02:03,990 --> 00:02:06,650 there's a fixed population size, N. 46 00:02:06,650 --> 00:02:10,650 Then in each of these sort of cycles, what we do 47 00:02:10,650 --> 00:02:14,210 is we choose one individual to divide 48 00:02:14,210 --> 00:02:17,725 and we choose one individual to be replaced. 49 00:02:17,725 --> 00:02:23,830 And we have the daughter, say, cell replace the cell that 50 00:02:23,830 --> 00:02:26,150 was chosen for replacement.
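As a rough illustration of the cycle just described, here is a minimal Python sketch of the Moran process. It assumes that the dividing individual is chosen with probability proportional to its fitness and that the replaced individual is chosen uniformly at random (possibly the parent itself). The function names `moran_step`, `run_until_fixation`, and `estimate_fixation` are made up for this example and do not come from the lecture.

```python
import random


def moran_step(population, fitness, rng):
    """One Moran cycle: one individual divides, one individual is replaced."""
    # Assumption: the chance of dividing is proportional to fitness.
    weights = [fitness[kind] for kind in population]
    parent = rng.choices(range(len(population)), weights=weights, k=1)[0]
    # Assumption: the individual to be replaced is chosen uniformly at random.
    replaced = rng.randrange(len(population))
    population[replaced] = population[parent]


def run_until_fixation(start, fitness, rng):
    """Repeat Moran cycles until only one type is left; return that type."""
    population = list(start)
    while len(set(population)) > 1:
        moran_step(population, fitness, rng)
    return population[0]


def estimate_fixation(n, r, trials, seed=0):
    """Fraction of runs in which a single mutant of relative fitness r takes over."""
    rng = random.Random(seed)
    fitness = {"mutant": r, "resident": 1.0}
    start = ["mutant"] + ["resident"] * (n - 1)
    wins = sum(
        run_until_fixation(start, fitness, rng) == "mutant" for _ in range(trials)
    )
    return wins / trials


if __name__ == "__main__":
    # Small N keeps the simulation fast; a neutral mutant should fix about 1/N of the time.
    print(estimate_fixation(n=5, r=1.0, trials=2000))
```

The population size stays fixed at N because every division is paired with exactly one replacement.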
51 00:02:26,150 --> 00:02:28,690 Now in that model, what we found was 52 00:02:28,690 --> 00:02:31,560 that there's some probability, x1, 53 00:02:31,560 --> 00:02:33,500 for a new mutant with relative fitness, 54 00:02:33,500 --> 00:02:37,040 r, to actually fix in the population. 55 00:02:37,040 --> 00:02:41,850 And this is in the case of no clonal interference. 56 00:02:41,850 --> 00:02:45,740 So CI will in general be clonal interference today. 57 00:02:45,740 --> 00:02:50,170 So this is-- if it's just this one mutation 58 00:02:50,170 --> 00:02:54,590 that we're considering, then the question of surviving 59 00:02:54,590 --> 00:02:56,750 stochastic extinction kind of at the beginning 60 00:02:56,750 --> 00:03:00,190 is equivalent to the question of being 61 00:03:00,190 --> 00:03:03,610 able to fix in the population certainly 62 00:03:03,610 --> 00:03:07,490 for a beneficial mutation, with r greater than 1. 63 00:03:07,490 --> 00:03:09,360 So what we want to do today to just start 64 00:03:09,360 --> 00:03:11,890 is to try to think about how we can 65 00:03:11,890 --> 00:03:14,980 use this model and this equation in order 66 00:03:14,980 --> 00:03:17,578 to understand the probability of fixation 67 00:03:17,578 --> 00:03:19,744 when there's more than one mutant in the population. 68 00:03:22,780 --> 00:03:27,080 So here, what we found is that this thing, 69 00:03:27,080 --> 00:03:31,730 there are two limits that we might want to keep track of. 70 00:03:31,730 --> 00:03:41,020 So this thing goes to 1 over N, for neutral mutations, 71 00:03:41,020 --> 00:03:47,470 and it goes to S for moderately beneficial mutations. 72 00:03:50,300 --> 00:03:57,140 So S is defined as r minus 1. 73 00:03:57,140 --> 00:03:58,990 So it's this for S kind of greater 74 00:03:58,990 --> 00:04:04,145 than zero, and for S times N, much larger than 1.
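The equation on the board is not captured in the transcript. The standard single-mutant fixation probability for the Moran process, which reproduces both limits quoted above, is sketched below. The symbols x_1, r, N, and s follow the lecture; that this is the exact form written on the board is an assumption.

```latex
x_1 = \frac{1 - 1/r}{1 - 1/r^{N}}, \qquad s \equiv r - 1

% Neutral limit (r -> 1, i.e. |s| N << 1):
x_1 \to \frac{1}{N}

% Moderately beneficial limit (0 < s << 1 and s N >> 1), where 1/r^N is negligible:
x_1 \approx 1 - \frac{1}{1+s} = \frac{s}{1+s} \approx s
```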
75 00:04:11,210 --> 00:04:13,756 So I guess this already says that S is greater than zero, 76 00:04:13,756 --> 00:04:14,755 so we could ignore that. 77 00:04:17,380 --> 00:04:20,779 Are there any questions about this before we 78 00:04:20,779 --> 00:04:26,075 get going to practice using it? 79 00:04:26,075 --> 00:04:29,440 All right, so I just wanted to have us consider 80 00:04:29,440 --> 00:04:31,365 a few possible situations. 81 00:04:35,138 --> 00:04:37,480 All right, population size 10. 82 00:04:37,480 --> 00:04:39,290 We want to consider a situation where 83 00:04:39,290 --> 00:04:40,550 there are two mutants. 84 00:04:44,414 --> 00:04:49,760 And we're going to call them A and B, with relative fitness rA 85 00:04:49,760 --> 00:04:55,315 1.01, and rB 0.99. 86 00:04:55,315 --> 00:05:00,030 So A is a beneficial mutant, and B is a deleterious mutant. 87 00:05:00,030 --> 00:05:01,640 And in all these problems, we're going 88 00:05:01,640 --> 00:05:05,740 to assume that the rest of the population has fitness 1. 89 00:05:05,740 --> 00:05:08,114 So these fitnesses of A and B, they 90 00:05:08,114 --> 00:05:10,030 are as compared to the rest of the population. 91 00:05:10,030 --> 00:05:11,680 So in this case, there are eight other individuals 92 00:05:11,680 --> 00:05:12,765 with relative fitness 1. 93 00:05:15,490 --> 00:05:19,340 So the question is, what is the probability that A fixes. 94 00:05:26,450 --> 00:05:27,950 Takes over the population. 95 00:05:27,950 --> 00:05:32,120 Reaches number of A individuals equal to 10. 96 00:05:32,120 --> 00:05:34,800 At that stage, we're not allowing for any new mutations 97 00:05:34,800 --> 00:05:36,765 here, so then that's the end of the dynamics. 98 00:05:40,212 --> 00:05:41,670 So you can start thinking about it, 99 00:05:41,670 --> 00:05:43,980 and I will give some options here.
100 00:06:08,310 --> 00:06:10,619 All right so this is in the Moran model, 101 00:06:10,619 --> 00:06:12,160 what is the probability that A fixes. 102 00:06:19,620 --> 00:06:22,028 I'll give you 20 seconds to think about it. 103 00:06:43,950 --> 00:06:45,460 Do you need more time? 104 00:06:51,314 --> 00:06:53,230 Let's go ahead and vote, and see where we are. 105 00:06:53,230 --> 00:06:53,840 Ready. 106 00:06:53,840 --> 00:06:56,165 Three, two, one-- 107 00:07:01,070 --> 00:07:06,100 All right, we got some-- I'd say Bs, Cs is kind of dominant. 108 00:07:06,100 --> 00:07:06,900 All right. 109 00:07:06,900 --> 00:07:10,150 So I'd say that there's generally a disagreement 110 00:07:10,150 --> 00:07:14,852 between B and C. Interesting. 111 00:07:14,852 --> 00:07:16,320 OK. 112 00:07:16,320 --> 00:07:18,520 I think that there's-- it's almost 50/50. 113 00:07:18,520 --> 00:07:20,700 It is a bit splotchy, somehow. 114 00:07:20,700 --> 00:07:24,135 You may need to make some effort to find somebody that disagrees 115 00:07:24,135 --> 00:07:25,260 with you, but please do so. 116 00:07:25,260 --> 00:07:27,510 All right, turn to a neighbor and try to convince them 117 00:07:27,510 --> 00:07:30,716 that you're right. 118 00:07:30,716 --> 00:07:34,684 [SIDE CONVERSATIONS] 119 00:08:36,218 --> 00:08:39,194 PROFESSOR: All right, let's go ahead and reconvene and see 120 00:08:39,194 --> 00:08:42,250 if anybody [INAUDIBLE]. 121 00:08:42,250 --> 00:08:43,200 Ready. 122 00:08:43,200 --> 00:08:47,692 Three, two, one. 123 00:08:47,692 --> 00:08:49,580 All right. 124 00:08:49,580 --> 00:08:51,660 I'd say that there's some migration towards C, 125 00:08:51,660 --> 00:08:56,170 but it's not all the way. 126 00:08:56,170 --> 00:08:58,320 All right, different arguments. 127 00:08:58,320 --> 00:09:03,024 And you may want to-- D is the argument to be made over here. 128 00:09:03,024 --> 00:09:04,190 All right, so let's hear it. 129 00:09:04,190 --> 00:09:05,160 AUDIENCE: [INAUDIBLE]. 
130 00:09:05,160 --> 00:09:06,618 So if there's one individual from A 131 00:09:06,618 --> 00:09:09,764 and one individual from B, and basically they're 132 00:09:09,764 --> 00:09:12,250 about the same fitness, then the probability 133 00:09:12,250 --> 00:09:15,729 that they should, I guess, take over is 50%. 134 00:09:15,729 --> 00:09:18,214 Well that assumes that one of them will take over. 135 00:09:18,214 --> 00:09:18,936 PROFESSOR: Yeah, is there any reason 136 00:09:18,936 --> 00:09:20,769 to assume that one of them should take over? 137 00:09:20,769 --> 00:09:22,687 AUDIENCE: No. 138 00:09:22,687 --> 00:09:25,120 PROFESSOR: I think that we may agree that the probability 139 00:09:25,120 --> 00:09:27,105 that A fixes is going to be approximately 140 00:09:27,105 --> 00:09:29,249 equal to the probability that B fixes. 141 00:09:29,249 --> 00:09:30,748 That's going to be a useful insight. 142 00:09:30,748 --> 00:09:32,740 But [INAUDIBLE] both those things. 143 00:09:32,740 --> 00:09:34,732 So we have to keep track of that. 144 00:09:34,732 --> 00:09:38,995 So that's the argument for why it's not D. 145 00:09:38,995 --> 00:09:40,210 AUDIENCE: You're welcome. 146 00:09:40,210 --> 00:09:42,420 PROFESSOR: Anytime. [INAUDIBLE]. 147 00:09:42,420 --> 00:09:46,590 So between B and C, how can we think about this? 148 00:09:49,415 --> 00:09:51,540 You can tell us once again what your neighbor said. 149 00:09:51,540 --> 00:09:52,040 That's fine. 150 00:09:52,040 --> 00:09:52,726 AUDIENCE: Why? 151 00:09:52,726 --> 00:09:53,350 PROFESSOR: Yes. 152 00:09:53,350 --> 00:09:57,495 AUDIENCE: Um, well, my neighbor, as opposed to me, 153 00:09:57,495 --> 00:10:01,068 made the correct decision to try to think about it intuitively 154 00:10:01,068 --> 00:10:02,984 instead of trying to get through all the math. 155 00:10:02,984 --> 00:10:03,764 PROFESSOR: Yes. 
156 00:10:03,764 --> 00:10:05,930 Always good to think about things intuitively rather 157 00:10:05,930 --> 00:10:07,090 than-- that's-- 158 00:10:07,090 --> 00:10:08,830 AUDIENCE: [INAUDIBLE]. 159 00:10:08,830 --> 00:10:11,210 So A is very slightly beneficial, 160 00:10:11,210 --> 00:10:14,910 and so it is just a little bit more 161 00:10:14,910 --> 00:10:17,392 likely to fix than a neutral mutation. 162 00:10:17,392 --> 00:10:21,110 PROFESSOR: OK, and how can we quantify this statement? 163 00:10:21,110 --> 00:10:23,486 That I know I said that I didn't like math, but you know, 164 00:10:23,486 --> 00:10:25,151 every now and then we should look at it. 165 00:10:25,151 --> 00:10:27,380 Yeah, so this statement that A is approximately 166 00:10:27,380 --> 00:10:29,130 a neutral mutation. 167 00:10:29,130 --> 00:10:31,878 How is it that we can-- how do we know that? 168 00:10:31,878 --> 00:10:32,832 I mean did-- 169 00:10:32,832 --> 00:10:35,217 AUDIENCE: We have criteria for nearly neutral. 170 00:10:35,217 --> 00:10:37,580 PROFESSOR: Yes, we have a criteria for nearly neutral. 171 00:10:37,580 --> 00:10:39,913 What does that-- what does it mean to be nearly neutral? 172 00:10:39,913 --> 00:10:42,632 AUDIENCE: The magnitude of S times N 173 00:10:42,632 --> 00:10:44,361 is much, much less than one. 174 00:10:44,361 --> 00:10:45,360 PROFESSOR: That's right. 175 00:10:45,360 --> 00:10:48,360 S times N is much less than one, which 176 00:10:48,360 --> 00:10:50,690 means that it's nearly neutral. 177 00:10:50,690 --> 00:10:53,180 What that means is that the probability of fixing 178 00:10:53,180 --> 00:10:56,610 is the same as if it were a neutral mutation, 179 00:10:56,610 --> 00:10:59,218 or approximately the same. 180 00:10:59,218 --> 00:11:01,548 And is that the case here? 181 00:11:01,548 --> 00:11:02,480 AUDIENCE: Yes. 182 00:11:02,480 --> 00:11:03,412 PROFESSOR: Yes. 183 00:11:03,412 --> 00:11:06,460 S is 0.01, so 1/100. 184 00:11:06,460 --> 00:11:07,780 N is 10. 
185 00:11:07,780 --> 00:11:11,660 So S times N is equal to 0.1. 186 00:11:11,660 --> 00:11:15,212 Which means that both-- so A is maybe nearly neutral. 187 00:11:15,212 --> 00:11:16,045 Is B nearly neutral? 188 00:11:16,045 --> 00:11:16,740 AUDIENCE: Yes. 189 00:11:16,740 --> 00:11:19,430 PROFESSOR: Yes, so both A and B are nearly neutral. 190 00:11:19,430 --> 00:11:22,110 One of them is an advantageous mutant, one is a deleterious. 191 00:11:22,110 --> 00:11:24,390 But in this population size, they 192 00:11:24,390 --> 00:11:27,420 both behave as if they're nearly neutral. 193 00:11:27,420 --> 00:11:29,110 But in a larger population, so if 194 00:11:29,110 --> 00:11:30,245 we had a million individuals, would these 195 00:11:30,245 --> 00:11:31,119 be neutral mutations? 196 00:11:31,119 --> 00:11:31,706 AUDIENCE: No. 197 00:11:31,706 --> 00:11:33,840 PROFESSOR: No. 198 00:11:33,840 --> 00:11:36,110 So the question of whether something behaves 199 00:11:36,110 --> 00:11:39,021 as a neutral mutation depends upon the population size 200 00:11:39,021 --> 00:11:40,360 via this relation. 201 00:11:44,430 --> 00:11:47,345 So that's-- and then how do we go from there? 202 00:11:47,345 --> 00:11:52,676 AUDIENCE: Well the probability of a neutral mutation 203 00:11:52,676 --> 00:11:54,946 [INAUDIBLE] population is going to be 1/N. 204 00:11:54,946 --> 00:11:55,946 PROFESSOR: That's right. 205 00:11:55,946 --> 00:11:58,786 A neutral mutation will fix with a probability of 1 over N. 206 00:11:58,786 --> 00:12:02,540 So indeed, we're going to get this. 207 00:12:02,540 --> 00:12:05,950 And we need to highlight that we can-- 208 00:12:05,950 --> 00:12:08,510 you have to be very careful about just sticking things 209 00:12:08,510 --> 00:12:09,530 into formulas. 210 00:12:09,530 --> 00:12:12,460 So you may remember that the probability 211 00:12:12,460 --> 00:12:17,050 of a beneficial mutant fixing in the population is S.
212 00:12:17,050 --> 00:12:19,750 But that's only for S times N being much larger than one, 213 00:12:19,750 --> 00:12:22,700 and that's when there's no clonal interference. 214 00:12:22,700 --> 00:12:25,200 When you don't have to compete with other beneficial mutants 215 00:12:25,200 --> 00:12:26,810 in the lineage. 216 00:12:26,810 --> 00:12:28,963 We're going to see an example of this in a moment. 217 00:12:28,963 --> 00:12:30,463 But if you just have a single mutant 218 00:12:30,463 --> 00:12:34,900 in the population, that's beneficial in the sense of it's 219 00:12:34,900 --> 00:12:36,390 not neutral. 220 00:12:36,390 --> 00:12:37,740 Then the probability of fixation is S. 221 00:12:37,740 --> 00:12:40,420 And of course, this is nonsensical, right? 222 00:12:40,420 --> 00:12:43,790 Because you can't-- it wouldn't make any-- 223 00:12:43,790 --> 00:12:45,990 certainly it wouldn't-- given that you say oh well, 224 00:12:45,990 --> 00:12:48,448 if it were neutral, it would fix with a probability of 10%. 225 00:12:48,448 --> 00:12:50,920 And then you say oh, but if it's an advantageous mutant, 226 00:12:50,920 --> 00:12:54,776 it can't fix with a probability less than that. 227 00:12:54,776 --> 00:12:58,142 So this is nearly neutral, fixation probability is 10%. 228 00:12:58,142 --> 00:12:59,975 Are there any questions about that argument? 229 00:13:04,725 --> 00:13:09,141 So let's consider a different problem. 230 00:13:09,141 --> 00:13:12,943 So now, we're going to consider a population that's larger. 231 00:13:12,943 --> 00:13:14,949 So it's going to be the same situation, where 232 00:13:14,949 --> 00:13:19,928 we have these two mutants present as single individuals. 233 00:13:28,401 --> 00:13:30,900 So what we want to know now is the probability that B fixes. 234 00:13:37,261 --> 00:13:39,260 Once again, you should start thinking about this 235 00:13:39,260 --> 00:13:43,590 while I type it.
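For the population-size-10 question discussed above, the nearly neutral argument can be checked with a few lines of Python. This is only a sketch: it assumes the standard single-mutant Moran fixation formula and applies it to A and B one at a time, ignoring the presence of the other mutant. The helper name `fixation_probability` is hypothetical.

```python
def fixation_probability(r: float, n: int) -> float:
    """Single-mutant Moran fixation probability (assumed standard formula)."""
    if r == 1.0:
        return 1.0 / n  # exactly neutral: the formula's limit is 1/N
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


n = 10
for name, r in (("A", 1.01), ("B", 0.99)):
    s = r - 1.0
    # |s| * N = 0.1 here, which is small compared with 1, so both are nearly neutral.
    print(name, "|s|*N =", abs(s) * n, "x1 =", fixation_probability(r, n))
print("neutral 1/N =", 1.0 / n)
```

Both values land close to 1/N = 0.1, with A slightly above it and B slightly below, rather than anywhere near S = 0.01.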
236 00:13:43,590 --> 00:13:45,200 Oh, there was one other thing I wanted 237 00:13:45,200 --> 00:13:47,980 to say, actually, over here. 238 00:13:47,980 --> 00:13:52,410 If we modeled these population dynamics 239 00:13:52,410 --> 00:13:57,240 using differential equations-- so instead of doing the Moran 240 00:13:57,240 --> 00:14:00,692 process where we have stochastic dynamics of division, 241 00:14:00,692 --> 00:14:03,150 replacement, and so forth, if you write down a differential 242 00:14:03,150 --> 00:14:07,720 equation describing this situation, 243 00:14:07,720 --> 00:14:09,277 what is the probability that A fixes? 244 00:14:09,277 --> 00:14:10,735 I'm going to give you five seconds. 245 00:14:10,735 --> 00:14:11,556 AUDIENCE: One. 246 00:14:11,556 --> 00:14:14,532 PROFESSOR: All right, one. 247 00:14:14,532 --> 00:14:16,516 OK, so you don't get to vote. 248 00:14:16,516 --> 00:14:20,062 Indeed, 1. 249 00:14:20,062 --> 00:14:22,270 So if you had written down the differential equation, 250 00:14:22,270 --> 00:14:25,360 then there's no randomness there. 251 00:14:25,360 --> 00:14:28,480 Whichever mutant has the largest fitness 252 00:14:28,480 --> 00:14:31,535 will fix with probability 1. 253 00:14:31,535 --> 00:14:32,035 OK? 254 00:14:35,602 --> 00:14:36,810 OK, probability that B fixes. 255 00:15:09,470 --> 00:15:11,260 So this-- you show me both [INAUDIBLE]. 256 00:15:13,870 --> 00:15:16,541 I'll give you 20, 30 seconds to think about this. 257 00:15:27,425 --> 00:15:27,925 Yes? 258 00:15:27,925 --> 00:15:31,066 AUDIENCE: Is there just one [INAUDIBLE]? 259 00:15:31,066 --> 00:15:32,950 PROFESSOR: Yes. 260 00:15:32,950 --> 00:15:35,250 All right, so this kind of-- that's why-- yeah. 261 00:15:35,250 --> 00:15:36,416 So it's just the same thing. 262 00:15:36,416 --> 00:15:42,420 And so that's a single A mutant and a single B mutant. 263 00:15:42,420 --> 00:15:49,298 Single A, single B.
But you should 264 00:15:49,298 --> 00:15:52,072 know how to calculate these things if I tell you that they 265 00:15:52,072 --> 00:15:53,030 were some other number. 266 00:16:09,214 --> 00:16:10,130 Do you need more time? 267 00:16:16,804 --> 00:16:19,470 So in general, if you don't look at me when I ask that question, 268 00:16:19,470 --> 00:16:20,840 I'm going to take that as meaning that you're still 269 00:16:20,840 --> 00:16:21,640 thinking about it. 270 00:16:21,640 --> 00:16:23,196 Is that how I should interpret that? 271 00:16:23,196 --> 00:16:23,696 OK. 272 00:16:48,020 --> 00:16:50,930 Let's go ahead and vote just so that we can see where we are. 273 00:16:50,930 --> 00:16:52,270 OK, ready. 274 00:16:52,270 --> 00:16:54,550 Three, two, one. 275 00:16:58,486 --> 00:17:00,500 All right. 276 00:17:00,500 --> 00:17:01,730 We're all over the place. 277 00:17:01,730 --> 00:17:04,569 That's great. 278 00:17:04,569 --> 00:17:07,660 Because that means your neighbor will either 279 00:17:07,660 --> 00:17:09,869 be able to help you or confuse you, or do something. 280 00:17:09,869 --> 00:17:13,266 So let's [INAUDIBLE]. 281 00:17:13,266 --> 00:17:15,540 There's a million-- population size 1,000,000. 282 00:17:15,540 --> 00:17:17,849 We have two mutants in the population. 283 00:17:17,849 --> 00:17:20,619 One has relative fitness 2, one has relative fitness 284 00:17:20,619 --> 00:17:25,599 of-- an advantage of 1%. 285 00:17:25,599 --> 00:17:30,830 One thing to be careful of here is the equation 286 00:17:30,830 --> 00:17:37,655 that we have for probability of fixation here-- ah. 287 00:17:37,655 --> 00:17:39,530 There's one more thing that I should say. 288 00:17:39,530 --> 00:17:43,640 It's for S. Well, it's an S much less than 1. 289 00:17:46,380 --> 00:17:51,390 Just for-- so this is for a moderately-beneficial mutation. 290 00:17:51,390 --> 00:17:53,931 Otherwise, you have to use the full approach. 291 00:17:53,931 --> 00:17:54,925 Yes? 
292 00:17:54,925 --> 00:17:56,416 AUDIENCE: [INAUDIBLE]. 293 00:17:56,416 --> 00:17:57,410 PROFESSOR: All right. 294 00:17:57,410 --> 00:17:58,868 Go ahead and turn to your neighbor, 295 00:17:58,868 --> 00:18:02,877 and try to convince them that you've figured this out. 296 00:18:02,877 --> 00:18:06,853 [SIDE CONVERSATIONS] 297 00:19:32,326 --> 00:19:35,870 PROFESSOR: So everybody should be talking to somebody. 298 00:19:35,870 --> 00:19:36,720 What do you think? 299 00:19:41,067 --> 00:19:45,550 Well it's OK to not know, but that's why you-- 300 00:19:45,550 --> 00:19:49,920 you should either join them, or if you look behind you, 301 00:19:49,920 --> 00:19:55,610 there's someone else that looks very, uh-- All right, 302 00:19:55,610 --> 00:19:56,897 why don't-- let's-- OK. 303 00:19:56,897 --> 00:19:57,730 Let's-- what's that? 304 00:19:57,730 --> 00:20:00,100 AUDIENCE: I said my reasons are always good reasons. 305 00:20:00,100 --> 00:20:01,950 PROFESSOR: They're always good reasons. 306 00:20:01,950 --> 00:20:05,280 They'll be educational. 307 00:20:05,280 --> 00:20:09,560 All right, let's go ahead and reconvene. 308 00:20:09,560 --> 00:20:13,150 I know that there were a lot of different guesses 309 00:20:13,150 --> 00:20:15,410 on this problem, which means maybe 310 00:20:15,410 --> 00:20:19,060 that the conversations were scattered or difficult, 311 00:20:19,060 --> 00:20:21,680 or it was difficult to converge. 312 00:20:21,680 --> 00:20:24,940 But still, let's see if it made any difference. 313 00:20:24,940 --> 00:20:25,440 Ready. 314 00:20:25,440 --> 00:20:28,230 Three, two, one. 315 00:20:28,230 --> 00:20:30,064 OK. 316 00:20:30,064 --> 00:20:30,730 All right, yeah. 317 00:20:30,730 --> 00:20:33,570 So I'd say that the conversations have helped. 318 00:20:33,570 --> 00:20:36,330 But we're still a fair distribution 319 00:20:36,330 --> 00:20:43,370 between maybe A, C, and D still. 320 00:20:43,370 --> 00:20:43,932 But-- yeah. 
321 00:20:43,932 --> 00:20:45,640 Does somebody want to give an explanation 322 00:20:45,640 --> 00:20:49,592 for one of those three, or something else? 323 00:20:49,592 --> 00:20:50,300 Yeah, [? John. ?] 324 00:20:50,300 --> 00:20:53,590 AUDIENCE: [INAUDIBLE], then you know [INAUDIBLE]. 325 00:20:57,350 --> 00:20:58,310 PROFESSOR: Right. 326 00:20:58,310 --> 00:20:59,730 AUDIENCE: And it indicates that [INAUDIBLE]. 327 00:20:59,730 --> 00:21:00,150 PROFESSOR: Perfect. 328 00:21:00,150 --> 00:21:00,290 OK. 329 00:21:00,290 --> 00:21:01,956 Now I just want to make sure that we all 330 00:21:01,956 --> 00:21:02,820 agree on this logic. 331 00:21:02,820 --> 00:21:05,590 So if the only thing we had was the mutation B, 332 00:21:05,590 --> 00:21:08,900 then the probability of fixation would be 1% because, 333 00:21:08,900 --> 00:21:12,220 in that case-- is this a nearly neutral mutation? 334 00:21:12,220 --> 00:21:13,490 No. 335 00:21:13,490 --> 00:21:15,570 So that means that its probability of fixation 336 00:21:15,570 --> 00:21:19,670 is going to be around S, in this case, 0.01. 337 00:21:19,670 --> 00:21:20,430 OK, but-- 338 00:21:20,430 --> 00:21:24,595 AUDIENCE: But now A is there, so you-- first, you 339 00:21:24,595 --> 00:21:27,290 expect that-- you expect the probability of B 340 00:21:27,290 --> 00:21:30,720 to fix to be lower than 1%. 341 00:21:30,720 --> 00:21:34,150 A fixes in half of the cases. 342 00:21:34,150 --> 00:21:35,170 PROFESSOR: That's right. 343 00:21:35,170 --> 00:21:37,325 AUDIENCE: So in these cases, B's not going to fix. 344 00:21:37,325 --> 00:21:38,210 PROFESSOR: Exactly. 345 00:21:38,210 --> 00:21:40,626 AUDIENCE: So my answer is C, but I mean, it's a bit of a-- 346 00:21:40,626 --> 00:21:42,447 PROFESSOR: Yeah, no, no, I mean it's-- no, 347 00:21:42,447 --> 00:21:44,030 but that's precisely what's happening. 348 00:21:44,030 --> 00:21:48,680 So for B to fix, it requires two things to happen.
349 00:21:48,680 --> 00:21:51,760 First, B has to survive stochastic extinction. 350 00:22:00,130 --> 00:22:05,340 And-- so we're going to multiply these probabilities-- 351 00:22:05,340 --> 00:22:08,130 A doesn't survive. 352 00:22:11,170 --> 00:22:13,320 We're going to discuss in a moment what this really 353 00:22:13,320 --> 00:22:15,403 means in terms of surviving stochastic extinction. 354 00:22:15,403 --> 00:22:18,090 But when a new mutation appears in the population, 355 00:22:18,090 --> 00:22:20,660 it will either relatively quickly go extinct, 356 00:22:20,660 --> 00:22:23,360 or it will relatively quickly become established, 357 00:22:23,360 --> 00:22:24,380 is what we call that. 358 00:22:24,380 --> 00:22:25,755 What that means is that it will-- 359 00:22:25,755 --> 00:22:27,180 if it's a beneficial mutation, it 360 00:22:27,180 --> 00:22:30,560 will take over unless it's outcompeted by someone else. 361 00:22:30,560 --> 00:22:33,470 So it has to be that B survives stochastic extinction, 362 00:22:33,470 --> 00:22:39,400 and A doesn't survive this initial phase. 363 00:22:39,400 --> 00:22:42,080 And this thing-- so that means that the probability 364 00:22:42,080 --> 00:22:47,552 that B fixes is going to be-- well, OK. 365 00:22:47,552 --> 00:22:49,760 The probability that B survives stochastic extinction 366 00:22:49,760 --> 00:22:53,004 is going to be 1%. 367 00:22:53,004 --> 00:22:56,320 The probability that A survives is 1/2, so the probability 368 00:22:56,320 --> 00:22:57,820 that it doesn't survive is also 1/2. 369 00:23:03,226 --> 00:23:04,850 Now the half, just to remind ourselves, 370 00:23:04,850 --> 00:23:08,970 comes from this thing x1, for A. There was this [INAUDIBLE]. 371 00:23:15,190 --> 00:23:18,728 This thing is what? 372 00:23:18,728 --> 00:23:19,644 AUDIENCE: [INAUDIBLE]. 373 00:23:19,644 --> 00:23:20,560 PROFESSOR: Right. 374 00:23:20,560 --> 00:23:22,540 So this thing very quickly goes to zero. 
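The multiplication being set up here can be checked numerically. The sketch below assumes the standard single-mutant Moran fixation formula for each survival probability and, following the lecture's argument, treats the fates of A and B as independent in a population of a million. The helper `fixation_probability` and the variable names are illustrative only.

```python
def fixation_probability(r: float, n: int) -> float:
    """Single-mutant Moran fixation probability (assumed standard formula)."""
    if r == 1.0:
        return 1.0 / n
    # For r > 1 and large n, r ** (-n) underflows harmlessly to 0.0.
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


n = 1_000_000
r_a, r_b = 2.0, 1.01

p_b_survives = fixation_probability(r_b, n)  # close to S = 0.01
p_a_survives = fixation_probability(r_a, n)  # close to 1 - 1/2 = 1/2

# B fixes only if B survives stochastic extinction AND A does not survive.
# Assumption: the two events are independent, so the probabilities multiply.
p_b_fixes = p_b_survives * (1.0 - p_a_survives)

print("P(B survives)  =", p_b_survives)
print("P(A survives)  =", p_a_survives)
print("P(B fixes)     =", p_b_fixes)
```

The product comes out at roughly half a percent, which is the 1% times 1/2 estimate from the discussion.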
375 00:23:22,540 --> 00:23:25,410 So this is 1/2 to the millionth power. 376 00:23:25,410 --> 00:23:26,860 So that's zero. 377 00:23:26,860 --> 00:23:28,320 So we just get 1 minus 1/2. 378 00:23:38,500 --> 00:23:39,000 Yes. 379 00:23:39,000 --> 00:23:40,458 AUDIENCE: But what are we assuming? 380 00:23:40,458 --> 00:23:42,742 It's not clear to me in this case. 381 00:23:42,742 --> 00:23:44,950 Because we're assuming they're completely independent 382 00:23:44,950 --> 00:23:45,825 and we can multiply-- 383 00:23:45,825 --> 00:23:46,825 PROFESSOR: That's right. 384 00:23:46,825 --> 00:23:47,980 AUDIENCE: Together, so-- 385 00:23:47,980 --> 00:23:50,370 PROFESSOR: Yeah, that's right. 386 00:23:50,370 --> 00:23:54,290 And I would say that for-- from the same [INAUDIBLE] 387 00:23:54,290 --> 00:23:55,710 the population of a million, these 388 00:23:55,710 --> 00:24:00,740 are totally non-interacting processes, for all intents 389 00:24:00,740 --> 00:24:01,410 and purposes. 390 00:24:01,410 --> 00:24:04,460 If you were down to a population size of 10, 391 00:24:04,460 --> 00:24:07,132 then you might have to worry that they're 392 00:24:07,132 --> 00:24:08,090 going to interact more. 393 00:24:08,090 --> 00:24:13,260 But in this process of-- in this Moran process, 394 00:24:13,260 --> 00:24:14,997 you have a million individuals. 395 00:24:14,997 --> 00:24:16,580 It's true that this is twice as likely 396 00:24:16,580 --> 00:24:18,240 to be chosen to divide as a normal cell, 397 00:24:18,240 --> 00:24:19,906 but you're talking about a million cells 398 00:24:19,906 --> 00:24:21,500 in that population. 399 00:24:21,500 --> 00:24:24,630 So the correction to the things that you're worrying about 400 00:24:24,630 --> 00:24:25,970 is a [INAUDIBLE] 10 to the 6. 401 00:24:25,970 --> 00:24:28,802 AUDIENCE: And so the thing [INAUDIBLE], yeah. 
402 00:24:28,802 --> 00:24:33,050 But if rA and rB were closer, I should [INAUDIBLE] correction-- 403 00:24:33,050 --> 00:24:33,980 PROFESSOR: No. 404 00:24:33,980 --> 00:24:38,450 It's not about how close rA and rB are to each other. 405 00:24:38,450 --> 00:24:42,510 Even if they had the exact same relative fitness. 406 00:24:42,510 --> 00:24:44,332 Even if they were both 2 or both 1.01, 407 00:24:44,332 --> 00:24:46,040 then they would still be non-interacting. 408 00:24:46,040 --> 00:24:48,565 And that's just because they're in this sea of individuals. 409 00:24:48,565 --> 00:24:50,481 AUDIENCE: But the thing that I'm worried about 410 00:24:50,481 --> 00:24:52,510 is, for example, rA could take over quickly, 411 00:24:52,510 --> 00:24:54,810 and then rB could take over. 412 00:24:54,810 --> 00:24:55,650 And that-- 413 00:24:55,650 --> 00:24:56,385 PROFESSOR: OK-- 414 00:24:56,385 --> 00:24:57,792 AUDIENCE: Would not be-- 415 00:24:57,792 --> 00:24:58,500 PROFESSOR: Right. 416 00:24:58,500 --> 00:24:59,130 AUDIENCE: [INAUDIBLE]. 417 00:24:59,130 --> 00:25:00,070 PROFESSOR: OK so-- 418 00:25:00,070 --> 00:25:01,569 AUDIENCE: And that's not reasonable. 419 00:25:01,569 --> 00:25:03,675 [? If this case ?] is rA [INAUDIBLE], then rB-- 420 00:25:03,675 --> 00:25:05,012 the effective rB-- 421 00:25:05,012 --> 00:25:05,720 PROFESSOR: Right. 422 00:25:05,720 --> 00:25:07,120 OK, so there is one problem-- you're right. 423 00:25:07,120 --> 00:25:08,190 So I should be careful. 424 00:25:08,190 --> 00:25:11,640 If the two mutants have exactly the same fitness, 425 00:25:11,640 --> 00:25:13,770 then if they both survive stochastic extinction, 426 00:25:13,770 --> 00:25:16,410 they're going to-- then you can describe everything 427 00:25:16,410 --> 00:25:17,480 as a differential equation.
428 00:25:17,480 --> 00:25:20,310 And this is the distinction between this initial phase 429 00:25:20,310 --> 00:25:21,940 of trying to become established as 430 00:25:21,940 --> 00:25:24,710 compared to the later dynamics in that the established means 431 00:25:24,710 --> 00:25:27,490 that if it's a beneficial mutation, it will spread, 432 00:25:27,490 --> 00:25:29,440 and basically deterministically. 433 00:25:29,440 --> 00:25:32,530 So in that situation, if you had two mutants, and both 434 00:25:32,530 --> 00:25:35,809 of relative fitness 1.01, then they both 435 00:25:35,809 --> 00:25:37,600 would try to survive stochastic extinction. 436 00:25:37,600 --> 00:25:38,974 And let's say that they both did. 437 00:25:38,974 --> 00:25:41,870 Then they would both spread in the population exponentially. 438 00:25:41,870 --> 00:25:44,610 Of course, given that the initial dynamics 439 00:25:44,610 --> 00:25:46,144 are stochastic, then it would still 440 00:25:46,144 --> 00:25:48,060 be the case that one of them would just happen 441 00:25:48,060 --> 00:25:49,340 to be above the other one. 442 00:25:49,340 --> 00:25:52,230 But then they would grow exponentially together. 443 00:25:52,230 --> 00:25:54,180 That's a more complicated situation 444 00:25:54,180 --> 00:25:56,360 than what we're going to discuss. 445 00:25:56,360 --> 00:25:57,906 And that would actually be very rare. 446 00:25:57,906 --> 00:25:59,280 Because each one of these mutants 447 00:25:59,280 --> 00:26:00,988 would only have a 1% chance of surviving. 448 00:26:00,988 --> 00:26:03,760 So only one in 10 to the 4 times would you 449 00:26:03,760 --> 00:26:05,417 end up with both of these mutants 450 00:26:05,417 --> 00:26:06,750 surviving stochastic extinction. 451 00:26:10,790 --> 00:26:15,720 Any other questions about-- so I just want to be clear here. 
452 00:26:15,720 --> 00:26:19,336 So this is when A is equal to 1-- the number of A individuals 453 00:26:19,336 --> 00:26:21,210 is 1, the number of B individuals is equal to 1. 454 00:26:28,420 --> 00:26:35,890 There was a question, what if we started with A equal to 10. 455 00:26:35,890 --> 00:26:38,830 So let's say the-- yeah. 456 00:26:38,830 --> 00:26:39,330 Let's 457 00:26:39,330 --> 00:26:44,190 say we start with 10 A and 10 B, for example. 458 00:26:44,190 --> 00:26:47,415 So instead of starting with just one mutant A and one mutant B, now 459 00:26:47,415 --> 00:26:49,630 we have 10 of each. 460 00:26:49,630 --> 00:26:53,140 So we're changing things symmetrically. 461 00:26:53,140 --> 00:26:56,660 What's going to be the probability that B fixes, 462 00:26:56,660 --> 00:26:58,479 approximately? 463 00:26:58,479 --> 00:27:00,270 This is going to be a little funny somehow. 464 00:27:00,270 --> 00:27:04,850 But OK, we'll-- just think about it for 15 seconds. 465 00:27:04,850 --> 00:27:09,760 What's going to be the probability that B fixes in this case? 466 00:27:32,624 --> 00:27:35,205 All right, I'm worried you guys are not even trying. 467 00:27:38,964 --> 00:27:40,630 AUDIENCE: Could you repeat the question? 468 00:27:40,630 --> 00:27:41,560 PROFESSOR: Sure, yeah. 469 00:27:41,560 --> 00:27:42,970 OK. 470 00:27:42,970 --> 00:27:45,226 So we just did the problem where we had a single mutant A 471 00:27:45,226 --> 00:27:47,350 and a single mutant B. And we found the probability 472 00:27:47,350 --> 00:27:50,980 that B fixes was 0.005. 473 00:27:50,980 --> 00:27:52,540 Now what I want to know is, does anything 474 00:27:52,540 --> 00:27:55,022 change if, instead of having a single A and single B, 475 00:27:55,022 --> 00:27:59,910 now what we have is 10 A, 10 B, and then the rest of them just 476 00:27:59,910 --> 00:28:01,275 being [INAUDIBLE] one. 477 00:28:14,620 --> 00:28:15,890 So it's not fancy math. 478 00:28:15,890 --> 00:28:17,765 But it's still, I can see, a little confusing.
479 00:28:35,857 --> 00:28:37,440 I just wanted you to see where we are. 480 00:28:37,440 --> 00:28:44,264 We'll use the same options here, whatever is closest. 481 00:28:44,264 --> 00:28:46,430 And we can argue about what close means in a moment. 482 00:28:46,430 --> 00:28:49,715 All right, ready, three, two, one. 483 00:28:52,860 --> 00:28:57,995 So it's kind of some A's and B's, some E's. 484 00:29:00,510 --> 00:29:02,750 All right. 485 00:29:02,750 --> 00:29:04,390 Well, we'll talk about it. 486 00:29:07,320 --> 00:29:08,580 All right. 487 00:29:08,580 --> 00:29:10,730 Maybe we'll jump ahead to the group discussion, 488 00:29:10,730 --> 00:29:12,430 just because I don't want to spend 489 00:29:12,430 --> 00:29:13,680 too much time on this problem. 490 00:29:13,680 --> 00:29:15,420 But these are the kinds of problems, 491 00:29:15,420 --> 00:29:18,170 I think, it's very important to practice with, 492 00:29:18,170 --> 00:29:19,870 because they just develop your intuition 493 00:29:19,870 --> 00:29:22,119 for the stochastic dynamics that occur in populations. 494 00:29:24,960 --> 00:29:27,540 It's going to be something like B. You know, 495 00:29:27,540 --> 00:29:31,250 that's not why I put B in there. 496 00:29:31,250 --> 00:29:33,580 But let's just see how it goes. 497 00:29:33,580 --> 00:29:35,350 And the way to find the answer here 498 00:29:35,350 --> 00:29:37,580 is the same as what we did before, except that now we 499 00:29:37,580 --> 00:29:40,700 have to think about these processes somewhat differently. 500 00:29:40,700 --> 00:29:42,620 So first of all, in order for B to fix, 501 00:29:42,620 --> 00:29:46,630 it requires that still B has to survive stochastic extinction. 502 00:29:46,630 --> 00:29:52,514 So we still have the same 0.01 here. 503 00:29:52,514 --> 00:29:54,430 But now we have to think about the probability 504 00:29:54,430 --> 00:29:56,980 that A doesn't survive. 505 00:29:56,980 --> 00:30:00,270 And that's going to change now, right? 
506 00:30:00,270 --> 00:30:03,660 Oh, I should be-- OK. 507 00:30:03,660 --> 00:30:05,452 Yeah, OK. 508 00:30:05,452 --> 00:30:06,660 I forgot that we had 10 here. 509 00:30:10,522 --> 00:30:12,730 We could calculate what it is, and actually, maybe we 510 00:30:12,730 --> 00:30:13,780 should have. 511 00:30:13,780 --> 00:30:16,460 Well, let's calculate that. 512 00:30:16,460 --> 00:30:21,440 All right, so this is x1 for B. We still 513 00:30:21,440 --> 00:30:24,560 have 1.01 to the mill-- so we can still 514 00:30:24,560 --> 00:30:27,029 ignore this bottom part. 515 00:30:27,029 --> 00:30:28,570 But what it's going to be is, oh, no, 516 00:30:28,570 --> 00:30:31,380 but there's this x10 of B, where you have 10 individuals. 517 00:30:31,380 --> 00:30:38,295 So this is 1 minus 1 over 1.01 to the 10th power. 518 00:30:38,295 --> 00:30:40,760 AUDIENCE: But we have clonal interference here. 519 00:30:42,754 --> 00:30:44,920 PROFESSOR: Right now what we're trying to understand 520 00:30:44,920 --> 00:30:51,550 is if only we had B-- this is the probability that we're 521 00:30:51,550 --> 00:30:53,720 trying to calculate this part of it. 522 00:30:53,720 --> 00:30:55,770 So if we ignore A, what's the probability that B 523 00:30:55,770 --> 00:30:57,061 survives stochastic extinction? 524 00:31:01,828 --> 00:31:04,840 Oh, this is fine. 525 00:31:04,840 --> 00:31:08,460 So it's a modest change. 526 00:31:08,460 --> 00:31:14,110 Because this is going to be 1 minus. 527 00:31:14,110 --> 00:31:27,180 So this thing is around 1.1, because 1 plus x to the n 528 00:31:27,180 --> 00:31:31,915 is around 1 plus nx for nx much less than 1. 529 00:31:35,070 --> 00:31:48,110 So this is 1.1. 530 00:31:48,110 --> 00:31:50,210 OK, so right. 531 00:31:50,210 --> 00:31:55,300 So this is around 0.1. 532 00:31:55,300 --> 00:31:59,850 So indeed, I'd forgotten that there were 10.
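[Editor's sketch] The board calculation above can be checked numerically. The following is a minimal sketch, assuming the Moran-model fixation probability x_i = (1 - 1/r^i) / (1 - 1/r^N) referred to in the lecture, and assuming a population size of N = 10^6 (suggested by the "1.01 to the mill--" remark). The function name `fixation_probability` and the variable names are illustrative and do not come from the lecture.

```python
import math


def fixation_probability(r, i, n):
    """Fixation probability of i mutants with relative fitness r
    in a Moran population of fixed size n (assumed formula)."""
    if r == 1.0:
        # Neutral limit: each of the n individuals is equally likely to fix.
        return i / n
    # r**(-k) written via exp/log; for large n this simply underflows to 0.0.
    numerator = 1.0 - math.exp(-i * math.log(r))
    denominator = 1.0 - math.exp(-n * math.log(r))
    return numerator / denominator


N = 10**6  # assumed population size
x1_b = fixation_probability(1.01, 1, N)    # one B mutant, s = 0.01
x10_b = fixation_probability(1.01, 10, N)  # ten B mutants, s = 0.01

print(f"x_1  for B: {x1_b:.4f}")
print(f"x_10 for B: {x10_b:.4f}")
```

Under these assumptions x_10 comes out a little below 0.1, in line with the approximation (1 + x)^n ≈ 1 + nx used on the board.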
533 00:31:59,850 --> 00:32:04,010 So given that we have 10 of these B individuals starting, 534 00:32:04,010 --> 00:32:06,210 they may survive. 535 00:32:06,210 --> 00:32:08,870 Ignoring A right now, just on their own dynamics, 536 00:32:08,870 --> 00:32:14,420 they may survive, or they may drift down to 0 and go extinct. 537 00:32:14,420 --> 00:32:16,690 And indeed, even starting with 10 individuals-- 538 00:32:16,690 --> 00:32:19,330 10 B individuals-- you actually still expect them 539 00:32:19,330 --> 00:32:20,722 to go extinct. 540 00:32:20,722 --> 00:32:22,260 And this, actually, calculation is 541 00:32:22,260 --> 00:32:24,392 very relevant for what we're about to think about, 542 00:32:24,392 --> 00:32:25,850 which is this question of what does 543 00:32:25,850 --> 00:32:27,855 it mean to be established in a population. 544 00:32:27,855 --> 00:32:32,050 And already you can kind of see what it's going to mean. 545 00:32:32,050 --> 00:32:33,930 How big do you need to get in the population 546 00:32:33,930 --> 00:32:35,295 before you're likely to survive? 547 00:32:38,300 --> 00:32:39,200 AUDIENCE: 1/s. 548 00:32:39,200 --> 00:32:40,161 PROFESSOR: 1/s. 549 00:32:40,161 --> 00:32:41,536 Because you can basically see how 550 00:32:41,536 --> 00:32:43,452 this is going to work out in this calculation, 551 00:32:43,452 --> 00:32:47,090 that the number you have to get to to be established 552 00:32:47,090 --> 00:32:49,854 is around 1/s. 553 00:32:49,854 --> 00:32:52,270 So in this case, you have to get to around 100 individuals 554 00:32:52,270 --> 00:32:55,715 before you are going to be more likely than not to survive. 555 00:32:59,657 --> 00:33:01,490 Let's see if we can figure out this problem. 556 00:33:01,490 --> 00:33:04,130 All right, so we had 0.1 was the probability 557 00:33:04,130 --> 00:33:06,010 that B would survive stochastic extinction. 
558 00:33:06,010 --> 00:33:07,504 But then in order for B to fix, it 559 00:33:07,504 --> 00:33:09,420 has to not only survive stochastic extinction, 560 00:33:09,420 --> 00:33:11,110 but also A has to not survive. 561 00:33:14,380 --> 00:33:18,140 And we can figure out what that is going to be here. 562 00:33:18,140 --> 00:33:21,430 This is that term. 563 00:33:21,430 --> 00:33:24,870 And then we need another one. 564 00:33:24,870 --> 00:33:28,820 So this is x of 10 of A. And this is, again, 565 00:33:28,820 --> 00:33:33,370 going to be 1 minus 1/2 to the 10th power. 566 00:33:33,370 --> 00:33:35,842 AUDIENCE: [INAUDIBLE]. 567 00:33:35,842 --> 00:33:36,800 PROFESSOR: Yeah, right. 568 00:33:36,800 --> 00:33:39,050 So it's always good to memorize a few of these things. 569 00:33:39,050 --> 00:33:45,740 2 to the 10 is approximately 1,000. 570 00:33:45,740 --> 00:33:47,680 But we want not this. 571 00:33:47,680 --> 00:33:49,590 So we're going to subtract this 1 away. 572 00:33:49,590 --> 00:33:51,320 So we're going to get 10 to the minus 3. 573 00:33:55,070 --> 00:33:56,790 Do you guys see how that-- right? 574 00:33:56,790 --> 00:34:00,013 So given that we start with 10 individuals of A 575 00:34:00,013 --> 00:34:02,552 with a relative fitness 2, then there's only a 1 576 00:34:02,552 --> 00:34:04,950 in 1,000 chance that those A individuals 577 00:34:04,950 --> 00:34:06,300 are going to go extinct. 578 00:34:14,409 --> 00:34:17,040 So the probability that B is going to fix in this situation 579 00:34:17,040 --> 00:34:20,750 is rather close to 10 to the minus 4. 580 00:34:20,750 --> 00:34:23,460 So it's somewhere in between there. 581 00:34:28,740 --> 00:34:31,299 Are there any questions about how that came about? 582 00:34:34,879 --> 00:34:36,920 I feel like there's a fair amount of unhappiness. 
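[Editor's sketch] The two-term argument above (B must survive stochastic extinction, and A must fail to) can be written out as a short calculation. This sketch assumes, as in the lecture, that the two lineages do not interact while rare, that A out-competes B whenever A survives, and that the population is large enough to drop the 1/r^N term. The variable names are illustrative.

```python
# Relative fitnesses and starting copy numbers from the lecture example.
r_a, r_b = 2.0, 1.01
copies = 10

# Probability that B survives stochastic extinction on its own:
# x_10(B) ~ 1 - 1/r_b**10 (large-population limit).
p_b_survives = 1.0 - r_b ** (-copies)

# Probability that all ten A individuals are lost:
# 1 - x_10(A) ~ 1/r_a**10 = (1/2)**10.
p_a_lost = r_a ** (-copies)

# B fixes only if it survives and A is lost (lineages treated as independent).
p_b_fixes = p_b_survives * p_a_lost

print(f"P(B survives) ~ {p_b_survives:.3f}")
print(f"P(A is lost)  ~ {p_a_lost:.2e}")
print(f"P(B fixes)    ~ {p_b_fixes:.2e}")
```

The product lands just under 10^-4, matching the estimate quoted in the lecture.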
583 00:34:51,530 --> 00:34:54,260 So it could be the logic of this, 584 00:34:54,260 --> 00:34:56,510 or it could be the calculation of one of the two terms 585 00:34:56,510 --> 00:34:59,210 or it could be something else. 586 00:34:59,210 --> 00:34:59,710 Yes? 587 00:34:59,710 --> 00:35:00,376 AUDIENCE: Sorry. 588 00:35:00,376 --> 00:35:03,686 Where did you get x10 B, the probability 589 00:35:03,686 --> 00:35:07,336 of x of B surviving again? 590 00:35:07,336 --> 00:35:09,460 PROFESSOR: You want to know this term or that term? 591 00:35:09,460 --> 00:35:10,501 AUDIENCE: The first term. 592 00:35:10,501 --> 00:35:12,120 PROFESSOR: Right, the first term. 593 00:35:12,120 --> 00:35:14,190 So the first term is just asking, 594 00:35:14,190 --> 00:35:18,240 if there were no A, what would be the probability that B would 595 00:35:18,240 --> 00:35:19,490 survive stochastic extinction? 596 00:35:19,490 --> 00:35:23,010 That's the same as the probability that it would fix, 597 00:35:23,010 --> 00:35:25,060 because it's a beneficial mutation, 598 00:35:25,060 --> 00:35:26,846 and B is the only thing, right? 599 00:35:26,846 --> 00:35:27,846 So we use this equation. 600 00:35:27,846 --> 00:35:29,862 AUDIENCE: Oh, OK. 601 00:35:29,862 --> 00:35:31,320 PROFESSOR: So we use this equation, 602 00:35:31,320 --> 00:35:33,140 except that this thing just goes to 1, 603 00:35:33,140 --> 00:35:34,470 so we're just left with this. 604 00:35:34,470 --> 00:35:36,714 But this equation is assuming that there's 605 00:35:36,714 --> 00:35:38,172 just one of the mutant individuals. 606 00:35:38,172 --> 00:35:44,050 Otherwise, if you say x sub i, then it's r to the i here. 607 00:35:50,474 --> 00:35:52,640 So that was the equation that was derived in chapter 608 00:35:52,640 --> 00:35:54,275 six of Evolutionary Dynamics. 609 00:35:57,780 --> 00:36:00,670 But I think that that's not the only thing that people are 610 00:36:00,670 --> 00:36:10,511 unhappy about, so please-- no?
611 00:36:10,511 --> 00:36:11,010 Yes. 612 00:36:11,010 --> 00:36:12,551 AUDIENCE: A question on why don't you 613 00:36:12,551 --> 00:36:14,398 need to take into account B doesn't 614 00:36:14,398 --> 00:36:16,818 survive-- that probability that B doesn't survive? 615 00:36:16,818 --> 00:36:19,234 For that first question that's not on the board right now, 616 00:36:19,234 --> 00:36:23,084 I guess, for the nearly neutral [INAUDIBLE]. 617 00:36:23,084 --> 00:36:25,500 PROFESSOR: OK, so this was the very first one that we did? 618 00:36:25,500 --> 00:36:26,574 OK. 619 00:36:26,574 --> 00:36:27,740 Right, so you want to know-- 620 00:36:30,532 --> 00:36:32,740 AUDIENCE: Why don't you need to take into account the 621 00:36:32,740 --> 00:36:33,880 probability that B doesn't survive? 622 00:36:33,880 --> 00:36:35,588 PROFESSOR: All right, so in this problem, 623 00:36:35,588 --> 00:36:39,700 it's just that they all behave as if they were just neutral. 624 00:36:39,700 --> 00:36:42,390 So the probability of each of these 10 individuals fixing 625 00:36:42,390 --> 00:36:46,520 in the population is the same, essentially. 626 00:36:46,520 --> 00:36:51,460 So they each have a 1 in 10 probability of surviving. 627 00:36:51,460 --> 00:36:53,770 And in this case, I would say that if things 628 00:36:53,770 --> 00:36:56,040 are nearly neutral, you don't talk 629 00:36:56,040 --> 00:36:58,200 about this idea of surviving stochastic extinction, 630 00:36:58,200 --> 00:37:03,730 because you can't get big enough to the point 631 00:37:03,730 --> 00:37:06,644 where you are really guaranteed to survive 632 00:37:06,644 --> 00:37:07,560 stochastic extinction. 633 00:37:07,560 --> 00:37:11,610 Because even if you've gone to occupy 75% of the population, 634 00:37:11,610 --> 00:37:13,610 what's your probability of fixing at that stage? 635 00:37:16,350 --> 00:37:20,310 If you're nearly neutral and you get up to 75%, 636 00:37:20,310 --> 00:37:21,690 you have a 75% chance of fixing.
637 00:37:24,790 --> 00:37:27,170 In the case of neutral mutations, 638 00:37:27,170 --> 00:37:28,750 there's no analogous idea of what 639 00:37:28,750 --> 00:37:32,510 it means to be established, because you could always 640 00:37:32,510 --> 00:37:34,850 just go back and go extinct with finite probability. 641 00:37:42,800 --> 00:37:43,415 This question? 642 00:37:47,350 --> 00:37:50,300 Well, it's looking more and more likely to show up 643 00:37:50,300 --> 00:37:53,980 on the midterm in two weeks. 644 00:37:53,980 --> 00:37:58,972 So now is your opportunity to ask a question about it. 645 00:37:58,972 --> 00:38:02,870 AUDIENCE: Can you talk about and establish 646 00:38:02,870 --> 00:38:03,980 how you got [INAUDIBLE]? 647 00:38:03,980 --> 00:38:04,650 PROFESSOR: Yes. 648 00:38:04,650 --> 00:38:07,320 OK, this is a good question. 649 00:38:07,320 --> 00:38:12,230 So when we talk about a mutation becoming established, that's 650 00:38:12,230 --> 00:38:18,310 asking-- of course, we know that even a rather beneficial 651 00:38:18,310 --> 00:38:19,810 mutation, say something that confers 652 00:38:19,810 --> 00:38:24,520 an advantage of a few percent, will probably go extinct. 653 00:38:24,520 --> 00:38:28,150 However, if it gets up to be a large enough fraction 654 00:38:28,150 --> 00:38:31,745 or a number of the population, then the dynamics 655 00:38:31,745 --> 00:38:34,120 are going to be well described by a differential equation 656 00:38:34,120 --> 00:38:35,810 where it won't go extinct. 657 00:38:35,810 --> 00:38:39,990 So this is an established mutation. 658 00:38:39,990 --> 00:38:42,060 And this, again, we really talk about only 659 00:38:42,060 --> 00:38:45,392 for beneficial mutations, because other mutations 660 00:38:45,392 --> 00:38:46,600 can't really get established. 661 00:38:46,600 --> 00:38:49,210 They could still stochastically fix. 662 00:38:49,210 --> 00:38:51,775 But it's still stochastic up to the very end.
663 00:38:54,640 --> 00:38:58,605 And so the question is, at what number in the population 664 00:38:58,605 --> 00:39:01,750 do you have to get before you are expected to take over 665 00:39:01,750 --> 00:39:06,050 the population, assuming that other new mutants don't arise. 666 00:39:13,650 --> 00:39:15,260 That corresponds to asking, OK, this 667 00:39:15,260 --> 00:39:16,718 is the probability of fixation when 668 00:39:16,718 --> 00:39:21,330 you have i individuals in the population. 669 00:39:21,330 --> 00:39:24,060 And this is what we saw before. 670 00:39:28,720 --> 00:39:31,280 This is for a mutation with relative fitness r. 671 00:39:31,280 --> 00:39:33,940 You have i individuals in the population, total population 672 00:39:33,940 --> 00:39:34,700 size n. 673 00:39:38,140 --> 00:39:42,300 And this becoming established means that xi 674 00:39:42,300 --> 00:39:43,720 is approximately equal to 1. 675 00:39:49,850 --> 00:39:53,480 Now, in these situations, r to the n 676 00:39:53,480 --> 00:39:55,199 is, again, very, very large. 677 00:39:55,199 --> 00:39:56,365 So this thing we can ignore. 678 00:39:56,365 --> 00:39:58,840 So this is really the same thing as asking that 1 minus. 679 00:40:02,870 --> 00:40:05,530 And this never gets to be exactly equal to 1. 680 00:40:05,530 --> 00:40:07,360 But what now we're really asking is 681 00:40:07,360 --> 00:40:08,770 that this thing is close to 1. 682 00:40:11,970 --> 00:40:13,470 And this is equivalent to saying that-- 683 00:40:13,470 --> 00:40:17,120 what do we want to say-- that 1 over r to the i 684 00:40:17,120 --> 00:40:18,390 is much less than 1. 685 00:40:24,760 --> 00:40:28,930 And well, that means that r to the i is much greater than 1. 686 00:40:39,240 --> 00:40:41,050 So this is really saying that what you want 687 00:40:41,050 --> 00:40:43,740 is i times s to be much greater than 1.
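[Editor's note] The step from "r to the i is much greater than 1" to "i times s is much greater than 1" can be written out explicitly. This is a sketch under the assumptions that r = 1 + s with s much less than 1, and that r^N is large enough for the denominator of the fixation formula to be treated as 1.

```latex
\begin{align*}
x_i &= \frac{1 - r^{-i}}{1 - r^{-N}} \approx 1 - r^{-i}
      && \text{since } r^{N} \gg 1, \\
r^{-i} &= (1+s)^{-i} = e^{-i \ln(1+s)} \approx e^{-i s}
      && \text{since } s \ll 1, \\
x_i \approx 1 \;&\Longleftrightarrow\; e^{-i s} \ll 1
      \;\Longleftrightarrow\; i s \gg 1
      \;\Longleftrightarrow\; i \gg \frac{1}{s}.
\end{align*}
```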
688 00:40:51,432 --> 00:40:54,980 So n established, which is, in this case, 689 00:40:54,980 --> 00:40:56,575 it's the i such that this is true. 690 00:41:00,910 --> 00:41:02,850 It has to be much greater than 1 over s. 691 00:41:06,010 --> 00:41:17,870 And indeed, I think that when i is equal to 1 over s, then 692 00:41:17,870 --> 00:41:20,500 I think the probability of extinction is 1 over e, 693 00:41:20,500 --> 00:41:23,950 if you do the math. 694 00:41:23,950 --> 00:41:27,050 So that means that this is kind of the crossing over point, 695 00:41:27,050 --> 00:41:29,320 where you're more likely than not to survive. 696 00:41:33,290 --> 00:41:35,690 And just to be concrete, for many of the mutations 697 00:41:35,690 --> 00:41:39,440 we're talking about and that you read about in Roy's paper, 698 00:41:39,440 --> 00:41:42,170 s was a few percent. 699 00:41:42,170 --> 00:41:44,450 So this is saying that the population is established 700 00:41:44,450 --> 00:41:47,860 when the number of these mutants gets up to above 30 701 00:41:47,860 --> 00:41:50,750 or a few times 30, so maybe 100 individuals 702 00:41:50,750 --> 00:41:53,220 means that it's unlikely to go extinct stochastically. 703 00:41:58,930 --> 00:42:02,690 So really, it's once you get to this n established, 704 00:42:02,690 --> 00:42:04,610 then the population will grow exponentially 705 00:42:04,610 --> 00:42:05,390 in the population. 706 00:42:05,390 --> 00:42:07,347 And it's going to spread exponentially 707 00:42:07,347 --> 00:42:09,805 with an advantage s relative to the rest of the population. 708 00:42:13,309 --> 00:42:15,850 So this discussion is actually very useful for the next thing 709 00:42:15,850 --> 00:42:16,766 we want to talk about. 710 00:42:16,766 --> 00:42:18,890 But are there any questions about how we got this? 711 00:42:22,741 --> 00:42:24,866 AUDIENCE: You said the probability of going extinct 712 00:42:24,866 --> 00:42:26,870 is 1 over e.
713 00:42:26,870 --> 00:42:28,230 PROFESSOR: Right. 714 00:42:28,230 --> 00:42:30,370 AUDIENCE: So xi is approximately equal to 1. 715 00:42:32,482 --> 00:42:33,190 PROFESSOR: Right. 716 00:42:33,190 --> 00:42:38,064 So this is the probability of going extinct? 717 00:42:38,064 --> 00:42:39,800 AUDIENCE: Yeah. 718 00:42:39,800 --> 00:42:42,910 PROFESSOR: So the probability of going extinct here, 719 00:42:42,910 --> 00:42:47,583 if you have i individuals, is 1 over this guy. 720 00:42:47,583 --> 00:42:52,330 So it's 1 over 1 plus s to the i. 721 00:42:52,330 --> 00:43:02,930 And then if i times s is 1 but s is small-- 722 00:43:02,930 --> 00:43:07,030 this is one of those definitions of e to the x 723 00:43:07,030 --> 00:43:09,470 that, I don't know, do you guys remember this? 724 00:43:09,470 --> 00:43:15,390 It's like, oh my goodness, e to the x is the limit. 725 00:43:22,449 --> 00:43:26,390 No, no, x is up there. 726 00:43:26,390 --> 00:43:28,810 Boy, you guys are taxing me. 727 00:43:28,810 --> 00:43:30,210 All right. 728 00:43:30,210 --> 00:43:34,010 This is 1 plus x over n to the n. 729 00:43:34,010 --> 00:43:36,900 And this limit is n goes to infinity. 730 00:43:36,900 --> 00:43:37,440 Is this? 731 00:43:37,440 --> 00:43:38,237 AUDIENCE: Yes. 732 00:43:38,237 --> 00:43:39,195 PROFESSOR: Do we agree? 733 00:43:39,195 --> 00:43:39,695 OK. 734 00:43:42,350 --> 00:43:43,488 This makes sense, right? 735 00:43:47,264 --> 00:43:48,930 So this is just the equivalent of saying 736 00:43:48,930 --> 00:43:52,110 that if i times s is equal to 1 but s is small, 737 00:43:52,110 --> 00:43:54,110 then this thing is around e. 738 00:43:58,569 --> 00:44:00,610 If you're confused, I don't want to get into this 739 00:44:00,610 --> 00:44:02,060 too much, because it's not quite central to what 740 00:44:02,060 --> 00:44:02,640 we're talking about. 741 00:44:02,640 --> 00:44:04,640 But come up and talk to me after the class, 742 00:44:04,640 --> 00:44:06,800 and we'll derive this together.
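[Editor's sketch] The claim that the extinction probability at i = 1/s approaches 1/e can be checked numerically. This is a minimal sketch, assuming the large-population approximation used above, in which the extinction probability for i individuals is 1/(1 + s)^i. The function name `extinction_at_threshold` is illustrative.

```python
import math


def extinction_at_threshold(s):
    """Approximate extinction probability 1/(1+s)**i evaluated at i = 1/s."""
    i = 1.0 / s
    return (1.0 + s) ** (-i)


for s in (0.1, 0.03, 0.01, 0.001):
    p_extinct = extinction_at_threshold(s)
    print(f"s = {s:<6} i = 1/s = {1.0 / s:>6.0f}   P(extinct) ~ {p_extinct:.4f}")

print(f"1/e = {1.0 / math.e:.4f}")
```

As s shrinks, the value converges to 1/e (about 0.37), so a lineage that has reached roughly 1/s individuals survives with probability of about 0.63.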
743 00:44:11,050 --> 00:44:15,500 So the idea is that when you've reached a population size of 1 744 00:44:15,500 --> 00:44:19,297 over s, then you have a 1 over e probability of going extinct. 745 00:44:22,160 --> 00:44:24,510 So e pops up at all sorts of weird locations. 746 00:44:24,510 --> 00:44:27,800 This is why you hear about it. 747 00:44:33,084 --> 00:44:34,500 So what I want to do now is I want 748 00:44:34,500 --> 00:44:38,330 to discuss this idea of clonal interference a little bit more, 749 00:44:38,330 --> 00:44:41,850 in particular ask, in what situation will 750 00:44:41,850 --> 00:44:44,270 we have clonal interference? 751 00:44:44,270 --> 00:44:47,010 Or equivalently, when can we ignore clonal interference? 752 00:44:51,180 --> 00:44:54,290 Can somebody say in words when we're 753 00:44:54,290 --> 00:44:57,870 going to be able to ignore the effect of clonal interference? 754 00:44:57,870 --> 00:44:58,693 Yes. 755 00:44:58,693 --> 00:45:02,477 AUDIENCE: [INAUDIBLE] so large that there's a very limited 756 00:45:02,477 --> 00:45:04,954 match that goes to-- 757 00:45:04,954 --> 00:45:06,370 PROFESSOR: OK, I think we're going 758 00:45:06,370 --> 00:45:08,750 to have to be careful here. 759 00:45:08,750 --> 00:45:11,510 Because you're saying, oh well, if the population 760 00:45:11,510 --> 00:45:13,580 is large enough, then you're saying, 761 00:45:13,580 --> 00:45:15,370 oh, the mutations won't interact, right? 762 00:45:15,370 --> 00:45:17,140 AUDIENCE: Not in the beginning. 763 00:45:17,140 --> 00:45:17,250 PROFESSOR: Right. 764 00:45:17,250 --> 00:45:18,791 They don't interact at the beginning. 765 00:45:18,791 --> 00:45:21,220 I think that statement I'm very happy with. 766 00:45:21,220 --> 00:45:23,840 But the problem is that if they both survive 767 00:45:23,840 --> 00:45:27,170 stochastic extinction and they start spreading, 768 00:45:27,170 --> 00:45:32,020 then they do start to interact in this large population. 
769 00:45:32,020 --> 00:45:33,962 AUDIENCE: Didn't you just say that once they 770 00:45:33,962 --> 00:45:37,892 survive stochastic extinction, they pretty much behave like-- 771 00:45:37,892 --> 00:45:39,600 PROFESSOR: They behave deterministically. 772 00:45:39,600 --> 00:45:42,530 But the thing is that if you have two sub-populations that 773 00:45:42,530 --> 00:45:44,520 are exponentially growing-- and there 774 00:45:44,520 --> 00:45:47,390 are these very nice diagrams that I guess in Roy's paper 775 00:45:47,390 --> 00:45:48,530 they don't show. 776 00:45:48,530 --> 00:45:56,060 Which is if you look at somehow the population kind of changing 777 00:45:56,060 --> 00:46:02,970 as a function of time, you can think about, OK, well, this 778 00:46:02,970 --> 00:46:05,630 is, let's say, constant population size. 779 00:46:05,630 --> 00:46:09,610 And it starts out all being a particular type. 780 00:46:09,610 --> 00:46:12,480 We'll say there's some population of A. 781 00:46:12,480 --> 00:46:15,094 But now a new mutation arises and survives 782 00:46:15,094 --> 00:46:16,010 stochastic extinction. 783 00:46:16,010 --> 00:46:18,196 It starts spreading in the population. 784 00:46:18,196 --> 00:46:22,840 All right, this may be mutant B. But then 785 00:46:22,840 --> 00:46:28,110 if another mutant appears over here that's even more fit, 786 00:46:28,110 --> 00:46:32,160 then it can actually spread faster. 787 00:46:32,160 --> 00:46:39,520 And indeed, it can cause the B lineage to get out-competed. 788 00:46:39,520 --> 00:46:45,950 So we start out with population A. Mutant B is more fit than A. 789 00:46:45,950 --> 00:46:47,660 It's spreading exponentially. 790 00:46:47,660 --> 00:46:51,230 But some time later mutation C appears. 791 00:46:51,230 --> 00:46:54,810 Now this C lineage is more fit than B, 792 00:46:54,810 --> 00:46:56,390 so it's able to spread exponentially. 793 00:46:56,390 --> 00:47:02,900 And indeed, it out-competes B.
So this is exactly what 794 00:47:02,900 --> 00:47:05,220 we mean by clonal interference. 795 00:47:05,220 --> 00:47:09,500 These two lineages are interfering with each other. 796 00:47:09,500 --> 00:47:12,550 So this is, indeed, clonal interference. 797 00:47:12,550 --> 00:47:14,305 And there have been these drawings out. 798 00:47:14,305 --> 00:47:16,680 For over 20 years, people have been thinking about this. 799 00:47:16,680 --> 00:47:19,540 Like Lenski wrote some classic papers thinking about this. 800 00:47:19,540 --> 00:47:22,962 Recently, Michael Desai at Harvard 801 00:47:22,962 --> 00:47:24,420 has done some really beautiful work 802 00:47:24,420 --> 00:47:27,200 where he takes these evolving yeast populations. 803 00:47:27,200 --> 00:47:30,685 He does high resolution-- resolution 804 00:47:30,685 --> 00:47:33,230 in both senses, temporal as well as 805 00:47:33,230 --> 00:47:36,060 kind of deep sequencing in the population-- where he could 806 00:47:36,060 --> 00:47:40,060 actually directly see these lineages spreading and then 807 00:47:40,060 --> 00:47:41,030 being out-competed. 808 00:47:41,030 --> 00:47:43,737 And he sees multiple mutations spreading 809 00:47:43,737 --> 00:47:45,570 through the population together because they 810 00:47:45,570 --> 00:47:46,778 were attached on the same genome. 811 00:47:46,778 --> 00:47:50,670 It's a very nice paper. 812 00:47:50,670 --> 00:47:53,200 I almost had you guys read it, so it's maybe 813 00:47:53,200 --> 00:48:02,637 Nature in 2012, '13. 814 00:48:02,637 --> 00:48:04,480 I don't know, possibly '12. 815 00:48:08,240 --> 00:48:11,040 So this is the idea of clonal interference. 816 00:48:11,040 --> 00:48:13,404 So it's after the initial stochastic dynamics 817 00:48:13,404 --> 00:48:14,070 have played out. 818 00:48:14,070 --> 00:48:14,914 Yes. 819 00:48:14,914 --> 00:48:18,232 AUDIENCE: Is the [? spectre ?] [INAUDIBLE] you call that 820 00:48:18,232 --> 00:48:20,510 [? clonal ?] [? interference? ?]
821 00:48:20,510 --> 00:48:23,010 PROFESSOR: Yeah, so that is a result of clonal interference, 822 00:48:23,010 --> 00:48:27,990 because it's saying that that half arises because B, maybe, 823 00:48:27,990 --> 00:48:30,900 if it did survive, it may still get out-competed by A. 824 00:48:30,900 --> 00:48:33,083 So that's clonal interference. 825 00:48:33,083 --> 00:48:35,166 AUDIENCE: Did you say the populations of mutations 826 00:48:35,166 --> 00:48:36,532 weren't [? correct? ?] 827 00:48:36,532 --> 00:48:37,240 PROFESSOR: Right. 828 00:48:37,240 --> 00:48:39,390 So the idea is that they don't interact initially. 829 00:48:39,390 --> 00:48:42,000 Because when the mutations are present at a very low 830 00:48:42,000 --> 00:48:44,660 frequency, then they're really interacting 831 00:48:44,660 --> 00:48:49,010 with the bulk, the rest of the population, 10 to the 6 there. 832 00:48:49,010 --> 00:48:50,970 However, if they survive stochastic extinction 833 00:48:50,970 --> 00:48:53,020 and they start spreading in the population, 834 00:48:53,020 --> 00:48:56,998 then there's the possibility of clonal interference. 835 00:48:56,998 --> 00:49:00,097 AUDIENCE: But wouldn't that change that [INAUDIBLE]? 836 00:49:00,097 --> 00:49:00,680 PROFESSOR: No. 837 00:49:00,680 --> 00:49:04,800 I think that's why it's at the half. 838 00:49:04,800 --> 00:49:07,150 Because this factor of a half is really just saying 839 00:49:07,150 --> 00:49:10,250 that even if B survives stochastic extinction, 840 00:49:10,250 --> 00:49:12,270 it only has a 50-50 shot of actually being 841 00:49:12,270 --> 00:49:15,037 able to take over, because there's a 50% chance 842 00:49:15,037 --> 00:49:16,870 it's going to experience clonal interference 843 00:49:16,870 --> 00:49:18,994 with this A lineage that's going to out-compete it. 844 00:49:26,770 --> 00:49:30,971 Maybe if you're unhappy about that statement, 845 00:49:30,971 --> 00:49:31,845 harass me afterwards. 
846 00:49:36,050 --> 00:49:39,060 So from this discussion, though, how 847 00:49:39,060 --> 00:49:40,642 can we think about the importance 848 00:49:40,642 --> 00:49:41,600 of clonal interference? 849 00:49:41,600 --> 00:49:45,670 In particular, there are going to be two time scales. 850 00:49:45,670 --> 00:49:47,149 And the question of whether there's 851 00:49:47,149 --> 00:49:48,940 going to be significant clonal interference 852 00:49:48,940 --> 00:49:53,070 is related to which of these time scales is larger. 853 00:49:53,070 --> 00:49:54,975 So what might the time scales be? 854 00:50:12,507 --> 00:50:13,965 Or at least one of the time scales? 855 00:50:18,676 --> 00:50:20,217 AUDIENCE: I guess one should probably 856 00:50:20,217 --> 00:50:21,952 be the time to be established. 857 00:50:21,952 --> 00:50:22,660 PROFESSOR: Right. 858 00:50:22,660 --> 00:50:24,270 To be established. 859 00:50:24,270 --> 00:50:26,914 AUDIENCE: To reach the [INAUDIBLE] [? establish. ?] 860 00:50:26,914 --> 00:50:27,580 PROFESSOR: Yeah. 861 00:50:33,405 --> 00:50:37,210 So the actual time that it takes from when a mutation appears 862 00:50:37,210 --> 00:50:42,600 to when it becomes established-- I mean, there is a time there. 863 00:50:42,600 --> 00:50:45,540 But it ends up not being the relevant thing, 864 00:50:45,540 --> 00:50:48,130 because that's the regime where they're not actually 865 00:50:48,130 --> 00:50:50,550 interacting, that the mutant lineages are not 866 00:50:50,550 --> 00:50:52,030 interacting anyways. 867 00:50:52,030 --> 00:50:53,950 So there's a sense that it doesn't quite 868 00:50:53,950 --> 00:50:58,206 matter how long that takes, at least to first order. 869 00:50:58,206 --> 00:51:03,036 AUDIENCE: How much time did it take to reach some population 870 00:51:03,036 --> 00:51:06,132 size [INAUDIBLE]? 871 00:51:06,132 --> 00:51:06,840 PROFESSOR: Right. 872 00:51:06,840 --> 00:51:07,680 OK, yeah. 
873 00:51:07,680 --> 00:51:09,380 So one of them is going to be that it's 874 00:51:09,380 --> 00:51:16,200 the time to go from established to fixation. 875 00:51:16,200 --> 00:51:17,992 So it's actually the next time scale there. 876 00:51:17,992 --> 00:51:19,699 And actually, the time to get established 877 00:51:19,699 --> 00:51:20,770 is actually rather short. 878 00:51:23,240 --> 00:51:25,740 So a mutation appears, and most of the time it goes extinct. 879 00:51:25,740 --> 00:51:27,531 But the ones that don't go extinct actually 880 00:51:27,531 --> 00:51:29,570 get established rather quickly, because that's 881 00:51:29,570 --> 00:51:33,190 a biased sampling over the trajectories. 882 00:51:33,190 --> 00:51:35,360 But there is a rather significant time 883 00:51:35,360 --> 00:51:37,192 that it takes for the lineage to go 884 00:51:37,192 --> 00:51:38,900 from being established to actually fixing 885 00:51:38,900 --> 00:51:39,710 in the population. 886 00:51:39,710 --> 00:51:42,300 So this is kind of how long it takes for a mutation 887 00:51:42,300 --> 00:51:45,220 to take over a population. 888 00:51:45,220 --> 00:51:47,500 We might call this the time to spread or so. 889 00:51:50,870 --> 00:51:52,920 And it turns out that that time-- 890 00:51:52,920 --> 00:51:55,710 what is it going to depend on? 891 00:51:55,710 --> 00:51:56,489 AUDIENCE: s. 892 00:51:56,489 --> 00:51:58,030 PROFESSOR: It's going to depend on s. 893 00:51:58,030 --> 00:52:04,610 So as s goes up, then this goes down. 894 00:52:04,610 --> 00:52:05,410 All right, perfect. 895 00:52:05,410 --> 00:52:09,124 All right, let's just go put a 1 over s there. 896 00:52:09,124 --> 00:52:11,290 And this is because it's going to grow exponentially 897 00:52:11,290 --> 00:52:13,190 in the population. 898 00:52:13,190 --> 00:52:15,560 And then how else is it going to depend on? 899 00:52:15,560 --> 00:52:16,860 AUDIENCE: n. 900 00:52:16,860 --> 00:52:18,100 PROFESSOR: n, right. 
901 00:52:18,100 --> 00:52:22,980 And indeed, this is going to be the log of n times s. 902 00:52:22,980 --> 00:52:25,460 It's really n over n established. 903 00:52:25,460 --> 00:52:28,280 Because the idea is that you start out at n established. 904 00:52:28,280 --> 00:52:30,920 You grow exponentially with rate s until you kind of take 905 00:52:30,920 --> 00:52:32,780 over the population, size n. 906 00:52:36,215 --> 00:52:41,846 And this s appears because it's 1 over n established. 907 00:52:41,846 --> 00:52:44,230 Because this in the log is really the population 908 00:52:44,230 --> 00:52:45,480 size divided by n established. 909 00:52:48,030 --> 00:52:50,470 So this is one time scale. 910 00:52:50,470 --> 00:52:52,810 Do you guys understand why it should be like this? 911 00:52:57,060 --> 00:52:57,560 Right. 912 00:52:57,560 --> 00:52:59,768 And what's the other time scale that's relevant here? 913 00:52:59,768 --> 00:53:03,530 AUDIENCE: How fast mutations occur. 914 00:53:03,530 --> 00:53:06,536 PROFESSOR: How fast mutations-- yes. 915 00:53:06,536 --> 00:53:08,744 AUDIENCE: How long you have to wait from one mutation 916 00:53:08,744 --> 00:53:09,950 to the next mutation. 917 00:53:09,950 --> 00:53:10,770 PROFESSOR: Right. 918 00:53:10,770 --> 00:53:14,370 So this is how long you have to wait for one mutation. 919 00:53:14,370 --> 00:53:20,290 So this is T, what we might call T mutation. 920 00:53:20,290 --> 00:53:21,850 Now it's a little bit subtle to think 921 00:53:21,850 --> 00:53:25,440 about what this should be. 922 00:53:25,440 --> 00:53:27,940 But it should say something about the time 923 00:53:27,940 --> 00:53:32,216 it takes for mutations to appear in the population. 924 00:53:32,216 --> 00:53:36,430 So let's go ahead and just guess what this might be. 925 00:53:36,430 --> 00:53:38,720 T mutation should be equal to what? 926 00:53:38,720 --> 00:53:40,380 All right, so it could be-- 927 00:53:45,800 --> 00:53:46,716 AUDIENCE: [INAUDIBLE]. 
928 00:53:52,754 --> 00:53:53,420 PROFESSOR: Yeah. 929 00:53:53,420 --> 00:53:56,650 So I guess I'm trying to get you to think about what this should 930 00:53:56,650 --> 00:53:59,860 mean to be relevant for determining the time 931 00:53:59,860 --> 00:54:02,022 scales of whether clonal interference is important. 932 00:54:02,022 --> 00:54:04,170 AUDIENCE: [INAUDIBLE]. 933 00:54:04,170 --> 00:54:08,820 PROFESSOR: Mu is the probability per generation 934 00:54:08,820 --> 00:54:13,020 of having a beneficial mutation of magnitude s. 935 00:54:13,020 --> 00:54:18,764 Mu is kind of the mutation rate. 936 00:54:18,764 --> 00:54:20,442 AUDIENCE: [INAUDIBLE]. 937 00:54:20,442 --> 00:54:21,150 PROFESSOR: Right. 938 00:54:21,150 --> 00:54:23,839 But we'll just assume that there's 939 00:54:23,839 --> 00:54:25,380 only one kind of beneficial mutation, 940 00:54:25,380 --> 00:54:28,550 and every beneficial mutation has magnitude s, 941 00:54:28,550 --> 00:54:30,990 for simplicity. 942 00:54:30,990 --> 00:54:32,960 So it's mutation rate. 943 00:54:32,960 --> 00:54:35,430 It's a beneficial mutation rate per generation 944 00:54:35,430 --> 00:54:36,965 of leading to mutation s. 945 00:54:43,040 --> 00:54:46,720 And you can just also say, if you have no idea what 946 00:54:46,720 --> 00:54:49,160 I'm talking about, it's fine. 947 00:54:49,160 --> 00:54:52,890 But it's worth spending a little bit of time thinking 948 00:54:52,890 --> 00:54:54,880 about what should be the relevant time 949 00:54:54,880 --> 00:54:57,466 scale to compare to this. 950 00:54:57,466 --> 00:54:57,965 Yes? 951 00:54:57,965 --> 00:55:00,212 AUDIENCE: So Mu is per organism [INAUDIBLE]? 952 00:55:03,156 --> 00:55:04,156 PROFESSOR: That's right. 953 00:55:16,990 --> 00:55:18,970 And it's OK if you find this confusing, 954 00:55:18,970 --> 00:55:20,200 because this is subtle. 
955 00:55:20,200 --> 00:55:22,380 But if you think about it for 30 seconds, then 956 00:55:22,380 --> 00:55:27,220 maybe you'll be fertile ground for the discussion to follow. 957 00:55:47,392 --> 00:55:49,920 All right, let's go ahead and vote and see where we are. 958 00:55:49,920 --> 00:55:53,220 Ready, three, two, one. 959 00:55:56,020 --> 00:55:58,260 All right. 960 00:55:58,260 --> 00:55:59,850 This is very nice. 961 00:55:59,850 --> 00:56:05,748 Almost everybody is saying C. Although it really should be D. 962 00:56:05,748 --> 00:56:06,664 AUDIENCE: [INAUDIBLE]. 963 00:56:09,800 --> 00:56:12,594 So 1 over s is units of time, and 1 over mu is [INAUDIBLE]. 964 00:56:15,790 --> 00:56:19,110 PROFESSOR: OK, that's an interesting statement. 965 00:56:19,110 --> 00:56:21,730 I mean, I guess this is a relative fitness. 966 00:56:21,730 --> 00:56:22,690 AUDIENCE: So s is mu. 967 00:56:23,279 --> 00:56:25,070 AUDIENCE: But right there, you have s units 968 00:56:25,070 --> 00:56:29,000 of time, 1/s units of time. 969 00:56:29,000 --> 00:56:30,625 PROFESSOR: Yeah, so the problem of this 970 00:56:30,625 --> 00:56:33,525 is this is all in units of generation time. 971 00:56:33,525 --> 00:56:34,900 Because this is really telling us 972 00:56:34,900 --> 00:56:36,233 about the number of generations. 973 00:56:39,970 --> 00:56:45,379 So what is going on? 974 00:56:45,379 --> 00:56:47,420 Because 1 over mu n should also be a time, right? 975 00:56:49,990 --> 00:56:50,906 AUDIENCE: [INAUDIBLE]. 976 00:57:00,800 --> 00:57:02,380 Yeah, I'm not sure if-- yeah, I mean, 977 00:57:02,380 --> 00:57:03,754 since this is all in generations, 978 00:57:03,754 --> 00:57:11,550 I'm not sure if we even have to-- yeah, I think it's OK. 979 00:57:11,550 --> 00:57:13,494 AUDIENCE: n times s has to be [INAUDIBLE]. 980 00:57:16,410 --> 00:57:18,880 PROFESSOR: I like that statement. 
981 00:57:18,880 --> 00:57:20,100 But then he, I mean-- 982 00:57:20,100 --> 00:57:21,940 AUDIENCE: Well, then you have 1 over mu. 983 00:57:21,940 --> 00:57:22,939 PROFESSOR: Yeah, I know. 984 00:57:22,939 --> 00:57:25,845 But then, this should have units of time. 985 00:57:25,845 --> 00:57:29,520 I agree with your statement, but I also sort of agree with his, 986 00:57:29,520 --> 00:57:30,520 the 1 over s here. 987 00:57:30,520 --> 00:57:31,190 You know that's the problem. 988 00:57:31,190 --> 00:57:32,130 AUDIENCE: Everything is basically unitless. 989 00:57:32,130 --> 00:57:33,630 PROFESSOR: Yeah, but I think that it's really 990 00:57:33,630 --> 00:57:35,463 because everything's in units of generation. 991 00:57:35,463 --> 00:57:37,300 So really, it's all unitless, except for mu. 992 00:57:37,300 --> 00:57:38,166 Yeah. 993 00:57:38,166 --> 00:57:39,541 AUDIENCE: What I don't understand 994 00:57:39,541 --> 00:57:44,462 is why how beneficial the mutation is 995 00:57:44,462 --> 00:57:46,430 should play into how-- 996 00:57:46,430 --> 00:57:47,860 PROFESSOR: No, I agree. 997 00:57:47,860 --> 00:57:49,900 This is subtle. 998 00:57:49,900 --> 00:57:51,460 So let's imagine this is time. 999 00:57:55,700 --> 00:58:01,330 Now, 1 over mu n is telling us how much time 1000 00:58:01,330 --> 00:58:03,660 there is between the initial appearance 1001 00:58:03,660 --> 00:58:06,994 of these beneficial mutants. 1002 00:58:06,994 --> 00:58:08,535 So here we get a beneficial mutation. 1003 00:58:10,484 --> 00:58:12,150 But what do you think is going to happen 1004 00:58:12,150 --> 00:58:14,754 to that beneficial mutation? 1005 00:58:14,754 --> 00:58:15,670 Yeah, that guy's dead. 1006 00:58:15,670 --> 00:58:18,709 It comes up. 1007 00:58:18,709 --> 00:58:21,250 How long do we have to wait for the next beneficial mutation? 1008 00:58:21,250 --> 00:58:22,280 AUDIENCE: 1 over mu n. 1009 00:58:22,280 --> 00:58:23,238 PROFESSOR: 1 over mu n. 
1010 00:58:23,238 --> 00:58:24,824 Is it going to be exactly that? 1011 00:58:24,824 --> 00:58:26,240 Is it going to be peaked around that? 1012 00:58:29,460 --> 00:58:31,790 All right, verbal answer, what is 1013 00:58:31,790 --> 00:58:34,070 going to be the probability distribution 1014 00:58:34,070 --> 00:58:38,490 of the time between successive appearances of this beneficial mutation? 1015 00:58:38,490 --> 00:58:40,996 Ready, three, two, one. 1016 00:58:40,996 --> 00:58:41,912 AUDIENCE: Exponential. 1017 00:58:41,912 --> 00:58:44,130 PROFESSOR: It's exponentially distributed. 1018 00:58:44,130 --> 00:58:46,010 So there might be another one that's coming through here. 1019 00:58:46,010 --> 00:58:47,090 What do you think is going to happen to that guy? 1020 00:58:47,090 --> 00:58:47,842 AUDIENCE: Probably dead. 1021 00:58:47,842 --> 00:58:48,950 PROFESSOR: Oh, he's dead. 1022 00:58:48,950 --> 00:58:50,330 OK. 1023 00:58:50,330 --> 00:58:51,880 Another one. 1024 00:58:51,880 --> 00:58:54,900 So these guys are appearing. 1025 00:58:54,900 --> 00:58:57,360 But if the magnitude of s is, say, 1026 00:58:57,360 --> 00:58:59,770 0.02, that means we have to wait for 50 of these things 1027 00:58:59,770 --> 00:59:01,890 to appear before we expect that one of them 1028 00:59:01,890 --> 00:59:05,250 is actually going to get established. 1029 00:59:05,250 --> 00:59:07,460 And established means it gets up to 1 over s. 1030 00:59:12,342 --> 00:59:13,800 So this guy, once he's established, 1031 00:59:13,800 --> 00:59:17,560 he's going to grow exponentially in the population. 1032 00:59:17,560 --> 00:59:24,760 But the idea is that the time scale here of 1 over mu n 1033 00:59:24,760 --> 00:59:27,810 is the time between successive appearances 1034 00:59:27,810 --> 00:59:36,980 of this beneficial mutation, whereas 1 over mu ns 1035 00:59:36,980 --> 00:59:41,030 is the time scale between successive establishments 1036 00:59:41,030 --> 00:59:42,380 of these beneficial mutations.
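The thinning argument here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: mutations appear as a Poisson process at rate mu*n per generation, and each one independently becomes established with probability s. The function name mean_waits and the parameter values in the usage note are hypothetical, not from the lecture.

```python
import random


def mean_waits(mu, n, s, num_mutations=200_000, seed=0):
    """Return (mean wait between appearances, mean wait between establishments).

    Assumes appearances form a Poisson process with rate mu * n per generation
    and that each new mutant is established independently with probability s.
    """
    rng = random.Random(seed)
    t = 0.0
    last_established = 0.0
    gaps_appear = []
    gaps_establish = []
    for _ in range(num_mutations):
        gap = rng.expovariate(mu * n)  # exponential wait, mean 1 / (mu * n)
        t += gap
        gaps_appear.append(gap)
        if rng.random() < s:  # this lineage escapes stochastic extinction
            gaps_establish.append(t - last_established)
            last_established = t
    mean_appear = sum(gaps_appear) / len(gaps_appear)
    mean_establish = sum(gaps_establish) / len(gaps_establish)
    return mean_appear, mean_establish
```

For example, mean_waits(1e-6, 1e5, 0.02) should give a mean wait close to 1/(mu n) = 10 generations between appearances and close to 1/(mu n s) = 500 generations between establishments, up to sampling noise.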
1037 00:59:42,380 --> 00:59:43,921 And it's only if they get established 1038 00:59:43,921 --> 00:59:47,060 that they're relevant. 1039 00:59:47,060 --> 00:59:52,740 So that's why the t mutation established is 1 over mu ns. 1040 00:59:56,320 --> 01:00:02,710 Now, no clonal interference means that one of these 1041 01:00:02,710 --> 01:00:04,150 is larger than the other one. 1042 01:00:07,490 --> 01:00:10,340 So no clonal interference means is it, 1043 01:00:10,340 --> 01:00:13,470 A, this guy's larger, or B, this guy's larger? 1044 01:00:16,370 --> 01:00:18,431 So larger, question mark. 1045 01:00:18,431 --> 01:00:18,930 This one? 1046 01:00:18,930 --> 01:00:19,730 That one? 1047 01:00:19,730 --> 01:00:21,521 All right, think about this for 15 seconds, 1048 01:00:21,521 --> 01:00:24,160 just to make sure that we're-- make sure. 1049 01:00:24,160 --> 01:00:26,210 We're asking, if there's no clonal interference, 1050 01:00:26,210 --> 01:00:28,751 it means that one of these is much larger than the other one. 1051 01:00:28,751 --> 01:00:29,350 Which one? 1052 01:00:29,350 --> 01:00:34,140 Ready, three, two, one. 1053 01:00:34,140 --> 01:00:39,370 It means that this thing is much larger than this one. 1054 01:00:39,370 --> 01:00:46,820 So this thing has to be much larger than t establish to fix. 1055 01:00:49,980 --> 01:00:53,030 Because if it's the case this time scale 1056 01:00:53,030 --> 01:00:55,460 is very large compared to this time scale of spreading, 1057 01:00:55,460 --> 01:00:58,280 then the mutations don't ever clonal interfere. 1058 01:01:01,840 --> 01:01:04,500 I think it's very important that you can reconstruct 1059 01:01:04,500 --> 01:01:08,230 this whole argument because it has 1060 01:01:08,230 --> 01:01:10,690 all the ingredients of the dynamics in terms 1061 01:01:10,690 --> 01:01:14,235 of stochastic, and then deterministic. 
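The two time scales in this argument can be put side by side in a short Python sketch. It assumes the forms given on the board, a spreading time of (1/s) log(n s) and an establishment time of 1/(mu n s), both in generations. The function names and the hard threshold in the comparison are illustrative choices; the lecture only says that one time scale must be much larger than the other.

```python
import math


def t_spread(n, s):
    """Generations to grow from n_established ~ 1/s up to size n at rate s.

    Only meaningful when n * s > 1, so that the log is positive.
    """
    return math.log(n * s) / s


def t_establish(mu, n, s):
    """Mean generations between successive established beneficial mutations."""
    return 1.0 / (mu * n * s)


def clonal_interference_expected(mu, n, s):
    """Crude check: a new lineage usually establishes while another still spreads.

    The sharp cutoff is an assumption made for illustration; the regime of
    no clonal interference requires t_establish to be much larger than t_spread.
    """
    return t_establish(mu, n, s) < t_spread(n, s)
```

With n = 1e4, s = 0.02, and mu = 1e-8, spreading takes roughly 265 generations while established mutations arrive about every 500,000 generations, so lineages sweep one at a time. With n = 1e7 and mu = 1e-6 at the same s, established mutations arrive about every 5 generations, far faster than any one of them can spread.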
1062 01:01:16,930 --> 01:01:20,290 If you find this stuff confusing, please 1063 01:01:20,290 --> 01:01:22,940 go through it with a friend or come to me after class. 1064 01:01:26,560 --> 01:01:28,570 So what we want to do for the last 20 minutes 1065 01:01:28,570 --> 01:01:32,680 is, then, talk about Roy's paper. 1066 01:01:32,680 --> 01:01:37,796 And I think that this paper is interesting and subtle. 1067 01:01:37,796 --> 01:01:39,670 It's a little bit hard to tell how much of it 1068 01:01:39,670 --> 01:01:45,020 is an experimental kind of straight up demonstration 1069 01:01:45,020 --> 01:01:47,350 or how much of it is that it's just a way of getting 1070 01:01:47,350 --> 01:01:49,179 you to think new thoughts. 1071 01:01:49,179 --> 01:01:51,470 It certainly is a way to get you to think new thoughts. 1072 01:01:51,470 --> 01:01:54,990 At least I felt that it had that effect on me, 1073 01:01:54,990 --> 01:01:57,510 and that after reading it, I just 1074 01:01:57,510 --> 01:01:59,620 got very excited about all the dynamics 1075 01:01:59,620 --> 01:02:03,670 that were at play here, and how out 1076 01:02:03,670 --> 01:02:05,660 of the complexity of this clonal interference 1077 01:02:05,660 --> 01:02:07,270 process, maybe, in some cases, there 1078 01:02:07,270 --> 01:02:10,060 might be some even simpler phenomenological type 1079 01:02:10,060 --> 01:02:12,164 description of what was going on. 1080 01:02:12,164 --> 01:02:14,580 But I think that what is very interesting about that paper 1081 01:02:14,580 --> 01:02:17,210 is this idea that what they really wanted to measure 1082 01:02:17,210 --> 01:02:19,140 was this probability distribution 1083 01:02:19,140 --> 01:02:22,200 of beneficial mutations. 1084 01:02:22,200 --> 01:02:25,874 So what we mean by that is that different beneficial mutations 1085 01:02:25,874 --> 01:02:27,040 will have different effects.
1086 01:02:36,790 --> 01:02:40,350 And in particular, you could imagine a thought experiment 1087 01:02:40,350 --> 01:02:43,480 where you take an E. coli cell. 1088 01:02:43,480 --> 01:02:49,770 How big is the E. coli genome roughly? 1089 01:02:49,770 --> 01:02:52,685 And just like it's always good to have guideposts for how long 1090 01:02:52,685 --> 01:02:54,935 ago things happened, it's also good to have guideposts 1091 01:02:54,935 --> 01:02:56,860 for things like it's just good to have 1092 01:02:56,860 --> 01:03:00,590 a few genome sizes memorized. 1093 01:03:00,590 --> 01:03:02,300 Now does anybody know the E. coli genome? 1094 01:03:02,300 --> 01:03:04,474 AUDIENCE: 2,000 base pairs. 1095 01:03:04,474 --> 01:03:05,640 PROFESSOR: 2,000 base pairs? 1096 01:03:05,640 --> 01:03:07,485 AUDIENCE: 2,000 [INAUDIBLE]. 1097 01:03:07,485 --> 01:03:10,330 PROFESSOR: No, I just want to make sure that we're-- yeah. 1098 01:03:10,330 --> 01:03:13,800 Yeah, it's a bit more than that. 1099 01:03:13,800 --> 01:03:16,000 It's a few million base pairs, three or four, 1100 01:03:16,000 --> 01:03:18,100 depending on the strain. 1101 01:03:18,100 --> 01:03:21,190 But you can imagine a thought experiment where you go in, 1102 01:03:21,190 --> 01:03:26,490 and at every base pair you make the three possible point 1103 01:03:26,490 --> 01:03:28,120 mutations. 1104 01:03:28,120 --> 01:03:31,700 So then you can imagine having 10 million different strains 1105 01:03:31,700 --> 01:03:35,020 that are each different at one site. 1106 01:03:35,020 --> 01:03:39,310 And then what we can do is we can plot the rate of population 1107 01:03:39,310 --> 01:03:41,510 growth in some environment. 1108 01:03:41,510 --> 01:03:43,810 So this is gamma. 1109 01:03:43,810 --> 01:03:51,040 It's the 1 over n dn/dt, basically division, right? 1110 01:03:51,040 --> 01:03:59,110 Normalized by the wild type division. 1111 01:03:59,110 --> 01:04:04,070 I'm sorry, that's supposed to be down there. 
1112 01:04:04,070 --> 01:04:08,970 So this is gamma over gamma of wild type. 1113 01:04:08,970 --> 01:04:15,510 And this could be E. coli in LB or minimal media or at 1114 01:04:15,510 --> 01:04:16,040 30 [? ci ?]. 1115 01:04:16,040 --> 01:04:19,260 You pick some reasonable environment. 1116 01:04:19,260 --> 01:04:21,440 Question is, if you draw the histogram of what 1117 01:04:21,440 --> 01:04:24,265 this thing should look like, what should it be? 1118 01:04:29,390 --> 01:04:32,960 Now what I'm going to ask you to do is, in 30 seconds, 1119 01:04:32,960 --> 01:04:35,490 draw something on your sheet of paper. 1120 01:04:38,014 --> 01:04:39,430 This is supposed to be a frequency 1121 01:04:39,430 --> 01:04:45,107 or histogram over those 10 million point mutations. 1122 01:04:45,107 --> 01:04:46,940 So we haven't actually done this experiment. 1123 01:04:46,940 --> 01:04:54,160 But I think we have made all possible gene 1124 01:04:54,160 --> 01:05:01,910 knockouts in E. coli, so removing each of the genes. 1125 01:05:01,910 --> 01:05:03,540 And we've done that in E. coli. 1126 01:05:03,540 --> 01:05:04,822 We've done it in yeast. 1127 01:05:04,822 --> 01:05:07,280 And indeed, in yeast, they're actually making and measuring 1128 01:05:07,280 --> 01:05:11,270 the division rate for all pairwise knockouts. 1129 01:05:11,270 --> 01:05:14,600 So they're some fraction of the way through. 1130 01:05:14,600 --> 01:05:17,315 They published their first 10 million measurements 1131 01:05:17,315 --> 01:05:19,690 for the division rate of the pairwise knockouts in yeast. 1132 01:05:19,690 --> 01:05:21,350 This is Charlie Boone at Toronto. 1133 01:05:21,350 --> 01:05:23,630 It's an amazing data set. 1134 01:05:23,630 --> 01:05:25,000 Sorry, this is a simpler one. 1135 01:05:25,000 --> 01:05:27,850 So this is just if we make a histogram of the growth 1136 01:05:27,850 --> 01:05:33,530 rate of E. coli, all 10 million possible point 1137 01:05:33,530 --> 01:05:35,279 mutations in E. 
coli, what do you 1138 01:05:35,279 --> 01:05:36,528 think it's going to look like? 1139 01:05:44,680 --> 01:05:49,380 And I'm going to come by, and if I don't see distribution drawn 1140 01:05:49,380 --> 01:05:53,300 on your sheet of paper, then you'll 1141 01:05:53,300 --> 01:05:56,825 get to draw it up here on the board for all of us. 1142 01:06:03,529 --> 01:06:06,847 AUDIENCE: It's totally a distribution. 1143 01:06:06,847 --> 01:06:07,430 PROFESSOR: OK. 1144 01:06:14,754 --> 01:06:15,740 All right. 1145 01:06:19,009 --> 01:06:20,008 AUDIENCE: He was coming. 1146 01:06:20,008 --> 01:06:21,415 He was coming to me. 1147 01:06:21,415 --> 01:06:22,290 PROFESSOR: All right. 1148 01:06:22,290 --> 01:06:24,248 And what I find interesting about this exercise 1149 01:06:24,248 --> 01:06:26,440 is that you get all possible distributions. 1150 01:06:30,276 --> 01:06:34,530 No, but it's funny, because it's like a super basic question. 1151 01:06:34,530 --> 01:06:38,800 And somehow we don't necessarily think about it, or whatnot. 1152 01:06:38,800 --> 01:06:40,250 Could somebody throw out what they 1153 01:06:40,250 --> 01:06:41,976 think maybe should be going on? 1154 01:06:41,976 --> 01:06:42,928 Yeah. 1155 01:06:42,928 --> 01:06:46,736 AUDIENCE: So I'm expecting a large fraction to get 0. 1156 01:06:52,410 --> 01:06:56,030 PROFESSOR: So it could be that they're all here. 1157 01:06:56,030 --> 01:06:59,036 AUDIENCE: You know, most will start-- not most, but large 1158 01:06:59,036 --> 01:07:01,007 fractions [INAUDIBLE]. 1159 01:07:01,007 --> 01:07:03,590 PROFESSOR: So it could be that a large fraction of these point 1160 01:07:03,590 --> 01:07:07,280 mutations, and the cell is just dead. 1161 01:07:07,280 --> 01:07:09,540 So this is what you call a pessimistic view of nature 1162 01:07:09,540 --> 01:07:11,740 or of life. 1163 01:07:11,740 --> 01:07:14,970 And he has trouble crossing the street because he's never sure. 1164 01:07:14,970 --> 01:07:16,366 What do you think? 
1165 01:07:16,366 --> 01:07:21,159 AUDIENCE: I actually think that there's a much larger fraction 1166 01:07:21,159 --> 01:07:22,992 where it's exactly the same, because they've 1167 01:07:22,992 --> 01:07:25,478 got a lot of point mutations that are just substitutions. 1168 01:07:25,478 --> 01:07:26,311 PROFESSOR: Yeah, OK. 1169 01:07:26,311 --> 01:07:28,970 So lots of them, maybe. 1170 01:07:28,970 --> 01:07:31,630 So these are the two polar views of the world, 1171 01:07:31,630 --> 01:07:35,225 that nothing matters, or you're all dead. 1172 01:07:37,737 --> 01:07:39,320 AUDIENCE: I think you can reconcile it. 1173 01:07:39,320 --> 01:07:41,240 I think you can say-- 1174 01:07:41,240 --> 01:07:45,040 PROFESSOR: OK, well, now I think you're being too-- OK, 1175 01:07:45,040 --> 01:07:46,510 how are we going to reconcile them? 1176 01:07:46,510 --> 01:07:48,551 AUDIENCE: So we know that there's the third 1177 01:07:48,551 --> 01:07:49,644 wobble, right? 1178 01:07:49,644 --> 01:07:50,310 PROFESSOR: Sure. 1179 01:07:50,310 --> 01:07:53,572 AUDIENCE: And so there's probably at least 1/3 1180 01:07:53,572 --> 01:07:57,486 somewhere it just doesn't matter at all, where 1181 01:07:57,486 --> 01:07:59,014 you get the same code. 1182 01:07:59,014 --> 01:08:00,930 PROFESSOR: Although there are cases, actually, 1183 01:08:00,930 --> 01:08:02,660 where even silent mutations end up changing 1184 01:08:02,660 --> 01:08:03,951 the protein expression level. 1185 01:08:03,951 --> 01:08:07,056 All right, but I don't know if this reconciles 1186 01:08:07,056 --> 01:08:08,180 the two views of the world. 1187 01:08:12,520 --> 01:08:18,135 So some people have drawn things that look like this. 1188 01:08:18,135 --> 01:08:21,090 Some people will draw it, and things look like this. 1189 01:08:21,090 --> 01:08:22,056 Yeah. 1190 01:08:22,056 --> 01:08:23,514 And we have a uniform distribution.
1191 01:08:27,029 --> 01:08:30,069 If you're going to make a null model, I mean, I don't know. 1192 01:08:30,069 --> 01:08:32,630 So I would say that we have enough information 1193 01:08:32,630 --> 01:08:35,290 to say something about what this thing should look like. 1194 01:08:38,340 --> 01:08:43,695 So first of all, some fraction of genes are indeed essential. 1195 01:08:49,300 --> 01:08:53,430 For 10% or 20% of the genes, you remove it, the cell is dead. 1196 01:08:53,430 --> 01:08:56,340 That doesn't mean that 10% to 20% of point mutations 1197 01:08:56,340 --> 01:08:59,420 will lead to a lethal phenotype, but it 1198 01:08:59,420 --> 01:09:05,359 means that if you do inactivate or knock out that gene, 1199 01:09:05,359 --> 01:09:06,979 then it will be lethal. 1200 01:09:06,979 --> 01:09:12,547 So that means that there are indeed some set of these point 1201 01:09:12,547 --> 01:09:14,130 mutations that are going to be lethal, 1202 01:09:14,130 --> 01:09:17,720 but it's going to be small. 1203 01:09:17,720 --> 01:09:19,710 And I don't know what the actual number 1204 01:09:19,710 --> 01:09:29,040 would be, but maybe 1 in 10 to the 4, something small. 1205 01:09:29,040 --> 01:09:34,270 Because it's probably only 10% of the locations on the gene 1206 01:09:34,270 --> 01:09:35,590 would actually knock it out. 1207 01:09:35,590 --> 01:09:37,100 I'm making up that number. 1208 01:09:37,100 --> 01:09:39,640 The protein guys would have a much better sense. 1209 01:09:39,640 --> 01:09:41,439 But it's going to be a small number. 1210 01:09:41,439 --> 01:09:43,410 It might only be 1% of the mutations, actually, 1211 01:09:43,410 --> 01:09:46,140 would actually knock out the function of the protein. 1212 01:09:46,140 --> 01:09:49,690 And then not all of the genome actually codes for proteins. 1213 01:09:49,690 --> 01:09:51,689 So you start multiplying small numbers together, 1214 01:09:51,689 --> 01:09:53,439 and you get something that's rather small.
1215 01:09:53,439 --> 01:09:55,810 But there will be some number of mutations there. 1216 01:09:55,810 --> 01:09:57,640 However, the vast majority of mutations 1217 01:09:57,640 --> 01:10:00,780 will have no measurable fitness effect. 1218 01:10:00,780 --> 01:10:06,600 So indeed, this thing kind of peaks here, and then 1219 01:10:06,600 --> 01:10:07,880 rather sharply comes down. 1220 01:10:07,880 --> 01:10:12,440 And this is on a linear scale, or something like that. 1221 01:10:12,440 --> 01:10:15,680 So the width might indeed be a few percent. 1222 01:10:20,210 --> 01:10:23,480 But actually, probably even less, maybe 1%, 1223 01:10:23,480 --> 01:10:26,182 or half, something small. 1224 01:10:26,182 --> 01:10:26,682 Yeah. 1225 01:10:26,682 --> 01:10:29,670 AUDIENCE: How skewed is it? 1226 01:10:29,670 --> 01:10:32,205 PROFESSOR: So on a linear scale, it doesn't look skewed. 1227 01:10:32,205 --> 01:10:34,120 On a log scale, I think it does. 1228 01:10:34,120 --> 01:10:36,550 I mean, in the sense that if you plot log frequency 1229 01:10:36,550 --> 01:10:38,700 on this axis, then there's a longer tail 1230 01:10:38,700 --> 01:10:40,520 on the left than on the right. 1231 01:10:40,520 --> 01:10:44,350 But I'd say on a linear scale, it's just pretty sharp function 1232 01:10:44,350 --> 01:10:44,850 there. 1233 01:10:44,850 --> 01:10:46,475 AUDIENCE: So people have measured that? 1234 01:10:47,990 --> 01:10:50,760 PROFESSOR: Certainly for all the gene knockouts, 1235 01:10:50,760 --> 01:10:52,080 we have measured this. 1236 01:10:52,080 --> 01:10:55,980 And even that distribution is sharply peaked. 1237 01:10:55,980 --> 01:10:58,570 So the point mutation distribution's 1238 01:10:58,570 --> 01:11:00,926 going to be even more tightly peaked. 1239 01:11:00,926 --> 01:11:02,509 AUDIENCE: For most genes, you can just 1240 01:11:02,509 --> 01:11:04,284 knock them out and [INAUDIBLE]. 1241 01:11:04,284 --> 01:11:04,950 PROFESSOR: Yeah. 
1242 01:11:04,950 --> 01:11:05,450 Sorry. 1243 01:11:05,450 --> 01:11:07,990 So for most genes, at least in many environments-- 1244 01:11:07,990 --> 01:11:10,406 and of course, it depends on what environment you measure. 1245 01:11:10,406 --> 01:11:12,420 But for most genes, you can knock them out. 1246 01:11:12,420 --> 01:11:14,670 Certainly in rich media, most genes you can knock out. 1247 01:11:23,790 --> 01:11:31,400 This, as I said, might be 1 in 10 to the 3, 4. 1248 01:11:35,210 --> 01:11:37,770 I might be off by a couple orders of magnitude. 1249 01:11:37,770 --> 01:11:40,950 But the point is that this is a small fraction over here. 1250 01:11:40,950 --> 01:11:41,612 Most are here. 1251 01:11:41,612 --> 01:11:44,070 And there will be some point mutations that come down here. 1252 01:11:44,070 --> 01:11:46,210 But on a linear scale, it's going to be very small. 1253 01:11:51,160 --> 01:11:53,270 Now from the standpoint of a population evolving 1254 01:11:53,270 --> 01:11:55,490 to a new environment, which mutations 1255 01:11:55,490 --> 01:11:57,292 are we most interested in? 1256 01:11:57,292 --> 01:11:58,600 AUDIENCE: The beneficial. 1257 01:11:58,600 --> 01:12:00,099 PROFESSOR: The beneficial mutations. 1258 01:12:00,099 --> 01:12:02,840 So we're most interested in these guys. 1259 01:12:05,870 --> 01:12:09,250 And indeed, this is the distribution 1260 01:12:09,250 --> 01:12:14,120 that [INAUDIBLE] set out to try and measure in that experiment 1261 01:12:14,120 --> 01:12:17,180 that we just read about last night. 1262 01:12:17,180 --> 01:12:19,840 So he wanted to know, if you zoom in here-- 1263 01:12:19,840 --> 01:12:22,400 because remember, this is even narrower than I've actually 1264 01:12:22,400 --> 01:12:24,400 drawn. 1265 01:12:24,400 --> 01:12:28,320 But this distribution, the probability 1266 01:12:28,320 --> 01:12:32,110 of a beneficial mutation of magnitude s-- s, here 0, 1267 01:12:32,110 --> 01:12:36,555 s-- it's going to do something.
1268 01:12:39,240 --> 01:12:40,915 Many mutations that are nearly neutral, 1269 01:12:40,915 --> 01:12:42,540 but it's going to fall off in some way, 1270 01:12:42,540 --> 01:12:44,660 and they wanted to try and measure this. 1271 01:12:44,660 --> 01:12:48,190 Now how many mutations confer a 2% advantage, 4%, 1272 01:12:48,190 --> 01:12:49,690 for this particular E. coli strain 1273 01:12:49,690 --> 01:12:52,810 in that particular environment? 1274 01:12:52,810 --> 01:12:55,874 But it turned out to be difficult to measure. 1275 01:12:55,874 --> 01:12:57,790 And what was the reason that they gave for why 1276 01:12:57,790 --> 01:12:58,956 it was difficult to measure? 1277 01:13:06,000 --> 01:13:06,540 Somebody? 1278 01:13:06,540 --> 01:13:07,201 Anybody? 1279 01:13:07,201 --> 01:13:07,700 Please? 1280 01:13:16,208 --> 01:13:17,124 AUDIENCE: [INAUDIBLE]. 1281 01:13:21,526 --> 01:13:22,150 PROFESSOR: Yes. 1282 01:13:22,150 --> 01:13:23,691 But the equivalence principle doesn't 1283 01:13:23,691 --> 01:13:26,860 say that they're all equivalent necessarily. 1284 01:13:26,860 --> 01:13:31,523 AUDIENCE: Well, you can make different [INAUDIBLE]. 1285 01:13:31,523 --> 01:13:33,064 So there are different distributions, 1286 01:13:33,064 --> 01:13:35,530 and they each have their parameters. 1287 01:13:35,530 --> 01:13:36,199 PROFESSOR: Yeah. 1288 01:13:36,199 --> 01:13:37,574 AUDIENCE: You can pick parameters 1289 01:13:37,574 --> 01:13:43,140 for each of them, such that they will give you the same-- 1290 01:13:43,140 --> 01:13:44,570 PROFESSOR: OK, that's right. 1291 01:13:44,570 --> 01:13:47,440 So what they found was that different underlying 1292 01:13:47,440 --> 01:13:50,724 distributions-- this probability distribution 1293 01:13:50,724 --> 01:13:52,140 for beneficial mutations, function 1294 01:13:52,140 --> 01:13:55,200 of s-- different underlying distributions 1295 01:13:55,200 --> 01:13:58,570 could give you the same final output in their measurements.
1296 01:13:58,570 --> 01:13:59,430 But why was that? 1297 01:14:05,850 --> 01:14:06,560 Yeah, John? 1298 01:14:06,560 --> 01:14:09,162 AUDIENCE: The only thing you can measure is what fixes. 1299 01:14:09,162 --> 01:14:09,870 PROFESSOR: Right. 1300 01:14:09,870 --> 01:14:13,590 So you can only measure what fixes, 1301 01:14:13,590 --> 01:14:17,130 or becomes a significant fraction of the population, 10%. 1302 01:14:17,130 --> 01:14:18,805 Only measure what fixes or grows. 1303 01:14:22,520 --> 01:14:25,442 And what does that mean? 1304 01:14:25,442 --> 01:14:29,292 AUDIENCE: Well, it means that you're not probing [INAUDIBLE]. 1305 01:14:29,292 --> 01:14:30,000 PROFESSOR: Right. 1306 01:14:30,000 --> 01:14:32,484 It means that we're not probing that distribution, and why not? 1307 01:14:32,484 --> 01:14:33,400 AUDIENCE: [INAUDIBLE]. 1308 01:14:36,064 --> 01:14:36,730 PROFESSOR: Yeah. 1309 01:14:36,730 --> 01:14:39,105 But why is it that we only see part of this distribution? 1310 01:14:39,105 --> 01:14:40,520 What was their-- 1311 01:14:40,520 --> 01:14:42,510 AUDIENCE: We don't see the ones that die out. 1312 01:14:42,510 --> 01:14:44,426 PROFESSOR: We don't see the ones that die out. 1313 01:14:44,426 --> 01:14:46,430 And why is it that they die out? 1314 01:14:46,430 --> 01:14:48,055 Or why is it that they-- 1315 01:14:48,055 --> 01:14:49,270 AUDIENCE: [INAUDIBLE]. 1316 01:14:49,270 --> 01:14:52,540 PROFESSOR: OK, yeah. 1317 01:14:52,540 --> 01:14:53,200 Yes. 1318 01:14:53,200 --> 01:14:54,699 So the fact of stochastic extinction 1319 01:14:54,699 --> 01:14:55,790 is very relevant here. 1320 01:14:55,790 --> 01:14:57,790 And I think that in this paper they a little bit 1321 01:14:57,790 --> 01:15:01,250 underplay that aspect of it, because they're primarily 1322 01:15:01,250 --> 01:15:03,410 arguing that it was another effect that led 1323 01:15:03,410 --> 01:15:05,680 to this equivalence principle. 1324 01:15:05,680 --> 01:15:06,924 Right?
1325 01:15:06,924 --> 01:15:07,890 AUDIENCE: Competition. 1326 01:15:07,890 --> 01:15:09,060 PROFESSOR: Yeah, clonal interference. 1327 01:15:09,060 --> 01:15:11,101 I mean, that's why we spent the last hour talking 1328 01:15:11,101 --> 01:15:12,780 about clonal interference. 1329 01:15:12,780 --> 01:15:14,870 So their argument in this paper was 1330 01:15:14,870 --> 01:15:16,950 that, as a result of clonal interference, 1331 01:15:16,950 --> 01:15:19,960 competition between these different lineages, 1332 01:15:19,960 --> 01:15:22,280 then the distribution of fitness effects 1333 01:15:22,280 --> 01:15:24,350 that you measure, that fix, is not 1334 01:15:24,350 --> 01:15:26,600 the distribution that was the underlying distribution. 1335 01:15:29,210 --> 01:15:34,240 Because less fit beneficial mutations get out-competed. 1336 01:15:34,240 --> 01:15:35,850 But there's a very important question, 1337 01:15:35,850 --> 01:15:41,540 which is, let's imagine that let's have mu go to 0. 1338 01:15:44,550 --> 01:15:46,060 Or we go to small populations. 1339 01:15:46,060 --> 01:15:50,250 Let's say that after reading his paper 1340 01:15:50,250 --> 01:15:52,690 I said, OK, well, yeah, because of clonal interference 1341 01:15:52,690 --> 01:15:55,829 I can't measure this probability distribution in the set up 1342 01:15:55,829 --> 01:15:56,370 that he used. 1343 01:15:56,370 --> 01:15:59,300 But maybe if I go to small population sizes, 1344 01:15:59,300 --> 01:16:03,280 or I reduce the mutation rate somehow-- magic-- 1345 01:16:03,280 --> 01:16:07,850 then would this allow me to measure that distribution? 1346 01:16:07,850 --> 01:16:09,470 So I'm going to ask you guys to vote. 1347 01:16:09,470 --> 01:16:10,370 So mu, let's say. 1348 01:16:10,370 --> 01:16:12,470 Our experiment, I have mu go to 0. 1349 01:16:12,470 --> 01:16:16,990 Does this mean that when I then go do write CFP, 1350 01:16:16,990 --> 01:16:21,610 YFP, they're 50-50, and then you start doing this. 
1351 01:16:21,610 --> 01:16:25,227 And we measure the slope to get s, 1352 01:16:25,227 --> 01:16:26,435 just like in this experiment. 1353 01:16:29,085 --> 01:16:38,100 If I plot the distribution of resulting s's, no clonal 1354 01:16:38,100 --> 01:16:41,660 interference anymore, because I've set mu equal to 0. 1355 01:16:41,660 --> 01:16:45,160 So this thing goes to infinity. 1356 01:16:45,160 --> 01:16:50,710 Do I recover the underlying distribution? 1357 01:16:50,710 --> 01:16:51,710 Ready. 1358 01:16:51,710 --> 01:16:57,210 So Pb measured, as a function of s, 1359 01:16:57,210 --> 01:17:00,380 is it equal to the true one? 1360 01:17:00,380 --> 01:17:01,080 Question mark. 1361 01:17:07,770 --> 01:17:10,150 I'm going to give you 15 seconds to think about this. 1362 01:17:30,420 --> 01:17:31,350 Ready. 1363 01:17:31,350 --> 01:17:35,150 Three, two, one. 1364 01:17:35,150 --> 01:17:35,650 All right. 1365 01:17:35,650 --> 01:17:38,880 So we have, I'd say, a majority of nos. 1366 01:17:38,880 --> 01:17:40,550 And can somebody say why not? 1367 01:17:40,550 --> 01:17:43,540 AUDIENCE: You can't really see the established mutation. 1368 01:17:43,540 --> 01:17:44,540 PROFESSOR: That's right. 1369 01:17:44,540 --> 01:17:47,130 Because even if you don't have clonal interference, 1370 01:17:47,130 --> 01:17:48,850 you still only see the mutations that 1371 01:17:48,850 --> 01:17:51,547 survive stochastic extinction. 1372 01:17:51,547 --> 01:17:53,380 And I think this is a really important point 1373 01:17:53,380 --> 01:17:55,338 that, at least in a first reading of the paper, 1374 01:17:55,338 --> 01:17:57,460 you might not realize. 1375 01:17:57,460 --> 01:17:57,960 Yeah, John. 1376 01:17:57,960 --> 01:17:58,876 AUDIENCE: [INAUDIBLE]. 1377 01:18:01,170 --> 01:18:02,170 PROFESSOR: That's right. 1378 01:18:02,170 --> 01:18:05,300 So in principle, we know that the probability of survival 1379 01:18:05,300 --> 01:18:07,730 goes as s.
1380 01:18:07,730 --> 01:18:10,460 But you end up losing a fair amount of information. 1381 01:18:10,460 --> 01:18:15,620 So for example, let's say that this thing is an exponential. 1382 01:18:15,620 --> 01:18:17,990 So this is Pb of s. 1383 01:18:17,990 --> 01:18:21,180 Let's just say it's an exponential. 1384 01:18:21,180 --> 01:18:24,770 And indeed, from some ideas from extreme value theory, 1385 01:18:24,770 --> 01:18:28,226 there are arguments that maybe this should be an exponential. 1386 01:18:28,226 --> 01:18:29,600 And if you're curious about this, 1387 01:18:29,600 --> 01:18:32,550 come and ask me after class. 1388 01:18:32,550 --> 01:18:35,480 But let's just imagine that this thing is exponential. 1389 01:18:35,480 --> 01:18:37,420 Now the question is what would we actually 1390 01:18:37,420 --> 01:18:42,940 measure in this experiment in the absence 1391 01:18:42,940 --> 01:18:43,940 of clonal interference? 1392 01:18:47,870 --> 01:18:50,172 Would we measure a distribution-- 1393 01:18:50,172 --> 01:18:53,390 would the distribution be maximal 1394 01:18:53,390 --> 01:18:57,230 at 0 or at some other value? 1395 01:18:57,230 --> 01:18:58,690 Verbally, 0 or other value? 1396 01:18:58,690 --> 01:19:00,804 Ready, three, two, one. 1397 01:19:00,804 --> 01:19:01,720 AUDIENCE: Other value. 1398 01:19:01,720 --> 01:19:02,800 PROFESSOR: Other value. 1399 01:19:02,800 --> 01:19:06,250 And indeed, the probability of surviving extinction, 1400 01:19:06,250 --> 01:19:10,170 remember this x1 goes as s means that the distribution they 1401 01:19:10,170 --> 01:19:12,090 actually measure will start at 0, 1402 01:19:12,090 --> 01:19:19,750 grow linearly, and then fall off. 1403 01:19:19,750 --> 01:19:21,290 So this distribution is essentially 1404 01:19:21,290 --> 01:19:23,289 like this distribution of the convolution of two 1405 01:19:23,289 --> 01:19:30,410 exponentials, where this thing goes up linearly, 1406 01:19:30,410 --> 01:19:33,580 and then curves off. 
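The argument above (an exponential underlying distribution, weighted by a survival probability that goes as s) can be checked with a small simulation. This is only a sketch under stated assumptions: the mean effect `mean_s = 0.05`, the survival probability taken as exactly `min(1, s)` (prefactor 1), and the helper names `sample_established` and `modal_bin` are all illustrative choices, not values or methods from the lecture or the paper.

```python
import random


def sample_established(mean_s, n_samples, rng):
    """Draw selection coefficients of mutations that escape stochastic extinction.

    Assumptions (illustrative, not from the lecture): the underlying
    distribution of beneficial effects is exponential with mean `mean_s`,
    and a mutation with effect s survives with probability min(1, s).
    """
    survivors = []
    while len(survivors) < n_samples:
        s = rng.expovariate(1.0 / mean_s)   # underlying effect, peaked at 0
        if rng.random() < min(1.0, s):      # survives drift with probability ~ s
            survivors.append(s)
    return survivors


def modal_bin(values, bin_width):
    """Return the index of the most populated histogram bin."""
    counts = {}
    for v in values:
        b = int(v / bin_width)
        counts[b] = counts.get(b, 0) + 1
    return max(counts, key=counts.get)


if __name__ == "__main__":
    rng = random.Random(0)
    mean_s = 0.05
    measured = sample_established(mean_s, 20000, rng)
    # Analytically the measured density is proportional to s * exp(-s / mean_s),
    # which peaks at s = mean_s and has mean 2 * mean_s.
    print("mean of measured s:", sum(measured) / len(measured))
    print("modal bin starts at s =", modal_bin(measured, 0.01) * 0.01)
```

Even though every draw comes from a distribution that is largest at s = 0, the surviving mutations pile up away from 0, which is the "starts at 0, grows linearly, then falls off" shape described above.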
1407 01:19:33,580 --> 01:19:37,240 So I think that a very important point to realize in all of this 1408 01:19:37,240 --> 01:19:39,370 is that this is the probability distribution that's 1409 01:19:39,370 --> 01:19:46,220 measured as a function of s with no clonal interference, right? 1410 01:19:46,220 --> 01:19:47,780 So the statement is that even if you 1411 01:19:47,780 --> 01:19:49,890 don't have any clonal interference, already 1412 01:19:49,890 --> 01:19:51,723 you're going to measure a peaked distribution. 1413 01:19:56,760 --> 01:20:00,521 The underlying distribution will be peaked at 0. 1414 01:20:00,521 --> 01:20:02,020 But what you measure is always going 1415 01:20:02,020 --> 01:20:02,930 to be peaked somewhere else. 1416 01:20:02,930 --> 01:20:04,550 And indeed, if you go and you look 1417 01:20:04,550 --> 01:20:08,702 in the paper at the measured alpha as a function of s, 1418 01:20:08,702 --> 01:20:10,660 you see something that looks kind of like this. 1419 01:20:10,660 --> 01:20:14,527 So it's a little bit difficult to be absolutely certain. 1420 01:20:14,527 --> 01:20:17,110 What's going on is that in the absence of clonal interference, 1421 01:20:17,110 --> 01:20:17,700 you get this. 1422 01:20:17,700 --> 01:20:20,160 With clonal interference, let's say you have two mutations 1423 01:20:20,160 --> 01:20:21,400 and you want to know, OK, what are you going to measure? 1424 01:20:21,400 --> 01:20:23,355 Well, you sample from this distribution twice, 1425 01:20:23,355 --> 01:20:25,950 and you take the larger of the two. 1426 01:20:25,950 --> 01:20:27,910 Because this distribution is already 1427 01:20:27,910 --> 01:20:31,050 the probability distribution of mutations that get established. 1428 01:20:31,050 --> 01:20:35,495 So then this is going to grow as a quadratic here. 1429 01:20:35,495 --> 01:20:37,040 If you have more clonal interference, 1430 01:20:37,040 --> 01:20:40,800 then this thing shifts to the right and gets tighter.
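The "sample from the established distribution and take the larger" picture of clonal interference can be sketched numerically. The code below is a toy illustration, not the paper's analysis: it assumes the established distribution is proportional to s times an exponential (a Gamma distribution with shape 2), that exactly `n_competitors` established mutations compete, and that the fittest always wins. The function names and the value `mean_s = 0.05` are hypothetical. "Tighter" is measured here as relative width (standard deviation divided by mean), which is one possible reading of the statement.

```python
import random
import statistics


def winning_effects(n_competitors, mean_s, n_trials, rng):
    """Selection coefficient of the winner among competing established mutations.

    Assumptions (illustrative): each established mutation has an effect drawn
    from a Gamma(shape=2, scale=mean_s) distribution, i.e. a density
    proportional to s * exp(-s / mean_s), and the largest effect wins.
    """
    winners = []
    for _ in range(n_trials):
        established = [rng.gammavariate(2.0, mean_s) for _ in range(n_competitors)]
        winners.append(max(established))
    return winners


def mean_and_relative_width(values):
    """Return (mean, standard deviation / mean) of the sampled effects."""
    mu = statistics.mean(values)
    return mu, statistics.pstdev(values) / mu


if __name__ == "__main__":
    rng = random.Random(1)
    for k in (1, 2, 10):
        mu, width = mean_and_relative_width(winning_effects(k, 0.05, 5000, rng))
        print(f"competitors={k:2d}  mean s={mu:.3f}  relative width={width:.2f}")
```

With more competing lineages the mean winning effect moves to larger s while the relative width shrinks, which is the shift to the right and tightening described above.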
1431 01:20:40,800 --> 01:20:43,506 So this is the process by which you end up 1432 01:20:43,506 --> 01:20:44,630 with a peaked distribution. 1433 01:20:47,294 --> 01:20:48,710 So what we're going to do is we're 1434 01:20:48,710 --> 01:20:53,050 going to start in the first 15 minutes on Tuesday. 1435 01:20:53,050 --> 01:20:56,530 We'll talk about this experiment a bit more. 1436 01:20:56,530 --> 01:20:59,440 And then we'll go on to think about fitness landscapes 1437 01:20:59,440 --> 01:21:00,930 and the rate of evolution. 1438 01:21:00,930 --> 01:21:03,179 If you have any questions about anything that we said, 1439 01:21:03,179 --> 01:21:03,941 please come on up.