1 00:00:00,090 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,010 Commons license. 3 00:00:04,010 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,210 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,210 --> 00:00:17,835 at ocw.mit.edu. 8 00:00:21,640 --> 00:00:25,060 PROFESSOR: All right, why don't we go ahead and get started. 9 00:00:25,060 --> 00:00:27,680 So today what we want to do is start 10 00:00:27,680 --> 00:00:30,350 thinking a bit about evolution in finite populations. 11 00:00:30,350 --> 00:00:32,549 And of course, what we mean by that, 12 00:00:32,549 --> 00:00:36,530 is evolution in populations where we have to really think 13 00:00:36,530 --> 00:00:38,750 about stochastic dynamics. 14 00:00:38,750 --> 00:00:42,120 And now in general, just like in the context 15 00:00:42,120 --> 00:00:45,230 of gene networks within cells, the situation 16 00:00:45,230 --> 00:00:47,250 where we have to worry about stochastic dynamics 17 00:00:47,250 --> 00:00:50,570 is in the small number kind of limit. 18 00:00:50,570 --> 00:00:53,690 What's perhaps surprising about evolution 19 00:00:53,690 --> 00:00:56,720 is that they're always-- the small numbers 20 00:00:56,720 --> 00:00:58,570 are always important. 21 00:00:58,570 --> 00:01:00,570 Even if you're in a large population, 22 00:01:00,570 --> 00:01:04,311 I'd say 10 to the 9 individuals, if you want to study evolution, 23 00:01:04,311 --> 00:01:06,560 then you're interested in cases where new mutants will 24 00:01:06,560 --> 00:01:07,560 arise in the population. 25 00:01:07,560 --> 00:01:10,270 And kind of by definition, those new mutants 26 00:01:10,270 --> 00:01:14,040 start out as kind of a single member of the population. 
27 00:01:14,040 --> 00:01:16,450 Which means that in the context of evolution, 28 00:01:16,450 --> 00:01:22,490 we always have to think about stochastic type dynamics. 29 00:01:22,490 --> 00:01:25,500 Now the basic model that we're going to use in this class 30 00:01:25,500 --> 00:01:30,440 is the Moran process, which is a model that 31 00:01:30,440 --> 00:01:32,220 fixes population size. 32 00:01:32,220 --> 00:01:36,870 And then instead of having discrete generations, where 33 00:01:36,870 --> 00:01:38,414 all the individuals are reproducing 34 00:01:38,414 --> 00:01:40,580 at the same time-- which is what you might have seen 35 00:01:40,580 --> 00:01:42,314 in the Wright-Fisher process-- instead, 36 00:01:42,314 --> 00:01:43,980 we're going to think about the situation 37 00:01:43,980 --> 00:01:47,180 where it occurs more stepwise. 38 00:01:47,180 --> 00:01:51,390 In the sense that individuals reproduce one at a time. 39 00:01:51,390 --> 00:01:55,050 And then we can track the dynamics of the population. 40 00:01:55,050 --> 00:01:57,900 So we're going to think about both the situation where we're 41 00:01:57,900 --> 00:02:01,100 trying to understand neutral dynamics, when we're tracking 42 00:02:01,100 --> 00:02:05,850 the composition of a population when the fitness of individuals 43 00:02:05,850 --> 00:02:08,639 is equal or nearly equal. 44 00:02:08,639 --> 00:02:10,199 Because in stochastic dynamics, 45 00:02:10,199 --> 00:02:12,250 there are interesting things that happen. 46 00:02:12,250 --> 00:02:13,750 But then we'll get into the question 47 00:02:13,750 --> 00:02:15,114 of non-neutral evolution. 48 00:02:15,114 --> 00:02:17,280 And really, we want to consider both halves of that. 49 00:02:17,280 --> 00:02:20,020 All right, so in many cases, in the context of evolution, 50 00:02:20,020 --> 00:02:24,520 we're interested in, or focused on beneficial mutants. 
51 00:02:24,520 --> 00:02:25,939 Now for those beneficial mutants, 52 00:02:25,939 --> 00:02:27,730 one of the basic things we're going to find 53 00:02:27,730 --> 00:02:33,460 is that even beneficial mutants will typically go extinct. 54 00:02:33,460 --> 00:02:36,790 It doesn't mean that they're not important over the long run. 55 00:02:36,790 --> 00:02:39,690 But it does mean that there is a very real sense 56 00:02:39,690 --> 00:02:42,630 that randomness is dominating the life of even 57 00:02:42,630 --> 00:02:44,360 beneficial mutants. 58 00:02:44,360 --> 00:02:46,140 And then finally, if there's time, 59 00:02:46,140 --> 00:02:49,310 we will discuss this idea of Muller's ratchet, which 60 00:02:49,310 --> 00:02:51,770 is basically pointing out that if there 61 00:02:51,770 --> 00:02:55,190 are deleterious mutants or mutations in the population, 62 00:02:55,190 --> 00:02:58,300 those deleterious mutations can in some cases 63 00:02:58,300 --> 00:02:59,830 spread and fix in the population. 64 00:02:59,830 --> 00:03:02,550 And when that happens, you can have 65 00:03:02,550 --> 00:03:05,590 a decrease in the fitness of a population over time. 66 00:03:05,590 --> 00:03:09,020 And this is particularly a strong effect 67 00:03:09,020 --> 00:03:12,410 for small populations, because small populations, 68 00:03:12,410 --> 00:03:15,290 they're not as effective, what you might call filters, 69 00:03:15,290 --> 00:03:16,460 for selection. 70 00:03:19,054 --> 00:03:20,470 And so what we want to do is start 71 00:03:20,470 --> 00:03:22,011 by thinking about this Moran process. 72 00:03:27,887 --> 00:03:29,470 And the key feature here is that we're 73 00:03:29,470 --> 00:03:34,840 going to have a constant population size, constant N. 
74 00:03:34,840 --> 00:03:37,060 And that's not because we believe 75 00:03:37,060 --> 00:03:40,550 that real populations always have a fixed population size, 76 00:03:40,550 --> 00:03:44,100 but rather, we want to try to get some intuition 77 00:03:44,100 --> 00:03:45,570 in this simple model. 78 00:03:45,570 --> 00:03:47,950 And then of course, it's reasonable to ask, well, 79 00:03:47,950 --> 00:03:51,410 which aspects of the mathematics or intuition 80 00:03:51,410 --> 00:03:54,700 we develop are going to change as a result of allowing 81 00:03:54,700 --> 00:03:56,974 fluctuations in the total population size? 82 00:03:56,974 --> 00:03:58,390 But I think there's a lot of value 83 00:03:58,390 --> 00:04:02,384 in starting out by analyzing the simplest model that you can. 84 00:04:02,384 --> 00:04:03,800 So what we're going to think about 85 00:04:03,800 --> 00:04:06,380 is a situation where we have a population composed 86 00:04:06,380 --> 00:04:07,380 of N individuals. 87 00:04:07,380 --> 00:04:10,930 And for now we'll just consider two types, A and B. 88 00:04:10,930 --> 00:04:16,890 And this is going to be a model for asexually reproducing 89 00:04:16,890 --> 00:04:17,589 populations. 90 00:04:17,589 --> 00:04:19,440 Constant N, asexual. 91 00:04:22,964 --> 00:04:24,380 What that means is, in particular, 92 00:04:24,380 --> 00:04:29,710 that we're going to assume that an A individual can 93 00:04:29,710 --> 00:04:31,170 lead to two A individuals. 94 00:04:31,170 --> 00:04:36,130 Similarly, a B individual can lead to two B individuals. 95 00:04:36,130 --> 00:04:39,300 Right, so you can think about this as, for example, a model 96 00:04:39,300 --> 00:04:43,450 for how microbial populations may evolve. 97 00:04:43,450 --> 00:04:47,179 And for now, we will not consider any mutations. 98 00:04:47,179 --> 00:04:49,470 All right, so we're going to think about the process-- we'll 99 00:04:49,470 --> 00:04:51,400 assume that those mutations are already there. 
100 00:04:51,400 --> 00:04:54,510 So A and B could be different. 101 00:04:54,510 --> 00:04:56,630 They could have, for example, different-- 102 00:04:56,630 --> 00:04:59,270 they could be different at some point mutation 103 00:04:59,270 --> 00:05:02,940 site of some gene that is relevant for growing 104 00:05:02,940 --> 00:05:04,760 at low glucose concentration, for example. 105 00:05:04,760 --> 00:05:06,170 OK? 106 00:05:06,170 --> 00:05:09,460 So here we're going to-- so this is birth slash 107 00:05:09,460 --> 00:05:11,860 division, and in particular here we're going to, for now, 108 00:05:11,860 --> 00:05:14,420 assume no mutation. 109 00:05:14,420 --> 00:05:17,200 So we'll assume that A's always give birth 110 00:05:17,200 --> 00:05:19,340 to A's, and B's always give birth to B's. 111 00:05:29,550 --> 00:05:33,330 We'll follow the nomenclature from the reading 112 00:05:33,330 --> 00:05:37,060 that you guys did last night, Martin Nowak's book, chapter 113 00:05:37,060 --> 00:05:40,230 six, where we're going to think about-- we're 114 00:05:40,230 --> 00:05:47,580 going to assume that there are initially i A individuals, 115 00:05:47,580 --> 00:05:50,610 and therefore, N minus i B individuals. 116 00:05:55,280 --> 00:06:02,840 Now we'll assume that the basic process for the-- in this Moran 117 00:06:02,840 --> 00:06:06,630 process is that you have reproduction, or birth, that's 118 00:06:06,630 --> 00:06:08,695 proportional to fitness. 119 00:06:15,942 --> 00:06:17,400 And then the resulting kind of what 120 00:06:17,400 --> 00:06:20,170 you might call a daughter cell replaces one member 121 00:06:20,170 --> 00:06:21,670 of the population at random. 122 00:06:21,670 --> 00:06:24,650 So there's birth and then replacement. 123 00:06:24,650 --> 00:06:27,360 And indeed we'll assume that replacement, 124 00:06:27,360 --> 00:06:28,860 that the daughter cell, for example, 125 00:06:28,860 --> 00:06:34,040 could even replace the mother cell, if we want. 
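The birth-and-replacement cycle just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `moran_step` and the fitness parameters `r_A` and `r_B` are made up here, birth is taken to be proportional to fitness, and the replaced individual is drawn uniformly from the original N, so the daughter may replace her own mother.

```python
import random

def moran_step(i, N, r_A=1.0, r_B=1.0, rng=random):
    """One birth-replacement cycle; returns the new number of A individuals."""
    # Birth: the parent's type is chosen with probability proportional to
    # the total fitness carried by that type.
    weight_A = r_A * i
    weight_B = r_B * (N - i)
    parent_is_A = rng.random() * (weight_A + weight_B) < weight_A
    # Replacement: one of the original N individuals is chosen uniformly.
    replaced_is_A = rng.random() < i / N
    if parent_is_A and not replaced_is_A:
        return i + 1  # an A is born, a B is replaced
    if replaced_is_A and not parent_is_A:
        return i - 1  # a B is born, an A is replaced
    return i          # parent and replaced individual have the same type

# Hypothetical usage: follow a population of N = 9 starting from i = 3.
rng = random.Random(0)
i = 3
for _ in range(5):
    i = moran_step(i, 9, rng=rng)
```

Because each cycle is exactly one birth and one replacement, the returned value can differ from `i` by at most one, and `i = 0` and `i = N` never change.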
126 00:06:34,040 --> 00:06:35,520 So this is just the birth. 127 00:06:35,520 --> 00:06:37,645 So A is going to lead-- there's going to be two As, 128 00:06:37,645 --> 00:06:39,290 and this new A will have to replace 129 00:06:39,290 --> 00:06:41,570 one of the other individuals in the population, 130 00:06:41,570 --> 00:06:43,210 to keep constant population size. 131 00:06:47,000 --> 00:06:47,846 All right. 132 00:06:47,846 --> 00:06:50,230 Are there any questions about the basic model? 133 00:06:52,960 --> 00:06:55,210 OK. 134 00:06:55,210 --> 00:06:58,945 So that, in principle, here we can use this model 135 00:06:58,945 --> 00:07:01,600 to try to understand both neutral and non-neutral 136 00:07:01,600 --> 00:07:02,100 evolution. 137 00:07:02,100 --> 00:07:04,391 But let's start out by thinking about the neutral case. 138 00:07:08,480 --> 00:07:13,630 So in particular, the fitness, rA is equal to rB. 139 00:07:16,330 --> 00:07:19,600 Now what I want to do is, given the rules we 140 00:07:19,600 --> 00:07:24,730 just kind of laid out for you, let's assume that i over n 141 00:07:24,730 --> 00:07:28,790 is equal to one third. 142 00:07:28,790 --> 00:07:32,870 So for now, we'll say, OK a third of the population is A, 143 00:07:32,870 --> 00:07:35,610 2/3 is then B. 144 00:07:35,610 --> 00:07:38,640 And we can think about these probabilities 145 00:07:38,640 --> 00:07:41,230 of going from i to i plus 1, as compared 146 00:07:41,230 --> 00:07:44,210 to going from i to i minus 1. 147 00:07:44,210 --> 00:07:48,280 So these are the probabilities that in one cycle of birth 148 00:07:48,280 --> 00:07:52,600 replacement, the number of A's goes up one or goes down one. 149 00:07:52,600 --> 00:07:56,420 Can you ever go up two or three or four in the Moran process 150 00:07:56,420 --> 00:07:57,500 in one step? 151 00:07:57,500 --> 00:07:58,300 No. 152 00:07:58,300 --> 00:08:02,860 Because each step is always one birth and one replacement. 
153 00:08:02,860 --> 00:08:04,240 So you can move, at most, one. 154 00:08:04,240 --> 00:08:08,100 Do you always move-- does i change always? 155 00:08:08,100 --> 00:08:09,489 No. 156 00:08:09,489 --> 00:08:11,280 And what we want to know is the probability 157 00:08:11,280 --> 00:08:16,620 of going from i to i plus 1, as compared to the probability 158 00:08:16,620 --> 00:08:18,680 of going from i to i minus one. 159 00:08:18,680 --> 00:08:21,940 The ratio of these probabilities is equal to what? 160 00:08:39,780 --> 00:08:42,380 We're considering a case where the A's and B's have 161 00:08:42,380 --> 00:08:44,560 the same fitness, so they're somehow 162 00:08:44,560 --> 00:08:48,190 equal per capita probability of being chosen to reproduce. 163 00:08:48,190 --> 00:08:51,297 But there's-- but we're not in a symmetric population 164 00:08:51,297 --> 00:08:52,130 distribution, right? 165 00:08:52,130 --> 00:08:57,300 So 1/3 of the population is A, 2/3 is B. 166 00:08:57,300 --> 00:09:00,800 So I'll give you 20 seconds to think about this. 167 00:09:26,130 --> 00:09:27,570 All right, do you need more time? 168 00:09:31,450 --> 00:09:32,670 Everybody nod or shake. 169 00:09:32,670 --> 00:09:35,490 Do you need more time? 170 00:09:35,490 --> 00:09:35,990 OK. 171 00:09:35,990 --> 00:09:37,990 I'll give you another 10 seconds, because it's-- 172 00:09:49,120 --> 00:09:51,350 Let's go ahead and see where we are. 173 00:09:51,350 --> 00:09:52,530 Ready? 174 00:09:52,530 --> 00:09:55,850 Three, two, one. 175 00:09:55,850 --> 00:09:56,510 All right. 176 00:09:56,510 --> 00:10:01,780 We have a wide range of different answers here. 177 00:10:01,780 --> 00:10:03,020 OK, perfect. 178 00:10:03,020 --> 00:10:06,599 This is exactly the situation that we hope for. 179 00:10:06,599 --> 00:10:07,640 So turn to your neighbor. 180 00:10:07,640 --> 00:10:09,420 You should certainly be able to find somebody 181 00:10:09,420 --> 00:10:10,480 that disagrees with you. 
182 00:10:10,480 --> 00:10:12,990 So if the first person you turn to agrees with you, 183 00:10:12,990 --> 00:10:14,699 try to find somebody else to talk to. 184 00:11:26,100 --> 00:11:28,210 All right, why don't we go ahead and reconvene. 185 00:11:28,210 --> 00:11:30,606 I know that there was quite a lot of disagreement, 186 00:11:30,606 --> 00:11:32,480 so that means that you guys will probably not 187 00:11:32,480 --> 00:11:35,280 be able to converge in this one minute time frame. 188 00:11:35,280 --> 00:11:39,160 But let me just see, let me see if anybody's mind 189 00:11:39,160 --> 00:11:40,550 was changed by their neighbors. 190 00:11:40,550 --> 00:11:42,190 All right, let's re-vote. 191 00:11:42,190 --> 00:11:45,730 Ready, three, two, one. 192 00:11:45,730 --> 00:11:50,330 OK, all right, so it's pretty much 193 00:11:50,330 --> 00:11:53,270 the same as where we started, maybe. 194 00:11:53,270 --> 00:11:54,970 All right. 195 00:11:54,970 --> 00:11:57,610 OK I would say-- does anybody want to volunteer 196 00:11:57,610 --> 00:11:59,747 what their neighbor said? 197 00:11:59,747 --> 00:12:01,080 I know what your neighbors said. 198 00:12:01,080 --> 00:12:03,550 So tell us. 199 00:12:03,550 --> 00:12:07,130 AUDIENCE: OK, so if you-- so I would say it's E. 200 00:12:07,130 --> 00:12:09,130 And the reason is that-- so there are two cases. 201 00:12:09,130 --> 00:12:12,260 In the first case, both the number of A and B stay the same. 202 00:12:12,260 --> 00:12:15,050 Right, for example, A gets born and A dies. 203 00:12:15,050 --> 00:12:18,226 So you first decide, is the pop-- is the number of A and B 204 00:12:18,226 --> 00:12:20,499 going to change, or is it not going to change? 205 00:12:20,499 --> 00:12:21,082 PROFESSOR: OK. 206 00:12:21,082 --> 00:12:22,873 AUDIENCE: Once you've decided that it's not 207 00:12:22,873 --> 00:12:24,550 going to change-- I'm sorry. 
208 00:12:24,550 --> 00:12:25,750 Once you've decided that it is going to change-- 209 00:12:25,750 --> 00:12:26,690 PROFESSOR: Right. 210 00:12:26,690 --> 00:12:28,481 AUDIENCE: --then you just want to know, OK, 211 00:12:28,481 --> 00:12:31,762 then what's the probability that you just choose A to change 212 00:12:31,762 --> 00:12:33,590 [INAUDIBLE]. 213 00:12:33,590 --> 00:12:35,051 PROFESSOR: OK. 214 00:12:35,051 --> 00:12:35,550 Yeah. 215 00:12:35,550 --> 00:12:37,542 AUDIENCE: And then the probability that A-- 216 00:12:37,542 --> 00:12:38,540 you choose-- 217 00:12:38,540 --> 00:12:39,520 PROFESSOR: OK. 218 00:12:39,520 --> 00:12:42,590 But you haven't said anything about replacement yet. 219 00:12:42,590 --> 00:12:46,830 So I'm-- replacement should be-- because certainly, 220 00:12:46,830 --> 00:12:49,620 we're talking about the ratio of the probability that the number 221 00:12:49,620 --> 00:12:52,120 of A goes up, as compared to the probability that the number 222 00:12:52,120 --> 00:12:52,810 of A goes down. 223 00:12:52,810 --> 00:12:53,310 Right? 224 00:12:53,310 --> 00:12:56,160 So we've already, in some ways, excluded the cases 225 00:12:56,160 --> 00:12:59,540 where the number of A individuals doesn't change. 226 00:12:59,540 --> 00:13:02,300 And in your-- what you just told us, 227 00:13:02,300 --> 00:13:03,790 you're asking about the probability 228 00:13:03,790 --> 00:13:08,831 that individuals are going to be chosen to reproduce. 229 00:13:08,831 --> 00:13:11,657 AUDIENCE: Yeah, because [INAUDIBLE]. 230 00:13:11,657 --> 00:13:12,240 PROFESSOR: OK. 231 00:13:12,240 --> 00:13:13,465 Yeah, but I guess all I'm saying is 232 00:13:13,465 --> 00:13:14,850 that there're going to be two halves to this, right? 
233 00:13:14,850 --> 00:13:15,990 So you have to think about the probability 234 00:13:15,990 --> 00:13:18,910 that an individual is being chosen to reproduce, and also 235 00:13:18,910 --> 00:13:22,440 the probability that a particular type of individual 236 00:13:22,440 --> 00:13:24,260 will be chosen to get replaced. 237 00:13:24,260 --> 00:13:26,660 So it's the-- there's somehow a balance of those two. 238 00:13:26,660 --> 00:13:28,004 AUDIENCE: [INAUDIBLE] 239 00:13:28,004 --> 00:13:29,670 PROFESSOR: Right, because in this case-- 240 00:13:29,670 --> 00:13:30,180 AUDIENCE: [INAUDIBLE] replace-- 241 00:13:30,180 --> 00:13:33,340 PROFESSOR: --replace, this is-- right, death, slash-- right. 242 00:13:33,340 --> 00:13:35,680 And I should maybe just highlight-- if you want, 243 00:13:35,680 --> 00:13:37,490 you could call-- replacement-- I mean, 244 00:13:37,490 --> 00:13:40,566 this is just a nice way of saying death, right? 245 00:13:40,566 --> 00:13:41,065 Death-- 246 00:13:44,641 --> 00:13:45,640 AUDIENCE: Yeah, you can. 247 00:13:45,640 --> 00:13:48,150 I mean, [? the point ?] [? is ?] once you've ruled out-- 248 00:13:48,150 --> 00:13:49,941 once you say, OK, the populations are going 249 00:13:49,941 --> 00:13:53,940 to change, then if you choose an A to reproduce, a B has to die. 250 00:13:53,940 --> 00:13:55,836 PROFESSOR: Oh, OK, once you've already-- 251 00:13:55,836 --> 00:13:57,544 AUDIENCE: If an A is chosen to reproduce, 252 00:13:57,544 --> 00:13:59,040 and an A is chosen to die, then-- 253 00:13:59,040 --> 00:14:00,489 PROFESSOR: OK, yeah, right. 254 00:14:00,489 --> 00:14:02,530 But, OK, I think I understand what you're saying. 255 00:14:02,530 --> 00:14:05,000 But we-- you still have to keep track of there 256 00:14:05,000 --> 00:14:06,060 are the two sides. 257 00:14:06,060 --> 00:14:07,810 There's the replace-- there's the birth, 258 00:14:07,810 --> 00:14:08,685 and the replacements. 
259 00:14:08,685 --> 00:14:12,180 And we have to figure out how the relative probabilities 260 00:14:12,180 --> 00:14:14,144 or rates that those two things happen. 261 00:14:14,144 --> 00:14:16,560 Does somebody want to make an argument for something else? 262 00:14:16,560 --> 00:14:18,737 I mean, we'll see how this plays out in a moment. 263 00:14:18,737 --> 00:14:20,078 AUDIENCE: I want to argue for C. 264 00:14:20,078 --> 00:14:21,420 PROFESSOR: OK. 265 00:14:21,420 --> 00:14:22,859 AUDIENCE: So take the numerator. 266 00:14:22,859 --> 00:14:23,525 PROFESSOR: Yeah. 267 00:14:23,525 --> 00:14:26,000 AUDIENCE: In order to go from i to i plus 1-- 268 00:14:26,000 --> 00:14:27,857 so we're going to take two individuals 269 00:14:27,857 --> 00:14:28,690 from the population. 270 00:14:28,690 --> 00:14:31,695 We need one of them to be type A, that's the one that's 271 00:14:31,695 --> 00:14:32,514 going to reproduce. 272 00:14:32,514 --> 00:14:33,005 PROFESSOR: Yep. 273 00:14:33,005 --> 00:14:34,478 AUDIENCE: And the other to be type B, the one that's 274 00:14:34,478 --> 00:14:35,019 going to die. 275 00:14:35,019 --> 00:14:35,951 PROFESSOR: Yeah. 276 00:14:35,951 --> 00:14:37,430 AUDIENCE: So we get the product of those-- 277 00:14:37,430 --> 00:14:37,840 PROFESSOR: Perfect. 278 00:14:37,840 --> 00:14:38,220 OK. 279 00:14:38,220 --> 00:14:40,750 And we can actually just be more-- be explicit about this. 280 00:14:40,750 --> 00:14:43,880 OK, so the probability-- in one cycle, the probability 281 00:14:43,880 --> 00:14:45,680 that you go from i to i plus one. 282 00:14:45,680 --> 00:14:48,690 That requires that two things happen. 283 00:14:48,690 --> 00:14:52,550 One is that you choose an A individual to reproduce. 284 00:14:52,550 --> 00:14:54,300 And what's the probability that you choose 285 00:14:54,300 --> 00:14:55,520 an A individual to reproduce? 286 00:14:55,520 --> 00:14:57,019 AUDIENCE: It's going to be i over N. 
287 00:14:57,019 --> 00:15:00,450 PROFESSOR: i over N. So we have i over N. Right, 288 00:15:00,450 --> 00:15:08,350 so this is the probability that A reproduces. 289 00:15:08,350 --> 00:15:12,370 And then for i to go from-- to increase by one, 290 00:15:12,370 --> 00:15:15,020 requires not only that an A individual is chosen 291 00:15:15,020 --> 00:15:17,880 to reproduce, but that a B individual is chosen 292 00:15:17,880 --> 00:15:19,910 for replacement, or death. 293 00:15:19,910 --> 00:15:22,217 And what's the probability that that's going to happen? 294 00:15:22,217 --> 00:15:24,960 AUDIENCE: That's N minus i all over N. 295 00:15:24,960 --> 00:15:30,720 PROFESSOR: N minus i, all over N. OK. 296 00:15:30,720 --> 00:15:34,680 So this is the probability that in one cycle 297 00:15:34,680 --> 00:15:37,380 you're going to go from i to i plus 1. 298 00:15:37,380 --> 00:15:40,570 Now of course it's not-- we haven't 299 00:15:40,570 --> 00:15:42,520 said what the probability of staying in i is, 300 00:15:42,520 --> 00:15:45,050 but this is the probability that i will increase by one. 301 00:15:48,370 --> 00:15:51,380 Do we agree? 302 00:15:51,380 --> 00:15:53,480 And indeed, where is it that we've assumed-- where 303 00:15:53,480 --> 00:15:58,400 is it that we've assumed neutrality in this calculation? 304 00:15:58,400 --> 00:16:02,606 That A and B have equal fitness? 305 00:16:02,606 --> 00:16:03,106 Yep? 306 00:16:03,106 --> 00:16:05,189 AUDIENCE: Just take the probability of reproducing 307 00:16:05,189 --> 00:16:06,890 to be about-- or, the-- 308 00:16:06,890 --> 00:16:08,020 PROFESSOR: That's right. 309 00:16:08,020 --> 00:16:08,980 That's right. 310 00:16:08,980 --> 00:16:13,090 So indeed, we've-- this probability that A reproduces, 311 00:16:13,090 --> 00:16:15,630 we've assumed that it's just simply i over N. Whereas, 312 00:16:15,630 --> 00:16:18,380 if it were non-neutral we'd have to write something else. 
313 00:16:18,380 --> 00:16:20,838 Maybe we'll figure out what that's going to be in a moment. 314 00:16:20,838 --> 00:16:22,710 But it's in here that we've assumed that. 315 00:16:22,710 --> 00:16:24,150 Incidentally, you could write down 316 00:16:24,150 --> 00:16:26,710 a reasonable model similar to the Moran process, 317 00:16:26,710 --> 00:16:29,110 where differences in fitness show up 318 00:16:29,110 --> 00:16:31,680 instead of here, in the probability of reproduction, 319 00:16:31,680 --> 00:16:34,620 you can have it as a difference in probability of death, 320 00:16:34,620 --> 00:16:35,966 or being replaced. 321 00:16:35,966 --> 00:16:37,632 But this is the most maybe intuitive way 322 00:16:37,632 --> 00:16:39,342 of thinking about it. 323 00:16:39,342 --> 00:16:41,050 And this is very similar to, for example, 324 00:16:41,050 --> 00:16:46,660 what happens in a, what you might call, 325 00:16:46,660 --> 00:16:50,530 a turbidostat, where you keep constant population size. 326 00:16:50,530 --> 00:16:55,630 And as the cells divide other cells are randomly sucked out. 327 00:16:55,630 --> 00:17:00,110 So I'd say that this Moran process is really 328 00:17:00,110 --> 00:17:03,000 a theoretical kind of implementation 329 00:17:03,000 --> 00:17:04,500 of what you could do experimentally, 330 00:17:04,500 --> 00:17:05,569 is this turbidostat. 331 00:17:05,569 --> 00:17:07,230 Which is like a chemostat, instead 332 00:17:07,230 --> 00:17:11,653 of keeping constant dilution rate, you fix population size. 333 00:17:11,653 --> 00:17:12,152 Yes. 334 00:17:12,152 --> 00:17:16,274 AUDIENCE: So do we care about the step of, OK, first A 335 00:17:16,274 --> 00:17:17,010 reproduces. 336 00:17:17,010 --> 00:17:19,545 Then, from the pool of new individuals-- 337 00:17:19,545 --> 00:17:22,760 because you're going to have N plus 1, so-- 338 00:17:22,760 --> 00:17:24,055 PROFESSOR: OK, so all right. 339 00:17:24,055 --> 00:17:25,930 I think maybe I wasn't totally clear on this. 
340 00:17:25,930 --> 00:17:27,535 OK so, you have N individuals here. 341 00:17:27,535 --> 00:17:28,910 What you're going to do is you're 342 00:17:28,910 --> 00:17:30,940 going to choose one of them randomly, 343 00:17:30,940 --> 00:17:33,070 maybe proportional to fitness for reproduction. 344 00:17:33,070 --> 00:17:34,910 And then, but then, from this original N, 345 00:17:34,910 --> 00:17:37,230 you choose one of them for death. 346 00:17:37,230 --> 00:17:38,480 So it's not-- you're not, yes. 347 00:17:38,480 --> 00:17:41,650 It's not-- so the daughter cell is not allowed to-- 348 00:17:41,650 --> 00:17:42,252 AUDIENCE: Die. 349 00:17:42,252 --> 00:17:42,960 PROFESSOR: Right. 350 00:17:42,960 --> 00:17:44,990 The daughter cell always replaces somebody, 351 00:17:44,990 --> 00:17:46,632 but it could've been the mother cell. 352 00:17:46,632 --> 00:17:48,840 If we're thinking about this in the context of cells. 353 00:17:53,640 --> 00:17:57,660 So we haven't yet figured out which answer is which, right? 354 00:17:57,660 --> 00:17:59,044 But we can go ahead. 355 00:17:59,044 --> 00:18:00,960 OK, this is the probability that A reproduces, 356 00:18:00,960 --> 00:18:03,460 and over here, this is the probability 357 00:18:03,460 --> 00:18:06,611 that a B individual is replaced. 358 00:18:06,611 --> 00:18:07,110 Right? 359 00:18:13,182 --> 00:18:14,640 What we can do is, we can ask, well 360 00:18:14,640 --> 00:18:18,780 what's the probability that we go from i to i minus 1? 361 00:18:18,780 --> 00:18:21,272 Well it's the exact kind of same calculation, 362 00:18:21,272 --> 00:18:22,730 except now what we want to know is, 363 00:18:22,730 --> 00:18:24,970 we want to know the probability that a B is 364 00:18:24,970 --> 00:18:26,952 chosen for reproduction. 365 00:18:26,952 --> 00:18:28,160 And what is that going to be? 366 00:18:28,160 --> 00:18:28,660 Somebody? 
367 00:18:31,220 --> 00:18:34,430 N minus i, right, the number of B individuals 368 00:18:34,430 --> 00:18:36,550 divided by the total number of individuals. 369 00:18:36,550 --> 00:18:40,170 So this is the probability that a B reproduces. 370 00:18:43,000 --> 00:18:45,470 And then what's the probability that A-- 371 00:18:45,470 --> 00:18:47,360 that an A type individual will be chosen 372 00:18:47,360 --> 00:18:51,000 for replacement or death? 373 00:18:51,000 --> 00:18:54,090 That's just i over N. The number of A individuals divided 374 00:18:54,090 --> 00:18:55,310 by the total population size. 375 00:19:05,320 --> 00:19:08,220 All right, does everybody agree with the two calculations 376 00:19:08,220 --> 00:19:08,930 that we just did? 377 00:19:12,100 --> 00:19:13,160 Let's re-vote. 378 00:19:15,940 --> 00:19:20,955 All right, ready, three, two, one. 379 00:19:20,955 --> 00:19:23,080 All right, see, you know, if we do the calculation, 380 00:19:23,080 --> 00:19:24,470 we can convince you. 381 00:19:24,470 --> 00:19:28,040 So indeed, these are equal, these two probabilities. 382 00:19:33,780 --> 00:19:34,930 Right, and this is funny. 383 00:19:34,930 --> 00:19:38,560 Because on the one hand it's like blindingly obvious, 384 00:19:38,560 --> 00:19:40,880 but then the other hand, you get yourself all tied up 385 00:19:40,880 --> 00:19:42,005 in knots thinking about it. 386 00:19:42,005 --> 00:19:45,750 So I don't understand why or how those two statements can 387 00:19:45,750 --> 00:19:50,090 be true at the same time, but they are. 388 00:19:50,090 --> 00:19:55,420 So this is indeed a random walk in i space, 389 00:19:55,420 --> 00:19:56,420 number of A individuals. 390 00:20:01,314 --> 00:20:03,230 And it sort of has to be, because these things 391 00:20:03,230 --> 00:20:04,340 are neutral. 392 00:20:04,340 --> 00:20:07,610 The fact that i over N is not equal to a half 393 00:20:07,610 --> 00:20:11,230 doesn't matter, because these two terms kind of cancel. 
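The two one-step probabilities worked out above can be checked with exact fractions. The following is a minimal sketch, assuming N = 9 and i = 3 purely for illustration (the lecture only fixes i over N at one third); the function name `neutral_transition_probs` is hypothetical.

```python
from fractions import Fraction

def neutral_transition_probs(i, N):
    """Return (P[i -> i+1], P[i -> i-1], P[i stays]) for the neutral Moran process."""
    # Up: an A is chosen to reproduce (i/N) and a B is chosen for replacement ((N-i)/N).
    p_up = Fraction(i, N) * Fraction(N - i, N)
    # Down: a B is chosen to reproduce ((N-i)/N) and an A is chosen for replacement (i/N).
    p_down = Fraction(N - i, N) * Fraction(i, N)
    # Otherwise the parent and the replaced individual have the same type.
    p_stay = 1 - p_up - p_down
    return p_up, p_down, p_stay

p_up, p_down, p_stay = neutral_transition_probs(3, 9)
print(p_up, p_down, p_stay)  # prints: 2/9 2/9 5/9
```

The up and down probabilities are the same product written in a different order, so their ratio is 1 for any i strictly between 0 and N, even though the population is not split half and half.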
394 00:20:11,230 --> 00:20:14,240 But indeed, all of the things that you know have to be true 395 00:20:14,240 --> 00:20:19,040 based on the fact that A and B have equal fitness, 396 00:20:19,040 --> 00:20:22,920 they're going to not work if this thing were not equal to 1, 397 00:20:22,920 --> 00:20:25,550 if these two probabilities were not equal. 398 00:20:25,550 --> 00:20:28,751 So any of these other answers would lead to things that you 399 00:20:28,751 --> 00:20:30,750 would clearly agree are going to be nonsensical, 400 00:20:30,750 --> 00:20:34,110 if you think through the consequence of this. 401 00:20:34,110 --> 00:20:35,610 And we're going to do one right now. 402 00:20:40,230 --> 00:20:49,010 All right, let's imagine that we start-- so here's 403 00:20:49,010 --> 00:20:53,060 the number of A individuals, i. 404 00:20:53,060 --> 00:20:55,060 I apologize that that's the nomenclature we 405 00:20:55,060 --> 00:20:57,720 have for a number of A individuals, 406 00:20:57,720 --> 00:21:01,320 but we want to be consistent with Martin's book. 407 00:21:01,320 --> 00:21:07,660 Now let's say this is N and let's say 408 00:21:07,660 --> 00:21:12,250 we start out at some i here. 409 00:21:12,250 --> 00:21:16,855 The question is, what's the probability that B fixes? 410 00:21:16,855 --> 00:21:19,230 I want to make sure I write down some reasonable options. 411 00:21:23,580 --> 00:21:30,680 So what we want to know is the probability that B fixes, 412 00:21:30,680 --> 00:21:36,560 and that means that it takes over eventually. 413 00:21:36,560 --> 00:21:38,050 That B, we'll say eventually. 414 00:21:45,210 --> 00:21:49,270 In the Moran process with neutral dynamics. 415 00:22:15,266 --> 00:22:18,160 AUDIENCE: I's the number of A, right? 416 00:22:18,160 --> 00:22:19,580 PROFESSOR: That's right. 417 00:22:19,580 --> 00:22:23,320 i is the number of A individuals. 418 00:22:42,540 --> 00:22:45,125 I'm going to give you seven more seconds. 
419 00:23:01,390 --> 00:23:05,336 All right, ready, three, two, one. 420 00:23:08,240 --> 00:23:11,260 All right, so we have-- it's kind of mostly split between 421 00:23:11,260 --> 00:23:12,354 C's and D's. 422 00:23:12,354 --> 00:23:14,770 Although I'd say a majority of the group is going to say-- 423 00:23:14,770 --> 00:23:20,330 is saying that it's going to be D. All right, can-- all right, 424 00:23:20,330 --> 00:23:24,940 and this is the distinction between the probability that B 425 00:23:24,940 --> 00:23:27,424 fixes and that A fixes. 426 00:23:27,424 --> 00:23:28,840 I'm not trying to be super tricky, 427 00:23:28,840 --> 00:23:30,839 but I just want to make sure that you keep track 428 00:23:30,839 --> 00:23:33,580 of A's and B's. 429 00:23:33,580 --> 00:23:38,590 And in particular, as i increases, 430 00:23:38,590 --> 00:23:41,660 the probability that A fixes should go up or down? 431 00:23:41,660 --> 00:23:44,033 Verbally, three, two, one. 432 00:23:44,033 --> 00:23:44,880 AUDIENCE: Up. 433 00:23:44,880 --> 00:23:47,270 PROFESSOR: Up. 434 00:23:47,270 --> 00:23:52,070 This here-- over here is a bunch of A's, here is a bunch of B's. 435 00:23:52,070 --> 00:23:54,010 So if you have a larger i here, then you 436 00:23:54,010 --> 00:24:00,560 should be more likely to fix the A individuals, vice versa. 437 00:24:00,560 --> 00:24:04,510 So in particular, this-- the probability that B eventually 438 00:24:04,510 --> 00:24:07,820 fixes is going to be N minus i over N, whereas the probability that A 439 00:24:07,820 --> 00:24:12,100 will fix eventually is just going to be 1 minus that, 440 00:24:12,100 --> 00:24:13,020 it's i over N. 441 00:24:13,020 --> 00:24:18,910 So this is indeed what was pointed out in the book. 
442 00:24:18,910 --> 00:24:24,040 All right, and can somebody give an argument, verbally, 443 00:24:24,040 --> 00:24:26,480 for why the-- I mean, this is a result, 444 00:24:26,480 --> 00:24:29,350 that if you think about in the right way, 445 00:24:29,350 --> 00:24:32,640 you can just verbally say why it has to be this. 446 00:24:32,640 --> 00:24:39,442 Rather than writing down all the equations that-- so why 447 00:24:39,442 --> 00:24:41,650 is it that the probability that A will eventually fix 448 00:24:41,650 --> 00:24:42,816 has to be equal to i over N? 449 00:24:48,980 --> 00:24:49,568 Yeah. 450 00:24:49,568 --> 00:24:51,660 AUDIENCE: [INAUDIBLE] book. 451 00:24:51,660 --> 00:24:54,480 PROFESSOR: Yeah, perfect. 452 00:24:54,480 --> 00:24:57,480 AUDIENCE: So at this given time, there are N individuals-- 453 00:24:57,480 --> 00:24:58,755 PROFESSOR: Yep. 454 00:24:58,755 --> 00:25:00,380 And there will always be N individuals, 455 00:25:00,380 --> 00:25:01,463 because we're keeping it-- 456 00:25:01,463 --> 00:25:04,590 AUDIENCE: OK, yeah, right. 457 00:25:04,590 --> 00:25:06,410 Their descendents, at some point, 458 00:25:06,410 --> 00:25:08,354 the descendents of one of them is 459 00:25:08,354 --> 00:25:10,020 going to take over the whole population. 460 00:25:10,020 --> 00:25:10,645 That's a given. 461 00:25:10,645 --> 00:25:12,460 PROFESSOR: That's right. 462 00:25:12,460 --> 00:25:13,690 And that's fine. 463 00:25:13,690 --> 00:25:16,530 It's at first glance kind of surprising 464 00:25:16,530 --> 00:25:22,817 but, it's just the nature of-- if you imagine that they all 465 00:25:22,817 --> 00:25:24,900 were individually tagged, right, so it wasn't just 466 00:25:24,900 --> 00:25:26,900 that we had two types, A and B. But if they were 467 00:25:26,900 --> 00:25:30,390 all color coded using rainbow colors, 468 00:25:30,390 --> 00:25:32,410 then you could keep track of them. 469 00:25:32,410 --> 00:25:37,850 And one of the individuals will eventually fix. 
470 00:25:37,850 --> 00:25:40,612 Now, OK, and then what's next? 471 00:25:40,612 --> 00:25:42,590 AUDIENCE: So there are i individuals 472 00:25:42,590 --> 00:25:46,980 that are type A, so the probability 473 00:25:46,980 --> 00:25:51,150 that this one individual [INAUDIBLE] will fix. 474 00:25:51,150 --> 00:25:54,790 PROFESSOR: And then there is one ingredient in that argument 475 00:25:54,790 --> 00:25:57,170 that you didn't say, but I'm sure is in your mind. 476 00:25:57,170 --> 00:25:59,890 Which is, how is it that the probability-- so 477 00:25:59,890 --> 00:26:02,930 among these N individuals, right? 478 00:26:02,930 --> 00:26:04,840 One of them will eventually fix. 479 00:26:04,840 --> 00:26:08,480 And what's the probability that each one 480 00:26:08,480 --> 00:26:12,408 will be the lucky ancestor for all of the population? 481 00:26:12,408 --> 00:26:13,991 AUDIENCE: So it's equally distributed. 482 00:26:13,991 --> 00:26:14,657 PROFESSOR: Yeah. 483 00:26:14,657 --> 00:26:15,860 It's just 1 over N, right? 484 00:26:15,860 --> 00:26:18,235 So the idea is there are N individuals in the population, 485 00:26:18,235 --> 00:26:19,762 they're all identical. 486 00:26:19,762 --> 00:26:21,220 We know that eventually one of them 487 00:26:21,220 --> 00:26:23,450 is going to take over the population, just 488 00:26:23,450 --> 00:26:25,694 due to random stochastic dynamics. 489 00:26:25,694 --> 00:26:27,485 What that means is that each individual has 490 00:26:27,485 --> 00:26:32,760 a probability of 1 over N of taking over the population. 491 00:26:32,760 --> 00:26:35,220 And this is very important. 492 00:26:35,220 --> 00:26:46,480 So each individual has a 1 over N probability of fixing. 493 00:26:50,340 --> 00:26:53,150 And that's assuming that everybody in the population 494 00:26:53,150 --> 00:26:54,980 has the same fitness. 495 00:26:54,980 --> 00:26:57,819 And that's just by symmetry. 
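The symmetry argument above (every one of the N identical individuals has a 1 over N chance of being the eventual ancestor) can be checked numerically. What follows is only a minimal sketch, not something from the lecture: the function names, the seed, and the number of trials are all assumptions made for illustration, and it assumes that in each iteration the reproducing individual and the replaced individual are both drawn uniformly at random from the same N individuals.

```python
import random


def moran_fixation_of_a(i, n, rng):
    """Run one neutral Moran trajectory; return True if type A fixes.

    i is the current number of A individuals, n the fixed population size.
    """
    while 0 < i < n:
        # Neutral dynamics (assumed): the parent and the replaced individual
        # are each type A with probability i / n, independently.
        parent_is_a = rng.random() < i / n
        replaced_is_a = rng.random() < i / n
        if parent_is_a and not replaced_is_a:
            i += 1  # an A offspring replaces a B
        elif replaced_is_a and not parent_is_a:
            i -= 1  # a B offspring replaces an A
        # Otherwise the composition is unchanged in this iteration.
    return i == n


def estimate_fixation_probability(i, n, trials=5000, seed=0):
    """Fraction of simulated trajectories in which A takes over."""
    rng = random.Random(seed)
    wins = sum(moran_fixation_of_a(i, n, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    n = 10
    for i in (1, 3, 7):
        print(i, estimate_fixation_probability(i, n), "compare with", i / n)
```

If the argument holds, the estimate for a single tagged individual (i equal to 1) should come out close to 1 over N, up to sampling noise, and the estimate for i individuals close to i over N.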
496 00:26:57,819 --> 00:26:59,610 But then of course, you can also say, well, 497 00:26:59,610 --> 00:27:02,870 you know, if the probability of each individual is 1 over N, 498 00:27:02,870 --> 00:27:07,160 then the probability that one of these i individuals takes over 499 00:27:07,160 --> 00:27:11,969 is going to be i over N. 500 00:27:11,969 --> 00:27:14,404 AUDIENCE: The generalization [INAUDIBLE] 501 00:27:14,404 --> 00:27:17,813 reproducing organisms is that [INAUDIBLE] organisms 502 00:27:17,813 --> 00:27:21,709 every individual is probably the ancestor of [INAUDIBLE]. 503 00:27:21,709 --> 00:27:25,150 Like every individual takes [INAUDIBLE]. 504 00:27:25,150 --> 00:27:26,230 PROFESSOR: OK, all right. 505 00:27:26,230 --> 00:27:29,610 OK, now you want to allow for recombination. 506 00:27:29,610 --> 00:27:30,550 Is that-- 507 00:27:30,550 --> 00:27:31,816 AUDIENCE: I mean, yeah. 508 00:27:31,816 --> 00:27:32,690 PROFESSOR: Right, OK. 509 00:27:32,690 --> 00:27:36,960 So yes, there are several important aspects of sex. 510 00:27:36,960 --> 00:27:39,420 But one of the major ones is the recombination. 511 00:27:39,420 --> 00:27:43,530 And so if you have enough recombination, 512 00:27:43,530 --> 00:27:47,950 then everybody will contribute-- well, everybody. 513 00:27:47,950 --> 00:27:51,100 Then there will be-- then many, many individuals 514 00:27:51,100 --> 00:27:53,860 will contribute to the [? lineage. ?] 515 00:27:53,860 --> 00:27:56,190 What you often will hear people talk about 516 00:27:56,190 --> 00:28:01,120 is the ancestral Adam and the ancestral Eve. 517 00:28:01,120 --> 00:28:04,250 And that-- and what are people referring to about that? 518 00:28:10,200 --> 00:28:11,130 Yes, in the back. 519 00:28:11,130 --> 00:28:17,539 AUDIENCE: An individual with the-- an individual 520 00:28:17,539 --> 00:28:23,455 that [INAUDIBLE] early on [INAUDIBLE]. 521 00:28:26,440 --> 00:28:27,390 PROFESSOR: Yes, right.
522 00:28:27,390 --> 00:28:29,150 So there's this idea-- OK, so I don't 523 00:28:29,150 --> 00:28:31,340 want to get too much into the sexual-- sexually 524 00:28:31,340 --> 00:28:32,980 reproducing populations because that's 525 00:28:32,980 --> 00:28:34,590 covered more in other classes. 526 00:28:34,590 --> 00:28:38,370 And they're totally different models you would typically use. 527 00:28:38,370 --> 00:28:41,230 But I think the simplest way to think about some 528 00:28:41,230 --> 00:28:45,740 of this is just that there's some part of the genome that 529 00:28:45,740 --> 00:28:48,260 does not have recombination in the same way. 530 00:28:48,260 --> 00:28:49,650 So it's simpler. 531 00:28:49,650 --> 00:28:52,512 What part of the genome is that in us? 532 00:28:52,512 --> 00:28:53,470 AUDIENCE: Y chromosome. 533 00:28:53,470 --> 00:28:54,178 PROFESSOR: Right. 534 00:28:54,178 --> 00:28:57,170 So the Y chromosome, and that means that in principle you 535 00:28:57,170 --> 00:29:02,340 could track the dynamics along the male lineages. 536 00:29:02,340 --> 00:29:05,920 So there are all these studies, whatever, Genghis Khan, maybe 537 00:29:05,920 --> 00:29:08,240 lots of us are descendants of. 538 00:29:08,240 --> 00:29:11,550 Right, because he had lots of wives, or something like that. 539 00:29:11,550 --> 00:29:14,810 So his Y chromosome supposedly occupies 540 00:29:14,810 --> 00:29:18,490 a non-negligible fraction of the population. 541 00:29:18,490 --> 00:29:21,480 OK, so but then what about-- what's the other, 542 00:29:21,480 --> 00:29:23,340 yeah, so on the female side? 543 00:29:23,340 --> 00:29:24,642 What would be the equivalent? 544 00:29:24,642 --> 00:29:25,600 AUDIENCE: Mitochondria. 545 00:29:25,600 --> 00:29:26,690 PROFESSOR: Mitochondria. 546 00:29:26,690 --> 00:29:27,190 Right.
547 00:29:27,190 --> 00:29:29,270 So in principle, you can-- so I think 548 00:29:29,270 --> 00:29:32,140 for an awful lot of these studies you can-- 549 00:29:32,140 --> 00:29:37,290 the genetics are much simpler for those two lineages. 550 00:29:37,290 --> 00:29:39,500 Because you don't have the recombination. 551 00:29:39,500 --> 00:29:42,410 AUDIENCE: So why is the mitochondria-- I mean, 552 00:29:42,410 --> 00:29:45,025 you say that it's the most obvious thing. 553 00:29:45,025 --> 00:29:45,900 PROFESSOR: OK, yeah-- 554 00:29:45,900 --> 00:29:47,919 AUDIENCE: I've never heard that. 555 00:29:47,919 --> 00:29:49,210 PROFESSOR: Yeah, OK, OK, right. 556 00:29:49,210 --> 00:29:49,525 So-- 557 00:29:49,525 --> 00:29:50,130 AUDIENCE: I mean, we don't have to-- 558 00:29:50,130 --> 00:29:51,546 PROFESSOR: OK, well, you're right. 559 00:29:51,546 --> 00:29:55,880 So basically the situation is that we have cells, 560 00:29:55,880 --> 00:30:00,670 and most of the genome is in the nucleus. 561 00:30:00,670 --> 00:30:02,510 But then, but the mitochondria actually 562 00:30:02,510 --> 00:30:06,350 have their own mitochondrial DNA. 563 00:30:06,350 --> 00:30:09,664 And then the issue is, OK, well, what happens? 564 00:30:09,664 --> 00:30:11,830 You know, here's the birds and the bees talk for you 565 00:30:11,830 --> 00:30:12,530 guys, all right? 566 00:30:15,640 --> 00:30:18,510 Right, so the sperm comes, fertilizes the egg. 567 00:30:18,510 --> 00:30:25,570 And the vast majority of the mitochondria 568 00:30:25,570 --> 00:30:28,130 come from, or were in the egg, as 569 00:30:28,130 --> 00:30:30,000 compared to the mitochondria from the sperm. 570 00:30:30,000 --> 00:30:33,640 And I don't-- does anybody know if any of the sperm 571 00:30:33,640 --> 00:30:35,770 mitochondria actually contribute? 572 00:30:35,770 --> 00:30:38,645 Are they selectively-- does something happen to them? 573 00:30:38,645 --> 00:30:40,070 AUDIENCE: They don't have any. 
574 00:30:40,070 --> 00:30:41,000 PROFESSOR: Oh, they just don't have any? 575 00:30:41,000 --> 00:30:42,120 All right, whoo. 576 00:30:42,120 --> 00:30:43,770 All right, well, OK. 577 00:30:47,160 --> 00:30:48,630 OK, well that solves that problem. 578 00:30:48,630 --> 00:30:49,536 OK. 579 00:30:49,536 --> 00:30:51,820 AUDIENCE: Wait. 580 00:30:51,820 --> 00:30:55,940 PROFESSOR: Is that not your-- all right, well-- all right, 581 00:30:55,940 --> 00:30:59,030 this is the kind of thing that somebody could maybe Wikipedia 582 00:30:59,030 --> 00:31:01,550 this while we're going. 583 00:31:04,240 --> 00:31:06,205 But that's the basic idea, though. 584 00:31:10,090 --> 00:31:12,621 All right, does anybody have any questions about these two 585 00:31:12,621 --> 00:31:13,120 statements? 586 00:31:13,120 --> 00:31:16,550 Probability that A fixes, probability that B fixes? 587 00:31:19,690 --> 00:31:23,830 Incidentally, you should be able to draw-- these are random. 588 00:31:23,830 --> 00:31:29,430 From this point moving forward, am I more likely-- OK, so 589 00:31:29,430 --> 00:31:32,950 given where i is here, am I more likely to fix B or A? 590 00:31:32,950 --> 00:31:34,490 Ready, three, two, one. 591 00:31:34,490 --> 00:31:34,990 AUDIENCE: B. 592 00:31:34,990 --> 00:31:37,560 PROFESSOR: B. Does that mean that my first step is 593 00:31:37,560 --> 00:31:39,920 more likely to be in the direction of B than in A? 594 00:31:39,920 --> 00:31:40,420 Yes or no. 595 00:31:40,420 --> 00:31:41,719 Ready, three, two, one. 596 00:31:41,719 --> 00:31:42,260 AUDIENCE: No. 597 00:31:42,260 --> 00:31:43,070 PROFESSOR: No. 598 00:31:43,070 --> 00:31:45,160 OK, so this is a random walk. 599 00:31:45,160 --> 00:31:47,060 All right, it doesn't always take steps, 600 00:31:47,060 --> 00:31:49,660 but sometimes it goes up and then down.
601 00:31:49,660 --> 00:31:53,184 All right, so it's going to-- now I'm-- you know, 602 00:31:53,184 --> 00:31:55,300 I understand it's not to-- you know, whatever. 603 00:31:55,300 --> 00:31:55,800 OK. 604 00:32:01,350 --> 00:32:04,510 But the idea is that once it hits 605 00:32:04,510 --> 00:32:06,264 0 or 1 here in terms of the fraction, 606 00:32:06,264 --> 00:32:07,430 then you stay where you are. 607 00:32:07,430 --> 00:32:08,721 These are absorbing boundaries. 608 00:32:14,560 --> 00:32:17,756 But every now and then it's going to hit there. 609 00:32:23,250 --> 00:32:25,610 Before we get going too much more in this, 610 00:32:25,610 --> 00:32:28,560 I want to mention something about time in this model. 611 00:32:28,560 --> 00:32:32,520 Because time is a little bit of a funny entity here. 612 00:32:35,082 --> 00:32:36,040 So here's the question. 613 00:32:36,040 --> 00:32:41,020 How long-- and long is funny-- but how long 614 00:32:41,020 --> 00:32:43,740 does one-- I don't know. 615 00:32:43,740 --> 00:32:45,882 Do we want to call this an iteration or a cycle? 616 00:32:45,882 --> 00:32:47,160 One iteration of the model? 617 00:32:58,130 --> 00:33:01,060 And what I mean by that is that in units of something that 618 00:33:01,060 --> 00:33:06,920 would be like real time, you know what I mean? 619 00:33:12,670 --> 00:33:21,289 Is it a second, a generation time-- so this 620 00:33:21,289 --> 00:33:22,830 would be like a cell generation time. 621 00:33:42,360 --> 00:33:44,360 Or don't know, something. 622 00:33:44,360 --> 00:33:46,900 OK, I'll give you 15 seconds. 623 00:33:46,900 --> 00:33:48,950 Can you guys all read this? 624 00:33:48,950 --> 00:33:52,690 Seconds, generation time, N times generation time, or 1 625 00:33:52,690 --> 00:33:55,716 over N times generation time. 626 00:33:55,716 --> 00:33:58,244 AUDIENCE: What do we want [INAUDIBLE]?
627 00:33:58,244 --> 00:33:59,910 PROFESSOR: Yeah, well let's say that I-- 628 00:33:59,910 --> 00:34:02,190 let's just imagine that I was using this to model 629 00:34:02,190 --> 00:34:06,410 the dynamics of the neutral drift 630 00:34:06,410 --> 00:34:12,020 dynamics of some bacteria in my test tube in the lab. 631 00:34:12,020 --> 00:34:14,300 Right, so let's say I have one of these turbidostats. 632 00:34:14,300 --> 00:34:16,429 So I-- question is, how long does this 633 00:34:16,429 --> 00:34:20,288 last in the units of-- right. 634 00:34:20,288 --> 00:34:21,704 Or equivalently, how many iterations 635 00:34:21,704 --> 00:34:27,420 do I have to go to get through some period of time in the lab. 636 00:34:27,420 --> 00:34:30,989 So I grow my bacteria in my turbidostat, say. 637 00:34:30,989 --> 00:34:34,469 And I do it for 100 hours. 638 00:34:34,469 --> 00:34:37,940 Now your advisor says, OK, go do a simulation, 639 00:34:37,940 --> 00:34:39,171 so you get something. 640 00:34:39,171 --> 00:34:41,420 And your advisor goes, all right, do a simulation, use 641 00:34:41,420 --> 00:34:43,590 the Moran process. 642 00:34:43,590 --> 00:34:45,090 You guys are going to be doing this, 643 00:34:45,090 --> 00:34:47,600 so this is not entirely hypothetical. 644 00:34:47,600 --> 00:34:53,731 But right, so your advisor says, go simulate this process. 645 00:34:53,731 --> 00:34:54,230 Right? 646 00:34:54,230 --> 00:34:56,300 So the question is, how many iterations 647 00:34:56,300 --> 00:34:59,320 do you have to do to make it equivalent to that 100 hours 648 00:34:59,320 --> 00:35:00,680 that you did in the lab? 649 00:35:00,680 --> 00:35:03,500 How do you-- how do you make a connection 650 00:35:03,500 --> 00:35:07,170 between a model, well, this model, and something 651 00:35:07,170 --> 00:35:08,920 that actually happens in your laboratory? 652 00:35:15,470 --> 00:35:17,110 Ready? 653 00:35:17,110 --> 00:35:21,505 Three, two, one.
654 00:35:21,505 --> 00:35:23,880 OK, all right, so we got a majority of the group agreeing 655 00:35:23,880 --> 00:35:28,466 that it's going to be D. Can somebody just say why this is? 656 00:35:28,466 --> 00:35:30,936 AUDIENCE: [INAUDIBLE] one cell [INAUDIBLE]. 657 00:35:34,030 --> 00:35:35,980 PROFESSOR: Right, so each iteration, there's 658 00:35:35,980 --> 00:35:40,831 only one cell out of N that actually divide, right? 659 00:35:40,831 --> 00:35:43,330 And that means that if you want, like for example, everybody 660 00:35:43,330 --> 00:35:45,070 to have had a chance, roughly, to divide, 661 00:35:45,070 --> 00:35:47,580 you need to go N iterations. 662 00:35:47,580 --> 00:35:50,770 And it also makes sense, if you ask-- 663 00:35:50,770 --> 00:35:57,200 let's imagine you have a test tube with a million bacteria. 664 00:35:57,200 --> 00:36:00,710 Now it's going to take some time before one of them divides. 665 00:36:00,710 --> 00:36:04,180 Now the question is, if you had 10 million bacteria 666 00:36:04,180 --> 00:36:06,070 in your test tube, you have to wait 1/10 667 00:36:06,070 --> 00:36:10,340 as long before the first one divides. 668 00:36:10,340 --> 00:36:12,550 So the amount of real time that elapses 669 00:36:12,550 --> 00:36:15,280 in each one of these iterations goes as 1 over N, 670 00:36:15,280 --> 00:36:19,300 where N is the population size. 671 00:36:19,300 --> 00:36:22,240 So I got some unhappy looks, so that means that I 672 00:36:22,240 --> 00:36:24,890 expect an unhappy question. 673 00:36:24,890 --> 00:36:25,390 Maybe. 674 00:36:30,374 --> 00:36:32,040 OK, well, if you don't ask the question, 675 00:36:32,040 --> 00:36:33,510 then in the teaching evaluations you're 676 00:36:33,510 --> 00:36:35,051 not allowed to write that you did not 677 00:36:35,051 --> 00:36:39,716 like the explanation of time in the Moran process. 
678 00:36:39,716 --> 00:36:40,632 AUDIENCE: [INAUDIBLE] 679 00:36:40,632 --> 00:36:42,602 [LAUGHTER] 680 00:36:42,602 --> 00:36:43,810 PROFESSOR: Well, that worked. 681 00:36:43,810 --> 00:36:46,103 A little bit too well. 682 00:36:46,103 --> 00:36:48,311 AUDIENCE: [INAUDIBLE] question, just a clarification. 683 00:36:48,311 --> 00:36:49,297 PROFESSOR: Yeah. 684 00:36:49,297 --> 00:36:51,515 AUDIENCE: So an iteration is when 685 00:36:51,515 --> 00:36:54,432 one cell or thing increases? 686 00:36:54,432 --> 00:36:55,140 PROFESSOR: Right. 687 00:36:55,140 --> 00:36:57,880 OK, an iteration in this model is both of these things. 688 00:36:57,880 --> 00:37:01,470 So it's a birth, and a death, or a replacement. 689 00:37:01,470 --> 00:37:03,950 So it's one duration here, another-- so each iteration 690 00:37:03,950 --> 00:37:07,566 involves one birth and one replacement. 691 00:37:07,566 --> 00:37:08,066 Yeah. 692 00:37:08,066 --> 00:37:10,010 AUDIENCE: What is one generation time? 693 00:37:10,010 --> 00:37:11,874 Is that when the population [INAUDIBLE]? 694 00:37:11,874 --> 00:37:13,540 PROFESSOR: Right, so the generation time 695 00:37:13,540 --> 00:37:16,580 is the typical time that it takes 696 00:37:16,580 --> 00:37:18,330 for one of these individuals to give birth 697 00:37:18,330 --> 00:37:19,470 to another individual. 698 00:37:19,470 --> 00:37:22,080 So in the case of the cells, it might be half an hour. 699 00:37:27,830 --> 00:37:30,375 I'm going to try to leave this up just so that you guys can 700 00:37:30,375 --> 00:37:31,340 continue to look a bit. 701 00:37:44,942 --> 00:37:46,900 So I want to say just something about this idea 702 00:37:46,900 --> 00:37:50,360 of a molecular clock while we're here. 
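The time bookkeeping discussed above (one iteration is one birth plus one replacement, and it corresponds to roughly 1 over N of a generation time) can be turned into a small conversion. This is a hedged sketch rather than anything shown in the lecture: the function names are made up for illustration, and the numbers in the usage example (a half-hour generation time, a million cells, 100 hours) are simply the illustrative values mentioned in the discussion, combined here as an assumption.

```python
def hours_per_iteration(generation_hours, population_size):
    """Real time represented by one Moran iteration (one birth, one replacement)."""
    return generation_hours / population_size


def iterations_for_lab_time(lab_hours, generation_hours, population_size):
    """Number of Moran iterations matching a given stretch of lab time.

    Each generation corresponds to about population_size iterations,
    because only one individual reproduces per iteration.
    """
    generations = lab_hours / generation_hours
    return round(generations * population_size)


if __name__ == "__main__":
    # Assumed example values: 100 hours in the turbidostat,
    # half-hour generation time, one million cells.
    print(iterations_for_lab_time(100, 0.5, 10**6))
    print(hours_per_iteration(0.5, 10**6))
```

Under these assumed values, 100 hours is 200 generations, so the simulation would need 200 times N iterations.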
703 00:38:06,869 --> 00:38:08,660 All right, so now that we've said something 704 00:38:08,660 --> 00:38:11,880 about how much time is actually elapsing here, 705 00:38:11,880 --> 00:38:14,450 we can think a little bit about the rate-- 706 00:38:14,450 --> 00:38:18,110 now we want to allow mutations. 707 00:38:18,110 --> 00:38:25,970 So let's assume that there is a mutation rate or probability. 708 00:38:25,970 --> 00:38:33,642 Mu is the probability of a mutation. 709 00:38:33,642 --> 00:38:35,725 And we're going to say this is a neutral mutation. 710 00:38:39,630 --> 00:38:43,680 And this is per division, or per birth. 711 00:38:48,690 --> 00:38:50,940 So the idea is that when-- all right, 712 00:38:50,940 --> 00:38:53,040 we might start out with just all A individuals 713 00:38:53,040 --> 00:38:55,314 in the population. 714 00:38:55,314 --> 00:38:58,560 But then an A individual will give-- OK, 715 00:38:58,560 --> 00:39:00,602 here is the mother cell, the original A. 716 00:39:00,602 --> 00:39:02,060 And the mother cell, for now, we'll 717 00:39:02,060 --> 00:39:03,970 assume just doesn't ever mutate. 718 00:39:03,970 --> 00:39:06,710 But the daughter cell has a probability, 719 00:39:06,710 --> 00:39:15,450 mu, of being a new type, say B. And we often 720 00:39:15,450 --> 00:39:20,950 call this a mutation rate, but it's a probability per birth. 721 00:39:25,584 --> 00:39:27,000 So what we want to know is-- 722 00:39:27,000 --> 00:39:29,580 what we want to calculate is what's 723 00:39:29,580 --> 00:39:35,700 the rate at which new neutral mutants both appear 724 00:39:35,700 --> 00:39:38,500 and then fix in the population. 725 00:39:38,500 --> 00:39:44,190 So what we're asking about is, from the standpoint of us 726 00:39:44,190 --> 00:39:47,910 as scientists, we do sequencing of different lineages, 727 00:39:47,910 --> 00:39:49,390 say humans and chimpanzees.
728 00:39:49,390 --> 00:39:51,635 And we're looking at the accumulation of these, what 729 00:39:51,635 --> 00:39:53,440 we think are neutral mutations. 730 00:39:53,440 --> 00:39:56,460 Question is, how many neutral mutations do we expect to see? 731 00:39:59,940 --> 00:40:02,420 So we need to know the rate that these things happen. 732 00:40:07,470 --> 00:40:12,530 So this is the rate of fixation of neutral mutations, 733 00:40:12,530 --> 00:40:17,200 and so this is somehow the rate of neutral evolution. 734 00:40:17,200 --> 00:40:19,867 There are two steps in here. 735 00:40:19,867 --> 00:40:21,700 What are the two things that have to happen? 736 00:40:25,524 --> 00:40:26,634 AUDIENCE: [INAUDIBLE] 737 00:40:26,634 --> 00:40:28,050 PROFESSOR: Right, needs to appear. 738 00:40:28,050 --> 00:40:30,607 So we'll call this the rate of appearance. 739 00:40:30,607 --> 00:40:32,190 And then what else do we need to know? 740 00:40:42,669 --> 00:40:43,929 AUDIENCE: [INAUDIBLE] 741 00:40:43,929 --> 00:40:45,345 PROFESSOR: I'm sorry, what's that? 742 00:40:45,345 --> 00:40:46,429 AUDIENCE: Population size. 743 00:40:46,429 --> 00:40:48,136 PROFESSOR: Right, so the population size. 744 00:40:48,136 --> 00:40:49,430 And why are you saying that? 745 00:40:49,430 --> 00:40:52,780 Or what's-- I mean, the population size is certainly 746 00:40:52,780 --> 00:40:56,180 going to be relevant, but I guess the question is, 747 00:40:56,180 --> 00:40:58,630 will the rate of neutral evolution, 748 00:40:58,630 --> 00:41:02,849 the rate in which you see neutral mutants in a lineage, 749 00:41:02,849 --> 00:41:05,390 will that be just equal to this, or do we need to multiply it 750 00:41:05,390 --> 00:41:05,800 by something else? 751 00:41:05,800 --> 00:41:06,410 Yeah. 752 00:41:06,410 --> 00:41:07,330 AUDIENCE: [INAUDIBLE] 753 00:41:07,330 --> 00:41:07,600 PROFESSOR: Right. 754 00:41:07,600 --> 00:41:09,150 And it's a rate of fixation. 
755 00:41:09,150 --> 00:41:11,840 We'll say it's really kind of a probability of fixation. 756 00:41:11,840 --> 00:41:13,830 Because there's some rate per unit time. 757 00:41:13,830 --> 00:41:15,663 Maybe even like real time in terms of number 758 00:41:15,663 --> 00:41:16,890 of generations in real time. 759 00:41:16,890 --> 00:41:18,973 But we need to know the probability that it fixes. 760 00:41:29,820 --> 00:41:31,142 Great. 761 00:41:31,142 --> 00:41:32,850 So let's-- all right, we're going to do-- 762 00:41:32,850 --> 00:41:35,550 we're going to do a very detailed calculation here. 763 00:41:35,550 --> 00:41:36,365 Yes. 764 00:41:36,365 --> 00:41:39,490 AUDIENCE: So I'm just curious, [INAUDIBLE] for this 765 00:41:39,490 --> 00:41:43,330 probability [INAUDIBLE] time is completely left out 766 00:41:43,330 --> 00:41:44,770 of the picture, so in principle-- 767 00:41:44,770 --> 00:41:46,322 PROFESSOR: That's right. 768 00:41:46,322 --> 00:41:48,178 AUDIENCE: --it could take a very long time. 769 00:41:48,178 --> 00:41:49,178 PROFESSOR: That's right. 770 00:41:49,178 --> 00:41:52,510 AUDIENCE: But we don't take this into account because-- 771 00:41:52,510 --> 00:41:53,910 PROFESSOR: That's right. 772 00:41:53,910 --> 00:42:00,380 So for now, let's just assume that the rate of appearance 773 00:42:00,380 --> 00:42:03,770 of these is small, so that you don't 774 00:42:03,770 --> 00:42:07,339 have to worry about different mutants competing 775 00:42:07,339 --> 00:42:08,130 against each other. 776 00:42:08,130 --> 00:42:09,562 We're going to spend a lot of time 777 00:42:09,562 --> 00:42:11,270 on Thursday talking about this phenomenon 778 00:42:11,270 --> 00:42:14,140 of clonal interference, when multiple mutant lineages are 779 00:42:14,140 --> 00:42:16,722 coexisting and perhaps competing in a population. 
780 00:42:16,722 --> 00:42:19,180 But for simplicity for now, what we're just going to assume 781 00:42:19,180 --> 00:42:22,451 is that there's a separation of time scales. 782 00:42:22,451 --> 00:42:22,950 Right? 783 00:42:22,950 --> 00:42:25,390 Which means that the rate at which these neutral mutants 784 00:42:25,390 --> 00:42:28,860 appear in the population is very small compared 785 00:42:28,860 --> 00:42:35,950 to 1 over the time that it takes for the fixation 786 00:42:35,950 --> 00:42:39,230 to occur. 787 00:42:39,230 --> 00:42:41,650 So what we want is what's the rate of appearance 788 00:42:41,650 --> 00:42:46,727 here in units of real time. 789 00:42:46,727 --> 00:42:48,810 What are the things that are going to appear here? 790 00:42:52,865 --> 00:42:53,990 AUDIENCE: Rate of mutation. 791 00:42:53,990 --> 00:42:55,650 PROFESSOR: All right, rate of mutation, 792 00:42:55,650 --> 00:43:01,840 mu, times population size. 793 00:43:04,470 --> 00:43:06,720 And that's just because a larger population 794 00:43:06,720 --> 00:43:10,570 will experience a larger rate of these mutants appearing 795 00:43:10,570 --> 00:43:11,320 in the population. 796 00:43:11,320 --> 00:43:13,969 And it's linear. 797 00:43:13,969 --> 00:43:15,760 And that's actually just-- that is, indeed, 798 00:43:15,760 --> 00:43:18,220 the rate of appearance. 799 00:43:18,220 --> 00:43:20,100 And the probability of fixation of each one? 800 00:43:23,049 --> 00:43:24,424 AUDIENCE: [? Excuse me, ?] but it 801 00:43:24,424 --> 00:43:28,590 doesn't have the unit of rate. 802 00:43:28,590 --> 00:43:30,320 PROFESSOR: Oh, yes. 803 00:43:30,320 --> 00:43:33,000 OK, so this is-- OK so we have to actually-- 804 00:43:33,000 --> 00:43:35,210 AUDIENCE: [INAUDIBLE] times is that per iteration? 805 00:43:35,210 --> 00:43:35,835 PROFESSOR: Yes. 806 00:43:35,835 --> 00:43:41,900 So this is per, this is per-- so this is a rate per generation.
807 00:43:41,900 --> 00:43:48,450 So I guess I'd define mu as the probability-- 808 00:43:48,450 --> 00:43:52,000 so this is all in units of per generation, basically. 809 00:43:52,000 --> 00:43:53,590 Because mu is a per. 810 00:43:58,063 --> 00:44:00,550 AUDIENCE: Generation or iteration? 811 00:44:00,550 --> 00:44:03,660 PROFESSOR: OK, let's make sure that I-- 812 00:44:03,660 --> 00:44:08,700 mu-- this is a generation. 813 00:44:08,700 --> 00:44:15,170 Because if we have-- right. 814 00:44:15,170 --> 00:44:17,870 So let's say that, for example, there's 10 to the 6 individuals 815 00:44:17,870 --> 00:44:21,418 here, and the mutation rate is-- yeah. 816 00:44:24,100 --> 00:44:26,273 All right, probability of fixation was what? 817 00:44:26,273 --> 00:44:27,220 AUDIENCE: 1 over N. 818 00:44:27,220 --> 00:44:34,350 PROFESSOR: 1 over N. So this is great. 819 00:44:34,350 --> 00:44:38,920 Because this is saying that the rate at which you expect 820 00:44:38,920 --> 00:44:42,080 neutral mutants to actually appear in the population, 821 00:44:42,080 --> 00:44:44,140 in terms of like, in terms of fixing, 822 00:44:44,140 --> 00:44:47,070 if you were to sequence along a lineage, 823 00:44:47,070 --> 00:44:50,280 that it is independent of the population size. 824 00:44:50,280 --> 00:44:55,890 And it's given by the rate of mutation. 825 00:44:59,110 --> 00:45:00,990 But what you expect is it's on-- it should 826 00:45:00,990 --> 00:45:02,555 be on a per generation basis. 827 00:45:07,840 --> 00:45:13,280 So this thing is perhaps useful in several different ways. 828 00:45:13,280 --> 00:45:17,390 And there are some subtleties, like always, to this. 829 00:45:17,390 --> 00:45:21,226 If you go out and you measure the rates of fixation 830 00:45:21,226 --> 00:45:22,600 of neutral mutants, what you find 831 00:45:22,600 --> 00:45:25,850 is that it's not really constant on a per generation basis. 
832 00:45:25,850 --> 00:45:30,400 But more on a-- maybe even closer on a per actual year 833 00:45:30,400 --> 00:45:32,200 basis, say. 834 00:45:32,200 --> 00:45:34,280 In particular, this would predict 835 00:45:34,280 --> 00:45:39,300 that if organisms have the same mutation rate, 836 00:45:39,300 --> 00:45:42,580 I'd say roughly maybe humans and mice. 837 00:45:42,580 --> 00:45:44,910 But yet humans and mice have very different generation 838 00:45:44,910 --> 00:45:45,410 times. 839 00:45:48,640 --> 00:45:51,280 By [INAUDIBLE]. 840 00:45:51,280 --> 00:45:53,580 Then you would expect the rate of accumulation 841 00:45:53,580 --> 00:45:56,540 of neutral mutants in the human population on a per year basis 842 00:45:56,540 --> 00:45:59,090 to be much lower than mice. 843 00:45:59,090 --> 00:46:01,290 But that's not true. 844 00:46:01,290 --> 00:46:03,780 We'll get into a bit later why that might be. 845 00:46:03,780 --> 00:46:07,160 But I just want to highlight that that's-- that this model 846 00:46:07,160 --> 00:46:08,930 is very simple. 847 00:46:08,930 --> 00:46:12,330 And it predicts something that is too simple, maybe. 848 00:46:12,330 --> 00:46:14,814 But at least it's saying that there's 849 00:46:14,814 --> 00:46:16,730 some sense in which the population size is not 850 00:46:16,730 --> 00:46:19,630 as relevant as you might have thought it was going to be. 851 00:46:19,630 --> 00:46:21,624 And at least within a particular lineage, 852 00:46:21,624 --> 00:46:23,290 if you're talking about the accumulation 853 00:46:23,290 --> 00:46:24,920 of neutral mutations along humans, 854 00:46:24,920 --> 00:46:29,280 for example, then you can say, maybe that's roughly constant. 855 00:46:29,280 --> 00:46:31,372 It gets very-- it gets very tricky. 
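The clock argument above can be written as a single line. This is only a restatement of what was said, under the same assumption of a separation of time scales; the symbol k for the rate of neutral fixation is introduced here for illustration and does not appear in the lecture.

```latex
% Rate of neutral fixation per generation (k is an assumed symbol):
% (rate of appearance of new neutral mutants) x (probability each one fixes)
\[
  k \;=\; \underbrace{N\mu}_{\text{appearance per generation}}
     \;\times\; \underbrace{\frac{1}{N}}_{\text{probability of fixation}}
     \;=\; \mu
\]
```

The factors of N cancel, which is why the predicted rate does not depend on the population size.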
856 00:46:31,372 --> 00:46:33,330 I mean, if you look at the rate of accumulation 857 00:46:33,330 --> 00:46:35,680 of neutral mutants in one protein in humans, 858 00:46:35,680 --> 00:46:38,650 it's at a different rate than another protein in humans. 859 00:46:38,650 --> 00:46:39,860 So everything is complicated. 860 00:46:39,860 --> 00:46:42,230 But at least each-- along each of these proteins, 861 00:46:42,230 --> 00:46:44,880 maybe it still is roughly some sort of clock, 862 00:46:44,880 --> 00:46:49,620 because it accumulates mutations at some rate that's 863 00:46:49,620 --> 00:46:51,445 roughly linear with time. 864 00:46:51,445 --> 00:46:53,820 Of course, it's hard to imagine how any process like this 865 00:46:53,820 --> 00:46:56,920 can not go with time like that. 866 00:46:56,920 --> 00:46:59,470 But at least this is potentially a useful thing. 867 00:46:59,470 --> 00:47:02,220 And indeed, when you read about studies 868 00:47:02,220 --> 00:47:04,730 from sequences trying to estimate 869 00:47:04,730 --> 00:47:07,860 the time since the last common ancestor, 870 00:47:07,860 --> 00:47:10,200 this is the category of technique 871 00:47:10,200 --> 00:47:14,720 that is the basis for that, is that you're just counting up 872 00:47:14,720 --> 00:47:16,920 how many neutral mutations appeared 873 00:47:16,920 --> 00:47:18,211 along these different lineages. 874 00:47:21,340 --> 00:47:24,660 And I think that there are a number of really fascinating 875 00:47:24,660 --> 00:47:26,520 things that you can try to address 876 00:47:26,520 --> 00:47:30,280 with this kind of molecular clock. 877 00:47:30,280 --> 00:47:32,440 And I'll maybe bring up one of them. 878 00:47:32,440 --> 00:47:36,570 Incidentally, I'm not a huge fan of memorizing things.
879 00:47:36,570 --> 00:47:41,130 But for both size scales, and time scales, 880 00:47:41,130 --> 00:47:44,410 and so forth, I really very much do like the idea of everybody 881 00:47:44,410 --> 00:47:47,920 having memorized a few sign posts. 882 00:47:47,920 --> 00:47:50,056 Because that way, when you hear something new, 883 00:47:50,056 --> 00:47:51,430 you have some way of interpreting 884 00:47:51,430 --> 00:47:54,440 whether it's big, or small, or something else. 885 00:47:54,440 --> 00:47:59,770 So for example, the time since the last common ancestor 886 00:47:59,770 --> 00:48:02,460 between humans and chimpanzees. 887 00:48:02,460 --> 00:48:05,790 Does anybody have any sense of-- well, 888 00:48:05,790 --> 00:48:08,150 I'll actually have us vote, because I think it's useful. 889 00:48:08,150 --> 00:48:11,790 In case you're off by many orders of magnitude, 890 00:48:11,790 --> 00:48:14,930 that you make-- OK. 891 00:48:14,930 --> 00:48:20,672 So the last common ancestor, human, chimpanzee. 892 00:48:20,672 --> 00:48:22,130 And incidentally, this is something 893 00:48:22,130 --> 00:48:24,140 that people do argue a lot about. 894 00:48:24,140 --> 00:48:27,860 But it's within a factor of two of something. 895 00:48:27,860 --> 00:48:32,350 So I'm going to go ahead and make some-- so hold on. 896 00:48:32,350 --> 00:48:52,970 I just want to make sure I get my-- well, 7 times 10 to the 6. 897 00:48:59,615 --> 00:49:00,115 All right. 898 00:49:06,580 --> 00:49:08,570 All right, I'll give you 10 seconds 899 00:49:08,570 --> 00:49:10,420 to orient yourself relative to other things 900 00:49:10,420 --> 00:49:11,920 that you might know about the world. 901 00:49:16,550 --> 00:49:22,900 OK, ready, three, two, one. 902 00:49:22,900 --> 00:49:29,670 All right, we got-- that's interesting. 903 00:49:29,670 --> 00:49:30,220 OK, yes. 904 00:49:30,220 --> 00:49:34,870 I would say it's kind of uniformly distributed-- oh. 905 00:49:34,870 --> 00:49:36,080 It's pretty unified. 
906 00:49:36,080 --> 00:49:38,770 There are a minority of-- not very many E's. 907 00:49:38,770 --> 00:49:44,090 But I would say that the other things are pretty-- it's maybe 908 00:49:44,090 --> 00:49:45,100 peaked around here. 909 00:49:48,600 --> 00:49:51,150 I don't want to get into any biblical debates here. 910 00:49:56,770 --> 00:49:58,200 But right, OK. 911 00:49:58,200 --> 00:50:00,480 So what are some things that if we 912 00:50:00,480 --> 00:50:02,590 have a timeline of the world. 913 00:50:02,590 --> 00:50:08,400 OK, this is going to be a flash course in-- all right. 914 00:50:08,400 --> 00:50:13,870 OK, here's, here I am, and I'm unhappy because we 915 00:50:13,870 --> 00:50:16,910 don't know when humans and chimps-- all right. 916 00:50:16,910 --> 00:50:18,370 So this is us. 917 00:50:18,370 --> 00:50:20,122 All right. 918 00:50:20,122 --> 00:50:21,580 AUDIENCE: [INAUDIBLE] 919 00:50:21,580 --> 00:50:22,290 PROFESSOR: Right. 920 00:50:22,290 --> 00:50:22,790 OK. 921 00:50:22,790 --> 00:50:24,463 So let's say-- we could start with-- you 922 00:50:24,463 --> 00:50:25,504 want to start with earth? 923 00:50:25,504 --> 00:50:26,130 OK. 924 00:50:26,130 --> 00:50:28,970 Four and a half billion. 925 00:50:28,970 --> 00:50:32,010 This might be on a logarithmic scale, somehow. 926 00:50:32,010 --> 00:50:34,936 So we're going to-- just to space things out a little bit. 927 00:50:34,936 --> 00:50:35,810 AUDIENCE: [INAUDIBLE] 928 00:50:35,810 --> 00:50:37,851 PROFESSOR: Right, you know, the universe is what, 929 00:50:37,851 --> 00:50:38,940 13 ish billion years? 930 00:50:38,940 --> 00:50:40,590 I don't know. 931 00:50:40,590 --> 00:50:43,790 People are calculating with these-- 13 billion, 932 00:50:43,790 --> 00:50:47,590 four and a half billion, you know earth congeals, 933 00:50:47,590 --> 00:50:50,360 it's hot, whatever. 934 00:50:50,360 --> 00:50:53,765 All right, so life gets started, maybe, a billion years later. 
935 00:50:56,467 --> 00:50:57,660 AUDIENCE: 3.9 [INAUDIBLE]. 936 00:50:57,660 --> 00:51:00,130 PROFESSOR: 3.9 sounds like a fine number. 937 00:51:00,130 --> 00:51:02,350 Wait, what did you vote for human and chimpanzee? 938 00:51:02,350 --> 00:51:04,870 You're very specific on this one. 939 00:51:04,870 --> 00:51:07,509 AUDIENCE: I actually voted for 3.5 times [INAUDIBLE]. 940 00:51:07,509 --> 00:51:08,550 PROFESSOR: OK, all right. 941 00:51:08,550 --> 00:51:11,360 So you're-- you want to get involved in this actual debate. 942 00:51:11,360 --> 00:51:13,300 OK, that's why. 943 00:51:13,300 --> 00:51:16,260 I just want to-- yeah, now I'm going 944 00:51:16,260 --> 00:51:18,950 to be stuck doing the linear and logarithmic scaling of how 945 00:51:18,950 --> 00:51:23,730 I want to-- this is going to be some sort 946 00:51:23,730 --> 00:51:27,350 of funny logarithmic scaling from here to here. 947 00:51:27,350 --> 00:51:35,830 So dinosaurs-- all right, 60 some million years ago. 948 00:51:35,830 --> 00:51:40,845 Right, so, say bye to the dinosaurs. 949 00:51:40,845 --> 00:51:41,345 Dinosaurs. 950 00:51:45,320 --> 00:51:47,296 [INAUDIBLE] 951 00:51:47,296 --> 00:51:49,766 AUDIENCE: [INAUDIBLE] 952 00:51:49,766 --> 00:51:53,030 PROFESSOR: [INAUDIBLE] explosion was before that. 953 00:51:53,030 --> 00:51:59,010 Yeah, I don't-- OK, it's-- OK, all right. 954 00:51:59,010 --> 00:52:02,250 All right, so this is around human chimp. 955 00:52:02,250 --> 00:52:07,160 And indeed, people argue about whether it's five or 10. 956 00:52:07,160 --> 00:52:09,830 But you know, given that we were uniformly 957 00:52:09,830 --> 00:52:12,980 distributed across this number, we 958 00:52:12,980 --> 00:52:16,280 shouldn't be nit picky about the left one. 959 00:52:19,020 --> 00:52:20,550 Right, OK. 960 00:52:20,550 --> 00:52:25,730 And agriculture was maybe 12,000 years ago. 961 00:52:25,730 --> 00:52:27,754 Some sense of things.
962 00:52:27,754 --> 00:52:30,239 AUDIENCE: B is good for human and Neanderthal, 963 00:52:30,239 --> 00:52:31,730 and homo erectus. 964 00:52:31,730 --> 00:52:33,830 PROFESSOR: Yeah, human and Neanderthals, right. 965 00:52:33,830 --> 00:52:38,590 That's-- 70, OK, let's say yeah. 966 00:52:38,590 --> 00:52:39,542 This is when-- right. 967 00:52:39,542 --> 00:52:41,750 AUDIENCE: Common ancestor was definitely [INAUDIBLE]. 968 00:52:41,750 --> 00:52:43,500 PROFESSOR: Oh, common ancestor was before. 969 00:52:43,500 --> 00:52:45,810 But in terms of interbreeding, was sort-- all right. 970 00:52:45,810 --> 00:52:48,650 So this was the interbreeding, if you want to read that paper. 971 00:52:53,450 --> 00:52:55,000 But human chimp is here. 972 00:52:57,592 --> 00:52:58,930 Around seven million years. 973 00:52:58,930 --> 00:53:01,417 And it's not that this is the number that's magical, 974 00:53:01,417 --> 00:53:02,500 that you have to memorize. 975 00:53:02,500 --> 00:53:06,870 But I think that you should have some event 976 00:53:06,870 --> 00:53:10,820 in the history of the world at each logarithmic spacing. 977 00:53:10,820 --> 00:53:14,570 Just so that you-- when you hear about when something happened 978 00:53:14,570 --> 00:53:16,590 you know kind of vaguely where to put-- 979 00:53:16,590 --> 00:53:19,364 where to put something. 980 00:53:19,364 --> 00:53:21,030 Otherwise it just doesn't mean anything. 981 00:53:24,210 --> 00:53:27,270 One of my favorite examples of how the molecular clock was 982 00:53:27,270 --> 00:53:30,090 used to come up with something that I think 983 00:53:30,090 --> 00:53:32,136 is pretty neat and nontrivial, is 984 00:53:32,136 --> 00:53:34,010 to try to answer this question of when humans 985 00:53:34,010 --> 00:53:36,070 started wearing clothing. 986 00:53:36,070 --> 00:53:40,050 So this is, a priori, not very obvious. 987 00:53:40,050 --> 00:53:40,550 Right? 
988 00:53:40,550 --> 00:53:44,800 Because we know we have evidence for clothing 989 00:53:44,800 --> 00:53:47,070 maybe 30,000 years ago. 990 00:53:47,070 --> 00:53:51,250 And there are needles that were used for clothing. 991 00:53:51,250 --> 00:53:56,120 There were-- and some of these little figurines, at least 992 00:53:56,120 --> 00:54:00,315 some fraction of the figurines, like fertility goddess 993 00:54:00,315 --> 00:54:02,940 kind of thing, some fraction of them have some clothing, right? 994 00:54:02,940 --> 00:54:05,825 So then it suggests that there were clothes. 995 00:54:05,825 --> 00:54:08,450 But the question is, before that it's actually rather difficult 996 00:54:08,450 --> 00:54:11,950 to know when we started wearing clothes, right? 997 00:54:11,950 --> 00:54:17,700 Apparently, we lost our body hair something 998 00:54:17,700 --> 00:54:19,030 like a million years ago. 999 00:54:19,030 --> 00:54:20,780 So you might say, oh, maybe that's around 1000 00:54:20,780 --> 00:54:22,840 when we started wearing clothes. 1001 00:54:22,840 --> 00:54:26,510 Of course a lot of animal hide and so forth wouldn't last. 1002 00:54:26,510 --> 00:54:28,440 So there's not any archaeological evidence 1003 00:54:28,440 --> 00:54:31,010 of this. 1004 00:54:31,010 --> 00:54:35,927 And so is anybody aware of how researchers 1005 00:54:35,927 --> 00:54:37,760 have used the molecular clock ideas in order 1006 00:54:37,760 --> 00:54:39,335 to try to answer this question? 1007 00:54:57,452 --> 00:54:58,436 AUDIENCE: [INAUDIBLE] 1008 00:55:03,848 --> 00:55:07,880 PROFESSOR: Yeah, this is amazing. 1009 00:55:07,880 --> 00:55:10,455 So you use lice. 
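1010 00:55:10,455 --> 00:55:13,050 There have been a number of studies doing this, 1011 00:55:13,050 --> 00:55:17,170 and apparently, there was a researcher in Germany, 1012 00:55:17,170 --> 00:55:20,980 who was at the Max Planck Institute for genomics 1013 00:55:20,980 --> 00:55:26,860 or something, and his son came home with a note saying-- 1014 00:55:26,860 --> 00:55:28,790 and actually this happened to me recently, 1015 00:55:28,790 --> 00:55:31,757 we got an email that there's a lice outbreak 1016 00:55:31,757 --> 00:55:33,340 to stay out of preschool, so watch out 1017 00:55:33,340 --> 00:55:36,560 when you're going by the play area-- 1018 00:55:36,560 --> 00:55:40,470 so he got this note back from his son's preschool that said, 1019 00:55:40,470 --> 00:55:42,030 oh yeah, there's a lice outbreak, 1020 00:55:42,030 --> 00:55:43,780 so this is what you have to watch out for. 1021 00:55:43,780 --> 00:55:47,120 But it said, oh, there's a different species of lice 1022 00:55:47,120 --> 00:55:52,260 that inhabits our clothing than our hair. 1023 00:55:52,260 --> 00:55:54,980 All right, so I'd say this is one of those things 1024 00:55:54,980 --> 00:55:57,420 that you could just read that and say, oh, well whatever. 1025 00:55:57,420 --> 00:55:59,076 Or if you're a geneticist you read that 1026 00:55:59,076 --> 00:56:03,462 and say, oh, I can use this to figure out when humans started 1027 00:56:03,462 --> 00:56:04,420 wearing clothes, right? 1028 00:56:04,420 --> 00:56:08,170 Because presumably the species that 1029 00:56:08,170 --> 00:56:13,960 specializes in living in our clothing was probably not there 1030 00:56:13,960 --> 00:56:17,717 or had not yet speciated before we had clothes. 1031 00:56:17,717 --> 00:56:19,800 Course, you can imagine ways that this could fail, 1032 00:56:19,800 --> 00:56:23,494 but it's a neat hypothesis. 1033 00:56:23,494 --> 00:56:25,160 So then you can go and you can basically 1034 00:56:25,160 --> 00:56:29,940 sequence the species of lice that lives in our clothing as 1035 00:56:29,940 --> 00:56:31,990 compared to the kind that lives in our hair, 1036 00:56:31,990 --> 00:56:35,260 and you can ask, how many neutral mutations accumulated 1037 00:56:35,260 --> 00:56:37,820 along these different lineages. 1038 00:56:37,820 --> 00:56:40,950 Now, you can imagine that based on, since we just 1039 00:56:40,950 --> 00:56:43,490 did this very nice study, we know 1040 00:56:43,490 --> 00:56:47,590 that it should be more than 30,000 years 1041 00:56:47,590 --> 00:56:53,530 and it should be less than 7 million, probably, hopefully. 1042 00:56:53,530 --> 00:56:56,130 Although, it's always possible that our ancestral state was 1043 00:56:56,130 --> 00:56:57,796 wearing clothes and that the chimpanzees 1044 00:56:57,796 --> 00:57:00,200 stopped wearing clothes. 1045 00:57:00,200 --> 00:57:02,790 But we'd be surprised if that were the case. 1046 00:57:06,100 --> 00:57:07,940 All right, so this is basically just asking 1047 00:57:07,940 --> 00:57:16,370 about head lice versus clothing lice. 1048 00:57:16,370 --> 00:57:19,760 And the original study by this researcher at Max Planck 1049 00:57:19,760 --> 00:57:22,950 estimated 70,000 years, but then just a couple years ago there 1050 00:57:22,950 --> 00:57:25,570 was another publication from a professor 1051 00:57:25,570 --> 00:57:28,720 at the University of Florida that estimated 170,000. 1052 00:57:28,720 --> 00:57:31,810 So there still is a fair range, but I 1053 00:57:31,810 --> 00:57:42,867 guess the most recent estimate we'd have to say is 170,000. 1054 00:57:42,867 --> 00:57:43,450 Which is neat.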
1033 00:56:23,494 --> 00:56:25,160 So then you can go and you can basically 1034 00:56:25,160 --> 00:56:29,940 sequence the species of lice that lives in our clothing as 1035 00:56:29,940 --> 00:56:31,990 compared to the kind that lives in our hair, 1036 00:56:31,990 --> 00:56:35,260 and you can ask, how many neutral mutations accumulated 1037 00:56:35,260 --> 00:56:37,820 along these different lineages. 1038 00:56:37,820 --> 00:56:40,950 Now, you can imagine that based on, since we just 1039 00:56:40,950 --> 00:56:43,490 did this very nice study, we know 1040 00:56:43,490 --> 00:56:47,590 that it should be more than 30,000 years 1041 00:56:47,590 --> 00:56:53,530 and it should be less than 7 million, probably, hopefully. 1042 00:56:53,530 --> 00:56:56,130 Although, it's always possible that our ancestral state was 1043 00:56:56,130 --> 00:56:57,796 wearing clothes and that the chimpanzees 1044 00:56:57,796 --> 00:57:00,200 stopped wearing clothes. 1045 00:57:00,200 --> 00:57:02,790 But we'd be surprised if that were the case. 1046 00:57:06,100 --> 00:57:07,940 All right, so this is basically just asking 1047 00:57:07,940 --> 00:57:16,370 about head lice versus clothing lice. 1048 00:57:16,370 --> 00:57:19,760 And the original study by this researcher Max Planck 1049 00:57:19,760 --> 00:57:22,950 estimated 70,000 years, but then just a couple years ago there 1050 00:57:22,950 --> 00:57:25,570 was another publication from a professor 1051 00:57:25,570 --> 00:57:28,720 at the University of Florida that estimated 170,000. 1052 00:57:28,720 --> 00:57:31,810 So there still is a fair range, but I 1053 00:57:31,810 --> 00:57:42,867 guess the most recent estimate we'd have to say is 170,000. 1054 00:57:42,867 --> 00:57:43,450 Which is neat. 
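The counting argument can be sketched in a few lines. This is only an illustration, assuming a constant neutral substitution rate along both lineages (the molecular clock picture); the function name and the numbers below are hypothetical and are not taken from the lice studies mentioned in the lecture.

```python
def divergence_time_years(differences_per_site, substitutions_per_site_per_year):
    """Rough molecular-clock estimate of the time since two lineages split.

    Assumes neutral substitutions accumulate at the same constant rate
    along both lineages, so the observed differences reflect 2 * rate * time.
    """
    if substitutions_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return differences_per_site / (2.0 * substitutions_per_site_per_year)


# Hypothetical, made-up inputs for illustration only.
estimate = divergence_time_years(
    differences_per_site=0.002,
    substitutions_per_site_per_year=1e-8,
)
print(f"estimated split: about {estimate:.0f} years ago")
```

The factor of 2 is there because differences between two present-day sequences accumulate along both branches since the split.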
1055 00:57:43,450 --> 00:57:46,760 I don't know-- it's not that it changes, 1056 00:57:46,760 --> 00:57:50,330 necessarily, how I go about my daily life, 1057 00:57:50,330 --> 00:57:55,550 but I really love this idea that it's a very basic question 1058 00:57:55,550 --> 00:58:00,280 that your toddler son might ask you-- something that you'd 1059 00:58:00,280 --> 00:58:04,730 think that might be totally unknowable in the sense 1060 00:58:04,730 --> 00:58:07,960 that we would never have any way of getting any estimate at all, 1061 00:58:07,960 --> 00:58:08,460 right? 1062 00:58:08,460 --> 00:58:16,720 But using some clever theoretical ideas together 1063 00:58:16,720 --> 00:58:23,110 with data on this accumulation of neutral mutations 1064 00:58:23,110 --> 00:58:26,860 allows one to at least make a ballpark estimate of something 1065 00:58:26,860 --> 00:58:29,540 that there's no physical record of except 1066 00:58:29,540 --> 00:58:32,435 in the DNA of our louses. 1067 00:58:32,435 --> 00:58:33,940 Is that a word? 1068 00:58:33,940 --> 00:58:35,080 AUDIENCE: It's lice. 1069 00:58:35,080 --> 00:58:35,300 PROFESSOR: It's just lice? 1070 00:58:35,300 --> 00:58:35,800 All right. 1071 00:58:39,030 --> 00:58:41,650 Are there any questions about this point? 1072 00:58:47,700 --> 00:58:51,160 So this is all neutral mutation, but of course we'd 1073 00:58:51,160 --> 00:58:54,450 like to move beyond these neutral mutations 1074 00:58:54,450 --> 00:58:59,002 to try to understand how non-neutral mutations spread. 1075 00:58:59,002 --> 00:59:00,460 I'm not going to do the derivation, 1076 00:59:00,460 --> 00:59:02,990 because the derivation is in your book, 1077 00:59:02,990 --> 00:59:04,360 and you just read about it. 1078 00:59:04,360 --> 00:59:06,850 But I do want to just make sure that we understand what 1079 00:59:06,850 --> 00:59:10,470 this equation is telling us.
1080 00:59:10,470 --> 00:59:14,990 So first of all we're going to assume that A 1081 00:59:14,990 --> 00:59:17,930 has some relative fitness r. 1082 00:59:17,930 --> 00:59:21,250 So r is defined as basically the relative fitness of A, 1083 00:59:21,250 --> 00:59:24,450 or the fitness of A divided by the fitness of B. 1084 00:59:24,450 --> 00:59:27,460 So r is greater than 1 means that A is advantageous. 1085 00:59:27,460 --> 00:59:31,990 Less than 1 means it's deleterious. 1086 00:59:31,990 --> 00:59:38,130 And what we're told is that x sub 1087 00:59:38,130 --> 00:59:47,820 i, which is the probability that A fixes, 1088 00:59:47,820 --> 00:59:49,685 is equal to this expression. 1089 00:59:58,980 --> 01:00:06,310 If A fixes, given i A individuals 1090 01:00:06,310 --> 01:00:08,870 and N minus i B individuals. 1091 01:00:08,870 --> 01:00:11,153 AUDIENCE: So I guess that this assumes that they die 1092 01:00:11,153 --> 01:00:12,850 at the same rate [INAUDIBLE]. 1093 01:00:12,850 --> 01:00:13,850 PROFESSOR: That's right. 1094 01:00:13,850 --> 01:00:14,391 That's right. 1095 01:00:14,391 --> 01:00:17,410 The assumption is that replacement is unbiased, purely 1096 01:00:17,410 --> 01:00:24,650 random, and it's only birth that is different by a factor of r. 1097 01:00:24,650 --> 01:00:28,710 And so I think that this is, on one level, wonderful. 1098 01:00:28,710 --> 01:00:32,410 It's kind of a simple expression describing 1099 01:00:32,410 --> 01:00:35,010 a lot of information of the dynamics 1100 01:00:35,010 --> 01:00:36,180 of the stochastic process. 1101 01:00:38,552 --> 01:00:40,760 On another level, the problem is that you look at it, 1102 01:00:40,760 --> 01:00:44,560 and I think it's easy to have like absolutely zero intuition 1103 01:00:44,560 --> 01:00:47,090 for what this thing does.
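The expression itself is written on the board and does not appear in the captions. A likely reconstruction, assuming the standard Moran fixation probability and consistent with the limits the lecture goes on to check (r to the 0 giving 1 minus 1, and derivatives that bring down i and N), is the following. The symbols f_A and f_B for the two fitnesses are introduced here only for illustration.

```latex
x_i \;=\; \frac{1 - 1/r^{\,i}}{1 - 1/r^{\,N}},
\qquad r = \frac{f_A}{f_B},
\qquad i = 0, 1, \dots, N .
```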
1104 01:00:47,090 --> 01:00:52,110 So what I always like to do when a student comes to my office 1105 01:00:52,110 --> 01:00:55,740 and says, oh I derived something great for our project. 1106 01:00:55,740 --> 01:00:58,860 You take a few limits to get a sense 1107 01:00:58,860 --> 01:01:00,550 of what's going on with it. 1108 01:01:00,550 --> 01:01:02,360 And half the time you find that it's not true. 1109 01:01:02,360 --> 01:01:05,650 But at least it's a way of developing intuition 1110 01:01:05,650 --> 01:01:07,210 for what's happening. 1111 01:01:07,210 --> 01:01:10,970 All right, so what are limits that this thing should behave-- 1112 01:01:13,946 --> 01:01:19,450 AUDIENCE: It should be 0 if there's no A. 1113 01:01:19,450 --> 01:01:25,510 PROFESSOR: Right, so x of 0 should be equal to 0, 1114 01:01:25,510 --> 01:01:27,260 is what you're saying. 1115 01:01:27,260 --> 01:01:28,970 If you have 0 individuals, you should 1116 01:01:28,970 --> 01:01:31,911 have 0 probability of fixing, independent of your fitness, 1117 01:01:31,911 --> 01:01:32,410 right? 1118 01:01:35,780 --> 01:01:39,240 All right, that sounds like a reasonable thing to check. 1119 01:01:39,240 --> 01:01:42,160 And does it work? 1120 01:01:42,160 --> 01:01:44,590 So r to the 0 is equal to 1. 1121 01:01:44,590 --> 01:01:47,750 So that's 1, so it's 1 minus 1-- 0. 1122 01:01:47,750 --> 01:01:50,565 Yep. 1123 01:01:50,565 --> 01:01:51,555 Yes? 1124 01:01:51,555 --> 01:01:54,030 AUDIENCE: r goes to infinity independently 1125 01:01:54,030 --> 01:01:58,561 of what you start with in step zero, then you expect A to fix? 1126 01:01:58,561 --> 01:01:59,560 PROFESSOR: That's right. 1127 01:01:59,560 --> 01:02:06,090 So the limit of xi for any i other than 0-- 1128 01:02:06,090 --> 01:02:10,580 as r goes to infinity, this should be equal to 1. 1129 01:02:16,240 --> 01:02:18,400 All right, so let's see.
1130 01:02:18,400 --> 01:02:29,526 If r goes to infinity, you get 0-- this is also 0-- 1 divided 1131 01:02:29,526 --> 01:02:31,320 by 1 is equal to 1-- all right. 1132 01:02:34,120 --> 01:02:35,370 Did everybody agree with that? 1133 01:02:39,410 --> 01:02:44,109 And that makes sense just that if A is just super, super fit, 1134 01:02:44,109 --> 01:02:44,900 then it should fix. 1135 01:02:47,590 --> 01:02:49,010 And of course, what's tricky here 1136 01:02:49,010 --> 01:02:53,380 is that r has to be surprisingly large before this thing ends up 1137 01:02:53,380 --> 01:02:56,170 being true. 1138 01:02:56,170 --> 01:03:00,370 This limit is great, and it's correct and true, 1139 01:03:00,370 --> 01:03:01,870 but it's also a little bit dangerous 1140 01:03:01,870 --> 01:03:06,810 because-- well we'll see that even things that you 1141 01:03:06,810 --> 01:03:10,390 think of as being very beneficial mutations typically 1142 01:03:10,390 --> 01:03:11,310 do not fix. 1143 01:03:11,310 --> 01:03:17,690 So this is the danger, but at least the limit is still true. 1144 01:03:17,690 --> 01:03:20,560 Any other limits that we think ought to happen? 1145 01:03:20,560 --> 01:03:23,537 AUDIENCE: If an i goes to N? 1146 01:03:23,537 --> 01:03:24,620 PROFESSOR: An i goes to N? 1147 01:03:24,620 --> 01:03:27,040 OK, right. 1148 01:03:27,040 --> 01:03:28,560 This is the opposite of this one. 1149 01:03:28,560 --> 01:03:32,266 This is just saying that if you already have fixed then 1150 01:03:32,266 --> 01:03:32,765 you fixed. 1151 01:03:36,270 --> 01:03:38,425 Indeed, if i is equal to N-- that works. 1152 01:03:42,970 --> 01:03:47,046 Any other limits that you believe should be true, 1153 01:03:47,046 --> 01:03:47,920 think should be true? 1154 01:03:47,920 --> 01:03:52,200 AUDIENCE: The one we already checked for r equals one? 1155 01:03:52,200 --> 01:03:53,591 PROFESSOR: Yes. 1156 01:03:53,591 --> 01:03:54,090 Indeed. 
1157 01:03:54,090 --> 01:04:00,860 So if it's neutral-- so the limit as r goes to 1 of xi 1158 01:04:00,860 --> 01:04:03,104 should be equal to what? 1159 01:04:03,104 --> 01:04:04,030 AUDIENCE: i/N. 1160 01:04:04,030 --> 01:04:06,730 PROFESSOR: Should be equal to i/N. So this one 1161 01:04:06,730 --> 01:04:10,570 is a little bit less obvious, because if you 1162 01:04:10,570 --> 01:04:20,537 set r equal to 1, does this mean that it's equal to 0? 1163 01:04:20,537 --> 01:04:21,495 And what's the problem? 1164 01:04:24,360 --> 01:04:25,193 [INTERPOSING VOICES] 1165 01:04:28,620 --> 01:04:31,710 PROFESSOR: Well, OK, but even that statement's not true. 1166 01:04:31,710 --> 01:04:33,985 It's not even necessarily close to 0. 1167 01:04:33,985 --> 01:04:35,360 AUDIENCE: [INAUDIBLE] L'Hopital's? 1168 01:04:35,360 --> 01:04:36,068 PROFESSOR: Right. 1169 01:04:36,068 --> 01:04:37,746 This is the L'Hopital's. 1170 01:04:37,746 --> 01:04:39,120 There was another context already 1171 01:04:39,120 --> 01:04:41,360 where L'Hopital's came up, right? 1172 01:04:41,360 --> 01:04:42,810 Maybe? 1173 01:04:42,810 --> 01:04:45,440 OK, so the problem is that if you set r equal to 1 here, 1174 01:04:45,440 --> 01:04:46,140 then you get 0. 1175 01:04:46,140 --> 01:04:48,220 So then you think, oh, the answer is 0. 1176 01:04:48,220 --> 01:04:50,720 But you have to be more careful than that, because this also 1177 01:04:50,720 --> 01:04:53,710 is equal to 0. 1178 01:04:53,710 --> 01:05:03,010 And so L'Hopital's-- L-H- -- is it above the H? 1179 01:05:07,510 --> 01:05:08,820 AUDIENCE: No, that looks right. 1180 01:05:08,820 --> 01:05:09,360 PROFESSOR: Is it good? 1181 01:05:09,360 --> 01:05:09,943 AUDIENCE: Yes. 1182 01:05:09,943 --> 01:05:11,420 PROFESSOR: All right. 1183 01:05:11,420 --> 01:05:12,960 You're French, right? 1184 01:05:12,960 --> 01:05:14,680 I mean, sort of.
1185 01:05:18,000 --> 01:05:22,755 He's from Quebec, so I don't know what that question-- 1186 01:05:22,755 --> 01:05:23,630 how it's interpreted. 1187 01:05:26,740 --> 01:05:28,500 So this [INAUDIBLE] looks all right. 1188 01:05:28,500 --> 01:05:31,450 See, what you just have to do is then 1189 01:05:31,450 --> 01:05:33,600 you take the derivative with respect 1190 01:05:33,600 --> 01:05:36,240 to r for both the numerator and the denominator, 1191 01:05:36,240 --> 01:05:38,940 and then you see what ha-- but you take the limit again. 1192 01:05:38,940 --> 01:05:41,130 And sometimes you have to apply L'Hopital's rule 1193 01:05:41,130 --> 01:05:42,610 multiple times, right? 1194 01:05:42,610 --> 01:05:47,630 So what we write here is this is the limit as r goes to 1, 1195 01:05:47,630 --> 01:05:51,810 and we take the derivative of the numerator with respect to r. 1196 01:05:51,810 --> 01:06:00,540 So we get out an i, 1 over r to the i plus 1, maybe? 1197 01:06:06,470 --> 01:06:14,960 And here we get out an N. 1198 01:06:14,960 --> 01:06:18,940 All right, so we took the derivatives with respect to r here. 1199 01:06:18,940 --> 01:06:20,750 But we left it as a limit because we might 1200 01:06:20,750 --> 01:06:22,170 need to apply it again, right? 1201 01:06:22,170 --> 01:06:23,440 Just because after you take the derivative 1202 01:06:23,440 --> 01:06:25,731 you're not guaranteed that it's going to work out fine, 1203 01:06:25,731 --> 01:06:26,800 but in this case it does. 1204 01:06:26,800 --> 01:06:28,850 Because already, this limit, we're 1205 01:06:28,850 --> 01:06:32,320 allowed to just set equal to 1 because nothing blows up. 1206 01:06:32,320 --> 01:06:38,270 So this is indeed equal to i/N. 1207 01:06:38,270 --> 01:06:41,250 And the important point here is that it's not necessarily 1208 01:06:41,250 --> 01:06:43,080 approximately equal to 0.
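The limit checks above can also be done numerically. The following is a minimal sketch, assuming the fixation probability has the form x_i = (1 - 1/r^i) / (1 - 1/r^N); the function name and the test values are illustrative choices, not something from the lecture or the book.

```python
import math


def fixation_probability(i, N, r):
    """Probability that type A fixes, starting from i copies of A out of N.

    Assumes x_i = (1 - r**-i) / (1 - r**-N), with the neutral case r == 1
    handled separately as the limit i / N (the L'Hopital result).
    """
    if N < 1 or not 0 <= i <= N or r <= 0:
        raise ValueError("need N >= 1, 0 <= i <= N, r > 0")
    if r == 1.0:
        return i / N
    log_r = math.log(r)
    if r > 1.0:
        # expm1(-k * log r) = r**-k - 1; the two minus signs cancel in the ratio.
        return math.expm1(-i * log_r) / math.expm1(-N * log_r)
    # For r < 1, use the equivalent form r**(N - i) * (r**i - 1) / (r**N - 1),
    # which avoids overflow when N is large.
    return math.exp((N - i) * log_r) * math.expm1(i * log_r) / math.expm1(N * log_r)


N = 100
print("x_0 =", fixation_probability(0, N, 1.5))            # no A individuals
print("x_N =", fixation_probability(N, N, 1.5))            # A already fixed
print("r -> 1:", fixation_probability(30, N, 1.0 + 1e-9))  # compare with 30 / 100
print("r large:", fixation_probability(1, N, 1e6))         # one very fit mutant
```

The r close to 1 case comes out near i/N, not near 0, which is the point made with L'Hopital's rule.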
1209 01:06:43,080 --> 01:06:46,690 It could be anywhere between 0 and 1 1210 01:06:46,690 --> 01:06:49,490 depending what i and N are. 1211 01:06:49,490 --> 01:06:52,150 PROFESSOR: So that means that this expression here 1212 01:06:52,150 --> 01:06:58,142 captures the dynamics, actually, for all i, r, 1213 01:06:58,142 --> 01:07:01,630 N within the Moran process. 1214 01:07:01,630 --> 01:07:07,320 This thing is simply just true in this model. 1215 01:07:07,320 --> 01:07:11,020 There are no approximations yet. 1216 01:07:11,020 --> 01:07:13,205 There is, however, one approximation 1217 01:07:13,205 --> 01:07:16,300 that is very useful to make, which 1218 01:07:16,300 --> 01:07:19,170 is the approximation of what happens 1219 01:07:19,170 --> 01:07:22,130 when r is approximately 1. 1220 01:07:27,949 --> 01:07:29,490 In particular what we're going to ask 1221 01:07:29,490 --> 01:07:32,450 is, if we define something called a selection 1222 01:07:32,450 --> 01:07:46,020 coefficient, s, so that r is 1 plus s, the idea 1223 01:07:46,020 --> 01:07:49,664 here is that in many cases-- well for Thursday 1224 01:07:49,664 --> 01:07:51,330 we're going to read a paper that I think 1225 01:07:51,330 --> 01:07:53,000 is quite interesting. 1226 01:07:53,000 --> 01:07:59,080 And where they were analyzing the appearance 1227 01:07:59,080 --> 01:08:01,180 of these mutations that would allow 1228 01:08:01,180 --> 01:08:04,790 bacteria to survive in some environment to do better. 1229 01:08:04,790 --> 01:08:06,310 And typical selection coefficients 1230 01:08:06,310 --> 01:08:09,990 here are kind of 1% to 3%. 1231 01:08:09,990 --> 01:08:13,980 So the mutations that appear and that allow one of these cells 1232 01:08:13,980 --> 01:08:15,610 to do better in this new environment, 1233 01:08:15,610 --> 01:08:20,240 confer an advantage that was on the order of 1% or 2%, or so. 1234 01:08:20,240 --> 01:08:24,580 Which means that s here would be like 0.01, 0.02.
1235 01:08:24,580 --> 01:08:28,040 Which means that for basically all the situations that you 1236 01:08:28,040 --> 01:08:30,529 see in the laboratory and so forth, what you really 1237 01:08:30,529 --> 01:08:34,390 want to know is what happens for small s. 1238 01:08:34,390 --> 01:08:36,770 So for s, much less than 1. 1239 01:08:36,770 --> 01:08:38,713 So where r is approximately equal to 1. 1240 01:08:45,029 --> 01:08:51,750 And in this case, we can say xi, well-- 1241 01:08:51,750 --> 01:08:56,670 and we actually are going to want to ask about x sub 1. 1242 01:08:56,670 --> 01:08:58,370 So that's a 1 now. 1243 01:09:02,010 --> 01:09:04,010 And the reason for that is that we want to know, 1244 01:09:04,010 --> 01:09:06,370 there's some rate that new mutations 1245 01:09:06,370 --> 01:09:07,632 will appear in the population? 1246 01:09:07,632 --> 01:09:10,090 When they appear they'll be present in a single individual, 1247 01:09:10,090 --> 01:09:13,130 and we want to know what is the probability that one 1248 01:09:13,130 --> 01:09:17,460 individual-- let's say has a beneficial mutation, 1249 01:09:17,460 --> 01:09:21,410 well, the probability it'll fix-- so what we want to know 1250 01:09:21,410 --> 01:09:27,500 is for s, much less than 1, but larger than 0. 1251 01:09:27,500 --> 01:09:30,720 So for a beneficial mutation of modest effect, 1252 01:09:30,720 --> 01:09:33,580 what's the probability that it will fix? 1253 01:09:33,580 --> 01:09:40,359 Well the idea here is that r to the N 1254 01:09:40,359 --> 01:09:44,590 is going to be much larger than 1, because N is often 1255 01:09:44,590 --> 01:09:47,460 a big population. 1256 01:09:47,460 --> 01:09:50,040 Now in that situation, this is just 1257 01:09:50,040 --> 01:09:53,490 approximately equal to 1 minus 1 over r. 1258 01:09:58,350 --> 01:10:01,920 And r, we've already decided it can be expressed as 1 1259 01:10:01,920 --> 01:10:05,390 plus the selection coefficient.
1260 01:10:05,390 --> 01:10:08,490 Now this is something that you want 1261 01:10:08,490 --> 01:10:14,270 to be able to simplify in your sleep. 1262 01:10:14,270 --> 01:10:16,630 1 minus 1 divided by 1 plus s is approximately 1263 01:10:16,630 --> 01:10:19,560 equal to 1 minus the quantity 1 minus s. 1264 01:10:24,790 --> 01:10:27,190 And this is indeed approximately equal to s. 1265 01:10:30,580 --> 01:10:33,600 This is saying that in the Moran process, 1266 01:10:33,600 --> 01:10:36,180 if a beneficial mutation appears in the population 1267 01:10:36,180 --> 01:10:41,230 with selection coefficient s, that might be 1% to 3%, 1268 01:10:41,230 --> 01:10:46,040 then it has a 1% to 3% probability of surviving. 1269 01:10:46,040 --> 01:10:49,180 Because this is the probability of fixing, 1270 01:10:49,180 --> 01:10:51,627 but in this situation fixation and survival 1271 01:10:51,627 --> 01:10:53,210 are the same thing, because we're just 1272 01:10:53,210 --> 01:10:55,730 considering this one mutation. 1273 01:10:55,730 --> 01:10:57,120 So it's the only thing that we're 1274 01:10:57,120 --> 01:10:59,480 considering is the fate of this one mutation of the population. 1275 01:10:59,480 --> 01:11:00,870 We're going to assume for now that you 1276 01:11:00,870 --> 01:11:03,286 can't get new mutations in the population to compete with. 1277 01:11:03,286 --> 01:11:05,040 And then either you go extinct, or you 1278 01:11:05,040 --> 01:11:07,610 take over the population. 1279 01:11:07,610 --> 01:11:11,120 And what's surprising here is that even 1280 01:11:11,120 --> 01:11:16,290 if it's an effect that you think is big, like 3%, 4%. 1281 01:11:16,290 --> 01:11:18,380 I would love to get such a mutation. 1282 01:11:18,380 --> 01:11:22,420 But still, in a population in the Moran process, 1283 01:11:22,420 --> 01:11:25,350 or really in any other model like this, 1284 01:11:25,350 --> 01:11:28,900 it will typically go extinct.
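To put numbers on this, here is a small sketch that evaluates x_1 for the 1% to 3% selection coefficients mentioned above. It assumes the form x_1 = (1 - 1/r) / (1 - 1/r^N) with r = 1 + s; the function name and the population size of one million are hypothetical choices for illustration.

```python
import math


def single_mutant_fixation(s, N):
    """x_1 for one beneficial mutant (s > 0) in a population of size N.

    Assumes x_1 = (1 - 1/r) / (1 - r**-N) with r = 1 + s.
    """
    if s <= 0 or N < 1:
        raise ValueError("this sketch only covers s > 0 and N >= 1")
    log_r = math.log1p(s)
    # expm1(-k * log r) = r**-k - 1; the minus signs cancel in the ratio.
    return math.expm1(-log_r) / math.expm1(-N * log_r)


for s in (0.01, 0.02, 0.03):
    x1 = single_mutant_fixation(s, N=10**6)
    print(f"s = {s:.2f}: x_1 = {x1:.5f}, lost with probability {1.0 - x1:.5f}")
```

For a population this large the denominator is essentially 1, so x_1 is s / (1 + s), which sits just below s: a mutant with a few percent advantage is still lost the vast majority of the time.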
1285 01:11:28,900 --> 01:11:31,990 Now, it's worth saying that depending 1286 01:11:31,990 --> 01:11:35,290 upon the model that you're using, 1287 01:11:35,290 --> 01:11:40,050 you'll get different numbers in here. 1288 01:11:40,050 --> 01:11:43,900 In this case, the probability of fixation or survival 1289 01:11:43,900 --> 01:11:46,000 is 1 times s. 1290 01:11:46,000 --> 01:11:47,650 But in some other models and it depends 1291 01:11:47,650 --> 01:11:48,720 on the branching process. 1292 01:11:48,720 --> 01:11:50,460 It could be two times s. 1293 01:11:50,460 --> 01:11:55,185 But it's something of order unity times s. 1294 01:11:55,185 --> 01:12:03,610 AUDIENCE: And so we say that if s is equal to 0, then x of 1 1295 01:12:03,610 --> 01:12:05,570 should be-- 1296 01:12:05,570 --> 01:12:06,460 [INTERPOSING VOICES] 1297 01:12:06,460 --> 01:12:08,640 PROFESSOR: Right. 1298 01:12:08,640 --> 01:12:09,140 Exactly. 1299 01:12:09,140 --> 01:12:10,973 AUDIENCE: [INAUDIBLE] assume that s is much, 1300 01:12:10,973 --> 01:12:12,819 much greater than 1/N? 1301 01:12:12,819 --> 01:12:13,610 PROFESSOR: Exactly. 1302 01:12:13,610 --> 01:12:19,030 So what we've assumed is that this is true 1303 01:12:19,030 --> 01:12:25,870 and that I think that's actually already sufficient. 1304 01:12:25,870 --> 01:12:30,960 And indeed, you can go and you can do the expansion 1305 01:12:30,960 --> 01:12:35,030 and find what x sub 1 should be equal to in general here. 1306 01:12:35,030 --> 01:12:41,420 And for small s, but including the possibility 1307 01:12:41,420 --> 01:12:45,935 that it's very-- I'm sorry-- for small s but not super small s. 1308 01:12:45,935 --> 01:12:47,685 It's a matter of what you're comparing to. 1309 01:12:50,540 --> 01:12:55,310 And you end up getting 1/N plus s/2. 1310 01:12:58,380 --> 01:13:04,780 And this is for s times N much less than 1. 
1311 01:13:09,010 --> 01:13:10,980 And indeed, this is the definition 1312 01:13:10,980 --> 01:13:13,170 of what we mean by nearly neutral. 1313 01:13:18,580 --> 01:13:21,530 Because up to now we've been talking about neutral mutations 1314 01:13:21,530 --> 01:13:24,570 as if they just had to be exactly, exactly neutral, 1315 01:13:24,570 --> 01:13:29,220 but then really, that probably doesn't actually literally 1316 01:13:29,220 --> 01:13:29,990 exist. 1317 01:13:29,990 --> 01:13:32,790 But if a mutation appears and it only changes your fitness 1318 01:13:32,790 --> 01:13:35,790 by one part in 10 to the 30, 1319 01:13:35,790 --> 01:13:40,220 then it is equivalent to being a truly neutral mutation. 1320 01:13:40,220 --> 01:13:43,060 But the way that you quantify that is this thing, 1321 01:13:43,060 --> 01:13:45,270 s times N. You want to know whether it's 1322 01:13:45,270 --> 01:13:46,760 larger or smaller than 1. 1323 01:13:51,300 --> 01:14:00,285 So for example if you plot x sub 1 as a function of s-- 0, 1324 01:14:00,285 --> 01:14:02,590 right? 1325 01:14:02,590 --> 01:14:04,326 It crosses 1 over N here. 1326 01:14:08,520 --> 01:14:14,070 Now, it's going to have a slope here-- s over 2-- 1327 01:14:14,070 --> 01:14:16,220 but then it kind of goes up. 1328 01:14:16,220 --> 01:14:22,075 It eventually hits this-- this is just the s line-- 1329 01:14:22,075 --> 01:14:25,410 because if s is much less than 1, 1330 01:14:25,410 --> 01:14:29,590 yet s times N is much larger than 1, 1331 01:14:29,590 --> 01:14:31,635 then x sub 1 is approximately equal to s. 1332 01:14:34,240 --> 01:14:38,925 And then over here, this goes down exponentially. 1333 01:14:43,080 --> 01:14:46,190 But the statement is that if s times N is much less than 1, 1334 01:14:46,190 --> 01:14:50,590 then the mutation acts essentially as if it's neutral.
1335 01:14:50,590 --> 01:14:53,830 So then you can just work with that, 1336 01:14:53,830 --> 01:14:56,160 whereas, if s times N is much larger than 1, 1337 01:14:56,160 --> 01:14:58,700 then you end up-- it's not guaranteed 1338 01:14:58,700 --> 01:15:02,200 that it's going to fix, but the probability is much larger 1339 01:15:02,200 --> 01:15:06,120 than 1/N. Whereas down here, it becomes 1340 01:15:06,120 --> 01:15:10,434 very unlikely to fix once s times N is larger than 1 1341 01:15:10,434 --> 01:15:11,392 on the deleterious side. 1342 01:15:17,300 --> 01:15:20,320 Are there any questions about what's going on here? 1343 01:15:20,320 --> 01:15:20,820 Yes. 1344 01:15:20,820 --> 01:15:23,814 AUDIENCE: I'm wondering about that factor of 1/2. 1345 01:15:23,814 --> 01:15:29,053 [INAUDIBLE] seems like we didn't get it over here when we had N 1346 01:15:29,053 --> 01:15:30,121 equals-- 1347 01:15:30,121 --> 01:15:31,120 PROFESSOR: That's right. 1348 01:15:31,120 --> 01:15:31,855 Yeah, sorry. 1349 01:15:31,855 --> 01:15:33,980 And indeed, when I did this calculation originally, 1350 01:15:33,980 --> 01:15:36,354 I was very confused, because I thought it should be that. 1351 01:15:36,354 --> 01:15:38,330 But what you see is that if you plot this, 1352 01:15:38,330 --> 01:15:42,501 the slope here, which is the 1/2, is less than the slope 1353 01:15:42,501 --> 01:15:43,417 over here, which is 1. 1354 01:15:51,369 --> 01:15:52,363 AUDIENCE: [INAUDIBLE] 1355 01:15:57,656 --> 01:15:59,780 PROFESSOR: Well, what's funny is, I actually spent, 1356 01:15:59,780 --> 01:16:01,400 like, hours, trying to figure out 1357 01:16:01,400 --> 01:16:03,830 where I had made my mistakes. 1358 01:16:03,830 --> 01:16:06,880 No, but I think that if you just draw it, 1359 01:16:06,880 --> 01:16:08,635 I think it's all consistent.
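The three regimes of the curve, and the factor of 1/2 in the slope at the origin, can be checked numerically. The sketch below is a minimal illustration, again assuming the standard Moran formula x_1 = (1 - 1/r) / (1 - r^(-N)) with r = 1 + s. The population size N = 1000, the sample values of s, and the helper name `fixation_probability` are hypothetical choices made for this example.

```python
import math


def fixation_probability(s: float, N: int) -> float:
    """Exact Moran fixation probability of one mutant with fitness r = 1 + s."""
    if s == 0.0:
        return 1.0 / N
    log_r = math.log1p(s)
    # (1 - 1/r) / (1 - r**-N), written with expm1 for accuracy at small s
    return math.expm1(-log_r) / math.expm1(-N * log_r)


N = 1000  # hypothetical population size

# Compare the exact value with the nearly neutral form 1/N + s/2 and with s.
for s in (-0.02, -1e-5, 0.0, 1e-5, 0.02):
    exact = fixation_probability(s, N)
    nearly_neutral = 1.0 / N + s / 2.0
    print(f"s*N = {s * N:+7.2f}   exact = {exact:.3e}   "
          f"1/N + s/2 = {nearly_neutral:.3e}   s = {s:+.1e}")

# Central-difference estimate of the slope of x_1 at s = 0.
h = 1e-6
slope_at_zero = (fixation_probability(h, N) - fixation_probability(-h, N)) / (2 * h)
print(f"slope at s = 0: {slope_at_zero:.4f}")
```

For the rows with s times N much smaller than 1 in magnitude, the exact value should track 1/N + s/2. For s times N = 20 it should be close to s instead, and for s times N = -20 it should be many orders of magnitude below 1/N.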
1360 01:16:08,635 --> 01:16:09,510 AUDIENCE: [INAUDIBLE] 1361 01:16:12,870 --> 01:16:15,660 PROFESSOR: I mean, I'm not plotting the entire curve, 1362 01:16:15,660 --> 01:16:17,656 analytically, perfectly. 1363 01:16:17,656 --> 01:16:19,510 AUDIENCE: [INAUDIBLE] 1364 01:16:19,510 --> 01:16:23,520 PROFESSOR: It's just that what we know 1365 01:16:23,520 --> 01:16:25,600 is that it behaves like this around here, 1366 01:16:25,600 --> 01:16:33,409 and like s up here, and I've just connected them-- 1367 01:16:33,409 --> 01:16:35,450 it probably doesn't answer your question but it-- 1368 01:16:39,900 --> 01:16:42,610 So I think we have just enough time to say something 1369 01:16:42,610 --> 01:16:44,090 about this Muller's ratchet idea. 1370 01:16:46,660 --> 01:16:50,820 Verbally, can a deleterious mutation, s less than 0, 1371 01:16:50,820 --> 01:16:52,450 can it fix in a population? 1372 01:16:52,450 --> 01:16:52,950 Yes or no? 1373 01:16:52,950 --> 01:16:55,487 Ready, three, two, one. 1374 01:16:55,487 --> 01:16:56,070 AUDIENCE: Yes. 1375 01:16:56,070 --> 01:16:58,500 PROFESSOR: Yeah. 1376 01:16:58,500 --> 01:17:00,650 This is greater than 0. 1377 01:17:00,650 --> 01:17:04,070 Now, it's very unlikely to fix if negative s times N 1378 01:17:04,070 --> 01:17:06,040 is much larger than 1. 1379 01:17:06,040 --> 01:17:10,040 But for small populations, you 1380 01:17:10,040 --> 01:17:13,140 could actually fix relatively deleterious mutations. 1381 01:17:15,840 --> 01:17:20,090 So the idea is that for small populations, 1382 01:17:20,090 --> 01:17:23,330 it's easy to accumulate deleterious mutations. 1383 01:17:23,330 --> 01:17:26,738 And indeed, this is related to something called a mutation 1384 01:17:26,738 --> 01:17:27,529 accumulation assay. 1385 01:17:33,120 --> 01:17:35,230 So if you take a population of bacteria 1386 01:17:35,230 --> 01:17:47,500 or other microorganisms-- so you grow up 1387 01:17:47,500 --> 01:17:49,780 your bacteria in a test tube.
1388 01:17:49,780 --> 01:17:51,990 And so you have a bunch of bacteria. 1389 01:17:51,990 --> 01:17:54,490 So now there's selection acting in here, because the faster 1390 01:17:54,490 --> 01:17:56,040 dividing cells are spreading. 1391 01:17:56,040 --> 01:18:00,790 However, what you do is that you then plate them as colonies. 1392 01:18:00,790 --> 01:18:03,976 And these colonies each started as a single cell. 1393 01:18:03,976 --> 01:18:05,350 Then what you do is you just take 1394 01:18:05,350 --> 01:18:10,760 a random cell, a random colony, that came from a single cell, 1395 01:18:10,760 --> 01:18:14,920 and you grow it up here. 1396 01:18:14,920 --> 01:18:17,572 Or you could just replate directly if you like. 1397 01:18:17,572 --> 01:18:19,155 And then you just repeat this process. 1398 01:18:22,040 --> 01:18:27,744 The idea is that you've picked a random cell 1399 01:18:27,744 --> 01:18:29,910 from this population that you allowed to grow up, 1400 01:18:29,910 --> 01:18:32,160 and so you've kind of removed the effects of selection 1401 01:18:32,160 --> 01:18:34,220 in here. 1402 01:18:34,220 --> 01:18:35,910 So when you pick a random colony here, 1403 01:18:35,910 --> 01:18:37,706 maybe that colony got some weird mutation 1404 01:18:37,706 --> 01:18:40,080 that decreased its fitness, but it wasn't really selected 1405 01:18:40,080 --> 01:18:41,810 against because you just kind of picked 1406 01:18:41,810 --> 01:18:44,440 one of these colonies randomly. 1407 01:18:44,440 --> 01:18:47,610 So this kind of process is a way of reducing 1408 01:18:47,610 --> 01:18:50,155 what's known as the effective population size-- N effective.
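To get a rough feel for why passing through a single cell keeps N effective small, here is a minimal sketch. It assumes the textbook harmonic-mean rule for a population whose size changes from generation to generation, which is not derived in this lecture and is only one of several definitions of N effective. The growth schedule (one cell doubling for 25 generations before the next single-cell pick) and the function name `harmonic_mean_ne` are hypothetical.

```python
def harmonic_mean_ne(sizes):
    """Harmonic-mean effective population size over a list of per-generation sizes.

    Assumption: N_e = T / sum(1 / N_t), the textbook rule for fluctuating sizes.
    """
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical cycle: a single cell doubling for 25 generations (1, 2, 4, ..., 2**25).
sizes = [2**g for g in range(26)]
n_eff = harmonic_mean_ne(sizes)
arithmetic_mean = sum(sizes) / len(sizes)

print(f"largest size    = {max(sizes):.2e}")
print(f"arithmetic mean = {arithmetic_mean:.2e}")
print(f"harmonic N_e    = {n_eff:.1f}")
```

Under these assumptions the colony reaches tens of millions of cells, yet the harmonic mean stays in the low tens, because the few generations spent near the single-cell stage dominate the sum of 1/N.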
1409 01:18:52,730 --> 01:18:56,020 So when populations are not constant in time, 1410 01:18:56,020 --> 01:18:58,420 but instead oscillate or fluctuate, 1411 01:18:58,420 --> 01:19:03,040 then in many cases the dynamics or the strength of this drift, 1412 01:19:03,040 --> 01:19:05,900 or the stochastic stuff, can be characterized 1413 01:19:05,900 --> 01:19:06,755 by some N effective. 1414 01:19:06,755 --> 01:19:09,170 And of course, depending on which variable 1415 01:19:09,170 --> 01:19:11,171 or which quantity you're trying to study, 1416 01:19:11,171 --> 01:19:13,170 you might have a slightly different N effective. 1417 01:19:13,170 --> 01:19:15,930 But the point is that if you have fluctuating population 1418 01:19:15,930 --> 01:19:21,420 sizes, then the relevant population size for thinking 1419 01:19:21,420 --> 01:19:23,650 about these sorts of ideas is often 1420 01:19:23,650 --> 01:19:25,590 towards the smaller side of the range 1421 01:19:25,590 --> 01:19:26,840 of the fluctuating population. 1422 01:19:26,840 --> 01:19:31,284 So you're kind of dominated by how small the bottleneck gets. 1423 01:19:31,284 --> 01:19:33,450 And here, you're kind of going through a single cell 1424 01:19:33,450 --> 01:19:36,190 bottleneck, so that leads to a very small N effective, 1425 01:19:36,190 --> 01:19:38,039 and it allows for the accumulation 1426 01:19:38,039 --> 01:19:39,080 of deleterious mutations. 1427 01:19:44,905 --> 01:19:47,280 Now, we'll say something more about this Muller's ratchet 1428 01:19:47,280 --> 01:19:50,080 idea, because in the field of evolution, 1429 01:19:50,080 --> 01:19:53,340 I'd say one of the big overriding questions 1430 01:19:53,340 --> 01:19:56,590 is trying to understand the evolutionary advantages 1431 01:19:56,590 --> 01:19:59,030 of sexual reproduction. 1432 01:19:59,030 --> 01:20:02,220 Because of this famous twofold cost of sex.
1433 01:20:02,220 --> 01:20:04,760 That is, it seems that an asexual population 1434 01:20:04,760 --> 01:20:08,830 should be able to grow twice as fast as a sexually 1435 01:20:08,830 --> 01:20:13,440 reproducing population, because the males are not giving birth. 1436 01:20:13,440 --> 01:20:15,940 So there's a huge cost associated with sexual reproduction. 1437 01:20:15,940 --> 01:20:20,020 The question is, why is it that so many organisms engage in it? 1438 01:20:20,020 --> 01:20:23,800 And one of the explanations that's been proposed 1439 01:20:23,800 --> 01:20:26,680 is this Muller's ratchet idea, that sex 1440 01:20:26,680 --> 01:20:31,320 may alleviate this accumulation of deleterious mutations. 1441 01:20:31,320 --> 01:20:34,270 We'll talk more about how this works out, 1442 01:20:34,270 --> 01:20:36,630 maybe quantitatively, and the other proposals 1443 01:20:36,630 --> 01:20:39,710 and so forth later, but you may, over the course 1444 01:20:39,710 --> 01:20:42,350 of your studies, come across this Muller's ratchet vis-a-vis 1445 01:20:42,350 --> 01:20:43,850 the question of sexual reproduction. 1446 01:20:43,850 --> 01:20:45,475 So I just wanted you to know that there 1447 01:20:45,475 --> 01:20:48,200 is this discussion out there, about how sexual reproduction 1448 01:20:48,200 --> 01:20:51,270 may allow for the separation of these beneficial and 1449 01:20:51,270 --> 01:20:54,000 deleterious mutations that would otherwise 1450 01:20:54,000 --> 01:20:56,150 accumulate in the population.