1 00:00:00,929 --> 00:00:02,720 PROFESSOR: We're constantly asking how long 2 00:00:02,720 --> 00:00:04,340 we have to wait for things. 3 00:00:04,340 --> 00:00:08,200 And in the context of probability theory, 4 00:00:08,200 --> 00:00:10,760 it turns into the technical question called, 5 00:00:10,760 --> 00:00:14,550 the expected time to failure-- or the mean time to failure. 6 00:00:14,550 --> 00:00:17,220 Some examples might be that an insurance company wants 7 00:00:17,220 --> 00:00:19,160 to know, for a given policy holder, 8 00:00:19,160 --> 00:00:23,390 the expected time before that policy holder dies. 9 00:00:23,390 --> 00:00:26,460 A mechanical engineer wants to know the expected time 10 00:00:26,460 --> 00:00:30,140 before a button that's being pushed once per second 11 00:00:30,140 --> 00:00:32,409 is expected to fail. 12 00:00:32,409 --> 00:00:35,100 And I want to know when the part that my body 13 00:00:35,100 --> 00:00:38,642 shop has been waiting for is expected to come in. 14 00:00:42,270 --> 00:00:43,870 The mean time to failure problem, 15 00:00:43,870 --> 00:00:47,300 we can formalize in terms of flipping coins. 16 00:00:47,300 --> 00:00:49,990 So we're going to flip a coin until a head comes up. 17 00:00:49,990 --> 00:00:51,470 And we're going to think of a head 18 00:00:51,470 --> 00:00:55,750 as being a failure, and a tail as a success. 19 00:00:55,750 --> 00:00:58,520 So let's assume the probability of getting a head-- 20 00:00:58,520 --> 00:01:01,280 the probability of failure-- is p. 21 00:01:01,280 --> 00:01:03,150 Again, this is not a fair coin. 22 00:01:03,150 --> 00:01:08,260 It's a coin that may be biased in either direction. 23 00:01:08,260 --> 00:01:10,570 And let's let F be the number of flips 24 00:01:10,570 --> 00:01:13,590 until the first head comes up-- the number of flips 25 00:01:13,590 --> 00:01:16,370 until the first failure. 26 00:01:16,370 --> 00:01:20,760 And if we're counting as flips as time, it's the time to fail. 27 00:01:20,760 --> 00:01:23,940 So what we'd like to know is what's the expectation of F? 28 00:01:23,940 --> 00:01:29,010 What's the expected number of flips before a head comes up? 29 00:01:29,010 --> 00:01:32,080 Well, in order to calculate that expectation, 30 00:01:32,080 --> 00:01:33,670 we need to know some probabilities. 31 00:01:33,670 --> 00:01:36,460 So what's the probability that F equals 1? 32 00:01:36,460 --> 00:01:38,586 Well that's the same as saying that that's 33 00:01:38,586 --> 00:01:40,210 getting a head on the first flip-- it's 34 00:01:40,210 --> 00:01:44,550 the probability that you get just an H on the first flip, 35 00:01:44,550 --> 00:01:46,560 and that has probability, P. 36 00:01:46,560 --> 00:01:48,920 What's the probability that F equals 2? 37 00:01:48,920 --> 00:01:51,915 Well that's the probability that you flip a tail and then 38 00:01:51,915 --> 00:01:52,870 a head. 39 00:01:52,870 --> 00:01:54,960 And that has probability q times p, 40 00:01:54,960 --> 00:01:57,390 because we're assuming the flips are independent, 41 00:01:57,390 --> 00:01:59,490 so it's the probability of a tail-- which 42 00:01:59,490 --> 00:02:02,590 is q-- times the probability of a head-- which is p. 43 00:02:02,590 --> 00:02:05,470 Similarly, the probability of F equals 44 00:02:05,470 --> 00:02:08,900 3 is the same as the probability of flipping tail, tail, head-- 45 00:02:08,900 --> 00:02:10,240 and it's q squared p. 46 00:02:10,240 --> 00:02:14,080 And of course, the probability density function 47 00:02:14,080 --> 00:02:17,250 of F-- the number of steps until you'd 48 00:02:17,250 --> 00:02:20,550 flip a head-- at n-- the probability 49 00:02:20,550 --> 00:02:24,110 that you have to flip n times before you get the first head-- 50 00:02:24,110 --> 00:02:26,090 is q at the n minus 1 p. 51 00:02:26,090 --> 00:02:29,810 By the way, a random variable whose probability density 52 00:02:29,810 --> 00:02:34,000 function is this value is called a geometric distribution. 53 00:02:34,000 --> 00:02:36,920 They come up all the time. 54 00:02:36,920 --> 00:02:40,150 All right, so what's the formula for the expectation of F? 55 00:02:40,150 --> 00:02:42,070 It's simply, of course, by definition, 56 00:02:42,070 --> 00:02:45,370 it's the sum over all the possible values of F-- 57 00:02:45,370 --> 00:02:48,010 which in this case are integers, n, greater than 0, 58 00:02:48,010 --> 00:02:51,090 of n times the probability that F equals n. 59 00:02:51,090 --> 00:02:53,640 We figured out that the probability that F equals n 60 00:02:53,640 --> 00:02:56,310 is q to n minus 1 times p. 61 00:02:56,310 --> 00:02:58,950 And now I'm going to observe that we really 62 00:02:58,950 --> 00:03:02,600 do know how to evaluate this sum easily enough. 63 00:03:02,600 --> 00:03:04,650 I'm going to factor out the p, and it 64 00:03:04,650 --> 00:03:07,060 becomes a sum over n greater than 0 of q 65 00:03:07,060 --> 00:03:08,970 to the n minus 1 times n. 66 00:03:08,970 --> 00:03:13,810 And then I can simplify matters if I replace n by n minus 1. 67 00:03:13,810 --> 00:03:16,720 And then I get a q to the n power. 68 00:03:16,720 --> 00:03:20,310 So this is equivalent to p times the sum over n 69 00:03:20,310 --> 00:03:24,600 greater than or equal to 0 of n plus 1 q to the n. 70 00:03:24,600 --> 00:03:28,190 Now this is a familiar generating function. 71 00:03:28,190 --> 00:03:32,780 It's simply equal to 1 over 1 minus q squared, 72 00:03:32,780 --> 00:03:34,890 as we've seen already. 73 00:03:34,890 --> 00:03:39,020 So in short, the expectation of F is p times 1 74 00:03:39,020 --> 00:03:42,124 over the square of 1 minus q. 75 00:03:42,124 --> 00:03:43,540 So let's put them together, there. 76 00:03:43,540 --> 00:03:47,330 Of course 1 minus q is p, so it's p times 1 77 00:03:47,330 --> 00:03:50,260 over p squared, or one over p. 78 00:03:50,260 --> 00:03:53,370 And we get this really very clean answer. 79 00:03:53,370 --> 00:03:56,360 The expected number of flips before you 80 00:03:56,360 --> 00:03:59,460 get a head is 1 over the probability of a head. 81 00:03:59,460 --> 00:04:02,670 So for example, with a fair coin, where p is 1/2, 82 00:04:02,670 --> 00:04:05,440 the expected number of flips until you get the first head 83 00:04:05,440 --> 00:04:07,680 is 2-- it's 1 over 1/2. 84 00:04:07,680 --> 00:04:10,290 If you had a biased coin, where the probability of getting 85 00:04:10,290 --> 00:04:15,500 a head was 1 in 3-- 1/3-- then, in fact, 86 00:04:15,500 --> 00:04:19,930 the expected number of flips until you got a head was 3. 87 00:04:19,930 --> 00:04:22,960 OK, that's a nice clean answer. 88 00:04:22,960 --> 00:04:25,070 But we got it in this way that doesn't really 89 00:04:25,070 --> 00:04:25,928 give much intuition. 90 00:04:28,830 --> 00:04:31,470 So let's look at another naive way 91 00:04:31,470 --> 00:04:34,585 to derive the expected time to the first head, 92 00:04:34,585 --> 00:04:36,710 without having to worry about generating functions, 93 00:04:36,710 --> 00:04:39,290 and all that sophisticated stuff about series-- which 94 00:04:39,290 --> 00:04:41,210 is easy to forget. 95 00:04:41,210 --> 00:04:44,440 So let's look at the outcome tree that 96 00:04:44,440 --> 00:04:46,270 describes this experiment of flipping 97 00:04:46,270 --> 00:04:47,770 until you get the first head. 98 00:04:47,770 --> 00:04:50,950 So starting at the root, with probability, 99 00:04:50,950 --> 00:04:53,720 p, you flip a head immediately and you stop. 100 00:04:53,720 --> 00:04:56,500 Or, with probability, q, you flip a tail, 101 00:04:56,500 --> 00:04:59,100 and then with probability, p, you finally flip the head 102 00:04:59,100 --> 00:05:00,760 and stop. 103 00:05:00,760 --> 00:05:05,960 If you haven't flipped the head by the end of the second step, 104 00:05:05,960 --> 00:05:10,400 then that is a possibility that, with probability, q, 105 00:05:10,400 --> 00:05:13,000 flipped a tail, and then there's a possibility 106 00:05:13,000 --> 00:05:16,680 that you stop after the third step, with a head, and so on. 107 00:05:16,680 --> 00:05:20,150 That's how this tree goes. 108 00:05:20,150 --> 00:05:23,030 Now, looking at the structure of this tree, 109 00:05:23,030 --> 00:05:25,340 it's an infinite tree, but it's very repetitive. 110 00:05:25,340 --> 00:05:29,180 In fact, if we call it the tree B, then what we're seeing 111 00:05:29,180 --> 00:05:32,300 is that this whole sub-tree is a copy of B. 112 00:05:32,300 --> 00:05:36,050 So now I have a nice recursive, but very finite, 113 00:05:36,050 --> 00:05:38,840 description of this whole infinite tree. 114 00:05:38,840 --> 00:05:42,710 B is a tree that has a left branch of p that 115 00:05:42,710 --> 00:05:46,510 ends in a leaf, or a right branch with probability, 116 00:05:46,510 --> 00:05:50,450 q, followed by a repeat of the tree, B. 117 00:05:50,450 --> 00:05:53,630 So now I can apply total expectation 118 00:05:53,630 --> 00:05:58,810 to find the expectation of F. F is the expected number of steps 119 00:05:58,810 --> 00:06:01,700 I make in this tree until I finally make the left branch 120 00:06:01,700 --> 00:06:04,670 to an H. How do I figure out what 121 00:06:04,670 --> 00:06:08,710 the expected time in the tree is until I make that left branch 122 00:06:08,710 --> 00:06:11,330 of flipping a head, finally? 123 00:06:11,330 --> 00:06:14,250 Well, using total expectation, what I can do 124 00:06:14,250 --> 00:06:19,250 is branch on whether or not the first flip is a head. 125 00:06:19,250 --> 00:06:21,990 So the expectation of F, according 126 00:06:21,990 --> 00:06:24,870 to the total expectation is going 127 00:06:24,870 --> 00:06:27,670 to be the expectation of F, given 128 00:06:27,670 --> 00:06:30,800 that the first flip is a head, times the probability, p, 129 00:06:30,800 --> 00:06:32,370 that it is a head. 130 00:06:32,370 --> 00:06:37,840 And the other term is that it's the expectation of F, given 131 00:06:37,840 --> 00:06:42,100 that the first flip is a tail times q, the probability 132 00:06:42,100 --> 00:06:45,040 that the first flip is a tail. 133 00:06:45,040 --> 00:06:47,670 Well, first of all, let's look at this term. 134 00:06:47,670 --> 00:06:50,830 What is the expected number of flips 135 00:06:50,830 --> 00:06:53,010 until you get a head, given that you got a head? 136 00:06:53,010 --> 00:06:54,810 Well, it's 1. 137 00:06:54,810 --> 00:06:56,690 So this term is easily taken care of. 138 00:06:56,690 --> 00:06:58,500 We understand that one. 139 00:06:58,500 --> 00:06:59,470 What about this term? 140 00:06:59,470 --> 00:07:02,730 This is the expected number of flips 141 00:07:02,730 --> 00:07:05,100 until you get the first tail. 142 00:07:09,390 --> 00:07:11,630 Sorry-- it's the expected number of flips 143 00:07:11,630 --> 00:07:14,620 until you get the first head, given that your first step was 144 00:07:14,620 --> 00:07:15,750 a tail. 145 00:07:15,750 --> 00:07:19,455 Well what that means is that we are here after the tail, 146 00:07:19,455 --> 00:07:22,450 and we're asking, what's the expectation 147 00:07:22,450 --> 00:07:24,730 of F-- the number of flips that you 148 00:07:24,730 --> 00:07:28,780 get starting at B-- when you do one flip that takes you 149 00:07:28,780 --> 00:07:30,890 to the start of B again? 150 00:07:30,890 --> 00:07:36,140 And the answer is, obviously, 1 plus the expected number 151 00:07:36,140 --> 00:07:39,740 of flips in B, which is expectation 152 00:07:39,740 --> 00:07:43,400 of F. In short, this term-- the expectation of F, given 153 00:07:43,400 --> 00:07:46,470 that the first flip is a tail, is simply 154 00:07:46,470 --> 00:07:49,320 1 plus the expected number of flips 155 00:07:49,320 --> 00:07:50,770 that we're trying to figure out. 156 00:07:50,770 --> 00:07:54,610 Now look at this-- I have a very simple arithmetic formula now. 157 00:07:54,610 --> 00:08:00,340 Expectation of F is 1 times p plus 1 plus F times q. 158 00:08:00,340 --> 00:08:01,460 There it is. 159 00:08:01,460 --> 00:08:04,610 And now I can solve for E of F. Well, 160 00:08:04,610 --> 00:08:06,370 just taking a quick look at this, 161 00:08:06,370 --> 00:08:11,090 this is going to yield a q term, and p plus q is 1. 162 00:08:11,090 --> 00:08:14,490 And this is going to be q times E of F, and there's an E of F 163 00:08:14,490 --> 00:08:15,780 there. 164 00:08:15,780 --> 00:08:18,170 If I pull the E of F over, I'm going 165 00:08:18,170 --> 00:08:19,870 to do a little arithmetic-- I'm going 166 00:08:19,870 --> 00:08:22,280 to leave the rest to you, and realize, again, 167 00:08:22,280 --> 00:08:26,070 that the answer is what we had before, the expectation of F 168 00:08:26,070 --> 00:08:29,240 is equal to 1 over p. 169 00:08:29,240 --> 00:08:32,569 So let's do one more silly example for fun 170 00:08:32,569 --> 00:08:36,580 to remember what the significance of 1 over p is. 171 00:08:36,580 --> 00:08:39,320 [? Suppose we think ?] about the space station Mir. 172 00:08:39,320 --> 00:08:42,860 Now, it's spinning around, and there's a lot of garbage 173 00:08:42,860 --> 00:08:45,540 out there that it's likely to hit-- a lot of space junk. 174 00:08:45,540 --> 00:08:49,260 And suppose that, based on our previous statistics 175 00:08:49,260 --> 00:08:54,080 and estimations of the small stuff that has been hitting Mir 176 00:08:54,080 --> 00:08:59,120 that it could survive, that we estimate that there's about a 1 177 00:08:59,120 --> 00:09:03,030 in 150,000 chance that in any given hour, 178 00:09:03,030 --> 00:09:07,970 it's going to run into some intolerable 179 00:09:07,970 --> 00:09:11,850 collision with space junk, or a meteor that's going to destroy. 180 00:09:11,850 --> 00:09:14,530 So suppose the space station Mir has a 1 181 00:09:14,530 --> 00:09:18,700 in 150,000 chance of destruction in any given hour. 182 00:09:18,700 --> 00:09:21,610 So how many hours do we expect until destruction? 183 00:09:21,610 --> 00:09:25,960 Well, it's 1 of over 150,000, 150,000 hours, 184 00:09:25,960 --> 00:09:27,633 which [INAUDIBLE] 185 00:09:30,550 --> 00:09:32,850 So much for silly space station examples. 186 00:09:32,850 --> 00:09:35,341 Let's wrap up with an intuitive argument that could be made 187 00:09:35,341 --> 00:09:37,840 rigorous, but I'm not going to, because I think it's clearer 188 00:09:37,840 --> 00:09:41,820 just left in this informal way that makes it intuitive 189 00:09:41,820 --> 00:09:44,000 why you would expect that, of course, 190 00:09:44,000 --> 00:09:46,100 the expected time to failure is 1 191 00:09:46,100 --> 00:09:49,600 over the probability of failing on a given flip. 192 00:09:49,600 --> 00:09:53,380 Well, how many failures do we expect in one try? 193 00:09:53,380 --> 00:09:57,340 Well, by definition, it's the expectation 194 00:09:57,340 --> 00:10:01,030 of getting a head on the first flip-- it's p. 195 00:10:01,030 --> 00:10:05,500 OK, now if you flip n times, you expect 196 00:10:05,500 --> 00:10:11,570 to get n times as many failures as you'd expect in one try. 197 00:10:11,570 --> 00:10:15,790 So the expected number of fails in n tries is n times p. 198 00:10:15,790 --> 00:10:17,110 That's an intuitive argument. 199 00:10:17,110 --> 00:10:19,190 In fact, it's the rigorously correct argument. 200 00:10:19,190 --> 00:10:21,850 Remember that if we flip n times, 201 00:10:21,850 --> 00:10:24,720 we're counting the number of heads and flips-- 202 00:10:24,720 --> 00:10:26,490 that's a binomial distribution we already 203 00:10:26,490 --> 00:10:28,440 figured out in a couple of ways-- 204 00:10:28,440 --> 00:10:30,170 that it's expectations is n p. 205 00:10:30,170 --> 00:10:32,310 But never mind that. 206 00:10:32,310 --> 00:10:34,600 I think it's intuitively clear that if you expect 207 00:10:34,600 --> 00:10:38,390 p heads in one try, and you do n tries that are all independent, 208 00:10:38,390 --> 00:10:44,700 you're going to expect to get n times p failures-- or heads. 209 00:10:44,700 --> 00:10:48,850 Now, what's the expected number of tries between failures? 210 00:10:48,850 --> 00:10:52,560 Well if you think about that, I've done n tries, 211 00:10:52,560 --> 00:10:57,190 and I've got n p failures, so if I divide the number of tries 212 00:10:57,190 --> 00:11:00,500 by the number of failures, that, by definition, is 213 00:11:00,500 --> 00:11:02,810 the average time between the failures. 214 00:11:02,810 --> 00:11:05,170 It's the expected time to a failure. 215 00:11:05,170 --> 00:11:08,650 So I divide the number of tries by the number of fails-- which, 216 00:11:08,650 --> 00:11:11,510 by definition, is the average number of tries 217 00:11:11,510 --> 00:11:16,370 between failures, and it's equal to n over n p, which 218 00:11:16,370 --> 00:11:18,260 is equal to 1 over p. 219 00:11:18,260 --> 00:11:22,360 And that's an argument that I hope you will remember.