The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I guess we should start. This is the last of these lectures. The final will be next Wednesday, as I hope you all know by this time, in the ice rink, whatever that means. And there was some question about how many sheets of paper you could bring in as crib sheets. And it seems like the reasonable thing is four sheets, which means you can bring in the two sheets you made up for the quiz plus two more. Or you can make up four new ones if you want, or do whatever you want. I don't think it's very important how many sheets you bring in, because I've never seen anybody referring to their sheets. I mean, it's a good way of organizing what you know, to try to put it on four sheets of paper.

I want to mostly review what we've done throughout the term, with a few more general comments thrown in. I thought I'd start with martingales, because we didn't completely finish what we wanted to talk about last time. The strong law of large numbers was left slightly hanging. And I want to show you how to do that in a little better way, and also show you that it's a more general theorem than it appears to be at first sight.

So let's go with martingales. The basic definition: a sequence of random variables is a martingale if, for all elements of the sequence, the expected value of Z sub n, given all of the previous values, is equal to the random variable Z sub n minus 1. Remember, and we've talked about this a number of times, when you're talking about the expected value of one random variable, given a bunch of other random variables, you're only taking the expectation over the first part. You're only taking the expectation over Z sub n. And the other quantities are still random variables.
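In symbols, the definition being described reads as follows (a LaTeX sketch, consistent with the lecture's notation):

```latex
% Martingale: for all n >= 2 (with E[|Z_n|] finite),
\mathbb{E}\left[ Z_n \mid Z_{n-1}, Z_{n-2}, \ldots, Z_1 \right] = Z_{n-1} .
```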
Namely, you have an expected value of Z sub n for each sample value of Z sub n minus 1, all the way down to Z 1. And what the definition says is, it's a martingale only if, for all sample values of those earlier variables, the expected value is equal to the sample value of the most recent one. Namely, the memory is all contained right in this last term, effectively, at least as far as expectation is concerned. The memory might be far broader than that for everything else.

And the first thing we did with martingales is we said the expected value is the same if you're only given part of the history. If you're only given the history from i back to 1, where i is less than n, that expected value is equal to Z sub i. So no matter where you start going back, the expected value of Z sub n is the most recent value that is given. So if the most recent value given is Z1, then the expected value of Zn, given Z1, is Z1. And along with that, you have the relationship that the expected value of Zn is equal to the expected value of Zi, just by taking the expected value over Z sub i. So all of that's sort of straightforward.

We talked a good deal about the increments of a martingale. The increments, X sub n equals Z sub n minus Z sub n minus 1, are very much like the increments that we have with a renewal process, or a Poisson process. All of these processes we talked about, we can define in various ways. And here we can define a martingale in two ways also. One is by the actual martingale itself, which is, in a sense, the sum of the increments. And the other way is in terms of the increments. And the increments satisfy the property that the expected value of X sub n, given all the earlier values, is equal to 0. Namely, no matter what all the earlier values are, X sub n has mean 0, in order for this to be a martingale.

A good special case of this is where X sub n is equal to U sub n times Y sub n, where the U sub n are IID, equiprobably 1 and minus 1, and the Y sub n are anything you want them to be. It's just that the U sub n have to be independent of the Y sub n.
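Here's a minimal simulation sketch of that special case, with the Y sub n taken to be exponential purely for illustration (any distribution independent of the U sub n would do). It checks numerically that the increments average to 0, consistent with E[X_n | past] = 0:

```python
import random

random.seed(1)

def increment():
    # U is +1 or -1 with probability 1/2 each; Y is an arbitrary random
    # variable independent of U (exponential here, purely for illustration).
    u = random.choice([+1, -1])
    y = random.expovariate(1.0)
    return u * y

# The average of many sample increments should be near 0, consistent
# with the martingale property E[X_n | X_{n-1}, ..., X_1] = 0.
samples = [increment() for _ in range(200_000)]
print(sum(samples) / len(samples))  # close to 0.0
```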
So I think this shows that, in fact, martingales are really a pretty broad class of things. And they were invented to talk about fair gambling games, where they wanted to give the gambler the opportunity to do whatever he wanted to do, but the game itself was defined in such a way that, no matter what you do, the game is fair. You establish bets in whatever way you want to. And when you wind up with it, the expected value of X sub n, given the past, is always 0. And that's equivalent to saying the expected value of Z sub n, given the past, is equal to Z sub n minus 1.

Examples we talked about are zero-mean random walks and products of unit-mean IID random variables. So there are both these product martingales, and there are these sum martingales. And those are just two simple examples, which come up all the time.

Then we talked about submartingales. A submartingale is like a martingale, except it grows with time. And we're not going to talk about supermartingales, because a supermartingale is just a negative submartingale. So we don't have to talk about that. A martingale is a submartingale. So anything you know about submartingales applies to martingales also. So you can state theorems for submartingales, and they apply to martingales just as well. You can very often say stronger things about martingales.

And then we have the same theorem for submartingales. Now, that should say, and it did say, until my evil twin got hold of it: if Zn is a submartingale, then for n greater than i, greater than 0, this expected value is greater than or equal to Zi. And the expected value of Zn is greater than or equal to the expected value of Zi. In other words, this theorem for submartingales is the same as the corresponding theorem for martingales, except now you have inequalities there, just like you have inequalities in the definition of a submartingale. So there's nothing strange there.
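Written out, the submartingale definition and the theorem just stated are (again a sketch in the lecture's notation):

```latex
% Submartingale: E[Z_n | Z_{n-1}, ..., Z_1] >= Z_{n-1} for all n.
% Theorem: for n > i > 0,
\mathbb{E}\left[ Z_n \mid Z_i, Z_{i-1}, \ldots, Z_1 \right] \ge Z_i,
\qquad
\mathbb{E}[Z_n] \ge \mathbb{E}[Z_i] .
```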
Then we found out that, if you have a convex function h from the reals into the reals, then Jensen's inequality says that the expected value of h of X is greater than or equal to h of the expected value of X. We showed a picture for that, you remember. There's a convex curve, and there's some straight line tangent to it. And what Jensen's inequality says is, if you take the expectation of h of X, you're somewhere above the line. And if you take the expectation first and then apply h, you're sitting on the line.

So if h is convex, that's what Jensen's inequality is. And it follows from that that, if Zn is a submartingale, and that includes martingales, and h is convex and the expected value of h of Zn is finite, then h of Zn is a submartingale also. In other words, if you have a martingale Z sub n, the absolute value of Z sub n is a submartingale. E to the r Zn is a submartingale. Use whatever convex function you want to, and you wind up with martingales going into submartingales. You can't get out of the range of submartingales that easily.

We then talked about stopped martingales and stopped submartingales. We said a stopped process is for a possibly defective stopping time. Now, you remember what a stopping time is? A stopping time is a random variable, which is a function of everything that takes place up until the time of stopping. And you have to look at the definition carefully, because stopping time comes up in too many places to just say it and understand what it means. But it's clear what it means if you view yourself as an observer watching a sequence of random variables, of sample values of random variables, one after another. And after you see a certain number of random variables, your rule says, stop. And then you don't observe any more. So you just observe this finite number, and then you stop at that point. And then you're all done. If it's a possibly defective stopping rule, then you might keep on going forever, or you might stop. You don't know what you're going to do.
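A compact way to state that definition (a sketch; "possibly defective" means J can be infinite with positive probability):

```latex
% J is a stopping time for Z_1, Z_2, ... if, for each n >= 1,
\mathbb{I}\{J = n\} \ \text{is a function of} \ Z_1, \ldots, Z_n \ \text{alone} .
```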
The stopped process Z sub n star is a little different from what we were doing before. Before, what we were doing is, we were sitting there observing this process. At a certain point, the stopping rule said stop. And before, we were very obedient. And when the stopping rule told us to stop, we stopped. Now, since we know a little more, we question authority a little more. And when the stopping rule says stop, we break things into two processes. There's the original process, which keeps on going. And there's this stopped process, which just stops.

And it's convenient to have a stopped process instead of just a stopping rule. Because with a stopped process, you can look at any time into the future, and if it's already stopped, you know what the stopped value is. You know what it was when it stopped. You don't necessarily know when it stopped, by looking at it in the future. But you know that it did stop.

So the stopped process, well, it says here what it is. It satisfies: the stopped value at time n is equal to Z sub n, if n is less than or equal to the stopping time J; and Z sub n star is equal to Z sub J, if n is greater than J. So you get up to the stopping time, and you stop. And then it just stays fixed forever after.

And the nice theorem there is that the stopped process for a submartingale, with a possibly defective stopping rule, is a submartingale again. What that means is, it's just a concise way of writing: the stopped process for a martingale is a martingale in its own right, and the stopped process for a submartingale is a submartingale in its own right.

So the convenient thing is, you can take a martingale, you can stop it, and you still have a martingale. And everything you know about martingales applies to this stopped process.
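Here's a small simulation sketch of a stopped martingale, assuming a simple symmetric random walk and a threshold-crossing stopping rule (the thresholds +10 and -10 are made up for illustration). It illustrates the theorem below, that E[Z star sub n] stays at E[Z1] = 0:

```python
import random

random.seed(2)

def stopped_walk(n_steps, upper=10, lower=-10):
    """Return Z*_n for a +/-1 random walk stopped at the first threshold crossing."""
    z = 0
    for _ in range(n_steps):
        z += random.choice([+1, -1])
        if z >= upper or z <= lower:
            return z  # stopped: Z*_m = Z_J for all m >= J
    return z          # not yet stopped: Z*_n = Z_n

# E[Z*_n] should be near 0 for every n, matching E[Z_1] = 0.
trials = [stopped_walk(200) for _ in range(100_000)]
print(sum(trials) / len(trials))  # close to 0.0
```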
So we're getting to the point where, starting out with a martingale, we can do lots of things with it. And that's the whole mathematical game. With a mathematical game, you build up theorems from nothing. As an experimentalist or an engineer, you sort of try to figure out those things from the reality around you. Here, we're just building it up.

And the other part of that theorem says that the expected value of Z1 is less than or equal to the expected value of Zn star, which is less than or equal to the expected value of Zn, for a submartingale. And they're all equal for a martingale. In other words, the marginal expectations for a martingale start out at the expected value of Z1, and they stay there. And for the stopped process, they stay at that same value.

And that's not too surprising. Because if you have a martingale, and you go until you reach the stopping point, then from that stopping point on, the martingale has mean-0 increments. Not the martingale itself, but the increments of the martingale have mean 0, from that point on. And the stopped process has mean-0 increments as well. In other words, for the stopped process, the increments are actually 0 after stopping, whereas for the original process, the increments wobble around, but they still have mean 0. So this is a very nice and useful thing to know.

If you look at this product martingale, Z sub n is e to the r S n minus n gamma of r, why is that a martingale? How do you know it's a martingale? Well, you look at the expected value of it. The expected value of e to the r S n is the moment generating function of S sub n, evaluated at r. And that moment generating function is just e to the n gamma of r. So this is clearly something which should be a martingale, because it just stays at that level all along.
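The one-step calculation behind that, written out (with S_n = X_1 + ... + X_n for IID X_i, and gamma(r) = ln E[e^{rX}]):

```latex
\mathbb{E}\left[ e^{rS_n - n\gamma(r)} \mid S_{n-1}, \ldots, S_1 \right]
 = e^{rS_{n-1} - n\gamma(r)} \, \mathbb{E}\left[ e^{rX_n} \right]
 = e^{rS_{n-1} - n\gamma(r)} \, e^{\gamma(r)}
 = e^{rS_{n-1} - (n-1)\gamma(r)} .
```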
If you have a stopping rule, such as a threshold crossing, then you've got a stopped martingale. And subject to some little mathematical nitpicks, which the text talks about, this leads you to the much more general version of Wald's identity, which says that the expected value of Z at the time of stopping is 1. That is, the expected value of e to the r S J minus J gamma of r equals 1. This, you remember, is what Wald's identity was when we were just talking about random walks. And this is a more general version, because it's talking about general stopping rules, instead of just thresholds. But it does have these little mathematical nitpicks in it, which I'm not going to talk about here.

Then we have Kolmogorov's submartingale inequality. We talked about all of these things last time, so we're going pretty quickly through them. The submartingale inequality is really the Markov inequality souped up. And what it says is, if you have a non-negative submartingale, and that can include a non-negative martingale, then for any positive integer m, the probability that the maximum of the Z sub i, for i from 1 to m, is greater than or equal to a, is less than or equal to the expected value of Z sub m over a.

You see, all that the Markov inequality says is that the probability that Z sub m is greater than or equal to a is less than or equal to this. This puts a lot more teeth into it, because it lets you talk about all of these random variables, up until time m. And it says the maximum of them satisfies this inequality. I mean, we always knew that the Markov inequality was very, very weak. And this is also pretty weak. But it's not quite as weak, because it covers a lot more things.

If you have a non-negative martingale-- this is submartingales, this is martingales. You see, here, with submartingales, the expected value of Z sub m keeps increasing with m. So there's a trade-off between making m large and not making m large. If you're dealing with a martingale, then the expected value of Z sub m is constant over all time. It doesn't change. And therefore, you can take this inequality here.
You can go to the limit, as m goes to infinity. And you wind up with: the probability that the sup of Z sub n is greater than or equal to a, is less than or equal to the expected value of the first of those random variables, the expected value of Z1, divided by a.

So this looks like a very powerful inequality. It turns out that I don't know many applications of it, and I don't know why. It seems like it ought to be very useful. But I know one reason, which is what I'm going to show you next, which is how you can really use the submartingale inequality to make it do an awful lot of things that you wouldn't imagine it could do otherwise.

First, you go to the Kolmogorov version of the Chebyshev inequality. This has the same relationship to the Kolmogorov submartingale inequality as Chebyshev has to Markov. Namely, what you do is, instead of looking at the random variables Z sub n, you look at the random variables Z sub n squared. And what do we know now? If Z sub n is a martingale or a submartingale, what is Z sub n squared? Well, the only thing we can be sure of is that Z sub n squared is a submartingale. But if it's a submartingale, then we can apply this inequality again. And what it tells us, in this case, is about the maximum of the magnitudes of these random variables: the probability that the maximum is greater than or equal to b is less than or equal to the expected value of Z sub m squared over b squared.

So before, just like the Markov inequality, the Markov inequality only works for non-negative random variables. You go to the Chebyshev inequality, because that works for negative or positive random variables. So that makes it kind of neat. And then what you have is this thing, which goes down as 1 over b squared, which looks a little stronger. But that's not the real reason that you want to use it.

Now, this inequality here only works for the first m values of this random variable. What we're usually interested in here is what happens as m gets very large.
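For reference, the two maximal inequalities just described, written out (the submartingale inequality, and then the Chebyshev-style version applied to Z sub n squared):

```latex
\Pr\left\{ \max_{1 \le i \le m} Z_i \ge a \right\} \le \frac{\mathbb{E}[Z_m]}{a}
\quad \text{(non-negative submartingale)},
\qquad
\Pr\left\{ \max_{1 \le n \le m} |Z_n| \ge b \right\} \le \frac{\mathbb{E}\left[ Z_m^2 \right]}{b^2} .
```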
As m gets very large, that expected value of Z sub m squared very often blows up. So this inequality does not really do what you would like an inequality to do. So what we're going to do is, first, we're going to say, if you have this inequality here, then you can lower bound the left side by taking the maximum, not over 1 up to m, but only over m over 2 up to m. Now, why do we want to do that? Well, hold on and you'll see. But anyway, this is greater than or equal to that.

So what we're going to do now is, we're going to take this inequality, and we're going to use it for m equals 2 to the k, m equals 2 to the k plus 1, m equals 2 to the k plus 2, all the way up to infinity. And so we're going to find the probability of the union, over j greater than or equal to k, of this quantity here, but now just maximized over n, for 2 to the j minus 1 less than n, less than or equal to 2 to the j: the maximum of Z sub n, greater than or equal to... and now, for each one of these j's here, we'll put in whatever b sub j we want. So the general form of this inequality then becomes: we have this term on the left, we use the union bound, and we get this term on the right.

So at this point, we have an inequality which works for all n, instead of just for values smaller than some given amount. So this is sort of a general technique for taking an inequality which only works up to a certain value, and extending it so it works over all values. You have to be pretty careful about how you choose b sub j.

Now, what we're going to do is say, OK. And remember what is happening here. We started out with a submartingale or a martingale. When we take Z n squared, we still have a submartingale. So we can use the submartingale inequality, which is what we're doing here. We're using the submartingale inequality on Z m squared, rather than on Z m. And Z m squared is non-negative, so that works there. Then we go down to this point. We take a union over all of these terms.
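Assembled with the union bound, the resulting inequality reads (a sketch, with arbitrary positive constants b sub j, one for each j):

```latex
\Pr\left\{ \bigcup_{j \ge k} \left\{ \max_{2^{j-1} < n \le 2^{j}} |Z_n| \ge b_j \right\} \right\}
 \le \sum_{j \ge k} \frac{\mathbb{E}\left[ Z_{2^j}^2 \right]}{b_j^2} .
```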
And note what happens. Every n is included in one of these terms, every n beyond 2 to the k. So if we want to prove something about the limiting values of Z sub n, we have everything included there, everything beyond 2 to the k. But as far as the limit is concerned, you don't care about any initial finite set. You care what happens after that initial finite set. So what we have, then, is the probability of the union of these terms, less than or equal to this sum.

When I apply this to a random walk, S sub n: S sub n is a martingale here, so S sub n squared is a submartingale, at this point. The expected value of X squared, we'll assume, is sigma squared. And S sub n, or Z sub n as we'll call it, is the sum of these n IID random variables. So the expected value--

AUDIENCE: 10 o'clock.

PROFESSOR: The expected value of S sub 2-to-the-J squared is just 2 to the J times the expected value of X squared, in other words, sigma squared. We're just doing this for a zero-mean random variable, because given an arbitrary non-zero-mean random variable, you can look at it as the mean plus a random variable which is zero mean. So that's the same idea we're using here.

So we take this inequality now, and I'm going to use, for b sub J, 3/2 to the J. Why 3/2 to the J? Well, you'll see in just a second. But when I use 3/2 to the J here, I get the maximum of S sub n, greater than or equal to 3/2 to the J. And over here, I get b sub J squared, which is 9/4 to the J. And here I have 2 to the J also. So when I sum this over j greater than or equal to k, it winds up with 8/9 to the k, times 9 sigma squared.
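The arithmetic in that sum, for the record:

```latex
\sum_{j \ge k} \frac{\mathbb{E}\left[ S_{2^j}^2 \right]}{b_j^2}
 = \sum_{j \ge k} \frac{2^j \sigma^2}{(9/4)^j}
 = \sigma^2 \sum_{j \ge k} \left( \tfrac{8}{9} \right)^{j}
 = 9 \sigma^2 \left( \tfrac{8}{9} \right)^{k} .
```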
So what I have now is something where, when k gets larger, this term is going to 0. And I have something over here; well, that doesn't look quite so attractive, but just wait a minute. What I'm really interested in is not S sub n, but S sub n over n. For the strong law of large numbers, I'd like to show that S sub n over n approaches a limit. And n, in this case, runs between 2 to the J minus 1 and 2 to the J. So when I put that in here, we'll see what that amounts to in the next slide.

For the strong law of large numbers, what our theorem says is that the set of sample points for which S sub n over n approaches 0 has probability 1. So for the proof of that, I pick up this equation from the previous slide. And when I lower bound the left side of this, I'm going to divide by n here. And I'm going to divide by something a little bit smaller, which is 2 to the J minus 1, here. So I get the maximum of S sub n over n, greater than or equal to 2 times 3/4 to the J.

Now you see why I picked... I think you see, at this point, why I picked b sub j the way I did. I wanted to pick it to be smaller than 2 to the J. And I wanted to pick it to be big enough that it drove the right-hand term to 0.

So now we're done, really. Because if I look at this expression here, a sample sequence S sub n of omega that's not contained in this union has S sub n of omega over n approaching 0. Because these terms, for n from 2 to the J minus 1 to 2 to the J, in order to be in this set, have to be greater than or equal to 2 times 3/4 to the J. As j gets larger and larger, this term goes to 0. So the only terms that exceed that are terms that are arbitrarily small.

So this union contains all the sample points for which S sub n over n does not approach 0. But the probability of the union is at most 8/9 to the k, times some garbage over here. And that's true for all k. The sample points for which S sub n over n approaches 0 are all complementary to this set. So the probability that S sub n of omega over n approaches 0 is greater than or equal to 1 minus this quantity here. That's true for all k. And since it's true for all k, and this term goes to 0, the theorem is proven.
Now, why did I want to go through this? There are perhaps easier ways to prove the strong law of large numbers, just assuming that the variance is finite. Why this particular way? Well, if you look at this, it applies to much more than just sums of IID random variables. It applies to arbitrary martingales, so long as these conditions are satisfied. It applies to these cases, like where you have a random variable which is plus or minus 1 times some arbitrary random variable. So this gives you sort of a general way of proving strong laws of large numbers for strange sequences of random variables.

So that's the reason for going through this. We now have a way of proving strong laws of large numbers for lots of different kinds of martingales, rather than just for this set of things here.

So let's move on, back to Markov chains, countable or finite state. I'm moving back to chapters three and five in the text, mostly chapter five, and trying to finish some sort of review of what we've done. When I look back at what we've done, it seems like we've proven an awful lot of theorems. So all I can do is talk about the theorems.

I should say something, again, on this last lecture, about why we spend so much time proving theorems. In other words, we've just proven a theorem here. I promised you I would prove a theorem every lecture, along with talking about why they're important, and so on. And most of you are engineers, or you're scientists in various fields. You're not mathematicians. Why should you be interested in all these theorems? Why should you take abstract courses, which look like math courses?

And the reason is, this kind of stuff is more important for you than it is for mathematicians. And it's more important for you because, when you're dealing with a real engineering or real scientific problem, how do you deal with it? I mean, you have a real mess facing you. You spend a lot of time trying to understand what that mess is all about. And you don't form a model of it and then apply theorems.
What you do is try to understand it. You look at multiple models. When we were looking at hypothesis testing, we said we're going to assume a priori probabilities. I lied about that a little bit. We weren't assuming a priori probabilities. We were assuming a class of probability models, each of which had a priori probabilities in them. And then we said something about that class of probability models. And by saying something about that class of probability models, we were able to say a great deal more than you can say if you refuse to even think about a model that has a priori probabilities in it.

So by looking at lots of different models, you can understand an enormous number of things, without really having any one model which describes the whole situation for you. And that's why we try to prove theorems for models. Because then, when you understand lots of simple models, you take these complicated physical situations, and you play with them. You play with them by applying various simple models that you understand to them. And as you do this, you gradually understand the physical process better. And that's the way we discover things. OK, end of lecture. Not end of lecture, but end of partial lecture about why you want to learn some mathematics.

The first-passage time from state i to j, remember, is the smallest n, when you start off in state i, at which you get to state j. You start off in state i. You jump from one lily pad to another. You eventually wind up at lily pad number j. And we want to know how long it takes you to get to j. That's a random variable, obviously. And this T sub ij is a possibly defective random variable. It has a probability mass function; this is the definition of what that probability mass function is. And it has a distribution function. And the probability mass function, you probably remember how we derived it.
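The recursion in question, as a sketch in the text's notation (f sub ij of n for the first-passage PMF, P sub ij for the transition probabilities):

```latex
f_{ij}(1) = P_{ij},
\qquad
f_{ij}(n) = \sum_{k \ne j} P_{ik} \, f_{kj}(n-1) \quad \text{for } n \ge 2 .
```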
We derived it by sort of crawling up on it, by looking at it, first, for n equals 1, in which case it's just a transition probability. Then for n equals 2 and beyond, it's the probability that you first go to some state k, and then, in n minus 1 steps, you go from k to j. But you have to leave j out of that sum, because if you go to j in the first step, you've already had your first passage.

So we define a state j to be recurrent if T sub jj is non-defective, and we define it to be transient otherwise. In other words, if it's not certain that you ever get back to state j, then you define it to be transient. If it is recurrent, it's positive recurrent if the expected value of T sub jj is less than infinity, and it's null recurrent otherwise.

How do we know how to analyze this? Well, we studied renewal processes. And you look at the renewal process where you've got a renewal every time you hit state j. You start out in state j. The first time you hit state j again, that's a renewal. The next time you hit state j, that's another renewal. You have a renewal process where the inter-renewal time is a random variable which has the PMF f sub jj of n.

Excuse me. If you have a renewal process, if you start in state j, where T sub jj is the amount of time before a renewal occurs, then from that time on, you get another renewal with another random variable with the same distribution as T sub jj. And f sub jj is the PMF of that renewal time, and F sub jj is the distribution function of it.

So then, when we define the state j as being recurrent, what we're really doing is going back to what we know about renewal processes, and saying a state of a Markov chain is recurrent if the renewal process that we define for that countable-state Markov chain has these various properties for this renewal random variable.

For each recurrent j, there's an integer renewal counting process, N sub jj of t. You start in state j at time 0. At time t, which is after t steps of the Markov chain, what you're interested in is how many times you have hit state j, up until time t.
That's the counting process we talk about in renewal theory. So N sub jj of t is the number of visits to j, starting in j. And it has the inter-renewal distribution F sub jj, which is that quantity up there. We have a delayed renewal counting process, N sub ij of t, if we count visits to j, starting in i.

We didn't talk much about delayed renewal processes, except for pointing out that when you have a delayed renewal process, it really is the same as a renewal process. It just has some arbitrary amount of time that's required to get to state j for the first time, and from then on, the renewals recur as before. Even if the expected time to get to j for the first time is infinite, and the expected time for renewals from j to j is finite, you still have this same renewal process. You can even lose an infinite amount of time at the beginning, and you amortize it over time. Don't ask me why you can amortize an infinite amount of time over time. But you can. And actually, if you read about delayed renewal processes, you see why you actually get that.

So all states in a class are positive recurrent, or all are null recurrent, or all are transient. We proved that theorem. It wasn't really a very hard theorem to prove. And you can sort of see that it ought to be true.

Then we defined the chain as being irreducible if all state pairs communicate. In other words, if, for every pair of states, there's a path that goes from one state to the other state. This is intuitively a simple idea if you have a finite-state Markov chain. If you have a countably infinite state Markov chain, it seems to be a little more peculiar. But it really isn't. For a countably infinite state Markov chain, every state has a finite number. And you can take every pair of states, you can identify them, and you can see whether there's a path going from one to the other. For all of these birth-death processes we've talked about, I mean, it's obvious whether the states all communicate or not.
You just see if there's any break in the chain at any point. And it really looks like a chain. It's a node, two transitions, another node, two transitions, another node. And that's just the way chains are supposed to work.

An irreducible class might be positive recurrent, it might be null recurrent, or it might be transient. And we've already seen what makes a state null recurrent or transient. And it's the same thing for the class. We started out by saying a state is either null recurrent, positive recurrent, or transient, depending on this renewal process associated with it. And now there's this theorem, which says that if one node in a class of states is positive recurrent, they all are.

And you ought to be able to sort of see the reason for that. If I have one state which is positive recurrent, it means that the expected time to go from this state back to this state is finite. Now, if I have some other state, I have to go from here to there. I can go through here and then off to there. So the expected amount of time it takes to get to there, and then from there back to there, is also finite, and the same backwards. So that was the way we proved this.

If we have an irreducible Markov chain-- now, this is the theorem you really use all the time; this sort of says how you operate with these things-- it says the steady-state equations, the equations you've used in half the problems you've done with Markov chains, have the following property. Remember, the Markov chain is defined in terms of the transition probabilities P sub ij. We solve these equations to find out what the steady-state probabilities pi sub j are. And the theorem says, if you can find a solution to those equations, where the pi sub j's have to add up to 1, then the solution is unique, and the pi sub j's are equal to 1 over the mean time to go from that state back to that state again.
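Those steady-state equations, written out (with T-bar sub jj denoting the mean recurrence time of state j):

```latex
\pi_j = \sum_{i} \pi_i P_{ij} \ \text{for all } j,
\qquad
\sum_{j} \pi_j = 1,
\qquad
\pi_j = \frac{1}{\overline{T}_{jj}} .
```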
And what does that mean? What that really gives you is not a way to find pi sub j. It gives you a way to find T sub jj. Because these equations are more often the way that you solve for the steady-state probabilities. And then that gives you a way to find the mean recurrence time between visits to this given state.

And what else does this theorem say? It says, if the states are positive recurrent, then the steady-state equations have a solution. So this is an if-and-only-if kind of statement. It relates these steady-state equations to positive recurrence, and says, if these equations have a solution, then the states are positive recurrent, and they satisfy all these relationships about mean recurrence time. And if the states are positive recurrent, then those equations have a solution. And in the solutions, the pi sub j's are all positive. It's an infinite set of equations, so you can't necessarily solve it. But you sort of know everything there is to know about it, at this point.

Well, there's one other thing: when you have a birth-death chain, these equations simplify a great deal.

The counting processes, under positive recurrence, have to satisfy this equation. And my evil twin brother got hold of this and left out the n in the copy that you have. And I spotted it when I looked at it just a little bit. He was still sleeping, so I managed to fix it. So it's corrected here. And what does that say?

It says, when you have positive recurrence, if you look from 0 out to time n, and you count the number of times that you hit state j, that's a random variable. N sub ij of n is the number of times you visit state j, starting in state i, from time 0 out to time n. You divide that by n, and you go to the limit. And there's a strong law of large numbers there, which was the strong law of large numbers for renewal processes, which says that it has a limit with probability 1. And this says that limit is pi sub j.
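Here's a quick simulation sketch of that limit for a made-up three-state birth-death chain (all the transition probabilities below are invented for illustration; they're not from the text). The long-run fraction of visits to each state approaches pi sub j:

```python
import random

random.seed(3)

# Toy birth-death chain on {0, 1, 2}: (next_state, probability) pairs,
# each row summing to 1. Invented numbers, purely for illustration.
P = {0: [(0, 0.50), (1, 0.50)],
     1: [(0, 0.25), (1, 0.25), (2, 0.50)],
     2: [(1, 0.50), (2, 0.50)]}

def step(state):
    r, acc = random.random(), 0.0
    for nxt, p in P[state]:
        acc += p
        if r < acc:
            return nxt
    return state  # guard against floating-point round-off

visits = [0, 0, 0]
state, n = 0, 1_000_000
for _ in range(n):
    state = step(state)
    visits[state] += 1

# For this chain, the birth-death formulas discussed below give
# pi = (0.2, 0.4, 0.4); the empirical fractions should be close.
print([round(v / n, 3) for v in visits])
```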
760 00:46:45,730 --> 00:46:49,340 I mean, visualize what happens. 761 00:46:49,340 --> 00:46:51,900 You start out in state j. 762 00:46:51,900 --> 00:46:55,030 For one unit of time, you're in state j. 763 00:46:55,030 --> 00:46:58,700 Then you go away from state j, and for a long time you're out 764 00:46:58,700 --> 00:47:00,220 in the wilderness. 765 00:47:00,220 --> 00:47:03,660 And then you finally get back to state j again. 766 00:47:03,660 --> 00:47:07,770 Think of a renewal reward process, where you get 1 unit 767 00:47:07,770 --> 00:47:12,200 of reward every time you're in state j and 0 reward every 768 00:47:12,200 --> 00:47:14,360 time you're not in state j. 769 00:47:14,360 --> 00:47:18,470 That means every interrenewal period, you pick 770 00:47:18,470 --> 00:47:19,830 up one unit of reward. 771 00:47:25,000 --> 00:47:27,065 Well, this is what that says. 772 00:47:32,290 --> 00:47:36,190 It says that the fraction of those visits to state j-- 773 00:47:40,800 --> 00:47:46,230 that out of the total visits in the Markov chain, the ones 774 00:47:46,230 --> 00:47:50,830 that go to state j have probability pi sub j. 775 00:47:50,830 --> 00:47:54,020 So again this is another relationship with these steady 776 00:47:54,020 --> 00:47:55,200 state probabilities. 777 00:47:55,200 --> 00:47:58,860 The steady state probabilities tell you what these mean 778 00:47:58,860 --> 00:48:00,890 recurrence times are. 779 00:48:00,890 --> 00:48:03,300 And that tells you what this is. 780 00:48:03,300 --> 00:48:08,060 This, in a sense, is the same as this. 781 00:48:08,060 --> 00:48:11,290 Those are just sort of the same results. 782 00:48:11,290 --> 00:48:14,540 So there's nothing special about it. 783 00:48:14,540 --> 00:48:18,340 We talked a little bit about the Markov model of the age of a 784 00:48:18,340 --> 00:48:22,750 renewal process. For any integer-valued renewal 785 00:48:22,750 --> 00:48:33,010 process, you can find a Markov chain which gives you the age 786 00:48:33,010 --> 00:48:35,010 of that process. 787 00:48:35,010 --> 00:48:37,660 You visualize being in state j. 788 00:48:37,660 --> 00:48:46,790 And you visualize being in state 0, of this Markov model, 789 00:48:46,790 --> 00:48:50,550 at the point where you have a renewal. 790 00:48:50,550 --> 00:48:56,670 One step later, if you have another renewal, that happens 791 00:48:56,670 --> 00:49:02,070 with probability P sub 00, you go back to state 0 again. 792 00:49:02,070 --> 00:49:04,400 If you don't have a renewal in the next time, 793 00:49:04,400 --> 00:49:06,580 you go to state 1. 794 00:49:06,580 --> 00:49:09,830 From state 1, you might go to state 2. 795 00:49:09,830 --> 00:49:13,200 When you're in state 2, it means you're two time units 796 00:49:13,200 --> 00:49:15,750 away from state 0. 797 00:49:15,750 --> 00:49:21,830 If you go back to state 0, it means you have a renewal in 798 00:49:21,830 --> 00:49:24,160 three time units. 799 00:49:24,160 --> 00:49:26,480 Otherwise you go to state 3. 800 00:49:26,480 --> 00:49:30,000 Then you might have a renewal and so forth. 801 00:49:30,000 --> 00:49:38,360 So for this very simple kind of Markov chain, this tells 802 00:49:38,360 --> 00:49:41,920 you everything there is to know, in a sense, about 803 00:49:41,920 --> 00:49:44,540 integer-valued renewal processes. 804 00:49:44,540 --> 00:49:48,820 So there's this nice connection between the two. 805 00:49:48,820 --> 00:49:52,510 And it lets you see pretty easily when you have 806 00:49:52,510 --> 00:49:53,380 null recurrence.
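That visit-fraction statement is easy to check numerically. A sketch, reusing the made-up three-state chain from the previous block: simulate the chain, count the visits to a state j, and watch the fraction approach pi sub j.

    import numpy as np

    rng = np.random.default_rng(0)
    P = np.array([[0.5, 0.5, 0.0],
                  [0.25, 0.5, 0.25],
                  [0.0, 0.5, 0.5]])   # same illustrative chain as before

    n_steps, j = 200_000, 1
    state, visits = 0, 0
    for _ in range(n_steps):
        state = rng.choice(3, p=P[state])  # one transition of the chain
        visits += (state == j)             # reward 1 whenever we are in state j

    # By the strong law for renewal processes, this converges to pi_j,
    # which is 0.5 for this particular chain.
    print("N_j(n)/n =", visits / n_steps)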
807 00:49:53,380 --> 00:49:55,280 Now we spend a lot of time talking about these 808 00:49:55,280 --> 00:49:58,800 birth-death Markov chains. 809 00:49:58,800 --> 00:50:03,400 And the easy way to solve for birth-death Markov chains 810 00:50:03,400 --> 00:50:10,490 is to say intuitively that between any two adjacent 811 00:50:10,490 --> 00:50:14,370 states, the number of times you go up has to equal the 812 00:50:14,370 --> 00:50:16,980 number of times you go down, plus or minus 1. 813 00:50:16,980 --> 00:50:21,000 If you start out here and you end up here, you're going this 814 00:50:21,000 --> 00:50:25,680 way one more time than you've gone that way and vice versa. 815 00:50:25,680 --> 00:50:29,660 And combining that with the steady state equations that we 816 00:50:29,660 --> 00:50:34,600 now have been talking about, it must be that the steady 817 00:50:34,600 --> 00:50:37,900 state probability pi sub i times P sub i-- 818 00:50:37,900 --> 00:50:42,000 pi sub 2 times P sub 2 here is the probability of going from 819 00:50:42,000 --> 00:50:43,510 state 2 to state 3. 820 00:50:43,510 --> 00:50:46,850 It's the probability of being in state 2 and making a 821 00:50:46,850 --> 00:50:49,760 transition to state 3. 822 00:50:49,760 --> 00:50:53,820 This probability here is the probability of being in state 823 00:50:53,820 --> 00:50:59,580 3 and going to state 2. 824 00:50:59,580 --> 00:51:02,540 And we're saying that asymptotically, as you look 825 00:51:02,540 --> 00:51:05,330 over an infinite number of transitions, those two have to 826 00:51:05,330 --> 00:51:07,040 be the same. 827 00:51:07,040 --> 00:51:10,190 The other way to do it, if you like algebra, is to start out 828 00:51:10,190 --> 00:51:11,750 with the steady state equations. 829 00:51:11,750 --> 00:51:14,750 And you can derive this right away. 830 00:51:14,750 --> 00:51:16,710 I think it's nicer to see intuitively 831 00:51:16,710 --> 00:51:19,100 why it has to be true. 832 00:51:19,100 --> 00:51:26,850 And what that says is, if rho sub i is equal to P sub i over 833 00:51:26,850 --> 00:51:34,200 Q sub i plus 1, P sub i is the up transition probability. 834 00:51:34,200 --> 00:51:37,920 Q sub i is the down transition probability. 835 00:51:37,920 --> 00:51:45,330 Rho sub i is then the ratio of the two adjacent state probabilities, pi sub i plus 1 over pi sub i. 836 00:51:45,330 --> 00:51:50,330 And that's equal to this equation here. 837 00:51:50,330 --> 00:51:52,830 That's just how to calculate these things. 838 00:51:52,830 --> 00:51:54,080 And you've done that. 839 00:51:56,800 --> 00:51:59,185 Let's go on to Markov processes. 840 00:52:02,250 --> 00:52:04,990 I have no idea where I'm going to finish up. 841 00:52:04,990 --> 00:52:08,310 I had a lot to do. 842 00:52:08,310 --> 00:52:11,100 I better not waste too much time. 843 00:52:11,100 --> 00:52:13,470 Remember what a Markov process is now. 844 00:52:16,650 --> 00:52:20,910 At least the way we started out thinking about it, it's a 845 00:52:20,910 --> 00:52:24,080 Markov chain along with a holding time 846 00:52:24,080 --> 00:52:26,420 in each state of the Markov chain. 847 00:52:26,420 --> 00:52:30,130 And the holding times are exponential, to be a countable 848 00:52:30,130 --> 00:52:32,450 state Markov process. 849 00:52:32,450 --> 00:52:36,300 So we can visualize it as a sequence of 850 00:52:36,300 --> 00:52:40,480 states, X0, X1, X2, X3. 851 00:52:40,480 --> 00:52:45,240 And a sequence of holding times, U1, U2, U3, U4. 852 00:52:45,240 --> 00:52:47,760 These are all random variables.
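Before going further with the Markov process picture, here is a quick sketch of the birth-death calculation from a moment ago, with an invented constant ratio rho sub i = P sub i over Q sub i plus 1: each pi sub i plus 1 is rho times pi sub i, so the steady state probabilities are products of the rho's, normalized to add up to 1.

    import numpy as np

    # Hypothetical birth-death chain with constant ratio rho = p_i / q_{i+1},
    # truncated at K states for the sketch.
    rho, K = 0.8, 20

    pi = np.cumprod(np.r_[1.0, rho * np.ones(K - 1)])  # pi_{i+1} = rho * pi_i
    pi /= pi.sum()                                      # the pi_j's add up to 1
    print(pi[:5])                                       # geometrically decreasing

Now, back to the Markov process and its dependence diagram.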
853 00:52:47,760 --> 00:52:51,110 And this kind of dependence diagram says what random 854 00:52:51,110 --> 00:52:54,790 variables depend on what random variables. 855 00:52:54,790 --> 00:52:59,790 U1, given X0, is independent of the rest of the world. 856 00:52:59,790 --> 00:53:02,960 U2, given X1, is independent of the rest of the 857 00:53:02,960 --> 00:53:05,930 world, and so forth. 858 00:53:05,930 --> 00:53:11,060 And if you look at this graph here and you visualize the 859 00:53:11,060 --> 00:53:13,740 fact that because of Bayes' rule, you could go 860 00:53:13,740 --> 00:53:16,490 both ways on this. 861 00:53:16,490 --> 00:53:22,750 In other words, if this, given this, is independent of 862 00:53:22,750 --> 00:53:26,240 everything else, we can go through the 863 00:53:26,240 --> 00:53:28,760 same kind of argument. 864 00:53:28,760 --> 00:53:34,850 And we can make these arrows go the opposite way. 865 00:53:34,850 --> 00:53:39,320 And we can say, if we just consider these states here, we 866 00:53:39,320 --> 00:53:46,480 can say that, given X3, U4 is independent of X2 and also 867 00:53:46,480 --> 00:53:52,130 independent of U3 and X1 and U2 and so forth. 868 00:53:52,130 --> 00:53:55,520 So if you look at the dependence graph of a Markov 869 00:53:55,520 --> 00:54:01,230 chain, which is which states depend on which other states, 870 00:54:01,230 --> 00:54:03,870 those arrows there that we have, which make it easier to 871 00:54:03,870 --> 00:54:06,090 see what's going on, you can take them off. 872 00:54:06,090 --> 00:54:10,210 You can redraw them in any way you want to and look at the 873 00:54:10,210 --> 00:54:15,080 dependencies in the opposite way. 874 00:54:15,080 --> 00:54:25,610 Now to understand what the state is at any time t, 875 00:54:25,610 --> 00:54:28,650 there's an equation to do that. 876 00:54:28,650 --> 00:54:31,700 It's an equation that isn't much help. 877 00:54:31,700 --> 00:54:37,600 I think it's more help to look at this and to see from this 878 00:54:37,600 --> 00:54:38,950 what's going on. 879 00:54:38,950 --> 00:54:41,785 You start in some state, X sub 0. 880 00:54:44,670 --> 00:54:49,080 And starting in state X0 equals i, there's a holding time. 881 00:54:49,080 --> 00:54:51,680 The holding time is U1. 882 00:54:51,680 --> 00:54:54,870 And you stay in state i. 883 00:54:54,870 --> 00:54:58,230 And the time U1 is an exponential random variable 884 00:54:58,230 --> 00:55:00,000 with rate nu sub i. 885 00:55:00,000 --> 00:55:01,550 That's what this says. 886 00:55:01,550 --> 00:55:06,300 So at the end of that holding time, you go from state i to 887 00:55:06,300 --> 00:55:07,590 some other state. 888 00:55:07,590 --> 00:55:09,040 This is the state you go to. 889 00:55:09,040 --> 00:55:11,950 The state you go to is according to the Markov 890 00:55:11,950 --> 00:55:13,710 chain probabilities. 891 00:55:13,710 --> 00:55:17,180 And it's state j in this case. 892 00:55:17,180 --> 00:55:22,230 You stay in state j until the holding time U2, whose rate is a 893 00:55:22,230 --> 00:55:28,490 function of j, finishes up at this time and so forth. 894 00:55:28,490 --> 00:55:32,810 So if you want to look at what state you're in at a given 895 00:55:32,810 --> 00:55:37,060 time, namely pick a time here and say what's the state at 896 00:55:37,060 --> 00:55:39,760 this time, as a random variable. 897 00:55:39,760 --> 00:55:44,460 So what you have to do then is you have to climb your way up 898 00:55:44,460 --> 00:55:46,970 from here to there.
899 00:55:46,970 --> 00:55:55,150 And you have to talk about the value of S1, S2, and S3. 900 00:55:55,150 --> 00:55:58,180 And those are built from exponential random variables. 901 00:55:58,180 --> 00:56:01,230 But they're exponential random variables whose rates depend on the 902 00:56:01,230 --> 00:56:02,800 state that you're in. 903 00:56:02,800 --> 00:56:06,480 So as you're climbing your way up and looking at this sample 904 00:56:06,480 --> 00:56:11,070 function of the process, you have to look at U1 and X0. 905 00:56:11,070 --> 00:56:15,740 X0 defines what U1 is, as a random variable. 906 00:56:15,740 --> 00:56:18,610 It says that U1 is an exponential random variable, 907 00:56:18,610 --> 00:56:21,190 with rate nu sub i. 908 00:56:21,190 --> 00:56:25,050 So you get to here, then you have some holding time here, 909 00:56:25,050 --> 00:56:29,940 which is a function of j and so forth, the whole way up. 910 00:56:29,940 --> 00:56:34,480 Which is why I said that an equation for X of t, in terms 911 00:56:34,480 --> 00:56:37,910 of these S's is not going to help you a great deal. 912 00:56:37,910 --> 00:56:41,340 Understanding how the process is working I think 913 00:56:41,340 --> 00:56:44,770 helps you a lot more. 914 00:56:44,770 --> 00:56:47,650 We said that there were three ways to represent a Markov 915 00:56:47,650 --> 00:56:55,350 process, which I'm giving here in terms 916 00:56:55,350 --> 00:56:57,630 just of Markov chains. 917 00:56:57,630 --> 00:56:59,410 The first one-- 918 00:56:59,410 --> 00:57:02,430 and the fact that these are all for M/M/1 doesn't make any 919 00:57:02,430 --> 00:57:03,050 difference. 920 00:57:03,050 --> 00:57:06,890 It's just these three general representations. 921 00:57:06,890 --> 00:57:11,970 One of them is, you look at it in terms of the embedded 922 00:57:11,970 --> 00:57:13,220 Markov chain. 923 00:57:24,030 --> 00:57:26,990 For this embedded Markov chain, the transition 924 00:57:26,990 --> 00:57:31,290 probabilities, when you're in state 0 in an M/M/1 queue, 925 00:57:31,290 --> 00:57:33,470 what's the next state you go to? 926 00:57:33,470 --> 00:57:37,140 Well the only state you can go to is state 1. 927 00:57:37,140 --> 00:57:40,110 Because we don't have any self transitions. 928 00:57:40,110 --> 00:57:42,050 So you go up to state 1 eventually. 929 00:57:42,050 --> 00:57:45,930 From state 1, you can go that way, with probability mu over 930 00:57:45,930 --> 00:57:47,360 lambda plus mu. 931 00:57:47,360 --> 00:57:51,210 Or you can go this way, with probability lambda over lambda 932 00:57:51,210 --> 00:57:56,120 plus mu, and so forth the whole way out. 933 00:57:56,120 --> 00:58:01,140 The next way of describing it, which is almost the same, is 934 00:58:01,140 --> 00:58:05,020 instead of using the transition probabilities and 935 00:58:05,020 --> 00:58:08,400 the embedded chain, you look directly at the transition 936 00:58:08,400 --> 00:58:11,500 rates for the Poisson process. 937 00:58:11,500 --> 00:58:15,580 Meaning the transition rates are the nu sub i's associated 938 00:58:15,580 --> 00:58:16,920 with the different states. 939 00:58:16,920 --> 00:58:20,540 When you get in state i, the amount of time you spend in 940 00:58:20,540 --> 00:58:24,810 state i is an exponential random variable. 941 00:58:24,810 --> 00:58:27,800 And when you make a transition, you're either 942 00:58:27,800 --> 00:58:32,020 going to go to one state or another state, in this case. 943 00:58:32,020 --> 00:58:36,380 In general, you might go to any one of a number of states.
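Putting the embedded-chain and transition-rate descriptions together, here is a sketch that generates a sample path of the M/M/1 process; the rates lam and mu are arbitrary illustration values, not anything fixed by the lecture. In state i you hold for an exponential time with the total rate nu sub i, and then you choose the next state from the embedded chain probabilities, independently of that holding time.

    import numpy as np

    rng = np.random.default_rng(1)
    lam, mu = 1.0, 2.0                  # illustrative arrival and service rates

    t, state = 0.0, 0
    path = [(0.0, 0)]
    for _ in range(20):
        nu = lam if state == 0 else lam + mu   # total rate nu_i of leaving state i
        t += rng.exponential(1.0 / nu)         # holding time, exponential with rate nu_i
        if state == 0:
            state = 1                          # from state 0 the only move is up
        else:
            # The direction is chosen independently of the holding time:
            # up with probability lam/(lam+mu), down with probability mu/(lam+mu).
            state += 1 if rng.random() < lam / (lam + mu) else -1
        path.append((t, state))

    print(path[:5])                     # (transition epoch S_n, state) pairs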
944 00:58:36,380 --> 00:58:47,650 Now if I tell you that we start out in state 1 and the next 945 00:58:47,650 --> 00:58:53,430 state we go to is state 2, now I ask you, what's the expected 946 00:58:53,430 --> 00:58:56,140 amount of time that that transition took? 947 00:58:56,140 --> 00:58:57,390 What's the answer? 948 00:59:00,210 --> 00:59:03,620 Is it q sub 12, or is it nu sub 1? 949 00:59:10,550 --> 00:59:13,804 Anybody awake out there? 950 00:59:13,804 --> 00:59:16,230 AUDIENCE: Sir, could you repeat the question? 951 00:59:16,230 --> 00:59:17,010 PROFESSOR: Yes. 952 00:59:17,010 --> 00:59:20,600 The question is, we started out in state 1. 953 00:59:20,600 --> 00:59:25,730 Given that we started out in state 1 and given that the 954 00:59:25,730 --> 00:59:30,770 next state is state 2, what's the amount of time that it 955 00:59:30,770 --> 00:59:32,930 takes to go from 1 to 2? 956 00:59:32,930 --> 00:59:34,850 It's an exponential random variable. 957 00:59:34,850 --> 00:59:37,722 What's the rate of that random variable? 958 00:59:37,722 --> 00:59:39,210 AUDIENCE: Lambda plus mu. 959 00:59:39,210 --> 00:59:39,706 PROFESSOR: What? 960 00:59:39,706 --> 00:59:41,330 AUDIENCE: Lambda plus mu. 961 00:59:41,330 --> 00:59:42,945 PROFESSOR: Lambda plus mu? 962 00:59:42,945 --> 00:59:44,610 Yes. 963 00:59:44,610 --> 00:59:48,990 Lambda plus mu in the case of the M/M/1 queue. 964 00:59:48,990 --> 00:59:53,860 If you have an arbitrary chain, the amount of time 965 00:59:53,860 --> 01:00:00,260 that it takes is exponential with rate nu sub i. This is just back to this old 966 01:00:00,260 --> 01:00:01,540 thing about splitting and 967 01:00:01,540 --> 01:00:03,110 combining of Poisson processes. 968 01:00:05,860 --> 01:00:10,130 When you have a combined Poisson process, which is what 969 01:00:10,130 --> 01:00:13,980 you have here, when you're in state i, there's a combined 970 01:00:13,980 --> 01:00:18,940 Poisson process, which is running, which says you go 971 01:00:18,940 --> 01:00:20,600 right with probability lambda over lambda plus mu, 972 01:00:20,600 --> 01:00:22,940 and you go left with probability mu over lambda plus mu, 973 01:00:22,940 --> 01:00:26,120 for an M/M/1 queue. 974 01:00:26,120 --> 01:00:32,640 And you can look at it in terms of, first, you see what 975 01:00:32,640 --> 01:00:34,490 the next state is. 976 01:00:34,490 --> 01:00:37,500 And then you ask how long did it take to get there? 977 01:00:37,500 --> 01:00:40,590 Or you look at it in terms of how long does it take to make a 978 01:00:40,590 --> 01:00:43,910 transition and then which state did you go to? 979 01:00:43,910 --> 01:00:46,820 And with these combined Poisson processes, those two 980 01:00:46,820 --> 01:00:50,550 questions are independent of each other. 981 01:00:50,550 --> 01:00:53,990 And if there's one thing you remember from all of this, 982 01:00:53,990 --> 01:00:55,150 please remember that. 983 01:00:55,150 --> 01:01:00,420 Because it's something that you use in almost every 984 01:01:00,420 --> 01:01:03,010 problem that you do with Markov 985 01:01:03,010 --> 01:01:04,990 chains and Markov processes. 986 01:01:04,990 --> 01:01:07,730 It just comes up all the time. 987 01:01:07,730 --> 01:01:15,710 This final version here is looking at the same Markov 988 01:01:15,710 --> 01:01:23,090 process, but looking at it in sample time instead of looking 989 01:01:23,090 --> 01:01:24,820 at the embedded chain. 990 01:01:24,820 --> 01:01:28,110 Now the important thing here is, when you look at it in 991 01:01:28,110 --> 01:01:32,340 sample time, you might not be able to do this.
992 01:01:32,340 --> 01:01:40,010 Because with an arbitrary countable state Markov chain, 993 01:01:40,010 --> 01:01:42,950 you might not be able to define these self-loop 994 01:01:42,950 --> 01:01:44,450 transition probabilities. 995 01:01:44,450 --> 01:01:47,220 Because these rates might get too large. 996 01:01:47,220 --> 01:01:49,700 But for the M/M/1 queue, you can do it. 997 01:01:49,700 --> 01:01:53,500 The important thing is that the steady state probabilities 998 01:01:53,500 --> 01:01:57,780 you find for these states are not the same as the steady 999 01:01:57,780 --> 01:02:01,300 state probabilities you find for the embedded Markov chain. 1000 01:02:01,300 --> 01:02:04,930 They are in fact the same as the steady state probabilities 1001 01:02:04,930 --> 01:02:07,330 for the Markov process itself. 1002 01:02:07,330 --> 01:02:11,880 That is, these steady state probabilities are the fraction 1003 01:02:11,880 --> 01:02:14,980 of time that you spend in state j. 1004 01:02:14,980 --> 01:02:18,580 And this is a sample-time Markov process. 1005 01:02:18,580 --> 01:02:22,570 It is the same fraction of time you spend in state j. 1006 01:02:22,570 --> 01:02:25,040 Here you have this embedded chain. 1007 01:02:25,040 --> 01:02:28,250 And for example, in the embedded chain, the only place 1008 01:02:28,250 --> 01:02:32,190 you go from state 0 is state 1. 1009 01:02:32,190 --> 01:02:35,120 Here from state 0, you can stay in state 1010 01:02:35,120 --> 01:02:36,680 0 for a long time. 1011 01:02:36,680 --> 01:02:39,400 Because here the increments of time are constant. 1012 01:02:44,060 --> 01:02:47,530 We can look at delayed renewal reward theorems for the 1013 01:02:47,530 --> 01:02:52,610 renewal process to see what's going on here, for the 1014 01:02:52,610 --> 01:02:56,260 fraction of time we spend in state j. 1015 01:02:56,260 --> 01:02:58,570 We look at that picture up there. 1016 01:02:58,570 --> 01:03:01,580 We start out in state j, for example. 1017 01:03:01,580 --> 01:03:04,930 Same as the renewal reward process that we had for a 1018 01:03:04,930 --> 01:03:07,280 Markov chain. 1019 01:03:07,280 --> 01:03:10,020 We got a reward of 1 for the amount of time that we 1020 01:03:10,020 --> 01:03:11,800 stay in state j. 1021 01:03:11,800 --> 01:03:14,810 After that, we're wandering around in the wilderness. 1022 01:03:14,810 --> 01:03:17,580 We finally come back to state j again. 1023 01:03:17,580 --> 01:03:21,230 We get 1 unit of reward times the amount of 1024 01:03:21,230 --> 01:03:22,560 time we spend here. 1025 01:03:22,560 --> 01:03:26,580 In other words, we're accumulating reward at a rate 1026 01:03:26,580 --> 01:03:30,540 of 1 unit per unit time, up to there. 1027 01:03:30,540 --> 01:03:36,690 So the average reward we get per unit time is the expected 1028 01:03:36,690 --> 01:03:44,790 value of U of j, which is 1 over nu sub j, divided by 1029 01:03:44,790 --> 01:03:49,320 the expected interrenewal time, from one 1030 01:03:49,320 --> 01:03:52,230 renewal to the next. 1031 01:03:52,230 --> 01:03:56,710 Which tells us that the fraction of time we spend in 1032 01:03:56,710 --> 01:04:02,340 state j is equal to the fraction of transitions that 1033 01:04:02,340 --> 01:04:06,480 go to state j, divided by the rate at which we leave state 1034 01:04:06,480 --> 01:04:09,970 j, times the expected number of overall 1035 01:04:09,970 --> 01:04:13,220 transitions per unit time. 1036 01:04:13,220 --> 01:04:15,330 This is an important result.
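As a sketch of that conversion, with invented numbers: given the embedded-chain probabilities pi sub j and the leaving rates nu sub j, the fraction of time spent in state j is pi sub j over nu sub j, normalized.

    import numpy as np

    # Invented embedded-chain steady state and leaving rates, for illustration.
    pi = np.array([0.2, 0.3, 0.5])    # fraction of transitions into each state
    nu = np.array([1.0, 2.0, 4.0])    # rate of leaving each state

    p = (pi / nu) / np.sum(pi / nu)   # fraction of *time* spent in each state
    M_bar = 1.0 / np.sum(pi / nu)     # expected transitions per unit time
    print(p)
    print(M_bar, np.dot(p, nu))       # M-bar also equals the sum of p_i nu_i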
1037 01:04:15,330 --> 01:04:18,700 Because depending on what M sub i is, depending on what 1038 01:04:18,700 --> 01:04:23,360 the number of transitions per unit time is, it really tells 1039 01:04:23,360 --> 01:04:24,580 you what's going on. 1040 01:04:24,580 --> 01:04:28,140 Because all of these bizarre Markov processes that we've 1041 01:04:28,140 --> 01:04:33,520 looked at are bizarre because of the way that this behaves. 1042 01:04:33,520 --> 01:04:35,405 This can be infinite or can be 0. 1043 01:04:47,080 --> 01:04:54,210 At this point, we've been talking about the expected 1044 01:04:54,210 --> 01:05:01,100 number of transitions per unit time as a random variable, as 1045 01:05:01,100 --> 01:05:03,770 a limit with probability 1, given that we 1046 01:05:03,770 --> 01:05:05,790 start in state i. 1047 01:05:05,790 --> 01:05:10,270 And suddenly, we see that it doesn't depend on i at all. 1048 01:05:10,270 --> 01:05:14,150 So there is some number, M bar, which is the expected 1049 01:05:14,150 --> 01:05:17,900 number of transitions per unit time, which is independent of 1050 01:05:17,900 --> 01:05:19,210 what state we started in. 1051 01:05:19,210 --> 01:05:26,970 We call that M bar instead of M sub i. And that's this 1052 01:05:26,970 --> 01:05:29,300 quantity here. 1053 01:05:29,300 --> 01:05:38,600 And what we get from that is the fraction of time we 1054 01:05:38,600 --> 01:05:44,330 spend in state j is proportional to pi 1055 01:05:44,330 --> 01:05:46,310 sub j over nu sub j. 1056 01:05:46,310 --> 01:05:50,250 But since it has to add up to 1, we have to divide it by 1057 01:05:50,250 --> 01:05:52,080 this quantity here. 1058 01:05:52,080 --> 01:05:56,330 And this quantity here is one over-- 1059 01:05:56,330 --> 01:06:00,325 this is the expected number of transitions per unit time. 1060 01:06:03,190 --> 01:06:09,710 And if we try to get the pi sub j's from the P sub j's, the 1061 01:06:09,710 --> 01:06:13,440 corresponding thing, as we find out, is that the expected number of 1062 01:06:13,440 --> 01:06:18,300 transitions per unit time is a sum over i of P sub i 1063 01:06:18,300 --> 01:06:19,330 times nu sub i. 1064 01:06:19,330 --> 01:06:23,640 You can play all sorts of games with these equations. 1065 01:06:23,640 --> 01:06:27,685 And when you do so, all of those things become evident. 1066 01:06:44,010 --> 01:06:49,460 I would advise you to just cross this equation out. 1067 01:06:49,460 --> 01:06:51,380 I don't know where it came from. 1068 01:06:51,380 --> 01:06:54,300 But it doesn't mean anything. 1069 01:06:57,780 --> 01:07:02,700 We spent a lot of time talking about what happens when the 1070 01:07:02,700 --> 01:07:06,770 expected number of transitions per unit time 1071 01:07:06,770 --> 01:07:10,020 is either 0 or infinity. 1072 01:07:10,020 --> 01:07:15,870 We had this case we looked at of an M/M/1 type queue, where 1073 01:07:15,870 --> 01:07:19,150 the server got rattled as time went on. 1074 01:07:19,150 --> 01:07:21,010 And the server got rattled with more and 1075 01:07:21,010 --> 01:07:22,700 more customers waiting. 1076 01:07:22,700 --> 01:07:25,590 The customers got discouraged and didn't come in. 1077 01:07:25,590 --> 01:07:31,090 So we had a process where the longer the queue got, the 1078 01:07:31,090 --> 01:07:33,965 longer time it took for anything to happen. 1079 01:07:41,600 --> 01:07:46,130 So that as far as the embedded Markov chain went, 1080 01:07:46,130 --> 01:07:47,550 everything was fine.
1081 01:07:47,550 --> 01:07:52,010 But when we looked at the process itself, the time that 1082 01:07:52,010 --> 01:07:55,140 it took in each of these higher-order states was so 1083 01:07:55,140 --> 01:08:00,360 large that, as a process, it didn't make any sense. 1084 01:08:00,360 --> 01:08:02,330 So the P sub i's were all 0. 1085 01:08:02,330 --> 01:08:04,140 The pi sub i's all looked fine. 1086 01:08:06,640 --> 01:08:10,300 And there's the other kind of case, where the expected number of 1087 01:08:10,300 --> 01:08:15,070 transitions per unit time becomes infinite. 1088 01:08:15,070 --> 01:08:18,080 And that's just the opposite kind of case, where, when you 1089 01:08:18,080 --> 01:08:21,170 get to the higher-order states, things start happening 1090 01:08:21,170 --> 01:08:22,510 very, very fast. 1091 01:08:22,510 --> 01:08:26,310 The higher-order state you go to, the faster the 1092 01:08:26,310 --> 01:08:28,520 transitions occur. 1093 01:08:28,520 --> 01:08:29,770 It's like a small child. 1094 01:08:32,810 --> 01:08:35,890 I mean, the more excited the small child gets, the faster 1095 01:08:35,890 --> 01:08:37,040 things happen. 1096 01:08:37,040 --> 01:08:38,670 And the faster things happen, the more 1097 01:08:38,670 --> 01:08:40,050 excited the child gets. 1098 01:08:40,050 --> 01:08:43,450 So pretty soon things are happening so fast, the child 1099 01:08:43,450 --> 01:08:44,790 just collapses. 1100 01:08:44,790 --> 01:08:47,330 And if you're lucky, the child sleeps. 1101 01:08:47,330 --> 01:08:49,689 So you can think of it that way. 1102 01:08:52,279 --> 01:08:53,529 We talked about reversibility. 1103 01:08:58,990 --> 01:09:03,350 And reversibility for Markov processes I think is somewhat 1104 01:09:03,350 --> 01:09:05,170 easier to see than 1105 01:09:05,170 --> 01:09:07,115 reversibility for Markov chains. 1106 01:09:12,790 --> 01:09:15,660 If you're dealing with a Markov process, we're sitting 1107 01:09:15,660 --> 01:09:17,790 in state i for a while. 1108 01:09:17,790 --> 01:09:20,380 At some time we make a transition. 1109 01:09:20,380 --> 01:09:21,529 We go to state j. 1110 01:09:21,529 --> 01:09:23,689 We sit there for a long time. 1111 01:09:23,689 --> 01:09:26,819 Then we go to state k and so forth. 1112 01:09:26,819 --> 01:09:30,210 If we try to look at this process coming back the other 1113 01:09:30,210 --> 01:09:34,740 way, we see that we're in state k. 1114 01:09:34,740 --> 01:09:37,930 At a certain point, we had a transition. 1115 01:09:37,930 --> 01:09:40,779 We had a transition into state j. 1116 01:09:40,779 --> 01:09:42,550 And how long does it take before that 1117 01:09:42,550 --> 01:09:43,819 transition is over? 1118 01:09:46,319 --> 01:09:49,340 We're in state j, so the amount of time that it takes 1119 01:09:49,340 --> 01:09:52,510 is an exponentially distributed random variable. 1120 01:09:52,510 --> 01:09:54,610 And it's exponentially distributed with the same 1121 01:09:54,610 --> 01:09:58,360 rate, whether we're coming in this way or whether 1122 01:09:58,360 --> 01:10:00,160 we're coming in this way. 1123 01:10:00,160 --> 01:10:02,560 And that's the notion of reversibility. 1124 01:10:02,560 --> 01:10:05,840 It doesn't make any difference whether you look at it from 1125 01:10:05,840 --> 01:10:10,380 right to left or from left to right.
1126 01:10:10,380 --> 01:10:16,620 And in this kind of situation, if you find the steady state 1127 01:10:16,620 --> 01:10:23,120 probabilities for these transitions or you find the 1128 01:10:23,120 --> 01:10:29,180 steady state fraction of time you spend in each state. 1129 01:10:29,180 --> 01:10:32,920 I mean, we just showed that if you look at this process going 1130 01:10:32,920 --> 01:10:35,630 backwards, if you define all the probabilities coming 1131 01:10:35,630 --> 01:10:40,700 backwards, the expected amount of time that you spend in 1132 01:10:40,700 --> 01:10:44,650 state i or the rate for leaving state i is the same 1133 01:10:44,650 --> 01:10:46,010 from right to left as from left to right. 1134 01:10:46,010 --> 01:10:49,110 And a slightly more complicated argument says the 1135 01:10:49,110 --> 01:10:52,140 P sub i's are the same going right to left. 1136 01:10:52,140 --> 01:10:55,400 And the fraction of time you spend in each state is 1137 01:10:55,400 --> 01:10:57,830 obviously the same going from right to left as 1138 01:10:57,830 --> 01:10:59,490 these limits occur. 1139 01:10:59,490 --> 01:11:05,980 So that gives you all these bizarre conditions for 1140 01:11:05,980 --> 01:11:10,570 queuing, which are very useful. 1141 01:11:15,220 --> 01:11:20,080 I'm not going to say any more about that except 1142 01:11:20,080 --> 01:11:23,280 the guessing theorem. 1143 01:11:23,280 --> 01:11:26,940 The guessing theorem says suppose a Markov process is 1144 01:11:26,940 --> 01:11:28,980 irreducible. 1145 01:11:28,980 --> 01:11:30,690 You can check pretty easily whether it's 1146 01:11:30,690 --> 01:11:31,830 irreducible or not. 1147 01:11:31,830 --> 01:11:33,910 You can't necessarily check very easily 1148 01:11:33,910 --> 01:11:36,170 whether it's recurrent. 1149 01:11:38,700 --> 01:11:42,160 And suppose P sub i is a set of probabilities that 1150 01:11:42,160 --> 01:11:48,530 satisfies P sub i times Q sub ij equals P sub 1151 01:11:48,530 --> 01:11:50,830 j times Q sub ji. 1152 01:11:50,830 --> 01:11:56,520 In other words, this is the probability of being in state 1153 01:11:56,520 --> 01:12:00,690 i, and the next transition is to state j. 1154 01:12:00,690 --> 01:12:03,640 This is the probability of being in state j, and the next 1155 01:12:03,640 --> 01:12:05,600 transition is to state i. 1156 01:12:05,600 --> 01:12:10,500 This says that if you can find a set of probabilities which 1157 01:12:10,500 --> 01:12:14,740 satisfy these equations, and if they also satisfy this 1158 01:12:14,740 --> 01:12:20,640 condition, the sum of P sub i times nu sub i less than infinity, then P sub 1159 01:12:20,640 --> 01:12:23,030 i is greater than 0 for all i. 1160 01:12:23,030 --> 01:12:26,430 P sub i is the steady state time-average probability of state i. 1161 01:12:26,430 --> 01:12:28,340 The process is reversible. 1162 01:12:28,340 --> 01:12:31,930 And the embedded chain is positive recurrent. 1163 01:12:31,930 --> 01:12:34,580 So all you have to do is solve those equations. 1164 01:12:34,580 --> 01:12:37,760 And if you can solve those equations, you're done. 1165 01:12:40,410 --> 01:12:43,120 Everything is fine. 1166 01:12:43,120 --> 01:12:45,680 You don't have to know anything about reversibility 1167 01:12:45,680 --> 01:12:48,330 or renewal theory or anything else. 1168 01:12:48,330 --> 01:12:51,210 If you have that theorem, you just 1169 01:12:51,210 --> 01:12:53,300 solve for those equations.
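Here is a sketch of that recipe on an M/M/1-type birth-death process; the rates are invented and the chain is truncated so the check is finite. Guess P sub i proportional to (lambda over mu) to the i, then verify both conditions of the theorem.

    import numpy as np

    lam, mu, K = 1.0, 2.0, 30          # illustrative birth and death rates

    # Guess: p_i proportional to (lam/mu)^i, normalized.
    p = np.cumprod(np.r_[1.0, (lam / mu) * np.ones(K - 1)])
    p /= p.sum()

    # Condition 1: p_i q_{i,i+1} = p_{i+1} q_{i+1,i} for every i.
    print(np.allclose(p[:-1] * lam, p[1:] * mu))        # True

    # Condition 2: sum of p_i nu_i is finite
    # (nu_0 = lam, and nu_i = lam + mu for i >= 1).
    nu = np.r_[lam, (lam + mu) * np.ones(K - 1)]
    print(p @ nu)                                       # a finite number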
1170 01:12:53,300 --> 01:12:57,710 Solve these equations by guessing what the solution is, 1171 01:12:57,710 --> 01:13:00,430 and then you in fact have a reversible process. 1172 01:13:06,690 --> 01:13:10,530 So the useful application of this is that all birth-death 1173 01:13:10,530 --> 01:13:15,890 processes are reversible if this equation is satisfied. 1174 01:13:15,890 --> 01:13:19,330 And you can immediately find the steady state 1175 01:13:19,330 --> 01:13:20,580 probabilities of them. 1176 01:13:23,050 --> 01:13:25,276 I'm not going to have much time for random walks. 1177 01:13:28,680 --> 01:13:29,880 But random walks are what we've been 1178 01:13:29,880 --> 01:13:31,490 talking about all term. 1179 01:13:31,490 --> 01:13:34,500 We just didn't call them random walks until we got to 1180 01:13:34,500 --> 01:13:36,180 the seventh chapter. 1181 01:13:36,180 --> 01:13:41,140 But a random walk is a sequence of random variables, 1182 01:13:41,140 --> 01:13:47,490 where each Sn in the sequence is a sum of some number of 1183 01:13:47,490 --> 01:13:52,330 underlying IID random variables, X1 up to X sub n. 1184 01:13:52,330 --> 01:13:56,780 Well, we're interested in exponential bounds on S sub n 1185 01:13:56,780 --> 01:13:57,600 for large n. 1186 01:13:57,600 --> 01:13:59,910 These are known as Chernoff bounds. 1187 01:13:59,910 --> 01:14:03,560 We talked about them back in chapter one. 1188 01:14:03,560 --> 01:14:05,320 I'm not going to mention them again now. 1189 01:14:05,320 --> 01:14:07,860 We're interested in threshold crossings. 1190 01:14:07,860 --> 01:14:11,460 If you have two thresholds, one positive threshold, one 1191 01:14:11,460 --> 01:14:16,120 negative threshold, you would like to know what's the 1192 01:14:16,120 --> 01:14:20,810 stopping time when S sub n first crosses alpha? 1193 01:14:20,810 --> 01:14:23,960 Or what's the stopping time when it first crosses beta? 1194 01:14:23,960 --> 01:14:28,210 What's the probability of crossing alpha before you 1195 01:14:28,210 --> 01:14:30,930 cross beta or vice versa? 1196 01:14:30,930 --> 01:14:33,760 And what's the distribution of the overshoot, when you pass 1197 01:14:33,760 --> 01:14:34,760 one of them? 1198 01:14:34,760 --> 01:14:37,120 So there are all those questions. 1199 01:14:37,120 --> 01:14:40,890 We pretty much talked about the first two. 1200 01:14:40,890 --> 01:14:45,480 The question of overshoot, I think I mentioned this. 1201 01:14:45,480 --> 01:14:48,460 The text doesn't say much about it. 1202 01:14:48,460 --> 01:14:52,090 Overshoot is just a nasty, nasty problem. 1203 01:14:52,090 --> 01:14:55,250 If you ever have to find the overshoot of something, go 1204 01:14:55,250 --> 01:14:59,570 look for a computer program to simulate it or something. 1205 01:14:59,570 --> 01:15:02,760 You're not going to solve the problem very easily. 1206 01:15:02,760 --> 01:15:08,030 Feller is the only book I know which does a reasonable job of 1207 01:15:08,030 --> 01:15:09,720 trying to solve this. 1208 01:15:09,720 --> 01:15:13,170 And you have to be extraordinarily patient. 1209 01:15:13,170 --> 01:15:17,030 I mean Feller does everything in the nicest possible way. 1210 01:15:17,030 --> 01:15:19,330 Or at least he always seems to do everything in the nicest 1211 01:15:19,330 --> 01:15:20,810 possible way. 1212 01:15:20,810 --> 01:15:23,970 Most textbooks you look at, after you understand the 1213 01:15:23,970 --> 01:15:27,110 subject, you look at and you say, oh, he should have done 1214 01:15:27,110 --> 01:15:28,570 it this way.
1215 01:15:28,570 --> 01:15:31,910 I've never had that experience with Feller at all. 1216 01:15:31,910 --> 01:15:33,530 Always, I look at it. 1217 01:15:33,530 --> 01:15:35,370 I say, oh, there's an easier way to do it. 1218 01:15:35,370 --> 01:15:36,990 I try to do it the easier way. 1219 01:15:36,990 --> 01:15:38,800 And then I find something's wrong with it. 1220 01:15:38,800 --> 01:15:41,620 And then I go back and say, ah, I got to do it the way 1221 01:15:41,620 --> 01:15:43,530 Feller did it. 1222 01:15:43,530 --> 01:15:48,590 So if you're serious about this field and you don't have 1223 01:15:48,590 --> 01:15:51,730 a copy of this very old book, get it, 1224 01:15:51,730 --> 01:15:53,190 because it's solid gold. 1225 01:16:01,460 --> 01:16:06,450 Suppose a random variable has a moment generating function, 1226 01:16:06,450 --> 01:16:11,470 expected value of e to the rZ over some 1227 01:16:11,470 --> 01:16:13,180 positive region of r. 1228 01:16:13,180 --> 01:16:17,270 And suppose it has a mean which is negative. 1229 01:16:17,270 --> 01:16:22,570 The Chernoff bound says that for any alpha greater than 0 1230 01:16:22,570 --> 01:16:27,860 and any r in 0 to r plus, the probability that Z is greater 1231 01:16:27,860 --> 01:16:31,040 than or equal to alpha is less than or equal to 1232 01:16:31,040 --> 01:16:31,860 this quantity here. 1233 01:16:31,860 --> 01:16:33,370 You remember, we derived this. 1234 01:16:33,370 --> 01:16:36,730 The derivation is very simple. 1235 01:16:36,730 --> 01:16:39,660 It's an obvious result. 1236 01:16:39,660 --> 01:16:41,740 It's a little strange. 1237 01:16:41,740 --> 01:16:48,130 Because this says that for this random variable its 1238 01:16:48,130 --> 01:16:52,130 complementary distribution function has to go down as e 1239 01:16:52,130 --> 01:16:55,270 to the minus r alpha. 1240 01:16:55,270 --> 01:16:59,330 Now, not all random variables can go down exponentially as e to 1241 01:16:59,330 --> 01:17:01,260 the minus r alpha. 1242 01:17:01,260 --> 01:17:06,310 The reason for this is that these moment generating 1243 01:17:06,310 --> 01:17:09,050 functions don't exist for all r. 1244 01:17:09,050 --> 01:17:14,150 So what it's really saying is where it exists, it goes down 1245 01:17:14,150 --> 01:17:17,700 with alpha as e to the minus r alpha. 1246 01:17:17,700 --> 01:17:20,160 We then define the semi-invariant moment 1247 01:17:20,160 --> 01:17:21,920 generating function. 1248 01:17:21,920 --> 01:17:26,000 And then a more convenient way of stating the Chernoff bound 1249 01:17:26,000 --> 01:17:28,000 was in this way. 1250 01:17:28,000 --> 01:17:29,210 You look here. 1251 01:17:29,210 --> 01:17:35,800 And you say, for a fixed value of n here, this probability 1252 01:17:35,800 --> 01:17:39,320 that S sub n is greater than or equal to n times a, is something 1253 01:17:39,320 --> 01:17:42,540 which is going down exponentially with n. 1254 01:17:42,540 --> 01:17:45,680 And if you optimize over r, this bound is 1255 01:17:45,680 --> 01:17:47,340 exponentially tight. 1256 01:17:47,340 --> 01:17:54,110 In other words, if you try to replace this with anything 1257 01:17:54,110 --> 01:17:58,090 smaller, namely which goes down faster, then for large 1258 01:17:58,090 --> 01:18:01,090 enough n, the bound will be false. 1259 01:18:01,090 --> 01:18:04,870 So this is the tightest bound you can get when you 1260 01:18:04,870 --> 01:18:07,170 optimize it over r. 1261 01:18:07,170 --> 01:18:10,220 So it's exponential in n.
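As a sketch with an invented increment distribution — X is +1 with probability 0.4 and -1 otherwise, so the mean is negative — the optimized exponent can be computed numerically: gamma(r) is the log of the expected value of e to the rX, and the bound on the probability that S sub n exceeds n times a is exp(n times the minimum over r of gamma(r) minus r times a).

    import numpy as np

    p, a = 0.4, 0.2              # P(X = +1) = 0.4, so E[X] = -0.2; slope a > E[X]

    r = np.linspace(1e-3, 5.0, 100_000)
    gamma = np.log(p * np.exp(r) + (1 - p) * np.exp(-r))  # semi-invariant MGF
    exponent = np.min(gamma - r * a)                      # optimized over r > 0

    # P(S_n >= n*a) <= exp(n * exponent), and the exponent is negative.
    print(exponent)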
1262 01:18:10,220 --> 01:18:13,970 Mostly we wanted to use it for threshold crossings. 1263 01:18:13,970 --> 01:18:20,630 And for threshold crossings, we would like to look at it in 1264 01:18:20,630 --> 01:18:22,860 another way. 1265 01:18:22,860 --> 01:18:26,680 And we dealt with this graphically. 1266 01:18:26,680 --> 01:18:30,470 Probability of Sn greater than or equal to alpha. 1267 01:18:30,470 --> 01:18:33,240 Now what we want to do is hold alpha constant. 1268 01:18:33,240 --> 01:18:35,350 Alpha is some threshold up there. 1269 01:18:35,350 --> 01:18:39,060 We want to ask, what's the probability that after n 1270 01:18:39,060 --> 01:18:41,780 trials, we're sitting above alpha? 1271 01:18:41,780 --> 01:18:43,460 And we'd like to try to solve that for 1272 01:18:43,460 --> 01:18:45,580 different values of n. 1273 01:18:45,580 --> 01:18:50,560 The Chernoff bound, in this case, this quantity here is 1274 01:18:50,560 --> 01:18:52,470 this intercept here. 1275 01:18:52,470 --> 01:18:54,950 You take the semi-invariant moment 1276 01:18:54,950 --> 01:18:57,270 generating function, which is convex. 1277 01:18:57,270 --> 01:18:59,640 You draw this curve. 1278 01:18:59,640 --> 01:19:04,290 You take a tangent of slope alpha over n. 1279 01:19:04,290 --> 01:19:06,420 And you see where it hits here. 1280 01:19:06,420 --> 01:19:08,290 And this is the exponent that you have. 1281 01:19:08,290 --> 01:19:10,730 This is a negative exponent. 1282 01:19:10,730 --> 01:19:16,400 As you vary n, this tilts around on this curve. 1283 01:19:16,400 --> 01:19:19,520 And it comes in to this point. 1284 01:19:19,520 --> 01:19:22,230 It goes back out again. 1285 01:19:22,230 --> 01:19:24,670 That's what happens to it. 1286 01:19:24,670 --> 01:19:31,190 And that smallest exponent, as you vary n, is the most likely 1287 01:19:31,190 --> 01:19:34,680 time at which you're going to cross that threshold. 1288 01:19:34,680 --> 01:19:37,930 And what we found, from looking at 1289 01:19:37,930 --> 01:19:41,570 Wald's equality is that-- 1290 01:19:41,570 --> 01:19:46,000 let me go on, because we're running out of time. 1291 01:19:50,870 --> 01:19:54,390 Wald's identity for two thresholds says this. 1292 01:19:54,390 --> 01:19:58,930 And the corollary says, if the mean of the underlying random variable is 1293 01:19:58,930 --> 01:20:06,580 less than 0, and if r star is 1294 01:20:06,580 --> 01:20:10,260 the second solution of gamma of r equals 0. 1295 01:20:10,260 --> 01:20:12,000 You have this convex curve. 1296 01:20:12,000 --> 01:20:15,450 Gamma of 0 is always equal to 0. 1297 01:20:15,450 --> 01:20:19,380 There's some other value of r, for which gamma is equal to 0. 1298 01:20:19,380 --> 01:20:21,370 And that's r star. 1299 01:20:21,370 --> 01:20:25,080 And this says that the probability that we have 1300 01:20:25,080 --> 01:20:29,830 crossed alpha at time j, where j is the time of first 1301 01:20:29,830 --> 01:20:32,310 crossing, is less than or equal to e to the 1302 01:20:32,310 --> 01:20:34,270 minus alpha r star. 1303 01:20:34,270 --> 01:20:36,780 This bound is tight also. 1304 01:20:36,780 --> 01:20:38,460 And that's a very nice result. 1305 01:20:38,460 --> 01:20:42,830 Because that just says that all you've got to do is find r star. 1306 01:20:42,830 --> 01:20:45,810 And that tells you what the probability of crossing a 1307 01:20:45,810 --> 01:20:47,210 threshold is. 1308 01:20:47,210 --> 01:20:50,060 And it's a very tight bound if alpha is very large.
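A sketch of that corollary, with the same invented plus-or-minus-1 increments as before: for that distribution the second root of gamma(r) = 0 works out in closed form to r star = ln((1-p)/p), and the bound e to the minus alpha r star can be compared against a simulated crossing probability.

    import numpy as np

    rng = np.random.default_rng(2)
    p, alpha, beta = 0.4, 10.0, -100.0    # E[X] < 0; upper and lower thresholds

    # r* : second root of gamma(r) = log(p e^r + (1-p) e^-r) = 0,
    # which for +-1 steps is known in closed form.
    r_star = np.log((1 - p) / p)

    trials, crossings = 5_000, 0
    for _ in range(trials):
        s = 0.0
        while beta < s < alpha:           # walk until one threshold is crossed
            s += 1.0 if rng.random() < p else -1.0
        crossings += (s >= alpha)

    print("bound exp(-alpha r*):", np.exp(-alpha * r_star))
    print("simulated estimate:  ", crossings / trials)

With plus-or-minus-1 steps there is no overshoot, so in this particular sketch the bound is essentially met with equality, and it involves only alpha and r star.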
1309 01:20:50,060 --> 01:20:53,780 It doesn't make any difference what the negative threshold 1310 01:20:53,780 --> 01:20:56,230 is, or whether it's there or not. 1311 01:20:56,230 --> 01:20:59,660 This tells you the thing you want to know. 1312 01:20:59,660 --> 01:21:04,610 I think I'm going to stop at that point, because I have 1313 01:21:04,610 --> 01:21:08,330 been sort of rushing to get to this point. 1314 01:21:08,330 --> 01:21:11,210 And it doesn't do any good to keep rushing. 1315 01:21:11,210 --> 01:21:16,610 So thank you all for being around all term. 1316 01:21:16,610 --> 01:21:17,540 I appreciate it. 1317 01:21:17,540 --> 01:21:18,790 Thank you.