1 00:00:00,530 --> 00:00:02,960 The following content is provided under a Creative 2 00:00:02,960 --> 00:00:04,370 Commons license. 3 00:00:04,370 --> 00:00:07,410 Your support will help MIT OpenCourseWare continue to 4 00:00:07,410 --> 00:00:11,060 offer high quality educational resources for free. 5 00:00:11,060 --> 00:00:13,960 To make a donation or view additional materials from 6 00:00:13,960 --> 00:00:17,710 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:17,710 --> 00:00:18,960 ocw.mit.edu. 8 00:00:23,680 --> 00:00:28,130 PROFESSOR: OK, I guess it's time to get started. 9 00:00:28,130 --> 00:00:32,650 Next lecture, I'm going to try to summarize what we've done 10 00:00:32,650 --> 00:00:37,160 so that I want to try to finish what we're going to 11 00:00:37,160 --> 00:00:39,870 finish for the term today. 12 00:00:39,870 --> 00:00:45,740 And that means we have a lot of topics that all get 13 00:00:45,740 --> 00:00:50,440 slightly squeezed in together, including the martingale 14 00:00:50,440 --> 00:00:54,200 convergence theorem and the strengthening of the strong 15 00:00:54,200 --> 00:00:57,870 law of large numbers and the Kolmogorov's submartingale 16 00:00:57,870 --> 00:01:01,610 inequalities and stopped martingales. 17 00:01:01,610 --> 00:01:06,100 So all of these are fairly major topics. 18 00:01:06,100 --> 00:01:09,610 One thing that it means is that we certainly aren't going 19 00:01:09,610 --> 00:01:13,200 to do much with the martingale convergence theorem. 20 00:01:13,200 --> 00:01:16,300 It's a major theorem and advanced work. 21 00:01:16,300 --> 00:01:19,130 What we're trying to do here is just give you 22 00:01:19,130 --> 00:01:20,870 some flavor of it. 23 00:01:20,870 --> 00:01:26,460 The other things, as we move up the chain there, we want to 24 00:01:26,460 --> 00:01:28,920 know more and more about it. 
25 00:01:28,920 --> 00:01:33,070 And the things on top certainly explain what's going 26 00:01:33,070 --> 00:01:34,320 on in the things below. 27 00:01:36,530 --> 00:01:40,280 OK, let's review what a martingale is. 28 00:01:40,280 --> 00:01:45,660 A sequence of random variables is a martingale if it 29 00:01:45,660 --> 00:01:48,680 satisfies this funny looking condition. 30 00:01:48,680 --> 00:01:52,010 When I write it this way, it's not completely obvious what 31 00:01:52,010 --> 00:01:52,660 it's saying. 32 00:01:52,660 --> 00:02:00,130 But I think we know now the expected value of one thing, 33 00:02:00,130 --> 00:02:03,900 of one random variable, given a set of random variables, is 34 00:02:03,900 --> 00:02:06,010 really a random variable in its own right. 35 00:02:06,010 --> 00:02:10,660 And that's a random variable, which is a function of those 36 00:02:10,660 --> 00:02:12,160 conditioning random variables. 37 00:02:12,160 --> 00:02:18,310 So expected value of Zn, given Zn minus 1 to Z1. 38 00:02:18,310 --> 00:02:19,520 It's a random variable. 39 00:02:19,520 --> 00:02:24,210 It maps each sample point to the conditional expectation of 40 00:02:24,210 --> 00:02:28,930 Zn, conditional on Z1 to Zn minus 1. 41 00:02:28,930 --> 00:02:31,710 For a martingale, this expectation has to be this 42 00:02:31,710 --> 00:02:34,450 random variable, Z sub n minus 1. 43 00:02:34,450 --> 00:02:38,280 It has to be the most recent random variable you've seen. 44 00:02:40,890 --> 00:02:45,810 And then last time, we proved this lemma, a pretty major 45 00:02:45,810 --> 00:02:49,870 lemma, which says, for a martingale, the expected value of 46 00:02:49,870 --> 00:02:57,750 Zn, given not all of the past, but just 47 00:02:57,750 --> 00:03:01,680 a certain number of elements of the past, Zn conditional on 48 00:03:01,680 --> 00:03:05,150 Zi, back to Z1. 49 00:03:05,150 --> 00:03:07,360 That's equal to Zi.
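Written out, the condition and the lemma just recalled read roughly as follows. This is a restatement in standard notation; the slide the professor is pointing at is not part of the transcript, so the exact symbols here are an assumption.

```latex
\begin{align*}
\mathsf{E}\bigl[Z_n \mid Z_{n-1}, Z_{n-2}, \dots, Z_1\bigr] &= Z_{n-1}
  && \text{(martingale condition)} \\
\mathsf{E}\bigl[Z_n \mid Z_i, Z_{i-1}, \dots, Z_1\bigr] &= Z_i
  && \text{for } 1 \le i < n \text{ (lemma from last time)}
\end{align*}
```

Both sides of each line are random variables: the left side is a function of the conditioning variables, and the right side is the most recent of them.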
50 00:03:07,360 --> 00:03:11,960 Expected value of Zn is equal to the expected value of Zi. 51 00:03:11,960 --> 00:03:15,330 We didn't talk about this one last time. 52 00:03:15,330 --> 00:03:18,150 But this is obvious in terms of this. 53 00:03:18,150 --> 00:03:23,230 If you take the expected value of this over all of the 54 00:03:23,230 --> 00:03:26,980 conditioning random variables, then what you get is just the 55 00:03:26,980 --> 00:03:28,940 expected value of Z sub i. 56 00:03:28,940 --> 00:03:31,070 So that's what this says. 57 00:03:31,070 --> 00:03:33,610 OK, question now. 58 00:03:33,610 --> 00:03:38,610 Why is it that if you condition on more random 59 00:03:38,610 --> 00:03:43,390 variables than just those in the past, if you condition on 60 00:03:43,390 --> 00:03:47,280 things into the future, including Z sub n itself, 61 00:03:47,280 --> 00:03:52,480 why is the expected value of Zn given Zm down 62 00:03:52,480 --> 00:03:54,720 to Zn down to Z1-- 63 00:03:54,720 --> 00:04:00,250 why is that equal to Zn and not equal to Zn minus 1? 64 00:04:00,250 --> 00:04:03,990 If you understand this, you understand what conditional 65 00:04:03,990 --> 00:04:05,720 expectations are. 66 00:04:05,720 --> 00:04:09,370 If you don't understand it, you've got to spend some time 67 00:04:09,370 --> 00:04:13,910 really thinking about what conditional expectations are. 68 00:04:13,910 --> 00:04:18,290 OK, how many people can see why this, in fact, has to be 69 00:04:18,290 --> 00:04:22,530 Zn and not something else? 70 00:04:22,530 --> 00:04:24,350 AUDIENCE: We know Zn already. 71 00:04:24,350 --> 00:04:25,550 PROFESSOR: What? 72 00:04:25,550 --> 00:04:27,570 AUDIENCE: Isn't it-- don't you know Zn already? 73 00:04:27,570 --> 00:04:28,640 PROFESSOR: That's right. 74 00:04:28,640 --> 00:04:31,810 Since you know Zn already, then you know it. 75 00:04:31,810 --> 00:04:34,750 That's exactly what it says.
76 00:04:34,750 --> 00:04:38,660 And we'll talk more about this later because we actually use 77 00:04:38,660 --> 00:04:42,490 this a bunch of times when we're going on. 78 00:04:42,490 --> 00:04:50,350 But what it says is that for any given value of each of 79 00:04:50,350 --> 00:04:54,910 these random variables, in other words, for a given value 80 00:04:54,910 --> 00:04:58,540 of Z sub n, the expected value of Z sub n-- 81 00:04:58,540 --> 00:05:01,410 I mean, if I just wrote it, it's what's the expected value 82 00:05:01,410 --> 00:05:05,190 of Z sub n, given Z sub n? 83 00:05:05,190 --> 00:05:07,020 What's the expected value of Z sub n-- 84 00:05:07,020 --> 00:05:09,660 I'll make it even easier. 85 00:05:09,660 --> 00:05:12,920 And you can see it more clearly. 86 00:05:12,920 --> 00:05:20,300 What's the expected value of Z sub n, given that Z sub n is 87 00:05:20,300 --> 00:05:25,620 equal to a particular value z sub n. 88 00:05:25,620 --> 00:05:29,340 OK, now I hope you can see why it is that this is 89 00:05:29,340 --> 00:05:31,780 equal to Z sub n. 90 00:05:31,780 --> 00:05:33,950 That's the only thing that the random 91 00:05:33,950 --> 00:05:35,400 variable Z sub n can be. 92 00:05:35,400 --> 00:05:41,540 For this sample point and the conditioning, we're assuming 93 00:05:41,540 --> 00:05:45,876 the sample value for which Z sub n is equal to little z sub 94 00:05:45,876 --> 00:05:49,060 n, and therefore that's what this random 95 00:05:49,060 --> 00:05:51,690 variable has to be. 96 00:05:51,690 --> 00:05:53,370 That's what you said in one sentence. 97 00:05:53,370 --> 00:05:56,040 And I'm saying it in five sentences. 98 00:05:56,040 --> 00:05:58,920 It's hard enough that you want to say it in five sentences 99 00:05:58,920 --> 00:06:02,170 because this is not an obvious thing. 100 00:06:02,170 --> 00:06:06,230 It's not an obvious thing until you really have a sense 101 00:06:06,230 --> 00:06:09,430 of what these conditional expectations mean. 
102 00:06:09,430 --> 00:06:15,640 And that's part of our function for the last couple 103 00:06:15,640 --> 00:06:18,610 of weeks, to sort out what those things mean because 104 00:06:18,610 --> 00:06:22,550 that's part of understanding what martingales are doing. 105 00:06:22,550 --> 00:06:28,410 OK, when we go one step further on this, we've said 106 00:06:28,410 --> 00:06:33,760 that the expected value of Z sub n, given Z sub i back to Z 107 00:06:33,760 --> 00:06:38,460 sub 1, is Z sub i, so the expected value of Z sub n, 108 00:06:38,460 --> 00:06:42,720 only given Z sub 1, is equal to Z sub 1. 109 00:06:42,720 --> 00:06:46,460 That's what this says, and the expected value of Z sub n then 110 00:06:46,460 --> 00:06:49,670 is equal to the expected value of Z sub 1. 111 00:06:49,670 --> 00:06:53,350 It says these marginal expectations are all the same. 112 00:06:53,350 --> 00:06:53,710 Yes? 113 00:06:53,710 --> 00:06:57,407 AUDIENCE: Why don't you just say that the expectation of 114 00:06:57,407 --> 00:06:59,626 the Zs is constant? 115 00:06:59,626 --> 00:07:03,835 I mean, it seems like this is kind of a roundabout way of 116 00:07:03,835 --> 00:07:06,020 saying that. 117 00:07:06,020 --> 00:07:08,480 PROFESSOR: Yeah, in a sense. 118 00:07:08,480 --> 00:07:10,210 But if you want to-- 119 00:07:10,210 --> 00:07:13,790 almost all the examples you can think of, it's hard to 120 00:07:13,790 --> 00:07:17,540 figure out what these Z sub n's are because the Z sub n's 121 00:07:17,540 --> 00:07:21,450 are given in terms of all of the previous random variables. 122 00:07:21,450 --> 00:07:25,050 If you want to sort out what a martingale is, and you can't 123 00:07:25,050 --> 00:07:29,870 even understand what Z sub 1 is, then you're in trouble.
124 00:07:29,870 --> 00:07:32,720 I mean, it makes the theorem more abstract to say the 125 00:07:32,720 --> 00:07:37,770 expected value of Z sub n is a constant random variable 126 00:07:37,770 --> 00:07:40,020 rather than saying which random variable it is. 127 00:07:40,020 --> 00:07:42,630 And it's obvious what random variable it is. 128 00:07:42,630 --> 00:07:45,290 I mean, Z1 is one example of it. 129 00:07:45,290 --> 00:07:46,230 So you're right. 130 00:07:46,230 --> 00:07:48,912 You could say it that way. 131 00:07:48,912 --> 00:07:56,660 OK, so we talked about a number of simple examples of 132 00:07:56,660 --> 00:07:59,330 martingales last time. 133 00:07:59,330 --> 00:08:03,960 All of these examples assume that the expected value of the 134 00:08:03,960 --> 00:08:08,640 magnitude of Z sub n is less than infinity for all n. 135 00:08:08,640 --> 00:08:13,370 Remember, this does not mean that the expected value of Z 136 00:08:13,370 --> 00:08:18,110 sub n is bounded over all n. 137 00:08:18,110 --> 00:08:22,290 I mean, the expected value of Z sub n 138 00:08:22,290 --> 00:08:23,380 can be 2 to the n. 139 00:08:23,380 --> 00:08:26,780 It can be shooting off to infinity very, very fast. 140 00:08:26,780 --> 00:08:29,690 But it still is finite for every value of n. 141 00:08:29,690 --> 00:08:31,550 That's what this assumption is. 142 00:08:31,550 --> 00:08:35,340 Later on, when we talk about the martingale convergence 143 00:08:35,340 --> 00:08:38,480 theorem, we'll assume that these expectations are 144 00:08:38,480 --> 00:08:43,729 bounded, which is a much, much stronger constraint. 145 00:08:43,729 --> 00:08:48,020 OK, we talked about these examples last time.
146 00:08:48,020 --> 00:08:51,500 All of them are pretty important because any time 147 00:08:51,500 --> 00:08:54,460 you're trying to do a problem with martingales, trying to 148 00:08:54,460 --> 00:08:58,440 prove something for martingales, what I like to do 149 00:08:58,440 --> 00:09:05,230 first is to look at all the simple examples I know and try 150 00:09:05,230 --> 00:09:08,140 to get some insight from them as to whether the result is 151 00:09:08,140 --> 00:09:10,402 true or whether it's not true. 152 00:09:10,402 --> 00:09:14,650 And if you can't see from the examples what's going on, then 153 00:09:14,650 --> 00:09:19,960 you're sort of stuck for the most part, unless you're lucky 154 00:09:19,960 --> 00:09:22,340 enough to construct the kind of proof you would 155 00:09:22,340 --> 00:09:24,670 find in a math book. 156 00:09:24,670 --> 00:09:28,250 Now math book theorem proofs are beautiful 157 00:09:28,250 --> 00:09:30,920 because they're the-- 158 00:09:30,920 --> 00:09:33,710 I mean, mathematicians work to get the shortest possible 159 00:09:33,710 --> 00:09:37,960 proof they can, which has no holes in it at all. 160 00:09:37,960 --> 00:09:40,190 And that makes it very elegant. 161 00:09:40,190 --> 00:09:43,150 But when you're trying to understand it, what is often 162 00:09:43,150 --> 00:09:47,340 done is somebody starts out understanding a theorem, and 163 00:09:47,340 --> 00:09:50,190 they write a proof which runs for three pages. 164 00:09:50,190 --> 00:09:52,580 And then they think about it for two weeks. 165 00:09:52,580 --> 00:09:54,450 They cut it down to one page. 166 00:09:54,450 --> 00:09:56,620 They think about it for another two weeks. 167 00:09:56,620 --> 00:09:58,610 They cut it down to half a page. 168 00:09:58,610 --> 00:10:00,000 And then they publish it. 169 00:10:00,000 --> 00:10:03,640 And all they do is publish the half page. 170 00:10:03,640 --> 00:10:05,050 Everybody is stuck. 
171 00:10:05,050 --> 00:10:09,190 Nobody knows where this came from. 172 00:10:09,190 --> 00:10:13,050 So what I'm trying to do here is, at least in some cases, to 173 00:10:13,050 --> 00:10:15,640 give you a little more than the half page to give you some 174 00:10:15,640 --> 00:10:19,280 idea of where these things come from, to the extent I can. 175 00:10:19,280 --> 00:10:22,340 OK, so the zero mean random walk-- 176 00:10:22,340 --> 00:10:29,080 if Z sub n is equal to the sum of the Xi, where the Xi are IID and zero 177 00:10:29,080 --> 00:10:34,220 mean, then this zero mean random walk is, in fact, a 178 00:10:34,220 --> 00:10:36,440 martingale; it just satisfies the conditions. 179 00:10:39,400 --> 00:10:44,830 This one is probably the most important of all of the simple 180 00:10:44,830 --> 00:10:50,160 examples because it says if Z sub n is a sum of random 181 00:10:50,160 --> 00:10:51,230 variables-- 182 00:10:51,230 --> 00:10:53,370 we don't know what the random variables are-- the 183 00:10:53,370 --> 00:10:56,400 condition on the random variables is that the expected 184 00:10:56,400 --> 00:11:00,345 value of X sub i, given all the previous ones, is 0. 185 00:11:03,100 --> 00:11:06,790 This is a general example because for every martingale in 186 00:11:06,790 --> 00:11:11,460 the world you can look at the increments of that martingale, 187 00:11:11,460 --> 00:11:21,850 namely you can define X sub n to be Z sub n minus 188 00:11:21,850 --> 00:11:24,130 Z sub n minus 1. 189 00:11:24,130 --> 00:11:28,010 And as soon as you do that, X sub n satisfies 190 00:11:28,010 --> 00:11:29,560 this condition here. 191 00:11:29,560 --> 00:11:35,040 And what you've got for sure is a martingale, so that this 192 00:11:35,040 --> 00:11:40,890 condition here, that the X sub i's are satisfying, is really 193 00:11:40,890 --> 00:11:44,300 the same as a martingale condition.
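The zero-mean random walk example can be checked exactly for a small case. The sketch below assumes the simplest such walk, with steps of plus or minus 1 that are equiprobable; the function names are made up for illustration. It conditions on the past steps X_1, ..., X_{n-1}, which carry the same information as Z_1, ..., Z_{n-1}.

```python
from fractions import Fraction
from itertools import product


def conditional_means(n):
    """Map each past (x_1, ..., x_{n-1}) of a fair +/-1 walk to E[Z_n | that past]."""
    half = Fraction(1, 2)
    result = {}
    for past in product((-1, 1), repeat=n - 1):
        z_prev = sum(past)  # Z_{n-1} is the sum of the past steps
        # Average Z_n = Z_{n-1} + X_n over the two equiprobable values of X_n.
        result[past] = sum(half * (z_prev + x) for x in (-1, 1))
    return result


def is_martingale_step(n):
    """True if E[Z_n | past] equals Z_{n-1} for every possible past."""
    return all(mean == sum(past) for past, mean in conditional_means(n).items())


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, is_martingale_step(n))
```

Because the step has mean zero given any past, the conditional mean of Z_n is just Z_{n-1}, which is the increment condition described above.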
194 00:11:44,300 --> 00:11:46,730 It's just that when people are talking about martingales, 195 00:11:46,730 --> 00:11:50,310 they're talking about the sums of random variables. 196 00:11:50,310 --> 00:11:52,780 Here we're just talking about the random variables 197 00:11:52,780 --> 00:11:53,670 themselves. 198 00:11:53,670 --> 00:11:56,950 It's like when we talk about sums of IID random variables, 199 00:11:56,950 --> 00:11:59,440 we prove the laws of large numbers and everything. 200 00:11:59,440 --> 00:12:02,090 What we're really talking about there 201 00:12:02,090 --> 00:12:04,120 is IID random variables. 202 00:12:04,120 --> 00:12:07,510 What we're really talking about here is these random 203 00:12:07,510 --> 00:12:11,430 variables, which have the property that no matter what's 204 00:12:11,430 --> 00:12:14,640 happened in the past, the expected value of this new 205 00:12:14,640 --> 00:12:17,210 random variable is equal to 0. 206 00:12:17,210 --> 00:12:19,570 People call these fair games. 207 00:12:19,570 --> 00:12:22,720 And they call martingales examples of fair games. 208 00:12:22,720 --> 00:12:26,710 And in a martingale, the expected value of Z sub n, you can 209 00:12:26,710 --> 00:12:28,710 think of it as your expected net 210 00:12:28,710 --> 00:12:31,330 worth at time n. 211 00:12:31,330 --> 00:12:34,880 And with these underlying random variables, the X sub i, 212 00:12:34,880 --> 00:12:43,100 the X sub i is, in a sense, your profit at time i. 213 00:12:43,100 --> 00:12:46,130 And what this says is your expected profit at time n is 214 00:12:46,130 --> 00:12:50,190 independent of everything in the past, independent of every 215 00:12:50,190 --> 00:12:53,680 sample value of everything in the past. 216 00:12:53,680 --> 00:12:58,210 And this is why people call it a fair game. 217 00:12:58,210 --> 00:13:04,100 It's really a very strong definition of a fair game. 218 00:13:04,100 --> 00:13:07,650 I mean, it's saying an awful lot.
219 00:13:07,650 --> 00:13:10,290 I mean, everybody says life is not fair. 220 00:13:10,290 --> 00:13:14,150 Surely by this definition, life is not even close to fair 221 00:13:14,150 --> 00:13:16,670 because when you look at all your past-- 222 00:13:16,670 --> 00:13:19,070 I mean, you try to learn from your past. 223 00:13:19,070 --> 00:13:23,540 This is saying in these kinds of gambling games, you can't 224 00:13:23,540 --> 00:13:25,150 learn from the past. 225 00:13:25,150 --> 00:13:30,660 You can't do anything with it so long as you're interested 226 00:13:30,660 --> 00:13:32,690 only in the expectation. 227 00:13:32,690 --> 00:13:38,000 The expectation of X sub i is equal to 0, no matter what all 228 00:13:38,000 --> 00:13:41,450 the earlier random variables are. 229 00:13:41,450 --> 00:13:44,970 This, I think, gives you a sense of what a martingale is, 230 00:13:44,970 --> 00:13:48,140 probably better than the original definition. 231 00:13:48,140 --> 00:13:50,340 OK, another one we talked about last time, 232 00:13:50,340 --> 00:13:52,110 this is very specific. 233 00:13:52,110 --> 00:13:55,900 Suppose that X sub i is equal to the product of two random 234 00:13:55,900 --> 00:13:59,070 variables, U sub i times Y sub i. 235 00:13:59,070 --> 00:14:05,870 The U sub i here are IID equiprobable, plus or minus 1. 236 00:14:05,870 --> 00:14:08,780 And the Y sub i's are independent of the U sub i's. 237 00:14:08,780 --> 00:14:11,120 They can be anything at all. 238 00:14:11,120 --> 00:14:14,610 And when you take these Y sub i's, which are anything at all, 239 00:14:14,610 --> 00:14:19,650 but these quantities here, which are IID plus 1 and minus 1, 240 00:14:19,650 --> 00:14:25,760 when you look at this product here, any positive number this 241 00:14:25,760 --> 00:14:28,420 can be is equiprobable with the 242 00:14:28,420 --> 00:14:30,680 corresponding negative number.
243 00:14:30,680 --> 00:14:34,100 And that means when you take the expectation of X sub i, 244 00:14:34,100 --> 00:14:38,670 given any old thing in the past, this U sub i is enough 245 00:14:38,670 --> 00:14:41,300 to make the expectation equal to 0. 246 00:14:41,300 --> 00:14:44,630 So this is a fairly strong kind of example also, which 247 00:14:44,630 --> 00:14:47,150 gives you a sense of what these things are. 248 00:14:47,150 --> 00:14:49,370 Product form martingales-- 249 00:14:49,370 --> 00:14:52,110 you use product form martingales primarily to find 250 00:14:52,110 --> 00:14:54,210 counterexamples to theorems. 251 00:14:54,210 --> 00:14:58,140 If you stated a theorem and it isn't true-- 252 00:14:58,140 --> 00:15:01,420 almost all the examples I know of reasonable martingale 253 00:15:01,420 --> 00:15:05,620 theorems which are not true, you look at a product 254 00:15:05,620 --> 00:15:10,030 martingale, and very often you look at this product 255 00:15:10,030 --> 00:15:17,090 martingale down here, and you find out either that the 256 00:15:17,090 --> 00:15:21,320 theorem is not true, which lets you stop looking at it. 257 00:15:21,320 --> 00:15:25,470 Or it says, well, it still isn't clear from that. 258 00:15:25,470 --> 00:15:31,120 OK, so the product form martingale, there's a sequence 259 00:15:31,120 --> 00:15:34,190 of IID unit-mean random variables. 260 00:15:34,190 --> 00:15:41,200 And Zn, which is the product, is then a martingale, if you 261 00:15:41,200 --> 00:15:43,940 assume this condition up here, of course. 262 00:15:43,940 --> 00:15:49,870 And this condition here, the probability that the n-th 263 00:15:49,870 --> 00:15:58,480 order product, namely the product of n of these sample 264 00:15:58,480 --> 00:16:06,290 values, if you get one 0, the product is 0. 265 00:16:06,290 --> 00:16:08,240 So you're done. 266 00:16:08,240 --> 00:16:12,630 So the only question is, do you get all 1s?
267 00:16:12,630 --> 00:16:16,140 Or do you get something other than all 1s? 268 00:16:16,140 --> 00:16:20,880 If you get all 1s, then the product of these random 269 00:16:20,880 --> 00:16:24,160 variables is 2 to the n. 270 00:16:24,160 --> 00:16:28,060 You get 2 to the n with probability 2 to the minus n. 271 00:16:28,060 --> 00:16:32,070 And you get zero with all the rest of the probability. 272 00:16:32,070 --> 00:16:35,660 The limit as n goes to infinity of Z sub n is equal 273 00:16:35,660 --> 00:16:42,360 to zero with probability 1, namely eventually 274 00:16:42,360 --> 00:16:43,600 you go down to 0. 275 00:16:43,600 --> 00:16:46,380 And you stay there forever after. 276 00:16:46,380 --> 00:16:51,220 And with this very small probability, you get to some 277 00:16:51,220 --> 00:16:53,110 humongous number. 278 00:16:53,110 --> 00:16:56,555 And you keep going up until eventually you lose, and you 279 00:16:56,555 --> 00:16:57,950 go down to 0. 280 00:16:57,950 --> 00:17:01,290 So the limit for the expected value of Z sub n-- 281 00:17:01,290 --> 00:17:06,900 for every n the expected value of Z sub n is equal to 1. 282 00:17:06,900 --> 00:17:09,609 That's what makes this example interesting. 283 00:17:09,609 --> 00:17:11,625 The limit of the expected value of the Z sub 284 00:17:11,625 --> 00:17:13,680 n's is equal to 1. 285 00:17:13,680 --> 00:17:19,190 But the Z sub n's themselves go to 0 with probability 1. 286 00:17:19,190 --> 00:17:23,200 And the reason for that is that you had this enormous 287 00:17:23,200 --> 00:17:26,540 growth here with very small probability. 288 00:17:26,540 --> 00:17:29,390 But it's enough to keep the expectation equal to 1. 289 00:17:33,050 --> 00:17:36,060 OK, then we started to talk about sub- and 290 00:17:36,060 --> 00:17:37,900 supermartingales.
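The product-form example can be tabulated directly. The sketch below assumes each factor is 2 or 0 with probability 1/2, which is consistent with the "2 to the n with probability 2 to the minus n" statement in the lecture but is not spelled out in the transcript; the function names are hypothetical.

```python
from fractions import Fraction


def product_martingale_law(n):
    """Distribution of Z_n = X_1 * ... * X_n, assuming each X_i is 2 or 0 w.p. 1/2."""
    p_all_twos = Fraction(1, 2 ** n)  # every one of the n factors equals 2
    return {2 ** n: p_all_twos, 0: 1 - p_all_twos}


def mean(law):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(value * prob for value, prob in law.items())


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        law = product_martingale_law(n)
        # The mean stays at 1 while the probability of sitting at 0 approaches 1.
        print(n, mean(law), float(law[0]))
```

The mean is pinned at 1 for every n, yet the mass at 0 tends to 1, which is the gap between the limit of the expectations and the limit of the sample values.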
291 00:17:37,900 --> 00:17:42,050 And I told you if you can't remember what the definition 292 00:17:42,050 --> 00:17:46,290 of a submartingale is in terms of is it less than or equal or 293 00:17:46,290 --> 00:17:49,570 greater than or equal, just remember that it's not 294 00:17:49,570 --> 00:17:51,020 what it should be. 295 00:17:51,020 --> 00:17:53,430 It's the opposite of what it should be. 296 00:17:53,430 --> 00:17:57,810 So the expected value of Z sub n given the past is greater 297 00:17:57,810 --> 00:18:00,220 than or equal to Z sub n minus 1. 298 00:18:00,220 --> 00:18:02,270 That means it's a submartingale. 299 00:18:02,270 --> 00:18:05,620 In other words, submartingales grow in time. 300 00:18:05,620 --> 00:18:09,300 Supermartingales shrink in time. 301 00:18:09,300 --> 00:18:10,540 And that's strange. 302 00:18:10,540 --> 00:18:13,080 But that's the way it is. 303 00:18:13,080 --> 00:18:17,870 If this quantity is a submartingale, then minus Zn 304 00:18:17,870 --> 00:18:21,150 is a supermartingale and vice versa. 305 00:18:21,150 --> 00:18:24,280 So I'm going to only talk about submartingales from now 306 00:18:24,280 --> 00:18:29,660 on because supermartingales just do everything the same, 307 00:18:29,660 --> 00:18:33,400 but just look at minus signs instead of plus signs. 308 00:18:33,400 --> 00:18:35,350 So why bother yourself with one thing 309 00:18:35,350 --> 00:18:37,450 extra to think about? 310 00:18:37,450 --> 00:18:41,870 So for submartingales, the expected value of Z sub n 311 00:18:41,870 --> 00:18:48,270 given the past, given part of the past from i down to 1, is 312 00:18:48,270 --> 00:18:50,940 greater than or equal to Z sub i. 313 00:18:50,940 --> 00:18:53,850 That's essentially the same as that theorem we've stated 314 00:18:53,850 --> 00:18:56,980 before, which said that for martingales, the expected 315 00:18:56,980 --> 00:19:02,780 values of Zn given Zi down to Z1 was equal to Z sub i. 
316 00:19:02,780 --> 00:19:06,160 You remember we proved that in detail last time because that 317 00:19:06,160 --> 00:19:08,680 was a crucially important theorem. 318 00:19:08,680 --> 00:19:13,080 You take that proof and you put this inequality in it 319 00:19:13,080 --> 00:19:17,640 instead of an equality, and it immediately gives you this. 320 00:19:17,640 --> 00:19:21,080 You just follow that proof step by step, putting an 321 00:19:21,080 --> 00:19:26,200 inequality in in place of an equality. 322 00:19:26,200 --> 00:19:29,900 And same thing here, the expected value of Z sub n is 323 00:19:29,900 --> 00:19:33,440 greater than or equal to the expected value of Z sub i. 324 00:19:33,440 --> 00:19:35,172 That's true for all i. 325 00:19:35,172 --> 00:19:39,700 So the expected value of Zn is also greater than or equal to 326 00:19:39,700 --> 00:19:41,400 the expected value of Z1. 327 00:19:41,400 --> 00:19:43,500 In other words, the expected values of these random 328 00:19:43,500 --> 00:19:47,740 variables, in fact, always grow. 329 00:19:47,740 --> 00:19:49,970 Or if they don't grow, they at least stay the same. 330 00:19:49,970 --> 00:19:51,220 They can't shrink. 331 00:19:55,080 --> 00:19:59,620 OK, we started to talk about convex functions last time. 332 00:19:59,620 --> 00:20:04,420 I want to remind you what they are. 333 00:20:04,420 --> 00:20:07,950 A function which carries the real numbers into the real 334 00:20:07,950 --> 00:20:12,940 numbers is convex if each tangent to the curve lies on 335 00:20:12,940 --> 00:20:13,870 or below the curve. 336 00:20:13,870 --> 00:20:15,450 Here's a picture of it. 337 00:20:15,450 --> 00:20:20,410 Here's a function h of x, a one-dimensional function. 338 00:20:20,410 --> 00:20:22,880 So you can draw x on the line. 339 00:20:22,880 --> 00:20:28,280 And h of x is something which goes up and down. 340 00:20:28,280 --> 00:20:30,680 And here's another example. 341 00:20:30,680 --> 00:20:34,030 H of x is equal to the magnitude of x.
342 00:20:34,030 --> 00:20:37,590 You're usually used to thinking of convex functions 343 00:20:37,590 --> 00:20:40,580 in terms of functions that have a positive second 344 00:20:40,580 --> 00:20:42,510 derivative. 345 00:20:42,510 --> 00:20:45,460 Taking the geometric view, you get something considerably 346 00:20:45,460 --> 00:20:49,900 more general because it includes all of these cases, 347 00:20:49,900 --> 00:20:52,490 as well as all of these cases. 348 00:20:52,490 --> 00:20:57,770 And this idea of tangents lying on or below the curve 349 00:20:57,770 --> 00:20:59,520 gives you the linkage here. 350 00:20:59,520 --> 00:21:06,390 You have something which comes down, goes to 0, and 351 00:21:06,390 --> 00:21:08,610 then goes up again. 352 00:21:08,610 --> 00:21:11,300 Think of drawing tangents to this curve. 353 00:21:11,300 --> 00:21:17,270 Tangents have to have a slope starting here and 354 00:21:17,270 --> 00:21:21,160 going around to here. 355 00:21:21,160 --> 00:21:23,820 And all of those tangents hit at that point there. 356 00:21:23,820 --> 00:21:26,800 So this is a very pathological thing. 357 00:21:26,800 --> 00:21:30,770 But all the tangents indeed do lie below the curve. 358 00:21:30,770 --> 00:21:32,885 So you've satisfied the condition. 359 00:21:32,885 --> 00:21:35,390 The lemma is Jensen's inequality. 360 00:21:35,390 --> 00:21:39,560 It says, if h is convex and Z is a random variable with 361 00:21:39,560 --> 00:21:45,220 finite expectation, then h of the expected value of Z is 362 00:21:45,220 --> 00:21:49,090 less than or equal to the expected value of h of Z. This 363 00:21:49,090 --> 00:21:52,400 seems sort of obvious, perhaps. 364 00:21:52,400 --> 00:21:55,700 It's one of those things which either seems obvious or it 365 00:21:55,700 --> 00:21:56,860 doesn't seem obvious. 366 00:21:56,860 --> 00:22:00,580 And if it doesn't seem obvious, it doesn't become 367 00:22:00,580 --> 00:22:03,270 obvious terribly easily.
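For a random variable with a handful of values, Jensen's inequality can be checked by direct arithmetic. The sketch below is only a numerical illustration, not a proof; the three-point distribution and the function name `jensen_gap` are made up for the example.

```python
def jensen_gap(values, probs, h):
    """Return E[h(Z)] - h(E[Z]) for a finite-valued Z; nonnegative when h is convex."""
    mean_z = sum(p * z for z, p in zip(values, probs))
    mean_hz = sum(p * h(z) for z, p in zip(values, probs))
    return mean_hz - h(mean_z)


if __name__ == "__main__":
    values = [-2.0, 1.0, 3.0]   # three sample values, as in the triangle picture
    probs = [0.5, 0.25, 0.25]   # one of many possible probability assignments
    print(jensen_gap(values, probs, abs))               # h(x) = |x|
    print(jensen_gap(values, probs, lambda x: x * x))   # h(x) = x^2
```

Putting all the probability on a single value makes the gap zero, which corresponds to a corner of the triangle sitting on the curve.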
368 00:22:03,270 --> 00:22:06,390 But what I want to do here is to convince you of why it's 369 00:22:06,390 --> 00:22:10,290 true by looking at a little triangle here. 370 00:22:15,080 --> 00:22:19,230 I can think of the random variable Z as having three 371 00:22:19,230 --> 00:22:20,450 possible values-- 372 00:22:20,450 --> 00:22:24,790 one over here where we get this point comes into the 373 00:22:24,790 --> 00:22:29,140 curve, one here, and one here. 374 00:22:29,140 --> 00:22:33,440 Now if I take those three possible values of x and I 375 00:22:33,440 --> 00:22:37,020 think of assigning all possible probability 376 00:22:37,020 --> 00:22:40,920 assignments to those three possible values, what happens? 377 00:22:40,920 --> 00:22:43,390 If I assign all the probability over here, I get 378 00:22:43,390 --> 00:22:44,460 that point. 379 00:22:44,460 --> 00:22:45,980 If I assign all the probability 380 00:22:45,980 --> 00:22:47,530 here, I get that point. 381 00:22:47,530 --> 00:22:49,110 If I assign all the probability 382 00:22:49,110 --> 00:22:50,440 here, I get that point. 383 00:22:50,440 --> 00:22:54,280 And for everything else, it lies inside that triangle. 384 00:22:54,280 --> 00:22:55,520 And you can convince yourselves 385 00:22:55,520 --> 00:22:58,410 of that pretty easily. 386 00:22:58,410 --> 00:23:01,190 So for all of those probability measures where the 387 00:23:01,190 --> 00:23:05,570 expected value lies on this line, what you get is 388 00:23:05,570 --> 00:23:16,600 something between this and this as the expected value of 389 00:23:16,600 --> 00:23:20,990 h of Z. So you get something above the line for the 390 00:23:20,990 --> 00:23:24,690 expected value for h. 
391 00:23:24,690 --> 00:23:28,780 For h of the expected value of Z, you just get this point right 392 00:23:28,780 --> 00:23:32,000 there, which is clearly smaller than anything you can 393 00:23:32,000 --> 00:23:40,990 generate out of that triangle or quadrilateral or any kind 394 00:23:40,990 --> 00:23:47,790 of straight line figure that you draw, which is the set of 395 00:23:47,790 --> 00:23:51,590 expected values you can get from probabilities using the 396 00:23:51,590 --> 00:23:52,995 points on that-- 397 00:23:56,470 --> 00:24:00,820 well, it's what I said it was. 398 00:24:00,820 --> 00:24:06,030 OK, Jensen's inequality leads to the following theorem. 399 00:24:06,030 --> 00:24:08,880 And I'm not going to prove it here in class. 400 00:24:08,880 --> 00:24:11,930 It's one of those theorems which is sort of 401 00:24:11,930 --> 00:24:15,230 obvious and not quite. 402 00:24:15,230 --> 00:24:18,710 So if you want to see the proof, you can look at it. 403 00:24:18,710 --> 00:24:22,040 Or if you want to, you can just believe it. 404 00:24:22,040 --> 00:24:27,530 If Z sub n is a martingale or it's a submartingale and if h 405 00:24:27,530 --> 00:24:32,860 is convex, and if the expected value of the magnitude of h of 406 00:24:32,860 --> 00:24:36,580 Zn is less than infinity for all n, 407 00:24:36,580 --> 00:24:42,180 then h of Zn is a submartingale. 408 00:24:42,180 --> 00:24:46,580 In other words, when you have that convex function, you go 409 00:24:46,580 --> 00:24:49,130 from something which is a martingale, which will be what 410 00:24:49,130 --> 00:24:53,180 you would get on the line to something which is bigger than 411 00:24:53,180 --> 00:25:00,540 that, so that what you get is the fact that expected value 412 00:25:00,540 --> 00:25:05,780 of h of Z is, in fact, growing with time, faster than the 413 00:25:05,780 --> 00:25:09,380 martingale itself is growing.
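The theorem can be illustrated on the fair plus-or-minus-1 random walk, which is a martingale. The sketch below covers only that martingale case and only the convex functions |z| and z squared. It conditions on the full past of the steps, and the inequality then carries over to conditioning on the coarser past of h(Z_1), ..., h(Z_{n-1}). The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_submartingale_step(h, n):
    """Check E[h(Z_n) | X_1..X_{n-1}] >= h(Z_{n-1}) for every past of a fair +/-1 walk."""
    half = Fraction(1, 2)
    for past in product((-1, 1), repeat=n - 1):
        z_prev = sum(past)
        # The next step is +1 or -1 with probability 1/2 each.
        cond_mean = half * h(z_prev + 1) + half * h(z_prev - 1)
        if cond_mean < h(z_prev):
            return False
    return True


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, is_submartingale_step(abs, n), is_submartingale_step(lambda z: z * z, n))
```

For a concave function such as minus |z| the check fails already at n equal to 1, since the conditional mean is minus 1 while the function at Z_0 = 0 is 0.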
414 00:25:09,380 --> 00:25:17,790 OK, so one example of this is if Z sub n is a martingale, 415 00:25:17,790 --> 00:25:22,700 then the absolute value of Z sub n is a submartingale. 416 00:25:22,700 --> 00:25:26,530 Submartingales need not be martingales. 417 00:25:26,530 --> 00:25:29,100 But martingales are submartingales also. 418 00:25:29,100 --> 00:25:32,990 So I don't have to keep saying that if it's a martingale or a 419 00:25:32,990 --> 00:25:34,260 submartingale. 420 00:25:34,260 --> 00:25:36,650 I can just say if it's a submartingale. 421 00:25:36,650 --> 00:25:43,430 This theorem is usually stated as, if Z sub n is a martingale, 422 00:25:43,430 --> 00:25:48,070 then h of Zn is a submartingale, which is true. 423 00:25:48,070 --> 00:25:52,950 But just as obviously, if Zn is a submartingale, which is 424 00:25:52,950 --> 00:25:56,970 more general, h of Zn is also a submartingale. 425 00:25:56,970 --> 00:25:59,750 So you don't get out of the realm of submartingales by 426 00:25:59,750 --> 00:26:02,760 taking convex functions. 427 00:26:02,760 --> 00:26:10,290 And you also get that Z sub n squared is a submartingale. 428 00:26:10,290 --> 00:26:13,110 And e to the rZn is a submartingale because all of 429 00:26:13,110 --> 00:26:15,410 those are convex functions. 430 00:26:15,410 --> 00:26:19,650 So when you want to look at any of those, you just go from 431 00:26:19,650 --> 00:26:24,560 talking about a martingale to talking about a submartingale. 432 00:26:24,560 --> 00:26:26,920 And life is easy again. 433 00:26:26,920 --> 00:26:30,050 OK, major topic-- stopped martingales. 434 00:26:30,050 --> 00:26:34,670 We've talked about stopping rules before. 435 00:26:34,670 --> 00:26:39,820 And a stopping rule, we were interested in stopping rules 436 00:26:39,820 --> 00:26:47,860 when we were mostly talking about renewal processes. 
437 00:26:47,860 --> 00:26:50,990 But stopping rules can be applied to any sequence of 438 00:26:50,990 --> 00:26:53,150 random variables at all. 
439 00:26:53,150 --> 00:26:57,140 What a stopping rule is, you remember, is you have a 440 00:26:57,140 --> 00:27:01,230 sequence of random variables, any kind of random variable. 441 00:27:01,230 --> 00:27:05,780 And a stopping rule is a rule where the time that you stop 442 00:27:05,780 --> 00:27:08,530 is determined by the things that you've seen up until the 443 00:27:08,530 --> 00:27:09,430 time that you stop. 444 00:27:09,430 --> 00:27:11,760 You can't peek at future values. 445 00:27:11,760 --> 00:27:16,230 You have to look at these sample values one by one as 446 00:27:16,230 --> 00:27:17,300 they arrive. 447 00:27:17,300 --> 00:27:20,910 And a stopping rule is something which, when you get 448 00:27:20,910 --> 00:27:24,010 to the point you want to stop, you know that you want to stop 449 00:27:24,010 --> 00:27:29,470 there from the sample values you've already observed. 450 00:27:29,470 --> 00:27:34,090 So when you're playing poker with somebody, which I think 451 00:27:34,090 --> 00:27:38,680 we talked about before, and you make a bet and you lose, 452 00:27:38,680 --> 00:27:40,410 you cannot withdraw your bet. 453 00:27:40,410 --> 00:27:42,320 You cannot say, I stopped! 454 00:27:42,320 --> 00:27:43,570 I stopped before! 455 00:27:50,020 --> 00:27:53,330 The time that you stop depends on what you've already seen up 456 00:27:53,330 --> 00:27:54,650 until the time that you stop. 457 00:27:58,470 --> 00:28:00,500 We talked about possibly defective 458 00:28:00,500 --> 00:28:02,580 random variables before. 459 00:28:02,580 --> 00:28:04,850 I realized I never defined a possibly 460 00:28:04,850 --> 00:28:07,890 defective random variable. 461 00:28:07,890 --> 00:28:11,710 In fact, somebody asked me afterwards if it could be just 462 00:28:11,710 --> 00:28:13,380 any old thing at all. 463 00:28:13,380 --> 00:28:14,580 And I said, no. 464 00:28:14,580 --> 00:28:17,050 And here's what it is. 
465 00:28:17,050 --> 00:28:24,940 It's a mapping from the sample space to a set of real values, 466 00:28:24,940 --> 00:28:26,620 to the extended real values. 467 00:28:26,620 --> 00:28:30,540 And it has the property that for a defective random 468 00:28:30,540 --> 00:28:34,780 variable, the mapping can give you plus infinity. 469 00:28:34,780 --> 00:28:36,680 Or it can give you minus infinity. 470 00:28:36,680 --> 00:28:39,120 And it can give you either one of those with positive 471 00:28:39,120 --> 00:28:42,820 probability rather than just 0 probability. 472 00:28:42,820 --> 00:28:47,410 So it applies to these cases where you have a threshold, a 473 00:28:47,410 --> 00:28:49,950 single threshold, for a random walk. 474 00:28:49,950 --> 00:28:52,770 And you might cross the threshold, or you might never 475 00:28:52,770 --> 00:28:54,090 cross the threshold. 476 00:28:54,090 --> 00:28:57,890 So it applies to conditions where sometimes you stop and 477 00:28:57,890 --> 00:29:01,000 sometimes you just keep on going forever. 478 00:29:01,000 --> 00:29:04,530 So it's nice for that kind of situation. 479 00:29:04,530 --> 00:29:07,770 The other provisos a random variable back when we defined 480 00:29:07,770 --> 00:29:11,540 random variables still hold for a possibly defective 481 00:29:11,540 --> 00:29:12,370 random variable. 482 00:29:12,370 --> 00:29:15,100 So you have a distribution function. 483 00:29:15,100 --> 00:29:17,740 It's just the distribution function doesn't necessarily 484 00:29:17,740 --> 00:29:21,460 go to 1, and it doesn't necessarily start at 0. 485 00:29:21,460 --> 00:29:25,270 It could be somewhere in between. 486 00:29:25,270 --> 00:29:29,720 OK, so a stop process for a possibly defective stopping 487 00:29:29,720 --> 00:29:38,460 time satisfies Z sub n star, which is Z sub n star is the 488 00:29:38,460 --> 00:29:39,710 stopping time. 489 00:29:45,320 --> 00:29:49,790 Let me start on that. 
490 00:29:49,790 --> 00:29:52,300 We have now defined stopping rules. 491 00:29:52,300 --> 00:29:55,280 We now want to define a stopped process. 492 00:29:55,280 --> 00:29:59,810 A stopped process is a process which runs along until you 493 00:29:59,810 --> 00:30:01,640 decide you're going to stop. 494 00:30:01,640 --> 00:30:04,720 But before when we stopped, the game was over and nothing 495 00:30:04,720 --> 00:30:05,930 else happened. 496 00:30:05,930 --> 00:30:10,050 Here, the idea is that the game continues forever. 497 00:30:10,050 --> 00:30:12,790 But you stop playing, OK? 498 00:30:12,790 --> 00:30:16,920 So the sequence of random variables continues forever. 499 00:30:16,920 --> 00:30:20,000 But the random variable of interest to you is this 500 00:30:20,000 --> 00:30:27,240 quantity Z sub n star, where from the point you stopped, 501 00:30:27,240 --> 00:30:31,160 all subsequent Z sub n stars are just equal to 502 00:30:31,160 --> 00:30:33,090 that stopped value. 503 00:30:33,090 --> 00:30:36,480 So you're talking about some kind of gambling game now, 504 00:30:36,480 --> 00:30:40,110 perhaps, where you play for a while. 505 00:30:40,110 --> 00:30:41,630 And you're playing some game where the 506 00:30:41,630 --> 00:30:43,580 game continues forever. 507 00:30:43,580 --> 00:30:45,100 And you make your bets according to 508 00:30:45,100 --> 00:30:47,010 some strange algorithm. 509 00:30:47,010 --> 00:30:50,320 And when you've made $10, you say, that's 510 00:30:50,320 --> 00:30:51,530 all I want to make. 511 00:30:51,530 --> 00:30:53,780 I'm happy with that. 512 00:30:53,780 --> 00:30:57,600 And I'm not going to become the kind of gambler who can 513 00:30:57,600 --> 00:30:58,280 never stop. 514 00:30:58,280 --> 00:31:00,760 So I'm going to stop at that point. 515 00:31:00,760 --> 00:31:06,010 So your capital remains $10 forever after, although the 516 00:31:06,010 --> 00:31:08,170 game keeps on going. 
517 00:31:08,170 --> 00:31:11,170 And if you start out with a capital of $10 and you lose it 518 00:31:11,170 --> 00:31:15,080 all and you can't borrow anything, then you stop also 519 00:31:15,080 --> 00:31:17,680 when your capital becomes minus $10. 520 00:31:17,680 --> 00:31:20,590 So you can see that this is a useful thing when you're 521 00:31:20,590 --> 00:31:24,470 talking about threshold crossings because when you 522 00:31:24,470 --> 00:31:28,040 have a random walk and you have a threshold crossing, you 523 00:31:28,040 --> 00:31:29,840 can stop at that point. 524 00:31:29,840 --> 00:31:32,550 And then, you just stay there forever after. 525 00:31:32,550 --> 00:31:35,330 But if you cross the other threshold, you stay there 526 00:31:35,330 --> 00:31:36,770 forever after. 527 00:31:36,770 --> 00:31:39,405 And that makes it very convenient because you can 528 00:31:39,405 --> 00:31:43,380 look at what happens out at infinity as a way of saying 529 00:31:43,380 --> 00:31:46,500 what the value of a game was at the time you stopped. 530 00:31:46,500 --> 00:31:49,170 So you don't have to worry about what was the value of 531 00:31:49,170 --> 00:31:51,810 the game at the stopping point. 532 00:31:51,810 --> 00:31:53,950 You can keep on going forever. 533 00:31:53,950 --> 00:31:59,200 And you can talk about what the stopped process is. 534 00:31:59,200 --> 00:32:01,430 And my guess-- 535 00:32:01,430 --> 00:32:04,530 or if you've read ahead a little bit, 536 00:32:04,530 --> 00:32:06,770 which I hope you have-- 537 00:32:06,770 --> 00:32:09,960 you will know that the main theorem here is that if you 538 00:32:09,960 --> 00:32:15,510 start out with a Martingale and you stop it someplace, you 539 00:32:15,510 --> 00:32:17,690 still have a Martingale. 
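The stopped process can be sketched in a few lines of code. This is an illustration under assumptions that are not in the lecture: the game is a fair random walk with steps of +1 or -1, and the thresholds are +2 and -3. The names `stopped_path` and `expected_stopped_values` are hypothetical. The sketch freezes each path at its first threshold crossing, then enumerates every path of length n to compute E[Z*_k] exactly. If stopping a martingale really leaves a martingale, those means should all equal E[Z_1] = 0.

```python
from fractions import Fraction
from itertools import product

def stopped_path(path, upper, lower):
    """Return Z*_1..Z*_n: follow the path until it first hits a threshold, then freeze."""
    out, frozen = [], None
    for z in path:
        if frozen is None:
            out.append(z)
            if z >= upper or z <= lower:
                frozen = z          # the stopping rule only uses values seen so far
        else:
            out.append(frozen)      # the game goes on, but the stopped value does not move
    return out

def expected_stopped_values(n, upper=2, lower=-3):
    """Exact E[Z*_k], k = 1..n, for a fair +/-1 walk, by enumerating all 2^n paths."""
    totals = [Fraction(0)] * n
    for steps in product((-1, 1), repeat=n):
        path, z = [], 0
        for s in steps:
            z += s
            path.append(z)
        for i, v in enumerate(stopped_path(path, upper, lower)):
            totals[i] += Fraction(v, 2 ** n)
    return totals

print(stopped_path([1, 2, 3, 2], upper=2, lower=-3))
print(expected_stopped_values(8))
```

A constant mean is only a necessary condition for a martingale, so this is a sanity check on the theorem, not a substitute for the proof.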
540 00:32:17,690 --> 00:32:21,570 In other words, if you add stopping as one of your 541 00:32:21,570 --> 00:32:25,450 options in gambling and it's a fair game, if you can find a 542 00:32:25,450 --> 00:32:30,010 fair game any place, and you stop, then it's 543 00:32:30,010 --> 00:32:30,960 still a fair game. 544 00:32:30,960 --> 00:32:34,220 The stopped process is still a fair game. 545 00:32:34,220 --> 00:32:40,070 And that's as it should be because if it's a fair game, 546 00:32:40,070 --> 00:32:42,010 you should be able to stop. 547 00:32:42,010 --> 00:32:46,300 So for example, for a given gambling strategy, Zn is the 548 00:32:46,300 --> 00:32:49,770 net worth at time n. 549 00:32:49,770 --> 00:32:52,610 You can modify that to stop when Zn 550 00:32:52,610 --> 00:32:54,380 reaches some given value. 551 00:32:54,380 --> 00:32:58,620 So the stopped process remains at that value forever. 552 00:32:58,620 --> 00:33:01,790 And Zn follows the original strategy. 553 00:33:01,790 --> 00:33:03,800 Here's the main theorem here. 554 00:33:03,800 --> 00:33:09,020 If j is a possibly defective stopping rule for a Martingale 555 00:33:09,020 --> 00:33:14,250 or a sub-Martingale Z sub n, n greater than or equal to 1, 556 00:33:14,250 --> 00:33:19,390 then the stopped process, Zn star, is a Martingale if the 557 00:33:19,390 --> 00:33:21,980 original process is a Martingale and it's a 558 00:33:21,980 --> 00:33:26,420 sub-Martingale if the original process is a sub-Martingale. 559 00:33:26,420 --> 00:33:28,770 And the proof is the following. 560 00:33:28,770 --> 00:33:32,670 You can almost say this looks obvious. 561 00:33:32,670 --> 00:33:35,090 If it looks obvious to you, you 562 00:33:35,090 --> 00:33:37,750 should admire your intuition. 563 00:33:37,750 --> 00:33:41,190 If it doesn't look obvious to you, you should admire your 564 00:33:41,190 --> 00:33:43,570 mathematical insight. 
565 00:33:43,570 --> 00:33:49,660 And either way, the kind of intuition is that before 566 00:33:49,660 --> 00:33:55,006 stopping occurs, Z sub n star is equal to Z sub n. 567 00:33:55,006 --> 00:33:59,360 And after you stop, Z sub n star is constant. 568 00:33:59,360 --> 00:34:02,170 So it satisfies a Martingale condition because it's not 569 00:34:02,170 --> 00:34:05,790 going up and it's not going down. 570 00:34:05,790 --> 00:34:10,310 But in fact, when you try to think through the whole thing, 571 00:34:10,310 --> 00:34:11,560 it's not quite enough. 572 00:34:14,630 --> 00:34:17,899 It's the kind of thing where after you look at it for a 573 00:34:17,899 --> 00:34:20,260 while, you say, yes, it has to be true. 574 00:34:20,260 --> 00:34:20,960 But why? 575 00:34:20,960 --> 00:34:23,409 And you can't explain why it's true. 576 00:34:23,409 --> 00:34:26,260 I'm going to go through the proof here. 577 00:34:26,260 --> 00:34:30,540 And mostly the reason is that the proof I have in the notes, 578 00:34:30,540 --> 00:34:31,960 I can't understand it anymore. 579 00:34:35,330 --> 00:34:38,145 Well, I can sort of understand it when I correct a 580 00:34:38,145 --> 00:34:39,909 few errors in it. 581 00:34:39,909 --> 00:34:44,550 But I think this proof gives you a much better 582 00:34:44,550 --> 00:34:45,670 idea why it's true. 583 00:34:45,670 --> 00:34:49,080 And I think you can follow it in real time. 584 00:34:49,080 --> 00:34:53,320 Whereas that proof, I couldn't follow it in real time, or 585 00:34:53,320 --> 00:34:55,310 fairly extended time. 586 00:34:55,310 --> 00:34:59,070 OK, so this stop process, I can express it in the 587 00:34:59,070 --> 00:34:59,950 following way. 588 00:34:59,950 --> 00:35:03,730 And let me try to explain why this is. 
589 00:35:03,730 --> 00:35:09,950 If your stopping rule tells you that you stop at time m 590 00:35:09,950 --> 00:35:14,050 for a particular sample sequence, this indicator 591 00:35:14,050 --> 00:35:23,020 function, i sub j equals m, this function here is 1 for 592 00:35:23,020 --> 00:35:27,410 all sample sequences for which you stop at time m. 593 00:35:27,410 --> 00:35:32,070 And it's 0 for all other sequences. 594 00:35:32,070 --> 00:35:37,400 So the value of this stopped process at time n is going to 595 00:35:37,400 --> 00:35:43,560 be the value at which it stopped, which is Zm, when you 596 00:35:43,560 --> 00:35:47,910 have this indicator function, which is j equals m. 597 00:35:47,910 --> 00:35:52,350 And if it hasn't stopped yet, it's going to be Z sub n, 598 00:35:52,350 --> 00:35:55,360 which is what it really is. 599 00:35:55,360 --> 00:35:56,250 And it hasn't stopped. 600 00:35:56,250 --> 00:36:00,080 So the stopped process hasn't yet stopped. 601 00:36:00,080 --> 00:36:04,730 So Z sub n star is equal to Z sub n. 602 00:36:04,730 --> 00:36:12,930 OK, so as far as the magnitude is concerned, the magnitude of 603 00:36:12,930 --> 00:36:17,350 Z sub n star is going to be less than or equal to the sum of 604 00:36:17,350 --> 00:36:19,610 those magnitudes. 605 00:36:19,610 --> 00:36:22,600 And the sum of those magnitudes, you can just 606 00:36:22,600 --> 00:36:26,630 ignore the indicator functions because they're either 0 or 1. 607 00:36:26,630 --> 00:36:28,690 So we can upper bound them by 1. 608 00:36:28,690 --> 00:36:34,410 So the magnitude of Z sub n star is less than or equal to the sum over m 609 00:36:34,410 --> 00:36:37,930 less than n of the magnitude of Z sub m plus the magnitude of Z sub n. 610 00:36:40,800 --> 00:36:45,220 And this means that the expected magnitude of Z sub n star 611 00:36:45,220 --> 00:36:48,920 has to be less than infinity because what it is in this 612 00:36:48,920 --> 00:36:52,790 bound here is a sum of finite numbers. 
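The expression being pointed to on the slide is not captured in the transcript. A reconstruction from the spoken description, with J denoting the stopping time and the indicator on the last term written as J >= n, would read roughly as follows. The notation is an assumption and may differ from the slide.

```latex
% Stopped process at time n: frozen at Z_m if stopping occurred at some m < n,
% otherwise still equal to the unstopped value Z_n.
Z_n^{*} \;=\; \sum_{m<n} Z_m \,\mathbb{I}_{\{J=m\}} \;+\; Z_n \,\mathbb{I}_{\{J \ge n\}}

% Bounding each indicator by 1 gives the magnitude bound used in the argument:
|Z_n^{*}| \;\le\; \sum_{m<n} |Z_m| \;+\; |Z_n|,
\qquad\text{so}\qquad
\mathsf{E}\big[|Z_n^{*}|\big] \;\le\; \sum_{m<n} \mathsf{E}\big[|Z_m|\big] + \mathsf{E}\big[|Z_n|\big] \;<\; \infty .
```

Exactly one of the indicators equals 1 on any sample sequence, which is why the sum picks out a single term.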
613 00:36:52,790 --> 00:36:56,430 When you take a finite sum-- this is a finite sum. 614 00:36:56,430 --> 00:36:59,110 There are only n plus 1 terms in it. 615 00:36:59,110 --> 00:37:03,090 When you take a finite sum of finite numbers, you get 616 00:37:03,090 --> 00:37:04,590 something finite. 617 00:37:04,590 --> 00:37:08,500 So expected value of Z sub n is--. 618 00:37:18,437 --> 00:37:18,920 Excuse me. 619 00:37:18,920 --> 00:37:23,050 I was a little bit quick about that. 620 00:37:23,050 --> 00:37:28,200 The expected value of Z sub n star is now less than or equal 621 00:37:28,200 --> 00:37:32,110 to the expected value of each of the Z sub n's plus the 622 00:37:32,110 --> 00:37:34,360 expected value of Z sub n. 623 00:37:34,360 --> 00:37:37,950 Since the Z process is a Martingale, you know that all 624 00:37:37,950 --> 00:37:40,780 of those expected values are finite. 625 00:37:40,780 --> 00:37:43,520 And since all of those expected values are finite, 626 00:37:43,520 --> 00:37:49,140 the expected value of Z sub n star is finite as we said. 627 00:37:49,140 --> 00:37:57,180 OK, so let's try to trace out what happens if we look at the 628 00:37:57,180 --> 00:38:02,310 expected value of Z sub n star conditional on the past 629 00:38:02,310 --> 00:38:05,230 history up until time n minus 1. 630 00:38:05,230 --> 00:38:10,550 We'll rewrite Z sub n star in terms of this expression here. 631 00:38:10,550 --> 00:38:17,420 So it's the sum over m less than n of the expected value 632 00:38:17,420 --> 00:38:22,810 of the stopping point if the stopping point was equal to m 633 00:38:22,810 --> 00:38:27,070 plus the expected value of Z sub n if the stopping point 634 00:38:27,070 --> 00:38:28,430 was greater than n. 635 00:38:28,430 --> 00:38:31,560 So we just want to analyze all of those terms. 636 00:38:31,560 --> 00:38:32,910 So we look at them. 637 00:38:32,910 --> 00:38:35,710 There's nothing complicated about it. 
638 00:38:35,710 --> 00:38:40,310 The expected value of this term here, expected value of 639 00:38:40,310 --> 00:38:47,720 Zm times the indicator of j equals m, given Z1 up to Zn minus 1. 640 00:38:47,720 --> 00:38:51,920 And now, we're going to be child-like about it. 641 00:38:51,920 --> 00:38:58,800 And we're going to assume a particular sample sequence, 642 00:38:58,800 --> 00:39:01,730 which is equal to little z 1 up to little z n minus 1. 643 00:39:01,730 --> 00:39:04,460 What is this expected value here? 644 00:39:04,460 --> 00:39:08,120 It has to be little z sub m if j is equal to m. 645 00:39:08,120 --> 00:39:09,370 Why is that? 646 00:39:09,370 --> 00:39:12,130 That's the argument I was just going through before. 647 00:39:12,130 --> 00:39:17,240 What's the expected value of a random variable conditional on 648 00:39:17,240 --> 00:39:19,950 the random variable, that same random variable, having a 649 00:39:19,950 --> 00:39:22,160 particular value? 650 00:39:22,160 --> 00:39:25,650 The fact that you're given a large number of these 651 00:39:25,650 --> 00:39:28,190 quantities doesn't make any difference. 652 00:39:28,190 --> 00:39:31,440 The main thing that you're given here is the value of Z 653 00:39:31,440 --> 00:39:40,030 sub m being little z sub m, which says this quantity is 654 00:39:40,030 --> 00:39:43,710 equal to little z sub m if j is equal to m. 655 00:39:43,710 --> 00:39:47,710 That's equal to 0 if j is unequal to m, which in fact is 656 00:39:47,710 --> 00:39:53,780 just equal to Zm times the indicator function of j equals m. 657 00:39:53,780 --> 00:39:56,340 OK, so you-- 658 00:39:56,340 --> 00:40:00,140 and the same thing happens for the indicator 659 00:40:00,140 --> 00:40:03,370 function of j equals n. 660 00:40:03,370 --> 00:40:07,260 This should be j greater than or equal to n. 661 00:40:07,260 --> 00:40:10,350 This is Zn minus 1. 662 00:40:10,350 --> 00:40:12,220 And we add these things up. 
663 00:40:12,220 --> 00:40:17,790 And what we get, finally, is this sum here. 
664 00:40:17,790 --> 00:40:21,010 And now, you look at this last term here, which is a 665 00:40:21,010 --> 00:40:23,150 combination of here and here. 666 00:40:23,150 --> 00:40:27,340 And this is just the indicator function for j greater than or 667 00:40:27,340 --> 00:40:29,210 equal to n minus 1. 668 00:40:29,210 --> 00:40:34,280 And if you look back at the definition of Z sub n star, 669 00:40:34,280 --> 00:40:37,720 this is just Z star of n minus 1. 670 00:40:40,650 --> 00:40:44,080 I see a lot of blank faces. 671 00:40:44,080 --> 00:40:46,680 But this is the kind of thing you almost 672 00:40:46,680 --> 00:40:47,930 have to look at twice. 673 00:40:50,490 --> 00:40:52,790 So we'll let it go with that. 674 00:40:52,790 --> 00:40:58,390 So this shows that the expected value of Z sub n star 675 00:40:58,390 --> 00:41:05,010 given the past of the original process is equal to Z 676 00:41:05,010 --> 00:41:06,810 star n minus 1. 677 00:41:06,810 --> 00:41:08,650 That's not quite what you want. 678 00:41:08,650 --> 00:41:15,460 You want the expected value of Zn star given Z star of n 679 00:41:15,460 --> 00:41:20,700 minus 1 to be equal to Z star n minus 1. 680 00:41:20,700 --> 00:41:25,450 In other words, you want to be able to replace this quantity 681 00:41:25,450 --> 00:41:29,080 in here with Z star n minus 1. 682 00:41:29,080 --> 00:41:31,770 And the question is, how do you do that? 683 00:41:31,770 --> 00:41:34,160 And that's what bothered me about the proof in the notes 684 00:41:34,160 --> 00:41:37,480 because it didn't even talk about that. 685 00:41:37,480 --> 00:41:46,000 So the argument is Z star n minus 1 in the past is really 686 00:41:46,000 --> 00:41:49,780 a function of the past of the original process. 687 00:41:49,780 --> 00:41:52,640 If I give you the sample values of the original 688 00:41:52,640 --> 00:41:56,840 process, you can tell me where the process stops. 
689 00:41:56,840 --> 00:42:00,550 And you can say what the stop process is both before and 690 00:42:00,550 --> 00:42:02,370 after that point. 691 00:42:02,370 --> 00:42:07,860 So the stop values are a function of 692 00:42:07,860 --> 00:42:09,600 the unstopped values. 693 00:42:09,600 --> 00:42:13,590 So now what I can do is for every sample point of the 694 00:42:13,590 --> 00:42:20,330 original process leading to a given sequence of the stop 695 00:42:20,330 --> 00:42:26,320 process, we're going to have expected value Zn star given 696 00:42:26,320 --> 00:42:31,040 these values here as equal to Z star n minus 1. 697 00:42:31,040 --> 00:42:35,350 And since that's true for all of the values for which this 698 00:42:35,350 --> 00:42:40,300 leads to that, this is true also. 699 00:42:40,300 --> 00:42:45,480 So that proves it. 700 00:42:45,480 --> 00:42:50,080 I'm doing this primarily because I think you ought to 701 00:42:50,080 --> 00:42:53,350 be tortured with at least one proof in every lecture. 702 00:42:53,350 --> 00:42:59,230 And the other thing is the proof in the notes was not 703 00:42:59,230 --> 00:43:00,110 quite sufficient. 704 00:43:00,110 --> 00:43:03,080 So I wanted to add to it here. 705 00:43:03,080 --> 00:43:05,680 So now, you have a proof of it. 706 00:43:05,680 --> 00:43:13,350 Consequences of the theorem, this is for sub-Martingales, 707 00:43:13,350 --> 00:43:19,210 the marginal expected values of the stopped process lies in 708 00:43:19,210 --> 00:43:23,190 between the expected value of Z1 and the expected 709 00:43:23,190 --> 00:43:24,130 value of Z sub n. 710 00:43:24,130 --> 00:43:28,590 In other words, when you take the stop process, it in some 711 00:43:28,590 --> 00:43:34,270 sense is intermediate between what happens at time 1 and 712 00:43:34,270 --> 00:43:39,150 what happens for the original process at time n. 713 00:43:39,150 --> 00:43:43,610 It can't grow any faster than the original process. 
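As a rough check of this consequence, one can take a concrete submartingale and compute the three marginal means exactly. The sketch below assumes Z_k = |S_k| for a fair random walk S_k with steps of +1 or -1, and a rule that stops the first time Z_k reaches a threshold a. Both choices are illustrative and not from the lecture, and `marginal_means` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

def marginal_means(n, a=2):
    """Return (E[Z_1], E[Z*_n], E[Z_n]) for Z_k = |S_k|, stopping when Z_k >= a."""
    weight = Fraction(1, 2 ** n)            # every +/-1 path of length n is equally likely
    e_first = e_stopped = e_unstopped = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        s, stopped_value = 0, None
        for k, step in enumerate(steps, start=1):
            s += step
            z = abs(s)
            if k == 1:
                e_first += weight * z
            if stopped_value is None and z >= a:
                stopped_value = z           # freeze at the first crossing
        z_star = stopped_value if stopped_value is not None else abs(s)
        e_stopped += weight * z_star
        e_unstopped += weight * abs(s)
    return e_first, e_stopped, e_unstopped

for n in range(1, 9):
    lo, mid, hi = marginal_means(n)
    print(n, lo, mid, hi, lo <= mid <= hi)
```

The middle value should sit between the other two for every n, which is the statement that the stopped process cannot grow faster than the original one.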
714 00:43:43,610 --> 00:43:47,520 This is also almost intuitively obvious. 715 00:43:47,520 --> 00:43:51,900 And it's proven in section 7.8. 716 00:43:51,900 --> 00:43:55,230 So you can find it there. 717 00:43:55,230 --> 00:43:57,880 It's quite a bit easier than the proof I just went through. 718 00:43:57,880 --> 00:44:01,920 The proof I just went through was a fairly difficult and 719 00:44:01,920 --> 00:44:03,610 fairly tricky proof. 720 00:44:03,610 --> 00:44:07,850 Partly I went through that proof because everything we do 721 00:44:07,850 --> 00:44:09,520 from now on-- 722 00:44:09,520 --> 00:44:11,335 we're not going to do an awful lot of things. 723 00:44:17,610 --> 00:44:20,560 But the Martingale convergence theorem, the strong law of 724 00:44:20,560 --> 00:44:23,890 large numbers, and all of the other results we're talking 725 00:44:23,890 --> 00:44:27,860 about all depend critically on that theorem that 726 00:44:27,860 --> 00:44:29,410 we just went through. 727 00:44:29,410 --> 00:44:31,680 In other words, it's a really major theorem. 728 00:44:31,680 --> 00:44:35,650 It's not trivial little thing. 729 00:44:35,650 --> 00:44:38,440 OK, this one is fairly major, too. 730 00:44:38,440 --> 00:44:40,520 But it follows very easily from the other one. 731 00:44:47,970 --> 00:44:52,150 Do I want to talk about this, this generating function 732 00:44:52,150 --> 00:44:53,400 product of the Martingale? 733 00:44:57,030 --> 00:44:58,450 No. 734 00:44:58,450 --> 00:44:59,840 Let's let that go. 735 00:44:59,840 --> 00:45:02,260 Let's not-- 736 00:45:02,260 --> 00:45:03,510 not that important. 737 00:45:07,150 --> 00:45:10,920 So this is that, too. 738 00:45:10,920 --> 00:45:13,262 No, I guess I better go back to that. 739 00:45:16,360 --> 00:45:17,610 We need it. 740 00:45:19,470 --> 00:45:22,260 OK, let's look at the generating function product 741 00:45:22,260 --> 00:45:25,620 Martingale that we had for a random walk. 
742 00:45:25,620 --> 00:45:31,740 So X sub n is a sequence of IID random variables. 743 00:45:31,740 --> 00:45:37,030 The partial sums form the variables of a random walk. 744 00:45:37,030 --> 00:45:42,260 Sn is a random walk where Sn is a sum. 745 00:45:42,260 --> 00:45:49,870 For any r such that gamma of r exists, we then define Z sub n 746 00:45:49,870 --> 00:45:53,630 to be a Martingale, this Martingale here. 747 00:45:53,630 --> 00:45:57,660 That's called a generating function Martingale. 748 00:45:57,660 --> 00:45:59,275 Zn is a Martingale. 749 00:45:59,275 --> 00:46:02,060 The expected value of Zn is equal to 1. 750 00:46:02,060 --> 00:46:05,350 You can see immediately from this that the expected value 751 00:46:05,350 --> 00:46:06,850 of Zn is equal to 1. 752 00:46:06,850 --> 00:46:09,780 You don't need any of the theory we've gone through 753 00:46:09,780 --> 00:46:15,950 because the expected value of the e to the r Sn is what? 754 00:46:15,950 --> 00:46:19,880 It's a moment generating function to the n-th power. 755 00:46:19,880 --> 00:46:21,920 That's just this term here. 756 00:46:21,920 --> 00:46:25,480 So this has to be equal to 1 for all n. 757 00:46:25,480 --> 00:46:28,140 The fact that this is a Martingale comes from that 758 00:46:28,140 --> 00:46:30,600 example of product form Martingale 759 00:46:30,600 --> 00:46:31,590 that we went through. 760 00:46:31,590 --> 00:46:38,370 So there's nothing very sophisticated here. 761 00:46:38,370 --> 00:46:42,920 OK, so if we assume that gamma of r exists and we let Zn be 762 00:46:42,920 --> 00:46:50,090 this Martingale, well, this is just what we said before. 763 00:46:50,090 --> 00:46:51,550 So you see it here. 764 00:46:51,550 --> 00:46:55,860 Let j be the non-defective stopping time that stops when 765 00:46:55,860 --> 00:47:02,090 either alpha greater than 0 or beta less than 0. 
766 00:47:02,090 --> 00:47:07,360 Since this is a stopping time, the expected value of 767 00:47:07,360 --> 00:47:13,590 Zn star is equal to 1 for all n greater than or equal to 1. 768 00:47:13,590 --> 00:47:21,060 And the limit as n goes to infinity of Z sub n star is then 769 00:47:21,060 --> 00:47:27,550 going to be equal to the process at the 770 00:47:27,550 --> 00:47:29,110 time where you stopped. 771 00:47:29,110 --> 00:47:30,880 After you stop, you stay the same. 772 00:47:30,880 --> 00:47:32,830 So you never move. 773 00:47:32,830 --> 00:47:35,890 And the expected value of Z sub j is just 774 00:47:35,890 --> 00:47:36,990 this quantity here. 775 00:47:36,990 --> 00:47:39,970 What does that look like? 776 00:47:39,970 --> 00:47:44,180 That's the Wald identity coming up again. 777 00:47:44,180 --> 00:47:48,420 That's the Wald identity coming up for a random walk 778 00:47:48,420 --> 00:47:50,120 with two thresholds. 779 00:47:50,120 --> 00:47:53,760 The nice thing about doing it this way is you can see that 780 00:47:53,760 --> 00:47:56,770 the proof applies to many other situations. 781 00:47:56,770 --> 00:48:00,380 You can have almost any stopping rule you want. 782 00:48:00,380 --> 00:48:05,570 And you still get the Wald identity. 783 00:48:05,570 --> 00:48:11,230 So it has a much more general form than we had before. 784 00:48:11,230 --> 00:48:16,065 This business here, the limit of Z sub n star going to Z sub 785 00:48:16,065 --> 00:48:18,710 j is a little fishy. 786 00:48:21,960 --> 00:48:24,260 The proof in the notes is fine. 787 00:48:24,260 --> 00:48:28,480 This limit does in fact equal this limit. 788 00:48:28,480 --> 00:48:32,880 What's bizarre is that the expected value of this limit 789 00:48:32,880 --> 00:48:37,490 is not necessarily equal to the expected value of Z sub j. 
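The finite-n part of this claim is easy to check numerically. The sketch below is an illustration, not the lecture's derivation. It assumes steps of +1 with probability p and -1 with probability 1 - p, thresholds alpha = 2 and beta = -2, and writes the generating function product Martingale as Z_k = exp(r S_k) / g(r)^k, where g(r) is the moment generating function of one step (so gamma(r) = ln g(r)). The function name and parameter values are hypothetical. It computes E[Z*_n] by enumerating all paths.

```python
import math
from itertools import product

def mean_stopped_product_martingale(n, r, p=0.4, alpha=2, beta=-2):
    """E[Z*_n] for Z_k = exp(r*S_k) / g(r)**k, frozen once S_k >= alpha or S_k <= beta."""
    q = 1.0 - p
    g = p * math.exp(r) + q * math.exp(-r)       # moment generating function of one step
    total = 0.0
    for steps in product((1, -1), repeat=n):
        prob, s, z_star = 1.0, 0, None
        for k, step in enumerate(steps, start=1):
            prob *= p if step == 1 else q
            s += step
            if z_star is None and (s >= alpha or s <= beta):
                z_star = math.exp(r * s) / g ** k    # value at the stopping time J = k
        if z_star is None:                            # not yet stopped by time n
            z_star = math.exp(r * s) / g ** n
        total += prob * z_star
    return total

for n in (1, 4, 8):
    print(n, mean_stopped_product_martingale(n, r=0.3))
```

This only confirms E[Z*_n] = 1 for each finite n. It says nothing about interchanging the limit and the expectation, which is the delicate step flagged here.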
790 00:48:37,490 --> 00:48:41,890 So make a note to yourselves that if you ever want to use 791 00:48:41,890 --> 00:48:47,530 Wald's identity in this more general case, think carefully 792 00:48:47,530 --> 00:48:51,570 about what's going on as you go to the limit. 793 00:48:51,570 --> 00:48:54,340 Because it can be a little bit tricky there. 794 00:48:54,340 --> 00:48:58,440 It's not always what it looks like. 795 00:48:58,440 --> 00:49:02,970 OK, so we're on to Mr. Kolmogorov again. 796 00:49:02,970 --> 00:49:08,860 Kolmogorov was the guy who did so many things in the subject. 797 00:49:08,860 --> 00:49:12,890 Most important, he said for a firm foundation to start with, 798 00:49:12,890 --> 00:49:16,180 he was the one that said you really need a model. 799 00:49:16,180 --> 00:49:18,970 And you really need some axioms. 800 00:49:18,970 --> 00:49:21,365 And then, he went on with all these other neat things that 801 00:49:21,365 --> 00:49:24,510 we've talked about from time to time. 802 00:49:24,510 --> 00:49:30,280 His sub-Martingale inequality is a fairly simple result. 803 00:49:30,280 --> 00:49:34,430 But this follows after that stopping theorem that we've 804 00:49:34,430 --> 00:49:36,670 just talked about. 805 00:49:36,670 --> 00:49:39,750 And everything else depends on this. 806 00:49:39,750 --> 00:49:42,180 So there's a sort of chain that runs through this whole 807 00:49:42,180 --> 00:49:43,570 development. 808 00:49:43,570 --> 00:49:46,660 And if you go further in Martingales, you find that 809 00:49:46,660 --> 00:49:51,230 this is just an absolutely major theorem which comes up 810 00:49:51,230 --> 00:49:52,870 all the time. 811 00:49:52,870 --> 00:49:56,490 And it's wonderful because it's so simple. 812 00:49:56,490 --> 00:50:01,940 OK, so let's let Z sub n be a non-negative sub-Martingale. 
813 00:50:05,120 --> 00:50:10,050 And then, for any positive integer m and any number a 814 00:50:10,050 --> 00:50:17,120 bigger than 0, the probability that the maximum of the first 815 00:50:17,120 --> 00:50:23,070 m terms of this sub-Martingale is greater than or equal to the 816 00:50:23,070 --> 00:50:29,840 quantity a is less than or equal to the expected value of Z sub m divided by a. 817 00:50:29,840 --> 00:50:36,290 This looks like the Markov inequality. 818 00:50:36,290 --> 00:50:41,500 If instead of taking the maximum from 1 to m we just 819 00:50:41,500 --> 00:50:44,740 look at Z sub m, the probability that Z sub m is 820 00:50:44,740 --> 00:50:48,910 greater than or equal to a, then we get less than or equal 821 00:50:48,910 --> 00:50:53,510 to the expected value of Z sub m divided by a. 822 00:50:53,510 --> 00:50:57,060 So what this is saying is it's really strengthening that 823 00:50:57,060 --> 00:51:01,140 Markov inequality and saying, you don't have to restrict 824 00:51:01,140 --> 00:51:03,670 yourself to Z sub m. 825 00:51:03,670 --> 00:51:07,680 You can instead look at all of the terms up until m. 826 00:51:07,680 --> 00:51:11,990 And this bound here you get in the Markov inequality really 827 00:51:11,990 --> 00:51:18,610 covers the maximum of all of those terms to start out with, 828 00:51:18,610 --> 00:51:21,650 which says that for any m, you can look at the maximum over 829 00:51:21,650 --> 00:51:25,140 an enormous number of terms if you want to. 830 00:51:25,140 --> 00:51:29,530 And it does this nice thing. 831 00:51:29,530 --> 00:51:33,080 OK, I'm going to prove this also. 832 00:51:33,080 --> 00:51:36,070 But this proof is simple. 833 00:51:36,070 --> 00:51:39,800 So you can follow it in real time, I think. 834 00:51:39,800 --> 00:51:44,720 So we want to start out with letting j be the stopping 835 00:51:44,720 --> 00:51:53,110 time, which is essentially the smallest term where you've 836 00:51:53,110 --> 00:51:55,290 crossed the threshold at a. 
837 00:51:55,290 --> 00:51:58,550 And if you haven't crossed a threshold at a, then it's 838 00:51:58,550 --> 00:52:01,270 equal to the last term. 839 00:52:01,270 --> 00:52:03,340 So here's the specific stopping rule that 840 00:52:03,340 --> 00:52:04,680 we're going to use. 841 00:52:04,680 --> 00:52:10,280 If Zn is greater than or equal to a for any n up to m, then j is the 842 00:52:10,280 --> 00:52:13,750 smallest n for which Zn is greater than or equal to a. 843 00:52:13,750 --> 00:52:17,835 It's the first time at which we've crossed that threshold. 844 00:52:17,835 --> 00:52:24,460 If Zn is less than a for all n up until m, then we make j 845 00:52:24,460 --> 00:52:25,700 equal to m. 846 00:52:25,700 --> 00:52:28,820 So we're insisting on stopping at some point. 847 00:52:28,820 --> 00:52:31,010 This is not a defective stopping rule. 848 00:52:31,010 --> 00:52:34,520 It's a real stopping rule because you've set the limit 849 00:52:34,520 --> 00:52:37,010 on how far you want to look. 850 00:52:37,010 --> 00:52:41,780 OK, so the process has to stop by time m. 851 00:52:41,780 --> 00:52:45,820 The value of the process at the time you stop-- 852 00:52:45,820 --> 00:52:49,120 remember this thing we've called Z sub j, which is the 853 00:52:49,120 --> 00:52:52,010 value at the stopping time. 854 00:52:52,010 --> 00:52:55,770 Z sub j is greater than or equal to a if you stopped 855 00:52:55,770 --> 00:52:57,020 before time m. 856 00:52:59,600 --> 00:53:07,740 And Z sub n is-- 857 00:53:07,740 --> 00:53:11,020 well, we're saying that Z sub j is greater than or equal to 858 00:53:11,020 --> 00:53:16,300 a if and only if Zn is greater than or equal to a for some n 859 00:53:16,300 --> 00:53:18,810 less than or equal to m. 860 00:53:18,810 --> 00:53:25,066 If you haven't crossed a threshold by time m, then Z 861 00:53:25,066 --> 00:53:29,690 sub j is equal to Z sub m. 862 00:53:29,690 --> 00:53:32,400 But it's not above a.
863 00:53:32,400 --> 00:53:36,460 So the stopping time is this largest possible value that 864 00:53:36,460 --> 00:53:41,430 we've got until the process stops by time m. 865 00:53:41,430 --> 00:53:45,470 Zj greater than or equal to a if and only if we've crossed a 866 00:53:45,470 --> 00:53:48,620 threshold for some n less than or equal to m. 867 00:53:48,620 --> 00:53:53,710 So the probability that we've crossed the threshold for some n from 1 868 00:53:53,710 --> 00:53:58,400 to m is equal to the probability that Z sub j is 869 00:53:58,400 --> 00:54:01,750 greater than or equal to a, which is less than or equal to 870 00:54:01,750 --> 00:54:05,470 the expected value of Z sub j divided by a. 871 00:54:05,470 --> 00:54:09,320 Since the process must be stopped by time m, we have Z 872 00:54:09,320 --> 00:54:12,450 sub j is equal to Z sub m star. 873 00:54:12,450 --> 00:54:16,800 And the stopped process at time m, Z sub m star, is 874 00:54:16,800 --> 00:54:17,980 less than or equal-- 875 00:54:17,980 --> 00:54:22,450 expected value of the stopped process is less than or equal 876 00:54:22,450 --> 00:54:24,750 to the expected value of the original process. 877 00:54:24,750 --> 00:54:27,200 Why is that? 878 00:54:27,200 --> 00:54:32,830 That's that theorem we just proved somewhere. 879 00:54:32,830 --> 00:54:35,190 Yeah, this one here, OK? 880 00:54:35,190 --> 00:54:37,365 That's the submartingale consequence. 881 00:54:42,540 --> 00:54:43,050 OK. 882 00:54:43,050 --> 00:54:45,850 So that completes the proof. 883 00:54:51,630 --> 00:54:52,946 And it's 10:30 now. 884 00:54:56,620 --> 00:55:01,130 So the Kolmogorov submartingale inequality is 885 00:55:01,130 --> 00:55:04,800 really a strengthening of the Markov inequality. 886 00:55:04,800 --> 00:55:10,250 So you get this extra soup to nuts form of it. 887 00:55:10,250 --> 00:55:14,220 The Chebyshev inequality can be strengthened in the same way. 888 00:55:14,220 --> 00:55:18,480 That's called a Kolmogorov inequality also.
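The submartingale inequality proved above can be checked numerically on a small case. The following is a minimal sketch, not part of the lecture: it assumes the non-negative submartingale Z_n = |S_n|, where S_n is a simple symmetric random walk with steps of plus or minus 1 (this choice, the function name, and the values m = 10 and a = 4 are illustrative assumptions). It enumerates all 2^m equally likely paths exactly, so no sampling error is involved.

```python
from itertools import product


def check_submartingale_inequality(m, a):
    """Compare both sides of the inequality for Z_n = |S_n| (assumed example).

    Returns (Pr{max_{1<=n<=m} Z_n >= a}, E[Z_m] / a), computed exactly by
    enumerating all 2**m equally likely +/-1 step sequences.
    """
    hits = 0        # number of paths whose running maximum of |S_n| reaches a
    total_zm = 0    # sum of |S_m| over all paths, used for E[Z_m]
    paths = 0
    for steps in product((-1, 1), repeat=m):
        s = 0
        running_max = 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        hits += running_max >= a
        total_zm += abs(s)
        paths += 1
    return hits / paths, total_zm / paths / a


# Illustrative values: look at the first 10 terms with threshold a = 4.
lhs, rhs = check_submartingale_inequality(m=10, a=4)
print(f"Pr{{max Z_n >= a}} = {lhs:.4f}  <=  E[Z_m]/a = {rhs:.4f}")
```

The left side is the probability of ever crossing the threshold in the first m terms, and the right side is the Markov-style bound that uses only the final term Z_m.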
889 00:55:18,480 --> 00:55:21,890 Kolmogorov just got in here before anybody else and he 890 00:55:21,890 --> 00:55:24,120 took those axioms that he made up-- 891 00:55:24,120 --> 00:55:25,740 and he was a smart guy-- 892 00:55:25,740 --> 00:55:27,400 and he developed this whole school of 893 00:55:27,400 --> 00:55:29,970 probability in Russia. 894 00:55:29,970 --> 00:55:33,750 And along with that, since he had these original results, he 895 00:55:33,750 --> 00:55:37,450 just almost wiped up the field before anybody else knew what 896 00:55:37,450 --> 00:55:38,530 was going on. 897 00:55:38,530 --> 00:55:41,630 Partly a consequence of the fact that mathematicians in 898 00:55:41,630 --> 00:55:44,340 most other parts of the world didn't believe that there was 899 00:55:44,340 --> 00:55:48,500 any good probability theory going on in Russia. 900 00:55:48,500 --> 00:55:50,630 So they weren't really conscious of this until he 901 00:55:50,630 --> 00:55:54,770 cleaned up the whole field. 902 00:55:54,770 --> 00:55:58,370 So if you want to be a famous mathematician, you should move 903 00:55:58,370 --> 00:56:03,270 away from the US and go to Upper Turkestan or something. 904 00:56:03,270 --> 00:56:05,920 And you then clean up the whole field the same way that 905 00:56:05,920 --> 00:56:07,750 Kolmogorov did. 906 00:56:07,750 --> 00:56:09,000 OK. 907 00:56:11,390 --> 00:56:15,190 So the strengthening of the Kolmogorov inequality. 908 00:56:15,190 --> 00:56:20,270 What the result says is let Zn be a submartingale with the 909 00:56:20,270 --> 00:56:23,850 expected value of Zn squared less than infinity. 910 00:56:23,850 --> 00:56:29,070 Then the probability that the maximum of these terms, up to 911 00:56:29,070 --> 00:56:33,790 m, is greater than or equal to b, is less than or equal to 912 00:56:33,790 --> 00:56:39,190 the expected value of Zm squared divided by b squared.
913 00:56:39,190 --> 00:56:43,300 You'll notice that that's almost the same thing as the 914 00:56:43,300 --> 00:56:45,960 submartingale inequality. 915 00:56:45,960 --> 00:56:49,250 This one says the probability that the maximum of Z sub i is 916 00:56:49,250 --> 00:56:52,090 greater than or equal to a is less than or equal to the expected value of Z sub m over a. 917 00:56:52,090 --> 00:57:01,420 And this one says the probability that the maximum Z 918 00:57:01,420 --> 00:57:04,860 sub n is greater than or equal to b is less than or equal to the 919 00:57:04,860 --> 00:57:09,230 expected value of Zm squared over b squared. 920 00:57:09,230 --> 00:57:11,690 If you can't prove this, go back and look at the proof of 921 00:57:11,690 --> 00:57:12,940 the Chebyshev inequality. 922 00:57:15,090 --> 00:57:24,170 The proof of this given the submartingale inequality is 923 00:57:24,170 --> 00:57:28,570 exactly the same as the proof of the Chebyshev inequality 924 00:57:28,570 --> 00:57:29,890 given the Markov inequality. 925 00:57:29,890 --> 00:57:36,180 You just go through the same steps and it's fairly simple. 926 00:57:36,180 --> 00:57:36,680 OK. 927 00:57:36,680 --> 00:57:40,810 So that is a nice result. 928 00:57:40,810 --> 00:57:44,930 What happens if you apply this to a random walk? 929 00:57:44,930 --> 00:57:50,080 If you apply it to a random walk, what you do is replace 930 00:57:50,080 --> 00:57:56,330 this Z sub n with the sum of random variables in the 931 00:57:56,330 --> 00:57:59,990 random walk, minus n times the mean of those random variables. 932 00:57:59,990 --> 00:58:04,470 We have seen that a zero-mean random walk is a martingale. 933 00:58:04,470 --> 00:58:08,100 So what we're going to do next is to use that zero-mean 934 00:58:08,100 --> 00:58:13,070 random walk as the martingale, and then what this says is the 935 00:58:13,070 --> 00:58:15,960 probability that the maximum, from 1, less 936 00:58:15,960 --> 00:58:17,250 than or equal to n-- 937 00:58:17,250 --> 00:58:22,280 less than or equal to m, of Sn minus nX bar.
938 00:58:22,280 --> 00:58:26,090 That's Zn because we're subtracting off the mean. 939 00:58:26,090 --> 00:58:29,020 The probability of that is greater than or equal to, and 940 00:58:29,020 --> 00:58:31,530 we just give b another name-- 941 00:58:31,530 --> 00:58:33,410 m times epsilon-- 942 00:58:33,410 --> 00:58:36,090 and it's less than or equal to the expected 943 00:58:36,090 --> 00:58:38,700 value of Z sub m squared. 944 00:58:38,700 --> 00:58:41,250 What's the expected value of this quantity squared? 945 00:58:48,480 --> 00:58:52,260 It's m times sigma squared. 946 00:58:52,260 --> 00:58:58,800 Because S sub m is just the sum of m IID random variables, 947 00:58:58,800 --> 00:59:01,455 which have variance sigma squared. 948 00:59:01,455 --> 00:59:06,670 So you take the expected value of this quantity squared, and 949 00:59:06,670 --> 00:59:12,400 it's m times the variance of X. So this then becomes sigma 950 00:59:12,400 --> 00:59:18,110 squared times m divided by m squared times epsilon squared, 951 00:59:18,110 --> 00:59:19,470 and the m cancels out. 952 00:59:22,700 --> 00:59:27,130 This gives you the Chebyshev inequality with the extra 953 00:59:27,130 --> 00:59:33,640 feature that it deals with the whole sum from 1 up to m. 954 00:59:33,640 --> 00:59:36,170 Now, you look at this and you say, gee. 955 00:59:36,170 --> 00:59:40,210 Wouldn't it be absolutely wonderful if instead of going 956 00:59:40,210 --> 00:59:44,570 from 1 to m, this went from m to infinity? 957 00:59:44,570 --> 00:59:47,940 Because then you'd be saying the maximum of these terms-- 958 00:59:47,940 --> 00:59:52,270 the maximum is less than or equal to something. 959 00:59:52,270 --> 00:59:55,280 And you'd have the strong law of large numbers all sitting 960 00:59:55,280 --> 00:59:57,320 there for you.
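Written out, the random-walk bound just described looks roughly as follows. This is a sketch under two assumptions that are not spoken explicitly: the magnitude signs (the usual form of the Kolmogorov inequality bounds the maximum of the absolute value), and the substitutions Z_n = S_n - n X-bar and b = m times epsilon.

```latex
\Pr\Bigl\{\max_{1\le n\le m} \bigl|S_n - n\bar{X}\bigr| \ge m\epsilon\Bigr\}
  \;\le\; \frac{\mathrm{E}\bigl[Z_m^2\bigr]}{(m\epsilon)^2}
  \;=\; \frac{m\sigma^2}{m^2\epsilon^2}
  \;=\; \frac{\sigma^2}{m\epsilon^2}.
```

The middle step uses the fact that Z_m is a sum of m IID zero-mean terms, each of variance sigma squared, so its second moment is m times sigma squared.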
961 00:59:57,320 --> 01:00:02,820 And life was not that good, but almost as good, because we 962 01:00:02,820 --> 01:00:09,320 can now do the strong law of large numbers assuming only 963 01:00:09,320 --> 01:00:12,950 IID random variables with the variance. 964 01:00:12,950 --> 01:00:17,390 So we're going to use that expression that we just did-- 965 01:00:17,390 --> 01:00:19,770 we're going to plug it into what we need for the strong 966 01:00:19,770 --> 01:00:21,630 law of large numbers. 967 01:00:21,630 --> 01:00:24,560 Again, I'm going to give you the idea of the proof of that. 968 01:00:24,560 --> 01:00:27,600 I wasn't going to do that, but I looked at the proof in the 969 01:00:27,600 --> 01:00:33,020 notes, and I had trouble understanding that too. 970 01:00:33,020 --> 01:00:36,040 You understand, I have a problem here. 971 01:00:36,040 --> 01:00:40,010 I write things two years ago, I look at them now. 972 01:00:40,010 --> 01:00:43,130 I have a bad memory, so I have trouble understanding them. 973 01:00:43,130 --> 01:00:46,650 So I recreate a new proof, which looks obvious to me now 974 01:00:46,650 --> 01:00:49,280 because I've done it right now, and in two years, it 975 01:00:49,280 --> 01:00:51,040 might look just as difficult. 976 01:00:51,040 --> 01:00:54,710 So if you look at this half proof here and you can't 977 01:00:54,710 --> 01:00:59,140 understand it, let me know, and I'll go back to the 978 01:00:59,140 --> 01:01:03,430 drawing board and work on something else. 979 01:01:03,430 --> 01:01:03,940 OK. 980 01:01:03,940 --> 01:01:10,860 So the theorem says let X sub i be a sequence of IID random 981 01:01:10,860 --> 01:01:14,905 variables with mean x bar and standard deviation sigma less 982 01:01:14,905 --> 01:01:16,170 than infinity. 983 01:01:16,170 --> 01:01:19,550 So I'm trying to do the strong law of large numbers before we 984 01:01:19,550 --> 01:01:21,200 assume the fourth moment. 
985 01:01:21,200 --> 01:01:23,740 Here, I'm only assuming a second moment. 986 01:01:23,740 --> 01:01:26,150 If you work really hard, you can do it with the first 987 01:01:26,150 --> 01:01:28,590 absolute moment. 988 01:01:28,590 --> 01:01:28,970 OK. 989 01:01:28,970 --> 01:01:33,600 Let S sub n be the sum of n random variables, then for 990 01:01:33,600 --> 01:01:36,650 any epsilon-- 991 01:01:36,650 --> 01:01:38,570 oh, I don't need an epsilon there. 992 01:01:38,570 --> 01:01:40,270 I don't know where that epsilon came from. 993 01:01:44,870 --> 01:01:46,900 Just cross that out. 994 01:01:46,900 --> 01:01:48,930 It doesn't belong. 995 01:01:48,930 --> 01:01:52,870 The probability that the limit, as n goes to infinity, 996 01:01:52,870 --> 01:01:56,640 of S n over n is equal to X bar. 997 01:01:56,640 --> 01:01:59,940 The probability of that event-- 998 01:01:59,940 --> 01:02:06,520 event happens for a whole bunch of sample sequences. 999 01:02:06,520 --> 01:02:08,380 It doesn't happen for others. 1000 01:02:08,380 --> 01:02:11,610 And this says that the probability of the class of 1001 01:02:11,610 --> 01:02:14,070 infinite length sequences for which that 1002 01:02:14,070 --> 01:02:16,000 happens is equal to 1. 1003 01:02:16,000 --> 01:02:18,980 That's the statement of the strong law of large numbers 1004 01:02:18,980 --> 01:02:21,180 that we had before. 1005 01:02:21,180 --> 01:02:26,410 It says that the probability of the set of sequences for 1006 01:02:26,410 --> 01:02:30,860 which the sample average approaches the mean, the 1007 01:02:30,860 --> 01:02:34,950 probability of that set of sequences is equal to 1. 1008 01:02:34,950 --> 01:02:35,280 OK. 1009 01:02:35,280 --> 01:02:38,630 So the idea of the proof is going to be 1010 01:02:38,630 --> 01:02:39,880 the following thing. 1011 01:02:43,560 --> 01:02:47,650 And what I'm going to use is this Chebyshev inequality 1012 01:02:47,650 --> 01:02:49,440 we've already done.
1013 01:02:49,440 --> 01:02:54,790 But since Chebyshev inequality, in this new form-- 1014 01:02:58,160 --> 01:03:04,220 namely the Kolmogorov inequality-- only goes up to m, 1015 01:03:04,220 --> 01:03:06,430 what I'm going to do is look at successively 1016 01:03:06,430 --> 01:03:07,970 larger values of m. 1017 01:03:07,970 --> 01:03:11,650 So I'm going to try to crawl my way up on infinity by 1018 01:03:11,650 --> 01:03:15,110 taking first a short length, then a longer length, then a 1019 01:03:15,110 --> 01:03:17,280 longer length, and then a longer length. 1020 01:03:17,280 --> 01:03:22,640 So I'm going to take this quantity here, which was the 1021 01:03:22,640 --> 01:03:26,100 quantity in the Kolmogorov submartingale inequality. 1022 01:03:26,100 --> 01:03:29,860 I'm going to ask what's the probability that the union of 1023 01:03:29,860 --> 01:03:35,170 all of these things, from m equals some k, which I'm going 1024 01:03:35,170 --> 01:03:37,110 to let go to infinity later, what's the 1025 01:03:37,110 --> 01:03:40,470 probability of this union? 1026 01:03:40,470 --> 01:03:44,390 And the terms in the union, instead of going from 1 1027 01:03:44,390 --> 01:03:50,370 to m, I want to replace m by 2 to the m. 1028 01:03:50,370 --> 01:03:52,630 The maximum of this. 1029 01:03:52,630 --> 01:03:56,090 The probability that this is greater than or equal to 2 to 1030 01:03:56,090 --> 01:03:58,250 the m times epsilon-- 1031 01:03:58,250 --> 01:03:59,755 that's the biggest term times epsilon. 1032 01:04:02,950 --> 01:04:05,590 I want to see that this is less than or 1033 01:04:05,590 --> 01:04:07,720 equal to this quantity. 1034 01:04:07,720 --> 01:04:12,500 Now, why is this less than or equal to that? 1035 01:04:12,500 --> 01:04:13,470 AUDIENCE: Union bound. 1036 01:04:13,470 --> 01:04:14,310 PROFESSOR: What? 1037 01:04:14,310 --> 01:04:15,110 AUDIENCE: Union bound. 1038 01:04:15,110 --> 01:04:16,260 PROFESSOR: Union bound, yes.
1039 01:04:16,260 --> 01:04:17,970 That's all it is. 1040 01:04:17,970 --> 01:04:21,270 I've just applied the union bound to this. 1041 01:04:21,270 --> 01:04:25,830 This is less than or equal to the probability of this for m 1042 01:04:25,830 --> 01:04:33,390 equals k, plus the probability of this for m equals k plus 1, 1043 01:04:33,390 --> 01:04:34,890 and so forth. 1044 01:04:34,890 --> 01:04:39,800 Each of these terms is sigma squared over 2 to the m times 1045 01:04:39,800 --> 01:04:41,370 epsilon squared. 1046 01:04:41,370 --> 01:04:45,230 That's what we had on the last page, I hope. 1047 01:04:45,230 --> 01:04:45,570 Yes. 1048 01:04:45,570 --> 01:04:49,190 Sigma squared over m times epsilon squared. 1049 01:04:49,190 --> 01:04:53,220 Remember, we replaced m by 2 to the m, so this has changed 1050 01:04:53,220 --> 01:04:55,600 in that way. 1051 01:04:55,600 --> 01:04:59,040 And now we can sum this. 1052 01:04:59,040 --> 01:05:02,890 And when we sum it, we just get 2 sigma squared over 2 to 1053 01:05:02,890 --> 01:05:05,430 the k times epsilon squared. 1054 01:05:05,430 --> 01:05:07,790 What are we doing here? 1055 01:05:07,790 --> 01:05:10,540 What's the whole of this? 1056 01:05:10,540 --> 01:05:15,310 The Kolmogorov submartingale inequality lets us, instead 1057 01:05:15,310 --> 01:05:19,670 of looking at just one value of n, look at a whole 1058 01:05:19,670 --> 01:05:23,690 bunch of values altogether and maximize over them. 1059 01:05:23,690 --> 01:05:27,660 So what I'm going to do is use the Kolmogorov submartingale 1060 01:05:27,660 --> 01:05:31,350 inequality over one big bunch of things and then over 1061 01:05:31,350 --> 01:05:34,190 another much bigger bunch of things, then over another 1062 01:05:34,190 --> 01:05:36,280 much, much bigger set of things.
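The union bound and the sum described above can be written out as a sketch. The magnitude signs are an assumption (they follow the usual absolute-value form of the Kolmogorov inequality); everything else uses the quantities named in the lecture, with the block lengths taken as 2 to the m for m running from k upward.

```latex
\Pr\Bigl\{\bigcup_{m=k}^{\infty}
      \Bigl\{\max_{1\le n\le 2^m} \bigl|S_n - n\bar{X}\bigr| \ge 2^m\epsilon\Bigr\}\Bigr\}
  \;\le\; \sum_{m=k}^{\infty} \frac{\sigma^2}{2^m\epsilon^2}
  \;=\; \frac{\sigma^2}{\epsilon^2}\cdot\frac{2^{-k}}{1-\tfrac{1}{2}}
  \;=\; \frac{2\sigma^2}{2^k\epsilon^2}.
```

The first step is the union bound with one term per block, and the second is the sum of a geometric series with ratio 1/2.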
1063 01:05:36,280 --> 01:05:43,340 And because I'm hopping over these much larger sequences, I 1064 01:05:43,340 --> 01:05:47,220 can now sum this quantity here, which I couldn't do if I 1065 01:05:47,220 --> 01:05:48,690 only had an m here. 1066 01:05:48,690 --> 01:05:53,050 If I replaced this 2 to the m by m and I tried to sum this, 1067 01:05:53,050 --> 01:05:56,020 what would happen? 1068 01:05:56,020 --> 01:05:59,940 It's a harmonic series and it diverges. 1069 01:05:59,940 --> 01:06:02,770 So what I've been able to do-- 1070 01:06:02,770 --> 01:06:06,280 or, really, what Kolmogorov was able to do-- 1071 01:06:06,280 --> 01:06:10,010 was instead of summing over all m, he was summing over 1072 01:06:10,010 --> 01:06:13,350 bunches of things and using this maximum here. 1073 01:06:15,870 --> 01:06:18,710 So this probability is less than or equal 1074 01:06:18,710 --> 01:06:21,350 to something finite. 1075 01:06:21,350 --> 01:06:26,380 If I now let k go to infinity, this term goes to 0, which 1076 01:06:26,380 --> 01:06:32,160 says that the tail end of this whole big thing goes to 0 as k 1077 01:06:32,160 --> 01:06:35,610 gets larger, for any epsilon at all. 1078 01:06:35,610 --> 01:06:38,060 But now, this doesn't quite look like what I want it to 1079 01:06:38,060 --> 01:06:41,990 look like, so what I'm going to do is find something which 1080 01:06:41,990 --> 01:06:45,120 is smaller than this that looks like what I would like 1081 01:06:45,120 --> 01:06:46,620 it to look like. 1082 01:06:46,620 --> 01:06:49,170 So I'm going to lower bound this quantity here by this 1083 01:06:49,170 --> 01:06:51,310 quantity here. 1084 01:06:51,310 --> 01:06:54,090 I still have the same union here. 1085 01:06:54,090 --> 01:07:02,180 Instead of finding the max over n from 1 to 2 to the m, I 1086 01:07:02,180 --> 01:07:05,970 will get a probability which is smaller because I'll only 1087 01:07:05,970 --> 01:07:08,320 maximize over part of those terms.
1088 01:07:08,320 --> 01:07:12,390 I'll only go from 2 to the m minus 1, less than or equal to 1089 01:07:12,390 --> 01:07:15,190 n, less than or equal to 2 to the m. 1090 01:07:15,190 --> 01:07:19,010 So I'm maximizing over a smaller set of terms, which 1091 01:07:19,010 --> 01:07:26,230 makes the probability of this smaller. 1092 01:07:26,230 --> 01:07:32,330 And then I'm replacing the 2 to the m here by 2 times n, 1093 01:07:32,330 --> 01:07:36,620 because now n is between 2 to the m minus 1 and 1094 01:07:36,620 --> 01:07:37,790 2 to the m. 1095 01:07:37,790 --> 01:07:40,670 So I can replace it that way. 1096 01:07:40,670 --> 01:07:42,860 And now, it's exactly the sum that I want. 1097 01:07:42,860 --> 01:07:43,620 Yeah? 1098 01:07:43,620 --> 01:07:46,980 AUDIENCE: What does it mean to maximize the [INAUDIBLE]? 1099 01:07:46,980 --> 01:07:50,340 Now n is [INAUDIBLE]. 1100 01:07:50,340 --> 01:07:52,740 So it looks like you're maximizing over inequalities. 1101 01:07:52,740 --> 01:07:54,592 Is it something like that? 1102 01:07:54,592 --> 01:08:05,980 PROFESSOR: Well, yeah, no. 1103 01:08:05,980 --> 01:08:10,030 I probably want to take this quantity, subtract off-- 1104 01:08:15,380 --> 01:08:15,590 no. 1105 01:08:15,590 --> 01:08:27,109 What I really have to do to make this make any sense is 1106 01:08:27,109 --> 01:08:37,069 write it as the maximum over the same set of things S n 1107 01:08:37,069 --> 01:08:48,859 over n minus x bar, greater than or equal to 2 epsilon. 1108 01:08:48,859 --> 01:08:50,109 OK? 1109 01:09:08,069 --> 01:09:09,810 Thank you. 1110 01:09:09,810 --> 01:09:10,240 AUDIENCE: OK. 1111 01:09:10,240 --> 01:09:11,960 Another thing-- 1112 01:09:11,960 --> 01:09:13,466 you're more likely to be bigger 1113 01:09:13,466 --> 01:09:14,260 than a smaller quantity. 1114 01:09:14,260 --> 01:09:19,159 Your n is smaller than 2 to the m, you're more likely to 1115 01:09:19,159 --> 01:09:22,396 be bigger than a smaller quantity.
1116 01:09:22,396 --> 01:09:24,886 So you're not bounding correctly, it would seem. 1117 01:09:30,529 --> 01:09:30,889 Oh, no. 1118 01:09:30,889 --> 01:09:34,800 You're doing 2 times m. 1119 01:09:34,800 --> 01:09:37,720 PROFESSOR: Well, there's no problem here. 1120 01:09:37,720 --> 01:09:39,744 I think the question is here. 1121 01:09:39,744 --> 01:09:47,520 Can I reduce this maximum down to a smaller sum and get a 1122 01:09:47,520 --> 01:09:48,770 smaller probability? 1123 01:09:51,300 --> 01:09:51,930 Oh, yes. 1124 01:09:51,930 --> 01:09:55,740 There's a smaller probability that this smaller max will 1125 01:09:55,740 --> 01:09:58,870 exceed a limit than if this will exceed a limit. 1126 01:09:58,870 --> 01:10:01,210 So I should do it in two steps. 1127 01:10:01,210 --> 01:10:02,710 In fact, you pointed it out here. 1128 01:10:02,710 --> 01:10:05,280 I should do it in three steps. 1129 01:10:05,280 --> 01:10:07,670 So the first step replaces this 1130 01:10:07,670 --> 01:10:11,170 maximum with this maximum. 1131 01:10:11,170 --> 01:10:19,910 Then the second step is going to go through that step. 1132 01:10:19,910 --> 01:10:24,560 And the third one is going to replace the m here with the n. 1133 01:10:24,560 --> 01:10:25,810 Anyway. 1134 01:10:31,550 --> 01:10:35,945 I think it's OK, with a few minor twiddles. 1135 01:10:39,250 --> 01:10:42,400 And before the term is over, I will get a new set of notes 1136 01:10:42,400 --> 01:10:45,020 out on the web, and you can check them to see if you're 1137 01:10:45,020 --> 01:10:47,130 actually satisfied with it, OK? 1138 01:10:51,450 --> 01:10:51,850 OK. 1139 01:10:51,850 --> 01:10:57,190 So finally, the martingale convergence theorem. 1140 01:10:57,190 --> 01:11:04,900 I'm not even going to try to prove this at all, but you 1141 01:11:04,900 --> 01:11:09,460 might have some imagination for how this follows from 1142 01:11:09,460 --> 01:11:13,300 dealing with stop processes, also. 
1143 01:11:13,300 --> 01:11:19,000 What it says is Z sub n is a martingale again. 1144 01:11:19,000 --> 01:11:22,360 We're going to assume that there's some finite M, so 1145 01:11:22,360 --> 01:11:28,680 the expected value of the magnitude of Zn is less than or equal to M. 1146 01:11:28,680 --> 01:11:32,950 So what I'm saying is the expected magnitude of Z sub n is 1147 01:11:32,950 --> 01:11:34,610 now bounded-- 1148 01:11:34,610 --> 01:11:37,330 it's not finite, it's more than finite-- 1149 01:11:37,330 --> 01:11:39,000 it's bounded. 1150 01:11:39,000 --> 01:11:40,800 It can never exceed this quantity. 1151 01:11:40,800 --> 01:11:45,540 I can have an expected magnitude of Z sub n, which is equal to 2 to 1152 01:11:45,540 --> 01:11:49,480 the n, and that's finite for every n. 1153 01:11:49,480 --> 01:11:51,140 But it's not bounded. 1154 01:11:51,140 --> 01:11:54,650 This quantity, I'm assuming it's bounded. 1155 01:11:54,650 --> 01:11:59,050 And then, according to this super theorem, there's a 1156 01:11:59,050 --> 01:12:03,130 random variable, Z. And don't ask what Z is. 1157 01:12:03,130 --> 01:12:07,630 Z is usually a very complicated random variable. 1158 01:12:07,630 --> 01:12:09,650 All this is doing is saying it exists. 1159 01:12:09,650 --> 01:12:12,570 You don't know what it is. 1160 01:12:12,570 --> 01:12:17,750 Such that the limit, as n goes to infinity, of Z sub n, is 1161 01:12:17,750 --> 01:12:21,100 equal to this random variable. 1162 01:12:21,100 --> 01:12:27,110 In other words, the limit of Z sub n minus Z is equal to 0 as 1163 01:12:27,110 --> 01:12:29,960 n goes to infinity. 1164 01:12:29,960 --> 01:12:31,960 And the text proves the theorem with the additional 1165 01:12:31,960 --> 01:12:34,860 constraint that the expected value of Z sub 1166 01:12:34,860 --> 01:12:39,190 n squared is bounded. 1167 01:12:39,190 --> 01:12:44,780 Either one of those bounds is a very big constraint on these 1168 01:12:44,780 --> 01:12:46,020 martingales.
1169 01:12:46,020 --> 01:12:50,280 So the way you use these theorems is you take an 1170 01:12:50,280 --> 01:12:53,730 original problem that you're dealing with and you twist it 1171 01:12:53,730 --> 01:12:56,800 around, then you massage it and you do all sorts of things 1172 01:12:56,800 --> 01:13:02,030 to it all in order to get another martingale, which 1173 01:13:02,030 --> 01:13:04,560 satisfies this bound in the constraint. 1174 01:13:04,560 --> 01:13:07,470 Then you apply this theorem, and then you go back to where 1175 01:13:07,470 --> 01:13:08,710 you started. 1176 01:13:08,710 --> 01:13:12,380 So that's the sort of general way of dealing with this. 1177 01:13:18,920 --> 01:13:23,770 And you see this theorem being used in all sorts of strange 1178 01:13:23,770 --> 01:13:27,420 places where you would never expect it to be used. 1179 01:13:27,420 --> 01:13:31,640 For those of you in the communication field, about a 1180 01:13:31,640 --> 01:13:34,690 couple of years ago, there was a very famous paper dealing 1181 01:13:34,690 --> 01:13:37,310 with something called polar codes, which is 1182 01:13:37,310 --> 01:13:39,470 a new kind of coding-- 1183 01:13:39,470 --> 01:13:41,410 very careful. 1184 01:13:41,410 --> 01:13:43,390 And the guy, in order to prove that these codes 1185 01:13:43,390 --> 01:13:47,040 worked, used that. 1186 01:13:47,040 --> 01:13:47,600 I don't know how-- 1187 01:13:47,600 --> 01:13:50,200 I haven't checked it out yet-- but that was 1188 01:13:50,200 --> 01:13:51,700 crucial in his proofs. 1189 01:13:51,700 --> 01:13:55,560 So he had to turn these things into martingales somehow and 1190 01:13:55,560 --> 01:13:59,350 then use that proof. 1191 01:13:59,350 --> 01:14:01,530 OK. 1192 01:14:01,530 --> 01:14:05,000 We talked about branching processes, about the 1193 01:14:05,000 --> 01:14:07,900 remarkable things about them. 1194 01:14:07,900 --> 01:14:13,240 This theorem applies directly to these branching processes.
1195 01:14:13,240 --> 01:14:15,180 A branching process, you remember-- 1196 01:14:15,180 --> 01:14:20,310 the number of elements or organisms or whatever, at time 1197 01:14:20,310 --> 01:14:25,390 n is the number of offspring of the set of elements at 1198 01:14:25,390 --> 01:14:28,880 time n minus 1. 1199 01:14:28,880 --> 01:14:34,130 Each element at each time has a random number of offspring, 1200 01:14:34,130 --> 01:14:38,050 which is independent of time and independent of all the 1201 01:14:38,050 --> 01:14:40,520 other elements. 1202 01:14:40,520 --> 01:14:48,140 And it's a random variable Y. And the expected value of X 1203 01:14:48,140 --> 01:14:54,480 sub n, given X sub n minus 1, is going to be X sub n minus 1 times the expected 1204 01:14:54,480 --> 01:14:58,650 value of Y, because X sub n minus 1 is the number of 1205 01:14:58,650 --> 01:15:02,320 elements in the n minus first generation. 1206 01:15:02,320 --> 01:15:05,470 Y bar is the expected number of offspring 1207 01:15:05,470 --> 01:15:07,270 of each one of them. 1208 01:15:07,270 --> 01:15:09,242 So you look at it and you say, ah. 1209 01:15:09,242 --> 01:15:10,420 That theorem doesn't work. 1210 01:15:10,420 --> 01:15:10,920 No good. 1211 01:15:10,920 --> 01:15:13,540 You walk away. 1212 01:15:13,540 --> 01:15:16,860 Then somebody else who is really interested in branching 1213 01:15:16,860 --> 01:15:22,340 processes says, oh, this process is growing as-- 1214 01:15:25,200 --> 01:15:29,460 I mean it's growing by Y bar every unit of time. 1215 01:15:29,460 --> 01:15:32,170 So I should be able to deal with that somehow. 1216 01:15:32,170 --> 01:15:34,720 So I say, OK. 1217 01:15:34,720 --> 01:15:37,250 Let's look at the number of elements in the n-th 1218 01:15:37,250 --> 01:15:41,870 generation and divide it by Y bar to the n. 1219 01:15:41,870 --> 01:15:43,370 When you do this, what happens?
1220 01:15:49,830 --> 01:16:04,740 The expected value of X n divided by Y bar to the n is 1221 01:16:04,740 --> 01:16:21,010 going to be equal to X sub n minus 1 over Y bar to the n minus 1222 01:16:21,010 --> 01:16:42,080 1 times-- 1223 01:16:42,080 --> 01:16:43,550 AUDIENCE: You need to [INAUDIBLE]. 1224 01:16:46,490 --> 01:16:47,960 PROFESSOR: Yes. 1225 01:16:47,960 --> 01:16:50,730 That would help, wouldn't it. 1226 01:16:50,730 --> 01:16:53,040 Thank you. 1227 01:16:53,040 --> 01:17:02,910 Given X n minus 1 divided by Y bar to the n minus 1. 1228 01:17:06,460 --> 01:17:17,070 And X n minus 2 over Y bar to the n minus 2. 1229 01:17:17,070 --> 01:17:20,810 And so if we're given these things, we don't have to worry 1230 01:17:20,810 --> 01:17:23,210 about this quantity if we're just given a number in each 1231 01:17:23,210 --> 01:17:24,250 generation. 1232 01:17:24,250 --> 01:17:27,380 If we're given a number in generation n minus 1, the 1233 01:17:27,380 --> 01:17:37,180 expected value of X sub n over Y bar to the n is just X n minus 1-- 1234 01:17:37,180 --> 01:17:41,870 in the expected value, we pick up another factor of Y bar-- divided by 1235 01:17:41,870 --> 01:17:48,930 Y bar to the n, which is X n minus 1 over Y 1236 01:17:48,930 --> 01:17:52,860 bar to the n minus 1. 1237 01:17:52,860 --> 01:17:53,390 OK. 1238 01:17:53,390 --> 01:17:58,520 So the theorem applies here because that expected value is 1239 01:17:58,520 --> 01:18:01,550 just 1, then. 1240 01:18:01,550 --> 01:18:04,980 And this is a martingale. 1241 01:18:04,980 --> 01:18:08,740 So the theorem says that this quantity 1242 01:18:08,740 --> 01:18:10,370 approaches a random variable. 1243 01:18:10,370 --> 01:18:12,590 And what does that mean? 1244 01:18:12,590 --> 01:18:22,530 Well, if you observe this process for a long time, it 1245 01:18:22,530 --> 01:18:23,930 might die out. 1246 01:18:23,930 --> 01:18:26,090 If it dies out, it stays died out-- it 1247 01:18:26,090 --> 01:18:28,310 never comes back again.
1248 01:18:28,310 --> 01:18:32,950 And if it doesn't die out, it's going to start to grow. 1249 01:18:32,950 --> 01:18:35,520 And if it starts to grow, it's going to start to 1250 01:18:35,520 --> 01:18:37,530 grow in this way. 1251 01:18:37,530 --> 01:18:43,700 After a very long time, if it's growing, X sub n minus 1 1252 01:18:43,700 --> 01:18:47,380 is humongous, and the law of large numbers says that the 1253 01:18:47,380 --> 01:18:52,990 next generation should have very close to X sub n minus 1 1254 01:18:52,990 --> 01:18:55,940 times Y bar elements in it. 1255 01:18:55,940 --> 01:18:59,150 So it says that after you get started, this thing wobbles 1256 01:18:59,150 --> 01:19:01,410 around trying to decide whether it's going to go to 1257 01:19:01,410 --> 01:19:04,420 zero or decide to get large. 1258 01:19:04,420 --> 01:19:07,400 But once it starts to get large, then it becomes very 1259 01:19:07,400 --> 01:19:09,320 stable from that time on. 1260 01:19:09,320 --> 01:19:13,090 And it's going to increase by Y bar with each unit of time. 1261 01:19:13,090 --> 01:19:15,940 If it decides it's going to die out, it goes to 0 and it 1262 01:19:15,940 --> 01:19:17,030 stays there. 1263 01:19:17,030 --> 01:19:21,350 So what the theorem is saying is just what I just said-- 1264 01:19:21,350 --> 01:19:28,940 namely X sub n over Y bar to the n in fact either grows in this 1265 01:19:28,940 --> 01:19:33,090 very regular way or it goes to 0 and it stays there. 1266 01:19:33,090 --> 01:19:39,750 So this random variable is 0 with the probability that the 1267 01:19:39,750 --> 01:19:43,110 process dies out, and we evaluated that before. 1268 01:19:45,900 --> 01:19:51,080 The other values of it are very hard to evaluate. 1269 01:19:51,080 --> 01:19:54,940 The other values depend on how long this thing takes. 1270 01:19:54,940 --> 01:19:58,180 If it's not going to go to 0, how long does it take before 1271 01:19:58,180 --> 01:20:00,000 it really takes off?
1272 01:20:00,000 --> 01:20:03,220 And sometimes it takes a long time before it really takes 1273 01:20:03,220 --> 01:20:06,360 off, sometimes it takes a short time, and that's what 1274 01:20:06,360 --> 01:20:08,620 the random variable Z is. 1275 01:20:08,620 --> 01:20:13,540 But the random variable Z says that after a very long time, 1276 01:20:13,540 --> 01:20:19,380 the value of this process, X sub n, is going to be 1277 01:20:19,380 --> 01:20:24,510 approximately Z times this quantity here, which is 1278 01:20:24,510 --> 01:20:27,620 growing exponentially. 1279 01:20:27,620 --> 01:20:29,370 OK. 1280 01:20:29,370 --> 01:20:33,010 That gives us the martingale convergence theorem. 1281 01:20:33,010 --> 01:20:37,640 Next time, I will try to review at least the whole 1282 01:20:37,640 --> 01:20:38,890 course from Markov chains on.
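The branching-process discussion above can be illustrated with a small simulation. This is a rough sketch, not from the lecture: the offspring distribution (0, 1, or 2 offspring with probabilities 0.25, 0.25, 0.5), the starting population of one element, the seed, and all names are assumptions chosen only for illustration. It tracks the scaled process X_n divided by Y-bar to the n, whose mean should stay near 1 while individual paths either hit 0 and stay there or settle near some positive value.

```python
import random

# Assumed offspring distribution for illustration only.
OFFSPRING = (0, 1, 2)
PROBS = (0.25, 0.25, 0.5)
Y_BAR = sum(k * p for k, p in zip(OFFSPRING, PROBS))  # mean offspring, 1.25


def scaled_path(generations, rng):
    """Simulate X_n / Y_BAR**n for n = 1..generations, starting from one element."""
    x = 1
    path = []
    for n in range(1, generations + 1):
        # Each of the x current elements independently draws its offspring count;
        # with x == 0 this draws nothing, so extinction is absorbing.
        x = sum(rng.choices(OFFSPRING, weights=PROBS, k=x))
        path.append(x / Y_BAR ** n)
    return path


rng = random.Random(0)  # fixed seed so the run is repeatable
paths = [scaled_path(20, rng) for _ in range(2000)]

# Sample mean of the scaled process at each generation (martingale: mean stays at 1).
mean_z = [sum(p[n] for p in paths) / len(paths) for n in range(20)]
# Fraction of sample paths that have died out by generation 20.
died_out = sum(p[-1] == 0 for p in paths) / len(paths)

print("mean of scaled process at generations 1, 10, 20:",
      round(mean_z[0], 3), round(mean_z[9], 3), round(mean_z[19], 3))
print("fraction extinct by generation 20:", died_out)
```

Printing a few individual entries of `paths` shows the behavior described in the lecture: early wobbling, then either a 0 that persists or a value that changes very little from one generation to the next.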