The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, so let's get started. We're going to essentially finish up on Poisson processes today. And today we have the part of it that's really the fun part. What we had until now was the dry stuff, defining everything and so forth. As I said before, Poisson processes are these perfect processes where everything that could be true is true. And you have so many different ways of looking at problems that you can solve problems in a wide variety of ways.

What you want to try to do in the problem set this week is to get out of the mode of starting to write equations before you think about it. I mean, write equations, yes, but while you're doing it, think about what the right way of approaching the problem is. As far as I know, every part of every problem there can be solved in one or two lines. There's nothing long. There's nothing complicated.
But you have to find exactly the right way of doing it. And that's what you're supposed to learn, because you find all sorts of processes in the world which are poorly modelled as Poisson processes. You get a lot of insight about the real process by looking at Poisson processes, but you don't get the whole story. And if you don't understand how these relationships are connected to each other, then you have no hope of getting some sense of when Poisson process theory is telling you something and when it isn't.

That's why engineers are different from mathematicians. Mathematicians live in this beautiful world. And I love it. I love to live there. I love to go there for vacations and so on, because everything is perfectly clean. Everything has a right solution. And if it's not a right solution, it's a wrong solution. In engineering, everything is kind of hazy. And you get insights about things when you put a lot of insights together. You finally make judgments about things. You use all sorts of models to do this. And what you use a course like this for is to understand what all these models are saying.
And then be able to use them. So the stuff that we'll be talking about today is stuff you will use all the rest of the term, because everything we do, surprisingly enough, from now on has some relationship with Poisson processes. It doesn't sound like it does, but, in fact, it does. And this has a lot of things that look like tricks, but which are more than tricks. They're really what comes from a basic understanding of Poisson processes.

So first, I'm going to review a little bit of what we did last time. A Poisson process is an arrival process. Remember what an arrival process is. It's just a bunch of arrival epochs, which have some statistics associated with them. And it has IID exponentially distributed interarrival times. So the time between successive arrivals is independent from one arrival to the next. And it has this exponential distribution, which is what gives the Poisson process its very special characteristic.

It can be represented by its arrival epochs. These are the things you see here: S1, S2, S3, and so forth, are the arrival epochs.
If you can specify what the probability relationship is for this set of joint random variables, then you know everything there is to know about a Poisson process. If you specify the joint distribution of these interarrival times, and that's trivial because they're IID exponentially distributed random variables, then you know everything there is to know about the process. And if you can specify N of t, which is the number of arrivals up until time t, for every t, then that specifies the process completely also.

So we take the viewpoint here, I mean, usually we view a stochastic process as either a sequence of random variables or a continuum of random variables. Here, we're viewing these as three ways of looking at the same thing. So a Poisson process is then either the sequence of interarrival times, the sequence of arrival epochs, or, what we call the counting process, N of t at each t greater than zero. The arrival epochs and N of t are related either this way or this way. And we talked about that.

The interarrival times are memoryless. In other words, they satisfy this relationship here.
The probability that an interarrival time Xi is greater than t plus x, for any t and any x which are positive, given that it's greater than t, is the same as the probability that it's greater than x. In other words, given that you've already wasted time t waiting, the probability distribution of the time from here until the actual occurrence occurs is again exponential. This is stated in conditional form. We stated it before in just joint probability form. OK, which says that if you wait for a while and nothing has happened, you just keep on waiting. You're right where you started. You haven't lost anything. You haven't gained anything.

And we said that for other renewal processes, which have IID interarrival random variables, you can have these heavy-tailed distributions where, if nothing happens after a while, then you start to really feel badly, because you know nothing's going to happen for an awful lot longer. The best example of a heavy-tailed distribution is when you're trying to catch an airplane and they say it's going to be 10 minutes late. That's the worst heavy-tailed distribution there is. And it drives you crazy.
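That memoryless property, P(X > t + x | X > t) = P(X > x) for exponential X, can be checked numerically. A minimal sketch in Python (the rate lambda = 1 and the times t and x here are arbitrary illustrative choices, not values from the lecture):

```python
import random

def memoryless_check(lam=1.0, t=0.5, x=1.0, n=200_000, seed=0):
    """Compare P(X > t + x | X > t) with P(X > x) for X ~ Exp(lam)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(lam) for _ in range(n)]
    survivors = [s for s in samples if s > t]          # condition on X > t
    cond = sum(s > t + x for s in survivors) / len(survivors)
    uncond = sum(s > x for s in samples) / n
    return cond, uncond
```

Both estimates come out near e^(-x) = e^(-1), about 0.368, no matter how much time t has already been wasted waiting.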
Because I've never caught a plane that was supposed to be 10 minutes late that wasn't at least an hour late. And often it got canceled, which makes it not a random variable at all.

One of the things we're interested in now, and we talked about it a lot last time, is that you pick some arbitrary time t, which can be any time at all. And you ask, how long is it from time t (t might be when you arrive to wait for a bus or something) until the next bus comes? So Z is the random variable that measures that. And you really should put some indices on this. But what it is is the random variable from t until this slow arrival here that's poking along finally comes in.

Now, what we found is that the interval Z, which is the time from this arrival back to t, is exponential. And the way we showed that is to say, let's condition Z on anything we want to condition it on. And the thing that's important to condition it on is the value of N of t here, which is n. And once we condition it on the fact that N of t is n, we then condition on S sub n, which is the time that this last arrival came.
So if we condition Z on N of t at time t and the time that this last arrival came in, which is S sub 2 in this case, then Z turns out to be, again, just the time left to wait after we've already waited for this given amount of time. We then find out that Z, conditional on these two things that we don't understand at all, is just exponential, no matter what they are. And since it's exponential no matter what they are, we don't go to the trouble of trying to figure out what they are. We just say, whatever distribution they have, Z itself, unconditionally, is just the same exponential random variable. So that was one of the main things we did last time.

The next thing we did was we started to look at any set of times, say t1, t2, up to t sub k, and then we looked at the increments. How many arrivals occurred between zero and t1? How many arrivals occurred between t1 and t2? How many between t2 and t3? So we looked at these Poisson process increments, a whole bunch of random variables, and we said, these are stationary and independent Poisson counting processes over their given intervals.
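Coming back to the residual time for a moment: the claim that Z is exponential with the original rate, no matter what t is, is easy to check by simulation. A minimal sketch (the rate 2 and the instant t = 3 are arbitrary illustrative choices):

```python
import random

def mean_residual_time(lam=2.0, t=3.0, n=100_000, seed=1):
    """Average Z = time from the fixed instant t to the next arrival of a
    rate-lam Poisson process, built from IID Exp(lam) interarrival times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        s = rng.expovariate(lam)        # first arrival epoch
        while s <= t:                   # step through epochs until one passes t
            s += rng.expovariate(lam)
        total += s - t                  # residual waiting time Z
    return total / n
```

The empirical mean comes out near 1/lambda, the mean of the original exponential, and it does not move if you change t.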
In other words, you stop looking at the first process at time t1. Then you look at this one from time t1 to t2. You look at this one from tk minus 1 to tk. And you look at the last one, the next-to-last one, all the way up to tk. And there should be one where you look at it from tk on to t. No, you're only looking at k of them. So these are the things that you're looking at.

The statement is that these are independent random variables. The other statement is that they're stationary, which means, if you look at the number of arrivals in this interval here, it's a function of t2 minus t1. But it's not a function of t1 alone. It's only a function of the length of the interval. The number of arrivals in any interval of length tk minus tk minus 1 is a function only of the length of that interval and not of where the interval is. That's a reasonable way to look at stationarity, I think. How many arrivals come in in a given interval is independent of where the interval is. It depends only on how long the interval is.
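That stationarity claim can be checked by counting arrivals in two intervals of the same length at different places. A minimal sketch (the rate 1.5 and the interval endpoints are arbitrary illustrative choices):

```python
import random

def mean_count(lam=1.5, a=0.0, b=1.0, n=50_000, seed=2):
    """Empirical mean number of arrivals of a rate-lam Poisson process in [a, b)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        s = rng.expovariate(lam)        # first arrival epoch
        while s < b:
            if s >= a:
                total += 1              # arrival landed inside [a, b)
            s += rng.expovariate(lam)
    return total / n
```

Counting over the interval from 0 to 1 and over the interval from 5 to 6 gives the same answer, about lambda times the interval length; only the length matters, not the location.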
OK, then we found that the probability mass function for N of t, and now we're just looking from zero to t because we know we get the same thing over any interval, this probability mass function is this nice function, which depends only on the product lambda t. It never depends on lambda alone or t alone. It's lambda t to the n, times e to the minus lambda t, over n factorial. What is the n factorial doing there? Well, it came out in the derivation. By the time we finish today, you should have more ideas of where that factorial came from. And we'll try to understand that.

By the stationary and independent increment property, we know that these two things are independent: N of t1, and the number of arrivals in t1 to t. This is a Poisson random variable. This is a Poisson random variable. We know that the number of arrivals between zero and t is also a Poisson random variable. And what does that tell you? It tells you, you don't have to go through all of this discrete convolution stuff. You probably should go through it once just for your own edification to see that this all works.
But for a very lazy person like me, who likes using arguments like this, I say, well, these two things are independent. They are Poisson random variables. Their sum is Poisson. And therefore, whenever you have two independent Poisson random variables and you add them together, what you get is a Poisson random variable whose mean is the sum of the means of the two individual random variables. In general, sums of independent Poisson random variables are Poisson, with the means adding.

Then we went through a couple of alternate definitions of a Poisson process. And at this point, just from what I've said so far, and from reading the notes and understanding what I've said so far, it ought to be almost clear that these alternate definitions, which we talked about last time, have to be valid. If an arrival process has the stationary and independent increment properties, and if N of t has the Poisson PMF for given lambda and all t, then the process is Poisson.

Now what is that saying? I mean, we've said that if we can specify all of the random variables N of t for all t, then we've specified the process.
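Doing the discrete convolution once, for your own edification, confirms the lazy shortcut: convolving two Poisson PMFs gives exactly the Poisson PMF with the means added. A sketch (the means 1.2 and 0.7 are arbitrary illustrative values):

```python
from math import exp, factorial

def poisson_pmf(mean, n):
    """Poisson PMF: mean**n * e**(-mean) / n!"""
    return mean**n * exp(-mean) / factorial(n)

def convolved_pmf(m1, m2, n):
    """P(N1 + N2 = n) for independent Poisson N1, N2, by discrete convolution."""
    return sum(poisson_pmf(m1, k) * poisson_pmf(m2, n - k) for k in range(n + 1))
```

For every n, convolved_pmf(m1, m2, n) matches poisson_pmf(m1 + m2, n); inside the sum, the binomial theorem is doing the work.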
What does it mean to specify a whole bunch of random variables? It is not sufficient to find the distribution function of each of those random variables separately. And one of the problems in the homework at this time is to explicitly show that for a simpler process, the Bernoulli process, and to actually construct an example where N of t is specified everywhere, but you don't have the independence between different intervals, and therefore you don't have a Bernoulli process. You just have this nice binomial formula everywhere. But it doesn't really tell you much.

OK, so, but here we're adding on the independent increment properties, which say that over any set of intervals, the joint distribution of how many arrivals there are here, how many here, how many here, how many here, those joint random variables are independent of each other, which is what the independent increment property says. So in fact, this tells you everything you want to know, because you now know the relationship between each one of these intervals. So we see why the process is Poisson. This one's a little trickier.
If an arrival process has the stationary and independent increment properties and it satisfies this incremental condition, then the process is Poisson. And the incremental condition says that if you're looking at the number of arrivals in some interval of size delta, the probability that this is equal to n has the form: 1 minus lambda delta plus o of delta, for n equal to zero; lambda delta plus o of delta, for n equal to 1; and o of delta, for n greater than or equal to 2.

This intuitively is only supposed to talk about very, very small delta. So if you take the Poisson distribution, lambda t to the n, e to the minus lambda t, over n factorial, and you look at what happens when t is very small, this is what it turns into. When t is very small, the probability that there are no arrivals in this interval of size delta is very large. It's 1 minus lambda delta plus this extra term. The first point of view, whenever you see an o of delta, is to say, oh, that's not important. And for n equal to 1, there's going to be one arrival with some fudge factor, which is not important. And there's going to be two or more arrivals with some fudge factor, which is not important.
The next thing we talked about is that o of delta really is defined as any function of delta where the limit of o of delta divided by delta is equal to zero. In other words, it's some function that goes to zero faster than delta does. So it's something which is insignificant relative to delta as delta gets very small.

Now, how do you use this kind of statement to make some statement about larger intervals? Well, you're clearly stuck looking at differential equations. And the text does that. I refuse to talk about differential equations in lecture or anyplace else. When I retired, I said, I will no longer talk about differential equations anymore. And you know, you don't need to, because you can see what's happening here. And what you see is happening is, in fact, what's happening.

OK, so, that's where we finished up last time. And now we come to the really fun stuff, where we want to combine independent Poisson processes. And then we want to split Poisson processes.
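The incremental condition above is easy to verify numerically from the exact Poisson PMF: the probability of two or more arrivals really does vanish faster than delta. A sketch (the rate 2 and the values of delta are arbitrary illustrative choices):

```python
from math import exp

def small_interval_probs(lam, delta):
    """Exact Poisson probabilities of 0, 1, and 2+ arrivals in an interval of length delta."""
    p0 = exp(-lam * delta)                    # no arrivals
    p1 = lam * delta * exp(-lam * delta)      # exactly one arrival
    return p0, p1, 1.0 - p0 - p1              # two or more arrivals
```

Shrinking delta by a factor of 10 shrinks the ratio P(two or more arrivals)/delta by about a factor of 10 as well, which is exactly the o of delta statement, while P(one arrival)/delta settles at lambda.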
And we want to play all sorts of games with multiple Poisson processes, which looks very hard, and because of this, it's very easy. Yes?

AUDIENCE: The previous definitions, they are if and only if statements, right?

PROFESSOR: They are what?

AUDIENCE: If and only if. All Poisson processes satisfy tho--

PROFESSOR: Yes. OK, in other words, what you're saying is, if you can satisfy those properties, this is a Poisson process. I mean, we've already shown that a Poisson process satisfies those properties. As a matter of fact, the way I put it in the notes is that these are three alternate definitions, where you could start out with any one of them and derive the whole thing. Many people like to start out with this incremental definition because it's very physical. But it makes all the mathematics much, much harder. And so it's just a question of what you prefer. I like to start out with something clean, then derive things, and then say, does it make any sense for the physical situation? And that's what we usually do.
We don't usually start out with a physical situation and analyze the hell out of it and say, aha, this is a Poisson process because it satisfies all these properties. It never satisfies all those properties. I mean, you say it's a Poisson process because a Poisson process is simple and you can get some insight from it, not because it really is a Poisson process.

Let's talk about taking two independent Poisson processes. Just to be a little more precise, two counting processes N1 of t and N2 of t are independent if, for all t1 up to t sub n, the random variables N1 of t1 up to N1 of tn are independent of N2 of t1 up to N2 of tn. Why don't I just say that for all t they're independent? Because I don't even know what that means. I mean, we've never defined independence for an infinite number of things. So all we can do is say, for all finite sets, we have this independence.

Now, I'll give you a short pop quiz. Suppose that instead of doing it this way, I say, for all t1 to tn, the random variables N1 of t1 to N1 of tn are independent of, for all tau1, tau2, tau3, tau4, the random variables N2 of tau1 up to N2 of tau sub n? That sounds much more general, doesn't it?
Because it's saying that I can count one process at one set of times and the other process at another set of times. Now, why isn't it any more general to do it that way? Well, it's an unfair pop quiz, because if you could answer that question in the short one-sentence answer that you'd be willing to give in a class like this, I would just give you an A plus and tell you to go away. And you'd already know all the things you should know.

The argument is the following: you order these different times. First tau1, if tau1 is less than t1, then t1, if that's the next one, and you order them all along. And then you apply this definition to that ordered set: t1, tau1, t2, t3, t4, tau2, t5, tau3, and so forth. You apply this definition, and then you get the other definition. So one is not more general than the other.

The theorem, then, is that if N1 of t and N2 of t are independent Poisson processes, one of them has a rate lambda 1, one of them has a rate lambda 2, and if N of t is equal to N1 of t plus N2 of t, that just means for every t this random variable N of t is the sum of the random variable N1 of t plus the random variable N2 of t.
This is true for all t. This is, by definition, for all t greater than 0. Then the sum N of t is a Poisson process of rate lambda equals lambda 1 plus lambda 2. Looks almost obvious, doesn't it?

I said that today there was lots of fun stuff. There's also a little bit of ugly stuff, and this is one of those half-obvious things that's ugly. And I'm not going to waste a lot of time on it. You can read all about the details in the notes. But I will spend a little bit of time on it because it's important.

The idea is that if you look at any small increment, t to t plus delta, the number of arrivals in the interval t to t plus delta is equal to the number of arrivals in the interval t to t plus delta from the first process plus that in the second process. So the probability that there's one arrival in this combined process is the probability that there's one arrival in the first process and no arrivals in the second process, or that there's no arrivals in the first process and one arrival in the second process. That's just a very simple case of convolution. Those are the only ways you can get one arrival in the combined process.
416 00:26:44,270 --> 00:26:51,310 This term here is delta times lambda 1-- 417 00:26:51,310 --> 00:26:52,420 too early in the morning. 418 00:26:52,420 --> 00:26:55,586 I'm confusing my deltas and lambdas. 419 00:26:55,586 --> 00:26:57,820 There are too many of each of them. 420 00:26:57,820 --> 00:26:59,310 --plus o of delta. 421 00:26:59,310 --> 00:27:02,680 And this term here, the probability that there's zero 422 00:27:02,680 --> 00:27:05,620 for the second process, is 1 minus delta 423 00:27:05,620 --> 00:27:08,300 lambda 2 plus o of delta. 424 00:27:08,300 --> 00:27:11,370 And then this term is just the opposite term 425 00:27:11,370 --> 00:27:13,010 corresponding to this. 426 00:27:13,010 --> 00:27:15,670 Now I multiply these terms out. 427 00:27:15,670 --> 00:27:17,560 And what do I get? 428 00:27:17,560 --> 00:27:22,750 Well, this 1 here combines with a delta lambda 1. 429 00:27:22,750 --> 00:27:28,660 Then there's a delta lambda 2 times delta lambda 1, which is 430 00:27:28,660 --> 00:27:29,860 delta squared. 431 00:27:29,860 --> 00:27:32,800 It's a delta squared term, so that's really 432 00:27:32,800 --> 00:27:34,980 an o of delta term. 433 00:27:34,980 --> 00:27:39,540 It's negligible as delta goes to zero. 434 00:27:39,540 --> 00:27:41,490 So we forget about that. 435 00:27:41,490 --> 00:27:45,710 There's an o of delta times 1, that's still an o of delta. 436 00:27:45,710 --> 00:27:48,380 There's an o of delta times a delta lambda. 437 00:27:48,380 --> 00:27:53,610 And that's sort of an o of delta squared if you wish. 438 00:27:53,610 --> 00:27:54,950 But it's still an o of delta. 439 00:27:54,950 --> 00:27:58,140 It goes to zero as delta gets small. 440 00:27:58,140 --> 00:28:00,160 And goes to zero faster than delta. 441 00:28:00,160 --> 00:28:02,770 What we're trying to do is to find the terms that are 442 00:28:02,770 --> 00:28:06,480 significant in terms of delta. 
443 00:28:06,480 --> 00:28:10,130 Namely, when delta gets very small, I want to find things 444 00:28:10,130 --> 00:28:13,990 that are at least proportional to delta and not of lower 445 00:28:13,990 --> 00:28:15,450 order than delta. 446 00:28:15,450 --> 00:28:18,790 So when I get done with all of that, this is delta times 447 00:28:18,790 --> 00:28:22,500 lambda 1 plus lambda 2 plus o of delta. 448 00:28:22,500 --> 00:28:27,040 That's the incremental property that we want a 449 00:28:27,040 --> 00:28:28,370 Poisson process to have. 450 00:28:28,370 --> 00:28:30,895 So it has that incremental property. 451 00:28:35,190 --> 00:28:42,150 And you use the same sort of argument, if you want, for 452 00:28:42,150 --> 00:28:47,160 the number of arrivals in t to t plus delta. 453 00:28:47,160 --> 00:28:48,820 Maybe a picture of this would help. 454 00:28:52,320 --> 00:28:56,030 Pictures always help in the morning. 455 00:28:56,030 --> 00:28:57,280 Here we have two processes. 456 00:29:00,220 --> 00:29:06,815 We're looking at some interval of time, t to t plus delta. 457 00:29:09,530 --> 00:29:14,160 t to t plus delta. 458 00:29:14,160 --> 00:29:15,790 And we might have an arrival here. 459 00:29:18,580 --> 00:29:22,120 We might have an arrival here, an arrival 460 00:29:22,120 --> 00:29:25,460 here, an arrival here. 461 00:29:25,460 --> 00:29:29,040 Well, the probability of an arrival here and an arrival 462 00:29:29,040 --> 00:29:32,990 here is something of order delta squared. 463 00:29:32,990 --> 00:29:36,050 So that's something we ignore. 464 00:29:36,050 --> 00:29:38,760 So it says, we might have an arrival here 465 00:29:38,760 --> 00:29:41,810 and no arrival here. 466 00:29:41,810 --> 00:29:43,910 This is a lambda 1. 467 00:29:43,910 --> 00:29:45,960 This is lambda 2 here. 468 00:29:48,840 --> 00:29:51,700 We might have an arrival here and none here, or we might 469 00:29:51,700 --> 00:29:54,470 have an arrival here and none there. 
470 00:29:54,470 --> 00:29:58,080 Two arrivals is just too unlikely to worry about, so we 471 00:29:58,080 --> 00:30:00,784 forget about it at least for the time being. 472 00:30:05,160 --> 00:30:09,190 Now, after going through this incremental argument, if you 473 00:30:09,190 --> 00:30:13,050 go back and say, let's forget about all these o of deltas 474 00:30:13,050 --> 00:30:15,670 because they're very confusing, let's just do the 475 00:30:15,670 --> 00:30:18,370 convolution knowing what N of t is in 476 00:30:18,370 --> 00:30:20,110 both of these intervals. 477 00:30:20,110 --> 00:30:24,150 If you do that, it's much easier to find out the 478 00:30:24,150 --> 00:30:29,940 number of arrivals in this sum of intervals. 479 00:30:29,940 --> 00:30:33,760 The number of arrivals here, plus the number of arrivals here: this 480 00:30:33,760 --> 00:30:34,810 is Poisson. 481 00:30:34,810 --> 00:30:35,870 This is Poisson. 482 00:30:35,870 --> 00:30:39,040 You add two Poisson, you get Poisson. 483 00:30:39,040 --> 00:30:44,294 The rate of the sum of the two Poisson processes is the sum of the rates. 484 00:30:44,294 --> 00:30:49,290 So it's Poisson with rate lambda 1 plus lambda 2. 485 00:30:49,290 --> 00:30:52,870 Now how many of you saw that and said, why is this idiot 486 00:30:52,870 --> 00:30:55,650 going through this incremental argument? 487 00:30:55,650 --> 00:30:56,930 Anyone? 488 00:30:56,930 --> 00:30:58,880 I won't be embarrassed. 489 00:30:58,880 --> 00:30:59,740 I knew it anyway. 490 00:30:59,740 --> 00:31:00,990 And I did it for a reason. 491 00:31:04,120 --> 00:31:07,060 But I must confess, when I wrote the first edition of 492 00:31:07,060 --> 00:31:09,980 this book I didn't recognize that. 493 00:31:09,980 --> 00:31:13,710 So I went through this terribly tedious argument. 
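The merging theorem is easy to check numerically. Here is a minimal simulation sketch; the rates lambda 1 = 2, lambda 2 = 3, the horizon t = 10, and the trial count are arbitrary choices, not from the lecture. It builds each process from i.i.d. exponential interarrival times, adds the two counts, and checks that the merged count N(t) has the mean and variance of a Poisson random variable with parameter (lambda 1 + lambda 2) t.

```python
import random

def poisson_count(rate, t_end, rng):
    """Number of arrivals by t_end of a Poisson process with the given rate,
    generated from i.i.d. exponential interarrival times."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)
        if t > t_end:
            return n
        n += 1

rng = random.Random(1)
lam1, lam2, t_end, trials = 2.0, 3.0, 10.0, 20000

# Merged count N(t) = N1(t) + N2(t), sampled over many independent trials.
counts = [poisson_count(lam1, t_end, rng) + poisson_count(lam2, t_end, rng)
          for _ in range(trials)]

mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials
# A Poisson random variable with parameter (lam1 + lam2) * t_end = 50
# has mean and variance both equal to 50; the estimates should be close.
print(mean, var)
```

One could equally well check that the interarrival times of the merged process come out exponential with rate lambda 1 plus lambda 2; the count statistics are just the quickest thing to test.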
494 00:31:13,710 --> 00:31:22,630 Anyway, the more important issue is if you have the sum 495 00:31:22,630 --> 00:31:25,870 of many, many small independent arrival 496 00:31:25,870 --> 00:31:27,810 processes-- 497 00:31:27,810 --> 00:31:32,110 OK, in other words, you have the internet. 498 00:31:32,110 --> 00:31:37,680 And a node in the internet is getting jobs, or messages, 499 00:31:37,680 --> 00:31:41,920 from all sorts of people, and all sorts of processes, and 500 00:31:41,920 --> 00:31:45,400 all sorts of nonsense going to all sorts of people, all sorts 501 00:31:45,400 --> 00:31:49,540 of processes, and all sorts of nonsense and those are 502 00:31:49,540 --> 00:31:51,360 independent of each other, not really 503 00:31:51,360 --> 00:31:53,240 independent of each other. 504 00:31:53,240 --> 00:32:00,750 But relative to the data rate that's travelling over this 505 00:32:00,750 --> 00:32:03,760 internet, each of those processes 506 00:32:03,760 --> 00:32:06,190 are very, very small. 507 00:32:06,190 --> 00:32:11,330 And what happens is the sum of many, many small independent 508 00:32:11,330 --> 00:32:15,820 arrival processes tend to be Poisson even if the small 509 00:32:15,820 --> 00:32:18,230 processes are not. 510 00:32:18,230 --> 00:32:21,690 In a sense, the independence between the processes 511 00:32:21,690 --> 00:32:25,860 overcomes the dependence between successive arrivals in 512 00:32:25,860 --> 00:32:27,660 each process. 513 00:32:27,660 --> 00:32:29,930 Now, I look at that and I say, well, it's sort of a 514 00:32:29,930 --> 00:32:31,900 plausibility argument. 515 00:32:31,900 --> 00:32:35,230 You look at the argument in the text, and you say, ah, 516 00:32:35,230 --> 00:32:37,185 it's sort of a plausibility argument. 517 00:32:39,990 --> 00:32:42,290 I mean, proving this statement, you need to put a 518 00:32:42,290 --> 00:32:44,080 lot of conditions on it. 519 00:32:44,080 --> 00:32:46,450 And you need to really go through an awful lot of work. 
520 00:32:49,060 --> 00:32:53,020 It's like proving the central limit theorem, but it's 521 00:32:53,020 --> 00:32:55,510 probably harder than that. 522 00:32:55,510 --> 00:33:00,150 So, if you read the text, and say you don't really 523 00:33:00,150 --> 00:33:04,120 understand the argument there, I don't understand it either 524 00:33:04,120 --> 00:33:06,980 because I don't think it's exactly right. 525 00:33:06,980 --> 00:33:10,520 And I was just trying to say something to give some idea of 526 00:33:10,520 --> 00:33:11,770 why this is plausible. 527 00:33:14,980 --> 00:33:16,230 It should probably be changed. 528 00:33:18,830 --> 00:33:23,760 Next we want to talk about splitting a Poisson process. 529 00:33:23,760 --> 00:33:26,440 So we start out with a Poisson process here, 530 00:33:26,440 --> 00:33:29,350 N of t, of rate lambda. 531 00:33:29,350 --> 00:33:30,390 And what happens? 532 00:33:30,390 --> 00:33:33,690 These arrivals come to some point. 533 00:33:33,690 --> 00:33:36,110 And some character is standing there. 534 00:33:38,670 --> 00:33:41,720 It's like when you're having your passport checked when you 535 00:33:41,720 --> 00:33:44,100 come back to Boston after being away. 536 00:33:47,390 --> 00:33:52,630 In some places you press a button, and if the button comes 537 00:33:52,630 --> 00:33:56,790 up one way, you're sent off to one line to be interrogated 538 00:33:56,790 --> 00:33:57,840 and all sorts of junk. 539 00:33:57,840 --> 00:33:59,490 And it's supposed to be random. 540 00:33:59,490 --> 00:34:02,440 I have no idea whether it's random or not, but it's 541 00:34:02,440 --> 00:34:04,820 supposed to be random. 542 00:34:04,820 --> 00:34:06,890 And otherwise, you go through and you get 543 00:34:06,890 --> 00:34:08,790 through very quickly. 544 00:34:08,790 --> 00:34:11,110 So it's the same sort of thing here. 545 00:34:11,110 --> 00:34:13,530 You have a bunch of arrivals. 
546 00:34:13,530 --> 00:34:20,310 And each arrival is effectively randomly shoveled 547 00:34:20,310 --> 00:34:23,050 this way or shoveled this way. 548 00:34:23,050 --> 00:34:26,330 With probability p, it's shoveled this way. 549 00:34:26,330 --> 00:34:30,360 With probability 1 minus p, it's shoveled this way. 550 00:34:30,360 --> 00:34:38,889 So you can characterize this switch as a Bernoulli process. 551 00:34:38,889 --> 00:34:42,330 It's a Bernoulli process because it's random and 552 00:34:42,330 --> 00:34:48,030 independent from, not time to time now, 553 00:34:48,030 --> 00:34:50,150 but arrival to arrival. 554 00:34:50,150 --> 00:34:53,540 When we first looked at a Poisson process, we said it's 555 00:34:53,540 --> 00:34:56,110 a sequence of random variables. 556 00:34:56,110 --> 00:34:58,820 Sometimes we look at in terms of time. 557 00:34:58,820 --> 00:35:01,680 Time doesn't really make any difference there. 558 00:35:01,680 --> 00:35:05,940 It's just a sequence of IID random variables. 559 00:35:05,940 --> 00:35:09,290 So you have a sequence of IID random variables doing this 560 00:35:09,290 --> 00:35:10,450 switching here. 561 00:35:10,450 --> 00:35:13,520 You have a Poisson process coming in. 562 00:35:13,520 --> 00:35:17,130 And when you look at the combination of the Poisson 563 00:35:17,130 --> 00:35:21,990 process and the Bernoulli process, you get some kind of 564 00:35:21,990 --> 00:35:24,870 process of things coming out here. 565 00:35:24,870 --> 00:35:28,840 And another kind of process of things coming out here. 566 00:35:28,840 --> 00:35:35,120 And the theorem says, that when you combine this Poisson 567 00:35:35,120 --> 00:35:39,500 process with this independent Bernoulli process, what you 568 00:35:39,500 --> 00:35:45,240 get is a Poisson process here and an independent Poisson 569 00:35:45,240 --> 00:35:48,040 process here. 
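The splitting theorem can also be made concrete with a quick simulation. This is a sketch, not anything from the lecture notes; the values lambda = 5, p = 0.3, and the horizon t = 4 are arbitrary. Each arrival of a rate-lambda Poisson process is routed by an independent Bernoulli(p) coin, and the two output counts come out with means p lambda t and (1 - p) lambda t, with essentially zero covariance, consistent with two independent Poisson processes.

```python
import random

rng = random.Random(2)
lam, p, t_end, trials = 5.0, 0.3, 4.0, 20000

na_counts, nb_counts = [], []
for _ in range(trials):
    n, na, t = 0, 0, 0.0
    while True:
        t += rng.expovariate(lam)   # next arrival of the original process
        if t > t_end:
            break
        n += 1
        if rng.random() < p:        # independent Bernoulli switch per arrival
            na += 1
    na_counts.append(na)
    nb_counts.append(n - na)

mean_a = sum(na_counts) / trials    # theory: p * lam * t_end = 6
mean_b = sum(nb_counts) / trials    # theory: (1 - p) * lam * t_end = 14
# Sample covariance of the two output counts; independence predicts near 0.
cov = sum((a - mean_a) * (b - mean_b)
          for a, b in zip(na_counts, nb_counts)) / trials
print(mean_a, mean_b, cov)
```

The near-zero covariance is only a partial check of independence, of course, but it is the part that tends to surprise people, since the two streams are carved out of the very same arrivals.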
570 00:35:48,040 --> 00:35:51,600 And, of course, you need to know what the probability is 571 00:35:51,600 --> 00:35:55,970 of being switched this way and being switched that way. 572 00:35:55,970 --> 00:35:59,790 Each new process clearly has a stationary and independent 573 00:35:59,790 --> 00:36:01,360 increment property. 574 00:36:01,360 --> 00:36:04,450 Why is that? 575 00:36:04,450 --> 00:36:10,010 Well, you look at some increment of time and this 576 00:36:10,010 --> 00:36:15,680 process here is independent from one period of time to 577 00:36:15,680 --> 00:36:18,650 another period of time. 578 00:36:18,650 --> 00:36:23,930 The Bernoulli process is just switching the terms within 579 00:36:23,930 --> 00:36:27,220 that interval of time, which is independent of all other 580 00:36:27,220 --> 00:36:29,060 intervals of time. 581 00:36:29,060 --> 00:36:31,680 So that when you look at the combination of the Bernoulli 582 00:36:31,680 --> 00:36:35,580 process and the Poisson process, you have the 583 00:36:35,580 --> 00:36:40,680 stationary and independent increment property. 584 00:36:40,680 --> 00:36:44,890 And each satisfies the small increment property. 585 00:36:44,890 --> 00:36:48,410 If you look at a very small delta here, 586 00:36:48,410 --> 00:36:49,960 you see that each is Poisson. 587 00:36:49,960 --> 00:36:52,930 There's a more careful argument in the notes. 588 00:36:52,930 --> 00:36:56,150 What I'm trying to do in the lecture is not to give you 589 00:36:56,150 --> 00:36:59,580 careful proofs of things, but to give you some insight into 590 00:36:59,580 --> 00:37:01,520 why they're true. 591 00:37:01,520 --> 00:37:04,570 So that you can read the proof. 592 00:37:04,570 --> 00:37:07,580 And instead of going through each line and saying, yes, I 593 00:37:07,580 --> 00:37:10,410 agree with that, yes, I agree with that and you finally come 594 00:37:10,410 --> 00:37:13,780 to the end of the proof and you think what? 
595 00:37:13,780 --> 00:37:17,020 Which I've done all the time because I don't really know 596 00:37:17,020 --> 00:37:19,960 what's going on. 597 00:37:19,960 --> 00:37:23,680 And you don't learn anything from that kind of proof. 598 00:37:23,680 --> 00:37:28,200 If you're surprised when it's done it means you ought to go 599 00:37:28,200 --> 00:37:30,270 back and look at it more carefully because there's 600 00:37:30,270 --> 00:37:36,540 something there that you didn't understand while it was 601 00:37:36,540 --> 00:37:37,790 wandering past you. 602 00:37:41,230 --> 00:37:44,250 The small increment property really doesn't make it clear 603 00:37:44,250 --> 00:37:48,090 that the split processes are independent. 604 00:37:48,090 --> 00:37:54,000 And for independence both processes must sometimes have 605 00:37:54,000 --> 00:37:56,560 arrivals in the same small increment. 606 00:37:56,560 --> 00:37:58,470 And this independence is hidden in 607 00:37:58,470 --> 00:38:00,350 those o of delta terms. 608 00:38:00,350 --> 00:38:03,220 And if you want to resolve that for yourself, you really 609 00:38:03,220 --> 00:38:06,440 have to look at the text. 610 00:38:06,440 --> 00:38:09,380 And you have to do some work too. 611 00:38:09,380 --> 00:38:13,030 The nice thing about combining and splitting is that you 612 00:38:13,030 --> 00:38:15,350 typically do them together. 613 00:38:15,350 --> 00:38:19,600 Most of the places where you use combining and splitting 614 00:38:19,600 --> 00:38:22,690 you use both of them repeatedly. 615 00:38:22,690 --> 00:38:26,400 The typical thing you do is first, you look at separate 616 00:38:26,400 --> 00:38:29,000 independent Poisson processes. 617 00:38:29,000 --> 00:38:32,610 And you take those separate independent Poisson processes, 618 00:38:32,610 --> 00:38:37,440 and you say, I want to look at those as a combined process. 
619 00:38:37,440 --> 00:38:40,760 And after you look at them as a combined process, you then 620 00:38:40,760 --> 00:38:42,880 split them again. 621 00:38:42,880 --> 00:38:46,540 And what you're doing, when you then split them again, is 622 00:38:46,540 --> 00:38:50,740 you're saying these two independent Poisson processes 623 00:38:50,740 --> 00:38:54,720 that I started with, I can view them as one Poisson 624 00:38:54,720 --> 00:38:58,310 process plus a Bernoulli process. 625 00:38:58,310 --> 00:39:01,630 And you'd be amazed at how many problems you can solve by 626 00:39:01,630 --> 00:39:03,130 doing that. 627 00:39:03,130 --> 00:39:06,560 You will be amazed when you do the homework this time. 628 00:39:06,560 --> 00:39:12,220 If you don't use that property at least 10 times, you haven't 629 00:39:12,220 --> 00:39:18,005 really understood what's being asked for in those problems. 630 00:39:21,370 --> 00:39:25,060 Let's look at a simple example. 631 00:39:25,060 --> 00:39:28,030 Let's look at a last come, first served queue. 632 00:39:28,030 --> 00:39:31,980 Last come, first served queues are rather peculiar. 633 00:39:31,980 --> 00:39:38,310 They are queues where arrivals come in, and for some reason 634 00:39:38,310 --> 00:39:42,580 or other, when a new arrival comes in that arrival goes 635 00:39:42,580 --> 00:39:44,930 into the server immediately. 636 00:39:44,930 --> 00:39:47,010 And the queue gets backed up. 637 00:39:47,010 --> 00:39:49,795 Whatever was being served gets backed up. 638 00:39:49,795 --> 00:39:53,680 And the new arrival gets served, which is fine for the 639 00:39:53,680 --> 00:39:57,260 new arrival, unless another arrival comes 640 00:39:57,260 --> 00:40:00,410 in before it finishes. 641 00:40:00,410 --> 00:40:02,500 And then it gets backed up too. 642 00:40:02,500 --> 00:40:05,518 So things get backed up in the queue. 
643 00:40:05,518 --> 00:40:07,380 And those that are lucky enough to get 644 00:40:07,380 --> 00:40:08,880 through, get through. 645 00:40:08,880 --> 00:40:11,400 And those that aren't, don't. 646 00:40:11,400 --> 00:40:13,950 Anybody have any idea why that-- 647 00:40:13,950 --> 00:40:17,470 I mean, it sounds like a very unfair thing to do. 648 00:40:17,470 --> 00:40:22,570 But, in fact, it's not unfair because you're not singling 649 00:40:22,570 --> 00:40:25,270 out any particular group or anything. 650 00:40:25,270 --> 00:40:29,120 It's just that that's the rule you use. 651 00:40:29,120 --> 00:40:32,540 And it applies to everyone equally, so it's not unfair. 652 00:40:32,540 --> 00:40:35,880 But why might it make sense sometimes? 653 00:40:35,880 --> 00:40:37,122 Yeah? 654 00:40:37,122 --> 00:40:40,098 AUDIENCE: If you're doing a first come, first served 655 00:40:40,098 --> 00:40:43,580 you're going to have a certain distribution 656 00:40:43,580 --> 00:40:47,250 for your service time. 657 00:40:47,250 --> 00:40:50,948 Whereas, if you do last come, first served-- 658 00:40:50,948 --> 00:40:51,894 I don't know. 659 00:40:51,894 --> 00:40:54,505 I'm just trying to think that there might be some situations 660 00:40:54,505 --> 00:40:57,110 where the distribution of service times in that is 661 00:40:57,110 --> 00:41:00,427 favorable overall, even though some people, every once in a 662 00:41:00,427 --> 00:41:02,850 while, are feeling screwed. 663 00:41:02,850 --> 00:41:06,080 PROFESSOR: It's a good try. 664 00:41:06,080 --> 00:41:07,790 But it's not quite right. 665 00:41:07,790 --> 00:41:11,250 And, in fact, it's not right for a Poisson process because 666 00:41:11,250 --> 00:41:16,700 for a Poisson process the server is just chunking things 667 00:41:16,700 --> 00:41:21,490 out at rate mu, no matter what. 668 00:41:21,490 --> 00:41:23,980 So it doesn't really help anyone. 
669 00:41:23,980 --> 00:41:27,450 If you have a heavy tail distribution, if somebody who 670 00:41:27,450 --> 00:41:31,180 comes in and requires an enormous amount of service, 671 00:41:31,180 --> 00:41:35,340 then you get everybody else done first because that 672 00:41:35,340 --> 00:41:40,430 customer with a huge service requirement keeps getting 673 00:41:40,430 --> 00:41:44,210 pushed back every time the queue is empty he gets some 674 00:41:44,210 --> 00:41:45,220 service again. 675 00:41:45,220 --> 00:41:47,100 And then other people come in. 676 00:41:47,100 --> 00:41:48,440 And people with small service 677 00:41:48,440 --> 00:41:50,200 requirements get served quickly. 678 00:41:53,700 --> 00:41:55,200 And what that means is it's not quite 679 00:41:55,200 --> 00:41:57,550 as crazy as it sounds. 680 00:41:57,550 --> 00:42:02,450 But the reason we want to look at it here is because it's a 681 00:42:02,450 --> 00:42:04,470 nice example of this combining and 682 00:42:04,470 --> 00:42:07,430 splitting of Poisson processes. 683 00:42:07,430 --> 00:42:09,970 So how does that happen? 684 00:42:09,970 --> 00:42:14,530 Well, you view services as a Poisson process. 685 00:42:14,530 --> 00:42:18,640 Namely, we have an exponential server here, where the time 686 00:42:18,640 --> 00:42:22,040 for each service is exponentially distributed. 687 00:42:22,040 --> 00:42:24,660 Now, if you're awake, you point out, well, what happens 688 00:42:24,660 --> 00:42:27,850 when the server has nothing to do? 689 00:42:27,850 --> 00:42:32,710 Well, just suppose that there's some very low priority 690 00:42:32,710 --> 00:42:36,870 set of jobs that the server is doing, which also are 691 00:42:36,870 --> 00:42:40,200 exponentially distributed that the server goes to when it has 692 00:42:40,200 --> 00:42:41,510 nothing to do. 
693 00:42:41,510 --> 00:42:47,520 So the server is still doing things exponentially at rate 694 00:42:47,520 --> 00:42:51,770 mu, but if there's nothing real to do then 695 00:42:51,770 --> 00:42:53,570 the output is wasted. 696 00:42:53,570 --> 00:42:55,910 So that's a Poisson process. 697 00:42:55,910 --> 00:42:59,940 The incoming process is Poisson also. 698 00:42:59,940 --> 00:43:04,880 So the arrival process, plus the service process, they're 699 00:43:04,880 --> 00:43:06,030 independent. 700 00:43:06,030 --> 00:43:06,980 So they're Poisson. 701 00:43:06,980 --> 00:43:10,900 And the rate is lambda plus mu. 702 00:43:10,900 --> 00:43:16,020 Now, interesting question, what's the probability that an 703 00:43:16,020 --> 00:43:20,030 arrival completes service before being interrupted? 704 00:43:20,030 --> 00:43:22,120 I'm lucky, I get into the server. 705 00:43:22,120 --> 00:43:23,830 I start getting served. 706 00:43:23,830 --> 00:43:26,950 What's the probability that I finish before I get 707 00:43:26,950 --> 00:43:28,200 interrupted? 708 00:43:33,380 --> 00:43:37,200 Well, I get into service at a particular time t. 709 00:43:37,200 --> 00:43:41,640 I look at this combined Poisson process of arrivals 710 00:43:41,640 --> 00:43:47,630 and services and if the first arrival in this combined 711 00:43:47,630 --> 00:43:53,740 process gets switched to service, I'm done. 712 00:43:53,740 --> 00:43:58,460 If it gets switched to arrival, I'm interrupted. 713 00:43:58,460 --> 00:44:02,180 So the question is, what's the probability that I get 714 00:44:02,180 --> 00:44:04,380 switched to-- 715 00:44:08,360 --> 00:44:10,940 I want to find the probability that I'm interrupted, so 716 00:44:10,940 --> 00:44:14,240 what's the probability that a new arrival in the combined 717 00:44:14,240 --> 00:44:20,480 process gets switched to arrivals, because that's the 718 00:44:20,480 --> 00:44:22,270 case where I get interrupted. 719 00:44:22,270 --> 00:44:23,520 And it's just lambda. 
720 00:44:29,470 --> 00:44:32,730 The probability that I get interrupted is lambda divided 721 00:44:32,730 --> 00:44:34,290 by lambda plus mu. 722 00:44:34,290 --> 00:44:39,180 And the probability that I complete my service is mu over 723 00:44:39,180 --> 00:44:40,440 lambda plus mu. 724 00:44:40,440 --> 00:44:44,450 When mu is very large, when the server is going very, very 725 00:44:44,450 --> 00:44:47,640 fast, I have a good chance of finishing before being 726 00:44:47,640 --> 00:44:48,600 interrupted. 727 00:44:48,600 --> 00:44:51,120 When it's the other way, I have a much 728 00:44:51,120 --> 00:44:54,120 smaller chance of finishing. 729 00:44:54,120 --> 00:44:57,340 More interesting, and more difficult case, given that 730 00:44:57,340 --> 00:45:01,410 you're interrupted, what is the probability that you have 731 00:45:01,410 --> 00:45:03,730 no further interruptions? 732 00:45:03,730 --> 00:45:07,360 In other words, I'm being served, I get interrupted, so 733 00:45:07,360 --> 00:45:10,760 now I'm sitting at the front of the queue. 734 00:45:10,760 --> 00:45:14,260 Everybody else that came in before me is in back of me. 735 00:45:14,260 --> 00:45:16,590 I'm sitting there at the front of the queue. 736 00:45:16,590 --> 00:45:21,360 This interrupting customer is being served. 737 00:45:21,360 --> 00:45:33,700 What's the probability that I'm going to finish my service 738 00:45:33,700 --> 00:45:36,110 before any further interruption occurs? 739 00:45:36,110 --> 00:45:39,430 Have to be careful about how to state this. 740 00:45:39,430 --> 00:45:42,900 The probability that there is no further interruption is 741 00:45:42,900 --> 00:45:48,700 that two services occur before the next arrival. 742 00:45:52,580 --> 00:45:56,710 And the probability of that is mu over lambda plus mu the 743 00:45:56,710 --> 00:45:57,960 quantity squared. 744 00:46:01,990 --> 00:46:04,380 Now, whether you agree with that number or not is 745 00:46:04,380 --> 00:46:05,050 immaterial. 
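If you do doubt those numbers, they are easy to check by simulation. In the sketch below (the rates lambda = 1 and mu = 3 are arbitrary), memorylessness lets us race a fresh exponential service of rate mu against a fresh exponential arrival of rate lambda, and "no further interruption" requires two service completions before the next arrival.

```python
import random

rng = random.Random(3)
lam, mu, trials = 1.0, 3.0, 200000

complete = 0   # a service finishes before the next (interrupting) arrival
no_more = 0    # two service completions happen before the next arrival
for _ in range(trials):
    next_arrival = rng.expovariate(lam)   # time until the next arrival
    s1 = rng.expovariate(mu)              # first service time
    s2 = rng.expovariate(mu)              # second service time
    if s1 < next_arrival:
        complete += 1
    if s1 + s2 < next_arrival:
        no_more += 1

p1 = complete / trials   # theory: mu / (lam + mu) = 0.75
p2 = no_more / trials    # theory: (mu / (lam + mu))**2 = 0.5625
print(p1, p2)
```

This is exactly the combined-process view: each event of the merged rate lambda plus mu process is a service with probability mu over lambda plus mu, independently of everything before it, so two services in a row before an arrival has the squared probability.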
746 00:46:05,050 --> 00:46:08,670 The thing I want you to understand is that this is a 747 00:46:08,670 --> 00:46:14,140 method which you can use in quite complicated 748 00:46:14,140 --> 00:46:15,390 circumstances. 749 00:46:17,100 --> 00:46:23,380 And it's something that applies in so many places that 750 00:46:23,380 --> 00:46:25,820 it's amazing. 751 00:46:25,820 --> 00:46:28,060 OK, let's talk a little bit about 752 00:46:28,060 --> 00:46:31,325 non-homogeneous Poisson processes. 753 00:46:35,990 --> 00:46:38,960 Maybe the most important application of this is optical 754 00:46:38,960 --> 00:46:40,260 transmission. 755 00:46:40,260 --> 00:46:44,410 There's an optical stream of photons that's modulated by 756 00:46:44,410 --> 00:46:46,230 variable power. 757 00:46:46,230 --> 00:46:49,590 The photon stream is reasonably modeled as a Poisson process, 758 00:46:49,590 --> 00:46:51,770 not perfectly modeled that way. 759 00:46:51,770 --> 00:46:55,380 But again, what we're doing here is saying, let's look at 760 00:46:55,380 --> 00:46:56,540 this model. 761 00:46:56,540 --> 00:46:59,850 And then, see whether the consequences of the model 762 00:46:59,850 --> 00:47:02,010 apply to the physical situation. 763 00:47:02,010 --> 00:47:05,720 The modulation converts the steady photon rate into a 764 00:47:05,720 --> 00:47:07,370 variable rate. 765 00:47:07,370 --> 00:47:13,210 So the arrivals are being served, namely, they're being 766 00:47:13,210 --> 00:47:16,190 transmitted at some rate, lambda of 767 00:47:16,190 --> 00:47:19,570 t, which varies with time. 768 00:47:19,570 --> 00:47:22,800 I mean, the photons have nothing to do with the 769 00:47:22,800 --> 00:47:25,230 information at all. 770 00:47:25,230 --> 00:47:27,350 They're just random photons. 771 00:47:27,350 --> 00:47:30,640 And the information is represented by lambda of t. 772 00:47:30,640 --> 00:47:32,500 The question is, can you actually send 773 00:47:32,500 --> 00:47:34,400 anything this way? 
774 00:47:34,400 --> 00:47:38,710 We model the number of photons in any interval t to t plus 775 00:47:38,710 --> 00:47:43,660 delta as a Poisson random variable whose rate parameter 776 00:47:43,660 --> 00:47:48,030 over t to t plus delta, for very small delta, is the 777 00:47:48,030 --> 00:47:52,210 average photon rate over t to t plus delta, times delta. 778 00:47:52,210 --> 00:47:55,780 So we go through this small increment model again. 779 00:47:55,780 --> 00:47:59,140 But in the small increment model, we have a lambda of t 780 00:47:59,140 --> 00:48:04,100 instead of a lambda and everything else is the same. 781 00:48:04,100 --> 00:48:10,510 And when you carry this out, what you find-- 782 00:48:10,510 --> 00:48:13,740 and I'm not going to show this or anything. 783 00:48:13,740 --> 00:48:16,550 I'm just going to tell it to you because it's in the notes 784 00:48:16,550 --> 00:48:19,280 if you want to read more about it. 785 00:48:19,280 --> 00:48:24,530 The probability that you have a given number of arrivals in 786 00:48:24,530 --> 00:48:30,320 some given interval is a Poisson random variable again. 787 00:48:30,320 --> 00:48:33,010 Namely, whether the process is homogeneous or not 788 00:48:33,010 --> 00:48:42,320 homogeneous, if the original photons are Poisson then what 789 00:48:42,320 --> 00:48:47,340 you wind up with is this parameter 790 00:48:47,340 --> 00:48:49,680 of the Poisson process. 791 00:48:49,680 --> 00:48:56,370 m hat to the n, times e to the minus m hat, divided by n factorial. 792 00:48:56,370 --> 00:48:58,820 And this parameter m hat here is what? 793 00:48:58,820 --> 00:49:02,080 It's the integral of the arrival rate over that interval. 794 00:49:02,080 --> 00:49:05,770 So it's exactly what you'd expect it to be. 795 00:49:05,770 --> 00:49:08,810 So what this is saying is combining and splitting 796 00:49:08,810 --> 00:49:12,980 non-homogeneous processes still works like in the 797 00:49:12,980 --> 00:49:14,680 homogeneous case. 
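That counting claim can be sanity-checked by simulation. The sketch below uses a made-up rate function, lambda of t equal to 2 plus sine t, which is not from the lecture. It generates the non-homogeneous process by thinning a homogeneous one, then checks that the count over (0, t] has mean and variance equal to m hat, the integral of lambda of t over the interval.

```python
import math
import random

def thinned_count(rate_fn, rate_max, t_end, rng):
    """Thinning: run a homogeneous Poisson process of rate rate_max and keep
    each arrival at epoch t independently with probability rate_fn(t) / rate_max.
    Requires rate_fn(t) <= rate_max on (0, t_end]."""
    n, t = 0, 0.0
    while True:
        t += rng.expovariate(rate_max)
        if t > t_end:
            return n
        if rng.random() < rate_fn(t) / rate_max:
            n += 1

rng = random.Random(4)
rate_fn = lambda t: 2.0 + math.sin(t)   # made-up variable rate lambda(t)
rate_max, t_end, trials = 3.0, 5.0, 20000

counts = [thinned_count(rate_fn, rate_max, t_end, rng) for _ in range(trials)]
mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials

# m hat = integral of lambda(t) over (0, 5] = 2 * 5 + (1 - cos 5)
m_hat = 2.0 * t_end + (1.0 - math.cos(t_end))
print(mean, var, m_hat)   # mean and variance should both be close to m_hat
```

Thinning is itself just the splitting idea again: a homogeneous process of rate lambda max is split, arrival by arrival, with a time-varying coin.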
798 00:49:14,680 --> 00:49:17,630 The independent exponential interarrivals don't work, though. 799 00:49:24,280 --> 00:49:27,280 OK, let me say that another way. 800 00:49:27,280 --> 00:49:30,490 When you're looking at non-homogeneous Poisson 801 00:49:30,490 --> 00:49:35,310 processes, looking at them in terms of a Poisson counting 802 00:49:35,310 --> 00:49:38,240 process is really a neat thing to do. 803 00:49:38,240 --> 00:49:41,930 Because all of the Poisson counting process formulas 804 00:49:41,930 --> 00:49:43,100 still work. 805 00:49:43,100 --> 00:49:45,040 And all you have to do when you want to look at the 806 00:49:45,040 --> 00:49:48,430 distribution of the number of arrivals in an interval is 807 00:49:48,430 --> 00:49:50,950 look at the average arrival rate over that interval. 808 00:49:50,950 --> 00:49:53,390 And that tells you everything. 809 00:49:53,390 --> 00:49:56,460 When you look at the interarrival times and things 810 00:49:56,460 --> 00:50:00,180 like that, they aren't exponential anymore. 811 00:50:00,180 --> 00:50:01,430 And none of that works. 812 00:50:04,790 --> 00:50:09,050 So you get half the pie, but not the whole thing. 813 00:50:09,050 --> 00:50:15,100 Now, the other thing, which is really fun, and which will 814 00:50:15,100 --> 00:50:22,690 take us a bit of time, is to study, now not a small 815 00:50:22,690 --> 00:50:25,380 increment but a bigger interval, a Poisson 816 00:50:25,380 --> 00:50:29,900 process when you condition on N of t. 817 00:50:29,900 --> 00:50:32,970 In other words, if somebody tells you how many arrivals 818 00:50:32,970 --> 00:50:37,590 have occurred by time t, how do you analyze where those 819 00:50:37,590 --> 00:50:41,240 arrivals occur, where the arrival epochs are, 820 00:50:41,240 --> 00:50:42,820 and things like that? 
821 00:50:42,820 --> 00:50:47,340 Well, since we're conditioning on N of t, let's start out 822 00:50:47,340 --> 00:50:51,460 with N of t equals 23 because that's obviously the simplest 823 00:50:51,460 --> 00:50:53,830 number to start out with. 824 00:50:53,830 --> 00:50:54,950 Obviously not. 825 00:50:54,950 --> 00:50:58,950 We start out with N of t equals 1 because that clearly 826 00:50:58,950 --> 00:51:01,130 is a simpler thing to start with. 827 00:51:01,130 --> 00:51:03,500 So here's the situation we have. 828 00:51:03,500 --> 00:51:06,200 We're assuming that there's a one arrival 829 00:51:06,200 --> 00:51:07,670 up until this time. 830 00:51:07,670 --> 00:51:11,380 That one arrival has to occur somewhere in 831 00:51:11,380 --> 00:51:12,660 this interval here. 832 00:51:12,660 --> 00:51:14,300 We don't know where it is. 833 00:51:14,300 --> 00:51:19,040 So we say, well, let's suppose that it's in the interval s1 834 00:51:19,040 --> 00:51:24,460 to s1 plus delta and try to find the probability that it 835 00:51:24,460 --> 00:51:26,840 actually is in that interval. 836 00:51:26,840 --> 00:51:30,070 So we start out with something that looks ugly. 837 00:51:30,070 --> 00:51:32,770 But we'll see it isn't as ugly as it looks like. 838 00:51:32,770 --> 00:51:38,297 We'll look at the conditional density of s1 given that N of 839 00:51:38,297 --> 00:51:40,480 t is equal to 1. 840 00:51:40,480 --> 00:51:41,710 That's what this is. 841 00:51:41,710 --> 00:51:44,110 This is a conditional density here. 842 00:51:44,110 --> 00:51:47,340 And we say, well, how do we find the conditional density? 843 00:51:47,340 --> 00:51:49,710 This a one dimensional density. 844 00:51:49,710 --> 00:51:51,680 So what we're going to do is we're going to look at the 845 00:51:51,680 --> 00:51:56,540 probability that there's one arrival in this tiny interval. 
846 00:51:56,540 --> 00:51:58,840 And then we're going to divide by delta and that gives us the 847 00:51:58,840 --> 00:52:01,700 density, if everything is well defined. 848 00:52:01,700 --> 00:52:06,070 So the density is going to be the limit as delta goes to 849 00:52:06,070 --> 00:52:11,200 zero, which is the way you always define densities, of 850 00:52:11,200 --> 00:52:19,300 the probability that there are no arrivals in here. 851 00:52:19,300 --> 00:52:21,560 That there's one arrival here. 852 00:52:21,560 --> 00:52:23,910 And that there are no arrivals here. 853 00:52:23,910 --> 00:52:27,840 So probability of zero arrivals here. 854 00:52:27,840 --> 00:52:30,950 Probability of one arrival here. 855 00:52:30,950 --> 00:52:33,560 And probability of zero arrivals here. 856 00:52:33,560 --> 00:52:35,580 We want to find that probability. 857 00:52:35,580 --> 00:52:38,740 We want to divide for the conditioning by the 858 00:52:38,740 --> 00:52:44,240 probability that there was one arrival in the whole interval. 859 00:52:44,240 --> 00:52:47,360 This delta here is because we want to go to the limit and 860 00:52:47,360 --> 00:52:49,540 get a density. 861 00:52:49,540 --> 00:52:52,350 OK, when we look at this and we calculate this, e to the 862 00:52:52,350 --> 00:52:56,100 minus lambda s1 is what this is. 863 00:52:56,100 --> 00:52:59,780 Lambda delta times e to the minus lambda delta 864 00:52:59,780 --> 00:53:02,200 is what this is. 865 00:53:02,200 --> 00:53:06,340 And e to the minus lambda t minus s1 minus delta 866 00:53:06,340 --> 00:53:07,290 is what this is. 867 00:53:07,290 --> 00:53:14,665 Because this last interval is t minus s1 minus delta. 868 00:53:18,070 --> 00:53:22,795 Now, the remarkable thing is, when I multiply e to 869 00:53:22,795 --> 00:53:28,560 the minus lambda s1 times e to the minus lambda delta times e 870 00:53:28,560 --> 00:53:31,940 to the minus lambda t minus s1 minus delta, the exponents add. 871 00:53:31,940 --> 00:53:33,480 There's an s1 here.
872 00:53:33,480 --> 00:53:34,830 And an s1 here. 873 00:53:34,830 --> 00:53:36,190 There's a delta here. 874 00:53:36,190 --> 00:53:37,650 And a delta here. 875 00:53:37,650 --> 00:53:41,050 This whole thing is e to the minus lambda delta. 876 00:53:41,050 --> 00:53:44,440 And it cancels out with this e to the minus lambda delta. 877 00:53:44,440 --> 00:53:47,150 That's going to happen no matter how many 878 00:53:47,150 --> 00:53:48,390 arrivals I put in here. 879 00:53:48,390 --> 00:53:51,740 So that's an interesting thing to observe. 880 00:53:51,740 --> 00:53:57,150 So what I wind up with is lambda delta e to the minus lambda t up here, lambda 881 00:53:57,150 --> 00:54:00,330 t e to the minus lambda t times delta down here, and my probability 882 00:54:00,330 --> 00:54:04,050 density is 1 over t. 883 00:54:04,050 --> 00:54:07,530 And now we look at that and we say, my God, how strange. 884 00:54:07,530 --> 00:54:08,930 And then we look at it and we say, oh, of 885 00:54:08,930 --> 00:54:09,970 course, it's obvious. 886 00:54:09,970 --> 00:54:10,360 Yes? 887 00:54:10,360 --> 00:54:14,792 AUDIENCE: Why do you use N of s1 and then the N tilde for the 888 00:54:14,792 --> 00:54:18,700 other two in the first part, in the limit? 889 00:54:18,700 --> 00:54:18,980 PROFESSOR: Here? 890 00:54:18,980 --> 00:54:21,590 AUDIENCE: You have P of N of s1 and then the N tilde on the top. 891 00:54:21,590 --> 00:54:23,920 Why do you use the N and the N tilde? 892 00:54:27,615 --> 00:54:30,090 PROFESSOR: N of t is not shown in the figure. 893 00:54:30,090 --> 00:54:35,220 N of t just says there is one arrival up until time t, which 894 00:54:35,220 --> 00:54:36,980 is what lets me draw the figure. 895 00:54:36,980 --> 00:54:38,852 AUDIENCE: I meant in the limit as delta 896 00:54:38,852 --> 00:54:41,200 approaches zero, Professor. 897 00:54:41,200 --> 00:54:44,943 You used P of N and then P of N tilde for the other two. 898 00:54:44,943 --> 00:54:46,899 Does that signify anything?
899 00:54:51,300 --> 00:54:56,560 PROFESSOR: OK, this term is the probability that we have 900 00:54:56,560 --> 00:54:59,110 no arrivals between zero and s1. 901 00:55:03,130 --> 00:55:08,440 This N tilde means number of arrivals in an interval, which 902 00:55:08,440 --> 00:55:11,550 starts at s1 and goes to s1 plus delta. 903 00:55:11,550 --> 00:55:15,240 So this is the probability we have one arrival somewhere in 904 00:55:15,240 --> 00:55:16,590 this interval. 905 00:55:16,590 --> 00:55:20,420 And this term here is the probability that we have no 906 00:55:20,420 --> 00:55:27,360 arrivals in s1 plus delta up until t. 907 00:55:27,360 --> 00:55:31,130 And when we write all of those things out everything cancels 908 00:55:31,130 --> 00:55:33,370 out except the 1 over t. 909 00:55:33,370 --> 00:55:35,320 And then we look at that and we say, of course. 910 00:55:43,960 --> 00:55:47,940 If you look at the small increment definition of a 911 00:55:47,940 --> 00:55:54,950 Poisson process, it says that arrivals are equally likely at 912 00:55:54,950 --> 00:55:57,250 any point along here. 913 00:55:57,250 --> 00:56:00,290 If I condition on the fact that there's been one arrival 914 00:56:00,290 --> 00:56:03,500 in this whole interval, and they're equally likely to 915 00:56:03,500 --> 00:56:07,750 occur anywhere along here, the only sensible thing that can 916 00:56:07,750 --> 00:56:12,320 happen is that the probability density for where that arrival 917 00:56:12,320 --> 00:56:16,350 happens is uniform over this whole interval. 918 00:56:20,966 --> 00:56:23,780 Now, the important point is that this 919 00:56:23,780 --> 00:56:26,000 doesn't depend on s1. 920 00:56:26,000 --> 00:56:28,260 And it doesn't depend on lambda. 921 00:56:28,260 --> 00:56:31,030 That's a little surprising also. 
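The uniformity claim, and the fact that it does not depend on lambda, can be checked directly by simulation: generate Poisson sample paths on (0, t], keep only those with exactly one arrival, and look at where that arrival fell. This is a sketch with arbitrary made-up values of lambda and t.

```python
import random

def poisson_epochs(lam, t, rng):
    """Arrival epochs of a rate-lam Poisson process on (0, t]."""
    epochs, s = [], 0.0
    while True:
        s += rng.expovariate(lam)   # exponential interarrival times
        if s > t:
            return epochs
        epochs.append(s)

# condition on N(t) == 1 and record where that single arrival fell
rng = random.Random(0)
lam, t = 1.3, 2.0
samples = []
while len(samples) < 20000:
    ep = poisson_epochs(lam, t, rng)
    if len(ep) == 1:
        samples.append(ep[0])

mean_epoch = sum(samples) / len(samples)                     # uniform on (0, t) has mean t/2
frac_first_half = sum(x < t / 2 for x in samples) / len(samples)
```

Repeating this with a different lambda gives the same uniform answer, which is the "washing out" of lambda discussed below.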
922 00:56:31,030 --> 00:56:33,580 I mean, we have this Poisson process where arrivals are 923 00:56:33,580 --> 00:56:38,340 dependent on lambda, but you see, as soon as you say N of t 924 00:56:38,340 --> 00:56:43,410 equals 1, you sort of wash that out. 925 00:56:43,410 --> 00:56:46,300 Because you're already conditioning on the fact that 926 00:56:46,300 --> 00:56:49,280 there was only one arrival by this time. 927 00:56:49,280 --> 00:56:52,710 So you have this nice result here. 928 00:56:52,710 --> 00:56:58,200 But a result which looks a little suspicious. 929 00:56:58,200 --> 00:57:00,560 Well, we were successful with that. 930 00:57:00,560 --> 00:57:04,160 Let's go on to N of t equals 2. 931 00:57:04,160 --> 00:57:07,370 We'll get the general picture with this. 932 00:57:07,370 --> 00:57:17,290 We want to look at the probability density of 933 00:57:17,290 --> 00:57:19,830 an arrival at this s1 934 00:57:19,830 --> 00:57:23,360 and an arrival here, given that there were only two 935 00:57:23,360 --> 00:57:26,840 arrivals overall. 936 00:57:26,840 --> 00:57:32,930 And again, we take the limit as delta goes to zero. 937 00:57:32,930 --> 00:57:36,420 And we take a little delta interval here, a little delta 938 00:57:36,420 --> 00:57:37,240 interval here. 939 00:57:37,240 --> 00:57:41,600 So we're finding the probability that there are two 940 00:57:41,600 --> 00:57:46,280 arrivals, one between s1 and s1 plus delta and the other 941 00:57:46,280 --> 00:57:50,700 one between s2 and s2 plus delta. 942 00:57:50,700 --> 00:57:55,025 So that probability should be proportional to delta squared. 943 00:57:55,025 --> 00:57:58,080 And we're dividing by delta squared to find the joint 944 00:57:58,080 --> 00:58:01,220 probability density. 945 00:58:01,220 --> 00:58:04,440 If you don't believe that, go back and look at how joint 946 00:58:04,440 --> 00:58:07,360 probability densities are defined. 947 00:58:07,360 --> 00:58:10,480 I mean, you have to define them somehow.
948 00:58:10,480 --> 00:58:15,620 So you have to look at the probability in a small area. 949 00:58:15,620 --> 00:58:18,380 And then shrink the area to zero. 950 00:58:18,380 --> 00:58:22,920 And the area of the area is delta squared. 951 00:58:22,920 --> 00:58:25,040 And you have to divide by delta squared as 952 00:58:25,040 --> 00:58:26,490 you go to the limit. 953 00:58:26,490 --> 00:58:29,800 OK, so, we do the same thing that we did before. 954 00:58:29,800 --> 00:58:34,090 The probability of no arrivals in here is e to the 955 00:58:34,090 --> 00:58:35,430 minus lambda s1. 956 00:58:35,430 --> 00:58:40,710 In other words, we are taking the unconditional probability 957 00:58:40,710 --> 00:58:44,450 and then we're dividing by the conditioning probability. 958 00:58:44,450 --> 00:58:47,170 So the probability that there's nothing arriving in 959 00:58:47,170 --> 00:58:50,170 here is that. 960 00:58:50,170 --> 00:58:53,890 The probability there's one arrival in here is that. 961 00:58:53,890 --> 00:58:56,330 Probability that there's nothing in 962 00:58:56,330 --> 00:58:58,720 this interval is this. 963 00:58:58,720 --> 00:59:02,600 Probability there's one in this interval is lambda delta 964 00:59:02,600 --> 00:59:04,750 times e to the minus lambda delta. 965 00:59:04,750 --> 00:59:09,350 And finally, this last term here, we have to divide by 966 00:59:09,350 --> 00:59:11,470 delta squared to go to a density. 967 00:59:11,470 --> 00:59:15,200 We have to divide by the probability that there were 968 00:59:15,200 --> 00:59:18,620 actually just two arrivals here. 969 00:59:18,620 --> 00:59:21,490 Well, again, the same thing happens. 970 00:59:21,490 --> 00:59:26,100 If we take all the exponentials, lambda s1, 971 00:59:26,100 --> 00:59:34,110 lambda delta, lambda s2 in here, this term 972 00:59:34,110 --> 00:59:36,350 cancels out this term. 973 00:59:36,350 --> 00:59:40,420 This term here cancels out this term.
974 00:59:40,420 --> 00:59:47,750 And all we're left with is e to the minus lambda t up here, 975 00:59:47,750 --> 00:59:50,620 e to the minus lambda t down here. 976 00:59:50,620 --> 00:59:58,590 And because the PMF for N of t is lambda t to the n times e to the minus 977 00:59:58,590 --> 01:00:06,120 lambda t over n factorial, and n factorial for n equals 2 is 2. 978 01:00:06,120 --> 01:00:10,220 We wind up with 2 over t squared. 979 01:00:10,220 --> 01:00:12,670 Now that's a little bit surprising. 980 01:00:12,670 --> 01:00:14,740 I'm going to show you a picture in just a little bit, 981 01:00:14,740 --> 01:00:19,510 which makes it clear what that 2 is doing there. 982 01:00:19,510 --> 01:00:22,470 But let's not worry about it too much for the time being. 983 01:00:22,470 --> 01:00:25,880 The important thing here is, again, 984 01:00:25,880 --> 01:00:28,580 it's independent of lambda. 985 01:00:28,580 --> 01:00:34,020 And it's independent of s1 and s2. 986 01:00:34,020 --> 01:00:37,390 Namely, given that there are two arrivals in here, it doesn't 987 01:00:37,390 --> 01:00:40,150 make any difference where they are. 988 01:00:40,150 --> 01:00:45,060 This is, in some sense, uniform over s1 and s2. 989 01:00:45,060 --> 01:00:49,250 And I have to be more careful about what uniform means here. 990 01:00:49,250 --> 01:00:52,320 But we'll do that in just a little bit. 991 01:00:52,320 --> 01:00:55,170 Now we can do the same thing for N of t equals N, for 992 01:00:55,170 --> 01:00:59,690 arbitrary N. And when I put in an arbitrary number of 993 01:00:59,690 --> 01:01:04,720 arrivals here I still get this property that when I take e to 994 01:01:04,720 --> 01:01:08,420 the minus lambda of this region, e to the minus lambda 995 01:01:08,420 --> 01:01:15,250 times this region, this, this, this, this, this, this and so 996 01:01:15,250 --> 01:01:19,070 forth, when I get all done, it's e to the minus lambda t. 997 01:01:19,070 --> 01:01:23,890 And it cancels out with this term down here.
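The two-arrival result can be sanity-checked by simulation: conditional on N(t) = 2, the pair of epochs should behave like the min and max of two independent uniforms on (0, t), for which P(max minus min > t/2) = 1/4 regardless of lambda. A sketch with made-up parameters:

```python
import random

rng = random.Random(1)
lam, t = 0.8, 1.0
gap_count, n_kept, N = 0, 0, 30000
while n_kept < N:
    # generate a rate-lam Poisson sample path on (0, t]
    ep, s = [], 0.0
    while True:
        s += rng.expovariate(lam)
        if s > t:
            break
        ep.append(s)
    if len(ep) == 2:               # condition on exactly two arrivals
        n_kept += 1
        if ep[1] - ep[0] > t / 2:  # event whose probability we can compute
            gap_count += 1

frac = gap_count / N
# For the min and max of two independent uniforms on [0, t],
# P(max - min > t/2) = 1/4, independent of lam.
```

Changing lam changes how many paths get discarded by the conditioning, but not the answer, which is the point of the lecture's observation.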
998 01:01:23,890 --> 01:01:25,440 We've done that down here. 999 01:01:29,150 --> 01:01:35,850 And what you come up with is this joint density is equal to 1000 01:01:35,850 --> 01:01:42,740 N factorial over t to the N. So again, it's the same story. 1001 01:01:42,740 --> 01:01:44,080 It's uniform. 1002 01:01:44,080 --> 01:01:48,230 It doesn't depend on where the s's are, except for the fact 1003 01:01:48,230 --> 01:01:52,080 that s1 is less than s2 is less than s3. 1004 01:01:52,080 --> 01:01:53,950 We have to sort that out. 1005 01:01:53,950 --> 01:01:58,520 It has this peculiar N factorial here that doesn't 1006 01:01:58,520 --> 01:02:00,080 depend on lambda, as I said. 1007 01:02:00,080 --> 01:02:03,120 It does depend on N in this way. 1008 01:02:03,120 --> 01:02:07,410 That's a uniform N dimensional probability density over the 1009 01:02:07,410 --> 01:02:11,980 volume t to the N over N factorial corresponding to the 1010 01:02:11,980 --> 01:02:17,060 constraint region zero less than s1 less than blah, blah, 1011 01:02:17,060 --> 01:02:24,150 blah, less than s sub N. Now that 1012 01:02:24,150 --> 01:02:27,410 region is a little peculiar. 1013 01:02:27,410 --> 01:02:30,310 This N factorial is a little peculiar. 1014 01:02:30,310 --> 01:02:33,200 The t to the N is pretty obvious. 1015 01:02:33,200 --> 01:02:38,220 Because when you have a joint density over N things, each 1016 01:02:38,220 --> 01:02:41,470 bounded between zero and t, you expect to see 1017 01:02:41,470 --> 01:02:42,720 a t to the N there. 1018 01:02:45,280 --> 01:02:49,930 So if you ask, what's going on? 1019 01:02:49,930 --> 01:02:53,780 In a very simple-minded way, the question is how did this 1020 01:02:53,780 --> 01:02:59,050 derivation know that the volume of the region zero 1021 01:02:59,050 --> 01:03:04,360 less than s1, less than s2, and so forth, less than s sub N, is t to the N 1022 01:03:04,360 --> 01:03:05,930 over N factorial?
1023 01:03:05,930 --> 01:03:08,950 I have a uniform probability density over 1024 01:03:08,950 --> 01:03:11,320 some strange volume. 1025 01:03:11,320 --> 01:03:13,790 And it's a strange volume which satisfies this 1026 01:03:13,790 --> 01:03:15,360 constraint. 1027 01:03:15,360 --> 01:03:20,030 How do I know that this is N factorial here? 1028 01:03:20,030 --> 01:03:21,750 Very confusing at first. 1029 01:03:21,750 --> 01:03:27,050 We saw the same question when we derived the Erlang 1030 01:03:27,050 --> 01:03:28,810 density, remember? 1031 01:03:28,810 --> 01:03:31,600 There was an N factorial there. 1032 01:03:31,600 --> 01:03:35,520 And that N factorial, when we derived the Poisson 1033 01:03:35,520 --> 01:03:39,850 distribution, led to an N factorial there. 1034 01:03:39,850 --> 01:03:43,370 And, in fact, the fact that we were using the Poisson 1035 01:03:43,370 --> 01:03:48,540 distribution there is why that N factorial appears here. 1036 01:03:48,540 --> 01:03:52,370 It was because we stuck that in when we were doing the 1037 01:03:52,370 --> 01:03:54,790 derivation of the Poisson PMF. 1038 01:03:54,790 --> 01:03:56,900 But it still seems a little mystifying. 1039 01:03:56,900 --> 01:03:58,800 So I want to try to resolve that mystery. 1040 01:04:04,020 --> 01:04:07,090 And to resolve it let's look at a 1041 01:04:07,090 --> 01:04:10,730 supposedly unrelated problem. 1042 01:04:10,730 --> 01:04:17,810 And the unrelated problem is the following, suppose U1 up 1043 01:04:17,810 --> 01:04:22,790 to U sub n are n IID random variables, independent and 1044 01:04:22,790 --> 01:04:26,900 identically distributed random variables, each uniform 1045 01:04:26,900 --> 01:04:29,320 over zero to t. 1046 01:04:29,320 --> 01:04:32,430 Well, now, the probability density for those n random 1047 01:04:32,430 --> 01:04:34,600 variables is very, very simple. 1048 01:04:34,600 --> 01:04:38,380 Because each one of them is uniform over zero to t.
1049 01:04:38,380 --> 01:04:43,850 The joint density is 1 over t to the n. 1050 01:04:43,850 --> 01:04:45,440 OK, no problem there. 1051 01:04:45,440 --> 01:04:46,760 They're each independent. 1052 01:04:46,760 --> 01:04:49,440 So the probability densities multiply. 1053 01:04:49,440 --> 01:04:53,400 Each one has a probability density 1 over t. 1054 01:04:53,400 --> 01:04:56,230 No matter where it is because it's uniform. 1055 01:04:56,230 --> 01:04:57,930 So you multiply them together. 1056 01:04:57,930 --> 01:05:02,050 And no matter where you are in this n dimensional cube, the 1057 01:05:02,050 --> 01:05:07,090 probability density is 1 over t to the n. 1058 01:05:07,090 --> 01:05:11,380 The next thing I want to do is define-- 1059 01:05:11,380 --> 01:05:15,380 now these are supposedly not the same random variables as 1060 01:05:15,380 --> 01:05:17,470 these arrival epochs we had before. 1061 01:05:17,470 --> 01:05:21,340 These are just defined in terms of these 1062 01:05:21,340 --> 01:05:22,805 uniform random variables. 1063 01:05:25,570 --> 01:05:30,610 You're seeing a good example here of why in probability 1064 01:05:30,610 --> 01:05:35,450 theory we start out with axioms, we derive properties, 1065 01:05:35,450 --> 01:05:39,620 and we don't lock ourselves in completely to some physical 1066 01:05:39,620 --> 01:05:42,030 problem that we're trying to solve. 1067 01:05:42,030 --> 01:05:44,280 Because if we had locked ourselves completely into a 1068 01:05:44,280 --> 01:05:47,320 Poisson process, we wouldn't even be able to look at this. 1069 01:05:47,320 --> 01:05:50,150 We'd say, ah, that's nothing to do with our problem. 1070 01:05:50,150 --> 01:05:51,020 But it does. 1071 01:05:51,020 --> 01:05:52,950 So wait. 1072 01:05:52,950 --> 01:05:56,090 We're going to define s1 as the minimum of 1073 01:05:56,090 --> 01:05:58,470 U1 up to U sub n.
1074 01:05:58,470 --> 01:06:01,770 In other words, these are the order statistics of 1075 01:06:01,770 --> 01:06:04,300 U1 up to U sub n. 1076 01:06:04,300 --> 01:06:08,760 I choose U1 to be anything between zero and t, U2 to be 1077 01:06:08,760 --> 01:06:11,170 anything between zero and t. 1078 01:06:11,170 --> 01:06:15,350 And then I choose s1 to be the smaller of those two and s2 to 1079 01:06:15,350 --> 01:06:16,730 be the larger of those two. 1080 01:06:24,698 --> 01:06:28,070 This is zero to t. 1081 01:06:28,070 --> 01:06:30,770 And here is say, U1. 1082 01:06:35,295 --> 01:06:38,340 And here is U2. 1083 01:06:38,340 --> 01:06:45,650 And then I transfer and call this s1 and this s2. 1084 01:06:45,650 --> 01:06:54,570 And if I have a U3 out here then this becomes s2 and this 1085 01:06:54,570 --> 01:06:56,690 becomes s3. 1086 01:06:56,690 --> 01:07:00,220 So the order statistics are just, you take these 1087 01:07:00,220 --> 01:07:02,145 arrivals and you-- 1088 01:07:02,145 --> 01:07:03,670 well, they don't have to be arrivals. 1089 01:07:03,670 --> 01:07:04,680 They're whatever they are. 1090 01:07:04,680 --> 01:07:06,780 They're just uniform random variables. 1091 01:07:06,780 --> 01:07:09,650 And I just order them. 1092 01:07:09,650 --> 01:07:17,790 So now my question is, the region of volume t to the n, 1093 01:07:17,790 --> 01:07:24,470 namely I have a volume t to the n, for the possible values of 1094 01:07:24,470 --> 01:07:26,195 U1 up to U sub n. 1095 01:07:29,940 --> 01:07:35,620 That volume where the density of Un is non-zero, I can think 1096 01:07:35,620 --> 01:07:39,685 of it as being partitioned into n factorial regions. 1097 01:07:42,370 --> 01:07:46,470 First region is U1 less than U2, and so 1098 01:07:46,470 --> 01:07:48,930 forth up to U sub n. 1099 01:07:48,930 --> 01:07:53,410 The second one is U2 is less than U1, less than U3 1100 01:07:53,410 --> 01:07:54,990 and so forth up.
1101 01:07:54,990 --> 01:08:03,380 And for every permutation of U1 to U sub n, I get a 1102 01:08:03,380 --> 01:08:05,610 different ordering here. 1103 01:08:05,610 --> 01:08:07,880 I don't even care about the ordering anymore. 1104 01:08:07,880 --> 01:08:15,110 All I care about is that the number of permutations of U1 1105 01:08:15,110 --> 01:08:16,520 to U sub n-- 1106 01:08:16,520 --> 01:08:18,890 or the integers one to n, really is 1107 01:08:18,890 --> 01:08:19,840 what I'm talking about. 1108 01:08:19,840 --> 01:08:25,189 Number of permutations of 1 to n is n factorial. 1109 01:08:25,189 --> 01:08:31,720 And for each one of those permutations, I can talk about 1110 01:08:31,720 --> 01:08:36,689 the region in which the first permutation is less than the 1111 01:08:36,689 --> 01:08:39,920 second permutation less than the third and so forth. 1112 01:08:39,920 --> 01:08:41,170 Those are disjoint-- 1113 01:08:43,560 --> 01:08:47,075 they have to be the same volume by symmetry. 1114 01:08:53,310 --> 01:08:56,729 Now how does that symmetry argument work there? 1115 01:08:56,729 --> 01:09:00,439 I have n integers. 1116 01:09:00,439 --> 01:09:03,380 There's no way to tell the difference between them, 1117 01:09:03,380 --> 01:09:07,380 except by giving them names. 1118 01:09:07,380 --> 01:09:14,390 And each of them corresponds to an IID random variable. 1119 01:09:17,279 --> 01:09:25,140 And now, what I'm saying is that the region in which U1 is 1120 01:09:25,140 --> 01:09:31,490 less than U2 has the same area as the region where U2 1121 01:09:31,490 --> 01:09:32,930 is less than U1. 1122 01:09:32,930 --> 01:09:38,029 And the argument is whatever you want to do with U1 and U2 1123 01:09:38,029 --> 01:09:42,979 to find the area of that, I can take your argument and 1124 01:09:42,979 --> 01:09:46,840 every time you've written a U1, I'll substitute U2 for it.
1125 01:09:46,840 --> 01:09:48,740 And every time you've written a U2 I'll 1126 01:09:48,740 --> 01:09:51,240 substitute U1 for it. 1127 01:09:51,240 --> 01:09:53,740 And it's the same argument. 1128 01:09:53,740 --> 01:09:58,900 So that if you find the area using your permutation, I get 1129 01:09:58,900 --> 01:10:02,580 the same area just using your argument again. 1130 01:10:02,580 --> 01:10:07,060 And I can do this for as many terms as I want to. 1131 01:10:07,060 --> 01:10:12,790 OK, so this says that from symmetry each volume is the 1132 01:10:12,790 --> 01:10:16,930 same and thus, each volume is t to the N 1133 01:10:16,930 --> 01:10:19,860 divided by N factorial. 1134 01:10:19,860 --> 01:10:24,410 Now the region where s sub N is non-zero, s sub N are these 1135 01:10:24,410 --> 01:10:25,950 order statistics. 1136 01:10:25,950 --> 01:10:29,140 These are the terms in which U1 is less than U2, and so forth 1137 01:10:29,140 --> 01:10:33,250 up to U sub n, after I reordered them. 1138 01:10:35,810 --> 01:10:43,040 The volume of the region where S1 is less than S2 less than 1139 01:10:43,040 --> 01:10:48,000 Sn less than or equal to t, the volume of that region is 1140 01:10:48,000 --> 01:10:52,110 exactly t to the N divided by N factorial. 1141 01:10:52,110 --> 01:10:55,630 Because it's one over N factorial 1142 01:10:55,630 --> 01:10:58,750 of the entire region. 1143 01:10:58,750 --> 01:11:01,310 Now, you just have to think about that symmetry argument 1144 01:11:01,310 --> 01:11:02,560 for a while. 1145 01:11:04,810 --> 01:11:07,940 The other thing you can do is just integrate, which is what 1146 01:11:07,940 --> 01:11:13,370 we did when we were deriving the Erlang distribution. 1147 01:11:13,370 --> 01:11:14,960 I mean, you just integrate the thing out. 1148 01:11:14,960 --> 01:11:18,250 And you find out that that N factorial 1149 01:11:18,250 --> 01:11:21,160 magically appears there.
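The symmetry argument says each of the n! orderings of U1 through Un is equally likely, so the region where they happen to come out already sorted has probability 1/n!. A quick Monte Carlo sketch of that claim:

```python
import random

def prob_sorted(n, trials, seed=2):
    """Monte Carlo estimate of P(U1 < U2 < ... < Un) for IID uniforms.

    By the symmetry argument, each of the n! orderings is equally
    likely, so the answer should be 1/n!.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]
        if all(u[i] < u[i + 1] for i in range(n - 1)):
            hits += 1
    return hits / trials

est3 = prob_sorted(3, 200000)   # should be near 1/6
```

Equivalently, the ordered region has volume t to the n divided by n factorial, which is where the n! in the conditional density comes from.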
1150 01:11:21,160 --> 01:11:24,000 I think you get a better idea of what's going on here. 1151 01:11:31,540 --> 01:11:36,700 Let me go on to this slide, which will explain a little 1152 01:11:36,700 --> 01:11:40,820 bit what's going on here. 1153 01:11:40,820 --> 01:11:43,740 Suppose I'm looking at S1 and S2. 1154 01:11:43,740 --> 01:11:52,170 The area where S1 is less than S2 is this cross 1155 01:11:52,170 --> 01:11:53,420 hatched area here. 1156 01:11:57,350 --> 01:12:00,110 All the points in here are points where S2 1157 01:12:00,110 --> 01:12:01,830 is bigger than S1. 1158 01:12:01,830 --> 01:12:07,250 As advertised, the region where S2 is bigger than S1 has 1159 01:12:07,250 --> 01:12:14,470 the same area as the region where S2 is less than S1. 1160 01:12:14,470 --> 01:12:18,550 And all I'm saying is this same property happens for N 1161 01:12:18,550 --> 01:12:22,850 equals 3, N equals 4, and so forth. 1162 01:12:22,850 --> 01:12:29,180 And the argument there really is a symmetry argument. 1163 01:12:29,180 --> 01:12:30,720 OK, I don't want to worry about what the 1164 01:12:30,720 --> 01:12:32,670 x's are at this point. 1165 01:12:32,670 --> 01:12:40,940 I'm just showing you that a little tiny area volume there 1166 01:12:40,940 --> 01:12:43,100 has to be part of this triangle. 1167 01:12:43,100 --> 01:12:46,740 And I get that factor of 2 because S2 1168 01:12:46,740 --> 01:12:47,990 is bigger than S1. 1169 01:13:06,540 --> 01:13:11,460 Let me give you an example of using the order statistics, 1170 01:13:11,460 --> 01:13:12,880 which we just talked about. 1171 01:13:15,740 --> 01:13:19,800 Namely, looking at N IID random variables, and then 1172 01:13:19,800 --> 01:13:22,560 choosing S1 to be the smallest, S2 to be the next 1173 01:13:22,560 --> 01:13:24,380 smallest and so forth.
1174 01:13:24,380 --> 01:13:29,560 What we've seen is that this S1 to S sub N is identically 1175 01:13:29,560 --> 01:13:34,480 distributed to the problem of the epochs of arrivals in a 1176 01:13:34,480 --> 01:13:39,420 Poisson process conditional on N arrivals up until time t. 1177 01:13:39,420 --> 01:13:47,110 So we can use either the uniform distribution and 1178 01:13:47,110 --> 01:13:50,740 ordering or we can use Poisson results. 1179 01:13:50,740 --> 01:13:54,840 Either one can give us results about either process. 1180 01:13:54,840 --> 01:13:58,430 Here what I'm going to do is use order statistics to find 1181 01:13:58,430 --> 01:14:04,350 the distribution function of S1 conditional on N arrivals 1182 01:14:04,350 --> 01:14:08,410 up until time t. Conditional on the fact that I found 1,000 1183 01:14:08,410 --> 01:14:11,150 arrivals up until time 1,000. 1184 01:14:11,150 --> 01:14:14,380 I want to find the distribution function of when 1185 01:14:14,380 --> 01:14:16,360 the first arrival occurs. 1186 01:14:16,360 --> 01:14:21,110 Now, if I know that 1,000 arrivals have occurred by time 1187 01:14:21,110 --> 01:14:25,490 1,000, the first arrival is probably going to be pretty 1188 01:14:25,490 --> 01:14:30,100 close to an exponential random variable. 1189 01:14:30,100 --> 01:14:32,550 But I'd like to see that. 1190 01:14:32,550 --> 01:14:36,490 OK, so the probability that the minimum of these U sub 1191 01:14:36,490 --> 01:14:41,190 i's, these uniform random variables, is bigger than S1 1192 01:14:41,190 --> 01:14:44,730 is the product of the probabilities that each one is 1193 01:14:44,730 --> 01:14:45,800 bigger than S1. 1194 01:14:45,800 --> 01:14:49,440 The only way that the minimum can be bigger than S1 is that 1195 01:14:49,440 --> 01:14:53,140 each of them are bigger than S1. 1196 01:14:53,140 --> 01:15:00,380 So this is the product from I equals 1 to N of 1 1197 01:15:00,380 --> 01:15:03,256 minus S1 over t.
1198 01:15:03,256 --> 01:15:06,240 Then we take the product of things that are all the same. 1199 01:15:06,240 --> 01:15:11,940 It's 1 minus S1 over t to the N. And then I go into the 1200 01:15:11,940 --> 01:15:16,100 domain of these arrival epochs. 1201 01:15:16,100 --> 01:15:22,460 And the probability that the first arrival epoch, the 1202 01:15:22,460 --> 01:15:26,490 complementary distribution function of that, is then 1 1203 01:15:26,490 --> 01:15:30,840 minus S1 over t to the N. 1204 01:15:30,840 --> 01:15:35,630 You can also find the expected value of S1 just by 1205 01:15:35,630 --> 01:15:37,220 integrating this. 1206 01:15:37,220 --> 01:15:39,600 Namely, integrate the complementary distribution 1207 01:15:39,600 --> 01:15:42,820 function to get the expected value. 1208 01:15:42,820 --> 01:15:43,570 And what do you get? 1209 01:15:43,570 --> 01:15:46,780 You get t over N plus 1. 1210 01:15:46,780 --> 01:15:52,080 A little surprising, but not as surprising 1211 01:15:52,080 --> 01:15:53,730 as you would think. 1212 01:15:53,730 --> 01:16:00,110 We're looking at an interval from zero to t. 1213 01:16:00,110 --> 01:16:01,975 We have three arrivals there. 1214 01:16:04,840 --> 01:16:09,390 And then we're asking where does the first one occur? 1215 01:16:09,390 --> 01:16:14,870 And the argument is this interval, this interval, this 1216 01:16:14,870 --> 01:16:18,270 interval, and this interval ought to have the same 1217 01:16:18,270 --> 01:16:20,010 distribution. 1218 01:16:20,010 --> 01:16:23,030 I mean, we haven't talked about this interval yet, but 1219 01:16:23,030 --> 01:16:25,600 it's there. 1220 01:16:25,600 --> 01:16:28,420 I mean, the last of these arrivals is not at t, it's 1221 01:16:28,420 --> 01:16:30,250 somewhere before t. 1222 01:16:30,250 --> 01:16:33,420 So I have n plus 1 intervals on the 1223 01:16:33,420 --> 01:16:36,170 sides of these n arrivals.
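The formula P(S1 > s1) = (1 minus s1 over t) to the N, and the mean t over N plus 1 that comes from integrating the complementary distribution function, can both be checked with a short simulation of the minimum of N uniforms. This is a sketch with arbitrary made-up values of N and t.

```python
import random

def min_of_uniforms(n, t, trials, seed=3):
    """Simulate S1 = min(U1, ..., Un) for IID uniforms on [0, t]."""
    rng = random.Random(seed)
    return [min(rng.uniform(0.0, t) for _ in range(n)) for _ in range(trials)]

n, t = 4, 10.0
samples = min_of_uniforms(n, t, 100000)

# E[S1] = t / (n + 1), from integrating the complementary CDF
mean_s1 = sum(samples) / len(samples)

# P(S1 > s0) = (1 - s0/t)**n
s0 = 3.0
tail = sum(x > s0 for x in samples) / len(samples)
```

For n = 4 and t = 10 the mean should come out near 2.0 and the tail probability near 0.7 to the 4th power.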
1224 01:16:36,170 --> 01:16:40,020 And what this is saying is the nice symmetric result that the 1225 01:16:40,020 --> 01:16:46,940 expected place where S1 winds up is at t over n plus 1. 1226 01:16:46,940 --> 01:16:50,440 These intervals, in some sense, at least as far as 1227 01:16:50,440 --> 01:16:53,700 expectation is concerned, are uniform. 1228 01:16:53,700 --> 01:16:59,000 And we wind up with t over N plus 1 as the expected value 1229 01:16:59,000 --> 01:17:01,510 of where the first one is. 1230 01:17:01,510 --> 01:17:02,760 That's nice and clean. 1231 01:17:06,030 --> 01:17:08,540 If you look at this problem and you think in terms of 1232 01:17:08,540 --> 01:17:13,610 little tiny arrivals coming in any place, and, in fact, you 1233 01:17:13,610 --> 01:17:21,310 look at this U1 up to U sub N, one arrival. 1234 01:17:21,310 --> 01:17:23,280 I can think of it as one arrival 1235 01:17:23,280 --> 01:17:26,400 from each of N processes. 1236 01:17:26,400 --> 01:17:30,950 And these are all Poisson processes. 1237 01:17:30,950 --> 01:17:35,050 They all add up to give a joint Poisson process, a 1238 01:17:35,050 --> 01:17:38,670 combined Poisson process, with N arrivals. 1239 01:17:38,670 --> 01:17:44,470 And then, I order the arrivals to correspond to the overall 1240 01:17:44,470 --> 01:17:45,970 sum process. 1241 01:17:45,970 --> 01:17:48,550 And this is the answer I get. 1242 01:17:48,550 --> 01:17:56,250 So the uniform idea is related to adding up lots of little 1243 01:17:56,250 --> 01:17:57,500 tiny processes. 1244 01:18:00,540 --> 01:18:04,680 OK, now the next thing I want to do, and I can see I'm not 1245 01:18:04,680 --> 01:18:08,570 really going to get time to do it, but I'd like to give you 1246 01:18:08,570 --> 01:18:11,790 the results of it. 1247 01:18:11,790 --> 01:18:23,740 If I look at a box, a little cube, of area delta squared in 1248 01:18:23,740 --> 01:18:26,870 the S1, S2 space.
1249 01:18:26,870 --> 01:18:32,580 And I look at what that maps into, in terms of the 1250 01:18:32,580 --> 01:18:38,370 interarrivals x1 and x2, if you just map these points into 1251 01:18:38,370 --> 01:18:43,830 this, you see that this square here is going into a 1252 01:18:43,830 --> 01:18:46,640 parallelepiped. 1253 01:18:46,640 --> 01:18:50,850 And if you know a little bit of linear algebra-- 1254 01:18:50,850 --> 01:18:53,190 well, you don't even need to know any linear algebra to see 1255 01:18:53,190 --> 01:18:57,600 that no matter where you put this little square anywhere 1256 01:18:57,600 --> 01:19:02,210 around here it's always going to map into the same kind of a 1257 01:19:02,210 --> 01:19:03,260 parallelepiped. 1258 01:19:03,260 --> 01:19:08,000 So if I take this whole area and I break it up into a grid, 1259 01:19:08,000 --> 01:19:13,130 this area is going to be broken up into a grid where 1260 01:19:13,130 --> 01:19:16,080 things are going down this way. 1261 01:19:16,080 --> 01:19:18,300 And this way, we're going to have a lot of these little 1262 01:19:18,300 --> 01:19:20,320 tiny parallelepipeds. 1263 01:19:20,320 --> 01:19:23,650 If I have uniform probability density there, I'll have 1264 01:19:23,650 --> 01:19:26,890 uniform probability density there. 1265 01:19:26,890 --> 01:19:29,890 The place where you need some linear algebra, if you're 1266 01:19:29,890 --> 01:19:34,520 dealing with n dimensions instead of 2, is to see that 1267 01:19:34,520 --> 01:19:39,330 the volume of this delta cube here, for this problem here, 1268 01:19:39,330 --> 01:19:42,110 is going to be identical to the volume of 1269 01:19:42,110 --> 01:19:45,440 the delta cube here. 1270 01:19:45,440 --> 01:19:48,960 OK, so when you get all done that, the area in the x1, x2 1271 01:19:48,960 --> 01:19:55,070 space where zero is less than x1 plus x2 is less than t is t 1272 01:19:55,070 --> 01:19:57,020 squared over 2 again.
1273 01:19:57,020 --> 01:20:02,520 And the probability density of these two interarrival 1274 01:20:02,520 --> 01:20:07,530 intervals is again 2 over t squared. 1275 01:20:07,530 --> 01:20:10,870 It's the same as for the arrival epochs. 1276 01:20:10,870 --> 01:20:12,120 I'm going to skip this slide. 1277 01:20:12,120 --> 01:20:15,200 I want to get to the one other thing I wanted to talk about a 1278 01:20:15,200 --> 01:20:16,980 little bit. 1279 01:20:16,980 --> 01:20:18,230 There's a paradox. 1280 01:20:20,110 --> 01:20:26,000 And the mean interarrival time for a Poisson process is one 1281 01:20:26,000 --> 01:20:27,250 over lambda. 1282 01:20:30,370 --> 01:20:34,450 If I come at time t and I start waiting for the next 1283 01:20:34,450 --> 01:20:40,330 arrival, the mean time I have to wait is 1 over lambda. 1284 01:20:40,330 --> 01:20:44,450 If I come in and I start looking back at the last 1285 01:20:44,450 --> 01:20:50,490 arrival, well, this says it's 1 over lambda times 1 minus e 1286 01:20:50,490 --> 01:20:53,060 to the minus lambda t, which is what it is. 1287 01:20:53,060 --> 01:20:54,310 We haven't derived that. 1288 01:20:54,310 --> 01:20:56,840 But we know it's something. 1289 01:20:56,840 --> 01:20:59,660 So, what's going on here? 1290 01:20:59,660 --> 01:21:03,780 If I come in at time t and look at the interval between the last 1291 01:21:03,780 --> 01:21:08,510 arrival and the next arrival, the mean of that interval is 1292 01:21:08,510 --> 01:21:11,650 bigger than the mean interval between arrivals 1293 01:21:11,650 --> 01:21:14,500 in a Poisson process. 1294 01:21:14,500 --> 01:21:16,715 That is a real paradox. 1295 01:21:16,715 --> 01:21:19,770 And how do I explain that? 1296 01:21:19,770 --> 01:21:23,360 It means that if I'm waiting for buses, I'm always unlucky. 1297 01:21:23,360 --> 01:21:25,390 And you're unlucky too and we're all unlucky.
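Both means mentioned above can be checked with a short simulation (an illustrative sketch, not part of the lecture; lambda = 1 and t = 3 are arbitrary choices): generate arrivals past time t, then record the time back to the last arrival and the time forward to the next one.

```python
import math
import random

def age_and_residual(lam, t, trials=200_000, seed=1):
    """Simulate an observer arriving at time t in a rate-lam Poisson process."""
    rng = random.Random(seed)
    age_sum = residual_sum = 0.0
    for _ in range(trials):
        last, s = 0.0, 0.0
        while True:                 # generate arrivals until one passes t
            s += rng.expovariate(lam)
            if s > t:
                break
            last = s
        age_sum += t - last         # time back to the last arrival (or to 0)
        residual_sum += s - t       # time forward to the next arrival
    return age_sum / trials, residual_sum / trials

lam, t = 1.0, 3.0
age, residual = age_and_residual(lam, t)
# residual is close to 1/lam (memorylessness), while
# age is close to (1/lam) * (1 - math.exp(-lam * t))
```

The residual estimate stays near 1/lambda no matter how t is chosen; that is the memorylessness of the exponential at work.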
1298 01:21:28,040 --> 01:21:34,200 Because whenever you arrive, the amount of time between the 1299 01:21:34,200 --> 01:21:39,240 last bus and the next bus is bigger than it ought to be. 1300 01:21:39,240 --> 01:21:41,960 And how do we explain that? 1301 01:21:41,960 --> 01:21:45,320 Well, this sort of gives you a really half-assed 1302 01:21:45,320 --> 01:21:47,080 explanation of it. 1303 01:21:47,080 --> 01:21:48,800 It's not very good. 1304 01:21:48,800 --> 01:21:52,960 When we study renewals, we'll find a good explanation of it. 1305 01:21:57,010 --> 01:22:00,490 First choose a sample path of a Poisson process. 1306 01:22:00,490 --> 01:22:06,140 I mean, start out with a sample path. 1307 01:22:06,140 --> 01:22:09,420 And that has arrivals going along. 1308 01:22:09,420 --> 01:22:12,800 And we want to get rid of zero because funny things happen 1309 01:22:12,800 --> 01:22:14,340 around zero. 1310 01:22:14,340 --> 01:22:19,910 So we'll start at one eon and stop at n eons. 1311 01:22:19,910 --> 01:22:22,670 And we'll look over that interval. 1312 01:22:22,670 --> 01:22:26,470 And then we'll choose t to be a random variable in this 1313 01:22:26,470 --> 01:22:27,970 interval here. 1314 01:22:27,970 --> 01:22:30,860 OK, so, we choose some t after we've looked 1315 01:22:30,860 --> 01:22:32,840 at the sample path. 1316 01:22:32,840 --> 01:22:37,900 And then we say, for this random value of t, how big is 1317 01:22:37,900 --> 01:22:42,370 the interval between the most recent arrival 1318 01:22:42,370 --> 01:22:44,710 and the next arrival? 1319 01:22:44,710 --> 01:22:50,200 And what you see is that since I have these arrivals of a 1320 01:22:50,200 --> 01:22:57,890 Poisson process laid out here, the large intervals take up 1321 01:22:57,890 --> 01:23:00,820 proportionally more area than the little intervals.
1322 01:23:00,820 --> 01:23:14,450 If I have a bunch of little tiny intervals and some big 1323 01:23:14,450 --> 01:23:18,810 intervals and then I choose a random t along here, I'm much 1324 01:23:18,810 --> 01:23:22,870 more likely to choose a t in here, proportionally, than I 1325 01:23:22,870 --> 01:23:26,820 am to choose a t in here. 1326 01:23:26,820 --> 01:23:30,970 This mean time between the arrivals is the average of 1327 01:23:30,970 --> 01:23:36,450 this, this, this, this, and this. 1328 01:23:36,450 --> 01:23:40,660 And what I'm looking at here is I picked some arbitrary 1329 01:23:40,660 --> 01:23:42,320 point in time. 1330 01:23:42,320 --> 01:23:45,910 These arbitrary points in time are likely to lie in very 1331 01:23:45,910 --> 01:23:47,882 large intervals. 1332 01:23:47,882 --> 01:23:48,794 Yes? 1333 01:23:48,794 --> 01:23:51,165 AUDIENCE: Can you please explain again what exactly 1334 01:23:51,165 --> 01:23:52,240 is the paradox? 1335 01:23:52,240 --> 01:23:56,080 PROFESSOR: The paradox is that the mean time between arrivals 1336 01:23:56,080 --> 01:23:58,290 is one over lambda. 1337 01:23:58,290 --> 01:24:03,170 But if I arrive to look for a bus and the buses are Poisson, 1338 01:24:03,170 --> 01:24:09,300 I arrive in an interval whose mean length is larger than 1 1339 01:24:09,300 --> 01:24:10,550 over lambda. 1340 01:24:14,870 --> 01:24:16,390 And that seems strange to me. 1341 01:24:19,650 --> 01:24:22,350 Well, I think I'll stop here. 1342 01:24:22,350 --> 01:24:26,650 And maybe we will spend just a little time sorting 1343 01:24:26,650 --> 01:24:27,900 this out next time.
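The length-biasing argument sketched in the lecture can be made concrete with a simulation (an illustrative sketch, not part of the lecture; lambda = 1 and the horizon length are arbitrary): lay out one long sample path, pick random times, and measure the interarrival interval each one lands in. The mean comes out near 2/lambda rather than 1/lambda, because long intervals occupy proportionally more of the time axis.

```python
import bisect
import random

def mean_covering_interval(lam=1.0, horizon=100_000.0, samples=100_000, seed=2):
    """Mean length of the interarrival interval containing a random time."""
    rng = random.Random(seed)
    arrivals = []
    s = 0.0
    while s < horizon:               # lay out one long sample path
        s += rng.expovariate(lam)
        arrivals.append(s)
    total = 0.0
    for _ in range(samples):
        # pick t uniformly, away from the edge at time 0, then locate
        # the pair of consecutive arrivals that brackets it
        u = rng.uniform(arrivals[0], arrivals[-2])
        i = bisect.bisect_right(arrivals, u)
        total += arrivals[i] - arrivals[i - 1]
    return total / samples

# mean_covering_interval() comes out near 2/lam = 2.0, not 1/lam = 1.0:
# a randomly chosen t is more likely to land in a long interval.
```

For a rate-lambda exponential, the length-biased interval is distributed like the sum of two exponentials, which is where the factor of 2 comes from; renewal theory makes this precise for general interarrival distributions.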