The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So this is the outline of the lecture for today. First we need to talk about the conditional densities for the Poisson process. These are things that we've seen before in previous sections, so I'm just going to give a simple proof for them and some intuition.

If you remember the Poisson process, this was the famous figure that we had. Time starts at zero, and afterwards we had some arrivals. At time t we had N(t) arrivals, which is equal to n, and this is S_n. This is the figure that we had.

So for the first equation, we want to find the conditional distribution of the inter-arrival times conditioned on the last arrival time.
First of all, we want to find the joint distribution of the inter-arrival times and the last arrival, this equation. You know that the inter-arrival times are independent in the Poisson process, so just looking at this, you know these things are independent of each other, so we will have λ^n · e^(-λ(x_1 + ... + x_n)).

And then this last one corresponds to the fact that X_{n+1} = s_{n+1} - (x_1 + ... + x_n). These two events are equivalent to each other, so this is going to correspond to λ · e^(-λ(s_{n+1} - (x_1 + ... + x_n))).

So we just did it the other way, so it's going to be like -- you can just get rid of these terms, and it's going to be that equation. It's very simple; it's just the independence of the inter-arrival times. You look at the last term as the inter-arrival time for the last arrival, then they're independent, the exponential terms cancel out, and you have this equation. So this conditional density will be the joint distribution divided by the distribution of S_{n+1}, which is like these.
So the joint density will be like this. What is the intuition behind this? You've seen this kind of term before: it means we have a uniform distribution over some region, conditioned on something.

Previously, what we had was the conditional distribution of the arrival times conditioned on the last arrival, so f of s_1, ..., s_n conditioned on s_{n+1}. But here we have the distribution of x_1 to x_n conditioned on s_{n+1}. And previously, the constraint that we had was that we should have an ordering over the arrival times. Because of this constraint we had this n factorial, if you remember, from the order statistics that we talked about.

Now, here, what we have is something like this: all the inter-arrival times should be positive, and their summation should be less than t. Because the summation of them up to n + 1 is equal to s_{n+1}, and the last term is positive, so this sum should be less than t. So these two constraints are sort of dual to each other.
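As a numerical sanity check (not part of the lecture), here is a minimal numpy sketch with arbitrary parameter values. It approximates conditioning on S_{n+1} = t by keeping runs whose (n+1)-th arrival lands in a thin window around t. If the conditional law of the inter-arrival times is uniform over the simplex, then S_1 behaves like the minimum of n uniforms on (0, t), with mean t/(n + 1).

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, t, dt = 1.0, 3, 2.0, 0.05   # illustrative values only

# Sample n+1 exponential inter-arrival times per run and keep the runs
# whose (n+1)-th arrival time S_{n+1} lands in [t, t + dt] -- a thin
# window approximating the conditioning event S_{n+1} = t.
X = rng.exponential(1 / lam, size=(400_000, n + 1))
S = X.cumsum(axis=1)
keep = (S[:, n] >= t) & (S[:, n] < t + dt)
S1 = S[keep, 0]

# Under the uniform-on-the-simplex claim, S_1 is the minimum of n
# uniforms on (0, t), so its mean should be close to t/(n + 1) = 0.5.
print(len(S1), S1.mean())
```

The window width dt trades a small bias against the number of accepted samples.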
So because of this constraint, I had this n factorial in the conditional distribution of the arrival times conditioned on the last arrival time. Now I have the conditional distribution of the inter-arrival times conditioned on the last arrival time, and we should have this n factorial here as well.

The other interesting thing is that if I condition instead on N(t), the number of arrivals at time t, the answer is going to be very similar to the previous one. To prove this, it's very simple again; I just wanted to show you.

So what does this mean? These are the distributions of the inter-arrival times, and the last event corresponds to the fact that X_{n+1} is bigger than t minus the summation of the x_i. Because we have an arrival at S_n, and after that there's no arrival, because at time t we have exactly n arrivals. So this corresponds to this event. So again, we'll have something like λ^n · e^(-λ(x_1 + ... + x_n)). This is for the first event, and you see these are independent because of the properties of the Poisson process.
And the last term will be -- by the memoryless property of the Poisson process, the probability that the variable is bigger than something is e^(-λ(t - (x_1 + ... + x_n))).

Yes, these cancel out. Then we'll have this term, this first λ^n; just forget about the n + 1. And the probability that N(t) is equal to n is (λt)^n · e^(-λt) / n!. So these terms cancel out, and we'll have this thing: n! / t^n. Again, we should have the constraint that the summation of x_k, k equal to 1 to n, is less than t.

No matter whether we condition on N(t) or on S_{n+1} -- the last arrival time or the number of arrivals at some time instant -- the conditional distribution is uniform, subject to some constraints. Are there any questions?

If you look at the limit -- again, looking at this figure, you see that this corresponds to s_{n+1}, and s_{n+1} is bigger than t. So if you take the limit, with t_1 going to t, these two distributions are going to be the same. So t_1 corresponds to s_{n+1}.
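The N(t) = n case can be spot-checked the same way (again a hedged sketch, not lecture material, with made-up parameters): simulate exponential inter-arrival times, keep the runs with exactly n arrivals in [0, t], and compare the first arrival time with the uniform-order-statistics prediction E[S_1 | N(t) = n] = t/(n + 1).

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t, n = 2.0, 3.0, 5   # illustrative values only

# 20 columns is comfortably more than the typical arrival count here,
# so counting how many partial sums fall below t gives N(t).
X = rng.exponential(1 / lam, size=(200_000, 20))
S = X.cumsum(axis=1)
counts = (S <= t).sum(axis=1)
first = S[counts == n, 0]   # S_1, conditioned on N(t) = n

# Uniform order-statistics prediction: E[S_1 | N(t)=n] = t/(n+1) = 0.5
print(first.mean())
```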
So basically, what it's saying is that the conditional distribution of the arrival times is uniform, independent of what is going to happen in the future. The knowledge that we have about the future could be knowing what happens in this interval, or knowing that the next arrival is going to happen at some point; conditioned on these things, the arrival times are going to be uniform.

The other fact is that these distributions that I showed you are all symmetric with respect to the inter-arrival times. So no arrival has more priority than any other one; they're not different from each other. If you look at any permutation of them, they're going to be the same.

So now I want to look at the distribution of X_1, or S_1; they are equal to each other. I can easily calculate the probability that X_1 is bigger than some τ, conditioned on, for example, N(t) equal to n. So what is this conditional distribution? Conditioned on N(t) equal to n, I know that in this interval I have n arrivals and n inter-arrival times, and I want to know if X_1, or S_1, is in this interval. S_1 is equal to X_1.
And you know that the arrival times are uniform and symmetric, so you can say that the probability of this thing is like this one: all of the arrival times should be bigger than τ, which gives ((t - τ)/t)^n. And since everything was symmetric for all the inter-arrival times, this is also the complementary distribution function for any X_k. Are there any questions?

So the other thing, which is not in the slides, that I want to tell you is the conditional distribution of S_i given N(t). I know that there are n arrivals in this interval, and I want to see if the i-th arrival is at some point like this. So this event corresponds to the fact that there is one arrival in this little interval, i - 1 arrivals were here before it, and -- oh, no, sorry, n - i arrivals are here after it.
So the probability of having one arrival here corresponds to -- since the arrival times are uniform, I want to have one arrival in this little interval, i - 1 arrivals here, and n - i arrivals here. So it's like a binomial probability distribution: out of the n - 1 remaining arrivals, I want some of them in this interval and some of them here. So the distribution of this thing is

f(τ) = (n/t) · C(n-1, i-1) · (τ/t)^(i-1) · ((t-τ)/t)^(n-i).

Yeah, so just as a check, we know the distribution of the last arrival time -- here, you just cancel these things out. We need to get back to this equation, but it's just the same as this one with i equal to n. Is this OK? Just taking i equal to n, you get this one.

AUDIENCE: Does this equation also apply to i equals [INAUDIBLE]?

PROFESSOR: Yeah, sure. For i equal to 1, we will have (n/t) · ((t - τ)/t)^(n-1). And if you look at this thing, this matches the complementary distribution function from before.
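The uniform-plus-binomial density above can also be spot-checked numerically. It says S_i / t follows a Beta(i, n - i + 1) law, so E[S_i | N(t) = n] = i·t/(n + 1). A hedged sketch (not from the lecture, arbitrary parameters):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, t, n, i = 1.0, 4.0, 6, 2   # illustrative values only

X = rng.exponential(1 / lam, size=(300_000, 25))
S = X.cumsum(axis=1)
rows = (S <= t).sum(axis=1) == n
Si = S[rows, i - 1]   # i-th arrival time, conditioned on N(t) = n

# The lecture's density f(tau) = (n/t) C(n-1,i-1) (tau/t)^(i-1)
# ((t-tau)/t)^(n-i) is the Beta(i, n-i+1) law scaled by t, whose mean
# is i*t/(n+1) = 8/7.
print(Si.mean())
```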
You just integrate it and you get that. This is just a check of the formulation.

Yeah. So are there any other questions? Did everybody get why we have this kind of formulation? It was very simple; it's just the combination of a uniform distribution and a binomial distribution.

AUDIENCE: What is tau?

PROFESSOR: Tau is this thing -- tau is the value of S_i. A change of notation.

AUDIENCE: [INAUDIBLE]?

PROFESSOR: I said that I want one arrival in this little interval, and i - 1 arrivals were here, and n - i arrivals are here. So I defined a Bernoulli process: the probability of falling in this interval is τ over t, and of falling in the other one, (t - τ) over t. And I say that --

AUDIENCE: [INAUDIBLE]?

PROFESSOR: OK, so falling in this interval under the uniform distribution corresponds to a success. I want, among the n - 1 remaining arrivals, i - 1 of them to be successes and the rest of them to be failures. [INAUDIBLE] that one of them should fall here and the rest of them should fall there.

Richard?

AUDIENCE: So that's the [INAUDIBLE]?
PROFESSOR: Yeah, you just get rid of this dτ, that's it.

AUDIENCE: Why does that stay, the n over t?

PROFESSOR: Which one?

AUDIENCE: Can you explain again the first term? The first term there, the n dτ over t?

PROFESSOR: Oh, OK. So I want to have one arrival definitely here, but among the n arrivals, any one of them could be the one that falls here. And then i - 1 of them should fall here, and n - i of them should fall here.

Any other questions? It's beautiful, isn't it?

There's a problem. I studied X_1 to X_n, but what is the distribution of this remaining part? If I condition on N(t), or condition on S_{n+1}, what is the distribution of this one, or this one, or the joint distribution of all of these? So this is what we're going to look at.

First of all, I do the conditioning on S_{n+1}, and I find the distribution of X_1 to X_n conditioned on S_{n+1}. Is there any randomness left in the distribution of X_{n+1}? So this is going to be X_{n+1}.
If I find the distribution of X_1 to X_n conditioned on S_{n+1}, it's very easy to show that X_{n+1} is equal to S_{n+1} minus the sum of X_1 to X_n. So there's no randomness here.

But looking at the figure, and at the distributions, you see that everything is symmetric. So I've found the distribution of X_1 to X_n, and I can find X_{n+1} easily from them. But what if I find the distribution of X_2 to X_{n+1}, conditioned on this thing, and then find X_1 from them? You can say that X_1 is equal to S_{n+1} minus the summation of X_i from i = 2 to n + 1. And then we have the same situation.

So it seems that there's no difference between them, and actually, there's not. Conditioned on S_{n+1}, we have the same distribution for X_1 to X_n as for X_2 to X_{n+1}. But there are only n free parameters, because the (n+1)-th one can be calculated from the other equations.

If you take any n random variables out of these n + 1 random variables, we have the same story. It's going to be uniform over this region, with each of them positive and their sum less than t. So it's symmetric.

Now, the very nice thing is about N(t) equal to n.
This was easy because we had these equations. But what if I call this thing X*_{n+1} and I do the conditioning on N(t)? So I don't know the arrival time of the (n+1)-th arrival, but I do know the number of arrivals at time t. Conditioned on that, what is the distribution, for example, of X*_{n+1}, or of X_1 to X_n?

Again, we can easily say that it's the same story, so it's uniform. So out of X_1 to X_n and X*_{n+1}, you can take n of them and find the joint distribution of those.

OK, so the other way to look at this is that if I find the distribution of X_1 to X_n conditioned on N(t), I will have the fact that X*_{n+1} is equal to t minus S_n, which is t minus the summation of the X_i. Can everyone see this? Hope so. Oh, OK.

So again, if I find the distribution of X_1 to X_n, I can find X*_{n+1} deterministically.
Just as a sanity check, you can find f of X*_{n+1} conditioned on N(t) equal to n. Looking at this formulation and just replacing everything, we can find this thing, and you can see that it is similar to this one again. So this is the distribution for -- well, this is the distribution for X_1. You can find the density, and this is going to be the density. So what I'm saying is that X_1 has the same marginal distribution as X*_{n+1}.

You can get it with this kind of reasoning, saying that the marginal distributions should be the same by symmetry, or by looking at this kind of formulation. When you have this distribution, you can find the distribution of the summation of X_i, i equal to 1 to n. And then -- t is determined, so you can just replace everything and find it. And you can see that these two are the same.

AUDIENCE: [INAUDIBLE] the same distribution as that one?

PROFESSOR: Yeah. Conditioned on N(t) [INAUDIBLE].
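The symmetry claim just made, that X_1 and the leftover piece X*_{n+1} = t - S_n have the same marginal given N(t) = n, can be checked with the same style of simulation; both means should land near t/(n + 1). Again a hedged sketch with made-up parameters, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, n = 1.5, 2.0, 4   # illustrative values only

X = rng.exponential(1 / lam, size=(300_000, 20))
S = X.cumsum(axis=1)
rows = (S <= t).sum(axis=1) == n
x1 = S[rows, 0]               # X_1 = S_1
x_star = t - S[rows, n - 1]   # X*_{n+1} = t - S_n, the leftover piece

# By the symmetry of the conditional uniform distribution, both should
# have mean t/(n + 1) = 0.4, and more generally the same marginal law.
print(x1.mean(), x_star.mean())
```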
So everything is symmetric and uniform. But you can only choose n free parameters, because the (n+1)-th is a combination of the others. Right?

So this is the end of the chapter [INAUDIBLE] for the Poisson process. Are there any problems or questions about it?

So let's start Markov chains. In this chapter, we only talk about finite-state Markov chains. Markov chains are processes that change at integer time instants. It's like we're quantizing time, and any change can happen only at an integer time. It's different in this way from Poisson processes, because the Poisson process is defined over continuous time, but Markov chains are only defined at integer times.

And a finite-state Markov chain is a Markov chain whose states can only be in a finite set, so we usually call it 1 to M, but you can name it any way you like.

What we are looking for in finite-state Markov chains is the probability distribution of the next state conditioned on whatever history I have. I know that at integer times there can be some change. So what is this change?
We model it with the probability distribution of this change given the history. So we can model any discrete integer-time process this way.

Now, the nice thing about Markov chains, or homogeneous Markov chains, is that this distribution is equal to this: P(X_{n+1} = j | X_n = i, ..., X_0 = m) = P(X_{n+1} = j | X_n = i) = P_ij. You see i and j: i is the previous state and j is the state that I want to go to. So the distribution of the next state conditioned on all the history only depends on the last state. Given the previous state, the new state is independent of all the earlier things that might have happened.

There's a very nice way to show it. In general, we say that X_{n+1} is conditionally independent of X_{n-1}, ..., X_0 given X_n. And these things are true for all the possible states. So whatever the history of the earlier process, I only care about the state that I was in at the previous time, and this is going to tell me the distribution of the next state.

So the two important things that you can see in this formulation are: first of all, this kind of independence is true for all i, j, k, m, and so on.
The other thing is that it's not time-dependent. So if I'm in state i, the probability distribution of my next state is going to be the same if the same thing happens in 100 years. It's not time-dependent, and that's why we call it a homogeneous Markov chain. We can have non-homogeneous Markov chains, but there are not many nice results about them; I mean, we cannot find many good things.

And all the information that we need to characterize the Markov chain is this kind of distribution. We call these the transition probabilities. And something else, which is called the initial probabilities, tells me the probability distribution of the states at time instant 0.

So knowing these, you can find the probability distribution of X_1: the probability of X_1 is the probability of X_1 conditioned on X_0, summed against the distribution of X_0 [INAUDIBLE]. So knowing this thing and these transition probabilities, I can find the probability distribution of X_1. Well, very easily, the probability of X_n is equal to -- sorry -- you can do this thing iteratively.
And just knowing the transition probabilities and the initial probabilities, you can find the probability distribution of the states at any time instant, forever. Do you see it's very easy? You just do this thing iteratively.

So I talked about the independence and the initial distribution. What we do is characterize the Markov chain with this set of transition probabilities and the initial state. And you see that in these transition probabilities we have, well, M times (M - 1) free parameters, because the summation of each row should be equal to 1. So for each current state, I have a distribution over the next state.

I'm going to talk about it in the matrix form. And what we do in practice, usually, is assume the initial state to be a fixed state. So we usually just define a state, call it the initial state -- usually we call it x_0. It's what you're usually going to see if you want to study the behavior of Markov chains.

So these are the two ways that we can visualize the transition probabilities. The matrix form is very easy.
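The iterative computation just described is repeated multiplication of a row vector by the transition matrix: p_{k+1} = p_k P. A minimal sketch with a hypothetical 3-state chain (the matrix entries are made up purely for illustration):

```python
import numpy as np

# Hypothetical transition matrix: row i is the conditional distribution
# of the next state given that the current state is i.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
p = np.array([1.0, 0.0, 0.0])   # fixed initial state, as in the lecture

# Iterating p_{k+1} = p_k P gives the state distribution at any time.
for _ in range(50):
    p = p @ P
print(p)
```

After many steps the vector settles to a distribution satisfying p = pP; results of this kind are what the later algebra on the matrix form delivers.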
You just look at the M probability distributions that you have. With a Markov chain with M states, you will have M distributions, each of them corresponding to the conditional distribution of the next state given the present state equal to i. And i can be anything.

So this was the thing I was saying about the number of free parameters: I have M distributions, and each distribution has M - 1 free parameters, because the summation of each should be equal to 1. So M(M - 1) is the number of free parameters that I have.

So is there any problem with the matrix form? I'm assuming that you've all seen this sometime before. Fine? The nice thing about the matrix form is that you can do many nice things with it. You can look at the notes, and we will see these kinds of properties later. But you can imagine that just looking at the matrix form and doing some algebra over it, we can get very nice results about Markov chains.
448 00:31:35,790 --> 00:31:40,950 The other representation of the Markov chains is using a 449 00:31:40,950 --> 00:31:45,250 graph, a directed graph in which each node corresponds to 450 00:31:45,250 --> 00:31:49,440 a state and each arc, or each edge, corresponds to a 451 00:31:49,440 --> 00:31:51,450 transition probability. 452 00:31:51,450 --> 00:31:57,280 And the very important thing about graphical model is that 453 00:31:57,280 --> 00:32:01,800 there's a very clear difference between 0 454 00:32:01,800 --> 00:32:04,660 transition probabilities and non-0 transition 455 00:32:04,660 --> 00:32:05,850 probabilities. 456 00:32:05,850 --> 00:32:09,180 So if there's any possibility of going from one state to 457 00:32:09,180 --> 00:32:14,580 another state, we have an edge or an arc here. 458 00:32:14,580 --> 00:32:17,850 So probability of 10 to the power of minus 5 is 459 00:32:17,850 --> 00:32:19,240 different from 0. 460 00:32:19,240 --> 00:32:21,900 We have an arc in one case and no arc in another case, 461 00:32:21,900 --> 00:32:24,210 because there's a chance of going from this state to the 462 00:32:24,210 --> 00:32:25,280 next state. 463 00:32:25,280 --> 00:32:27,790 And even then, the probability is 10 to the power of minus 5. 464 00:32:34,560 --> 00:32:38,710 And OK, so we can do a lot of inference by just looking at 465 00:32:38,710 --> 00:32:40,330 the graphical model. 466 00:32:40,330 --> 00:32:43,580 And it's easier to see some properties of the Markov chain 467 00:32:43,580 --> 00:32:45,060 by looking at the graphical models. 468 00:32:45,060 --> 00:32:49,350 We will see these properties that we can find by looking at 469 00:32:49,350 --> 00:32:52,570 the graphical model. 470 00:32:52,570 --> 00:32:59,180 So this is not really 471 00:32:59,180 --> 00:33:01,830 classification of states, actually. 472 00:33:01,830 --> 00:33:04,150 So there are some definitions, very 473 00:33:04,150 --> 00:33:06,170 intuitive definitions here. 
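The graph view just described can be sketched the same way (again with a made-up chain, not the one on the slide): draw a directed edge i -> j exactly when p_ij is positive, no matter how small.

```python
# Hypothetical 3-state chain; a probability of 1e-5 still produces an edge,
# exactly as emphasized above: only zero vs non-zero matters for the graph.
P = [
    [0.0, 1.0, 0.0],
    [0.00001, 0.0, 0.99999],
    [0.0, 1.0, 0.0],
]
edges = [(i, j) for i, row in enumerate(P)
         for j, p in enumerate(row) if p > 0]
print(edges)  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```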
474 00:33:06,170 --> 00:33:15,190 So there's something called a walk, which says that this is 475 00:33:15,190 --> 00:33:19,940 an ordered string of the nodes, where the probability 476 00:33:19,940 --> 00:33:23,940 of going from each node to the next one is non-0. 477 00:33:23,940 --> 00:33:27,340 So for example, looking at this walk, you can have the 478 00:33:27,340 --> 00:33:32,240 probability of going from 4 to 4 is positive, 4 to 1, 1 to 2, 479 00:33:32,240 --> 00:33:34,760 2 to 3, and 3 to 2. 480 00:33:34,760 --> 00:33:38,790 So there are no constraints in the definition 481 00:33:38,790 --> 00:33:39,330 of the walk. 482 00:33:39,330 --> 00:33:42,020 There should just be positive probabilities in going from 483 00:33:42,020 --> 00:33:43,870 one state to the next state. 484 00:33:43,870 --> 00:33:45,200 So we can have repetition. 485 00:33:45,200 --> 00:33:47,230 We can have whatever we like. 486 00:33:50,260 --> 00:33:52,520 And the number of-- well, we can find the number 487 00:33:52,520 --> 00:33:54,250 of states for sure. 488 00:33:54,250 --> 00:33:58,790 So what is the maximum number of steps in a walk? 489 00:33:58,790 --> 00:34:00,550 Or minimum number? 490 00:34:00,550 --> 00:34:03,612 Minimum number of states is two-- 491 00:34:03,612 --> 00:34:04,760 [? one. ?] 492 00:34:04,760 --> 00:34:06,355 So what is the maximum number of steps? 493 00:34:12,935 --> 00:34:16,580 Well, there's no constraint, so it can be anything. 494 00:34:16,580 --> 00:34:19,930 So the next thing that we look at is a path. 495 00:34:19,930 --> 00:34:24,489 A path is a walk where there are no repeated nodes. 496 00:34:24,489 --> 00:34:28,659 So I never go through one node twice. 497 00:34:28,659 --> 00:34:31,780 So for example, I have this path, 4, 1, 2, 3. 498 00:34:31,780 --> 00:34:34,770 We can see here, 4, 1, 2, 3. 499 00:34:34,770 --> 00:34:38,210 And now we can say, what is the maximum number 500 00:34:38,210 --> 00:34:39,469 of steps in a path?
501 00:34:43,453 --> 00:34:44,449 AUDIENCE: [INAUDIBLE]. 502 00:34:44,449 --> 00:34:46,616 PROFESSOR: Steps, not states. 503 00:34:46,616 --> 00:34:47,429 AUDIENCE: [INAUDIBLE]. 504 00:34:47,429 --> 00:34:50,190 PROFESSOR: n minus 1, yeah. 505 00:34:50,190 --> 00:34:52,330 Because you cannot go through one state twice. 506 00:34:52,330 --> 00:34:57,030 So you have a maximum number of steps in n minus 1. 507 00:34:57,030 --> 00:35:02,230 And the cycle is a walk in which the first and last node 508 00:35:02,230 --> 00:35:03,540 is repeated. 509 00:35:03,540 --> 00:35:07,280 So first of all, there's not a really great difference 510 00:35:07,280 --> 00:35:09,590 between these two cycles, so it doesn't 511 00:35:09,590 --> 00:35:11,580 matter where I start. 512 00:35:11,580 --> 00:35:13,550 And we don't care about the repetition and 513 00:35:13,550 --> 00:35:14,800 definition of cycle. 514 00:35:22,572 --> 00:35:23,510 Oh, OK. 515 00:35:23,510 --> 00:35:24,330 Yeah, sorry. 516 00:35:24,330 --> 00:35:25,410 No node is repeated. 517 00:35:25,410 --> 00:35:27,560 So sorry, something was wrong. 518 00:35:27,560 --> 00:35:30,700 Yeah, we shouldn't have any repetition except for the 519 00:35:30,700 --> 00:35:31,660 first and last nodes. 520 00:35:31,660 --> 00:35:33,900 So the first and last node should be the same, but there 521 00:35:33,900 --> 00:35:35,480 shouldn't be any repetition. 522 00:35:35,480 --> 00:35:38,005 So what is the maximum number of steps in this case? 523 00:35:41,872 --> 00:35:42,870 AUDIENCE: [INAUDIBLE]. 524 00:35:42,870 --> 00:35:47,070 PROFESSOR: n, yeah, because we have an additional step. 525 00:35:47,070 --> 00:35:49,070 Yeah? 526 00:35:49,070 --> 00:35:51,320 AUDIENCE: You said in a path, the maximum number of 527 00:35:51,320 --> 00:35:52,570 steps is n minus 1? 528 00:35:52,570 --> 00:35:53,570 PROFESSOR: Yeah. 
529 00:35:53,570 --> 00:35:57,070 AUDIENCE: I mean, if you have n equals, like, 6, couldn't 530 00:35:57,070 --> 00:35:59,350 there be 1, 2, 3, 4, 5, 6? 531 00:35:59,350 --> 00:36:02,830 PROFESSOR: No, it's the steps, not the states. 532 00:36:02,830 --> 00:36:08,270 So if I have a path from 1 to 2, and this is the path. 533 00:36:08,270 --> 00:36:12,360 So there's one step. 534 00:36:12,360 --> 00:36:14,740 Just whenever you're confused, make a simple model. 535 00:36:14,740 --> 00:36:18,770 It's going to make everything very, very easy. 536 00:36:18,770 --> 00:36:22,010 Any other questions? 537 00:36:22,010 --> 00:36:24,060 So this is another definition. 538 00:36:24,060 --> 00:36:27,930 We say that a node j is accessible from i if there is 539 00:36:27,930 --> 00:36:30,840 a walk from i to j. 540 00:36:30,840 --> 00:36:37,350 And by just looking at the graphical model, you can 541 00:36:37,350 --> 00:36:41,540 verify the existence of this walk very easily. 542 00:36:41,540 --> 00:36:44,030 But the nice thing, again, about this thing is what I 543 00:36:44,030 --> 00:36:46,720 emphasized in the graphical model. 544 00:36:46,720 --> 00:36:51,690 It tells you whether there's any positive probability of going 545 00:36:51,690 --> 00:36:53,910 from i to j. 546 00:36:53,910 --> 00:37:01,420 So if j is accessible from i and you start at node i, 547 00:37:01,420 --> 00:37:04,770 there's a positive probability that you will end in node j 548 00:37:04,770 --> 00:37:07,290 sometime in the future. 549 00:37:07,290 --> 00:37:13,550 There's nothing said about the number of steps needed to get 550 00:37:13,550 --> 00:37:17,010 there, or the probability of that, except that this 551 00:37:17,010 --> 00:37:19,720 probability is non-0. 552 00:37:19,720 --> 00:37:20,970 It's positive.
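Accessibility as just defined is plain graph reachability, so it can be checked with a breadth-first search over the positive-probability edges. A minimal sketch: the 4-state chain below is hypothetical, with state 3 absorbing in the same way state 5 is in the lecture's figure.

```python
from collections import deque

def accessible(P, i, j):
    """True if j is accessible from i: some walk of one or more steps
    from i to j has positive probability."""
    seen, queue = set(), deque([i])
    while queue:
        u = queue.popleft()
        for v, p in enumerate(P[u]):
            if p > 0 and v not in seen:
                if v == j:          # found a positive-probability walk
                    return True
                seen.add(v)
                queue.append(v)
    return False

# Hypothetical chain: state 3 is absorbing, so nothing is accessible from it
# except itself.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(accessible(P, 0, 2))  # True: the walk 0 -> 1 -> 2
print(accessible(P, 3, 0))  # False: once in 3, you stay there forever
```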
553 00:37:24,110 --> 00:37:28,750 So for example, if there's a state like k, that we have p 554 00:37:28,750 --> 00:37:33,250 of ik is positive and p of kj is positive, then we will have 555 00:37:33,250 --> 00:37:36,320 p of ij in two steps is positive. 556 00:37:36,320 --> 00:37:38,980 So we have this kind of notation. 557 00:37:38,980 --> 00:37:40,230 I don't have a pointer. 558 00:37:42,810 --> 00:37:43,510 Weird. 559 00:37:43,510 --> 00:37:50,060 So yeah, we have this kind of notation. pijn means 560 00:37:50,060 --> 00:37:55,600 the probability of going from state i to state j in n steps. 561 00:37:55,600 --> 00:37:59,990 And this is exactly like n steps. 562 00:37:59,990 --> 00:38:02,770 Actually, this value could be different for 563 00:38:02,770 --> 00:38:05,910 different values of n. 564 00:38:05,910 --> 00:38:12,860 For example, if we have a Markov chain like this, p of 565 00:38:12,860 --> 00:38:22,320 1 3 in one step is equal to 0, but p of 1 3 in two steps is non-0. 566 00:38:22,320 --> 00:38:31,680 So node j is accessible from i if pijn is positive for some n. 567 00:38:31,680 --> 00:38:37,610 So we say that j is not accessible from i if this is 0 568 00:38:37,610 --> 00:38:40,680 for all possible n's. 569 00:38:40,680 --> 00:38:42,434 Mercedes? 570 00:38:42,434 --> 00:38:45,236 AUDIENCE: Are those actually greater than or equal to? 571 00:38:45,236 --> 00:38:46,210 PROFESSOR: No, greater. 572 00:38:46,210 --> 00:38:52,990 They are always greater than or equal to 0. 573 00:38:52,990 --> 00:38:56,510 Probability is always non-negative. 574 00:38:56,510 --> 00:39:00,700 So what we want is a positive probability, meaning that 575 00:39:00,700 --> 00:39:02,930 there's a chance that I will get there. 576 00:39:02,930 --> 00:39:04,955 I don't care how small this chance is, but it 577 00:39:04,955 --> 00:39:07,366 shouldn't be 0. 578 00:39:07,366 --> 00:39:08,616 AUDIENCE: [INAUDIBLE]. 579 00:39:12,688 --> 00:39:18,060 So p to ij, though, couldn't it be equal to pik, [? pkj? ?]
580 00:39:18,060 --> 00:39:19,440 PROFESSOR: Yeah, exactly. 581 00:39:19,440 --> 00:39:26,510 So in this case, p of 1 3 in two steps is equal to p12 times p23. 582 00:39:26,510 --> 00:39:29,480 AUDIENCE: So I guess I'm asking if p2ij should be 583 00:39:29,480 --> 00:39:33,450 greater than or equal to pik pkj. 584 00:39:33,450 --> 00:39:35,350 PROFESSOR: Oh, OK. 585 00:39:35,350 --> 00:39:36,050 No, actually. 586 00:39:36,050 --> 00:39:36,990 You know why? 587 00:39:36,990 --> 00:39:40,870 Because there can exist some other state like here. 588 00:39:40,870 --> 00:39:41,250 AUDIENCE: Right. 589 00:39:41,250 --> 00:39:43,740 But if that doesn't exist and there's only one path. 590 00:39:43,740 --> 00:39:44,360 PROFESSOR: Yeah, sure. 591 00:39:44,360 --> 00:39:45,690 But there's no guarantee. 592 00:39:45,690 --> 00:39:50,150 What I'm saying is that Pik is positive, Pkj is positive. 593 00:39:50,150 --> 00:39:53,710 So this thing is positive, but it can be bigger. 594 00:39:53,710 --> 00:39:55,510 In this case, it can be bigger. 595 00:39:55,510 --> 00:39:58,840 I don't care about the quantity, about the amount of 596 00:39:58,840 --> 00:39:59,350 the probability. 597 00:39:59,350 --> 00:40:03,430 I care about its positiveness, that it's greater than 0. 598 00:40:03,430 --> 00:40:05,920 I just want it to be non-0. 599 00:40:11,810 --> 00:40:18,540 The other thing is that when we look at pijn, I don't care 600 00:40:18,540 --> 00:40:22,090 what kind of walk I have between i and j. 601 00:40:22,090 --> 00:40:25,040 It's a walk, so it can have repetition, it can have 602 00:40:25,040 --> 00:40:27,480 cycles, or it can have anything. 603 00:40:27,480 --> 00:40:28,560 I don't care. 604 00:40:28,560 --> 00:40:34,510 So if you really want to calculate pij for n equal to 605 00:40:34,510 --> 00:40:34,800 [INAUDIBLE] 606 00:40:34,800 --> 00:40:38,940 1,000, you should really find all possible walks from i to j 607 00:40:38,940 --> 00:40:43,495 and add the probabilities to find this value.
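Summing the probabilities of all n-step walks from i to j is exactly what the n-th matrix power of P computes, so pijn can be read off as an entry of P^n. A sketch with a hypothetical 3-state chain (states renumbered from 0) mirroring the earlier example where the one-step probability is 0 but the two-step probability is not:

```python
import numpy as np

# Hypothetical chain: no direct edge from state 0 to state 2.
P = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])
P2 = np.linalg.matrix_power(P, 2)  # entry (i, j) of P^n is p_ij(n)
print(P[0, 2])   # 0.0 : no one-step walk from state 0 to state 2
print(P2[0, 2])  # 0.5 : the two-step walk 0 -> 1 -> 2
```

The matrix multiplication is doing the walk-counting automatically: each entry of P^n is the sum, over all intermediate state sequences, of the products of one-step probabilities.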
608 00:40:43,495 --> 00:40:45,205 And all the walks with n steps. 609 00:40:48,700 --> 00:40:50,950 Not here. 610 00:40:50,950 --> 00:40:55,100 Let's look at some examples. 611 00:40:55,100 --> 00:40:59,910 So is node 3 accessible from node 1? 612 00:41:03,870 --> 00:41:08,840 So you see that there's a walk like this, 1, 2, 3. 613 00:41:08,840 --> 00:41:11,270 So there's a positive probability of going to state 614 00:41:11,270 --> 00:41:13,230 3 from node 1. 615 00:41:13,230 --> 00:41:18,380 So node 3 is accessible from 1. 616 00:41:18,380 --> 00:41:20,880 But if you really want to calculate the probability, you 617 00:41:20,880 --> 00:41:24,850 should also look at the fact that we can have cycles. 618 00:41:24,850 --> 00:41:27,110 So 1, 2, 3 is a walk. 619 00:41:27,110 --> 00:41:29,550 But 1, 1, 2, 3 is also a walk. 620 00:41:29,550 --> 00:41:34,070 1, 2, 3, 2, 3 is also a walk. 621 00:41:34,070 --> 00:41:34,370 You see? 622 00:41:34,370 --> 00:41:37,420 So you have to count all these things to find the probability 623 00:41:37,420 --> 00:41:41,780 p13n for any n. 624 00:41:41,780 --> 00:41:46,370 What about state 5? 625 00:41:46,370 --> 00:41:49,990 So is node 3 accessible from node 5? 626 00:41:52,594 --> 00:41:56,350 You see that it's not. 627 00:41:56,350 --> 00:41:59,560 Actually, if you go to state 5, you never go out. 628 00:41:59,560 --> 00:42:01,910 With [INAUDIBLE] 629 00:42:01,910 --> 00:42:03,930 probability of 1, you stay there forever. 630 00:42:03,930 --> 00:42:10,990 So actually, no state except state 5 is accessible from 5. 631 00:42:15,570 --> 00:42:20,965 So is node 2 accessible from itself? 632 00:42:23,590 --> 00:42:28,760 So accessible means that I should have a walk from 2 to 2 633 00:42:28,760 --> 00:42:31,400 in some number of steps. 634 00:42:31,400 --> 00:42:35,870 So you can see that we can have 2, 3, 2, 635 00:42:35,870 --> 00:42:36,740 or many other walks. 636 00:42:36,740 --> 00:42:39,900 So it's accessible.
637 00:42:39,900 --> 00:42:43,435 But as you see, node 6 is not accessible from itself. 638 00:42:45,980 --> 00:42:48,420 If you are in node 6, you always go out. 639 00:42:53,460 --> 00:42:55,990 So it's not accessible. 640 00:42:55,990 --> 00:42:58,680 So let's go back to these definitions. 641 00:43:01,450 --> 00:43:04,050 Yeah, this is what I said, and I'm emphasizing again. 642 00:43:04,050 --> 00:43:09,070 If you want to say that j is not accessible from i, I 643 00:43:09,070 --> 00:43:13,770 should have pijn equal to 0 for all n. 644 00:43:18,970 --> 00:43:24,010 The other thing is that, if there's a walk from i to j and 645 00:43:24,010 --> 00:43:30,100 from j to k, we can prove easily that there's a walk 646 00:43:30,100 --> 00:43:32,440 from i to k. 647 00:43:32,440 --> 00:43:46,520 So having a walk from i to j means that for some m, this 648 00:43:46,520 --> 00:43:47,880 thing is positive. 649 00:43:47,880 --> 00:43:50,140 So this is i to j. 650 00:43:50,140 --> 00:43:54,230 From j to k, this means that p of jk, for 651 00:43:54,230 --> 00:43:56,960 some n, is positive. 652 00:43:56,960 --> 00:44:03,890 So looking at i to k, I can say that p of ik, in m plus 653 00:44:03,890 --> 00:44:06,068 n steps, is greater than or equal to the product of these two. 654 00:44:09,820 --> 00:44:14,260 And I have this inequality here because of the reason that I 655 00:44:14,260 --> 00:44:16,600 explained right now, because there might be other walks 656 00:44:16,600 --> 00:44:24,190 from i to k, besides the walks going through j, or besides the 657 00:44:24,190 --> 00:44:30,010 walks that take m steps to get to j and then n steps to get to k. 658 00:44:32,680 --> 00:44:37,120 So we know that, well, we can concatenate the walks from-- 659 00:44:37,120 --> 00:44:39,660 so if there's a walk from i to j and j to k, we have a walk 660 00:44:39,660 --> 00:44:40,910 from i to k.
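The concatenation argument can be checked numerically: the (m+n)-step walks from i to k that reach j after exactly m steps are only a subset of all (m+n)-step walks, which gives the inequality p_ik(m+n) >= p_ij(m) * p_jk(n). A sketch with a randomly generated hypothetical chain (the sizes and indices are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((4, 4))
P /= P.sum(axis=1, keepdims=True)   # normalize each row into a distribution

m, n, i, j, k = 2, 3, 0, 1, 2       # arbitrary step counts and states
Pm = np.linalg.matrix_power(P, m)
Pn = np.linalg.matrix_power(P, n)
Pmn = np.linalg.matrix_power(P, m + n)
# Walks passing through j contribute one term of the Chapman-Kolmogorov sum,
# so the full (m+n)-step probability can only be at least that big.
assert Pmn[i, k] >= Pm[i, j] * Pn[j, k]
```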
661 00:44:56,260 --> 00:45:00,710 So we say that states i and j communicate if there's a walk 662 00:45:00,710 --> 00:45:04,050 from i to j and from j to i. 663 00:45:04,050 --> 00:45:05,880 So you can go back and forth. 664 00:45:05,880 --> 00:45:10,650 So it means that there's a cycle from i 665 00:45:10,650 --> 00:45:12,460 to i or from j to j. 666 00:45:12,460 --> 00:45:13,710 This is the implication. 667 00:45:17,260 --> 00:45:20,650 So it's, again, very simple to prove that if i communicates 668 00:45:20,650 --> 00:45:25,880 with j and j communicates with k, then i communicates with k. 669 00:45:25,880 --> 00:45:28,200 In order to prove that, I should assume that i 670 00:45:28,200 --> 00:45:34,810 communicates with j, I need to prove that there is a walk 671 00:45:34,810 --> 00:45:39,480 from i to k, and I need to prove that there is a walk 672 00:45:39,480 --> 00:45:41,610 from k to i. 673 00:45:41,610 --> 00:45:46,700 So this means that i communicates with k. 674 00:45:46,700 --> 00:45:52,050 So these two things can be proved easily from the 675 00:45:52,050 --> 00:45:54,710 concatenation of the walks and the fact that i and j 676 00:45:54,710 --> 00:45:56,040 communicate and j and k communicate. 677 00:46:00,240 --> 00:46:04,050 Now, what I can define is something 678 00:46:04,050 --> 00:46:07,660 called a class of states. 679 00:46:07,660 --> 00:46:18,160 So a class is defined as a non-empty set of states, where 680 00:46:18,160 --> 00:46:21,440 all the pairs of states in a class communicate with each 681 00:46:21,440 --> 00:46:25,290 other, and none of them communicates with any other 682 00:46:25,290 --> 00:46:28,060 state in the Markov chain. 683 00:46:28,060 --> 00:46:31,070 So this is the definition of a class. 684 00:46:31,070 --> 00:46:37,900 So I just group the states that communicate with each 685 00:46:37,900 --> 00:46:40,830 other and just get rid of all those that do not communicate with the 686 00:46:40,830 --> 00:46:42,080 [INAUDIBLE].
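The classes can be computed mechanically: take the transitive closure of the positive-probability edge relation, group states that can reach each other both ways, and let each leftover state form a class by itself. A minimal sketch, with a made-up 5-state chain (not the one in the lecture's figure):

```python
def communicating_classes(P):
    """Partition the states into classes: i and j share a class iff they
    communicate (walks both ways); a state communicating with nobody,
    even itself, still forms a singleton class so the result is a partition."""
    n = len(P)
    reach = [[P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                       # Warshall-style transitive closure
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    assigned, out = set(), []
    for i in range(n):
        if i in assigned:
            continue
        c = {i} | {j for j in range(n) if reach[i][j] and reach[j][i]}
        assigned |= c
        out.append(sorted(c))
    return out

# Hypothetical chain: 0 can reach 1 but nothing comes back, so {0} is alone;
# 1 and 2 communicate; 3 and 4 communicate.
P = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
print(communicating_classes(P))  # [[0], [1, 2], [3, 4]]
```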
687 00:46:46,110 --> 00:46:55,180 So for defining a class or for finding a class, or for naming 688 00:46:55,180 --> 00:46:59,810 a class, we can have a representative state. 689 00:46:59,810 --> 00:47:05,320 So I want to find all the states that communicate with 690 00:47:05,320 --> 00:47:07,490 each other in a class. 691 00:47:07,490 --> 00:47:11,280 So I can just pick one of the states in this class and find 692 00:47:11,280 --> 00:47:13,540 all the states that communicate with this single 693 00:47:13,540 --> 00:47:19,410 state, because if two states communicate with one state, 694 00:47:19,410 --> 00:47:23,520 then these two states communicate with each other. 695 00:47:23,520 --> 00:47:32,210 And if there's a state that doesn't communicate with me, 696 00:47:32,210 --> 00:47:35,490 it doesn't communicate with anybody else whom I'm 697 00:47:35,490 --> 00:47:37,740 communicating with. 698 00:47:37,740 --> 00:47:41,860 I'm going to prove it now, just in a few moments. 699 00:47:41,860 --> 00:47:45,530 Just, I want to look at this figure again. 700 00:47:45,530 --> 00:47:53,080 So first I take state 2 and find a class that has this 701 00:47:53,080 --> 00:47:54,790 state in itself. 702 00:47:54,790 --> 00:48:00,170 So you see that in this class, we have state 2 on 3, because 703 00:48:00,170 --> 00:48:04,110 there's only a state 3 that communicates with 2. 704 00:48:04,110 --> 00:48:08,700 And correspondingly, we can have class 705 00:48:08,700 --> 00:48:11,200 having state 4 and 5. 706 00:48:11,200 --> 00:48:14,440 And you see that state 1 is communicating with itself, so 707 00:48:14,440 --> 00:48:15,410 it's a class by itself. 708 00:48:15,410 --> 00:48:18,120 But it doesn't communicate with anyone else. 709 00:48:18,120 --> 00:48:20,530 So we have this class also. 710 00:48:20,530 --> 00:48:26,070 So next question, why do we call C4 a class? 711 00:48:31,070 --> 00:48:32,320 So it doesn't communicate with itself. 
712 00:48:35,620 --> 00:48:39,810 If you're in state 6, you go out of it with probability of 713 00:48:39,810 --> 00:48:41,100 1, eventually. 714 00:48:41,100 --> 00:48:42,465 So why do we call it a class? 715 00:48:49,200 --> 00:48:52,100 Actually, so this is the definition of the classes. 716 00:48:52,100 --> 00:48:55,630 But we want to have some very nice property of the classes, 717 00:48:55,630 --> 00:48:59,720 which says that we can partition the states in a 718 00:48:59,720 --> 00:49:03,870 Markov chain by using the classes. 719 00:49:03,870 --> 00:49:08,355 So if I don't count this case as a class, I cannot partition 720 00:49:08,355 --> 00:49:11,910 it, because partitioning means that I should cover the whole 721 00:49:11,910 --> 00:49:14,010 states in the classes. 722 00:49:14,010 --> 00:49:19,550 What I want to do is some kind of partitioning of the 723 00:49:19,550 --> 00:49:22,370 Markov chain that I have, by using the classes, so that I 724 00:49:22,370 --> 00:49:25,690 can have a representative state for each class. 725 00:49:25,690 --> 00:49:31,510 And this is one way to partition the Markov chains. 726 00:49:34,010 --> 00:49:37,000 And why do I say it's partitioning? 727 00:49:37,000 --> 00:49:44,220 Well, it's covering the whole finite space of the states. 728 00:49:44,220 --> 00:49:46,820 But I need to prove that there's no intersection 729 00:49:46,820 --> 00:49:49,770 between classes also. 730 00:49:49,770 --> 00:49:51,020 Why is it like that? 731 00:49:55,100 --> 00:49:59,900 So meaning that I cannot have two classes where there is an 732 00:49:59,900 --> 00:50:02,430 intersection between them, because if there's an 733 00:50:02,430 --> 00:50:03,300 intersection-- 734 00:50:03,300 --> 00:50:09,660 for example, i belongs to C1 and i belongs to C2-- 735 00:50:09,660 --> 00:50:14,320 it means that i communicates with all the states in C1 and 736 00:50:14,320 --> 00:50:17,170 i communicates with all states in C2.
737 00:50:17,170 --> 00:50:20,340 So you can say that all states in C1 communicate with all 738 00:50:20,340 --> 00:50:21,610 states in C2. 739 00:50:21,610 --> 00:50:23,040 So actually, they should be the same. 740 00:50:28,220 --> 00:50:30,070 And there's only these states that 741 00:50:30,070 --> 00:50:31,280 communicate with each other. 742 00:50:31,280 --> 00:50:34,690 We have this exclusivity condition. 743 00:50:34,690 --> 00:50:37,980 So we can have this kind of partitioning. 744 00:50:37,980 --> 00:50:40,090 Is there any question? 745 00:50:40,090 --> 00:50:41,340 Everybody's fine? 746 00:50:44,650 --> 00:50:47,690 So another definition which is going to be very, very 747 00:50:47,690 --> 00:50:52,220 important in the future for us is recurrency. 748 00:50:52,220 --> 00:50:53,470 And it's actually very simple. 749 00:51:00,310 --> 00:51:10,170 So a state i is called recurrent if for all the states j that 750 00:51:10,170 --> 00:51:17,820 are accessible from i, we also have that i is 751 00:51:17,820 --> 00:51:19,770 accessible from j. 752 00:51:19,770 --> 00:51:25,320 So if from some state i, in some number of steps, I can 753 00:51:25,320 --> 00:51:29,400 go to some state j, I can get back to i from 754 00:51:29,400 --> 00:51:31,030 that state for sure. 755 00:51:31,030 --> 00:51:32,780 And this should be true for all the 756 00:51:32,780 --> 00:51:36,000 states in a Markov chain. 757 00:51:36,000 --> 00:51:39,550 So if there's a walk from i to j, there should be a walk from 758 00:51:39,550 --> 00:51:41,170 j to i, too. 759 00:51:41,170 --> 00:51:45,730 If this is true, then i is recurrent. 760 00:51:45,730 --> 00:51:47,610 This is the definition of recurrence. 761 00:51:47,610 --> 00:51:50,080 Is it OK? 762 00:51:50,080 --> 00:51:53,610 And if it's not recurrent, we call it transient. 763 00:51:53,610 --> 00:51:56,720 And why do we call it transient? 764 00:51:56,720 --> 00:51:59,130 I mean, the intuition is very nice.
765 00:51:59,130 --> 00:52:01,790 So what is a transient state? 766 00:52:01,790 --> 00:52:05,100 A transient state says that there might be some kind of 767 00:52:05,100 --> 00:52:09,240 walk from i to some k. 768 00:52:09,240 --> 00:52:11,560 And then I can't go back. 769 00:52:11,560 --> 00:52:14,910 So there's a positive probability that I go out of 770 00:52:14,910 --> 00:52:18,370 state i in some way, in some walk, and never 771 00:52:18,370 --> 00:52:19,620 come back to it. 772 00:52:19,620 --> 00:52:23,008 So it's kind of transitional behavior. 773 00:52:23,008 --> 00:52:25,984 AUDIENCE: [INAUDIBLE]? 774 00:52:25,984 --> 00:52:28,363 PROFESSOR: No, no, with some probability. 775 00:52:31,000 --> 00:52:32,570 So there exists some probability. 776 00:52:32,570 --> 00:52:34,270 There is a positive probability that I go out of 777 00:52:34,270 --> 00:52:36,240 it and never come back. 778 00:52:36,240 --> 00:52:39,900 It's enough for definition of transience. 779 00:52:39,900 --> 00:52:43,700 So you know why? 780 00:52:43,700 --> 00:52:47,510 Because I have the definition of recurrence for all the j's 781 00:52:47,510 --> 00:52:50,740 that are accessible from i. 782 00:52:50,740 --> 00:52:51,810 So with probability 1-- 783 00:52:51,810 --> 00:52:54,240 oh, OK, so I cannot say with probability 1. 784 00:52:57,220 --> 00:52:59,420 Yeah, I cannot say probability of 1 for recurrency. 785 00:52:59,420 --> 00:53:04,190 But for transient behavior, there exists some probability. 786 00:53:04,190 --> 00:53:05,800 There's a positive probability that I go out 787 00:53:05,800 --> 00:53:07,050 and never come back. 788 00:53:21,770 --> 00:53:24,140 State 1 is transient. 789 00:53:26,680 --> 00:53:29,215 And state 3, is it recurrent or transient? 790 00:53:36,860 --> 00:53:37,260 Some idea? 791 00:53:37,260 --> 00:53:37,530 AUDIENCE: [INAUDIBLE]. 792 00:53:37,530 --> 00:53:38,710 PROFESSOR: Transient? 793 00:53:38,710 --> 00:53:41,290 What about state 2? 
794 00:53:41,290 --> 00:53:44,430 It's wrong, because there should be something going on 795 00:53:44,430 --> 00:53:45,470 [INAUDIBLE]. 796 00:53:45,470 --> 00:53:47,570 So now here, it's recurrent. 797 00:53:53,680 --> 00:53:54,930 Good? 798 00:53:58,410 --> 00:54:00,650 Oh yeah, we have examples here. 799 00:54:00,650 --> 00:54:04,970 So states 2 and 3 are recurrent, because the only 800 00:54:04,970 --> 00:54:09,660 states that I can go to from these states are themselves. 801 00:54:09,660 --> 00:54:13,000 So they are only accessible from themselves. 802 00:54:13,000 --> 00:54:14,790 And there's a positive probability of going from one 803 00:54:14,790 --> 00:54:15,410 to the other. 804 00:54:15,410 --> 00:54:17,280 So they're recurrent. 805 00:54:17,280 --> 00:54:22,060 But states 4 and 5 are transient. 806 00:54:22,060 --> 00:54:28,020 And the reason is here, because there is a positive 807 00:54:28,020 --> 00:54:30,980 probability that I go out of state 4 in that direction and 808 00:54:30,980 --> 00:54:33,750 never come back. 809 00:54:33,750 --> 00:54:35,000 And there's no way to come back. 810 00:54:37,950 --> 00:54:41,200 And states 6 and 1 are also transient. 811 00:54:41,200 --> 00:54:44,740 State 6 is very, very transient because, well, I go 812 00:54:44,740 --> 00:54:47,290 out of it in one step and never come back. 813 00:54:47,290 --> 00:54:51,540 And state 1 is also transient, because I can go out of it in 814 00:54:51,540 --> 00:54:55,111 this direction and never be able to come back. 815 00:54:55,111 --> 00:54:56,361 Do you see? 816 00:55:00,320 --> 00:55:04,960 So this is a very, very, very important theorem, which says 817 00:55:04,960 --> 00:55:09,450 that if we have some classes, the states in the class are 818 00:55:09,450 --> 00:55:12,590 all recurrent or all transient. 819 00:55:12,590 --> 00:55:15,110 This is what we are going to use a lot in the future, so 820 00:55:15,110 --> 00:55:17,147 you should remember it.
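For a finite chain, the recurrence definition can be checked directly: compute reachability and test that every state accessible from i can also reach i back. A sketch with a hypothetical 3-state chain in which state 0 is transient while states 1 and 2 form a recurrent class:

```python
def is_recurrent(P, i):
    """Definition from the lecture: i is recurrent iff every state j
    accessible from i also has i accessible from j (finite chain)."""
    n = len(P)
    reach = [[P[a][b] > 0 for b in range(n)] for a in range(n)]
    for k in range(n):                       # transitive closure of the edges
        for a in range(n):
            for b in range(n):
                reach[a][b] = reach[a][b] or (reach[a][k] and reach[k][b])
    return all(reach[j][i] for j in range(n) if reach[i][j])

# Hypothetical chain: from 0 you can slip into the closed pair {1, 2}
# and never return, so 0 is transient.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
print(is_recurrent(P, 0))  # False: 0 can reach 1 but can never come back
print(is_recurrent(P, 1))  # True
```

Consistent with the theorem above, states 1 and 2 sit in one class and come out with the same answer.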
821 00:55:17,147 --> 00:55:21,070 And the proof is very easy. 822 00:55:21,070 --> 00:55:29,385 So let's assume that state i is recurrent. 823 00:55:29,385 --> 00:55:37,160 And let's define Si as all the states that 824 00:55:37,160 --> 00:55:40,660 are accessible from i. 825 00:55:40,660 --> 00:55:45,950 And you know that since i is recurrent, being accessible 826 00:55:45,950 --> 00:55:52,230 from i means that i is accessible from them. 827 00:55:52,230 --> 00:55:56,690 So if j is accessible from i, i is also accessible from j. 828 00:55:56,690 --> 00:56:05,460 So we know that i and j communicate with each other if 829 00:56:05,460 --> 00:56:10,960 and only if j is in this set. 830 00:56:10,960 --> 00:56:12,370 So this is a class. 831 00:56:12,370 --> 00:56:15,950 This is the class that contains i. 832 00:56:15,950 --> 00:56:18,610 So this is like the class that I told you, that we can have a 833 00:56:18,610 --> 00:56:23,360 class where this state i is representative. 834 00:56:23,360 --> 00:56:26,970 And actually, any state in a class can be representative. 835 00:56:26,970 --> 00:56:29,920 It doesn't matter. 836 00:56:29,920 --> 00:56:33,440 So by just looking at state i, I can define a class. 837 00:56:33,440 --> 00:56:39,410 The states that are accessible from i-- and actually, this is 838 00:56:39,410 --> 00:56:41,190 the class that contains i. 839 00:56:43,800 --> 00:56:51,720 So let's assume that there's a state called k which is 840 00:56:51,720 --> 00:56:55,720 accessible from some j which is in this set. 841 00:56:55,720 --> 00:56:59,660 So k is accessible from j, and j is accessible from i. 842 00:56:59,660 --> 00:57:03,020 So k is accessible from i. 843 00:57:03,020 --> 00:57:07,060 But k accessible from i implies that i is also 844 00:57:07,060 --> 00:57:09,250 accessible from k, because i is recurrent. 845 00:57:12,204 --> 00:57:16,510 So you see that i is accessible from k.
846 00:57:16,510 --> 00:57:20,020 And you know that j is also accessible from i because, 847 00:57:20,020 --> 00:57:24,590 well, Si was also a class, a recurrent class. 848 00:57:24,590 --> 00:57:27,060 So you see that j is also recurrent. 849 00:57:27,060 --> 00:57:32,440 So if k is accessible from j, then j is also accessible from 850 00:57:32,440 --> 00:57:34,490 k for any k. 851 00:57:34,490 --> 00:57:37,390 So this is the definition of recurrency. 852 00:57:37,390 --> 00:57:42,795 So what I want to say is that if i is recurrent and it's in 853 00:57:42,795 --> 00:57:46,370 the same class as j, then j is recurrent for sure. 854 00:57:49,180 --> 00:57:52,540 And we didn't prove it here, but if one of them is 855 00:57:52,540 --> 00:57:54,070 transient, then all of them are transient too. 856 00:57:56,900 --> 00:57:58,150 So the proof is very simple. 857 00:58:01,700 --> 00:58:04,950 I proved that if there is any state like k that is 858 00:58:04,950 --> 00:58:07,480 accessible from j, I need to prove that j is also 859 00:58:07,480 --> 00:58:09,640 accessible from k. 860 00:58:09,640 --> 00:58:10,890 And I proved it like this. 861 00:58:15,550 --> 00:58:16,800 It's easy. 862 00:58:18,720 --> 00:58:23,830 So the next definition that we have is the definition of 863 00:58:23,830 --> 00:58:27,480 periodic states and classes. 864 00:58:27,480 --> 00:58:33,590 So I told you that the number of steps in a walk-- so when I 865 00:58:33,590 --> 00:58:37,460 say that there's a walk from some state to the other state, 866 00:58:37,460 --> 00:58:41,740 I didn't specify the number of steps needed to get from the 867 00:58:41,740 --> 00:58:43,240 state to another state. 868 00:58:43,240 --> 00:58:47,610 So assuming that there is a walk from state i to i, 869 00:58:47,610 --> 00:58:51,410 meaning that i is accessible from i-- it's not always true. 870 00:58:51,410 --> 00:58:54,560 You might go out of i and never come back to it.
871 00:58:54,560 --> 00:58:58,260 So assuming that there's a positive probability that we 872 00:58:58,260 --> 00:59:02,140 can go from state i to i, you can look at the number of 873 00:59:02,140 --> 00:59:04,880 steps needed for this walk. 874 00:59:04,880 --> 00:59:08,610 And we can find the greatest common divisor of 875 00:59:08,610 --> 00:59:10,510 these numbers of steps. 876 00:59:10,510 --> 00:59:15,180 And it's called the period of i. 877 00:59:15,180 --> 00:59:19,720 And if this number is greater than 1, then i is 878 00:59:19,720 --> 00:59:21,710 said to be periodic. 879 00:59:21,710 --> 00:59:24,640 And if it's not, it's aperiodic. 880 00:59:24,640 --> 00:59:28,840 So the very simple example of this thing 881 00:59:28,840 --> 00:59:31,190 is this Markov chain. 882 00:59:31,190 --> 00:59:40,900 So the probability of going from 1 to 1 in one step is 0. 883 00:59:40,900 --> 00:59:44,990 The probability of going from 1 to 1 in two steps is positive. 884 00:59:44,990 --> 00:59:49,420 And actually, the probability of going from 1 to 1 in any even number of 885 00:59:49,420 --> 00:59:51,440 steps is positive. 886 00:59:51,440 --> 00:59:55,567 But for any odd number of steps, it's 0. 887 00:59:58,490 --> 01:00:01,940 And actually, you can prove that 2 is the 888 01:00:01,940 --> 01:00:03,390 greatest common divisor. 889 01:00:03,390 --> 01:00:05,487 So this is a very easy example to 890 01:00:05,487 --> 01:00:13,660 show that it's periodic. 891 01:00:18,850 --> 01:00:23,620 So there is a very simple way to check if 892 01:00:23,620 --> 01:00:25,920 something is aperiodic. 893 01:00:25,920 --> 01:00:28,090 But it doesn't tell me-- 894 01:00:28,090 --> 01:00:31,380 I mean, if the condition doesn't hold, we don't know the periodicity. 895 01:00:31,380 --> 01:00:41,780 So if there is a walk from state 1, or state i, to itself, 896 01:00:41,780 --> 01:00:47,930 and in this walk, we go through a state called, like, j, and 897 01:00:47,930 --> 01:00:50,560 there is a loop here-- 898 01:00:50,560 --> 01:00:52,610 so Pjj is positive.
899 01:00:55,460 --> 01:01:00,062 What can we say about the periodicity of i? 900 01:01:00,062 --> 01:01:02,040 No, the period is 1. 901 01:01:02,040 --> 01:01:04,490 Yeah, exactly. 902 01:01:04,490 --> 01:01:07,820 So in this case, state i is always aperiodic. 903 01:01:11,080 --> 01:01:13,900 So if there's a walk, and in this walk, there is a loop, we 904 01:01:13,900 --> 01:01:14,860 always say that it's aperiodic. 905 01:01:14,860 --> 01:01:18,230 But if it doesn't exist, can we say anything about the 906 01:01:18,230 --> 01:01:20,800 periodicity? 907 01:01:20,800 --> 01:01:21,770 No. 908 01:01:21,770 --> 01:01:22,670 It might be periodic. 909 01:01:22,670 --> 01:01:24,940 It might be aperiodic. 910 01:01:24,940 --> 01:01:25,820 It's just a check. 911 01:01:25,820 --> 01:01:29,040 So whenever you see a loop, it's aperiodic. 912 01:01:29,040 --> 01:01:32,350 That is, whenever you see a loop in the walk from i to i; if there's a 913 01:01:32,350 --> 01:01:34,740 loop in some other part of the Markov chain, we don't care. 914 01:01:38,370 --> 01:01:41,740 So the definition is fine. 915 01:01:41,740 --> 01:01:46,030 So just looking at this example, so if we're going 916 01:01:46,030 --> 01:01:50,880 from state 4 to 4, the number of steps that we need is, 917 01:01:50,880 --> 01:01:52,710 like, 4, 6, 8. 918 01:01:52,710 --> 01:02:01,940 So 4, 1, 2, 3, 4 is a cycle, or is a walk from 4 to 4, and 919 01:02:01,940 --> 01:02:05,440 the number of steps is 4. 920 01:02:05,440 --> 01:02:10,340 4, 5, 6, 7, 8, 9, 4 is another walk. 921 01:02:10,340 --> 01:02:13,340 So this corresponds to n equal to 6. 922 01:02:13,340 --> 01:02:18,230 And 4, 1, 2, 3, 4, 1, 2, 3, 4 is another walk which 923 01:02:18,230 --> 01:02:21,080 corresponds to n equal to 8. 924 01:02:21,080 --> 01:02:27,640 So you see that we can go like this or this or this. 925 01:02:27,640 --> 01:02:29,680 So these are different n's. 926 01:02:29,680 --> 01:02:33,640 But we see that the greatest common divisor is 2.
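The period-as-gcd computation just described can be sketched in code. The 9-state chain below is my reconstruction of the figure (a 4-cycle 4 → 1 → 2 → 3 → 4 and a 6-cycle 4 → 5 → ... → 9 → 4 sharing state 4); the transition probabilities are assumed, not from the slides.

```python
from functools import reduce
from math import gcd

def period(P, i, max_steps=50):
    """Period of state i: the gcd of the lengths of all walks from i
    back to i (scanning lengths up to max_steps, enough for small chains)."""
    m = len(P)
    Pn = [row[:] for row in P]              # Pn = P^n as n goes 1, 2, ...
    lengths = []
    for n in range(1, max_steps + 1):
        if Pn[i][i] > 0:
            lengths.append(n)
        Pn = [[sum(Pn[r][k] * P[k][c] for k in range(m)) for c in range(m)]
              for r in range(m)]
    return reduce(gcd, lengths)

# Hypothetical 9-state chain matching the figure described.
# States are 0-indexed, so lecture state 4 is index 3, state 7 is index 6.
P = [[0.0] * 9 for _ in range(9)]
P[0][1] = 1.0; P[1][2] = 1.0; P[2][3] = 1.0       # 1 -> 2 -> 3 -> 4
P[3][0] = 0.5; P[3][4] = 0.5                      # 4 -> 1 or 4 -> 5
P[4][5] = 1.0; P[5][6] = 1.0; P[6][7] = 1.0       # 5 -> 6 -> 7 -> 8
P[7][8] = 1.0; P[8][3] = 1.0                      # 8 -> 9 -> 4

print(period(P, 3))     # state 4: walks of length 4, 6, 8, ... give gcd 2
print(period(P, 6))     # state 7: walks of length 6, 10, ... also give gcd 2

# A state on a walk through a self-loop is aperiodic: returns in both
# 2 and 3 steps force the gcd to 1, even with no loop on the state itself.
P_loop = [[0.0, 1.0],
          [0.5, 0.5]]
print(period(P_loop, 0))
```

The gcd of the return lengths 4, 6, 8, ... is 2 for both states, matching the lecture, and the two-state chain shows how a single loop anywhere on the walk forces the period down to 1.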
927 01:02:33,640 --> 01:02:36,232 So the period of state 4 is 2. 928 01:02:36,232 --> 01:02:40,120 For state 7, we have this thing. 929 01:02:40,120 --> 01:02:44,730 So the minimum number of steps to get from 7 to itself is 6, 930 01:02:44,730 --> 01:02:48,840 and then you can get from 7 to 7 in 10 steps. 931 01:02:48,840 --> 01:02:49,960 And I hope you see it. 932 01:02:49,960 --> 01:02:53,650 So it's going to be like this. 933 01:02:53,650 --> 01:02:57,690 And so again, the greatest common divisor is 2. 934 01:03:04,400 --> 01:03:18,790 So we proved that if one state in a class is recurrent, 935 01:03:18,790 --> 01:03:20,830 then all the states are recurrent. 936 01:03:20,830 --> 01:03:23,440 And we said that recurrence corresponds to having a cycle, 937 01:03:23,440 --> 01:03:26,200 having a walk from each state to itself. 938 01:03:29,690 --> 01:03:36,310 So this is the result, very similar to that one, which 939 01:03:36,310 --> 01:03:38,460 says that all the states in the same class 940 01:03:38,460 --> 01:03:41,170 have the same period. 941 01:03:41,170 --> 01:03:45,600 And in this example, you see it too. 942 01:03:45,600 --> 01:03:48,590 The proof is not very complicated, but it takes time 943 01:03:48,590 --> 01:03:49,120 to do that. 944 01:03:49,120 --> 01:03:50,210 So I'm not going to do it. 945 01:03:50,210 --> 01:03:51,920 But it's quite nice. 946 01:03:51,920 --> 01:03:57,330 So it's good to look at the text; you'll see it there. 947 01:03:57,330 --> 01:03:58,686 AUDIENCE: [INAUDIBLE] 948 01:03:58,686 --> 01:04:01,700 only for recurrent states? 949 01:04:01,700 --> 01:04:02,250 PROFESSOR: Yeah. 950 01:04:02,250 --> 01:04:04,217 For non-recurrent ones, it's-- 951 01:04:04,217 --> 01:04:05,680 AUDIENCE: [INAUDIBLE]. 952 01:04:05,680 --> 01:04:06,510 PROFESSOR: It's 1. 953 01:04:06,510 --> 01:04:08,570 It's aperiodic. 954 01:04:08,570 --> 01:04:09,220 OK, yeah. 955 01:04:09,220 --> 01:04:11,596 Periodicity is only defined for recurrent states, yeah.
956 01:04:15,290 --> 01:04:16,710 We have another example here. 957 01:04:16,710 --> 01:04:17,780 I just want to show you. 958 01:04:17,780 --> 01:04:24,480 So I have two recurrent classes in this example. 959 01:04:24,480 --> 01:04:25,750 Actually, three. 960 01:04:25,750 --> 01:04:28,610 So one of them is this class. 961 01:04:28,610 --> 01:04:33,030 What is the period for the class corresponding to state 1? 962 01:04:33,030 --> 01:04:34,460 It's 1, because I have a loop. 963 01:04:34,460 --> 01:04:35,570 It's very simple. 964 01:04:35,570 --> 01:04:39,440 For any n, there is a positive probability. 965 01:04:39,440 --> 01:04:43,320 What is the period for this class, states-- 966 01:04:43,320 --> 01:04:47,390 containing states 4 and 5? 967 01:04:47,390 --> 01:04:50,510 Look at it. 968 01:04:50,510 --> 01:04:52,850 There is definitely a loop in this class. 969 01:04:52,850 --> 01:04:54,785 So it's 1. 970 01:04:54,785 --> 01:04:58,100 No, I said that whenever there is a loop, the greatest common 971 01:04:58,100 --> 01:04:58,910 divisor is 1. 972 01:04:58,910 --> 01:05:03,080 So for going from state 5 to 5, we can do it in one step, 973 01:05:03,080 --> 01:05:05,580 two steps, three steps, four steps, five. 974 01:05:05,580 --> 01:05:09,420 So the greatest common divisor is 1. 975 01:05:09,420 --> 01:05:14,200 So here, I showed you that if there is a loop, then it's 976 01:05:14,200 --> 01:05:16,650 definitely aperiodic, meaning that the greatest common 977 01:05:16,650 --> 01:05:18,840 divisor is 1. 978 01:05:18,840 --> 01:05:20,218 AUDIENCE: [INAUDIBLE] have self-transitions [INAUDIBLE] 979 01:05:20,218 --> 01:05:21,670 2? 980 01:05:21,670 --> 01:05:24,410 PROFESSOR: No, I'm talking about 4 and 5. 981 01:05:24,410 --> 01:05:25,260 In what? 982 01:05:25,260 --> 01:05:27,204 AUDIENCE: If they didn't have self-transitions? 983 01:05:27,204 --> 01:05:28,180 [INAUDIBLE]. 984 01:05:28,180 --> 01:05:28,780 PROFESSOR: Oh yeah.
985 01:05:28,780 --> 01:05:30,140 It would be like this one? 986 01:05:30,140 --> 01:05:30,980 AUDIENCE: Yeah, [INAUDIBLE]. 987 01:05:30,980 --> 01:05:33,310 PROFESSOR: Yeah, definitely. 988 01:05:33,310 --> 01:05:38,685 So the class containing states 2 and 3, they are periodic. 989 01:05:38,685 --> 01:05:39,651 AUDIENCE: Oh, wait. 990 01:05:39,651 --> 01:05:40,901 You just said [INAUDIBLE]. 991 01:05:44,000 --> 01:05:44,640 PROFESSOR: Oh, OK. 992 01:05:44,640 --> 01:05:45,890 AUDIENCE: [INAUDIBLE]. 993 01:05:53,396 --> 01:05:55,097 PROFESSOR: Yeah, actually, we can define it 994 01:05:55,097 --> 01:05:56,850 for transient states. 995 01:05:56,850 --> 01:05:59,380 Yeah, for transient classes, we can define periodicity, 996 01:05:59,380 --> 01:06:00,630 like in this case. 997 01:06:04,480 --> 01:06:05,730 Yeah, why not? 998 01:06:20,300 --> 01:06:26,360 This is another very important thing that we can 999 01:06:26,360 --> 01:06:29,100 do in Markov chains. 1000 01:06:29,100 --> 01:06:32,770 So I have a class of states in a Markov chain, 1001 01:06:32,770 --> 01:06:35,590 and they are periodic. 1002 01:06:35,590 --> 01:06:42,130 So the period of each state in the class is greater than 1. 1003 01:06:42,130 --> 01:06:45,250 And you know that all the states in a class have the 1004 01:06:45,250 --> 01:06:47,480 same period. 1005 01:06:47,480 --> 01:06:52,920 So it means that I can partition the class into these 1006 01:06:52,920 --> 01:06:58,700 subclasses, where there is only-- 1007 01:06:58,700 --> 01:07:01,040 OK, so-- 1008 01:07:01,040 --> 01:07:02,510 do I have that? 1009 01:07:02,510 --> 01:07:03,760 I don't know. 1010 01:07:15,270 --> 01:07:18,160 So let's assume that d is equal to 3. 1011 01:07:18,160 --> 01:07:21,620 And the whole thing is the class of states that I'm 1012 01:07:21,620 --> 01:07:22,750 talking about.
1013 01:07:22,750 --> 01:07:29,020 I can partition it into three subclasses, in which I only have 1014 01:07:29,020 --> 01:07:32,906 transitions from one of these to the other one. 1015 01:07:36,720 --> 01:07:41,440 So there's no transition from a subclass to itself. 1016 01:07:41,440 --> 01:07:47,270 And the only transitions are from one 1017 01:07:47,270 --> 01:07:48,540 subclass to the next one. 1018 01:07:51,290 --> 01:07:52,790 So this is sort of intuitive. 1019 01:07:52,790 --> 01:07:59,430 So just looking at three states, if the period is 3, I 1020 01:07:59,430 --> 01:08:04,060 can partition it in this way. 1021 01:08:04,060 --> 01:08:08,300 Or I can have, like, two of these in this case, where I 1022 01:08:08,300 --> 01:08:11,222 have like this. 1023 01:08:11,222 --> 01:08:11,840 You see? 1024 01:08:11,840 --> 01:08:14,190 So there's a transition from this to these two states, and 1025 01:08:14,190 --> 01:08:16,930 from these two states to here, and from here to here. 1026 01:08:16,930 --> 01:08:19,720 But I cannot have any transition from here to 1027 01:08:19,720 --> 01:08:23,979 itself, or from here to here, or here to here. 1028 01:08:23,979 --> 01:08:28,460 Just look at the text for the proof and illustration. 1029 01:08:28,460 --> 01:08:35,195 But there exists this kind of partitioning. 1030 01:08:35,195 --> 01:08:36,445 You should know that. 1031 01:08:40,220 --> 01:08:44,300 Yeah, so if I am in a subclass, the next state 1032 01:08:44,300 --> 01:08:47,500 will be in the next subclass, for sure. 1033 01:08:47,500 --> 01:08:50,640 So I know this subclass in [INAUDIBLE]. 1034 01:08:50,640 --> 01:08:53,950 And in two steps, I will be in the other [INAUDIBLE]. 1035 01:08:53,950 --> 01:08:57,710 So you know the subclass that I can be in at 1036 01:08:57,710 --> 01:09:04,840 step nd plus m.
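The partition just described can be checked numerically: each subclass is the set of states reachable from a reference state in nd + m steps, and every step moves you to the next subclass cyclically. A minimal sketch, on a hypothetical 4-state chain with period d = 3 (the states and probabilities are mine, not the slide's).

```python
# Hypothetical chain with period d = 3: subclasses {0} -> {1, 2} -> {3} -> {0}.
P = [[0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0, 0.0]]

def reachable_in(P, start, steps):
    """Set of states reachable from `start` in exactly `steps` steps."""
    current = {start}
    for _ in range(steps):
        current = {j for i in current for j in range(len(P)) if P[i][j] > 0}
    return current

d = 3
subclasses = []
for m in range(d):
    # Subclass m: states reachable from state 0 in n*d + m steps, any n.
    S_m = set()
    for n in range(5):                 # a few multiples of d suffice here
        S_m |= reachable_in(P, 0, n * d + m)
    subclasses.append(sorted(S_m))
print(subclasses)
```

The three subclasses come out disjoint, and transitions only ever go from subclass m to subclass m + 1 (mod d), as the lecture states.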
1037 01:09:04,840 --> 01:09:08,970 So the other definition that you can say is that, again, 1038 01:09:08,970 --> 01:09:13,420 you can choose one of the states in the class that I 1039 01:09:13,420 --> 01:09:14,359 talked about. 1040 01:09:14,359 --> 01:09:18,220 So let's say that I choose state 1. 1041 01:09:18,220 --> 01:09:20,130 And I define Sm-- 1042 01:09:20,130 --> 01:09:23,200 Sm corresponds to that subclass-- 1043 01:09:23,200 --> 01:09:26,240 as all the j's where P1j in nd plus m steps is positive. 1044 01:09:42,399 --> 01:09:47,450 So I told you that I will have d subclasses, and I can define each 1045 01:09:47,450 --> 01:09:49,580 subclass like this. 1046 01:09:49,580 --> 01:09:54,320 So this is all the possible states that I can be in after nd 1047 01:09:54,320 --> 01:09:57,260 plus m steps. 1048 01:09:57,260 --> 01:10:01,320 So starting from state 1, in nd plus m steps, I can be in 1049 01:10:01,320 --> 01:10:03,690 this set of states, for some n. 1050 01:10:03,690 --> 01:10:05,090 So n can be big. 1051 01:10:05,090 --> 01:10:09,628 But anyway, I call this subclass number m. 1052 01:10:12,830 --> 01:10:24,865 So let's just talk about something. 1053 01:10:28,680 --> 01:10:30,990 So I said that in order to characterize Markov 1054 01:10:30,990 --> 01:10:37,460 chains, I need to show you the initial state or the initial state 1055 01:10:37,460 --> 01:10:38,220 distribution. 1056 01:10:38,220 --> 01:10:42,940 So it can be deterministic, like I start from some 1057 01:10:42,940 --> 01:10:45,660 specific state all the time, or there can be some 1058 01:10:45,660 --> 01:10:49,460 distribution, like p of x0, meaning that this is the 1059 01:10:49,460 --> 01:10:53,490 distribution that I would have at my initial state.
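Given the transition probabilities and that initial state distribution, the distribution at any later time follows by repeatedly multiplying the current distribution by the transition matrix. A minimal sketch with a hypothetical two-state chain (the numbers are assumptions, not the lecture's).

```python
# Hypothetical two-state chain; pi starts as a deterministic
# distribution on state 0 and is pushed forward one step at a time.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [1.0, 0.0]

def step(pi, P):
    """One application of the chain rule: pi_{n+1}[j] = sum_i pi_n[i] * P[i][j]."""
    return [sum(pi[i] * P[i][j] for i in range(len(pi)))
            for j in range(len(P))]

for _ in range(3):
    pi = step(pi, P)
print([round(p, 4) for p in pi])   # the distribution after three steps
```

This is exactly the characterization the lecture gives: the transition probabilities plus the initial state (or its distribution) pin down the distribution at every later time.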
1060 01:10:53,490 --> 01:10:59,720 I don't have it here, but using, I think, chain rules, 1061 01:10:59,720 --> 01:11:03,230 we can easily find the distribution of states at each 1062 01:11:03,230 --> 01:11:10,500 time instant by having the transition probabilities and 1063 01:11:10,500 --> 01:11:13,260 the initial state distribution. 1064 01:11:13,260 --> 01:11:15,230 I just wrote it for you. 1065 01:11:15,230 --> 01:11:19,990 So for characterizing the Markov chain, I just need to 1066 01:11:19,990 --> 01:11:22,650 tell you the transition probabilities 1067 01:11:22,650 --> 01:11:26,570 and the initial state. 1068 01:11:26,570 --> 01:11:31,480 So a very good question is, is there any kind of 1069 01:11:31,480 --> 01:11:40,330 stable behavior as time goes on, like in the very far future? 1070 01:11:40,330 --> 01:11:45,490 So I had an example here, just looking at here. 1071 01:11:45,490 --> 01:11:50,150 So this is a very simple thing that I can say about this 1072 01:11:50,150 --> 01:11:51,370 Markov chain. 1073 01:11:51,370 --> 01:11:55,370 I know that in the very far future, there is a 0 1074 01:11:55,370 --> 01:11:59,680 probability that I'm in state 6. 1075 01:11:59,680 --> 01:12:00,380 And you know why? 1076 01:12:00,380 --> 01:12:03,690 Because any time that I am in state 6, I will go out of it 1077 01:12:03,690 --> 01:12:05,060 with probability 1. 1078 01:12:05,060 --> 01:12:08,330 So there's no chance that I will be there. 1079 01:12:08,330 --> 01:12:20,610 Or I can say that, if I start from state 4, in the very, very 1080 01:12:20,610 --> 01:12:27,444 far future, I will not be in state 1, 4, or 5, for sure. 1081 01:12:27,444 --> 01:12:29,030 You know why? 1082 01:12:29,030 --> 01:12:33,020 Because there is a positive probability of going out of 1083 01:12:33,020 --> 01:12:35,740 these three states, like here. 1084 01:12:35,740 --> 01:12:40,080 And then if I go out of it, I can never come back.
1085 01:12:40,080 --> 01:12:44,910 So there's a chance of going from state 4 to state 2. 1086 01:12:44,910 --> 01:12:47,360 And if I ever go there, I will never come back. 1087 01:12:50,180 --> 01:12:55,830 So these are the statements that I can make about the stable 1088 01:12:55,830 --> 01:13:00,920 behavior, or the very, very far future behavior, 1089 01:13:00,920 --> 01:13:02,300 of the Markov chain. 1090 01:13:02,300 --> 01:13:07,690 So the question is, can we always make these kinds of 1091 01:13:07,690 --> 01:13:09,200 statements? 1092 01:13:09,200 --> 01:13:12,160 And what kinds of statements can we actually have? 1093 01:13:12,160 --> 01:13:16,090 And so you see that, for example, for the state 6 1094 01:13:16,090 --> 01:13:16,550 [INAUDIBLE] 1095 01:13:16,550 --> 01:13:20,870 example, I could say that the probability of being in state 1096 01:13:20,870 --> 01:13:24,620 6 as n goes to infinity is 0. 1097 01:13:24,620 --> 01:13:27,620 So can I always have some kind of probability distribution 1098 01:13:27,620 --> 01:13:32,910 over the states in the future as n goes to infinity? 1099 01:13:32,910 --> 01:13:35,780 This is a very good question. 1100 01:13:35,780 --> 01:13:39,020 And actually, it's related to a lot of applications that we 1101 01:13:39,020 --> 01:13:42,600 have for Markov chains, like queuing theory. 1102 01:13:42,600 --> 01:13:45,700 And you can have queues for almost anything. 1103 01:13:45,700 --> 01:13:50,800 So one of the most fundamental and interesting classes of 1104 01:13:50,800 --> 01:13:55,870 states are the states that are called ergodic, meaning that 1105 01:13:55,870 --> 01:13:57,560 they are recurrent and aperiodic. 1106 01:14:00,240 --> 01:14:09,230 So if I have a Markov chain that has only one class, and 1107 01:14:09,230 --> 01:14:14,430 this class is recurrent and aperiodic, then we call it an 1108 01:14:14,430 --> 01:14:15,980 ergodic Markov chain.
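One rough way to check this definition in code is to verify that every state reaches every other state, so there is only one class; aperiodicity can then be checked separately via the gcd of return times. The sketch below only automates the one-class part, and both chains and all names are hypothetical.

```python
def reaches(P, i, j):
    """True if state j is reachable from state i in one or more steps."""
    frontier, seen = {i}, set()
    while frontier:
        nxt = {c for r in frontier for c in range(len(P)) if P[r][c] > 0}
        frontier = nxt - seen
        seen |= nxt
    return j in seen

def one_class(P):
    """True if the whole chain is a single communicating class."""
    n = len(P)
    return all(reaches(P, i, j) for i in range(n) for j in range(n))

P_ergodic = [[0.9, 0.1],
             [0.5, 0.5]]
P_periodic = [[0.0, 1.0],
              [1.0, 0.0]]
print(one_class(P_ergodic))    # True; its self-loops also make it aperiodic
print(one_class(P_periodic))   # True as one class, but it has period 2
```

The second chain shows why both conditions matter: it is a single recurrent class, yet it is periodic, so it is not ergodic.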
1109 01:14:15,980 --> 01:14:21,880 So we had two theorems saying that if a state in a class is 1110 01:14:21,880 --> 01:14:25,170 recurrent, then all the states are recurrent. 1111 01:14:25,170 --> 01:14:30,010 And if a state in a class is aperiodic, then all the states 1112 01:14:30,010 --> 01:14:31,550 are aperiodic. 1113 01:14:31,550 --> 01:14:37,070 So we can say that some classes are ergodic and some 1114 01:14:37,070 --> 01:14:38,970 are not ergodic. 1115 01:14:38,970 --> 01:14:42,940 So if the Markov chain has only one class, and this class 1116 01:14:42,940 --> 01:14:46,460 is aperiodic and recurrent, then we call it an ergodic 1117 01:14:46,460 --> 01:14:47,980 Markov chain. 1118 01:14:47,980 --> 01:14:58,970 And the very important and nice property of ergodic 1119 01:14:58,970 --> 01:15:03,200 Markov chains is that they lose memory as n goes to 1120 01:15:03,200 --> 01:15:06,740 infinity, meaning that whatever initial distribution 1121 01:15:06,740 --> 01:15:11,240 I have for the initial state, I will 1122 01:15:11,240 --> 01:15:12,160 lose memory of that. 1123 01:15:12,160 --> 01:15:14,500 So whatever state I start in, or whatever distribution 1124 01:15:14,500 --> 01:15:23,200 I start in, after a while, for a large enough n, 1125 01:15:23,200 --> 01:15:26,940 the distribution of states does not depend on that. 1126 01:15:26,940 --> 01:15:31,760 So again, looking at that chain rule, I could say that I 1127 01:15:31,760 --> 01:15:34,760 can find the probability distribution of x by looking 1128 01:15:34,760 --> 01:15:36,010 at this thing. 1129 01:15:40,400 --> 01:15:45,420 And looking at these recursively, it all depends on 1130 01:15:45,420 --> 01:15:49,540 P and the initial distribution. 1131 01:15:49,540 --> 01:15:57,140 And ergodic Markov chains have this property that, after 1132 01:15:57,140 --> 01:15:59,890 a while, this distribution doesn't depend on the initial 1133 01:15:59,890 --> 01:16:01,220 distribution anymore.
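This memory loss can be seen directly: raise a transition matrix to a large power and every row of P^n converges to the same distribution, so which state you started in no longer matters. A sketch with a hypothetical two-state ergodic chain, not the lecture's example.

```python
# Hypothetical ergodic chain: self-loops make it aperiodic, and both
# states communicate, so it is a single recurrent class.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def mat_mult(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

Pn = P
for _ in range(49):            # Pn = P^50
    Pn = mat_mult(Pn, P)

# Row i of P^n is the distribution at time n given a start in state i.
# For large n the two rows agree: the chain has lost its memory.
print([round(x, 6) for x in Pn[0]])
print([round(x, 6) for x in Pn[1]])
```

Both rows come out as the same limiting distribution (5/6, 1/6 for these numbers), which is exactly the pj the lecture introduces next: a limit that does not depend on the starting state i.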
1134 01:16:01,220 --> 01:16:02,850 So they lose memory. 1135 01:16:02,850 --> 01:16:05,920 And actually, usually we can calculate the stable 1136 01:16:05,920 --> 01:16:06,490 distribution. 1137 01:16:06,490 --> 01:16:11,140 So this thing goes to a limit which is called pj. 1138 01:16:11,140 --> 01:16:14,470 So the important thing is that it doesn't depend on i. 1139 01:16:14,470 --> 01:16:17,780 It doesn't depend on where I start. 1140 01:16:17,780 --> 01:16:20,960 Eventually, I will converge to this distribution. 1141 01:16:20,960 --> 01:16:25,070 And then this distribution doesn't change. 1142 01:16:25,070 --> 01:16:29,610 So for a large enough n, I have this property for ergodic 1143 01:16:29,610 --> 01:16:30,860 Markov chains. 1144 01:16:32,925 --> 01:16:35,050 We will have a lot to do with these 1145 01:16:35,050 --> 01:16:37,380 properties in the future. 1146 01:16:37,380 --> 01:16:47,990 So I was saying that this pj, this Pij, should be positive. 1147 01:16:47,990 --> 01:16:54,500 So being in each state has a nonzero probability. 1148 01:16:54,500 --> 01:17:01,730 The first thing that I need to prove is that Pij of n is 1149 01:17:01,730 --> 01:17:07,620 nonzero for a large enough n, for all j and all the initial 1150 01:17:07,620 --> 01:17:09,250 distributions. 1151 01:17:09,250 --> 01:17:11,580 And I want to prove that this is true for 1152 01:17:11,580 --> 01:17:12,920 ergodic Markov chains. 1153 01:17:12,920 --> 01:17:14,170 This is not true, generally. 1154 01:17:16,852 --> 01:17:24,180 Well, this is more a combinatorial issue, but there 1155 01:17:24,180 --> 01:17:28,480 is a theorem here which says that for an ergodic Markov 1156 01:17:28,480 --> 01:17:35,880 chain, for all n greater than this value, I have a nonzero 1157 01:17:35,880 --> 01:17:39,300 probability of going from i to j. 1158 01:17:39,300 --> 01:17:43,190 So the thing that you should be careful about here is that it's for 1159 01:17:43,190 --> 01:17:45,850 all n greater than this value.
1160 01:17:45,850 --> 01:18:02,860 So for going from state 1 to 1, I can do it in six steps 1161 01:18:02,860 --> 01:18:05,360 and 12 steps and so on. 1162 01:18:05,360 --> 01:18:11,170 But I cannot do it in 24 steps, I think. 1163 01:18:11,170 --> 01:18:17,720 I cannot go from 1 to 1 in 25 steps. 1164 01:18:17,720 --> 01:18:24,470 But I can go from 1 to 1 in 26, 27, 28, 29, to 30. 1165 01:18:24,470 --> 01:18:28,300 So for n greater than (M minus 1) squared plus 1, where M is the number of states, I can go 1166 01:18:28,300 --> 01:18:30,880 from any state to any state. 1167 01:18:30,880 --> 01:18:38,770 So I think this bound is tight for state 4. 1168 01:18:41,630 --> 01:18:43,890 Sorry, maybe what I said is true for state 4. 1169 01:18:49,614 --> 01:18:50,580 Yeah. 1170 01:18:50,580 --> 01:18:53,770 So I'm not going to prove this theorem. 1171 01:18:53,770 --> 01:18:56,130 Well, actually you don't have the proof 1172 01:18:56,130 --> 01:18:57,567 in the notes, either. 1173 01:18:57,567 --> 01:19:05,910 But yeah, you can look at the example and the cases that are 1174 01:19:05,910 --> 01:19:08,570 discussed in the notes. 1175 01:19:08,570 --> 01:19:09,820 Are there any questions? 1176 01:19:12,870 --> 01:19:14,120 Fine?
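The bound being discussed, (M - 1)^2 + 1 for an ergodic chain on M states, is attained by the classic cycle-plus-chord construction. The 6-state chain below is my reconstruction of that kind of example, not necessarily the slide's chain: a 6-cycle with one extra shortcut edge, so (6 - 1)^2 + 1 = 26.

```python
# Hypothetical 6-state chain: the 6-cycle 0 -> 1 -> ... -> 5 -> 0 plus a
# shortcut 5 -> 1, which creates a second cycle of length 5 through 1..5.
M = 6
P = [[0.0] * M for _ in range(M)]
for i in range(M - 1):
    P[i][i + 1] = 1.0
P[5][0] = 0.5
P[5][1] = 0.5

def mat_mult(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(M)) for j in range(M)]
            for i in range(M)]

def mat_pow(P, n):
    R = [[float(i == j) for j in range(M)] for i in range(M)]
    for _ in range(n):
        R = mat_mult(R, P)
    return R

bound = (M - 1) ** 2 + 1                 # 26 for M = 6
# Just below the bound, some entry is still zero: there is no walk of
# length 25 from state 0 back to itself (all return lengths are 6 + 5k sums).
print(mat_pow(P, bound - 1)[0][0] > 0)
# At the bound and beyond, every entry of P^n is strictly positive.
print(all(x > 0 for row in mat_pow(P, bound) for x in row))
```

This matches the pattern in the lecture: returns are possible at some small step counts, impossible at 25, and from 26 onward you can go from any state to any state.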