As a warm-up, just to see how to use steady-state probabilities, let us look at our familiar example. This is a two-state Markov chain, and we did write down the complete balance equations for this chain and found the steady-state probabilities before. Notice that we can find these by using the trick which we introduced for birth-and-death processes. You cut the chain along this line and argue that the frequency of transitions of this type has to be the same as the frequency of transitions of this type. So if you have pi 1 here and pi 2 here, what it means is that pi 1 times 0.5, which represents the frequency of these kinds of transitions, has to be equal to pi 2 times 0.2, which represents that kind of transition. To this we add the normalization equation, pi 1 plus pi 2 equals 1. And by solving this system of equations, you obtain the same thing as what we obtained before, which is the steady-state probabilities of state 1 and of state 2.

Let us now try to calculate some related quantities. Suppose that you start at state 1, and you want to calculate this particular probability.
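As a quick aside, the cut argument just described can be reproduced in a few lines of Python. This is a minimal sketch, assuming the transition probabilities from the lecture (variable names are mine):

```python
# Two-state chain: p11 = 0.5, p12 = 0.5, p21 = 0.2, p22 = 0.8.
# Cutting the chain between the two states gives the balance equation
#   pi1 * p12 = pi2 * p21,
# which, together with pi1 + pi2 = 1, determines the steady state.
p12, p21 = 0.5, 0.2          # probabilities of crossing the cut in each direction
pi1 = p21 / (p12 + p21)      # = 0.2 / 0.7 = 2/7
pi2 = p12 / (p12 + p21)      # = 0.5 / 0.7 = 5/7
print(pi1, pi2)              # approximately 0.2857 and 0.7143
```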
So you start at state 1, and you want to know what is the probability that at the next time step you will be in state 1, and that 100 time steps later you are still in state 1. Now, the conditional probability of two things happening is the conditional probability of the first one happening, x1 equals 1 given x0 equals 1, times the probability, given that the first one happens, that the second one happens: x100 equals 1, given x1 equals 1 and x0 equals 1.

So what is this? The first one is the transition probability from state 1 to state 1, p 1 1. How about the second probability? Because of the Markov property, the extra conditioning is irrelevant, and so that probability is r 1 1 in 99 steps. Now, 99 is presumably a big number, so we approximate this quantity using the steady-state probability of state 1. And that gives us the approximation p 1 1 times pi 1, which is 0.5 times 2 over 7.

Now, how about this expression? Given that you start in state 1, what is the probability that at time step 100 you are in 1, and at time step 101 you are in 2?
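Before moving on, the first approximation can be checked exactly: the exact value is p 1 1 times r 1 1 of 99, which we can get by raising the transition matrix to the 99th power. A sketch in pure Python (2x2 only; names are mine):

```python
# Exact value of P(X1 = 1, X100 = 1 | X0 = 1) = p11 * r11(99),
# versus the steady-state approximation p11 * pi1 = 0.5 * 2/7.

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[0.5, 0.5],    # transitions out of state 1
     [0.2, 0.8]]    # transitions out of state 2

R = [[1.0, 0.0], [0.0, 1.0]]   # identity, i.e. P to the power 0
for _ in range(99):
    R = matmul(R, P)            # R is now P to the power 99

exact = P[0][0] * R[0][0]       # p11 * r11(99)
approx = P[0][0] * (2 / 7)      # p11 * pi1
print(exact, approx)            # both approximately 0.142857
```

At n = 99 the two numbers agree to machine precision, which is why the steady-state shortcut is harmless here.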
Applying the same technique gives the conditional probability of the first thing happening, and then, given that, the probability that the second one happens: x101 equals 2, given x100 equals 1 and x0 equals 1. And again, what we have here is r 1 1 of 100, times the transition probability from 1 to 2, p 1 2; once more, the Markov property tells us that we can forget about the extra conditioning. And again, if n equals 100 is large enough, we can approximate that by pi 1 times p 1 2, which is 2 over 7 times 0.5 again.

Finally, let's calculate this third expression, where, again, we start at state 1, and now we are asking: what is the probability that at time step 100 you are in 1, and 100 steps later you are again in 1? We use the same trick as before: the probability that the first thing happens, and given that, the probability that the second one happens. Again, the first factor is r 1 1 of 100, and in the second one, for the same reason as before, we can forget this term. From time step 100 to 200, you have 100 time steps, so this is r 1 1 of 100 as well.
And 100 is big enough, so we're going to approximate both factors by the same value, and we get pi 1 squared, which is 2 over 7, squared.

Now, in this calculation, we assumed that n equals 99 or n equals 100 was big enough, big enough so that the limit has taken effect. But how do we know that our approximation is good? In other words, is n equals 99 or 100 large enough? Well, this has something to do with the mixing time scale of our Markov chain, and by mixing time scale, I mean: how long does it take for the initial state to be forgotten? So how can you see that here? Well, you can first try a simulation. So, using any of your favorite software, simulate, and draw r 1 1 of n as a function of n. At n equals 0, this is 1. At n equals 1, it is going to be p 1 1, and p 1 1 was 0.5, so already it is here. And pi 1 is 2 over 7, which is down here. What we know is that as n goes to infinity, this quantity goes to 2 over 7. And if you simulate, you will see that it gets there very fast.
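That computation takes only a few lines. This sketch iterates the n-step transition probabilities of the chain above (a recursion over the rows of the powers of P; names are mine):

```python
# r11(n) computed by the recursion
#   r11(n+1) = r11(n) * p11 + r12(n) * p21,
#   r12(n+1) = r11(n) * p12 + r12(n) * p22,
# starting from r11(0) = 1, r12(0) = 0, and compared to pi1 = 2/7.
pi1 = 2 / 7
r11, r12 = 1.0, 0.0
for n in range(11):
    print(n, round(r11, 6), round(abs(r11 - pi1), 6))
    r11, r12 = r11 * 0.5 + r12 * 0.2, r11 * 0.5 + r12 * 0.8
```

The error shrinks by a factor of 0.3 per step, so by n equals 5 it is below 0.002, and by n equals 10 it is below 0.00001.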
So here I am just joining the points, and we see that at n equals 5 already, you are very, very close to 2 over 7. So you really have an exponential decrease here. In fact, if you look at the simulation results, at n equals 5 you already have two correct decimals, and for n equals 10, it is correct up to five decimals.

Or, if you do not want to run a simulation, you can simply think in terms of orders of magnitude; another approach would be an order-of-magnitude type of argument. In order to do that, ask: starting here, on average, how many trials, or how many time steps, would it take for you to observe such a transition? Well, you use the geometric random variable: the average amount of time until you have a success is 1 over the success probability. So it takes on average 1 over 0.5, that is, two time steps, to go from here to here. And to go from here to here, it takes on average 1 over 0.2, which is five time steps.
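The geometric argument can also be checked empirically. This sketch (function name is mine) estimates the mean number of steps until the chain first leaves a state, for the two exit probabilities in the example:

```python
import random

# Time until the first "success" (leaving the state) is geometric with
# mean 1/p: leaving state 1 has p = 0.5 (mean 2 steps), leaving state 2
# has p = 0.2 (mean 5 steps).
random.seed(0)

def mean_exit_time(p, trials=100_000):
    """Average number of Bernoulli(p) trials until the first success."""
    total = 0
    for _ in range(trials):
        steps = 1
        while random.random() >= p:
            steps += 1
        total += steps
    return total / trials

print(mean_exit_time(0.5))   # close to 2
print(mean_exit_time(0.2))   # close to 5
```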
So, as an order of magnitude, given that you started here in state 1, after about 10 iterations on average, allowing for some randomness, there is a high likelihood that you will have gone there and come back here. And then if you take n equals 100, which is 10 times that, in terms of order of magnitude it looks like n is large enough. So that would be a back-of-the-envelope calculation.

Now, this kind of calculation is useful in general, not just here. So, for example, let's do it again. Assume that instead of 0.5 here, the probability was 0.999, and maybe here, instead of 0.8, it was 0.998. Now, in order to observe such a transition, since this number here would become 0.001, it would take, on average, about 1,000 time steps to observe one such transition. So if your chain were of this type, time steps of that order would not be enough. After n equals 100, the likelihood is that you would still be here, so the initial state, or the initial condition, would still matter.
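For the slow chain, the same r 1 1 of n recursion shows how little has mixed by n equals 100. A sketch, using the modified probabilities from the example:

```python
# Slow chain: p11 = 0.999, p22 = 0.998.  Its steady state follows from
# pi1 * 0.001 = pi2 * 0.002, i.e. pi1 = 2/3.
p11, p12 = 0.999, 0.001
p21, p22 = 0.002, 0.998
r11, r12 = 1.0, 0.0
for _ in range(100):
    r11, r12 = r11 * p11 + r12 * p21, r11 * p12 + r12 * p22
print(r11)   # roughly 0.91, still far from pi1 = 2/3
```

After 100 steps the chain still sits in state 1 with probability above 0.9, so the initial condition clearly has not been forgotten.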
So it would take about 1,000 steps to get there, and then here you would have 0.002, which means that, on average, from here you would take about 500 iterations before you observe that transition for the first time, so it is the same order of magnitude. So, in order to get enough randomness here, a good rule would be to multiply this 1,000 by 10. So maybe with n equals 10,000 you would feel confident enough, in that specific case, to use the kind of approximation that we have used here.

And finally, for those interested, you can study this through theory. There is an entire field that tries to study how fast a Markov chain converges to steady state, and the so-called mixing time. It turns out that for these Markov chains, you can say that the rate of convergence is of the order of c to the power n, where c is a number between 0 and 1. The closer c is to 1, the slower the convergence, and the closer c is to 0, the faster the convergence. For example, here, for our initial case, c was 0.3; that was for the first chain.
For the second chain, with these kinds of probabilities, 0.999 and 0.998, c would be 0.997.
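For a two-state chain this c has a simple closed form: the transition matrix has eigenvalues 1 and p 1 1 plus p 2 2 minus 1, and the second eigenvalue is the rate. A one-line check (function name is mine):

```python
# Eigenvalues of [[p11, 1-p11], [1-p22, p22]] are 1 and c = p11 + p22 - 1
# (trace = 1 + c), and |r11(n) - pi1| decays like |c| ** n.
def mixing_rate(p11, p22):
    return p11 + p22 - 1

print(mixing_rate(0.5, 0.8))      # 0.3 for the fast chain (up to float rounding)
print(mixing_rate(0.999, 0.998))  # 0.997 for the slow chain
```

This matches the two values quoted in the lecture: 0.3 for the fast chain and 0.997 for the slow one.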