1 00:00:06,279 --> 00:00:07,380 PROFESSOR: Hi, everyone. 2 00:00:07,380 --> 00:00:08,520 Welcome back. 3 00:00:08,520 --> 00:00:12,390 So today, I'd like to tackle a problem in Markov matrices. 4 00:00:12,390 --> 00:00:15,220 Specifically, we're going to start with this problem which 5 00:00:15,220 --> 00:00:17,540 almost has a physics origin. 6 00:00:17,540 --> 00:00:21,240 If we have a particle that jumps between positions A and B 7 00:00:21,240 --> 00:00:23,470 with the following probabilities-- 8 00:00:23,470 --> 00:00:25,910 I'll just state it-- if it starts at A and jumps 9 00:00:25,910 --> 00:00:29,110 to B with probability 0.4 or starts at A 10 00:00:29,110 --> 00:00:34,300 and stays at A with probability 0.6, or if it starts at B 11 00:00:34,300 --> 00:00:36,870 then it goes to A with probability 0.2 12 00:00:36,870 --> 00:00:40,010 or stays at B with probability 0.8, 13 00:00:40,010 --> 00:00:42,660 we'd like to know the evolution of the probability 14 00:00:42,660 --> 00:00:45,690 of this particle over a long period of time. 15 00:00:45,690 --> 00:00:47,110 So specifically the problem we're 16 00:00:47,110 --> 00:00:49,830 interested today is: if we have a particle 17 00:00:49,830 --> 00:00:52,040 and we know that it starts at position A, 18 00:00:52,040 --> 00:00:53,940 what is the probability that it is 19 00:00:53,940 --> 00:00:56,150 at position A and the probability 20 00:00:56,150 --> 00:01:01,890 that it's at position B after one step, after n steps, 21 00:01:01,890 --> 00:01:04,112 and then finally after an infinite number of steps? 22 00:01:04,112 --> 00:01:06,320 So I'll let you think about this problem for a moment 23 00:01:06,320 --> 00:01:07,028 and I'll be back. 24 00:01:17,460 --> 00:01:18,580 Hi everyone. 25 00:01:18,580 --> 00:01:20,370 Welcome back. 26 00:01:20,370 --> 00:01:23,630 So the main difficulty with this problem 27 00:01:23,630 --> 00:01:25,646 is that it's phrased as a physics problem. 28 00:01:25,646 --> 00:01:28,020 And we need to convert it into some mathematical language 29 00:01:28,020 --> 00:01:29,899 to get a handle on it. 30 00:01:29,899 --> 00:01:31,440 So specifically, what we'd like to do 31 00:01:31,440 --> 00:01:35,670 is to convert this into a matrix formalism. 32 00:01:35,670 --> 00:01:41,120 So what we can do is we can write this little graph down 33 00:01:41,120 --> 00:01:44,990 and describe everything in this graph using a matrix. 34 00:01:44,990 --> 00:01:47,860 So I'm going to call this matrix A, 35 00:01:47,860 --> 00:01:52,700 and I'm going to associate the first row of A 36 00:01:52,700 --> 00:01:58,580 with particle position A and particle position B. 37 00:01:58,580 --> 00:02:06,320 And I'll associate the first and second columns 38 00:02:06,320 --> 00:02:08,990 with particles positions A and B. 39 00:02:08,990 --> 00:02:10,490 And then what I'm going to do is I'm 40 00:02:10,490 --> 00:02:14,040 going to fill in this matrix with the probability 41 00:02:14,040 --> 00:02:15,270 distributions. 42 00:02:15,270 --> 00:02:18,490 So, specifically what's going to go in this top left corner? 43 00:02:18,490 --> 00:02:24,350 Well, the number 0.6, which represents the probability 44 00:02:24,350 --> 00:02:26,870 that I stay at position A, given that I 45 00:02:26,870 --> 00:02:30,780 start at position A. What's going to go here 46 00:02:30,780 --> 00:02:33,060 in the bottom left-hand corner? 47 00:02:33,060 --> 00:02:39,890 Well, we're going to put 0.4, because this is the probability 48 00:02:39,890 --> 00:02:45,230 that I wind up at B, given that I start at A. And then lastly, 49 00:02:45,230 --> 00:02:50,490 we'll fill in these other two columns or the second column 50 00:02:50,490 --> 00:02:56,690 with 0.8 and 0.2. 51 00:02:56,690 --> 00:02:58,340 So I'll just state briefly this is 52 00:02:58,340 --> 00:03:00,210 what's called a Markov matrix. 53 00:03:00,210 --> 00:03:03,600 And it's called Markov, because first off, every element 54 00:03:03,600 --> 00:03:06,460 is positive or 0. 55 00:03:06,460 --> 00:03:11,840 And secondly, the sum of the elements in each column is 1. 56 00:03:11,840 --> 00:03:16,520 So if we note 0.4 plus 0.6 is 1, 0.8 plus 0.2 is 1. 57 00:03:16,520 --> 00:03:18,490 And these matrices come up all the time 58 00:03:18,490 --> 00:03:20,890 when we're talking about probabilities and the evolution 59 00:03:20,890 --> 00:03:23,530 of probability distributions. 60 00:03:23,530 --> 00:03:24,110 OK. 61 00:03:24,110 --> 00:03:28,250 So now, once we've encoded this graph using this matrix A, 62 00:03:28,250 --> 00:03:30,480 we now want to tackle this problem. 63 00:03:30,480 --> 00:03:35,394 So I'm going to introduce the vector p, 64 00:03:35,394 --> 00:03:36,810 and I'm going to use a subscript 0 65 00:03:36,810 --> 00:03:41,630 is to denote the probability that the particle is at time 0. 66 00:03:41,630 --> 00:03:50,400 So we're told that the particle starts at position A. 67 00:03:50,400 --> 00:03:55,190 So at time 0, I'm going to use the vector [1, 0]. 68 00:03:55,190 --> 00:04:03,810 Again, I'm going to match the top component of this vector 69 00:04:03,810 --> 00:04:06,310 with the top component of this matrix 70 00:04:06,310 --> 00:04:09,700 and the first column of this matrix. 71 00:04:09,700 --> 00:04:12,120 And then likewise, the second component of this vector 72 00:04:12,120 --> 00:04:17,480 with the second row and second column of this matrix. 73 00:04:17,480 --> 00:04:19,670 And we're interested in: how does 74 00:04:19,670 --> 00:04:26,620 this probability evolve as the particle takes many steps? 75 00:04:26,620 --> 00:04:35,610 So for one step, what's the probability of the particle 76 00:04:35,610 --> 00:04:36,770 going to be? 77 00:04:36,770 --> 00:04:41,150 Well, this is the beauty of introducing matrix notation. 78 00:04:41,150 --> 00:04:44,510 I'm going to denote p_1 to be the probability 79 00:04:44,510 --> 00:04:48,160 of the particle after one step. 80 00:04:48,160 --> 00:04:53,080 And we see that we can write this as the matrix A multiplied 81 00:04:53,080 --> 00:04:54,160 by p_0. 82 00:04:58,030 --> 00:05:07,570 So the answer is 0.6 and 0.4. 83 00:05:07,570 --> 00:05:09,680 And I achieve this just by multiplying this matrix 84 00:05:09,680 --> 00:05:11,291 by this vector. 85 00:05:11,291 --> 00:05:11,790 OK? 86 00:05:11,790 --> 00:05:13,050 So this concludes part one. 87 00:05:16,770 --> 00:05:19,240 Now part two is a little trickier. 88 00:05:19,240 --> 00:05:20,435 So part two is n steps. 89 00:05:26,510 --> 00:05:28,340 And to tackle this problem, we need 90 00:05:28,340 --> 00:05:30,750 to use a little more machinery. 91 00:05:30,750 --> 00:05:35,550 So first off, I'm going to note that p_1 is A times p_0. 92 00:05:38,620 --> 00:05:42,302 Likewise, p_2-- so this is the position 93 00:05:42,302 --> 00:05:44,510 of the-- the probability distribution of the particle 94 00:05:44,510 --> 00:05:45,940 after two steps. 95 00:05:45,940 --> 00:05:52,860 This is A times p_0, which is A squared times p_0. 96 00:05:52,860 --> 00:05:55,000 And we note that there's a general trend. 97 00:05:55,000 --> 00:06:01,070 After n steps-- so P_n-- the general trend 98 00:06:01,070 --> 00:06:05,970 is, it's going to be this matrix A raised to the n-th power, 99 00:06:05,970 --> 00:06:09,230 multiply the vector P0. 100 00:06:09,230 --> 00:06:11,920 So how do we take the n-th power of a matrix? 101 00:06:11,920 --> 00:06:16,760 Well, this is where we use eigenvectors and eigenvalues. 102 00:06:16,760 --> 00:06:27,320 So recall, that we can take any matrix A that's diagonalizable 103 00:06:27,320 --> 00:06:35,360 and write it as U D U inverse, where D is a diagonal matrix 104 00:06:35,360 --> 00:06:39,180 and this matrix U is a matrix whose columns correspond 105 00:06:39,180 --> 00:06:41,242 to the eigenvectors of A. 106 00:06:41,242 --> 00:06:42,700 So for this problem, I'm just going 107 00:06:42,700 --> 00:06:44,830 to state what the eigenvalues and eigenvectors are. 108 00:06:44,830 --> 00:06:48,050 And I'll let you work them out. 109 00:06:48,050 --> 00:06:52,800 So because it's a Markov matrix, we always 110 00:06:52,800 --> 00:06:55,780 have an eigenvalue which is 1. 111 00:06:55,780 --> 00:07:01,130 And in this case, we have an eigenvector u which is 1 and 2. 112 00:07:04,160 --> 00:07:11,440 In addition, the second eigenvalue is 0.4. 113 00:07:11,440 --> 00:07:14,520 And the eigenvector corresponding to this one 114 00:07:14,520 --> 00:07:17,070 is [1, -1]. 115 00:07:17,070 --> 00:07:19,750 And I'll just call these u_1 and u_2, like that. 116 00:07:30,670 --> 00:07:37,420 OK, we can now write this big matrix U as 1, 2; 1, -1. 117 00:07:41,580 --> 00:07:44,452 D is going to be-- now I have to match things up. 118 00:07:44,452 --> 00:07:46,160 If I'm going to put the first eigenvector 119 00:07:46,160 --> 00:07:51,750 in the first column, we have to stick 1 in the first column 120 00:07:51,750 --> 00:07:57,620 as well and then 0.4 like this. 121 00:07:57,620 --> 00:08:01,410 And then lastly, we also have U inverse which I can just 122 00:08:01,410 --> 00:08:16,670 work out to be minus 1/3, one over the determinant, 123 00:08:16,670 --> 00:08:29,740 times -1, -1; -2, and 1, which simplifies to this. 124 00:08:29,740 --> 00:08:40,860 OK, so now if we take A and raise it to the power of n, 125 00:08:40,860 --> 00:08:43,809 we have this nice identity that all the U and U inverses 126 00:08:43,809 --> 00:08:46,040 collapse in the middle. 127 00:08:46,040 --> 00:08:51,900 And we're left with U, D to the n, U inverse, p_0. 128 00:08:54,460 --> 00:08:57,810 Now raising the a diagonal matrix to the power of n 129 00:08:57,810 --> 00:08:59,400 is a relatively simple thing to do. 130 00:08:59,400 --> 00:09:03,655 We just take the eigenvalues and raise them to the power of n. 131 00:09:03,655 --> 00:09:05,780 So when we compute this product, there's a question 132 00:09:05,780 --> 00:09:07,720 of what order do we do things? 133 00:09:07,720 --> 00:09:09,930 Now these are 2 by 2 matrices, so in theory we 134 00:09:09,930 --> 00:09:11,830 could just multiply out, 2 by 2 matrix, 2 135 00:09:11,830 --> 00:09:15,050 by 2 matrix, 2 by 2 matrix, and then on a vector which is a 2 136 00:09:15,050 --> 00:09:16,540 by 1 matrix. 137 00:09:16,540 --> 00:09:19,030 But if you're in a test and you're cramped for time, 138 00:09:19,030 --> 00:09:21,574 you want to do as little computations as possible. 139 00:09:21,574 --> 00:09:22,990 So what you want to do is you want 140 00:09:22,990 --> 00:09:28,170 to start on the right-hand side and then work backwards. 141 00:09:28,170 --> 00:09:40,170 So if we do this, we end up obtaining 1, 2, 142 00:09:40,170 --> 00:09:47,795 this is going to be to the power of n, 1/3, [1, 2]. 143 00:09:51,870 --> 00:09:53,520 OK, so for this last part, I'm just 144 00:09:53,520 --> 00:09:55,570 going to write down the final answer. 145 00:09:55,570 --> 00:09:59,890 And I'll let you work out the multiplication of matrices. 146 00:09:59,890 --> 00:10:15,650 So we have for p_n: 1/3, 2 times 0.4 to the n plus 1, -2 0.4 147 00:10:15,650 --> 00:10:21,430 to the n plus 2. 148 00:10:21,430 --> 00:10:26,160 And this is the final vector for p of n. 149 00:10:26,160 --> 00:10:27,980 So this finishes up Part 2. 150 00:10:27,980 --> 00:10:30,620 And then lastly, for Part 3, what 151 00:10:30,620 --> 00:10:33,930 happens when n goes to infinity? 152 00:10:33,930 --> 00:10:36,740 Well, we have the answer for any n. 153 00:10:36,740 --> 00:10:39,410 So we can just take the limit as n goes to infinity. 154 00:10:39,410 --> 00:10:41,950 Now, specifically as n goes to infinity, 155 00:10:41,950 --> 00:10:46,040 0.4 raised to some very large power vanishes. 156 00:10:46,040 --> 00:10:48,150 So these two terms drop off. 157 00:10:48,150 --> 00:10:52,550 And at the end of the day, we're left with p_infinity 158 00:10:52,550 --> 00:10:58,130 is 1/3 [1, 2]. 159 00:10:58,130 --> 00:10:59,240 OK? 160 00:10:59,240 --> 00:11:04,430 So just to recap, we started off with a particle starting at A, 161 00:11:04,430 --> 00:11:08,550 and then after a very long time, the particle 162 00:11:08,550 --> 00:11:11,560 winds up with a probability distribution 163 00:11:11,560 --> 00:11:16,030 which is 1/3, 1 and 2. 164 00:11:16,030 --> 00:11:20,290 And this is quite characteristic of Markov matrix chains. 165 00:11:20,290 --> 00:11:24,290 Specifically, we note that 1/3 * [1, 2] 166 00:11:24,290 --> 00:11:31,340 is a multiple of the eigenvector corresponding to eigenvalue 1. 167 00:11:31,340 --> 00:11:34,667 So even though the particle started at position A, 168 00:11:34,667 --> 00:11:36,250 after a long period of time, it tended 169 00:11:36,250 --> 00:11:40,130 to forget where it started and approached, diffused 170 00:11:40,130 --> 00:11:43,130 into this uniform distribution. 171 00:11:43,130 --> 00:11:43,860 OK. 172 00:11:43,860 --> 00:11:45,470 I'd like to finish up here. 173 00:11:45,470 --> 00:11:47,680 And I'll see you next time.