The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

SHAN-YUAN HO: OK. So today's lecture is going to be on finite-state Markov chains, and we're going to use the matrix approach. In the last lecture, we saw that a Markov chain can be represented as a directed graph or as a matrix. So the outline is: we will look at this transition matrix and its powers, and then we'll want to know whether P^n converges for very, very large n. Then we will extend this to ergodic Markov chains, ergodic unichains, and other finite-state Markov chains. Remember that in these Markov chains, the effect of the past on the future is totally summarized by the state. So we want to analyze the probabilities of properties of the sequence of these states. Whatever state you are in, all the past is totally summarized in that state.
And that's the only thing that affects the future. So an ergodic Markov chain is a Markov chain that has a single recurrent class and is aperiodic. The chain doesn't contain any transient states, and it doesn't contain any periodicity. An ergodic unichain is just an ergodic Markov chain, but with some transient states added in.

So the state X_n of the Markov chain at step n depends on the past only through the previous step. The probability that after n steps we are at state j, given the whole path, X_{n-1} = i and so forth back to X_0, is just the transition probability from state i to state j. So this means that we can write the joint probability of all the states we visit, X_0, X_1, all the way up to X_n, as a product of these transition probabilities. We can collect these transition probabilities in the transition probability matrix. In this example, we have a 6-state Markov chain. So if I want to go from, say, state 2 to state 1 in one step, it would just be P(2,1). If I want to go from state 6 to itself, that's the last entry, which is P(6,6).
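A lookup like that is easy to sketch in code. This is only an illustration: the entries below are invented for a small 3-state chain, not the 6-state chain on the slide.

```python
# A hypothetical 3-state transition matrix (entries invented for illustration).
# Row i lists the probabilities of leaving state i+1; each row sums to 1.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]

# With 0-based indexing, P[i][j] is the one-step probability of going
# from state i+1 to state j+1.  So "state 2 to state 1" is P[1][0]:
p_2_to_1 = P[1][0]

# Sanity check: every row is a probability distribution.
for row in P:
    assert abs(sum(row) - 1.0) < 1e-12
```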
So this is a probability transition matrix. Now, condition on the state at time 0 and define p_ij(n) to be the probability that we're in state j at the n-th step, given that we start with X_0 = i, and let's look at what happens when n = 2. In a 2-step transition from i to j, we need the probability that X_2 = j and X_1 = k, given X_0 = i. Remember, we started in state i. So the probability that X_2 = j given X_1 = k has to be multiplied by the probability that X_1 = k given X_0 = i, and we have to sum over all states k in order to get the total probability. So p_ij(2) is just the probability of i going to k times the probability of k going to j, summed over all states k: p_ij(2) = sum over k of p_ik p_kj. We notice that this sum over k of p_ik p_kj is just the ij term of the product of the transition matrix P with itself. So we represent this as P squared: we multiply the transition matrix by itself.
This gives us the 2-step transition matrix of this Markov chain. So if you want to go from i to j, you just look at the ij element of this matrix, and that gives you the probability of going from state i to state j in two steps.

For general n, we just iterate on this for successively larger n. To get from state i to state j in n steps, we take the probability that X_n = j given that the previous step X_{n-1} = k, times the probability that X_{n-1} = k given X_0 = i, summing over all k. So we broke the n-step transition up: at step n minus 1, we visit some state k, and then we multiply by the one-step transition from k to j, because we want to arrive at j starting from i. But again, we have to sum over all the k's in order to get the probability from i to j in n steps. So P^n, this representation, is just the transition matrix multiplied by itself n times, and its ij entry gives you the n-step transition probabilities of this Markov chain. Computationally, what you do is take P, P squared, P to the fourth.
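That computation, squaring P repeatedly and reusing the intermediate powers, can be sketched in plain Python. This is only a sketch; the two-state matrix here is just an illustrative stand-in.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, n):
    """P to the n-th power by repeated squaring: P, P^2, P^4, ..."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]           # identity matrix
    base = [row[:] for row in P]
    while n > 0:
        if n & 1:                             # this power of two is needed
            result = mat_mul(result, base)
        base = mat_mul(base, base)            # P -> P^2 -> P^4 -> ...
        n >>= 1
    return result

# Illustrative two-state chain.
P = [[0.25, 0.75],
     [0.75, 0.25]]

# [P^2]_{ij} = sum_k P_ik * P_kj, the two-step transition probabilities.
P2 = mat_pow(P, 2)   # P2[0][1] = (1/4)(3/4) + (3/4)(1/4) = 3/8
```

Asking for the 9th power then reuses the accumulated 8th power times one extra factor of P, which is the shortcut described in the lecture.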
If you wanted P to the 9th, you'd just take P to the 8th multiplied by P. This gives us the Chapman-Kolmogorov equations, which say that when we want to go from state i to state j, we can go through an intermediate state and sum over all the possible intermediate states. So if the transition takes m plus n steps, we can break it up into m steps and n steps: it's the probability of going from i to k in exactly m steps, times the probability of going from k to j in n steps, summed over all the states k it can visit on its way from i to j. So this is a very useful identity with which we can manipulate our transition probabilities when we get to higher powers n.

Now, the convergence of P to the n. A very important question we'd like to ask is whether this goes to a limit as n goes to infinity. In other words, does the initial state matter in this Markov chain? The Markov chain is going to run for a long, long, long time. At the n-th step, where n is very large, is the probability going to depend on i? Or is it going to depend on n, which is the number of steps?
If it goes to some limit, then it won't depend on those. So let's assume that this limit exists. If this limit does exist, we can multiply it by p_jk and sum over all j. So we go from j to k on both sides, and we sum over all j. Then we take the limit. We notice that the left side, the n plus 1 step probability of going from i to k, just goes to the limit at state k, because we assumed up here that the limit exists for all i and all j. So if we take n going to infinity of the n plus 1 step probability from i to k, it has to go to pi_k. When we do this, we can simplify the equation up here. So if the limit exists, we have this pi_k for all the states in the Markov chain, and it's just a vector: pi_k is equal to pi_j times the transition probability from j to k, summed over all j. So if you have an M-state Markov chain, you have exactly M of these equations.
And we'll call this the vector pi, which consists of each element of this equation, if the limit exists. But we don't know whether it does or not, at this point in time. So if it does exist, what's going to happen? That means I'm going to multiply this probability matrix, P times P times P times P, all the way. And if the limit exists, then the rows must all be identical. Because if the limit exists, then going from 1 to j, 2 to j, 3 to j, 4 to j should all be exactly the same. This is equivalent to saying that when I look at this large power, as n gets very, very large, if the limit exists, all the elements in each column should be exactly the same, and every row should be the same as well. If I look at a row, it is going to be this pi vector; all the rows should be the same. So we define this vector: if the limit exists, the limiting probability vector is this vector pi. We said it was an M-state Markov chain, so each pi_i is non-negative, and they obviously have to sum up to 1.
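A quick numerical sketch of this picture, on an invented ergodic two-state chain: raise P to a high power and watch all rows collapse to a single probability vector pi, which also satisfies the balance equations pi_k = sum over j of pi_j p_jk.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative ergodic two-state chain (entries invented).
P = [[0.25, 0.75],
     [0.75, 0.25]]

# Square repeatedly: after k squarings this holds P^(2^k).
Pn = [row[:] for row in P]
for _ in range(40):
    Pn = mat_mul(Pn, Pn)

pi = Pn[0]   # once the limit exists, every row is the same vector
assert all(abs(Pn[1][j] - pi[j]) < 1e-9 for j in range(2))

# pi satisfies the steady-state equations pi_k = sum_j pi_j * P[j][k].
for k in range(2):
    assert abs(sum(pi[j] * P[j][k] for j in range(2)) - pi[k]) < 1e-9
```

For this particular chain the rows settle at (1/2, 1/2); whether such a limit exists in general is exactly the question at issue here.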
So this is what we call a probability vector, and it's called the steady-state vector for this transition matrix P, if it exists. This limit is easy to study. Later in the course, we will study this steady-state vector, satisfying pi = pi P, for various Markov chains. And you'll see it is quite interesting; many things come out of it. Now, notice that the solution may not be unique. It's very possible that these equations have more than one probability vector solution. And just because a solution exists, that doesn't mean the limit exists. So we have to prove that the limit exists, first.

So for an ergodic Markov chain, another way to express that this matrix converges is that, in the limit, the elements in each column are all the same for every i. So we have this theorem, and today's lecture is going to be entirely about this theorem.
This theorem says the following. Suppose you have an ergodic finite-state Markov chain. When we say "ergodic," remember, it means that there's only one class, every single state is recurrent, you have no transient states, and you have no periodicity; it's an aperiodic chain. Then for each j, the maximum over the starting state i of p_ij(n), the probability of reaching j from i in n steps, is non-increasing in n. In other words, this quantity is maximized over all initial states i: it doesn't matter what state you start in, I take the best one. And if I increase n and take that maximum again, it does not go up. Similarly, the minimum over starting states is non-decreasing in n. So as n gets larger and larger, the maximum of p_ij(n), from the most favorable starting state, is non-increasing, and the minimum, from the least favorable starting state, is non-decreasing. And we're wondering whether this limit is going to converge or not.
The theorem says that for an ergodic finite-state Markov chain, this limit actually does converge. In other words, the lim sup equals the lim inf, and both equal pi_j, the steady-state probability. And not only that, the convergence is exponential in n. So this is the theorem that we will prove today. The key to this theorem is that pair of statements: the maximum over starting states of p_ij(n) is non-increasing in n, and the minimum is non-decreasing in n. The proof is almost trivial, but let's see what happens.

First, an example. We have a probability transition matrix for a two-state chain, and the transitions are just one here and one here, with probability 1 each way. In this case we ask: what is the maximum probability of being in state 2 after n steps? Whether you reach state 2 at step n depends on where you start, and the two starting states trade roles as n alternates; the maximum over starting states is always 1 and the minimum is always 0. So the maximum is non-increasing and the minimum is non-decreasing, and those two bounds are met with equality. The second example is this.
We have a two-state chain again. But this time, from 1 to 2, we have a transition probability of 3/4, which means the self-loop at state 1 has probability 1/4. See, the minute we put a self-loop in here, it completely destroys the periodicity. Any Markov chain, you put a self-loop in it, and the periodicity is destroyed. So here we have 3/4 from 1 to 2, and likewise 3/4 from 2 back to 1, with a self-loop of 1/4 at state 2. All right. So in this one, let's look at the n-step probability of going from 1 to 2. Basically, we want to end up in state 2 in exactly n steps. When n = 1, what is the maximum? One option is that you start in state 1 and go to state 2. The other alternative is that you start in state 2 and stay in state 2, because we want to end at state 2 in exactly one step. So the maximum is going to be 3/4, and the minimum is going to be 1/4. Now n = 2: we want to end up in state 2 in two steps. What is the maximum going to be? Starting from state 1, you either stay at 1 for one step and then jump, or jump to 2 and then stay. Recall for n = 1, p_12(1) = 3/4.
So the probability from 1 to 2 in two steps is p_12(2) = (1/4)(3/4) + (3/4)(1/4), which is equal to 3/8. Right? OK. And then p_12(3), three transitions from 1 to 2, is equal to 9/16. Now for state 2: to transition from 2 to 2 in one step, p_22(1) = 1/4, the chain just stays by itself. For p_22(2), in two steps, you can go over to state 1 and come back, 3/4 times 3/4, which is 9/16. But the other thing is, I can also stay here both times, 1/4 times 1/4. So that gives me 9/16 + 1/16 = 5/8, and so forth. So basically, the sequence p_12(n) is going to oscillate: 3/4, 3/8, 9/16, and so forth. And p_22(n) is going to oscillate too: 1/4, 5/8, 7/16. What happens is this oscillation is going to converge; it's going to approach, actually, 1/2. So we take the maximum of these two, p_12(n) and p_22(n), because those are the ways we can end at state 2.
Taking the maximum at each n, we just look at these two numbers. For 3/4 and 1/4, if we want the maximum, it's going to be 3/4. For 3/8 and 5/8, the maximum is going to be 5/8. For 9/16 and 7/16, the 9/16 will be the maximum. And similarly, we compare them and take the minimum, and the minimum is 1/4, 3/8, and 7/16. So we see that the maximum starts high and then decreases toward 1/2, and the minimum starts low and then increases toward 1/2. So this is exactly the statement of the lemma.

Now let P be the transition matrix of an arbitrary finite-state Markov chain. We claim that for each j, the maximum over i of p_ij(n), the probability of reaching j from the most favorable starting state in n steps, is non-increasing in n, and the minimum is non-decreasing in n. So take n plus 1 steps from i to j. We're going to use that Chapman-Kolmogorov equation: we take the first step to some state k, and then we go from k to j in n steps, and we sum this over all k. But this p_kj(n), the probability of getting to j from k in n steps, I can just bound by its maximum.
So I take the maximum over starting states, the state that gives the largest n-step probability of reaching j, and I substitute it in. When I substitute it in, obviously every one of these terms is going to be less than or equal to that, so the whole sum is less than or equal. And now the maximum is just a constant, so when I sum p_ik over all k, the sum is 1, and only that maximum term remains: p_ij(n+1) is at most the maximum over l of p_lj(n). Therefore, what we know is that if I want to end up in state j, and I increase the number of steps from n to n plus 1, this maximum probability is going to stay the same or decrease. It's not going to increase. You can do exactly the same thing for the minimum. And if this holds for every starting state i, then of course, if I take the maximum over i of the left side, it's also going to be less than or equal to that. This is true for any Markov chain; it doesn't matter, it just has to be a finite-state Markov chain. So if I take the maximum of the n plus 1 step probability, it's less than or equal to the maximum at the n-th step.
So with n plus 1 steps, the best-case probability of ending up at state j is no larger than with n steps.

Now, before we complete the proof of this theorem, let's look at the case where P > 0. If we say P is greater than zero, this means that every entry in the matrix is greater than 0, for all i, j, which means that this graph is fully connected. You can get from i to j in one step with nonzero probability. So let P > 0 be the transition matrix. We'll prove this case first, and then we'll extend it to an arbitrary finite-state Markov chain. Let alpha be the minimum element in this transition matrix, that is, the smallest one-step transition probability. So alpha is the minimum transition probability. Then for all states i and j, and for n greater than or equal to 1, we have these three expressions.
So the first expression says this. If I look at walks of n plus 1 steps ending at j, I take the most probable starting state, maximizing over i, and I subtract the least probable, minimizing over the initial starting state i. This difference is less than or equal to exactly the same difference over n steps, times 1 minus 2 alpha:

max_i p_ij(n+1) - min_i p_ij(n+1) <= [max_i p_ij(n) - min_i p_ij(n)] (1 - 2 alpha).

Remember, alpha is the minimum transition probability in this probability transition matrix. So this one, it's not so obvious right now, but we are going to prove it in the next slide. Once we have this, we can iterate on n to get the second expression: the maximum n-step probability of reaching state j, minus the minimum n-step probability of reaching state j, is at most the same difference over n minus 1 steps times 1 minus 2 alpha. So we just keep on iterating this over n, and we get that the difference is at most (1 - 2 alpha)^n. So to prove this, we prove it by induction.
We just have to prove the initial step, that the maximum single transition probability into j minus the minimum single transition probability into j is less than or equal to 1 minus 2 alpha; the rest follows by induction. Now, as n goes to infinity, notice that this term (1 - 2 alpha)^n is going to go to 0, because alpha is at most 1/2: the smallest entry of a row that sums to 1 over two or more states cannot exceed 1/2. So if this is going to 0, it tells us the difference between the maximum and the minimum probabilities of ending up in state j goes to 0. So if we take the limit as n goes to infinity of both of these, they must be equal. And the difference, we notice, is going down exponentially in n. So this shows us that the limit indeed does exist, and the max and min are equal in the limit.

Now we want to prove that first statement over here. In order to prove this first statement, what we're going to do is take the i to j transition in n plus 1 steps, and then we're going to express it as a function of n-step transitions. So the idea is this.
425 00:22:11,150 --> 00:22:14,900 We're going to use the Chapman-Kolmogorov equations 426 00:22:14,900 --> 00:22:18,540 to have an intermediary step. 427 00:22:18,540 --> 00:22:24,310 So in order to do this i to j in n plus 1 steps, the most 428 00:22:24,310 --> 00:22:29,530 probable path, we're going to go to this intermediate step 429 00:22:29,530 --> 00:22:34,220 and then on to the final step. 430 00:22:34,220 --> 00:22:36,270 In this intermediate step, it's going to be 431 00:22:36,270 --> 00:22:36,970 a function of n. 432 00:22:36,970 --> 00:22:39,650 So we're going to take one step and then n more steps. 433 00:22:39,650 --> 00:22:43,310 So what we're going to do is, the intuition is, we're going 434 00:22:43,310 --> 00:22:47,890 to remove the least probable path. 435 00:22:47,890 --> 00:22:50,110 So we remove that from the sum in this 436 00:22:50,110 --> 00:22:52,780 Chapman-Kolmogorov equation. 437 00:22:52,780 --> 00:22:54,870 And then we have the sum of everything else 438 00:22:54,870 --> 00:22:56,010 except for that path. 439 00:22:56,010 --> 00:22:59,080 And then the sum of everything else, we're going to bound it. 440 00:22:59,080 --> 00:23:02,760 Once we bound it, then we have this expression. 441 00:23:02,760 --> 00:23:05,850 The probability of i to j in n plus 1 steps is going to be a 442 00:23:05,850 --> 00:23:09,050 function of a max and a min over n steps 443 00:23:09,050 --> 00:23:10,840 with a bunch of terms. 444 00:23:10,840 --> 00:23:14,720 So that's the intuition of how we're going to do it. 445 00:23:14,720 --> 00:23:17,750 So the probability of ij going from state i to state j in 446 00:23:17,750 --> 00:23:20,290 exactly n plus 1 steps is equal to this. 447 00:23:20,290 --> 00:23:23,100 So it's the probability of going from i to k, this 448 00:23:23,100 --> 00:23:23,550 intermediate step. 449 00:23:23,550 --> 00:23:26,840 We're going to take one step to a state k. 
450 00:23:26,840 --> 00:23:29,860 And then we're going from k to j in n steps, 451 00:23:29,860 --> 00:23:31,000 summing over all k. 452 00:23:31,000 --> 00:23:34,260 So this is exactly equal to this with Chapman-Kolmogorov. 453 00:23:34,260 --> 00:23:37,685 So now what happens is we're going to take-- 454 00:23:41,100 --> 00:23:43,660 Before we get to this next step, let's define this l min 455 00:23:43,660 --> 00:23:47,690 to be the state that minimizes p of ij, n over i. 456 00:23:47,690 --> 00:23:51,430 So l min is going to be the state such 457 00:23:51,430 --> 00:23:57,060 that, over the choices I pick for i, arriving at j in n steps 458 00:23:57,060 --> 00:23:59,670 is going to be the least probable. 459 00:23:59,670 --> 00:24:03,160 So this is l min over here. 460 00:24:03,160 --> 00:24:04,860 It's the l min that satisfies this. 461 00:24:04,860 --> 00:24:06,840 Then I'm going to remove this. 462 00:24:06,840 --> 00:24:09,670 So this is one state. l min is just one state that i is going 463 00:24:09,670 --> 00:24:12,690 to go to in this first step. 464 00:24:12,690 --> 00:24:15,440 So we're going to remove it from the sum. 465 00:24:15,440 --> 00:24:18,660 So then, this is just here. 466 00:24:18,660 --> 00:24:28,120 So that path goes from i to l min times l min to j in n steps. 467 00:24:28,120 --> 00:24:30,160 So remove that one path from here. 468 00:24:30,160 --> 00:24:33,610 Now we have the sum over the rest of the cases because we 469 00:24:33,610 --> 00:24:34,620 just removed that. 470 00:24:34,620 --> 00:24:38,890 So we have ik, kj to n, where k is not 471 00:24:38,890 --> 00:24:39,700 equal to that element. 472 00:24:39,700 --> 00:24:44,450 So we removed that path, the one that goes to that state. 473 00:24:44,450 --> 00:24:49,890 But p of kj, n, the path that goes from k to j in n steps, 474 00:24:49,890 --> 00:24:54,650 we can just bound this term by the maximum over l 475 00:24:54,650 --> 00:24:56,390 of the path from l to j in n steps. 
476 00:24:56,390 --> 00:24:58,640 So then we're going to take the most probable path in n 477 00:24:58,640 --> 00:25:02,720 steps such that we end up in state j in n. 478 00:25:02,720 --> 00:25:06,250 So this term right here is bounded by this term. 479 00:25:06,250 --> 00:25:08,150 Because it's bounded by this, that's why we have this less 480 00:25:08,150 --> 00:25:10,240 than or equal sign. 481 00:25:10,240 --> 00:25:13,670 So we just do two things from this step, the first step, to 482 00:25:13,670 --> 00:25:14,600 the second step. 483 00:25:14,600 --> 00:25:20,420 So we took out the path that's going to minimize that right 484 00:25:20,420 --> 00:25:24,670 at the j-th node in n steps. 485 00:25:24,670 --> 00:25:30,870 And then we bounded the rest of this sum by this. 486 00:25:30,870 --> 00:25:35,840 So when we sum this all up, this is just a constant here. 487 00:25:35,840 --> 00:25:41,800 And ik here is just all the states that i is going to 488 00:25:41,800 --> 00:25:45,330 visit except for this one state, l min. 489 00:25:45,330 --> 00:25:48,450 Since it's just all of them except for that, it's just 1 490 00:25:48,450 --> 00:25:52,730 minus the probability that it goes from state i to l min. 491 00:25:52,730 --> 00:25:57,580 So this sum here is just equal to this sum here. 492 00:25:57,580 --> 00:25:59,120 So this arrives here. 493 00:25:59,120 --> 00:26:02,860 And this term is still here. 494 00:26:02,860 --> 00:26:10,620 So going from here, what happens is we just 495 00:26:10,620 --> 00:26:12,420 rearrange the terms. 496 00:26:12,420 --> 00:26:13,440 So nothing happens right here. 497 00:26:13,440 --> 00:26:14,690 It's just rearranging. 498 00:26:17,560 --> 00:26:20,330 Now we have this term here. 
499 00:26:20,330 --> 00:26:23,620 So we look at this term, P from i going to l min-- 500 00:26:23,620 --> 00:26:27,970 Remember, we chose alpha to be the minimum single transition 501 00:26:27,970 --> 00:26:31,760 probability, single transition in that 502 00:26:31,760 --> 00:26:33,020 probability transition matrix. 503 00:26:33,020 --> 00:26:36,310 So i to l min has to be greater than or equal to that. 504 00:26:36,310 --> 00:26:39,050 But the minus of this, the negative, has 505 00:26:39,050 --> 00:26:39,800 to be less than or equal to minus alpha. 506 00:26:39,800 --> 00:26:42,320 So this we can substitute here. 507 00:26:46,670 --> 00:26:48,010 So now we have this. 508 00:26:48,010 --> 00:26:51,872 So the maximum over i of this n plus 1 step also 509 00:26:51,872 --> 00:26:52,760 satisfies this bound. 510 00:26:52,760 --> 00:26:56,190 Because this I can write as an n plus 1 step 511 00:26:56,190 --> 00:26:57,380 path from i to j. 512 00:26:57,380 --> 00:27:02,030 So if this is less than this entire term, of course I can 513 00:27:02,030 --> 00:27:05,740 write the maximum path from i to j. 514 00:27:05,740 --> 00:27:07,320 It also has to be less than this because this is 515 00:27:07,320 --> 00:27:10,900 satisfied for all i, j. 516 00:27:10,900 --> 00:27:15,570 So therefore, we arrive at this expression here. 517 00:27:15,570 --> 00:27:21,750 So now we're kind of in good business because we have the n 518 00:27:21,750 --> 00:27:24,110 plus 1 step transition, the maximum path from i to j 519 00:27:24,110 --> 00:27:26,750 in n plus 1 steps, as a function of n, which is what 520 00:27:26,750 --> 00:27:29,525 we wanted, and a function of this alpha. 521 00:27:34,430 --> 00:27:35,980 So we repeat that last statement. 522 00:27:38,490 --> 00:27:42,170 And the last one is here, the last line. 523 00:27:45,160 --> 00:27:46,140 So now we have the maximum. 524 00:27:46,140 --> 00:27:48,910 So now what we want to do is get the minimum. 
525 00:27:48,910 --> 00:27:52,750 So we do exactly the same thing, with the same proof. 526 00:27:52,750 --> 00:27:55,593 And with the minimum, what we're going to do is we're 527 00:27:55,593 --> 00:28:01,180 going to look at the ij transition in n plus 1 steps. 528 00:28:01,180 --> 00:28:03,180 And then what we're going to do is we're going to pull out 529 00:28:03,180 --> 00:28:04,540 the maximum this time. 530 00:28:04,540 --> 00:28:08,830 So we pull out the most probable path in n steps such 531 00:28:08,830 --> 00:28:11,250 that it arrives in state j. 532 00:28:11,250 --> 00:28:12,200 Then we play the same game. 533 00:28:12,200 --> 00:28:13,170 We bound everything-- 534 00:28:13,170 --> 00:28:16,830 below, this time-- by the minimum of the n step 535 00:28:16,830 --> 00:28:18,890 transition probabilities to get to j. 536 00:28:18,890 --> 00:28:24,440 So once we do that, we get this expression, very similar 537 00:28:24,440 --> 00:28:26,650 to this one up here. 538 00:28:26,650 --> 00:28:31,330 So now we have the maximum path, which is n plus 1 steps 539 00:28:31,330 --> 00:28:36,440 to j, and the minimum of n plus 1 steps to j. 540 00:28:36,440 --> 00:28:40,770 So we could take the difference between these two. 541 00:28:40,770 --> 00:28:44,000 So if you subtract these equations here, so this first 542 00:28:44,000 --> 00:28:48,010 equation minus the second equation, we have this on the 543 00:28:48,010 --> 00:28:53,170 right-hand side here and then these terms over here on the 544 00:28:53,170 --> 00:28:55,950 left-hand side. 545 00:28:55,950 --> 00:28:59,660 So these terms over here on the left-hand side exactly prove 546 00:28:59,660 --> 00:29:01,620 the first line of the lemma. 547 00:29:06,670 --> 00:29:13,110 So the first line of the lemma was here. 548 00:29:13,110 --> 00:29:15,860 So now, to prove the second line of the lemma, remember, we're 549 00:29:15,860 --> 00:29:17,330 going to prove this by induction. 
550 00:29:17,330 --> 00:29:20,900 In order to prove this by induction, we first need the 551 00:29:20,900 --> 00:29:23,320 initial step. 552 00:29:23,320 --> 00:29:24,670 So the initial step is this. 553 00:29:24,670 --> 00:29:30,660 So if I take the minimum transition probability from l 554 00:29:30,660 --> 00:29:33,230 to j, it has to be greater than or equal to alpha. 555 00:29:33,230 --> 00:29:35,670 Because we said that alpha was the absolute minimum of all 556 00:29:35,670 --> 00:29:38,510 the single-step transition probabilities. 557 00:29:38,510 --> 00:29:41,770 Then the maximum transition probability has to be less 558 00:29:41,770 --> 00:29:44,250 than or equal to 1 minus alpha. 559 00:29:44,250 --> 00:29:46,450 It's just by definition of what we choose. 560 00:29:46,450 --> 00:29:49,840 So therefore, if I take this term, the maximum minus the 561 00:29:49,840 --> 00:29:52,750 minimum is at most 1 minus 2 alpha. 562 00:29:52,750 --> 00:29:55,850 So that's your first step in the induction process. 563 00:29:55,850 --> 00:29:58,215 So we iterate on n. 564 00:29:58,215 --> 00:30:01,630 When we iterate on n, one arrives at this 565 00:30:01,630 --> 00:30:02,880 equation down here. 566 00:30:16,610 --> 00:30:24,760 So this shows us from here that if we take the limit as n 567 00:30:24,760 --> 00:30:27,360 goes to infinity of this term, this goes down 568 00:30:27,360 --> 00:30:29,880 exponentially in n. 569 00:30:29,880 --> 00:30:32,590 And both of these limits are going to converge, and they 570 00:30:32,590 --> 00:30:34,795 exist, and they're going to be greater than 0. 571 00:30:34,795 --> 00:30:37,690 So they'll be greater than 0 because from our initial state 572 00:30:37,690 --> 00:30:41,250 we can choose this path with positive probability. 573 00:30:41,250 --> 00:30:43,735 Yeah, go ahead. 
574 00:30:43,735 --> 00:30:46,847 AUDIENCE: It seems to me that alpha is the minimum, the 575 00:30:46,847 --> 00:30:48,934 smallest number in the transition matrix, right? 576 00:30:48,934 --> 00:30:51,250 SHAN-YUAN HO: Alpha is the smallest number, correct. 577 00:30:51,250 --> 00:30:51,665 AUDIENCE: Yeah. 578 00:30:51,665 --> 00:30:53,716 How does it follow from that? 579 00:30:53,716 --> 00:31:01,460 So my question is, the convergence rate is related to alpha? 580 00:31:01,460 --> 00:31:03,890 SHAN-YUAN HO: Yes, it is, yeah. 581 00:31:03,890 --> 00:31:06,570 In general, it doesn't really matter because it's still 582 00:31:06,570 --> 00:31:09,250 going to go down exponentially in n. 583 00:31:09,250 --> 00:31:12,640 But it does depend on that alpha, yes. 584 00:31:16,570 --> 00:31:17,820 Any other questions? 585 00:31:22,110 --> 00:31:23,050 Yes. 586 00:31:23,050 --> 00:31:25,515 AUDIENCE: Is the strength of that bound proportional to the 587 00:31:25,515 --> 00:31:27,820 size of that matrix? 588 00:31:27,820 --> 00:31:28,470 SHAN-YUAN HO: Excuse me? 589 00:31:28,470 --> 00:31:29,800 AUDIENCE: The strength of that bound is 590 00:31:29,800 --> 00:31:31,074 proportional to the size? 591 00:31:31,074 --> 00:31:33,490 I mean, for a very large finite-state Markov chain, the 592 00:31:33,490 --> 00:31:35,210 strength of the bound is going to be somewhat weak because 593 00:31:35,210 --> 00:31:37,190 alpha is going to be-- 594 00:31:37,190 --> 00:31:39,300 SHAN-YUAN HO: Alpha has to be less than 1/2. 595 00:31:39,300 --> 00:31:40,060 AUDIENCE: OK, yes. 596 00:31:40,060 --> 00:31:43,974 But the strength of the bound, though, it's not a very tight 597 00:31:43,974 --> 00:31:47,902 bound on max minus min. 598 00:31:47,902 --> 00:31:50,400 Because in a large-- 599 00:31:50,400 --> 00:31:50,660 SHAN-YUAN HO: Yes. 600 00:31:50,660 --> 00:31:52,760 This is just a bound. 
601 00:31:52,760 --> 00:31:54,940 And the bound is from when we took that 602 00:31:54,940 --> 00:31:59,380 minimum-probability path, the l min, remember? 603 00:31:59,380 --> 00:32:02,050 The bound was actually in here. 604 00:32:02,050 --> 00:32:04,070 So we took the minimum-probability path in n 605 00:32:04,070 --> 00:32:09,620 steps, this l min that minimizes this over i. 606 00:32:09,620 --> 00:32:11,910 And then this is where this less than or equal to here is 607 00:32:11,910 --> 00:32:13,160 just a substitution. 608 00:32:18,250 --> 00:32:19,500 Any other questions? 609 00:32:23,810 --> 00:32:27,750 So what we know is that these 610 00:32:27,750 --> 00:32:29,820 limiting-state probabilities exist. 611 00:32:29,820 --> 00:32:34,095 So we have a finite ergodic chain. 612 00:32:37,310 --> 00:32:41,950 So if the elements in this transition 613 00:32:41,950 --> 00:32:43,630 matrix are all greater than 0, we know 614 00:32:43,630 --> 00:32:44,990 that this limit exists. 615 00:32:44,990 --> 00:32:47,950 But we know that in general, that may not be the case. 616 00:32:47,950 --> 00:32:50,415 We're going to have some 0's in our transition matrix. 617 00:32:54,570 --> 00:32:57,240 So let's go back to the arbitrary finite-state ergodic 618 00:32:57,240 --> 00:33:03,740 chain with probability transition matrix P. So in the 619 00:33:03,740 --> 00:33:08,610 last slide, we showed that this transition matrix P of h 620 00:33:08,610 --> 00:33:13,560 is positive for h equal to M minus 1, squared, plus 1. 621 00:33:13,560 --> 00:33:19,130 So what we do is, we can apply lemma 2 to P of h with this 622 00:33:19,130 --> 00:33:22,025 alpha equal to the minimum probability of going from i to j 623 00:33:22,025 --> 00:33:23,275 in exactly h steps. 624 00:33:26,340 --> 00:33:29,850 So why is this M minus 1 squared plus 1? 625 00:33:29,850 --> 00:33:33,020 So in the last lecture-- 626 00:33:33,020 --> 00:33:34,490 so what it means is this. 
627 00:33:37,670 --> 00:33:39,020 So what it says is here. 628 00:33:39,020 --> 00:33:41,075 This was an example given in the last lecture. 629 00:33:41,075 --> 00:33:43,430 It was a 6-state Markov chain. 630 00:33:43,430 --> 00:33:48,020 So what it says is that if n is greater than or equal to M 631 00:33:48,020 --> 00:33:50,510 minus 1 squared plus 1-- in this case, M is going to be 6. 632 00:33:50,510 --> 00:33:56,570 So if n is greater than or equal to 26, then if I take P to 633 00:33:56,570 --> 00:33:59,000 the 26th power, it's greater than zero. 634 00:33:59,000 --> 00:34:02,350 That means if I take P to the 26th power, every single 635 00:34:02,350 --> 00:34:06,590 element in this transition matrix is going to be 636 00:34:06,590 --> 00:34:14,260 non-zero, which means that you can go from any state to any 637 00:34:14,260 --> 00:34:17,389 state with nonzero probability, as long as n is 638 00:34:17,389 --> 00:34:18,290 bigger than that. 639 00:34:18,290 --> 00:34:20,889 So basically, in this Markov chain, if you go 640 00:34:20,889 --> 00:34:22,190 long enough-- 641 00:34:22,190 --> 00:34:25,905 Then I say, OK, I want to go from state i to state j in 642 00:34:25,905 --> 00:34:28,469 exactly that many steps, and there is a positive probability that 643 00:34:28,469 --> 00:34:31,600 this is going to happen. 644 00:34:31,600 --> 00:34:34,380 So where did this bound come from? 645 00:34:34,380 --> 00:34:44,980 Well, for instance, in this chain, if we look at P1,1 that 646 00:34:44,980 --> 00:34:46,340 we have here. 647 00:34:46,340 --> 00:34:50,449 So I'm going to look at the transition 648 00:34:50,449 --> 00:34:51,940 starting at state 1. 649 00:34:51,940 --> 00:34:53,600 And I want to come back to 1. 650 00:34:53,600 --> 00:34:56,860 So you definitely could come back in 6, because these are 651 00:34:56,860 --> 00:34:59,300 all positive probability transitions. 652 00:34:59,300 --> 00:35:00,350 So 6 is possible. 
653 00:35:00,350 --> 00:35:02,160 So n is equal to 6 is possible. 654 00:35:02,160 --> 00:35:03,680 So what's the next one that's possible? n is 655 00:35:03,680 --> 00:35:05,720 equal to 11, right? 656 00:35:05,720 --> 00:35:09,100 Then the next one is what? 657 00:35:09,100 --> 00:35:11,690 16 is possible, right? 658 00:35:11,690 --> 00:35:15,100 So 0 to 5 is impossible, is 0. 659 00:35:15,100 --> 00:35:19,560 So if I pick n between 0 and 5, and 7 and 10, you're toast. 660 00:35:19,560 --> 00:35:22,750 You can't get back to 1. 661 00:35:22,750 --> 00:35:23,840 And so forth. 662 00:35:23,840 --> 00:35:25,300 So 18 is possible. 663 00:35:29,250 --> 00:35:30,410 21-- 664 00:35:30,410 --> 00:35:32,480 let's see, is 17 possible? 665 00:35:32,480 --> 00:35:33,740 Yeah, 17 is also possible. 666 00:35:37,040 --> 00:35:38,750 AUDIENCE: Why is 16 possible? 667 00:35:38,750 --> 00:35:41,630 SHAN-YUAN HO: So I go around here twice, and 668 00:35:41,630 --> 00:35:42,880 then the last one. 669 00:35:46,060 --> 00:35:48,810 Is that right? 670 00:35:48,810 --> 00:35:52,160 So if I go from here to here to here to here, if I go 671 00:35:52,160 --> 00:35:54,696 twice, and then one more in the final loop. 672 00:35:54,696 --> 00:35:55,550 AUDIENCE: That's 12. 673 00:35:55,550 --> 00:35:57,340 SHAN-YUAN HO: Oh, it's 12? 674 00:35:57,340 --> 00:35:57,870 No. 675 00:35:57,870 --> 00:36:01,640 I'm going to go this inner loop right here. 676 00:36:01,640 --> 00:36:07,234 So if I go from 1 to 2 to 3 to 4 to 5 to 6, down to 2. 677 00:36:07,234 --> 00:36:10,130 Then I go 3, 4, 5, 6, 1. 678 00:36:10,130 --> 00:36:11,380 That's 11, isn't it? 679 00:36:14,300 --> 00:36:17,090 So 16 is I'm going to go around the inner loop twice. 680 00:36:19,630 --> 00:36:19,870 OK. 681 00:36:19,870 --> 00:36:20,610 Go ahead. 682 00:36:20,610 --> 00:36:20,980 Question? 683 00:36:20,980 --> 00:36:23,670 AUDIENCE: So everything 20 and under is possible, right? 684 00:36:23,670 --> 00:36:24,670 SHAN-YUAN HO: No. 
685 00:36:24,670 --> 00:36:25,730 Is 25 possible? 686 00:36:25,730 --> 00:36:28,092 Tell me how you're going to go 25 on this. 687 00:36:28,092 --> 00:36:30,760 AUDIENCE: You just do the 5 loop 5 times. 688 00:36:30,760 --> 00:36:32,140 SHAN-YUAN HO: Yeah, but I want to go from 1 to 1. 689 00:36:32,140 --> 00:36:33,240 You're starting in state 1. 690 00:36:33,240 --> 00:36:33,825 AUDIENCE: Oh, oh, sorry. 691 00:36:33,825 --> 00:36:34,150 OK. 692 00:36:34,150 --> 00:36:35,662 SHAN-YUAN HO: 1 to 1, right? 693 00:36:35,662 --> 00:36:36,594 AUDIENCE: OK, cool. 694 00:36:36,594 --> 00:36:37,844 OK, I see. 695 00:36:40,030 --> 00:36:42,230 SHAN-YUAN HO: So you know for this one that this bound is 696 00:36:42,230 --> 00:36:44,160 actually tight. 697 00:36:44,160 --> 00:36:46,880 So 25 is impossible. 698 00:36:46,880 --> 00:36:50,520 So P1,1 of 25 is equal to 0. 699 00:36:50,520 --> 00:36:51,920 There's no way you can do that. 700 00:36:51,920 --> 00:36:54,850 But for 26 on, then you can. 701 00:36:54,850 --> 00:36:58,090 So what you're noticing is that you need this loop of 6 702 00:36:58,090 --> 00:37:02,830 here and that any combination of 5's and 6's is possible. 703 00:37:02,830 --> 00:37:08,310 So basically, in this particular example, if n is 704 00:37:08,310 --> 00:37:18,710 equal to 6k plus 5j, where k is greater than or equal to 705 00:37:18,710 --> 00:37:21,020 1-- because I need that final loop to get back-- 706 00:37:21,020 --> 00:37:23,350 and j is greater than or equal to 0-- 707 00:37:23,350 --> 00:37:28,730 So for any combination like this, I can express n. 708 00:37:28,730 --> 00:37:31,140 I can go around it to give me a positive probability of 709 00:37:31,140 --> 00:37:33,660 going from state 1 to state 1. 710 00:37:33,660 --> 00:37:38,910 So I'm going to prove this using an extremal property. 711 00:37:38,910 --> 00:37:41,330 So we're going to take the absolute worst case. 
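The counting claim in this example can be verified directly; a quick sketch in Python (not part of the lecture; the helper name is made up):

```python
# Check of the example's claim: returns from state 1 to state 1 take
# n = 6k + 5j steps with k >= 1 (the final loop of length 6) and
# j >= 0.  The largest n NOT of this form should be (M-1)^2 = 25, so
# P1,1 of n is positive for every n >= (M-1)^2 + 1 = 26.

def representable(n, a=6, b=5):
    # Can n be written as a*k + b*j with k >= 1 and j >= 0?
    return any((n - a * k) % b == 0
               for k in range(1, n // a + 1))

impossible = [n for n in range(1, 100) if not representable(n)]
assert max(impossible) == 25                      # = (6 - 1) ** 2
assert all(representable(n) for n in range(26, 500))
```

So 25 really is the last impossible return time, matching the M minus 1, squared, plus 1 bound.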
712 00:37:41,330 --> 00:37:46,970 So the absolute worst case for an M-state finite Markov 713 00:37:46,970 --> 00:37:49,540 chain is if you have a loop of M and you have a 714 00:37:49,540 --> 00:37:51,005 loop of M minus 1. 715 00:37:51,005 --> 00:37:52,460 You can't just have a loop of M. 716 00:37:52,460 --> 00:37:53,950 The problem is that then this becomes periodic. 717 00:37:56,580 --> 00:38:00,680 So we have to get rid of the periodicity. 718 00:38:00,680 --> 00:38:03,390 If you add a single loop here, that doesn't help you. 719 00:38:03,390 --> 00:38:05,590 Then after 6, I get 7, 8, 9, 10, 11, 12. 720 00:38:05,590 --> 00:38:08,930 That doesn't give this. 721 00:38:08,930 --> 00:38:11,880 So the absolute worst case for an M state chain is going to 722 00:38:11,880 --> 00:38:13,430 be something that looks like this. 723 00:38:13,430 --> 00:38:16,550 1 that goes to 2-- you're forced to go to 2-- 724 00:38:16,550 --> 00:38:20,840 so forth, until state M. And then this M is going to go 725 00:38:20,840 --> 00:38:23,830 back to 2 or is going to go back to 1. 726 00:38:23,830 --> 00:38:31,290 So in other words, the worst case is if you have-- 727 00:38:31,290 --> 00:38:39,210 n has to be some combination Mk plus (M minus 1)j. 728 00:38:39,210 --> 00:38:43,430 So this will be the worst possible case for M state 729 00:38:43,430 --> 00:38:44,220 Markov chain. 730 00:38:44,220 --> 00:38:50,480 So it'll be Mk plus (M minus 1)j. 731 00:38:50,480 --> 00:38:52,570 So k has to be greater than or equal to 1. 732 00:38:52,570 --> 00:38:55,840 And then j has to be greater than or equal to 0, because 733 00:38:55,840 --> 00:38:56,950 you need to come back. 734 00:38:56,950 --> 00:39:00,000 So I'm just looking at the probability that I start 735 00:39:00,000 --> 00:39:02,500 in state 1 and I come back to state 1. 736 00:39:02,500 --> 00:39:04,670 So all right. 737 00:39:04,670 --> 00:39:09,770 So how do we get this bound? 
738 00:39:09,770 --> 00:39:14,860 Well, there is an identity that says this. 739 00:39:14,860 --> 00:39:29,370 If a and b are relatively prime, then the largest n such 740 00:39:29,370 --> 00:39:32,750 that it cannot be written-- so we want to find the largest n 741 00:39:32,750 --> 00:39:42,430 such that ak plus bj-- 742 00:39:42,430 --> 00:39:48,860 but this is k and j greater than or equal to 0-- 743 00:39:48,860 --> 00:39:52,260 that it cannot be written in this form. 744 00:39:55,860 --> 00:39:58,820 The largest integer that cannot be so written is ab 745 00:39:58,820 --> 00:40:00,680 minus a minus b. 746 00:40:00,680 --> 00:40:02,960 This takes a little bit to prove, but it's not too hard. 747 00:40:02,960 --> 00:40:05,480 If you want to know this proof, come see me offline 748 00:40:05,480 --> 00:40:09,090 after class. 749 00:40:09,090 --> 00:40:10,170 This is the largest integer. 750 00:40:10,170 --> 00:40:13,160 If n is equal to this, it cannot be 751 00:40:13,160 --> 00:40:14,310 written in this form. 752 00:40:14,310 --> 00:40:17,690 But if n is greater than this, then it can. 753 00:40:17,690 --> 00:40:21,680 So all we do is substitute M for a and M minus 1 for b 754 00:40:21,680 --> 00:40:24,440 because M and M minus 1 are relatively prime. 755 00:40:24,440 --> 00:40:29,460 But remember, we have a k here that has to be greater than or 756 00:40:29,460 --> 00:40:29,980 equal to 1. 757 00:40:29,980 --> 00:40:31,330 We need at least one k. 758 00:40:31,330 --> 00:40:34,690 But this identity is for k and j greater than or equal to 0. 759 00:40:34,690 --> 00:40:39,420 So therefore, we have to subtract out that k. 760 00:40:39,420 --> 00:40:46,030 So therefore, we have M times M minus 1, minus M, 761 00:40:46,030 --> 00:40:48,200 minus M minus 1. 762 00:40:48,200 --> 00:40:58,960 But the thing is we have to add the extra M, because this 763 00:40:58,960 --> 00:41:00,360 k is greater than or equal to 1. 
764 00:41:00,360 --> 00:41:05,330 So we have to add one of the M's because of this. 765 00:41:05,330 --> 00:41:13,830 So this is just equal to M minus 1, squared. 766 00:41:13,830 --> 00:41:18,920 So this number, if n is equal to this, it's the largest 767 00:41:18,920 --> 00:41:20,490 number that cannot be written like that. 768 00:41:20,490 --> 00:41:21,810 So therefore, we have to add 1. 769 00:41:21,810 --> 00:41:24,080 So that's why the bound has the plus 1. 770 00:41:24,080 --> 00:41:30,660 So the bound from which n can always be written is going to be M 771 00:41:30,660 --> 00:41:34,320 minus 1, squared, plus 1. 772 00:41:34,320 --> 00:41:36,410 AUDIENCE: Why did you add the 1 at the end? 773 00:41:36,410 --> 00:41:36,960 SHAN-YUAN HO: This one? 774 00:41:36,960 --> 00:41:40,476 AUDIENCE: No, we've got to do the 1 at the end. 775 00:41:40,476 --> 00:41:42,380 AUDIENCE: We already have that in there. 776 00:41:42,380 --> 00:41:42,810 SHAN-YUAN HO: Oh, where is it? 777 00:41:42,810 --> 00:41:44,245 No, it's in here, right? 778 00:41:44,245 --> 00:41:45,495 AUDIENCE: No, it's not here. 779 00:41:48,420 --> 00:41:49,670 SHAN-YUAN HO: Did I-- 780 00:41:53,980 --> 00:41:54,720 What are you talking about? 781 00:41:54,720 --> 00:41:57,088 Where's the 1? 782 00:41:57,088 --> 00:41:59,030 AUDIENCE: At the end, the last equation. 783 00:41:59,030 --> 00:41:59,500 SHAN-YUAN HO: This one? 784 00:41:59,500 --> 00:42:01,290 AUDIENCE: Yes. 785 00:42:01,290 --> 00:42:02,270 SHAN-YUAN HO: OK. 786 00:42:02,270 --> 00:42:12,420 This is the "cannot"-- the largest n which you cannot write. 787 00:42:12,420 --> 00:42:15,170 You cannot write this. 788 00:42:15,170 --> 00:42:18,220 So this bound is tight. 789 00:42:18,220 --> 00:42:22,890 It means that this is the one that you can. 790 00:42:22,890 --> 00:42:24,990 So if n is greater than or equal to 791 00:42:24,990 --> 00:42:26,470 this, then it's possible. 792 00:42:26,470 --> 00:42:28,970 This is the largest one it cannot. 
793 00:42:28,970 --> 00:42:30,790 Based on this, it cannot. 794 00:42:30,790 --> 00:42:32,070 So we have to add the 1. 795 00:42:32,070 --> 00:42:34,970 So therefore, in here, you could do 26. 796 00:42:34,970 --> 00:42:38,470 So starting from 26, 27, 28, you can do that. 797 00:42:41,310 --> 00:42:42,890 Any questions? 798 00:42:42,890 --> 00:42:45,590 AUDIENCE: Relatively prime, what do you mean by 799 00:42:45,590 --> 00:42:45,920 "relatively"? 800 00:42:45,920 --> 00:42:47,982 SHAN-YUAN HO: It means their greatest common divisor is 1. 801 00:42:51,900 --> 00:42:56,790 So if we take h here, h is going to be positive. 802 00:42:56,790 --> 00:43:00,400 So if h is equal to M minus 1, squared plus 1, then now all 803 00:43:00,400 --> 00:43:01,530 the elements are positive. 804 00:43:01,530 --> 00:43:04,590 Because we just proved that we can write this-- 805 00:43:10,230 --> 00:43:13,390 every state can be reached from any other state, with positive 806 00:43:13,390 --> 00:43:15,260 probability. 807 00:43:15,260 --> 00:43:21,620 So we say, looking at P, we know that P of h is positive 808 00:43:21,620 --> 00:43:23,510 for h greater than or equal to this bound. 809 00:43:23,510 --> 00:43:28,330 So what we do is we apply lemma 2 to 810 00:43:28,330 --> 00:43:32,980 this transition matrix P of h, where we have picked alpha-- 811 00:43:32,980 --> 00:43:34,620 remember, alpha was the minimum single-step transition 812 00:43:34,620 --> 00:43:35,340 probability. 813 00:43:35,340 --> 00:43:38,840 So instead of the single transition, we have lumped 814 00:43:38,840 --> 00:43:42,700 this P into P to the h power. 815 00:43:42,700 --> 00:43:45,470 So it's h steps. 816 00:43:45,470 --> 00:43:50,790 Because we proved the result before for positive P. So this 817 00:43:50,790 --> 00:43:53,760 P to the h is positive, so we take alpha as the minimum from 818 00:43:53,760 --> 00:43:59,510 i to j of P to the h in this matrix. 
819 00:43:59,510 --> 00:44:02,360 So it doesn't really matter what the value of alpha is, 820 00:44:02,360 --> 00:44:03,950 only that it's going to be positive. 821 00:44:03,950 --> 00:44:06,290 And it has to be positive because it's a probability. 822 00:44:06,290 --> 00:44:12,770 So what happens is, if we follow the proof of what we 823 00:44:12,770 --> 00:44:17,260 just showed in the lemma, then we show that the maximum path 824 00:44:17,260 --> 00:44:19,190 from l to j-- 825 00:44:22,260 --> 00:44:25,240 at h times M. So M is going to be an integer, so in 826 00:44:25,240 --> 00:44:27,220 multiples of h-- 827 00:44:27,220 --> 00:44:30,930 this upper limit is going to be equal to the lower limit. 828 00:44:30,930 --> 00:44:34,730 So the most probable path is equal to the 829 00:44:34,730 --> 00:44:36,590 least probable path. 830 00:44:40,240 --> 00:44:42,380 So this is multiples of h's. 831 00:44:42,380 --> 00:44:44,730 So if we take this as M goes to infinity, this has 832 00:44:44,730 --> 00:44:47,510 got to be equal to-- 833 00:44:47,510 --> 00:44:53,110 Oops, this should be going to pi sub j, excuse me. 834 00:44:53,110 --> 00:44:55,230 This little symbol here. 835 00:44:55,230 --> 00:44:57,950 And this is going to be greater than 0. 836 00:44:57,950 --> 00:45:01,040 So the problem is now we've shown it for multiples of h's, 837 00:45:01,040 --> 00:45:04,180 what about the n's in between? 838 00:45:04,180 --> 00:45:10,510 But the fact is that in lemma 1, we showed that this maximum 839 00:45:10,510 --> 00:45:14,170 path from l to j in n steps is not increasing in n. 840 00:45:14,170 --> 00:45:18,620 So all those states, all those paths, the transition 841 00:45:18,620 --> 00:45:21,460 probability for the paths in between these multiples of 842 00:45:21,460 --> 00:45:25,110 h's, in between them it's going to be 843 00:45:25,110 --> 00:45:26,100 non-increasing in n. 
844 00:45:26,100 --> 00:45:30,852 So even if we're taking these multiples of h at n 845 00:45:30,852 --> 00:45:33,150 here, here, here, and we know that this limit exists, 846 00:45:33,150 --> 00:45:37,690 we know that all the ones in between them are also going to 847 00:45:37,690 --> 00:45:42,350 converge to the same limit because of lemma 1. 848 00:45:42,350 --> 00:45:45,335 Remember, the maximum is going to be non-increasing, 849 00:45:45,335 --> 00:45:46,460 and the minimum is going to be 850 00:45:46,460 --> 00:45:48,760 non-decreasing, along any one path. 851 00:45:48,760 --> 00:45:54,280 So these must have the same limit as 852 00:45:54,280 --> 00:45:56,040 these multiples of h. 853 00:45:56,040 --> 00:45:58,390 So the same limit applies. 854 00:45:58,390 --> 00:46:00,220 So any questions on this? 855 00:46:00,220 --> 00:46:03,390 So this is how we prove it for the arbitrary finite-state 856 00:46:03,390 --> 00:46:07,790 ergodic chain when we have some 0 probability transition 857 00:46:07,790 --> 00:46:13,490 elements in the matrix P. So the proof is the same. 858 00:46:17,680 --> 00:46:19,880 So now for ergodic unichains. 859 00:46:19,880 --> 00:46:26,880 So we see that this limit as n approaches infinity of P from i to 860 00:46:26,880 --> 00:46:30,120 j of n is going to just end up as the steady-state probability 861 00:46:30,120 --> 00:46:32,380 pi of j for all i. 862 00:46:32,380 --> 00:46:35,040 So it doesn't matter what your initial state is. 863 00:46:35,040 --> 00:46:39,170 As n goes to infinity of this path, as this Markov chain 864 00:46:39,170 --> 00:46:42,360 goes on and on, you will end up in state j with probability 865 00:46:42,360 --> 00:46:47,440 pi sub j, where pi is this probability vector. 866 00:46:47,440 --> 00:46:50,090 So now we have this steady-state vector, and then 867 00:46:50,090 --> 00:46:54,130 we can solve for the steady-state vector solution. 868 00:46:54,130 --> 00:46:59,600 So this pi P is equal to pi. 
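A quick numerical sketch of this convergence, in Python (not part of the lecture; the 3-state matrix is made up for illustration):

```python
# Sketch of the steady-state claim: every row of P^n converges to one
# vector pi, with pi P = pi and sum(pi) = 1, so the limit does not
# depend on the initial state.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

Pn = P
for _ in range(60):                  # P^61, far past convergence here
    Pn = matmul(Pn, P)

pi = Pn[0]                           # every row is (numerically) the same
assert all(abs(Pn[i][j] - pi[j]) < 1e-12 for i in range(3) for j in range(3))
assert abs(sum(pi) - 1.0) < 1e-9     # it is a probability vector
piP = [sum(pi[k] * P[k][j] for k in range(3)) for j in range(3)]
assert all(abs(piP[j] - pi[j]) < 1e-12 for j in range(3))   # pi P = pi
```

Solving pi P = pi together with sum(pi) = 1 gives the same vector directly, without taking powers.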
869 00:46:59,600 --> 00:46:59,860 Yeah? 870 00:46:59,860 --> 00:47:00,330 Go ahead. 871 00:47:00,330 --> 00:47:02,270 AUDIENCE: Where did you prove that the sum of all the pi j's 872 00:47:02,270 --> 00:47:04,030 equals one? 873 00:47:04,030 --> 00:47:06,563 Because you say that we proved that this is 874 00:47:06,563 --> 00:47:07,355 a probability vector. 875 00:47:07,355 --> 00:47:08,780 But didn't you prove only that it is non-negative? 876 00:47:08,780 --> 00:47:09,290 SHAN-YUAN HO: It's non-negative. 877 00:47:09,290 --> 00:47:13,090 But the thing is, as n goes to infinity, you have to 878 00:47:13,090 --> 00:47:15,200 land somewhere, right? 879 00:47:15,200 --> 00:47:16,960 This is a finite-state Markov chain. 880 00:47:16,960 --> 00:47:19,030 You have to be somewhere. 881 00:47:19,030 --> 00:47:21,060 And the fact that you have to be somewhere means your whole state 882 00:47:21,060 --> 00:47:23,480 space has to add up to 1. 883 00:47:23,480 --> 00:47:24,550 Because that sum is a constant at every step, remember? 884 00:47:24,550 --> 00:47:29,290 For every j, as n goes to infinity, it goes to pi sub j. 885 00:47:29,290 --> 00:47:31,020 So you have that for every single state. 886 00:47:31,020 --> 00:47:32,530 And then you have to end up somewhere. 887 00:47:32,530 --> 00:47:34,570 So if you have to end up somewhere, the space has to 888 00:47:34,570 --> 00:47:35,896 add up to one. 889 00:47:35,896 --> 00:47:37,980 Yeah, good question. 890 00:47:37,980 --> 00:47:41,990 So why are we interested in this pi sub j? 891 00:47:41,990 --> 00:47:45,210 The reason is that in this recurrent class, it 892 00:47:45,210 --> 00:47:48,910 tells us that as n goes to infinity, we see this sequence 893 00:47:48,910 --> 00:47:50,880 of states going back and forth, back and forth. 
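The point made in this exchange can be seen concretely: at every step the state distribution still sums to 1 (you have to be somewhere), so the limit inherits sum 1. A small sketch with an assumed 2-state chain whose steady state works out to (0.8, 0.2):

```python
# Toy 2-state chain, not one from the lecture.  Steady state: pi = (0.8, 0.2).
P = [[0.9, 0.1],
     [0.4, 0.6]]

dist = [1.0, 0.0]          # start in state 0 with certainty
for _ in range(50):
    # One step of the chain: new_j = sum_i dist_i * P_ij.
    dist = [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]

total = sum(dist)          # remains a probability distribution at every step
```

After 50 steps the distribution is numerically at (0.8, 0.2), and `total` is still 1: the limiting vector is non-negative *and* sums to one.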
894 00:47:50,880 --> 00:47:53,430 And we know that as n goes to infinity, we have some 895 00:47:53,430 --> 00:47:56,285 probability, pi sub j, of landing in state j, pi sub i 896 00:47:56,285 --> 00:47:57,810 of landing in state i, and so forth. 897 00:47:57,810 --> 00:48:01,980 So it says that over n steps, as n goes to infinity, 898 00:48:01,980 --> 00:48:04,290 this is the fraction of time that that state is 899 00:48:04,290 --> 00:48:05,440 going to be visited. 900 00:48:05,440 --> 00:48:08,590 Because at each step, you have to make a transition. 901 00:48:08,590 --> 00:48:15,050 So it's kind of the expected number of visits per unit time. 902 00:48:15,050 --> 00:48:16,280 So it's divided by n. 903 00:48:16,280 --> 00:48:18,260 It's going to be that fraction of time that you're going to 904 00:48:18,260 --> 00:48:19,110 visit that state. 905 00:48:19,110 --> 00:48:20,110 It's the fraction of time that you're going 906 00:48:20,110 --> 00:48:21,660 to be in that state. 907 00:48:21,660 --> 00:48:26,940 It's this limiting behavior as n gets very, very large. 908 00:48:26,940 --> 00:48:32,520 So we will see in the next few chapters, when we do 909 00:48:32,520 --> 00:48:35,210 renewal theory, that this will become very useful. 910 00:48:35,210 --> 00:48:39,670 And we'll give a slightly different viewpoint of it. 911 00:48:39,670 --> 00:48:42,270 So it's very easy to extend this result to the more general 912 00:48:42,270 --> 00:48:44,510 class of ergodic unichains. 913 00:48:44,510 --> 00:48:46,350 So remember, for ergodic unichains, now we have 914 00:48:46,350 --> 00:48:48,020 added these transient states. 915 00:48:48,020 --> 00:48:50,120 So before, we proved this. 916 00:48:50,120 --> 00:48:53,270 We just proved it for the case where the chain contains exactly one recurrent class. 917 00:48:53,270 --> 00:48:59,300 It's aperiodic, so we have no periodicity in this 918 00:48:59,300 --> 00:49:00,170 Markov chain. 
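The fraction-of-time interpretation just described can be sketched with a quick simulation. The chain and the numbers are made up for illustration; for this matrix the steady state works out to pi = (0.8, 0.2), and over a long run the empirical visit fractions come out close to it.

```python
import random

random.seed(0)             # fixed seed so the run is reproducible

# Toy 2-state ergodic chain; steady state pi = (0.8, 0.2).
P = [[0.9, 0.1],
     [0.4, 0.6]]

state = 0
visits = [0, 0]
steps = 200_000
for _ in range(steps):
    visits[state] += 1
    # Sample the next state from the row of P belonging to `state`.
    state = 0 if random.random() < P[state][0] else 1

frac0 = visits[0] / steps  # empirical fraction of time spent in state 0
```

The fraction `frac0` lands near 0.8, matching the pi-sub-j-as-visit-fraction reading of the steady state.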
919 00:49:00,170 --> 00:49:03,080 And so we know that the steady-state transition 920 00:49:03,080 --> 00:49:04,510 probabilities have a limit. 921 00:49:04,510 --> 00:49:06,596 And the upper limit and the lower limit of these transition 922 00:49:06,596 --> 00:49:07,680 probabilities, as n goes to infinity-- 923 00:49:07,680 --> 00:49:09,770 of ending up in a particular state-- 924 00:49:09,770 --> 00:49:10,430 agree in the limit. 925 00:49:10,430 --> 00:49:14,570 And we have this steady-state probability vector that 926 00:49:14,570 --> 00:49:15,610 describes this. 927 00:49:15,610 --> 00:49:17,940 So now we have these transient states. 928 00:49:17,940 --> 00:49:20,290 So for these transient states of this Markov chain, what 929 00:49:20,290 --> 00:49:25,480 happens is there exists a path from this transient state 930 00:49:25,480 --> 00:49:27,370 to a recurrent state. 931 00:49:27,370 --> 00:49:29,930 So once it leaves this transient state, it goes to a 932 00:49:29,930 --> 00:49:30,550 recurrent state. 933 00:49:30,550 --> 00:49:31,810 It's never going to come back. 934 00:49:31,810 --> 00:49:40,500 So there is some probability, alpha, of leaving the 935 00:49:40,500 --> 00:49:41,700 class at each step. 936 00:49:41,700 --> 00:49:43,750 So there's some transition probability out of this transient 937 00:49:43,750 --> 00:49:45,630 state that's going to be alpha. 938 00:49:45,630 --> 00:49:48,730 And the probability of remaining in this transient 939 00:49:48,730 --> 00:49:50,970 state for n steps is just (1 minus alpha) to the n. 940 00:49:50,970 --> 00:49:52,840 And this goes down exponentially. 941 00:49:52,840 --> 00:49:56,340 So what this says is that eventually, as n gets very 942 00:49:56,340 --> 00:49:59,290 large, it's very, very hard to stay in that transient state. 943 00:49:59,290 --> 00:50:01,340 So it's going to go out of the transient state. 944 00:50:01,340 --> 00:50:04,900 And then it will go into the recurrent class. 
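The exponential-decay claim can be written down directly: with an assumed per-step escape probability alpha, the chance of still sitting in the transient state after n steps is (1 - alpha) to the n. The value of alpha below is purely illustrative, not a number from the lecture.

```python
# Probability of remaining in a transient state for n steps, assuming a
# constant per-step escape probability alpha (illustrative value).
alpha = 0.25
steps = [0, 10, 20, 30, 40]
stay = [(1 - alpha) ** n for n in steps]  # survival probability at each n
```

`stay` starts at 1 and falls off geometrically; by n = 40 the survival probability is already below one in ten thousand, which is why the steady-state weight on transient states ends up at 0.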
945 00:50:04,900 --> 00:50:09,410 So when one does the analysis for this, what happens in 946 00:50:09,410 --> 00:50:13,960 this steady-state vector is that for those 947 00:50:13,960 --> 00:50:17,580 transient states, this pi will be equal to 0. 948 00:50:17,580 --> 00:50:21,610 So this distribution is only going to be non-zero for the 949 00:50:21,610 --> 00:50:23,640 recurrent states in this Markov chain. 950 00:50:23,640 --> 00:50:27,510 And the transient states will have probability equal to 0. 951 00:50:27,510 --> 00:50:31,080 In the notes, they just extend the argument. 952 00:50:31,080 --> 00:50:35,340 But you need a little bit more care to show this. 953 00:50:35,340 --> 00:50:38,210 It divides the transient states into one block and then 954 00:50:38,210 --> 00:50:40,660 the recurrent classes into another block, and then shows 955 00:50:40,660 --> 00:50:45,630 that these transient states' limiting probabilities are going 956 00:50:45,630 --> 00:50:46,880 to go to 0. 957 00:50:52,020 --> 00:50:55,080 So let's see. 958 00:50:55,080 --> 00:50:59,490 So this says just what I said, that the probabilities of these transient states 959 00:50:59,490 --> 00:51:02,180 decay exponentially, and one of the paths out will be taken, 960 00:51:02,180 --> 00:51:03,880 eventually. 961 00:51:03,880 --> 00:51:07,440 So for ergodic unichains, the ergodic class is eventually 962 00:51:07,440 --> 00:51:09,180 entered, and then steady state in that class is reached. 963 00:51:09,180 --> 00:51:13,370 So for every state j, we have exactly this. 964 00:51:13,370 --> 00:51:18,270 The maximum transition probability from i to j in n steps-- 965 00:51:18,270 --> 00:51:19,100 and the minimum. 966 00:51:19,100 --> 00:51:21,820 We look at the minimum over starting states in n steps and the maximum 967 00:51:21,820 --> 00:51:22,700 in n steps. 968 00:51:22,700 --> 00:51:25,820 And we take the limit as n goes to infinity. 
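A small sketch of that block picture, using a made-up 3-state unichain in which state 0 is transient and {1, 2} is the recurrent class: in P raised to a large power, the column belonging to the transient state goes to 0, while the recurrent columns go to the steady-state values (here pi = (0, 5/9, 4/9)).

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical ergodic unichain.  State 0 stays put with probability 0.5
# and otherwise leaks into the recurrent class {1, 2}; states 1 and 2
# never return to state 0.
P = [[0.5, 0.3, 0.2],
     [0.0, 0.6, 0.4],
     [0.0, 0.5, 0.5]]

Pn = [row[:] for row in P]
for _ in range(63):          # Pn = P^64
    Pn = mat_mul(Pn, P)

# The largest entry left in the transient state's column of P^64.
pi_transient = max(row[0] for row in Pn)
```

`pi_transient` is numerically 0 (it decays like 0.5 to the n), and every row has converged to (0, 5/9, 4/9) regardless of the starting state.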
969 00:51:25,820 --> 00:51:29,220 These limits are exactly equal, and they equal 970 00:51:29,220 --> 00:51:32,680 this pi sub j, the steady-state probability of state j. 971 00:51:32,680 --> 00:51:39,150 So your initial state, and whatever path you 972 00:51:39,150 --> 00:51:40,560 have taken, is completely wiped out. 973 00:51:40,560 --> 00:51:44,470 And all that matters is this final state, 974 00:51:44,470 --> 00:51:45,820 as n gets very large. 975 00:51:45,820 --> 00:51:49,200 So the difference here is that pi sub j equals 0 for each 976 00:51:49,200 --> 00:51:51,380 transient state, and it's greater than 0 for the 977 00:51:51,380 --> 00:51:52,630 recurrent states. 978 00:51:55,770 --> 00:51:57,580 So, other finite Markov chains. 979 00:51:57,580 --> 00:51:59,280 So we can consider a Markov chain with 980 00:51:59,280 --> 00:52:00,330 several ergodic classes. 981 00:52:00,330 --> 00:52:03,340 Because we just considered it with one ergodic class. 982 00:52:03,340 --> 00:52:05,790 So if the classes don't communicate, then you just 983 00:52:05,790 --> 00:52:06,740 consider them separately. 984 00:52:06,740 --> 00:52:08,920 So you figure out the steady-state transition 985 00:52:08,920 --> 00:52:11,170 probabilities for each of the classes separately. 986 00:52:11,170 --> 00:52:17,080 But if you insist on analyzing the entire chain P, 987 00:52:17,080 --> 00:52:19,760 then this P will have m independent steady-state 988 00:52:19,760 --> 00:52:29,180 vectors, one that is non-zero on each class. 989 00:52:29,180 --> 00:52:32,690 So this P to the n is still going to converge, but the rows are 990 00:52:32,690 --> 00:52:33,590 not going to be the same. 991 00:52:33,590 --> 00:52:35,210 So basically, you're going to have blocks. 
992 00:52:35,210 --> 00:52:38,510 So if you have one class, say 1 through k is going to be in 993 00:52:38,510 --> 00:52:41,680 one class, and then k through l is going to be another 994 00:52:41,680 --> 00:52:45,070 class, and then l through z is going to be another class, you 995 00:52:45,070 --> 00:52:45,960 have a block. 996 00:52:45,960 --> 00:52:49,170 So this steady-state vector is going to be in blocks. 997 00:52:51,770 --> 00:52:56,480 So you can see the recurrent classes only communicate 998 00:52:56,480 --> 00:52:57,520 within themselves. 999 00:52:57,520 --> 00:52:59,350 Because these don't 1000 00:52:59,350 --> 00:53:01,570 communicate, they're separate. 1001 00:53:01,570 --> 00:53:11,450 So you could have a lot of 0's in the limiting matrix, if you look 1002 00:53:11,450 --> 00:53:15,690 at P to the n as n goes to infinity. 1003 00:53:15,690 --> 00:53:18,010 So there are m sets of rows, one for each class. 1004 00:53:18,010 --> 00:53:20,220 And the rows for class k will be non-zero only for the 1005 00:53:20,220 --> 00:53:22,280 elements of that class. 1006 00:53:22,280 --> 00:53:26,350 So then finally, what if we have periodicity? 1007 00:53:26,350 --> 00:53:32,540 So now suppose we have a periodic recurrent chain with period d. 1008 00:53:32,540 --> 00:53:34,440 We saw the one where it's just a period of 2. 1009 00:53:34,440 --> 00:53:39,130 So with periodicity, what you do is you divide 1010 00:53:39,130 --> 00:53:41,770 the states into d different subclasses. 1011 00:53:41,770 --> 00:53:44,880 So you have to go to one subclass-- 1012 00:53:44,880 --> 00:53:50,070 So if the period is d, you separate 1013 00:53:50,070 --> 00:53:54,410 or you partition the states into d subclasses, 1014 00:53:54,410 --> 00:53:56,190 with a cyclic rotation between them. 1015 00:53:56,190 --> 00:54:00,380 So basically, each time unit, you have to go from one subclass 1016 00:54:00,380 --> 00:54:01,910 to the next subclass. 
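The block picture described above can be sketched with a made-up 4-state chain containing two recurrent classes that do not communicate: P is block diagonal, P to the n stays block diagonal, and the limiting rows differ between the two blocks (here (0.4, 0.6, 0, 0) for one class and (0, 0, 9/14, 5/14) for the other).

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two non-communicating recurrent classes: {0, 1} and {2, 3}.
P = [[0.7, 0.3, 0.0, 0.0],
     [0.2, 0.8, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.9, 0.1]]

Pn = [row[:] for row in P]
for _ in range(63):          # Pn = P^64
    Pn = mat_mul(Pn, P)

# The off-block entries stay exactly 0 at every power.
off_block = max(Pn[0][2], Pn[0][3], Pn[2][0], Pn[2][1])
```

So P to the n converges, but to a matrix with two distinct sets of rows, one per class, each non-zero only on its own class, exactly the structure described in the lecture.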
1017 00:54:01,910 --> 00:54:05,080 And when we do that, then for each subclass, you can get the 1018 00:54:05,080 --> 00:54:07,210 limiting-state probabilities. 1019 00:54:07,210 --> 00:54:11,290 So in other words, you are looking at this transition 1020 00:54:11,290 --> 00:54:13,460 matrix P to the d. 1021 00:54:13,460 --> 00:54:15,820 Because when it cycles, it totally depends on which one 1022 00:54:15,820 --> 00:54:17,960 you start out at. 1023 00:54:17,960 --> 00:54:22,560 But if you look at it at intervals of d, then each subclass becomes 1024 00:54:22,560 --> 00:54:24,640 an ergodic class by itself. 1025 00:54:24,640 --> 00:54:27,130 And there are exactly d of them. 1026 00:54:27,130 --> 00:54:31,020 So the limit as n approaches infinity of P to the nd, this 1027 00:54:31,020 --> 00:54:36,220 thing also exists, but exists in the subclass sense: there 1028 00:54:36,220 --> 00:54:40,070 are d subclasses if it has a period of d. 1029 00:54:40,070 --> 00:54:42,570 So that means a steady state is reached within each 1030 00:54:42,570 --> 00:54:44,640 subclass, but the chain rotates from 1031 00:54:44,640 --> 00:54:47,240 one subclass to another. 1032 00:54:47,240 --> 00:54:47,950 Yeah, go ahead. 1033 00:54:47,950 --> 00:54:49,410 AUDIENCE: In this case, if we do a simple check with 1 and 1034 00:54:49,410 --> 00:54:52,700 2, with 1 and 1, it doesn't converge. 1035 00:54:52,700 --> 00:54:53,570 SHAN-YUAN HO: No, it does. 1036 00:54:53,570 --> 00:54:56,380 It is 1, it converges to 1. 1037 00:54:56,380 --> 00:54:58,740 So it's 1, and then it's going to be 1. 1038 00:54:58,740 --> 00:55:00,970 AUDIENCE: It's 1, 1, 1, 1, 1, 1, 1, 1. 1039 00:55:00,970 --> 00:55:01,500 So you go here? 1040 00:55:01,500 --> 00:55:02,904 Like, it's reached--? 1041 00:55:02,904 --> 00:55:03,372 SHAN-YUAN HO: No, no. 1042 00:55:03,372 --> 00:55:04,790 It converges for here. 1043 00:55:04,790 --> 00:55:08,510 But this d is equal to 2, in that case. 
1044 00:55:08,510 --> 00:55:10,534 So you have to do nd, so you've got 1045 00:55:10,534 --> 00:55:12,200 to look at P squared. 1046 00:55:12,200 --> 00:55:14,536 So if I look at P squared, I'm always at 1-- 1047 00:55:14,536 --> 00:55:16,196 1, 1, 1, 1, 1, 1, 1, 1. 1048 00:55:16,196 --> 00:55:17,380 That's converging. 1049 00:55:17,380 --> 00:55:19,300 The other one is 2, 2, 2, 2, 2, 2. 1050 00:55:19,300 --> 00:55:20,550 That's also converging. 1051 00:55:23,690 --> 00:55:24,150 OK. 1052 00:55:24,150 --> 00:55:26,040 So are there any other questions about this? 1053 00:55:28,680 --> 00:55:30,050 OK, that's it. 1054 00:55:30,050 --> 00:55:31,300 Thank you.
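The deterministic two-state example from this closing exchange, written out: P swaps the two states at every step, so P to the n alternates and never settles, but sampling every d = 2 steps gives P squared, which is the identity and trivially converges.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Period-2 chain: from state 0 you always go to state 1, and back.
P = [[0.0, 1.0],
     [1.0, 0.0]]

P2 = mat_mul(P, P)   # P^2 is the identity: staying put every 2 steps
P3 = mat_mul(P2, P)  # odd powers are just P again, so P^n oscillates
```

So P to the nd converges (here it is constant), while the unsampled powers keep rotating between the two subclasses, exactly as described.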