The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I guess we should start. This is the last of these lectures. The final will be next Wednesday, as I hope you all know by this time, in the ice rink, whatever that means. And there was some question about how many sheets of paper you could bring in as crib sheets. And it seems like the reasonable thing is four sheets, which means you can bring in the two sheets you made up for the quiz plus two more. Or you can make up four new ones if you want, or do whatever you want. I don't think it's very important how many sheets you bring in, because I've never seen anybody referring to their sheets. I mean, it's a good way of organizing what you know, to try to put it on four sheets of paper.

I want to mostly review what we've done throughout the term, with a few more general comments thrown in. I thought I'd start with martingales, because we didn't completely finish what we wanted to talk about last time. The strong law of large numbers was left slightly hanging. And I want to show you how to do that in a little better way, and also show you that it's a more general theorem than it appears to be at first sight.

So let's go with martingales. The basic definition: a sequence of random variables is a martingale if, for all elements of the sequence, the expected value of Z sub n, given all of the previous values, is equal to the random variable Z sub n minus 1. Remember, and we've talked about this a number of times, when you're talking about the expected value of one random variable, given a bunch of other random variables, you're only taking the expectation over the first part. You're only taking the expectation over Z sub n. And the other quantities are still random variables.
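In symbols, the definition being described reads as follows (a LaTeX sketch, consistent with the lecture's notation):

```latex
% Martingale: for all n >= 2 (with E[|Z_n|] finite),
\mathbb{E}\left[ Z_n \mid Z_{n-1}, Z_{n-2}, \ldots, Z_1 \right] = Z_{n-1} .
```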
Namely, you have an expected value of Z sub n for each sample value of Z sub n minus 1, all the way down to Z 1. And what the definition says is, it's a martingale only if, for all sample values of those earlier variables, the expected value is equal to the sample value of the most recent one. Namely, the memory is all contained right in this last term, effectively, at least as far as expectation is concerned. The memory might be far broader than that for everything else.

And the first thing we did with martingales is we said the expected value is the same if you're only given part of the history. If you're only given the history from i back to 1, where i is less than n, that expected value is equal to Z sub i. So no matter where you start going back, the expected value of Z sub n is the most recent value that is given. So if the most recent value given is Z1, then the expected value of Zn, given Z1, is Z1. And along with that, you have the relationship that the expected value of Zn is equal to the expected value of Zi, just by taking the expected value over Z sub i. So all of that's sort of straightforward.

We talked a good deal about the increments of a martingale. The increments, X sub n equals Z sub n minus Z sub n minus 1, are very much like the increments that we have with a renewal process, or a Poisson process. All of these processes we talked about, we can define in various ways. And here we can define a martingale in two ways also. One is by the actual martingale itself, which is, in a sense, the sum of the increments. And the other way is in terms of the increments. And the increments satisfy the property that the expected value of X sub n, given all the earlier values, is equal to 0. Namely, no matter what all the earlier values are, X sub n has mean 0, in order for this to be a martingale.

A good special case of this is where X sub n is equal to U sub n times Y sub n, where the U sub n are IID, equiprobably 1 and minus 1, and the Y sub n are anything you want them to be. It's just that the U sub n have to be independent of the Y sub n.
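Here's a minimal simulation sketch of that special case, with the Y sub n taken to be exponential purely for illustration (any distribution independent of the U sub n would do). It checks numerically that the increments average to 0, consistent with E[X_n | past] = 0:

```python
import random

random.seed(1)

def increment():
    # U is +1 or -1 with probability 1/2 each; Y is an arbitrary random
    # variable independent of U (exponential here, purely for illustration).
    u = random.choice([+1, -1])
    y = random.expovariate(1.0)
    return u * y

# The average of many sample increments should be near 0, consistent
# with the martingale property E[X_n | X_{n-1}, ..., X_1] = 0.
samples = [increment() for _ in range(200_000)]
print(sum(samples) / len(samples))  # close to 0.0
```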
So I think this shows that, in fact, martingales are really a pretty broad class of things. And they were invented to talk about fair gambling games, where they wanted to give the gambler the opportunity to do whatever he wanted to do, but the game itself was defined in such a way that, no matter what you do, the game is fair. You establish bets in whatever way you want to. And when you wind up with it, the expected value of X sub n, given the past, is always 0. And that's equivalent to saying the expected value of Z sub n, given the past, is equal to Z sub n minus 1.

Examples we talked about are zero-mean random walks and products of unit-mean IID random variables. So there are both these product martingales, and there are these sum martingales. And those are just two simple examples, which come up all the time.

Then we talked about submartingales. A submartingale is like a martingale, except it grows with time. And we're not going to talk about supermartingales, because a supermartingale is just a negative submartingale. So we don't have to talk about that. A martingale is a submartingale. So anything you know about submartingales applies to martingales also. So you can state theorems for submartingales, and they apply to martingales just as well. You can very often say stronger things about martingales.

And then we have the same theorem for submartingales. Now, that should say, and it did say, until my evil twin got hold of it: if Zn is a submartingale, then for n greater than i, greater than 0, this expected value is greater than or equal to Zi. And the expected value of Zn is greater than or equal to the expected value of Zi. In other words, this theorem for submartingales is the same as the corresponding theorem for martingales, except now you have inequalities there, just like you have inequalities in the definition of a submartingale. So there's nothing strange there.
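Written out, the submartingale definition and the theorem just stated are (again a sketch in the lecture's notation):

```latex
% Submartingale: E[Z_n | Z_{n-1}, ..., Z_1] >= Z_{n-1} for all n.
% Theorem: for n > i > 0,
\mathbb{E}\left[ Z_n \mid Z_i, Z_{i-1}, \ldots, Z_1 \right] \ge Z_i,
\qquad
\mathbb{E}[Z_n] \ge \mathbb{E}[Z_i] .
```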
Then we found out that, if you have a convex function h from the reals into the reals, then Jensen's inequality says that the expected value of h of X is greater than or equal to h of the expected value of X. We showed a picture for that, you remember. There's a convex curve, and there's some straight line tangent to it. And what Jensen's inequality says is, if you take the expectation of h of X, you're somewhere above the line. And if you take the expectation first and then apply h, you're sitting on the line.

So if h is convex, that's what Jensen's inequality is. And it follows from that that, if Zn is a submartingale, and that includes martingales, and h is convex and the expected value of h of Zn is finite, then h of Zn is a submartingale also. In other words, if you have a martingale Z sub n, the absolute value of Z sub n is a submartingale. E to the r Zn is a submartingale. Use whatever convex function you want to, and you wind up with martingales going into submartingales. You can't get out of the range of submartingales that easily.

We then talked about stopped martingales and stopped submartingales. We said a stopped process is for a possibly defective stopping time. Now, you remember what a stopping time is? A stopping time is a random variable, which is a function of everything that takes place up until the time of stopping. And you have to look at the definition carefully, because stopping time comes up in too many places to just say it and understand what it means. But it's clear what it means if you view yourself as an observer watching a sequence of random variables, of sample values of random variables, one after another. And after you see a certain number of random variables, your rule says, stop. And then you don't observe any more. So you just observe this finite number, and then you stop at that point. And then you're all done. If it's a possibly defective stopping rule, then you might keep on going forever, or you might stop. You don't know what you're going to do.
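A compact way to state that definition (a sketch; "possibly defective" means J can be infinite with positive probability):

```latex
% J is a stopping time for Z_1, Z_2, ... if, for each n >= 1,
\mathbb{I}\{J = n\} \ \text{is a function of} \ Z_1, \ldots, Z_n \ \text{alone} .
```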
The stopped process Z sub n star is a little different from what we were doing before. Before, what we were doing is, we were sitting there observing this process. At a certain point, the stopping rule said stop. And before, we were very obedient. And when the stopping rule told us to stop, we stopped. Now, since we know a little more, we question authority a little more. And when the stopping rule says stop, we break things into two processes. There's the original process, which keeps on going. And there's this stopped process, which just stops.

And it's convenient to have a stopped process instead of just a stopping rule. Because with a stopped process, you can look at any time into the future, and if it's already stopped, you know what the stopped value is. You know what it was when it stopped. You don't necessarily know when it stopped, by looking at it in the future. But you know that it did stop.

So the stopped process, well, it says here what it is. It satisfies: the stopped value at time n is equal to Z sub n, if n is less than or equal to the stopping time J; and Z sub n star is equal to Z sub J, if n is greater than J. So you get up to the stopping time, and you stop. And then it just stays fixed forever after.

And the nice theorem there is that the stopped process for a submartingale, with a possibly defective stopping rule, is a submartingale again. What that means is, it's just a concise way of writing: the stopped process for a martingale is a martingale in its own right, and the stopped process for a submartingale is a submartingale in its own right.

So the convenient thing is, you can take a martingale, you can stop it, and you still have a martingale. And everything you know about martingales applies to this stopped process.
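Here's a small simulation sketch of a stopped martingale, assuming a simple symmetric random walk and a threshold-crossing stopping rule (the thresholds +10 and -10 are made up for illustration). It illustrates the theorem below, that E[Z star sub n] stays at E[Z1] = 0:

```python
import random

random.seed(2)

def stopped_walk(n_steps, upper=10, lower=-10):
    """Return Z*_n for a +/-1 random walk stopped at the first threshold crossing."""
    z = 0
    for _ in range(n_steps):
        z += random.choice([+1, -1])
        if z >= upper or z <= lower:
            return z  # stopped: Z*_m = Z_J for all m >= J
    return z          # not yet stopped: Z*_n = Z_n

# E[Z*_n] should be near 0 for every n, matching E[Z_1] = 0.
trials = [stopped_walk(200) for _ in range(100_000)]
print(sum(trials) / len(trials))  # close to 0.0
```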
So we're getting to the point where, starting out with a martingale, we can do lots of things with it. And that's the whole mathematical game. With a mathematical game, you build up theorems from nothing. As an experimentalist or an engineer, you sort of try to figure out those things from the reality around you. Here, we're just building it up.

And the other part of that theorem says that the expected value of Z1 is less than or equal to the expected value of Zn star, which is less than or equal to the expected value of Zn, for a submartingale. And they're all equal for a martingale. In other words, the marginal expectations for a martingale start out at the expected value of Z1, and they stay there. And for the stopped process, they stay at that same value.

And that's not too surprising. Because if you have a martingale, and you go until you reach the stopping point, then from that stopping point on, the martingale has mean-0 increments. Not the martingale itself, but the increments of the martingale have mean 0, from that point on. And the stopped process has mean-0 increments as well. In other words, for the stopped process, the increments are actually 0 after stopping, whereas for the original process, the increments wobble around, but they still have mean 0. So this is a very nice and useful thing to know.

If you look at this product martingale, Z sub n is e to the r S n minus n gamma of r, why is that a martingale? How do you know it's a martingale? Well, you look at the expected value of it. The expected value of e to the r S n is the moment generating function of S sub n, evaluated at r. And that moment generating function is just e to the n gamma of r. So this is clearly something which should be a martingale, because it just stays at that level all along.
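The one-step calculation behind that, written out (with S_n = X_1 + ... + X_n for IID X_i, and gamma(r) = ln E[e^{rX}]):

```latex
\mathbb{E}\left[ e^{rS_n - n\gamma(r)} \mid S_{n-1}, \ldots, S_1 \right]
 = e^{rS_{n-1} - n\gamma(r)} \, \mathbb{E}\left[ e^{rX_n} \right]
 = e^{rS_{n-1} - n\gamma(r)} \, e^{\gamma(r)}
 = e^{rS_{n-1} - (n-1)\gamma(r)} .
```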
If you have a stopping rule, such as a threshold crossing, then you've got a stopped martingale. And subject to some little mathematical nitpicks, which the text talks about, this leads you to the much more general version of Wald's identity, which says that the expected value of Z at the time of stopping is 1. That is, the expected value of e to the r S J minus J gamma of r equals 1. This, you remember, is what Wald's identity was when we were just talking about random walks. And this is a more general version, because it's talking about general stopping rules, instead of just thresholds. But it does have these little mathematical nitpicks in it, which I'm not going to talk about here.

Then we have Kolmogorov's submartingale inequality. We talked about all of these things last time, so we're going pretty quickly through them. The submartingale inequality is really the Markov inequality souped up. And what it says is, if you have a non-negative submartingale, and that can include a non-negative martingale, then for any positive integer m, the probability that the maximum of the Z sub i, for i from 1 to m, is greater than or equal to a, is less than or equal to the expected value of Z sub m over a.

You see, all that the Markov inequality says is that the probability that Z sub m is greater than or equal to a is less than or equal to this. This puts a lot more teeth into it, because it lets you talk about all of these random variables, up until time m. And it says the maximum of them satisfies this inequality. I mean, we always knew that the Markov inequality was very, very weak. And this is also pretty weak. But it's not quite as weak, because it covers a lot more things.

If you have a non-negative martingale-- this is submartingales, this is martingales. You see, here, with submartingales, the expected value of Z sub m keeps increasing with m. So there's a trade-off between making m large and not making m large. If you're dealing with a martingale, then the expected value of Z sub m is constant over all time. It doesn't change. And therefore, you can take this inequality here.
You can go to the limit, as m goes to infinity. And you wind up with: the probability that the sup of Z sub n is greater than or equal to a, is less than or equal to the expected value of the first of those random variables, the expected value of Z1, divided by a.

So this looks like a very powerful inequality. It turns out that I don't know many applications of it, and I don't know why. It seems like it ought to be very useful. But I know one reason, which is what I'm going to show you next, which is how you can really use the submartingale inequality to make it do an awful lot of things that you wouldn't imagine it could do otherwise.

First, you go to the Kolmogorov version of the Chebyshev inequality. This has the same relationship to the Kolmogorov submartingale inequality as Chebyshev has to Markov. Namely, what you do is, instead of looking at the random variables Z sub n, you look at the random variables Z sub n squared. And what do we know now? If Z sub n is a martingale or a submartingale, what is Z sub n squared? Well, the only thing we can be sure of is that Z sub n squared is a submartingale. But if it's a submartingale, then we can apply this inequality again. And what it tells us, in this case, is about the maximum of the magnitudes of these random variables: the probability that the maximum is greater than or equal to b is less than or equal to the expected value of Z sub m squared over b squared.

So before, just like the Markov inequality, the Markov inequality only works for non-negative random variables. You go to the Chebyshev inequality, because that works for negative or positive random variables. So that makes it kind of neat. And then what you have is this thing, which goes down as 1 over b squared, which looks a little stronger. But that's not the real reason that you want to use it.

Now, this inequality here only works for the first m values of this random variable. What we're usually interested in here is what happens as m gets very large.
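For reference, the two maximal inequalities just described, written out (the submartingale inequality, and then the Chebyshev-style version applied to Z sub n squared):

```latex
\Pr\left\{ \max_{1 \le i \le m} Z_i \ge a \right\} \le \frac{\mathbb{E}[Z_m]}{a}
\quad \text{(non-negative submartingale)},
\qquad
\Pr\left\{ \max_{1 \le n \le m} |Z_n| \ge b \right\} \le \frac{\mathbb{E}\left[ Z_m^2 \right]}{b^2} .
```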
As m gets very large, that expected value of Z sub m squared very often blows up. So this inequality does not really do what you would like an inequality to do. So what we're going to do is, first, we're going to say, if you have this inequality here, then you can lower bound the left side by taking the maximum, not over 1 up to m, but only over m over 2 up to m. Now, why do we want to do that? Well, hold on and you'll see. But anyway, this is greater than or equal to that.

So what we're going to do now is, we're going to take this inequality, and we're going to use it for m equals 2 to the k, m equals 2 to the k plus 1, m equals 2 to the k plus 2, all the way up to infinity. And so we're going to find the probability of the union, over j greater than or equal to k, of this quantity here, but now just maximized over n, for 2 to the j minus 1 less than n, less than or equal to 2 to the j: the maximum of Z sub n, greater than or equal to... and now, for each one of these j's here, we'll put in whatever b sub j we want. So the general form of this inequality then becomes: we have this term on the left, we use the union bound, and we get this term on the right.

So at this point, we have an inequality which works for all n, instead of just for values smaller than some given amount. So this is sort of a general technique for taking an inequality which only works up to a certain value, and extending it so it works over all values. You have to be pretty careful about how you choose b sub j.

Now, what we're going to do is say, OK. And remember what is happening here. We started out with a submartingale or a martingale. When we take Z n squared, we still have a submartingale. So we can use the submartingale inequality, which is what we're doing here. We're using the submartingale inequality on Z m squared, rather than on Z m. And Z m squared is non-negative, so that works there. Then we go down to this point. We take a union over all of these terms.
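Assembled with the union bound, the resulting inequality reads (a sketch, with arbitrary positive constants b sub j, one for each j):

```latex
\Pr\left\{ \bigcup_{j \ge k} \left\{ \max_{2^{j-1} < n \le 2^{j}} |Z_n| \ge b_j \right\} \right\}
 \le \sum_{j \ge k} \frac{\mathbb{E}\left[ Z_{2^j}^2 \right]}{b_j^2} .
```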
And note what happens. Every n is included in one of these terms, every n beyond 2 to the k. So if we want to prove something about the limiting values of Z sub n, we have everything included there, everything beyond 2 to the k. But as far as the limit is concerned, you don't care about any initial finite set. You care what happens after that initial finite set. So what we have, then, is the probability of the union of these terms, less than or equal to this sum.

When I apply this to a random walk, S sub n: S sub n is a martingale here, so S sub n squared is a submartingale, at this point. The expected value of X squared, we'll assume, is sigma squared. And S sub n, or Z sub n as we'll call it, is the sum of these n IID random variables. So the expected value--

AUDIENCE: 10 o'clock.

PROFESSOR: The expected value of S sub 2-to-the-J squared is just 2 to the J times the expected value of X squared, in other words, sigma squared. We're just doing this for a zero-mean random variable, because given an arbitrary non-zero-mean random variable, you can look at it as the mean plus a random variable which is zero mean. So that's the same idea we're using here.

So we take this inequality now, and I'm going to use, for b sub J, 3/2 to the J. Why 3/2 to the J? Well, you'll see in just a second. But when I use 3/2 to the J here, I get the maximum of S sub n, greater than or equal to 3/2 to the J. And over here, I get b sub J squared, which is 9/4 to the J. And here I have 2 to the J also. So when I sum this over j greater than or equal to k, it winds up with 8/9 to the k, times 9 sigma squared.
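The arithmetic in that sum, for the record:

```latex
\sum_{j \ge k} \frac{\mathbb{E}\left[ S_{2^j}^2 \right]}{b_j^2}
 = \sum_{j \ge k} \frac{2^j \sigma^2}{(9/4)^j}
 = \sigma^2 \sum_{j \ge k} \left( \tfrac{8}{9} \right)^{j}
 = 9 \sigma^2 \left( \tfrac{8}{9} \right)^{k} .
```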
So what I have now is something where, when k gets larger, this term is going to 0. And I have something over here; well, that doesn't look quite so attractive, but just wait a minute. What I'm really interested in is not S sub n, but S sub n over n. For the strong law of large numbers, I'd like to show that S sub n over n approaches a limit. And n, in this case, runs between 2 to the J minus 1 and 2 to the J. So when I put that in here, we'll see what that amounts to in the next slide.

For the strong law of large numbers, what our theorem says is that the set of sample points for which S sub n over n approaches 0 has probability 1. So for the proof of that, I pick up this equation from the previous slide. And when I lower bound the left side of this, I'm going to divide by n here. And I'm going to divide by something a little bit smaller, which is 2 to the J minus 1, here. So I get the maximum of S sub n over n, greater than or equal to 2 times 3/4 to the J.

Now you see why I picked... I think you see, at this point, why I picked b sub j the way I did. I wanted to pick it to be smaller than 2 to the J. And I wanted to pick it to be big enough that it drove the right-hand term to 0.

So now we're done, really. Because if I look at this expression here, a sample sequence S sub n of omega that's not contained in this union has S sub n of omega over n approaching 0. Because these terms, for n from 2 to the J minus 1 to 2 to the J, in order to be in this set, have to be greater than or equal to 2 times 3/4 to the J. As j gets larger and larger, this term goes to 0. So the only terms that exceed that are terms that are arbitrarily small.

So this union contains all the sample points for which S sub n over n does not approach 0. But the probability of the union is at most 8/9 to the k, times some garbage over here. And that's true for all k. The sample points for which S sub n over n approaches 0 are all complementary to this set. So the probability that S sub n of omega over n approaches 0 is greater than or equal to 1 minus this quantity here. That's true for all k. And since it's true for all k, and this term goes to 0, the theorem is proven.
Now, why did I want to go through this? There are perhaps easier ways to prove the strong law of large numbers, just assuming that the variance is finite. Why this particular way? Well, if you look at this, it applies to much more than just sums of IID random variables. It applies to arbitrary martingales, so long as these conditions are satisfied. It applies to these cases, like where you have a random variable which is plus or minus 1 times some arbitrary random variable. So this gives you sort of a general way of proving strong laws of large numbers for strange sequences of random variables.

So that's the reason for going through this. We now have a way of proving strong laws of large numbers for lots of different kinds of martingales, rather than just for this set of things here.

So let's move on, back to Markov chains, countable or finite state. I'm moving back to chapters three and five in the text, mostly chapter five, and trying to finish some sort of review of what we've done. When I look back at what we've done, it seems like we've proven an awful lot of theorems. So all I can do is talk about the theorems.

I should say something, again, on this last lecture, about why we spend so much time proving theorems. In other words, we've just proven a theorem here. I promised you I would prove a theorem every lecture, along with talking about why they're important, and so on. And most of you are engineers, or you're scientists in various fields. You're not mathematicians. Why should you be interested in all these theorems? Why should you take abstract courses, which look like math courses?

And the reason is, this kind of stuff is more important for you than it is for mathematicians. And it's more important for you because, when you're dealing with a real engineering or real scientific problem, how do you deal with it? I mean, you have a real mess facing you. You spend a lot of time trying to understand what that mess is all about. And you don't form a model of it and then apply theorems.
What you do is try to understand it. You look at multiple models. When we were looking at hypothesis testing, we said we're going to assume a priori probabilities. I lied about that a little bit. We weren't assuming a priori probabilities. We were assuming a class of probability models, each of which had a priori probabilities in them. And then we said something about that class of probability models. And by saying something about that class of probability models, we were able to say a great deal more than you can say if you refuse to even think about a model that has a priori probabilities in it.

So by looking at lots of different models, you can understand an enormous number of things, without really having any one model which describes the whole situation for you. And that's why we try to prove theorems for models. Because then, when you understand lots of simple models, you take these complicated physical situations, and you play with them. You play with them by applying various simple models that you understand to them. And as you do this, you gradually understand the physical process better. And that's the way we discover things. OK, end of lecture. Not end of lecture, but end of partial lecture about why you want to learn some mathematics.

The first-passage time from state i to j, remember, is the smallest n, when you start off in state i, at which you get to state j. You start off in state i. You jump from one lily pad to another. You eventually wind up at lily pad number j. And we want to know how long it takes you to get to j. That's a random variable, obviously. And this T sub ij is a possibly defective random variable. It has a probability mass function; this is the definition of what that probability mass function is. And it has a distribution function. And the probability mass function, you probably remember how we derived it.
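The recursion in question, as a sketch in the text's notation (f sub ij of n for the first-passage PMF, P sub ij for the transition probabilities):

```latex
f_{ij}(1) = P_{ij},
\qquad
f_{ij}(n) = \sum_{k \ne j} P_{ik} \, f_{kj}(n-1) \quad \text{for } n \ge 2 .
```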
We derived it by sort of crawling up on it, by looking at it, first, for n equals 1, in which case it's just a transition probability. Then for n equals 2 and beyond, it's the probability that you first go to some state k, and then, in n minus 1 steps, you go from k to j. But you have to leave j out of that sum, because if you go to j in the first step, you've already had your first passage.

So we define a state j to be recurrent if T sub jj is non-defective, and we define it to be transient otherwise. In other words, if it's not certain that you ever get back to state j, then you define it to be transient. If it is recurrent, it's positive recurrent if the expected value of T sub jj is less than infinity, and it's null recurrent otherwise.

How do we know how to analyze this? Well, we studied renewal processes. And you look at the renewal process where you've got a renewal every time you hit state j. You start out in state j. The first time you hit state j again, that's a renewal. The next time you hit state j, that's another renewal. You have a renewal process where the inter-renewal time is a random variable which has the PMF f sub jj of n.

Excuse me. If you have a renewal process, if you start in state j, where T sub jj is the amount of time before a renewal occurs, then from that time on, you get another renewal with another random variable with the same distribution as T sub jj. And f sub jj is the PMF of that renewal time, and F sub jj is the distribution function of it.

So then, when we define the state j as being recurrent, what we're really doing is going back to what we know about renewal processes, and saying a state of a Markov chain is recurrent if the renewal process that we define for that countable-state Markov chain has these various properties for this renewal random variable.

For each recurrent j, there's an integer renewal counting process, N sub jj of t. You start in state j at time 0. At time t, which is after t steps of the Markov chain, what you're interested in is how many times you have hit state j, up until time t.
That's the counting process we talk about in renewal theory. So N sub jj of t is the number of visits to j, starting in j. And it has the inter-renewal distribution F sub jj, which is that quantity up there. We have a delayed renewal counting process, N sub ij of t, if we count visits to j, starting in i.

We didn't talk much about delayed renewal processes, except for pointing out that when you have a delayed renewal process, it really is the same as a renewal process. It just has some arbitrary amount of time that's required to get to state j for the first time, and from then on, the renewals recur as before. Even if the expected time to get to j for the first time is infinite, and the expected time for renewals from j to j is finite, you still have this same renewal process. You can even lose an infinite amount of time at the beginning, and you amortize it over time. Don't ask me why you can amortize an infinite amount of time over time. But you can. And actually, if you read about delayed renewal processes, you see why you actually get that.

So all states in a class are positive recurrent, or all are null recurrent, or all are transient. We proved that theorem. It wasn't really a very hard theorem to prove. And you can sort of see that it ought to be true.

Then we defined the chain as being irreducible if all state pairs communicate. In other words, if, for every pair of states, there's a path that goes from one state to the other state. This is intuitively a simple idea if you have a finite-state Markov chain. If you have a countably infinite state Markov chain, it seems to be a little more peculiar. But it really isn't. For a countably infinite state Markov chain, every state has a finite number. And you can take every pair of states, you can identify them, and you can see whether there's a path going from one to the other. For all of these birth-death processes we've talked about, I mean, it's obvious whether the states all communicate or not.
You just see if there's any break in the chain at any point. And it really looks like a chain. It's a node, two transitions, another node, two transitions, another node. And that's just the way chains are supposed to work.

An irreducible class might be positive recurrent, it might be null recurrent, or it might be transient. And we've already seen what makes a state null recurrent or transient. And it's the same thing for the class. We started out by saying a state is either null recurrent, positive recurrent, or transient, depending on this renewal process associated with it. And now there's this theorem, which says that if one node in a class of states is positive recurrent, they all are.

And you ought to be able to sort of see the reason for that. If I have one state which is positive recurrent, it means that the expected time to go from this state back to this state is finite. Now, if I have some other state, I have to go from here to there. I can go through here and then off to there. So the expected amount of time it takes to get to there, and then from there back to there, is also finite, and the same backwards. So that was the way we proved this.

If we have an irreducible Markov chain-- now, this is the theorem you really use all the time; this sort of says how you operate with these things-- it says the steady-state equations, the equations you've used in half the problems you've done with Markov chains, have the following property. Remember, the Markov chain is defined in terms of the transition probabilities P sub ij. We solve these equations to find out what the steady-state probabilities pi sub j are. And the theorem says, if you can find a solution to those equations, where the pi sub j's have to add up to 1, then the solution is unique, and the pi sub j's are equal to 1 over the mean time to go from that state back to that state again.
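Those steady-state equations, written out (with T-bar sub jj denoting the mean recurrence time of state j):

```latex
\pi_j = \sum_{i} \pi_i P_{ij} \ \text{for all } j,
\qquad
\sum_{j} \pi_j = 1,
\qquad
\pi_j = \frac{1}{\overline{T}_{jj}} .
```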
And what does that mean? What that really gives you is not a way to find pi sub j. It gives you a way to find T sub jj. Because these equations are more often the way that you solve for the steady-state probabilities. And then that gives you a way to find the mean recurrence time between visits to this given state.

And what else does this theorem say? It says, if the states are positive recurrent, then the steady-state equations have a solution. So this is an if-and-only-if kind of statement. It relates these steady-state equations to positive recurrence, and says, if these equations have a solution, then the states are positive recurrent, and they satisfy all these relationships about mean recurrence time. And if the states are positive recurrent, then those equations have a solution. And in the solutions, the pi sub j's are all positive. It's an infinite set of equations, so you can't necessarily solve it. But you sort of know everything there is to know about it, at this point.

Well, there's one other thing: when you have a birth-death chain, these equations simplify a great deal.

The counting processes, under positive recurrence, have to satisfy this equation. And my evil twin brother got hold of this and left out the n in the copy that you have. And I spotted it when I looked at it just a little bit. He was still sleeping, so I managed to fix it. So it's corrected here. And what does that say?

It says, when you have positive recurrence, if you look from 0 out to time n, and you count the number of times that you hit state j, that's a random variable. N sub ij of n is the number of times you visit state j, starting in state i, from time 0 out to time n. You divide that by n, and you go to the limit. And there's a strong law of large numbers there, which was the strong law of large numbers for renewal processes, which says that it has a limit with probability 1. And this says that limit is pi sub j.
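Here's a quick simulation sketch of that limit for a made-up three-state birth-death chain (all the transition probabilities below are invented for illustration; they're not from the text). The long-run fraction of visits to each state approaches pi sub j:

```python
import random

random.seed(3)

# Toy birth-death chain on {0, 1, 2}: (next_state, probability) pairs,
# each row summing to 1. Invented numbers, purely for illustration.
P = {0: [(0, 0.50), (1, 0.50)],
     1: [(0, 0.25), (1, 0.25), (2, 0.50)],
     2: [(1, 0.50), (2, 0.50)]}

def step(state):
    r, acc = random.random(), 0.0
    for nxt, p in P[state]:
        acc += p
        if r < acc:
            return nxt
    return state  # guard against floating-point round-off

visits = [0, 0, 0]
state, n = 0, 1_000_000
for _ in range(n):
    state = step(state)
    visits[state] += 1

# For this chain, the birth-death formulas discussed below give
# pi = (0.2, 0.4, 0.4); the empirical fractions should be close.
print([round(v / n, 3) for v in visits])
```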
760 00:46:45,730 --> 00:46:49,340 I mean, visualize what happens. 761 00:46:49,340 --> 00:46:51,900 You start out in state j. 762 00:46:51,900 --> 00:46:55,030 For one unit of time, you're in state j. 763 00:46:55,030 --> 00:46:58,700 Then you go away from state j, and for a long time you're out 764 00:46:58,700 --> 00:47:00,220 in the wilderness. 765 00:47:00,220 --> 00:47:03,660 And then you finally get back to state j again. 766 00:47:03,660 --> 00:47:07,770 Think of a renewal reward process, where you get 1 unit 767 00:47:07,770 --> 00:47:12,200 of reward every time you're in state j and 0 reward every 768 00:47:12,200 --> 00:47:14,360 time you're not in state j. 769 00:47:14,360 --> 00:47:18,470 That means every interrenewal period, you pick 770 00:47:18,470 --> 00:47:19,830 up one unit of reward. 771 00:47:25,000 --> 00:47:27,065 Well, this is what that says. 772 00:47:32,290 --> 00:47:36,190 It says that the fraction of those visits to state j-- 773 00:47:40,800 --> 00:47:46,230 that out of the total visits in the Markov chain, the ones 774 00:47:46,230 --> 00:47:50,830 that go to state j have probability pi sub j. 775 00:47:50,830 --> 00:47:54,020 So again this is another relationship with these steady 776 00:47:54,020 --> 00:47:55,200 state probabilities. 777 00:47:55,200 --> 00:47:58,860 The steady state probabilities tell you what these mean 778 00:47:58,860 --> 00:48:00,890 recurrence times are. 779 00:48:00,890 --> 00:48:03,300 And that tells you what this is. 780 00:48:03,300 --> 00:48:08,060 This, in a sense, is the same as this. 781 00:48:08,060 --> 00:48:11,290 Those are just sort of the same results. 782 00:48:11,290 --> 00:48:14,540 So there's nothing special about it. 783 00:48:14,540 --> 00:48:18,340 We talked a little bit about the Markov model of the age of a 784 00:48:18,340 --> 00:48:22,750 renewal process. For any integer-valued renewal 785 00:48:22,750 --> 00:48:33,010 process, you can find a Markov chain which gives you the age 786 00:48:33,010 --> 00:48:35,010 of that process. 787 00:48:35,010 --> 00:48:37,660 You visualize being in state j. 788 00:48:37,660 --> 00:48:46,790 And you visualize being in state 0, of this Markov model, 789 00:48:46,790 --> 00:48:50,550 at the point where you have a renewal. 790 00:48:50,550 --> 00:48:56,670 One step later, if you have another renewal, that happens 791 00:48:56,670 --> 00:49:02,070 with probability P sub 00, you go back to state 0 again. 792 00:49:02,070 --> 00:49:04,400 If you don't have a renewal in the next time, 793 00:49:04,400 --> 00:49:06,580 you go to state 1. 794 00:49:06,580 --> 00:49:09,830 From state 1, you might go to state 2. 795 00:49:09,830 --> 00:49:13,200 When you're in state 2, it means you're two time units 796 00:49:13,200 --> 00:49:15,750 away from state 0. 797 00:49:15,750 --> 00:49:21,830 If you go back to state 0, it means you have a renewal in 798 00:49:21,830 --> 00:49:24,160 three time units. 799 00:49:24,160 --> 00:49:26,480 Otherwise you go to state 3. 800 00:49:26,480 --> 00:49:30,000 Then you might have a renewal and so forth. 801 00:49:30,000 --> 00:49:38,360 So for this very simple kind of Markov chain, this tells 802 00:49:38,360 --> 00:49:41,920 you everything there is to know, in a sense, about 803 00:49:41,920 --> 00:49:44,540 integer-valued renewal processes. 804 00:49:44,540 --> 00:49:48,820 So there's this nice connection between the two. 805 00:49:48,820 --> 00:49:52,510 And it lets you see pretty easily when you have 806 00:49:52,510 --> 00:49:53,380 null recurrence.
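That visit-fraction statement is easy to check numerically. A sketch, reusing the made-up three-state chain from the previous block: simulate the chain, count the visits to a state j, and watch the fraction approach pi sub j.

    import numpy as np

    rng = np.random.default_rng(0)
    P = np.array([[0.5, 0.5, 0.0],
                  [0.25, 0.5, 0.25],
                  [0.0, 0.5, 0.5]])   # same illustrative chain as before

    n_steps, j = 200_000, 1
    state, visits = 0, 0
    for _ in range(n_steps):
        state = rng.choice(3, p=P[state])  # one transition of the chain
        visits += (state == j)             # reward 1 whenever we are in state j

    # By the strong law for renewal processes, this converges to pi_j,
    # which is 0.5 for this particular chain.
    print("N_j(n)/n =", visits / n_steps)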
807 00:49:53,380 --> 00:49:55,280 Now we spend a lot of time talking about these 808 00:49:55,280 --> 00:49:58,800 birth-death Markov chains. 809 00:49:58,800 --> 00:50:03,400 And the easy way to solve for birth-death Markov chains 810 00:50:03,400 --> 00:50:10,490 is to say intuitively that between any two adjacent 811 00:50:10,490 --> 00:50:14,370 states, the number of times you go up has to equal the 812 00:50:14,370 --> 00:50:16,980 number of times you go down, plus or minus 1. 813 00:50:16,980 --> 00:50:21,000 If you start out here and you end up here, you're going this 814 00:50:21,000 --> 00:50:25,680 way one more time than you've gone that way and vice versa. 815 00:50:25,680 --> 00:50:29,660 And combining that with the steady state equations that we 816 00:50:29,660 --> 00:50:34,600 now have been talking about, it must be that the steady 817 00:50:34,600 --> 00:50:37,900 state probability pi sub i times P sub i-- 818 00:50:37,900 --> 00:50:42,000 pi sub 2 times P sub 2 here is the probability of going from 819 00:50:42,000 --> 00:50:43,510 state 2 to state 3. 820 00:50:43,510 --> 00:50:46,850 It's the probability of being in state 2 and making a 821 00:50:46,850 --> 00:50:49,760 transition to state 3. 822 00:50:49,760 --> 00:50:53,820 This probability here is the probability of being in state 823 00:50:53,820 --> 00:50:59,580 3 and going to state 2. 824 00:50:59,580 --> 00:51:02,540 And we're saying that asymptotically, as you look 825 00:51:02,540 --> 00:51:05,330 over an infinite number of transitions, those two have to 826 00:51:05,330 --> 00:51:07,040 be the same. 827 00:51:07,040 --> 00:51:10,190 The other way to do it, if you like algebra, is to start out 828 00:51:10,190 --> 00:51:11,750 with the steady state equations. 829 00:51:11,750 --> 00:51:14,750 And you can derive this right away. 830 00:51:14,750 --> 00:51:16,710 I think it's nicer to see intuitively 831 00:51:16,710 --> 00:51:19,100 why it has to be true. 832 00:51:19,100 --> 00:51:26,850 And what that says is, if rho sub i is equal to P sub i over 833 00:51:26,850 --> 00:51:34,200 Q sub i plus 1, P sub i is the up transition probability. 834 00:51:34,200 --> 00:51:37,920 Q sub i is the down transition probability. 835 00:51:37,920 --> 00:51:45,330 Rho sub i is then the ratio of the two adjacent state probabilities, pi sub i plus 1 over pi sub i. 836 00:51:45,330 --> 00:51:50,330 And that's equal to this equation here. 837 00:51:50,330 --> 00:51:52,830 That's just how to calculate these things. 838 00:51:52,830 --> 00:51:54,080 And you've done that. 839 00:51:56,800 --> 00:51:59,185 Let's go on to Markov processes. 840 00:52:02,250 --> 00:52:04,990 I have no idea where I'm going to finish up. 841 00:52:04,990 --> 00:52:08,310 I had a lot to do. 842 00:52:08,310 --> 00:52:11,100 I better not waste too much time. 843 00:52:11,100 --> 00:52:13,470 Remember what a Markov process is now. 844 00:52:16,650 --> 00:52:20,910 At least the way we started out thinking about it, it's a 845 00:52:20,910 --> 00:52:24,080 Markov chain along with a holding time 846 00:52:24,080 --> 00:52:26,420 in each state of the Markov chain. 847 00:52:26,420 --> 00:52:30,130 And the holding times are exponential, to be a countable 848 00:52:30,130 --> 00:52:32,450 state Markov process. 849 00:52:32,450 --> 00:52:36,300 So we can visualize it as a sequence of 850 00:52:36,300 --> 00:52:40,480 states, X0, X1, X2, X3. 851 00:52:40,480 --> 00:52:45,240 And a sequence of holding times, U1, U2, U3, U4. 852 00:52:45,240 --> 00:52:47,760 These are all random variables.
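Before going further with the Markov process picture, here is a quick sketch of the birth-death calculation from a moment ago, with an invented constant ratio rho sub i = P sub i over Q sub i plus 1: each pi sub i plus 1 is rho times pi sub i, so the steady state probabilities are products of the rho's, normalized to add up to 1.

    import numpy as np

    # Hypothetical birth-death chain with constant ratio rho = p_i / q_{i+1},
    # truncated at K states for the sketch.
    rho, K = 0.8, 20

    pi = np.cumprod(np.r_[1.0, rho * np.ones(K - 1)])  # pi_{i+1} = rho * pi_i
    pi /= pi.sum()                                      # the pi_j's add up to 1
    print(pi[:5])                                       # geometrically decreasing

Now, back to the Markov process and its dependence diagram.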
853 00:52:47,760 --> 00:52:51,110 And this kind of dependence diagram says what random 854 00:52:51,110 --> 00:52:54,790 variables depend on what random variables. 855 00:52:54,790 --> 00:52:59,790 U1, given X0, is independent of the rest of the world. 856 00:52:59,790 --> 00:53:02,960 U2, given X1, is independent of the rest of the 857 00:53:02,960 --> 00:53:05,930 world, and so forth. 858 00:53:05,930 --> 00:53:11,060 And if you look at this graph here and you visualize the 859 00:53:11,060 --> 00:53:13,740 fact that because of Bayes' rule, you could go 860 00:53:13,740 --> 00:53:16,490 both ways on this. 861 00:53:16,490 --> 00:53:22,750 In other words, if this, given this, is independent of 862 00:53:22,750 --> 00:53:26,240 everything else, we can go through the 863 00:53:26,240 --> 00:53:28,760 same kind of argument. 864 00:53:28,760 --> 00:53:34,850 And we can make these arrows go the opposite way. 865 00:53:34,850 --> 00:53:39,320 And we can say, if we just consider these states here, we 866 00:53:39,320 --> 00:53:46,480 can say that, given X3, U4 is independent of X2 and also 867 00:53:46,480 --> 00:53:52,130 independent of U3 and X1 and U2 and so forth. 868 00:53:52,130 --> 00:53:55,520 So if you look at the dependence graph of a Markov 869 00:53:55,520 --> 00:54:01,230 chain, which is which states depend on which other states, 870 00:54:01,230 --> 00:54:03,870 those arrows there that we have, which make it easier to 871 00:54:03,870 --> 00:54:06,090 see what's going on, you can take them off. 872 00:54:06,090 --> 00:54:10,210 You can redraw them in any way you want to and look at the 873 00:54:10,210 --> 00:54:15,080 dependencies in the opposite way. 874 00:54:15,080 --> 00:54:25,610 Now to understand what the state is at any time t, 875 00:54:25,610 --> 00:54:28,650 there's an equation to do that. 876 00:54:28,650 --> 00:54:31,700 It's an equation that isn't much help. 877 00:54:31,700 --> 00:54:37,600 I think it's more help to look at this and to see from this 878 00:54:37,600 --> 00:54:38,950 what's going on. 879 00:54:38,950 --> 00:54:41,785 You start in some state, X sub 0. 880 00:54:44,670 --> 00:54:49,080 And starting in state X0 equals i, there's a holding time. 881 00:54:49,080 --> 00:54:51,680 The holding time is U1. 882 00:54:51,680 --> 00:54:54,870 And you stay in state i. 883 00:54:54,870 --> 00:54:58,230 And the time U1 is an exponential random variable 884 00:54:58,230 --> 00:55:00,000 with rate nu sub i. 885 00:55:00,000 --> 00:55:01,550 That's what this says. 886 00:55:01,550 --> 00:55:06,300 So at the end of that holding time, you go from state i to 887 00:55:06,300 --> 00:55:07,590 some other state. 888 00:55:07,590 --> 00:55:09,040 This is the state you go to. 889 00:55:09,040 --> 00:55:11,950 The state you go to is according to the Markov 890 00:55:11,950 --> 00:55:13,710 chain probabilities. 891 00:55:13,710 --> 00:55:17,180 And it's state j in this case. 892 00:55:17,180 --> 00:55:22,230 You stay in state j until the holding time U2, whose rate is a 893 00:55:22,230 --> 00:55:28,490 function of j, finishes up at this time and so forth. 894 00:55:28,490 --> 00:55:32,810 So if you want to look at what state you're in at a given 895 00:55:32,810 --> 00:55:37,060 time, namely pick a time here and say what's the state at 896 00:55:37,060 --> 00:55:39,760 this time, as a random variable. 897 00:55:39,760 --> 00:55:44,460 So what you have to do then is you have to climb your way up 898 00:55:44,460 --> 00:55:46,970 from here to there.
899 00:55:46,970 --> 00:55:55,150 And you have to talk about the value of S1, S2, and S3. 900 00:55:55,150 --> 00:55:58,180 And those are built from exponential random variables. 901 00:55:58,180 --> 00:56:01,230 But they're exponential random variables whose rates depend on the 902 00:56:01,230 --> 00:56:02,800 state that you're in. 903 00:56:02,800 --> 00:56:06,480 So as you're climbing your way up and looking at this sample 904 00:56:06,480 --> 00:56:11,070 function of the process, you have to look at U1 and X0. 905 00:56:11,070 --> 00:56:15,740 X0 defines what U1 is, as a random variable. 906 00:56:15,740 --> 00:56:18,610 It says that U1 is an exponential random variable, 907 00:56:18,610 --> 00:56:21,190 with rate nu sub i. 908 00:56:21,190 --> 00:56:25,050 So you get to here, then you have some holding time here, 909 00:56:25,050 --> 00:56:29,940 which is a function of j and so forth, the whole way up. 910 00:56:29,940 --> 00:56:34,480 Which is why I said that an equation for X of t, in terms 911 00:56:34,480 --> 00:56:37,910 of these S's is not going to help you a great deal. 912 00:56:37,910 --> 00:56:41,340 Understanding how the process is working I think 913 00:56:41,340 --> 00:56:44,770 helps you a lot more. 914 00:56:44,770 --> 00:56:47,650 We said that there were three ways to represent a Markov 915 00:56:47,650 --> 00:56:55,350 process, which I'm giving here in terms 916 00:56:55,350 --> 00:56:57,630 just of Markov chains. 917 00:56:57,630 --> 00:56:59,410 The first one-- 918 00:56:59,410 --> 00:57:02,430 and the fact that these are all for M/M/1 doesn't make any 919 00:57:02,430 --> 00:57:03,050 difference. 920 00:57:03,050 --> 00:57:06,890 It's just these three general representations. 921 00:57:06,890 --> 00:57:11,970 One of them is, you look at it in terms of the embedded 922 00:57:11,970 --> 00:57:13,220 Markov chain. 923 00:57:24,030 --> 00:57:26,990 For this embedded Markov chain, the transition 924 00:57:26,990 --> 00:57:31,290 probabilities, when you're in state 0 in an M/M/1 queue, 925 00:57:31,290 --> 00:57:33,470 what's the next state you go to? 926 00:57:33,470 --> 00:57:37,140 Well the only state you can go to is state 1. 927 00:57:37,140 --> 00:57:40,110 Because we don't have any self transitions. 928 00:57:40,110 --> 00:57:42,050 So you go up to state 1 eventually. 929 00:57:42,050 --> 00:57:45,930 From state 1, you can go that way, with probability mu over 930 00:57:45,930 --> 00:57:47,360 lambda plus mu. 931 00:57:47,360 --> 00:57:51,210 Or you can go this way, with probability lambda over lambda 932 00:57:51,210 --> 00:57:56,120 plus mu, and so forth the whole way out. 933 00:57:56,120 --> 00:58:01,140 The next way of describing it, which is almost the same, is 934 00:58:01,140 --> 00:58:05,020 instead of using the transition probabilities and 935 00:58:05,020 --> 00:58:08,400 the embedded chain, you look directly at the transition 936 00:58:08,400 --> 00:58:11,500 rates for the Poisson process. 937 00:58:11,500 --> 00:58:15,580 Meaning the transition rates are the nu sub i's associated 938 00:58:15,580 --> 00:58:16,920 with the different states. 939 00:58:16,920 --> 00:58:20,540 When you get in state i, the amount of time you spend in 940 00:58:20,540 --> 00:58:24,810 state i is an exponential random variable. 941 00:58:24,810 --> 00:58:27,800 And when you make a transition, you're either 942 00:58:27,800 --> 00:58:32,020 going to go to one state or another state, in this case. 943 00:58:32,020 --> 00:58:36,380 In general, you might go to any one of a number of states.
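Putting the embedded-chain and transition-rate descriptions together, here is a sketch that generates a sample path of the M/M/1 process; the rates lam and mu are arbitrary illustration values, not anything fixed by the lecture. In state i you hold for an exponential time with the total rate nu sub i, and then you choose the next state from the embedded chain probabilities, independently of that holding time.

    import numpy as np

    rng = np.random.default_rng(1)
    lam, mu = 1.0, 2.0                  # illustrative arrival and service rates

    t, state = 0.0, 0
    path = [(0.0, 0)]
    for _ in range(20):
        nu = lam if state == 0 else lam + mu   # total rate nu_i of leaving state i
        t += rng.exponential(1.0 / nu)         # holding time, exponential with rate nu_i
        if state == 0:
            state = 1                          # from state 0 the only move is up
        else:
            # The direction is chosen independently of the holding time:
            # up with probability lam/(lam+mu), down with probability mu/(lam+mu).
            state += 1 if rng.random() < lam / (lam + mu) else -1
        path.append((t, state))

    print(path[:5])                     # (transition epoch S_n, state) pairs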
944 00:58:36,380 --> 00:58:47,650 Now if I tell you that we start out in state 1 and the next 945 00:58:47,650 --> 00:58:53,430 state we go to is state 2, now I ask you, what's the expected 946 00:58:53,430 --> 00:58:56,140 amount of time that that transition took? 947 00:58:56,140 --> 00:58:57,390 What's the answer? 948 00:59:00,210 --> 00:59:03,620 Is it q sub 12, or is it nu sub 1? 949 00:59:10,550 --> 00:59:13,804 Anybody awake out there? 950 00:59:13,804 --> 00:59:16,230 AUDIENCE: Sir, could you repeat the question? 951 00:59:16,230 --> 00:59:17,010 PROFESSOR: Yes. 952 00:59:17,010 --> 00:59:20,600 The question is, we started out in state 1. 953 00:59:20,600 --> 00:59:25,730 Given that we started out in state 1 and given that the 954 00:59:25,730 --> 00:59:30,770 next state is state 2, what's the amount of time that it 955 00:59:30,770 --> 00:59:32,930 takes to go from 1 to 2? 956 00:59:32,930 --> 00:59:34,850 It's an exponential random variable. 957 00:59:34,850 --> 00:59:37,722 What's the rate of that random variable? 958 00:59:37,722 --> 00:59:39,210 AUDIENCE: Lambda plus mu. 959 00:59:39,210 --> 00:59:39,706 PROFESSOR: What? 960 00:59:39,706 --> 00:59:41,330 AUDIENCE: Lambda plus mu. 961 00:59:41,330 --> 00:59:42,945 PROFESSOR: Lambda plus mu? 962 00:59:42,945 --> 00:59:44,610 Yes. 963 00:59:44,610 --> 00:59:48,990 Lambda plus mu in the case of the M/M/1 queue. 964 00:59:48,990 --> 00:59:53,860 If you have an arbitrary chain, the amount of time 965 00:59:53,860 --> 01:00:00,260 that it takes is exponential with rate nu sub i. This is just back to this old 966 01:00:00,260 --> 01:00:01,540 thing about splitting and 967 01:00:01,540 --> 01:00:03,110 combining of Poisson processes. 968 01:00:05,860 --> 01:00:10,130 When you have a combined Poisson process, which is what 969 01:00:10,130 --> 01:00:13,980 you have here, when you're in state i, there's a combined 970 01:00:13,980 --> 01:00:18,940 Poisson process, which is running, which says you go 971 01:00:18,940 --> 01:00:20,600 right with probability lambda over lambda plus mu, 972 01:00:20,600 --> 01:00:22,940 and you go left with probability mu over lambda plus mu, 973 01:00:22,940 --> 01:00:26,120 for an M/M/1 queue. 974 01:00:26,120 --> 01:00:32,640 And you can look at it in terms of, first, you see what 975 01:00:32,640 --> 01:00:34,490 the next state is. 976 01:00:34,490 --> 01:00:37,500 And then you ask how long did it take to get there? 977 01:00:37,500 --> 01:00:40,590 Or you look at it in terms of how long does it take to make a 978 01:00:40,590 --> 01:00:43,910 transition and then which state did you go to? 979 01:00:43,910 --> 01:00:46,820 And with these combined Poisson processes, those two 980 01:00:46,820 --> 01:00:50,550 questions are independent of each other. 981 01:00:50,550 --> 01:00:53,990 And if there's one thing you remember from all of this, 982 01:00:53,990 --> 01:00:55,150 please remember that. 983 01:00:55,150 --> 01:01:00,420 Because it's something that you use in almost every 984 01:01:00,420 --> 01:01:03,010 problem that you do with Markov 985 01:01:03,010 --> 01:01:04,990 chains and Markov processes. 986 01:01:04,990 --> 01:01:07,730 It just comes up all the time. 987 01:01:07,730 --> 01:01:15,710 This final version here is looking at the same Markov 988 01:01:15,710 --> 01:01:23,090 process, but looking at it in sample time instead of looking 989 01:01:23,090 --> 01:01:24,820 at the embedded chain. 990 01:01:24,820 --> 01:01:28,110 Now the important thing here is, when you look at it in 991 01:01:28,110 --> 01:01:32,340 sample time, you might not be able to do this.
992 01:01:32,340 --> 01:01:40,010 Because with an arbitrary countable state Markov chain, 993 01:01:40,010 --> 01:01:42,950 you might not be able to define these self-loop 994 01:01:42,950 --> 01:01:44,450 transition probabilities. 995 01:01:44,450 --> 01:01:47,220 Because these rates might get too large. 996 01:01:47,220 --> 01:01:49,700 But for the M/M/1 queue, you can do it. 997 01:01:49,700 --> 01:01:53,500 The important thing is that the steady state probabilities 998 01:01:53,500 --> 01:01:57,780 you find for these states are not the same as the steady 999 01:01:57,780 --> 01:02:01,300 state probabilities you find for the embedded Markov chain. 1000 01:02:01,300 --> 01:02:04,930 They are in fact the same as the steady state probabilities 1001 01:02:04,930 --> 01:02:07,330 for the Markov process itself. 1002 01:02:07,330 --> 01:02:11,880 That is, these steady state probabilities are the fraction 1003 01:02:11,880 --> 01:02:14,980 of time that you spend in state j. 1004 01:02:14,980 --> 01:02:18,580 And this is a sample-time Markov process. 1005 01:02:18,580 --> 01:02:22,570 It is the same fraction of time you spend in state j. 1006 01:02:22,570 --> 01:02:25,040 Here you have this embedded chain. 1007 01:02:25,040 --> 01:02:28,250 And for example, in the embedded chain, the only place 1008 01:02:28,250 --> 01:02:32,190 you go from state 0 is state 1. 1009 01:02:32,190 --> 01:02:35,120 Here from state 0, you can stay in state 1010 01:02:35,120 --> 01:02:36,680 0 for a long time. 1011 01:02:36,680 --> 01:02:39,400 Because here the increments of time are constant. 1012 01:02:44,060 --> 01:02:47,530 We can look at delayed renewal reward theorems for the 1013 01:02:47,530 --> 01:02:52,610 renewal process to see what's going on here, for the 1014 01:02:52,610 --> 01:02:56,260 fraction of time we spend in state j. 1015 01:02:56,260 --> 01:02:58,570 We look at that picture up there. 1016 01:02:58,570 --> 01:03:01,580 We start out in state j, for example. 1017 01:03:01,580 --> 01:03:04,930 Same as the renewal reward process that we had for a 1018 01:03:04,930 --> 01:03:07,280 Markov chain. 1019 01:03:07,280 --> 01:03:10,020 We got a reward of 1 for the amount of time that we 1020 01:03:10,020 --> 01:03:11,800 stay in state j. 1021 01:03:11,800 --> 01:03:14,810 After that, we're wandering around in the wilderness. 1022 01:03:14,810 --> 01:03:17,580 We finally come back to state j again. 1023 01:03:17,580 --> 01:03:21,230 We get 1 unit of reward times the amount of 1024 01:03:21,230 --> 01:03:22,560 time we spend here. 1025 01:03:22,560 --> 01:03:26,580 In other words, we're accumulating reward at a rate 1026 01:03:26,580 --> 01:03:30,540 of 1 unit per unit time, up to there. 1027 01:03:30,540 --> 01:03:36,690 So the average reward we get per unit time is the expected 1028 01:03:36,690 --> 01:03:44,790 value of U of j, which is 1 over nu sub j, divided by 1029 01:03:44,790 --> 01:03:49,320 the expected interrenewal time, from one 1030 01:03:49,320 --> 01:03:52,230 renewal to the next. 1031 01:03:52,230 --> 01:03:56,710 Which tells us that the fraction of time we spend in 1032 01:03:56,710 --> 01:04:02,340 state j is equal to the fraction of transitions that 1033 01:04:02,340 --> 01:04:06,480 go to state j, divided by the rate at which we leave state 1034 01:04:06,480 --> 01:04:09,970 j, times the expected number of overall 1035 01:04:09,970 --> 01:04:13,220 transitions per unit time. 1036 01:04:13,220 --> 01:04:15,330 This is an important result.
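As a sketch of that conversion, with invented numbers: given the embedded-chain probabilities pi sub j and the leaving rates nu sub j, the fraction of time spent in state j is pi sub j over nu sub j, normalized.

    import numpy as np

    # Invented embedded-chain steady state and leaving rates, for illustration.
    pi = np.array([0.2, 0.3, 0.5])    # fraction of transitions into each state
    nu = np.array([1.0, 2.0, 4.0])    # rate of leaving each state

    p = (pi / nu) / np.sum(pi / nu)   # fraction of *time* spent in each state
    M_bar = 1.0 / np.sum(pi / nu)     # expected transitions per unit time
    print(p)
    print(M_bar, np.dot(p, nu))       # M-bar also equals the sum of p_i nu_i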
1037 01:04:15,330 --> 01:04:18,700 Because depending on what M sub i is, depending on what 1038 01:04:18,700 --> 01:04:23,360 the number of transitions per unit time is, it really tells 1039 01:04:23,360 --> 01:04:24,580 you what's going on. 1040 01:04:24,580 --> 01:04:28,140 Because all of these bizarre Markov processes that we've 1041 01:04:28,140 --> 01:04:33,520 looked at are bizarre because of the way that this behaves. 1042 01:04:33,520 --> 01:04:35,405 This can be infinite or can be 0. 1043 01:04:47,080 --> 01:04:54,210 At this point, we've been talking about the expected 1044 01:04:54,210 --> 01:05:01,100 number of transitions per unit time as a random variable, as 1045 01:05:01,100 --> 01:05:03,770 a limit with probability 1, given that we 1046 01:05:03,770 --> 01:05:05,790 start in state i. 1047 01:05:05,790 --> 01:05:10,270 And suddenly, we see that it doesn't depend on i at all. 1048 01:05:10,270 --> 01:05:14,150 So there is some number, M bar, which is the expected 1049 01:05:14,150 --> 01:05:17,900 number of transitions per unit time, which is independent of 1050 01:05:17,900 --> 01:05:19,210 what state we started in. 1051 01:05:19,210 --> 01:05:26,970 We call that M bar instead of M sub i. And that's this 1052 01:05:26,970 --> 01:05:29,300 quantity here. 1053 01:05:29,300 --> 01:05:38,600 And what we get from that is the fraction of time we 1054 01:05:38,600 --> 01:05:44,330 spend in state j is proportional to pi 1055 01:05:44,330 --> 01:05:46,310 sub j over nu sub j. 1056 01:05:46,310 --> 01:05:50,250 But since it has to add up to 1, we have to divide it by 1057 01:05:50,250 --> 01:05:52,080 this quantity here. 1058 01:05:52,080 --> 01:05:56,330 And this quantity here is one over-- 1059 01:05:56,330 --> 01:06:00,325 this is the expected number of transitions per unit time. 1060 01:06:03,190 --> 01:06:09,710 And if we try to get the pi sub j's from the P sub j's, the 1061 01:06:09,710 --> 01:06:13,440 corresponding thing, as we find out, is that the expected number of 1062 01:06:13,440 --> 01:06:18,300 transitions per unit time is a sum over i of P sub i 1063 01:06:18,300 --> 01:06:19,330 times nu sub i. 1064 01:06:19,330 --> 01:06:23,640 You can play all sorts of games with these equations. 1065 01:06:23,640 --> 01:06:27,685 And when you do so, all of those things become evident. 1066 01:06:44,010 --> 01:06:49,460 I would advise you to just cross this equation out. 1067 01:06:49,460 --> 01:06:51,380 I don't know where it came from. 1068 01:06:51,380 --> 01:06:54,300 But it doesn't mean anything. 1069 01:06:57,780 --> 01:07:02,700 We spent a lot of time talking about what happens when the 1070 01:07:02,700 --> 01:07:06,770 expected number of transitions per unit time 1071 01:07:06,770 --> 01:07:10,020 is either 0 or infinity. 1072 01:07:10,020 --> 01:07:15,870 We had this case we looked at of an M/M/1 type queue, where 1073 01:07:15,870 --> 01:07:19,150 the server got rattled as time went on. 1074 01:07:19,150 --> 01:07:21,010 And the server got rattled with more and 1075 01:07:21,010 --> 01:07:22,700 more customers waiting. 1076 01:07:22,700 --> 01:07:25,590 The customers got discouraged and didn't come in. 1077 01:07:25,590 --> 01:07:31,090 So we had a process where the longer the queue got, the 1078 01:07:31,090 --> 01:07:33,965 longer time it took for anything to happen. 1079 01:07:41,600 --> 01:07:46,130 So that as far as the embedded Markov chain went, 1080 01:07:46,130 --> 01:07:47,550 everything was fine.
1081 01:07:47,550 --> 01:07:52,010 But when we looked at the process itself, the time that 1082 01:07:52,010 --> 01:07:55,140 it took in each of these higher-order states was so 1083 01:07:55,140 --> 01:08:00,360 large that, as a process, it didn't make any sense. 1084 01:08:00,360 --> 01:08:02,330 So the P sub i's were all 0. 1085 01:08:02,330 --> 01:08:04,140 The pi sub i's all looked fine. 1086 01:08:06,640 --> 01:08:10,300 And there's the other kind of case, where the expected number of 1087 01:08:10,300 --> 01:08:15,070 transitions per unit time becomes infinite. 1088 01:08:15,070 --> 01:08:18,080 And that's just the opposite kind of case, where, when you 1089 01:08:18,080 --> 01:08:21,170 get to the higher-order states, things start happening 1090 01:08:21,170 --> 01:08:22,510 very, very fast. 1091 01:08:22,510 --> 01:08:26,310 The higher-order state you go to, the faster the 1092 01:08:26,310 --> 01:08:28,520 transitions occur. 1093 01:08:28,520 --> 01:08:29,770 It's like a small child. 1094 01:08:32,810 --> 01:08:35,890 I mean, the more excited the small child gets, the faster 1095 01:08:35,890 --> 01:08:37,040 things happen. 1096 01:08:37,040 --> 01:08:38,670 And the faster things happen, the more 1097 01:08:38,670 --> 01:08:40,050 excited the child gets. 1098 01:08:40,050 --> 01:08:43,450 So pretty soon things are happening so fast, the child 1099 01:08:43,450 --> 01:08:44,790 just collapses. 1100 01:08:44,790 --> 01:08:47,330 And if you're lucky, the child sleeps. 1101 01:08:47,330 --> 01:08:49,689 So you can think of it that way. 1102 01:08:52,279 --> 01:08:53,529 We talked about reversibility. 1103 01:08:58,990 --> 01:09:03,350 And reversibility for Markov processes I think is somewhat 1104 01:09:03,350 --> 01:09:05,170 easier to see than 1105 01:09:05,170 --> 01:09:07,115 reversibility for Markov chains. 1106 01:09:12,790 --> 01:09:15,660 If you're dealing with a Markov process, we're sitting 1107 01:09:15,660 --> 01:09:17,790 in state i for a while. 1108 01:09:17,790 --> 01:09:20,380 At some time we make a transition. 1109 01:09:20,380 --> 01:09:21,529 We go to state j. 1110 01:09:21,529 --> 01:09:23,689 We sit there for a long time. 1111 01:09:23,689 --> 01:09:26,819 Then we go to state k and so forth. 1112 01:09:26,819 --> 01:09:30,210 If we try to look at this process coming back the other 1113 01:09:30,210 --> 01:09:34,740 way, we see that we're in state k. 1114 01:09:34,740 --> 01:09:37,930 At a certain point, we had a transition. 1115 01:09:37,930 --> 01:09:40,779 We had a transition into state j. 1116 01:09:40,779 --> 01:09:42,550 And how long does it take before that 1117 01:09:42,550 --> 01:09:43,819 transition is over? 1118 01:09:46,319 --> 01:09:49,340 We're in state j, so the amount of time that it takes 1119 01:09:49,340 --> 01:09:52,510 is an exponentially distributed random variable. 1120 01:09:52,510 --> 01:09:54,610 And it's exponentially distributed with the same 1121 01:09:54,610 --> 01:09:58,360 rate, whether we're coming in this way or whether 1122 01:09:58,360 --> 01:10:00,160 we're coming in this way. 1123 01:10:00,160 --> 01:10:02,560 And that's the notion of reversibility. 1124 01:10:02,560 --> 01:10:05,840 It doesn't make any difference whether you look at it from 1125 01:10:05,840 --> 01:10:10,380 right to left or from left to right.
1126 01:10:10,380 --> 01:10:16,620 And in this kind of situation, if you find the steady state 1127 01:10:16,620 --> 01:10:23,120 probabilities for these transitions or you find the 1128 01:10:23,120 --> 01:10:29,180 steady state fraction of time you spend in each state. 1129 01:10:29,180 --> 01:10:32,920 I mean, we just showed that if you look at this process going 1130 01:10:32,920 --> 01:10:35,630 backwards, if you define all the probabilities coming 1131 01:10:35,630 --> 01:10:40,700 backwards, the expected amount of time that you spend in 1132 01:10:40,700 --> 01:10:44,650 state i or the rate for leaving state i is the same 1133 01:10:44,650 --> 01:10:46,010 from right to left as from left to right. 1134 01:10:46,010 --> 01:10:49,110 And a slightly more complicated argument says the 1135 01:10:49,110 --> 01:10:52,140 P sub i's are the same going right to left. 1136 01:10:52,140 --> 01:10:55,400 And the fraction of time you spend in each state is 1137 01:10:55,400 --> 01:10:57,830 obviously the same going from right to left as 1138 01:10:57,830 --> 01:10:59,490 these limits occur. 1139 01:10:59,490 --> 01:11:05,980 So that gives you all these bizarre conditions for 1140 01:11:05,980 --> 01:11:10,570 queuing, which are very useful. 1141 01:11:15,220 --> 01:11:20,080 I'm not going to say any more about that except 1142 01:11:20,080 --> 01:11:23,280 the guessing theorem. 1143 01:11:23,280 --> 01:11:26,940 The guessing theorem says suppose a Markov process is 1144 01:11:26,940 --> 01:11:28,980 irreducible. 1145 01:11:28,980 --> 01:11:30,690 You can check pretty easily whether it's 1146 01:11:30,690 --> 01:11:31,830 irreducible or not. 1147 01:11:31,830 --> 01:11:33,910 You can't necessarily check very easily 1148 01:11:33,910 --> 01:11:36,170 whether it's recurrent. 1149 01:11:38,700 --> 01:11:42,160 And suppose P sub i is a set of probabilities that 1150 01:11:42,160 --> 01:11:48,530 satisfies P sub i times Q sub ij equals P sub 1151 01:11:48,530 --> 01:11:50,830 j times Q sub ji. 1152 01:11:50,830 --> 01:11:56,520 In other words, this is the probability of being in state 1153 01:11:56,520 --> 01:12:00,690 i, and the next transition is to state j. 1154 01:12:00,690 --> 01:12:03,640 This is the probability of being in state j, and the next 1155 01:12:03,640 --> 01:12:05,600 transition is to state i. 1156 01:12:05,600 --> 01:12:10,500 This says that if you can find a set of probabilities which 1157 01:12:10,500 --> 01:12:14,740 satisfy these equations, and if they also satisfy this 1158 01:12:14,740 --> 01:12:20,640 condition, the sum of P sub i times nu sub i less than infinity, then P sub 1159 01:12:20,640 --> 01:12:23,030 i is greater than 0 for all i. 1160 01:12:23,030 --> 01:12:26,430 P sub i is the steady state time-average probability of state i. 1161 01:12:26,430 --> 01:12:28,340 The process is reversible. 1162 01:12:28,340 --> 01:12:31,930 And the embedded chain is positive recurrent. 1163 01:12:31,930 --> 01:12:34,580 So all you have to do is solve those equations. 1164 01:12:34,580 --> 01:12:37,760 And if you can solve those equations, you're done. 1165 01:12:40,410 --> 01:12:43,120 Everything is fine. 1166 01:12:43,120 --> 01:12:45,680 You don't have to know anything about reversibility 1167 01:12:45,680 --> 01:12:48,330 or renewal theory or anything else. 1168 01:12:48,330 --> 01:12:51,210 If you have that theorem, you just 1169 01:12:51,210 --> 01:12:53,300 solve for those equations.
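Here is a sketch of that recipe on an M/M/1-type birth-death process; the rates are invented and the chain is truncated so the check is finite. Guess P sub i proportional to (lambda over mu) to the i, then verify both conditions of the theorem.

    import numpy as np

    lam, mu, K = 1.0, 2.0, 30          # illustrative birth and death rates

    # Guess: p_i proportional to (lam/mu)^i, normalized.
    p = np.cumprod(np.r_[1.0, (lam / mu) * np.ones(K - 1)])
    p /= p.sum()

    # Condition 1: p_i q_{i,i+1} = p_{i+1} q_{i+1,i} for every i.
    print(np.allclose(p[:-1] * lam, p[1:] * mu))        # True

    # Condition 2: sum of p_i nu_i is finite
    # (nu_0 = lam, and nu_i = lam + mu for i >= 1).
    nu = np.r_[lam, (lam + mu) * np.ones(K - 1)]
    print(p @ nu)                                       # a finite number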
1170 01:12:53,300 --> 01:12:57,710 Solve these equations by guessing what the solution is, 1171 01:12:57,710 --> 01:13:00,430 and then you in fact have a reversible process. 1172 01:13:06,690 --> 01:13:10,530 So the useful application of this is that all birth-death 1173 01:13:10,530 --> 01:13:15,890 processes are reversible if this equation is satisfied. 1174 01:13:15,890 --> 01:13:19,330 And you can immediately find the steady state 1175 01:13:19,330 --> 01:13:20,580 probabilities of them. 1176 01:13:23,050 --> 01:13:25,276 I'm not going to have much time for random walks. 1177 01:13:28,680 --> 01:13:29,880 But random walks are what we've been 1178 01:13:29,880 --> 01:13:31,490 talking about all term. 1179 01:13:31,490 --> 01:13:34,500 We just didn't call them random walks until we got to 1180 01:13:34,500 --> 01:13:36,180 the seventh chapter. 1181 01:13:36,180 --> 01:13:41,140 But a random walk is a sequence of random variables, 1182 01:13:41,140 --> 01:13:47,490 where each Sn in the sequence is a sum of some number of 1183 01:13:47,490 --> 01:13:52,330 underlying IID random variables, X1 up to X sub n. 1184 01:13:52,330 --> 01:13:56,780 Well, we're interested in exponential bounds on S sub n 1185 01:13:56,780 --> 01:13:57,600 for large n. 1186 01:13:57,600 --> 01:13:59,910 These are known as Chernoff bounds. 1187 01:13:59,910 --> 01:14:03,560 We talked about them back in chapter one. 1188 01:14:03,560 --> 01:14:05,320 I'm not going to mention them again now. 1189 01:14:05,320 --> 01:14:07,860 We're interested in threshold crossings. 1190 01:14:07,860 --> 01:14:11,460 If you have two thresholds, one positive threshold, one 1191 01:14:11,460 --> 01:14:16,120 negative threshold, you would like to know what's the 1192 01:14:16,120 --> 01:14:20,810 stopping time when S sub n first crosses alpha? 1193 01:14:20,810 --> 01:14:23,960 Or what's the stopping time when it first crosses beta? 1194 01:14:23,960 --> 01:14:28,210 What's the probability of crossing alpha before you 1195 01:14:28,210 --> 01:14:30,930 cross beta or vice versa? 1196 01:14:30,930 --> 01:14:33,760 And what's the distribution of the overshoot, when you pass 1197 01:14:33,760 --> 01:14:34,760 one of them? 1198 01:14:34,760 --> 01:14:37,120 So there are all those questions. 1199 01:14:37,120 --> 01:14:40,890 We pretty much talked about the first two. 1200 01:14:40,890 --> 01:14:45,480 The question of overshoot, I think I mentioned this. 1201 01:14:45,480 --> 01:14:48,460 The text doesn't say much about it. 1202 01:14:48,460 --> 01:14:52,090 Overshoot is just a nasty, nasty problem. 1203 01:14:52,090 --> 01:14:55,250 If you ever have to find the overshoot of something, go 1204 01:14:55,250 --> 01:14:59,570 look for a computer program to simulate it or something. 1205 01:14:59,570 --> 01:15:02,760 You're not going to solve the problem very easily. 1206 01:15:02,760 --> 01:15:08,030 Feller is the only book I know which does a reasonable job of 1207 01:15:08,030 --> 01:15:09,720 trying to solve this. 1208 01:15:09,720 --> 01:15:13,170 And you have to be extraordinarily patient. 1209 01:15:13,170 --> 01:15:17,030 I mean Feller does everything in the nicest possible way. 1210 01:15:17,030 --> 01:15:19,330 Or at least he always seems to do everything in the nicest 1211 01:15:19,330 --> 01:15:20,810 possible way. 1212 01:15:20,810 --> 01:15:23,970 Most textbooks you look at, after you understand the 1213 01:15:23,970 --> 01:15:27,110 subject, you look at and you say, oh, he should have done 1214 01:15:27,110 --> 01:15:28,570 it this way.
1215 01:15:28,570 --> 01:15:31,910 I've never had that experience with Feller at all. 1216 01:15:31,910 --> 01:15:33,530 Always, I look at it. 1217 01:15:33,530 --> 01:15:35,370 I say, oh, there's an easier way to do it. 1218 01:15:35,370 --> 01:15:36,990 I try to do it the easier way. 1219 01:15:36,990 --> 01:15:38,800 And then I find something's wrong with it. 1220 01:15:38,800 --> 01:15:41,620 And then I go back and say, ah, I got to do it the way 1221 01:15:41,620 --> 01:15:43,530 Feller did it. 1222 01:15:43,530 --> 01:15:48,590 So if you're serious about this field and you don't have 1223 01:15:48,590 --> 01:15:51,730 a copy of this very old book, get it, 1224 01:15:51,730 --> 01:15:53,190 because it's solid gold. 1225 01:16:01,460 --> 01:16:06,450 Suppose a random variable has a moment generating function, 1226 01:16:06,450 --> 01:16:11,470 expected value of e to the rZ over some 1227 01:16:11,470 --> 01:16:13,180 positive region of r. 1228 01:16:13,180 --> 01:16:17,270 And suppose it has a mean which is negative. 1229 01:16:17,270 --> 01:16:22,570 The Chernoff bound says that for any alpha greater than 0 1230 01:16:22,570 --> 01:16:27,860 and any r in 0 to r plus, the probability that Z is greater 1231 01:16:27,860 --> 01:16:31,040 than or equal to alpha is less than or equal to 1232 01:16:31,040 --> 01:16:31,860 this quantity here. 1233 01:16:31,860 --> 01:16:33,370 You remember, we derived this. 1234 01:16:33,370 --> 01:16:36,730 The derivation is very simple. 1235 01:16:36,730 --> 01:16:39,660 It's an obvious result. 1236 01:16:39,660 --> 01:16:41,740 It's a little strange. 1237 01:16:41,740 --> 01:16:48,130 Because this says that for this random variable its 1238 01:16:48,130 --> 01:16:52,130 complementary distribution function has to go down as e 1239 01:16:52,130 --> 01:16:55,270 to the minus r alpha. 1240 01:16:55,270 --> 01:16:59,330 Now, not all random variables can go down exponentially as e to 1241 01:16:59,330 --> 01:17:01,260 the minus r alpha. 1242 01:17:01,260 --> 01:17:06,310 The reason for this is that these moment generating 1243 01:17:06,310 --> 01:17:09,050 functions don't exist for all r. 1244 01:17:09,050 --> 01:17:14,150 So what it's really saying is where it exists, it goes down 1245 01:17:14,150 --> 01:17:17,700 with alpha as e to the minus r alpha. 1246 01:17:17,700 --> 01:17:20,160 We then define the semi-invariant moment 1247 01:17:20,160 --> 01:17:21,920 generating function. 1248 01:17:21,920 --> 01:17:26,000 And then a more convenient way of stating the Chernoff bound 1249 01:17:26,000 --> 01:17:28,000 was in this way. 1250 01:17:28,000 --> 01:17:29,210 You look here. 1251 01:17:29,210 --> 01:17:35,800 And you say, for a fixed value of n here, this probability 1252 01:17:35,800 --> 01:17:39,320 that S sub n is greater than or equal to n times a, is something 1253 01:17:39,320 --> 01:17:42,540 which is going down exponentially with n. 1254 01:17:42,540 --> 01:17:45,680 And if you optimize over r, this bound is 1255 01:17:45,680 --> 01:17:47,340 exponentially tight. 1256 01:17:47,340 --> 01:17:54,110 In other words, if you try to replace this with anything 1257 01:17:54,110 --> 01:17:58,090 smaller, namely which goes down faster, then for large 1258 01:17:58,090 --> 01:18:01,090 enough n, the bound will be false. 1259 01:18:01,090 --> 01:18:04,870 So this is the tightest bound you can get when you 1260 01:18:04,870 --> 01:18:07,170 optimize it over r. 1261 01:18:07,170 --> 01:18:10,220 So it's exponential in n.
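As a sketch with an invented increment distribution — X is +1 with probability 0.4 and -1 otherwise, so the mean is negative — the optimized exponent can be computed numerically: gamma(r) is the log of the expected value of e to the rX, and the bound on the probability that S sub n exceeds n times a is exp(n times the minimum over r of gamma(r) minus r times a).

    import numpy as np

    p, a = 0.4, 0.2              # P(X = +1) = 0.4, so E[X] = -0.2; slope a > E[X]

    r = np.linspace(1e-3, 5.0, 100_000)
    gamma = np.log(p * np.exp(r) + (1 - p) * np.exp(-r))  # semi-invariant MGF
    exponent = np.min(gamma - r * a)                      # optimized over r > 0

    # P(S_n >= n*a) <= exp(n * exponent), and the exponent is negative.
    print(exponent)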
1262 01:18:10,220 --> 01:18:13,970 Mostly we wanted to use it for threshold crossings. 1263 01:18:13,970 --> 01:18:20,630 And for threshold crossings, we would like to look at it in 1264 01:18:20,630 --> 01:18:22,860 another way. 1265 01:18:22,860 --> 01:18:26,680 And we dealt with this graphically. 1266 01:18:26,680 --> 01:18:30,470 Probability of Sn greater than or equal to alpha. 1267 01:18:30,470 --> 01:18:33,240 Now what we want to do is hold alpha constant. 1268 01:18:33,240 --> 01:18:35,350 Alpha is some threshold up there. 1269 01:18:35,350 --> 01:18:39,060 We want to ask, what's the probability that after n 1270 01:18:39,060 --> 01:18:41,780 trials, we're sitting above alpha? 1271 01:18:41,780 --> 01:18:43,460 And we'd like to try to solve that for 1272 01:18:43,460 --> 01:18:45,580 different values of n. 1273 01:18:45,580 --> 01:18:50,560 The Chernoff bound, in this case, this quantity here is 1274 01:18:50,560 --> 01:18:52,470 this intercept here. 1275 01:18:52,470 --> 01:18:54,950 You take the semi-invariant moment 1276 01:18:54,950 --> 01:18:57,270 generating function, which is convex. 1277 01:18:57,270 --> 01:18:59,640 You draw this curve. 1278 01:18:59,640 --> 01:19:04,290 You take a tangent of slope alpha over n. 1279 01:19:04,290 --> 01:19:06,420 And you see where it hits here. 1280 01:19:06,420 --> 01:19:08,290 And this is the exponent that you have. 1281 01:19:08,290 --> 01:19:10,730 This is a negative exponent. 1282 01:19:10,730 --> 01:19:16,400 As you vary n, this tilts around on this curve. 1283 01:19:16,400 --> 01:19:19,520 And it comes in to this point. 1284 01:19:19,520 --> 01:19:22,230 It goes back out again. 1285 01:19:22,230 --> 01:19:24,670 That's what happens to it. 1286 01:19:24,670 --> 01:19:31,190 And that smallest exponent, as you vary n, is the most likely 1287 01:19:31,190 --> 01:19:34,680 time at which you're going to cross that threshold. 1288 01:19:34,680 --> 01:19:37,930 And what we found, from looking at 1289 01:19:37,930 --> 01:19:41,570 Wald's equality is that-- 1290 01:19:41,570 --> 01:19:46,000 let me go on, because we're running out of time. 1291 01:19:50,870 --> 01:19:54,390 Wald's identity for two thresholds says this. 1292 01:19:54,390 --> 01:19:58,930 And the corollary says, if the mean of the underlying random variable is 1293 01:19:58,930 --> 01:20:06,580 less than 0, and if r star is 1294 01:20:06,580 --> 01:20:10,260 the second solution of gamma of r equals 0. 1295 01:20:10,260 --> 01:20:12,000 You have this convex curve. 1296 01:20:12,000 --> 01:20:15,450 Gamma of 0 is always equal to 0. 1297 01:20:15,450 --> 01:20:19,380 There's some other value of r, for which gamma is equal to 0. 1298 01:20:19,380 --> 01:20:21,370 And that's r star. 1299 01:20:21,370 --> 01:20:25,080 And this says that the probability that we have 1300 01:20:25,080 --> 01:20:29,830 crossed alpha at time j, where j is the time of first 1301 01:20:29,830 --> 01:20:32,310 crossing, is less than or equal to e to the 1302 01:20:32,310 --> 01:20:34,270 minus alpha r star. 1303 01:20:34,270 --> 01:20:36,780 This bound is tight also. 1304 01:20:36,780 --> 01:20:38,460 And that's a very nice result. 1305 01:20:38,460 --> 01:20:42,830 Because that just says that all you've got to do is find r star. 1306 01:20:42,830 --> 01:20:45,810 And that tells you what the probability of crossing a 1307 01:20:45,810 --> 01:20:47,210 threshold is. 1308 01:20:47,210 --> 01:20:50,060 And it's a very tight bound if alpha is very large.
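A sketch of that corollary, with the same invented plus-or-minus-1 increments as before: for that distribution the second root of gamma(r) = 0 works out in closed form to r star = ln((1-p)/p), and the bound e to the minus alpha r star can be compared against a simulated crossing probability.

    import numpy as np

    rng = np.random.default_rng(2)
    p, alpha, beta = 0.4, 10.0, -100.0    # E[X] < 0; upper and lower thresholds

    # r* : second root of gamma(r) = log(p e^r + (1-p) e^-r) = 0,
    # which for +-1 steps is known in closed form.
    r_star = np.log((1 - p) / p)

    trials, crossings = 5_000, 0
    for _ in range(trials):
        s = 0.0
        while beta < s < alpha:           # walk until one threshold is crossed
            s += 1.0 if rng.random() < p else -1.0
        crossings += (s >= alpha)

    print("bound exp(-alpha r*):", np.exp(-alpha * r_star))
    print("simulated estimate:  ", crossings / trials)

With plus-or-minus-1 steps there is no overshoot, so in this particular sketch the bound is essentially met with equality, and it involves only alpha and r star.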
1309 01:20:50,060 --> 01:20:53,780 It doesn't make any difference what the negative threshold 1310 01:20:53,780 --> 01:20:56,230 is, or whether it's there or not. 1311 01:20:56,230 --> 01:20:59,660 This tells you the thing you want to know. 1312 01:20:59,660 --> 01:21:04,610 I think I'm going to stop at that point, because I have 1313 01:21:04,610 --> 01:21:08,330 been sort of rushing to get to this point. 1314 01:21:08,330 --> 01:21:11,210 And it doesn't do any good to keep rushing. 1315 01:21:11,210 --> 01:21:16,610 So thank you all for being around all term. 1316 01:21:16,610 --> 01:21:17,540 I appreciate it. 1317 01:21:17,540 --> 01:21:18,790 Thank you.