1
00:00:00,760 --> 00:00:03,420
Let us now go
through an example.

2
00:00:03,420 --> 00:00:05,700
Suppose that we have an
unknown random variable

3
00:00:05,700 --> 00:00:11,240
Theta that has a uniform
distribution between 4 and 10.

4
00:00:11,240 --> 00:00:13,850
We observe some
other random variable

5
00:00:13,850 --> 00:00:16,790
X that's related
to Theta according

6
00:00:16,790 --> 00:00:18,500
to the following model.

7
00:00:18,500 --> 00:00:22,120
This is the conditional
distribution of X given Theta.

8
00:00:22,120 --> 00:00:25,400
For any given value
of theta, X is

9
00:00:25,400 --> 00:00:29,960
going to take values between
theta minus 1 and theta plus 1.

10
00:00:29,960 --> 00:00:34,760
And the conditional distribution
is uniform on that range.

11
00:00:34,760 --> 00:00:38,740
One way of thinking about this
particular observation model

12
00:00:38,740 --> 00:00:42,220
is that what we observe
is the true value

13
00:00:42,220 --> 00:00:46,120
of Theta plus some noise term.

14
00:00:46,120 --> 00:00:51,950
And this noise term is
uniform on the range

15
00:00:51,950 --> 00:00:54,770
from minus 1 to plus 1.

16
00:00:54,770 --> 00:00:56,960
So given a value
of Theta, we may

17
00:00:56,960 --> 00:01:02,460
observe anything, because
of noise, that's up to one

18
00:01:02,460 --> 00:01:07,260
lower or one higher than
the true value of Theta.

19
00:01:07,260 --> 00:01:11,590
And if we take this description,
actually, this random variable

20
00:01:11,590 --> 00:01:15,140
U has this distribution
no matter what Theta is.

21
00:01:15,140 --> 00:01:18,810
And therefore, U is
independent of Theta.

22
00:01:18,810 --> 00:01:22,970
But in any case, this particular
interpretation will not matter.

23
00:01:22,970 --> 00:01:25,640
Let us see how do we proceed.

24
00:01:25,640 --> 00:01:27,610
In Bayesian estimation,
the first step

25
00:01:27,610 --> 00:01:30,880
is always to put our
hands on the posterior

26
00:01:30,880 --> 00:01:32,710
distribution of Theta.

27
00:01:32,710 --> 00:01:34,430
And to find the
posterior, we can

28
00:01:34,430 --> 00:01:37,289
start by first
finding the joint.

29
00:01:37,289 --> 00:01:40,160
So let us look at
the x theta plane.

30
00:01:40,160 --> 00:01:43,630
That's where the joint
distribution is going to live.

31
00:01:43,630 --> 00:01:46,100
And our first step
will be to locate

32
00:01:46,100 --> 00:01:49,420
those values of X and
Theta that are possible,

33
00:01:49,420 --> 00:01:51,020
given our description.

34
00:01:51,020 --> 00:01:55,600
From this model here, we
know that theta minus 1

35
00:01:55,600 --> 00:01:58,270
is going to be less
than or equal to x.

36
00:01:58,270 --> 00:02:02,270
And x is going to be less
than or equal to theta plus 1.

37
00:02:02,270 --> 00:02:05,620
And we translate this
into two inequalities,

38
00:02:05,620 --> 00:02:10,350
namely that theta is less
than or equal to x plus 1,

39
00:02:10,350 --> 00:02:12,920
and from here, that
theta is larger

40
00:02:12,920 --> 00:02:15,590
than or equal to x minus 1.

41
00:02:15,590 --> 00:02:18,340
So these are the
constraints that we

42
00:02:18,340 --> 00:02:21,760
have on the possible
values of x and theta.

43
00:02:21,760 --> 00:02:30,340
So here we plot the line where
theta is equal to x plus one.

44
00:02:30,340 --> 00:02:33,079
And here we plot
the line on which

45
00:02:33,079 --> 00:02:36,030
theta is equal to x minus 1.

46
00:02:36,030 --> 00:02:38,900
And these two inequalities
that we've got here

47
00:02:38,900 --> 00:02:43,410
tell us that we need to live
somewhere in between those two

48
00:02:43,410 --> 00:02:44,910
lines.

49
00:02:44,910 --> 00:02:47,850
In addition, we have
the fact that theta

50
00:02:47,850 --> 00:02:50,130
lives between 4 and 10.

51
00:02:50,130 --> 00:02:53,300
And that places these
two limits as well.

52
00:02:53,300 --> 00:02:56,680
So to summarize, this
shape here is the set

53
00:02:56,680 --> 00:03:00,750
off all possible x's and thetas.

54
00:03:00,750 --> 00:03:05,170
Outside this shape, the joint
PDF is going to be zero.

55
00:03:05,170 --> 00:03:07,930
What is it going
to be inside here?

56
00:03:07,930 --> 00:03:10,370
Well, because the
prior is uniform,

57
00:03:10,370 --> 00:03:15,730
that is, it is constant, and
the model is also uniform,

58
00:03:15,730 --> 00:03:19,260
to obtain the joint
we multiply these two.

59
00:03:19,260 --> 00:03:21,360
And since they are
constant, we obtain

60
00:03:21,360 --> 00:03:23,829
a joint that's also constant.

61
00:03:23,829 --> 00:03:31,850
So the joint PDF is equal
to a constant over that set.

62
00:03:31,850 --> 00:03:34,560
We can easily calculate
the area of this set.

63
00:03:34,560 --> 00:03:35,810
It is 12.

64
00:03:35,810 --> 00:03:40,960
So the joint is equal to
1 over 12 inside this set.

65
00:03:40,960 --> 00:03:44,140
And of course, it's 0 elsewhere.

66
00:03:44,140 --> 00:03:47,960
So we have a uniform joint PDF.

67
00:03:47,960 --> 00:03:50,440
Now, let us look
at the posterior.

68
00:03:50,440 --> 00:03:54,930
If I tell you that X takes
on this specific value,

69
00:03:54,930 --> 00:03:59,290
this means that we now
live in this universe.

70
00:03:59,290 --> 00:04:03,810
And it means that all of
those thetas are possible.

71
00:04:03,810 --> 00:04:05,950
The posterior
distribution is going

72
00:04:05,950 --> 00:04:07,800
to be a distribution
that tells us

73
00:04:07,800 --> 00:04:10,700
the probabilities of
these different thetas.

74
00:04:10,700 --> 00:04:13,060
What kind of distribution is it?

75
00:04:13,060 --> 00:04:16,649
Well, we know that the
conditional is just

76
00:04:16,649 --> 00:04:19,870
a section out of the joint
but keeps, otherwise,

77
00:04:19,870 --> 00:04:21,190
the same shape.

78
00:04:21,190 --> 00:04:26,260
Since the joint is constant,
it's uniform over that set,

79
00:04:26,260 --> 00:04:30,260
it means that the posterior,
or the conditional,

80
00:04:30,260 --> 00:04:33,000
is also constant over that set.

81
00:04:33,000 --> 00:04:36,909
So the conclusion is that the
posterior distribution of Theta

82
00:04:36,909 --> 00:04:41,230
is a uniform
distribution on this set.

83
00:04:41,230 --> 00:04:45,590
Given this knowledge, what is
the conditional expectation?

84
00:04:45,590 --> 00:04:48,110
The conditional
expectation of a uniform

85
00:04:48,110 --> 00:04:50,930
is just the midpoint
of that uniform.

86
00:04:50,930 --> 00:04:56,409
And so this is going to
be our estimate of Theta,

87
00:04:56,409 --> 00:04:58,720
the conditional
expectation of Theta,

88
00:04:58,720 --> 00:05:01,610
given the observation
that we have obtained.

89
00:05:01,610 --> 00:05:04,460
And then a similar
argument applies no matter

90
00:05:04,460 --> 00:05:06,890
what x we have obtained.

91
00:05:06,890 --> 00:05:10,130
For any given x, our
estimate is going

92
00:05:10,130 --> 00:05:13,855
to be the midpoint of the
corresponding interval.

93
00:05:17,870 --> 00:05:22,380
So what kind of shape
do we get by doing this,

94
00:05:22,380 --> 00:05:25,870
by joining the mid-points?

95
00:05:25,870 --> 00:05:30,530
It's going to be a straight
line over this region.

96
00:05:30,530 --> 00:05:33,880
It's also going to be a
straight line over this region

97
00:05:33,880 --> 00:05:36,560
except that, because
of the change in shape,

98
00:05:36,560 --> 00:05:39,500
it's going to be a straight
line with a different slope.

99
00:05:39,500 --> 00:05:41,280
And similarly, in
this region, it's

100
00:05:41,280 --> 00:05:45,740
also going to be a straight
line with a different slope.

101
00:05:45,740 --> 00:05:48,350
So what have we plotted here?

102
00:05:48,350 --> 00:05:50,810
For any given
value of X, we have

103
00:05:50,810 --> 00:05:54,790
plotted the corresponding
conditional expectation

104
00:05:54,790 --> 00:06:00,960
of Theta given that value of
X. And as a function of x, this

105
00:06:00,960 --> 00:06:03,050
gives us a certain curve.

106
00:06:03,050 --> 00:06:08,800
And this blue curve
that we have calculated

107
00:06:08,800 --> 00:06:13,000
is a particular function of x.

108
00:06:13,000 --> 00:06:15,230
And we can think
of this function g

109
00:06:15,230 --> 00:06:17,440
as being our estimator.

110
00:06:17,440 --> 00:06:19,310
So the way we're
going to be processing

111
00:06:19,310 --> 00:06:23,620
the data will be that
whenever we obtain an x,

112
00:06:23,620 --> 00:06:26,650
we apply this
particular function g.

113
00:06:26,650 --> 00:06:29,711
And we come up with an estimate.