In this problem, we're given a collection of 10 random variables, x1 through x10, where each xi is a uniform random variable between 0 and 1, and all 10 variables are independent. We'd like to develop a bound on the probability that the sum of the 10 variables is greater than or equal to 7, using different methods.

In part (a), we'll be using Markov's inequality, written here: if we have a nonnegative random variable x, the probability that x is greater than or equal to a, where a is some positive number, is bounded above by the expected value of x divided by a.

Let's see how that works out in our situation. We'll call x the summation of the xi for i equal to 1 to 10, and therefore E[x] is simply 10 times E[x1], which gives us 5. Here we used the linearity of expectation: the expectation of a sum of random variables is simply the sum of the expectations. Now we can invoke Markov's inequality: the probability that x is greater than or equal to 7 is less than or equal to E[x] divided by 7, and this gives us 5/7.
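As a side note, the Markov bound, together with a quick simulation check, can be sketched in a few lines of Python (the simulation is an illustration added here, not part of the original derivation):

```python
import random

# Markov's inequality: for a nonnegative random variable X and a > 0,
# P(X >= a) <= E[X] / a.
n, a = 10, 7.0
e_x = n * 0.5           # E[X] = 10 * E[X1] = 10 * (1/2) = 5
markov_bound = e_x / a  # 5/7, about 0.714

# Monte Carlo estimate of the actual probability, for comparison:
random.seed(0)
trials = 200_000
hits = sum(sum(random.random() for _ in range(n)) >= a for _ in range(trials))
print(markov_bound, hits / trials)
```

The simulated frequency comes out far below 5/7, which already hints that the Markov bound is quite loose for this problem.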
For part (b), let's see if we can improve the bound we got in part (a) using the Chebyshev inequality, which takes into account the variance of the random variable x. To refresh you, the Chebyshev inequality says that the probability that x deviates from its mean E[x] by at least a is bounded above by the variance of x divided by a squared.

We have to do some work to transform the probability we're interested in, which is that x is greater than or equal to 7, into a form that is convenient for the Chebyshev inequality. To do so, we'll rewrite this probability as the probability that x minus 5 is greater than or equal to 2, simply by moving 5 from the right to the left. The reason we chose 5 is that 5 is the expected value of x, as we found in part (a). In fact, this quantity is also equal to the probability that x minus 5 is less than or equal to negative 2. To see why this is true, recall that x is simply the summation of the 10 xi's, and each xi is a uniform random variable between 0 and 1. Therefore each xi has a distribution that is symmetric around its mean, 1/2.
So we can see that after we add up all the xi's, the resulting distribution of x is also symmetric around its mean, 5. As a result, the probability that x minus 5 is greater than or equal to 2 is equal to the probability that x minus 5 is less than or equal to negative 2. Knowing this, we can say that both are equal to 1/2 times the probability that the absolute value of x minus 5 is greater than or equal to 2, because that probability is simply the sum of the two terms.

At this point, we have transformed the probability that x is greater than or equal to 7 into a form where we can apply the Chebyshev inequality directly. The probability is less than or equal to 1/2 times the variance of x divided by 2 squared, where 2 plays the role of a in the inequality. Now, the variance of x is 10 times the variance of a uniform random variable between 0 and 1, which is 1/12, so the bound is 1/8 times 10/12, which gives us 5/48.

Let's compare this with the number we got earlier using the Markov inequality, which was 5/7.
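The symmetry step and the Chebyshev computation can be written out as a short Python sketch (using exact fractions so nothing is lost to rounding; this is an added check, not part of the original solution):

```python
from fractions import Fraction

# Chebyshev: P(|X - E[X]| >= a) <= Var(X) / a**2.
# By symmetry of X around its mean 5, P(X >= 7) = (1/2) * P(|X - 5| >= 2).
var_x = 10 * Fraction(1, 12)   # Var(X) = 10 * Var(Uniform(0, 1)) = 10/12
a = 2
chebyshev_bound = Fraction(1, 2) * var_x / a**2
print(chebyshev_bound)         # 5/48
```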
We see that 5/48 is much smaller, and this tells us that, at least for this example, using the Chebyshev inequality combined with knowledge of the variance of x, we're able to get a stronger upper bound on the probability of the event we're interested in.

In part (b), we saw that by using the additional information of the variance combined with the Chebyshev inequality, we can improve upon the bound given by Markov's inequality. In part (c), we'll use a somewhat more powerful approach in addition to the Chebyshev inequality, the so-called central limit theorem, and see if we can get an even better bound.

To remind you what the central limit theorem says, suppose we have a summation from i equal to 1 to some number n of independent and identically distributed random variables xi. The central limit theorem says the following: we take the sum, subtract out its mean, which is the expectation of that same summation, and then divide (what we call normalizing) by the standard deviation of the summation, in other words, the square root of the variance of the sum of the xi's.
If we perform this procedure, then as the number of terms in the sum goes to infinity, that is, as n goes to infinity, this standardized random variable converges in distribution to a standard normal random variable, with mean 0 and variance 1. Since we know what the distribution of a standard normal looks like, we can go to a table and look up properties of the resulting distribution. So that is the plan.

Right now we have only 10 variables, which is not that many compared to a huge number, but if we believe the approximation is good, we can get some information out of the central limit theorem. We're interested in the probability that the summation from i equal to 1 to 10 of xi is greater than or equal to 7. We'll rewrite this as 1 minus the probability that the summation is less than or equal to 7. Now we apply the scaling to the summation: this is equal to 1 minus the probability that the sum of the xi's minus 5, divided by the square root of 10/12, is less than or equal to 7 minus 5 divided by the square root of 10/12. We know from the previous parts that 5 is the expected value of the sum, and 10/12 is the variance of the sum of the xi's.

If we compute the quantity on the right, we find it is roughly 2.19, and by the central limit theorem, if we believe 10 is a large enough number, the probability will be roughly equal to 1 minus the CDF of a standard normal evaluated at 2.19. Looking up the number in a table gives us roughly 0.014.

Now let's do a quick summary of what this problem is about. We were asked to compute the probability that x is greater than or equal to 7, where x is the sum of 10 uniform random variables between 0 and 1, the xi's. Because each random variable has expectation 1/2, adding 10 of them up gives us an expectation of 5. So we're essentially asking: what is the chance that x is at least 2 above its expectation?
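The central limit theorem calculation can be reproduced in Python, using the error function in place of a printed normal table (an added check, not part of the original solution):

```python
import math

# CLT approximation for X = sum of 10 Uniform(0, 1) random variables:
# standardize X with mean 5 and variance 10/12, then evaluate the
# standard normal CDF via the error function instead of a table.
mean, var = 5.0, 10 / 12
z = (7 - mean) / math.sqrt(var)  # roughly 2.19

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

approx = 1 - phi(z)
print(round(z, 2), round(approx, 3))  # 2.19 0.014
```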
If we draw the real line with 5 here, x has some distribution centered around 5, since the expected value is 5, and we wonder how likely it is to see something greater than or equal to 7.

Let's see where we land on the probability spectrum from 0 to 1. Without using any information, we know a probability cannot be greater than 1, so a trivial upper bound is 1. In the first part, we used Markov's inequality, which gave us 5/7, roughly 0.7. That's already better than 1: it tells us the probability cannot lie between 5/7 and 1. But can we do better? In part (b), we saw that using the additional information of the variance, we can get this number down to 5/48, which is roughly 0.1. That's already much better than 0.7, and it came from the Chebyshev inequality. Can we do even better? It turns out we can: using the central limit theorem, we can squeeze this number all the way down to about 0.014, almost a 10 times improvement over the previous number.
This last number came from the central limit theorem. As we can see, by using different bounding techniques, we can progressively improve the bound on the probability of x exceeding 7. From this problem we learned that even with only 10 variables, the distribution of x concentrates quite heavily around 5, and hence the probability of x being greater than or equal to 7 can be much smaller than one might expect.
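To close the loop, here is a small Python comparison of the three numbers against a Monte Carlo estimate of the true probability (the simulation is an added illustration, not part of the original problem):

```python
import math
import random

# The three bounds derived above, plus a Monte Carlo estimate of P(X >= 7).
markov = 5 / 7
chebyshev = 5 / 48
clt = 1 - 0.5 * (1 + math.erf((7 - 5) / math.sqrt(10 / 12) / math.sqrt(2)))

random.seed(1)
trials = 500_000
empirical = sum(
    sum(random.random() for _ in range(10)) >= 7 for _ in range(trials)
) / trials

print(f"Markov {markov:.3f}  Chebyshev {chebyshev:.3f}  "
      f"CLT {clt:.4f}  empirical {empirical:.4f}")
```

Each successive number sits closer to the simulated frequency, matching the progression 1, then 5/7, then 5/48, then roughly 0.014 described above.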