1 00:00:01,150 --> 00:00:03,900 We now look at an application of the Bayes rule that 2 00:00:03,900 --> 00:00:08,039 involves a continuous unknown random variable, which we try 3 00:00:08,039 --> 00:00:11,730 to estimate based on a discrete measurement. 4 00:00:11,730 --> 00:00:14,120 Our model will be as follows. 5 00:00:14,120 --> 00:00:17,790 We observe the discrete random variable K, which is 6 00:00:17,790 --> 00:00:22,870 Bernoulli, so it can take two values, 1 or 0. 7 00:00:22,870 --> 00:00:27,010 And it takes those values with probabilities y and 1 minus y, 8 00:00:27,010 --> 00:00:27,740 respectively. 9 00:00:27,740 --> 00:00:32,750 This is our model of K. The catch is that the value of y 10 00:00:32,750 --> 00:00:34,110 is not known. 11 00:00:34,110 --> 00:00:38,710 And it is modeled as a random variable by itself. 12 00:00:38,710 --> 00:00:41,970 You can think of a situation where we are dealing with a 13 00:00:41,970 --> 00:00:43,390 single coin flip. 14 00:00:43,390 --> 00:00:46,250 We observe the outcome of the coin flip, 15 00:00:46,250 --> 00:00:48,260 but the coin is biased. 16 00:00:48,260 --> 00:00:52,090 The probability of heads is some unknown number, y. 17 00:00:52,090 --> 00:00:56,290 And we try to infer or say something about the bias of 18 00:00:56,290 --> 00:01:00,290 the coin on the basis of the observation that we have made. 19 00:01:00,290 --> 00:01:03,350 So what do we assume about this y or 20 00:01:03,350 --> 00:01:05,950 the bias of the coin? 21 00:01:05,950 --> 00:01:08,850 If we know nothing about this random variable, we might as 22 00:01:08,850 --> 00:01:11,300 well model it as a uniform random 23 00:01:11,300 --> 00:01:14,350 variable on the unit interval. 
24 00:01:14,350 --> 00:01:18,510 And the question now is, given that we made one observation 25 00:01:18,510 --> 00:01:23,400 and the outcome was 1, what can we say about the 26 00:01:23,400 --> 00:01:27,010 probability distribution of Y given this particular 27 00:01:27,010 --> 00:01:29,340 information? 28 00:01:29,340 --> 00:01:32,350 So the question that we're asking is, what we can tell 29 00:01:32,350 --> 00:01:38,180 about the density of Y given that the value 30 00:01:38,180 --> 00:01:42,170 of 1 has been observed. 31 00:01:42,170 --> 00:01:45,740 The way to approach this problem is by using a version 32 00:01:45,740 --> 00:01:47,930 of the Bayes rule. 33 00:01:47,930 --> 00:01:52,030 We want to calculate this quantity for the special case 34 00:01:52,030 --> 00:01:56,470 where k is equal to 1. 35 00:01:56,470 --> 00:02:00,060 So let us calculate the various pieces on the right 36 00:02:00,060 --> 00:02:01,990 hand side of this equation. 37 00:02:01,990 --> 00:02:05,570 The first piece is the density of Y. This is the prior 38 00:02:05,570 --> 00:02:08,389 density before we obtain any measurement. 39 00:02:08,389 --> 00:02:12,750 And since the random variable is uniform, this is equal to 1 40 00:02:12,750 --> 00:02:17,760 for y in the unit interval. 41 00:02:17,760 --> 00:02:21,423 And of course, it is 0 otherwise. 42 00:02:26,329 --> 00:02:31,195 The next piece that we need is the distribution of K given 43 00:02:31,195 --> 00:02:38,550 the value of Y. Well, given Y, K takes a value of 1, with 44 00:02:38,550 --> 00:02:41,350 probability equal to Y-- 45 00:02:41,350 --> 00:02:45,350 so the probability of 1, if we're told the value of y, is 46 00:02:45,350 --> 00:02:46,890 just y itself. 47 00:02:46,890 --> 00:02:51,770 y is the bias of the coin that we're dealing with. 48 00:02:51,770 --> 00:02:56,410 The next term that we need is the denominator. 49 00:02:56,410 --> 00:02:58,460 We will use this formula. 
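The formulas that the speaker points to on screen are not captured in the captions. Presumably they are the standard Bayes rule for a continuous unknown Y and a discrete observation K. The notation below (f for densities, p for probability mass functions, y' as the integration variable) is an assumption about the slide, not a quote from it:

```latex
f_{Y \mid K}(y \mid k) = \frac{f_Y(y)\, p_{K \mid Y}(k \mid y)}{p_K(k)},
\qquad
p_K(k) = \int f_Y(y')\, p_{K \mid Y}(k \mid y')\, dy'.
```

The first expression is the quantity to be calculated for k = 1, and the second is the denominator discussed next.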
50 00:02:58,460 --> 00:03:03,070 It is the integral of the density of Y, 51 00:03:03,070 --> 00:03:04,910 which is equal to 1. 52 00:03:04,910 --> 00:03:10,330 And it is equal to 1 only on the range from 0 to 1, times 53 00:03:10,330 --> 00:03:16,430 this probability that K takes a value, a certain value. 54 00:03:16,430 --> 00:03:19,770 In this case, we're dealing with a value of 1, so here 55 00:03:19,770 --> 00:03:23,640 we're going to put 1 instead of k. 56 00:03:23,640 --> 00:03:26,310 And therefore, we're dealing with this expression here, 57 00:03:26,310 --> 00:03:28,180 which is just y. 58 00:03:28,180 --> 00:03:31,200 And we integrate it over y's. 59 00:03:31,200 --> 00:03:37,690 So this is y squared over 2, evaluated at 0 and 1, which 60 00:03:37,690 --> 00:03:40,220 gives us 1/2. 61 00:03:40,220 --> 00:03:42,850 So this is the unconditional probability that 62 00:03:42,850 --> 00:03:44,140 K is equal to 1. 63 00:03:44,140 --> 00:03:48,920 If we know nothing about Y, by symmetry, higher biases are 64 00:03:48,920 --> 00:03:51,230 as likely as lower biases. 65 00:03:51,230 --> 00:03:55,760 So we should expect that it's equally likely to give us a 1 66 00:03:55,760 --> 00:03:58,130 as it is to give us a 0. 67 00:03:58,130 --> 00:04:01,470 Now, we have in our hands all the pieces that go into this 68 00:04:01,470 --> 00:04:02,940 particular formula. 69 00:04:02,940 --> 00:04:05,920 And we can go ahead with the final calculation. 70 00:04:05,920 --> 00:04:13,180 So in the numerator, we have 1 times this term, evaluated at 71 00:04:13,180 --> 00:04:16,890 k equal to 1, which is equal to y. 72 00:04:16,890 --> 00:04:19,810 And then in the denominator, we have a term that 73 00:04:19,810 --> 00:04:21,950 evaluates to 1/2. 74 00:04:21,950 --> 00:04:25,110 So the final answer is 2y. 75 00:04:25,110 --> 00:04:28,520 Over what range of y's is this correct? 76 00:04:28,520 --> 00:04:30,800 Only for those y's that are possible. 
77 00:04:30,800 --> 00:04:33,909 So this is for y's in the unit interval. 78 00:04:36,409 --> 00:04:41,159 If we are to plot this PDF, it has this shape. 79 00:04:46,620 --> 00:04:52,820 This is a plot of the PDF of Y given that the random variable 80 00:04:52,820 --> 00:04:55,990 K takes on a value of 1. 81 00:04:55,990 --> 00:05:01,090 Initially, we started with a uniform for Y. So all values 82 00:05:01,090 --> 00:05:05,370 of Y were equally likely. 83 00:05:05,370 --> 00:05:12,920 But once we observed an outcome of 1, this tells us 84 00:05:12,920 --> 00:05:18,410 that perhaps Y is on the higher end 85 00:05:18,410 --> 00:05:20,180 rather than lower end. 86 00:05:20,180 --> 00:05:23,390 So after we obtain our observation, the random 87 00:05:23,390 --> 00:05:26,840 variable Y has this distribution, with higher 88 00:05:26,840 --> 00:05:30,640 values being more likely than lower values. 89 00:05:30,640 --> 00:05:33,960 This example is a prototype of situations where we want to 90 00:05:33,960 --> 00:05:37,210 estimate a continuous random variable based on discrete 91 00:05:37,210 --> 00:05:38,690 measurements. 92 00:05:38,690 --> 00:05:42,020 Essentially it is the same as trying to estimate the bias of 93 00:05:42,020 --> 00:05:46,820 a coin based on a single measurement of the 94 00:05:46,820 --> 00:05:48,860 result of a coin flip. 95 00:05:48,860 --> 00:05:52,070 As you can imagine, there are generalizations in which we 96 00:05:52,070 --> 00:05:54,950 observe multiple coin flips. 97 00:05:54,950 --> 00:05:57,040 And this is an example that we will see 98 00:05:57,040 --> 00:05:58,440 later on in this class.
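As a check on the calculation in this segment, here is a minimal Python sketch, not part of the lecture. It computes the denominator numerically, evaluates the resulting conditional density, and then simulates the coin model directly. All function names, the midpoint-rule integration, and the trial count and seed are illustrative assumptions. The simulation compares against a mean of 2/3, which is what the density 2y on the unit interval implies (the integral of y times 2y from 0 to 1).

```python
import random


def prior_pdf(y):
    """Uniform prior density of Y: 1 on the unit interval, 0 otherwise."""
    return 1.0 if 0.0 <= y <= 1.0 else 0.0


def likelihood(k, y):
    """P(K = k | Y = y) for a Bernoulli observation with bias y."""
    return y if k == 1 else 1.0 - y


def evidence(k, n=10_000):
    """P(K = k): midpoint-rule integral of prior * likelihood over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += prior_pdf(y) * likelihood(k, y)
    return total * h


def posterior_pdf(y, k):
    """Conditional density of Y given K = k, from the Bayes rule."""
    return prior_pdf(y) * likelihood(k, y) / evidence(k)


def simulate_mean_given_one(trials=200_000, seed=0):
    """Draw Y uniformly, flip a coin with bias Y, and average Y over
    the runs in which the flip came out 1 (hypothetical sanity check)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        y = rng.random()
        if rng.random() < y:  # the flip gives 1 with probability y
            kept.append(y)
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print(evidence(1))             # close to 1/2
    print(posterior_pdf(0.25, 1))  # close to 2 * 0.25
    print(simulate_mean_given_one())  # close to 2/3
```

The simulation makes the intuition from the plot concrete: conditioning on an observed 1 keeps more of the runs with a high bias, so the retained values of Y lean toward the upper end of the interval.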