In the next variation we consider, all random variables are continuous. For this case we once more have a Bayes rule, and we have already worked out quite a few examples, so there is no point, again, in going through a detailed example. Let us just discuss some of the issues.

One question is: when do these models arise? One particular class of models that is very useful and very commonly used is the class of so-called linear normal models. In these models we basically combine various random variables in a linear function, and all the random variables of interest are taken to be normal. For instance, we might have a signal, call it Theta, which is now a continuous-valued signal. We receive that signal, but corrupted by some noise, which is independent from what was sent. And on the basis of the observation X, we wish to recover the value of Theta. There are also versions of this problem in which Theta is a vector instead of a single value, so that Theta consists of multiple components, and where we obtain many measurements X. We will actually see, in the next lecture sequence, a quite detailed discussion of models of this type, and this will be one of our main examples within our study of inference.

There is another example that we will see a few times, and it involves estimating the parameter of a uniform distribution. So X is a random variable that is uniform over a certain range, but the range itself is random and unknown. On the basis of the observation X, we would like to estimate the true value of Theta. This is an example that you will see in our collection of solved problems for this class.

So what are the questions in this setting? We wish to come up with ways of estimating Theta. We form an estimator, and the main candidates for estimators at this point are, once more, the maximum a posteriori probability estimator, which looks at the conditional density of Theta given X and picks a value of Theta that makes this conditional density as large as possible.
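To make this concrete, here is a minimal sketch of how the MAP estimator works out in the simplest linear normal model with a single observation. The specific model X = Theta + W and the symbols x_0, sigma_0, sigma_1 are illustrative assumptions, not notation fixed by the lecture.

```latex
% Assumed model: X = \Theta + W, with prior \Theta \sim N(x_0, \sigma_0^2)
% and noise W \sim N(0, \sigma_1^2) independent of \Theta.
% By Bayes' rule, the posterior density is
\[
  f_{\Theta \mid X}(\theta \mid x)
    \propto f_\Theta(\theta)\, f_{X \mid \Theta}(x \mid \theta)
    \propto \exp\!\left( -\frac{(\theta - x_0)^2}{2\sigma_0^2}
                         -\frac{(x - \theta)^2}{2\sigma_1^2} \right).
\]
% Setting the derivative of the exponent to zero gives the MAP estimate
\[
  \hat{\theta}_{\mathrm{MAP}}
    = \frac{x_0/\sigma_0^2 + x/\sigma_1^2}{1/\sigma_0^2 + 1/\sigma_1^2},
\]
% a weighted average of the prior mean and the observation, with weights
% inversely proportional to the respective variances.
```

Because this posterior is itself normal, hence symmetric and unimodal, the point where it peaks is also its mean, so in this particular model the MAP estimate coincides with the conditional expectation discussed next.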
The alternative is the least mean squares estimator, which just computes the expected value of Theta given X.

For any given estimator, we then want to characterize its performance. In this case, a natural notion of performance is the distance of our estimate, or estimator, from the true value of Theta. Commonly, we use the squared distance and then take the average of that squared distance. So in a conditional universe where we have already observed some data, we might be interested in the conditional expectation E[(Theta - theta-hat)^2 | X = x], which is the mean squared error of this particular estimator given that we have obtained some particular data. Or we can average over all possible data points that we might obtain, so that we look at the unconditional mean squared error E[(Theta - Theta-hat)^2], which is a measure of the overall performance of our estimator. We will be talking about these criteria and the mean squared error in a fair amount of detail in a subsequent lecture sequence.
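As a rough illustration of these two criteria, here is a small Monte Carlo sketch in Python for the linear normal model used above. All parameter values and variable names are assumptions made for the example, not quantities from the lecture.

```python
import numpy as np

# Monte Carlo sketch of the performance criteria for the assumed
# linear normal model X = Theta + W (illustrative parameter values).
rng = np.random.default_rng(0)

x0, sigma0 = 0.0, 2.0      # prior: Theta ~ N(x0, sigma0^2)
sigma1 = 1.0               # noise: W ~ N(0, sigma1^2), independent of Theta
n = 1_000_000

theta = rng.normal(x0, sigma0, n)
x = theta + rng.normal(0.0, sigma1, n)

# LMS estimator E[Theta | X]; for this model it is linear in X
# (and coincides with the MAP estimator, since the posterior is normal).
w0, w1 = 1 / sigma0**2, 1 / sigma1**2
theta_hat = (w0 * x0 + w1 * x) / (w0 + w1)

# Unconditional mean squared error E[(Theta - Theta_hat)^2],
# averaged over both Theta and the observations X.
mse = np.mean((theta - theta_hat) ** 2)

# For this model the posterior variance 1/(w0 + w1) does not depend on x,
# so the conditional and unconditional mean squared errors agree.
print(mse, 1 / (w0 + w1))  # both near sigma0^2*sigma1^2/(sigma0^2 + sigma1^2)
```

The coincidence of the conditional and unconditional errors is special to the normal model; for other models, such as the uniform example, the conditional mean squared error genuinely varies with the observed data.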