The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK. We are talking about jointly Gaussian random variables. One comment, through all of this and through all of the notes, is that you can add a mean to Gaussian random variables, or you can talk about zero-mean random variables. Here we're using random variables mostly to talk about noise. When we're talking about noise, you really should be talking about zero-mean random variables, because you can always take the mean out. And because of that, I don't like to state everything twice: once for variables or processes without a mean, and once for variables or processes with a mean. And after looking at the notes, I think I've been a little inconsistent about that. I think the point is you can keep yourself straight by just saying the only thing important is the case without a mean. Putting the mean in is something unfortunately done by people who like complexity, and they have unfortunately got the notation on their side. So any time you want to talk about a zero-mean random variable, you have to say "zero-mean random variable." And if you say "random variable," it means it could have a mean or not have a mean. The way the notation really should work is that you talk about random variables as things which don't have means, and then you talk about random variables plus means as things which do have means. That would make life easier, but unfortunately it's not that way. So any time you see something and are wondering whether I've been careful about the mean or not, the answer is, one, I well might not have been, and two, it's not very important. So anyway, here I'll be talking all about zero-mean things.
A k-tuple of zero-mean random variables is said to be jointly Gaussian if you can express them in this way here, namely as a linear combination of IID normal Gaussian random variables. OK. In your homework, those of you who've done it already have realized that just having two Gaussian random variables is not enough to make those two random variables be jointly Gaussian. They can be individually Gaussian but not jointly Gaussian. This is sort of important, because when you start manipulating things, as we will do when we're generating signals to send and things like this, you can very easily wind up with things which look Gaussian and are appropriately modeled as Gaussian, but which are not jointly Gaussian. Things which are jointly Gaussian are defined in this way. We will come up with a couple of other definitions of them later. But you've undoubtedly been taught that any old Gaussian random variables are jointly Gaussian. You've probably seen joint densities for them and things like this. Those joint densities only apply when you have jointly Gaussian random variables. And the fundamental definition is this. This fundamental definition makes sense, because the way that you generate these noise random variables is usually from some very large underlying set of very, very small random variables. And the central limit theorem says that when you add up a very, very large number of small underlying random variables, and you normalize the sum so it has some reasonable variance, then that random variable is going to be appropriately modeled as being Gaussian. If you at the same time look at linear combinations of those things, then for the same reason it's appropriate to model each of the random variables you're looking at as a linear combination of some underlying set of noise variables which is very, very large, and probably not Gaussian. But here, when we're trying to do this, because of the central limit theorem, we just model these as normal Gaussian random variables to start with.
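In symbols, that defining statement reads (with A some real matrix):

\[
Z = A N, \qquad Z = (Z_1,\dots,Z_k)^T, \quad N = (N_1,\dots,N_m)^T, \quad N_j \sim \mathcal{N}(0,1)\ \text{IID},
\]

so each $Z_i = \sum_j a_{ij} N_j$ is a linear combination of IID normal random variables.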
So that's where that definition comes from. It's why, when you're looking at noise processes in the real world and trying to model them in a sensible way, this jointly Gaussian property is the thing you almost always come up with. OK. So one important point here, and the notes point this out. If each of these random variables Z sub i is independent of each of the others, and they have arbitrary variances, then because of this formula, the set of Z i's are going to be jointly Gaussian. Why is that? Well, you simply make the matrix A here diagonal, and the elements of this diagonal matrix are sigma 1, sigma 2, sigma 3, and so forth -- I don't want squares in there. If you have a vector Z which is this diagonal matrix, sigma 1 down to sigma k, times a noise vector N1 to N sub k, what you wind up with is that Z sub i is equal to sigma i times N sub i. N sub i is a Gaussian random variable with variance one, and therefore Z sub i is a Gaussian random variable with variance sigma sub i squared. OK. So anyway, one special case of this formula is that any time you want to deal with a set of independent Gaussian random variables with arbitrary variances, they are always going to be jointly Gaussian. Saying that they're uncorrelated is not enough for that; you really need the statement that they're independent of each other. OK. That's sort of where we were last time.

When you look at this formula, you can look at it in terms of sample values. And if we look at a sample value, what's happening is that the sample value of the random vector Z, namely little z, is defined as some matrix A times the sample value of the normal vector N. OK. And what we want to look at is what happens there geometrically. Well, this matrix A is going to map a unit vector e sub i into the i'th column of A.
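In symbols, that mapping of unit vectors is just the column-picking property of matrix multiplication:

\[
A e_i = \begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kk} \end{pmatrix}
\begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{1i} \\ \vdots \\ a_{ki} \end{pmatrix},
\]

where the 1 sits in the $i$'th position of $e_i$.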
Why is that? Well, you look at A, which is whatever it is, A sub 1,1, blah, blah, blah, up to A sub k,k. And you look at multiplying it by a vector which is 1 in the i'th position and zero everywhere else. And what's this matrix multiplication going to do? It's going to simply pick out the i'th column of this matrix here. OK. So A is going to map e sub i into the i'th column of A. OK. So then the question is, what is this matrix A going to do to some small cube in the n plane? OK. If you take a small cube in the n plane, the segment from 0 to delta along the n1 axis is going to map into the segment from 0 to this point here, which is delta times A e sub 1. This point here is going to map into delta times A e sub 2. This is just drawn for two dimensions, of course. So in fact all the points in this cube are going to get mapped into this little parallelogram here. OK. Namely, that's what a matrix times a vector is going to do. Anybody awake out there? You're all looking at me as if I'm totally insane. OK. Is everyone following this? OK. So perhaps this is just too trivial; I hope not. OK. So little cubes get mapped into parallelograms here. If I take a cube up here, it's going to get mapped into the same kind of parallelogram here. If I visualize tiling this plane here with little tiny cubes, delta on a side, what's going to happen? Each of these little cubes is going to map into a parallelogram over here, and these parallelograms are going to tile this space here. OK. Which means that each little cube here maps into one of these parallelograms, and each parallelogram here maps back into a little cube here. Which means that I'm looking at a very special case of A here. I'm looking at a case where A is non-singular. In other words, I can get to any point in this plane by starting with some point in this plane, which means for any point here, I can go back here also. In other words, I can also write n is equal to A to the minus 1 times z.
And this matrix has to exist. OK. That's what you mean geometrically by a non-singular matrix. It means that all points in this plane get mapped into points in this plane, each point here gets mapped into only one point in this plane, and every point in this plane is the map of some point here. In other words, you can go from here to there, and you can also go back again. OK. The volume of a parallelepiped here -- and this is in an arbitrary number of dimensions -- is given by the determinant of A. And you all know how to find determinants: namely, you program a computer to do it, and the computer does it. I mean, it used to be we had to do this by an enormous amount of calculation, and of course nobody does that anymore. OK. So we can find the volume of this parallelepiped; it's this determinant. And what does all of that mean? Well, first let me ask you the question: what's going to happen if the determinant of A is equal to zero? What does that mean geometrically? What does that mean here in terms of this two-dimensional diagram?

AUDIENCE: [INAUDIBLE]

PROFESSOR: What?

AUDIENCE: Projection onto a line.

PROFESSOR: Yeah. This little cube here is simply going to get projected onto some line here, like, for example, that. In other words, it means that this matrix is not invertible, for one thing, but it also means everything here gets mapped onto some lower-dimensional sub-space here in general. OK. Now remember that for a minute, and we'll come back to it when we start talking about probability densities.

OK. Because of the picture we were just looking at, the density of the random vector Z at A times n -- namely, the density of Z at this particular value here -- is just the density that we get corresponding to the density of some point here mapped over into here.
And what's going to happen when we take a point here and map it over into here? You have a certain amount of density here, which is probability per unit volume. Now, when you map it into here, and the determinant is bigger than one, say, what you're doing is mapping a little volume into a bigger volume. And if you're doing that over a small enough region, where the probability density is essentially fixed, what's going to happen to the probability density over here? It's going to get scaled down, and it's going to get scaled down precisely by that determinant. OK. So what this is saying is that the probability density of the random vector Z, which is a linear combination of these normal random variables, is in fact the probability density of this normal vector N, and we know what that probability density is, divided by this determinant of A. In fact, this is a general formula for any old probability density at all. You can start out with anything which you call a random vector N, and you can derive the probability density of any linear combination of those elements in precisely this way, so long as this volume element is non-zero, which means you're not mapping an entire space into a sub-space.
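For reference, the non-singular case just described is the usual change-of-variables formula:

\[
f_Z(A n) = \frac{f_N(n)}{|\det A|}, \qquad \det A \neq 0.
\]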
When you're mapping an entire space into a sub-space, and you define density as being probability per unit volume, of course the density in this mapped space doesn't exist anymore, which is exactly what this formula says. If this determinant is zero, it means the density here is going to be infinite in the region where this Z exists at all, which is just in this linear sub-space. And what do you do about that? I mean, do you get all frustrated about it? Or do you say, what's going on, and treat it in some sensible way? I mean, that's the thing that is happening here. And this talks about it a little bit. If A is singular, then A is going to map R k into a proper sub-space. The determinant of A is going to be equal to 0. The density doesn't exist. So what do you do about it? I mean, what does this mean if you're mapping into a smaller sub-space? What does it mean in terms of this diagram here? Well, in the diagram here it's pretty clear, because these little cubes here are getting mapped into straight lines here. Yeah? What?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Some linear combinations of these are being mapped into 0. Namely, if this straight line is this way, any vector going in this direction is being mapped into 0. Any component of Z going in this direction has to be identically equal to 0. In other words, some linear combination of Z1 and Z2 is a random variable which takes on the value 0 identically. Why would you try to represent that in a probabilistic sense? Why don't you just take it out of consideration altogether? Here, what it means is that Z1 and Z2 are simply linear combinations of each other. OK. In other words, once you know what the sample value of Z1 is, you can find the sample value of Z2. In other words, Z2 is a linear function of Z1. It's linearly dependent on Z1, which means that you can identify it exactly once you know what Z1 is. Which means you might as well not call it a random variable at all. Which means you might as well view this situation, where the determinant is 0, as one where the vector Z is really just one random variable, and everything else is determined. So you throw out these extra random variables -- pseudo-random variables, which are really just linear combinations of the others. So you deal with a smaller-dimensional set. You find the probability density of the smaller-dimensional set, and you don't worry about all of these mathematical peculiarities that would arise otherwise. OK. So once we do that, A is going to be non-singular, because we're going to be left with a set of random variables which are not linearly dependent on each other. They can be statistically dependent on each other, but not linearly dependent. OK.
So for all z then, since the determinant of A is not 0, the probability density at some arbitrary vector z is going to be the normal joint density evaluated at A to the minus 1 times z, divided by the determinant of A. OK. In other words, we're just working the map backwards. This formula is the same as this formula, except instead of writing A n here, we're writing z here. And when A n is equal to z, then little n has to be equal to A to the minus 1 times z. And what does that say? It says that the joint probability density has to be equal to this quantity here, which in matrix terms looks rather simple; it looks rather nice. You can rewrite this. This is a norm here, so it's this vector there times this vector, where in fact when you multiply vectors in this way, you're taking the inner product of the vector with itself. These are real vectors we're looking at, because we're trying to model real noise, because we're modeling the noise on communication channels. So this is going to be the inner product of A to the minus 1 times z with itself, which means you want to look at the transpose of A to the minus 1 times z. Now, what's the transpose of A to the minus 1 times z? It's z transpose times A to the minus 1 transpose. So we wind up with z transpose, times A to the minus 1 transpose, times A to the minus 1, times z. So we have some kind of peculiar bilinear form here. So for any sample value of the random vector Z, we can find the probability density in terms of this quantity here, which looks a little bit peculiar, but it doesn't look too bad.
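Putting those steps together, the density being described is:

\[
f_Z(z) = \frac{f_N(A^{-1}z)}{|\det A|}
= \frac{1}{(2\pi)^{k/2}\,|\det A|}\,
\exp\!\Big(-\tfrac{1}{2}\, z^T (A^{-1})^T A^{-1}\, z\Big).
\]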
OK. Now we want to simplify that a little bit. Any time you're dealing with zero-mean random variables -- now remember, I'm going to forget to say zero-mean half the time, because everything here is concerned with zero-mean random variables -- the covariance of Z1 and Z2 is the expected value of Z1 times Z2. So if you have a k-tuple Z, the covariance is a matrix whose i,j element is the expected value of Zi times Zj. And what that means is that the covariance matrix -- this is a matrix now -- is going to be the expected value of Z times Z transpose. Z is a random vector, which is a column random vector; Z transpose is a row random vector, which is simply this turned on its side, turned by 90 degrees. Now, when you multiply the components of these vectors together, you can see that what you get is the elements of this covariance matrix. In other words, this is just standard matrix manipulation, which I hope most of you are at least partly familiar with. OK. When we then talk about the expected value of Z times Z transpose, we can write it as the expected value of A times N -- which is what Z is -- times N transpose times A transpose -- which is what Z transpose is. And miraculously, the N and the N transpose are in the middle here, where it's easy to deal with them, because these are normal Gaussian random variables. And when you look at this column times this row, since all the components of N are independent of each other, and all of them have variance one, the expected value of N times N transpose is simply the identity matrix. All the randomness goes out of this, which it obviously has to, because we're looking at a covariance, which is just a matrix and not something random. So you wind up with: this covariance matrix is equal to a rather arbitrary matrix A -- but not singular -- times the transpose of that matrix. OK. We've assumed that A is non-singular, and therefore it's not too hard to see that K sub Z is non-singular also. And explicitly, the inverse of K sub Z -- namely of this covariance matrix -- is A to the minus 1 transpose times A to the minus 1. Namely, you take the inverse and you start flipping things, and you do all of these neat matrix things that you always do. And you should review them if you've forgotten them.
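In matrix form, that chain of steps is:

\[
K_Z = E[Z Z^T] = E[A N N^T A^T] = A\,E[N N^T]\,A^T = A I A^T = A A^T,
\]
\[
K_Z^{-1} = (A^{-1})^T A^{-1}, \qquad \det K_Z = (\det A)^2, \ \text{so}\ |\det A| = \sqrt{\det K_Z}.
\]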
So when we write our probability density, which was this, in terms of this transformation here, what we get is that the density of Z is equal to: in place of the determinant of A, we get the square root of the determinant of K sub Z. You probably don't remember that, but it is what you get, and it's sort of a blah. This here is more interesting: this is minus 1/2 times z transpose, times Kz to the minus 1, times z. What does this tell you? Look at this formula. Is it just a big formula, or what does it say to you? You've got to look at these things and see what they say. I mean, we've gone through a lot of work to derive something here.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Well, it is Gaussian, yes. I mean, that's the way we defined jointly Gaussian. But what's the funny thing about this probability density? What does it depend on? What do I have to tell you in order for you to calculate this probability density for every z you want to plug in here? I have to tell you what this covariance matrix is. And once I tell you what the covariance matrix is, there is nothing more to be specified. In other words, a jointly Gaussian random vector is completely specified by its covariance matrix. And it's specified exactly this way by its covariance matrix. OK. There's nothing more there. So this says, any time you're dealing with jointly Gaussian random variables, the only thing you have to be interested in is this covariance here. Namely, all you need to have jointly Gaussian is that somebody has to tell you what the covariance is, and somebody has to tell you also that it's jointly Gaussian. Jointly Gaussian plus a given covariance specifies the probability density. OK.
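Written out, that density is:

\[
f_Z(z) = \frac{1}{(2\pi)^{k/2}\sqrt{\det K_Z}}\,
\exp\!\Big(-\tfrac{1}{2}\, z^T K_Z^{-1}\, z\Big),
\]

which depends on nothing but the covariance matrix $K_Z$.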
What does that tell you? Well, let's look at an example where we just have two random variables, Z1 and Z2. Then the expected value of Z1 squared is the upper left-hand element of that covariance matrix, which we'll call sigma 1 squared. The lower right-hand element of the matrix is K22, which we'll call sigma 2 squared. And we're going to let rho be the normalized covariance. We're just defining a bunch of things here, because this is the way people usually define them. So rho will be the normalized cross covariance. Then the determinant of K sub Z is this mess here. For A to be non-singular, we have to have the magnitude of rho less than 1. If rho is equal to 1, then this determinant is going to be equal to 0, and we're back in this awful case that we don't want to think about. So then, if we go through the trouble of finding out what the inverse of K sub Z is, we find it's 1 over 1 minus rho squared, times this matrix here. The probability density, plugging this in, is this big thing here.
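For the record, the quantities being pointed at are the standard two-variable forms, written out here from the definitions above since the slide itself is not reproduced. With $\rho = K_{12}/(\sigma_1\sigma_2)$:

\[
K_Z = \begin{pmatrix} \sigma_1^2 & \rho\,\sigma_1\sigma_2 \\ \rho\,\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix},
\qquad
\det K_Z = \sigma_1^2\sigma_2^2\,(1-\rho^2), \qquad
K_Z^{-1} = \frac{1}{1-\rho^2}
\begin{pmatrix} 1/\sigma_1^2 & -\rho/(\sigma_1\sigma_2) \\ -\rho/(\sigma_1\sigma_2) & 1/\sigma_2^2 \end{pmatrix},
\]
\[
f_{Z_1 Z_2}(z_1,z_2) = \frac{1}{2\pi\,\sigma_1\sigma_2\sqrt{1-\rho^2}}
\exp\!\left(-\,\frac{1}{2(1-\rho^2)}
\Big(\frac{z_1^2}{\sigma_1^2} - \frac{2\rho\, z_1 z_2}{\sigma_1\sigma_2} + \frac{z_2^2}{\sigma_2^2}\Big)\right).
\]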
OK, what does that tell you? Well, the thing that it tells me is that I never want to deal with this, and I particularly don't want to deal with it if I'm dealing with three or more variables. OK. In other words, the interesting thing here is the simple formula we had before, which is this formula. OK. And we have computers these days, which means that, given nice formulas like this, there are standard computer routines to calculate things like this. And you never want to look at some awful mess like this. OK. And if you put a mean into here, which you will see in every textbook on random variables and probability you ever look at, this thing becomes so ugly that you were probably convinced, before you took this class, that jointly Gaussian random variables were things you wanted to avoid like the plague. And if you really have to deal with explicit formulas like this, you're absolutely right: you do want to avoid them like the plague, because you can't get any insight from what that says -- or at least I can't. So I say, OK, we have to deal with this, but we'd like to get a little more insight into what it means. And to do this, we'd like to find out a little bit more about what these bilinear forms are all about. Those of you who have taken any course on linear algebra have dealt with these bilinear forms and played with them forever. And those of you who haven't are probably puzzled about how to deal with them. The notes have an appendix, which is about two pages long, which tells you what you have to know about these matrices, and I will just sort of quote those results as we go. Incidentally, those results are not hard to derive, and not hard to find out about. You can simply derive them on your own, or you can look at Strang's book on linear algebra, which is about the simplest way to get them. And that's all you need to do. OK. We've said the probability density depends on this bilinear form, z transpose times Kz to the minus 1 times z. What is this? Is this a matrix or a vector or what? How many people think it's a matrix? How many people think it's a vector? You think it's a vector? OK. Well, in a very peculiar sense it is. How many people think it's a number? Good. OK. It is a number, and it's a number because this is a row vector, this is a matrix, and this is a column vector. If you think of multiplying a matrix times a column vector, you get a column vector; and if you take a row vector times a column vector, you get a number. OK. So this is just a number which depends on little z. OK. Kz is called a positive definite matrix. And it's called positive definite because this kind of form is always non-negative. It always has to be non-negative because, if you put a fixed vector in place of little z here and look at the corresponding linear combination of the random vector Z, then this form with K sub Z is the variance of that particular random variable, so it has to be greater than or equal to zero. So, anyway, K sub Z is always non-negative definite. Here it's going to be positive definite, because we've already assumed that the matrix A was non-singular, and therefore the matrix Kz has to be non-singular. So this has to be positive definite, and it has an inverse; K sub Z to the minus 1 is also positive definite, which means this quantity is always greater than zero if z is non-zero.
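Spelled out, that non-negativity argument is: for any fixed real vector b,

\[
b^T K_Z\, b = b^T E[Z Z^T]\, b = E\big[(b^T Z)^2\big] = \operatorname{Var}(b^T Z) \;\ge\; 0,
\]

and since $K_Z = A A^T$ with $A$ non-singular, the inequality is strict for $b \neq 0$, so both $K_Z$ and $K_Z^{-1}$ are positive definite.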
You can take these positive definite matrices and you can find eigenvectors and eigenvalues for them. Do you ever actually calculate these eigenvectors and eigenvalues? I hope not. It's a mess to do it. I mean, it's just as bad as writing that awful formula we had before. So you don't want to actually do this, but it's nice to know that these things exist. And because these vectors exist -- in fact, if you have a matrix which is k by k, little k by little k, then there are k such eigenvectors, and they can be chosen orthonormal. OK. In other words, each of these q sub i is orthogonal to each of the others. You can clearly scale them, because if you scale this side and this side together, it's going to maintain equality; so you just scale them so as to make them normalized. If you have a bunch of eigenvectors with the same eigenvalue, then the whole linear sub-space formed by that set of eigenvectors has the same eigenvalue lambda sub i. Namely, you take any linear combination of these things which satisfy this equation for a particular lambda, and any linear combination will satisfy the same equation. So you can simply choose an orthonormal set among them to satisfy this. And if you look at q sub i and q sub j, which have different eigenvalues, then it's pretty easy to show that in fact they have to be orthogonal to each other. So anyway, when you do this, you wind up with this form becoming just the sum over i of lambda sub i to the minus 1 -- namely, these eigenvalues to the minus 1; these eigenvalues are all positive -- times the inner product of z with q sub i, squared. In other words, you take whatever vector z you're interested in here, you project it onto these k orthonormal vectors q sub i, you get those k values, and then this form here is just that sum there.
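In symbols, the eigenvector relation and the resulting expansion of the form are:

\[
K_Z\, q_i = \lambda_i\, q_i, \qquad \lambda_i > 0, \qquad q_i^T q_j = \delta_{ij},
\]
\[
z^T K_Z^{-1}\, z \;=\; \sum_{i=1}^{k} \lambda_i^{-1}\,\langle z, q_i\rangle^2 .
\]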
So when you write the probability density in that way -- we still have this here; we'll get rid of that in a minute -- you have e to the minus sum over i of these inner product terms squared, divided by 2 times lambda sub i. That's just because this is equal to this; it's just substituting this for this in the formula for the probability density. OK. What does that say pictorially? Let me show you a picture of it first. It says that for any positive definite matrix, and therefore for any covariance matrix, you can always find these orthonormal vectors. I've drawn them here for two dimensions. They're just some arbitrary vector q1, and q2 has to be orthogonal to it. And if you look at the square root of lambda 1 times q1, you get a point here. You look at the square root of lambda 2 times q2, you get a point here. If you then look at this probability density here, you see that all the points on this ellipse here have the same value of that weighted sum of inner products, and therefore the same density. OK. It looked a little better over here. Yes. OK. Namely, the points little z for which this is constant are the points which form this ellipse. And it's the ellipse which has the axes square root of lambda i times q sub i. And then you just imagine it if it's lined up this way, and think of what you would get if you were looking at the lines of equal probability density for independent Gaussian random variables with different variances. I mean, we already pointed out that if the variances are all the same, these equal probability contours are spheres. If you now expand some of the axes, you get ellipses. And now we've taken these arbitrary vectors -- so these are not in the directions we started out with, but in some arbitrary directions. We have q1 and q2 orthonormal to each other, because that's the way we've chosen them. And then the probability density has this form, which is this form.
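Written out in these coordinates (using the fact that det K_Z is the product of the eigenvalues), the density is:

\[
f_Z(z) = \frac{1}{(2\pi)^{k/2}\sqrt{\lambda_1\cdots\lambda_k}}\,
\exp\!\left(-\sum_{i=1}^{k}\frac{\langle z, q_i\rangle^2}{2\lambda_i}\right),
\]

so the contours of equal density are the ellipses $\sum_i \langle z, q_i\rangle^2/\lambda_i = \text{constant}$, with axes along the $q_i$ scaled by $\sqrt{\lambda_i}$.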
And the other thing we can do is to take this form here and say, gee, this is just the probability density for a bunch of independent random variables, where the independent random variables are the inner product of the random vector Z with q sub 1, up to the inner product of Z with q sub k. So these are independent Gaussian random variables, and they have variances lambda sub i. And this is the nicest formula for the probability density of an arbitrary set of jointly Gaussian random variables. OK. In other words, what this says is that in general, if you have a set of jointly Gaussian random variables and you have this messy form here, all you're doing is looking at them in the wrong coordinate system. If you switch them around -- you look at them this way instead of this way -- you're going to have independent Gaussian random variables. And the way to look at them is found by solving this eigenvector-eigenvalue equation, which will tell you what these orthogonal directions are. And then it'll tell you how to get this nice picture that looks this way. OK. So that tells us what we need to know -- maybe even a little more than we have to know -- about jointly Gaussian random variables. But there's one more bonus we get. And the bonus is the following. If you create a matrix B here, where the i'th row of B is this vector q sub i divided by the square root of lambda sub i, then the corresponding element here is going to be a normalized Gaussian random variable with variance one, and all of these are going to be independent of each other. OK. That's essentially what we were saying before; this is just another way of saying the same thing. When you squish this probability density around and look at it in a different frame of reference, and then you scale the random variables down or up, what you wind up with is IID normal Gaussian random variables. OK. But that says that Z is equal to B to the minus 1 times N prime.
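This "bonus" is easy to check numerically. Here is a minimal sketch in numpy (the variable names and the use of numpy are illustrative assumptions, not part of the lecture): it builds a covariance K_Z = A A^T, forms B with i'th row q_i^T divided by the square root of lambda_i, and verifies that N' = B Z has sample covariance close to the identity.

    import numpy as np

    rng = np.random.default_rng(0)
    k = 3

    # An arbitrary non-singular A defines jointly Gaussian Z = A N.
    A = rng.normal(size=(k, k))
    K_Z = A @ A.T                          # covariance matrix of Z

    # Eigendecomposition K_Z q_i = lambda_i q_i; eigh returns orthonormal q_i as columns.
    lam, Q = np.linalg.eigh(K_Z)

    # B has i'th row q_i^T / sqrt(lambda_i), so N' = B Z should be IID N(0, 1).
    B = Q.T / np.sqrt(lam)[:, None]

    # Draw many samples of Z and look at the sample covariance of N' = B Z.
    N = rng.normal(size=(k, 100_000))
    Z = A @ N
    N_prime = B @ Z
    print(np.cov(N_prime))                 # close to the identity matrix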
Well, so what? Here's the reason for "so what." We started out with a definition of jointly Gaussian which is probably not the definition of jointly Gaussian you've seen, if you've seen this before. Then what we did was to say, OK, if you have jointly Gaussian random variables and none of them is linearly dependent on the others, then this matrix A is invertible. From that we derived the probability density. From the probability density we derived this. OK. The only thing you need to get this is the probability density. And this says that any time you have a random vector Z with this probability density that we just wrote down, then you have the property that N prime is equal to B times Z, and Z is equal to B to the minus 1 times N prime. Which says that if you have this probability density, then you have jointly Gaussian random variables. So we have an alternate definition of jointly Gaussian: random variables are jointly Gaussian if they have this probability density, because then you can represent them as linear combinations of normal random variables. OK. Then there's something even simpler. It says that if all linear combinations of a random vector Z are Gaussian, then Z is jointly Gaussian. Why is that? Well, if you look at this formula here, it says: take any old random vector at all that has a covariance matrix. From that covariance matrix, we can solve for all of these eigenvectors. If I form the appropriate random variables here from this transformation, those random variables are uncorrelated with each other; they are in fact statistically independent of each other. And it follows from that that if all these linear combinations are Gaussian, then Z has to be jointly Gaussian. OK. So we now have three definitions. This one is the simplest one to state; it's the hardest one to work with. That's life, you know.
This one is probably the most straightforward, because you don't have to know anything to understand this definition. This original definition is physically the most appealing, because it shows why noise vectors actually do have this property. OK. So here's a summary of all of this. It says: if Kz is singular, you want to remove the linearly dependent random variables. You just take them away, because they're uniquely specified in terms of the other variables. And then, with the resulting non-singular covariance matrix, Z is going to be jointly Gaussian if and only if Z is equal to A N for some normal random vector N, or if Z has the jointly Gaussian density, or if all linear combinations of Z are Gaussian. All of this is for zero mean; with a mean, it applies to the fluctuation. OK. So why do we have to know all of that about jointly Gaussian random variables? Well, because everything about Gaussian processes depends on jointly Gaussian random variables. You can't do anything with Gaussian processes without being able to look at these jointly Gaussian random variables. And the reason is that we said Z of t is a Gaussian process if, for every k and every set of time instants t1 up to t sub k, the set Z of t1 up to Z of tk is a jointly Gaussian random vector. OK. So that directly links the definition of a Gaussian process to Gaussian random vectors. OK.
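Stated compactly, that definition is:

\[
\{Z(t)\}\ \text{is a Gaussian process if, for every } k \text{ and all } t_1,\dots,t_k,\quad
\big(Z(t_1),\dots,Z(t_k)\big)\ \text{is a jointly Gaussian random vector.}
\]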
786 00:45:42,210 --> 00:45:44,950 OK remember at the beginning we started looking at this 787 00:45:44,950 --> 00:45:50,470 process the sum of a set of normalized Gaussian random 788 00:45:50,470 --> 00:45:52,980 variables times sinc functions times 789 00:45:52,980 --> 00:45:55,800 displaced sinc functions. 790 00:45:55,800 --> 00:45:57,750 You can also do the same thing with Fourier 791 00:45:57,750 --> 00:46:00,200 coefficients or anything. 792 00:46:00,200 --> 00:46:03,020 You got a fairly general set of processes that way. 793 00:46:06,890 --> 00:46:09,780 unfortunately they don't quite work, because if you look at 794 00:46:09,780 --> 00:46:12,730 the sinc functions and you look at noise, which is 795 00:46:12,730 --> 00:46:16,510 independent and identically distributed in time, then the 796 00:46:16,510 --> 00:46:20,540 sample functions are going to have infinite energy. 797 00:46:20,540 --> 00:46:24,370 I mean that kind of process just runs on forever. 798 00:46:24,370 --> 00:46:27,140 It runs on with finite power forever. 799 00:46:27,140 --> 00:46:30,510 And therefore it has infinite energy. 800 00:46:30,510 --> 00:46:35,550 And therefore the simplest process to look at, the sample 801 00:46:35,550 --> 00:46:41,680 functions are non-L2 with probability one, which is sort 802 00:46:41,680 --> 00:46:43,250 of unfortunate. 803 00:46:43,250 --> 00:46:47,490 So we say OK we don't care about that, because if you 804 00:46:47,490 --> 00:46:52,160 want to look at that process a sum of Z sub i times sinc 805 00:46:52,160 --> 00:46:56,410 functions, what do you care about? 806 00:46:56,410 --> 00:47:00,500 I mean you only care about the terms in that expansion, which 807 00:47:00,500 --> 00:47:07,600 run from the big bang until the next big bang. 808 00:47:07,600 --> 00:47:09,270 OK. 809 00:47:09,270 --> 00:47:12,620 We certainly don't care about it before that or after that. 810 00:47:12,620 --> 00:47:15,630 And if we look within those finite time limits, then all 811 00:47:15,630 --> 00:47:19,570 these functions are going to be L2. 812 00:47:19,570 --> 00:47:22,700 Because they just last for a finite amount of time. 813 00:47:22,700 --> 00:47:26,890 So all we need to do is to truncate these things somehow. 814 00:47:26,890 --> 00:47:29,640 And we're going to diddle around a little bit with a 815 00:47:29,640 --> 00:47:33,650 question of how to truncate these series. 816 00:47:33,650 --> 00:47:36,990 But for the time being we just say we can do that. 817 00:47:36,990 --> 00:47:38,200 And we will do it. 818 00:47:38,200 --> 00:47:41,450 So we can look at any processes the form sum of Zi 819 00:47:41,450 --> 00:47:45,660 times phi i of t, where the Zi are independent and the phi i 820 00:47:45,660 --> 00:47:48,510 of t are orthonormal. 821 00:47:48,510 --> 00:47:54,180 And to make things L2, we're going to assume that the sum 822 00:47:54,180 --> 00:48:04,770 over i of Zi squared bar is less than infinity. 823 00:48:04,770 --> 00:48:05,090 OK. 824 00:48:05,090 --> 00:48:08,690 In other words we only take a finite number of these things, 825 00:48:08,690 --> 00:48:11,230 or if we want to take infinite a number of them, the 826 00:48:11,230 --> 00:48:15,040 variances are going to go off to zero. 
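To make that truncation concrete, here is a small sketch of my own (the truncation length, grid, and seed are arbitrary assumptions, and the sinc functions are the displaced, unit-spaced ones mentioned above): one sample function of the truncated process, whose energy is finite because only finitely many terms are kept.

import numpy as np

rng = np.random.default_rng(1)

k = 50                                   # keep only k terms, so the sum of variances is finite
t = np.linspace(-10, 60, 7001)
dt = t[1] - t[0]

Zi = rng.standard_normal(k)              # IID normal coefficients Z_i
phi = np.array([np.sinc(t - i) for i in range(k)])   # displaced sinc functions (orthonormal)

z_sample = Zi @ phi                      # one truncated sample function of Z(t)

# This sample function is L2: its energy is finite, and its expected energy
# is the sum of the k variances, which is k here.
print("energy of this sample function:", np.sum(z_sample ** 2) * dt)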
827 00:48:15,040 --> 00:48:17,600 And I don't know whether you're proving it in the 828 00:48:17,600 --> 00:48:20,590 homework this time or you will prove it in the homework next 829 00:48:20,590 --> 00:48:24,460 time, I forget, but you're going to look at the question 830 00:48:24,460 --> 00:48:29,040 of why this finite variance condition makes these sample 831 00:48:29,040 --> 00:48:37,120 functions be L2 with probability one. 832 00:48:37,120 --> 00:48:37,530 OK. 833 00:48:37,530 --> 00:48:43,050 So if you had this condition, then all your sample functions 834 00:48:43,050 --> 00:48:44,840 are going to be L2. 835 00:48:44,840 --> 00:48:46,670 I'm going to get off all of this L2 836 00:48:46,670 --> 00:48:48,870 business relatively shortly. 837 00:48:48,870 --> 00:48:51,270 I want to do a little bit of it to start with. 838 00:48:51,270 --> 00:48:55,600 Because if any of you have start doing any research in 839 00:48:55,600 --> 00:48:59,150 this area, at some point you're going to be merrily 840 00:48:59,150 --> 00:49:02,880 working away calculating all sorts of things. 841 00:49:02,880 --> 00:49:06,440 And suddenly you're going to find that none of it exists, 842 00:49:06,440 --> 00:49:09,380 because of these problems of infinite energy. 843 00:49:09,380 --> 00:49:11,530 And you're going to get very puzzled. 844 00:49:11,530 --> 00:49:14,260 So one of the things I tried to do in the notes is to write 845 00:49:14,260 --> 00:49:17,240 them in way that you can understand them at a first 846 00:49:17,240 --> 00:49:20,370 reading without worrying about any of this. 847 00:49:20,370 --> 00:49:23,860 And then when you go back for a second reading, you can pick 848 00:49:23,860 --> 00:49:26,610 up all the mathematics that you need. 849 00:49:26,610 --> 00:49:29,810 So that in fact you won't have the problem of suddenly 850 00:49:29,810 --> 00:49:32,310 finding out that three-quarters of your thesis 851 00:49:32,310 --> 00:49:35,150 has to be thrown away, because you've been dealing with 852 00:49:35,150 --> 00:49:37,850 things that don't make any sense. 853 00:49:37,850 --> 00:49:38,170 OK. 854 00:49:38,170 --> 00:49:41,180 So, we're going to define linear functionals in the 855 00:49:41,180 --> 00:49:42,050 following way. 856 00:49:42,050 --> 00:49:46,340 We're going to first look at the sample functions of this 857 00:49:46,340 --> 00:49:48,820 random process Z. OK. 858 00:49:48,820 --> 00:49:51,600 Now we talked about this last time. 859 00:49:51,600 --> 00:49:56,170 If you have a random process Z than really what you have is a 860 00:49:56,170 --> 00:49:59,530 set of functions defined on some samples space. 861 00:49:59,530 --> 00:50:03,560 So the quantities you're interested in is what is the 862 00:50:03,560 --> 00:50:11,760 value of the random process at time t for sample point omega. 863 00:50:11,760 --> 00:50:14,540 OK. 864 00:50:14,540 --> 00:50:19,490 If we look at that and make it for a given omega, this thing 865 00:50:19,490 --> 00:50:21,270 becomes a function of t. 866 00:50:21,270 --> 00:50:24,200 In fact for a given omega, this is just what we've been 867 00:50:24,200 --> 00:50:28,000 calling a sample element of the random process. 868 00:50:28,000 --> 00:50:30,770 So if we take this sample element, look at the inner 869 00:50:30,770 --> 00:50:34,600 product of that with some function g of t. 870 00:50:34,600 --> 00:50:38,720 In other words we look at the integral of Z of t omega times 871 00:50:38,720 --> 00:50:41,290 g of t, dt. 
872 00:50:41,290 --> 00:50:45,620 And if all these sample functions are L2 and if g of t 873 00:50:45,620 --> 00:50:49,170 is L2, what happens when you take the integral of an L2 874 00:50:49,170 --> 00:50:56,020 function times an L2 function, which is the inner product of 875 00:50:56,020 --> 00:51:00,310 something L2 with something L2 which says something with 876 00:51:00,310 --> 00:51:02,930 finite energy inner product with 877 00:51:02,930 --> 00:51:05,950 something with finite energy. 878 00:51:05,950 --> 00:51:11,560 Well the Schwarz inequality tells you that if this has 879 00:51:11,560 --> 00:51:15,310 finite energy and this has finite energy, the inner 880 00:51:15,310 --> 00:51:17,740 product exists. 881 00:51:17,740 --> 00:51:19,920 That's the reason why we went through the Schwarz 882 00:51:19,920 --> 00:51:20,610 inequality. 883 00:51:20,610 --> 00:51:22,900 It's the main reason for doing that. 884 00:51:22,900 --> 00:51:25,250 So these things have finite value. 885 00:51:25,250 --> 00:51:30,220 So V of omega the results of doing this namely V as a 886 00:51:30,220 --> 00:51:33,380 function of the sample space is a real number. 887 00:51:33,380 --> 00:51:38,380 And it's a real number for the sample points of omega with 888 00:51:38,380 --> 00:51:42,280 probability one, which means we can talk about V as a 889 00:51:42,280 --> 00:51:43,800 random variable. 890 00:51:43,800 --> 00:51:45,060 OK. 891 00:51:45,060 --> 00:51:49,290 And now V is a random variable which is defined in this way. 892 00:51:49,290 --> 00:51:52,140 And from now on we will call these things linear 893 00:51:52,140 --> 00:51:55,530 functionals which are in fact the integral of a random 894 00:51:55,530 --> 00:51:58,940 process times a function. 895 00:51:58,940 --> 00:52:02,520 And we can take that kind of integral. 896 00:52:02,520 --> 00:52:05,830 It sort of looks like the linear combinations of things 897 00:52:05,830 --> 00:52:11,930 we were doing before when we were talking about matrices 898 00:52:11,930 --> 00:52:13,180 and random vectors. 899 00:52:21,280 --> 00:52:21,800 OK. 900 00:52:21,800 --> 00:52:24,830 If we restrict the random process to have the following 901 00:52:24,830 --> 00:52:27,110 form, where these are independent and these are 902 00:52:27,110 --> 00:52:33,160 orthonormal, then one of these linear functionals is given by 903 00:52:33,160 --> 00:52:36,640 the random variable V is going to be the integral of Z of t 904 00:52:36,640 --> 00:52:40,790 times g of t, but Z of t is this. 905 00:52:40,790 --> 00:52:45,190 And at this point we're not going to fuss about 906 00:52:45,190 --> 00:52:47,910 interchanging integrals with summations. 907 00:52:47,910 --> 00:52:50,610 You have the machinery to do it, because we're now dealing 908 00:52:50,610 --> 00:52:52,870 with an L2 space. 909 00:52:52,870 --> 00:52:54,730 We're not going to fuss about it. 910 00:52:54,730 --> 00:52:57,810 And I advise you not to fuss about it. 911 00:52:57,810 --> 00:53:02,080 So we have a sum of these random variables here times 912 00:53:02,080 --> 00:53:03,170 these integrals here. 913 00:53:03,170 --> 00:53:07,320 These integrals here are just projections of g of t on this 914 00:53:07,320 --> 00:53:09,440 space of orthonormal functions. 915 00:53:09,440 --> 00:53:12,590 So whatever space of orthonormal functions gives 916 00:53:12,590 --> 00:53:16,810 you your jollies, use it talk about the inner products on 917 00:53:16,810 --> 00:53:18,130 that space. 
918 00:53:18,130 --> 00:53:22,070 This gives you a nice inner product space of 919 00:53:22,070 --> 00:53:24,760 sequences of numbers. 920 00:53:24,760 --> 00:53:28,210 And then if the z i are jointly Gaussian, then V is 921 00:53:28,210 --> 00:53:30,110 going to be Gaussian. 922 00:53:30,110 --> 00:53:35,020 And then to generalize this one little bit further, if you 923 00:53:35,020 --> 00:53:40,010 take a whole bunch of L2 functions, g1 of t, g2 of t, and 924 00:53:40,010 --> 00:53:43,030 so forth, you can talk about a whole bunch of random 925 00:53:43,030 --> 00:53:47,670 variables V1 up to V sub j. 926 00:53:47,670 --> 00:53:50,880 And V sub j is going to be the integral of Z of 927 00:53:50,880 --> 00:53:53,450 t times gj of t, dt. 928 00:53:53,450 --> 00:53:57,540 Remember this thing looks very simple. 929 00:53:57,540 --> 00:54:00,180 It looks like the -- 930 00:54:00,180 --> 00:54:04,500 like the convolutions you've been doing all your life. 931 00:54:04,500 --> 00:54:05,080 It's not. 932 00:54:05,080 --> 00:54:08,510 It's really a rather peculiar quantity. 933 00:54:08,510 --> 00:54:12,510 This in fact is what we call a linear functional, and is the 934 00:54:12,510 --> 00:54:16,720 integral of a random process times this. 935 00:54:16,720 --> 00:54:19,650 Which we have defined in terms of the sample functions of the 936 00:54:19,650 --> 00:54:22,270 random process. 937 00:54:22,270 --> 00:54:26,450 And now we said OK, now that we understand what it is, we will 938 00:54:26,450 --> 00:54:28,770 just write this all the time. 939 00:54:28,770 --> 00:54:32,300 But I just caution you not to let 940 00:54:32,300 --> 00:54:33,980 familiarity breed contempt. 941 00:54:33,980 --> 00:54:39,440 Because this is a rather peculiar notion. 942 00:54:39,440 --> 00:54:41,610 And a rather powerful notion. 943 00:54:41,610 --> 00:54:42,100 OK. 944 00:54:42,100 --> 00:54:45,800 So these things are jointly Gaussian. 945 00:54:45,800 --> 00:54:49,700 We want to take the expected value of V sub i times V sub 946 00:54:49,700 --> 00:54:55,020 j, and now we're going to do this without worrying about 947 00:54:55,020 --> 00:54:56,640 being careful at all. 948 00:54:56,640 --> 00:55:00,900 We have the expected value of the integral of Z of t, gi of 949 00:55:00,900 --> 00:55:07,520 t, dt times the integral of Z of tau, gj of tau, d tau. 950 00:55:07,520 --> 00:55:11,400 And now we're going to slide this expected value inside of 951 00:55:11,400 --> 00:55:14,010 both of these integrals. 952 00:55:14,010 --> 00:55:18,890 And not worry about it. 953 00:55:18,890 --> 00:55:21,400 And therefore what we're going to have is a double integral 954 00:55:21,400 --> 00:55:27,500 of gi of t, expected value of z of t times z of tau, times gj of 955 00:55:27,500 --> 00:55:34,470 tau, dt, d tau, which is this thing here. 956 00:55:34,470 --> 00:55:38,210 Which you should compare with what we've been dealing with 957 00:55:38,210 --> 00:55:41,300 most of the lecture today. 958 00:55:41,300 --> 00:55:47,200 This is the same kind of form for a covariance function as 959 00:55:47,200 --> 00:55:50,420 we've been dealing with for covariance matrices. 960 00:55:50,420 --> 00:55:55,660 It has very similar effects. 961 00:55:55,660 --> 00:55:57,900 I mean before you were just talking about finite 962 00:55:57,900 --> 00:56:02,060 dimensional matrices, which is all simple mathematically, with 963 00:56:02,060 --> 00:56:04,590 eigenvectors and eigenvalues.
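A quick numerical check of that double-integral formula, written as a sketch of my own: the orthonormal functions, the variances, the two functions g1 and g2, and the grid are all arbitrary assumptions. It computes E[V1 V2] once through the double integral of g1(t) K_Z(t, tau) g2(tau) and once through the expansion coefficients, and the two agree.

import numpy as np

t = np.linspace(0, 10, 1000)
dt = t[1] - t[0]

# Orthonormal functions phi_m (unit-spaced sincs) and assumed variances of the
# independent zero-mean coefficients Z_m.
M = 10
phi = np.array([np.sinc(t - m) for m in range(M)])
sigma2 = np.linspace(1.0, 0.1, M)

# Covariance function K_Z(t, tau) = sum over m of sigma2_m phi_m(t) phi_m(tau)
K = (phi * sigma2[:, None]).T @ phi

# Two L2 functions g1 and g2 (arbitrary choices)
g1 = np.exp(-(t - 3.0) ** 2)
g2 = np.exp(-(t - 5.0) ** 2) * np.cos(2 * t)

# E[V1 V2] as the double integral of g1(t) K_Z(t, tau) g2(tau) dt dtau
cov_integral = g1 @ K @ g2 * dt * dt

# The same thing through the expansion: V_j = sum over m of Z_m <phi_m, g_j>,
# so E[V1 V2] = sum over m of sigma2_m <phi_m, g1> <phi_m, g2>.
a1 = phi @ g1 * dt
a2 = phi @ g2 * dt
cov_expansion = np.sum(sigma2 * a1 * a2)

print(cov_integral, cov_expansion)   # agree to numerical precision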
964 00:56:04,590 --> 00:56:07,120 You have eigenfunctions and eigenvalues of 965 00:56:07,120 --> 00:56:10,110 these things also. 966 00:56:10,110 --> 00:56:14,690 And so long as these are defined nicely by these L2 967 00:56:14,690 --> 00:56:17,270 properties we've been talking about. 968 00:56:17,270 --> 00:56:20,130 In fact you can deal with these in virtually the same 969 00:56:20,130 --> 00:56:25,370 way that you can deal with the matrices we were 970 00:56:25,370 --> 00:56:27,130 dealing with before. 971 00:56:27,130 --> 00:56:30,240 If you just remember what the results are from matrices, you 972 00:56:30,240 --> 00:56:34,770 can guess what they are for these covariance functions. 973 00:56:34,770 --> 00:56:35,020 OK. 974 00:56:35,020 --> 00:56:39,270 But anyway you can find the expected value of Vi times Vj 975 00:56:39,270 --> 00:56:40,630 by this formula. 976 00:56:40,630 --> 00:56:44,050 Again we're dealing with zero-mean and therefore we 977 00:56:44,050 --> 00:56:46,820 don't have to worry about the mean, put that in later. 978 00:56:51,070 --> 00:56:54,390 And that all exists. 979 00:56:54,390 --> 00:56:54,740 OK. 980 00:56:54,740 --> 00:56:58,540 So the next thing we want to deal with, hitting 981 00:56:58,540 --> 00:56:59,590 you with a lot today. 982 00:56:59,590 --> 00:57:05,500 But I mean the trouble is a lot of this is half familiar 983 00:57:05,500 --> 00:57:07,790 to most of you. 984 00:57:07,790 --> 00:57:10,680 People who have taken various communication courses at 985 00:57:10,680 --> 00:57:15,320 various places have all been exposed to random processes in 986 00:57:15,320 --> 00:57:18,980 some highly trivialized sense. 987 00:57:18,980 --> 00:57:21,480 But the major results are the same as the results we're 988 00:57:21,480 --> 00:57:22,460 going through here. 989 00:57:22,460 --> 00:57:25,010 And all we're doing here is adding a little bit of 990 00:57:25,010 --> 00:57:28,380 carefulness about what works and what doesn't work. 991 00:57:28,380 --> 00:57:32,850 Incidentally in the notes which is towards the end of 992 00:57:32,850 --> 00:57:39,250 lectures 14 and 15, we give three examples which let you 993 00:57:39,250 --> 00:57:43,960 know why in fact we want to look primarily at random 994 00:57:43,960 --> 00:57:48,790 processes which are defined in terms of a sum of independent 995 00:57:48,790 --> 00:57:52,790 Gaussian random variables time orthonormal functions. 996 00:57:52,790 --> 00:57:56,430 And if you look at those three examples, some 997 00:57:56,430 --> 00:57:58,410 of them have problems. 998 00:57:58,410 --> 00:58:02,560 Because of the fact that everything you're dealing with 999 00:58:02,560 --> 00:58:04,610 has infinite energy. 1000 00:58:04,610 --> 00:58:07,810 And therefore it doesn't really make any sense. 1001 00:58:07,810 --> 00:58:10,980 And one of them I should talk about this just a 1002 00:58:10,980 --> 00:58:11,850 little bit in class. 1003 00:58:11,850 --> 00:58:14,730 And I think I still have a couple of minutes, is a very 1004 00:58:14,730 --> 00:58:24,140 strange process where Z of t is IID. 1005 00:58:24,140 --> 00:58:26,510 In fact just let it be normal. 1006 00:58:30,310 --> 00:58:34,600 And independent for all t. 1007 00:58:39,170 --> 00:58:39,470 OK. 1008 00:58:39,470 --> 00:58:45,810 In other words you generate a random process by looking at 1009 00:58:45,810 --> 00:58:48,710 an uncountably infinite collection of 1010 00:58:48,710 --> 00:58:52,260 normal random variables. 
1011 00:58:52,260 --> 00:58:56,310 How do you deal with such a process? 1012 00:58:56,310 --> 00:58:58,920 I don't know how to deal with it. 1013 00:58:58,920 --> 00:59:02,900 I mean it sounds like it's simple. 1014 00:59:02,900 --> 00:59:06,610 If I put this on a quiz, three-quarters of you would 1015 00:59:06,610 --> 00:59:08,310 say oh that's very simple. 1016 00:59:08,310 --> 00:59:13,280 What we're dealing with is a family of impulse functions. 1017 00:59:13,280 --> 00:59:15,800 Spaced arbitrarily closely together. 1018 00:59:15,800 --> 00:59:19,410 This is not impulse function. 1019 00:59:19,410 --> 00:59:22,720 Impulse functions are even worse than this, but this is 1020 00:59:22,720 --> 00:59:25,000 bad enough. 1021 00:59:25,000 --> 00:59:28,050 When we start talking about spectral density, we can 1022 00:59:28,050 --> 00:59:33,020 explain this a little bit better by thinking this kind 1023 00:59:33,020 --> 00:59:36,990 of process it doesn't make any sense. 1024 00:59:36,990 --> 00:59:39,680 But this kind of process, if you look at its spectral 1025 00:59:39,680 --> 00:59:46,440 density, it's going to have a spectral density which is zero 1026 00:59:46,440 --> 00:59:50,390 everywhere, but whose integral over all frequencies is one. 1027 00:59:53,060 --> 00:59:54,800 OK. 1028 00:59:54,800 --> 00:59:57,780 In other words it's not something you want to wish on 1029 00:59:57,780 --> 00:59:59,670 your on your worse friend. 1030 00:59:59,670 --> 01:00:02,200 It makes a certain amount of sense as a limit of things. 1031 01:00:02,200 --> 01:00:06,450 You can look at a very broadband process where in 1032 01:00:06,450 --> 01:00:10,180 fact you spread the process out enormously. 1033 01:00:10,180 --> 01:00:13,270 You can make pseudo noise which looks sort of like this. 1034 01:00:13,270 --> 01:00:16,280 And you make the process broader and broader and lower 1035 01:00:16,280 --> 01:00:18,370 and lower intensity everywhere. 1036 01:00:18,370 --> 01:00:20,750 But it still has this energy of one. 1037 01:00:20,750 --> 01:00:24,770 It still has a power of one everywhere. 1038 01:00:24,770 --> 01:00:26,150 And it just is ugly. 1039 01:00:26,150 --> 01:00:28,630 OK. 1040 01:00:28,630 --> 01:00:32,480 Now if you never worried about these questions of L2, you 1041 01:00:32,480 --> 01:00:36,010 would look at a process like that and say, gee there must 1042 01:00:36,010 --> 01:00:39,140 be some easy way to handle that because it's probably the 1043 01:00:39,140 --> 01:00:42,440 easiest process you can define. 1044 01:00:42,440 --> 01:00:45,050 I mean everything is normal. 1045 01:00:45,050 --> 01:00:48,430 If you look at any set at different times, you get a set 1046 01:00:48,430 --> 01:00:52,560 of IID normal Gaussian variables. 1047 01:00:52,560 --> 01:00:56,380 You try to put it together, and it doesn't mean anything. 1048 01:00:56,380 --> 01:00:59,700 If you pass it through a filter, the filter is going to 1049 01:00:59,700 --> 01:01:02,220 cancel it all out. 1050 01:01:02,220 --> 01:01:09,190 So anyway, that's one reason why we want to look at this 1051 01:01:09,190 --> 01:01:14,470 restricted class of random processes we're looking at. 1052 01:01:14,470 --> 01:01:14,880 OK. 1053 01:01:14,880 --> 01:01:19,430 What we're interested in now is we want to take a Gaussian 1054 01:01:19,430 --> 01:01:24,140 random process really, but you can take any random process. 
1055 01:01:24,140 --> 01:01:27,370 We want to pass it through a filter, and we want to look at 1056 01:01:27,370 --> 01:01:30,180 the random process that comes out. 1057 01:01:30,180 --> 01:01:30,440 OK. 1058 01:01:30,440 --> 01:01:33,720 And that certainly is a very physical kind of operation. 1059 01:01:33,720 --> 01:01:37,320 I mean any kind of communication system that you 1060 01:01:37,320 --> 01:01:42,870 build is going to have noise on the channel. 1061 01:01:42,870 --> 01:01:45,740 And one of the first things you're going to do is you're 1062 01:01:45,740 --> 01:01:48,860 going to filter what you've received. 1063 01:01:48,860 --> 01:01:52,190 So you have to have some way of dealing with this. 1064 01:01:52,190 --> 01:01:53,060 OK. 1065 01:01:53,060 --> 01:01:58,190 And the way we sort of been dealing with it all along in 1066 01:01:58,190 --> 01:02:02,750 terms of the transmitted way forms we've been dealing with 1067 01:02:02,750 --> 01:02:03,860 is to say OK. 1068 01:02:03,860 --> 01:02:07,280 What we're going to do is to first look what happens when 1069 01:02:07,280 --> 01:02:10,360 we take sample functions of this, pass them through the 1070 01:02:10,360 --> 01:02:13,570 filter, and then what comes out is going to be some 1071 01:02:13,570 --> 01:02:15,220 function again. 1072 01:02:15,220 --> 01:02:18,600 And then we're back into what you studied as an 1073 01:02:18,600 --> 01:02:22,410 undergraduate talking about functions through filters. 1074 01:02:22,410 --> 01:02:25,330 We're going to jazz it up a little bit by saying these 1075 01:02:25,330 --> 01:02:27,680 functions are going to be L2 the filters 1076 01:02:27,680 --> 01:02:29,000 are going to be L2. 1077 01:02:29,000 --> 01:02:31,020 So that in fact you know you'd get something 1078 01:02:31,020 --> 01:02:33,490 out that make sense. 1079 01:02:33,490 --> 01:02:37,480 So what we're doing is looking at these sample functions V 1080 01:02:37,480 --> 01:02:43,020 the output at time tau for sample point omega is going to 1081 01:02:43,020 --> 01:02:51,760 be the convolution of the input at time t and sample 1082 01:02:51,760 --> 01:02:52,570 point omega. 1083 01:02:52,570 --> 01:02:57,260 Remember this one sample point exists for all time. 1084 01:02:57,260 --> 01:03:00,850 That's why these sample points and sample spaces are so damn 1085 01:03:00,850 --> 01:03:02,260 complicated. 1086 01:03:02,260 --> 01:03:04,650 Because they have everything in them. 1087 01:03:04,650 --> 01:03:05,440 OK. 1088 01:03:05,440 --> 01:03:09,410 So there's one sample point which exists for all of them. 1089 01:03:09,410 --> 01:03:11,650 This is a sample function. 1090 01:03:11,650 --> 01:03:14,830 You're passing the sample function through a filter, 1091 01:03:14,830 --> 01:03:17,920 which is just normal convolution. 1092 01:03:17,920 --> 01:03:20,350 What comes out then. 1093 01:03:20,350 --> 01:03:26,240 If in fact we express this random process in terms of 1094 01:03:26,240 --> 01:03:30,230 this orthonormal sum the way we've been doing before. 1095 01:03:30,230 --> 01:03:36,060 Is you get the sum over j of this, which is a sample value 1096 01:03:36,060 --> 01:03:39,470 of the j'th random variable coming out times the integral 1097 01:03:39,470 --> 01:03:43,800 of pj of t, h of tau minus t, d tau. 1098 01:03:43,800 --> 01:03:44,260 OK. 1099 01:03:44,260 --> 01:03:48,430 For each tau that you look at, this is just a sample value of 1100 01:03:48,430 --> 01:03:50,030 a linear functional. 1101 01:03:50,030 --> 01:03:50,570 OK. 
1102 01:03:50,570 --> 01:03:54,500 If I want to look at this at one value of tau, I have this 1103 01:03:54,500 --> 01:03:59,990 integral here which is a random process. 1104 01:03:59,990 --> 01:04:02,740 A sample value of a random process at 1105 01:04:02,740 --> 01:04:05,230 omega time a function. 1106 01:04:05,230 --> 01:04:08,500 This is just a function of t for given tau. 1107 01:04:08,500 --> 01:04:09,080 OK. 1108 01:04:09,080 --> 01:04:11,770 So this is a linear functional. 1109 01:04:11,770 --> 01:04:15,400 As a type we've been talking about. 1110 01:04:15,400 --> 01:04:17,740 And that linear functional is then given by 1111 01:04:17,740 --> 01:04:20,860 this for each tau. 1112 01:04:20,860 --> 01:04:22,460 This is a sample value of a linear 1113 01:04:22,460 --> 01:04:26,020 functional we can talk about. 1114 01:04:26,020 --> 01:04:26,330 OK. 1115 01:04:26,330 --> 01:04:30,620 These things are then, if you look over the whole sample 1116 01:04:30,620 --> 01:04:35,230 space omega, V of tau becomes a random variable. 1117 01:04:35,230 --> 01:04:36,520 OK. 1118 01:04:36,520 --> 01:04:41,940 V of tau is the random variable whose sample values 1119 01:04:41,940 --> 01:04:49,200 are V of t omega, and they're given by this. 1120 01:04:49,200 --> 01:04:55,050 So if z of t is Gaussian process, you get jointly 1121 01:04:55,050 --> 01:05:02,400 Gaussian linear functionals at each of any set of times tau 1122 01:05:02,400 --> 01:05:05,150 1, tau 2 up to tau sub k. 1123 01:05:05,150 --> 01:05:09,650 So this just gives you a whole set of linear functionals. 1124 01:05:09,650 --> 01:05:14,930 And if z of t is a Gaussian process, then all these linear 1125 01:05:14,930 --> 01:05:19,520 functionals are going to be jointly Gaussian also. 1126 01:05:19,520 --> 01:05:20,460 And bingo. 1127 01:05:20,460 --> 01:05:23,310 What we have then is an alternate way to generate 1128 01:05:23,310 --> 01:05:25,370 Gaussian processes. 1129 01:05:25,370 --> 01:05:25,730 OK. 1130 01:05:25,730 --> 01:05:29,860 In other words you can generate a Gaussian process by 1131 01:05:29,860 --> 01:05:34,320 specifying at each set of k times, you have jointly 1132 01:05:34,320 --> 01:05:36,140 Gaussian random variables. 1133 01:05:36,140 --> 01:05:39,890 But once you do that, once you understand one Gaussian 1134 01:05:39,890 --> 01:05:42,460 process, you're off and running. 1135 01:05:42,460 --> 01:05:46,060 Because then you can pass it through any L2 filter 1136 01:05:46,060 --> 01:05:46,850 that you want to. 1137 01:05:46,850 --> 01:05:50,160 And you generate another Gaussian process. 1138 01:05:50,160 --> 01:05:57,000 So for example, if you start out with this sinc type 1139 01:05:57,000 --> 01:06:04,820 process, I mean we'll see that that has a spectral density, 1140 01:06:04,820 --> 01:06:07,470 which is flat over all frequencies. 1141 01:06:07,470 --> 01:06:10,670 And we'll talk about spectral density tomorrow. 1142 01:06:10,670 --> 01:06:12,950 But then you pass it through a linear filter, and you can 1143 01:06:12,950 --> 01:06:16,260 give it any spectral density that you want. 1144 01:06:16,260 --> 01:06:20,120 So at this point, we really have enough to talk about 1145 01:06:20,120 --> 01:06:26,810 arbitrary covariance functions just by starting out with 1146 01:06:26,810 --> 01:06:30,000 random processes, which are defined in terms of some 1147 01:06:30,000 --> 01:06:33,180 sequence of orthonormal random variables. 1148 01:06:40,280 --> 01:06:40,860 OK. 
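Here is a rough sketch of that filtering step, my own illustration rather than anything from the notes: the truncation length, the filter h, and the grid are assumed. It takes one sample function of the truncated sinc-expansion process and convolves it with an L2 impulse response, giving a sample function of the output process V.

import numpy as np

rng = np.random.default_rng(3)

dt = 0.01
t = np.arange(0.0, 60.0, dt)

# One sample function of Z(t) = sum over i of Z_i sinc(t - i), truncated to k terms
k = 40
Zi = rng.standard_normal(k)
z_sample = sum(Zi[i] * np.sinc(t - i) for i in range(k))

# An L2 filter impulse response h (a truncated decaying exponential, arbitrary choice)
h = np.exp(-t) * (t < 10)

# v(tau) = integral of z(t) h(tau - t) dt, approximated by a discrete convolution
v_sample = np.convolve(z_sample, h)[: len(t)] * dt

print(v_sample[:5])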
1149 01:06:40,860 --> 01:06:45,140 Now we can get to the covariance function of a filtered process 1150 01:06:45,140 --> 01:06:52,200 in the same way as we got the covariance for linear functionals. 1151 01:06:52,200 --> 01:06:52,500 OK. 1152 01:06:52,500 --> 01:06:54,020 And this is just computation. 1153 01:06:54,020 --> 01:06:55,380 OK. 1154 01:06:55,380 --> 01:06:56,870 So what is it? 1155 01:06:56,870 --> 01:07:01,720 The covariance function of this output process V 1156 01:07:01,720 --> 01:07:06,510 evaluated at one time r and another time s. 1157 01:07:06,510 --> 01:07:10,550 One of the nasty things about notation when you start 1158 01:07:10,550 --> 01:07:13,570 dealing with a covariance function of the input and the 1159 01:07:13,570 --> 01:07:18,860 output to a linear filter is you suddenly need to worry 1160 01:07:18,860 --> 01:07:21,440 about two times at the input and two other 1161 01:07:21,440 --> 01:07:23,160 times at the output. 1162 01:07:23,160 --> 01:07:23,640 OK. 1163 01:07:23,640 --> 01:07:29,870 Because this thing is then the expected value of V sub r 1164 01:07:29,870 --> 01:07:31,895 times V sub s. 1165 01:07:31,895 --> 01:07:32,360 OK. 1166 01:07:32,360 --> 01:07:36,510 This is a random variable, which is the process V 1167 01:07:36,510 --> 01:07:38,600 evaluated at time r. 1168 01:07:38,600 --> 01:07:43,410 This is the random variable, which is the process V 1169 01:07:43,410 --> 01:07:47,570 evaluated at a particular time s. 1170 01:07:47,570 --> 01:07:52,060 This is going to be the expected value of a product, where this 1171 01:07:52,060 --> 01:07:56,980 random variable is the integral of the random process 1172 01:07:56,980 --> 01:08:03,365 z of t times this function here, which is now 1173 01:08:03,365 --> 01:08:04,420 a function of t, 1174 01:08:04,420 --> 01:08:06,980 because we're looking at a fixed value of r. 1175 01:08:06,980 --> 01:08:10,290 So this is a linear functional, 1176 01:08:10,290 --> 01:08:12,430 which is a random variable. 1177 01:08:12,430 --> 01:08:14,210 This is another linear functional. 1178 01:08:14,210 --> 01:08:18,440 This is evaluated at some time s, which is the output of the 1179 01:08:18,440 --> 01:08:20,730 filter at time s. 1180 01:08:20,730 --> 01:08:24,940 Then we will throw caution to the wind and interchange integrals 1181 01:08:24,940 --> 01:08:27,720 with expectation and everything. 1182 01:08:27,720 --> 01:08:31,210 And then in the middle we'll have the expected value of z of t 1183 01:08:31,210 --> 01:08:36,860 times z of tau, which is the covariance function of z. 1184 01:08:36,860 --> 01:08:37,200 OK. 1185 01:08:37,200 --> 01:08:42,150 So the covariance function of z then specifies what the 1186 01:08:42,150 --> 01:08:46,370 covariance function of the output is. 1187 01:08:46,370 --> 01:08:47,310 OK. 1188 01:08:47,310 --> 01:08:53,810 So whenever you pass a random process through a filter, if 1189 01:08:53,810 --> 01:08:57,060 you know what the covariance function of the input to the 1190 01:08:57,060 --> 01:09:00,610 filter is, you can find the covariance function of the 1191 01:09:00,610 --> 01:09:02,830 output of the filter. 1192 01:09:02,830 --> 01:09:07,200 That's kind of a nasty formula; it's not very nice. 1193 01:09:07,200 --> 01:09:12,800 But anyway, the thing that it tells you holds whether this 1194 01:09:12,800 --> 01:09:16,110 random process is Gaussian or not.
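On a discrete grid that double integral turns into a matrix sandwich, which is the easiest way to compute it. A rough sketch of my own, with an assumed input covariance function and an assumed filter:

import numpy as np

dt = 0.05
t = np.arange(0.0, 20.0, dt)
n = len(t)

# Assumed input covariance function K_Z(t, tau); here exp(-|t - tau|) as an example
KZ = np.exp(-np.abs(t[:, None] - t[None, :]))

# Assumed causal filter impulse response h, and the matrix H[r, k] = h(r dt - k dt) dt
h = np.where(t < 5.0, np.exp(-t), 0.0)
H = np.zeros((n, n))
for r in range(n):
    for k in range(r + 1):
        H[r, k] = h[r - k] * dt

# K_V(r, s) = double integral of h(r - t) K_Z(t, tau) h(s - tau) dt dtau
KV = H @ KZ @ H.T

print(KV[100, 100], KV[100, 120])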
1195 01:09:16,110 --> 01:09:19,140 The only thing that determines the covariance function of the 1196 01:09:19,140 --> 01:09:22,300 output of the filter is the covariance function of the 1197 01:09:22,300 --> 01:09:25,420 input to the filter, plus of course the filter response, 1198 01:09:25,420 --> 01:09:28,440 which you need to know. OK. 1199 01:09:31,300 --> 01:09:35,360 And this is just the same kind of bilinear form that we were 1200 01:09:35,360 --> 01:09:37,700 dealing with before. 1201 01:09:37,700 --> 01:09:41,410 Next time we will talk a little bit about the fact that 1202 01:09:41,410 --> 01:09:45,210 when you're dealing with a bilinear form like this, you 1203 01:09:45,210 --> 01:09:49,220 can take these covariance functions and they have the 1204 01:09:49,220 --> 01:09:52,870 same kind of eigenvalues and eigenvectors that we had 1205 01:09:52,870 --> 01:09:55,030 before for a matrix. 1206 01:09:55,030 --> 01:09:59,630 Namely this is again going to be positive definite as a 1207 01:09:59,630 --> 01:10:04,070 function, and we will be able to find its eigenvectors and 1208 01:10:04,070 --> 01:10:05,180 its eigenvalues. 1209 01:10:05,180 --> 01:10:06,460 We can't calculate them. 1210 01:10:06,460 --> 01:10:09,000 Computers can calculate them. 1211 01:10:09,000 --> 01:10:11,310 People who've spent their lives doing this 1212 01:10:11,310 --> 01:10:13,060 can calculate them. 1213 01:10:13,060 --> 01:10:16,770 I wouldn't suggest that you spend your life doing this. 1214 01:10:16,770 --> 01:10:20,710 Because again you would be setting yourself up as a 1215 01:10:20,710 --> 01:10:25,060 second class computer, and you don't make any 1216 01:10:25,060 --> 01:10:26,910 profit out of that. 1217 01:10:26,910 --> 01:10:30,270 But anyway, we can find this in principle from this. 1218 01:10:30,270 --> 01:10:30,610 OK. 1219 01:10:30,610 --> 01:10:33,280 One of the things that we haven't talked about at all 1220 01:10:33,280 --> 01:10:37,350 yet, and which we will start talking about next time, and 1221 01:10:37,350 --> 01:10:42,720 which the next set of lecture notes, lecture 16, will deal 1222 01:10:42,720 --> 01:10:45,870 with, is the question of stationarity. 1223 01:10:45,870 --> 01:10:49,940 Let me say just a little bit about that to get into it. 1224 01:10:49,940 --> 01:10:52,535 And then we'll talk a lot more about it next time. 1225 01:10:52,535 --> 01:10:57,620 The notes will probably be on the web some time tomorrow. 1226 01:10:57,620 --> 01:11:01,770 I hope before noon, if you want to look at them. 1227 01:11:01,770 --> 01:11:08,670 Physically, suppose you look at a stochastic process and you 1228 01:11:08,670 --> 01:11:10,380 want to model it. 1229 01:11:10,380 --> 01:11:14,010 I mean, suppose you want to model a noise process. 1230 01:11:14,010 --> 01:11:16,850 How do you model a noise process? 1231 01:11:16,850 --> 01:11:20,850 Well, you look at it over a long period of time. 1232 01:11:20,850 --> 01:11:23,600 You start taking statistics about it over a 1233 01:11:23,600 --> 01:11:26,480 long period of time. 1234 01:11:26,480 --> 01:11:30,840 And somehow you want to model it from that; I mean, the 1235 01:11:30,840 --> 01:11:32,720 only thing you can look at is statistics over a 1236 01:11:32,720 --> 01:11:34,420 long period of time. 1237 01:11:34,420 --> 01:11:37,580 So if you're only looking at one process, you can look at 1238 01:11:37,580 --> 01:11:41,060 it for a year and then you can model it, and then you can use 1239 01:11:41,060 --> 01:11:44,250 that model for the next 10 years.
1240 01:11:44,250 --> 01:11:47,960 And what that's assuming is that the noise process looks 1241 01:11:47,960 --> 01:11:52,830 the same way this year as it does next year. 1242 01:11:52,830 --> 01:11:55,360 You can go further than that and say, OK I'm going to 1243 01:11:55,360 --> 01:11:59,540 manufacture cell phones or some other kind of widget. 1244 01:11:59,540 --> 01:12:02,820 And what I'm interested in then is what these noise wave 1245 01:12:02,820 --> 01:12:06,250 forms are to the whole collection of my widgets. 1246 01:12:06,250 --> 01:12:08,270 Namely different people will buy my widgets. 1247 01:12:08,270 --> 01:12:10,270 They will use them in different places. 1248 01:12:10,270 --> 01:12:13,700 So I'm interested in modeling the noise over this whole set 1249 01:12:13,700 --> 01:12:15,270 of widgets. 1250 01:12:15,270 --> 01:12:19,810 But still if you're doing that you're still almost forced to 1251 01:12:19,810 --> 01:12:25,090 deal with models which have the same statistics over at 1252 01:12:25,090 --> 01:12:28,820 least a broad range of times. 1253 01:12:28,820 --> 01:12:38,550 Sometimes when we're dealing with wireless communication we 1254 01:12:38,550 --> 01:12:41,100 say, no the channel keeps changing in time. and the 1255 01:12:41,100 --> 01:12:44,420 channel keeps changing slowly in time. 1256 01:12:44,420 --> 01:12:47,360 And therefore you don't have the same statistics 1257 01:12:47,360 --> 01:12:50,660 now as you have then. 1258 01:12:50,660 --> 01:12:54,740 If you want to understands that, believe me, the only way 1259 01:12:54,740 --> 01:12:58,180 you're going to understand it is to first understand how to 1260 01:12:58,180 --> 01:13:00,770 deal with statistics for the channel which 1261 01:13:00,770 --> 01:13:03,010 stay the same forever. 1262 01:13:03,010 --> 01:13:07,010 And once you understand those statistics, you will then be 1263 01:13:07,010 --> 01:13:10,200 in a position to start to understand what happens when 1264 01:13:10,200 --> 01:13:13,970 these statistics change slowly. 1265 01:13:13,970 --> 01:13:14,360 OK. 1266 01:13:14,360 --> 01:13:18,490 In other words, what our modeling assumption is in this 1267 01:13:18,490 --> 01:13:22,750 course, and I believe it's the right modeling assumption for 1268 01:13:22,750 --> 01:13:27,840 all engineers, is that you never start with some physical 1269 01:13:27,840 --> 01:13:31,770 phenomena and say, I want to test the hell out of this 1270 01:13:31,770 --> 01:13:36,500 until I find an appropriate statistical model for it. 1271 01:13:36,500 --> 01:13:39,310 You do that only after you know enough 1272 01:13:39,310 --> 01:13:41,270 about random processes. 1273 01:13:41,270 --> 01:13:44,680 That you know how to deal with an enormous variety of 1274 01:13:44,680 --> 01:13:48,520 different home cooked random processes. 1275 01:13:48,520 --> 01:13:48,850 OK. 1276 01:13:48,850 --> 01:13:51,920 So what we do in a course like this is we deal with lots of 1277 01:13:51,920 --> 01:13:55,300 different home cooked random processes, which is in fact 1278 01:13:55,300 --> 01:13:58,880 why we've done rather peculiar things like saying, let's look 1279 01:13:58,880 --> 01:14:03,880 at a random process which comes from a sum of 1280 01:14:03,880 --> 01:14:06,830 independent random variables multiplied 1281 01:14:06,830 --> 01:14:08,920 by orthonormal functions. 1282 01:14:08,920 --> 01:14:12,110 And you see what we've accomplished by that already. 
1283 01:14:12,110 --> 01:14:15,660 Namely by starting out that way we've been able to define 1284 01:14:15,660 --> 01:14:18,270 Gaussian processes. 1285 01:14:18,270 --> 01:14:20,840 We've been able to define what happens when one of those 1286 01:14:20,840 --> 01:14:24,110 Gaussian processes goes through a filter. 1287 01:14:24,110 --> 01:14:27,630 And in fact, that gives a way of generating a lot more 1288 01:14:27,630 --> 01:14:29,250 random processes. 1289 01:14:29,250 --> 01:14:34,380 And next time what we're going to do is not to ask how it is 1290 01:14:34,380 --> 01:14:36,920 that we know that processes are stationary, 1291 01:14:36,920 --> 01:14:38,760 or how you test whether processes are 1292 01:14:38,760 --> 01:14:40,120 stationary or not. 1293 01:14:40,120 --> 01:14:42,930 But we're just going to assume that they're stationary. 1294 01:14:42,930 --> 01:14:45,910 In other words, they have the same statistics now as they're 1295 01:14:45,910 --> 01:14:47,400 going to have next year. 1296 01:14:47,400 --> 01:14:51,080 And those statistics stay the same forever. 1297 01:14:51,080 --> 01:14:53,830 And then see what we can say about it. 1298 01:14:53,830 --> 01:14:56,940 To give you a clue as to how to start looking at this, 1299 01:14:56,940 --> 01:15:00,350 remember what we said quite a long time ago 1300 01:15:00,350 --> 01:15:03,640 about Markov chains. 1301 01:15:03,640 --> 01:15:04,230 OK. 1302 01:15:04,230 --> 01:15:07,820 And now when you look at Markov chains, remember that 1303 01:15:07,820 --> 01:15:11,270 what happens at one time is statistically a function of 1304 01:15:11,270 --> 01:15:14,780 what happened at the unit of time before it. 1305 01:15:14,780 --> 01:15:17,690 OK. 1306 01:15:17,690 --> 01:15:21,940 But we can still model those Markov chains as being 1307 01:15:21,940 --> 01:15:23,410 stationary. 1308 01:15:23,410 --> 01:15:27,670 Because the dependence at this time on the previous sample 1309 01:15:27,670 --> 01:15:33,020 time is the same now as it will be five years from now. 1310 01:15:33,020 --> 01:15:33,390 OK. 1311 01:15:33,390 --> 01:15:36,590 In other words you can't just look at the process at one 1312 01:15:36,590 --> 01:15:38,980 instant of time and say this is independent 1313 01:15:38,980 --> 01:15:40,540 of all other times. 1314 01:15:40,540 --> 01:15:42,370 That's not what stationary means. 1315 01:15:42,370 --> 01:15:47,230 What stationary means is the way the process depends on the 1316 01:15:47,230 --> 01:15:51,720 past at time t is the same as the way it depends on the past 1317 01:15:51,720 --> 01:15:54,430 at some later time tau. 1318 01:15:54,430 --> 01:15:56,260 And that in fact is the way we're going to define 1319 01:15:56,260 --> 01:15:58,380 stationarity. 1320 01:15:58,380 --> 01:16:03,740 It's these joint sample times that we're going to be 1321 01:16:03,740 --> 01:16:06,350 looking at. 1322 01:16:06,350 --> 01:16:09,390 I wish I had a better word for that. 1323 01:16:09,390 --> 01:16:12,140 Joint sets of epochs is what we'll be looking at. 1324 01:16:12,140 --> 01:16:15,570 The joint statistics over a set of epochs are going to be 1325 01:16:15,570 --> 01:16:19,350 the same now as they will be at some time in the future. 1326 01:16:19,350 --> 01:16:22,040 And that's the way we're going to define stationarity.
1327 01:16:22,040 --> 01:16:25,370 A prelude of what we're going to find when we do that is 1328 01:16:25,370 --> 01:16:30,690 this: if the covariance function at 1329 01:16:30,690 --> 01:16:36,160 time t and time tau is the same if you translate it up to 1330 01:16:36,160 --> 01:16:41,560 t plus t1 and tau plus t1, then in fact this function is 1331 01:16:41,560 --> 01:16:45,740 going to be a function only of the difference t minus tau. 1332 01:16:45,740 --> 01:16:47,850 It's going to be a function of one variable 1333 01:16:47,850 --> 01:16:50,840 instead of two variables. 1334 01:16:50,840 --> 01:16:51,250 OK. 1335 01:16:51,250 --> 01:16:53,710 So as soon as we get this function being a function of 1336 01:16:53,710 --> 01:16:58,200 one variable instead of two variables, the first thing we're 1337 01:16:58,200 --> 01:17:02,010 going to do is to take the Fourier transform of this. 1338 01:17:02,010 --> 01:17:05,240 Because then we'll be taking the Fourier transform of a 1339 01:17:05,240 --> 01:17:07,710 function of a single variable. 1340 01:17:07,710 --> 01:17:09,530 We're going to call that the spectral 1341 01:17:09,530 --> 01:17:12,440 density of the process. 1342 01:17:12,440 --> 01:17:17,100 And we're going to find that for stationary processes the 1343 01:17:17,100 --> 01:17:20,560 spectral density tells you 1344 01:17:20,560 --> 01:17:21,980 everything if they're Gaussian. 1345 01:17:21,980 --> 01:17:22,930 Why is that? 1346 01:17:22,930 --> 01:17:24,940 Well, the inverse Fourier transform is 1347 01:17:24,940 --> 01:17:28,470 this covariance function. 1348 01:17:28,470 --> 01:17:31,330 And we've now seen that the covariance function tells you 1349 01:17:31,330 --> 01:17:34,200 everything about a Gaussian process. 1350 01:17:34,200 --> 01:17:36,600 So if you know the spectral density for a stationary 1351 01:17:36,600 --> 01:17:38,850 process, it will tell you everything. 1352 01:17:38,850 --> 01:17:42,720 We will also have to fiddle around a little bit with how 1353 01:17:42,720 --> 01:17:45,280 we define stationarity, 1354 01:17:45,280 --> 01:17:46,860 so that at the same time we don't have this 1355 01:17:46,860 --> 01:17:49,580 infinite energy problem. 1356 01:17:49,580 --> 01:17:50,930 And the way we're going to do it is the way 1357 01:17:50,930 --> 01:17:52,010 we've done it all along. 1358 01:17:52,010 --> 01:17:54,330 We're going to take something that looks stationary, we're 1359 01:17:54,330 --> 01:17:58,210 going to truncate it over some long period of time, and we're 1360 01:17:58,210 --> 01:18:00,570 going to have our cake and eat it too that way. 1361 01:18:00,570 --> 01:18:01,040 OK. 1362 01:18:01,040 --> 01:18:03,660 So we'll do that next time.
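As a small preview of that, here is a discrete-time sketch of my own (the filter, the seed, and the lags are arbitrary assumptions): filtered IID noise has a covariance that depends only on the time difference, no matter where you start measuring, and the Fourier transform of that one-variable covariance is the spectral density.

import numpy as np

rng = np.random.default_rng(4)

n = 200_000
w = rng.standard_normal(n)              # discrete-time IID normal noise
h = np.array([1.0, 0.6, 0.3, 0.1])      # a short filter (arbitrary choice)
z = np.convolve(w, h, mode="same")      # a stationary filtered process

# Estimate the covariance at a few lags from two different starting times;
# stationarity shows up as the estimate depending only on the lag.
def cov_estimate(x, start, lag, length=50_000):
    a = x[start:start + length]
    b = x[start + lag:start + lag + length]
    return np.mean(a * b)

for lag in range(4):
    print(lag, cov_estimate(z, 1_000, lag), cov_estimate(z, 120_000, lag))

# The covariance of z as a function of the lag alone: K(m) = sum_k h[k] h[k + m]
K = np.correlate(h, h, mode="full")
lags = np.arange(-(len(h) - 1), len(h))

# Spectral density: the (discrete-time) Fourier transform of K(m); real and nonnegative
freqs = np.linspace(-0.5, 0.5, 257)
S = np.array([np.sum(K * np.exp(-2j * np.pi * f * lags)).real for f in freqs])
print("spectral density at f = 0:", S[len(freqs) // 2])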