1 00:00:00,740 --> 00:00:02,770 In this segment we start a new topic. 2 00:00:02,770 --> 00:00:05,180 We will talk about the covariance of two random 3 00:00:05,180 --> 00:00:08,430 variables, which gives us useful information about the 4 00:00:08,430 --> 00:00:11,720 dependencies between these two random variables. 5 00:00:11,720 --> 00:00:14,110 Let us motivate the concept by looking first 6 00:00:14,110 --> 00:00:15,600 at a special case. 7 00:00:15,600 --> 00:00:18,690 Suppose that X and Y have zero means and that they are 8 00:00:18,690 --> 00:00:20,870 discrete random variables. 9 00:00:20,870 --> 00:00:24,690 If X and Y are independent, then the expectation of the 10 00:00:24,690 --> 00:00:28,880 product is the product of the expectations. 11 00:00:28,880 --> 00:00:32,100 And since we have assumed zero means, this is going to be 12 00:00:32,100 --> 00:00:33,830 equal to zero. 13 00:00:33,830 --> 00:00:39,190 But suppose instead that the joint PMF of X and Y is of the 14 00:00:39,190 --> 00:00:40,990 following kind. 15 00:00:40,990 --> 00:00:45,080 Each point in this diagram is equally likely, so we have 16 00:00:45,080 --> 00:00:49,600 here a discrete uniform distribution on the discrete 17 00:00:49,600 --> 00:00:55,100 set which consists of the points shown in this diagram. 18 00:00:55,100 --> 00:00:58,330 What we have in this particular example is that at 19 00:00:58,330 --> 00:01:02,610 most outcomes, positive values of X tend to go together with 20 00:01:02,610 --> 00:01:06,970 positive values of Y. And negative values of X tend to 21 00:01:06,970 --> 00:01:11,100 go together with negative values of Y. So most of the 22 00:01:11,100 --> 00:01:15,039 time we have outcomes in this quadrant, in which x times y 23 00:01:15,039 --> 00:01:18,950 is positive, or in this quadrant where x times y is, 24 00:01:18,950 --> 00:01:20,210 again, positive. 
25 00:01:20,210 --> 00:01:24,190 But some of the time we fall in this quadrant where x times 26 00:01:24,190 --> 00:01:26,410 y is negative, or in this quadrant where 27 00:01:26,410 --> 00:01:28,830 x times y is negative. 28 00:01:28,830 --> 00:01:32,940 Since we have many more points here and here, on the average, 29 00:01:32,940 --> 00:01:37,130 the value of x times y is going to be positive. 30 00:01:41,350 --> 00:01:45,990 On the other hand, if the diagram takes this form, then, 31 00:01:45,990 --> 00:01:50,140 most of the time, the pair x, y lies in this quadrant or in 32 00:01:50,140 --> 00:01:51,870 that quadrant where the product of 33 00:01:51,870 --> 00:01:54,280 x times y is negative. 34 00:01:54,280 --> 00:01:57,300 So the random variables X and Y typically have opposite 35 00:01:57,300 --> 00:02:02,950 signs, and on the average, the expected value of X times Y is 36 00:02:02,950 --> 00:02:06,170 going to be negative. 37 00:02:06,170 --> 00:02:09,530 So here we have a positive expectation, here we have a 38 00:02:09,530 --> 00:02:14,110 negative expectation of X times Y. This quantity, the 39 00:02:14,110 --> 00:02:18,300 expected value of X times Y, tells us whether X and Y tend 40 00:02:18,300 --> 00:02:22,579 to move in the same or in opposite directions. 41 00:02:22,579 --> 00:02:25,860 And this quantity is what we call the covariance, in the 42 00:02:25,860 --> 00:02:28,420 zero mean case. 43 00:02:28,420 --> 00:02:30,329 Let us now generalize. 44 00:02:30,329 --> 00:02:33,260 The random variables do not have to be discrete. 45 00:02:33,260 --> 00:02:35,430 This quantity is well defined for any 46 00:02:35,430 --> 00:02:37,900 kind of random variables. 47 00:02:37,900 --> 00:02:43,040 And if we have non-zero means, the covariance is defined by 48 00:02:43,040 --> 00:02:46,100 this expression. 
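The expression referred to here is on a slide that the transcript does not reproduce. Judging from the description that follows (the product of the deviations of X and Y from their means), it is presumably the standard definition of covariance, written here with assumed notation:

```latex
\operatorname{cov}(X, Y) = \mathbb{E}\big[(X - \mathbb{E}[X])\,(Y - \mathbb{E}[Y])\big]
```

When both means are zero, this reduces to the expected value of X times Y, which is the quantity discussed in the zero mean case.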
49 00:02:46,100 --> 00:02:50,829 What we have here is that we look at the deviation of X 50 00:02:50,829 --> 00:02:55,240 from its mean value, and the deviation of Y from its mean 51 00:02:55,240 --> 00:02:59,720 value, and we're asking whether these two deviations 52 00:02:59,720 --> 00:03:03,170 tend to have the same sign or not, whether they move in the 53 00:03:03,170 --> 00:03:05,370 same direction or not. 54 00:03:05,370 --> 00:03:08,930 If the covariance is positive, what it tells us is that 55 00:03:08,930 --> 00:03:12,210 whenever this quantity is positive so that X is above 56 00:03:12,210 --> 00:03:18,410 its mean, then, typically or usually, the deviation of Y 57 00:03:18,410 --> 00:03:22,600 from its mean will also tend to be positive. 58 00:03:22,600 --> 00:03:26,160 To summarize, the covariance, in general, tells us whether 59 00:03:26,160 --> 00:03:31,470 two random variables tend to move together, both being high 60 00:03:31,470 --> 00:03:37,160 or both being low, in some average or typical sense. 61 00:03:37,160 --> 00:03:40,140 Now, if the two random variables are independent, we 62 00:03:40,140 --> 00:03:44,160 already saw that in the zero mean case, this quantity-- 63 00:03:44,160 --> 00:03:45,100 the covariance-- 64 00:03:45,100 --> 00:03:46,550 is going to be 0. 65 00:03:46,550 --> 00:03:50,570 How about the case where we have non-zero means? 66 00:03:50,570 --> 00:03:56,620 Well, if we have independence, then we have the expected 67 00:03:56,620 --> 00:03:58,720 value of the product of two random variables. 68 00:03:58,720 --> 00:04:03,540 X and Y are independent, so X minus the expected value, 69 00:04:03,540 --> 00:04:06,710 which is a constant, is going to be independent from Y minus 70 00:04:06,710 --> 00:04:08,920 its expected value. 71 00:04:08,920 --> 00:04:12,520 And so, the covariance is going to be the product of two 72 00:04:12,520 --> 00:04:13,770 expectations. 
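The step described here is written on the board during the pause in the audio and is not part of the transcript. A likely reconstruction, using the fact that independence of X and Y lets the expectation of the product factor:

```latex
\operatorname{cov}(X, Y)
  = \mathbb{E}\big[(X - \mathbb{E}[X])\,(Y - \mathbb{E}[Y])\big]
  = \mathbb{E}\big[X - \mathbb{E}[X]\big] \cdot \mathbb{E}\big[Y - \mathbb{E}[Y]\big]
```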
73 00:04:25,220 --> 00:04:30,590 But the expected value of X minus this constant is 0, and 74 00:04:30,590 --> 00:04:34,120 the same is true for this term as well. 75 00:04:34,120 --> 00:04:37,805 So the covariance in this case is going to be equal to 0. 76 00:04:40,340 --> 00:04:44,790 So in the independent case, we have zero covariances. 77 00:04:44,790 --> 00:04:48,570 On the other hand, the converse is not true. 78 00:04:48,570 --> 00:04:54,085 There are examples in which we have dependence but zero 79 00:04:54,085 --> 00:04:55,640 covariance. 80 00:04:55,640 --> 00:04:57,140 Here is one example. 81 00:04:57,140 --> 00:05:01,580 In this example there are four possible outcomes. 82 00:05:01,580 --> 00:05:05,800 At any particular outcome, either X or Y 83 00:05:05,800 --> 00:05:07,430 is going to be 0. 84 00:05:07,430 --> 00:05:11,150 So in this example the random variable X times Y is 85 00:05:11,150 --> 00:05:13,200 identically equal to 0. 86 00:05:13,200 --> 00:05:16,940 The mean of X is also 0, the mean of Y is also 0 by 87 00:05:16,940 --> 00:05:19,230 symmetry, so the covariance is the expected 88 00:05:19,230 --> 00:05:20,780 value of this quantity. 89 00:05:20,780 --> 00:05:25,160 And so the covariance, in this example, is equal to 0. 90 00:05:25,160 --> 00:05:27,810 On the other hand, the two random variables, 91 00:05:27,810 --> 00:05:30,380 X and Y, are dependent. 92 00:05:30,380 --> 00:05:35,300 If I tell you that X is equal to 1, then you know that this 93 00:05:35,300 --> 00:05:38,460 outcome has occurred. 94 00:05:38,460 --> 00:05:44,010 And in that case, you are certain that Y is equal to 0. 95 00:05:44,010 --> 00:05:47,070 So knowing the value of X tells you a lot about the 96 00:05:47,070 --> 00:05:51,440 value of Y and, therefore, we have dependence between these 97 00:05:51,440 --> 00:05:52,690 two random variables.
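The example above can be checked numerically. The following is a minimal sketch, not the lecturer's own material: the diagram is not in the transcript, so the four outcomes are assumed to be (1, 0), (0, 1), (-1, 0), (0, -1), each with probability 1/4, which matches the description (either X or Y is 0 at every outcome, and both means are 0 by symmetry). The helper names `expectation`, `covariance`, and `is_independent` are hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: the diagram is not in the transcript; these four equally
# likely points are one configuration consistent with the description.
quarter = Fraction(1, 4)
pmf = {(1, 0): quarter, (0, 1): quarter, (-1, 0): quarter, (0, -1): quarter}


def expectation(g, pmf):
    """Expected value of g(X, Y) under a joint PMF given as {(x, y): prob}."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def covariance(pmf):
    """cov(X, Y) = E[(X - E[X]) (Y - E[Y])]."""
    mean_x = expectation(lambda x, y: x, pmf)
    mean_y = expectation(lambda x, y: y, pmf)
    return expectation(lambda x, y: (x - mean_x) * (y - mean_y), pmf)


def is_independent(pmf):
    """True if the joint PMF equals the product of its marginals everywhere."""
    xs = {x for x, _ in pmf}
    ys = {y for _, y in pmf}
    px = {x: sum(p for (a, _), p in pmf.items() if a == x) for x in xs}
    py = {y: sum(p for (_, b), p in pmf.items() if b == y) for y in ys}
    return all(pmf.get((x, y), 0) == px[x] * py[y] for x in xs for y in ys)


cov = covariance(pmf)
independent = is_independent(pmf)
print("covariance:", cov)
print("independent:", independent)
```

Under these assumed points the covariance comes out as exactly zero while the independence check fails: for instance, P(X = 1, Y = 0) is 1/4, but the product of the marginals P(X = 1) P(Y = 0) is 1/4 times 1/2. Exact fractions are used so that the comparison with the product of marginals is not affected by floating-point rounding.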