1 00:00:00,000 --> 00:00:00,040 2 00:00:00,040 --> 00:00:00,710 Hey, guys. 3 00:00:00,710 --> 00:00:02,020 Welcome back. 4 00:00:02,020 --> 00:00:04,680 Today, we're going to do another fun problem, which is 5 00:00:04,680 --> 00:00:07,980 a drill problem on joint PMFs. 6 00:00:07,980 --> 00:00:11,330 And the goal is that you will feel more comfortable by the 7 00:00:11,330 --> 00:00:13,830 end of this problem, manipulating joint PMFs. 8 00:00:13,830 --> 00:00:17,490 And we'll also review some ideas about independents in 9 00:00:17,490 --> 00:00:19,380 the process. 10 00:00:19,380 --> 00:00:22,440 So just to go over what I've drawn here, we 11 00:00:22,440 --> 00:00:25,090 are given an xy plane. 12 00:00:25,090 --> 00:00:27,810 And we're told what the PMF is. 13 00:00:27,810 --> 00:00:30,080 And it's plotted for you here. 14 00:00:30,080 --> 00:00:34,220 What these stars indicate is simply that 15 00:00:34,220 --> 00:00:35,130 there is a value there. 16 00:00:35,130 --> 00:00:36,230 But we don't know what it is. 17 00:00:36,230 --> 00:00:39,750 It could be anything between 0 and 1. 18 00:00:39,750 --> 00:00:41,600 And so we're given this list of questions. 19 00:00:41,600 --> 00:00:42,590 And we're just going to work through 20 00:00:42,590 --> 00:00:45,000 them linearly together. 21 00:00:45,000 --> 00:00:48,970 So we start off pretty simply. 22 00:00:48,970 --> 00:00:52,900 We want to compute, in part a, the probability that x takes 23 00:00:52,900 --> 00:00:54,590 on a value of 1. 24 00:00:54,590 --> 00:00:58,690 So for those of you who like formulas, I'm going to use the 25 00:00:58,690 --> 00:01:01,390 formula, which is usually referred to as 26 00:01:01,390 --> 00:01:03,380 marginalization. 27 00:01:03,380 --> 00:01:06,740 So the marginal over x is given by 28 00:01:06,740 --> 00:01:08,840 summing over the joint. 29 00:01:08,840 --> 00:01:13,170 So here we are interested in the probability that x is 1. 30 00:01:13,170 --> 00:01:15,650 So I'm just going to freeze the value of 1 here. 31 00:01:15,650 --> 00:01:17,365 And we sum over y. 32 00:01:17,365 --> 00:01:20,530 And in particular, 1, 2, and 3. 33 00:01:20,530 --> 00:01:28,230 So carrying this out, this is the Pxy of 1, 1, plus Pxy of 34 00:01:28,230 --> 00:01:34,290 1, 2, plus Pxy 1, 3. 35 00:01:34,290 --> 00:01:37,990 And this, of course, reading from the graph, is 1/12 plus 36 00:01:37,990 --> 00:01:45,700 2/12 plus 1/12, which is equal to 4/12, or 1/3. 37 00:01:45,700 --> 00:01:48,340 So now you guys know the formula. 38 00:01:48,340 --> 00:01:51,040 Hopefully you'll remember the term marginalization. 39 00:01:51,040 --> 00:01:54,170 But I want to point out that intuitively you can come up 40 00:01:54,170 --> 00:01:57,040 with the answer much faster. 41 00:01:57,040 --> 00:02:01,010 So the probability that x is equal to 1 is the probability 42 00:02:01,010 --> 00:02:02,840 that this dot happens or this dot 43 00:02:02,840 --> 00:02:05,320 happens or this dot happens. 44 00:02:05,320 --> 00:02:10,280 Now, these dots, or outcomes, they're disjoint. 45 00:02:10,280 --> 00:02:12,320 So you can just sum the probability to get the 46 00:02:12,320 --> 00:02:15,280 probability of one of these things happening. 47 00:02:15,280 --> 00:02:17,310 So it's the same computation. 48 00:02:17,310 --> 00:02:20,300 And you'll probably get there a little bit faster. 49 00:02:20,300 --> 00:02:23,940 So we're done with a already, which is great. 50 00:02:23,940 --> 00:02:29,050 So for part b, conditioning on x is equal to 1, we want to 51 00:02:29,050 --> 00:02:31,850 sketch the PMF of y. 52 00:02:31,850 --> 00:02:35,380 So if x is equal to 1 we are suddenly 53 00:02:35,380 --> 00:02:37,510 living in this universe. 54 00:02:37,510 --> 00:02:41,440 y can take values of 1, 2, or 3 with these relative 55 00:02:41,440 --> 00:02:42,840 frequencies. 56 00:02:42,840 --> 00:02:45,790 So let's draw this here. 57 00:02:45,790 --> 00:02:47,400 So this is y. 58 00:02:47,400 --> 00:02:50,550 I said, already, y can take on a value of 1. y can take on a 59 00:02:50,550 --> 00:02:51,270 value of 2. 60 00:02:51,270 --> 00:02:53,680 Or it can take on a value of 3. 61 00:02:53,680 --> 00:02:58,700 And we're plotting here, Py given x, y, conditioned on x 62 00:02:58,700 --> 00:03:00,390 is equal to 1. 63 00:03:00,390 --> 00:03:03,950 OK, so what I mean by preserving the relative 64 00:03:03,950 --> 00:03:07,925 frequencies is that in unconditional world this is 65 00:03:07,925 --> 00:03:10,320 dot is twice as likely to happen as either 66 00:03:10,320 --> 00:03:12,470 this dot or this dot. 67 00:03:12,470 --> 00:03:16,110 And that relative likelihood remains the same after 68 00:03:16,110 --> 00:03:17,590 conditioning. 69 00:03:17,590 --> 00:03:20,540 And the reason why we have to change these values is because 70 00:03:20,540 --> 00:03:21,820 they have to sum to 1. 71 00:03:21,820 --> 00:03:24,410 So in other words, we have to scale them up. 72 00:03:24,410 --> 00:03:26,420 So you can use a formula. 73 00:03:26,420 --> 00:03:28,780 But again, I'm here to show you faster ways of 74 00:03:28,780 --> 00:03:30,390 thinking about it. 75 00:03:30,390 --> 00:03:34,620 So my little algorithm for figuring out conditional PMFs 76 00:03:34,620 --> 00:03:36,600 is to take the numerators-- 77 00:03:36,600 --> 00:03:38,500 so 1, 2, and 1-- 78 00:03:38,500 --> 00:03:39,390 and sum them. 79 00:03:39,390 --> 00:03:41,300 So here that gives us 4. 80 00:03:41,300 --> 00:03:44,120 And then to preserve the relative frequency, you 81 00:03:44,120 --> 00:03:47,630 actually keep the same numerators but divide it by 82 00:03:47,630 --> 00:03:49,820 the sum, which you just computed. 83 00:03:49,820 --> 00:03:51,530 So I'm going fast. 84 00:03:51,530 --> 00:03:53,950 I'll review in a second. 85 00:03:53,950 --> 00:03:57,390 But this is what you will end up getting. 86 00:03:57,390 --> 00:04:02,890 So to recap, I did 1 plus 2 plus 1, which is 4, to get 87 00:04:02,890 --> 00:04:05,570 these denominators. 88 00:04:05,570 --> 00:04:08,080 And so I skipped a step here. 89 00:04:08,080 --> 00:04:11,880 This is really 2/4, which is 1/2, obviously. 90 00:04:11,880 --> 00:04:14,030 So you add these guys to get 4. 91 00:04:14,030 --> 00:04:15,710 And then you keep the numerators and just 92 00:04:15,710 --> 00:04:16,649 divide them by 4. 93 00:04:16,649 --> 00:04:20,040 So 1/4, 2/4, which is 1/2 and 1/4. 94 00:04:20,040 --> 00:04:21,959 And that's what we mean by preserving 95 00:04:21,959 --> 00:04:23,230 the relative frequency. 96 00:04:23,230 --> 00:04:27,730 Except so this thing now sums to 1, which is what we want. 97 00:04:27,730 --> 00:04:32,520 OK, so we're done with part b. 98 00:04:32,520 --> 00:04:36,490 Part c actually follows almost immediately from part b. 99 00:04:36,490 --> 00:04:39,540 In part c we're interested in computing the conditional 100 00:04:39,540 --> 00:04:42,970 expectation of y given that x is equal to 1. 101 00:04:42,970 --> 00:04:45,550 So we've already done most of the legwork because we have 102 00:04:45,550 --> 00:04:49,070 the conditional PMF that we need. 103 00:04:49,070 --> 00:04:52,390 And so expectation, you guys have calculated a bunch of 104 00:04:52,390 --> 00:04:53,060 these by now. 105 00:04:53,060 --> 00:04:55,350 So I'm just going to appeal to your 106 00:04:55,350 --> 00:04:57,640 intuition and to symmetry. 107 00:04:57,640 --> 00:05:00,490 Expectation acts like center of mass. 108 00:05:00,490 --> 00:05:04,200 This is a symmetrical distribution of mass. 109 00:05:04,200 --> 00:05:07,960 And so the center is right here at 2. 110 00:05:07,960 --> 00:05:10,660 So this is simply 2. 111 00:05:10,660 --> 00:05:13,110 And if that went too fast, just convince yourselves. 112 00:05:13,110 --> 00:05:15,050 Use the normal formula for expectations. 113 00:05:15,050 --> 00:05:18,240 And your answer will agree with ours. 114 00:05:18,240 --> 00:05:21,670 OK, so d is a really cool question. 115 00:05:21,670 --> 00:05:24,140 Because you can do a lot of math. 116 00:05:24,140 --> 00:05:28,810 Or you can think and ask yourself, at the most 117 00:05:28,810 --> 00:05:31,330 fundamental level, what is independents? 118 00:05:31,330 --> 00:05:33,290 And if you think that way you'll come to 119 00:05:33,290 --> 00:05:36,120 the answer very easily. 120 00:05:36,120 --> 00:05:41,740 So essentially, I rephrased this to truncate it from the 121 00:05:41,740 --> 00:05:43,940 problem statement that you guys are reading. 122 00:05:43,940 --> 00:05:47,710 But the idea is that these stars are 123 00:05:47,710 --> 00:05:49,820 unknown probability masses. 124 00:05:49,820 --> 00:05:53,620 And this question is asking can you figure out a way of 125 00:05:53,620 --> 00:05:58,390 assigning numbers between 0 and 1 to these values such 126 00:05:58,390 --> 00:06:02,395 that you end up with a valid probability mass function, so 127 00:06:02,395 --> 00:06:06,020 everything sums to 1 and such that x and y are independent? 128 00:06:06,020 --> 00:06:08,300 So it seems hard a priori. 129 00:06:08,300 --> 00:06:10,470 But let's think about it a bit. 130 00:06:10,470 --> 00:06:12,540 And in the meantime I'm going to erase this 131 00:06:12,540 --> 00:06:14,390 so I have more room. 132 00:06:14,390 --> 00:06:18,580 What does it mean for x and y to be independent? 133 00:06:18,580 --> 00:06:23,480 Well, it means that they don't, essentially, have 134 00:06:23,480 --> 00:06:25,760 information about each other. 135 00:06:25,760 --> 00:06:29,360 So if I tell you something about x and if x and y are 136 00:06:29,360 --> 00:06:33,430 independent, your belief about y shouldn't change. 137 00:06:33,430 --> 00:06:36,010 In other words, if you're a rational person, x shouldn't 138 00:06:36,010 --> 00:06:38,950 change your belief about y. 139 00:06:38,950 --> 00:06:41,890 So let's look more closely at this diagram. 140 00:06:41,890 --> 00:06:45,090 Now, the number 0 should be popping out to you. 141 00:06:45,090 --> 00:06:49,500 Because this essentially means that the 0.31 can't happen. 142 00:06:49,500 --> 00:06:52,850 Or it happens with 0 probability. 143 00:06:52,850 --> 00:06:57,630 So let's say fix x equal to 3. 144 00:06:57,630 --> 00:07:02,750 If you condition on x is equal to 3, as I just said, this 145 00:07:02,750 --> 00:07:04,320 outcome can't happen. 146 00:07:04,320 --> 00:07:08,740 So y could only take on values of 2 or 3. 147 00:07:08,740 --> 00:07:12,760 However, if you condition on x is equal to 1, y could take on 148 00:07:12,760 --> 00:07:16,235 a value of 1 with probability 1/4 as we computed in the 149 00:07:16,235 --> 00:07:17,080 other problem. 150 00:07:17,080 --> 00:07:19,550 It could take on a value of 2 with probability of 1/2. 151 00:07:19,550 --> 00:07:23,220 Or it could take on a value of 3 with probability 1/4. 152 00:07:23,220 --> 00:07:25,410 So these are actually very different cases, right? 153 00:07:25,410 --> 00:07:28,450 Because if you observe x is equal to 3 y can 154 00:07:28,450 --> 00:07:29,950 only be 2 or 3. 155 00:07:29,950 --> 00:07:35,030 But if you observe x is equal to 1, y can be 1, 2, or 3. 156 00:07:35,030 --> 00:07:40,510 So actually, x, no matter what values these stars have on, x 157 00:07:40,510 --> 00:07:43,000 always tells you something about y. 158 00:07:43,000 --> 00:07:47,410 Therefore, the answer to this, part d, is no. 159 00:07:47,410 --> 00:07:50,600 So let's put a no with an exclamation point. 160 00:07:50,600 --> 00:07:52,730 So I like that problem a lot. 161 00:07:52,730 --> 00:07:56,960 And hopefully it clarifies independents for you guys. 162 00:07:56,960 --> 00:08:00,240 So parts e and f, we're going to be thinking about 163 00:08:00,240 --> 00:08:02,390 independents again. 164 00:08:02,390 --> 00:08:05,330 To go over what the problem statement gives you, we 165 00:08:05,330 --> 00:08:08,900 defined this event, b, which is the event that x is less 166 00:08:08,900 --> 00:08:12,640 than or equal to 2 and y is less than or equal to 2. 167 00:08:12,640 --> 00:08:14,250 So let's get some colors. 168 00:08:14,250 --> 00:08:15,510 So do bright pink. 169 00:08:15,510 --> 00:08:17,170 So that means we're essentially 170 00:08:17,170 --> 00:08:18,500 living in this world. 171 00:08:18,500 --> 00:08:22,320 There's only those four dots. 172 00:08:22,320 --> 00:08:25,370 And we're also told a very important piece of information 173 00:08:25,370 --> 00:08:30,870 that conditions on B. x and y are conditionally independent. 174 00:08:30,870 --> 00:08:33,409 OK, so part e, now that we have this. 175 00:08:33,409 --> 00:08:36,860 And by the way, these two assumptions apply to both 176 00:08:36,860 --> 00:08:39,100 parts e and part f. 177 00:08:39,100 --> 00:08:44,920 So in part e, we want to find out Pxy of 2, 2. 178 00:08:44,920 --> 00:08:48,090 Or in English, what is the probability that x takes on a 179 00:08:48,090 --> 00:08:50,650 value of 2 and y takes on a value of 2? 180 00:08:50,650 --> 00:08:54,630 So determine the value of this star. 181 00:08:54,630 --> 00:09:01,090 And the whole trick here is that the possible values that 182 00:09:01,090 --> 00:09:04,500 this star could take on are constrained by the fact that 183 00:09:04,500 --> 00:09:07,210 we need to make sure that x and y are conditionally 184 00:09:07,210 --> 00:09:11,310 independent given B. 185 00:09:11,310 --> 00:09:17,670 So my claim is that if two random variables are 186 00:09:17,670 --> 00:09:22,040 independent and you condition on one of them, say we 187 00:09:22,040 --> 00:09:24,050 condition on x. 188 00:09:24,050 --> 00:09:27,180 If you condition on different values of x, the relative 189 00:09:27,180 --> 00:09:30,460 frequencies of y should be the same. 190 00:09:30,460 --> 00:09:33,930 So here, the relative frequency, condition on x is 191 00:09:33,930 --> 00:09:35,060 equal to 1. 192 00:09:35,060 --> 00:09:37,500 The relative frequencies of y are 2 to 1. 193 00:09:37,500 --> 00:09:41,170 This outcome is twice as likely to happen as this one. 194 00:09:41,170 --> 00:09:44,370 If we condition on 2 this outcome needs to be twice as 195 00:09:44,370 --> 00:09:46,990 likely to happen as this outcome. 196 00:09:46,990 --> 00:09:51,210 If they weren't, x would tell you information about y. 197 00:09:51,210 --> 00:09:55,450 Because you would know that the distribution over 2 and 1 198 00:09:55,450 --> 00:09:56,530 would be different. 199 00:09:56,530 --> 00:09:57,900 OK? 200 00:09:57,900 --> 00:10:00,940 So because the relative frequencies have to be the 201 00:10:00,940 --> 00:10:06,440 same and 2/12 is 2 times 1/12 this guy must 202 00:10:06,440 --> 00:10:09,800 also be 2 times 2/12. 203 00:10:09,800 --> 00:10:13,860 So that gives us our answer for part e. 204 00:10:13,860 --> 00:10:16,220 Let me write up here. 205 00:10:16,220 --> 00:10:26,340 Part e, we need Pxy 2, 2 to be equal to 4/12. 206 00:10:26,340 --> 00:10:32,630 And again, the way we got this is simply we need x and y to 207 00:10:32,630 --> 00:10:37,250 be conditionally independent given B. And if this were 208 00:10:37,250 --> 00:10:43,530 anything other than 4 the relative frequency of y is 209 00:10:43,530 --> 00:10:46,960 equal to 2 to 1 would be different from over here. 210 00:10:46,960 --> 00:10:49,540 So here condition on x is equal to 1. 211 00:10:49,540 --> 00:10:53,250 The outcome, y is equal to 2 is twice as likely as x is 212 00:10:53,250 --> 00:10:54,990 equal to 1. 213 00:10:54,990 --> 00:10:59,510 Here, if we put a value of 4/12 and you condition on x is 214 00:10:59,510 --> 00:11:05,090 equal to 2, the outcome y is equal to 2 is still twice as 215 00:11:05,090 --> 00:11:09,610 likely as the outcome y is equal to 1. 216 00:11:09,610 --> 00:11:11,630 And if you put any other number there the relative 217 00:11:11,630 --> 00:11:13,080 frequencies would be different. 218 00:11:13,080 --> 00:11:15,050 So x would be telling you something about y. 219 00:11:15,050 --> 00:11:18,491 So there would not be independent condition on B. 220 00:11:18,491 --> 00:11:19,730 OK, that was a mouthful. 221 00:11:19,730 --> 00:11:22,860 But hopefully you guys have it now. 222 00:11:22,860 --> 00:11:28,080 And lastly, we have part f, which follows pretty directly 223 00:11:28,080 --> 00:11:29,900 from part e. 224 00:11:29,900 --> 00:11:32,550 So we were still in the unconditional universe. 225 00:11:32,550 --> 00:11:36,840 In part e, we were figuring out what's the value of star 226 00:11:36,840 --> 00:11:39,010 in the whole unconditional universe? 227 00:11:39,010 --> 00:11:41,890 Now, in part f, we want the value of star in the 228 00:11:41,890 --> 00:11:44,250 conditional universe where B occurred. 229 00:11:44,250 --> 00:11:49,180 So let's come over here and plot a new graph so we don't 230 00:11:49,180 --> 00:11:51,130 confuse ourselves. 231 00:11:51,130 --> 00:11:53,442 So we have xy. 232 00:11:53,442 --> 00:11:55,130 x can be 1 or 2. 233 00:11:55,130 --> 00:11:56,740 Y could be 1 or 2. 234 00:11:56,740 --> 00:12:00,690 So we have a plot that looks something like this. 235 00:12:00,690 --> 00:12:05,360 And so again, same argument as before. 236 00:12:05,360 --> 00:12:06,430 Let me just fill this in. 237 00:12:06,430 --> 00:12:08,930 From part e, we have that this is 4/12. 238 00:12:08,930 --> 00:12:10,880 And we're going to use my algorithm again. 239 00:12:10,880 --> 00:12:13,340 So in the conditional world, the relative frequencies of 240 00:12:13,340 --> 00:12:14,990 these four dots should be the same. 241 00:12:14,990 --> 00:12:19,000 But you need to scale them up so that if you sum over all of 242 00:12:19,000 --> 00:12:20,780 them the probability sums to 1. 243 00:12:20,780 --> 00:12:23,470 So you have a valid PMF. 244 00:12:23,470 --> 00:12:25,840 So my algorithm from before was to add up all the 245 00:12:25,840 --> 00:12:26,920 numerators. 246 00:12:26,920 --> 00:12:30,520 So 1 plus 2 plus 4 plus 2 gives you 9. 247 00:12:30,520 --> 00:12:33,060 And then to preserve the relative frequency you keep 248 00:12:33,060 --> 00:12:34,320 the same numerator. 249 00:12:34,320 --> 00:12:36,530 So here we had a numerator of 1. 250 00:12:36,530 --> 00:12:38,790 That becomes 1/9. 251 00:12:38,790 --> 00:12:40,490 Here we had a numerator of 2. 252 00:12:40,490 --> 00:12:42,290 This becomes 2/9. 253 00:12:42,290 --> 00:12:43,900 Here we had a numerator of 4. 254 00:12:43,900 --> 00:12:45,310 That becomes 4/9. 255 00:12:45,310 --> 00:12:47,510 Here we had a numerator of 2, so 2/9. 256 00:12:47,510 --> 00:12:51,040 And indeed, the relative frequencies are preserved. 257 00:12:51,040 --> 00:12:52,870 And they all sum to 1. 258 00:12:52,870 --> 00:12:56,340 So our answer for part f-- 259 00:12:56,340 --> 00:12:57,720 let's box it here-- 260 00:12:57,720 --> 00:13:03,780 is that Pxy 2, 2 conditioned on B is equal to 4/9, 261 00:13:03,780 --> 00:13:06,910 is just that guy. 262 00:13:06,910 --> 00:13:08,020 So we're done. 263 00:13:08,020 --> 00:13:09,780 Hopefully that wasn't too painful. 264 00:13:09,780 --> 00:13:13,400 And this is a good drill problem, because we got more 265 00:13:13,400 --> 00:13:16,880 comfortable working with PMFs, joint PMFs. 266 00:13:16,880 --> 00:13:19,700 We went over marginalization. 267 00:13:19,700 --> 00:13:21,560 We went over conditioning. 268 00:13:21,560 --> 00:13:24,600 We went over into independents. 269 00:13:24,600 --> 00:13:30,500 And I also gave you this quick algorithm for figuring out 270 00:13:30,500 --> 00:13:32,310 what conditional PMFs are if you don't 271 00:13:32,310 --> 00:13:33,840 want to use the formulas. 272 00:13:33,840 --> 00:13:36,190 Namely, you sum all of the numerators to get a new 273 00:13:36,190 --> 00:13:39,280 denominator and then divide all the old numerators by the 274 00:13:39,280 --> 00:13:42,020 new denominator you computed. 275 00:13:42,020 --> 00:13:43,350 So I hope that was helpful. 276 00:13:43,350 --> 00:13:44,600 I'll see you next time. 277 00:13:44,600 --> 00:13:45,420