1 00:00:00,500 --> 00:00:02,800 The following content is provided under a Creative 2 00:00:02,800 --> 00:00:04,340 Commons license. 3 00:00:04,340 --> 00:00:06,660 Your support will help MIT OpenCourseWare 4 00:00:06,660 --> 00:00:11,020 continue to offer high quality educational resources for free. 5 00:00:11,020 --> 00:00:13,640 To make a donation or view additional materials 6 00:00:13,640 --> 00:00:17,365 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,365 --> 00:00:17,990 at ocw.mit.edu. 8 00:00:24,780 --> 00:00:26,497 PROFESSOR: OK, let's get started. 9 00:00:29,400 --> 00:00:32,520 Last week we talked about random variables. 10 00:00:32,520 --> 00:00:34,540 And this week we're going to talk 11 00:00:34,540 --> 00:00:37,550 about their expected value. 12 00:00:37,550 --> 00:00:41,000 The expected value of a random variable 13 00:00:41,000 --> 00:00:42,637 comes up in all sorts of applications. 14 00:00:42,637 --> 00:00:44,720 We're going to spend a whole week talking about it 15 00:00:44,720 --> 00:00:46,980 and some of next week talking about variations 16 00:00:46,980 --> 00:00:48,496 from the expected value. 17 00:00:48,496 --> 00:00:49,870 And it's probably one of the best 18 00:00:49,870 --> 00:00:52,190 tools you have for working with probability 19 00:00:52,190 --> 00:00:54,289 problems in practice. 20 00:00:54,289 --> 00:00:55,747 So let's write down the definition. 21 00:01:04,129 --> 00:01:12,320 The expected value also has other names. 22 00:01:12,320 --> 00:01:23,570 It's also known as the average or the mean 23 00:01:23,570 --> 00:01:24,683 of a random variable. 24 00:01:29,390 --> 00:01:41,950 Random variable R over a probability space S 25 00:01:41,950 --> 00:01:44,520 is denoted by a lot of ways. 26 00:01:44,520 --> 00:01:48,760 We're going to use Ex to denote it, Ex of R. 
27 00:01:48,760 --> 00:01:52,070 And it's the sum over all possible outcomes in the sample 28 00:01:52,070 --> 00:01:55,670 space of the value of the random variable 29 00:01:55,670 --> 00:02:01,190 on that outcome times the probability of that outcome. 30 00:02:01,190 --> 00:02:04,310 In other words, the expected value of a random variable 31 00:02:04,310 --> 00:02:09,080 is just a weighted average of all possible values 32 00:02:09,080 --> 00:02:13,380 of the random variable where the weight is the probability 33 00:02:13,380 --> 00:02:15,910 of that happening. 34 00:02:15,910 --> 00:02:27,438 For example, suppose we roll a fair six sided die. 35 00:02:30,860 --> 00:02:33,810 So the numbers that come up are 1 to 6. 36 00:02:33,810 --> 00:02:37,890 And we let R be the random variable denoting 37 00:02:37,890 --> 00:02:40,135 the outcome, 1 through 6. 38 00:02:43,330 --> 00:02:45,990 So the expected value of the roll, 39 00:02:45,990 --> 00:02:50,160 we can easily compute from the definition. 40 00:02:50,160 --> 00:02:54,890 Well, there's a 1/6 chance that it comes out a 1. 41 00:02:54,890 --> 00:02:56,200 The next outcome would be a 2. 42 00:02:56,200 --> 00:02:59,600 That happens with 1/6 chance. 43 00:02:59,600 --> 00:03:03,025 3, 4, 5, and 6 all happen with a 1/6 chance. 44 00:03:06,680 --> 00:03:13,000 If I sum up 1, 2, 3, 4 up to 6, I get 6 times 7 over 2 times 45 00:03:13,000 --> 00:03:16,910 the 1/6 is just 7 over 2 or 3 and 1/2. 46 00:03:19,590 --> 00:03:23,320 So expected value when I roll a die, a fair six sided die, 47 00:03:23,320 --> 00:03:25,320 is just 3 and 1/2. 48 00:03:25,320 --> 00:03:27,880 And as you can see from this example, 49 00:03:27,880 --> 00:03:31,210 the expected value doesn't have to be attainable by one 50 00:03:31,210 --> 00:03:33,030 of the outcomes. 51 00:03:33,030 --> 00:03:36,730 You don't get a 3 and 1/2 when you roll a die. 52 00:03:36,730 --> 00:03:39,840 But that's the average value.
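The die computation above can be checked with a few lines of Python. This is only a sketch: the helper name `expected_value` and the dictionary representation of a distribution are assumptions made for illustration, not something from the lecture.

```python
from fractions import Fraction


def expected_value(distribution):
    """Weighted average: sum of value * probability over all outcomes.

    `distribution` maps each value of the random variable to its probability
    (a hypothetical representation chosen for this sketch).
    """
    return sum(value * prob for value, prob in distribution.items())


# Fair six-sided die: each face 1..6 comes up with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die))  # 7/2
```

Using exact fractions rather than floats keeps the result as 7/2, matching the 3 and 1/2 worked out on the board.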
53 00:03:39,840 --> 00:03:44,080 Now the expected value, the expectation, the average, 54 00:03:44,080 --> 00:03:46,812 the mean, they're all the same thing. 55 00:03:46,812 --> 00:03:48,770 They're all defined based on a random variable. 56 00:03:48,770 --> 00:03:50,890 They all mean the same thing. 57 00:03:50,890 --> 00:03:54,384 They are all different than the median. 58 00:03:54,384 --> 00:03:55,800 That's something that's different. 59 00:03:55,800 --> 00:03:58,400 So let me define the median. 60 00:03:58,400 --> 00:04:02,090 Basically it is the outcome which splits 61 00:04:02,090 --> 00:04:03,390 the probabilities in half. 62 00:04:03,390 --> 00:04:05,030 In other words, you've got a 50% chance 63 00:04:05,030 --> 00:04:08,260 of being bigger than the median and a 50% chance of being 64 00:04:08,260 --> 00:04:09,760 smaller. 65 00:04:09,760 --> 00:04:17,209 Now precisely you can define it this way. 66 00:04:17,209 --> 00:04:24,950 The median of a random variable R 67 00:04:24,950 --> 00:04:35,300 is the value in the range of R such that the probability 68 00:04:35,300 --> 00:04:39,700 that the random variable is less than this median point is 69 00:04:39,700 --> 00:04:42,020 at most 1/2. 70 00:04:42,020 --> 00:04:44,330 And the probability that random variable 71 00:04:44,330 --> 00:04:48,086 is bigger than the median is strictly less than 1/2. 72 00:04:50,910 --> 00:04:52,950 Now, some texts do it differently. 73 00:04:52,950 --> 00:04:56,944 They swap this less than or equal to with this one. 74 00:04:56,944 --> 00:04:58,610 And it could give you a different answer 75 00:04:58,610 --> 00:05:01,170 if you do that. 76 00:05:01,170 --> 00:05:04,506 And I think in the text, actually, we screwed it up. 77 00:05:04,506 --> 00:05:05,880 So we have to get that corrected. 78 00:05:05,880 --> 00:05:07,796 I think we might have put a less than or equal 79 00:05:07,796 --> 00:05:09,990 here, which is not the right definition. 
80 00:05:09,990 --> 00:05:13,160 So this is the right definition that we'll use. 81 00:05:13,160 --> 00:05:16,730 So using this definition, what is the median 82 00:05:16,730 --> 00:05:20,950 of the random variable that corresponds to rolling a die? 83 00:05:20,950 --> 00:05:24,551 What's the median when I roll a six sided fair die? 84 00:05:27,810 --> 00:05:32,820 Not 3 and 1/2, because it's got to be one of the values. 85 00:05:32,820 --> 00:05:35,652 The median does have to be in the range 86 00:05:35,652 --> 00:05:36,610 of the random variable. 87 00:05:36,610 --> 00:05:39,340 It has to be one of the attainable values. 88 00:05:39,340 --> 00:05:41,900 What's the median value when I roll a die? 89 00:05:41,900 --> 00:05:43,280 AUDIENCE: Four. 90 00:05:43,280 --> 00:05:43,980 PROFESSOR: Four. 91 00:05:43,980 --> 00:05:45,580 Let's try that out. 92 00:05:45,580 --> 00:05:49,620 If I plug in 4 here, the probability I'm less than 4 93 00:05:49,620 --> 00:05:50,790 is 1/2. 94 00:05:50,790 --> 00:05:52,540 Could be 1, 2, or 3. 95 00:05:52,540 --> 00:05:57,330 The probability of greater than 4, that's 5 or 6, is 1/3. 96 00:05:57,330 --> 00:05:58,470 That's less than 1/2. 97 00:05:58,470 --> 00:06:00,280 So 4 works. 98 00:06:00,280 --> 00:06:05,290 3 doesn't work, because the probability I'm bigger than 3 99 00:06:05,290 --> 00:06:06,710 is 1/2. 100 00:06:06,710 --> 00:06:08,296 So it doesn't work. 101 00:06:08,296 --> 00:06:10,170 And the other definition sometimes people use 102 00:06:10,170 --> 00:06:11,960 would turn out that 3 is the median. 103 00:06:14,560 --> 00:06:16,160 Now, we're not going to spend any more 104 00:06:16,160 --> 00:06:18,330 time talking about the median. 105 00:06:18,330 --> 00:06:21,430 What's really important in probability is the mean. 106 00:06:21,430 --> 00:06:23,170 And that's because you can do a whole lot 107 00:06:23,170 --> 00:06:25,970 more with it in applications.
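The median definition used here (probability of being below is at most 1/2, probability of being above is strictly less than 1/2) can be tried out directly. The sketch below is illustrative only: the function name `median`, the `swapped` flag, and the dictionary representation are assumptions, not part of the lecture.

```python
from fractions import Fraction


def median(distribution, swapped=False):
    """Return the value x in the range of R satisfying the lecture's definition:
    Pr[R < x] <= 1/2 and Pr[R > x] < 1/2.

    With swapped=True the two inequalities trade places (Pr[R < x] < 1/2 and
    Pr[R > x] <= 1/2), which is the alternative some texts use.
    """
    half = Fraction(1, 2)
    for x in sorted(distribution):
        below = sum(p for v, p in distribution.items() if v < x)
        above = sum(p for v, p in distribution.items() if v > x)
        if swapped:
            ok = below < half and above <= half
        else:
            ok = below <= half and above < half
        if ok:
            return x
    return None


# Fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(median(die))                # 4
print(median(die, swapped=True))  # 3
```

For the fair die the two conventions disagree, which is the point made above: 4 under this definition, 3 under the swapped one.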
108 00:06:25,970 --> 00:06:31,770 Any question about the definitions so far? 109 00:06:31,770 --> 00:06:36,890 All right, so now we're going to do a more interesting example. 110 00:06:36,890 --> 00:06:39,780 We're going to play a gambling game 111 00:06:39,780 --> 00:06:43,110 and we're going to analyze the expected winnings, the expected 112 00:06:43,110 --> 00:06:44,810 return. 113 00:06:44,810 --> 00:06:47,090 And this is a simple three person game 114 00:06:47,090 --> 00:06:52,480 that you can see played in bars and informally. 115 00:06:52,480 --> 00:06:55,590 But to play it I need a volunteer 116 00:06:55,590 --> 00:06:57,625 from the class, somebody who wants to play. 117 00:06:57,625 --> 00:06:58,250 You've done it. 118 00:06:58,250 --> 00:06:59,041 Who hasn't done it? 119 00:06:59,041 --> 00:07:00,359 Have you done it? 120 00:07:00,359 --> 00:07:01,150 You haven't played. 121 00:07:01,150 --> 00:07:03,012 Do you have money in your pocket? 122 00:07:03,012 --> 00:07:04,220 You got to borrow some money. 123 00:07:04,220 --> 00:07:07,048 If you don't have any money, you got to borrow some. 124 00:07:07,048 --> 00:07:08,770 AUDIENCE: How much money? 125 00:07:08,770 --> 00:07:11,560 PROFESSOR: Oh, I don't know, $5 or $10. 126 00:07:11,560 --> 00:07:12,120 Got it? 127 00:07:12,120 --> 00:07:13,507 All right, come on down. 128 00:07:15,746 --> 00:07:17,620 You're going to play against a couple of TAs. 129 00:07:17,620 --> 00:07:20,910 Maybe a couple of you will want to play. 130 00:07:20,910 --> 00:07:23,630 I'll get Nick and Martyna to play here. 131 00:07:23,630 --> 00:07:27,960 Now, this is a very simple game. 132 00:07:27,960 --> 00:07:33,060 In each round, each of the players is going to wager $2. 133 00:07:33,060 --> 00:07:36,950 So I can loan you guys some money here, I guess. 134 00:07:36,950 --> 00:07:39,741 So you have $2? 135 00:07:39,741 --> 00:07:41,240 All right, I have to loan you money. 136 00:07:41,240 --> 00:07:42,646 I better count this out.
137 00:07:42,646 --> 00:07:44,354 AUDIENCE: I think I can find the Zimbabwe 138 00:07:44,354 --> 00:07:45,900 dollars I had here somewhere. 139 00:07:45,900 --> 00:07:47,260 PROFESSOR: I don't know if I'm taking those. 140 00:07:47,260 --> 00:07:48,426 Here, you can split that up. 141 00:07:48,426 --> 00:07:49,850 I loaned you guys $10 there. 142 00:07:49,850 --> 00:07:50,880 You got $2. 143 00:07:50,880 --> 00:07:53,780 Put it on the table. 144 00:07:53,780 --> 00:07:55,760 Wager your $2. 145 00:07:55,760 --> 00:07:58,160 Now what we're going to do is they're each 146 00:07:58,160 --> 00:08:02,230 going to guess the results of a coin toss. 147 00:08:02,230 --> 00:08:05,371 We'll put the pot together here, mix it up. 148 00:08:05,371 --> 00:08:06,370 You only have to do two. 149 00:08:06,370 --> 00:08:08,250 You can save those for now. 150 00:08:08,250 --> 00:08:09,420 Oh, that's your two. 151 00:08:09,420 --> 00:08:12,220 All right, so we have $6 in the pot. 152 00:08:12,220 --> 00:08:15,670 Now, each of you is going to guess the outcome of a coin. 153 00:08:15,670 --> 00:08:17,330 So here's the guesses. 154 00:08:17,330 --> 00:08:19,930 Heads or tails, that's for you. 155 00:08:19,930 --> 00:08:21,399 Heads or tails, the guess for you. 156 00:08:21,399 --> 00:08:22,690 And you get a heads or a tails. 157 00:08:22,690 --> 00:08:24,270 So instead of writing it down, you're 158 00:08:24,270 --> 00:08:26,070 going to sort of hide those, maybe 159 00:08:26,070 --> 00:08:27,820 get some advice from the class, and you're 160 00:08:27,820 --> 00:08:32,409 going to pick one and put it on the table here as your guess. 161 00:08:32,409 --> 00:08:35,650 And then one of you out there is going to flip a coin 162 00:08:35,650 --> 00:08:38,909 and announce the result. And then we're 163 00:08:38,909 --> 00:08:42,480 going to reveal their guesses and the winners 164 00:08:42,480 --> 00:08:44,670 split the money. 
165 00:08:44,670 --> 00:08:47,940 And if there's no winners, well, they take their $2 back 166 00:08:47,940 --> 00:08:49,570 and they split the pot. 167 00:08:49,570 --> 00:08:52,520 So this is a very fair game. 168 00:08:52,520 --> 00:08:54,810 They make their choices, you guys 169 00:08:54,810 --> 00:08:57,350 toss the coin, and the winners share the pot. 170 00:08:57,350 --> 00:09:01,650 If there's one winner, they get $6, which is a profit of $4. 171 00:09:01,650 --> 00:09:04,480 Two winners, they each get three, profit of one, 172 00:09:04,480 --> 00:09:06,790 the other one's out two bucks. 173 00:09:06,790 --> 00:09:08,940 Nobody wins, they get their money back, profit 0. 174 00:09:08,940 --> 00:09:11,300 They all win, they get their money back, profit 0. 175 00:09:11,300 --> 00:09:13,520 Is the game clear? 176 00:09:13,520 --> 00:09:15,810 Who's got a coin out there? 177 00:09:15,810 --> 00:09:17,030 You've got a coin? 178 00:09:17,030 --> 00:09:17,870 Very good. 179 00:09:17,870 --> 00:09:19,510 Now you've got to guess the coin. 180 00:09:19,510 --> 00:09:22,520 Don't show each other your guesses. 181 00:09:22,520 --> 00:09:26,459 And put your guesses on the table here. 182 00:09:26,459 --> 00:09:27,750 You get to pick heads or tails. 183 00:09:27,750 --> 00:09:28,583 Put it on the table. 184 00:09:35,270 --> 00:09:36,620 [LAUGHS] 185 00:09:36,620 --> 00:09:37,950 This is not a hard game. 186 00:09:41,140 --> 00:09:44,710 You got a heads and a tails, right? 187 00:09:44,710 --> 00:09:48,198 Well, yeah, did you put a heads or a tails in there? 188 00:09:48,198 --> 00:09:51,650 Yeah, there's one heads, one tails. 189 00:09:51,650 --> 00:09:52,420 No, that's blank. 190 00:09:52,420 --> 00:09:53,909 You don't win that way. 191 00:09:53,909 --> 00:09:54,450 There you go. 192 00:09:54,450 --> 00:09:55,158 That's the tails. 193 00:09:55,158 --> 00:09:56,940 AUDIENCE: I thought you meant-- 194 00:09:56,940 --> 00:09:57,690 PROFESSOR: Heads.
195 00:09:57,690 --> 00:09:59,418 So if you want heads, you go like that. 196 00:09:59,418 --> 00:09:59,846 AUDIENCE: Yeah. 197 00:09:59,846 --> 00:10:01,560 I thought you meant do it the other way. 198 00:10:01,560 --> 00:10:02,345 PROFESSOR: There you go. 199 00:10:02,345 --> 00:10:04,205 AUDIENCE: All right, I'll have to make these indistinguishable 200 00:10:04,205 --> 00:10:05,140 now. 201 00:10:05,140 --> 00:10:06,140 PROFESSOR: Well, yeah. 202 00:10:06,140 --> 00:10:09,030 One of them is all ripped up now. 203 00:10:09,030 --> 00:10:11,490 They already guessed, so they've already done their guess. 204 00:10:11,490 --> 00:10:12,800 So you pick one and put it there. 205 00:10:12,800 --> 00:10:14,425 AUDIENCE: This is what I was originally 206 00:10:14,425 --> 00:10:15,997 going to guess anyway. 207 00:10:15,997 --> 00:10:17,080 PROFESSOR: So that's good. 208 00:10:17,080 --> 00:10:24,690 And you now flip the coin and tell us what it came out. 209 00:10:24,690 --> 00:10:25,562 AUDIENCE: Heads. 210 00:10:25,562 --> 00:10:26,270 PROFESSOR: Heads. 211 00:10:26,270 --> 00:10:27,811 All right, let's reveal your choices. 212 00:10:30,280 --> 00:10:31,930 All right, so there's one heads. 213 00:10:31,930 --> 00:10:35,780 Martyna takes all the money. 214 00:10:35,780 --> 00:10:37,377 Just bad luck there, right? 215 00:10:37,377 --> 00:10:38,460 We're going to play again. 216 00:10:38,460 --> 00:10:39,793 We're going to play a few times. 217 00:10:39,793 --> 00:10:43,490 But let's start recording what happened here. 218 00:10:43,490 --> 00:10:46,850 So the coin came out heads. 219 00:10:46,850 --> 00:10:48,005 OK, remind me your name? 220 00:10:48,005 --> 00:10:48,630 AUDIENCE: Adam. 221 00:10:48,630 --> 00:10:50,950 PROFESSOR: Adam guessed tails. 222 00:10:54,850 --> 00:10:56,061 And Martyna. 223 00:10:59,230 --> 00:11:01,290 Is it Y? 224 00:11:01,290 --> 00:11:07,680 Martyna guessed heads and Nick, he also loses at tails.
225 00:11:07,680 --> 00:11:13,780 So in this case Adam is out $2. 226 00:11:13,780 --> 00:11:16,800 Martyna got a profit of $4. 227 00:11:16,800 --> 00:11:20,230 And Nick is also down $2. 228 00:11:20,230 --> 00:11:22,800 All right, let's try it again. 229 00:11:22,800 --> 00:11:26,500 So again, pick another heads or tails. 230 00:11:26,500 --> 00:11:30,802 Adam's busily working out probabilities here. 231 00:11:30,802 --> 00:11:33,010 Well you got to put your money on the table here now. 232 00:11:33,010 --> 00:11:35,700 You got $2. 233 00:11:35,700 --> 00:11:37,480 That's three. 234 00:11:37,480 --> 00:11:38,517 Here you go. 235 00:11:38,517 --> 00:11:39,990 [LAUGHS] 236 00:11:39,990 --> 00:11:40,490 There we go. 237 00:11:40,490 --> 00:11:41,330 Two more dollars. 238 00:11:41,330 --> 00:11:43,390 You can make change if you want. 239 00:11:43,390 --> 00:11:44,650 There we go. 240 00:11:44,650 --> 00:11:47,990 All right, make your choices. 241 00:11:47,990 --> 00:11:50,800 You got one? 242 00:11:50,800 --> 00:11:53,050 As long as you guys don't show him your choices, 243 00:11:53,050 --> 00:11:55,340 we should be good here, I think. 244 00:11:55,340 --> 00:11:56,790 And you got to pick one here. 245 00:12:00,140 --> 00:12:00,640 Very good. 246 00:12:00,640 --> 00:12:03,370 Can we have a coin toss please? 247 00:12:03,370 --> 00:12:04,300 AUDIENCE: Heads. 248 00:12:04,300 --> 00:12:05,460 PROFESSOR: Heads again. 249 00:12:05,460 --> 00:12:07,560 What do we got? 250 00:12:07,560 --> 00:12:08,480 All right, good job. 251 00:12:08,480 --> 00:12:11,506 OK, Adam got heads and Martyna's on a roll here. 252 00:12:11,506 --> 00:12:12,630 So we got to split this up. 253 00:12:12,630 --> 00:12:14,360 Can you make change for him here? 254 00:12:14,360 --> 00:12:16,080 He gets $3. 255 00:12:16,080 --> 00:12:18,750 So it was heads. 256 00:12:18,750 --> 00:12:23,870 Adam had heads, Martyna has heads, and Nick's in trouble. 
257 00:12:23,870 --> 00:12:28,180 Nick's off another $2 of my money. 258 00:12:28,180 --> 00:12:32,690 Martyna is up $1 because she wagered two and got three 259 00:12:32,690 --> 00:12:37,570 and Adam is up $1 here for that one. 260 00:12:37,570 --> 00:12:39,420 All right, let's make another selection. 261 00:12:39,420 --> 00:12:41,700 Heads or tails? 262 00:12:41,700 --> 00:12:43,860 Put the money up here, $2. 263 00:12:43,860 --> 00:12:47,786 Going to need another loan here, Nick? 264 00:12:47,786 --> 00:12:49,410 AUDIENCE: I can throw in my keys. 265 00:12:49,410 --> 00:12:50,910 PROFESSOR: No, no. 266 00:12:50,910 --> 00:12:52,889 AUDIENCE: The keys to my non-existent car. 267 00:12:52,889 --> 00:12:53,930 PROFESSOR: You got money? 268 00:12:53,930 --> 00:12:54,610 OK. 269 00:12:54,610 --> 00:12:55,335 All right. 270 00:12:55,335 --> 00:12:55,790 AUDIENCE: I got a lot of money. 271 00:12:55,790 --> 00:12:57,373 PROFESSOR: Oh, very good, that's good. 272 00:12:57,373 --> 00:12:59,976 That's what we like to hear here. 273 00:12:59,976 --> 00:13:01,600 AUDIENCE: That's what he likes to hear. 274 00:13:01,600 --> 00:13:03,590 PROFESSOR: Everybody make a choice. 275 00:13:03,590 --> 00:13:05,050 Heads or tails? 276 00:13:05,050 --> 00:13:06,690 Now, you all want to be thinking, OK, 277 00:13:06,690 --> 00:13:09,275 what's going on here? 278 00:13:09,275 --> 00:13:11,400 We're going to figure out expectations in a minute, 279 00:13:11,400 --> 00:13:12,191 do the tree method. 280 00:13:12,191 --> 00:13:15,080 Is there any catch going on? 281 00:13:15,080 --> 00:13:16,580 Not that we would have a catch here. 282 00:13:16,580 --> 00:13:19,150 OK, put a heads or tails down. 283 00:13:19,150 --> 00:13:20,475 OK, coin toss please. 284 00:13:23,185 --> 00:13:24,210 AUDIENCE: Tails. 285 00:13:24,210 --> 00:13:24,920 PROFESSOR: Tails. 286 00:13:24,920 --> 00:13:26,160 What do we got? 287 00:13:26,160 --> 00:13:31,150 A winner, a winner, and a loser. 
288 00:13:31,150 --> 00:13:35,130 So we got tails, tails, tails. 289 00:13:35,130 --> 00:13:40,210 Nick has lost another $2. 290 00:13:40,210 --> 00:13:43,840 Martyna's cruising and Adam is back to even. 291 00:13:43,840 --> 00:13:46,900 Good job. 292 00:13:46,900 --> 00:13:48,110 AUDIENCE: Can I cash out now? 293 00:13:48,110 --> 00:13:49,040 PROFESSOR: Well, you're only even. 294 00:13:49,040 --> 00:13:50,354 You've got to get ahead here. 295 00:13:50,354 --> 00:13:51,562 All right, let's do it again. 296 00:13:57,790 --> 00:13:58,330 $2. 297 00:13:58,330 --> 00:13:59,676 Everybody got $2 on the table? 298 00:14:02,650 --> 00:14:04,120 I'm going to have to loan. 299 00:14:04,120 --> 00:14:05,370 You got money? 300 00:14:05,370 --> 00:14:05,870 You got it. 301 00:14:05,870 --> 00:14:08,100 OK. 302 00:14:08,100 --> 00:14:11,150 See, my hope is Nick loses track. 303 00:14:11,150 --> 00:14:13,110 OK, make a choice please. 304 00:14:19,760 --> 00:14:22,690 Coin toss, please. 305 00:14:22,690 --> 00:14:23,520 AUDIENCE: Tails. 306 00:14:23,520 --> 00:14:24,228 PROFESSOR: Tails. 307 00:14:24,228 --> 00:14:25,770 What do we got? 308 00:14:25,770 --> 00:14:26,660 Heads. 309 00:14:26,660 --> 00:14:28,746 Martyna, Nick! 310 00:14:28,746 --> 00:14:31,360 Oh, for goodness sake. 311 00:14:31,360 --> 00:14:34,330 AUDIENCE: She's psychic, I tell you. 312 00:14:34,330 --> 00:14:36,580 PROFESSOR: Martyna wins again. 313 00:14:36,580 --> 00:14:40,460 You didn't make a deal with him, did you? 314 00:14:40,460 --> 00:14:42,820 AUDIENCE: [INAUDIBLE]. 315 00:14:42,820 --> 00:14:47,270 PROFESSOR: Nick and Martyna are perfect and Adam is down again. 316 00:14:50,670 --> 00:14:53,050 Let's collect your money. 317 00:14:53,050 --> 00:14:54,440 She gets it all. 318 00:14:54,440 --> 00:14:56,380 Oh, she got four here. 319 00:14:56,380 --> 00:14:57,590 Yeah, she made four. 320 00:14:57,590 --> 00:14:58,590 Good point. 321 00:14:58,590 --> 00:15:00,410 Because she won it all. 
322 00:15:00,410 --> 00:15:01,610 Very good. 323 00:15:01,610 --> 00:15:03,981 All right, let's put the next wager up. 324 00:15:07,970 --> 00:15:10,670 Nick, if I were you, I wouldn't do much gambling in life. 325 00:15:10,670 --> 00:15:13,970 [LAUGHS] You got any money left Adam? 326 00:15:13,970 --> 00:15:15,582 AUDIENCE: I'm going to have a big win. 327 00:15:15,582 --> 00:15:16,540 PROFESSOR: There it is. 328 00:15:16,540 --> 00:15:17,498 Working on the big win. 329 00:15:17,498 --> 00:15:18,660 There we go. 330 00:15:18,660 --> 00:15:21,510 AUDIENCE: I'm running out of ones here. 331 00:15:21,510 --> 00:15:24,277 PROFESSOR: Martyna can make change for you. 332 00:15:24,277 --> 00:15:25,360 AUDIENCE: Thanks, Martyna. 333 00:15:27,950 --> 00:15:32,010 Do you have change for a 10? 334 00:15:32,010 --> 00:15:34,520 PROFESSOR: All right. 335 00:15:34,520 --> 00:15:36,655 There we go. 336 00:15:36,655 --> 00:15:37,530 A couple bucks there. 337 00:15:37,530 --> 00:15:38,779 And then make your selections. 338 00:15:42,260 --> 00:15:43,380 Okey doke. 339 00:15:43,380 --> 00:15:44,550 It's a pretty fair game. 340 00:15:44,550 --> 00:15:46,860 They put the money in, the winners split the pot. 341 00:15:46,860 --> 00:15:50,760 You guys flip the coin after they make their selections. 342 00:15:50,760 --> 00:15:52,360 OK, make a selection. 343 00:15:52,360 --> 00:15:54,860 Martyna has hers. 344 00:15:54,860 --> 00:15:56,820 Adam's ready and Nick is ready. 345 00:15:56,820 --> 00:15:59,550 Can we have a coin toss? 346 00:15:59,550 --> 00:16:00,550 AUDIENCE: Tails. 347 00:16:00,550 --> 00:16:01,390 PROFESSOR: Tails. 348 00:16:01,390 --> 00:16:03,006 What do we got? 349 00:16:03,006 --> 00:16:03,720 Nick! 350 00:16:03,720 --> 00:16:05,440 Congratulations. 351 00:16:05,440 --> 00:16:06,450 Nick wins big. 352 00:16:06,450 --> 00:16:09,020 All right. 353 00:16:09,020 --> 00:16:13,790 Tails, heads, heads, tails. 354 00:16:13,790 --> 00:16:15,470 Nick, $4. 
355 00:16:15,470 --> 00:16:18,050 He's climbing back. 356 00:16:18,050 --> 00:16:21,800 And Martyna lost for the first time $2. 357 00:16:21,800 --> 00:16:25,320 Poor Adam is sinking a little bit here. 358 00:16:25,320 --> 00:16:26,840 All right, one more. 359 00:16:26,840 --> 00:16:27,790 So who collects? 360 00:16:27,790 --> 00:16:29,960 Nick, do you want to collect this? 361 00:16:29,960 --> 00:16:33,920 Start paying off the debts, but save $2 for the last round. 362 00:16:33,920 --> 00:16:35,690 This yours? 363 00:16:35,690 --> 00:16:36,867 Yeah, there's a last one. 364 00:16:36,867 --> 00:16:37,950 All right, two more bucks. 365 00:16:37,950 --> 00:16:38,704 Last round. 366 00:16:43,300 --> 00:16:44,535 OK, make your selections. 367 00:16:47,860 --> 00:16:49,920 Martyna's ready. 368 00:16:49,920 --> 00:16:50,700 Nick is ready. 369 00:16:50,700 --> 00:16:52,025 OK, Adam, what do you like? 370 00:16:55,350 --> 00:16:58,290 Coin toss. 371 00:16:58,290 --> 00:16:58,962 AUDIENCE: Heads. 372 00:16:58,962 --> 00:16:59,670 PROFESSOR: Heads. 373 00:16:59,670 --> 00:17:00,930 How'd you do? 374 00:17:00,930 --> 00:17:02,110 Oh Adam, tough break. 375 00:17:02,110 --> 00:17:03,580 Nick again. 376 00:17:03,580 --> 00:17:04,859 Nick is the sole winner. 377 00:17:04,859 --> 00:17:06,880 So he collects on the heads. 378 00:17:11,440 --> 00:17:17,119 All right, so it's plus $4 for Nick, minus 2 for Martyna, 379 00:17:17,119 --> 00:17:19,880 minus 2 for Adam. 380 00:17:19,880 --> 00:17:22,980 All right, let's see how they did in total. 381 00:17:22,980 --> 00:17:24,020 Oh, poor Adam. 382 00:17:24,020 --> 00:17:26,940 Minus $6 here. 383 00:17:26,940 --> 00:17:27,800 How did that happen? 384 00:17:27,800 --> 00:17:32,560 Martyna is plus $6. 385 00:17:32,560 --> 00:17:34,570 Now we know how it happened. 386 00:17:34,570 --> 00:17:37,655 And Nick is even. 387 00:17:40,266 --> 00:17:40,766 Even. 388 00:17:46,950 --> 00:17:50,050 Yeah, so that's just a tough break for Adam.
389 00:17:50,050 --> 00:17:53,659 He came up here and lost $6. 390 00:17:53,659 --> 00:17:55,700 AUDIENCE: Interestingly, there's $6 on the table. 391 00:17:55,700 --> 00:17:57,158 I think I should take that and run. 392 00:17:57,158 --> 00:17:57,779 [LAUGHS] 393 00:17:57,779 --> 00:17:59,320 PROFESSOR: Yeah, you probably should. 394 00:17:59,320 --> 00:18:02,500 But I need to get paid back from Nick here. 395 00:18:02,500 --> 00:18:06,420 Now obviously this is a fair game, right? 396 00:18:06,420 --> 00:18:08,760 So how many people think that Adam was just unlucky? 397 00:18:12,010 --> 00:18:13,010 A few. 398 00:18:13,010 --> 00:18:17,450 How many people think we just screwed him out of $6? 399 00:18:17,450 --> 00:18:19,000 Yeah, you're right. 400 00:18:19,000 --> 00:18:23,260 But let's do the analysis to see what happened here. 401 00:18:23,260 --> 00:18:26,156 Meanwhile, I probably should give him his $6. 402 00:18:26,156 --> 00:18:29,060 AUDIENCE: [INAUDIBLE]. 403 00:18:29,060 --> 00:18:31,376 PROFESSOR: And you got my money. 404 00:18:31,376 --> 00:18:34,696 All right, so I have a gift certificate, 405 00:18:34,696 --> 00:18:36,570 if you would like, for playing the game here. 406 00:18:36,570 --> 00:18:40,450 You can have this or you can have the pocket protector. 407 00:18:40,450 --> 00:18:42,640 I got to say somebody's last time turned 408 00:18:42,640 --> 00:18:44,670 in this for one of these. 409 00:18:44,670 --> 00:18:46,560 The nerd pride pocket protector. 410 00:18:46,560 --> 00:18:49,700 So what would you like for your memento? 411 00:18:49,700 --> 00:18:51,408 AUDIENCE: I'm going to take the box where 412 00:18:51,408 --> 00:18:53,040 [? David Chena ?] is standing. 413 00:18:53,040 --> 00:18:55,820 PROFESSOR: What? 414 00:18:55,820 --> 00:18:57,280 AUDIENCE: As in [INAUDIBLE]. 415 00:18:57,280 --> 00:19:00,110 PROFESSOR: You get one too for tossing the coin. 416 00:19:00,110 --> 00:19:03,810 So we'll pass this up. 
417 00:19:03,810 --> 00:19:04,310 Here you go. 418 00:19:04,310 --> 00:19:07,010 Want to pass that up for our coin tosser? 419 00:19:07,010 --> 00:19:07,840 OK, very good. 420 00:19:07,840 --> 00:19:10,247 Thanks very much. 421 00:19:10,247 --> 00:19:11,330 Just leave the money here. 422 00:19:11,330 --> 00:19:12,377 Yeah. 423 00:19:12,377 --> 00:19:13,210 Leave my money here. 424 00:19:13,210 --> 00:19:14,420 Oh, we took some out of your wallet. 425 00:19:14,420 --> 00:19:16,076 AUDIENCE: I got $5 money from you-- $5. 426 00:19:16,076 --> 00:19:17,868 PROFESSOR: $5 from me, yeah. 427 00:19:17,868 --> 00:19:19,312 All right, so leave me the $5. 428 00:19:19,312 --> 00:19:21,672 AUDIENCE: This is your $5. 429 00:19:21,672 --> 00:19:24,570 PROFESSOR: Oh, I should give them back their $6. 430 00:19:24,570 --> 00:19:26,304 Well, otherwise they could report me, 431 00:19:26,304 --> 00:19:27,220 and I'd be in trouble. 432 00:19:27,220 --> 00:19:28,450 And it's all on film. 433 00:19:28,450 --> 00:19:31,120 That would not be good. 434 00:19:31,120 --> 00:19:35,990 OK, so what we're going to do now is analyze the game. 435 00:19:35,990 --> 00:19:38,320 And I'm going to prove to you he was 436 00:19:38,320 --> 00:19:40,520 just unlucky, despite the fact that you 437 00:19:40,520 --> 00:19:42,830 think I screwed him here. 438 00:19:42,830 --> 00:19:46,730 So we're going to analyze Adam's expected winnings 439 00:19:46,730 --> 00:19:48,910 playing this game. 440 00:19:48,910 --> 00:19:49,840 So let's do that. 441 00:19:52,600 --> 00:19:55,880 So we're going to make the tree, as usual. 442 00:19:55,880 --> 00:20:00,500 Now, the first thing we have is Adam's choice. 443 00:20:00,500 --> 00:20:02,467 Now, Adam were you doing anything besides sort 444 00:20:02,467 --> 00:20:03,300 of randomly picking? 445 00:20:03,300 --> 00:20:04,883 Were you trying to psych out the coin? 
446 00:20:07,784 --> 00:20:09,950 AUDIENCE: Apparently I wouldn't be able to bribe him 447 00:20:09,950 --> 00:20:11,370 from all the way down here. 448 00:20:11,370 --> 00:20:13,280 PROFESSOR: Yeah, and you did lose most of it. 449 00:20:13,280 --> 00:20:16,870 We'll say Adam was playing 50-50. 450 00:20:16,870 --> 00:20:18,850 It's a reasonable assumption. 451 00:20:18,850 --> 00:20:22,550 And then we have Martyna and then 452 00:20:22,550 --> 00:20:26,010 we have Nick and their choices. 453 00:20:26,010 --> 00:20:29,530 And let's say they're 50-50 as well, because none of them 454 00:20:29,530 --> 00:20:31,280 know what coin is going to be tossed. 455 00:20:31,280 --> 00:20:33,571 And I'm not going to draw this bottom half of the tree, 456 00:20:33,571 --> 00:20:35,000 but it's totally symmetric. 457 00:20:35,000 --> 00:20:38,060 It just gets too deep otherwise. 458 00:20:38,060 --> 00:20:45,230 And then we have Nick's choices, heads, tails, heads, tails. 459 00:20:45,230 --> 00:20:46,027 And they're 50-50. 460 00:20:50,342 --> 00:20:51,425 And then we have the coin. 461 00:20:53,941 --> 00:20:55,440 And I'll start drawing it down here. 462 00:20:55,440 --> 00:20:57,197 And the coin can be heads or tails. 463 00:20:57,197 --> 00:20:59,030 And we're going to assume it was a fair coin 464 00:20:59,030 --> 00:21:00,780 toss back there, because it looked like it 465 00:21:00,780 --> 00:21:02,320 was flipping a bunch of times. 466 00:21:02,320 --> 00:21:04,640 And you don't look like Persi Diaconis, 467 00:21:04,640 --> 00:21:06,240 so we're going to assume you didn't 468 00:21:06,240 --> 00:21:09,150 learn how to flip heads always. 469 00:21:09,150 --> 00:21:10,118 So those are 50-50. 470 00:21:13,790 --> 00:21:16,330 A little tight here. 471 00:21:16,330 --> 00:21:17,370 Everything is 50-50. 472 00:21:23,040 --> 00:21:27,680 So the probabilities are all 1/16. 473 00:21:27,680 --> 00:21:30,670 I got 16 possible outcomes. 
474 00:21:30,670 --> 00:21:32,330 2 by 2 by 2 by 2. 475 00:21:32,330 --> 00:21:34,044 They're all equally likely. 476 00:21:34,044 --> 00:21:35,460 So all the probabilities are 1/16. 477 00:21:41,970 --> 00:21:45,380 And now let's see the winnings. 478 00:21:45,380 --> 00:21:48,957 So we'll take Adam's gain. 479 00:21:48,957 --> 00:21:50,207 It'll turn out to be negative. 480 00:21:53,790 --> 00:21:58,440 Now, in this case if Adam's heads and Martyna and Nick 481 00:21:58,440 --> 00:22:02,990 are heads and the coin comes up heads, they all win, 482 00:22:02,990 --> 00:22:03,880 they split the pot. 483 00:22:03,880 --> 00:22:07,670 So how much is the gain for Adam in that case? 484 00:22:07,670 --> 00:22:10,630 0. 485 00:22:10,630 --> 00:22:14,600 Now they all guessed heads, but the coin comes out tails, 486 00:22:14,600 --> 00:22:15,720 so they split the pot. 487 00:22:15,720 --> 00:22:19,150 What's Adam's gain there? 488 00:22:19,150 --> 00:22:21,240 0, because they just split the pot again. 489 00:22:21,240 --> 00:22:22,350 He gets his $2 back. 490 00:22:22,350 --> 00:22:25,010 That's 0. 491 00:22:25,010 --> 00:22:29,430 Now we have a case where Nick is tails, 492 00:22:29,430 --> 00:22:33,100 Adam and Martyna were heads, and it comes out heads. 493 00:22:33,100 --> 00:22:34,870 So Nick is losing here. 494 00:22:34,870 --> 00:22:37,500 The pot is split by Adam and Martyna. 495 00:22:37,500 --> 00:22:40,540 What does Adam get as a profit? 496 00:22:40,540 --> 00:22:41,040 1. 497 00:22:41,040 --> 00:22:43,290 He gets half the pot because he splits with Martyna. 498 00:22:43,290 --> 00:22:49,080 That's $3, he put in 2, he gets 1, plus 1. 499 00:22:49,080 --> 00:22:51,710 Now it turns out same scenario of guesses, 500 00:22:51,710 --> 00:22:54,740 but it comes out tails, so Nick wins everything. 501 00:22:54,740 --> 00:22:57,550 What's Adam's status here? 502 00:22:57,550 --> 00:22:58,360 Minus 2. 503 00:22:58,360 --> 00:23:01,800 He bet $2, he lost it.
504 00:23:01,800 --> 00:23:07,000 Then we have heads for Adam, tails for Martyna, 505 00:23:07,000 --> 00:23:08,330 heads for Nick. 506 00:23:08,330 --> 00:23:15,550 Coin comes up heads, Adam splits with Nick, he gets a net of $1. 507 00:23:15,550 --> 00:23:17,300 Same scenario guesses, but now it's 508 00:23:17,300 --> 00:23:22,770 tails, so Martyna wins everything, Adam loses $2. 509 00:23:22,770 --> 00:23:24,180 Now we go down to here where it's 510 00:23:24,180 --> 00:23:28,650 heads for Adam, tails for Martyna and Nick, 511 00:23:28,650 --> 00:23:31,520 and heads comes up, so Adam wins the whole thing. 512 00:23:31,520 --> 00:23:34,430 What does he get in this case? 513 00:23:34,430 --> 00:23:35,020 Plus 4. 514 00:23:35,020 --> 00:23:36,640 He wins the whole pot of 6 minus 2. 515 00:23:36,640 --> 00:23:38,410 So he gets 4 net. 516 00:23:38,410 --> 00:23:42,690 And then lastly, in this scenario it comes up tails 517 00:23:42,690 --> 00:23:46,920 and Martyna and Nick split the pot, Adam loses. 518 00:23:46,920 --> 00:23:48,810 Minus 2. 519 00:23:48,810 --> 00:23:52,030 And then the same thing is happening down here. 520 00:23:52,030 --> 00:23:52,860 Same thing. 521 00:23:52,860 --> 00:23:56,800 It's just everything is reversed by symmetry. 522 00:23:56,800 --> 00:23:59,290 Now, they're all equally likely. 523 00:23:59,290 --> 00:24:01,380 So what we do to compute the expected gain 524 00:24:01,380 --> 00:24:05,290 is take each value times its probability and add them up. 525 00:24:05,290 --> 00:24:08,420 So 0 times 1/16 plus 0 times 1/16 526 00:24:08,420 --> 00:24:12,190 plus 1 times 1/16 minus 2 times 1/16 and so forth. 527 00:24:12,190 --> 00:24:15,450 So it's easier just to add these up and multiply by 1/16. 528 00:24:15,450 --> 00:24:18,430 And when we do, we get 0. 529 00:24:18,430 --> 00:24:20,070 We get 0. 530 00:24:20,070 --> 00:24:21,510 If I add all these up, I get 0. 531 00:24:21,510 --> 00:24:23,420 The same thing for down here.
532 00:24:23,420 --> 00:24:34,050 So the expected gain for Adam is 0. 533 00:24:34,050 --> 00:24:35,841 And that is a fair game. 534 00:24:39,320 --> 00:24:40,746 What do you think? 535 00:24:40,746 --> 00:24:42,620 Do you think we'd go to all that trouble just 536 00:24:42,620 --> 00:24:43,670 to play a fair game? 537 00:24:43,670 --> 00:24:46,800 Or do you think we're trying to set him up and take his money? 538 00:24:46,800 --> 00:24:48,666 Yeah. 539 00:24:48,666 --> 00:24:51,890 AUDIENCE: Martyna and Nick are alternating, 540 00:24:51,890 --> 00:24:53,824 so the branch with the plus 4 goes away. 541 00:24:53,824 --> 00:24:55,240 PROFESSOR: Oh, that's interesting. 542 00:24:55,240 --> 00:24:56,396 Look what happened here. 543 00:24:59,588 --> 00:25:09,350 Nick and Martyna, well, they're opposite. 544 00:25:09,350 --> 00:25:11,290 That's opposite. 545 00:25:11,290 --> 00:25:12,090 That's opposite. 546 00:25:12,090 --> 00:25:14,180 That's opposite. 547 00:25:14,180 --> 00:25:16,050 You don't suppose they planned that, do you? 548 00:25:19,610 --> 00:25:22,620 And how could that possibly help that they just 549 00:25:22,620 --> 00:25:24,414 happened to guess opposite? 550 00:25:24,414 --> 00:25:25,830 AUDIENCE: One of them always wins. 551 00:25:25,830 --> 00:25:27,920 PROFESSOR: One of them always wins. 552 00:25:27,920 --> 00:25:29,956 And what does that mean for poor Adam? 553 00:25:29,956 --> 00:25:31,580 AUDIENCE: He never takes the whole pot. 554 00:25:31,580 --> 00:25:34,660 PROFESSOR: He never takes the whole pot. 555 00:25:34,660 --> 00:25:36,430 Well, all right. 556 00:25:36,430 --> 00:25:40,050 Let's see if that changed anything here. 557 00:25:40,050 --> 00:25:43,950 So if Nick and Martyna are always 558 00:25:43,950 --> 00:25:48,260 opposite, that means some of these branches can't occur. 559 00:25:48,260 --> 00:25:51,760 So if Martyna's heads, Nick is tails, this is out. 560 00:25:51,760 --> 00:25:55,100 So this is happening with probability 1 now.
561 00:25:55,100 --> 00:26:02,210 So these points are at 1/8 each instead of 1/16. 562 00:26:02,210 --> 00:26:06,770 And these go to 0 because that branch can't happen. 563 00:26:06,770 --> 00:26:07,490 Same thing here. 564 00:26:07,490 --> 00:26:11,290 When you go tails to Martyna, Nick can't be tails, 565 00:26:11,290 --> 00:26:13,110 has to be heads. 566 00:26:13,110 --> 00:26:16,590 So these go to 1/8. 567 00:26:16,590 --> 00:26:17,755 These go to 0. 568 00:26:21,180 --> 00:26:24,190 Now, you notice this isn't working out so well for Adam 569 00:26:24,190 --> 00:26:26,830 because we're putting more weight here 570 00:26:26,830 --> 00:26:31,750 where he's got a net negative compared to an even situation. 571 00:26:31,750 --> 00:26:35,200 And here he's gone from a positive situation, that's 572 00:26:35,200 --> 00:26:37,380 getting wiped out to a net negative. 573 00:26:37,380 --> 00:26:39,960 So let's compute his expected gain now. 574 00:26:39,960 --> 00:26:42,480 And I'll get the same contribution down here. 575 00:26:42,480 --> 00:26:49,760 I've got 1 minus 2 plus 1 minus 2 is negative 2. 576 00:26:49,760 --> 00:26:52,760 I'll get a negative 2 down here times 1/8. 577 00:26:52,760 --> 00:26:55,080 In this case, I'll have the expected gain 578 00:26:55,080 --> 00:27:00,330 for Adam is going to be minus. 579 00:27:00,330 --> 00:27:02,270 Let's make sure I do this right. 580 00:27:02,270 --> 00:27:09,130 It's going to be 2 for the top and bottom times 1/8 581 00:27:09,130 --> 00:27:12,560 times these guys. 582 00:27:12,560 --> 00:27:17,110 1 minus 2 plus 1 minus 2. 583 00:27:17,110 --> 00:27:20,890 And that equals minus 2. 584 00:27:20,890 --> 00:27:21,880 That's minus 1/2. 585 00:27:27,540 --> 00:27:31,600 So in fact, if they come up here and guess differently, 586 00:27:31,600 --> 00:27:33,330 who knew? 587 00:27:33,330 --> 00:27:36,610 Now the expected gain for Adam is minus $0.50.
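The two tree computations above can be checked by brute-force enumeration. The following is a minimal sketch, not the lecturer's own code: the function names are made up for illustration, and it assumes the rules as described in the lecture (a $2 stake each, winners split the $6 pot evenly, and everyone gets their stake back when all three or none of them guess correctly).

```python
from fractions import Fraction
from itertools import product

STAKE = 2  # each of the three players puts in $2 (from the lecture)


def adam_gain(adam, martyna, nick, coin):
    """Adam's net gain for one outcome of the guessing game."""
    guesses = [adam, martyna, nick]
    n_winners = sum(g == coin for g in guesses)
    if n_winners in (0, 3):
        # Everyone right or everyone wrong: stakes are returned.
        return Fraction(0)
    if adam == coin:
        # Adam shares the whole pot with the other winners.
        return Fraction(3 * STAKE, n_winners) - STAKE
    return Fraction(-STAKE)


def expected_gain_independent():
    """All four choices are independent 50-50: 16 equally likely outcomes."""
    outcomes = list(product("HT", repeat=4))
    return sum(adam_gain(*o) for o in outcomes) * Fraction(1, len(outcomes))


def expected_gain_colluding():
    """Nick always guesses the opposite of Martyna: 8 equally likely outcomes."""
    flip = {"H": "T", "T": "H"}
    outcomes = [(a, m, flip[m], c) for a, m, c in product("HT", repeat=3)]
    return sum(adam_gain(*o) for o in outcomes) * Fraction(1, len(outcomes))


print(expected_gain_independent())  # 0
print(expected_gain_colluding())    # -1/2
```

Exact fractions are used so the results match the board arithmetic (0 and minus 1/2) without floating-point rounding.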
588 00:27:36,610 --> 00:27:42,420 Every time he plays, he expects to lose $0.50 on his $2 bet. 589 00:27:42,420 --> 00:27:46,290 That's a lousy game for Adam to be playing, even 590 00:27:46,290 --> 00:27:50,830 though it seemed very fair. 591 00:27:50,830 --> 00:27:52,680 You see what's going on here? 592 00:27:52,680 --> 00:27:58,790 Now, this kind of trick is used in all sorts of gambling games. 593 00:27:58,790 --> 00:28:00,790 Maybe you've probably played some of these games 594 00:28:00,790 --> 00:28:02,623 and may not have realized maybe somebody was 595 00:28:02,623 --> 00:28:05,240 using this trick against you. 596 00:28:05,240 --> 00:28:06,640 For example, how many of you have 597 00:28:06,640 --> 00:28:10,424 been in some kind of sports betting pool? 598 00:28:10,424 --> 00:28:11,840 It's March Madness, you're betting 599 00:28:11,840 --> 00:28:15,350 on the victors in round one. 600 00:28:15,350 --> 00:28:17,180 It's a football pool for the weekend. 601 00:28:17,180 --> 00:28:18,550 You're going to guess against the spread. 602 00:28:18,550 --> 00:28:19,341 Who's going to win? 603 00:28:19,341 --> 00:28:21,420 All right, some of you have done that. 604 00:28:21,420 --> 00:28:24,160 All right, now everybody puts $1 into the pool. 605 00:28:24,160 --> 00:28:27,110 And the winner is the guy who was the most wins 606 00:28:27,110 --> 00:28:28,330 or games picked right. 607 00:28:28,330 --> 00:28:31,240 All the winners split the pot. 608 00:28:31,240 --> 00:28:34,380 Now, what this says, by doing the same kind of analysis we 609 00:28:34,380 --> 00:28:39,430 just did, is that if you collude with one or two or three 610 00:28:39,430 --> 00:28:43,260 other players in the pool to always pick differently 611 00:28:43,260 --> 00:28:47,170 on all the games, it's going to give you an edge. 612 00:28:47,170 --> 00:28:48,330 Same reasoning. 
613 00:28:48,330 --> 00:28:51,280 When they pick differently, it gives them an edge 614 00:28:51,280 --> 00:28:54,580 and now your expected return is bigger than 0, 615 00:28:54,580 --> 00:28:57,080 at the expense of the other players who 616 00:28:57,080 --> 00:28:59,640 just go in and are putting their picks, if we 617 00:28:59,640 --> 00:29:02,340 assume each one is 50-50. 618 00:29:02,340 --> 00:29:06,590 In fact, a former professor of statistics here at MIT 619 00:29:06,590 --> 00:29:12,030 used this idea, a guy named Herman Chernoff, in the 1980s 620 00:29:12,030 --> 00:29:15,190 to beat the state lottery. 621 00:29:15,190 --> 00:29:17,600 Now, everyone knows that lotteries 622 00:29:17,600 --> 00:29:20,430 are the worst game around because everybody puts 623 00:29:20,430 --> 00:29:24,080 the money in, the state takes half, 624 00:29:24,080 --> 00:29:26,850 and then the winners split the pot. 625 00:29:26,850 --> 00:29:28,540 So it's a horrendous game. 626 00:29:28,540 --> 00:29:31,190 Your expected return is half of what you put in. 627 00:29:31,190 --> 00:29:33,850 So you expect to lose half your bet, because the state's taking 628 00:29:33,850 --> 00:29:38,830 half and splitting the remainder among all the participants. 629 00:29:38,830 --> 00:29:45,170 Now, what Chernoff realized is that people don't bet randomly. 630 00:29:45,170 --> 00:29:49,540 They tend to pick the same sets of numbers. 631 00:29:49,540 --> 00:29:50,780 It might be a birthday. 632 00:29:50,780 --> 00:29:54,500 There's only so many birthdays out there. 633 00:29:54,500 --> 00:29:57,660 Might be number of home runs Papi's hit, 634 00:29:57,660 --> 00:30:00,530 his batting average, Papelbon's ERA. 635 00:30:00,530 --> 00:30:03,100 Who knows? 636 00:30:03,100 --> 00:30:05,880 But they pick a relatively small set 637 00:30:05,880 --> 00:30:08,620 of numbers that they tend to collapse on. 638 00:30:08,620 --> 00:30:14,370 In fact, if you graph how people tend to pick.
639 00:30:14,370 --> 00:30:16,010 And say you're playing pick four, 640 00:30:16,010 --> 00:30:18,187 where you pick four numbers. 641 00:30:18,187 --> 00:30:20,020 And now you look at the frequency with which 642 00:30:20,020 --> 00:30:21,690 the numbers are picked. 643 00:30:21,690 --> 00:30:25,090 Very crudely it looks something like this. 644 00:30:25,090 --> 00:30:27,230 Every once in a while you get a hot ticket 645 00:30:27,230 --> 00:30:31,890 where a lot of people pick that number. 646 00:30:31,890 --> 00:30:33,940 For example, if you're picking four numbers, 647 00:30:33,940 --> 00:30:39,422 MIT students might pick 2, 4, 6, 16. 648 00:30:39,422 --> 00:30:41,380 They might pick that and so it'd be a big spike 649 00:30:41,380 --> 00:30:43,070 for that set of four. 650 00:30:43,070 --> 00:30:47,274 Down the street they're probably picking in 1, 2, 3, 4. 651 00:30:47,274 --> 00:30:49,190 Something like that and there's a spike there. 652 00:30:52,280 --> 00:30:55,550 If you knew this was the histogram of what 653 00:30:55,550 --> 00:30:58,640 people were picking and you knew half the pot was 654 00:30:58,640 --> 00:31:03,430 going to get split with the winners, what would you pick? 655 00:31:03,430 --> 00:31:05,215 Would you pick these? 656 00:31:05,215 --> 00:31:07,590 No, because now you're splitting the pot with 100 people. 657 00:31:07,590 --> 00:31:09,190 That's no good. 658 00:31:09,190 --> 00:31:11,640 You pick down here. 659 00:31:11,640 --> 00:31:15,202 And now when you win, you get half the pot. 660 00:31:15,202 --> 00:31:17,160 The state always takes half, but you don't have 661 00:31:17,160 --> 00:31:19,470 to split it with anybody else. 662 00:31:19,470 --> 00:31:21,950 And that means if you're picking down here, 663 00:31:21,950 --> 00:31:25,820 your expected return is positive, 664 00:31:25,820 --> 00:31:27,640 even with the state taking half. 665 00:31:27,640 --> 00:31:32,330 Because so much of the money is piled up in these things. 
666 00:31:32,330 --> 00:31:36,910 And so Chernoff proved that in fact, he 667 00:31:36,910 --> 00:31:39,550 didn't know which numbers were popular. 668 00:31:39,550 --> 00:31:40,960 He didn't know that. 669 00:31:40,960 --> 00:31:42,009 So what did he do? 670 00:31:42,009 --> 00:31:44,050 If you don't know where the spikes are but you're 671 00:31:44,050 --> 00:31:46,350 trying to avoid them and there's not very many spikes, 672 00:31:46,350 --> 00:31:49,470 what would you do to avoid them? 673 00:31:49,470 --> 00:31:51,637 Pick random. 674 00:31:51,637 --> 00:31:53,970 Because if you pick random, probably you miss the spike. 675 00:31:53,970 --> 00:31:55,680 You're down here. 676 00:31:55,680 --> 00:31:57,680 With a number nobody would have thought to pick, 677 00:31:57,680 --> 00:31:59,430 some random number. 678 00:31:59,430 --> 00:32:02,240 And he showed that, in fact, if you pick randomly, 679 00:32:02,240 --> 00:32:06,870 your expected gain for the lottery at the time 680 00:32:06,870 --> 00:32:10,500 was 7%, or $0.07 on the dollar. 681 00:32:10,500 --> 00:32:13,500 So positive even with the state taking half. 682 00:32:13,500 --> 00:32:17,160 Now, shortly after that, you saw the proliferation 683 00:32:17,160 --> 00:32:20,970 of these random machines that create random numbers. 684 00:32:20,970 --> 00:32:23,090 Because the state wanted to balance it out and not 685 00:32:23,090 --> 00:32:25,310 have this kind of a scenario. 686 00:32:25,310 --> 00:32:26,910 So now a lot of the picks are randomly 687 00:32:26,910 --> 00:32:29,570 generated in a lot of the lotteries. 688 00:32:34,210 --> 00:32:36,280 These things are not immediately obvious, 689 00:32:36,280 --> 00:32:42,240 but become very clear once you know the mathematics behind it. 690 00:32:42,240 --> 00:32:43,860 Any questions? 691 00:32:43,860 --> 00:32:44,645 Yeah. 692 00:32:44,645 --> 00:32:47,900 AUDIENCE: How would a person hit a random number?
693 00:32:47,900 --> 00:32:49,870 PROFESSOR: Oh, you go to your favorite computer 694 00:32:49,870 --> 00:32:52,290 and do a random number generator. 695 00:32:52,290 --> 00:32:54,490 Now, that's not perfectly random. 696 00:32:54,490 --> 00:32:56,050 People do get into a science of this 697 00:32:56,050 --> 00:32:58,512 where there are certain cosmic rays 698 00:32:58,512 --> 00:33:00,470 or whatever hitting the earth and the frequency 699 00:33:00,470 --> 00:33:02,136 they try to get random numbers out of it 700 00:33:02,136 --> 00:33:04,220 or certain kinds of clocks and stuff 701 00:33:04,220 --> 00:33:07,090 like that, the tiny low order bits. 702 00:33:07,090 --> 00:33:10,069 Actually getting really independent random numbers 703 00:33:10,069 --> 00:33:10,860 can be challenging. 704 00:33:10,860 --> 00:33:13,140 A lot of the things you do with a computer generating 705 00:33:13,140 --> 00:33:16,140 random numbers, they're distributed nicely, 706 00:33:16,140 --> 00:33:17,850 but they're not mutually independent. 707 00:33:17,850 --> 00:33:20,250 And there's whole texts that go into how to do 708 00:33:20,250 --> 00:33:22,385 that for mutual independence. 709 00:33:22,385 --> 00:33:24,260 Getting something that's fair for one of them 710 00:33:24,260 --> 00:33:27,100 is not too hard. 711 00:33:27,100 --> 00:33:30,920 Any questions about that? 712 00:33:30,920 --> 00:33:32,970 There's another example. 713 00:33:32,970 --> 00:33:34,920 How many people ever participated 714 00:33:34,920 --> 00:33:38,090 in a Super Bowl bet? 715 00:33:38,090 --> 00:33:40,810 OK, like maybe you're trying to guess the over under on points 716 00:33:40,810 --> 00:33:42,930 scored. 717 00:33:42,930 --> 00:33:47,050 And the person who guesses closest to the total number of points 718 00:33:47,050 --> 00:33:48,370 wins the pot. 719 00:33:48,370 --> 00:33:50,760 And if there's a tie, it's shared.
720 00:33:50,760 --> 00:33:53,490 So in that kind of situation, some people 721 00:33:53,490 --> 00:33:56,570 figured out that the average number of points scored 722 00:33:56,570 --> 00:34:00,860 in a Super Bowl is, say, 30. 723 00:34:00,860 --> 00:34:07,730 And a lot of the guesses then will cluster around 30 points. 724 00:34:07,730 --> 00:34:10,840 Now, if you knew this and you knew a lot of the guesses, 725 00:34:10,840 --> 00:34:13,639 because there may be people who cared betting, where would you 726 00:34:13,639 --> 00:34:16,560 make, say, this is, I don't know, 40 and this is 20, 727 00:34:16,560 --> 00:34:19,440 where would you guess? 728 00:34:19,440 --> 00:34:20,480 You could guess here. 729 00:34:20,480 --> 00:34:22,120 That's a good one. 730 00:34:22,120 --> 00:34:24,620 Or you could guess here. 731 00:34:24,620 --> 00:34:27,448 Not only are you not going to share the pot, which helps you, 732 00:34:27,448 --> 00:34:29,739 but in this case you'd actually, because being closest, 733 00:34:29,739 --> 00:34:32,630 you'd capture all the scores out here. 734 00:34:32,630 --> 00:34:34,870 So that's really good. 735 00:34:34,870 --> 00:34:38,198 So these are the best guesses to make for your expected return, 736 00:34:38,198 --> 00:34:40,489 assuming everybody is guessing around here because they 737 00:34:40,489 --> 00:34:41,880 know that's the median. 738 00:34:41,880 --> 00:34:42,846 Yeah? 739 00:34:42,846 --> 00:34:45,384 AUDIENCE: But those scores aren't likely to happen. 740 00:34:45,384 --> 00:34:47,050 PROFESSOR: They're not likely to happen, 741 00:34:47,050 --> 00:34:51,159 but it can outweigh splitting the pot with all the people 742 00:34:51,159 --> 00:34:53,870 that guessed here. 743 00:34:53,870 --> 00:34:56,744 And you get to cover more bases. 744 00:34:56,744 --> 00:34:58,660 So you're right, they're less likely to happen 745 00:34:58,660 --> 00:35:02,170 because so many people guessed in here where it's likely. 
746 00:35:02,170 --> 00:35:05,790 Your expected return is better out here. 747 00:35:05,790 --> 00:35:08,750 You may not be likely to win, but your payoff 748 00:35:08,750 --> 00:35:11,790 will be very large when you do. 749 00:35:11,790 --> 00:35:14,590 Now, if you're in a bet or a pool 750 00:35:14,590 --> 00:35:17,400 with a bunch of 6.042 students and they're all 751 00:35:17,400 --> 00:35:22,542 guessing out here, well then you want to go back home here. 752 00:35:22,542 --> 00:35:23,250 Then it's better. 753 00:35:23,250 --> 00:35:28,800 So it all depends on what that distribution looks like. 754 00:35:28,800 --> 00:35:31,660 All right, any more questions about this? 755 00:35:31,660 --> 00:35:32,672 Yeah. 756 00:35:32,672 --> 00:35:34,213 AUDIENCE: [INAUDIBLE] if you're going 757 00:35:34,213 --> 00:35:35,883 to be playing a whole bunch of times, 758 00:35:35,883 --> 00:35:37,758 and if you're only playing once, wouldn't you 759 00:35:37,758 --> 00:35:40,158 look at the most likely result? 760 00:35:40,158 --> 00:35:41,408 PROFESSOR: Say that again now? 761 00:35:41,408 --> 00:35:43,929 AUDIENCE: If you're going to be only playing once wouldn't you 762 00:35:43,929 --> 00:35:45,595 only be concerned with the result that's 763 00:35:45,595 --> 00:35:47,606 most likely to show up? 764 00:35:47,606 --> 00:35:49,480 PROFESSOR: Well, it depends on your strategy. 765 00:35:49,480 --> 00:35:52,580 If you want the expected gain to be maximized, 766 00:35:52,580 --> 00:35:54,760 it doesn't matter how many times you're playing. 767 00:35:54,760 --> 00:35:57,400 And that can be very different than maximizing 768 00:35:57,400 --> 00:35:59,480 your probability of winning. 769 00:35:59,480 --> 00:36:02,210 If you want to maximize your probability of winning, 770 00:36:02,210 --> 00:36:06,165 you're going to go right at the center point here.
771 00:36:06,165 --> 00:36:08,290 Because if you look at the probability distribution 772 00:36:08,290 --> 00:36:12,572 and say the probability distribution looks like that, 773 00:36:12,572 --> 00:36:14,280 then you want to bet here, because that's 774 00:36:14,280 --> 00:36:17,390 the maximum chance of winning. 775 00:36:17,390 --> 00:36:22,250 But say so many people bet there, you'd split 30 ways. 776 00:36:22,250 --> 00:36:27,260 Well, this divided by 30 is smaller than this divided by 1. 777 00:36:27,260 --> 00:36:31,280 And so your expected return will be bigger out here. 778 00:36:31,280 --> 00:36:32,500 So two different things. 779 00:36:32,500 --> 00:36:35,160 Maximizing the probability of winning and maximizing 780 00:36:35,160 --> 00:36:36,830 your expected return. 781 00:36:36,830 --> 00:36:37,394 Yeah? 782 00:36:37,394 --> 00:36:38,810 AUDIENCE: When you say maximizing, 783 00:36:38,810 --> 00:36:41,144 do you just do derivatives? 784 00:36:41,144 --> 00:36:43,060 PROFESSOR: Yeah, then you could do derivatives 785 00:36:43,060 --> 00:36:43,840 and all that kind of stuff. 786 00:36:43,840 --> 00:36:44,350 That's right. 787 00:36:44,350 --> 00:36:46,141 Once you have the curve, you figure it out. 788 00:36:46,141 --> 00:36:47,820 Yeah. 789 00:36:47,820 --> 00:36:48,551 Yeah. 790 00:36:48,551 --> 00:36:50,342 AUDIENCE: Wouldn't it be better to maximize 791 00:36:50,342 --> 00:36:56,115 your expected value versus the probability that you win? 792 00:36:56,115 --> 00:36:56,740 PROFESSOR: Yes. 793 00:36:56,740 --> 00:37:01,160 In general, you want to maximize the expected return 794 00:37:01,160 --> 00:37:03,170 on the basis that probably over life you're 795 00:37:03,170 --> 00:37:04,950 doing lots of things. 796 00:37:04,950 --> 00:37:06,990 And that overall puts you in a better state. 
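The comparison "this divided by 30 is smaller than this divided by 1" can be made concrete with numbers. The figures below are purely hypothetical, chosen only for illustration (the lecture gives no actual probabilities): suppose the center guess wins with probability 0.30 but is shared 30 ways, while the outlying guess wins with probability 0.05 and is not shared. With a pot of size $P$:

```latex
\[
\underbrace{0.30 \cdot \frac{P}{30}}_{\text{center guess}} = 0.01\,P
\qquad < \qquad
\underbrace{0.05 \cdot \frac{P}{1}}_{\text{outlying guess}} = 0.05\,P .
\]
```

Under these assumed numbers the center guess is six times more likely to win, yet the outlying guess has five times the expected payout.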
797 00:37:06,990 --> 00:37:09,110 Now, we're going to talk about this 798 00:37:09,110 --> 00:37:14,750 some next time in terms of taking high risk 799 00:37:14,750 --> 00:37:17,860 bets with high, high payoffs. 800 00:37:17,860 --> 00:37:20,370 That can maximize your expected return, 801 00:37:20,370 --> 00:37:23,230 but you have a very decent chance of losing a lot. 802 00:37:23,230 --> 00:37:24,840 And so you might not want to go there. 803 00:37:24,840 --> 00:37:26,423 And we'll talk about that next time we 804 00:37:26,423 --> 00:37:29,790 talk about variance and actually what your choice is. 805 00:37:29,790 --> 00:37:32,110 Because that is a fundamental choice you face. 806 00:37:32,110 --> 00:37:35,760 Maximizing chance of winning, maximizing expected return. 807 00:37:35,760 --> 00:37:37,970 And of course, tied into that is the risk of losing 808 00:37:37,970 --> 00:37:40,011 and what kind of risk you're willing to tolerate. 809 00:37:43,530 --> 00:37:47,050 We're going to do a bunch more examples, but before I do, 810 00:37:47,050 --> 00:37:48,810 I want to show you some other equivalent 811 00:37:48,810 --> 00:37:49,940 definitions of expectation. 812 00:38:03,650 --> 00:38:06,150 So the expected value of a random variable, 813 00:38:06,150 --> 00:38:09,290 you can also compute it by summing over 814 00:38:09,290 --> 00:38:12,400 all possible values of the random variable. 815 00:38:12,400 --> 00:38:17,600 x being in the range of R. x times the probability 816 00:38:17,600 --> 00:38:20,330 that R equals x. 817 00:38:20,330 --> 00:38:22,200 And let's see why this is true. 818 00:38:22,200 --> 00:38:24,058 It follows from the definition. 819 00:38:27,600 --> 00:38:30,270 From the definition, we know the expected value 820 00:38:30,270 --> 00:38:33,600 is the sum over all sample points 821 00:38:33,600 --> 00:38:35,970 of the value of the random variable on the sample 822 00:38:35,970 --> 00:38:37,365 point times its probability. 
823 00:38:39,970 --> 00:38:46,880 And now we can organize this sum by the value that R takes. 824 00:38:46,880 --> 00:38:50,480 So we're going to split this into a double sum 825 00:38:50,480 --> 00:38:57,330 where here we're looking first at x in the range of R. 826 00:38:57,330 --> 00:39:02,030 And then here we look at sample points for which R 827 00:39:02,030 --> 00:39:04,380 on that point is x. 828 00:39:04,380 --> 00:39:07,220 So this sum is equivalent to this one. 829 00:39:07,220 --> 00:39:08,970 Here we're just organizing all the sample 830 00:39:08,970 --> 00:39:10,345 points for the same value of x. 831 00:39:28,990 --> 00:39:32,430 All right, now in the inner sum, I've 832 00:39:32,430 --> 00:39:36,140 only got values for which R of w equals x, so I can just 833 00:39:36,140 --> 00:39:37,410 replace this with x. 834 00:39:50,710 --> 00:39:53,860 The same as before. 835 00:39:53,860 --> 00:39:56,575 Only now I've just put x instead of R of w. 836 00:40:01,750 --> 00:40:05,805 Now I can pull the x out since it's a constant independent 837 00:40:05,805 --> 00:40:06,670 of that sum. 838 00:40:23,070 --> 00:40:27,280 Now here I'm summing up the probability of all the sample 839 00:40:27,280 --> 00:40:32,100 points for which R of the sample point is x. 840 00:40:32,100 --> 00:40:37,890 And that is just the probability that R equals x. 841 00:40:37,890 --> 00:40:40,540 That's the definition of the probability of that event. 842 00:40:40,540 --> 00:40:44,310 And so now my answer is the sum over all values 843 00:40:44,310 --> 00:40:48,177 x in the range of R of x times the probability R equals x. 844 00:40:48,177 --> 00:40:49,760 And that's what I was trying to prove. 845 00:40:55,870 --> 00:41:00,990 OK, any questions about that? 846 00:41:00,990 --> 00:41:03,230 Now there are some special cases of this 847 00:41:03,230 --> 00:41:07,917 when the random variable is on the natural numbers. 848 00:41:07,917 --> 00:41:09,625 So the range of R is the natural numbers.
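The regrouping argument just given is spoken against equations on the board that the transcript does not capture. One way to write the steps out, using Ex for expectation, S for the sample space, and w for a sample point as in the lecture (the exact board notation is an assumption):

```latex
\begin{align*}
\mathrm{Ex}(R)
  &= \sum_{w \in S} R(w)\,\Pr(w)
     && \text{(definition)} \\
  &= \sum_{x \in \mathrm{range}(R)} \; \sum_{w \,:\, R(w) = x} R(w)\,\Pr(w)
     && \text{(group sample points by the value of } R\text{)} \\
  &= \sum_{x \in \mathrm{range}(R)} \; \sum_{w \,:\, R(w) = x} x\,\Pr(w)
     && \text{(} R(w) = x \text{ in the inner sum)} \\
  &= \sum_{x \in \mathrm{range}(R)} x \sum_{w \,:\, R(w) = x} \Pr(w)
     && \text{(} x \text{ is constant in the inner sum)} \\
  &= \sum_{x \in \mathrm{range}(R)} x \,\Pr(R = x)
     && \text{(definition of the event } R = x\text{)}
\end{align*}
```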
849 00:41:21,960 --> 00:41:22,631 So a corollary. 850 00:41:25,400 --> 00:41:33,400 If the random variable has a range on the natural numbers, 851 00:41:33,400 --> 00:41:37,560 then another way to compute the expected value of r 852 00:41:37,560 --> 00:41:43,640 is to simply sum i equals 1 to infinity i times 853 00:41:43,640 --> 00:41:45,760 the probability R equals i. 854 00:41:48,330 --> 00:41:51,190 And the proof, well, it's really just saying the same thing 855 00:41:51,190 --> 00:41:51,890 here. 856 00:41:51,890 --> 00:41:54,160 I'm just summing over the natural numbers. 857 00:41:54,160 --> 00:41:56,140 And the case with 0 doesn't matter 858 00:41:56,140 --> 00:41:58,830 because I get 0 times the probability of 0. 859 00:41:58,830 --> 00:42:02,390 And that adds nothing to the sum. 860 00:42:02,390 --> 00:42:09,390 So I just have to sum over the positive integers in that case. 861 00:42:09,390 --> 00:42:11,620 All right, there's another special case 862 00:42:11,620 --> 00:42:15,060 of this, which makes it even easier 863 00:42:15,060 --> 00:42:17,600 sometimes to compute the expected value. 864 00:42:39,910 --> 00:42:46,260 If R is a random variable in the natural numbers, 865 00:42:46,260 --> 00:42:50,570 then the expected value of R is simply 866 00:42:50,570 --> 00:42:56,240 the sum i equals 0 to infinity of the probability 867 00:42:56,240 --> 00:43:00,400 the random variable is bigger than i. 868 00:43:00,400 --> 00:43:05,755 Which is the same as summing i equals 1 to infinity, 869 00:43:05,755 --> 00:43:09,300 the probability R is bigger than or equal to i. 870 00:43:09,300 --> 00:43:11,077 They're the same thing. 871 00:43:11,077 --> 00:43:13,160 For example, the first term here, your probability 872 00:43:13,160 --> 00:43:14,750 R bigger than 0. 873 00:43:14,750 --> 00:43:18,510 That's the same as saying R is bigger than or equal to 1 874 00:43:18,510 --> 00:43:19,220 and so forth. 875 00:43:19,220 --> 00:43:22,000 So these are clearly the same. 
876 00:43:22,000 --> 00:43:25,760 And the difference here is we have i times 877 00:43:25,760 --> 00:43:28,230 probability R equals i. 878 00:43:28,230 --> 00:43:30,990 Here we have probability R is bigger than i, 879 00:43:30,990 --> 00:43:32,120 with no i out in front. 880 00:43:32,120 --> 00:43:33,328 So let's see why that's true. 881 00:43:57,200 --> 00:44:00,460 Well, we're going to work backwards and evaluate 882 00:44:00,460 --> 00:44:02,830 that sum. 883 00:44:02,830 --> 00:44:13,350 The sum i equal 0 to infinity probability R is bigger than i. 884 00:44:13,350 --> 00:44:17,250 Well, that's the probability R is bigger 885 00:44:17,250 --> 00:44:21,450 than 0 plus the probability R is bigger 886 00:44:21,450 --> 00:44:27,700 than 1 plus probability R is bigger than 2, 887 00:44:27,700 --> 00:44:29,133 and so forth out to infinity. 888 00:44:32,610 --> 00:44:35,340 Adding those up. 889 00:44:35,340 --> 00:44:36,910 And now I can write this one out. 890 00:44:36,910 --> 00:44:43,250 That's the probability R equals 1 plus the probability R 891 00:44:43,250 --> 00:44:49,666 equals 2 plus the probability R equals 3 and so forth. 892 00:44:52,450 --> 00:44:58,110 Probability R bigger than 1 equals, well, R could be 2, 893 00:44:58,110 --> 00:45:03,760 R could be 3, and so forth. 894 00:45:03,760 --> 00:45:05,460 R bigger than 2. 895 00:45:05,460 --> 00:45:12,010 Well, R could be 3, 4, and so forth. 896 00:45:12,010 --> 00:45:18,110 And so now if I add all these up, 897 00:45:18,110 --> 00:45:23,780 well, I get 1 times probability R equals 1. 898 00:45:23,780 --> 00:45:24,910 Two of these guys. 899 00:45:30,810 --> 00:45:32,400 Three of these guys. 900 00:45:32,400 --> 00:45:34,042 And you can see the pattern here. 901 00:45:36,970 --> 00:45:39,810 Four of the next, and so forth. 902 00:45:39,810 --> 00:45:45,150 And so we've shown that this value equals that value, which 903 00:45:45,150 --> 00:45:54,635 is by the corollary just the expected value of R.
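The column-counting argument above can be laid out as follows. This is a reconstruction of what the spoken proof describes, not a copy of the board; it assumes, as in the theorem, that R takes values in the natural numbers.

```latex
\begin{align*}
\Pr(R > 0) &= \Pr(R = 1) + \Pr(R = 2) + \Pr(R = 3) + \cdots \\
\Pr(R > 1) &= \phantom{\Pr(R = 1) + {}} \Pr(R = 2) + \Pr(R = 3) + \cdots \\
\Pr(R > 2) &= \phantom{\Pr(R = 1) + \Pr(R = 2) + {}} \Pr(R = 3) + \cdots \\
           &\;\;\vdots
\end{align*}
The term $\Pr(R = i)$ appears in exactly $i$ of these rows (those for $R > 0, \dots, R > i - 1$), so adding the rows gives
\[
\sum_{i=0}^{\infty} \Pr(R > i) \;=\; \sum_{i=1}^{\infty} i \,\Pr(R = i) \;=\; \mathrm{Ex}(R).
\]
```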
904 00:45:54,635 --> 00:45:57,010 So by the corollary, we know that the expected value of R 905 00:45:57,010 --> 00:46:00,188 equals the sum up there. 906 00:46:00,188 --> 00:46:01,688 And that's the proof of the theorem. 907 00:46:06,900 --> 00:46:11,280 Sometimes it's easier to use that top expression there. 908 00:46:11,280 --> 00:46:15,610 And as a good example, it gives a really easy way 909 00:46:15,610 --> 00:46:19,690 to compute the mean time to failure in a system. 910 00:46:19,690 --> 00:46:20,548 So let's do that. 911 00:46:30,290 --> 00:46:42,020 Suppose you have a system and it fails with probability p 912 00:46:42,020 --> 00:46:45,210 at each step. 913 00:46:45,210 --> 00:46:47,160 And let's assume that the failures 914 00:46:47,160 --> 00:46:50,500 are mutually independent. 915 00:46:50,500 --> 00:46:53,340 So if the system has been going for t steps, 916 00:46:53,340 --> 00:46:56,700 it still will fail on step t plus 1 with probability p, 917 00:46:56,700 --> 00:46:59,380 no matter what's happened before. 918 00:46:59,380 --> 00:47:02,950 And the question is, what's the expected number of steps 919 00:47:02,950 --> 00:47:04,330 before the system fails? 920 00:47:04,330 --> 00:47:07,680 How long is it going to live before you get a crash? 921 00:47:07,680 --> 00:47:11,720 And we're going to let R be that random variable. 922 00:47:11,720 --> 00:47:17,000 It will be the step when failure occurs, first failure. 923 00:47:25,120 --> 00:47:28,770 We want to know what's the expected value of R. Mean time 924 00:47:28,770 --> 00:47:29,336 to failure. 925 00:47:32,430 --> 00:47:36,350 And we're going to use that top formula up there. 926 00:47:36,350 --> 00:47:38,807 Makes it a lot simpler to do that 927 00:47:38,807 --> 00:47:39,932 than the other definitions. 
928 00:47:45,010 --> 00:47:50,130 Now, the probability that R is bigger than i 929 00:47:50,130 --> 00:47:53,520 is the same as the probability it did not 930 00:47:53,520 --> 00:47:55,396 fail in the first i steps. 931 00:47:58,970 --> 00:48:08,465 So this equals the probability of no failure in the first i 932 00:48:08,465 --> 00:48:08,965 steps. 933 00:48:13,900 --> 00:48:15,840 Because this event, R bigger than i 934 00:48:15,840 --> 00:48:19,840 means that the first failure was not in the first i steps, 935 00:48:19,840 --> 00:48:23,380 so the system was fine for the first i steps. 936 00:48:23,380 --> 00:48:25,970 And because of mutual independence 937 00:48:25,970 --> 00:48:29,110 on when failure occurs, this is simply 938 00:48:29,110 --> 00:48:37,140 the product: the probability we're OK, no failure, in the first step 939 00:48:37,140 --> 00:48:38,600 times the probability 940 00:48:38,600 --> 00:48:47,800 we're OK in the second step and so forth up to the ith step. 941 00:48:47,800 --> 00:48:49,134 We're OK in ith step. 942 00:48:52,830 --> 00:48:55,040 And this is where we're using mutual independence, 943 00:48:55,040 --> 00:48:57,420 because I've gone from a situation where 944 00:48:57,420 --> 00:48:58,990 we're OK in the first i steps. 945 00:48:58,990 --> 00:49:02,280 The probability of that is the product of the probabilities. 946 00:49:02,280 --> 00:49:04,860 We're OK in each step individually. 947 00:49:04,860 --> 00:49:06,213 So that needed independence. 948 00:49:09,890 --> 00:49:13,410 Well, what is the probability we're OK in the first step? 949 00:49:17,610 --> 00:49:21,208 What's the probability of no failure on step one? 950 00:49:21,208 --> 00:49:24,950 1 minus p, because p is the probability we did fail. 951 00:49:24,950 --> 00:49:28,120 So we're OK with probability 1 minus p. 952 00:49:28,120 --> 00:49:31,930 What is the probability we're OK in the second step? 953 00:49:31,930 --> 00:49:39,040 1 minus p and so forth for all i steps.
954 00:49:39,040 --> 00:49:42,480 So this is 1 minus p to the i. 955 00:49:42,480 --> 00:49:48,960 And it's usually simpler to write that as alpha to the i 956 00:49:48,960 --> 00:49:53,200 where alpha equals 1 minus p. 957 00:49:53,200 --> 00:49:59,150 Because what we've got here now is the sum. 958 00:49:59,150 --> 00:50:02,360 Expected value of R is a sum, i equals 0 959 00:50:02,360 --> 00:50:10,990 to infinity of that probability, which is just alpha to the i. 960 00:50:10,990 --> 00:50:14,280 And we all know what that sum is, right? 961 00:50:14,280 --> 00:50:15,040 What's that sum? 962 00:50:19,430 --> 00:50:20,985 1 over 1 minus alpha. 963 00:50:23,640 --> 00:50:26,790 And that, plugging alpha back in as 1 minus 964 00:50:26,790 --> 00:50:32,702 p, is 1 over 1 minus (1 minus p). 965 00:50:32,702 --> 00:50:33,660 And that's very simple. 966 00:50:33,660 --> 00:50:37,300 That's 1 over p. 967 00:50:37,300 --> 00:50:40,310 So the expected time to fail, the expected step 968 00:50:40,310 --> 00:50:43,210 of when you're going to fail, is 1 over p where 969 00:50:43,210 --> 00:50:45,500 p is the failure probability. 970 00:50:45,500 --> 00:50:50,120 So for example, if you have a 1% chance of failing on any step, 971 00:50:50,120 --> 00:50:54,510 your mean time to failure is what? 972 00:50:54,510 --> 00:50:55,322 100. 973 00:50:55,322 --> 00:50:57,260 1 over 0.01. 974 00:50:57,260 --> 00:51:01,870 So very easy to compute mean time to failure. 975 00:51:01,870 --> 00:51:06,264 Any questions about that? 976 00:51:06,264 --> 00:51:08,430 Of course, you can do it with the other definitions, 977 00:51:08,430 --> 00:51:11,660 but the calculations are a little more painful. 978 00:51:11,660 --> 00:51:12,573 Yeah. 979 00:51:12,573 --> 00:51:13,906 AUDIENCE: Why are we summing it? 980 00:51:13,906 --> 00:51:17,250 Is it like an accumulative solution basically? 981 00:51:17,250 --> 00:51:18,860 PROFESSOR: Why are we summing?
982 00:51:18,860 --> 00:51:20,830 Well, that's the definition. 983 00:51:20,830 --> 00:51:25,930 The theorem says the expected value of a random variable 984 00:51:25,930 --> 00:51:30,832 is the sum of the probability that it's bigger than i. 985 00:51:30,832 --> 00:51:32,040 That's what the theorem says. 986 00:51:32,040 --> 00:51:34,540 So we're just plugging into the theorem. 987 00:51:34,540 --> 00:51:37,420 And then the theorem was proved based 988 00:51:37,420 --> 00:51:42,020 on the corollary, which came from the theorem and then 989 00:51:42,020 --> 00:51:43,129 the definition. 990 00:51:43,129 --> 00:51:44,670 So we went through a series of steps. 991 00:51:44,670 --> 00:51:47,510 We started with a definition of expected value, which 992 00:51:47,510 --> 00:51:48,599 makes sense. 993 00:51:48,599 --> 00:51:50,140 Then we got another way to compute it 994 00:51:50,140 --> 00:51:51,181 based on that definition. 995 00:51:53,770 --> 00:51:56,782 And then a corollary to that and then 996 00:51:56,782 --> 00:51:58,490 we used the corollary to prove another way 997 00:51:58,490 --> 00:51:59,710 of computing expected values. 998 00:51:59,710 --> 00:52:01,754 We went through a bunch of general steps 999 00:52:01,754 --> 00:52:03,670 and then we used, basically you could use this 1000 00:52:03,670 --> 00:52:06,354 as a definition at this point for expected value. 1001 00:52:06,354 --> 00:52:08,020 And it just says you sum those things up 1002 00:52:08,020 --> 00:52:08,978 and you get the answer. 1003 00:52:12,510 --> 00:52:13,568 Any other questions? 1004 00:52:16,990 --> 00:52:19,380 OK, there's a variation on this problem 1005 00:52:19,380 --> 00:52:22,890 that you see all the time in sort of trick questions 1006 00:52:22,890 --> 00:52:25,650 or in the popular press sometimes, that 1007 00:52:25,650 --> 00:52:27,220 often confuses people.
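The mean-time-to-failure calculation above can be checked numerically. The following is only a sketch: the function names `mttf_from_tail_sum` and `mttf_from_definition` and the cutoff `terms` are illustrative assumptions, not anything from the lecture. It truncates the infinite sums, so it approximates 1 over p rather than proving it.

```python
def mttf_from_tail_sum(p, terms=10_000):
    """Partial sum of Pr[R > i] = (1 - p)**i for i = 0 .. terms - 1.

    This is the tail-sum theorem used in the lecture, truncated at an
    assumed cutoff `terms`.
    """
    alpha = 1 - p
    return sum(alpha ** i for i in range(terms))


def mttf_from_definition(p, terms=10_000):
    """Partial sum of i * Pr[R = i], where Pr[R = i] = (1 - p)**(i - 1) * p.

    This is the "other definition" route: the first failure is on step i
    when the first i - 1 steps are OK and step i fails.
    """
    alpha = 1 - p
    return sum(i * alpha ** (i - 1) * p for i in range(1, terms + 1))


if __name__ == "__main__":
    for p in (0.01, 0.5):
        # Both partial sums should land very close to 1 / p.
        print(p, mttf_from_tail_sum(p), mttf_from_definition(p), 1 / p)
```

Both routes agree to within rounding for these values of p; the tail-sum version is a plain geometric series, which is why it is the easier one to do by hand.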
1008 00:52:27,220 --> 00:52:29,630 People sometimes think of it as a paradox, 1009 00:52:29,630 --> 00:52:31,930 though it's not really one. 1010 00:52:31,930 --> 00:52:34,060 And the example's something like the following. 1011 00:52:34,060 --> 00:52:36,610 Say that a couple, they're going to have kids. 1012 00:52:36,610 --> 00:52:38,890 What they really want is a baby girl. 1013 00:52:38,890 --> 00:52:40,390 They get a boy, fine, but that's not 1014 00:52:40,390 --> 00:52:41,598 what they're concerned about. 1015 00:52:41,598 --> 00:52:43,440 They want to have a baby girl. 1016 00:52:43,440 --> 00:52:45,310 And let's say that each time they have 1017 00:52:45,310 --> 00:52:48,300 a kid it's 50-50 boy or girl. 1018 00:52:48,300 --> 00:52:51,710 And let's say that it's mutually independent from one 1019 00:52:51,710 --> 00:52:54,380 kid to the next, which is not true in practice. 1020 00:52:54,380 --> 00:52:55,735 There tends to be correlation. 1021 00:52:55,735 --> 00:52:57,860 But let's assume it's mutually independent from one 1022 00:52:57,860 --> 00:52:59,500 kid to the next. 1023 00:52:59,500 --> 00:53:03,750 Now, if on the first try a couple get a girl, 1024 00:53:03,750 --> 00:53:06,110 great, they're done, they have one kid, that's it, 1025 00:53:06,110 --> 00:53:07,680 because they just wanted a girl. 1026 00:53:07,680 --> 00:53:11,140 If they get a boy, OK, try again. 1027 00:53:11,140 --> 00:53:14,890 And they keep trying again until they get the girl, 1028 00:53:14,890 --> 00:53:16,480 even if it's 50 kids. 1029 00:53:16,480 --> 00:53:18,570 They wait till they get the girl. 1030 00:53:18,570 --> 00:53:21,830 And the question is, how many baby boys 1031 00:53:21,830 --> 00:53:27,190 do you expect to get before you have the girl? 1032 00:53:27,190 --> 00:53:27,690 What's 1033 00:53:27,690 --> 00:53:32,461 the expected number of boys to get the girl, then you quit? 1034 00:53:32,461 --> 00:53:37,331 So let's do that.
1035 00:53:48,070 --> 00:53:49,735 So we want to know. 1036 00:53:49,735 --> 00:53:52,690 We have the following data. 1037 00:53:52,690 --> 00:53:57,860 The probability of a boy is 1/2. 1038 00:53:57,860 --> 00:54:00,838 You keep having boys until you get a girl, then you quit. 1039 00:54:06,840 --> 00:54:10,527 We're going to let R be the random variable 1040 00:54:10,527 --> 00:54:11,485 for the number of boys. 1041 00:54:14,050 --> 00:54:21,290 And everything is mutually independent from one child 1042 00:54:21,290 --> 00:54:23,300 to the next. 1043 00:54:23,300 --> 00:54:28,430 How many people think you expect to have more boys than the one 1044 00:54:28,430 --> 00:54:30,330 girl? 1045 00:54:30,330 --> 00:54:32,680 You keep having boys till you get the girl. 1046 00:54:32,680 --> 00:54:33,880 Most people think this. 1047 00:54:33,880 --> 00:54:39,390 How many people think you expect to have fewer than one boy? 1048 00:54:39,390 --> 00:54:40,630 Nobody. 1049 00:54:40,630 --> 00:54:43,136 How many people think you expect to have an equal number? 1050 00:54:43,136 --> 00:54:44,160 Expect one boy. 1051 00:54:44,160 --> 00:54:46,140 Good, OK, so that's the answer. 1052 00:54:46,140 --> 00:54:48,700 In fact, you expect to have just one boy. 1053 00:54:48,700 --> 00:54:52,859 And the proof, we sort of just did it here. 1054 00:54:52,859 --> 00:54:54,650 Now, in this case, we're going to set it up 1055 00:54:54,650 --> 00:54:56,460 as a mean time to failure. 1056 00:54:56,460 --> 00:54:58,480 Same kind of thing. 1057 00:54:58,480 --> 00:55:01,950 Now, in this case the failure mode, they want the girl, 1058 00:55:01,950 --> 00:55:05,640 but that's when you stop, so that's the failure mode. 1059 00:55:05,640 --> 00:55:07,210 And a working step is you have a boy 1060 00:55:07,210 --> 00:55:08,860 and you keep having the working steps until you have 1061 00:55:08,860 --> 00:55:10,570 failure mode and then you stop. 
1062 00:55:10,570 --> 00:55:12,630 And in this case, we're not counting 1063 00:55:12,630 --> 00:55:13,610 the step when you stop. 1064 00:55:13,610 --> 00:55:19,270 So we know from that that the expected value of R 1065 00:55:19,270 --> 00:55:22,620 is 1 over p, which is going to be 1/2. 1066 00:55:22,620 --> 00:55:26,420 That's the number of children you have. 1067 00:55:26,420 --> 00:55:30,170 And that equals the mean time to failure. 1068 00:55:30,170 --> 00:55:34,250 Minus the girl. 1069 00:55:34,250 --> 00:55:37,810 Because you count the girl as one of the children. 1070 00:55:37,810 --> 00:55:40,039 And that's going to be the expected number of boys. 1071 00:55:40,039 --> 00:55:41,580 So number of children you're expected 1072 00:55:41,580 --> 00:55:43,880 to have minus the girl. 1073 00:55:43,880 --> 00:55:44,770 This is 1 over 1/2. 1074 00:55:49,115 --> 00:55:53,330 And that is 2 minus 1 equals 1. 1075 00:55:53,330 --> 00:55:58,770 So you expect to have one boy before you get the girl. 1076 00:55:58,770 --> 00:56:02,930 Any questions on that? 1077 00:56:02,930 --> 00:56:06,180 OK, how about this? 1078 00:56:06,180 --> 00:56:09,840 Some couples want to have at least one of each sex. 1079 00:56:09,840 --> 00:56:13,530 They want to have at least one boy and one girl. 1080 00:56:13,530 --> 00:56:16,170 So they keep having children until they get one of each 1081 00:56:16,170 --> 00:56:18,170 and then they stop. 1082 00:56:18,170 --> 00:56:20,210 How many children do they expect to have? 1083 00:56:25,910 --> 00:56:27,580 Somebody said two. 1084 00:56:27,580 --> 00:56:30,820 That's a minimum number. 1085 00:56:30,820 --> 00:56:34,380 So it's not likely to be the expected number. 1086 00:56:34,380 --> 00:56:35,580 Three. 1087 00:56:35,580 --> 00:56:37,730 Well, OK, who said three? 1088 00:56:37,730 --> 00:56:39,030 Why do you think three? 
1089 00:56:39,030 --> 00:56:42,848 AUDIENCE: Because you have to have at least the first two, 1090 00:56:42,848 --> 00:56:45,820 so it can be greater than one. 1091 00:56:45,820 --> 00:56:48,302 I mean, one probability is greater than one child, 1092 00:56:48,302 --> 00:56:50,534 one probability is greater than two children. 1093 00:56:50,534 --> 00:56:55,080 And after that it's halves which add up to one. 1094 00:56:55,080 --> 00:56:56,330 PROFESSOR: Yeah, that's right. 1095 00:56:56,330 --> 00:56:57,163 That's a good proof. 1096 00:56:57,163 --> 00:56:58,000 Very good. 1097 00:56:58,000 --> 00:56:58,850 That'll work. 1098 00:56:58,850 --> 00:56:59,838 Yeah, yeah. 1099 00:56:59,838 --> 00:57:01,421 AUDIENCE: I have a question about what 1100 00:57:01,421 --> 00:57:02,383 you were doing before. 1101 00:57:02,383 --> 00:57:05,429 If you were to switch your expectations for a girl 1102 00:57:05,429 --> 00:57:07,674 and boy, wouldn't it come out to the same number 1103 00:57:07,674 --> 00:57:10,570 but wouldn't it kind of contradict itself? 1104 00:57:10,570 --> 00:57:12,940 PROFESSOR: No, if you stopped as soon as you had a boy, 1105 00:57:12,940 --> 00:57:15,780 you'd expect to have one girl before you got a boy. 1106 00:57:15,780 --> 00:57:17,057 Totally symmetric. 1107 00:57:17,057 --> 00:57:19,140 In this case, we're stopping when we get the girl, 1108 00:57:19,140 --> 00:57:21,560 so we expect to have one boy. 1109 00:57:21,560 --> 00:57:24,354 You might have none, you might have one, you might have two. 1110 00:57:24,354 --> 00:57:26,770 And as he mentions, if you put the probabilities in there, 1111 00:57:26,770 --> 00:57:28,000 they will work out the right way, 1112 00:57:28,000 --> 00:57:30,291 but we just did it simpler by the mean time to failure. 1113 00:57:32,840 --> 00:57:34,910 And in fact, there's also another proof 1114 00:57:34,910 --> 00:57:37,220 of what you're doing.
1115 00:57:37,220 --> 00:57:39,530 Another way to think about how many kids 1116 00:57:39,530 --> 00:57:41,400 you have to get one of each. 1117 00:57:41,400 --> 00:57:43,880 Well you have the first kid. 1118 00:57:43,880 --> 00:57:47,260 And now you have this problem, because you 1119 00:57:47,260 --> 00:57:49,200 want one of the other sex. 1120 00:57:49,200 --> 00:57:51,980 And you keep on trying to get one of the other sex 1121 00:57:51,980 --> 00:57:55,850 and you expect to have two children 1122 00:57:55,850 --> 00:57:57,090 to get one of the other sex. 1123 00:57:57,090 --> 00:57:58,420 So you have one to start. 1124 00:57:58,420 --> 00:57:59,760 Whatever it is doesn't matter. 1125 00:57:59,760 --> 00:58:01,690 And now you expect to have two more 1126 00:58:01,690 --> 00:58:03,280 until you hit the other sex. 1127 00:58:03,280 --> 00:58:05,970 So a total of three is the expected number of children 1128 00:58:05,970 --> 00:58:10,985 to get one boy and one girl, at least. 1129 00:58:10,985 --> 00:58:13,314 Any questions about expectation? 1130 00:58:15,980 --> 00:58:22,010 OK, let's do another example. 1131 00:58:22,010 --> 00:58:26,120 This comes up all the time in experimental work. 1132 00:58:26,120 --> 00:58:29,540 Probably all of you are going to have some example at some point 1133 00:58:29,540 --> 00:58:33,050 where you're going to do a problem like this. 1134 00:58:33,050 --> 00:58:37,020 And most all the time, people do it wrong. 1135 00:58:37,020 --> 00:58:39,520 So let's see an example. 1136 00:58:39,520 --> 00:58:44,390 Say that you want to measure the latency of some communications 1137 00:58:44,390 --> 00:58:46,410 channel. 1138 00:58:46,410 --> 00:58:49,490 And you want to know what's the average latency. 1139 00:58:49,490 --> 00:58:52,570 So you set up an experiment. 
1140 00:58:52,570 --> 00:58:54,180 You send a packet through the channel, 1141 00:58:54,180 --> 00:58:56,740 you measure when it started, when it got to the other end, 1142 00:58:56,740 --> 00:59:00,900 and you record the answer, and then you do it 100 times 1143 00:59:00,900 --> 00:59:02,940 and you take the average. 1144 00:59:02,940 --> 00:59:06,270 And you say that's the expected latency of the channel. 1145 00:59:06,270 --> 00:59:10,827 Very, very typical kind of thing to do, and sometimes OK to do. 1146 00:59:10,827 --> 00:59:12,160 So you'd do something like this. 1147 00:59:12,160 --> 00:59:15,690 You pick a random variable D, which 1148 00:59:15,690 --> 00:59:23,197 is going to denote the delay of a packet going 1149 00:59:23,197 --> 00:59:24,030 through the channel. 1150 00:59:28,560 --> 00:59:31,010 And there's, of course, some underlying distribution 1151 00:59:31,010 --> 00:59:35,990 here, which we'll denote by f of x. 1152 00:59:35,990 --> 00:59:38,500 And that's just the probability that D equals x. 1153 00:59:38,500 --> 00:59:40,510 That's the probability distribution function. 1154 00:59:46,450 --> 00:59:49,120 And as part of doing your stuff, you 1155 00:59:49,120 --> 00:59:52,110 notice that if I plot f here, 1156 00:59:52,110 --> 00:59:58,320 if I look at x on that axis and f of x here, 1157 00:59:58,320 --> 01:00:00,045 it looks something like this. 1158 01:00:02,910 --> 01:00:05,100 The chance of getting the observed 1159 01:00:05,100 --> 01:00:09,290 cases where I got a long delay were small, low probability. 1160 01:00:09,290 --> 01:00:11,890 And almost all the time I got a short delay. 1161 01:00:11,890 --> 01:00:14,540 So very typically you'll get a curve that looks something 1162 01:00:14,540 --> 01:00:17,430 like that in terms of the observations 1163 01:00:17,430 --> 01:00:20,600 of your experiment.
1164 01:00:20,600 --> 01:00:25,930 And say that you do your experiment 100 times 1165 01:00:25,930 --> 01:00:28,710 and the average latency was 10 milliseconds. 1166 01:00:31,887 --> 01:00:33,720 And sometimes you want to be really careful, 1167 01:00:33,720 --> 01:00:37,150 so you do the whole thing again, you do another 100 times. 1168 01:00:37,150 --> 01:00:39,190 And maybe it's nine milliseconds the next time. 1169 01:00:39,190 --> 01:00:41,280 And that sort of confirms your belief 1170 01:00:41,280 --> 01:00:43,250 that your first experiment was valid 1171 01:00:43,250 --> 01:00:45,430 and that the expected latency on this channel 1172 01:00:45,430 --> 01:00:48,510 is 10 milliseconds. 1173 01:00:48,510 --> 01:00:51,390 Sounds pretty good. 1174 01:00:51,390 --> 01:00:53,710 But it can be completely wrong. 1175 01:00:53,710 --> 01:00:55,390 And not just because you're unlucky, 1176 01:00:55,390 --> 01:00:57,870 but because you're taking a simple method like that. 1177 01:00:57,870 --> 01:01:04,400 And let me show you an example where it is way off. 1178 01:01:14,310 --> 01:01:17,210 All right, say you were just a little more 1179 01:01:17,210 --> 01:01:19,350 sophisticated about this. 1180 01:01:19,350 --> 01:01:22,010 And when you did your observations, 1181 01:01:22,010 --> 01:01:26,930 you tried to figure out what this curve really looks like. 1182 01:01:26,930 --> 01:01:29,090 And say that you, looking at the data, 1183 01:01:29,090 --> 01:01:33,330 concluded that the probability that you have a delay of more 1184 01:01:33,330 --> 01:01:37,610 than i milliseconds is 1 over i. 1185 01:01:37,610 --> 01:01:41,480 That looks like this curve, as close as anything. 1186 01:01:41,480 --> 01:01:45,234 So from your data, you conclude this.
1187 01:01:45,234 --> 01:01:47,150 That's a little more sophisticated conclusion, 1188 01:01:47,150 --> 01:01:49,300 because now you've identified what 1189 01:01:49,300 --> 01:01:51,560 you believe the distribution is, which 1190 01:01:51,560 --> 01:01:55,000 is stronger than expectation. 1191 01:01:55,000 --> 01:01:58,754 How would you go about figuring out the expected value 1192 01:01:58,754 --> 01:01:59,920 if you had that information? 1193 01:02:03,800 --> 01:02:06,380 I mean, you could average the 100 sample points. 1194 01:02:06,380 --> 01:02:10,550 But is there any way, if you assume this, what would you 1195 01:02:10,550 --> 01:02:14,680 do for the expected value? 1196 01:02:14,680 --> 01:02:16,500 Yeah, did I erase the theorem? 1197 01:02:16,500 --> 01:02:18,870 No, it's over there. 1198 01:02:18,870 --> 01:02:22,190 We just plug it into the theorem. 1199 01:02:22,190 --> 01:02:23,590 The expected delay is going to be 1200 01:02:23,590 --> 01:02:29,800 the sum of those probabilities from 1 to infinity. 1201 01:02:29,800 --> 01:02:32,760 So we would compute from the theorem 1202 01:02:32,760 --> 01:02:39,626 the expected delay is the sum, i equals 1 to infinity, of the probability. 1203 01:02:43,090 --> 01:02:44,804 Do I want to do the 0 case? 1204 01:02:44,804 --> 01:02:46,970 Want to be sure I don't get caught up in the 0 case. 1205 01:02:46,970 --> 01:02:48,560 We'll use the case 1 to infinity. 1206 01:02:48,560 --> 01:02:51,070 Probability D greater than or equal to i. 1207 01:02:51,070 --> 01:02:52,960 So let me put greater than or equal to here. 1208 01:02:56,740 --> 01:03:02,220 That equals the sum i equals 1 to infinity of 1 over i. 1209 01:03:02,220 --> 01:03:03,220 What's that? 1210 01:03:07,220 --> 01:03:11,630 What's the sum of 1 over i, i from 1 to infinity? 1211 01:03:11,630 --> 01:03:13,290 It's infinite. 1212 01:03:13,290 --> 01:03:15,830 Remember those harmonic numbers? 1213 01:03:15,830 --> 01:03:19,120 i going from 1 to n gives you about log n.
1214 01:03:19,120 --> 01:03:22,290 Remember that, the book stacking thing? 1215 01:03:22,290 --> 01:03:28,620 i going from 1 to infinity, that's infinity. 1216 01:03:28,620 --> 01:03:33,070 So in fact, your expected latency is infinite. 1217 01:03:33,070 --> 01:03:35,720 And you just published a paper saying it was 10 milliseconds, 1218 01:03:35,720 --> 01:03:36,450 maybe nine. 1219 01:03:41,020 --> 01:03:43,460 So it's very dangerous to just take a bunch of the points, 1220 01:03:43,460 --> 01:03:47,904 add them up, and average them and say that is the answer. 1221 01:03:47,904 --> 01:03:49,320 Especially if you have some reason 1222 01:03:49,320 --> 01:03:51,130 to believe the distribution looks like this 1223 01:03:51,130 --> 01:03:52,671 and it really is something like that. 1224 01:03:52,671 --> 01:03:54,844 It could be infinite. 1225 01:03:54,844 --> 01:03:57,010 Now in some cases, if your distribution is very well 1226 01:03:57,010 --> 01:03:58,970 behaved, averaging your sample points 1227 01:03:58,970 --> 01:04:01,320 is the perfect thing to do. 1228 01:04:01,320 --> 01:04:05,183 But it helps to know it's not necessarily the case that's 1229 01:04:05,183 --> 01:04:06,090 a good way to go. 1230 01:04:09,000 --> 01:04:10,518 Any ideas what went wrong? 1231 01:04:10,518 --> 01:04:11,018 Yeah. 1232 01:04:11,018 --> 01:04:12,184 AUDIENCE: I have a question. 1233 01:04:12,184 --> 01:04:14,290 What would happen if your probability were 1 over i squared? 1234 01:04:14,290 --> 01:04:15,956 PROFESSOR: Yeah, what would happen then? 1235 01:04:15,956 --> 01:04:19,700 If in fact, it was 1 over i squared. 1236 01:04:23,840 --> 01:04:26,457 So you need to do this sum. 1237 01:04:26,457 --> 01:04:28,290 Here's a good review question for the final. 1238 01:04:28,290 --> 01:04:30,110 What method do you use to estimate that? 1239 01:04:32,770 --> 01:04:35,360 Remember how we do that? 1240 01:04:35,360 --> 01:04:37,540 Is that infinite? 1241 01:04:37,540 --> 01:04:38,580 No.
1242 01:04:38,580 --> 01:04:41,250 You use the integration bound. 1243 01:04:41,250 --> 01:04:45,072 And you'll see that this is pretty small. 1244 01:04:45,072 --> 01:04:46,780 It'll be, what, 1 and a 1/2, 2, something 1245 01:04:46,780 --> 01:04:49,660 like that where we can do the integration method here. 1246 01:04:49,660 --> 01:04:51,890 So huge difference. 1247 01:04:51,890 --> 01:04:52,740 This is O of 1. 1248 01:04:52,740 --> 01:04:54,190 This is bounded. 1249 01:04:54,190 --> 01:04:58,760 Probably less than 2, if we did the integration method. 1250 01:04:58,760 --> 01:05:02,650 So huge difference here in what the outcome is. 1251 01:05:02,650 --> 01:05:07,120 Now, how can it be that I've got something 1252 01:05:07,120 --> 01:05:12,150 with expected infinite value and I average 100 points 1253 01:05:12,150 --> 01:05:14,180 and I got 10 milliseconds? 1254 01:05:14,180 --> 01:05:16,130 Yeah. 1255 01:05:16,130 --> 01:05:18,130 AUDIENCE: The infinite value comes from the fact 1256 01:05:18,130 --> 01:05:20,973 that there is a decent probability 1257 01:05:20,973 --> 01:05:23,228 that delay is going to be huge. 1258 01:05:23,228 --> 01:05:26,306 However, if you only take a finite number of sample points, 1259 01:05:26,306 --> 01:05:28,828 then chances are you're not going 1260 01:05:28,828 --> 01:05:30,790 to get any monstrous delays. 1261 01:05:30,790 --> 01:05:32,010 PROFESSOR: Exactly. 1262 01:05:32,010 --> 01:05:34,670 The chance of seeing anything beyond 100 milliseconds 1263 01:05:34,670 --> 01:05:37,886 is 1 in 100. 1264 01:05:37,886 --> 01:05:39,260 So I'm probably not going to see. 1265 01:05:39,260 --> 01:05:41,530 In a sample size of 100, almost surely 1266 01:05:41,530 --> 01:05:46,550 I won't see something that takes a second, 1,000 milliseconds. 1267 01:05:46,550 --> 01:05:49,470 But yet it's those rare sample points 1268 01:05:49,470 --> 01:05:52,620 that are causing that sum to blow up.
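The contrast between the two tail assumptions discussed above (probability 1 over i versus 1 over i squared) can be seen by truncating the sums at a finite cutoff. This is a sketch, not part of the lecture: the helper name `truncated_expectation` and the chosen cutoffs are assumptions made for illustration.

```python
import math


def truncated_expectation(tail_prob, cutoff):
    """Sum Pr[D >= i] for i = 1 .. cutoff, using an assumed tail function.

    With an infinite cutoff this would be the expected delay from the
    tail-sum theorem; a finite cutoff mimics never seeing a delay
    longer than `cutoff` milliseconds.
    """
    return sum(tail_prob(i) for i in range(1, cutoff + 1))


if __name__ == "__main__":
    for cutoff in (100, 10_000, 1_000_000):
        harmonic = truncated_expectation(lambda i: 1 / i, cutoff)
        squared = truncated_expectation(lambda i: 1 / i ** 2, cutoff)
        # The 1/i sum keeps growing roughly like ln(cutoff);
        # the 1/i**2 sum settles down and stays below 2.
        print(cutoff, harmonic, math.log(cutoff), squared)
```

The 1 over i column grows without bound as the cutoff increases, only slowly, which is why a sample of 100 points can look perfectly tame. The 1 over i squared column stays bounded, matching the "probably less than 2" estimate.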
1269 01:05:52,620 --> 01:05:56,150 If I sum that from 1 to 100, I get log of 10. 1270 01:05:56,150 --> 01:05:56,742 Pretty small. 1271 01:05:59,300 --> 01:06:04,082 So I get log of 100, which is pretty small. 1272 01:06:04,082 --> 01:06:06,290 So what's happening when you do your finite sample is 1273 01:06:06,290 --> 01:06:08,950 you're missing the big guys which 1274 01:06:08,950 --> 01:06:12,130 are very rare, but enough to blow up your expectation. 1275 01:06:12,130 --> 01:06:15,020 Now, you can draw two conclusions from that. 1276 01:06:15,020 --> 01:06:16,250 One of them is the one we just did. 1277 01:06:16,250 --> 01:06:17,640 The other is, well, expectation's 1278 01:06:17,640 --> 01:06:18,534 the wrong measure. 1279 01:06:18,534 --> 01:06:19,950 And really what we should be doing 1280 01:06:19,950 --> 01:06:23,939 is looking at only 1,000 sample points or something like that. 1281 01:06:23,939 --> 01:06:26,230 But in practice of using the thing over and over again, 1282 01:06:26,230 --> 01:06:28,480 eventually you're going to get hit with a whopper. 1283 01:06:28,480 --> 01:06:30,630 Sometimes you'll see people take data points out 1284 01:06:30,630 --> 01:06:32,730 when they're really the big ones. 1285 01:06:32,730 --> 01:06:34,520 They say, oh, well, that was an anomaly. 1286 01:06:34,520 --> 01:06:37,293 I take that out and then we compute the average. 1287 01:06:37,293 --> 01:06:37,792 Yeah? 1288 01:06:37,792 --> 01:06:39,458 AUDIENCE: How much would you pay to play 1289 01:06:39,458 --> 01:06:43,115 a game where the payoff would be the latency of the packet? 1290 01:06:43,115 --> 01:06:44,073 PROFESSOR: What's that? 1291 01:06:44,073 --> 01:06:45,614 AUDIENCE: How much do you pay to play 1292 01:06:45,614 --> 01:06:48,080 a game where the payoff would be the latency of the packet? 1293 01:06:48,080 --> 01:06:48,955 PROFESSOR: All right. 1294 01:06:51,720 --> 01:06:53,810 That's a big number.
1295 01:06:53,810 --> 01:06:57,170 If that was my losses here, that's tough. 1296 01:06:57,170 --> 01:07:00,500 I'd bet anything up against that. 1297 01:07:00,500 --> 01:07:02,730 To get a payoff of infinity? 1298 01:07:02,730 --> 01:07:05,260 You'd pay $1,000 to play that game. 1299 01:07:05,260 --> 01:07:08,040 Now, you'd want to play it for a long time 1300 01:07:08,040 --> 01:07:11,050 to get that big payoff, right? 1301 01:07:11,050 --> 01:07:13,420 But eventually that's what it's going to be. 1302 01:07:19,260 --> 01:07:19,860 Any questions? 1303 01:07:19,860 --> 01:07:23,040 People understand the issue here and what 1304 01:07:23,040 --> 01:07:25,650 to worry about when you all do that someday? 1305 01:07:25,650 --> 01:07:26,234 Yeah. 1306 01:07:26,234 --> 01:07:27,150 AUDIENCE: [INAUDIBLE]. 1307 01:07:33,350 --> 01:07:36,100 PROFESSOR: Yeah, now here you don't know for a fact 1308 01:07:36,100 --> 01:07:36,600 this is it. 1309 01:07:36,600 --> 01:07:38,640 But you could model and you start seeing these. 1310 01:07:38,640 --> 01:07:42,160 You fill in the points and you say, it looks like this, 1311 01:07:42,160 --> 01:07:45,304 let's assume that's what it was, then here's the result. 1312 01:07:45,304 --> 01:07:47,470 If it looks like it's 1 over i squared, you can say, 1313 01:07:47,470 --> 01:07:49,700 let's assume that's what it is and then you get 1314 01:07:49,700 --> 01:07:53,050 a different result. But you take the various cases 1315 01:07:53,050 --> 01:07:55,210 and consider them to do it. 1316 01:07:55,210 --> 01:07:59,200 Or you could say, I take the expectation assuming I never 1317 01:07:59,200 --> 01:08:02,270 get a point bigger than 100. 1318 01:08:02,270 --> 01:08:03,540 And I limit it that way. 1319 01:08:03,540 --> 01:08:05,242 And then you're safe at that point 1320 01:08:05,242 --> 01:08:06,450 and you can get away with it. 
1321 01:08:06,450 --> 01:08:08,241 But to blindly go out there and say here it 1322 01:08:08,241 --> 01:08:10,113 is, not so reliable. 1323 01:08:16,359 --> 01:08:20,410 The expected value does have a lot of useful properties. 1324 01:08:20,410 --> 01:08:25,510 And the most important is called linearity of expectation. 1325 01:08:25,510 --> 01:08:27,863 And we'll spend the rest of today and some of next time 1326 01:08:27,863 --> 01:08:28,571 talking about it. 1327 01:08:32,229 --> 01:08:34,910 And quite possibly it's one of the reasons people 1328 01:08:34,910 --> 01:08:37,899 use it so much instead of other things you might think about. 1329 01:08:53,020 --> 01:08:55,319 The theorem, and this may be the most important theorem 1330 01:08:55,319 --> 01:08:55,973 in probability. 1331 01:08:58,810 --> 01:09:13,420 For any random variables, R1 and R2, on a probability space S, 1332 01:09:13,420 --> 01:09:21,890 the expected value of R1 plus R2 equals the expected value of R1 1333 01:09:21,890 --> 01:09:26,290 plus the expected value of R2. 1334 01:09:26,290 --> 01:09:27,547 Very simple. 1335 01:09:27,547 --> 01:09:29,130 It's another way of saying expectation 1336 01:09:29,130 --> 01:09:32,080 is a linear function. 1337 01:09:32,080 --> 01:09:32,579 The proof. 1338 01:09:37,109 --> 01:09:38,330 No, skip the proof here. 1339 01:09:38,330 --> 01:09:39,981 It's not hard and it's in the text. 1340 01:09:43,760 --> 01:09:45,864 Follows pretty simply from the definition. 1341 01:09:53,250 --> 01:09:57,067 There's a generalization for more than two random variables. 1342 01:09:59,810 --> 01:10:00,997 So corollary. 1343 01:10:03,990 --> 01:10:12,390 For all k in the natural numbers and k random variables R1, 1344 01:10:12,390 --> 01:10:18,920 R2, up to Rk on the same probability space 1345 01:10:18,920 --> 01:10:28,090 S, the expected value of R1 plus R2 up to Rk is just 1346 01:10:28,090 --> 01:10:29,465 the sum of the expected values.
1347 01:10:36,220 --> 01:10:40,212 And the proof of that is by induction using that result. 1348 01:10:40,212 --> 01:10:41,920 And the really important thing about this 1349 01:10:41,920 --> 01:10:44,430 is that neither of these results needs independence. 1350 01:10:47,370 --> 01:10:52,764 It is true whether or not the Ri are independent. 1351 01:10:58,850 --> 01:11:01,190 Pretty much everything we do in probability 1352 01:11:01,190 --> 01:11:04,530 to manipulate random variables needs them to be independent. 1353 01:11:04,530 --> 01:11:06,050 You don't need that here. 1354 01:11:06,050 --> 01:11:08,412 And that'll make it very powerful. 1355 01:11:08,412 --> 01:11:09,370 So let's do an example. 1356 01:11:28,370 --> 01:11:32,446 Say I roll two fair dice. 1357 01:11:35,780 --> 01:11:36,720 Six sided dice. 1358 01:11:42,320 --> 01:11:45,360 Not necessarily independent. 1359 01:11:45,360 --> 01:11:48,730 R1 is the outcome on the first die. 1360 01:11:53,990 --> 01:12:00,740 And R2 will be the outcome on the second one. 1361 01:12:00,740 --> 01:12:04,680 And I'm interested in the sum of the dice. 1362 01:12:04,680 --> 01:12:08,470 So with that R it'd be R1 plus R2. 1363 01:12:08,470 --> 01:12:12,062 And I want to know the expected value of the sum of the dice 1364 01:12:12,062 --> 01:12:12,770 when I roll them. 1365 01:12:16,280 --> 01:12:20,260 Now, if I didn't use that theorem 1366 01:12:20,260 --> 01:12:22,740 I'd compute the tree in the sample space. 1367 01:12:22,740 --> 01:12:27,360 I'd get 36 possible outcomes, take the probability of each. 1368 01:12:27,360 --> 01:12:29,210 It'd take you a little while to do it, 1369 01:12:29,210 --> 01:12:33,320 but using linearity of expectation, it's easy. 1370 01:12:33,320 --> 01:12:38,070 It's expected value of R1 plus the expected value of R2. 1371 01:12:38,070 --> 01:12:43,230 Each of these we already figured out is 3 and 1/2. 1372 01:12:43,230 --> 01:12:46,330 And so the answer is 7. 
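The two-dice calculation above can be checked exactly, including a case where the dice are not independent. This is a sketch under stated assumptions: the helper `expectation` and the "glued" distribution, where the second die always copies the first, are made up for illustration and are not from the lecture.

```python
from fractions import Fraction
from itertools import product


def expectation(dist, value):
    """Weighted average of value(outcome) over a dict {outcome: probability}."""
    return sum(prob * value(outcome) for outcome, prob in dist.items())


# Two independent fair dice: all 36 pairs equally likely.
independent = {
    (a, b): Fraction(1, 36) for a, b in product(range(1, 7), repeat=2)
}

# Two fair but completely dependent dice (assumed example):
# the second die always shows the same face as the first.
glued = {(a, a): Fraction(1, 6) for a in range(1, 7)}

if __name__ == "__main__":
    for name, dist in (("independent", independent), ("glued", glued)):
        first = expectation(dist, lambda o: o[0])
        second = expectation(dist, lambda o: o[1])
        total = expectation(dist, lambda o: o[0] + o[1])
        # Linearity says total == first + second in both cases.
        print(name, first, second, total)
```

In both distributions each die on its own is fair, so each has expected value 7/2, and the expected sum is 7 even though the glued dice can only ever total an even number.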
1373 01:12:46,330 --> 01:12:49,170 So if you roll a pair of dice, whether or not 1374 01:12:49,170 --> 01:12:52,721 they are independent, the expected sum is seven. 1375 01:12:55,370 --> 01:12:59,270 Any questions about linearity of expectation? 1376 01:12:59,270 --> 01:13:00,122 Yeah. 1377 01:13:00,122 --> 01:13:01,038 AUDIENCE: [INAUDIBLE]. 1378 01:13:08,750 --> 01:13:10,550 PROFESSOR: This one? 1379 01:13:10,550 --> 01:13:11,334 [INAUDIBLE] 1380 01:13:11,334 --> 01:13:12,250 AUDIENCE: [INAUDIBLE]. 1381 01:13:14,950 --> 01:13:16,530 PROFESSOR: No, I mean pluses here. 1382 01:13:16,530 --> 01:13:18,230 Here? 1383 01:13:18,230 --> 01:13:20,413 I'm computing the sum of the random variables. 1384 01:13:23,180 --> 01:13:27,180 So this could be a 1, that could be a 10, this could be a 3. 1385 01:13:27,180 --> 01:13:29,610 So I'm computing the expected value of the sum, 1386 01:13:29,610 --> 01:13:31,770 just like when I rolled two dice, I'm 1387 01:13:31,770 --> 01:13:36,328 taking the expected value of the sum of the dice. 1388 01:13:36,328 --> 01:13:37,161 Any other questions? 1389 01:13:43,850 --> 01:13:47,610 Now we're going to do a little trickier problem that 1390 01:13:47,610 --> 01:13:49,562 uses linearity of expectation. 1391 01:13:53,250 --> 01:13:54,970 Yeah? 1392 01:13:54,970 --> 01:13:57,940 AUDIENCE: [INAUDIBLE] sets, like when 1393 01:13:57,940 --> 01:14:01,845 we're looking at the failure. 1394 01:14:01,845 --> 01:14:06,208 Couldn't we add the two cases, like R1 being 1395 01:14:06,208 --> 01:14:08,650 a girl and R2 being a boy? 1396 01:14:08,650 --> 01:14:14,924 PROFESSOR: Well, what would it mean to sum-- so 1397 01:14:14,924 --> 01:14:16,090 what is the random variable? 1398 01:14:16,090 --> 01:14:17,820 R1 is the case when you get a boy. 1399 01:14:17,820 --> 01:14:19,940 So it's an indicator for getting a boy. 1400 01:14:19,940 --> 01:14:22,790 R2 is the indicator for getting a girl.
1401 01:14:22,790 --> 01:14:25,310 R1 plus R2 is by definition 1 then, 1402 01:14:25,310 --> 01:14:26,760 because you got a boy or a girl. 1403 01:14:26,760 --> 01:14:28,330 One of them had to happen. 1404 01:14:28,330 --> 01:14:29,920 And the expected value would be 1. 1405 01:14:29,920 --> 01:14:32,690 But that's a different kind of game. 1406 01:14:32,690 --> 01:14:36,440 But we are going to start using this in sophisticated ways 1407 01:14:36,440 --> 01:14:39,305 to make calculations easier for things like that. 1408 01:14:43,680 --> 01:14:44,180 So 1409 01:14:44,180 --> 01:14:46,230 this problem we call the hat check problem. 1410 01:14:51,560 --> 01:14:54,250 And the idea behind this is, say 1411 01:14:54,250 --> 01:14:59,350 that you have n men at a restaurant having dinner. 1412 01:14:59,350 --> 01:15:01,110 And when they come into the restaurant, 1413 01:15:01,110 --> 01:15:04,360 they check their hats in the coat room. 1414 01:15:04,360 --> 01:15:06,790 Then something goes wrong in the coat room 1415 01:15:06,790 --> 01:15:10,200 and the hats are all scrambled up randomly. 1416 01:15:10,200 --> 01:15:13,010 Let's say a random permutation of the hats. 1417 01:15:13,010 --> 01:15:15,480 So the men come back to get their hats 1418 01:15:15,480 --> 01:15:19,370 and they get a random hat coming back. 1419 01:15:19,370 --> 01:15:31,220 So each man gets a random hat back after dinner. 1420 01:15:31,220 --> 01:15:35,520 And the question is, what is the expected number of men 1421 01:15:35,520 --> 01:15:37,331 to get the right hat back? 1422 01:15:39,980 --> 01:15:41,840 So we let R be the random variable 1423 01:15:41,840 --> 01:15:45,590 that counts the number of men to get 1424 01:15:45,590 --> 01:15:49,162 the right hat, their own hat. 1425 01:15:52,000 --> 01:15:57,140 And we want to know the expected value of R. 1426 01:15:57,140 --> 01:15:58,850 Now, from the definition, that's just 1427 01:15:58,850 --> 01:16:01,570 the sum over all possibilities.
1428 01:16:01,570 --> 01:16:04,556 K from 1 to n. 1429 01:16:04,556 --> 01:16:09,100 K times the probability R equals k. 1430 01:16:09,100 --> 01:16:12,330 So using one of the definitions, we 1431 01:16:12,330 --> 01:16:14,310 could compute the expected value this way. 1432 01:16:14,310 --> 01:16:17,820 In fact, if you were to be assigned this on a test 1433 01:16:17,820 --> 01:16:19,770 or on homework, that's probably how you'd 1434 01:16:19,770 --> 01:16:21,010 start something like that. 1435 01:16:23,560 --> 01:16:25,994 And then, well, the next step you'd take 1436 01:16:25,994 --> 01:16:28,410 would be to figure out what's the probability that exactly 1437 01:16:28,410 --> 01:16:32,980 K men get the right hat back. 1438 01:16:32,980 --> 01:16:35,280 In fact, we actually asked this once before we 1439 01:16:35,280 --> 01:16:37,430 started doing it in class. 1440 01:16:37,430 --> 01:16:40,170 And it was really hard if you went down this path. 1441 01:16:40,170 --> 01:16:44,230 Because if you spent all night with your buddies, 1442 01:16:44,230 --> 01:16:46,780 you would maybe get to the conclusion 1443 01:16:46,780 --> 01:16:52,330 that probability R equals K is this. 1444 01:16:52,330 --> 01:16:59,800 1 over K factorial times n minus K down here for K less than 1445 01:16:59,800 --> 01:17:02,680 or equal to n minus 2. 1446 01:17:02,680 --> 01:17:10,130 And 1 over n factorial if K equals n minus 1 or n. 1447 01:17:10,130 --> 01:17:14,590 Then you would plug that nasty looking thing into there. 1448 01:17:14,590 --> 01:17:18,330 So multiply by K. And you'd have to sum it up. 1449 01:17:18,330 --> 01:17:19,780 And Lord help you. 1450 01:17:19,780 --> 01:17:22,000 That's just a nightmare to do. 1451 01:17:22,000 --> 01:17:24,640 You'd have a very hard time getting the answer. 1452 01:17:24,640 --> 01:17:27,980 If you doubt that, try it. 1453 01:17:27,980 --> 01:17:29,964 But that would be a natural way to proceed. 
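For small n, the definitional route described above can be carried out by machine instead of by hand. The following is a rough sketch, assuming a uniformly random permutation of the hats: it enumerates every permutation, tabulates the probability that exactly k men get their own hat, and then sums k times that probability. The function names `fixed_point_distribution` and `expectation_from_definition` are hypothetical. The enumeration takes n factorial steps, so it is only practical for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def fixed_point_distribution(n):
    """Return {k: Pr[R = k]}, where R counts the men who get their own hat
    under a uniformly random permutation of n hats."""
    counts = {}
    for perm in permutations(range(n)):
        # Man i gets hat perm[i]; it is the right hat when perm[i] == i.
        k = sum(1 for man, hat in enumerate(perm) if man == hat)
        counts[k] = counts.get(k, 0) + 1
    total = factorial(n)
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}

def expectation_from_definition(dist):
    """Sum of k * Pr[R = k] over all values k that R can take."""
    return sum(k * p for k, p in dist.items())

for n in range(1, 7):
    dist = fixed_point_distribution(n)
    print(n, dist, expectation_from_definition(dist))
```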
1454 01:17:32,790 --> 01:17:35,430 But it turns out there is a trivial way to get the answer. 1455 01:17:35,430 --> 01:17:37,940 And this is a very powerful technique 1456 01:17:37,940 --> 01:17:40,200 using linearity of expectation. 1457 01:17:40,200 --> 01:17:43,590 And for sure there will be a problem on the final exam just 1458 01:17:43,590 --> 01:17:45,790 like this. 1459 01:17:45,790 --> 01:17:49,620 And so if you go down this path, which is a natural first path, 1460 01:17:49,620 --> 01:17:53,630 it may take you the rest of the day to solve it. 1461 01:17:53,630 --> 01:17:55,130 But the method I'm going to show you 1462 01:17:55,130 --> 01:17:57,407 will take you a couple of minutes to do it. 1463 01:18:02,370 --> 01:18:05,070 Now, the trick is to use linearity of expectation. 1464 01:18:05,070 --> 01:18:09,610 So the problem is there's no sum here. 1465 01:18:09,610 --> 01:18:11,960 So what we need to do is we're going to express R 1466 01:18:11,960 --> 01:18:16,300 as the sum of random variables. 1467 01:18:16,300 --> 01:18:22,740 And the way we're going to do that is as follows. 1468 01:18:22,740 --> 01:18:24,540 And it's not obvious, but once you see it, 1469 01:18:24,540 --> 01:18:26,094 it's easy to keep using it. 1470 01:18:28,700 --> 01:18:35,825 We let R be the sum R1 plus R2 and so on up to Rn. 1471 01:18:35,825 --> 01:18:40,075 And Ri is going to tell us the event. 1472 01:18:40,075 --> 01:18:42,570 This is sort of what you were talking 1473 01:18:42,570 --> 01:18:45,170 about before with the event of a boy or event of a girl. 1474 01:18:45,170 --> 01:18:47,580 This is going to be the event that the ith man 1475 01:18:47,580 --> 01:18:50,830 gets his right hat back. 1476 01:18:50,830 --> 01:18:52,710 So it's an indicator random variable. 1477 01:18:52,710 --> 01:18:59,610 It's 1 if the ith man gets the right hat. 1478 01:19:02,170 --> 01:19:04,110 And it's 0 if he doesn't.
1479 01:19:10,060 --> 01:19:14,700 So whenever the ith guy gets his hat back, that counts as 1, 1480 01:19:14,700 --> 01:19:17,095 and now you can see why this sum works. 1481 01:19:17,095 --> 01:19:21,110 R is the number of men to get the right hat back. 1482 01:19:21,110 --> 01:19:24,210 And basically there's a 1 counted in here every time 1483 01:19:24,210 --> 01:19:28,740 a guy gets his hat back, and 0 if he doesn't. 1484 01:19:28,740 --> 01:19:33,276 So this sum is counting how many men got their right hat back. 1485 01:19:36,280 --> 01:19:38,980 Non-obvious the first time, gets really simple 1486 01:19:38,980 --> 01:19:41,420 the fourth or fifth time. 1487 01:19:41,420 --> 01:19:43,896 We'll try to do a couple of them today. 1488 01:19:48,290 --> 01:19:52,505 All right, now the expected value of R is easy. 1489 01:19:55,230 --> 01:19:57,270 By linearity of expectation, it's just the 1490 01:19:57,270 --> 01:20:00,590 expected value of R1 and so forth 1491 01:20:00,590 --> 01:20:01,890 plus the expected value of Rn. 1492 01:20:05,310 --> 01:20:10,150 The expected value of an indicator random variable 1493 01:20:10,150 --> 01:20:13,350 is just the probability that it's 1. 1494 01:20:13,350 --> 01:20:15,040 Right? It's 1 times the probability it's 1, 1495 01:20:15,040 --> 01:20:17,484 plus 0 times the probability it's 0. 1496 01:20:17,484 --> 01:20:19,150 That's just the probability that it's 1. 1497 01:20:28,650 --> 01:20:31,680 What's the probability that the first man gets the right hat 1498 01:20:31,680 --> 01:20:34,400 back? 1499 01:20:34,400 --> 01:20:35,080 1 over n. 1500 01:20:35,080 --> 01:20:35,730 There's n hats. 1501 01:20:35,730 --> 01:20:38,820 He gets a random one. 1502 01:20:38,820 --> 01:20:41,607 What's the probability that the second man gets his hat back? 1503 01:20:44,870 --> 01:20:47,980 Not 1 over n minus 1. 1504 01:20:47,980 --> 01:20:50,760 He's coming in whether he's first, second, or last.
1505 01:20:50,760 --> 01:20:53,630 He gets a random hat. 1506 01:20:53,630 --> 01:20:55,310 1 in n chance it's his. 1507 01:20:59,340 --> 01:21:04,660 What's the probability the last man gets the right hat back? 1508 01:21:04,660 --> 01:21:05,200 1 over n. 1509 01:21:05,200 --> 01:21:07,033 Doesn't really matter if he's first or last. 1510 01:21:07,033 --> 01:21:09,030 That just sort of trips you up a little bit. 1511 01:21:09,030 --> 01:21:11,480 He's getting a random hat back. 1512 01:21:11,480 --> 01:21:12,470 So it's 1 over n. 1513 01:21:15,330 --> 01:21:17,880 I've got n of them, each 1 over n, so what's 1514 01:21:17,880 --> 01:21:21,170 the expected number of men to get the right hat back? 1515 01:21:21,170 --> 01:21:23,290 1. 1516 01:21:23,290 --> 01:21:26,680 Now, the math doesn't get much easier than that. 1517 01:21:26,680 --> 01:21:29,480 Now, the amazing thing is we just 1518 01:21:29,480 --> 01:21:34,730 proved that if we take that mess, stick it in here 1519 01:21:34,730 --> 01:21:38,610 and sum it up, what answer do we get? 1520 01:21:38,610 --> 01:21:39,900 1. 1521 01:21:39,900 --> 01:21:42,310 That is certainly not obvious, but that is a consequence 1522 01:21:42,310 --> 01:21:44,240 of everything we've just done. 1523 01:21:44,240 --> 01:21:47,560 We've just given a probability proof of that fact. 1524 01:21:47,560 --> 01:21:50,170 But the nice thing here is this is actually 1525 01:21:50,170 --> 01:21:52,920 even more powerful. 1526 01:21:52,920 --> 01:21:56,870 Did I need to assume that it was a random permutation of hats 1527 01:21:56,870 --> 01:21:58,600 like I would need to assume for this? 1528 01:22:03,180 --> 01:22:06,040 No, independence is not needed. 1529 01:22:06,040 --> 01:22:09,350 In fact, there's all sorts of distributions that will 1530 01:22:09,350 --> 01:22:11,140 give the same expected value. 1531 01:22:11,140 --> 01:22:14,220 All I need is that each person gets the right hat back 1532 01:22:14,220 --> 01:22:17,246 with probability 1 over n.
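Written out, the hat check argument above is a short chain of equalities. This is a sketch in the lecture's Ex notation, where R_i is the indicator that the ith man gets his own hat. The only assumption used is that each Pr(R_i = 1) equals 1/n.

```latex
\begin{align*}
\operatorname{Ex}(R)
  &= \operatorname{Ex}(R_1 + R_2 + \cdots + R_n) \\
  &= \operatorname{Ex}(R_1) + \operatorname{Ex}(R_2) + \cdots + \operatorname{Ex}(R_n)
     && \text{(linearity; no independence needed)} \\
  &= \Pr(R_1 = 1) + \Pr(R_2 = 1) + \cdots + \Pr(R_n = 1)
     && \text{(each $R_i$ is an indicator)} \\
  &= n \cdot \frac{1}{n} = 1.
\end{align*}
```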
1533 01:22:17,246 --> 01:22:19,960 In fact, here is an example of a different distribution 1534 01:22:19,960 --> 01:22:22,390 for which the result is the same. 1535 01:22:22,390 --> 01:22:24,459 Say you're at a Chinese restaurant. 1536 01:22:24,459 --> 01:22:26,250 And you know they have the thing that spins 1537 01:22:26,250 --> 01:22:28,640 in the middle of the table? 1538 01:22:28,640 --> 01:22:30,380 Say that you each order an appetizer. 1539 01:22:30,380 --> 01:22:32,850 There's n people and they're around a big circular table. 1540 01:22:32,850 --> 01:22:35,110 Everybody gets their appetizer. 1541 01:22:35,110 --> 01:22:36,930 Wonton soup, whatever. 1542 01:22:36,930 --> 01:22:40,540 And then there's always the joker who spins the thing 1543 01:22:40,540 --> 01:22:44,560 and it spins around and then it stops 1544 01:22:44,560 --> 01:22:48,480 and now you've got a random appetizer in front of you. 1545 01:22:48,480 --> 01:22:50,930 In this case, we want to know what's 1546 01:22:50,930 --> 01:22:53,920 the expected number of people to get the right appetizer back. 1547 01:22:53,920 --> 01:22:58,460 Not the other guy's wonton soup, but yours. 1548 01:22:58,460 --> 01:23:00,120 That's a different probability space 1549 01:23:00,120 --> 01:23:02,362 because there's only n sample points, n places where 1550 01:23:02,362 --> 01:23:03,570 the thing could have stopped. 1551 01:23:03,570 --> 01:23:08,390 Not n factorial like the hats. 1552 01:23:08,390 --> 01:23:12,670 Well, does the analysis change? 1553 01:23:12,670 --> 01:23:14,660 No, it's exactly the same. 1554 01:23:14,660 --> 01:23:17,320 Ri is the indicator variable for the event that the ith person 1555 01:23:17,320 --> 01:23:20,200 gets the right appetizer back. 1556 01:23:20,200 --> 01:23:22,650 Linearity of expectation. 1557 01:23:22,650 --> 01:23:24,580 The expected value of the indicator variable 1558 01:23:24,580 --> 01:23:26,750 is just the probability that it's 1.
1559 01:23:26,750 --> 01:23:28,210 And the probability that any person 1560 01:23:28,210 --> 01:23:31,010 gets the right appetizer is 1 over n. 1561 01:23:31,010 --> 01:23:32,150 So the answer is the same. 1562 01:23:32,150 --> 01:23:34,483 The expected number of people to get the right appetizer 1563 01:23:34,483 --> 01:23:36,400 back is 1. 1564 01:23:36,400 --> 01:23:40,130 So totally different probability spaces, exactly 1565 01:23:40,130 --> 01:23:43,840 the same analysis and answer. 1566 01:23:43,840 --> 01:23:47,260 OK, so we'll do more of this next time.
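The spinning-table version has only n sample points, so it is easy to check directly from the definition. The following is a minimal sketch, assuming the turntable stops at one of n positions with equal probability and that a stop at offset `shift` puts dish `(seat + shift) % n` in front of the diner in `seat`. The function name `expected_matches_after_spin` and this seat-numbering convention are hypothetical.

```python
from fractions import Fraction

def expected_matches_after_spin(n):
    """Expected number of diners facing their own dish when the turntable
    stops at one of n equally likely offsets (an assumption of this sketch)."""
    total = Fraction(0)
    for shift in range(n):  # one sample point per stopping position
        # A diner has the right dish only when the offset maps the seat to itself.
        matches = sum(1 for seat in range(n) if (seat + shift) % n == seat)
        total += Fraction(matches, n)  # each stopping position has probability 1/n
    return total

for n in range(1, 9):
    print(n, expected_matches_after_spin(n))
```

Here either everyone has the right dish (offset 0) or nobody does, which is very different from the hat check distribution, yet the weighted average works out the same way.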