The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Last time we defined the expected value of a random variable. And we talked about a lot of ways it could be computed. We proved all sorts of equivalent definitions. Today, we're going to keep talking about expectation. And we're going to start with an example about the expected number of events that occur. And it's a generalization of what we did with the Chinese appetizer and hat check problems from last time.

We're going to call this theorem 1. If you have a probability space S, and you've got a collection of n events, let's call them A1 through An, and these are just subsets of S, then the expected number of these events to occur is simply the sum from i equals 1 to n of the probability of the i-th event occurring. So you just sum up the probabilities that the events occur. And that tells you the expected number of events that will occur. So a very simple formula.

So for example, Ai might be the event that the i-th man gets the right hat back, from last time. Or it could be the event that the i-th person gets the right appetizer back at the Chinese restaurant after we spin the wheel in the center.

So we're going to prove this. And the proof is very similar to what we did last time when we figured out the expected number of people to get the right hat back. In particular, we're going to start by setting up an indicator variable, T sub i, that tells us whether or not the i-th event, A sub i, occurs. So we define T sub i-- and it's a function of a sample point-- to be 1 if the sample point is in the i-th event, meaning the i-th event occurs, and 0 otherwise. And this is just another way of saying that T sub i is 1 if and only if A sub i happens or occurs.
Now what we care about is the number of events that occur. And we get that just by summing up the T sub i. So we'll let T be T1 plus T2 plus all the way out to Tn. And that'll count, because we'll get a 1 every time an event occurs. By adding those up, we'll get the number of events that occur.

All right. Now we care about the expected value of T, the expected number of events to occur. And I claim that's just the sum from i equals 1 to n of the expected value of Ti. Why is that true? Why is the expected value of T the sum of the expected values of the T sub i? Linearity of expectation. Now did we need the T sub i's to be independent events for that? No. OK, very good.

Now the expected value of Ti is really easy to evaluate. It's just a 0-1 variable. So it's just the probability that Ti is 1, because it's 1 times this plus 0 times the probability of 0, and that cancels out. And the event that Ti equals 1 is just the situation where the i-th event occurs, because Ti equals 1 is the case that Ai occurs. That's what it is.

So we've shown that the expected number of events to occur is simply the sum of the probabilities that the events occur. So a very simple formula, very handy. And you don't need independence for that. Any questions about that? We're going to use that theorem a lot today.

As a simple example, suppose we flip n fair coins. And we let Ai be the event that the i-th coin is heads. And suppose we want to know the expected number of heads in the n flips. Well, we can use this theorem. The expected number of heads is just going to be the sum of the probabilities that each coin is a heads. So let's do that. T is the number of heads. We want to know the expected value of T. And from theorem 1, that's just the probability the first coin is heads, plus the probability the second coin is heads,
and so on, out to the probability the n-th coin is heads. What's the probability the first coin is heads? A 1/2. The probability the second coin is heads is a 1/2. The probability the last coin is heads is 1/2. And so the expected number of heads-- we add up 1/2 n times-- is just n/2. Of course, you all knew that. If you flip n fair coins, the expected number of heads is half of them. But that's a very simple way to prove it.

Did we need the coin tosses to be mutually independent to conclude that? No. I could've glued them together in some weird way. In fact, I could have glued some face up and some face down and done weird things, and you still expect n/2 heads even if they were glued together in strange ways. Because I don't need independence for linearity of expectation to prove this.

Now, that's the easy way to evaluate the expected number of heads. There is a hard way to do it. Let me set that up. We could start from the definition-- a different definition-- namely, that the expected value of T is the sum from i equals 0 to n of i times the probability that you have i heads. This would be a natural way to compute the expected number of heads. You add up the case where there's zero heads times the probability of 0, 1 times the probability of one head, 2 times the probability of two heads, and so forth. That's one of the first definitions of expectation.

So let's keep trying to do this. What is the probability of getting i heads? And now I'm going to have to assume mutual independence, actually. Now I'm going to need mutual independence. So already, this method isn't as good, because I had to make that assumption to answer this question. If you don't make that assumption, you can't answer that question. What's the probability of getting i heads out of n? Yeah?

AUDIENCE: 2 to the minus n-th power times n [INAUDIBLE] i.
PROFESSOR: Yes, because if you look at the sample space, there are 2 to the n sample points, all equally likely. They're all probability 2 to the minus n. And there's n choose i of them that have i heads. And now you'd have to evaluate that sum, which is sort of a pain. That one won't come to mind readily. So you might say I've reached sort of a dead end here.

But in fact, the answer is easy to get using the first method. In fact, we've actually proved an identity here, because we know the answer is n/2. We've just proved that this messy thing is n/2. In fact, you can multiply by 2 to the n here. We have proved, using probability theory and theorem 1 over there, that the sum over i of i times n choose i equals n times 2 to the n minus 1. Just multiply by 2 to the n on each side. So we've given a probability-based proof of this identity, which is sort of hard to do otherwise-- could be harder to do. Any questions about that?

So again, if it comes time for a homework problem or a test problem, if it naturally divides up in this way where you can take a random variable whose expectation you've got to compute and make it a sum of indicator variables-- if that is a natural thing to do, do it that way. Because it's so much easier than trying to go from the definition, because you might encounter nasty things like that that you've got to evaluate.

So in this case, we flipped n coins. We expect n/2 heads. In the hat check and Chinese appetizer problems-- we had n hats or n appetizers-- we expected to get one back to the right person, so a smaller expected value. For some problems, the expected value is even less. The expected number of events to occur is less than 1. In fact, it might be much less than 1.
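As a quick sketch of theorem 1 on dependent events, here is a small simulation of the hat-check problem (the function name and trial count are just illustrative): each event "person i gets their own hat back" has probability 1/n, so the expected number of matches should come out near n times 1/n, which is 1, even though the events are far from independent.

```python
import random

def expected_matches(n, trials=100_000):
    """Estimate the expected number of people who get their own hat back
    when n hats are handed back in a uniformly random order."""
    total = 0
    for _ in range(trials):
        hats = list(range(n))
        random.shuffle(hats)
        # Event A_i: person i receives hat i.
        total += sum(1 for i, hat in enumerate(hats) if hat == i)
    return total / trials

# Theorem 1 says the answer is sum of P(A_i) = n * (1/n) = 1,
# even though the events are heavily dependent.
print(expected_matches(10))   # prints a number close to 1.0
```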
Now in those cases, it turns out that the expected value is an upper bound on the probability that one or more events occur. We're going to state this as theorem 2. The probability that at least one event occurs is always upper bounded by the expected number of events to occur. Now this theorem is pretty useless if the expected number of events to occur is bigger than 1, because all probabilities are at most 1. But if the expected number of events to occur is small, something much less than 1, this is a pretty useful bound.

So let's prove that. The expected value of T-- and in this case we'll use one of the definitions we have of expected value, namely the one where you sum, from i equals 1 to infinity, the probability that T is greater than or equal to i. Now what did we have to know about T to use that definition? That doesn't work for all random variables T. What condition do I have on T to be able to use this one? Anybody remember? Yeah.

AUDIENCE: Must be defined on the natural numbers?

PROFESSOR: T must be defined on the natural numbers. If it is, I can use this very simple definition. And is T defined on the natural numbers here? Is the range of T natural numbers? Well, I'm counting the number of events that occurred. It could be 0, 1, 2, 3, 4. It has to be a natural number. So it's OK. So I can use this definition.

Now this is summing up the probability T is at least 1, plus the probability T is at least 2, and so forth. I'm just going to use the first term. This is at least the size of the first term, because probabilities are non-negative. So I'm just going to throw away all the terms after the first and conclude that this is at least the probability T is bigger than or equal to 1. And I'm done. I just look at it in reverse. The probability of at least one event occurring is at most the expected value. Very simple.

There's a very quick corollary here. The probability that at least one event occurs is at most the sum of the probabilities of the events. And the proof there is just plugging in theorem 1, because the expected value is the sum of the probabilities. So we just plug in theorem 1 for the expected value, because it's just that.
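As a minimal worked check of theorems 1 and 2 together, here is a made-up example (my own, for illustration) with three overlapping events driven by a single die roll, so they are clearly dependent: the probability that at least one occurs still sits below the expected number of events to occur.

```python
from fractions import Fraction

# One fair die roll drives three overlapping (hence dependent) events:
# A1 = "roll is at least 5", A2 = "roll is even", A3 = "roll is a 6".
outcomes = range(1, 7)
p = Fraction(1, 6)

def T(roll):
    """Number of the three events that occur on this roll."""
    return (roll >= 5) + (roll % 2 == 0) + (roll == 6)

expected_T = sum(T(r) * p for r in outcomes)            # theorem 1: 1/3 + 1/2 + 1/6 = 1
p_at_least_one = sum(p for r in outcomes if T(r) >= 1)  # exact P(T >= 1)

print(p_at_least_one, expected_T)   # 2/3 and 1: the probability is at most the expectation
```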
Any questions about the proof? Very simple.

Now this theorem is very useful in situations where you're trying to upper bound the probability of some kind of disaster or something bad happening. For example, suppose you want to compute the probability that a nuclear plant melts down. Now actually, the government does this. They've got to figure this thing out, because if it's a high probability, well, we're not going to allow anybody to build them. And the way they do it is they convene a panel of experts. They get some people from various good universities and they bring them down to Washington. And they have them figure out every way they can think of that a meltdown could occur, every possible event that would lead to a meltdown. And then they have them figure out the probability for each one of those events. For example, A1 could be the event that the operator goes crazy and causes a meltdown. A2 could be the event that an earthquake hits and the cooling pipes are ruptured, and then you've got a meltdown. A3 is the event that terrorists shoot their way in and cause a meltdown. So you've got a lot of possibilities for how the thing can melt down. And then they compute the probabilities. And then they add them up, just using this result. And they say, well, the probability that a meltdown-causing event-- one or more-- occurs is at most this small number. And hopefully it's small.

So for example, suppose there's 100 ways that a meltdown could occur, 100 things that could cause a meltdown. And each one happens with probability one in a million. What can you say about the probability that a meltdown occurs? You've got only a hundred ways it could happen. Each is a one in a million chance. What's the probability a meltdown occurs? 1 in 10,000, because you're adding up one in a million 100 times. n is 100. Each of these is one in a million. So you get 100 over a million. That's 1 in 10,000.
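As a one-line version of that arithmetic (using the hypothetical numbers from the example):

```python
# 100 hypothetical ways to melt down, each with probability one in a million.
p_events = [1e-6] * 100

# Theorem 2 plus theorem 1: P(some meltdown event occurs) <= sum of the probabilities.
print(sum(p_events))   # about 1e-4, i.e. at most 1 in 10,000
```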
And so then they publish a report that says the chance of this reactor melting down is 1 in 10,000. Now what if I've got 100 reactors? What's the chance that at least one of them melts down? 1 in 100, because I've got 100 over 10,000-- same theorem. So there's a 1 in 100 chance something melts down somewhere, at most. Hopefully, the numbers are better than that.

Same thing if you bought 100 lottery tickets, each a one in a million chance: you've got a 1 in 10,000 chance of winning, at most. So a simple fact, but powerful and used a lot in practice.

And this is sort of the good case, when the expected number of bad events to happen is small, like a lot less than 1. But what if the expected number of events to happen is big? Say it's 10. Say this government panel gets together and they add up all the probabilities and it comes out to be 10. Well, it doesn't sound so good if that's the case. But is it necessarily bad? Does it necessarily mean that you're going to have a meltdown? So for example, let's say there's 1,000 ways you could melt down. And let's say that the probability of each one is 1 in 100. So the expected number of things that could happen to cause a meltdown is 10. Am I guaranteed we're going to melt down? No.

Can anybody think of a way where it's unlikely we're going to melt down but that respects these values here, hypothetically? Is there any chance that it's still unlikely to have a meltdown? Can we think of a way? Yeah.

AUDIENCE: They all happen at the same time.

PROFESSOR: They all happen at the same time. Now the examples I gave you-- the terrorists, the earthquake, and the crazy operator-- put that aside. If they all happen together, when any one happens, the others have to happen.
So we can express that as: for all i, j, the probability of the i-th event happening given that the j-th event happens is 1-- so total dependence. What's the probability of a meltdown in that scenario? What's the probability one of those meltdown-inducing events occurs? They all happen at once.

AUDIENCE: 1 in 100.

PROFESSOR: 1 in 100. Because it's the same as the probability of the first event happening, which, by definition, was 1 in 100. So it could be that the probability of a meltdown is small. But it might not be as well. There's no way, given this, to know. In fact, this is like the Chinese appetizer problem, right? If one person gets their appetizer back, everybody does. So there are circumstances where you have total dependence like that.

Let's say I change it a little bit and I don't do this scenario. In fact, say I tell you the events are mutually independent, but you expect 10 to occur. Do you sleep at night now? Of course, 1% is still a pretty high number. But how many people think that if they're mutually independent and you expect 10 that, no matter what, there's at least a 50% chance of a meltdown? Anybody? OK.

In fact, if you expect 10 and they're mutually independent, a meltdown is a virtual certainty. The chance you don't melt down is less than 1 in 22,000. For sure something will occur that's bad. And this is a theorem that we call Murphy's law. Murphy's law-- probably you've all heard of it, it's a famous saying-- says if something can go wrong, it will go wrong. And we're going to see why that's true, at least in our circumstances here. That's a pretty powerful theorem.

If you have mutually independent events A1 through An, then the probability that none of them occur-- T equals 0-- is upper bounded by e to the minus the expected number of events to occur.
So if I expect 10 to occur, the chance that none do is upper bounded by e to the minus 10, which is very small, which means almost surely one or more of the events will occur. And that's bad news in this case. So let's prove that.

Well, the probability that T equals 0 is the same as the probability that A1 does not occur, and A2 does not occur, and all the way out to An does not occur. And I claim this [INAUDIBLE] is the product of the probabilities that they don't occur. So I'm taking the product from i equals 1 to n of the probability that Ai does not occur. Now why can I make that step? Yeah. Independence. This is the product rule for independent events.

Now the probability that Ai does not occur is simply 1 minus the probability it does occur. And now I'm going to use a simple fact, which is that for any number x, 1 minus x is at most e to the minus x. Just a simple fact from algebra. So I've got 1 minus-- I'm going to treat this as the x. So this is at most e to the minus the probability of Ai, using that fact. And now I'll take the product and turn it into a sum in the exponent. And then the sum of the probabilities of the events is just the expected value. That was theorem 1, which I just erased. So this is e to the minus the expected number of events to occur. So not too hard a proof. We had to use that fact. But that gets the expected number of events to occur into the exponent.

So a simple corollary is the case where we expect 10 events to occur. If we expect 10 or more mutually independent events to occur, the probability that no event occurs is at most e to the minus 10, which is less than 1 over 22,000. Now there's not even any dependence on n here. It has nothing to do with the number of possible events-- just that if you expect 10 of them to occur, you're pretty sure one of them will.
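Here is a minimal numerical sketch of that bound, using the made-up setup from the meltdown discussion (1,000 independent events, each with probability 1 in 100, so 10 are expected): the exact probability that none occur is the product of the (1 - p_i) terms, and it sits below e to the minus the expected count.

```python
import math

# 1,000 hypothetical independent events, each with probability 1/100,
# so the expected number of events to occur is 10.
p = [0.01] * 1000
expected = sum(p)                                   # E[T] = 10

exact_none = math.prod(1 - pi for pi in p)          # exact P(T = 0) under independence
murphy_bound = math.exp(-expected)                  # e^{-E[T]}

print(expected)       # 10.0
print(exact_none)     # about 4.3e-05
print(murphy_bound)   # about 4.5e-05, which is less than 1/22,000
```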
And this explains why you see weird coincidences, or why people sometimes see what they think are miracles. Because out in the real world, there are billions of possible weird things that could happen. You can just create all sorts of crazy possibilities. And each one might be a one in a billion chance of actually happening. But you've got billions that could've. And if they're all mutually independent-- because you made up all these different things-- then you expect some of them to happen. And so, in fact, you're going to know that for sure some of those weird things are going to happen. At the least, the chance that no weird thing happens is at most 1 in 22,000.

And so this can be why somebody will go along and say, oh my goodness, you won't believe what happened-- a coincidence. And it's like, wow, the chance of that happening was one in a billion. It must have been a miracle or an act of God that this happened. But you're not thinking about the other 10 billion things that didn't happen. So for sure, some of those things are going to happen. It's not likely that I'm going to win Megabucks next week. But somebody's going to win. If enough people play-- more than 1 over the probability that any one of them wins-- then it's very likely somebody will win, if everybody is guessing randomly.

Any questions about what we're doing?

So this is amazingly powerful, this result. In fact, it's so powerful that it's going to let me read somebody's mind in the class. We're going to do a little card trick here. Now the way this card trick works-- it's a little complicated. I'm going to need a volunteer, probably one of you guys down front. We'll get you. And one of your buddies is going to keep you honest for me here. First, I'm going to let you shuffle the deck. So go ahead and shuffle it, do whatever you want. It's a normal deck. It's got 52 cards and two jokers. And I don't care what order they're in. I'm going to turn over the cards one at a time.
Now I'm going to ask you to pick a number from 1 to 9 ahead of time. Don't tell me or anybody else. In fact, I'm going to want you guys to play along too. And we're going to see where we all end up here. That's your starting number. And as I turn over the cards one at a time-- say you started with a 3 as the number you had in mind-- then the third card I show becomes your card. You don't tell me or jump up and down or anything. But that's your card. And say it's a 4 of diamonds. Now a 4 replaces the 3 in your mind, and you count 4 more cards, and then that becomes your card. Now let's say that's a jack or a face card or a 10 or a joker. 10s, face cards, and jokers all count as 1, just like an ace counts as 1. And so then the next card would be your card, because you count 1. And we keep on going until you have a card-- maybe it's a 7-- but there's only four cards left in the deck. And so you don't get a new one. And your last card is the 7. And then you're going to write that down here, not showing me. And you're going to do this-- maybe do this with a friend over there. And you're going to make sure you count right on the deck, because if you screw up the counting, it's going to be hard for me to read your mind.

So just to make sure we all understand this, let me write the rules down here, because I want the whole class to pick a number from 1 to 9 and play the same game. And we're going to see what happens. So let me show you the rules again, just to make sure everybody understands.

So say the deck starts out like this. I've got a 4, a 5. So my first few cards of the deck go like this. A 10-- that equals a 1. Then I've got a queen-- equals a 1. 3, 7, 6, 4, 2. Say it's a small deck-- the real one will be 54 cards. And say your chosen number to start-- say you start with a 3. As I show the cards, you're going to count 1, 2, 3. That becomes your new card. Then you're going to count 1, 2. That becomes your card.
It's a 10, so you convert it to a 1, because we're only doing single-digit numbers. Go 1, that becomes your card. The queen converts to a 1. You go 1, that becomes your card. A 3: 1, 2, 3. That becomes your card. And you can't count 4 more, so you remember the final card. Does everybody understand what you're supposed to do? Because we're going to do 54 cards of this. Maybe we'll get the TAs to play along here. And as you do it, maybe you want to talk to your buddy, make sure you've got it worked out there. And if I can read your mind, maybe we'll have a gift certificate or something.

So you shuffled the deck? Got it good? All right. So I'm going to start revealing the cards one at a time. So you guys play along quietly in your mind. And we'll see if we can concentrate long enough.

Aces are 1. Jacks are 1. 10's are 1. 10's are 1. We're halfway done. Jokers are 1.

OK. That's the last card. So remember the last one that was yours. And you've got to go check with your buddy to make sure you guys agree on the counting there. And then write it down. Don't tell me, because I'm going to read your mind. I'm going to tell you.

This is not good. They're arguing over the last card. I'll have to read one of your minds. What's that? The 11 of clubs? That's a hard one to predict. Make your best guess and write it down. Don't tell me. Write it down. You got two? Well, write them both. I'll predict one of them. I've never had a dispute on what the-- because if you started with the same position, you've got to wind up in the same position. You wrote it down? Now think about your number really hard. We'll take yours. I'll trust you there.
Think about it really hard, because I need the brain waves to come over so I can read your mind here. Yeah, yeah. I'm getting a really strong signal on the last card. Maybe-- I don't know. Maybe it's something-- it's really powerful. I'm going to say it's the queen of hearts. Is that right? Both were the queen of-- oh, you were trying to screw me up, mess with me. So you both got the queen of hearts. Oh, you did. So how many people got the queen of hearts? Oh wow. How many people did not get the queen of hearts? Somebody. OK. Now there's a chance you did it legitimately. But usually, with a deck, we're all going to get the same last card. Now in this case, it happened to be the very last card. That is typically not the case.

So very good. So I read your mind. So you guys get the gift certificates here. Very good. One for you and your sponsor there.

So it's clear how I read his mind, because I got the same number everybody did. And somehow it doesn't matter where we started. We all had the same card at the end. How is that possible? There are nine different starting points. Why don't we wind up in nine different places? And why isn't there a one in nine chance that I guess his card? Why do we all wind up in the same place? Any thoughts? Yeah?

AUDIENCE: Get the same card. After that you stay [INAUDIBLE].

PROFESSOR: That's right. If ever we had the same card, then we're going to track together forever and finish on the same card. But why should we ever get the same card? What are the chances of that, that we land on the same card? Why don't we just keep missing each other? It's a 1 in 9 chance or something, I don't know. Why don't we keep missing?
AUDIENCE: It seems like there are enough low cards that you just move slowly along and, eventually, you're going to intersect.

PROFESSOR: Yeah. I did make a lot of 1's in the deck. If I would've made all these face cards be 10's, the chances of my reading your mind go down. Why do they go down? What does it have to do with? Why did I put a lot of 1's in the deck?

AUDIENCE: It goes on longer.

PROFESSOR: What's that?

AUDIENCE: The game goes on longer?

PROFESSOR: The game goes on longer. So there's more chances to hit together. Because at any given time-- you've got your card, I've got mine. If mine is behind yours, I've got a chance to land on you. And if you're behind me in the deck, you've got a chance to land on me with your number. And if the numbers are smaller, there's more chances to land on each other. And it is true that on any given chance, the chances are low that we land on the same card. But there's a lot of chances. And if the number of chances is a lot more than 1 over the probability of landing on each other, we've got Murphy's law. If you've got a lot of chances, and each one isn't much less likely than 1 over the number of chances, then you expect a certain bunch of times that we're going to land on each other. And therefore, there's a very high probability that we do.

Now that was a little hand-wavy. And in fact, there's a reason it was hand-wavy. Why doesn't Murphy's law really apply in this case-- really mathematically apply? Yeah.

AUDIENCE: They're not mutually independent. Once you draw one card, it's not coming back.

PROFESSOR: That's correct. And it means that the knowledge that we haven't collided yet tells me something about the cards we've seen-- not a lot, but something, maybe. And it's a finite deck, which tells me something about the cards that are coming.
And it might influence the probability that we land together the next time one of us jumps through the deck. And so the events-- for example, in this case, we let Ai be the event of a collision on the i-th jump. And there are about 20 jumps in this game, 10 expected for each of us. So Ai is the event that we collide on the i-th jump. These events are not necessarily mutually independent.

Now if I had an infinite deck, or a deck with replacement, so every card is equally likely to come next no matter what's come in the past, now you can start getting some mutual independence here. And then you could start really applying the theorem. Now in this case, you don't expect 10 things to happen. You expect a few. But that's good enough that, in fact-- we did a computer simulation once, and I got about a 90% chance that we'll all be on the same card. So I have a pretty good chance that I'm going to guess right. And so far, I haven't guessed wrong. But it will happen some day that we'll start with a different first number, and we will miss at the end, because there will be two possible outcomes. That's just the way it works out with 52 cards. Now of course, if we have more cards, or I made more things be 1's instead of 9's, say, my odds go up, because the number of events I've got-- the number of chances to collide-- increases. And the chance of hitting when I jump also increases. Any questions on that game?

So the point of all this is that if the expected number of events to occur is small, then it's an upper bound on the probability that something happens, whether they're independent or not. If the expected number of events to occur is bigger than 1-- large-- and if the events are mutually independent, then you can be sure one of those events is going to occur-- very, very likely one of them will occur. And that's Murphy's law. Any questions about numbers of events to occur? We'll talk more about the probabilities of the numbers of events that occur next time.
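Here is a rough reconstruction of the kind of simulation just mentioned (the roughly 90% figure is the lecture's; the code and its conventions-- a shuffled 54-card deck with 10s, face cards, and jokers counted as 1-- are an illustrative sketch, not the original program): it estimates how often two random starting numbers from 1 to 9 end up on the same final card.

```python
import random

def card_values():
    """One 54-card deck: aces through 9s keep their value; 10s, face cards,
    and the two jokers all count as 1 (the trick's convention)."""
    deck = []
    for _ in range(4):                      # four suits
        deck += list(range(1, 10))          # A..9 count as 1..9
        deck += [1, 1, 1, 1]                # 10, J, Q, K count as 1
    deck += [1, 1]                          # two jokers count as 1
    return deck

def final_position(values, start):
    """Follow the counting rule from starting number `start` (1-based)
    until the next jump would run past the end of the deck."""
    pos = start - 1
    while pos + values[pos] < len(values):
        pos += values[pos]
    return pos

def collision_rate(trials=20_000):
    same = 0
    for _ in range(trials):
        deck = card_values()
        random.shuffle(deck)
        a, b = random.randint(1, 9), random.randint(1, 9)
        if final_position(deck, a) == final_position(deck, b):
            same += 1
    return same / trials

print(collision_rate())   # should land near the ~90% figure mentioned in the lecture
```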
Before we do that, I want to talk about some more useful facts about expectation. Now we know from linearity of expectation that the expected value of a sum of random variables is the sum of the expected values of the random variables. Now we're going to look at the expected value of a product of random variables. And it turns out there's a very nice rule for that.

Theorem 4, and it's the product rule for expectation. And it says that if your random variables are independent-- R1 and R2 are independent-- then the expected value of their product, also a random variable, is simply the product of the expected values. So it's sort of the equivalent thing to linearity of expectation, except we're doing products. And you need independence. Now the proof of this is not too hard, and it's in the book. So we're not going to do it in class. But we can give an example.

Say we roll two six-sided, fair, and independent dice. And I want to know, what's the expected product of the dice? So we're going to let R1 be the value on the first die, and R2 will be the value on the second one. And the expected value of the product is the product of the expectations. And we already know the expected value of a single die is 7/2. So we get 7/2 times 7/2 is 49/4, or 12 and 1/4. So it's easy to compute the expected product of two dice. Any questions about that? Much easier than looking at all 36 outcomes to use this rule.

Now what if the dice were rigged, glued together somehow so they always came up the same? Would the expected product be 12 and 1/4 then? No? Why not? Why wouldn't it be the case? Why isn't the expected value of R1 squared the square of the expected value of R1? Isn't that what this says?

AUDIENCE: Independent.

PROFESSOR: They're not independent. R1 is not independent of R1. In fact, it's the same thing. And you need independence for that to be the case. So, a non-example: the expected value of R1 times R1 is the expected value of R1 squared.
And to do that, we've got to go back to basics. We sum over the six possible values of R1-- i equals 1 to 6-- of i squared, because we're squaring it, times the probability that R1 equals i. And each of those probabilities is 1/6. So we get 1/6 times 1 plus 4 plus 9 plus 16 plus 25 plus 36. And if you add all that up, you get 15 and 1/6, which is not 3 and 1/2 squared, the square of the expected value of R1. So the expected value of the square is not necessarily the square of the expectation. Because a random variable is not independent of itself, generally. OK. Any questions there?

There's a couple of quick corollaries. The first is you take this rule and apply it to many random variables, as long as they're mutually independent. So if R1, R2, out to Rn are mutually independent, then the expected value of their product is the product of the expected values. And the proof is just by induction on the number of random variables. So that's pretty easy.

There's another easy corollary. And that says, for any constants-- constant values a and b-- and any random variable R, the expected value of a times R plus b is simply a times the expected value of R, plus b. And the reason that's true-- well, the sum works because of linearity of expectation. You can think of b as a random variable that just always has the value b. And the a comes out in front because you can think of it as a random variable that always has the value a. And that's independent of any other random variable, because it never changes. So by the product rule, the a can come out. Now you've got to prove all that. But it's not too hard and not especially interesting. So we won't do that here. Any questions about those?
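Here is a small sketch, just for illustration, that checks both of those numbers exactly with fractions: independent dice give 49/4 for the expected product, while "glued" dice-- where the second die always copies the first-- give the expected value of R1 squared, which is 91/6, not 49/4.

```python
from fractions import Fraction
from itertools import product

faces = [Fraction(i) for i in range(1, 7)]
p = Fraction(1, 6)

# Independent dice: average the product over all 36 equally likely outcomes.
e_product_indep = sum(a * b * p * p for a, b in product(faces, repeat=2))

# Glued dice: the second die always matches the first, so we average i * i.
e_product_glued = sum(f * f * p for f in faces)

e_single = sum(f * p for f in faces)

print(e_single)           # 7/2
print(e_product_indep)    # 49/4, matching E[R1] * E[R2]
print(e_product_glued)    # 91/6, i.e. 15 and 1/6, not 49/4
```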
810 00:50:45,330 --> 00:50:47,600 So is this the corollary? 811 00:50:50,490 --> 00:50:52,890 In fact, let's take the inverse of random variable. 812 00:50:52,890 --> 00:50:57,690 Is the expected value of 1/R equal to 1 813 00:50:57,690 --> 00:51:04,510 over the expected value of R for any random variable R? 814 00:51:04,510 --> 00:51:06,560 Some folks saying yes. 815 00:51:06,560 --> 00:51:09,240 Some saying no. 816 00:51:09,240 --> 00:51:10,350 What do you think? 817 00:51:10,350 --> 00:51:11,445 Is that true? 818 00:51:14,610 --> 00:51:15,790 Oh, got a mix. 819 00:51:15,790 --> 00:51:17,910 How many say yes? 820 00:51:17,910 --> 00:51:19,385 How many say no? 821 00:51:19,385 --> 00:51:20,500 Oh, more no's. 822 00:51:20,500 --> 00:51:22,940 Somebody tell me why that's not true. 823 00:51:22,940 --> 00:51:26,058 Who would like to give me an example? 824 00:51:26,058 --> 00:51:31,345 Give us an example there that'll be very convincing. 825 00:51:31,345 --> 00:51:33,954 Yeah? 826 00:51:33,954 --> 00:51:36,370 AUDIENCE: I don't think it's one that would be immediately 827 00:51:36,370 --> 00:51:41,558 obvious, but I think if R is the result of the roll of a die, 828 00:51:41,558 --> 00:51:43,900 I don't think it works out. 829 00:51:43,900 --> 00:51:47,060 PROFESSOR: So it's 50 chance of-- oh, I see. 830 00:51:47,060 --> 00:51:53,159 So I take the average of 1/i-- that's sort of hard to compute. 831 00:51:53,159 --> 00:51:55,200 I got to do [INAUDIBLE] the sixth harmonic number 832 00:51:55,200 --> 00:51:56,825 and then invert it. 833 00:51:56,825 --> 00:52:00,439 There's an easier way to show that this is false. 834 00:52:00,439 --> 00:52:00,938 Yeah? 835 00:52:00,938 --> 00:52:02,869 AUDIENCE: The expected value equals 0? 836 00:52:02,869 --> 00:52:03,535 PROFESSOR: Yeah. 837 00:52:03,535 --> 00:52:07,330 The expected value equals 0 which 838 00:52:07,330 --> 00:52:12,520 could happen if R is plus 1 or minus 1 equally likely. 839 00:52:12,520 --> 00:52:15,840 So here's an example here. 840 00:52:15,840 --> 00:52:23,640 So R equals 1 with probability 1/2, and minus 1 841 00:52:23,640 --> 00:52:25,500 with probability 1/2. 842 00:52:25,500 --> 00:52:27,835 So the expected value of R is 0. 843 00:52:32,370 --> 00:52:33,370 So that blows up. 844 00:52:33,370 --> 00:52:34,625 That's infinity. 845 00:52:34,625 --> 00:52:36,000 What's the expected value of 1/R? 846 00:52:41,320 --> 00:52:44,300 Well, 1/1 and 1 over minus 1, it's the same. 847 00:52:44,300 --> 00:52:46,300 It equals 0. 848 00:52:46,300 --> 00:52:48,930 And this would say 0 equals 1/0. 849 00:52:48,930 --> 00:52:50,030 That's not true. 850 00:52:50,030 --> 00:52:51,980 So this is false. 851 00:52:51,980 --> 00:52:54,230 It is not true for every random variable. 852 00:52:57,350 --> 00:53:00,180 So once you see this example, just obviously not true. 853 00:53:00,180 --> 00:53:02,010 In fact, there's very few random variables 854 00:53:02,010 --> 00:53:06,440 for which this is true, even an indicator random variable. 855 00:53:06,440 --> 00:53:11,290 So it's 1 with probability 1/2 and 0 with probability 1/2. 856 00:53:11,290 --> 00:53:14,950 Then the expected value of 1/R is infinite. 857 00:53:14,950 --> 00:53:16,790 1 over the expected value of R is 2. 858 00:53:20,940 --> 00:53:24,570 So it's clearly not true. 859 00:53:24,570 --> 00:53:27,066 Let's do another one. 860 00:53:34,730 --> 00:53:37,090 What about this potential corollary? 
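Before stating that corollary, a small numerical aside on the 1/R examples just discussed (an illustrative sketch, not from the lecture):

```python
from fractions import Fraction

# R = +1 or -1, each with probability 1/2.
E_R = Fraction(1, 2) * 1 + Fraction(1, 2) * (-1)                                # 0, so 1/E[R] blows up
E_inv_R = Fraction(1, 2) * Fraction(1, 1) + Fraction(1, 2) * Fraction(1, -1)    # 0

# Indicator-style variable: 1 with probability 1/2, 0 with probability 1/2.
# E[1/R] has a 1/0 term, so it is infinite, while 1/E[R] = 1/(1/2) = 2.
print(E_R, E_inv_R)  # 0 0 -- so E[1/R] is certainly not 1 over E[R] here
```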
861 00:53:42,310 --> 00:53:53,530 Given independent random variables R and T, 862 00:53:53,530 --> 00:54:02,570 if the expected value of R/T is bigger than 1, 863 00:54:02,570 --> 00:54:07,630 then the expected value of R is bigger than the expected 864 00:54:07,630 --> 00:54:15,970 value of T. And let me even give you a potential proof of this, 865 00:54:15,970 --> 00:54:18,960 see if you like this proof. 866 00:54:18,960 --> 00:54:22,270 Well, let's assume the expected value of R/T is bigger than 1. 867 00:54:25,970 --> 00:54:39,620 And let's multiply both sides by the expected value of T. 868 00:54:39,620 --> 00:54:42,530 And well, the product rule says that this is just 869 00:54:42,530 --> 00:54:48,830 the expected value of R/T times T, which 870 00:54:48,830 --> 00:54:51,990 is just the expected value of R because the T's cancel. 871 00:55:02,654 --> 00:55:03,570 So I gave you a proof. 872 00:55:06,370 --> 00:55:09,760 Anybody have any quibbles with this proof? 873 00:55:15,172 --> 00:55:15,672 Yeah? 874 00:55:19,050 --> 00:55:20,330 That's a big problem. 875 00:55:20,330 --> 00:55:23,680 R/T is not independent of T. [INAUDIBLE] 876 00:55:23,680 --> 00:55:28,240 if T is very big, it's likely that R/T is small. 877 00:55:28,240 --> 00:55:30,280 So we can't do that step. 878 00:55:30,280 --> 00:55:34,950 We can't use the independence here to go from here to here. 879 00:55:34,950 --> 00:55:36,030 That's wrong. 880 00:55:36,030 --> 00:55:39,474 There's actually another big problem with this proof. 881 00:55:39,474 --> 00:55:40,640 Anybody see another problem? 882 00:55:40,640 --> 00:55:41,140 Yeah? 883 00:55:43,990 --> 00:55:44,880 Yeah. 884 00:55:44,880 --> 00:55:47,840 If the expected value of T is negative, 885 00:55:47,840 --> 00:55:50,440 I would end up flipping the inequality. 886 00:55:50,440 --> 00:55:52,030 So that step's wrong. 887 00:55:52,030 --> 00:55:53,600 So this is a pretty lousy proof. 888 00:55:53,600 --> 00:55:55,900 Every step is wrong. 889 00:55:55,900 --> 00:55:58,170 So this is not a good one to use. 890 00:55:58,170 --> 00:56:04,780 And in fact, the theorem is wrong. 891 00:56:04,780 --> 00:56:07,100 Not only is the proof wrong, but the result is wrong. 892 00:56:07,100 --> 00:56:08,950 It's not true. 893 00:56:08,950 --> 00:56:11,890 And we can see examples. 894 00:56:11,890 --> 00:56:13,840 We'll do some examples in a minute. 895 00:56:13,840 --> 00:56:16,350 Now the amazing thing is that despite the fact that this 896 00:56:16,350 --> 00:56:21,610 is just blatantly wrong, it is used all the time 897 00:56:21,610 --> 00:56:23,270 in research papers. 898 00:56:23,270 --> 00:56:26,280 And let me give you a famous example. 899 00:56:26,280 --> 00:56:32,430 This is a case of, actually, a pretty well-known paper written 900 00:56:32,430 --> 00:56:36,090 by some very famous computer science professors at Berkeley. 901 00:56:38,640 --> 00:56:42,130 And let me show you what they did. 902 00:56:42,130 --> 00:56:45,950 And this is so that you will never do this. 903 00:56:45,950 --> 00:56:54,870 They were trying to compare two instruction sets way back 904 00:56:54,870 --> 00:56:55,510 in the day. 905 00:56:55,510 --> 00:56:58,750 And they were comparing the RISC architecture 906 00:56:58,750 --> 00:57:01,740 to something called the Z8002. 907 00:57:01,740 --> 00:57:04,260 And they were proponents of RISC. 908 00:57:04,260 --> 00:57:07,600 And they were using this to prove that it was a better way 909 00:57:07,600 --> 00:57:09,290 to do things.
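As a concrete illustration that the "theorem" really is false, here is a small counterexample of our own (the particular distributions chosen for R and T are assumptions made for illustration; they are not the examples used in the lecture):

```python
from fractions import Fraction

# R is always 2; T is 1 or 4 with probability 1/2 each, independent of R.
E_R = Fraction(2)
E_T = Fraction(1, 2) * 1 + Fraction(1, 2) * 4                                    # 5/2
E_R_over_T = Fraction(1, 2) * Fraction(2, 1) + Fraction(1, 2) * Fraction(2, 4)   # 5/4

print(E_R_over_T > 1, E_R < E_T)  # True True -- E[R/T] > 1, and yet E[R] < E[T]
```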
910 00:57:09,290 --> 00:57:14,290 So they had a bunch of benchmark problems, probably 20 or so 911 00:57:14,290 --> 00:57:14,940 in the paper. 912 00:57:14,940 --> 00:57:16,520 And I'm not going to do all 20 here. 913 00:57:16,520 --> 00:57:19,210 I'm going to give you a flavor for what the data showed. 914 00:57:19,210 --> 00:57:27,110 And then they looked at the code size for RISC 915 00:57:27,110 --> 00:57:32,182 and the other guys, Z8002. 916 00:57:32,182 --> 00:57:33,390 And then they took the ratio. 917 00:57:40,020 --> 00:57:44,720 So the first problem was called E-string search, 918 00:57:44,720 --> 00:57:45,440 whatever that is. 919 00:57:45,440 --> 00:57:48,820 But it was some benchmark problem out there at the time. 920 00:57:48,820 --> 00:57:51,510 And the code length on RISC was 150 921 00:57:51,510 --> 00:57:53,450 say-- I've changed these numbers a little bit 922 00:57:53,450 --> 00:57:54,820 to make them simpler. 923 00:57:54,820 --> 00:57:58,930 Code length here and the Z8002 is 120. 924 00:57:58,930 --> 00:58:01,270 The ratio is 0.8. 925 00:58:01,270 --> 00:58:05,020 So for this problem, you're trying to get low, short code. 926 00:58:05,020 --> 00:58:08,410 So this was a better way to go to support that. 927 00:58:08,410 --> 00:58:12,640 And they had something called F-bit test. 928 00:58:12,640 --> 00:58:14,530 And here you have 120 lines. 929 00:58:14,530 --> 00:58:16,750 Here's 180. 930 00:58:16,750 --> 00:58:20,260 So in this case, RISC is better. 931 00:58:20,260 --> 00:58:26,500 So the ratio of that way to this way would be 1.5. 932 00:58:26,500 --> 00:58:28,465 And they had computing an Ackermann function. 933 00:58:31,540 --> 00:58:34,260 And that was 150 and 300. 934 00:58:34,260 --> 00:58:39,390 So a big win for RISC, ratio of 2. 935 00:58:39,390 --> 00:58:45,690 And then they had a thing called recursive sorting problem. 936 00:58:45,690 --> 00:58:47,530 This is a hard problem. 937 00:58:47,530 --> 00:58:50,040 There's 2,800 lines on RISC. 938 00:58:50,040 --> 00:58:54,190 1400 on the old way. 939 00:58:54,190 --> 00:58:57,200 Ratio of 0.5. 940 00:58:57,200 --> 00:59:00,400 And there was a bunch more which I'm not going to go through. 941 00:59:00,400 --> 00:59:04,470 But their analysis, what they did is they took the ratio, 942 00:59:04,470 --> 00:59:06,380 and then they averaged it. 943 00:59:06,380 --> 00:59:12,715 And so when you do this, you get a running total of, well, 2.3, 4.3, 944 00:59:12,715 --> 00:59:19,250 4.8. And 4.8/4 is 1.2. 945 00:59:19,250 --> 00:59:23,890 So the conclusion is that on average code in this framework 946 00:59:23,890 --> 00:59:27,400 is 20% longer than the code on RISC. 947 00:59:27,400 --> 00:59:30,100 Therefore, clearly, RISC is a better way to go. 948 00:59:30,100 --> 00:59:33,020 Your code on average will be shorter. 949 00:59:33,020 --> 00:59:38,080 Using the Z8002 on average, the code will be 20% longer. 950 00:59:38,080 --> 00:59:39,246 Makes perfect sense, right? 951 00:59:42,162 --> 00:59:44,840 In fact, this is one of the most common things 952 00:59:44,840 --> 00:59:47,133 that is done when people are comparing two systems. 953 00:59:50,514 --> 00:59:56,420 Now just one problem with this approach, and that's 954 00:59:56,420 --> 00:59:59,840 that it's completely bogus, completely bogus. 955 00:59:59,840 --> 01:00:02,820 You cannot conclude-- let's make this conclusion. 956 01:00:02,820 --> 01:00:13,090 So their conclusion, they concluded that Z8002 programs 957 01:00:13,090 --> 01:00:17,905 are 20% longer on average.
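Here is the same averaging-of-ratios analysis reproduced on the four benchmark rows above, as a short Python sketch (illustrative only):

```python
# The four benchmark rows used above, as (RISC, Z8002) code lengths.
rows = [(150, 120), (120, 180), (150, 300), (2800, 1400)]

avg_Z_over_R = sum(z / r for r, z in rows) / len(rows)  # average of the Z8002/RISC ratios
avg_R_over_Z = sum(r / z for r, z in rows) / len(rows)  # average of the RISC/Z8002 ratios

print(round(avg_Z_over_R, 2))  # 1.2 -> "Z8002 code is 20% longer on average"
print(round(avg_R_over_Z, 2))  # 1.1 -> "RISC code is 10% longer on average"
# Same data, opposite-sounding conclusions -- which is exactly the point being made.
```

The first figure is the paper's 20% claim; the second is the reversed 10% figure that comes up in a moment when the ratios are flipped.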
958 01:00:24,890 --> 01:00:28,110 Everybody understands the reasoning why, right? 959 01:00:28,110 --> 01:00:31,875 Take the ratio of all the test cases, average them up. 960 01:00:31,875 --> 01:00:33,410 Then you get the average ratio. 961 01:00:38,380 --> 01:00:41,030 Now there could be some hint why this is bogus. 962 01:00:41,030 --> 01:00:45,000 If I just looked at-- I took and summed 963 01:00:45,000 --> 01:00:51,670 these numbers, if I add all those numbers up, I get 3,220. 964 01:00:51,670 --> 01:00:55,570 And all these, I get 2,000. 965 01:00:55,570 --> 01:00:59,500 RISC code is not looking shorter if I do that. 966 01:00:59,500 --> 01:01:02,340 Looking longer. 967 01:01:02,340 --> 01:01:08,200 But all that gain, all the loss of RISC is in this one problem. 968 01:01:08,200 --> 01:01:10,960 And maybe it's not fair to do that. 969 01:01:10,960 --> 01:01:14,537 And that's why when people have data like this, 970 01:01:14,537 --> 01:01:15,620 they just take the ratios. 971 01:01:15,620 --> 01:01:20,110 Because now it would be-- if I just took the average code 972 01:01:20,110 --> 01:01:22,290 length and took the ratio of that, 973 01:01:22,290 --> 01:01:23,950 it's not fair because one problem just 974 01:01:23,950 --> 01:01:26,409 wiped out the whole thing. 975 01:01:26,409 --> 01:01:27,700 I might as well not even do it. 976 01:01:27,700 --> 01:01:29,880 And they want every problem to count equally. 977 01:01:29,880 --> 01:01:32,210 And that's why they take the ratio, 978 01:01:32,210 --> 01:01:34,840 to make them all count equally. 979 01:01:34,840 --> 01:01:36,330 Let's do one more thing here. 980 01:01:36,330 --> 01:01:44,130 Let's look at what happens if we take the ratio of RISC 981 01:01:44,130 --> 01:01:47,356 to the Z8002. 982 01:01:47,356 --> 01:01:48,355 Make some room for that. 983 01:01:48,355 --> 01:01:54,630 So this is-- this column is Z8002 over RISC. 984 01:01:54,630 --> 01:02:00,930 What if I just did this-- RISC over the Z8002 I mean 985 01:02:00,930 --> 01:02:03,754 the answer should come out to 1/1.2, right? 986 01:02:03,754 --> 01:02:05,420 That's what we expect, because I've just 987 01:02:05,420 --> 01:02:07,820 been turning it upside down. 988 01:02:07,820 --> 01:02:10,490 Well, I get 1.25 here. 989 01:02:10,490 --> 01:02:12,100 These are just being inverted. 990 01:02:12,100 --> 01:02:14,386 Here I've got 2/3, 0.67. 991 01:02:14,386 --> 01:02:15,010 Here I get 1/2. 992 01:02:18,500 --> 01:02:19,440 Here I get 2. 993 01:02:22,400 --> 01:02:23,815 Let's add those up. 994 01:02:23,815 --> 01:02:32,250 I get 1.92, 2.42, 4.42. 995 01:02:32,250 --> 01:02:38,020 Divide by 4-- wow, I get 1.1 something 996 01:02:38,020 --> 01:02:41,100 which says, well, that on average, RISC is 997 01:02:41,100 --> 01:02:44,740 10% longer than the other one. 998 01:02:44,740 --> 01:02:50,430 So same analysis says that RISC programs 999 01:02:50,430 --> 01:02:54,010 are 10% longer on average. 1000 01:02:57,430 --> 01:03:00,210 Now the beauty of this method is you 1001 01:03:00,210 --> 01:03:05,410 can make any conclusion you want seem reasonable, typically. 1002 01:03:05,410 --> 01:03:08,590 You could have the exact same data. 1003 01:03:08,590 --> 01:03:13,330 And if you want RISC to look better you do it this way. 1004 01:03:13,330 --> 01:03:17,720 If you want RISC to look worse, you do it that way. 1005 01:03:17,720 --> 01:03:18,540 You see? 1006 01:03:18,540 --> 01:03:19,690 Is that possible? 
1007 01:03:19,690 --> 01:03:21,470 Is it possible for one to be 20% longer 1008 01:03:21,470 --> 01:03:24,026 than the other on average, but the other be 10% longer 1009 01:03:24,026 --> 01:03:24,525 on average? 1010 01:03:27,040 --> 01:03:29,000 How many people think that's possible? 1011 01:03:29,000 --> 01:03:31,440 We had some weird things happen in this class, 1012 01:03:31,440 --> 01:03:33,650 but that's not possible. 1013 01:03:33,650 --> 01:03:34,830 That can't happen. 1014 01:03:34,830 --> 01:03:36,430 These conclusions are both bogus. 1015 01:03:41,090 --> 01:03:44,210 Now I'm not teaching you this so that later when 1016 01:03:44,210 --> 01:03:46,510 you're doing your PhD thesis and it's down to the wire 1017 01:03:46,510 --> 01:03:49,250 and you need a conclusion to be proved, good news. 1018 01:03:49,250 --> 01:03:51,240 You can prove it. 1019 01:03:51,240 --> 01:03:53,957 No matter what your conclusion is, you can prove it. 1020 01:03:53,957 --> 01:03:55,290 That's not why we're doing this. 1021 01:03:55,290 --> 01:03:58,831 We're doing this so you can spot the flaw in this whole setup 1022 01:03:58,831 --> 01:04:00,080 and that you'll never do this. 1023 01:04:00,080 --> 01:04:02,538 And you'll see it when other people do because people do it 1024 01:04:02,538 --> 01:04:03,190 all the time. 1025 01:04:06,160 --> 01:04:09,460 So let's try to put some formality under this 1026 01:04:09,460 --> 01:04:10,460 in terms of probability. 1027 01:04:10,460 --> 01:04:12,607 Because when you start talking about averages, 1028 01:04:12,607 --> 01:04:15,190 really think about expectations of random variables and stuff. 1029 01:04:15,190 --> 01:04:19,970 So let's try to view this as a probability problem 1030 01:04:19,970 --> 01:04:22,410 and see if we can shed some light on what's 1031 01:04:22,410 --> 01:04:25,480 going on here because it sure seemed reasonable. 1032 01:04:35,850 --> 01:04:41,890 So let's let x be the benchmark. 1033 01:04:41,890 --> 01:04:45,400 And maybe that'll be something in the sample space, 1034 01:04:45,400 --> 01:04:47,280 an outcome in the sample space. 1035 01:04:47,280 --> 01:05:00,490 Let's let R x be the code length for RISC on x and Z x 1036 01:05:00,490 --> 01:05:09,340 be the code length for the other processor on x. 1037 01:05:09,340 --> 01:05:19,060 And then we'll define a probability of seeing x. 1038 01:05:19,060 --> 01:05:21,560 That's our problem we're looking at. 1039 01:05:21,560 --> 01:05:28,290 And typically, you might assume that it's uniform, 1040 01:05:28,290 --> 01:05:30,790 the distribution there. 1041 01:05:30,790 --> 01:05:33,830 We need this to be able to define an expected value 1042 01:05:33,830 --> 01:05:37,350 for R and for Z. 1043 01:05:37,350 --> 01:05:42,610 Now what they're doing in the paper, what really is happening 1044 01:05:42,610 --> 01:05:50,230 here, is instead of this, they have the expected value of Z/R 1045 01:05:50,230 --> 01:05:52,160 is 1.2. 1046 01:05:52,160 --> 01:05:54,880 That is what they can conclude. 1047 01:05:54,880 --> 01:06:01,630 That does not mean that the expected value of Z 1048 01:06:01,630 --> 01:06:05,350 is 1.2, the expected value of R, which 1049 01:06:05,350 --> 01:06:10,200 is what they conclude that the Z8002 code is 1050 01:06:10,200 --> 01:06:12,430 20% longer than RISC code. 1051 01:06:12,430 --> 01:06:14,860 This is true. 1052 01:06:14,860 --> 01:06:16,160 That is not implied. 1053 01:06:19,070 --> 01:06:21,506 And why not? 
1054 01:06:21,506 --> 01:06:26,909 That's just, actually, what this corollary was doing. 1055 01:06:26,909 --> 01:06:28,450 Really, it's just what we were-- they 1056 01:06:28,450 --> 01:06:33,100 made the same false assumption as happened in the corollary. 1057 01:06:33,100 --> 01:06:37,210 You can't multiply both sides here by the expected value of R 1058 01:06:37,210 --> 01:06:41,240 and then get the expected value Z. Of course, if you ask them, 1059 01:06:41,240 --> 01:06:43,066 they would have known that. 1060 01:06:43,066 --> 01:06:44,440 But they don't even think through 1061 01:06:44,440 --> 01:06:46,523 that, they just used the standard method of taking 1062 01:06:46,523 --> 01:06:49,290 the expected value of a ratio. 1063 01:06:49,290 --> 01:06:51,220 So this is fair to conclude. 1064 01:06:51,220 --> 01:06:58,340 But as we saw, the expected value of R/Z was 1.1. 1065 01:06:58,340 --> 01:07:00,950 So both of these can be true at the same time. 1066 01:07:00,950 --> 01:07:01,960 That's fine. 1067 01:07:01,960 --> 01:07:03,420 But you can't make the conclusions 1068 01:07:03,420 --> 01:07:04,419 that they tried to make. 1069 01:07:08,000 --> 01:07:11,470 Here's another-- in fact, in this case, 1070 01:07:11,470 --> 01:07:14,100 if we had a uniform distribution, 1071 01:07:14,100 --> 01:07:21,396 the expected value of R is like 805 for uniform. 1072 01:07:21,396 --> 01:07:28,990 And the expected value of Z is 500. 1073 01:07:28,990 --> 01:07:31,254 And that's all you can conclude if you're 1074 01:07:31,254 --> 01:07:32,420 taking uniform distribution. 1075 01:07:32,420 --> 01:07:36,310 In which case, of course, if they're promoting RISC, 1076 01:07:36,310 --> 01:07:39,240 well, you don't like that conclusion. 1077 01:07:39,240 --> 01:07:41,739 So it's better to get this one's. 1078 01:07:41,739 --> 01:07:43,530 I don't think it was intentional of course. 1079 01:07:43,530 --> 01:07:46,690 But it's nice that it came out that way. 1080 01:07:49,430 --> 01:07:53,770 Here's another example that really makes it painfully clear 1081 01:07:53,770 --> 01:07:57,060 why you never want to do this. 1082 01:07:57,060 --> 01:08:02,150 So a really simple case, just generic variables R and Z. 1083 01:08:02,150 --> 01:08:08,550 And I got two problems only-- problem one, problem two. 1084 01:08:08,550 --> 01:08:11,020 R is 2 for problem 1, and z is 1. 1085 01:08:11,020 --> 01:08:12,768 And they reverse on problem 2. 1086 01:08:15,396 --> 01:08:19,205 Z/R is 2 and 1/2. 1087 01:08:21,760 --> 01:08:23,955 R/Z, just the reverse. 1088 01:08:28,029 --> 01:08:36,200 Now the expected value of R/Z here 1089 01:08:36,200 --> 01:08:39,260 is 2 plus 1/2 divided by 2 is 1 and 1/4. 1090 01:08:42,810 --> 01:08:44,359 And what's the expected value of Z/R? 1091 01:08:49,399 --> 01:08:50,859 The average of these is 1 and 1/4, 1092 01:08:50,859 --> 01:08:53,470 what's the average of these? 1093 01:08:53,470 --> 01:08:54,410 Same thing, 1 and 1/4. 1094 01:08:58,020 --> 01:09:00,830 So never, ever take averages of ratios 1095 01:09:00,830 --> 01:09:02,689 without really knowing what you're doing. 1096 01:09:05,560 --> 01:09:07,470 Any questions? 1097 01:09:07,470 --> 01:09:09,960 Yeah. 1098 01:09:09,960 --> 01:09:12,640 AUDIENCE: What would be the word explanation of the expected 1099 01:09:12,640 --> 01:09:14,240 value of Z/R? 1100 01:09:14,240 --> 01:09:15,430 What is that? 1101 01:09:15,430 --> 01:09:18,270 PROFESSOR: That is the average of the ratio. 
1102 01:09:21,270 --> 01:09:25,859 It is not the ratio of the average. 1103 01:09:25,859 --> 01:09:27,744 They are very different things. 1104 01:09:27,744 --> 01:09:29,660 And you can see how you get caught up in that. 1105 01:09:29,660 --> 01:09:31,060 You could see how you have linearity of expectation, 1106 01:09:31,060 --> 01:09:33,010 you got the product rule for expectation. 1107 01:09:33,010 --> 01:09:39,135 You do not have a rule that says this implies that. 1108 01:09:39,135 --> 01:09:41,590 AUDIENCE: [INAUDIBLE] 1109 01:09:41,590 --> 01:09:42,465 PROFESSOR: Which two? 1110 01:09:42,465 --> 01:09:46,710 AUDIENCE: [INAUDIBLE] Z/R? 1111 01:09:46,710 --> 01:09:48,910 PROFESSOR: Well, in this case, they're one. 1112 01:09:48,910 --> 01:09:50,968 I don't think that'll be true in general. 1113 01:09:50,968 --> 01:09:53,990 AUDIENCE: Does that give you information? 1114 01:09:53,990 --> 01:09:55,820 PROFESSOR: They give you information. 1115 01:09:55,820 --> 01:10:00,020 That may not be the information you want. 1116 01:10:00,020 --> 01:10:01,530 It wouldn't imply that which is what 1117 01:10:01,530 --> 01:10:04,220 you're after in some sense. 1118 01:10:04,220 --> 01:10:06,140 But it gives you some information. 1119 01:10:06,140 --> 01:10:08,170 It's the expected average ratio. 1120 01:10:08,170 --> 01:10:12,700 The problem is the human brain goes right from there to here. 1121 01:10:12,700 --> 01:10:14,450 It's just you do. 1122 01:10:14,450 --> 01:10:17,400 It's hard to help yourself from doing it. 1123 01:10:17,400 --> 01:10:20,510 And it's not true. 1124 01:10:20,510 --> 01:10:23,430 That's the problem. 1125 01:10:23,430 --> 01:10:26,170 We have a version of this in the one in the homework questions 1126 01:10:26,170 --> 01:10:27,420 which is true. 1127 01:10:27,420 --> 01:10:29,490 But it's a special version of it where 1128 01:10:29,490 --> 01:10:33,030 you can say something positive. 1129 01:10:33,030 --> 01:10:36,020 Any questions about this? 1130 01:10:36,020 --> 01:10:39,180 So anybody ever shows you an average of ratios, 1131 01:10:39,180 --> 01:10:42,120 you want the light to go off and say, danger, danger. 1132 01:10:42,120 --> 01:10:44,682 Think what's happening here. 1133 01:10:44,682 --> 01:10:47,015 Or if you're ever analyzing data to compare two systems. 1134 01:10:52,880 --> 01:10:55,540 So we talked a lot about expectation, seen 1135 01:10:55,540 --> 01:10:56,840 a lot of ways of computing it. 1136 01:10:56,840 --> 01:10:59,310 We've done a lot of examples. 1137 01:10:59,310 --> 01:11:01,420 For the rest of today and for next time, 1138 01:11:01,420 --> 01:11:04,180 we're going to talk about deviations 1139 01:11:04,180 --> 01:11:06,870 from the expected value. 1140 01:11:06,870 --> 01:11:09,830 Now for some random variables, they 1141 01:11:09,830 --> 01:11:11,880 are very likely to take on values that 1142 01:11:11,880 --> 01:11:13,850 are near their expectation. 1143 01:11:13,850 --> 01:11:17,510 For example, if I flip 100 coins. 1144 01:11:17,510 --> 01:11:20,770 And say they're fair and mutually independent. 1145 01:11:20,770 --> 01:11:24,389 We know that the expected number of heads is 50. 1146 01:11:24,389 --> 01:11:25,930 Does anybody remember the probability 1147 01:11:25,930 --> 01:11:30,990 of getting far from that, namely having 25 or fewer heads or 75 1148 01:11:30,990 --> 01:11:32,564 or more heads? 1149 01:11:32,564 --> 01:11:33,480 Remember, we did that? 1150 01:11:33,480 --> 01:11:35,760 It was a couple of weeks ago? 
1151 01:11:35,760 --> 01:11:39,334 Is it likely to have 25 or fewer heads? 1152 01:11:39,334 --> 01:11:41,000 AUDIENCE: It's less than 1 in a million? 1153 01:11:41,000 --> 01:11:42,140 PROFESSOR: Less than 1 in a million. 1154 01:11:42,140 --> 01:11:42,400 Yeah. 1155 01:11:42,400 --> 01:11:43,450 It was 1 in 5 million or something. 1156 01:11:43,450 --> 01:11:45,420 I don't know, some horribly small number. 1157 01:11:45,420 --> 01:11:49,510 So if I flip 100 coins, I expect to get 50 heads. 1158 01:11:49,510 --> 01:11:52,480 And I'm very likely to get close to 50 heads. 1159 01:11:52,480 --> 01:11:56,080 I'm not going to be 25 off. 1160 01:11:56,080 --> 01:11:59,240 And then the example we had in recitation, 1161 01:11:59,240 --> 01:12:02,350 you got a noisy channel, and you expect 1162 01:12:02,350 --> 01:12:06,640 an error rate, 1% of your 10,000 bits to be corrupted. 1163 01:12:06,640 --> 01:12:08,830 The chance of getting 2% corrupted 1164 01:12:08,830 --> 01:12:12,210 was like-- what was it, 2 to the minus 60 or something? 1165 01:12:12,210 --> 01:12:17,710 Extremely unlikely to be far from the expected value. 1166 01:12:17,710 --> 01:12:21,900 But there's other cases where you are likely-- you could well 1167 01:12:21,900 --> 01:12:24,189 be far from the expected value. 1168 01:12:24,189 --> 01:12:25,730 Can anybody remember an example we've 1169 01:12:25,730 --> 01:12:32,510 done where you are almost surely way off your expected 1170 01:12:32,510 --> 01:12:35,700 value for a random variable? 1171 01:12:35,700 --> 01:12:39,730 Anybody remember an example we did that has that feature? 1172 01:12:39,730 --> 01:12:41,239 AUDIENCE: The appetizer I think. 1173 01:12:41,239 --> 01:12:42,280 PROFESSOR: The appetizer. 1174 01:12:42,280 --> 01:12:43,460 Let's see. 1175 01:12:43,460 --> 01:12:49,350 Appetizers, you expect 1, but you're almost certain to be 0. 1176 01:12:49,350 --> 01:12:51,290 Or actually, you're almost certain to be 0, 1177 01:12:51,290 --> 01:12:53,560 and you have a chance of being n. 1178 01:12:53,560 --> 01:12:55,204 So if you count 0 as being close to 1, 1179 01:12:55,204 --> 01:12:57,120 you're likely to be close to your expectation. 1180 01:12:57,120 --> 01:13:01,690 Because you're likely to be 0, and you expect 1. 1181 01:13:01,690 --> 01:13:04,851 Remember that noisy channel problem-- 1182 01:13:04,851 --> 01:13:06,600 not the noisy channel, the latency problem 1183 01:13:06,600 --> 01:13:08,330 across the channel? 1184 01:13:08,330 --> 01:13:12,310 And we show that the expected latency was infinite? 1185 01:13:12,310 --> 01:13:15,580 But 99% of the time you had 10 milliseconds, 1186 01:13:15,580 --> 01:13:17,770 something like that? 1187 01:13:17,770 --> 01:13:20,020 There's an example where almost all the time you 1188 01:13:20,020 --> 01:13:22,870 are far from your expectation which is infinite. 1189 01:13:22,870 --> 01:13:25,380 So there are examples that go both ways. 1190 01:13:28,250 --> 01:13:33,570 Now let's look at another couple of examples 1191 01:13:33,570 --> 01:13:36,700 that'll motivate the definition that measures this. 1192 01:13:41,160 --> 01:13:45,930 I'd say that we've got a simple Bernoulli random variable where 1193 01:13:45,930 --> 01:13:51,250 the probability that R is 1,000 is 1/2, 1194 01:13:51,250 --> 01:13:58,660 and the probability that R is minus 1,000 is 1/2. 1195 01:13:58,660 --> 01:14:00,660 Then the expected value of R is 0. 
1196 01:14:05,440 --> 01:14:07,770 Similarly, we could have another one, S, 1197 01:14:07,770 --> 01:14:12,540 where the probability that S equals 1 is 1/2, 1198 01:14:12,540 --> 01:14:16,230 and the probability that S equals minus 1 is 1/2. 1199 01:14:16,230 --> 01:14:18,110 And the expected value of S is 0. 1200 01:14:20,820 --> 01:14:23,240 Now if this was a betting game and we're 1201 01:14:23,240 --> 01:14:26,470 talking about dollars-- here's where you're 1202 01:14:26,470 --> 01:14:28,910 wagering $1,000, fair game. 1203 01:14:28,910 --> 01:14:31,550 Here's where you're wagering $1. 1204 01:14:31,550 --> 01:14:36,950 Now in this game-- both games are fair. 1205 01:14:36,950 --> 01:14:38,980 Expected value is 0. 1206 01:14:38,980 --> 01:14:43,070 But here you're likely to end up near your expected value. 1207 01:14:43,070 --> 01:14:46,130 Here you're certain to be far by some measure from your expected 1208 01:14:46,130 --> 01:14:47,290 value. 1209 01:14:47,290 --> 01:14:52,740 And in fact, if you were offered to play a game, 1210 01:14:52,740 --> 01:14:56,270 you might have a real decision as to which game you played. 1211 01:14:56,270 --> 01:15:00,990 If you like risk, you might play that game. 1212 01:15:00,990 --> 01:15:05,000 If you're risk averse, maybe you stick with this game 1213 01:15:05,000 --> 01:15:07,180 because what you could lose would be less. 1214 01:15:10,460 --> 01:15:16,770 Now this motivates the definition of the variance 1215 01:15:16,770 --> 01:15:20,700 because it helps mathematicians distinguish between these two 1216 01:15:20,700 --> 01:15:22,398 cases with a simple statistic. 1217 01:15:33,530 --> 01:15:44,670 The variance of a random variable R-- 1218 01:15:44,670 --> 01:15:49,170 we'll denote it by var, V-A-R, of R-- 1219 01:15:49,170 --> 01:15:54,210 is defined as the expected value of the random variable 1220 01:15:54,210 --> 01:16:01,250 minus this expected value squared. 1221 01:16:01,250 --> 01:16:02,950 That's sort of a mouthful there. 1222 01:16:02,950 --> 01:16:04,075 So let's break it down. 1223 01:16:07,285 --> 01:16:11,240 This is the expected value of R. This is the deviation 1224 01:16:11,240 --> 01:16:14,130 from the expected value. 1225 01:16:14,130 --> 01:16:16,630 So this is the deviation from the mean. 1226 01:16:21,490 --> 01:16:22,770 Then we square it. 1227 01:16:25,590 --> 01:16:31,800 So that equals the square of the deviation. 1228 01:16:31,800 --> 01:16:36,810 And then we take the expected value of the square. 1229 01:16:36,810 --> 01:16:42,190 So the variance equals the expected value 1230 01:16:42,190 --> 01:16:46,156 of the square of the deviation. 1231 01:16:50,829 --> 01:16:52,370 In other words, the variance gives us 1232 01:16:52,370 --> 01:16:55,630 the average of the squares of the amount 1233 01:16:55,630 --> 01:16:59,662 by which the random variable deviates from its mean. 1234 01:16:59,662 --> 01:17:04,500 Now the idea behind this is that if a random variable is 1235 01:17:04,500 --> 01:17:08,710 likely to deviate from its mean, the variance will be high. 1236 01:17:08,710 --> 01:17:10,580 And if it's likely to be near its mean, 1237 01:17:10,580 --> 01:17:12,180 the variance will be low. 1238 01:17:12,180 --> 01:17:15,805 And so variance can tell us something about the expected 1239 01:17:15,805 --> 01:17:16,305 deviation. 1240 01:17:23,920 --> 01:17:26,835 So let's compute the variance for R and S 1241 01:17:26,835 --> 01:17:29,220 and see what happens. 
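As a cross-check of the hand computation that follows, here is a short sketch in Python (the helper functions are illustrative, not from the lecture):

```python
# Var(X) = E[(X - E[X])^2], computed directly from the two distributions above.
def expectation(dist):  # dist is a list of (value, probability) pairs
    return sum(p * v for v, p in dist)

def variance(dist):
    mu = expectation(dist)
    return sum(p * (v - mu) ** 2 for v, p in dist)

R = [(1000, 0.5), (-1000, 0.5)]
S = [(1, 0.5), (-1, 0.5)]
print(variance(R), variance(S))  # 1000000.0 1.0 -- same mean (0), very different variance
```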
1242 01:17:29,220 --> 01:17:35,010 So with R minus the expected value of R, well, 1243 01:17:35,010 --> 01:17:37,670 that is going to be 1,000, because you expect 1244 01:17:37,670 --> 01:17:42,420 the value of 0, with probability 1/2, and minus 1,000 1245 01:17:42,420 --> 01:17:44,770 with probability 1/2. 1246 01:17:44,770 --> 01:17:45,825 Then I square that. 1247 01:17:51,220 --> 01:17:58,430 Well, I square 1,000, I get a million with probability 1/2. 1248 01:17:58,430 --> 01:18:00,700 And I square minus 1,000, I get a million 1249 01:18:00,700 --> 01:18:05,340 again with probability 1/2. 1250 01:18:05,340 --> 01:18:09,620 And so therefore, the variance of R, 1251 01:18:09,620 --> 01:18:12,664 well, it's the expected value of this, which is-- well, 1252 01:18:12,664 --> 01:18:13,580 it's always a million. 1253 01:18:13,580 --> 01:18:14,610 So it's just a million. 1254 01:18:18,650 --> 01:18:20,160 Big. 1255 01:18:20,160 --> 01:18:26,140 Now if I were to do this with S, S minus the expected value of S 1256 01:18:26,140 --> 01:18:31,170 is 1 with probability 1/2, minus 1 with probability 1/2. 1257 01:18:31,170 --> 01:18:39,290 If I square that, well, I get 1 squared is 1, 1258 01:18:39,290 --> 01:18:42,670 minus 1 squared is 1. 1259 01:18:42,670 --> 01:18:47,280 And so the variance of S is the expected value of this. 1260 01:18:47,280 --> 01:18:50,010 And that's just 1. 1261 01:18:50,010 --> 01:18:52,590 So a big difference in the variance. 1262 01:18:52,590 --> 01:18:55,520 So the variance being different tells us these random variables 1263 01:18:55,520 --> 01:18:58,110 are-- the distributions are very different even 1264 01:18:58,110 --> 01:19:01,220 though their expected values are the same. 1265 01:19:01,220 --> 01:19:02,870 And the guy with big variance says, 1266 01:19:02,870 --> 01:19:06,920 hey, we're likely to deviate from the mean here. 1267 01:19:06,920 --> 01:19:09,720 And so risk averse people stay away from strategies 1268 01:19:09,720 --> 01:19:11,678 when they're investing that have high variance. 1269 01:19:18,370 --> 01:19:24,710 Now does anybody have any idea why we square the deviation? 1270 01:19:24,710 --> 01:19:26,850 Why don't we just-- why didn't mathematicians 1271 01:19:26,850 --> 01:19:28,890 when they figured out this stuff I don't know 1272 01:19:28,890 --> 01:19:30,931 how many centuries ago, why didn't they just take 1273 01:19:30,931 --> 01:19:32,930 the expected deviation? 1274 01:19:32,930 --> 01:19:34,510 Why do the stupid squaring thing? 1275 01:19:34,510 --> 01:19:37,110 That only is going to complicate it? 1276 01:19:37,110 --> 01:19:44,250 Why don't we instead compute the expected value 1277 01:19:44,250 --> 01:19:48,730 of R minus the mean? 1278 01:19:48,730 --> 01:19:51,620 Why didn't they do that and call that the variance? 1279 01:19:51,620 --> 01:19:53,170 Yeah? 1280 01:19:53,170 --> 01:19:54,252 That's zero. 1281 01:19:54,252 --> 01:19:55,490 Yeah. 1282 01:19:55,490 --> 01:19:57,210 Because by linearity of expectation, 1283 01:19:57,210 --> 01:19:59,670 that corollary [? 4-2 ?] or whatever, 1284 01:19:59,670 --> 01:20:03,590 this is just the expected value of R minus the expected value 1285 01:20:03,590 --> 01:20:09,030 of the expected value of R. The expected value of a scalar is 1286 01:20:09,030 --> 01:20:09,925 just that scalar. 1287 01:20:13,400 --> 01:20:17,270 And that is 0. 1288 01:20:17,270 --> 01:20:20,570 So the expected deviation from the mean 1289 01:20:20,570 --> 01:20:24,042 is 0 because of how the mean is defined. 
1290 01:20:24,042 --> 01:20:25,750 It's the midpoint, the weighted midpoint. 1291 01:20:25,750 --> 01:20:28,320 The times you're high cancel out the times you're 1292 01:20:28,320 --> 01:20:31,950 low if you got the mean right. 1293 01:20:31,950 --> 01:20:33,820 And so this is a useless definition. 1294 01:20:33,820 --> 01:20:36,150 It's always 0. 1295 01:20:36,150 --> 01:20:40,262 So mathematicians had to do something to capture this. 1296 01:20:40,262 --> 01:20:42,220 Now what would have been the more logical thing 1297 01:20:42,220 --> 01:20:43,390 to do? That is the next step. 1298 01:20:43,390 --> 01:20:44,910 This doesn't work, but what would 1299 01:20:44,910 --> 01:20:47,730 you think the mathematicians would've done? 1300 01:20:47,730 --> 01:20:51,670 Absolute value would have made a lot of sense here. 1301 01:20:51,670 --> 01:20:54,370 Why didn't they do that? 1302 01:20:54,370 --> 01:20:56,440 Well, you could do that, but it's 1303 01:20:56,440 --> 01:20:57,960 hard to work with mathematically. 1304 01:20:57,960 --> 01:21:01,470 You can't prove nice theorems, it turns out. 1305 01:21:01,470 --> 01:21:05,050 If you put the square in there and make that be the variance, 1306 01:21:05,050 --> 01:21:09,030 you can prove a theorem about linearity of variance. 1307 01:21:09,030 --> 01:21:11,290 And if the random variables are independent, 1308 01:21:11,290 --> 01:21:14,090 then the variance of the sum is the sum of the variances. 1309 01:21:14,090 --> 01:21:16,630 And mathematicians like that kind of thing. 1310 01:21:16,630 --> 01:21:19,640 It makes it easier to work with and do things with. 1311 01:21:19,640 --> 01:21:23,320 Now there are also other choices like, in fact, 1312 01:21:23,320 --> 01:21:26,120 there's a special name for a weird case where 1313 01:21:26,120 --> 01:21:27,685 you take the fourth power. 1314 01:21:32,450 --> 01:21:33,360 You could do that. 1315 01:21:33,360 --> 01:21:35,690 As long as it's an even power, you could do it. 1316 01:21:35,690 --> 01:21:39,040 And that's actually called the kurtosis. 1317 01:21:39,040 --> 01:21:41,260 Sounds like a foot disease. 1318 01:21:41,260 --> 01:21:44,370 But it's the kurtosis of the random variable. 1319 01:21:44,370 --> 01:21:46,915 Now we're not going to worry about that in this class. 1320 01:21:46,915 --> 01:21:50,030 But we are going to worry about variance. 1321 01:21:50,030 --> 01:21:52,350 And let me do one more definition, 1322 01:21:52,350 --> 01:21:56,560 then we'll talk about variance a lot more tomorrow. 1323 01:21:56,560 --> 01:21:58,480 That square is a bit of a pain. 1324 01:21:58,480 --> 01:22:02,310 And to get rid of it, they made another definition 1325 01:22:02,310 --> 01:22:06,590 after the fact called the standard deviation. 1326 01:22:06,590 --> 01:22:10,790 And standard deviation is defined as follows. 1327 01:22:13,790 --> 01:22:29,650 For a random variable R, the standard deviation of R 1328 01:22:29,650 --> 01:22:34,810 is denoted by sigma of R. And it's just 1329 01:22:34,810 --> 01:22:38,750 the square root of the variance, undoing 1330 01:22:38,750 --> 01:22:41,650 that nasty square after the fact. 1331 01:22:41,650 --> 01:22:45,250 So it turns out to be the square root of the expectation 1332 01:22:45,250 --> 01:22:48,940 of the deviation squared. 1333 01:22:48,940 --> 01:22:52,090 Another name for this you've probably seen, 1334 01:22:52,090 --> 01:23:02,770 it's the root of the mean of the square of the deviations.
1335 01:23:02,770 --> 01:23:05,532 And so you get this thing called root-mean-square, 1336 01:23:05,532 --> 01:23:06,990 which, if any of you have ever done curve 1337 01:23:06,990 --> 01:23:08,614 fitting or any of those kinds of things 1338 01:23:08,614 --> 01:23:12,152 in statistics or whatever, this is what you're talking about. 1339 01:23:12,152 --> 01:23:14,360 And so that's why that expression came about. 1340 01:23:18,090 --> 01:23:20,780 So for the standard deviation of R-- 1341 01:23:20,780 --> 01:23:23,670 what's the standard deviation of R? 1342 01:23:23,670 --> 01:23:24,597 1,000? 1343 01:23:24,597 --> 01:23:26,180 In fact, that's pretty close to what 1344 01:23:26,180 --> 01:23:28,360 you expect the deviation to be. 1345 01:23:28,360 --> 01:23:30,095 What's the standard deviation of S? 1346 01:23:32,950 --> 01:23:33,450 1. 1347 01:23:33,450 --> 01:23:34,825 Square root of 1 is 1, and that's 1348 01:23:34,825 --> 01:23:37,240 what you expect its deviation to be. 1349 01:23:37,240 --> 01:23:40,210 So we'll do more of this tomorrow in recitation.
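Finally, a short sketch of the standard deviations of R and S from the example above (again illustrative Python, not part of the lecture):

```python
import math

# Standard deviation: the square root of the variance (the root-mean-square deviation).
def sigma(dist):  # dist is a list of (value, probability) pairs
    mu = sum(p * v for v, p in dist)
    return math.sqrt(sum(p * (v - mu) ** 2 for v, p in dist))

R = [(1000, 0.5), (-1000, 0.5)]
S = [(1, 0.5), (-1, 0.5)]
print(sigma(R), sigma(S))  # 1000.0 1.0 -- roughly how far each variable tends to sit from its mean
```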