1 00:00:00,000 --> 00:00:01,990 OPERATOR: The following content is provided under a 2 00:00:01,990 --> 00:00:03,840 Creative Commons license. 3 00:00:03,840 --> 00:00:06,840 Your support will help MIT OpenCourseWare continue to 4 00:00:06,840 --> 00:00:10,530 offer high quality educational resources for free. 5 00:00:10,530 --> 00:00:13,390 To make a donation, or view additional materials from 6 00:00:13,390 --> 00:00:17,590 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:17,590 --> 00:00:19,580 ocw.mit.edu. 8 00:00:19,580 --> 00:00:25,090 PROFESSOR: So you may recall that, in the last lecture, we 9 00:00:25,090 --> 00:00:29,220 more or less solved the drunken student problem, 10 00:00:29,220 --> 00:00:32,050 looking at a random walk. 11 00:00:32,050 --> 00:00:36,510 I now want to move on and discuss some variants of the 12 00:00:36,510 --> 00:00:42,230 random walk problem that are collectively known as biased 13 00:00:42,230 --> 00:00:53,260 random walks. 14 00:00:53,260 --> 00:00:57,950 So the notion here is, the walk is still stochastic but 15 00:00:57,950 --> 00:01:01,230 there is some bias in the direction, so the movements 16 00:01:01,230 --> 00:01:05,500 are not uniformly distributed or equally distributed in all 17 00:01:05,500 --> 00:01:07,560 directions. 18 00:01:07,560 --> 00:01:11,940 As I go through this, I want you to sort of, think in 19 00:01:11,940 --> 00:01:16,520 advance about what the take-home message should be. 20 00:01:16,520 --> 00:01:20,740 One, and this is probably the most important part for today, 21 00:01:20,740 --> 00:01:24,560 is I want to illustrate how by designing our programs in a 22 00:01:24,560 --> 00:01:29,670 nice way around classes, we can change them to do 23 00:01:29,670 --> 00:01:35,430 something else writing a minimal amount of code. 
24 00:01:35,430 --> 00:01:39,180 And the idea here is that classes are not there just to 25 00:01:39,180 --> 00:01:42,950 provide some syntax that makes life complicated, but to 26 00:01:42,950 --> 00:01:46,050 actually provide a mechanism that lets us structure our 27 00:01:46,050 --> 00:01:52,380 programs in a way that we can modify them later. 28 00:01:52,380 --> 00:01:55,670 I also want to give you a bit more experience looking at 29 00:01:55,670 --> 00:01:59,820 data, and understanding how, when you get the results of a 30 00:01:59,820 --> 00:02:04,000 simulation or anything else that provides you with data, 31 00:02:04,000 --> 00:02:08,720 you can plot them, study the plots, and try and use those 32 00:02:08,720 --> 00:02:12,460 to develop an understanding about what's going on. 33 00:02:12,460 --> 00:02:16,860 And finally, I want to just get you thinking more about 34 00:02:16,860 --> 00:02:18,560 random behavior. 35 00:02:18,560 --> 00:02:21,080 Because we're going to talk a lot more this semester and in 36 00:02:21,080 --> 00:02:24,300 general throughout your careers, you'll see a lot of 37 00:02:24,300 --> 00:02:26,530 uses of stochastic things. 38 00:02:26,530 --> 00:02:31,280 All right, so let's think about these biased walks. 39 00:02:31,280 --> 00:02:36,610 So for example, assume that the drunk grew up in South 40 00:02:36,610 --> 00:02:40,440 Florida, and really hates the New England winter. 41 00:02:40,440 --> 00:02:43,730 Well there might be a bias that even though the drunk is 42 00:02:43,730 --> 00:02:48,940 not in total control, she's kind of wandering southward. 43 00:02:48,940 --> 00:02:52,380 So there's a bias when taking a step to more likely go south 44 00:02:52,380 --> 00:02:55,400 than certainly north. 45 00:02:55,400 --> 00:02:59,160 Or imagine that the drunk is photosensitive and moves 46 00:02:59,160 --> 00:03:02,030 either towards the sun or away from the sun, 47 00:03:02,030 --> 00:03:04,410 or some such thing. 
48 00:03:04,410 --> 00:03:06,910 So we're going to look at different things of that 49 00:03:06,910 --> 00:03:09,970 nature and think about, I'll come back to Chutes and 50 00:03:09,970 --> 00:03:13,390 Ladders later in the lecture, you can speculate 51 00:03:13,390 --> 00:03:19,210 on why this is here. 52 00:03:19,210 --> 00:03:21,540 And we'll look at that. 53 00:03:21,540 --> 00:03:26,910 So, this is on the front page of your handout here. 54 00:03:26,910 --> 00:03:32,560 I've taken the random walk program we looked at last time 55 00:03:32,560 --> 00:03:35,370 and changed it a bit. 56 00:03:35,370 --> 00:03:41,210 And in particular, I've changed the 57 00:03:41,210 --> 00:03:46,430 way the drunk works. 58 00:03:46,430 --> 00:03:50,170 So let's look at the code. 59 00:03:50,170 --> 00:03:52,570 Wow, haven't even done anything it's -- 60 00:03:52,570 --> 00:04:07,220 see how this works. 61 00:04:07,220 --> 00:04:14,030 So now, drunk itself is a very small class, one I don't 62 00:04:14,030 --> 00:04:19,410 actually intend to ever instantiate. 63 00:04:19,410 --> 00:04:23,880 I've pared it down, it has only two methods in it. 64 00:04:23,880 --> 00:04:29,890 It has an init, which is the same as before, and a move 65 00:04:29,890 --> 00:04:35,150 which is the same as before, almost the same. 66 00:04:35,150 --> 00:04:40,090 Now I'm going to have the class usual drunk. 67 00:04:40,090 --> 00:04:42,900 That's the drunk we looked at the last time. 68 00:04:42,900 --> 00:04:45,780 And this usual drunk will behave just like the drunk we 69 00:04:45,780 --> 00:04:48,570 looked at last time. 
70 00:04:48,570 --> 00:04:55,260 In here, I'm going to override the move method of the 71 00:04:55,260 --> 00:05:02,770 inherited superclass, and what this is going to do is, it's 72 00:05:02,770 --> 00:05:07,380 going to make the same random choice it made before and then 73 00:05:07,380 --> 00:05:10,580 it's going to call drunk dot move with 74 00:05:10,580 --> 00:05:14,420 the new compass point. 75 00:05:14,420 --> 00:05:17,230 So all I've done is, I've taken a couple of lines of 76 00:05:17,230 --> 00:05:21,380 code out of the previous implementation and moved it to 77 00:05:21,380 --> 00:05:23,390 the subclass. 78 00:05:23,390 --> 00:05:29,540 So it's now, excuse me, in the subclass I'm making the 79 00:05:29,540 --> 00:05:32,400 decision about which direction to move. 80 00:05:32,400 --> 00:05:37,540 Once I've made that decision, then I can call the drunk in 81 00:05:37,540 --> 00:05:39,360 the superclass. 82 00:05:39,360 --> 00:05:47,930 Notice here, when I've made the call, drunk dot move. 83 00:05:47,930 --> 00:05:50,640 So usually you're expecting me to write 84 00:05:50,640 --> 00:05:54,610 something like d dot move. 85 00:05:54,610 --> 00:05:55,930 The object. 86 00:05:55,930 --> 00:06:01,170 Here, instead I'm specifying which class to get the move 87 00:06:01,170 --> 00:06:07,710 from, So this is saying, don't call my local move, call the 88 00:06:07,710 --> 00:06:11,670 move from the superclass. 89 00:06:11,670 --> 00:06:14,410 There are a number of other syntaxes I could've used to 90 00:06:14,410 --> 00:06:16,350 accomplish the same thing. 91 00:06:16,350 --> 00:06:24,830 I used this because it seemed the most straightforward. 92 00:06:24,830 --> 00:06:29,960 Now let's compare the usual drunk to the cold drunk. 93 00:06:29,960 --> 00:06:35,560 So the cold drunk, again, I've overridden the move, and now 94 00:06:35,560 --> 00:06:37,810 it does something different. 
95 00:06:37,810 --> 00:06:42,730 It gets the random compass point as before, but if it 96 00:06:42,730 --> 00:06:47,420 happens to be south, it calls drunk dot move 97 00:06:47,420 --> 00:06:53,500 with twice the distance. 98 00:06:53,500 --> 00:06:59,070 So whenever I get the notion I'm moving south, because I 99 00:06:59,070 --> 00:07:04,210 love the warmth, I actually move twice as fast. So instead 100 00:07:04,210 --> 00:07:07,100 of taking one step, which I do to the other compass points, 101 00:07:07,100 --> 00:07:11,360 if I happen to be going south I'll take two steps. 102 00:07:11,360 --> 00:07:17,830 So I'm biased towards moving southward. 103 00:07:17,830 --> 00:07:23,640 And then I call drunk dot move. 104 00:07:23,640 --> 00:07:26,340 Otherwise, I call drunk dot move with 105 00:07:26,340 --> 00:07:30,540 the distance as before. 106 00:07:30,540 --> 00:07:34,800 So I've isolated into the subclass this notion of being 107 00:07:34,800 --> 00:07:38,060 heat-seeking. 108 00:07:38,060 --> 00:07:42,170 We've got a third subclass, ew drunk, known 109 00:07:42,170 --> 00:07:47,100 for east west drunk. 110 00:07:47,100 --> 00:07:52,160 And here, what I've done is, I choose a compass point and 111 00:07:52,160 --> 00:07:57,200 then while it's not either east or west, I choose again. 112 00:07:57,200 --> 00:08:00,790 So this drunk can only move eastward or westward. 113 00:08:00,790 --> 00:08:07,810 Can't move north or south. 114 00:08:07,810 --> 00:08:12,790 So again you'll see I'm able to have three kinds of drunks, 115 00:08:12,790 --> 00:08:15,550 who, as we will see, exhibit rather different behavior from 116 00:08:15,550 --> 00:08:25,650 each other with a minimal amount of change to the code. 117 00:08:25,650 --> 00:08:33,880 If we look down at the bottom, we'll see that the code is 118 00:08:33,880 --> 00:08:37,540 pretty much exactly what it was last time. 
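A minimal sketch of the class structure just described. It is not the handout code: the `STEPS` table, the `x`/`y` attributes, and the exact method signatures are assumptions made so the example runs on its own, with the drunk tracking its own position instead of using the lecture's field and compass point objects. What it preserves is the design point: the base class only applies a step, and each subclass overrides `move` to decide the direction before calling `Drunk.move`.

```python
import random

# Hypothetical stand-in for the lecture's compass points: name -> (dx, dy).
STEPS = {'N': (0, 1), 'S': (0, -1), 'E': (1, 0), 'W': (-1, 0)}


class Drunk:
    """Base class: applies a step but never chooses the direction itself."""

    def __init__(self, name):
        self.name = name
        self.x, self.y = 0, 0  # assumption: the drunk tracks its own position

    def move(self, direction, dist=1):
        dx, dy = STEPS[direction]
        self.x += dx * dist
        self.y += dy * dist


class UsualDrunk(Drunk):
    def move(self, dist=1):
        direction = random.choice('NSEW')
        Drunk.move(self, direction, dist)  # call the superclass move, not this one


class ColdDrunk(Drunk):
    def move(self, dist=1):
        direction = random.choice('NSEW')
        if direction == 'S':
            Drunk.move(self, direction, 2 * dist)  # southward steps count double
        else:
            Drunk.move(self, direction, dist)


class EWDrunk(Drunk):
    def move(self, dist=1):
        direction = random.choice('NSEW')
        while direction not in ('E', 'W'):  # choose again until east or west
            direction = random.choice('NSEW')
        Drunk.move(self, direction, dist)
```

Writing `Drunk.move(self, ...)` names the class whose method should run, which is the explicit superclass call the lecture points out; `super().move(...)` is one of the other syntaxes that would accomplish the same thing.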
119 00:08:37,540 --> 00:08:39,340 By the way, this doesn't quite match the 120 00:08:39,340 --> 00:08:40,970 code in your handout. 121 00:08:40,970 --> 00:08:44,230 I decided to make the plots a little bit prettier with 122 00:08:44,230 --> 00:08:47,550 better titles and things, but when I tried to put these in 123 00:08:47,550 --> 00:08:49,740 your handouts it wouldn't fit. 124 00:08:49,740 --> 00:08:53,360 So I have a little more compact things, but this is, I 125 00:08:53,360 --> 00:08:55,630 think, a better way to write it. 126 00:08:55,630 --> 00:09:00,520 But essentially, perform trial, perform sim, answer 127 00:09:00,520 --> 00:09:05,940 quest, are all identical to what they were before. 128 00:09:05,940 --> 00:09:11,850 So because I structured my program in such a way that the 129 00:09:11,850 --> 00:09:15,620 drunk's behavior was independent of all the code 130 00:09:15,620 --> 00:09:21,420 for generating the trials and plotting the results, I 131 00:09:21,420 --> 00:09:24,920 didn't have to change any of that. 132 00:09:24,920 --> 00:09:27,680 And this is the key to good design: 133 00:09:27,680 --> 00:09:33,270 to isolate decisions in small parts of your program, 134 00:09:33,270 --> 00:09:37,650 so when, as inevitably happens, you want to enhance the 135 00:09:37,650 --> 00:09:40,860 program, or you discover you've done something wrong, 136 00:09:40,860 --> 00:09:48,100 the number of changes you make is not large. 137 00:09:48,100 --> 00:09:53,290 OK, one thing we do need to look at down here is, it's not 138 00:09:53,290 --> 00:09:57,550 quite the same as before, because we'll see that there's 139 00:09:57,550 --> 00:10:06,240 an extra parameter to these things called drunk type. 140 00:10:06,240 --> 00:10:10,940 So if we start with answer quest, it takes as before, 141 00:10:10,940 --> 00:10:14,700 the time, the number of trials, a title, and this one 142 00:10:14,700 --> 00:10:18,960 extra thing called the drunk type. 
143 00:10:18,960 --> 00:10:22,620 And then when it goes on and it calls perform sim, it 144 00:10:22,620 --> 00:10:27,310 passes the drunk type. 145 00:10:27,310 --> 00:10:38,710 And this is the key line here. 146 00:10:38,710 --> 00:10:41,690 Before, you may recall we just said d equals 147 00:10:41,690 --> 00:10:45,340 drunk with the arguments. 148 00:10:45,340 --> 00:10:51,730 Here, I say d equals drunk type. 149 00:10:51,730 --> 00:10:56,650 So what is drunk type? 150 00:10:56,650 --> 00:11:00,710 Drunk type is itself a type. 151 00:11:00,710 --> 00:11:04,350 We can use types, you know variables can have as their 152 00:11:04,350 --> 00:11:09,610 values types, parameters can have as their values types. 153 00:11:09,610 --> 00:11:13,630 So I'm just passing the type around so that here, when it 154 00:11:13,630 --> 00:11:17,160 comes time to get a drunk, I get a drunk of 155 00:11:17,160 --> 00:11:21,620 the appropriate type. 156 00:11:21,620 --> 00:11:25,110 So I can now answer the question about a usual drunk, 157 00:11:25,110 --> 00:11:31,700 or a cold drunk, or a ew drunk. 158 00:11:31,700 --> 00:11:36,880 This is very common kind of programming paradigm. 159 00:11:36,880 --> 00:11:40,430 We've talked about it before, but this is yet another 160 00:11:40,430 --> 00:11:44,500 example of how we use polymorphism. 161 00:11:44,500 --> 00:11:47,950 Polymorphism. 162 00:11:47,950 --> 00:11:51,470 We write our code in such a way that it works with many 163 00:11:51,470 --> 00:11:54,950 types of objects. 164 00:11:54,950 --> 00:11:57,740 Here we're writing code that will work with 165 00:11:57,740 --> 00:12:01,630 any subtype of drunk. 166 00:12:01,630 --> 00:12:04,170 Now I'd better not pass it a float. 
167 00:12:04,170 --> 00:12:08,230 You know, if I answer quest, and instead of something like 168 00:12:08,230 --> 00:12:12,360 usual drunk or cold drunk or e w drunk, I pass it float or 169 00:12:12,360 --> 00:12:17,560 int, the world will come crashing down around my head. 170 00:12:17,560 --> 00:12:21,720 But I haven't, so you'll see down here, I pass it num 171 00:12:21,720 --> 00:12:27,130 steps, number trials, and usual drunk. 172 00:12:27,130 --> 00:12:28,530 Which is the name of? 173 00:12:28,530 --> 00:12:33,050 A class. 174 00:12:33,050 --> 00:12:38,760 Any questions about this? 175 00:12:38,760 --> 00:12:39,160 Yes? 176 00:12:39,160 --> 00:12:45,360 STUDENT: [INAUDIBLE] 177 00:12:45,360 --> 00:12:46,330 PROFESSOR: Louder, please? 178 00:12:46,330 --> 00:12:53,370 STUDENT: [INAUDIBLE] 179 00:12:53,370 --> 00:12:54,950 PROFESSOR: Good, good question. 180 00:12:54,950 --> 00:12:58,250 The question is, does a drunk type make a new drunk? 181 00:12:58,250 --> 00:13:00,630 Now, I have so much candy to choose from, I don't know what 182 00:13:00,630 --> 00:13:02,470 to throw you. 183 00:13:02,470 --> 00:13:07,670 Do you like Tootsie Rolls, Snickers, Tootsie Roll Pops, 184 00:13:07,670 --> 00:13:11,220 Dots, what's your choice? 185 00:13:11,220 --> 00:13:19,370 Snickers. 186 00:13:19,370 --> 00:13:23,040 OK, so what is it? 187 00:13:23,040 --> 00:13:29,970 Down here nothing is created when that call was made. 188 00:13:29,970 --> 00:13:34,180 It was if I had typed the word float or int. 189 00:13:34,180 --> 00:13:41,460 I'm merely saying, here is a type, and the parameter should 190 00:13:41,460 --> 00:13:45,860 now be bound to that type. 191 00:13:45,860 --> 00:13:49,680 Now any time you use the parameter, it's just as if you 192 00:13:49,680 --> 00:13:56,240 had written the string, in this case, usual drunk. 
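The point about passing a type around can be sketched in a few lines. The classes below are empty placeholders and `make_drunk` is a hypothetical helper, not a function from the lecture code. It shows only that a class is an ordinary value: binding it to a parameter creates nothing, and an object appears only when the parameter is called.

```python
class Drunk:
    def __init__(self, name):
        self.name = name


class UsualDrunk(Drunk):
    pass


class ColdDrunk(Drunk):
    pass


def make_drunk(drunk_type, name):
    # drunk_type is bound to a class object here; no drunk exists yet.
    d = drunk_type(name)  # calling the class runs __init__ and builds the instance
    return d


d = make_drunk(ColdDrunk, 'test drunk')
print(type(d).__name__)      # ColdDrunk
print(isinstance(d, Drunk))  # True
```

The alternative mentioned in the lecture, passing a string and testing it with a chain of `if` statements, would need a new branch for every new subclass; passing the class itself needs no change at all.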
193 00:13:56,240 --> 00:14:08,790 So when I get up to here, think about, it will take, the 194 00:14:08,790 --> 00:14:19,610 variable drunk type, evaluate it, it will evaluate to a type 195 00:14:19,610 --> 00:14:23,040 usual drunk, say, in the first call. 196 00:14:23,040 --> 00:14:28,040 And then it will, just as if I had written usual drunk, 197 00:14:28,040 --> 00:14:34,430 invoke the init method of drunk, and create a new drunk, 198 00:14:34,430 --> 00:14:36,890 but this drunk will be of type usual drunk 199 00:14:36,890 --> 00:14:40,300 rather than a drunk. 200 00:14:40,300 --> 00:14:44,970 Does that make sense? 201 00:14:44,970 --> 00:14:50,320 All right, so this avoids my having to write a whole bunch 202 00:14:50,320 --> 00:14:54,590 of tests, I could have passed a string, and said if the 203 00:14:54,590 --> 00:14:59,140 string said usual drunk, then d equals something. 204 00:14:59,140 --> 00:15:00,860 But here instead of that, I'm actually 205 00:15:00,860 --> 00:15:07,330 passing the type itself. 206 00:15:07,330 --> 00:15:12,710 OK? 207 00:15:12,710 --> 00:15:16,300 So other than the fact that I have to pass the type around, 208 00:15:16,300 --> 00:15:21,570 this is exactly the same as before. 209 00:15:21,570 --> 00:15:26,150 Now just for fun, we saw that last time when I asked you 210 00:15:26,150 --> 00:15:30,100 what did usual drunk do, there was some confusion. 211 00:15:30,100 --> 00:15:34,410 Not confusion, just a bad guess about how the drunk 212 00:15:34,410 --> 00:15:35,950 would wander. 213 00:15:35,950 --> 00:15:38,950 So, let's try it again. 214 00:15:38,950 --> 00:15:44,490 What do you think about the cold drunk? 215 00:15:44,490 --> 00:15:48,850 Will the cold drunk get further from the origin than 216 00:15:48,850 --> 00:15:50,740 the usual drunk? 217 00:15:50,740 --> 00:15:51,880 Not further from the origin? 
218 00:15:51,880 --> 00:15:55,450 Who thinks the cold drunk will be further away 219 00:15:55,450 --> 00:15:56,710 at the end of 500 steps? 220 00:15:56,710 --> 00:16:01,610 Who thinks the cold drunk will not be further away? 221 00:16:01,610 --> 00:16:05,590 And I'll bet we can all agr -- got that right, I think -- and 222 00:16:05,590 --> 00:16:08,470 I'll bet you'll all agree that, with higher probability, 223 00:16:08,470 --> 00:16:12,870 the cold drunk will be south of the origin than north of 224 00:16:12,870 --> 00:16:14,520 the origin. 225 00:16:14,520 --> 00:16:19,440 More interesting, how about the ew drunk? 226 00:16:19,440 --> 00:16:24,120 So this is a drunk that only goes east or west, east or 227 00:16:24,120 --> 00:16:31,950 west. Will this drunk, well, first of all, what's the 228 00:16:31,950 --> 00:16:33,280 expected distance? 229 00:16:33,280 --> 00:16:34,880 Do you think it should be close to zero or 230 00:16:34,880 --> 00:16:37,560 not close to zero? 231 00:16:37,560 --> 00:16:41,180 Who thinks it should be close to 0? 232 00:16:41,180 --> 00:16:44,780 Who thinks it won't be close to 0? 233 00:16:44,780 --> 00:16:47,370 Well you guys have all fallen for the same trick you fell 234 00:16:47,370 --> 00:16:49,530 for last time. 235 00:16:49,530 --> 00:16:50,510 Fooled you twice. 236 00:16:50,510 --> 00:16:54,230 Be careful, one more and you're out. 237 00:16:54,230 --> 00:16:56,560 As we'll see. 238 00:16:56,560 --> 00:16:57,770 But it's interesting. 239 00:16:57,770 --> 00:17:12,680 So let's run this program and see what happens. 240 00:17:12,680 --> 00:17:15,590 It can take just a second or two and we should 241 00:17:15,590 --> 00:17:22,470 get a bunch of plots. 242 00:17:22,470 --> 00:17:25,710 All right, so here's the usual drunk. 243 00:17:25,710 --> 00:17:33,050 100 trials and just as before, wandered away. 
244 00:17:33,050 --> 00:17:36,440 Moderately smooth, but of course little ups and downs, 245 00:17:36,440 --> 00:17:39,990 maybe suggesting averaging over 100 trials 246 00:17:39,990 --> 00:17:42,230 is not quite enough. 247 00:17:42,230 --> 00:17:45,920 I'd feel a little better if it were smoother, but the trend 248 00:17:45,920 --> 00:17:52,970 is pretty clear here, I think we probably don't doubt it. 249 00:17:52,970 --> 00:17:56,530 Well, look at cold drunk. 250 00:17:56,530 --> 00:18:00,520 Like a shot heading south. 251 00:18:00,520 --> 00:18:04,930 So here we see that the bias, that if you go south you go 252 00:18:04,930 --> 00:18:08,670 twice as far, makes a huge difference. 253 00:18:08,670 --> 00:18:13,450 Instead of being roughly 20 steps away, it's roughly 120 254 00:18:13,450 --> 00:18:16,550 steps away. 255 00:18:16,550 --> 00:18:21,500 So we see, and this is something to think about, a 256 00:18:21,500 --> 00:18:27,520 small bias can end up making a huge difference over time. 257 00:18:27,520 --> 00:18:31,230 This is why casinos do so well. 258 00:18:31,230 --> 00:18:34,370 They don't have to have very much of a bias in their favor 259 00:18:34,370 --> 00:18:37,640 on each roll of the dice in order to accumulate a lot of 260 00:18:37,640 --> 00:18:40,410 money over time. 261 00:18:40,410 --> 00:18:43,040 It's why if you just weight your dice a little bit, so 262 00:18:43,040 --> 00:18:47,020 that they're not fair, you'll do very very well. 263 00:18:47,020 --> 00:18:52,100 I'm not recommending that. 264 00:18:52,100 --> 00:18:55,220 Kind of interesting. 265 00:18:55,220 --> 00:18:59,680 The ew drunk is pretty much the same distance away as the 266 00:18:59,680 --> 00:19:04,440 usual drunk. 267 00:19:04,440 --> 00:19:10,090 This is why we run these simulations: because we 268 00:19:10,090 --> 00:19:10,780 learn things. 269 00:19:10,780 --> 00:19:13,920 Now, I have to confess, it fooled me, too. 
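The comparison of the three walks can be reproduced with a compact simulation. This is a sketch under stated assumptions, not the lecture's program: plain step-choosing functions stand in for the drunk classes, positions are bare coordinates, and `mean_final_distance` and the seed are hypothetical. For the cold drunk the expected southward drift is (1/4)(2) - (1/4)(1) = 1/4 per step, about 125 over 500 steps, which is consistent with the roughly 120 read off the plot.

```python
import math
import random

UNIT_STEPS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # N, S, E, W


def usual_step(rng):
    return rng.choice(UNIT_STEPS)


def cold_step(rng):
    dx, dy = rng.choice(UNIT_STEPS)
    return (dx, 2 * dy) if dy < 0 else (dx, dy)  # a southward step counts double


def ew_step(rng):
    # Same distribution as "choose again until east or west".
    return rng.choice([(1, 0), (-1, 0)])


def mean_final_distance(step_fn, num_steps, num_trials, seed=0):
    """Average straight-line distance from the origin after num_steps steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trials):
        x = y = 0
        for _ in range(num_steps):
            dx, dy = step_fn(rng)
            x += dx
            y += dy
        total += math.hypot(x, y)
    return total / num_trials


for label, step_fn in [('usual', usual_step), ('cold', cold_step), ('ew', ew_step)]:
    print(label, round(mean_final_distance(step_fn, 500, 100), 1))
```

The step function is passed as a parameter in the same spirit as the drunk type in the lecture: the simulation loop never needs to know which kind of walk it is running.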
270 00:19:13,920 --> 00:19:16,570 The first time I ran this, I looked at this and said, wait 271 00:19:16,570 --> 00:19:20,800 a minute, that's not what I expected. 272 00:19:20,800 --> 00:19:24,610 So I scratched my head, and thought about, all right, why 273 00:19:24,610 --> 00:19:27,770 is it behaving this way? 274 00:19:27,770 --> 00:19:32,260 This happens to me a lot in my own research, that I crunch 275 00:19:32,260 --> 00:19:36,950 some data, do some plots, look at it, get surprised, and on 276 00:19:36,950 --> 00:19:41,100 the basis of that I learn something about the underlying 277 00:19:41,100 --> 00:19:42,960 system I'm modeling. 278 00:19:42,960 --> 00:19:46,910 Because it provokes me to think in ways I had not 279 00:19:46,910 --> 00:19:49,510 thought before I saw the data. 280 00:19:49,510 --> 00:19:50,900 This is an important lesson. 281 00:19:50,900 --> 00:19:54,740 You don't ignore the data, you try and explain the data. 282 00:19:54,740 --> 00:19:58,200 So when I thought about what happened, I realized that 283 00:19:58,200 --> 00:20:02,290 I start at the origin, and after the first 284 00:20:02,290 --> 00:20:07,950 step I'm equally likely to be here or here. 285 00:20:07,950 --> 00:20:17,260 Once I've taken the first step, now my expected location 286 00:20:17,260 --> 00:20:20,600 is not zero, right? 287 00:20:20,600 --> 00:20:25,580 Before the first step, my expected location was zero. 288 00:20:25,580 --> 00:20:29,550 Because I'm equally likely to be at plus 1 or minus 1. 289 00:20:29,550 --> 00:20:34,810 But once I've taken a step and I happen to be here, my 290 00:20:34,810 --> 00:20:40,820 expected location, if I'm here, is plus 1, it's not 0. 291 00:20:40,820 --> 00:20:43,630 Because half the time I'll get back to 0, but half the time 292 00:20:43,630 --> 00:20:48,100 I'll get to 2. 
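This argument can be checked exactly for a one-dimensional walk of fair plus or minus 1 steps, which is how the east west drunk moves. The sketch below is an illustration rather than anything from the lecture code, and all three function names are hypothetical. It uses the binomial distribution of the final position to show that the expected position stays at the starting point while the expected distance from the origin does not stay at zero.

```python
from fractions import Fraction
from math import comb


def position_distribution(n):
    """Exact distribution of the displacement after n fair +/-1 steps."""
    # k steps east and n - k steps west give displacement 2k - n.
    return {2 * k - n: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}


def expected_position(n, start=0):
    return sum((start + pos) * p for pos, p in position_distribution(n).items())


def expected_distance(n):
    return sum(abs(pos) * p for pos, p in position_distribution(n).items())


print(expected_position(1))            # 0: plus 1 and minus 1 cancel
print(expected_position(10, start=1))  # 1: already at plus 1, the expectation stays there
print(float(expected_distance(2)))     # 1.0: half the time at 0, half the time 2 away
print(float(expected_distance(500)))   # expected distance after 500 steps
```

The signed positions average out to the starting point, but distance takes the absolute value, so excursions east and west no longer cancel.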
293 00:20:48,100 --> 00:20:55,100 So once I have moved away from where I've started, the 294 00:20:55,100 --> 00:21:03,160 expectation of where I'll end up has changed. 295 00:21:03,160 --> 00:21:07,880 This is kind of a profound thing to think about, because 296 00:21:07,880 --> 00:21:12,090 when we look at it, this kind of behavior explains a lot of 297 00:21:12,090 --> 00:21:14,820 surprising results in life. 298 00:21:14,820 --> 00:21:21,620 Where, if you get a small run of random events that take you 299 00:21:21,620 --> 00:21:28,170 far from where you started, it's hard to get back. 300 00:21:28,170 --> 00:21:32,160 So if you think about the stock market, for example, if 301 00:21:32,160 --> 00:21:36,420 you happen to get unlucky and have a few days or weeks or 302 00:21:36,420 --> 00:21:40,490 months, as the case may be, where you get a run of bad 303 00:21:40,490 --> 00:21:44,910 days, it's really hard to get back if all the movements 304 00:21:44,910 --> 00:21:46,770 after that are random. 305 00:21:46,770 --> 00:21:50,710 Because you've established a new baseline. 306 00:21:50,710 --> 00:21:53,240 And as we'll see later, when you start doing things at 307 00:21:53,240 --> 00:21:57,740 random, it's likely you'll get a few runs in one direction or 308 00:21:57,740 --> 00:22:01,940 another, and then you've got a new baseline, and it's 309 00:22:01,940 --> 00:22:04,960 unlikely to get back to where you were originally. 310 00:22:04,960 --> 00:22:07,200 We'll see a lot of examples of this. 311 00:22:07,200 --> 00:22:13,530 But that's what's happening here, and that's why it looks 312 00:22:13,530 --> 00:22:17,140 the way it does. 313 00:22:17,140 --> 00:22:20,450 That makes sense to people? 314 00:22:20,450 --> 00:22:24,430 Well, so I had this theory, as I often do when I have 315 00:22:24,430 --> 00:22:28,650 theories, I don't believe them, so I decided I better 316 00:22:28,650 --> 00:22:34,120 write some more code to check it out. 
317 00:22:34,120 --> 00:22:40,020 So, what I did is, I took the previous simulation and now 318 00:22:40,020 --> 00:22:43,330 instead of looking at different kinds of drunks, I 319 00:22:43,330 --> 00:22:46,990 got more aggressive in thinking about my 320 00:22:46,990 --> 00:22:50,440 analysis of the data. 321 00:22:50,440 --> 00:22:53,590 So again, there's a lesson here. 322 00:22:53,590 --> 00:22:58,900 I decided I wanted to analyze my data in new ways. 323 00:22:58,900 --> 00:23:03,160 And the beauty of it was, I could do that without changing 324 00:23:03,160 --> 00:23:07,912 either any of the code about the drunk, or any of the 325 00:23:07,912 --> 00:23:10,970 code about performing the simulation, really. 326 00:23:10,970 --> 00:23:14,890 All I had to do was do a different kind of analysis, to 327 00:23:14,890 --> 00:23:18,250 a first approximation. 328 00:23:18,250 --> 00:23:26,170 So, let's look at what I've done here. 329 00:23:26,170 --> 00:23:30,990 What I've decided to do here is not only plot, and again, I 330 00:23:30,990 --> 00:23:37,830 think you'll see this code, not only plot how far from the 331 00:23:37,830 --> 00:23:44,300 origin the drunk is over time, but I'm going to actually plot 332 00:23:44,300 --> 00:23:46,870 two other things. 333 00:23:46,870 --> 00:23:51,880 I'm going to use a scatter plot, I think I use scatter in 334 00:23:51,880 --> 00:23:54,870 this one, I do, to show where the drunk is at 335 00:23:54,870 --> 00:23:56,070 each period of time. 336 00:23:56,070 --> 00:23:58,700 Actually, just all the places the drunk 337 00:23:58,700 --> 00:24:00,880 ends up in each trial. 338 00:24:00,880 --> 00:24:04,980 This will give me a visual way to just see, OK, in my, say, 339 00:24:04,980 --> 00:24:08,860 100 trials, how many of the drunks ended up near the origin, how 340 00:24:08,860 --> 00:24:11,160 many of them ended up really far away? 341 00:24:11,160 --> 00:24:16,240 And just allow me to visualize what's going on. 
342 00:24:16,240 --> 00:24:19,390 And then I'm going to also use a histogram to sort of 343 00:24:19,390 --> 00:24:21,400 summarize some things. 344 00:24:21,400 --> 00:24:34,430 To get a look at the distribution of the data. 345 00:24:34,430 --> 00:24:38,620 We've been just looking at average results, but it's 346 00:24:38,620 --> 00:24:41,610 often the case that the average does not really tell 347 00:24:41,610 --> 00:24:44,190 the whole story. 348 00:24:44,190 --> 00:24:46,490 You want to ask, well, how many of the drunks ended up 349 00:24:46,490 --> 00:24:48,930 really far away? 350 00:24:48,930 --> 00:24:51,830 How many of the drunks ended up where they started? 351 00:24:51,830 --> 00:24:55,780 You can't tell that from the averages. 352 00:24:55,780 --> 00:24:59,860 So you use a distribution, which tells you information 353 00:24:59,860 --> 00:25:03,500 about how many people ended up where. 354 00:25:03,500 --> 00:25:07,400 So we again saw that, in a less pleasant context, maybe, 355 00:25:07,400 --> 00:25:10,910 in looking at the quizzes, remember I sent you some plots 356 00:25:10,910 --> 00:25:13,470 about the quizzes, and I didn't just say, 357 00:25:13,470 --> 00:25:16,320 here was the average. 358 00:25:16,320 --> 00:25:20,750 I showed you the distribution of grades. 359 00:25:20,750 --> 00:25:23,970 That allowed you to see, not only where you were relative 360 00:25:23,970 --> 00:25:27,360 to the average, but whether you were an outlier. 361 00:25:27,360 --> 00:25:30,160 Was your grade much better than the average? 362 00:25:30,160 --> 00:25:33,980 Or were there a whole bunch of people around where you are? 363 00:25:33,980 --> 00:25:37,090 So there's a lot of information there, that will 364 00:25:37,090 --> 00:25:41,280 help us understand what's really going on. 365 00:25:41,280 --> 00:25:45,940 So we'll run this now. 
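The two extra views described here, a scatter plot of final locations and a histogram of their distribution, could be sketched as follows. This is a stand-alone illustration, not the lecture's program: `final_locations` and `plot_final_locations` are hypothetical names, the walk is the unbiased one, and histogramming the east/west coordinate is an assumption. It uses `matplotlib.pyplot`, which provides the same `scatter` and `hist` calls as pylab, and imports it inside the plotting function so the data code runs even where matplotlib is not installed.

```python
import random


def final_locations(num_steps, num_trials, seed=0):
    """Return the (x, y) end point of each unbiased walk."""
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    locations = []
    for _ in range(num_trials):
        x = y = 0
        for _ in range(num_steps):
            dx, dy = rng.choice(moves)
            x, y = x + dx, y + dy
        locations.append((x, y))
    return locations


def plot_final_locations(locations):
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    xs = [x for x, _ in locations]
    ys = [y for _, y in locations]

    plt.figure()
    plt.scatter(xs, ys)  # one dot per trial: where that walk ended
    plt.title('Final locations')
    plt.xlabel('East/West of origin')
    plt.ylabel('North/South of origin')

    plt.figure()
    plt.hist(xs)  # how many trials ended in each east/west range
    plt.title('Distribution of final east/west position')
    plt.xlabel('East/West of origin')
    plt.ylabel('Number of trials')
    plt.show()


locations = final_locations(num_steps=500, num_trials=400)
# plot_final_locations(locations)  # uncomment where matplotlib is available
```

Keeping the data generation separate from the plotting mirrors the lecture's point: the analysis can change without touching the simulation.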
366 00:25:45,940 --> 00:25:48,440 Again, there's nothing very interesting here about how I 367 00:25:48,440 --> 00:25:52,200 did it, I just used some of these Pylab dot hist, which 368 00:25:52,200 --> 00:25:55,890 gives me a histogram, Pylab dot scatter, which gives me a 369 00:25:55,890 --> 00:25:57,970 scatter plot. 370 00:25:57,970 --> 00:26:01,350 I'll take 500 steps, 400 trials. 371 00:26:01,350 --> 00:26:04,600 I've done some more trials here, so we get some smoother 372 00:26:04,600 --> 00:26:07,370 lines, better distributions. 373 00:26:07,370 --> 00:26:10,140 And then I'm going to look at the usual drunk and ew drunk 374 00:26:10,140 --> 00:26:24,790 only, just to save some time. 375 00:26:24,790 --> 00:26:31,220 At least I think I am. 376 00:26:31,220 --> 00:26:35,280 My computer runs faster when it's plugged in. 377 00:26:35,280 --> 00:26:37,040 All right, but we have our figures here, 378 00:26:37,040 --> 00:26:40,360 let's look at them. 379 00:26:40,360 --> 00:26:42,960 So here's figure one, the usual drunk. 380 00:26:42,960 --> 00:26:45,880 Well, all right, as we've seen this 1000 381 00:26:45,880 --> 00:26:47,680 times, or at least six. 382 00:26:47,680 --> 00:26:55,000 So doing what it usually does, not quite 20. 383 00:26:55,000 --> 00:26:56,820 This is kind of interesting. 384 00:26:56,820 --> 00:27:00,540 This is the final locations of the 400 trials 385 00:27:00,540 --> 00:27:02,340 of the usual drunk. 386 00:27:02,340 --> 00:27:04,990 So we see some information here, that we really couldn't 387 00:27:04,990 --> 00:27:08,000 get from the averages. 388 00:27:08,000 --> 00:27:12,560 That there are more of these points near the middle, but 389 00:27:12,560 --> 00:27:18,200 there are certainly plenty scattered around. 390 00:27:18,200 --> 00:27:21,880 And we get to see that, they're pretty much symmetric 391 00:27:21,880 --> 00:27:24,760 around where we started. 392 00:27:24,760 --> 00:27:26,770 This is what we would have hoped, right? 
393 00:27:26,770 --> 00:27:31,000 If we had looked at it, and we had seen all the points up 394 00:27:31,000 --> 00:27:34,250 here, or most of them up here, we would have said, wait a 395 00:27:34,250 --> 00:27:38,190 minute, there's something wrong with my code. 396 00:27:38,190 --> 00:27:40,550 This was supposed to not be a biased walk, 397 00:27:40,550 --> 00:27:43,620 but an unbiased walk. 398 00:27:43,620 --> 00:27:47,580 And if I'd seen a bias in this population, I should have 399 00:27:47,580 --> 00:27:50,180 gotten nervous that I had some problems with my 400 00:27:50,180 --> 00:27:51,760 implementation. 401 00:27:51,760 --> 00:27:53,270 My simulation. 402 00:27:53,270 --> 00:27:56,920 So in addition to helping me understand results, this gives 403 00:27:56,920 --> 00:28:00,540 me some confidence that I really am doing an unbiased 404 00:28:00,540 --> 00:28:03,080 walk, right? 405 00:28:03,080 --> 00:28:04,820 Some over here, some over here. 406 00:28:04,820 --> 00:28:09,000 Not exactly the same, but close enough that I feel 407 00:28:09,000 --> 00:28:19,350 pretty comfortable. 408 00:28:19,350 --> 00:28:25,020 And if I look at the distribution, which probably I 409 00:28:25,020 --> 00:28:27,930 could've if I worked really hard, gotten from the scatter 410 00:28:27,930 --> 00:28:34,880 plot, but I really didn't want to, we'll see that most of the 411 00:28:34,880 --> 00:28:39,750 drunks actually do end up pretty close to the origin. 412 00:28:39,750 --> 00:28:43,960 But that there are a few that are pretty far away. 413 00:28:43,960 --> 00:28:49,960 And again, it's close to symmetric around zero. 414 00:28:49,960 --> 00:28:52,590 Which gives me a good feeling that things are working the 415 00:28:52,590 --> 00:28:54,520 way they should. 416 00:28:54,520 --> 00:29:00,100 This, by the way, is what's called a normal distribution. 
417 00:29:00,100 --> 00:29:04,210 Because once upon a time, people believed this was the 418 00:29:04,210 --> 00:29:07,850 way things usually happened. 419 00:29:07,850 --> 00:29:16,030 This is called either normal or Gaussian. 420 00:29:16,030 --> 00:29:18,620 The mathematician Gauss was one of the first people to 421 00:29:18,620 --> 00:29:20,870 write about this distribution. 422 00:29:20,870 --> 00:29:24,320 We'll come back to this later, and among other things, we'll 423 00:29:24,320 --> 00:29:27,770 try and see whether normal is really normal. 424 00:29:27,770 --> 00:29:29,960 But you see it a lot of times. 425 00:29:29,960 --> 00:29:34,410 And basically what it says is that most of the values hang 426 00:29:34,410 --> 00:29:38,360 out around the average, the middle. 427 00:29:38,360 --> 00:29:42,190 And there are a few outliers, and as we get further and 428 00:29:42,190 --> 00:29:49,260 further out, fewer and fewer points occur. 429 00:29:49,260 --> 00:29:52,120 So you'll see these kind of curves, of distributions, 430 00:29:52,120 --> 00:29:53,390 occurring a lot. 431 00:29:53,390 --> 00:29:56,680 Again, we'll come back to distributions in a week, or 432 00:29:56,680 --> 00:29:58,810 actually in a lecture or two. 433 00:29:58,810 --> 00:30:03,690 See a little bit more today, and then more later on. 434 00:30:03,690 --> 00:30:13,800 All right, here's our east west drunk, again as before, 435 00:30:13,800 --> 00:30:18,470 drifting off to around 18. 436 00:30:18,470 --> 00:30:23,730 Well, look at these dots. 437 00:30:23,730 --> 00:30:26,820 Well, this makes me feel, that at least I know that the drunk 438 00:30:26,820 --> 00:30:31,250 is only moving east and west. And again you'll see, pretty 439 00:30:31,250 --> 00:30:35,640 dense in the middle and a few outliers. 440 00:30:35,640 --> 00:30:38,730 Managed to get a little further east than west, but 441 00:30:38,730 --> 00:30:41,520 that happens. 
442 00:30:41,520 --> 00:30:47,390 East is always more attractive. 443 00:30:47,390 --> 00:30:49,810 What we might have guessed. 444 00:30:49,810 --> 00:30:56,410 And if we look at the distribution, again we can 445 00:30:56,410 --> 00:30:59,480 look at it, and I'm just giving you the east west 446 00:30:59,480 --> 00:31:07,160 values here, and again it's about the same. 447 00:31:07,160 --> 00:31:11,940 So, what do I want you to take away from these graphs? 448 00:31:11,940 --> 00:31:15,680 Different plots show you different things. 449 00:31:15,680 --> 00:31:18,190 So I'm not going to try and claim that of these three 450 00:31:18,190 --> 00:31:22,170 different plots we did, one is the right one to do. 451 00:31:22,170 --> 00:31:24,000 What I'm going to assert is that you want 452 00:31:24,000 --> 00:31:26,930 to do all of them. 453 00:31:26,930 --> 00:31:31,280 Try and visualize the data in multiple ways. 454 00:31:31,280 --> 00:31:36,810 And sometimes it's much easier to, even if different plots 455 00:31:36,810 --> 00:31:41,240 contain exactly the same information, and in some sense 456 00:31:41,240 --> 00:31:45,500 the scatter plot and the histogram do contain exactly 457 00:31:45,500 --> 00:31:47,320 the same information. 458 00:31:47,320 --> 00:31:49,580 In fact, there's more information in the scatter 459 00:31:49,580 --> 00:31:53,630 plot, in some sense, because I see each one individually. 460 00:31:53,630 --> 00:31:57,520 But it's, for me, a lot harder to see. 461 00:31:57,520 --> 00:32:00,090 If I'd asked you from the scatter plot, are they 462 00:32:00,090 --> 00:32:02,560 normally distributed? 463 00:32:02,560 --> 00:32:04,920 You might have had a hard time answering it. 464 00:32:04,920 --> 00:32:09,180 Particularly for the usual drunk, where you just saw a 465 00:32:09,180 --> 00:32:10,760 lot of points.
466 00:32:10,760 --> 00:32:13,340 So the idea of summarizing it in the 467 00:32:13,340 --> 00:32:18,680 histogram makes it much easier to make sense of the data, and 468 00:32:18,680 --> 00:32:21,600 see what you're doing, and to try and learn the lesson 469 00:32:21,600 --> 00:32:32,260 you're trying to learn. 470 00:32:32,260 --> 00:32:37,560 OK, this makes sense to people? 471 00:32:37,560 --> 00:32:40,700 All right, let's move on. 472 00:32:40,700 --> 00:32:48,040 Now we'll go back to this lovely picture we had before. 473 00:32:48,040 --> 00:32:51,380 Any of you ever play the game Chutes and Ladders? 474 00:32:51,380 --> 00:32:53,050 Raise your hand? 475 00:32:53,050 --> 00:32:57,790 OK, some things never go out of style. 476 00:32:57,790 --> 00:33:04,070 Well, imagine your poor drunk wandering through this board, 477 00:33:04,070 --> 00:33:09,430 and every once in a while, because I'm kind of a, not a 478 00:33:09,430 --> 00:33:12,620 nice person, I've eliminated all the ladders, and have only 479 00:33:12,620 --> 00:33:13,220 some chutes. 480 00:33:13,220 --> 00:33:15,980 And every once in a while the drunk hits a 481 00:33:15,980 --> 00:33:18,770 chute and goes whoosh. 482 00:33:18,770 --> 00:33:26,560 And in fact, let's, for the sake of simplicity here, 483 00:33:26,560 --> 00:33:32,230 assume that the chutes are going to, every once in a 484 00:33:32,230 --> 00:33:35,970 while, the drunk will hit a bad spot, and go right back to 485 00:33:35,970 --> 00:33:37,300 the origin. 486 00:33:37,300 --> 00:33:39,470 So this poor, say, heat-seeking drunk who's 487 00:33:39,470 --> 00:33:44,260 trying to head south, does it for a while and maybe gets 488 00:33:44,260 --> 00:33:48,230 zipped right back to where he or she started. 489 00:33:48,230 --> 00:33:52,440 All right, I've now told you I want to put 490 00:33:52,440 --> 00:33:55,740 some, make this happen. 491 00:33:55,740 --> 00:34:01,900 What part of the code do you think I should change?
492 00:34:01,900 --> 00:34:02,400 Somebody? 493 00:34:02,400 --> 00:34:04,090 STUDENT: The field. 494 00:34:04,090 --> 00:34:11,360 PROFESSOR: The field, absolutely. 495 00:34:11,360 --> 00:34:14,280 Those are good to throw short distances. 496 00:34:14,280 --> 00:34:17,760 The field, right, because what we're talking about here is 497 00:34:17,760 --> 00:34:21,420 not a property of the drunk, or a property of the 498 00:34:21,420 --> 00:34:26,120 simulation, but a property of the space in which the drunk 499 00:34:26,120 --> 00:34:30,000 is wandering. 500 00:34:30,000 --> 00:34:33,210 So again, we see that by structuring the initial 501 00:34:33,210 --> 00:34:38,370 program around the natural abstractions that occur in the 502 00:34:38,370 --> 00:34:46,470 problem, natural kinds of changes can be localized. 503 00:34:46,470 --> 00:34:52,860 So let's go and change the field. 504 00:34:52,860 --> 00:34:57,570 So I'm going to have the field I had before plus this thing 505 00:34:57,570 --> 00:35:03,140 called an odd field. 506 00:35:03,140 --> 00:35:06,850 Odd field will be a subclass of field. 507 00:35:06,850 --> 00:35:08,790 This is not odd as in odd or even, this 508 00:35:08,790 --> 00:35:12,770 is odd as in strange. 509 00:35:12,770 --> 00:35:19,850 So I'm going to first define where my chutes are. 510 00:35:19,850 --> 00:35:29,540 So is chute will take, will get the coordinates, and 511 00:35:29,540 --> 00:35:32,650 assign it to x and y, so it gets the coordinates of a 512 00:35:32,650 --> 00:35:37,220 place, and then will return abs x minus abs 513 00:35:37,220 --> 00:35:39,680 y is equal to zero. 514 00:35:39,680 --> 00:35:45,780 So, where are my chutes going to occur? 515 00:35:45,780 --> 00:35:53,980 Somebody? 516 00:35:53,980 --> 00:36:00,130 Yeah, kind of radiating out from the origin. 517 00:36:00,130 --> 00:36:06,720 Any place, so I'll have my origin, here. 518 00:36:06,720 --> 00:36:08,200 Got my graph. 
519 00:36:08,200 --> 00:36:13,860 And all along here, any place x and y are equal, 520 00:36:13,860 --> 00:36:18,180 there'll be a chute. 521 00:36:18,180 --> 00:36:21,590 Now, I have to be a little bit careful that it doesn't happen 522 00:36:21,590 --> 00:36:27,110 here, or we won't get anywhere. 523 00:36:27,110 --> 00:36:32,400 So, and then move, in odd field, will call field dot 524 00:36:32,400 --> 00:36:38,530 move, which is unchanged from what it was before. 525 00:36:38,530 --> 00:36:43,130 And then it will say, if self dot is chute, set the location 526 00:36:43,130 --> 00:36:45,690 back to 0,0. 527 00:36:45,690 --> 00:36:50,120 So the poor drunk will move as before, and if he or she hits 528 00:36:50,120 --> 00:36:58,220 this wormhole, get instantly teleported back to the origin. 529 00:36:58,220 --> 00:37:01,910 Very small change, but it will give us rather different 530 00:37:01,910 --> 00:37:08,410 behaviors, at least I think it will. 531 00:37:08,410 --> 00:37:17,460 And then if we look at how we use it, not much changes. 532 00:37:17,460 --> 00:37:20,670 Except, what do you think is going to have to change? 533 00:37:20,670 --> 00:37:23,420 Where will I have to make the change, somebody? 534 00:37:23,420 --> 00:37:27,840 You can actually see the code here. 535 00:37:27,840 --> 00:37:31,710 I'll have to get a different field, right? 536 00:37:31,710 --> 00:37:35,880 So, at the place where I instantiate a field, instead 537 00:37:35,880 --> 00:37:40,600 of instantiating a field, I'll instantiate an odd field. 538 00:37:40,600 --> 00:37:44,130 Now if I'd had 16 different kinds of odd fields, or even 539 00:37:44,130 --> 00:37:49,500 two, I might well have done what I did with drunk. 540 00:37:49,500 --> 00:37:53,860 But in order to sort of minimize things, I've done 541 00:37:53,860 --> 00:37:55,040 something much smaller. 542 00:37:55,040 --> 00:37:57,400 So let's see, where did we get the field? 
543 00:37:57,400 --> 00:37:59,780 I don't even remember. 544 00:37:59,780 --> 00:38:03,140 So, here's what I do when I'm looking at my program and I 545 00:38:03,140 --> 00:38:06,530 want to see where something is. 546 00:38:06,530 --> 00:38:13,090 I use the text editor, and I'm going to search for it. 547 00:38:13,090 --> 00:38:24,500 Let's see. 548 00:38:24,500 --> 00:38:31,320 Well, it's not there. 549 00:38:31,320 --> 00:38:33,710 There it is. 550 00:38:33,710 --> 00:38:37,280 So here when I get my field, I get my drunk, and then I get a 551 00:38:37,280 --> 00:38:44,280 field which is an odd field. 552 00:38:44,280 --> 00:38:46,460 All right, anyone want to speculate what will happen 553 00:38:46,460 --> 00:38:54,170 when I run this one? 554 00:38:54,170 --> 00:38:59,150 Think the drunk will be closer or further from the origin 555 00:38:59,150 --> 00:39:00,870 when we're done? 556 00:39:00,870 --> 00:39:03,050 Who thinks closer? 557 00:39:03,050 --> 00:39:05,120 Who thinks further? 558 00:39:05,120 --> 00:39:07,720 Who thinks no difference? 559 00:39:07,720 --> 00:39:09,920 Well, you're right, I mean it's sort of logical if we 560 00:39:09,920 --> 00:39:12,180 think about it, that if every once in a while you get 561 00:39:12,180 --> 00:39:15,220 zipped back, it's going to be harder to get further away. 562 00:39:15,220 --> 00:39:17,890 We'll see different behaviors, by the way, it depends on the 563 00:39:17,890 --> 00:39:19,200 drunk, right? 564 00:39:19,200 --> 00:39:23,832 That's true for the usual drunk, but maybe not for the 565 00:39:23,832 --> 00:39:27,740 east west drunk. 566 00:39:27,740 --> 00:39:29,590 We can see what happens. 567 00:39:29,590 --> 00:39:33,900 So let's try some. 568 00:39:33,900 --> 00:39:36,030 We'll do it for the usual drunk, which is the more 569 00:39:36,030 --> 00:39:58,030 interesting case to start with.
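The odd field described above can be sketched roughly as follows. This is a minimal sketch, not the code shown in lecture: the class and method names (Location, Field, OddField, is_chute, walk) and the unit-step walk are assumptions made for illustration. It shows the two points made so far: the chute logic lives entirely in a subclass of the field, and the only change at the point of use is which field gets instantiated.

```python
import random


class Location:
    """An (x, y) point on the plane (hypothetical helper class)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def move(self, dx, dy):
        return Location(self.x + dx, self.y + dy)

    def get_coords(self):
        return self.x, self.y


class Field:
    """Tracks the current location of one drunk (assumed interface)."""

    def __init__(self, start):
        self.loc = start

    def move(self, dx, dy):
        self.loc = self.loc.move(dx, dy)


class OddField(Field):
    """A field whose diagonals are chutes that lead back to the origin."""

    def is_chute(self):
        x, y = self.loc.get_coords()
        # A chute sits wherever |x| == |y|. The origin is excluded
        # explicitly; with this move() the guard is harmless either way.
        return abs(x) - abs(y) == 0 and (x, y) != (0, 0)

    def move(self, dx, dy):
        Field.move(self, dx, dy)          # the ordinary move, unchanged
        if self.is_chute():
            self.loc = Location(0, 0)     # teleport back to the origin


def walk(field, steps, rng):
    """Take unbiased unit steps (the 'usual drunk') on the given field."""
    for _ in range(steps):
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        field.move(dx, dy)
    return field.loc.get_coords()


if __name__ == "__main__":
    rng = random.Random(0)
    # The only change at the point of use: OddField instead of Field.
    print(walk(Field(Location(0, 0)), 500, rng))
    print(walk(OddField(Location(0, 0)), 500, rng))
```

Because the chute check runs after every move, a walk on the odd field can never finish on a diagonal other than at the origin itself.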
570 00:39:58,030 --> 00:40:01,670 So you'll see whereas before we were up just shy of 20 most 571 00:40:01,670 --> 00:40:06,140 of the time, here we don't get very much further. 572 00:40:06,140 --> 00:40:13,040 We do get steadily further, but at a lot slower pace. 573 00:40:13,040 --> 00:40:16,220 What do you think it means that, before you remember, we 574 00:40:16,220 --> 00:40:17,830 saw a pretty smooth line. 575 00:40:17,830 --> 00:40:21,200 Here we see what may look to you like a fat line, but is in 576 00:40:21,200 --> 00:40:25,660 fact a line with a lot of wiggles in it. 577 00:40:25,660 --> 00:40:30,680 What does that imply, about what's going on in the 578 00:40:30,680 --> 00:40:36,430 simulation, the fact that the line is jagged rather than smooth 579 00:40:36,430 --> 00:40:41,490 as it was before? 580 00:40:41,490 --> 00:40:43,390 Pardon? 581 00:40:43,390 --> 00:40:44,070 Can't hear you? 582 00:40:44,070 --> 00:40:44,470 STUDENT: That it jumps. 583 00:40:44,470 --> 00:40:50,320 PROFESSOR: Well, certainly it happens because of the jumps, 584 00:40:50,320 --> 00:40:51,680 but what else is going on? 585 00:40:51,680 --> 00:40:55,910 But what does it, sort of, imply more, whoa, sorry about that, 586 00:40:55,910 --> 00:40:57,410 more generally? 587 00:40:57,410 --> 00:41:01,190 Right, what it's saying, is that because of the jumps, 588 00:41:01,190 --> 00:41:07,190 there's a lot more variation from trial to trial. 589 00:41:07,190 --> 00:41:09,860 Not surprising, right, because sometimes you hit these 590 00:41:09,860 --> 00:41:12,480 wormholes and sometimes you don't. 591 00:41:12,480 --> 00:41:15,330 So let's look at this other figure. 592 00:41:15,330 --> 00:41:18,460 This is kind of interesting. 593 00:41:18,460 --> 00:41:24,090 So here's the scatter plot.
594 00:41:24,090 --> 00:41:37,070 And if we zoom in on it, what we see here is, there are 595 00:41:37,070 --> 00:41:41,710 essentially holes, you know, these alleys 596 00:41:41,710 --> 00:41:46,410 where no points lie. 597 00:41:46,410 --> 00:41:50,380 Just as we might have guessed from here. 598 00:41:50,380 --> 00:41:52,940 The only way you would get a point lying on these alleys, 599 00:41:52,940 --> 00:41:55,640 if it happened to be the very last step of the simulation, 600 00:41:55,640 --> 00:41:58,740 not even then, because the last thing we do is go back to 601 00:41:58,740 --> 00:42:00,540 the origin. 602 00:42:00,540 --> 00:42:04,840 So we can look at this, and say, well, sure enough, this 603 00:42:04,840 --> 00:42:15,790 is keeping people off of a certain part of the field. 604 00:42:15,790 --> 00:42:20,110 All right, some of you will be happy to know we are, for the 605 00:42:20,110 --> 00:42:22,590 moment, actually maybe for the whole rest of the term, 606 00:42:22,590 --> 00:42:24,010 leaving the notion of drunks. 607 00:42:24,010 --> 00:42:26,880 Though not of random walks. 608 00:42:26,880 --> 00:42:30,260 All right, any questions about what's going on here? 609 00:42:30,260 --> 00:42:32,970 And again, I would urge you to sort of study the code and the 610 00:42:32,970 --> 00:42:35,910 way it's been factored to accomplish these things. 611 00:42:35,910 --> 00:42:38,370 And play with it, and look at what you get from the 612 00:42:38,370 --> 00:42:40,870 different plots. 613 00:42:40,870 --> 00:42:44,620 I now want to pull back from this specific example, and 614 00:42:44,620 --> 00:42:50,310 spend a few minutes talking about simulation in general. 615 00:42:50,310 --> 00:42:55,720 Computer simulation really grew hand in hand with the 616 00:42:55,720 --> 00:42:59,400 development of the computers, from the very beginning. 
617 00:42:59,400 --> 00:43:02,520 The first large-scale deployment of computer 618 00:43:02,520 --> 00:43:15,720 simulation was as part of the Manhattan Project. 619 00:43:15,720 --> 00:43:19,600 This was done during the war to model the process of 620 00:43:19,600 --> 00:43:29,380 nuclear detonation. 621 00:43:29,380 --> 00:43:33,840 And, in fact, what they did was a simulation of 12 hard 622 00:43:33,840 --> 00:43:37,110 spheres, and what would happen when they would 623 00:43:37,110 --> 00:43:38,960 bump into each other. 624 00:43:38,960 --> 00:43:43,300 It was a huge step forward to do it with simulation, since 625 00:43:43,300 --> 00:43:47,110 the whole project had been stalled by their attempt to do 626 00:43:47,110 --> 00:43:48,830 this analytically. 627 00:43:48,830 --> 00:43:51,050 They were unable to actually solve the problem 628 00:43:51,050 --> 00:43:52,480 analytically. 629 00:43:52,480 --> 00:43:55,120 And it was only when they hit upon the notion of using a 630 00:43:55,120 --> 00:43:58,610 computer, a relatively new tool in those days, to 631 00:43:58,610 --> 00:44:00,690 simulate it, that they were able to get the 632 00:44:00,690 --> 00:44:02,700 answers they needed. 633 00:44:02,700 --> 00:44:06,910 And they did it using something called a Monte Carlo 634 00:44:06,910 --> 00:44:19,180 simulation. 635 00:44:19,180 --> 00:44:22,010 Now in fact, that's just what we've been doing with the 636 00:44:22,010 --> 00:44:24,960 random walk, the Monte Carlo simulation, and I'll come back 637 00:44:24,960 --> 00:44:27,160 to that in a minute. 638 00:44:27,160 --> 00:44:32,940 In general, the thing to think about, is we use simulation 639 00:44:32,940 --> 00:44:39,500 when we really can't get a closed form analytic solution. 640 00:44:39,500 --> 00:44:42,750 If you can put down a system of equations and easily solve 641 00:44:42,750 --> 00:44:47,510 it to get an answer, that's usually the right thing to do. 
642 00:44:47,510 --> 00:44:50,780 But when we can't easily do that, the right thing to do is 643 00:44:50,780 --> 00:44:54,940 typically to fall back on simulation. 644 00:44:54,940 --> 00:44:57,930 Sometimes even when we can do it analytically, simulation 645 00:44:57,930 --> 00:45:05,340 has some advantages, as we'll be seeing as we go forward. 646 00:45:05,340 --> 00:45:07,970 What we're typically doing when we're simulating 647 00:45:07,970 --> 00:45:31,960 anything, is we're attempting to generate a sample of 648 00:45:31,960 --> 00:45:43,320 representative scenarios. 649 00:45:43,320 --> 00:45:47,010 Because an exhaustive enumeration of all possible 650 00:45:47,010 --> 00:45:50,330 states would be impossible. 651 00:45:50,330 --> 00:45:53,550 So again, sometimes if you can't solve it analytically, 652 00:45:53,550 --> 00:45:57,530 you can exhaustively enumerate the space and then see 653 00:45:57,530 --> 00:45:59,280 what's going on. 654 00:45:59,280 --> 00:46:01,210 But again, usually you can't. 655 00:46:01,210 --> 00:46:05,150 And so we look for a sample, and the key is, it's gotta be 656 00:46:05,150 --> 00:46:07,960 representative. 657 00:46:07,960 --> 00:46:11,570 Of what we would get in reality. 658 00:46:11,570 --> 00:46:14,500 We'll see an example of that shortly. 659 00:46:14,500 --> 00:46:20,120 So simulation attempts to build an experimental device 660 00:46:20,120 --> 00:46:25,890 that will act like the real system in important aspects. 661 00:46:25,890 --> 00:46:29,170 So I always think of a simulation as an 662 00:46:29,170 --> 00:46:32,180 experimental device. 663 00:46:32,180 --> 00:46:37,440 And every time I run it, I'm running an experiment designed 664 00:46:37,440 --> 00:46:41,430 to give me some information about the real world. 665 00:46:41,430 --> 00:46:45,110 These things are enormously popular, so if you were to get 666 00:46:45,110 --> 00:46:52,690 on Google and look at simulation.
667 00:46:52,690 --> 00:47:00,540 So, for example, we could Google simulation finance. 668 00:47:00,540 --> 00:47:06,230 You know, we see we get about 3,000,000 hits. 669 00:47:06,230 --> 00:47:13,600 We can do biology. 670 00:47:13,600 --> 00:47:16,360 We get twice as many hits, showing the relative 671 00:47:16,360 --> 00:47:21,770 importance in the world of biology and finance. 672 00:47:21,770 --> 00:47:25,150 For all you Course 7 majors, we should make sure that we 673 00:47:25,150 --> 00:47:27,650 compare biology to physics. 674 00:47:27,650 --> 00:47:31,280 Oh dear, we'll see that physics is even more important 675 00:47:31,280 --> 00:47:35,170 than biology. 676 00:47:35,170 --> 00:47:38,320 And I know we have a lot of Course 2 students in here, so 677 00:47:38,320 --> 00:47:43,310 let's try mechanical. 678 00:47:43,310 --> 00:47:45,520 Oh, lot of those, too. 679 00:47:45,520 --> 00:47:48,680 And if I did baseball, or football, we'd get more than 680 00:47:48,680 --> 00:47:50,720 any of them. 681 00:47:50,720 --> 00:47:54,110 Showing the real importance of things in the world. 682 00:47:54,110 --> 00:48:00,570 As we look at simulations, I want you to keep in mind that 683 00:48:00,570 --> 00:48:12,930 they are typically descriptive not prescriptive. 684 00:48:12,930 --> 00:48:18,230 So by that I mean, they describe a situation, they 685 00:48:18,230 --> 00:48:22,590 don't tell you what the answer is. 686 00:48:22,590 --> 00:48:28,790 So another way to think about this is, a simulation is not 687 00:48:28,790 --> 00:48:31,370 an optimization procedure. 688 00:48:31,370 --> 00:48:35,180 When we looked at optimization, that was 689 00:48:35,180 --> 00:48:35,840 prescriptive. 690 00:48:35,840 --> 00:48:41,280 We ran an optimization algorithm, and it gave us the 691 00:48:41,280 --> 00:48:45,010 best possible solution.
692 00:48:45,010 --> 00:48:48,420 A simulation typically doesn't give you this best possible 693 00:48:48,420 --> 00:48:53,150 solution, but if you give it the starting point, it will 694 00:48:53,150 --> 00:48:58,780 tell you the consequences of that. 695 00:48:58,780 --> 00:49:01,930 Now, can we use simulation to do optimization? 696 00:49:01,930 --> 00:49:03,780 Absolutely. 697 00:49:03,780 --> 00:49:07,440 For example, we can use it to do guess and check. 698 00:49:07,440 --> 00:49:10,510 Where we guess a, probably not to get a truly optimal 699 00:49:10,510 --> 00:49:15,950 solution, but to get a good solution, we can guess various 700 00:49:15,950 --> 00:49:18,880 possibilities, simulate them, and see. 701 00:49:18,880 --> 00:49:21,750 People do that all the time. 702 00:49:21,750 --> 00:49:27,120 If they want to see what's the right number of checkout 703 00:49:27,120 --> 00:49:30,920 counters at the supermarket, or what's the right airline 704 00:49:30,920 --> 00:49:32,540 schedule to use? 705 00:49:32,540 --> 00:49:35,803 They'll make a guess, they'll simulate it, and they'll say 706 00:49:35,803 --> 00:49:39,980 all right, this was better than this, so we'll choose it. 707 00:49:39,980 --> 00:49:42,170 All right, I'm going to stop here. 708 00:49:42,170 --> 00:49:45,920 Next time we're going to get more deeply into the actual 709 00:49:45,920 --> 00:49:49,720 stochastics, the probability distributions, and start 710 00:49:49,720 --> 00:49:52,580 understanding what's going on under the covers.
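The guess-and-check use of simulation mentioned above, applied to the checkout-counter question, might look something like the sketch below. Everything specific in it is an assumption chosen for illustration rather than anything from the lecture: the function names, exponentially distributed arrival gaps averaging one minute, service times averaging 2.5 minutes, and the rule that each customer goes to whichever counter frees up first. The simulation itself is only descriptive: given a number of counters, it reports the average wait. The loop around it supplies the guesses.

```python
import random


def average_wait(num_counters, num_customers, rng):
    """Simulate one run and return the mean time customers spend waiting."""
    free_at = [0.0] * num_counters   # time at which each counter is next free
    clock = 0.0
    total_wait = 0.0
    for _ in range(num_customers):
        clock += rng.expovariate(1.0)            # assumed mean arrival gap: 1
        # Send the customer to the counter that frees up earliest.
        i = min(range(num_counters), key=free_at.__getitem__)
        start = max(clock, free_at[i])
        total_wait += start - clock
        free_at[i] = start + rng.expovariate(1 / 2.5)   # assumed mean service: 2.5
    return total_wait / num_customers


def guess_and_check(candidates, target_wait, trials=20, seed=0):
    """Try each guess in turn; return the first one that meets the target."""
    for n in candidates:
        rng = random.Random(seed)    # same random stream for every guess
        mean = sum(average_wait(n, 2000, rng) for _ in range(trials)) / trials
        if mean <= target_wait:
            return n, mean
    return None


if __name__ == "__main__":
    print(guess_and_check(range(1, 8), target_wait=1.0))
```

The result is a good answer rather than a provably optimal one: it is only as trustworthy as the assumptions built into the simulated arrivals and service times, and it only considers the guesses it was handed.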