1 00:00:00,040 --> 00:00:02,460 The following content is provided under a Creative 2 00:00:02,460 --> 00:00:03,870 Commons license. 3 00:00:03,870 --> 00:00:06,910 Your support will help MIT OpenCourseWare continue to 4 00:00:06,910 --> 00:00:10,560 offer high quality educational resources for free. 5 00:00:10,560 --> 00:00:13,460 To make a donation or view additional materials from 6 00:00:13,460 --> 00:00:18,440 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:18,440 --> 00:00:23,200 ocw.mit.edu 8 00:00:23,200 --> 00:00:25,690 PROFESSOR: Good morning. 9 00:00:25,690 --> 00:00:29,010 I want to start today's lecture with, I guess one 10 00:00:29,010 --> 00:00:30,540 could call it, a confession. 11 00:00:30,540 --> 00:00:34,650 This isn't a real 6.00 lecture. 12 00:00:34,650 --> 00:00:39,390 I'm standing here on June 30th in an empty classroom dressed, 13 00:00:39,390 --> 00:00:41,920 for some bizarre reason, as if it's the winter. 14 00:00:41,920 --> 00:00:47,070 I guess it's what the video folks would call continuity. 15 00:00:47,070 --> 00:00:51,110 What happened is Professor Grimson gave a beautiful 16 00:00:51,110 --> 00:00:58,350 lecture 13, for which we have a lovely picture and no sound. 17 00:00:58,350 --> 00:01:01,930 We decided that probably those of you watching this in 18 00:01:01,930 --> 00:01:05,360 OpenCourseWare would be unhappy just to watch 19 00:01:05,360 --> 00:01:09,040 Professor Grimson and not hear him, so we're now re-taping 20 00:01:09,040 --> 00:01:10,210 the lecture. 21 00:01:10,210 --> 00:01:12,400 This is, actually, one of two lectures we're going to be 22 00:01:12,400 --> 00:01:16,380 re-taping because we had video problems. 23 00:01:16,380 --> 00:01:19,110 So when I say something hilariously funny, and there 24 00:01:19,110 --> 00:01:21,090 is no laughter in the room, it's 25 00:01:21,090 --> 00:01:23,860 because the room is empty. 
26 00:01:23,860 --> 00:01:28,183 Do me a favor and laugh out there in video-land, so maybe 27 00:01:28,183 --> 00:01:30,810 at least someone will respond. 28 00:01:30,810 --> 00:01:38,090 OK, the last lecture closed with a slight mystery, not a 29 00:01:38,090 --> 00:01:40,190 great mystery, but a little mystery. 30 00:01:40,190 --> 00:01:44,110 We ran our simulation of the drunkard's walk and got a 31 00:01:44,110 --> 00:01:46,470 result that wasn't credible. 32 00:01:46,470 --> 00:01:49,070 How did we know it wasn't credible? 33 00:01:49,070 --> 00:01:52,840 Well we had worked out some details on the blackboard of 34 00:01:52,840 --> 00:01:56,250 what we thought would happen with a small number of steps, 35 00:01:56,250 --> 00:01:59,130 and the results we got didn't match that. 36 00:01:59,130 --> 00:02:01,530 That told us we had something wrong. 37 00:02:01,530 --> 00:02:04,200 We asked you to go think about it and come back today 38 00:02:04,200 --> 00:02:06,840 prepared to tell us what was wrong. 39 00:02:06,840 --> 00:02:09,550 Well since there is no one here to ask, I'm not going to 40 00:02:09,550 --> 00:02:14,250 ask the class to fix it, but instead I have had the 41 00:02:14,250 --> 00:02:17,390 laborious task of fixing it myself. 42 00:02:17,390 --> 00:02:18,790 So let's look at it. 43 00:02:18,790 --> 00:02:21,440 The problem was in SimWalk. 44 00:02:21,440 --> 00:02:24,140 And what we had is we had the wrong argument here. 45 00:02:24,140 --> 00:02:27,580 We had numTrials instead of numSteps. 46 00:02:27,580 --> 00:02:30,040 And so it didn't make any sense. 47 00:02:30,040 --> 00:02:33,360 I've now fixed it, and now I want to run it. 48 00:02:33,360 --> 00:02:37,540 Well clearly, what I should do to begin with, is run it on 49 00:02:37,540 --> 00:02:41,970 some examples for which we already know the answer. 
50 00:02:41,970 --> 00:02:46,410 So I'll change drunkTest to instead of running it on large 51 00:02:46,410 --> 00:02:51,770 numbers of steps, run it on just a few. 52 00:02:51,770 --> 00:02:53,150 And let's see what we get. 53 00:02:59,976 --> 00:03:03,950 We run a lot of trials here, since it's a short trial. 54 00:03:03,950 --> 00:03:09,540 And what we see is for 100 trials of 0 steps, the mean 55 00:03:09,540 --> 00:03:12,960 was 0, the max was 0, the min was 0. 56 00:03:12,960 --> 00:03:15,180 Well that's exactly what we saw when we looked 57 00:03:15,180 --> 00:03:16,060 at it on the board. 58 00:03:16,060 --> 00:03:17,560 It should happen. 59 00:03:17,560 --> 00:03:19,770 And the same thing with 1. 60 00:03:19,770 --> 00:03:23,106 Everything worked the way it was supposed to work. 61 00:03:23,106 --> 00:03:25,680 It doesn't tell us the program is perfect. 62 00:03:25,680 --> 00:03:29,040 It does tell us it works in at least two examples, which was 63 00:03:29,040 --> 00:03:31,826 better than we had last time. 64 00:03:31,826 --> 00:03:35,480 All right now let's look at it on a larger set of examples. 65 00:03:38,090 --> 00:03:41,580 I think I won't run 100 trials here, because it'll take a 66 00:03:41,580 --> 00:03:42,960 little longer than we want. 67 00:03:45,630 --> 00:03:46,944 So let's run maybe 20. 68 00:03:52,268 --> 00:03:55,750 Let's see what we get. 69 00:03:55,750 --> 00:03:57,000 Well it's running through here. 70 00:03:57,000 --> 00:04:01,530 Now we're getting examples that seem much more credible. 71 00:04:01,530 --> 00:04:06,350 When we take 10 steps, the mean is 2.85, and the max is 72 00:04:06,350 --> 00:04:10,660 between 6 and 1, or the max is 6, the min is 1. 73 00:04:10,660 --> 00:04:12,420 It's what we would hope for, that there's 74 00:04:12,420 --> 00:04:14,110 some dispersion there. 
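The sanity check described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the course's code: `walk`, `sim_walk`, and `drunk_test` are hypothetical stand-ins for the SimWalk and drunkTest mentioned in the lecture, and the walk is assumed to take unit steps north, south, east, or west on a grid.

```python
import math
import random

# Assumed step set: one unit north, south, east, or west.
STEPS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def walk(num_steps):
    """Take num_steps random unit steps from the origin; return the final distance."""
    x, y = 0, 0
    for _ in range(num_steps):
        dx, dy = random.choice(STEPS)
        x, y = x + dx, y + dy
    return math.hypot(x, y)


def sim_walk(num_steps, num_trials):
    """Run num_trials independent walks of num_steps steps each.

    The kind of bug described in the lecture corresponds to passing the
    trial count where the step count belongs, i.e. walk(num_trials) here.
    """
    return [walk(num_steps) for _ in range(num_trials)]


def drunk_test(step_counts, num_trials):
    """Return {num_steps: (mean, max, min)} of the final distances."""
    results = {}
    for num_steps in step_counts:
        distances = sim_walk(num_steps, num_trials)
        mean = sum(distances) / len(distances)
        results[num_steps] = (mean, max(distances), min(distances))
    return results


# Start with cases whose answers are known in advance: 0 steps and 1 step.
for num_steps, (mean, largest, smallest) in drunk_test([0, 1], 100).items():
    print(num_steps, "steps: mean", mean, "max", largest, "min", smallest)
```

With 0 steps every walk ends at distance 0, and with 1 step every walk ends at distance 1, so any other output would signal a bug before moving on to larger step counts.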
75 00:04:14,110 --> 00:04:18,720 And as we get up higher, we see that for a 100,000 steps, 76 00:04:18,720 --> 00:04:22,932 the mean is 248, and there's quite a spread between the max 77 00:04:22,932 --> 00:04:25,870 and the min. 78 00:04:25,870 --> 00:04:30,380 Finally we get to look at the question that had us writing 79 00:04:30,380 --> 00:04:33,710 this code in the first place-- 80 00:04:33,710 --> 00:04:37,700 how far should we expect this drunk to be, given a 81 00:04:37,700 --> 00:04:39,530 particular amount of time. 82 00:04:39,530 --> 00:04:42,790 Well we can look at these numbers and try and think 83 00:04:42,790 --> 00:04:47,840 about them in our heads, but in fact, it's a lot easier to 84 00:04:47,840 --> 00:04:49,430 look at a picture. 85 00:04:49,430 --> 00:04:52,130 And this will get us to a theme that we'll be getting to 86 00:04:52,130 --> 00:04:58,100 shortly in the course of how do we visualize data. 87 00:04:58,100 --> 00:05:01,560 So here we have a visualization. 88 00:05:01,560 --> 00:05:06,030 What I'm plotting, here, is the mean number of the mean 89 00:05:06,030 --> 00:05:12,020 distance against the number of steps for this random walk. 90 00:05:12,020 --> 00:05:17,380 And what we can see is, as we knew, at 0, it's 0, and then 91 00:05:17,380 --> 00:05:19,180 it sort of goes up. 92 00:05:19,180 --> 00:05:20,735 And I've taken it only up to a 1,000. 93 00:05:20,735 --> 00:05:22,880 And at 1,000 we're somewhere, it looks 94 00:05:22,880 --> 00:05:25,335 like, around 25 steps. 95 00:05:28,160 --> 00:05:30,750 So we can learn something. 96 00:05:30,750 --> 00:05:33,620 Well what can we learn from this? 97 00:05:33,620 --> 00:05:35,010 How fast is it growing? 98 00:05:35,010 --> 00:05:37,890 Well it seems to grow pretty fast, and 99 00:05:37,890 --> 00:05:39,140 then it flattens out. 
100 00:05:42,490 --> 00:05:46,590 It looks to me like, roughly speaking, the distance is 101 00:05:46,590 --> 00:05:50,410 growing as sort of close to the square root of 102 00:05:50,410 --> 00:05:52,860 the number of steps. 103 00:05:52,860 --> 00:05:55,530 Well I could look at it more closely. 104 00:05:55,530 --> 00:05:59,640 We, actually, know this is not exactly the square root. 105 00:05:59,640 --> 00:06:04,740 Right, the square root of a 1,000 is not going to be 25, 106 00:06:04,740 --> 00:06:08,390 but I'm not going to delve into the details of that. 107 00:06:08,390 --> 00:06:10,450 It's, actually, a little bit complicated. 108 00:06:10,450 --> 00:06:14,370 We could derive it, but we won't, because I want to get 109 00:06:14,370 --> 00:06:18,160 to what I think is a more important message. 110 00:06:22,280 --> 00:06:26,690 How much should we infer from this or from the numbers I 111 00:06:26,690 --> 00:06:32,620 displayed before, and I want to say not too much. 112 00:06:32,620 --> 00:06:42,550 Because what we saw, if we go look at what we had before, is 113 00:06:42,550 --> 00:06:48,900 quite a dispersion between the max and the min. 114 00:06:48,900 --> 00:07:01,990 And furthermore, if we run it again, the 20 trials, we're 115 00:07:01,990 --> 00:07:03,240 getting different answers. 116 00:07:06,220 --> 00:07:11,040 So what you can see, here, is for 10,000 steps, here, my 117 00:07:11,040 --> 00:07:18,450 mean was 78, and here, for 10,000 steps, my mean is 90. 118 00:07:18,450 --> 00:07:20,340 Here, the mean was 279. 119 00:07:20,340 --> 00:07:21,830 Here, it was 248. 120 00:07:21,830 --> 00:07:25,800 The maxes and the mins are different. 121 00:07:25,800 --> 00:07:29,650 So I don't want to read too much into that graph. 122 00:07:29,650 --> 00:07:33,470 Now one of the issues we should have asked is, well 123 00:07:33,470 --> 00:07:35,610 it's the mean of how many trials. 124 00:07:35,610 --> 00:07:38,290 I didn't tell you. 
125 00:07:38,290 --> 00:07:40,310 And to be honest, I don't quite remember. 126 00:07:40,310 --> 00:07:42,350 I think it was 20. 127 00:07:42,350 --> 00:07:45,790 But I don't have enough information to interpret it. 128 00:07:49,070 --> 00:07:52,570 I need a lot more and that's going to be what this whole 129 00:07:52,570 --> 00:07:58,260 next unit of the course is about is how do we think about 130 00:07:58,260 --> 00:08:02,460 the results of programs when the programs themselves are 131 00:08:02,460 --> 00:08:04,990 stochastic. 132 00:08:04,990 --> 00:08:09,430 And this is important because, as you will see, not only in 133 00:08:09,430 --> 00:08:11,910 this course, but as you progress in your careers, if 134 00:08:11,910 --> 00:08:15,980 you're involved in engineering or science, is that almost 135 00:08:15,980 --> 00:08:19,220 everything in the real world is stochastic. 136 00:08:19,220 --> 00:08:21,600 And in order to think about it, we have to really think 137 00:08:21,600 --> 00:08:25,360 pretty hard about what those things mean. 138 00:08:25,360 --> 00:08:28,920 So I want to, now, pull back and talk a little bit, sort 139 00:08:28,920 --> 00:08:32,740 of, philosophically about that, about the role of 140 00:08:32,740 --> 00:08:33,990 randomness in computation-- 141 00:08:36,789 --> 00:08:40,140 something you probably haven't seen a lot, if at all in the 142 00:08:40,140 --> 00:08:42,760 other courses you've taken if you're a freshman or 143 00:08:42,760 --> 00:08:47,520 sophomore, and that's because there's something really 144 00:08:47,520 --> 00:08:51,040 comforting about Newtonian mechanics. 145 00:08:51,040 --> 00:08:54,350 When I first learned physics, it was really comforting. 146 00:08:54,350 --> 00:08:56,860 I learned the physics of Isaac Newton. 147 00:08:56,860 --> 00:08:59,410 You push down on one end of the lever, the other end of 148 00:08:59,410 --> 00:09:01,290 the lever goes up. 
149 00:09:01,290 --> 00:09:04,340 You throw a ball up in the air, it travels a parabolic 150 00:09:04,340 --> 00:09:05,590 path and lands. 151 00:09:08,010 --> 00:09:13,310 F equals MA, that wonderful rule of physics. 152 00:09:13,310 --> 00:09:15,450 Everything happened for a reason. 153 00:09:15,450 --> 00:09:17,950 It was predictable. 154 00:09:17,950 --> 00:09:22,500 It was a great comfort to some of us about it. 155 00:09:22,500 --> 00:09:28,770 And for centuries, that's the way the world thought, from 156 00:09:28,770 --> 00:09:32,880 almost the beginning of the times of science to, really, 157 00:09:32,880 --> 00:09:35,640 if you look at history of science, to today, people 158 00:09:35,640 --> 00:09:38,400 believed the world was deterministic. 159 00:09:38,400 --> 00:09:40,720 And they liked it. 160 00:09:40,720 --> 00:09:46,060 Then along came the so-called Copenhagen Doctrine and 161 00:09:46,060 --> 00:09:51,790 quantum physics in the 20th century, and the comforting 162 00:09:51,790 --> 00:09:56,090 world of Newtonian physics disappeared. 163 00:09:56,090 --> 00:09:59,220 The Doctrine, the Copenhagen Doctrine, led by the 164 00:09:59,220 --> 00:10:04,110 physicists Bohr and Heisenberg, argued that at its 165 00:10:04,110 --> 00:10:08,780 most predictable and most fundamental level, the 166 00:10:08,780 --> 00:10:14,230 behavior of the physical world cannot be predicted. 167 00:10:14,230 --> 00:10:18,370 One can make probabilistic statements of the form x is 168 00:10:18,370 --> 00:10:23,070 highly likely to occur, but not statements of the form x 169 00:10:23,070 --> 00:10:26,360 is certain to occur-- 170 00:10:26,360 --> 00:10:28,850 period. 171 00:10:28,850 --> 00:10:31,750 I can make the probabilistic statement that if I take this 172 00:10:31,750 --> 00:10:33,350 laser pointer and put it on the table, 173 00:10:33,350 --> 00:10:35,540 it won't fall through. 
174 00:10:35,540 --> 00:10:38,230 The molecules won't separate in such a nice way that it 175 00:10:38,230 --> 00:10:43,270 just drops through, but I can't promise it won't happen. 176 00:10:43,270 --> 00:10:46,590 Truth is I'm going to make that promise anyway, because I 177 00:10:46,590 --> 00:10:50,740 believe in probability, and it's highly, highly probable. 178 00:10:50,740 --> 00:10:55,100 But they tried to say that, in fact, the world is all 179 00:10:55,100 --> 00:10:55,870 stochastic. 180 00:10:55,870 --> 00:10:59,330 Everything is probabilistic. 181 00:10:59,330 --> 00:11:04,170 Other distinguished physicists at the time, most notably 182 00:11:04,170 --> 00:11:09,030 Einstein and Schrodinger, vehemently disagreed. 183 00:11:09,030 --> 00:11:13,090 This debate, it's hard to believe, actually, roiled the 184 00:11:13,090 --> 00:11:17,650 world of physics, philosophy, and religion. 185 00:11:17,650 --> 00:11:21,870 The heart of the debate was the validity of something 186 00:11:21,870 --> 00:11:23,550 called causal non-determinism. 187 00:11:40,710 --> 00:11:42,410 The idea here-- 188 00:11:42,410 --> 00:11:47,010 causal means caused by previous events. 189 00:11:47,010 --> 00:11:51,140 So causal non-determinism was the belief that not every 190 00:11:51,140 --> 00:11:56,110 event is caused by previous events. 191 00:11:56,110 --> 00:12:00,050 Einstein and Schrodinger found this view philosophically 192 00:12:00,050 --> 00:12:05,720 unacceptable, as exemplified by Einstein's often quoted 193 00:12:05,720 --> 00:12:11,440 comment, "God does not play dice." 194 00:12:11,440 --> 00:12:15,950 What they argued is for something called predictive 195 00:12:15,950 --> 00:12:17,200 non-determinism. 196 00:12:32,260 --> 00:12:37,510 The concept here was that our inability to make accurate 197 00:12:37,510 --> 00:12:40,880 measurements about the physical world makes it 198 00:12:40,880 --> 00:12:44,050 impossible to make precise predictions about the future. 
199 00:12:46,550 --> 00:12:50,700 So this distinction was nicely summed up again by Einstein, 200 00:12:50,700 --> 00:12:54,410 who said, and I quote, "the essentially statistical 201 00:12:54,410 --> 00:12:59,060 character of contemporary theory is solely to be 202 00:12:59,060 --> 00:13:02,290 ascribed to the fact that this theory operates with an 203 00:13:02,290 --> 00:13:07,960 incomplete description of physical systems," i.e., 204 00:13:07,960 --> 00:13:11,370 things are not unpredictable, they just look unpredictable 205 00:13:11,370 --> 00:13:16,330 because we don't know enough about the initial states. 206 00:13:16,330 --> 00:13:20,720 This question is still unsettled in science. 207 00:13:20,720 --> 00:13:22,990 The good news is it probably doesn't matter at all 208 00:13:22,990 --> 00:13:25,510 what the truth is. 209 00:13:25,510 --> 00:13:29,280 However you want to look at it, we have to assume that the 210 00:13:29,280 --> 00:13:33,410 world is non-deterministic, because we can't, actually, 211 00:13:33,410 --> 00:13:35,290 predict it. 212 00:13:35,290 --> 00:13:37,810 So there's a little experiment I sometimes do. 213 00:13:37,810 --> 00:13:39,665 I'll pretend to do it, even though there 214 00:13:39,665 --> 00:13:41,350 are no students here. 215 00:13:41,350 --> 00:13:43,110 I take three coins-- 216 00:13:43,110 --> 00:13:45,630 see if there are students in the class, I ask the students 217 00:13:45,630 --> 00:13:48,370 to give me coins, and then I try to steal them. 218 00:13:48,370 --> 00:13:50,200 But since there are no students, I'm going to have to 219 00:13:50,200 --> 00:13:56,030 use my own coins, and I'm going to ask, is at least one 220 00:13:56,030 --> 00:13:57,280 of them heads. 221 00:13:59,240 --> 00:14:05,250 Well the truth is, it's completely predictable. 222 00:14:05,250 --> 00:14:07,470 I know the answer. 
223 00:14:07,470 --> 00:14:09,030 But you don't know the answer, because you 224 00:14:09,030 --> 00:14:11,320 can't see these coins. 225 00:14:11,320 --> 00:14:15,410 And so you might as well assume it's probabilistic, and 226 00:14:15,410 --> 00:14:19,930 guess well, he put 3 coins down at random, the odds of 227 00:14:19,930 --> 00:14:23,430 each one being a head is 1/2, so probably 1 228 00:14:23,430 --> 00:14:25,630 of the 3 is a head. 229 00:14:25,630 --> 00:14:29,620 And you're relying on probability, because you don't 230 00:14:29,620 --> 00:14:32,200 know what's going on-- 231 00:14:32,200 --> 00:14:33,450 predictive non-determinism. 232 00:14:36,720 --> 00:14:39,520 OK, and that's the way we're going to deal with a lot of 233 00:14:39,520 --> 00:14:42,360 things going on for the rest of the semester. 234 00:14:42,360 --> 00:14:46,410 We'll look at a lot of examples where we have to act 235 00:14:46,410 --> 00:14:49,380 as if things are non-deterministic. 236 00:14:49,380 --> 00:14:53,120 And that gets us to this notion of what mathematicians 237 00:14:53,120 --> 00:14:54,580 call stochastic processes. 238 00:15:04,760 --> 00:15:10,090 A process is stochastic if its next state depends on 239 00:15:10,090 --> 00:15:14,910 both the previous states and some random element. 240 00:15:48,550 --> 00:15:53,390 So now what I'm going to do is pick up one of these coins, 241 00:15:53,390 --> 00:16:00,320 flip it in the air, put it down, and ask, again, about 242 00:16:00,320 --> 00:16:04,270 the state of these three coins. 243 00:16:04,270 --> 00:16:08,160 It depends upon the previous state, because these two coins 244 00:16:08,160 --> 00:16:11,880 have the same value they had in the previous state, plus a 245 00:16:11,880 --> 00:16:14,720 stochastic element, a probabilistic element, a 246 00:16:14,720 --> 00:16:16,340 random element-- 247 00:16:16,340 --> 00:16:18,680 the value of that coin, which I just flipped. 
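The three-coin experiment can be written down as a tiny stochastic process. The sketch below is an illustration only; the names `flip` and `next_state` and the 'H'/'T' encoding are assumptions made here, not anything from the lecture's code.

```python
import random


def flip():
    """Return 'H' or 'T' with equal probability."""
    return random.choice("HT")


def next_state(coins, index):
    """Re-flip the coin at position index; the other coins keep their values."""
    new_coins = list(coins)
    new_coins[index] = flip()
    return tuple(new_coins)


state = tuple(flip() for _ in range(3))   # three coins put down at random
new_state = next_state(state, 0)          # pick one up, flip it, put it down
print(state, "->", new_state)
print("at least one head:", "H" in new_state)
```

The new state depends on the previous state (two coins are carried over unchanged) plus one random element (the re-flipped coin), which matches the definition given above.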
248 00:16:21,320 --> 00:16:26,520 OK, most programming languages, including Python, 249 00:16:26,520 --> 00:16:31,450 include simple ways to write programs that use randomness. 250 00:16:31,450 --> 00:16:38,110 As we'll see, as we've already seen in Python with our 251 00:16:38,110 --> 00:16:47,930 drunkard's walk, we use the function random.choice, which, 252 00:16:47,930 --> 00:16:52,020 given a set of values at random, chose 253 00:16:52,020 --> 00:16:54,460 one of those values. 254 00:16:54,460 --> 00:16:58,900 That function and almost all of the other functions in 255 00:16:58,900 --> 00:17:04,380 Python that involve randomness are implemented using 256 00:17:04,380 --> 00:17:05,630 something called random.random. 257 00:17:10,200 --> 00:17:14,579 This function generates a random float that's at least 258 00:17:14,579 --> 00:17:23,119 0 and less than 1.0, so you get one of the 259 00:17:23,119 --> 00:17:26,050 infinite or seemingly infinite number of floating point 260 00:17:26,050 --> 00:17:31,890 values that are at least 0 and less than 1. 261 00:17:31,890 --> 00:17:35,360 So let's go look at another example of 262 00:17:35,360 --> 00:17:36,610 the stochastic process. 263 00:17:40,510 --> 00:17:41,885 We're going to look at throwing dice. 264 00:17:48,170 --> 00:17:55,350 So I've got something called rollDie, which chooses a value 265 00:17:55,350 --> 00:17:57,630 between 1 and 6. 266 00:17:57,630 --> 00:17:59,790 For those of you who have never gambled with 267 00:17:59,790 --> 00:18:01,820 dice, it's a cube. 268 00:18:01,820 --> 00:18:04,410 You roll it, and it has a value between 1 and 6 that 269 00:18:04,410 --> 00:18:06,620 shows up at random. 270 00:18:06,620 --> 00:18:12,760 And then I've got this little program called testRoll that 271 00:18:12,760 --> 00:18:17,420 rolls a bunch of dice and comes up with an answer. 272 00:18:17,420 --> 00:18:21,190 All right, so let's see what happens. 
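A minimal sketch of the two functions just described might look like the following. The code on the screen is not in the transcript, so the bodies here are assumptions: `roll_die` and `test_roll` are hypothetical versions of rollDie and testRoll, and returning the rolls as a string of digits is a choice made for this sketch.

```python
import random


def roll_die():
    """Return a value from 1 to 6, chosen uniformly at random."""
    return random.choice([1, 2, 3, 4, 5, 6])


def test_roll(n=10):
    """Roll the die n times and return the results as a string of digits."""
    return "".join(str(roll_die()) for _ in range(n))


print(test_roll())

# random.random() is documented to return a float in the half-open
# range [0.0, 1.0): 0.0 is possible, 1.0 is not.
r = random.random()
print(r)
```

Running it several times gives a different ten-digit string each time, since every call to `roll_die` draws a fresh random value.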
273 00:18:24,000 --> 00:18:27,060 Actually, before we do that, let me ask you-- 274 00:18:27,060 --> 00:18:29,550 we can look at a question. 275 00:18:29,550 --> 00:18:34,350 Imagine I roll it, and I run it some large 276 00:18:34,350 --> 00:18:37,900 number of times, 10. 277 00:18:37,900 --> 00:18:40,940 Would you expect to see the value-- 278 00:18:44,890 --> 00:18:50,780 more likely see that value or might more likely see a value 279 00:18:50,780 --> 00:18:52,030 that looks like this. 280 00:19:02,120 --> 00:19:05,390 Which of these values is more likely to come up from my 281 00:19:05,390 --> 00:19:07,870 random rolls of the die? 282 00:19:12,380 --> 00:19:14,860 Well when I take a vote-- 283 00:19:14,860 --> 00:19:16,230 if I take a vote-- 284 00:19:16,230 --> 00:19:21,110 historically in this class, this strikes people as more 285 00:19:21,110 --> 00:19:23,840 likely to happen than this. 286 00:19:23,840 --> 00:19:27,790 But it's a trick question, because as it happens, they're 287 00:19:27,790 --> 00:19:29,920 equally likely. 288 00:19:29,920 --> 00:19:36,590 And the reason they're equally likely is each roll is 289 00:19:36,590 --> 00:19:40,120 independent of the previous rolls. 290 00:19:43,240 --> 00:19:46,620 And as we'll see in our excursion in probability and 291 00:19:46,620 --> 00:19:53,030 randomness, independence is a very important assumption. 292 00:19:53,030 --> 00:19:59,020 In a stochastic process, two events are independent if the 293 00:19:59,020 --> 00:20:02,700 outcome of one event has no influence on the 294 00:20:02,700 --> 00:20:05,540 outcome of the other. 295 00:20:05,540 --> 00:20:10,490 The events are independent if the outcome of one event has 296 00:20:10,490 --> 00:20:14,870 no influence on the outcome of the other. 
297 00:20:14,870 --> 00:20:17,880 So it's a bit easier to think about this, maybe, if we 298 00:20:17,880 --> 00:20:21,050 simplify the situation for the moment to think about flipping 299 00:20:21,050 --> 00:20:25,830 coins, which have either heads or tails, and I'll look at the 300 00:20:25,830 --> 00:20:30,590 value 0 or 1 as the examples there. 301 00:20:30,590 --> 00:20:32,240 That means I can use-- 302 00:20:32,240 --> 00:20:34,960 I have a binary die for some reason. 303 00:20:34,960 --> 00:20:41,380 So as we've seen before, if I flip a coin ten times, how 304 00:20:41,380 --> 00:20:45,330 many different possibilities, sequences 305 00:20:45,330 --> 00:20:47,220 of 0 and 1 are there? 306 00:20:47,220 --> 00:20:49,730 Well we've seen this kind of thing a lot of times already 307 00:20:49,730 --> 00:20:51,880 this semester. 308 00:20:51,880 --> 00:20:55,930 There are 2 to the 10 binary numbers of 10 digits, so we 309 00:20:55,930 --> 00:20:57,590 know that there are 2 to the 10 possibilities. 310 00:21:00,870 --> 00:21:02,210 Each of these 2 to the 10 311 00:21:02,210 --> 00:21:06,350 possibilities are equally likely. 312 00:21:06,350 --> 00:21:13,140 So the number in which I have all 0's is no more likely than 313 00:21:13,140 --> 00:21:16,670 the number of all 1's, is no more likely than some 314 00:21:16,670 --> 00:21:19,580 seemingly random combinations of 0 and 1. 315 00:21:24,800 --> 00:21:28,640 So it's a very small probability. 316 00:21:28,640 --> 00:21:33,370 So what's the probability of getting all 1's? 317 00:21:33,370 --> 00:21:36,060 It's 1 out of 2 to the 10. 318 00:21:39,120 --> 00:21:41,580 What's the probability of getting all 0's? 319 00:21:41,580 --> 00:21:43,850 1 out of 2 to the 10. 320 00:21:43,850 --> 00:21:46,880 What's the probability of any combination you would happen 321 00:21:46,880 --> 00:21:49,070 to pick of 0's and 1's? 322 00:21:49,070 --> 00:21:50,970 1 over 2 to the 10. 
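For ten coin flips the space of outcomes is small enough to enumerate, which makes the claim easy to check by brute force. The sketch below is illustrative; the helper name `probability` and the particular "arbitrary" sequence are choices made here.

```python
from fractions import Fraction
from itertools import product

NUM_FLIPS = 10

# Every possible sequence of 10 flips, written as tuples of 0s and 1s.
sequences = list(product([0, 1], repeat=NUM_FLIPS))
total = len(sequences)  # 2 ** 10


def probability(has_property):
    """Fraction of the equally likely sequences that satisfy has_property."""
    matching = sum(1 for seq in sequences if has_property(seq))
    return Fraction(matching, total)


p_all_ones = probability(lambda seq: all(bit == 1 for bit in seq))
p_all_zeros = probability(lambda seq: all(bit == 0 for bit in seq))
arbitrary = (0, 1, 1, 0, 1, 0, 0, 0, 1, 1)  # any one particular sequence
p_arbitrary = probability(lambda seq: seq == arbitrary)

print(total, p_all_ones, p_all_zeros, p_arbitrary)
```

Each of the three probabilities comes out to 1 over 2 to the 10, because each names exactly one sequence out of the 1,024 equally likely ones.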
323 00:21:54,310 --> 00:21:57,520 I know I'm belaboring this point, but the point I want to 324 00:21:57,520 --> 00:22:02,830 make is that when we talk about some result having a 325 00:22:02,830 --> 00:22:07,000 particular probability, we are asking, essentially, the 326 00:22:07,000 --> 00:22:19,960 question, what fraction of the possible results have the 327 00:22:19,960 --> 00:22:22,550 property we're testing for. 328 00:22:26,720 --> 00:22:30,120 So I ask what property are all 1's. 329 00:22:30,120 --> 00:22:32,225 I'm saying what fraction are all 1's. 330 00:22:32,225 --> 00:22:36,350 If I say well, the properties are exactly four 1's. 331 00:22:36,350 --> 00:22:40,600 What fraction of these numbers have exactly four 1's in them? 332 00:22:40,600 --> 00:22:42,000 Whatever I want. 333 00:22:42,000 --> 00:22:47,250 So probabilities will always be fractions. 334 00:22:47,250 --> 00:22:51,180 That's important because it means that when we talk about 335 00:22:51,180 --> 00:22:55,450 the probability of some event occurring, we know it has to 336 00:22:55,450 --> 00:23:00,060 be somewhere between 0 and 1. 337 00:23:00,060 --> 00:23:03,420 Probabilities are never less than 0. 338 00:23:03,420 --> 00:23:05,900 They're never more than 1. 339 00:23:05,900 --> 00:23:10,710 Cannot happen, guaranteed to happen, usually 340 00:23:10,710 --> 00:23:11,960 somewhere in between. 341 00:23:14,740 --> 00:23:18,570 All right, that's the key thing to remember when 342 00:23:18,570 --> 00:23:19,820 thinking about probabilities. 343 00:23:22,230 --> 00:23:25,980 Suppose I want to ask, what's the probability of getting 344 00:23:25,980 --> 00:23:30,240 some sequence of coin flips other than all 1's. 345 00:23:33,040 --> 00:23:37,990 Well I know the probability of getting all 1's is 1 out of 2 346 00:23:37,990 --> 00:23:39,240 to the 10th. 347 00:23:41,400 --> 00:23:43,750 I know the probability of getting some 348 00:23:43,750 --> 00:23:46,350 sequence of flips is 1. 
349 00:23:46,350 --> 00:23:49,110 It's certain I'll get one of those numbers. 350 00:23:49,110 --> 00:23:52,610 So the answer is the probability of not getting all 351 00:23:52,610 --> 00:23:55,680 1's is 1 minus 1 over 2 to the 10th. 352 00:23:58,200 --> 00:24:02,490 This is an important trick to remember. 353 00:24:02,490 --> 00:24:07,010 Typically we have two ways of computing probabilities. 354 00:24:07,010 --> 00:24:11,240 We can either compute it directly, as I did when I 355 00:24:11,240 --> 00:24:16,170 computed the probability of getting all 1's, or we can 356 00:24:16,170 --> 00:24:22,940 compute the probability of something not happening by 357 00:24:22,940 --> 00:24:26,450 subtracting one probability from another, 358 00:24:26,450 --> 00:24:29,240 this 1 minus trick. 359 00:24:29,240 --> 00:24:32,720 And so you'll see me using this formulation a lot of 360 00:24:32,720 --> 00:24:36,760 times, And we'll talk as we go forward about when 361 00:24:36,760 --> 00:24:38,910 you do it which way. 362 00:24:38,910 --> 00:24:45,910 All right so let's go back, finally, to our six-sided die. 363 00:24:45,910 --> 00:24:50,020 How many sequences are there of length 10 for that? 364 00:24:50,020 --> 00:24:51,290 2 to the 10th? 365 00:24:51,290 --> 00:24:52,620 No. 366 00:24:52,620 --> 00:24:54,370 6 to the 10th. 367 00:24:54,370 --> 00:24:57,390 Because unlike the coin where we only had 2 possibilities, 368 00:24:57,390 --> 00:25:00,020 we now have 6 possibilities. 369 00:25:00,020 --> 00:25:05,120 So there are 6 to the 10th different sequences of rolls I 370 00:25:05,120 --> 00:25:08,470 could get, quite a few. 371 00:25:08,470 --> 00:25:14,600 So the probability of getting 10 consecutive 1's is 1 over 6 372 00:25:14,600 --> 00:25:17,070 to the 10th. 
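The direct count and the "1 minus" trick can both be checked with exact arithmetic. This is a sketch using Python's `fractions` module and `math.comb` (available in Python 3.8 and later); the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

NUM_FLIPS = 10
total_coin = 2 ** NUM_FLIPS  # number of 10-flip sequences

# Direct computation: count the sequences that have the property.
p_all_ones = Fraction(1, total_coin)
p_four_ones = Fraction(comb(NUM_FLIPS, 4), total_coin)  # exactly four 1's

# The "1 minus" trick: P(not all 1's) = 1 - P(all 1's).
p_not_all_ones = 1 - p_all_ones

# A six-sided die has 6 outcomes per roll, so 6 ** 10 sequences of 10 rolls.
total_die = 6 ** 10
p_ten_ones = Fraction(1, total_die)

print(p_all_ones, p_four_ones, p_not_all_ones, total_die, p_ten_ones)
```

Counting the sequences that are not all 1's directly would mean counting 1,023 cases; subtracting from 1 gets the same answer from a single case.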
373 00:25:17,070 --> 00:25:19,640 And of course, the probability of getting this sequence, 374 00:25:19,640 --> 00:25:23,320 here, is also 1 over 6 to the 10th, so we see 375 00:25:23,320 --> 00:25:26,815 that they are equal. 376 00:25:26,815 --> 00:25:31,080 OK, we're going to spend a lot more time on probability and 377 00:25:31,080 --> 00:25:34,155 randomized and stochastic algorithms. 378 00:25:34,155 --> 00:25:38,850 But before I do that, I want to take a brief digression and 379 00:25:38,850 --> 00:25:41,570 return to the topic we looked at a little earlier this 380 00:25:41,570 --> 00:25:46,220 morning, which was data visualization, plotting. 381 00:25:46,220 --> 00:25:48,610 I want to do this for two reasons. 382 00:25:48,610 --> 00:25:51,360 One, it's really important. 383 00:25:51,360 --> 00:25:54,650 It's something that all of us do a lot of in the course of 384 00:25:54,650 --> 00:25:59,430 our work, but it also will just make it a lot easier for 385 00:25:59,430 --> 00:26:03,720 me to talk about probability and stochastics when I can 386 00:26:03,720 --> 00:26:07,040 draw some pretty pictures to illustrate what's going on. 387 00:26:10,450 --> 00:26:17,720 Now many people, most of us, probably, when we're writing 388 00:26:17,720 --> 00:26:21,470 code to do something, focus on writing programs that perform 389 00:26:21,470 --> 00:26:25,000 some complicated analysis of the data, 390 00:26:25,000 --> 00:26:26,960 and then print something. 391 00:26:26,960 --> 00:26:31,220 We don't spend enough time, I think, worrying about how the 392 00:26:31,220 --> 00:26:35,760 results of our analyses are presented so that somebody 393 00:26:35,760 --> 00:26:38,460 else can make sense of them, or in fact, we can understand 394 00:26:38,460 --> 00:26:40,730 them better ourselves. 
395 00:26:40,730 --> 00:26:44,660 Sometimes text is the best way, but sometimes there's a 396 00:26:44,660 --> 00:26:49,140 lot of truth to the Chinese proverb that a picture's 397 00:26:49,140 --> 00:26:53,330 meaning can express 10,000 words. 398 00:26:53,330 --> 00:26:55,840 Now most of us, sort of, believe this. 399 00:26:55,840 --> 00:26:57,480 Why don't we do it? 400 00:26:57,480 --> 00:26:59,930 Well because in most programming languages, it's 401 00:26:59,930 --> 00:27:03,010 hard to draw pretty pictures. 402 00:27:03,010 --> 00:27:05,670 One of the reasons we use Python in this class is 403 00:27:05,670 --> 00:27:09,770 because in Python, it's easy to draw pretty pictures or, at 404 00:27:09,770 --> 00:27:12,530 least, to make plots. 405 00:27:12,530 --> 00:27:14,020 Why is that? 406 00:27:14,020 --> 00:27:16,560 It's because somebody-- 407 00:27:16,560 --> 00:27:17,780 not me-- 408 00:27:17,780 --> 00:27:19,500 went to the trouble of building 409 00:27:19,500 --> 00:27:20,750 something called PyLab. 410 00:27:26,170 --> 00:27:30,780 PyLab is a Python library that provides many of the 411 00:27:30,780 --> 00:27:34,050 facilities of something called MATLAB. 412 00:27:41,170 --> 00:27:47,260 If you're an MIT student, the probability of your graduating 413 00:27:47,260 --> 00:27:51,740 without using MATLAB is very low. 414 00:27:51,740 --> 00:27:54,980 It is something people use a lot. 415 00:27:54,980 --> 00:27:57,330 It's not my favorite programming language. 416 00:27:57,330 --> 00:27:59,560 It has its utility. 417 00:27:59,560 --> 00:28:02,590 I like Python, because it brings-- 418 00:28:02,590 --> 00:28:07,600 a lot of the features of MATLAB are easy to use in a 419 00:28:07,600 --> 00:28:10,270 programming language that I find much more 420 00:28:10,270 --> 00:28:12,170 convivial than MATLAB. 421 00:28:14,710 --> 00:28:17,830 All right I'm not going to give you a complete tutorial 422 00:28:17,830 --> 00:28:20,030 for PyLab here. 
423 00:28:20,030 --> 00:28:23,630 It would take a long time, and it would be boring. 424 00:28:23,630 --> 00:28:26,960 Instead I'm going to give you a few examples, and in fact, 425 00:28:26,960 --> 00:28:32,166 focus primarily on the plotting capabilities. 426 00:28:35,810 --> 00:28:39,880 And the good news is the plotting capabilities in PyLab 427 00:28:39,880 --> 00:28:43,760 are almost identical to those in MATLAB, so if you learn how 428 00:28:43,760 --> 00:28:48,580 to do it here, you'll already know how to do it there. 429 00:28:48,580 --> 00:28:54,046 For details you should take a look at-- 430 00:28:54,046 --> 00:28:56,060 let me make sure I write this correctly-- 431 00:28:59,140 --> 00:29:15,360 this website, matplotlib.sourceforge.net, 432 00:29:15,360 --> 00:29:18,750 and it's a very nicely put together website that will 433 00:29:18,750 --> 00:29:22,880 give you all of the capabilities of plotting. 434 00:29:22,880 --> 00:29:27,000 Also you'll find in the class website, I've written a little 435 00:29:27,000 --> 00:29:31,330 chapter of a book about how to do this sort of thing, and you 436 00:29:31,330 --> 00:29:32,750 may also find that helpful. 437 00:29:35,510 --> 00:29:39,170 I should point out that PyLab is not part of the standard 438 00:29:39,170 --> 00:29:41,240 Python distribution. 439 00:29:41,240 --> 00:29:44,610 It has to be installed on your computer, and again, there are 440 00:29:44,610 --> 00:29:46,890 instructions about how to do this posted 441 00:29:46,890 --> 00:29:49,730 on the class website. 442 00:29:49,730 --> 00:29:53,290 All right let's start with something very simple. 443 00:29:58,400 --> 00:29:59,480 We'll look at that, here. 444 00:29:59,480 --> 00:30:02,930 I'm beginning by importing. 445 00:30:08,500 --> 00:30:09,700 I'll import PyLab. 446 00:30:09,700 --> 00:30:13,220 You have to import it to use it. 447 00:30:13,220 --> 00:30:16,470 And then I'm going to plot two things.
448 00:30:16,470 --> 00:30:19,560 So what I want to observe, here, is I'm 449 00:30:19,560 --> 00:30:20,980 going to plot two vectors-- 450 00:30:20,980 --> 00:30:26,160 the vector 1, 2, 3, 4 and the vector 1, 2, 3, 4. 451 00:30:26,160 --> 00:30:27,830 These are the x-coordinates. 452 00:30:27,830 --> 00:30:29,850 These are the y-coordinates. 453 00:30:29,850 --> 00:30:33,540 So we'll get a two-dimensional plot, x versus y. 454 00:30:33,540 --> 00:30:36,900 And it's very important that these two vectors be of the 455 00:30:36,900 --> 00:30:38,150 same length. 456 00:30:40,420 --> 00:30:43,170 Doubtless when you're using this in your problem sets, you 457 00:30:43,170 --> 00:30:46,310 will screw up and you'll get an error message, which you 458 00:30:46,310 --> 00:30:48,940 will have a hard time interpreting, but what it 459 00:30:48,940 --> 00:30:51,340 probably is going to boil down to is you've done something 460 00:30:51,340 --> 00:30:53,540 wrong, and you're plotting things that are 461 00:30:53,540 --> 00:30:56,700 not the same length. 462 00:30:56,700 --> 00:30:59,290 After I plot these two things, I'm going to type 463 00:30:59,290 --> 00:31:03,340 "pyLab.show," which will put the plots up 464 00:31:03,340 --> 00:31:04,760 for us to look at. 465 00:31:04,760 --> 00:31:06,320 So let's do that now. 466 00:31:16,640 --> 00:31:20,530 So the first plot is our straight line-- 467 00:31:20,530 --> 00:31:24,900 1 versus 1, 2 versus 2, 3 versus 3, 4 versus 4-- 468 00:31:24,900 --> 00:31:26,110 got that. 469 00:31:26,110 --> 00:31:30,500 Then it plotted this rather funny looking zigzag we saw, 470 00:31:30,500 --> 00:31:35,280 kind of just randomly chosen in the other one. 471 00:31:35,280 --> 00:31:37,615 And the plots will always look something like this.
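A minimal sketch of the two-plot example described above, for readers following along. The module is imported under its lowercase name, `pylab`. The straight-line vectors come from the lecture; the zigzag values are hypothetical, since the transcript does not give them. The call to `pylab.show()` is left commented out so the sketch also runs on a machine without a display.

```python
import pylab  # ships with matplotlib; not part of the standard library

# x- and y-coordinates must have the same length, or plot() raises an error.
straight_xs = [1, 2, 3, 4]
straight_ys = [1, 2, 3, 4]   # the straight line from the lecture

zigzag_xs = [1, 4, 2, 3]     # hypothetical values for illustration only
zigzag_ys = [5, 6, 7, 8]

# Both calls draw on the current figure (figure 1 by default).
line_a, = pylab.plot(straight_xs, straight_ys)
line_b, = pylab.plot(zigzag_xs, zigzag_ys)

# Uncomment to open the plot window. Call it once, as the last statement,
# because it may block until the window is closed.
# pylab.show()
```

Each `plot` call returns a list of line objects, which is why the results are unpacked with a trailing comma.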
472 00:31:40,730 --> 00:31:46,295 I should mention, if we go look at the code, pyLab.show-- 473 00:31:49,030 --> 00:31:53,190 if I hadn't said that, the plot would not have appeared 474 00:31:53,190 --> 00:31:54,950 on my screen. 475 00:31:54,950 --> 00:31:59,440 PyLab would have produced it, but not displayed it. 476 00:31:59,440 --> 00:32:03,000 You may think that's silly, but in fact it's useful, 477 00:32:03,000 --> 00:32:06,350 because most of the time when I'm writing programs that do 478 00:32:06,350 --> 00:32:10,200 plotting, I don't want to see them on my screen. 479 00:32:10,200 --> 00:32:12,550 I'm producing a whole bunch of plots, and I'm going to write 480 00:32:12,550 --> 00:32:16,170 them to files that I will then look at later or include in a 481 00:32:16,170 --> 00:32:19,940 paper or a lecture or something, so it makes me say 482 00:32:19,940 --> 00:32:23,230 that I want to see it. 483 00:32:23,230 --> 00:32:28,280 I should point out, depending upon your operating system, it 484 00:32:28,280 --> 00:32:32,460 can be pretty annoying, because if you try and do this 485 00:32:32,460 --> 00:32:36,180 twice in your code, something bad can happen. 486 00:32:36,180 --> 00:32:39,350 The code can hang. 487 00:32:39,350 --> 00:32:41,600 Therefore, you should only execute 488 00:32:41,600 --> 00:32:45,010 pyLab.show once per program. 489 00:32:45,010 --> 00:32:48,510 And it should always be the last thing you execute, 490 00:32:48,510 --> 00:32:52,140 because once you execute pyLab.show, the program will 491 00:32:52,140 --> 00:32:54,960 stop running, essentially. 492 00:32:54,960 --> 00:32:55,730 It's annoying. 493 00:32:55,730 --> 00:32:58,280 I wish it weren't that way, but it is. 494 00:32:58,280 --> 00:33:01,840 So live with it. 495 00:33:01,840 --> 00:33:06,095 All right let's go look back at our graph, our plot. 496 00:33:11,900 --> 00:33:15,450 At the top is a title. 497 00:33:15,450 --> 00:33:17,080 This is the default title.
498 00:33:17,080 --> 00:33:19,520 It says figure 1. 499 00:33:19,520 --> 00:33:22,130 Later we'll see that I could have given it a much better 500 00:33:22,130 --> 00:33:23,990 title than that. 501 00:33:23,990 --> 00:33:29,230 Then it's got the values of the x and y-axes and a bunch 502 00:33:29,230 --> 00:33:34,050 of things down at the bottom that we could point out. 503 00:33:34,050 --> 00:33:39,890 You can zoom in on the plots, using that or zoom out. 504 00:33:39,890 --> 00:33:42,390 You can use this funny icon. 505 00:33:42,390 --> 00:33:46,940 It happens to be a floppy disk icon, something that probably 506 00:33:46,940 --> 00:33:49,190 most of you have never seen. 507 00:33:49,190 --> 00:33:52,130 Congratulations if you've never seen a floppy disk. 508 00:33:52,130 --> 00:33:54,750 Your life is better than it would be had you had 509 00:33:54,750 --> 00:33:56,580 to deal with them. 510 00:33:56,580 --> 00:33:59,370 But that's used for saving them to a file. 511 00:33:59,370 --> 00:34:00,890 You can move around in it. 512 00:34:00,890 --> 00:34:04,130 You can get back to what the original figure was, a whole 513 00:34:04,130 --> 00:34:05,380 bunch of useful things. 514 00:34:05,380 --> 00:34:09,550 I suggest you just bring this up and play with it. 515 00:34:14,520 --> 00:34:19,110 One of the things I should mention here is you'll note 516 00:34:19,110 --> 00:34:23,639 that when I produced this, I only use four points-- 517 00:34:23,639 --> 00:34:26,800 1, 2, 3, and 4, say, for this one. 518 00:34:26,800 --> 00:34:31,732 Yet it looks as if I have a continuous plot. 519 00:34:31,732 --> 00:34:37,600 You know, it claims that 1.5 and 1.5 match. 520 00:34:37,600 --> 00:34:41,730 This can be very deceptive, as we'll see later on, and 521 00:34:41,730 --> 00:34:45,150 probably it might have been better for me to plot not 522 00:34:45,150 --> 00:34:48,960 lines here, but points, indicating there are only four 523 00:34:48,960 --> 00:34:51,420 points in this graph. 
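One way to plot points rather than lines, sketched here as an illustration and not necessarily the approach the course takes later, is to pass a matplotlib format string as a third argument to `plot`. The string `'bo'` asks for blue circle markers with no connecting line.

```python
import pylab  # ships with matplotlib; not part of the standard library

xs = [1, 2, 3, 4]
ys = [1, 2, 3, 4]

# 'b' selects blue and 'o' selects circle markers. Because the format
# string names a marker but no line style, no connecting line is drawn,
# so the plot no longer suggests values between the four points.
points, = pylab.plot(xs, ys, 'bo')
```

With this form, nothing is drawn at x = 1.5, which matches the data actually supplied.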
524 00:34:51,420 --> 00:34:54,040 It's more apparent here, where we see this funny looking 525 00:34:54,040 --> 00:34:58,070 zigzag, implying some complicated relationship 526 00:34:58,070 --> 00:35:00,370 amongst the points, which, actually, 527 00:35:00,370 --> 00:35:02,640 probably doesn't exist. 528 00:35:02,640 --> 00:35:09,060 And again later on we'll see ways to avoid that. 529 00:35:09,060 --> 00:35:10,820 OK. 530 00:35:10,820 --> 00:35:14,940 It is, of course, possible to produce more than one figure. 531 00:35:14,940 --> 00:35:16,190 So let's look at this. 532 00:35:18,720 --> 00:35:22,630 We'll comment this piece out for now. 533 00:35:34,480 --> 00:35:37,900 And we'll run this code. 534 00:35:37,900 --> 00:35:41,540 So this says I want to plot something on a figure I'm 535 00:35:41,540 --> 00:35:45,630 calling figure 1, as before, and then I'm going to 536 00:35:45,630 --> 00:35:46,620 go to figure 2. 537 00:35:46,620 --> 00:35:49,650 So now instead of plotting both of these on the same 538 00:35:49,650 --> 00:35:52,950 figure, I'm going to put them on separate figures. 539 00:35:52,950 --> 00:35:59,100 And then pyLab.savefig will, actually, create a file 540 00:35:59,100 --> 00:36:02,880 in the directory in which I'm running the program and save 541 00:36:02,880 --> 00:36:06,260 it-- called firstsaved. And then this will be secondsaved. 542 00:36:06,260 --> 00:36:07,640 And now I can run it. 543 00:36:14,820 --> 00:36:17,150 And now I have two figures-- 544 00:36:17,150 --> 00:36:19,390 figure 1, as before-- 545 00:36:19,390 --> 00:36:21,740 well not quite as before, a different figure-- 546 00:36:21,740 --> 00:36:22,990 and figure 2. 547 00:36:27,150 --> 00:36:30,850 Again, nothing very magical there. 548 00:36:30,850 --> 00:36:33,530 And if we look in my directory, we should see, I 549 00:36:33,530 --> 00:36:39,380 hope, firstsaved and secondsaved, so if we look at 550 00:36:39,380 --> 00:36:44,770 secondsaved, we'll see, there it is.
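A minimal sketch of the two-figure example. The matplotlib function that writes the current figure to a file is `savefig`. The plotted values and the temporary output directory are assumptions made so the sketch is self-contained; the lecture saves into the directory the program runs in.

```python
import os
import tempfile

import pylab  # ships with matplotlib; not part of the standard library

# Hypothetical output location; the lecture writes to the working directory.
out_dir = tempfile.mkdtemp()
first_path = os.path.join(out_dir, 'firstsaved.png')
second_path = os.path.join(out_dir, 'secondsaved.png')

pylab.figure(1)                          # make figure 1 the current figure
pylab.plot([1, 2, 3, 4], [1, 2, 3, 4])
pylab.savefig(first_path)                # saves whichever figure is current

pylab.figure(2)                          # switch to a separate figure
pylab.plot([1, 4, 2, 3], [5, 6, 7, 8])   # illustrative values only
pylab.savefig(second_path)
```

Because `savefig` always acts on the current figure, each call is placed right after the plotting calls for the figure it should capture.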
551 00:36:44,770 --> 00:36:48,640 And just to show that I'm not cheating, you can notice that 552 00:36:48,640 --> 00:36:53,620 the time stamp is today at 10:34 AM, which happens to be 553 00:36:53,620 --> 00:36:58,170 when I'm giving this particular lecture. 554 00:36:58,170 --> 00:37:03,025 All right you can put that away now and continue. 555 00:37:08,000 --> 00:37:24,180 Now what I want you to notice is the last one. 556 00:37:24,180 --> 00:37:27,030 I just gave it one argument-- 557 00:37:27,030 --> 00:37:30,820 5, 6, 7 and 10. 558 00:37:30,820 --> 00:37:32,185 And if we look at what it plotted-- 559 00:37:41,100 --> 00:37:43,010 actually, I think I put that in figure 1, didn't I? 560 00:37:47,110 --> 00:37:48,360 You'll see, here it is. 561 00:37:51,900 --> 00:37:57,480 And it's made up some values for the x-axis. 562 00:37:57,480 --> 00:38:00,380 If you only give it one set of values, it assumes it's the 563 00:38:00,380 --> 00:38:03,825 y-axis, and it finds values. 564 00:38:06,340 --> 00:38:10,890 All right now what values is it going to 565 00:38:10,890 --> 00:38:13,090 choose for the x-axis? 566 00:38:13,090 --> 00:38:16,690 Well this is Python, so surprise, surprise, the first 567 00:38:16,690 --> 00:38:21,610 value is 0, then 1, 2, and 3. 568 00:38:21,610 --> 00:38:23,090 It's how I get the four values. 569 00:38:25,890 --> 00:38:32,320 Now we could look at another example, a slightly more 570 00:38:32,320 --> 00:38:33,570 interesting one. 571 00:38:37,190 --> 00:38:39,790 Comment this out, so we don't look at the boring stuff over 572 00:38:39,790 --> 00:38:41,040 and over again. 573 00:38:46,330 --> 00:38:50,325 I've written a little program to calculate interest. 574 00:38:58,270 --> 00:39:04,090 So I'm going to start with an initial principal, here, of 575 00:39:04,090 --> 00:39:08,900 10,000, an interest rate of 5%, 20 years, and 576 00:39:08,900 --> 00:39:12,490 just do compound interest.
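The interest calculation can be sketched in plain Python as follows. The function name and structure are assumptions, not the lecture's code. The principal is taken as 10,000, an assumption based on the y-axis range the lecture reads off the resulting plot. The last line shows the x-values that a one-argument call such as `pylab.plot(values)` would supply by default: the indices 0 through 20.

```python
def compound_values(principal, interest_rate, years):
    """Return the balance at year 0 and after each year of compounding."""
    values = []
    for _ in range(years + 1):
        values.append(principal)                 # record the balance for this year
        principal += principal * interest_rate   # add one year of interest
    return values


# 5% interest, compounded annually for 20 years.
values = compound_values(10000.0, 0.05, 20)

# When plot() is given only y-values, the x-values default to the list
# indices, starting at 0.
default_xs = list(range(len(values)))
```

The list has 21 entries because it includes the starting balance at year 0 as well as the balance after each of the 20 years.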
577 00:39:12,490 --> 00:39:15,430 This, you all would know how to write this. 578 00:39:15,430 --> 00:39:17,450 And I'm going to plot it and see what we get. 579 00:39:26,350 --> 00:39:28,380 All right so what we have-- 580 00:39:28,380 --> 00:39:32,040 something here, which, sort of, shows the kind of 581 00:39:32,040 --> 00:39:36,420 beautiful growth you get with compound interest, what the 582 00:39:36,420 --> 00:39:40,050 finance people call the magic of compounding, which will 583 00:39:40,050 --> 00:39:43,710 make us all rich in principle, until we see what the 584 00:39:43,710 --> 00:39:45,630 markets really do. 585 00:39:45,630 --> 00:39:48,790 But at any rate, for now we can look at it, and it looks 586 00:39:48,790 --> 00:39:50,040 very pretty. 587 00:39:53,090 --> 00:39:56,850 But I don't know what it means. 588 00:39:56,850 --> 00:40:00,560 I look and say, "oh, it's figure 1." Well that's not 589 00:40:00,560 --> 00:40:05,670 very informative, and x goes from 0 to 20, but if I hadn't 590 00:40:05,670 --> 00:40:07,380 told you, you wouldn't know what that meant. 591 00:40:07,380 --> 00:40:12,720 And y from 10,000 up to 28,000, but again, you 592 00:40:12,720 --> 00:40:13,970 wouldn't know what that means. 593 00:40:16,690 --> 00:40:19,340 We see this all the time. 594 00:40:19,340 --> 00:40:20,590 It's not a good thing. 595 00:40:24,340 --> 00:40:26,820 It's a bad thing, in fact. 596 00:40:26,820 --> 00:40:31,080 All plots should have informative titles, and all 597 00:40:31,080 --> 00:40:33,730 axes should be labeled. 598 00:40:33,730 --> 00:40:36,570 I can't tell you the number of times I've had a graduate 599 00:40:36,570 --> 00:40:40,890 student show up in my office, having worked for weeks 600 00:40:40,890 --> 00:40:44,530 producing some data, put a plot on my desk, and say, look 601 00:40:44,530 --> 00:40:46,570 at this, isn't it great. 602 00:40:46,570 --> 00:40:49,940 And I say, I have no idea what it means.
603 00:40:49,940 --> 00:40:52,560 And sometimes, I'll say, well what is the y-axis, and they 604 00:40:52,560 --> 00:40:53,680 end up scratching their head. 605 00:40:53,680 --> 00:40:56,700 They're not quite sure. 606 00:40:56,700 --> 00:40:58,390 You've got to label your axes. 607 00:40:58,390 --> 00:40:59,590 You've got to put a title. 608 00:40:59,590 --> 00:41:02,575 You've got to give the person a break. 609 00:41:02,575 --> 00:41:05,000 Well how do we do that? 610 00:41:05,000 --> 00:41:07,750 Well it's pretty simple. 611 00:41:07,750 --> 00:41:09,000 So here's this code. 612 00:41:15,530 --> 00:41:23,770 So I just want to get rid of the old graphs, because 613 00:41:23,770 --> 00:41:26,035 sometimes if you look at them, it causes you problems. 614 00:41:30,720 --> 00:41:32,770 Because it's now hung. 615 00:41:32,770 --> 00:41:34,790 It won't continue until I get rid of it. 616 00:41:34,790 --> 00:41:37,110 Now my shell is back. 617 00:41:37,110 --> 00:41:43,090 So pyLab.title just says, OK, I'm going to call it 5% 618 00:41:43,090 --> 00:41:45,500 growth, compounded annually. 619 00:41:45,500 --> 00:41:49,570 Notice that I've put in that it's 5% interest rate and that 620 00:41:49,570 --> 00:41:51,340 I'm compounding it annually, not 621 00:41:51,340 --> 00:41:52,710 semi-annually or quarterly. 622 00:41:55,790 --> 00:41:59,280 The x-axis is going to be the years of compounding and the 623 00:41:59,280 --> 00:42:02,760 y-axis, the value of the principal in dollars. 624 00:42:11,450 --> 00:42:15,350 So now it's the same curve, but a far 625 00:42:15,350 --> 00:42:17,455 more informative picture. 626 00:42:20,580 --> 00:42:23,730 All right here's where I want to stop today.
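A minimal sketch of the labeled version of the interest plot, using `pylab.title`, `pylab.xlabel`, and `pylab.ylabel`. The exact label strings and the principal of 10,000 are assumptions chosen to match the description above; `pylab.show()` is left commented out so the sketch runs without a display.

```python
import pylab  # ships with matplotlib; not part of the standard library

principal = 10000.0      # assumed starting value
interest_rate = 0.05
years = 20

# Balance at year 0 and after each year of annual compounding.
values = []
for _ in range(years + 1):
    values.append(principal)
    principal += principal * interest_rate

pylab.plot(values)       # x-values default to the indices 0..20
pylab.title('5% Growth, Compounded Annually')
pylab.xlabel('Years of Compounding')
pylab.ylabel('Value of Principal ($)')

# Uncomment to display; call once, as the last statement.
# pylab.show()
```

The title states both the rate and the compounding period, so a reader does not need the surrounding code to interpret the curve.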
627 00:42:23,730 --> 00:42:27,960 We're going to come back to this topic, in probably more 628 00:42:27,960 --> 00:42:31,640 detail than you want, and spend quite a lot of time 629 00:42:31,640 --> 00:42:35,890 talking about how do we produce beautiful plots, and 630 00:42:35,890 --> 00:42:38,080 more importantly, how do we produce plots that are, 631 00:42:38,080 --> 00:42:41,760 actually, meaningful to those reading them. 632 00:42:41,760 --> 00:42:43,220 Thanks a lot. 633 00:42:43,220 --> 00:42:44,710 I'll see you in the next lecture.