1 00:00:00,030 --> 00:00:02,400 The following content is provided under a Creative 2 00:00:02,400 --> 00:00:03,840 Commons license. 3 00:00:03,840 --> 00:00:06,840 Your support will help MIT OpenCourseWare continue to 4 00:00:06,840 --> 00:00:10,530 offer high quality educational resources for free. 5 00:00:10,530 --> 00:00:13,390 To make a donation or view additional materials from 6 00:00:13,390 --> 00:00:17,190 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:17,190 --> 00:00:21,960 ocw.mit.edu. 8 00:00:21,960 --> 00:00:24,640 PROFESSOR: One of the things that you should probably have 9 00:00:24,640 --> 00:00:29,490 noticed is, as we're moving along in the term, the problem sets 10 00:00:29,490 --> 00:00:35,960 are getting less well defined. And I've seen a lot of 11 00:00:35,960 --> 00:00:39,870 email traffic of the nature of what should we do with this, 12 00:00:39,870 --> 00:00:41,140 what should we do with that? 13 00:00:41,140 --> 00:00:44,040 For example, suppose the computer 14 00:00:44,040 --> 00:00:45,830 player runs out of time. 15 00:00:45,830 --> 00:00:49,400 Or the person runs out of time playing the game. 16 00:00:49,400 --> 00:00:51,400 Should it stop right away? 17 00:00:51,400 --> 00:00:55,670 Should it just give them zero score? 18 00:00:55,670 --> 00:00:57,530 That's left out of the problem set. 19 00:00:57,530 --> 00:01:00,960 In part because one of the things we're trying to 20 00:01:00,960 --> 00:01:05,240 accomplish is to have you folks start noticing 21 00:01:05,240 --> 00:01:07,910 ambiguities in problem statements. 22 00:01:07,910 --> 00:01:12,120 Because that's life in computing. 23 00:01:12,120 --> 00:01:15,360 And so this is not like a math problem set or a physics 24 00:01:15,360 --> 00:01:17,210 problem set. 
25 00:01:17,210 --> 00:01:19,740 Or, like a high school physics lab, where we all know what 26 00:01:19,740 --> 00:01:21,400 the answer should be, and you could fake 27 00:01:21,400 --> 00:01:25,210 your lab results anyway. 28 00:01:25,210 --> 00:01:27,250 These are things where you're going to have to kind of 29 00:01:27,250 --> 00:01:29,060 figure it out. 30 00:01:29,060 --> 00:01:32,660 And for most of these things, all we ask is that you do 31 00:01:32,660 --> 00:01:36,110 something reasonable, and that you describe what 32 00:01:36,110 --> 00:01:38,410 it is you're doing. 33 00:01:38,410 --> 00:01:41,710 So I don't much care, for example, whether you give the 34 00:01:41,710 --> 00:01:44,580 human players zero points for playing after 35 00:01:44,580 --> 00:01:46,560 the time runs out. 36 00:01:46,560 --> 00:01:50,180 Or you say you're done when the time runs out. 37 00:01:50,180 --> 00:01:54,810 Any of that -- thank you, Sheila -- is ok with me. 38 00:01:54,810 --> 00:01:57,320 Whatever. 39 00:01:57,320 --> 00:01:59,700 What I don't want your program to do is 40 00:01:59,700 --> 00:02:01,030 crash when that happens. 41 00:02:01,030 --> 00:02:02,650 Or run forever. 42 00:02:02,650 --> 00:02:05,180 Figure out something reasonable and do it. 43 00:02:05,180 --> 00:02:08,600 And again, we'll see this as an increasing trend as we work 44 00:02:08,600 --> 00:02:11,260 our way through the term. 45 00:02:11,260 --> 00:02:15,290 The exception will be the next problem set, which will come 46 00:02:15,290 --> 00:02:17,500 out Friday. 47 00:02:17,500 --> 00:02:20,800 Because that's not a programming problem. 48 00:02:20,800 --> 00:02:23,850 It's a problem, as you'll see, designed to give you some 49 00:02:23,850 --> 00:02:28,460 practice at dealing with some of the, dare I say, more 50 00:02:28,460 --> 00:02:31,420 theoretical concepts we've covered in class. 51 00:02:31,420 --> 00:02:34,630 Like algorithmic complexity. 
52 00:02:34,630 --> 00:02:39,960 That are not readily dealt with in a programming problem. 53 00:02:39,960 --> 00:02:43,950 It also deals with issues of some of the subtleties of 54 00:02:43,950 --> 00:02:46,190 things like aliasing. 55 00:02:46,190 --> 00:02:48,560 So there's no programming. 56 00:02:48,560 --> 00:02:50,611 And in fact, we're not even going to ask 57 00:02:50,611 --> 00:02:52,640 you to hand it in. 58 00:02:52,640 --> 00:02:55,780 It's a problem set where we've worked pretty hard to write 59 00:02:55,780 --> 00:02:58,750 some problems that we think will provide you with a good 60 00:02:58,750 --> 00:03:00,820 learning experience. 61 00:03:00,820 --> 00:03:03,980 And you should just do it to learn the material. 62 00:03:03,980 --> 00:03:07,260 We'll help you, if you need -- to see the TAs because you 63 00:03:07,260 --> 00:03:10,720 can't do them, by all means make sure you get some help. 64 00:03:10,720 --> 00:03:13,835 So I'm not suggesting that it's an optional problem set 65 00:03:13,835 --> 00:03:15,780 that you shouldn't do. 66 00:03:15,780 --> 00:03:19,020 Because you will come to regret it if you don't do it. 67 00:03:19,020 --> 00:03:20,540 But we're not going to grade it. 68 00:03:20,540 --> 00:03:23,420 And since we're not going to grade it, it seems kind of 69 00:03:23,420 --> 00:03:25,550 unfair to ask you to hand it in. 70 00:03:25,550 --> 00:03:29,000 So it's a short problem set, but make sure you know how to 71 00:03:29,000 --> 00:03:29,130 do 72 00:03:29,130 --> 00:03:33,600 those problems. OK. 73 00:03:33,600 --> 00:03:36,710 Today, for the rest of the lecture, we're going to take a 74 00:03:36,710 --> 00:03:41,950 break from the topic of algorithms, and computation, 75 00:03:41,950 --> 00:03:43,490 and things of the sort. 76 00:03:43,490 --> 00:03:47,200 And do something pretty pragmatic. 77 00:03:47,200 --> 00:03:52,200 And we're going to talk briefly about testing. 
78 00:03:52,200 --> 00:04:01,220 And at considerable length about debugging. 79 00:04:01,220 --> 00:04:04,110 I have tried to give this lecture at the beginning of 80 00:04:04,110 --> 00:04:06,120 the term, at the end of the term. 81 00:04:06,120 --> 00:04:08,650 Now I'm trying it kind of a third of the way, or 82 00:04:08,650 --> 00:04:10,590 middle of the term. 83 00:04:10,590 --> 00:04:13,550 I never know the right time to give it. 84 00:04:13,550 --> 00:04:17,070 These are sort of pragmatic hints that are useful. 85 00:04:17,070 --> 00:04:20,820 I suspect all of you have found that debugging can be a 86 00:04:20,820 --> 00:04:24,160 frustrating activity. 87 00:04:24,160 --> 00:04:27,000 My hope is that at this point, you've experienced enough 88 00:04:27,000 --> 00:04:30,780 frustration that the kind of pragmatic hints I'm going to 89 00:04:30,780 --> 00:04:35,250 talk about will not be, "yeah sure, of course." But they'll 90 00:04:35,250 --> 00:04:37,410 actually make sense to you. 91 00:04:37,410 --> 00:04:40,560 We'll see. 92 00:04:40,560 --> 00:04:41,850 OK. 93 00:04:41,850 --> 00:04:45,480 In a perfect world, the weather would always be like 94 00:04:45,480 --> 00:04:48,730 it's been this week. 95 00:04:48,730 --> 00:04:51,600 The M in MIT would stand for Maui, instead of 96 00:04:51,600 --> 00:04:54,680 Massachusetts. 97 00:04:54,680 --> 00:04:57,990 Quantum physics would be easier to understand. 98 00:04:57,990 --> 00:05:03,770 All the supreme court justices would share our social values. 99 00:05:03,770 --> 00:05:06,840 And most importantly, our programs would work the first 100 00:05:06,840 --> 00:05:09,850 time we typed them. 101 00:05:09,850 --> 00:05:13,130 By now you may have noticed that we do not live in an 102 00:05:13,130 --> 00:05:14,630 ideal world. 103 00:05:14,630 --> 00:05:20,140 At least one of those things I mentioned is not true. 104 00:05:20,140 --> 00:05:22,300 I'm only going to address the last one. 
105 00:05:22,300 --> 00:05:24,180 Why our programs don't work. 106 00:05:24,180 --> 00:05:27,430 And I will leave the supreme court up to the rest of you. 107 00:05:27,430 --> 00:05:29,340 There is an election coming up. 108 00:05:29,340 --> 00:05:33,080 Alright, first a few definitions. 109 00:05:33,080 --> 00:05:42,260 Things I want to make sure we all understand what they mean. 110 00:05:42,260 --> 00:05:46,370 Validation is a process. 111 00:05:46,370 --> 00:05:52,160 And I want to emphasize the word process. 112 00:05:52,160 --> 00:06:09,630 Designed to uncover problems and increase confidence that 113 00:06:09,630 --> 00:06:16,820 our program does what we think it's intended to do. 114 00:06:16,820 --> 00:06:23,600 I want to emphasize that it will increase our confidence, 115 00:06:23,600 --> 00:06:29,960 but we can never really be sure we've got it nailed. 116 00:06:29,960 --> 00:06:33,340 And so it's a process that goes on and on. 117 00:06:33,340 --> 00:06:37,930 And I also want to emphasize that a big piece of it is to 118 00:06:37,930 --> 00:06:43,030 uncover problems. So we need to have a method not designed 119 00:06:43,030 --> 00:06:45,910 to give us unwarranted confidence. 120 00:06:45,910 --> 00:06:50,650 But in fact warranted confidence in our programs. 121 00:06:50,650 --> 00:06:55,490 It's typically a combination of two things. 122 00:06:55,490 --> 00:07:05,280 Testing and reasoning. 123 00:07:05,280 --> 00:07:09,160 Testing, we run our program on some set of inputs. 124 00:07:09,160 --> 00:07:13,910 And check the answers, and say yeah, that's what we expected. 125 00:07:13,910 --> 00:07:16,540 But it also involves reasoning. 126 00:07:16,540 --> 00:07:19,220 About why that's an appropriate set of inputs to 127 00:07:19,220 --> 00:07:20,400 test it on. 128 00:07:20,400 --> 00:07:23,140 Have we tested it on enough inputs? 
129 00:07:23,140 --> 00:07:26,930 Maybe just reading the code and studying it and convincing 130 00:07:26,930 --> 00:07:29,640 ourselves that it works. 131 00:07:29,640 --> 00:07:37,350 So we do both of those as part of the validation process. 132 00:07:37,350 --> 00:07:42,870 And we'll talk about all of this as we go along. 133 00:07:42,870 --> 00:07:58,020 Debugging is a different process. 134 00:07:58,020 --> 00:08:11,160 And that's basically the process of ascertaining why 135 00:08:11,160 --> 00:08:18,140 the program is not working. 136 00:08:18,140 --> 00:08:24,810 Why it's failing to work properly. 137 00:08:24,810 --> 00:08:29,950 So validation says whoops, it's not working. 138 00:08:29,950 --> 00:08:35,030 And now we try and figure out why not. 139 00:08:35,030 --> 00:08:37,510 And then of course, once we figure out why not, we try and 140 00:08:37,510 --> 00:08:43,110 fix it. But today I'm going to emphasize not how do you fix 141 00:08:43,110 --> 00:08:45,640 it, but how do you find out what's wrong. 142 00:08:45,640 --> 00:08:48,765 Usually when you know why it's not working, it's obvious what 143 00:08:48,765 --> 00:08:51,950 you have to do to make it work. 144 00:08:51,950 --> 00:08:56,590 There are two aspects of it. 145 00:08:56,590 --> 00:08:59,340 Thus far, the problem sets have 146 00:08:59,340 --> 00:09:02,640 mostly focused on function. 147 00:09:02,640 --> 00:09:06,350 Does it exhibit the functional behavior? 148 00:09:06,350 --> 00:09:12,170 Does it give you the answer that you expected it to give? 149 00:09:12,170 --> 00:09:17,050 Often, in practical problems, you'll spend just as much time 150 00:09:17,050 --> 00:09:21,180 doing performance debugging. 151 00:09:21,180 --> 00:09:23,240 Why is it slow? 152 00:09:23,240 --> 00:09:26,310 Why is it not getting the answer as fast 153 00:09:26,310 --> 00:09:28,980 as I want it to? 
154 00:09:28,980 --> 00:09:31,510 And in fact, in a lot of industry -- for example, if 155 00:09:31,510 --> 00:09:36,170 you're working on building a computer game, you'll discover 156 00:09:36,170 --> 00:09:38,410 that in fact the people working on the game will spend 157 00:09:38,410 --> 00:09:42,010 more time on performance debugging than on getting it 158 00:09:42,010 --> 00:09:43,310 to do the right thing. 159 00:09:43,310 --> 00:09:45,540 Trying to make it do it fast enough. 160 00:09:45,540 --> 00:09:51,470 Or get it to run on the right processor. 161 00:09:51,470 --> 00:09:56,920 Another term we've talked about is defensive 162 00:09:56,920 --> 00:10:02,590 programming. 163 00:10:02,590 --> 00:10:05,840 And we've been weaving that pretty consistently 164 00:10:05,840 --> 00:10:08,320 throughout the term. 165 00:10:08,320 --> 00:10:13,020 And that's basically writing your programs in such a way 166 00:10:13,020 --> 00:10:31,180 that it will facilitate both validation and debugging. 167 00:10:31,180 --> 00:10:34,280 And we've talked about a lot of ways we do that. 168 00:10:34,280 --> 00:10:39,990 One of the most important things we do is we use assert 169 00:10:39,990 --> 00:10:44,070 statements so that we catch problems early. 170 00:10:44,070 --> 00:10:46,170 We write specifications of our functions. 171 00:10:46,170 --> 00:10:48,620 We modularize things. 172 00:10:48,620 --> 00:10:49,990 And we'll come back to this. 173 00:10:49,990 --> 00:10:53,710 As every time we introduce a new programming concept, we'll 174 00:10:53,710 --> 00:10:57,340 relate it back, as we have been doing consistently, to 175 00:10:57,340 --> 00:11:00,330 defensive programming. 176 00:11:00,330 --> 00:11:03,780 So one of the things I want you to notice here is that 177 00:11:03,780 --> 00:11:08,710 testing and debugging are not the same thing. 178 00:11:08,710 --> 00:11:13,310 When we test, we compare an input output pair to a 179 00:11:13,310 --> 00:11:16,830 specification. 
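[Editor's illustration, not part of the spoken lecture] A minimal Python sketch of the ideas just described. The function name `average`, its specification, and the test pairs are all hypothetical, chosen only for illustration. The docstring plays the role of the specification, the assert statement catches a violated assumption early, and `run_tests` compares input/output pairs against what the specification promises.

```python
def average(values):
    """Specification (hypothetical): values is a non-empty list of numbers.
    Returns the arithmetic mean of values as a float."""
    # Defensive programming: fail early and loudly if the caller breaks the spec.
    assert len(values) > 0, "average() requires a non-empty list"
    return sum(values) / float(len(values))


def run_tests():
    """Compare input/output pairs against the specification.
    Returns the list of inputs whose output did not match."""
    cases = [([1, 2, 3], 2.0), ([5], 5.0), ([-1, 1], 0.0)]
    failures = []
    for inputs, expected in cases:
        if average(inputs) != expected:
            failures.append(inputs)
    return failures


print(run_tests())  # an empty list means every pair matched the specification
```

Testing stops at reporting which pairs failed. Working out why a pair failed is the separate activity of debugging.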
180 00:11:16,830 --> 00:11:44,150 When we debug, we study the events that led to an error. 181 00:11:44,150 --> 00:11:47,480 I'll return to testing later in the term. 182 00:11:47,480 --> 00:11:51,120 But I do want to make a couple of quick remarks with very 183 00:11:51,120 --> 00:11:55,040 broad strokes. 184 00:11:55,040 --> 00:11:58,880 There are basically two classes of testing. 185 00:11:58,880 --> 00:12:05,210 There's unit testing, where we validate each piece of the 186 00:12:05,210 --> 00:12:08,640 program independently. 187 00:12:08,640 --> 00:12:14,850 Thus far, for us it's been testing individual functions. 188 00:12:14,850 --> 00:12:28,330 Later in the term, we'll talk about unit testing of classes. 189 00:12:28,330 --> 00:12:33,830 The other kind of testing is integration testing. 190 00:12:33,830 --> 00:12:37,930 Where we put our whole program together, and we say does the 191 00:12:37,930 --> 00:12:46,970 whole thing work? 192 00:12:46,970 --> 00:12:52,530 People tend to want to rush in and do this right away. 193 00:12:52,530 --> 00:12:54,930 That's usually a big mistake. 194 00:12:54,930 --> 00:12:59,690 Because usually it doesn't work. 195 00:12:59,690 --> 00:13:05,250 And so one of the things that I think is very important is 196 00:13:05,250 --> 00:13:08,520 to always begin by testing each unit. 197 00:13:08,520 --> 00:13:12,180 So before I try and run my program, I test each part of 198 00:13:12,180 --> 00:13:16,690 it independently. 199 00:13:16,690 --> 00:13:19,070 And that's because it's easier to test small 200 00:13:19,070 --> 00:13:21,140 things than big things. 201 00:13:21,140 --> 00:13:25,540 And it's easier to debug small things than big things. 202 00:13:25,540 --> 00:13:27,960 Eventually, it's a big program, I run it. 203 00:13:27,960 --> 00:13:31,710 It never works the first time if it's a big program. 
204 00:13:31,710 --> 00:13:34,320 And I end up going back and doing unit testing anyway, to 205 00:13:34,320 --> 00:13:36,430 try and figure out why it doesn't work. 206 00:13:36,430 --> 00:13:39,000 So over the years, I've just convinced myself I might as 207 00:13:39,000 --> 00:13:44,490 well start where I'm going to end up. 208 00:13:44,490 --> 00:13:47,440 What's so hard about testing? 209 00:13:47,440 --> 00:13:56,420 Why is testing always a challenge? 210 00:13:56,420 --> 00:13:59,340 Well, you could just try it and see if it works, right? 211 00:13:59,340 --> 00:14:02,140 That's what testing is all about. 212 00:14:02,140 --> 00:14:07,580 So we could look at something small. 213 00:14:07,580 --> 00:14:13,740 Just write a program to find the max of x and y. 214 00:14:13,740 --> 00:14:24,810 Where x and y are floats. 215 00:14:24,810 --> 00:14:28,840 However many quotes I need. 216 00:14:28,840 --> 00:14:31,780 Well, just see if it works. 217 00:14:31,780 --> 00:14:34,650 Let's test it in all possible combinations of x and y and 218 00:14:34,650 --> 00:14:37,100 see if we get the right answer. 219 00:14:37,100 --> 00:14:40,920 Well, as Carl Sagan would have said, there are billions and 220 00:14:40,920 --> 00:14:43,780 billions of tests we would have to do. 221 00:14:43,780 --> 00:14:47,190 Or maybe it's billions and billions and billions. 222 00:14:47,190 --> 00:14:49,310 Pretty impractical. 223 00:14:49,310 --> 00:14:53,390 And it's hard to imagine a simpler program than this. 224 00:14:53,390 --> 00:14:57,340 So we very quickly realize that exhaustive testing is 225 00:14:57,340 --> 00:15:01,870 just never feasible for an interesting program. 226 00:15:01,870 --> 00:15:05,620 So as we look at testing, what we have to find is what's 227 00:15:05,620 --> 00:15:12,300 called a test suite. 228 00:15:12,300 --> 00:15:21,430 A test suite is small enough so that we can test it in a 229 00:15:21,430 --> 00:15:27,160 reasonable amount of time. 
230 00:15:27,160 --> 00:15:47,150 But also large enough to give us some confidence. 231 00:15:47,150 --> 00:15:51,250 Later in the term, we'll spend part of a lecture talking 232 00:15:51,250 --> 00:15:54,760 about, how do we find such a test suite? 233 00:15:54,760 --> 00:15:58,460 A test suite that will make us feel good about things. 234 00:15:58,460 --> 00:16:02,780 For now, I just want you to be aware that you're always doing 235 00:16:02,780 --> 00:16:06,450 this balancing act. 236 00:16:06,450 --> 00:16:09,950 So let's assume we've run our test suite. 237 00:16:09,950 --> 00:16:14,730 And, sad to say, at least one of our tests produced an 238 00:16:14,730 --> 00:16:19,000 output that we were unhappy with. 239 00:16:19,000 --> 00:16:21,140 It took it too long to generate the output. 240 00:16:21,140 --> 00:16:26,410 Or more likely, it was just the wrong output. 241 00:16:26,410 --> 00:16:29,220 That gets us to debugging. 242 00:16:29,220 --> 00:16:36,440 So a word about debugging. 243 00:16:36,440 --> 00:16:38,030 Where did the name come from? 244 00:16:38,030 --> 00:16:43,610 Well here's a fun story, at least. This was one of the 245 00:16:43,610 --> 00:16:48,950 very first recorded bugs in the history of computation. 246 00:16:48,950 --> 00:16:54,640 Recorded September 9th, 1947, in case you're interested. 247 00:16:54,640 --> 00:16:58,370 This was the lab book of Grace Murray Hopper. 248 00:16:58,370 --> 00:17:00,230 Later Admiral Grace Murray Hopper. 249 00:17:00,230 --> 00:17:04,290 The first female admiral in the U.S. navy. 250 00:17:04,290 --> 00:17:10,170 Who was also one of the world's first programmers. 251 00:17:10,170 --> 00:17:13,090 So she was trying to write this program, 252 00:17:13,090 --> 00:17:14,880 and it didn't work. 253 00:17:14,880 --> 00:17:16,250 It was a complicated program. 254 00:17:16,250 --> 00:17:18,760 It was computing the arctan. 255 00:17:18,760 --> 00:17:20,690 So you can imagine, right? 
256 00:17:20,690 --> 00:17:23,570 You had a whole team of people trying to figure 257 00:17:23,570 --> 00:17:25,340 out how to do arctans. 258 00:17:25,340 --> 00:17:27,560 Times were different in those days. 259 00:17:27,560 --> 00:17:30,980 And they tried to run it, and it ran a long time. 260 00:17:30,980 --> 00:17:32,910 Then it basically stopped. 261 00:17:32,910 --> 00:17:35,360 Then they started the cosine tape. 262 00:17:35,360 --> 00:17:39,860 That didn't work. 263 00:17:39,860 --> 00:17:41,720 Well they couldn't figure out what was wrong. 264 00:17:41,720 --> 00:17:45,630 And they spent a long time trying to debug the program. 265 00:17:45,630 --> 00:17:47,640 They didn't apparently call it debugging. 266 00:17:47,640 --> 00:17:51,900 And then they found the problem. 267 00:17:51,900 --> 00:17:56,840 In relay number 70, a moth had been trapped. 268 00:17:56,840 --> 00:17:59,330 And the relay had closed on the poor creature, 269 00:17:59,330 --> 00:18:02,280 crushing it to death. 270 00:18:02,280 --> 00:18:06,780 The defense department didn't care about the loss of a moth. 271 00:18:06,780 --> 00:18:08,370 But they did care about the fact that the 272 00:18:08,370 --> 00:18:09,970 relay was now stuck. 273 00:18:09,970 --> 00:18:11,630 It didn't work. 274 00:18:11,630 --> 00:18:14,890 They removed the moth, and the program worked. 275 00:18:14,890 --> 00:18:19,060 And you'll see at the bottom, it says the first actual case 276 00:18:19,060 --> 00:18:21,850 of a bug being found. 277 00:18:21,850 --> 00:18:23,380 And they were very proud of themselves. 278 00:18:23,380 --> 00:18:25,900 Now it's a wonderful story, and it is true. 279 00:18:25,900 --> 00:18:28,750 After all, Grace wouldn't have lied. 280 00:18:28,750 --> 00:18:31,700 But it's not the first use of the term "bug." And as you'll 281 00:18:31,700 --> 00:18:34,740 see by your handout, I've attempted to trace it. 282 00:18:34,740 --> 00:18:38,460 And the first one I could find was in 1896. 
283 00:18:38,460 --> 00:18:44,250 In a handbook on electricity. 284 00:18:44,250 --> 00:18:44,530 Alright. 285 00:18:44,530 --> 00:18:48,220 Now debugging is a learned skill. 286 00:18:48,220 --> 00:18:51,580 Nobody does it well instinctively. 287 00:18:51,580 --> 00:18:55,060 And a large part of being a good programmer, or learning 288 00:18:55,060 --> 00:18:59,050 to be a good programmer, is learning how to debug. 289 00:18:59,050 --> 00:19:02,170 And it's one of these things where it's harder. 290 00:19:02,170 --> 00:19:05,310 It's slow, slow, and you suddenly have an epiphany. 291 00:19:05,310 --> 00:19:08,070 And you now get the hang of it. 292 00:19:08,070 --> 00:19:09,850 And I'm hoping that today's lecture will 293 00:19:09,850 --> 00:19:13,200 help you learn faster. 294 00:19:13,200 --> 00:19:17,620 The nice thing, is once you learn to debug programs, you 295 00:19:17,620 --> 00:19:20,950 will discover it's a transferable skill. 296 00:19:20,950 --> 00:19:26,110 And you can use it to debug other complex systems. So for 297 00:19:26,110 --> 00:19:28,730 example, a laboratory experience. 298 00:19:28,730 --> 00:19:31,740 Why isn't this experiment working? 299 00:19:31,740 --> 00:19:35,320 There's a lecture I've given several times at hospitals, to 300 00:19:35,320 --> 00:19:41,300 doctors, on doing diagnosis of complex multi illnesses. 301 00:19:41,300 --> 00:19:44,050 And I go through it, and almost the same kind of stuff 302 00:19:44,050 --> 00:19:46,690 I'm going to talk to you about, about debugging. 303 00:19:46,690 --> 00:19:50,830 Explaining that it's really a process of engineering. 304 00:19:50,830 --> 00:19:54,650 So I want to start by disabusing you of a couple of 305 00:19:54,650 --> 00:19:56,740 myths about bugs. 306 00:19:56,740 --> 00:20:08,200 So myth one is that bugs crawl into programs. Well it may 307 00:20:08,200 --> 00:20:12,980 have been true in the old days, when bugs flew or 308 00:20:12,980 --> 00:20:15,230 crawled into relays. 
309 00:20:15,230 --> 00:20:17,900 It's not true now. 310 00:20:17,900 --> 00:20:20,620 If there is a bug in the program, it's there for only 311 00:20:20,620 --> 00:20:22,080 one reason. 312 00:20:22,080 --> 00:20:27,820 You put it there. i.e. you made a mistake. 313 00:20:27,820 --> 00:20:30,070 So we like to call them bugs, because it doesn't make us 314 00:20:30,070 --> 00:20:31,700 feel stupid. 315 00:20:31,700 --> 00:20:37,880 But in fact, a better word would be mistake. 316 00:20:37,880 --> 00:20:45,840 Another myth is that the bugs breed. 317 00:20:45,840 --> 00:20:47,350 They do not. 318 00:20:47,350 --> 00:20:50,790 If there are multiple bugs in the program, it's because you 319 00:20:50,790 --> 00:20:53,570 made multiple mistakes. 320 00:20:53,570 --> 00:20:56,650 Not because you made one or two and they mated and 321 00:20:56,650 --> 00:20:59,190 produced many more bugs. 322 00:20:59,190 --> 00:21:00,290 It doesn't work that way. 323 00:21:00,290 --> 00:21:02,830 That's a good thing. 324 00:21:02,830 --> 00:21:05,870 Typically, even though they don't breed, 325 00:21:05,870 --> 00:21:09,150 there are many bugs. 326 00:21:09,150 --> 00:21:19,330 And keep in mind that the goal of debugging is not to 327 00:21:19,330 --> 00:21:29,230 eliminate one bug. 328 00:21:29,230 --> 00:21:39,720 The goal is to move towards a bug free program. 329 00:21:39,720 --> 00:21:42,800 I emphasize this because it often leads to a different 330 00:21:42,800 --> 00:21:44,810 debugging strategy. 331 00:21:44,810 --> 00:21:47,680 People can get hung up on sort of hunting these things down, 332 00:21:47,680 --> 00:21:49,850 and stamping them out, one at a time. 333 00:21:49,850 --> 00:21:52,230 And it's a little bit like playing Whack-a-Mole. 334 00:21:52,230 --> 00:21:52,780 Right? 335 00:21:52,780 --> 00:21:55,010 They keep jumping up at you. 336 00:21:55,010 --> 00:22:00,280 So the goal is to figure out a way to stamp them all out. 
337 00:22:00,280 --> 00:22:04,250 Now, should you be proud when you find a bug? 338 00:22:04,250 --> 00:22:07,740 I've had graduate students come to me and say I found a 339 00:22:07,740 --> 00:22:08,770 bug in my program. 340 00:22:08,770 --> 00:22:11,150 And they're really proud of themselves. 341 00:22:11,150 --> 00:22:14,560 And depending on the mood I'm in, I either congratulate 342 00:22:14,560 --> 00:22:18,600 them, or I say ah, you screwed up, huh? 343 00:22:18,600 --> 00:22:23,050 Then you had to fix it. 344 00:22:23,050 --> 00:22:28,820 If you find a bug, it probably means there are more of them. 345 00:22:28,820 --> 00:22:31,940 So you ought to be a little bit careful. 346 00:22:31,940 --> 00:22:34,860 The story I've heard told is you're at somebody's house for 347 00:22:34,860 --> 00:22:36,620 dinner, and you're sitting at the dining room table, then 348 00:22:36,620 --> 00:22:39,900 you hear a [BANG]. 349 00:22:39,900 --> 00:22:42,620 And then your hostess walks in with the turkey in a tray, and 350 00:22:42,620 --> 00:22:49,830 says, "I killed the last cockroach." Well it wouldn't 351 00:22:49,830 --> 00:22:57,180 increase my appetite, at least. So be worried about it. 352 00:22:57,180 --> 00:23:00,600 For at least four decades, people have been building 353 00:23:00,600 --> 00:23:02,770 tools called debuggers. 354 00:23:02,770 --> 00:23:04,220 Things to help you find bugs. 355 00:23:04,220 --> 00:23:07,750 And there are some built into IDLE. 356 00:23:07,750 --> 00:23:09,870 My personal view is most of them are 357 00:23:09,870 --> 00:23:12,560 not worth the trouble. 358 00:23:12,560 --> 00:23:17,440 The two best debugging tools are the same now that they 359 00:23:17,440 --> 00:23:20,590 have almost always been. 360 00:23:20,590 --> 00:23:41,980 And they are the print statement, and reading. 361 00:23:41,980 --> 00:23:47,160 There is no substitute for reading your code. 
362 00:23:47,160 --> 00:23:50,400 Getting good at this is probably the single most 363 00:23:50,400 --> 00:23:53,750 important skill for debugging. 364 00:23:53,750 --> 00:23:57,920 And people are often resistant to that. 365 00:23:57,920 --> 00:24:00,260 They'd rather single step it through using IDLE or 366 00:24:00,260 --> 00:24:06,920 something, than just read it and try and figure things out. 367 00:24:06,920 --> 00:24:09,610 The most important thing to remember when you're doing all 368 00:24:09,610 --> 00:24:17,370 of this is to be systematic. 369 00:24:17,370 --> 00:24:21,330 That's what distinguishes good debuggers from bad debuggers. 370 00:24:21,330 --> 00:24:26,840 Good debuggers have evolved a way of systematically hunting 371 00:24:26,840 --> 00:24:29,320 for the bugs. 372 00:24:29,320 --> 00:24:33,460 And what they're doing as they hunt, is they're reducing the 373 00:24:33,460 --> 00:24:42,340 search space. 374 00:24:42,340 --> 00:24:52,680 And they do that to localize the source of the problem. 375 00:24:52,680 --> 00:24:55,960 We've already spent a fair amount of time this semester 376 00:24:55,960 --> 00:24:58,270 talking about searches. 377 00:24:58,270 --> 00:25:00,450 Algorithms for searching. 378 00:25:00,450 --> 00:25:07,000 Debugging is simply a search process. 379 00:25:07,000 --> 00:25:09,420 When you are searching a list to see whether it has an 380 00:25:09,420 --> 00:25:14,520 element, you don't randomly probe the list, hoping to find 381 00:25:14,520 --> 00:25:15,890 whether or not it's there. 382 00:25:15,890 --> 00:25:18,530 You find some way of systematically going through 383 00:25:18,530 --> 00:25:22,550 the list. Yet, I often see people, when they're 384 00:25:22,550 --> 00:25:26,120 debugging, proceeding at what, to me, looks almost like a 385 00:25:26,120 --> 00:25:32,580 random fashion of looking for the bug. 386 00:25:32,580 --> 00:25:35,940 That is a problem that may not terminate. 
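[Editor's illustration, not part of the spoken lecture] One possible way to make "debugging is a search process" concrete is to bisect over recorded intermediate results. This sketch is an assumption on the editor's part, not a method stated in the lecture so far. The names `first_bad_step`, `steps`, and `is_valid` are hypothetical, and the approach assumes that once an intermediate value goes wrong, every later value stays wrong.

```python
def first_bad_step(steps, start, is_valid):
    """Return the index of the first step whose output fails is_valid,
    or None if the final result is valid.
    Assumes that once a result is invalid, all later results stay invalid."""
    # Record every intermediate result, much as print statements would.
    results = []
    value = start
    for step in steps:
        value = step(value)
        results.append(value)
    if not results or is_valid(results[-1]):
        return None
    low, high = 0, len(results) - 1  # invariant: results[high] is invalid
    while low < high:
        mid = (low + high) // 2
        if is_valid(results[mid]):
            low = mid + 1  # the problem was introduced after mid
        else:
            high = mid     # the problem was introduced at or before mid
    return low


# Hypothetical pipeline in which the third step (index 2) is the mistake.
pipeline = [lambda v: v * 2, lambda v: v + 1, lambda v: -v, lambda v: v * 2]
print(first_bad_step(pipeline, 3, lambda v: v > 0))  # → 2
```

Each probe halves the region where the mistake can be hiding, which is the opposite of poking at the program in a random fashion.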
387 00:25:35,940 --> 00:25:39,730 So you need to be careful. 388 00:25:39,730 --> 00:25:42,820 So let's talk about how we go about being 389 00:25:42,820 --> 00:25:57,580 systematic, as we do this. 390 00:25:57,580 --> 00:26:00,860 So debugging starts when we find out that 391 00:26:00,860 --> 00:26:06,140 there exists a problem. 392 00:26:06,140 --> 00:26:14,780 So the first thing to do is to study the program text, and 393 00:26:14,780 --> 00:26:32,480 ask how could it have produced this result? 394 00:26:32,480 --> 00:26:34,330 So there's something subtle about the 395 00:26:34,330 --> 00:26:38,110 way I've worded this. 396 00:26:38,110 --> 00:26:41,650 I didn't ask, why didn't it produce the result I wanted it 397 00:26:41,650 --> 00:26:43,880 to produce? 398 00:26:43,880 --> 00:26:47,610 Which is sort of the question we'd immediately like to ask. 399 00:26:47,610 --> 00:26:53,190 Instead, I asked why did it produce the result it did. 400 00:26:53,190 --> 00:26:58,200 So I'm not asking myself what's wrong? 401 00:26:58,200 --> 00:26:59,850 Or how could I make it right? 402 00:26:59,850 --> 00:27:01,530 I'm asking how could it have done this? 403 00:27:01,530 --> 00:27:04,450 I didn't expect it to do this. 404 00:27:04,450 --> 00:27:07,400 If you understand why it did what it did, 405 00:27:07,400 --> 00:27:10,530 you're half way there. 406 00:27:10,530 --> 00:27:22,400 The next big question you ask, is it part of a family? 407 00:27:22,400 --> 00:27:26,200 This gets back to the question of trying to get the program 408 00:27:26,200 --> 00:27:28,240 to be bug free. 409 00:27:28,240 --> 00:27:33,830 So for example, oh, it did this because it was aliasing, 410 00:27:33,830 --> 00:27:35,930 where I hadn't expected it. 411 00:27:35,930 --> 00:27:39,980 Or some side effect of some mutation with lists. 412 00:27:39,980 --> 00:27:42,330 And then I say, oh you know I've used lists 413 00:27:42,330 --> 00:27:44,520 all over this program. 
414 00:27:44,520 --> 00:27:46,120 I'll bet this isn't the only place where 415 00:27:46,120 --> 00:27:50,550 I've made this mistake. 416 00:27:50,550 --> 00:27:53,590 So you say well, rather than rushing off and fixing this 417 00:27:53,590 --> 00:27:57,180 one bug, let me pull back and ask, is this a systematic 418 00:27:57,180 --> 00:28:01,030 mistake that I've made throughout the program? 419 00:28:01,030 --> 00:28:03,930 And if so, let's fix them all at once, rather 420 00:28:03,930 --> 00:28:06,480 than one at a time. 421 00:28:06,480 --> 00:28:11,430 And that gets me to the final question. 422 00:28:11,430 --> 00:28:21,030 How to fix it. 423 00:28:21,030 --> 00:28:27,730 When I think about debugging, I think about it in terms of 424 00:28:27,730 --> 00:28:29,930 what you learned in high school as 425 00:28:29,930 --> 00:28:35,660 the scientific method. 426 00:28:35,660 --> 00:28:36,900 Actually, I should ask the question. 427 00:28:36,900 --> 00:28:38,550 Maybe I'm dating myself. 428 00:28:38,550 --> 00:28:39,960 Do they still teach the scientific 429 00:28:39,960 --> 00:28:41,910 method in high school? 430 00:28:41,910 --> 00:28:43,920 Yes, alright good. 431 00:28:43,920 --> 00:28:47,920 All is not lost with the American educational system. 432 00:28:47,920 --> 00:28:51,510 So what does the scientific method tell us to do? 433 00:28:51,510 --> 00:28:55,680 Well it says you first start by studying 434 00:28:55,680 --> 00:29:05,100 the available data. 435 00:29:05,100 --> 00:29:12,680 In this case, the available data are the test results. 436 00:29:12,680 --> 00:29:17,530 And by the way, I mean all the test results. 437 00:29:17,530 --> 00:29:20,850 Not just the one where it didn't work, but also the ones 438 00:29:20,850 --> 00:29:23,210 where it did. 439 00:29:23,210 --> 00:29:24,920 Because maybe the program worked on some 440 00:29:24,920 --> 00:29:27,050 inputs and not on others. 
441 00:29:27,050 --> 00:29:30,540 And maybe by understanding why it worked on a and not on b, 442 00:29:30,540 --> 00:29:33,380 you'll get a lot of insight that you won't if you just 443 00:29:33,380 --> 00:29:35,800 focus on the bug. 444 00:29:35,800 --> 00:29:37,790 You'll also feel a little bit better knowing your program 445 00:29:37,790 --> 00:29:42,130 works on at least something. 446 00:29:42,130 --> 00:29:44,670 The other big piece of available data we have is, of 447 00:29:44,670 --> 00:29:49,080 course, the program text. 448 00:29:49,080 --> 00:29:52,920 As you study the program text, keep in mind that you 449 00:29:52,920 --> 00:29:55,980 don't understand it. 450 00:29:55,980 --> 00:29:59,280 Because if you really did, you wouldn't have the bug. 451 00:29:59,280 --> 00:30:06,990 So read it with sort of a skeptical eye. 452 00:30:06,990 --> 00:30:17,280 You then form a hypothesis consistent with all the data. 453 00:30:17,280 --> 00:30:22,410 Not just some of the data, but all of the data. 454 00:30:22,410 --> 00:30:42,790 And then you design and run a repeatable experiment. 455 00:30:42,790 --> 00:30:46,090 Now what is the thing we learned in high school about 456 00:30:46,090 --> 00:30:48,920 how to design these experiments? 457 00:30:48,920 --> 00:30:53,600 What must this experiment have the potential to do, to be a 458 00:30:53,600 --> 00:31:01,620 valid scientific experiment? 459 00:31:01,620 --> 00:31:03,100 Somebody? 460 00:31:03,100 --> 00:31:05,680 What's the key thing? 461 00:31:05,680 --> 00:31:14,940 It must have the potential to refute the hypothesis. 462 00:31:14,940 --> 00:31:20,120 It's not a valid experiment if it has no chance of showing 463 00:31:20,120 --> 00:31:23,560 that my hypothesis is flawed. 464 00:31:23,560 --> 00:31:28,450 Otherwise why bother running it? 465 00:31:28,450 --> 00:31:31,260 So it has to have that.
466 00:31:31,260 --> 00:31:34,240 Typically it's nice if it can have useful 467 00:31:34,240 --> 00:31:38,920 intermediate results. 468 00:31:38,920 --> 00:31:42,320 Not just one answer at the end. 469 00:31:42,320 --> 00:31:45,660 So we can sort of check the progress of the code. 470 00:31:45,660 --> 00:31:56,240 And we must know what the result is supposed to be. 471 00:31:56,240 --> 00:31:58,560 Typically when you run an experiment, you say, and I 472 00:31:58,560 --> 00:32:01,310 think the answer will be x. 473 00:32:01,310 --> 00:32:08,790 If it's not x, you've refuted the hypothesis. 474 00:32:08,790 --> 00:32:10,060 This is the place where people 475 00:32:10,060 --> 00:32:14,480 typically slip up in debugging. 476 00:32:14,480 --> 00:32:18,730 They don't think in advance what they expect 477 00:32:18,730 --> 00:32:20,530 the result to be. 478 00:32:20,530 --> 00:32:24,310 And therefore, they are not systematic about interpreting 479 00:32:24,310 --> 00:32:27,660 the results. 480 00:32:27,660 --> 00:32:31,930 So when someone comes to me, and they're about to do a 481 00:32:31,930 --> 00:32:36,070 test, I ask them, what do you expect your program to do? 482 00:32:36,070 --> 00:32:38,540 And if they can't answer that question, I say well, before 483 00:32:38,540 --> 00:32:43,830 you even run it, have an answer to that. 484 00:32:43,830 --> 00:32:47,170 Why might repeatability be an issue? 485 00:32:47,170 --> 00:32:51,030 Well as we'll see later in the term, we're going to use a lot 486 00:32:51,030 --> 00:32:54,820 of randomness in a lot of our programs. Where we essentially 487 00:32:54,820 --> 00:32:58,550 do the equivalent of flipping coins or rolling dice. 488 00:32:58,550 --> 00:33:00,880 And so the program may do different things 489 00:33:00,880 --> 00:33:03,270 on different runs. 490 00:33:03,270 --> 00:33:05,540 We'll see a lot of that, because it's used a lot in 491 00:33:05,540 --> 00:33:07,280 modern computing. 
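One common way to make a program that uses randomness repeatable is to fix the seed of the random number generator. The sketch below is only an illustration of that idea, not code from the handout; the function name `roll_dice` and its parameters are made up for the example.

```python
import random


def roll_dice(num_rolls, seed=None):
    """Return num_rolls simulated die rolls (hypothetical helper).

    Passing a fixed seed makes every run produce the same sequence,
    which is what a repeatable experiment needs.
    """
    rng = random.Random(seed)  # private generator, other code is unaffected
    return [rng.randint(1, 6) for _ in range(num_rolls)]


first = roll_dice(5, seed=0)
second = roll_dice(5, seed=0)
print(first == second)  # same seed, so the two runs agree
```

With `seed=None` the generator is seeded from the system, so runs differ again; that is the normal mode once the debugging is done.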
492 00:33:07,280 --> 00:33:10,130 And so you have to figure out how to take that randomness 493 00:33:10,130 --> 00:33:12,080 out of the experiment. 494 00:33:12,080 --> 00:33:17,160 And yet get a valid test. Sometimes it can be timing. 495 00:33:17,160 --> 00:33:18,960 If you're running multiple processes. 496 00:33:18,960 --> 00:33:22,050 That's why your operating systems and your personal 497 00:33:22,050 --> 00:33:25,680 computers often crash for no apparent reason. 498 00:33:25,680 --> 00:33:28,390 Just because two things happen to, once in a while, occur at 499 00:33:28,390 --> 00:33:31,880 the same time. 500 00:33:31,880 --> 00:33:34,700 And often there's human input. 501 00:33:34,700 --> 00:33:37,800 And people have to type things out of it. 502 00:33:37,800 --> 00:33:41,150 So you want to get rid of that. 503 00:33:41,150 --> 00:33:43,520 And we'll talk more about this later. 504 00:33:43,520 --> 00:33:46,810 Particularly when we get to using randomness. 505 00:33:46,810 --> 00:33:49,780 About how to debug programs where random 506 00:33:49,780 --> 00:33:52,680 choices are being made. 507 00:33:52,680 --> 00:33:54,170 Now let's think about designing 508 00:33:54,170 --> 00:33:58,250 the experiment itself. 509 00:33:58,250 --> 00:34:01,750 The goal here, there are two goals. 510 00:34:01,750 --> 00:34:02,470 Or more than two. 511 00:34:02,470 --> 00:34:11,040 One is to find the simplest input that 512 00:34:11,040 --> 00:34:17,060 will provoke the bug. 513 00:34:17,060 --> 00:34:20,090 So it's often the case that a program will run a long time, 514 00:34:20,090 --> 00:34:22,800 and then suddenly a bug will show up. 515 00:34:22,800 --> 00:34:26,000 But you don't want to have to run it a long time, every time 516 00:34:26,000 --> 00:34:27,670 you have a hypothesis. 517 00:34:27,670 --> 00:34:30,180 So you try and find a smaller input that 518 00:34:30,180 --> 00:34:32,550 will produce the problem. 
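The idea of shrinking a failing input by halves can be sketched roughly as follows. This is an illustration under assumptions, not part of the lecture's handout: `shrink` and `fails` are hypothetical names, and `fails` stands for whatever check tells you the bug still shows up on a given input.

```python
def shrink(failing_input, fails):
    """Try to cut a failing input down by keeping one half at a time.

    failing_input: a sequence (string or list) known to provoke the bug.
    fails: a function that returns True if the bug shows up on its argument.
    The result still provokes the bug, but it is not guaranteed to be
    the smallest possible failing input.
    """
    current = failing_input
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            break  # neither half alone provokes the bug, so stop here
    return current


# Hypothetical bug: the program misbehaves whenever the hand contains 'q'.
def contains_q(hand):
    return 'q' in hand


print(shrink('abcdefghijkq', contains_q))  # prints q
```

When neither half fails on its own, the loop simply stops; at that point you would have to shrink more carefully by hand.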
519 00:34:32,550 --> 00:34:35,860 So if your word game doesn't work when the words are 12 520 00:34:35,860 --> 00:34:41,480 letters long, instead of continuing to debug 12-letter 521 00:34:41,480 --> 00:34:47,390 hands, see if you can make it fail on a three-letter hand. 522 00:34:47,390 --> 00:34:50,240 If you can figure out why it fails on three letters instead 523 00:34:50,240 --> 00:34:53,410 of 12, you'll be more than halfway 524 00:34:53,410 --> 00:34:58,600 to solving the problem. 525 00:34:58,600 --> 00:35:01,730 What I typically do is I start with the input that provoked 526 00:35:01,730 --> 00:35:05,490 the problem, and I keep making it smaller and smaller. 527 00:35:05,490 --> 00:35:12,170 And see if I can't get it to show up. 528 00:35:12,170 --> 00:35:15,290 The other thing you want to do is find the part of the 529 00:35:15,290 --> 00:35:18,410 program that is most likely at fault. 530 00:35:18,410 --> 00:35:22,720 In both of these cases, I strongly 531 00:35:22,720 --> 00:35:29,330 recommend binary search. 532 00:35:29,330 --> 00:35:35,960 We've talked about this binary search a lot already. 533 00:35:35,960 --> 00:35:40,150 Again, the trick is, if you can get rid of half of the 534 00:35:40,150 --> 00:35:44,790 data at each shot, or half of the code at each shot, you'll 535 00:35:44,790 --> 00:35:48,890 quickly converge on where the problem is. 536 00:35:48,890 --> 00:35:52,930 So I now want to work through an example where we can see 537 00:35:52,930 --> 00:35:53,630 this happening. 538 00:35:53,630 --> 00:35:57,960 So this is the example on the handout. 539 00:35:57,960 --> 00:36:07,420 I've got a little program called Silly. 540 00:36:07,420 --> 00:36:09,820 And it's called Silly because it's really 541 00:36:09,820 --> 00:36:12,990 a rather ugly program. 542 00:36:12,990 --> 00:36:14,760 It's certainly not the right way to write a 543 00:36:14,760 --> 00:36:20,870 program to do this.
544 00:36:20,870 --> 00:36:25,590 But it will let us illustrate a few points. 545 00:36:25,590 --> 00:36:31,270 So the trick, what we're going to go through here, is this 546 00:36:31,270 --> 00:36:34,210 whole scientific process. 547 00:36:34,210 --> 00:36:37,400 And see what's going on. 548 00:36:37,400 --> 00:36:47,420 So let's try running Silly. 549 00:36:47,420 --> 00:36:50,740 So this is to test whether a list is a palindrome. 550 00:36:50,740 --> 00:36:54,270 So we'll put one as the first element, maybe a 551 00:36:54,270 --> 00:36:56,610 is the second element. 552 00:36:56,610 --> 00:37:01,350 And one is the third element. 553 00:37:01,350 --> 00:37:02,810 And just return, it's done. 554 00:37:02,810 --> 00:37:04,090 It is a palindrome. 555 00:37:04,090 --> 00:37:04,810 That makes sense. 556 00:37:04,810 --> 00:37:08,150 The list one a one reads the same from the 557 00:37:08,150 --> 00:37:10,610 front or from the back. 558 00:37:10,610 --> 00:37:11,860 So that's good. 559 00:37:11,860 --> 00:37:14,530 Making some progress. 560 00:37:14,530 --> 00:37:19,540 Let's try it again. 561 00:37:19,540 --> 00:37:28,930 And now let's do one, a, two. 562 00:37:28,930 --> 00:37:31,140 Whoops. 563 00:37:31,140 --> 00:37:33,240 It tells me it is a palindrome. 564 00:37:33,240 --> 00:37:37,780 Well, it isn't really. 565 00:37:37,780 --> 00:37:40,650 I have a bug. 566 00:37:40,650 --> 00:37:43,360 Alright. 567 00:37:43,360 --> 00:37:44,380 Now what do I do? 568 00:37:44,380 --> 00:37:46,560 Well I'm going to use binary search to see if I 569 00:37:46,560 --> 00:37:50,030 can't find this bug. 570 00:37:50,030 --> 00:37:54,330 As I go through, I'm going to try and eliminate half of the 571 00:37:54,330 --> 00:37:59,330 code at each step. 572 00:37:59,330 --> 00:38:02,610 And the way I'm going to do that is by printing 573 00:38:02,610 --> 00:38:09,100 intermediate values, as I go part way through the code.
574 00:38:09,100 --> 00:38:14,820 I'm going to try and predict what the value is going to be. 575 00:38:14,820 --> 00:38:20,760 And then see if, indeed, I get what I predicted. 576 00:38:20,760 --> 00:38:24,540 Now, as I do this, I'm going to use binary search. 577 00:38:24,540 --> 00:38:28,050 I'm going to start somewhere near the middle of the code. 578 00:38:28,050 --> 00:38:32,640 Again, a lot of times, people don't do that. 579 00:38:32,640 --> 00:38:35,390 And they'll test an intermediate value near the 580 00:38:35,390 --> 00:38:38,830 end or near the beginning. 581 00:38:38,830 --> 00:38:42,680 Kind of in the hope of getting there in one shot. 582 00:38:42,680 --> 00:38:45,540 And that's like kind of hoping that the element you're 583 00:38:45,540 --> 00:38:47,750 searching for is the first in the list or the last in the 584 00:38:47,750 --> 00:38:50,560 list. Maybe. 585 00:38:50,560 --> 00:38:54,620 But part of the process of being systematic is not 586 00:38:54,620 --> 00:38:57,690 assuming that I'm going to get a lucky guess. 587 00:38:57,690 --> 00:39:00,490 But not even thinking really hard at this point. 588 00:39:00,490 --> 00:39:03,800 But just pruning the search space. 589 00:39:03,800 --> 00:39:08,540 Getting rid of half at each step. 590 00:39:08,540 --> 00:39:09,250 Alright. 591 00:39:09,250 --> 00:39:12,040 So let's start with the bisection. 592 00:39:12,040 --> 00:39:14,390 So we're going to choose a point about in the middle of 593 00:39:14,390 --> 00:39:16,950 my program. 594 00:39:16,950 --> 00:39:19,330 That's close to the middle. 595 00:39:19,330 --> 00:39:21,350 It might even be the middle. 596 00:39:21,350 --> 00:39:24,030 And we're going to see, well all right. 597 00:39:24,030 --> 00:39:26,600 The only thing I've done in this part of the program, now 598 00:39:26,600 --> 00:39:31,660 I'm going to go and read the code, is I've gotten the user 599 00:39:31,660 --> 00:39:34,410 to input a bunch of data.
600 00:39:34,410 --> 00:39:38,510 And built up the list corresponding to the three 601 00:39:38,510 --> 00:39:41,630 items that the user entered. 602 00:39:41,630 --> 00:39:51,040 So the only intermediate value I have here is really res. 603 00:39:51,040 --> 00:39:53,960 So I'm going to, just so when I'm finished I know what it is 604 00:39:53,960 --> 00:39:59,860 that I think I've printed. 605 00:39:59,860 --> 00:40:04,310 But in fact maybe I'll do even more than that here. 606 00:40:04,310 --> 00:40:12,300 Let me say what I think it should be. 607 00:40:12,300 --> 00:40:14,680 And then we'll see if it is. 608 00:40:14,680 --> 00:40:20,220 So I think I put in one a two, right? 609 00:40:20,220 --> 00:40:22,510 Or one a two? 610 00:40:22,510 --> 00:40:32,260 So it should be something like one, a, two. 611 00:40:32,260 --> 00:40:35,330 So I predicted what answer I'm expecting to get. 612 00:40:35,330 --> 00:40:38,640 And I've put it in my debugging code. 613 00:40:38,640 --> 00:40:45,400 And now I'll run it and see what we get. 614 00:40:45,400 --> 00:40:49,880 We'll save it. 615 00:40:49,880 --> 00:40:51,850 Well all right, a syntax error. 616 00:40:51,850 --> 00:40:53,680 This happens. 617 00:40:53,680 --> 00:40:55,570 And there's a syntax error. 618 00:40:55,570 --> 00:40:56,150 I see. 619 00:40:56,150 --> 00:40:57,520 Because I've got a quote in a quote. 620 00:40:57,520 --> 00:41:22,870 Alright I'm just going to do that. 621 00:41:22,870 --> 00:41:24,960 What I expected. 622 00:41:24,960 --> 00:41:27,640 So what have I learned? 623 00:41:27,640 --> 00:41:34,030 I've learned that with high probability, the error is not 624 00:41:34,030 --> 00:41:37,720 in the first part of the program. 625 00:41:37,720 --> 00:41:41,910 So I can now ignore that. 626 00:41:41,910 --> 00:41:46,060 So now I have these six lines. 627 00:41:46,060 --> 00:41:52,760 So we'll try and go in the middle of that. 628 00:41:52,760 --> 00:41:55,650 See if we can find it here. 
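The first probe described above, printing an intermediate value next to the value predicted in advance, might look something like the sketch below. The handout's Silly program is not reproduced in this transcript, so this is a stand-in: the helper name `report` is made up, and a fixed list of strings takes the place of the three items the user typed.

```python
def report(label, actual, expected):
    """Print an intermediate value beside the prediction made before the run.

    Returns True when the prediction held, False when it was refuted.
    """
    ok = (actual == expected)
    status = 'as predicted' if ok else 'PREDICTION REFUTED'
    print(label, '=', actual, '| expected', expected, '|', status)
    return ok


# Stand-in for the first half of the program: build res from the inputs.
res = []
for elem in ['1', 'a', '2']:  # assumed inputs, in place of typing them
    res.append(elem)

# Probe near the middle of the code: say what res should be, then check.
report('res', res, ['1', 'a', '2'])
```

Writing the expected value into the probe forces the prediction to be made before the run, which is the step people tend to skip.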
629 00:41:55,650 --> 00:42:02,790 And notice, by the way, that I commented out the previous 630 00:42:02,790 --> 00:42:05,420 debugging line, rather than got rid of it. 631 00:42:05,420 --> 00:42:08,670 Since I'm not sure I won't need to go back to it. 632 00:42:08,670 --> 00:42:14,230 So what should I look at here? 633 00:42:14,230 --> 00:42:16,150 Well there are a couple of interesting intermediate 634 00:42:16,150 --> 00:42:20,220 values here, right? 635 00:42:20,220 --> 00:42:29,190 There's tmp. 636 00:42:29,190 --> 00:42:35,720 And there's res. 637 00:42:35,720 --> 00:42:40,540 Never type kneeling. 638 00:42:40,540 --> 00:42:40,870 Right? 639 00:42:40,870 --> 00:42:44,230 I find something to tmp. 640 00:42:44,230 --> 00:42:48,800 And I need to make sure maybe I haven't messed up res. 641 00:42:48,800 --> 00:42:51,270 Now it would be easy to assume, don't bother looking 642 00:42:51,270 --> 00:42:53,020 at [UNINTELLIGIBLE]. 643 00:42:53,020 --> 00:42:55,870 Because the code doesn't change res. 644 00:42:55,870 --> 00:42:59,150 Well remember that I started this with a bug. 645 00:42:59,150 --> 00:43:02,240 That means it was something I didn't understand. 646 00:43:02,240 --> 00:43:05,740 So I'm going to be cautious and systematic. 647 00:43:05,740 --> 00:43:08,910 And say let's just print them both. 648 00:43:08,910 --> 00:43:17,380 And see whether they're okay. 649 00:43:17,380 --> 00:43:31,650 Now, let's do this. 650 00:43:31,650 --> 00:43:39,030 So it says tmp is two a one, and res is two a one. 651 00:43:39,030 --> 00:43:42,300 Well let's think about it. 652 00:43:42,300 --> 00:43:44,630 Is this what we wanted, here? 653 00:43:44,630 --> 00:43:49,290 What's the basic idea behind this program? 654 00:43:49,290 --> 00:43:51,780 How is it attempting to work?
655 00:43:51,780 --> 00:43:55,050 Well what it's attempting to do, and now is when I have to 656 00:43:55,050 --> 00:43:57,560 stand back and form a hypothesis and think about 657 00:43:57,560 --> 00:44:01,730 what's going on, is it gets in the list, it reverses the 658 00:44:01,730 --> 00:44:04,720 list, and then sees whether the list and the 659 00:44:04,720 --> 00:44:05,630 reverse were identical. 660 00:44:05,630 --> 00:44:10,900 If so it was a palindrome, otherwise it wasn't. 661 00:44:10,900 --> 00:44:18,300 So I've now done this, and what do you think? 662 00:44:18,300 --> 00:44:25,340 Is this good or bad? 663 00:44:25,340 --> 00:44:28,990 Is this what I should be getting? 664 00:44:28,990 --> 00:44:29,450 No. 665 00:44:29,450 --> 00:44:30,020 What's wrong? 666 00:44:30,020 --> 00:44:32,710 Somebody? Yeah. 667 00:44:32,710 --> 00:44:35,300 STUDENT: [UNINTELLIGIBLE] 668 00:44:35,300 --> 00:44:37,140 PROFESSOR: Yeah. 669 00:44:37,140 --> 00:44:42,330 Somehow I wanted to -- 670 00:44:42,330 --> 00:44:47,110 Got to work on those hands. 671 00:44:47,110 --> 00:44:50,530 I didn't want to change res. 672 00:44:50,530 --> 00:44:57,370 So, I now know that the bug has got to be between these 673 00:44:57,370 --> 00:45:00,100 two print statements. 674 00:45:00,100 --> 00:45:03,000 I'm narrowing it down. 675 00:45:03,000 --> 00:45:05,200 It's getting a little silly, but you know I'm going to 676 00:45:05,200 --> 00:45:08,930 really be persistent and just follow the rules here of 677 00:45:08,930 --> 00:45:14,880 binary search, rather than jumping to conclusions. 678 00:45:14,880 --> 00:45:21,130 Well clearly what I probably want to do here is what? 679 00:45:21,130 --> 00:45:31,930 Print these same two things. 680 00:45:31,930 --> 00:45:40,390 See what I get. 681 00:45:40,390 --> 00:45:41,910 Whoops. 682 00:45:41,910 --> 00:45:43,890 I have to, of course, do that.
683 00:45:43,890 --> 00:45:46,130 Otherwise it just tells me that Silly 684 00:45:46,130 --> 00:45:55,570 happens to be a function. 685 00:45:55,570 --> 00:45:57,200 Alright. 686 00:45:57,200 --> 00:45:58,900 How do I feel about this result? 687 00:45:58,900 --> 00:46:02,550 I feel pretty good here. 688 00:46:02,550 --> 00:46:02,810 Right? 689 00:46:02,810 --> 00:46:06,180 The idea was to make a copy of res in tmp. 690 00:46:06,180 --> 00:46:08,940 And sure enough, they're both the same. 691 00:46:08,940 --> 00:46:12,250 What I expected them to be. 692 00:46:12,250 --> 00:46:14,040 So I know the bug is not above. 693 00:46:14,040 --> 00:46:17,120 Now I'm really honing in. 694 00:46:17,120 --> 00:46:25,340 I now know it's got to be between these two statements. 695 00:46:25,340 --> 00:46:50,380 So let's put it there. 696 00:46:50,380 --> 00:46:50,620 Aha. 697 00:46:50,620 --> 00:46:52,080 It's gone wrong. 698 00:46:52,080 --> 00:46:55,520 So now I've narrowed the bug down to one place. 699 00:46:55,520 --> 00:47:03,160 I know exactly which statement it's in. 700 00:47:03,160 --> 00:47:05,580 So something has happened there that 701 00:47:05,580 --> 00:47:09,700 wasn't what I expected. 702 00:47:09,700 --> 00:47:13,680 Who wants to tell me what that bug is? 703 00:47:13,680 --> 00:47:13,950 Yeah? 704 00:47:13,950 --> 00:47:25,890 STUDENT: [UNINTELLIGIBLE]. 705 00:47:25,890 --> 00:47:29,880 PROFESSOR: Right. 706 00:47:29,880 --> 00:47:33,010 Bad throw, good catch. 707 00:47:33,010 --> 00:47:37,020 So this is a classic error. 708 00:47:37,020 --> 00:47:40,050 I've not made a copy of the list. I've got an alias of the 709 00:47:40,050 --> 00:47:42,460 list. This was the thing that tripped up many 710 00:47:42,460 --> 00:47:44,440 of you on the quiz. 711 00:47:44,440 --> 00:47:58,840 And really what I should have done is this. 712 00:47:58,840 --> 00:48:09,040 Now we'll try it. 713 00:48:09,040 --> 00:48:10,410 Ha.
714 00:48:10,410 --> 00:48:16,420 It's not a palindrome. 715 00:48:16,420 --> 00:48:21,220 So small silly little exercise, but I'm hoping that 716 00:48:21,220 --> 00:48:23,890 you've sort of seen how by being patient. 717 00:48:23,890 --> 00:48:26,790 Patience is an important part of the debugging process. 718 00:48:26,790 --> 00:48:28,270 I have not rushed. 719 00:48:28,270 --> 00:48:31,770 I've calmly and slowly narrowed the search. 720 00:48:31,770 --> 00:48:35,020 Found where the statement is, and then fixed it. 721 00:48:35,020 --> 00:48:38,230 And now I'm going to go hunt through the rest of my code to 722 00:48:38,230 --> 00:48:42,030 look for places where I used assignment, when I should have 723 00:48:42,030 --> 00:48:45,180 used cloning as part of the assignment. 724 00:48:45,180 --> 00:48:48,485 The bug, the family here, is failure to clone when I should 725 00:48:48,485 --> 00:48:50,970 have cloned. 726 00:48:50,970 --> 00:48:54,080 Thursday we'll talk a little bit more about what to do once 727 00:48:54,080 --> 00:48:58,380 we've found the bug, and then back to algorithms.
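The aliasing bug and its fix can be reconstructed roughly as below. The actual Silly program from the handout does not appear in this transcript, so the two functions here are a hypothetical reconstruction based only on what the lecture says: the names `res` and `tmp` come from the lecture, while the function names and the use of a parameter in place of typed input are assumptions.

```python
def is_palindrome_buggy(items):
    """Hypothetical reconstruction of the bug: tmp is an alias of res."""
    res = list(items)   # the list built up from the input
    tmp = res           # alias, not a copy: both names refer to one list
    tmp.reverse()       # reverses that single shared list in place
    return res == tmp   # always True, since a list equals itself


def is_palindrome_fixed(items):
    """Same logic, but tmp is a clone of res."""
    res = list(items)
    tmp = res[:]        # slicing makes a new list with the same elements
    tmp.reverse()       # only the clone is reversed
    return res == tmp


print(is_palindrome_buggy([1, 'a', 2]))  # True, which is the bug
print(is_palindrome_fixed([1, 'a', 2]))  # False
print(is_palindrome_fixed([1, 'a', 1]))  # True
```

The buggy version also explains the earlier probe in which tmp and res both printed as two a one: reversing through one name changed what the other name saw.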