1 00:00:00,790 --> 00:00:03,190 The following content is provided under a Creative 2 00:00:03,190 --> 00:00:04,730 Commons license. 3 00:00:04,730 --> 00:00:07,030 Your support will help MIT OpenCourseWare 4 00:00:07,030 --> 00:00:11,390 continue to offer high quality educational resources for free. 5 00:00:11,390 --> 00:00:13,990 To make a donation, or view additional materials 6 00:00:13,990 --> 00:00:17,880 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,880 --> 00:00:31,279 at ocw.mit.edu 8 00:00:31,279 --> 00:00:32,570 PROFESSOR: All right, everyone. 9 00:00:32,570 --> 00:00:33,250 Good afternoon. 10 00:00:33,250 --> 00:00:35,250 Let's get started. 11 00:00:35,250 --> 00:00:38,160 So today's lecture will be on testing, debugging, and then 12 00:00:38,160 --> 00:00:40,320 exceptions and assertions. 13 00:00:40,320 --> 00:00:44,100 So before we begin, let's start with an analogy to sort of come 14 00:00:44,100 --> 00:00:46,110 back to real life for a second. 15 00:00:46,110 --> 00:00:48,510 So I've made soup before. 16 00:00:48,510 --> 00:00:50,670 Perhaps you've made soup before. 17 00:00:50,670 --> 00:00:54,940 Let's say you're making soup in this big pot here. 18 00:00:54,940 --> 00:00:57,690 And it turns out that bugs keep falling into your soup 19 00:00:57,690 --> 00:00:58,570 from the ceiling. 20 00:00:58,570 --> 00:00:59,070 All right. 21 00:00:59,070 --> 00:01:02,130 Quick question to the audience. 22 00:01:02,130 --> 00:01:04,122 What do you do if you encountered this issue? 23 00:01:04,122 --> 00:01:07,330 AUDIENCE: [INTERPOSING VOICES] 24 00:01:07,330 --> 00:01:08,490 PROFESSOR: All right. 25 00:01:08,490 --> 00:01:09,080 Hands up. 26 00:01:09,080 --> 00:01:10,496 One one at a time. 27 00:01:10,496 --> 00:01:11,370 Anyone have any idea? 28 00:01:11,370 --> 00:01:11,685 Yeah. 29 00:01:11,685 --> 00:01:12,540 AUDIENCE: Eat it. 30 00:01:12,540 --> 00:01:13,290 PROFESSOR: Eat it. 
31 00:01:13,290 --> 00:01:15,190 You want to eat it anyway. OK. 32 00:01:15,190 --> 00:01:15,690 All right. 33 00:01:15,690 --> 00:01:18,139 We're going for an analogy here with computer programming. 34 00:01:18,139 --> 00:01:20,430 I don't know what you'd do if you have a buggy program, 35 00:01:20,430 --> 00:01:22,379 I guess you just release it to the customer 36 00:01:22,379 --> 00:01:23,420 and they'd complain, but. 37 00:01:23,420 --> 00:01:24,180 OK. 38 00:01:24,180 --> 00:01:24,790 What else? 39 00:01:24,790 --> 00:01:25,290 Yeah. 40 00:01:25,290 --> 00:01:27,050 AUDIENCE: [INAUDIBLE] Cover the soup? 41 00:01:27,050 --> 00:01:27,900 PROFESSOR: Cover the soup. 42 00:01:27,900 --> 00:01:28,860 That's a good suggestion. 43 00:01:28,860 --> 00:01:29,420 Yeah. 44 00:01:29,420 --> 00:01:33,120 So you can cover the soup, so put a lid on it. 45 00:01:33,120 --> 00:01:35,935 Sometimes you'd have to open up, take the lid off, 46 00:01:35,935 --> 00:01:37,560 right, to check to make sure it's done. 47 00:01:37,560 --> 00:01:39,150 To taste it, add things. 48 00:01:39,150 --> 00:01:41,099 So bugs might fall in in between there. 49 00:01:41,099 --> 00:01:42,640 But covering the soup is a good idea. 50 00:01:42,640 --> 00:01:43,206 What else. 51 00:01:43,206 --> 00:01:43,705 Yeah. 52 00:01:43,705 --> 00:01:44,967 AUDIENCE: Debug it. 53 00:01:44,967 --> 00:01:45,800 PROFESSOR: Debug it. 54 00:01:49,240 --> 00:01:50,910 I wish I had something for that answer. 55 00:01:50,910 --> 00:01:51,410 All right. 56 00:01:51,410 --> 00:01:52,284 That's a good answer. 57 00:01:52,284 --> 00:01:52,900 Yeah. 58 00:01:52,900 --> 00:01:54,775 AUDIENCE: Take all the food out of your house 59 00:01:54,775 --> 00:01:56,654 so there's no-- nothing for the bugs to eat. 60 00:01:56,654 --> 00:01:58,695 PROFESSOR: So take all the food out of your house 61 00:01:58,695 --> 00:02:01,290 so there's nothing for the bugs to eat. 
62 00:02:01,290 --> 00:02:04,530 That's sort of the equivalent of cleaning, 63 00:02:04,530 --> 00:02:07,646 like doing a mass cleaning of your entire house. 64 00:02:07,646 --> 00:02:09,020 That's a good, that's a good one. 65 00:02:09,020 --> 00:02:12,810 That's sort of eliminating the source of the bugs, right? 66 00:02:12,810 --> 00:02:13,320 What else? 67 00:02:16,190 --> 00:02:16,760 Yeah, John. 68 00:02:16,760 --> 00:02:18,176 AUDIENCE: Decide it's high protein 69 00:02:18,176 --> 00:02:19,670 and declare it a feature. 70 00:02:19,670 --> 00:02:21,530 PROFESSOR: Decide it's high protein 71 00:02:21,530 --> 00:02:23,330 and declare it a feature. 72 00:02:23,330 --> 00:02:26,911 That's probably what a lot of people would do, right? 73 00:02:26,911 --> 00:02:27,410 All right. 74 00:02:27,410 --> 00:02:28,040 Cool. 75 00:02:28,040 --> 00:02:34,340 So I wish computer debugging was as fun as taking bugs out 76 00:02:34,340 --> 00:02:34,940 of your soup. 77 00:02:34,940 --> 00:02:36,050 So what did we decide? 78 00:02:36,050 --> 00:02:38,390 Well we could check the soup for bugs. 79 00:02:38,390 --> 00:02:40,610 Keep the lid closed, that was a good suggestion. 80 00:02:40,610 --> 00:02:43,084 And cleaning your kitchen, which someone suggested. 81 00:02:43,084 --> 00:02:44,750 The equivalent of cleaning their kitchen 82 00:02:44,750 --> 00:02:46,730 was to just throw out all the food. 83 00:02:46,730 --> 00:02:48,570 I would take a mop and clean the floor, 84 00:02:48,570 --> 00:02:50,340 but yeah, that works too. 85 00:02:50,340 --> 00:02:57,267 So we can draw some parallels for this analogy 86 00:02:57,267 --> 00:02:58,350 with computer programming. 87 00:02:58,350 --> 00:03:01,590 So checking the soup is really equivalent to testing, right? 88 00:03:01,590 --> 00:03:04,980 You have a soup you think has bugs in it. 89 00:03:04,980 --> 00:03:05,700 Test it. 90 00:03:05,700 --> 00:03:06,810 Make sure there's no bugs. 91 00:03:06,810 --> 00:03:07,740 Continue on. 
92 00:03:07,740 --> 00:03:09,660 Keeping the lid closed. 93 00:03:09,660 --> 00:03:11,820 It's sort of this idea of defensive programming. 94 00:03:11,820 --> 00:03:14,260 So make sure that bugs don't fall in in the first place. 95 00:03:14,260 --> 00:03:16,440 Sometimes you have to open the lid 96 00:03:16,440 --> 00:03:20,254 to make sure that the soup tastes good or whatever. 97 00:03:20,254 --> 00:03:22,170 So that's equivalent to defensive programming. 98 00:03:22,170 --> 00:03:23,961 So try not to have bugs in the first place, 99 00:03:23,961 --> 00:03:25,984 but they might show up anyway. 100 00:03:25,984 --> 00:03:27,900 Cleaning the kitchen is eliminating the source 101 00:03:27,900 --> 00:03:29,191 of the bugs in the first place. 102 00:03:29,191 --> 00:03:33,050 This is actually really hard to do in programming. 103 00:03:33,050 --> 00:03:35,780 But you can still try to do it. 104 00:03:35,780 --> 00:03:36,890 OK. 105 00:03:36,890 --> 00:03:40,010 So let's talk a little bit about programming 106 00:03:40,010 --> 00:03:43,040 so far in 6.0001/6.00. 107 00:03:43,040 --> 00:03:46,350 So you expect, really, that you write a program, 108 00:03:46,350 --> 00:03:49,370 you maybe do a little debugging, and you run the program 109 00:03:49,370 --> 00:03:51,140 and it's perfect. 110 00:03:51,140 --> 00:03:51,920 Right? 111 00:03:51,920 --> 00:03:54,000 You just nailed it. 112 00:03:54,000 --> 00:03:56,780 But in reality you write this really complex piece of code 113 00:03:56,780 --> 00:04:01,470 and you go to run it and it crashes. 114 00:04:01,470 --> 00:04:02,390 Right? 115 00:04:02,390 --> 00:04:04,220 It's happened to me many times. 116 00:04:04,220 --> 00:04:06,920 It's happened to you many times. 117 00:04:06,920 --> 00:04:08,150 That's the reality. 118 00:04:08,150 --> 00:04:09,650 OK. 
119 00:04:09,650 --> 00:04:12,580 So today's lecture will go over some tips and tricks 120 00:04:12,580 --> 00:04:15,830 in debugging and how you can help make your life easier 121 00:04:15,830 --> 00:04:18,515 when you're writing programs so you don't end up 122 00:04:18,515 --> 00:04:19,640 like this little girl here. 123 00:04:19,640 --> 00:04:21,200 Disappointed beyond belief. 124 00:04:21,200 --> 00:04:23,100 All right. 125 00:04:23,100 --> 00:04:25,500 So at the heart of it all is really 126 00:04:25,500 --> 00:04:28,310 starting with a defensive programming attitude. 127 00:04:28,310 --> 00:04:29,130 OK. 128 00:04:29,130 --> 00:04:31,860 And this comes back to the idea of decomposition 129 00:04:31,860 --> 00:04:35,400 and abstraction that we talked about when we started-- when we 130 00:04:35,400 --> 00:04:37,350 did the lecture on functions. 131 00:04:37,350 --> 00:04:37,980 Right? 132 00:04:37,980 --> 00:04:41,250 So try to start out by modularizing your code, right? 133 00:04:41,250 --> 00:04:44,370 If you write your code in different blocks, 134 00:04:44,370 --> 00:04:46,440 documenting each different block, 135 00:04:46,440 --> 00:04:49,380 you're more likely to understand what's happening in your code 136 00:04:49,380 --> 00:04:51,990 later on and you'll be able to test it and debug 137 00:04:51,990 --> 00:04:54,170 it a lot easier. 138 00:04:54,170 --> 00:04:56,210 Speaking of testing and debugging, 139 00:04:56,210 --> 00:05:01,070 once you've written a program that's modular, 140 00:05:01,070 --> 00:05:03,690 you still have to test it. 141 00:05:03,690 --> 00:05:06,150 And the process of testing is really just 142 00:05:06,150 --> 00:05:08,760 coming up with inputs. 143 00:05:08,760 --> 00:05:11,880 Figuring out what outputs you expect. 144 00:05:11,880 --> 00:05:13,360 And then running your program. 145 00:05:13,360 --> 00:05:17,610 Does the output that the program give match what you expected? 
146 00:05:17,610 --> 00:05:19,930 If it does, great, you're done. 147 00:05:19,930 --> 00:05:22,950 But if it doesn't, you have to go to this debugging step. 148 00:05:22,950 --> 00:05:25,570 And the debugging step is the hardest part. 149 00:05:25,570 --> 00:05:28,471 And it's really just figuring out why the program crashed, 150 00:05:28,471 --> 00:05:30,720 or why the program didn't give you the answer that you 151 00:05:30,720 --> 00:05:33,860 expected it to give. 152 00:05:33,860 --> 00:05:36,200 So as I mentioned, the most important thing 153 00:05:36,200 --> 00:05:38,600 is to do defensive programming and to that end, 154 00:05:38,600 --> 00:05:42,200 you want to set yourself up for easy testing and debugging. 155 00:05:42,200 --> 00:05:44,020 Which really comes down to making sure 156 00:05:44,020 --> 00:05:46,850 that the code you write is modular. 157 00:05:46,850 --> 00:05:49,330 So write as many functions as you can. 158 00:05:49,330 --> 00:05:50,950 Document what the functions do. 159 00:05:50,950 --> 00:05:53,470 Document their constraints. 160 00:05:53,470 --> 00:05:56,080 And it'll make your life a little bit easier later on 161 00:05:56,080 --> 00:05:59,464 when you have to debug it. 162 00:05:59,464 --> 00:06:00,880 When do you want to start testing? 163 00:06:00,880 --> 00:06:03,430 Well first you have to make sure your program runs. 164 00:06:03,430 --> 00:06:05,860 So eliminate syntax errors and static semantic errors 165 00:06:05,860 --> 00:06:10,970 which, by the way, Python can easily catch for you. 166 00:06:10,970 --> 00:06:13,230 Once you've ensured that a piece of code runs, 167 00:06:13,230 --> 00:06:15,980 then you want to come up with some test cases. 168 00:06:15,980 --> 00:06:18,010 So this is pairs of input and output 169 00:06:18,010 --> 00:06:23,220 for what you expect the program to do. 
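The idea of test cases as pairs of input and expected output can be sketched in a few lines of Python. This is only an illustration: the function `square`, the list `test_cases`, and the helper `run_tests` are hypothetical names chosen for the example, not code from the lecture.

```python
def square(x):
    """Hypothetical function under test: returns x multiplied by itself."""
    return x * x


# Each test case pairs an input with the output we expect for that input.
test_cases = [(0, 0), (1, 1), (-3, 9), (2.5, 6.25)]


def run_tests(func, cases):
    """Return (input, expected, actual) for every case where func disagrees."""
    failures = []
    for given, expected in cases:
        actual = func(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


# An empty list means every output matched what we expected.
print(run_tests(square, test_cases))
```

If the list that comes back is not empty, each entry says which input failed and how, which is the starting point for the debugging step.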
170 00:06:23,220 --> 00:06:27,060 Once you have your test cases and a piece of code that runs, 171 00:06:27,060 --> 00:06:29,420 you can start doing tests. 172 00:06:29,420 --> 00:06:32,670 So there's three general classes of tests that you can do. 173 00:06:32,670 --> 00:06:35,120 The first is called unit testing. 174 00:06:35,120 --> 00:06:37,910 And if you've written functions, unit testings-- testing just 175 00:06:37,910 --> 00:06:41,210 makes sure that, for example, each function runs according 176 00:06:41,210 --> 00:06:43,680 to the specifications. 177 00:06:43,680 --> 00:06:45,970 So you do this multiple times. 178 00:06:45,970 --> 00:06:49,500 As you're testing each function, you might find a bug. 179 00:06:49,500 --> 00:06:52,230 At that point, you do regression testing. 180 00:06:52,230 --> 00:06:56,350 Come up with a test case that found that bug. 181 00:06:56,350 --> 00:07:00,160 And run all of the different pieces of your code again 182 00:07:00,160 --> 00:07:02,290 to make sure that when you fix the bug, 183 00:07:02,290 --> 00:07:06,430 you don't re-introduce new bugs into pieces of the code that 184 00:07:06,430 --> 00:07:08,610 had already run. 185 00:07:08,610 --> 00:07:10,130 So you do this a bunch of times. 186 00:07:10,130 --> 00:07:11,630 You do a little bit of unit testing, 187 00:07:11,630 --> 00:07:13,880 a little bit of regression testing, and keep doing that. 188 00:07:13,880 --> 00:07:16,129 At some point, you're ready to do integration testing. 189 00:07:16,129 --> 00:07:18,620 Which means, test your program as a whole. 190 00:07:18,620 --> 00:07:20,564 Does the overall program work? 191 00:07:20,564 --> 00:07:21,980 So this is the part where you take 192 00:07:21,980 --> 00:07:24,650 all of the individual pieces, put them together. 
193 00:07:24,650 --> 00:07:26,960 And integration testing tests to make sure 194 00:07:26,960 --> 00:07:31,550 that the interactions between all of the different pieces 195 00:07:31,550 --> 00:07:33,740 works as expected. 196 00:07:33,740 --> 00:07:35,130 If it does, great, you're done. 197 00:07:35,130 --> 00:07:36,080 But if it doesn't, then you'll have 198 00:07:36,080 --> 00:07:38,820 to go back to unit testing, and regression testing, and so on. 199 00:07:38,820 --> 00:07:42,230 So it's really a cycle of testing. 200 00:07:45,050 --> 00:07:47,880 So what are some testing approaches? 201 00:07:47,880 --> 00:07:50,689 The first, and this is probably most common with programs 202 00:07:50,689 --> 00:07:52,230 that involve numbers, is figuring out 203 00:07:52,230 --> 00:07:55,230 some natural boundaries for the numbers-- for the program, 204 00:07:55,230 --> 00:07:57,180 sorry. 205 00:07:57,180 --> 00:08:00,990 So for example, if I have a function is_bigger, 206 00:08:00,990 --> 00:08:04,380 and it compares if x is bigger than y, 207 00:08:04,380 --> 00:08:07,200 then some natural boundary, given the specification, 208 00:08:07,200 --> 00:08:09,810 is if x is less than y, x is greater than y, 209 00:08:09,810 --> 00:08:11,220 x is equal to y. 210 00:08:11,220 --> 00:08:13,440 Maybe throw in less than or equal to or greater 211 00:08:13,440 --> 00:08:15,430 or equal to, and so on. 212 00:08:15,430 --> 00:08:19,850 So that's just sort of an intuition about the problem. 213 00:08:19,850 --> 00:08:22,220 It's possible you have some problems for which there 214 00:08:22,220 --> 00:08:24,481 are no natural partitions. 215 00:08:24,481 --> 00:08:26,480 In which case, you might do some random testing, 216 00:08:26,480 --> 00:08:28,130 and then the more random testing you 217 00:08:28,130 --> 00:08:31,920 do, the greater the likelihood that your program is correct. 218 00:08:31,920 --> 00:08:36,174 But there's actually two more rigorous ways to do testing. 
219 00:08:36,174 --> 00:08:38,090 And one is black box testing and the other one 220 00:08:38,090 --> 00:08:41,659 is glass box testing. 221 00:08:41,659 --> 00:08:43,650 In black box testing, you're assuming 222 00:08:43,650 --> 00:08:46,150 you have the specifications to a function. 223 00:08:46,150 --> 00:08:48,599 So that's the docstring. 224 00:08:48,599 --> 00:08:50,640 All you're looking at is the docstring and coming 225 00:08:50,640 --> 00:08:53,100 up with some test cases based on that. 226 00:08:53,100 --> 00:08:55,530 In glass box testing, you have the code 227 00:08:55,530 --> 00:08:57,540 itself and you're trying to come up 228 00:08:57,540 --> 00:09:01,320 with some test cases that hit upon all of the possible paths 229 00:09:01,320 --> 00:09:04,041 through the code. 230 00:09:04,041 --> 00:09:04,540 All right. 231 00:09:04,540 --> 00:09:07,510 Let's look at an example for black box testing. 232 00:09:07,510 --> 00:09:13,470 I'm finding the square root of x to some close enough value 233 00:09:13,470 --> 00:09:16,710 given by this epsilon. 234 00:09:16,710 --> 00:09:18,460 And the idea here, notice I don't actually 235 00:09:18,460 --> 00:09:20,350 give you how this function's implemented. 236 00:09:20,350 --> 00:09:23,020 The idea is that you're just figuring out test cases 237 00:09:23,020 --> 00:09:26,760 based on the specification. 238 00:09:26,760 --> 00:09:28,660 And the great thing about black box testing 239 00:09:28,660 --> 00:09:31,630 is that whoever implements this function can implement it 240 00:09:31,630 --> 00:09:34,540 in whatever way they wish, they can use approximation method, 241 00:09:34,540 --> 00:09:37,180 they can use bisection method, it doesn't matter. 242 00:09:37,180 --> 00:09:39,820 The test cases that you come up with for this function 243 00:09:39,820 --> 00:09:41,230 are going to be exactly the same. 244 00:09:41,230 --> 00:09:41,730 Right? 245 00:09:41,730 --> 00:09:45,560 No matter what the implementation. 
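A black box test for a square root function like this one might look roughly as follows. This is a sketch under stated assumptions: the exact docstring is not given in the transcript, so the specification used here (the result squared lies within epsilon of x) is assumed, and the names `sqrt_approx`, `meets_spec`, and `black_box_cases` are hypothetical. The bisection body is only one possible implementation; the checks never look inside it.

```python
def sqrt_approx(x, eps):
    """Assumes x >= 0 and eps > 0 (assumed specification).
    Returns res such that x - eps <= res * res <= x + eps."""
    low, high = 0.0, max(1.0, x)
    guess = (low + high) / 2
    for _ in range(200):  # cap the iterations so the sketch always terminates
        if abs(guess * guess - x) <= eps:
            break
        if guess * guess < x:
            low = guess
        else:
            high = guess
        guess = (low + high) / 2
    return guess


def meets_spec(x, eps, res):
    """Check a result against the specification only, not the implementation."""
    return x - eps <= res * res <= x + eps


# (x, eps) pairs chosen from the specification alone.
black_box_cases = [
    (0.0, 1e-4),    # boundary of the allowed inputs
    (25.0, 1e-4),   # perfect square
    (0.25, 1e-4),   # x less than 1
    (2.0, 1e-4),    # irrational square root
    (2.0, 100.0),   # very large epsilon
    (2.0, 1e-9),    # very small epsilon
    (1e6, 1e-3),    # very large x
    (1e-6, 1e-8),   # very small x
]

for x, eps in black_box_cases:
    print(x, eps, meets_spec(x, eps, sqrt_approx(x, eps)))
```

Swapping the body of `sqrt_approx` for a different algorithm would leave `black_box_cases` and `meets_spec` unchanged, which is the point of testing against the specification.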
246 00:09:45,560 --> 00:09:50,490 So for this particular function, here's a sample set. 247 00:09:50,490 --> 00:09:53,030 We check the boundary, we check perfect squares, 248 00:09:53,030 --> 00:09:55,970 we can check some number that's less than 1, 249 00:09:55,970 --> 00:10:00,040 we can check maybe irrationals, and then you do extreme tests. 250 00:10:00,040 --> 00:10:03,730 So when either epsilon is really large or epsilon 251 00:10:03,730 --> 00:10:07,150 is really small, or x is really large or x is really small, 252 00:10:07,150 --> 00:10:09,420 and all the possible combinations of those. 253 00:10:11,804 --> 00:10:13,720 So the important thing about black box testing 254 00:10:13,720 --> 00:10:17,860 is that you are creating the test cases based 255 00:10:17,860 --> 00:10:20,400 on the specifications only. 256 00:10:20,400 --> 00:10:23,430 Glass box testing, you're using the code itself 257 00:10:23,430 --> 00:10:27,560 to guide your test cases. 258 00:10:27,560 --> 00:10:30,460 So if you have a piece of code and you come up 259 00:10:30,460 --> 00:10:32,980 with a test case that goes through every single possible 260 00:10:32,980 --> 00:10:38,950 combination of input-- of every single possible path 261 00:10:38,950 --> 00:10:43,700 through the code, then that test set is called path complete. 262 00:10:43,700 --> 00:10:46,190 The problem with this is when you encounter loops, 263 00:10:46,190 --> 00:10:48,040 for example. 264 00:10:48,040 --> 00:10:50,370 Every single possible path through a loop 265 00:10:50,370 --> 00:10:52,694 is maybe the code not going through the loop at all, 266 00:10:52,694 --> 00:10:54,360 going through once, going through twice, 267 00:10:54,360 --> 00:10:57,100 going through three times, four times, five times, and so on. 268 00:10:57,100 --> 00:10:57,900 Right? 269 00:10:57,900 --> 00:11:00,885 Which could be a very, very big test. 
270 00:11:00,885 --> 00:11:02,760 So instead there are actually some guidelines 271 00:11:02,760 --> 00:11:05,260 for when you're dealing with loops and things like that. 272 00:11:05,260 --> 00:11:08,130 So for branches, when you're doing glass box testing, 273 00:11:08,130 --> 00:11:10,890 it's important-- you should just exercise 274 00:11:10,890 --> 00:11:12,520 all of the parts of the conditional. 275 00:11:12,520 --> 00:11:14,103 So make sure you have a test case that 276 00:11:14,103 --> 00:11:16,800 goes through each part of the conditional. 277 00:11:16,800 --> 00:11:18,970 For for loops, make sure you have a test case where 278 00:11:18,970 --> 00:11:21,130 the loop is not entered at all, where the loop is entered one 279 00:11:21,130 --> 00:11:23,463 time, and when the loop is entered just some number more 280 00:11:23,463 --> 00:11:26,000 than once. 281 00:11:26,000 --> 00:11:28,070 For while loops, similar to for loops, 282 00:11:28,070 --> 00:11:29,990 except that make sure you have test cases that 283 00:11:29,990 --> 00:11:32,984 cover all of the possible ways to break out of the while loop. 284 00:11:32,984 --> 00:11:37,700 So if the while loop condition becomes false, or if maybe 285 00:11:37,700 --> 00:11:44,480 there's a break inside the while loop, and so on. 286 00:11:44,480 --> 00:11:47,840 So in this example, we have the absolute value of x. 287 00:11:47,840 --> 00:11:50,300 This is its specification and this is the implementation 288 00:11:50,300 --> 00:11:54,580 that someone decided to do for this function. 289 00:11:57,890 --> 00:11:59,910 So a path complete test set means 290 00:11:59,910 --> 00:12:02,394 that you want to have a test that 291 00:12:02,394 --> 00:12:04,060 goes through each one of these branches. 292 00:12:04,060 --> 00:12:07,200 So if x is less than minus 1, well, minus 2 293 00:12:07,200 --> 00:12:08,230 is less than minus 1. 294 00:12:08,230 --> 00:12:09,690 So that's good. 
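The absolute value example being discussed here can be sketched as follows. The function body is reconstructed from the spoken description (a single branch on x less than minus 1), so treat it as an assumption rather than the exact code on the slide; the name `buggy_abs` is hypothetical and chosen to avoid shadowing Python's built-in `abs`.

```python
def buggy_abs(x):
    """Assumes x is an int.
    Should return x if x >= 0 and -x otherwise."""
    if x < -1:  # the off-by-one condition described in the lecture
        return -x
    else:
        return x


# One test per branch makes the test set path complete.
print(buggy_abs(-2))  # takes the "if" branch, prints 2
print(buggy_abs(2))   # takes the "else" branch, prints 2

# The boundary value still slips through: this prints -1 instead of 1.
print(buggy_abs(-1))
```

Both branches are exercised by 2 and minus 2, yet the wrong answer only shows up at the boundary, so a glass box test set should include boundary values as well as one case per path.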
295 00:12:09,690 --> 00:12:15,815 And otherwise, which means pick a number greater than minus 1. 296 00:12:15,815 --> 00:12:18,610 So 2. 297 00:12:18,610 --> 00:12:22,290 So 2 and minus 2 are path complete. 298 00:12:22,290 --> 00:12:26,850 Yield path complete-- yields a path complete test suite. 299 00:12:26,850 --> 00:12:29,820 But notice that while we've hit upon every possible path 300 00:12:29,820 --> 00:12:32,730 through this code, we've actually missed a test case. 301 00:12:32,730 --> 00:12:33,450 Minus 1. 302 00:12:33,450 --> 00:12:37,620 So this code incorrectly classifies minus 1 303 00:12:37,620 --> 00:12:42,330 as returning minus 1, which is wrong. 304 00:12:42,330 --> 00:12:44,959 So for glass box testing, in addition to 305 00:12:44,959 --> 00:12:47,250 making sure you're going through all the possible paths 306 00:12:47,250 --> 00:12:49,080 through the code, you also want to make 307 00:12:49,080 --> 00:12:52,150 sure you hit upon any boundary condition. 308 00:12:52,150 --> 00:12:54,590 So in this case, for branches, minus 1 309 00:12:54,590 --> 00:12:57,510 is a boundary condition. 310 00:12:57,510 --> 00:13:01,140 So you've created a test suite, you've tested your program, 311 00:13:01,140 --> 00:13:03,950 chances are you found a bug. 312 00:13:03,950 --> 00:13:06,350 What do you do now? 313 00:13:06,350 --> 00:13:07,460 All right. 314 00:13:07,460 --> 00:13:11,600 Quick sort of detour into a little bit of history. 315 00:13:11,600 --> 00:13:13,380 The history of debugging. 316 00:13:13,380 --> 00:13:18,760 So 1947, this computer was built. 317 00:13:18,760 --> 00:13:22,470 And it was a computer that was very impressive for its day. 318 00:13:22,470 --> 00:13:27,000 It could do things like addition in 0.1 seconds. 319 00:13:27,000 --> 00:13:30,870 Things like multiplication in 0.7 seconds. 320 00:13:30,870 --> 00:13:36,450 And take the log of something in five seconds. 321 00:13:36,450 --> 00:13:38,850 So faster than a human, possibly. 
322 00:13:38,850 --> 00:13:42,810 But pretty slow for today's standards. 323 00:13:42,810 --> 00:13:44,550 And a group of engineers were working 324 00:13:44,550 --> 00:13:47,750 on running a program that found-- 325 00:13:47,750 --> 00:13:52,050 that was supposed to find the trigonometric function. 326 00:13:52,050 --> 00:13:55,780 And among them being this-- one of the first female scientists, 327 00:13:55,780 --> 00:13:57,702 Grace Hopper. 328 00:13:57,702 --> 00:13:59,410 And they found that their program was not 329 00:13:59,410 --> 00:14:01,100 working correctly. 330 00:14:01,100 --> 00:14:06,340 So they went through all of the panels and all of the relays 331 00:14:06,340 --> 00:14:10,680 in the computer, and they isolated a problem 332 00:14:10,680 --> 00:14:15,830 in panel F relay 70, where they found this moth. 333 00:14:15,830 --> 00:14:17,540 Just sitting in there. 334 00:14:17,540 --> 00:14:19,430 I think it was dead, probably electrocuted. 335 00:14:19,430 --> 00:14:23,120 But it was a moth that was impeding the calculation. 336 00:14:23,120 --> 00:14:27,620 And I don't know if you can read this, but this part right here. 337 00:14:27,620 --> 00:14:29,240 They made a note in their logbook 338 00:14:29,240 --> 00:14:32,760 that says, first actual case of bug being found. 339 00:14:32,760 --> 00:14:34,560 Which I think is really cute. 340 00:14:34,560 --> 00:14:38,900 So they were literally doing debugging in this computer. 341 00:14:38,900 --> 00:14:40,351 Right. 342 00:14:40,351 --> 00:14:40,850 All right. 343 00:14:40,850 --> 00:14:43,019 So you won't be doing that sort of debugging. 344 00:14:43,019 --> 00:14:45,560 You'll be doing a virtual kind of debugging in your programs. 345 00:14:45,560 --> 00:14:47,330 Which, again, is not that fun. 346 00:14:47,330 --> 00:14:49,460 But you still have to do it. 
347 00:14:49,460 --> 00:14:52,580 So debugging, as you might have noticed so far in your problem 348 00:14:52,580 --> 00:14:57,320 sets, has a bit of a steep learning curve. 349 00:14:57,320 --> 00:14:59,870 And obviously your goal is to have a bug free program, 350 00:14:59,870 --> 00:15:04,090 and in order to achieve that, you have to do the debugging. 351 00:15:04,090 --> 00:15:07,420 There are some tools which some of you have been using. 352 00:15:07,420 --> 00:15:09,520 There are some tools built into Anaconda, 353 00:15:09,520 --> 00:15:14,274 or whatever IDE you've been using to do debugging. 354 00:15:14,274 --> 00:15:16,690 I know some of you have been using the Python tutor, which 355 00:15:16,690 --> 00:15:19,010 is awesome. 356 00:15:19,010 --> 00:15:23,640 The print statement can also be a good debugging tool. 357 00:15:23,640 --> 00:15:26,610 But over and above everything else, it's 358 00:15:26,610 --> 00:15:28,320 really important to just be systematic 359 00:15:28,320 --> 00:15:31,287 as you're trying to debug your program. 360 00:15:31,287 --> 00:15:33,370 I want to talk a little bit about print statements 361 00:15:33,370 --> 00:15:37,120 and how you can use them to debug, because I think-- 362 00:15:37,120 --> 00:15:39,100 Python tutor, if you don't have the internet, 363 00:15:39,100 --> 00:15:41,480 you might not be able to use it. 364 00:15:41,480 --> 00:15:43,530 If you don't know how to use the debugger, 365 00:15:43,530 --> 00:15:44,800 you don't need to learn. 366 00:15:44,800 --> 00:15:46,390 But print statements, you'll always have them, 367 00:15:46,390 --> 00:15:47,950 and you can always put them in your program. 368 00:15:47,950 --> 00:15:50,042 And they're really good ways to test hypotheses. 369 00:15:52,570 --> 00:15:55,230 So good places to put print statements 370 00:15:55,230 --> 00:15:57,667 are inside functions. 
371 00:15:57,667 --> 00:16:00,000 Inside loops, for example, what are the loop parameters, 372 00:16:00,000 --> 00:16:03,510 what are the loop values, what function-- what 373 00:16:03,510 --> 00:16:05,700 functions return what values. 374 00:16:05,700 --> 00:16:09,480 So you can make sure that values are being passed-- 375 00:16:09,480 --> 00:16:11,040 the correct values are being passed 376 00:16:11,040 --> 00:16:12,315 between parts of your code. 377 00:16:15,240 --> 00:16:17,130 I will mention that you can use the bisection 378 00:16:17,130 --> 00:16:19,900 method when you're debugging. 379 00:16:19,900 --> 00:16:22,640 Which is interesting. 380 00:16:22,640 --> 00:16:24,860 So if you take a print statement, 381 00:16:24,860 --> 00:16:27,920 find approximately the halfway point in your code. 382 00:16:27,920 --> 00:16:31,130 Print out what values you-- print out some relevant values. 383 00:16:31,130 --> 00:16:34,220 All of the possible-- print out some 384 00:16:34,220 --> 00:16:36,620 values at that point in your code. 385 00:16:39,130 --> 00:16:40,606 If everything is as you expect it 386 00:16:40,606 --> 00:16:42,730 to be at that point in your code, then you're good. 387 00:16:42,730 --> 00:16:46,690 That means the code so far is bug free. 388 00:16:46,690 --> 00:16:50,710 That means that-- however, that means that the code beyond it 389 00:16:50,710 --> 00:16:52,270 has a bug, right? 390 00:16:52,270 --> 00:16:54,820 So since you've put a print statement halfway in your code 391 00:16:54,820 --> 00:16:57,730 and you think that gave good results, 392 00:16:57,730 --> 00:17:00,730 then put a print statement 3/4 of the way in the code. 393 00:17:00,730 --> 00:17:03,131 And see if the values are as you expect at that point. 394 00:17:03,131 --> 00:17:04,089 And if they are, great. 395 00:17:04,089 --> 00:17:08,229 Then put a print statement further down. 
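The print statement bisection just described can be sketched on a small function. This is an illustrative example only: `average_of_squares` is a hypothetical function with a bug planted in its second half so that the checkpoints have something to find.

```python
def average_of_squares(values):
    """Hypothetical function with a deliberate bug in its second half."""
    squares = [v * v for v in values]

    # Halfway checkpoint: if these values look right, the bug is further down.
    print("checkpoint 1/2, squares =", squares)

    total = 0
    for s in squares[1:]:  # BUG: the slice skips the first element
        total += s

    # Three-quarter checkpoint: the first place where the values look wrong.
    print("checkpoint 3/4, total =", total)

    return total / len(values)


average_of_squares([1, 2, 3])
```

For the input [1, 2, 3] the first checkpoint shows [1, 4, 9], which is what we expect, while the second shows a total of 13 rather than 14. That narrows the problem to the lines between the two print statements.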
396 00:17:08,229 --> 00:17:10,270 So in this way you could use the bisection method 397 00:17:10,270 --> 00:17:15,339 to pinpoint a line, or a set of lines, or maybe a function 398 00:17:15,339 --> 00:17:17,960 that's giving you the bad results. 399 00:17:21,280 --> 00:17:27,181 So the general debugging steps is to study the program code. 400 00:17:27,181 --> 00:17:29,180 Don't ask what is wrong, because that's actually 401 00:17:29,180 --> 00:17:30,260 part of the testing. 402 00:17:30,260 --> 00:17:32,645 So your test cases would have figured out what's wrong. 403 00:17:35,230 --> 00:17:37,690 The debugging process is figuring out 404 00:17:37,690 --> 00:17:40,960 how the result took place. 405 00:17:40,960 --> 00:17:43,820 And since programming is-- programming and debugging 406 00:17:43,820 --> 00:17:47,480 is, sort of, is a science, you can use the scientific method 407 00:17:47,480 --> 00:17:48,710 as well. 408 00:17:48,710 --> 00:17:51,440 So look at all the data, that's your test cases. 409 00:17:51,440 --> 00:17:52,580 Figure out a hypothesis. 410 00:17:52,580 --> 00:17:56,450 Maybe say, oh, maybe I'm indexing from 1 instead of 0 411 00:17:56,450 --> 00:17:59,600 in lists, for example. 412 00:17:59,600 --> 00:18:01,570 Come up with an experiment that you can repeat. 413 00:18:01,570 --> 00:18:03,111 And then pick a simple test case that 414 00:18:03,111 --> 00:18:04,840 you can test your hypothesis with. 415 00:18:07,870 --> 00:18:12,359 So as you're debugging, you will encounter error messages. 416 00:18:12,359 --> 00:18:13,900 And these error messages are actually 417 00:18:13,900 --> 00:18:16,960 pretty easy to figure out. 418 00:18:16,960 --> 00:18:19,210 And they're really easy to fix in your code. 419 00:18:19,210 --> 00:18:22,510 So for example, accessing things beyond the limits of the lists 420 00:18:22,510 --> 00:18:24,610 give you index errors. 
421 00:18:24,610 --> 00:18:27,470 Trying to convert, in this case, a list to an integer 422 00:18:27,470 --> 00:18:29,350 gives you type errors. 423 00:18:29,350 --> 00:18:33,190 Accessing variables that you haven't created before 424 00:18:33,190 --> 00:18:34,120 gives you name errors. 425 00:18:34,120 --> 00:18:35,950 And so on and so on. 426 00:18:35,950 --> 00:18:38,320 And syntax errors are things, for things like, 427 00:18:38,320 --> 00:18:41,050 if you forget a parentheses, or forget a colon, 428 00:18:41,050 --> 00:18:44,260 or something like that. 429 00:18:44,260 --> 00:18:46,870 So error messages are really easy to spot. 430 00:18:46,870 --> 00:18:50,170 The Python interpreter spits these out for you 431 00:18:50,170 --> 00:18:52,270 and then you can pinpoint the exact line. 432 00:18:52,270 --> 00:18:54,870 Logic errors are actually the hard part. 433 00:18:54,870 --> 00:18:58,120 And logic errors are the ones that you will 434 00:18:58,120 --> 00:19:00,820 be spending the most time on. 435 00:19:00,820 --> 00:19:04,000 For which I would recommend always trying to take a break. 436 00:19:04,000 --> 00:19:05,620 Take a nap, go eat. 437 00:19:05,620 --> 00:19:06,850 Something. 438 00:19:06,850 --> 00:19:10,240 Sometimes you'd have to start all over, so 439 00:19:10,240 --> 00:19:11,842 throw out the code you have and just 440 00:19:11,842 --> 00:19:14,050 sit down with a piece of paper, try to figure out how 441 00:19:14,050 --> 00:19:16,090 you want to solve the problem. 442 00:19:16,090 --> 00:19:21,180 And if you look up the term rubber ducky-- a lot of heads 443 00:19:21,180 --> 00:19:24,720 went up on that one-- rubber ducky debugging. 444 00:19:24,720 --> 00:19:28,270 That is an actual term in Wikipedia. 445 00:19:28,270 --> 00:19:30,460 And it's when a programmer explains their code 446 00:19:30,460 --> 00:19:31,900 to a rubber ducky. 447 00:19:31,900 --> 00:19:34,480 That's me on the left explaining code to my rubber ducky. 
448 00:19:34,480 --> 00:19:37,930 You should always-- you should go buy one. 449 00:19:37,930 --> 00:19:40,600 Or code to anyone else, preferably someone 450 00:19:40,600 --> 00:19:43,209 who doesn't really understand anything. 451 00:19:43,209 --> 00:19:45,500 Because that'll force you to explain everything really, 452 00:19:45,500 --> 00:19:47,346 really closely. 453 00:19:47,346 --> 00:19:49,720 And as you're doing that, you'll figure out your problem. 454 00:19:49,720 --> 00:19:51,907 And I figured out my problem in both of these cases. 455 00:19:54,640 --> 00:19:56,186 So just go back to the basics. 456 00:20:00,030 --> 00:20:04,650 Quick summary of dos and don'ts of debugging and testing. 457 00:20:04,650 --> 00:20:06,151 So don't write the entire program, 458 00:20:06,151 --> 00:20:08,400 test the entire program, and debug the entire program. 459 00:20:08,400 --> 00:20:10,200 I know this is really tempting to do, 460 00:20:10,200 --> 00:20:12,480 and I do it all the time. 461 00:20:12,480 --> 00:20:15,180 But don't do it. 462 00:20:15,180 --> 00:20:16,680 Because you're going to introduce 463 00:20:16,680 --> 00:20:18,240 a lot of bugs and it's going to be 464 00:20:18,240 --> 00:20:22,360 hard to isolate which bugs are affecting other ones. 465 00:20:22,360 --> 00:20:24,870 And it'll lead to a lot more stress than you need. 466 00:20:24,870 --> 00:20:28,650 Instead do unit testing. 467 00:20:28,650 --> 00:20:32,155 So write one function, test the function, debug the function, 468 00:20:32,155 --> 00:20:34,030 make sure it works, write the other function, 469 00:20:34,030 --> 00:20:35,130 and so on and so on. 470 00:20:35,130 --> 00:20:37,230 Do a little regression testing, a little more unit 471 00:20:37,230 --> 00:20:39,990 testing, a little integration testing, 472 00:20:39,990 --> 00:20:44,180 and it's a lot more systematic way to write the program. 473 00:20:44,180 --> 00:20:48,066 And it'll cut down on your debugging time immensely. 
474 00:20:48,066 --> 00:20:50,190 If you're changing your code, and inevitably you'll 475 00:20:50,190 --> 00:20:53,370 be changing your code as you're doing your problem sets, 476 00:20:53,370 --> 00:20:56,920 remember to back up your code. 477 00:20:56,920 --> 00:20:58,800 So if you have a version that almost works, 478 00:20:58,800 --> 00:21:01,220 don't just modify that and maybe save a copy. 479 00:21:01,220 --> 00:21:04,410 [INAUDIBLE] you've got terabytes of memory on your computer, 480 00:21:04,410 --> 00:21:07,270 it won't hurt to just make a quick copy of it. 481 00:21:07,270 --> 00:21:11,250 Document maybe what worked and what didn't in that copy. 482 00:21:11,250 --> 00:21:18,300 And then make another copy, and then you can modify your code. 483 00:21:24,360 --> 00:21:26,420 So that's sort of a high level introduction 484 00:21:26,420 --> 00:21:28,830 to testing and debugging. 485 00:21:28,830 --> 00:21:32,850 The rest of the class will be on the error messages, 486 00:21:32,850 --> 00:21:37,580 or on errors that you will get in your programs. 487 00:21:37,580 --> 00:21:41,740 So when your functions-- when you run functions, 488 00:21:41,740 --> 00:21:44,360 or when you run your program, at some point, 489 00:21:44,360 --> 00:21:47,410 the program execution is going to stop. 490 00:21:47,410 --> 00:21:50,260 Maybe it encountered an error because 491 00:21:50,260 --> 00:21:52,660 of some unexpected condition. 492 00:21:52,660 --> 00:21:54,860 And when that happens you get an exception. 493 00:21:54,860 --> 00:21:56,450 So the error is called an exception. 494 00:21:56,450 --> 00:21:58,783 And it's called an exception because it was an exception 495 00:21:58,783 --> 00:22:00,640 to what was expected. 496 00:22:00,640 --> 00:22:03,760 To what the program expected. 
497 00:22:03,760 --> 00:22:05,440 So all of these errors that I've talked 498 00:22:05,440 --> 00:22:07,240 about in the previous slides are actually 499 00:22:07,240 --> 00:22:08,900 examples of exceptions. 500 00:22:12,836 --> 00:22:14,460 And there are actually many other types 501 00:22:14,460 --> 00:22:18,420 of exceptions, which you'll see as you go on in this course 502 00:22:18,420 --> 00:22:23,290 and also in 6.0002. 503 00:22:23,290 --> 00:22:27,150 So how do we deal with these exceptions? 504 00:22:27,150 --> 00:22:33,700 In Python, you can actually have handlers for exceptions. 505 00:22:33,700 --> 00:22:38,860 So if you know that a piece of code might give you an error. 506 00:22:38,860 --> 00:22:44,870 For example, here I'm dealing with inputs from users. 507 00:22:44,870 --> 00:22:47,392 And users are really unpredictable. 508 00:22:47,392 --> 00:22:48,850 You tell them to give you a number, 509 00:22:48,850 --> 00:22:50,141 they might give you their name. 510 00:22:52,470 --> 00:22:53,760 Nothing you can do about that. 511 00:22:53,760 --> 00:22:54,540 Or is there? 512 00:22:54,540 --> 00:22:55,530 Yes there is. 513 00:22:55,530 --> 00:23:01,790 So in your program you can actually put any lines of code 514 00:23:01,790 --> 00:23:03,500 that you think might be problematic, 515 00:23:03,500 --> 00:23:07,610 that might give you an error, an exception, in this try block. 516 00:23:07,610 --> 00:23:09,965 So you say try colon, and you put in any lines of code 517 00:23:09,965 --> 00:23:11,590 that you think might give you an error. 518 00:23:17,230 --> 00:23:21,430 If none of these lines of code actually produce an error, 519 00:23:21,430 --> 00:23:22,960 then great. 520 00:23:22,960 --> 00:23:24,520 Python doesn't do anything else. 521 00:23:24,520 --> 00:23:27,220 It treats them as just part-- as just if they 522 00:23:27,220 --> 00:23:29,440 were part of a regular program.
523 00:23:29,440 --> 00:23:32,590 But if an error does come up-- for example, 524 00:23:32,590 --> 00:23:34,090 if someone doesn't put in a number 525 00:23:34,090 --> 00:23:37,220 but puts their name in-- that's going 526 00:23:37,220 --> 00:23:41,390 to raise an error, specifically a value error. 527 00:23:41,390 --> 00:23:43,530 And at that point, Python's going to say, 528 00:23:43,530 --> 00:23:47,880 is there an except statement? 529 00:23:47,880 --> 00:23:51,830 And if so, this except statement is going to handle the error. 530 00:23:55,000 --> 00:23:56,860 And it's going to say, OK, an error came up, 531 00:23:56,860 --> 00:23:58,840 but I know how to handle it. 532 00:23:58,840 --> 00:24:03,590 I'm going to print out this message to the user. 533 00:24:03,590 --> 00:24:07,385 So if we look at code-- this is the same code 534 00:24:07,385 --> 00:24:11,520 as in the slides-- and there's no try except block around it. 535 00:24:11,520 --> 00:24:15,090 So if I run it and I say, three and four, 536 00:24:15,090 --> 00:24:17,860 it's going to run fine. 537 00:24:17,860 --> 00:24:21,390 But if I run it and I say, [INAUDIBLE] a, 538 00:24:21,390 --> 00:24:22,792 it's going to give a value error. 539 00:24:26,210 --> 00:24:29,690 Now if I run the same piece of code with 540 00:24:29,690 --> 00:24:32,165 try-- with a try except block. 541 00:24:35,380 --> 00:24:39,980 I run it, if I give it regular numbers, it's fine. 542 00:24:39,980 --> 00:24:48,220 But if I'm being a cheeky user, and I say three, 543 00:24:48,220 --> 00:24:51,910 automatically this would have raised the value error 544 00:24:51,910 --> 00:24:55,060 in the previous version of the program. 545 00:24:55,060 --> 00:24:57,010 But in this version of the program, 546 00:24:57,010 --> 00:24:59,170 the programmer handled the exception 547 00:24:59,170 --> 00:25:01,390 or caught the exception, and printed 548 00:25:01,390 --> 00:25:04,120 out this nicer looking message.
549 00:25:04,120 --> 00:25:10,090 So bug in user input is nicer than this whole lot here. 550 00:25:14,590 --> 00:25:15,985 A lot easier to read. 551 00:25:20,660 --> 00:25:23,210 So any problematic lines of code, 552 00:25:23,210 --> 00:25:24,770 you can put in a try block, and then 553 00:25:24,770 --> 00:25:29,910 handle whatever errors might come up in this except block. 554 00:25:29,910 --> 00:25:34,460 This except block is going to catch any error that comes up. 555 00:25:34,460 --> 00:25:36,860 And you can actually get a little bit more specific 556 00:25:36,860 --> 00:25:40,560 and catch specific types of errors. 557 00:25:40,560 --> 00:25:44,190 In this case, I'm saying, if a value error comes up-- 558 00:25:44,190 --> 00:25:47,310 for example, if the user inputs a string instead 559 00:25:47,310 --> 00:25:53,330 of an integer-- do this, which is going to print this message. 560 00:25:53,330 --> 00:25:58,590 If the user inputs a number for B such that 561 00:25:58,590 --> 00:26:02,220 we're doing a divided by b, so that would give a 0 division 562 00:26:02,220 --> 00:26:03,300 error. 563 00:26:03,300 --> 00:26:06,400 In that case we're going to catch this other error here, 564 00:26:06,400 --> 00:26:08,370 the 0 division error, and we're going 565 00:26:08,370 --> 00:26:10,575 to print this other message, can't divide by 0. 566 00:26:14,310 --> 00:26:18,120 So each-- so you can think of these different except blocks 567 00:26:18,120 --> 00:26:22,380 as sort of if else if statements, 568 00:26:22,380 --> 00:26:24,870 except for exceptions. 569 00:26:24,870 --> 00:26:26,490 So we're going to try this. 570 00:26:26,490 --> 00:26:29,170 But if there's a value error do this. 571 00:26:29,170 --> 00:26:31,620 Otherwise, if there's a division error, do this. 572 00:26:31,620 --> 00:26:33,829 And otherwise do this. 573 00:26:33,829 --> 00:26:35,370 So this last except is actually going 574 00:26:35,370 --> 00:26:37,200 to be for any other error that comes up. 
575 00:26:37,200 --> 00:26:40,410 So if it's not a value error, nor a division error, then 576 00:26:40,410 --> 00:26:42,850 we're going to print, something went very wrong. 577 00:26:42,850 --> 00:26:45,090 I couldn't even try to create-- I couldn't even 578 00:26:45,090 --> 00:26:48,960 try to make the program come up with any other error 579 00:26:48,960 --> 00:26:49,710 besides those two. 580 00:26:55,670 --> 00:26:59,320 So a lot of the time you're just going to use try except blocks. 581 00:26:59,320 --> 00:27:02,120 But there's other blocks that you can add to exceptions. 582 00:27:02,120 --> 00:27:04,210 And these are more rarely used, but I'll 583 00:27:04,210 --> 00:27:05,860 talk about them anyway. 584 00:27:05,860 --> 00:27:07,990 So you could have an else block. 585 00:27:10,680 --> 00:27:13,170 And an else block is going to get executed 586 00:27:13,170 --> 00:27:16,140 when the code in the try block finished 587 00:27:16,140 --> 00:27:19,632 without raising an error. 588 00:27:19,632 --> 00:27:22,770 And you can also have a finally block, 589 00:27:22,770 --> 00:27:25,800 which is always executed. 590 00:27:25,800 --> 00:27:29,790 If the code in the try block finished without an error, 591 00:27:29,790 --> 00:27:31,860 if you raised an exception, if you raised 592 00:27:31,860 --> 00:27:35,280 a different kind of exception, if you went through the else, 593 00:27:35,280 --> 00:27:38,550 in any of these cases, whatever's in the finally block 594 00:27:38,550 --> 00:27:42,160 is always going to get executed. 595 00:27:42,160 --> 00:27:47,834 And it's usually used to clean up code. 596 00:27:47,834 --> 00:27:50,000 Like if you want to print, oh, the program finished, 597 00:27:50,000 --> 00:27:53,761 or if you want to close a file, or something like that. 598 00:27:53,761 --> 00:27:55,750 So. 599 00:27:55,750 --> 00:27:57,250 We've encountered errors. 600 00:27:57,250 --> 00:27:58,930 We've caught them. 
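The try, except, else, and finally blocks described above can be sketched roughly as follows. This is a minimal illustration, not the code from the slides: the helper name `divide_strings`, its messages, and the `log` list are assumptions made up for this example, with string arguments standing in for what a user might type.

```python
def divide_strings(a_text, b_text):
    """Hypothetical helper: parse two strings as ints and divide a by b."""
    log = []  # records which blocks ran, for illustration only
    try:
        a = int(a_text)      # raises ValueError if the text is not an integer
        b = int(b_text)
        result = a / b       # raises ZeroDivisionError if b is 0
    except ValueError:
        result = None
        log.append("Could not convert to a number.")
    except ZeroDivisionError:
        result = None
        log.append("Can't divide by zero.")
    except Exception:
        # Catch-all for any other error; a bare "except:" behaves similarly.
        result = None
        log.append("Something went very wrong.")
    else:
        # Runs only when the try block finished without raising.
        log.append("no error")
    finally:
        # Always runs, whichever path was taken above.
        log.append("done")
    return result, log
```

Calling `divide_strings("3", "4")` takes the else path, while `divide_strings("a", "4")` and `divide_strings("3", "0")` land in the first and second except blocks respectively; "done" is logged in every case.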
601 00:27:58,930 --> 00:28:01,031 What else can we do with errors-- with exceptions. 602 00:28:04,259 --> 00:28:05,050 Three other things. 603 00:28:05,050 --> 00:28:10,400 So one is if we've caught an error, 604 00:28:10,400 --> 00:28:13,250 we can just fail silently. 605 00:28:13,250 --> 00:28:16,250 What this means is, you've caught an error, 606 00:28:16,250 --> 00:28:20,970 and you just substitute whatever erroneous value the user gave 607 00:28:20,970 --> 00:28:23,329 you for some other value. 608 00:28:23,329 --> 00:28:24,870 That's not actually a very good idea. 609 00:28:24,870 --> 00:28:26,700 That's a bad idea. 610 00:28:26,700 --> 00:28:29,790 Because suddenly the user thinks that they entered something, 611 00:28:29,790 --> 00:28:32,690 and they think everything's great, your program accepts it, 612 00:28:32,690 --> 00:28:34,170 but then they get some weird value 613 00:28:34,170 --> 00:28:37,080 as an output, which is far from what they expected. 614 00:28:37,080 --> 00:28:39,180 So it's not really a good idea to just replace 615 00:28:39,180 --> 00:28:40,346 user's values with anything. 616 00:28:43,930 --> 00:28:46,360 In the context-- so this is in the context of a function. 617 00:28:46,360 --> 00:28:48,560 In the context of a function, what else can we do? 618 00:28:48,560 --> 00:28:54,810 Well, if you have a function that fails, 619 00:28:54,810 --> 00:28:59,370 for example, let's say you're trying to do you're 620 00:28:59,370 --> 00:29:03,780 trying to get the square root of an even number. 621 00:29:03,780 --> 00:29:06,502 And let's say the user gives you a-- sorry, 622 00:29:06,502 --> 00:29:08,960 you're trying to find the square root of a positive number. 623 00:29:08,960 --> 00:29:11,084 And let's say the user gives you a negative number. 
624 00:29:14,490 --> 00:29:16,550 Well, if the user gives you a negative number, 625 00:29:16,550 --> 00:29:19,790 your function could return an error value, which 626 00:29:19,790 --> 00:29:23,360 means, well if the number inputted is less than 0, 627 00:29:23,360 --> 00:29:24,740 then return 0. 628 00:29:24,740 --> 00:29:25,700 Or minus 1. 629 00:29:25,700 --> 00:29:26,990 Or minus 100. 630 00:29:26,990 --> 00:29:29,450 Just pick any value to return which 631 00:29:29,450 --> 00:29:32,140 represents some error value. 632 00:29:32,140 --> 00:29:34,947 This is actually not a good idea either, because later 633 00:29:34,947 --> 00:29:37,030 on in your program, if you're using this function, 634 00:29:37,030 --> 00:29:39,130 now you have to do a check. 635 00:29:39,130 --> 00:29:41,800 And the check is, well if the return from this function 636 00:29:41,800 --> 00:29:44,210 is minus 1 or minus 100, do this. 637 00:29:44,210 --> 00:29:45,610 Otherwise, do this. 638 00:29:45,610 --> 00:29:48,490 So you're complicating your code 639 00:29:48,490 --> 00:29:51,790 because now you always have to have this check for this error 640 00:29:51,790 --> 00:29:53,380 value. 641 00:29:53,380 --> 00:29:56,590 Which makes the code really messy. 642 00:29:56,590 --> 00:29:59,800 The other thing we can do is we can signal an error condition. 643 00:29:59,800 --> 00:30:07,480 So this is how you create control flow in your programs 644 00:30:07,480 --> 00:30:09,550 with exceptions. 645 00:30:09,550 --> 00:30:11,620 So in Python, signaling an error condition 646 00:30:11,620 --> 00:30:14,590 means raising your own exception. 647 00:30:14,590 --> 00:30:18,980 So so far we've just seen the programs crashing, 648 00:30:18,980 --> 00:30:20,900 which means they raise an exception 649 00:30:20,900 --> 00:30:22,750 and then you deal with them. 650 00:30:22,750 --> 00:30:26,240 But in this last case, you're raising your own exception.
651 00:30:26,240 --> 00:30:31,630 As a way to use that exception later on in the code. 652 00:30:31,630 --> 00:30:33,380 So in Python, you raise your own exception 653 00:30:33,380 --> 00:30:36,410 using this raise keyword and then an exception. 654 00:30:36,410 --> 00:30:38,240 And then some sort of description, 655 00:30:38,240 --> 00:30:42,452 like "user entered a negative number" or something like that. 656 00:30:48,000 --> 00:30:52,480 A lot of the time we're going to raise a value error. 657 00:30:52,480 --> 00:30:57,520 So if the number is less than 0, then raise a value error, 658 00:30:57,520 --> 00:31:00,036 something is wrong. 659 00:31:00,036 --> 00:31:01,910 The keyword, the name of the error, and then 660 00:31:01,910 --> 00:31:03,360 some sort of descriptive string. 661 00:31:08,070 --> 00:31:12,270 So let's see an example of how we raise an exception. 662 00:31:12,270 --> 00:31:15,380 I have this function here called get ratios. 663 00:31:15,380 --> 00:31:18,620 It takes in two lists, L1 and L2. 664 00:31:18,620 --> 00:31:21,050 And it's going to create a new list that's 665 00:31:21,050 --> 00:31:24,860 going to contain the ratio of each element 666 00:31:24,860 --> 00:31:29,230 in L1 divided by each element in L2. 667 00:31:29,230 --> 00:31:31,990 So I have a for loop here. 668 00:31:31,990 --> 00:31:34,480 For index in range length L1. 669 00:31:34,480 --> 00:31:37,840 So I'm going through every single element in L1. 670 00:31:37,840 --> 00:31:43,330 I'm going to try here. 671 00:31:43,330 --> 00:31:44,860 I'm going to try to do this line. 672 00:31:44,860 --> 00:31:47,090 So I think that this line might give me an error. 673 00:31:47,090 --> 00:31:49,810 So I'm going to put it in a try block. 674 00:31:49,810 --> 00:31:52,310 The error I think I'm going to get 675 00:31:52,310 --> 00:31:54,710 is a 0 division error, because what happens 676 00:31:54,710 --> 00:31:56,170 when an element in L2 is 0?
677 00:32:00,030 --> 00:32:01,950 And when an element in L2 is 0 I'm 678 00:32:01,950 --> 00:32:06,630 going to append this not a number as a float. 679 00:32:06,630 --> 00:32:10,310 So NaN, as a string, you can convert it to a float, 680 00:32:10,310 --> 00:32:11,760 and it stands for not a number. 681 00:32:14,810 --> 00:32:18,290 So then I can continue populating the list 682 00:32:18,290 --> 00:32:20,120 with these not a numbers. 683 00:32:20,120 --> 00:32:23,460 If an element in L2 is 0. 684 00:32:23,460 --> 00:32:27,130 And otherwise, if there's no 0 division error, 685 00:32:27,130 --> 00:32:29,180 but there's another kind of error, 686 00:32:29,180 --> 00:32:31,150 then I'm going to raise my own error. 687 00:32:31,150 --> 00:32:33,640 And say, for any other kind of error, 688 00:32:33,640 --> 00:32:35,590 just raise a value error. 689 00:32:35,590 --> 00:32:41,030 Which says, "get ratios was called with a bad argument." 690 00:32:41,030 --> 00:32:42,950 So here I'm sort of consolidating all errors 691 00:32:42,950 --> 00:32:44,260 into my one value error. 692 00:32:44,260 --> 00:32:47,840 So later on in my program, I can catch this value error 693 00:32:47,840 --> 00:32:48,944 and do something with it. 694 00:32:53,030 --> 00:32:56,580 Here's another example of exceptions. 695 00:32:56,580 --> 00:32:59,570 So let's say we were given a class list. 696 00:32:59,570 --> 00:33:02,980 We have a list of lists. 697 00:33:02,980 --> 00:33:05,620 Where we have the name of a student, 698 00:33:05,620 --> 00:33:08,680 first name and last name, and their grades in the class. 699 00:33:08,680 --> 00:33:11,920 So we currently have two students. 700 00:33:11,920 --> 00:33:14,320 And what I want to do is create a new list 701 00:33:14,320 --> 00:33:20,140 which is the same things, the same inputs here.
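Stepping back for a moment to the get ratios function described a little earlier: a sketch of it, reconstructed from the spoken description, might look like the following. The exact error message and the use of `except Exception` for "any other kind of error" are assumptions; the slides are not visible in the transcript.

```python
def get_ratios(L1, L2):
    """Assumes L1 and L2 are lists of numbers of equal length.
    Returns a list containing L1[i] / L2[i] for each index i."""
    ratios = []
    for index in range(len(L1)):
        try:
            ratios.append(L1[index] / L2[index])
        except ZeroDivisionError:
            # Keep going, but mark this slot as "not a number".
            ratios.append(float('nan'))
        except Exception:
            # Consolidate every other failure into a single ValueError
            # that the caller can catch later on.
            raise ValueError('get_ratios called with bad argument')
    return ratios
```

A caller can then wrap `get_ratios(...)` in its own try block and handle just `ValueError`, rather than guessing which underlying error occurred.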
702 00:33:20,140 --> 00:33:23,800 But I'm adding an extra-- I'm appending an extra value 703 00:33:23,800 --> 00:33:26,590 at the end of the list for each student, which 704 00:33:26,590 --> 00:33:28,627 is the average of all of their grades. 705 00:33:28,627 --> 00:33:30,460 Or all of their-- yeah, all of their grades. 706 00:33:34,130 --> 00:33:35,325 So let's look at the code. 707 00:33:37,980 --> 00:33:42,080 This is the function that takes the class list, which 708 00:33:42,080 --> 00:33:43,404 is this whole list here. 709 00:33:47,040 --> 00:33:53,230 I'm creating a new list inside it, initially empty. 710 00:33:53,230 --> 00:33:57,820 And then I'm going for every element in the class list. 711 00:33:57,820 --> 00:34:01,240 I'm appending element at 0, which is going 712 00:34:01,240 --> 00:34:02,950 to be this first list here. 713 00:34:02,950 --> 00:34:05,980 So it's going to be the first name and the last name. 714 00:34:05,980 --> 00:34:10,639 Element at 1, which is the grades. 715 00:34:10,639 --> 00:34:16,340 And then the last thing I'm appending is a function call. 716 00:34:16,340 --> 00:34:18,920 The function call being called with element 1, which 717 00:34:18,920 --> 00:34:22,909 is all of the grades, and this is my function call. 718 00:34:22,909 --> 00:34:26,050 We're going to see three different function calls. 719 00:34:26,050 --> 00:34:27,760 This is the first one. 720 00:34:27,760 --> 00:34:29,539 It simply takes the sum of the grades 721 00:34:29,539 --> 00:34:31,330 and divides it by the length of the grades. 722 00:34:35,719 --> 00:34:39,550 If these students are responsible, 723 00:34:39,550 --> 00:34:44,348 and they've taken all of the tests, then there's no problem. 724 00:34:44,348 --> 00:34:46,389 Because length of grades is going to be something 725 00:34:46,389 --> 00:34:49,580 greater than 0. 
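The class list function and the first averaging function just described can be sketched like this. The function names `get_stats` and `avg`, and the student names and grades, are placeholders chosen for illustration; only the overall shape (a list of [name, grades] pairs, with the average appended) comes from the lecture.

```python
def avg(grades):
    # First version: no protection against an empty list.
    return sum(grades) / len(grades)


def get_stats(class_list):
    """class_list is a list of [[first, last], [grade, grade, ...]] entries.
    Returns a new list with each student's average appended."""
    new_stats = []
    for elt in class_list:
        # elt[0] is the name pair, elt[1] is the list of grades.
        new_stats.append([elt[0], elt[1], avg(elt[1])])
    return new_stats


# Hypothetical data: two students who took every test.
test_grades = [[['ada', 'lovelace'], [80.0, 70.0, 90.0]],
               [['alan', 'turing'], [100.0, 80.0, 60.0]]]
stats = get_stats(test_grades)
```

With this data every grades list is non-empty, so `len(grades)` is greater than 0 and the division is safe.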
726 00:34:49,580 --> 00:34:52,030 But what if we have a student in the class who 727 00:34:52,030 --> 00:34:53,303 didn't show up for any tests? 728 00:34:57,740 --> 00:35:00,560 Then we have no record of any of their tests. 729 00:35:00,560 --> 00:35:04,160 No record of grades or anything like that. 730 00:35:04,160 --> 00:35:06,800 So they're going to have an empty list. 731 00:35:06,800 --> 00:35:10,849 So if we run this function, averages, on their data, 732 00:35:10,849 --> 00:35:13,390 we're actually going to get a 0 division error, because we're 733 00:35:13,390 --> 00:35:17,230 trying to divide by length of grades, which is going to be 0. 734 00:35:21,170 --> 00:35:22,100 So what can we do? 735 00:35:22,100 --> 00:35:24,980 Two things, two options here. 736 00:35:24,980 --> 00:35:29,790 One is we can just flag the error and print the message. 737 00:35:29,790 --> 00:35:34,490 So here there's a new average function, an improved one, 738 00:35:34,490 --> 00:35:36,500 that's going to try to do the exact same line 739 00:35:36,500 --> 00:35:39,340 as the previous one. 740 00:35:39,340 --> 00:35:42,920 And it's going to catch the 0 division error. 741 00:35:45,620 --> 00:35:49,369 And when it catches it, it's going to print this warning. 742 00:35:49,369 --> 00:35:51,410 And when we run it, we're going to get, "warning, 743 00:35:51,410 --> 00:35:54,831 no grades data," which is fine. 744 00:35:54,831 --> 00:36:03,990 And we're going to get this "none" here, for the grades. 745 00:36:03,990 --> 00:36:06,570 So everyone else's grades was calculated correctly, 746 00:36:06,570 --> 00:36:09,560 and for this last one, we got a none. 747 00:36:09,560 --> 00:36:12,690 That's because, when we entered this except statement, 748 00:36:12,690 --> 00:36:17,330 if this is a function, remember functions return something. 749 00:36:17,330 --> 00:36:20,510 This function in this particular except statement 750 00:36:20,510 --> 00:36:21,480 didn't return anything. 
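The first option, flagging the error and printing a message, can be sketched as below. The warning text follows the transcript; the function name `avg` is assumed to match the earlier version.

```python
def avg(grades):
    try:
        return sum(grades) / len(grades)
    except ZeroDivisionError:
        # Flag the problem, but return nothing explicitly.
        print('warning: no grades data')
        # Falling off the end of a function returns None.
```

For a student with an empty grades list, `avg([])` prints the warning and evaluates to `None`, which is what ends up stored as that student's average.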
751 00:36:21,480 --> 00:36:23,440 So it returns a none. 752 00:36:23,440 --> 00:36:27,010 So for the averages for this particular function, 753 00:36:27,010 --> 00:36:30,400 the average is going to be a "none" for this person who 754 00:36:30,400 --> 00:36:33,184 didn't have any grades associated with them. 755 00:36:37,450 --> 00:36:43,350 And yeah, so that's basically what I said. 756 00:36:43,350 --> 00:36:45,810 So that's our first option, is to just flag the error 757 00:36:45,810 --> 00:36:47,310 and print a message. 758 00:36:47,310 --> 00:36:50,080 The other option is to actually change the policy. 759 00:36:50,080 --> 00:36:54,390 So this is where you replace the data with some sort of default 760 00:36:54,390 --> 00:36:54,984 value. 761 00:36:54,984 --> 00:36:56,400 And if you do something like this, 762 00:36:56,400 --> 00:36:58,380 then this should be documented inside the function. 763 00:36:58,380 --> 00:37:00,421 So when you write the docstring for the function, 764 00:37:00,421 --> 00:37:15,360 you would say if the list is empty, then it'll return a 0. 765 00:37:15,360 --> 00:37:17,490 So this is the exact same thing as before. 766 00:37:17,490 --> 00:37:21,510 We have a try and an except for the 0 division error. 767 00:37:21,510 --> 00:37:24,810 We also print a warning, no grades data. 768 00:37:24,810 --> 00:37:25,940 And then we return the 0. 769 00:37:28,800 --> 00:37:31,350 So we still flag the error, and now instead of a "none," 770 00:37:31,350 --> 00:37:36,150 we get a 0, because we've returned 0.0 here, as opposed 771 00:37:36,150 --> 00:37:37,420 to just leaving it blank. 772 00:37:43,410 --> 00:37:43,910 All right. 773 00:37:43,910 --> 00:37:46,660 So those are exceptions. 774 00:37:46,660 --> 00:37:48,800 Last thing we're going to talk about today 775 00:37:48,800 --> 00:37:52,790 are these things called assertions. 776 00:37:52,790 --> 00:38:02,010 And assertions are a good example of defensive programming.
777 00:38:02,010 --> 00:38:05,470 In that, you have assert statements 778 00:38:05,470 --> 00:38:08,530 at the beginning of functions, typically. 779 00:38:08,530 --> 00:38:11,320 Or at the end of functions. 780 00:38:11,320 --> 00:38:14,020 And assert statements are used to make sure 781 00:38:14,020 --> 00:38:17,950 that the assumptions on computations 782 00:38:17,950 --> 00:38:21,880 are exactly what the function expects them to be. 783 00:38:21,880 --> 00:38:23,980 So if we have a function that says 784 00:38:23,980 --> 00:38:27,580 it's supposed to take in an integer greater than 0, 785 00:38:27,580 --> 00:38:31,120 then the assert statement will assert 786 00:38:31,120 --> 00:38:35,760 that the function takes in an integer that's greater than 0. 787 00:38:35,760 --> 00:38:37,360 Here's an example. 788 00:38:37,360 --> 00:38:41,750 This is the same average function we've seen before. 789 00:38:41,750 --> 00:38:43,430 Here, instead of using exceptions, 790 00:38:43,430 --> 00:38:46,190 we're going to use an assert statement. 791 00:38:46,190 --> 00:38:49,640 And the assert statement we're putting right at the front. 792 00:38:49,640 --> 00:38:52,690 At the beginning of the function, sorry. 793 00:38:52,690 --> 00:38:54,015 And the key word is assert. 794 00:38:56,820 --> 00:39:01,770 The next part of the assert is what the function expects. 795 00:39:01,770 --> 00:39:05,970 So we expect that the length of grades is not equal to 0. 796 00:39:05,970 --> 00:39:07,424 So has to be greater than 0. 797 00:39:10,229 --> 00:39:11,770 And then we have a string here, which 798 00:39:11,770 --> 00:39:17,320 represents what do you print out if the assertion does not hold. 
799 00:39:17,320 --> 00:39:20,990 So if you run the function, and you give it 800 00:39:20,990 --> 00:39:27,360 a list that is empty, this becomes false, 801 00:39:27,360 --> 00:39:29,820 so the assert is false, and we're 802 00:39:29,820 --> 00:39:32,520 going to print out an assertion error, no grades data. 803 00:39:35,750 --> 00:39:39,050 If the assert is false, the function does not continue. 804 00:39:39,050 --> 00:39:42,040 It stops right there. 805 00:39:42,040 --> 00:39:43,330 Why does it behave this way? 806 00:39:43,330 --> 00:39:47,570 Well, assertions are great to make sure 807 00:39:47,570 --> 00:39:51,260 that preconditions and post-conditions on functions 808 00:39:51,260 --> 00:39:54,380 are exactly as you expect. 809 00:39:54,380 --> 00:39:57,790 So as soon as an assert becomes false, 810 00:39:57,790 --> 00:40:00,970 the function's going to immediately terminate. 811 00:40:00,970 --> 00:40:06,740 This is useful because it'll prevent the program 812 00:40:06,740 --> 00:40:09,480 from propagating bad values. 813 00:40:09,480 --> 00:40:12,380 So as soon as a precondition isn't true, for example, 814 00:40:12,380 --> 00:40:14,630 as you enter a function, then that means something 815 00:40:14,630 --> 00:40:16,540 went wrong in your program. 816 00:40:16,540 --> 00:40:19,100 And the program is going to stop right there. 817 00:40:19,100 --> 00:40:21,230 So instead of propagating a bad value 818 00:40:21,230 --> 00:40:22,730 throughout the program, and then you 819 00:40:22,730 --> 00:40:24,710 getting an output that you didn't expect, 820 00:40:24,710 --> 00:40:28,310 and then you having to trace back to the function that 821 00:40:28,310 --> 00:40:32,150 gave this bad value, you'll get this bad value, you'll get this 822 00:40:32,150 --> 00:40:36,260 assert being false a lot earlier. 823 00:40:36,260 --> 00:40:39,200 So it'll be a lot easier to figure out 824 00:40:39,200 --> 00:40:40,832 where the bug came from. 
825 00:40:40,832 --> 00:40:42,790 And you won't have to trace back so many steps. 826 00:40:46,410 --> 00:40:48,090 So this is basically what I said, 827 00:40:48,090 --> 00:40:52,650 you really want to spot the bugs as soon as they're introduced. 828 00:40:52,650 --> 00:40:56,820 And exceptions are good if you want to raise them 829 00:40:56,820 --> 00:40:58,800 when the user supplies bad data input, 830 00:40:58,800 --> 00:41:00,600 but assertions are used to make sure 831 00:41:00,600 --> 00:41:07,160 that the types and other-- the types of inputs to functions, 832 00:41:07,160 --> 00:41:10,800 maybe other conditions on inputs to functions, 833 00:41:10,800 --> 00:41:15,864 are being held as the values are being passed in. 834 00:41:15,864 --> 00:41:17,280 So the keyword here is making sure 835 00:41:17,280 --> 00:41:21,700 that the invariants on data structures are met. 836 00:41:21,700 --> 00:41:22,840 And that's it. 837 00:41:22,840 --> 00:41:23,710 Great. 838 00:41:23,710 --> 00:41:25,260 Thanks.
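The assertion-based version of the average function discussed above can be sketched as follows. The name `avg` and the message text are taken from the spoken description; treat the exact wording as an assumption.

```python
def avg(grades):
    # Precondition: there must be at least one grade.
    # If it fails, Python raises AssertionError('no grades data')
    # and the function stops right here.
    assert len(grades) != 0, 'no grades data'
    return sum(grades) / len(grades)
```

Because the check sits at the top of the function, a bad input is reported at the point where it enters, instead of surfacing later as a puzzling value somewhere else in the program.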