1 00:00:00,000 --> 00:00:00,870 2 00:00:00,870 --> 00:00:03,630 PROFESSOR: Just as truth tables provide a simple, direct way 3 00:00:03,630 --> 00:00:07,330 to define the meanings of the individual propositional 4 00:00:07,330 --> 00:00:09,080 connectives or propositional operators, 5 00:00:09,080 --> 00:00:10,690 they also provide a methodical way 6 00:00:10,690 --> 00:00:13,030 to understand the behavior of formulas, 7 00:00:13,030 --> 00:00:15,660 and in particular whether two formulas are equivalent 8 00:00:15,660 --> 00:00:18,970 or whether a formula is valid, meaning that it's always true. 9 00:00:18,970 --> 00:00:22,180 So let's take a look at that. 10 00:00:22,180 --> 00:00:26,440 To being with, if I'm thinking about a propositional formula 11 00:00:26,440 --> 00:00:28,240 that's a composite one that's built out 12 00:00:28,240 --> 00:00:29,955 of lots of more atomic, primitive ones, 13 00:00:29,955 --> 00:00:32,330 then in order to figure out the value of the whole thing, 14 00:00:32,330 --> 00:00:36,200 I need to know the values of the individual components. 15 00:00:36,200 --> 00:00:40,470 So if we think of a formula involving P's and Q's and R's 16 00:00:40,470 --> 00:00:43,590 that are true/false value propositional variables, 17 00:00:43,590 --> 00:00:45,806 then I need a truth assignment to know 18 00:00:45,806 --> 00:00:47,430 the values of those variables, in order 19 00:00:47,430 --> 00:00:51,600 to know whether or not the formula is true or false. 20 00:00:51,600 --> 00:00:54,880 So in computer science jargon, this process 21 00:00:54,880 --> 00:00:59,300 of assigning values to variables is called an environment, 22 00:00:59,300 --> 00:01:01,770 although logicians would call it a truth assignment. 23 00:01:01,770 --> 00:01:05,394 So an environment tells you, given a variable, 24 00:01:05,394 --> 00:01:06,810 whether or not it's true or false. 25 00:01:06,810 --> 00:01:11,920 Let's look at an example of three 26 00:01:11,920 --> 00:01:14,910 variables-- P, Q, and R. They are true/false valued, 27 00:01:14,910 --> 00:01:16,710 and I'm going to tell you that I've 28 00:01:16,710 --> 00:01:20,710 got an environment v in which P is true and Q is true 29 00:01:20,710 --> 00:01:22,500 and R is false. 30 00:01:22,500 --> 00:01:24,570 So v of P was T, et cetera. 31 00:01:24,570 --> 00:01:26,320 So I'm thinking of v as a function that 32 00:01:26,320 --> 00:01:31,110 maps a variable to its value. 33 00:01:31,110 --> 00:01:34,340 Now, let's see how I would use this particular environment 34 00:01:34,340 --> 00:01:39,160 to figure out the value of this composite formula whose 35 00:01:39,160 --> 00:01:45,859 atomic parts are P and Q and R, and Q again, I guess. 36 00:01:45,859 --> 00:01:48,150 Let's take a look at how we would go about figuring out 37 00:01:48,150 --> 00:01:51,602 the value of this whole formula given the values of P, Q, 38 00:01:51,602 --> 00:01:53,060 and R. It's pretty straightforward, 39 00:01:53,060 --> 00:01:56,520 but the methodical way to do it is sort of from inside out. 40 00:01:56,520 --> 00:02:00,810 Let's begin by attaching the truth values that we're given 41 00:02:00,810 --> 00:02:03,800 to those particular variables. 42 00:02:03,800 --> 00:02:08,590 Now, notice here that I've got both arms of the AND 43 00:02:08,590 --> 00:02:10,949 have been assigned truth values, and they're both true. 44 00:02:10,949 --> 00:02:13,540 That means I can assign true to the conjunction 45 00:02:13,540 --> 00:02:16,180 formula, the AND formula, and I do that 46 00:02:16,180 --> 00:02:18,050 by putting the T under that AND, which 47 00:02:18,050 --> 00:02:22,060 is the principal connective of this subformula. 48 00:02:22,060 --> 00:02:24,180 Now, looking back at this Q, I've 49 00:02:24,180 --> 00:02:28,520 got it's true, which means that NOT of Q is false, 50 00:02:28,520 --> 00:02:31,850 so I can put a false under the NOT. 51 00:02:31,850 --> 00:02:35,750 Now, notice I've got both arms of this XOR are defined. 52 00:02:35,750 --> 00:02:38,860 They're both false, and that means that the XOR is false 53 00:02:38,860 --> 00:02:41,360 because it's only supposed to be true if exactly one of them 54 00:02:41,360 --> 00:02:42,410 is true. 55 00:02:42,410 --> 00:02:46,520 And next, I had this true for AND. 56 00:02:46,520 --> 00:02:49,740 That means that NOT of that formula is false. 57 00:02:49,740 --> 00:02:53,680 And now that I have both arms of the OR, this one is false 58 00:02:53,680 --> 00:02:55,920 and that one is false, I can conclude 59 00:02:55,920 --> 00:02:59,240 that the entire expression is false by putting 60 00:02:59,240 --> 00:03:02,710 that final truth value under the principal connective 61 00:03:02,710 --> 00:03:04,030 of the whole formula. 62 00:03:04,030 --> 00:03:07,110 So that's actually an easily defined recursive process. 63 00:03:07,110 --> 00:03:08,620 I've described it from inside out, 64 00:03:08,620 --> 00:03:10,640 but if you are programming it, you 65 00:03:10,640 --> 00:03:13,660 would be doing it recursively top-down 66 00:03:13,660 --> 00:03:17,370 in order to evaluate the truth value of a formula given 67 00:03:17,370 --> 00:03:20,730 the truth values of its constituent variables given 68 00:03:20,730 --> 00:03:23,380 an environment. 69 00:03:23,380 --> 00:03:26,540 Now, a basic idea about propositional formulas 70 00:03:26,540 --> 00:03:29,830 is that two of them are equivalent if and only 71 00:03:29,830 --> 00:03:33,290 if they have the same truth values in all environments-- 72 00:03:33,290 --> 00:03:34,360 in all environments. 73 00:03:34,360 --> 00:03:37,624 No matter what the values of the P's and Q's and R's are, 74 00:03:37,624 --> 00:03:39,040 no matter what the truth value is, 75 00:03:39,040 --> 00:03:41,687 these two formulas come out to the same truth value, 76 00:03:41,687 --> 00:03:43,270 and that's what makes them equivalent. 77 00:03:43,270 --> 00:03:45,110 Let's look at an important example 78 00:03:45,110 --> 00:03:47,150 known as De Morgan's law. 79 00:03:47,150 --> 00:03:49,010 So De Morgan's law says that if I 80 00:03:49,010 --> 00:03:54,680 look at this formula, the NOT of P or Q, the negation of P or Q, 81 00:03:54,680 --> 00:04:00,200 I claim that that's equivalent to not P and not Q, 82 00:04:00,200 --> 00:04:03,940 and let's check that with a truth table. 83 00:04:03,940 --> 00:04:06,860 So here's going to be the truth table for the first formula, 84 00:04:06,860 --> 00:04:11,750 NOT P or Q, so let's write out the four possible values for P 85 00:04:11,750 --> 00:04:13,900 and Q. These are the four possible environments, 86 00:04:13,900 --> 00:04:17,790 one per row, and there is the formula whose value 87 00:04:17,790 --> 00:04:19,029 I'm trying to compute. 88 00:04:19,029 --> 00:04:21,055 So we do it in the usual way. 89 00:04:21,055 --> 00:04:23,180 I'm not going to repeat the truth values of P and Q 90 00:04:23,180 --> 00:04:24,555 because they're given here, but I 91 00:04:24,555 --> 00:04:27,820 can fill in the values of the OR because I know the truth 92 00:04:27,820 --> 00:04:30,750 table for OR and I discovered that the first three 93 00:04:30,750 --> 00:04:34,224 rows or subformula is true and the last place is false 94 00:04:34,224 --> 00:04:35,390 when both of them are false. 95 00:04:35,390 --> 00:04:37,820 And that means that I know the value of the whole formula, 96 00:04:37,820 --> 00:04:39,120 which is the negation. 97 00:04:39,120 --> 00:04:42,520 The NOT of all those values just flips the trues and falses, 98 00:04:42,520 --> 00:04:46,760 and that is the final truth values for this NOT P 99 00:04:46,760 --> 00:04:49,900 or Q in all possible environments. 100 00:04:49,900 --> 00:04:52,820 Let's do the same thing for NOT P and NOT Q. Well, this time, 101 00:04:52,820 --> 00:04:55,360 I'll fill in the values of NOT P and NOT Q 102 00:04:55,360 --> 00:04:57,680 because they're not just repeated there. 103 00:04:57,680 --> 00:05:00,600 So the NOT P is the flip of the P column 104 00:05:00,600 --> 00:05:04,170 and the NOT Q is the flip of the Q column, 105 00:05:04,170 --> 00:05:07,320 and now I can fill in the values at the AND. 106 00:05:07,320 --> 00:05:09,801 And of course, the AND is going to be true only 107 00:05:09,801 --> 00:05:11,550 when they're both true, and otherwise it's 108 00:05:11,550 --> 00:05:15,350 going to be false, and look what I've got. 109 00:05:15,350 --> 00:05:20,240 The possible truth values of the first formula, NOT P or Q, 110 00:05:20,240 --> 00:05:23,970 in all possible environments is exactly the same 111 00:05:23,970 --> 00:05:27,450 as the possible truth values of NOT P 112 00:05:27,450 --> 00:05:30,180 and NOT Q in those corresponding environments. 113 00:05:30,180 --> 00:05:32,070 The columns are the same, and that 114 00:05:32,070 --> 00:05:34,600 means these two formulas are equivalent, 115 00:05:34,600 --> 00:05:36,940 which is the proof by truth table. 116 00:05:36,940 --> 00:05:39,010 We've just examined all possible environments 117 00:05:39,010 --> 00:05:44,010 and verified that, in fact, they get the same truth value. 118 00:05:44,010 --> 00:05:46,040 This brings us to a useful other connective 119 00:05:46,040 --> 00:05:48,480 we haven't talked about yet called if and only if. 120 00:05:48,480 --> 00:05:51,320 So the value of the if and only if connective 121 00:05:51,320 --> 00:05:54,697 P if and only if Q is true, if and only if P and Q 122 00:05:54,697 --> 00:05:55,780 have the same truth value. 123 00:05:55,780 --> 00:05:58,071 Now, this if and only if is an English word that you're 124 00:05:58,071 --> 00:06:02,330 supposed to understand what it means, and that if 125 00:06:02,330 --> 00:06:04,030 and only if is an operator on truth 126 00:06:04,030 --> 00:06:05,666 values that we're defining. 127 00:06:05,666 --> 00:06:07,790 Since the English now can be confusing between that 128 00:06:07,790 --> 00:06:09,740 if and only if and this if and only if, let's 129 00:06:09,740 --> 00:06:12,560 disambiguate by having the truth table for if and only if. 130 00:06:12,560 --> 00:06:13,420 Here it is. 131 00:06:13,420 --> 00:06:16,640 What it says is that when both of them are true-- 132 00:06:16,640 --> 00:06:19,370 P if and only if Q is true-- and both of them are false-- 133 00:06:19,370 --> 00:06:23,170 P if and only if Q is true-- and otherwise, it's false. 134 00:06:23,170 --> 00:06:29,390 You can check that P if and only if is true 135 00:06:29,390 --> 00:06:35,400 exactly when the complement of P XOR Q is true. 136 00:06:35,400 --> 00:06:40,030 So now we come to two crucial properties of formulas called 137 00:06:40,030 --> 00:06:44,770 satisfiability and validity, and let's examine what those are. 138 00:06:44,770 --> 00:06:47,710 So a formula is satisfiable if and only 139 00:06:47,710 --> 00:06:49,860 if it's true in some environment. 140 00:06:49,860 --> 00:06:51,950 That is, it's satisfiable if there is some way 141 00:06:51,950 --> 00:06:54,730 to set the values of the variables P and Q 142 00:06:54,730 --> 00:06:57,956 to be truth values in such a way that the formula comes out 143 00:06:57,956 --> 00:06:59,440 to be true. 144 00:06:59,440 --> 00:07:02,620 And a related idea is that a formula is valid-- 145 00:07:02,620 --> 00:07:04,890 it's also called a tautology-- if 146 00:07:04,890 --> 00:07:07,080 and only if it's true in all environments. 147 00:07:07,080 --> 00:07:09,460 No matter what you set the variables to, 148 00:07:09,460 --> 00:07:10,930 it's going to come out to be true. 149 00:07:10,930 --> 00:07:14,460 Let's look at some examples to solidify those two concepts. 150 00:07:14,460 --> 00:07:17,510 So the formula P all by itself is satisfiable 151 00:07:17,510 --> 00:07:20,180 because it can be true if P is true, 152 00:07:20,180 --> 00:07:23,020 but it's not always true because P might be false. 153 00:07:23,020 --> 00:07:25,680 Symmetrically, NOT P is also satisfiable 154 00:07:25,680 --> 00:07:27,900 because it could be true if P is false, 155 00:07:27,900 --> 00:07:29,920 but it's not always true. 156 00:07:29,920 --> 00:07:33,070 It's not valid because P might be true, in which case 157 00:07:33,070 --> 00:07:36,400 not P would be false. 158 00:07:36,400 --> 00:07:38,070 A formula that's not satisfiable, 159 00:07:38,070 --> 00:07:42,430 a formula that means that there is no truth value that 160 00:07:42,430 --> 00:07:44,890 makes it true, which is the same as saying that it's always 161 00:07:44,890 --> 00:07:48,420 false, is the formula P and NOT P. 162 00:07:48,420 --> 00:07:51,040 It's probably the simplest not satisfiable formula 163 00:07:51,040 --> 00:07:53,230 or unsatisfiable formula. 164 00:07:53,230 --> 00:07:57,140 So this is clearly false because either P or NOT 165 00:07:57,140 --> 00:07:59,190 P has got to be false and the [? and ?] will then 166 00:07:59,190 --> 00:08:00,960 will definitely come out to be false. 167 00:08:00,960 --> 00:08:04,540 There is no value for P that makes this formula true. 168 00:08:04,540 --> 00:08:06,320 It's unsatisfiable. 169 00:08:06,320 --> 00:08:08,950 A valid formula, actually by De Morgan's law 170 00:08:08,950 --> 00:08:13,250 applied to the P and NOT P, is P or NOT 171 00:08:13,250 --> 00:08:17,630 P is going to be valid because no matter what truth value 172 00:08:17,630 --> 00:08:19,030 P has, it comes out to be true. 173 00:08:19,030 --> 00:08:21,580 That is, one of P and NOT P is true, 174 00:08:21,580 --> 00:08:24,580 and therefore the OR is going to get at least one true, exactly 175 00:08:24,580 --> 00:08:27,590 one true really, and going to come out 176 00:08:27,590 --> 00:08:29,030 to be true, so this is valid. 177 00:08:29,030 --> 00:08:31,750 178 00:08:31,750 --> 00:08:34,730 Now we can connect up the pieces that we've just 179 00:08:34,730 --> 00:08:38,440 set up of relating validity and equivalence. 180 00:08:38,440 --> 00:08:40,909 Two formulas G and H are equivalent 181 00:08:40,909 --> 00:08:45,370 is the same as saying that G if and only if H is valid. 182 00:08:45,370 --> 00:08:47,770 So G if and only if H comes out to be true 183 00:08:47,770 --> 00:08:50,640 when G and H have the same truth value, 184 00:08:50,640 --> 00:08:54,200 and G and H are equivalent says that they 185 00:08:54,200 --> 00:08:57,680 do have the same truth value no matter what the environment is. 186 00:08:57,680 --> 00:09:00,200 So that's the same as saying that if G and H are equivalent 187 00:09:00,200 --> 00:09:02,350 no matter what the environment is, G if and only 188 00:09:02,350 --> 00:09:04,200 if H comes out to be true. 189 00:09:04,200 --> 00:09:07,780 So if G and H are equivalent, G if and only if H is valid, 190 00:09:07,780 --> 00:09:11,150 and the converse argument works the same way. 191 00:09:11,150 --> 00:09:14,970 A very important problem that comes up in multiple ways, 192 00:09:14,970 --> 00:09:18,180 and we're going to examine some of them later, 193 00:09:18,180 --> 00:09:22,670 is the problem of whether or not a formula is valid-- 194 00:09:22,670 --> 00:09:25,530 proving that it's valid-- and checking whether or not 195 00:09:25,530 --> 00:09:28,230 a formula is satisfiable. 196 00:09:28,230 --> 00:09:30,160 Now, there's a simple way to do that. 197 00:09:30,160 --> 00:09:31,600 The truth table tells it to you. 198 00:09:31,600 --> 00:09:34,170 If you want to know whether the formula is satisfiable, 199 00:09:34,170 --> 00:09:36,340 you just look at its truth table. 200 00:09:36,340 --> 00:09:39,150 Try every possible environment and see whether one of them 201 00:09:39,150 --> 00:09:41,260 yields the value of true. 202 00:09:41,260 --> 00:09:43,520 The problem with that approach, which theoretically 203 00:09:43,520 --> 00:09:47,430 is sound, but pragmatically the truth table size 204 00:09:47,430 --> 00:09:50,550 doubles with each additional variable. 205 00:09:50,550 --> 00:09:52,317 So with two variables, you got four rows. 206 00:09:52,317 --> 00:09:54,025 With three variables, you got eight rows. 207 00:09:54,025 --> 00:09:56,860 With four variables, you got 16 rows, 208 00:09:56,860 --> 00:09:59,550 and this very rapidly gets out of hand 209 00:09:59,550 --> 00:10:02,120 once the number of variables gets to be moderate sized. 210 00:10:02,120 --> 00:10:03,370 This is exponential growth. 211 00:10:03,370 --> 00:10:05,660 It's doubling each time you add a variable. 212 00:10:05,660 --> 00:10:08,399 So really when you start having hundreds of variables, 213 00:10:08,399 --> 00:10:09,940 truth tables are out of the question. 214 00:10:09,940 --> 00:10:11,231 You just can't write them down. 215 00:10:11,231 --> 00:10:15,150 They're too big, and in fact, in modern digital circuits 216 00:10:15,150 --> 00:10:17,800 when you think back about how we designed the [? adder, ?] 217 00:10:17,800 --> 00:10:20,290 if you look at what's going on in all those different wires 218 00:10:20,290 --> 00:10:24,730 corresponding to different variables, 219 00:10:24,730 --> 00:10:28,090 there are millions of those in a typical modern digital circuit. 220 00:10:28,090 --> 00:10:32,090 So truth table approach is just not going to be workable. 221 00:10:32,090 --> 00:10:32,700 It's hopeless. 222 00:10:32,700 --> 00:10:35,760 223 00:10:35,760 --> 00:10:40,470 Now, one of the central problems in theoretical computer science 224 00:10:40,470 --> 00:10:42,290 is the question of whether or not 225 00:10:42,290 --> 00:10:44,340 there is a way to test for SAT that's 226 00:10:44,340 --> 00:10:46,960 more efficient than this impossible way of trying 227 00:10:46,960 --> 00:10:49,590 to come up with a truth table that's too big to fit 228 00:10:49,590 --> 00:10:51,890 in the universe when there are hundreds 229 00:10:51,890 --> 00:10:53,582 and thousands of variables. 230 00:10:53,582 --> 00:10:55,040 So we're interested in the question 231 00:10:55,040 --> 00:10:56,760 of is there some other way to check 232 00:10:56,760 --> 00:11:00,310 a formula for satisfiability than exhaustively trying 233 00:11:00,310 --> 00:11:04,150 to generate its whole truth table to see if some row yields 234 00:11:04,150 --> 00:11:05,190 true? 235 00:11:05,190 --> 00:11:09,790 This is the abstract version of the P equals NP problem. 236 00:11:09,790 --> 00:11:11,950 So we've talked about this before. 237 00:11:11,950 --> 00:11:14,240 P equals NP? 238 00:11:14,240 --> 00:11:19,240 is considered to be the most important open problem 239 00:11:19,240 --> 00:11:22,810 in theoretical computer science, in the theory of computation, 240 00:11:22,810 --> 00:11:29,250 and it's simply saying, is there some fast or efficient way 241 00:11:29,250 --> 00:11:32,280 to tell whether or not a formula is satisfiable 242 00:11:32,280 --> 00:11:34,630 that's more efficient than the truth table approach? 243 00:11:34,630 --> 00:11:36,800 Efficiency has a technical definition, 244 00:11:36,800 --> 00:11:38,290 which is that the number of steps 245 00:11:38,290 --> 00:11:41,910 grows much less than exponentially 246 00:11:41,910 --> 00:11:43,580 with the number of variables. 247 00:11:43,580 --> 00:11:47,440 Let's say it grows polynomial, like a quadratic or a cubic, 248 00:11:47,440 --> 00:11:51,030 but you're trying to beat exponential growth, which 249 00:11:51,030 --> 00:11:54,070 is what ruins things quickly. 250 00:11:54,070 --> 00:11:56,130 And this is an open problem. 251 00:11:56,130 --> 00:11:59,380 No one knows if there is some fast way 252 00:11:59,380 --> 00:12:01,910 to check for satisfiability or SAT. 253 00:12:01,910 --> 00:12:03,650 That's the SAT problem. 254 00:12:03,650 --> 00:12:05,930 By the way, very closely related to SAT 255 00:12:05,930 --> 00:12:09,830 is validity because if I wanted to know whether a formula 256 00:12:09,830 --> 00:12:14,750 G is valid-- well, valid means that it's always true. 257 00:12:14,750 --> 00:12:18,780 That means its complement, not G, is always false, 258 00:12:18,780 --> 00:12:22,480 which is the same as saying its complement is not satisfiable. 259 00:12:22,480 --> 00:12:25,040 So to check that G is valid, all I need to do 260 00:12:25,040 --> 00:12:31,410 is check that not G is satisfiable and vice versa, 261 00:12:31,410 --> 00:12:35,090 not G is not satisfiable, if and only if G is valid. 262 00:12:35,090 --> 00:12:38,780 So the point is that SAT and valid stand and fall together. 263 00:12:38,780 --> 00:12:42,070 If you had a fast way to do one, you would very quickly 264 00:12:42,070 --> 00:12:43,990 get a fast way to do the other one. 265 00:12:43,990 --> 00:12:47,810 So checking for one is just as difficult as checking 266 00:12:47,810 --> 00:12:51,050 for the other, and we're going to examine 267 00:12:51,050 --> 00:12:55,070 in subsequent lectures why it is that this problem is 268 00:12:55,070 --> 00:12:56,850 so important.