1 00:00:00,660 --> 00:00:03,840 The law of total probability is another probability law 2 00:00:03,840 --> 00:00:06,220 that gives you a way to reason about cases, which we've 3 00:00:06,220 --> 00:00:09,120 seen is a fundamental technique for dealing 4 00:00:09,120 --> 00:00:10,890 with all sorts of problems. 5 00:00:10,890 --> 00:00:13,542 So the point of cases, of course, 6 00:00:13,542 --> 00:00:15,250 is that you can prove a complicated thing 7 00:00:15,250 --> 00:00:19,550 by breaking it up into, if you're lucky, easy sub cases. 8 00:00:19,550 --> 00:00:22,810 So here's the way to understand the law of total probability 9 00:00:22,810 --> 00:00:23,330 abstractly. 10 00:00:23,330 --> 00:00:25,370 It starts off with set theoretic reasonings. 11 00:00:25,370 --> 00:00:29,610 Suppose that I have a set A embedded in some larger sample 12 00:00:29,610 --> 00:00:33,170 space S. So A is really an event, 13 00:00:33,170 --> 00:00:35,200 but we're just going to think of it as a set. 14 00:00:35,200 --> 00:00:37,670 Now suppose that I have three sets, 15 00:00:37,670 --> 00:00:41,820 B1, B2, and B3 that partition the sample space. 16 00:00:41,820 --> 00:00:44,985 That is B1, B2, and B3 three don't overlap. 17 00:00:44,985 --> 00:00:47,070 They're disjoined, and everything 18 00:00:47,070 --> 00:00:48,480 is in one of those three sets. 19 00:00:48,480 --> 00:00:52,020 So there's a picture of B1, B2, and B3, cutting up 20 00:00:52,020 --> 00:00:54,245 the whole sample space S, represented 21 00:00:54,245 --> 00:00:59,440 by the square or rectangle. 22 00:00:59,440 --> 00:01:01,590 Now of course, these three sets that 23 00:01:01,590 --> 00:01:03,250 cut up the whole space, willy nilly 24 00:01:03,250 --> 00:01:05,360 cut up the set A into three pieces. 25 00:01:05,360 --> 00:01:09,450 The first piece is the points in A that are in B1. 26 00:01:09,450 --> 00:01:12,720 The second piece is the point in A that are in B2. 27 00:01:12,720 --> 00:01:16,800 And the third is the points in A that are in B3. 28 00:01:16,800 --> 00:01:20,690 So that's why we a basic set theoretic 29 00:01:20,690 --> 00:01:24,350 identity that says that as long as B1, B2, and B3 30 00:01:24,350 --> 00:01:28,340 have the property, that they're union is in the universe. 31 00:01:28,340 --> 00:01:29,770 Everything. 32 00:01:29,770 --> 00:01:34,310 And they are pairwise disjoined, then 33 00:01:34,310 --> 00:01:39,120 any set A is equal to the union of the part of A 34 00:01:39,120 --> 00:01:41,070 that's in B1, the part of A that's in B2, 35 00:01:41,070 --> 00:01:42,790 the part of A that's in B3. 36 00:01:42,790 --> 00:01:47,090 And this is a disjoint union, because the B's don't overlap. 37 00:01:47,090 --> 00:01:50,760 That means that if I was talking about cardinality, 38 00:01:50,760 --> 00:01:52,110 I could add them up. 39 00:01:52,110 --> 00:01:54,880 But in terms of probability, I can apply the sum rule 40 00:01:54,880 --> 00:01:58,390 for probabilities and discover that the probability of A 41 00:01:58,390 --> 00:02:02,570 is simply the probability of B1 intersection A, B2 intersection 42 00:02:02,570 --> 00:02:04,830 A, B3 intersection A. 43 00:02:04,830 --> 00:02:08,389 Now the most useful form of the law of total probabilities 44 00:02:08,389 --> 00:02:12,020 is when you replace this intersection, B1 intersection 45 00:02:12,020 --> 00:02:16,250 A, by the conditional probability using the product 46 00:02:16,250 --> 00:02:20,540 rule-- so let's replace it by the probability of A given 47 00:02:20,540 --> 00:02:23,730 B1 times the probability of B1. 48 00:02:23,730 --> 00:02:28,646 That's another formula for B1 intersection A. 49 00:02:28,646 --> 00:02:30,770 And if I do that with the rest of them, 50 00:02:30,770 --> 00:02:34,740 I now have the law of total probability stated 51 00:02:34,740 --> 00:02:37,520 in the usual way in terms of conditional probabilities 52 00:02:37,520 --> 00:02:38,770 where it's most useful. 53 00:02:38,770 --> 00:02:40,550 Now I did it for three sets. 54 00:02:40,550 --> 00:02:43,830 But it obviously works for any finite number of sets. 55 00:02:43,830 --> 00:02:46,830 As a matter of fact, it works fine for any countable union 56 00:02:46,830 --> 00:02:47,330 of sets. 57 00:02:47,330 --> 00:02:50,840 If I have a partition of the sample space S 58 00:02:50,840 --> 00:02:53,870 in to B0, B1, and so on-- a partition 59 00:02:53,870 --> 00:02:56,020 with a countable number of blocks-- then 60 00:02:56,020 --> 00:02:58,490 it's still the case that the probability of A 61 00:02:58,490 --> 00:03:01,170 is equal by the sum rule to the probability 62 00:03:01,170 --> 00:03:03,464 of these disjoint pieces, the parts of A 63 00:03:03,464 --> 00:03:05,130 that are in each of the different blocks 64 00:03:05,130 --> 00:03:06,280 of the partition. 65 00:03:06,280 --> 00:03:09,600 And reformulating that as a conditional probability, 66 00:03:09,600 --> 00:03:11,570 I get the rule that the probability of A 67 00:03:11,570 --> 00:03:16,060 is the sum over all possible i of the probability of A given 68 00:03:16,060 --> 00:03:19,250 Bi times the probability of Bi. 69 00:03:19,250 --> 00:03:20,800 And that basic rule is one and we're 70 00:03:20,800 --> 00:03:22,770 going to get a lot of mileage out of when 71 00:03:22,770 --> 00:03:27,500 we turn in the next segment to analyze and understand 72 00:03:27,500 --> 00:03:31,892 the results of tests that are maybe unreliable.