The last discrete random variable that we will discuss is the so-called geometric random variable. It shows up in the context of the following experiment. We have a coin and we toss it infinitely many times and independently. And at each coin toss we have a fixed probability of heads, which is some given number, p. This is a parameter that specifies the experiment.

When we say that the infinitely many tosses are independent, what we mean in a mathematical and formal sense is that any finite subset of those tosses is independent. I'm only making this comment because we introduced a definition of independence for finitely many events, but never defined the notion of independence of infinitely many events.

The sample space for this experiment is the set of infinite sequences of heads and tails. So a typical outcome of this experiment might look like this. It's a sequence of heads and tails in some arbitrary order. And of course, it's an infinite sequence, so it continues forever. But I'm only showing you here the beginning of that sequence.
We're interested in the following random variable, X, which is the number of tosses until the first heads. So if our sequence looked like this, our random variable would take a value of 5.

A random variable of this kind appears in many applications and many real-world contexts. In general, it models situations where we're waiting for something to happen. Suppose that we keep carrying out trials, and each trial can result either in success or failure. And we're counting the number of trials it takes until a success is observed for the first time. Now, these trials could be experiments of some kind, could be processes of some kind, or they could be whether a customer shows up in a store in a particular second or not. So there are many diverse interpretations of the word trial and of the word success that would allow us to apply this particular model to a given situation.

Now, let us move to the calculation of the PMF of this random variable. By definition, what we need to calculate is the probability that the random variable takes on a particular numerical value. What does it mean for X to be equal to k?
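The waiting-time experiment described above is easy to simulate. Here is a minimal sketch in Python (not from the lecture; the function name is my own), which tosses a biased coin until the first heads and counts the tosses:

```python
import random

def tosses_until_first_heads(p, rng):
    """Toss a biased coin with P(heads) = p until the first heads appears;
    return how many tosses that took, counting the heads toss itself."""
    count = 1
    while rng.random() >= p:   # rng.random() < p models "heads"
        count += 1
    return count

rng = random.Random(0)
draws = [tosses_until_first_heads(1/3, rng) for _ in range(100_000)]
# For a geometric random variable the mean is 1/p, so with p = 1/3
# the empirical average should come out close to 3.
print(sum(draws) / len(draws))
```

Repeated runs give the values 1, 2, 3, and so on, with the smaller values occurring most often, matching the shape of the PMF derived next.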
What it means is that the first heads was observed in the k-th trial, which means that the first k minus 1 trials were tails, followed by heads in the k-th trial. This is an event that only concerns the first k trials, and its probability can be calculated using the fact that different coin tosses, or different trials, are independent. It is the probability of tails in the first coin toss, times the probability of tails in the second coin toss, and so on, k minus 1 times. So we get an exponent of k minus 1 here, times the probability of heads in the k-th coin toss: the probability that X equals k is (1 − p)^(k−1) times p.

So this is the form of the PMF of this particular random variable, and that formula applies for the possible values of k, which are the positive integers, because the time of the first heads can only be a positive integer. And any positive integer is possible, so our random variable takes values in a discrete but infinite set.

The geometric PMF has a shape of this type. Here we see the plot for the case where p equals 1/3. The probability that the first heads shows up in the first trial is equal to p, the probability of heads.
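The PMF just derived can be written directly in code. This is a sketch of my own, assuming the formula (1 − p)^(k−1) · p from the derivation above:

```python
def geometric_pmf(k, p):
    """P(X = k) for a geometric random variable with success probability p:
    k - 1 independent tails, each with probability 1 - p, then one heads."""
    if k < 1:
        return 0.0          # only positive integers are possible values
    return (1 - p) ** (k - 1) * p

p = 1/3
print(geometric_pmf(1, p))   # first heads on toss 1: just p
print(geometric_pmf(5, p))   # four tails then heads: (2/3)**4 * (1/3)
```

Evaluating it at k = 1, 2, 3, … for p = 1/3 reproduces the decaying bar plot described in the lecture.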
The probability that it shows up in the next trial, that the first heads appears in the second trial, is the probability that we had heads following a tail. So we have the probability of a tail, times the probability of a head. And then each time that we move to a further entry, we multiply by a further factor of 1 minus p.

Finally, one little technical remark. There's a possible and rather annoying outcome of this experiment, which would be that we observe a sequence of tails forever and no heads. In that case, our random variable is not well-defined, because there is no first heads to consider. You might say that in this case our random variable takes a value of infinity, but we would rather not have to deal with random variables that could be infinite. Fortunately, it turns out that this particular event has 0 probability of occurring, which I will now try to show.

So this is the event that we always see tails. Let us compare it with the event where we see tails in the first k trials. How do these two events relate? If we always have tails, then in particular we have tails in the first k trials. So this event implies that event.
This event is smaller than that event. So the probability of this event is less than or equal to the probability of that second event. And the probability of that second event is (1 − p) to the k.

Now, this is true no matter what k we choose. And by taking k arbitrarily large, this number becomes arbitrarily small. Why does it become arbitrarily small? Well, we're assuming that p is positive, so 1 minus p is a number less than 1. And when we multiply a number strictly less than 1 by itself over and over, we get arbitrarily small numbers. So the probability of never seeing a head is less than or equal to an arbitrarily small positive number. The only possibility, then, is that it is equal to 0.

So the probability of not ever seeing any heads is equal to 0, and this means that we can ignore this particular outcome. And as a side consequence of this, the sum of the probabilities of the different possible values of k is going to be equal to 1, because we're certain that the random variable is going to take a finite value.
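The bound used in this argument is easy to evaluate numerically. As a quick illustration of my own (not part of the lecture), here is (1 − p)^k, the probability of tails in the first k tosses, for growing k with p = 1/3:

```python
p = 1/3
for k in (10, 100, 1000):
    bound = (1 - p) ** k   # P(tails in each of the first k tosses)
    print(k, bound)
# Each larger k gives a smaller bound, and the bound can be driven
# below any positive number, which is what forces
# P(tails forever) to equal 0.
```
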
And so when we sum the probabilities of all the possible finite values, that sum will have to be equal to 1. And indeed, you can use the formula for the geometric series to verify that the sum of these numbers, when you add over all values of k, is equal to 1.
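That normalization can also be checked numerically. A small sketch of my own, summing p(1 − p)^(k−1) over many values of k for p = 1/3:

```python
p = 1/3
# Partial sum of the geometric PMF over k = 1, ..., 199.
# Analytically this equals 1 - (1 - p)**199, which is already
# indistinguishable from 1 in double precision.
partial = sum((1 - p) ** (k - 1) * p for k in range(1, 200))
print(partial)
```

The geometric-series formula gives the same conclusion exactly: the infinite sum is p · 1/(1 − (1 − p)) = 1.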