We saw in the previous video that the outcome of a logistic regression model is a probability. Often, we want to make an actual prediction: should we predict 1 for poor care, or should we predict 0 for good care?

We can convert the probabilities to predictions using what's called a threshold value, t. If the probability of poor care is greater than this threshold value, t, we predict poor quality care. But if the probability of poor care is less than the threshold value, t, then we predict good quality care.

But what value should we pick for the threshold, t? The threshold value, t, is often selected based on which type of error is preferable. You might be thinking that making no errors is better, which is, of course, true. But it's rare to have a model that predicts perfectly, so you're bound to make some errors. There are two types of errors that a model can make: ones where you predict 1, or poor care, but the actual outcome is 0, and ones where you predict 0, or good care, but the actual outcome is 1.

If we pick a large threshold value t, then we will predict poor care rarely, since the probability of poor care has to be really large to exceed the threshold. This means that we will make more errors where we say good care, but it's actually poor care. This approach would detect the patients receiving the worst care and prioritize them for intervention.

On the other hand, if the threshold value, t, is small, we predict poor care frequently and good care rarely. This means that we will make more errors where we say poor care, but it's actually good care. This approach would detect all patients who might be receiving poor care.

Decision-makers often have a preference for one type of error over the other, which should influence the threshold value they pick. If there's no preference between the errors, the right threshold to select is t = 0.5, since it just predicts the more likely outcome.
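As a quick sketch of this rule in R (the probabilities below are made-up numbers for illustration; later in this video the real predicted probabilities are stored in predictTrain):

    probs <- c(0.15, 0.55, 0.80, 0.30)   # hypothetical predicted probabilities
    t <- 0.5                             # threshold value
    as.integer(probs > t)                # 1 = poor care, 0 = good care; gives 0 1 1 0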
To make this discussion a little more quantitative, we use what's called a confusion matrix, or classification matrix. This compares the actual outcomes to the predicted outcomes. The rows are labeled with the actual outcome, and the columns are labeled with the predicted outcome. Each entry of the table gives the number of observations that fall into that category.

So the number of true negatives, or TN, is the number of observations that are actually good care and for which we predict good care. The number of true positives, or TP, is the number of observations that are actually poor care and for which we predict poor care. These are the two types that we get correct. The false positives, or FP, are the data points for which we predict poor care, but the actual outcome is good care. And the false negatives, or FN, are the data points for which we predict good care, but the actual outcome is poor care.

We can compute two outcome measures that help us determine what types of errors we are making. They're called sensitivity and specificity. Sensitivity is equal to the true positives divided by the true positives plus the false negatives, TP / (TP + FN), and measures the percentage of actual poor care cases that we classify correctly. This is often called the true positive rate. Specificity is equal to the true negatives divided by the true negatives plus the false positives, TN / (TN + FP), and measures the percentage of actual good care cases that we classify correctly. This is often called the true negative rate.

A model with a higher threshold will have a lower sensitivity and a higher specificity. A model with a lower threshold will have a higher sensitivity and a lower specificity.
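As a sketch, these two formulas can be written as small helper functions in R. The names sens and spec are our own, not built-in functions, and they assume a 2-by-2 table with rows labeled 0 and 1 (the actual outcome) and columns FALSE and TRUE (the predicted outcome), like the tables we build below:

    # tab[2, 2] is TP (actual 1, predicted TRUE); tab[1, 1] is TN (actual 0, predicted FALSE)
    sens <- function(tab) tab[2, 2] / sum(tab[2, ])   # TP / (TP + FN), the true positive rate
    spec <- function(tab) tab[1, 1] / sum(tab[1, ])   # TN / (TN + FP), the true negative rate
    # Note: this assumes both FALSE and TRUE occur among the predictions,
    # so that the table really has two columns.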
Let's compute some confusion matrices in R using different threshold values. In our R console, let's make some classification tables using the table function. First, we'll use a threshold value of 0.5. So type table, and then the first argument, or what we want to label the rows by, should be the true outcome, which is qualityTrain$PoorCare. And then the second argument, or what we want to label the columns by, will be predictTrain, our predictions from the previous video, greater than 0.5. This comparison returns TRUE if our prediction is greater than 0.5, which means we want to predict poor care, and it returns FALSE if our prediction is less than 0.5, which means we want to predict good care.

If you hit Enter, we get a table where the rows are labeled by 0 or 1, the true outcome, and the columns are labeled by FALSE or TRUE, our predicted outcome. So you can see here that for 70 cases, we predict good care and they actually received good care, and for 10 cases, we predict poor care and they actually received poor care. We make 4 mistakes where we say poor care and it's actually good care, and we make 15 mistakes where we say good care but it's actually poor care.

Let's compute the sensitivity, or the true positive rate, and the specificity, or the true negative rate. The sensitivity here would be 10, our true positives, divided by 25, the total number of positive cases. So we have a sensitivity of 0.4. Our specificity here would be 70, the true negative cases, divided by 74, the total number of negative cases. So our specificity here is about 0.95.

Now, let's try increasing the threshold. Use the up arrow to get back to the table command, and change the threshold value to 0.7. Now, if we compute our sensitivity, we get a sensitivity of 8 divided by 25, which is 0.32. And if we compute our specificity, we get a specificity of 73 divided by 74, which is about 0.99. So by increasing the threshold, our sensitivity went down and our specificity went up.

Now, let's try decreasing the threshold. Hit the up arrow again to get to the table function, and change the threshold value to 0.2. Now, if we compute our sensitivity, it's 16 divided by 25, or 0.64. And if we compute our specificity, it's 54 divided by 74, or about 0.73. So with the lower threshold, our sensitivity went up, and our specificity went down.
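Putting the three thresholds side by side, the commands and the measures we just computed look like this (predictTrain holds the training-set probabilities from the previous video):

    table(qualityTrain$PoorCare, predictTrain > 0.5)   # sensitivity 10/25 = 0.40, specificity 70/74, about 0.95
    table(qualityTrain$PoorCare, predictTrain > 0.7)   # sensitivity  8/25 = 0.32, specificity 73/74, about 0.99
    table(qualityTrain$PoorCare, predictTrain > 0.2)   # sensitivity 16/25 = 0.64, specificity 54/74, about 0.73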
But which threshold should we pick? Maybe 0.4 is better, or 0.6. How do we decide? In the next video, we'll see a nice visualization to help us select a threshold.