1
00:00:04,500 --> 00:00:08,920
Let us examine how to interpret
the model we developed.

2
00:00:08,920 --> 00:00:11,450
One of the things
we should look out for

3
00:00:11,450 --> 00:00:14,770
is that there might be what
is called multicollinearity.

4
00:00:14,770 --> 00:00:18,240
Multicollinearity occurs
when the various independent

5
00:00:18,240 --> 00:00:21,280
variables are
correlated, and this

6
00:00:21,280 --> 00:00:26,510
might confuse the coefficients--
the betas-- in the model.

7
00:00:26,510 --> 00:00:32,360
So tests to address that involve
checking the correlations

8
00:00:32,360 --> 00:00:35,200
of independent variables.

9
00:00:35,200 --> 00:00:38,250
If they are
excessively high, this

10
00:00:38,250 --> 00:00:40,700
would mean that there
might be multicollinearity,

11
00:00:40,700 --> 00:00:44,230
and you have to potentially
revisit the model,

12
00:00:44,230 --> 00:00:48,380
as well as check whether the signs
of the coefficients make sense.

13
00:00:48,380 --> 00:00:52,720
Is the coefficient beta
positive or negative?

14
00:00:52,720 --> 00:00:58,250
If it agrees with intuition,
then multicollinearity

15
00:00:58,250 --> 00:01:02,830
has likely not been a problem,
but if intuition suggests

16
00:01:02,830 --> 00:01:09,400
a different sign, this might
be a sign of multicollinearity.

17
00:01:09,400 --> 00:01:13,840
The next important
element is significance.

18
00:01:13,840 --> 00:01:16,180
So how do we
interpret the results,

19
00:01:16,180 --> 00:01:23,110
and how do we understand whether
we have a good model or not?

20
00:01:23,110 --> 00:01:26,130
For that purpose,
let's take a look

21
00:01:26,130 --> 00:01:30,490
at what is called Area Under
the Curve, or AUC for short.

22
00:01:30,490 --> 00:01:36,150
So the Area Under the Curve
shows an absolute measure

23
00:01:36,150 --> 00:01:41,450
of quality of prediction-- in
this particular case, 77.5%,

24
00:01:41,450 --> 00:01:47,470
which means that, given that
the perfect score is 100%,

25
00:01:47,470 --> 00:01:52,920
this is like a B,
whereas, as we'll see later,

26
00:01:52,920 --> 00:01:56,100
a 50% score, which
is pure guessing,

27
00:01:56,100 --> 00:02:05,090
is a 50% rate of success.

28
00:02:05,090 --> 00:02:10,090
So the area under the curve
gives an absolute measure

29
00:02:10,090 --> 00:02:16,090
of quality, and it's less
affected by various benchmarks.

30
00:02:19,560 --> 00:02:22,870
So it illustrates how
accurate the model

31
00:02:22,870 --> 00:02:26,790
is in a more absolute sense.

32
00:02:26,790 --> 00:02:28,840
So what is a good AUC?

33
00:02:28,840 --> 00:02:33,100
The area on the right
shows the maximum possible area

34
00:02:33,100 --> 00:02:41,480
of a perfect prediction,
whereas the area under this

35
00:02:41,480 --> 00:02:46,970
curve now-- it is 0.5,
and it's pure guessing.

36
00:02:46,970 --> 00:02:52,390
Another outcome measure that
is important for us to discuss

37
00:02:52,390 --> 00:02:55,510
is the so-called
confusion matrix.

38
00:02:55,510 --> 00:03:01,430
So the matrix here shows formulas
for the various terms we use.

39
00:03:04,870 --> 00:03:09,470
The actual class = 0
means, in our example,

40
00:03:09,470 --> 00:03:12,430
good quality of care,
and actual class = 1

41
00:03:12,430 --> 00:03:17,630
means poor quality of care,
whereas the predicted class =

42
00:03:17,630 --> 00:03:20,420
0 means that we will
predict good quality,

43
00:03:20,420 --> 00:03:25,810
and the predicted class = 1 means
that we predict poor quality.

44
00:03:25,810 --> 00:03:29,210
So we define true
negatives, TN for short.

45
00:03:29,210 --> 00:03:32,470
False positives, FP for short.

46
00:03:32,470 --> 00:03:36,660
False negatives, FN, and
true positives, TP.

47
00:03:36,660 --> 00:03:38,990
So if N is the number
of observations,

48
00:03:38,990 --> 00:03:42,420
the overall accuracy
is basically

49
00:03:42,420 --> 00:03:47,079
the number of true negatives
plus true positives, divided by N.
and true positives divided by N.

50
00:03:47,079 --> 00:03:51,820
It's basically the terms
in the diagonal of this two

51
00:03:51,820 --> 00:03:54,730
by two matrix divided by
the total observations.

52
00:03:54,730 --> 00:03:59,829
The overall error rate is
the terms off-diagonal--

53
00:03:59,829 --> 00:04:02,720
the false positives, plus
the false negatives, divided

54
00:04:02,720 --> 00:04:04,790
by the total number
of observations.

55
00:04:04,790 --> 00:04:10,370
That's the overall
measure of an error rate.

56
00:04:10,370 --> 00:04:14,050
An important component is
the so-called sensitivity,

57
00:04:14,050 --> 00:04:19,079
and sensitivity is TP, the
true positives, whenever

58
00:04:19,079 --> 00:04:22,650
we predict poor
quality, and indeed it

59
00:04:22,650 --> 00:04:30,120
is poor quality, divided by TP,
these true positives, plus FN,

60
00:04:30,120 --> 00:04:37,630
which is the total number
of cases of poor quality.

61
00:04:37,630 --> 00:04:39,850
So this is the total
number of times

62
00:04:39,850 --> 00:04:45,590
that we predict poor
quality, and it is, indeed,

63
00:04:45,590 --> 00:04:52,430
poor quality, versus the
total number of times

64
00:04:52,430 --> 00:04:57,830
the actual quality
is, in fact, poor.

65
00:04:57,830 --> 00:05:04,630
The false negative rate is FN,
the number of false negatives,

66
00:05:04,630 --> 00:05:10,500
divided by the number
of true positives,

67
00:05:10,500 --> 00:05:12,740
plus the number of
false negatives.

68
00:05:12,740 --> 00:05:19,550
And specificity is
TN, true negatives,

69
00:05:19,550 --> 00:05:22,670
the number of times we
predict the quality is good,

70
00:05:22,670 --> 00:05:24,810
and, in fact, the
quality is good,

71
00:05:24,810 --> 00:05:31,540
divided by this number,
TN, plus false positives.

72
00:05:31,540 --> 00:05:35,900
So specificity is
the number of times

73
00:05:35,900 --> 00:05:40,780
we predict the quality is
good, and it is indeed good,

74
00:05:40,780 --> 00:05:46,909
versus the total times
we have good quality,

75
00:05:46,909 --> 00:05:51,930
and the false
positive error rate is

76
00:05:51,930 --> 00:05:54,060
1 minus the specificity.

77
00:05:58,570 --> 00:06:02,060
So in this particular example
that we have discussed,

78
00:06:02,060 --> 00:06:05,730
quality of care, just
like in linear regression,

79
00:06:05,730 --> 00:06:09,010
we want to make
predictions on a test set

80
00:06:09,010 --> 00:06:10,780
to compute
out-of-sample metrics.

81
00:06:10,780 --> 00:06:17,850
We develop the logistic
regression model using data,

82
00:06:17,850 --> 00:06:21,440
but would like to make
predictions out-of-sample.

83
00:06:21,440 --> 00:06:29,840
So in our test, we
utilized 32 cases,

84
00:06:29,840 --> 00:06:38,900
and the R command that
makes the statements

85
00:06:38,900 --> 00:06:41,560
about the quality of a
prediction out-of-sample

86
00:06:41,560 --> 00:06:44,930
is illustrated
here in the slide.

87
00:06:44,930 --> 00:06:46,810
So in that way, we
make predictions

88
00:06:46,810 --> 00:06:48,630
about probabilities,
of course, simply

89
00:06:48,630 --> 00:06:52,380
because logistic regression
makes predictions

90
00:06:52,380 --> 00:06:56,400
about probabilities, and
then we transform them

91
00:06:56,400 --> 00:06:58,560
to a binary outcome--
the quality is good,

92
00:06:58,560 --> 00:07:01,700
or the quality is poor--
using a threshold.

93
00:07:01,700 --> 00:07:05,760
In this particular example, we
used a threshold value of 0.3,

94
00:07:05,760 --> 00:07:10,060
and in doing so, we obtain the
following confusion matrix.

95
00:07:10,060 --> 00:07:14,170
So, as I mentioned,
there are 32 cases,

96
00:07:14,170 --> 00:07:18,830
out of which 24 of them
are actually good care,

97
00:07:18,830 --> 00:07:22,490
and eight of them are
actually poor care.

98
00:07:22,490 --> 00:07:27,310
We observe that the overall
accuracy of the model

99
00:07:27,310 --> 00:07:33,000
is 19 plus 6, is 25, over 32.

100
00:07:33,000 --> 00:07:41,390
The false positive rate is,
in this case, 5 over 24,

101
00:07:41,390 --> 00:07:49,650
19 plus 5, whereas the true
positive rate is 6 out of 8-- 6

102
00:07:49,650 --> 00:07:51,540
plus 2.

103
00:07:51,540 --> 00:07:56,540
Notice, if you compare this
model with making always--

104
00:07:56,540 --> 00:08:00,370
let's say one
alternative is to say

105
00:08:00,370 --> 00:08:04,010
we predict good
care all the time.

106
00:08:04,010 --> 00:08:07,720
In that situation,
we will be correct 19

107
00:08:07,720 --> 00:08:13,620
plus 5, 24 times, versus
25 times, in our case.

108
00:08:13,620 --> 00:08:19,520
But notice that always
predicting good care

109
00:08:19,520 --> 00:08:24,920
does not capture the dynamics
of what is happening,

110
00:08:24,920 --> 00:08:28,250
versus the logistic
regression model that

111
00:08:28,250 --> 00:08:32,390
is far more intelligent in
capturing these effects.