Now it's time to evaluate our models on the testing set.

The first model we want to look at is the smart baseline model, which simply took the polling results from the Rasmussen poll and used them to determine who was predicted to win the election. It's very easy to compute the outcome of this simple baseline on the testing set: we table the testing set outcome variable, Republican, against the prediction of the smart baseline, which, as you recall, is the sign of the testing set's Rasmussen variable.

We can see from these results that there are 18 times where the smart baseline predicted the Democrat would win and was correct, 21 where it predicted the Republican would win and was correct, two times when it was inconclusive, and four times where it predicted Republican but the Democrat actually won. So that's four mistakes and two inconclusive results on the testing set. This is what we're going to compare our logistic regression model against.

Next we need to obtain the final testing set predictions from our model. We selected mod2, the two-variable model, so we'll set TestPrediction equal to the predict of that model. Since we're now making testing set predictions, we pass in newdata = Test, and since we want probabilities to be returned, we pass type = "response".

And now the moment of truth: we table the test set Republican value against the test prediction being greater than or equal to 0.5, meaning at least a 50% probability of the Republican winning. We see that in all but one of the 45 observations in the testing set, we're correct.
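As a reference, the commands just described would look roughly like this in R. This is a minimal sketch assuming, as in the narration, a testing data frame named Test with columns Republican and Rasmussen, and a fitted logistic regression model named mod2:

    # Smart baseline: the sign of the Rasmussen margin is the predicted winner
    # (1 = Republican ahead, -1 = Democrat ahead, 0 = inconclusive).
    table(Test$Republican, sign(Test$Rasmussen))

    # Testing set probabilities from the selected two-variable model, mod2.
    TestPrediction = predict(mod2, newdata = Test, type = "response")

    # Confusion table at a 0.5 cutoff: TRUE means at least a 50% predicted
    # probability of the Republican winning the state.
    table(Test$Republican, TestPrediction >= 0.5)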
Now, we could have tried changing this threshold from 0.5 to other values and computed an ROC curve, but that doesn't make as much sense in this setting, where we're just trying to accurately predict the outcome of each state and we don't care more about one sort of error (predicting Republican when the Democrat actually won) than the other (predicting Democrat when the Republican actually won). So in this particular case, we feel OK just using the cutoff of 0.5 to evaluate our model.

Let's now take a look at the mistake we made and see if we can understand what's going on. To pull out the mistake, we can take a subset of the testing set limited to observations where we predicted Republican but the Democrat actually won, which is the one case where the model failed. That's when TestPrediction is greater than or equal to 0.5 and Republican is equal to zero (the corresponding command is sketched at the end of this section).

Here is that subset, which has just one observation, since we made just one mistake. It's from the year 2012, the testing set year, and the state is Florida. Looking through the predictor variables, we see why we made the mistake: the Rasmussen poll gave the Republican a two-percentage-point lead, SurveyUSA called it a tie, DiffCount said there were six more polls that predicted Republican than Democrat, and two thirds of the polls predicted the Republican was going to win. But in this case the Republican didn't win: Barack Obama won the state of Florida in 2012 over Mitt Romney.

So the models here are not magic, and given this sort of data, it's pretty unsurprising that our model didn't get Florida correct and made this mistake. However, overall it seems to be outperforming the smart baseline we selected, and so we think this would be a nice model to use for election prediction.
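For reference, the subsetting step described above can be sketched the same way, again assuming the names used in the narration:

    # Pull out the one mistake: the model predicted Republican (probability
    # >= 0.5) but the Democrat actually won (Republican == 0).
    subset(Test, TestPrediction >= 0.5 & Republican == 0)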