1
00:00:04,500 --> 00:00:08,200
So can a CART model actually
predict Supreme Court case

2
00:00:08,200 --> 00:00:11,220
outcomes better than
a group of experts?

3
00:00:11,220 --> 00:00:15,550
Martin and his colleagues used
628 previous Supreme Court

4
00:00:15,550 --> 00:00:20,790
cases between 1994 and
2001 to build their model.

5
00:00:20,790 --> 00:00:23,470
They made predictions
for the 68 cases

6
00:00:23,470 --> 00:00:26,840
that would be decided
in October, 2002,

7
00:00:26,840 --> 00:00:29,310
before the term started.

8
00:00:29,310 --> 00:00:32,340
Their model had two
stages of CART trees.

9
00:00:32,340 --> 00:00:35,160
The first stage involved
making predictions

10
00:00:35,160 --> 00:00:36,900
using two CART trees.

11
00:00:36,900 --> 00:00:40,420
One to predict a unanimous
liberal decision and one

12
00:00:40,420 --> 00:00:43,700
to predict a unanimous
conservative decision.

13
00:00:43,700 --> 00:00:48,060
If the trees gave conflicting
responses or both predicted no,

14
00:00:48,060 --> 00:00:50,600
then they moved on
to the next stage.

15
00:00:50,600 --> 00:00:54,220
It turns out that about
50% of Supreme Court cases

16
00:00:54,220 --> 00:00:56,550
result in a unanimous decision.

17
00:00:56,550 --> 00:01:00,970
So this was a nice first step
to detect the easier cases.

18
00:01:00,970 --> 00:01:03,600
The second stage
consisted of predicting

19
00:01:03,600 --> 00:01:06,760
the decision of each
individual justice.

20
00:01:06,760 --> 00:01:10,340
And then using the majority
decision of all nine justices

21
00:01:10,340 --> 00:01:14,180
as a final prediction
for the case.

22
00:01:14,180 --> 00:01:16,610
In this lecture, we
constructed the CART tree

23
00:01:16,610 --> 00:01:18,539
for Justice Stevens.

24
00:01:18,539 --> 00:01:22,320
Here's a different tree, this
one for Justice O'Connor.

25
00:01:22,320 --> 00:01:24,170
The first split
is whether or not

26
00:01:24,170 --> 00:01:26,390
the lower court
decision is liberal.

27
00:01:26,390 --> 00:01:30,250
If yes, then we predict that
she will reverse the case.

28
00:01:30,250 --> 00:01:33,360
This makes sense because
Justice O'Connor is generally

29
00:01:33,360 --> 00:01:36,170
viewed as a conservative judge.

30
00:01:36,170 --> 00:01:39,880
On the other hand, if the lower
court decision is conservative,

31
00:01:39,880 --> 00:01:43,009
we check for the
circuit court of origin.

32
00:01:43,009 --> 00:01:46,440
If it's the second, third,
DC, or federal court,

33
00:01:46,440 --> 00:01:49,660
we predict that she
will affirm the case.

34
00:01:49,660 --> 00:01:53,729
If it's not one of these courts,
we move on to the next split.

35
00:01:53,729 --> 00:01:56,170
The remaining two splits
are for the respondent

36
00:01:56,170 --> 00:01:59,520
and the primary issue.

37
00:01:59,520 --> 00:02:03,100
Here's another tree, this
one for Justice Souter.

38
00:02:03,100 --> 00:02:05,830
This shows an unusual
property of the CART trees

39
00:02:05,830 --> 00:02:08,509
that Martin and his
colleagues developed.

40
00:02:08,509 --> 00:02:10,970
They use predictions
for some trees

41
00:02:10,970 --> 00:02:14,380
as independent variables
for other trees.

42
00:02:14,380 --> 00:02:17,120
In this tree, the first
split is whether or not

43
00:02:17,120 --> 00:02:20,860
Justice Ginsburg predicted
decision is liberal.

44
00:02:20,860 --> 00:02:25,190
So we have to run Justice
Ginsburg's CART tree first.

45
00:02:25,190 --> 00:02:27,079
See what the prediction is.

46
00:02:27,079 --> 00:02:30,930
And then use that as input
for Justice Souter's tree.

47
00:02:30,930 --> 00:02:34,690
If Justice Ginsburg
predicted decision is liberal

48
00:02:34,690 --> 00:02:37,329
and the lower court
decision is liberal,

49
00:02:37,329 --> 00:02:41,610
then we predict that Justice
Souter will affirm the case.

50
00:02:41,610 --> 00:02:44,610
But if the lower court
decision is conservative,

51
00:02:44,610 --> 00:02:48,430
then we predict that Justice
Souter will reverse the case.

52
00:02:48,430 --> 00:02:51,530
On the other side of the
tree, if Justice Ginsburg

53
00:02:51,530 --> 00:02:54,020
predicted decision
is conservative,

54
00:02:54,020 --> 00:02:56,660
but the lower court
decision is liberal,

55
00:02:56,660 --> 00:03:00,390
then we predict that Justice
Souter will reverse the case.

56
00:03:00,390 --> 00:03:02,770
But if the lower court
decision is conservative,

57
00:03:02,770 --> 00:03:06,520
then we predict that Justice
Souter will affirm the case.

58
00:03:06,520 --> 00:03:09,650
In summary, if we predict
that Justice Ginsburg will

59
00:03:09,650 --> 00:03:13,460
make a liberal decision,
then Justice Souter

60
00:03:13,460 --> 00:03:16,620
will probably make a
liberal decision too.

61
00:03:16,620 --> 00:03:18,930
But if we predict that
Justice Ginsburg will

62
00:03:18,930 --> 00:03:21,320
make a conservative
decision, then we

63
00:03:21,320 --> 00:03:23,270
predict that Justice
Souter will probably

64
00:03:23,270 --> 00:03:25,329
make a conservative
decision too.

65
00:03:28,720 --> 00:03:33,520
Martin and his colleagues also
recruited 83 legal experts,

66
00:03:33,520 --> 00:03:37,210
71 academics, and 12 attorneys.

67
00:03:37,210 --> 00:03:41,430
38 had previously clerked
for a Supreme Court Justice,

68
00:03:41,430 --> 00:03:44,660
33 were chaired
professors, and five

69
00:03:44,660 --> 00:03:47,340
were current or former
law school deans.

70
00:03:47,340 --> 00:03:50,840
So this was really a
dream team of experts.

71
00:03:50,840 --> 00:03:52,910
Additionally, the
experts were only

72
00:03:52,910 --> 00:03:56,540
asked to predict within
their area of expertise.

73
00:03:56,540 --> 00:03:59,470
So not all experts
predicted all cases.

74
00:03:59,470 --> 00:04:02,010
But there was more than one
expert making predictions

75
00:04:02,010 --> 00:04:04,220
for each case.

76
00:04:04,220 --> 00:04:06,460
When making their
predictions, the experts

77
00:04:06,460 --> 00:04:09,620
were allowed to consider
any source of information.

78
00:04:09,620 --> 00:04:11,430
But they were not
allowed to communicate

79
00:04:11,430 --> 00:04:15,140
with each other regarding
the predictions.

80
00:04:15,140 --> 00:04:18,579
For the 68 cases
in October, 2002,

81
00:04:18,579 --> 00:04:20,890
the predictions were made,
and at the end of the month

82
00:04:20,890 --> 00:04:23,010
the results were computed.

83
00:04:23,010 --> 00:04:24,890
For predicting the
overall decision that

84
00:04:24,890 --> 00:04:27,430
was made by the Supreme
Court, the models

85
00:04:27,430 --> 00:04:31,240
had an accuracy of 75%,
while the experts only

86
00:04:31,240 --> 00:04:33,980
had an accuracy of 59%.

87
00:04:33,980 --> 00:04:37,120
So the models had a significant
edge over the experts

88
00:04:37,120 --> 00:04:40,450
in predicting the
overall case outcomes.

89
00:04:40,450 --> 00:04:41,920
However, when the
predictions were

90
00:04:41,920 --> 00:04:45,820
run for individual justices,
the model and the experts

91
00:04:45,820 --> 00:04:47,950
performed very similarly.

92
00:04:47,950 --> 00:04:50,700
For some justices, the
model performed better.

93
00:04:50,700 --> 00:04:56,070
And for some justices, the
experts performed better.

94
00:04:56,070 --> 00:04:58,659
Being able to predict
Supreme Court decisions

95
00:04:58,659 --> 00:05:01,840
is very valuable to
firms, politicians,

96
00:05:01,840 --> 00:05:04,830
and nongovernmental
organizations.

97
00:05:04,830 --> 00:05:07,180
We saw in this lecture
that a model that

98
00:05:07,180 --> 00:05:10,720
predicts overall Supreme
Court decisions is both more

99
00:05:10,720 --> 00:05:14,650
accurate than experts and can
be run much faster than experts

100
00:05:14,650 --> 00:05:16,810
can make their predictions.

101
00:05:16,810 --> 00:05:18,420
The CART models
that we built were

102
00:05:18,420 --> 00:05:22,060
based on very high level
components of the cases,

103
00:05:22,060 --> 00:05:23,880
compared to the
experts who can process

104
00:05:23,880 --> 00:05:27,520
much more detailed and
complex information.

105
00:05:27,520 --> 00:05:29,660
This example really
shows the edge

106
00:05:29,660 --> 00:05:33,170
that analytics can provide
in traditionally qualitative

107
00:05:33,170 --> 00:05:34,520
applications.