1
00:00:00,070 --> 00:00:02,430
The following content is
provided under a Creative

2
00:00:02,430 --> 00:00:03,810
Commons license.

3
00:00:03,810 --> 00:00:06,060
Your support will help
MIT OpenCourseWare

4
00:00:06,060 --> 00:00:10,140
continue to offer high quality
educational resources for free.

5
00:00:10,140 --> 00:00:12,700
To make a donation or to
view additional materials

6
00:00:12,700 --> 00:00:16,600
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:16,600 --> 00:00:17,260
at ocw.mit.edu.

8
00:00:26,680 --> 00:00:28,150
PROFESSOR: All
right, well I'd like

9
00:00:28,150 --> 00:00:31,470
to thank you for
inviting me again

10
00:00:31,470 --> 00:00:33,750
to talk to the poker class.

11
00:00:33,750 --> 00:00:38,250
It's always great to
come here, and we're

12
00:00:38,250 --> 00:00:40,970
going to be having a
tournament in a couple weeks,

13
00:00:40,970 --> 00:00:43,890
so good luck for the people
participating in that.

14
00:00:43,890 --> 00:00:47,210
Actually, I'm coming
back in another two weeks

15
00:00:47,210 --> 00:00:50,610
because I think [INAUDIBLE]
a Harvard MIT math tournament

16
00:00:50,610 --> 00:00:53,760
for high school kids.

17
00:00:53,760 --> 00:00:56,120
I really love visiting MIT.

18
00:00:56,120 --> 00:00:59,270
I just wish it were at some
other time besides the winter.

19
00:01:03,084 --> 00:01:04,125
Then it would be perfect.

20
00:01:04,125 --> 00:01:05,850
All right, today
I'm going to talk

21
00:01:05,850 --> 00:01:10,290
about the University of
Alberta's Cepheus computer

22
00:01:10,290 --> 00:01:11,100
program.

23
00:01:11,100 --> 00:01:12,470
It supposedly solved poker.

24
00:01:12,470 --> 00:01:15,316
We're going to talk about
what they actually did.

25
00:01:15,316 --> 00:01:16,270
[LAUGHTER]

26
00:01:16,270 --> 00:01:18,360
There seems to be a
lot of buzz about this,

27
00:01:18,360 --> 00:01:24,070
so I thought this
was a good thing to do.

28
00:01:24,070 --> 00:01:29,130
So I have to tell you that Jared
and I did not work directly

29
00:01:29,130 --> 00:01:31,940
with the University
of Alberta people,

30
00:01:31,940 --> 00:01:34,100
but we are very familiar
with their methods

31
00:01:34,100 --> 00:01:38,920
and have actually tried some
of their coding techniques.

32
00:01:38,920 --> 00:01:42,360
So we're pretty familiar
with the same research that's

33
00:01:42,360 --> 00:01:43,430
going on.

34
00:01:43,430 --> 00:01:47,540
So it's sort of an, I
think, objective commentary.

35
00:01:47,540 --> 00:01:51,480
So by the way, as
the lecture goes on,

36
00:01:51,480 --> 00:01:53,450
you can interrupt
with questions.

37
00:01:53,450 --> 00:01:56,140
Just raise your hands
if something is unclear

38
00:01:56,140 --> 00:02:00,215
because I've been told
I have about 80 minutes.

39
00:02:00,215 --> 00:02:04,440
Probably spend 55 and then
save the rest for questions.

40
00:02:04,440 --> 00:02:07,930
All right, so the
outline of the talk-- first

41
00:02:07,930 --> 00:02:11,410
I'm going to talk about what
Cepheus accomplished,

42
00:02:11,410 --> 00:02:14,260
what the University of
Alberta people accomplished,

43
00:02:14,260 --> 00:02:21,390
and I'm going to bring that
up by discussing game theory

44
00:02:21,390 --> 00:02:23,340
optimal strategies in poker.

45
00:02:23,340 --> 00:02:25,640
How many of you know
what game [INAUDIBLE] is.

46
00:02:25,640 --> 00:02:28,750
I just want to know [INAUDIBLE]
or what a [INAUDIBLE] is.

47
00:02:28,750 --> 00:02:31,100
Raise your hands.

48
00:02:31,100 --> 00:02:32,200
OK.

49
00:02:32,200 --> 00:02:34,270
So about 1/2, 2/3.

50
00:02:34,270 --> 00:02:34,980
Good.

51
00:02:34,980 --> 00:02:39,194
I'm going to do a quick
introduction to what game

52
00:02:39,194 --> 00:02:40,110
theory [INAUDIBLE] is.

53
00:02:40,110 --> 00:02:43,700
We're going to talk about a
simple poker game and solutions

54
00:02:43,700 --> 00:02:44,970
to it.

55
00:02:44,970 --> 00:02:47,740
And then I'm going to go
into their algorithm, which

56
00:02:47,740 --> 00:02:51,810
is written [INAUDIBLE].

57
00:02:51,810 --> 00:02:55,750
They used the method of
counterfactual regret minimization.

58
00:02:55,750 --> 00:02:58,370
Actually, the method
they used to push

59
00:02:58,370 --> 00:03:00,080
through to the
solution of the problem

60
00:03:00,080 --> 00:03:04,820
is called CFR+,
which is basically

61
00:03:04,820 --> 00:03:09,460
the original algorithm with some
shortcuts, which we'll discuss.

62
00:03:09,460 --> 00:03:13,640
After this, though, we're
going to think about extensions

63
00:03:13,640 --> 00:03:17,160
of computer solutions to other
games, including [INAUDIBLE]

64
00:03:17,160 --> 00:03:19,640
games and multiplayer games.

65
00:03:19,640 --> 00:03:23,490
A couple people have questions
about [INAUDIBLE] no limit

66
00:03:23,490 --> 00:03:24,820
program.

67
00:03:24,820 --> 00:03:31,970
We'll talk about what their
work entailed if questions

68
00:03:31,970 --> 00:03:35,630
lead in that direction.

69
00:03:35,630 --> 00:03:39,120
All right, let's talk about
what Cepheus accomplished.

70
00:03:39,120 --> 00:03:42,280
It's a game theory optimal
solution to heads up limit

71
00:03:42,280 --> 00:03:43,280
hold 'em.

72
00:03:43,280 --> 00:03:45,140
And so what does that mean?

73
00:03:45,140 --> 00:03:48,350
You guys all know what
limit hold 'em is, right?

74
00:03:48,350 --> 00:03:50,150
Good.

75
00:03:50,150 --> 00:03:56,740
Basically, after
[INAUDIBLE] few years,

76
00:03:56,740 --> 00:03:59,960
they've achieved an exploitability
of less than 1/1000

77
00:03:59,960 --> 00:04:01,310
of a big blind.

78
00:04:01,310 --> 00:04:09,050
So the first thing is it's not a
perfect optimal solution.

79
00:04:09,050 --> 00:04:11,960
You can still exploit
it for about 1/1000

80
00:04:11,960 --> 00:04:16,579
of a big blind per hand.

81
00:04:16,579 --> 00:04:21,890
However, there are
probably better games.

82
00:04:21,890 --> 00:04:26,590
This is like 1/20-- this
is 1/2000 of a big bet.

83
00:04:26,590 --> 00:04:30,790
You can actually play heads up
for 50 years at normal speed

84
00:04:30,790 --> 00:04:36,260
and still have some
probability of losing.

85
00:04:36,260 --> 00:04:42,400
The reason for that is the
standard deviation of heads

86
00:04:42,400 --> 00:04:47,100
up limit hold 'em is
about five big blinds.

87
00:04:47,100 --> 00:04:49,380
So you can just
imagine how many hands

88
00:04:49,380 --> 00:04:52,440
you have to play [INAUDIBLE]
the significance.

89
00:04:52,440 --> 00:04:56,850
About, oh, 25 million.
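
[Editor's note: the 25 million figure can be checked with a short calculation. This is a minimal sketch, assuming "significance" means the expected win equals one standard deviation of the total result; the function name and the `z` parameter are illustrative and not from the lecture.]

```python
def hands_for_significance(edge_per_hand, std_dev_per_hand, z=1.0):
    """Hands needed for the expected win to equal z standard deviations of the total."""
    # After n hands: expected win = n * edge, standard deviation = sqrt(n) * std_dev.
    # Setting n * edge = z * sqrt(n) * std_dev and solving for n gives the line below.
    return (z * std_dev_per_hand / edge_per_hand) ** 2

# Edge of 1/1000 of a big blind per hand, standard deviation of 5 big blinds per hand.
n = hands_for_significance(edge_per_hand=0.001, std_dev_per_hand=5.0)
print(round(n))
```

[A stricter threshold, such as two standard deviations, would multiply the count by four under the same assumptions.]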

90
00:04:56,850 --> 00:04:59,460
So it's definitely a milestone.

91
00:04:59,460 --> 00:05:02,890
This is the first time a real
poker game has been solved.

92
00:05:02,890 --> 00:05:06,030
In The Mathematics of Poker, we solved
ace, king, queen, [INAUDIBLE]

93
00:05:06,030 --> 00:05:11,580
on paper, but [INAUDIBLE] a
real poker game that's solved.

94
00:05:11,580 --> 00:05:14,390
However, given
their previous work,

95
00:05:14,390 --> 00:05:16,120
it was just a matter
of [INAUDIBLE].

96
00:05:16,120 --> 00:05:17,670
I remember two or
three years ago

97
00:05:17,670 --> 00:05:21,970
they passed the 1/100
of a big bet, which

98
00:05:21,970 --> 00:05:24,230
is sort of our measurement
of significance.

99
00:05:24,230 --> 00:05:29,280
If you're playing and you're
winning more than 1/100

100
00:05:29,280 --> 00:05:33,490
of a big bet per hand,
you can [INAUDIBLE] it's

101
00:05:33,490 --> 00:05:34,440
a profitable game.

102
00:05:34,440 --> 00:05:37,230
Below that comes theoretical.

103
00:05:37,230 --> 00:05:40,080
So it's definitely a milestone.

104
00:05:40,080 --> 00:05:47,150
And basically I knew that,
if they just maybe spent

105
00:05:47,150 --> 00:05:50,100
more CPU power, they
would get the solution.

106
00:05:50,100 --> 00:05:56,000
After 900 CPU years, they
finally got the solution.

107
00:05:56,000 --> 00:05:59,460
So I don't know.

108
00:05:59,460 --> 00:06:05,410
If I had that much CPU power,
I'd solve a few problems, too.

109
00:06:05,410 --> 00:06:08,550
But it's still a
milestone.

110
00:06:08,550 --> 00:06:10,670
It's great.

111
00:06:10,670 --> 00:06:13,350
So what effect does this
have on other games?

112
00:06:13,350 --> 00:06:16,480
Does this mean poker is
going to go the way of chess

113
00:06:16,480 --> 00:06:19,620
where computers are just
much better than we are?

114
00:06:19,620 --> 00:06:21,560
I don't think we're
there yet, and we'll

115
00:06:21,560 --> 00:06:25,000
talk about that later.

116
00:06:25,000 --> 00:06:28,230
So let's talk about
Nash equilibrium.

117
00:06:28,230 --> 00:06:32,210
So John F. Nash won
the Nobel Prize in 1994

118
00:06:32,210 --> 00:06:34,460
"for pioneering
analysis of equilibrium

119
00:06:34,460 --> 00:06:36,870
in the theory of
non-cooperative games."

120
00:06:36,870 --> 00:06:39,860
And he extended the work of
John Von Neumann and Oskar

121
00:06:39,860 --> 00:06:42,470
Morgenstern, [INAUDIBLE]
actually first considered

122
00:06:42,470 --> 00:06:44,680
these two player zero sum games.

123
00:06:44,680 --> 00:06:47,530
So Nash equilibrium is
just a set of strategies

124
00:06:47,530 --> 00:06:52,510
such that no player can
actually improve their strategy

125
00:06:52,510 --> 00:06:55,980
and make more [INAUDIBLE].

126
00:06:55,980 --> 00:06:57,420
[INAUDIBLE] whatever.

127
00:06:57,420 --> 00:07:00,020
In the two player
zero sum games,

128
00:07:00,020 --> 00:07:03,955
we refer to Nash equilibria
as also game theory optimal.

129
00:07:03,955 --> 00:07:06,800
The reason is because
Nash equilibria are also

130
00:07:06,800 --> 00:07:08,840
the minimax solution.

131
00:07:08,840 --> 00:07:11,520
It's the best you can
do given that he can

132
00:07:11,520 --> 00:07:14,850
see what you do and respond.

133
00:07:14,850 --> 00:07:16,754
Simplest case of
Nash equilibria is,

134
00:07:16,754 --> 00:07:18,420
if you're playing
rock, paper, scissors,

135
00:07:18,420 --> 00:07:21,580
what's the Nash equilibrium?

136
00:07:21,580 --> 00:07:22,420
1/3 each.

137
00:07:22,420 --> 00:07:27,580
So that's not that exciting in
this case, because both players

138
00:07:27,580 --> 00:07:29,340
kind of just get 0.

139
00:07:29,340 --> 00:07:31,930
You can't make more than 0,
you can't make less than 0.

140
00:07:31,930 --> 00:07:34,960
So it doesn't seem to be
that exciting a solution,

141
00:07:34,960 --> 00:07:37,520
but in poker it's
kind of exciting

142
00:07:37,520 --> 00:07:41,570
because there are kind of
dominated mistakes people make,

143
00:07:41,570 --> 00:07:47,340
or mistakes that actually lose
money to the optimal solution.

144
00:07:47,340 --> 00:07:50,950
So the reason 1/3, 1/3, 1/3
is the Nash equilibrium is

145
00:07:50,950 --> 00:07:55,530
because nobody can do
anything to improve their lot.

146
00:07:55,530 --> 00:07:57,360
It may not be the
best thing to play.

147
00:07:57,360 --> 00:08:01,670
If a guy is playing 1/2
scissors and 1/2 rock,

148
00:08:01,670 --> 00:08:05,020
what should you play?

149
00:08:05,020 --> 00:08:06,870
100% rock.

150
00:08:06,870 --> 00:08:09,650
Yeah, sort of like the
Aerosmith strategy.

151
00:08:09,650 --> 00:08:12,270
[LAUGHTER]

152
00:08:12,270 --> 00:08:12,850
Right.

153
00:08:12,850 --> 00:08:16,610
So there are much better ways to
play if your opponents deviate

154
00:08:16,610 --> 00:08:18,910
from Nash equilibrium.

155
00:08:18,910 --> 00:08:22,640
So actually game theory optimal
is not necessarily the best way

156
00:08:22,640 --> 00:08:25,800
to play, even heads up.

157
00:08:25,800 --> 00:08:29,490
It's a way to play to kind
of guarantee you never lose.

158
00:08:29,490 --> 00:08:32,919
So that's sort of
the accomplishment.

159
00:08:32,919 --> 00:08:36,919
That's why we like
to find these things.

160
00:08:36,919 --> 00:08:39,539
I know I could just
play this, and I'm not

161
00:08:39,539 --> 00:08:42,770
taking total advantage of
my opponent's mistakes,

162
00:08:42,770 --> 00:08:44,940
but at least I'm playing
in a way where he can't

163
00:08:44,940 --> 00:08:47,410
take advantage of me at all.
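
[Editor's note: the rock, paper, scissors claims above can be checked numerically. This is a small sketch; the payoff table and the `best_response` helper are illustrative names, not something shown in the lecture.]

```python
# Payoff to "me": PAYOFF[mine][theirs], in the order rock, paper, scissors.
MOVES = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, ties paper, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper, ties scissors
]

def best_response(opponent_mix):
    """Return the pure move with the highest expected payoff against a fixed mix."""
    values = [sum(PAYOFF[i][j] * opponent_mix[j] for j in range(3)) for i in range(3)]
    best = max(range(3), key=lambda i: values[i])
    return MOVES[best], values[best]

# Opponent plays half rock, half scissors: rock is the best reply.
print(best_response([0.5, 0.0, 0.5]))
# Opponent plays the 1/3 each equilibrium: no reply earns more than 0.
print(best_response([1 / 3, 1 / 3, 1 / 3]))
```

[Against the equilibrium mix every move has expected payoff 0, so the move reported in the second call is just the first one in the list.]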

164
00:08:47,410 --> 00:08:50,120
Let's do a simple example.

165
00:08:50,120 --> 00:08:55,100
So this is an
example that I shared

166
00:08:55,100 --> 00:08:57,060
with the class a
couple years ago.

167
00:08:57,060 --> 00:08:59,210
So there are two
players, Rose and Colin,

168
00:08:59,210 --> 00:09:00,850
and the reason the
players are called

169
00:09:00,850 --> 00:09:04,185
Rose and Colin is because this
refers to matrix games.

170
00:09:06,880 --> 00:09:10,090
One player chooses a row, the
other player chooses a column.

171
00:09:10,090 --> 00:09:13,050
That's their payoff.

172
00:09:13,050 --> 00:09:16,740
And for a three player
game, we introduce Larry,

173
00:09:16,740 --> 00:09:19,530
because there are layers.

174
00:09:19,530 --> 00:09:22,510
So the two players
are Rose and Colin.

175
00:09:22,510 --> 00:09:25,840
So each player antes
$50 for $100 in the pot.

176
00:09:25,840 --> 00:09:27,700
Rose looks at a card
from a full deck.

177
00:09:27,700 --> 00:09:30,660
She will win the pot at
showdown if the card is a spade.

178
00:09:30,660 --> 00:09:32,380
Otherwise she will lose.

179
00:09:32,380 --> 00:09:35,950
So Rose can decide
to bet $100 or check

180
00:09:35,950 --> 00:09:37,660
after she looks at her card.

181
00:09:37,660 --> 00:09:38,974
So there's $100 in the pot.

182
00:09:38,974 --> 00:09:39,890
She looks at her card.

183
00:09:39,890 --> 00:09:43,340
She decides whether
to bet $100 or to check.

184
00:09:43,340 --> 00:09:47,180
If Rose bets, Colin may
decide to call $100 or fold.

185
00:09:47,180 --> 00:09:49,420
If Colin folds, Rose wins.

186
00:09:49,420 --> 00:09:51,220
Well, you guys know
how poker works.

187
00:09:51,220 --> 00:09:52,790
If Colin calls,
there's a showdown,

188
00:09:52,790 --> 00:09:54,420
and if her card is
actually a spade,

189
00:09:54,420 --> 00:09:57,710
she wins the whole pot.

190
00:09:57,710 --> 00:09:59,200
Otherwise, Colin wins the pot.

191
00:09:59,200 --> 00:10:01,992
So what are the optimal
strategies for Rose and Colin?

192
00:10:01,992 --> 00:10:03,200
Does anybody know the answer?

193
00:10:06,690 --> 00:10:11,590
Well, let's do one
[INAUDIBLE] part of it.

194
00:10:11,590 --> 00:10:14,840
How often do you think
Colin should call?

195
00:10:14,840 --> 00:10:17,310
Colin wants to call
often enough to make

196
00:10:17,310 --> 00:10:21,000
Rose's bluffs unprofitable.

197
00:10:21,000 --> 00:10:25,220
If Rose gets a spade,
what is she going to do?

198
00:10:25,220 --> 00:10:25,820
Bet.

199
00:10:25,820 --> 00:10:29,140
She has nothing to
lose by betting,

200
00:10:29,140 --> 00:10:32,060
unless she's being
very, very tricky,

201
00:10:32,060 --> 00:10:36,450
but it is correct to bet.

202
00:10:36,450 --> 00:10:39,460
So let's see.

203
00:10:39,460 --> 00:10:41,690
If Rose doesn't pick
up a spade and bluffs,

204
00:10:41,690 --> 00:10:44,540
how often does that
have to succeed for it

205
00:10:44,540 --> 00:10:46,914
to be profitable?

206
00:10:46,914 --> 00:10:49,030
There's $100 in the pot.

207
00:10:49,030 --> 00:10:50,100
She looks.

208
00:10:50,100 --> 00:10:53,030
If it's not a spade,
she has to bet $100,

209
00:10:53,030 --> 00:10:57,890
and how much is she risking?

210
00:10:57,890 --> 00:10:59,140
How much is she going to win?

211
00:11:03,630 --> 00:11:06,290
It's actually $100 and
another $100, right?

212
00:11:06,290 --> 00:11:09,010
Because there's $100 in the pot.

213
00:11:09,010 --> 00:11:12,330
Sure, she anted something
and made the pot,

214
00:11:12,330 --> 00:11:14,680
but she's spending $100.

215
00:11:14,680 --> 00:11:18,080
And if Colin calls, she's
going to lose the $100.

216
00:11:18,080 --> 00:11:21,280
If Colin folds, she's going
to win the $100 in the pot,

217
00:11:21,280 --> 00:11:22,780
or she could have just given up.

218
00:11:22,780 --> 00:11:24,600
So it's 1 to 1.

219
00:11:24,600 --> 00:11:28,710
So Rose should call
half the time--

220
00:11:28,710 --> 00:11:31,130
I mean Colin should
call half the time.

221
00:11:31,130 --> 00:11:35,230
Rose should bet to
bluff in a 2 to 1

222
00:11:35,230 --> 00:11:39,680
ratio, because that's the
odds Colin gets to call.

223
00:11:39,680 --> 00:11:42,110
So Rose should
always bet a spade.

224
00:11:42,110 --> 00:11:45,520
If Colin calls 100% of the time,
Rose will just never bluff.

225
00:11:45,520 --> 00:11:48,150
If Colin never calls, Rose
would just bet every time.

226
00:11:48,150 --> 00:11:51,640
So there is kind of
no equilibrium there.

227
00:11:51,640 --> 00:11:53,740
If Colin calls
half the time, Rose

228
00:11:53,740 --> 00:11:55,100
will be indifferent to bluffing.

229
00:11:55,100 --> 00:11:58,770
She'll be negative $50
either way without a spade,

230
00:11:58,770 --> 00:12:00,570
and then $100 with a spade.

231
00:12:00,570 --> 00:12:02,680
Now, this is the strategy
for Colin,

232
00:12:02,680 --> 00:12:04,720
and the correct
strategy for Rose

233
00:12:04,720 --> 00:12:11,070
is this ratio of bluff to
spade, which is 1 to 2.

234
00:12:11,070 --> 00:12:15,110
So Rose should basically
bet half of her hearts.

235
00:12:15,110 --> 00:12:17,530
She can bet the
high hearts, and I

236
00:12:17,530 --> 00:12:21,500
guess with the eight of hearts
she can decide whether-- is

237
00:12:21,500 --> 00:12:23,720
it the eight or the seven?

238
00:12:23,720 --> 00:12:26,870
No, it's the-- yeah,
it's the eight.

239
00:12:26,870 --> 00:12:28,960
[INAUDIBLE] with
the eight of hearts

240
00:12:28,960 --> 00:12:34,120
she can decide whether to bet
or not like half the time.

241
00:12:34,120 --> 00:12:36,870
So these are Nash equilibrium
and game theory optimal

242
00:12:36,870 --> 00:12:43,420
strategies, and basically the
value of the game is negative--

243
00:12:43,420 --> 00:12:47,210
is worth $12.50 to Colin.
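
[Editor's note: the numbers for the Rose and Colin game can be verified with exact fractions. This is a sketch under the rules stated above ($50 antes, $100 bet, Rose wins at showdown only with a spade); the function and variable names are illustrative, and Rose is assumed to always bet a spade.]

```python
from fractions import Fraction as F

P_SPADE = F(13, 52)
ANTE, BET = 50, 100

def rose_ev(bluff_freq, call_freq):
    """Rose's expected net win in dollars, measured from before the antes."""
    # With a spade Rose always bets: +150 if called, +50 if Colin folds.
    ev_spade = call_freq * (ANTE + BET) + (1 - call_freq) * ANTE
    # Without a spade, a bluff wins +50 on a fold and loses 150 on a call.
    ev_bluff = call_freq * -(ANTE + BET) + (1 - call_freq) * ANTE
    # Checking a non-spade goes to showdown and loses the ante.
    ev_check = -ANTE
    ev_no_spade = bluff_freq * ev_bluff + (1 - bluff_freq) * ev_check
    return P_SPADE * ev_spade + (1 - P_SPADE) * ev_no_spade

bluff = F(1, 6)  # 6.5 bluffing cards (half the hearts) out of 39 non-spades
call = F(1, 2)

print(rose_ev(bluff, call))                        # value of the game to Rose
print(rose_ev(F(0), call), rose_ev(F(1), call))    # Rose is indifferent to bluffing
print(rose_ev(bluff, F(0)), rose_ev(bluff, F(1)))  # Colin is indifferent to calling
```

[Every call returns -25/2, that is, Rose loses $12.50 per hand on average, which matches the value quoted for Colin. The indifference on both sides is what makes the pair of strategies an equilibrium.]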

244
00:12:47,210 --> 00:12:50,990
Any questions about this?

245
00:12:50,990 --> 00:12:54,950
All right, so these
are the strategies

246
00:12:54,950 --> 00:13:02,970
that the algorithm
tries to find.

247
00:13:02,970 --> 00:13:07,400
Let's go on to
the algorithm now.

248
00:13:07,400 --> 00:13:10,420
Well, let's talk about what
game theory optimal is first.

249
00:13:10,420 --> 00:13:15,280
By the way, there will be
about five or so transparencies

250
00:13:15,280 --> 00:13:17,150
[INAUDIBLE] of math equations.

251
00:13:17,150 --> 00:13:20,990
So just suffer through these.

252
00:13:20,990 --> 00:13:26,710
Those of you who understand are
going to enjoy the later part,

253
00:13:26,710 --> 00:13:29,710
but let's just talk
formally about what

254
00:13:29,710 --> 00:13:31,460
game theory optimal means.

255
00:13:31,460 --> 00:13:34,980
So there's this
game function, u.

256
00:13:34,980 --> 00:13:38,860
It takes two strategies, an
x strategy and a y strategy,

257
00:13:38,860 --> 00:13:41,500
and it gives [INAUDIBLE].

258
00:13:41,500 --> 00:13:43,130
If this was rock,
paper, scissors,

259
00:13:43,130 --> 00:13:46,890
you would have u of
rock versus scissors

260
00:13:46,890 --> 00:13:50,160
to be 1, so on and so forth.

261
00:13:50,160 --> 00:13:53,000
It's positive for x and
negative-- x is trying

262
00:13:53,000 --> 00:14:01,130
to-- x gets u, and y loses u.

263
00:14:01,130 --> 00:14:03,190
That's the idea.

264
00:14:03,190 --> 00:14:07,280
So one of the things is we can
take convex linear combinations

265
00:14:07,280 --> 00:14:08,610
of strategies.

266
00:14:08,610 --> 00:14:13,460
That is, if sigma x1 through
sigma xk are strategies

267
00:14:13,460 --> 00:14:18,900
and we have some coefficients
that are all non-negative

268
00:14:18,900 --> 00:14:21,960
and that sum to 1, we
can make a new strategy

269
00:14:21,960 --> 00:14:24,630
as a linear combination
of these strategies.

270
00:14:24,630 --> 00:14:32,310
And also u is bilinear, which means
that the value of the game

271
00:14:32,310 --> 00:14:36,180
here is just the
linear combination

272
00:14:36,180 --> 00:14:38,690
that [INAUDIBLE] sigma x.

273
00:14:38,690 --> 00:14:43,230
And it would be the
same also for sigma y.

274
00:14:43,230 --> 00:14:45,860
This just means, suppose
you have two strategies

275
00:14:45,860 --> 00:14:52,530
and you play 1/3 sigma
x1 and 2/3 sigma x2,

276
00:14:52,530 --> 00:14:56,225
your payoff is going to be
1/3 of the payoff of sigma x1

277
00:14:56,225 --> 00:14:57,725
and 2/3 of the payoff of sigma x2.

278
00:15:00,310 --> 00:15:04,080
Hopefully that's pretty clear.

279
00:15:04,080 --> 00:15:05,940
Now we define a
pair of strategies

280
00:15:05,940 --> 00:15:10,600
to be an epsilon equilibrium if
the best x can do against y

281
00:15:10,600 --> 00:15:12,010
is this strategy.

282
00:15:12,010 --> 00:15:17,000
The best y can do against x
is this strategy-- is epsilon.

283
00:15:17,000 --> 00:15:22,550
And if epsilon equals 0,
these are in Nash equilibrium.
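
[Editor's note: the slide itself is not in the transcript, so what follows is one common way to write the definition, offered as a sketch. The symbols follow the lecture's u, sigma x, and sigma y; the starred strategies are the candidate pair, and x is the maximizing player.]

```latex
% A pair (\sigma_x^*, \sigma_y^*) is an \epsilon-equilibrium if neither player
% can gain more than \epsilon by deviating unilaterally:
\[
  \max_{\sigma_x} u(\sigma_x, \sigma_y^*) - u(\sigma_x^*, \sigma_y^*) \le \epsilon,
  \qquad
  u(\sigma_x^*, \sigma_y^*) - \min_{\sigma_y} u(\sigma_x^*, \sigma_y) \le \epsilon .
\]
% With \epsilon = 0 both inequalities are equalities, which is the
% Nash equilibrium condition.
```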

284
00:15:22,550 --> 00:15:26,880
So after 900 CPU
years, what they found

285
00:15:26,880 --> 00:15:29,320
were two strategies--
sigma x star,

286
00:15:29,320 --> 00:15:38,860
sigma y star-- that
were within 1/1000

287
00:15:38,860 --> 00:15:41,790
of a big blind of equilibrium.

288
00:15:41,790 --> 00:15:47,950
And that's basically
[INAUDIBLE] accomplished.

289
00:15:47,950 --> 00:15:50,860
So I'm going to actually go
through the nitty gritty of how

290
00:15:50,860 --> 00:15:56,280
they did this in case you would
like to write a poker solver

291
00:15:56,280 --> 00:15:58,630
someday.

292
00:15:58,630 --> 00:16:03,670
So the big idea
that they borrowed

293
00:16:03,670 --> 00:16:05,810
was this idea of
regret minimization,

294
00:16:05,810 --> 00:16:07,590
which is actually pretty cool.

295
00:16:07,590 --> 00:16:11,750
Suppose that at each time
step t the player has

296
00:16:11,750 --> 00:16:13,920
a few pure strategies.

297
00:16:13,920 --> 00:16:17,510
We're assuming the player
has a handful of strategies.

298
00:16:17,510 --> 00:16:21,780
In poker, obviously, there's
trillions of strategies,

299
00:16:21,780 --> 00:16:24,240
but-- two to the
trillions of strategies.

300
00:16:24,240 --> 00:16:26,920
But say he has two strategies.

301
00:16:26,920 --> 00:16:28,524
He can play one or two.

302
00:16:28,524 --> 00:16:30,690
Suppose it's odds, or evens,
or something like that.

303
00:16:30,690 --> 00:16:32,523
Or he has three strategies
like [INAUDIBLE].

304
00:16:35,380 --> 00:16:39,640
So basically he
chooses some sort

305
00:16:39,640 --> 00:16:45,180
of mixture of strategies
at the beginning,

306
00:16:45,180 --> 00:16:49,730
and we're only dealing with
one player at this time.

307
00:16:49,730 --> 00:16:52,630
We're assuming the
other guy-- we're

308
00:16:52,630 --> 00:16:55,300
assuming he's playing
against some adversary that's

309
00:16:55,300 --> 00:16:55,830
all knowing.

310
00:16:58,530 --> 00:17:01,300
That's the original setup
of regret minimization.

311
00:17:01,300 --> 00:17:03,040
We'll talk about
how this applies

312
00:17:03,040 --> 00:17:06,630
to game theory in general.

313
00:17:06,630 --> 00:17:11,510
Now at each time t we're
given values u sub t of sigma k.

314
00:17:11,510 --> 00:17:15,030
So basically after
he determines this,

315
00:17:15,030 --> 00:17:19,960
the adversary decides what
the value of u sub t is,

316
00:17:19,960 --> 00:17:22,339
and basically his payoff
is just [INAUDIBLE]

317
00:17:22,339 --> 00:17:25,274
a linear combination of
the things he picked.

318
00:17:25,274 --> 00:17:32,090
But the idea is that the
adversary can be adversarial.

319
00:17:32,090 --> 00:17:38,360
He can decide to make the
[INAUDIBLE] strategy score well

320
00:17:38,360 --> 00:17:40,870
some of the time, and
the [INAUDIBLE] strategy

321
00:17:40,870 --> 00:17:42,780
score badly some of the time.

322
00:17:42,780 --> 00:17:48,190
So basically now the idea
is to calculate a regret.

323
00:17:48,190 --> 00:17:50,070
By the way, this
is not the notation

324
00:17:50,070 --> 00:17:55,130
that's used in the three or
four papers they wrote on this,

325
00:17:55,130 --> 00:18:01,780
because I think they did
great work-- it's really

326
00:18:01,780 --> 00:18:05,450
written as a math paper.

327
00:18:05,450 --> 00:18:11,080
It looks like a particle physics
paper, which is-- actually

328
00:18:11,080 --> 00:18:15,450
for particle physics you
need all the complex notation

329
00:18:15,450 --> 00:18:17,920
because they're trying to
describe something [INAUDIBLE]

330
00:18:17,920 --> 00:18:20,910
difficult. I think
computer science papers usually

331
00:18:20,910 --> 00:18:23,480
don't need this.

332
00:18:23,480 --> 00:18:26,340
So I'll explain this,
and then you guys can

333
00:18:26,340 --> 00:18:29,380
reread their paper.

334
00:18:29,380 --> 00:18:33,630
I think that [INAUDIBLE]
give you a quicker way

335
00:18:33,630 --> 00:18:34,880
to understand their paper.

336
00:18:34,880 --> 00:18:38,230
So there's this thing called
regret of the k-th option

337
00:18:38,230 --> 00:18:42,980
at time t, which is just the
sum of the difference of playing

338
00:18:42,980 --> 00:18:46,110
k versus playing
whatever you played.

339
00:18:46,110 --> 00:18:49,230
So basically you can
have positive regret

340
00:18:49,230 --> 00:18:51,740
or negative regrets.

341
00:18:51,740 --> 00:18:56,490
Negative regrets means that what
you played-- what you decided

342
00:18:56,490 --> 00:19:00,260
to play up to time t was
better than just playing k

343
00:19:00,260 --> 00:19:02,950
at each time step.

344
00:19:02,950 --> 00:19:06,180
So we're only
concerned-- we're mostly

345
00:19:06,180 --> 00:19:08,030
concerned with the
positive regret, which

346
00:19:08,030 --> 00:19:09,620
means, instead of
playing, you should

347
00:19:09,620 --> 00:19:14,870
have made-- you could have made
more money by playing option k.

348
00:19:14,870 --> 00:19:19,470
So what's the
significance of this?

349
00:19:19,470 --> 00:19:22,520
So the idea is we want
the average regret, which

350
00:19:22,520 --> 00:19:26,860
is this element divided by t.

351
00:19:26,860 --> 00:19:32,220
So basically you want the
average regret, average amount

352
00:19:32,220 --> 00:19:35,950
that you're kind of missing out
on to be less than epsilon sub

353
00:19:35,950 --> 00:19:39,080
t, where epsilon sub t is
the [INAUDIBLE] converging to 0.

354
00:19:39,080 --> 00:19:46,690
If you have this, you have
some regret [INAUDIBLE].

355
00:19:46,690 --> 00:19:51,650
So the cool thing about this
is you can do regret matching.

356
00:19:51,650 --> 00:19:55,770
You can let these
weights-- first of all,

357
00:19:55,770 --> 00:19:59,400
you just look at the
positive, the things

358
00:19:59,400 --> 00:20:03,960
with positive regret,
and weight the options.

359
00:20:03,960 --> 00:20:05,470
At each [INAUDIBLE],
we basically

360
00:20:05,470 --> 00:20:09,530
weight the options that have
positive regrets accordingly.

361
00:20:09,530 --> 00:20:12,330
And if you're so lucky that
nothing has positive regret,

362
00:20:12,330 --> 00:20:15,600
you just randomly
pick a strategy.
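
[Editor's note: since the lecture's notation differs from the papers and the slides are not in the transcript, here is a sketch of the quantities just described in one consistent notation. The symbols R, w, and the superscript + are the editor's choices and are assumptions, not the lecture's exact notation.]

```latex
% Mixture played at time t, with weights w_j^t \ge 0 summing to 1:
\[
  \sigma^t = \sum_j w_j^t \, \sigma_j , \qquad
  u_t(\sigma^t) = \sum_j w_j^t \, u_t(\sigma_j).
\]
% Cumulative regret of option k after T steps:
\[
  R_k^T = \sum_{t=1}^{T} \bigl( u_t(\sigma_k) - u_t(\sigma^t) \bigr).
\]
% Goal: average regret bounded by a sequence \epsilon_T that converges to 0:
\[
  \frac{R_k^T}{T} \le \epsilon_T \quad \text{for every } k .
\]
% Regret matching, with R^{+} = \max(R, 0): weight options by positive regret,
\[
  w_k^{T+1} = \frac{R_k^{T,+}}{\sum_j R_j^{T,+}}
  \quad \text{if } \sum_j R_j^{T,+} > 0 ,
\]
% and pick any strategy (for example, at random) when no regret is positive.
```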

363
00:20:15,600 --> 00:20:18,630
Let's do an example,
because I think this

364
00:20:18,630 --> 00:20:21,510
is kind of unclear what it is.

365
00:20:21,510 --> 00:20:24,580
So let's just say we
have two strategies.

366
00:20:24,580 --> 00:20:27,280
The player can pick
one, or the player

367
00:20:27,280 --> 00:20:29,560
can pick two at each
time, or the player can

368
00:20:29,560 --> 00:20:32,880
pick some mixture of one and two.

369
00:20:32,880 --> 00:20:37,010
After a player does that, the
adversary comes out and says,

370
00:20:37,010 --> 00:20:39,310
well, one of them is worth
0 and one of them

371
00:20:39,310 --> 00:20:41,030
is worth 1.

372
00:20:41,030 --> 00:20:44,380
So let's just see
how this works.

373
00:20:44,380 --> 00:20:47,020
So suppose at the
first time step

374
00:20:47,020 --> 00:20:50,624
we picked sigma 2 because we
don't have any regrets yet.

375
00:20:50,624 --> 00:20:52,790
We're just randomly picking
a strategy-- [INAUDIBLE]

376
00:20:52,790 --> 00:20:54,220
sorry, sigma 1.

377
00:20:54,220 --> 00:20:56,790
We'll just randomly
pick sigma 1.

378
00:20:56,790 --> 00:21:03,790
So the adversary now gives
us the value of sigma 1 is 0

379
00:21:03,790 --> 00:21:05,240
and sigma 2 is 1.

380
00:21:05,240 --> 00:21:09,290
And you go, oh, well that
means that the regret

381
00:21:09,290 --> 00:21:13,010
of the first option is 0 and
the regret of the second option

382
00:21:13,010 --> 00:21:13,510
is 1.

383
00:21:13,510 --> 00:21:17,930
The reason this first option is
0 is because we already played

384
00:21:17,930 --> 00:21:20,980
sigma 1, so you can't
have any regrets,

385
00:21:20,980 --> 00:21:23,580
either positive or negative,
for playing sigma 1,

386
00:21:23,580 --> 00:21:27,810
because your option
was playing sigma 1,

387
00:21:27,810 --> 00:21:31,440
but you have some regret
of not playing sigma 2.

388
00:21:31,440 --> 00:21:34,340
Sigma 2 was kind
of the winner here.

389
00:21:34,340 --> 00:21:37,750
If the two were
reversed, we

390
00:21:37,750 --> 00:21:41,890
would have r1 equals 0 and
r2 equals negative 1.

391
00:21:41,890 --> 00:21:44,550
And then we'd become happy
because all our regrets

392
00:21:44,550 --> 00:21:46,790
would be non-positive.

393
00:21:46,790 --> 00:21:50,840
So at t equals 2,
because we have

394
00:21:50,840 --> 00:21:53,890
zero regret here
and regret 1 here,

395
00:21:53,890 --> 00:21:58,890
we actually pick the
strategy to be all sigma 2.

396
00:21:58,890 --> 00:22:03,350
Now the adversary says, OK,
well the value of sigma 1 is 1,

397
00:22:03,350 --> 00:22:07,014
and the value of sigma 2 is
0 for the second time step.

398
00:22:07,014 --> 00:22:07,680
So what happens?

399
00:22:07,680 --> 00:22:11,240
Well, the same thing
happens as before.

400
00:22:11,240 --> 00:22:15,164
Now we have regret of
1 on the first option,

401
00:22:15,164 --> 00:22:17,330
and then regret of 1
on the second option.

402
00:22:17,330 --> 00:22:19,862
So what do we do next?

403
00:22:19,862 --> 00:22:20,820
The regret [INAUDIBLE].

404
00:22:25,090 --> 00:22:29,620
Well, flip a coin or just
pick an even linear combination

405
00:22:29,620 --> 00:22:33,880
of the two strategies, half
of one and half the other.

406
00:22:33,880 --> 00:22:34,925
That's what we can do.

407
00:22:34,925 --> 00:22:36,000
[INAUDIBLE] the same.

408
00:22:36,000 --> 00:22:39,620
So now the adversary
says sigma 1 is 0

409
00:22:39,620 --> 00:22:45,990
and sigma 2 is 1, which
means that the regret of 1

410
00:22:45,990 --> 00:22:48,740
actually goes to 0.5,
and the regret of 2

411
00:22:48,740 --> 00:22:50,450
actually goes to 1.5.

412
00:22:50,450 --> 00:22:51,820
[INAUDIBLE]

413
00:22:51,820 --> 00:22:53,510
1 goes down by 1/2.

414
00:22:53,510 --> 00:22:56,750
So now with these
regrets our weighting

415
00:22:56,750 --> 00:22:59,740
is kind of the ratio of the two.

416
00:22:59,740 --> 00:23:04,470
It's 1/4 sigma 1
and 3/4 sigma 2.

417
00:23:04,470 --> 00:23:07,560
So now the adversary
goes, OK, well sigma 1 is 0.

418
00:23:07,560 --> 00:23:10,070
Sigma 2 is 1.

419
00:23:10,070 --> 00:23:13,460
So this regret actually goes
[INAUDIBLE] down by 3/4,

420
00:23:13,460 --> 00:23:15,360
and this goes up by a 1/4.

421
00:23:15,360 --> 00:23:17,280
And since this is
negative, now we

422
00:23:17,280 --> 00:23:19,905
pick the strategy to be sigma 2.

423
00:23:19,905 --> 00:23:21,570
[INAUDIBLE] and so forth.

424
00:23:21,570 --> 00:23:23,600
Now the adversary
[INAUDIBLE] for us and say,

425
00:23:23,600 --> 00:23:25,360
oh, it's really sigma 1.

426
00:23:25,360 --> 00:23:29,080
Then the regret of sigma
1 would go up to 0.75,

427
00:23:29,080 --> 00:23:30,500
and so on and so forth.

428
00:23:30,500 --> 00:23:36,090
So it seems that the adversary
can make the job tough on us.

429
00:23:36,090 --> 00:23:40,390
Well actually,
there is a theorem

430
00:23:40,390 --> 00:23:43,800
that says, for our
example, [INAUDIBLE].

431
00:23:43,800 --> 00:23:47,010
The square of the first
regret if it's positive

432
00:23:47,010 --> 00:23:49,830
plus the square of the second
regret if it's positive

433
00:23:49,830 --> 00:23:54,130
is always going to be
less than or equal to t.

434
00:23:54,130 --> 00:23:57,980
And that's because,
if [INAUDIBLE] these

435
00:23:57,980 --> 00:24:02,370
are both positive,
it goes, for example,

436
00:24:02,370 --> 00:24:08,110
you are really going r1 plus
or minus whatever amount of r2

437
00:24:08,110 --> 00:24:09,340
you're doing.

438
00:24:09,340 --> 00:24:12,820
And r2 of t now minus
plus whatever amount of r1

439
00:24:12,820 --> 00:24:14,030
you're doing.

440
00:24:14,030 --> 00:24:18,330
The things that [INAUDIBLE]
this you can see the cross terms

441
00:24:18,330 --> 00:24:19,270
cancel each other out.

442
00:24:19,270 --> 00:24:24,250
This becomes 2 r1 r2
divided by r1 plus r2.

443
00:24:24,250 --> 00:24:27,560
So you're left with this
squared plus this squared

444
00:24:27,560 --> 00:24:29,800
plus this squared
plus this squared.

445
00:24:29,800 --> 00:24:31,880
And this squared
plus this squared

446
00:24:31,880 --> 00:24:33,820
is going to be
less than 1, so we

447
00:24:33,820 --> 00:24:37,990
have this here, which means
that the quadratic sum only

448
00:24:37,990 --> 00:24:39,860
[INAUDIBLE] by 1.

449
00:24:39,860 --> 00:24:41,120
We have this bound.

450
00:24:41,120 --> 00:24:43,150
Why is this bound so great?

451
00:24:43,150 --> 00:24:46,380
Well if the square of the
regrets are less than t,

452
00:24:46,380 --> 00:24:50,550
that means the average regret
is going to be [INAUDIBLE] 1

453
00:24:50,550 --> 00:24:51,820
over root t.
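
The argument sketched on the board can be written out as follows. This is a reconstruction under stated assumptions, not the slide itself: utilities lie in [0, 1], regrets start at 0, and R_i^+ denotes max(R_i, 0).

```latex
% Regret matching plays p_i = R_i^+ / (R_1^+ + R_2^+).
% The regret increment of option i is
\[
  \Delta_i = u_i - \sum_j p_j u_j ,
  \qquad\text{so}\qquad
  \sum_i R_i^+ \, \Delta_i = 0 .
\]
% Expanding the squares, the cross terms therefore cancel:
\[
  \sum_i \bigl(R_i^+(t+1)\bigr)^2
  \;\le\; \sum_i \bigl(R_i^+(t) + \Delta_i\bigr)^2
  \;=\; \sum_i \bigl(R_i^+(t)\bigr)^2 + \sum_i \Delta_i^2 .
\]
% With two options, \Delta_1 = p_2 (u_1 - u_2) and \Delta_2 = p_1 (u_2 - u_1), so
\[
  \Delta_1^2 + \Delta_2^2 = (p_1^2 + p_2^2)(u_1 - u_2)^2 \le 1 ,
  \qquad
  \sum_i \bigl(R_i^+(t)\bigr)^2 \le t ,
  \qquad
  \frac{R_i(t)}{t} \le \frac{1}{\sqrt{t}} .
\]
```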

454
00:24:51,820 --> 00:24:54,880
In fact, it's kind of left
as a homework problem.

455
00:24:54,880 --> 00:24:58,440
In the general case, r sub k
of t over t is less than n

456
00:24:58,440 --> 00:25:01,340
minus 1 delta over
root t, where delta

457
00:25:01,340 --> 00:25:04,870
is the maximum
deviation of the options

458
00:25:04,870 --> 00:25:08,100
and n is just the
number of options.

459
00:25:08,100 --> 00:25:09,251
Yeah?

460
00:25:09,251 --> 00:25:13,075
AUDIENCE: I'm curious,
is [INAUDIBLE] in terms

461
00:25:13,075 --> 00:25:17,488
of what is the strategy sigma.

462
00:25:17,488 --> 00:25:18,784
Number of like a payoff?

463
00:25:18,784 --> 00:25:19,700
PROFESSOR: No, no, no.

464
00:25:19,700 --> 00:25:23,670
A strategy sigma, in
terms of poker strategy,

465
00:25:23,670 --> 00:25:26,910
is sort of a description
of what you would do.

466
00:25:29,630 --> 00:25:34,620
Suppose you get ace,
six off suit pre-flop.

467
00:25:34,620 --> 00:25:36,860
A strategy would be a
descriptor of what you would

468
00:25:36,860 --> 00:25:39,160
do at each point of the hand.

469
00:25:39,160 --> 00:25:43,820
So there's some
significance in the fact

470
00:25:43,820 --> 00:25:50,550
that this regret, average
regret, goes to 0.

471
00:25:50,550 --> 00:25:57,000
Well, the significance in
terms of game theory optimal is

472
00:25:57,000 --> 00:25:59,760
suppose the pure
strategies are--

473
00:25:59,760 --> 00:26:03,274
suppose you have a bunch of
pure strategies for x and a bunch

474
00:26:03,274 --> 00:26:04,820
of pure strategies for y.

475
00:26:04,820 --> 00:26:08,480
If we regret match, but
instead of doing an adversary,

476
00:26:08,480 --> 00:26:13,780
we just say t
utility for x is just

477
00:26:13,780 --> 00:26:20,340
the utility for x playing
against sigma yt,

478
00:26:20,340 --> 00:26:24,500
and the utility for y is
just negative utility--

479
00:26:24,500 --> 00:26:28,670
the game utility for y
playing against sigma xt.

480
00:26:28,670 --> 00:26:31,060
This is kind of a
mutual regret matching.

481
00:26:31,060 --> 00:26:33,220
You do regret
matching for x and y

482
00:26:33,220 --> 00:26:35,610
in each step, which
means you just modify

483
00:26:35,610 --> 00:26:39,890
x-- you compute the
regrets at each step.

484
00:26:39,890 --> 00:26:42,570
Then you modify x
[INAUDIBLE] y strategy

485
00:26:42,570 --> 00:26:45,550
by this type of regret matching.

486
00:26:45,550 --> 00:26:51,090
And basically the
strategies that you

487
00:26:51,090 --> 00:26:53,050
choose, the average
strategy, which

488
00:26:53,050 --> 00:26:58,330
is the sum of the strategies
you have had all along

489
00:26:58,330 --> 00:26:59,660
divided by t.

490
00:26:59,660 --> 00:27:05,070
1/t-- all the strategies
you've done in these t steps.

491
00:27:05,070 --> 00:27:08,760
And basically what
happens is now,

492
00:27:08,760 --> 00:27:11,700
if you try [INAUDIBLE] to
exploit a [INAUDIBLE] strategy,

493
00:27:11,700 --> 00:27:16,270
again, this is the best x can
do against y minus the best

494
00:27:16,270 --> 00:27:18,720
y does against x.

495
00:27:18,720 --> 00:27:22,160
You compute this,
and you add the sum

496
00:27:22,160 --> 00:27:27,080
of what actually happened
with x of t and y sub t,

497
00:27:27,080 --> 00:27:29,680
and so on and so forth.

498
00:27:29,680 --> 00:27:39,400
You notice that this is the
regret of k-- of x picking

499
00:27:39,400 --> 00:27:41,185
strategy k all the time.

500
00:27:41,185 --> 00:27:45,140
It's just y picking
strategy j all the time.

501
00:27:45,140 --> 00:27:47,770
So that's less than
2 epsilon over t

502
00:27:47,770 --> 00:27:52,220
because regrets over t converge,
so it's within [INAUDIBLE]

503
00:27:52,220 --> 00:27:53,280
game theory optimal.
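
The mutual regret matching described here can be tried on a toy game. This is a minimal sketch, assuming a small hypothetical 2x2 zero-sum payoff matrix in place of poker and exact expected utilities in place of sampled hands. It averages the strategies over all steps and then measures the gap described above: the best x can do against the average y, minus what x gets when y best-responds to the average x.

```python
def regret_matching(regrets):
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(payoff, steps):
    """payoff[i][j] is x's utility for row i against column j; y gets the negative."""
    n, m = len(payoff), len(payoff[0])
    regret_x, regret_y = [0.0] * n, [0.0] * m
    total_x, total_y = [0.0] * n, [0.0] * m
    for _ in range(steps):
        x, y = regret_matching(regret_x), regret_matching(regret_y)
        # Utility of every pure strategy against the opponent's current mix.
        util_x = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        util_y = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        played_x = sum(x[i] * util_x[i] for i in range(n))
        played_y = sum(y[j] * util_y[j] for j in range(m))
        regret_x = [regret_x[i] + util_x[i] - played_x for i in range(n)]
        regret_y = [regret_y[j] + util_y[j] - played_y for j in range(m)]
        total_x = [total_x[i] + x[i] for i in range(n)]
        total_y = [total_y[j] + y[j] for j in range(m)]
    return [v / steps for v in total_x], [v / steps for v in total_y]


def exploitability(payoff, x, y):
    """Best x can do against y, minus what x gets when y best-responds to x."""
    n, m = len(payoff), len(payoff[0])
    best_for_x = max(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    worst_for_x = min(sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m))
    return best_for_x - worst_for_x


PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical toy game, not poker
avg_x, avg_y = self_play(PAYOFF, 20000)
gap = exploitability(PAYOFF, avg_x, avg_y)
print(avg_x, avg_y, gap)
```

The gap is bounded by the sum of the two players' average regrets, so it shrinks roughly like one over root t as the number of steps grows.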

504
00:27:53,280 --> 00:27:58,720
Basically what this all
means is basically suppose

505
00:27:58,720 --> 00:28:01,980
you choose your strategy,
some mixture of stuff.

506
00:28:01,980 --> 00:28:06,100
Your opponent tries to
figure out how best he

507
00:28:06,100 --> 00:28:07,870
can exploit this strategy.

508
00:28:07,870 --> 00:28:09,990
By the way, this is
often called nemesis.

509
00:28:09,990 --> 00:28:12,570
I really like that name.

510
00:28:12,570 --> 00:28:17,160
Opponent figures out his
nemesis strategy against you.

511
00:28:17,160 --> 00:28:22,721
Then, well, you get to see--
so his nemesis strategies--

512
00:28:22,721 --> 00:28:24,220
unless you're playing
the exact game

513
00:28:24,220 --> 00:28:26,553
theory optimal strategies--
is always going to be better

514
00:28:26,553 --> 00:28:28,100
than the game value.

515
00:28:28,100 --> 00:28:31,620
He looks at what you've done
and finds the best response.

516
00:28:31,620 --> 00:28:35,850
And you do the same to him,
and the difference of those two

517
00:28:35,850 --> 00:28:38,330
game values is kind of the exploitability.

518
00:28:38,330 --> 00:28:44,270
Obviously, this means
basically, if your opponent

519
00:28:44,270 --> 00:28:50,110
sees what you're doing, this is
the best he can do against you.

520
00:28:50,110 --> 00:28:55,300
This number is the one
that's less than 1/1000

521
00:28:55,300 --> 00:28:56,490
of a big blind.

522
00:28:56,490 --> 00:29:00,000
So counterfactual
regret is kind of cool

523
00:29:00,000 --> 00:29:03,500
because-- it's a good
thing I've drawn this tree.

524
00:29:03,500 --> 00:29:09,120
At each of your decision points,
now you can regret match.

525
00:29:09,120 --> 00:29:12,907
So first of all, you
don't need to be fed back

526
00:29:12,907 --> 00:29:14,240
the correct utility [INAUDIBLE].

527
00:29:17,482 --> 00:29:21,560
Here in the example we
gave, we had a u0 and u1.

528
00:29:21,560 --> 00:29:26,920
You'll just be fed back some
unbiased stochastic number that

529
00:29:26,920 --> 00:29:29,470
averages the value of the game.

530
00:29:29,470 --> 00:29:33,080
For example, if you're doing
a regret chain on poker,

531
00:29:33,080 --> 00:29:37,727
it's hard to tell if I come
up with this strategy that

532
00:29:37,727 --> 00:29:39,750
has a bunch of terabytes,
and you come up

533
00:29:39,750 --> 00:29:42,780
with a strategy that's also
a bunch of terabytes--

534
00:29:42,780 --> 00:29:45,700
what's the value of playing
against y [INAUDIBLE]?

535
00:29:45,700 --> 00:29:47,707
But we can just get a sample.

536
00:29:47,707 --> 00:29:48,540
We can get a sample.

537
00:29:51,980 --> 00:29:54,170
Well you can just run it once.

538
00:29:54,170 --> 00:29:55,490
Right, that's the idea.

539
00:29:55,490 --> 00:30:00,960
You get a sample by just
saying, OK, just play one hand,

540
00:30:00,960 --> 00:30:02,328
and see the result of that hand.

541
00:30:04,850 --> 00:30:08,360
And you could use either
random chance or whatever

542
00:30:08,360 --> 00:30:12,490
every time you decide to do
whatever branches of your tree

543
00:30:12,490 --> 00:30:13,690
if you do a mix tree.

544
00:30:13,690 --> 00:30:18,860
So the cool thing already is,
without counterfactual regret,

545
00:30:18,860 --> 00:30:20,790
you can quickly
converge to the solution,

546
00:30:20,790 --> 00:30:27,945
because a lot of strategies,
like fictitious play--

547
00:30:27,945 --> 00:30:29,310
it's the best response.

548
00:30:29,310 --> 00:30:32,090
The best response is hard
to calculate sometimes,

549
00:30:32,090 --> 00:30:38,750
but each simulation can just
be one iteration through it.

550
00:30:38,750 --> 00:30:41,140
And this is
counterfactual regret

551
00:30:41,140 --> 00:30:42,800
because [INAUDIBLE]
is given assuming

552
00:30:42,800 --> 00:30:46,390
that the player does everything
to play to that node.

553
00:30:46,390 --> 00:30:52,280
So the weighting here is nature
just has its probabilities.

554
00:30:52,280 --> 00:30:56,730
If your opponent plays
according to his strategy,

555
00:30:56,730 --> 00:30:58,570
but when you play
you always kind

556
00:30:58,570 --> 00:31:01,370
of play towards that node,
so your weight is actually

557
00:31:01,370 --> 00:31:03,910
1 for each of these
options you pick.

558
00:31:06,860 --> 00:31:10,820
The cool thing is that once
you have the structure set up

559
00:31:10,820 --> 00:31:15,460
where you're just doing
one or a few iterations

560
00:31:15,460 --> 00:31:20,590
throughout the hand, it's
actually pretty easy to set up

561
00:31:20,590 --> 00:31:23,620
different weighting schemes.

562
00:31:23,620 --> 00:31:28,490
For example, if you have two
options and the ace of hearts

563
00:31:28,490 --> 00:31:31,790
comes on the turn, or the deuce
of clubs comes on the turn,

564
00:31:31,790 --> 00:31:34,410
and you don't really have to
worry about the ace of hearts

565
00:31:34,410 --> 00:31:36,420
coming on a turn.

566
00:31:36,420 --> 00:31:37,950
That tree is fine.

567
00:31:37,950 --> 00:31:41,960
That part of the tree has
very little positive regrets.

568
00:31:41,960 --> 00:31:46,040
You can say, OK, we'll
just-- different game where

569
00:31:46,040 --> 00:31:49,520
the ace of hearts
comes about [INAUDIBLE]

570
00:31:49,520 --> 00:31:51,110
at a time the deuce
of clubs comes,

571
00:31:51,110 --> 00:31:53,770
but we're going to
weight the results by 10.

572
00:31:53,770 --> 00:31:56,010
You still get the same answer.

573
00:31:56,010 --> 00:31:59,150
It's just that you get a much
coarser kind of [INAUDIBLE]

574
00:31:59,150 --> 00:32:01,860
every time the ace of hearts
comes, but already kind of know

575
00:32:01,860 --> 00:32:02,750
what to do with that.

576
00:32:02,750 --> 00:32:04,870
You can work on
the deuce of clubs.

577
00:32:04,870 --> 00:32:06,900
So there are a lot of different
weighting schemes.

578
00:32:10,206 --> 00:32:12,830
This means that the hands can be
kind of sampled intrinsically.

579
00:32:15,350 --> 00:32:20,760
So the final algorithm they
had was counterfactual regret plus.

580
00:32:20,760 --> 00:32:25,100
So instead of having
accumulated negative regrets,

581
00:32:25,100 --> 00:32:29,900
basically a lot of these option
regrets can be really negative.

582
00:32:29,900 --> 00:32:33,040
Folding aces pre-flop
quickly turns

583
00:32:33,040 --> 00:32:34,855
to really negative regret.

584
00:32:37,420 --> 00:32:40,730
You lose your small
blind, and hopefully

585
00:32:40,730 --> 00:32:45,120
if you play it limit hold
'em, you could win more

586
00:32:45,120 --> 00:32:46,590
than the small blind.

587
00:32:46,590 --> 00:32:50,550
So you accumulate a lot of
[INAUDIBLE] so that option

588
00:32:50,550 --> 00:32:52,850
falls off the map.

589
00:32:52,850 --> 00:32:55,440
Their innovation in
counterfactual regret plus

590
00:32:55,440 --> 00:32:59,020
is to, instead of putting
a big negative number

591
00:32:59,020 --> 00:33:02,040
to a lot of these things,
they just floor them at 0.

592
00:33:02,040 --> 00:33:04,110
And the reason
they floor at 0 is

593
00:33:04,110 --> 00:33:09,050
because you know this is a
simultaneous evolution

594
00:33:09,050 --> 00:33:15,630
of strategies where even
strategies at the beginning

595
00:33:15,630 --> 00:33:17,550
just might not be
great strategies,

596
00:33:17,550 --> 00:33:21,840
and you want to-- if
regret of something is 0,

597
00:33:21,840 --> 00:33:26,620
you can get regret back
faster if it's the right thing

598
00:33:26,620 --> 00:33:29,690
to do to respond to your
opponent's strategy.
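
The effect of flooring at 0 can be seen on a single option. This is a minimal sketch of the flooring step only, with a hypothetical history of regret increments. It makes no claim about any other detail of the actual algorithm.

```python
def accumulate(deltas, floor_at_zero):
    """Running regret for one option, optionally floored at 0 after each step."""
    regret = 0.0
    for delta in deltas:
        regret += delta
        if floor_at_zero:
            regret = max(regret, 0.0)
    return regret


# Hypothetical history: the option looks bad for 10 steps, then good for 3.
deltas = [-1.0] * 10 + [1.0] * 3
plain = accumulate(deltas, floor_at_zero=False)
floored = accumulate(deltas, floor_at_zero=True)
print(plain, floored)
```

Without the floor, the option is still at negative regret after three good steps and gets no weight. With the floor, it is back in the mix as soon as it starts doing well.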

599
00:33:29,690 --> 00:33:32,700
All of these
things-- suppose you

600
00:33:32,700 --> 00:33:35,260
start with a random initial
guess for your opponent's

601
00:33:35,260 --> 00:33:36,420
strategy.

602
00:33:36,420 --> 00:33:39,540
Then you actually have a
pretty reasonable strategy,

603
00:33:39,540 --> 00:33:43,210
which is bet and raise
every time with every hand.

604
00:33:43,210 --> 00:33:47,270
If your opponent has a random
strategy, he might just fold.

605
00:33:47,270 --> 00:33:49,660
So later in the streets,
it's probably [INAUDIBLE]

606
00:33:49,660 --> 00:33:52,360
just bet and raise every
time with every hand.

607
00:33:52,360 --> 00:33:53,907
He raises you back.

608
00:33:53,907 --> 00:33:55,240
It's not like he knows anything.

609
00:33:55,240 --> 00:33:56,427
It's a random strategy.

610
00:33:56,427 --> 00:33:58,010
Just raise him back
and hope he folds.

611
00:33:58,010 --> 00:34:00,790
If he doesn't fold and
calls, you bet again on the next street,

612
00:34:00,790 --> 00:34:02,890
because now the pot is bigger.

613
00:34:02,890 --> 00:34:06,060
So he has a 1/3
chance of folding.

614
00:34:06,060 --> 00:34:07,670
You should bet.

615
00:34:07,670 --> 00:34:11,719
So that evolves quick.

616
00:34:11,719 --> 00:34:16,230
If you start off with a random
tree with no information,

617
00:34:16,230 --> 00:34:21,580
that starts off as
the dominant strategy.

618
00:34:21,580 --> 00:34:25,429
And then you have to walk
that back as your opponent's

619
00:34:25,429 --> 00:34:27,080
strategy evolves also.

620
00:34:27,080 --> 00:34:30,000
By the way, they're
actually keeping

621
00:34:30,000 --> 00:34:33,630
two trees-- one for the
small blind strategy,

622
00:34:33,630 --> 00:34:35,159
and one for the
big blind strategy.

623
00:34:35,159 --> 00:34:38,350
And this is everything with
respect to the small blind.

624
00:34:38,350 --> 00:34:43,090
The small blind isn't-- so let's
just go into the next slide

625
00:34:43,090 --> 00:34:44,112
probably.

626
00:34:44,112 --> 00:34:45,070
[INAUDIBLE] have to be.

627
00:34:49,040 --> 00:34:54,139
So let's try to figure out how
big the strategy space in limit

628
00:34:54,139 --> 00:34:56,770
hold 'em has to be.

629
00:34:56,770 --> 00:34:59,180
So let's concentrate
on river nodes

630
00:34:59,180 --> 00:35:03,470
because that's
most of the nodes.

631
00:35:03,470 --> 00:35:08,380
It's a tree so we just have
to calculate the leaves.

632
00:35:08,380 --> 00:35:13,050
So first of all,
assuming a four bet cap--

633
00:35:13,050 --> 00:35:15,860
the reason we assume a four bet
cap-- well, I don't know why,

634
00:35:15,860 --> 00:35:21,870
but it seems that that's--
so this is one approximation,

635
00:35:21,870 --> 00:35:26,280
the four bet cap, but this
is kind of normal in types

636
00:35:26,280 --> 00:35:27,325
of research papers.

637
00:35:29,950 --> 00:35:33,370
If we have a four bet cap, there
are nine possible actions that

638
00:35:33,370 --> 00:35:34,710
get you to the next street.

639
00:35:34,710 --> 00:35:38,910
There are some actions that
[INAUDIBLE] like player one

640
00:35:38,910 --> 00:35:43,230
bets and player two folds, but
if you don't get to the street,

641
00:35:43,230 --> 00:35:47,580
you don't get to the
river, and that's

642
00:35:47,580 --> 00:35:50,996
a pretty small
percentage of the nodes.

643
00:35:50,996 --> 00:35:52,620
So why are there nine
possible actions?

644
00:35:55,680 --> 00:35:56,720
Let's count them.

645
00:35:56,720 --> 00:35:59,950
One of the actions that gets to
the next street is check check.

646
00:35:59,950 --> 00:36:01,630
So that's one.

647
00:36:01,630 --> 00:36:02,529
What are the eight?

648
00:36:02,529 --> 00:36:03,570
What are the other eight?

649
00:36:06,824 --> 00:36:07,740
AUDIENCE: [INAUDIBLE].

650
00:36:10,660 --> 00:36:12,350
PROFESSOR: Right, check raise.

651
00:36:12,350 --> 00:36:15,010
Let's try systemic
[INAUDIBLE] count them.

652
00:36:15,010 --> 00:36:20,590
So I claim that there are two
ways to put one bet in the pot.

653
00:36:20,590 --> 00:36:22,643
Player one can bet, and
player two can call,

654
00:36:22,643 --> 00:36:24,476
or player one can check,
player two can bet,

655
00:36:24,476 --> 00:36:26,420
and player one can call.

656
00:36:26,420 --> 00:36:29,970
In fact, there are two ways
to put k bets in the pot

657
00:36:29,970 --> 00:36:33,630
if k is greater than 0.

658
00:36:33,630 --> 00:36:37,030
If you want to put three bets
in a pot, what are the two ways?

659
00:36:39,736 --> 00:36:41,090
AUDIENCE: [INAUDIBLE].

660
00:36:41,090 --> 00:36:42,790
PROFESSOR: Right.

661
00:36:42,790 --> 00:36:43,690
Yeah, right.

662
00:36:43,690 --> 00:36:46,860
Bet, raise, re-raise, call, and
check, bet, raise, re-raise,

663
00:36:46,860 --> 00:36:47,640
call.

664
00:36:47,640 --> 00:36:54,140
So if the cap is k bets,
there's always 2k plus 1 ways

665
00:36:54,140 --> 00:36:55,120
to [INAUDIBLE] three.

666
00:36:55,120 --> 00:36:57,880
So there are nine possible
actions in each betting

667
00:36:57,880 --> 00:36:59,430
round before the river.
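
The 2k plus 1 count can be checked by brute force. This is a minimal sketch, assuming a simplified betting round where the first player may check or bet, every bet can be called, raised (up to the cap), or folded, and only sequences that reach the next street are kept. It ignores the blinds, so it is the same simplification the talk makes for pre-flop.

```python
def continuing_sequences(cap):
    """List every betting sequence in one round that reaches the next street."""
    results = []

    def walk(history, bets, checks):
        if bets == 0:
            if checks == 1:
                # Second player checks behind: check, check.
                results.append(history + ["check"])
            else:
                walk(history + ["check"], 0, 1)
            walk(history + ["bet"], 1, checks)
        else:
            # A call closes the round; a fold ends the hand and is not counted.
            results.append(history + ["call"])
            if bets < cap:
                walk(history + ["raise"], bets + 1, checks)

    walk([], 0, 0)
    return results


sequences = continuing_sequences(4)
for seq in sequences:
    print(", ".join(seq))
print(len(sequences))
```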

668
00:36:59,430 --> 00:37:01,790
So there are three betting
rounds-- pre-flop, flop,

669
00:37:01,790 --> 00:37:03,070
and turn.

670
00:37:03,070 --> 00:37:07,990
So let's use some
symmetries because I

671
00:37:07,990 --> 00:37:10,490
don't think the optimal strategy
has you playing something

672
00:37:10,490 --> 00:37:14,500
differently with ace, six of
diamonds and ace, six of hearts.

673
00:37:14,500 --> 00:37:15,990
[INAUDIBLE] very easy to prove.

674
00:37:15,990 --> 00:37:19,530
The optimal strategy
doesn't have that.

675
00:37:19,530 --> 00:37:24,830
So using symmetries on a flop--
so how many distinct flops are

676
00:37:24,830 --> 00:37:26,760
there?

677
00:37:26,760 --> 00:37:31,130
Well, I like to think
about it as where

678
00:37:31,130 --> 00:37:32,890
the suits have symmetries.

679
00:37:32,890 --> 00:37:35,800
I like to think about
it as, well, there

680
00:37:35,800 --> 00:37:38,490
could be three suits in a
flop, two suits in a flop,

681
00:37:38,490 --> 00:37:40,070
or one suit.

682
00:37:40,070 --> 00:37:44,980
So if there's one suit on a
flop, there's 13 [INAUDIBLE]

683
00:37:44,980 --> 00:37:46,320
combinations.

684
00:37:46,320 --> 00:37:48,470
That's pretty straightforward.

685
00:37:48,470 --> 00:37:55,180
If there are two suits on a
flop, what's the combinations?

686
00:37:55,180 --> 00:37:57,820
There are 13 possibilities
for one of the suits,

687
00:37:57,820 --> 00:38:01,920
and there are 13 [INAUDIBLE]
for the other suit.

688
00:38:01,920 --> 00:38:03,970
It's based on heart or
something like that.

689
00:38:03,970 --> 00:38:05,390
The suits are symmetric.

690
00:38:05,390 --> 00:38:09,020
So there are 1014
things [INAUDIBLE].

691
00:38:09,020 --> 00:38:10,890
This is [INAUDIBLE] the things.

692
00:38:10,890 --> 00:38:15,090
And if it's three suited,
you just choose three ranks,

693
00:38:15,090 --> 00:38:16,400
but it's not 13 choose 3.

694
00:38:16,400 --> 00:38:18,890
It's 15 choose 3 because why?

695
00:38:22,800 --> 00:38:24,310
I guess the ranks can be equal.

696
00:38:28,030 --> 00:38:31,210
So it would be 13 choose 3 if
the ranks had to be unique,

697
00:38:31,210 --> 00:38:32,975
but you could have
three aces on the flop.

698
00:38:32,975 --> 00:38:35,540
So this is actually 15 choose 3.

699
00:38:35,540 --> 00:38:41,410
So there's 455 three suited
flops, [INAUDIBLE] flops.

700
00:38:41,410 --> 00:38:43,230
That's kind of the
big explosion in limit

701
00:38:43,230 --> 00:38:46,120
hold 'em, pre-flop to flop.

702
00:38:46,120 --> 00:38:48,190
So there is not [INAUDIBLE]
possible actions

703
00:38:48,190 --> 00:38:49,300
in each betting round.

704
00:38:49,300 --> 00:38:51,670
So let's count the number
of turns and rivers.

705
00:38:51,670 --> 00:38:54,520
There's [INAUDIBLE]
turns and 48 rivers.

706
00:38:54,520 --> 00:38:58,820
So counting that, you have
a billion possible action

707
00:38:58,820 --> 00:39:01,360
sequences to the river.

708
00:39:01,360 --> 00:39:05,125
The [INAUDIBLE] things in
each street, all the flops,

709
00:39:05,125 --> 00:39:08,040
then the turns and rivers.

710
00:39:08,040 --> 00:39:14,110
But each river, there could be
up to 2,162 [INAUDIBLE] types.

711
00:39:14,110 --> 00:39:17,020
47 times 46.

712
00:39:17,020 --> 00:39:21,450
Making about 6.5 trillion
hand river types.
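
The arithmetic behind these counts can be reproduced directly. This is a minimal sketch, assuming 49 turn cards (the transcript is inaudible there, but 52 minus the three flop cards gives 49), 13 choose 3 for one-suit flops, and 15 choose 3 for three-suit flops, which is what the quoted 455 implies.

```python
from math import comb

one_suit = comb(13, 3)          # three distinct ranks in one suit
two_suits = 13 * comb(13, 2)    # one rank in one suit, two ranks in the other
three_suits = comb(15, 3)       # three ranks, repeats allowed (multiset count)
flops = one_suit + two_suits + three_suits

actions_per_round = 2 * 4 + 1   # four-bet cap
turns, rivers = 49, 48          # 49 is an assumption: 52 cards minus the flop
sequences = actions_per_round ** 3 * flops * turns * rivers
hands_per_river = 47 * 46       # the "47 times 46" from the talk
river_hand_nodes = sequences * hands_per_river
print(flops, sequences, river_hand_nodes)
```

Under these assumptions the flop count comes to 1,755, the action-and-board sequences to about three billion, and the final product to about 6.5 trillion, in line with the figure quoted in the talk.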

713
00:39:21,450 --> 00:39:24,440
Each node should be
visited about 1,000 times.

714
00:39:24,440 --> 00:39:26,220
It's a big
computational problem,

715
00:39:26,220 --> 00:39:28,100
but it's still
tractable, especially

716
00:39:28,100 --> 00:39:30,070
if you have 900 years of CPU.

717
00:39:34,810 --> 00:39:37,030
And they also used
many shortcuts.

718
00:39:37,030 --> 00:39:39,180
They use all the
symmetries I talked about,

719
00:39:39,180 --> 00:39:42,530
and they also have
a few shortcuts.

720
00:39:42,530 --> 00:39:46,340
And you can see
these trees are big.

721
00:39:46,340 --> 00:39:50,200
Terabytes of memory to
actually store your strategy.

722
00:39:50,200 --> 00:39:54,780
So you can't really
get that on a node yet.

723
00:39:54,780 --> 00:39:55,580
I don't know.

724
00:39:55,580 --> 00:39:56,910
Can you fit that on a node now?

725
00:39:56,910 --> 00:39:58,460
Does anybody know?

726
00:39:58,460 --> 00:40:03,340
I don't know of a CPU that has
[INAUDIBLE] bytes of RAM yet.

727
00:40:08,220 --> 00:40:11,690
What they did was they broke
the problem up into about 100

728
00:40:11,690 --> 00:40:14,170
[INAUDIBLE] different
sub-games, and they just

729
00:40:14,170 --> 00:40:15,960
worked on those sub-games.

730
00:40:15,960 --> 00:40:18,380
In fact, I guess if
you're clever about it,

731
00:40:18,380 --> 00:40:23,590
you can use cache memory when
you get down to the river.

732
00:40:23,590 --> 00:40:27,970
Things are pretty close, and
you know that using cache memory

733
00:40:27,970 --> 00:40:30,700
is faster than using
[INAUDIBLE] memory.

734
00:40:30,700 --> 00:40:32,980
You can take advantage
of these things.

735
00:40:32,980 --> 00:40:36,460
A lot of these updates
through these regrets

736
00:40:36,460 --> 00:40:40,180
are just simple addition,
and you can just

737
00:40:40,180 --> 00:40:44,750
optimize the heck out of this,
and I'm sure they did it.

738
00:40:44,750 --> 00:40:49,530
Let's just try to
solve some other games.

739
00:40:49,530 --> 00:40:54,140
I have two games
that seem accessible.

740
00:40:54,140 --> 00:40:57,500
Suppose we do Omaha eight.

741
00:40:57,500 --> 00:41:02,460
Well, this is exactly the same
structure as limit hold 'em.

742
00:41:02,460 --> 00:41:06,750
You just change the hole cards.

743
00:41:06,750 --> 00:41:13,370
So instead of having 47 choose
2 different river hands,

744
00:41:13,370 --> 00:41:15,090
you have 47 choose 4.

745
00:41:15,090 --> 00:41:22,010
That's like a multiple of 82.5x
of the original tree,

746
00:41:22,010 --> 00:41:24,410
so that's not that bad.

747
00:41:24,410 --> 00:41:29,420
900 CPU years-- this is
just 75,000 CPU years.

748
00:41:29,420 --> 00:41:34,050
If it were a matter
of national security

749
00:41:34,050 --> 00:41:35,680
to get the exact
solution to Omaha,

750
00:41:35,680 --> 00:41:39,150
the military could just
do it in a few months.

751
00:41:39,150 --> 00:41:42,110
There's also [INAUDIBLE]
you can do, by the way.

752
00:41:42,110 --> 00:41:46,270
Basically, what they did
is-- before they did this,

753
00:41:46,270 --> 00:41:49,030
was that they
solved the sub-game.

754
00:41:49,030 --> 00:41:52,830
In that, basically if you
bucket hands together

755
00:41:52,830 --> 00:41:55,970
and you say you have to play
these hands the same way,

756
00:41:55,970 --> 00:42:04,340
that's basically a sub-strategy.

757
00:42:04,340 --> 00:42:07,160
You can consider
a subspace of your strategy

758
00:42:07,160 --> 00:42:10,090
x prime of x and y
prime of y, and you just

759
00:42:10,090 --> 00:42:14,530
solved the x prime y prime
game, meaning you both get hands

760
00:42:14,530 --> 00:42:18,150
together, probably on the river
because that's when bucketing

761
00:42:18,150 --> 00:42:20,820
kind of becomes more necessary.

762
00:42:20,820 --> 00:42:24,310
And you solve that
game, and you go, well,

763
00:42:24,310 --> 00:42:28,250
how optimal is x prime
in the whole game?

764
00:42:28,250 --> 00:42:34,180
And if you're good at bucketing,
it may be pretty close.

765
00:42:34,180 --> 00:42:36,400
If you're bad at
bucketing, like you

766
00:42:36,400 --> 00:42:41,870
put the aces in the same
bucket as seven, five suited,

767
00:42:41,870 --> 00:42:44,240
you probably won't
get a great answer.

768
00:42:44,240 --> 00:42:51,010
So you need to intelligently
design your buckets.

769
00:42:51,010 --> 00:42:55,530
You can't-- well, I guess there
are also evolutionary things

770
00:42:55,530 --> 00:42:59,100
you can do to try to design
buckets and see what things are

771
00:42:59,100 --> 00:43:01,780
close to each other.

772
00:43:01,780 --> 00:43:03,882
People who have
familiarity with this

773
00:43:03,882 --> 00:43:07,290
know that this is
kind of hit or miss.

774
00:43:07,290 --> 00:43:13,440
Another game that you
can maybe solve is razz.

775
00:43:13,440 --> 00:43:15,840
It's definitely as simple
as [INAUDIBLE] a stud.

776
00:43:15,840 --> 00:43:18,140
Why is razz simpler than
all other games of stud?

777
00:43:18,140 --> 00:43:20,600
There are only 13
different cards.

778
00:43:20,600 --> 00:43:23,100
The deuce of spades is the same
card as the deuce of hearts.

779
00:43:23,100 --> 00:43:25,360
You can't-- well, you
could make flushes,

780
00:43:25,360 --> 00:43:29,040
but they're irrelevant.

781
00:43:29,040 --> 00:43:33,100
So unfortunately there
are 13 to the 8th power

782
00:43:33,100 --> 00:43:36,030
possible ways of cards
can come, because there

783
00:43:36,030 --> 00:43:38,570
are four up cards.

784
00:43:38,570 --> 00:43:41,080
That's sort of the problem.

785
00:43:41,080 --> 00:43:42,830
Kind of the
community information

786
00:43:42,830 --> 00:43:47,200
you have is a bigger
set, and your trees just

787
00:43:47,200 --> 00:43:52,810
get bigger because now
you have one extra street.

788
00:43:52,810 --> 00:43:58,540
And you still have 15 choose 3
combinations of any three ranks

789
00:43:58,540 --> 00:43:59,800
as river hand types.

790
00:43:59,800 --> 00:44:03,740
So there are 2.4
quadrillion river hands.

791
00:44:03,740 --> 00:44:07,280
So that's a factor
of 374 [INAUDIBLE],

792
00:44:07,280 --> 00:44:12,730
but we think some of these
nodes are pretty null.

793
00:44:12,730 --> 00:44:16,620
How many of you
actually play razz?

794
00:44:16,620 --> 00:44:17,310
A couple of you.

795
00:44:17,310 --> 00:44:18,285
OK, great.

796
00:44:18,285 --> 00:44:22,740
Good poker class that
people study razz.

797
00:44:22,740 --> 00:44:25,850
If you have a queen up
and a deuce completes it,

798
00:44:25,850 --> 00:44:28,080
you're not really going
to get into a raising war

799
00:44:28,080 --> 00:44:34,310
and make it [INAUDIBLE]
cap on third street.

800
00:44:34,310 --> 00:44:36,460
Some of the [INAUDIBLE]
may be null.

801
00:44:36,460 --> 00:44:40,280
You can do some
bucketing, perhaps.

802
00:44:40,280 --> 00:44:44,090
Razz is kind of more
natural to bucketing

803
00:44:44,090 --> 00:44:50,690
because you can think about
what hands to bucket together.

804
00:44:50,690 --> 00:44:54,250
Maybe the king,
eight, six, deuce

805
00:44:54,250 --> 00:44:57,380
is very close to the
king, eight, six, ace.

806
00:44:57,380 --> 00:45:02,680
And the two strategies--
and you can sort hands

807
00:45:02,680 --> 00:45:06,380
by rank order of cards
or something like that.

808
00:45:06,380 --> 00:45:09,400
So this is 374.

809
00:45:09,400 --> 00:45:11,290
This is 82.5.

810
00:45:11,290 --> 00:45:17,230
Or you could apply
for a grant and say

811
00:45:17,230 --> 00:45:22,190
we need x hours of CPU time.

812
00:45:22,190 --> 00:45:24,250
I don't know what the
right strategy is,

813
00:45:24,250 --> 00:45:28,560
but these two problems
are tractable.

814
00:45:28,560 --> 00:45:30,960
Let's talk about big bet
games because there's

815
00:45:30,960 --> 00:45:42,530
been some sort of discussion,
even last night, about Snowie.

816
00:45:42,530 --> 00:45:48,480
A few people have tried big bet
games, and there are problems.

817
00:45:48,480 --> 00:45:53,150
First of all, there's a
continuum of bet sizes

818
00:45:53,150 --> 00:45:53,820
you can make.

819
00:45:53,820 --> 00:45:57,210
The Snowie solution just
assumes three bet sizes.

820
00:45:57,210 --> 00:46:00,470
I can bet half the pot, I can
bet the pot, or I can jam,

821
00:46:00,470 --> 00:46:01,530
I think.

822
00:46:01,530 --> 00:46:03,980
Maybe there's-- I can
bet two times the pot,

823
00:46:03,980 --> 00:46:07,680
but the problem with that is
that I think that's a little

824
00:46:07,680 --> 00:46:08,550
bit too coarse.

825
00:46:08,550 --> 00:46:10,860
The question is, if
you solved that game,

826
00:46:10,860 --> 00:46:15,370
how close is that
solution to the real game?

827
00:46:15,370 --> 00:46:18,612
And that's kind of an
interesting question,

828
00:46:18,612 --> 00:46:20,445
but you don't even have
a complete strategy.

829
00:46:23,360 --> 00:46:26,200
What if some guy bets
a quarter of the pot,

830
00:46:26,200 --> 00:46:30,930
or 1.5 times the pot, something
that's not on your list?

831
00:46:30,930 --> 00:46:35,970
You have to extrapolate-- and
then it gets kind of weird,

832
00:46:35,970 --> 00:46:39,500
because my response
to a pot size bet

833
00:46:39,500 --> 00:46:41,145
is to raise the pot again.

834
00:46:41,145 --> 00:46:44,890
All right, what if he
makes a 1.1 times the pot?

835
00:46:44,890 --> 00:46:51,560
Is it right to raise the pot--
just raise the pot 1.1 times

836
00:46:51,560 --> 00:46:55,630
or raise the pot 0.9 times so
you get back to the same stack

837
00:46:55,630 --> 00:46:59,675
sizes so you can do the
same thing in the future.

838
00:46:59,675 --> 00:47:00,925
These are difficult questions.

839
00:47:05,690 --> 00:47:08,360
Even if some bets
[INAUDIBLE] are non-optimal,

840
00:47:08,360 --> 00:47:11,900
our full strategy needs
responses to the bets.

841
00:47:11,900 --> 00:47:14,510
So simple
approximations may work.

842
00:47:14,510 --> 00:47:19,770
I kind of feel this is kind
of a tough problem, though.

843
00:47:19,770 --> 00:47:24,730
And you could just--
just playing a game

844
00:47:24,730 --> 00:47:28,820
where you can just make
rigid pot bet sizes,

845
00:47:28,820 --> 00:47:31,990
then you might get something
actually interesting.

846
00:47:31,990 --> 00:47:35,120
But one of the things
with regret matching,

847
00:47:35,120 --> 00:47:38,560
if you actually have a lot of
bet sizes, suppose you say,

848
00:47:38,560 --> 00:47:40,580
OK, I'm just going
to kill this problem,

849
00:47:40,580 --> 00:47:47,020
and I'm going to do 0.01 times
the pot, 0.02 times the pot,

850
00:47:47,020 --> 00:47:49,230
0.03 times the pot,
and so on and so forth.

851
00:47:49,230 --> 00:47:51,990
The problem is now you have a
lot of options which are really

852
00:47:51,990 --> 00:47:55,500
close in equity together,
so this regret minimization

853
00:47:55,500 --> 00:47:57,330
is going to take a while.

854
00:47:57,330 --> 00:48:02,100
It's going to have to sort
out really close events.

855
00:48:02,100 --> 00:48:04,500
And then it's going to
have to balance your value

856
00:48:04,500 --> 00:48:06,610
bets with your bluffs
and things like that.

857
00:48:06,610 --> 00:48:10,590
So even just trying to kill it
by putting a lot of bet types

858
00:48:10,590 --> 00:48:14,870
may not solve the
problem for you.

859
00:48:19,630 --> 00:48:22,490
So two player,
three player games

860
00:48:22,490 --> 00:48:26,010
are actually kind
of interesting.

861
00:48:26,010 --> 00:48:29,130
This was addressed by the group
in "Using counterfactual

862
00:48:29,130 --> 00:48:33,190
regret to create competitive
multi-player agents."

863
00:48:33,190 --> 00:48:42,320
And this is a paper
done in 2011 or so.

864
00:48:42,320 --> 00:48:44,360
And the programs actually
took first and second

865
00:48:44,360 --> 00:48:47,760
in the annual three
player limit event--

866
00:48:47,760 --> 00:48:50,930
the first problem is that
there's no guarantee of epsilon

867
00:48:50,930 --> 00:48:53,030
convergence.

868
00:48:53,030 --> 00:48:56,140
You're not necessarily within
epsilon of Nash equilibrium.

869
00:48:56,140 --> 00:49:01,450
Second problem is
that, do you just

870
00:49:01,450 --> 00:49:03,160
want to play a
Nash equilibrium?

871
00:49:03,160 --> 00:49:06,340
There could be multiple Nash
equilibria in multi-way games,

872
00:49:06,340 --> 00:49:12,840
especially in these proportional
payout tournaments, satellites

873
00:49:12,840 --> 00:49:16,110
where, say, two
people get a seat.

874
00:49:16,110 --> 00:49:18,450
There are really
nonlinear effects going on,

875
00:49:18,450 --> 00:49:20,940
and it could [INAUDIBLE]
which collusive equilibria are

876
00:49:20,940 --> 00:49:21,490
you playing?

877
00:49:25,331 --> 00:49:28,510
In our book, Jared
and I point out

878
00:49:28,510 --> 00:49:31,910
a game called the
rock maniac game

879
00:49:31,910 --> 00:49:33,820
where it's a real
poker game where

880
00:49:33,820 --> 00:49:39,170
players can use a simple
strategy and ensure you lose.

881
00:49:39,170 --> 00:49:41,920
A simple
non-poker version,

882
00:49:41,920 --> 00:49:45,750
like a game where you play even
or odds with three players,

883
00:49:45,750 --> 00:49:49,200
but the odd man out wins.

884
00:49:49,200 --> 00:49:53,220
So suppose you and
I are colluding

885
00:49:53,220 --> 00:49:54,810
against the third chump.

886
00:49:54,810 --> 00:49:56,940
What would we do?

887
00:49:56,940 --> 00:49:57,900
AUDIENCE: [INAUDIBLE].

888
00:49:57,900 --> 00:50:00,990
PROFESSOR: Right, I would
play one, and you'd play two.

889
00:50:00,990 --> 00:50:04,620
And the third guy
could never win.
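
[Editor's sketch, not part of the lecture] The collusion argument above can be checked by brute force. The rules below are an assumed reading of the game as described: three players each pick 1 or 2, and whoever differs from the other two is the odd man out and wins. The function names are hypothetical.

```python
def odd_man_out(choices):
    """Return the index of the player whose pick differs from the other two.

    Returns None when all three picks match (assumed here to mean nobody wins).
    """
    picks = list(choices)
    for i, pick in enumerate(picks):
        if picks.count(pick) == 1:
            return i
    return None


def third_player_results(colluder_a, colluder_b):
    """For each pick the third player can make, report whether they win."""
    return [odd_man_out((colluder_a, colluder_b, t)) == 2 for t in (1, 2)]


# The colluders split their picks (one plays 1, the other plays 2), so the
# third player always matches one of them and is never the odd man out.
print(third_player_results(1, 2))
```

Whatever the third player picks, the winner is always one of the two colluders, which is the point of the example.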

890
00:50:04,620 --> 00:50:10,360
There are situations which can
come up in poker like that,

891
00:50:10,360 --> 00:50:18,990
but I think if
there's no collusion

892
00:50:18,990 --> 00:50:22,660
and it's not a tournament,
playing Nash equilibria usually

893
00:50:22,660 --> 00:50:24,190
turns out OK.

894
00:50:24,190 --> 00:50:28,850
I think that's sort of the
argument they were making

895
00:50:28,850 --> 00:50:30,680
in creating these strategies.

896
00:50:30,680 --> 00:50:32,570
All right, here
are the references.

897
00:50:35,870 --> 00:50:41,065
This took about [INAUDIBLE] the
time I estimated, so questions?

898
00:50:43,620 --> 00:50:47,660
OK, let's just-- you
had your hand up first.

899
00:50:47,660 --> 00:50:49,975
AUDIENCE: Well, the
original strategy

900
00:50:49,975 --> 00:50:52,360
finds that the Nash
equilibria, if you're

901
00:50:52,360 --> 00:50:56,116
playing against someone who's
trying to beat [INAUDIBLE]

902
00:50:56,116 --> 00:51:01,845
strategy-- does it work
if one of the strategies

903
00:51:01,845 --> 00:51:03,825
is probabilistic.

904
00:51:03,825 --> 00:51:05,674
Let's say two strategy trees--

905
00:51:05,674 --> 00:51:06,840
PROFESSOR: Yeah, yeah, yeah.

906
00:51:06,840 --> 00:51:07,803
It does work with--

907
00:51:07,803 --> 00:51:09,344
AUDIENCE: Choose
[INAUDIBLE], but you

908
00:51:09,344 --> 00:51:11,506
don't know always
which one I'll choose.

909
00:51:11,506 --> 00:51:14,610
PROFESSOR: Yeah, it
works because you're

910
00:51:14,610 --> 00:51:18,720
going to play-- all
of these strategies

911
00:51:18,720 --> 00:51:23,050
assume that they could
be mixed strategies.

912
00:51:23,050 --> 00:51:26,600
If you're not allowed to play
1/3 rock, 1/3 paper, and 1/3

913
00:51:26,600 --> 00:51:28,280
scissors, then
you're going to have

914
00:51:28,280 --> 00:51:32,170
to play a really bad strategy,
and there's definitely

915
00:51:32,170 --> 00:51:35,780
times in which mixing is
going to be necessary.

916
00:51:35,780 --> 00:51:36,540
So, yeah.

917
00:51:36,540 --> 00:51:38,815
All of these
strategies have mixing.

918
00:51:41,660 --> 00:51:42,160
Yeah?

919
00:51:42,160 --> 00:51:43,618
AUDIENCE: What
effects do you think

920
00:51:43,618 --> 00:51:48,502
[INAUDIBLE] going to have
on limit hold 'em games?

921
00:51:48,502 --> 00:51:49,460
PROFESSOR: I don't know.

922
00:51:49,460 --> 00:51:55,660
I think pretty much before
the solution came out

923
00:51:55,660 --> 00:51:59,440
the big online players kind
of knew that a lot of people

924
00:51:59,440 --> 00:52:04,040
were playing near optimal, and I
think the game is kind of dead.

925
00:52:04,040 --> 00:52:06,254
What do you think, Mike?

926
00:52:06,254 --> 00:52:08,210
AUDIENCE: [INAUDIBLE].

927
00:52:08,210 --> 00:52:10,910
PROFESSOR: Right.

928
00:52:10,910 --> 00:52:13,010
Too bad Matt doesn't come here.

929
00:52:13,010 --> 00:52:13,980
AUDIENCE: [INAUDIBLE] are
already basically doing this

930
00:52:13,980 --> 00:52:14,600
anyway.

931
00:52:14,600 --> 00:52:15,660
PROFESSOR: Well, no.

932
00:52:15,660 --> 00:52:18,940
I mean, even if you have the
strategy, you have to learn it.

933
00:52:25,880 --> 00:52:28,880
The problem is that, if you
go to a casino and you play

934
00:52:28,880 --> 00:52:33,110
somebody who's a good
limit hold 'em player,

935
00:52:33,110 --> 00:52:36,730
he's-- because these types
of strategies have been out

936
00:52:36,730 --> 00:52:42,650
for a while, they already play
much closer to optimal than

937
00:52:42,650 --> 00:52:44,280
they did before.

938
00:52:44,280 --> 00:52:49,500
So I think this would have
absolutely no effect on heads

939
00:52:49,500 --> 00:52:51,670
up limit hold 'em.

940
00:52:51,670 --> 00:52:54,379
It's already kind
of no one-- yes?

941
00:52:54,379 --> 00:52:56,670
AUDIENCE: So can you talk
more about different ways you

942
00:52:56,670 --> 00:52:58,086
can do approximations.
[INAUDIBLE]

943
00:52:58,086 --> 00:53:01,830
mentioning earlier bucketing
all of the different hands

944
00:53:01,830 --> 00:53:05,090
[INAUDIBLE] the
ranks or what are

945
00:53:05,090 --> 00:53:06,870
some other things we can do?

946
00:53:06,870 --> 00:53:09,710
PROFESSOR: It's an
endless [INAUDIBLE]

947
00:53:09,710 --> 00:53:12,770
be clever in bucketing.

948
00:53:12,770 --> 00:53:14,570
So bucket hand types together.

949
00:53:17,360 --> 00:53:20,370
One kind of clever
thing you can do

950
00:53:20,370 --> 00:53:23,430
is try to cut out
the river entirely

951
00:53:23,430 --> 00:53:28,020
by just estimating your
equity on the river.

952
00:53:28,020 --> 00:53:30,270
Of course, that's not going
to be your showdown equity

953
00:53:30,270 --> 00:53:35,750
because you may be
forced to face a bet.

954
00:53:35,750 --> 00:53:42,890
So you try some sort of
implied value of your hand.

955
00:53:46,200 --> 00:53:46,700
Let's see.

956
00:53:46,700 --> 00:53:48,710
What other bucketing things.

957
00:53:52,480 --> 00:53:57,340
I mean, in some games there's
a sort of a natural way

958
00:53:57,340 --> 00:54:00,000
of bucketing hand types.

959
00:54:00,000 --> 00:54:06,570
Like on the river in
Omaha, you could just

960
00:54:06,570 --> 00:54:10,520
try to bucket the cards that
actually play and ignore

961
00:54:10,520 --> 00:54:12,030
the other cards.

962
00:54:12,030 --> 00:54:14,880
The thing is that, when you do
things like that, [INAUDIBLE]

963
00:54:14,880 --> 00:54:18,526
losing something we
call card removal.

964
00:54:18,526 --> 00:54:21,610
Card removal and
blocking players

965
00:54:21,610 --> 00:54:23,810
from having the nuts
and things like that

966
00:54:23,810 --> 00:54:26,810
are pretty important--
do turn out

967
00:54:26,810 --> 00:54:30,450
to be a pretty important
part of the game theory

968
00:54:30,450 --> 00:54:33,000
optimal solution when
you're getting down

969
00:54:33,000 --> 00:54:36,830
to the milli-big-blind
kind of level.

970
00:54:36,830 --> 00:54:42,300
And if you don't think
about card removal at all,

971
00:54:42,300 --> 00:54:46,600
then you actually
have a strategy that

972
00:54:46,600 --> 00:54:48,300
can be exploited pretty easily.

973
00:54:48,300 --> 00:54:51,770
Actually, I talked
about this yesterday.

974
00:54:51,770 --> 00:54:59,630
The thing is typically
when the pot is p

975
00:54:59,630 --> 00:55:02,280
and you're facing
a bet, you want

976
00:55:02,280 --> 00:55:04,590
to make them
indifferent to bluffing.

977
00:55:04,590 --> 00:55:13,670
He's betting 1 to win p, so you
want to call about p over p

978
00:55:13,670 --> 00:55:15,070
plus 1 of the time.

979
00:55:15,070 --> 00:55:17,340
If you don't call
this much, he's

980
00:55:17,340 --> 00:55:20,110
going to bluff and take it.

981
00:55:20,110 --> 00:55:22,890
So that's sort of the thing.

982
00:55:22,890 --> 00:55:27,520
We're saying the bet
is 1 and the pot is p.

983
00:55:27,520 --> 00:55:30,980
So if the pot is
10, and he bets 1,

984
00:55:30,980 --> 00:55:33,900
and he takes it more
than 1/11 of the time,

985
00:55:33,900 --> 00:55:38,990
he's going to just--
[INAUDIBLE] bluff everything.
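
[Editor's sketch, not part of the lecture] The indifference argument above can be worked through numerically. This assumes a pure bluff that wins the pot when the opponent folds and loses the bet when called, and never wins at showdown. The function names are hypothetical.

```python
from fractions import Fraction


def min_call_frequency(pot, bet=1):
    """Calling frequency that makes a pure bluff break even: pot / (pot + bet)."""
    pot, bet = Fraction(pot), Fraction(bet)
    return pot / (pot + bet)


def bluff_ev(pot, bet, call_freq):
    """EV of a pure bluff: win the pot on a fold, lose the bet on a call."""
    pot, bet, call_freq = Fraction(pot), Fraction(bet), Fraction(call_freq)
    return (1 - call_freq) * pot - call_freq * bet


# Pot of 10, bet of 1: calling 10/11 of the time makes the bluff worth zero.
call = min_call_frequency(10, 1)
print(call, bluff_ev(10, 1, call))

# Folding more than 1/11 of the time (here calling only 4/5) makes every
# bluff profitable.
print(bluff_ev(10, 1, Fraction(4, 5)))
```

With the bet normalized to 1 this reduces to the p over p plus 1 figure quoted in the lecture.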

986
00:55:38,990 --> 00:55:42,750
The real problem
becomes that, if you

987
00:55:42,750 --> 00:55:45,740
don't think about
card removal at all,

988
00:55:45,740 --> 00:55:49,540
he can start bluffing
hands in which he knows

989
00:55:49,540 --> 00:55:55,960
it's more likely you have a
mediocre hand or something that

990
00:55:55,960 --> 00:55:57,970
excludes a strong hand.

991
00:55:57,970 --> 00:56:01,920
One real example is
in PLO when there

992
00:56:01,920 --> 00:56:07,107
is a flush on the board,
what's a good bluff?

993
00:56:07,107 --> 00:56:08,690
AUDIENCE: You have
ace of [INAUDIBLE].

994
00:56:08,690 --> 00:56:10,642
PROFESSOR: Right, you
have the ace in a suit.

995
00:56:10,642 --> 00:56:11,850
You don't have anything else.

996
00:56:11,850 --> 00:56:13,980
That's a great bluff,
because you're blocking him

997
00:56:13,980 --> 00:56:17,960
from having a great
hand, and you're

998
00:56:17,960 --> 00:56:22,260
blocking all of his
nut hands and a lot

999
00:56:22,260 --> 00:56:23,900
of his really good hands.

1000
00:56:23,900 --> 00:56:28,350
And he's much more
likely to fold,

1001
00:56:28,350 --> 00:56:33,490
because if you bet the
pot, a lot of his hands

1002
00:56:33,490 --> 00:56:38,050
he's [INAUDIBLE] himself with
[INAUDIBLE] with the nut flush.

1003
00:56:38,050 --> 00:56:39,570
Oh, I have a natural call.

1004
00:56:39,570 --> 00:56:40,270
Are you all in?

1005
00:56:40,270 --> 00:56:40,936
I have the nuts?

1006
00:56:40,936 --> 00:56:42,850
OK, I call.

1007
00:56:42,850 --> 00:56:47,277
So that's why card
removal is important.

1008
00:56:50,690 --> 00:56:51,956
Yeah?

1009
00:56:51,956 --> 00:56:53,928
AUDIENCE: So is my
understanding correct

1010
00:56:53,928 --> 00:56:56,886
that optimal [INAUDIBLE]?

1011
00:56:59,796 --> 00:57:00,420
PROFESSOR: Yes.

1012
00:57:00,420 --> 00:57:04,758
AUDIENCE: And has there been any
study of optimal [INAUDIBLE].

1013
00:57:08,720 --> 00:57:10,345
PROFESSOR: Sort of
like utility theory.

1014
00:57:13,700 --> 00:57:20,760
In poker in general,
it's kind of weird.

1015
00:57:20,760 --> 00:57:23,750
People think a lot
about that [INAUDIBLE]

1016
00:57:23,750 --> 00:57:26,070
what tournament
they should enter,

1017
00:57:26,070 --> 00:57:28,050
what games they should play.

1018
00:57:28,050 --> 00:57:31,810
But there hasn't
been a study really

1019
00:57:31,810 --> 00:57:35,160
optimizing your own personal
utility within the games.

1020
00:57:35,160 --> 00:57:38,250
The assumption is kind
of like, well, I'm

1021
00:57:38,250 --> 00:57:41,075
going to use all this cool
utility theory [INAUDIBLE]

1022
00:57:41,075 --> 00:57:42,575
to figure out what
game I'm playing.

1023
00:57:42,575 --> 00:57:44,074
As long as I'm
playing the game, I'm

1024
00:57:44,074 --> 00:57:47,440
just going to try to
win the most money.

1025
00:57:47,440 --> 00:57:49,870
That's sort of
been the attitude,

1026
00:57:49,870 --> 00:57:54,440
and I think that's actually
correct for most [INAUDIBLE].

1027
00:57:54,440 --> 00:57:57,060
In limit hold 'em,
[INAUDIBLE] you

1028
00:57:57,060 --> 00:58:00,460
need bankrolls of
hundreds of bets.

1029
00:58:00,460 --> 00:58:02,740
You're not going to
try to optimize and try

1030
00:58:02,740 --> 00:58:09,269
to win some fraction of a bet
with your utility function

1031
00:58:09,269 --> 00:58:10,310
by lowering the variance.

1032
00:58:15,520 --> 00:58:20,910
That is an interesting
question, because maybe-- I

1033
00:58:20,910 --> 00:58:25,460
feel that, if there is some
utility consideration--

1034
00:58:25,460 --> 00:58:28,020
like maybe in a tournament
you feel your chips are

1035
00:58:28,020 --> 00:58:32,150
non-linear-- maybe
you are going to quit

1036
00:58:32,150 --> 00:58:35,850
playing your marginal
hands because of utility

1037
00:58:35,850 --> 00:58:38,240
considerations.

1038
00:58:38,240 --> 00:58:41,200
AUDIENCE: [INAUDIBLE] like the
final table of major events.

1039
00:58:41,200 --> 00:58:44,215
They'll go beyond ICM to
say maybe I won't coin

1040
00:58:44,215 --> 00:58:48,215
flip for a $10 edge
[INAUDIBLE] step up.

1041
00:58:51,590 --> 00:58:54,190
PROFESSOR: I mean,
if you use ICM,

1042
00:58:54,190 --> 00:58:59,300
those utilities are already
kind of calculated, but yeah.

1043
00:58:59,300 --> 00:59:02,060
For example, final
table of the main event,

1044
00:59:02,060 --> 00:59:06,510
I'm not only using ICM,
but I'm thinking, well,

1045
00:59:06,510 --> 00:59:13,200
$3 million-- $4 million
compared to $2 million

1046
00:59:13,200 --> 00:59:15,420
is a much smaller step
to me than $2 million

1047
00:59:15,420 --> 00:59:20,080
is compared to 0 in my
own personal utility.

1048
00:59:23,720 --> 00:59:27,660
Like $0.5 million compared to
$2 million versus $2 million

1049
00:59:27,660 --> 00:59:29,400
compared to $3.5 million.

1050
00:59:29,400 --> 00:59:34,760
So I need to optimize utility.

1051
00:59:34,760 --> 00:59:36,620
I mean, yeah.

1052
00:59:36,620 --> 00:59:40,376
I think that's kind
of worthy of study.

1053
00:59:40,376 --> 00:59:41,268
Yeah?

1054
00:59:41,268 --> 00:59:45,230
AUDIENCE: What is it about
the analytics of poker

1055
00:59:45,230 --> 00:59:47,700
that makes it so popular
with trading firms?

1056
00:59:47,700 --> 00:59:49,541
And how does it--

1057
00:59:49,541 --> 00:59:50,290
PROFESSOR: Oh, OK.

1058
00:59:50,290 --> 00:59:51,996
That's a great question.

1059
00:59:51,996 --> 00:59:54,336
AUDIENCE: How do you
use it professionally,

1060
00:59:54,336 --> 00:59:55,280
all of this stuff?

1061
00:59:55,280 --> 00:59:58,140
PROFESSOR: Well, I mean,
I think poker is just

1062
00:59:58,140 --> 01:00:03,840
kind of-- if you think what
one game-- if you could teach

1063
01:00:03,840 --> 01:00:08,500
traders one game, what
one game would represent

1064
01:00:08,500 --> 01:00:09,870
what traders have to know?

1065
01:00:09,870 --> 01:00:13,070
Well poker-- there
are a lot of actors.

1066
01:00:13,070 --> 01:00:15,580
There's incomplete information.

1067
01:00:15,580 --> 01:00:17,900
That's one big thing.

1068
01:00:17,900 --> 01:00:21,715
And you do have to do a lot of
thinking of what your counter

1069
01:00:21,715 --> 01:00:23,610
party is doing.

1070
01:00:23,610 --> 01:00:25,480
If he wants to
trade against you,

1071
01:00:25,480 --> 01:00:30,100
he puts a bid or
offer-- some of that

1072
01:00:30,100 --> 01:00:36,050
is why there's this [INAUDIBLE].

1073
01:00:36,050 --> 01:00:37,935
Are you trying to
get out of risk?

1074
01:00:37,935 --> 01:00:40,780
[INAUDIBLE] big position
he's trying to get out of,

1075
01:00:40,780 --> 01:00:44,750
or do you have to be
worried about these orders

1076
01:00:44,750 --> 01:00:45,920
and things like that?

1077
01:00:45,920 --> 01:00:50,280
And also poker gives
you sort of the skills

1078
01:00:50,280 --> 01:00:55,539
to trade that-- suppose you
know something is worth $10.

1079
01:00:55,539 --> 01:00:57,830
[INAUDIBLE] you're going to
make around it [INAUDIBLE].

1080
01:00:57,830 --> 01:01:01,740
Knowing nothing, you might
make-- bid [INAUDIBLE]

1081
01:01:01,740 --> 01:01:03,810
offer at 10/10,
which means you're

1082
01:01:03,810 --> 01:01:07,150
willing to buy the [INAUDIBLE]
or sell it at 10/10,

1083
01:01:07,150 --> 01:01:10,170
but you know something
about the counter party.

1084
01:01:10,170 --> 01:01:12,810
You may know the
counter party can

1085
01:01:12,810 --> 01:01:20,320
be a better buyer than seller
or that buying is the risky part

1086
01:01:20,320 --> 01:01:22,830
[INAUDIBLE] is the risky part.

1087
01:01:22,830 --> 01:01:25,590
That kind of has a quant.

1088
01:01:25,590 --> 01:01:28,510
Also, as a quant,
doing poker analytics

1089
01:01:28,510 --> 01:01:33,910
is very similar to the
analysis we do in trading.

1090
01:01:33,910 --> 01:01:38,350
A lot of this analysis-- how
these strategies work, do

1091
01:01:38,350 --> 01:01:43,540
these strategies really return
what we think they return

1092
01:01:43,540 --> 01:01:47,320
are similar to discussions we
have in our trading strategy.

1093
01:01:47,320 --> 01:01:50,400
I'm glad I'm able to talk to you
about this, because if you're

1094
01:01:50,400 --> 01:01:53,390
interested in doing
poker strategies,

1095
01:01:53,390 --> 01:01:57,090
you'll probably be interested in
doing trading strategies, too.

1096
01:01:57,090 --> 01:01:58,132
Any more questions?

1097
01:02:01,430 --> 01:02:02,310
Yes?

1098
01:02:02,310 --> 01:02:07,705
AUDIENCE: What about doing
the deviation from [INAUDIBLE]

1099
01:02:07,705 --> 01:02:12,930
the [INAUDIBLE] detecting
deviation or let's say somebody

1100
01:02:12,930 --> 01:02:15,420
goes from playing
optimally [INAUDIBLE]

1101
01:02:15,420 --> 01:02:17,320
not playing optimal [INAUDIBLE].

1102
01:02:21,470 --> 01:02:24,140
PROFESSOR: Yeah, I mean that's
a very interesting thing,

1103
01:02:24,140 --> 01:02:30,270
and that's actually hard
to determine because that

1104
01:02:30,270 --> 01:02:32,070
feels a little bit
harder than this

1105
01:02:32,070 --> 01:02:33,960
because this is [INAUDIBLE].

1106
01:02:33,960 --> 01:02:36,340
It's like I'm trying to figure
out the optimal strategy,

1107
01:02:36,340 --> 01:02:40,040
and I just play this,
and whatever money

1108
01:02:40,040 --> 01:02:42,550
comes to me comes to me.

1109
01:02:42,550 --> 01:02:43,600
You open your arms.

1110
01:02:43,600 --> 01:02:46,500
The money comes to you.

1111
01:02:46,500 --> 01:02:49,970
The other thing is, oh,
well he's playing badly,

1112
01:02:49,970 --> 01:02:52,380
so I'm going to go there
and take his money.

1113
01:02:52,380 --> 01:02:55,740
But then if I
deviate from optimal,

1114
01:02:55,740 --> 01:03:00,220
I'm also opening up
myself to being exploited.

1115
01:03:00,220 --> 01:03:02,480
So that's kind of hard.

1116
01:03:02,480 --> 01:03:05,880
That's much more of
a dynamic problem.

1117
01:03:05,880 --> 01:03:07,260
When does he go on tilt?

1118
01:03:07,260 --> 01:03:09,330
How long was he on tilt?

1119
01:03:09,330 --> 01:03:15,780
What evidence do we
have that he's on tilt?

1120
01:03:15,780 --> 01:03:21,450
I know that [INAUDIBLE],
the guys at CMU,

1121
01:03:21,450 --> 01:03:26,490
were looking into some
sort of zero loss way

1122
01:03:26,490 --> 01:03:29,630
to exploit your opponents,
because you just figure out

1123
01:03:29,630 --> 01:03:32,160
when your opponents
are playing badly,

1124
01:03:32,160 --> 01:03:35,800
how much they've given up
in playing sub-optimally,

1125
01:03:35,800 --> 01:03:37,670
and then you go
to a [INAUDIBLE].

1126
01:03:37,670 --> 01:03:41,830
But you only open
up yourself to, say,

1127
01:03:41,830 --> 01:03:43,750
half the money he's
given up, or something

1128
01:03:43,750 --> 01:03:46,120
like that, playing badly.

1129
01:03:46,120 --> 01:03:53,080
And the metric is-- so there's
some sort of gaming algorithm

1130
01:03:53,080 --> 01:03:57,002
you can do to do that,
but yeah that's definitely

1131
01:03:57,002 --> 01:03:57,960
another field of study.

1132
01:03:57,960 --> 01:04:02,450
There are a lot of interesting
fields that can come out of poker

1133
01:04:02,450 --> 01:04:02,950
[INAUDIBLE].

1134
01:04:06,171 --> 01:04:06,670
All right.

1135
01:04:06,670 --> 01:04:07,570
I guess that's it.

1136
01:04:07,570 --> 01:04:09,120
[APPLAUSE]