1
00:00:00,090 --> 00:00:02,490
The following content is
provided under a Creative

2
00:00:02,490 --> 00:00:04,030
Commons license.

3
00:00:04,030 --> 00:00:06,360
Your support will help
MIT OpenCourseWare

4
00:00:06,360 --> 00:00:10,720
continue to offer high quality
educational resources for free.

5
00:00:10,720 --> 00:00:13,320
To make a donation or
view additional materials

6
00:00:13,320 --> 00:00:17,280
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,280 --> 00:00:18,450
at ocw.mit.edu.

8
00:00:21,709 --> 00:00:23,500
ERIK DEMAINE: All right,
let's get started.

9
00:00:23,500 --> 00:00:27,960
So today, we start on geometry,
geometric data structures.

10
00:00:27,960 --> 00:00:29,680
There are two lectures on this.

11
00:00:29,680 --> 00:00:31,080
This is lecture one.

12
00:00:31,080 --> 00:00:33,910
And we're going to solve
two main problems today.

13
00:00:33,910 --> 00:00:36,900
One is point location, which
is finding yourself on a map.

14
00:00:36,900 --> 00:00:38,740
And the other is
orthogonal range searching,

15
00:00:38,740 --> 00:00:43,280
which is catching a bunch of
dots with a rectangular net.

16
00:00:43,280 --> 00:00:44,700
And they're fun problems.

17
00:00:44,700 --> 00:00:48,420
And they're good illustrations
of a couple of techniques.

18
00:00:48,420 --> 00:00:51,521
We're going to cover two general
techniques for data structure

19
00:00:51,521 --> 00:00:52,020
building.

20
00:00:52,020 --> 00:00:54,120
One is dynamizing
static data structures,

21
00:00:54,120 --> 00:00:57,030
turning static into dynamic
using a technique called weight

22
00:00:57,030 --> 00:00:58,582
balance, which is really cool.

23
00:00:58,582 --> 00:01:00,540
And another one is called
fractional cascading,

24
00:01:00,540 --> 00:01:02,331
which has probably one
of the coolest names

25
00:01:02,331 --> 00:01:04,739
of any algorithmic or
data structures technique.

26
00:01:04,739 --> 00:01:07,320
It's actually a very simple
idea, but sounds very scary.

27
00:01:10,710 --> 00:01:12,470
And with point
location, we're going

28
00:01:12,470 --> 00:01:15,180
to see some fun connections to
persistence and retroactivity,

29
00:01:15,180 --> 00:01:18,635
which was the topic of the last
two lectures, you may recall.

30
00:01:18,635 --> 00:01:20,010
And so we'll start
out with that.

31
00:01:22,970 --> 00:01:27,600
Planar point location, you
can do it in higher dimensions

32
00:01:27,600 --> 00:01:28,500
as well.

33
00:01:28,500 --> 00:01:30,060
In general, geometric
data structures

34
00:01:30,060 --> 00:01:32,680
are about going to more
than one dimension.

35
00:01:32,680 --> 00:01:35,850
Most data structures are about
one dimensional ordered data.

36
00:01:35,850 --> 00:01:39,430
Now, we have points
in the plane.

37
00:01:39,430 --> 00:01:42,630
We might have
polygons in the plane.

38
00:01:42,630 --> 00:01:48,810
So this is what we
call a planar map,

39
00:01:48,810 --> 00:01:50,550
got a bunch of line
segments and points

40
00:01:50,550 --> 00:01:51,930
forming a graph structure.

41
00:01:51,930 --> 00:01:53,610
So think of it as a
planar graph drawn

42
00:01:53,610 --> 00:01:57,060
in the plane where every edge
is a straight line segment.

43
00:01:57,060 --> 00:02:01,050
And none of the edges
cross, let's say.

44
00:02:01,050 --> 00:02:04,335
So this is a planar map.

45
00:02:04,335 --> 00:02:08,520
It's also called a planar
straight line graph.

46
00:02:08,520 --> 00:02:12,010
And the static version
of this problem--

47
00:02:12,010 --> 00:02:14,910
so there's two versions,
one is static--

48
00:02:14,910 --> 00:02:18,490
you want to preprocess the map.

49
00:02:18,490 --> 00:02:21,960
So I give you a
single map up front.

50
00:02:21,960 --> 00:02:27,120
And then I want to support
dynamic queries, which ask

51
00:02:27,120 --> 00:02:33,800
which face contains a point p.

52
00:02:38,010 --> 00:02:39,570
So that point is
going to be given

53
00:02:39,570 --> 00:02:42,750
to you as coordinates x and y.

54
00:02:42,750 --> 00:02:47,430
So maybe I mark a
point like this one.

55
00:02:47,430 --> 00:02:49,000
I give you those x
and y coordinates.

56
00:02:49,000 --> 00:02:51,930
I want to quickly
determine that this face is

57
00:02:51,930 --> 00:02:53,355
the one that contains it.

58
00:02:53,355 --> 00:02:54,960
I give you another
point over here.

59
00:02:54,960 --> 00:02:57,250
It quickly determines this face.

60
00:02:57,250 --> 00:02:58,590
This has a lot of applications.

61
00:02:58,590 --> 00:03:02,137
If you're writing a GUI and
someone clicks on the screen,

62
00:03:02,137 --> 00:03:04,470
you need to map the coordinates
that the mouse gives you

63
00:03:04,470 --> 00:03:08,760
to which GUI element
you're clicking on.

64
00:03:08,760 --> 00:03:12,070
If you have a GPS
device and it has a map,

65
00:03:12,070 --> 00:03:14,370
so it's preprocessed
the map all at once.

66
00:03:14,370 --> 00:03:16,560
And now, given two GPS
coordinates, latitude,

67
00:03:16,560 --> 00:03:19,021
longitude, it needs to
know which city you're in,

68
00:03:19,021 --> 00:03:21,270
which part of the map you're
in, so that it knows what

69
00:03:21,270 --> 00:03:24,250
to display, that sort of thing.

70
00:03:24,250 --> 00:03:26,190
These are all planar
point location problems.

71
00:03:26,190 --> 00:03:28,150
It comes up in simulation,
lots of things.

72
00:03:28,150 --> 00:03:29,774
It's actually one of
the first problems

73
00:03:29,774 --> 00:03:33,930
that got me interested in
algorithms way back

74
00:03:33,930 --> 00:03:36,600
in my oceanography days.

75
00:03:36,600 --> 00:03:39,660
So that's planar point location.

76
00:03:39,660 --> 00:03:40,770
That's the static version.

77
00:03:40,770 --> 00:03:44,010
The dynamic version--
make things harder--

78
00:03:44,010 --> 00:03:45,976
is the map is dynamic.

79
00:03:45,976 --> 00:03:47,100
So here, the map is static.

80
00:03:47,100 --> 00:03:49,110
The queries are
still coming online.

81
00:03:49,110 --> 00:03:53,940
Dynamic version, you can insert
and delete edges in your map.

82
00:03:57,390 --> 00:03:59,730
And let's say if you get
a vertex down to degree 0,

83
00:03:59,730 --> 00:04:02,180
you can delete the vertex
as well, add new degree

84
00:04:02,180 --> 00:04:03,390
0 vertices.

85
00:04:03,390 --> 00:04:05,220
As long as you
don't have crossings

86
00:04:05,220 --> 00:04:07,980
introduced by inserting
edges, you can change things.

87
00:04:07,980 --> 00:04:11,040
So that's obviously harder.

88
00:04:11,040 --> 00:04:14,150
And we can solve this problem
using persistence and using

89
00:04:14,150 --> 00:04:19,079
retroactivity in a pretty simple
way using a technique which you

90
00:04:19,079 --> 00:04:23,040
may have seen before,
pretty classic technique

91
00:04:23,040 --> 00:04:24,186
in computational geometry.

92
00:04:24,186 --> 00:04:25,560
So this is a
technique that comes

93
00:04:25,560 --> 00:04:27,120
from the algorithms world.

94
00:04:27,120 --> 00:04:31,860
And we're going to apply it
to the data structures world.

95
00:04:31,860 --> 00:04:37,360
So, sweep line technique,
it's a very simple idea.

96
00:04:37,360 --> 00:04:42,900
So you have some line
segments in the plane,

97
00:04:42,900 --> 00:04:45,000
something like this.

98
00:04:45,000 --> 00:04:48,750
And I'm going to
take a vertical line.

99
00:04:48,750 --> 00:04:52,900
So the algorithmic
problem is I want to know

100
00:04:52,900 --> 00:04:53,910
are there any crossings.

101
00:04:53,910 --> 00:04:55,201
Do any of these segments cross?

102
00:04:55,201 --> 00:04:57,240
This is where sweep line
technique comes from,

103
00:04:57,240 --> 00:04:58,460
I believe.

104
00:04:58,460 --> 00:05:02,650
So the idea is we want to
linearize or one-dimensionalify

105
00:05:02,650 --> 00:05:03,160
the problem.

106
00:05:03,160 --> 00:05:06,210
So just take a slice of the
problem with a vertical line.

107
00:05:06,210 --> 00:05:09,370
And imagine sweeping that
line from left to right.

108
00:05:09,370 --> 00:05:11,610
So you imagine it
moving continuously.

109
00:05:11,610 --> 00:05:14,730
Of course, in reality,
it moves discretely.

110
00:05:20,390 --> 00:05:23,510
Let me disambiguate
this a little bit.

111
00:05:33,291 --> 00:05:33,790
OK.

112
00:05:33,790 --> 00:05:35,320
There are discrete
moments in time

113
00:05:35,320 --> 00:05:38,425
when what is hit by
the sweep line changes.

114
00:05:41,370 --> 00:05:42,930
Let me maybe label
these segments.

115
00:05:42,930 --> 00:05:48,790
We've got a, b, c, and d.

116
00:05:48,790 --> 00:05:50,620
So initially, we hit nothing.

117
00:05:50,620 --> 00:05:54,064
Then we hit a, then we hit b.

118
00:05:54,064 --> 00:05:54,730
Why do we hit b?

119
00:05:54,730 --> 00:05:56,977
Because we saw the
left endpoint of b.

120
00:05:56,977 --> 00:05:59,560
Then we see the right endpoint
of a which means we no longer--

121
00:05:59,560 --> 00:06:04,610
sorry, at this point, we see
both a and b in that order.

122
00:06:04,610 --> 00:06:07,230
Then we lose a, so
we're down to b.

123
00:06:07,230 --> 00:06:08,410
Then we see c.

124
00:06:08,410 --> 00:06:11,050
c is above b.

125
00:06:11,050 --> 00:06:15,760
Then we see d. d
is above c and b.

126
00:06:15,760 --> 00:06:18,180
Then c and d cross.

127
00:06:18,180 --> 00:06:20,920
So c and d change positions.

128
00:06:20,920 --> 00:06:22,900
And then we have b.

129
00:06:22,900 --> 00:06:28,630
Then we lose b, then we lose c.

130
00:06:28,630 --> 00:06:30,970
Then we lose d.

131
00:06:30,970 --> 00:06:34,930
This is a classic algorithm for
detecting these intersections.

132
00:06:34,930 --> 00:06:37,090
I don't want to get into
details how you do this.

133
00:06:37,090 --> 00:06:41,110
But you're trying to look for
when things change in order

134
00:06:41,110 --> 00:06:42,682
in these cross-sections.

135
00:06:42,682 --> 00:06:44,140
The way you do that
is you maintain

136
00:06:44,140 --> 00:06:46,670
the cross-section in
a binary search tree,

137
00:06:46,670 --> 00:06:48,040
so you maintain the order.

138
00:06:48,040 --> 00:06:50,080
If you hit a left endpoint, you
insert into the binary search

139
00:06:50,080 --> 00:06:50,260
tree.

140
00:06:50,260 --> 00:06:52,300
If you see a right endpoint, you
delete from the binary search

141
00:06:52,300 --> 00:06:53,110
tree.

142
00:06:53,110 --> 00:06:54,670
And you do some stuff
to check for crossings.

143
00:06:54,670 --> 00:06:56,336
In this problem, there
are no crossings.

144
00:06:56,336 --> 00:06:59,290
So we don't need to
worry about that.

145
00:06:59,290 --> 00:07:01,280
But we're taking this technique.

146
00:07:01,280 --> 00:07:03,470
Say, OK, there's a
data structure here,

147
00:07:03,470 --> 00:07:08,800
which is the binary search tree
maintaining the cross-section.

148
00:07:08,800 --> 00:07:09,330
OK.

149
00:07:09,330 --> 00:07:26,120
So, typically, the
cross-section data structure

150
00:07:26,120 --> 00:07:29,830
is regular balanced
binary search tree.

151
00:07:34,150 --> 00:07:39,260
Our idea is what if
we add persistence

152
00:07:39,260 --> 00:07:40,400
to that binary search tree?

153
00:07:40,400 --> 00:07:42,441
So instead of using a
regular binary search tree,

154
00:07:42,441 --> 00:07:45,440
we use a partially persistent
balanced binary search

155
00:07:45,440 --> 00:07:47,139
tree, which we know how to do.

156
00:07:47,139 --> 00:07:48,680
This is a bounded
in-degree structure.

157
00:07:48,680 --> 00:07:53,070
We can make it partially
persistent, constant overhead.

158
00:07:53,070 --> 00:08:07,430
So if we add
partial persistence,

159
00:08:07,430 --> 00:08:09,590
what does that let us do?

160
00:08:09,590 --> 00:08:11,360
Well, let's just
look at a moment

161
00:08:11,360 --> 00:08:14,340
in the past. Partial persistence
is about querying the past.

162
00:08:14,340 --> 00:08:17,210
So there's a sequence of
insertions and deletions

163
00:08:17,210 --> 00:08:19,217
that occur from the sweep line.

164
00:08:19,217 --> 00:08:21,050
But now, if we can query
in the past, that's

165
00:08:21,050 --> 00:08:24,860
like going to a desired
x-coordinate and saying,

166
00:08:24,860 --> 00:08:28,620
what does my data structure
look like at this moment?

167
00:08:28,620 --> 00:08:29,740
OK.

168
00:08:29,740 --> 00:08:32,049
Now, the data structure,
let's maybe look at this one,

169
00:08:32,049 --> 00:08:35,870
because it's got three
elements, very exciting.

170
00:08:35,870 --> 00:08:39,970
So you've got d, then c, then b.

171
00:08:39,970 --> 00:08:41,530
So you've got a
little data structure

172
00:08:41,530 --> 00:08:44,870
that looks something like this.

173
00:08:44,870 --> 00:08:46,780
It understands the order
of the cross-section

174
00:08:46,780 --> 00:08:48,670
of those segments.

175
00:08:48,670 --> 00:08:52,690
And so, for example, if
I was given a query point

176
00:08:52,690 --> 00:08:57,370
like this one, I
could figure out

177
00:08:57,370 --> 00:09:00,010
what is the segment above me,
what is the segment below me.

178
00:09:00,010 --> 00:09:03,190
That is a successor query
and a predecessor query

179
00:09:03,190 --> 00:09:04,840
in that binary search tree.

180
00:09:11,500 --> 00:09:16,750
This notation maybe-- a query
at time t of, let's say,

181
00:09:16,750 --> 00:09:35,370
successor of y is what we call
an upward ray shooting query

182
00:09:35,370 --> 00:09:38,620
from coordinates t,y.

183
00:09:38,620 --> 00:09:42,330
So t, the time, is
acting as x-coordinate.

184
00:09:42,330 --> 00:09:45,640
Time is left to right here.

185
00:09:45,640 --> 00:09:49,080
And so what's happening is we're
imagining, from this point,

186
00:09:49,080 --> 00:09:51,960
shooting a ray upward
and asking what is

187
00:09:51,960 --> 00:09:53,670
the segment that I hit first.

188
00:09:53,670 --> 00:09:56,130
That's an upward
ray shooting query.

189
00:09:56,130 --> 00:10:01,710
And this is from a problem
called vertical ray shooting,

190
00:10:01,710 --> 00:10:05,055
which is more or less equivalent
to planar point location.

191
00:10:10,710 --> 00:10:16,560
So vertical ray shooting, again,
you're given a map, planar map.

192
00:10:16,560 --> 00:10:19,540
And the queries are like this.

193
00:10:19,540 --> 00:10:23,430
What is the first segment that
you hit with an upward ray?

194
00:10:35,750 --> 00:10:39,110
So I give you a point, x,y.

195
00:10:39,110 --> 00:10:41,060
And I ask, if I
go up from there,

196
00:10:41,060 --> 00:10:43,720
what's the next edge that I get?

197
00:10:43,720 --> 00:10:45,797
That's the vertical
ray shooting problem.

198
00:10:45,797 --> 00:10:47,255
And we just solved
the vertical ray

199
00:10:47,255 --> 00:10:49,340
shooting problem for static.

200
00:10:49,340 --> 00:10:52,430
If you're given a static map,
you run this algorithm once,

201
00:10:52,430 --> 00:10:55,049
assume there are no crossings.

202
00:10:55,049 --> 00:10:56,840
Then to answer vertical
ray shooting query,

203
00:10:56,840 --> 00:10:59,390
we just go back
in time to time t,

204
00:10:59,390 --> 00:11:02,540
do the successor query,
which takes log n time,

205
00:11:02,540 --> 00:11:04,680
and then we get
the answer to this.

206
00:11:04,680 --> 00:11:13,220
So we can do this in
log n per query static.

207
00:11:13,220 --> 00:11:15,881
This is all two dimensional.

208
00:11:15,881 --> 00:11:17,840
I should probably say that.

209
00:11:17,840 --> 00:11:19,040
You can generalize.

210
00:11:22,560 --> 00:11:23,100
Questions?

211
00:11:23,100 --> 00:11:24,360
This is actually really easy.

212
00:11:24,360 --> 00:11:26,640
This is the stuff we get
for free out of persistence

213
00:11:26,640 --> 00:11:28,920
and, in a moment,
retroactivity.

214
00:11:28,920 --> 00:11:31,230
I believe this is one of
the reasons persistence

215
00:11:31,230 --> 00:11:32,729
was invented in the first place.

216
00:11:32,729 --> 00:11:35,020
There were a bunch of early
persistent data structures.

217
00:11:35,020 --> 00:11:36,686
Then there was a
general Driscoll paper,

218
00:11:36,686 --> 00:11:38,286
which I talked about.

219
00:11:38,286 --> 00:11:42,300
But I think geometry was
one of the main motivations.

220
00:11:42,300 --> 00:11:44,400
Because it lets you
add a dimension.

221
00:11:46,960 --> 00:11:50,800
As long as that dimension
is time-like, then

222
00:11:50,800 --> 00:11:53,410
you get the dimension
sort of for free.

223
00:11:53,410 --> 00:11:54,640
So that's nice.

224
00:11:54,640 --> 00:11:55,140
OK.

225
00:11:55,140 --> 00:11:56,540
What about retroactivity?

226
00:11:59,730 --> 00:12:02,790
Again, we're going to use
partial retroactivity.

227
00:12:02,790 --> 00:12:05,230
And I can tell you for
certainty, because I was there,

228
00:12:05,230 --> 00:12:06,960
this is why retroactivity
was invented.

229
00:12:09,930 --> 00:12:13,260
So retroactivity,
so that would mean

230
00:12:13,260 --> 00:12:17,250
that we get to
dynamically add and delete

231
00:12:17,250 --> 00:12:19,560
insertions and deletions.

232
00:12:19,560 --> 00:12:22,350
So that's like adding
and deleting segments

233
00:12:22,350 --> 00:12:23,346
from the structure.

234
00:12:23,346 --> 00:12:24,720
Again, we have a
linear timeline.

235
00:12:24,720 --> 00:12:26,594
We always want to maintain
a linear timeline,

236
00:12:26,594 --> 00:12:28,260
because that is reality.

237
00:12:28,260 --> 00:12:30,240
That corresponds to
the x-coordinate.

238
00:12:30,240 --> 00:12:33,270
And now I want to be able to
add a segment like this, which

239
00:12:33,270 --> 00:12:35,580
means there was an insertion
at this time, a deletion

240
00:12:35,580 --> 00:12:37,380
at this time.

241
00:12:37,380 --> 00:12:38,680
Now, this doesn't quite work.

242
00:12:38,680 --> 00:12:41,350
Because this point
in cross-section,

243
00:12:41,350 --> 00:12:43,300
it's actually moving over time.

244
00:12:43,300 --> 00:12:47,460
Binary search trees, that's
OK, because things are simple.

245
00:12:47,460 --> 00:12:49,890
But at the moment,
all we know how to do

246
00:12:49,890 --> 00:12:54,960
is actually horizontal segments,
which are inserted and deleted

247
00:12:54,960 --> 00:12:58,420
at the same y-coordinate.

248
00:12:58,420 --> 00:13:05,316
So then we can do
insert at time t1,

249
00:13:05,316 --> 00:13:10,650
an insertion of
some y-coordinate,

250
00:13:10,650 --> 00:13:15,380
and then an insert
at some later time,

251
00:13:15,380 --> 00:13:17,374
the deletion of
that y-coordinate.

252
00:13:20,650 --> 00:13:25,320
So this is a partially
retroactive successor problem.

253
00:13:25,320 --> 00:13:32,502
This is equal to dynamic
vertical ray shooting.

254
00:13:36,930 --> 00:13:39,760
I guess this
insertion corresponds

255
00:13:39,760 --> 00:13:41,040
to the insertion of a thing.

256
00:13:41,040 --> 00:13:44,220
If you instead do
delete here, then you're

257
00:13:44,220 --> 00:13:47,040
deleting one of the segments.

258
00:13:47,040 --> 00:13:49,905
This is among
horizontal segments.

259
00:13:55,720 --> 00:13:58,680
So if your map is made of
horizontal and vertical

260
00:13:58,680 --> 00:13:59,890
segments--

261
00:13:59,890 --> 00:14:01,590
so it's an orthogonal map--

262
00:14:01,590 --> 00:14:04,410
then you can solve the dynamic
problem using a partially

263
00:14:04,410 --> 00:14:06,120
retroactive successor.

264
00:14:06,120 --> 00:14:09,260
Again, we want to do
successor just like before,

265
00:14:09,260 --> 00:14:10,920
querying the past.

266
00:14:10,920 --> 00:14:12,935
But now, our updates
are different.

267
00:14:12,935 --> 00:14:14,310
Now, we have
retroactive updates.

268
00:14:14,310 --> 00:14:17,410
That lets us dynamically
change the past,

269
00:14:17,410 --> 00:14:19,920
which is like inserting
and deleting edges

270
00:14:19,920 --> 00:14:21,072
through that algorithm.

271
00:14:21,072 --> 00:14:22,530
But at the moment,
we only know how

272
00:14:22,530 --> 00:14:24,520
to do this for
horizontal segments.

273
00:14:24,520 --> 00:14:26,550
So this gives us,
if you remember,

274
00:14:26,550 --> 00:14:30,260
the retroactive
successor result.

275
00:14:30,260 --> 00:14:32,380
We haven't seen
that, how it works.

276
00:14:32,380 --> 00:14:33,360
It's complicated.

277
00:14:33,360 --> 00:14:38,970
But it achieves log n insert,
delete, and successor retroactively.

278
00:14:38,970 --> 00:14:41,670
And so we get a log n,
which is an optimal solution

279
00:14:41,670 --> 00:14:43,800
for dynamic vertical
ray shooting

280
00:14:43,800 --> 00:14:47,010
among horizontal segments.

281
00:14:47,010 --> 00:14:49,620
There are a bunch
of open problems.

282
00:14:49,620 --> 00:14:51,960
What about general maps?

283
00:14:56,220 --> 00:15:10,210
So for a dynamic vertical
ray shooting in general maps,

284
00:15:10,210 --> 00:15:13,860
if you want log n
query, the best results

285
00:15:13,860 --> 00:15:15,960
are log to the 1 plus
epsilon insert, log

286
00:15:15,960 --> 00:15:17,890
to the 2 plus epsilon delete.

287
00:15:17,890 --> 00:15:19,140
There's some other trade-offs.

288
00:15:19,140 --> 00:15:22,000
You can get log times log
log n query and reduce.

289
00:15:22,000 --> 00:15:25,110
Still, we don't know
how to delete faster

290
00:15:25,110 --> 00:15:28,260
than log squared in any
of the general solutions.

291
00:15:28,260 --> 00:15:30,840
So you can do log
squared for everything.

292
00:15:30,840 --> 00:15:34,187
But the hope would be you could
do log for everything even when

293
00:15:34,187 --> 00:15:35,520
the segments are not horizontal.

294
00:15:35,520 --> 00:15:41,980
But here, retroactivity
doesn't seem to buy you things.

295
00:15:41,980 --> 00:15:43,090
It'd be nice if you could.

296
00:15:43,090 --> 00:15:47,170
Another fun open
problem is, what

297
00:15:47,170 --> 00:15:58,010
about non-vertical rays,
general rays, non-vertical rays?

298
00:16:01,240 --> 00:16:05,150
So I give you a point
and I give you a vector,

299
00:16:05,150 --> 00:16:07,126
I want to know what do
I hit in that direction.

300
00:16:07,126 --> 00:16:08,000
This is a lot harder.

301
00:16:08,000 --> 00:16:09,790
You can't use any
of these tricks.

302
00:16:09,790 --> 00:16:13,120
And in fact, it's believed you
cannot get polylog performance

303
00:16:13,120 --> 00:16:15,410
unless you have a ton of space.

304
00:16:15,410 --> 00:16:18,482
So the best known result is--

305
00:16:18,482 --> 00:16:19,690
I'll just throw this up here.

306
00:16:19,690 --> 00:16:27,730
You can get n over square
root s polylog n query.

307
00:16:27,730 --> 00:16:29,550
If you use, basically, s space.

308
00:16:36,320 --> 00:16:39,080
So you need quite
a bit of space.

309
00:16:39,080 --> 00:16:42,260
Because if you use n to
the 1 plus epsilon space,

310
00:16:42,260 --> 00:16:45,090
you can get roughly
root n query time.

311
00:16:45,090 --> 00:16:50,200
If you use n to the
5 space, then you

312
00:16:50,200 --> 00:16:53,655
get somewhat better query
time, but still not great.

313
00:16:53,655 --> 00:16:55,960
You can maybe get down
to n to the epsilon

314
00:16:55,960 --> 00:17:00,520
if you have very large
polynomial space.

315
00:17:00,520 --> 00:17:02,890
But this is conjectured
to be roughly optimal,

316
00:17:02,890 --> 00:17:04,730
I assume, other than
the polylog factors.

317
00:17:04,730 --> 00:17:06,887
The belief is you cannot
beat this for general ray.

318
00:17:06,887 --> 00:17:08,470
This is kind of
annoying, because this

319
00:17:08,470 --> 00:17:09,650
is a problem we care about.

320
00:17:09,650 --> 00:17:12,565
Especially in 3D,
this is ray tracing.

321
00:17:12,565 --> 00:17:13,992
You shoot a ray,
what does it hit?

322
00:17:13,992 --> 00:17:14,575
You bounce it.

323
00:17:14,575 --> 00:17:15,700
You shoot another ray.

324
00:17:15,700 --> 00:17:18,160
I always want to know
what objects am I hitting.

325
00:17:18,160 --> 00:17:21,041
And for special cases,
you can do better.

326
00:17:21,041 --> 00:17:22,540
But in general, it
seems quite hard.

327
00:17:22,540 --> 00:17:24,748
This is even in two dimensions.

328
00:17:24,748 --> 00:17:30,500
But there are a bunch of
papers on 3D and so on.

329
00:17:30,500 --> 00:17:33,260
I just wanted to give you those
connections to persistence

330
00:17:33,260 --> 00:17:34,640
and retroactivity.

331
00:17:34,640 --> 00:17:37,070
And that's point location.

332
00:17:37,070 --> 00:17:42,740
And now, I want to go on to
orthogonal range searching.

333
00:17:42,740 --> 00:17:46,630
We can do some new data
structures, new to us.

334
00:17:51,480 --> 00:17:53,490
So first, what is the problem?

335
00:18:10,930 --> 00:18:14,470
So it's sort of the reverse
kind of problem here.

336
00:18:14,470 --> 00:18:21,710
You're given a bunch of points.
Before, the query was a point.

337
00:18:21,710 --> 00:18:25,060
And the query, in
this case, is going

338
00:18:25,060 --> 00:18:29,560
to be, in two dimensions,
a rectangle, a window

339
00:18:29,560 --> 00:18:30,290
if you will.

340
00:18:30,290 --> 00:18:35,320
And you want to know what
points are in the rectangle.

341
00:18:35,320 --> 00:18:49,950
So given n points in d
dimensions, query in general

342
00:18:49,950 --> 00:18:52,410
is going to be a box.

343
00:18:52,410 --> 00:18:55,830
So in 2D, it's an interval
cross an interval.

344
00:18:55,830 --> 00:18:59,090
In 3D, it's three intervals
cross-product together.

345
00:19:01,801 --> 00:19:02,300
OK.

346
00:19:02,300 --> 00:19:05,714
So in the static version, you
get to preprocess the points.

347
00:19:05,714 --> 00:19:07,130
In the dynamic
version, the points

348
00:19:07,130 --> 00:19:09,380
are being added and deleted.

349
00:19:09,380 --> 00:19:11,600
And in all cases, we
have dynamic queries,

350
00:19:11,600 --> 00:19:13,500
which are what are
the points in the box.

351
00:19:13,500 --> 00:19:15,250
Now, there are different
versions of this query.

352
00:19:15,250 --> 00:19:17,208
There is an existence
query, which is are there

353
00:19:17,208 --> 00:19:18,600
any points in the box?

354
00:19:18,600 --> 00:19:19,970
That's sort of the easiest.

355
00:19:19,970 --> 00:19:22,800
Next level up is, how many
points are in the box?

356
00:19:22,800 --> 00:19:24,770
Which you can use
to solve existence.

357
00:19:24,770 --> 00:19:27,740
Next level up is give me
all the points in the box,

358
00:19:27,740 --> 00:19:32,100
or give me 10 points in the
box, give me a point in the box.

359
00:19:32,100 --> 00:19:34,100
All of these problems are
more or less the same.

360
00:19:34,100 --> 00:19:35,510
They do differ in some cases.

361
00:19:35,510 --> 00:19:36,968
But the things
we'll see today, you

362
00:19:36,968 --> 00:19:38,850
can solve them all
about as efficiently.

363
00:19:38,850 --> 00:19:41,070
But, of course, if you want to
list all the points in the box,

364
00:19:41,070 --> 00:19:42,028
it could be everything.

365
00:19:42,028 --> 00:19:44,190
And so that could
take linear time.

366
00:19:44,190 --> 00:19:47,000
So in general, our goal is to
get a running time something

367
00:19:47,000 --> 00:19:51,740
like log n plus k, where k
is the size of the output.

368
00:19:54,630 --> 00:19:56,835
So if you're asking how
many points are in there,

369
00:19:56,835 --> 00:19:58,640
the size of the output
is a single number.

370
00:19:58,640 --> 00:20:01,280
So k is 1, you should
get log n time.

371
00:20:01,280 --> 00:20:04,520
If you want to list 100
points in there, k is 100.

372
00:20:04,520 --> 00:20:07,280
And so you have to
pay that to list them.

373
00:20:07,280 --> 00:20:09,155
If you want to know all
of them, well, then k

374
00:20:09,155 --> 00:20:11,270
is the number of points
that are in there.

375
00:20:11,270 --> 00:20:17,990
And we'll be able to achieve
these kinds of bounds

376
00:20:17,990 --> 00:20:20,750
pretty much all the time,
definitely in two dimensions.

377
00:20:20,750 --> 00:20:25,246
In d dimensions, it's
going to get harder.

378
00:20:25,246 --> 00:20:25,745
OK.

379
00:20:28,760 --> 00:20:31,670
So I want to start out
with one dimension just

380
00:20:31,670 --> 00:20:34,700
to make sure we're
on the same page.

381
00:20:34,700 --> 00:20:37,280
And in general, we're going to
start with a solution called

382
00:20:37,280 --> 00:20:42,230
range trees, which were
simultaneously invented

383
00:20:42,230 --> 00:20:45,800
by a lot of people in the
late '70s, Bentley, one

384
00:20:45,800 --> 00:20:47,720
of the main guys.

385
00:20:47,720 --> 00:20:52,220
And in general, we're going to
aim here for a log to the d n

386
00:20:52,220 --> 00:20:53,630
plus k query time.

387
00:20:57,370 --> 00:20:59,950
So I like this, but now we
have a dependence on dimension.

388
00:20:59,950 --> 00:21:01,330
And for 2D, this is not great.

389
00:21:01,330 --> 00:21:03,340
It's log squared.

390
00:21:03,340 --> 00:21:04,820
And we're going to do better.

391
00:21:04,820 --> 00:21:05,320
OK.

392
00:21:05,320 --> 00:21:08,200
But let's start with d equals 1.

393
00:21:08,200 --> 00:21:10,260
How do you do this?

394
00:21:10,260 --> 00:21:13,090
How do I achieve
log n plus k query?

395
00:21:17,360 --> 00:21:18,230
Sort the points.

396
00:21:18,230 --> 00:21:18,730
Yeah.

397
00:21:18,730 --> 00:21:21,630
I could sort the points,
then do binary search.

398
00:21:21,630 --> 00:21:26,270
So the query now is
just an interval.

399
00:21:26,270 --> 00:21:28,790
That's the one dimensional
version of a box.

400
00:21:28,790 --> 00:21:32,300
So if I search for a, search
for b in a sorted list,

401
00:21:32,300 --> 00:21:34,040
then all the points
in between I can

402
00:21:34,040 --> 00:21:35,729
count the different indices--

403
00:21:35,729 --> 00:21:37,520
or subtract the two
indices into the array.

404
00:21:37,520 --> 00:21:39,320
That will give me how
many points there are

405
00:21:39,320 --> 00:21:42,920
in the box, all these things.
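
As a rough sketch of the one-dimensional case just described, sorting once and binary searching for both ends of the interval might look like this in Python. The function names are hypothetical, and the query interval [a, b] is assumed to be inclusive.

```python
from bisect import bisect_left, bisect_right

def build(points):
    """Preprocess: sort the points once."""
    return sorted(points)

def count_in_range(xs, a, b):
    """Number of points p with a <= p <= b, in O(log n)."""
    # Subtract the two indices found by binary search.
    return max(0, bisect_right(xs, b) - bisect_left(xs, a))

def any_in_range(xs, a, b):
    """Existence query, answered through the counting query."""
    return count_in_range(xs, a, b) > 0

def report_in_range(xs, a, b):
    """All points in [a, b], in O(log n + k) for k reported points."""
    return xs[bisect_left(xs, a):bisect_right(xs, b)]
```

The counting and existence queries never touch the points between the two indices, which is why they cost only log n, while reporting pays the extra k.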

406
00:21:42,920 --> 00:21:45,270
Arrays aren't going to
generalize super nicely.

407
00:21:45,270 --> 00:21:48,660
Although, we'll come
back to arrays later.

408
00:21:48,660 --> 00:21:52,390
For now, I'd like to think
of a binary search tree,

409
00:21:52,390 --> 00:21:54,359
balanced binary search tree.

410
00:21:54,359 --> 00:21:56,150
And I'm going to make
it a little different

411
00:21:56,150 --> 00:21:58,820
from the usual kind
of binary search tree.

412
00:21:58,820 --> 00:22:00,875
I want the data to
be in the leaves.

413
00:22:04,190 --> 00:22:07,430
So I want the leaves
to be my points.

414
00:22:07,430 --> 00:22:11,150
And this will be convenient
for higher dimensions.

415
00:22:11,150 --> 00:22:12,950
It doesn't really matter
for one dimension,

416
00:22:12,950 --> 00:22:15,230
but it's kind of
nice to think about.

417
00:22:15,230 --> 00:22:17,480
So you've got a
binary search tree.

418
00:22:17,480 --> 00:22:21,320
And then here is the data sorted
by the only coordinate that

419
00:22:21,320 --> 00:22:24,000
exists, the x-coordinate.

420
00:22:24,000 --> 00:22:26,240
And so, of course,
I can search for a,

421
00:22:26,240 --> 00:22:29,870
here's a maybe, search for b.

422
00:22:29,870 --> 00:22:35,446
And the stuff in between
here, that is my result.

423
00:22:35,446 --> 00:22:40,540
And in a little more detail,
as you search for a and b,

424
00:22:40,540 --> 00:22:43,425
at some point,
they will diverge.

425
00:22:43,425 --> 00:22:44,925
One will go left,
one will go right.

426
00:22:50,730 --> 00:22:52,680
At some point, you reach a.

427
00:22:52,680 --> 00:22:54,704
Maybe a isn't actually
in the structure.

428
00:22:54,704 --> 00:22:57,120
You're searching for everything
between a and b inclusive,

429
00:22:57,120 --> 00:22:58,810
but a may not be there.

430
00:22:58,810 --> 00:23:00,990
So in general, we're going
to find the predecessor

431
00:23:00,990 --> 00:23:02,325
and successor of a.

432
00:23:02,325 --> 00:23:05,280
In this case, I'm interested
in the predecessor.

433
00:23:05,280 --> 00:23:11,010
And similarly over
here, eventually--

434
00:23:11,010 --> 00:23:13,080
this is all, of course,
logarithmic time--

435
00:23:13,080 --> 00:23:16,719
I find the successor of b.

436
00:23:16,719 --> 00:23:18,510
Those are the two things
I'm interested in.

437
00:23:18,510 --> 00:23:20,760
And now, all the
leaves in between here,

438
00:23:20,760 --> 00:23:22,007
that's the result. Question?

439
00:23:22,007 --> 00:23:24,215
AUDIENCE: So if you have
the data just on the leaves,

440
00:23:24,215 --> 00:23:26,270
what do you have in the
intermediate nodes?

441
00:23:26,270 --> 00:23:27,270
ERIK DEMAINE: Ah, right.

442
00:23:27,270 --> 00:23:29,144
So in the intermediate
nodes, I need to know,

443
00:23:29,144 --> 00:23:32,520
let's say, if every subtree
knows the min and max, then

444
00:23:32,520 --> 00:23:36,547
at a node, I can decide should
I go left, should I go right?

445
00:23:36,547 --> 00:23:38,880
I think every node can store
the max of the left subtree

446
00:23:38,880 --> 00:23:40,569
if you just want
one key per node.

447
00:23:40,569 --> 00:23:41,610
But, yeah, good question.

448
00:23:41,610 --> 00:23:43,330
Sorry, I forgot to mention that.

449
00:23:43,330 --> 00:23:47,270
You store a representative
sort of in the middle

450
00:23:47,270 --> 00:23:50,460
that lets you decide
whether to go left or right.

451
00:23:50,460 --> 00:23:51,730
So you can still do searches.

452
00:23:51,730 --> 00:23:53,070
We can find these two nodes.

453
00:23:53,070 --> 00:23:57,105
And now, the answer is
basically all of this stuff.

454
00:24:01,660 --> 00:24:05,860
I did not leave
myself enough space.

455
00:24:05,860 --> 00:24:07,210
That's the left child.

456
00:24:12,840 --> 00:24:13,570
OK.

457
00:24:13,570 --> 00:24:16,900
So wherever this
left branch went

458
00:24:16,900 --> 00:24:19,970
left, the right
branch is in the answer.

459
00:24:19,970 --> 00:24:21,970
Whenever this right
branch went right,

460
00:24:21,970 --> 00:24:23,760
the left branch is in the answer.

461
00:24:23,760 --> 00:24:26,290
But from here, there's no
subtree that we care about.

462
00:24:26,290 --> 00:24:29,080
Because this is all greater
than what we care about.

463
00:24:29,080 --> 00:24:29,580
OK.

464
00:24:29,580 --> 00:24:30,955
But the good news
is there's only

465
00:24:30,955 --> 00:24:33,100
log n of these subtrees,
maybe two log n.

466
00:24:33,100 --> 00:24:35,330
Because there's the left
side, the right side.

467
00:24:35,330 --> 00:24:35,830
OK.

468
00:24:35,830 --> 00:24:39,790
So the answer is
implicitly represented.

469
00:24:39,790 --> 00:24:42,010
We don't have to explicitly
touch all these items.

470
00:24:42,010 --> 00:24:44,720
We just know that they
live in the subtrees,

471
00:24:44,720 --> 00:24:49,370
in those order log n subtrees.

472
00:24:49,370 --> 00:24:52,780
So in particular, if every node
stores the size of its subtree,

473
00:24:52,780 --> 00:24:54,430
then we can add up
these log n numbers.

474
00:24:54,430 --> 00:24:56,130
And we get the
size of the answer.

475
00:24:56,130 --> 00:24:59,380
If we want the first k items,
we can visit the first k items

476
00:24:59,380 --> 00:25:03,490
here in order k time.

477
00:25:03,490 --> 00:25:07,150
So in log n time, we get a nice
representation of the answers,

478
00:25:07,150 --> 00:25:08,650
log n subtrees.

479
00:25:08,650 --> 00:25:11,860
Of course, we also had a nice
answer when we had an array.

480
00:25:11,860 --> 00:25:14,930
But this one will be
easier to generalize.

481
00:25:14,930 --> 00:25:16,690
And that's range trees.
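
[Editor's sketch, not part of the lecture.] The 1D query described above can be sketched roughly as follows, assuming integer keys, a non-empty sorted input, and a tree where each internal node stores the min, max, and leaf count of its subtree. The names `Node`, `build`, `canonical`, and `count` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    lo: int                      # smallest key stored in this subtree
    hi: int                      # largest key stored in this subtree
    size: int                    # number of leaves (points) in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(keys: List[int]) -> Node:
    """Build a balanced tree with the data at the leaves (keys sorted, non-empty)."""
    if len(keys) == 1:
        return Node(keys[0], keys[0], 1)
    mid = len(keys) // 2
    left, right = build(keys[:mid]), build(keys[mid:])
    return Node(left.lo, right.hi, left.size + right.size, left, right)


def canonical(node: Optional[Node], a: int, b: int, out: List[Node]) -> List[Node]:
    """Collect the maximal subtrees whose leaves all lie in [a, b]."""
    if node is None or node.hi < a or node.lo > b:
        return out               # subtree is entirely outside the range
    if a <= node.lo and node.hi <= b:
        out.append(node)         # whole subtree is in the answer; do not descend
        return out
    canonical(node.left, a, b, out)
    canonical(node.right, a, b, out)
    return out


def count(root: Node, a: int, b: int) -> int:
    """Size of the answer: add up the stored sizes of the matching subtrees."""
    return sum(n.size for n in canonical(root, a, b, []))
```

The answer is never listed explicitly here; `count` only adds up the stored sizes of the subtrees that hang off the two search paths.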

482
00:25:21,540 --> 00:25:23,429
So that was a 1D range tree.

483
00:25:23,429 --> 00:25:25,470
The only difference is we
put data at the leaves.

484
00:25:28,796 --> 00:25:34,980
2D range tree has a simple idea.

485
00:25:34,980 --> 00:25:37,800
We have the data
in these subtrees.

486
00:25:37,800 --> 00:25:39,172
These are the matches.

487
00:25:39,172 --> 00:25:40,630
Let's think we have
an x-coordinate

488
00:25:40,630 --> 00:25:41,421
and a y-coordinate.

489
00:25:41,421 --> 00:25:43,500
We have an x range
and a y range.

490
00:25:43,500 --> 00:25:44,940
Let's do this for x.

491
00:25:44,940 --> 00:25:48,340
Now, we have a representation
of all the matches in x.

492
00:25:48,340 --> 00:25:50,490
So we want this rectangle.

493
00:25:50,490 --> 00:25:54,360
But we can get this
entire slab in log n time,

494
00:25:54,360 --> 00:25:56,760
and we have log n
subtrees that we now

495
00:25:56,760 --> 00:25:58,080
have to filter in terms of y.

496
00:25:58,080 --> 00:26:00,610
There's all these points out
here that we don't care about.

497
00:26:00,610 --> 00:26:05,920
We want to get rid of those and
just focus in on these points.

498
00:26:05,920 --> 00:26:07,980
So we're going to do
the same thing on y,

499
00:26:07,980 --> 00:26:10,530
but we want to do
that for this subtree.

500
00:26:10,530 --> 00:26:12,270
And we want to do
it for this subtree,

501
00:26:12,270 --> 00:26:15,330
and for this subtree,
so simple idea.

502
00:26:15,330 --> 00:26:20,470
For each subtree, let's
call it an x subtree.

503
00:26:20,470 --> 00:26:22,810
So we have one tree which
represents all the x data.

504
00:26:22,810 --> 00:26:24,690
It looks just like this.

505
00:26:24,690 --> 00:26:27,930
And then for each
subtree of that x tree,

506
00:26:27,930 --> 00:26:39,750
we store, let's say,
a pointer to a y tree,

507
00:26:39,750 --> 00:26:43,230
which is also a 1D range tree.

508
00:26:43,230 --> 00:26:52,720
So this guy has a pointer to
a similarly sized triangle.

509
00:26:52,720 --> 00:26:54,370
Except, this one is on y.

510
00:26:54,370 --> 00:26:55,600
This one's sorted by x.

511
00:26:55,600 --> 00:26:58,620
This one's sorted
by y, same points.

512
00:26:58,620 --> 00:27:04,210
This subtree also has one,
same data as over here,

513
00:27:04,210 --> 00:27:07,000
but now sorted on
y instead of x.

514
00:27:07,000 --> 00:27:12,220
For example, there is a
smaller tree inside this one.

515
00:27:12,220 --> 00:27:15,885
That one also has a pointer
to a smaller y tree.

516
00:27:15,885 --> 00:27:17,260
Except, now, these
are disjoint,

517
00:27:17,260 --> 00:27:19,180
because these are completely--

518
00:27:19,180 --> 00:27:19,720
yeah.

519
00:27:19,720 --> 00:27:21,670
This is a subset of that one.

520
00:27:21,670 --> 00:27:24,400
But we're going to store a y
tree for this one and a y tree

521
00:27:24,400 --> 00:27:24,980
for this one.

522
00:27:24,980 --> 00:27:26,063
So we're blowing up space.

523
00:27:29,320 --> 00:27:40,690
Every element, every point
lives in log n y trees.

524
00:27:43,186 --> 00:27:44,810
Because if you look
at a point, there's

525
00:27:44,810 --> 00:27:47,090
the tiny y tree that contains
it, bigger, bigger, bigger,

526
00:27:47,090 --> 00:27:49,050
bigger until the entire
tree also contains it.

527
00:27:49,050 --> 00:27:51,730
Each of those has a
corresponding y tree.

528
00:27:51,730 --> 00:27:55,360
So the overall space
will be n log n.

529
00:27:58,550 --> 00:28:00,810
We're repeating points here.

530
00:28:00,810 --> 00:28:03,350
But the good news is now I can
do search really efficiently,

531
00:28:03,350 --> 00:28:05,330
well, log squared efficiently.

532
00:28:05,330 --> 00:28:08,360
I spend log time to find
these x trees that represent

533
00:28:08,360 --> 00:28:10,670
the slabs that I care about.

534
00:28:10,670 --> 00:28:13,970
So it's more like this picture.

535
00:28:13,970 --> 00:28:16,460
So there's a bunch of
disjoint slabs, which together

536
00:28:16,460 --> 00:28:19,040
contain my points in x.

537
00:28:19,040 --> 00:28:21,200
And now I want to filter
each of them by y.

538
00:28:21,200 --> 00:28:23,720
So for each of them, I
jump over to y space and do

539
00:28:23,720 --> 00:28:29,490
a range query in y space just
like what we were doing here.

540
00:28:29,490 --> 00:28:33,470
So search for a, search
for b, but in y-coordinate.

541
00:28:33,470 --> 00:28:36,860
And then I get log n subtrees
in here, log n subtrees in here,

542
00:28:36,860 --> 00:28:38,360
log n subtrees in here.

543
00:28:38,360 --> 00:28:49,066
So the query gives me
order log squared n y subtrees.

544
00:28:49,066 --> 00:28:51,690
It takes me log squared
n time to find them.

545
00:28:51,690 --> 00:28:54,050
If I have subtree sizes, I
compute the number of matches

546
00:28:54,050 --> 00:28:55,740
in log squared n time.

547
00:28:55,740 --> 00:28:58,460
If I want k items, I
can grab k items out

548
00:28:58,460 --> 00:29:00,730
of them in order k time.
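
[Editor's sketch, not part of the lecture.] A minimal version of the 2D structure, under some simplifying assumptions: points are `(x, y)` tuples, the input is non-empty, and a sorted Python list of y values stands in for each y tree (a binary search on it plays the role of the 1D query). `XNode` and `count2d` are hypothetical names. Sorting at every node makes this build slower than the bound quoted in the lecture; it is only meant to show the query.

```python
import bisect
from typing import List, Optional, Tuple

Point = Tuple[int, int]


class XNode:
    """Node of the x tree; every node also keeps its points' y values, sorted."""

    def __init__(self, pts: List[Point]):
        # pts must be sorted by x and non-empty
        self.xlo, self.xhi = pts[0][0], pts[-1][0]
        self.ys = sorted(p[1] for p in pts)      # stand-in for the y tree
        self.left: Optional["XNode"] = None
        self.right: Optional["XNode"] = None
        if len(pts) > 1:
            mid = len(pts) // 2
            self.left, self.right = XNode(pts[:mid]), XNode(pts[mid:])


def count2d(node: Optional[XNode], a1: int, b1: int, a2: int, b2: int) -> int:
    """Count points with a1 <= x <= b1 and a2 <= y <= b2."""
    if node is None or node.xhi < a1 or node.xlo > b1:
        return 0
    if a1 <= node.xlo and node.xhi <= b1:
        # Whole x subtree matches in x: filter by y with two binary searches.
        return bisect.bisect_right(node.ys, b2) - bisect.bisect_left(node.ys, a2)
    return (count2d(node.left, a1, b1, a2, b2)
            + count2d(node.right, a1, b1, a2, b2))
```

Each of the order log n matching x subtrees costs one pair of binary searches, which is where the log squared query time comes from.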

549
00:29:00,730 --> 00:29:01,980
OK.

550
00:29:01,980 --> 00:29:03,000
Pretty easy.

551
00:29:03,000 --> 00:29:06,300
Of course, D dimensions
is just the same trick.

552
00:29:06,300 --> 00:29:07,770
You have x tree.

553
00:29:07,770 --> 00:29:09,390
Every subtree links to a y tree.

554
00:29:09,390 --> 00:29:11,320
Every y subtree
links to a z tree.

555
00:29:11,320 --> 00:29:15,030
Every z subtree links
to a w tree, and so on.

556
00:29:15,030 --> 00:29:21,360
For D dimensions, you're going
to get log to the D query

557
00:29:21,360 --> 00:29:24,540
as I claimed before.

558
00:29:24,540 --> 00:29:25,680
How much space?

559
00:29:25,680 --> 00:29:28,050
Well, every dimension you
add adds another log factor

560
00:29:28,050 --> 00:29:29,220
of space.

561
00:29:29,220 --> 00:29:35,520
So it's going to be n log
to the d minus 1 space.

562
00:29:35,520 --> 00:29:38,340
And if you want to
do this statically,

563
00:29:38,340 --> 00:29:45,540
you can also build the
data structure in n log

564
00:29:45,540 --> 00:29:49,620
to the d minus 1 n time,
except for d equals 1 where

565
00:29:49,620 --> 00:29:52,020
you need n log n time to sort.

566
00:29:52,020 --> 00:29:53,940
But as long as d
is bigger than 1,

567
00:29:53,940 --> 00:29:56,900
this is the right bound
for higher dimensions.

568
00:29:56,900 --> 00:29:59,970
It takes a little bit of
effort to actually build

569
00:29:59,970 --> 00:30:02,650
the structure in that much
time, but it can be done.

570
00:30:06,010 --> 00:30:06,510
OK.

571
00:30:06,510 --> 00:30:08,093
That's the very
simple data structure.

572
00:30:08,093 --> 00:30:11,325
Any questions about that
before we make it cooler?

573
00:30:11,325 --> 00:30:13,200
You may have seen this
data structure before.

574
00:30:13,200 --> 00:30:14,700
It's kind of classic.

575
00:30:14,700 --> 00:30:16,890
But you can do
much better, well,

576
00:30:16,890 --> 00:30:18,506
at least a log factor better.

577
00:30:18,506 --> 00:30:19,340
AUDIENCE: Question.

578
00:30:19,340 --> 00:30:20,131
ERIK DEMAINE: Yeah.

579
00:30:20,131 --> 00:30:23,586
AUDIENCE: So when you're storing
one pointer for each subtree,

580
00:30:23,586 --> 00:30:26,960
you essentially have a
pointer for each root,

581
00:30:26,960 --> 00:30:27,930
like for each node?

582
00:30:27,930 --> 00:30:28,290
ERIK DEMAINE: Yeah.

583
00:30:28,290 --> 00:30:28,789
Right.

584
00:30:28,789 --> 00:30:29,520
Every node.

585
00:30:29,520 --> 00:30:32,790
So I know these are the nodes
that the stuff below them

586
00:30:32,790 --> 00:30:34,410
represents my answer in x.

587
00:30:34,410 --> 00:30:38,281
And so I teleport over to the
y universe from the x universe.

588
00:30:38,281 --> 00:30:40,364
AUDIENCE: So, basically,
it has all the same nodes

589
00:30:40,364 --> 00:30:41,749
of that subtree, but [INAUDIBLE]

590
00:30:41,749 --> 00:30:42,540
ERIK DEMAINE: Yeah.

591
00:30:42,540 --> 00:30:44,340
All the points that
are in here also

592
00:30:44,340 --> 00:30:46,460
live in here, except these
ones are sorted by x.

593
00:30:46,460 --> 00:30:48,090
These ones are sorted by y.

594
00:30:48,090 --> 00:30:53,210
If I kept following pointers,
I get to z and w and so on.

595
00:30:53,210 --> 00:30:53,960
Other questions?

596
00:30:53,960 --> 00:30:54,762
Yeah.

597
00:30:54,762 --> 00:30:56,720
AUDIENCE: So if we were
doing the dynamic case,

598
00:30:56,720 --> 00:30:59,020
how would we implement
rotations in the [INAUDIBLE]?

599
00:30:59,020 --> 00:30:59,728
ERIK DEMAINE: OK.

600
00:31:02,850 --> 00:31:05,171
Dynamic is annoying.

601
00:31:05,171 --> 00:31:05,670
Yeah.

602
00:31:05,670 --> 00:31:08,100
Rotations are annoying.

603
00:31:08,100 --> 00:31:10,710
I think we'll come back to that.

604
00:31:10,710 --> 00:31:11,804
We can solve that.

605
00:31:11,804 --> 00:31:13,470
I thought it was easy,
but you're right.

606
00:31:13,470 --> 00:31:15,630
Rotations are kind of annoying.

607
00:31:15,630 --> 00:31:18,840
And we can solve that using
this dynamization trick.

608
00:31:18,840 --> 00:31:23,802
So we don't have to worry
about it till we get there.

609
00:31:23,802 --> 00:31:26,010
It's going to get even harder
to make things dynamic.

610
00:31:26,010 --> 00:31:30,880
And so then we really need
to pull out the black box.

611
00:31:30,880 --> 00:31:32,120
Well, it's not a black box.

612
00:31:32,120 --> 00:31:33,540
We're going to see how it works.

613
00:31:33,540 --> 00:31:34,790
But it's a general
transformation

614
00:31:34,790 --> 00:31:35,873
that makes things dynamic.

615
00:31:40,465 --> 00:31:40,965
OK.

616
00:31:43,830 --> 00:31:46,980
Before we get to dynamic,
stick with static.

617
00:31:46,980 --> 00:31:49,095
And let's improve
things by a log factor.

618
00:31:54,530 --> 00:31:57,070
This is an idea called
layered range trees.

619
00:32:02,240 --> 00:32:07,520
It's also sometimes called
fractional cascading,

620
00:32:07,520 --> 00:32:10,190
which is the technique
we're going to get to later.

621
00:32:10,190 --> 00:32:12,740
I would say it involves one
half of fractional cascading.

622
00:32:12,740 --> 00:32:14,960
Fractional cascading
has two ideas.

623
00:32:14,960 --> 00:32:17,800
And the one that it's named
after is not this idea.

624
00:32:17,800 --> 00:32:23,420
So idea one is basically
to reuse searches.

625
00:32:23,420 --> 00:32:25,490
The idea is we're
searching in this subtree

626
00:32:25,490 --> 00:32:27,920
or, I guess, this subtree
with respect to y.

627
00:32:27,920 --> 00:32:31,430
We're also searching for
the same interval of y

628
00:32:31,430 --> 00:32:32,150
in this subtree.

629
00:32:32,150 --> 00:32:33,830
Completely different
elements, but

630
00:32:33,830 --> 00:32:36,660
if there was some way we
could reuse the searches for y

631
00:32:36,660 --> 00:32:40,160
in all of these log n subtrees,
we could save a log factor.

632
00:32:40,160 --> 00:32:41,550
And it turns out we can.

633
00:32:41,550 --> 00:32:43,520
And this is one idea in
fractional cascading,

634
00:32:43,520 --> 00:32:45,200
but there will be
another one later.

635
00:32:48,110 --> 00:32:50,470
OK.

636
00:32:50,470 --> 00:32:53,570
So, fun stuff.

637
00:32:53,570 --> 00:32:56,170
This is where I want
to change my notes.

638
00:32:56,170 --> 00:33:03,190
So we're searching in x with
a regular 1D range tree.

639
00:33:03,190 --> 00:33:07,490
I also want to have a
regular 1D range tree--

640
00:33:07,490 --> 00:33:08,650
range tree?

641
00:33:08,650 --> 00:33:09,189
Sure.

642
00:33:09,189 --> 00:33:10,730
Actually, it doesn't
matter too much.

643
00:33:10,730 --> 00:33:13,570
I want to have an array
of all the items sorted

644
00:33:13,570 --> 00:33:16,610
by y-coordinate.

645
00:33:16,610 --> 00:33:18,740
And we're going to
simplify things here.

646
00:33:18,740 --> 00:33:20,420
Instead of pointing
to a tree, I'm

647
00:33:20,420 --> 00:33:22,700
going to point to an
array sorted by y.

648
00:33:22,700 --> 00:33:24,700
This is totally static.

649
00:33:24,700 --> 00:33:28,640
And this is where
dynamic gets harder,

650
00:33:28,640 --> 00:33:32,000
not that we know how to
do it over there yet.

651
00:33:32,000 --> 00:33:36,500
So for each x
subtree, we're going

652
00:33:36,500 --> 00:33:39,740
to have a pointer to the
same elements sorted by y.

653
00:33:45,360 --> 00:33:47,430
So all the leaves
that are down here

654
00:33:47,430 --> 00:33:53,352
are, basically, also there, but
by y coordinate instead of x.

655
00:33:53,352 --> 00:33:55,060
Obviously, we can
still do the same thing

656
00:33:55,060 --> 00:33:57,510
we could do before, spend
log n time to search

657
00:33:57,510 --> 00:34:01,710
in each of these log n arrays
corresponding to these log n

658
00:34:01,710 --> 00:34:02,920
subtrees.

659
00:34:02,920 --> 00:34:05,710
And in log squared n,
we'll have our answers.

660
00:34:05,710 --> 00:34:07,950
But we can do better now.

661
00:34:07,950 --> 00:34:11,550
I only want to do one
binary search in y.

662
00:34:11,550 --> 00:34:13,460
And that will be at the root.

663
00:34:13,460 --> 00:34:15,210
So the root, there's
an array representing

664
00:34:15,210 --> 00:34:17,159
everything sorted by y.

665
00:34:17,159 --> 00:34:19,724
I search for the
lower y-coordinate.

666
00:34:19,724 --> 00:34:23,671
I search for the upper
y-coordinate, some things.

667
00:34:23,671 --> 00:34:25,170
It's hard to draw
this, because it's

668
00:34:25,170 --> 00:34:26,919
in the dimensional
orthogonal to this one.

669
00:34:26,919 --> 00:34:31,239
I guess I should really
draw the arrays like this.

670
00:34:31,239 --> 00:34:32,760
So this guy has an array.

671
00:34:32,760 --> 00:34:34,770
We find the upper
and lower bounds

672
00:34:34,770 --> 00:34:37,500
for the y-coordinate
in the global space.

673
00:34:37,500 --> 00:34:39,791
This takes log n time
to do two searches.

674
00:34:39,791 --> 00:34:40,290
Question.

675
00:34:40,290 --> 00:34:41,757
AUDIENCE: Those are the
upper and lower bounds

676
00:34:41,757 --> 00:34:44,049
from the predecessor [INAUDIBLE]
successor [INAUDIBLE]?

677
00:34:44,049 --> 00:34:45,131
ERIK DEMAINE: Yeah, right.

678
00:34:45,131 --> 00:34:47,909
So we're doing a predecessor
and successor search, let's say,

679
00:34:47,909 --> 00:34:48,870
in this array.

680
00:34:48,870 --> 00:34:52,170
Binary search we find--

681
00:34:52,170 --> 00:34:54,960
I didn't give them
names, but in the notes

682
00:34:54,960 --> 00:35:00,540
they're a1 through b1 in x.

683
00:35:00,540 --> 00:35:04,560
And they're a2 through b2 in y.

684
00:35:04,560 --> 00:35:06,740
So that's my query,
this rectangle.

685
00:35:06,740 --> 00:35:11,380
I'm doing the search for a2
and for b2 in the top array.

686
00:35:11,380 --> 00:35:14,460
Now, what I'd like to do
is keep that information

687
00:35:14,460 --> 00:35:15,750
as I walk down the tree.

688
00:35:18,760 --> 00:35:21,390
So that in the end, when
I get to these nodes,

689
00:35:21,390 --> 00:35:26,520
I know where I am in
those arrays in y.

690
00:35:26,520 --> 00:35:28,440
So let's think of that
just step by step.

691
00:35:47,460 --> 00:35:53,900
So imagine in the x
tree, I'm at some node.

692
00:35:53,900 --> 00:35:57,080
And then I follow, let's
say, a right pointer

693
00:35:57,080 --> 00:35:59,580
to the right child.

694
00:35:59,580 --> 00:36:00,080
OK.

695
00:36:00,080 --> 00:36:02,240
Now, in y space--

696
00:36:02,240 --> 00:36:06,500
maybe I should switch
to red for y space.

697
00:36:06,500 --> 00:36:13,010
This guy has a really big array
representing all of the nodes

698
00:36:13,010 --> 00:36:16,000
down here, but sorted
by y-coordinate.

699
00:36:16,000 --> 00:36:18,410
This guy has a
corresponding array

700
00:36:18,410 --> 00:36:20,780
with some subset of the nodes.

701
00:36:20,780 --> 00:36:22,160
Which subset?

702
00:36:22,160 --> 00:36:24,660
The ones that are to the
right of this x-coordinate.

703
00:36:24,660 --> 00:36:25,790
So there's no relation.

704
00:36:25,790 --> 00:36:28,587
I mean, some of the
guys that are here--

705
00:36:28,587 --> 00:36:29,420
let me circle them--

706
00:36:32,180 --> 00:36:34,400
some of these guys
exist over here.

707
00:36:34,400 --> 00:36:37,740
They'll be in the
same relative order.

708
00:36:37,740 --> 00:36:41,220
So here's those four
guys, then one, and two.

709
00:36:41,220 --> 00:36:43,899
So some of these guys will
be preserved over here.

710
00:36:43,899 --> 00:36:46,190
Some of them won't, because
their x-coordinate is smaller.

711
00:36:46,190 --> 00:36:47,780
It's an arbitrary subset.

712
00:36:47,780 --> 00:36:50,660
These guys will also live here.

713
00:36:50,660 --> 00:36:51,410
OK.

714
00:36:51,410 --> 00:36:56,180
The idea is store pointers from
every element over here to,

715
00:36:56,180 --> 00:36:59,090
let's say, the
successor over here.

716
00:36:59,090 --> 00:37:02,680
So store these red arrows.

717
00:37:02,680 --> 00:37:06,800
Let's say, these guys
all point to this node.

718
00:37:06,800 --> 00:37:09,260
These guys point to that node.

719
00:37:09,260 --> 00:37:12,320
I guess these guys just
point to some adjacent node,

720
00:37:12,320 --> 00:37:15,550
either the predecessor
or the successor.

721
00:37:15,550 --> 00:37:20,292
So the result is if I know where
a2 and b2 live in this array,

722
00:37:20,292 --> 00:37:22,250
I can figure out where
they live in this array.

723
00:37:22,250 --> 00:37:23,920
I just follow the pointer.

724
00:37:23,920 --> 00:37:25,776
Easy.

725
00:37:25,776 --> 00:37:28,240
Done.

726
00:37:28,240 --> 00:37:28,750
OK.

727
00:37:28,750 --> 00:37:30,350
Let's think about
what this means.

728
00:37:30,350 --> 00:37:39,580
So I'm going to store
pointers from the y

729
00:37:39,580 --> 00:37:45,110
array of some x node.

730
00:37:49,560 --> 00:37:56,340
Let's call that
node v in the x tree

731
00:37:56,340 --> 00:38:04,940
to the corresponding places,
corresponding points,

732
00:38:04,940 --> 00:38:15,410
let's say, in the y
arrays of left child of v

733
00:38:15,410 --> 00:38:16,760
and the right child of v.

734
00:38:16,760 --> 00:38:19,822
So, actually,
every array item is

735
00:38:19,822 --> 00:38:21,530
going to have two
pointers, one if you're

736
00:38:21,530 --> 00:38:23,290
going right in the x
tree, one if you're

737
00:38:23,290 --> 00:38:26,310
going left in the x tree.

738
00:38:26,310 --> 00:38:28,700
But we can afford a constant
number of pointers per node.

739
00:38:28,700 --> 00:38:32,059
This only increases space
by a constant factor.

740
00:38:32,059 --> 00:38:34,100
And now, it tells me
exactly what I need to know.

741
00:38:34,100 --> 00:38:35,510
I start at the root.

742
00:38:35,510 --> 00:38:36,590
I do a binary search.

743
00:38:36,590 --> 00:38:37,820
That's the slow part.

744
00:38:37,820 --> 00:38:39,740
I spend log n time,
find those two slots.

745
00:38:39,740 --> 00:38:42,260
Every time I go down,
I follow the pointer.

746
00:38:42,260 --> 00:38:47,120
I know exactly where a2 and
b2 live in the next array.

747
00:38:47,120 --> 00:38:50,070
In constant time, as I walk
down, I can figure this out.

748
00:38:50,070 --> 00:38:52,430
I can remember the information
on both sides here.

749
00:38:52,430 --> 00:38:55,940
And every time I go to
one of these subtrees,

750
00:38:55,940 --> 00:38:58,970
I know exactly where I live--

751
00:38:58,970 --> 00:39:02,940
it's no longer a tree--
now, in that array.

752
00:39:02,940 --> 00:39:06,710
So I can identify the
regions in these arrays

753
00:39:06,710 --> 00:39:11,540
that correspond to these
matching subrectangles

754
00:39:11,540 --> 00:39:12,560
with no extra time.

755
00:39:12,560 --> 00:39:14,606
So I save that last log factor.
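
[Editor's sketch, not part of the lecture.] One way to realize the cross-links, assuming non-empty input and `(x, y)` integer points. Instead of explicit pointers, each node keeps two integer arrays, `to_left` and `to_right`, where `to_left[i]` is the position in the left child's y array that corresponds to position `i` in this node's y array (the number of earlier entries that went left). That is the same information as a successor pointer. All names here are hypothetical.

```python
import bisect
from typing import List, Tuple


class LNode:
    def __init__(self, xs: List[int], by_y: List[Tuple[int, int]], s: int, e: int):
        # xs: all x-coordinates in sorted order; this node covers ranks s..e-1.
        # by_y: (y, rank) pairs for exactly those points, sorted by y.
        self.xlo, self.xhi = xs[s], xs[e - 1]
        self.ys = [y for y, _ in by_y]
        self.left = self.right = None
        if e - s > 1:
            mid = (s + e) // 2
            ly, ry = [], []
            self.to_left, self.to_right = [0], [0]
            for item in by_y:                 # stable split keeps y order
                (ly if item[1] < mid else ry).append(item)
                self.to_left.append(len(ly))  # cross-link into left child
                self.to_right.append(len(ry)) # cross-link into right child
            self.left = LNode(xs, ly, s, mid)
            self.right = LNode(xs, ry, mid, e)


def build_layered(points: List[Tuple[int, int]]) -> LNode:
    pts = sorted(points)                      # sort by x to assign ranks
    xs = [p[0] for p in pts]
    by_y = sorted((p[1], r) for r, p in enumerate(pts))
    return LNode(xs, by_y, 0, len(pts))


def _walk(node: LNode, a1: int, b1: int, lo: int, hi: int) -> int:
    if node.xhi < a1 or node.xlo > b1 or lo == hi:
        return 0
    if a1 <= node.xlo and node.xhi <= b1:
        return hi - lo                        # y range is already located
    return (_walk(node.left, a1, b1, node.to_left[lo], node.to_left[hi])
            + _walk(node.right, a1, b1, node.to_right[lo], node.to_right[hi]))


def count_layered(root: LNode, a1: int, b1: int, a2: int, b2: int) -> int:
    lo = bisect.bisect_left(root.ys, a2)      # the only binary searches
    hi = bisect.bisect_right(root.ys, b2)
    return _walk(root, a1, b1, lo, hi)
```

The two binary searches happen once, at the root; every step down the x tree then costs constant time to carry the y positions along.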

756
00:39:14,606 --> 00:39:16,230
If you generalize
this to D dimensions,

757
00:39:16,230 --> 00:39:17,730
it only works in
the last dimension.

758
00:39:17,730 --> 00:39:19,730
You can use this trick
in the last dimension

759
00:39:19,730 --> 00:39:25,880
and improve from log to the d
query to log to the d minus 1.

760
00:39:25,880 --> 00:39:28,459
In the higher dimensions, we
just use regular range trees.

761
00:39:28,459 --> 00:39:30,500
And when we get down to
the two dimensional case,

762
00:39:30,500 --> 00:39:31,754
it's a recursion.

763
00:39:31,754 --> 00:39:33,920
Before we were stopping at
the one dimensional case.

764
00:39:33,920 --> 00:39:36,080
We use a regular
binary search tree.

765
00:39:36,080 --> 00:39:38,390
Now, we stop at the
two dimensional case,

766
00:39:38,390 --> 00:39:39,590
and we use this fancy thing.

767
00:39:43,370 --> 00:39:44,780
I call this cross-linking.

768
00:39:44,780 --> 00:39:46,970
A lot of people call it
fractional cascading.

769
00:39:46,970 --> 00:39:49,520
Both are valid names.

770
00:39:49,520 --> 00:39:52,100
It's a cool idea,
but simple once you

771
00:39:52,100 --> 00:39:54,570
can see both dimensions
at once, which I know it's

772
00:39:54,570 --> 00:39:55,820
hard to see in two dimensions.

773
00:39:55,820 --> 00:39:59,060
But it can be done.

774
00:39:59,060 --> 00:40:00,630
All right.

775
00:40:00,630 --> 00:40:01,250
Questions?

776
00:40:04,270 --> 00:40:06,700
I guess the obvious
question is dynamic.

777
00:40:06,700 --> 00:40:10,040
Now, we're going
to go to dynamic.

778
00:40:10,040 --> 00:40:11,860
This is a very static
thing to be doing.

779
00:40:11,860 --> 00:40:14,132
How in the world
would we maintain this

780
00:40:14,132 --> 00:40:15,340
if the point set is changing?

781
00:40:15,340 --> 00:40:17,590
All these pointers are
going to move around.

782
00:40:17,590 --> 00:40:20,350
Life seems so hard.

783
00:40:20,350 --> 00:40:22,330
But it's not.

784
00:40:22,330 --> 00:40:25,060
In fact, updates are a lot
easier than you might think.

785
00:40:45,850 --> 00:40:49,540
Some of you may believe
this in your heart.

786
00:40:49,540 --> 00:40:51,190
Some of you may not.

787
00:40:51,190 --> 00:40:55,750
But if you've ever seen an
amortization argument that

788
00:40:55,750 --> 00:40:57,684
says, basically, when
you modify a tree,

789
00:40:57,684 --> 00:40:59,350
only a constant number
of things happen.

790
00:40:59,350 --> 00:41:01,620
And they usually
happen near the leaves.

791
00:41:01,620 --> 00:41:03,900
I'm thinking of a
binary search tree.

792
00:41:03,900 --> 00:41:05,700
The easiest way to see
this is in a B-tree

793
00:41:05,700 --> 00:41:07,870
if you know B-trees.

794
00:41:07,870 --> 00:41:09,567
Usually, if you do
insertion, you're

795
00:41:09,567 --> 00:41:11,650
going to do maybe one or
two splits at the bottom,

796
00:41:11,650 --> 00:41:12,579
and that's it.

797
00:41:12,579 --> 00:41:14,620
A constant fraction of the
time, that's all there is.

798
00:41:14,620 --> 00:41:16,900
So it should only take
constant time to do an update.

799
00:41:16,900 --> 00:41:21,935
This structure is easy
to update at the leaves.

800
00:41:21,935 --> 00:41:24,310
If you look at one of these
structures, a constant number

801
00:41:24,310 --> 00:41:26,500
of items, there's a
constant size array.

802
00:41:26,500 --> 00:41:29,840
You could update everything
in constant time.

803
00:41:29,840 --> 00:41:32,745
If we're only updating near
the leaves, then life is good.

804
00:41:32,745 --> 00:41:34,120
Occasionally,
though, we're going

805
00:41:34,120 --> 00:41:36,609
to have to update
these giant structures.

806
00:41:36,609 --> 00:41:38,650
And then we're going to
have to spend giant time.

807
00:41:38,650 --> 00:41:41,320
That's OK.

808
00:41:41,320 --> 00:41:43,720
The only thing we need
out of this data structure

809
00:41:43,720 --> 00:41:46,750
is that it takes the same amount
of space and pre-processing

810
00:41:46,750 --> 00:41:52,950
time, n log to the d minus 1
space, and time to build

811
00:41:52,950 --> 00:41:56,800
the static data structure.

812
00:41:56,800 --> 00:42:01,540
If we have this, it turns out
we can make it dynamic for free.

813
00:42:01,540 --> 00:42:03,900
This is the magic of
weight balance trees.

814
00:42:15,610 --> 00:42:20,140
In general, there are many
kinds of weight balance trees.

815
00:42:20,140 --> 00:42:22,570
We're going to look
at one called BB alpha

816
00:42:22,570 --> 00:42:28,450
trees, which are the oldest
and sort of the simplest.

817
00:42:28,450 --> 00:42:29,210
Well, you'll see.

818
00:42:29,210 --> 00:42:31,240
It's pretty easy to do.

819
00:42:31,240 --> 00:42:33,280
You've already seen
height balance trees.

820
00:42:33,280 --> 00:42:36,190
AVL trees, for example, you keep
the left and the right subtree.

821
00:42:36,190 --> 00:42:38,481
You want their height to be
within an additive constant

822
00:42:38,481 --> 00:42:40,630
of each other, 1.

823
00:42:40,630 --> 00:42:43,780
Red black trees are
multiplicative factor 2.

824
00:42:43,780 --> 00:42:45,250
Left and right
subtree, the heights

825
00:42:45,250 --> 00:42:46,990
will be roughly the same.

826
00:42:46,990 --> 00:42:49,450
Weight balance trees,
weight is the number

827
00:42:49,450 --> 00:42:51,005
of nodes in a subtree.

828
00:42:51,005 --> 00:42:52,630
Weight balance trees,
they want to keep

829
00:42:52,630 --> 00:42:55,330
the size of the left subtree and
the size of the right subtree

830
00:42:55,330 --> 00:42:57,560
to be roughly the same.

831
00:42:57,560 --> 00:43:02,710
So here's the definition
of BB alpha trees.

832
00:43:02,710 --> 00:43:11,900
For each node v, size
of the left subtree of v

833
00:43:11,900 --> 00:43:16,300
is at least alpha
times the size of v.

834
00:43:16,300 --> 00:43:21,540
And size of the
right subtree of v

835
00:43:21,540 --> 00:43:27,860
is at least alpha times
the size of v. Now, size,

836
00:43:27,860 --> 00:43:28,810
I didn't define size.

837
00:43:28,810 --> 00:43:30,640
It could be the total number
of nodes in the subtree.

838
00:43:30,640 --> 00:43:32,664
It could be the number
of leaves in the subtree.

839
00:43:32,664 --> 00:43:33,580
Doesn't really matter.

840
00:43:36,160 --> 00:43:38,140
What else?

841
00:43:38,140 --> 00:43:39,080
What's alpha?

842
00:43:39,080 --> 00:43:40,900
If alpha is a half,
you're in trouble.

843
00:43:40,900 --> 00:43:43,630
Because then it has to
be perfectly balanced.

844
00:43:43,630 --> 00:43:46,570
But just make alpha small,
like 1/10 or something.

845
00:43:46,570 --> 00:43:49,340
Any constant less
than a half will do.
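
[Editor's sketch, not part of the lecture.] The balance condition can be checked in one pass. This sketch assumes size means the number of leaves and that every internal node has exactly two children, as in the leaf-based trees used earlier. `Node` and `check_bb_alpha` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def check_bb_alpha(v: Node, alpha: float) -> Tuple[bool, int]:
    """Return (every node under v is weight balanced, number of leaves under v)."""
    if v.left is None and v.right is None:
        return True, 1                        # a leaf is trivially balanced
    ok_l, n_l = check_bb_alpha(v.left, alpha)
    ok_r, n_r = check_bb_alpha(v.right, alpha)
    n = n_l + n_r
    # Both children must hold at least an alpha fraction of v's leaves.
    ok = ok_l and ok_r and n_l >= alpha * n and n_r >= alpha * n
    return ok, n
```

In a dynamic structure the sizes would be stored at the nodes rather than recomputed, so the same test takes constant time per node on the update path.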

846
00:43:52,500 --> 00:43:53,110
Right.

847
00:43:53,110 --> 00:43:54,568
The nice thing
about weight balance

848
00:43:54,568 --> 00:43:55,990
is they imply height balance.

849
00:43:55,990 --> 00:43:59,290
If you have this property
that neither your left

850
00:43:59,290 --> 00:44:04,300
nor your right subtree are too
small, then as you go down,

851
00:44:04,300 --> 00:44:06,940
every time you take a
left or a right child,

852
00:44:06,940 --> 00:44:10,840
you throw away an alpha
fraction of your nodes.

853
00:44:10,840 --> 00:44:12,480
So initially, you
have all the nodes.

854
00:44:12,480 --> 00:44:14,604
Every time you go down,
you lose an alpha fraction.

855
00:44:14,604 --> 00:44:16,240
How many times can that happen?

856
00:44:16,240 --> 00:44:21,280
Log base 1 minus alpha, basically,
so log base 1 over 1 minus alpha.

857
00:44:21,280 --> 00:44:29,350
The height is at most log base
1 over 1 minus alpha of n.

858
00:44:29,350 --> 00:44:32,350
So this is really a stronger
property than height balance.

859
00:44:32,350 --> 00:44:34,871
It implies that your
heights are good.

860
00:44:34,871 --> 00:44:37,120
So it implies the height of
the left and right subtree

861
00:44:37,120 --> 00:44:39,340
are not too far from each other.

862
00:44:39,340 --> 00:44:41,290
But it's a lot stronger.

863
00:44:41,290 --> 00:44:46,960
It lets you do updates
lickety fast, basically.

864
00:44:46,960 --> 00:44:48,130
So how do we do an update?

865
00:44:57,710 --> 00:45:01,250
The idea is, normally,
you insert a leaf,

866
00:45:01,250 --> 00:45:03,520
do a regular BST
insert or delete.

867
00:45:03,520 --> 00:45:06,350
You add a leaf at the
bottom or delete a leaf.

868
00:45:06,350 --> 00:45:11,240
And so you have to update like
that node and maybe its parent.

869
00:45:11,240 --> 00:45:13,650
As long as you have
weight balance,

870
00:45:13,650 --> 00:45:15,650
you're just making little
constant sized changes

871
00:45:15,650 --> 00:45:16,430
at the bottom.

872
00:45:16,430 --> 00:45:18,330
Everything's good.

873
00:45:18,330 --> 00:45:18,830
OK.

874
00:45:18,830 --> 00:45:21,413
The trouble is when one of these
constraints becomes violated.

875
00:45:21,413 --> 00:45:23,880
Then you want to do a
rotation or something.

876
00:45:23,880 --> 00:45:24,380
OK.

877
00:45:24,380 --> 00:45:34,650
So when a node is
not weight balanced,

878
00:45:34,650 --> 00:45:37,220
it's a pretty loose algorithm.

879
00:45:37,220 --> 00:45:39,380
But it's easy to find nodes.

880
00:45:39,380 --> 00:45:41,630
You just store all the
weights, all the subtree sizes,

881
00:45:41,630 --> 00:45:43,610
which we were doing already.

882
00:45:43,610 --> 00:45:47,150
You can detect when nodes are
no longer weight balanced.

883
00:45:47,150 --> 00:45:49,044
And then we just want
to weight balance it.

884
00:45:49,044 --> 00:45:50,210
How do we weight balance it?

885
00:45:50,210 --> 00:45:54,090
We rebuild the entire
subtree from scratch.

886
00:45:54,090 --> 00:45:56,387
This is sort of the only
thing we know how to do.

887
00:45:56,387 --> 00:45:57,720
We have a static data structure.

888
00:45:57,720 --> 00:46:00,230
This is a general
transformation, dynamization

889
00:46:00,230 --> 00:46:03,740
when you have augmentation.

890
00:46:03,740 --> 00:46:05,060
We have this data structure.

891
00:46:05,060 --> 00:46:06,560
It's got all these
augmented things.

892
00:46:06,560 --> 00:46:07,760
It's complicated.

893
00:46:07,760 --> 00:46:09,560
But at least it's sort
of downward looking.

894
00:46:09,560 --> 00:46:12,660
I mean, you only need to
store pointers from here down,

895
00:46:12,660 --> 00:46:13,840
not up.

896
00:46:13,840 --> 00:46:15,340
I mean, your parent
points into you.

897
00:46:15,340 --> 00:46:17,499
But you have a nice local thing.

898
00:46:17,499 --> 00:46:19,040
So if this guy's
not weight balanced,

899
00:46:19,040 --> 00:46:23,530
if this left subtree is way
heavier than the right subtree

900
00:46:23,530 --> 00:46:26,950
by this alpha factor,
one over alpha factor,

901
00:46:26,950 --> 00:46:30,020
then just redo
everything in here.

902
00:46:30,020 --> 00:46:31,580
Find the median.

903
00:46:31,580 --> 00:46:33,522
Make a perfect
binary search tree.

904
00:46:33,522 --> 00:46:35,480
Then the weights between
the left and the right

905
00:46:35,480 --> 00:46:36,740
will be perfectly balanced.

906
00:46:36,740 --> 00:46:40,880
We'll have achieved the one
half, one half split of weight.

907
00:46:40,880 --> 00:46:43,610
How long before it
gets unbalanced again?

908
00:46:43,610 --> 00:46:45,704
A long time.

909
00:46:45,704 --> 00:46:47,870
If I start with a one half,
one half split, and then

910
00:46:47,870 --> 00:46:51,860
I have to get to an alpha
1 minus alpha split,

911
00:46:51,860 --> 00:46:56,365
a lot of nodes had to move
from one side to the other.

912
00:46:56,365 --> 00:46:57,770
The alpha gets messy.

913
00:46:57,770 --> 00:47:01,760
So let me just say
when this happens,

914
00:47:01,760 --> 00:47:03,920
rebuild entire subtree.

915
00:47:08,940 --> 00:47:12,676
I guess it's like a 1/2
minus alpha had to move.

916
00:47:12,676 --> 00:47:14,550
1/2 minus alpha times
the size of the subtree

917
00:47:14,550 --> 00:47:17,970
had to be inserted or deleted,
had to happen, or maybe half

918
00:47:17,970 --> 00:47:19,877
of that, some constant fraction.

919
00:47:19,877 --> 00:47:20,710
I don't really care.

920
00:47:20,710 --> 00:47:22,650
Alpha's a constant.

921
00:47:22,650 --> 00:47:28,790
I'm going to charge
to the theta k

922
00:47:28,790 --> 00:47:38,560
updates that unbalance things.

923
00:47:47,150 --> 00:47:49,650
k here is the size
of the subtree.

924
00:47:54,260 --> 00:47:57,440
So when I see a node is
unbalanced, just fix it.

925
00:47:57,440 --> 00:47:59,270
Make it perfect.

926
00:47:59,270 --> 00:48:01,810
And if I started out perfect,
the subtree started out

927
00:48:01,810 --> 00:48:04,780
perfect, I know there were theta
k updates that I can charge to.

928
00:48:04,780 --> 00:48:09,200
The only catch is I'm actually
double charging quite a bit,

929
00:48:09,200 --> 00:48:09,700
actually.

930
00:48:09,700 --> 00:48:14,860
If you look at a tree,
if I do an insert here,

931
00:48:14,860 --> 00:48:17,650
it makes this subtree
potentially slightly

932
00:48:17,650 --> 00:48:18,190
unbalanced.

933
00:48:18,190 --> 00:48:19,430
It makes this subtree
slightly unbalanced.

934
00:48:19,430 --> 00:48:21,180
It makes this subtree
slightly unbalanced.

935
00:48:21,180 --> 00:48:24,580
There are log n subtrees
that contain that item.

936
00:48:24,580 --> 00:48:27,095
Each of them may
be getting worse.

937
00:48:27,095 --> 00:48:29,470
So if I say, well, yeah, there
are these theta k updates,

938
00:48:29,470 --> 00:48:31,636
but actually there are log
n different subtrees that

939
00:48:31,636 --> 00:48:33,590
will charge to the same update.

940
00:48:33,590 --> 00:48:36,490
So I lose a log n factor
in this amortization.

941
00:48:36,490 --> 00:48:38,110
But it's not so bad.

942
00:48:38,110 --> 00:48:40,420
I get log n amortized update.

943
00:48:46,680 --> 00:48:50,510
This is if a rebuild
costs linear time.

944
00:48:57,616 --> 00:48:58,490
This is pretty nifty.

945
00:48:58,490 --> 00:49:00,930
I don't have to do
rotations per se.

946
00:49:00,930 --> 00:49:03,980
I just take all the nodes in
the subtree, write them down.

947
00:49:03,980 --> 00:49:05,150
I do an in-order traversal.

948
00:49:05,150 --> 00:49:06,800
I have them sorted,
take the median,

949
00:49:06,800 --> 00:49:09,740
build a nice perfect binary
search tree on those items.

950
00:49:09,740 --> 00:49:12,570
I can easily do
that in linear time.

951
00:49:12,570 --> 00:49:14,930
And so this is like
the brain dead way

952
00:49:14,930 --> 00:49:19,760
to make this weight
balanced tree dynamic.

953
00:49:19,760 --> 00:49:21,680
The original BB alpha
trees use rotations.

954
00:49:21,680 --> 00:49:22,959
But you don't have to.

955
00:49:22,959 --> 00:49:25,250
You can do this very simple
thing and still get a log n

956
00:49:25,250 --> 00:49:27,650
amortized update.
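
A minimal Python sketch of the rebuild-instead-of-rotate idea described above. The names (`Node`, `insert`, `flatten`, `build`, `ALPHA`) are hypothetical, and the balance test uses the variant "no child holds more than a (1 - ALPHA) fraction of the subtree", which is an assumption chosen so that tiny subtrees pass the test. It covers insertion only and a plain BST with subtree sizes, not the augmented range tree.

```python
ALPHA = 0.25  # hypothetical choice; any constant below 1/2


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in this subtree


def size(node):
    return node.size if node else 0


def is_balanced(node):
    # Neither child may hold more than a (1 - ALPHA) fraction of the subtree.
    limit = (1 - ALPHA) * node.size
    return size(node.left) <= limit and size(node.right) <= limit


def flatten(node, out):
    # In-order traversal: appends the keys in sorted order.
    if node:
        flatten(node.left, out)
        out.append(node.key)
        flatten(node.right, out)


def build(keys, lo, hi):
    # Perfectly balanced BST on sorted keys[lo:hi], in linear time.
    if lo >= hi:
        return None
    mid = (lo + hi) // 2  # median becomes the root
    node = Node(keys[mid])
    node.left = build(keys, lo, mid)
    node.right = build(keys, mid + 1, hi)
    node.size = hi - lo
    return node


def insert(node, key):
    # Ordinary BST leaf insertion; returns the (possibly rebuilt) subtree root.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    node.size += 1
    if not is_balanced(node):
        keys = []
        flatten(node, keys)                 # write the nodes down in order
        node = build(keys, 0, len(keys))    # rebuild the subtree from scratch
    return node


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))
```

Inserting keys in sorted order, the worst case for a plain BST, still leaves a tree of logarithmic height, because every unbalanced subtree is rebuilt as the recursion unwinds.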

957
00:49:27,650 --> 00:49:30,260
And the good news is, if you
have augmentation as well--

958
00:49:30,260 --> 00:49:31,940
because with this
subtree, there's

959
00:49:31,940 --> 00:49:35,990
tons of extra stuff, all these
arrays and pointers and stuff,

960
00:49:35,990 --> 00:49:37,850
it's easy to build from scratch.

961
00:49:37,850 --> 00:49:39,440
But it's hard to
maintain dynamically.

962
00:49:39,440 --> 00:49:41,600
The point is, now,
we don't have to.

963
00:49:41,600 --> 00:49:43,250
If ever we need
to change a node,

964
00:49:43,250 --> 00:49:44,870
we just rebuild
the entire subtree.

965
00:49:44,870 --> 00:49:49,620
And we can afford it at the
loss of a logarithmic overhead.

966
00:49:49,620 --> 00:49:53,510
So we had n log to the d minus
1 n time to build the structure.

967
00:49:53,510 --> 00:49:55,010
So for a structure
of size k, it's

968
00:49:55,010 --> 00:50:00,154
going to be k times log
to the d minus 1 of k.

969
00:50:00,154 --> 00:50:01,820
We're going to lose
an extra log factor.

970
00:50:01,820 --> 00:50:04,220
So this d minus 1 is going
to turn into a d

971
00:50:04,220 --> 00:50:04,835
for updates.

972
00:50:24,590 --> 00:50:31,010
So that was the
generic structure.

973
00:50:31,010 --> 00:50:38,720
And now, if we apply this
to layered range trees,

974
00:50:38,720 --> 00:50:46,250
we get log to the d
n amortized update.

975
00:50:52,160 --> 00:50:56,165
Because we had k
times log to the d

976
00:50:56,165 --> 00:51:01,220
minus 1 of k pre-processing
to rebuild node.

977
00:51:04,460 --> 00:51:07,380
And just to recall,
we still have log

978
00:51:07,380 --> 00:51:11,900
to the d minus 1 of n query.

979
00:51:11,900 --> 00:51:14,780
So this was regular range trees.

980
00:51:14,780 --> 00:51:19,490
And we've made them dynamic,
the same time as range trees.

981
00:51:19,490 --> 00:51:22,250
And still, the query
is a log factor faster.

982
00:51:22,250 --> 00:51:26,060
So for 2D, we get log n query,
log squared n update, insertion

983
00:51:26,060 --> 00:51:28,700
and deletion of points.

984
00:51:28,700 --> 00:51:29,590
Questions about that?

985
00:51:33,180 --> 00:51:34,650
Cool.

986
00:51:34,650 --> 00:51:38,990
Well, that is range searching,
orthogonal range searching.

987
00:51:42,750 --> 00:51:44,980
Let's see.

988
00:51:44,980 --> 00:51:47,860
There are more
results, which I don't

989
00:51:47,860 --> 00:51:50,530
want to cover in detail here.

990
00:51:50,530 --> 00:51:53,164
But you should at
least know about them.

991
00:51:53,164 --> 00:51:55,330
And then we're going to
turn to fractional cascading

992
00:51:55,330 --> 00:51:56,638
a little more generally.

993
00:52:05,110 --> 00:52:10,855
So where is this result?

994
00:52:10,855 --> 00:52:11,480
Somewhere here.

995
00:52:21,160 --> 00:52:28,110
So for static orthogonal
range searching,

996
00:52:28,110 --> 00:52:30,090
range searching is a big area.

997
00:52:30,090 --> 00:52:31,797
We're looking at
the orthogonal case.

998
00:52:31,797 --> 00:52:33,630
There's other versions
where you're querying

999
00:52:33,630 --> 00:52:37,140
with a triangle or a simplex.

1000
00:52:37,140 --> 00:52:40,020
You can query with
two-sided box, which

1001
00:52:40,020 --> 00:52:41,430
goes out to infinity here.

1002
00:52:41,430 --> 00:52:44,050
All sorts of things
are out there.

1003
00:52:44,050 --> 00:52:46,312
But let me stick to rectangles.

1004
00:52:46,312 --> 00:52:50,760
Because that's what we've
seen and we can relate to.

1005
00:52:50,760 --> 00:52:54,400
You can achieve
these same bounds--

1006
00:52:54,400 --> 00:52:55,500
sorry, no update.

1007
00:52:55,500 --> 00:52:57,420
You can achieve the
log to the d minus 1

1008
00:52:57,420 --> 00:53:01,690
n query using less space.

1009
00:53:01,690 --> 00:53:06,630
So I can get log
to the d minus 1 n

1010
00:53:06,630 --> 00:53:16,170
query and n log to the
d minus 1 n space--

1011
00:53:16,170 --> 00:53:18,270
that's what we were
getting before--

1012
00:53:18,270 --> 00:53:21,240
divided by log log n.

1013
00:53:21,240 --> 00:53:23,280
Slight improvement.

1014
00:53:23,280 --> 00:53:26,370
And in a certain model,
this is basically optimal,

1015
00:53:26,370 --> 00:53:27,960
which is kind of even crazier.

1016
00:53:27,960 --> 00:53:31,970
This is an old
result by Chazelle.

1017
00:53:31,970 --> 00:53:34,560
That's in '86.

1018
00:53:34,560 --> 00:53:35,670
OK.

1019
00:53:35,670 --> 00:53:41,325
This is 2D-- sorry, not
2D, just in general.

1020
00:53:45,600 --> 00:53:48,840
Turns out this query
time is not optimal.

1021
00:53:48,840 --> 00:53:51,630
If you allow the space
to go up a little bit,

1022
00:53:51,630 --> 00:53:55,188
you can get another
log improvement.

1023
00:53:55,188 --> 00:53:59,160
So I can get log
to the d minus 2

1024
00:53:59,160 --> 00:54:04,675
n query if I'm
willing to pay--

1025
00:54:04,675 --> 00:54:05,715
I didn't say this is space--

1026
00:54:09,210 --> 00:54:15,510
n log to the d n space.

1027
00:54:15,510 --> 00:54:19,260
So if I give up another
log factor in space,

1028
00:54:19,260 --> 00:54:21,060
I can get another
log factor in query.

1029
00:54:21,060 --> 00:54:23,040
I don't think you
can keep doing that.

1030
00:54:23,040 --> 00:54:25,520
But for one more step, you can.

1031
00:54:25,520 --> 00:54:28,770
I believe this is conjectured
optimal for query.

1032
00:54:28,770 --> 00:54:32,040
I don't know if it's proved.

1033
00:54:32,040 --> 00:54:35,190
And this was originally
done by Chazelle and Guibas

1034
00:54:35,190 --> 00:54:38,190
using fractional cascading.

1035
00:54:38,190 --> 00:54:39,570
And we'll see.

1036
00:54:39,570 --> 00:54:43,430
If there's time next class,
I'll show you how this works.

1037
00:54:43,430 --> 00:54:45,180
But for now, I want
to tell you in general

1038
00:54:45,180 --> 00:54:48,540
how fractional cascading
works in generality.

1039
00:54:48,540 --> 00:54:50,250
This is part of
fractional cascading,

1040
00:54:50,250 --> 00:54:53,970
this idea of cross-linking from
a bigger structure to a smaller

1041
00:54:53,970 --> 00:54:56,640
one, so that you don't
have to keep re-searching.

1042
00:54:56,640 --> 00:54:59,340
You just reuse where you were.

1043
00:54:59,340 --> 00:55:00,480
But there's another idea.

1044
00:55:00,480 --> 00:55:01,688
I want to show you that idea.

1045
00:55:04,200 --> 00:55:06,180
So, fractional cascading.

1046
00:55:22,052 --> 00:55:24,540
AUDIENCE: Would that
work for d equals 2?

1047
00:55:24,540 --> 00:55:27,840
ERIK DEMAINE: For d equals
2, no it does not work.

1048
00:55:27,840 --> 00:55:31,590
So I should say this
is for 2D and higher.

1049
00:55:31,590 --> 00:55:33,620
D has to be bigger than 1.

1050
00:55:33,620 --> 00:55:35,920
Because you can never beat log n.

1051
00:55:35,920 --> 00:55:39,470
So for 2D and higher, we could
use the trick that we just did.

1052
00:55:39,470 --> 00:55:43,450
For 3D and higher, you can
improve by another log,

1053
00:55:43,450 --> 00:55:44,984
thanks.

1054
00:55:44,984 --> 00:55:45,650
Other questions?

1055
00:55:45,650 --> 00:55:47,540
AUDIENCE: But you said
you can never beat log n.

1056
00:55:47,540 --> 00:55:49,123
ERIK DEMAINE: We can
never beat log n.

1057
00:55:49,123 --> 00:55:52,880
In this model, which is
basically comparison model,

1058
00:55:52,880 --> 00:55:54,580
we're comparing coordinates.

1059
00:55:54,580 --> 00:55:56,690
In that model and
many other models,

1060
00:55:56,690 --> 00:55:58,762
you can't beat log n query.

1061
00:55:58,762 --> 00:56:01,220
Because in particular, you have
to solve the search problem

1062
00:56:01,220 --> 00:56:02,570
in 1D.

1063
00:56:02,570 --> 00:56:04,310
So we're always
hampered by that.

1064
00:56:04,310 --> 00:56:08,170
But the question is,
how does it grow with d?

1065
00:56:08,170 --> 00:56:10,600
And the claim is we can get
log n all the way up to three

1066
00:56:10,600 --> 00:56:11,230
dimensions.

1067
00:56:11,230 --> 00:56:13,900
Only at four dimensions do
we have to pay log squared.

1068
00:56:13,900 --> 00:56:17,870
It's pretty amazing I think.

1069
00:56:17,870 --> 00:56:18,570
OK.

1070
00:56:18,570 --> 00:56:32,191
Fractional cascading-- super
cool name, kind of scary name.

1071
00:56:32,191 --> 00:56:34,690
I was always scared when I heard
about fractional cascading.

1072
00:56:34,690 --> 00:56:35,940
But it turns out,
it's very simple.

1073
00:56:35,940 --> 00:56:37,330
Goal today is to not be scared.

1074
00:56:40,110 --> 00:56:43,560
Let's start with
a warm up problem.

1075
00:56:43,560 --> 00:56:46,190
And then I'll tell you
its full generality.

1076
00:56:46,190 --> 00:56:48,370
But simple version
of the problem

1077
00:56:48,370 --> 00:56:50,290
is not geometry, per se.

1078
00:56:50,290 --> 00:56:52,900
It's kind of 1 and 1/2
dimensions, if you will.

1079
00:56:52,900 --> 00:57:02,620
Suppose I have k lists
and each has size n.

1080
00:57:02,620 --> 00:57:04,630
They're sorted
lists, think of them.

1081
00:57:04,630 --> 00:57:09,460
So we have n items come
from an ordered universe.

1082
00:57:09,460 --> 00:57:10,840
Here's list one.

1083
00:57:10,840 --> 00:57:12,760
Here's list two.

1084
00:57:12,760 --> 00:57:14,260
Here's a list three.

1085
00:57:14,260 --> 00:57:16,330
There's k of them.

1086
00:57:16,330 --> 00:57:19,180
Each of them has n items.

1087
00:57:19,180 --> 00:57:21,805
I would like to
search the query.

1088
00:57:24,712 --> 00:57:26,920
We'll just do static here.

1089
00:57:26,920 --> 00:57:29,880
Original fractional
cascading was just static.

1090
00:57:29,880 --> 00:57:32,154
And these results
are just static.

1091
00:57:32,154 --> 00:57:34,320
You can make it dynamic,
but there is some overhead.

1092
00:57:34,320 --> 00:57:35,920
And I don't want
to get into that.

1093
00:57:35,920 --> 00:57:40,450
It's even messier,
or it is messy.

1094
00:57:40,450 --> 00:57:43,900
Fractional cascading by
itself is a very simple idea.

1095
00:57:43,900 --> 00:57:50,720
Query is search
for x in all lists.

1096
00:57:53,870 --> 00:57:54,370
OK.

1097
00:57:54,370 --> 00:57:57,639
So I want to know what is the
predecessor and successor of x

1098
00:57:57,639 --> 00:57:58,180
in this list.

1099
00:57:58,180 --> 00:57:59,560
I want to know what
is the predecessor

1100
00:57:59,560 --> 00:58:00,690
and successor in this list.

1101
00:58:00,690 --> 00:58:03,010
I want to know what's the
predecessor and successor

1102
00:58:03,010 --> 00:58:04,764
in this list, all of them.

1103
00:58:04,764 --> 00:58:05,680
It's more information.

1104
00:58:05,680 --> 00:58:08,470
If I just merged the
lists and searched for x,

1105
00:58:08,470 --> 00:58:11,224
I would find where
x fits globally.

1106
00:58:11,224 --> 00:58:13,390
But I want to know how it
fits relative to this list

1107
00:58:13,390 --> 00:58:16,480
and relative to this list
and relative to this list.

1108
00:58:16,480 --> 00:58:18,430
How do I do it?

1109
00:58:18,430 --> 00:58:20,330
I could just do k
binary searches.

1110
00:58:20,330 --> 00:58:25,270
So this is an easy
problem to solve.

1111
00:58:25,270 --> 00:58:29,500
You get k times log n.
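
For reference, the k-binary-searches baseline looks like this in Python. The function name `search_all_naive` is hypothetical. It reports, for each list, the index of the first element that is at least x (the successor slot, with the predecessor just before it).

```python
from bisect import bisect_left


def search_all_naive(lists, x):
    # One independent binary search per sorted list: O(k log n) in total.
    return [bisect_left(lst, x) for lst in lists]
```

For example, with the lists [1, 4, 9], [2, 3, 10], and [5, 6, 7] and x = 5, it returns the positions 2, 2, and 0.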

1112
00:58:29,500 --> 00:58:32,050
But, now, fractional
cascading comes in.

1113
00:58:32,050 --> 00:58:40,760
And we can get the optimal
bound, which is k plus log n.

1114
00:58:40,760 --> 00:58:43,600
I need k to write
down the answers.

1115
00:58:43,600 --> 00:58:45,640
I need log n to do the
search in one list.

1116
00:58:45,640 --> 00:58:47,500
It turns out I can
search on all k lists,

1117
00:58:47,500 --> 00:58:52,570
simultaneously get all k
answers in k plus log n time.

1118
00:58:52,570 --> 00:58:57,400
It's kind of cool and,
actually, quite easy to do.

1119
00:58:57,400 --> 00:58:58,645
We want to use this concept.

1120
00:59:01,390 --> 00:59:04,870
If I could search for
my item, for x, in here,

1121
00:59:04,870 --> 00:59:07,380
and then basically follow
a pointer to where I want

1122
00:59:07,380 --> 00:59:09,910
to go in here, I'd be done.

1123
00:59:09,910 --> 00:59:12,500
Sadly, that can't be done.

1124
00:59:12,500 --> 00:59:13,300
Why?

1125
00:59:13,300 --> 00:59:16,810
Because who knows what
elements are in here?

1126
00:59:16,810 --> 00:59:21,480
All of these elements could
fit right in this slot.

1127
00:59:21,480 --> 00:59:24,400
And so how do I know where
to go in this giant list?

1128
00:59:24,400 --> 00:59:27,680
If these all fit in
here and, recursively,

1129
00:59:27,680 --> 00:59:30,280
these all fit in here,
then by searching up here,

1130
00:59:30,280 --> 00:59:32,420
I learn nothing about
where x fits in here.

1131
00:59:32,420 --> 00:59:33,820
I have to do another search.

1132
00:59:33,820 --> 00:59:35,986
And then I learn nothing
about where x fits in here.

1133
00:59:35,986 --> 00:59:37,720
So it doesn't work straight up.

1134
00:59:37,720 --> 00:59:41,830
But if we combine this idea
with fractional cascading,

1135
00:59:41,830 --> 00:59:43,786
then we can do it.

1136
00:59:43,786 --> 00:59:46,170
So I can erase this now.

1137
00:59:58,040 --> 00:59:59,590
So what do we do?

1138
01:00:02,720 --> 01:00:04,430
Idea is very simple.

1139
01:00:04,430 --> 01:00:10,410
So I'm going to call these
lists L1, L2, L3 up to Lk.

1140
01:00:13,780 --> 01:00:35,300
I want to add every other
item in Lk to Lk minus 1

1141
01:00:35,300 --> 01:00:37,340
and produce a new
list Lk minus 1 prime.

1142
01:00:39,880 --> 01:00:45,150
So I take every
second item here,

1143
01:00:45,150 --> 01:00:46,640
just insert them into this list.

1144
01:00:49,320 --> 01:00:52,460
[INAUDIBLE] it's a
constant fraction bigger.

1145
01:00:52,460 --> 01:00:54,080
And then repeat.

1146
01:00:54,080 --> 01:00:56,090
This is the fractional part.

1147
01:00:56,090 --> 01:00:57,849
Here, a fraction is one half.

1148
01:00:57,849 --> 01:00:59,390
You can make it
whatever fraction you

1149
01:00:59,390 --> 01:01:02,540
like less than one.

1150
01:01:02,540 --> 01:01:06,560
In general, I'm going to
add every other item--

1151
01:01:09,720 --> 01:01:13,360
this is in sorted order
in Lk, of course--

1152
01:01:13,360 --> 01:01:16,610
that's in Li prime--

1153
01:01:16,610 --> 01:01:18,410
the prime is the
important part here--

1154
01:01:18,410 --> 01:01:24,780
to Li minus 1 to
form Li minus 1 prime.

1155
01:01:24,780 --> 01:01:27,520
So I've got this new
larger version of L2.

1156
01:01:27,520 --> 01:01:29,030
I take half the items from here.

1157
01:01:29,030 --> 01:01:31,850
Some of them may be
items that were in L3.

1158
01:01:31,850 --> 01:01:34,280
Some of them are items
that were originally in L2.

1159
01:01:34,280 --> 01:01:36,030
But all of them get promoted.

1160
01:01:36,030 --> 01:01:40,550
Or half of them
get promoted to L1.

1161
01:01:40,550 --> 01:01:43,880
So I keep promoting
from the bottom up.

1162
01:01:43,880 --> 01:01:46,340
How big do my lists get?

1163
01:01:46,340 --> 01:01:50,550
What is the size of Li prime?

1164
01:01:50,550 --> 01:01:54,290
Well, it started with Li.

1165
01:01:54,290 --> 01:01:57,680
And then I added
half of the items

1166
01:01:57,680 --> 01:02:02,200
that were in the next
level down, Li plus 1.

1167
01:02:02,200 --> 01:02:02,700
OK.

1168
01:02:02,700 --> 01:02:05,800
So this is n.

1169
01:02:05,800 --> 01:02:07,970
And so this is going
to be half of n

1170
01:02:07,970 --> 01:02:10,546
plus half of another
n plus half of--

1171
01:02:10,546 --> 01:02:12,920
I mean, it's going to be n
plus a half n plus a quarter n

1172
01:02:12,920 --> 01:02:14,010
plus an eighth n.

1173
01:02:14,010 --> 01:02:15,670
It's a geometric series.

1174
01:02:15,670 --> 01:02:18,215
This is just a
constant factor growth.
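
Written as a recurrence, assuming every original list has size n and the bottom list is left unchanged:

```latex
|L_k'| = n, \qquad
|L_i'| = |L_i| + \tfrac{1}{2}\,|L_{i+1}'|
\;\le\; n + \tfrac{n}{2} + \tfrac{n}{4} + \cdots
\;\le\; 2n .
```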

1175
01:02:18,215 --> 01:02:20,510
I'm assuming all the lists
have the same size here

1176
01:02:20,510 --> 01:02:21,140
for simplicity.

1177
01:02:21,140 --> 01:02:24,380
You can generalize.

1178
01:02:24,380 --> 01:02:27,920
So I didn't really make the
lists any bigger, per se.

1179
01:02:27,920 --> 01:02:30,230
But I fixed this problem.

1180
01:02:30,230 --> 01:02:36,330
If all of the elements in
L2 fit right here in L1,

1181
01:02:36,330 --> 01:02:37,760
it's no longer a problem.

1182
01:02:37,760 --> 01:02:41,090
Because, now, half of the
items from L2 now live in L1.

1183
01:02:41,090 --> 01:02:43,880
So when I search among
L1, I'm not quite

1184
01:02:43,880 --> 01:02:45,500
doing a global search.

1185
01:02:45,500 --> 01:02:47,590
But I'm finding
where I fit in L1.

1186
01:02:47,590 --> 01:02:50,990
I didn't contaminate
it too much from L2.

1187
01:02:50,990 --> 01:02:56,390
And then now, it's useful to
have pointers from L1 to L2.

1188
01:02:56,390 --> 01:02:57,804
Let me draw a picture maybe.

1189
01:03:12,130 --> 01:03:21,700
So here's L1, L2, L3.

1190
01:03:21,700 --> 01:03:29,380
So half of the items here
have been inserted into here.

1191
01:03:29,380 --> 01:03:31,870
Now, we don't really know--

1192
01:03:31,870 --> 01:03:36,480
maybe many of them went
near the same location.

1193
01:03:36,480 --> 01:03:37,707
But they went there.

1194
01:03:37,707 --> 01:03:39,790
And I'm going to have
pointers in both directions.

1195
01:03:39,790 --> 01:03:42,776
Let's say I need them down.

1196
01:03:42,776 --> 01:03:44,650
So that if I search in
here, I can figure out

1197
01:03:44,650 --> 01:03:45,910
where I am down here.

1198
01:03:49,020 --> 01:03:51,670
Then half of these guys--

1199
01:03:51,670 --> 01:03:55,120
maybe I'll use another color--

1200
01:03:55,120 --> 01:03:57,340
get promoted to
the next level up.

1201
01:03:57,340 --> 01:04:00,070
So maybe this one gets promoted.

1202
01:04:00,070 --> 01:04:02,670
Maybe this one gets promoted.

1203
01:04:02,670 --> 01:04:07,740
I guess half would be that one,
that one, that one, that one,

1204
01:04:07,740 --> 01:04:08,710
that one.

1205
01:04:08,710 --> 01:04:10,960
These guys get promoted
to the next level.

1206
01:04:26,450 --> 01:04:27,600
OK.

1207
01:04:27,600 --> 01:04:29,290
I claim this is
enough information.

1208
01:04:29,290 --> 01:04:32,020
This is fractional cascading
in its full generality.

1209
01:04:32,020 --> 01:04:33,490
We have the
cross-linking that we

1210
01:04:33,490 --> 01:04:36,400
had in the layered range trees.

1211
01:04:36,400 --> 01:04:38,987
But, now, we also have the
fractional cascading part,

1212
01:04:38,987 --> 01:04:40,820
which is you take a
fraction, you cascade it

1213
01:04:40,820 --> 01:04:41,980
into the next layer.

1214
01:04:41,980 --> 01:04:45,330
The cascading refers to those
guys continue to get promoted.

1215
01:04:45,330 --> 01:04:49,330
Half of them get
promoted up recursively.

1216
01:04:49,330 --> 01:04:51,950
That's where the
name comes from.

1217
01:04:51,950 --> 01:04:53,417
So now, how do we do a search?

1218
01:04:53,417 --> 01:04:54,750
We're going to start at the top.

1219
01:04:54,750 --> 01:04:57,490
And we're going to do a regular
binary search at the top.

1220
01:04:57,490 --> 01:04:59,650
Because we can
afford log n once.

1221
01:04:59,650 --> 01:05:02,230
So we do the binary
search at the top.

1222
01:05:02,230 --> 01:05:07,600
So maybe we find that our
item fits here in this search.

1223
01:05:07,600 --> 01:05:10,420
So that tells us,
oh, well, this is

1224
01:05:10,420 --> 01:05:13,960
where item x fits in this list.

1225
01:05:13,960 --> 01:05:14,566
Great.

1226
01:05:14,566 --> 01:05:15,940
Now, I need to
know where it fits

1227
01:05:15,940 --> 01:05:18,970
in the next list
in constant time.

1228
01:05:18,970 --> 01:05:20,980
Well, I need some more
pointers for this.

1229
01:05:20,980 --> 01:05:22,990
So for each item
in here, I'm going

1230
01:05:22,990 --> 01:05:25,880
to store a pointer to the
previous and next, let's say,

1231
01:05:25,880 --> 01:05:28,570
red node, the
previous and next node

1232
01:05:28,570 --> 01:05:31,490
that was promoted
from the list below.

1233
01:05:31,490 --> 01:05:31,990
OK.

1234
01:05:31,990 --> 01:05:35,290
So, now, I basically
know where it fits here.

1235
01:05:35,290 --> 01:05:38,350
Not quite, because this
is only half the items.

1236
01:05:38,350 --> 01:05:42,040
So I know that it fits
between this guy and this guy

1237
01:05:42,040 --> 01:05:47,260
in list 2 prime, technically.

1238
01:05:47,260 --> 01:05:51,370
So the only thing I don't
know is, is it here or here?

1239
01:05:51,370 --> 01:05:53,590
So I compare with this one item.

1240
01:05:53,590 --> 01:05:56,560
And in general, if
it's not a half,

1241
01:05:56,560 --> 01:05:58,190
if the fraction is
some other constant,

1242
01:05:58,190 --> 01:06:01,120
I spend constant time to look
at a constant number of items,

1243
01:06:01,120 --> 01:06:02,930
figure out where it
fits among those items.

1244
01:06:02,930 --> 01:06:05,980
Now, I know where
it fits in L2 prime.

1245
01:06:05,980 --> 01:06:09,490
Then I, again, follow
pointers to the next items.

1246
01:06:09,490 --> 01:06:11,510
In this case, they're
the white items.

1247
01:06:11,510 --> 01:06:15,150
So let's say it fits
here, basically.

1248
01:06:15,150 --> 01:06:18,530
I have a pointer to the
previous and next white item

1249
01:06:18,530 --> 01:06:19,900
from that item.

1250
01:06:19,900 --> 01:06:21,070
Follow those pointers down.

1251
01:06:21,070 --> 01:06:24,630
And now, I know it's either,
basically, here, here, here.

1252
01:06:24,630 --> 01:06:28,300
It's somewhere in that little
range of either equaling this

1253
01:06:28,300 --> 01:06:30,420
or being between these
two items or on this item

1254
01:06:30,420 --> 01:06:33,280
or between those two
items or on this item.

1255
01:06:33,280 --> 01:06:35,410
And, again, constant number
of things to look at.

1256
01:06:35,410 --> 01:06:37,240
I figure out where I belong.

1257
01:06:37,240 --> 01:06:43,450
In the primed list, which is
not quite the original list,

1258
01:06:43,450 --> 01:06:46,095
maybe I determine
that x falls here.

1259
01:06:46,095 --> 01:06:47,470
And what I really
want to know is

1260
01:06:47,470 --> 01:06:51,170
it's between this item and
that item of the original list.

1261
01:06:51,170 --> 01:06:53,920
I don't care so much
about the promoted lists.

1262
01:06:53,920 --> 01:06:56,670
So I need more
pointers, which tell me

1263
01:06:56,670 --> 01:07:03,370
if it happens that I fall here,
basically, every promoted item

1264
01:07:03,370 --> 01:07:05,530
has a pointer to the
previous and next unpromoted

1265
01:07:05,530 --> 01:07:07,800
item from the original list.

1266
01:07:07,800 --> 01:07:09,040
This is static.

1267
01:07:09,040 --> 01:07:10,650
I can have all these pointers.

1268
01:07:10,650 --> 01:07:12,490
Let's write them down.

1269
01:07:12,490 --> 01:07:23,620
So every promoted
item in Li prime--

1270
01:07:23,620 --> 01:07:28,060
that means it came from
a promotion from below--

1271
01:07:28,060 --> 01:07:42,290
has a pointer to the previous
and next non-promoted item.

1272
01:07:42,290 --> 01:07:45,350
So that's an item in Li.

1273
01:07:45,350 --> 01:07:45,850
OK.

1274
01:07:45,850 --> 01:07:47,560
That's two pointers.

1275
01:07:47,560 --> 01:07:50,080
And that's what we just use.

1276
01:07:50,080 --> 01:07:52,960
So I found where I was
among the entire L1

1277
01:07:52,960 --> 01:07:56,590
prime, which was almost like
a global search, not quite.

1278
01:07:56,590 --> 01:07:59,500
And then I follow these
pointers to figure out where

1279
01:07:59,500 --> 01:08:00,790
it was in the original L1.

1280
01:08:05,137 --> 01:08:06,970
Well, so if I found
that I was in the middle

1281
01:08:06,970 --> 01:08:10,180
of this big white region, I need
to find the next red region.

1282
01:08:10,180 --> 01:08:11,810
So it's basically the reverse.

1283
01:08:11,810 --> 01:08:19,354
Every non-promoted item, every
item in Li, has a pointer.

1284
01:08:23,229 --> 01:08:27,344
So this is basically Li
prime minus Li, if you will.

1285
01:08:27,344 --> 01:08:28,760
And then these
guys need a pointer

1286
01:08:28,760 --> 01:08:35,540
to the next and previous
in that, so previous

1287
01:08:35,540 --> 01:08:45,720
and the next item in
Li prime minus Li.

1288
01:08:45,720 --> 01:08:47,608
So these are the promoted items.

1289
01:08:47,608 --> 01:08:48,899
These are the unpromoted items.

1290
01:08:48,899 --> 01:08:52,020
So it's actually just
two pointers per item.

1291
01:08:52,020 --> 01:08:54,560
If you're promoted, you store
the previous and next unpromoted.

1292
01:08:54,560 --> 01:08:56,010
If you're
unpromoted,

1293
01:08:56,010 --> 01:08:57,739
you store the previous
and next promoted.

1294
01:08:57,739 --> 01:08:58,739
It's nice and symmetric.

1295
01:08:58,739 --> 01:09:00,550
It's pretty clean,
a lot of pointers,

1296
01:09:00,550 --> 01:09:04,359
hard to draw, but quite
simple in the end.

1297
01:09:04,359 --> 01:09:05,350
There's two main ideas.

1298
01:09:05,350 --> 01:09:07,560
One is to promote
recursively up,

1299
01:09:07,560 --> 01:09:11,705
just a constant fraction, so
the lists don't get much bigger.

1300
01:09:11,705 --> 01:09:13,080
Because it's a
constant fraction,

1301
01:09:13,080 --> 01:09:15,779
the gaps when you walk
down are constant size.

1302
01:09:15,779 --> 01:09:17,819
And so you basically
get free relocalization

1303
01:09:17,819 --> 01:09:20,370
within each list with
the help of some pointers

1304
01:09:20,370 --> 01:09:26,220
to walk down and jump left and
right between the two colors.

1305
01:09:26,220 --> 01:09:27,439
OK.

1306
01:09:27,439 --> 01:09:29,350
That's basic
fractional cascading

1307
01:09:29,350 --> 01:09:34,460
in that we solved this problem,
searched within k lists

1308
01:09:34,460 --> 01:09:38,810
each of size n, in k plus log n
time, which is kind of amazing

1309
01:09:38,810 --> 01:09:40,700
I think, pretty cool.
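
Here is a rough sketch of the basic scheme on k sorted lists, assuming every other item of each augmented list is promoted into the list before it. It stores ranks in arrays instead of explicit pointers, and the preprocessing uses binary searches for brevity where a linear merge would do. The names `build_cascade` and `search_all` are made up for illustration.

```python
from bisect import bisect_left

def build_cascade(lists):
    """Preprocess k sorted lists into per-level (merged, own, down) arrays."""
    levels = [None] * len(lists)
    below = []  # augmented list of the level below (empty under the last list)
    for i in range(len(lists) - 1, -1, -1):
        # Promote every other item of the augmented list below into this one.
        merged = sorted(lists[i] + below[1::2])
        # own[j]: rank of merged[j] in the original list i (plus a sentinel).
        own = [bisect_left(lists[i], v) for v in merged] + [len(lists[i])]
        # down[j]: rank of merged[j] in the augmented list below (plus sentinel).
        down = [bisect_left(below, v) for v in merged] + [len(below)]
        levels[i] = (merged, own, down)
        below = merged
    return levels

def search_all(levels, x):
    """Return, for every original list, the index of the first item >= x."""
    if not levels:
        return []
    merged, own, down = levels[0]
    p = bisect_left(merged, x)      # the only binary search: O(log n)
    ranks = [own[p]]
    for i in range(1, len(levels)):
        q = down[p]                 # follow the pointer into the next list
        merged, own, down = levels[i]
        # Only unpromoted items can sit between x and the pointer target,
        # and no two neighbors are both unpromoted: at most one step back.
        if q > 0 and merged[q - 1] >= x:
            q -= 1
        p = q
        ranks.append(own[p])
    return ranks
```

For example, `search_all(build_cascade([[1, 4, 9], [2, 3, 9, 20]]), 4)` gives the position of the successor of 4 in each list. After the first binary search, each further list costs constant work, which is where the k plus log n comes from.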

1310
01:09:44,510 --> 01:09:47,590
But there's a more
general form of this,

1311
01:09:47,590 --> 01:09:49,529
uses the exact same ideas.

1312
01:09:49,529 --> 01:09:54,149
But I just want to tell you
how they generalized it.

1313
01:09:58,084 --> 01:09:59,250
This is Chazelle and Guibas.

1314
01:10:07,958 --> 01:10:22,920
So in general,
fractional cascading,

1315
01:10:22,920 --> 01:10:25,065
if you look at what
cascading is happening here--

1316
01:10:27,660 --> 01:10:31,800
here being here-- we essentially
have a path of cascades.

1317
01:10:31,800 --> 01:10:34,780
We start at the bottom and
push into the predecessor

1318
01:10:34,780 --> 01:10:35,280
in the path.

1319
01:10:35,280 --> 01:10:37,530
We push into the
predecessor in the path.

1320
01:10:37,530 --> 01:10:39,960
In the general case, we
do it on a graph instead

1321
01:10:39,960 --> 01:10:42,580
of a path, arbitrary graph.

1322
01:10:42,580 --> 01:10:46,510
So we have a graph.

1323
01:10:46,510 --> 01:10:48,870
The input, in some
sense, you can think

1324
01:10:48,870 --> 01:10:52,080
of this as a transformation.

1325
01:10:52,080 --> 01:10:54,510
But it's for a specific
kind of data structure.

1326
01:10:54,510 --> 01:10:59,640
The data structure is
represented by a graph.

1327
01:10:59,640 --> 01:11:11,370
And each vertex of the
graph has a set of elements

1328
01:11:11,370 --> 01:11:12,540
or a list of elements.

1329
01:11:12,540 --> 01:11:13,680
That's what we had here.

1330
01:11:13,680 --> 01:11:17,415
We had a path here.

1331
01:11:17,415 --> 01:11:20,040
And each node in the path had a
corresponding list of elements.

1332
01:11:20,040 --> 01:11:22,982
And we wanted to search
among those lists.

1333
01:11:22,982 --> 01:11:24,690
Before I tell you
exactly what search is,

1334
01:11:24,690 --> 01:11:26,790
let me tell you about
the rest of the graph.

1335
01:11:34,500 --> 01:11:38,710
And this is, sorry, in
an ordered universe.

1336
01:11:38,710 --> 01:11:40,167
So it's one dimensional.

1337
01:11:43,750 --> 01:11:50,245
Each edge is labeled with a range
from that ordered universe, [a,

1338
01:11:50,245 --> 01:11:50,940
b].

1339
01:11:50,940 --> 01:11:52,440
Every edge has some range.

1340
01:11:55,454 --> 01:11:57,120
You can think of it
as a directed graph.

1341
01:11:57,120 --> 01:11:58,080
It's probably cleaner.

1342
01:11:58,080 --> 01:12:02,580
So when I follow this
edge, there's a range here.

1343
01:12:02,580 --> 01:12:05,130
And, basically, I'm only
allowed to follow that edge

1344
01:12:05,130 --> 01:12:08,730
if the range contains the
thing I'm searching for.

1345
01:12:08,730 --> 01:12:13,202
So here, I was searching for
some item x in all the lists.

1346
01:12:13,202 --> 01:12:14,160
There's no ranges here.

1347
01:12:14,160 --> 01:12:16,674
But in general, you
get to specify a range.

1348
01:12:16,674 --> 01:12:18,090
Why do we want to
specify a range?

1349
01:12:18,090 --> 01:12:21,633
We need a sort of bounded
degree constraint.

1350
01:12:28,890 --> 01:12:33,348
We want to have
bounded in-degree.

1351
01:12:33,348 --> 01:12:37,116
So here we had in-degree
1 for every node.

1352
01:12:37,116 --> 01:12:38,490
In general, we
don't want to have

1353
01:12:38,490 --> 01:12:39,760
too many nodes pointing in.

1354
01:12:39,760 --> 01:12:42,492
Because we want to take
half the nodes in here,

1355
01:12:42,492 --> 01:12:43,950
or a constant
fraction of the items

1356
01:12:43,950 --> 01:12:46,872
here, and promote them into
all the nodes that point to it,

1357
01:12:46,872 --> 01:12:48,330
so that when we
follow the pointer,

1358
01:12:48,330 --> 01:12:50,400
we get to know where
we belong here.

1359
01:12:50,400 --> 01:12:52,530
That's the general concept.

1360
01:12:52,530 --> 01:12:54,570
So, ideally, we have
bounded in-degree.

1361
01:12:54,570 --> 01:12:56,130
If we do, we're done.

1362
01:12:56,130 --> 01:12:58,200
We can have a slightly
weaker condition,

1363
01:12:58,200 --> 01:13:02,010
which is called
locally bounded in-degree, where

1364
01:13:02,010 --> 01:13:12,750
the number of incoming edges for
a node whose labels are ranges--

1365
01:13:12,750 --> 01:13:15,960
the labels have to have a
common intersection, x.

1366
01:13:15,960 --> 01:13:19,260
So we're searching
for some item x.

1367
01:13:19,260 --> 01:13:22,230
And if all the possible ways we
can enter this node given item

1368
01:13:22,230 --> 01:13:22,920
x--

1369
01:13:22,920 --> 01:13:25,980
so this x has to fall
in all those ranges--

1370
01:13:25,980 --> 01:13:32,700
that should be bounded, so,
at most, some constant c.

1371
01:13:32,700 --> 01:13:34,470
If it's always at
most c for all nodes

1372
01:13:34,470 --> 01:13:38,040
and for all x's, then this
is locally bounded in-degree.

1373
01:13:38,040 --> 01:13:42,720
And these range labels help
you achieve this property.

1374
01:13:42,720 --> 01:13:44,332
If you can constrain
that you're only

1375
01:13:44,332 --> 01:13:46,290
going to follow this edge
in certain situations

1376
01:13:46,290 --> 01:13:47,706
and there aren't
too many ways you

1377
01:13:47,706 --> 01:13:50,340
could have gotten to a node,
then you have this property.

1378
01:13:50,340 --> 01:13:57,210
AUDIENCE: [INAUDIBLE]
bound to x?

1379
01:13:57,210 --> 01:14:02,030
ERIK DEMAINE: Contain x is
a backwards containment.

1380
01:14:02,030 --> 01:14:03,340
Let me put it this way.

1381
01:14:03,340 --> 01:14:04,620
You have a node.

1382
01:14:04,620 --> 01:14:07,740
You have all these
edges coming into it.

1383
01:14:07,740 --> 01:14:11,190
I want x to be a valid choice
for each of these edges.

1384
01:14:11,190 --> 01:14:16,980
Meaning, the range, each of them
is some interval on the line.

1385
01:14:16,980 --> 01:14:18,900
All those intervals
should contain x.

1386
01:14:22,740 --> 01:14:25,710
It's basically, if you laid
out all the intervals incoming

1387
01:14:25,710 --> 01:14:28,290
into this node, what
is the maximum depth

1388
01:14:28,290 --> 01:14:29,310
of those intervals?

1389
01:14:29,310 --> 01:14:32,040
What's the maximum intersection
between all those intervals?

1390
01:14:32,040 --> 01:14:34,200
That is your local degree.

1391
01:14:34,200 --> 01:14:36,720
And as long as that's a
constant, we're happy.
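
The "maximum depth" of the incoming intervals can be computed with a simple sweep. This is a small sketch, assuming each incoming edge label is a closed interval (a, b) with a <= b; the function name `local_in_degree` is hypothetical.

```python
def local_in_degree(ranges):
    """Largest number of closed intervals (a, b) that share a common point x."""
    events = []
    for a, b in ranges:
        events.append((a, 0))  # 0 = interval opens; sorts before a close
        events.append((b, 1))  # 1 = interval closes
    events.sort()              # at equal coordinates, opens come first

    depth = best = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best
```

For instance, the labels (0, 5), (3, 8), (5, 10), (20, 30) all coming into one node have local in-degree 3, because x = 5 lies in the first three. The plain in-degree of that node would be 4.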

1392
01:14:39,516 --> 01:14:42,320
All right.

1393
01:14:42,320 --> 01:14:44,590
So now, let me specify
what a search means.

1394
01:14:47,120 --> 01:14:49,495
This is the problem that
fractional cascading can solve.

1395
01:14:52,150 --> 01:15:05,310
Goal is to find x in
some k vertex sets.

1396
01:15:05,310 --> 01:15:08,800
So k vertices, each
of them has a set.

1397
01:15:08,800 --> 01:15:10,860
I want to find x in k of them.

1398
01:15:10,860 --> 01:15:12,670
Not all of them, k of them.

1399
01:15:12,670 --> 01:15:14,650
That's the general problem.

1400
01:15:14,650 --> 01:15:16,945
I have a constraint on
how those sets are found.

1401
01:15:23,170 --> 01:15:28,775
They're found by navigating this
graph starting from any vertex.

1402
01:15:35,010 --> 01:15:42,000
And we navigate by following
edges whose labels contain x.

1403
01:15:50,520 --> 01:15:53,070
So we start at some
vertex in the graph.

1404
01:15:53,070 --> 01:15:56,160
We can follow some
edges that contain

1405
01:15:56,160 --> 01:15:59,690
x. x is a valid choice here
that's inside the interval.

1406
01:15:59,690 --> 01:16:02,730
Then from here, maybe we
follow some more where

1407
01:16:02,730 --> 01:16:06,990
x is a valid choice, and so on.

1408
01:16:06,990 --> 01:16:09,485
It could look like anything.

1409
01:16:09,485 --> 01:16:11,610
It doesn't have to be depth
first or breadth first.

1410
01:16:11,610 --> 01:16:15,030
It's just you follow
some tree from some node

1411
01:16:15,030 --> 01:16:19,530
where all of the
edges are valid for x.

1412
01:16:19,530 --> 01:16:22,140
At some point, you decide
that I've seen enough.

1413
01:16:22,140 --> 01:16:27,750
And now, the goal is to find
in this set, where is x?

1414
01:16:27,750 --> 01:16:28,910
In this set, where is x?

1415
01:16:28,910 --> 01:16:31,490
In this set, where is x?

1416
01:16:31,490 --> 01:16:34,090
In each of these lists, what
is a predecessor and successor

1417
01:16:34,090 --> 01:16:34,900
of x?
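
To make the query model concrete, here is a baseline sketch that navigates the graph and does a fresh binary search at every vertex it reaches, so it costs about k log n. Fractional cascading would replace each of those searches after the first with constant work. The dictionary layout of `catalogs` and `edges`, the `limit` parameter, and the stack-based visiting order are all assumptions for illustration; the lecture allows any order.

```python
from bisect import bisect_left

def explore(catalogs, edges, start, x, limit):
    """Visit up to `limit` vertices reachable from `start` along edges whose
    closed range (a, b) contains x.

    catalogs: {vertex: sorted list of items}
    edges:    {vertex: [(target_vertex, (a, b)), ...]}
    Returns {vertex: index of the successor of x in that vertex's list}.
    """
    found = {start: bisect_left(catalogs[start], x)}
    stack = [start]
    while stack and len(found) < limit:
        u = stack.pop()
        for v, (a, b) in edges.get(u, []):
            # Only edges whose label contains x may be followed.
            if a <= x <= b and v not in found and len(found) < limit:
                found[v] = bisect_left(catalogs[v], x)  # baseline: O(log n)
                stack.append(v)
    return found
```

The predecessor of x in each list is the item just before the returned index, when that index is not zero.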

1418
01:16:34,900 --> 01:16:35,400
Question.

1419
01:16:35,400 --> 01:16:37,274
AUDIENCE: So there's
generally some root node

1420
01:16:37,274 --> 01:16:39,030
from which all queries start?

1421
01:16:39,030 --> 01:16:42,420
ERIK DEMAINE: I believe you do
not need a single root node.

1422
01:16:42,420 --> 01:16:44,924
Each search could start
from a different point.

1423
01:16:44,924 --> 01:16:45,465
AUDIENCE: OK.

1424
01:16:45,465 --> 01:16:46,298
So it's [INAUDIBLE].

1425
01:16:46,298 --> 01:16:48,090
ERIK DEMAINE: But
you're told where.

1426
01:16:48,090 --> 01:16:50,580
So imagine this is like
an interaction between two

1427
01:16:50,580 --> 01:16:51,100
parties.

1428
01:16:51,100 --> 01:16:55,350
So the input basically says,
look, I'm searching for x.

1429
01:16:55,350 --> 01:16:57,051
And I'm going to
start at this node.

1430
01:16:57,051 --> 01:16:59,050
And then the fractional
cascading data structure

1431
01:16:59,050 --> 01:17:01,560
says, OK, here's where
x is in that node.

1432
01:17:01,560 --> 01:17:03,057
It tells you immediately.

1433
01:17:03,057 --> 01:17:04,220
Why not?

1434
01:17:04,220 --> 01:17:07,050
Then it says, OK, I'd
like to follow this edge

1435
01:17:07,050 --> 01:17:08,454
and go to this node.

1436
01:17:08,454 --> 01:17:09,870
And fractional
cascading says, OK,

1437
01:17:09,870 --> 01:17:12,670
here's where x is in this
node in constant time.

1438
01:17:12,670 --> 01:17:13,170
OK.

1439
01:17:13,170 --> 01:17:14,720
Then now these two
guys are active.

1440
01:17:14,720 --> 01:17:19,030
And now, the adversary, the
input, whatever, can decide,

1441
01:17:19,030 --> 01:17:22,410
OK, I'm going to follow this
edge, or this edge, any order.

1442
01:17:22,410 --> 01:17:24,120
It can build this
tree in any order.

1443
01:17:24,120 --> 01:17:27,510
And every time it says here's
the edge I want to follow,

1444
01:17:27,510 --> 01:17:30,240
the fractional cascading data
structure in constant time

1445
01:17:30,240 --> 01:17:32,190
tells you here's
where x is among all

1446
01:17:32,190 --> 01:17:34,130
the items in that node.

1447
01:17:34,130 --> 01:17:35,730
How does it do that?

1448
01:17:35,730 --> 01:17:37,200
With fractional cascading.

1449
01:17:37,200 --> 01:17:39,450
You just take half the items.

1450
01:17:39,450 --> 01:17:41,040
Half doesn't work anymore.

1451
01:17:41,040 --> 01:17:44,130
Now, it depends on
that bounded in-degree.

1452
01:17:44,130 --> 01:17:46,530
But you take some
function of that degree

1453
01:17:46,530 --> 01:17:50,790
c, take some constant
fraction of the items,

1454
01:17:50,790 --> 01:17:54,030
promote them to all
the things, keep going.

1455
01:17:54,030 --> 01:17:57,269
It's a little trickier,
because now you have cycles.

1456
01:17:57,269 --> 01:17:59,310
So you could actually
promote back into yourself,

1457
01:17:59,310 --> 01:18:01,350
eventually, by chain reactions.

1458
01:18:01,350 --> 01:18:03,040
But if you set the
constant low enough,

1459
01:18:03,040 --> 01:18:05,700
it's like radioactive decay.

1460
01:18:05,700 --> 01:18:09,281
Eventually, it all
goes away, right?

1461
01:18:09,281 --> 01:18:09,780
I wish.

1462
01:18:12,390 --> 01:18:14,610
So it's much better
than radioactive decay.

1463
01:18:14,610 --> 01:18:15,750
Radioactive is logarithmic.

1464
01:18:15,750 --> 01:18:17,110
This is exponential.

1465
01:18:17,110 --> 01:18:18,510
So it's decreasing very quickly.

1466
01:18:18,510 --> 01:18:21,030
After log n steps, all
your items are gone.

1467
01:18:21,030 --> 01:18:23,620
So, yeah, maybe you go in
a short loop for a while.

1468
01:18:23,620 --> 01:18:26,732
But after log n
steps, it's all gone.

1469
01:18:26,732 --> 01:18:28,690
So you're, at most,
increasing by a log factor.

1470
01:18:28,690 --> 01:18:30,450
In fact, you just increase
by a constant factor,

1471
01:18:30,450 --> 01:18:32,075
because the number
of items that remain

1472
01:18:32,075 --> 01:18:35,144
gets so tiny very quickly.
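
The constant-factor claim is a geometric series. The first line below is the path case from earlier, with lists of size at most n and half of each augmented list promoted. The second line is only a rough accounting for the graph case, assuming a promotion fraction p and at most c incoming edges per vertex with pc < 1; the symbols p and N (total number of original items) are introduced here for illustration, and this is not the analysis from the paper.

```latex
\[
  |L_i'| \;\le\; |L_i| + \tfrac{1}{2}\,|L_{i+1}'|
  \;\le\; n + \tfrac{n}{2} + \tfrac{n}{4} + \cdots \;<\; 2n .
\]
\[
  \text{total copies} \;\le\; N \sum_{j \ge 0} (pc)^j \;=\; \frac{N}{1 - pc},
  \qquad \text{assuming } pc < 1 .
\]
```

Either way the sum converges, so the space grows by a constant factor and not by a log factor.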

1473
01:18:35,144 --> 01:18:36,810
So I'm not going to
go into the details,

1474
01:18:36,810 --> 01:18:40,320
but you just take this list
idea, apply it to your graph.

1475
01:18:40,320 --> 01:18:41,700
It works.

1476
01:18:41,700 --> 01:18:43,230
It gets messier.

1477
01:18:43,230 --> 01:18:45,240
But in this very
general scenario,

1478
01:18:45,240 --> 01:18:48,870
you can support
these searches in k

1479
01:18:48,870 --> 01:18:53,310
plus log n where n, let's
say, is the maximum size

1480
01:18:53,310 --> 01:18:55,650
of any vertex set.

1481
01:19:01,200 --> 01:19:03,780
So it just directly generalizes.

1482
01:19:03,780 --> 01:19:06,420
And this is the
thing that you can

1483
01:19:06,420 --> 01:19:11,520
use to get this log
factor improvement

1484
01:19:11,520 --> 01:19:12,630
and many other things.

1485
01:19:12,630 --> 01:19:15,642
Actually, this was such
a big thing at the time.

1486
01:19:15,642 --> 01:19:17,850
There were two papers on
fractional cascading, part 1

1487
01:19:17,850 --> 01:19:18,870
and part 2.

1488
01:19:18,870 --> 01:19:21,690
Part 1 is what is solving this.

1489
01:19:21,690 --> 01:19:23,312
And part 2 is applications.

1490
01:19:23,312 --> 01:19:25,020
They solved a ton of
problems that no one

1491
01:19:25,020 --> 01:19:28,380
knew how to solve using this
general fractional cascading

1492
01:19:28,380 --> 01:19:29,280
technique.

1493
01:19:29,280 --> 01:19:31,430
That's it for today.