1
00:00:07,000 --> 00:00:10,000
-- shortest paths.
This is the finale.

2
00:00:10,000 --> 00:00:13,000
Hopefully it was worth waiting
for.

3
00:00:13,000 --> 00:00:17,000
Remind you there's a quiz
coming up soon,

4
00:00:17,000 --> 00:00:23,000
you should be studying for it.
There's no problem set due at

5
00:00:23,000 --> 00:00:28,000
the same time as the quiz
because you should be studying

6
00:00:28,000 --> 00:00:32,000
now.
It's a take-home exam.

7
00:00:32,000 --> 00:00:37,000
It's required that you come to
class on Monday.

8
00:00:37,000 --> 00:00:43,000
Of course, you'll all come,
but everyone watching at home

9
00:00:43,000 --> 00:00:47,000
should also come next Monday to
get the quiz.

10
00:00:47,000 --> 00:00:53,000
It's the required lecture.
So, we need a bit of a recap in

11
00:00:53,000 --> 00:00:58,000
the trilogy so far.
So, the last two lectures,

12
00:00:58,000 --> 00:01:04,000
the last two episodes,
were about single source shortest

13
00:01:04,000 --> 00:01:08,000
paths.
So, we wanted to find the

14
00:01:08,000 --> 00:01:13,000
shortest path from a source
vertex to every other vertex.

15
00:01:13,000 --> 00:01:17,000
And, we saw a few algorithms
for this.

16
00:01:17,000 --> 00:01:21,000
Here's some recap.
We saw in the unweighted case,

17
00:01:21,000 --> 00:01:27,000
that was sort of the easiest
where all the edge weights were

18
00:01:27,000 --> 00:01:30,000
one.
Then we could use breadth first

19
00:01:30,000 --> 00:01:34,000
search.
And this costs what we call

20
00:01:34,000 --> 00:01:41,000
linear time in the graph world,
the number of vertices plus the

21
00:01:41,000 --> 00:01:46,000
number of edges.
The next simplest case,

22
00:01:46,000 --> 00:01:50,000
perhaps, is nonnegative edge
weights.

23
00:01:50,000 --> 00:01:54,000
And in that case,
what algorithm do we use?

24
00:01:54,000 --> 00:02:00,000
Dijkstra, all right,
everyone's awake.

25
00:02:00,000 --> 00:02:04,000
Several answers at once,
great.

26
00:02:04,000 --> 00:02:11,000
So this takes almost linear
time if you use a good heap

27
00:02:11,000 --> 00:02:15,000
structure, so,
V log V plus E.

28
00:02:15,000 --> 00:02:21,000
And, in the general case,
general weights,

29
00:02:21,000 --> 00:02:26,000
we would use Bellman-Ford which
you saw.

30
00:02:26,000 --> 00:02:33,000
And that costs VE,
good, OK, which is quite a bit

31
00:02:33,000 --> 00:02:38,000
worse.
This is ignoring log factors.

32
00:02:38,000 --> 00:02:42,000
Dijkstra is basically linear
time; Bellman-Ford is

33
00:02:42,000 --> 00:02:45,000
quadratic if you have a
connected graph.

34
00:02:45,000 --> 00:02:49,000
So, in the sparse case,
when E is order V,

35
00:02:49,000 --> 00:02:52,000
this is about linear.
This is about quadratic.

36
00:02:52,000 --> 00:02:56,000
In the dense case,
when E is about V^2,

37
00:02:56,000 --> 00:03:00,000
this is quadratic,
and this is cubic.

38
00:03:00,000 --> 00:03:06,000
So, Dijkstra and Bellman-Ford
are separated by about an order

39
00:03:06,000 --> 00:03:09,000
of V factor, which is pretty
bad.

40
00:03:09,000 --> 00:03:15,000
OK, but that's the best we know
how to do for single source

41
00:03:15,000 --> 00:03:19,000
shortest paths,
negative edge weights,

42
00:03:19,000 --> 00:03:24,000
Bellman-Ford is the best.
We also saw in recitation the

43
00:03:24,000 --> 00:03:30,000
case of a DAG.
And there, what do you do?

44
00:03:30,000 --> 00:03:32,000
Topological sort,
yeah.

45
00:03:32,000 --> 00:03:39,000
So, you can do a topological
sort to get an ordering on the

46
00:03:39,000 --> 00:03:42,000
vertices.
Then you run Bellman-Ford,

47
00:03:42,000 --> 00:03:47,000
one round.
This is one way to think of

48
00:03:47,000 --> 00:03:51,000
what's going on.
You run Bellman-Ford in the

49
00:03:51,000 --> 00:03:57,000
order given by the topological
sort, which is once,

50
00:03:57,000 --> 00:04:03,000
and you get a linear time
algorithm.

51
00:04:03,000 --> 00:04:06,000
So, DAG is another case where
we know how to do well even with

52
00:04:06,000 --> 00:04:08,000
weights.
Unweighted, we can also do

53
00:04:08,000 --> 00:04:10,000
linear time.
But most of the time,

54
00:04:10,000 --> 00:04:13,000
though, will be,
so you should keep these in

55
00:04:13,000 --> 00:04:15,000
mind in the quiz.
When you get a shortest path

56
00:04:15,000 --> 00:04:19,000
problem, or what you end up
determining is a shortest path

57
00:04:19,000 --> 00:04:22,000
problem, think about what's the
best algorithm you can use in

58
00:04:22,000 --> 00:04:24,000
that case?
OK, so that's single source

59
00:04:24,000 --> 00:04:27,000
shortest paths.
And so, in our evolution of the

60
00:04:27,000 --> 00:04:30,000
Death Star, initially it was
just nonnegative edge weights.

61
00:04:30,000 --> 00:04:34,000
Then we got negative edge
weights.

62
00:04:34,000 --> 00:04:37,000
Today, the Death Star
challenges us with all-pairs

63
00:04:37,000 --> 00:04:40,000
shortest paths,
where we want to know the

64
00:04:40,000 --> 00:04:44,000
shortest path weight between
every pair of vertices.

65
00:04:59,000 --> 00:05:03,000
OK, so let's get some quick
results.

66
00:05:03,000 --> 00:05:07,000
What could we do with this
case?

67
00:05:07,000 --> 00:05:13,000
So, for example,
suppose I have an unweighted

68
00:05:13,000 --> 00:05:18,000
graph.
Any suggestions of how I should

69
00:05:18,000 --> 00:05:26,000
compute all-pairs shortest paths?
Between every pair of vertices,

70
00:05:26,000 --> 00:05:32,000
I want to know the shortest
path weight.

71
00:05:32,000 --> 00:05:37,000
BFS, a couple more words?
Yeah?

72
00:05:37,000 --> 00:05:44,000
Right, BFS V times.
OK, I'll say V times BFS,

73
00:05:44,000 --> 00:05:49,000
OK?
So, the running time would be

74
00:05:49,000 --> 00:05:57,000
V^2 plus V times E,
yeah, which is assuming your

75
00:05:57,000 --> 00:06:03,000
graph is connected,
V times E.

76
00:06:03,000 --> 00:06:05,000
OK, good.
That's probably about the best

77
00:06:05,000 --> 00:06:07,000
algorithm we know for unweighted
graphs.

78
00:06:07,000 --> 00:06:11,000
So, a lot of these are going to
sort of be the obvious answer.

79
00:06:11,000 --> 00:06:15,000
You take your single source
algorithm, you run it V times.

80
00:06:15,000 --> 00:06:18,000
That's the best you can do,
OK, or the best we know how to

81
00:06:18,000 --> 00:06:19,000
do.
This is not so bad.

82
00:06:19,000 --> 00:06:22,000
This is like one iteration of
Bellman-Ford,

83
00:06:22,000 --> 00:06:25,000
for comparison.
We definitely need at least,

84
00:06:25,000 --> 00:06:27,000
like, V^2 time,
because the size of the output

85
00:06:27,000 --> 00:06:32,000
is V^2 shortest path weights we
have to compute.

86
00:06:32,000 --> 00:06:37,000
So, this is not perfect,
but pretty good.

87
00:06:37,000 --> 00:06:41,000
And we are not going to improve
on that.

88
00:06:41,000 --> 00:06:49,000
So, nonnegative edge weights:
the natural thing to do is to

89
00:06:49,000 --> 00:06:54,000
run Dijkstra V times,
OK, no big surprise.

90
00:06:54,000 --> 00:07:01,000
And the running time of that
is, well, V times E again,

91
00:07:01,000 --> 00:07:08,000
plus V^2 log V,
which is also not too bad.

92
00:07:08,000 --> 00:07:10,000
I mean, it's basically the same
as running BFS.

93
00:07:10,000 --> 00:07:12,000
And then, there's the log
factor.

94
00:07:12,000 --> 00:07:16,000
If you ignore the log factor,
this is the dominant term.

95
00:07:16,000 --> 00:07:18,000
And, I mean,
this had an added V^2 as

96
00:07:18,000 --> 00:07:20,000
well.
So, these are both pretty good.

97
00:07:20,000 --> 00:07:22,000
I mean, this is kind of neat.
Essentially,

98
00:07:22,000 --> 00:07:26,000
the time it takes to run one
Bellman-Ford plus a log factor,

99
00:07:26,000 --> 00:07:29,000
you can compute all-pairs
shortest paths if you have

100
00:07:29,000 --> 00:07:35,000
nonnegative edge weights.
So, I mean, comparing all pairs

101
00:07:35,000 --> 00:07:39,000
to single source,
this seems a lot better,

102
00:07:39,000 --> 00:07:45,000
except we can only handle
nonnegative edge weights.
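As a rough sketch of the "run Dijkstra V times" idea, the following Python assumes an adjacency-list input `adj` (a hypothetical representation, not one fixed by the lecture) and nonnegative weights. It uses the standard-library binary heap `heapq`, so each run costs about E log V rather than the V log V + E quoted for a good heap structure.

```python
import heapq
from math import inf


def dijkstra(adj, source):
    """Shortest path weights from source; adj[u] is a list of (v, w) with w >= 0."""
    dist = [inf] * len(adj)
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def all_pairs_dijkstra(adj):
    """Run the single-source algorithm once per vertex; row s holds delta(s, .)."""
    return [dijkstra(adj, s) for s in range(len(adj))]


# Hypothetical 4-vertex example graph (vertices numbered from 0).
adj = [
    [(1, 4), (2, 1)],  # edges out of vertex 0
    [(3, 1)],          # edges out of vertex 1
    [(1, 2)],          # edges out of vertex 2
    [],                # vertex 3 has no outgoing edges
]
delta = all_pairs_dijkstra(adj)
```

Unreachable pairs stay at infinity, which matches the convention used for the shortest path matrix.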

103
00:07:45,000 --> 00:07:49,000
OK, so now let's think about
the general case.

104
00:07:49,000 --> 00:07:55,000
Well, this is the focus of
today, and here's where we can

105
00:07:55,000 --> 00:08:02,000
actually make an improvement.
So the obvious thing is V times

106
00:08:02,000 --> 00:08:08,000
Bellman-Ford,
which would cost V^2 times E.

107
00:08:08,000 --> 00:08:11,000
And that's pretty pitiful,
and we're going to try to

108
00:08:11,000 --> 00:08:15,000
improve that to something closer
to that nonnegative edge weight

109
00:08:15,000 --> 00:08:17,000
bound.
So it turns out,

110
00:08:17,000 --> 00:08:21,000
here, we can actually make an
improvement whereas in these

111
00:08:21,000 --> 00:08:24,000
special cases,
we really can't do much better.

112
00:08:24,000 --> 00:08:26,000
OK, I don't have a good
intuition why,

113
00:08:26,000 --> 00:08:30,000
but it's the case.
So, we'll cover something like

114
00:08:30,000 --> 00:08:34,000
three algorithms today for this
problem.

115
00:08:34,000 --> 00:08:37,000
The last one will be the best,
but along the way we'll see

116
00:08:37,000 --> 00:08:40,000
some nice connections between
shortest paths and dynamic

117
00:08:40,000 --> 00:08:42,000
programming, which we haven't
really seen yet.

118
00:08:42,000 --> 00:08:46,000
We've seen shortest path,
and applying greedy algorithms

119
00:08:46,000 --> 00:08:49,000
to it, but today will actually
do dynamic programming.

120
00:08:49,000 --> 00:08:51,000
The intuition is that with
all-pairs shortest paths,

121
00:08:51,000 --> 00:08:54,000
there's more potential
subproblem reuse.

122
00:08:54,000 --> 00:08:57,000
We've got to compute the
shortest path from x to y for

123
00:08:57,000 --> 00:08:59,000
all x and y.
Maybe we can reuse those

124
00:08:59,000 --> 00:09:03,000
shortest paths in computing
other shortest paths.

125
00:09:03,000 --> 00:09:07,000
OK, there's a bit more
reusability, let's say.

126
00:09:07,000 --> 00:09:12,000
OK, let me quickly define
all-pairs shortest paths formally,

127
00:09:12,000 --> 00:09:17,000
because we're going to change
our notation slightly.

128
00:09:17,000 --> 00:09:20,000
It's because we care about all
pairs.

129
00:09:20,000 --> 00:09:24,000
So, as usual,
the input is directed graph,

130
00:09:24,000 --> 00:09:29,000
so, vertices and edges.
We're going to say that the

131
00:09:29,000 --> 00:09:35,000
vertices are labeled one to n
for convenience because with all

132
00:09:35,000 --> 00:09:42,000
pairs, we're going to think of
things more as an n by n matrix

133
00:09:42,000 --> 00:09:48,000
instead of edges in some sense
because it doesn't help to think

134
00:09:48,000 --> 00:09:51,000
any more in terms of adjacency
lists.

135
00:09:51,000 --> 00:09:55,000
And, you have edge weights as
usual.

136
00:09:55,000 --> 00:10:00,000
This is what makes it
interesting.

137
00:10:00,000 --> 00:10:05,000
Some of them are going to be
negative.

138
00:10:05,000 --> 00:10:13,000
So, w maps to every real
number, and the target output is

139
00:10:13,000 --> 00:10:20,000
a shortest path matrix.
So, this is now an n by n

140
00:10:20,000 --> 00:10:25,000
matrix.
So, n is just the number of

141
00:10:25,000 --> 00:10:32,000
vertices of shortest path
weights.

142
00:10:32,000 --> 00:10:37,000
So, delta of i,
j is the shortest path weight

143
00:10:37,000 --> 00:10:42,000
from i to j for all pairs of
vertices.

144
00:10:42,000 --> 00:10:50,000
So this, you could represent as
an n by n matrix in particular.

145
00:10:50,000 --> 00:10:57,000
OK, so now let's start doing
algorithms.

146
00:10:57,000 --> 00:11:02,000
So, we have this very simple
algorithm, V times Bellman-Ford,

147
00:11:02,000 --> 00:11:06,000
V^2 times E,
and just for comparison's sake,

148
00:11:06,000 --> 00:11:09,000
I'm going to say,
let me rewrite that,

149
00:11:09,000 --> 00:11:14,000
V times Bellman-Ford gives us
this running time of V^2 E,

150
00:11:14,000 --> 00:11:18,000
and I'm going to think about
the case where,

151
00:11:18,000 --> 00:11:23,000
let's just say the graph is
dense, meaning that the number

152
00:11:23,000 --> 00:11:29,000
of edges is quadratic
in the number of vertices.

153
00:11:29,000 --> 00:11:33,000
So in that case,
this will take V^4 time,

154
00:11:33,000 --> 00:11:37,000
which is pretty slow.
We'd like to do better.

155
00:11:37,000 --> 00:11:43,000
So, first goal would just be to
beat V^4, V hypercubed,

156
00:11:43,000 --> 00:11:46,000
I guess.
OK, and we are going to use

157
00:11:46,000 --> 00:11:52,000
dynamic programming to do that.
Or at least that's what the

158
00:11:52,000 --> 00:11:58,000
motivation will come from.
It will take us a while before

159
00:11:58,000 --> 00:12:03,000
we can even beat V^4,
which is maybe a bit pathetic,

160
00:12:03,000 --> 00:12:10,000
but it takes some clever
insights, let's say.

161
00:12:10,000 --> 00:12:19,000
OK, so I'm going to introduce a
bit more notation for this

162
00:12:19,000 --> 00:12:25,000
graph.
So, I'm going to think about

163
00:12:25,000 --> 00:12:33,000
the weighted adjacency matrix.
So, I don't think we've really

164
00:12:33,000 --> 00:12:37,000
seen this in lecture before,
although I think it's in the

165
00:12:37,000 --> 00:12:39,000
appendix.
What that means,

166
00:12:39,000 --> 00:12:44,000
so normally adjacency matrix is
like one if there's an edge,

167
00:12:44,000 --> 00:12:47,000
and zero if there isn't.
And this is in a digraph,

168
00:12:47,000 --> 00:12:50,000
so you have to be a little bit
careful.

169
00:12:50,000 --> 00:12:54,000
Here, these values,
the entries in the matrix,

170
00:12:54,000 --> 00:12:57,000
are going to be the weights of
the edges.

171
00:12:57,000 --> 00:13:01,000
OK, a_ij is w(i, j) if (i, j) is an
edge.

172
00:13:01,000 --> 00:13:04,000
So, if ij is an edge in the
graph, and it's going to be

173
00:13:04,000 --> 00:13:08,000
infinity if there is no edge.
OK, in terms of shortest paths,

174
00:13:08,000 --> 00:13:12,000
this is a more useful way to
represent the graph.

175
00:13:12,000 --> 00:13:16,000
All right, and so this includes
everything that we need from

176
00:13:16,000 --> 00:13:18,000
here.
And now we just have to think

177
00:13:18,000 --> 00:13:21,000
about it as a matrix.
Matrices will be a useful tool

178
00:13:21,000 --> 00:13:25,000
in a little while.
OK, so now I'm going to define

179
00:13:25,000 --> 00:13:28,000
some sub problems.
And, there's different ways

180
00:13:28,000 --> 00:13:32,000
that you could define what's
going on in the shortest paths

181
00:13:32,000 --> 00:13:35,000
problem.
OK, the natural thing is I want

182
00:13:35,000 --> 00:13:39,000
to go from vertex i to vertex j.
What's the shortest path?

183
00:13:39,000 --> 00:13:42,000
OK, we need to refine the sub
problems a little bit more than

184
00:13:42,000 --> 00:13:43,000
that.
Not surprising.

185
00:13:43,000 --> 00:13:46,000
And if you think about my
analogy to Bellman-Ford,

186
00:13:46,000 --> 00:13:50,000
what Bellman-Ford does is it
tries to build longer and longer

187
00:13:50,000 --> 00:13:52,000
shortest paths.
But here, length is in terms of

188
00:13:52,000 --> 00:13:55,000
the number of edges.
So, first, it builds shortest

189
00:13:55,000 --> 00:13:58,000
paths of length one.
We've proven the first round it

190
00:13:58,000 --> 00:14:01,000
does that.
The second round,

191
00:14:01,000 --> 00:14:06,000
it provides all shortest paths
of length two,

192
00:14:06,000 --> 00:14:08,000
of count two,
and so on.

193
00:14:08,000 --> 00:14:14,000
We'd like to do that sort of
analogously, and try to reuse

194
00:14:14,000 --> 00:14:20,000
things a little bit more.
So, I'm going to say d_ij^(m)

195
00:14:20,000 --> 00:14:26,000
is the weight of the shortest
path from i to j with some

196
00:14:26,000 --> 00:14:33,000
restriction involving m.
So: shortest path from i to j

197
00:14:33,000 --> 00:14:36,000
using at most m edges.
OK, for example,

198
00:14:36,000 --> 00:14:41,000
if m is zero,
then we don't have to really

199
00:14:41,000 --> 00:14:47,000
think very hard to find all
shortest paths of length zero.

200
00:14:47,000 --> 00:14:50,000
OK, they use zero edges,
I should say.

201
00:14:50,000 --> 00:14:57,000
So, Bellman-Ford sort of tells
us how to go from m to m plus

202
00:14:57,000 --> 00:15:02,000
one.
So, let's just figure that out.

203
00:15:02,000 --> 00:15:05,000
So one thing we know from the
Bellman-Ford analysis is if we

204
00:15:05,000 --> 00:15:08,000
look at d_ij^(m-1),
we know that in some sense the

205
00:15:08,000 --> 00:15:12,000
longest shortest path of
relevance, unless you have

206
00:15:12,000 --> 00:15:15,000
negative weight cycle,
the longest shortest path of

207
00:15:15,000 --> 00:15:19,000
relevance is when m equals n
minus one because that's the

208
00:15:19,000 --> 00:15:21,000
longest simple path you can
have.

209
00:15:21,000 --> 00:15:24,000
So, this should be a shortest
path weight from i to j,

210
00:15:24,000 --> 00:15:28,000
and it would be no matter what
larger value you put in the

211
00:15:28,000 --> 00:15:32,000
superscript.
This should be delta of i comma

212
00:15:32,000 --> 00:15:35,000
j if there's no negative weight
cycles.

213
00:15:35,000 --> 00:15:38,000
OK, so this feels good for
dynamic programming.

214
00:15:38,000 --> 00:15:43,000
This will give us the answer if
we can compute this for all m.

215
00:15:43,000 --> 00:15:47,000
Then we'll have the shortest
path weights in particular.

216
00:15:47,000 --> 00:15:50,000
We need a way to detect
negative weight cycles,

217
00:15:50,000 --> 00:15:54,000
but let's not worry about that
too much for now.

218
00:15:54,000 --> 00:15:58,000
There are negative weights,
but let's just assume for now

219
00:15:58,000 --> 00:16:02,000
there's no negative weight
cycles.

220
00:16:02,000 --> 00:16:06,000
OK, and we get a recursion
recurrence.

221
00:16:06,000 --> 00:16:10,000
And the base case is when m
equals zero.

222
00:16:10,000 --> 00:16:16,000
This is pretty easy.
If they're the same vertex,

223
00:16:16,000 --> 00:16:22,000
the weight is zero,
and otherwise it's infinity.

224
00:16:22,000 --> 00:16:28,000
OK, and then the actual
recursion is for m.

225
00:16:57,000 --> 00:17:00,000
OK, if I got this right,
this is a pretty easy,

226
00:17:00,000 --> 00:17:05,000
intuitive recursion for
d_ij^(m) is a min of smaller

227
00:17:05,000 --> 00:17:10,000
things in terms of m minus one.
I'll just show the picture,

228
00:17:10,000 --> 00:17:14,000
and then the proof of that
claim should be obvious.

229
00:17:14,000 --> 00:17:19,000
So, this is proof by picture.
So, we have on the one hand,

230
00:17:19,000 --> 00:17:22,000
i over here,
and j over here.

231
00:17:22,000 --> 00:17:25,000
We want to know the shortest
path from i to j.

232
00:17:25,000 --> 00:17:30,000
And, we want to use,
at most, m edges.

233
00:17:30,000 --> 00:17:34,000
So, the idea is,
well, you could use m minus one

234
00:17:34,000 --> 00:17:39,000
edges to get somewhere.
So this is, at most,

235
00:17:39,000 --> 00:17:42,000
m minus one edges,
some other place,

236
00:17:42,000 --> 00:17:48,000
and we'll call it k.
So this is a candidate for k.

237
00:17:48,000 --> 00:17:53,000
And then you could take the
edge directly from k to j.

238
00:17:53,000 --> 00:18:00,000
So, this costs a_kj,
and this costs d_ik^(m-1).
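The recurrence written on the board is not captured in the captions. Reconstructed from the spoken description (base case at m = 0, then a min over intermediate vertices k of the two costs just named), it presumably reads as follows, with a_jj = 0 covering the case where the last edge is not needed:

```latex
d_{ij}^{(0)} =
  \begin{cases}
    0      & \text{if } i = j, \\
    \infty & \text{if } i \neq j,
  \end{cases}
\qquad
d_{ij}^{(m)} = \min_{k} \left( d_{ik}^{(m-1)} + a_{kj} \right)
  \quad \text{for } m = 1, 2, \dots, n-1 .
```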

239
00:18:00,000 --> 00:18:02,000
OK, and that's a candidate path
that uses,

240
00:18:02,000 --> 00:18:06,000
at most, m edges from i to j.
And this is essentially just

241
00:18:06,000 --> 00:18:08,000
considering all of them.
OK, so there's sort of many

242
00:18:08,000 --> 00:18:11,000
paths we are considering.
All of these are candidate

243
00:18:11,000 --> 00:18:14,000
values of k.
We are taking the min over all

244
00:18:14,000 --> 00:18:16,000
k as intermediate nodes,
whatever.

245
00:18:16,000 --> 00:18:18,000
So there they are.
We take the best such path.

246
00:18:18,000 --> 00:18:20,000
That should encompass all
shortest paths.

247
00:18:20,000 --> 00:18:24,000
And this is essentially sort of
what Bellman-Ford is doing,

248
00:18:24,000 --> 00:18:26,000
although not exactly.
We also sort of want to think

249
00:18:26,000 --> 00:18:29,000
about, well, what if I just go
directly with,

250
00:18:29,000 --> 00:18:34,000
say, m minus one edges?
What if there is no edge here

251
00:18:34,000 --> 00:18:36,000
that I want to use,
in some sense?

252
00:18:36,000 --> 00:18:40,000
Well, we always think about
there being, and the way the A's

253
00:18:40,000 --> 00:18:45,000
are defined, there's always this
zero weight edge to yourself.

254
00:18:45,000 --> 00:18:48,000
So, you could just take a path
that's shorter,

255
00:18:48,000 --> 00:18:51,000
go from i to j,
and j is a particular value of

256
00:18:51,000 --> 00:18:55,000
k that we might consider,
and then take a zero weight

257
00:18:55,000 --> 00:19:00,000
edge at the end, a_jj.
OK, so this really encompasses

258
00:19:00,000 --> 00:19:03,000
everything.
So that's a pretty trivial

259
00:19:03,000 --> 00:19:06,000
claim.
OK, now once we have such a

260
00:19:06,000 --> 00:19:08,000
recursion, we get a dynamic
program.

261
00:19:08,000 --> 00:19:11,000
I mean, there,
this is it in some sense.

262
00:19:11,000 --> 00:19:15,000
It's written recursively.
You can write it bottom up.

263
00:19:15,000 --> 00:19:19,000
And I would like to write it
bottom up a little bit because

264
00:19:19,000 --> 00:19:23,000
while it doesn't look like it,
this is a relaxation.

265
00:19:23,000 --> 00:19:26,000
This is yet another relaxation
algorithm.

266
00:19:26,000 --> 00:19:29,000
So, I'll give you,
so, this is sort of the

267
00:19:29,000 --> 00:19:31,000
algorithm.
This is not a very interesting

268
00:19:31,000 --> 00:19:35,000
algorithm.
So, you don't have to write it

269
00:19:35,000 --> 00:19:38,000
all down if you don't feel like
it.

270
00:19:38,000 --> 00:19:40,000
It's probably not even in the
book.

271
00:19:40,000 --> 00:19:42,000
This is just an intermediate
step.

272
00:19:42,000 --> 00:19:45,000
So, we loop over all m.
That's sort of the outermost

273
00:19:45,000 --> 00:19:48,000
thing to do.
I want to build longer and

274
00:19:48,000 --> 00:19:51,000
longer paths,
and this vaguely corresponds to

275
00:19:51,000 --> 00:19:53,000
Bellman-Ford,
although it's actually worse

276
00:19:53,000 --> 00:19:56,000
than Bellman-Ford.
But hey, what the heck?

277
00:19:56,000 --> 00:20:03,000
It's a stepping stone.
OK, then for all i and j,

278
00:20:03,000 --> 00:20:10,000
and then we want to compute
this min.

279
00:20:10,000 --> 00:20:17,000
So, we'll just loop over all k,
and relax.

280
00:20:17,000 --> 00:20:26,000
And, here's where we're
actually computing the min.

281
00:20:26,000 --> 00:20:35,000
And, it's a relaxation,
is the point.

282
00:20:35,000 --> 00:20:38,000
This is our good friend,
the relaxation step,

283
00:20:38,000 --> 00:20:40,000
relaxing edge.
Well, it's not,

284
00:20:40,000 --> 00:20:42,000
yeah.
I guess we're relaxing edge kj,

285
00:20:42,000 --> 00:20:45,000
or something,
except we don't have the same

286
00:20:45,000 --> 00:20:48,000
clear notion.
I mean, it's a particular thing

287
00:20:48,000 --> 00:20:52,000
that we're relaxing.
It's not just a single edge

288
00:20:52,000 --> 00:20:55,000
because we don't have a single
source anymore.

289
00:20:55,000 --> 00:20:59,000
It's now relative to source i,
we are relaxing the edge kj,

290
00:20:59,000 --> 00:21:03,000
something like that.
But this is clearly a

291
00:21:03,000 --> 00:21:05,000
relaxation.
We are just making the triangle

292
00:21:05,000 --> 00:21:08,000
inequality true if it wasn't
before.

293
00:21:08,000 --> 00:21:11,000
The triangle inequality has got
to hold between all pairs.

294
00:21:11,000 --> 00:21:14,000
And that's just implementing
this min, right?

295
00:21:14,000 --> 00:21:17,000
You're taking d_ij.
You take the min of what it was

296
00:21:17,000 --> 00:21:19,000
before in some sense.
That was one of the

297
00:21:19,000 --> 00:21:23,000
possibilities we considered when
we looked at the zero weight

298
00:21:23,000 --> 00:21:24,000
edge.
We say, well,

299
00:21:24,000 --> 00:21:28,000
or you could go from i to some
k in some way that we knew how

300
00:21:28,000 --> 00:21:32,000
to before, and then add on the
edge, and check whether that's

301
00:21:32,000 --> 00:21:35,000
better. If it's better,
set our current estimate to

302
00:21:35,000 --> 00:21:38,000
that.
And, you do this for all k.

303
00:21:38,000 --> 00:21:40,000
In particular,
you might actually compute

304
00:21:40,000 --> 00:21:43,000
something smaller than this min
because I didn't put

305
00:21:43,000 --> 00:21:46,000
superscripts up here.
But that's just making paths

306
00:21:46,000 --> 00:21:49,000
even better.
OK, so you have to argue that

307
00:21:49,000 --> 00:21:51,000
relaxation is always a good
thing to do.

308
00:21:51,000 --> 00:21:53,000
So, by not putting
superscripts,

309
00:21:53,000 --> 00:21:56,000
maybe I do some more
relaxation, but more relaxation

310
00:21:56,000 --> 00:21:59,000
never hurts us.
You can still argue correctness

311
00:21:59,000 --> 00:22:03,000
using this claim.
So, it's not quite the direct

312
00:22:03,000 --> 00:22:05,000
implementation,
but there you go,

313
00:22:05,000 --> 00:22:10,000
dynamic programming algorithm.
The main reason I'll write it

314
00:22:10,000 --> 00:22:14,000
down: so you see that it's a
relaxation, and you see the

315
00:22:14,000 --> 00:22:18,000
running time is n^4,
OK, which is certainly no

316
00:22:18,000 --> 00:22:22,000
better than Bellman-Ford.
Bellman-Ford was n^4 even in

317
00:22:22,000 --> 00:22:26,000
the dense case,
and it's a little better in the

318
00:22:26,000 --> 00:22:30,000
sparse case.
So: not doing so great.
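A minimal Python sketch of the loop structure just described, assuming the graph is given as a weighted adjacency matrix `A` with zeros on the diagonal, infinity for missing edges, and no negative-weight cycles. The function name and the 0-based vertex numbering are illustrative choices, not the lecture's. As in the lecture's version, `d` carries no superscript, so a round may relax more than the recurrence strictly requires.

```python
from math import inf


def apsp_slow_dp(A):
    """All-pairs shortest path weights by n - 1 rounds of relaxation: four nested loops."""
    n = len(A)
    # Base case (m = 0): zero from a vertex to itself, infinity otherwise.
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for m in range(1, n):              # build paths using at most m edges
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    # Relaxation: relative to source i, relax edge (k, j).
                    if d[i][j] > d[i][k] + A[k][j]:
                        d[i][j] = d[i][k] + A[k][j]
    return d


# Hypothetical example with one negative edge (1 -> 2) and no negative cycle.
A = [
    [0,   3,   2,   inf],
    [inf, 0,   -2,  inf],
    [inf, inf, 0,   1],
    [4,   inf, inf, 0],
]
delta = apsp_slow_dp(A)
```

Detecting negative-weight cycles is left out here, matching the assumption made in the lecture at this point.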

319
00:22:30,000 --> 00:22:34,000
But it's a start.
OK, it gets our dynamic

320
00:22:34,000 --> 00:22:41,000
programming minds thinking.
And, we'll get a better dynamic

321
00:22:41,000 --> 00:22:47,000
program in a moment.
But first, there's actually

322
00:22:47,000 --> 00:22:52,000
something useful we can do with
this formulation,

323
00:22:52,000 --> 00:22:59,000
and I guess I'll ask,
but I'll be really impressed if

324
00:22:59,000 --> 00:23:04,000
anyone can see.
Does this formula look like

325
00:23:04,000 --> 00:23:09,000
anything else that you've seen
in any context,

326
00:23:09,000 --> 00:23:15,000
mathematical or algorithmic?
Have you seen that recurrence

327
00:23:15,000 --> 00:23:20,000
anywhere else?
OK, not exactly as stated,

328
00:23:20,000 --> 00:23:24,000
but similar.
I'm sure if you thought about

329
00:23:24,000 --> 00:23:30,000
it for awhile,
you could come up with it.

330
00:23:30,000 --> 00:23:33,000
Any answers?
I didn't think it would be

331
00:23:33,000 --> 00:23:36,000
very intuitive,
but the answer is matrix

332
00:23:36,000 --> 00:23:39,000
multiplication.
And it may now be obvious to

333
00:23:39,000 --> 00:23:43,000
you, or it may not.
You have to think with the

334
00:23:43,000 --> 00:23:47,000
right quirky mind.
Then it's obvious that it's

335
00:23:47,000 --> 00:23:50,000
matrix multiplication.
Remember, matrix

336
00:23:50,000 --> 00:23:52,000
multiplication,
we have A, B,

337
00:23:52,000 --> 00:23:55,000
and C.
They're all n by n matrices.

338
00:23:55,000 --> 00:24:00,000
And, we want to compute C
equals A times B.

339
00:24:00,000 --> 00:24:04,000
And what that meant was,
well, c_ij was a sum over all k

340
00:24:04,000 --> 00:24:08,000
of a_ik times b_kj.
All right, that was our

341
00:24:08,000 --> 00:24:11,000
definition of matrix
multiplication.

342
00:24:11,000 --> 00:24:15,000
And that formula looks kind of
like this one.

343
00:24:15,000 --> 00:24:19,000
I mean, notice the subscripts:
ik and kj.

344
00:24:19,000 --> 00:24:22,000
Now, the operators are a little
different.

345
00:24:22,000 --> 00:24:27,000
Here, we're multiplying the
inside things and adding them

346
00:24:27,000 --> 00:24:34,000
all together.
There, we're adding the inside

347
00:24:34,000 --> 00:24:41,000
things and taking the min.
But other than that,

348
00:24:41,000 --> 00:24:47,000
it's the same.
OK, weird, but here we go.

349
00:24:47,000 --> 00:24:55,000
So, the connection to shortest
paths is you replace these

350
00:24:55,000 --> 00:25:00,000
operators.
So, let's take matrix

351
00:25:00,000 --> 00:25:05,000
multiplication and replace,
what should I do first,

352
00:25:05,000 --> 00:25:10,000
plus this thing with min.
So, why not just change the

353
00:25:10,000 --> 00:25:13,000
operators, replace dot with
plus?

354
00:25:13,000 --> 00:25:18,000
This is just a different
algebra to work in,

355
00:25:18,000 --> 00:25:23,000
where plus actually means min,
and dot actually means plus.

356
00:25:23,000 --> 00:25:29,000
So, you have to check that
things sort of work out in that

357
00:25:29,000 --> 00:25:35,000
context, but if we do that,
then we get that c_ij is the

358
00:25:35,000 --> 00:25:39,000
min over all k of a_ik plus,
a bit messy here,

359
00:25:39,000 --> 00:25:44,000
b_kj.
And that looks like what we

360
00:25:44,000 --> 00:25:49,000
actually want to compute,
here, for one value of m,

361
00:25:49,000 --> 00:25:52,000
you have to sort of do this m
times.

362
00:25:52,000 --> 00:25:56,000
But this conceptually is
d_ij^(m), and this is

363
00:25:56,000 --> 00:25:59,000
d_ik^(m-1).
So, this is looking like a

364
00:25:59,000 --> 00:26:04,000
matrix product,
which is kind of cool.
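The operator swap can be sketched in Python as an ordinary triple-loop matrix product with min in place of the sum and plus in place of the multiplication. The name `min_plus_product` and the example matrix are illustrative assumptions. Multiplying the current distance matrix by the weighted adjacency matrix `A` performs one step of the recurrence.

```python
from math import inf


def min_plus_product(X, Y):
    """C[i][j] = min over k of X[i][k] + Y[k][j], for n-by-n matrices."""
    n = len(X)
    return [
        [min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical weighted adjacency matrix: 0 on the diagonal, inf where no edge.
A = [
    [0,   3,   2,   inf],
    [inf, 0,   -2,  inf],
    [inf, inf, 0,   1],
    [4,   inf, inf, 0],
]

D1 = A                          # paths using at most 1 edge
D2 = min_plus_product(D1, A)    # paths using at most 2 edges
D3 = min_plus_product(D2, A)    # paths using at most 3 = n - 1 edges
```

Each product costs n^3 operations when computed this way, so doing n - 1 of them in sequence is again n^4 overall.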

365
00:26:04,000 --> 00:26:11,000
So, if we sort of plug in this
claim, then, and think about

366
00:26:11,000 --> 00:26:17,000
things as matrices,
the recurrence gives us,

367
00:26:17,000 --> 00:26:25,000
and I'll just write this now in
matrix form, that d^(m) is

368
00:26:25,000 --> 00:26:30,000
d^(m-1), funny product,
A.

369
00:26:30,000 --> 00:26:32,000
All right, so these are the
weights.

370
00:26:32,000 --> 00:26:34,000
These were the weighted
adjacency matrix.

371
00:26:34,000 --> 00:26:38,000
This was the previous d value.
This is the new d value.

372
00:26:38,000 --> 00:26:41,000
So, I'll just rewrite that in
matrix form with capital

373
00:26:41,000 --> 00:26:43,000
letters.
OK, I have to circle things

374
00:26:43,000 --> 00:26:47,000
that are using this funny
algebra, so, in particular,

375
00:26:47,000 --> 00:26:49,000
circled product.
OK, so that's kind of nifty.

376
00:26:49,000 --> 00:26:52,000
We know something about
computing matrix

377
00:26:52,000 --> 00:26:54,000
multiplications.
We can do it in n^3 time.

378
00:26:54,000 --> 00:26:57,000
If we were a bit fancier,
maybe we could do it in

379
00:26:57,000 --> 00:27:02,000
sub-cubic time.
So, we could try to sort of use

380
00:27:02,000 --> 00:27:07,000
this connection.
And, well, think about what we

381
00:27:07,000 --> 00:27:10,000
are computing here.
We are saying,

382
00:27:10,000 --> 00:27:14,000
well, d to the m is the
previous one times A.

383
00:27:14,000 --> 00:27:19,000
So, what is d^(m)?
Is that some other algebraic

384
00:27:19,000 --> 00:27:23,000
notion that we know?
Yeah, it's the exponent.

385
00:27:23,000 --> 00:27:27,000
We're taking A,
and we want to raise it to the

386
00:27:27,000 --> 00:27:33,000
power, m, with this funny notion
of product.

387
00:27:33,000 --> 00:27:36,000
So, in other words,
d to the m is really just A to

388
00:27:36,000 --> 00:27:40,000
the m in a funny way.
So, I'll circle it,

389
00:27:40,000 --> 00:27:41,000
OK?
So, that sounds good.

390
00:27:41,000 --> 00:27:46,000
We also know how to compute
powers of things relatively

391
00:27:46,000 --> 00:27:50,000
quickly, if you remember how.
OK, for this notion,

392
00:27:50,000 --> 00:27:52,000
this power notion,
to make sense,

393
00:27:52,000 --> 00:27:55,000
I should say what A to the zero
means.

394
00:27:55,000 --> 00:28:00,000
And so, I need some kind of
identity matrix.

395
00:28:00,000 --> 00:28:02,000
And for here,
the identity matrix is this

396
00:28:02,000 --> 00:28:06,000
one, if I get it right.
So, it has zeros along the

397
00:28:06,000 --> 00:28:09,000
diagonal, and infinities
everywhere else.

398
00:28:09,000 --> 00:28:12,000
OK, that's sort of just to match
this definition.

399
00:28:12,000 --> 00:28:16,000
d_ij zero should be zeros on
the diagonals and infinity

400
00:28:16,000 --> 00:28:19,000
everywhere else.
But you can check this is

401
00:28:19,000 --> 00:28:23,000
actually an identity.
If you multiply it with this

402
00:28:23,000 --> 00:28:26,000
funny multiplication against any
other matrix,

403
00:28:26,000 --> 00:28:31,000
you get the matrix back.
Nothing changes.

404
00:28:31,000 --> 00:28:34,000
This really is a valid identity
matrix.
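
To make the funny product concrete, here is a minimal Python sketch of the min-plus product and its identity matrix. The function names and the 3-vertex example matrix are illustrative assumptions, not something from the lecture; the sketch assumes a_ii = 0 and represents a missing edge by infinity.

```python
import math

INF = math.inf  # stands for "no edge" / "no path"


def min_plus_product(X, Y):
    """Circled product: entry (i, j) is the min over k of X[i][k] + Y[k][j]."""
    n = len(X)
    return [[min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def min_plus_identity(n):
    """Zeros on the diagonal, infinity everywhere else."""
    return [[0 if i == j else INF for j in range(n)] for i in range(n)]


# Hypothetical weighted adjacency matrix on 3 vertices.
A = [[0, 3, INF],
     [INF, 0, -2],
     [1, INF, 0]]
I = min_plus_identity(3)

# Multiplying by the identity on either side gives the matrix back.
assert min_plus_product(I, A) == A
assert min_plus_product(A, I) == A
```

Here min plays the role of addition and plus plays the role of multiplication, so the triple loop has the same shape as an ordinary n^3 matrix multiply.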

405
00:28:34,000 --> 00:28:40,000
And, I should mention that for
A to the m to make sense,

406
00:28:40,000 --> 00:28:44,000
you really need that your
product operation is

407
00:28:44,000 --> 00:28:48,000
associative.
So, actually A to the m circled

408
00:28:48,000 --> 00:28:54,000
makes sense because circled
multiplication is associative,

409
00:28:54,000 --> 00:28:58,000
and you can check that;
not hard because,

410
00:28:58,000 --> 00:29:03,000
I mean, min is associative,
and addition is associative,

411
00:29:03,000 --> 00:29:10,000
and all sorts of good stuff.
And, you have some kind of

412
00:29:10,000 --> 00:29:14,000
distributivity property.
And, this is,

413
00:29:14,000 --> 00:29:18,000
in turn, because the real
numbers with,

414
00:29:18,000 --> 00:29:23,000
and let me get the right order here,
with min as your addition

415
00:29:23,000 --> 00:29:29,000
operation, and plus as your
multiplication operation is a

416
00:29:29,000 --> 00:29:34,000
closed semi-ring.
So, if ever you want to know

417
00:29:34,000 --> 00:29:37,000
when powers make sense,
this is a good rule.

418
00:29:37,000 --> 00:29:42,000
If you have a closed semi-ring,
then matrix products on that

419
00:29:42,000 --> 00:29:46,000
semi-ring will give you an
associative operator,

420
00:29:46,000 --> 00:29:49,000
and then, good,
you can take products.

421
00:29:49,000 --> 00:29:54,000
OK, that's just some formalism.
So now, we have some intuition.

422
00:29:54,000 --> 00:29:57,000
The question is,
what's the right

423
00:29:57,000 --> 00:30:00,000
algorithm?
There are many possible

424
00:30:00,000 --> 00:30:06,000
answers, some of which are
right, some of which are not.

425
00:30:06,000 --> 00:30:09,000
So, we have this connection to
matrix products,

426
00:30:09,000 --> 00:30:13,000
and we have a connection to
matrix powers.

427
00:30:13,000 --> 00:30:15,000
And, we have algorithms for
both.

428
00:30:15,000 --> 00:30:18,000
The question is,
what should we do?

429
00:30:18,000 --> 00:30:23,000
So, all we need to do now is to
compute A to the funny power,

430
00:30:23,000 --> 00:30:26,000
n minus one.
n minus one is when we get

431
00:30:26,000 --> 00:30:29,000
shortest paths,
assuming we have no negative

432
00:30:29,000 --> 00:30:34,000
weight cycles.
In fact, we could compute a

433
00:30:34,000 --> 00:30:39,000
larger power than n minus one.
Once you get beyond n minus

434
00:30:39,000 --> 00:30:43,000
one, multiplying by A doesn't
change anything anymore.

435
00:30:43,000 --> 00:30:47,000
So, how should we do it?
OK, you're not giving any smart

436
00:30:47,000 --> 00:30:50,000
answers.
I'll give the stupid answer.

437
00:30:50,000 --> 00:30:53,000
You could say,
well, I take A.

438
00:30:53,000 --> 00:30:56,000
I multiply it by A.
Then I multiply it by A,

439
00:30:56,000 --> 00:31:00,000
and I multiply it by A,
and I use normal,

440
00:31:00,000 --> 00:31:04,000
boring matrix
multiplication.

441
00:31:04,000 --> 00:31:07,000
So, I do, like,
n minus two,

442
00:31:07,000 --> 00:31:13,000
standard matrix multiplies.
So, standard multiply costs,

443
00:31:13,000 --> 00:31:17,000
like, n^3.
And I'm doing n of them.

444
00:31:17,000 --> 00:31:23,000
So, this gives me an n^4
algorithm, and compute all the

445
00:31:23,000 --> 00:31:26,000
shortest paths in n^4.
Woohoo!

446
00:31:26,000 --> 00:31:31,000
OK, no improvement.
So, how can I do better?

447
00:31:31,000 --> 00:31:36,000
Right, natural thing to try
which sadly does not work,

448
00:31:36,000 --> 00:31:40,000
is to use the sub cubic matrix
multiply algorithm.

449
00:31:40,000 --> 00:31:44,000
We will, in some sense,
get there in a moment with a

450
00:31:44,000 --> 00:31:48,000
somewhat simpler problem.
But, it's actually not known

451
00:31:48,000 --> 00:31:53,000
how to compute shortest paths
using fast matrix multiplication

452
00:31:53,000 --> 00:31:55,000
like Strassen's
algorithm.

453
00:31:55,000 --> 00:32:00,000
But, good suggestion.
OK, you have to think about why

454
00:32:00,000 --> 00:32:04,000
it doesn't work,
and I'll tell you.

455
00:32:04,000 --> 00:32:07,000
It's not obvious,
so it's a perfectly reasonable

456
00:32:07,000 --> 00:32:10,000
suggestion.
But in this context it doesn't

457
00:32:10,000 --> 00:32:12,000
quite work.
It will come up in a few

458
00:32:12,000 --> 00:32:14,000
moments.
The problem is,

459
00:32:14,000 --> 00:32:17,000
Strassen requires the notion of
subtraction.

460
00:32:17,000 --> 00:32:21,000
And here, addition is min.
And, there's no inverse to min.

461
00:32:21,000 --> 00:32:25,000
Once you take the min of the arguments,
you can't sort of undo a min.

462
00:32:25,000 --> 00:32:28,000
OK, so there's no notion of
subtraction, so it's not known

463
00:32:28,000 --> 00:32:32,000
how to pull that off,
sadly.

464
00:32:32,000 --> 00:32:35,000
So, what other tricks do we
have up our sleeve?

465
00:32:35,000 --> 00:32:37,000
Yeah?
Divide and conquer,

466
00:32:37,000 --> 00:32:41,000
log n powering,
yeah, repeated squaring.

467
00:32:41,000 --> 00:32:44,000
That works.
Good, we had a fancy way.

468
00:32:44,000 --> 00:32:47,000
If you had a number n,
you sort of looked at the

469
00:32:47,000 --> 00:32:52,000
binary number representation of
n, and you either squared the

470
00:32:52,000 --> 00:32:57,000
number or squared it and added
another factor of A.

471
00:32:57,000 --> 00:33:02,000
Here, we don't even have to be
smart about it.

472
00:33:02,000 --> 00:33:07,000
OK, we can just compute,
we really only have to think

473
00:33:07,000 --> 00:33:11,000
about powers of two.
What we want to know,

474
00:33:11,000 --> 00:33:17,000
and I'm going to need a bigger
font here because there's

475
00:33:17,000 --> 00:33:22,000
multiple levels of subscripts,
A to the circled power,

476
00:33:22,000 --> 00:33:28,000
two to the ceiling of log n.
Actually, n minus one would be

477
00:33:28,000 --> 00:33:32,000
enough.
But there you go.

478
00:33:32,000 --> 00:33:35,000
You can write n if you didn't
leave yourself enough space like

479
00:33:35,000 --> 00:33:37,000
me, n the ceiling,
n the circle.

480
00:33:37,000 --> 00:33:41,000
This just means the next power
of two after n minus one,

481
00:33:41,000 --> 00:33:44,000
two to the ceiling log.
So, we don't have to go

482
00:33:44,000 --> 00:33:47,000
directly to n minus one.
We can go further because

483
00:33:47,000 --> 00:33:51,000
anything farther than n minus
one is still just the shortest

484
00:33:51,000 --> 00:33:53,000
paths.
If you look at the definition,

485
00:33:53,000 --> 00:33:57,000
and you know that your paths
are simple, which is true if you

486
00:33:57,000 --> 00:34:02,000
have no negative weight cycles,
then fine, just go farther.

487
00:34:02,000 --> 00:34:04,000
Why not?
And so, to compute this,

488
00:34:04,000 --> 00:34:09,000
we just do ceiling of log n
minus one products,

489
00:34:09,000 --> 00:34:13,000
just take A squared,
and then take the result and

490
00:34:13,000 --> 00:34:17,000
square it; take the result and
square it.

491
00:34:17,000 --> 00:34:20,000
So, this is order log n
squares.

492
00:34:20,000 --> 00:34:25,000
And, we don't know how to use
Strassen, but we can use the

493
00:34:25,000 --> 00:34:30,000
boring, standard multiply of
n^3, and that gives us n^3 log n

494
00:34:30,000 --> 00:34:34,000
running time,
OK, which finally is something

495
00:34:34,000 --> 00:34:40,000
that beats Bellman-Ford in the
dense case.

496
00:34:40,000 --> 00:34:43,000
OK, in the dense case,
Bellman-Ford was n^4.

497
00:34:43,000 --> 00:34:46,000
Here we get n^3 log n,
finally something better.

498
00:34:46,000 --> 00:34:49,000
In the sparse case,
it's about the same,

499
00:34:49,000 --> 00:34:52,000
maybe a little worse.
E is order V.

500
00:34:52,000 --> 00:34:55,000
Then we're going to get,
like, V^3 for Bellman-Ford.

501
00:34:55,000 --> 00:34:59,000
Here, we get n^3 log n.
OK, after log factors,

502
00:34:59,000 --> 00:35:03,000
this is an improvement some of
the time.

503
00:35:03,000 --> 00:35:05,000
OK, it's about the same the
other times.

504
00:35:05,000 --> 00:35:09,000
Another nifty thing that you
get for free out of this,

505
00:35:09,000 --> 00:35:13,000
is you can detect negative
weight cycles.

506
00:35:13,000 --> 00:35:16,000
So, here's a bit of a puzzle.
How would I detect,

507
00:35:16,000 --> 00:35:21,000
after I compute this product,
A to the power two to the ceiling of log n

508
00:35:21,000 --> 00:35:25,000
minus one, how would I know if I
found a negative weight cycle?

509
00:35:25,000 --> 00:35:30,000
What would that mean in this
matrix of all the shortest

510
00:35:30,000 --> 00:35:34,000
paths of, at most,
a certain length?

511
00:35:34,000 --> 00:35:36,000
If I found a cycle,
what would have to be in that

512
00:35:36,000 --> 00:35:37,000
matrix?
Yeah?

513
00:35:37,000 --> 00:35:39,000
Right, so I could,
for example,

514
00:35:39,000 --> 00:35:41,000
take this thing,
multiply it by A,

515
00:35:41,000 --> 00:35:43,000
see if the matrix changed at
all.

516
00:35:43,000 --> 00:35:45,000
Right, that works fine.
That's what we do in

517
00:35:45,000 --> 00:35:48,000
Bellman-Ford.
It's an even simpler thing.

518
00:35:48,000 --> 00:35:51,000
It's already there.
You don't have to multiply.

519
00:35:51,000 --> 00:35:52,000
But that's the same running
time.

520
00:35:52,000 --> 00:35:55,000
That's a good answer.
The diagonal would have a

521
00:35:55,000 --> 00:35:56,000
negative value,
yeah.

522
00:35:56,000 --> 00:36:04,000
So, this is just a cute thing.
Both approaches would work,

523
00:36:04,000 --> 00:36:15,000
can detect a negative weight
cycle just by looking at the

524
00:36:15,000 --> 00:36:24,000
diagonal of the matrix.
You just look for a negative

525
00:36:24,000 --> 00:36:30,000
value in the diagonal.
OK.
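
The repeated-squaring idea and the diagonal test can be sketched together in Python. This is only a sketch under stated assumptions: the function names and the example graphs are hypothetical, vertices are numbered from 0, and the loop squares until the exponent is at least n (the "two to the ceiling of log n" power), which is what lets a negative cycle of up to n edges show up on the diagonal.

```python
import math

INF = math.inf  # stands for "no edge" / "no path"


def min_plus_product(X, Y):
    """Circled product: entry (i, j) is the min over k of X[i][k] + Y[k][j]."""
    n = len(X)
    return [[min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apsp_by_squaring(A):
    """Square A under the min-plus product until the exponent reaches n.

    Uses O(log n) products at n^3 each, so n^3 log n in total.
    """
    n = len(A)
    D = [row[:] for row in A]
    m = 1  # D currently equals A to the circled power m
    while m < n:
        D = min_plus_product(D, D)
        m *= 2
    return D


def has_negative_cycle(D):
    """A negative entry on the diagonal means some vertex reaches itself
    with negative total weight."""
    return any(D[i][i] < 0 for i in range(len(D)))


# Hypothetical 4-vertex graph with negative edges but no negative cycle.
A = [[0, 3, 8, INF],
     [INF, 0, INF, 1],
     [INF, 4, 0, INF],
     [2, INF, -5, 0]]
D = apsp_by_squaring(A)

# Hypothetical 2-vertex graph whose only cycle has weight 1 + (-2) = -1.
B = [[0, 1],
     [-2, 0]]
```

For the first graph the diagonal of `D` stays at zero, while `has_negative_cycle(apsp_by_squaring(B))` reports the cycle in the second.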

526
00:36:30,000 --> 00:36:32,000
So, that's algorithm one,
let's say.

527
00:36:32,000 --> 00:36:37,000
I mean, we've seen several that
are all bad, but I'll call this

528
00:36:37,000 --> 00:36:39,000
number one.
OK, we'll see two more.

529
00:36:39,000 --> 00:36:44,000
This is the only one that will,
well, I shouldn't say that.

530
00:36:44,000 --> 00:36:47,000
Fine, there we go.
So, this is one dynamic program

531
00:36:47,000 --> 00:36:51,000
that wasn't so helpful,
except it showed us a

532
00:36:51,000 --> 00:36:53,000
connection to matrix
multiplication,

533
00:36:53,000 --> 00:36:57,000
which is interesting.
We'll see why it's useful a

534
00:36:57,000 --> 00:37:02,000
little bit more.
But, it led to these nasty four

535
00:37:02,000 --> 00:37:04,000
nested loops.
And, using this trick,

536
00:37:04,000 --> 00:37:08,000
we got down to n^3 log n.
Let's try for just n^3.

537
00:37:08,000 --> 00:37:11,000
OK, just get rid of that log.
It's annoying.

538
00:37:11,000 --> 00:37:15,000
It makes you a little bit worse
than Bellman-Ford,

539
00:37:15,000 --> 00:37:18,000
in the sparse case.
So, let's just erase one of

540
00:37:18,000 --> 00:37:21,000
these nested loops.
OK, I want to do that.

541
00:37:21,000 --> 00:37:25,000
OK, obviously that algorithm
doesn't work because it refers

542
00:37:25,000 --> 00:37:28,000
to k, and it's not
defined, but,

543
00:37:28,000 --> 00:37:31,000
you know, I've got enough
variables.

544
00:37:31,000 --> 00:37:35,000
Why don't I just define k to
be m?

545
00:37:35,000 --> 00:37:39,000
OK, it turns out that works.
I'll do it from scratch,

546
00:37:39,000 --> 00:37:42,000
but why not?
I don't know if that's how

547
00:37:42,000 --> 00:37:47,000
Floyd and Warshall came up with
their algorithm,

548
00:37:47,000 --> 00:37:50,000
but here you go.
Here's Floyd-Warshall.

549
00:37:50,000 --> 00:37:55,000
The idea is to define the
subproblems a little bit more

550
00:37:55,000 --> 00:37:59,000
cleverly so that to compute one
of these values,

551
00:37:59,000 --> 00:38:04,000
you don't have to take the min
of n things.

552
00:38:04,000 --> 00:38:06,000
I just want to take the min of
two things.

553
00:38:06,000 --> 00:38:09,000
If I could do that,
and I still only have n^3

554
00:38:09,000 --> 00:38:12,000
subproblems, then I would have
n^3 time.

555
00:38:12,000 --> 00:38:14,000
So, all right,
the running time of dynamic

556
00:38:14,000 --> 00:38:18,000
program is number of subproblems
times the time to compute the

557
00:38:18,000 --> 00:38:22,000
recurrence for one subproblem.
So, here's linear times n^3,

558
00:38:22,000 --> 00:38:26,000
and we want n^3 times constant.
That would be good.

559
00:38:26,000 --> 00:38:29,000
So that's Floyd-Warshall.
So, here's the way we're going

560
00:38:29,000 --> 00:38:35,000
to redefine c_ij.
Or I guess, there it was called

561
00:38:35,000 --> 00:38:39,000
d_ij.
Good, so we're going to define

562
00:38:39,000 --> 00:38:43,000
something new.
So, c_ij superscript k is now

563
00:38:43,000 --> 00:38:50,000
going to be the weight of the
shortest path from i to j as

564
00:38:50,000 --> 00:38:54,000
before.
Notice I used the superscript k

565
00:38:54,000 --> 00:39:00,000
instead of m because I want k
and m to be the same thing.

566
00:39:00,000 --> 00:39:03,000
Deep.
OK, now, here's the new

567
00:39:03,000 --> 00:39:05,000
constraint.
I want all intermediate

568
00:39:05,000 --> 00:39:09,000
vertices along the path,
meaning all vertices except for

569
00:39:09,000 --> 00:39:13,000
i and j at the beginning and the
end to have a small label.

570
00:39:13,000 --> 00:39:17,000
So, they should be in the set
from one up to k.

571
00:39:17,000 --> 00:39:21,000
And this is where we are really
using that our vertices are

572
00:39:21,000 --> 00:39:24,000
labeled one up to n.
So, I'm going to say,

573
00:39:24,000 --> 00:39:28,000
well, first think about the
shortest paths that don't use

574
00:39:28,000 --> 00:39:32,000
any other vertices.
That's when k is zero.

575
00:39:32,000 --> 00:39:35,000
Then think about all the
shortest paths that maybe they

576
00:39:35,000 --> 00:39:38,000
use vertex one.
And then think about the

577
00:39:38,000 --> 00:39:41,000
shortest paths that maybe use
vertex one or vertex two.

578
00:39:41,000 --> 00:39:43,000
Why not?
You could define it in this

579
00:39:43,000 --> 00:39:44,000
way.
It turns out,

580
00:39:44,000 --> 00:39:48,000
that when you increase k,
you only have to think about

581
00:39:48,000 --> 00:39:51,000
one new vertex.
Here, we had to take min over

582
00:39:51,000 --> 00:39:53,000
all k.
Now we know which k to look at.

583
00:39:53,000 --> 00:39:57,000
OK, maybe that made sense.
Maybe it's not quite obvious

584
00:39:57,000 --> 00:39:59,000
yet.
But I'm going to redo this

585
00:39:59,000 --> 00:40:04,000
claim, redo a recurrence.
So, maybe first I should say

586
00:40:04,000 --> 00:40:07,000
some obvious things.
So, if I want delta of i, j,

587
00:40:07,000 --> 00:40:10,000
the shortest path weight,
well, just take all the

588
00:40:10,000 --> 00:40:13,000
vertices.
So, take c_ij superscript n.

589
00:40:13,000 --> 00:40:15,000
That's everything.
And this even works,

590
00:40:15,000 --> 00:40:19,000
this is true even if you have a
negative weight cycle.

591
00:40:19,000 --> 00:40:22,000
Although, again,
we're going to sort of ignore

592
00:40:22,000 --> 00:40:26,000
negative weight cycles as long
as we can detect them.

593
00:40:26,000 --> 00:40:29,000
And, another simple case is if
you have, well,

594
00:40:29,000 --> 00:40:35,000
c_ij to zero.
Let me put that in the claim to

595
00:40:35,000 --> 00:40:40,000
be a little bit more consistent
here.

596
00:40:40,000 --> 00:40:47,000
So, here's the new claim.
If we want to compute c_ij

597
00:40:47,000 --> 00:40:50,000
superscript zero,
what is it?

598
00:40:50,000 --> 00:40:58,000
Superscript zero means I really
shouldn't use any intermediate

599
00:40:58,000 --> 00:41:03,000
vertices.
So, this has a very simple

600
00:41:03,000 --> 00:41:09,000
answer, a three letter answer.
So, it's not zero.

601
00:41:09,000 --> 00:41:12,000
It's four letters.
What's that?

602
00:41:12,000 --> 00:41:15,000
Nil.
No, not working yet.

603
00:41:15,000 --> 00:41:18,000
It has some subscripts,
too.

604
00:41:18,000 --> 00:41:25,000
So, the definition would be,
what's the shortest path weight

605
00:41:25,000 --> 00:41:31,000
from i to j when you're not
allowed to use any intermediate

606
00:41:31,000 --> 00:41:34,000
vertices?
Sorry?

607
00:41:34,000 --> 00:41:38,000
So, yeah, it has a very simple
name.

608
00:41:38,000 --> 00:41:43,000
That's the tricky part.
All right, so if i equals j,

609
00:41:43,000 --> 00:41:48,000
[LAUGHTER] you're clever,
right, open bracket i equals j

610
00:41:48,000 --> 00:41:50,000
means one, well,
OK.

611
00:41:50,000 --> 00:41:54,000
It sort of works,
but it's not quite right.

612
00:41:54,000 --> 00:41:59,000
In fact, I want infinity if i
does not equal j.

613
00:41:59,000 --> 00:42:05,000
And I want zero if i equals
j, a_ij, good.

614
00:42:05,000 --> 00:42:07,000
I think it's a_ij.
It should be,

615
00:42:07,000 --> 00:42:09,000
right?
Maybe I'm wrong.

616
00:42:09,000 --> 00:42:12,000
Right, a_ij.
So it's essentially not what I

617
00:42:12,000 --> 00:42:13,000
said.
That's the point.

618
00:42:13,000 --> 00:42:17,000
If i does not equal j,
you still have to think about a

619
00:42:17,000 --> 00:42:20,000
single edge connecting i to j,
right?

620
00:42:20,000 --> 00:42:23,000
OK, so that's a bit of a
subtlety.

621
00:42:23,000 --> 00:42:27,000
This is only intermediate
vertices, so you could still go

622
00:42:27,000 --> 00:42:32,000
from i to j via a single edge.
That will cost a_ij

623
00:42:32,000 --> 00:42:34,000
if there is an edge, and
infinity

624
00:42:34,000 --> 00:42:37,000
if there isn't one.
That is a_ij.

625
00:42:37,000 --> 00:42:42,000
So, OK, that gets us started.
And then, we want a recurrence.

626
00:42:42,000 --> 00:42:46,000
And, the recurrence is,
well, maybe you get away with

627
00:42:46,000 --> 00:42:49,000
all the vertices that you had
before.

628
00:42:49,000 --> 00:42:52,000
So, if you want to know paths
that you had before,

629
00:42:52,000 --> 00:42:56,000
so if you want to know paths
that use one up to k,

630
00:42:56,000 --> 00:43:01,000
maybe I just use one up to k
minus one.

631
00:43:01,000 --> 00:43:04,000
You could try that.
Or, you could try using k.

632
00:43:04,000 --> 00:43:07,000
So, either you use k or you
don't.

633
00:43:07,000 --> 00:43:09,000
If you don't,
it's got to be this.

634
00:43:09,000 --> 00:43:12,000
If you do, then you've got to
go to k.

635
00:43:12,000 --> 00:43:17,000
So why not go to k at the end?
So, you go from i to k using

636
00:43:17,000 --> 00:43:21,000
the previous vertices.
Obviously, you don't want to

637
00:43:21,000 --> 00:43:24,000
repeat k in there.
And then, you go from k to j

638
00:43:24,000 --> 00:43:29,000
somehow using vertices that are
not k.

639
00:43:29,000 --> 00:43:31,000
This should be pretty
intuitive.

640
00:43:31,000 --> 00:43:35,000
Again, I can draw a picture.
So, either you never go to k,

641
00:43:35,000 --> 00:43:40,000
and that's this wiggly line.
You go from i to j using things

642
00:43:40,000 --> 00:43:43,000
only one up to k minus one.
In other words,

643
00:43:43,000 --> 00:43:45,000
here we have to use one up to
k.

644
00:43:45,000 --> 00:43:48,000
So, this just means don't use
k.

645
00:43:48,000 --> 00:43:52,000
So, that's this thing.
Or, you use k somewhere in the

646
00:43:52,000 --> 00:43:55,000
middle there.
OK, it's got to be one of the

647
00:43:55,000 --> 00:43:57,000
two.
And in this case,

648
00:43:57,000 --> 00:44:00,000
you go from i to k using only
smaller vertices,

649
00:44:00,000 --> 00:44:05,000
because you don't want to
repeat k.

650
00:44:05,000 --> 00:44:10,000
And here, you go from k to j
using only smaller labeled

651
00:44:10,000 --> 00:44:14,000
vertices.
So, every path is one of the

652
00:44:14,000 --> 00:44:18,000
two.
So, we take the shortest of

653
00:44:18,000 --> 00:44:22,000
these two subproblems.
That's the answer.

654
00:44:22,000 --> 00:44:26,000
So, now we have a min of two
things.

655
00:44:26,000 --> 00:44:29,000
It takes constant time to
compute.
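
Written out, the base case and the recurrence described above are as follows. The notation matches the c_ij superscript k of the lecture; typesetting it this way is a reconstruction from the spoken description rather than a copy of the board.

```latex
\begin{align*}
c_{ij}^{(0)} &= a_{ij}, \\
c_{ij}^{(k)} &= \min\bigl( c_{ij}^{(k-1)},\; c_{ik}^{(k-1)} + c_{kj}^{(k-1)} \bigr),
  \quad k = 1, \dots, n, \\
\delta(i, j) &= c_{ij}^{(n)}.
\end{align*}
```

There are n^3 values c_ij^(k), and each is a min of two things, which is where the cubic bound comes from.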

656
00:44:29,000 --> 00:44:36,000
So, we get a cubic algorithm.
So, let me write it down.

657
00:44:36,000 --> 00:44:41,000
So, this is the Floyd-Warshall
algorithm.

658
00:44:41,000 --> 00:44:46,000
I'll write the name again.
You give it a matrix A.

659
00:44:46,000 --> 00:44:50,000
That's all it really needs to
know.

660
00:44:50,000 --> 00:44:54,000
It encodes everything.
You copy A to C.

661
00:44:54,000 --> 00:44:58,000
That's the warm up.
Right at time zero,

662
00:44:58,000 --> 00:45:03,000
C equals A.
And then you just have these

663
00:45:03,000 --> 00:45:07,000
three loops for every value of
k, for every value of i,

664
00:45:07,000 --> 00:45:10,000
and for every value of j.
You compute that min.

665
00:45:10,000 --> 00:45:15,000
And if you think about it a
little bit, that min is a

666
00:45:15,000 --> 00:45:18,000
relaxation.
Surprise, surprise.

667
00:45:47,000 --> 00:45:51,000
So, that is the Floyd-Warshall
algorithm.
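
A minimal Python sketch of these three loops, under a few assumptions: the name `floyd_warshall` and the example matrix are illustrative, vertices are numbered from 0 rather than 1, infinity marks a missing edge, and the graph has no negative weight cycle.

```python
import math

INF = math.inf  # stands for "no edge" / "no path"


def floyd_warshall(A):
    """Return the matrix of shortest path weights for the weighted
    adjacency matrix A (A[i][i] == 0, INF where there is no edge)."""
    n = len(A)
    C = [row[:] for row in A]  # copy A to C: no intermediate vertices yet
    for k in range(n):          # now also allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                # The min of two things, written as a relaxation step.
                if C[i][k] + C[k][j] < C[i][j]:
                    C[i][j] = C[i][k] + C[k][j]
    return C


# Hypothetical 4-vertex graph with negative edges but no negative cycle.
A = [[0, 3, 8, INF],
     [INF, 0, INF, 1],
     [INF, 4, 0, INF],
     [2, INF, -5, 0]]
C = floyd_warshall(A)
```

The single matrix `C` is updated in place, which is the version with the superscripts omitted.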

668
00:45:51,000 --> 00:45:58,000
And, the running time is
clearly n^3, three nested loops,

669
00:45:58,000 --> 00:46:02,000
constant time inside.
So, we're finally getting

670
00:46:02,000 --> 00:46:05,000
something that is never worse
than Bellman-Ford.

671
00:46:05,000 --> 00:46:06,000
In the sparse case,
it's the same.

672
00:46:06,000 --> 00:46:09,000
And anything denser,
the number of edges is super

673
00:46:09,000 --> 00:46:11,000
linear.
This is strictly better than

674
00:46:11,000 --> 00:46:13,000
Bellman-Ford.
And, it's better than

675
00:46:13,000 --> 00:46:16,000
everything we've seen so far for
all-pairs shortest paths.

676
00:46:16,000 --> 00:46:19,000
And, this handles negative
weights; very simple algorithm,

677
00:46:19,000 --> 00:46:21,000
even simpler than the one
before.

678
00:46:21,000 --> 00:46:23,000
It's just relaxation within
three loops.

679
00:46:23,000 --> 00:46:27,000
What more could you ask for?
And we need to check that this

680
00:46:27,000 --> 00:46:29,000
is indeed the min we're
computing here,

681
00:46:29,000 --> 00:46:33,000
except that the superscripts
are omitted.

682
00:46:33,000 --> 00:46:35,000
That's, again,
a bit of hand waving.

683
00:46:35,000 --> 00:46:39,000
It's OK to omit superscripts
because that can only mean that

684
00:46:39,000 --> 00:46:42,000
you're doing more relaxations
than you should be.

685
00:46:42,000 --> 00:46:45,000
Doing more relaxations can
never hurt you.

686
00:46:45,000 --> 00:46:48,000
In particular,
we do all the ones that we have

687
00:46:48,000 --> 00:46:50,000
to.
Therefore, we find the shortest

688
00:46:50,000 --> 00:46:52,000
path weights.
And, again, here,

689
00:46:52,000 --> 00:46:55,000
we're assuming that there are no
negative weight cycles.

690
00:46:55,000 --> 00:46:59,000
It shouldn't be hard to find
them, but you have to think

691
00:46:59,000 --> 00:47:04,000
about that a little bit.
OK, you could run another round

692
00:47:04,000 --> 00:47:07,000
of Bellman-Ford,
see if it relaxes any

693
00:47:07,000 --> 00:47:09,000
edges again.
For example,

694
00:47:09,000 --> 00:47:13,000
I think there's no nifty trick
for that version.

695
00:47:13,000 --> 00:47:17,000
And, we're going to cover,
that's our second algorithm for

696
00:47:17,000 --> 00:47:21,000
all pairs shortest paths.
Before we go up to the third

697
00:47:21,000 --> 00:47:26,000
algorithm, which is going to be
the cleverest of them all,

698
00:47:26,000 --> 00:47:30,000
the one Ring to rule them all,
to switch trilogies,

699
00:47:30,000 --> 00:47:33,000
we're going to take a little
bit of a diversion,

700
00:47:33,000 --> 00:47:37,000
side story, whatever,
and talk about transitive

701
00:47:37,000 --> 00:47:42,000
closure briefly.
This is just a good thing to

702
00:47:42,000 --> 00:47:45,000
know about.
And, it relates to the

703
00:47:45,000 --> 00:47:51,000
algorithms we've seen so far.
So, here's a transitive closure

704
00:47:51,000 --> 00:47:54,000
problem.
I give you a directed graph,

705
00:47:54,000 --> 00:47:59,000
and for all pairs of vertices,
i and j, I want to compute this

706
00:47:59,000 --> 00:48:03,000
number.
It's one if there's a path from

707
00:48:03,000 --> 00:48:06,000
i to j.
From i to j,

708
00:48:06,000 --> 00:48:14,000
OK, and then zero otherwise.
OK, this is sort of like a

709
00:48:14,000 --> 00:48:22,000
boring adjacency matrix with no
weights, except it's about paths

710
00:48:22,000 --> 00:48:32,000
instead of being about edges.
OK, so how can I compute this?

711
00:48:32,000 --> 00:48:39,000
That's very simple.
How should I compute this?

712
00:48:39,000 --> 00:48:45,000
This gives me a graph in some
sense.

713
00:48:45,000 --> 00:48:54,000
This is adjacency matrix of a
new graph called the transitive

714
00:48:54,000 --> 00:49:01,000
closure of my input graph.
So, breadth first search,

715
00:49:01,000 --> 00:49:05,000
yeah, good.
So, all I need to do is find

716
00:49:05,000 --> 00:49:08,000
shortest paths,
and if the weights come out

717
00:49:08,000 --> 00:49:12,000
infinity, then there's no path.
If it's less than infinity,

718
00:49:12,000 --> 00:49:15,000
then there's a path.
And so here,

719
00:49:15,000 --> 00:49:19,000
so you are saying maybe I don't
care about the weights,

720
00:49:19,000 --> 00:49:22,000
so I can run breadth first
search n times,

721
00:49:22,000 --> 00:49:27,000
and that will work indeed.
So, if we do V times BFS,

722
00:49:27,000 --> 00:49:31,000
so it's maybe weird that I'm
covering this here in the middle,

723
00:49:31,000 --> 00:49:36,000
but it's just an interlude.
So, we have,

724
00:49:36,000 --> 00:49:42,000
then, something like V times E.
OK, you can run any of these

725
00:49:42,000 --> 00:49:46,000
algorithms.
You could take Floyd-Warshall

726
00:49:46,000 --> 00:49:48,000
for example.
Why not?

727
00:49:48,000 --> 00:49:54,000
OK, then it would just be V^3. I
mean, you could run any of

728
00:49:54,000 --> 00:50:00,000
these algorithms with weights of
one or zero, and just check

729
00:50:00,000 --> 00:50:06,000
whether the values are infinity
or not.

730
00:50:06,000 --> 00:50:10,000
So, I mean, t_ij equals zero,
if and only if the shortest

731
00:50:10,000 --> 00:50:12,000
path weight from i to j is
infinity.

732
00:50:12,000 --> 00:50:16,000
So, just solve this.
This is an easier problem than

733
00:50:16,000 --> 00:50:18,000
shortest paths.
It is, in fact,

734
00:50:18,000 --> 00:50:22,000
strictly easier in a certain
sense, because what's going on

735
00:50:22,000 --> 00:50:26,000
with transitive closure,
and I just want to mention this

736
00:50:26,000 --> 00:50:30,000
out of interest because
transitive closure is a useful

737
00:50:30,000 --> 00:50:33,000
thing to know about.
Essentially,

738
00:50:33,000 --> 00:50:36,000
what we are doing,
let me get this right,

739
00:50:36,000 --> 00:50:39,000
is using a different set of
operators.

740
00:50:39,000 --> 00:50:43,000
We're using or and and,
a logical or and and instead of

741
00:50:43,000 --> 00:50:46,000
min and plus,
OK, because we want to know,

742
00:50:46,000 --> 00:50:49,000
if you think about a
relaxation, in some sense,

743
00:50:49,000 --> 00:50:53,000
maybe I should think about it
in terms of this min.

744
00:50:53,000 --> 00:50:56,000
So, if I want to know,
is there a path from i to j

745
00:50:56,000 --> 00:51:02,000
that uses vertices labeled one
through k in the middle?

746
00:51:02,000 --> 00:51:05,000
Well, either there is a path
that doesn't use the vertex k,

747
00:51:05,000 --> 00:51:09,000
or there is a path that uses k,
and then it would have to look

748
00:51:09,000 --> 00:51:12,000
like that.
OK, so there would have to be a

749
00:51:12,000 --> 00:51:15,000
path here, and there would have
to be a path there.

750
00:51:15,000 --> 00:51:18,000
So, the min and plus get
replaced with or and and.
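
The same three loops with or in place of min and and in place of plus give the transitive closure. A minimal Python sketch follows; the function name and the example graph are hypothetical, vertices are numbered from 0, and it assumes every vertex reaches itself by the empty path, so t_ii = 1.

```python
def transitive_closure(adj):
    """adj is a 0/1 adjacency matrix; returns the 0/1 matrix T with
    T[i][j] == 1 exactly when there is a path from i to j."""
    n = len(adj)
    # Start with single edges, plus the empty path from each vertex to itself.
    T = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Either there was already a path avoiding k, or there is
                # a path from i to k and a path from k to j.
                T[i][j] = T[i][j] or (T[i][k] and T[k][j])
    return [[int(x) for x in row] for row in T]


# Hypothetical directed path 0 -> 1 -> 2.
adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
T = transitive_closure(adj)
```

This version is still cubic; it is the plain triple loop, not the fast matrix multiply approach.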

751
00:51:18,000 --> 00:51:21,000
And if you remember,
this used to be plus,

752
00:51:21,000 --> 00:51:24,000
and this used to be product in
the matrix world.

753
00:51:24,000 --> 00:51:28,000
So, plus is now like or.
And, multiply is now like and,

754
00:51:28,000 --> 00:51:31,000
which sounds very good,
right?

755
00:51:31,000 --> 00:51:35,000
Plus does feel like or,
and multiply does feel like and

756
00:51:35,000 --> 00:51:40,000
if you live in a zero-one world.
So, in fact,

757
00:51:40,000 --> 00:51:45,000
this is not quite the field Z
mod two, but this is a good,

758
00:51:45,000 --> 00:51:49,000
nice, field to work in.
This is the Boolean world.

759
00:51:49,000 --> 00:51:55,000
So, I'll just write Boole.
Good old Boole knows all about

760
00:51:55,000 --> 00:51:58,000
this.
It's like his master's thesis,

761
00:51:58,000 --> 00:52:03,000
I think, talking about Boolean
algebra.

762
00:52:03,000 --> 00:52:06,000
And, this actually means that
you can use fast matrix

763
00:52:06,000 --> 00:52:09,000
multiply.
You can use Strassen's

764
00:52:09,000 --> 00:52:13,000
algorithm, and the fancier
algorithms, and you can compute

765
00:52:13,000 --> 00:52:16,000
the transitive closure in
subcubic time.

766
00:52:16,000 --> 00:52:19,000
So, this is sub cubic if the
edges are sparse.

767
00:52:19,000 --> 00:52:24,000
But, it's cubic in the worst
case if there are lots of edges.

768
00:52:24,000 --> 00:52:27,000
This is cubic.
You can actually do better

769
00:52:27,000 --> 00:52:30,000
using Strassen.
So, I'll just say you can do

770
00:52:30,000 --> 00:52:33,000
it.
No details here.

771
00:52:33,000 --> 00:52:37,000
I think it should be,
so in fact, there is a theorem.

772
00:52:37,000 --> 00:52:41,000
This is probably not in the
textbook, but there's a theorem

773
00:52:41,000 --> 00:52:45,000
that says transitive closure is
just as hard as matrix multiply.

774
00:52:45,000 --> 00:52:49,000
OK, they are equivalent.
Their running times are the

775
00:52:49,000 --> 00:52:52,000
same.
We don't know how long it takes

776
00:52:52,000 --> 00:52:55,000
to do a matrix multiply over a
field.

777
00:52:55,000 --> 00:52:57,000
It's somewhere between n^2 and
n^2.3.

778
00:52:57,000 --> 00:53:03,000
But, whatever the answer is:
same for transitive closure.

779
00:53:03,000 --> 00:53:09,000
OK, there's the interlude.
And that's where we actually

780
00:53:09,000 --> 00:53:16,000
get to use Strassen and friends.
Remember, Strassen was an n to the

781
00:53:16,000 --> 00:53:22,000
log base two of seven algorithm.
Remember that,

782
00:53:22,000 --> 00:53:28,000
especially on the final.
Those are things you should

783
00:53:28,000 --> 00:53:35,000
have at the tip of your tongue.
OK, the last algorithm we're

784
00:53:35,000 --> 00:53:39,000
going to cover is really going
to build on what we saw last

785
00:53:39,000 --> 00:53:43,000
time: Johnson's algorithm.
And, I've lost some of the

786
00:53:43,000 --> 00:53:46,000
running times here.
But, when we had unweighted

787
00:53:46,000 --> 00:53:50,000
graphs, we could do all pairs
really fast, just as fast as a

788
00:53:50,000 --> 00:53:54,000
single source Bellman-Ford.
That's kind of nifty.

789
00:53:54,000 --> 00:53:58,000
We don't know how to improve
Bellman-Ford in the single

790
00:53:58,000 --> 00:54:02,000
source case.
So, we can't really hope to get

791
00:54:02,000 --> 00:54:07,000
anything better than V times E.
And, if you remember running V

792
00:54:07,000 --> 00:54:11,000
times Dijkstra,
V times Dijkstra was about the

793
00:54:11,000 --> 00:54:14,000
same.
So, just put this in the recall

794
00:54:14,000 --> 00:54:19,000
bubble here: V times Dijkstra
would give us V times E plus V^2

795
00:54:19,000 --> 00:54:21,000
log V.
And, if you ignore that log

796
00:54:21,000 --> 00:54:25,000
factor, this is just VE.
OK, so this was really good.

797
00:54:25,000 --> 00:54:29,000
Dijkstra was great.
And this was for nonnegative

798
00:54:29,000 --> 00:54:34,000
edge weights.
So, with negative edge weights,

799
00:54:34,000 --> 00:54:38,000
somehow we'd like to get the
same running time.

800
00:54:38,000 --> 00:54:41,000
Now, how might I get the same
running time?

801
00:54:41,000 --> 00:54:45,000
Well, it would be really nice
if I could use Dijkstra.

802
00:54:45,000 --> 00:54:49,000
Of course, Dijkstra doesn't
work with negative weights.

803
00:54:49,000 --> 00:54:53,000
So what could I do?
What would I hope to do?

804
00:54:53,000 --> 00:54:56,000
What could I hope to do?
Suppose I want,

805
00:54:56,000 --> 00:55:02,000
in the middle of the algorithm,
it says run Dijkstra n times.

806
00:55:02,000 --> 00:55:05,000
Then, what should I do to
prepare for that?

807
00:55:05,000 --> 00:55:09,000
Make all the weights positive,
or nonnegative.

808
00:55:09,000 --> 00:55:13,000
Why not, right?
We're doing wishful thinking.

809
00:55:13,000 --> 00:55:17,000
That's what we'll do.
So, this is called graph

810
00:55:17,000 --> 00:55:21,000
re-weighting.
And, what's cool is we actually

811
00:55:21,000 --> 00:55:26,000
already know how to do it.
We just don't know that we know

812
00:55:26,000 --> 00:55:30,000
how to do it.
But I know that we know that we

813
00:55:30,000 --> 00:55:34,000
know how to do it.
You don't yet know that we know

814
00:55:34,000 --> 00:55:39,000
that I know that we know how to
do it.

815
00:55:39,000 --> 00:55:41,000
So, it turns out you can
re-weight the vertices.

816
00:55:41,000 --> 00:55:44,000
So, at the end of the last
class someone asked me,

817
00:55:44,000 --> 00:55:46,000
can you just,
like, add the same weight to

818
00:55:46,000 --> 00:55:48,000
all the edges?
That doesn't work.

819
00:55:48,000 --> 00:55:51,000
Not quite, because different
paths have different numbers of

820
00:55:51,000 --> 00:55:53,000
edges.
What we are going to do is add

821
00:55:53,000 --> 00:55:55,000
a particular weight to each
vertex.
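
To see concretely why adding one constant to every edge fails, here is a tiny hand-made example (the graph and names are invented for illustration). The three-edge path is shortest originally, but a uniform shift penalizes it three times and the one-edge path only once.

```python
def path_weight(path, w):
    """Sum the edge weights along a path given as a list of vertices."""
    return sum(w[(u, v)] for u, v in zip(path, path[1:]))


# Hypothetical graph: a direct edge s->t, and a detour s->a->b->t
# that contains one negative edge.
w = {("s", "t"): 1, ("s", "a"): 1, ("a", "b"): -2, ("b", "t"): 1}
direct = ["s", "t"]
detour = ["s", "a", "b", "t"]

# Shift every edge by the same amount so that nothing is negative.
shift = -min(w.values())
shifted = {edge: weight + shift for edge, weight in w.items()}

print(path_weight(direct, w), path_weight(detour, w))              # 1 0
print(path_weight(direct, shifted), path_weight(detour, shifted))  # 3 6
```

Before the shift the detour wins (0 versus 1); after it the direct edge wins (3 versus 6), so the shortest path itself has changed.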

822
00:55:55,000 --> 00:55:58,000
What does that mean?
Well, because we really only

823
00:55:58,000 --> 00:56:02,000
have weights on the edges,
here's what we'll do.

824
00:56:02,000 --> 00:56:06,000
We'll re-weight each edge,
so, (u,v), let's say,

825
00:56:06,000 --> 00:56:12,000
going to go back into graph
speak instead of matrix speak,

826
00:56:12,000 --> 00:56:17,000
(u,v) instead of i and j,
and we'll call this modified

827
00:56:17,000 --> 00:56:20,000
weight w_h.
h is our function.

828
00:56:20,000 --> 00:56:24,000
It gives us a number for every
vertex.

829
00:56:24,000 --> 00:56:30,000
And, it's just going to be the
old weight of that edge plus the

830
00:56:30,000 --> 00:56:36,000
weight of the start vertex minus
the weight of the terminating

831
00:56:36,000 --> 00:56:40,000
vertex.
I'm sure these have good names.

832
00:56:40,000 --> 00:56:43,000
One of these is the head,
and the other is the tail,

833
00:56:43,000 --> 00:56:47,000
but I can never remember which.
OK, so we have a directed edge

834
00:56:47,000 --> 00:56:48,000
(u,v).
Just add one of them;

835
00:56:48,000 --> 00:56:51,000
subtract the other.
And, it's a directed edge,

836
00:56:51,000 --> 00:56:53,000
so that's a consistent
definition.

837
00:56:53,000 --> 00:56:55,000
OK, so that's called
re-weighting.
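
As a minimal sketch of the definition just given, w_h(u, v) = w(u, v) + h(u) - h(v), assuming edges are stored in a dictionary keyed by (u, v) and h is a dictionary of per-vertex numbers (both representations are assumptions made for this example, and the h values are picked by hand):

```python
def reweight(edges, h):
    """Return w_h, where w_h(u, v) = w(u, v) + h(u) - h(v) for each directed edge."""
    return {(u, v): w + h[u] - h[v] for (u, v), w in edges.items()}


edges = {("a", "b"): -3, ("b", "c"): 2}
h = {"a": 0, "b": -3, "c": -1}
print(reweight(edges, h))  # {('a', 'b'): 0, ('b', 'c'): 0}
```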

838
00:56:55,000 --> 00:56:58,000
Now, this is actually a
theorem.

839
00:56:58,000 --> 00:57:03,000
If you do this,
then, let's say,

840
00:57:03,000 --> 00:57:10,000
for any vertices,
u and v in the graph,

841
00:57:10,000 --> 00:57:18,000
for any two vertices,
all paths from u to v have the

842
00:57:18,000 --> 00:57:27,000
same weight as they did before,
well, not quite.

843
00:57:27,000 --> 00:57:34,000
They have the same
re-weighting.

844
00:57:34,000 --> 00:57:37,000
So, if you look at all the
different paths and you say,

845
00:57:37,000 --> 00:57:39,000
well, what's the difference
between vh, well,

846
00:57:39,000 --> 00:57:42,000
sorry, let's say delta,
which is the old shortest

847
00:57:42,000 --> 00:57:45,000
paths, and delta sub h,
which is the shortest path

848
00:57:45,000 --> 00:57:48,000
weights according to this new
weight function,

849
00:57:48,000 --> 00:57:50,000
then that difference is the
same.

850
00:57:50,000 --> 00:57:53,000
So, we'll say that all these
paths are re-weighted by the

851
00:57:53,000 --> 00:57:55,000
same amounts.
OK, this is actually a

852
00:57:55,000 --> 00:58:00,000
statement about all paths,
not just shortest paths.

853
00:58:00,000 --> 00:58:05,000
There we go.
OK, to how many people is this

854
00:58:05,000 --> 00:58:08,000
obvious already?
A few, yeah,

855
00:58:08,000 --> 00:58:12,000
it is.
And what's the one word?

856
00:58:12,000 --> 00:58:16,000
OK, it's maybe not that
obvious.

857
00:58:16,000 --> 00:58:23,000
All right, shout out the word
when you figure it out.

858
00:58:23,000 --> 00:58:29,000
Meanwhile, I'll write out this
rather verbose proof.

859
00:58:29,000 --> 00:58:36,000
There's a one word proof,
still waiting.

860
00:58:36,000 --> 00:58:41,000
So, let's just take one of
these paths that starts at u and

861
00:58:41,000 --> 00:58:43,000
ends at v.
Take any path.

862
00:58:43,000 --> 00:58:49,000
We're just going to see what
its new weight is relative to

863
00:58:49,000 --> 00:58:53,000
its old weight.
And so, let's just write out

864
00:58:53,000 --> 00:58:57,000
w_h of the path,
which we define in the usual

865
00:58:57,000 --> 00:59:03,000
way as the sum over all edges of
the new weight of the edge from

866
00:59:03,000 --> 00:59:09,000
v_i to v_i plus one.
Do you have the word?

867
00:59:09,000 --> 00:59:11,000
No?
Tough puzzle then,

868
00:59:11,000 --> 00:59:15,000
OK.
So that's the definition of the

869
00:59:15,000 --> 00:59:20,000
weight of a path.
And, then we know this thing is

870
00:59:20,000 --> 00:59:23,000
just w of v_i,
v_i plus one.

871
00:59:23,000 --> 00:59:27,000
I'll get it right,
plus the weight of the first

872
00:59:27,000 --> 00:59:32,000
vertex, plus,
sorry, the re-weighting of v_i

873
00:59:32,000 --> 00:59:38,000
minus the re-weighting of v_i
plus one.

874
00:59:38,000 --> 00:59:42,000
This is all in parentheses
that's summed over i.

875
00:59:42,000 --> 00:59:46,000
Now I need the magic word.
Telescopes, good.

876
00:59:46,000 --> 00:59:51,000
Now this is obvious:
each of these telescopes with

877
00:59:51,000 --> 00:59:55,000
the next and previous,
except the very beginning and

878
00:59:55,000 --> 00:59:59,000
the very end.
So, this is the sum of these

879
00:59:59,000 --> 01:00:03,817
weights of edges,
but then outside the sum,

880
01:00:03,817 --> 01:00:09,000
we have plus h of v_1,
and minus h of v_k.

881
01:00:09,000 --> 01:00:11,933
OK, those guys don't quite
cancel.

882
01:00:11,933 --> 01:00:15,577
We're not looking at a cycle,
just a path.

883
01:00:15,577 --> 01:00:20,822
And, this thing is just w of
the path, as this is the normal

884
01:00:20,822 --> 01:00:24,111
weight of the path.
And so the change,

885
01:00:24,111 --> 01:00:29,088
the difference between w_h of P
and w of P is this thing,

886
01:00:29,088 --> 01:00:33,000
which is just h of u minus h of
v.

887
01:00:33,000 --> 01:00:36,744
And, the point is that's the
same as long as you fix the

888
01:00:36,744 --> 01:00:39,468
endpoints, u and v,
of the shortest path,

889
01:00:39,468 --> 01:00:43,348
you're changing this path
weight by the same thing for all

890
01:00:43,348 --> 01:00:45,800
paths.
This is for any path from u to

891
01:00:45,800 --> 01:00:49,612
v, and that proves the theorem.
So, the one word here was

892
01:00:49,612 --> 01:00:51,927
telescopes.
These changes in weights

893
01:00:51,927 --> 01:00:55,536
telescope over any path.
Therefore, if we want to find

894
01:00:55,536 --> 01:00:58,327
shortest paths,
you just find the shortest

895
01:00:58,327 --> 01:01:01,800
paths in this re-weighted
version, and then you just

896
01:01:01,800 --> 01:01:06,848
change it by this one amount.
You subtract off this amount

897
01:01:06,848 --> 01:01:10,281
instead of adding it.
That will give you the shortest

898
01:01:10,281 --> 01:01:12,591
path weight in the original
weights.
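
Written out, the telescoping argument from the board looks roughly like this, taking the path to be p = v_1, v_2, ..., v_k with v_1 = u and v_k = v (the indexing is an assumption consistent with the h of v_1 and h of v_k mentioned above):

```latex
\begin{align*}
w_h(p) &= \sum_{i=1}^{k-1} w_h(v_i, v_{i+1}) \\
       &= \sum_{i=1}^{k-1} \bigl( w(v_i, v_{i+1}) + h(v_i) - h(v_{i+1}) \bigr) \\
       &= \sum_{i=1}^{k-1} w(v_i, v_{i+1}) + h(v_1) - h(v_k) \\
       &= w(p) + h(u) - h(v).
\end{align*}
% Consequence for shortest-path weights:
\[
\delta_h(u, v) = \delta(u, v) + h(u) - h(v)
\quad\Longrightarrow\quad
\delta(u, v) = \delta_h(u, v) - h(u) + h(v).
\]
```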

899
01:01:12,591 --> 01:01:15,694
OK, so this is a tool.
We now know how to change

900
01:01:15,694 --> 01:01:18,995
weights in the graph.
But what we really want is to

901
01:01:18,995 --> 01:01:22,889
change weights in the graph so
that the weights all come out

902
01:01:22,889 --> 01:01:25,134
nonnegative.
OK, how do we do that?

903
01:01:25,134 --> 01:01:28,105
Why in the world would there be
a function, h,

904
01:01:28,105 --> 01:01:32,000
that makes all the edge weights
nonnegative?

905
01:01:32,000 --> 01:01:42,851
It doesn't make sense.
It turns out we already know.

906
01:01:42,851 --> 01:01:52,000
So, I should write down this
consequence.

907
01:02:12,000 --> 01:02:14,193
Let me get this in the right
order.

908
01:02:14,193 --> 01:02:17,096
So in particular,
the shortest path changes by

909
01:02:17,096 --> 01:02:19,677
this amount.
And if you want to know this

910
01:02:19,677 --> 01:02:22,774
value, you just move the stuff
to the other side.

911
01:02:22,774 --> 01:02:26,193
So, if we compute delta sub h,
then we can compute delta.

912
01:02:26,193 --> 01:02:29,935
That's the consequence here.
How many people here pronounce

913
01:02:29,935 --> 01:02:33,981
this word corollary?
OK, and how many people

914
01:02:33,981 --> 01:02:37,599
pronounce it corollary?
Yeah, we are alone.

915
01:02:37,599 --> 01:02:42,596
Usually get at least one other
student, and they're usually

916
01:02:42,596 --> 01:02:45,353
Canadian or British or
something.

917
01:02:45,353 --> 01:02:50,006
I think that's the accent.
So, I always avoid pronouncing

918
01:02:50,006 --> 01:02:53,969
this word unless I really think,
it's corollary,

919
01:02:53,969 --> 01:02:57,587
and get it right.
I at least say Z not Zed.

920
01:02:57,587 --> 01:03:03,428
OK, here we go.
So, what we want to do is find

921
01:03:03,428 --> 01:03:09,371
one of these functions.
I mean, let's just write down

922
01:03:09,371 --> 01:03:15,771
what we could hope to have.
We want to find a re-weighting

923
01:03:15,771 --> 01:03:22,971
function, h, that assigns a weight
to each vertex such that w_h of

924
01:03:22,971 --> 01:03:28,457
(u,v) is nonnegative.
That would be great for all

925
01:03:28,457 --> 01:03:34,735
edges, all (u,v) in E.
OK, then we could run Dijkstra.

926
01:03:34,735 --> 01:03:38,264
We could run Dijkstra,
get the delta h's,

927
01:03:38,264 --> 01:03:41,352
and then just undo the
re-weighting,

928
01:03:41,352 --> 01:03:45,147
and get what we want.
And, that is Johnson's

929
01:03:45,147 --> 01:03:48,235
algorithm.
The claim is that this is

930
01:03:48,235 --> 01:03:52,029
always possible.
OK, why should it always be

931
01:03:52,029 --> 01:03:54,941
possible?
Well, let's look at this

932
01:03:54,941 --> 01:03:57,764
constraint.
w_h of (u,v) is that.

933
01:03:57,764 --> 01:04:02,441
So, it's w of (u,v) plus h of u
minus h of v should be

934
01:04:02,441 --> 01:04:09,691
nonnegative.
Let me rewrite this a little

935
01:04:09,691 --> 01:04:14,886
bit.
I'm going to put these guys

936
01:04:14,886 --> 01:04:21,589
over here.
That would be the right thing,

937
01:04:21,589 --> 01:04:30,805
h of v minus h of u is less
than or equal to w of (u,v).

938
01:04:30,805 --> 01:04:39,068
Does that look familiar?
Did I get it right?

939
01:04:39,068 --> 01:04:46,496
It should be right.
Anyone seen that inequality

940
01:04:46,496 --> 01:04:51,826
before?
Yeah, yes, correct answer.

941
01:04:51,826 --> 01:04:56,993
OK, where?
In a previous lecture?

942
01:04:56,993 --> 01:05:06,000
In the previous lecture.
What is this called if I

943
01:05:06,000 --> 01:05:11,166
replace h with x?
Charles knows.

944
01:05:11,166 --> 01:05:20,833
Good, anyone else remember all
the way back to episode two?

945
01:05:20,833 --> 01:05:31,000
I know there was a weekend.
What's this operator called?

946
01:05:31,000 --> 01:05:34,058
Not subtraction but,
I think I heard it,

947
01:05:34,058 --> 01:05:36,568
oh man.
All right, I'll tell you.

948
01:05:36,568 --> 01:05:39,627
It's a difference constraint,
all right?

949
01:05:39,627 --> 01:05:42,058
This is the difference
operator.

950
01:05:42,058 --> 01:05:45,745
OK, it's our good friend
difference constraints.

951
01:05:45,745 --> 01:05:48,490
So, this is what we want to
satisfy.

952
01:05:48,490 --> 01:05:51,784
We have a system of difference
constraints.

953
01:05:51,784 --> 01:05:55,862
h of v minus h of u should be,
we want to find these.

954
01:05:55,862 --> 01:05:59,941
These are our unknowns.
Subject to these constraints,

955
01:05:59,941 --> 01:06:05,845
we are given the w's.
Now, we know when these

956
01:06:05,845 --> 01:06:10,995
difference constraints are
satisfiable.

957
01:06:10,995 --> 01:06:18,855
Can someone tell me when these
constraints are satisfiable?

958
01:06:18,855 --> 01:06:26,714
We know exactly when for any
set of difference constraints.

959
01:06:26,714 --> 01:06:32,000
You've got to remember the
math.

960
01:06:32,000 --> 01:06:37,649
Terminology,
I can understand.

961
01:06:37,649 --> 01:06:47,779
It's hard to remember words
unless you're a linguist,

962
01:06:47,779 --> 01:06:54,207
perhaps.
So, when is the system of

963
01:06:54,207 --> 01:07:02,000
difference constraints
satisfiable?

964
01:07:02,000 --> 01:07:08,341
All right, you should
definitely, very good.

965
01:07:08,341 --> 01:07:12,027
[LAUGHTER] Yes,
very good.

966
01:07:12,027 --> 01:07:21,023
Someone brought their lecture
notes: when the constraint graph

967
01:07:21,023 --> 01:07:27,806
has no negative weight cycles.
Good, thank you.

968
01:07:27,806 --> 01:07:34,000
Now, what is the constraint
graph?

969
01:07:34,000 --> 01:07:37,726
OK, this has a one letter
answer more or less.

970
01:07:37,726 --> 01:07:40,458
I'll accept the one letter
answer.

971
01:07:40,458 --> 01:07:41,038
What?
A?

972
01:07:41,038 --> 01:07:41,949
A: close.
G.

973
01:07:41,949 --> 01:07:43,936
Yeah, I mean,
same thing.

974
01:07:43,936 --> 01:07:47,745
Yeah, so the constraint graph
is essentially G.

975
01:07:47,745 --> 01:07:51,388
Actually, it is G.
The constraint graph is G,

976
01:07:51,388 --> 01:07:54,286
good.
And, we prove this by adding a

977
01:07:54,286 --> 01:07:57,764
new source vertex,
and connecting that to

978
01:07:57,764 --> 01:08:01,766
everyone.
But that's sort of beside the

979
01:08:01,766 --> 01:08:03,898
point.
That was in order to actually

980
01:08:03,898 --> 01:08:05,604
satisfy them.
But this is our

981
01:08:05,604 --> 01:08:08,527
characterization.
So, if we assume that there are

982
01:08:08,527 --> 01:08:12,243
no negative weight cycles in our
graph, which we've been doing

983
01:08:12,243 --> 01:08:14,923
all the time,
then we know that this thing is

984
01:08:14,923 --> 01:08:16,993
satisfiable.
Therefore, there is an

985
01:08:16,993 --> 01:08:20,100
assignment of these h's.
There is a re-weighting that

986
01:08:20,100 --> 01:08:22,111
makes all the weights
nonnegative.

987
01:08:22,111 --> 01:08:24,548
Then we can run Dijkstra.
OK, we're done.

988
01:08:24,548 --> 01:08:27,167
Isn't that cool?
And how do we satisfy these

989
01:08:27,167 --> 01:08:29,786
constraints?
We know how to do that with one

990
01:08:29,786 --> 01:08:32,283
run of Bellman-Ford,
which costs order VE,

991
01:08:32,283 --> 01:08:36,000
which is less than V times
Dijkstra.

992
01:08:36,000 --> 01:08:39,750
So, that's it,
write down the details

993
01:08:39,750 --> 01:08:41,000
somewhere.

994
01:09:00,000 --> 01:09:03,902
So, this is Johnson's
algorithm.

995
01:09:03,902 --> 01:09:07,931
This is the fanciest of them
all.

996
01:09:07,931 --> 01:09:13,723
It will be our fastest,
all pairs shortest path

997
01:09:13,723 --> 01:09:17,122
algorithm.
So, the claim is,

998
01:09:17,122 --> 01:09:23,542
we can find a function,
h, from V to R such that the

999
01:09:23,542 --> 01:09:30,970
modified weight of every edge is
nonnegative for every edge,

1000
01:09:30,970 --> 01:09:37,366
(u,v), in our graph.
And, we do that using

1001
01:09:37,366 --> 01:09:43,000
Bellman-Ford to solve the
difference constraints.

1002
01:09:57,000 --> 01:10:01,075
These are exactly the difference
constraints that we were born to

1003
01:10:01,075 --> 01:10:03,663
solve that we learned to solve
last time.

1004
01:10:03,663 --> 01:10:06,704
The graphs here are
corresponding exactly if you

1005
01:10:06,704 --> 01:10:10,391
look back at the definition.
Or, Bellman-Ford will tell us

1006
01:10:10,391 --> 01:10:12,785
that there is a negative weight
cycle.

1007
01:10:12,785 --> 01:10:16,796
OK, great, so it's not that we
really have to assume that there

1008
01:10:16,796 --> 01:10:19,772
is no negative weight cycle.
We'll get to know.

1009
01:10:19,772 --> 01:10:22,942
And if you're fancy,
you can actually figure out the

1010
01:10:22,942 --> 01:10:25,918
minus infinities from this.
But, at this point,

1011
01:10:25,918 --> 01:10:29,865
I just want to think about the
case where there is no negative

1012
01:10:29,865 --> 01:10:33,696
weight cycle.
But if there is,

1013
01:10:33,696 --> 01:10:39,954
we can find out that it exists,
and then just tell the user.

1014
01:10:39,954 --> 01:10:45,257
OK, then we'd stop.
Otherwise, there is no negative

1015
01:10:45,257 --> 01:10:48,969
weight cycle.
Therefore, there is an

1016
01:10:48,969 --> 01:10:54,166
assignment that gives us
nonnegative edge weights.

1017
01:10:54,166 --> 01:11:00,000
So, we just use it.
We use it to run Dijkstra.

1018
01:11:00,000 --> 01:11:02,744
So, step two is,
oh, I should say the running

1019
01:11:02,744 --> 01:11:05,987
time of all this is V times E.
So, we're just running

1020
01:11:05,987 --> 01:11:08,419
Bellman-Ford on exactly the
input graph.

1021
01:11:08,419 --> 01:11:10,665
Plus, we add a source,
if you recall,

1022
01:11:10,665 --> 01:11:13,160
to solve a set of difference
constraints.

1023
01:11:13,160 --> 01:11:16,340
You add a source vertex,
S, connected to everyone at

1024
01:11:16,340 --> 01:11:20,145
weight zero, run Bellman-Ford
from there because we don't have

1025
01:11:20,145 --> 01:11:22,327
a source here.
We just have a graph.

1026
01:11:22,327 --> 01:11:25,758
We want to know all pairs.
So, this, you can use to find

1027
01:11:25,758 --> 01:11:30,000
whether there is a negative
weight cycle anywhere.

1028
01:11:30,000 --> 01:11:33,428
Or, we get this magic
assignment.
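
Step one might be sketched as follows, assuming vertices are numbered 0 to n-1 and edges come as (u, v, w) triples (the function name and input format are assumptions for this example). Rather than materializing the extra source vertex, the sketch starts every distance at 0, which is the state right after relaxing its zero-weight edges.

```python
def potentials(n, edges):
    """Bellman-Ford from a virtual source joined to every vertex by a 0-weight edge.

    n: number of vertices, labelled 0..n-1.
    edges: list of (u, v, w) directed edges.
    Returns a list h with w + h[u] - h[v] >= 0 on every edge,
    or None if the graph contains a negative-weight cycle.
    """
    # Distances after relaxing the virtual source's zero-weight edges.
    h = [0] * n
    # With no negative cycle, n - 1 rounds suffice; a change in the
    # n-th round therefore signals a negative-weight cycle.
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if not changed:
            return h
    return None


print(potentials(3, [(0, 1, -3), (1, 2, 2), (2, 0, 4)]))  # [0, -3, -1]
print(potentials(2, [(0, 1, 1), (1, 0, -2)]))             # None
```

The returned values are shortest-path weights from the virtual source, so they satisfy h(v) - h(u) <= w(u, v) on every edge, which is exactly the nonnegativity condition on w_h.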

1029
01:11:33,428 --> 01:11:39,535
So now, w_h is nonnegative,
so we can run Dijkstra on w_h.

1030
01:11:39,535 --> 01:11:43,821
We'll say, using w_h,
so you compute w_h.

1031
01:11:43,821 --> 01:11:49,392
That takes linear time.
And, we run Dijkstra for each

1032
01:11:49,392 --> 01:11:54,428
possible source.
I'll write this out explicitly.

1033
01:11:54,428 --> 01:12:00,000
We've had this in our minds
several times.

1034
01:12:00,000 --> 01:12:05,368
But, when we said n times
Dijkstra or n times BFS,

1035
01:12:05,368 --> 01:12:09,684
here it is.
We want to compute delta sub h

1036
01:12:09,684 --> 01:12:15,263
now, of (u,v) for all v,
and we do this separately for

1037
01:12:15,263 --> 01:12:18,947
all u.
And so, the running time here

1038
01:12:18,947 --> 01:12:23,684
is VE plus V^2 log V.
This is just V times the

1039
01:12:23,684 --> 01:12:30,000
running time of Dijkstra,
which is E plus V log V.
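
A sketch of step two, assuming the re-weighted graph is given as an adjacency dictionary with nonnegative weights (names and representation are assumptions). Note that this version uses Python's binary heap, so each run costs O((V + E) log V); the E plus V log V bound quoted in the lecture needs a Fibonacci heap.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path weights from source; adj[u] is a list of (v, weight >= 0)."""
    dist = {u: float("inf") for u in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; u was already settled with a smaller key
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def all_pairs(adj):
    """Run Dijkstra once per source vertex: result[u][v] = delta_h(u, v)."""
    return {u: dijkstra(adj, u) for u in adj}


# Re-weighted version of the cycle 0 -> 1 -> 2 -> 0 with weights -3, 2, 4,
# using h = [0, -3, -1].
adj_h = {0: [(1, 0)], 1: [(2, 0)], 2: [(0, 3)]}
print(all_pairs(adj_h)[1])  # {0: 3, 1: 0, 2: 0}
```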

1040
01:12:30,000 --> 01:12:35,084
OK, it happens that this term
is the same as this one,

1041
01:12:35,084 --> 01:12:39,017
which is nice,
because that means step one

1042
01:12:39,017 --> 01:12:43,334
costs us nothing asymptotically.
OK, and then,

1043
01:12:43,334 --> 01:12:47,075
last step is,
well, now we know delta h.

1044
01:12:47,075 --> 01:12:52,831
We just need to compute delta.
So, for each pair of vertices,

1045
01:12:52,831 --> 01:12:57,052
we'll call it (u,v),
we just compute what the

1046
01:12:57,052 --> 01:13:03,000
original weights would be,
so what delta (u,v) is.

1047
01:13:03,000 --> 01:13:07,471
And we can do that using this
corollary.

1048
01:13:07,471 --> 01:13:13,777
It's just delta sub h of (u,v)
minus h of u plus h of v.

1049
01:13:13,777 --> 01:13:19,624
I got the signs right.
Yeah, so this takes V^2 time,

1050
01:13:19,624 --> 01:13:24,668
also dwarfed by the running
time of Dijkstra.
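
Step three is only a table pass. A minimal sketch, assuming the re-weighted distances are stored as a dictionary of dictionaries and h as a dictionary (both assumptions for illustration):

```python
def undo_reweighting(delta_h, h):
    """Recover delta(u, v) = delta_h(u, v) - h(u) + h(v) for every pair."""
    return {u: {v: d - h[u] + h[v] for v, d in row.items()}
            for u, row in delta_h.items()}


# delta_h for the cycle 0 -> 1 -> 2 -> 0 with original weights -3, 2, 4,
# re-weighted by h = {0: 0, 1: -3, 2: -1}.
delta_h = {0: {0: 0, 1: 0, 2: 0},
           1: {0: 3, 1: 0, 2: 0},
           2: {0: 3, 1: 3, 2: 0}}
h = {0: 0, 1: -3, 2: -1}
print(undo_reweighting(delta_h, h)[0])  # {0: 0, 1: -3, 2: -1}
```

Unreachable pairs stay at infinity, since adding finite numbers to infinity leaves it unchanged.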

1051
01:13:24,668 --> 01:13:31,777
So, the overall running time of
Johnson's algorithm is just the

1052
01:13:31,777 --> 01:13:39,000
running time of step two,
running Dijkstra n times --

1053
01:13:51,000 --> 01:13:54,951
-- which is pretty cool.
When it comes to single source

1054
01:13:54,951 --> 01:13:58,243
shortest paths,
Bellman-Ford is the best thing

1055
01:13:58,243 --> 01:14:01,990
for general weights.
Dijkstra is the best thing for

1056
01:14:01,990 --> 01:14:04,976
nonnegative weights.
But for all pairs shortest

1057
01:14:04,976 --> 01:14:08,890
paths, we can skirt the whole
negative weight issue by using

1058
01:14:08,890 --> 01:14:11,213
this magic we saw from
Bellman-Ford.

1059
01:14:11,213 --> 01:14:14,995
But now, running Dijkstra n
times, which is still the best

1060
01:14:14,995 --> 01:14:17,383
thing we know how to do,
pretty much,

1061
01:14:17,383 --> 01:14:21,232
for the all pairs nonnegative
weights, now we can do it for

1062
01:14:21,232 --> 01:14:24,018
general weights too,
which is a pretty nice

1063
01:14:24,018 --> 01:14:28,000
combination of all the
techniques we've seen.

1064
01:14:28,000 --> 01:14:30,217
In the trilogy,
and along the way,

1065
01:14:30,217 --> 01:14:33,577
we saw lots of dynamic
programming, which is always

1066
01:14:33,577 --> 01:14:35,459
good practice.
Any questions?

1067
01:14:35,459 --> 01:14:38,954
This is the last new content
lecture before the quiz.

1068
01:14:38,954 --> 01:14:42,852
On Wednesday it will be quiz
review, if I recall correctly.

1069
01:14:42,852 --> 01:14:46,347
And then it's Thanksgiving,
so there's no recitation.

1070
01:14:46,347 --> 01:14:48,632
And then the quiz starts on
Monday.

1071
01:14:48,632 --> 01:14:51,000
So, study up.
See you then.