The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So, yesterday in the recitation we talked a little bit about how to debug programs on Cell. Today I'm going to talk a little more about debugging parallel programs in general and give you some common tips that might be helpful in tracking down problems that you run into.

As you might have gotten a feel for yesterday, debugging parallel programs is harder than debugging normal sequential programs. In sequential programs you have your traditional set of bugs, which parallel programs inherit. That part doesn't get much harder, but then you add on new things that can go wrong because of parallelization: things like synchronization, things like deadlocks, things like data races. You have to get those right, and now you have to debug your program and figure out how to do that.

One of the things you'll see is that a lot of tools might not be as good as you'd like them to be in terms of providing the functionality for debugging. Add to that that bugs in parallel programs often just go away if you change one statement in your code -- you reorder things and all of a sudden the bug is gone. It's kind of like those pointer problems in C, where you might add a word or a new variable somewhere and the problem's gone, or you add a printf and the problem is gone. Here it gets harder, because those changes can get rid of deadlocks, so it makes it really hard to have an experiment that you can repeat and narrow down to where the problem is.

So what might you want in a debugger? This is a list that I've come up with, and if you have some ideas, throw them out. Thinking in terms of debugging a parallel program, what I want is a visual debugging system that really lets me see all the processors in my multiprocessor system. That includes the actual computation and the actual network that's interconnecting all the processors that are going to be communicating with each other.

I'd like to be able to see what code is running on each processor. I'd like to see which edges are being used to send messages around. I might want to know which processors are blocked -- that might help me identify deadlock problems. For these kinds of scenarios it might be tricky to define a step, because there's no global clock; you can't force everybody to proceed through one step. What's one step on one processor might be different on another, especially if they're not all running the same code. So how do you actually do that without a global clock? That can get a little bit tricky. It likely won't help with data races, because I'm looking at global communication problems -- I'm trying to identify what's deadlocked and what's not. So if there are data races, this kind of tool may or may not help with that. In general, though, this is the tool that I would build for debugging.

I looked around on the web to see what's out there for debugging parallel programs, and I found this tool called TotalView. It's actually something you have to buy; it's not free. I don't know if they have evaluation licenses or licenses for academic purposes. It gets close to some of the things I was talking about: you have processors, it shows the communication between those processors, how much data is being sent through. This particular version uses MPI, which we talked about in previous lectures. So it's sort of helpful in being able to see the computation, look at the communication, and track down bugs. But it doesn't get much better from there.

You know, how many people have used printfs for debugging? It's the most popular way of debugging, and even I still use it for debugging some of the Cell programs we've been writing. I know the TAs actually use it as well. Yesterday you got hands-on experience with GDB, and GDB is a nice debugger, but it lacks a lot of things that you might want, especially for debugging parallel programs. You saw, for example, that when you have multiple threads you need to be able to switch between the threads, getting the context right, and being able to name the variables is tricky. So there's a lot that could be improved.
There are some research debuggers, like something we've built as part of the streaming projects, the StreamIt debugger. I'll show you some screenshots so you can see what it can do. The StreamIt debugger is actually built in Eclipse, and you can download it off the web as well. You can look at your stream graph. Unfortunately, I couldn't get a split-join in there, much to Bill's dismay, so you can't see, for example, the split-join and all the communication. Each one of these is a filter, and if you recall, the filter is the computational element in your stream graph; filters are interconnected by channels, and channels communicate data. So what you see here -- well, you might not be able to quite see it -- is the actual data being passed through from one filter to the other. You can actually go in there and change a value if you wanted to, or highlight a particular value and see how it flows down through the graph.

If you had a split-join -- in fact, you can do this -- you can look at each path of the split-join independently, and you can look at it in sequence. Because the split-join has nice semantics, you can replicate the behavior: because of the static nature, everything is deterministic. So this is very helpful. We did a user study almost two years ago with something like 30 MIT students who used the debugger and gave us feedback on it. We gave them ten problems -- ten code snippets, each with a bug in it -- and asked them to find the bugs. A lot of them found the debugger to be helpful in being able to track the flow of data and see what goes wrong. So if you had, for example, a floating-point division that resulted in NaN -- not a number -- you could immediately see it on the screen, so you know exactly where to go look for it. Doing that with printfs might not be as easy. So sometimes visual debugging can be very nice.

Unfortunately, visual debugging for the Cell isn't that great. This is the Cell plug-in in Eclipse. I've mentioned to some of you that if you want to run it, you can run it from a PlayStation 3, but if more than one of you is running it, it becomes unusable because of memory issues. You can install it on other Linux machines and remotely debug on the PlayStation 3 hardware; the two remote machines can talk through GDB ports. I can talk to you about how to set that up if you want to, but it doesn't really add anything over Emacs, for example. It just might look fancier than an Emacs window or GDB at the command-line prompt. So this is the code from yesterday -- these are the exercises we asked you to do. You can look at the different threads. If you have debugged Java programs in Eclipse, this should look very familiar. You can look at the different variables. You still have the naming problem. Yesterday, remember, you had to qualify which control box you were looking at? It's still the same kind of issue -- you have to do some trickery to find it here.

It doesn't have the nice visual aspect of showing you which code is running on which SPE, and you might not be able to find mailbox synchronization problems. Maybe those things will come in the future -- in fact, they likely will. But a lot of that is still lacking. So what do you do in the meantime, in the next two weeks, as you're writing your programs? I've looked around for some tips, some talks and lectures, on what people have done to improve the process of debugging parallel codes. Probably the best thing I've found is a talk given at the University of Maryland on defect patterns. The rest of these slides are largely drawn from that talk. I'm going to identify just a few of the patterns to give you some examples, so you can understand what to look for, what some common symptoms are, and what some common prevention techniques are.

So defect patterns, just like the programming patterns we talked about, are meant to help you conjure up the right contextual information: what are the things you should look for if you're communicating with somebody else,
what kind of terminology do you use so that you don't have to explain things down to every last detail. At the end of this course, one thing I'd like to do is get some feedback from each of you on the problems you ran into in writing your programs and how you actually went about debugging them, and maybe we can come up with Cell defect patterns, and maybe Cell defect recipes for resolving those defect patterns.

So, probably the worst one of all, and the easiest one to fix, is that you have new language features or new language extensions that are not well understood. This is especially true when you take a class of students who don't really know the language and don't know all the tools, and you ask them to do a project in four weeks and expect things to work. There's a lot for everybody to pick up and understand. So you might have inconsistent types that you use in calling a function. There might be alignment issues, which some of you have run into. You might use the wrong functions: you know the functionality you want, but you just don't know how to name it, and so you might use the wrong function.

Some of these are easy to fix because you might get a compile-time error. If you have a mismatch in function parameters, you can fix that very easily. Some defects -- very natural in parallel programs -- might not come up until run time, so you might end up with crashes or just erroneous behavior. I really think this is probably the easiest one to fix, and the prevention technique I would recommend is: if there's something you're unfamiliar with, or you're not sure how to use something, ask. But also, you don't need to know all the functions that are available in something like the Cell language extensions for C. Yes, there are a lot of functions -- the manuals run to hundreds of pages, and you can't possibly go through it all; nobody becomes an expert in everything. But understand just a few basic concepts and features. David identified a bunch that he found useful for writing the programs, and some of the ones that are up on the web page under the recipes for this course list a few more.

This might help you just understand how these functions work and the basic mechanisms they give you, and that's good enough, because it'll help you get by. Certainly for doing the project under short time constraints, you don't need to know all the advanced features that Cell might have, or you can probably just pick them up on the fly as you need them.

So what are some more interesting problems that come up? One that is probably not too unfamiliar is space decomposition problems. If you remember, space decomposition is really data distribution. You have a serial program that you want to parallelize, and what that means is you have to actually send data around to different processors so that each one knows how to compute locally. Here you might get things like segmentation faults, alignment problems, or index-out-of-range errors. What this comes from is forgetting to change things, or overlooking some simple things that don't carry over from the sequential case to the parallel case. So what you want to do is validate that your distributions and your memory partitions are correct.

So what's an example? Suppose you had an array or a list of cells, and each cell has a number. What you want to do, at each step of the computation for any given cell, is add the value to the left of it and the value to the right of it. So here, cell zero has the value 2. We'll assume that the ends connect, so this is like a circular list, a circular buffer. So adding the left and right neighbor would get me, in this case, 3 plus 1: 4. And so on and so forth. You want to repeat this computation for n steps. This might be very common in computations where you're doing unit [? book ?] communication.

So what's a straightforward sequential implementation? Well, you can use two buffers: one for the current time step, where you do all the calculations, and another buffer for the next time step. Then you swap the two. So the code might look something like this: sequential C code, my two buffers, here's my loop. I write into one buffer and then I switch the two buffers around. Any questions so far?
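As a rough sketch of that sequential version (a minimal reconstruction of the idea, not the actual slide code -- the names, the fixed size, and the copy-based swap are my own choices):

```c
#include <string.h>

#define N 4  /* number of cells; a made-up size for illustration */

/* One time step: each new cell is the sum of its circular
   left and right neighbors in the current buffer. */
void step(const int *cur, int *next) {
    for (int i = 0; i < N; i++)
        next[i] = cur[(i - 1 + N) % N] + cur[(i + 1) % N];
}

/* Run the computation for `steps` time steps, writing into the
   second buffer and then swapping (here, copying back). */
void run(int *buf, int steps) {
    int tmp[N];
    for (int s = 0; s < steps; s++) {
        step(buf, tmp);
        memcpy(buf, tmp, sizeof tmp);
    }
}
```

With a buffer {2, 1, 4, 3}, one step gives cell zero the value 3 plus 1 = 4, matching the example above.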
So now, what are some things that can go wrong when you try to parallelize this? How would you actually parallelize this code? Well, we saw in some of your labs, for example, that you can take a big array, split it up into smaller chunks, and assign each chunk to one particular processor. So we can use that technique here. Each processor -- we have n of them, or rather, size of them -- is going to get some number of elements. At each time step we compute all of the local computations, but then there are some special cases that we need to treat at the boundaries. If I have this chunk and I need to do my neighbor communication, I don't have these particular cells; I have to go out there and request them. Similarly, somebody has to send me this particular data item. So there's some data exchange that has to happen.

So in the decomposition, you write your parallel code. Here, each buffer is a different size. You have local, which says how much of the data I'm getting, and there's the total number of elements, besides the number of processors. Local essentially tells me the size of my chunk. I'm iterating from zero to local, and I'm doing essentially the same computation. So what's the bug in here? Anybody see it? I'm giving you a hint: there's something wrong with the things highlighted in red.

There's another hint. This is essentially the computation going on at every processor: this is my buffer, and at every step I have to do the calculations, taking care of the boundary edges.

Anybody want to take a stab? Mark?

AUDIENCE: Is it that nextbuffer zero needs to look at data from 1?

PROFESSOR: Next buffer zero, right. So what might be a fix to that? So the next buffer is zero. If this is zero, then buffer of x minus 1 points to what?

AUDIENCE: So you need to start at 1 and iterate.

PROFESSOR: Right, exactly. It's 1 to local plus 1, if you were going to do this. So that's one bug. The other thing is the assumption that your number of data elements might be divisible by the number of processors that you have.
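Both fixes can be sketched in plain C (my own reconstruction with assumed names, not the slide's code): the interior loop starts at 1 and runs through local, assuming a ghost-cell layout, and the chunk size is computed per rank rather than assumed uniform.

```c
/* Ghost-cell layout: each processor stores its `local` interior
   cells at indices 1..local, while buffer[0] and buffer[local+1]
   hold copies of the neighbors' boundary values, filled in by
   communication each step. */
void local_step(const int *buffer, int *nextbuffer, int local) {
    /* Iterate 1..local, never from 0: starting at 0 would read
       buffer[-1], which is the bug discussed above. */
    for (int x = 1; x <= local; x++)
        nextbuffer[x] = buffer[x - 1] + buffer[x + 1];
}

/* Chunk size when n elements are split over `size` processors:
   the first n % size ranks take one extra element, so this works
   even when n is not divisible by the processor count. */
int chunk_size(int n, int size, int rank) {
    return n / size + (rank < n % size ? 1 : 0);
}
```

For example, 10 elements over 4 processors gives chunks of 3, 3, 2, 2 -- an asymmetric decomposition, which is exactly the case a divisibility assumption silently breaks.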
So you might pick a decomposition that is not symmetric across all processors. It's a more subtle thing, I think, to keep in mind. So that's one particular kind of problem that might come up when you're decomposing data and replicating it among different processors: you have to be careful about what your boundary cases are going to be and how you're going to deal with them.

The more difficult one is synchronization. Synchronization problems come up when you're sending data from one processor to the other, and you might end up with deadlock, because one is trying to send, the other's trying to send, and neither can make progress until the other has received. So your program hangs, or you get non-deterministic behavior or output -- every time you run your program you get a different result -- and that can drive you crazy. Some of these defects can be very subtle. This is probably where you'll spend most of your time trying to figure things out. One of the ways to prevent this is to look at how you're orchestrating your communication and to do it very carefully. So look at, for example, what's going on here.
So this is the same problem, and now this is the parallel version: I'm sending the boundary cases, the boundary cells, to the different processors. This is an SPMD program. An SPMD program has every processor running essentially the same code, so this code is replicated over n processors and everybody's trying to do the same thing. So what's the problem with this code? We're doing a send of next buffer zero. Here, rank essentially just says each processor has a rank; it's a way of identifying things. So I'm trying to send to the previous guy, I'm trying to send to the next guy, and here I'm sending the value at the far extreme of the buffer to the next processor and then to the previous processor. Anybody see what's wrong here?

AUDIENCE: So are these blocking sends?

PROFESSOR: Yeah, imagine they're blocking sends.

AUDIENCE: Then won't that deadlock?

PROFESSOR: Right. So this will deadlock. This will deadlock because this processor is trying to send here, and this processor is trying to send here, but neither is receiving yet, so neither makes progress.
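The cycle can be seen in a toy model (my own sketch, not the slide's code, and simplified to sends in one direction only): give each rank a first operation, 'S' for a blocking send to its right neighbor or 'R' for the matching receive, and ask whether any send has a partner that is already receiving.

```c
#include <string.h>

/* Under rendezvous (blocking) semantics, a send to rank s+1 can
   complete only if that rank is currently receiving. If every
   rank's first operation is a send, no send is matched and the
   ring deadlocks. Returns 1 if some rank can make progress,
   0 if the first operations are stuck. */
int can_progress(const char *first_ops) {
    int size = (int)strlen(first_ops);
    for (int s = 0; s < size; s++)
        if (first_ops[s] == 'S' && first_ops[(s + 1) % size] == 'R')
            return 1;
    return 0;
}
```

Everyone sending first ("SSSS") makes no progress; alternating sends and receives ("SRSR") does.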
So how would you fix it at this point? You might not want to use a blocking send all the time. If your architecture allows you different flavors of communication -- synchronous versus asynchronous, blocking versus non-blocking -- you'll want to avoid using constructs that can lead you to deadlock if you don't need to. The other mechanism -- this was pointed out briefly in the talk on parallel programming -- is to order your sends and receives properly. So alternate them: you have a send on one processor and a receive on the other. You can use that to prevent deadlock and get the communication patterns right. There could be more interesting cases that come up if you're communicating over a network, where you might end up with cyclic patterns leading to loops, and that also can create some problems for you.

The last two I'll talk about aren't really bugs, in that they might not cause your program to break or compute incorrectly. Things might work properly, but you might not get the actual performance that you're expecting. So these are performance bugs, or performance defects.
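The alternating idea can be sketched without real message passing (a simulation of the scheduling argument, with assumed names; this is not MPI code): even ranks send first and then receive, odd ranks receive first and then send, so every blocking send meets a neighbor that is already waiting.

```c
/* Simulate one rightward value shift around a ring of `size`
   processors under rendezvous sends, assuming `size` is even so
   the parity pairing works. Phase 1: even ranks send to rank+1
   while odd ranks receive; every send has a waiting partner.
   Phase 2: the roles swap. Returns 1 when the shift completes. */
int ring_shift(const int *in, int *out, int size) {
    for (int r = 0; r < size; r += 2)   /* even ranks send */
        out[(r + 1) % size] = in[r];
    for (int r = 1; r < size; r += 2)   /* odd ranks send */
        out[(r + 1) % size] = in[r];
    return 1; /* every send found a matching receive */
}
```

In real MPI the same effect comes from `MPI_Sendrecv`, or from non-blocking `MPI_Isend`/`MPI_Irecv` followed by a wait.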
413 00:19:01,660 --> 00:19:04,940 So with side effects of parallelization, it's often the case 414 00:19:04,940 --> 00:19:07,560 that you're focusing on your parallel code and you might 415 00:19:07,560 --> 00:19:09,820 ignore things that are going on in your sequential code, 416 00:19:09,820 --> 00:19:12,060 and that might mean that, essentially, you've spent all 417 00:19:12,060 --> 00:19:14,720 this time trying to parallelize your code, but 418 00:19:14,720 --> 00:19:16,760 your end result is not getting the performance that you 419 00:19:16,760 --> 00:19:19,030 expect because things look sequential. 420 00:19:19,030 --> 00:19:21,120 So what's wrong here? 421 00:19:21,120 --> 00:19:26,240 So as an example, imagine that instead of reading 422 00:19:26,240 --> 00:19:30,160 data from a -- 423 00:19:30,160 --> 00:19:32,020 so, in the previous case I didn't show you how we were 424 00:19:32,020 --> 00:19:34,720 reading data into the different buffers, but suppose 425 00:19:34,720 --> 00:19:37,240 we were getting it from some files, so an input buffer. 426 00:19:37,240 --> 00:19:39,750 So now we have an SPMD program again, everybody's trying to 427 00:19:39,750 --> 00:19:41,010 read from this buffer. 428 00:19:41,010 --> 00:19:42,930 What could go wrong here? 429 00:19:42,930 --> 00:19:45,150 Anybody have an idea? 430 00:19:45,150 --> 00:19:48,050 So every processor is opening the file and then it's going 431 00:19:48,050 --> 00:19:50,650 to figure out how much to skip and it'll start reading from 432 00:19:50,650 --> 00:19:51,245 that location. 433 00:19:51,245 --> 00:19:53,830 So everybody's reading from a file, so that's OK, nobody's 434 00:19:53,830 --> 00:19:54,510 modifying it. 435 00:19:54,510 --> 00:19:55,950 But what can go wrong here? 436 00:19:55,950 --> 00:19:59,887 AUDIENCE: [INAUDIBLE PHRASE]. 437 00:20:07,270 --> 00:20:09,760 PROFESSOR: Right.
438 00:20:09,760 --> 00:20:13,010 So essentially, this will sequentialize your execution because reading 439 00:20:13,010 --> 00:20:16,770 from the file system becomes the bottleneck. 440 00:20:16,770 --> 00:20:19,550 So you'll want to schedule input and output carefully. 441 00:20:19,550 --> 00:20:21,680 You might find that not everybody needs to do the 442 00:20:21,680 --> 00:20:22,860 input and output. 443 00:20:22,860 --> 00:20:26,520 Only one processor has to do the input and then it can 444 00:20:26,520 --> 00:20:28,370 distribute it to all the different processors. 445 00:20:28,370 --> 00:20:32,390 So, in the Master/Slave model, which a lot of you are using 446 00:20:32,390 --> 00:20:36,550 for the Cell programming, the Master can just read the data 447 00:20:36,550 --> 00:20:37,810 from the input files and distribute it 448 00:20:37,810 --> 00:20:38,633 to everybody else. 449 00:20:38,633 --> 00:20:39,900 So this avoids some of the problems 450 00:20:39,900 --> 00:20:41,740 with input and output. 451 00:20:41,740 --> 00:20:45,150 You can have similar kinds of problems if you're reading 452 00:20:45,150 --> 00:20:46,130 from other devices. 453 00:20:46,130 --> 00:20:48,740 It doesn't have to be the file system. 454 00:20:48,740 --> 00:20:52,520 So here's another one, a little more subtle. 455 00:20:52,520 --> 00:20:55,810 So you're generating data--. 456 00:20:55,810 --> 00:20:57,196 Hey, Allen, what's up? 457 00:20:57,196 --> 00:20:59,965 AUDIENCE: I somehow missed the distinction between when 458 00:20:59,965 --> 00:21:02,231 you're waiting for the master to read all the data and 459 00:21:02,231 --> 00:21:04,748 distribute it, and waiting for the other [? processes ?] to 460 00:21:04,748 --> 00:21:07,600 get through so I can read my private data, isn't it going 461 00:21:07,600 --> 00:21:10,910 to be about the same time on this? 462 00:21:10,910 --> 00:21:11,090 PROFESSOR: No.
463 00:21:11,090 --> 00:21:15,130 So here, just essentially, the Master reads the file as part 464 00:21:15,130 --> 00:21:17,410 of the initialization. 465 00:21:17,410 --> 00:21:18,050 Then you distribute it. 466 00:21:18,050 --> 00:21:19,770 So distribution can happen at run time. 467 00:21:19,770 --> 00:21:23,250 So, the initialization you don't care about because 468 00:21:23,250 --> 00:21:25,080 hopefully that's a small part of the code. 469 00:21:28,680 --> 00:21:31,550 So this code is guarded by rank equals Master, so only it 470 00:21:31,550 --> 00:21:32,270 does this code. 471 00:21:32,270 --> 00:21:35,200 Then here you might have the command that says wait until 472 00:21:35,200 --> 00:21:38,680 I've received it and then execute, or on Cell, 473 00:21:38,680 --> 00:21:41,270 these might be the SPE thread creations that happen after 474 00:21:41,270 --> 00:21:42,340 you've read the data. 475 00:21:42,340 --> 00:21:45,200 So hopefully, initialization time is not something you have 476 00:21:45,200 --> 00:21:46,450 to be concerned about too much. 477 00:21:48,830 --> 00:21:52,930 So if you're generating data on the fly or dynamically, 478 00:21:52,930 --> 00:21:56,250 here we might use the srand function to sort of start with 479 00:21:56,250 --> 00:21:58,830 a random seed and then fill in the buffer 480 00:21:58,830 --> 00:22:00,330 with some random data. 481 00:22:00,330 --> 00:22:01,580 So what could go wrong here? 482 00:22:04,250 --> 00:22:10,070 So with srand, when you're using a random function -- sorry, 483 00:22:10,070 --> 00:22:12,340 this is the same function. 484 00:22:12,340 --> 00:22:14,640 When you're using a random, a pseudo random number 485 00:22:14,640 --> 00:22:17,390 generator, you have to give it a seed, and if everybody 486 00:22:17,390 --> 00:22:20,060 starts off with the same seed, then you might end up with the 487 00:22:20,060 --> 00:22:22,640 same random number sequence.
488 00:22:22,640 --> 00:22:25,960 If that's something you're using to parallelize your 489 00:22:25,960 --> 00:22:28,460 computation, you might, in effect, end up with the same 490 00:22:28,460 --> 00:22:31,820 kind of sequence on each processor and you lose the 491 00:22:31,820 --> 00:22:34,250 benefit of the parallelization. 492 00:22:34,250 --> 00:22:36,860 So there are some hidden serialization issues in some 493 00:22:36,860 --> 00:22:38,940 of the functions that you might use that you 494 00:22:38,940 --> 00:22:40,190 should be aware of. 495 00:22:42,350 --> 00:22:44,430 The last one I'll talk about is the 496 00:22:44,430 --> 00:22:46,570 performance scalability defect. 497 00:22:46,570 --> 00:22:50,170 So here you parallelize your code, things look good, but 498 00:22:50,170 --> 00:22:51,300 you're still not getting -- 499 00:22:51,300 --> 00:22:54,030 you've taken care of all your IO issues, you're still not 500 00:22:54,030 --> 00:22:55,370 getting the performance you want. 501 00:22:55,370 --> 00:22:57,710 So, why is that? 502 00:22:57,710 --> 00:23:01,430 You might have -- remember Amdahl's law, and what 503 00:23:01,430 --> 00:23:03,580 you want is an efficiency that's linear. 504 00:23:03,580 --> 00:23:08,500 Every time you add one processor you want a straight 505 00:23:08,500 --> 00:23:11,410 line curve between the number of processors and speedup. 506 00:23:11,410 --> 00:23:13,490 This should be a linear relationship. 507 00:23:13,490 --> 00:23:16,440 So you might see sublinear speedups, and you want to 508 00:23:16,440 --> 00:23:17,980 figure out why that is. 509 00:23:17,980 --> 00:23:20,680 One of the common causes here, and this will be the 510 00:23:20,680 --> 00:23:23,280 focus of the next talk, is an unbalanced amount of 511 00:23:23,280 --> 00:23:24,100 computation. 512 00:23:24,100 --> 00:23:25,420 Remember, dynamic load balancing 513 00:23:25,420 --> 00:23:27,050 versus static load balancing.
514 00:23:27,050 --> 00:23:29,210 Your work estimation might be wrong and so you might end up 515 00:23:29,210 --> 00:23:32,660 with some processors idling, other processors 516 00:23:32,660 --> 00:23:34,940 doing too much work. 517 00:23:34,940 --> 00:23:37,280 So the way to prevent this is to actually look at the work 518 00:23:37,280 --> 00:23:40,370 that's being done and figure out whether it's actually 519 00:23:40,370 --> 00:23:42,380 roughly the same amount of work everywhere. 520 00:23:42,380 --> 00:23:45,040 Here you might need profiling tools to help, and so I'm 521 00:23:45,040 --> 00:23:46,250 going to talk about this in a lot more 522 00:23:46,250 --> 00:23:49,930 detail in the next lecture. 523 00:23:49,930 --> 00:23:53,800 So in summary, there are lots of different bugs that you 524 00:23:53,800 --> 00:23:56,030 might come up with. 525 00:23:56,030 --> 00:23:59,490 There are a few that I've identified here, some common 526 00:23:59,490 --> 00:24:01,070 things you should look out for. 527 00:24:01,070 --> 00:24:03,520 So for erroneous use of language features -- understand 528 00:24:03,520 --> 00:24:06,790 only a few basic concepts of the entire language extension 529 00:24:06,790 --> 00:24:10,270 set that you have. Space decomposition, side effects 530 00:24:10,270 --> 00:24:11,540 from parallelization. 531 00:24:11,540 --> 00:24:14,330 Don't ignore sequential code. 532 00:24:14,330 --> 00:24:16,430 The last one is trying to understand your performance 533 00:24:16,430 --> 00:24:17,390 scalability. 534 00:24:17,390 --> 00:24:18,760 But there are other kinds of bugs, like 535 00:24:18,760 --> 00:24:19,990 data races, for example. 536 00:24:19,990 --> 00:24:22,200 So what can you do with those? 537 00:24:22,200 --> 00:24:24,800 So remember, with data races you have different concurrent 538 00:24:24,800 --> 00:24:25,990 threads and they're trying to update 539 00:24:25,990 --> 00:24:28,010 the same memory location.
540 00:24:28,010 --> 00:24:30,400 So depending on who gets to write first and when you 541 00:24:30,400 --> 00:24:34,540 actually do your read, you might get a different result. 542 00:24:34,540 --> 00:24:37,880 So with data race detection, these things are actually 543 00:24:37,880 --> 00:24:38,840 getting better. 544 00:24:38,840 --> 00:24:41,140 There are tools out there that will essentially generate 545 00:24:41,140 --> 00:24:43,530 traces as your program is running. 546 00:24:43,530 --> 00:24:46,140 So each thread is instrumented and you look at 547 00:24:46,140 --> 00:24:48,120 every load and store it executes. 548 00:24:48,120 --> 00:24:51,040 Then what you do is you look at the loads and stores between 549 00:24:51,040 --> 00:24:53,410 the different threads and see if there are any intersections, 550 00:24:53,410 --> 00:24:57,600 any orderings that might give you erroneous behavior. 551 00:24:57,600 --> 00:25:00,770 So this is getting better, it's getting more automated. 552 00:25:00,770 --> 00:25:02,610 Intel Thread Checker is one example. 553 00:25:02,610 --> 00:25:05,310 There are others. 554 00:25:05,310 --> 00:25:07,670 I really think the trend in debugging will be towards 555 00:25:07,670 --> 00:25:11,720 trace-based systems. You'll have things like 556 00:25:11,720 --> 00:25:12,430 checkpointing. 557 00:25:12,430 --> 00:25:15,550 So as your program is running you can take a snapshot of 558 00:25:15,550 --> 00:25:17,830 where it is in the execution, and then you can use that 559 00:25:17,830 --> 00:25:21,390 snapshot later on to inspect it and see what went wrong. 560 00:25:21,390 --> 00:25:23,380 I think you might even have features like replay. 561 00:25:23,380 --> 00:25:26,350 In fact, some people are working on this in research 562 00:25:26,350 --> 00:25:28,120 and in industry. 563 00:25:28,120 --> 00:25:31,000 So you might be able to say uh-oh, something went wrong.
564 00:25:31,000 --> 00:25:33,840 Here's my list of checkpoints, can you replay the execution 565 00:25:33,840 --> 00:25:36,360 from this particular stage in the computation. 566 00:25:36,360 --> 00:25:40,540 So it helps you focus down in the entire lifetime of 567 00:25:40,540 --> 00:25:42,320 execution on a particular chunk where 568 00:25:42,320 --> 00:25:45,710 things have gone wrong. 569 00:25:45,710 --> 00:25:48,900 This is sort of a personal dream. 570 00:25:48,900 --> 00:25:51,300 I think one day we'll have the equivalent of a TiVo for your 571 00:25:51,300 --> 00:25:54,350 programs, and you can use it for debugging. 572 00:25:54,350 --> 00:25:56,470 So my program is running, something goes wrong, I can 573 00:25:56,470 --> 00:25:58,970 rewind it, I can inspect things, do my traditional 574 00:25:58,970 --> 00:26:02,700 debugging, change things maybe even, and then start replaying 575 00:26:02,700 --> 00:26:07,580 things and letting the program execute. 576 00:26:07,580 --> 00:26:10,320 In fact, we're working on things like this here at MIT 577 00:26:10,320 --> 00:26:12,140 and with collaborators elsewhere. 578 00:26:12,140 --> 00:26:15,890 So, this was a short lecture. 579 00:26:15,890 --> 00:26:16,530 We'll take a break. 580 00:26:16,530 --> 00:26:18,240 You can do the quizzes. 581 00:26:18,240 --> 00:26:19,410 Note on the quizzes, there are two 582 00:26:19,410 --> 00:26:20,720 different kinds of questions. 583 00:26:20,720 --> 00:26:24,570 They're very similar, just one word is different, and so 584 00:26:24,570 --> 00:26:26,720 you'll want to just keep that in mind when you're discussing 585 00:26:26,720 --> 00:26:28,310 it with others. 586 00:26:28,310 --> 00:26:31,320 Then about 5, 10 minutes and then we'll continue with the 587 00:26:31,320 --> 00:26:33,710 rest of the talk, lecture 2. 588 00:26:33,710 --> 00:26:34,960 Thanks.