1 00:00:00,030 --> 00:00:02,420 The following content is provided under a Creative 2 00:00:02,420 --> 00:00:03,850 Commons license. 3 00:00:03,850 --> 00:00:06,860 Your support will help MIT OpenCourseWare continue to 4 00:00:06,860 --> 00:00:08,660 offer high quality educational 5 00:00:08,660 --> 00:00:10,550 resources for free. 6 00:00:10,550 --> 00:00:13,410 To make a donation, or view additional materials from 7 00:00:13,410 --> 00:00:15,960 hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. 8 00:00:19,940 --> 00:00:22,000 PROFESSOR: OK. 9 00:00:22,000 --> 00:00:26,780 So today is the final lecture, and I'm going to talk to you about 10 00:00:26,780 --> 00:00:29,220 the future. 11 00:00:29,220 --> 00:00:31,240 If I can figure out how to get the slides going. 12 00:00:36,450 --> 00:00:40,170 Predicting the future is never a simple thing. 13 00:00:40,170 --> 00:00:43,500 And especially in this day and age -- things are 14 00:00:43,500 --> 00:00:45,960 appearing on the internet, so this is getting recorded and put 15 00:00:45,960 --> 00:00:46,740 on the internet. 16 00:00:46,740 --> 00:00:49,790 And I'm going to make a few comments that in five years' 17 00:00:49,790 --> 00:00:53,220 time, I'm going to regret. 18 00:00:53,220 --> 00:00:56,010 What I'll try to do is have a systematic process 19 00:00:56,010 --> 00:00:57,160 through that. 20 00:00:57,160 --> 00:00:59,110 Because I know you guys are great hackers. 21 00:00:59,110 --> 00:01:00,200 You guys are great programmers. 22 00:01:00,200 --> 00:01:02,300 Hand you any problem, you can solve it. 23 00:01:02,300 --> 00:01:06,250 But as researchers, what's hard, and what's very 24 00:01:06,250 --> 00:01:09,350 difficult, is to figure out which problem to solve. 25 00:01:09,350 --> 00:01:15,100 Because as Roderick pointed out, when we started on Raw in 26 00:01:15,100 --> 00:01:20,040 about '97, there was a raging debate about what to build.
27 00:01:20,040 --> 00:01:22,550 And there's a group of people who said we can build bigger 28 00:01:22,550 --> 00:01:23,730 and bigger superscalars. 29 00:01:23,730 --> 00:01:29,540 In fact, there was this article, a very seminal 30 00:01:29,540 --> 00:01:34,170 proceedings that came out, that basically asked many 31 00:01:34,170 --> 00:01:36,880 people to contribute, saying what do they think a 32 00:01:36,880 --> 00:01:39,810 billion-transistor processor will look like? 33 00:01:39,810 --> 00:01:43,410 And about 2/3 of them said, we would have this gigantic, 34 00:01:43,410 --> 00:01:44,900 fabulous superscalar. 35 00:01:44,900 --> 00:01:51,470 And about three out of about ten, I think about three 36 00:01:51,470 --> 00:01:54,600 groups said that, OK, it would be more like a tiled 37 00:01:54,600 --> 00:01:55,340 architecture. 38 00:01:55,340 --> 00:01:57,530 The Stanford people and us were the main 39 00:01:57,530 --> 00:01:58,880 people who looked at that. 40 00:01:58,880 --> 00:02:02,640 So even that short a time ago, 10 years ago, it was not clear 41 00:02:02,640 --> 00:02:04,950 what was going to happen. 42 00:02:04,950 --> 00:02:07,740 And so we took a stance and spent 10 43 00:02:07,740 --> 00:02:09,490 years working on that. 44 00:02:09,490 --> 00:02:11,800 We were lucky, more than anything else. Maybe we had a 45 00:02:11,800 --> 00:02:14,350 little bit better insight, hopefully, but also mostly a 46 00:02:14,350 --> 00:02:15,760 larger dose of luck got us here. 47 00:02:15,760 --> 00:02:19,280 So part of that is how do you go about doing this? 48 00:02:19,280 --> 00:02:23,410 So the future is a combination of evolution and revolutions. 49 00:02:23,410 --> 00:02:26,210 Evolution is somewhat easy to predict. 50 00:02:26,210 --> 00:02:28,510 Basically, what you can look at is, you can look at the 51 00:02:28,510 --> 00:02:30,860 trends going on and you can extrapolate that trend. 52 00:02:30,860 --> 00:02:33,220 And you can say, if this trend continues, what will happen?
53 00:02:33,220 --> 00:02:35,190 If this trend continues, what will happen? 54 00:02:35,190 --> 00:02:39,320 It's very interesting in computer science because one 55 00:02:39,320 --> 00:02:44,100 thing I do in 6.033 is look at the rate of evolution that 56 00:02:44,100 --> 00:02:46,600 happens in computer science, and try to predict it with 57 00:02:46,600 --> 00:02:50,490 things like the rate of [OBSCURED] 58 00:02:50,490 --> 00:02:53,420 and stuff like that and you get ridiculous numbers. 59 00:02:53,420 --> 00:02:58,330 So even if it is evolution, in computer science it's very 60 00:02:58,330 --> 00:02:59,230 close to revolution. 61 00:02:59,230 --> 00:03:02,590 Because every 18 months things double. 62 00:03:02,590 --> 00:03:04,520 That's almost unheard of. 63 00:03:04,520 --> 00:03:07,230 In a lot of areas, every 18 months you get 1% improvement 64 00:03:07,230 --> 00:03:08,160 and people are very happy. 65 00:03:08,160 --> 00:03:10,200 This is doubling. 66 00:03:10,200 --> 00:03:11,930 We have that. 67 00:03:11,930 --> 00:03:13,050 The other part is revolutions. 68 00:03:13,050 --> 00:03:16,710 When some completely new technological solution just 69 00:03:16,710 --> 00:03:18,120 comes about and completely changes the world. 70 00:03:18,120 --> 00:03:19,810 Changes how things happen. 71 00:03:19,810 --> 00:03:21,460 These are a lot harder to predict. 72 00:03:24,590 --> 00:03:26,730 But still it's critically important because some of 73 00:03:26,730 --> 00:03:28,520 these things can have huge impact. 74 00:03:28,520 --> 00:03:32,630 And so we will focus more on evolution and if we have time at 75 00:03:32,630 --> 00:03:34,700 the end we can see if we have some interesting 76 00:03:34,700 --> 00:03:37,310 revolutionary ideas. 77 00:03:37,310 --> 00:03:39,220 Paradigm shifts occur in both. 78 00:03:39,220 --> 00:03:40,930 You don't have to wait for a revolution to 79 00:03:40,930 --> 00:03:42,100 have something different.
80 00:03:42,100 --> 00:03:46,150 Because things like the 2x improvement type of things that 81 00:03:46,150 --> 00:03:49,120 happen actually bring paradigm shifts on a very 82 00:03:49,120 --> 00:03:50,070 regular basis. 83 00:03:50,070 --> 00:03:52,210 So it's not 1% we are getting. 84 00:03:52,210 --> 00:03:55,150 We are getting huge evolutionary changes. 85 00:03:55,150 --> 00:03:58,710 And that can keep changing things. 86 00:03:58,710 --> 00:04:00,620 So what I'm going to talk to you about is a 87 00:04:00,620 --> 00:04:01,510 little bit of trends. 88 00:04:01,510 --> 00:04:02,700 Some of it is repetitive. 89 00:04:02,700 --> 00:04:04,480 I spent some time in the first lecture 90 00:04:04,480 --> 00:04:05,640 talking about these trends. 91 00:04:05,640 --> 00:04:07,420 And then we'll look at two different parts. 92 00:04:07,420 --> 00:04:10,450 One is the architecture side, what can happen next. 93 00:04:10,450 --> 00:04:13,000 And the languages, compilers and tools side. 94 00:04:13,000 --> 00:04:15,620 We'll spend a little bit of time on revolution, and just 95 00:04:15,620 --> 00:04:17,750 see if we can have some group brainstorming. 96 00:04:17,750 --> 00:04:22,190 And then I have this little bit of my own view of what 97 00:04:22,190 --> 00:04:25,150 people should be doing, at a very broad level, and I'll finish 98 00:04:25,150 --> 00:04:26,400 off with that. 99 00:04:28,800 --> 00:04:31,870 So look at trends. 100 00:04:31,870 --> 00:04:33,500 I think there are a lot of interesting trends 101 00:04:33,500 --> 00:04:34,290 that we can look at. 102 00:04:34,290 --> 00:04:38,980 Moore's Law, which is a very trusted trend that has kept going 103 00:04:38,980 --> 00:04:40,000 for a very long time. 104 00:04:40,000 --> 00:04:43,330 Power consumption, wire delay, hardware complexity, 105 00:04:43,330 --> 00:04:46,770 parallelizing compilers, program design methodologies. 106 00:04:46,770 --> 00:04:48,940 So you have all these trends here to look at.
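The contrast drawn above, doubling every 18 months against a 1% gain every 18 months, can be put into numbers. This is only a sketch and not something shown in the lecture: the function name `growth_factor` and the 10-year horizon are assumptions chosen for illustration.

```python
def growth_factor(years, rate_per_period, period_months=18):
    """Compound growth after `years`, multiplying by `rate_per_period` once per period."""
    periods = years * 12 / period_months
    return rate_per_period ** periods

# Doubling every 18 months versus a 1% gain every 18 months, over 10 years.
doubling = growth_factor(10, 2.0)
one_percent = growth_factor(10, 1.01)
print(f"doubling: about {doubling:.0f}x, 1% steps: about {one_percent:.2f}x")
```

Over ten years the doubling trend compounds to roughly a hundredfold, while the 1% trend gains only about 7%. That gap is why an "evolutionary" trend in this field can still force paradigm shifts.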
107 00:04:48,940 --> 00:04:53,300 I have this picture, the graph on the right, which shows what 108 00:04:53,300 --> 00:04:58,870 happens at every generation; assume this is a generation. 109 00:04:58,870 --> 00:05:02,640 You concentrate on the hardest problem, what's the most 110 00:05:02,640 --> 00:05:04,190 taxing thing? 111 00:05:04,190 --> 00:05:08,160 And what happens is as you go, at this time, something like 112 00:05:08,160 --> 00:05:10,310 this might be the most taxing, and then in that generation, that's 113 00:05:10,310 --> 00:05:12,190 where all the design is going on. 114 00:05:12,190 --> 00:05:15,620 Then some slowly growing things start catching up, and 115 00:05:15,620 --> 00:05:18,160 then, after some time, that becomes the most important 116 00:05:18,160 --> 00:05:19,330 thing in the design. 117 00:05:19,330 --> 00:05:22,430 So it's a paradigm shift, because you move the focus onto 118 00:05:22,430 --> 00:05:24,020 that, that becomes the important thing. 119 00:05:24,020 --> 00:05:28,040 And then some fast moving things start catching up and 120 00:05:28,040 --> 00:05:31,280 accelerate, and that becomes our most taxing thing. 121 00:05:31,280 --> 00:05:34,140 And then this thing, again, and the different curves start 122 00:05:34,140 --> 00:05:34,810 catching up again. 123 00:05:34,810 --> 00:05:38,680 So what happens is you have this cyclic phenomenon, too. 124 00:05:38,680 --> 00:05:41,360 But also you have the discrete changes. So you might get into a 125 00:05:41,360 --> 00:05:43,330 revolutionary thing, and really bring something down. 126 00:05:43,330 --> 00:05:48,680 For example, we went from very power-hungry 127 00:05:48,680 --> 00:05:51,350 bipolar devices to CMOS. 128 00:05:51,350 --> 00:05:53,890 And suddenly everything shifted down a lot. 129 00:05:53,890 --> 00:05:56,240 And then we are back again, we are in those 130 00:05:56,240 --> 00:05:57,440 power-hungry days again.
131 00:05:57,440 --> 00:05:58,780 So that was kind of a revolutionary 132 00:05:58,780 --> 00:06:01,410 change and then you get evolution. 133 00:06:01,410 --> 00:06:05,060 If you want to do work on something, it's no fun working 134 00:06:05,060 --> 00:06:08,650 on something that's very important, hot today. 135 00:06:08,650 --> 00:06:10,780 The people who are making the biggest breakthroughs on the 136 00:06:10,780 --> 00:06:14,380 things that are hot today started 10 years ago. 137 00:06:14,380 --> 00:06:15,880 So you finally figure out that this is the most 138 00:06:15,880 --> 00:06:17,760 important technology today, I'm going to start working on it, 139 00:06:17,760 --> 00:06:21,730 doing my Ph.D. By the time you are done, it'll be passe, with most 140 00:06:21,730 --> 00:06:24,030 of these cyclic things. 141 00:06:24,030 --> 00:06:27,550 So the key thing is, what is going to be hot in 10 years? 142 00:06:27,550 --> 00:06:29,410 That's a hard problem. 143 00:06:29,410 --> 00:06:31,550 So this is the thing: you can look at these trends, and say, 144 00:06:31,550 --> 00:06:33,020 look, these are the kind of trends. 145 00:06:33,020 --> 00:06:37,650 This can give you some idea of what might be 146 00:06:37,650 --> 00:06:38,590 interesting. 147 00:06:38,590 --> 00:06:40,660 So are you going to become a low power person? 148 00:06:40,660 --> 00:06:42,795 For example, six, seven years ago, people who studied low 149 00:06:42,795 --> 00:06:45,080 power, they made a lot of significant contributions. 150 00:06:45,080 --> 00:06:47,420 People now still can contribute. 151 00:06:47,420 --> 00:06:51,520 But people who got in early had an easier time, and they 152 00:06:51,520 --> 00:06:55,410 had a lot more impact than people who got in late. 153 00:06:55,410 --> 00:06:58,960 So we look at some of these trends, and see what we can 154 00:06:58,960 --> 00:07:00,660 come up with.
155 00:07:00,660 --> 00:07:03,730 The first trend, this is a very trusted trend so far, is 156 00:07:03,730 --> 00:07:05,950 Moore's Law. 157 00:07:05,950 --> 00:07:09,070 Basically, most of our basis for things happening is part 158 00:07:09,070 --> 00:07:10,560 of Moore's Law. 159 00:07:10,560 --> 00:07:12,675 Basically it says, you get twice the number of 160 00:07:12,675 --> 00:07:14,570 transistors every 18 months. 161 00:07:14,570 --> 00:07:16,080 We have had that trend for a very long time. 162 00:07:19,560 --> 00:07:22,440 What's the impact of that? 163 00:07:22,440 --> 00:07:26,390 Nobody cares how many transistors a processor has, 164 00:07:26,390 --> 00:07:28,410 it's kind of closed, you don't even know how many 165 00:07:28,410 --> 00:07:29,780 transistors are in the processor. 166 00:07:29,780 --> 00:07:31,570 What matters is what it can deliver. 167 00:07:31,570 --> 00:07:32,560 So performance is what's 168 00:07:32,560 --> 00:07:33,350 delivered. 169 00:07:33,350 --> 00:07:36,840 So what people have realized is if you are in the 170 00:07:36,840 --> 00:07:40,860 superscalar world, this has been flattening out recently. 171 00:07:40,860 --> 00:07:43,260 If you look at it, this is kind of getting flat and that 172 00:07:43,260 --> 00:07:44,310 was a big problem. 173 00:07:44,310 --> 00:07:50,580 So that's kind of the reason for the crossover from superscalar, 174 00:07:50,580 --> 00:07:51,700 I mean, to multicore. 175 00:07:51,700 --> 00:07:55,130 So we are trying to get back onto a higher growth 176 00:07:55,130 --> 00:07:57,820 curve in here. 177 00:07:57,820 --> 00:08:02,220 And then another trend is power. 178 00:08:02,220 --> 00:08:05,700 Power has been going in the wrong direction and also power 179 00:08:05,700 --> 00:08:07,990 density has been going in the wrong direction. 180 00:08:07,990 --> 00:08:10,720 This curve keeps growing more than this and then people have 181 00:08:10,720 --> 00:08:13,300 problems with that.
182 00:08:13,300 --> 00:08:16,490 Something that is really revealing is watts per SPEC 183 00:08:16,490 --> 00:08:17,390 performance. 184 00:08:17,390 --> 00:08:22,970 That means how much power you are consuming to do something. 185 00:08:22,970 --> 00:08:26,960 And what you see is, we are just starting to -- 186 00:08:26,960 --> 00:08:30,770 in those days we used very little power to get work done, and 187 00:08:30,770 --> 00:08:34,210 now we, what superscalars did was be more and more 188 00:08:34,210 --> 00:08:36,110 [OBSCURED]. 189 00:08:36,110 --> 00:08:41,520 And this is what, I think, was a big reason for going into 190 00:08:41,520 --> 00:08:41,750 multicores. 191 00:08:41,750 --> 00:08:45,900 Because we realized, look, we keep wasting and now the power 192 00:08:45,900 --> 00:08:47,740 consumption of computers is non-trivial. 193 00:08:47,740 --> 00:08:49,680 You could talk to Google. 194 00:08:49,680 --> 00:08:52,120 Their power costs, power budgets, are non-trivial. 195 00:08:52,120 --> 00:08:54,440 We can't just keep wasting that resource. 196 00:08:54,440 --> 00:08:57,370 You can keep wasting things when it is not that important, a 197 00:08:57,370 --> 00:08:59,860 smaller factor, but power becomes a big factor. 198 00:08:59,860 --> 00:09:02,040 So how do we get out of the wasteful mode? 199 00:09:02,040 --> 00:09:03,480 And I think that -- a lot of people are 200 00:09:03,480 --> 00:09:04,110 looking at that. 201 00:09:04,110 --> 00:09:05,650 If you talk to people five years ago, people 202 00:09:05,650 --> 00:09:05,910 say, this is free. 203 00:09:05,910 --> 00:09:08,060 Transistors are free. 204 00:09:08,060 --> 00:09:09,440 You can just keep doing things. 205 00:09:09,440 --> 00:09:11,380 Just keep them available, [OBSCURED] 206 00:09:11,380 --> 00:09:12,800 makes it free. 207 00:09:12,800 --> 00:09:15,830 But what people realize is, while transistors are free, 208 00:09:15,830 --> 00:09:18,120 switching them is not free.
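The point that transistors are free but switching them is not can be sketched with the standard first-order model of CMOS dynamic power, P = activity x C x V^2 x f. The lecture does not give this formula. It is the usual textbook approximation, and the chip parameters below are hypothetical values chosen only for illustration.

```python
def dynamic_power_watts(activity, capacitance_farads, voltage_volts, frequency_hz):
    """First-order CMOS switching power: activity * C * V^2 * f."""
    return activity * capacitance_farads * voltage_volts ** 2 * frequency_hz

# Hypothetical chip: 1 nF of total switched capacitance, 1.0 V supply, 1 GHz clock.
idle = dynamic_power_watts(0.0, 1e-9, 1.0, 1e9)   # transistors present but never switching
busy = dynamic_power_watts(0.5, 1e-9, 1.0, 1e9)   # half the capacitance switches each cycle
print(f"idle: {idle} W, busy: {busy} W")
```

In this model, transistors that sit idle cost nothing in switching power, while every extra speculative operation raises the activity factor and so the power. The model ignores leakage, which adds cost even when nothing switches.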
209 00:09:18,120 --> 00:09:20,880 And switching the register file, so power becomes an 210 00:09:20,880 --> 00:09:23,130 important issue. 211 00:09:23,130 --> 00:09:25,760 Another point I want to point out is this wire delay. 212 00:09:25,760 --> 00:09:29,710 As we keep switching them faster and faster, the world 213 00:09:29,710 --> 00:09:31,590 looks smaller and smaller and smaller, you 214 00:09:31,590 --> 00:09:34,180 can't make global decisions. 215 00:09:34,180 --> 00:09:35,780 What's the implication of that? 216 00:09:35,780 --> 00:09:37,650 And we're facing that, I think it's going to get 217 00:09:37,650 --> 00:09:39,240 a little bit worse. 218 00:09:39,240 --> 00:09:42,490 One saving grace for that is people realize they can't keep 219 00:09:42,490 --> 00:09:46,730 scaling the clock speeds that much. 220 00:09:46,730 --> 00:09:51,390 This chart I did a few years ago, it's the road map 221 00:09:51,390 --> 00:09:53,000 that the silicon people made of where they think 222 00:09:53,000 --> 00:09:54,360 the silicon is going. 223 00:09:54,360 --> 00:09:57,082 They thought they could get to 10 gigs, then 15 gigs, 224 00:09:57,082 --> 00:09:58,240 and stuff like that. 225 00:09:58,240 --> 00:09:58,970 And now they [OBSCURED], 226 00:09:58,970 --> 00:10:01,440 and go to all of these power requirements. 227 00:10:01,440 --> 00:10:04,050 They might not be able to switch that fast. So there are 228 00:10:04,050 --> 00:10:07,390 changes to the road map because of that. 229 00:10:07,390 --> 00:10:08,700 Another interesting thing is -- 230 00:10:14,720 --> 00:10:19,290 AUDIENCE: RC line type wire delay, as opposed to -- 231 00:10:19,290 --> 00:10:21,910 that's not speed of light delay, that's including 232 00:10:21,910 --> 00:10:26,190 scaling of the wires themselves? 233 00:10:26,190 --> 00:10:29,420 PROFESSOR: This is mainly speed of light delay. 234 00:10:29,420 --> 00:10:32,030 Because what you do is you just look at clock speeds.
235 00:10:32,030 --> 00:10:34,620 I haven't done too much with dielectrics 236 00:10:34,620 --> 00:10:37,090 and stuff like that. 237 00:10:37,090 --> 00:10:39,750 That's the thing, you can get closer to the speed of light 238 00:10:39,750 --> 00:10:44,440 but right now we are only about -- 239 00:10:44,440 --> 00:10:47,420 I mean we are at, I don't know, 80%, 240 00:10:47,420 --> 00:10:48,910 90% of the speed of light. 241 00:10:48,910 --> 00:10:53,560 So best case we can get 10% better. 242 00:10:53,560 --> 00:10:57,140 It doesn't give you that much room. 243 00:10:57,140 --> 00:11:00,103 And this is a continuous thing, because microprocessors 244 00:11:00,103 --> 00:11:03,890 are going much faster than DRAM. 245 00:11:03,890 --> 00:11:06,150 And this is a problem. 246 00:11:06,150 --> 00:11:09,660 There are people who say there might be revolutionary things 247 00:11:09,660 --> 00:11:10,210 happening here. 248 00:11:10,210 --> 00:11:17,500 Things like non-volatile RAM might come and kind of take 249 00:11:17,500 --> 00:11:19,980 over this area. 250 00:11:19,980 --> 00:11:23,650 What's happening today is DRAM now looks a lot more like what a 251 00:11:23,650 --> 00:11:26,410 disk looked like 10 years ago. 252 00:11:26,410 --> 00:11:29,030 In many ways. 253 00:11:29,030 --> 00:11:30,610 We have to treat DRAM like a disk. 254 00:11:30,610 --> 00:11:34,360 That's kind of the way we have of looking at the world. 255 00:11:34,360 --> 00:11:37,470 Here is where I look at compilers and parallelizing. 256 00:11:37,470 --> 00:11:38,990 There's no nice number here. 257 00:11:38,990 --> 00:11:41,750 So somewhere around the 1970s, people did things like 258 00:11:41,750 --> 00:11:42,790 vectorization -- 259 00:11:42,790 --> 00:11:43,080 question? 260 00:11:43,080 --> 00:11:48,460 AUDIENCE: I have a question about the last slide.
261 00:11:48,460 --> 00:11:50,960 When people do this plot of performance, it's not 262 00:11:50,960 --> 00:11:54,130 exactly performance or throughput that matters 263 00:11:54,130 --> 00:12:00,680 relative to DRAM, it is the latency, that is 264 00:12:00,680 --> 00:12:02,393 how fast from when I know I want something from memory to 265 00:12:02,393 --> 00:12:05,630 when I really, really need it, that matters. 266 00:12:05,630 --> 00:12:12,550 Is this going to change as we move to a wider architecture, 267 00:12:12,550 --> 00:12:16,550 with more parallelism, in the sense that it will no longer 268 00:12:16,550 --> 00:12:18,776 be, the latency gap won't be increasing anymore and we'll 269 00:12:18,776 --> 00:12:20,810 start to hear more about bandwidth? 270 00:12:20,810 --> 00:12:22,230 PROFESSOR: I don't know, because 271 00:12:22,230 --> 00:12:23,390 bandwidth is also power. 272 00:12:23,390 --> 00:12:25,553 These are a lot of interesting arguments, this is what we 273 00:12:25,553 --> 00:12:25,870 don't know. 274 00:12:25,870 --> 00:12:29,750 Because the thing is, what superscalar people did was, we 275 00:12:29,750 --> 00:12:34,230 reduced the problem with latency by predicting. 276 00:12:34,230 --> 00:12:35,420 And doing extra work. 277 00:12:35,420 --> 00:12:37,990 We will try as much as possible to predict what you 278 00:12:37,990 --> 00:12:40,170 need and do things in advance. 279 00:12:40,170 --> 00:12:42,430 What happens is as you keep predicting more and more, 280 00:12:42,430 --> 00:12:44,730 you're actually doing, your accuracy of 281 00:12:44,730 --> 00:12:46,080 predictions goes down. 282 00:12:46,080 --> 00:12:48,420 But you still keep increasing -- 283 00:12:48,420 --> 00:12:53,350 the more data you fetch, you at least get a small percentage increase in 284 00:12:53,350 --> 00:12:55,810 useful data. 285 00:12:55,810 --> 00:12:57,150 Affecting latency.
286 00:12:57,150 --> 00:12:59,130 But accuracy gets pretty low and it starts 287 00:12:59,130 --> 00:13:01,440 becoming not useful. 288 00:13:01,440 --> 00:13:03,910 So the bandwidth problem has an impact, too. 289 00:13:03,910 --> 00:13:06,210 So unless you're using the entire thing, you're 290 00:13:06,210 --> 00:13:07,910 predicting a huge amount of things, predicting that you 291 00:13:07,910 --> 00:13:09,130 might use it. 292 00:13:09,130 --> 00:13:12,580 So if the prediction is wrong, it's a lot of wasteful work. 293 00:13:12,580 --> 00:13:15,360 And the key thing is, can you pay for it in power? 294 00:13:15,360 --> 00:13:17,920 I think latency, at some point, if you can't really 295 00:13:17,920 --> 00:13:19,530 predict, it becomes hard. 296 00:13:19,530 --> 00:13:23,420 Unless you have very regular patterns that the predictions 297 00:13:23,420 --> 00:13:26,890 work on. 298 00:13:26,890 --> 00:13:29,040 If you look at what happened in compiling and programming 299 00:13:29,040 --> 00:13:29,850 technology. 300 00:13:29,850 --> 00:13:32,620 In the '70s, you got vectorization technology. 301 00:13:32,620 --> 00:13:33,500 It did well. 302 00:13:33,500 --> 00:13:36,050 And then some time in the '80s we did things like compiling for 303 00:13:36,050 --> 00:13:38,450 instruction level parallelism. 304 00:13:38,450 --> 00:13:40,340 That got a lot of parallelism in there, very low-level 305 00:13:40,340 --> 00:13:42,110 parallelism in there. 306 00:13:42,110 --> 00:13:44,550 And somewhere in the '90s we did automatic parallelization 307 00:13:44,550 --> 00:13:46,050 compilers for FORTRAN. 308 00:13:46,050 --> 00:13:49,170 I think that was, to date, almost the highlight of that. 309 00:13:49,170 --> 00:13:53,020 Because what happened was we figured out how to take a 310 00:13:53,020 --> 00:13:55,710 program written in FORTRAN and do pretty well at 311 00:13:55,710 --> 00:13:58,200 automatically extracting parallelism.
312 00:13:58,200 --> 00:14:01,710 Then things kind of went downhill from there because people 313 00:14:01,710 --> 00:14:04,180 said FORTRAN is interesting, but we don't program in 314 00:14:04,180 --> 00:14:05,020 FORTRAN anymore. 315 00:14:05,020 --> 00:14:07,650 We are going to program in things like C, C++. 316 00:14:07,650 --> 00:14:08,810 That did two things. 317 00:14:08,810 --> 00:14:11,350 First of all, one is a language problem, which is C, 318 00:14:11,350 --> 00:14:14,920 C++ was unfortunately not a typesafe language. 319 00:14:14,920 --> 00:14:17,270 It was not designed for scientific applications. 320 00:14:17,270 --> 00:14:19,030 More like operating system hacking, that required looking 321 00:14:19,030 --> 00:14:20,860 at the entire address space. 322 00:14:20,860 --> 00:14:23,810 That really hindered the compiler. 323 00:14:23,810 --> 00:14:26,590 Second, programmers started falling in love with complex 324 00:14:26,590 --> 00:14:28,590 data structures. 325 00:14:28,590 --> 00:14:30,930 The world is not fixed size arrays any more. 326 00:14:30,930 --> 00:14:33,150 They want to develop things, have 327 00:14:33,150 --> 00:14:35,360 different structures, trees. 328 00:14:35,360 --> 00:14:36,580 These things are much harder to analyze. 329 00:14:36,580 --> 00:14:40,050 So suddenly, where you could do it nicely, they said, look, people 330 00:14:40,050 --> 00:14:43,650 are doing recursion and these very complex structures and 331 00:14:43,650 --> 00:14:45,890 suddenly, the amount of things you can do went down 332 00:14:45,890 --> 00:14:47,300 pretty drastically. 333 00:14:47,300 --> 00:14:49,880 And you're back to a pretty bad state. 334 00:14:49,880 --> 00:14:51,820 And then we're slowly recovering a little bit 335 00:14:51,820 --> 00:14:54,860 because things like Java and C# gave us typesafe 336 00:14:54,860 --> 00:14:57,240 languages, which became a little more analyzable.
337 00:14:57,240 --> 00:15:00,810 I think what you can do automatically kind of improves. 338 00:15:00,810 --> 00:15:06,250 And the hope here is the demand driven by multicores. There's 339 00:15:06,250 --> 00:15:09,110 no supply of technology but there's the demand. 340 00:15:09,110 --> 00:15:10,410 You need to do something. 341 00:15:10,410 --> 00:15:13,280 And hopefully there'll be interesting things that come 342 00:15:13,280 --> 00:15:15,040 about because of the demand. 343 00:15:15,040 --> 00:15:17,990 So this is my prediction, where things are. 344 00:15:17,990 --> 00:15:19,710 What happened then and what's going to happen next. 345 00:15:24,220 --> 00:15:25,900 Of course, multicores are here. 346 00:15:25,900 --> 00:15:27,660 There's no question about that. 347 00:15:27,660 --> 00:15:31,440 But I don't think programmers are ready for that. 348 00:15:31,440 --> 00:15:35,190 I mean we have established a very nice, solid boundary 349 00:15:35,190 --> 00:15:36,260 between hardware and software. 350 00:15:36,260 --> 00:15:38,070 Nice abstraction layer. 351 00:15:38,070 --> 00:15:40,180 Most programmers don't have to worry about hardware. 352 00:15:40,180 --> 00:15:41,140 Programmers didn't have to be 353 00:15:41,140 --> 00:15:42,210 knowledgeable about the processor. 354 00:15:42,210 --> 00:15:43,920 In fact, we are moving in that direction. 355 00:15:43,920 --> 00:15:46,060 Java said you don't even have to know about the 356 00:15:46,060 --> 00:15:46,780 class of the processor. 357 00:15:46,780 --> 00:15:49,580 We are running this thing in a nice high-level byte code. 358 00:15:49,580 --> 00:15:52,670 Write once, run everywhere. 359 00:15:52,670 --> 00:15:56,070 That's a very nice abstraction. 360 00:15:56,070 --> 00:15:58,480 For most types, Moore's Law actually gave you enough 361 00:15:58,480 --> 00:15:59,950 performance, that that was good enough. 362 00:15:59,950 --> 00:16:01,200 We can deal with that.
363 00:16:03,260 --> 00:16:07,350 The nice artifact of that is programmers were oblivious to 364 00:16:07,350 --> 00:16:09,110 what happened in processors. 365 00:16:09,110 --> 00:16:10,940 A program written in 1970 still works. 366 00:16:10,940 --> 00:16:13,680 Still will run 10 times faster. 367 00:16:13,680 --> 00:16:14,620 That is really great. 368 00:16:14,620 --> 00:16:19,700 If you look at Windows Office that probably [OBSCURED]. 369 00:16:19,700 --> 00:16:22,390 We still have code written in 1970, or in 1980. 370 00:16:22,390 --> 00:16:24,630 There's still code in there. 371 00:16:24,630 --> 00:16:27,550 Every time they tout they have this completely new program. 372 00:16:27,550 --> 00:16:30,560 Only probably 10% new code is there. 373 00:16:30,560 --> 00:16:33,030 But that's not true for the cell phone. 374 00:16:33,030 --> 00:16:35,510 Every time they say I have a new cell phone, most of the 375 00:16:35,510 --> 00:16:38,190 time it's actually rewritten completely. 376 00:16:38,190 --> 00:16:41,410 So that's why in an industry like that it's very hard to 377 00:16:41,410 --> 00:16:44,660 evolve like that because they are just churning too much. 378 00:16:44,660 --> 00:16:49,100 And in software we got really enamored by this ability to 379 00:16:49,100 --> 00:16:51,050 reuse stuff. 380 00:16:51,050 --> 00:16:53,620 Ability to have this nice abstraction in there. 381 00:16:53,620 --> 00:16:56,630 And the problem is there's probably a lot of freedom for 382 00:16:56,630 --> 00:16:58,490 programmers, so they push on complex issues. 383 00:16:58,490 --> 00:17:02,630 So to your question about how large companies want complex 384 00:17:02,630 --> 00:17:07,400 things, the way people get their competitive advantage is 385 00:17:07,400 --> 00:17:08,730 pushing the complexity.
386 00:17:08,730 --> 00:17:10,680 So what they did was, instead of pushing the complexity in 387 00:17:10,680 --> 00:17:12,670 the processor, they push the complexity in software, 388 00:17:12,670 --> 00:17:13,750 features and stuff like that. 389 00:17:13,750 --> 00:17:16,250 So we are still on the brink of complexity. 390 00:17:16,250 --> 00:17:17,910 But now we are adding another complexity that's very 391 00:17:17,910 --> 00:17:19,160 hard to deal with. 392 00:17:22,420 --> 00:17:26,830 So this is where things came about, how things move. 393 00:17:26,830 --> 00:17:28,490 Where can we go? 394 00:17:28,490 --> 00:17:29,850 I want to talk about two different parts. 395 00:17:29,850 --> 00:17:31,620 Architecture, and languages, compilers and tools. 396 00:17:31,620 --> 00:17:35,130 For most of the things I talk about, there's no solution. 397 00:17:35,130 --> 00:17:37,440 This is what I think will happen, should happen. 398 00:17:37,440 --> 00:17:38,840 I think that can hopefully really inspire 399 00:17:38,840 --> 00:17:41,570 people to go do that. 400 00:17:41,570 --> 00:17:47,850 And hopefully I don't have to eat what I say in 10 years. 401 00:17:47,850 --> 00:17:50,970 One thing now about novel opportunities in multicores is 402 00:17:50,970 --> 00:17:53,050 you don't have to contend with uniprocessors. 403 00:17:53,050 --> 00:17:55,560 Most of the time they were a source of big pain when you 404 00:17:55,560 --> 00:17:56,980 were doing research in here. 405 00:17:56,980 --> 00:17:59,130 Everybody says, yeah, that is very nice, but I'll just wait two 406 00:17:59,130 --> 00:18:00,680 years and this will happen for me automatically. 407 00:18:00,680 --> 00:18:03,020 Why do I have to worry about whatever complex thing you are 408 00:18:03,020 --> 00:18:04,270 suggesting in there?
409 00:18:06,730 --> 00:18:08,600 The other thing is, in the good old days, the people who 410 00:18:08,600 --> 00:18:10,970 really wanted performance were these 411 00:18:10,970 --> 00:18:13,530 real performance weenies. 412 00:18:13,530 --> 00:18:15,880 In fact, you probably got a good talk from a performance 413 00:18:15,880 --> 00:18:21,970 weenie last time, who's [OBSCURED]. 414 00:18:21,970 --> 00:18:24,425 And they want just performance at all costs. There 415 00:18:24,425 --> 00:18:26,910 are people like that. 416 00:18:26,910 --> 00:18:30,020 For them, producing anything is not easy. 417 00:18:30,020 --> 00:18:31,590 They say, I don't want tools. 418 00:18:31,590 --> 00:18:33,840 They just suck my performance out. 419 00:18:33,840 --> 00:18:35,280 And it was very hard to help them, because 420 00:18:35,280 --> 00:18:37,490 they don't want help. 421 00:18:37,490 --> 00:18:38,980 But right now what's happening is this is 422 00:18:38,980 --> 00:18:41,100 becoming everybody's problem. 423 00:18:41,100 --> 00:18:44,750 I think that will lead people to want the things to take 424 00:18:44,750 --> 00:18:47,760 advantage of parallelism. 425 00:18:47,760 --> 00:18:51,140 The other thing is, the problem has changed. 426 00:18:51,140 --> 00:18:55,420 Because people worked for 20 years or 40 years on this 427 00:18:55,420 --> 00:19:00,250 parallel problem, but that's for multiprocessors. 428 00:19:00,250 --> 00:19:03,830 So how has this problem changed? 429 00:19:03,830 --> 00:19:06,230 What's the impact of going from a multiprocessor, which 430 00:19:06,230 --> 00:19:08,950 is multiple chips, sitting on a board, with some kind of 431 00:19:08,950 --> 00:19:15,820 interconnect, to multiple cores on the same die? 432 00:19:15,820 --> 00:19:15,980 OK. 433 00:19:15,980 --> 00:19:18,070 I think there are some very substantial things that 434 00:19:18,070 --> 00:19:20,770 happen, that can have big impact. 435 00:19:20,770 --> 00:19:21,950 Let's look at two of them.
436 00:19:21,950 --> 00:19:23,290 Which is communication bandwidth and 437 00:19:23,290 --> 00:19:24,550 communication latency. 438 00:19:24,550 --> 00:19:27,670 I think this is where you can have revolutionary kind of 439 00:19:27,670 --> 00:19:30,300 things happening. 440 00:19:30,300 --> 00:19:35,850 If you look at communication bandwidth, the best you could 441 00:19:35,850 --> 00:19:39,880 get, if you put two chips on any kind of a board, it's 442 00:19:39,880 --> 00:19:44,560 about 32 gigabits per second. 443 00:19:44,560 --> 00:19:48,060 Because you had to go through pins and pins 444 00:19:48,060 --> 00:19:49,970 had a lot of issues. 445 00:19:49,970 --> 00:19:54,190 Today if you try to put two different cores next to each 446 00:19:54,190 --> 00:19:59,100 other in the same die, the bisection bandwidth is huge. 447 00:19:59,100 --> 00:20:02,950 It's four orders of magnitude huge, that's what you can get. 448 00:20:02,950 --> 00:20:03,680 That's a huge change. 449 00:20:03,680 --> 00:20:07,650 Even for the lowest 2x improvement, this is four 450 00:20:07,650 --> 00:20:10,870 orders of magnitude difference you can do. 451 00:20:10,870 --> 00:20:12,770 So why is that? 452 00:20:12,770 --> 00:20:16,460 Because what changes is the number of wires in the die. 453 00:20:16,460 --> 00:20:19,860 Because, if you think about it, von Neumann type of 454 00:20:19,860 --> 00:20:24,500 architectures are just built to deal with this 455 00:20:24,500 --> 00:20:26,580 reduction of communication bandwidth. 456 00:20:26,580 --> 00:20:29,450 The closest registers communicate the most, the 457 00:20:29,450 --> 00:20:32,040 local memory next, and then remote memory. 458 00:20:32,040 --> 00:20:33,390 And then if you had to go remote to 459 00:20:33,390 --> 00:20:34,970 the next guy, network. 460 00:20:34,970 --> 00:20:37,100 These things basically reduce the amount of 461 00:20:37,100 --> 00:20:38,050 things you can do. 462 00:20:38,050 --> 00:20:39,010 That was really true.
463 00:20:39,010 --> 00:20:41,040 Because if you're inside the chip, you 464 00:20:41,040 --> 00:20:42,140 can do a lot more. 465 00:20:42,140 --> 00:20:44,380 The minute you go to the board, you can do less. 466 00:20:44,380 --> 00:20:46,170 Minute you go to multiple boxes, less. 467 00:20:46,170 --> 00:20:46,970 Stuff like that. 468 00:20:46,970 --> 00:20:49,330 That is the way the processor was designed. 469 00:20:49,330 --> 00:20:51,920 And also the clocks. 470 00:20:51,920 --> 00:20:55,660 In a processor you can go much faster; to go to pins, not 471 00:20:55,660 --> 00:20:57,580 that fast. People are trying to change that, but still 472 00:20:57,580 --> 00:20:57,840 that's an issue. 473 00:20:57,840 --> 00:21:01,820 And multiplexing, because pins were inherently limited. 474 00:21:01,820 --> 00:21:08,090 If you have a thousand pins, that's a big processor 475 00:21:08,090 --> 00:21:10,540 basically in there. 476 00:21:10,540 --> 00:21:14,920 And within a die, a thousand wires is basically nothing. 477 00:21:14,920 --> 00:21:19,570 So we have four orders of magnitude improvement in here. 478 00:21:19,570 --> 00:21:21,250 And that's amazing. 479 00:21:21,250 --> 00:21:23,790 That can have a huge impact in here. 480 00:21:23,790 --> 00:21:27,060 So you can do massive data exchange. 481 00:21:27,060 --> 00:21:29,210 One big issue right now in parallelism is the minute you 482 00:21:29,210 --> 00:21:31,280 parallelize something, you might slow it down. 483 00:21:31,280 --> 00:21:34,900 Because the thing is, you might not get data locality. 484 00:21:34,900 --> 00:21:39,580 Because that pipe is going to get clogged up because you've 485 00:21:39,580 --> 00:21:42,080 been trying to take too much data in there. 486 00:21:42,080 --> 00:21:45,020 So if you put wrong things, you're going to get clogged by 487 00:21:45,020 --> 00:21:47,250 the communication channel. 488 00:21:47,250 --> 00:21:50,200 This thing says, I might not have to deal with that.
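Illustrative sketch (not from the lecture): the point that parallelizing can slow a program down when the communication pipe clogs, and that four orders of magnitude more bandwidth may remove the problem, can be shown with a very rough cost model. Only the 32 gigabits per second figure and the factor of 10^4 come from the lecture; the model itself (even work split plus one bulk data exchange) and the workload numbers are assumptions chosen for illustration.

```python
def parallel_speedup(compute_s, cores, bytes_moved, bandwidth_bytes_per_s):
    """Speedup over one core, assuming work splits evenly across cores
    and all exchanged data crosses a single link of the given bandwidth."""
    parallel_time = compute_s / cores + bytes_moved / bandwidth_bytes_per_s
    return compute_s / parallel_time


# 32 gigabits per second between chips (lecture figure), converted to bytes/s.
OFF_CHIP_BW = 32e9 / 8
# "Four orders of magnitude" more bandwidth on the same die (lecture figure).
ON_DIE_BW = OFF_CHIP_BW * 1e4

# Hypothetical workload: 1 second of compute, 4 cores, 8 GB exchanged.
off_chip = parallel_speedup(1.0, 4, 8e9, OFF_CHIP_BW)
on_die = parallel_speedup(1.0, 4, 8e9, ON_DIE_BW)

print(f"off-chip speedup: {off_chip:.2f}x")  # below 1x: slower than one core
print(f"on-die speedup:   {on_die:.2f}x")    # close to the ideal 4x
```

Under these assumed numbers the same program is slower than sequential across chips and nearly ideal on one die, which is the kind of shift the lecture is pointing at.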
489 00:21:50,200 --> 00:21:52,420 For example, I don't know why you want to do that, but if 490 00:21:52,420 --> 00:21:56,390 you want to move your entire cache from one core to 491 00:21:56,390 --> 00:21:59,270 another, you can probably do it in a few clock cycles. 492 00:21:59,270 --> 00:22:00,310 That is bandwidth. 493 00:22:00,310 --> 00:22:01,640 I don't know why you want to do that, because that's 494 00:22:01,640 --> 00:22:04,450 probably not memory efficient, power efficient. 495 00:22:04,450 --> 00:22:05,780 But you can do it. 496 00:22:05,780 --> 00:22:07,120 We have wires. 497 00:22:07,120 --> 00:22:08,720 So the key thing is how can you take 498 00:22:08,720 --> 00:22:11,090 advantage of these wires. 499 00:22:11,090 --> 00:22:14,740 And how can you make it in a way you can really reduce the 500 00:22:14,740 --> 00:22:17,540 difficulty in programming, because, as you've figured 501 00:22:17,540 --> 00:22:20,410 out, parallel programming is hard. 502 00:22:20,410 --> 00:22:21,775 If you make a wrong decision, you're going 503 00:22:21,775 --> 00:22:23,470 to pay a huge price. 504 00:22:23,470 --> 00:22:26,350 How can you make that in a way that you can make a few wrong 505 00:22:26,350 --> 00:22:28,730 decisions and get everything by taking advantage of that? 506 00:22:28,730 --> 00:22:30,400 I think that will be a big opportunity in here. 507 00:22:30,400 --> 00:22:33,890 Another opportunity is latency. 508 00:22:33,890 --> 00:22:37,255 This is not as big as bandwidth because latency has 509 00:22:37,255 --> 00:22:42,610 the inherent speed of light issue. 510 00:22:42,610 --> 00:22:46,680 Again, what happens is now the lengths of 511 00:22:46,680 --> 00:22:48,560 wires are a lot shorter. 512 00:22:48,560 --> 00:22:50,940 And no multiplexing, so you can get a 513 00:22:50,940 --> 00:22:52,370 direct wire from there. 514 00:22:52,370 --> 00:22:55,190 And on-chip can be much closer.
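Illustrative sketch (not from the lecture): the earlier remark about moving an entire cache between cores in a few clock cycles is simple arithmetic once you assume each wire carries one bit per cycle. The 32 KiB cache size and the wire counts below are assumptions for illustration, not figures from the talk.

```python
import math


def cycles_to_move(cache_bytes, wires):
    """Clock cycles to ship a whole cache, assuming one bit per wire per cycle."""
    return math.ceil(cache_bytes * 8 / wires)


CACHE_BYTES = 32 * 1024  # hypothetical 32 KiB cache

# A pin-limited link versus a wide on-die link (both widths are assumptions).
narrow = cycles_to_move(CACHE_BYTES, 1024)
wide = cycles_to_move(CACHE_BYTES, 100_000)

print(narrow, wide)
```

With the wide link the transfer takes a handful of cycles, while the narrow one takes hundreds, which is the contrast between pins and on-die wires that the lecture draws.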
515 00:22:55,190 --> 00:22:56,570 And there are other interesting things you 516 00:22:56,570 --> 00:22:57,340 can also think about. 517 00:22:57,340 --> 00:23:00,280 Right now, why is register to register 518 00:23:00,280 --> 00:23:03,660 communication just very fast? 519 00:23:03,660 --> 00:23:05,760 In a processor if you send something one way this can use 520 00:23:05,760 --> 00:23:08,720 it almost instantaneously, which is much faster. 521 00:23:08,720 --> 00:23:12,310 Of course, wires are not that, it's very close. 522 00:23:12,310 --> 00:23:14,190 Also there are a lot of other things we do. 523 00:23:14,190 --> 00:23:17,140 We need more register, register before we even come 524 00:23:17,140 --> 00:23:17,630 to the data. 525 00:23:17,630 --> 00:23:19,740 In a normal pipelined machine you have 526 00:23:19,740 --> 00:23:21,080 a pretty long pipeline. 527 00:23:21,080 --> 00:23:23,160 The data is externally receivable at the end of the 528 00:23:23,160 --> 00:23:26,300 pipeline, but I can speculate and send it to the next one, and 529 00:23:26,300 --> 00:23:29,010 you start working on it; if you're wrong, I would yank it 530 00:23:29,010 --> 00:23:33,660 out and you have give [OBSCURED]. 531 00:23:33,660 --> 00:23:36,770 But right now if you actually go across processors, we can't 532 00:23:36,770 --> 00:23:38,600 use speculation. 533 00:23:38,600 --> 00:23:40,370 You have to wait until the things are committed to 534 00:23:40,370 --> 00:23:42,620 memory, and then you can start using it. 535 00:23:42,620 --> 00:23:45,000 That's even longer than ideally, because that's like 536 00:23:45,000 --> 00:23:48,410 10, 15 cycles I have to wait. 537 00:23:48,410 --> 00:23:51,190 Why can't I send something speculative to my neighbor? 538 00:23:51,190 --> 00:23:53,740 I have enough bandwidth to deal with it. 539 00:23:53,740 --> 00:23:57,650 If I do something wrong, can I speculate 540 00:23:57,650 --> 00:23:59,350 across multiple cores?
541 00:23:59,350 --> 00:24:03,140 So I can get my data much faster. 542 00:24:03,140 --> 00:24:06,580 I save probably an order of magnitude beyond my wire delay 543 00:24:06,580 --> 00:24:08,170 because I have no pipeline stages made until [OBSCURED] 544 00:24:08,170 --> 00:24:10,320 happens, very long later. 545 00:24:10,320 --> 00:24:12,620 Things like that, can I do in modern processors? 546 00:24:12,620 --> 00:24:15,160 Because now, I'm putting these darn things next to each 547 00:24:15,160 --> 00:24:17,580 other, what can I do with that? 548 00:24:17,580 --> 00:24:19,980 And you can do ultrafast [OBSCURED] 549 00:24:19,980 --> 00:24:22,470 real time and they can use -- can I do things a lot 550 00:24:22,470 --> 00:24:24,600 more fine-grained across and take advantage of that? 551 00:24:27,210 --> 00:24:31,330 So the way to look at that is, a traditional microprocessor 552 00:24:31,330 --> 00:24:34,060 basically had this structure. 553 00:24:34,060 --> 00:24:36,840 Processor, cache, memory and processor, cache, memory. 554 00:24:36,840 --> 00:24:41,020 And this is there to deal with a small number of 555 00:24:41,020 --> 00:24:43,450 wires, power and stuff like that. 556 00:24:43,450 --> 00:24:46,830 Today what we have done is basically taken this 557 00:24:46,830 --> 00:24:50,890 structure, and kind of put it in the same die. 558 00:24:50,890 --> 00:24:51,260 OK. 559 00:24:51,260 --> 00:24:53,810 We haven't thought beyond that. 560 00:24:53,810 --> 00:24:59,770 And we get a few order, a few factors better. 561 00:24:59,770 --> 00:25:02,870 Because now I don't have to go through the pins, because now 562 00:25:02,870 --> 00:25:07,900 I can have instead of say 128 pins, I can have 512 going in, 563 00:25:07,900 --> 00:25:09,820 or 1,024 going in there. 564 00:25:09,820 --> 00:25:14,720 But still, I have 4 orders of magnitude freedom in here. 565 00:25:14,720 --> 00:25:16,180 I am not using any of those.
566 00:25:16,180 --> 00:25:18,960 So one thing Raw tried to do is have a much more tightly 567 00:25:18,960 --> 00:25:22,810 coupled, a lot more communication going on there. 568 00:25:22,810 --> 00:25:24,390 So a lot of people say today that 569 00:25:24,390 --> 00:25:28,720 the biggest problem in multicores is the network. 570 00:25:28,720 --> 00:25:31,990 Because how will I build that scale of a network? 571 00:25:31,990 --> 00:25:35,030 My feeling is that's baloney. 572 00:25:35,030 --> 00:25:36,410 Network is not an issue. 573 00:25:36,410 --> 00:25:38,290 In fact, in Raw, in a lot of our experiments we could never 574 00:25:38,290 --> 00:25:41,500 saturate a very, very well done network. 575 00:25:41,500 --> 00:25:44,210 Because in a network I have so many pins, I can keep adding 576 00:25:44,210 --> 00:25:45,620 bandwidth like crazy. 577 00:25:45,620 --> 00:25:50,310 The problem is the way people design processor cores today is 578 00:25:50,310 --> 00:25:53,240 to minimize communication going out of it. 579 00:25:53,240 --> 00:25:56,990 It has a very thin memory bus going in and out of it. 580 00:25:56,990 --> 00:25:59,290 That memory bus can never saturate any kind of network 581 00:25:59,290 --> 00:26:00,880 you can build on a microprocessor. 582 00:26:04,540 --> 00:26:07,010 It matches well if you have to go through the pins, because 583 00:26:07,010 --> 00:26:07,700 it partially measures that. 584 00:26:07,700 --> 00:26:11,060 But within the die you have so much more bandwidth. 585 00:26:11,060 --> 00:26:13,740 So I think something interesting would be a 586 00:26:13,740 --> 00:26:16,420 fundamental redesign of what the microprocessor looks like. 587 00:26:16,420 --> 00:26:17,670 What the core looks like. 588 00:26:17,670 --> 00:26:20,313 It's not taking a Pentium and kind of plopping it once, or 589 00:26:20,313 --> 00:26:22,170 plopping it four times.
590 00:26:22,170 --> 00:26:26,100 A Pentium is not any kind of a nice processor, it evolved 591 00:26:26,100 --> 00:26:30,325 over the years to basically deal with this very thin 592 00:26:30,325 --> 00:26:31,575 memory bandwidth. 593 00:26:33,870 --> 00:26:35,880 And suddenly you've got four [OBSCURED] 594 00:26:35,880 --> 00:26:37,920 and that problem goes away. 595 00:26:37,920 --> 00:26:40,590 And we're still saying, OK yeah, I have this entire thing 596 00:26:40,590 --> 00:26:45,140 in there, but I only want four wires out of you, or 1,024 597 00:26:45,140 --> 00:26:49,890 wires when you can give me a hundred million wires. 598 00:26:49,890 --> 00:26:51,190 That is the question. 599 00:26:51,190 --> 00:26:52,990 So I think there'll be some interesting things where 600 00:26:52,990 --> 00:26:56,525 people can make a big impact trying to figure out how to 601 00:26:56,525 --> 00:26:58,590 take advantage of that. 602 00:26:58,590 --> 00:27:01,530 And by doing that, really reducing the burden for the 603 00:27:01,530 --> 00:27:01,670 programmer. 604 00:27:01,670 --> 00:27:04,750 Right now, you say, ok look, I am taking advantage of that, 605 00:27:04,750 --> 00:27:06,820 but I'm passing the buck to the programmers. 606 00:27:06,820 --> 00:27:10,910 They're dealing with all the small interconnectary. 607 00:27:10,910 --> 00:27:12,770 Then we say, you can't just do that because 608 00:27:12,770 --> 00:27:14,210 programming is hard. 609 00:27:14,210 --> 00:27:16,080 How can I reduce the programmer burden by taking 610 00:27:16,080 --> 00:27:16,656 advantage of that? 611 00:27:16,656 --> 00:27:18,480 I think this is a very interesting way to 612 00:27:18,480 --> 00:27:20,490 look at this problem. 613 00:27:20,490 --> 00:27:22,140 If you're doing any architecture kind of research 614 00:27:22,140 --> 00:27:26,890 or are interested in that, here is an open problem.
615 00:27:26,890 --> 00:27:30,700 I have this huge bandwidth out there, and I have evolved 616 00:27:30,700 --> 00:27:34,780 something that is completely in the way, dealing with a very 617 00:27:34,780 --> 00:27:36,390 limited amount of bandwidth, but that's why they 618 00:27:36,390 --> 00:27:37,700 lived for this long. 619 00:27:37,700 --> 00:27:40,760 And the minute you open it up, how can I deal with that? 620 00:27:40,760 --> 00:27:42,310 I don't think we have a model. 621 00:27:42,310 --> 00:27:43,620 And I think a lot of interesting things 622 00:27:43,620 --> 00:27:48,680 can come out there. 623 00:27:48,680 --> 00:27:48,970 OK. 624 00:27:48,970 --> 00:27:50,410 Let's switch to languages, compilers and tools. 625 00:27:53,050 --> 00:27:55,700 The way you look at it, what's the last big thing that 626 00:27:55,700 --> 00:27:56,880 happened in languages? 627 00:27:56,880 --> 00:28:00,520 I think one that happened is the object oriented revolution. 628 00:28:00,520 --> 00:28:02,840 The interesting thing about the object oriented revolution is 629 00:28:02,840 --> 00:28:04,150 it didn't happen in a vacuum. 630 00:28:04,150 --> 00:28:07,230 It happened because a lot of interesting research in 631 00:28:07,230 --> 00:28:09,810 academia and outside went into it. 632 00:28:09,810 --> 00:28:11,400 People didn't come up and say, wow, it's 633 00:28:11,400 --> 00:28:13,120 great, I'm writing Java. 634 00:28:13,120 --> 00:28:16,700 There were hundreds of languages that developed and 635 00:28:16,700 --> 00:28:19,290 explored these concepts. 636 00:28:19,290 --> 00:28:22,320 And the interesting thing about programming is, a lot of 637 00:28:22,320 --> 00:28:26,455 things that look very neat and interesting when you look 638 00:28:26,455 --> 00:28:30,180 at them the first time, when you start using them in a million-line 639 00:28:30,180 --> 00:28:32,950 program that has evolved over 10 years, you realize 640 00:28:32,950 --> 00:28:35,290 that's a pretty bad concept.
641 00:28:35,290 --> 00:28:37,530 Can anybody think about a concept that looks very 642 00:28:37,530 --> 00:28:39,310 interesting at the beginning, but is actually very 643 00:28:39,310 --> 00:28:39,670 hard to deal with? 644 00:28:39,670 --> 00:28:46,930 AUDIENCE: My last program. 645 00:28:46,930 --> 00:28:49,040 PROFESSOR: I haven't heard that analogy 646 00:28:49,040 --> 00:28:51,210 before but what else? 647 00:28:51,210 --> 00:28:54,790 AUDIENCE: Exceptions. 648 00:28:54,790 --> 00:28:57,400 PROFESSOR: Exceptions, I think people still 649 00:28:57,400 --> 00:28:57,930 haven't figured out. 650 00:28:57,930 --> 00:28:59,750 I think one thing people might think of is 651 00:28:59,750 --> 00:29:01,530 multiple inheritance. 652 00:29:01,530 --> 00:29:03,810 With multiple inheritance, at first, wow, this is the 653 00:29:03,810 --> 00:29:06,910 best thing since sliced bread, because the thing object 654 00:29:06,910 --> 00:29:11,060 oriented programming does is it forces this one hierarchy, 655 00:29:11,060 --> 00:29:13,010 and the world is objects, but there's no one 656 00:29:13,010 --> 00:29:14,550 hierarchy in the world. 657 00:29:14,550 --> 00:29:16,790 The world has these multiple interconnections in there and 658 00:29:16,790 --> 00:29:18,730 so people say, this is great. 659 00:29:18,730 --> 00:29:22,090 This basically now gives you objects and you can get a lot 660 00:29:22,090 --> 00:29:23,680 of different relationships. 661 00:29:23,680 --> 00:29:26,170 But, after people tried to deal with multiple inheritance 662 00:29:26,170 --> 00:29:27,840 they realized this is a pain. 663 00:29:27,840 --> 00:29:31,760 It's very hard to understand it, it's very hard to write 664 00:29:31,760 --> 00:29:35,100 compilers, there's so many ambiguities around it. 665 00:29:35,100 --> 00:29:37,770 And then at some point you just say, this is too much.
666 00:29:37,770 --> 00:29:40,335 And that's why you went to, like Java went to multiple 667 00:29:40,335 --> 00:29:43,720 inheritance lite: you can have multiple 668 00:29:43,720 --> 00:29:46,390 interfaces. 669 00:29:46,390 --> 00:29:49,180 Here's a concept that looks very nice, but you had to work 670 00:29:49,180 --> 00:29:50,810 -- it took five, 10 years to realize that it's 671 00:29:50,810 --> 00:29:53,530 actually not usable. 672 00:29:53,530 --> 00:29:55,580 So if you look at object oriented languages, there have 673 00:29:55,580 --> 00:29:56,260 been many languages. 674 00:29:56,260 --> 00:30:00,150 This is an interesting thing, because in the parallel community 675 00:30:00,150 --> 00:30:02,680 we haven't had this kind of explosion out there. 676 00:30:02,680 --> 00:30:04,500 Another interesting thing about languages is languages 677 00:30:04,500 --> 00:30:06,400 have a lot of relationships and keep evolving. 678 00:30:06,400 --> 00:30:09,330 So if you look at it, it started from FORTRAN. 679 00:30:09,330 --> 00:30:12,290 You can see the FORTRAN, so this is FORTRAN in here. 680 00:30:12,290 --> 00:30:14,770 To figure out what's happening here, you can kind of trace 681 00:30:14,770 --> 00:30:17,360 over the years how new languages academically and 682 00:30:17,360 --> 00:30:18,340 industrially came about. 683 00:30:18,340 --> 00:30:21,770 Got influence from others, and at the end end up with things 684 00:30:21,770 --> 00:30:24,400 like, at the end of the graph there's something you can 685 00:30:24,400 --> 00:30:24,970 never read. 686 00:30:24,970 --> 00:30:27,380 But the key thing is, there's all these influences going 687 00:30:27,380 --> 00:30:29,660 across in there. 688 00:30:29,660 --> 00:30:32,740 And cross fertilization. 689 00:30:32,740 --> 00:30:36,460 At the end, you end up with some Tcl/Tk, Python -- 690 00:30:36,460 --> 00:30:37,510 I can't even read it carefully. 691 00:30:37,510 --> 00:30:42,180 Java, C sharp, PHP, and stuff like that.
692 00:30:42,180 --> 00:30:43,710 There are a lot of different languages trying to get 693 00:30:43,710 --> 00:30:45,440 influence in there. 694 00:30:45,440 --> 00:30:48,840 The key thing is you need to feed this beast, if you want 695 00:30:48,840 --> 00:30:51,500 to have a lot of revolution or even evolutionary changes. 696 00:30:54,940 --> 00:30:58,680 What that means is, people say, yeah, we have Java, C and 697 00:30:58,680 --> 00:31:01,600 C++, C sharp, we are happy with languages. 698 00:31:01,600 --> 00:31:02,730 No. 699 00:31:02,730 --> 00:31:05,290 I think in the future to deal with that you need to have 700 00:31:05,290 --> 00:31:07,010 kind of a revolution in there. 701 00:31:07,010 --> 00:31:12,870 So if you look at something like C++, almost every language 702 00:31:12,870 --> 00:31:15,190 started with FORTRAN. 703 00:31:15,190 --> 00:31:17,920 And then there had been a lot of different influences that 704 00:31:17,920 --> 00:31:18,970 happened to that. 705 00:31:18,970 --> 00:31:23,460 And I'm going to just go, that's for example, Java, it's 706 00:31:23,460 --> 00:31:25,660 acknowledged that a lot of these different languages 707 00:31:25,660 --> 00:31:28,010 influenced ideas in there. 708 00:31:28,010 --> 00:31:30,010 So why do we need new languages in there? 709 00:31:30,010 --> 00:31:32,590 I think we have a paradigm shift in architecture. 710 00:31:32,590 --> 00:31:35,050 Because sequential to multicore will require a new 711 00:31:35,050 --> 00:31:38,360 way of thinking, and we kind of need a new common machine 712 00:31:38,360 --> 00:31:40,340 language for these kind of machines. 713 00:31:40,340 --> 00:31:42,450 And the new application domains. 714 00:31:42,450 --> 00:31:45,090 Because streaming is becoming very interesting. 715 00:31:45,090 --> 00:31:46,650 Scripting.
716 00:31:46,650 --> 00:31:50,930 People are using things like Python and Perl and stuff 717 00:31:50,930 --> 00:31:52,675 like that on a much more regular basis to 718 00:31:52,675 --> 00:31:54,670 do much bigger things. 719 00:31:54,670 --> 00:31:57,110 So those are cobbled together languages. 720 00:31:57,110 --> 00:32:00,300 But can you do better in those things? 721 00:32:00,300 --> 00:32:01,900 Real time. 722 00:32:01,900 --> 00:32:04,230 Things like cell phones and stuff become more important, 723 00:32:04,230 --> 00:32:07,470 how we deal with those issues. 724 00:32:07,470 --> 00:32:10,530 And also new hardware features. 725 00:32:10,530 --> 00:32:14,750 Right now a good language is tied very well with the one 726 00:32:14,750 --> 00:32:14,870 [OBSCURED] 727 00:32:14,870 --> 00:32:16,980 hardware features in there. 728 00:32:16,980 --> 00:32:21,880 So things like, for example, like when I tell you that we 729 00:32:21,880 --> 00:32:26,080 have all these wires available, if you figure out some new way 730 00:32:26,080 --> 00:32:28,780 of taking advantage of those wires. 731 00:32:28,780 --> 00:32:30,590 That hardware feature. 732 00:32:30,590 --> 00:32:33,270 What's the software that can really utilize that? 733 00:32:33,270 --> 00:32:35,310 So you can just build a hardware feature, but it's not 734 00:32:35,310 --> 00:32:35,670 sufficient. 735 00:32:35,670 --> 00:32:37,490 You have to have software that does that, too. 736 00:32:37,490 --> 00:32:38,970 So what's the coupling? 737 00:32:38,970 --> 00:32:40,730 A lot of things that has to be that nice 738 00:32:40,730 --> 00:32:43,430 interconnection there. 739 00:32:43,430 --> 00:32:45,730 And I think there are new people 740 00:32:45,730 --> 00:32:47,090 who want mobile devices. 741 00:32:47,090 --> 00:32:48,130 A great program.
742 00:32:48,130 --> 00:32:49,590 If we look at parallelizing compilers or 743 00:32:49,590 --> 00:32:52,580 parallelization, as I said before, it was always geared to 744 00:32:52,580 --> 00:32:57,200 these high performance weenies, who just wanted to get their 745 00:32:57,200 --> 00:33:00,030 global simulation going as fast as possible. 746 00:33:00,030 --> 00:33:03,290 Who have a multi-million dollar supercomputer waiting. 747 00:33:03,290 --> 00:33:06,000 It's very different having things like great programmers 748 00:33:06,000 --> 00:33:07,080 who want these things. 749 00:33:07,080 --> 00:33:07,790 How do you cater to them? 750 00:33:07,790 --> 00:33:09,120 They are very different. 751 00:33:13,080 --> 00:33:15,400 The key thing is how can we achieve parallelism without 752 00:33:15,400 --> 00:33:17,820 burdening the programmer? 753 00:33:17,820 --> 00:33:20,540 Because we don't, in this class we barely burden you 754 00:33:20,540 --> 00:33:22,800 guys and then you realize how hard it is. 755 00:33:22,800 --> 00:33:25,940 And how much complexity it adds on top of the algorithm, 756 00:33:25,940 --> 00:33:27,900 just to get it working in parallel. 757 00:33:27,900 --> 00:33:31,170 So how do you reduce some of the burden? 758 00:33:31,170 --> 00:33:34,690 Perhaps not completely, but reduce that burden. 759 00:33:34,690 --> 00:33:35,840 I'm going to skip some of this. 760 00:33:35,840 --> 00:33:37,270 I'll make it available on the site. 761 00:33:37,270 --> 00:33:40,020 I want to have some time for discussion. 762 00:33:40,020 --> 00:33:42,420 In streaming, we talked about it, we tried 763 00:33:42,420 --> 00:33:43,750 to do some of that. 764 00:33:43,750 --> 00:33:46,440 Tried to come up with a programming model where it 765 00:33:46,440 --> 00:33:48,535 will give you some additional benefits for the programmer, and at 766 00:33:48,535 --> 00:33:51,810 the same time you can actually get multicore performance.
767 00:33:51,810 --> 00:33:54,450 And, of course, we saw that in compilers we can actually get 768 00:33:54,450 --> 00:33:57,420 good performance in them. 769 00:33:57,420 --> 00:34:00,380 One thing that I did a long time ago is this SUIF 770 00:34:00,380 --> 00:34:00,970 parallelizing compiler. 771 00:34:00,970 --> 00:34:03,880 What we did was we could automatically parallelize 772 00:34:03,880 --> 00:34:05,940 FORTRAN programs all the way. 773 00:34:08,520 --> 00:34:11,300 This kind of almost worked in the 90's. 774 00:34:11,300 --> 00:34:15,967 At some point, SPEC benchmarks were the way to measure 775 00:34:15,967 --> 00:34:19,730 how good your world is, and every company had 20 engineers 776 00:34:19,730 --> 00:34:22,550 trying to get 5% of SPEC. 777 00:34:22,550 --> 00:34:27,240 At the time the SPEC performance was about 200. 778 00:34:27,240 --> 00:34:29,020 Wire registered, we did this compiler, so that we could get SPEC 779 00:34:29,020 --> 00:34:32,390 benchmark performance at 800, by just parallelizing. 780 00:34:32,390 --> 00:34:36,960 And in fact we made this t-shirt that said, spec red 781 00:34:36,960 --> 00:34:37,730 and the number. 782 00:34:37,730 --> 00:34:38,910 That's all the t-shirt said. 783 00:34:38,910 --> 00:34:44,530 And, in fact, I was walking in a few places, including the 784 00:34:44,530 --> 00:34:46,810 grocery store at Palo Alto, and people stopped me and 785 00:34:46,810 --> 00:34:48,320 said, how did you do that? 786 00:34:48,320 --> 00:34:50,970 It's only on this kind of weenie things. 787 00:34:50,970 --> 00:34:54,200 And in the grocery store, I had a half an hour 788 00:34:54,200 --> 00:34:55,090 conversation with [OBSCURED] 789 00:34:55,090 --> 00:34:56,880 of how -- 790 00:34:56,880 --> 00:34:58,130 he was, like, hmm. 791 00:34:58,130 --> 00:35:00,700 The only thing, SPEC 823. 792 00:35:00,700 --> 00:35:02,530 That's the only two things the t-shirt said, people 793 00:35:02,530 --> 00:35:03,780 knew what it is.
794 00:35:08,590 --> 00:35:12,990 But, there's problems. The compiler was not robust. The 795 00:35:12,990 --> 00:35:15,670 compiler could only be used by the people who wrote the compiler, 796 00:35:15,670 --> 00:35:19,330 because we knew all the things in there. 797 00:35:19,330 --> 00:35:22,040 The thing about that is it worked in a way that if the 798 00:35:22,040 --> 00:35:24,560 program worked, you get 8x improvement. 799 00:35:24,560 --> 00:35:26,570 You change one line of code somewhere, you 800 00:35:26,570 --> 00:35:29,900 went from 8x to 1x. 801 00:35:29,900 --> 00:35:31,870 That, I think, stopped parallelizing around the slow 802 00:35:31,870 --> 00:35:33,930 [OBSCURED]. 803 00:35:33,930 --> 00:35:36,900 So no normal programmer can understand what happens. 804 00:35:36,900 --> 00:35:39,665 We go and analyze it and afterward say aha, this is how 805 00:35:39,665 --> 00:35:41,910 to change the program, how to change the compiler. 806 00:35:41,910 --> 00:35:44,210 So we were just kind of developing program and compiler at the 807 00:35:44,210 --> 00:35:46,940 same time, and we could use this thing. 808 00:35:46,940 --> 00:35:49,690 You cannot ignore the fact that this is happening. 809 00:35:49,690 --> 00:35:53,040 Because there's so much of a difference in performance. 810 00:35:53,040 --> 00:35:54,540 That makes it even a curse. 811 00:35:54,540 --> 00:35:57,090 I mean, if you only got 4x improvement probably things 812 00:35:57,090 --> 00:35:57,610 would have been better. 813 00:35:57,610 --> 00:36:00,660 We should have probably capped the performance or something. 814 00:36:00,660 --> 00:36:03,990 And clients, in those days, were impossible to deal with. 815 00:36:03,990 --> 00:36:06,160 You saw a very good example in last lecture. 816 00:36:06,160 --> 00:36:06,490 OK. 817 00:36:06,490 --> 00:36:09,130 Performance at any cost. I don't want any of these tools 818 00:36:09,130 --> 00:36:11,220 if it slows down at any time.
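Illustrative sketch (not from the lecture): one common way a single changed line takes a loop from parallelizable to sequential is by introducing a loop-carried dependence. The lecture does not say which changes broke SUIF's analysis, and a real parallelizing compiler proves independence statically; the snippet below only demonstrates the underlying property by running iterations in a different order. All names here are hypothetical.

```python
def run_loop(body, data, order):
    """Run body(out, data, i) once for each index i, in the given order."""
    out = [0] * len(data)
    for i in order:
        body(out, data, i)
    return out


def independent(out, data, i):
    # Each iteration touches only element i, so order does not matter.
    out[i] = 2 * data[i]


def carried(out, data, i):
    # One changed line: iteration i now reads what iteration i-1 wrote.
    out[i] = (out[i - 1] if i > 0 else 0) + data[i]


data = [1, 2, 3, 4]
forward = list(range(len(data)))
backward = forward[::-1]

# Same answer in any order: iterations could safely run on different cores.
same = run_loop(independent, data, forward) == run_loop(independent, data, backward)

# Different answer when reordered: the loop must run sequentially as written.
differs = run_loop(carried, data, forward) != run_loop(carried, data, backward)

print(same, differs)
```

The two loop bodies differ by one line, yet only the first can be split across cores without changing the result, which is why a small edit can move a program from the parallel case to the sequential one.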
819 00:36:11,220 --> 00:36:12,680 So how do we deal with those kind of people? 820 00:36:12,680 --> 00:36:14,920 It's hard. 821 00:36:14,920 --> 00:36:17,670 In multiprocessors, communication was expensive. 822 00:36:17,670 --> 00:36:20,670 If you make a wrong decision, you pay a huge price. 823 00:36:20,670 --> 00:36:22,350 Basically slow the program down. 824 00:36:22,350 --> 00:36:25,370 You go from being 1x, you actually go to 0.5x. 825 00:36:25,370 --> 00:36:27,030 And worse. 826 00:36:27,030 --> 00:36:29,750 And you had to always deal with Moore's law. 827 00:36:29,750 --> 00:36:31,540 Why do I need to do all this complicated compiling? 828 00:36:31,540 --> 00:36:33,690 Can I just wait six months and won't I get it? 829 00:36:33,690 --> 00:36:35,910 They were probably right at that time. 830 00:36:35,910 --> 00:36:39,660 And another problem is what I call a dogfooding problem. 831 00:36:39,660 --> 00:36:42,630 The reason that object oriented languages are very 832 00:36:42,630 --> 00:36:45,090 successful is everybody who did an object oriented 833 00:36:45,090 --> 00:36:48,420 language, in order to get accepted as a reasonable 834 00:36:48,420 --> 00:36:51,780 language, had to write their own compiler in that language. 835 00:36:51,780 --> 00:36:54,710 And the compiler is a very good way of really pushing 836 00:36:54,710 --> 00:36:55,800 object oriented concepts. 837 00:36:55,800 --> 00:36:58,530 And so almost every object oriented language wrote 838 00:36:58,530 --> 00:36:58,980 their compiler. 839 00:36:58,980 --> 00:37:00,850 That's a really big thing you have to write. 840 00:37:00,850 --> 00:37:02,950 And by the time you write the compiler you realize all the 841 00:37:02,950 --> 00:37:05,220 problems of the language, and you feel equal, because you 842 00:37:05,220 --> 00:37:08,600 write a few hundred thousand line program in the language.
843 00:37:08,600 --> 00:37:11,440 Except all the people who were doing high performance, we 844 00:37:11,440 --> 00:37:14,600 were catering for somebody else. 845 00:37:14,600 --> 00:37:17,860 What that means is we can deal with this weirdness, and say, 846 00:37:17,860 --> 00:37:20,280 yeah it's no problem, nobody -- 847 00:37:20,280 --> 00:37:21,530 but those are the things that really kill 848 00:37:21,530 --> 00:37:23,140 the users in there. 849 00:37:23,140 --> 00:37:26,870 We didn't have the user's perspective. 850 00:37:26,870 --> 00:37:29,530 And another thing is that today these 851 00:37:29,530 --> 00:37:30,830 things are really different. 852 00:37:30,830 --> 00:37:33,220 Because you have complex data structures, complex control 853 00:37:33,220 --> 00:37:36,590 flow, complex build processes, 854 00:37:36,590 --> 00:37:38,030 aliasing, type-unsafe languages. 855 00:37:38,030 --> 00:37:41,760 All these cool things we used to do just became a hundred 856 00:37:41,760 --> 00:37:43,430 times harder, because of all those. 857 00:37:43,430 --> 00:37:46,470 People fell in love with data structures. 858 00:37:46,470 --> 00:37:48,270 And they say, oh, they were so nice. 859 00:37:48,270 --> 00:37:50,310 We knew exactly the same size thing, it doesn't change, I 860 00:37:50,310 --> 00:37:51,130 could analyze it. 861 00:37:51,130 --> 00:37:55,150 But malloc'd data structures, just very complex 862 00:37:55,150 --> 00:37:59,620 trees and doubly linked lists, were just 863 00:37:59,620 --> 00:38:01,600 impossible to deal with. 864 00:38:01,600 --> 00:38:03,490 We want to go back to the days where everything is a nice 865 00:38:03,490 --> 00:38:08,800 simple array. But I guess the task for the compiler became 866 00:38:08,800 --> 00:38:12,010 much harder in those days. 867 00:38:12,010 --> 00:38:15,520 So I think compilers are a critical thing to deal with.
868 00:38:15,520 --> 00:38:18,006 I think if we have to improve in multicores we have to deal 869 00:38:18,006 --> 00:38:20,780 with compilers. 870 00:38:20,780 --> 00:38:23,560 And the sad thing is some of those things people are trying 871 00:38:23,560 --> 00:38:26,940 to do today by hand, we could do with the compilers 872 00:38:26,940 --> 00:38:27,970 15 years ago. 873 00:38:27,970 --> 00:38:29,280 We kind of lost that technology. 874 00:38:29,280 --> 00:38:32,170 And people are doing this by hand again today. 875 00:38:32,170 --> 00:38:34,290 We need to go back and dust off this technology and 876 00:38:34,290 --> 00:38:36,640 bring it back in there. 877 00:38:36,640 --> 00:38:38,790 Best case, we might automate everything. 878 00:38:38,790 --> 00:38:42,810 And say, don't worry, do what you are doing today and just 879 00:38:42,810 --> 00:38:44,960 keep doing it and we'll get multicore performance. 880 00:38:44,960 --> 00:38:46,560 I don't think that's going to be that realistic. 881 00:38:46,560 --> 00:38:49,600 But in the worst case, at least we can help the darn programmers. 882 00:38:49,600 --> 00:38:52,010 Tell them what's good, what to do when they 883 00:38:52,010 --> 00:38:53,050 do something wrong. 884 00:38:53,050 --> 00:38:54,420 How can you build those kinds of things? 885 00:38:54,420 --> 00:38:55,670 I think that's very important. 886 00:38:58,480 --> 00:39:01,890 I think on the tool side, we need tools. 887 00:39:01,890 --> 00:39:06,400 I mean you realize after programming on [OBSCURED] 888 00:39:06,400 --> 00:39:08,560 that it was really nice to have your tools. 889 00:39:08,560 --> 00:39:10,990 Everybody can probably come up with some tool additions. 890 00:39:10,990 --> 00:39:13,430 And I think tools, on this, are pretty ancient. 891 00:39:13,430 --> 00:39:15,970 How do you come up with really good tools that the normal 892 00:39:15,970 --> 00:39:17,120 programmer can use?
893 00:39:17,120 --> 00:39:21,990 Figure out problems, figure out debugging issues and stuff 894 00:39:21,990 --> 00:39:24,610 like that, and get through together. 895 00:39:24,610 --> 00:39:28,400 We need an Eclipse-type platform for multicores. 896 00:39:28,400 --> 00:39:30,500 We'd have a lot of nice plug-ins come about that actually help 897 00:39:30,500 --> 00:39:33,840 you to do that. 898 00:39:33,840 --> 00:39:35,660 Another interesting thing, this is about 899 00:39:35,660 --> 00:39:37,750 this dogfooding problem. 900 00:39:37,750 --> 00:39:40,100 Normally when you come up with a language, compiler, or tool, you do 901 00:39:40,100 --> 00:39:44,650 implementation, you do evaluation of that and you 902 00:39:44,650 --> 00:39:45,930 cycle through that. 903 00:39:45,930 --> 00:39:51,040 And evaluation says, OK, you develop a program, you debug 904 00:39:51,040 --> 00:39:52,870 the program for functionality. 905 00:39:52,870 --> 00:39:55,260 Performance debugging, evaluate, and everything kind 906 00:39:55,260 --> 00:39:56,030 of goes around that. 907 00:39:56,030 --> 00:39:57,210 I mean there's a process here. 908 00:39:57,210 --> 00:39:59,140 And you realize some things are very hard to 909 00:39:59,140 --> 00:40:00,370 do, some things are easy. 910 00:40:00,370 --> 00:40:03,030 And you need that to basically evolve that. 911 00:40:03,030 --> 00:40:05,110 And the problem is -- 912 00:40:05,110 --> 00:40:06,330 I'm contrasting two things. 913 00:40:06,330 --> 00:40:08,840 If you look at something like CAD tools. 914 00:40:08,840 --> 00:40:10,830 CAD tools were developed by a bunch of guys sitting in 915 00:40:10,830 --> 00:40:14,130 places, used by some very different group of people. 916 00:40:14,130 --> 00:40:16,090 Because of that, today's CAD tools are horrendous to use. 917 00:40:16,090 --> 00:40:18,200 Everyone hates them.
918 00:40:18,200 --> 00:40:19,990 Because the people who developed CAD tools never 919 00:40:19,990 --> 00:40:21,400 thought about users. 920 00:40:21,400 --> 00:40:22,170 They were indifferent. 921 00:40:22,170 --> 00:40:23,990 And they just had to get the functionality out and they 922 00:40:23,990 --> 00:40:25,520 were happy after that. 923 00:40:25,520 --> 00:40:28,180 Whereas, object oriented languages were more thought 924 00:40:28,180 --> 00:40:31,130 about in terms of the best way to use them. 925 00:40:31,130 --> 00:40:33,130 Very much of a programmer-centric phase. 926 00:40:33,130 --> 00:40:36,600 And in the end, you develop a system that's a lot easier to 927 00:40:36,600 --> 00:40:39,500 use, a lot of people use it, more acceptance in there. 928 00:40:39,500 --> 00:40:41,680 The problem with high performance languages is they 929 00:40:41,680 --> 00:40:43,300 really sit in the CAD tool way. 930 00:40:43,300 --> 00:40:45,480 Because the people who write high performance languages 931 00:40:45,480 --> 00:40:47,890 never really use them. 932 00:40:47,890 --> 00:40:49,880 People who write the compilers and languages are not the ones 933 00:40:49,880 --> 00:40:53,430 who are developing the high performance programs. No high 934 00:40:53,430 --> 00:40:54,560 performance language ever wrote the 935 00:40:54,560 --> 00:40:56,520 compiler in that language. 936 00:40:56,520 --> 00:40:58,230 They say, oh, that's a different domain, we don't 937 00:40:58,230 --> 00:40:59,460 know how to do that. 938 00:40:59,460 --> 00:41:02,850 By doing that, you really never expose the problems in 939 00:41:02,850 --> 00:41:05,160 there, and I think the key thing is how do you go about 940 00:41:05,160 --> 00:41:05,550 doing that? 941 00:41:05,550 --> 00:41:07,530 This is a hard problem. 942 00:41:07,530 --> 00:41:10,300 Because how do you get two different groups together?
943 00:41:10,300 --> 00:41:12,710 Perhaps with multicores, what you have to do is, probably 944 00:41:12,710 --> 00:41:14,710 everybody who writes the compiler or tool actually has 945 00:41:14,710 --> 00:41:17,440 to write it in the thing that you are doing, because you 946 00:41:17,440 --> 00:41:18,540 need the parallelism. 947 00:41:18,540 --> 00:41:21,430 At that point you realize all the problems with the tool. 948 00:41:21,430 --> 00:41:24,800 So I'm going to skip this slide. 949 00:41:24,800 --> 00:41:27,210 Another interesting, very interesting problem, that 950 00:41:27,210 --> 00:41:29,290 people don't pay attention to is the migration 951 00:41:29,290 --> 00:41:31,480 of the dusty deck. 952 00:41:31,480 --> 00:41:34,350 Millions if not billions of lines of code out there, 953 00:41:34,350 --> 00:41:36,130 written in an old-style way. 954 00:41:36,130 --> 00:41:39,190 You can't just say, look, Microsoft, just go rewrite 955 00:41:39,190 --> 00:41:42,310 everything new with my spiffy new system. 956 00:41:42,310 --> 00:41:42,930 That doesn't work. 957 00:41:42,930 --> 00:41:44,990 How to get these people to migrate out? 958 00:41:47,540 --> 00:41:49,590 That might not mean doing everything automatically, but 959 00:41:49,590 --> 00:41:52,295 how can you even help -- the person who wrote the code is 960 00:41:52,295 --> 00:41:55,310 probably gone, left the company, probably dead. 961 00:41:55,310 --> 00:41:57,810 How do you help somebody move some of these things to you? 962 00:41:57,810 --> 00:41:59,750 That is the rate of evolution. 963 00:41:59,750 --> 00:42:02,230 You don't have to rewrite everything from scratch. 964 00:42:02,230 --> 00:42:04,960 What kind of tools can you use? 965 00:42:04,960 --> 00:42:08,690 And these applications are still in use. 966 00:42:08,690 --> 00:42:11,270 Sometimes source code is even available, but 967 00:42:11,270 --> 00:42:13,780 programmers are gone.
968 00:42:13,780 --> 00:42:18,450 But the interesting thing is, some of those things, 969 00:42:18,450 --> 00:42:23,540 applications have bugs that have become features. 970 00:42:23,540 --> 00:42:25,190 So I'll give you a very good example. 971 00:42:25,190 --> 00:42:27,330 And this happened to Microsoft. 972 00:42:27,330 --> 00:42:30,390 So at Microsoft at some point -- if you know Word, Word does 973 00:42:30,390 --> 00:42:31,340 algorithmic pagination. 974 00:42:31,340 --> 00:42:34,040 That means how to lay out the pages, and how to break pages 975 00:42:34,040 --> 00:42:34,680 and stuff like that. 976 00:42:34,680 --> 00:42:36,343 So they had the algorithm in there. 977 00:42:36,343 --> 00:42:38,750 And they were switching versions, and they said look, 978 00:42:38,750 --> 00:42:40,440 this algorithm is now too old. 979 00:42:40,440 --> 00:42:41,420 It's been there for five, six years. 980 00:42:41,420 --> 00:42:42,490 Rewrite it. 981 00:42:42,490 --> 00:42:44,700 And of course you can assume what the spec is. 982 00:42:44,700 --> 00:42:46,450 There's a nice spec to that, a few pages. 983 00:42:46,450 --> 00:42:48,550 They said, go rewrite that. 984 00:42:48,550 --> 00:42:51,510 And someone wrote a very nice pagination algorithm. 985 00:42:51,510 --> 00:42:55,325 But the problem is, when they used that algorithm, a lot of 986 00:42:55,325 --> 00:42:56,310 old documents broke. 987 00:42:56,310 --> 00:42:57,490 That means they didn't paginate right. 988 00:42:57,490 --> 00:43:00,760 So the old document, when you open it, looks very different. 989 00:43:00,760 --> 00:43:04,170 And they said, darn, what's going on? 990 00:43:04,170 --> 00:43:06,080 What they realized is, the old implementation 991 00:43:06,080 --> 00:43:07,500 had a lot of bugs. 992 00:43:07,500 --> 00:43:09,110 Subtle things. 993 00:43:09,110 --> 00:43:11,640 The guy didn't really conform to the standard.
994 00:43:11,640 --> 00:43:14,350 So it didn't break the word exactly right. If at some 995 00:43:14,350 --> 00:43:16,860 point it has a break and space, it would probably put 996 00:43:16,860 --> 00:43:17,980 it on the next line. 997 00:43:17,980 --> 00:43:19,370 A few things that didn't really -- 998 00:43:19,370 --> 00:43:21,830 And what they had to do is, they had to spend a lot of 999 00:43:21,830 --> 00:43:26,070 time discovering bugs and adding them to the spec. 1000 00:43:26,070 --> 00:43:27,010 So they did that. 1001 00:43:27,010 --> 00:43:29,800 Because otherwise, all the previous programs don't work, 1002 00:43:29,800 --> 00:43:33,050 the previous files will not look right. 1003 00:43:33,050 --> 00:43:34,920 And they went through this entire process of trying to 1004 00:43:34,920 --> 00:43:36,660 discover and trying to figure out what [OBSCURED] 1005 00:43:36,660 --> 00:43:39,080 instead of what the text of the spec said. 1006 00:43:39,080 --> 00:43:41,000 They're out of sync. 1007 00:43:41,000 --> 00:43:42,900 And so a lot of times, there's that problem. 1008 00:43:42,900 --> 00:43:44,466 So you see we have something old, and we 1009 00:43:44,466 --> 00:43:45,050 want to get it new. 1010 00:43:45,050 --> 00:43:47,150 It doesn't mean that you know the functionality. 1011 00:43:47,150 --> 00:43:50,660 Is the functionality that was implemented the spec, or 1012 00:43:50,660 --> 00:43:51,950 the functionality that you wrote about? 1013 00:43:51,950 --> 00:43:53,740 Because that's a little bit subtly different. 1014 00:43:53,740 --> 00:43:55,330 So how do you go and discover that? 1015 00:43:55,330 --> 00:43:56,470 I think that's a very interesting 1016 00:43:56,470 --> 00:43:58,560 and important problem.
1017 00:43:58,560 --> 00:44:00,850 And I think there are many reasons that run such things 1018 00:44:00,850 --> 00:44:04,230 like creating test cases, extracting invariants, things 1019 00:44:04,230 --> 00:44:07,000 like failure oblivious computing type, if you had 1020 00:44:07,000 --> 00:44:08,600 listened to professor Rabbah's talk . 1021 00:44:08,600 --> 00:44:12,150 Things like that can help take these old things and migrate. 1022 00:44:12,150 --> 00:44:13,380 Can use tools to do that. 1023 00:44:13,380 --> 00:44:17,850 I did a lot of interesting research on that. 1024 00:44:17,850 --> 00:44:21,890 OK, so that's my take on languages and compilers, where 1025 00:44:21,890 --> 00:44:24,620 interesting things have happened. 1026 00:44:24,620 --> 00:44:26,490 I want to talk about revolution, but I want to skip 1027 00:44:26,490 --> 00:44:30,290 that first and go into this crossing the abstraction 1028 00:44:30,290 --> 00:44:34,450 boundaries part, and then come back to revolution. 1029 00:44:34,450 --> 00:44:39,780 So the way the world works is, you have some kind of class of 1030 00:44:39,780 --> 00:44:43,780 computing you want to do, and you have some atoms that need 1031 00:44:43,780 --> 00:44:45,100 to run that computation. 1032 00:44:45,100 --> 00:44:49,460 That's basically the end to end of any process. 1033 00:44:49,460 --> 00:44:56,130 But if I said, here's some silicon, and here's my 1034 00:44:56,130 --> 00:44:58,980 algorithm, go do that, no single person can basically 1035 00:44:58,980 --> 00:45:00,410 build that. 1036 00:45:00,410 --> 00:45:00,660 OK? 1037 00:45:00,660 --> 00:45:03,650 You can't take silicon, and go through a semiconductor 1038 00:45:03,650 --> 00:45:06,740 process, build a process, design all those things, 1039 00:45:06,740 --> 00:45:08,890 language -- it doesn't happen. 1040 00:45:08,890 --> 00:45:10,950 Someday probably it was possible to have one person 1041 00:45:10,950 --> 00:45:12,390 know everything, but you can't. 
1042 00:45:12,390 --> 00:45:16,210 So what we have done to deal with that is build abstraction boundaries. 1043 00:45:16,210 --> 00:45:21,090 We have compilers, languages, instruction sets, 1044 00:45:21,090 --> 00:45:24,870 microarchitecture, layout, design rules, process, 1045 00:45:24,870 --> 00:45:26,250 materials science -- 1046 00:45:26,250 --> 00:45:28,190 basically, everything is separated out. 1047 00:45:28,190 --> 00:45:30,220 Every time one gets too complicated, we kind of divide 1048 00:45:30,220 --> 00:45:32,400 it in half, and keep adding and adding. 1049 00:45:32,400 --> 00:45:35,580 And we have this big stack of things. 1050 00:45:35,580 --> 00:45:38,660 So at the beginning, when you break up, people kind of knew 1051 00:45:38,660 --> 00:45:40,690 both sides, and they went into one side and kind 1052 00:45:40,690 --> 00:45:41,970 of broke that up. 1053 00:45:41,970 --> 00:45:44,960 And so people knew a lot of things. 1054 00:45:44,960 --> 00:45:47,420 But when the next generation comes, they only know that. 1055 00:45:47,420 --> 00:45:50,480 The guy, someone who's going to be expert in 1056 00:45:50,480 --> 00:45:51,460 microarchitecture. 1057 00:45:51,460 --> 00:45:52,380 He only knows microarchitecture. 1058 00:45:52,380 --> 00:45:54,120 He has no idea what the languages are, because that 1059 00:45:54,120 --> 00:45:56,830 world of his is complicated. 1060 00:45:56,830 --> 00:45:59,250 What we have is a compartmentalized world that 1061 00:45:59,250 --> 00:46:04,090 was broken up as we went in the last 40 years, as things 1062 00:46:04,090 --> 00:46:06,220 progressed, the right thing at that time, they figured out 1063 00:46:06,220 --> 00:46:07,890 what layers to do. 1064 00:46:07,890 --> 00:46:11,150 What we have done is create the domain expert. 1065 00:46:11,150 --> 00:46:13,050 There's one guy who probably knows about two things.
1066 00:46:13,050 --> 00:46:15,130 He might know a little bit of design rules, and probably 1067 00:46:15,130 --> 00:46:16,300 process a little bit. 1068 00:46:16,300 --> 00:46:18,590 Another person will know a little bit of design rules, 1069 00:46:18,590 --> 00:46:19,420 layout, and microarchitecture. 1070 00:46:19,420 --> 00:46:21,920 And there's someone who'll do microarchitecture and ISA. 1071 00:46:21,920 --> 00:46:24,400 And somebody knows compilers and ISA, but has no clue 1072 00:46:24,400 --> 00:46:25,780 what's happening. 1073 00:46:25,780 --> 00:46:30,430 So what this has done is kind of entrenched some layering 1074 00:46:30,430 --> 00:46:34,060 that really made sense about 30 years ago. 1075 00:46:34,060 --> 00:46:36,310 The problem now is, this is creating a lot 1076 00:46:36,310 --> 00:46:38,060 of issues in here. 1077 00:46:38,060 --> 00:46:42,910 So what you kind of really need is a way to break some of 1078 00:46:42,910 --> 00:46:44,140 this layering. 1079 00:46:44,140 --> 00:46:45,070 How do you do that? 1080 00:46:45,070 --> 00:46:49,330 You need people who can cross multiple of these disciplines. 1081 00:46:49,330 --> 00:46:51,650 So this is a thing for you. 1082 00:46:51,650 --> 00:46:53,830 I mean, you might think you are an architect. 1083 00:46:53,830 --> 00:46:55,460 But hey, you're in this class, you're learning a little bit 1084 00:46:55,460 --> 00:46:56,500 of programming. 1085 00:46:56,500 --> 00:46:58,700 Go learn a little bit of physics, what's in there. 1086 00:46:58,700 --> 00:47:00,590 Just have a little bit of broadening in there. 1087 00:47:00,590 --> 00:47:03,380 Then a lot of people, there are some people who are a lot 1088 00:47:03,380 --> 00:47:05,440 more in the process side, but know a little bit about 1089 00:47:05,440 --> 00:47:06,450 microarchitecture and compilers, 1090 00:47:06,450 --> 00:47:07,740 somewhere in the middle. 
1091 00:47:07,740 --> 00:47:10,920 They are expert in the middle, but they are not just sitting 1092 00:47:10,920 --> 00:47:12,850 happily in their own domain. 1093 00:47:12,850 --> 00:47:16,380 Just trying to look at cross-cutting there. 1094 00:47:16,380 --> 00:47:19,690 And keep going up, in some sense. 1095 00:47:19,690 --> 00:47:22,450 I kind of put myself in there, saying OK, I am in compilers, 1096 00:47:22,450 --> 00:47:23,830 but I know a little bit of language, and I 1097 00:47:23,830 --> 00:47:24,170 know a little -- 1098 00:47:24,170 --> 00:47:26,650 I try to go down and down as much as possible. 1099 00:47:26,650 --> 00:47:28,800 What you actually need is someone who knows everything 1100 00:47:28,800 --> 00:47:30,640 really well. 1101 00:47:30,640 --> 00:47:33,560 That is hard, but at least someone who can know the top 1102 00:47:33,560 --> 00:47:36,870 and the bottom to redo the middle. 1103 00:47:36,870 --> 00:47:39,490 Because my feeling is the way the middle is done is probably 1104 00:47:39,490 --> 00:47:41,850 now too old. 1105 00:47:41,850 --> 00:47:43,470 So I don't know that this is possible. 1106 00:47:43,470 --> 00:47:44,250 It's not me. 1107 00:47:44,250 --> 00:47:46,690 But if someone really understands the top and the 1108 00:47:46,690 --> 00:47:49,225 bottom, and say OK, I can probably throw the middle out 1109 00:47:49,225 --> 00:47:50,650 and redo the middle. 1110 00:47:50,650 --> 00:47:53,680 And I think that might be an interesting revolution that 1111 00:47:53,680 --> 00:47:54,000 might happen. 1112 00:47:54,000 --> 00:47:54,880 It's hard. 1113 00:47:54,880 --> 00:47:56,790 There's so much information there. 1114 00:47:56,790 --> 00:47:59,670 Some of the information, there are the layers out there, it 1115 00:47:59,670 --> 00:48:02,340 doesn't make sense, today, the way things are. 1116 00:48:02,340 --> 00:48:06,910 I mean for example, a lot of these layers had issues with 1117 00:48:06,910 --> 00:48:07,950 wire delay. 
1118 00:48:07,950 --> 00:48:09,920 Because when wire delay was not an issue, these layers 1119 00:48:09,920 --> 00:48:11,100 made perfect sense. 1120 00:48:11,100 --> 00:48:14,250 But as wire delay came about, a lot of changes had to 1121 00:48:14,250 --> 00:48:16,430 propagate across layers, and it was very hard, because 1122 00:48:16,430 --> 00:48:18,420 people were domain experts. 1123 00:48:18,420 --> 00:48:21,540 So I think this is where the revolution is going to come. 1124 00:48:21,540 --> 00:48:23,860 Somebody's going to [OBSCURED] 1125 00:48:23,860 --> 00:48:25,815 the way you do compilers, the way you do architecture, the 1126 00:48:25,815 --> 00:48:28,360 way you decide everything is wrong. 1127 00:48:28,360 --> 00:48:30,470 I know the algorithms you want to run. 1128 00:48:30,470 --> 00:48:33,340 I know what's available in very low [OBSCURED]. 1129 00:48:33,340 --> 00:48:35,160 Let me see you put that together. 1130 00:48:35,160 --> 00:48:36,940 A person, a group, whatever. 1131 00:48:36,940 --> 00:48:38,190 Might have a beginning. 1132 00:48:40,770 --> 00:48:46,750 And so with that, I will go to the revolution side. 1133 00:48:46,750 --> 00:48:48,800 So the way you look at revolutions, what are the 1134 00:48:48,800 --> 00:48:50,320 far-out technologies? 1135 00:48:50,320 --> 00:48:52,720 And sometimes revolutions just come from wishful thinking. 1136 00:48:52,720 --> 00:48:55,380 I wish we can do this, and then if you can push hard, you 1137 00:48:55,380 --> 00:48:57,150 might be able to get there. 1138 00:48:57,150 --> 00:49:01,560 Anybody wants to -- so in this talk I talk way too much. 1139 00:49:01,560 --> 00:49:03,700 How about, come in? 1140 00:49:03,700 --> 00:49:04,920 What do you think is going to happen? 1141 00:49:04,920 --> 00:49:06,580 What do you want to work on? 1142 00:49:06,580 --> 00:49:08,460 You guys are the ones who have to make this decision. 1143 00:49:08,460 --> 00:49:08,860 Not me. 
1144 00:49:08,860 --> 00:49:12,750 I've already made a lot of decisions myself, 10, 15 years 1145 00:49:12,750 --> 00:49:14,170 ago, what I'm going to do. 1146 00:49:14,170 --> 00:49:15,570 You are in the process, you have a lot 1147 00:49:15,570 --> 00:49:16,440 more choices to make. 1148 00:49:16,440 --> 00:49:20,460 You haven't narrowed your horizon yet. 1149 00:49:20,460 --> 00:49:24,487 So this affects you a lot more than me at this point. 1150 00:49:32,880 --> 00:49:34,350 What do you think is going to change the world? 1151 00:49:38,060 --> 00:49:38,990 What do you want to work on? 1152 00:49:38,990 --> 00:49:41,740 I mean, you should work on something, I guess, that you 1153 00:49:41,740 --> 00:49:44,240 think is going to have a huge impact. 1154 00:49:44,240 --> 00:49:45,490 Have you thought about that? 1155 00:49:49,090 --> 00:49:50,180 A lot of silence. 1156 00:49:50,180 --> 00:50:04,890 AUDIENCE: [OBSCURED] 1157 00:50:04,890 --> 00:50:10,590 PROFESSOR: So faster means very little latency. 1158 00:50:10,590 --> 00:50:12,335 Latency is a hard problem. 1159 00:50:15,250 --> 00:50:18,440 A lot of people who are trying to do latency [OBSCURED] 1160 00:50:18,440 --> 00:50:22,225 And on the revolution side, people are looking at whether quantum can 1161 00:50:22,225 --> 00:50:27,660 kind of deal with latency. 1162 00:50:27,660 --> 00:50:28,910 AUDIENCE: Yeah. 1163 00:50:31,370 --> 00:50:34,030 PROFESSOR: Yes, I think there are a lot of revolutionary 1164 00:50:34,030 --> 00:50:36,050 things you can do there. 1165 00:50:36,050 --> 00:50:38,570 Especially taking advantage of the fact that you have a lot 1166 00:50:38,570 --> 00:50:39,220 of wires in there. 1167 00:50:39,220 --> 00:50:42,760 And I think people haven't really pursued that yet. 1168 00:50:42,760 --> 00:50:46,270 And this is where the complexity will come in. 1169 00:50:46,270 --> 00:50:48,770 Trying to basically deal with these, take advantage of these 1170 00:50:48,770 --> 00:50:49,800 kinds of things.
1171 00:50:49,800 --> 00:50:51,400 I think there are a lot of opportunities there. 1172 00:50:51,400 --> 00:50:53,620 And then that's the kind of evolutionary part. 1173 00:50:53,620 --> 00:50:54,580 And then some people are looking at 1174 00:50:54,580 --> 00:50:56,110 revolution, with quantum. 1175 00:50:59,730 --> 00:51:00,980 What else? 1176 00:51:02,840 --> 00:51:05,300 I expected people to have a lot more interesting insights. 1177 00:51:05,300 --> 00:51:17,710 AUDIENCE: [OBSCURED] 1178 00:51:17,710 --> 00:51:20,940 PROFESSOR: I think the world is almost bifurcated. 1179 00:51:20,940 --> 00:51:24,200 There are things that you can have with power plugs, and 1180 00:51:24,200 --> 00:51:26,770 things that don't have power plugs. 1181 00:51:26,770 --> 00:51:29,690 And I think people who have power plugs still have some 1182 00:51:29,690 --> 00:51:32,740 power issues, like what Google is finding out. 1183 00:51:32,740 --> 00:51:34,970 You can't just keep wasting power. 1184 00:51:34,970 --> 00:51:39,690 But as things start becoming more mobile, going into 1185 00:51:39,690 --> 00:51:44,450 smaller and smaller devices, power and size becomes a much 1186 00:51:44,450 --> 00:51:47,200 bigger thing than what more you can do. 1187 00:51:47,200 --> 00:51:49,570 You don't become performance, I mean, you want some level of 1188 00:51:49,570 --> 00:51:51,550 performance; you need to get a video. 1189 00:51:51,550 --> 00:51:53,750 But the minute you get the video, you don't want to have 1190 00:51:53,750 --> 00:51:57,680 HD TV. You just say, OK, get the media onto 1191 00:51:57,680 --> 00:52:00,590 my cellphone screen. 1192 00:52:00,590 --> 00:52:01,900 I'll be happy with that speed. 1193 00:52:01,900 --> 00:52:02,970 Just do it. 1194 00:52:02,970 --> 00:52:03,800 As low power as possible, as [OBSCURED] 1195 00:52:03,800 --> 00:52:05,160 possible. 1196 00:52:05,160 --> 00:52:07,980 So I think that's pushing there. 
1197 00:52:07,980 --> 00:52:10,450 So for example, for a long time what has happened is 1198 00:52:10,450 --> 00:52:11,780 there's a trickle down economy. 1199 00:52:11,780 --> 00:52:14,120 Things start happening in a very high end system and 1200 00:52:14,120 --> 00:52:16,500 kind of trickle down to the low end. 1201 00:52:16,500 --> 00:52:19,310 But there might be a diversion here. 1202 00:52:19,310 --> 00:52:21,810 There might be a different part coming. 1203 00:52:21,810 --> 00:52:25,710 Because that technology might be fundamentally different. 1204 00:52:25,710 --> 00:52:27,122 AUDIENCE: So [OBSCURED] 1205 00:52:27,122 --> 00:52:33,170 also changing from the software side is this notion 1206 00:52:33,170 --> 00:52:34,565 of perpetual data. 1207 00:52:34,565 --> 00:52:38,630 I've been using this mash-up code, you put it up on the 1208 00:52:38,630 --> 00:52:40,670 web, lots of people use it. 1209 00:52:40,670 --> 00:52:42,640 So it's pushing toward scripting light 1210 00:52:42,640 --> 00:52:44,500 languages or whatever. 1211 00:52:44,500 --> 00:52:47,430 PROFESSOR: So this is an interesting obsession I have. 1212 00:52:47,430 --> 00:52:54,290 So [OBSCURED] 1213 00:52:54,290 --> 00:52:58,020 And then a few years ago I had a startup, and I just again 1214 00:52:58,020 --> 00:53:01,390 had the time to try it, not large, some 1215 00:53:01,390 --> 00:53:03,160 large piece of code. 1216 00:53:03,160 --> 00:53:06,820 What I realized was, what programming had become, is the 1217 00:53:06,820 --> 00:53:08,580 biggest, best tool for programming at 1218 00:53:08,580 --> 00:53:10,010 that time was Google. 1219 00:53:10,010 --> 00:53:12,420 Because I was trying a bunch of things with Windows. 1220 00:53:12,420 --> 00:53:14,490 And for everything I wanted, there was a function. 1221 00:53:14,490 --> 00:53:17,446 And if I can find that function, my work is a factor 1222 00:53:17,446 --> 00:53:19,140 of 100 smaller.
1223 00:53:19,140 --> 00:53:21,370 So I spend more time than actually thinking about 1224 00:53:21,370 --> 00:53:23,650 algorithms, how the lists should be, or how my data 1225 00:53:23,650 --> 00:53:25,290 structures should be, trying to Google it. 1226 00:53:25,290 --> 00:53:26,530 I said, darn, I want to do this. 1227 00:53:26,530 --> 00:53:27,460 Where's that function? 1228 00:53:27,460 --> 00:53:30,410 Where can I find that function in this entire big library. 1229 00:53:30,410 --> 00:53:31,990 So what that means is there's kind of this 1230 00:53:31,990 --> 00:53:34,290 missing layer in there. 1231 00:53:34,290 --> 00:53:38,110 Because what has happened is, our language is a level of 1232 00:53:38,110 --> 00:53:38,960 abstraction. 1233 00:53:38,960 --> 00:53:41,650 And then we have built these libraries with no kind of 1234 00:53:41,650 --> 00:53:42,520 support at all. 1235 00:53:42,520 --> 00:53:44,680 I mean, the compilers' languages don't really support 1236 00:53:44,680 --> 00:53:46,350 the libraries. 1237 00:53:46,350 --> 00:53:48,090 Because it's just built out of basic blocks. 1238 00:53:48,090 --> 00:53:50,660 And keeping it, can you have better basic blocks, can you 1239 00:53:50,660 --> 00:53:52,650 have more nice abstraction? 1240 00:53:52,650 --> 00:53:58,170 Instead of saying, OK, here's 5,000 libraries with 250,000 1241 00:53:58,170 --> 00:54:00,420 different functions that you can use for compiling. 1242 00:54:00,420 --> 00:54:01,860 Can you build it in a nicer way? 1243 00:54:01,860 --> 00:54:06,718 AUDIENCE: [OBSCURED] 1244 00:54:06,718 --> 00:54:09,310 interface, your input, your output. 1245 00:54:09,310 --> 00:54:11,258 And then you publish your code and then Google 1246 00:54:11,258 --> 00:54:11,450 indexes it for you. 1247 00:54:11,450 --> 00:54:14,410 And then your program is just of Google searches. 
1248 00:54:14,410 --> 00:54:19,670 PROFESSOR: I think, if you think about what all the 1249 00:54:19,670 --> 00:54:22,560 Windows source base is, it's like that. 1250 00:54:22,560 --> 00:54:28,360 I mean, in the company we were doing, during the [OBSCURED], 1251 00:54:28,360 --> 00:54:31,870 there was this group of people who understood every line of 1252 00:54:31,870 --> 00:54:35,270 code they wrote, because they were writing a virtualization 1253 00:54:35,270 --> 00:54:37,220 system that runs on Windows. 1254 00:54:37,220 --> 00:54:39,420 So for them, every line was probably -- 1255 00:54:39,420 --> 00:54:43,150 Interesting thing is, when that group started, that 1256 00:54:43,150 --> 00:54:46,090 entire piece of code was about 50,000 lines. 1257 00:54:46,090 --> 00:54:47,910 Now it's about 65,000 lines. 1258 00:54:47,910 --> 00:54:50,780 But we had about 10 people working on that. 1259 00:54:50,780 --> 00:54:52,920 That means these things get revised. 1260 00:54:52,920 --> 00:54:53,950 You don't keep at it. 1261 00:54:53,950 --> 00:54:57,150 There's another group of people who are doing user 1262 00:54:57,150 --> 00:54:58,770 interface and stuff like that. 1263 00:54:58,770 --> 00:55:01,510 They have an alphabet soup of different libraries. 1264 00:55:01,510 --> 00:55:03,332 They're using Ajax, this, that -- 1265 00:55:03,332 --> 00:55:05,920 I don't even know what they are using. 1266 00:55:05,920 --> 00:55:08,650 The amount of code they write is immense. 1267 00:55:08,650 --> 00:55:10,590 The amount of things they have to know is immense. 1268 00:55:10,590 --> 00:55:13,060 Each of them is not that complicated. 1269 00:55:13,060 --> 00:55:18,080 But if you use a bad thing, you have a bad looking 1270 00:55:18,080 --> 00:55:21,270 interface, clunky, slow, whatever. If you figure out 1271 00:55:21,270 --> 00:55:24,060 the right three function calls, 1272 00:55:24,060 --> 00:55:25,310 things look much better.
1273 00:55:25,310 --> 00:55:27,490 That's a very different type of programming in there. 1274 00:55:27,490 --> 00:55:30,460 So for them, it's basically knowledge. 1275 00:55:30,460 --> 00:55:32,530 How much do you know. 1276 00:55:32,530 --> 00:55:37,280 Just like, it's almost like I remember my college days, 1277 00:55:37,280 --> 00:55:39,990 friends who were doing things like that in medical school. 1278 00:55:39,990 --> 00:55:42,940 So you need to know this entire thing about all the 1279 00:55:42,940 --> 00:55:44,100 different muscles in the body. 1280 00:55:44,100 --> 00:55:46,470 And there's this big thick book you just study. 1281 00:55:46,470 --> 00:55:47,270 It's almost like that. 1282 00:55:47,270 --> 00:55:50,690 You just take this big Windows manual that is a few hundred 1283 00:55:50,690 --> 00:55:53,330 pounds heavy, and you look at every page, and figure out 1284 00:55:53,330 --> 00:55:54,630 what you know. 1285 00:55:54,630 --> 00:55:56,750 That's a really strange way of coding. 1286 00:55:56,750 --> 00:56:01,050 But, I mean, algorithmically, I think what they were doing 1287 00:56:01,050 --> 00:56:02,100 is pretty simple. 1288 00:56:02,100 --> 00:56:06,180 I don't think they ever designed a data structure. 1289 00:56:06,180 --> 00:56:08,520 Every data structure they used came from something. 1290 00:56:08,520 --> 00:56:11,967 But they had to figure out the right data structure.
1291 00:56:11,967 --> 00:56:12,463 AUDIENCE: 1292 00:56:12,463 --> 00:56:16,551 This is going in a totally different direction, but in 1293 00:56:16,551 --> 00:56:19,550 terms of graphics in video games, perhaps getting artists 1294 00:56:19,550 --> 00:56:22,212 who are aware of both the hardware and the software, and 1295 00:56:22,212 --> 00:56:25,550 so can design more for the capabilities of the machine, 1296 00:56:25,550 --> 00:56:29,840 and can make more visually appealing -- 1297 00:56:29,840 --> 00:56:31,640 PROFESSOR: I think that's the area that has a very different 1298 00:56:31,640 --> 00:56:33,130 set of communities. 1299 00:56:33,130 --> 00:56:35,490 Artists [OBSCURED]. 1300 00:56:35,490 --> 00:56:38,870 Not even trained engineers. 1301 00:56:38,870 --> 00:56:40,250 And they're sitting in one, and they have this neat 1302 00:56:40,250 --> 00:56:41,940 feature, they want fire to look natural, 1303 00:56:41,940 --> 00:56:42,950 or whatever it is. 1304 00:56:42,950 --> 00:56:44,670 And then there's the programmers and hardware 1305 00:56:44,670 --> 00:56:48,070 people, who know what the capability is. 1306 00:56:48,070 --> 00:56:50,290 I think that's what we're trying to push in Mike 1307 00:56:50,290 --> 00:56:52,510 Eckham's talk, and say how do you go about it? 1308 00:56:52,510 --> 00:56:55,390 How do you know when a new thing comes, what is doable? 1309 00:56:55,390 --> 00:56:59,950 Like for example, what ATI does, ATI has a group called 1310 00:56:59,950 --> 00:57:01,060 the research group. 1311 00:57:01,060 --> 00:57:04,310 What they do is, every time a new chip comes, they actually 1312 00:57:04,310 --> 00:57:06,700 go and figure out all the things you can do with that. 1313 00:57:06,700 --> 00:57:08,650 OK, how can you do rain in this one?
1314 00:57:08,650 --> 00:57:11,340 So they write a lot of -- 1315 00:57:11,340 --> 00:57:13,830 those with a little bit of an artistic bent, but very 1316 00:57:13,830 --> 00:57:17,570 hard-core programmers who kind of figure out what's doable. 1317 00:57:17,570 --> 00:57:19,330 And they make that available and say here, this is 1318 00:57:19,330 --> 00:57:19,870 what you can do. 1319 00:57:19,870 --> 00:57:21,650 And a lot of people kind of emulate that. 1320 00:57:21,650 --> 00:57:23,410 They say, wait a minute, how did they do that? 1321 00:57:23,410 --> 00:57:25,900 Then kind of figure out, OK, now I have this new hardware, 1322 00:57:25,900 --> 00:57:28,010 and I'm able to do something completely new. 1323 00:57:28,010 --> 00:57:30,160 I think that's pushing there. 1324 00:57:30,160 --> 00:57:32,020 I mean, I think this is where interesting 1325 00:57:32,020 --> 00:57:34,470 revolutions come about. 1326 00:57:34,470 --> 00:57:37,850 One of my firm beliefs is that a lot of good research is not 1327 00:57:37,850 --> 00:57:40,030 in one area, as I pointed out. 1328 00:57:40,030 --> 00:57:42,500 It's kind of looking at two distinct things and kind of 1329 00:57:42,500 --> 00:57:43,410 bringing them together. 1330 00:57:43,410 --> 00:57:45,490 Bringing knowledge from one to another 1331 00:57:45,490 --> 00:57:46,570 and mixing them together. 1332 00:57:46,570 --> 00:57:48,440 I think that can have huge impact. 1333 00:57:48,440 --> 00:57:51,070 And that's a place that I think can have huge impact. 1334 00:57:51,070 --> 00:57:54,910 What other things that will have big visual impact? 1335 00:57:54,910 --> 00:57:58,290 That the next hardware can enable? 1336 00:57:58,290 --> 00:58:00,450 You can keep adding more big servers and stuff like that, 1337 00:58:00,450 --> 00:58:01,470 but it's another thing. 1338 00:58:01,470 --> 00:58:03,485 What other things can have big impact?
1339 00:58:03,485 --> 00:58:04,001 AUDIENCE: 1340 00:58:04,001 --> 00:58:07,612 [OBSCURED] 1341 00:58:07,612 --> 00:58:10,703 The same is true for biologists, physicists, 1342 00:58:10,703 --> 00:58:12,160 chemists, end users. 1343 00:58:12,160 --> 00:58:15,720 So maybe the thing to do is work on language generators. 1344 00:58:15,720 --> 00:58:19,602 An easy way for me as an end user to express language 1345 00:58:19,602 --> 00:58:20,114 [OBSCURED] 1346 00:58:20,114 --> 00:58:21,650 that I want and give me a way to generate it 1347 00:58:21,650 --> 00:58:24,120 PROFESSOR: So that's a lot of things. 1348 00:58:24,120 --> 00:58:26,380 For example, I was talking to this computational biologist. 1349 00:58:26,380 --> 00:58:29,710 And what he said was, for computer science, the best 1350 00:58:29,710 --> 00:58:32,750 thing that happened from a biology point of view is Excel 1351 00:58:32,750 --> 00:58:35,050 only had 32,000 cells. 1352 00:58:35,050 --> 00:58:36,680 Because the biologists keep using Excel, and 1353 00:58:36,680 --> 00:58:38,430 they say that -- 1354 00:58:38,430 --> 00:58:38,660 AUDIENCE: [OBSCURED] 1355 00:58:38,660 --> 00:58:40,190 PROFESSOR: 65,000 cells. 1356 00:58:40,190 --> 00:58:43,050 And they ran into this and said, jeez, I can't do 1357 00:58:43,050 --> 00:58:43,810 anything more in that. 1358 00:58:43,810 --> 00:58:44,740 Now what do I do? 1359 00:58:44,740 --> 00:58:47,180 And then suddenly they had to start looking at programming 1360 00:58:47,180 --> 00:58:48,710 and stuff like that, because they can't have an Excel 1361 00:58:48,710 --> 00:58:49,440 spreadsheet doing that. 1362 00:58:49,440 --> 00:58:53,310 So a lot of biologists actually had to switch. 1363 00:58:53,310 --> 00:58:59,200 So I think, interesting thing about computer science is, there 1364 00:58:59,200 --> 00:59:01,880 are a lot of fundamental sciences still.
1365 00:59:01,880 --> 00:59:06,550 But I think we have grown and run so fast developing that, 1366 00:59:06,550 --> 00:59:09,450 what's lagging behind that is applications. 1367 00:59:09,450 --> 00:59:12,960 How to apply that to biology, gaming, stuff like that. 1368 00:59:12,960 --> 00:59:14,630 And I think a lot of interesting things, as 1369 00:59:14,630 --> 00:59:17,290 computer scientists, we can do is to bring that knowledge into 1370 00:59:17,290 --> 00:59:20,350 an applied community. 1371 00:59:20,350 --> 00:59:24,660 Really, you can have a huge impact and fundamentally 1372 00:59:24,660 --> 00:59:27,180 change things if you apply that knowledge. 1373 00:59:27,180 --> 00:59:31,170 Both theoretical things you have to get out, as well as 1374 00:59:31,170 --> 00:59:33,810 using the powers of the computers we've got 1375 00:59:33,810 --> 00:59:34,960 in the right way. 1376 00:59:34,960 --> 00:59:40,170 I think a lot of times, people are taking a few days to do 1377 00:59:40,170 --> 00:59:42,670 two times better, five times better than kind of seeing 1378 00:59:42,670 --> 00:59:47,760 fundamentals [OBSCURED]. 1379 00:59:47,760 --> 00:59:49,150 I think this is an interesting topic. 1380 00:59:49,150 --> 00:59:50,680 I think I'm going to stop soon. 1381 00:59:50,680 --> 00:59:55,070 And I think it's interesting, especially for you guys. 1382 00:59:55,070 --> 00:59:58,590 Because you are probably going into an area, looking at a 1383 00:59:58,590 --> 01:00:02,940 field, trying to get expert at something. 1384 01:00:02,940 --> 01:00:06,140 There's a lot of times you get seduced to become an expert in 1385 01:00:06,140 --> 01:00:08,920 what's hot today.
1386 01:00:08,920 --> 01:00:11,510 But the interesting thing is, especially if you are doing a 1387 01:00:11,510 --> 01:00:14,300 PhD, or if you do something that takes four or five years 1388 01:00:14,300 --> 01:00:17,100 for you to mature in that area, is it going to be the 1389 01:00:17,100 --> 01:00:19,440 hottest thing in four or five years? 1390 01:00:19,440 --> 01:00:21,380 So multicores are hot today. 1391 01:00:21,380 --> 01:00:24,070 So should I recommend that everybody jump and become 1392 01:00:24,070 --> 01:00:26,670 expert and try to do a PhD in multicores? 1393 01:00:26,670 --> 01:00:27,630 Probably not. 1394 01:00:27,630 --> 01:00:30,040 Because five or six years down the line, some of these things 1395 01:00:30,040 --> 01:00:33,347 might be solved. They'd better get solved, or else we're in 1396 01:00:33,347 --> 01:00:36,710 big trouble. 1397 01:00:36,710 --> 01:00:39,970 But the thing is, what's going to be the problem there? 1398 01:00:39,970 --> 01:00:43,150 I see one thing that's really hard, it's almost art, is to 1399 01:00:43,150 --> 01:00:45,630 kind of picture your [OBSCURED] 1400 01:00:45,630 --> 01:00:48,890 and then, if you're lucky and you do a good 1401 01:00:48,890 --> 01:00:51,930 job in that prediction, you've come about, and by the time 1402 01:00:51,930 --> 01:00:54,960 you get to a point that you become expert, you've become 1403 01:00:54,960 --> 01:00:57,220 expert in the thing that's most important. 1404 01:00:57,220 --> 01:00:58,500 It's not an easy thing to do. 1405 01:00:58,500 --> 01:01:00,510 And a lot of times, nobody can claim that they have a 1406 01:01:00,510 --> 01:01:01,530 technique to do that, too. 1407 01:01:01,530 --> 01:01:04,120 A lot of people who claim, they discount 1408 01:01:04,120 --> 01:01:06,500 how lucky they were. 1409 01:01:06,500 --> 01:01:08,200 There's a large amount of luck associated with it.
1410 01:01:08,200 --> 01:01:10,240 But a key thing is to kind of follow that trend, identify, 1411 01:01:10,240 --> 01:01:11,350 and do that. 1412 01:01:11,350 --> 01:01:13,810 That's the technique that I think every one of us, every 1413 01:01:13,810 --> 01:01:17,720 one of you should do a lot. 1414 01:01:17,720 --> 01:01:18,510 OK. 1415 01:01:18,510 --> 01:01:22,110 With that, do you have something about the -- 1416 01:01:22,110 --> 01:01:23,500 SPEAKER: Yes. 1417 01:01:23,500 --> 01:01:26,240 A couple of things about -- 1418 01:01:26,240 --> 01:01:31,104 Anybody here want direct access to a Playstation 3? 1419 01:01:31,104 --> 01:01:32,077 Just one? 1420 01:01:32,077 --> 01:01:32,563 Two? 1421 01:01:32,563 --> 01:01:37,520 So come by, actually if you have time now, we'll talk 1422 01:01:37,520 --> 01:01:39,832 about that and figure out a way of getting 1423 01:01:39,832 --> 01:01:40,950 you access to it. 1424 01:01:40,950 --> 01:01:45,500 The final competition's going to be this Thursday.