1 00:00:00,030 --> 00:00:02,420 The following content is provided under a Creative 2 00:00:02,420 --> 00:00:03,840 Commons license. 3 00:00:03,840 --> 00:00:06,860 Your support will help MIT OpenCourseWare continue to 4 00:00:06,860 --> 00:00:10,560 offer high quality educational resources for free. 5 00:00:10,560 --> 00:00:13,420 To make a donation, or view additional materials from 6 00:00:13,420 --> 00:00:17,520 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:17,520 --> 00:00:18,770 ocw.mit.edu. 8 00:00:21,690 --> 00:00:25,800 PROFESSOR: Mike Acton today who has done a lot of 9 00:00:25,800 --> 00:00:27,320 programming on cell and also done a lot of game 10 00:00:27,320 --> 00:00:28,480 development. 11 00:00:28,480 --> 00:00:32,270 He came from California just like this today and 12 00:00:32,270 --> 00:00:33,560 [OBSCURED]. 13 00:00:33,560 --> 00:00:34,548 MIKE ACTON: Yeah, it's really cool. 14 00:00:34,548 --> 00:00:36,580 I'll tell you. 15 00:00:36,580 --> 00:00:40,790 PROFESSOR: He's going to talk about what it's really like to 16 00:00:40,790 --> 00:00:44,880 use cell and PS3 and what it's like to program games. 17 00:00:44,880 --> 00:00:47,980 So I think it's going to be a fun lecture. 18 00:00:47,980 --> 00:00:52,130 MIKE ACTON: All right, so anyway, I'm the Engine 19 00:00:52,130 --> 00:00:54,430 Director at Insomniac Games. 20 00:00:54,430 --> 00:00:58,890 I've only recently taken that position, previously I was 21 00:00:58,890 --> 00:01:02,530 working on PS3 technology at Highmoon studios, which is 22 00:01:02,530 --> 00:01:06,100 with vin studios. 23 00:01:06,100 --> 00:01:08,350 And I've worked at Sony. 24 00:01:08,350 --> 00:01:18,550 I've worked for Titus, and Bluesky Studios in San Diego. 25 00:01:18,550 --> 00:01:23,670 And I've been doing game development, 11, 12 years. 26 00:01:23,670 --> 00:01:26,750 Before that I was working in simulation. 27 00:01:26,750 --> 00:01:33,200 So, the PlayStation 3 is a really fun platform. 
28 00:01:33,200 --> 00:01:36,900 And I know you guys have been working on cell development. 29 00:01:36,900 --> 00:01:39,220 Working with the PS3 under Linux. 30 00:01:39,220 --> 00:01:41,750 Working as developers for the PS3 is definitely a different 31 00:01:41,750 --> 00:01:44,340 environment from that. 32 00:01:44,340 --> 00:01:46,430 I think I'm going to concentrate more on the 33 00:01:46,430 --> 00:01:50,850 high-level aspects of how you design a game for the cell. 34 00:01:50,850 --> 00:01:54,360 And how the cell would impact the design, and what are the 35 00:01:54,360 --> 00:01:56,960 elements of the game. 36 00:01:56,960 --> 00:02:00,800 Just stuff that you probably haven't had as part of this 37 00:02:00,800 --> 00:02:02,950 course that you might find interesting. 38 00:02:02,950 --> 00:02:05,530 And you can feel free to interrupt me at any time with 39 00:02:05,530 --> 00:02:07,640 questions or whatever you'd like. 40 00:02:12,820 --> 00:02:16,130 So just, I wanted to go over, briefly, some of the different 41 00:02:16,130 --> 00:02:19,370 types of game development and what the trade-offs for each 42 00:02:19,370 --> 00:02:20,700 one of them are. 43 00:02:20,700 --> 00:02:24,380 Casual games, console games, PC games, blah, blah, blah. 44 00:02:24,380 --> 00:02:27,763 Casual games, basically, are the small, simple games that 45 00:02:27,763 --> 00:02:30,485 you would download on the PC, or you would 46 00:02:30,485 --> 00:02:31,750 see on Yahoo or whatever. 47 00:02:31,750 --> 00:02:34,150 And those generally don't have really strict performance 48 00:02:34,150 --> 00:02:34,970 requirements. 49 00:02:34,970 --> 00:02:38,780 Where a console game, we have this particular advantage of 50 00:02:38,780 --> 00:02:41,005 knowing the hardware and the hardware doesn't change for an 51 00:02:41,005 --> 00:02:41,870 entire cycle. 52 00:02:41,870 --> 00:02:44,100 So for five, six years, we have 53 00:02:44,100 --> 00:02:46,320 exactly the same hardware. 
54 00:02:46,320 --> 00:02:48,750 And that's definitely an advantage from a performance 55 00:02:48,750 --> 00:02:50,350 point of view, anyway. 56 00:02:50,350 --> 00:02:52,250 In this case, it's PlayStation 3. 57 00:02:57,240 --> 00:02:58,750 As far as development priorities, development 58 00:02:58,750 --> 00:03:01,610 priorities for a console game-- and especially a PS3 59 00:03:01,610 --> 00:03:03,810 game-- would be completely different than you 60 00:03:03,810 --> 00:03:07,480 might find on another kind of application. 61 00:03:07,480 --> 00:03:11,890 We don't really consider the code itself important at all. 62 00:03:11,890 --> 00:03:13,790 The real value is in the programmers. 63 00:03:13,790 --> 00:03:15,950 The real value is in the experience, 64 00:03:15,950 --> 00:03:18,600 and is in those skills. 65 00:03:18,600 --> 00:03:20,690 Code is disposable. 66 00:03:20,690 --> 00:03:24,020 After six years, when we start a new platform we pretty much 67 00:03:24,020 --> 00:03:27,780 have to rewrite it anyway, so there's not much point in 68 00:03:27,780 --> 00:03:31,190 trying to plan for a long life span of code. 69 00:03:31,190 --> 00:03:33,090 Especially when you have optimized code written in 70 00:03:33,090 --> 00:03:34,920 assembly for a particular platform. 71 00:03:38,500 --> 00:03:42,646 And to that end, the data is way more significant to the 72 00:03:42,646 --> 00:03:43,850 performance than the code, anyway. 73 00:03:43,850 --> 00:03:47,500 And the data is specific to a particular game. 74 00:03:47,500 --> 00:03:50,450 Or specific to a particular type of game. 75 00:03:50,450 --> 00:03:56,280 And certainly specific to a studio's pipeline. 76 00:03:56,280 --> 00:03:58,670 And it's the design of the data where you really want to 77 00:03:58,670 --> 00:04:00,870 spend your time concentrating, especially for the PS3.
78 00:04:03,430 --> 00:04:05,795 Ease of programming-- whether or not it's easier to do 79 00:04:05,795 --> 00:04:08,150 parallelism is not a major concern at all. 80 00:04:08,150 --> 00:04:09,680 If it's hard, so what? 81 00:04:09,680 --> 00:04:10,680 You do it. 82 00:04:10,680 --> 00:04:12,420 That's it. 83 00:04:12,420 --> 00:04:14,580 Portability, runs on PlayStation 3, doesn't run 84 00:04:14,580 --> 00:04:15,280 anywhere else. 85 00:04:15,280 --> 00:04:17,700 That's a non-concern. 86 00:04:17,700 --> 00:04:19,310 And everything is about performance. 87 00:04:19,310 --> 00:04:20,440 Everything we do. 88 00:04:20,440 --> 00:04:23,330 A vast majority of our code is either hand-optimized C, or 89 00:04:23,330 --> 00:04:26,380 assembly, very little high level code. 90 00:04:26,380 --> 00:04:28,220 Some of our gameplay programmers will write C plus 91 00:04:28,220 --> 00:04:33,750 plus for the high level logic, but as a general rule, most of the 92 00:04:33,750 --> 00:04:36,700 code that's running most of the time is definitely optimized. 93 00:04:36,700 --> 00:04:38,950 Yeah? 94 00:04:38,950 --> 00:04:43,060 AUDIENCE: If programming is a non-priority, does that mean 95 00:04:43,060 --> 00:04:45,060 to say that if you're developing more than one 96 00:04:45,060 --> 00:04:47,447 product or game, they don't share any common 97 00:04:47,447 --> 00:04:48,240 infrastructure or need? 98 00:04:48,240 --> 00:04:49,300 MIKE ACTON: No, that's not necessarily true. 99 00:04:49,300 --> 00:04:53,440 If we have games that share similar needs, they can 100 00:04:53,440 --> 00:04:55,140 definitely share similar code. 101 00:04:55,140 --> 00:04:58,850 I mean, the point I'm trying to make is, let's say in order 102 00:04:58,850 --> 00:05:02,640 to make something fast it has to be complicated. 103 00:05:02,640 --> 00:05:05,990 So be it, it's complicated.
104 00:05:05,990 --> 00:05:09,710 Whether or not it's easy to use for another programmer is 105 00:05:09,710 --> 00:05:11,260 not a major concern. 106 00:05:11,260 --> 00:05:14,060 AUDIENCE: So you wish it was easier? 107 00:05:14,060 --> 00:05:14,540 MIKE ACTON: No. 108 00:05:14,540 --> 00:05:16,600 I don't care. 109 00:05:16,600 --> 00:05:18,780 That's my point. 110 00:05:18,780 --> 00:05:20,259 AUDIENCE: Well, it's not as important as performance, but 111 00:05:20,259 --> 00:05:22,360 if someone came to you with a high performance tool, you 112 00:05:22,360 --> 00:05:23,330 would like to use it? 113 00:05:23,330 --> 00:05:24,580 MIKE ACTON: I doubt they could. 114 00:05:26,840 --> 00:05:29,460 The highest performance tool that exists is the brains of 115 00:05:29,460 --> 00:05:31,150 the programmers on our team. 116 00:05:31,150 --> 00:05:33,220 You cannot create-- 117 00:05:33,220 --> 00:05:35,040 it's theoretically impossible. 118 00:05:35,040 --> 00:05:39,570 You cannot outperform people who are customizing for the 119 00:05:39,570 --> 00:05:42,520 data, for the context, for the game. 120 00:05:42,520 --> 00:05:46,730 It is not even remotely theoretically possible. 121 00:05:46,730 --> 00:05:48,620 AUDIENCE: That didn't come out in assembly programming for 122 00:05:48,620 --> 00:05:53,550 general purpose but we'll take this offline? 123 00:05:53,550 --> 00:05:54,790 And there was a day when that was also true for general 124 00:05:54,790 --> 00:05:57,950 preferred cleary at the time, but it's no longer true. 125 00:05:57,950 --> 00:05:58,120 MIKE ACTON: It is absolutely-- 126 00:05:58,120 --> 00:05:59,050 AUDIENCE: So the average person prefers to go on -- 127 00:05:59,050 --> 00:06:00,300 take it offline. 128 00:06:01,890 --> 00:06:02,800 MIKE ACTON: Average person. 129 00:06:02,800 --> 00:06:03,860 We're not the average people. 130 00:06:03,860 --> 00:06:05,720 We're game programmers. 131 00:06:05,720 --> 00:06:06,030 Yeah?
132 00:06:06,030 --> 00:06:08,580 AUDIENCE: So does cost ever become an issue? 133 00:06:08,580 --> 00:06:08,870 I mean-- 134 00:06:08,870 --> 00:06:10,640 MIKE ACTON: Absolutely, cost does become an issue. 135 00:06:10,640 --> 00:06:15,210 At a certain point, something is so difficult that you 136 00:06:15,210 --> 00:06:18,350 either have to throw up your hands or you 137 00:06:18,350 --> 00:06:19,060 can't finish in time. 138 00:06:19,060 --> 00:06:20,900 AUDIENCE: Do you ever hit that point? 139 00:06:20,900 --> 00:06:22,720 MIKE ACTON: Or you figure out a new way of doing it. 140 00:06:22,720 --> 00:06:24,120 Or do a little bit less. 141 00:06:24,120 --> 00:06:25,350 I mean we do have to prioritize 142 00:06:25,350 --> 00:06:27,060 what we want to do. 143 00:06:27,060 --> 00:06:28,750 At the end of the day you can't do everything you want 144 00:06:28,750 --> 00:06:30,990 to do, and you have another game you need to ship 145 00:06:30,990 --> 00:06:32,520 eventually, anyway. 146 00:06:32,520 --> 00:06:34,550 So, a lot of times you do end up tabling things. 147 00:06:34,550 --> 00:06:38,300 And say, look we can get 50% more performance out of this, 148 00:06:38,300 --> 00:06:40,650 but we're going to have to table that for now and scale 149 00:06:40,650 --> 00:06:42,190 back on the content. 150 00:06:42,190 --> 00:06:44,340 And that's why you have six years of development. 151 00:06:44,340 --> 00:06:47,270 You know, maybe in the next cycle, in the next game, 152 00:06:47,270 --> 00:06:48,520 you'll be able to squeeze out a little bit more. 153 00:06:48,520 --> 00:06:50,370 And the next one you squeeze out a little bit more. 154 00:06:50,370 --> 00:06:52,560 That's sort of this continuous development, and continuous 155 00:06:52,560 --> 00:06:57,740 optimization over the course of a platform. 156 00:06:57,740 --> 00:07:00,750 And sometimes, yeah, I mean occasionally you just say 157 00:07:00,750 --> 00:07:03,720 yeah, we can't do it or whatever, it doesn't work.
158 00:07:03,720 --> 00:07:07,290 I mean, that's part and parcel of development in general. 159 00:07:07,290 --> 00:07:08,900 Some ideas just don't pan out. 160 00:07:11,890 --> 00:07:12,200 But-- 161 00:07:12,200 --> 00:07:14,961 AUDIENCE: Have you ever come into a situation where 162 00:07:14,961 --> 00:07:17,283 programming complexity just kills a project? 163 00:07:17,283 --> 00:07:20,137 Like Microsoft has had a few times, like they couldn't put 164 00:07:20,137 --> 00:07:20,620 out [OBSCURED]. 165 00:07:20,620 --> 00:07:22,930 Couldn't release for-- 166 00:07:22,930 --> 00:07:24,160 MIKE ACTON: Sure, there's plenty of studios where the 167 00:07:24,160 --> 00:07:26,090 programming complexity has killed the studio, or killed 168 00:07:26,090 --> 00:07:27,180 the project. 169 00:07:27,180 --> 00:07:29,870 But I find it hard to believe-- or it's very 170 00:07:29,870 --> 00:07:32,730 rarely-- because it's complexity that has to do 171 00:07:32,730 --> 00:07:34,860 specifically with optimization. 172 00:07:34,860 --> 00:07:38,160 That complexity usually has to do with unnecessary 173 00:07:38,160 --> 00:07:38,740 complexity. 174 00:07:38,740 --> 00:07:41,410 Complexity that doesn't achieve anything. 175 00:07:41,410 --> 00:07:43,500 Organization for the sake of organization. 176 00:07:43,500 --> 00:07:46,740 So you have these sort of over-designed C plus plus 177 00:07:46,740 --> 00:07:51,870 hierarchies just for the sake of over-organizing things. 178 00:07:51,870 --> 00:07:53,650 That's what will generally kill a project. 179 00:07:53,650 --> 00:07:57,760 But in performance, the complexity tends to come from 180 00:07:57,760 --> 00:08:00,010 the rule set-- what you need to do to set it up. 181 00:08:00,010 --> 00:08:03,700 But the code tends to be smaller when it's faster. 182 00:08:03,700 --> 00:08:05,810 You tend to be doing one thing and doing one 183 00:08:05,810 --> 00:08:07,390 thing really well.
184 00:08:07,390 --> 00:08:08,980 So it doesn't tend to get out of hand. 185 00:08:08,980 --> 00:08:12,750 I mean, it occasionally happens but, yeah? 186 00:08:12,750 --> 00:08:16,395 AUDIENCE: So in terms of the overall cost, how big is this 187 00:08:16,395 --> 00:08:18,020 programming versus the other aspect of 188 00:08:18,020 --> 00:08:19,040 coming up with the game? 189 00:08:19,040 --> 00:08:23,220 Like the game design, the graphics-- 190 00:08:23,220 --> 00:08:26,520 AUDIENCE: So, for example, do you have-- 191 00:08:26,520 --> 00:08:27,990 MIKE ACTON: OK, development team? 192 00:08:30,830 --> 00:08:31,450 So-- 193 00:08:31,450 --> 00:08:33,612 AUDIENCE: So how many programmers, how many 194 00:08:33,612 --> 00:08:34,020 artists, how many-- 195 00:08:34,020 --> 00:08:37,620 PROFESSOR: Maybe, let's-- so for example, like, now it's 196 00:08:37,620 --> 00:08:39,720 like, what, $20 million to deliver a PS3 game? 197 00:08:39,720 --> 00:08:42,476 MIKE ACTON: Between $10 and $20 million, yeah. 198 00:08:42,476 --> 00:08:44,800 PROFESSOR: So let's develop [OBSCURED] 199 00:08:44,800 --> 00:08:48,630 MIKE ACTON: So artists are by far the most-- 200 00:08:48,630 --> 00:08:51,550 the largest group of developers. 201 00:08:51,550 --> 00:08:53,680 So you have animators and shader artists, and texture 202 00:08:53,680 --> 00:08:55,000 artists, and modelers, and environmental 203 00:08:55,000 --> 00:08:57,110 artists, and lighters. 204 00:08:57,110 --> 00:09:02,630 And so they'll often outnumber programmers 2:1. 205 00:09:02,630 --> 00:09:05,580 Which is completely different than-- certainly very 206 00:09:05,580 --> 00:09:09,480 different from PlayStation and the gap is much larger than it 207 00:09:09,480 --> 00:09:11,750 was on PlayStation 2.
208 00:09:11,750 --> 00:09:16,660 With programmers you tend to have a fairly even split or 209 00:09:16,660 --> 00:09:18,570 you tend to have a divide between the high level game 210 00:09:18,570 --> 00:09:21,450 play programmers and the low level engine programmers. 211 00:09:21,450 --> 00:09:23,700 And you will tend to have more game play programmers than 212 00:09:23,700 --> 00:09:26,390 engine programmers, although most-- the majority of the CPU 213 00:09:26,390 --> 00:09:29,880 time is spent in the engine code. 214 00:09:29,880 --> 00:09:35,030 And that partially comes down to education and experience. 215 00:09:35,030 --> 00:09:39,470 In order to get high performance code you need to 216 00:09:39,470 --> 00:09:40,690 have that experience. 217 00:09:40,690 --> 00:09:41,970 You need to know how to optimize. 218 00:09:41,970 --> 00:09:43,420 You need to understand the machine. 219 00:09:43,420 --> 00:09:44,840 You need to understand the architecture and you need to 220 00:09:44,840 --> 00:09:46,990 understand the data. 221 00:09:46,990 --> 00:09:50,050 And there's only so many people that can do that on any 222 00:09:50,050 --> 00:09:50,760 particular team. 223 00:09:50,760 --> 00:09:56,401 AUDIENCE: Code size wise, how is the code size divided 224 00:09:56,401 --> 00:09:59,120 between game playing and AI, special effects? 225 00:09:59,120 --> 00:10:01,410 MIKE ACTON: Just like, the amount of code? 226 00:10:01,410 --> 00:10:03,870 AUDIENCE: Yeah, it should be small I guess. 227 00:10:03,870 --> 00:10:05,050 MIKE ACTON: Yeah, I mean, it's hard to say. 228 00:10:05,050 --> 00:10:07,420 I mean, because it depends on how many 229 00:10:07,420 --> 00:10:08,720 features you're using. 230 00:10:08,720 --> 00:10:11,710 And, you know sort of the scope of the engine is how 231 00:10:11,710 --> 00:10:13,780 much is being used for a particular game, especially if 232 00:10:13,780 --> 00:10:18,570 you're targeting multiple games within a studio.
233 00:10:18,570 --> 00:10:20,460 But quite often-- interestingly enough-- the 234 00:10:20,460 --> 00:10:23,210 game play code actually overwhelms the engine code in 235 00:10:23,210 --> 00:10:27,590 terms of size and that is back to basically what I was saying 236 00:10:27,590 --> 00:10:31,200 that the engine code tends to do one thing really well or a 237 00:10:31,200 --> 00:10:32,240 series of things really well. 238 00:10:32,240 --> 00:10:34,790 AUDIENCE: Game play code also C plus plus? 239 00:10:34,790 --> 00:10:36,745 MIKE ACTON: These days it's much more likely that game 240 00:10:36,745 --> 00:10:40,220 play code is C plus plus in the high level and kills 241 00:10:40,220 --> 00:10:45,290 performance and doesn't think about things like cache. 242 00:10:47,870 --> 00:10:52,060 That's actually part of the problem with PlayStation 3 243 00:10:52,060 --> 00:10:52,540 development. 244 00:10:52,540 --> 00:10:55,210 It was part of the challenge that we've had with 245 00:10:55,210 --> 00:10:56,790 PlayStation 3 development. 246 00:10:56,790 --> 00:10:59,970 In the past, certainly with PlayStation 2 and definitely 247 00:10:59,970 --> 00:11:03,850 on any previous console, this divide between game play and 248 00:11:03,850 --> 00:11:05,600 engine worked very well. 249 00:11:08,120 --> 00:11:10,400 The game play programmers could just call a function and 250 00:11:10,400 --> 00:11:13,200 it did its thing really fast and it came back and they 251 00:11:13,200 --> 00:11:17,550 continued. In a serial program on one processor that 252 00:11:17,550 --> 00:11:19,390 model works very well.
253 00:11:19,390 --> 00:11:25,270 But now when the high level design can destroy performance 254 00:11:25,270 --> 00:11:27,650 but through the simplest decision, like for example, in 255 00:11:27,650 --> 00:11:33,420 collision detection if the logic assumes that the result 256 00:11:33,420 --> 00:11:36,900 is immediately available there's virtually no way of 257 00:11:36,900 --> 00:11:39,440 making that fast. So the high-level design has to 258 00:11:39,440 --> 00:11:43,010 conform to the hardware. 259 00:11:43,010 --> 00:11:45,660 That's sort of a challenge now, is introducing those 260 00:11:45,660 --> 00:11:48,440 concepts to the high-level programmer who haven't 261 00:11:48,440 --> 00:11:49,690 traditionally had to deal with it. 262 00:11:52,050 --> 00:11:56,430 Does that answer that question as far as the split? 263 00:11:56,430 --> 00:11:58,610 AUDIENCE: You said 2:1, right? 264 00:11:58,610 --> 00:12:02,830 MIKE ACTON: Approximately 2:1, artist to programmers. 265 00:12:02,830 --> 00:12:07,390 It varies studio to studio and team to team, so it's hard to 266 00:12:07,390 --> 00:12:08,660 say in the industry as a whole. 267 00:12:16,900 --> 00:12:20,230 So back basically to the point of the code 268 00:12:20,230 --> 00:12:21,610 isn't really important. 269 00:12:21,610 --> 00:12:26,250 The code itself doesn't have a lot of value. 270 00:12:26,250 --> 00:12:27,960 There are fundamental things that affect how you would 271 00:12:27,960 --> 00:12:29,380 design it in the first place. 272 00:12:29,380 --> 00:12:32,260 The type of game, the kind of engine that would run a racing 273 00:12:32,260 --> 00:12:34,320 game is completely different than the kind of engine that 274 00:12:34,320 --> 00:12:36,430 would run a first person shooter. 
275 00:12:36,430 --> 00:12:39,240 The needs are different, the optimizations are totally 276 00:12:39,240 --> 00:12:42,670 different, the data is totally different, so you wouldn't try 277 00:12:42,670 --> 00:12:44,940 to reuse code from one to the other. 278 00:12:44,940 --> 00:12:47,350 It just either wouldn't work or would work 279 00:12:47,350 --> 00:12:49,020 really, really poorly. 280 00:12:49,020 --> 00:12:50,470 The framerate-- 281 00:12:50,470 --> 00:12:52,840 having a target of 30 frames per second is a much different 282 00:12:52,840 --> 00:12:55,450 problem than having a target of 60 frames per second. 283 00:12:55,450 --> 00:12:58,710 And in the NTSC territories those are pretty much your 284 00:12:58,710 --> 00:13:02,830 only two choices-- 30 frames or 60, which means everything has 285 00:13:02,830 --> 00:13:06,430 to be done in 16 and 2/3 milliseconds. 286 00:13:06,430 --> 00:13:07,510 That's it, that's what you have-- 287 00:13:07,510 --> 00:13:12,220 or 33 and 1/3 milliseconds. 288 00:13:12,220 --> 00:13:16,830 Of course, back to schedule and cost, how much? 289 00:13:16,830 --> 00:13:19,580 You know, do you have a two year cycle, one year cycle, 290 00:13:19,580 --> 00:13:21,820 how much can you get done? 291 00:13:21,820 --> 00:13:23,200 The kind of hardware. 292 00:13:23,200 --> 00:13:26,620 So taking for example, an engine from PlayStation 2 and 293 00:13:26,620 --> 00:13:29,990 trying to move it to PlayStation 3 is sort of a 294 00:13:29,990 --> 00:13:31,240 lost cause. 295 00:13:36,730 --> 00:13:38,870 The kind of optimizations that you would do, the kind of 296 00:13:38,870 --> 00:13:43,400 parallelization you would do is so completely different, 297 00:13:43,400 --> 00:13:46,750 although there was parallelization in PlayStation 298 00:13:46,750 --> 00:13:49,350 2, the choices would have been completely different.
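The frame-rate targets mentioned above translate directly into a per-frame time budget. The following is a minimal sketch of that arithmetic; the helper names `frame_budget_ms` and `fits_in_frame` are hypothetical and not from the talk, and the per-system costs are whatever a caller has measured.

```cpp
#include <cstddef>

// Time available per frame, in milliseconds, for a given target frame rate.
// frame_budget_ms is a hypothetical helper name, not something from the talk.
constexpr double frame_budget_ms(double frames_per_second) {
    return 1000.0 / frames_per_second;
}

// Returns true when the summed per-system costs fit inside one frame.
// The cost values are supplied by the caller; none of them come from the talk.
inline bool fits_in_frame(const double* system_costs_ms, std::size_t count,
                          double frames_per_second) {
    double total_ms = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total_ms += system_costs_ms[i];
    }
    return total_ms <= frame_budget_ms(frames_per_second);
}
```

At 60 frames per second the budget is 1000 / 60, or 16 and 2/3 milliseconds; at 30 frames per second it is 33 and 1/3 milliseconds, so a set of systems that fits a 30 Hz frame may not fit a 60 Hz one.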
299 00:13:53,960 --> 00:13:57,510 The loss from trying to port it is much, much greater than 300 00:13:57,510 --> 00:13:59,560 the cost of just doing it again. 301 00:13:59,560 --> 00:14:06,290 AUDIENCE: [OBSCURED] 302 00:14:06,290 --> 00:14:07,970 MIKE ACTON: I don't know that there's an average. 303 00:14:07,970 --> 00:14:11,970 I mean, if you wanted to just like homogenize the industry, 304 00:14:11,970 --> 00:14:17,390 it's probably 18 months. 305 00:14:17,390 --> 00:14:20,450 The compiler actually makes a huge, significant difference 306 00:14:20,450 --> 00:14:25,040 in how you design your code. 307 00:14:25,040 --> 00:14:27,690 If you're working with GCC and you have programmers who have 308 00:14:27,690 --> 00:14:31,900 been working with GCC for 15 years and who understand the 309 00:14:31,900 --> 00:14:37,020 intricacies and issues involved in GCC, the kind of 310 00:14:37,020 --> 00:14:38,660 code you would write would be completely different than if 311 00:14:38,660 --> 00:14:42,440 you were using XLC for example, on the cell. 312 00:14:46,670 --> 00:14:48,130 There are studios-- 313 00:14:48,130 --> 00:14:50,240 Insomniac doesn't, but there are other studios who do cross 314 00:14:50,240 --> 00:14:50,940 platform design. 315 00:14:50,940 --> 00:14:55,480 So for example, write PlayStation 3 games and Xbox 316 00:14:55,480 --> 00:14:59,260 360 games and/or PC titles. 317 00:14:59,260 --> 00:15:03,460 At the moment, probably the easiest approach for that is 318 00:15:03,460 --> 00:15:07,920 to target the PlayStation 3. 319 00:15:07,920 --> 00:15:11,660 So you have these sort of SPU-friendly chunks of processing, 320 00:15:11,660 --> 00:15:15,540 SPU-friendly chunks of data and move those onto 321 00:15:15,540 --> 00:15:18,910 homogeneous parallel processors.
322 00:15:18,910 --> 00:15:21,300 It's not the perfect solution, but virtually all cross 323 00:15:21,300 --> 00:15:23,150 platform titles are not looking for the perfect 324 00:15:23,150 --> 00:15:26,020 solution anyway because they cannot fully optimize for any 325 00:15:26,020 --> 00:15:27,270 particular platform. 326 00:15:32,370 --> 00:15:33,250 I wanted to go through-- 327 00:15:33,250 --> 00:15:38,130 these are a basic list of some of the major modules that a 328 00:15:38,130 --> 00:15:39,610 game is made out of. 329 00:15:42,510 --> 00:15:45,950 I'll go through some of these and explain how designing on 330 00:15:45,950 --> 00:15:47,680 the cell impacts the system. 331 00:15:47,680 --> 00:15:52,260 I'm not going to bother reading them. 332 00:15:52,260 --> 00:15:53,700 I assume you all can read. 333 00:15:57,950 --> 00:16:00,620 So yeah, I'm going to go over the major system, a few of the 334 00:16:00,620 --> 00:16:02,450 major systems and then we're going to drive a little bit 335 00:16:02,450 --> 00:16:05,790 into a specific system, in this case an animation system. 336 00:16:05,790 --> 00:16:11,400 And just talk it through, basically you see how each of 337 00:16:11,400 --> 00:16:14,090 these steps are affected by the hardware that we're 338 00:16:14,090 --> 00:16:17,470 running on it. 339 00:16:17,470 --> 00:16:20,060 So just to start with when you're designing a structure, 340 00:16:20,060 --> 00:16:23,430 any structure, anywhere-- 341 00:16:23,430 --> 00:16:30,460 the initial structure is affected by the kind of 342 00:16:30,460 --> 00:16:32,370 hardware that you're running. 
343 00:16:32,370 --> 00:16:37,080 And in this particular case on the SPU-- and there are other 344 00:16:37,080 --> 00:16:40,660 processors where this is equally true-- the 345 00:16:40,660 --> 00:16:42,790 conventional structure, where you say struct or class or 346 00:16:42,790 --> 00:16:46,660 whatever and you have domain-constrained structures, 347 00:16:46,660 --> 00:16:51,180 is of surprisingly little use. 348 00:16:51,180 --> 00:16:56,395 In general, the data is either compressed or is in a stream or 349 00:16:56,395 --> 00:16:58,500 is in blocks. 350 00:16:58,500 --> 00:17:01,910 It's sort of based on type, which means that there's no 351 00:17:01,910 --> 00:17:06,360 fixed size struct that you could define anyway. 352 00:17:06,360 --> 00:17:09,080 So as a general rule, the structure of the data is 353 00:17:09,080 --> 00:17:11,130 defined within the code as opposed 354 00:17:11,130 --> 00:17:13,860 to in a struct somewhere. 355 00:17:13,860 --> 00:17:17,540 And that's really to get the performance from the data, you 356 00:17:17,540 --> 00:17:20,590 group things of similar type together rather than for 357 00:17:20,590 --> 00:17:24,085 example, on SPU, having flags that say this is of type A and 358 00:17:24,085 --> 00:17:28,710 this is of type B. Any flag implies a branch, which is-- 359 00:17:28,710 --> 00:17:31,320 I'm sure you all know at this point-- is really poor 360 00:17:31,320 --> 00:17:33,330 performing on SPU. 361 00:17:33,330 --> 00:17:37,890 So basically, pull flags out, re-sort everything and then 362 00:17:37,890 --> 00:17:40,680 move things in streams. And all of these types are going 363 00:17:40,680 --> 00:17:43,570 to be of varying sizes. 364 00:17:43,570 --> 00:17:46,620 In which case there's very little point to define those 365 00:17:46,620 --> 00:17:48,570 structures in the first place because you can't change them.
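The idea of pulling flags out and re-sorting data into per-type streams can be sketched roughly as follows. This is an illustrative sketch in portable C++, not SPU code and not the speaker's implementation; the names `Kind`, `Tagged`, `split_by_kind`, `apply_scale` and `apply_offset` are all made up for the example.

```cpp
#include <cstddef>

// Hypothetical tagged record: the kind of layout the talk argues against
// processing directly, because every element needs a type test.
enum class Kind { Scale, Offset };
struct Tagged { Kind kind; float value; };

struct Counts { std::size_t scale; std::size_t offset; };

// One sorting pass (ideally done offline or at load time) that moves each
// element into the stream for its type. Both output buffers must have room
// for n elements.
inline Counts split_by_kind(const Tagged* in, std::size_t n,
                            float* out_scale, float* out_offset) {
    Counts c{0, 0};
    for (std::size_t i = 0; i < n; ++i) {
        if (in[i].kind == Kind::Scale) {
            out_scale[c.scale++] = in[i].value;
        } else {
            out_offset[c.offset++] = in[i].value;
        }
    }
    return c;
}

// The hot loops then run over one homogeneous stream each, with no
// per-element flag check inside the loop body.
inline void apply_scale(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= k;
}

inline void apply_offset(float* data, std::size_t n, float d) {
    for (std::size_t i = 0; i < n; ++i) data[i] += d;
}
```

The branch on the type flag is paid once during the split instead of on every element in every frame, which is the trade the talk is describing.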
366 00:17:52,550 --> 00:17:55,000 And the fact that you're accessing data 367 00:17:55,000 --> 00:17:56,310 in quadwords anyway. 368 00:17:56,310 --> 00:17:58,530 You're always either loading and storing in quadwords, not 369 00:17:58,530 --> 00:18:01,940 on scalars, so having scalar fields in a structure is sort 370 00:18:01,940 --> 00:18:04,150 of pointless. 371 00:18:04,150 --> 00:18:06,300 So again, on the SPU, generally speaking, 372 00:18:06,300 --> 00:18:08,220 structures aren't of much use. 373 00:18:14,690 --> 00:18:17,020 When you go to define structures in general you need 374 00:18:17,020 --> 00:18:24,910 to consider things like the cache, the TLB, how that's 375 00:18:24,910 --> 00:18:27,380 going to affect your reading out of the structure or 376 00:18:27,380 --> 00:18:29,670 writing to the structure. 377 00:18:29,670 --> 00:18:32,940 More to the point, you cannot just assume that if 378 00:18:32,940 --> 00:18:36,700 you've written some data definition that you can port 379 00:18:36,700 --> 00:18:38,340 it to another platform. 380 00:18:38,340 --> 00:18:40,490 It's very easy to be poorly 381 00:18:40,490 --> 00:18:43,300 performing platform to platform. 382 00:18:43,300 --> 00:18:45,980 In this case, when we design structures you have to 383 00:18:45,980 --> 00:18:48,760 consider the fundamental units of the cell. 384 00:18:48,760 --> 00:18:52,960 The cache line is a fundamental unit of the cell. 385 00:18:52,960 --> 00:18:55,560 Basically, you want to define things in terms of 386 00:18:55,560 --> 00:18:58,690 128 bytes wide. 387 00:18:58,690 --> 00:19:01,830 What can you fit in there, because if you read one you read 388 00:19:01,830 --> 00:19:06,500 them all, so you want to pack as much as possible into 128 389 00:19:06,500 --> 00:19:11,290 bytes and just deal with that as a fundamental unit. 390 00:19:11,290 --> 00:19:14,270 16 bytes, of course, you're doing load and stores through 391 00:19:14,270 --> 00:19:17,300 quadword load and store.
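The two sizes named above, the 16-byte quadword and the 128-byte cache line, can be written down as alignment constraints. This is only a sketch in standard C++ under the assumption that the compiler honors `alignas(128)`; the type names `Quadword` and `CacheLineBlock` and the helper `blocks_needed` are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kQuadwordBytes = 16;    // SPU load/store granularity
constexpr std::size_t kCacheLineBytes = 128;  // cache line size cited in the talk

// One quadword: four 32-bit lanes, always moved as a whole.
struct alignas(kQuadwordBytes) Quadword {
    float lane[4];
};

// One cache line's worth of quadwords, treated as the unit of data design.
struct alignas(kCacheLineBytes) CacheLineBlock {
    Quadword q[kCacheLineBytes / kQuadwordBytes];
};

static_assert(sizeof(Quadword) == 16, "a quadword is 16 bytes");
static_assert(sizeof(CacheLineBlock) == 128, "a block fills one cache line");
static_assert(alignof(CacheLineBlock) == 128, "a block starts on a cache line");

// Number of whole cache-line blocks needed to hold a payload of `bytes`.
constexpr std::size_t blocks_needed(std::size_t bytes) {
    return (bytes + kCacheLineBytes - 1) / kCacheLineBytes;
}
```

Rounding up to whole blocks is where the small amount of wasted space at the ends of streams comes from.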
392 00:19:17,300 --> 00:19:20,100 So you don't want to have little scalar bits in there 393 00:19:20,100 --> 00:19:21,080 that you're shuffling around. 394 00:19:21,080 --> 00:19:23,390 Just deal with it as a quadword. 395 00:19:23,390 --> 00:19:26,280 And don't deal with anything smaller than that. 396 00:19:26,280 --> 00:19:29,190 So basically the minimum working sizes, in practice, 397 00:19:29,190 --> 00:19:33,460 would be 4 by 128 bits wide and you can split that up 398 00:19:33,460 --> 00:19:36,030 regularly however you want. 399 00:19:36,030 --> 00:19:41,140 So to that point I think-- 400 00:19:41,140 --> 00:19:43,470 here's an example-- 401 00:19:43,470 --> 00:19:45,070 I want to talk about a vector class. 402 00:19:45,070 --> 00:19:49,400 Vector class is usually the first thing a programmer will 403 00:19:49,400 --> 00:19:54,050 jump onto when they might want to make something for games. 404 00:19:54,050 --> 00:19:57,670 But in real life, it's probably the most useless 405 00:19:57,670 --> 00:19:58,920 thing you could ever write. 406 00:20:01,420 --> 00:20:04,150 It doesn't actually do anything. 407 00:20:04,150 --> 00:20:07,170 We have these, we know the instruction set, it's already 408 00:20:07,170 --> 00:20:07,950 in quadwords. 409 00:20:07,950 --> 00:20:09,990 We know the loads and stores, we've already designed our 410 00:20:09,990 --> 00:20:12,130 data so it fits properly. 411 00:20:12,130 --> 00:20:14,810 This doesn't give us anything. 412 00:20:14,810 --> 00:20:16,830 And it potentially makes things worse. 413 00:20:20,130 --> 00:20:23,360 Allowing component access to a quadword, especially on the 414 00:20:23,360 --> 00:20:28,250 PPU is ridiculously bad. 415 00:20:28,250 --> 00:20:31,600 In practice, if you allow component access, high-level 416 00:20:31,600 --> 00:20:34,480 programs will use component access.
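One way to read the "4 by 128 bits" working unit is as four quadwords holding the x, y, z and w components of four vectors at once, so that no code ever touches a single component of a single vector. The sketch below assumes that interpretation, which the talk does not spell out; `Vec4x4` and `dot4` are hypothetical names, and the plain loop stands in for what would be a handful of quadword instructions on real hardware.

```cpp
#include <cstddef>

// Four quadwords: lane i of each array belongs to vector i.
// This layout is an assumed reading of the "4 by 128 bit" unit.
struct alignas(16) Vec4x4 {
    float x[4];
    float y[4];
    float z[4];
    float w[4];
};

// Computes four dot products at once, one per lane. There is no get_x() or
// get_y() accessor; every operation works on whole quadwords.
inline void dot4(const Vec4x4& a, const Vec4x4& b, float out[4]) {
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i] +
                 a.z[i] * b.z[i] + a.w[i] * b.w[i];
    }
}
```

Because the unit never exposes a single scalar, higher-level code has no cheap-looking way to fall back to component-at-a-time access.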
417 00:20:34,480 --> 00:20:37,160 So if you have a vector class that says get x, get y, 418 00:20:37,160 --> 00:20:40,660 whatever, somebody somewhere is going to use it, which 419 00:20:40,660 --> 00:20:43,330 means the performance of the whole thing just drops and 420 00:20:43,330 --> 00:20:46,390 it's impossible to optimize. 421 00:20:46,390 --> 00:20:48,990 So as a general rule, you pick your fundamental unit. 422 00:20:48,990 --> 00:20:52,930 In this case, the 4 by 128 bit unit that I was talking about 423 00:20:52,930 --> 00:20:56,150 and you don't define anything smaller than that. 424 00:20:56,150 --> 00:20:58,960 Everything is packed into a unit about that size. 425 00:20:58,960 --> 00:21:03,510 And yes, in practice there'll be some wasted space at the 426 00:21:03,510 --> 00:21:07,950 beginning or end of streams of data, groups of data, but it 427 00:21:07,950 --> 00:21:10,420 doesn't make much difference. 428 00:21:10,420 --> 00:21:12,930 You're going to have that wasted space if you are-- 429 00:21:12,930 --> 00:21:14,840 you're going to have much more than that in wasted space if 430 00:21:14,840 --> 00:21:17,870 you're using dynamic memory, for example, which 431 00:21:17,870 --> 00:21:20,090 when I get to it-- 432 00:21:20,090 --> 00:21:21,340 I don't recommend you use either. 433 00:21:25,930 --> 00:21:27,800 So some things to consider when you're doing this sort of 434 00:21:27,800 --> 00:21:32,020 math transformation anyway is, are you going to do floats, 435 00:21:32,020 --> 00:21:33,690 double, fixed point? 436 00:21:33,690 --> 00:21:34,840 I mean, doubles are right out. 437 00:21:34,840 --> 00:21:36,240 There's no point. 438 00:21:36,240 --> 00:21:39,640 Regardless of the speed on the SPU of a double, there's no 439 00:21:39,640 --> 00:21:43,170 value in it for games.
440 00:21:43,170 --> 00:21:46,900 We have known data, so if we need to we can renormalize a 441 00:21:46,900 --> 00:21:51,140 group around a point and get into the range of a 442 00:21:51,140 --> 00:21:52,100 floating point. 443 00:21:52,100 --> 00:21:53,920 It's a nonissue. 444 00:21:53,920 --> 00:21:56,700 So there's no reason to waste the space in a double at all, 445 00:21:56,700 --> 00:21:59,380 unless it was actually faster, which it isn't. 446 00:21:59,380 --> 00:22:00,630 So we don't use it. 447 00:22:04,440 --> 00:22:06,670 Sort of the only real problematic thing with the SPU 448 00:22:06,670 --> 00:22:09,570 floating point is its format and not supporting 449 00:22:09,570 --> 00:22:12,860 denormalized numbers becomes problematic, but again, you 450 00:22:12,860 --> 00:22:17,680 can work around it by renormalizing your numbers 451 00:22:17,680 --> 00:22:20,760 within a known range so that it won't get to the point 452 00:22:20,760 --> 00:22:23,290 where it needs to denormalize-- 453 00:22:23,290 --> 00:22:25,270 at least for the work that you're actually doing. 454 00:22:29,790 --> 00:22:30,520 Yeah? 455 00:22:30,520 --> 00:22:38,960 AUDIENCE: [OBSCURED] 456 00:22:38,960 --> 00:22:42,210 MIKE ACTON: Every programmer will write their own vector class. 457 00:22:42,210 --> 00:22:45,290 And I'm saying that that's a useless exercise. 458 00:22:45,290 --> 00:22:46,360 Don't bother doing it. 459 00:22:46,360 --> 00:22:47,610 Don't use anybody else's either. 460 00:22:52,730 --> 00:22:55,780 If you're writing for the cell-- 461 00:22:55,780 --> 00:22:58,360 if you're writing in C you have the SI intrinsics. 462 00:22:58,360 --> 00:23:00,330 They're already in quadwords, you can do everything you want 463 00:23:00,330 --> 00:23:04,210 to do and you're not restricted by this sort of 464 00:23:04,210 --> 00:23:06,110 concept of what a vector is.
465 00:23:06,110 --> 00:23:08,290 If you want to deal with, especially on the SPU where 466 00:23:08,290 --> 00:23:12,210 you can freely deal with them as integers or floats or 467 00:23:12,210 --> 00:23:17,280 whatever seamlessly without cost, there's plenty that you 468 00:23:17,280 --> 00:23:18,860 can do with the floating point number if you 469 00:23:18,860 --> 00:23:20,030 treat it as an integer. 470 00:23:20,030 --> 00:23:23,800 And when on either AltiVec or the SPU where you can do that 471 00:23:23,800 --> 00:23:26,870 without cost there's a huge advantage to 472 00:23:26,870 --> 00:23:28,970 just doing it straight. 473 00:23:28,970 --> 00:23:30,680 AUDIENCE: [OBSCURED] 474 00:23:30,680 --> 00:23:32,080 MIKE ACTON: Well, I'm saying write it in assembly. 475 00:23:35,260 --> 00:23:37,720 But if you have to, use the intrinsics. 476 00:23:37,720 --> 00:23:42,190 But certainly don't write a vector class. 477 00:23:42,190 --> 00:23:44,150 So memory management. 478 00:23:44,150 --> 00:23:46,190 Static allocation is always preferred over dynamic. 479 00:23:46,190 --> 00:23:49,390 Basically, general purpose dynamic memory allocation, 480 00:23:49,390 --> 00:23:53,060 malloc, free, whatever, has just absolutely no place in games. 481 00:23:57,540 --> 00:24:00,440 We don't have enough unknowns for that to be valuable. 482 00:24:00,440 --> 00:24:03,530 We can group our data by specific types. 483 00:24:03,530 --> 00:24:07,000 We know basic ranges of those types. 484 00:24:07,000 --> 00:24:09,620 The vast majority of the data is known in advance, it's 485 00:24:09,620 --> 00:24:11,080 actually burned onto the disk. 486 00:24:11,080 --> 00:24:13,730 We can actually analyze that. 487 00:24:13,730 --> 00:24:15,350 So most of our allocations tend to 488 00:24:15,350 --> 00:24:16,590 be calculated in advance.
489 00:24:16,590 --> 00:24:20,900 So you load the level and oftentimes you just load 490 00:24:20,900 --> 00:24:24,895 memory in off the disc into memory and 491 00:24:24,895 --> 00:24:26,145 then fix up the pointers. 492 00:24:29,120 --> 00:24:33,570 For things that change during the runtime, just simple 493 00:24:33,570 --> 00:24:37,020 hierarchical allocators, block allocators where you have 494 00:24:37,020 --> 00:24:42,620 fixed sizes is always the easiest and best way to go. 495 00:24:42,620 --> 00:24:45,490 These are known types of known sizes. 496 00:24:45,490 --> 00:24:50,150 The key to that is to organize your data so that's actually a 497 00:24:50,150 --> 00:24:51,590 workable solution. 498 00:24:51,590 --> 00:24:54,460 So you don't have these sort of classes or structures that 499 00:24:54,460 --> 00:24:55,600 are dynamically sized. 500 00:24:55,600 --> 00:24:58,480 That you group them in terms of things that are similar. 501 00:24:58,480 --> 00:25:03,790 Physics data here and AI data is separately here in a 502 00:25:03,790 --> 00:25:05,190 separate array. 503 00:25:05,190 --> 00:25:10,580 And that way those sort of chunks of data are similarly 504 00:25:10,580 --> 00:25:12,210 sized and can be block allocated without any 505 00:25:12,210 --> 00:25:13,480 fragmentation issues at all. 506 00:25:18,610 --> 00:25:23,590 Eventually you'll probably want to design an allocator. 507 00:25:23,590 --> 00:25:25,910 Things to consider are the page sizes. 508 00:25:25,910 --> 00:25:28,910 That's critically important, you want to work within a page 509 00:25:28,910 --> 00:25:30,560 as much as you possibly can. 510 00:25:30,560 --> 00:25:33,180 So you want to group things, not necessarily the same 511 00:25:33,180 --> 00:25:36,360 things, but the things that will be read together or 512 00:25:36,360 --> 00:25:38,430 written together within the same page. 
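The "load it off the disc and fix up the pointers" step mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not any studio's actual format: `blob_ref`, `level_header`, and `level_fixup` are hypothetical names, and the layout (a count followed by a table of byte offsets) is invented for illustration. The idea is that the disc image stores offsets relative to the start of the blob, and one pass after loading turns them into real addresses in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference slot: holds a byte offset from the start of the
   blob while on disc, and a real pointer after fix-up. */
typedef union {
    uintptr_t offset;
    void     *ptr;
} blob_ref;

/* Hypothetical header at the very start of a loaded level blob. */
typedef struct {
    uint32_t ref_count;   /* number of entries in refs[] */
    uint32_t pad;         /* keeps refs[] 8-byte aligned */
    blob_ref refs[];      /* offsets on disc, pointers after fix-up */
} level_header;

/* Convert every stored offset into an address inside the loaded blob.
   The blob must be suitably aligned for level_header. */
void level_fixup(void *blob, size_t blob_size)
{
    level_header *h = (level_header *)blob;
    for (uint32_t i = 0; i < h->ref_count; ++i) {
        assert(h->refs[i].offset < blob_size);   /* reject corrupt offsets */
        h->refs[i].ptr = (unsigned char *)blob + h->refs[i].offset;
    }
}
```

No allocator is involved at all: the size of the blob is known when the disc is built, so the runtime only needs one contiguous region to load it into.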
513 00:25:38,430 --> 00:25:43,040 So you want to have a concept of the actual page up through 514 00:25:43,040 --> 00:25:44,290 the system. 515 00:25:47,130 --> 00:25:49,180 Probably the most common mistake I see in a block 516 00:25:49,180 --> 00:25:51,510 allocator, so somebody says-- 517 00:25:51,510 --> 00:25:54,000 everybody knows what I mean by block allocator? 518 00:25:54,000 --> 00:25:54,540 Yeah? 519 00:25:54,540 --> 00:25:55,000 OK. 520 00:25:55,000 --> 00:25:57,890 So the most common mistake I see people make is that they 521 00:25:57,890 --> 00:25:59,040 do least recently used. 522 00:25:59,040 --> 00:26:02,780 They just grab the least recently used block and use 523 00:26:02,780 --> 00:26:06,510 that when servicing a request. That's actually pretty much 524 00:26:06,510 --> 00:26:08,930 the worst thing you can possibly do because that's the 525 00:26:08,930 --> 00:26:11,530 most likely thing to be cold. 526 00:26:11,530 --> 00:26:13,850 That's the most likely thing to be out of cache, both out 527 00:26:13,850 --> 00:26:15,630 of L1 and L2. 528 00:26:15,630 --> 00:26:19,340 Just the easiest thing you can do to change that is just use 529 00:26:19,340 --> 00:26:20,310 most recently used. 530 00:26:20,310 --> 00:26:21,170 Just go up the other way. 531 00:26:21,170 --> 00:26:24,280 I mean, there are much more complicated systems you can 532 00:26:24,280 --> 00:26:28,120 use, but just that one small change where you're much more 533 00:26:28,120 --> 00:26:33,010 likely to get warm data is going to give you a big boost. 534 00:26:33,010 --> 00:26:35,220 And again, like I said, use hierarchies of allocations 535 00:26:35,220 --> 00:26:38,990 instead of these sort of static block allocations. 536 00:26:38,990 --> 00:26:42,190 Instead of trying to have one general purpose super mega 537 00:26:42,190 --> 00:26:46,230 allocator that does everything.
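The most-recently-used policy described above falls out naturally if the free list is a stack. The following is a minimal sketch of a fixed-size block allocator with that behavior; `block_pool`, `pool_init`, `pool_alloc`, `pool_free`, and the two size constants are hypothetical names and values chosen for the example. Freed blocks are pushed on the head of the list and allocation pops from the head, so the block handed out is the one that was touched most recently and is most likely still warm in cache.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  64   /* hypothetical fixed block size in bytes */
#define BLOCK_COUNT 8    /* hypothetical pool capacity */

typedef union block {
    union block  *next;                /* valid only while the block is free */
    unsigned char bytes[BLOCK_SIZE];   /* payload while the block is in use */
} block;

typedef struct {
    block  storage[BLOCK_COUNT];       /* statically sized backing store */
    block *free_head;                  /* most recently freed block */
} block_pool;

/* Thread every block onto the free list, lowest address first. */
void pool_init(block_pool *p)
{
    p->free_head = NULL;
    for (size_t i = BLOCK_COUNT; i-- > 0; ) {
        p->storage[i].next = p->free_head;
        p->free_head = &p->storage[i];
    }
}

/* Pop the most recently freed block; returns NULL when the pool is empty. */
void *pool_alloc(block_pool *p)
{
    block *b = p->free_head;
    if (b != NULL)
        p->free_head = b->next;
    return b;
}

/* Push the block back on the head so it is the next one reused. */
void pool_free(block_pool *p, void *ptr)
{
    block *b = (block *)ptr;
    assert(b >= p->storage && b < p->storage + BLOCK_COUNT);
    b->next = p->free_head;
    p->free_head = b;
}
```

A least-recently-used variant would need a queue with a separate tail pointer, so the stack version is both the simpler and the cache-friendlier of the two.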
538 00:26:46,230 --> 00:26:48,130 And again, if it's well planned, fragmentation is a 539 00:26:48,130 --> 00:26:51,450 non-issue, it's impossible. 540 00:26:51,450 --> 00:26:54,810 Cache line, oh, and probably another important concept to 541 00:26:54,810 --> 00:26:57,150 keep in mind as you're writing your allocator is the transfer 542 00:26:57,150 --> 00:26:59,920 block size of the SPU. 543 00:26:59,920 --> 00:27:06,110 If you have a 16K block and the system is aware of 16K 544 00:27:06,110 --> 00:27:11,080 blocks then there are plenty of cases where you don't have 545 00:27:11,080 --> 00:27:14,190 to keep track of-- in the system-- the size of things. 546 00:27:14,190 --> 00:27:17,800 It's just how many blocks, how many SPU blocks do you have? 547 00:27:17,800 --> 00:27:20,990 Or what percentage of SPU blocks you have? 548 00:27:20,990 --> 00:27:25,000 And that will help you sort of compress down your 549 00:27:25,000 --> 00:27:27,570 memory requirements when you're referring to blocks and 550 00:27:27,570 --> 00:27:28,820 memory streams and memory. 551 00:27:33,230 --> 00:27:36,476 AUDIENCE: About the memory management for data here, you 552 00:27:36,476 --> 00:27:40,620 also write overlay managers for code for the user? 553 00:27:40,620 --> 00:27:43,840 MIKE ACTON: Well, it basically amounts to the same thing. 554 00:27:43,840 --> 00:27:46,900 I mean, the code is just data, you just load it in and fix up 555 00:27:46,900 --> 00:27:48,860 the pointers and you're done. 556 00:27:48,860 --> 00:27:52,292 AUDIENCE: I was just wondering whether IBM gives you 557 00:27:52,292 --> 00:27:55,130 embedding -- 558 00:27:55,130 --> 00:27:56,630 MIKE ACTON: We don't use any of the IBM 559 00:27:56,630 --> 00:27:58,920 systems at all for games. 560 00:28:01,880 --> 00:28:06,100 I know IBM has an overlay manager as part of the SDK. 561 00:28:06,100 --> 00:28:07,450 AUDIENCE: Well, not really.
562 00:28:07,450 --> 00:28:09,400 It's -- 563 00:28:09,400 --> 00:28:11,970 MIKE ACTON: Well, they have some overlay support, right? 564 00:28:11,970 --> 00:28:15,920 That's not something we would ever use. 565 00:28:15,920 --> 00:28:18,170 And in general, I mean, I guess that's probably an 566 00:28:18,170 --> 00:28:19,700 interesting question of how-- 567 00:28:19,700 --> 00:28:20,810 AUDIENCE: So it's all ground up? 568 00:28:20,810 --> 00:28:21,510 MIKE ACTON: What's that? 569 00:28:21,510 --> 00:28:24,080 AUDIENCE: All your development is ground up? 570 00:28:24,080 --> 00:28:25,340 MIKE ACTON: Yeah, for the most part. 571 00:28:28,400 --> 00:28:30,600 For us, that's definitely true. 572 00:28:30,600 --> 00:28:33,610 There are studios that, especially cross platform 573 00:28:33,610 --> 00:28:36,020 studios that will take middleware development and 574 00:28:36,020 --> 00:28:37,270 just sort of use it on a high-level. 575 00:28:39,680 --> 00:28:43,280 But especially when you're starting a first generation 576 00:28:43,280 --> 00:28:47,220 platform game, there's virtually nothing there to use 577 00:28:47,220 --> 00:28:48,690 because the hardware hasn't been around long enough for 578 00:28:48,690 --> 00:28:51,530 anybody else to write anything either. 579 00:28:51,530 --> 00:28:55,000 So if you need it, you write it yourself. 580 00:28:55,000 --> 00:28:57,240 Plus that's just sort of the general theme of game 581 00:28:57,240 --> 00:28:58,850 development. 582 00:28:58,850 --> 00:29:02,700 It's custom to your situation, to your data. 583 00:29:02,700 --> 00:29:04,740 And anything that's general purpose enough to sell as 584 00:29:04,740 --> 00:29:10,350 middleware is probably not going to be fast enough to run 585 00:29:10,350 --> 00:29:11,920 a triple A title. 586 00:29:11,920 --> 00:29:13,920 Not always true, but as a general 587 00:29:13,920 --> 00:29:15,640 rule, it's pretty valid. 588 00:29:20,240 --> 00:29:24,920 OK, so-- wait, so how'd I get here? 
589 00:29:24,920 --> 00:29:28,370 All right, this is next. 590 00:29:28,370 --> 00:29:33,100 So here's another example of how the cell 591 00:29:33,100 --> 00:29:34,760 might affect design. 592 00:29:34,760 --> 00:29:37,540 So you're writing a collision detection system. 593 00:29:43,460 --> 00:29:47,650 It's obvious that you cannot or should not expect immediate 594 00:29:47,650 --> 00:29:50,820 results from a collision detection system, otherwise 595 00:29:50,820 --> 00:29:54,030 you're going to be sitting and syncing all the time for one 596 00:29:54,030 --> 00:29:56,670 result and performance just goes out the window, you may 597 00:29:56,670 --> 00:29:58,710 as well just have a serial program. 598 00:29:58,710 --> 00:30:03,090 So you want to group results, you want to group queries and 599 00:30:03,090 --> 00:30:06,410 you want potentially, for those queries to be deferred 600 00:30:06,410 --> 00:30:09,920 so that you can store them, you can just DMA them out and 601 00:30:09,920 --> 00:30:12,150 then whatever process needed them will come back and grab 602 00:30:12,150 --> 00:30:14,590 them later. 603 00:30:14,590 --> 00:30:17,475 So conceptually that's the design you want to build into 604 00:30:17,475 --> 00:30:22,590 a collision detection system, which then in turn affects the 605 00:30:22,590 --> 00:30:24,030 high-level design. 606 00:30:24,030 --> 00:30:29,800 So AI, scripts, any game code that might have previously 607 00:30:29,800 --> 00:30:32,040 depended on a result being immediately available, as in 608 00:30:32,040 --> 00:30:36,120 they have characters that shoot rays around the room to 609 00:30:36,120 --> 00:30:38,910 decide what they're going to do next or bullets that are 610 00:30:38,910 --> 00:30:42,730 flying through the air or whatever, can no longer make 611 00:30:42,730 --> 00:30:43,710 that assumption.
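One way to picture the deferred, grouped query design is a per-frame batch where game code submits a query and gets back a ticket instead of an answer. The sketch below is an assumption-laden illustration: `query_batch`, `batch_submit`, `batch_resolve`, and `batch_lookup` are hypothetical names, and the "world" is reduced to a ground plane at y = 0 so the example stays short. In a real system the resolve step would be the job that runs on an SPU over the whole batch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUERIES 256   /* hypothetical per-frame capacity */

typedef struct { float origin[3]; float dir[3]; float max_dist; } ray_query;
typedef struct { int hit; float dist; } ray_result;

typedef struct {
    ray_query  queries[MAX_QUERIES];
    ray_result results[MAX_QUERIES];
    size_t     count;
    int        resolved;   /* set once the whole batch has been processed */
} query_batch;

/* Called by game code during the frame: returns a ticket, not a result.
   Returns -1 if the batch is full. */
int batch_submit(query_batch *b, const ray_query *q)
{
    if (b->count == MAX_QUERIES)
        return -1;
    b->queries[b->count] = *q;
    return (int)b->count++;
}

/* Stand-in for the batched job: intersect every ray with the plane y = 0. */
void batch_resolve(query_batch *b)
{
    for (size_t i = 0; i < b->count; ++i) {
        const ray_query *q = &b->queries[i];
        ray_result *r = &b->results[i];
        r->hit = 0;
        r->dist = 0.0f;
        if (q->dir[1] < 0.0f && q->origin[1] >= 0.0f) {
            float t = q->origin[1] / -q->dir[1];
            if (t <= q->max_dist) {
                r->hit = 1;
                r->dist = t;
            }
        }
    }
    b->resolved = 1;
}

/* Called later, after the batch has run, to collect a result by ticket. */
const ray_result *batch_lookup(const query_batch *b, int ticket)
{
    assert(b->resolved && ticket >= 0 && (size_t)ticket < b->count);
    return &b->results[ticket];
}
```

The shape of the interface is what forces the change in calling code: nothing can branch on a result in the same call that asked for it.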
612 00:30:43,710 --> 00:30:46,580 So they have to be able to group up their queries and 613 00:30:46,580 --> 00:30:48,450 look them up later and have other work 614 00:30:48,450 --> 00:30:50,070 to do in the meantime. 615 00:30:50,070 --> 00:30:54,470 So this is a perfect example of how you cannot take old 616 00:30:54,470 --> 00:30:57,630 code and move it to the PS3. 617 00:30:57,630 --> 00:31:01,330 Because old code, serial code would have definitely assumed 618 00:31:01,330 --> 00:31:03,040 that the results were immediately available because 619 00:31:03,040 --> 00:31:04,700 honestly, that was the fastest way to do it. 620 00:31:09,300 --> 00:31:13,230 So on a separate issue, we have SPU decomposition for the 621 00:31:13,230 --> 00:31:15,000 geometry look up. 622 00:31:15,000 --> 00:31:18,710 So from a high-level you have your entire scene in the level 623 00:31:18,710 --> 00:31:24,640 of the world or whatever and you have the set of queries in 624 00:31:24,640 --> 00:31:26,480 the case of static-- did I collide with 625 00:31:26,480 --> 00:31:27,710 anything in the world? 626 00:31:27,710 --> 00:31:29,820 Or you have a ray: where does this ray collide with 627 00:31:29,820 --> 00:31:30,880 something in the world? 628 00:31:30,880 --> 00:31:33,640 And so you have this problem of you have this large sort of 629 00:31:33,640 --> 00:31:37,520 memory database in main RAM and you have the small 630 00:31:37,520 --> 00:31:40,680 SPU, which obviously cannot read in the whole database, 631 00:31:40,680 --> 00:31:42,670 analyze it, and spit out the result. 632 00:31:42,670 --> 00:31:46,410 It has to go back and forth to main RAM in order 633 00:31:46,410 --> 00:31:49,340 to build its result. 634 00:31:49,340 --> 00:31:52,270 So the question is how do you decompose the memory in the 635 00:31:52,270 --> 00:31:54,700 first place to make that at least 636 00:31:54,700 --> 00:31:58,160 somewhat reasonably efficient?
637 00:31:58,160 --> 00:32:04,660 The first sort of instinct I think, based on history is 638 00:32:04,660 --> 00:32:07,960 sort of the traditional scene graph structures like BSP tree 639 00:32:07,960 --> 00:32:09,680 or octree or something like that. 640 00:32:13,720 --> 00:32:16,310 Particularly, on the SPU because of TLB misses, that 641 00:32:16,310 --> 00:32:19,070 becomes really expensive, really quickly when you're 642 00:32:19,070 --> 00:32:21,550 basically hitting random memory on every 643 00:32:21,550 --> 00:32:25,400 single node on the tree. 644 00:32:25,400 --> 00:32:28,090 So what you want to do is you want to make that hierarchy as 645 00:32:28,090 --> 00:32:31,120 flat as you possibly can. 646 00:32:31,120 --> 00:32:35,480 If the leaves have to be bigger that's fine because it turns 647 00:32:35,480 --> 00:32:40,190 out it's much, much cheaper to stream in a bigger group of-- 648 00:32:40,190 --> 00:32:43,280 as much data as you can fit into the SPU and run through 649 00:32:43,280 --> 00:32:46,820 it and make your decisions and spit it back out than it is to 650 00:32:46,820 --> 00:32:49,190 traverse the hierarchy. 651 00:32:49,190 --> 00:32:53,640 So basically, the depth of your hierarchy in your scene 652 00:32:53,640 --> 00:32:55,960 database is completely determined by how much data 653 00:32:55,960 --> 00:32:58,080 you can fit into the SPU by the maximum 654 00:32:58,080 --> 00:33:02,230 size of the leaf node. 655 00:33:02,230 --> 00:33:05,230 The rest of the depth is only because you don't have any 656 00:33:05,230 --> 00:33:06,010 other choice.
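The relationship between leaf size and hierarchy depth can be made concrete with a little arithmetic. The helpers below are a hedged sketch: `leaf_capacity` and `tree_depth` are hypothetical names, and the byte budget, item size, and branching factor in the usage note are illustrative numbers, not figures from the talk. The sketch assumes a uniform tree, which real scene data rarely is, so treat the result as an estimate.

```c
#include <assert.h>
#include <stddef.h>

/* Largest number of items that fits in the local-store budget
   reserved for one leaf. */
size_t leaf_capacity(size_t budget_bytes, size_t item_bytes)
{
    assert(item_bytes > 0);
    return budget_bytes / item_bytes;
}

/* Levels of interior nodes needed above the leaves: the smallest d with
   capacity * branching^d >= item_count. Assumes the product does not
   overflow size_t for the inputs given. */
unsigned tree_depth(size_t item_count, size_t capacity, size_t branching)
{
    assert(capacity > 0 && branching > 1);
    unsigned depth = 0;
    size_t covered = capacity;
    while (covered < item_count) {
        covered *= branching;
        ++depth;
    }
    return depth;
}
```

For example, assuming a 128 KB budget and 32-byte items, a leaf holds 4096 items, and a million items with an eight-way branch need three levels above the leaves. Shrinking the leaf to 64 items pushes that to five levels, each one a potential random access into main memory.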
657 00:33:06,010 --> 00:33:10,690 You know, and basically the same thing goes with dynamic 658 00:33:10,690 --> 00:33:12,860 geometry as you have geometry moving around in the scene, 659 00:33:12,860 --> 00:33:15,110 characters moving around in the scene-- 660 00:33:15,110 --> 00:33:18,730 they basically need to update themselves into their own 661 00:33:18,730 --> 00:33:22,020 database, into their own leaves and 662 00:33:22,020 --> 00:33:24,180 they'll do this in groups. 663 00:33:24,180 --> 00:33:26,790 And then when you query, you basically want it to query as 664 00:33:26,790 --> 00:33:28,820 many of those as possible, as you can 665 00:33:28,820 --> 00:33:30,920 possibly fit in at once. 666 00:33:30,920 --> 00:33:34,410 So you could have sort of a broad-phase collision first, 667 00:33:34,410 --> 00:33:36,910 where you have all of the groups of characters that are 668 00:33:36,910 --> 00:33:39,800 potentially maximum in this leaf, so 669 00:33:39,800 --> 00:33:41,500 bounding box or whatever. 670 00:33:41,500 --> 00:33:45,780 So even though you could in theory, in principle narrow 671 00:33:45,780 --> 00:33:49,230 that down even more, the cost for that, the cost for the 672 00:33:49,230 --> 00:33:53,790 potential memory miss for that is so high that you just want 673 00:33:53,790 --> 00:33:55,760 to do a linear search through as many as you 674 00:33:55,760 --> 00:33:57,510 possibly can on SPU. 675 00:33:57,510 --> 00:33:59,170 Does that make sense? 676 00:34:04,220 --> 00:34:06,610 Procedural graphics-- 677 00:34:06,610 --> 00:34:12,630 so although we have a GPU on the PlayStation 3, it does 678 00:34:12,630 --> 00:34:14,880 turn out that the SPU is a lot better at 679 00:34:14,880 --> 00:34:16,130 doing a lot of things. 680 00:34:19,590 --> 00:34:22,490 Things basically where you create 681 00:34:22,490 --> 00:34:27,630 geometry for RSX to render. 682 00:34:27,630 --> 00:34:31,540 So particle system, dynamic particle systems.
Especially 683 00:34:31,540 --> 00:34:34,990 where those systems have to interact with the world in 684 00:34:34,990 --> 00:34:41,000 some way, which will be much more expensive on the GPU. 685 00:34:41,000 --> 00:34:45,200 Sort of dynamic systems like cloth. 686 00:34:45,200 --> 00:34:48,460 Fonts is actually really interesting because typically 687 00:34:48,460 --> 00:34:51,880 you'll just see bitmap fonts in which 688 00:34:51,880 --> 00:34:53,840 case are just textures. 689 00:34:53,840 --> 00:34:56,930 But if you have a very complex user interface then just the 690 00:34:56,930 --> 00:35:02,800 size of the bitmap becomes extreme and if you compress 691 00:35:02,800 --> 00:35:04,370 them they look terrible, especially fonts. 692 00:35:04,370 --> 00:35:06,620 Fonts need to look perfect. 693 00:35:06,620 --> 00:35:08,980 So if you do do procedural fonts, for example, TrueType 694 00:35:08,980 --> 00:35:12,750 fonts, the cost of rendering a font actually gets 695 00:35:12,750 --> 00:35:15,040 significant. 696 00:35:15,040 --> 00:35:18,830 And in this case, the SPU is actually a great use for 697 00:35:18,830 --> 00:35:22,210 rendering a procedural font. 698 00:35:22,210 --> 00:35:24,020 Rendering textures is basically the 699 00:35:24,020 --> 00:35:25,510 same case as font. 700 00:35:25,510 --> 00:35:29,200 Procedural textures like if you do noise-based clouds or 701 00:35:29,200 --> 00:35:30,750 something like that. 702 00:35:30,750 --> 00:35:33,700 And parametric geometry, it's like NURBS or subdivision 703 00:35:33,700 --> 00:35:35,300 surfaces or something like that, is a perfect 704 00:35:35,300 --> 00:35:36,700 case for the SPU. 705 00:35:42,260 --> 00:35:43,754 Is there a question? 706 00:35:48,060 --> 00:35:49,190 Geometry database, OK. 707 00:35:49,190 --> 00:35:53,970 First thing scene graphs are worthless. 708 00:35:53,970 --> 00:35:55,550 Yeah?
709 00:35:55,550 --> 00:35:58,700 AUDIENCE: So of those sort of different conceptualized 710 00:35:58,700 --> 00:36:02,936 paths, are you literally swapping code in and out of 711 00:36:02,936 --> 00:36:06,560 the SPUs with the data many times per frame? 712 00:36:06,560 --> 00:36:07,900 Or is it more of a static-- 713 00:36:07,900 --> 00:36:10,520 MIKE ACTON: OK, that's an excellent question. 714 00:36:10,520 --> 00:36:12,660 It totally depends. 715 00:36:12,660 --> 00:36:16,470 I mean, in general through a game or through, at least a 716 00:36:16,470 --> 00:36:20,480 particular area of a game the SPU set up is stable. 717 00:36:20,480 --> 00:36:24,770 So if we decide you're going to have this SPU dedicated to 718 00:36:24,770 --> 00:36:29,060 physics for example, it is very likely that that SPU is 719 00:36:29,060 --> 00:36:31,860 stable and it's going to be dedicated to physics, at least 720 00:36:31,860 --> 00:36:34,840 for some period of time through the level or through 721 00:36:34,840 --> 00:36:36,360 the zone or wherever it is. 722 00:36:36,360 --> 00:36:39,920 Sometimes through an entire game. 723 00:36:39,920 --> 00:36:42,790 So there are going to be elements of that where it's 724 00:36:42,790 --> 00:36:45,260 sort of a well balanced problem. 725 00:36:45,260 --> 00:36:47,620 There's basically no way you're going to get waste. 726 00:36:47,620 --> 00:36:51,350 It's always going to be full, it's always going to be busy. 727 00:36:51,350 --> 00:36:54,470 Collision detection and physics are the two things 728 00:36:54,470 --> 00:36:58,910 that you'll never have enough CPU to do. 729 00:36:58,910 --> 00:37:02,300 You can always use more and more CPU. 730 00:37:02,300 --> 00:37:06,790 And basically, the rest can be dynamically scheduled. 731 00:37:06,790 --> 00:37:09,360 And the question of how to schedule it, is actually an 732 00:37:09,360 --> 00:37:12,230 interesting problem.
733 00:37:12,230 --> 00:37:15,810 It's my opinion that sort of looking for the universal 734 00:37:15,810 --> 00:37:18,820 scheduler that solves all problems and magically makes 735 00:37:18,820 --> 00:37:23,180 everything work is a total lost cause. 736 00:37:23,180 --> 00:37:27,880 You have more than enough data to work with and in your game 737 00:37:27,880 --> 00:37:33,480 to decide how to schedule your SPUs basically, manually. 738 00:37:33,480 --> 00:37:35,320 And it's just not that complicated. 739 00:37:35,320 --> 00:37:36,590 We have six SPUs. 740 00:37:36,590 --> 00:37:39,380 How to schedule six SPUs is just not that complicated a 741 00:37:39,380 --> 00:37:41,920 problem, you could write it down on a piece of paper. 742 00:37:46,990 --> 00:37:52,040 OK, so scene graphs are almost always, universally a complete 743 00:37:52,040 --> 00:37:53,620 waste of time. 744 00:37:53,620 --> 00:37:57,640 They store way too much data for no apparent reason. 745 00:37:57,640 --> 00:38:00,170 Store your databases independently based on what 746 00:38:00,170 --> 00:38:02,400 you're actually doing with them, optimize your data 747 00:38:02,400 --> 00:38:04,860 separately because you're accessing it separately. 748 00:38:04,860 --> 00:38:09,070 The only thing that should be linking your sort of domain 749 00:38:09,070 --> 00:38:11,910 object is a key that says all right, we'll exist in this 750 00:38:11,910 --> 00:38:15,290 database and this database and this database. 751 00:38:15,290 --> 00:38:18,510 But to have this sort of giant structure that keeps all of 752 00:38:18,510 --> 00:38:24,820 the data for each element in the scene is about the poorest 753 00:38:24,820 --> 00:38:30,580 performing thing you can imagine for both cache and TLB and SPU 754 00:38:30,580 --> 00:38:32,620 because you can't fit an individual node on the SPU. 755 00:38:38,130 --> 00:38:39,380 I think I covered that.
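To make the "separate databases linked only by a key" idea concrete, here is a small sketch. Everything in it is hypothetical: the record layouts, `physics_db`, `ai_db`, `physics_step`, `ai_find`, and the capacity constant are invented for illustration. Each system keeps its own densely packed array holding only the fields it actually reads, and an `object_key` is the only thing an object has in common across systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 1024   /* hypothetical capacity */

typedef uint32_t object_key;   /* the only link between systems */

typedef struct { float pos[4]; float vel[4]; } physics_record;
typedef struct { uint32_t state; uint32_t target; } ai_record;

/* Each system owns a packed array plus the keys of the objects in it. */
typedef struct {
    physics_record records[MAX_OBJECTS];
    object_key     keys[MAX_OBJECTS];
    size_t         count;
} physics_db;

typedef struct {
    ai_record  records[MAX_OBJECTS];
    object_key keys[MAX_OBJECTS];
    size_t     count;
} ai_db;

/* Physics streams linearly through its own records and never touches
   AI data, so the whole pass is one contiguous read and write. */
void physics_step(physics_db *db, float dt)
{
    assert(db->count <= MAX_OBJECTS);
    for (size_t i = 0; i < db->count; ++i)
        for (int c = 0; c < 4; ++c)
            db->records[i].pos[c] += db->records[i].vel[c] * dt;
}

/* Cross-system lookup by key. A linear search keeps the sketch short;
   a sorted key array or a hash would be the obvious next step. */
ai_record *ai_find(ai_db *db, object_key key)
{
    for (size_t i = 0; i < db->count; ++i)
        if (db->keys[i] == key)
            return &db->records[i];
    return NULL;
}
```

Because each array is contiguous and uniformly sized, any slice of it can be streamed to an SPU as is, which is exactly what a node of a monolithic scene graph does not allow.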
756 00:38:42,620 --> 00:38:45,540 Here's an interesting example, so what you want to do is if 757 00:38:45,540 --> 00:38:48,690 you have the table of queries that you have-- bunch of 758 00:38:48,690 --> 00:38:51,910 people over the course of a frame say I want to know if I 759 00:38:51,910 --> 00:38:54,110 collided with something. 760 00:38:54,110 --> 00:38:57,040 And then if you basically make a pre-sort pass on that and 761 00:38:57,040 --> 00:39:01,150 basically, spatially sort these guys together, so let's 762 00:39:01,150 --> 00:39:03,300 say you have however many you can fit in a SPU. 763 00:39:03,300 --> 00:39:06,330 So you have four of these queries together. 764 00:39:06,330 --> 00:39:10,050 Although they might be a little further apart than you 765 00:39:10,050 --> 00:39:13,510 would hope, you could basically create a bounding box 766 00:39:13,510 --> 00:39:16,720 through a single query on the database that's the sum of all 767 00:39:16,720 --> 00:39:20,320 of them and then as I said, now you have a linear list 768 00:39:20,320 --> 00:39:21,930 that you can just stream through for all of them. 769 00:39:21,930 --> 00:39:24,585 So even though it's doing more work for any individual one, 770 00:39:24,585 --> 00:39:29,230 the overhead is reduced so significantly that the end 771 00:39:29,230 --> 00:39:31,050 result is that it's significantly faster. 772 00:39:34,200 --> 00:39:37,780 And that's also what I mean by multiple simultaneous lookups. 773 00:39:37,780 --> 00:39:41,530 Basically you want to group queries together, but make 774 00:39:41,530 --> 00:39:44,240 sure that there's some advantage to that. 775 00:39:44,240 --> 00:39:47,170 By spatially pre-sorting them there is an advantage to that 776 00:39:47,170 --> 00:39:50,050 because it's more likely that they will have 777 00:39:50,050 --> 00:39:53,370 overlap in your queries. 778 00:39:53,370 --> 00:39:54,100 So game logic.
779 00:39:54,100 --> 00:39:58,110 Stuff that the cell would affect in game logic. 780 00:39:58,110 --> 00:40:02,570 State machines are a good example. 781 00:40:02,570 --> 00:40:06,330 If you defer your logic lines and defer your results, SPUs 782 00:40:06,330 --> 00:40:10,610 are amazingly perfect for defining state machines. 783 00:40:10,610 --> 00:40:13,130 If you expect your logic lines to be immediately available 784 00:40:13,130 --> 00:40:17,970 across the entire system, SPU is absolutely horrid. 785 00:40:17,970 --> 00:40:20,920 So if you basically write buffers into your state 786 00:40:20,920 --> 00:40:26,760 machines or your logic machines then each SPU can be 787 00:40:26,760 --> 00:40:30,760 cranking on multiple state machines at once where all the 788 00:40:30,760 --> 00:40:34,500 input and all the output lines are assumed to be deferred and 789 00:40:34,500 --> 00:40:36,640 it's just an extremely straightforward process. 790 00:40:39,480 --> 00:40:41,380 Scripting, so scripting things like-- 791 00:40:41,380 --> 00:40:44,020 I don't know, Lua script or C script or 792 00:40:44,020 --> 00:40:47,080 something like that. 793 00:40:47,080 --> 00:40:49,550 I mean, obviously the first thing to look at is the size 794 00:40:49,550 --> 00:40:50,940 of the interpreter. 795 00:40:50,940 --> 00:40:54,420 Will it fit into an SPU to begin with? 796 00:40:54,420 --> 00:41:00,100 Another option to consider is, can it be converted into SPU 797 00:41:00,100 --> 00:41:03,900 code, either offline or dynamically? 798 00:41:03,900 --> 00:41:05,990 Because you'll find that most off the shelf scripting 799 00:41:05,990 --> 00:41:08,190 languages are scalar, 800 00:41:08,190 --> 00:41:11,220 sequential scripting languages. 801 00:41:11,220 --> 00:41:16,720 So all of the p-code within the scripting language itself 802 00:41:16,720 --> 00:41:18,300 basically defines scalar access.
803 00:41:18,300 --> 00:41:21,970 So not only are you switching on every byte or every two 804 00:41:21,970 --> 00:41:24,750 bytes or whatever, so it's sort of poorly performing code 805 00:41:24,750 --> 00:41:27,600 from an SPU point of view, but it's also poorly performing 806 00:41:27,600 --> 00:41:29,760 code from a memory point of view. 807 00:41:29,760 --> 00:41:31,810 So I guess the question is whether or not you can 808 00:41:31,810 --> 00:41:35,700 optimize the script itself and turn it into SPU code 809 00:41:35,700 --> 00:41:39,380 that you can then dynamically load or come up with a new 810 00:41:39,380 --> 00:41:42,780 script that's just much more friendly for the SPUs. 811 00:41:45,410 --> 00:41:49,880 Another option if you have to use a single, sort of scalar 812 00:41:49,880 --> 00:41:55,550 scripting language like Lua or C script or whatever, if you 813 00:41:55,550 --> 00:41:59,120 can run multiple streams simultaneously so that while 814 00:41:59,120 --> 00:42:01,980 you're doing these sort of individual offline memory 815 00:42:01,980 --> 00:42:06,140 lookups and reads and writes to main memory, that once one 816 00:42:06,140 --> 00:42:08,550 blocks you can start moving on another one. 817 00:42:08,550 --> 00:42:12,070 As long as there's no dependencies between these two 818 00:42:12,070 --> 00:42:14,600 scripts we should be able to stream them both 819 00:42:14,600 --> 00:42:17,640 simultaneously. 820 00:42:17,640 --> 00:42:21,620 Motion control actually turns out to be a critical problem 821 00:42:21,620 --> 00:42:23,850 in games in general that's often overlooked. 822 00:42:23,850 --> 00:42:28,300 It's who controls the motion in the game. 823 00:42:28,300 --> 00:42:29,750 Is it the AI? 824 00:42:29,750 --> 00:42:34,590 So is it the controller in the case of the player? 825 00:42:34,590 --> 00:42:36,720 I say, push forward, so the guy moves forward. 826 00:42:36,720 --> 00:42:39,090 Is that really what controls it?
827 00:42:39,090 --> 00:42:40,950 Or is it the physics? 828 00:42:40,950 --> 00:42:43,660 So all the AI does is say, I want to move forward, tells 829 00:42:43,660 --> 00:42:45,360 the physics system I want to move forward and the physics 830 00:42:45,360 --> 00:42:47,000 tries to follow it. 831 00:42:47,000 --> 00:42:48,660 Or is it the animation? 832 00:42:48,660 --> 00:42:50,710 That you have the animators actually put translation in 833 00:42:50,710 --> 00:42:52,530 the animation, so is that translation the thing that's 834 00:42:52,530 --> 00:42:54,607 actually driving the motion and everything else is trying 835 00:42:54,607 --> 00:42:56,170 to follow it? 836 00:42:56,170 --> 00:42:59,830 Turns out to be a surprisingly difficult problem to solve and 837 00:42:59,830 --> 00:43:06,180 every studio ends up with their own solution. 838 00:43:06,180 --> 00:43:08,590 I forget what point I was making on how the cell 839 00:43:08,590 --> 00:43:09,840 affected that decision. 840 00:43:12,040 --> 00:43:13,210 But-- 841 00:43:13,210 --> 00:43:18,780 AUDIENCE: [OBSCURED] 842 00:43:18,780 --> 00:43:21,120 MIKE ACTON: I think the point, probably that I was trying to 843 00:43:21,120 --> 00:43:24,670 make is that because you want everything to be deferred 844 00:43:24,670 --> 00:43:29,700 anyway, then the order does become a clearer sort of 845 00:43:29,700 --> 00:43:32,530 winner in that order. 846 00:43:32,530 --> 00:43:37,780 Where you want the immediate feedback from the controls, 847 00:43:37,780 --> 00:43:39,900 the control leads the way. 848 00:43:39,900 --> 00:43:42,740 You have the physics, which then follows, perhaps, even a 849 00:43:42,740 --> 00:43:46,960 frame behind that to say how that new position is impacted 850 00:43:46,960 --> 00:43:49,320 by the physical reality of the world.
851 00:43:49,320 --> 00:43:52,230 And then potentially a frame behind that or half a frame 852 00:43:52,230 --> 00:43:55,560 behind that you have the animation system, which in 853 00:43:55,560 --> 00:43:58,050 that case would just be basically, a visual 854 00:43:58,050 --> 00:44:00,520 representation of what's going on rather 855 00:44:00,520 --> 00:44:01,540 than leading anything. 856 00:44:01,540 --> 00:44:04,190 It's basically an icon for what's happening in the 857 00:44:04,190 --> 00:44:06,600 physics and the AI. 858 00:44:09,390 --> 00:44:12,680 The limitations of your system are that it has to be deferred 859 00:44:12,680 --> 00:44:14,470 and that it has to be done in groups. 860 00:44:14,470 --> 00:44:17,430 Basically, some of these sort of really difficult decisions 861 00:44:17,430 --> 00:44:19,350 have only one or two obvious answers. 862 00:44:24,570 --> 00:44:26,550 All right, well I wanted to dig into animation a little 863 00:44:26,550 --> 00:44:29,520 bit, so does anybody have any questions on anything? 864 00:44:29,520 --> 00:44:31,300 Any of the sort of the high-level stuff that I've 865 00:44:31,300 --> 00:44:32,570 covered up to this point? 866 00:44:35,603 --> 00:44:36,853 Yeah? 867 00:44:38,700 --> 00:44:43,196 AUDIENCE: So does the need for deferral and breaking into 868 00:44:43,196 --> 00:44:48,230 groups and staging, does this need break the desire for the 869 00:44:48,230 --> 00:44:51,500 higher-level programmers to abstract what's going on at 870 00:44:51,500 --> 00:44:54,176 the data engine level? 871 00:44:54,176 --> 00:44:57,770 Or is that not quite the issue? 872 00:44:57,770 --> 00:45:01,130 MIKE ACTON: Well, let's say we get a game 873 00:45:01,130 --> 00:45:02,500 play programmer, right? 874 00:45:02,500 --> 00:45:05,430 Fresh out of school, he's taught in school, C 875 00:45:05,430 --> 00:45:06,680 plus plus in school. 
876 00:45:06,680 --> 00:45:11,090 Taught to decompose the world into sort of the main classes 877 00:45:11,090 --> 00:45:13,460 and that they all communicate through each other maybe 878 00:45:13,460 --> 00:45:15,650 through messaging. 879 00:45:15,650 --> 00:45:17,920 All right, well the first thing we tell him is that all 880 00:45:17,920 --> 00:45:19,710 that is complete crap. 881 00:45:19,710 --> 00:45:22,970 None of that will actually work in practice. 882 00:45:22,970 --> 00:45:26,260 So in some sense, yes, there is a sort of tendency for them 883 00:45:26,260 --> 00:45:31,310 to want this interface, this sort of clean abstraction, but 884 00:45:31,310 --> 00:45:34,450 abstraction doesn't have any value. 885 00:45:34,450 --> 00:45:37,085 It doesn't make the game faster, it doesn't make the 886 00:45:37,085 --> 00:45:38,590 game cheaper. 887 00:45:38,590 --> 00:45:39,370 It doesn't make-- 888 00:45:39,370 --> 00:45:42,540 AUDIENCE: [OBSCURED] 889 00:45:42,540 --> 00:45:44,860 PROFESSOR: There's a bit of a religious. 890 00:45:44,860 --> 00:45:45,780 Let's move on. 891 00:45:45,780 --> 00:45:47,980 He has a lot of other interesting things to say. 892 00:45:47,980 --> 00:45:50,050 And we can get to that question-- 893 00:45:50,050 --> 00:45:51,860 AUDIENCE: It sounds like there's two completely 894 00:45:51,860 --> 00:45:53,790 different communities involved in the development. 895 00:45:53,790 --> 00:45:56,320 There's the engine developers and there's the higher-level-- 896 00:45:56,320 --> 00:45:57,540 MIKE ACTON: That's a fair enough assessment. 897 00:45:57,540 --> 00:45:59,430 There are different communities. 898 00:45:59,430 --> 00:46:02,476 There is a community of the game play programmers and the 899 00:46:02,476 --> 00:46:04,110 community of engine programmers. 900 00:46:04,110 --> 00:46:06,390 And they have different priorities and they have 901 00:46:06,390 --> 00:46:08,760 different experiences. 
902 00:46:08,760 --> 00:46:13,190 So yeah, in that way there is a division. 903 00:46:13,190 --> 00:46:14,830 PROFESSOR: I will let you go on. 904 00:46:14,830 --> 00:46:15,890 You said you had a lot of interesting 905 00:46:15,890 --> 00:46:18,160 information to cover. 906 00:46:18,160 --> 00:46:20,210 MIKE ACTON: OK. 907 00:46:20,210 --> 00:46:21,270 AUDIENCE: I can bitch about that. 908 00:46:21,270 --> 00:46:23,950 Don't worry. 909 00:46:23,950 --> 00:46:26,790 MIKE ACTON: So just to get into animation a little bit. 910 00:46:26,790 --> 00:46:31,030 Let's start with trying to build a simple animation 911 00:46:31,030 --> 00:46:33,520 system and see what problems creep up as we're trying 912 00:46:33,520 --> 00:46:34,820 to implement it on the cell. 913 00:46:38,270 --> 00:46:41,080 So in the simplest case we have a set of animation 914 00:46:41,080 --> 00:46:43,116 channels defined for a character, which 915 00:46:43,116 --> 00:46:44,330 is made up of joints. 916 00:46:44,330 --> 00:46:47,200 We're just talking about sort of a simple hierarchical 917 00:46:47,200 --> 00:46:49,630 transformation here. 918 00:46:49,630 --> 00:46:51,900 And some of those channels are related. 919 00:46:51,900 --> 00:46:56,600 So in the case of rotation plus translation plus scale 920 00:46:56,600 --> 00:47:00,570 equals any individual joint. 921 00:47:00,570 --> 00:47:04,780 So the first thing, typically that you'll have to answer is 922 00:47:04,780 --> 00:47:12,900 whether or not you want to do Euler or quaternion rotation. 923 00:47:12,900 --> 00:47:18,640 Now the tendency I guess, especially for new programmers 924 00:47:18,640 --> 00:47:21,370 is to go with quaternion. 925 00:47:21,370 --> 00:47:24,970 They're taught that gimbal lock is a sort of 926 00:47:24,970 --> 00:47:28,860 insurmountable problem that only quaternion solves. 927 00:47:28,860 --> 00:47:30,260 That's just simply not true.
928 00:47:30,260 --> 00:47:31,940 I mean, gimbal lock is completely 929 00:47:31,940 --> 00:47:33,500 manageable in practice. 930 00:47:36,230 --> 00:47:39,830 When you're trying to rotate on three axes and two axes 931 00:47:39,830 --> 00:47:42,940 rotate 90 degrees apart and the third axis can't be 932 00:47:42,940 --> 00:47:44,570 resolved or 180 degrees apart. 933 00:47:44,570 --> 00:47:46,390 So you can't resolve one of the axes, right? 934 00:47:46,390 --> 00:47:48,950 AUDIENCE: [OBSCURED] 935 00:47:48,950 --> 00:47:51,830 MIKE ACTON: Yeah, it's where it's impossible to resolve one 936 00:47:51,830 --> 00:47:53,990 of the axes and that's the nature of 937 00:47:53,990 --> 00:47:56,320 Euler sort of rotation. 938 00:48:00,670 --> 00:48:02,820 But sort of a quaternion rotation completely solves 939 00:48:02,820 --> 00:48:06,800 that mathematical problem, it's always resolvable and 940 00:48:06,800 --> 00:48:08,290 it's not very messy at all. 941 00:48:08,290 --> 00:48:11,550 I mean, from a sort of C programmer's perspective, it 942 00:48:11,550 --> 00:48:14,330 looks clean, the math's clean, everything's clean about it. 943 00:48:14,330 --> 00:48:18,850 Unfortunately, it doesn't compress very well at all. 944 00:48:18,850 --> 00:48:23,720 Where if you used Euler rotation, which basically just 945 00:48:23,720 --> 00:48:27,640 means that the individual rotation for every axis. 946 00:48:27,640 --> 00:48:30,830 So x rotation, y rotation, z rotation. 947 00:48:30,830 --> 00:48:33,300 That's much, much more compressible because each one 948 00:48:33,300 --> 00:48:36,410 of those axes can be individually compressed. 949 00:48:36,410 --> 00:48:38,665 It's very unlikely that you're always rotating all three 950 00:48:38,665 --> 00:48:42,060 axes, all the time, especially in a human character.
951 00:48:42,060 --> 00:48:44,596 It's much more likely that only one axis is rotating on 952 00:48:44,596 --> 00:48:49,690 any one given time and so that makes it-- 953 00:48:49,690 --> 00:48:52,660 just without any change, without any additional 954 00:48:52,660 --> 00:48:57,368 compression-- it tends to make it about 1/3 of the size. 955 00:48:57,368 --> 00:48:59,910 AUDIENCE: [OBSCURED] 956 00:48:59,910 --> 00:49:03,070 MIKE ACTON: The animation data. 957 00:49:03,070 --> 00:49:05,810 So you have this frame of animation, which is all these 958 00:49:05,810 --> 00:49:07,720 animation channels, right? 959 00:49:07,720 --> 00:49:11,000 And then over time you have these different frames of 960 00:49:11,000 --> 00:49:12,080 animation, right? 961 00:49:12,080 --> 00:49:15,100 If you store-- for every joint, for every rotation you 962 00:49:15,100 --> 00:49:18,190 store a quaternion over time, it's hard to compress across 963 00:49:18,190 --> 00:49:21,680 time because you're basically, essentially rotating all three 964 00:49:21,680 --> 00:49:22,970 axes, all the time. 965 00:49:22,970 --> 00:49:24,900 Well, with-- 966 00:49:24,900 --> 00:49:25,420 yeah? 967 00:49:25,420 --> 00:49:26,670 All right. 968 00:49:28,950 --> 00:49:32,720 So let's say, of course, the next step is how do we store 969 00:49:32,720 --> 00:49:35,170 the actual rotation itself? 970 00:49:35,170 --> 00:49:39,120 Do we store it in float, double, half precision, fixed 971 00:49:39,120 --> 00:49:40,370 point precision? 972 00:49:43,500 --> 00:49:45,670 Probably the natural tendency at this point would be to 973 00:49:45,670 --> 00:49:49,030 store it in a floating point number, but if you look at the 974 00:49:49,030 --> 00:49:51,630 actual range of rotation, which is extremely limited on 975 00:49:51,630 --> 00:49:55,900 a character, on any particular joint there are very few 976 00:49:55,900 --> 00:50:00,270 joints that would even rotate 180 degrees.
977 00:50:00,270 --> 00:50:05,360 So a floating point is overkill, by a 978 00:50:05,360 --> 00:50:07,810 large margin on rotation-- 979 00:50:11,720 --> 00:50:12,440 for the range. 980 00:50:12,440 --> 00:50:15,940 For the precision, however it's fairly good. 981 00:50:15,940 --> 00:50:18,160 Especially if you're doing very small rotations over a 982 00:50:18,160 --> 00:50:21,170 long period of time. 983 00:50:21,170 --> 00:50:23,980 So probably a more balanced approach would be to go with a 984 00:50:23,980 --> 00:50:27,280 16 bit floating point from a half format where you keep 985 00:50:27,280 --> 00:50:29,270 most of the precision, but you reduce the range 986 00:50:29,270 --> 00:50:31,340 significantly. 987 00:50:31,340 --> 00:50:33,465 There's also the potential for going with an 8 bit floating 988 00:50:33,465 --> 00:50:37,840 point format depending on the kind of 989 00:50:37,840 --> 00:50:39,490 animation that you're doing. 990 00:50:39,490 --> 00:50:43,190 And I'll probably have this on another slide, but it really 991 00:50:43,190 --> 00:50:44,710 depends on how close-- 992 00:50:44,710 --> 00:50:48,940 how compressible a joint is depends on how close to the 993 00:50:48,940 --> 00:50:49,880 root it is. 994 00:50:49,880 --> 00:50:52,530 The further away from the root the less it matters. 995 00:50:52,530 --> 00:50:54,950 So the joint at your fingertip, you can compress a 996 00:50:54,950 --> 00:50:57,017 whole lot more because it doesn't matter as much, it's 997 00:50:57,017 --> 00:50:58,420 not going to affect anything else. 998 00:50:58,420 --> 00:51:01,520 Where a joint at the actual root, the smallest change in 999 00:51:01,520 --> 00:51:04,720 motion will affect the entire system in animation and will 1000 00:51:04,720 --> 00:51:07,150 make it virtually impossible for you to line up animations 1001 00:51:07,150 --> 00:51:09,580 with each other, so that that particular joint needs to be 1002 00:51:09,580 --> 00:51:12,030 nearly perfect. 
1003 00:51:12,030 --> 00:51:13,220 And how do you store rotation? 1004 00:51:13,220 --> 00:51:17,150 Do you store them in degrees, radians, or normalized? 1005 00:51:17,150 --> 00:51:19,230 I have seen people store them in degrees. 1006 00:51:19,230 --> 00:51:21,800 I don't understand why you would ever do that. 1007 00:51:21,800 --> 00:51:26,430 It's just adding math to the problem. 1008 00:51:26,430 --> 00:51:31,920 Radians is perfectly fine if you're using off the shelf 1009 00:51:31,920 --> 00:51:35,150 trigonometric functions: tan, sine, whatever. 1010 00:51:35,150 --> 00:51:37,080 But typically, if you're going to optimize those functions 1011 00:51:37,080 --> 00:51:39,970 yourself anyway, it's going to be much more effective to go with 1012 00:51:39,970 --> 00:51:42,800 a normalized rotational value. 1013 00:51:42,800 --> 00:51:46,860 So basically between zero and 1. 1014 00:51:46,860 --> 00:51:52,970 Makes it a lot easier to do tricks based on the circle. 1015 00:51:52,970 --> 00:51:55,560 Basically you can just take the fractional value and just 1016 00:51:55,560 --> 00:51:58,050 deal with that. 1017 00:51:58,050 --> 00:52:01,150 So normalized rotation is generally the way to go and 1018 00:52:01,150 --> 00:52:07,250 normalizing a half precision is probably the even bet for 1019 00:52:07,250 --> 00:52:08,500 how you would store. 1020 00:52:12,890 --> 00:52:16,210 So looking at what we need to fit into an SPU if we're going 1021 00:52:16,210 --> 00:52:17,090 to running to an end machine. 1022 00:52:17,090 --> 00:52:17,652 Yeah? 1023 00:52:17,652 --> 00:52:20,472 AUDIENCE: You talked a lot about compressing because of 1024 00:52:20,472 --> 00:52:24,590 the way it's impacting data, what's the key driver of that? 1025 00:52:24,590 --> 00:52:25,940 MIKE ACTON: The SPU has very little space. 1026 00:52:25,940 --> 00:52:28,380 AUDIENCE: OK, so it's just the amount of space. 1027 00:52:28,380 --> 00:52:29,800 MIKE ACTON: Yeah, well OK.
1028 00:52:29,800 --> 00:52:33,450 There's two factors really, in all honesty. 1029 00:52:33,450 --> 00:52:34,820 So starting with the SPU. 1030 00:52:34,820 --> 00:52:37,090 That you have to be able to work through 1031 00:52:37,090 --> 00:52:39,660 this data on the SPU. 1032 00:52:39,660 --> 00:52:42,070 But you also have the DMA transfer itself. 1033 00:52:42,070 --> 00:52:45,360 The SPU can actually calculate really, really fast, right? 1034 00:52:45,360 --> 00:52:47,800 I mean, that's the whole point. 1035 00:52:47,800 --> 00:52:51,740 So if you can transfer less data, burn through it a little 1036 00:52:51,740 --> 00:52:55,960 bit to expand it, it's actually a huge win. 1037 00:52:55,960 --> 00:52:59,310 And on top of that we have a big, big game and only 256 1038 00:52:59,310 --> 00:53:02,160 megs of main ram. 1039 00:53:02,160 --> 00:53:08,150 And the amount of geometry that people require from a 1040 00:53:08,150 --> 00:53:14,070 current generation game or next generation game has 1041 00:53:14,070 --> 00:53:16,470 scaled up way more than the amount of memory we've been 1042 00:53:16,470 --> 00:53:20,270 given, so we've only been given eight times as much 1043 00:53:20,270 --> 00:53:21,480 memory as we had in the previous generation. 1044 00:53:21,480 --> 00:53:24,480 People expect significantly more than eight times as much 1045 00:53:24,480 --> 00:53:29,780 geometry on the screen and where do we store that? 1046 00:53:29,780 --> 00:53:32,145 We have the Blu-Ray, we can't be streaming everything off 1047 00:53:32,145 --> 00:53:36,470 the disc all the time, which is to another point. 1048 00:53:36,470 --> 00:53:41,250 You have 40 gigs of data on your disc, but 1049 00:53:41,250 --> 00:53:43,030 only 256 megs of RAM. 1050 00:53:43,030 --> 00:53:46,620 So there's this sort of a series of compression, 1051 00:53:46,620 --> 00:53:50,600 decompression to keep everything-- 1052 00:53:50,600 --> 00:53:55,640 basically, think of RAM as your L3 cache. 
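[Editor's sketch] The storage choice discussed above (a rotation kept as a normalized fraction of a full turn, held in a 16-bit half float) can be illustrated roughly as follows. This is a minimal sketch, not the speaker's code: the function names `encode_turns` and `decode_turns` are hypothetical, and it uses Python's standard `struct` half-float format purely to show the 2-byte footprint and the precision that survives.

```python
import math
import struct


def encode_turns(angle_radians: float) -> bytes:
    """Pack an angle as a 16-bit half float holding a fraction of a full turn."""
    # Keep only the fractional part of the turn, so the stored value is in [0, 1).
    turns = (angle_radians / (2.0 * math.pi)) % 1.0
    return struct.pack("<e", turns)  # "e" is IEEE 754 half precision


def decode_turns(packed: bytes) -> float:
    """Unpack the half float and convert the fraction of a turn back to radians."""
    (turns,) = struct.unpack("<e", packed)
    return turns * 2.0 * math.pi


quarter_turn = encode_turns(math.pi / 2.0)
restored = decode_turns(quarter_turn)
```

Each stored rotation takes two bytes instead of four, and because only the fractional turn is kept, wrap-around is handled by the encoding itself rather than by extra range checks.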
1053 00:54:00,830 --> 00:54:03,090 So we look at what we want to store on an SPU. 1054 00:54:03,090 --> 00:54:06,140 Basically, the goal of this is we want to get an entire 1055 00:54:06,140 --> 00:54:10,570 animation for a particular skeleton on an SPU so that we 1056 00:54:10,570 --> 00:54:12,710 can transform the skeleton and output the 1057 00:54:12,710 --> 00:54:17,040 resulting joint data. 1058 00:54:17,040 --> 00:54:19,690 So let's look at how big that would have to be. 1059 00:54:19,690 --> 00:54:22,580 So first we start with the basic nine channels per joint. 1060 00:54:22,580 --> 00:54:25,390 That's not assuming and again, you'd probably have additional 1061 00:54:25,390 --> 00:54:29,000 channels, like foot step channels and sound channels 1062 00:54:29,000 --> 00:54:30,660 and other sort of animation channels to help 1063 00:54:30,660 --> 00:54:32,000 actually make a game. 1064 00:54:32,000 --> 00:54:33,810 In this case, we just want to animate the character. 1065 00:54:33,810 --> 00:54:37,550 So we have rotation times 3, translations times 3, and 1066 00:54:37,550 --> 00:54:39,960 scales times 3. 1067 00:54:39,960 --> 00:54:44,590 So the first thing to drop and this will cover, this will 1068 00:54:44,590 --> 00:54:49,370 reduce your data by 70%, is all the uniform channels. 1069 00:54:49,370 --> 00:54:52,130 So any data that doesn't actually change across the 1070 00:54:52,130 --> 00:54:53,380 entire length of the animation. 1071 00:54:53,380 --> 00:54:56,110 It may not be zero, but it could be just one thing, one 1072 00:54:56,110 --> 00:54:57,232 value that doesn't change across 1073 00:54:57,232 --> 00:54:59,270 length of the animation. 1074 00:54:59,270 --> 00:55:01,990 So you pull all the uniform channels out. 1075 00:55:01,990 --> 00:55:04,370 And most things that's going to be scale, for example, most 1076 00:55:04,370 --> 00:55:06,100 joints don't scale. 1077 00:55:06,100 --> 00:55:08,580 Although occasionally they do. 
1078 00:55:08,580 --> 00:55:12,870 And translation, in a human our joints don't translate. 1079 00:55:12,870 --> 00:55:17,710 However, when you actually animate a character in order 1080 00:55:17,710 --> 00:55:19,830 to get particular effects, in order to make it look more 1081 00:55:19,830 --> 00:55:24,650 human you do end up needing to translate joints. 1082 00:55:24,650 --> 00:55:31,680 So we can reduce, but in order to do that we need to build a 1083 00:55:31,680 --> 00:55:34,190 map, basically, a table of these uniform channels. 1084 00:55:34,190 --> 00:55:36,340 So now we know this table of uniform channels has to be 1085 00:55:36,340 --> 00:55:40,110 stored in the SPU along with now the remaining actual 1086 00:55:40,110 --> 00:55:41,620 animation data. 1087 00:55:41,620 --> 00:55:44,880 Of course, multiplied by the number of joints. 1088 00:55:47,780 --> 00:55:49,670 So now we have what is essentially 1089 00:55:49,670 --> 00:55:52,520 raw animation data. 1090 00:55:52,520 --> 00:55:55,130 So for the sake of argument, let's say the animation data 1091 00:55:55,130 --> 00:55:58,860 has been baked out by Maya or whatever 1092 00:55:58,860 --> 00:56:01,120 at 30 frames a second. 1093 00:56:01,120 --> 00:56:04,990 We've pulled out the uniform data, so now for the joints 1094 00:56:04,990 --> 00:56:07,420 that do move we have these curves over time of the entire 1095 00:56:07,420 --> 00:56:08,780 length of the animation. 1096 00:56:08,780 --> 00:56:13,620 The problem is if that animation is 10 seconds long, 1097 00:56:13,620 --> 00:56:18,540 it's now way too big to fit in the SPU by a large margin. 1098 00:56:18,540 --> 00:56:22,600 So how do we sort of compress it down so that it 1099 00:56:22,600 --> 00:56:24,410 actually will fit?
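[Editor's sketch] The uniform-channel step described above, where channels that never change are pulled out into a small table and only the remaining curves are kept, might look roughly like this. It is a hedged illustration only: `split_uniform_channels`, the channel names, and the `tolerance` parameter are assumptions made for the example, not part of the talk.

```python
def split_uniform_channels(channels, tolerance=0.0):
    """Separate channels that never change from channels that animate.

    channels maps a channel name to its list of per-frame samples.
    Returns (uniform, animated): uniform maps a name to its single constant
    value, animated keeps the full sample list for everything else.
    """
    uniform = {}
    animated = {}
    for name, samples in channels.items():
        # A channel is uniform when its value stays the same for the whole clip.
        if max(samples) - min(samples) <= tolerance:
            uniform[name] = samples[0]
        else:
            animated[name] = samples
    return uniform, animated


# Hypothetical elbow joint: only one rotation axis actually moves.
elbow = {
    "rot_x": [0.00, 0.05, 0.10, 0.15],
    "rot_y": [0.0, 0.0, 0.0, 0.0],
    "rot_z": [0.0, 0.0, 0.0, 0.0],
    "trans_x": [1.5, 1.5, 1.5, 1.5],  # constant but not zero
    "scale_x": [1.0, 1.0, 1.0, 1.0],
}
uniform_table, animated_channels = split_uniform_channels(elbow)
```

Note that a uniform channel does not have to be zero; it just has to hold one value for the whole clip, which is why the table stores the value rather than a flag.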
1100 00:56:24,410 --> 00:56:27,630 Again, just first of all, the easiest thing to do to start 1101 00:56:27,630 --> 00:56:31,050 with is just do simple curve fitting to get rid of the 1102 00:56:31,050 --> 00:56:32,910 things that don't need to be there that you can easily 1103 00:56:32,910 --> 00:56:35,380 calculate out. 1104 00:56:35,380 --> 00:56:37,310 And again, the closer that you are to the root, the tighter 1105 00:56:37,310 --> 00:56:38,400 that fit needs to be. 1106 00:56:38,400 --> 00:56:41,070 Conversely, the further away you are from the root, you can 1107 00:56:41,070 --> 00:56:44,460 loosen up the restrictions a little bit and have a little 1108 00:56:44,460 --> 00:56:46,640 bit looser fit on the curve and compress 1109 00:56:46,640 --> 00:56:47,890 a little bit more. 1110 00:56:49,860 --> 00:56:52,760 So if you're doing a curve fitting with the simple 1111 00:56:52,760 --> 00:56:57,880 spline, basically you have to store your time values in the 1112 00:56:57,880 --> 00:57:00,290 places that were calculated. 1113 00:57:00,290 --> 00:57:03,130 Part of the problem is now you have sort of these individual 1114 00:57:03,130 --> 00:57:05,970 scalars with time can be randomly spread throughout the 1115 00:57:05,970 --> 00:57:07,050 entire animation. 1116 00:57:07,050 --> 00:57:09,260 So any point where there's basically a knot in the curve, 1117 00:57:09,260 --> 00:57:10,960 there's a time value. 1118 00:57:10,960 --> 00:57:13,460 And none of these knots are going to line up with each 1119 00:57:13,460 --> 00:57:15,930 other in any of these animation channels.
1120 00:57:15,930 --> 00:57:18,350 So in principle, if you wanted to code this you would have to 1121 00:57:18,350 --> 00:57:23,520 basically say, what is time right now and loop through 1122 00:57:23,520 --> 00:57:26,440 each of these scalar values, find out where time is, 1123 00:57:26,440 --> 00:57:28,525 calculate the position on the curve and then 1124 00:57:28,525 --> 00:57:31,930 spit out the result. 1125 00:57:31,930 --> 00:57:37,030 So one, you still have to have the unlimited length of data 1126 00:57:37,030 --> 00:57:39,100 and two, you're looping through scalar values on the 1127 00:57:39,100 --> 00:57:42,740 SPU, which is really actually, horrible. 1128 00:57:42,740 --> 00:57:45,210 So we want to find a way to solve that problem. 1129 00:57:48,470 --> 00:57:49,855 Probably the most trivial solution is 1130 00:57:49,855 --> 00:57:51,690 just do spline segments. 1131 00:57:51,690 --> 00:57:55,240 You lose some compressibility, but it solves the problem. 1132 00:57:55,240 --> 00:57:59,170 Basically you split up the spline into say, sections of 1133 00:57:59,170 --> 00:58:03,190 16 knots and you just do that. 1134 00:58:03,190 --> 00:58:06,690 And in order to do that you just need a table, you need to 1135 00:58:06,690 --> 00:58:11,170 add a table that says what the ranges of time are in each of 1136 00:58:11,170 --> 00:58:15,620 those groups of 16 knots for every channel. 1137 00:58:15,620 --> 00:58:17,690 So when you're going to transform the animation, first 1138 00:58:17,690 --> 00:58:20,120 you load this table in, you say, what's my time right now 1139 00:58:20,120 --> 00:58:21,010 at time, t? 1140 00:58:21,010 --> 00:58:24,000 You go and say which blocks, which segments of the spline 1141 00:58:24,000 --> 00:58:26,580 you need to load in for each channel, you load those in.
1142 00:58:26,580 --> 00:58:31,520 So now you have basically one section of the spline, which 1143 00:58:31,520 --> 00:58:35,400 is too big probably for the current t, but it covers what 1144 00:58:35,400 --> 00:58:36,510 t you're actually in. 1145 00:58:36,510 --> 00:58:40,660 So one block of spline for every single channel. 1146 00:58:43,270 --> 00:58:46,520 So the advantage of this, now that the spline is sorted into 1147 00:58:46,520 --> 00:58:49,420 sections is that rather than having all the spline data 1148 00:58:49,420 --> 00:58:53,680 stored, sort of linearly, you can now reorder the blocks so 1149 00:58:53,680 --> 00:59:00,590 that the spline data from different channels is actually 1150 00:59:00,590 --> 00:59:01,780 tiled next to each other. 1151 00:59:01,780 --> 00:59:03,900 So that when you actually go to do a load it's much more 1152 00:59:03,900 --> 00:59:07,380 likely because you know you're going to be requesting all 1153 00:59:07,380 --> 00:59:11,030 these channels at once and all on the same time, t, you can 1154 00:59:11,030 --> 00:59:15,470 find a more or less, optimal ordering that will allow more 1155 00:59:15,470 --> 00:59:17,630 of these group things to be grouped in the same cache or 1156 00:59:17,630 --> 00:59:19,620 at least the same page. 1157 00:59:23,500 --> 00:59:26,840 And the advantage of course again, is now the length of 1158 00:59:26,840 --> 00:59:29,830 animation makes absolutely no difference at all. 1159 00:59:29,830 --> 00:59:31,790 The disadvantage is it's less compressible because you can 1160 00:59:31,790 --> 00:59:36,920 only basically compress this one section of the curve, but 1161 00:59:36,920 --> 00:59:40,820 a huge advantage is it solves the scalar loop problem. 1162 00:59:40,820 --> 00:59:46,240 So now you can take four of these scalar values all with a 1163 00:59:46,240 --> 00:59:50,560 fixed known number of knots in it and just loop through all 1164 00:59:50,560 --> 00:59:53,070 of the knots.
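[Editor's sketch] One way to read "loop through all of the knots" for four channels with a fixed 16-knot segment each is sketched below. Several things here are assumptions made for illustration: the names `pad_segment` and `evaluate_quad` are hypothetical, linear interpolation stands in for whatever spline basis the real system used, short segments are padded by repeating the last knot, and the inner loop over four lanes stands in for what would be one SIMD operation on an SPU.

```python
KNOTS_PER_SEGMENT = 16
LANES = 4  # four channels processed side by side, like the slots of one quadword


def pad_segment(seq):
    """Pad a short knot list to the fixed segment size by repeating the last entry."""
    return list(seq) + [seq[-1]] * (KNOTS_PER_SEGMENT - len(seq))


def evaluate_quad(quad_times, quad_values, t):
    """Evaluate four channels at time t, visiting every knot interval in every lane.

    There is no search and no early exit: each interval produces a candidate,
    and candidates from intervals that do not contain t are simply discarded.
    """
    results = [lane_values[0] for lane_values in quad_values]
    for i in range(KNOTS_PER_SEGMENT - 1):
        for lane in range(LANES):
            t0, t1 = quad_times[lane][i], quad_times[lane][i + 1]
            v0, v1 = quad_values[lane][i], quad_values[lane][i + 1]
            span = t1 - t0
            alpha = (t - t0) / span if span > 0.0 else 0.0  # padded knots have zero span
            candidate = v0 + alpha * (v1 - v0)
            # Keep the candidate only when t falls inside this interval.
            if t0 <= t <= t1:
                results[lane] = candidate
    return results


dense_times = [float(k) for k in range(KNOTS_PER_SEGMENT)]
dense_values = [2.0 * k for k in range(KNOTS_PER_SEGMENT)]
sparse_times = pad_segment([0.0, 10.0])
sparse_values = pad_segment([0.0, 100.0])

quad_result = evaluate_quad(
    [dense_times, sparse_times, dense_times, sparse_times],
    [dense_values, sparse_values, dense_values, sparse_values],
    3.5,
)
```

Because the knot count is fixed and known, the outer loop has a constant trip count and could be fully unrolled; the conditional keep at the end is the part that would become a select rather than a branch in vector code.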
1165 00:59:53,070 --> 00:59:55,240 In principle you could search through and find a minimum 1166 00:59:55,240 --> 00:59:57,190 number of knots to look through for each one of the 1167 00:59:57,190 --> 00:59:59,330 scalars, but in practice it's much faster just to loop 1168 00:59:59,330 --> 01:00:03,190 through all four simultaneously for all 16 1169 01:00:03,190 --> 01:00:07,150 knots and just throw away the results that are invalid as 1170 01:00:07,150 --> 01:00:08,590 you're going through it. 1171 01:00:08,590 --> 01:00:11,220 That way you can use the SPU instruction set. 1172 01:00:11,220 --> 01:00:14,620 You can load quadwords, store quadwords, and do everything 1173 01:00:14,620 --> 01:00:17,690 in the minimum single loop, which you 1174 01:00:17,690 --> 01:00:18,940 can completely unroll. 1175 01:00:23,560 --> 01:00:26,785 Does anybody have the time? 1176 01:00:26,785 --> 01:00:29,190 PROFESSOR: It's [OBSCURED] 1177 01:00:29,190 --> 01:00:30,190 MIKE ACTON: So I'm OK. 1178 01:00:30,190 --> 01:00:33,620 PROFESSOR: [OBSCURED] 1179 01:00:33,620 --> 01:00:34,427 MIKE ACTON: Yeah? 1180 01:00:34,427 --> 01:00:34,985 AUDIENCE: In 1181 01:00:34,985 --> 01:00:38,550 context do you make like rendering the animation or it 1182 01:00:38,550 --> 01:00:40,860 seems like there would be a blow to whatever you're doing 1183 01:00:40,860 --> 01:00:42,170 on the SPUs. 1184 01:00:42,170 --> 01:00:44,430 MIKE ACTON: Basically the SPUs are taking this channel 1185 01:00:44,430 --> 01:00:48,360 animation data and baking it out into-- well, in the 1186 01:00:48,360 --> 01:00:50,530 easiest case baking it out into a 4 by 1187 01:00:50,530 --> 01:00:54,090 4 matrix per joint. 1188 01:00:54,090 --> 01:00:56,010 AUDIENCE: So the output time's much bigger 1189 01:00:56,010 --> 01:00:56,810 than the input time? 1190 01:00:56,810 --> 01:00:59,570 I mean, you're compressing the input by animation? 
1191 01:00:59,570 --> 01:01:01,620 MIKE ACTON: No, the output size is significant, but it's 1192 01:01:01,620 --> 01:01:03,586 much smaller. 1193 01:01:03,586 --> 01:01:05,700 PROFESSOR: [OBSCURED] 1194 01:01:05,700 --> 01:01:08,160 AUDIENCE: So the animation data it's [OBSCURED] 1195 01:01:08,160 --> 01:01:09,410 [INTERPOSING VOICES] 1196 01:01:11,470 --> 01:01:12,460 MIKE ACTON: No. 1197 01:01:12,460 --> 01:01:14,700 I was just outputting the joint information. 1198 01:01:14,700 --> 01:01:19,820 PROFESSOR: [OBSCURED] 1199 01:01:19,820 --> 01:01:22,120 MIKE ACTON: Independently we have this skinning problem. 1200 01:01:22,120 --> 01:01:24,250 Independently there's a rendering problem. 1201 01:01:24,250 --> 01:01:26,250 This is just baking animation. 1202 01:01:26,250 --> 01:01:28,050 This is purely an animation channel problem. 1203 01:01:28,050 --> 01:01:33,060 PROFESSOR: [OBSCURED] 1204 01:01:33,060 --> 01:01:34,910 MIKE ACTON: OK, I'm just going to skip through this because 1205 01:01:34,910 --> 01:01:36,860 this could take a long time to talk about. 1206 01:01:36,860 --> 01:01:39,560 Basically, what I wanted to say here was let's take the 1207 01:01:39,560 --> 01:01:40,955 next step with animation, let's add 1208 01:01:40,955 --> 01:01:43,050 some dynamic support. 1209 01:01:43,050 --> 01:01:47,080 The easiest thing to do is just create a second uniform 1210 01:01:47,080 --> 01:01:50,380 data table that you then blend with the first 1211 01:01:50,380 --> 01:01:51,510 one and that one. 1212 01:01:51,510 --> 01:01:55,450 In principle, is basically all of the channels and then now a 1213 01:01:55,450 --> 01:01:56,440 game play programmer can go and 1214 01:01:56,440 --> 01:01:57,990 individually set any of those. 1215 01:01:57,990 --> 01:02:00,750 So they can tweak the head or tweak the elbow or whatever.
1216 01:02:00,750 --> 01:02:02,080 And that's definitely compressible because it's very 1217 01:02:02,080 --> 01:02:05,170 unlikely they're going to be moving all the joints at once. 1218 01:02:05,170 --> 01:02:08,050 You can create a secondary map that says, this is the number 1219 01:02:08,050 --> 01:02:11,450 of joints that are dynamic, this is how they map to the 1220 01:02:11,450 --> 01:02:12,700 uniform values. 1221 01:02:15,350 --> 01:02:18,897 But then once you add any kind of dynamic support, you have 1222 01:02:18,897 --> 01:02:21,760 now complicated the problem significantly. 1223 01:02:21,760 --> 01:02:25,690 Because now in reality, what you need are constraints. 1224 01:02:25,690 --> 01:02:29,650 You need to be able to have a limit to how high the head can 1225 01:02:29,650 --> 01:02:31,900 move because what's going to happen is although you could 1226 01:02:31,900 --> 01:02:35,850 just say the head could only move so much, if that movement 1227 01:02:35,850 --> 01:02:37,380 is algorithmic, so let's say follow a 1228 01:02:37,380 --> 01:02:39,570 character or whatever-- 1229 01:02:39,570 --> 01:02:41,760 it is going to go outside of reasonable 1230 01:02:41,760 --> 01:02:42,930 constraints really quickly. 1231 01:02:42,930 --> 01:02:48,360 So it's much cleaner and simpler to support that on the 1232 01:02:48,360 --> 01:02:50,250 engine side, so basically define constraints for the 1233 01:02:50,250 --> 01:02:57,250 joints and then let the high-level code point 1234 01:02:57,250 --> 01:02:58,060 wherever they want. 1235 01:02:58,060 --> 01:02:58,500 AUDIENCE: [OBSCURED] 1236 01:02:58,500 --> 01:03:00,330 MIKE ACTON: Yeah. 1237 01:03:00,330 --> 01:03:03,090 Yeah, you can have max change over time so it only can move 1238 01:03:03,090 --> 01:03:08,080 so fast.
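[Editor's sketch] The engine-side constraints mentioned above (a range of motion plus a cap on how fast a joint may change) could be sketched like this. The function name `constrain_joint` and all parameter names are hypothetical, angles are assumed to be in normalized turns, and the sketch assumes the engine still has the previous frame's value for the joint available.

```python
def constrain_joint(requested, previous, min_angle, max_angle, max_step):
    """Clamp a gameplay-requested joint value to engine-side limits.

    requested is the value high-level code asked for, previous is the value
    the joint had last frame, and max_step is the largest allowed change per frame.
    """
    # Limit how far the joint may move in a single frame (max change over time).
    delta = max(-max_step, min(max_step, requested - previous))
    # Then keep the result inside the joint's range of motion.
    return max(min_angle, min(max_angle, previous + delta))


# Gameplay code asks the head to snap far past its limit; the engine eases it there.
head = 0.0
for _ in range(10):
    head = constrain_joint(2.0, head, -0.5, 0.5, 0.1)
```

With this split, high-level code can point the joint wherever it likes and the limits live in one place on the engine side.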
The max range of motion, the max acceleration 1239 01:03:08,080 --> 01:03:11,770 is actually a much harder problem because it implies 1240 01:03:11,770 --> 01:03:15,780 that you need to store the change over time, which we're 1241 01:03:15,780 --> 01:03:17,360 not actually storing. 1242 01:03:17,360 --> 01:03:20,440 Which would probably blow our memory on the SPU. 1243 01:03:20,440 --> 01:03:25,990 So as far as impacting animation, I would immediately 1244 01:03:25,990 --> 01:03:28,880 throw out max acceleration if an animator were to come to me 1245 01:03:28,880 --> 01:03:32,020 and say, this is a feature that I wanted. 1246 01:03:32,020 --> 01:03:35,060 I would say, it's unlikely because it's unlikely we can 1247 01:03:35,060 --> 01:03:36,880 fit it on the SPU. 1248 01:03:36,880 --> 01:03:41,400 Whereas, on the PC, it might be a different story. 1249 01:03:41,400 --> 01:03:42,800 And blending information, how you blend 1250 01:03:42,800 --> 01:03:44,050 these things together. 1251 01:03:52,180 --> 01:03:52,700 What's that? 1252 01:03:52,700 --> 01:03:56,040 AUDIENCE: [OBSCURED] 1253 01:03:56,040 --> 01:03:57,290 MIKE ACTON: OK. 1254 01:03:59,710 --> 01:04:03,840 So as far as mixing, there's plenty of additional problems 1255 01:04:03,840 --> 01:04:05,550 in mixing animation. 1256 01:04:05,550 --> 01:04:08,270 Phase matching, so for example, you have a running 1257 01:04:08,270 --> 01:04:09,800 and a walk. 1258 01:04:09,800 --> 01:04:12,355 Basically all that means is if you were going to blend from a 1259 01:04:12,355 --> 01:04:15,070 run to a walk you kind of want to blend in basically the 1260 01:04:15,070 --> 01:04:17,490 essentially same leg position. 
1261 01:04:17,490 --> 01:04:19,330 Because if you just blend from the middle of an animation to 1262 01:04:19,330 --> 01:04:21,760 the beginning of the animation, it's unlikely the 1263 01:04:21,760 --> 01:04:23,820 legs are going to match and for the transition time you're 1264 01:04:23,820 --> 01:04:25,940 going to see the scissoring of the legs. 1265 01:04:25,940 --> 01:04:29,700 Which you see that in plenty of games, but especially in 1266 01:04:29,700 --> 01:04:32,200 next generation, especially as characters look more 1267 01:04:32,200 --> 01:04:37,100 complicated they are expected to act more complicated. 1268 01:04:37,100 --> 01:04:42,390 Transitions handling either programmatic transitions 1269 01:04:42,390 --> 01:04:46,220 between animations, so we have an animation that's standing 1270 01:04:46,220 --> 01:04:49,165 and animation that's crouching and with constraints, move 1271 01:04:49,165 --> 01:04:52,790 them down; or artist driven animated 1272 01:04:52,790 --> 01:04:56,540 transitions and/or both. 1273 01:04:56,540 --> 01:04:57,735 Translation matching is actually 1274 01:04:57,735 --> 01:04:58,820 an interesting problem. 1275 01:04:58,820 --> 01:05:01,290 So you have an animation that's running and you have an 1276 01:05:01,290 --> 01:05:02,180 animation that's walking. 1277 01:05:02,180 --> 01:05:04,170 They both translate obviously, at different speeds, 1278 01:05:04,170 --> 01:05:10,080 nonlinearly and you want to slowly run down into a walk, 1279 01:05:10,080 --> 01:05:13,940 but you have to match these sort of nonlinear translations 1280 01:05:13,940 --> 01:05:16,520 as his feet are stepping onto the ground. 
1281 01:05:16,520 --> 01:05:18,690 Turns out to be a really difficult problem to get 1282 01:05:18,690 --> 01:05:22,520 perfectly right, especially if you have IK on the feet 1283 01:05:22,520 --> 01:05:24,980 where he's walking on the ground or maybe walking uphill 1284 01:05:24,980 --> 01:05:28,010 or downhill and the translation is being affected 1285 01:05:28,010 --> 01:05:30,170 by the world. 1286 01:05:30,170 --> 01:05:33,760 In a lot of cases you'll see people pretty much just ignore 1287 01:05:33,760 --> 01:05:36,970 this problem. 1288 01:05:36,970 --> 01:05:38,950 But it is something to consider going forward and 1289 01:05:38,950 --> 01:05:41,860 this is something that we would consider how to solve, 1290 01:05:41,860 --> 01:05:46,970 regardless of whether or not we could get it in. 1291 01:05:46,970 --> 01:05:51,920 As far as actually rendering the geometry goes, you now 1292 01:05:51,920 --> 01:05:55,920 have your sort of matrices of joints and you have-- 1293 01:05:59,510 --> 01:06:03,070 let's say you want to send those to the GPU along with 1294 01:06:03,070 --> 01:06:06,350 the geometry to skin and render. 1295 01:06:06,350 --> 01:06:08,230 Now the question is, do you single or double 1296 01:06:08,230 --> 01:06:10,280 buffer those joints? 1297 01:06:10,280 --> 01:06:14,480 Because right now basically, the GPU can be reading these 1298 01:06:14,480 --> 01:06:16,320 joints in parallel to when you're 1299 01:06:16,320 --> 01:06:17,890 actually outputting them. 1300 01:06:17,890 --> 01:06:20,495 So the traditional approach or the easiest approach is just 1301 01:06:20,495 --> 01:06:21,850 to double buffer the joints. 1302 01:06:21,850 --> 01:06:23,880 So just output into a different buffer than the RSX 1303 01:06:23,880 --> 01:06:24,650 is reading from. 1304 01:06:24,650 --> 01:06:29,070 It's one frame or half a frame behind, doesn't much matter. 1305 01:06:29,070 --> 01:06:33,080 But it also doubles now the space of your joints.
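The double-buffering approach just described can be sketched like this. It is a simplified model under stated assumptions: the class name `DoubleBufferedJoints` and the flat 3x4 matrix layout (12 floats per joint) are hypothetical, and plain Python lists stand in for the memory the SPUs would write and the GPU would read.

```python
class DoubleBufferedJoints:
    """Two joint-matrix buffers: one written by the animation update,
    the other read by the renderer, swapped once per frame."""

    def __init__(self, joint_count):
        # Assumed layout: each joint is a flat 3x4 matrix (12 floats).
        identity = [1.0, 0.0, 0.0, 0.0,
                    0.0, 1.0, 0.0, 0.0,
                    0.0, 0.0, 1.0, 0.0]
        # Two full copies of the joint data: this is the doubled memory cost.
        self.buffers = [[list(identity) for _ in range(joint_count)]
                        for _ in range(2)]
        self.write_index = 0

    @property
    def write_buffer(self):
        """Buffer the animation update writes into this frame."""
        return self.buffers[self.write_index]

    @property
    def read_buffer(self):
        """Buffer the renderer reads from; holds the previous frame's output."""
        return self.buffers[1 - self.write_index]

    def flip(self):
        """Swap roles at the frame boundary."""
        self.write_index = 1 - self.write_index

    def float_count(self):
        """Total floats stored across both buffers."""
        return sum(len(matrix) for buf in self.buffers for matrix in buf)


joints = DoubleBufferedJoints(joint_count=4)
joints.write_buffer[0][3] = 5.0   # update writes a new x translation for joint 0
stale = joints.read_buffer[0][3]  # the reader still sees last frame's value
joints.flip()                     # after the flip the reader sees the new value
```

The writer and reader never touch the same buffer within a frame, so no lock is needed, at the price of the reader being a frame behind and the joint storage being twice as large.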
1306 01:06:33,080 --> 01:06:36,740 One advantage that games have is that a frame is a well 1307 01:06:36,740 --> 01:06:39,900 defined element in the games. 1308 01:06:39,900 --> 01:06:43,810 We know what needs to happen across the course of a frame. 1309 01:06:43,810 --> 01:06:48,735 So these characters need to be rendered, the collisions of 1310 01:06:48,735 --> 01:06:49,690 this background needs to happen, physics 1311 01:06:49,690 --> 01:06:51,480 need to happen here. 1312 01:06:51,480 --> 01:06:56,500 So you can within a frame, set it up so that the update from 1313 01:06:56,500 --> 01:07:03,350 the SPUs and the read from the GPU can never overlap. 1314 01:07:03,350 --> 01:07:07,970 Even without any kind of synchronization or lock, it 1315 01:07:07,970 --> 01:07:10,450 can be a well known fact that it's impossible for these two 1316 01:07:10,450 --> 01:07:12,230 things because there's actually something in the 1317 01:07:12,230 --> 01:07:15,770 middle happening that has its own synchronization primitive. 1318 01:07:18,370 --> 01:07:21,370 That will allow you to do single buffering of the data. 1319 01:07:21,370 --> 01:07:23,220 But it does require more organization. 1320 01:07:23,220 --> 01:07:26,190 Especially if you're doing it on more than just one case. 1321 01:07:26,190 --> 01:07:28,520 So you have all these things that you want single buffered, 1322 01:07:28,520 --> 01:07:31,360 so you need to organize them within the frames so they're 1323 01:07:31,360 --> 01:07:33,520 never updating and reading at the same time. 1324 01:07:38,910 --> 01:07:42,850 So I'll make this the last point I'll make. 1325 01:07:42,850 --> 01:07:46,640 Optimization, one of the things that you'll hear, save 1326 01:07:46,640 --> 01:07:47,980 optimization till the end. 
1327 01:07:47,980 --> 01:07:50,490 My point here being is if you save optimization till the 1328 01:07:50,490 --> 01:07:52,800 end, you don't know how to do it because you haven't 1329 01:07:52,800 --> 01:07:53,970 actually practiced it. 1330 01:07:53,970 --> 01:07:57,750 If you haven't practiced it you don't know what to do. 1331 01:07:57,750 --> 01:07:59,100 So it will take much longer. 1332 01:07:59,100 --> 01:08:01,090 You should always be optimizing in order to 1333 01:08:01,090 --> 01:08:05,500 understand, when it actually counts, what to do. 1334 01:08:05,500 --> 01:08:08,655 And the fact that real optimization does impact the 1335 01:08:08,655 --> 01:08:10,420 design all the way up. 1336 01:08:10,420 --> 01:08:13,400 Optimization of the hardware impacts how an engine is 1337 01:08:13,400 --> 01:08:17,550 designed to be fast does impact the data, it impacts 1338 01:08:17,550 --> 01:08:21,263 how game play needs to be written, high-level code needs 1339 01:08:21,263 --> 01:08:23,400 to be called. 1340 01:08:23,400 --> 01:08:26,060 So if you save optimization till last, what you're doing 1341 01:08:26,060 --> 01:08:30,310 is completely limiting what you can optimize. 1342 01:08:30,310 --> 01:08:33,010 And the idea that it's the root of all evil certainly 1343 01:08:33,010 --> 01:08:36,580 didn't come from a game developer, I have to say. 1344 01:08:36,580 --> 01:08:37,940 Anyway, that's it. 1345 01:08:37,940 --> 01:08:39,690 I hope that was helpful. 1346 01:08:44,800 --> 01:08:46,700 PROFESSOR: Any questions? 1347 01:08:46,700 --> 01:08:49,005 I think it's very interesting because there is a lot of 1348 01:08:49,005 --> 01:08:50,255 things you learn at MIT. 1349 01:08:57,950 --> 01:09:00,530 Forget everything you learned so I think there's a very 1350 01:09:00,530 --> 01:09:03,440 interesting perspective in there and for some of us it's 1351 01:09:03,440 --> 01:09:06,990 kind of hard to even digest a little bit, but Question? 
1352 01:09:06,990 --> 01:09:09,540 AUDIENCE: Call of Duty 3 came out on the Xbox and on the 1353 01:09:09,540 --> 01:09:15,180 PS3, is Call of Duty 3 on the PS3 just running on the GPU 1354 01:09:15,180 --> 01:09:16,580 then or is it-- 1355 01:09:16,580 --> 01:09:18,420 MIKE ACTON: No, it's very likely using the SPUs. 1356 01:09:18,420 --> 01:09:19,050 I mean, I don't know. 1357 01:09:19,050 --> 01:09:20,950 I haven't looked at the source code, but I suspect that it's 1358 01:09:20,950 --> 01:09:22,550 using the SPUs. 1359 01:09:22,550 --> 01:09:24,200 How efficiently it's using them is an 1360 01:09:24,200 --> 01:09:26,150 entirely different question. 1361 01:09:26,150 --> 01:09:31,280 But it's easy to take the most trivial things right, say you 1362 01:09:31,280 --> 01:09:37,140 do hot spot analysis on your sequential code and say, OK, 1363 01:09:37,140 --> 01:09:39,120 well I can grab this section of thing and put it on the SPU 1364 01:09:39,120 --> 01:09:41,970 right and just the heaviest hitters and 1365 01:09:41,970 --> 01:09:42,910 put them on the SPU. 1366 01:09:42,910 --> 01:09:44,590 That's pretty easy to do. 1367 01:09:44,590 --> 01:09:47,230 It's taking it to the next level though, and to really 1368 01:09:47,230 --> 01:09:51,820 have sort of the next gen of game-- 1369 01:09:51,820 --> 01:09:54,260 now there's nowhere to go from there. 1370 01:09:54,260 --> 01:09:56,250 There's nowhere to go from that analysis, you've already 1371 01:09:56,250 --> 01:09:58,460 sort of hit the limit of what you can do with that. 1372 01:09:58,460 --> 01:10:00,970 It has to be redesigned. 1373 01:10:00,970 --> 01:10:03,970 So I don't know what they're doing honestly and certainly 1374 01:10:03,970 --> 01:10:05,290 I'm being recorded so-- 1375 01:10:08,470 --> 01:10:09,940 yeah? 1376 01:10:09,940 --> 01:10:14,390 AUDIENCE: You guys have shipped a game, on the PS3? 1377 01:10:14,390 --> 01:10:16,080 MIKE ACTON: Yeah, it was a launch title.
1378 01:10:16,080 --> 01:10:18,480 AUDIENCE: OK, so that was more like the [OBSCURED] 1379 01:10:18,480 --> 01:10:23,890 games and whatever. You seem to talk a lot about all these 1380 01:10:23,890 --> 01:10:24,586 things you've had to redo. 1381 01:10:24,586 --> 01:10:26,940 What else is there-- 1382 01:10:26,940 --> 01:10:30,220 games look better as a console was built on, what else is 1383 01:10:30,220 --> 01:10:34,300 there that you guys plan on changing as far as working 1384 01:10:34,300 --> 01:10:36,530 with the cell processor, or do you think 1385 01:10:36,530 --> 01:10:37,950 you've got it all ready? 1386 01:10:37,950 --> 01:10:38,450 MIKE ACTON: Oh, no. 1387 01:10:38,450 --> 01:10:40,110 There's plenty of work. 1388 01:10:40,110 --> 01:10:42,180 There's plenty more to be optimized. 1389 01:10:42,180 --> 01:10:45,330 It's down to cost in scheduling those things. 1390 01:10:45,330 --> 01:10:47,170 I mean, we have a team of people who now really 1391 01:10:47,170 --> 01:10:52,420 understand the platform and whereas a lot of what went 1392 01:10:52,420 --> 01:10:58,120 into previous titles was mixed with learning curve. 1393 01:10:58,120 --> 01:11:01,053 So there's definitely a potential for going back and 1394 01:11:01,053 --> 01:11:03,990 improving things and making things better. 1395 01:11:03,990 --> 01:11:07,060 That's what a cycle of game development is all about. 1396 01:11:07,060 --> 01:11:09,580 I mean, games at the end of the lifetime of PlayStation 3 1397 01:11:09,580 --> 01:11:11,970 will look significantly better than release titles. 1398 01:11:11,970 --> 01:11:13,710 That's the way it always is. 1399 01:11:13,710 --> 01:11:17,120 AUDIENCE: The head of Sony computer and gaming said that 1400 01:11:17,120 --> 01:11:18,970 PS3 pretty soon would be customizable. 1401 01:11:18,970 --> 01:11:20,460 You'll be able to get different 1402 01:11:20,460 --> 01:11:21,800 amounts of RAM and whatnot.
1403 01:11:21,800 --> 01:11:24,330 MIKE ACTON: Well, I think in that case he was talking 1404 01:11:24,330 --> 01:11:30,320 specifically about a PS3 based, like Tivo kind of weird 1405 01:11:30,320 --> 01:11:34,318 media thing, which has nothing to do with us. 1406 01:11:34,318 --> 01:11:36,730 AUDIENCE: [OBSCURED] 1407 01:11:36,730 --> 01:11:37,510 MIKE ACTON: We're not stuck. 1408 01:11:37,510 --> 01:11:40,250 That's what we have. I mean, I don't see it as stuck. 1409 01:11:40,250 --> 01:11:41,810 I would much rather have the-- 1410 01:11:41,810 --> 01:11:45,150 I mean, that's what console development is about, really. 1411 01:11:45,150 --> 01:11:48,530 We have a machine, we have a set of limitations of it and 1412 01:11:48,530 --> 01:11:51,080 we can push that machine over the lifetime of the platform. 1413 01:11:51,080 --> 01:11:53,830 If it changes out from under us, it becomes PC development. 1414 01:11:53,830 --> 01:11:54,820 AUDIENCE: Are you allowed to use the 1415 01:11:54,820 --> 01:11:57,380 seven SPUs or are you-- 1416 01:11:57,380 --> 01:11:57,690 [OBSCURED] 1417 01:11:57,690 --> 01:12:09,020 PROFESSOR: [OBSCURED] 1418 01:12:09,020 --> 01:12:13,220 MIKE ACTON: I don't know how much I can answer this just 1419 01:12:13,220 --> 01:12:14,030 from NDA point-of-view. 1420 01:12:14,030 --> 01:12:17,180 But let's say hypothetically, there magically became more 1421 01:12:17,180 --> 01:12:19,640 SPUs on the PS3, right? 1422 01:12:19,640 --> 01:12:20,920 Probably nothing would happen. 1423 01:12:20,920 --> 01:12:24,620 The game has to be optimized for the minimum case, so 1424 01:12:24,620 --> 01:12:25,870 nothing would change. 1425 01:12:29,132 --> 01:12:31,750 Anything else? 1426 01:12:31,750 --> 01:12:32,640 Yeah? 1427 01:12:32,640 --> 01:12:35,130 AUDIENCE: So what's the development life cycle like 1428 01:12:35,130 --> 01:12:37,010 for the engine part of the game. 
1429 01:12:37,010 --> 01:12:42,006 And I don't assume you start by prototyping in higher-level 1430 01:12:42,006 --> 01:12:45,910 mechanisms. Then you'll completely miss the design for 1431 01:12:45,910 --> 01:12:47,500 performance aspects of it. 1432 01:12:47,500 --> 01:12:54,140 How do you build up from empty [OBSCURED] 1433 01:12:54,140 --> 01:12:55,610 MIKE ACTON: No, you don't start with an empty. 1434 01:12:55,610 --> 01:12:57,300 That's the perspective difference. 1435 01:12:57,300 --> 01:12:59,620 You don't start with code, code's not important. 1436 01:12:59,620 --> 01:13:00,700 Start with the data. 1437 01:13:00,700 --> 01:13:02,130 You sit down with an artist and they say, what 1438 01:13:02,130 --> 01:13:04,150 do you want to do? 1439 01:13:04,150 --> 01:13:05,140 And then you look at the data. 1440 01:13:05,140 --> 01:13:06,450 What does that data look like? 1441 01:13:06,450 --> 01:13:07,530 What does this animation data look like? 1442 01:13:07,530 --> 01:13:09,730 PROFESSOR: Data size matters. 1443 01:13:09,730 --> 01:13:12,960 [OBSCURED] 1444 01:13:12,960 --> 01:13:13,080 MIKE ACTON: Right. 1445 01:13:13,080 --> 01:13:15,080 We have to figure out how to make it smaller. 1446 01:13:15,080 --> 01:13:16,640 But it all starts with the data. 1447 01:13:16,640 --> 01:13:19,940 It all starts with that concept of what do we want to 1448 01:13:19,940 --> 01:13:20,330 see on the screen? 1449 01:13:20,330 --> 01:13:21,820 What do we even want to hear on the speakers? 1450 01:13:21,820 --> 01:13:23,580 What kind of effects do we want. 1451 01:13:23,580 --> 01:13:28,430 And actually look at that from the perspective of the content 1452 01:13:28,430 --> 01:13:31,490 creator and what they're generating and what we can do 1453 01:13:31,490 --> 01:13:32,550 with that data. 1454 01:13:32,550 --> 01:13:36,440 Because game development is just this black box between 1455 01:13:36,440 --> 01:13:39,150 the artists and the screen. 
1456 01:13:39,150 --> 01:13:41,490 We're providing a transformation engine that 1457 01:13:41,490 --> 01:13:44,570 takes the vision of the designers and the artists and 1458 01:13:44,570 --> 01:13:48,810 just transforming it and spitting it on to screen. 1459 01:13:48,810 --> 01:13:51,340 So where you really need to start is with the 1460 01:13:51,340 --> 01:13:53,380 source of the data. 1461 01:13:53,380 --> 01:13:55,040 AUDIENCE: You've been doing game development for 11 years 1462 01:13:55,040 --> 01:13:56,330 now is what you said. 1463 01:13:56,330 --> 01:13:57,660 Have you had a favorite platform 1464 01:13:57,660 --> 01:14:00,050 and a nightmare platform? 1465 01:14:00,050 --> 01:14:01,740 MIKE ACTON: I've been pretty much with the PlayStation 1466 01:14:01,740 --> 01:14:07,920 platform since there was a PlayStation platform and I 1467 01:14:07,920 --> 01:14:09,660 don't know, it's hard to get perspective because you're 1468 01:14:09,660 --> 01:14:12,610 getting it and you always really love the platform you're 1469 01:14:12,610 --> 01:14:15,030 working on. 1470 01:14:15,030 --> 01:14:17,800 So it's hard. 1471 01:14:17,800 --> 01:14:20,630 I mean, it's hard to get perspective. 1472 01:14:20,630 --> 01:14:22,357 And the programmer I am today is not the same programmer 1473 01:14:22,357 --> 01:14:25,640 I was 10 years ago. 1474 01:14:25,640 --> 01:14:27,745 Personally, right now my favorite platform is PS3. 1475 01:14:27,745 --> 01:14:30,770 PROFESSOR: So when put it in perspective, there are already 1476 01:14:30,770 --> 01:14:37,190 some platforms that on the first time round, [COUGHING], 1477 01:14:37,190 --> 01:14:39,360 it's like cost of development.
1478 01:14:39,360 --> 01:14:44,990 So one platform that as time goes we have [OBSCURED] 1479 01:14:44,990 --> 01:14:47,380 MIKE ACTON: Well, like with the PS3, some of the things 1480 01:14:47,380 --> 01:14:48,920 that I like about the PS3, which is sort of a different 1481 01:14:48,920 --> 01:14:53,450 question are the fact that the cell is much more public than 1482 01:14:53,450 --> 01:14:55,790 any other platform has ever been. 1483 01:14:55,790 --> 01:14:58,990 With IBM's documentation, with Toshiba's support and Sony 1484 01:14:58,990 --> 01:15:05,200 support, I've never had a platform where I can get up on 1485 01:15:05,200 --> 01:15:09,210 a website and actually talk about it outside of NDA. 1486 01:15:09,210 --> 01:15:10,980 And that for me is an amazing change. 1487 01:15:10,980 --> 01:15:13,710 Where I can go and talk to other people-- exactly like 1488 01:15:13,710 --> 01:15:15,610 this group here- that have used the same exact 1489 01:15:15,610 --> 01:15:17,530 platform I've used. 1490 01:15:17,530 --> 01:15:21,280 That's never been able to happen before. 1491 01:15:21,280 --> 01:15:24,380 Even on PS2, for quite a long part of the lifespan, even 1492 01:15:24,380 --> 01:15:27,370 though there was a Linux eventually on the PS2, 1493 01:15:27,370 --> 01:15:29,210 virtually everything was covered by NDA because there 1494 01:15:29,210 --> 01:15:32,830 was no independent release of information. 1495 01:15:32,830 --> 01:15:38,270 So that's one of the great things about PS3 is the public 1496 01:15:38,270 --> 01:15:39,520 availability of cell. 1497 01:15:43,510 --> 01:15:46,304 PROFESSOR: Thank you very much for coming all the way from 1498 01:15:46,304 --> 01:15:48,230 California and giving us some insight.