1 00:00:01,680 --> 00:00:04,080 The following content is provided under a Creative 2 00:00:04,080 --> 00:00:05,620 Commons license. 3 00:00:05,620 --> 00:00:07,920 Your support will help MIT OpenCourseWare 4 00:00:07,920 --> 00:00:12,280 continue to offer high quality educational resources for free. 5 00:00:12,280 --> 00:00:14,910 To make a donation or view additional materials 6 00:00:14,910 --> 00:00:18,870 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:18,870 --> 00:00:20,010 at ocw.mit.edu. 8 00:00:22,095 --> 00:00:24,720 WINRICH FREIWALD: So my talk is going to be mostly about faces. 9 00:00:24,720 --> 00:00:25,830 And in many ways I'm going to connect 10 00:00:25,830 --> 00:00:27,900 to what Jim DiCarlo was talking about today 11 00:00:27,900 --> 00:00:30,246 and what Nancy talked about today. 12 00:00:30,246 --> 00:00:31,620 I just thought preparing for this 13 00:00:31,620 --> 00:00:34,203 that I should say a few things about primates and intelligence 14 00:00:34,203 --> 00:00:37,170 and how face recognition might be connected 15 00:00:37,170 --> 00:00:39,960 both to the species we are studying 16 00:00:39,960 --> 00:00:42,614 and to the overall question of intelligence. 17 00:00:42,614 --> 00:00:44,280 And I thought the most appropriate thing 18 00:00:44,280 --> 00:00:45,780 to do this at MBL would be to start 19 00:00:45,780 --> 00:00:47,482 with this kind of creature. 20 00:00:47,482 --> 00:00:49,440 So you might have seen in the paper Nature that 21 00:00:49,440 --> 00:00:51,210 came out last week-- 22 00:00:51,210 --> 00:00:53,610 that the genome of the octopus was sequenced. 23 00:00:53,610 --> 00:00:55,980 It was really heralded in the public press 24 00:00:55,980 --> 00:00:58,260 that it's finally proving their intelligence. 25 00:00:58,260 --> 00:01:01,320 The argument is that there are 33,000 genes, 26 00:01:01,320 --> 00:01:02,957 10,000 more than humans. 
27 00:01:02,957 --> 00:01:04,790 That's of course not a very strong argument. 28 00:01:04,790 --> 00:01:06,870 There are plants with 45,000 genes. 29 00:01:06,870 --> 00:01:09,480 So that really doesn't tell you very much about intelligence. 30 00:01:09,480 --> 00:01:11,104 But amongst those genes that were found 31 00:01:11,104 --> 00:01:14,130 were lots of genes that are important for development 32 00:01:14,130 --> 00:01:14,970 of the brain. 33 00:01:14,970 --> 00:01:17,030 And there is very high heterogeneity 34 00:01:17,030 --> 00:01:20,150 in certain gene families that are controlling the development 35 00:01:20,150 --> 00:01:21,180 of the nervous system. 36 00:01:21,180 --> 00:01:24,030 So it's one example of how we are getting a better understanding 37 00:01:24,030 --> 00:01:27,404 of the intelligence of other creatures in neural terms, 38 00:01:27,404 --> 00:01:29,820 even in creatures like the octopus, which is very difficult 39 00:01:29,820 --> 00:01:30,437 to study. 40 00:01:30,437 --> 00:01:32,520 The point I would like to stress about the octopus 41 00:01:32,520 --> 00:01:35,692 is there's really nothing social about the species. 42 00:01:35,692 --> 00:01:37,650 Actually almost simultaneously with this paper, 43 00:01:37,650 --> 00:01:40,440 there was one report about sexual reproduction 44 00:01:40,440 --> 00:01:43,260 in one particular species of octopus where it's not clear 45 00:01:43,260 --> 00:01:45,600 if it's violence or more affection. 46 00:01:45,600 --> 00:01:47,520 But this is really the one exception 47 00:01:47,520 --> 00:01:50,830 to otherwise a life that's pretty much not social at all. 48 00:01:50,830 --> 00:01:53,880 So in the octopus you have the egg stage and then 49 00:01:53,880 --> 00:01:56,280 these larvae which are hatching very early on. 50 00:01:56,280 --> 00:01:59,530 There's no interaction with mom or anything like this.
51 00:01:59,530 --> 00:02:01,000 They go to the surface of the ocean 52 00:02:01,000 --> 00:02:03,570 and they start feeding and try to grow as fast as they can 53 00:02:03,570 --> 00:02:06,570 because only a few of them are going to survive. 54 00:02:06,570 --> 00:02:09,139 Then reproduction is a very scary enterprise. 55 00:02:09,139 --> 00:02:12,330 In many octopus species, the male has to be 56 00:02:12,330 --> 00:02:15,960 careful not to be eaten by the female in the process. 57 00:02:15,960 --> 00:02:18,954 When it's successful, then actually the male stops eating. 58 00:02:18,954 --> 00:02:21,120 He's going to die even before most of the youngsters 59 00:02:21,120 --> 00:02:22,140 are going to hatch. 60 00:02:22,140 --> 00:02:26,250 And all that mom is going to do is basically to make sure 61 00:02:26,250 --> 00:02:29,550 that fresh water is going to be delivered to the eggs. 62 00:02:29,550 --> 00:02:31,710 And that's it in terms of social life. 63 00:02:31,710 --> 00:02:37,230 So you can be very intelligent like an octopus and, who knows, 64 00:02:37,230 --> 00:02:39,310 Gabriel just mentioned extraterrestrial life. 65 00:02:39,310 --> 00:02:42,259 Maybe other species out there might also not be that social. 66 00:02:42,259 --> 00:02:43,800 And so there doesn't necessarily have 67 00:02:43,800 --> 00:02:46,920 to be a connection between sociality and intelligence. 68 00:02:46,920 --> 00:02:49,816 Second thing is you can be very social, 69 00:02:49,816 --> 00:02:51,690 get a warm fuzzy feeling of being with others 70 00:02:51,690 --> 00:02:53,730 and you get lots of protection being in the group, 71 00:02:53,730 --> 00:02:55,020 but following your instincts in this way 72 00:02:55,020 --> 00:02:56,970 doesn't necessarily make you very smart. 73 00:02:56,970 --> 00:02:59,490 So I don't want to argue against the connection 74 00:02:59,490 --> 00:03:01,470 between sociality and intelligence 75 00:03:01,470 --> 00:03:03,500 in the form of social intelligence.
76 00:03:03,500 --> 00:03:06,030 But we have to be careful there's no necessary connection 77 00:03:06,030 --> 00:03:07,470 between those. 78 00:03:07,470 --> 00:03:09,880 However for primates there is this idea, 79 00:03:09,880 --> 00:03:11,870 this social intelligence hypothesis 80 00:03:11,870 --> 00:03:14,210 that really what made primates so intelligent 81 00:03:14,210 --> 00:03:15,234 is their sociality. 82 00:03:15,234 --> 00:03:17,650 And so let's consider a little bit what the arguments are. 83 00:03:17,650 --> 00:03:21,480 It was most strongly put forward by Nick Humphrey in '76, 84 00:03:21,480 --> 00:03:24,327 but there are similar precursors to this idea. 85 00:03:24,327 --> 00:03:25,410 So who are these primates? 86 00:03:25,410 --> 00:03:28,260 The primates are a small group of mammals 87 00:03:28,260 --> 00:03:30,525 with about 400 plus species. 88 00:03:30,525 --> 00:03:31,400 They're very diverse. 89 00:03:31,400 --> 00:03:33,060 You can have very small primates, 90 00:03:33,060 --> 00:03:36,690 just 30 grams, all the way to 200 kilogram animals. 91 00:03:36,690 --> 00:03:41,110 They evolved starting 65, 85 million years ago, 92 00:03:41,110 --> 00:03:42,600 there was this mass extinction. 93 00:03:42,600 --> 00:03:45,360 And so you have this high diversity within the mammals 94 00:03:45,360 --> 00:03:47,170 starting from this point on. 95 00:03:47,170 --> 00:03:49,570 So they have certain things in common with other mammals, 96 00:03:49,570 --> 00:03:52,080 but they're also very special in many ways. 97 00:03:52,080 --> 00:03:54,410 All of the primate species are social. 98 00:03:54,410 --> 00:03:56,940 They're not [INAUDIBLE] social, but they are social. 99 00:03:56,940 --> 00:03:58,330 They develop very slowly. 100 00:03:58,330 --> 00:04:01,280 So it's really very different from the octopus. 101 00:04:01,280 --> 00:04:03,780 A lot of investment is made into the offspring. There are not 102 00:04:03,780 --> 00:04:05,819 very many offspring.
103 00:04:05,819 --> 00:04:07,860 And the lifespan of these animals is pretty long. 104 00:04:07,860 --> 00:04:10,890 The octopus life span is three to five years. 105 00:04:10,890 --> 00:04:12,880 They're very visual, unlike other mammals, 106 00:04:12,880 --> 00:04:15,190 which are very olfactory oriented. 107 00:04:15,190 --> 00:04:17,490 They have binocular vision, many have color vision, 108 00:04:17,490 --> 00:04:19,980 so vision is very important for primates. 109 00:04:19,980 --> 00:04:23,820 And they have on average larger brains than other mammals have. 110 00:04:23,820 --> 00:04:26,010 So our understanding of the anatomy of primates 111 00:04:26,010 --> 00:04:28,485 and what makes them special is actually very rudimentary. 112 00:04:28,485 --> 00:04:31,886 But there are a few points that should be of interest. 113 00:04:31,886 --> 00:04:33,510 So if you look at the mammalian brains, 114 00:04:33,510 --> 00:04:35,130 obviously there are very, very complex mammalian brains. 115 00:04:35,130 --> 00:04:36,720 They're not primate [INAUDIBLE] ones. 116 00:04:36,720 --> 00:04:38,910 And the main factor really is body mass. 117 00:04:38,910 --> 00:04:41,650 So the bigger an animal is, this is 118 00:04:41,650 --> 00:04:46,780 elementary of body weight versus brain weight. 119 00:04:46,780 --> 00:04:48,900 And you can see there's a log-log relationship 120 00:04:48,900 --> 00:04:49,980 between the two. 121 00:04:49,980 --> 00:04:52,560 That's something that you find all over the animal kingdom. 122 00:04:52,560 --> 00:04:56,910 But if you compare primates to other non-primate mammals 123 00:04:56,910 --> 00:04:58,980 you can see that there is a larger increase 124 00:04:58,980 --> 00:05:00,532 of brain mass with body mass.
125 00:05:00,532 --> 00:05:02,990 And if you count the number of neurons, the number of brain 126 00:05:02,990 --> 00:05:05,390 neurons, it is increasing more steeply with body mass 127 00:05:05,390 --> 00:05:07,109 than it is in other mammals. 128 00:05:07,109 --> 00:05:08,900 This is obviously a very crude measurement. 129 00:05:08,900 --> 00:05:10,320 There are others. 130 00:05:10,320 --> 00:05:12,350 If you look for brains roughly the same weight, 131 00:05:12,350 --> 00:05:14,610 you can again see like in primates, you 132 00:05:14,610 --> 00:05:17,030 have many more neurons than in non-primates 133 00:05:17,030 --> 00:05:18,849 just by these two examples here. 134 00:05:18,849 --> 00:05:21,390 So there seems to be something different in the organization. 135 00:05:21,390 --> 00:05:23,098 There are other measures you can look at. 136 00:05:23,098 --> 00:05:26,450 So for example, the neuron size is increasing 137 00:05:26,450 --> 00:05:27,740 with brain size in rodents. 138 00:05:27,740 --> 00:05:29,870 So a larger brain in rodents is not 139 00:05:29,870 --> 00:05:32,000 necessarily one that has many more neurons, 140 00:05:32,000 --> 00:05:33,750 but these neurons are just getting bigger. 141 00:05:33,750 --> 00:05:35,770 And in primates this is really not so much the case. 142 00:05:35,770 --> 00:05:37,311 So the size of the neuron pretty much 143 00:05:37,311 --> 00:05:39,980 stays constant even if you increase the size of the brain. 144 00:05:39,980 --> 00:05:42,499 Or how much white matter do you actually need per neuron 145 00:05:42,499 --> 00:05:43,790 to connect other brain regions? 146 00:05:43,790 --> 00:05:46,225 In rodents, apparently the fiber caliber 147 00:05:46,225 --> 00:05:48,240 is increasing with brain size. 148 00:05:48,240 --> 00:05:50,240 So again, your brain might grow just 149 00:05:50,240 --> 00:05:53,750 because the anatomy of the basic element 150 00:05:53,750 --> 00:05:55,431 requires that it will need more space.
151 00:05:55,431 --> 00:05:56,930 But in primates that's not the case. 152 00:05:56,930 --> 00:05:58,430 If you get more white brain matter, 153 00:05:58,430 --> 00:06:02,240 that's likely because the connectivity is more complex. 154 00:06:02,240 --> 00:06:04,490 Primate brains also fold faster with increasing size 155 00:06:04,490 --> 00:06:05,450 than rodent brains. 156 00:06:05,450 --> 00:06:07,640 So these are all just very coarse indications 157 00:06:07,640 --> 00:06:09,140 that maybe there's something special 158 00:06:09,140 --> 00:06:12,680 about the primate brain compared to other mammalian brains. 159 00:06:12,680 --> 00:06:14,990 So along the anatomy, I mentioned that primates 160 00:06:14,990 --> 00:06:16,370 have forward-facing eyes. 161 00:06:16,370 --> 00:06:18,940 They can do binocular vision, they have color vision. 162 00:06:18,940 --> 00:06:20,690 They have skulls with a large cranium, 163 00:06:20,690 --> 00:06:24,140 that's something that makes them special from other mammals. 164 00:06:24,140 --> 00:06:27,807 They're also special in other ways that are important. 165 00:06:27,807 --> 00:06:29,390 If you think about embodied cognition, 166 00:06:29,390 --> 00:06:31,223 you don't have to buy into all these points, 167 00:06:31,223 --> 00:06:33,742 but obviously if you have a hand as complex as ours 168 00:06:33,742 --> 00:06:35,450 which we share with many primate species, 169 00:06:35,450 --> 00:06:36,950 there are lots of things you can do. 170 00:06:36,950 --> 00:06:39,500 And that requires you to be able to control it. 171 00:06:39,500 --> 00:06:42,020 And this gives you a power to interact with the environment 172 00:06:42,020 --> 00:06:44,090 that other animals might not have. 173 00:06:44,090 --> 00:06:45,950 So the shoulder is more mobile, there's 174 00:06:45,950 --> 00:06:48,100 an opposable thumb in many species.
175 00:06:48,100 --> 00:06:50,810 And then in the face, there are changes that you might already 176 00:06:50,810 --> 00:06:51,740 have seen here. 177 00:06:51,740 --> 00:06:55,790 So if you go up to the more complex animals, 178 00:06:55,790 --> 00:06:59,060 the snout region is becoming increasingly reduced. 179 00:06:59,060 --> 00:07:02,190 And I will tell you later why this might be important. 180 00:07:02,190 --> 00:07:04,690 So these are anatomical specializations in primates. 181 00:07:04,690 --> 00:07:06,794 Sociality is very important as well. 182 00:07:06,794 --> 00:07:08,960 And so there are four main organizational principles 183 00:07:08,960 --> 00:07:10,100 of sociality in primates. 184 00:07:10,100 --> 00:07:12,140 The dominating one is the second one here. 185 00:07:12,140 --> 00:07:14,120 It's called the male transfer system. 186 00:07:14,120 --> 00:07:19,470 So it's a polygamous, multi-male organization. 187 00:07:19,470 --> 00:07:22,040 This is very important for the social life of primates 188 00:07:22,040 --> 00:07:24,650 because what it means is that the social behavior has 189 00:07:24,650 --> 00:07:25,574 to be complex. 190 00:07:25,574 --> 00:07:27,740 So there can be cooperation, like grooming, defense, 191 00:07:27,740 --> 00:07:29,573 and hunting which all the animals of a troop 192 00:07:29,573 --> 00:07:30,630 might engage in. 193 00:07:30,630 --> 00:07:33,000 But at the same time there's competition for food, 194 00:07:33,000 --> 00:07:34,779 mates, dominance hierarchies. 195 00:07:34,779 --> 00:07:36,320 And it's a function of the complexity 196 00:07:36,320 --> 00:07:39,340 of the social environment. 197 00:07:39,340 --> 00:07:41,482 So primate social life was beautifully 198 00:07:41,482 --> 00:07:43,440 described by Dorothy Cheney and Robert Seyfarth 199 00:07:43,440 --> 00:07:45,300 in this wonderful book Baboon Metaphysics. 200 00:07:45,300 --> 00:07:46,883 And I'm just going to quote from that.
201 00:07:46,883 --> 00:07:49,806 So they studied baboon monkeys in the wild and here's 202 00:07:49,806 --> 00:07:51,180 what they have to say about this. 203 00:07:51,180 --> 00:07:53,550 "The domain of expertise for baboons, and indeed 204 00:07:53,550 --> 00:07:55,920 for all monkeys and apes, is social life. 205 00:07:55,920 --> 00:07:58,380 Most baboons live in multi-male and multi-female groups 206 00:07:58,380 --> 00:08:02,130 that typically include eight or nine matrilineal families." 207 00:08:02,130 --> 00:08:04,172 Which means that the females stay in the group 208 00:08:04,172 --> 00:08:05,630 and they found families and they're 209 00:08:05,630 --> 00:08:08,155 going to stay constant over a long period of time. 210 00:08:08,155 --> 00:08:09,780 "They have a linear dominance hierarchy 211 00:08:09,780 --> 00:08:12,750 of males that changes often and a linear hierarchy of females 212 00:08:12,750 --> 00:08:15,660 and their offspring that can be stable for generations. 213 00:08:15,660 --> 00:08:18,840 Daily life in a baboon group includes small scale alliances 214 00:08:18,840 --> 00:08:20,550 that may involve only three individuals 215 00:08:20,550 --> 00:08:23,460 and occasional large scale familial battles that 216 00:08:23,460 --> 00:08:27,030 involve all of the members of three or four matrilines. 217 00:08:27,030 --> 00:08:29,400 Males and females can form short term bonds that 218 00:08:29,400 --> 00:08:32,309 lead to reproduction, or longer term friendships that lead 219 00:08:32,309 --> 00:08:34,770 to cooperative child rearing." 220 00:08:34,770 --> 00:08:36,600 "The result of all this social intrigue 221 00:08:36,600 --> 00:08:38,549 is a kind of Jane Austen melodrama 222 00:08:38,549 --> 00:08:41,669 in which each individual must predict the behavior of others 223 00:08:41,669 --> 00:08:43,200 and form those relationships that 224 00:08:43,200 --> 00:08:45,310 return the greatest benefits.
225 00:08:45,310 --> 00:08:47,680 These are the problems that the baboon mind must solve 226 00:08:47,680 --> 00:08:49,890 and this is the environment in which it has evolved." 227 00:08:49,890 --> 00:08:51,660 Most of the problems facing baboons 228 00:08:51,660 --> 00:08:55,357 can be expressed in two words: other baboons. 229 00:08:55,357 --> 00:08:56,690 And so this is really important. 230 00:08:56,690 --> 00:08:59,174 So again, if you're social you don't necessarily 231 00:08:59,174 --> 00:09:00,090 have to be very smart. 232 00:09:00,090 --> 00:09:01,620 You can be very smart and not be social. 233 00:09:01,620 --> 00:09:03,286 But there's something special apparently 234 00:09:03,286 --> 00:09:08,010 about primates that links our intelligence to our sociality. 235 00:09:08,010 --> 00:09:10,020 So again, the Social Intelligence Hypothesis, 236 00:09:10,020 --> 00:09:13,950 what works in its favor is that the primates have large brains. 237 00:09:13,950 --> 00:09:15,770 In primates, apparently group size 238 00:09:15,770 --> 00:09:18,570 is correlated with brain size across different species, 239 00:09:18,570 --> 00:09:19,830 and not the home-range size. 240 00:09:19,830 --> 00:09:21,270 The alternative hypothesis 241 00:09:21,270 --> 00:09:24,680 is that maybe you have to forage in more complex environments. 242 00:09:24,680 --> 00:09:26,700 And so comparing similar proxies for the complexity 243 00:09:26,700 --> 00:09:28,639 of your social life and your physical life, 244 00:09:28,639 --> 00:09:30,180 there's a better correlation 245 00:09:30,180 --> 00:09:33,406 of brain size with the social life than physical life. 246 00:09:33,406 --> 00:09:34,780 The complexity of an individual's 247 00:09:34,780 --> 00:09:37,321 social relationships increases exponentially with group size. 248 00:09:37,321 --> 00:09:38,580 And groups are not small. 249 00:09:38,580 --> 00:09:40,550 And we're going to get back to this point in a little bit.
250 00:09:40,550 --> 00:09:42,660 The baboons and other primates know their peers' dominance 251 00:09:42,660 --> 00:09:43,770 rank and social relations. 252 00:09:43,770 --> 00:09:45,300 For everything that I mentioned in the previous slide 253 00:09:45,300 --> 00:09:47,160 there is good evidence from behavioral work 254 00:09:47,160 --> 00:09:49,890 that actually primates know something about it. 255 00:09:49,890 --> 00:09:52,410 This social knowledge contrasts with surprising cases 256 00:09:52,410 --> 00:09:54,660 of ignorance outside of the social domain. 257 00:09:54,660 --> 00:09:58,870 Even when it's something as important as a predator. 258 00:09:58,870 --> 00:10:01,480 So Cheney and Seyfarth, they also studied vervet monkeys. 259 00:10:01,480 --> 00:10:04,110 And what they observed is that for vervet monkeys, 260 00:10:04,110 --> 00:10:06,930 one of the main predators to them is a python. 261 00:10:06,930 --> 00:10:09,390 And the python, if it crawls in the sand 262 00:10:09,390 --> 00:10:10,720 it leaves behind a trail. 263 00:10:10,720 --> 00:10:12,540 And there is absolutely no indication that the vervet 264 00:10:12,540 --> 00:10:14,540 monkeys make the connection between these trails 265 00:10:14,540 --> 00:10:16,002 and the presence of the python. 266 00:10:16,002 --> 00:10:17,460 Which is really striking, you would 267 00:10:17,460 --> 00:10:20,010 imagine this is the first thing that they would have to learn. 268 00:10:20,010 --> 00:10:21,000 So there are cases where they actually 269 00:10:21,000 --> 00:10:23,374 follow the trail into the bush and they're very surprised 270 00:10:23,374 --> 00:10:24,730 to find a python there. 271 00:10:24,730 --> 00:10:26,532 Another example is leopards. 272 00:10:26,532 --> 00:10:27,990 So in the environment, leopards are 273 00:10:27,990 --> 00:10:31,389 of course predators who also feed on the vervet monkeys.
274 00:10:31,389 --> 00:10:33,680 Leopards have a way of putting the carcasses of animals 275 00:10:33,680 --> 00:10:36,390 they hunted down into the trees to protect them from larger 276 00:10:36,390 --> 00:10:37,959 predators like lions. 277 00:10:37,959 --> 00:10:39,750 So the presence of a carcass would actually 278 00:10:39,750 --> 00:10:42,600 indicate to you that likely there is a leopard around. 279 00:10:42,600 --> 00:10:44,416 And again, the vervet monkeys, they 280 00:10:44,416 --> 00:10:45,540 don't make that connection. 281 00:10:45,540 --> 00:10:47,190 So if there is a carcass there, they're 282 00:10:47,190 --> 00:10:49,000 not particularly scared about it. 283 00:10:49,000 --> 00:10:50,110 It doesn't mean that they're ignorant. 284 00:10:50,110 --> 00:10:52,350 So they're even following alarm calls from other species 285 00:10:52,350 --> 00:10:54,000 as to whether an eagle is approaching, 286 00:10:54,000 --> 00:10:55,890 or if cattle is approaching. 287 00:10:55,890 --> 00:10:59,010 So it's not that they're generally dumb in this way, 288 00:10:59,010 --> 00:11:02,070 but there's really this very big contrast between all the details 289 00:11:02,070 --> 00:11:04,560 that they know about their social world and the obliviousness 290 00:11:04,560 --> 00:11:09,942 that they can express for non-social factors. 291 00:11:09,942 --> 00:11:11,400 Then we have specializations, we're 292 00:11:11,400 --> 00:11:14,050 going to talk about this, in the brain for processing 293 00:11:14,050 --> 00:11:15,534 social stimuli. 294 00:11:15,534 --> 00:11:16,950 And then there's actually evidence 295 00:11:16,950 --> 00:11:19,283 that females who have better social abilities, that they 296 00:11:19,283 --> 00:11:22,020 are less stressed and they have better reproductive success.
297 00:11:22,020 --> 00:11:26,100 And so this all works to say that if you are socially 298 00:11:26,100 --> 00:11:28,671 smart in a baboon environment or in the environment 299 00:11:28,671 --> 00:11:30,420 of many primate species, you actually have 300 00:11:30,420 --> 00:11:31,716 better success of reproducing. 301 00:11:31,716 --> 00:11:33,090 Therefore there's a good argument 302 00:11:33,090 --> 00:11:36,675 to be made that your social intelligence will therefore 303 00:11:36,675 --> 00:11:37,950 go to the next generation. 304 00:11:37,950 --> 00:11:40,161 This is how social intelligence might improve. 305 00:11:40,161 --> 00:11:41,910 I think there's one important point that's 306 00:11:41,910 --> 00:11:42,842 oftentimes not made. 307 00:11:42,842 --> 00:11:45,300 And that's if you become smarter and smarter in interacting 308 00:11:45,300 --> 00:11:47,820 with your physical environment, your physical environment 309 00:11:47,820 --> 00:11:49,302 does not change very much. 310 00:11:49,302 --> 00:11:51,510 But if you are interacting with a social environment, 311 00:11:51,510 --> 00:11:53,310 and getting smarter and smarter interacting 312 00:11:53,310 --> 00:11:55,260 with the social environment, the elements 313 00:11:55,260 --> 00:11:57,130 in your social environment you're interacting with, 314 00:11:57,130 --> 00:11:58,838 they're also getting smarter and smarter. 315 00:11:58,838 --> 00:12:00,990 So you're actually setting forth an arms race 316 00:12:00,990 --> 00:12:03,960 where you're not only improving the situation by getting 317 00:12:03,960 --> 00:12:05,560 smarter, but you have to get smart 318 00:12:05,560 --> 00:12:08,184 in order to keep pace with the others who are outsmarting you. 319 00:12:08,184 --> 00:12:09,600 And so you can see how there could 320 00:12:09,600 --> 00:12:12,294 be a connection between sociality and intelligence.
321 00:12:12,294 --> 00:12:13,710 There's really this arms race that 322 00:12:13,710 --> 00:12:15,870 does not occur for physical interactions 323 00:12:15,870 --> 00:12:17,470 but for social interactions. 324 00:12:17,470 --> 00:12:19,740 So that you actually have to be able to better predict 325 00:12:19,740 --> 00:12:21,960 the next move of someone else in your group 326 00:12:21,960 --> 00:12:24,870 and you have to know something about that individual for you 327 00:12:24,870 --> 00:12:26,551 to be successful. 328 00:12:26,551 --> 00:12:28,800 So there are arguments against the Social Intelligence 329 00:12:28,800 --> 00:12:30,000 Hypothesis. 330 00:12:30,000 --> 00:12:31,770 So, in particular, we're ignorant 331 00:12:31,770 --> 00:12:33,280 about many other species. 332 00:12:33,280 --> 00:12:35,970 So there are other social species, hyenas for example 333 00:12:35,970 --> 00:12:38,740 have complex societies, but hyenas have not been studied 334 00:12:38,740 --> 00:12:41,950 as much as monkeys have. 335 00:12:41,950 --> 00:12:43,780 We also don't know about the complexity, 336 00:12:43,780 --> 00:12:45,870 mostly for this reason, whether really, 337 00:12:45,870 --> 00:12:48,400 compared to whales, for example, 338 00:12:48,400 --> 00:12:50,251 primate societies are more complex. 339 00:12:50,251 --> 00:12:52,250 And this of course would be a crucial conjecture 340 00:12:52,250 --> 00:12:55,709 of this Social Intelligence Hypothesis. 341 00:12:55,709 --> 00:12:57,250 Then within the primate order, there 342 00:12:57,250 --> 00:12:58,780 are actually some other correlations 343 00:12:58,780 --> 00:13:00,322 that do predict brain size very well. 344 00:13:00,322 --> 00:13:02,863 And so within the primate order, social learning, innovation, 345 00:13:02,863 --> 00:13:05,140 and tool use are strongly correlated with brain size 346 00:13:05,140 --> 00:13:06,330 and not with group size.
347 00:13:06,330 --> 00:13:08,830 So you could imagine a scenario where actually the evolution 348 00:13:08,830 --> 00:13:11,200 of basic social intelligence is something 349 00:13:11,200 --> 00:13:13,150 that's very basic to primates. 350 00:13:13,150 --> 00:13:15,010 But then if you go to different species 351 00:13:15,010 --> 00:13:17,470 within the primate order and ask like why 352 00:13:17,470 --> 00:13:20,740 did they become so smart, like orangutans or chimps who 353 00:13:20,740 --> 00:13:22,270 can use tools, then it really might 354 00:13:22,270 --> 00:13:23,811 be the tool use that would be of more 355 00:13:23,811 --> 00:13:25,674 importance than the sociality. 356 00:13:25,674 --> 00:13:28,090 So I have a movie here that actually illustrates these two 357 00:13:28,090 --> 00:13:29,314 different hypotheses. 358 00:13:29,314 --> 00:13:31,230 So you see social interactions here in a group 359 00:13:31,230 --> 00:13:32,410 of Tonkean macaque monkeys. 360 00:13:32,410 --> 00:13:33,910 You can see the facial displays, you 361 00:13:33,910 --> 00:13:35,810 can see that they are tending to each other. 362 00:13:35,810 --> 00:13:38,268 And well you might think that they are trying to figure out 363 00:13:38,268 --> 00:13:39,670 what's actually going on here. 364 00:13:39,670 --> 00:13:43,030 And here's the alternative hypothesis of this. 365 00:13:43,030 --> 00:13:44,060 This is tool use. 366 00:13:44,060 --> 00:13:47,390 So you can see this guy just invented a cool tool, 367 00:13:47,390 --> 00:13:48,430 a nose pick. 368 00:13:48,430 --> 00:13:50,380 And so it's anyone's guess what's 369 00:13:50,380 --> 00:13:52,105 more important to your intelligence, 370 00:13:52,105 --> 00:13:53,980 your being able to read the social significance 371 00:13:53,980 --> 00:13:55,396 of other individuals in your troop 372 00:13:55,396 --> 00:13:59,990 or your ability to invent a nice nose pick. 373 00:13:59,990 --> 00:14:03,995 So the last point I wanted to make is a question.
374 00:14:03,995 --> 00:14:06,370 So are the primates' abilities in social knowledge really 375 00:14:06,370 --> 00:14:08,440 intelligent or is it just more like 376 00:14:08,440 --> 00:14:09,800 idiot savant-like abilities? 377 00:14:09,800 --> 00:14:12,850 So a unique specialization that they're good at. 378 00:14:12,850 --> 00:14:15,710 The argument that Cheney and Seyfarth made is the following. 379 00:14:15,710 --> 00:14:17,668 So the knowledge that they have should actually 380 00:14:17,668 --> 00:14:19,835 be true knowledge and not just learned associations. 381 00:14:19,835 --> 00:14:21,667 And the reason has to do with the complexity 382 00:14:21,667 --> 00:14:22,880 of the social environment. 383 00:14:22,880 --> 00:14:25,060 If you have 80 different individuals, which 384 00:14:25,060 --> 00:14:27,580 is the typical case for these baboon monkeys, 385 00:14:27,580 --> 00:14:32,999 you have 3,160 pairs of animals and 82,160 triads. 386 00:14:32,999 --> 00:14:34,540 It's going to be virtually impossible 387 00:14:34,540 --> 00:14:36,415 for you to learn all these different pairwise 388 00:14:36,415 --> 00:14:39,760 relationships and then behave intelligently based upon it. 389 00:14:39,760 --> 00:14:42,640 Second, these relationships can change very fast. 390 00:14:42,640 --> 00:14:44,862 So it would not be very smart to try and make 391 00:14:44,862 --> 00:14:46,570 a list of all these pairwise interactions 392 00:14:46,570 --> 00:14:48,370 and then act upon that. 393 00:14:48,370 --> 00:14:49,960 No single behavioral metric seems 394 00:14:49,960 --> 00:14:52,270 to be necessary or sufficient to recognize associations 395 00:14:52,270 --> 00:14:53,920 like matrilineal kin. 396 00:14:53,920 --> 00:14:55,510 So human observers are apparently not 397 00:14:55,510 --> 00:14:57,843 very good at predicting this if they don't know the animals 398 00:14:57,843 --> 00:14:59,547 very well personally.
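The two figures quoted for a group of 80 animals (3,160 and 82,160) are the number of ways to choose 2 and 3 individuals out of 80. The following is a minimal sketch of that arithmetic, not something shown in the talk; the function name `relationship_counts` is hypothetical and chosen only for illustration.

```python
from math import comb


def relationship_counts(group_size: int) -> tuple[int, int]:
    """Return (pairs, triads): the number of distinct two-animal and
    three-animal subsets in a group of `group_size` individuals.

    Hypothetical helper for illustration; it only counts subsets and says
    nothing about which relationships actually matter to the animals.
    """
    pairs = comb(group_size, 2)   # n * (n - 1) / 2
    triads = comb(group_size, 3)  # n * (n - 1) * (n - 2) / 6
    return pairs, triads


if __name__ == "__main__":
    # Show how quickly the counts grow as the group gets larger.
    for n in (10, 20, 40, 80):
        pairs, triads = relationship_counts(n)
        print(f"{n:>3} individuals: {pairs:>5} pairs, {triads:>6} triads")
```

Doubling the group from 40 to 80 roughly quadruples the pairs and multiplies the three-animal subsets by about eight, which is one way to see why a memorized list of relationships scales badly.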
399 00:14:59,547 --> 00:15:01,130 Then you might think, well, maybe it's 400 00:15:01,130 --> 00:15:03,088 not physically like a list but you don't really 401 00:15:03,088 --> 00:15:05,440 learn this much and you apply a simple rule to it. 402 00:15:05,440 --> 00:15:07,484 And that also doesn't seem to work very well 403 00:15:07,484 --> 00:15:09,400 because social relationships like friendships, 404 00:15:09,400 --> 00:15:10,684 they are intransitive. 405 00:15:10,684 --> 00:15:12,100 So if A and B are friends, B and C 406 00:15:12,100 --> 00:15:13,900 are friends, that doesn't mean that A and C necessarily 407 00:15:13,900 --> 00:15:15,110 have to be friends. 408 00:15:15,110 --> 00:15:17,080 Others like family relationships are complex, 409 00:15:17,080 --> 00:15:18,080 they're non-associative. 410 00:15:18,080 --> 00:15:19,900 So if A is the mother of B, that actually 411 00:15:19,900 --> 00:15:21,850 means that B is not the mother of A. 412 00:15:21,850 --> 00:15:24,560 So there's a more complex interaction there as well. 413 00:15:24,560 --> 00:15:28,240 And then finally, there can be simultaneous membership 414 00:15:28,240 --> 00:15:29,800 in multiple classes. 415 00:15:29,800 --> 00:15:32,050 And again, for you to be able to keep track of this, 416 00:15:32,050 --> 00:15:33,424 you better have a cognitive model 417 00:15:33,424 --> 00:15:36,190 of what's going on rather than just a list of associations 418 00:15:36,190 --> 00:15:36,947 that you learned. 419 00:15:36,947 --> 00:15:38,530 It's very difficult in experimentation 420 00:15:38,530 --> 00:15:40,000 to prove it's not association, but I 421 00:15:40,000 --> 00:15:41,485 think these are very good arguments to consider 422 00:15:41,485 --> 00:15:43,690 that actually these primates have active knowledge 423 00:15:43,690 --> 00:15:45,950 of their social environment. 
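The point about simple rules failing can be made concrete by writing the relationships as sets of ordered pairs and checking their formal properties. This is only an illustrative sketch: the individuals A, B, C follow the example in the talk, while the helper functions and the sample data are assumptions. In the usual mathematical vocabulary, the mother example (if A is the mother of B, then B is not the mother of A) is called asymmetry.

```python
from itertools import product

# A relation is a set of ordered pairs (x, y), read as "x is related to y".
Relation = set[tuple[str, str]]


def is_symmetric(rel: Relation) -> bool:
    """True if every pair (a, b) also appears as (b, a)."""
    return all((b, a) in rel for a, b in rel)


def is_asymmetric(rel: Relation) -> bool:
    """True if no pair (a, b) ever appears reversed as (b, a)."""
    return all((b, a) not in rel for a, b in rel)


def is_transitive(rel: Relation) -> bool:
    """True if (a, b) and (b, c) in rel always imply (a, c) in rel."""
    return all(
        (a, d) in rel
        for (a, b), (c, d) in product(rel, rel)
        if b == c
    )


# Hypothetical data: A-B and B-C are friends, but A-C are not.
friends: Relation = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}

# Hypothetical data: A is the mother of B, and B is the mother of C.
mother_of: Relation = {("A", "B"), ("B", "C")}

if __name__ == "__main__":
    print("friends symmetric:   ", is_symmetric(friends))
    print("friends transitive:  ", is_transitive(friends))
    print("mother_of asymmetric:", is_asymmetric(mother_of))
    print("mother_of transitive:", is_transitive(mother_of))
```

Because friendship and motherhood differ in which properties hold, a single inference rule applied to every relationship type would give wrong answers for at least one of them.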
424 00:15:45,950 --> 00:15:48,220 So there's one example for this where you can actually 425 00:15:48,220 --> 00:15:49,110 make the point very nicely. 426 00:15:49,110 --> 00:15:50,360 And this is the story of Ahla. 427 00:15:50,360 --> 00:15:56,020 Ahla is a baboon monkey and she was actually 428 00:15:56,020 --> 00:15:58,580 living with farmers in South West Africa. 429 00:15:58,580 --> 00:16:01,600 So there was a habit at the time to actually replace 430 00:16:01,600 --> 00:16:05,080 dogs who were herding goats with baboons. 431 00:16:05,080 --> 00:16:07,420 And so you can see Ahla sitting here. 432 00:16:07,420 --> 00:16:10,220 You can see her here adopting some 433 00:16:10,220 --> 00:16:11,560 of the behavior of the goats. 434 00:16:11,560 --> 00:16:14,590 So she's licking salt here, which is something that baboons 435 00:16:14,590 --> 00:16:15,798 would naturally not do. 436 00:16:15,798 --> 00:16:18,006 She would continue to engage in social behaviors that 437 00:16:18,006 --> 00:16:19,005 are typical for baboons. 438 00:16:19,005 --> 00:16:21,820 So she would groom the goats, for example. 439 00:16:21,820 --> 00:16:24,140 But the most amazing thing about her, 440 00:16:24,140 --> 00:16:26,290 which is a little hard to see as you see her here, 441 00:16:26,290 --> 00:16:29,290 she's carrying one of the little yearlings here, 442 00:16:29,290 --> 00:16:31,540 and brings it to its mother. 443 00:16:31,540 --> 00:16:34,000 And the description of what this animal was doing 444 00:16:34,000 --> 00:16:36,370 was when the goats were brought home 445 00:16:36,370 --> 00:16:39,820 and then sometimes they were separating the mother goats 446 00:16:39,820 --> 00:16:43,210 from the offspring, she would actually go manic and then 447 00:16:43,210 --> 00:16:44,980 try to pair them up and she would not 448 00:16:44,980 --> 00:16:47,080 stop until she was finished. 
449 00:16:47,080 --> 00:16:49,420 And this would even happen as multiple goats 450 00:16:49,420 --> 00:16:52,250 were calling for the yearlings and then vice versa. 451 00:16:52,250 --> 00:16:55,207 And so she would have to put order into the social world 452 00:16:55,207 --> 00:16:57,040 that she was living in and she wouldn't stop 453 00:16:57,040 --> 00:16:58,487 until this order was restored. 454 00:16:58,487 --> 00:17:00,070 And the farmer said that they were not 455 00:17:00,070 --> 00:17:04,030 able to tell any of the adult goats or the yearlings 456 00:17:04,030 --> 00:17:05,173 or know any of the pairs. 457 00:17:05,173 --> 00:17:06,339 But this was like her world. 458 00:17:06,339 --> 00:17:07,990 This was like a social environment that she was in 459 00:17:07,990 --> 00:17:09,490 and she would structure it according 460 00:17:09,490 --> 00:17:12,069 to her cognitive demands. 461 00:17:12,069 --> 00:17:13,599 So the point is that primates have 462 00:17:13,599 --> 00:17:15,485 intricate social knowledge. 463 00:17:15,485 --> 00:17:17,359 So they know about the status of individuals, 464 00:17:17,359 --> 00:17:19,119 like their age or their gender. 465 00:17:19,119 --> 00:17:21,369 They know about the interactions of individuals. 466 00:17:21,369 --> 00:17:24,250 They recognize them very simply like grooming and mothering. 467 00:17:24,250 --> 00:17:25,930 And then based on these observations 468 00:17:25,930 --> 00:17:28,150 of the different individuals in the social world, 469 00:17:28,150 --> 00:17:30,730 they build these cognitive structures 470 00:17:30,730 --> 00:17:32,670 like friendship, kinship, and hierarchy, which 471 00:17:32,670 --> 00:17:35,900 have an interesting, complicated structure to them. 472 00:17:35,900 --> 00:17:39,150 All of this is rooted in the concept of the person.
473 00:17:39,150 --> 00:17:40,780 And this is very important and as I'm 474 00:17:40,780 --> 00:17:41,810 going to be talking about face recognition, 475 00:17:41,810 --> 00:17:43,939 I have to emphasize these are two different things. 476 00:17:43,939 --> 00:17:45,730 You can recognize a person from their face, 477 00:17:45,730 --> 00:17:47,146 but if you can recognize a face it 478 00:17:47,146 --> 00:17:51,217 doesn't mean that you know who it is, who is behind the face. 479 00:17:51,217 --> 00:17:53,550 So the person concept would include something like this. 480 00:17:53,550 --> 00:17:55,124 It's a juvenile, female monkey. 481 00:17:55,124 --> 00:17:57,040 It's the daughter of x and so on and so forth. 482 00:17:57,040 --> 00:17:58,250 So this is knowledge that we have. 483 00:17:58,250 --> 00:18:00,100 It's actually been shown in rhesus monkeys, the monkeys 484 00:18:00,100 --> 00:18:01,641 that we work with, that they actually 485 00:18:01,641 --> 00:18:04,510 have this person knowledge. 486 00:18:04,510 --> 00:18:05,680 So why do we study faces? 487 00:18:05,680 --> 00:18:07,900 So for us faces really are the ideal intersection 488 00:18:07,900 --> 00:18:10,990 between object recognition, the study 489 00:18:10,990 --> 00:18:15,590 of which Jim DiCarlo talked about, and social cognition. 490 00:18:15,590 --> 00:18:19,540 So as Jim was alluding to yesterday, 491 00:18:19,540 --> 00:18:21,664 vision is really important in primates. 492 00:18:21,664 --> 00:18:23,080 About a third of the primate brain 493 00:18:23,080 --> 00:18:25,040 is thought to be involved in visual function. 494 00:18:25,040 --> 00:18:25,750 So this is a lot. 495 00:18:25,750 --> 00:18:27,160 And this is testament to the fact 496 00:18:27,160 --> 00:18:28,060 that there's a lot of information 497 00:18:28,060 --> 00:18:30,160 to be gathered from the outside visual world, 498 00:18:30,160 --> 00:18:32,800 but also that it's difficult computationally to gather this. 
499 00:18:32,800 --> 00:18:35,770 And so Jim was explaining some of the computational challenges 500 00:18:35,770 --> 00:18:37,870 that object recognition has to solve yesterday 501 00:18:37,870 --> 00:18:41,810 and then we'll come back to some of his points a little later. 502 00:18:41,810 --> 00:18:43,150 So what is an object? 503 00:18:43,150 --> 00:18:45,700 So Jim was saying that it's the basic unit of cognition. 504 00:18:45,700 --> 00:18:48,040 And just very quickly, what it actually is, it's 505 00:18:48,040 --> 00:18:49,534 more than a collection of features. 506 00:18:49,534 --> 00:18:50,950 So the Gestalt Rules of Perception 507 00:18:50,950 --> 00:18:51,960 actually emphasize this. 508 00:18:51,960 --> 00:18:53,918 So if you have proximity of elements, you group 509 00:18:53,918 --> 00:18:54,520 them together. 510 00:18:54,520 --> 00:18:56,110 If these elements share similarity, 511 00:18:56,110 --> 00:18:58,006 you group them together to larger entities. 512 00:18:58,006 --> 00:18:59,380 If there's good continuation, you 513 00:18:59,380 --> 00:19:01,669 group these local elements together into lines. 514 00:19:01,669 --> 00:19:03,460 If there's common fate, you group them again 515 00:19:03,460 --> 00:19:05,170 together to larger entities. 516 00:19:05,170 --> 00:19:07,277 And something similar is true for faces. 517 00:19:07,277 --> 00:19:08,860 If you have different face parts in 518 00:19:08,860 --> 00:19:11,410 the wrong organization but now you put them together 519 00:19:11,410 --> 00:19:13,510 correctly, you can suddenly recognize a face. 520 00:19:13,510 --> 00:19:15,790 So there's a larger scale organization to it 521 00:19:15,790 --> 00:19:18,140 that goes beyond just being a collection of features. 522 00:19:18,140 --> 00:19:20,556 And maybe something similar is going on for higher order 523 00:19:20,556 --> 00:19:23,380 cognition that I'm sure other people are going to talk about.
524 00:19:23,380 --> 00:19:25,050 And as in the physical interactions, 525 00:19:25,050 --> 00:19:29,230 we infer causality from just the sequence of events. 526 00:19:29,230 --> 00:19:32,030 Or social interactions, like in Heider-Simmel movies 527 00:19:32,030 --> 00:19:34,319 where you are telling yourself a complex social story 528 00:19:34,319 --> 00:19:35,860 unfolding even when there's just simple, 529 00:19:35,860 --> 00:19:37,780 geometric shapes moving around. 530 00:19:37,780 --> 00:19:41,030 So this creation of higher order representations I think 531 00:19:41,030 --> 00:19:42,631 is essential for object recognition. 532 00:19:42,631 --> 00:19:44,380 It's a constructive process that the brain 533 00:19:44,380 --> 00:19:45,880 imposes on the piece of information 534 00:19:45,880 --> 00:19:47,230 it gets from the eyes. 535 00:19:47,230 --> 00:19:49,180 It's not just a collection of features. 536 00:19:49,180 --> 00:19:51,616 It's kind of the basis of symbolic representations. 537 00:19:51,616 --> 00:19:52,990 It can create meaning, especially 538 00:19:52,990 --> 00:19:55,490 if you think about the face, if it's the face of someone you 539 00:19:55,490 --> 00:19:57,230 know that's very meaningful. 540 00:19:57,230 --> 00:19:59,350 And it makes information actionable. 541 00:19:59,350 --> 00:20:01,400 And these are really, I think, the important 542 00:20:01,400 --> 00:20:03,670 links between object recognition and social cognition, 543 00:20:03,670 --> 00:20:07,449 and faces are smack in the middle of this. 544 00:20:07,449 --> 00:20:08,740 So I already showed this movie. 545 00:20:08,740 --> 00:20:11,590 I use it again to emphasize the social communication 546 00:20:11,590 --> 00:20:12,779 that's taking place here. 547 00:20:12,779 --> 00:20:14,320 You can see the facial displays here. 548 00:20:14,320 --> 00:20:16,860 So the older male who's chasing the younger animal 549 00:20:16,860 --> 00:20:18,212 is making these facial displays.
550 00:20:18,212 --> 00:20:20,420 By the way, most of you will never have seen a Tonkean 551 00:20:20,420 --> 00:20:22,586 macaque and still you can understand what's going 552 00:20:22,586 --> 00:20:24,400 on there. 553 00:20:24,400 --> 00:20:26,170 This is something very special, again 554 00:20:26,170 --> 00:20:28,870 that you don't have in all animals, these facial displays. 555 00:20:28,870 --> 00:20:32,110 And Charles Darwin was actually again one of the first people 556 00:20:32,110 --> 00:20:35,530 to notice it in 1872 is that you use your face to express 557 00:20:35,530 --> 00:20:36,631 your emotional state. 558 00:20:36,631 --> 00:20:38,380 Otherwise your emotions are private to you 559 00:20:38,380 --> 00:20:41,770 but you use body language and then facial language 560 00:20:41,770 --> 00:20:43,155 to express your emotions. 561 00:20:43,155 --> 00:20:45,280 And oftentimes you do it even if you don't want to. 562 00:20:45,280 --> 00:20:47,330 It just happens automatically. 563 00:20:47,330 --> 00:20:49,280 And that's not possible in all animals. 564 00:20:49,280 --> 00:20:50,920 So if you are a fish or a frog, there 565 00:20:50,920 --> 00:20:52,720 are lots of really cool things you can do. 566 00:20:52,720 --> 00:20:54,970 You can sit on the front porch and enjoy the day. 567 00:20:54,970 --> 00:20:57,760 So lots of things that you can have in common with primates. 568 00:20:57,760 --> 00:21:00,220 But facial communication really requires something 569 00:21:00,220 --> 00:21:02,940 more that's very mammalian specific. 570 00:21:02,940 --> 00:21:04,900 In mammals you actually have in the face 571 00:21:04,900 --> 00:21:07,390 musculature that is not attaching from bone to bone, 572 00:21:07,390 --> 00:21:09,310 but it's now attaching to the skin. 573 00:21:09,310 --> 00:21:11,950 And so if you look at these two rats here where their whiskers, 574 00:21:11,950 --> 00:21:14,200 I hope you can see it here, in the end are labeled.
575 00:21:14,200 --> 00:21:16,830 You can see they're actively exploring each other's faces 576 00:21:16,830 --> 00:21:18,510 in a somatosensory fashion. 577 00:21:18,510 --> 00:21:21,010 That's possible because they can move their whiskers because 578 00:21:21,010 --> 00:21:22,264 of this musculature. 579 00:21:22,264 --> 00:21:23,680 And that's a specialization that's 580 00:21:23,680 --> 00:21:25,155 becoming more and more refined in primates. 581 00:21:25,155 --> 00:21:27,150 So in rhesus monkeys and chimps and humans, 582 00:21:27,150 --> 00:21:29,124 we have 23 different facial muscles. 583 00:21:29,124 --> 00:21:30,790 They're becoming more and more flexible. 584 00:21:30,790 --> 00:21:33,123 I mentioned before that the snout region is increasingly 585 00:21:33,123 --> 00:21:34,010 reduced in primates. 586 00:21:34,010 --> 00:21:35,050 So you have some simpler primates 587 00:21:35,050 --> 00:21:36,820 where there's still a strong snout, which 588 00:21:36,820 --> 00:21:38,887 is limiting the ability of the face to move. 589 00:21:38,887 --> 00:21:41,470 But the more complex the primate is getting, the more flexible 590 00:21:41,470 --> 00:21:43,303 these muscles become and the more expressive 591 00:21:43,303 --> 00:21:44,080 the faces become. 592 00:21:44,080 --> 00:21:46,960 So the face now becomes richer and richer with social signals 593 00:21:46,960 --> 00:21:48,160 that can be read out. 594 00:21:48,160 --> 00:21:50,110 And in rhesus monkeys which are shown here, 595 00:21:50,110 --> 00:21:51,860 you have a fixed set of facial expressions 596 00:21:51,860 --> 00:21:54,715 that, again, for a system that can analyze these, 597 00:21:54,715 --> 00:21:57,430 is very important information about the emotional state 598 00:21:57,430 --> 00:21:59,539 of another animal. 599 00:21:59,539 --> 00:22:01,330 Primates are also very interested in faces.
600 00:22:01,330 --> 00:22:06,010 I'd very much like to show this movie which is showing 601 00:22:06,010 --> 00:22:08,847 a three-day-old macaque monkey. 602 00:22:08,847 --> 00:22:10,930 And I'll tell you what the point of this study is. 603 00:22:10,930 --> 00:22:12,920 So you can see that he's attending very closely 604 00:22:12,920 --> 00:22:14,317 to the face of the experimenter. 605 00:22:14,317 --> 00:22:16,900 Of course, if there were bananas you might think he would also 606 00:22:16,900 --> 00:22:19,150 be, this isn't proof that there is this specialization 607 00:22:19,150 --> 00:22:21,250 for faces that Nancy was alluding to before, 608 00:22:21,250 --> 00:22:22,990 but it's at least intuitive. 609 00:22:22,990 --> 00:22:25,180 The second thing why I like to show this movie 610 00:22:25,180 --> 00:22:27,250 is exactly what happened right now. 611 00:22:27,250 --> 00:22:28,810 You're all getting really excited 612 00:22:28,810 --> 00:22:31,990 about this absolutely adorable, little critter, right? 613 00:22:31,990 --> 00:22:33,980 And I've seen this movie now hundreds of times 614 00:22:33,980 --> 00:22:35,021 and it's still like this. 615 00:22:35,021 --> 00:22:36,650 It's still very emotionally charged. 616 00:22:36,650 --> 00:22:38,840 So here's the third reason. 617 00:22:38,840 --> 00:22:40,889 The experiment is based on facial movements here. 618 00:22:40,889 --> 00:22:42,930 You can see it's getting really excited about it, 619 00:22:42,930 --> 00:22:44,020 it's getting very active. 620 00:22:44,020 --> 00:22:45,978 And now he's reproducing these facial movements 621 00:22:45,978 --> 00:22:47,189 as best as he can. 622 00:22:47,189 --> 00:22:48,730 There's a specific facial interaction 623 00:22:48,730 --> 00:22:50,869 that's happening in human babies, 624 00:22:50,869 --> 00:22:51,910 I think for three months. 625 00:22:51,910 --> 00:22:54,119 It's happening in these rhesus monkeys for two weeks. 
626 00:22:54,119 --> 00:22:56,326 And you can see that there is an intricate connection 627 00:22:56,326 --> 00:22:58,750 between what they're perceiving and what they're acting on 628 00:22:58,750 --> 00:23:00,190 in an automatic fashion. 629 00:23:00,190 --> 00:23:03,070 But this emotional part I think is really important. 630 00:23:03,070 --> 00:23:04,630 It's just at a certain point that you 631 00:23:04,630 --> 00:23:06,220 can't control these things. 632 00:23:06,220 --> 00:23:09,010 Faces really get very deep into your emotional and social brain 633 00:23:09,010 --> 00:23:09,750 automatically. 634 00:23:09,750 --> 00:23:11,680 And so one of the lines of research in my lab 635 00:23:11,680 --> 00:23:12,940 is to try to figure out the circuits 636 00:23:12,940 --> 00:23:14,740 that make that possible and then to use 637 00:23:14,740 --> 00:23:17,280 this to get an inroad into social brain function 638 00:23:17,280 --> 00:23:19,840 beyond face perception. 639 00:23:19,840 --> 00:23:22,210 So amongst the signals the faces are sending, 640 00:23:22,210 --> 00:23:24,280 you recognize Charles Darwin, so it's identity, 641 00:23:24,280 --> 00:23:26,300 there's social communication, 642 00:23:26,300 --> 00:23:28,970 there's emotional responses, and there's also gaze following. 643 00:23:28,970 --> 00:23:31,110 So the direction that the eyes are looking to, 644 00:23:31,110 --> 00:23:32,600 we are following automatically. 645 00:23:32,600 --> 00:23:34,808 We can control this later, but this is an initial automatic 646 00:23:34,808 --> 00:23:35,390 response. 647 00:23:35,390 --> 00:23:36,765 Here's one very nice illustration 648 00:23:36,765 --> 00:23:38,200 from a British TV show.
649 00:23:38,200 --> 00:23:40,240 You have people wearing these glasses, 650 00:23:40,240 --> 00:23:43,870 actually it's a large background to people wearing these glasses 651 00:23:43,870 --> 00:23:46,500 where their eyes are drawn on these glasses that 652 00:23:46,500 --> 00:23:48,100 are going to one direction. 653 00:23:48,100 --> 00:23:50,417 And so you know that these are not real eyes, 654 00:23:50,417 --> 00:23:52,000 but what's happening is your attention 655 00:23:52,000 --> 00:23:54,082 is drawn constantly to this upper right region. 656 00:23:54,082 --> 00:23:55,540 And it's getting annoying over time 657 00:23:55,540 --> 00:23:57,130 because you know that there's nothing there. 658 00:23:57,130 --> 00:23:58,830 You know that they're not really paying attention there. 659 00:23:58,830 --> 00:24:00,490 But automatically your attention is drawn there 660 00:24:00,490 --> 00:24:01,930 and then you're going back again and your attention 661 00:24:01,930 --> 00:24:03,230 is going out there again. 662 00:24:03,230 --> 00:24:04,180 And so this is another thing that 663 00:24:04,180 --> 00:24:05,596 comes from the face that gets deep 664 00:24:05,596 --> 00:24:08,770 into your attentional control system. 665 00:24:08,770 --> 00:24:11,440 So social perception can start with faces, but faces 666 00:24:11,440 --> 00:24:15,790 are the most important visual sources of information. 667 00:24:15,790 --> 00:24:17,580 We get gender and age, of course identity, 668 00:24:17,580 --> 00:24:19,300 and things like perceived trustworthiness 669 00:24:19,300 --> 00:24:21,919 or attractiveness from just a very brief look at the face. 670 00:24:21,919 --> 00:24:23,710 And then there are these dynamical signals, 671 00:24:23,710 --> 00:24:25,460 like mood and overt direction of attention 672 00:24:25,460 --> 00:24:27,277 that we also get from the face. 673 00:24:27,277 --> 00:24:28,360 So how does this all work? 
674 00:24:28,360 --> 00:24:31,720 So Jim was already explaining some of the challenges 675 00:24:31,720 --> 00:24:33,065 of object recognition to you. 676 00:24:33,065 --> 00:24:34,690 And so here are some of the challenges. 677 00:24:34,690 --> 00:24:36,940 So first of all, the social scene like this one here, 678 00:24:36,940 --> 00:24:40,150 lighting conditions can sometimes be non-optimal. 679 00:24:40,150 --> 00:24:43,570 And so the first thing for you to analyze the facial signals 680 00:24:43,570 --> 00:24:46,390 which are in this scene, is to localize where the faces are. 681 00:24:46,390 --> 00:24:47,620 And I'm going to tell you a little bit about what 682 00:24:47,620 --> 00:24:49,685 we understand about the mechanisms of that. 683 00:24:49,685 --> 00:24:51,310 Then once you know where the faces are, 684 00:24:51,310 --> 00:24:52,685 you want to analyze them further, 685 00:24:52,685 --> 00:24:54,880 you want to know who these individuals are. 686 00:24:54,880 --> 00:24:57,280 And I just realized that the images that I had from this, 687 00:24:57,280 --> 00:24:59,410 which are of course also taken from The Godfather, 688 00:24:59,410 --> 00:25:01,645 might not be the best. 689 00:25:01,645 --> 00:25:03,520 Where is the other picture of this individual 690 00:25:03,520 --> 00:25:05,552 here in this display of these five faces? 691 00:25:08,390 --> 00:25:10,490 Upper right. 692 00:25:10,490 --> 00:25:12,050 And then there's another individual, 693 00:25:12,050 --> 00:25:15,280 there's Don Corleone and there's another person down here 694 00:25:15,280 --> 00:25:16,530 with two different directions. 695 00:25:16,530 --> 00:25:18,655 And then if the lights were down a little bit more, 696 00:25:18,655 --> 00:25:19,790 you could see this better. 
697 00:25:19,790 --> 00:25:21,950 The cool thing is that we have a way of relating these two 698 00:25:21,950 --> 00:25:23,616 pictures to each other knowing that they 699 00:25:23,616 --> 00:25:26,232 are from the same person, even though physically on a pixel 700 00:25:26,232 --> 00:25:28,190 by pixel basis these two actually are much more 701 00:25:28,190 --> 00:25:29,542 similar to each other. 702 00:25:29,542 --> 00:25:32,000 And so we'd like to figure out how the brain is doing that, 703 00:25:32,000 --> 00:25:35,030 achieving object recognition, in this case face recognition, 704 00:25:35,030 --> 00:25:37,760 in a manner that's invariant to transformations that are not 705 00:25:37,760 --> 00:25:40,070 intrinsic to the object. 706 00:25:40,070 --> 00:25:42,320 This is just a reminder that face recognition actually 707 00:25:42,320 --> 00:25:46,400 is very difficult. So this is of course just made up 708 00:25:46,400 --> 00:25:49,550 from Curb Your Enthusiasm, but there's a condition 709 00:25:49,550 --> 00:25:52,190 that many of you will have heard about, prosopagnosia. 710 00:25:52,190 --> 00:25:54,380 And to a prosopagnosic person who is face blind, 711 00:25:54,380 --> 00:25:56,370 the social world might look like this. 712 00:25:56,370 --> 00:25:58,340 So a prosopagnosic has great difficulty telling 713 00:25:58,340 --> 00:25:59,506 one individual from another. 714 00:25:59,506 --> 00:26:01,780 This is at least the most typical condition. 715 00:26:01,780 --> 00:26:03,738 And you can imagine that your social life would 716 00:26:03,738 --> 00:26:05,870 be really difficult and your enthusiasm 717 00:26:05,870 --> 00:26:07,494 about socially interacting would really 718 00:26:07,494 --> 00:26:09,740 be curbed if all the individuals looked like this 719 00:26:09,740 --> 00:26:12,020 and looked all the same. 720 00:26:12,020 --> 00:26:16,550 So there must be something about the neural mechanisms that's 721 00:26:16,550 --> 00:26:17,420 very precise.
722 00:26:17,420 --> 00:26:19,378 So what's the neural basis of face recognition? 723 00:26:19,378 --> 00:26:21,470 So the story really starts with Charles Gross 724 00:26:21,470 --> 00:26:23,660 many years back in the late '60s, early '70s. 725 00:26:23,660 --> 00:26:25,970 He was recording from the inferotemporal cortex. 726 00:26:25,970 --> 00:26:28,310 He was showing pictures of monkey faces, 727 00:26:28,310 --> 00:26:30,100 other social stimuli like the monkey hand, 728 00:26:30,100 --> 00:26:31,670 he would scramble the face, and then 729 00:26:31,670 --> 00:26:32,961 look at the responses of cells. 730 00:26:32,961 --> 00:26:35,370 And he was the first to find a face selective neuron. 731 00:26:35,370 --> 00:26:36,110 Here's one. 732 00:26:36,110 --> 00:26:38,193 So these vertical lines are the action potentials 733 00:26:38,193 --> 00:26:39,014 the cell is firing. 734 00:26:39,014 --> 00:26:40,930 This is the period of time the face was shown. 735 00:26:40,930 --> 00:26:43,160 And this is the period of time the control object, the hand, 736 00:26:43,160 --> 00:26:43,910 was shown. 737 00:26:43,910 --> 00:26:46,010 And you can see the cell's responding selectively 738 00:26:46,010 --> 00:26:47,960 to the face and not to the hand. 739 00:26:47,960 --> 00:26:49,837 So this was a very nice finding. 740 00:26:49,837 --> 00:26:51,920 It actually took him some time to convince himself 741 00:26:51,920 --> 00:26:53,420 that he could publish it because he thought 742 00:26:53,420 --> 00:26:54,586 people would not believe it. 743 00:26:54,586 --> 00:26:57,590 It's a recording in an anaesthetised animal. 744 00:26:57,590 --> 00:27:00,686 But luckily, he did publish it many, many years later. 745 00:27:00,686 --> 00:27:02,060 And so this is the first evidence 746 00:27:02,060 --> 00:27:04,311 that there is a specialization in the brain for faces.
747 00:27:04,311 --> 00:27:06,310 He found many other cells that like other things 748 00:27:06,310 --> 00:27:07,980 than faces and I think people 749 00:27:07,980 --> 00:27:09,920 thought that they were intermingled 750 00:27:09,920 --> 00:27:10,780 with other objects. 751 00:27:10,780 --> 00:27:11,654 So this was the view. 752 00:27:11,654 --> 00:27:13,654 This is the side view of the macaque brain. 753 00:27:13,654 --> 00:27:16,112 This is the superior temporal sulcus, the one big sulcus 754 00:27:16,112 --> 00:27:17,169 in the monkey brain. 755 00:27:17,169 --> 00:27:18,710 And all these symbols here are indicating 756 00:27:18,710 --> 00:27:20,750 positions where people found face selective neurons. 757 00:27:20,750 --> 00:27:22,730 The thought was they were intermingled with object 758 00:27:22,730 --> 00:27:23,870 recognition hardware. 759 00:27:23,870 --> 00:27:25,286 It's basically the view that there 760 00:27:25,286 --> 00:27:27,162 is a big IT cortex where everything in object 761 00:27:27,162 --> 00:27:28,120 recognition can happen. 762 00:27:28,120 --> 00:27:29,090 And yes, of course you would have 763 00:27:29,090 --> 00:27:30,230 some cells that are face selective, 764 00:27:30,230 --> 00:27:32,521 you would have other cells that are non-face selective. 765 00:27:32,521 --> 00:27:35,130 And the mixture and the complex pattern of activity 766 00:27:35,130 --> 00:27:38,050 really is what gives you the identity of the object. 767 00:27:38,050 --> 00:27:42,460 Then Nancy used fMRI to discover face selective areas. 768 00:27:42,460 --> 00:27:44,210 So first, these are views from face areas. 769 00:27:44,210 --> 00:27:47,060 We now know multiple face areas that she was talking about 770 00:27:47,060 --> 00:27:47,920 before. 771 00:27:47,920 --> 00:27:50,366 So here are different slices through this.
772 00:27:50,366 --> 00:27:51,740 And the thought from these images 773 00:27:51,740 --> 00:27:54,680 really was that, no, that maybe within this large expanse 774 00:27:54,680 --> 00:27:56,690 of object recognition hardware there 775 00:27:56,690 --> 00:27:59,870 might be very specialized regions that are really there 776 00:27:59,870 --> 00:28:02,330 selectively to process faces. 777 00:28:02,330 --> 00:28:04,970 And so you give the FFA, and so the question you would ask is, 778 00:28:04,970 --> 00:28:07,939 is this really a region that's devoted to face processing 779 00:28:07,939 --> 00:28:08,980 and face processing only? 780 00:28:11,760 --> 00:28:13,560 Are these regions really face processing 781 00:28:13,560 --> 00:28:15,981 modules devoted to face processing 782 00:28:15,981 --> 00:28:17,480 or is it just the tip of the iceberg 783 00:28:17,480 --> 00:28:18,740 based on your statistical analysis 784 00:28:18,740 --> 00:28:20,630 that this region just looks a little bit more face selective 785 00:28:20,630 --> 00:28:22,250 than the neighboring regions? 786 00:28:22,250 --> 00:28:24,710 And second, do monkeys also have these localized face 787 00:28:24,710 --> 00:28:25,460 areas like humans? 788 00:28:25,460 --> 00:28:26,876 And you've got the answer already. 789 00:28:26,876 --> 00:28:28,075 Yes, they do. 790 00:28:28,075 --> 00:28:29,450 And then what is the distribution 791 00:28:29,450 --> 00:28:32,660 of cells within these regions versus outside? 792 00:28:32,660 --> 00:28:35,152 So this is really the research Doris Tsao 793 00:28:35,152 --> 00:28:37,680 and I engaged in many years ago. 794 00:28:37,680 --> 00:28:39,680 We used fMRI on macaque monkeys, same technology 795 00:28:39,680 --> 00:28:42,050 as in humans, slightly different coils. 796 00:28:42,050 --> 00:28:43,550 And this is the picture that we got. 797 00:28:43,550 --> 00:28:45,754 Very consistently across different animals.
798 00:28:45,754 --> 00:28:47,420 Here in the temporal lobe, you have six 799 00:28:47,420 --> 00:28:50,500 face selective regions that you find 800 00:28:50,500 --> 00:28:52,550 at anatomically specific locations. There's 801 00:28:52,550 --> 00:28:54,830 some variation from one individual to the other. 802 00:28:54,830 --> 00:28:56,610 But with the exception of the most posterior area, 803 00:28:56,610 --> 00:28:57,984 you actually find all these areas 804 00:28:57,984 --> 00:29:00,015 in all individuals on both hemispheres. 805 00:29:00,015 --> 00:29:02,390 There are also three areas in the prefrontal cortex which 806 00:29:02,390 --> 00:29:03,742 are a little harder to find. 807 00:29:03,742 --> 00:29:05,200 But the one in orbitofrontal cortex 808 00:29:05,200 --> 00:29:07,040 is actually as reproducible as the one 809 00:29:07,040 --> 00:29:08,370 in the temporal lobe. 810 00:29:08,370 --> 00:29:09,624 So, yes there are. 811 00:29:09,624 --> 00:29:11,540 Monkeys have localized face areas like humans. 812 00:29:11,540 --> 00:29:13,250 And as Nancy was alluding to, we actually 813 00:29:13,250 --> 00:29:14,420 have quite a bit of evidence by now 814 00:29:14,420 --> 00:29:16,250 that these systems might be homologous. 815 00:29:16,250 --> 00:29:18,722 Very, very difficult to prove that they are homologous. 816 00:29:18,722 --> 00:29:20,180 But all the evidence we have so far 817 00:29:20,180 --> 00:29:22,910 is really pointing in this direction. 818 00:29:22,910 --> 00:29:24,710 So how selective are these face patches? 819 00:29:24,710 --> 00:29:27,470 And what Doris and I did was to lower recording electrodes 820 00:29:27,470 --> 00:29:30,350 into these face areas and record from cells 821 00:29:30,350 --> 00:29:32,494 inside these fMRI identified areas. 822 00:29:32,494 --> 00:29:34,160 And I'm going to show you a movie of one 823 00:29:34,160 --> 00:29:37,137 of the first cells we recorded from one of these regions.
824 00:29:37,137 --> 00:29:39,470 So it's actually a video we took from a control monitor. 825 00:29:39,470 --> 00:29:41,510 So it shows the same thing the monkey saw. 826 00:29:41,510 --> 00:29:44,400 The quality is not great because it's an actual video camera 827 00:29:44,400 --> 00:29:45,692 we took to take this image. 828 00:29:45,692 --> 00:29:47,150 In addition to what the animal saw, 829 00:29:47,150 --> 00:29:48,608 you will also see this black square 830 00:29:48,608 --> 00:29:51,110 which is indicating where the animal looked on the screen 831 00:29:51,110 --> 00:29:53,219 but the animal did not see this. 832 00:29:53,219 --> 00:29:55,760 And you're going to hear clicks if everything works fine when 833 00:29:55,760 --> 00:29:58,334 an action potential is fired. 834 00:29:58,334 --> 00:29:59,750 Anyway, here's the quantification. 835 00:29:59,750 --> 00:30:04,500 So with 96 different stimuli in this image set, 16 faces and 80 836 00:30:04,500 --> 00:30:07,010 non-face stimuli, this is the average response 837 00:30:07,010 --> 00:30:09,282 which is normalized between minus 1 and 1. 838 00:30:09,282 --> 00:30:10,990 And you can see that the biggest response 839 00:30:10,990 --> 00:30:14,277 of this particular cell is of course to the 16 faces 840 00:30:14,277 --> 00:30:15,860 and not to any of the control objects. 841 00:30:15,860 --> 00:30:18,200 You can see though that there are some stimuli here 842 00:30:18,200 --> 00:30:19,700 in the gadget category, for example, 843 00:30:19,700 --> 00:30:21,991 that are eliciting responses that are quite respectable 844 00:30:21,991 --> 00:30:23,090 relative to the faces. 845 00:30:23,090 --> 00:30:25,840 But really the biggest responses recorded were to the faces.
846 00:30:25,840 --> 00:30:28,270 So then we color coded this so you have a response 847 00:30:28,270 --> 00:30:31,060 vector of the cell, where red is now symbolizing response 848 00:30:31,060 --> 00:30:33,070 enhancement and blue is symbolizing response 849 00:30:33,070 --> 00:30:34,427 suppression below baseline. 850 00:30:34,427 --> 00:30:36,010 And the advantage of using this format 851 00:30:36,010 --> 00:30:37,810 is you can now stack all the responses you 852 00:30:37,810 --> 00:30:39,226 get from all the cells that you're 853 00:30:39,226 --> 00:30:42,040 recording day after day after day from this one face area. 854 00:30:42,040 --> 00:30:44,240 And you get a population response matrix. 855 00:30:44,240 --> 00:30:46,700 And the way that this works is cell numbers 856 00:30:46,700 --> 00:30:49,300 are organized from top to bottom, image number from left 857 00:30:49,300 --> 00:30:50,170 to right. 858 00:30:50,170 --> 00:30:52,900 And you can see very quickly that most of the cells 859 00:30:52,900 --> 00:30:55,930 here are either selectively enhanced or selectively 860 00:30:55,930 --> 00:30:57,052 suppressed by faces. 861 00:30:57,052 --> 00:30:58,510 There's a small group here in between, 862 00:30:58,510 --> 00:31:00,520 something like 10% of the cells, where it's not 863 00:31:00,520 --> 00:31:01,780 so clear what they are doing. 864 00:31:01,780 --> 00:31:03,747 But if you do the population average, 865 00:31:03,747 --> 00:31:05,830 you can see much bigger responses to all the faces 866 00:31:05,830 --> 00:31:07,870 than to non-face objects. 867 00:31:07,870 --> 00:31:10,570 If you look more closely at which pictures are eliciting 868 00:31:10,570 --> 00:31:12,740 these intermediate responses, these 869 00:31:12,740 --> 00:31:14,680 are like clock faces, apples, pears, 870 00:31:14,680 --> 00:31:17,050 these are things that have physical properties in common 871 00:31:17,050 --> 00:31:17,794 with faces.
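The response vector and population response matrix described above can be illustrated with a minimal sketch. This is not the analysis code from the study: the normalization rule (baseline subtraction, then scaling by the largest absolute deviation), the function names, and the toy firing rates are all assumptions made for illustration.

```python
def normalize_response(rates, baseline):
    """Baseline-subtract one cell's mean rates and scale them into [-1, 1].

    Assumption: scaling by the largest absolute deviation from baseline.
    Positive values mean enhancement, negative values mean suppression.
    """
    deltas = [r - baseline for r in rates]
    peak = max(abs(d) for d in deltas)
    if peak == 0:
        return [0.0 for _ in deltas]
    return [d / peak for d in deltas]


def population_matrix(cells):
    """Stack response vectors: rows are cells, columns are images."""
    return [normalize_response(rates, baseline) for rates, baseline in cells]


def population_average(matrix):
    """Average the normalized responses across cells for each image."""
    n_cells = len(matrix)
    return [sum(column) / n_cells for column in zip(*matrix)]


# Toy data (hypothetical): 3 cells x 4 images; images 0 and 1 are faces.
cells = [
    ([30.0, 25.0, 12.0, 10.0], 10.0),  # enhanced by faces
    ([2.0, 4.0, 10.0, 9.0], 10.0),     # suppressed by faces
    ([40.0, 20.0, 5.0, 0.0], 0.0),     # enhanced by faces
]
matrix = population_matrix(cells)
average = population_average(matrix)
print(matrix)
print(average)
```

Each row of `matrix` corresponds to one color coded response vector, and `average` corresponds to the population average across cells.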
872 00:31:17,794 --> 00:31:19,210 So you can kind of fool the system 873 00:31:19,210 --> 00:31:20,950 to give a partial response. 874 00:31:20,950 --> 00:31:23,200 And this is one clue to what this area might be doing. 875 00:31:23,200 --> 00:31:24,699 It should be doing a visual analysis 876 00:31:24,699 --> 00:31:26,590 of the incoming stimuli to try and figure out 877 00:31:26,590 --> 00:31:29,810 if these are faces or not. 878 00:31:29,810 --> 00:31:32,071 So these are cells in the middle face patches. 879 00:31:32,071 --> 00:31:33,820 I was actually going over this pretty fast 880 00:31:33,820 --> 00:31:36,200 but I'm going to use this later quite a bit. 881 00:31:36,200 --> 00:31:37,450 And so let's wind back. 882 00:31:37,450 --> 00:31:41,710 We have one posterior area here, two middle face areas, 883 00:31:41,710 --> 00:31:44,290 the middle face patches, and then three anterior ones. 884 00:31:44,290 --> 00:31:46,450 I'm mostly going to talk about this one here, AL, 885 00:31:46,450 --> 00:31:48,210 and this one here, AM, in addition to the middle face 886 00:31:48,210 --> 00:31:48,710 patches. 887 00:31:54,160 --> 00:31:57,740 So we think that actually this is another automatic face 888 00:31:57,740 --> 00:31:59,170 recognition feat. 889 00:31:59,170 --> 00:32:01,880 We can't stop feeling sorry for these peppers. 890 00:32:01,880 --> 00:32:03,830 They've just been cut in half. 891 00:32:03,830 --> 00:32:05,340 And so they seem to be screaming, 892 00:32:05,340 --> 00:32:08,030 and then you know they are OK but still 893 00:32:08,030 --> 00:32:11,420 you feel like something really bad just happened. 894 00:32:11,420 --> 00:32:15,410 And so we can't stop having these inferences about peppers 895 00:32:15,410 --> 00:32:17,150 when they look like faces.
896 00:32:17,150 --> 00:32:19,780 And one reason could be that we have this specialized circuitry 897 00:32:19,780 --> 00:32:21,740 that's just getting active with the right features, 898 00:32:21,740 --> 00:32:24,860 even if you know these are not faces. 899 00:32:24,860 --> 00:32:28,202 OK so when face cells were discovered by Charles Gross, 900 00:32:28,202 --> 00:32:29,910 this really fell on very fertile ground. 901 00:32:29,910 --> 00:32:32,156 And I should just discuss some of the implications. 902 00:32:32,156 --> 00:32:33,530 So David Hubel and Torsten Wiesel 903 00:32:33,530 --> 00:32:36,980 had just discovered orientation selectivity a few years before. 904 00:32:36,980 --> 00:32:39,140 So it was a big jump from early processing 905 00:32:39,140 --> 00:32:41,970 where you could see how selectivity of cells 906 00:32:41,970 --> 00:32:43,550 was getting more complex. 907 00:32:43,550 --> 00:32:45,782 From concentric representations to elongated ones, 908 00:32:45,782 --> 00:32:47,240 from simple cells to complex cells, 909 00:32:47,240 --> 00:32:50,510 where complex cells are as selective as simple cells 910 00:32:50,510 --> 00:32:52,220 but don't really have a specific spatial location. 911 00:32:52,220 --> 00:32:54,620 All the way up to the opposite end of the visual system 912 00:32:54,620 --> 00:32:57,050 and now you find a face selective neuron. 913 00:32:57,050 --> 00:32:59,940 Jerome Lettvin had just coined the term grandmother 914 00:32:59,940 --> 00:33:02,284 neuron which some of you brought up yesterday. 915 00:33:02,284 --> 00:33:04,700 The idea is that there should be one neuron in your brain, 916 00:33:04,700 --> 00:33:07,040 or this is the hypothetical situation he came up with, 917 00:33:07,040 --> 00:33:09,290 one neuron in your brain that's firing if and only 918 00:33:09,290 --> 00:33:11,210 if you see your grandmother, no matter what she's wearing, 919 00:33:11,210 --> 00:33:12,650 which direction you see her from.
920 00:33:12,650 --> 00:33:14,630 That is, the neural correlate of you 921 00:33:14,630 --> 00:33:17,000 perceiving your grandmother is the activity of this one 922 00:33:17,000 --> 00:33:17,750 neuron. 923 00:33:17,750 --> 00:33:19,960 And there were other concepts like Jerzy Konorski's 924 00:33:19,960 --> 00:33:21,852 gnostic unit that made the same point. 925 00:33:21,852 --> 00:33:23,810 Then Horace Barlow came up with this idea that maybe 926 00:33:23,810 --> 00:33:26,360 it's not one cell, but multiple cells. 927 00:33:26,360 --> 00:33:27,920 But he gave us a sparse representation 928 00:33:27,920 --> 00:33:29,990 of pontifical cells, a few of them 929 00:33:29,990 --> 00:33:31,820 at the top of a processing hierarchy. 930 00:33:31,820 --> 00:33:33,612 And that's actually how we recognize faces. 931 00:33:33,612 --> 00:33:36,111 And then of course there's the opposite view of Donald Hebb. 932 00:33:36,111 --> 00:33:37,920 He talked about cell assemblies, where 933 00:33:37,920 --> 00:33:40,760 things are represented by large assemblies of cells. 934 00:33:40,760 --> 00:33:42,860 Or Karl Lashley who talked about mass action 935 00:33:42,860 --> 00:33:44,360 and actually was completely against 936 00:33:44,360 --> 00:33:45,944 functional specialization. 937 00:33:45,944 --> 00:33:47,360 If you look at a plot like this, I 938 00:33:47,360 --> 00:33:49,235 think one of the things I want to emphasize 939 00:33:49,235 --> 00:33:50,990 is that these cells really don't fall 940 00:33:50,990 --> 00:33:52,300 into any of these categories. 941 00:33:52,300 --> 00:33:54,508 You can have cells that are very, very face selective 942 00:33:54,508 --> 00:33:56,270 but they don't have to be very sparse. 943 00:33:56,270 --> 00:33:58,160 They will appear sparse if you probe them over and over 944 00:33:58,160 --> 00:33:59,820 and over with non-face stimuli because they're not 945 00:33:59,820 --> 00:34:01,096 going to respond to those.
946 00:34:01,096 --> 00:34:02,720 But within the domain of faces, they're 947 00:34:02,720 --> 00:34:05,814 going to respond to pretty much all faces. 948 00:34:05,814 --> 00:34:07,980 There are differences between these different cells. 949 00:34:07,980 --> 00:34:09,800 I'm going to come back to that as well. 950 00:34:09,800 --> 00:34:11,550 But it's one example where we can actually 951 00:34:11,550 --> 00:34:14,020 ask these deep questions about what is the neural code 952 00:34:14,020 --> 00:34:15,860 in a quantitative manner by focusing 953 00:34:15,860 --> 00:34:18,987 on the right stimulus and the right place to look at it. 954 00:34:18,987 --> 00:34:21,320 So we have some evidence that monkeys, like humans, have 955 00:34:21,320 --> 00:34:23,510 face regions, and the monkey face 956 00:34:23,510 --> 00:34:26,810 patches appear to be dedicated domain specific modules. 957 00:34:26,810 --> 00:34:28,370 The practical implication of this 958 00:34:28,370 --> 00:34:30,650 is that now we have unprecedented access 959 00:34:30,650 --> 00:34:32,570 to functionally homogeneous populations of cells 960 00:34:32,570 --> 00:34:34,760 coding for one high level object category. 961 00:34:34,760 --> 00:34:36,780 And since we know this category, we can make stimuli. 962 00:34:36,780 --> 00:34:38,690 And we can modify the stimuli sometimes 963 00:34:38,690 --> 00:34:39,698 in parametric fashions. 964 00:34:39,698 --> 00:34:41,239 And so we can have very deep insights 965 00:34:41,239 --> 00:34:43,572 into how these cells actually are processing faces, 966 00:34:43,572 --> 00:34:46,100 how they are extracting properties from these faces. 967 00:34:46,100 --> 00:34:49,130 And we can do causal tests and actually show 968 00:34:49,130 --> 00:34:52,147 whether these cells are involved in face recognition behavior. 969 00:34:52,147 --> 00:34:54,230 And we're just going to go over this very quickly.
970 00:34:54,230 --> 00:34:56,477 This is work of Srivatsun Sadagopan, 971 00:34:56,477 --> 00:34:58,310 he actually gave me this picture of himself. 972 00:34:58,310 --> 00:35:01,280 This combines the front view and a profile view. 973 00:35:01,280 --> 00:35:02,322 The logic is very simple. 974 00:35:02,322 --> 00:35:04,780 So we wanted to inactivate one particular region, ML. 975 00:35:04,780 --> 00:35:06,920 I'm going to tell you in a second why ML. 976 00:35:06,920 --> 00:35:08,780 Meanwhile the monkey would be engaged in a task 977 00:35:08,780 --> 00:35:11,071 like this where it has to find a face in a visual scene. 978 00:35:11,071 --> 00:35:12,634 The visual scene that we constructed 979 00:35:12,634 --> 00:35:13,550 looks a bit like this. 980 00:35:13,550 --> 00:35:15,216 It's displayed on a touch screen monitor 981 00:35:15,216 --> 00:35:16,760 so the animal's free to move around. 982 00:35:16,760 --> 00:35:18,259 It has to find the face in the scene 983 00:35:18,259 --> 00:35:20,270 and the scene is composed of a pink noise 984 00:35:20,270 --> 00:35:22,926 background in which 24 different objects are embedded. 985 00:35:22,926 --> 00:35:24,800 And the target object, in this case the face, 986 00:35:24,800 --> 00:35:26,480 is going to be varying in visibility 987 00:35:26,480 --> 00:35:28,460 across 10 different levels. 988 00:35:28,460 --> 00:35:31,657 We would have other tasks where the monkey body was included, 989 00:35:31,657 --> 00:35:33,240 which you will not be able to see here 990 00:35:33,240 --> 00:35:35,360 but there is a monkey body here, or the monkey 991 00:35:35,360 --> 00:35:37,580 was looking for a shoe. 992 00:35:37,580 --> 00:35:39,590 Then we would infuse muscimol which 993 00:35:39,590 --> 00:35:42,407 is a pharmacological agent that is inactivating cells 994 00:35:42,407 --> 00:35:44,240 along with a contrast agent, gadolinium, which 995 00:35:44,240 --> 00:35:45,472 you can measure in MRI.
996 00:35:45,472 --> 00:35:47,680 And then this yellow region here is the face area now 997 00:35:47,680 --> 00:35:49,763 and this white region is the actual injection site 998 00:35:49,763 --> 00:35:50,780 that we used. 999 00:35:50,780 --> 00:35:53,460 And so this gives us a way for every experiment to control, 1000 00:35:53,460 --> 00:35:55,580 are we inside the face area or are we outside. 1001 00:35:55,580 --> 00:35:58,570 And we can use the outside injections as controls. 1002 00:35:58,570 --> 00:36:00,220 What's been found is shown here. 1003 00:36:00,220 --> 00:36:01,900 So we have a psychometric curve. 1004 00:36:01,900 --> 00:36:04,150 So in normal behavior you're getting better and better 1005 00:36:04,150 --> 00:36:09,770 at finding the face in the scene as you increase its visibility. 1006 00:36:09,770 --> 00:36:12,500 If you inactivate, you get a reduced face detection 1007 00:36:12,500 --> 00:36:14,180 behavior. 1008 00:36:14,180 --> 00:36:18,290 I should emphasize that we are only inactivating one face area out 1009 00:36:18,290 --> 00:36:19,610 of 12 in the temporal lobe. 1010 00:36:19,610 --> 00:36:21,351 We are only inactivating one hemisphere. 1011 00:36:21,351 --> 00:36:22,850 And as Jim DiCarlo was emphasizing 1012 00:36:22,850 --> 00:36:25,590 yesterday, there is retinotopy at this level of processing. 1013 00:36:25,590 --> 00:36:27,710 So the animal can actually use a scanning strategy 1014 00:36:27,710 --> 00:36:31,280 to go one direction and overcome this deficit. 1015 00:36:31,280 --> 00:36:34,132 And so likely this effect would be much stronger 1016 00:36:34,132 --> 00:36:35,840 if we had inactivated on both hemispheres 1017 00:36:35,840 --> 00:36:38,260 or had controlled precisely for the eye movement. 1018 00:36:38,260 --> 00:36:40,250 But we are here for natural behavior. 1019 00:36:40,250 --> 00:36:41,790 And for the controls of bodies and shoes, 1020 00:36:41,790 --> 00:36:43,040 there is no effect there.
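A psychometric curve of the kind described above is the fraction of trials on which the target was found at each visibility level. The sketch below is only an illustration under stated assumptions: the trial format, the function name, and the toy trial counts are hypothetical and are not data from the experiment.

```python
def psychometric_curve(trials, n_levels=10):
    """Fraction of trials with the face found, per visibility level.

    trials: iterable of (visibility_level, found_face) pairs, with
    visibility_level in 0..n_levels-1 (hypothetical trial format).
    """
    hits = [0] * n_levels
    totals = [0] * n_levels
    for level, found in trials:
        totals[level] += 1
        if found:
            hits[level] += 1
    return [hits[k] / totals[k] if totals[k] else None for k in range(n_levels)]


def toy_trials(hits_per_level, n_per_level=20):
    """Build made-up trials with a fixed number of hits at each level."""
    trials = []
    for level, n_hits in enumerate(hits_per_level):
        trials += [(level, True)] * n_hits
        trials += [(level, False)] * (n_per_level - n_hits)
    return trials


# Made-up numbers: performance rises with visibility in both conditions,
# and is lower after inactivation.
control = psychometric_curve(toy_trials([2 * k + 2 for k in range(10)]))
inactivated = psychometric_curve(toy_trials([max(0, 2 * k - 2) for k in range(10)]))
print(control)
print(inactivated)
```

Comparing the two curves level by level is one simple way to express a reduction in detection performance.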
1021 00:36:43,040 --> 00:36:45,510 We put lots of controls and we went to the next state 1022 00:36:45,510 --> 00:36:46,570 of the behavior. 1023 00:36:46,570 --> 00:36:48,480 We did the injections outside, as I mentioned. 1024 00:36:48,480 --> 00:36:51,200 It's very specific for inactivation inside the face 1025 00:36:51,200 --> 00:36:54,970 area that the most basic of face recognition abilities, face 1026 00:36:54,970 --> 00:36:57,250 detection, is impaired. 1027 00:36:57,250 --> 00:36:59,422 And so there's a way you might visualize it. 1028 00:36:59,422 --> 00:37:00,880 So one way to explain this behavior 1029 00:37:00,880 --> 00:37:03,070 would be that the visibility of a face like this 1030 00:37:03,070 --> 00:37:04,552 would actually, with inactivation, 1031 00:37:04,552 --> 00:37:06,010 look something like this where it's 1032 00:37:06,010 --> 00:37:09,867 going to be harder to detect. 1033 00:37:09,867 --> 00:37:11,700 The second way we can take advantage of this 1034 00:37:11,700 --> 00:37:13,230 is that we have now access to individual cells. 1035 00:37:13,230 --> 00:37:14,813 We can actually ask more precise questions 1036 00:37:14,813 --> 00:37:16,650 about how they're processing faces. 1037 00:37:16,650 --> 00:37:17,880 And actually this inactivation study 1038 00:37:17,880 --> 00:37:20,421 was motivated by earlier work we had done on selectivity 1039 00:37:20,421 --> 00:37:22,380 of these cells for features that should 1040 00:37:22,380 --> 00:37:24,780 be relevant for face detection. 1041 00:37:24,780 --> 00:37:26,280 This is what Shay Ohayon did when he 1042 00:37:26,280 --> 00:37:27,690 was a grad student with Doris. 1043 00:37:27,690 --> 00:37:30,420 He's actually now a post-doc with Jim. 1044 00:37:30,420 --> 00:37:33,870 And again, the question is how can you detect faces even 1045 00:37:33,870 --> 00:37:35,980 when the lighting conditions are very difficult.
1046 00:37:35,980 --> 00:37:37,896 There's beautiful work from Pawan Sinha that's 1047 00:37:37,896 --> 00:37:39,719 emphasizing that coarse contrast 1048 00:37:39,719 --> 00:37:41,760 relationships in the face are very good heuristics 1049 00:37:41,760 --> 00:37:42,730 to do that. 1050 00:37:42,730 --> 00:37:44,790 The reason is that the 3D structure of our face 1051 00:37:44,790 --> 00:37:46,956 stays the same even when the lighting conditions are 1052 00:37:46,956 --> 00:37:47,490 changing. 1053 00:37:47,490 --> 00:37:49,530 And so no matter where light is shining from, 1054 00:37:49,530 --> 00:37:51,420 typically the eye regions, because they're 1055 00:37:51,420 --> 00:37:53,400 receded relative to nose and forehead, 1056 00:37:53,400 --> 00:37:54,817 are darker than nose and forehead. 1057 00:37:54,817 --> 00:37:56,775 And so he found that in the human psychophysics 1058 00:37:56,775 --> 00:37:59,010 you have 12 heuristics like this, forehead brighter 1059 00:37:59,010 --> 00:38:00,690 than left eye, forehead brighter than right eye, 1060 00:38:00,690 --> 00:38:02,731 nose brighter than mouth, and so on and so forth. 1061 00:38:02,731 --> 00:38:04,680 Twelve of these heuristics that actually 1062 00:38:04,680 --> 00:38:06,580 together can allow you to detect the face. 1063 00:38:06,580 --> 00:38:08,880 And in fact, your face detector on your cell phone 1064 00:38:08,880 --> 00:38:10,749 is using a very similar strategy as well. 1065 00:38:10,749 --> 00:38:12,790 Trying to find the coarse contrast relationships 1066 00:38:12,790 --> 00:38:14,100 in the scene. 1067 00:38:14,100 --> 00:38:16,680 So what Shay did was he started with a real face, 1068 00:38:16,680 --> 00:38:19,050 he would parse it into 11 parts, and then 1069 00:38:19,050 --> 00:38:20,940 randomly assign 11 different luminance 1070 00:38:20,940 --> 00:38:25,230 values to these 11 different parts and change this rapidly.
1071 00:38:25,230 --> 00:38:27,149 And then the analysis would look like this. 1072 00:38:27,149 --> 00:38:29,190 So no matter what the overall pattern looks like, 1073 00:38:29,190 --> 00:38:31,380 he's going to look for a particular contrast 1074 00:38:31,380 --> 00:38:33,644 relationship, like the forehead versus the left eye. 1075 00:38:33,644 --> 00:38:36,060 And he's going to ask is the neuron responding differently 1076 00:38:36,060 --> 00:38:38,143 in these conditions where the forehead is brighter 1077 00:38:38,143 --> 00:38:40,170 than the left eye versus in these conditions 1078 00:38:40,170 --> 00:38:42,990 where the left eye is brighter than the forehead. 1079 00:38:42,990 --> 00:38:46,650 And you can do this for all pairwise combinations 1080 00:38:46,650 --> 00:38:48,390 of 11 different parts. 1081 00:38:48,390 --> 00:38:50,940 And in these 55 different combinations, 1082 00:38:50,940 --> 00:38:52,920 we can mark by arrows the prediction 1083 00:38:52,920 --> 00:38:54,430 from human psychophysics. 1084 00:38:54,430 --> 00:38:57,270 So human psychophysics told us 12 of these contrast pairs 1085 00:38:57,270 --> 00:38:58,446 are going to be important. 1086 00:38:58,446 --> 00:38:59,820 It also told us what the polarity 1087 00:38:59,820 --> 00:39:02,327 was that was going to be important for detecting a face. 1088 00:39:02,327 --> 00:39:04,410 Again, forehead brighter than eye, or eye brighter 1089 00:39:04,410 --> 00:39:06,380 than forehead. 1090 00:39:06,380 --> 00:39:08,800 OK and this is what we actually found. 1091 00:39:08,800 --> 00:39:11,040 So what this shows is a population diagram 1092 00:39:11,040 --> 00:39:14,370 of all the cells that Shay found, and it was half of the cells 1093 00:39:14,370 --> 00:39:17,120 that he recorded from that showed some selectivity 1094 00:39:17,120 --> 00:39:19,050 to some of these contrast features 1095 00:39:19,050 --> 00:39:21,060 and to some of these contrast polarities.
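The pairwise contrast analysis described above can be sketched as follows. With 11 parts there are 11 x 10 / 2 = 55 unordered pairs, and for each pair the trials are split by which part was brighter. This is a toy illustration, not the study's code: the simulated cell, the part indices (0 standing in for the forehead, 1 for the left eye), and the use of a simple difference of means instead of a statistical test are all assumptions.

```python
import random
from itertools import combinations

N_PARTS = 11  # the face was parsed into 11 parts


def part_pairs(n_parts=N_PARTS):
    """All unordered pairs of parts: 11 choose 2 = 55."""
    return list(combinations(range(n_parts), 2))


def polarity_preference(stimuli, responses, a, b):
    """Mean response when part a is brighter than part b, minus the mean
    response when b is brighter than a, ignoring all other parts."""
    a_brighter = [r for s, r in zip(stimuli, responses) if s[a] > s[b]]
    b_brighter = [r for s, r in zip(stimuli, responses) if s[b] > s[a]]
    if not a_brighter or not b_brighter:
        return 0.0
    return sum(a_brighter) / len(a_brighter) - sum(b_brighter) / len(b_brighter)


rng = random.Random(0)
# Each stimulus assigns 11 distinct luminance levels to the 11 parts at random.
stimuli = [rng.sample(range(N_PARTS), N_PARTS) for _ in range(2000)]
# Simulated cell (hypothetical): fires more when part 0 is brighter than part 1.
responses = [10.0 + (5.0 if s[0] > s[1] else 0.0) + rng.gauss(0, 1) for s in stimuli]

pairs = part_pairs()
prefs = {pair: polarity_preference(stimuli, responses, *pair) for pair in pairs}
best_pair = max(prefs, key=lambda p: abs(prefs[p]))
print(len(pairs), best_pair, round(prefs[best_pair], 2))
```

The sign of each entry in `prefs` gives the preferred polarity for that pair. A real analysis would also need a significance test per pair before counting a cell as selective.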
1096 00:39:21,060 --> 00:39:23,220 What's plotted here upwards is, for one contrast 1097 00:39:23,220 --> 00:39:26,220 polarity and one particular contrast pair, 1098 00:39:26,220 --> 00:39:28,622 the number of cells he found that were selective. 1099 00:39:28,622 --> 00:39:30,330 And as you go through the entire diagram, 1100 00:39:30,330 --> 00:39:33,061 you see that there are only a very few examples of contrast 1101 00:39:33,061 --> 00:39:34,560 pairs where actually different cells 1102 00:39:34,560 --> 00:39:36,480 like different polarities. 1103 00:39:36,480 --> 00:39:38,460 Like here for example you have more than 60, 1104 00:39:38,460 --> 00:39:40,920 70 cells that all like one polarity and not 1105 00:39:40,920 --> 00:39:43,110 a single one that likes the opposite polarity. 1106 00:39:43,110 --> 00:39:45,190 And this is true for all these polarities. 1107 00:39:45,190 --> 00:39:47,040 So it's a very consistent pattern. 1108 00:39:47,040 --> 00:39:50,700 Second, we can explain all the human psychophysics preferences 1109 00:39:50,700 --> 00:39:51,890 here. 1110 00:39:51,890 --> 00:39:53,970 Not only are these important dimensions 1111 00:39:53,970 --> 00:39:54,979 for these cells, 1112 00:39:54,979 --> 00:39:56,520 but in all these cases the cells care 1113 00:39:56,520 --> 00:39:58,936 for the exact same polarity that you would have predicted 1114 00:39:58,936 --> 00:40:00,000 from human psychophysics. 1115 00:40:00,000 --> 00:40:02,550 In addition, there are other contrast pairs that apparently 1116 00:40:02,550 --> 00:40:04,470 don't matter this much in human psychophysics, 1117 00:40:04,470 --> 00:40:06,300 but that these cells also care about. 1118 00:40:06,300 --> 00:40:08,750 So they seem to be using these coarse contrast features. 1119 00:40:08,750 --> 00:40:10,834 And again, they're very useful for face detection.
1120 00:40:10,834 --> 00:40:13,333 Now we have the behavior, so we know that the area is involved 1121 00:40:13,333 --> 00:40:14,970 in face detection with stimuli 1122 00:40:14,970 --> 00:40:17,449 where it's actually hard to make out the detail. 1123 00:40:17,449 --> 00:40:18,990 Then Shay did a control and I thought 1124 00:40:18,990 --> 00:40:21,180 this was really the coolest thing ever. 1125 00:40:21,180 --> 00:40:22,930 He's a computer scientist and so of course 1126 00:40:22,930 --> 00:40:24,990 he knew about databases and how to use them. 1127 00:40:24,990 --> 00:40:26,531 And he would say, OK, can we actually 1128 00:40:26,531 --> 00:40:30,390 fool these cells into responding to non-face stimuli that 1129 00:40:30,390 --> 00:40:33,540 comply with the rules of the coarse contrasts of faces? 1130 00:40:33,540 --> 00:40:36,630 So here are some examples. 1131 00:40:36,630 --> 00:40:38,460 This is just a pattern where there 1132 00:40:38,460 --> 00:40:40,810 are some dark regions where in a human face the eyes might be 1133 00:40:40,810 --> 00:40:41,540 and so on and so forth. 1134 00:40:41,540 --> 00:40:43,498 This is a pattern that only has one of these 12 1135 00:40:43,498 --> 00:40:44,620 contrasts correct. 1136 00:40:44,620 --> 00:40:46,650 But also in human faces, you can find some-- 1137 00:40:46,650 --> 00:40:48,732 when a person is smiling, wearing glasses, 1138 00:40:48,732 --> 00:40:50,190 most of the contrasts that should be there 1139 00:40:50,190 --> 00:40:52,645 are actually not in the face. 1140 00:40:52,645 --> 00:40:54,990 So there are some very contrast correct faces and there 1141 00:40:54,990 --> 00:40:57,962 are some faces that are not very contrast correct.
1142 00:40:57,962 --> 00:40:59,670 And now you can ask how does the response 1143 00:40:59,670 --> 00:41:02,490 of the cells in the middle face patch change as you are 1144 00:41:02,490 --> 00:41:04,500 increasing the number of correct contrasts, 1145 00:41:04,500 --> 00:41:07,092 either in the face or non-face stimuli. 1146 00:41:07,092 --> 00:41:08,050 And the answer is this. 1147 00:41:08,050 --> 00:41:10,620 If you increase the number of correct contrasts in the face, 1148 00:41:10,620 --> 00:41:12,294 the cells respond more and more. 1149 00:41:12,294 --> 00:41:14,210 If you change the contrasts of non-face objects 1150 00:41:14,210 --> 00:41:15,402 the cells don't care. 1151 00:41:15,402 --> 00:41:17,360 So there is something else than coarse contrast 1152 00:41:17,360 --> 00:41:19,235 that the cells care about. They're not easily 1153 00:41:19,235 --> 00:41:21,540 fooled to respond to things that clearly 1154 00:41:21,540 --> 00:41:25,890 aren't faces when the coarse contrasts are correct. 1155 00:41:25,890 --> 00:41:27,914 And we could actually have predicted that something 1156 00:41:27,914 --> 00:41:29,830 like this should happen from an earlier study. 1157 00:41:29,830 --> 00:41:31,350 So the first study we did where we 1158 00:41:31,350 --> 00:41:33,810 took advantage of the fact that in a face selective area 1159 00:41:33,810 --> 00:41:36,120 we can record from face cells over and over again 1160 00:41:36,120 --> 00:41:37,840 and that they have similar properties 1161 00:41:37,840 --> 00:41:40,440 was a study where we looked at the effect of part and whole. 1162 00:41:40,440 --> 00:41:42,975 One of the central features in the psychophysics 1163 00:41:42,975 --> 00:41:45,930 of human faces is that you can get information from the face 1164 00:41:45,930 --> 00:41:48,600 without any detail just from the gist of the face. 1165 00:41:48,600 --> 00:41:50,940 An example is again from Pawan Sinha.
1166 00:41:50,940 --> 00:41:54,270 If you have a blurred face of a familiar individual, 1167 00:41:54,270 --> 00:41:57,391 like some of you might recognize Woody Allen here, 1168 00:41:57,391 --> 00:41:59,640 of course with the glasses it's a little bit cheating, 1169 00:41:59,640 --> 00:42:01,290 but you can recognize him. 1170 00:42:01,290 --> 00:42:02,970 And the other examples in the study 1171 00:42:02,970 --> 00:42:04,530 were people who don't wear glasses, 1172 00:42:04,530 --> 00:42:06,630 so you can recognize a famous face just from the gist of it. 1173 00:42:06,630 --> 00:42:07,880 You don't need the details. 1174 00:42:07,880 --> 00:42:09,630 On the other hand, we can process details. 1175 00:42:09,630 --> 00:42:11,310 We can focus on details. 1176 00:42:11,310 --> 00:42:14,880 And so how do these two things relate to each other? 1177 00:42:14,880 --> 00:42:18,360 What we did was we constructed a face space, 1178 00:42:18,360 --> 00:42:21,210 a cartoon face space, based on very, very simple geometric 1179 00:42:21,210 --> 00:42:21,930 shapes. 1180 00:42:21,930 --> 00:42:24,310 So these faces are just made out of ovals, and triangles, 1181 00:42:24,310 --> 00:42:25,080 and lines. 1182 00:42:25,080 --> 00:42:27,760 So very simple geometric shapes. 1183 00:42:27,760 --> 00:42:30,390 But if they are put together they actually look like faces. 1184 00:42:30,390 --> 00:42:32,139 Now we can parameterize this face space, 1185 00:42:32,139 --> 00:42:33,430 we can vary certain parameters. 1186 00:42:33,430 --> 00:42:37,070 So we had faces that change in aspect ratio. 1187 00:42:37,070 --> 00:42:40,230 They go from Sesame Street character Ernie here to Bert. 1188 00:42:40,230 --> 00:42:44,100 We have pupil size, like no pupil here, very big pupils. 1189 00:42:44,100 --> 00:42:46,060 We have inter-eye distance here.
1190 00:42:46,060 --> 00:42:49,440 So these eyes are close together almost in cyclopean fashion 1191 00:42:49,440 --> 00:42:52,070 or they can be very far apart from each other, 1192 00:42:52,070 --> 00:42:55,450 stretching the outside of the face and so on and so forth. 1193 00:42:55,450 --> 00:42:57,694 And we would now randomly change these features, 1194 00:42:57,694 --> 00:42:59,360 all these different feature dimensions, 1195 00:42:59,360 --> 00:43:01,819 randomly choosing a new value every time we show this face. 1196 00:43:01,819 --> 00:43:03,443 It looks a bit like a cartoon character 1197 00:43:03,443 --> 00:43:04,809 who is trying to talk to you. 1198 00:43:04,809 --> 00:43:06,600 But the way we analyze this is very simple. 1199 00:43:06,600 --> 00:43:09,270 So we just asked, no matter what these other features are, 1200 00:43:09,270 --> 00:43:11,330 for the first feature dimension, does 1201 00:43:11,330 --> 00:43:13,830 the firing of the cell change as we're changing this feature 1202 00:43:13,830 --> 00:43:14,380 dimension. 1203 00:43:14,380 --> 00:43:15,630 Then we asked this for the second dimension, 1204 00:43:15,630 --> 00:43:18,005 the third dimension, and so on and so forth for all these 1205 00:43:18,005 --> 00:43:19,410 slightly different dimensions. 1206 00:43:19,410 --> 00:43:21,840 What we found is shown here for one example cell. 1207 00:43:21,840 --> 00:43:23,954 So we had 19 different tuning curves. 1208 00:43:23,954 --> 00:43:25,620 And of these 19 different tuning curves, 1209 00:43:25,620 --> 00:43:27,422 four are significantly tuned. 1210 00:43:27,422 --> 00:43:29,880 For this particular cell it was face aspect ratio, so it didn't 1211 00:43:29,880 --> 00:43:31,062 like Ernie, it liked Bert. 1212 00:43:31,062 --> 00:43:33,270 It liked the eyes very close together, not far apart. 1213 00:43:33,270 --> 00:43:36,120 It liked the eyes a little bit narrow, not wide.
1214 00:43:36,120 --> 00:43:38,252 And it liked big irises, not small ones. 1215 00:43:38,252 --> 00:43:39,960 What's very typical for how the cells are 1216 00:43:39,960 --> 00:43:42,810 processing these features are these ramp shaped tuning curves. 1217 00:43:42,810 --> 00:43:45,750 So more than 2/3 of the tuning curves have this ramp shape. 1218 00:43:45,750 --> 00:43:48,282 Which means that these cells are relaying the information 1219 00:43:48,282 --> 00:43:50,490 that they are measuring almost in one to one fashion. 1220 00:43:53,500 --> 00:43:55,590 This is not what the cells are actually doing. 1221 00:43:55,590 --> 00:43:57,030 It's just a metaphor. 1222 00:43:57,030 --> 00:43:59,280 But it's almost like they're taking a ruler, measuring 1223 00:43:59,280 --> 00:44:00,696 eye distance, and they're relaying 1224 00:44:00,696 --> 00:44:03,900 this feature in almost one to one fashion in their output. 1225 00:44:03,900 --> 00:44:07,320 Another implication is that most of your coding capacity 1226 00:44:07,320 --> 00:44:09,522 is actually at the extremes because many cells have 1227 00:44:09,522 --> 00:44:11,730 big responses there, many cells have small responses. 1228 00:44:11,730 --> 00:44:13,630 So most of the capacity is there. 1229 00:44:13,630 --> 00:44:15,390 That's the range where caricatures live. 1230 00:44:15,390 --> 00:44:17,940 And oftentimes we are better able to recognize individuals 1231 00:44:17,940 --> 00:44:21,950 based on caricatures than the individuals themselves. 1232 00:44:21,950 --> 00:44:23,640 So the middle face patches, 1233 00:44:23,640 --> 00:44:26,181 they're causally and selectively relevant for face detection. 1234 00:44:26,181 --> 00:44:27,990 The cells are virtually all face selective.
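The feature tuning analysis described above (ask, for one feature dimension at a time and regardless of the other features, whether firing changes with that dimension) can be sketched like this. It is an illustration only: the number of levels per dimension, the simulated cell, and the monotonicity check used as a crude stand-in for "ramp shaped" are assumptions, not details from the talk.

```python
import random

N_DIMS = 19    # the talk mentions 19 tuning curves per cell
N_LEVELS = 11  # number of values per dimension: an assumption for this sketch


def tuning_curve(stimuli, responses, dim, n_levels=N_LEVELS):
    """Mean response at each level of one feature dimension,
    averaged over whatever values the other dimensions took."""
    sums = [0.0] * n_levels
    counts = [0] * n_levels
    for s, r in zip(stimuli, responses):
        sums[s[dim]] += r
        counts[s[dim]] += 1
    return [sums[k] / counts[k] if counts[k] else 0.0 for k in range(n_levels)]


def is_monotonic(curve):
    """Crude ramp check: the curve only rises or only falls."""
    rising = all(x <= y for x, y in zip(curve, curve[1:]))
    falling = all(x >= y for x, y in zip(curve, curve[1:]))
    return rising or falling


rng = random.Random(1)
# Every presentation draws a new random value on every feature dimension.
stimuli = [[rng.randrange(N_LEVELS) for _ in range(N_DIMS)] for _ in range(5000)]
# Simulated cell (hypothetical): firing ramps up with dimension 0 only.
responses = [2.0 * s[0] + rng.gauss(0, 1) for s in stimuli]

curve_tuned = tuning_curve(stimuli, responses, dim=0)
curve_flat = tuning_curve(stimuli, responses, dim=1)
print([round(v, 1) for v in curve_tuned])
print([round(v, 1) for v in curve_flat])
```

For the simulated cell, the curve for dimension 0 climbs steadily from one extreme to the other, while the curve for dimension 1 stays roughly flat, which is the contrast between a tuned and an untuned dimension.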
1235 00:44:27,990 --> 00:44:30,030 Based on these two findings we actually suggest, 1236 00:44:30,030 --> 00:44:33,000 and it's a little like Nancy said, it's a strong signal 1237 00:44:33,000 --> 00:44:35,989 you're putting out, so you get backlash if there is. 1238 00:44:35,989 --> 00:44:37,530 So we do think that there are modules 1239 00:44:37,530 --> 00:44:39,780 that are there for face processing and face processing 1240 00:44:39,780 --> 00:44:40,585 only. 1241 00:44:40,585 --> 00:44:42,960 The gain of the tuning curve is modulated by the presence 1242 00:44:42,960 --> 00:44:44,320 of the entire face. 1243 00:44:44,320 --> 00:44:46,741 There's this ramp shaped tuning, which is very useful. 1244 00:44:46,741 --> 00:44:48,990 The cells are sensitive to contrast relations which is 1245 00:44:48,990 --> 00:44:51,700 very useful for face detection. 1246 00:44:51,700 --> 00:44:55,060 So we can really get mechanistic about understanding 1247 00:44:55,060 --> 00:44:55,860 face recognition. 1248 00:44:55,860 --> 00:44:57,330 It's not just that we can say, OK, these cells 1249 00:44:57,330 --> 00:44:58,830 are responding more to faces but we 1250 00:44:58,830 --> 00:45:00,871 can say why they're responding more to some faces 1251 00:45:00,871 --> 00:45:01,780 than to others. 1252 00:45:01,780 --> 00:45:03,821 In fact, you can predict from the cartoon results 1253 00:45:03,821 --> 00:45:07,200 how the cells are responding to pictures of actual people 1254 00:45:07,200 --> 00:45:09,064 with the very fine details physically. 1255 00:45:09,064 --> 00:45:10,980 And so at the level of the middle face patches 1256 00:45:10,980 --> 00:45:12,060 we already have some of the requirements 1257 00:45:12,060 --> 00:45:13,232 for a face recognition system. 1258 00:45:13,232 --> 00:45:14,940 So we have mechanisms for face detection, 1259 00:45:14,940 --> 00:45:16,860 we have some encoding of facial features, 1260 00:45:16,860 --> 00:45:18,832 and we have encoding of configurations.
1261 00:45:21,810 --> 00:45:23,362 Nancy said I should talk about this. 1262 00:45:23,362 --> 00:45:24,570 I'm going to talk about this. 1263 00:45:24,570 --> 00:45:29,025 Sebastian Moeller was a wonderful grad student with Doris and me. 1264 00:45:29,025 --> 00:45:30,400 He asked the following question. 1265 00:45:30,400 --> 00:45:32,996 So whether the face patches are related to each other or not. 1266 00:45:32,996 --> 00:45:34,620 If you look at the overall organization 1267 00:45:34,620 --> 00:45:37,420 these face areas are very far apart from each other. 1268 00:45:37,420 --> 00:45:39,170 So the most posterior to the most anterior 1269 00:45:39,170 --> 00:45:41,336 is one inch apart, it's a third of the entire extent 1270 00:45:41,336 --> 00:45:42,271 of the primate brain. 1271 00:45:42,271 --> 00:45:44,520 They live in different cytoarchitectonic environments. 1272 00:45:44,520 --> 00:45:46,728 So you could also imagine that maybe the connectivity 1273 00:45:46,728 --> 00:45:47,842 is mostly local. 1274 00:45:47,842 --> 00:45:49,800 On the other hand, they are interested in faces 1275 00:45:49,800 --> 00:45:51,510 and so you would imagine that maybe there 1276 00:45:51,510 --> 00:45:54,084 are specialized connections between them. 1277 00:45:54,084 --> 00:45:56,250 The way we addressed this was with micro stimulation 1278 00:45:56,250 --> 00:45:57,041 inside the scanner. 1279 00:45:57,041 --> 00:45:58,950 So we would first image the face areas. 1280 00:45:58,950 --> 00:46:02,570 We would then lower an electrode to one of the face areas here, 1281 00:46:02,570 --> 00:46:05,220 we record from cells to make sure it's face selective, 1282 00:46:05,220 --> 00:46:07,110 but then we would use this electrode 1283 00:46:07,110 --> 00:46:09,107 to pass a current through inside the scanner. 1284 00:46:09,107 --> 00:46:10,690 Passing a current through an electrode 1285 00:46:10,690 --> 00:46:11,760 is going to activate cells.
1286 00:46:11,760 --> 00:46:14,218 That in turn is going to change blood flow and oxygenation. 1287 00:46:14,218 --> 00:46:16,000 So things we can pick up in the scanner. 1288 00:46:16,000 --> 00:46:18,541 And so yes, if this worked you should get a swath of activity 1289 00:46:18,541 --> 00:46:19,980 around your stimulation site. 1290 00:46:19,980 --> 00:46:21,930 But if these cells at your stimulation site 1291 00:46:21,930 --> 00:46:24,554 have projections that are strong and focal enough to drive downstream 1292 00:46:24,554 --> 00:46:26,670 neurons, you might also find activation 1293 00:46:26,670 --> 00:46:29,190 at spatially remote locations and we can then 1294 00:46:29,190 --> 00:46:32,742 see how these locations are related to the face areas. 1295 00:46:32,742 --> 00:46:34,200 So here is a computer flattened map 1296 00:46:34,200 --> 00:46:36,990 of the brain of one macaque monkey. 1297 00:46:36,990 --> 00:46:39,360 The green areas indicate by outline 1298 00:46:39,360 --> 00:46:41,070 the extent of the face areas. 1299 00:46:41,070 --> 00:46:42,510 We placed our stimulation electrode 1300 00:46:42,510 --> 00:46:43,992 in one of the face areas and this 1301 00:46:43,992 --> 00:46:45,700 is the map we got from micro stimulation, 1302 00:46:45,700 --> 00:46:46,876 versus no micro stimulation. 1303 00:46:46,876 --> 00:46:48,750 So there's no visual stimulus there, actually 1304 00:46:48,750 --> 00:46:51,090 that works during sleep, in complete darkness. 1305 00:46:51,090 --> 00:46:52,560 So yes, we get a swath of activity 1306 00:46:52,560 --> 00:46:54,120 around the stimulation site, and we 1307 00:46:54,120 --> 00:46:57,385 get multiple spatially disjunct regions that are activated. 1308 00:46:57,385 --> 00:46:59,010 So they're strongly driven by the cells 1309 00:46:59,010 --> 00:47:02,464 in this region and these overlap with the face areas.
1310 00:47:02,464 --> 00:47:04,380 And this we found very consistently 1311 00:47:04,380 --> 00:47:06,630 across different face areas. If you stimulated outside 1312 00:47:06,630 --> 00:47:08,760 you also got this patchy pattern of connectivity, 1313 00:47:08,760 --> 00:47:11,155 but now it's outside of the face system. 1314 00:47:11,155 --> 00:47:12,780 And so this is the picture that we got. 1315 00:47:12,780 --> 00:47:14,488 It is that yes, these face areas are actually 1316 00:47:14,488 --> 00:47:17,970 part of a network of face areas that are strongly 1317 00:47:17,970 --> 00:47:19,330 interconnected with each other. 1318 00:47:19,330 --> 00:47:21,580 There's now data from retrograde tracer studies, 1319 00:47:21,580 --> 00:47:23,670 we find that 90% of the cell bodies that 1320 00:47:23,670 --> 00:47:26,530 are labeled after an injection inside a face area 1321 00:47:26,530 --> 00:47:29,740 are inside other face areas or in the same face areas. 1322 00:47:29,740 --> 00:47:33,150 So it's a surprisingly anatomically specialized 1323 00:47:33,150 --> 00:47:35,310 and closed network. 1324 00:47:35,310 --> 00:47:37,560 So what's happening in these areas, 1325 00:47:37,560 --> 00:47:40,540 and again, my movie isn't going to work. 1326 00:47:40,540 --> 00:47:44,461 So in this area AL, which is more anterior, also virtually 1327 00:47:44,461 --> 00:47:45,960 all of the cells are face selective. 1328 00:47:45,960 --> 00:47:47,668 But you have a property emerge here which 1329 00:47:47,668 --> 00:47:48,670 you didn't have before. 1330 00:47:48,670 --> 00:47:50,040 And that is mirror symmetric confusion. 1331 00:47:50,040 --> 00:47:51,270 And it's something that we did not expect. 1332 00:47:51,270 --> 00:47:52,770 We're still puzzled by it. 1333 00:47:52,770 --> 00:47:56,456 We have no explanation why it's happening. 1334 00:47:56,456 --> 00:47:58,830 But in this area you have cells that like a profile view.
1335 00:47:58,830 --> 00:48:00,390 And if they like one profile view, 1336 00:48:00,390 --> 00:48:02,265 they also like the opposite profile view. 1337 00:48:02,265 --> 00:48:04,140 And this region here as I mentioned initially 1338 00:48:04,140 --> 00:48:06,485 that some of the cells did not really seem to be 1339 00:48:06,485 --> 00:48:07,110 face selective. 1340 00:48:07,110 --> 00:48:08,280 It's a small percentage. 1341 00:48:08,280 --> 00:48:10,885 But actually these cells are selective for facial profile. 1342 00:48:10,885 --> 00:48:13,260 But if they like one profile right, they don't like left. 1343 00:48:13,260 --> 00:48:14,968 If they like left, they don't like right. 1344 00:48:14,968 --> 00:48:17,620 In AL, this is being confused. 1345 00:48:17,620 --> 00:48:19,620 And then if you go to AM, you have cells 1346 00:48:19,620 --> 00:48:21,245 that respond to all faces. 1347 00:48:21,245 --> 00:48:22,620 It doesn't matter where they are, 1348 00:48:22,620 --> 00:48:24,119 doesn't matter who they are, doesn't 1349 00:48:24,119 --> 00:48:25,910 matter how big they are. 1350 00:48:25,910 --> 00:48:29,650 And the other cells that also don't care where they are 1351 00:48:29,650 --> 00:48:32,130 and how big they are, but they care exquisitely 1352 00:48:32,130 --> 00:48:33,517 about identity. 1353 00:48:33,517 --> 00:48:35,100 So they can be very, very finely tuned 1354 00:48:35,100 --> 00:48:37,680 to identity in particular to people 1355 00:48:37,680 --> 00:48:39,750 that the animals never see in real life. 1356 00:48:39,750 --> 00:48:42,060 So there seems to be a computation going on from here 1357 00:48:42,060 --> 00:48:48,390 to here where in Jim DiCarlo's conceptual framework 1358 00:48:48,390 --> 00:48:51,330 you could imagine that there's a manifold that's now becoming 1359 00:48:51,330 --> 00:48:55,020 sort of flatter and more like a more explicit representation. 
1360 00:48:55,020 --> 00:48:56,520 And for some reason creating this 1361 00:48:56,520 --> 00:48:59,439 has to go through this mirror symmetric confusion. 1362 00:48:59,439 --> 00:49:00,480 I just want to highlight. 1363 00:49:00,480 --> 00:49:02,195 So we meant to touch upon the question 1364 00:49:02,195 --> 00:49:04,320 whether a face area should do different computations 1365 00:49:04,320 --> 00:49:06,090 from non-face areas. 1366 00:49:06,090 --> 00:49:07,590 Actually, my intuition about this 1367 00:49:07,590 --> 00:49:09,100 was quite the opposite. 1368 00:49:09,100 --> 00:49:11,350 So I thought they would likely do the same computation 1369 00:49:11,350 --> 00:49:12,766 or hopefully the same computations 1370 00:49:12,766 --> 00:49:15,500 as outside the areas just in different material. 1371 00:49:15,500 --> 00:49:16,792 So why in other non-face areas? 1372 00:49:16,792 --> 00:49:18,374 Why don't you want to mix these cells? 1373 00:49:18,374 --> 00:49:20,460 We have one study that's a little too complicated 1374 00:49:20,460 --> 00:49:22,532 to explain here that gives some clues. 1375 00:49:22,532 --> 00:49:24,240 But some computational work from Joel Leibo, 1376 00:49:24,240 --> 00:49:25,823 who was a grad student with Tommy, actually 1377 00:49:25,823 --> 00:49:27,180 gave some clues to that. 1378 00:49:27,180 --> 00:49:30,840 So Joel and Tommy were thinking about invariance. 1379 00:49:30,840 --> 00:49:32,370 And Tommy told me he's going to talk 1380 00:49:32,370 --> 00:49:35,520 about this at some later point in the course. 1381 00:49:35,520 --> 00:49:37,590 There are different kinds of transformations, easy ones 1382 00:49:37,590 --> 00:49:38,430 and difficult ones. 1383 00:49:38,430 --> 00:49:41,010 The easy ones are affine transformations. 1384 00:49:41,010 --> 00:49:44,430 We're just shifting something in space or in size, 1385 00:49:44,430 --> 00:49:45,990 or we rotate it in plane.
1386 00:49:45,990 --> 00:49:47,982 And so if you learn how to correct 1387 00:49:47,982 --> 00:49:50,190 for this transformation for just three dots of light, 1388 00:49:50,190 --> 00:49:52,320 you can do it for any image that you can ever see. 1389 00:49:52,320 --> 00:49:53,470 So this is relatively easy. 1390 00:49:53,470 --> 00:49:55,470 But then there are non-affine transformations that 1391 00:49:55,470 --> 00:49:56,845 are actually changing the picture 1392 00:49:56,845 --> 00:49:58,560 and they are very difficult. So if you 1393 00:49:58,560 --> 00:50:00,730 change your facial expression for example, 1394 00:50:00,730 --> 00:50:02,370 or if lighting conditions are changing, 1395 00:50:02,370 --> 00:50:05,231 or if you're turning your head in depth, 1396 00:50:05,231 --> 00:50:06,730 this is a non-affine transformation. 1397 00:50:06,730 --> 00:50:08,736 So it's not predictable from just three dots. 1398 00:50:08,736 --> 00:50:10,110 And you can learn something there 1399 00:50:10,110 --> 00:50:13,020 that could tell you what this picture would 1400 00:50:13,020 --> 00:50:15,030 look like under this non-affine transformation. 1401 00:50:15,030 --> 00:50:17,040 And one of the insights from Joel was that if you learn this 1402 00:50:17,040 --> 00:50:19,610 non-affine transformation on a particular object category-- 1403 00:50:19,610 --> 00:50:20,700 let's say faces-- 1404 00:50:20,700 --> 00:50:22,949 you actually have learned nothing about another object 1405 00:50:22,949 --> 00:50:24,720 category like cars. 1406 00:50:24,720 --> 00:50:26,352 That's actually quite surprising to me 1407 00:50:26,352 --> 00:50:27,810 but that could be a reason why you 1408 00:50:27,810 --> 00:50:29,351 might want to have all the cells that 1409 00:50:29,351 --> 00:50:31,890 have to learn representations across one transformation put 1410 00:50:31,890 --> 00:50:34,884 all in one location.
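The "three dots" remark about affine transformations can be made concrete with a small sketch. It is only an illustration, not anything shown in the talk: the function names, the point coordinates, and the choice of a 2-D image plane are all assumptions. Three non-collinear dots, seen before and after a shift, scaling, and in-plane rotation, pin down all six parameters of the transformation, which can then be applied to any other point.

```python
def solve3(m, b):
    """Solve the 3x3 linear system m @ x = b with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("the three dots must not lie on one line")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        solution.append(det(mc) / d)
    return solution

def fit_affine(src, dst):
    """Recover x' = a*x + b*y + c and y' = d*x + e*y + f from 3 dot pairs."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [p[0] for p in dst])
    row_y = solve3(m, [p[1] for p in dst])
    return row_x, row_y

def apply_affine(params, point):
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical dots before and after a 90 degree in-plane rotation,
# a 2x scaling, and a shift by (2, 1).
dots_before = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dots_after = [(2.0, 1.0), (2.0, 3.0), (0.0, 1.0)]

params = fit_affine(dots_before, dots_after)
# Any other image point can now be transformed without further learning.
new_point = apply_affine(params, (3.0, 4.0))
print(new_point)
```

A rotation of the head in depth has no such six-parameter description, which is the sense in which it cannot be predicted from three dots.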
1411 00:50:34,884 --> 00:50:36,300 The second insight they had, and I 1412 00:50:36,300 --> 00:50:38,490 think it's still very surprising to me 1413 00:50:38,490 --> 00:50:40,120 that it actually works so easily, 1414 00:50:40,120 --> 00:50:42,630 is to give a computational account for the system 1415 00:50:42,630 --> 00:50:44,880 that I just described to you in qualitative terms. 1416 00:50:44,880 --> 00:50:47,620 So we have three levels of processing. 1417 00:50:47,620 --> 00:50:49,530 So we have a front end where cells 1418 00:50:49,530 --> 00:50:51,360 are useful for face detection, they're 1419 00:50:51,360 --> 00:50:52,860 all very face selective. 1420 00:50:52,860 --> 00:50:55,560 So you could think of this like a three level processing 1421 00:50:55,560 --> 00:50:58,800 hierarchy where level one is like a face filter that's 1422 00:50:58,800 --> 00:51:01,590 just going to tell you whether there's a face or not. 1423 00:51:01,590 --> 00:51:03,450 At the top level you want an identification. 1424 00:51:03,450 --> 00:51:04,650 And I didn't show the examples, again 1425 00:51:04,650 --> 00:51:06,390 I hope with a connection to the monitor I 1426 00:51:06,390 --> 00:51:07,804 can show you the actual movies. 1427 00:51:07,804 --> 00:51:09,720 You have some cells that are very, very finely 1428 00:51:09,720 --> 00:51:11,322 selective for facial identity. 1429 00:51:11,322 --> 00:51:12,780 With pattern readout techniques you 1430 00:51:12,780 --> 00:51:17,149 can read out identity extremely reliably. 1431 00:51:17,149 --> 00:51:18,940 If you now have something like a Hebbian learning rule, 1432 00:51:18,940 --> 00:51:20,940 maybe Tommy is going to explain this to you more, 1433 00:51:20,940 --> 00:51:22,606 you actually get something pretty magic. 1434 00:51:22,606 --> 00:51:24,976 So you do get invariance at level number three, 1435 00:51:24,976 --> 00:51:27,600 which is kind of what you wanted and might not be surprised by.
1436 00:51:27,600 --> 00:51:29,724 But as a byproduct, you are getting mirror symmetry 1437 00:51:29,724 --> 00:51:30,402 at level two. 1438 00:51:30,402 --> 00:51:32,610 And that's something you didn't stick into the system 1439 00:51:32,610 --> 00:51:34,674 and it just happens, not like magic, 1440 00:51:34,674 --> 00:51:36,090 but there's an explanation for why 1441 00:51:36,090 --> 00:51:38,310 this happens, out of very general assumptions 1442 00:51:38,310 --> 00:51:39,670 about the system. 1443 00:51:39,670 --> 00:51:41,520 So the point I want to make here, 1444 00:51:41,520 --> 00:51:43,200 this particular model could be wrong, 1445 00:51:43,200 --> 00:51:45,927 but it's something about how knowing something 1446 00:51:45,927 --> 00:51:47,760 about the overall organization of the system 1447 00:51:47,760 --> 00:51:51,644 might actually reveal underlying relationships that you 1448 00:51:51,644 --> 00:51:52,560 might not think about. 1449 00:51:52,560 --> 00:51:54,240 So the fact that there are three levels of processing 1450 00:51:54,240 --> 00:51:56,430 and not four or five or six might actually 1451 00:51:56,430 --> 00:51:58,389 impact whether you find mirror symmetry or not. 1452 00:51:58,389 --> 00:51:59,846 Or whether you find mirror symmetry 1453 00:51:59,846 --> 00:52:01,110 at one level or another level. 1454 00:52:01,110 --> 00:52:04,380 And you don't necessarily get this automatically 1455 00:52:04,380 --> 00:52:07,140 just by putting any processing system together. 1456 00:52:07,140 --> 00:52:09,180 Because I was mentioning facial motion, 1457 00:52:09,180 --> 00:52:12,810 I would like to give a brief vignette of that. 1458 00:52:12,810 --> 00:52:17,115 And so I was emphasizing here transformations 1459 00:52:17,115 --> 00:52:17,990 along this direction.
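To illustrate the earlier point that mirror symmetry can fall out of Hebbian learning as a byproduct, here is a toy sketch. It is not the model from the talk. It assumes Oja's rule as one normalized Hebbian rule, three-pixel "images", a mirror flip implemented as list reversal, and a training set that contains each view together with its mirror image (as it would for a bilaterally symmetric face seen from left and right). All names and numbers are hypothetical. Under those assumptions the learned weight vector becomes symmetric or antisymmetric under the flip, so the magnitude of the unit's response is the same for a view and its mirror image, even for an input it was never trained on.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def oja_train(samples, w, eta=0.05, steps=500):
    """Batch-averaged Oja rule: dw = eta * mean(y * (x - y * w)), y = w . x."""
    n = len(samples)
    for _ in range(steps):
        update = [0.0] * len(w)
        for x in samples:
            y = dot(w, x)  # Hebbian term uses post-synaptic activity y
            for i in range(len(w)):
                update[i] += y * (x[i] - y * w[i]) / n
        w = [wi + eta * ui for wi, ui in zip(w, update)]
    return w

# Assumed training set: one "left profile" and its mirror image.
views = [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]

w_start = [0.5, 0.1, -0.2]          # arbitrary, not mirror symmetric
w_learned = oja_train(views, w_start)

# A novel probe image and its mirror image.
probe = [5.0, 0.0, 1.0]
mirrored = probe[::-1]

before = (abs(dot(w_start, probe)), abs(dot(w_start, mirrored)))
after = (abs(dot(w_learned, probe)), abs(dot(w_learned, mirrored)))
print("before learning:", before)
print("after learning: ", after)
```

The symmetry here comes from the statistics of the training set, not from anything written into the learning rule, which is the flavor of the argument; whether this is how the brain arrives at it is a separate question.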
1460 00:52:17,990 --> 00:52:20,640 But you can see that at at least two levels of processing 1461 00:52:20,640 --> 00:52:22,710 there are actually two face areas here, 1462 00:52:22,710 --> 00:52:25,240 one lateral to the STS, and the other one deep inside. 1463 00:52:25,240 --> 00:52:27,739 And so one of the questions we had was what's going on here? 1464 00:52:27,739 --> 00:52:29,110 How are they different? 1465 00:52:29,110 --> 00:52:31,740 And so one way you might think about this 1466 00:52:31,740 --> 00:52:34,060 is again, connecting faces to social perception. 1467 00:52:34,060 --> 00:52:35,697 So there are some faces out there 1468 00:52:35,697 --> 00:52:36,780 that are not really faces. 1469 00:52:36,780 --> 00:52:39,074 And so the faces of dolls are just one example. 1470 00:52:39,074 --> 00:52:41,115 So physically they are faces but you can actually 1471 00:52:41,115 --> 00:52:43,899 tell that they are not really real agents. 1472 00:52:43,899 --> 00:52:46,440 And people like Thalia Wheatley are wondering about questions 1473 00:52:46,440 --> 00:52:47,940 like why dolls are creepy. 1474 00:52:47,940 --> 00:52:51,120 So there is an expectation that the face 1475 00:52:51,120 --> 00:52:52,604 should belong to a real agent. 1476 00:52:52,604 --> 00:52:54,770 And there are different clues that can give it away. 1477 00:52:54,770 --> 00:52:56,728 So if the face is on top of a body, more likely 1478 00:52:56,728 --> 00:52:59,190 it's an agent than just an artificial stimulus. 1479 00:52:59,190 --> 00:53:02,110 If a face is moving that's another clue it's an agent. 1480 00:53:02,110 --> 00:53:03,540 And this will change fundamentally 1481 00:53:03,540 --> 00:53:05,010 how you are interacting with it. 1482 00:53:05,010 --> 00:53:06,676 Your interaction with the doll is likely 1483 00:53:06,676 --> 00:53:08,720 going to be very different than with a baby.
1484 00:53:08,720 --> 00:53:10,365 And so again, objects and their meaning 1485 00:53:10,365 --> 00:53:12,240 are making them actionable in different ways. 1486 00:53:12,240 --> 00:53:13,860 And we have to understand what the circuits are that actually 1487 00:53:13,860 --> 00:53:15,222 make this possible. 1488 00:53:15,222 --> 00:53:16,680 So one way to look at this again is 1489 00:53:16,680 --> 00:53:17,900 to think about these facial displays. 1490 00:53:17,900 --> 00:53:19,792 I showed the Tonkean macaques and Clark 1491 00:53:19,792 --> 00:53:22,230 Fisher, who was an M.D. PhD student in the lab, was actually 1492 00:53:22,230 --> 00:53:23,850 addressing this question. 1493 00:53:23,850 --> 00:53:25,730 And so he made movies like this one here. 1494 00:53:25,730 --> 00:53:29,070 Luckily there's no sound so we can actually play them. 1495 00:53:29,070 --> 00:53:31,170 These are movies of macaque monkeys making 1496 00:53:31,170 --> 00:53:33,170 facial movements of all different kinds 1497 00:53:33,170 --> 00:53:34,701 of facial expressions. 1498 00:53:34,701 --> 00:53:36,450 And we then also have stills that are just 1499 00:53:36,450 --> 00:53:38,177 changing from time to time. 1500 00:53:38,177 --> 00:53:40,260 And we have controls of toys that the animals also 1501 00:53:40,260 --> 00:53:42,000 know that are either moving or that 1502 00:53:42,000 --> 00:53:44,242 are jumping every second from one state to the other. 1503 00:53:44,242 --> 00:53:46,200 And we would ask as we had in an earlier study, 1504 00:53:46,200 --> 00:53:48,330 are these areas responding to this motion 1505 00:53:48,330 --> 00:53:52,101 differently than to the static images. 1506 00:53:52,101 --> 00:53:53,100 So here's what he found. 1507 00:53:53,100 --> 00:53:55,450 So he had six different face areas he was looking at.
1508 00:53:55,450 --> 00:53:57,284 If you're looking at static form selectivity 1509 00:53:57,284 --> 00:53:59,741 we're just reproducing the way that the area was found just 1510 00:53:59,741 --> 00:54:01,000 with a different stimulus. 1511 00:54:01,000 --> 00:54:03,720 So all the six areas respond more to faces than to objects. 1512 00:54:03,720 --> 00:54:06,774 If we now compare moving faces to static faces, 1513 00:54:06,774 --> 00:54:08,190 all the areas are responding more 1514 00:54:08,190 --> 00:54:10,050 to the moving faces than the static ones. 1515 00:54:10,050 --> 00:54:12,630 Some quantitative differences, but overall the same pattern, 1516 00:54:12,630 --> 00:54:15,222 more responsive to moving than static stimuli. 1517 00:54:15,222 --> 00:54:17,430 If you now compare on the right-hand side the modulation 1518 00:54:17,430 --> 00:54:19,140 by moving objects versus static objects, 1519 00:54:19,140 --> 00:54:21,902 you can see that also all the areas or almost all 1520 00:54:21,902 --> 00:54:23,610 have a slight advantage of moving objects 1521 00:54:23,610 --> 00:54:25,140 over static objects. 1522 00:54:25,140 --> 00:54:28,440 There seems to be a general motion sensitivity there. 1523 00:54:28,440 --> 00:54:31,624 But if you do the interaction of shape and motion, 1524 00:54:31,624 --> 00:54:33,540 you can see that all the areas are selectively 1525 00:54:33,540 --> 00:54:38,520 more enhanced by face motion than by non-face motion. 1526 00:54:38,520 --> 00:54:40,909 But they all look pretty similar. 1527 00:54:40,909 --> 00:54:42,450 So now you can actually wonder if you 1528 00:54:42,450 --> 00:54:44,850 have a contrast like this, like moving versus still, 1529 00:54:44,850 --> 00:54:46,780 there are a couple of things that are different. 1530 00:54:46,780 --> 00:54:50,207 So is it really about motion or is it just about the content?
1531 00:54:50,207 --> 00:54:52,540 If you just show a picture every one second you can say, 1532 00:54:52,540 --> 00:54:54,060 well, there's less content there, 1533 00:54:54,060 --> 00:54:56,185 therefore you might have more adaptation, therefore 1534 00:54:56,185 --> 00:54:57,640 less response across the board. 1535 00:54:57,640 --> 00:54:59,250 Is it about update frequency? 1536 00:54:59,250 --> 00:55:02,250 You know, a fast update versus a slow update. 1537 00:55:02,250 --> 00:55:09,867 So what Clark did, he was creating another stimulus. 1538 00:55:09,867 --> 00:55:11,700 If you think about creepiness, it's actually 1539 00:55:11,700 --> 00:55:12,780 like a little bit creepy. 1540 00:55:12,780 --> 00:55:16,150 So like a scrambled version of a motion that's shown here. 1541 00:55:16,150 --> 00:55:18,750 It shows the same frames of the movie but just randomly 1542 00:55:18,750 --> 00:55:19,876 associated with each other. 1543 00:55:19,876 --> 00:55:22,000 So if anything, now the motion energy in this thing 1544 00:55:22,000 --> 00:55:23,579 is higher than in this one here. 1545 00:55:23,579 --> 00:55:25,620 And we can look now for the contrast of those two 1546 00:55:25,620 --> 00:55:28,715 and also the contrast through here. 1547 00:55:28,715 --> 00:55:30,090 And now what he finds is actually 1548 00:55:30,090 --> 00:55:33,120 something where the face areas are qualitatively different. 1549 00:55:33,120 --> 00:55:35,430 He finds two face areas here which 1550 00:55:35,430 --> 00:55:37,320 are responding more to the natural motion 1551 00:55:37,320 --> 00:55:39,670 than the scrambled motion, and three face areas 1552 00:55:39,670 --> 00:55:41,670 that are responding more to the scrambled motion 1553 00:55:41,670 --> 00:55:43,080 than the natural motion. 1554 00:55:43,080 --> 00:55:45,300 So they show opposite preferences.
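The scrambled-motion control described above can be sketched in a few lines. This is an illustration under stated assumptions, not the stimulus code from the study: the toy "movie", the function names, the fixed random seed, and the crude motion energy proxy (summed absolute pixel change between consecutive frames) are all hypothetical. The point is that shuffling frame order keeps the content and the update rate identical while leaving frame-to-frame change at least as large as in the natural sequence.

```python
import random

def motion_energy(frames):
    """Crude proxy: summed absolute pixel change between consecutive frames."""
    return sum(
        abs(a - b)
        for prev, nxt in zip(frames, frames[1:])
        for a, b in zip(prev, nxt)
    )

def scramble(frames, seed=0):
    """Return the same frames in a random order (content unchanged)."""
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    return shuffled

# Toy "movie": 10 frames of 3 pixels whose brightness rises smoothly.
natural = [[float(t)] * 3 for t in range(10)]
scrambled = scramble(natural)

print("same frames:", sorted(scrambled) == sorted(natural))
print("natural energy:  ", motion_energy(natural))
print("scrambled energy:", motion_energy(scrambled))
```

For a smoothly changing sequence like this one, the natural order minimizes the summed frame-to-frame change, so any reordering can only match or exceed it. An area that simply likes fast, unpredictable updates should therefore not prefer the natural movie.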
1555 00:55:45,300 --> 00:55:47,040 What we think is going on in the areas 1556 00:55:47,040 --> 00:55:49,950 here, which remember also show the benefit for facial motion 1557 00:55:49,950 --> 00:55:52,260 over static faces, is that they just 1558 00:55:52,260 --> 00:55:53,914 like a fast update of content. 1559 00:55:53,914 --> 00:55:55,580 Ideally in a way that's not predictable. 1560 00:55:55,580 --> 00:55:57,120 If you show me something new I'm going to respond. 1561 00:55:57,120 --> 00:55:59,520 And if it's something that I can't predict even better. 1562 00:55:59,520 --> 00:56:00,670 So this is what these guys are doing. 1563 00:56:00,670 --> 00:56:02,280 But these ones here, they seem to be 1564 00:56:02,280 --> 00:56:04,410 really sensitive to facial movement and naturalness 1565 00:56:04,410 --> 00:56:05,860 of facial movement. 1566 00:56:05,860 --> 00:56:09,160 And that was not smart. 1567 00:56:09,160 --> 00:56:11,820 And these areas are located more deep inside the STS, 1568 00:56:11,820 --> 00:56:15,760 more dorsally, and these areas are located more ventrally. 1569 00:56:15,760 --> 00:56:18,075 So there's an organization, and he discovered a new face 1570 00:56:18,075 --> 00:56:19,283 area we didn't know before. 1571 00:56:19,283 --> 00:56:22,380 It's a seventh face area which he called MD which is really 1572 00:56:22,380 --> 00:56:23,899 like a face motion area. 1573 00:56:23,899 --> 00:56:26,190 There are lots of reasons why we're excited about this. 1574 00:56:26,190 --> 00:56:29,310 I mentioned the link between face perception and agency 1575 00:56:29,310 --> 00:56:29,970 interpretation. 1576 00:56:29,970 --> 00:56:33,540 This is one possible link, there are more. 1577 00:56:33,540 --> 00:56:35,480 It might be a second face processing system, 1578 00:56:35,480 --> 00:56:37,290 I'm going to go through all the evidence.
1579 00:56:37,290 --> 00:56:39,539 It's only indirect, though that this area might not 1580 00:56:39,539 --> 00:56:40,830 be connected to the other ones. 1581 00:56:40,830 --> 00:56:43,205 I told you before that the six face areas are intricately 1582 00:56:43,205 --> 00:56:44,820 integrated to be one network. 1583 00:56:44,820 --> 00:56:47,430 This area never showed up in stimulation, 1584 00:56:47,430 --> 00:56:48,722 therefore it might be separate. 1585 00:56:48,722 --> 00:56:50,888 And this is kind of nice because it fits very nicely 1586 00:56:50,888 --> 00:56:51,870 to the human situation. 1587 00:56:51,870 --> 00:56:54,660 So in the human brain you have the posterior STS face 1588 00:56:54,660 --> 00:56:57,600 area which is exquisitely sensitive to facial motion. 1589 00:56:57,600 --> 00:57:00,160 Actually you often don't even get it for static faces. 1590 00:57:00,160 --> 00:57:01,830 But it's very sensitive to facial motion 1591 00:57:01,830 --> 00:57:03,960 and actually Nancy has a beautiful study on that. 1592 00:57:03,960 --> 00:57:05,730 And this area by several accounts 1593 00:57:05,730 --> 00:57:07,620 is not like the other face area. 1594 00:57:07,620 --> 00:57:10,630 So it seems to be like a specialization. 1595 00:57:10,630 --> 00:57:12,430 That's another thing, another reason 1596 00:57:12,430 --> 00:57:16,966 why we think these systems might be connected to each other. 1597 00:57:16,966 --> 00:57:18,340 Just a cool thing in the end, who 1598 00:57:18,340 --> 00:57:19,810 can recognize this actor here? 1599 00:57:19,810 --> 00:57:22,000 Show of hands. 1600 00:57:22,000 --> 00:57:23,320 OK, who can recognize him now? 1601 00:57:25,960 --> 00:57:30,270 So facial motion gives away a lot of things like identity. 1602 00:57:30,270 --> 00:57:34,860 Jack Nicholson has very typical facial movements. 
1603 00:57:34,860 --> 00:57:37,350 So it's not just agency, it's not just facial expressions, 1604 00:57:37,350 --> 00:57:39,641 it's also identity and lots of things that it can give away. 1605 00:57:39,641 --> 00:57:44,170 So we actually don't know yet what these areas are doing. 1606 00:57:44,170 --> 00:57:47,830 So my summary, and I'm sorry I'm going into lunch. 1607 00:57:47,830 --> 00:57:50,436 So we can do fMRI on macaque monkeys just as in humans. 1608 00:57:50,436 --> 00:57:51,810 What we find we can apply to lots 1609 00:57:51,810 --> 00:57:53,309 of domains. With attention studies, 1610 00:57:53,309 --> 00:57:54,492 we found new attention areas. 1611 00:57:54,492 --> 00:57:55,950 Here we applied it to face processing, 1612 00:57:55,950 --> 00:57:58,530 we find face selective areas for face processing. 1613 00:57:58,530 --> 00:58:00,400 Recording, microstimulating-- 1614 00:58:00,400 --> 00:58:03,780 so inactivating these regions is supporting the notion 1615 00:58:03,780 --> 00:58:05,940 that these are likely modules that 1616 00:58:05,940 --> 00:58:10,590 are selective for processing faces and faces only. 1617 00:58:10,590 --> 00:58:13,800 These are interconnected into a face processing network. 1618 00:58:13,800 --> 00:58:16,050 It looks like all these areas have different functions 1619 00:58:16,050 --> 00:58:16,980 and specializations. 1620 00:58:16,980 --> 00:58:19,320 So fMRI experiments are notoriously 1621 00:58:19,320 --> 00:58:23,576 underpowered in a number of different dimensions. 1622 00:58:23,576 --> 00:58:25,200 So if we call them face areas it is not 1623 00:58:25,200 --> 00:58:26,880 to say that they're all doing the same, 1624 00:58:26,880 --> 00:58:28,910 but they likely all have different functions and likely 1625 00:58:28,910 --> 00:58:30,618 sub regions with different functions. 1626 00:58:30,618 --> 00:58:33,240 So again, there's no contradiction here 1627 00:58:33,240 --> 00:58:35,380 to the view of a fine organization.
1628 00:58:35,380 --> 00:58:36,964 Then there's a seventh face area which 1629 00:58:36,964 --> 00:58:38,171 doesn't seem to be connected. 1630 00:58:38,171 --> 00:58:40,640 Could be a separate face processing system. 1631 00:58:40,640 --> 00:58:43,170 We have evidence for processing that we can understand now 1632 00:58:43,170 --> 00:58:44,950 in computational terms. 1633 00:58:44,950 --> 00:58:46,860 And this is one way that we can link, 1634 00:58:46,860 --> 00:58:49,830 sometimes causally sometimes correlationally, activity 1635 00:58:49,830 --> 00:58:50,580 of single cells 1636 00:58:50,580 --> 00:58:52,410 to different levels of organization, 1637 00:58:52,410 --> 00:58:54,380 to a very complex social behavior. 1638 00:58:54,380 --> 00:58:56,670 And that's, again, I think a very cool opportunity 1639 00:58:56,670 --> 00:58:59,169 to have in the domain of social cognition, 1640 00:58:59,169 --> 00:59:01,710 that you can actually control stimuli very well. Because faces 1641 00:59:01,710 --> 00:59:04,200 are so powerful to get into your social brain, 1642 00:59:04,200 --> 00:59:06,870 you can likely take this approach deeper and get insight 1643 00:59:06,870 --> 00:59:09,450 into actual social intelligence beyond face perception 1644 00:59:09,450 --> 00:59:11,190 this way.