Description: This lecture continues with the auditory cortex, focusing on echolocation by bats as well as speech and language. Topics include types of bat echolocation and bat cortical specializations along with speech spectrograms and cortical processing.
Instructor: Chris Brown
Lec 23: Auditory cortex 2: ...
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: OK. I guess we'll get started. Last time, we were talking about auditory cortex, and the tonotopic fields in auditory cortex, the non-tonotopic fields. Any questions about that first lecture on auditory cortex?
We're going to continue on cortex today, and talk about some areas of cortex in a specialized mammal, the bat. It's where a lot of excellent work has been done on auditory cortex that really shows-- very nicely-- how neurons respond, at least, to the selective stimuli that are emitted and listened to by the bat.
So we'll be talking about bat echolocation. We'll start out by defining the different groups of bats, and talk about who discovered bat echolocation. The discovery was made just a few miles from here.
We'll talk about what a bat's signals look like on a spectrogram. Then we'll talk about the specializations for processing the emitted pulse and the return echo in several bat cortical fields.
In the second half of today's lecture, we'll be revisiting speech sounds. We had a little bit of that at the very beginning of my lectures. We'll talk about speech spectrograms.
And then we'll talk about cortical processing of speech and language, especially in the human, where we have a lot that is known about processing of language. OK. So we'll start out with bat echolocation. These are some pretty pictures of bats.
Oh, I also have some announcements now that everybody's here. So, on Wednesday's class, meet at the Massachusetts Eye and Ear Infirmary, if you haven't already gotten an email. So, for a lab tour, we meet at the Massachusetts Eye and Ear Infirmary.
And there are directions to get there from here. You just get on the Red Line, going inbound, toward Boston. Get off one stop later, at the Charles stop. And then you're going to the Massachusetts Eye and Ear Infirmary.
So a lot of people get that confused, of course, with the big behemoth right next door, Mass General. So Mass. Eye and Ear's a different building. It's a different hospital. But it's clearly marked. The directions are on the website, so just follow them.
So the lab tour will be within the regular class period, so 2:35 to 4:00. We're not going to go beyond that because people have commitments after that. And we'll-- depending on how many people show up, it's likely we'll divide into groups and cycle through several demonstrations that I have prepared for you there.
OK. So questions about that? So we meet at Mass. Eye and Ear on Wednesday. At that time, the assignments are due. So we talked a little bit about the assignment a few weeks ago, when we talked about the Jeffress model. And you can send me the assignments by email or give me a typed printed version.
And the idea is that I'll look them over and then hand them back to you at the review session. And we'll talk about the assignments, and what I consider the right answers. So we did a little switch for the review sessions.
So we should put this on the website, if it isn't already there. But next week, Monday, we have two review sessions scheduled. The one on Monday will now be the one on vision. And so Doctor Schiller is going to come and review the vision part of the course on Monday.
And then I'll be back a week from Wednesday. And we'll do the audition review. And we'll return your assignments then, OK? So that's what's happening next week. Any questions on that? OK.
So here are the nice pictures of bats. They're beautiful animals. They have specializations, of course, for hearing.
They have large pinnae, right? Much, much larger than other animals, especially for their size. They have very small eyes. And their visual systems are not well developed.
They of course have wings. So these are flying animals. And many of them have noseleaves.
So here's a nose cartilage that's very well developed because these animals emit sound. Their echolocation pulse is emitted. And some of the sound comes out of the mouth.
But some of it comes out of the nose. And this noseleaf tends to focus the sound forward because that's where the bat is interested in detecting some kind of a target, like the insect prey that most of these bats eat.
So I should backtrack and say that we're really talking about three types of bats. We're talking about echolocating bats, of which there are two varieties that I'll tell you about in a minute. And we're also not going to talk about non-echolocating bats.
And sometimes, these non-echolocating bats are called fruit-eating bats. They are also flying mammals. But they have big eyes. They have relatively small pinnae. And they navigate around like birds and other mammals, using their visual system. So they don't echolocate.
So you have non-echolocating bats that we're not going to talk about. We have echolocating bats that we will talk about. It's starting to get confusing with all these groups.
In fact, bats are a very successful group of mammals. Supposedly, there are more species of bats than all other mammals combined. It's an amazingly successful group of mammals. And mostly because echolocation has opened up a whole new vista for bats.
Not only can they fly around, but they can do so at night-- in total darkness-- and find prey, their targets, their insects. So instead of being fruit eating, these echolocating bats are carnivorous. Most of them eat insects that they catch on the wing. So flying insects.
But there are gleaning bats that eat insects on the forest floor. There are vampire bats that cut little holes in the top of mammals and lap up the blood that comes out. There are fish eating bats that eat fish.
There is a whole variety of types of bats. But most of them eat insects that they catch on the wing. And we'll have a demonstration of that.
So these are insect eating bats here. This one I want to point out the name of. This one is called-- in the middle left, right here-- Megaderma lyra.
So mega means big. Derma means skin, right? It's so named because of its big skin here. And lyra refers to lyrical, or musical, or something that sings, OK? So these bats are singing.
Let's look at the types of singing that they do. And this display shows the two types of signals that are emitted by the two big groups of echolocating bats. The first I want to start with is the simplest.
It's called an FM bat. FM-- as in an FM radio-- stands for frequency modulated. OK? And in this graph of the FM bat's echolocating signal-- this graph is called a spectrogram-- it plots frequency on the y-axis as a function of time on the x-axis.
And this echolocating pulse is the thing that the bat is emitting. It's producing and emitting. And the reason it's frequency modulated is it starts at a high frequency and modulates down to a lower frequency. And here's another one. And here's another one.
Now, if there's a target out there, some distance from the bat, this pulse that goes out will be reflected off the target. And then it will come back to the bat in the form of an echo sometime later. OK?
So the bat-- in this case the FM bat-- gets information, number one, if there's an echo, there's a target out there. And number two, the time between the pulse and the echo is an indication of the distance the target is from the bat, right? Because the sound has to go from the bat, to the target, and then from the target back to the bat.
And we know that the sound velocity in air is about 340 meters per second. So knowing that velocity, and knowing the time between the pulse and the echo, we-- and the bat-- can get information about how far away the target is.
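Just to make that delay-to-range arithmetic concrete, here is a minimal sketch in Python. The speed of sound and the 10 millisecond example delay are assumed illustrative values, not numbers read off this particular spectrogram.

```python
# Minimal sketch: distance to a target from the pulse-echo delay, assuming
# sound travels at roughly 340 m/s in air and covers the path out and back.
SPEED_OF_SOUND_M_PER_S = 340.0

def target_range_m(pulse_echo_delay_s):
    """Distance to the target, given the time between the emitted pulse and the returning echo."""
    return SPEED_OF_SOUND_M_PER_S * pulse_echo_delay_s / 2.0

print(target_range_m(0.010))  # a 10 ms delay puts the target about 1.7 m away
```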
Now, a couple of things I want to comment here on this spectrogram. Number one, the frequency axis. You might not be able to see it, but it starts at zero and then quickly jumps to 30 kilohertz. And then there's 60, 90, and 120. So those are very high sound frequencies.
And so, in the early days and to a certain extent still, those frequencies would be called ultrasonic. And there's no real good reason for that, except that we're humans. And everything is important with respect to humans. And our upper limit of frequency, as you well know-- if you're a very young human-- ends at about 20 kilohertz.
And most of us, who are in our middle or older ages, aren't hearing anything above 10 kilohertz. So all these frequencies emitted by the bat, and the echoes coming back, are beyond the range of human hearing. And in that sense, they're ultrasonic.
So you can't go out and say, oh. I heard a bat, or at least in terms of the echolocating signals that the bats are emitting. There are some sounds that bats emit that are communication sounds. And those are in the human frequency range to a certain extent. But almost all the echolocating signals are well above the upper limit of our hearing range.
Now, another thing that you can see from this spectrogram is that there's a big delay between this pulse and this echo, OK? The target is pretty far away. Generally, bats head toward targets. And this has been shown many times in behavioral experiments, especially if they're hungry. OK?
And as the bat gets closer and closer to the target, obviously the time between the pulse and the returning echo gets shorter. And, as you can probably see from this spectrogram, there are a lot more pulses emitted per unit time when the bat gets close to the target because the bat is interested in getting a lot of information when it gets close.
Another reason that the bat doesn't emit very many pulses when it's far away from the target is, if you emitted a pulse before the echo returned, you could get confused between the outgoing pulse and the returning echo. So typically, bats tend to increase their pulse rate a lot more as they get closer to their target. OK?
And I'm going to show you a demonstration of that, which I'm sure will convince you. So this is the type of bat we have here in New England. Examples of this are the little brown bat or the big brown bat. And if anybody has seen bats-- have you guys seen bats flying around at night?
Yeah? Where do you see them? In your bedroom? Underneath-- [LAUGHTER] it's where I've seen some recently, which is a little scary. Well, this view graph says it hunts in open air.
Where has anybody seen them? I sometimes see them if I'm out canoeing on a lake at night, or in the evening. Any other places? On golf courses, for example. And those are all sensible, if you will, places for the bat to hunt because they're very open situations.
And there isn't a whole bunch of clutter, if you will, that will return echoes to the bat. If there's a moth or a mosquito out there-- it might be the only thing out there above the surface of the lake, and that's very interesting to the bat because it has one target. It doesn't have a million leaves of the forest, if you will, to get confused.
It gets one echo, it knows there's one target out there. It goes and swoops out there. And it possibly eats the target, if it's an insect.
OK. Let me give you some other examples of some information on FM bats. First, I want to point your attention to who discovered echolocating bats. And I have his book here.
This is Donald Griffin's book. It's called Listening In the Dark. And I'll write his name on the board. And I'd like to do just a short reading from his book. Then I'll pass it around. So this is the section where he talks about this discovery of bats' ultrasonic sounds.
So he writes, "During my undergraduate years at Harvard College, when I was actively engaged in banding bats to study their migration--" So bats here, about this time of year, start flying south like birds do. He says, "I was familiar only with the generally held view that bats felt with their wings the proximity of obstacles." That was how people thought they navigated around, by touch.
"Several friends suggested that I experiment with the ability of my bats to avoid obstacles." Blah, blah, blah. "I decided that I should contact a professor in the Harvard physics department. That was Professor G.W. Pierce, inventor of the Pierce circuit for the stabilization of radio frequency oscillators.
Pierce had developed almost the only apparatus then in existence that could detect and generate a wide range of sounds lying above the audio range. That is from 20,000 to almost 100,000 cycles per second. With some trepidation, I approached Professor Pierce in the winter of 1938 with the suggestion that we use his apparatus to listen to my bats.
I found him eager to try the experiment, particularly since he was already engaged in extensive studies of the high frequency sounds of insects. When I first brought a cage full of bats, Myotis lucifugus--" OK, that's the little brown bat-- "to Pierce's lab and held the cage in front of the parabolic horn, we were surprised and delighted to hear a medley of raucous noises from the loudspeaker."
So Griffin, as an undergraduate, discovered that bats emitted ultrasonic stimuli, OK? And he went on to pursue a lifetime of research on bats. While he was still an undergraduate, he designed some experiments to see if the bats could use this echolocation to avoid objects.
And so he took a room, turned out all the lights so the bats could only use other senses. And he noticed that they would fly around. And he didn't have a very extensive equipment budget, so for objects, he went to the store that sold piano wire. And he strung piano wire from the ceiling to the floor of the room.
And he let the bats fly around. And he knew that if something touched the piano wire, he could hear a little sound of the wire vibrating. And he had to go down in diameter, to piano wire that was the thickness of a human hair, before the bats finally started touching the wire.
They could detect objects even sub-millimeter in size. So their sense of echolocation was really well developed, and very good. It could detect very tiny targets. So that was one of his first, and foremost, experiments.
Now, I have some demonstrations. And some of them are from Griffin's original work. And so, in these demonstrations, you have a movie of the bat flying. And the target, this time, is a small food item that's thrown up. I think it's a mealworm.
So the investigator throws up the mealworm. And the bat catches it. And on the audio part of the track, you'll hear some popping. And I think that's a stroboscope that's illuminating the image.
You will also hear little chirps. They're pretty high frequency. But those are the bat's echolocating pulses going out. They're detected by a microphone, and transformed from the high frequencies down into lower frequencies in your audio range, so you can hear them. OK?
And notice that, number one, when the bat gets close to the target, the chirps increase in frequency. And number two, when the bat eats the target, the chirps stop, right? Because unlike you or me, they can't talk and eat at the same time. So those chirps are the bat echolocating pulse.
Here's the bat. Here's the target coming up. In that case, as sometimes happens, the bat missed the target. So it's falling away.
Here's the target coming up. In this case, the bat caught it in the tip of its wing. And it brings the wing in toward its mouth. And it eats the target. And it starts pulsing again after it's swallowed the target. And there, I think it caught it right in its mouth, without having to use its wings.
OK. So these are some other films that I won't go through. But they were some experiments that Griffin did, testing the ability of bats-- certain species-- to actually catch fish. And he was mystified about this because-- as we've talked about-- sound in air, when it comes to a fluid boundary, mostly reflects off.
So it seemed unimaginable that the bat echolocating pulse could go under the water and would be reflected off the fish. So what he figured out later is that, when there was a smooth surface of the water, the bats did not seem interested. But when the fish came up and rippled the surface of the water, that was what the bats were actually detecting, the little ripples on the surface of the water-- just as you see them visually.
There's some videos of bats catching fish here. Bats hanging out. Bats flying in rooms. Sorry about that. Now this last part of the demo is from more modern experiments in which the bat approaches a target.
In this case, the target is somewhere here. And it's tethered. The target is fixed. This is a spectrogram of the bat echolocating pulse. And this is the bat flying, in slow motion, to catch the target.
And you can see the tremendous increase in pulse repetition rate. As it eats the target, it stops vocalizing. And now it starts again. And this'll be repeated at least once.
So here's another run. Here's the bat spectrogram down here. And here's the bat coming in. Here is the target, right there. I guess I'd better stop that.
OK. And so that second demo is from Doctor Cynthia Moss, who used to be at Harvard. And now she's at the University of Maryland. And she does extensive work on bat echolocation.
And so I think it clearly shows the increase in pulse repetition rate as the bat gets close to the target. Any questions on that? OK. So those were all the first kind of bat I talked about, the FM bat.
And now, let's get into the second group of bats. And this group is called CF-FM bats. OK. And these are completely different species of bats. There are New World CF-FM bats and Old World CF-FM bats. It has probably evolved several times.
The echolocating pulse and echo are completely different, compared to the FM bat. So in the case of the mustache bat, which is an example of a CF-FM bat, this is the spectrogram. Again, frequency on the y-axis and time on the x-axis.
But in this case, instead of the pulse sweeping downward-- frequency modulated downward, almost like a chirp-- the pulse is at a constant frequency. So the CF stands for constant frequency. That just means the frequency is staying constant as a function of time. And that's this flat section here.
In the case of bat sounds, just like human speech sounds, there are many harmonics. There's the first harmonic. That's called CF1. There's the second harmonic, an octave above, CF2. There's a third harmonic, CF3. And there's a fourth harmonic, CF4.
And it's conventional on this kind of spectrogram display to illustrate the sounds that have the most energy with the boldest marking here. So CF2 is the one that has the highest sound pressure level. And so that's the darkest here. These other ones, especially CF1 and CF4, are lower in level.
They're not as intense. And so they're not as black on this display. And in this display of the FM bat, the pulse is very intense. And the echo is, of course, much reduced.
Echolocating pulses can be 110 dB, if you measure them right at the bat's mouth. They can be very intense. And, of course, the bat contracts its middle ear muscles to prevent those kinds of intense stimuli from damaging its own ears. And then it relaxes the muscles when the echo comes by, and its hearing is fine.
OK. So at the end of the CF portion of this echolocating pulse, there is a little FM sweep. OK.
So you can appreciate, maybe, that there's an FM1 sweep, a little FM2 sweep, a little FM3 sweep, and a little FM4 sweep. And it's thought that this bat uses the FM of the pulse, and the FM of the return echo, to get a measure of the distance that the target is from the bat.
OK. If it comes back in 10 milliseconds, and you know the velocity of sound is 340 meters per second, you can figure out how close that is. And this bat can do that, as well. Now, what's going on with this CF part?
Well, as you can see on the spectrogram, the CF of the pulse is not exactly the same as the CF of the echo. OK. The echo has shifted up to be a little higher in frequency in each case.
Now, how can frequencies be shifted? Well, it has to do with the Doppler shift.
OK. So this is a shift in, in this case, sound frequency. But you can have a Doppler shift for any kind of wave. For example, you can have a Doppler shift for light waves.
If you've studied the Big Bang theory of the origin of the universe, there's a big explosion, right? Everything exploded out and is moving far away from us. So you look at the light coming from a star that's moving away from you.
It's actually shifted a little bit to longer wavelengths, toward the more reddish hues, because it's moving away from you. So Doppler shifts have to do with wave sources that are moving relative to the receiver, or the receiver moving relative to the emitter.
Another example of a Doppler shift, this time for sound, would be if you were in the grandstand of a race track, and the race cars were going around a big oval.
OK. And you hear the sound of their engines. As the race car comes toward you, along the straight away, it sounds like it's higher in pitch because it's moving toward you. As it passes you and then starts to move away from you, it sounds like it's lower in pitch.
So the thing you'd hear would be [BUZZING] as each race car went by you. So, as the race car is here, and you're the observer listening here, it emits-- let's say-- a pulse of sound.
OK. This might be the peak of the wave front, if it were just a sinusoid, let's say. Now, by the time a race car coming toward you has emitted the next peak, the race car's actually moved. OK.
So the peak is here. And the next peak is emitted. And it's very close together. If the race car is moving away from you, it emits one peak of sound. And then by the time it emits the next peak of sound, it's moved a little away from you.
The peaks are farther apart. And we know the sound source that has a quick oscillation sounds like a high frequency. And the sound source that has a very slow oscillation sounds like a low frequency.
So Doppler shifts are positive-- higher frequencies-- for objects making sound that are moving toward you. And Doppler shifts are negative-- lower frequencies-- for objects that are moving away from you. It's just the physical characteristics of sound coupled with movement.
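To put rough numbers on that, here is a minimal sketch of the standard moving-source Doppler formula. The 1 kilohertz tone and the 30 meters per second speed are just illustrative values, not figures from the lecture.

```python
# Minimal sketch: frequency heard by a stationary listener when the sound
# source moves toward (positive speed) or away from (negative speed) the listener.
SPEED_OF_SOUND_M_PER_S = 340.0

def heard_frequency_hz(emitted_hz, source_speed_m_per_s):
    """Moving-source Doppler formula: f_heard = f_emitted * c / (c - v_source)."""
    return emitted_hz * SPEED_OF_SOUND_M_PER_S / (SPEED_OF_SOUND_M_PER_S - source_speed_m_per_s)

print(heard_frequency_hz(1000.0, +30.0))  # approaching at 30 m/s: about 1097 Hz, shifted up
print(heard_frequency_hz(1000.0, -30.0))  # receding at 30 m/s: about 919 Hz, shifted down
```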
OK. In the case of these positive Doppler shifted echoes, we know either that the object that has been reflecting the echo is moving toward the bat or, conversely, that the bat has been flying toward the object that is emitting the echo.
So a positive Doppler shift means things are getting closer together. And a negative Doppler shift would be things are going farther apart. So not only does the bat, from its FM sweep, get an indication of how far it is away from the target.
But by its Doppler shifted CF part, it gets an idea of the relative motion of the target. Why is that important? These types of bats, instead of hunting in open air-- like the FM bats we have-- these are tropical bats.
And they hunt in dense vegetation, like the tropical rain forests. And there are millions of objects around. There are leaves. There's vines. There's lots of clutter here.
What the bat is interested in is not the stationary clutter-- presumably, things that are all Doppler shifted the same-- but something that is moving in all this clutter. It's very interested in moving objects because that's a life form, perhaps.
And just imagine the kind of Doppler shift that would be made by a moth, first beating its wing toward you, if you were the bat, and then beating its wing downward and away from you. OK.
That's a very complicated positive and negative Doppler shift that the bat would pick up on its return echo. And that would be a very interesting target to the bat. Yeah.
AUDIENCE: And how do they tell the difference between when they're moving versus when the object is [INAUDIBLE]?
PROFESSOR: They don't care. All they care about is that they might be getting thousands of Doppler shifted echoes from the targets in front of them. Let's say they're flying toward a whole bunch of tropical rain forest vegetation.
There are going to be thousands of positive Doppler shifted echoes. And then something is moving away from them, or toward them, faster than the background. All they care about is that it's Doppler shifted differently relative to the background.
They're just looking for something special. That is, something that's moving relative to the background. Any other questions on that?
So you can, of course, design sonar systems. Submarine sonar systems work by sending out a pulse of sound, and listening for the echo. And most of the kinds of sonar systems that we have send out a ping, which is a frequency swept signal, and listen for the echo.
That's because they're mostly interested in the distance to a target, and whether there's a target out there. So this CF-FM system is a very unusual type of echolocation, or sonar, if you will.
Now why am I bringing this up? Well, it's very interesting because the bat gets two cues instead of just one. Also because a lot of work on bat cortex has used CF-FM bats.
And one of the most popular has been the so-called mustache bat, I believe because it has a noseleaf that's between its upper lip-- it looks like a mustache-- and its nostrils. So it's called the mustache bat.
And a lot of this work has been done by a researcher, who is still active, at Washington University in Saint Louis. And his name is Nobuo Suga. And he was the first, really, to work successfully on the bat auditory cortex.
A lot of his work comes from the 1970s, '80s, and '90s. And before I explain this bottom part of this slide, let me just go on and show you the kinds of experiments that Suga did. And here's one from one of his publications in the 1980s.
So Suga's work was innovative because he played around-- well, first he rationalized, OK. I could try any sound stimulus I'd like to. I could try clicks. I could try pure tones.
I could try noise. I could try speech. I could try-- but why don't I try what the bat listens to over and over?
It listens to a pulse. And then a little bit later, an echo. And this turned out to be a very, very wise choice, as we'll see in a minute.
Secondly, around the time of the 1970s and '80s, researchers in human speech were using synthesized speech. Of course, we all know what synthesized speech is now.
But back then, it was very novel. And Suga says, well, I'm going to use synthesized bat echolocating calls. And here is an example of a synthesized pulse.
So this is for the mustache bat. And this looks like CF1-FM1, CF2-FM2, and CF3-FM3. So he's just using three harmonics. So one thing about synthesized calls is you can do things like dispense with one harmonic if you want to-- easily take it out-- and put in an echo.
So here's a pulse. And here's an echo. It's a little bit Doppler shifted. You can look at the response to the pulse alone, and to the echo alone. And this lower trace is a histogram from a single neuron in the auditory cortex of the echolocating bat to a pulse, and to an echo. There's not much response.
Suga found that when you play a pulse and, a short time later, an echo, you get a huge response. And that's what's indicated here. So this is the pulse-echo combination. And these are synthesized stimuli.
And you can read in the original paper, it looks like it's not given here, exactly what the delay is between the pulse and the echo. But Suga tried various pulse-echo delays. And he found that cortical neurons, in many cases, were very sensitive to the exact delay between the pulse and the echo.
So they were, if you will, delay tuned. And that's indicated here, in the first bullet. "The neuron pictured above responds little to--" blah, blah, blah. "--a pulse or an echo alone. But vigorously to a pulse followed by an echo at a certain delay. In this case, 9.3 milliseconds is the best delay for this neuron." So this is a delay tuned neuron.
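(For scale, using the same 340 meters per second round-trip arithmetic as before, a 9.3 millisecond best delay corresponds to a target roughly 1.6 meters in front of the bat.)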
Once Suga did recordings from different parts of the bat cortex, he found that the best delay was actually mapped along the surface of the cortex. So here is some of his work from the 1990s, showing you maps of best delays, and other properties, in the bat auditory cortex.
So here is a side view of the bat cortex. This is looking at the left side. This is the front, where the olfactory areas are. Way in the back would be the occipital cortex.
And our old friend, the auditory cortex, is in the temporal region, on the side of the brain, just like it was in the cat. And like it is in the human. And here is A1.
And this rectangle here is expanded here. And some of Suga's maps are shown. This part right here-- the biggest part-- is cortical field A1, which, as we've seen in other animals, is the tonotopically organized field. And it's tonotopically organized in this bat.
And these numbers and lines are the iso-frequency laminae. So remember, the experiment here is to go in and sample at a specific place in the cortex, and find the characteristic frequency for neurons in that column. And then move the electrode a little bit.
Do the same. Get the tuning curve. Get the CF for those neurons. And so on and so forth. And build up a map here.
So just like in the cat, posterior areas are tuned to low CFs. Low CFs in the bat are 20 kilohertz. We're dealing with very, very high frequencies. As you go more rostrally, the CFs get increasingly higher.
And at the very rostral end of A1, the CF is 100 kilohertz. Extremely high. And everything looks exactly like other mammals, except for this huge area right in the middle of A1. And almost all of the neurons here are tuned to between 61 and 66 kilohertz.
And you should perk up your ears a little bit because that is where the most intense harmonic of the echolocating pulse is, right around 61 kilohertz. And a lot of the Doppler shifted echoes are just going to be a little bit above that.
If the bat is flying toward the target, the Doppler shifted echoes go up toward 66 kilohertz. So there is an expanded region of A1 in the bat that's tuned to a very important frequency range for the echolocating signal. And, at first, this was called an acoustic fovea.
Because if you go down into lower nuclei of the bat pathway, and if you actually go into the cochlea, you find an expanded region of the cochlea devoted to these same frequencies. That is, you go along the basilar membrane, starting at the most apical regions, and you march down them.
When you get to the 61 kilohertz place, there's a lot of cochlea devoted to processing that area. And so it's like the fovea in the eye, where you have lots of receptor cells packed into a certain part.
And this is where you have a lot of hair cells packed in-- an expanded region of the basilar membrane, with lots of hair cells processing this small range of frequencies. So this is very much different from other mammals. And the cochlea of this CF-FM bat is also very different.
Now, we were talking about delay tuned neurons. And Suga found a very interesting area near A1 in which there is a mapping for best delay. And that's indicated here. And the best delays are marching from short to longer best delays, as these arrows go along here.
Now they're marked FM1, FM2, FM3. So what does all that mean? So Suga found that with his synthesized pulses and echoes, he could dissect this rather complicated pulse-echo constellation into smaller parts.
And here's an example of a stimulus where you have, it looks like, CF2-FM2, CF3-FM3. And you have the echo for CF1-FM1, and the echo for CF3-FM3. And that hardly gave any response from the neuron.
Here's an example where you have only CF1-FM1 for the pulse, and only CF2-FM2 for the echo. And it gave a big response. And you could do even more.
You can strip off everything except the FM1 and the echo FM2. And you get a big response from the neuron. So this type of neuron, then, would be called an FM1-FM2 best delay neuron.
FM1 is the pulse. FM2 is the echo. And with a certain delay between those two, you get as big a response as with the whole constellation of pulse and echo.
And those neurons were located in this specific region, called FM1-FM2. And their delays were mapped along this axis, with delays going from 0.4 to 18 milliseconds. And knowing the velocity of sound, you can convert that to a target range of between 7 and 310 centimeters. That's how far the target was from the bat at that specific best delay.
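(As a quick check on those numbers: range is roughly 340 meters per second times the delay, divided by 2, so a 0.4 millisecond delay works out to about 7 centimeters and an 18 millisecond delay to about 306 centimeters, consistent with the 7 to 310 centimeter range given here.)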
Suga also found some other best delay regions. For example, FM2, FM3, so on and so forth in other adjacent parts of cortex. And this is really beautiful work, showing specializations for cortical neuron response. And for showing mappings for those specializations in the bat cortex.
We don't really have any data anywhere near as beautiful from non-echolocating mammalian cortex showing specialization for specific features of sound stimuli as we do in the bat cortex.
This is Nobel Prize deserving work because it really shows us what this bat cortex is doing. It's responding to specific features of the pulse, and the return echo at a specific delay later. So it's very beautiful work.
Let me just mention one other region here. Suga showed, nearby, a region where the neurons are specialized for certain combinations of the constant frequency of the pulse and the Doppler shifted constant frequency of the echo-- the CF of the pulse and the CF of the echo. So this is the CF-CF region right over here.
Yeah, question.
AUDIENCE: In the last diagram--
PROFESSOR: This one or the one before?
AUDIENCE: This one right here.
PROFESSOR: OK.
AUDIENCE: The last graph, like the middle bottom.
PROFESSOR: This guy?
AUDIENCE: Why is it that the neuron is responding, like, before the onset of the frequency?
PROFESSOR: I don't know why that is. There's some clue to that, as to why this stimulus starts here. I don't know. I don't know the answer to that. Let's see if it says anything in the caption.
I don't know. I don't know why that is. Sorry. I'll have to dig out the paper and figure that out. Any other questions?
So one thing that's gone on after Suga's early work on these specializations is to ask the question: is this really happening in the auditory cortex? Or is the cortex just merely a reflection of some beautiful processing at a lower level of the pathway?
And to a certain extent, best delay sensitivity is found at lower levels. For example, the inferior colliculus has some best delay tuned neurons in the echolocating bat. There are probably more in the auditory cortex. But they can arise as early as the inferior colliculus.
OK. So that's what I wanted to say about bat echolocation. And now I'm going to move on and spend the last part of today's class talking about speech sounds. So we had this particular slide in an earlier lecture, I think the very first lecture that I gave, talking about what speech sounds are.
So speech sounds, obviously, are formed in humans by the vocal cords, or vocal folds, closing and opening during airflow from the lungs to the upper vocal tract. And this closing and opening of the glottis gives rise to the so-called glottal pulses.
When the vocal cords are closed, there's no airflow coming out. But when they open, there's turbulent airflow and it makes a sound. So these are pulses. And they have a whole bunch of different frequencies.
So this is the waveform-- sound pressure as a function of time. And this is the spectrum showing the different frequencies that are formed.
There's a whole bunch of different frequencies in your glottal pulses. It's very complicated. To form different speech sounds, you do things with your upper vocal tract. In the case of vowels, you position the muscles so that your upper vocal tract forms filters that enhance and decrease some of these frequencies.
And after you apply the filter function of the vocal tract to this glottal pulse spectrum, you get this type of spectrum where there are certain peaks. And in the production of a vowel, these peaks are at different frequencies. So, for this example, the vowel "eh" in hit, you have a very low peak, and a couple of high peaks up here.
These peaks are called formants. And they're labeled by Fs. So we went over this before. There's F1 here, F2, and F3 here. And certain cochlear implant processors, of course, try to look at the acoustic spectrum and pick off these formants. And they present a lot of electrical stimuli to the electrodes that correspond to them in the cochlea. For this one, we'd have a lot of stimulation at these low frequency electrodes, which would be apical in the cochlear implant.
And then in the intermediate electrodes, they completely shut them down, even if there's a little bit of background noise. And then they would present a lot of stimuli at the position corresponding to F2 and F3.
And that's an effort to decrease background noise, which is always a big problem in listening to any kind of acoustic wave form. But especially if you have a cochlear implant.
So this vowel, "ah," in the word "call" has two formants very low in frequency. And one in the middle frequency. It sounds very different, of course. And your vocal tract position is very different.
And this vowel, which is "oo," as in the word cool, has three fairly evenly spaced formants here. Your vocal tract is in yet a different position. And you interpret this as yet a different vowel.
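That source-filter idea-- glottal pulses shaped by vocal tract resonances-- can be sketched in a few lines of code. This is a minimal, illustrative version only: the pitch, formant frequencies, and bandwidths below are assumed textbook-style values for an "ah"-like vowel, not values taken from these slides.

```python
# Minimal source-filter sketch: an impulse train standing in for glottal pulses,
# passed through two-pole resonators that stand in for vocal tract formants.
import numpy as np
from scipy.signal import lfilter

fs = 16000                                   # sample rate, Hz
f0 = 120                                     # glottal pulse rate (voice pitch), Hz
n = int(0.5 * fs)                            # half a second of samples

source = np.zeros(n)                         # the "glottal pulse" source
source[::fs // f0] = 1.0                     # one pulse per glottal period

def resonator(x, freq_hz, bandwidth_hz, fs):
    """Two-pole resonance at freq_hz with roughly the given bandwidth (a crude formant)."""
    r = np.exp(-np.pi * bandwidth_hz / fs)
    theta = 2.0 * np.pi * freq_hz / fs
    b = [1.0 - r]                            # rough gain normalization
    a = [1.0, -2.0 * r * np.cos(theta), r * r]
    return lfilter(b, a, x)

vowel = source
for formant_hz, bandwidth_hz in [(730, 90), (1090, 110), (2440, 160)]:   # "ah"-like F1, F2, F3
    vowel = resonator(vowel, formant_hz, bandwidth_hz, fs)
```

Played back at fs, something like this gives a rough, buzzy vowel, and moving the formant frequencies around changes which vowel you hear-- which is the point being made with these spectra.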
Now that's a display that's not very conventional. Much more conventional is to look at a speech spectrogram. So this is very similar to what we've just looked at for bat echolocating pulses. This spectrogram is a graph with frequency on the y-axis.
And now these are more normal sonic, if you will, frequencies. These are well within the human range, of course, going from 0 to 7 kilohertz on this axis. This is a time axis here.
And again, the higher in level, the darker the display in the spectrogram. So there's some really dark bands here. And there's some very light stuff here, and here, and here.
And this is the utterance-- "Joe took father's shoe bench out." OK, and that's what the sound looks like when you make that utterance. So the spectrogram plots the frequencies of speech sounds over time.
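For anyone who wants to make this kind of display themselves, here is a minimal sketch of how a spectrogram is computed from a recording. The file name and analysis parameters are placeholder choices, not the settings used for the figure.

```python
# Minimal sketch: short-time Fourier analysis of a speech recording, plotted so
# that darker cells mean more energy at that frequency and time.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

fs, speech = wavfile.read("utterance.wav")   # hypothetical mono recording
f, t, sxx = spectrogram(speech.astype(float), fs=fs, nperseg=512, noverlap=384)

plt.pcolormesh(t, f, 10.0 * np.log10(sxx + 1e-12), cmap="gray_r")
plt.ylim(0, 7000)                            # the 0 to 7 kilohertz range shown here
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```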
What we've talked about-- up until now-- are voiced segments, which are mostly vowels. So in this utterance, you have a bunch of vowels here. Here's a nice one, "ah" in the word father's. And you can quite clearly see there's a nice band here that would be F1.
That would be at about 500 hertz. And F2 is at about 1 kilohertz. F3 would be at about 2 kilohertz. And there's a fourth formant at about 3 kilohertz. And that's the very beautiful vowel, "ah" as in father.
Here's another one. "Joe." So "oh." "Oh" is a vowel where, in this case, there's a beautiful stable formant here. But here is a formant that's transitioning from higher frequency, maybe about 1.5 kilohertz, down to below 1 kilohertz. So it's Joe.
And there's a higher formant here. OK. So those are the vowels, or the so-called voiced segments. And voicing just means that there's a constant outflow of sound coming through the vocal tract.
And you can make these sounds forever. You can say, ahhhhhhh. And you can just keep going if you want to. Of course, you don't in normal speech.
Now for consonants, there are several types. And these are generally called unvoiced segments. Mostly consonants are intervals containing bands of frequencies, swept frequencies, and silent intervals.
So for example, one of the consonants here is F in the word fathers. And right before, at the beginning of the sound F, you go, which is no sound, right? You close your lips. You keep sound from coming out.
And you finally go, father, right? And that's what's happening right here. So that's a stop consonant. You stop the vocal tract before you let it go.
And the consonant T, as in took, is another stop consonant. So you're not doing anything at the beginning. And finally, you go, took. Right?
When you let go, and emit the sound of took, you have a very complex frequency band. That's generally high frequencies-- 2 kilohertz in this case, up beyond 7 kilohertz. There's that explosion of sound right at the beginning of the consonant T, as in took. OK.
Now there have been a lot of studies, of course, on speech coding in the auditory nerve and the cochlear nucleus. And one of the findings shouldn't be too surprising at all. You have auditory nerve fibers that have tuning curves, right? We've been over tuning curves many times before.
So a tuning curve is a graph of the sound pressure level needed for a response, as a function of sound frequency here. And the CF for this tuning curve would be 1 kilohertz, let's say.
And a 10 kilohertz CF might look like that. OK. So you can explore the responses of 1 kilohertz auditory nerve fibers and 10 kilohertz auditory nerve fibers to this type of stimulus. And obviously, the 1 kilohertz fibers are going to be very active.
During portions of this utterance, for example, there's a lot of 1 kilohertz in this "oh" second formant here. So the 1 kilohertz fiber's going to respond like crazy there. In the vowel "ah," there's a big 1 kilohertz band there.
The 1 kilohertz fiber is going to respond a lot there. It's not going to respond here or here. But it's going to respond a lot at the end of the "out" sound. OK.
The 10 kilohertz fiber is kind of out of luck, right? It's way up here. It's off the axis. But notice the tail of this 10 kilohertz fiber.
If I had drawn it a little bit further it would be extending past 1 kilohertz. So it's certainly going to respond, right, in here, as long as the sound level is high enough. If you keep the sound level of this utterance low, down here, then the frequency is obviously down here.
That 10 kilohertz fiber is not going to respond. But if you boost the sound level, such that you're in the tail of the tuning curve, this 10 kilohertz CF fiber is going to start to respond.
For example, it could respond to these frequencies here-- 4, 5, 6, maybe 7 kilohertz-- in things like the consonants, if the sound level is high enough so that they're within its response area.
So there's clear CF processing of this type of speech signal at the auditory nerve and in the cochlear nucleus. There's also phase locking. For example, these lower frequencies are within the frequency range where there's really good phase locking for the auditory nerve.
Remember, phase locking falls off above 1 kilohertz. And by about 3 kilohertz, there's not much phase locking at all. But many of these voiced, or vowel, segments are going to have low frequencies. And there's going to be good phase locking in the auditory nerve and cochlear nucleus.
So that's just sort of a review. Think about the auditory nerve response to these speech signals because clearly, the auditory nerve is going to respond very nicely, in terms of what its CFs tell it to.
Now, there's been a lot of interesting work. Of course, I don't have time to get into much speech processing and language representation. But I just wanted to show you some things that relate quite nicely to what we've just gone through for echolocation.
Here are some synthesized speech stimuli. OK. And you can do this very nicely on your computer. This is a spectrogram of the synthesized sound. Frequency is on the y-axis and time-- in this case in milliseconds; it's a very quick stimulus-- is on the x-axis.
And there are several harmonics, very much like we had for the CF echolocating bat sound. And there's some regions of constant frequency.
And there are clearly some regions of frequency modulation, very much like the bat echolocating signal that we just went over. Except that now the modulated part of the signal is in the front instead of at the back like it was for the bat.
And so right here, on the third formant, is a very interesting transition that's not shown in black. It's shown in white. Because in the work of Liberman and Mattingly from the 1980s, they studied this so-called formant transition.
So the vowel here is "ah." And you have these three formants, 1, 2, and 3. And the transition leading up into that is either the consonant D or the consonant G.
When it's coming down, it's the combination "da." But when this third formant transition is instead going up, it's the consonant "gah." A completely different speech sound, "da" versus "gah." No one would ever mistake them.
Right. And so what Liberman and Mattingly did was they varied this transition into the third formant. Instead of just having one like that, or one like this, they sloped it any number of degrees.
All right. And when it goes, I think when it's falling, it's "da." And when it's rising, it's "gah" if I'm not mistaken. So what would you expect if it was right in the middle?
Well, you could expect anything. But actually, the observation is, as you move this formant transition over, subjects do not report something that's in between "gah" and "da." Instead, all of a sudden, they quickly shift from "gah" to "da." All of a sudden.
And there's a very sharp boundary in the shift. And the subjects never report something that's intermediate. So this is an example of putting the speech sound, which can be modulated continuously, into two sharply defined perceptual categories. Either "gah" or "da."
But nothing in between. No gradual slope in between. It's just one or the other. This gave rise to the idea of categorical perception of speech sounds.
The other thing they could do is do things like this. Present the black stimuli to one ear and the white stimuli to the other ear. And you get the perception, then, of a speech sound.
If you don't present any formant transition at all, what would you expect to happen? Well, what actually happens is you do hear something ambiguous if there's actually no formant transition at all. If there's a formant transition, in one ear and the rest of the sound to the other ear, you hear the complete speech sound.
If you just present this formant transition and nothing else, you just hear a little chirp, not a speech sound. But you add that to the rest, and you get an unambiguous or categorical "da" or "gah." OK. This is a beautiful series of experiments by Alvin Liberman in the 1980s.
Now, in cortex, the interesting question, then-- if you draw an analogy between bat signals and human signals, and we've seen spectrograms from the two which are not that much different-- the question is, do we have specialized neurons in our cortices that are sensitive to specific features of those signals?
For example, features are things like whether these two formants are close together, whether there's a formant sweep going down or going up. Do we have specific, if you will, feature detectors in the human cortex? We don't know that.
What we do know clearly is that there are areas that are very selective for language and speech stimuli in the cortex of humans. So in the cortex of humans, we've talked about there being a primary auditory cortex in the temporal lobe here. And we had the little model that showed you that there was a Heschl's gyrus.
And that is the site of primary auditory cortex, or A1, in humans. All around that region is an area that's sometimes called perisylvian cortex. And it gets its name from this big Sylvian fissure, if you will, that divides the temporal lobe down here from the rest of the brain, especially the parietal lobe.
All of this perisylvian cortex is associated with language processing. And how do we know that? Well, of course, imaging studies. But in the beginning, it was the early pathologists like Broca and Wernicke, who studied patients who had lesions in the cortex-- mostly from strokes, but sometimes from other injuries.
They showed that lesions in this region of the brain left patients with deficits in language processing, especially with a deficit called aphasia. OK? Disorders of comprehending or producing spoken language are known as aphasia. And these aphasias are often classified into types.
If you're a neurology resident, or you do your medical rotation in neurology, you will see patients with so-called Broca's aphasia. And this, originally, was brought to light by Broca, who saw such patients with lesions in this part of the brain. That's come to be known as Broca's area.
That's part of the frontal cortex, the lower frontal cortex, near motor areas. And the clinical manifestation is a major disturbance in speech production, with sparse or halting speech. It's often misarticulated, missing function words, and parts of words. So this is clearly a problem with producing speech. Sometimes this is called motor aphasia.
Wernicke is another early physician who saw patients with damaged cortices. He saw some of them with damage to this region, in the caudal temporal lobe and associated parietal lobe. It's an area which has become known as Wernicke's area.
And here, the clinical manifestation is completely different. In this case, the production of speech is fine. But there's a major disturbance in auditory comprehension. OK.
So you ask the patient something and they cannot understand you. But they have fluent speech production. It's fluent speech with maybe disturbances of the sounds and structures of the words. But the major deficit is in the auditory comprehension, language comprehension.
So the question, then, has always been, is this Broca's area the motor area for production of speech and language? And is this Wernicke's area the area for comprehension of speech? And clearly, this is a very simplistic idea-- what would be called the localizationist idea.
I'm not sure if that word is here. But if you're a so-called localization proponent, you would say each little part of the cortex has its own function. And it does that independently of all the other areas.
So Broca's area is involved in producing speech. And Wernicke's area is responsible for comprehending speech. And from imaging studies, we now know this is clearly a very simplistic view, and probably an incorrect view.
It's more likely that the whole of this perisylvian cortex contributes to language processing. And so people who subscribe to that theory would be called holistic. They'd have the holistic view of processing in cortex.
And we'll go over the imaging studies in just a minute. One thing I want to make sure to say is that language processing is clearly a cortical phenomenon. That is, if you have injury to the cortex in these specific areas, you're likely to have aphasia.
If you have injuries to the brain stem, to the thalamus, you are much less likely to have any kind of aphasia. So clearly, the cortex is the place where language is processed. Another thing about cortex and language processing is that it's usually lateralized into one hemisphere or another, OK?
So if you're right handed, usually your language is processed in the opposite hemisphere, in the left cortical areas. And so how is that known? Well, if you have a stroke patient who's right handed, and they have a lesion in the left cortex, they show up with aphasia.
If they have a lesion in the right cortex, there's minimal effect on their language functions. Another way is by the so-called Wada test. So people who are getting ready to have cortical neurosurgery-- so why would you ever want to have cortical neurosurgery?
Anybody? So another big disease in the cortex is epilepsy, right? Epilepsy is uncontrolled activity, usually starting in the cortex. Of course, the first line of attack is by medication.
But some epileptic patients have epilepsy that is not controlled by medication. And they have seizures every half hour. And it's pretty much intractable.
So the last line of attack, then, by the neurosurgeons is to try to go into the cortex and find the part of the cortex where the epileptic focus begins. And then they'd lesion that. And this is a successful treatment.
But if the surgeon goes in and lesions part of the language areas, you have a patient that wakes up as an aphasic. That's not a happy patient. So the surgeons do lots of tests before such surgery.
And one is to try to figure out which hemisphere is processing the language. So they do the Wada test. Has anybody heard of the Wada test? OK.
They take the patient, of course, and-- if they're smart and they plan ahead, the patient is seated. OK? And they have a carotid artery on the left side, and a carotid artery on the right side. And into the carotid artery is injected a quick acting barbiturate anesthetic.
On one side, that carotid artery feeds one hemisphere of the cortex and not the other. So the patient is seated because the patient is likely to slump because they're going to have some motor problems. And the patient might fall over if they were standing up.
But in the test, the patient is supposed to recite from reading or from memory. And the anesthetic is injected. And as soon as that anesthetic hits the hemisphere that's processing language, the patient stops reciting.
That's if they injected on the correct side. If they injected on the other side, the patient keeps reading, keeps reciting. So that's the Wada test.
Language function is mostly in one hemisphere in right handed individuals. In left handers, things are a little bit different. Left handed individuals sometimes have the opposite hemispheric dominance.
Sometimes they have language function distributed bilaterally. Sometimes they have language function in the same side as their handedness. OK. But for this, I'm talking about right handers, OK?
Here's some very interesting work looking at the language areas in people, in postmortem material. It has been done by Al Galaburda. So he's at Beth Israel Deaconess Hospital now.
And he looked at the left and the right hemispheres. And he looked at the, of course, very closely associated language areas, especially right near A1. And so here is-- you can't see them on a side view like this-- but if you cut off the top of the cortex, which you can do, because this is postmortem material.
You look down on the superior surface of the temporal lobe, and you see these views here. And the area that's just caudal to the primary auditory cortex is called the planum temporale. And there are clearly some left-right asymmetries in that region of the brain that almost certainly relate to language processing in that area.
OK? So there are anatomical asymmetries in the perisylvian regions. And so this is the first time in our course, now, where right and left makes a big difference.
All along, we said it doesn't matter if we stimulated the right ear or the left ear. And here, clearly, there is a dominant hemisphere for language.
Now finally, imaging studies have shown us a great deal about the cortical processing of language. And here's data from a PET study, in which the subjects are listening to language stimuli. And these happen to be French speaking subjects.
So the last condition is a story in French. And obviously, the subjects understood the story. They could tell you what was going on. And these are, by the way, right handed subjects.
And here's the imaging of the areas in the left hemisphere, which would be expected to be the dominant hemisphere for language. And this is in the other hemisphere, which shows a lot less activation. The activation in the areas where the subjects were listening to language they understood is the superior temporal area, superior temporal gyrus.
That's including the temporal pole here, in purple. You can't see the yellow very well. But believe me, there's a lot of activation here. This blue area, labeled IFG-- inferior frontal gyrus-- this is Broca's area.
And why does it light up if Broca's area is only a motor area? It's clearly involved in motor functions. But here are imaging results from subjects just listening, not producing language, where Broca's area lights up.
It's activated on the dominant side, just in the listening task. Contrast that to when the subjects were listening to a story in a language that they did not understand-- this language is Tamil, and none of the subjects could speak it. There's hardly any activation in those regions.
There's a little bit of yellow activation near the primary auditory cortex. It's pretty symmetric, left to right. And that's just what you'd expect if you were, for example, given pure tones or noise.
This is a nice control because, presumably, this language has about the same frequency content. And other factors are fairly similar between these two languages. The one difference is that the subjects were not perceptually aware of what they were learning about in the story, in the case of the unfamiliar language.
These intermediate conditions, some of them having pseudo words and anomalous sentences, didn't light up the language areas to a great degree. But listening to a list, in this case, of French words-- which the subjects were familiar with-- again showed activation, I'll point out to you, in Broca's area in the dominant hemisphere, in these right handed subjects.
So again, clearly, just a listening task can light up Broca's area. And so that is a very clear example showing you that these so-called motor areas, like Broca's area, are involved in listening and cortical processing of language stimuli.
It's not just involved in motor production of speech, even though what we call clinically Broca's aphasia has a major disturbance in speech production. OK. And one final thing to leave you with.
Broca's area tends to light up, in this case, with fairly simple stimuli. But it tends to light up in other studies, like in cases where the sentences have difficult grammar or complex meaning. And so the subjects, you can imagine, are really listening hard and trying to figure out the meaning of a sentence that has a complicated grammar.
I think we've all written such sentences. We've all tried to read them from other writers. And it takes a lot of brain power, then, to decode that, and figure out the meaning.
And maybe that's what happens here. Broca's area's called in when the task gets more difficult than just a simple list of words. In this case, it lit up. And in other studies, it's clearly showing more activation when the task gets more difficult.
OK. So we're out of time. I can take a question or two. And just a reminder, class meets at Mass. Eye and Ear on Wednesday for the lab tour. So I'll see you over there.