1 00:00:00,060 --> 00:00:02,400 The following content is provided under a Creative 2 00:00:02,400 --> 00:00:03,790 Commons license. 3 00:00:03,790 --> 00:00:06,030 Your support will help MIT OpenCourseWare 4 00:00:06,030 --> 00:00:10,120 continue to offer high-quality educational resources for free. 5 00:00:10,120 --> 00:00:12,660 To make a donation or to view additional materials 6 00:00:12,660 --> 00:00:16,590 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:16,590 --> 00:00:17,966 at ocw.mit.edu. 8 00:00:20,860 --> 00:00:22,110 PROFESSOR: So the cochlea is-- 9 00:00:22,110 --> 00:00:23,318 I'm just going to close this. 10 00:00:23,318 --> 00:00:23,850 Is that OK? 11 00:00:26,986 --> 00:00:29,940 The cochlea is long and skinny-- 12 00:00:29,940 --> 00:00:31,220 and maybe not? 13 00:00:36,840 --> 00:00:39,305 Maybe not. 14 00:00:39,305 --> 00:00:40,680 And the different parts of it are 15 00:00:40,680 --> 00:00:44,350 sensitive to different frequencies. 16 00:00:44,350 --> 00:00:45,360 Ringing any bells? 17 00:00:45,360 --> 00:00:46,100 Wow. 18 00:00:46,100 --> 00:00:46,800 Thanks. 19 00:00:46,800 --> 00:00:49,750 AUDIENCE: That's where the higher [INAUDIBLE] I think, 20 00:00:49,750 --> 00:00:54,539 Abby said yesterday that for really low tones, 21 00:00:54,539 --> 00:01:01,844 it's basically an increase in the frequency [INAUDIBLE] 22 00:01:01,844 --> 00:01:05,741 but that at a certain point if that becomes impractical then-- 23 00:01:05,741 --> 00:01:09,540 and then certain areas are [INAUDIBLE] 24 00:01:09,540 --> 00:01:13,960 PROFESSOR: So I think mainly for the frequencies 25 00:01:13,960 --> 00:01:14,810 that speech are in-- 26 00:01:14,810 --> 00:01:18,080 I'm here to talk about language, by the way-- 27 00:01:18,080 --> 00:01:20,480 the frequencies of speech are more 28 00:01:20,480 --> 00:01:23,780 like the different frequencies of the cochlea. 29 00:01:23,780 --> 00:01:26,090 Different parts are sensitive to different frequencies. 
30 00:01:26,090 --> 00:01:28,920 So I'm just going to go with that story for now. 31 00:01:28,920 --> 00:01:32,345 But I'm sure Abby knows more about this than I do. 32 00:01:32,345 --> 00:01:34,310 The different parts of the cochlea 33 00:01:34,310 --> 00:01:37,050 are sensitive to different frequencies in sound. 34 00:01:37,050 --> 00:01:39,545 Do you guys know much about waves? 35 00:01:43,580 --> 00:01:47,810 Basically, any signal, like any sound signal 36 00:01:47,810 --> 00:01:54,290 can be analyzed as a composite of different frequencies 37 00:01:54,290 --> 00:01:55,490 of waves. 38 00:01:55,490 --> 00:02:01,430 So it's going to be maybe a lot of 400 Hertz 39 00:02:01,430 --> 00:02:05,540 and a little bit of 600 Hertz or what have you, 40 00:02:05,540 --> 00:02:09,229 and you can basically entirely understand 41 00:02:09,229 --> 00:02:12,480 the sound, more or less, based on this analysis. 42 00:02:12,480 --> 00:02:13,880 So that's what the cochlea does. 43 00:02:13,880 --> 00:02:15,200 Do you understand this? 44 00:02:15,200 --> 00:02:19,700 So the cochlea will tell you, at any given moment, 45 00:02:19,700 --> 00:02:25,430 how present each of the frequencies is. 46 00:02:25,430 --> 00:02:27,650 Nod if you understand what I'm saying. 47 00:02:27,650 --> 00:02:29,270 Go like this. 48 00:02:29,270 --> 00:02:33,892 Do like this, if you don't understand. 49 00:02:33,892 --> 00:02:34,823 No? 50 00:02:34,823 --> 00:02:35,323 Yes? 51 00:02:38,966 --> 00:02:40,090 The cochlea is in your ear. 52 00:02:40,090 --> 00:02:42,898 AUDIENCE: [INAUDIBLE] 53 00:02:42,898 --> 00:02:44,890 That was nice of you guys to tell us. 54 00:02:44,890 --> 00:02:46,000 Hey, Sara-- 55 00:02:46,000 --> 00:02:49,270 PROFESSOR: --sensitive to different frequencies. 56 00:02:49,270 --> 00:02:51,280 Also sensitive to different frequencies 57 00:02:51,280 --> 00:02:54,820 is my handy dandy computer program, 58 00:02:54,820 --> 00:02:57,966 which I will now show you. 
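The idea that a sound is "a lot of 400 Hertz and a little bit of 600 Hertz" can be sketched in a few lines of code. This is only an illustration, not the program shown in the lecture: the sample rate, the signal length, and the amplitudes 1.0 and 0.25 are assumptions chosen so the arithmetic comes out cleanly, and `make_signal` and `magnitude_at` are hypothetical helper names.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)
N = 400             # 50 ms of signal, so each test frequency fits a whole number of cycles


def make_signal():
    """A lot of 400 Hz plus a little bit of 600 Hz (amplitudes are assumptions)."""
    return [
        1.0 * math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
        + 0.25 * math.sin(2 * math.pi * 600 * n / SAMPLE_RATE)
        for n in range(N)
    ]


def magnitude_at(signal, freq_hz):
    """Amplitude of one frequency component, computed as a single DFT term."""
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
        for n, x in enumerate(signal)
    )
    return 2 * abs(total) / len(signal)


signal = make_signal()
# How present is each frequency in the signal?
strengths = {f: round(magnitude_at(signal, f), 3) for f in (200, 400, 600, 800)}
print(strengths)
```

The analysis recovers a strong 400 Hz component, a weak 600 Hz component, and essentially nothing at 200 Hz or 800 Hz, which is the kind of per-frequency report the lecture attributes to the cochlea.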
59 00:02:57,966 --> 00:02:59,090 So this is what it sounds-- 60 00:02:59,090 --> 00:02:59,820 COMPUTER: b. 61 00:02:59,820 --> 00:03:00,760 [LAUGHTER] 62 00:03:00,760 --> 00:03:02,343 PROFESSOR: I don't know why it shakes. 63 00:03:05,280 --> 00:03:07,240 It says bee. 64 00:03:07,240 --> 00:03:13,090 So we have a beh sound going on and then an e vowel. 65 00:03:13,090 --> 00:03:19,000 So what's up here is just plain and simple, just the wave form. 66 00:03:19,000 --> 00:03:22,840 And what's going on down here is what 67 00:03:22,840 --> 00:03:26,770 I was talking about before with the frequency analysis. 68 00:03:26,770 --> 00:03:31,660 So basically, along the x-axis, we have time. 69 00:03:31,660 --> 00:03:35,620 And then on the y-axis, we have many different frequencies. 70 00:03:35,620 --> 00:03:40,000 And then, the darkness of the dot 71 00:03:40,000 --> 00:03:44,530 is how strong that frequency is at that time. 72 00:03:44,530 --> 00:03:46,820 Does this make sense? 73 00:03:46,820 --> 00:03:49,370 So what you can see, for example, 74 00:03:49,370 --> 00:03:52,414 is that during this e vowel-- 75 00:03:52,414 --> 00:03:53,740 COMPUTER: e. 76 00:03:53,740 --> 00:03:58,870 PROFESSOR: During the e vowel, we have many striations, 77 00:03:58,870 --> 00:03:59,950 like up and down stripes. 78 00:03:59,950 --> 00:04:01,090 Do you see this? 79 00:04:01,090 --> 00:04:02,470 Up and down stripes. 80 00:04:02,470 --> 00:04:07,360 And that's from, just plain and simple, the vocal cords-- 81 00:04:07,360 --> 00:04:11,870 just the sound going up and down. 82 00:04:11,870 --> 00:04:15,071 So sound is a periodic function, sometimes high pressure, 83 00:04:15,071 --> 00:04:16,029 sometimes low pressure. 84 00:04:16,029 --> 00:04:18,490 And that's what that's all about. 85 00:04:18,490 --> 00:04:20,050 So does this make sense? 86 00:04:20,050 --> 00:04:21,070 So this is voice. 
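The picture described above, with time along the x-axis, frequency on the y-axis, and darkness showing strength, can be approximated by cutting a signal into short frames and analyzing each frame separately. The sketch below is an assumption about how such a display could be computed, not the lecture program itself; the 25 ms frame size, the two test tones, and the names `tone`, `frame_spectrum`, and `spectrogram` are all made up for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME = 200         # 25 ms per frame, so frequency bins are 40 Hz apart


def tone(freq_hz, n_samples):
    """A pure sine tone, used here as stand-in test data."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(n_samples)]


def frame_spectrum(frame):
    """Strength of each frequency bin (up to half the sample rate) in one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(signal):
    """One spectrum per frame: rows are time steps, columns are frequencies."""
    return [
        frame_spectrum(signal[start:start + FRAME])
        for start in range(0, len(signal) - FRAME + 1, FRAME)
    ]


# 50 ms of a 320 Hz tone followed by 50 ms of a 2000 Hz tone.
signal = tone(320, 400) + tone(2000, 400)
spec = spectrogram(signal)
bin_hz = SAMPLE_RATE / FRAME
# The "darkest dot" in each time step is the strongest frequency in that frame.
peaks = [row.index(max(row)) * bin_hz for row in spec]
print(peaks)
```

Each row of `spec` corresponds to one vertical slice of the display, and the strongest frequency moves from 320 Hz to 2000 Hz halfway through, just as a dark band would jump upward on the screen.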
87 00:04:21,070 --> 00:04:23,800 And you can tell that it's voice because of all these striations, 88 00:04:23,800 --> 00:04:24,510 especially down here. 89 00:04:24,510 --> 00:04:27,051 And this is a pitch track, which we should probably turn off. 90 00:04:31,010 --> 00:04:34,776 So does everyone understand what's going on here? 91 00:04:34,776 --> 00:04:35,975 You're not at all confused? 92 00:04:39,450 --> 00:04:42,270 So let me explain in more detail. 93 00:04:42,270 --> 00:04:45,960 So let's say we were to turn on the lights. 94 00:04:49,360 --> 00:05:08,190 Then, the vocal tract looks kind of like this. 95 00:05:17,810 --> 00:05:20,810 This would be the mouth opening. 96 00:05:20,810 --> 00:05:22,910 These are the opening parts. 97 00:05:22,910 --> 00:05:24,630 This is the inside of your mouth. 98 00:05:24,630 --> 00:05:28,310 And then you have here your windpipe. 99 00:05:28,310 --> 00:05:31,790 And somewhere below here are some lungs. 100 00:05:31,790 --> 00:05:34,610 And somewhere down here you have your vocal folds, 101 00:05:34,610 --> 00:05:36,570 and those are going to vibrate and create voicing. 102 00:05:36,570 --> 00:05:38,420 So anytime you say a voiced vowel, 103 00:05:38,420 --> 00:05:41,780 like b or ah or whatever, you're going 104 00:05:41,780 --> 00:05:45,590 to have these guys vibrating. 105 00:05:45,590 --> 00:05:50,000 And they're going to create those up and down striations. 106 00:05:50,000 --> 00:05:55,070 So the question is how do you make different sounds? 107 00:05:55,070 --> 00:05:55,880 I'll ask you. 108 00:05:55,880 --> 00:05:58,205 How do you make different sounds? 109 00:05:58,205 --> 00:06:00,920 AUDIENCE: Obstructing the air flow from the windpipe 110 00:06:00,920 --> 00:06:06,609 using your mouth, different parts of your mouth. 111 00:06:06,609 --> 00:06:07,650 PROFESSOR: Yeah, exactly. 
112 00:06:07,650 --> 00:06:10,110 So basically, the difference between different sounds 113 00:06:10,110 --> 00:06:11,985 is going to have something to do with the way 114 00:06:11,985 --> 00:06:13,290 this whole thing is shaped. 115 00:06:13,290 --> 00:06:15,130 Does that make sense? 116 00:06:15,130 --> 00:06:18,090 So for example, if you were to make a buh, 117 00:06:18,090 --> 00:06:20,310 you would close your lips. 118 00:06:20,310 --> 00:06:23,460 So then this would be blocked off here, and then 119 00:06:23,460 --> 00:06:26,281 suddenly opened. 120 00:06:26,281 --> 00:06:30,150 Go like this if you understand. 121 00:06:30,150 --> 00:06:34,170 And then the way your tongue is, is going 122 00:06:34,170 --> 00:06:36,132 to make this shape different. 123 00:06:36,132 --> 00:06:37,590 And then the way your glottis is, 124 00:06:37,590 --> 00:06:39,150 is going to make the back different 125 00:06:39,150 --> 00:06:40,500 and all this sort of stuff. 126 00:06:40,500 --> 00:06:44,385 And this is how you make the different sounds. 127 00:06:51,695 --> 00:06:53,820 So for example, the way you make different vowels-- 128 00:06:53,820 --> 00:07:00,090 if you try saying like awe, e, awe, e, or something like that, 129 00:07:00,090 --> 00:07:03,540 or pretend like you're doing it, then 130 00:07:03,540 --> 00:07:06,150 it's mainly your tongue moving. 131 00:07:06,150 --> 00:07:08,210 You can kind of-- 132 00:07:08,210 --> 00:07:10,290 if you want to try it, it would be OK. 133 00:07:10,290 --> 00:07:13,680 awe, e-- it's basically your tongue 134 00:07:13,680 --> 00:07:15,390 going one way or another. 135 00:07:15,390 --> 00:07:18,630 And your tongue is going to create some sort of obstruction 136 00:07:18,630 --> 00:07:23,070 somewhere in here for each one. 
137 00:07:23,070 --> 00:07:27,300 If you think about it, obstructing any part 138 00:07:27,300 --> 00:07:30,300 of your--of this whole pipe is going to make different 139 00:07:30,300 --> 00:07:31,920 cavities of different sizes. 140 00:07:31,920 --> 00:07:32,950 Does that make sense? 141 00:07:32,950 --> 00:07:34,026 So if you were to have-- 142 00:07:34,026 --> 00:07:35,400 this is just purely theoretical-- 143 00:07:35,400 --> 00:07:38,550 if you were to have an obstruction like here, 144 00:07:38,550 --> 00:07:40,400 then you would have a cavity of this size 145 00:07:40,400 --> 00:07:42,892 and a cavity of that size. 146 00:07:42,892 --> 00:07:45,100 If you were to make one here or something like that-- 147 00:07:45,100 --> 00:07:46,740 I'm not sure you could-- 148 00:07:46,740 --> 00:07:52,050 but then you would have you-- you could have multiple ones. 149 00:07:52,050 --> 00:07:55,149 If you were to do this with your lips, 150 00:07:55,149 --> 00:07:56,940 you would make this one a little bit bigger 151 00:07:56,940 --> 00:08:01,940 by a few centimeters, two centimeters or so. 152 00:08:01,940 --> 00:08:04,170 Does that make sense? 153 00:08:04,170 --> 00:08:06,130 So what do you guys know about resonance? 154 00:08:10,370 --> 00:08:10,870 Yeah? 155 00:08:10,870 --> 00:08:15,438 AUDIENCE: When two waves have the similar-- 156 00:08:15,438 --> 00:08:18,140 hold the same frequency, they can 157 00:08:18,140 --> 00:08:19,472 have a fight with each other. 158 00:08:19,472 --> 00:08:20,180 PROFESSOR: Right. 159 00:08:20,180 --> 00:08:26,020 So basically anything has what's called a fundamental frequency. 160 00:08:26,020 --> 00:08:29,420 And this is the frequency of a wave that 161 00:08:29,420 --> 00:08:35,840 would have a whole number of wavelengths with respect 162 00:08:35,840 --> 00:08:37,100 to the size of the thing. 163 00:08:37,100 --> 00:08:38,809 Something like that. 
164 00:08:38,809 --> 00:08:41,570 So for instance, the cavities in your vocal cord, 165 00:08:41,570 --> 00:08:44,570 like the vocal tract, the various parts 166 00:08:44,570 --> 00:08:46,910 of your vocal tract that may or may not be blocked off, 167 00:08:46,910 --> 00:08:50,276 are going to have different resonant frequencies. 168 00:08:50,276 --> 00:08:51,470 Are you following? 169 00:08:51,470 --> 00:08:54,290 So it depends on the size of the thing. 170 00:08:54,290 --> 00:08:57,480 So the bigger a part of your vocal tract 171 00:08:57,480 --> 00:09:00,401 is, the lower the resonant frequency. 172 00:09:00,401 --> 00:09:02,150 And then smaller parts of your vocal tract 173 00:09:02,150 --> 00:09:05,790 are going to have higher resonant frequencies. 174 00:09:05,790 --> 00:09:07,790 So if you're sending air through and you're 175 00:09:07,790 --> 00:09:10,370 making sound, what you're basically going to get 176 00:09:10,370 --> 00:09:12,740 is that certain frequencies are going 177 00:09:12,740 --> 00:09:16,220 to resonate and be really loud. 178 00:09:16,220 --> 00:09:18,140 And that's what we see in this picture. 179 00:09:18,140 --> 00:09:21,013 Can you still see it, even though the lights? 180 00:09:21,013 --> 00:09:21,655 AUDIENCE: Yes. 181 00:09:21,655 --> 00:09:22,280 PROFESSOR: Yes? 182 00:09:22,280 --> 00:09:22,779 OK. 183 00:09:22,779 --> 00:09:23,895 Good. 184 00:09:23,895 --> 00:09:25,520 So that's what you see in this picture. 185 00:09:28,320 --> 00:09:31,840 So for b-- or for the e vowel-- 186 00:09:31,840 --> 00:09:33,800 COMPUTER: e. 187 00:09:33,800 --> 00:09:38,180 PROFESSOR: We have this is the frequency of one 188 00:09:38,180 --> 00:09:42,290 of the big cavities in the vocal tract. 189 00:09:42,290 --> 00:09:47,090 And this is going to be the frequency of another one, 190 00:09:47,090 --> 00:09:49,400 and this is going to be another one. 191 00:09:49,400 --> 00:09:51,560 So this is going to be the biggest one. 
192 00:09:51,560 --> 00:09:54,200 This is the next biggest one, et cetera, et cetera. 193 00:09:54,200 --> 00:09:55,519 These are called formants. 194 00:09:55,519 --> 00:09:56,310 I'll write it down. 195 00:10:07,900 --> 00:10:10,730 So does that make sense? 196 00:10:10,730 --> 00:10:13,810 So basically, changing your mouth, 197 00:10:13,810 --> 00:10:16,060 changing the shape of your mouth and your vocal tract, 198 00:10:16,060 --> 00:10:22,900 is going to directly change the frequencies of the formants. 199 00:10:22,900 --> 00:10:26,530 This program is basically doing exactly what the cochlea does. 200 00:10:26,530 --> 00:10:29,230 So this program takes all of the frequencies and figures 201 00:10:29,230 --> 00:10:33,100 out which are the strongest and which are the weakest. 202 00:10:33,100 --> 00:10:38,560 It basically analyzes the different magnitudes 203 00:10:38,560 --> 00:10:40,420 of each of the different frequencies that 204 00:10:40,420 --> 00:10:41,710 are all available. 205 00:10:45,390 --> 00:10:49,450 So basically, that seems to be what the speech signal is all 206 00:10:49,450 --> 00:10:50,860 about. 207 00:10:50,860 --> 00:10:53,960 So when you're producing the speech signal, 208 00:10:53,960 --> 00:10:57,700 you're going to be changing the shape of the vocal tract, 209 00:10:57,700 --> 00:11:00,029 and that's going to be changing what the formants are 210 00:11:00,029 --> 00:11:00,820 going to look like. 211 00:11:00,820 --> 00:11:04,390 And your cochlea is mapping out the formants for you. 212 00:11:04,390 --> 00:11:06,304 That's what it's doing. 213 00:11:06,304 --> 00:11:07,220 Does this make sense? 214 00:11:11,020 --> 00:11:13,570 Then, we can see what is the difference 215 00:11:13,570 --> 00:11:15,790 between different vowels. 216 00:11:15,790 --> 00:11:17,390 I'll just play the whole thing. 217 00:11:17,390 --> 00:11:23,170 COMPUTER: Dee, dee, bah, dah. 
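The claim that bigger cavities resonate at lower frequencies can be checked with a standard textbook idealization. The sketch below swaps in a model the lecture does not state: a uniform tube closed at one end and open at the other, whose resonances fall at odd multiples of the speed of sound divided by four times the tube length. The 17 cm length, the speed of sound value, and the name `tube_resonances` are assumptions for illustration; a real vocal tract is not a uniform tube.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C


def tube_resonances(length_m, count=3):
    """Resonances (Hz) of an idealized uniform tube closed at one end.

    Uses the quarter-wavelength rule f_n = (2n - 1) * c / (4 * L).
    """
    return [(2 * n - 1) * SPEED_OF_SOUND / (4 * length_m) for n in range(1, count + 1)]


long_tube = tube_resonances(0.17)    # roughly the length of an adult vocal tract (assumed)
short_tube = tube_resonances(0.085)  # half that length

print([round(f) for f in long_tube])
print([round(f) for f in short_tube])
```

Halving the tube length doubles every resonance, which matches the direction of the relationship described in the lecture: smaller cavities, higher resonant frequencies.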
218 00:11:23,170 --> 00:11:27,040 PROFESSOR: So let's just look at the difference between-- 219 00:11:27,040 --> 00:11:29,020 let's see, this is dee and this is bah. 220 00:11:34,920 --> 00:11:39,060 What are the differences between the formants 221 00:11:39,060 --> 00:11:42,030 of the dee and bah here? 222 00:11:56,770 --> 00:11:59,628 AUDIENCE: There's a very big one which has 223 00:11:59,628 --> 00:12:04,100 a low frequency [INAUDIBLE]. 224 00:12:04,100 --> 00:12:05,066 PROFESSOR: For dee? 225 00:12:05,066 --> 00:12:06,770 AUDIENCE: Yes. 226 00:12:06,770 --> 00:12:08,610 PROFESSOR: So the formant one-- 227 00:12:08,610 --> 00:12:09,560 this is formant one. 228 00:12:09,560 --> 00:12:13,410 It's the bottom one-- 229 00:12:13,410 --> 00:12:17,715 is lower in dee than in bah. 230 00:12:17,715 --> 00:12:19,760 Are people seeing this? 231 00:12:19,760 --> 00:12:24,380 This one is lower frequency than this one. 232 00:12:24,380 --> 00:12:27,609 Maybe if we were to show the-- 233 00:12:27,609 --> 00:12:29,319 AUDIENCE: [INAUDIBLE] 234 00:12:29,319 --> 00:12:30,860 PROFESSOR: So the red dots are trying 235 00:12:30,860 --> 00:12:32,070 to track the formants here. 236 00:12:32,070 --> 00:12:34,920 So this one is lower than this one. 237 00:12:34,920 --> 00:12:36,962 Do you see that? 238 00:12:36,962 --> 00:12:37,920 How about formant two? 239 00:12:46,884 --> 00:12:51,366 AUDIENCE: In the awe sounds, it's one or two 240 00:12:51,366 --> 00:12:54,670 lower [INAUDIBLE] 241 00:12:54,670 --> 00:13:00,380 PROFESSOR: So formant two is very high in the e sound 242 00:13:00,380 --> 00:13:03,092 and very low in the awe sound. 243 00:13:03,092 --> 00:13:04,550 So this is basically the difference 244 00:13:04,550 --> 00:13:09,935 between what's called a high vowel, like e, 245 00:13:09,935 --> 00:13:14,330 and a low vowel, like awe, is the differences in formants one 246 00:13:14,330 --> 00:13:15,020 and two. 
247 00:13:15,020 --> 00:13:17,861 If you look at the spectrum of vowels-- 248 00:13:17,861 --> 00:13:18,360 spectrum. 249 00:13:18,360 --> 00:13:20,750 If you look at-- 250 00:13:20,750 --> 00:13:22,400 all the vowel systems in the world 251 00:13:22,400 --> 00:13:25,850 seem to have vowels that-- 252 00:13:25,850 --> 00:13:28,110 there's sort of a trapezoidal thing going. 253 00:13:28,110 --> 00:13:32,080 So e is over here, and ooh is over here. 254 00:13:32,080 --> 00:13:40,640 And awe and ah is over here, and like uh, oh and eh. 255 00:13:40,640 --> 00:13:43,370 So this is some of the vowels of English. 256 00:13:52,530 --> 00:13:56,100 If you look at vowel systems, languages 257 00:13:56,100 --> 00:13:59,610 can have three-vowel systems or five-vowel systems 258 00:13:59,610 --> 00:14:03,090 or seven or eight or however many, 259 00:14:03,090 --> 00:14:06,120 but they're all going to be distributed in here. 260 00:14:06,120 --> 00:14:09,049 And if you look at actually the formants of these, 261 00:14:09,049 --> 00:14:10,590 then the big differences between them 262 00:14:10,590 --> 00:14:15,660 is going to be between formants one and two. 263 00:14:15,660 --> 00:14:19,650 So formant one is going to be highest for the low vowels, 264 00:14:19,650 --> 00:14:22,470 and lowest for the high vowels. 265 00:14:22,470 --> 00:14:25,430 And formant two is basically going to go the other way. 266 00:14:25,430 --> 00:14:25,930 No. 267 00:14:25,930 --> 00:14:27,830 Formant two is going to be front back. 268 00:14:27,830 --> 00:14:30,945 Formant two is going to create this distinction. 269 00:14:33,925 --> 00:14:35,550 And this all happens because of the way 270 00:14:35,550 --> 00:14:37,216 the tongue is when you say these vowels. 271 00:14:37,216 --> 00:14:42,210 So when you say awe, e, you're making different obstructions. 
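The point that vowels are told apart mainly by formants one and two can be sketched as a lookup in a two-dimensional space. The numbers below are rough, commonly cited averages for adult male speakers of American English and are included only as illustrative assumptions; they are not measurements from the recordings played in the lecture, and `nearest_vowel` is a hypothetical helper.

```python
import math

# (formant one, formant two) in Hz; approximate illustrative values only.
VOWEL_FORMANTS = {
    "e": (270, 2290),    # high front vowel: low formant one, high formant two
    "ooh": (300, 870),   # high back vowel: low formant one, low formant two
    "ah": (730, 1090),   # low vowel: high formant one
}


def nearest_vowel(f1, f2):
    """Pick the vowel whose (formant one, formant two) point is closest."""
    return min(VOWEL_FORMANTS, key=lambda v: math.dist((f1, f2), VOWEL_FORMANTS[v]))


print(nearest_vowel(300, 2200))
print(nearest_vowel(700, 1100))
```

A measurement with a low formant one and a high formant two lands nearest "e", while a high formant one lands nearest "ah", mirroring the high versus low and front versus back distinctions drawn on the board.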
272 00:14:42,210 --> 00:14:46,910 You're basically dividing the vocal tract in different ways 273 00:14:46,910 --> 00:14:49,410 and creating different resonant frequencies. 274 00:14:49,410 --> 00:14:51,295 And that's what's going on. 275 00:14:51,295 --> 00:14:52,170 Does this make sense? 276 00:14:55,150 --> 00:14:56,690 So that's what the vowels are about. 277 00:15:00,340 --> 00:15:06,692 I just want to tell you maybe about consonants. 278 00:15:12,509 --> 00:15:13,300 I'll play it again. 279 00:15:13,300 --> 00:15:18,640 COMPUTER: Bee, dee, baw, daw. 280 00:15:18,640 --> 00:15:24,390 PROFESSOR: So if you look at the wave form, this is for, what? 281 00:15:24,390 --> 00:15:24,890 Dee? 282 00:15:24,890 --> 00:15:25,500 COMPUTER: Bee. 283 00:15:25,500 --> 00:15:26,590 PROFESSOR: Bee, yeah. 284 00:15:26,590 --> 00:15:28,430 This is for bee. 285 00:15:28,430 --> 00:15:31,970 So if you divide this up, as it works out, 286 00:15:31,970 --> 00:15:34,730 there is silence for a while. 287 00:15:34,730 --> 00:15:38,120 This is the part that happens when your mouth is closed 288 00:15:38,120 --> 00:15:43,240 and you're saying the part that happens before the buh. 289 00:15:43,240 --> 00:15:47,180 So the buh sound is a release. 290 00:15:47,180 --> 00:15:50,090 I don't know if you've noticed, but if you say bee, 291 00:15:50,090 --> 00:15:52,730 there's part of it where your mouth is closed. 292 00:15:52,730 --> 00:15:55,070 And then, when your mouth opens, a buh sound 293 00:15:55,070 --> 00:15:58,530 happens, and then the vowel. 294 00:15:58,530 --> 00:16:01,550 So this is the part where your mouth is closed. 295 00:16:01,550 --> 00:16:03,830 And then, this is going to be the release. 296 00:16:03,830 --> 00:16:05,413 I don't know if you can hear anything. 297 00:16:07,460 --> 00:16:09,920 That's the consonant itself-- the buh. 298 00:16:09,920 --> 00:16:14,500 And then it's going to open up into the vowel itself. 299 00:16:14,500 --> 00:16:17,057 COMPUTER: Bee, bee. 
300 00:16:17,057 --> 00:16:18,890 PROFESSOR: The reason it sounds like there's 301 00:16:18,890 --> 00:16:20,640 a consonant at the beginning is because it 302 00:16:20,640 --> 00:16:23,900 starts kind of abruptly. 303 00:16:23,900 --> 00:16:26,720 So if you were to say a vowel with no consonant 304 00:16:26,720 --> 00:16:30,770 at the beginning at all, it would start very quietly 305 00:16:30,770 --> 00:16:32,990 and get louder and then get softer again. 306 00:16:32,990 --> 00:16:35,480 So you can see that it gets softer again. 307 00:16:35,480 --> 00:16:40,240 But it starts loudly because it's preceded by this stop-- 308 00:16:40,240 --> 00:16:41,240 the buh sound. 309 00:16:41,240 --> 00:16:42,530 COMPUTER: Bee. 310 00:16:42,530 --> 00:16:45,770 PROFESSOR: So this is all the vowel, 311 00:16:45,770 --> 00:16:48,950 and this is all the consonant. 312 00:16:48,950 --> 00:16:53,120 What is mainly the difference in volume between the vowel 313 00:16:53,120 --> 00:16:54,188 and the consonant? 314 00:16:57,604 --> 00:16:59,556 Yeah? 315 00:16:59,556 --> 00:17:07,852 AUDIENCE: The consonant doesn't have much volume. 316 00:17:07,852 --> 00:17:09,930 PROFESSOR: The consonant is really quiet 317 00:17:09,930 --> 00:17:12,119 compared to the vowel. 318 00:17:12,119 --> 00:17:13,670 So the consonant is going to-- 319 00:17:13,670 --> 00:17:15,533 this is what the consonant sounds like. 320 00:17:15,533 --> 00:17:16,979 COMPUTER: [SOUND SIGNAL] 321 00:17:16,979 --> 00:17:17,770 PROFESSOR: Versus-- 322 00:17:17,770 --> 00:17:18,970 COMPUTER: Bee. 323 00:17:18,970 --> 00:17:23,109 PROFESSOR: So very quiet compared to the vowel. 324 00:17:23,109 --> 00:17:25,720 And let's maybe look at dee. 325 00:17:32,030 --> 00:17:33,905 COMPUTER: Dee, dee. 326 00:17:33,905 --> 00:17:35,280 PROFESSOR: So how about this one? 327 00:17:40,690 --> 00:17:44,170 Can you tell this is where the vowel is happening? 328 00:17:44,170 --> 00:17:45,250 COMPUTER: Dee. 
329 00:17:45,250 --> 00:17:47,625 PROFESSOR: And this is the consonant. 330 00:17:47,625 --> 00:17:50,535 COMPUTER: [SOUND SIGNAL] 331 00:17:52,970 --> 00:17:54,980 PROFESSOR: So what do we think about this? 332 00:17:54,980 --> 00:17:58,002 AUDIENCE: Well, in this one, the duh sound 333 00:17:58,002 --> 00:18:01,923 doesn't obstruct the airflow, so it's the starting point-- 334 00:18:01,923 --> 00:18:03,370 or the duh sound. 335 00:18:03,370 --> 00:18:06,165 And the starting point of the e sound 336 00:18:06,165 --> 00:18:11,590 is at the same volume [INAUDIBLE] 337 00:18:11,590 --> 00:18:13,590 PROFESSOR: Yeah, I mean, it looks 338 00:18:13,590 --> 00:18:17,490 like what's happening here is that this is the release. 339 00:18:17,490 --> 00:18:19,830 This is where the tongue releases so you 340 00:18:19,830 --> 00:18:23,340 don't have a stop anymore-- 341 00:18:23,340 --> 00:18:26,230 stop in the air flow-- you don't have that anymore right here. 342 00:18:26,230 --> 00:18:28,060 And then, there's a little bit of noise, 343 00:18:28,060 --> 00:18:31,440 which is probably just air passing out of the mouth. 344 00:18:31,440 --> 00:18:33,390 And then the vowel seems to start. 345 00:18:33,390 --> 00:18:35,040 But it is louder. 346 00:18:35,040 --> 00:18:38,880 The stop itself is louder, but certainly-- 347 00:18:38,880 --> 00:18:42,360 louder than the buh one, but certainly not as loud 348 00:18:42,360 --> 00:18:43,320 as a vowel-- 349 00:18:43,320 --> 00:18:47,530 or not the main part of the vowel anyway. 350 00:18:47,530 --> 00:18:50,570 And you can see the difference here. 351 00:18:50,570 --> 00:18:56,690 So the duh part looks like this. 352 00:18:56,690 --> 00:19:00,400 So it's got the beginnings of the formants. 353 00:19:00,400 --> 00:19:02,110 You're ready to say the vowel. 354 00:19:02,110 --> 00:19:04,860 Your mouth is shaped that way. 355 00:19:04,860 --> 00:19:07,660 But you don't really-- 356 00:19:07,660 --> 00:19:10,780 they don't get really dark until here. 
357 00:19:10,780 --> 00:19:13,030 This is when you can really see stuff going on. 358 00:19:13,030 --> 00:19:14,050 So it's also quieter. 359 00:19:16,630 --> 00:19:18,580 Basically, what I'm trying to say 360 00:19:18,580 --> 00:19:22,610 is that consonants are much quieter than vowels. 361 00:19:27,070 --> 00:19:30,360 So we can look at these and see if they're the same. 362 00:19:33,160 --> 00:19:40,600 COMPUTER: Baw, baw, baw. 363 00:19:40,600 --> 00:19:43,580 PROFESSOR: I don't know why it shakes. 364 00:19:43,580 --> 00:19:46,943 So this is sort of the same thing going on, versus-- 365 00:19:52,570 --> 00:19:54,944 COMPUTER: Daw, daw. 366 00:19:54,944 --> 00:19:57,485 PROFESSOR: So again, we see that duh is a little bit louder-- 367 00:19:57,485 --> 00:19:59,015 COMPUTER: [SOUND SIGNAL] 368 00:19:59,015 --> 00:20:00,530 PROFESSOR: --than the buh was. 369 00:20:00,530 --> 00:20:03,140 But it's still nowhere near as loud as the main part 370 00:20:03,140 --> 00:20:05,425 of the vowel. 371 00:20:05,425 --> 00:20:08,610 AUDIENCE: What if you dragged it to that-- 372 00:20:08,610 --> 00:20:11,459 can you start playing it in the middle of the vowel? 373 00:20:11,459 --> 00:20:13,220 Would we still tend to hear a consonant? 374 00:20:13,220 --> 00:20:14,330 PROFESSOR: You would hear a consonant, 375 00:20:14,330 --> 00:20:15,371 but it wouldn't be a dee. 376 00:20:15,371 --> 00:20:15,975 Listen. 377 00:20:15,975 --> 00:20:20,680 COMPUTER: Awe, awe, awe, awe, awe, awe. 378 00:20:20,680 --> 00:20:23,980 AUDIENCE: [INAUDIBLE] It's almost a bee. 379 00:20:23,980 --> 00:20:26,450 PROFESSOR: Yeah, you could perceive it in a few ways. 380 00:20:26,450 --> 00:20:28,660 COMPUTER: Awe, awe, awe awe. 381 00:20:33,120 --> 00:20:35,830 PROFESSOR: So it sounds like a bee to you? 382 00:20:35,830 --> 00:20:37,810 We'll talk about that. 383 00:20:37,810 --> 00:20:39,790 So you think that there's a consonant there 384 00:20:39,790 --> 00:20:42,510 because the vowel started so abruptly. 
385 00:20:42,510 --> 00:20:44,380 First, I didn't hit play, and then I did. 386 00:20:49,610 --> 00:20:51,170 But that's just the only reason. 387 00:20:51,170 --> 00:20:54,440 So it doesn't have the characteristic swell 388 00:20:54,440 --> 00:20:56,690 that the vowel has. 389 00:20:56,690 --> 00:21:01,100 People understand that though the larger ones are louder? 390 00:21:01,100 --> 00:21:02,300 People are getting this? 391 00:21:02,300 --> 00:21:03,650 AUDIENCE: Higher amplitudes [INAUDIBLE] 392 00:21:03,650 --> 00:21:04,358 PROFESSOR: Right. 393 00:21:04,358 --> 00:21:07,190 Higher amplitude makes it louder, in this top part-- 394 00:21:07,190 --> 00:21:09,710 the wave form. 395 00:21:09,710 --> 00:21:10,870 This is different. 396 00:21:13,580 --> 00:21:14,480 So here's a question. 397 00:21:18,020 --> 00:21:19,980 If we can't tell the difference between-- 398 00:21:19,980 --> 00:21:25,310 or if the consonants are so quiet, then how can 399 00:21:25,310 --> 00:21:29,210 we hear the difference between two consonants? 400 00:21:29,210 --> 00:21:30,318 Yeah? 401 00:21:30,318 --> 00:21:33,082 AUDIENCE: Well one thing is the shape 402 00:21:33,082 --> 00:21:38,418 of the volume of the vowel. 403 00:21:38,418 --> 00:21:40,987 If you take the e, it usually goes 404 00:21:40,987 --> 00:21:42,362 from-- it starts from really loud 405 00:21:42,362 --> 00:21:47,292 and goes to get softer [INAUDIBLE] 406 00:21:47,292 --> 00:21:53,980 but if you say a dee sound, then [INAUDIBLE] 407 00:21:53,980 --> 00:21:54,880 PROFESSOR: Yeah. 408 00:21:54,880 --> 00:21:58,500 The release of the stop is quieter in the buh sound 409 00:21:58,500 --> 00:22:01,140 than in the duh sound, which means 410 00:22:01,140 --> 00:22:06,280 that the abruptness of the vowel is also different-- 411 00:22:06,280 --> 00:22:09,000 just plain how loud it is, is going 412 00:22:09,000 --> 00:22:12,090 to be more abrupt in the buh sound than in the duh sound. 
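The link between amplitude in the wave form and loudness can be made concrete by measuring the root-mean-square level of short stretches of a signal. The sketch below uses made-up stand-in data, a quiet burst followed by a louder tone, rather than the lecture recordings; the amplitudes 0.05 and 0.8, the frame size, and the names `rms` and `envelope` are assumptions for illustration.

```python
import math


def rms(samples):
    """Root-mean-square level: a simple measure of how loud a stretch of signal is."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def envelope(samples, frame=100):
    """RMS level of each consecutive frame, tracing loudness over time."""
    return [rms(samples[i:i + frame]) for i in range(0, len(samples) - frame + 1, frame)]


# Synthetic stand-ins: a quiet consonant-like release, then a loud vowel-like tone.
quiet_release = [0.05 * math.sin(2 * math.pi * n / 20) for n in range(200)]
loud_vowel = [0.8 * math.sin(2 * math.pi * n / 20) for n in range(400)]

env = envelope(quiet_release + loud_vowel)
print([round(level, 3) for level in env])
```

The envelope jumps from a low level to a much higher one at the boundary, which is the kind of abrupt onset the lecture describes when a vowel follows a stop.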
413 00:22:12,090 --> 00:22:14,170 You can see it here. 414 00:22:14,170 --> 00:22:17,790 People are understanding what I'm saying? 415 00:22:17,790 --> 00:22:19,020 So that's one thing. 416 00:22:19,020 --> 00:22:22,800 So basically, the volume of the release. 417 00:22:22,800 --> 00:22:23,360 What else? 418 00:22:27,960 --> 00:22:30,112 This is, if anyone forgot, this is-- 419 00:22:30,112 --> 00:22:32,080 COMPUTER: Buh, duh. 420 00:22:43,122 --> 00:22:44,872 AUDIENCE: I think the starting frequencies 421 00:22:44,872 --> 00:22:47,850 have different formants. 422 00:22:47,850 --> 00:22:50,340 PROFESSOR: Yeah, they seem to be different, right? 423 00:22:50,340 --> 00:22:54,240 So in this one, it looks like the formants 424 00:22:54,240 --> 00:22:56,520 are pretty steady throughout-- 425 00:22:56,520 --> 00:22:59,450 formants one and two, at least, pretty steady throughout. 426 00:22:59,450 --> 00:23:01,890 Whereas in this one, it looks like formant one 427 00:23:01,890 --> 00:23:05,405 is starting from a lower frequency and getting higher. 428 00:23:05,405 --> 00:23:07,530 And formant two is starting from a higher frequency 429 00:23:07,530 --> 00:23:09,690 and getting lower. 430 00:23:09,690 --> 00:23:11,010 People see this? 431 00:23:11,010 --> 00:23:12,390 How this is happening? 432 00:23:15,810 --> 00:23:18,600 Why might this be? 433 00:23:18,600 --> 00:23:20,580 Why would we see a difference in how 434 00:23:20,580 --> 00:23:23,640 the beginnings of the formants are depending on the preceding 435 00:23:23,640 --> 00:23:26,490 consonant? 436 00:23:26,490 --> 00:23:27,670 Yeah? 437 00:23:27,670 --> 00:23:31,300 AUDIENCE: Well, in daw, the shape of your tongue 438 00:23:31,300 --> 00:23:33,734 is different than when you're emphasizing the awe part. 
439 00:23:33,734 --> 00:23:40,777 [INAUDIBLE] with buh, the tongue stays [INAUDIBLE] 440 00:23:40,777 --> 00:23:42,610 PROFESSOR: Basically, it must have something 441 00:23:42,610 --> 00:23:45,730 to do with that time-- 442 00:23:45,730 --> 00:23:49,990 your lips are shaped one way for the duh part of the sound, 443 00:23:49,990 --> 00:23:52,930 and they're shaped another way for the awe part of the sound. 444 00:23:52,930 --> 00:23:57,880 And the mouth wasn't designed to talk. 445 00:23:57,880 --> 00:23:59,990 It was designed to eat. 446 00:23:59,990 --> 00:24:03,580 I don't know if you guys realize this-- 447 00:24:03,580 --> 00:24:06,880 probably, yes. 448 00:24:06,880 --> 00:24:08,650 It's slow. 449 00:24:08,650 --> 00:24:13,030 Phoneticians hypothesize that we're basically 450 00:24:13,030 --> 00:24:16,990 speaking as fast as we can. 451 00:24:16,990 --> 00:24:19,360 That rates of speech in different languages 452 00:24:19,360 --> 00:24:21,340 are more or less similar. 453 00:24:21,340 --> 00:24:25,600 And if they got any faster, the mechanics of the mouth 454 00:24:25,600 --> 00:24:28,907 wouldn't actually be able to keep up with it. 455 00:24:28,907 --> 00:24:30,490 Whether perception can keep up with it 456 00:24:30,490 --> 00:24:31,990 is maybe another question. 457 00:24:34,690 --> 00:24:37,150 There's going to be some sort of lag. 458 00:24:37,150 --> 00:24:39,820 There's going to be some lag time between when 459 00:24:39,820 --> 00:24:41,710 your mouth was in the duh shape and when 460 00:24:41,710 --> 00:24:44,590 your mouth is in the awe shape. 461 00:24:44,590 --> 00:24:48,250 And similarly, some lag between the buh shape and the awe 462 00:24:48,250 --> 00:24:52,000 shape, and the buh shape and the e shape, 463 00:24:52,000 --> 00:24:54,434 and all these sorts of things. 464 00:24:54,434 --> 00:24:55,850 What we're seeing here is the lag. 
465 00:24:58,540 --> 00:25:01,120 This is the part between the way your mouth was 466 00:25:01,120 --> 00:25:04,060 shaped from the duh, and the way your mouth 467 00:25:04,060 --> 00:25:06,400 was shaped for the awe. 468 00:25:06,400 --> 00:25:09,160 It's kind of our ramp time. 469 00:25:09,160 --> 00:25:12,550 And it's very noticeable. 470 00:25:12,550 --> 00:25:18,110 So if you look at this, it's pretty clear. 471 00:25:18,110 --> 00:25:20,070 This is exactly what the cochlea is doing. 472 00:25:20,070 --> 00:25:22,090 The cochlea is basically analyzing 473 00:25:22,090 --> 00:25:25,700 how loud the different frequencies are in real time. 474 00:25:25,700 --> 00:25:27,850 And so basically, what the cochlea is doing 475 00:25:27,850 --> 00:25:29,950 is showing you this picture, showing your brain 476 00:25:29,950 --> 00:25:30,940 this picture. 477 00:25:30,940 --> 00:25:33,570 So the fact that this is happening 478 00:25:33,570 --> 00:25:36,070 is sort of really noticeable. 479 00:25:36,070 --> 00:25:41,830 Whereas-- these are getting kind of distracting. 480 00:25:41,830 --> 00:25:47,530 Whereas, the loudness of the buh sound versus the duh sound, 481 00:25:47,530 --> 00:25:51,070 not really as noticeable. 482 00:25:51,070 --> 00:25:53,410 Does that make sense? 483 00:25:53,410 --> 00:26:00,918 Let's look at bee and dee [INAUDIBLE] How about this? 484 00:26:10,670 --> 00:26:11,580 What do you guys see? 485 00:26:11,580 --> 00:26:13,906 I'll play it for you. 486 00:26:13,906 --> 00:26:18,711 COMPUTER: Bee, dee, bee, dee. 487 00:26:27,370 --> 00:26:28,078 PROFESSOR: Ideas?
488 00:26:42,466 --> 00:26:45,358 AUDIENCE: It will take the lag time [INAUDIBLE] 489 00:26:45,358 --> 00:26:51,862 because the dee shape is closer to-- the duh shape is 490 00:26:51,862 --> 00:26:54,250 closer to the [INAUDIBLE] 491 00:26:54,250 --> 00:26:58,110 PROFESSOR: So it seems like we don't see that drastic lag 492 00:26:58,110 --> 00:27:01,740 time that we saw with the awe vowel, so exactly right. 493 00:27:01,740 --> 00:27:05,580 So it must be something like the shape of your mouth for a duh 494 00:27:05,580 --> 00:27:08,730 is closer to the shape of your mouth for e, dee. 495 00:27:08,730 --> 00:27:10,470 I mean, you can kind of tell-- 496 00:27:10,470 --> 00:27:14,370 dee-- we're thinking mostly of what your tongue is doing here, 497 00:27:14,370 --> 00:27:16,740 because that's the part of your mouth 498 00:27:16,740 --> 00:27:20,190 that's moving the most in this whole process. 499 00:27:20,190 --> 00:27:25,410 And b, we also don't see some sort 500 00:27:25,410 --> 00:27:28,650 of exciting formant transition. 501 00:27:28,650 --> 00:27:30,750 This is what is called formant transition. 502 00:27:30,750 --> 00:27:32,760 We're talking about the beginnings and ends 503 00:27:32,760 --> 00:27:34,380 of the formants. 504 00:27:34,380 --> 00:27:39,520 So we do see something over here in formants three and four. 505 00:27:39,520 --> 00:27:41,730 They seem to dip. 506 00:27:41,730 --> 00:27:44,370 I'm sorry-- they seem to rise at the beginning, 507 00:27:44,370 --> 00:27:46,410 for bee more than with dee. 508 00:27:46,410 --> 00:27:49,660 But formants one-- formant three is certainly important. 509 00:27:49,660 --> 00:27:56,030 But it seems that the lower numbers of formants, the lower 510 00:27:56,030 --> 00:27:59,229 formants tend to be more important in our perception 511 00:27:59,229 --> 00:28:00,270 than the higher formants.
512 00:28:03,780 --> 00:28:05,809 I guess because they represent larger parts 513 00:28:05,809 --> 00:28:06,600 of the vocal tract. 514 00:28:06,600 --> 00:28:07,472 Yeah? 515 00:28:07,472 --> 00:28:17,119 AUDIENCE: Does each formant [INAUDIBLE] 516 00:28:17,119 --> 00:28:17,910 PROFESSOR: Sort of. 517 00:28:17,910 --> 00:28:27,240 So basically for formants one and two seem to be more or less 518 00:28:27,240 --> 00:28:29,100 that-- 519 00:28:29,100 --> 00:28:31,680 there's typically one obstruction 520 00:28:31,680 --> 00:28:35,350 in the vocal tract going on at any one time, like one big one. 521 00:28:35,350 --> 00:28:37,890 And formants one and two seem to be related 522 00:28:37,890 --> 00:28:41,610 to each other, in that formant one is always the bigger half 523 00:28:41,610 --> 00:28:44,890 and formant two is always the smaller half. 524 00:28:44,890 --> 00:28:48,990 And if you actually look at the vowels 525 00:28:48,990 --> 00:28:53,340 and you think about which formant corresponds 526 00:28:53,340 --> 00:28:58,980 to which half, you'll see that they switch, as one half gets 527 00:28:58,980 --> 00:29:01,540 bigger and one half gets-- 528 00:29:01,540 --> 00:29:03,420 it crosses the midpoint. 529 00:29:03,420 --> 00:29:04,770 But that's beside the point. 530 00:29:07,740 --> 00:29:12,300 And then after that, are other resonant frequencies. 531 00:29:12,300 --> 00:29:14,730 The formants are going to take into account basically all, 532 00:29:14,730 --> 00:29:15,400 like-- 533 00:29:15,400 --> 00:29:16,640 they keep going. 534 00:29:16,640 --> 00:29:19,897 I've cut this off at 5,000 Hertz, but they keep going. 535 00:29:19,897 --> 00:29:21,480 There are many, many formants up there 536 00:29:21,480 --> 00:29:23,730 and they're resonating with all of the different parts 537 00:29:23,730 --> 00:29:26,010 of your head, all of your various sinuses, 538 00:29:26,010 --> 00:29:29,650 and all this sort of thing, and to various degrees. 
539 00:29:29,650 --> 00:29:31,920 So they get quieter and quieter as they go up. 540 00:29:35,010 --> 00:29:36,750 So does that answer your question? 541 00:29:36,750 --> 00:29:38,250 Yeah. 542 00:29:38,250 --> 00:29:40,590 So what were we saying before? 543 00:29:40,590 --> 00:29:43,680 We don't have this drastic difference 544 00:29:43,680 --> 00:29:49,290 in formants one and two for the difference between bee and dee. 545 00:29:49,290 --> 00:29:52,215 So maybe we're not going to really-- 546 00:29:55,320 --> 00:30:00,480 what I'm trying to get at is, what is-- 547 00:30:00,480 --> 00:30:09,110 if you look at this, and you ask yourself, 548 00:30:09,110 --> 00:30:14,320 how does a hearer perceive the difference between a bee-- 549 00:30:14,320 --> 00:30:15,850 the difference between baw and daw? 550 00:30:19,100 --> 00:30:23,677 You might, without thinking, or without looking 551 00:30:23,677 --> 00:30:25,260 at this picture, you might think that, 552 00:30:25,260 --> 00:30:29,940 well, the difference between baw and daw is the beginning buh 553 00:30:29,940 --> 00:30:31,900 and duh sounds. 554 00:30:35,230 --> 00:30:36,830 But as you can see from this picture, 555 00:30:36,830 --> 00:30:39,170 the actual buh and duh sounds are 556 00:30:39,170 --> 00:30:42,140 extremely quiet with respect to the vowel. 557 00:30:42,140 --> 00:30:44,210 But if you look at the formants of the vowels, 558 00:30:44,210 --> 00:30:47,960 you can clearly tell that they are different. 559 00:30:47,960 --> 00:30:50,420 So one hypothesis might be that when 560 00:30:50,420 --> 00:30:52,782 you're perceiving sound it actually has more 561 00:30:52,782 --> 00:30:54,740 to do with the formant transitions of the vowel 562 00:30:54,740 --> 00:30:56,120 than it does-- 563 00:30:56,120 --> 00:30:59,810 with the difference in release, the volumes of the 564 00:30:59,810 --> 00:31:03,140 stops, or whatever.
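The picture discussed above, how loud each frequency is at each moment, can be approximated by measuring a few frequencies in short overlapping frames of the signal. What follows is only a minimal sketch under stated assumptions, not the program used in the lecture: the synthetic tones (borrowing the 400 Hertz and 600 Hertz example from earlier in the lecture), the 8,000 Hertz sample rate, the frame sizes, and every function name are hypothetical choices made for illustration.

```python
import cmath
import math


def tone(freq_hz, seconds, sample_rate, amplitude=1.0):
    """Synthesize a pure sine tone (an assumed stand-in for recorded speech)."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def magnitude_at(frame, freq_hz, sample_rate):
    """Strength of one frequency in one frame: a single Hann-windowed DFT term."""
    n = len(frame)
    total = 0j
    for i, x in enumerate(frame):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        total += x * window * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
    return abs(total) / n


def spectrogram(signal, sample_rate, freqs_hz, frame_len=400, hop=200):
    """One row per time frame, one column per analysis frequency."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rows.append([magnitude_at(frame, f, sample_rate) for f in freqs_hz])
    return rows


SAMPLE_RATE = 8000              # samples per second (assumed)
FREQS = [200, 400, 600, 800]    # frequencies to measure, in Hertz (assumed)

# A loud 400 Hertz tone followed by a quieter 600 Hertz tone.
signal = tone(400, 0.1, SAMPLE_RATE) + tone(600, 0.1, SAMPLE_RATE, amplitude=0.5)
picture = spectrogram(signal, SAMPLE_RATE, FREQS)

for row in picture:
    print(" ".join(f"{value:.3f}" for value in row))
```

Each printed row is one moment in time and each column is one frequency, so the large values move from the 400 Hertz column to the 600 Hertz column partway down, and the quieter tone shows up with smaller values. A real spectrogram measures many more frequencies and draws the values as darkness rather than printing numbers.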
565 00:31:03,140 --> 00:31:12,350 And then, if you look at bee and dee, 566 00:31:12,350 --> 00:31:14,570 there's not quite this obvious difference 567 00:31:14,570 --> 00:31:19,390 in the formant transitions from the buh and the duh sound 568 00:31:19,390 --> 00:31:20,810 to the e. 569 00:31:20,810 --> 00:31:24,650 So maybe when perceiving bee and dee 570 00:31:24,650 --> 00:31:28,820 it might have more to do with the loudness of the stop 571 00:31:28,820 --> 00:31:32,080 release or something like that. 572 00:31:32,080 --> 00:31:33,317 Does that make sense? 573 00:31:33,317 --> 00:31:35,150 Go like this if you understand what's going on. 574 00:31:38,100 --> 00:31:39,830 Go like this if you don't. 575 00:31:45,790 --> 00:31:49,740 I didn't see any perplexed faces, I guess. 576 00:31:52,332 --> 00:31:53,790 So let's look at a different sound. 577 00:32:22,340 --> 00:32:23,350 I'll just play this. 578 00:32:23,350 --> 00:32:25,382 COMPUTER: Gun, gum. 579 00:32:25,382 --> 00:32:26,840 PROFESSOR: Can you understand that? 580 00:32:26,840 --> 00:32:29,346 Is it a little loud? 581 00:32:29,346 --> 00:32:31,126 What did it say? 582 00:32:31,126 --> 00:32:32,382 AUDIENCE: Gun, gum. 583 00:32:32,382 --> 00:32:33,215 PROFESSOR: Gun, gum. 584 00:32:33,215 --> 00:32:33,913 Right. 585 00:32:33,913 --> 00:32:34,669 Gun, gum. 586 00:32:34,669 --> 00:32:35,460 COMPUTER: Gun, gum. 587 00:32:41,970 --> 00:32:46,150 PROFESSOR: Let's look at the guh part for now, try and review 588 00:32:46,150 --> 00:32:47,636 what we learned. 589 00:32:47,636 --> 00:32:49,548 COMPUTER: Guh, guh. 590 00:32:49,548 --> 00:32:53,590 PROFESSOR: That's a little bit zoomed in. 591 00:32:53,590 --> 00:32:56,340 So what's going on with the-- this is gun, I guess. 592 00:32:56,340 --> 00:32:58,600 COMPUTER: Gun, gun. 593 00:32:58,600 --> 00:33:00,680 PROFESSOR: What's going on with gun here? 594 00:33:00,680 --> 00:33:10,160 Can anyone summarize this situation?
595 00:33:10,160 --> 00:33:11,530 Here, I'll give you a hint. 596 00:33:11,530 --> 00:33:13,980 I think the guh is probably this much of it. 597 00:33:18,880 --> 00:33:20,306 COMPUTER: [SOUND SIGNAL] 598 00:33:20,306 --> 00:33:21,180 PROFESSOR: And then-- 599 00:33:21,180 --> 00:33:23,372 COMPUTER: Gun, gun, gun. 600 00:33:26,730 --> 00:33:28,620 PROFESSOR: So what do you think is going on? 601 00:33:43,350 --> 00:33:49,242 AUDIENCE: [INAUDIBLE] awe, except the middle part of it, 602 00:33:49,242 --> 00:33:53,670 [INAUDIBLE] an un, so that's part of it. 603 00:33:53,670 --> 00:33:55,130 PROFESSOR: Yeah, the n. 604 00:33:55,130 --> 00:33:57,463 Probably this-- 605 00:33:57,463 --> 00:33:59,612 COMPUTER: un, un, un. 606 00:33:59,612 --> 00:34:00,820 PROFESSOR: Can you hear that? 607 00:34:00,820 --> 00:34:02,876 That's an un. 608 00:34:02,876 --> 00:34:05,330 COMPUTER: Un, un. 609 00:34:05,330 --> 00:34:10,670 PROFESSOR: So what is happening with the formant transitions 610 00:34:10,670 --> 00:34:12,560 for just the guh part? 611 00:34:27,796 --> 00:34:29,352 Just the beginning. 612 00:34:40,668 --> 00:34:47,210 AUDIENCE: I think one [INAUDIBLE] 613 00:34:47,210 --> 00:34:48,139 PROFESSOR: Right. 614 00:34:48,139 --> 00:34:49,550 Oh, yeah. 615 00:34:49,550 --> 00:34:52,340 These dots at the beginning are trying 616 00:34:52,340 --> 00:34:56,520 to find formants in the guh, and may not be successful. 617 00:34:56,520 --> 00:34:58,610 So sorry, I should have mentioned probably, 618 00:34:58,610 --> 00:35:00,410 you can pretty much ignore anything 619 00:35:00,410 --> 00:35:04,310 before about this line. 620 00:35:04,310 --> 00:35:07,660 Any of this stuff is probably-- here, we'll just turn this off. 621 00:35:11,440 --> 00:35:15,550 So the formant one has this distinctive rise 622 00:35:15,550 --> 00:35:16,654 at the beginning. 623 00:35:16,654 --> 00:35:17,570 How about formant two? 
624 00:35:29,920 --> 00:35:33,380 AUDIENCE: [INAUDIBLE] 625 00:35:33,380 --> 00:35:34,530 PROFESSOR: Yeah. 626 00:35:34,530 --> 00:35:35,924 What happens at the beginning? 627 00:35:39,382 --> 00:35:41,358 This is formant two, right? 628 00:35:45,284 --> 00:35:46,367 AUDIENCE: It's decreasing. 629 00:35:46,367 --> 00:35:47,182 It's going down. 630 00:35:47,182 --> 00:35:47,890 PROFESSOR: Right. 631 00:35:47,890 --> 00:35:51,470 So it seems to start high and dip a little bit. 632 00:35:51,470 --> 00:35:54,680 And formant one, starts low and rises a little bit, 633 00:35:54,680 --> 00:35:57,520 just in the beginning. 634 00:35:57,520 --> 00:36:01,370 Let's just, for kicks, see if it's the same over here. 635 00:36:01,370 --> 00:36:06,050 This is also, if you remember, also starts with a guh, 636 00:36:06,050 --> 00:36:08,910 so probably should be similar. 637 00:36:08,910 --> 00:36:11,210 What do we see? 638 00:36:11,210 --> 00:36:12,430 Similar? 639 00:36:12,430 --> 00:36:15,691 Go like this if you think it's similar. 640 00:36:15,691 --> 00:36:16,190 Yeah. 641 00:36:16,190 --> 00:36:18,960 So a little bit of a rise here, a little bit of a dip here. 642 00:36:18,960 --> 00:36:23,300 And formant three seems to be having also a rise. 643 00:36:23,300 --> 00:36:27,290 So that's kind of interesting. 644 00:36:27,290 --> 00:36:29,120 So that seems to be what's going on. 645 00:36:29,120 --> 00:36:33,500 Now let's look at the other transition, 646 00:36:33,500 --> 00:36:37,260 the one that's going into the n or the m. 647 00:36:42,064 --> 00:36:43,920 What's happening here? 648 00:36:43,920 --> 00:36:45,410 Can you guys see this? 649 00:36:45,410 --> 00:36:49,630 So this dark part is the vowel. 650 00:36:49,630 --> 00:36:52,150 And then, you can see it corresponds with loudness. 651 00:36:52,150 --> 00:36:53,470 It's very loud. 652 00:36:53,470 --> 00:36:56,770 And then this quieter part is the n sound.
653 00:36:56,770 --> 00:36:59,570 So the line is probably somewhere around here. 654 00:36:59,570 --> 00:37:01,750 COMPUTER: Guh, n, n, n. 655 00:37:05,171 --> 00:37:07,420 PROFESSOR: So what is going on with the formants here. 656 00:37:07,420 --> 00:37:08,461 We can turn that back on. 657 00:37:16,200 --> 00:37:18,300 AUDIENCE: They get a lot quieter. 658 00:37:18,300 --> 00:37:20,050 PROFESSOR: They get quieter for sure. 659 00:37:20,050 --> 00:37:24,253 AUDIENCE: Looks like formant one drops off almost to zero. 660 00:37:24,253 --> 00:37:26,440 PROFESSOR: Yeah, formant one disappears. 661 00:37:31,520 --> 00:37:35,960 And then, formant two is pretty much steady. 662 00:37:35,960 --> 00:37:36,680 Let's check out-- 663 00:37:36,680 --> 00:37:37,900 AUDIENCE: [INAUDIBLE] formant 664 00:37:37,900 --> 00:37:42,020 PROFESSOR: Formant one is going to be the bottom formant. 665 00:37:42,020 --> 00:37:43,860 You see this? 666 00:37:43,860 --> 00:37:45,380 So they count from the bottom up. 667 00:37:45,380 --> 00:37:50,430 So this is the first formant, second formant, third formant. 668 00:37:50,430 --> 00:37:55,820 The formants are these big dark smudges we see. 669 00:37:55,820 --> 00:37:58,970 And also, the red dots are trying to track them. 670 00:37:58,970 --> 00:38:02,150 So that's the program trying to find formants for you, so you 671 00:38:02,150 --> 00:38:05,130 can measure them or whatever. 672 00:38:05,130 --> 00:38:06,380 Let's check out the other one. 673 00:38:06,380 --> 00:38:07,310 So that was gun. 674 00:38:11,330 --> 00:38:13,060 How about this? 675 00:38:13,060 --> 00:38:14,476 What's happening? 676 00:38:14,476 --> 00:38:16,428 AUDIENCE: Same thing. 677 00:38:16,428 --> 00:38:25,742 It's dropping off [INAUDIBLE] 678 00:38:25,742 --> 00:38:26,450 PROFESSOR: Right. 679 00:38:26,450 --> 00:38:28,521 So it's getting quieter. 
680 00:38:28,521 --> 00:38:30,686 AUDIENCE: Also, formant two jumps up 681 00:38:30,686 --> 00:38:32,572 a bit before it comes steady. 682 00:38:32,572 --> 00:38:33,280 PROFESSOR: Right. 683 00:38:33,280 --> 00:38:35,840 So formant two seems to, rather than 684 00:38:35,840 --> 00:38:40,130 have this straight across going from the uh sound 685 00:38:40,130 --> 00:38:44,120 to the n sound, or going from the uh sound 686 00:38:44,120 --> 00:38:47,061 to the n sound in gun, we saw that it 687 00:38:47,061 --> 00:38:49,310 was more or less-- formant two was more or less steady 688 00:38:49,310 --> 00:38:50,900 all the way across. 689 00:38:50,900 --> 00:38:52,730 But this one, it's more-- 690 00:38:52,730 --> 00:38:55,580 it dips and then rises. 691 00:38:55,580 --> 00:38:57,290 Well, the dip is probably just a vowel. 692 00:38:57,290 --> 00:39:00,950 But it seems to rise up abruptly for the m. 693 00:39:00,950 --> 00:39:02,788 So if we listen-- 694 00:39:02,788 --> 00:39:06,180 COMPUTER: M, m, m. 695 00:39:06,180 --> 00:39:08,540 PROFESSOR: You can hear that it's an m sound. 696 00:39:08,540 --> 00:39:11,430 You can hear that it's mm and not nn. 697 00:39:11,430 --> 00:39:12,140 Yeah? 698 00:39:12,140 --> 00:39:14,270 You agree? 699 00:39:14,270 --> 00:39:19,180 And it seems like the main difference between mm and nn 700 00:39:19,180 --> 00:39:23,420 is where the second formant is. 701 00:39:23,420 --> 00:39:25,940 Assuming that the rest of the world is the same, which I 702 00:39:25,940 --> 00:39:29,360 think we can pretty much safely assume that. 703 00:39:29,360 --> 00:39:30,846 Where did my mouse go? 704 00:39:30,846 --> 00:39:31,640 Down here? 705 00:39:31,640 --> 00:39:32,550 Oh, there it is. 706 00:39:35,781 --> 00:39:36,780 We can even see it here. 707 00:39:36,780 --> 00:39:44,510 So pretty steady transition here, whereas sharp rise here. 708 00:39:44,510 --> 00:39:45,050 Make sense?
709 00:39:49,580 --> 00:39:52,640 We could hypothesize that the difference 710 00:39:52,640 --> 00:40:00,800 between when you're perceiving gum and gun is whether the-- 711 00:40:00,800 --> 00:40:04,340 is going to be the formant transitions. 712 00:40:04,340 --> 00:40:08,330 Or we could hypothesize that it's the actual sound of the mm 713 00:40:08,330 --> 00:40:11,000 and nn. 714 00:40:11,000 --> 00:40:16,040 You could hear the difference, right? 715 00:40:16,040 --> 00:40:18,440 When I didn't play the vowel, you 716 00:40:18,440 --> 00:40:21,500 could hear the difference between mm and nn, right? 717 00:40:21,500 --> 00:40:24,696 Let's just try again. 718 00:40:24,696 --> 00:40:31,062 COMPUTER: Nn, nn, mm, mm, mm. 719 00:40:31,062 --> 00:40:32,770 AUDIENCE: [INAUDIBLE] play the beginning. 720 00:40:32,770 --> 00:40:34,478 I think if you play it from the beginning 721 00:40:34,478 --> 00:40:35,740 of the vowel [INAUDIBLE]. 722 00:40:39,830 --> 00:40:42,680 PROFESSOR: But I think even if you don't play-- 723 00:40:42,680 --> 00:40:44,894 if you don't get the formant transition, 724 00:40:44,894 --> 00:40:46,060 you can hear the difference. 725 00:40:46,060 --> 00:40:47,252 COMPUTER: Mm, nn. 726 00:40:50,479 --> 00:40:52,139 AUDIENCE: Unless this a placebo. 727 00:40:52,139 --> 00:40:54,534 Why don't we try it by [INAUDIBLE] Everybody 728 00:40:54,534 --> 00:40:58,366 close our eyes [INAUDIBLE] and then raise your right hand 729 00:40:58,366 --> 00:40:59,906 if it's nn, and left hand it's mm. 730 00:40:59,906 --> 00:41:00,780 PROFESSOR: Good idea. 731 00:41:00,780 --> 00:41:02,090 Everyone close your eyes. 732 00:41:02,090 --> 00:41:04,742 We'll see if we can hear this. 733 00:41:04,742 --> 00:41:06,350 AUDIENCE: [INAUDIBLE] 734 00:41:06,350 --> 00:41:08,183 PROFESSOR: I'll just say who thinks it's mm? 735 00:41:08,183 --> 00:41:11,564 Who thinks it's nn? 736 00:41:11,564 --> 00:41:12,925 I'm going to play it. 
737 00:41:12,925 --> 00:41:16,422 COMPUTER: Nn, nn, nn, nn. 738 00:41:16,422 --> 00:41:19,955 PROFESSOR: Who thinks it's m? 739 00:41:19,955 --> 00:41:21,220 Who thinks it's n? 740 00:41:24,996 --> 00:41:26,620 PROFESSOR: That was like four and five, 741 00:41:26,620 --> 00:41:28,960 so I guess we can't hear it. 742 00:41:28,960 --> 00:41:30,330 OK-- can't hear it. 743 00:41:30,330 --> 00:41:33,400 But it is maybe different. 744 00:41:33,400 --> 00:41:34,210 It was n. 745 00:41:34,210 --> 00:41:35,560 It was an n sound. 746 00:41:35,560 --> 00:41:39,040 Congrats to the four people who thought it was n-- 747 00:41:39,040 --> 00:41:40,760 half of the class. 748 00:41:40,760 --> 00:41:44,650 So maybe not-- maybe we can't really hear it. 749 00:41:44,650 --> 00:41:48,400 So if we can't really hear the difference between mm and nn, 750 00:41:48,400 --> 00:41:50,800 how can we tell the difference between gum and gun? 751 00:41:59,710 --> 00:42:02,185 AUDIENCE: [INAUDIBLE] Context. 752 00:42:08,055 --> 00:42:09,693 AUDIENCE: But she just said both and it 753 00:42:09,693 --> 00:42:10,942 wasn't in any [INAUDIBLE] 754 00:42:10,942 --> 00:42:11,650 PROFESSOR: Right. 755 00:42:11,650 --> 00:42:13,108 I think I can say it out of nowhere 756 00:42:13,108 --> 00:42:18,220 and you can probably tell, even if you can't see my face. 757 00:42:18,220 --> 00:42:22,310 Gum, gun, gun, gum, gun, gun. 758 00:42:22,310 --> 00:42:22,810 Yeah? 759 00:42:22,810 --> 00:42:23,434 Maybe you can-- 760 00:42:23,434 --> 00:42:25,830 AUDIENCE: [INAUDIBLE] 761 00:42:25,830 --> 00:42:27,710 PROFESSOR: Yeah. 762 00:42:27,710 --> 00:42:29,410 Or you could-- sometimes, like-- 763 00:42:40,140 --> 00:42:46,100 So if it's not the sound of the m or n sound, 764 00:42:46,100 --> 00:42:49,740 then how can we tell the difference? 765 00:42:49,740 --> 00:42:54,610 I just played it for you just without the transitions. 766 00:42:54,610 --> 00:42:56,430 COMPUTER: Mm, nn. 
767 00:42:56,430 --> 00:42:59,000 PROFESSOR: And you couldn't really tell the difference. 768 00:42:59,000 --> 00:43:05,620 But what if I played you the whole thing? 769 00:43:08,760 --> 00:43:09,804 COMPUTER: Gun. 770 00:43:09,804 --> 00:43:10,470 PROFESSOR: Yeah? 771 00:43:10,470 --> 00:43:11,922 AUDIENCE: Is it the actual change 772 00:43:11,922 --> 00:43:13,684 from the vowel to the consonant? 773 00:43:13,684 --> 00:43:14,350 PROFESSOR: Yeah. 774 00:43:14,350 --> 00:43:18,270 So it's probably the change from the vowel to the consonant, 775 00:43:18,270 --> 00:43:22,540 because this is pretty much steady, and this is different. 776 00:43:22,540 --> 00:43:24,820 So it's probably like our baw, daw example 777 00:43:24,820 --> 00:43:27,640 where the formant transitions were different, really 778 00:43:27,640 --> 00:43:28,550 different. 779 00:43:28,550 --> 00:43:36,670 And in comparison, the actual stop release 780 00:43:36,670 --> 00:43:39,760 was not such a drastic difference. 781 00:43:39,760 --> 00:43:43,270 So we might hypothesize that it's the formant transitions 782 00:43:43,270 --> 00:43:44,880 that we can tell. 783 00:43:44,880 --> 00:43:48,221 So if this is our hypothesis, how would we test it? 784 00:43:48,221 --> 00:43:48,970 What do you think? 785 00:43:58,630 --> 00:44:02,510 AUDIENCE: You mean by different [INAUDIBLE] 786 00:44:02,510 --> 00:44:03,820 PROFESSOR: Yeah. 787 00:44:03,820 --> 00:44:05,521 AUDIENCE: We can start off how we just 788 00:44:05,521 --> 00:44:08,407 did with our eyes closed and half of us [INAUDIBLE] 789 00:44:08,407 --> 00:44:12,255 if you started it from an earlier point where the 790 00:44:12,255 --> 00:44:14,756 transitions [INAUDIBLE] and see how many people 791 00:44:14,756 --> 00:44:16,120 could guess what it was. 792 00:44:16,120 --> 00:44:17,270 PROFESSOR: Yeah, exactly. 
793 00:44:17,270 --> 00:44:21,130 So maybe we could play smaller parts of the speech signal 794 00:44:21,130 --> 00:44:25,810 and see what parts of speech signal 795 00:44:25,810 --> 00:44:30,400 you actually have to hear in order to hear the difference. 796 00:44:33,910 --> 00:44:36,970 I will point out, by the way, that when I first brought this 797 00:44:36,970 --> 00:44:38,770 up, all I did was play it and I asked you 798 00:44:38,770 --> 00:44:41,610 what it said-- everybody knew, even though it 799 00:44:41,610 --> 00:44:44,940 was completely out of context. 800 00:44:44,940 --> 00:44:47,380 So maybe if we chop up the speech 801 00:44:47,380 --> 00:44:53,380 signal in strategic ways, and then play it, 802 00:44:53,380 --> 00:44:56,560 we can maybe hear-- 803 00:44:56,560 --> 00:45:00,280 statistically figure out what people are actually hearing. 804 00:45:00,280 --> 00:45:02,080 So why don't we do something like that? 805 00:45:02,080 --> 00:45:15,320 So we could, for example, if we just play the guh part. 806 00:45:19,630 --> 00:45:22,130 COMPUTER: Guh, guh, guh. 807 00:45:25,130 --> 00:45:26,130 AUDIENCE: [INAUDIBLE] 808 00:45:26,130 --> 00:45:29,470 COMPUTER: Gum, gum, gum. 809 00:45:29,470 --> 00:45:32,876 PROFESSOR: Yeah just through the transition. 810 00:45:32,876 --> 00:45:37,866 COMPUTER: Gum, gum, gum. 811 00:45:37,866 --> 00:45:40,361 Gun, gun. 812 00:45:40,361 --> 00:45:44,540 PROFESSOR: Can you hear the difference? 813 00:45:44,540 --> 00:45:47,380 So it seems to me more this very beginning 814 00:45:47,380 --> 00:45:51,460 part of the nasal or the end part of the vowel 815 00:45:51,460 --> 00:45:55,550 is doing more for us. 816 00:45:55,550 --> 00:45:57,350 Maybe we could try just the end of it. 817 00:45:57,350 --> 00:45:58,058 COMPUTER: Un, un. 818 00:46:01,930 --> 00:46:04,724 Um, um, um. 819 00:46:04,724 --> 00:46:06,640 PROFESSOR: Could you hear the difference then? 820 00:46:06,640 --> 00:46:07,360 AUDIENCE: Yes. 
821 00:46:07,360 --> 00:46:08,980 PROFESSOR: Yeah, much clearer, right? 822 00:46:08,980 --> 00:46:10,280 Much clearer. 823 00:46:10,280 --> 00:46:10,930 You agree? 824 00:46:10,930 --> 00:46:12,626 Go like this if you agree. 825 00:46:12,626 --> 00:46:14,137 Go like this if you don't agree. 826 00:46:16,880 --> 00:46:18,122 Everyone agrees. 827 00:46:18,122 --> 00:46:18,830 That's very nice. 828 00:46:21,830 --> 00:46:28,260 Or let's say that we were to zoom in so we 829 00:46:28,260 --> 00:46:30,780 can do some precision work. 830 00:46:36,429 --> 00:46:37,470 Oh, this is gum, I think. 831 00:46:48,220 --> 00:46:52,400 If we were to just stick the m in here-- 832 00:47:00,460 --> 00:47:02,530 I included the transition here sort of. 833 00:47:02,530 --> 00:47:05,380 You see, it might sound weird. 834 00:47:05,380 --> 00:47:07,000 Let's see what it sounds like. 835 00:47:09,700 --> 00:47:11,890 COMPUTER: Gun, gun, gun. 836 00:47:15,250 --> 00:47:18,398 PROFESSOR: It sounded like gun, I think. 837 00:47:21,320 --> 00:47:31,812 Let's try-- that's not what I want to do. 838 00:47:31,812 --> 00:47:32,640 Can I undo again? 839 00:47:35,540 --> 00:47:37,600 No. 840 00:47:37,600 --> 00:47:39,980 Sorry. 841 00:47:39,980 --> 00:47:42,148 Where am I? 842 00:47:42,148 --> 00:47:43,636 [INAUDIBLE] at the end. 843 00:47:51,076 --> 00:47:52,580 Is it back? 844 00:47:52,580 --> 00:47:54,430 Oh, no, it looks scary. 845 00:47:54,430 --> 00:47:55,070 Hold on. 846 00:47:58,046 --> 00:48:00,030 Let me fix this. 847 00:48:26,800 --> 00:48:30,790 Let's try and grab this end bit. 848 00:48:37,412 --> 00:48:39,270 We should copy. 849 00:48:39,270 --> 00:48:40,916 That sounds good. 850 00:48:40,916 --> 00:48:44,620 We'll stick it here, make sure that we get-- 851 00:48:44,620 --> 00:48:46,830 we're moving over the transition as well. 852 00:48:53,200 --> 00:48:59,290 COMPUTER: Gum, gun, gun, gum. 853 00:48:59,290 --> 00:49:01,000 PROFESSOR: What did I do?
854 00:49:01,000 --> 00:49:03,760 AUDIENCE: You changed it from gun to gum. 855 00:49:03,760 --> 00:49:04,520 PROFESSOR: Yeah. 856 00:49:04,520 --> 00:49:05,020 So-- 857 00:49:05,020 --> 00:49:07,690 COMPUTER: Gun, gum, gum. 858 00:49:07,690 --> 00:49:09,370 PROFESSOR: So it seems that if I'm 859 00:49:09,370 --> 00:49:13,660 very careful to include the end bit of a vowel, 860 00:49:13,660 --> 00:49:18,850 then I can successfully change the word. 861 00:49:18,850 --> 00:49:20,230 So to review, what I did here was 862 00:49:20,230 --> 00:49:23,682 I took the m, including, making sure to include 863 00:49:23,682 --> 00:49:24,640 the formant transition. 864 00:49:24,640 --> 00:49:26,860 So I was including the end of the vowel, 865 00:49:26,860 --> 00:49:30,910 and I successfully changed the word to gum. 866 00:49:30,910 --> 00:49:34,660 So let's try again, except this time, 867 00:49:34,660 --> 00:49:37,645 I'll not include the end bit of the vowel. 868 00:49:37,645 --> 00:49:39,399 That's actually what I did the first time. 869 00:49:51,320 --> 00:49:53,530 Now it's here. 870 00:49:53,530 --> 00:49:56,307 COMPUTER: Gun, gun, gun. 871 00:49:56,307 --> 00:49:57,390 PROFESSOR: Now what is it? 872 00:49:57,390 --> 00:49:58,590 AUDIENCE: [INAUDIBLE] 873 00:49:58,590 --> 00:50:02,380 PROFESSOR: Now it's gun, right? 874 00:50:02,380 --> 00:50:04,340 So this is very interesting. 875 00:50:04,340 --> 00:50:08,840 So what this means is that it must be something 876 00:50:08,840 --> 00:50:13,220 about the formant transitions that we're hearing, because-- 877 00:50:13,220 --> 00:50:19,520 that is the most salient auditory cue. 878 00:50:19,520 --> 00:50:22,790 Because if I included the formant transitions, 879 00:50:22,790 --> 00:50:27,080 I could successfully take the m and paste it on 880 00:50:27,080 --> 00:50:28,690 and change the word. 881 00:50:28,690 --> 00:50:31,850 But if I didn't include the formant transitions, 882 00:50:31,850 --> 00:50:33,140 I couldn't.
883 00:50:33,140 --> 00:50:34,795 I didn't change the word. 884 00:50:34,795 --> 00:50:36,420 People following what's happening here? 885 00:50:39,770 --> 00:50:43,850 So this experiment suggests that it 886 00:50:43,850 --> 00:50:47,540 must be something about the formant transitions that's 887 00:50:47,540 --> 00:50:50,040 making us hear the difference. 888 00:50:50,040 --> 00:50:51,800 And there may be other cues. 889 00:50:51,800 --> 00:50:54,350 Certainly there are other cues, like m and then 890 00:50:54,350 --> 00:50:56,460 n itself look different. 891 00:50:56,460 --> 00:50:57,850 Let me just undo this. 892 00:51:01,420 --> 00:51:04,240 A little bit. 893 00:51:04,240 --> 00:51:05,980 But for one thing, the m and the n parts 894 00:51:05,980 --> 00:51:09,000 are significantly quieter than the vowels. 895 00:51:09,000 --> 00:51:10,060 See, here is the vowel-- 896 00:51:10,060 --> 00:51:11,470 much louder. 897 00:51:11,470 --> 00:51:13,420 Here's the nasal-- not loud. 898 00:51:16,060 --> 00:51:19,000 And for another thing, they're not that different. 899 00:51:19,000 --> 00:51:23,320 It must be-- so it makes sense that it's 900 00:51:23,320 --> 00:51:25,194 the formant transitions that we listen to. 901 00:51:25,194 --> 00:51:27,967 AUDIENCE: Do you call it [INAUDIBLE] 902 00:51:27,967 --> 00:51:29,800 PROFESSOR: Yeah, sorry, the nasal consonant. 903 00:51:29,800 --> 00:51:30,300 Right. 904 00:51:33,640 --> 00:51:36,382 The consonants are much quieter than the vowels. 905 00:51:36,382 --> 00:51:38,590 So it makes sense that what we're really listening to 906 00:51:38,590 --> 00:51:39,250 is the vowel. 907 00:51:43,110 --> 00:51:44,609 Let me play something else for you. 908 00:52:10,557 --> 00:52:12,793 COMPUTER: Lose, lose. 909 00:52:12,793 --> 00:52:13,918 PROFESSOR: What's the word? 910 00:52:13,918 --> 00:52:15,630 AUDIENCE: Lose. 911 00:52:15,630 --> 00:52:19,330 PROFESSOR: Lose, like lose a game, right? 
912 00:52:19,330 --> 00:52:21,120 Something like that? 913 00:52:21,120 --> 00:52:23,591 AUDIENCE: [INAUDIBLE] 914 00:52:23,591 --> 00:52:24,840 PROFESSOR: This guy's British. 915 00:52:24,840 --> 00:52:27,560 AUDIENCE: Oh, yeah. 916 00:52:27,560 --> 00:52:28,534 That makes sense. 917 00:52:28,534 --> 00:52:30,092 COMPUTER: Lose. 918 00:52:30,092 --> 00:52:31,300 PROFESSOR: He's very British. 919 00:52:31,300 --> 00:52:32,962 AUDIENCE: Lose. 920 00:52:32,962 --> 00:52:33,920 PROFESSOR: Lose-- yeah. 921 00:52:38,200 --> 00:52:40,570 It's not quite the vowel we use, right? 922 00:52:40,570 --> 00:52:42,054 It's not quite the American vowel. 923 00:52:42,054 --> 00:52:43,720 There's something about the vowel that's 924 00:52:43,720 --> 00:52:45,284 a little bit different, and that's 925 00:52:45,284 --> 00:52:46,450 what makes it sound British. 926 00:52:46,450 --> 00:52:48,730 So let me break it down for you. 927 00:52:48,730 --> 00:52:52,150 Here is the l part, the luh part. 928 00:52:52,150 --> 00:52:54,370 COMPUTER: Luh, luh. 929 00:52:54,370 --> 00:52:56,946 PROFESSOR: Well, maybe I got a little bit of oo. 930 00:52:56,946 --> 00:52:59,170 COMPUTER: Loo, loo. 931 00:52:59,170 --> 00:53:01,810 PROFESSOR: And then we have the ooh. 932 00:53:01,810 --> 00:53:02,746 See this. 933 00:53:02,746 --> 00:53:04,450 COMPUTER: Ooh, ooh. 934 00:53:04,450 --> 00:53:07,342 PROFESSOR: And then here is the zuh. 935 00:53:07,342 --> 00:53:11,530 COMPUTER: [SOUND SIGNAL] 936 00:53:11,530 --> 00:53:13,720 PROFESSOR: So let me ask you, what 937 00:53:13,720 --> 00:53:18,790 is the difference between the word lose 938 00:53:18,790 --> 00:53:21,770 and the word loose? 939 00:53:21,770 --> 00:53:22,270 Yeah? 940 00:53:25,150 --> 00:53:28,520 AUDIENCE: The [INAUDIBLE] consonant in his voice. 941 00:53:28,520 --> 00:53:29,805 PROFESSOR: In which one? 942 00:53:29,805 --> 00:53:30,695 AUDIENCE: In lose. 943 00:53:30,695 --> 00:53:31,778 PROFESSOR: In lose, right.
944 00:53:31,778 --> 00:53:36,380 So in lose, it's a Z sound. 945 00:53:36,380 --> 00:53:39,079 Lose-- I'm going to write them down. 946 00:53:39,079 --> 00:53:39,620 Make sure I-- 947 00:53:48,106 --> 00:53:49,900 So this is a Z sound. 948 00:53:49,900 --> 00:53:55,370 And this is an S. People understand? 949 00:53:55,370 --> 00:53:56,440 You get this? 950 00:53:56,440 --> 00:53:57,760 This is a Z-- lose. 951 00:53:57,760 --> 00:53:59,350 This is an S-- loose. 952 00:53:59,350 --> 00:53:59,850 Yeah? 953 00:54:04,390 --> 00:54:07,620 So this is lose, so it should have a Z sound at the end here. 954 00:54:11,157 --> 00:54:14,930 COMPUTER: [SOUND SIGNAL] 955 00:54:14,930 --> 00:54:17,580 PROFESSOR: What does it sound like? 956 00:54:17,580 --> 00:54:19,890 It sounds like an s, right? 957 00:54:19,890 --> 00:54:22,860 COMPUTER: [SOUND SIGNAL] 958 00:54:22,860 --> 00:54:25,110 PROFESSOR: So in theory, this is voiced, 959 00:54:25,110 --> 00:54:29,550 but in actuality kind of not. 960 00:54:29,550 --> 00:54:30,570 If we look at-- 961 00:54:30,570 --> 00:54:32,730 so the characteristic of voicing is 962 00:54:32,730 --> 00:54:37,050 these striations, these up and down stripes, in the bottom 963 00:54:37,050 --> 00:54:38,030 here. 964 00:54:38,030 --> 00:54:41,820 If we look, they peter out really fast. 965 00:54:41,820 --> 00:54:45,330 So for the zuh part of this, it's 966 00:54:45,330 --> 00:54:49,110 only really a Z for this much time, and then it is all s. 967 00:54:52,530 --> 00:54:54,690 Furthermore, the frication noise-- 968 00:54:54,690 --> 00:54:57,240 that's the fuzziness that happens 969 00:54:57,240 --> 00:54:59,530 when you say a Z or an S-- 970 00:54:59,530 --> 00:55:03,460 is really loud, compared to what's going on here. 971 00:55:03,460 --> 00:55:08,786 This is not very loud, and this is kind of a big deal. 972 00:55:08,786 --> 00:55:10,680 Does this make sense?
973 00:55:10,680 --> 00:55:19,620 So I ask you, if it's not really a zuh that's happening, 974 00:55:19,620 --> 00:55:21,420 if it's really more like an S-- 975 00:55:21,420 --> 00:55:23,670 This is an s sound-- 976 00:55:23,670 --> 00:55:30,694 then how can we tell that this is lose and not loose? 977 00:55:39,143 --> 00:55:42,840 AUDIENCE: [INAUDIBLE] 978 00:55:42,840 --> 00:55:46,820 PROFESSOR: Sure, maybe it's just the beginning voicing. 979 00:55:46,820 --> 00:55:47,435 Yeah? 980 00:55:47,435 --> 00:55:49,710 AUDIENCE: Does it have to do with the volume-- 981 00:55:49,710 --> 00:55:51,080 like what happens after? 982 00:55:51,080 --> 00:55:52,005 PROFESSOR: After what? 983 00:55:52,005 --> 00:55:57,055 AUDIENCE: After the transition from the ooh to the suh. 984 00:55:57,055 --> 00:55:59,430 PROFESSOR: Yeah, maybe it's something about the frication 985 00:55:59,430 --> 00:56:01,520 noise. 986 00:56:01,520 --> 00:56:04,800 What it probably is, and I don't really 987 00:56:04,800 --> 00:56:06,690 expect you guys to have known this, 988 00:56:06,690 --> 00:56:12,540 but the actual time that it takes to say the vowel, 989 00:56:12,540 --> 00:56:16,920 the actual length of the vowel, is going to be much longer in 990 00:56:16,920 --> 00:56:18,930 lose than in loose. 991 00:56:18,930 --> 00:56:20,250 And if you actually-- 992 00:56:20,250 --> 00:56:21,420 I don't have loose. 993 00:56:21,420 --> 00:56:22,430 I really should. 994 00:56:22,430 --> 00:56:25,590 If you actually look at lose versus loose, 995 00:56:25,590 --> 00:56:27,586 lose is going to be-- 996 00:56:27,586 --> 00:56:28,350 are you OK? 997 00:56:31,624 --> 00:56:32,832 AUDIENCE: What are you doing? 998 00:56:32,832 --> 00:56:35,170 AUDIENCE: [INAUDIBLE] 999 00:56:35,170 --> 00:56:35,814 PROFESSOR: Oh. 1000 00:56:35,814 --> 00:56:37,266 AUDIENCE: Never mind. 1001 00:56:37,266 --> 00:56:46,181 [INAUDIBLE] 1002 00:56:46,181 --> 00:56:47,181 That was really strange.
1003 00:56:50,010 --> 00:56:55,520 PROFESSOR: So the vowel in lose is actually 1004 00:56:55,520 --> 00:56:57,830 like-- it takes twice as long to say. 1005 00:56:57,830 --> 00:57:02,060 You actually spend more time saying it. 1006 00:57:02,060 --> 00:57:03,250 It's a long vowel. 1007 00:57:03,250 --> 00:57:04,670 Whereas in loose, it's not. 1008 00:57:04,670 --> 00:57:06,320 And I'm not going to do this, because it's going 1009 00:57:06,320 --> 00:57:07,444 to take a really long time. 1010 00:57:07,444 --> 00:57:13,400 But if you actually take half of the length of this vowel away, 1011 00:57:13,400 --> 00:57:18,480 and you just cut, use edit, cut, control 1012 00:57:18,480 --> 00:57:23,150 whatever, X or whatever, then it will actually 1013 00:57:23,150 --> 00:57:25,617 start sounding like loose. 1014 00:57:25,617 --> 00:57:27,200 The reason I'm not going to do it for you 1015 00:57:27,200 --> 00:57:31,490 is that to preserve this beautiful arc here, 1016 00:57:31,490 --> 00:57:35,480 you actually have to take away every other period 1017 00:57:35,480 --> 00:57:40,940 sort of thing, which is painstaking work. 1018 00:57:40,940 --> 00:57:43,160 This is kind of another example of-- 1019 00:57:43,160 --> 00:57:46,029 what we are trying to signal is a difference in the consonant. 1020 00:57:46,029 --> 00:57:47,570 We're trying to signal the difference 1021 00:57:47,570 --> 00:57:50,030 between a ss and a zz. 1022 00:57:50,030 --> 00:57:52,580 And that's supposed to be a voicing difference. 1023 00:57:52,580 --> 00:57:54,230 But in English, we don't-- 1024 00:57:54,230 --> 00:57:56,510 it's not that the voicing is what's different. 1025 00:57:56,510 --> 00:57:59,030 The voicing is hardly different. 1026 00:57:59,030 --> 00:58:01,350 It's just this amount of different. 1027 00:58:01,350 --> 00:58:02,750 Hardly different at all. 1028 00:58:02,750 --> 00:58:07,580 But the vowel length is really very different.
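[EDITOR'S SKETCH, not part of the lecture] The "take away every other period" idea can be illustrated with a toy signal. This is only a sketch under stated assumptions: the "vowel" is a synthetic sine wave with a fixed, known period length, and the names drop_every_other_period, rate and period_len are hypothetical. Real speech has a period that drifts over time, so a real edit would need the period boundaries marked by hand or by a pitch tracker, which is the painstaking part mentioned above.

```python
import math

def drop_every_other_period(samples, period_len):
    """Keep periods 0, 2, 4, ... of a signal whose period is period_len samples.

    Assumes a constant period length, which real speech does not have.
    """
    kept = []
    for start in range(0, len(samples), 2 * period_len):
        kept.extend(samples[start:start + period_len])
    return kept

# Toy "vowel": 10 periods of a 100 Hz sine sampled at 8000 Hz (80 samples per period).
rate = 8000
period_len = 80
vowel = [math.sin(2 * math.pi * n / period_len) for n in range(10 * period_len)]

short = drop_every_other_period(vowel, period_len)
print(len(vowel) / rate, len(short) / rate)  # 0.1 0.05
```

Because whole periods are removed from all along the vowel rather than one contiguous chunk, the shortened signal still samples the beginning, middle, and end of the original, which is why this kind of edit keeps the overall shape of the formant movement while halving the duration.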
1029 00:58:07,580 --> 00:58:09,920 So this is just another example of where 1030 00:58:09,920 --> 00:58:13,420 we're listening to the much louder vowel to get our signal, 1031 00:58:13,420 --> 00:58:18,140 to get relevant cues, than to the actual consonant. 1032 00:58:18,140 --> 00:58:18,783 Yeah? 1033 00:58:18,783 --> 00:58:21,550 AUDIENCE: Doesn't it have to be something like a consonant too, 1034 00:58:21,550 --> 00:58:24,576 because if you say two words that 1035 00:58:24,576 --> 00:58:26,780 have the exact same length, vowel, 1036 00:58:26,780 --> 00:58:33,550 you can still hear the difference between an S and a Z. 1037 00:58:33,550 --> 00:58:34,780 PROFESSOR: I mean, maybe. 1038 00:58:34,780 --> 00:58:36,445 Although-- what? 1039 00:58:36,445 --> 00:58:41,200 AUDIENCE: [INAUDIBLE] hearing sounds [INAUDIBLE] same vowel 1040 00:58:41,200 --> 00:58:41,700 length. 1041 00:58:41,700 --> 00:58:44,910 PROFESSOR: I mean you can say a Z with full voicing. 1042 00:58:44,910 --> 00:58:47,700 You can go zzz, if you want to. 1043 00:58:47,700 --> 00:58:51,420 It's possible to make that distinction. 1044 00:58:51,420 --> 00:58:55,470 But, like I said, if I were to take this signal 1045 00:58:55,470 --> 00:58:59,640 and make the duration of the vowel shorter, 1046 00:58:59,640 --> 00:59:02,160 you would hear a loose. 1047 00:59:02,160 --> 00:59:04,470 If I were to make this half the length, 1048 00:59:04,470 --> 00:59:06,210 then you would actually hear loose. 1049 00:59:06,210 --> 00:59:09,420 So that suggests that the vowel length 1050 00:59:09,420 --> 00:59:12,464 is some really relevant signal. 1051 00:59:12,464 --> 00:59:13,380 Does that make sense? 1052 00:59:17,250 --> 00:59:18,584 That's very interesting. 1053 00:59:22,940 --> 00:59:23,908 Here's one. 1054 00:59:29,920 --> 00:59:30,420 Hello? 1055 00:59:35,914 --> 00:59:37,705 You guys don't need to see that, I realize.
1056 00:59:42,190 --> 00:59:44,620 So before I play this whole thing for you, let me just 1057 00:59:44,620 --> 00:59:49,456 play the end bit, and you guys can tell me what you hear. 1058 00:59:49,456 --> 00:59:53,134 COMPUTER: A, a, a, a, 1059 00:59:53,134 --> 00:59:54,592 PROFESSOR: What does it sound like? 1060 00:59:54,592 --> 00:59:55,980 AUDIENCE: A. 1061 00:59:55,980 --> 00:59:57,230 PROFESSOR: A, sure. 1062 00:59:57,230 --> 00:59:57,893 Yeah. 1063 00:59:57,893 --> 01:00:00,160 AUDIENCE: The consonant a. 1064 01:00:00,160 --> 01:00:04,180 PROFESSOR: Yeah, an a that comes out of nowhere, or something. 1065 01:00:04,180 --> 01:00:05,970 It's not a or I don't know-- 1066 01:00:05,970 --> 01:00:06,970 hard to do. 1067 01:00:06,970 --> 01:00:09,160 Yeah, some sort of consonant maybe. 1068 01:00:09,160 --> 01:00:13,710 COMPUTER: A, a, a, a, a. 1069 01:00:13,710 --> 01:00:15,047 PROFESSOR: So it sounded like a. 1070 01:00:15,047 --> 01:00:15,630 AUDIENCE: Yes. 1071 01:00:19,060 --> 01:00:23,419 COMPUTER: A, a, a, a. 1072 01:00:23,419 --> 01:00:25,710 PROFESSOR: Is it sounding different at all to you guys? 1073 01:00:25,710 --> 01:00:27,520 AUDIENCE: [INAUDIBLE] 1074 01:00:27,520 --> 01:00:28,690 PROFESSOR: More like bay. 1075 01:00:28,690 --> 01:00:30,280 AUDIENCE: [INAUDIBLE] 1076 01:00:30,280 --> 01:00:31,780 PROFESSOR: Yeah, right. 1077 01:00:31,780 --> 01:00:34,107 Sounds like-- kind of like bay. 1078 01:00:34,107 --> 01:00:35,240 AUDIENCE: Sounds like grey. 1079 01:00:35,240 --> 01:00:36,191 PROFESSOR: Grey? 1080 01:00:36,191 --> 01:00:38,532 AUDIENCE: Sounds like gay to me. 1081 01:00:38,532 --> 01:00:40,240 PROFESSOR: Gay, bay, something like that. 1082 01:00:40,240 --> 01:00:40,740 Yeah. 1083 01:00:40,740 --> 01:00:43,520 It sounds like there's a consonant there-- 1084 01:00:43,520 --> 01:00:45,290 some consonant at the beginning. 1085 01:00:45,290 --> 01:00:46,810 How about here? 1086 01:00:46,810 --> 01:00:49,000 COMPUTER: Bay, bay, bay.
1087 01:00:49,000 --> 01:00:50,670 AUDIENCE: Sounds like [INAUDIBLE] 1088 01:00:50,670 --> 01:00:52,902 PROFESSOR: What? 1089 01:00:52,902 --> 01:00:53,860 AUDIENCE: Bay. 1090 01:00:53,860 --> 01:00:56,618 PROFESSOR: Bay, still-- is that what you're hearing? 1091 01:00:56,618 --> 01:00:59,546 AUDIENCE: [INAUDIBLE] I heard a B at the end. 1092 01:00:59,546 --> 01:01:00,530 PROFESSOR: At the end? 1093 01:01:00,530 --> 01:01:01,956 COMPUTER: Bay, bay, bay. 1094 01:01:01,956 --> 01:01:03,164 AUDIENCE: It's not one vowel. 1095 01:01:03,164 --> 01:01:03,872 It's a diphthong. 1096 01:01:03,872 --> 01:01:06,330 PROFESSOR: Yes, it's a diphthong. 1097 01:01:06,330 --> 01:01:09,620 We're focusing on the beginning here. 1098 01:01:09,620 --> 01:01:12,342 COMPUTER: Bay, bay, bay, bay. 1099 01:01:12,342 --> 01:01:14,140 PROFESSOR: What do you guys hear? 1100 01:01:14,140 --> 01:01:15,020 Still the same? 1101 01:01:15,020 --> 01:01:16,250 AUDIENCE: Day. 1102 01:01:16,250 --> 01:01:18,080 PROFESSOR: Day? 1103 01:01:18,080 --> 01:01:19,310 hearing day. 1104 01:01:19,310 --> 01:01:21,910 Some people are hearing bay. 1105 01:01:21,910 --> 01:01:23,910 AUDIENCE: [INAUDIBLE] 1106 01:01:23,910 --> 01:01:30,347 COMPUTER: Bay, bay, day, day, day. 1107 01:01:30,347 --> 01:01:31,826 PROFESSOR: Still bay, day-- 1108 01:01:33,305 --> 01:01:36,263 AUDIENCE: It switches off between bay and day. 1109 01:01:36,263 --> 01:01:38,740 COMPUTER: Day, day, day, day. 1110 01:01:38,740 --> 01:01:40,434 PROFESSOR: How about now? 1111 01:01:40,434 --> 01:01:42,250 AUDIENCE: Date. 1112 01:01:42,250 --> 01:01:43,860 PROFESSOR: Day, they, date. 1113 01:01:43,860 --> 01:01:45,702 COMPUTER: Date, date, date. 1114 01:01:45,702 --> 01:01:47,910 PROFESSOR: This was actually taken out of a sentence. 
1115 01:01:47,910 --> 01:01:51,530 So it makes sense that you hear also a consonant at the end, 1116 01:01:51,530 --> 01:01:53,840 because the vowel stops abruptly, 1117 01:01:53,840 --> 01:01:57,050 because it was just sort of cut and pasted 1118 01:01:57,050 --> 01:01:59,495 from the middle of a sentence. 1119 01:02:02,310 --> 01:02:04,370 COMPUTER: Day, day. 1120 01:02:04,370 --> 01:02:06,640 PROFESSOR: People are hearing mostly day here? 1121 01:02:06,640 --> 01:02:08,110 AUDIENCE: Yeah. 1122 01:02:08,110 --> 01:02:08,754 PROFESSOR: Day? 1123 01:02:08,754 --> 01:02:09,670 Let's keep going back. 1124 01:02:09,670 --> 01:02:13,630 COMPUTER: Day, day, day, day. 1125 01:02:13,630 --> 01:02:14,774 PROFESSOR: Still day? 1126 01:02:14,774 --> 01:02:17,680 AUDIENCE: [INAUDIBLE] 1127 01:02:17,680 --> 01:02:20,465 COMPUTER: Day, day, day, day. 1128 01:02:20,465 --> 01:02:22,370 PROFESSOR: Still day? 1129 01:02:22,370 --> 01:02:23,675 I'll play you the whole thing. 1130 01:02:23,675 --> 01:02:24,625 COMPUTER: Say. 1131 01:02:24,625 --> 01:02:26,198 AUDIENCE: Say. 1132 01:02:26,198 --> 01:02:28,447 AUDIENCE: [INAUDIBLE] the last time right before that, 1133 01:02:28,447 --> 01:02:29,625 and I was like, oh. 1134 01:02:29,625 --> 01:02:31,530 COMPUTER: Say, say. 1135 01:02:31,530 --> 01:02:32,490 AUDIENCE: [INAUDIBLE] 1136 01:02:32,490 --> 01:02:35,090 PROFESSOR: Say-- anyone surprised? 1137 01:02:35,090 --> 01:02:37,473 AUDIENCE: [INAUDIBLE] 1138 01:02:37,473 --> 01:02:40,146 [INTERPOSING VOICES] 1139 01:02:40,146 --> 01:02:40,770 PROFESSOR: Huh? 1140 01:02:40,770 --> 01:02:41,220 Oh, yeah. 1141 01:02:41,220 --> 01:02:41,720 Right. 1142 01:02:41,720 --> 01:02:44,340 The name of the file is set to yes. 1143 01:02:44,340 --> 01:02:46,728 What's going on here? 1144 01:02:46,728 --> 01:02:48,960 AUDIENCE: I heard it right before. 1145 01:02:48,960 --> 01:02:50,620 PROFESSOR: Yeah. 1146 01:02:50,620 --> 01:02:53,140 COMPUTER: Say, say, say.
1147 01:02:53,140 --> 01:02:58,320 PROFESSOR: You can see from this demonstration 1148 01:02:58,320 --> 01:03:00,660 how we're interpreting the signal. 1149 01:03:00,660 --> 01:03:03,000 So back when we were over here, you guys 1150 01:03:03,000 --> 01:03:07,290 were saying it sounds like a or bay or something. 1151 01:03:07,290 --> 01:03:10,015 It sounds like a vowel and it comes out of nowhere, 1152 01:03:10,015 --> 01:03:11,890 so it must have a consonant at the beginning. 1153 01:03:11,890 --> 01:03:14,006 So your brain sort of fills that in, like oh, 1154 01:03:14,006 --> 01:03:15,630 there must have been a consonant there, 1155 01:03:15,630 --> 01:03:17,855 because it's a vowel coming out of nowhere. 1156 01:03:17,855 --> 01:03:20,640 If you were just to say a, then you would probably 1157 01:03:20,640 --> 01:03:24,030 have a sort of characteristic rise that we've seen before, 1158 01:03:24,030 --> 01:03:26,130 sort of swell into it. 1159 01:03:26,130 --> 01:03:28,747 And then as I got farther-- 1160 01:03:28,747 --> 01:03:29,330 what happened? 1161 01:03:36,720 --> 01:03:38,090 My computer is freaking out. 1162 01:03:38,090 --> 01:03:42,104 So as I got farther back, you guys started hearing day. 1163 01:03:42,104 --> 01:03:43,853 Why do you think you would have heard day? 1164 01:03:47,240 --> 01:03:49,330 As I got closer and closer to here, 1165 01:03:49,330 --> 01:03:53,190 you guys started hearing day rather than bay, right? 1166 01:03:53,190 --> 01:03:54,496 So why was that? 1167 01:03:59,809 --> 01:04:04,594 AUDIENCE: Maybe because bay was more abrupt-- 1168 01:04:04,594 --> 01:04:09,590 stop [INAUDIBLE] it's so abrupt, and so we 1169 01:04:09,590 --> 01:04:12,360 assumed it was that until we started hearing the ss sound. 1170 01:04:12,360 --> 01:04:13,850 PROFESSOR: Right. 
1171 01:04:13,850 --> 01:04:15,362 So you guys started hearing day-- 1172 01:04:15,362 --> 01:04:17,320 I don't know why my computer is freaking out-- 1173 01:04:17,320 --> 01:04:20,220 you guys started hearing day around here, 1174 01:04:20,220 --> 01:04:25,062 right? So maybe it was a little less abrupt. 1175 01:04:25,062 --> 01:04:26,550 Anything else? 1176 01:04:37,958 --> 01:04:41,330 AUDIENCE: [INAUDIBLE] 1177 01:04:41,330 --> 01:04:43,980 PROFESSOR: Yeah, you didn't hear the frication noise, 1178 01:04:43,980 --> 01:04:47,280 is what this is called, of the-- because S is a fricative, 1179 01:04:47,280 --> 01:04:49,260 and that's a frication noise. 1180 01:04:49,260 --> 01:04:51,390 So you didn't hear that happening. 1181 01:04:51,390 --> 01:04:56,370 So you weren't perceiving S. But the question is, 1182 01:04:56,370 --> 01:04:59,790 since you weren't perceiving S, why was it day 1183 01:04:59,790 --> 01:05:04,350 that you were perceiving, and not something else, like bay. 1184 01:05:04,350 --> 01:05:06,405 Because you were perceiving bay a lot in here. 1185 01:05:09,030 --> 01:05:11,370 So you suggested that it's a little less 1186 01:05:11,370 --> 01:05:16,260 abrupt to go into there, maybe because of the release for day 1187 01:05:16,260 --> 01:05:17,130 being louder. 1188 01:05:17,130 --> 01:05:18,666 What do you think? 1189 01:05:18,666 --> 01:05:25,911 AUDIENCE: Maybe the shape of [INAUDIBLE] the formant changes 1190 01:05:25,911 --> 01:05:30,014 the closer you're [INAUDIBLE] 1191 01:05:30,014 --> 01:05:30,680 PROFESSOR: Yeah. 1192 01:05:30,680 --> 01:05:33,250 So if you think about it, the way 1193 01:05:33,250 --> 01:05:36,430 your mouth is for an S and a d, there's 1194 01:05:36,430 --> 01:05:39,424 a closure in a very similar place. 1195 01:05:39,424 --> 01:05:44,344 AUDIENCE: Didn't we notice before that with the d 1196 01:05:44,344 --> 01:05:49,270 that there was less of a formant change.
1197 01:05:49,270 --> 01:05:53,140 PROFESSOR: Yeah for awe, right? 1198 01:05:53,140 --> 01:05:58,730 So we have yet to see that for the a vowel, diphthong. 1199 01:06:02,400 --> 01:06:04,580 But if you think about it, the way 1200 01:06:04,580 --> 01:06:10,620 your mouth is for an S and duh, both have a closure right here. 1201 01:06:10,620 --> 01:06:13,660 Try it-- ss, duh, duh, duh-- 1202 01:06:13,660 --> 01:06:18,370 very close, like behind your top teeth, behind your teeth, 1203 01:06:18,370 --> 01:06:20,170 that's where the closure is happening. 1204 01:06:23,260 --> 01:06:26,530 Basically, what that means is that the different sort 1205 01:06:26,530 --> 01:06:29,455 of cavities in your vocal tract are going to be very similar. 1206 01:06:32,930 --> 01:06:33,740 That's too bad. 1207 01:06:41,180 --> 01:06:42,940 The cavities in your mouth are going 1208 01:06:42,940 --> 01:06:48,721 to be really similar for day and say. 1209 01:06:48,721 --> 01:06:50,470 So the formant transitions for day and say 1210 01:06:50,470 --> 01:06:52,460 are also going to be really similar. 1211 01:06:52,460 --> 01:06:54,950 So the way to figure out the difference between day and say 1212 01:06:54,950 --> 01:07:00,220 is going to be something like is it a fricative or is it a stop? 1213 01:07:00,220 --> 01:07:03,340 So when we were starting just at the end of the frication 1214 01:07:03,340 --> 01:07:05,702 or the beginning of a vowel, somewhere in there, 1215 01:07:05,702 --> 01:07:07,660 we were thinking it's a stop, because we didn't 1216 01:07:07,660 --> 01:07:09,270 hear all that frication noise. 1217 01:07:09,270 --> 01:07:13,210 We heard the vowel coming very loudly out of nowhere. 1218 01:07:13,210 --> 01:07:16,570 But when we moved it back to hear the whole frication noise, 1219 01:07:16,570 --> 01:07:18,550 then we could hear that it was say. 1220 01:07:18,550 --> 01:07:20,600 People are getting this? 1221 01:07:20,600 --> 01:07:21,902 Understanding this?
1222 01:07:21,902 --> 01:07:22,402 Yeah? 1223 01:07:40,944 --> 01:07:41,860 Do you guys like this? 1224 01:07:41,860 --> 01:07:44,170 I'm trying out a new background. 1225 01:07:44,170 --> 01:07:45,550 I just found this yesterday. 1226 01:07:45,550 --> 01:07:48,070 I'm not sure whether I like it. 1227 01:07:48,070 --> 01:07:50,410 AUDIENCE: [INAUDIBLE] 1228 01:07:50,410 --> 01:07:51,160 PROFESSOR: I know. 1229 01:07:51,160 --> 01:07:56,050 I kind of wish that they were leaves all the way down, 1230 01:07:56,050 --> 01:07:57,348 and not stars. 1231 01:07:57,348 --> 01:07:59,540 AUDIENCE: Did you make it? 1232 01:07:59,540 --> 01:08:01,352 PROFESSOR: I didn't make it. 1233 01:08:01,352 --> 01:08:02,525 [INTERPOSING VOICES] 1234 01:08:02,525 --> 01:08:04,870 AUDIENCE: It's like maple leaves. 1235 01:08:04,870 --> 01:08:08,005 PROFESSOR: I mean, this one looks like a star, right? 1236 01:08:08,005 --> 01:08:10,430 AUDIENCE: Oh, I see it. 1237 01:08:10,430 --> 01:08:13,340 [INTERPOSING VOICES] 1238 01:08:17,763 --> 01:08:21,609 COMPUTER: Say, say. 1239 01:08:21,609 --> 01:08:23,100 PROFESSOR: What was I going to say? 1240 01:08:23,100 --> 01:08:23,600 Oh, yeah. 1241 01:08:23,600 --> 01:08:28,229 Incidentally, a lot of these studies are done with women-- 1242 01:08:28,229 --> 01:08:31,180 sorry-- a lot of these studies are done with men. 1243 01:08:31,180 --> 01:08:34,120 [INTERPOSING VOICES] 1244 01:08:34,120 --> 01:08:38,979 This is a sort of exceptional, in that this is a woman. 1245 01:08:38,979 --> 01:08:40,840 It's sort of exceptional, because the thing 1246 01:08:40,840 --> 01:08:44,100 is that, on average, men and women speak 1247 01:08:44,100 --> 01:08:46,660 at different pitches. 1248 01:08:46,660 --> 01:08:49,450 And yes, really. 1249 01:08:53,680 --> 01:08:56,340 So those interact with the formants. 
1250 01:08:56,340 --> 01:09:00,189 So if you think of this, the formants 1251 01:09:00,189 --> 01:09:02,890 have something to do with the resonances in your vocal tract. 1252 01:09:02,890 --> 01:09:04,560 That's what they have to do with. 1253 01:09:04,560 --> 01:09:06,310 And the pitch of your voice has to do 1254 01:09:06,310 --> 01:09:09,790 with how quickly your vocal cords are vibrating. 1255 01:09:09,790 --> 01:09:12,550 So if you think about it, these are completely independent 1256 01:09:12,550 --> 01:09:13,810 things. 1257 01:09:13,810 --> 01:09:15,130 Do you understand this? 1258 01:09:20,512 --> 01:09:22,720 The formants have to do with the shape of your mouth, 1259 01:09:22,720 --> 01:09:25,956 and the pitch has to do with the speed of air 1260 01:09:25,956 --> 01:09:27,080 and all that sort of thing. 1261 01:09:27,080 --> 01:09:30,200 And these are independent. 1262 01:09:30,200 --> 01:09:34,399 But what you can see is that, depending on the frequency-- 1263 01:09:34,399 --> 01:09:38,830 So if you're producing sound at a certain pitch, 1264 01:09:38,830 --> 01:09:44,180 if you're producing a pitch of a vowel, then it has harmonics. 1265 01:09:44,180 --> 01:09:48,160 So if you produce something at like 400 Hertz, 1266 01:09:48,160 --> 01:09:52,029 then it's going to be loudest at 400, and 800, and 1200, 1267 01:09:52,029 --> 01:09:55,620 and 1600, and so on. 1268 01:09:55,620 --> 01:09:57,598 And then, in between-- 1269 01:09:57,598 --> 01:10:00,530 AUDIENCE: Don't you multiply by two each time? 1270 01:10:00,530 --> 01:10:05,330 PROFESSOR: 1200 comes in there, I believe. 1271 01:10:10,530 --> 01:10:14,490 And in between them, you're going to have like-- 1272 01:10:14,490 --> 01:10:17,280 at 400-- sorry, 400, 800-- 1273 01:10:17,280 --> 01:10:20,970 so at 600, it's going to be quieter. 1274 01:10:20,970 --> 01:10:25,560 So what ends up happening is that like regardless you're 1275 01:10:25,560 --> 01:10:30,930 not going to see much going on at 600.
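[EDITOR'S SKETCH, not part of the lecture] The harmonics point can be checked with a few lines of arithmetic. Harmonics sit at whole-number multiples of the pitch (1x, 2x, 3x, ...), not at repeated doublings, which is why 1200 belongs in the list. The function names are hypothetical, and the 100 Hz value used for a lower-pitched voice is an assumed round number for illustration, not a figure from the lecture.

```python
def harmonics(f0_hz, max_hz):
    """Harmonic frequencies of a voice pitched at f0_hz, up to max_hz."""
    return [f0_hz * k for k in range(1, int(max_hz // f0_hz) + 1)]

def gap_to_nearest_harmonic(freq_hz, f0_hz):
    """Distance in Hz from freq_hz to the closest harmonic of f0_hz."""
    k = max(1, round(freq_hz / f0_hz))
    return abs(freq_hz - k * f0_hz)

# Pitch of 400 Hz, as in the lecture's example.
print(harmonics(400, 1600))               # [400, 800, 1200, 1600]
print(gap_to_nearest_harmonic(600, 400))  # 200

# Assumed lower pitch of 100 Hz: the harmonics are packed more closely.
print(gap_to_nearest_harmonic(600, 100))  # 0
```

With the 400 Hz pitch, anything happening at 600 Hz falls 200 Hz away from the nearest harmonic, so there is little energy there to show it. With the lower assumed pitch, a harmonic lands on or near almost any frequency of interest, which is one way to see why a lower voice leaves fewer gaps in the picture.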
1276 01:10:30,930 --> 01:10:39,420 Basically, these little dots are indicative of where 1277 01:10:39,420 --> 01:10:40,470 those gaps are. 1278 01:10:40,470 --> 01:10:42,060 If you see the white dots, that's 1279 01:10:42,060 --> 01:10:44,900 where you can't really hear it because of the pitch 1280 01:10:44,900 --> 01:10:46,140 that her voice is at. 1281 01:10:46,140 --> 01:10:49,530 And for men, because it's a lower pitch, 1282 01:10:49,530 --> 01:10:52,910 they're going to be differently spaced. 1283 01:10:52,910 --> 01:10:58,270 So if we were to look at Edward again, 1284 01:10:58,270 --> 01:11:00,360 which we'll do in a second, I guess, 1285 01:11:00,360 --> 01:11:06,370 then we could see that he didn't interfere with the formant 1286 01:11:06,370 --> 01:11:07,180 here. 1287 01:11:07,180 --> 01:11:10,510 Sometimes, you have some issues actually finding 1288 01:11:10,510 --> 01:11:12,280 where is formant two, or where it's not, 1289 01:11:12,280 --> 01:11:14,955 because you have this pitch thing going on. 1290 01:11:14,955 --> 01:11:16,330 So that's why a lot of this stuff 1291 01:11:16,330 --> 01:11:18,540 is done with men's voices, because it 1292 01:11:18,540 --> 01:11:20,740 tends not to interfere with the lower formants 1293 01:11:20,740 --> 01:11:23,490 when you're just looking at the pictures. 1294 01:11:23,490 --> 01:11:24,820 That isn't really relevant. 1295 01:11:30,560 --> 01:11:33,140 I guess that is more or less what I had to say. 1296 01:11:33,140 --> 01:11:35,540 Do you guys have questions? 1297 01:11:40,240 --> 01:11:42,080 No questions? 1298 01:11:42,080 --> 01:11:43,308 What? 1299 01:11:43,308 --> 01:11:46,242 AUDIENCE: I see the stars. 1300 01:11:46,242 --> 01:11:51,140 [INTERPOSING VOICES] 1301 01:11:51,140 --> 01:11:54,481 PROFESSOR: Do you guys want to see some cool Burmese stuff? 1302 01:11:54,481 --> 01:11:54,980 Burmese? 1303 01:11:54,980 --> 01:11:56,107 AUDIENCE: Yeah. 1304 01:11:56,107 --> 01:11:56,690 PROFESSOR: OK.
1305 01:11:56,690 --> 01:11:59,570 [INTERPOSING VOICES] 1306 01:11:59,570 --> 01:12:00,892 AUDIENCE: Oh, Burmese. 1307 01:12:00,892 --> 01:12:02,776 [INTERPOSING VOICES] 1308 01:12:05,470 --> 01:12:08,108 PROFESSOR: Yes, with the Buddhists. 1309 01:12:08,108 --> 01:12:11,084 [INTERPOSING VOICES] 1310 01:14:00,020 --> 01:14:01,350 PROFESSOR: This is Burmese. 1311 01:14:01,350 --> 01:14:03,170 [INTERPOSING VOICES] 1312 01:14:09,950 --> 01:14:11,569 PROFESSOR: That was a guy, yeah. 1313 01:14:11,569 --> 01:14:12,610 AUDIENCE: That was a guy? 1314 01:14:12,610 --> 01:14:13,235 PROFESSOR: Yes. 1315 01:14:16,600 --> 01:14:19,277 COMPUTER: Ma, ma. 1316 01:14:19,277 --> 01:14:20,610 PROFESSOR: This is the same guy. 1317 01:14:20,610 --> 01:14:22,362 AUDIENCE: Oh, it's a guy. 1318 01:14:22,362 --> 01:14:24,146 COMPUTER: Ma, ma. 1319 01:14:26,790 --> 01:14:30,390 PROFESSOR: This guy-- so Burmese has tones. 1320 01:14:34,194 --> 01:14:35,110 It's a tonal language. 1321 01:14:35,110 --> 01:14:36,330 Do you guys know what that means? 1322 01:14:36,330 --> 01:14:37,662 AUDIENCE: Is it like Chinese. 1323 01:14:37,662 --> 01:14:38,828 AUDIENCE: Japanese, Chinese. 1324 01:14:38,828 --> 01:14:40,320 PROFESSOR: It's like Chinese. 1325 01:14:40,320 --> 01:14:40,820 Yeah. 1326 01:14:40,820 --> 01:14:43,106 AUDIENCE: Japanese [INAUDIBLE] 1327 01:14:43,106 --> 01:14:44,730 PROFESSOR: Japanese isn't really tonal. 1328 01:14:44,730 --> 01:14:46,970 AUDIENCE: [INAUDIBLE] 1329 01:14:46,970 --> 01:14:48,780 PROFESSOR: Yeah. 1330 01:14:48,780 --> 01:14:52,710 So the difference-- So what makes it a tonal language is 1331 01:14:52,710 --> 01:14:58,130 that something about the pitch of the vowel or the shape-- 1332 01:14:58,130 --> 01:15:01,350 whether it's going up or down, makes a distinction. 1333 01:15:01,350 --> 01:15:04,140 So like in Chinese, there's going 1334 01:15:04,140 --> 01:15:08,460 to be difference between Ma and ma, something like that. 
1335 01:15:08,460 --> 01:15:09,786 And those are different words. 1336 01:15:09,786 --> 01:15:11,535 AUDIENCE: It's like how you express anger, 1337 01:15:11,535 --> 01:15:13,593 because like in English, you use your tone 1338 01:15:13,593 --> 01:15:14,860 to express differently. 1339 01:15:14,860 --> 01:15:16,222 You just yell the word loudly. 1340 01:15:16,222 --> 01:15:17,590 [INTERPOSING VOICES] 1341 01:15:24,387 --> 01:15:25,970 PROFESSOR: So this is a good question. 1342 01:15:25,970 --> 01:15:28,780 And the answer to your-- 1343 01:15:28,780 --> 01:15:31,250 is the word you should Google is prosody. 1344 01:15:42,400 --> 01:15:46,600 Prosody is the study of the pitches of sentences 1345 01:15:46,600 --> 01:15:50,680 and entire utterances, and how those pitches change, 1346 01:15:50,680 --> 01:15:53,510 and how that relates to what you're saying. 1347 01:15:53,510 --> 01:15:55,690 So for example, in English, we can use prosody 1348 01:15:55,690 --> 01:15:58,600 to make the difference between a declarative sentence 1349 01:15:58,600 --> 01:16:04,240 and a question, like you ate earlier today. 1350 01:16:04,240 --> 01:16:06,930 Versus, you ate earlier today? 1351 01:16:06,930 --> 01:16:09,610 Or something like that. 1352 01:16:09,610 --> 01:16:13,050 So it's not the words themselves, but the prosody 1353 01:16:13,050 --> 01:16:14,050 that makes a difference. 1354 01:16:14,050 --> 01:16:16,330 And prosody in Chinese is different from prosody 1355 01:16:16,330 --> 01:16:20,580 in English is the short answer. 1356 01:16:20,580 --> 01:16:22,090 Burmese is tonal. 1357 01:16:22,090 --> 01:16:28,930 Burmese also has this weird thing called voiceless nasals. 1358 01:16:28,930 --> 01:16:31,030 So let's just look at one. 1359 01:16:35,620 --> 01:16:39,280 If you think of a nasal like mm in English, 1360 01:16:39,280 --> 01:16:41,830 pretty much whenever you say it, it has voicing going on.
1361 01:16:41,830 --> 01:16:45,450 Right there-- their vocal cords are vibrating during the mm. 1362 01:16:45,450 --> 01:16:46,810 Mm-- it's happening. 1363 01:16:46,810 --> 01:16:47,800 You can feel it-- 1364 01:16:47,800 --> 01:16:50,530 mm. 1365 01:16:50,530 --> 01:16:55,375 And in Burmese, this is true as well. 1366 01:16:55,375 --> 01:16:58,180 But they have something called the voiceless nasal. 1367 01:16:58,180 --> 01:17:02,950 So this is supposed to be the same word with the same tone, 1368 01:17:02,950 --> 01:17:04,960 and the difference is that this one-- or sorry, 1369 01:17:04,960 --> 01:17:05,905 these are supposed to be different words, 1370 01:17:05,905 --> 01:17:08,530 but they're supposed to have the same tone and the same vowel-- 1371 01:17:08,530 --> 01:17:13,190 and the difference is that this has a voiceless m 1372 01:17:13,190 --> 01:17:17,587 versus a voiced m here. 1373 01:17:17,587 --> 01:17:19,420 So let's see if you can hear the difference. 1374 01:17:19,420 --> 01:17:21,415 COMPUTER: Ma. 1375 01:17:21,415 --> 01:17:21,915 Muh. 1376 01:17:24,570 --> 01:17:25,320 PROFESSOR: Anyone? 1377 01:17:25,320 --> 01:17:27,680 AUDIENCE: Yeah, slightly. 1378 01:17:27,680 --> 01:17:29,976 PROFESSOR: The vowel is shorter. 1379 01:17:29,976 --> 01:17:33,830 COMPUTER: Muh, ma. 1380 01:17:33,830 --> 01:17:36,030 PROFESSOR: I mean, in truth, they're both voiced. 1381 01:17:36,030 --> 01:17:38,420 We can see this beautiful voicing 1382 01:17:38,420 --> 01:17:41,604 down here, and down here. 1383 01:17:41,604 --> 01:17:43,020 And maybe the length is different. 1384 01:17:43,020 --> 01:17:43,561 I don't know. 1385 01:17:43,561 --> 01:17:44,210 Let's see. 1386 01:17:44,210 --> 01:17:50,870 The length of this one is like 57 milliseconds, 1387 01:17:50,870 --> 01:17:54,770 and this one is like 42 milliseconds. 1388 01:17:54,770 --> 01:17:56,720 So it's maybe a little bit shorter. 1389 01:17:56,720 --> 01:17:57,450 Yeah?
1390 01:17:57,450 --> 01:17:58,950 AUDIENCE: I think it might be easier 1391 01:17:58,950 --> 01:18:01,400 to pronounce voiceless nasal [INAUDIBLE] shorter. 1392 01:18:01,400 --> 01:18:04,894 It might be easier, because-- 1393 01:18:04,894 --> 01:18:06,060 PROFESSOR: So phoneticians-- 1394 01:18:06,060 --> 01:18:09,770 [INTERPOSING VOICES] 1395 01:18:09,770 --> 01:18:12,930 PROFESSOR: Yeah, you have to voice it to hear it basically. 1396 01:18:12,930 --> 01:18:15,990 So phoneticians and Burmese speakers 1397 01:18:15,990 --> 01:18:18,870 have noticed that in the voiceless nasals, what happens 1398 01:18:18,870 --> 01:18:21,360 is that before the nasal happens, 1399 01:18:21,360 --> 01:18:23,740 there's some air flowing out of your mouth-- 1400 01:18:23,740 --> 01:18:25,715 or sorry, flowing out of your nose. 1401 01:18:25,715 --> 01:18:27,840 AUDIENCE: [INAUDIBLE] 1402 01:18:27,840 --> 01:18:29,640 PROFESSOR: Ma, right? 1403 01:18:29,640 --> 01:18:31,170 So ma. 1404 01:18:31,170 --> 01:18:35,040 So there's air flowing out of the nose before the nasal. 1405 01:18:35,040 --> 01:18:40,560 But I found this kind of unlikely. 1406 01:18:40,560 --> 01:18:43,170 It's really unlikely that the difference that you're hearing 1407 01:18:43,170 --> 01:18:47,090 is actually the air flowing out of your nose-- 1408 01:18:47,090 --> 01:18:48,450 not very loud, right? 1409 01:18:51,300 --> 01:18:52,020 Really quiet. 1410 01:18:54,790 --> 01:19:00,140 So I actually did a study where I took-- 1411 01:19:00,140 --> 01:19:03,360 I basically did a copy and paste study. 
1412 01:19:03,360 --> 01:19:08,831 So I took everything before the vowel 1413 01:19:08,831 --> 01:19:10,830 for two words that were supposed to be the same, 1414 01:19:10,830 --> 01:19:13,770 except for the voicing of the nasal, 1415 01:19:13,770 --> 01:19:16,710 and I pasted them in the wrong way, 1416 01:19:16,710 --> 01:19:19,170 and then I also pasted them in the right way 1417 01:19:19,170 --> 01:19:21,250 with other instances of the same word, 1418 01:19:21,250 --> 01:19:25,410 and they always judged the meaning of the word 1419 01:19:25,410 --> 01:19:29,370 to be the one that was associated with the correct vowel. 1420 01:19:29,370 --> 01:19:33,990 So if I took this off here, and put on the voiced one, 1421 01:19:33,990 --> 01:19:37,620 they would still think it was unvoiced. 1422 01:19:37,620 --> 01:19:39,350 And if I took off the voiceless one-- 1423 01:19:39,350 --> 01:19:43,350 or if I put a voiceless onto here instead of this, 1424 01:19:43,350 --> 01:19:46,020 they would still think that it was voiced. 1425 01:19:46,020 --> 01:19:48,550 So I never figured out exactly what the difference was, 1426 01:19:48,550 --> 01:19:50,460 but it must be something to do with the vowel that's 1427 01:19:50,460 --> 01:19:51,376 making the difference. 1428 01:19:51,376 --> 01:19:52,640 Does that make sense? 1429 01:19:52,640 --> 01:19:56,340 This is another one of those stereotypical things. 1430 01:19:56,340 --> 01:19:58,024 AUDIENCE: [INAUDIBLE] 1431 01:19:58,024 --> 01:19:59,440 PROFESSOR: Totally out of context. 1432 01:19:59,440 --> 01:20:01,470 So I just played them the word.
1433 01:20:01,470 --> 01:20:04,530 I randomly mixed them up using-- 1434 01:20:04,530 --> 01:20:07,440 I think, actually, I drew numbers out of a hat-- 1435 01:20:07,440 --> 01:20:08,940 named them things that had nothing-- 1436 01:20:08,940 --> 01:20:11,670 like just letters or something-- had nothing 1437 01:20:11,670 --> 01:20:14,310 to do with what they meant, played them for them, 1438 01:20:14,310 --> 01:20:17,440 had lots of controls, and just said, what does this mean? 1439 01:20:17,440 --> 01:20:22,362 Does it mean to marinate or a celestial body, or whatever, 1440 01:20:22,362 --> 01:20:24,570 because this is like the differences that these have. 1441 01:20:24,570 --> 01:20:27,160 And they gave me really significant results. 1442 01:20:27,160 --> 01:20:29,790 So it was really cool. 1443 01:20:29,790 --> 01:20:31,730 And it has everything to do with the vowel. 1444 01:20:31,730 --> 01:20:33,660 I hypothesized that it might have something 1445 01:20:33,660 --> 01:20:36,090 to do with breathiness of the vowel. 1446 01:20:36,090 --> 01:20:38,910 But that's also intertwined with the Burmese tone system, 1447 01:20:38,910 --> 01:20:44,320 so it's definitely not conclusive. 1448 01:20:44,320 --> 01:20:49,440 So that's an interesting thing that Burmese has. 1449 01:20:49,440 --> 01:20:52,109 It also has tones, so I wanted to play those for you. 1450 01:20:52,109 --> 01:20:53,400 AUDIENCE: Do you speak Burmese? 1451 01:20:53,400 --> 01:20:56,235 PROFESSOR: I do not. 1452 01:20:56,235 --> 01:21:01,525 AUDIENCE: So wasn't it really [INAUDIBLE] 1453 01:21:01,525 --> 01:21:02,150 PROFESSOR: Hmm? 1454 01:21:02,150 --> 01:21:03,898 AUDIENCE: Wasn't it like really difficult [INAUDIBLE] 1455 01:21:03,898 --> 01:21:05,210 experiment [INAUDIBLE] 1456 01:21:05,210 --> 01:21:07,380 PROFESSOR: I was asking Burmese speakers. 1457 01:21:07,380 --> 01:21:08,980 AUDIENCE: Oh. 1458 01:21:08,980 --> 01:21:15,380 PROFESSOR: I wasn't doing it with myself, obviously.
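[Editor's sketch] The copy-and-paste study described above could be set up roughly as follows. This is only an illustration under stated assumptions, not the professor's actual procedure: the names `cross_splice` and `build_stimuli`, the dictionary layout, and the use of sample lists are all hypothetical, and the matched (control) splices here reuse the same recording rather than another instance of the same word, as was done in the study.

```python
import random
import string


def cross_splice(onset_src, vowel_src):
    """Join everything before the vowel of one recording to the vowel of another.

    Each recording is a hypothetical dict with keys "word", "onset", "vowel",
    where "onset" and "vowel" are lists of audio samples.
    """
    return {
        "samples": onset_src["onset"] + vowel_src["vowel"],
        "onset_from": onset_src["word"],
        "vowel_from": vowel_src["word"],
    }


def build_stimuli(voiced, voiceless, seed=0):
    """Build matched and mismatched splices, shuffle, and give neutral labels."""
    stimuli = [
        cross_splice(voiced, voiced),        # control: voiced onset, voiced-word vowel
        cross_splice(voiceless, voiceless),  # control: voiceless onset, voiceless-word vowel
        cross_splice(voiced, voiceless),     # mismatched: "pasted in the wrong way"
        cross_splice(voiceless, voiced),     # mismatched: "pasted in the wrong way"
    ]
    # A seeded shuffle stands in for drawing numbers out of a hat.
    random.Random(seed).shuffle(stimuli)
    # Label each stimulus with a letter that says nothing about its meaning.
    for label, stim in zip(string.ascii_uppercase, stimuli):
        stim["label"] = label
    return stimuli
```

If listeners pick the meaning that goes with `vowel_from` rather than `onset_from` on the mismatched items, that matches the result reported here: the cue is in the vowel, not in the nasal.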
1459 01:21:15,380 --> 01:21:18,896 So here's a voiced-- 1460 01:21:26,720 --> 01:21:29,050 So here is one tone. 1461 01:21:29,050 --> 01:21:30,480 COMPUTER: Ma. 1462 01:21:30,480 --> 01:21:32,210 PROFESSOR: Here's another tone. 1463 01:21:32,210 --> 01:21:34,385 COMPUTER: Mah, mah. 1464 01:21:34,385 --> 01:21:36,010 PROFESSOR: Can you hear the difference? 1465 01:21:36,010 --> 01:21:37,326 AUDIENCE: Yeah. 1466 01:21:37,326 --> 01:21:38,700 PROFESSOR: So the first time is-- 1467 01:21:38,700 --> 01:21:42,225 COMPUTER: Ma, ma. 1468 01:21:42,225 --> 01:21:43,590 Mah. 1469 01:21:43,590 --> 01:21:48,000 PROFESSOR: So the first tone is falling and shorter-- 1470 01:21:48,000 --> 01:21:53,520 falling, shorter, creaky, while this is kind of breathy. 1471 01:21:53,520 --> 01:21:56,580 It's not really a tone system, because there's actually 1472 01:21:56,580 --> 01:22:00,990 not just pitches involved, but also different stuff going on. 1473 01:22:00,990 --> 01:22:03,470 So it's kind of a crazy tone system Burmese has. 1474 01:22:03,470 --> 01:22:06,090 And it has two more tones in addition to this. 1475 01:22:06,090 --> 01:22:07,880 And it also has-- 1476 01:22:07,880 --> 01:22:10,930 some syllables have no tone, somehow. 1477 01:22:10,930 --> 01:22:13,810 But they are special syllables. 1478 01:22:13,810 --> 01:22:15,300 So it's pretty complicated. 1479 01:22:19,560 --> 01:22:22,440 So for example-- maybe I'll play you 1480 01:22:22,440 --> 01:22:25,440 one more thing since we have two minutes or zero minutes. 1481 01:22:53,870 --> 01:22:58,199 COMPUTER: Ma, ma, mah. 1482 01:22:58,199 --> 01:22:59,607 PROFESSOR: Those are different. 1483 01:23:02,740 --> 01:23:07,800 And I think this doesn't always go-- 1484 01:23:07,800 --> 01:23:09,245 COMPUTER: Ma, ma. 1485 01:23:09,245 --> 01:23:10,452 PROFESSOR: --as reliably. 1486 01:23:10,452 --> 01:23:11,660 And this one sometimes falls. 1487 01:23:11,660 --> 01:23:13,200 It's kind of complicated. 
1488 01:23:13,200 --> 01:23:15,534 So these are-- 1489 01:23:15,534 --> 01:23:17,850 AUDIENCE: What does ma mean? 1490 01:23:17,850 --> 01:23:20,370 PROFESSOR: I don't know. 1491 01:23:20,370 --> 01:23:23,250 I don't know. 1492 01:23:23,250 --> 01:23:24,690 Once upon a time I knew. 1493 01:23:24,690 --> 01:23:26,790 COMPUTER: Mah, mah. 1494 01:23:26,790 --> 01:23:30,720 PROFESSOR: So this is supposed to be a falling tone, 1495 01:23:30,720 --> 01:23:32,910 but it doesn't fall very far, as you see. 1496 01:23:32,910 --> 01:23:34,160 Maybe it isn't a falling tone. 1497 01:23:34,160 --> 01:23:35,240 I don't know. 1498 01:23:35,240 --> 01:23:38,230 So that was kind of cool. 1499 01:23:38,230 --> 01:23:39,790 That's basically it. 1500 01:23:39,790 --> 01:23:41,940 Class is over.