1 00:00:00,090 --> 00:00:02,490 The following content is provided under a Creative 2 00:00:02,490 --> 00:00:04,030 Commons license. 3 00:00:04,030 --> 00:00:06,330 Your support will help MIT OpenCourseWare 4 00:00:06,330 --> 00:00:10,720 continue to offer high quality educational resources for free. 5 00:00:10,720 --> 00:00:13,320 To make a donation or view additional materials 6 00:00:13,320 --> 00:00:15,780 from hundreds of MIT courses, visit 7 00:00:15,780 --> 00:00:17,670 MIT OpenCourseWare at ocw.mit.edu. 8 00:00:27,210 --> 00:00:28,680 PROFESSOR: Welcome. 9 00:00:28,680 --> 00:00:31,140 One quick announcement-- if you have not yet 10 00:00:31,140 --> 00:00:34,680 picked up your graded exams, you can do so 11 00:00:34,680 --> 00:00:38,338 by seeing the TAs after the hour. 12 00:00:38,338 --> 00:00:41,170 OK? 13 00:00:41,170 --> 00:00:43,650 So today I want to continue to think 14 00:00:43,650 --> 00:00:48,660 about what we started last week, thinking about Fourier series. 15 00:00:48,660 --> 00:00:54,780 The idea is to develop a theory that 16 00:00:54,780 --> 00:00:59,610 lets us look at signals on the basis of frequency content, 17 00:00:59,610 --> 00:01:02,820 much as we looked at frequency responses 18 00:01:02,820 --> 00:01:05,850 as a characterization of systems, 19 00:01:05,850 --> 00:01:08,610 according to the way they process frequencies. 20 00:01:08,610 --> 00:01:11,850 And we saw last time that there were a number of kinds 21 00:01:11,850 --> 00:01:15,360 of signals, for example, musical signals, 22 00:01:15,360 --> 00:01:17,520 where that kind of an approach-- 23 00:01:17,520 --> 00:01:20,940 thinking about the signal according to the frequencies 24 00:01:20,940 --> 00:01:21,870 that are in it-- 25 00:01:21,870 --> 00:01:25,290 makes a lot of sense and can lead to insight. 26 00:01:25,290 --> 00:01:27,600 We also developed some formalism. 
27 00:01:27,600 --> 00:01:32,310 We figured out how you can break a signal into components 28 00:01:32,310 --> 00:01:37,090 and then assemble the components to generate the signal. 29 00:01:37,090 --> 00:01:39,810 And what I want to mention at the beginning of the hour today 30 00:01:39,810 --> 00:01:42,870 is just how to think about this operation 31 00:01:42,870 --> 00:01:45,810 in a more familiar way. 32 00:01:45,810 --> 00:01:50,430 We do this kind of a thing, breaking something 33 00:01:50,430 --> 00:01:53,826 into components all the time. 34 00:01:53,826 --> 00:01:55,200 One of the more familiar examples 35 00:01:55,200 --> 00:01:58,380 might be thinking about 3-space, right? 36 00:01:58,380 --> 00:02:02,220 The Cartesian analysis of 3-space is based on the idea 37 00:02:02,220 --> 00:02:05,070 that you can think about a vector location in 3-space 38 00:02:05,070 --> 00:02:06,090 as having components. 39 00:02:06,090 --> 00:02:07,965 There's a component in the x direction, the y 40 00:02:07,965 --> 00:02:09,180 direction, the z direction. 41 00:02:09,180 --> 00:02:10,919 That's completely analogous to the way 42 00:02:10,919 --> 00:02:15,330 we're thinking about Fourier representations for signals. 43 00:02:15,330 --> 00:02:20,070 So just like we would think about synthesizing 44 00:02:20,070 --> 00:02:24,570 the location of a point by adding together three pieces, 45 00:02:24,570 --> 00:02:28,290 and we would think about analyzing a point to figure out 46 00:02:28,290 --> 00:02:30,930 how big the components are in each of those directions, 47 00:02:30,930 --> 00:02:34,770 it's exactly the same when we think about Fourier series. 48 00:02:34,770 --> 00:02:38,860 We think about representing a signal as a sum of things. 49 00:02:38,860 --> 00:02:41,970 So the sum is precisely the same. 50 00:02:41,970 --> 00:02:44,750 This one happens to have an infinite number of terms, 51 00:02:44,750 --> 00:02:49,680 [? eh. ?] The top one has three terms, [? eh. ?] 
52 00:02:49,680 --> 00:02:52,230 The principles are very similar. 53 00:02:52,230 --> 00:02:55,830 So we think about representing a signal as a sum of components. 54 00:02:55,830 --> 00:02:58,380 We think about representing a point in 3-space 55 00:02:58,380 --> 00:03:00,180 as a sum of components, and we think 56 00:03:00,180 --> 00:03:06,470 about analyzing the signal or the vector in 3-space, 57 00:03:06,470 --> 00:03:11,040 so that we figure out what each of those components are. 58 00:03:11,040 --> 00:03:13,980 And we do it in an operation where it's actually 59 00:03:13,980 --> 00:03:18,240 very convenient to think about the decomposition 60 00:03:18,240 --> 00:03:21,270 of the Fourier components using precisely the same language 61 00:03:21,270 --> 00:03:24,240 that we would use for thinking about vector spaces. 62 00:03:24,240 --> 00:03:26,701 So we would think about-- 63 00:03:26,701 --> 00:03:28,200 in the case of the Fourier, we think 64 00:03:28,200 --> 00:03:32,112 about integrating over the period sifts out a component. 65 00:03:32,112 --> 00:03:33,570 The analogous operation for 3-space 66 00:03:33,570 --> 00:03:35,700 is to think about a dot product. 67 00:03:35,700 --> 00:03:37,606 The way you take a vector and figure out 68 00:03:37,606 --> 00:03:39,480 the component in the x direction is to dot it 69 00:03:39,480 --> 00:03:42,720 with it, the dot product. 70 00:03:42,720 --> 00:03:44,670 In the Fourier case, we think about it 71 00:03:44,670 --> 00:03:45,810 as being an inner product. 72 00:03:48,550 --> 00:03:50,880 The idea is completely analogous. 73 00:03:50,880 --> 00:03:54,720 So we think about having the inner product of two things-- 74 00:03:54,720 --> 00:03:59,880 the reference direction and the vector. 
75 00:03:59,880 --> 00:04:02,460 So reference direction and vector-- 76 00:04:02,460 --> 00:04:04,830 we think about it exactly the same way, except now it's 77 00:04:04,830 --> 00:04:07,316 an inner product, which means that after we've multiplied, 78 00:04:07,316 --> 00:04:08,190 we have to integrate. 79 00:04:08,190 --> 00:04:11,070 That's the only difference with the inner product. 80 00:04:11,070 --> 00:04:13,830 Inner product implies sum or integrate after you've 81 00:04:13,830 --> 00:04:17,029 done the multiplication. 82 00:04:17,029 --> 00:04:19,880 So we do exactly the same thing, except that now we 83 00:04:19,880 --> 00:04:24,730 think about the inner product of a and b. 84 00:04:24,730 --> 00:04:29,530 That's just the integral, where we take the complex conjugate 85 00:04:29,530 --> 00:04:34,180 of one of the signals only because by defining it 86 00:04:34,180 --> 00:04:38,480 with a complex conjugate there, we set up the inner product 87 00:04:38,480 --> 00:04:41,320 so that the answer is zero unless we 88 00:04:41,320 --> 00:04:45,930 take the inner product of two things in the same direction. 89 00:04:45,930 --> 00:04:47,360 OK? 90 00:04:47,360 --> 00:04:50,300 By putting the minus sign there, if the two reference 91 00:04:50,300 --> 00:04:53,090 directions, that is to say the one 92 00:04:53,090 --> 00:04:59,180 characterized by k and m, the ones characterized by k and m, 93 00:04:59,180 --> 00:05:00,800 the inner product will be zero if we 94 00:05:00,800 --> 00:05:05,180 take the complex conjugate as long as k is not equal to m. 95 00:05:05,180 --> 00:05:08,050 k equals m is the only nonzero component. 96 00:05:08,050 --> 00:05:11,270 OK, is that all clear? 97 00:05:11,270 --> 00:05:14,420 So to make sure that it's clear, here's a question. 98 00:05:17,507 --> 00:05:19,340 How many of the following pairs of functions 99 00:05:19,340 --> 00:05:23,015 are orthogonal in T equals 3? 
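Written out, the inner product described here is the one the lecturer spells out a few minutes later as "1 over T, the integral over T, a star of t, b of t, dt". The harmonic reference functions below, with indices k and m, are an assumption about the notation on the slide, which the transcript does not show.

```latex
\langle a, b \rangle = \frac{1}{T}\int_T a^*(t)\, b(t)\, dt,
\qquad
\left\langle e^{j 2\pi k t/T},\, e^{j 2\pi m t/T} \right\rangle
= \frac{1}{T}\int_T e^{j 2\pi (m-k) t/T}\, dt
= \begin{cases} 1, & k = m \\ 0, & k \neq m \end{cases}
```

The conjugate is what puts the minus sign in the exponent, so the exponents cancel only when k equals m.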
100 00:05:25,412 --> 00:05:26,870 Part of the goal of the exercise is 101 00:05:26,870 --> 00:05:30,020 to figure out what the little caveat in T equals three means. 102 00:05:30,020 --> 00:05:32,810 So look at your neighbor, say hello, figure out 103 00:05:32,810 --> 00:05:35,636 a number between 0 and 4. 104 00:05:35,636 --> 00:05:38,624 [SIDE CONVERSATIONS] 105 00:07:48,340 --> 00:07:50,780 OK, so how many signals, how many of the pairs 106 00:07:50,780 --> 00:07:54,470 are orthogonal to each other? 107 00:07:54,470 --> 00:07:56,960 Raise your hand with some number between 0 and 4, 108 00:07:56,960 --> 00:07:59,990 unless you're completely bizarre and raise five, I mean. 109 00:07:59,990 --> 00:08:01,200 OK, come on, come on. 110 00:08:01,200 --> 00:08:02,760 Higher, so I can see them. 111 00:08:02,760 --> 00:08:04,080 Remember, if you're wrong, 112 00:08:04,080 --> 00:08:08,710 it's your partner's fault, it's not your fault. 113 00:08:08,710 --> 00:08:11,530 OK, not quite. 114 00:08:11,530 --> 00:08:15,230 A lot of bad partners, no, no. 115 00:08:15,230 --> 00:08:16,460 Let's do the first one. 116 00:08:16,460 --> 00:08:21,290 Is the cos of 2 pi t orthogonal to the sine 117 00:08:21,290 --> 00:08:25,910 of 2 pi t over the interval capital T equals 3? 118 00:08:29,100 --> 00:08:31,770 Yes? 119 00:08:31,770 --> 00:08:34,400 No. 120 00:08:34,400 --> 00:08:37,049 I haven't a clue. 121 00:08:37,049 --> 00:08:38,695 I don't care. 122 00:08:38,695 --> 00:08:39,320 No, no, no, no. 123 00:08:39,320 --> 00:08:42,039 You all care, no. 124 00:08:42,039 --> 00:08:42,919 Are they orthogonal? 125 00:08:42,919 --> 00:08:43,460 So what do you-- 126 00:08:43,460 --> 00:08:45,043 how do I formally ask the question are 127 00:08:45,043 --> 00:08:46,015 they orthogonal? 128 00:08:49,580 --> 00:08:52,820 OK, so it's either the last slide or the next slide. 129 00:08:52,820 --> 00:08:54,059 So go back to the last slide. 
130 00:08:54,059 --> 00:08:55,600 What's it mean if they're orthogonal? 131 00:08:59,560 --> 00:09:00,191 Yeah? 132 00:09:00,191 --> 00:09:05,010 AUDIENCE: [INAUDIBLE] 133 00:09:05,010 --> 00:09:07,110 PROFESSOR: So how do I take the dot product? 134 00:09:07,110 --> 00:09:08,490 What do I do? 135 00:09:08,490 --> 00:09:11,152 AUDIENCE: [INAUDIBLE] conjugate. 136 00:09:11,152 --> 00:09:12,110 PROFESSOR: Conjugate one. 137 00:09:12,110 --> 00:09:12,693 So I want to-- 138 00:09:12,693 --> 00:09:18,300 I'm thinking about 1 over T, the integral over T, a star of t, 139 00:09:18,300 --> 00:09:20,260 b of t, dt. 140 00:09:20,260 --> 00:09:21,990 Right? 141 00:09:21,990 --> 00:09:24,480 So the capital T comes in here, right, I'm 142 00:09:24,480 --> 00:09:26,640 integrating over a period capital T. 143 00:09:26,640 --> 00:09:31,180 So I take the two functions and I multiply them together. 144 00:09:31,180 --> 00:09:35,370 So I have this function, and I have that function. 145 00:09:35,370 --> 00:09:38,010 I multiply them together. 146 00:09:38,010 --> 00:09:41,700 If you multiply two sinusoids of the same frequency 147 00:09:41,700 --> 00:09:45,470 but different phase, what do you get? 148 00:09:45,470 --> 00:09:47,206 Another sinusoid, right? 149 00:09:47,206 --> 00:09:49,580 So you all know all these complicated trig relationships, 150 00:09:49,580 --> 00:09:50,079 right? 151 00:09:50,079 --> 00:09:50,920 Here's one of them. 152 00:09:50,920 --> 00:09:53,230 If you multiply cos of 2 pi t times the sine of 2 pi t 153 00:09:53,230 --> 00:09:57,940 you get half the sine of double the frequency. 154 00:09:57,940 --> 00:09:58,480 OK? 155 00:09:58,480 --> 00:09:59,813 You don't need to memorize that. 156 00:09:59,813 --> 00:10:03,130 You just look at this picture, you look at that picture. 157 00:10:03,130 --> 00:10:07,490 This one over the interval 3 has 3 periods. 158 00:10:07,490 --> 00:10:08,620 Right? 
159 00:10:08,620 --> 00:10:12,640 There are 3 periods of that waveform 160 00:10:12,640 --> 00:10:15,340 over the period of capital T. There's 161 00:10:15,340 --> 00:10:18,920 an integer number 3, same here. 162 00:10:18,920 --> 00:10:21,190 Here-- how many periods? 163 00:10:21,190 --> 00:10:22,630 Twice that. 164 00:10:22,630 --> 00:10:24,960 But it's exactly six. 165 00:10:24,960 --> 00:10:28,840 So you get a pure sinusoid. 166 00:10:28,840 --> 00:10:30,490 You get an integer number of periods. 167 00:10:30,490 --> 00:10:33,670 You integrate over an integer number of periods, you get 0. 168 00:10:33,670 --> 00:10:35,440 They're orthogonal. 169 00:10:35,440 --> 00:10:37,414 Had I chosen the period differently, 170 00:10:37,414 --> 00:10:38,830 they may not have been orthogonal. 171 00:10:38,830 --> 00:10:40,611 It depends on the period. 172 00:10:40,611 --> 00:10:41,110 OK? 173 00:10:41,110 --> 00:10:42,790 So the inner product depends on the period, 174 00:10:42,790 --> 00:10:44,415 because the inner product has something 175 00:10:44,415 --> 00:10:45,520 to do with integrate or sum. 176 00:10:45,520 --> 00:10:50,550 And so the range over which you sum or integrate matters. 177 00:10:50,550 --> 00:10:54,570 How about cos 2 pi T cos 4 pi T? 178 00:10:54,570 --> 00:10:55,320 Orthogonal? 179 00:10:58,680 --> 00:10:59,640 Yeah? 180 00:10:59,640 --> 00:11:01,080 AUDIENCE: Yes. 181 00:11:01,080 --> 00:11:03,558 PROFESSOR: And the reason is? 182 00:11:03,558 --> 00:11:07,446 AUDIENCE: So think if you wrapped [INAUDIBLE] together, 183 00:11:07,446 --> 00:11:10,119 then there's a lot of symmetry that 184 00:11:10,119 --> 00:11:14,260 goes on [INAUDIBLE] is going to be 0. 185 00:11:14,260 --> 00:11:18,260 PROFESSOR: So now we've got two different frequencies. 186 00:11:18,260 --> 00:11:20,651 But we still get these funny cosine relationships 187 00:11:20,651 --> 00:11:22,400 that have to do with sums and differences. 
188 00:11:22,400 --> 00:11:23,810 And the sums and differences both 189 00:11:23,810 --> 00:11:27,840 happen to be periodic over 3, over the interval capital 190 00:11:27,840 --> 00:11:29,690 T equals 3, right? 191 00:11:29,690 --> 00:11:33,390 So we still get the property that the average value here, 192 00:11:33,390 --> 00:11:36,470 which is what the integral was pulling out, the average is 0. 193 00:11:36,470 --> 00:11:38,500 So they're also orthogonal. 194 00:11:38,500 --> 00:11:41,390 How about cos 2 pi T sine pi T? 195 00:11:43,980 --> 00:11:46,735 OK, I've asked two questions, and they were both yes. 196 00:11:46,735 --> 00:11:48,340 So I'm getting bored at this point, 197 00:11:48,340 --> 00:11:52,336 so by the theory of questions in lecture, the answer is? 198 00:11:52,336 --> 00:11:54,779 [LAUGHTER] 199 00:11:54,779 --> 00:11:55,570 Now, wait a minute. 200 00:11:55,570 --> 00:11:57,100 I'm not that boring. 201 00:11:57,100 --> 00:11:59,730 Well, maybe. 202 00:11:59,730 --> 00:12:04,200 So is this periodic over a capital T equals 3? 203 00:12:07,610 --> 00:12:08,676 Ah, excuse me. 204 00:12:08,676 --> 00:12:10,300 I didn't say the right question, sorry. 205 00:12:10,300 --> 00:12:16,540 Does this function have an integer number of periods in the time 206 00:12:16,540 --> 00:12:19,319 interval capital T equals 3? 207 00:12:19,319 --> 00:12:21,360 What's the period-- what's the fundamental period 208 00:12:21,360 --> 00:12:22,800 of this waveform? 209 00:12:22,800 --> 00:12:23,580 AUDIENCE: 1. 210 00:12:23,580 --> 00:12:24,710 PROFESSOR: 1. 211 00:12:24,710 --> 00:12:29,037 So it has 3 periods over the interval cap T equals 3. 212 00:12:29,037 --> 00:12:29,870 What about this one? 213 00:12:33,502 --> 00:12:34,410 AUDIENCE: 2. 214 00:12:34,410 --> 00:12:35,610 PROFESSOR: Period is 2. 215 00:12:35,610 --> 00:12:39,290 How many periods are there in the time interval capital T 216 00:12:39,290 --> 00:12:40,297 equals 3? 
217 00:12:40,297 --> 00:12:43,159 AUDIENCE: [INAUDIBLE] 218 00:12:43,159 --> 00:12:44,430 PROFESSOR: A period is 2. 219 00:12:44,430 --> 00:12:46,290 How many periods are there-- 220 00:12:46,290 --> 00:12:48,290 1 and 1/2, not an integer. 221 00:12:48,290 --> 00:12:49,760 Bad news, right? 222 00:12:49,760 --> 00:12:54,080 So integer number three, not integer number. 223 00:12:54,080 --> 00:12:57,170 If you were to integrate this over the period capital T 224 00:12:57,170 --> 00:13:00,440 equals 3, if I didn't multiply them 225 00:13:00,440 --> 00:13:03,170 if I just did that, if I just thought about that integral, 226 00:13:03,170 --> 00:13:05,600 I wouldn't get 0, right? 227 00:13:05,600 --> 00:13:08,600 There's more positives than there are negatives. 228 00:13:08,600 --> 00:13:11,720 And when I multiply them, the same sort of thing happens. 229 00:13:11,720 --> 00:13:14,340 I get two big peaks down and only one big peak up. 230 00:13:14,340 --> 00:13:18,980 It's because the resulting waveform no longer 231 00:13:18,980 --> 00:13:22,570 has an integer number of periods in the interval capital T 232 00:13:22,570 --> 00:13:23,270 equals 3. 233 00:13:23,270 --> 00:13:23,770 OK? 234 00:13:27,510 --> 00:13:30,900 Last one-- cos 2 pi T e to the-- 235 00:13:30,900 --> 00:13:33,735 whoops. 236 00:13:33,735 --> 00:13:37,410 Is that what I actually said? 237 00:13:37,410 --> 00:13:39,600 Good, I forgot the j. 238 00:13:39,600 --> 00:13:42,240 Because without the j, they would obviously not 239 00:13:42,240 --> 00:13:43,290 be orthogonal. 240 00:13:43,290 --> 00:13:44,550 Obviously, right? 241 00:13:44,550 --> 00:13:45,450 OK. 242 00:13:45,450 --> 00:13:48,400 I didn't mean to ask something quite that obvious. 243 00:13:48,400 --> 00:13:54,720 So what about cos 2 pi T and e to the j 2 pi T? 244 00:13:57,350 --> 00:14:00,870 Orthogonal? 245 00:14:00,870 --> 00:14:09,731 Not? I'm as clueless as I was on part A? No, no, no, no, 246 00:14:09,731 --> 00:14:10,230 you're not. 
247 00:14:10,230 --> 00:14:10,855 No, you're not. 248 00:14:10,855 --> 00:14:12,230 No, you're not. 249 00:14:12,230 --> 00:14:15,790 So how do you think about that? 250 00:14:15,790 --> 00:14:17,370 You can use Euler's expression. 251 00:14:17,370 --> 00:14:18,990 And if there had been a j there, this 252 00:14:18,990 --> 00:14:21,350 would have been a correct expression. 253 00:14:21,350 --> 00:14:22,050 OK? 254 00:14:22,050 --> 00:14:23,580 It's not quite a correct expression 255 00:14:23,580 --> 00:14:25,477 because I forgot to put the j there. 256 00:14:25,477 --> 00:14:27,060 But had there been a j there, it would 257 00:14:27,060 --> 00:14:30,600 have been cos 2 pi T plus j sine 2 pi T. 258 00:14:30,600 --> 00:14:33,330 And the awkward thing is that the cos and the cos 259 00:14:33,330 --> 00:14:35,820 are obviously not orthogonal with each other. 260 00:14:35,820 --> 00:14:40,650 A signal is not orthogonal with itself, OK? 261 00:14:40,650 --> 00:14:45,150 So because part of this signal is that signal, 262 00:14:45,150 --> 00:14:49,650 those two signals are not orthogonal. 263 00:14:49,650 --> 00:14:50,580 OK? 264 00:14:50,580 --> 00:14:52,940 Yes? 265 00:14:52,940 --> 00:14:54,620 OK, so that's kind of-- so that's 266 00:14:54,620 --> 00:14:56,180 the idea of orthogonality. 267 00:14:56,180 --> 00:14:59,810 It's a very good way to think about decompositions. 268 00:15:02,182 --> 00:15:04,640 And even though we only spent about half an hour last time, 269 00:15:04,640 --> 00:15:06,380 and only about 15 minutes this time, 270 00:15:06,380 --> 00:15:09,660 that is the whole theory of Fourier series. 271 00:15:09,660 --> 00:15:11,736 That doesn't mean we can't ask hard questions. 272 00:15:11,736 --> 00:15:13,110 There were a couple of questions. 273 00:15:13,110 --> 00:15:14,200 Yes, you were first. 274 00:15:14,200 --> 00:15:16,540 AUDIENCE: Is there a way to think about orthogonality 275 00:15:16,540 --> 00:15:20,260 using the Fourier [INAUDIBLE]. 
276 00:15:20,260 --> 00:15:22,610 PROFESSOR: Well, the Fourier coefficients 277 00:15:22,610 --> 00:15:25,947 are the result of orthogonality. 278 00:15:25,947 --> 00:15:27,530 I don't think you can tell-- if I just 279 00:15:27,530 --> 00:15:29,060 told you a bunch of Fourier coefficients, 280 00:15:29,060 --> 00:15:30,920 I don't know if you can tell me something 281 00:15:30,920 --> 00:15:34,070 about the orthogonality of the underlying signals or not. 282 00:15:34,070 --> 00:15:36,975 AUDIENCE: What if [INAUDIBLE]. 283 00:15:36,975 --> 00:15:37,850 PROFESSOR: Excuse me? 284 00:15:37,850 --> 00:15:39,433 AUDIENCE: [INAUDIBLE] [? the period ?] 285 00:15:39,433 --> 00:15:41,570 and the Fourier [INAUDIBLE]. 286 00:15:44,370 --> 00:15:46,529 PROFESSOR: Let's see, so I'm not completely 287 00:15:46,529 --> 00:15:47,820 sure I know what you're asking. 288 00:15:47,820 --> 00:15:51,410 Certainly if you tell me that the Fourier coefficients are 289 00:15:51,410 --> 00:15:55,030 blah, blah, blah, 3, 2, 7, and 16. 290 00:15:55,030 --> 00:15:58,050 And if you tell me that you're working with a simple Fourier 291 00:15:58,050 --> 00:16:03,594 series periodic in 3, then you've told me everything. 292 00:16:03,594 --> 00:16:05,260 And so there's a way for me to backtrack 293 00:16:05,260 --> 00:16:06,760 that it was orthogonal. 294 00:16:06,760 --> 00:16:09,610 I am not sure if I'm connecting with you, so if I'm not, 295 00:16:09,610 --> 00:16:11,320 ask me after lecture to make sure that-- 296 00:16:11,320 --> 00:16:12,234 AUDIENCE: [INAUDIBLE] 297 00:16:12,234 --> 00:16:12,900 PROFESSOR: Sure. 298 00:16:12,900 --> 00:16:14,399 AUDIENCE: I think he's saying if you 299 00:16:14,399 --> 00:16:17,337 have two signals [INAUDIBLE] coefficients [INAUDIBLE] two 300 00:16:17,337 --> 00:16:21,281 signals, can I tell if those two signals are orthogonal 301 00:16:21,281 --> 00:16:27,220 [INAUDIBLE] the coefficients are orthogonal [INAUDIBLE]. 
302 00:16:27,220 --> 00:16:31,150 PROFESSOR: If they have components in common, 303 00:16:31,150 --> 00:16:34,060 they couldn't possibly be orthogonal. 304 00:16:34,060 --> 00:16:38,750 So I would answer yes to that question. 305 00:16:38,750 --> 00:16:41,560 So if that's what you were-- so I think that's probably right. 306 00:16:41,560 --> 00:16:43,140 Does that sound right? 307 00:16:43,140 --> 00:16:44,430 Yeah, OK. 308 00:16:44,430 --> 00:16:45,630 Yes? 309 00:16:45,630 --> 00:16:48,130 AUDIENCE: I'm awfully confused by the complex conjugate, 310 00:16:48,130 --> 00:16:49,130 the [INAUDIBLE]. 311 00:16:49,130 --> 00:16:50,762 PROFESSOR: Yes, yes, yes. 312 00:16:50,762 --> 00:16:52,928 AUDIENCE: So does that mean we're taking the complex 313 00:16:52,928 --> 00:16:55,835 conjugate [? of a ?] and we're [? applying it ?] to b? 314 00:16:55,835 --> 00:16:57,710 PROFESSOR: We're taking the complex conjugate 315 00:16:57,710 --> 00:16:59,570 of the entire function. 316 00:16:59,570 --> 00:17:03,200 At every point in time, we take the complex conjugate of it. 317 00:17:03,200 --> 00:17:06,230 And it's especially useful to think about 318 00:17:06,230 --> 00:17:08,099 if you're doing something of the form-- 319 00:17:08,099 --> 00:17:17,480 if a of t were e to the j 2 pi mt and if b of t 320 00:17:17,480 --> 00:17:22,940 were e to the j 2 pi lt. 321 00:17:22,940 --> 00:17:25,190 The only thing we're trying to do-- 322 00:17:25,190 --> 00:17:27,030 but this comes up quite frequently-- 323 00:17:27,030 --> 00:17:28,490 the only thing we're trying to do 324 00:17:28,490 --> 00:17:32,420 is when you conjugate one of these, 325 00:17:32,420 --> 00:17:35,850 you rig it so that when you add the exponents, 326 00:17:35,850 --> 00:17:39,920 the result goes to 0 by putting the minus up there. 327 00:17:39,920 --> 00:17:40,653 That's all. 
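The four pairs from the question can be checked numerically. This is a minimal sketch, assuming the inner product is (1/T) times the integral over T of a star of t times b of t, with capital T equals 3 as in the lecture. The sample count `N` and the helper name `inner` are illustrative choices, not anything from the lecture.

```python
import numpy as np

T = 3.0                      # interval from the lecture question
N = 30000                    # number of samples (arbitrary choice)
t = np.arange(N) * (T / N)   # uniform samples covering [0, T)

def inner(a, b):
    """Approximate (1/T) * integral over T of conj(a(t)) * b(t) dt."""
    return np.mean(np.conj(a) * b)

# The four pairs discussed in the lecture.
ip_a = inner(np.cos(2 * np.pi * t), np.sin(2 * np.pi * t))
ip_b = inner(np.cos(2 * np.pi * t), np.cos(4 * np.pi * t))
ip_c = inner(np.cos(2 * np.pi * t), np.sin(np.pi * t))
ip_d = inner(np.cos(2 * np.pi * t), np.exp(2j * np.pi * t))

for name, value in [("a", ip_a), ("b", ip_b), ("c", ip_c), ("d", ip_d)]:
    print(name, np.round(value, 4))
```

The first two inner products come out at zero to within rounding, so those pairs are orthogonal. The third is about -0.0707, which is -2/(9 pi), and the fourth is 0.5, so two of the four pairs are orthogonal over this interval. Because the first three pairs are real, the conjugate only has an effect when the first argument is complex.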
328 00:17:40,653 --> 00:17:42,028 AUDIENCE: It doesn't seem like we 329 00:17:42,028 --> 00:17:44,944 had to do any of that for the example we just worked on. 330 00:17:44,944 --> 00:17:47,860 It seems like there were just like [? signals ?] [INAUDIBLE].. 331 00:17:47,860 --> 00:17:49,290 PROFESSOR: Oh, interesting. 332 00:17:52,299 --> 00:17:53,340 That's a very good point. 333 00:17:53,340 --> 00:17:54,590 That's interesting. 334 00:17:54,590 --> 00:17:57,530 So I didn't intend to throw you a ringer. 335 00:17:57,530 --> 00:18:00,590 These were signals, all of these except that one, 336 00:18:00,590 --> 00:18:04,160 are real functions of time. 337 00:18:04,160 --> 00:18:06,980 That's why the complex conjugate didn't come up. 338 00:18:06,980 --> 00:18:07,790 So I apologize. 339 00:18:07,790 --> 00:18:10,970 I wasn't trying to make it seem tricky. 340 00:18:10,970 --> 00:18:14,480 OK, so it's because this function of time 341 00:18:14,480 --> 00:18:19,280 is everywhere real that we didn't need to rehearse this. 342 00:18:19,280 --> 00:18:20,920 We did have to do it in that one. 343 00:18:23,300 --> 00:18:23,800 OK? 344 00:18:26,800 --> 00:18:28,940 OK, so the point is that we've already covered, 345 00:18:28,940 --> 00:18:31,450 even though we've only done a little bit of work in lecture, 346 00:18:31,450 --> 00:18:34,150 we've already covered all of the theory. 347 00:18:34,150 --> 00:18:36,310 What remains though is to do some practice. 348 00:18:36,310 --> 00:18:39,760 And also what remains is to understand how this is useful. 349 00:18:39,760 --> 00:18:43,180 So it's not just music. 350 00:18:43,180 --> 00:18:47,800 The example that I want to talk about today is speech. 351 00:18:47,800 --> 00:18:50,350 The same sort of thing that we could do with music last time, 352 00:18:50,350 --> 00:18:51,610 we can do with speech. 353 00:18:51,610 --> 00:18:53,192 And here are some utterances. 
354 00:18:53,192 --> 00:18:53,858 [AUDIO PLAYBACK] 355 00:18:53,858 --> 00:19:00,797 - Bat, bait, bet, beet, bit, bite, bought, boat, but, boot. 356 00:19:00,797 --> 00:19:01,380 [END PLAYBACK] 357 00:19:01,380 --> 00:19:02,755 PROFESSOR: All right, it was just 358 00:19:02,755 --> 00:19:05,650 intended to be a bunch of sounds that we can analyze 359 00:19:05,650 --> 00:19:08,890 with Fourier analysis to get some insight into how 360 00:19:08,890 --> 00:19:14,050 to think about, in particular, speech recognition and speech 361 00:19:14,050 --> 00:19:15,970 synthesis. 362 00:19:15,970 --> 00:19:18,580 So we can take those utterances, and all 363 00:19:18,580 --> 00:19:21,340 I did was write a little Python program to do the decomposition 364 00:19:21,340 --> 00:19:24,490 that I showed on the previous slides, 365 00:19:24,490 --> 00:19:27,850 so that I could break these time waveforms. 366 00:19:27,850 --> 00:19:31,670 Here I'm illustrating one, two, three, four, five, six periods. 367 00:19:31,670 --> 00:19:37,090 So I took one period of that sound 368 00:19:37,090 --> 00:19:41,680 and ran it through that kind of an integral 369 00:19:41,680 --> 00:19:43,910 to break it into Fourier components, which 370 00:19:43,910 --> 00:19:45,400 are shown here. 371 00:19:45,400 --> 00:19:47,500 And what I want you to see is just 372 00:19:47,500 --> 00:19:50,770 like you could have recognized a pattern here, 373 00:19:50,770 --> 00:19:55,060 and you might try to recognize which vowel is which 374 00:19:55,060 --> 00:19:58,780 by the signature in time. 375 00:19:58,780 --> 00:20:01,570 An alternative, and far more useful way 376 00:20:01,570 --> 00:20:04,150 of thinking about it, is to try to recognize 377 00:20:04,150 --> 00:20:07,480 the pattern in frequency. 378 00:20:07,480 --> 00:20:10,030 So there are characteristic differences in the sounds, 379 00:20:10,030 --> 00:20:12,701 and we'll look at the basis for why there are. 
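The lecturer's Python program is not shown in the transcript. The following is only a sketch of the kind of decomposition described, assuming the input is a set of uniformly spaced samples covering exactly one period. The function name `fourier_coefficients` is hypothetical, and a synthetic two-harmonic waveform stands in for one period of a recorded vowel.

```python
import numpy as np

def fourier_coefficients(x, kmax):
    """Approximate a_k = (1/T) * integral over T of x(t) e^{-j 2 pi k t / T} dt
    for k = 0..kmax, where x holds uniform samples of exactly one period."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.mean(x * np.exp(-2j * np.pi * k * n / N))
                     for k in range(kmax + 1)])

# Synthetic stand-in for one period of a vowel (not real speech data):
# a fundamental plus a weaker, phase-shifted third harmonic.
N = 400
n = np.arange(N)
x = np.cos(2 * np.pi * 1 * n / N) + 0.5 * np.cos(2 * np.pi * 3 * n / N + 0.3)

a = fourier_coefficients(x, 5)
magnitudes = np.abs(a)   # the "signature in frequency" for this waveform
print(np.round(magnitudes, 4))
```

For this test waveform the magnitudes are 0.5 at k = 1 and 0.25 at k = 3, with the other harmonics at zero. Applied to real vowels, it is this pattern of magnitudes across k that would differ from one vowel to another.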
380 00:20:12,701 --> 00:20:14,200 There are characteristic differences 381 00:20:14,200 --> 00:20:18,520 in the sound that can help us to identify automatically, 382 00:20:18,520 --> 00:20:22,610 by a machine, what was being said. 383 00:20:22,610 --> 00:20:24,730 And so what we want to do is learn 384 00:20:24,730 --> 00:20:29,410 to think about a pattern that characterizes ah, 385 00:20:29,410 --> 00:20:34,151 ee, oo in the frequency domain, as opposed to the time domain. 386 00:20:34,151 --> 00:20:34,817 [AUDIO PLAYBACK] 387 00:20:34,817 --> 00:20:36,397 - Bat beet, boot. 388 00:20:36,397 --> 00:20:36,980 [END PLAYBACK] 389 00:20:36,980 --> 00:20:39,480 PROFESSOR: So there's something different about those sounds 390 00:20:39,480 --> 00:20:44,870 that manifests a difference in this Fourier signature. 391 00:20:44,870 --> 00:20:48,350 So that's one of the useful applications of this. 392 00:20:48,350 --> 00:20:50,060 And we'd like to understand that better. 393 00:20:50,060 --> 00:20:54,440 There's a really good physical reason why that happens. 394 00:20:54,440 --> 00:20:57,860 And it has to do with the way we produce speech. 395 00:20:57,860 --> 00:21:03,080 So you can think about speech as being generated by some source. 396 00:21:03,080 --> 00:21:05,180 Ultimately, the source of my speech 397 00:21:05,180 --> 00:21:10,070 is somewhere down here, which always amuses me 398 00:21:10,070 --> 00:21:14,270 when I see the cut off heads talking, like at Halloween 399 00:21:14,270 --> 00:21:16,310 and like on some cartoon shows. 400 00:21:16,310 --> 00:21:17,980 Because you can't do that, right? 401 00:21:17,980 --> 00:21:20,510 Because the source has to do with down here someplace, 402 00:21:20,510 --> 00:21:21,080 right? 403 00:21:21,080 --> 00:21:23,060 My lungs push in. 404 00:21:23,060 --> 00:21:24,800 That pushes air through something 405 00:21:24,800 --> 00:21:27,860 and starts making noise somehow. 
406 00:21:27,860 --> 00:21:33,230 I'm going to focus today on things that we call voiced-- 407 00:21:33,230 --> 00:21:35,560 in a voiced sound. 408 00:21:35,560 --> 00:21:39,800 Ah, in a voiced sound, it's caused by vibrations 409 00:21:39,800 --> 00:21:40,715 of the vocal cords. 410 00:21:40,715 --> 00:21:44,910 So if you were to stick a camera down someone's throat, 411 00:21:44,910 --> 00:21:47,060 this is the sort of thing that you would see. 412 00:21:47,060 --> 00:21:52,100 It's an enormously complex structure 413 00:21:52,100 --> 00:21:57,020 whose mechanics are extremely difficult to understand. 414 00:21:57,020 --> 00:22:03,050 Because what happens is when you want to make a high sound, 415 00:22:03,050 --> 00:22:05,210 you tense the structure. 416 00:22:05,210 --> 00:22:07,970 You pull on some muscles that pull the cords. 417 00:22:07,970 --> 00:22:13,170 The cords are normally rattling pretty fast. 418 00:22:13,170 --> 00:22:16,170 And what you do is, you pull on a muscle 419 00:22:16,170 --> 00:22:20,507 that tenses them to make it go higher. 420 00:22:20,507 --> 00:22:22,590 But your intuition should say, now wait a minute-- 421 00:22:22,590 --> 00:22:25,290 you're making it longer to make it higher? 422 00:22:25,290 --> 00:22:28,140 And your intuition would be right. 423 00:22:28,140 --> 00:22:33,670 Normally, long organ pipes are higher or lower in frequency? 424 00:22:33,670 --> 00:22:34,640 AUDIENCE: Lower. 425 00:22:34,640 --> 00:22:37,300 PROFESSOR: Lower. 426 00:22:37,300 --> 00:22:40,720 So you have to do a lot of mental calculations 427 00:22:40,720 --> 00:22:44,090 in order to move these muscles correctly. 428 00:22:44,090 --> 00:22:46,570 So that the resulting frequency of the vibration 429 00:22:46,570 --> 00:22:47,590 comes out right. 430 00:22:47,590 --> 00:22:49,780 It's not obvious, because two things 431 00:22:49,780 --> 00:22:51,510 happen as you tense the muscle. 
432 00:22:51,510 --> 00:22:55,750 The folds-- the vocal cords get longer which you would think 433 00:22:55,750 --> 00:22:58,930 would make the frequency lower, but they 434 00:22:58,930 --> 00:23:03,480 get tighter, which, of course, goes the other direction, 435 00:23:03,480 --> 00:23:03,980 right? 436 00:23:03,980 --> 00:23:05,310 So it's a very complicated thing. 437 00:23:05,310 --> 00:23:06,851 And in fact, it's something that goes 438 00:23:06,851 --> 00:23:08,270 bad with professional speakers. 439 00:23:08,270 --> 00:23:11,720 But even more, professional singers 440 00:23:11,720 --> 00:23:15,230 often have a lot of trouble with the enormous stress 441 00:23:15,230 --> 00:23:19,040 that happens on this structure with repeated use and repeated 442 00:23:19,040 --> 00:23:20,860 overuse. 443 00:23:20,860 --> 00:23:23,870 Anyway, this takes a real beating. 444 00:23:23,870 --> 00:23:27,764 But that's ultimately the source of speech. 445 00:23:27,764 --> 00:23:29,930 But if that were all you had, it wouldn't sound much 446 00:23:29,930 --> 00:23:31,790 like speech. 447 00:23:31,790 --> 00:23:35,960 A lot of the interesting stuff comes from these cavities 448 00:23:35,960 --> 00:23:37,760 that you intentionally manipulate 449 00:23:37,760 --> 00:23:41,000 as you're speaking to make the different characteristic 450 00:23:41,000 --> 00:23:43,880 sounds. 451 00:23:43,880 --> 00:23:51,020 So the idea then is that you have a source that contains 452 00:23:51,020 --> 00:23:53,030 information like frequency. 453 00:23:53,030 --> 00:23:57,320 What's the pitch of the utterance? 454 00:23:57,320 --> 00:24:02,240 But you have this other thing, which is acting like a filter. 455 00:24:02,240 --> 00:24:04,940 If you think about the whole thing as a system, 456 00:24:04,940 --> 00:24:09,620 we have a block which represents a filter, which 457 00:24:09,620 --> 00:24:12,039 is the thing that has a frequency response. 
458 00:24:12,039 --> 00:24:13,580 The frequency response depends on how 459 00:24:13,580 --> 00:24:16,460 I've put my tongue in my mouth and how I've opened my lips 460 00:24:16,460 --> 00:24:17,356 and stuff like that. 461 00:24:17,356 --> 00:24:18,480 We'll see that in a minute. 462 00:24:18,480 --> 00:24:21,410 But it also depends on how the vocal folds-- 463 00:24:21,410 --> 00:24:25,710 it has an input, which is the vocal folds. 464 00:24:25,710 --> 00:24:29,330 So the idea then is the same kind of a source filter idea 465 00:24:29,330 --> 00:24:34,320 that we motivated last time by way of the RC filter example. 466 00:24:34,320 --> 00:24:37,070 If you put a resistor and a capacitor 467 00:24:37,070 --> 00:24:40,130 together with a source, a convenient way 468 00:24:40,130 --> 00:24:42,830 to think about that is as a low pass filter. 469 00:24:42,830 --> 00:24:45,170 We think about it having a frequency response. 470 00:24:45,170 --> 00:24:50,330 So the system, just the RC part, has a frequency response 471 00:24:50,330 --> 00:24:53,480 which we can characterize by a Bode diagram. 472 00:24:53,480 --> 00:24:54,860 So we can think about-- 473 00:24:54,860 --> 00:24:56,510 we did this last time-- 474 00:24:56,510 --> 00:24:59,270 so we can think about the low frequencies go through 475 00:24:59,270 --> 00:25:00,410 without attenuation. 476 00:25:00,410 --> 00:25:04,190 The gain is 1, and the phase is 0. 477 00:25:04,190 --> 00:25:07,820 So basically low frequencies go through the filter 478 00:25:07,820 --> 00:25:10,310 without any change. 479 00:25:10,310 --> 00:25:11,684 High frequencies are attenuated. 480 00:25:11,684 --> 00:25:13,100 The higher the frequency, the more 481 00:25:13,100 --> 00:25:20,030 the attenuation, and phase shifted by lagging pi over 2. 482 00:25:20,030 --> 00:25:23,090 So that's a way of thinking about the RC circuit 483 00:25:23,090 --> 00:25:24,980 as a low pass filter. 
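The low-pass behavior just described (gain near 1 and phase near 0 at low frequencies, growing attenuation and a phase lag approaching pi over 2 at high frequencies) can be checked numerically. This is a minimal sketch, assuming the standard first-order response H(jw) = 1 / (1 + jwRC); the component values R and C are hypothetical and are not taken from the lecture.

```python
import cmath
import math

# Hypothetical component values, chosen only for illustration.
R = 1_000.0              # resistance in ohms
C = 1e-6                 # capacitance in farads
CORNER = 1.0 / (R * C)   # corner frequency in rad/s


def rc_response(omega):
    """Frequency response of a first-order RC low-pass filter at omega (rad/s)."""
    return 1.0 / (1.0 + 1j * omega * R * C)


def gain_and_phase(omega):
    """Return (magnitude, phase in radians) of the response at omega."""
    h = rc_response(omega)
    return abs(h), cmath.phase(h)


if __name__ == "__main__":
    # Well below, at, and well above the corner frequency.
    for factor in (0.01, 1.0, 100.0):
        gain, phase = gain_and_phase(factor * CORNER)
        print(f"{factor:>6} x corner: gain = {gain:.4f}, "
              f"phase = {math.degrees(phase):+.1f} degrees")
```

Evaluating the response a factor of 100 on either side of the corner frequency is enough to see both limiting cases that the Bode diagram summarizes.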
484 00:25:24,980 --> 00:25:26,510 And it gives us insight in the kinds 485 00:25:26,510 --> 00:25:28,790 of signals that go through and don't go through. 486 00:25:28,790 --> 00:25:33,710 So that, if we think about a signal like a square wave 487 00:25:33,710 --> 00:25:37,380 having a Fourier series decomposition, 488 00:25:37,380 --> 00:25:41,550 it only has odd components and the odd components fall with k. 489 00:25:41,550 --> 00:25:44,690 The magnitude of the component is inverse with k. 490 00:25:44,690 --> 00:25:49,640 So we get components that, if I plot on a log scale, 491 00:25:49,640 --> 00:25:53,625 the reciprocal relationship of the weight of the components, 492 00:25:53,625 --> 00:25:56,000 the magnitude of the components, makes it a straight line 493 00:25:56,000 --> 00:25:58,756 with a slope of minus 1. 494 00:25:58,756 --> 00:26:02,450 And now we can think about putting this signal into the RC 495 00:26:02,450 --> 00:26:07,040 filter and thinking about what the output should look like. 496 00:26:07,040 --> 00:26:09,080 If the frequency of the square wave, 497 00:26:09,080 --> 00:26:12,440 if the fundamental frequency, 2 pi 498 00:26:12,440 --> 00:26:17,250 over the period, if 2 pi over capital T, 499 00:26:17,250 --> 00:26:21,650 if 2 pi over capital T is some frequency that's low compared 500 00:26:21,650 --> 00:26:25,370 to the corner frequency of the low pass filter, 501 00:26:25,370 --> 00:26:28,440 basically the output of the filter, which 502 00:26:28,440 --> 00:26:32,390 is showed in green, overlaps the input, which is showed in red. 503 00:26:32,390 --> 00:26:35,780 You can't tell the difference because all the components 504 00:26:35,780 --> 00:26:39,750 have the same magnitude and phase as the input. 
505 00:26:39,750 --> 00:26:42,420 But if you change the frequency of the square wave 506 00:26:42,420 --> 00:26:45,630 so that the fundamental is higher, 507 00:26:45,630 --> 00:26:48,870 some of the higher frequencies are attenuated 508 00:26:48,870 --> 00:26:51,700 and phase shifted. 509 00:26:51,700 --> 00:26:54,280 The shape of the waveform, showed in green, 510 00:26:54,280 --> 00:26:56,267 starts to deviate. 511 00:26:56,267 --> 00:26:57,850 If you go to still higher frequencies, 512 00:26:57,850 --> 00:26:59,080 the deviation's even greater. 513 00:26:59,080 --> 00:27:01,690 And if you go to high enough frequencies, 514 00:27:01,690 --> 00:27:04,810 they're all in the region where the magnitude is 515 00:27:04,810 --> 00:27:09,070 being attenuated by whatever frequency. 516 00:27:09,070 --> 00:27:12,730 So my dependence of 1 over k becomes 1 over k squared, 517 00:27:12,730 --> 00:27:17,560 and it goes from being a square wave to a triangle wave. 518 00:27:17,560 --> 00:27:20,430 So that's a way of thinking about the signal transformation 519 00:27:20,430 --> 00:27:21,450 in terms of a filter. 520 00:27:21,450 --> 00:27:22,590 We did that last time. 521 00:27:22,590 --> 00:27:25,880 What's going on with speech is exactly the same thing. 522 00:27:25,880 --> 00:27:27,510 What we want to do is think about-- 523 00:27:27,510 --> 00:27:32,310 the glottis makes some kind of a sound that goes into a filter. 524 00:27:32,310 --> 00:27:34,860 The filter is this thing that is controlled 525 00:27:34,860 --> 00:27:36,830 by my tongue's position and my jaw 526 00:27:36,830 --> 00:27:39,210 position and my lip position and stuff like that. 527 00:27:39,210 --> 00:27:41,790 And what comes out is speech. 528 00:27:41,790 --> 00:27:45,000 To demonstrate that, here's a film 529 00:27:45,000 --> 00:27:46,350 that was made by Ken Stevens. 530 00:27:46,350 --> 00:27:48,308 Ken Stevens was a professor in this department. 
531 00:27:48,308 --> 00:27:50,790 He just recently retired. 532 00:27:50,790 --> 00:27:53,370 This was done when he was a graduate student. 533 00:27:53,370 --> 00:27:57,630 It's very hard to see because the contrast is not great. 534 00:27:57,630 --> 00:27:59,310 But you have to take into consideration 535 00:27:59,310 --> 00:28:01,050 this was made with X-rays. 536 00:28:01,050 --> 00:28:04,030 OK, we probably wouldn't do this today. 537 00:28:04,030 --> 00:28:07,530 It was a relatively large exposure 538 00:28:07,530 --> 00:28:10,620 to x-rays, which we sort of frown on these days. 539 00:28:10,620 --> 00:28:14,340 Just so you're not too worried, Ken Stevens, when he retired, 540 00:28:14,340 --> 00:28:17,140 had the longest teaching career in our history. 541 00:28:17,140 --> 00:28:21,380 He was a lecturer who actively lectured for 50 years. 542 00:28:21,380 --> 00:28:22,800 So he seemed to have done OK. 543 00:28:22,800 --> 00:28:23,910 He survived. 544 00:28:23,910 --> 00:28:26,280 So you don't need to worry about what happened to him. 545 00:28:26,280 --> 00:28:29,340 But we probably wouldn't repeat this. 546 00:28:29,340 --> 00:28:32,100 It's a little hard to see. 547 00:28:32,100 --> 00:28:35,400 The bone is easy, right, because x-rays don't 548 00:28:35,400 --> 00:28:37,500 go through bones very well. 549 00:28:37,500 --> 00:28:40,560 What you can just barely see is his lips. 550 00:28:40,560 --> 00:28:45,180 And it's important to watch the lips, too. 551 00:28:45,180 --> 00:28:50,250 It's also important that his chin is on a chin rest 552 00:28:50,250 --> 00:28:51,880 to simplify analysis. 553 00:28:51,880 --> 00:28:56,580 The idea of this was to get quantitative measurements 554 00:28:56,580 --> 00:29:00,450 to fit the source filter idea. 555 00:29:00,450 --> 00:29:06,074 OK, so now I'm going to play a film, a recording of him. 556 00:29:06,074 --> 00:29:10,058 [VIDEO PLAYBACK] 557 00:29:10,058 --> 00:29:11,552 - Test. 
558 00:29:11,552 --> 00:29:13,046 Test. 559 00:29:13,046 --> 00:29:14,042 Test. 560 00:29:14,042 --> 00:29:17,030 [? The tongue. ?] [? The tongue. ?] 561 00:29:17,030 --> 00:29:18,524 The [INAUDIBLE]. 562 00:29:18,524 --> 00:29:22,508 The [INAUDIBLE].. [? The neck. ?] [INAUDIBLE].. 563 00:29:22,508 --> 00:29:24,002 [INAUDIBLE] 564 00:29:24,002 --> 00:29:25,496 [INAUDIBLE] 565 00:29:25,496 --> 00:29:26,990 [INAUDIBLE] 566 00:29:26,990 --> 00:29:28,982 [INAUDIBLE] 567 00:29:28,982 --> 00:29:30,476 [INAUDIBLE] 568 00:29:30,476 --> 00:29:33,962 [? Fox. ?] [? Clock. ?] The [INAUDIBLE].. 569 00:29:33,962 --> 00:29:35,456 The [INAUDIBLE]. 570 00:29:35,456 --> 00:29:36,950 [INAUDIBLE] 571 00:29:36,950 --> 00:29:41,432 [? Took. ?] [? Two. ?] [INAUDIBLE].. 572 00:29:41,432 --> 00:29:44,918 [? Tot. ?] [? Tech. ?] [INAUDIBLE].. 573 00:29:44,918 --> 00:29:46,910 [INAUDIBLE] 574 00:29:46,910 --> 00:29:48,404 [INAUDIBLE] 575 00:29:48,404 --> 00:29:49,898 [INAUDIBLE] 576 00:29:49,898 --> 00:29:51,392 [INAUDIBLE] 577 00:29:51,392 --> 00:29:53,384 [INAUDIBLE] 578 00:29:53,384 --> 00:29:58,025 Why did [INAUDIBLE] set the [INAUDIBLE] on top of his desk? 579 00:29:58,025 --> 00:29:59,858 I have put [? blood under ?] [? two clean ?] 580 00:29:59,858 --> 00:30:03,507 [? yellow shoes. ?] 581 00:30:03,507 --> 00:30:04,340 [END VIDEO PLAYBACK] 582 00:30:04,340 --> 00:30:05,834 [LAUGHTER] 583 00:30:05,834 --> 00:30:11,050 PROFESSOR: OK, so what you were supposed to see 584 00:30:11,050 --> 00:30:14,680 is that the thing that we associate with speech 585 00:30:14,680 --> 00:30:16,390 is only a small part of it. 586 00:30:16,390 --> 00:30:17,890 His lips were obviously moving. 587 00:30:17,890 --> 00:30:20,350 That's what we see. 588 00:30:20,350 --> 00:30:21,880 But if you were paying attention, 589 00:30:21,880 --> 00:30:27,389 his tongue was going up and down not a little bit, but a lot. 
590 00:30:27,389 --> 00:30:29,680 So the gap between his tongue and the roof of his mouth 591 00:30:29,680 --> 00:30:31,385 was going from 0 to about that far. 592 00:30:34,810 --> 00:30:40,360 The velum back here was opening very broadly on occasion. 593 00:30:40,360 --> 00:30:42,580 So there was a significant variation 594 00:30:42,580 --> 00:30:48,860 in the shape of the structure through which the glottis waveform 595 00:30:48,860 --> 00:30:50,470 was passing. 596 00:30:50,470 --> 00:30:55,090 And that's the basis of the filtering that gives rise 597 00:30:55,090 --> 00:30:57,740 to the different speech sounds. 598 00:30:57,740 --> 00:31:01,030 So to convince you of that, here I 599 00:31:01,030 --> 00:31:04,936 have a carefully machined item. 600 00:31:04,936 --> 00:31:06,084 Let's see. 601 00:31:06,084 --> 00:31:07,000 I don't want this one. 602 00:31:07,000 --> 00:31:08,000 I want this one. 603 00:31:08,000 --> 00:31:10,930 So this is a Japanese oo. 604 00:31:10,930 --> 00:31:13,570 OK, now I don't know Japanese, so I have to just sort of trust 605 00:31:13,570 --> 00:31:15,153 the guy who made this that it actually 606 00:31:15,153 --> 00:31:17,110 sounds like a Japanese oo. 607 00:31:17,110 --> 00:31:20,380 The second one I'll show is a Japanese ee, which actually 608 00:31:20,380 --> 00:31:22,220 sounds more like an ee to me. 609 00:31:22,220 --> 00:31:26,260 But anyway, this model was made from measurements of the type 610 00:31:26,260 --> 00:31:28,300 that I just showed with Ken. 611 00:31:28,300 --> 00:31:31,420 So the idea was to estimate the size of those cavities 612 00:31:31,420 --> 00:31:33,520 through which the air was passing, 613 00:31:33,520 --> 00:31:37,020 and then make, by machining in Plexiglas, 614 00:31:37,020 --> 00:31:40,520 a structure that has that shape. 615 00:31:40,520 --> 00:31:43,670 So this was an early test of whether the source filter 616 00:31:43,670 --> 00:31:44,750 idea works.
617 00:31:44,750 --> 00:31:48,500 So if that is the explanation for how speech is generated, 618 00:31:48,500 --> 00:31:51,021 then I ought to be able to take a boring sound 619 00:31:51,021 --> 00:31:52,895 of the type that's generated by the glottis-- 620 00:31:52,895 --> 00:31:54,650 [BUZZING SOUND] 621 00:31:57,545 --> 00:32:00,500 And put it through this, and it should sound more like a vowel. 622 00:32:00,500 --> 00:32:01,730 OK? 623 00:32:01,730 --> 00:32:02,290 Got it? 624 00:32:02,290 --> 00:32:04,200 Know what I'm talking about? 625 00:32:04,200 --> 00:32:08,952 So this is a Japanese oo. 626 00:32:08,952 --> 00:32:11,850 [BUZZING SOUND] 627 00:32:11,850 --> 00:32:14,605 [COMBINES BUZZING SOUND WITH 'OO' SOUND] 628 00:32:14,605 --> 00:32:16,230 I don't know if anybody knows Japanese. 629 00:32:16,230 --> 00:32:17,856 I don't know if that sounds like an oo. 630 00:32:17,856 --> 00:32:19,563 Does anybody know Japanese, and does that 631 00:32:19,563 --> 00:32:20,580 sound like an oo or not? 632 00:32:23,410 --> 00:32:24,696 OK, I'll pass. 633 00:32:24,696 --> 00:32:28,590 [BUZZING SOUND] 634 00:32:28,590 --> 00:32:30,115 This is an ee. 635 00:32:30,115 --> 00:32:33,390 OK, now notice that the ee looks very different. 636 00:32:33,390 --> 00:32:34,770 Right? 637 00:32:34,770 --> 00:32:37,470 The question is whether that's a big enough difference 638 00:32:37,470 --> 00:32:44,130 to make the difference between an oo and an ee. 639 00:32:44,130 --> 00:32:47,270 I'm pressing the same button, nothing up my sleeve, nothing-- 640 00:32:47,270 --> 00:32:50,048 OK, so same button. 641 00:32:50,048 --> 00:32:53,520 [COMBINES BUZZING SOUND WITH 'EE' SOUND] 642 00:33:00,640 --> 00:33:02,570 OK, so what you're supposed to be convinced of 643 00:33:02,570 --> 00:33:07,780 is there is enough information in the shape 644 00:33:07,780 --> 00:33:12,440 of the vocal structures to account 645 00:33:12,440 --> 00:33:14,755 for the difference in the sounds. 
646 00:33:17,380 --> 00:33:23,650 Now of course, we don't really care about the acoustics 647 00:33:23,650 --> 00:33:28,600 if we're trying to, for example, synthesize or analyze speech. 648 00:33:28,600 --> 00:33:30,670 We don't particularly care about that. 649 00:33:30,670 --> 00:33:32,650 We do like to know that there is a theory that underlies it, 650 00:33:32,650 --> 00:33:33,149 right? 651 00:33:33,149 --> 00:33:36,070 And there's a very sound physical basis 652 00:33:36,070 --> 00:33:39,610 for why we should think about the source filter idea. 653 00:33:39,610 --> 00:33:43,690 When I say source filter, source filter-- 654 00:33:43,690 --> 00:33:47,350 so everybody calls it the source filter model of speech. 655 00:33:47,350 --> 00:33:48,850 So is there any good physical reason 656 00:33:48,850 --> 00:33:50,980 for why that should be true? 657 00:33:50,980 --> 00:33:56,740 Of course, what we care about is the frequency response. 658 00:33:56,740 --> 00:33:58,920 So here what's showed is measurements 659 00:33:58,920 --> 00:34:01,860 of frequency responses taken from speakers. 660 00:34:01,860 --> 00:34:03,780 So now we don't do the x-ray thing. 661 00:34:03,780 --> 00:34:09,179 All we do is record somebody saying heed, had, hood, 662 00:34:09,179 --> 00:34:12,380 haw'd, who'd. 663 00:34:12,380 --> 00:34:16,070 And we look at men, women, and children, 664 00:34:16,070 --> 00:34:22,100 and we characterize how their frequency responses change 665 00:34:22,100 --> 00:34:24,210 when they make those different sounds. 666 00:34:24,210 --> 00:34:30,889 So what's showed here is that you get a relatively good fit 667 00:34:30,889 --> 00:34:36,409 by thinking about the frequency response having three formants. 668 00:34:36,409 --> 00:34:39,770 The formants are the peak frequencies. 
669 00:34:39,770 --> 00:34:42,886 There's a theory, which I won't go into, 670 00:34:42,886 --> 00:34:44,510 for how you take this shape and turn it 671 00:34:44,510 --> 00:34:47,540 into a formant frequency. 672 00:34:47,540 --> 00:34:51,949 And given just the formant frequencies, or given 673 00:34:51,949 --> 00:34:54,650 the frequency response measured at uniform spacing 674 00:34:54,650 --> 00:34:57,260 across frequencies, there is a theory 675 00:34:57,260 --> 00:35:02,470 for how you can generate the smooth line, which is really-- 676 00:35:02,470 --> 00:35:05,050 this is an 11th order [? fit, ?] which 677 00:35:05,050 --> 00:35:09,550 means that there are 11 poles and no zeros. 678 00:35:09,550 --> 00:35:12,310 So what you do to get this shape then 679 00:35:12,310 --> 00:35:14,970 is take the locations and amplitudes 680 00:35:14,970 --> 00:35:20,020 of the formant frequencies and do a fit using poles. 681 00:35:20,020 --> 00:35:24,270 And so here's a table showing measured formant frequencies, 682 00:35:24,270 --> 00:35:29,440 F1, F2, and F3, for whatever, six different 683 00:35:29,440 --> 00:35:34,720 sounds for three different categories of speakers. 684 00:35:34,720 --> 00:35:35,980 OK? 685 00:35:35,980 --> 00:35:39,070 And that's kind of a complete analysis then in terms 686 00:35:39,070 --> 00:35:41,560 of the source filter idea. 687 00:35:41,560 --> 00:35:45,610 So this figure summarizes the idea. 688 00:35:45,610 --> 00:35:47,280 We think about source filters. 689 00:35:47,280 --> 00:35:49,450 So the source is the glottis. 690 00:35:49,450 --> 00:35:55,290 The filter is the formants created by the throat. 691 00:35:55,290 --> 00:35:58,900 And speech is the thing that comes out of the source filter. 692 00:35:58,900 --> 00:36:02,140 The source is some periodic waveform 693 00:36:02,140 --> 00:36:07,330 caused by the banging together of the vocal folds. 694 00:36:07,330 --> 00:36:15,430 The filter is the frequency response of the throat. 
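The source filter picture summarized above can be sketched as a harmonic source multiplied by an all-pole frequency response. This is only a toy illustration under stated assumptions: the formant frequencies below are illustrative values rather than the entries of the lecture's table, each formant is modeled as a single resonant pole pair with an assumed sharpness `Q` (so this is a 6-pole stand-in, not the 11th order fit mentioned in the lecture), and the source harmonics are assumed to fall off as 1/k.

```python
# Illustrative formant frequencies in Hz (hypothetical, not the lecture's table).
FORMANTS_EE = (300.0, 2300.0, 3000.0)
FORMANTS_AH = (700.0, 1100.0, 2500.0)
Q = 10.0  # assumed sharpness of each resonance


def vocal_tract_gain(freq_hz, formants, q=Q):
    """Magnitude of an all-pole filter with one resonant pole pair per formant."""
    gain = 1.0
    for f_n in formants:
        ratio = freq_hz / f_n
        # Second-order resonance: |1 / (1 - ratio^2 + j * ratio / q)|
        gain *= 1.0 / abs(complex(1.0 - ratio ** 2, ratio / q))
    return gain


def speech_spectrum(pitch_hz, formants, max_hz=4000.0):
    """Output harmonic magnitudes: assumed 1/k source times the filter gain."""
    spectrum = []
    k = 1
    while k * pitch_hz <= max_hz:
        freq = k * pitch_hz
        spectrum.append((freq, (1.0 / k) * vocal_tract_gain(freq, formants)))
        k += 1
    return spectrum


if __name__ == "__main__":
    # Same source (same pitch), two different filters.
    for name, formants in (("ee", FORMANTS_EE), ("ah", FORMANTS_AH)):
        strongest = max(speech_spectrum(100.0, formants), key=lambda p: p[1])
        print(f"{name}: strongest harmonic near {strongest[0]:.0f} Hz")
```

The point of the sketch is the structure, not the numbers: the source fixes which frequencies are present, and the filter alone decides how strongly each one comes out.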
695 00:36:15,430 --> 00:36:18,310 And the result, then, is just passing this glottis waveform-- 696 00:36:18,310 --> 00:36:21,160 so this is a measured-- 697 00:36:21,160 --> 00:36:23,660 by sticking a microphone in somebody's throat, 698 00:36:23,660 --> 00:36:26,740 this is a measurement of what the glottis acoustics looks 699 00:36:26,740 --> 00:36:28,300 like. 700 00:36:28,300 --> 00:36:33,910 This is a Fourier decomposition of that periodic waveform. 701 00:36:33,910 --> 00:36:37,360 Then this is the frequency response of that thing. 702 00:36:37,360 --> 00:36:41,560 And this is the Fourier coefficients 703 00:36:41,560 --> 00:36:44,400 of the output for different sounds. 704 00:36:44,400 --> 00:36:45,510 So here is the frequency. 705 00:36:45,510 --> 00:36:50,230 So the same glottis signal underlies an ee sound and an ah 706 00:36:50,230 --> 00:36:54,670 sound and generates two different spectra. 707 00:36:54,670 --> 00:36:57,100 We call that combination of magnitudes and angles. 708 00:36:57,100 --> 00:36:59,505 We call that the Fourier spectrum. 709 00:36:59,505 --> 00:37:00,880 So you get two different spectra, 710 00:37:00,880 --> 00:37:04,550 depending on the filter shape. 711 00:37:04,550 --> 00:37:07,140 OK? 712 00:37:07,140 --> 00:37:09,330 And that's the basis-- 713 00:37:09,330 --> 00:37:11,490 this theory, this source filter idea-- 714 00:37:11,490 --> 00:37:13,320 is the basis of the current technology 715 00:37:13,320 --> 00:37:16,690 for speech recognition and speech production. 716 00:37:16,690 --> 00:37:18,300 So I actually cheated. 717 00:37:18,300 --> 00:37:19,860 Those sounds that I played earlier-- 718 00:37:19,860 --> 00:37:22,410 bit, bat, bought, beat, all those things-- 719 00:37:22,410 --> 00:37:25,090 those were actually synthetic speech. 720 00:37:25,090 --> 00:37:27,510 OK, all I did is I ran a speech synthesizer, 721 00:37:27,510 --> 00:37:30,180 and I said, synthesize bit.
722 00:37:30,180 --> 00:37:32,850 So that was really a synthesized thing. 723 00:37:32,850 --> 00:37:34,920 That was not a real person. 724 00:37:34,920 --> 00:37:41,250 And so the synthesizer used this theory 725 00:37:41,250 --> 00:37:43,710 in order to generate this synthetic speech. 726 00:37:43,710 --> 00:37:47,650 We also use this theory in order to recognize speech. 727 00:37:47,650 --> 00:37:49,650 And you'll do a homework problem in Homework 10, 728 00:37:49,650 --> 00:37:53,700 I think it is, in which you'll build the primitive front 729 00:37:53,700 --> 00:37:59,360 end of a speech recognizer using this theory. 730 00:37:59,360 --> 00:38:01,940 And I'll give you a couple of utterances of different vowels 731 00:38:01,940 --> 00:38:05,180 and you'll have to classify which vowel is being said, 732 00:38:05,180 --> 00:38:08,000 according to some automatic speech recognizer 733 00:38:08,000 --> 00:38:10,310 based on this theory. 734 00:38:10,310 --> 00:38:13,430 The theory is also just fun, because a theory 735 00:38:13,430 --> 00:38:16,132 lets us figure out anomalies. 736 00:38:16,132 --> 00:38:17,840 So when somebody has a speech impediment, 737 00:38:17,840 --> 00:38:20,131 for example, when I did, when I was a little kid I did. 738 00:38:20,131 --> 00:38:21,560 And I was sent to speech school. 739 00:38:21,560 --> 00:38:23,310 Now they do a much better job because they 740 00:38:23,310 --> 00:38:27,320 do analysis to figure out what I'm doing wrong, 741 00:38:27,320 --> 00:38:29,120 using this sort of source filter idea. 742 00:38:29,120 --> 00:38:31,250 We can also use the source filter idea 743 00:38:31,250 --> 00:38:35,390 to understand paradoxes. 744 00:38:35,390 --> 00:38:38,990 So for example, I've told you before I work on hearing aids. 745 00:38:38,990 --> 00:38:40,820 I tried to make hearing aids hear. 
746 00:38:40,820 --> 00:38:45,410 And so people with hearing deficiencies like mine, 747 00:38:45,410 --> 00:38:47,480 where I have sort of progressive age-related 748 00:38:47,480 --> 00:38:50,120 because I'm old, right, that's what happens. 749 00:38:50,120 --> 00:38:52,940 I have age-related hearing loss, which 750 00:38:52,940 --> 00:38:55,220 means that I'm losing high frequencies. 751 00:38:55,220 --> 00:38:56,900 I'm less sensitive to high frequencies. 752 00:38:56,900 --> 00:39:00,770 People like me, which is the vast majority of people 753 00:39:00,770 --> 00:39:03,620 my age-- 754 00:39:03,620 --> 00:39:07,580 it's easier to understand male speech than female speech. 755 00:39:07,580 --> 00:39:08,080 Why? 756 00:39:10,990 --> 00:39:12,080 Higher frequencies. 757 00:39:12,080 --> 00:39:16,480 So those higher frequencies shift 758 00:39:16,480 --> 00:39:18,066 some of the important stuff that I 759 00:39:18,066 --> 00:39:19,690 should be listening to into frequencies 760 00:39:19,690 --> 00:39:21,640 I don't hear anymore. 761 00:39:21,640 --> 00:39:22,630 Right? 762 00:39:22,630 --> 00:39:24,640 So that's a way of using this theory 763 00:39:24,640 --> 00:39:27,640 to try to understand what's wrong with me. 764 00:39:27,640 --> 00:39:30,310 But there's also things-- it's not just me. 765 00:39:30,310 --> 00:39:33,700 Normal people have trouble distinguishing female speech, 766 00:39:33,700 --> 00:39:36,621 especially in taxing environments, and one of those 767 00:39:36,621 --> 00:39:37,120 is singing. 768 00:39:40,170 --> 00:39:43,030 So if you consider altos and sopranos, 769 00:39:43,030 --> 00:39:46,350 sopranos are like, the worst, right? 770 00:39:46,350 --> 00:39:49,230 Because they are not only female, but they're at the high 771 00:39:49,230 --> 00:39:51,490 end of the females. 772 00:39:51,490 --> 00:39:54,180 And there are those who complain about not 773 00:39:54,180 --> 00:39:57,360 being able to understand female singers.
774 00:39:57,360 --> 00:39:59,640 OK, so here's a demo that will help 775 00:39:59,640 --> 00:40:04,120 us to understand whether that's a valid kind of a criticism 776 00:40:04,120 --> 00:40:04,620 or not. 777 00:40:04,620 --> 00:40:10,110 So what I've got is a professional singer 778 00:40:10,110 --> 00:40:13,080 singing, la, la, la, la-- 779 00:40:13,080 --> 00:40:14,210 on a scale. 780 00:40:14,210 --> 00:40:16,430 So from low frequency to high frequency, 781 00:40:16,430 --> 00:40:18,180 then a different sound, a different sound, 782 00:40:18,180 --> 00:40:20,877 a different sound, a different sound. 783 00:40:20,877 --> 00:40:22,460 So the first thing that I want to do-- 784 00:40:22,460 --> 00:40:24,376 I want you to listen to those different sounds 785 00:40:24,376 --> 00:40:27,439 as she goes across the scale. 786 00:40:27,439 --> 00:40:29,480 Then I'm going to play just the low ones and just 787 00:40:29,480 --> 00:40:31,990 the high ones, just the low frequency ones and just 788 00:40:31,990 --> 00:40:32,990 the high frequency ones. 789 00:40:32,990 --> 00:40:36,080 So first, the different scales-- 790 00:40:36,080 --> 00:40:39,834 la, la, lore, loo, lee, OK? 791 00:40:39,834 --> 00:40:40,500 [AUDIO PLAYBACK] 792 00:40:40,500 --> 00:40:47,500 - La, la, la, la, la, la, la, la, la, la, la, la, la, la, la, 793 00:40:47,500 --> 00:40:48,500 la. 794 00:40:48,500 --> 00:40:50,500 La. 795 00:40:50,500 --> 00:40:55,000 Lore, lore, lore, lore, lore, lore, lore, lore, lore, lore, 796 00:40:55,000 --> 00:40:58,500 lore, lore, lore, lore, lore, lore, lore. 797 00:40:58,500 --> 00:41:00,000 Lore. 798 00:41:00,000 --> 00:41:05,750 Loo, loo, loo, loo, loo, loo, loo, loo, loo, loo, loo, loo, 799 00:41:05,750 --> 00:41:07,500 loo, loo, loo, loo. 800 00:41:07,500 --> 00:41:09,500 Loo. 801 00:41:09,500 --> 00:41:15,250 Ler, ler, ler, ler, ler, ler, ler, ler, ler, ler, ler, ler, 802 00:41:15,250 --> 00:41:17,000 ler, ler, ler, ler. 803 00:41:17,000 --> 00:41:19,000 Ler. 
804 00:41:19,000 --> 00:41:24,750 Lee, lee, lee, lee, lee, lee, lee, lee, lee, lee, lee, lee, 805 00:41:24,750 --> 00:41:26,500 lee, lee, lee, lee. 806 00:41:26,500 --> 00:41:28,500 Lee. 807 00:41:28,500 --> 00:41:30,000 [END PLAYBACK] 808 00:41:30,000 --> 00:41:32,770 PROFESSOR: OK, so now what I've done 809 00:41:32,770 --> 00:41:37,420 is I've sliced out the lowest frequency, the very first 810 00:41:37,420 --> 00:41:40,810 of the scale from each of the sounds 811 00:41:40,810 --> 00:41:45,490 and pasted them together to get the low frequency run. 812 00:41:45,490 --> 00:41:51,470 And then I took out the high ones and pasted them together. 813 00:41:51,470 --> 00:41:52,110 OK? 814 00:41:52,110 --> 00:41:56,990 Exactly the same sounds, just played in a different order. 815 00:41:56,990 --> 00:41:59,540 So first the low frequency ones. 816 00:41:59,540 --> 00:42:01,022 SINGER: La. 817 00:42:01,022 --> 00:42:02,998 Lore. 818 00:42:02,998 --> 00:42:04,974 Loo. 819 00:42:04,974 --> 00:42:06,950 Ler. 820 00:42:06,950 --> 00:42:09,914 Lee. 821 00:42:09,914 --> 00:42:13,026 And the high frequency ones. 822 00:42:13,026 --> 00:42:15,022 La. 823 00:42:15,022 --> 00:42:16,519 Lore. 824 00:42:16,519 --> 00:42:18,515 Loo. 825 00:42:18,515 --> 00:42:20,511 Ler. 826 00:42:20,511 --> 00:42:21,509 Lee. 827 00:42:21,509 --> 00:42:26,000 [LAUGHTER] 828 00:42:26,000 --> 00:42:32,570 PROFESSOR: It's not her fault. She's doing everything right. 829 00:42:32,570 --> 00:42:34,030 And you can see that. 830 00:42:34,030 --> 00:42:40,010 Here is, again, a Python program analyzing those same segments. 831 00:42:40,010 --> 00:42:46,040 So what's shown here is the ee, the filter derived from the ee, 832 00:42:46,040 --> 00:42:50,360 by thinking about the lee, lee, lee, lee, lee, lee-- 833 00:42:50,360 --> 00:42:53,990 by looking at that sequence, and averaging 834 00:42:53,990 --> 00:42:57,620 across the frequencies. 835 00:42:57,620 --> 00:43:00,890 So here's the filter. 
836 00:43:00,890 --> 00:43:08,180 Here's the filtered glottis spectrum for a low frequency 837 00:43:08,180 --> 00:43:10,880 and intermediate frequency and high frequency. 838 00:43:10,880 --> 00:43:14,340 What's the difference between the low, middle, and high? 839 00:43:17,330 --> 00:43:19,870 What's characteristically different at low and high? 840 00:43:26,110 --> 00:43:31,880 AUDIENCE: [INAUDIBLE] frequency, like high amplitude. 841 00:43:31,880 --> 00:43:37,230 PROFESSOR: So if you look at the low frequency, the low pitch, 842 00:43:37,230 --> 00:43:42,560 there are more frequency components in a given range. 843 00:43:42,560 --> 00:43:45,400 So if I say analyze the frequencies between 0 844 00:43:45,400 --> 00:43:50,110 and 1,000 hertz, 1,000 cycles per second, 845 00:43:50,110 --> 00:43:55,210 there are more lines when you have a low frequency. 846 00:43:55,210 --> 00:43:57,760 And so you get the density of the lines 847 00:43:57,760 --> 00:44:00,340 is greater for the low frequency utterance 848 00:44:00,340 --> 00:44:04,200 than it is for the high frequency utterance. 849 00:44:04,200 --> 00:44:09,450 The low frequency utterance is spaced close enough 850 00:44:09,450 --> 00:44:17,240 that you can clearly figure out this pattern from that spacing, 851 00:44:17,240 --> 00:44:20,660 because there are multiple lines per peak. 852 00:44:20,660 --> 00:44:23,430 The problem is that the speech waveforms 853 00:44:23,430 --> 00:44:26,870 have very sharp resonances. 854 00:44:26,870 --> 00:44:29,510 The peaks are narrow. 855 00:44:29,510 --> 00:44:32,720 So that as you go to a higher frequency, 856 00:44:32,720 --> 00:44:36,380 now it's very hard to see. 857 00:44:36,380 --> 00:44:39,140 So where there was two lines characterizing this guy, 858 00:44:39,140 --> 00:44:41,860 now there's one. 859 00:44:41,860 --> 00:44:44,460 And at the highest frequency, there's nothing there. 
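The line-density argument can be made concrete by counting harmonics. This is a small sketch, assuming hypothetical pitches of 110, 440, and 880 Hz and a hypothetical resonance peak spanning 200 to 500 Hz; none of these numbers come from the lecture's plots.

```python
def harmonics_in_band(pitch_hz, low_hz, high_hz):
    """Return the harmonic frequencies k * pitch_hz (k >= 1) in [low_hz, high_hz]."""
    found = []
    k = 1
    while k * pitch_hz <= high_hz:
        if k * pitch_hz >= low_hz:
            found.append(k * pitch_hz)
        k += 1
    return found


if __name__ == "__main__":
    peak_low, peak_high = 200, 500   # hypothetical extent of one formant peak
    for pitch in (110, 440, 880):    # hypothetical low, middle, and high pitches
        below_1k = harmonics_in_band(pitch, 0, 1000)
        in_peak = harmonics_in_band(pitch, peak_low, peak_high)
        print(f"pitch {pitch} Hz: {len(below_1k)} lines below 1,000 Hz, "
              f"{len(in_peak)} inside the peak")
```

As the pitch rises, the harmonics spread apart, so a narrow peak that several lines land on at low pitch may be hit by one line, or by none, at high pitch.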
860 00:44:47,680 --> 00:44:51,340 Similarly with these peaks, again, several lines 861 00:44:51,340 --> 00:44:52,450 representing each peak. 862 00:44:52,450 --> 00:44:56,950 One line representing-- nothing representing this peak, nothing 863 00:44:56,950 --> 00:44:58,630 representing that peak. 864 00:44:58,630 --> 00:45:05,270 There is nothing about ee in that signal. 865 00:45:05,270 --> 00:45:08,520 And if you do the same analysis for ah, 866 00:45:08,520 --> 00:45:12,030 you get the same result. There's nothing about ee, 867 00:45:12,030 --> 00:45:13,740 and there's nothing about ah. 868 00:45:13,740 --> 00:45:15,940 There's just nothing there. 869 00:45:15,940 --> 00:45:19,470 There's no way anybody is going to tell those two sounds apart. 870 00:45:19,470 --> 00:45:24,840 So if the singer put her voice, her vocal tract, 871 00:45:24,840 --> 00:45:29,750 in precisely the right location, there 872 00:45:29,750 --> 00:45:32,330 would be no difference between those sounds, 873 00:45:32,330 --> 00:45:35,330 OK, regardless of what the director said. 874 00:45:35,330 --> 00:45:37,620 OK, so that's the problem. 875 00:45:37,620 --> 00:45:39,950 So that's a way of using the Fourier 876 00:45:39,950 --> 00:45:44,570 analysis to gain some insight into some anomalous situations. 877 00:45:44,570 --> 00:45:45,319 Yeah? 878 00:45:45,319 --> 00:45:46,777 AUDIENCE: Does this have more to do 879 00:45:46,777 --> 00:45:51,070 with the rate at which you're sampling? 880 00:45:51,070 --> 00:45:52,600 PROFESSOR: No. 881 00:45:52,600 --> 00:45:56,620 It has to do only with the frequency content 882 00:45:56,620 --> 00:45:59,800 of the glottis waveform. 883 00:45:59,800 --> 00:46:01,450 You can think about it as sampling. 884 00:46:01,450 --> 00:46:09,460 And that's a good insight, because the Fourier series only 885 00:46:09,460 --> 00:46:13,990 has components at integer multiples of a base frequency. 
886 00:46:13,990 --> 00:46:18,680 So that means we're sampling in frequency, not in time. 887 00:46:18,680 --> 00:46:23,860 So we have this potentially continuous frequency response, 888 00:46:23,860 --> 00:46:26,230 which is characterizing this. 889 00:46:26,230 --> 00:46:27,250 That is continuous. 890 00:46:27,250 --> 00:46:29,680 I could excite this at any frequency that I want to. 891 00:46:29,680 --> 00:46:32,920 But the glottis waveform of the singer 892 00:46:32,920 --> 00:46:37,150 is only sampling that at particular frequencies-- 893 00:46:37,150 --> 00:46:45,010 C, C prime, C double prime, B, B prime, B double prime, right? 894 00:46:45,010 --> 00:46:47,410 So there's only certain frequencies at which 895 00:46:47,410 --> 00:46:51,625 the singer is sampling this. 896 00:46:51,625 --> 00:46:53,750 So there is a way of thinking about it as sampling. 897 00:46:53,750 --> 00:46:55,904 But it's not sampling due to my A/D converter 898 00:46:55,904 --> 00:46:56,820 or anything like that. 899 00:46:56,820 --> 00:46:59,450 It's sampling in frequency. 900 00:46:59,450 --> 00:47:03,790 So the point is that this kind of a source filter idea, 901 00:47:03,790 --> 00:47:05,830 and more generally, the filter idea, 902 00:47:05,830 --> 00:47:08,770 is such a powerful representation 903 00:47:08,770 --> 00:47:10,270 that next time we'll think about how 904 00:47:10,270 --> 00:47:15,020 to do the same sort of thing for non-periodic [? stimuli. ?] 905 00:47:15,020 --> 00:47:16,860 See you then.