1 00:00:00,090 --> 00:00:02,490 The following content is provided under a Creative 2 00:00:02,490 --> 00:00:04,030 Commons license. 3 00:00:04,030 --> 00:00:06,330 Your support will help MIT OpenCourseWare 4 00:00:06,330 --> 00:00:10,720 continue to offer high quality educational resources for free. 5 00:00:10,720 --> 00:00:13,320 To make a donation or view additional materials 6 00:00:13,320 --> 00:00:17,280 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,280 --> 00:00:18,450 at ocw.mit.edu. 8 00:00:23,230 --> 00:00:24,804 [MUSIC PLAYING] 9 00:00:52,670 --> 00:00:54,810 ALAN OPPENHEIM: Hi, I'm Alan Oppenheim. 10 00:00:54,810 --> 00:00:57,750 And I'd like to welcome you to this self-study course 11 00:00:57,750 --> 00:01:00,450 on digital signal processing. 12 00:01:00,450 --> 00:01:04,290 The fact that you're interested in taking the course 13 00:01:04,290 --> 00:01:08,310 suggests that you're probably aware of the important role 14 00:01:08,310 --> 00:01:12,120 that digital techniques have been playing in signal 15 00:01:12,120 --> 00:01:14,100 processing, in general. 16 00:01:14,100 --> 00:01:17,820 And in fact, the impact of digital technology 17 00:01:17,820 --> 00:01:19,990 has been rather dramatic. 18 00:01:19,990 --> 00:01:22,170 And the indications are that it will 19 00:01:22,170 --> 00:01:26,160 be even more so in the future. 20 00:01:26,160 --> 00:01:30,210 One of the primary advantages to digital as 21 00:01:30,210 --> 00:01:33,270 opposed to analog signal processing techniques 22 00:01:33,270 --> 00:01:37,470 is the tremendous flexibility that digital techniques 23 00:01:37,470 --> 00:01:39,930 and digital signal processing offers. 24 00:01:39,930 --> 00:01:43,860 And because of this flexibility, digital signal processing 25 00:01:43,860 --> 00:01:45,900 techniques have found application 26 00:01:45,900 --> 00:01:51,690 in a rather large or wide variety of areas. 27 00:01:51,690 --> 00:01:55,080 Speech processing, for example, has represented 28 00:01:55,080 --> 00:01:57,870 one of the major areas of application 29 00:01:57,870 --> 00:02:02,490 of digital signal processing for, at least, the past decade. 30 00:02:02,490 --> 00:02:06,660 Both analysis of speech and synthesis of speech 31 00:02:06,660 --> 00:02:09,960 rely very heavily on the notions of, 32 00:02:09,960 --> 00:02:13,530 for example, digital filtering, other notions, 33 00:02:13,530 --> 00:02:17,100 such as the fast Fourier transform algorithm, 34 00:02:17,100 --> 00:02:19,800 and a variety of the other digital signal 35 00:02:19,800 --> 00:02:23,490 processing techniques and algorithms. 36 00:02:23,490 --> 00:02:26,460 More generally, in communication systems, 37 00:02:26,460 --> 00:02:30,780 digital signal processing is being used for coding, 38 00:02:30,780 --> 00:02:34,020 for multiplexing and, in fact, there 39 00:02:34,020 --> 00:02:38,070 is a considerable amount of work being done at present directed 40 00:02:38,070 --> 00:02:42,120 toward, basically, replacing all of the present filtering 41 00:02:42,120 --> 00:02:44,970 in communications and telephone systems 42 00:02:44,970 --> 00:02:48,280 by digital filters instead of analog filters. 43 00:02:48,280 --> 00:02:50,400 And I think that it's likely that, in the not 44 00:02:50,400 --> 00:02:52,770 too distant future, we'll see most 45 00:02:52,770 --> 00:02:56,610 of the filtering in communication systems being 46 00:02:56,610 --> 00:02:59,160 done digitally. 47 00:02:59,160 --> 00:03:01,830 Seismic data processing represents 48 00:03:01,830 --> 00:03:04,170 another very important area in which 49 00:03:04,170 --> 00:03:06,660 the flexibility of digital signal processing 50 00:03:06,660 --> 00:03:08,970 is very heavily exploited. 51 00:03:08,970 --> 00:03:11,730 In fact, seismic and speech processing 52 00:03:11,730 --> 00:03:16,170 have probably been the two major catalysts 53 00:03:16,170 --> 00:03:19,590 for most of the important developments 54 00:03:19,590 --> 00:03:21,090 in digital signal processing. 55 00:03:23,790 --> 00:03:26,160 In audio recording and processing, 56 00:03:26,160 --> 00:03:29,070 digital signal processing provides an opportunity 57 00:03:29,070 --> 00:03:34,770 for some very sophisticated processing and enhancement. 58 00:03:34,770 --> 00:03:39,540 And in fact, fairly recently, Professor Thomas Stockham 59 00:03:39,540 --> 00:03:42,000 at the University of Utah has been 60 00:03:42,000 --> 00:03:44,490 applying some sophisticated digital signal 61 00:03:44,490 --> 00:03:49,260 processing techniques to the restoration of old Caruso 62 00:03:49,260 --> 00:03:51,700 recordings. 63 00:03:51,700 --> 00:03:56,610 The problem, in that case, is that the recordings that 64 00:03:56,610 --> 00:04:00,150 were originally made in the days of Caruso 65 00:04:00,150 --> 00:04:06,090 involved a recording horn, as I've illustrated here. 66 00:04:06,090 --> 00:04:09,690 The singer, of course, singing into the recording horn. 67 00:04:09,690 --> 00:04:12,630 And the output of the recording horn being 68 00:04:12,630 --> 00:04:16,709 stored on a recording disk. 69 00:04:16,709 --> 00:04:20,700 The problem in that particular application 70 00:04:20,700 --> 00:04:25,440 is the fact that the frequency response of the recording horn 71 00:04:25,440 --> 00:04:26,340 is not flat. 72 00:04:26,340 --> 00:04:31,830 And what this tended to do is give the resulting recording 73 00:04:31,830 --> 00:04:35,400 a, sort of, megaphone type of distortion. 74 00:04:35,400 --> 00:04:39,870 And one of the objectives in enhancing or restoring 75 00:04:39,870 --> 00:04:42,630 some of these old Caruso recordings 76 00:04:42,630 --> 00:04:47,010 is to compensate, in a sense, for the frequency 77 00:04:47,010 --> 00:04:51,960 distortion introduced by the recording horn. 78 00:04:51,960 --> 00:04:54,390 What Professor Stockham has done, 79 00:04:54,390 --> 00:04:59,610 basically, is to use digital signal processing techniques 80 00:04:59,610 --> 00:05:02,460 to, first of all, estimate the frequency 81 00:05:02,460 --> 00:05:04,860 response of the recording horn. 82 00:05:04,860 --> 00:05:09,120 And second, to compensate for that frequency response. 83 00:05:09,120 --> 00:05:11,670 And all of the work that he carried out 84 00:05:11,670 --> 00:05:15,930 was done digitally, primarily, as I indicated previously, 85 00:05:15,930 --> 00:05:20,580 primarily to capitalize on the flexibility 86 00:05:20,580 --> 00:05:22,890 that digital signal processing offers. 87 00:05:22,890 --> 00:05:27,520 And some of the results that he obtained are rather dramatic. 88 00:05:27,520 --> 00:05:33,300 And let me just illustrate in a very short passage what 89 00:05:33,300 --> 00:05:36,150 some of this has sounded like. 90 00:05:36,150 --> 00:05:39,060 What I borrowed from Professor Stockham 91 00:05:39,060 --> 00:05:43,110 is a recording of the restoration 92 00:05:43,110 --> 00:05:46,140 that he generated on a digital computer. 93 00:05:46,140 --> 00:05:50,610 And this particular recording is a two track recording 94 00:05:50,610 --> 00:05:54,500 with the original segment recorded 95 00:05:54,500 --> 00:05:58,970 on channel 1 and the process segment recorded on channel 2. 96 00:05:58,970 --> 00:06:03,110 And that will allow us to switch back and forth between these. 97 00:06:03,110 --> 00:06:07,700 The particular piece that is illustrated here 98 00:06:07,700 --> 00:06:11,420 is a section from the famous aria "Vesti La Giubba" 99 00:06:11,420 --> 00:06:14,870 as sung, of course, by Enrico Caruso. 100 00:06:14,870 --> 00:06:17,420 So let me just quickly illustrate this 101 00:06:17,420 --> 00:06:21,560 as an example of some of the type of processing 102 00:06:21,560 --> 00:06:25,310 that is currently being done using digital signal processing 103 00:06:25,310 --> 00:06:27,530 techniques. 104 00:06:27,530 --> 00:06:30,740 Let me begin, first of all, by playing 105 00:06:30,740 --> 00:06:33,060 a little bit of the original. 106 00:06:33,060 --> 00:06:38,420 And then, I'll switch to the result of Professor Stockholm's 107 00:06:38,420 --> 00:06:39,590 enhancement. 108 00:06:39,590 --> 00:06:42,410 And then, switch back and forth a few times 109 00:06:42,410 --> 00:06:44,690 to present a comparison. 110 00:06:44,690 --> 00:06:49,970 So we begin, first of all, with the original. 111 00:06:49,970 --> 00:06:52,460 [MUSIC ENRICO CARUSO, "VESTI LA GIUBBA"] 112 00:06:54,960 --> 00:06:57,356 And then switch to the enhanced. 113 00:06:57,356 --> 00:07:03,450 [MUSIC ENRICO CARUSO, "VESTI LA GIUBBA"] 114 00:07:03,450 --> 00:07:04,550 Back to the original. 115 00:07:04,550 --> 00:07:06,334 [MUSIC ENRICO CARUSO, "VESTI 116 00:07:06,334 --> 00:07:09,792 LA GIUBBA"] 117 00:07:09,792 --> 00:07:12,040 And then, once more, to the enhanced. 118 00:07:12,040 --> 00:07:22,250 [MUSIC ENRICO CARUSO, "VESTI LA GIUBBA"] 119 00:07:22,250 --> 00:07:24,740 And I think what you can observe with that 120 00:07:24,740 --> 00:07:29,780 is a fairly dramatic increase in the improvement in the quality 121 00:07:29,780 --> 00:07:30,590 of the recording. 122 00:07:30,590 --> 00:07:34,610 Primarily, the megaphone type of quality in the original 123 00:07:34,610 --> 00:07:36,980 has been, essentially, eliminated. 124 00:07:36,980 --> 00:07:40,370 Now to go even further in illustrating 125 00:07:40,370 --> 00:07:45,540 some of the flexibility of digital signal processing. 126 00:07:45,540 --> 00:07:47,180 One of the things that we observe 127 00:07:47,180 --> 00:07:50,210 with this particular recording and this particular example 128 00:07:50,210 --> 00:07:53,930 is that, although there is some enhancement that's 129 00:07:53,930 --> 00:07:58,850 been implemented, there still is some background noise that 130 00:07:58,850 --> 00:08:03,170 is present in, both, the original and the enhanced 131 00:08:03,170 --> 00:08:04,490 or restored. 132 00:08:04,490 --> 00:08:06,410 And so one of the things, obviously, 133 00:08:06,410 --> 00:08:11,520 that we would like to do is remove this background noise. 134 00:08:11,520 --> 00:08:15,830 In fact, using some rather sophisticated signal processing 135 00:08:15,830 --> 00:08:21,080 techniques, Professor Stockham, together with Neil J. Miller, 136 00:08:21,080 --> 00:08:25,560 have not only removed the background noise 137 00:08:25,560 --> 00:08:28,880 but, with the same processing, removed also 138 00:08:28,880 --> 00:08:30,890 the orchestral accompaniment. 139 00:08:30,890 --> 00:08:33,770 Now this is, first of all, rather dramatic. 140 00:08:33,770 --> 00:08:37,370 Second of all, somewhat useful in the sense that in carrying 141 00:08:37,370 --> 00:08:40,559 out a complete restoration one can imagine then 142 00:08:40,559 --> 00:08:48,560 redubbing a new orchestra on top of the restored recording. 143 00:08:48,560 --> 00:08:51,560 So let me just play a little bit of this to, in fact, 144 00:08:51,560 --> 00:08:55,910 show you that it really has been possible to not only remove 145 00:08:55,910 --> 00:08:58,820 the background noise but also to remove 146 00:08:58,820 --> 00:09:02,090 the orchestral accompaniment. 147 00:09:02,090 --> 00:09:07,520 So first let me move forward on the tape to the right place. 148 00:09:14,610 --> 00:09:19,800 And what you'll hear now in this case, on channel 1, 149 00:09:19,800 --> 00:09:28,590 is the Caruso recording restored as I indicated previously. 150 00:09:28,590 --> 00:09:32,850 And on channel 2 is the result of further processing, 151 00:09:32,850 --> 00:09:35,460 the restored recording, to eliminate, 152 00:09:35,460 --> 00:09:41,190 both, the background noise and also the orchestra. 153 00:09:41,190 --> 00:09:43,410 So we'll begin with the restored, 154 00:09:43,410 --> 00:09:44,760 which includes the orchestra. 155 00:09:44,760 --> 00:09:46,776 And then, the orchestra removed. 156 00:09:46,776 --> 00:09:49,056 [MUSIC ENRICO CARUSO, "VESTI LA 157 00:09:49,056 --> 00:09:49,947 GIUBBA"] 158 00:09:49,947 --> 00:09:51,030 That's with the orchestra. 159 00:09:51,030 --> 00:09:54,204 [MUSIC ENRICO CARUSO, "VESTI LA GIUBBA"] 160 00:09:54,204 --> 00:09:56,730 And with the orchestra and the background noise removed. 161 00:09:56,730 --> 00:10:10,590 [MUSIC ENRICO CARUSO, "VESTI LA GIUBBA"] 162 00:10:10,590 --> 00:10:13,535 And then, once more, back to the orchestral [INAUDIBLE].. 163 00:10:13,535 --> 00:10:14,870 [MUSIC ENRICO CARUSO, 164 00:10:14,870 --> 00:10:18,440 "VESTI LA GIUBBA"] 165 00:10:18,440 --> 00:10:20,840 Well I think that you'll probably 166 00:10:20,840 --> 00:10:24,170 have to admit that, in fact, it's 167 00:10:24,170 --> 00:10:27,320 a rather dramatic example of some sophisticated digital 168 00:10:27,320 --> 00:10:30,180 signal processing. 169 00:10:30,180 --> 00:10:34,350 Another area in which digital signal processing 170 00:10:34,350 --> 00:10:38,280 has tremendous potential is in the area 171 00:10:38,280 --> 00:10:41,670 of digital image processing. 172 00:10:41,670 --> 00:10:46,230 And in that case, the basic notion 173 00:10:46,230 --> 00:10:52,350 is to treat an image as a two dimensional signal digitized, 174 00:10:52,350 --> 00:10:52,980 of course. 175 00:10:52,980 --> 00:10:55,500 And one is then afforded the possibility 176 00:10:55,500 --> 00:10:58,410 of applying digital signal processing techniques 177 00:10:58,410 --> 00:11:01,710 to the two dimensional signal. 178 00:11:01,710 --> 00:11:05,790 For example, in a very simple signal processing environment, 179 00:11:05,790 --> 00:11:10,240 we might be interested in low pass filtering a digital image. 180 00:11:10,240 --> 00:11:15,060 For example, if the image has considerable grain noise, 181 00:11:15,060 --> 00:11:18,000 grain noise, in fact, tends to be high frequency. 182 00:11:18,000 --> 00:11:22,200 And low pass filtering then will tend to reduce or eliminate 183 00:11:22,200 --> 00:11:23,850 noise of that type. 184 00:11:23,850 --> 00:11:27,360 Or we might be interested in high pass filtering. 185 00:11:27,360 --> 00:11:32,190 For example, if we wanted to enhance edges in a picture, 186 00:11:32,190 --> 00:11:35,370 the procedure would be or one procedure 187 00:11:35,370 --> 00:11:41,370 might be to apply a two dimensional high pass filter. 188 00:11:41,370 --> 00:11:45,600 More elaborately, we could consider 189 00:11:45,600 --> 00:11:50,220 some processing, which is directed at general image 190 00:11:50,220 --> 00:11:51,480 enhancement. 191 00:11:51,480 --> 00:11:55,530 And one example that I'd like to show you is some digital image 192 00:11:55,530 --> 00:11:58,620 processing that was carried out directed 193 00:11:58,620 --> 00:12:04,560 at simultaneously reducing the dynamic range of an image 194 00:12:04,560 --> 00:12:08,490 and increasing the contrast of the image. 195 00:12:08,490 --> 00:12:10,230 Generally, photographically, these 196 00:12:10,230 --> 00:12:12,120 are conflicting requirements. 197 00:12:12,120 --> 00:12:14,820 But with some sophisticated processing, 198 00:12:14,820 --> 00:12:19,410 it's possible to simultaneously reduce the dynamic range 199 00:12:19,410 --> 00:12:21,930 and increase the contrast. 200 00:12:21,930 --> 00:12:26,430 To illustrate how this works with an example, 201 00:12:26,430 --> 00:12:31,170 the first slide that I'll show you is an original image, which 202 00:12:31,170 --> 00:12:36,750 is, of course, a digital image displayed on a computer scope. 203 00:12:36,750 --> 00:12:39,630 And one of the things that we notice about that image 204 00:12:39,630 --> 00:12:42,330 is that it has a rather high dynamic range. 205 00:12:42,330 --> 00:12:45,960 For example, the snow outside the boiler room 206 00:12:45,960 --> 00:12:47,280 is rather bright. 207 00:12:47,280 --> 00:12:50,070 The inside of the boiler room is dark. 208 00:12:50,070 --> 00:12:53,550 And of course, the contrast inside the boiler room 209 00:12:53,550 --> 00:12:57,540 is relatively low because of the fact 210 00:12:57,540 --> 00:13:01,960 that the illumination inside the boiler room is relatively low. 211 00:13:01,960 --> 00:13:04,800 So one type of processing that we could consider 212 00:13:04,800 --> 00:13:08,940 is the simultaneous enhancement of contrast, 213 00:13:08,940 --> 00:13:12,810 and reduction of dynamic range, and applying some two 214 00:13:12,810 --> 00:13:15,000 dimensional signal processing. 215 00:13:15,000 --> 00:13:21,510 The result is what I show you on the next slide 216 00:13:21,510 --> 00:13:25,290 where, here, we've processed to bring out 217 00:13:25,290 --> 00:13:28,140 the detail inside the boiler room. 218 00:13:28,140 --> 00:13:31,770 You can notice that the dynamic range, in fact, is reduced. 219 00:13:31,770 --> 00:13:34,860 The snow is darker than it was in the original. 220 00:13:34,860 --> 00:13:38,610 The boiler room is brighter than it was in the original, 221 00:13:38,610 --> 00:13:41,010 suggesting reduced dynamic range. 222 00:13:41,010 --> 00:13:45,780 But also, the contrast is very clearly enhanced. 223 00:13:45,780 --> 00:13:50,200 Just as another example of the same type of processing. 224 00:13:50,200 --> 00:13:53,100 First, let's look at an original where 225 00:13:53,100 --> 00:13:56,020 we observe that there's a brightly illuminated area, 226 00:13:56,020 --> 00:13:58,980 which is where the radome moves is. 227 00:13:58,980 --> 00:14:01,920 And then, a more dimly illuminated area. 228 00:14:01,920 --> 00:14:04,092 The details in the right hand corner 229 00:14:04,092 --> 00:14:05,175 with the trees and leaves. 230 00:14:08,940 --> 00:14:11,850 And as a result of processing to, 231 00:14:11,850 --> 00:14:16,890 again, increase the contrast and reduce the dynamic range, 232 00:14:16,890 --> 00:14:20,130 we see in the resulting processed image 233 00:14:20,130 --> 00:14:25,590 that the detail in the dimly illuminated areas, 234 00:14:25,590 --> 00:14:29,650 in fact, is brought out rather dramatically. 235 00:14:29,650 --> 00:14:37,560 So this is one example of some rather sophisticated 236 00:14:37,560 --> 00:14:39,780 digital signal processing applied 237 00:14:39,780 --> 00:14:41,010 to two dimensional signals. 238 00:14:41,010 --> 00:14:42,870 Namely, to images. 239 00:14:42,870 --> 00:14:44,970 And I should mention incidentally 240 00:14:44,970 --> 00:14:48,990 that the type of processing that was applied for this image 241 00:14:48,990 --> 00:14:53,040 enhancement is discussed in considerable detail in chapter 242 00:14:53,040 --> 00:14:55,540 10 of the text. 243 00:14:55,540 --> 00:15:00,300 Now there are, of course, a long list of other applications 244 00:15:00,300 --> 00:15:02,340 of digital signal processing. 245 00:15:02,340 --> 00:15:04,650 In the biomedical area, for example, 246 00:15:04,650 --> 00:15:06,780 digital signal processing techniques 247 00:15:06,780 --> 00:15:09,480 are playing a very important role. 248 00:15:09,480 --> 00:15:12,990 In radar and sonar, those are two additional areas 249 00:15:12,990 --> 00:15:14,760 in which digital signal processing 250 00:15:14,760 --> 00:15:16,440 is extremely important. 251 00:15:16,440 --> 00:15:18,840 And I'm sure that there are other areas 252 00:15:18,840 --> 00:15:21,780 that you're aware of that I, perhaps, might not 253 00:15:21,780 --> 00:15:26,790 be where digital signal processing is particularly 254 00:15:26,790 --> 00:15:31,010 important because of its flexibility. 255 00:15:33,900 --> 00:15:36,630 Specifically, in this course, we won't 256 00:15:36,630 --> 00:15:39,600 be focusing on applications. 257 00:15:39,600 --> 00:15:42,660 Although, it's important to keep in mind as we 258 00:15:42,660 --> 00:15:45,300 go through the course that the material 259 00:15:45,300 --> 00:15:49,620 that we're talking about has very important applications. 260 00:15:49,620 --> 00:15:52,470 In the course, we'll be concentrating specifically 261 00:15:52,470 --> 00:15:56,670 on the fundamentals and techniques of digital signal 262 00:15:56,670 --> 00:15:59,310 processing. 263 00:15:59,310 --> 00:16:02,460 As I've indicated in the study guide, 264 00:16:02,460 --> 00:16:05,580 I will be assuming that you previously 265 00:16:05,580 --> 00:16:10,470 have had a course in linear system theory, continuous time 266 00:16:10,470 --> 00:16:15,810 linear system theory, including Fourier and Laplace transforms. 267 00:16:15,810 --> 00:16:20,310 But I will not be assuming any specific background 268 00:16:20,310 --> 00:16:24,060 in discrete time signals and systems in Z-transforms, 269 00:16:24,060 --> 00:16:24,990 et cetera. 270 00:16:24,990 --> 00:16:29,160 In fact, in the next lecture, lecture two, 271 00:16:29,160 --> 00:16:31,110 we will begin from the beginning. 272 00:16:31,110 --> 00:16:34,590 Namely, we will begin with a definition of discrete time 273 00:16:34,590 --> 00:16:36,250 signals and systems. 274 00:16:36,250 --> 00:16:40,710 And if you feel a little rusty on basic linear system 275 00:16:40,710 --> 00:16:42,660 theory for continuous time systems, 276 00:16:42,660 --> 00:16:45,809 it might be well before then to do 277 00:16:45,809 --> 00:16:47,100 just a little bit of reviewing. 278 00:16:47,100 --> 00:16:51,600 And I suggest some possible books in the study guide. 279 00:16:51,600 --> 00:16:55,950 I would also suggest, before beginning the next lecture, 280 00:16:55,950 --> 00:17:00,540 that you read the introduction to the text. 281 00:17:00,540 --> 00:17:05,520 And perhaps, also browse through the text and the study guide 282 00:17:05,520 --> 00:17:09,480 to get a general impression of the scope of the course, 283 00:17:09,480 --> 00:17:13,020 and some of the objectives of the course. 284 00:17:13,020 --> 00:17:15,810 As I indicated, next time, we will 285 00:17:15,810 --> 00:17:19,319 begin with the definition of discrete time signals 286 00:17:19,319 --> 00:17:21,180 and systems. 287 00:17:21,180 --> 00:17:24,180 I sincerely hope that you find this set of lessons 288 00:17:24,180 --> 00:17:27,010 to be interesting and worthwhile. 289 00:17:27,010 --> 00:17:29,590 And I'll see you at the next lecture. 290 00:17:29,590 --> 00:17:31,200 Thank you. 291 00:17:31,200 --> 00:17:33,650 [MUSIC PLAYING]