Topics covered: High and low points of a curve; techniques for finding them; applications to finding maxima and minima of functions; physical applications.
Instructor/speaker: Prof. Herbert Gross
Lecture 8: Maxima and Minima
Related Resources
This section contains documents that are inaccessible to screen reader software. A "#" symbol is used to denote such documents.
Part II Study Guide (PDF - 29MB)#
Supplementary Notes (PDF - 46MB)#
Blackboard Photos (PDF - 8MB)#
ANNOUNCER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free.
To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
HERBERT GROSS: Hi, our lecture today, if we're looking at this from an analytical point of view, should be called 'Maxima and Minima'. And if we're looking at it from a geometric point of view, 'High Points and Low Points'. And actually, whichever point of view we're looking at it from, it's a very nice application of much of the theory that we have learned up until now. So without further ado, let's talk a little bit about what we mean by high points, low points, maxima or minima. Which is, as I say, what I've called the lecture today.
Now, as is always the case, we usually have to have some sort of a fundamental result from which all of our other results follow. And in this context what I call the fundamental theorem for a study of maxima minima is the following.
Suppose that 'f of c' is at least as great as 'f of x' for all 'x' in a delta neighborhood of 'c'. In other words, we have some open interval with delta surrounding 'c'. And then for all 'x' in that neighborhood, 'f of c' is at least as great as 'f of x'. Or alternatively, it might be that 'f of c' is less than or equal to 'f of x' for all 'x' in this neighborhood. And suppose also, that 'f prime of c' exists. Then the theorem says, 'f prime of c' in this case, must be 0. And whereas this can be proven analytically, again the analytical proof is motivated by what's happening geometrically. And the geometric demonstration is particularly easy to visualize in this particular case. Let me run the risk of drawing this fairly freehand here. See what we're saying is this.
Suppose you have at the point (c, 'f of c') on this particular curve. Suppose that this particular curve the derivative exists and is positive. See that's the way I've drawn this. In other words, notice that in the neighborhood of the point 'c' here, for example, the derivative here is positive. The curve is always rising.
Now, what we're saying over here is that if you look at this particular picture, observe that 'f of c' will not exceed 'f of x' for all 'x' in this neighborhood. In fact, I think you can see just by looking at this picture that as 'x' increases, 'f of x' increases. In other words, in terms of this picture, if 'x' is less than 'c', 'f of x' is less than 'f of c'. And if 'x' is greater than 'c', 'f of x' is greater than 'f of c'.
In other words, what we're saying is if the derivative is positive, it means that the curve is always rising. And hence, the point in the middle of that neighborhood cannot be the highest point in that neighborhood. Nor, for example, can it be the lowest point. In fact, a similar argument holds if we want to illustrate that 'f of c' is less than or equal to 'f of x' in this particular neighborhood.
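In symbols, the theorem and the geometric argument just given can be summarized as follows. This is only a restatement of what was said at the board; the notation $N_\delta(c)$ for the delta neighborhood matches the labelling used later in the lecture.

```latex
\[
N_\delta(c) = (c-\delta,\; c+\delta), \qquad \delta > 0 .
\]
% The theorem: a local high or low point at c, together with existence of f'(c), forces f'(c) = 0.
\[
\Bigl( f(c) \ge f(x)\ \ \forall x \in N_\delta(c)
\quad\text{or}\quad
f(c) \le f(x)\ \ \forall x \in N_\delta(c) \Bigr)
\ \text{and}\ f'(c)\ \text{exists}
\;\Longrightarrow\; f'(c) = 0 .
\]
% The picture: if f'(c) > 0, then in a sufficiently small neighborhood
% points to the left lie below and points to the right lie above.
\[
f'(c) > 0 \;\Longrightarrow\;
\begin{cases}
f(x) < f(c), & c-\delta < x < c,\\[2pt]
f(x) > f(c), & c < x < c+\delta,
\end{cases}
\quad\text{for some } \delta > 0 .
\]
```

The case where 'f prime of c' is negative is handled the same way with the inequalities reversed, so the only value left for 'f prime of c' at a local high or low point is 0.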
And by the way, again, observe what we're saying. This is rather crucial over here. First of all, we're talking about a sufficiently small neighborhood of 'c'. What I mean by that is something like this.
And by the way, this could cause a little bit of a problem if you're not careful with your language. If we drew a picture like this and you say to somebody, where are the high-low points on this? I think it's quite natural that the person would say, well, this is the high point. I'll call that 'H'. And this is the low point. I'll call that 'L'. And he might tend to forget about this point here because even though it's fairly high, it's not nearly as high as this point. What I'd like you to see however is that our definition talks about what? In a neighborhood of the given point.
You see what we're saying here is that if we pick a particular neighborhood, an appropriately chosen neighborhood surrounding 'c', then what is true is that at the value of 'x' corresponding to 'c', 'f of c' is the highest point in a suitable neighborhood of 'c'. In other words, if you knew that for some reason or other, you had to be working in this neighborhood here, you could say, well, in the domain in which I'm interested in, this is a high point. And this is why we talk about 'local' or 'relative' maxima or minima in addition to 'absolute' maxima and minima.
In fact, you see, this happens quite frequently in practice. Suppose you were doing an experiment and you really wanted to produce the largest possible value of 'y'. Well, you see, ideally you would like to pick 'x' out here someplace. But suppose because of some constraint in the laboratory, the largest value of 'x' that you could choose might be over here. And then you see what the equivalent problem would be. And this is where the domain of the function has such a powerful meaning in terms of practical applications.
What you're saying is look it, if the domain of my function is limited to this interval over here, then this particular point as far as I'm concerned is the highest point. In other words then, notice that the labelling 'N sub 'delta of c'' indicates that you're talking locally rather than globally in a neighborhood of 'c'. And this is the important issue here.
Now, the hardest part about this particular result as I see it, is not understanding the result as much as it is of reading more into the result than what's really there. And so I have a few cautions for you. The three commandments they turn out to be.
The first is, beware of false converses. And before you can beware of false converses, you have to know what a converse is. Roughly speaking, a converse applies only to an if-then type of statement. And you get the converse of a given statement by interchanging the clauses that follow the if and the then. And the reason you have to beware is that a true statement can have a false converse.
For example, consider the following true statement. If a person is listening to me lecture now, then he is alive. Hopefully, a true statement.
If we now interchange the clauses to form the converse it says if a person is alive, then he is listening to me lecture. A tragically false statement.
But notice the difference that interchanging the clauses of an if-then statement makes. And the idea is this. Notice that our theorem says if 'f prime of c' exists and there is a local maximum or a local minimum at 'c', then 'f prime of c' is 0. It does not say that if 'f prime of c' is 0 we have a local maximum or a local minimum. Perhaps the easiest way to see that is in terms of an example. That's the nicest way to show that a converse is false. To prove that something is true, you can't do it just by showing it's true in certain cases. But to show that something is not true, all you have to do is exhibit one example in which the result is false. That's enough to prove that it can't always be true.
For example, in this particular diagram, notice that in the curve 'y' equals 'f of x', which I've drawn here, the curve at the point 'c' comma 'f of c' has a horizontal tangent. In other words, 'f prime of c' is 0 here. But notice that in any neighborhood that surrounds 'c', in any open interval that's around 'c', notice that if 'x' is less than 'c', 'f of x' will be less than 'f of c'. And if 'x' is greater than 'c', 'f of x' will be greater than 'f of c'. In other words, notice that except for the stationary value at which we have a horizontal tangent, the graph is always rising in any neighborhood of 'c'. That's the first point I want you to see.
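The lecture does not name the curve drawn on the board. As an assumed stand-in, 'f of x' equals 'x cubed' with 'c' equal to 0 has exactly the behavior described: a horizontal tangent at 'c', yet the graph is below 'f of c' on the left and above it on the right. A minimal numerical sketch of that check:

```python
# Hypothetical example (not named in the lecture): f(x) = x**3 at c = 0.
def f(x):
    return x ** 3

def f_prime(x):
    # Derivative of x**3, worked out by hand.
    return 3 * x ** 2

c = 0.0

# The tangent at c is horizontal.
tangent_is_horizontal = (f_prime(c) == 0.0)

# Sample points on both sides of c in smaller and smaller neighborhoods.
deltas = [1.0, 0.1, 0.01, 0.001]
below_on_left = all(f(c - d) < f(c) for d in deltas)
above_on_right = all(f(c + d) > f(c) for d in deltas)

print(tangent_is_horizontal, below_on_left, above_on_right)
```

All three checks hold, so 'f prime of c' being 0 did not produce a high or a low point here. One counterexample is enough to show the converse is false.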
The second point says beware if 'f prime of c' doesn't exist. See all our theorem said is, look it. If you have a relative high point, a relative low point, relative max, or a relative min at 'x' equal 'c', and if 'f prime of c' exists, then 'f prime of c' is 0. But who said that 'f prime of c' has to exist?
Again, let's look in terms of an example. If we let 'f of x' equal the absolute value of 'x', recall from our previous assignments and the like that the derivative of 'f of x', in this case, does not exist when 'x' is 0. In other words, 'f prime of 0' doesn't exist. And why is that? Well, notice that the graph of 'y' equals the absolute value of 'x' is the straight line 'y' equals 'x' if 'x' is non-negative and the straight line 'y' equals 'minus x' if 'x' is negative. In other words, 'f prime of x' is 1 if 'x' is positive. It's minus 1 if 'x' is negative. And hence, a jump discontinuity in the derivative at 0.
But you see, the point is this. If you look at our graph, do we have a low point at 'x' equals 0? In other words, is this the lowest point in the neighborhood of 0? And the answer is yes. In fact, it's an 'absolute' low point. Meaning that no matter where you go, no point on our graph can be less or lower than this particular point here. But that's irrelevant. The point is what? If we looked for a place where 'f prime of c' was 0 in this example, we wouldn't find one. That would not mean that there wasn't a low point. What happened was what? The low point snuck in at a place where the derivative did not exist.
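The behavior of the absolute value function at 0 can be checked numerically. The sketch below computes chord slopes from the right and from the left at 0; the helper name `difference_quotient` is chosen here only for illustration.

```python
def f(x):
    return abs(x)

def difference_quotient(func, c, h):
    # Slope of the chord from (c, func(c)) to (c + h, func(c + h)).
    return (func(c + h) - func(c)) / h

steps = [0.1, 0.01, 0.001]

# Approach 0 from the right (h > 0) and from the left (h < 0).
right_slopes = [difference_quotient(f, 0.0, h) for h in steps]
left_slopes = [difference_quotient(f, 0.0, -h) for h in steps]

# Every sampled point is at least as high as the point at x = 0.
is_low_point = all(f(x) >= f(0.0) for x in [-2.0, -0.5, -0.001, 0.001, 0.5, 2.0])

print(right_slopes, left_slopes, is_low_point)
```

The right-hand slopes stay at 1 and the left-hand slopes stay at minus 1, so no single limiting slope exists at 0, and yet 0 is still the low point. Setting the derivative equal to 0 would never have found it.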
And again, you have to be careful. All I said was what? Beware of points for which 'f prime of c' does not exist. It does not mean that at each point where 'f prime of c' does not exist that you're going to have a high or a low point.
For example, remember that the derivative not existing loosely speaking means what? That there's a sharp corner to the curve. What I'm thinking of is something like this.
Let's take a curve that's always rising, say something like this. Now, let's suppose we put a sharp corner in here, but in such a way that the curve still continues to always rise. Say like this.
Now you see, at this particular value of 'c', 'f prime of c' doesn't exist. I think you can see intuitively what's happening here. The slope approaches one value as you approach 'c' from the left and another value if you approach 'c' from the right.
Now even though 'f prime of c' doesn't exist at this point, notice that this point is neither a high nor a low point. Meaning that all points what? To the left of this point are below it and all points to the right of this point are above it. So again, what? Beware when 'f prime of c' doesn't exist, but don't jump to any false conclusions.
And the third caution is an extremely important one. In fact, for the first time from a practical point of view, we are going to probably see analytically what is the difference between an open interval and a closed interval.
Suppose I have the function 'f of x' equals 'x squared', but the domain of 'f' is now the open interval from 2 to 3. In other words, the inputs of the 'f' machine are restricted to all those numbers which are greater than 2, but less than 3.
Now, let's take a look here. Let's first of all, see where 'f prime of x' is 0. First of all, 'f prime of x' is '2x' and that equals 0 if and only if 'x' equals 0.
Now here's where the domain is very important. Is 'x' equal 0 in our domain? The answer is no. 'x' equals 0 is not in our domain. Our domain is restricted to be the open interval from 2 to 3. Therefore, as far as our function 'f' is concerned-- and remember, way back in one of our early, I don't even think it was a lecture. It was in our supplementary notes. We pointed out that when you define a function, you need not only the rule, but you must specify the domain.
Remember, two functions were equal not only if they were the same rule, but they had to be defined on the same domain. So f here is defined on a domain from 2 to 3. And on that particular domain of definition there is no place where the derivative is 0.
By the same token, since the function is a polynomial and all polynomials are differentiable, and all differentiable functions are continuous, there certainly will be no places where 'f prime of x' doesn't exist. In other words, 'f prime' exists for all 'x' in the domain of 'f'.
In other words, here is a particular example in which a particular function on its domain of definition does not have any high or low points on it. And I'll illustrate that graphically in a few moments. I simply wanted to put this on the board first to put it into sharp contrast with what we're going to say next. If we're not careful, in fact, the next problem looks exactly the same as the problem that we just solved. Namely, what we want to do now is investigate the function 'f of x' equals 'x squared'. But we want the domain of 'f' now to be what? The closed interval from 2 to 3. In other words, the only difference between this problem and the problem that we just solved is that now we want the endpoints included.
Now, there's no sense repeating the part that we did before. First of all, will 'f prime of x' equal 0 in the domain of definition? As we saw before, no.
Is 'f prime' nonexistent any place in the interval? No, it's differentiable. It's a smooth polynomial curve. Answer is no.
Now, here's where we come to two very important points that hopefully, will clarify certain conventions that were made in the textbook. In our section on continuity there was a little result that may have seemed a little bit obscure. It said what? That a continuous function defined on a closed interval must take on its maximum and minimum values someplace on that closed interval. You see, all we've proven over here is what?
That if 'f' has a max or min, what we've proven is what? It does not occur in the open interval from 'a' to 'b'. Well, look it. If the high and low points have to occur some place on the closed interval and they can't appear, as we've seen, in the open interval, where must they occur?
Well, if they can't be inside and they have to be on the interval, it must be that they take place at the endpoints.
Let's go back without referring back to another board. Remember we wrote down when we're talking about high and low points that we talked about 'x' being in a delta neighborhood of 'c'. It meant what? That you could surround 'c' by some bandwidth 'delta'. And notice that our definition in the textbook of a neighborhood was always an open interval. And the reason that the definition is given to be open is that notice that what the definition now says is what? You know what's going on on either side of 'c'. We know what's going on either side of 'c'.
Notice that in the case where you have a closed interval, by definition of a closed interval notice that what? We know what's happening as we come in on 'a' from the right. And we know what's happening to 'b' as we come in on it from the left. But since the function is only defined on the closed interval from 'a' to 'b', we don't know what's happening before 'a', and we don't know what's happening after 'b'. In other words, this is why the test for 'f prime of c' equaling 0 applies only to 'c' being in the interior. In other words, in an open interval.
So in other words then, you see if a function happens to be continuous on the closed interval and it doesn't have any high-low points in the interior of the interval, then it must have its high-low points where? It must be at the endpoints.
And to show you quite simply what was going on in this particular problem, notice that if we graph 'f of x' equals 'x squared' and look at this, say, first of all, on the closed interval from 2 to 3, what we're saying is look it, at any point that isn't 2 or 3, if we look at a neighborhood, the curve is rising on both sides of the point, so that if you are to the left of the point, the height will be less than the point you're interested in. If you're to the right of it, the height will exceed the point that you're interested in.
But notice that at the endpoints themselves, you do have what in this case? Not just a relative high or low, but actually an absolute high or low. In other words, the point '2 comma 4'. See, what are the heights at the endpoints here? One is 4, the other, going up in the y-direction here, is 9. What you're saying is what? 4 is less than 9. Every value of 'x squared' for 'x' between 2 and 3 falls between 4 and 9. And what you're saying is what? That the lowest point occurs when 'x' is 2, the highest point occurs when 'x' is 3.
The interesting point to note is that if you now look at the open interval and exclude the endpoints, you cannot get a lowest point or a highest point. Namely, notice that if you allow yourself to get as close to 2 as you want without ever getting there, it means you could have done what? Moved closer to 2. In other words, if 'x' is greater than 2, you can pick another value that's what? Between 'x' and 2. There's a space there. In other words, what you saying is that no matter how low you get here, as long as you're not exactly at 2, you can find the point that's lower. And in the same way as you move out this way, as you move closer and closer to this point, if you exclude this point itself, no matter where you stop, you could have always found the point that was a little bit higher.
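The "you could always have moved closer to 2" argument can be made concrete. In the sketch below, the helper `lower_point` (a name made up for this illustration) takes any 'x' in the open interval from 2 to 3 and returns the midpoint between 2 and 'x', which is still in the interval and gives a smaller value of 'x squared'.

```python
def f(x):
    return x ** 2

def lower_point(x):
    # Hypothetical helper: the midpoint between 2 and x.
    # If 2 < x < 3, the result is again strictly between 2 and x.
    return (2.0 + x) / 2.0

x = 2.5
trail = []
for _ in range(5):
    nxt = lower_point(x)
    trail.append((x, f(x), nxt, f(nxt)))
    x = nxt

for old_x, old_v, new_x, new_v in trail:
    print(old_x, old_v, "->", new_x, new_v)
```

Each step stays inside the open interval and lands on a lower point, so no point of the open interval can be the lowest. The same idea with the midpoint between 'x' and 3 shows there is no highest point either.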
And again, you have to beware. When I say check the endpoints of a closed interval, it does not mean that the endpoints will give you high or low points. For example, look at the following curve.
It looks something like this. It's continuous. It's defined on the closed interval from 'a' to 'b'. Notice that at 'a' we do not get an absolute maximum. In fact, what? All of these points on the curve are higher than what's happening here.
Same thing happens, what? All of these points are lower than what the height is corresponding to 'x' equals 'a'.
And in the similar way, this is what happens at 'b'. You see, with the endpoints, obviously since you can't see what's happening before, this will be either the highest point or the lowest point near here depending on how the curve is sloped. But all we're saying is what? In terms of absolute high values and absolute low values, meaning the highest possible points and the lowest possible points, we must always check the endpoints. But we can't be positive that the endpoints are going to be chosen.
In fact, let's summarize. And I'm going to summarize again at the end of the lecture. But the idea is this. If 'f of x' is continuous on the closed interval from 'a' to 'b'-- and here's the key word. To find candidates-- you know the old cliche about many are called, but few are chosen. In this case, few are called and even fewer are chosen. Namely, what we're saying is look it. When we're looking for high-low points, the candidates are what? Those points for which the derivative is 0. We can check those out because those are possibilities. Those points for which the derivative fails to exist. We can check those out because they're possibilities. And the endpoints. Namely, if the function is continuous, we check the endpoints.
By the way, if the function is not continuous, then there is no need to check-- well, let's put it this way. If you're on an open interval there's no need to check the endpoints because there aren't any. In other words, notice that I'm talking about what? That the function 'f' is not only continuous, but on a closed interval. That's all there is to this. In other words, these are all the possible candidates.
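The checklist for a continuous function on a closed interval can be sketched as a small routine. This is only an illustration under stated assumptions: the function name `extreme_values` is invented here, and the caller is assumed to supply the interior candidates by hand (the points where the derivative is 0 or fails to exist), since the code does not find them itself.

```python
def extreme_values(f, interior_candidates, a, b):
    """Compare f at the endpoints of [a, b] and at the supplied candidates.

    Assumes f is continuous on [a, b] and that interior_candidates lists
    every point where f' is 0 or fails to exist (found by hand).
    """
    # Candidates outside the open interval (a, b) are not in the domain.
    points = [a, b] + [c for c in interior_candidates if a < c < b]
    values = {x: f(x) for x in points}
    x_min = min(values, key=values.get)
    x_max = max(values, key=values.get)
    return (x_min, values[x_min]), (x_max, values[x_max])

# f(x) = x**2 on [2, 3]: f'(x) = 2x vanishes only at x = 0, outside the domain.
low, high = extreme_values(lambda x: x ** 2, [0.0], 2.0, 3.0)
print(low, high)
```

For 'x squared' on the closed interval from 2 to 3, the only candidate from the derivative is discarded because it is outside the domain, and the comparison falls to the endpoints: the low point is at 2 and the high point is at 3.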
Now you see the bigger question is, how do you use this? And I thought that I would make up a makeshift exercise, one that's rather easy to do at the blackboard. For deeper exercises, for more quantitative results, we have several exercises in the learning exercises. Several exercises worked out illustratively in the textbook. But let's pick a particularly straightforward example.
Let's suppose that what I want to do is construct a cylinder. This is a cross sectional view of a cylinder. 'x' represents the radius of the base and 'y' represents the height. This is a right circular cylinder.
I'm given a constraint, namely I'm told that for some reason or other, and I don't know why anybody would ever want to impose this condition other than the fact that we want some condition imposed here to see what's happening. I'm told that I want the sum of the radius of the base and the altitude to be exactly 30. In other words, if the radius of the base is going to be 6 inches, I want the altitude to be 24 inches. That's a constraint.
And by the way, I'll come back to this later also. Notice how you're almost begging a related rates relationship here. Or an implicit relationship that 'x' and 'y' are not independent. But I've now put a constraint on here. At any rate the question is, how shall I use up my 30 inches, say, if I want to make the volume of the resulting cylinder as large as possible? And notice how we work this thing.
We say, well, the volume is equal to 'pi 'x squared' y'. In this particular case, it's easy to see explicitly that 'y' is equal to '30 - x'. It's also easy to see physically that 'x' must be more than 0. You can't have a negative radius of the base. It must be less than 30 because if you used up more than 30 inches in the radius of your base, how can the radius of the base plus the altitude add up to exactly 30? Because physically, the constraint is that the altitude can't be negative. It's certainly a positive value. So our analytic relationship is what? That 'v' equals 'pi 'x squared' y', which can be written as 'pi 'x squared' times '30 - x''. Which in turn, can be written as the polynomial '30 pi 'x squared'' minus 'pi 'x cubed'', where 'x' is the open interval or defined on the open interval from 0 to 30.
Now, how do I proceed? What was my test for membership? I have three ways of checking out where high-low points will occur. Or max-min points in this case, since notice that this function does not require a graph to understand it.
I first check out to see where the derivative is 0. Well, the derivative is simply '60 pi x' minus '3 pi 'x squared''. If I set this thing equal to 0, I find upon factoring that either 'x' must be 0 or 'x' must be 20. I can immediately exclude 'x' equals 0 because notice that my function 'v' had its domain of definition on the open interval from 0 to 30. 'x' equals 0 is not in the open interval from 0 to 30. Consequently, we can rule this thing out. And what we discover is that 'x' equals 20. So 'x' equals 20 is the only possible candidate that we have.
And by the way, since the sum of 'x' and 'y' must be 30, if 'x' equals 20, 'y' must be 10. So the only candidate that we have by setting the derivative equal to 0 is that 'x' should be 20 and 'y' should be 10. We do not get any candidates in the sense of where the derivative doesn't exist. Because if you look at '60 pi x' minus '3 pi 'x squared'', this certainly exists for all values of 'x'.
And finally, we have no endpoints to check. Because again, our function 'v' is defined on an open interval.
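Written out, the computation for the cylinder runs as follows. Every step is taken from what was said at the board; only the factored form of the derivative is added to make the roots visible.

```latex
\begin{align*}
V &= \pi x^{2} y, \qquad x + y = 30 \;\Rightarrow\; y = 30 - x, \qquad 0 < x < 30,\\
V(x) &= \pi x^{2}(30 - x) = 30\pi x^{2} - \pi x^{3},\\
V'(x) &= 60\pi x - 3\pi x^{2} = 3\pi x\,(20 - x),\\
V'(x) = 0 &\iff x = 0 \ \text{or}\ x = 20 .
\end{align*}
% x = 0 is not in the open interval (0, 30), so the only candidate is
\[
x = 20, \qquad y = 30 - 20 = 10 .
\]
```

Since 'V prime of x' is a polynomial, it exists everywhere, and the open interval has no endpoints, so 'x' equals 20 is the only candidate of any of the three kinds.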
By the way, if I tried to draw this somewhat to scale, a very interesting result turns up, which may show the power of analytical methods versus more intuitive types of methods.
You see, what we showed here was what? Without going into the details, that to get the largest volume cylinder, 'x' should be 20. In other words, the radius of your base should be 20 and the altitude should be 10. Let's take a look at that drawn roughly to scale. The radius of the base here is 20 and the altitude is 10.
By the way, that means that the cross section will be what? A rectangle whose height is 10 and whose base is 40. Now intuitively, I think it's easy to see that for a rectangle, for a given perimeter the largest possible area rectangle is the one which is a square. Well, without even trying to prove that, let's go to this thing here. Instead of using up the 20 and the 10, let's draw a second rectangle, or a second cross section of a cylinder where the radius is 15 and the height is 15. Notice that this still satisfies the fact that the sum of the radius and the altitude is 30.
Now, look at this. If we compute the area of this particular rectangle, 40 times 10, it's 400. On the other hand, the volume is what? Pi times 20 squared. The radius of the base squared. Times the height, which is 10. And that yields the result of 4,000 pi.
On the other hand, if we look at our second rectangle, its area is 450. But its volume is what? It's pi times 15 squared times 15, which is 3,375 pi.
The interesting result here is what? The area of this rectangle is greater than the area of this rectangle. In fact, the area of the second rectangle is 450. The area of the first rectangle is only 400. But notice that when you revolve this thing to form the cylinder, the smaller cross sectional area generates the larger volume. And the reason, of course, for that is that the relationship in our variables was not linear. In other words notice that when 'x' is large, a relatively small change in 'x' produces a large change in 'x squared'. In other words, a relatively small change in 'x' can offset a relatively large change in 'y'.
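The arithmetic for the two cross sections is easy to reproduce. The sketch below works with the volume divided by pi so that everything stays in whole numbers; the function names are chosen here for illustration. It also scans the whole-number radii from 1 to 29 as a rough check on where the volume peaks.

```python
def cross_section_area(x, y):
    # Rectangle with base 2x (the diameter) and height y.
    return 2 * x * y

def volume_over_pi(x, y):
    # Volume of the cylinder divided by pi: x squared times y.
    return x ** 2 * y

designs = [(20, 10), (15, 15)]
areas = [cross_section_area(x, y) for x, y in designs]
volumes = [volume_over_pi(x, y) for x, y in designs]
print(areas, volumes)

# Rough check over whole-number radii with x + y = 30.
best_x = max(range(1, 30), key=lambda x: volume_over_pi(x, 30 - x))
print(best_x, 30 - best_x)
```

The second design has the larger cross sectional area (450 against 400) but the smaller volume (3,375 pi against 4,000 pi), and among whole-number radii the volume is largest at 'x' equals 20.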
And this is kind of interesting because try to figure out intuitively how you would figure out where these stop compensating for one another? Where does it turn out that finally you've taken so much away from 'x' that even though you square it, it can't compensate the change in 'y'? How would you intuitively pick off where the high-low points occur in a problem like this? And all I want you to see is again, the beautiful gentle balance between intuitive calculus and rigorous calculus. That we don't throw away our intuition. But notice how, in many cases, where our intuition fails us, the analytic recipes come to our rescue. But enough said about that, let me now again, highlight the difference or the relationship between functions and graphs. Namely, in all of this discussion that we've done on this particular board, we're talking about what? 'v' being a function of 'x'. We do not have to visualize this thing pictorially. But if we wish, what we can say is let's graph 'v' as a function of 'x'.
And you see, going back to the material of last time. And notice, you see how interrelated these things are. Notice how curve plotting ties in very nicely with derivatives and the like. All we're saying is what? Given this relationship, which is how 'v' is related to 'x', we can form the first derivative, we can form the second derivative, we can look to see where the first derivative is 0, we can look to see where the second derivative is 0. I leave these details to you because after the homework assignment you did last time these should be fairly trivial exercises to do. But the idea is if you now utilize all of this information, you find that if you plot 'v' versus 'x', you get a graph something like this.
By the way, again notice that we were not talking about just 'v' being a function of 'x'. The domain of 'v' was restricted to be the open interval from 0 to 30, and this is a very crucial thing to keep in mind. You can get into a whole bunch of trouble if you start looking to see what happens out here.
For example, you say, hey, this curve is always going to keep going up. Won't this be greater than this maximum value over here eventually? The answer is yes, it will be. But what does it mean to say that 'x' is negative? 'x' was the radius of our base. So in other words, what we should really say here is what? That the function that we're talking about is not this whole curve.
Yikes, that wasn't a very good job of drawing. But rather what? Just this portion of the curve defined on the open interval from 0 to 30.
By the way, let me point out something else. You recall in an earlier lecture we talked about related rates. We had an assignment with that lecture where you solved some problems using related rates. Let me show you how related rates play a very important computational role in dealing with max-min problems.
In this particular problem, we had that 'v' equals 'pi 'x squared' y', where 'y' happened to be a particular function of 'x'. In fact, implicitly it was given by the fact that 'x + y' happened to equal 30. Now let's keep track of something here. In this particular problem, notice that it was very easy to change this implicit relationship to an explicit one. It was also easy once you expressed 'y' explicitly in terms of 'x' that when you wanted to substitute into here, it was a very easy computational job to carry out the operations.
But suppose there happened to be cube roots in here, or all sorts of nasty things, whatever they might be. And suppose instead of 'x + y' equals 30, you had our old friend something like 'x to the eighth' plus ''x to the sixth' 'y squared'' plus 'y to the sixth' equals 3. How would you solve for 'y' explicitly in terms of 'x' there? And the point that I'd like you to see is that we can solve this problem very nicely without having to resort to explicitly replacing 'y' as a function of 'x'. Namely, implicitly assuming that 'y' is a differentiable function of 'x', we can differentiate this thing as a product. 'dv dx' will be what? The derivative of the first factor, which is '2 pi x', times the second. Plus the first factor, which is 'pi 'x squared'', times the derivative of 'y' with respect to 'x'. So that's 'dv dx'.
On the other hand, from this relationship here, we can conclude by differentiating implicitly that '1 + 'dy dx'' is 0. And therefore, 'dy dx' is minus 1. And putting that value for 'dy dx' in here, we wind up with this explicit relationship. And we can now see that 'dv dx' is 0 if and only if just by solving this thing, setting it equal to 0, 'x' equals '2y'.
By the way, that's exactly what happened in our problem. You'll notice that 'x' turned out to be 20 and 'y' turned out to be 10. You see again, 'x' and 'y' are related. You can say, gee, couldn't 'x' be 60 and 'y' be 30? Answer, no. Because the constraint is that 'x + y' must be 30.
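The implicit computation described above can be written out step by step. Nothing here goes beyond what was done at the board, except that the last line combines 'x' equals '2y' with the constraint to recover the earlier answer.

```latex
\begin{align*}
V &= \pi x^{2} y
  &&\Rightarrow\quad \frac{dV}{dx} = 2\pi x\, y + \pi x^{2}\,\frac{dy}{dx},\\
x + y &= 30
  &&\Rightarrow\quad 1 + \frac{dy}{dx} = 0
  \quad\Rightarrow\quad \frac{dy}{dx} = -1,\\
\frac{dV}{dx} &= 2\pi x y - \pi x^{2} = \pi x\,(2y - x),\\
\frac{dV}{dx} = 0,\ x > 0 &\iff x = 2y .
\end{align*}
% Combine with the constraint:
\[
x = 2y,\quad x + y = 30 \;\Rightarrow\; 3y = 30 \;\Rightarrow\; y = 10,\ x = 20 .
\]
```

At no point was 'y' replaced by an explicit formula in 'x' inside the volume, which is what makes the same approach usable when the constraint cannot be solved for 'y'.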
Well, look it. I don't want us to get too wrapped up on the idea of computational differences now. What I do want to do is review our basic result. And in fact, let me come over here. I hope this doesn't spoil you. I'd like to come back to the board. And actually, since I've got this all written down, let's close on this particular result again.
To find the high-low points of a function 'f of x', we first of all, check out when 'f prime of c' is 0 for candidates. We check out where 'f prime of c' fails to exist. And if it's a closed interval, we check the endpoints. This is the mechanism behind what we're doing. From that point on, as the cliche goes, it's all engineering's baby. It's all computational know-how.
At any rate, I think this is enough in terms of emphasizing the points that we wanted to make. And so until next time, goodbye.
ANNOUNCER: Funding for the publication of this video was provided by the Gabriella and Paul Rosenbaum Foundation.
Help OCW continue to provide free and open access to MIT courses by making a donation at ocw.mit.edu/donate.