Topics covered: Statement of Rolle's Theorem; a geometric interpretation; some cautions; the Mean Value Theorem; consequences of the Mean Value Theorem.
Instructor/speaker: Prof. Herbert Gross
Lecture 9: Rolle's Theorem ...
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: Hi. Our lecture today is called 'Rolle's Theorem and Its Consequences'. And I suppose we could've made a take off on what goes up must come down, and say that what Rolle's theorem says intuitively is that what goes up smoothly and comes down smoothly must level off somewhere. OK? Now because that may sound too easy to understand, let's cloak that in the language of more formal mathematics.
Rolle's theorem says this. Let 'f' be defined and continuous on the closed interval from 'a' to 'b'. In other words, what? The domain of 'f' is the closed interval from 'a' to 'b'. The graph of 'f' is unbroken on this interval. And differentiable in the open interval from 'a' to 'b'. In other words, you want the function to be continuous on the entire interval, but for differentiability, you only require that it be smooth, differentiable, in the interior of the interval, in the open interval.
Suppose also that 'f of a' and 'f of b' are 0. Then, what Rolle's theorem says is that 'f prime of c', the derivative of 'f of x', must be 0 for some number 'c', at least one number 'c', in the open interval from 'a' to 'b'.
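For reference, the statement just given can be written symbolically as below. This is only a restatement of the spoken version; the bracket notation for the closed and open intervals is an assumption about how it would be written on the board.

```latex
\textbf{Rolle's Theorem.} Let $f$ be continuous on $[a,b]$ and
differentiable in $(a,b)$, and suppose
\[
  f(a) = f(b) = 0 .
\]
Then there exists at least one number $c \in (a,b)$ such that
\[
  f'(c) = 0 .
\]
```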
Now what this thing says intuitively is simply this, suppose you have a curve that is unbroken for all values of 'x' between 'a' and 'b' inclusively. Suppose the curve is smooth. Suppose the curve starts here and ends here, then what we're saying is there must be some point in here where the curve levels off. In other words, someplace where you have a horizontal tangent, which is what 'f prime of c' equals 0 means. In this particular diagram, this would be the value of 'c' that we're talking about.
By the way, I think the proof is intuitively clear. Namely, if the curve never leaves the x-axis, then it's leveled off for the entire domain. And if the curve does leave the x-axis, for example, if the curve starts to rise, OK, since it must eventually get back to the x-axis when 'x' equals 'b', because 'f of b' is 0, it must ultimately begin to fall again.
Well, if the curve goes from rising to falling, it must have what? Since it's coming up and then going down, it must attain a maximum value. Because the curve is unbroken and smooth, as we saw in our previous lecture, the maximum value is characterized by the derivative at that point being 0. In fact, the analytic proof is precisely what we've just said, only translated into more mathematical language.
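As a small numerical illustration of the argument above, here is a minimal sketch in Python. The function `f`, the endpoints 0 and 2, and the helper names are all hypothetical choices made for this example; they do not come from the lecture. The sketch assumes the curve rises first and then falls exactly once, so a simple bisection on the sign of the slope can locate the point where it levels off.

```python
def f(x):
    # Hypothetical example curve: it is 0 at both endpoints a = 0 and b = 2.
    return x * (2.0 - x)


def derivative(func, x, h=1e-6):
    # Central-difference approximation to the slope of func at x.
    return (func(x + h) - func(x - h)) / (2.0 * h)


def find_level_point(func, a, b, steps=60):
    # Bisection on the sign of the slope, assuming the curve rises first
    # and then falls exactly once between a and b.
    lo, hi = a, b
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if derivative(func, mid) > 0:
            lo = mid   # still rising: the level point lies to the right
        else:
            hi = mid   # already falling: the level point lies to the left
    return (lo + hi) / 2.0


a, b = 0.0, 2.0
c = find_level_point(f, a, b)
print(f(a), f(b))            # the curve is 0 at both endpoints
print(c, derivative(f, c))   # c is close to 1 and the slope there is close to 0
```

Note that this only finds one such 'c' for one well-behaved curve. Rolle's theorem itself says nothing about how to find 'c'.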
By the way, I would like to make a slight aside here, because I think it sometimes gets confusing to students to see, why do you say that the function has to be continuous on the closed interval, but differentiable only in the open interval? I thought you might like to see a contrived example as to what goes wrong if you allow the curve to be broken at the end points.
See, all I'm thinking of is something like this. Suppose I say, look, let's define a curve as follows. At 'x' equals 'a' and 'x' equals 'b', the curve will be 0. So in other words, it'll cross the x-axis. Then immediately, for 'x' greater than 'a', the curve jumps up to here, comes down along this line, and then, you see, when 'x' equals 'b', it jumps down here again.
In other words, why not let the curve be defined as follows? It will be 0 at these two endpoints. It'll be this curve on the open interval. Notice in this contrived example that 'f of a' and 'f of b' are both equal to 0, but there is no place in the open interval where the curve has a horizontal tangent line. In other words, the significance is you've got to be sure that the curve doesn't get broken at the ends, because with these gaps, all sorts of crazy things can happen.
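The contrived example can be made concrete with a short sketch. The endpoints 0 and 1 and the particular falling line used here are assumptions chosen for illustration, since the lecture only draws the picture. The function is 0 at the two endpoints and follows a straight line of slope -1 strictly between them, so the slope is never 0 anywhere in the open interval.

```python
A, B = 0.0, 1.0   # hypothetical endpoints 'a' and 'b'


def broken(x):
    # 0 at the two endpoints, a falling straight line strictly between them.
    if x == A or x == B:
        return 0.0
    return 2.0 - x


def slope(func, x, h=1e-6):
    # Central-difference approximation to the slope of func at x.
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Sample interior points, staying far enough from the endpoints that
# x - h and x + h remain inside the open interval.
interior = [A + (B - A) * k / 100.0 for k in range(1, 100)]
slopes = [slope(broken, x) for x in interior]

print(broken(A), broken(B))       # both endpoint values are 0
print(broken(A + 1e-9))           # just inside the interval the curve has jumped up near 2
print(min(slopes), max(slopes))   # every sampled slope is close to -1, never 0
```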
Now, just as in our previous lecture, there are some rather important cautions that have to be taken in understanding Rolle's theorem. As simple as it is, we have to be sure that we understand exactly what's really happening here. The first caution is that Rolle's theorem is what we mathematicians call an 'existence theorem'. It says, under certain conditions, there exists at least one number 'c' that has a certain property. It doesn't tell us how many "c's" there are. It doesn't tell us where to find them. It just says, what? There exists at least one such 'c'.
And the point is, you must be careful to remember-- so let's take an example. Here's 'a', here's 'b'. If the function is continuous and smooth, in other words, if the graph is continuous and smooth, all we're saying is that at at least one number between 'a' and 'b', the curve must possess a horizontal tangent.
There may be more than one. You see, the curve, for example, could do something like this. See, in other words, here is one value of 'c', which we'll call 'c1', horizontal tangent here. Here's another value, which we'll call 'c2', horizontal tangent here. See, again, the meaning of at least one.
Secondly, we must make sure that we remember that the curve is smooth. Meaning what? That the function is differentiable. Now I'm taking the liberty of drawing these things in freehand. There's some mixed emotions here, if I draw the diagrams too smoothly all the time, you lose the significance of what's going on because of the picture. And if I draw them freehand all the time, you won't understand what I'm doing, because I don't draw very well. But I think here we can get away with this.
What I'm driving at is this. Let's suppose you have 'a' and 'b' here. Let's suppose that the function, the curve that we're drawing, passes through these two points. But suppose there happens to be a sharp corner in here. Maybe the curve goes like this, it goes up like this, then very abruptly comes down like this. Notice, what? That the curve is continuous. It does reach a local maximum. But the point is, for this particular value of 'c', 'f prime of c' is not 0 by default. Namely, 'f prime of c' doesn't even exist. So in other words, Rolle's theorem doesn't apply if you don't have differentiability. I want to make sure you see where each of the parts of the hypotheses for the theorem are used.
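A standard curve with this kind of sharp corner is the "tent" 1 minus the absolute value of 'x' on the interval from -1 to 1. That particular function is an assumption made for this sketch, not the curve drawn in the lecture. The slope from the left of the corner and the slope from the right disagree, so the derivative at the corner does not exist.

```python
def tent(x):
    # Hypothetical example: continuous, 0 at x = -1 and x = 1,
    # with a sharp corner (local maximum) at x = 0.
    return 1.0 - abs(x)


def one_sided_slope(func, x, h):
    # Difference quotient with step h: h > 0 approaches from the right,
    # h < 0 approaches from the left.
    return (func(x + h) - func(x)) / h


left = one_sided_slope(tent, 0.0, -1e-6)
right = one_sided_slope(tent, 0.0, 1e-6)

print(tent(-1.0), tent(1.0))   # both endpoint values are 0
print(left, right)             # about +1 from the left, about -1 from the right
```

Everywhere else in the open interval the slope is either +1 or -1, so there is no point at all where the slope is 0.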
By the way, here's another interesting result, which has nothing to do with the statement of the theorem, but again, another piece of evidence as to why we like to shy away from functions which are not single valued. See, for example, suppose you allowed the function to be multivalued, and you say, OK, I want the curve to pass through here and here. And I want it to be smooth. But the curve does not have to be single valued.
Notice what you can do. You could have a curve that does something like this. I don't know. Now you see, there will be a point 'c' where the derivative will be 0. Even as badly as I've drawn this, I think roughly speaking, we can see that 'c' would be something like this. Notice, however, that in Rolle's theorem, the statement is what? That 'c' must be on the open interval from 'a' to 'b'.
If the function is not single valued, as long as the curve is smooth, there will be places where the curve levels off. But the x-coordinate of the point at which the curve levels off may not be in the interval-- may not be, it could be, but it might not be, I should say-- in the open interval from 'a' to 'b'. This is a very, very important result. In fact, I'll have reason to make reference to this in a little while later in the lecture. I couldn't make reference to it later earlier in the lecture, I guess.
The fourth caution here is also an aside, and it's one that's rather crucial. In most textbooks in which Rolle's theorem is stated, the condition is what? That 'f of a' equals 'f of b' equals 0. It turns out that this is too restrictive, that essentially, all you need is 'f of a' equals 'f of b'. What I mean by that is, let's suppose 'f of a' is not 0. Let's suppose this height represents 'f of a'. What I'm saying is suppose that 'f of a' and 'f of b' are equal.
What that means is, if I want to think of a new axis, which I call the 'x sub 1' axis-- see, with respect to the 'x sub 1' axis, the curve crosses the axis at these two points. In other words, notice that as long as these two points are at the same level, the same argument that we used to prove Rolle's theorem goes through unimpeded over here. Namely, we say what? What goes up smoothly and comes down smoothly-- because it has to come down, because it comes back to the same level here-- must reach a point someplace in here where it levels off.
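The shift to the new axis can be written out as a short derivation. The name $g$ for the shifted function is introduced here only for illustration.

```latex
Suppose $f$ is continuous on $[a,b]$, differentiable in $(a,b)$, and
$f(a) = f(b)$. Define
\[
  g(x) = f(x) - f(a) .
\]
Then
\[
  g(a) = f(a) - f(a) = 0,
  \qquad
  g(b) = f(b) - f(a) = 0,
  \qquad
  g'(x) = f'(x) .
\]
Rolle's theorem applied to $g$ gives some $c \in (a,b)$ with $g'(c) = 0$,
and therefore $f'(c) = 0$.
```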
So these are the four cautions that I'd like you to look at when we view Rolle's theorem. Now, if somebody were to say to us, what's so important about Rolle's theorem? And this happens so often in mathematics that frequently, the most important thing about one particular theorem, is that it may be the building block by which a more important, or more useful, or more practical theorem is derived.
And in this respect, I would say for my own opinion, that the most important application of Rolle's theorem is that it facilitates a very famous result known as the 'Mean Value theorem'. As our course proceeds, from time to time we will have ample reason to back track and make references to the mean value theorem. I intend not to make too deep references to the mean value theorem now, because what I would like to do is to establish the result, give you enough of an intuitive feeling so that you can tuck it under your belt without feeling too overwhelmed by it, and just enough applications of it so that we can get into the next phase of our calculus course. But the mean value theorem is another one of these things where if you state the thing analytically, and have no feeling for what's going on pictorially, the thing can become overwhelming.
Let's, in fact, do it in an overwhelming way, and then show what the thing means pictorially. Notice again how this thing now starts off the same way as Rolle's theorem. Let 'f' be continuous on the closed interval from 'a' to 'b', and differentiable in the open interval from 'a' to 'b'. By the way, this is just an idiosyncrasy of mine, I don't know if it's standard. When I talk about the closed interval, I have the habit of saying 'on' the interval. When I talk about the open interval, I like to say 'in' the interval, to sort of emphasize the interior.
It's just a vocabulary trait, and don't read too much into this. Don't be upset by it. But it's continuous on the closed interval, differentiable in the open interval. Now it's again an existence theorem. It says then there exists a number 'c'. When I say there exists, it means what? There is at least one number 'c' in the open interval from 'a' to 'b'-- and this is the part that looks kind of tough-- such that ''f of b' minus 'f of a'' divided by 'b - a' is 'f prime of c'. And this somehow or other may seem at first glance to be more ominous than the intuitive feeling about Rolle's theorem.
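Written symbolically, the statement reads as follows. As before, this is only a transcription of the spoken statement into standard notation.

```latex
\textbf{Mean Value Theorem.} Let $f$ be continuous on $[a,b]$ and
differentiable in $(a,b)$. Then there exists at least one number
$c \in (a,b)$ such that
\[
  \frac{f(b) - f(a)}{b - a} = f'(c) .
\]
```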
By the way, as the name implies, where by 'mean' we don't mean nasty, we mean average, if you'd like to see what this thing means, and I'll draw you a picture in a second, all it says is that if a particle is moving from point 'a' to point 'b', say, at at least one point during its trip, the instantaneous speed must equal the average speed.
You see? Because after all, if you're always going less than your average speed, how could you have had an average speed as high as your average speed? And if you're always going more than your average speed, how could you have had, you see, an average speed as low as it was? So that somehow or other, all you're saying is that the instantaneous speed at a particular instant must equal your average speed someplace along the path.
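The speed interpretation can be checked numerically on a made-up trip. In this sketch the position function 't cubed' and the time interval from 0 to 2 are assumptions chosen so that the answer is easy to verify by hand: the average speed is 4, and the instantaneous speed '3 t squared' equals 4 when 't' is 2 over the square root of 3. Because the speed here is increasing, a bisection search is enough to find that instant.

```python
import math


def position(t):
    # Hypothetical position of the particle at time t.
    return t ** 3


def speed(t):
    # Instantaneous speed: the derivative of position, worked out by hand.
    return 3.0 * t ** 2


a, b = 0.0, 2.0
average_speed = (position(b) - position(a)) / (b - a)

# The speed is increasing on [a, b], so bisection finds the instant c
# at which the instantaneous speed equals the average speed.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2.0
    if speed(mid) < average_speed:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2.0

print(average_speed)   # 4.0
print(c, speed(c))     # c is close to 2 / sqrt(3); the speed there is close to 4
```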
Now what that means pictorially is this-- again, I'll chance a freehand diagram-- suppose our curve is 'y' equals 'f of x'. See, I've drawn it to be smooth. Suppose it's continuous and smooth on this open interval from 'a' to 'b'. Now, if I think of a particle moving from point 'P' to point 'Q', how do I identify the average speed? The average speed is the slope of the straight line that joins 'P' to 'Q'.
On the other hand, what is the instantaneous speed? If we think of it in terms of the picture, it's the slope of the line tangent to the curve at a particular point. So in other words, what we're saying is this. You see, if we were to take the line 'PQ', and we shift it parallel to itself, I think you can sense that the points 'P' and 'Q', if we labeled 'P' and 'Q' the points at which this chord intersects the curve, the points 'P' and 'Q' will roll in closer and closer together.
Ultimately, the line will fail to intersect the curve, and at the transition point, if the curve is smooth, all we're saying is that the last point at which that line touches the curve as we move it out, OK, that the line would be tangent to the curve at that particular point. You see, all we're saying is what? That someplace between here and here there must be a point where the tangent line to the curve is parallel to the chord 'PQ'.
And now, we have all the ingredients that we need to see what the mean value theorem says geometrically. Let's call the tangent line 'l'. First of all, what is the slope of the line 'PQ'? Well, it's a straight line. The slope of a straight line is 'delta y' divided by 'delta x'. Well, notice that this height here is by definition 'f of b'. This height here is 'f of a'. So this height here is just 'f of b' minus 'f of a'. This length here is just 'b - a'. So the slope, 'delta y' divided by 'delta x', is just ''f of b' minus 'f of a'', over 'b - a'.
On the other hand, what is the slope of the line 'l'? By definition, it's 'f prime of x' evaluated at 'x' equals 'c', that's 'f prime of c'. Now what does it mean in terms of slopes for two lines to be parallel? It means that their slopes are equal. And where is 'c'? 'c' is someplace in the open interval from 'a' to 'b'.
Now the reason I call this intuitively an extension of Rolle's theorem-- and by the way, you'll notice that what I have to say is a much simpler demonstration than what's given in the book. But before you think I'm being egotistic about this, let me point out, as is so often the case that wherever my demonstrations are easier than the one in the book, I'm losing something in my presentation. Either I haven't shown the most analytic representation, or I'm overlooking a particular complicated side effect that might occur.
But disregarding that for the moment, you see what I'm saying is this, let's suppose, for the sake of argument, we visualize the line 'PQ' as being our new x-axis. I'll call that the x1-axis. And now let's take a line perpendicular to 'PQ' and call that our new y-axis, the y1-axis. Now look at the curve that we've drawn. With respect to the y1- x1-axis, notice that the curve is smooth, right? It's unbroken. And it cuts the x1-axis at two points.
Now if we apply Rolle's theorem with respect to the x1- y1-axis, we say, look, here's a curve which cuts the x-axis-- the x1-axis-- at two points. It's smooth. Therefore, it must level off someplace. In other words, there must be some point on this curve where the tangent line to the curve is parallel to the x1-axis. That's exactly, you see, what this thing here says. That's another geometric interpretation that indicates how Rolle's theorem might be used.
However, there is a very, very subtle flaw in what I've said. One that is so subtle that you may not even notice it until I point it out to you, and even after I point it out, there's a chance you may not realize what I've said. Because it's a point that I know took me a long, long time to discover for myself. And it all hinges on the concept of single valuedness again.
The trouble with this interpretation is the following-- and by the way, let me point out, I'm not knocking my interpretation, I think it's still a tremendous way of visualizing the result, but from an analytical point of view, why we have to be careful. Let's suppose that my curve 'y' equals 'f of x' happens to look something like this, OK? Happens to look something like this.
Notice, barring any bad drawing that I've done here, that this curve is single valued, that no line parallel to the y-axis cuts this curve in more than one place. Now here's my 'a' and here's my 'b'. And so I say, OK, by Rolle's theorem, if I look at this as being the x-axis and this as being the y-axis-- in other words, the x1- y1-axis again-- I say to myself, look, here's a smooth curve, it cuts the x-axis in two points, therefore, someplace between these two points, there must be a place where the curve levels off, et cetera, et cetera, et cetera.
And the interesting point is to notice that a given curve, as to whether it's single valued or not, is dependent upon the orientation of the axes. In other words, notice that I've drawn this particular curve so it is single valued with respect to the xy-plane. On the other hand, with respect to the x1- y1-coordinate system, this curve is not single valued. Namely, observe how a line parallel to the y1-axis can intersect this curve at more than one point.
In other words, whether a curve is single valued or not is not an absolute property independent of the coordinate system. So again, if I could be sure that when I rotated my coordinate axes the original single valued curve was still single valued, then my above proof would have been rigorous. But of course, I can't be sure of that.
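To see this dependence on the axes concretely, here is a rough sketch. The particular curve, the interval from 0 to 1, and all names in the code are hypothetical; the lecture only shows a freehand drawing. The idea is to measure each point of the curve along the chord 'PQ', which plays the role of the x1-axis. If that x1-coordinate ever decreases as we move along the curve, then some line parallel to the y1-axis meets the curve more than once, so the curve is not single valued in the rotated system.

```python
import math


def f(x):
    # Hypothetical curve. As the graph of a function of x, it is
    # single valued with respect to the usual xy-axes by construction.
    return 2.0 * x + 1.5 * math.sin(2.0 * math.pi * x)


a, b = 0.0, 1.0
dx, dy = b - a, f(b) - f(a)      # direction of the chord from P to Q
length = math.hypot(dx, dy)


def x1(x):
    # Coordinate of the point (x, f(x)) measured along the chord PQ,
    # that is, along the new x1-axis with its origin at P.
    return ((x - a) * dx + (f(x) - f(a)) * dy) / length


xs = [a + (b - a) * k / 1000.0 for k in range(1001)]
x1_values = [x1(x) for x in xs]

# Single valued in the rotated system requires x1 to increase steadily
# as we move along the curve.
is_monotone = all(u < v for u, v in zip(x1_values, x1_values[1:]))
print(is_monotone)   # False for this curve
```

For this made-up curve the x1-coordinate backs up near the middle of the interval, which is exactly the situation in which the rotated-axes argument breaks down.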
By the way, the technique used in the book is quite standard, and what it does is the following, it still utilizes Rolle's theorem, but the technique behind the proof in the book is this. The function that we set up is the vertical distance between the chord and the curve, as we move along this way. And notice that that distance is 0 at these two endpoints. OK? And therefore, Rolle's theorem applies to that function.
And the whole idea is something like this. All we say is-- and the analytic part proves this-- all we say is look, the point at which this chord would have been tangent to the curve is the place where the vertical distance between the chord and the curve is what? Maximum. And we won't go into that right now, that is done in the text. All I wanted to do, as I always will do when possible, is that whenever the rigorous proof seems far more complicated than proofs which are more intuitive, I will not take the time in general, in our lectures, to give the more rigorous proof. What I will take the time to do is to show why the less rigorous proof has pitfalls.
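The textbook's own notation is not reproduced in the lecture, so what follows is a sketch of the usual form of this construction, with the name $h$ chosen here for the vertical distance between the curve and the chord.

```latex
Let
\[
  h(x) = f(x) - \left[ f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \right] .
\]
Then $h$ is continuous on $[a,b]$, differentiable in $(a,b)$, and
\[
  h(a) = f(a) - f(a) = 0,
  \qquad
  h(b) = f(b) - \bigl[ f(a) + f(b) - f(a) \bigr] = 0 .
\]
By Rolle's theorem there is some $c \in (a,b)$ with $h'(c) = 0$. Since
\[
  h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},
\]
this gives
\[
  f'(c) = \frac{f(b) - f(a)}{b - a} .
\]
```

Because $h$ is measured vertically, in the original 'x' and 'y' directions, no rotation of axes is needed and the single valuedness problem never comes up.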
Well, enough said about the statement of the mean value theorem. Time is getting very short, and we don't need much more time to make the home run ball pitch that we want to make now. And the idea is this, that the most important analytical reason for having the mean value theorem is, for those of us who like to use our geometric intuition, it turns out that almost every geometrically obvious fact that has a proper analytic counterpart has the property that the analytic counterpart is proven by the mean value theorem.
See, let me give you a simple for instance. In fact, in the text book this is called the first corollary to the mean value theorem. Suppose we have a function capital 'F of x', and we know that the derivative is always equal to 0. The claim is that 'F of x' itself must be a constant.
By the way, two cautions here. Don't say that we've had this result before. The result that we had before was actually the converse of this. The result that we had before was the one that said what? If 'F of x' is a constant, then its derivative is 0. Now we're saying the opposite--not the opposite, but the converse. Now we're saying, look, if the derivative is always 0-- notice the use of my identity symbol here-- if the derivative is 0 for all values of 'x', then the function must've been a constant.
Now, you know, geometrically this is a very simple thing to visualize. You say, look, the derivative is the slope. And if you're saying that the tangent to the curve is always horizontal, the curve itself must be a horizontal straight line. And if the curve is a horizontal straight line, that's exactly what you mean by saying that the function is a constant. How do we prove this using the mean value theorem? See, and I just wanted to go through a proof here once, just so as to get the idea of what a proof means.
You see, to show that something is a constant should mean what? That if 'a' is unequal to 'b' for any two values 'a' and 'b', 'F of a'-- well, I'm using capital 'F' here--capital 'F of a' has to equal capital 'F of b'. That's what you mean for a function to be a constant. No matter what the input is, the outputs are always equal. By the way, if 'a' equals 'b', it's trivial that 'F of a' equals 'F of b'.
But essentially, to prove that capital F is a constant, this is what I have to prove. That if 'a' is different from 'b', no matter what 'a' and 'b' I use, that 'F of a' equals 'F of b'. And the idea is by the mean value theorem, we say, look, what does the mean value theorem say? We're assuming now that 'F' is a continuous and differentiable function on an interval, OK, from 'a' to 'b'. The mean value theorem says under these conditions, there exists a number 'c' between 'a' and 'b' with what property? That ''F of b' minus 'F of a'' over 'b - a' is equal to 'F prime of c'. That's just a statement of the mean value theorem. This is always true if the conditions of the mean value theorem apply.
Now all we're saying is, in this particular problem, what property did capital 'F' have? It had the property that its derivative for all values of 'x' was 0. In particular then, when 'c' is the value that we're talking about, if 'F prime of x' is 0 for all values of 'x', in particular, then, it's 0 when 'x' is equal to 'c'. In other words, by our given hypothesis, this is 0.
But if a fraction is 0, its numerator must be 0. That says what? 'F of b' minus 'F of a' is 0. See, the only way a quotient can be 0 is for the numerator-- or the dividend, the divisor, I don't know these formal names, they slipped my mind, but the top one has to be 0. And if 'F of b' minus 'F of a' is 0, that says 'F of b' equals 'F of a', and that's precisely what we had to show to show that 'F' was a constant. OK?
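The whole argument fits in a few lines when written symbolically. This is just the spoken proof above put into notation.

```latex
Suppose $F'(x) = 0$ for all $x$ in an interval, and let $a \neq b$ be any
two points of that interval. By the mean value theorem there is a $c$
between $a$ and $b$ with
\[
  \frac{F(b) - F(a)}{b - a} = F'(c) = 0 .
\]
Since $b - a \neq 0$, the numerator must vanish:
\[
  F(b) - F(a) = 0
  \quad\Longrightarrow\quad
  F(b) = F(a) .
\]
As $a$ and $b$ were arbitrary, $F$ is constant on the interval.
```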
So again, notice, it's not that we're saying that the mean value theorem is a harder way of proving what we already know to be intuitively true, what we're saying is what? That we know that many intuitively obvious results frequently turn out to be false. We would like some analytical way of knowing which of the intuitive results are actually correct. All I'm saying is the mean value theorem gives us a big hint that way.
By the way, let me close by giving you one more important illustration of what we can prove by the mean value theorem. And this is called a corollary of a corollary, as I'll show you what I mean in a minute.
The next example that I want to use is what it means to say suppose I'm given two functions 'f' and 'g', and all I know about these two functions is that the derivatives are identical. In other words, that 'f' and 'g' have the property that for every value of 'x', 'f prime of x' is equal to 'g prime of x'. By the way, when I say every value of 'x', again, it's local versus global. It's not necessary that this happens for all 'x', what is important is what? That 'x' be defined on some interval.
In other words, even if I know that this property is true for some interval, I don't really care what happens outside of that interval. In terms of local properties, all I'm saying is, all I know is that for some interval, maybe the whole axis, doesn't make any difference, 'f prime' is identical to 'g prime'. Now, you would like to be able to say, maybe, that if 'f prime' is equal to 'g prime', 'f' equals 'g'. But that's not the case. What is the case is that the difference between the two functions must be a constant.
Again, geometrically, what you're saying is what? That if you have two curves, which point by point always have the same slope-- in other words, for each 'x' value, the slopes are the same-- is just essentially saying that the two curves are parallel. And if they're parallel curves, what's a way of stating that two curves are parallel? That one is a constant displacement of the other.
In other words, the geometric impact of two curves having the same derivative is not that the curves are the same, but that they're parallel. And by the way, the proof of this result is again a corollary to the mean value theorem. Namely, let's look at the function 'f of x' minus 'g of x'. Call that capital 'F of x'. Let capital 'F of x' be ''f of x' minus 'g of x''.
Since the derivative of a difference is the difference of the derivatives, that would say the derivative of capital 'F' is the derivative of little 'f' minus the derivative of 'g'. OK? Now what do we know about 'f prime' and 'g prime of x'? We know that 'f prime of x' equals 'g prime of x' for all 'x'. Consequently, the difference between these two must be 0.
Remember, if two functions are identical, their difference is 0. That says, therefore, that capital 'F prime of x' is identically 0. And by our previous theorem-- notice the beautiful logic of this-- from the mean value theorem, we proved that if the derivative of a function is identically 0, the function must be a constant. So we apply that here. But what was capital 'F'? It was little 'f' minus 'g'. So 'f' minus 'g' is a constant, and that proves our desired result.
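A familiar pair of functions illustrates the result. The choice of sine squared and minus cosine squared is an assumption made for this sketch; it is not an example from the lecture. Their derivatives agree everywhere, yet the functions are not equal: they differ by the constant 1.

```python
import math


def f(x):
    # Hypothetical first function: sine squared.
    return math.sin(x) ** 2


def g(x):
    # Hypothetical second function: minus cosine squared.
    return -(math.cos(x) ** 2)


def slope(func, x, h=1e-6):
    # Central-difference approximation to the slope of func at x.
    return (func(x + h) - func(x - h)) / (2.0 * h)


xs = [-3.0 + 0.25 * k for k in range(25)]   # sample points from -3 to 3
slope_gaps = [abs(slope(f, x) - slope(g, x)) for x in xs]
differences = [f(x) - g(x) for x in xs]

print(max(slope_gaps))                      # close to 0: the slopes agree
print(min(differences), max(differences))   # both close to 1: a constant gap
```

In the picture, the graph of 'g' is the graph of 'f' moved straight down by 1, which is what the lecture means by parallel curves.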
Again, what I want you to see here is that we have not done anything different with the mean value theorem. We're not trying to say we're going to prove results we couldn't prove before. Rather, what? The mean value theorem is our way of showing that certain intuitive results hold true analytically, that we can talk about parallel curves, and things like this.
Most important, in terms of summarizing this lecture from a point of view of what's coming next, it's crucial to observe that this last example is what is going to allow us to enter the study of something called the 'indefinite integral'. Or in another manner of speaking, something called the inverse of taking a derivative. You see, the idea is, notice that in these two examples we start with information about the derivative and deduce what's true about the original function. That's inverting the emphasis of what we've been doing up until now, where we've started with the function and investigated its derivative.
To see this in more detail, join me again next time. And until next time, goodbye.
Funding for the publication of this video was provided by the Gabriella and Paul Rosenbaum Foundation. Help OCW continue to provide free and open access to MIT courses by making a donation at ocw.mit.edu/donate.