This is the graph of the line y = x:
If you put your finger down on any point on that line, and then put another finger on another point on that line, you find that the total change in the y-coordinate divided by the total change in the x-coordinate between those two positions is 1. Move two units to the right, and the line rises by two units, etc. This is the same no matter which two points you pick. Every line has this sort of property, which is called slope. Steep lines that rise greatly for each unit of x displacement have a large slope, flat horizontal lines have zero slope, and descending lines have negative slope. In each case the way to calculate the number describing the slope is just to divide the amount of vertical rise by the amount of horizontal run. If you have the equation of the line, the slope is just the number in front of the x. For example, y = 3x has a slope of 3, because if you increase x by some amount, y increases by three times that amount.
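If you'd like to see the rise-over-run recipe spelled out, here is a minimal Python sketch. The function name slope and the sample points are my own choices for illustration, and the sketch assumes the two points have different x-coordinates (a vertical line has no defined slope).

```python
def slope(p1, p2):
    """Rise over run between two points given as (x, y) pairs.

    Assumes the two x-coordinates differ; a vertical line has no slope.
    """
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


# Two points on y = x, then two points on y = 3x.
print(slope((0, 0), (2, 2)))
print(slope((-1, -3), (4, 12)))
```

Whichever pair of points you pick on the same line, the function should hand back the same number, which is the whole point of slope.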
But what if your equation isn't a line? Here's y = x^2:
Clearly this curve hasn't got a constant slope. But this doesn't stop us from at least informally describing each point on the curve as if it had a slope. Over on the left, the slope seems to be negative, since a little ball placed on the curve would be sliding downward. Then it gets less and less negative, until at the bottom of the parabola the slope seems to be zero. Then the slope becomes more and more positive as the graph rises more and more for each bit of distance run.
That's pretty informal though. We might want to assign a number to that ever-changing slope, and we need a good way to define that slope. I propose we define the "slope of a curve at a particular point" as the slope of the line (which is well defined) that is tangent to the curve at that point, meaning it just touches the curve there. For instance, if we wanted to find the slope at x = 1, we'd look at the slope of the line touching the graph at that point:
Which is nice, but right now we have no way of knowing what that line is. To write the equation of a line, you either need a point on that line and a slope, or two points on that line. We only know one point, and we're trying to find the slope. Can we get around this? Well, we can try to do so by putting one point where we're trying to find the slope and one point farther to the right by a distance delta x. Then we can shrink delta x and see what happens to the slope. Since the slope is the change in y over the change in x, we'll just call it Δy/Δx.
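The shrinking-Δx experiment is easy to try numerically. Here is a minimal Python sketch of it for y = x^2 at x = 1; the names secant_slope and square, and the particular list of Δx values, are my own and only for illustration.

```python
def secant_slope(f, x, dx):
    """Slope of the line through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


# Keep the left point fixed at x = 1 and slide the right point toward it.
for dx in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(dx, secant_slope(square, 1.0, dx))
```

As Δx shrinks, the printed slopes should settle toward a single value, up to a little floating-point fuzz.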
Well, it looks like the slope at x = 1 is about 2. It sure seems to be closing in on that value anyway. But we want to be sure of this. We'll calculate it by hand for a generic x, though in this case of course x = 1, which we can substitute in at the end if we want.
The change in y (which we called Δy) is easy to write down. The change between two numbers is the end value minus the start value. The start height for a given x is just x^2, and the end height will be the square of x plus however far we went to the right:
Δy = (x + Δx)^2 - x^2
Therefore the slope between the points x and (x + Δx) is:
Δy/Δx = [(x + Δx)^2 - x^2] / Δx
We can go ahead and multiply out the top:
Δy/Δx = [x^2 + 2xΔx + (Δx)^2 - x^2] / Δx
Cancel the x^2 and divide by Δx:
Δy/Δx = 2x + Δx
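If you don't trust the algebra, the simplification of [(x + Δx)^2 - x^2] / Δx down to 2x + Δx can be checked with exact arithmetic. This is only a sketch: the function name difference_quotient and the test value x = 3 are my own choices, and I'm using Python's Fraction type so that no rounding error muddies the comparison.

```python
from fractions import Fraction


def difference_quotient(x, dx):
    """Slope between x and x + dx on the parabola y = x^2."""
    return ((x + dx) ** 2 - x ** 2) / dx


x = Fraction(3)
for dx in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    q = difference_quotient(x, dx)
    # The last column compares the slope with the simplified form 2x + dx.
    print(dx, q, q == 2 * x + dx)
```

A handful of spot checks is not a proof, of course, but it is a decent guard against a dropped term.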
But the whole point is that we want the change in x to approach zero. This means the only surviving term is 2x. And that's the slope of the tangent line to the parabola y = x^2 at any point x. It's our Sunday Function, and for its official presentation I'll switch the deltas on the left side into the d-notation favored in modern calculus:
dy/dx = 2x
The notation dy/dx is the standard way of expressing the statement "the slope of the tangent line to the function y at a given point x", which takes up way too much space to write down. The "d" is just part of the notation; it's not a new variable or anything.
And that's one half of the basic concepts of calculus: finding the slope (or more generally the rate of change) of a function at a given point.
The eagle-eyed among you may not be happy with this. You might say something like this: "If delta x is greater than zero, you can't really be said to have truly found the rate of change at exactly one point. But if delta x is actually equal to zero, then you've divided by zero, which is impossible. Either way this method doesn't quite work."
And you'd be right. But calculus did nonetheless work perfectly for solving problems (heck, we just solved a problem with it), so for a while mathematicians and scientists were more or less willing to press on and keep developing the calculus in the anticipation that a more formally careful method of finding slopes could be found, avoiding the divide-by-zero problem. Sure enough, the 19th-century mathematicians Augustin-Louis Cauchy and Karl Weierstrass came up with a formally correct if conceptually recondite way to re-express this procedure without division by zero. With calculus on a firm logical footing, this process of differentiation is ubiquitous in modern science and engineering.
The equation just after the moving tangent graph has an error. You've got a delta x where you should have an x^2.
I'm curious how the animation of the function was made. It looks very cool!
The process of taking the limit is the key to how the contemporary calculus works. Taking the limit allows you to get around the problem of division by zero, using a type of mathematical sleight of hand. If you combine limits with things like Cauchy sequences, you find that the limiting process does indeed define a unique number in a very real sense, and this number is then the derivative of the function at that point.
But this wasn't always the way things were done, or at least conceived of. The old process of "infinitesimal rates of change" was actually the original method Leibniz used to justify his methods. In fact, when he invented the fraction notation for derivatives, he meant it as a true fraction. dy/dx really meant "the infinitesimal quantity dy divided by the infinitesimal quantity dx".
Today we read it as "the derivative of y with respect to x" or sometimes "the rate of change of y with respect to x". But both these definitions raise the question of what the rate of change of x actually is. Our wording seems to imply that you must give the rate of change of x in order to get the rate of change of y, but no one ever does that.
One nice way of clarifying this is to look at the derivative in higher dimensions. For functions of several variables, the derivative becomes the Jacobian, which is a matrix, D. The effect of D is to take vectors dx in x space and map them to vectors dy in y space by the rule dy = D dx. The derivative is now seen as something which really does take rates of change of x, the vectors dx, and map them to rates of change of y, the vectors dy.
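To make the rule dy = D dx concrete, here is a small Python sketch. Everything specific in it is an assumption of mine for illustration: the example function f(x1, x2) = (x1^2 * x2, x1 + x2), its hand-computed Jacobian, the point (1, 2), and the helper name apply_matrix. It compares the linear prediction D dx with the actual change in f for a small displacement dx.

```python
def f(x1, x2):
    """Hypothetical example map from (x1, x2) space to (y1, y2) space."""
    return (x1 * x1 * x2, x1 + x2)


def jacobian(x1, x2):
    """Matrix of partial derivatives of f, worked out by hand."""
    return [[2 * x1 * x2, x1 * x1],
            [1.0, 1.0]]


def apply_matrix(D, dx):
    """Matrix-vector product D dx using plain lists and tuples."""
    return tuple(sum(row[j] * dx[j] for j in range(len(dx))) for row in D)


point = (1.0, 2.0)
dx = (1e-4, -2e-4)  # a small displacement in x space

dy_linear = apply_matrix(jacobian(*point), dx)

y_start = f(*point)
y_end = f(point[0] + dx[0], point[1] + dx[1])
dy_actual = (y_end[0] - y_start[0], y_end[1] - y_start[1])

print(dy_linear)
print(dy_actual)
```

The two printed vectors should agree closely, and the agreement should get better as dx is made smaller.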
So instead of writing dy/dx = f(x), what we should really write is dy = f(x) dx, and we should explicitly give the rate of change dx at each point where we calculate dy, in order to get the true rate of change of the y variable. But even if we did that, the ratio of the rates of change dy/dx would be constant, f(x), for any given point. So you really can't get away from the concept of dy/dx as a fraction or ratio. In fact, you can't even get away from it in higher dimensions if the Jacobian is invertible, since then dx = D^(-1) dy, so the derivative still expresses a mapping that takes rates of change in x space and y space to each other invertibly, just like a non-zero fraction dy/dx = f(x) would.
The whole idea of the derivative as a mapping between rates of change in two spaces is actually entirely divorced from the idea of limits, at least directly. The two can be shown to be equivalent, but they are very different ways of looking at the same concept. The former is closer to the ideas of Newton's fluxions and indeed Leibniz' original conception of derivatives. The idea of limits and such is all very well, but is all this epsilon-delta sequence stuff really necessary to a concept as basic as differentiation?