Partial Differentiation [1]

A function of more than one variable depends on two or more independent
variables, e.g.

z = f(x, y)

where x and y are two variables that are independent of each other, while the
third variable, z, depends on both x and y.

Because z depends on two independent variables, it can be changed in many
ways. For a function of one variable, e.g. s = f(t), the only possible way to
change s is to change t. However, for a function of two or more variables, the
‘output’ (the dependent variable) can be changed by changing only one of the
variables it depends on, by changing some combination of them together, or by
changing them all at the same time.

E.g. for the above function z = f (x, y), the dependent variable z can be
changed by changing x alone while keeping y constant, or changing y alone
and keeping x constant, or changing both together in some way.

Partial Differentiation: Keeping one variable constant

Consider z = f (x, y)

The partial derivative of f (or z) with respect to x is the rate of change
of f with respect to x if the other variables on which f depends (in this case
there is only one such variable, y) are kept constant. Therefore, such a
derivative conveys how f would change if only one of the variables on which it
depends, x here, is changed while all other variables are kept constant.

The above definition can be modified by using any variable in place of x, and
so for a function of more than one variable, there exist partial derivatives with
respect to each of the independent variables, and for each such derivative, all
the other variables are kept constant.
Note that for functions of one variable, the partial derivative and the ordinary
derivative are the same, as there are no other variables to keep constant.

Computing Partial Derivatives

Keeping the other variables constant amounts to saying that they can be
replaced by some fixed value for the purposes of differentiation. This translates
easily to the actual computation of partial derivatives: when the partial
derivative with respect to a particular variable is being computed, all other
variables can be treated as constants, and regular differentiation carried out.

[1] By Saeed Khan, student, McGill University

Example 1

Consider the example $z = f(x, y) = x^2 - 2y$. Now if the partial derivative of f
(or z, as both are identical) with respect to x is considered, the other variable,
y, is kept constant. So, if we treat the y variable as a constant, we get,
differentiating with respect to x,

$$\frac{\partial f}{\partial x} = 2x - 0 = 2x$$

The y term vanishes since it is a constant with respect to x when we are
considering the partial derivative of f with respect to x (all other variables
held constant). Similarly, we have the partial derivative of f with respect to y,

$$\frac{\partial f}{\partial y} = 0 - 2 = -2$$

and now the x term vanishes since it is held constant when the partial
derivative with respect to y is being considered.
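
The rule used in Example 1 can be checked numerically. The following is a
minimal Python sketch and not part of the original notes: the helper name
`partial`, the step size `h`, and the sample point (3, 5) are assumptions
chosen only for illustration. It estimates each partial derivative with a
central difference in which only one argument is perturbed while the others
are left unchanged.

```python
def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f at `point`
    with respect to the variable at position `index`. Every other variable
    keeps its value, i.e. it is held constant."""
    forward = list(point)
    backward = list(point)
    forward[index] += h
    backward[index] -= h
    return (f(*forward) - f(*backward)) / (2 * h)


def f1(x, y):
    # Example 1: z = x^2 - 2y
    return x**2 - 2 * y


x0, y0 = 3.0, 5.0  # arbitrary sample point (an assumption for illustration)
df_dx = partial(f1, (x0, y0), 0)  # analytic result from the text: 2x
df_dy = partial(f1, (x0, y0), 1)  # analytic result from the text: -2

print(df_dx, 2 * x0)
print(df_dy, -2.0)
```

Each printed pair should agree to several decimal places, since the central
difference only approximates the limit that defines the derivative.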

Example 2

Now consider $z = f(x, y) = x^2 y - 2y^2$. The partial derivative of f with respect
to x is now:

$$\frac{\partial f}{\partial x} = 2xy$$

There is no ‘product rule’ involvement with the term $x^2 y$; the y is simply a
multiplicative constant as the partial derivative of f with respect to x is being
considered, and all other variables on which f depends are being held
constant. The second term is similarly a constant here and becomes zero after
differentiation. Similarly for the partial derivative with respect to y,

$$\frac{\partial f}{\partial y} = x^2 - 4y$$

The $x^2$ in the first term now behaves like a multiplicative constant for the y,
and differentiation is carried out normally.
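
The phrase "treat the other variable as a constant" can be taken literally in
code. In the hedged sketch below (the helper name `ordinary_derivative` and the
sample point are assumptions, not from the notes), the function of Example 2 is
turned into a single-variable function by freezing one argument, and an
ordinary one-variable derivative is then estimated.

```python
def f2(x, y):
    # Example 2: z = x^2 y - 2 y^2
    return x**2 * y - 2 * y**2


def ordinary_derivative(g, t, h=1e-6):
    """Central-difference estimate of the ordinary derivative of a
    single-variable function g at t."""
    return (g(t + h) - g(t - h)) / (2 * h)


x0, y0 = 1.5, 2.0  # arbitrary sample point (an assumption for illustration)

# Freeze y at y0: what remains is a function of x alone.
df_dx = ordinary_derivative(lambda x: f2(x, y0), x0)
# Freeze x at x0: what remains is a function of y alone.
df_dy = ordinary_derivative(lambda y: f2(x0, y), y0)

print(df_dx, 2 * x0 * y0)        # compare with 2xy
print(df_dy, x0**2 - 4 * y0)     # compare with x^2 - 4y
```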

Example 3

Finally consider:

$$z = f(x, y) = \frac{xy}{1 - x^2}$$

Here the partial derivative of f with respect to x does involve a ‘product rule’
(or rather ‘quotient rule’) application, since the numerator and denominator
are both functions of x, even when y is held constant. Therefore we have:

$$\frac{\partial f}{\partial x} = y\,\frac{(1 - x^2) - x(-2x)}{(1 - x^2)^2} = \frac{(1 + x^2)\,y}{(1 - x^2)^2}$$

remembering again that y is just a constant for this partial derivative. The
partial derivative of f with respect to y is a lot simpler, since the entire term
$x/(1 - x^2)$ is a multiplicative constant for the partial derivative of f with
respect to y, and therefore we get:

$$\frac{\partial f}{\partial y} = \frac{x}{1 - x^2}$$
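
As a sanity check on the quotient-rule result, the closed forms of Example 3
can be compared against finite differences. This is only a sketch: the
function names and the sample point (0.5, 3) are assumptions, and the point is
chosen away from x = ±1, where the function is undefined.

```python
def f3(x, y):
    # Example 3: z = x y / (1 - x^2)
    return x * y / (1 - x**2)


def df3_dx(x, y):
    # Result derived in the text with the quotient rule
    return (1 + x**2) * y / (1 - x**2) ** 2


def df3_dy(x, y):
    # x / (1 - x^2) acts as a multiplicative constant
    return x / (1 - x**2)


x0, y0, h = 0.5, 3.0, 1e-6  # sample point and step size (assumptions)

# Perturb one variable at a time, holding the other fixed.
num_dx = (f3(x0 + h, y0) - f3(x0 - h, y0)) / (2 * h)
num_dy = (f3(x0, y0 + h) - f3(x0, y0 - h)) / (2 * h)

print(num_dx, df3_dx(x0, y0))
print(num_dy, df3_dy(x0, y0))
```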

Interpretation of Partial Derivatives

The partial derivative of a function z = f(x, y) with respect to x gives the rate
of change of the function f if x is increased, while keeping the other variable,
y, constant (the ‘increased’ comes from how a derivative is defined as the limit
of the slope of the line joining two points as the two points become
infinitesimally close; the second, moving, point is taken to the right of the
first, i.e. it has a greater x value than the first).

Similarly, the partial derivative of f with respect to y gives the rate of change
of the function f if y is increased, while the other variable, x, is kept constant.

In general, for z = f (x, y), the partial derivatives of f with respect to x and y
are also functions of x and y, as seen. Evaluating these partial derivatives at
particular values of x and y gives the respective rates of change at that
particular (x, y) value.

However, as stated before, these are only two ways of changing f . There are
infinitely many ways to vary x and y in some arbitrary way that would also
change f . For our case of having a two variable function, we can imagine a 2
dimensional space containing all possible x and y values that f can be
evaluated at.

Now, at any given (x, y) pair, there are an infinite number of directions in
which one can move. The direction of motion indicates how the
variables x and y change. For example, moving only along the direction of
the x-axis, i.e. along the î direction, is equivalent to keeping y constant and
only changing x, and the rate of change of f in this direction is therefore given
by $\partial f/\partial x$.

Similarly, moving only along the direction of the y-axis, or along the ĵ
direction, is equivalent to keeping x constant and only changing y, and the
rate of change of f in this direction is therefore given by $\partial f/\partial y$.

Directional Derivatives

Moving in any other direction involves a combination of changes in x and y,
and hence involves some combination of the rates of change for these two
particular cases.

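Consider motion along an arbitrary direction, defined by the unit vector
along that direction, say û = aî + bĵ, where $a^2 + b^2 = 1$. Now, motion along
this direction can be decomposed into moving along the î and ĵ directions
separately, and we know the rates of change of f associated with moving along
these directions for any value of (x, y). The rate of change of f along û is given
by:

$$D_{\hat{u}} f = a\,\frac{\partial f}{\partial x} + b\,\frac{\partial f}{\partial y}$$

$$D_{\hat{u}} f = \left(\frac{\partial f}{\partial x}\,\hat{\imath} + \frac{\partial f}{\partial y}\,\hat{\jmath}\right) \cdot \left(a\,\hat{\imath} + b\,\hat{\jmath}\right)$$

$$D_{\hat{u}} f = \vec{\nabla} f \cdot \hat{u} \tag{1}$$
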
where $D_{\hat{u}} f$ is called the directional derivative of f along the direction
û, and gives the rate of change of the function f in the specified direction û.
Crucially, as I stated before, the direction indicates how the variables x and y
change. The gradient operator, $\vec{\nabla}$, is defined for a function f of two
variables x and y as:

$$\vec{\nabla} f = \left(\hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y}\right) f \tag{2}$$

The gradient operator is a vector operator in that upon acting on a function f,
it returns a vector. This vector, $\vec{\nabla} f$, called ‘grad f’, is the generalized
derivative of f. This operator has various uses, and one among them is
determining the directional derivative of a function f along a particular
direction using (1).

The ‘amount’ of motion along each direction î and ĵ is given by the value of a
and b respectively, and the dot product weights the rates of change along x
and y accordingly, as would be expected.
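
Equation (1) can be sketched in a few lines of Python. The code below is
illustrative only: the helper names, the test point (1, 2), and the direction
(3, 4) are assumptions, and the function used is the one from Example 2. It
computes the directional derivative as a dot product with the unit vector and
cross-checks it against the rate of change of f measured along a short step in
that same direction.

```python
import math


def unit_vector(direction):
    """Scale `direction` to length 1, as equation (1) requires."""
    norm = math.sqrt(sum(c * c for c in direction))
    return [c / norm for c in direction]


def directional_derivative(grad, direction):
    """Equation (1): dot product of the gradient with the unit vector."""
    return sum(g * u for g, u in zip(grad, unit_vector(direction)))


def f2(x, y):
    # Function from Example 2
    return x**2 * y - 2 * y**2


def grad_f2(x, y):
    # Partial derivatives found in Example 2
    return (2 * x * y, x**2 - 4 * y)


point = (1.0, 2.0)      # arbitrary test point (assumption)
direction = (3.0, 4.0)  # arbitrary direction, not a unit vector (assumption)

d_formula = directional_derivative(grad_f2(*point), direction)

# Cross-check: change x and y together along the unit vector (a, b).
a, b = unit_vector(direction)
h = 1e-6
d_numeric = (f2(point[0] + h * a, point[1] + h * b)
             - f2(point[0] - h * a, point[1] - h * b)) / (2 * h)

print(d_formula, d_numeric)
```

Here the gradient at (1, 2) is (4, −7) and the unit vector is (0.6, 0.8), so the
weighted sum is 0.6·4 + 0.8·(−7) = −3.2.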

Expressing (1) using the definition of the dot product of two vectors, and
noting that $\|\hat{u}\| = 1$ for a unit vector, we have:

$$D_{\hat{u}} f = \|\vec{\nabla} f\|\,\|\hat{u}\| \cos\theta$$

$$D_{\hat{u}} f = \|\vec{\nabla} f\| \cos\theta \tag{3}$$

where θ is the angle between the generalized derivative, $\vec{\nabla} f$, and the
direction along which the rate of change is required, specified by the unit
vector û.

From (3) we can see how the value of θ affects the directional derivative of a
function f in the direction û. We see that the magnitude of the directional
derivative is maximum for θ = 0 or π, that is when the direction along which
the rate of change of f is being determined, û, is either parallel or antiparallel
respectively to $\vec{\nabla} f$.

Furthermore, the value of the directional derivative is positive if θ = 0, and is
negative if θ = π. So, for any point (x, y), the function f has maximum
increase along û if û is parallel to $\vec{\nabla} f$ at that point, and the function f
has maximum decrease along û if û is in the opposite direction
(antiparallel) to $\vec{\nabla} f$.
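
The conclusion drawn from (3) can be illustrated by brute force. The sketch
below assumes a hypothetical gradient value (3, 4) at some point (not taken
from the notes), scans unit directions in 0.1 degree steps, and reports which
directions give the largest and smallest directional derivative.

```python
import math

grad = (3.0, 4.0)  # hypothetical gradient at some point (assumption)


def rate_along(angle):
    """Directional derivative along the unit vector (cos(angle), sin(angle)),
    computed with equation (1)."""
    return grad[0] * math.cos(angle) + grad[1] * math.sin(angle)


steps = 3600  # scan resolution: 0.1 degree per step
angles = [2 * math.pi * k / steps for k in range(steps)]
best_angle = max(angles, key=rate_along)   # direction of fastest increase
worst_angle = min(angles, key=rate_along)  # direction of fastest decrease

grad_angle = math.atan2(grad[1], grad[0])  # direction of the gradient itself
grad_norm = math.hypot(grad[0], grad[1])   # magnitude of the gradient

print(best_angle, grad_angle)
print(rate_along(best_angle), grad_norm)
print(rate_along(worst_angle), -grad_norm)
```

Up to the scan resolution, the best direction coincides with the gradient
direction and the best rate with the gradient's magnitude, while the worst
direction is the opposite one.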

Example 4

Consider a function $z = f(x, y) = 16 - x^2 - y^2$. This function represents the
height, z, above sea level, of a hill. Find the rate of increase of the height
of the hill at the point (1, 2), in the direction 2î − ĵ. Also find the direction in
which a climber must move if he is at the point (1, 2) so as to gain height
fastest, and find the rate of height gain in this direction.

The rate of increase of height of the hill is just the rate of increase of
z = f (x, y). So, we need to calculate the directional derivative of f in the
direction given by v = 2î − ĵ. Notice that I haven’t designated the direction by
û; this is because the vector denoting direction given to us is NOT a unit
vector, and must be converted into one before we can proceed. So,

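$$\hat{u} = \frac{2\hat{\imath} - \hat{\jmath}}{\sqrt{2^2 + (-1)^2}}$$

$$\hat{u} = \frac{2}{\sqrt{5}}\,\hat{\imath} - \frac{1}{\sqrt{5}}\,\hat{\jmath}$$
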
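Also for the directional derivative along û at the point (1, 2), we need the
generalized derivative of f (i.e. $\vec{\nabla} f$).

$$\vec{\nabla} f = \frac{\partial f}{\partial x}\,\hat{\imath} + \frac{\partial f}{\partial y}\,\hat{\jmath} = (-2x)\,\hat{\imath} + (-2y)\,\hat{\jmath}$$

This is the generalized derivative of f at any value of (x, y). We need the
directional derivative at (1, 2) and hence $\vec{\nabla} f$ must be evaluated there as
well.

$$\vec{\nabla} f\,\Big|_{(1,2)} = (-2)\,\hat{\imath} + (-4)\,\hat{\jmath} = -2\hat{\imath} - 4\hat{\jmath}$$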

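Now, the required directional derivative of f at (1, 2) can be found:

$$D_{\hat{u}} f = \vec{\nabla} f \cdot \hat{u} = \left(-2\hat{\imath} - 4\hat{\jmath}\right) \cdot \left(\frac{2}{\sqrt{5}}\,\hat{\imath} - \frac{1}{\sqrt{5}}\,\hat{\jmath}\right)$$

$$D_{\hat{u}} f = \frac{-4}{\sqrt{5}} + \frac{4}{\sqrt{5}} = 0$$
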
So, in moving along the direction parallel to 2î − ĵ, there is no increase in the
height of the hill at the point (1, 2). At any other point, there may be a
change as one moves along this very direction, as $D_{\hat{u}} f$ may not be zero there,
since it depends on $\vec{\nabla} f$, which varies from point to point.

Now for the direction in which the gain of height is fastest at (1, 2), we need
the direction in which the rate of increase of z = f(x, y) is maximum at this
point. From the form (3) of the directional derivative, we had derived that this
direction is nothing but the direction parallel to $\vec{\nabla} f$ at that point. So, in
our case, this direction would be

$$\vec{\nabla} f\,\Big|_{(1,2)} = -2\hat{\imath} - 4\hat{\jmath}$$

However, once again this vector is not a unit vector, so the appropriate
direction is:

$$\hat{u} = -\frac{2}{\sqrt{20}}\,\hat{\imath} - \frac{4}{\sqrt{20}}\,\hat{\jmath}$$

What is the value of the rate of change of z = f(x, y) in this direction? It is
just the directional derivative of f in this direction, and is calculable using (1).
However we can also use (3) to see that if θ = 0 (for maximum increase, which
is what we want), then $D_{\hat{u}} f$ is simply equal to $\|\vec{\nabla} f\|$ at the point
where the directional derivative is being evaluated ((1, 2) here). So,

$$D_{\hat{u}} f = \|\vec{\nabla} f\| = \sqrt{(-2)^2 + (-4)^2} = \sqrt{20} = 2\sqrt{5} \approx 4.47$$

This is the maximum rate of height gain at (1, 2). If the direction of maximum
height decrease and the corresponding rate of decrease were required, the
direction would simply be antiparallel to $\vec{\nabla} f$, and the magnitude of the
decrease would be the same, since the value of $\|\vec{\nabla} f\|$ is unchanged.
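
The numbers in Example 4 can be reproduced with a short script. This is a
minimal sketch with variable names of my own choosing (they are assumptions,
not part of the notes); it normalizes the given direction, applies equation
(1), and then uses the magnitude of the gradient for the maximum rate.

```python
import math


def grad_f(x, y):
    # Gradient of z = 16 - x^2 - y^2, as derived in the text
    return (-2 * x, -2 * y)


gx, gy = grad_f(1.0, 2.0)  # gradient at the point (1, 2)

# The given direction 2i - j is not a unit vector, so normalize it first.
v = (2.0, -1.0)
norm_v = math.hypot(v[0], v[1])
u = (v[0] / norm_v, v[1] / norm_v)

rate_along_v = gx * u[0] + gy * u[1]  # equation (1); the text obtains 0

# Direction of fastest height gain: the unit vector along the gradient.
max_rate = math.hypot(gx, gy)         # equation (3) with theta = 0
steepest = (gx / max_rate, gy / max_rate)

print(rate_along_v)
print(steepest, max_rate, math.sqrt(20))
```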

Example 5

If the climber in the previous question moves any distance in the direction
parallel to $\vec{\nabla} f$ in a bid to climb the mountain as fast as possible, he will
have to choose a new direction in which to climb, as the direction parallel to
$\vec{\nabla} f$ changes from one point to another. Find the projection on the x-y
plane of the complete path the climber must take to climb the hill.

To climb the hill in the fastest possible time, the climber must always be
moving in a direction parallel to $\vec{\nabla} f$ at every point along his journey. It
is obvious that the climber will be moving along some kind of curve bound to the
surface. This curve will obviously be three dimensional.

However, all that is required is the x-y plane projection of the path, in which
case the curve becomes simply two dimensional. The key concept is that at
every point along this curve, the climber must be moving along the direction
parallel to $\vec{\nabla} f$, which implies that the tangent vector to the curve at
every point must be parallel to $\vec{\nabla} f$ at that point.

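We can assume the two dimensional curve we require has a parametric
representation of the form r(t) = x(t)î + y(t)ĵ. Now, we need the tangent
vector to this curve at each point to be parallel to $\vec{\nabla} f$ at that point.
The simplest way to guarantee this is to set the two vectors equal (any positive
multiple would trace the same curve, only at a different speed). That is,
for the case at hand, we require:

$$\frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\,\hat{\imath} + \frac{dy}{dt}\,\hat{\jmath} = \vec{\nabla} f = -2x\,\hat{\imath} - 2y\,\hat{\jmath}$$
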
We need the components of the two vectors to be equal, and this gives two
equations:

$$\frac{dx}{dt} = -2x; \qquad \frac{dy}{dt} = -2y$$

It is possible to solve the above equations for x(t) and y(t), which gives:

$$x(t) = A e^{-2t}; \qquad y(t) = B e^{-2t}$$


where A and B are arbitrary constants, to be determined by the initial
conditions. As we are given that the climber is at the point (1, 2) initially, we
must have x(0) = 1 and y(0) = 2. Therefore, A = 1, B = 2, and so

$$x(t) = e^{-2t}; \qquad y(t) = 2e^{-2t}$$

and so the required projection onto the x-y plane of the path the climber
follows is given by the parametric representation:

$$\mathbf{r}(t) = e^{-2t}\,\hat{\imath} + 2e^{-2t}\,\hat{\jmath}$$


We can see that as t → ∞, r(t) → (0, 0), that is the climber moves towards the
origin. At the origin, the height of the hill is the maximum, z = 16, which
means that the climber reaches the top of the hill, as would be expected.

What is not as obvious is that this is the fastest way to reach the top of the
hill, which is true from how we have obtained this result: by requiring a curve
whose tangent vector at every point is parallel to $\vec{\nabla} f$ at that point, the
direction of fastest increase of z = f(x, y).

We can also obtain the required projection completely in terms of the
coordinates x and y, by either eliminating t from the expressions for x(t) and
y(t), or using the differential equations obtained for x(t) and y(t) to obtain a
differential equation for y(x):

$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{-2y}{-2x} = \frac{y}{x}$$

This simply yields:

y = kx
where once again we can use the fact that the climber is at position (1, 2)
initially, and so this point must lie on the curve that the climber follows, to
obtain k = 2. So the required projection onto the x-y plane of the path
followed by the climber, expressed in cartesian coordinates, is:

y = 2x
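
The climber's path can also be traced numerically. The sketch below is an
illustration under stated assumptions: it uses the simple forward Euler method
with a step `dt` and end time `t_end` that I picked arbitrarily, and the
function name `climb_path` is hypothetical. It repeatedly steps along the
gradient of Example 4, then compares the result with the closed-form solution
and with the line y = 2x.

```python
import math


def climb_path(x0, y0, dt=1e-3, t_end=3.0):
    """Follow dx/dt = -2x, dy/dt = -2y from (x0, y0) with forward Euler."""
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(int(round(t_end / dt))):
        gx, gy = -2 * x, -2 * y  # gradient of 16 - x^2 - y^2 at (x, y)
        x += dt * gx             # small step along the gradient
        y += dt * gy
        path.append((x, y))
    return path


path = climb_path(1.0, 2.0)
x_end, y_end = path[-1]

# Closed-form solution from the text, evaluated at t = 3
print(x_end, math.exp(-6.0))
print(y_end, 2 * math.exp(-6.0))

# Every point of the projected path should lie on y = 2x
print(max(abs(y - 2 * x) for x, y in path))
```

The Euler values only approximate the exponentials, with an error that shrinks
as `dt` is reduced, but the path heads toward the origin along y = 2x as the
analysis predicts.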

We can also have problems involving more than two variables, where the
relevant equations (1), (2), and (3) remain the same, with only an addition of
new components in the vectors, corresponding to the new extra variables.

Furthermore, the variables involved need not be cartesian coordinates; they
can actually be any kind of variables. While equations (1) and (3) are
unchanged in appearance, the expression for the generalized derivative,
$\vec{\nabla} f$ (equation (2)), does change depending on the orientation of the unit
vectors along each variable’s axis relative to each other. So, for example, the
expression for the generalized derivative $\vec{\nabla} f$ is different for cartesian,
cylindrical, and spherical coordinates. Usually, unless it is explicitly stated that
cylindrical or spherical coordinates are being used, one can assume that the
variables are oriented like cartesian coordinates.

Example 6
Find the generalized derivative of $T = f(P, V, m) = \frac{P V^{\gamma}}{K m}$, where
T is the temperature, P the pressure, V the volume, and m the mass of a system
consisting of an ideal gas, and K and γ are constants with K = 0.1 and
γ = 1.5 respectively. Also find the rate of change of the temperature if the
volume of the system is increased while both the pressure on it and the mass
inside it remain constant, if in the initial state we have for the system P = 1,
V = 1.4, m = 0.5.

Don’t be alarmed by the apparent explosion of physics here; this is actually
quite a simple question. Here the function T = f(P, V, m) is a function of
three variables, P, V, and m. The generalized derivative, $\vec{\nabla} f$, will involve
the partial derivatives of the function f with respect to P, V, and m. First,
simplifying f using the values of the constants given, we have:

$$T = f(P, V, m) = \frac{P V^{1.5}}{0.1\,m}$$

Now, we have:

$$\vec{\nabla} f = \frac{\partial f}{\partial P}\,\hat{P} + \frac{\partial f}{\partial V}\,\hat{V} + \frac{\partial f}{\partial m}\,\hat{m}$$

It may not be obvious what P̂ and the other unit vectors mean, but this is
very simple. These unit vectors are just like î etc., but now their directions do
not represent directions of ‘motion’ directly, because they are no longer space
variables. Instead, their directions represent, in the general case, the direction
along which only one of the variables changes while all the others are constant.
Calculating the partial derivatives goes like before, and we have:

$$\vec{\nabla} f = \frac{V^{1.5}}{0.1\,m}\,\hat{P} + \frac{1.5\,P V^{0.5}}{0.1\,m}\,\hat{V} - \frac{P V^{1.5}}{0.1\,m^2}\,\hat{m}$$

The second part of the question requires us to find the rate of change of the
temperature if P and m are held constant while V is changed. Associated
directly with the concept of rate of change is of course the notion of directional
derivatives. Now, the ‘directions’ along which the rates of change are required
are not spatial directions, because our variables are no longer space variables.
However, you must recall that the ‘directions’ along which directional
derivatives are taken indicate how the various variables on which f
depends change. Consequently, these directions represent the different ways
in which P, V, and m can be changed to change T (i.e f ), and will be
represented in terms of the unit vectors P̂, V̂, and m̂.

For the case at hand, we are modifying T by keeping P and m constant, and
only changing V. This corresponds to moving in a direction V̂ (along this
direction all the variables except V are constant), and the directional
derivative is hence required along this direction. This vector is a unit vector
and so we can directly use (1):

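$$D_{\hat{u}} f = \vec{\nabla} f \cdot \hat{u} = \vec{\nabla} f \cdot \hat{V}$$

$$D_{\hat{u}} f = \left(\frac{V^{1.5}}{0.1\,m}\,\hat{P} + \frac{1.5\,P V^{0.5}}{0.1\,m}\,\hat{V} - \frac{P V^{1.5}}{0.1\,m^2}\,\hat{m}\right) \cdot \hat{V}$$

$$D_{\hat{u}} f = \frac{1.5\,P V^{0.5}}{0.1\,m}$$
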
(because the various unit vectors P̂, V̂, and m̂ are mutually orthogonal, so
their dot products with each other are zero; they are oriented like the
cartesian coordinates)

We need to find this directional derivative for our system having the initial
state with P = 1, V = 1.4, and m = 0.5, or equivalently (1, 1.4, 0.5), and this
becomes:

$$D_{\hat{u}} f = \frac{1.5\,(1)\,(1.4)^{0.5}}{0.1\,(0.5)} \approx 35.496$$

Note that the required directional derivative is simply $\partial f/\partial V$, as would
be expected, because the partial derivative has the physical interpretation of
being the rate of change of f if all the variables (on which f depends) other
than V are kept constant, which is exactly what is happening here. If either P
or m were also being changed, then this simplification would not have occurred.
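
The first part of Example 6 is easy to reproduce in code. The following sketch
(function and variable names are my own assumptions) evaluates the three
partial derivatives at the initial state and reads off the V component, which
is the directional derivative along V̂.

```python
K, GAMMA = 0.1, 1.5  # constants given in the problem statement


def temperature(P, V, m):
    # T = P V^gamma / (K m)
    return P * V**GAMMA / (K * m)


def grad_temperature(P, V, m):
    """Components of the generalized derivative along P, V and m."""
    dT_dP = V**GAMMA / (K * m)
    dT_dV = GAMMA * P * V ** (GAMMA - 1) / (K * m)
    dT_dm = -P * V**GAMMA / (K * m**2)
    return (dT_dP, dT_dV, dT_dm)


state = (1.0, 1.4, 0.5)  # initial state (P, V, m) from the problem
grad = grad_temperature(*state)

# Moving along V-hat = (0, 1, 0) picks out the V component of the gradient.
rate_V = grad[1]
print(round(rate_V, 3))  # about 35.496, matching the value in the text
```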

For example, say that the volume and mass were being increased equally, and
the pressure decreased at twice the rate of increase of the mass and volume.
This way of changing the variables P, V, and m would correspond to moving
in the direction:

v = −2P̂ + 1V̂ + 1m̂


The vector v is not a unit vector, and therefore the direction along which the
directional derivative must be computed for this case is:

$$\hat{u} = -\frac{2}{\sqrt{6}}\,\hat{P} + \frac{1}{\sqrt{6}}\,\hat{V} + \frac{1}{\sqrt{6}}\,\hat{m}$$

The required directional derivative can now be computed using the previous
result for $\vec{\nabla} f$ at the initial state of the system (1, 1.4, 0.5) (given).
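
That remaining computation can be sketched as follows. The code is only an
illustration with names of my own choosing (assumptions); it reuses the
gradient formula derived earlier, normalizes v = −2P̂ + V̂ + m̂, and applies
equation (1). A finite-difference step along û serves as an independent check.

```python
import math

K, GAMMA = 0.1, 1.5  # constants given in the problem statement


def temperature(P, V, m):
    return P * V**GAMMA / (K * m)


def grad_temperature(P, V, m):
    # Partial derivatives with respect to P, V and m, as derived in the text
    return (V**GAMMA / (K * m),
            GAMMA * P * V ** (GAMMA - 1) / (K * m),
            -P * V**GAMMA / (K * m**2))


state = (1.0, 1.4, 0.5)  # initial state (P, V, m)
v = (-2.0, 1.0, 1.0)     # P falls twice as fast as V and m rise

norm = math.sqrt(sum(c * c for c in v))  # equals sqrt(6)
u = tuple(c / norm for c in v)           # unit vector along v

# Equation (1): dot product of the gradient with the unit vector
rate = sum(g * c for g, c in zip(grad_temperature(*state), u))

# Independent check: change all three variables together along u
h = 1e-6
ahead = [s + h * c for s, c in zip(state, u)]
behind = [s - h * c for s, c in zip(state, u)]
rate_numeric = (temperature(*ahead) - temperature(*behind)) / (2 * h)

print(rate, rate_numeric)
```

The result comes out negative, meaning the temperature falls when the
variables are changed in this combination, even though increasing V alone
raises it.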

Level Sets and their Orthogonal Vectors

Consider a function w = f(x, y, z), that is, a function of 3 variables. A level
set of the function f is a set of values of the variables x, y, and z such that for
these values, the value of f is always the same constant. That is, if we require
w to be a constant c, then we have w = c = f(x, y, z). Therefore, the level set
of f is the set containing the values of x, y, and z that satisfy f(x, y, z) = c.

A level set can be defined for a function of any number of variables. So, if we
have a function y = f(x_1, x_2, ..., x_n), so that f is a function of n independent
variables, then the level set of f will have values of x_1, x_2, ..., x_n that satisfy
y = f(x_1, x_2, ..., x_n) = c where c is a constant.

In the case where the function f is a function of 3 variables, say
y = f(x_1, x_2, x_3), the level set will have values of these variables that
satisfy f(x_1, x_2, x_3) = c for some constant c. This constraint, however, leaves
only 2 independent variables (if the values of any two variables are
specified, the value of the third variable is, in general, fixed). This implies
that the constraint defines a surface, and therefore the values of x_1, x_2, and
x_3 belonging to the level set actually lie on a surface, called the level surface.

Similarly, if our function f is a function of two variables only, say
y = f(x_1, x_2), the level set will have values of these variables that satisfy
f(x_1, x_2) = c for some constant c. This constraint, however, leaves only 1
independent variable (if the value of any one variable is specified, the value of
the second variable is, in general, fixed). This implies that the constraint
defines a curve, and therefore the values of x_1 and x_2 belonging to the level
set actually lie on a curve, called the level curve.

For a function of a larger number of variables, the corresponding level set
contains values of the variables that lie on a level hypersurface.

Orthogonal Vectors to Level Hypersurfaces

Consider a function of 3 variables, w = f (x, y, z). Now, the level set for w = 4
will have values of x, y, and z satisfying f (x, y, z) = 4 which defines a level
surface, as seen before. Now, suppose I wish to find the rate of change of w for
values of x, y, and z restricted to this level surface. For all such values, we
know w = 4 by definition of the level set, and so the rate of change of w must
be zero.
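
This statement can be illustrated with a concrete case. The notes do not
specify f, so the sketch below assumes the hypothetical function
f(x, y, z) = x² + y² + z², whose level surface w = 4 is a sphere of radius 2;
the helper names and the parameter value `t0` are likewise assumptions. It
moves along a circle lying on that sphere and evaluates equation (1) in the
direction of motion.

```python
import math


def f(x, y, z):
    # Hypothetical example function (an assumption, not from the notes)
    return x * x + y * y + z * z


def grad_f(x, y, z):
    return (2 * x, 2 * y, 2 * z)


def on_level_surface(t):
    """A point on the level surface f = 4: the circle of radius 2 in the
    plane z = 0, which lies on the sphere of radius 2."""
    return (2 * math.cos(t), 2 * math.sin(t), 0.0)


def tangent(t):
    """Unit vector along the direction of motion on that circle."""
    return (-math.sin(t), math.cos(t), 0.0)


t0 = 0.7  # arbitrary position along the circle (assumption)
p = on_level_surface(t0)

# Directional derivative of f along a direction that stays on the surface
rate = sum(g * c for g, c in zip(grad_f(*p), tangent(t0)))

print(f(*p))  # stays at the level value 4 (up to rounding)
print(rate)   # zero (up to rounding)
```

Because the dot product in equation (1) vanishes for this direction, the
gradient at the point is perpendicular to the direction of motion within the
level surface, at least in this example.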
